FHA UNIT 1 INTRODUCTION
Introduction:
We are frequently reminded of the fact that we are living in the information age.
Appropriately, then, this book is about information—how it is obtained, how it is analyzed, and
how it is interpreted. The information about which we are concerned we call data, and the data
are available to us in the form of numbers.
Basic Concepts:
What is data?
Data is a collection of facts, such as numbers, words, measurements, and observations.
Types of Data:
1. Structured Data: highly organized (for example, spreadsheets and databases)
2. Unstructured Data: no regular structure (emails, social media posts, online blogs,
newspapers, books, and scientific publications)
3. Big Data: may be structured (databases), semi-structured (XML, HTML), or unstructured
(photos, video)
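The three forms listed above can be sketched in a few lines of Python. This is only an illustration: the patient records, field names, and values are invented for the example and do not come from any real dataset.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns, every row has the same fields (hypothetical values).
structured_text = "patient_id,weight_lb\n1,150\n2,172\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: tags give some organization, but the fields can differ
# from one record to the next (hypothetical records).
semi_structured_text = (
    "<patients>"
    "<patient id='1'><weight unit='lb'>150</weight></patient>"
    "<patient id='2'><note>weight not recorded</note></patient>"
    "</patients>"
)
root = ET.fromstring(semi_structured_text)
weights_from_xml = [w.text for w in root.iter("weight")]

# Unstructured: free text with no predefined fields.
unstructured_text = "Patient one weighed about 150 pounds at today's visit."

print(rows[0]["weight_lb"])
print(weights_from_xml)
print(len(unstructured_text.split()), "words of free text")
```

The weight can be looked up by column name in the structured form and by tag in the semi-structured form, while the free text would need further processing before the number could be extracted.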
What are the sources of Data?
1. Routinely kept records
2. Surveys
3. Experiments
4. External sources (already existing data)
Data analysis: used to analyze large datasets to extract meaningful insights and patterns
Modeling and simulation: used to develop and analyze mathematical models of complex systems
Machine learning: used to develop and apply algorithms that automatically learn from data and
make predictions
Optimization: used to find the best or the most efficient/robust solution to a problem
Visualization: used to create visual representations of data and models
Data:
The raw material of statistics is data. For our purposes we may define data as numbers.
The two kinds of numbers that we use in statistics are numbers that result from taking a
measurement, in the usual sense of the term, and those that result from the process of
counting. For example, when a nurse weighs a patient or takes a patient’s temperature, a
measurement, consisting of a number such as 150 pounds or 100 degrees Fahrenheit, is
obtained. Quite a different type of number is obtained when a hospital administrator counts the
number of patients—perhaps 20—discharged from the hospital on a given day. Each of the
three numbers is a datum, and the three taken together are data.
Statistics
Statistics is a field of study concerned with the collection, organization, summarization,
and analysis of data; and the drawing of inferences about a body of data when only a part of
the data is observed. The person who performs these statistical activities must be
prepared to interpret and to communicate the results to someone else as the situation demands.
Simply put, we may say that data are numbers, numbers contain information, and the purpose
of statistics is to investigate and evaluate the nature and meaning of this information.
Biostatistics
The tools of statistics are employed in many fields—business, education, psychology,
agriculture, and economics, to mention only a few. When the data analyzed are derived from
the biological sciences and medicine, we use the term biostatistics to distinguish this particular
application of statistical tools and concepts.
Variable - If, as we observe a characteristic, we find that it takes on different values in different
persons, places, or things, we label the characteristic a variable. Some examples of variables
include diastolic blood pressure, heart rate, the heights of adult males, the weights of preschool
children, and the ages of patients seen in a clinic.
1. Quantitative Variables - A quantitative variable is one that can be measured in the usual
sense. We can, for example, obtain measurements on the heights of adult males, the weights of
preschool children, and the ages of patients seen in a dental clinic. These are examples of
quantitative variables.
2. Qualitative Variables - Some characteristics are not capable of being measured in the sense
that height, weight, and age are measured. Many characteristics can be categorized only, as, for
example, when an ill person is given a medical diagnosis, a person is designated as belonging
to an ethnic group, or a person, place, or object is said to possess or not to possess some
characteristic of interest. In such cases measuring consists of categorizing. We refer to variables
of this kind as qualitative variables.
3. Random Variable - Whenever we determine the height, weight, or age of an individual, the
result is frequently referred to as a value of the respective variable. When the values obtained
arise as a result of chance factors, so that they cannot be exactly predicted in advance, the
variable is called a random variable. An example of a random variable is adult height. When a
child is born, we cannot predict exactly his or her height at maturity. Attained adult height is
the result of numerous genetic and environmental factors. Values resulting from measurement
procedures are often referred to as observations or measurements.
4. Discrete Random Variable - Variables may be characterized further as to whether they are
discrete or continuous. A discrete variable is characterized by gaps or interruptions in the values
that it can assume. The number of daily admissions to a general hospital is a discrete random
variable since the number of admissions each day must be represented by a whole number,
such as 0, 1, 2, or 3. The number of admissions on a given day cannot be a number such as 1.5,
2.997, or 3.333.
5. Continuous Random Variable - A continuous random variable does not possess the gaps or
interruptions characteristic of a discrete random variable. A continuous random variable can
assume any value within a specified relevant interval of values assumed by the variable.
Examples of continuous variables include the various measurements that can be made on
individuals such as height, weight, and skull circumference. No matter how close together the
observed heights of two people, for example, we can, theoretically, find another person whose
height falls somewhere in between.
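The distinctions drawn in the list above can be illustrated with a small Python sketch. The record, the function names `classify` and `is_valid_admission_count`, and the rule of classifying by Python type are all assumptions made for this example; in practice the kind of variable follows from what is being observed, not from how it happens to be stored.

```python
# Hypothetical record; all values are made up for illustration.
patient = {
    "diagnosis": "asthma",     # qualitative: a category, not a measurement
    "daily_admissions": 3,     # quantitative, discrete: a whole-number count
    "height_cm": 172.4,        # quantitative, continuous: any value in an interval
}

def classify(value):
    """Rough classification by Python type (an assumption for this sketch)."""
    if isinstance(value, (bool, str)):
        return "qualitative"
    if isinstance(value, int):
        return "quantitative, discrete"
    if isinstance(value, float):
        return "quantitative, continuous"
    raise TypeError(f"unsupported value: {value!r}")

kinds = {name: classify(value) for name, value in patient.items()}

# Continuous: between any two heights there is, in theory, another height.
height_a, height_b = 172.4, 172.5
height_between = (height_a + height_b) / 2

# Discrete: an admission count such as 1.5 is not a permissible value.
def is_valid_admission_count(x):
    return x >= 0 and float(x).is_integer()

for name, kind in kinds.items():
    print(name, "->", kind)
print(height_a < height_between < height_b)
print(is_valid_admission_count(3), is_valid_admission_count(1.5))
```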
Population - A population, or collection of entities, may consist of animals, machines,
places, or cells. For our purposes, we define a population of entities as the largest
collection of entities, and a population of values as the largest collection of values of a
random variable, for which we have an interest at a particular time. If, for example, we are
interested in
the weights of all the children enrolled in a certain county elementary school system, our
population consists of all these weights. If our interest lies only in the weights of first-grade
students in the system, we have a different population—weights of first-grade students enrolled
in the school system. Hence, populations are determined or defined by our sphere of interest.
Populations may be finite or infinite. If a population of values consists of a fixed number of
these values, the population is said to be finite. If, on the other hand, a population consists of
an endless succession of values, the population is an infinite one.
Sample - A sample may be defined simply as a part of a population. Suppose our population
consists of the weights of all the elementary school children enrolled in a certain county school
system. If we collect for analysis the weights of only a fraction of these children, we have only
a part of our population of weights, that is, we have a sample.
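The relationship between a population of values and a sample can be sketched in Python as follows. The weights are simulated rather than real, the population size of 500 and sample size of 50 are arbitrary, and drawing the sample at random without replacement is an assumption of this sketch; the definition above only requires that a sample be a part of the population.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical finite population of values: simulated weights (kg) of all
# 500 children enrolled in a school system.
population = [round(rng.gauss(30.0, 5.0), 1) for _ in range(500)]

# A sample is a part of the population: here, 50 of those weights,
# drawn at random without replacement.
sample = rng.sample(population, k=50)

print("population size:", len(population))
print("sample size:", len(sample))
```

Because the population here contains a fixed number of values, it is a finite population in the sense described above.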
Types Of Statistics:
Consider, as an example, a book that deals with a lot of information. The objectives of this
book are twofold: (1) to teach the student to organize and summarize data, and (2) to teach the
student how to reach decisions about a large body of data by examining only a small part of it.
The concepts and methods necessary for achieving the first objective are presented under the
heading of descriptive statistics, and the second objective is reached through the study of what
is called inferential statistics.
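The two objectives can be contrasted in a short Python sketch using the standard `statistics` module. The ten weights are invented for illustration, and the interval of two standard errors around the sample mean is only a rough rule of thumb assumed here, not an exact confidence interval.

```python
import math
import statistics

# Hypothetical sample of ten weights (kg); the values are made up.
sample = [28.5, 31.2, 29.8, 35.1, 27.4, 30.6, 33.0, 26.9, 32.3, 30.2]

# Descriptive statistics: organize and summarize the data we actually have.
sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)
sample_sd = statistics.stdev(sample)

# Inferential statistics: say something about the larger population from
# which the sample came. Here, a rough range for the unknown population mean.
standard_error = sample_sd / math.sqrt(len(sample))
margin = 2 * standard_error  # rule-of-thumb width, an assumption of this sketch
interval = (sample_mean - margin, sample_mean + margin)

print("mean:", round(sample_mean, 2))
print("median:", round(sample_median, 2))
print("standard deviation:", round(sample_sd, 2))
print("rough interval for population mean:",
      (round(interval[0], 2), round(interval[1], 2)))
```

The first three quantities describe only the ten observed values; the interval is a statement about the population, reached by examining only a small part of it.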