ADIGRAT UNIVERSITY Bass New

This document contains an individual assignment submitted by Mebrahtu Hadush Gesesew to their assistant professor Birhane F. at Adigrat University's Department of Public Health. The assignment addresses biostatistics topics including the role of biostatistics in public health, measures of central tendency, and probability sampling methods. It provides descriptions and examples of key biostatistical concepts in 3 paragraphs or less for each question.


ADIGRAT UNIVERSITY

COLLEGE OF MEDICINE AND HEALTH SCIENCES

DEPARTMENT OF PUBLIC HEALTH

(HUMAN NUTRITION)

BIOSTATISTICS COURSE

INDIVIDUAL ASSIGNMENT

PREPARED BY:
MEBRAHTU HADUSH GESESEW
IDNO………………………
Phone- 0914857487

SUBMITTED TO:
BIRHANE F. [ASSISTANT PROFESSOR]
OCTOBER 2023
WUKRO, TIGRAY
1. Describe the role of biostatistics in public health (at least two).

Biostatistics plays a crucial role in public health by providing statistical tools and techniques to
analyze health-related data and make informed decisions. The key roles of biostatistics in public
health are:

A. Data analysis and interpretation: biostatistics helps in data analysis and
interpretation as follows.
1. Biostatistics helps in analyzing complex health data sets, identifying patterns, and
interpreting the results.
2. It provides methods for summarizing and describing the data, conducting hypothesis
testing, and estimating population parameters.
3. Biostatistics helps public health professionals in understanding disease prevalence,
evaluating interventions, and identifying risk factors.
4. It encompasses the design of biological experiments, the collection and analysis of data
from those experiments, and the interpretation of the results.
B. Study design and sample size determination: biostatistics helps in study design
and sample size determination as follows.
1. Biostatistics assists in designing studies and determining the appropriate sample size
required to achieve the desired level of statistical power.
2. Biostatistics helps in considering factors such as study objectives, population
characteristics, and available resources.
3. Biostatistics helps ensure that studies are well designed and can provide reliable and
valid conclusions.
4. Biostatistics helps in sample size determination, which is an essential step of research
methodology.
5. Determining the optimal sample size for a study ensures adequate power to detect a
statistically significant effect. Hence, it is a critical step in the design of a
planned research protocol.
C. Role of biostatistics in assurance
1. It uses sampling and estimation methods to study the factors related to compliance and
outcome.
2. It helps decide whether an improvement is due to compliance or to something else, and how
best to measure the compliance level in the target population.
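
The sample size determination described under B can be illustrated with a small calculation. The sketch below is not taken from the assignment: it assumes the common precision-based formula for estimating a single proportion, n = z^2 * p * (1 - p) / d^2, where the expected proportion p, the margin of error d, and the 95% confidence value z = 1.96 are all illustrative choices. A power-based calculation for comparing groups would use a different formula.

```python
import math

def sample_size_single_proportion(p, margin, z=1.96):
    """Minimum n to estimate a proportion p within +/- margin.

    z = 1.96 corresponds to 95% confidence (an assumed default).
    """
    n = (z ** 2) * p * (1 - p) / (margin ** 2)
    return math.ceil(n)  # round up: a fraction of a person cannot be sampled

# Hypothetical inputs: p = 0.5 (most conservative), margin of error = 5%
n_required = sample_size_single_proportion(p=0.5, margin=0.05)
print(n_required)  # 385
```

Using p = 0.5 gives the largest required sample, which is why it is often chosen when no prior estimate of the proportion is available.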
2. List the measures of central tendency and their properties (at least two).

Measures of central tendency are statistical measures that provide a representative value for a
dataset. A measure of central tendency is a central or typical value for a probability
distribution. Measures of central tendency are often called averages.
Two commonly used measures of central tendency are:

A. Mean: The arithmetic mean is the most familiar measure of central tendency. It is obtained
by summing up all the values and dividing by the total number of observations. It is the
descriptive measure most people have in mind when they speak of the “average.” The adjective
arithmetic distinguishes this mean from other means that can be computed.

General Formula for the Mean: It is convenient to generalize the procedure for obtaining the
mean and to represent it in a more compact notational form. Let us begin by designating the
random variable of interest by the capital letter X; for example, X may represent the random
variable age. Specific values of a random variable are designated by the lowercase letter x.
To distinguish one value from another, we attach a subscript to the x and let the subscript
refer to the first, the second, the third value, and so on (x1, x2, x3, ...).
The Sample Mean: When we compute the mean for a sample of values, we use x̄ (x-bar) to
designate the sample mean and n to indicate the number of values in the sample, so that
x̄ = (x1 + x2 + ... + xn) / n.
Properties of the Mean: The arithmetic mean possesses certain properties, some desirable and
some not so desirable. These properties include the following:
1. Uniqueness. For a given set of data, there is one and only one arithmetic mean.
2. Simplicity. The arithmetic mean is easily understood and easy to compute.
3. Since each and every value in a set of data enters into the computation of the mean, it is
affected by each value. Extreme values, therefore, have an influence on the mean and, in some
cases, can so distort it that it becomes undesirable as a measure of central tendency.
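
The third property can be seen in a small calculation. The sketch below uses hypothetical ages (not data from the assignment) and shows how a single extreme value pulls the arithmetic mean away from the bulk of the data.

```python
def arithmetic_mean(values):
    """Sum of all values divided by the number of observations."""
    return sum(values) / len(values)

ages = [20, 22, 24, 26, 28]        # hypothetical sample of ages
ages_with_outlier = ages + [90]    # the same sample plus one extreme value

mean_ages = arithmetic_mean(ages)                      # 120 / 5 = 24.0
mean_with_outlier = arithmetic_mean(ages_with_outlier)  # 210 / 6 = 35.0

print(mean_ages, mean_with_outlier)
```

With the extreme value included, the mean (35.0) is larger than five of the six observations, which is the distortion described above.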

B. Median: The median is the middle value of an ordered dataset. It divides the data into
two equal halves, with half of the observations above and half below the median. The
number of values equal to or greater than the median is equal to the number of values
equal to or less than the median.
If the number of values is odd, the median is the middle value when all values have
been arranged in order of magnitude. When the number of values is even, there is no
single middle value. Instead, there are two middle values. In this case the median is
taken to be the mean of these two middle values, when all values have been arranged
in the order of their magnitudes.

Properties of the Median. Properties of the median include the following:


1. Uniqueness. As is true with the mean, there is only one median for a given set of data.
2. Simplicity. The median is easy to calculate.
3. It is not drastically affected by extreme values.
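
The odd and even cases, and the robustness to extreme values, can be sketched as follows. The values are hypothetical and the function is a minimal illustration rather than a reference implementation.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

median_odd = median([28, 20, 24, 22, 26])               # ordered: 20 22 24 26 28 -> 24
median_even = median([20, 22, 24, 26])                  # (22 + 24) / 2 = 23.0
median_with_outlier = median([20, 22, 24, 26, 28, 90])  # (24 + 26) / 2 = 25.0

print(median_odd, median_even, median_with_outlier)
```

Adding the extreme value 90 moves the median only from 24 to 25.0, whereas the mean of the same values would move from 24.0 to 35.0.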
3. Describe the four types of probability sampling methods and their specific
characteristics.

The four types of probability sampling methods are:


A. Simple Random Sampling (SRS):
1. In this method, each individual in the population has an equal chance of being selected.
2. It involves randomly selecting a sample without any specific criteria. It supports
representativeness and allows for generalization of the findings to the population.
3. This is the most basic scheme of random sampling.
4. It is costly to conduct SRS. Moreover, minority subgroups of interest in the
population may not be present in the sample in sufficient numbers for study.

To select a simple random sample, you need to:


a. Make a numbered list of all the units in the population from which you want to
draw a sample.
b. Each unit on the list should be numbered in sequence from 1 to N (where N is
the size of the population)
c. Decide on the size of the sample.

d. Select the required number of study units, using one of the following:

1. “lottery” method
2. A table of random numbers.
3. Computer program.
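
The "computer program" option can be sketched with Python's standard `random` module. The population size, sample size, and seed below are hypothetical values chosen only for illustration.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw unit numbers from a frame numbered 1..N, without replacement."""
    rng = random.Random(seed)               # seed only makes the draw repeatable
    frame = range(1, population_size + 1)   # numbered list of all units
    return sorted(rng.sample(frame, sample_size))

# Hypothetical population of N = 200 units, sample of n = 10
srs_units = simple_random_sample(population_size=200, sample_size=10, seed=1)
print(srs_units)
```

`random.sample` selects without replacement, so every unit on the list has the same chance of selection and no unit can appear twice.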

Merit& demerit
B. Stratified Sampling:
1. Stratified sampling involves dividing the population into homogeneous subgroups or
strata based on certain characteristics. A random sample is then selected from each
stratum.
2. This method ensures representation from each subgroup and can provide more precise
estimates for specific subgroups of interest.
3. It is appropriate when the distribution of the characteristic to be studied is strongly
affected by a certain variable (heterogeneous population). The population is first divided
into groups (strata) according to a characteristic of interest (e.g., sex, geographic area,
prevalence of disease, etc.). A separate sample is then taken independently from each
stratum, by simple random or systematic sampling.

Proportional allocation: the same sampling fraction is used for each stratum.
Non-proportional allocation: a different sampling fraction is used for each stratum, or
the strata are unequal in size and a fixed number of units is selected from each stratum.

Merit: The representativeness of the sample is improved. That is, adequate representation of
minority subgroups of interest can be ensured by stratification and by varying the sampling
fraction between strata as required.
Demerit: A sampling frame for the entire population has to be prepared separately for each
stratum.
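
Proportional allocation can be sketched as below. The strata, unit labels, sampling fraction, and seed are hypothetical; the point is only that the same sampling fraction is applied within every stratum and that each stratum is sampled independently.

```python
import random

def stratified_sample(strata, sampling_fraction, seed=None):
    """Simple random sample from each stratum, using one common sampling fraction."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_stratum = round(len(units) * sampling_fraction)
        sample[name] = rng.sample(units, n_stratum)  # independent draw per stratum
    return sample

# Hypothetical population stratified by sex: 60 females and 40 males
strata = {
    "female": [f"F{i}" for i in range(1, 61)],
    "male": [f"M{i}" for i in range(1, 41)],
}
stratified = stratified_sample(strata, sampling_fraction=0.1, seed=1)
print({name: len(units) for name, units in stratified.items()})
```

With a 10% fraction, 6 females and 4 males are selected, so the sample keeps the 60:40 composition of the population. Non-proportional allocation would instead pass a different fraction (or a fixed count) for each stratum.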

C. Cluster Sampling: Cluster sampling involves dividing the population into clusters or
groups and randomly selecting a few clusters for inclusion in the sample. This method is
useful when it is impractical or costly to sample individuals directly. In this sampling
scheme, selection of the required sample is done on groups of study units (clusters) instead
of each study unit individually. The sampling unit is a cluster, and the sampling frame is a list
of these clusters.

Procedure: The reference population (homogeneous) is divided into clusters. These clusters
are often geographic units (e.g., districts, villages, etc.).
1. A sample of such clusters is selected.
2. All the units in the selected clusters are studied.
3. It is preferable to select a large number of small clusters rather than a small number of
large clusters.

Merit: A list of all the individual study units in the reference population is not required. It is
sufficient to have a list of clusters.
Demerit: It is based on the assumption that the characteristic to be studied is uniformly
distributed throughout the reference population, which may not always be the case. Hence,
sampling error is usually higher than for a simple random sample of the same size.
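
The cluster sampling procedure can be sketched as follows. The villages and households are hypothetical; note that the sampling frame passed to the random draw is the list of clusters, not a list of individual study units.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters, then include every unit inside them."""
    rng = random.Random(seed)
    selected = rng.sample(sorted(clusters), n_clusters)  # frame = list of cluster names
    units = [unit for name in selected for unit in clusters[name]]
    return selected, units

# Hypothetical reference population: 10 villages with 5 households each
villages = {
    f"village_{v}": [f"v{v}_household_{h}" for h in range(1, 6)]
    for v in range(1, 11)
}
selected_villages, households = cluster_sample(villages, n_clusters=3, seed=1)
print(selected_villages, len(households))
```

Three villages are drawn and all 15 of their households are studied; households in the other seven villages have no chance of appearing once the clusters are chosen.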

D. Systematic Sampling: Systematic sampling involves selecting individuals from a
population at fixed intervals. For example, every 5th person on a list is selected.
1. This method is easy to implement and provides a representative sample if the list is
in random order. Individuals are chosen at regular intervals (for example,
every kth) from the sampling frame.
2. The first unit to be selected is taken at random from among the first k units.

Merits:
1. Systematic sampling is usually less time consuming and easier to perform than
simple random sampling. It provides a good approximation to SRS.
2. Unlike SRS, systematic sampling can be conducted without a sampling frame (useful
when a sampling frame is not available).

Demerits:
1. If there is any sort of cyclic pattern in the ordering of the subjects which coincides with
the sampling interval, the sample will not be representative of the population.
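
Systematic selection with a random start can be sketched as below. The frame of 100 numbered units, the interval k = 5, and the seed are hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Take every kth unit, starting from a random unit among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # 0-based position of the random start
    return frame[start::k]

frame = list(range(1, 101))    # hypothetical list of 100 numbered units
systematic = systematic_sample(frame, k=5, seed=1)
print(systematic)
```

Only the starting unit is random; every later unit is fixed by the interval, which is why a cyclic pattern in the list with the same period as k can bias the sample.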
4. Explain the basic difference between correlation and regression.
The basic difference between correlation and regression is:
Correlation
1. Measures the strength and direction of the relationship between two variables.
2. It quantifies the degree to which changes in one variable are associated with changes in
another variable.
3. Correlation does not imply causation, meaning it does not determine if one variable
causes change in the other.
4. It measures the degree of relationship between two variables (X and Y) and
indicates the degree to which both variables move together.

Regression
1. Is used to model the relationship between a dependent variable and one or more
independent variables.
2. It aims to predict the value of the dependent variable based on the values of the
independent variables.
3. Regression can help identify the nature and strength of the relationship and can be
used for prediction and hypothesis testing.
4. It describes how a change in one variable is associated with a change in another, and
its main purpose is to estimate the value of a random variable (the dependent variable).

We can differentiate Correlation and Regression by the following important points

1. Interchangeable factors

Regression describes how Y changes with X, and the results will change if X and Y are
swapped. With correlation, X and Y are variables that can be interchanged and give the same
result.

2. Single data point vs. equation: correlation is a single statistic or data point, whereas
regression is the entire equation, with all of the data points represented by a
line.
3. Relationship vs. effect: correlation shows the relationship between the two variables,
while regression allows us to see how one changes with the other.
4. Direction of the relationship: regression treats one variable as the outcome and the other
as the predictor, so it describes how the outcome changes when the predictor changes, and not
always in the same direction. On its own it does not establish cause and effect, which
depends on the study design. With correlation, the variables are simply described as moving
together.

In summary, correlation focuses on the relationship between variables, whereas regression
focuses on predicting or explaining the value of a dependent variable based on independent
variables.
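
The point about interchangeable variables can be checked numerically. The sketch below uses a small hypothetical dataset and hand-written formulas for Pearson's correlation coefficient and the least-squares line; it is an illustration, not a replacement for a statistics package.

```python
import math

def _sums_of_squares(x, y):
    """Return mean of x, mean of y, Sxy, Sxx and Syy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return mx, my, sxy, sxx, syy

def pearson_r(x, y):
    """Correlation: one number, the same whichever variable is called X."""
    _, _, sxy, sxx, syy = _sums_of_squares(x, y)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(x, y):
    """Regression of y on x: returns (slope, intercept)."""
    mx, my, sxy, sxx, _ = _sums_of_squares(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx

x = [1, 2, 3, 4, 5]   # hypothetical predictor values
y = [2, 4, 5, 4, 5]   # hypothetical outcome values

r_xy, r_yx = pearson_r(x, y), pearson_r(y, x)   # identical, about 0.77
slope_y_on_x, _ = least_squares_line(x, y)      # about 0.6
slope_x_on_y, _ = least_squares_line(y, x)      # about 1.0
print(r_xy, r_yx, slope_y_on_x, slope_x_on_y)
```

Swapping X and Y leaves the correlation unchanged but gives a different regression line, because regression minimizes errors in the variable chosen as the outcome.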

5. What are the assumptions of logistic regression? Explain and describe at least two.

The assumptions of logistic regression include:


A. Linearity of the logit: The relationship between the independent variables and the log-
odds of the outcome variable is assumed to be linear. This assumption implies that the
effect of each independent variable on the log-odds is constant across all of its levels.
B. Independence of observations: The observations used in logistic regression should be
independent of each other. This assumption ensures that the observations are not
influenced by each other, so that the estimated standard errors and significance tests are
valid.
C. No multicollinearity: Logistic regression assumes that there is no
severe multicollinearity among the explanatory variables. Multicollinearity occurs when two
or more explanatory variables are highly correlated with each other, such that they do not
provide unique or independent information in the regression model. If the degree of
correlation is high enough between variables, it can cause problems when fitting and
interpreting the model.

D. Inclusion of all relevant variables in the regression model.

E. Exclusion of all irrelevant variables from the model.

These assumptions are important to ensure the validity and reliability of the logistic regression
model. Violation of these assumptions may lead to biased or misleading results.
In general, logistic regression is a data analysis technique that uses mathematics to find the
relationship between two data factors. It then uses this relationship to predict the value of
one of those factors based on the other. The prediction usually has a finite number of
outcomes, like yes or no.
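
The linearity of the logit and the yes/no prediction can be sketched with one predictor. The intercept and slope below are hypothetical coefficients, not estimates from any real data, and the 0.5 cut-off is an assumed convention.

```python
import math

def log_odds(x, intercept, slope):
    """The logit is a straight-line function of the predictor."""
    return intercept + slope * x

def predicted_probability(x, intercept, slope):
    """The logistic function turns log-odds into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-log_odds(x, intercept, slope)))

intercept, slope = -3.0, 0.5   # hypothetical coefficients

# Each one-unit increase in x adds the same amount (the slope) to the log-odds
step_low = log_odds(3, intercept, slope) - log_odds(2, intercept, slope)
step_high = log_odds(9, intercept, slope) - log_odds(8, intercept, slope)

p_at_6 = predicted_probability(6, intercept, slope)   # log-odds 0 -> probability 0.5
prediction = "yes" if predicted_probability(8, intercept, slope) >= 0.5 else "no"
print(step_low, step_high, p_at_6, prediction)
```

The change in log-odds per unit of x is constant, which is what assumption A states, while the probability itself follows an S-shaped curve and is converted to a yes or no outcome by a cut-off.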
