OP5205: BUSINESS STATISTICS(2 Credits)

Session 21-24 – Confidence Interval & Hypothesis Testing

Lecture Presentation | Dr. Piyush Kumar


Highlights
• Confidence Interval for the Population Mean When the Population Standard Deviation Is Known

• Confidence Intervals for Population Mean When Population Standard Deviation Is Unknown – The t Distribution

• Null and Alternate Hypothesis

• The Concepts of Hypothesis Testing
▪ Type I and II Errors, p-value, significance level, optimal alpha, beta, sample size

• Computing the p-Value
▪ 1-tailed and 2-tailed tests

• The Hypothesis Test
▪ Test of Hypothesis about Population Mean, Population Proportion, Population Variance
Confidence Interval
• A confidence interval is a range of numbers believed to include an unknown population
parameter. Associated with the interval is a measure of the confidence we have that the
interval does indeed contain the parameter of interest.

• Statement 1: The sample mean is 550.
This gives a point estimate of the population mean.

• Statement 2: We are 99% confident that the population mean is in the interval [449, 551].
This conveys much more information about the possible value of the population mean.

• Statement 3: We are 90% confident that the population mean is in the interval [400, 700].
This interval conveys less information about the possible value of the population mean, both because it is wider and because the level of confidence is lower.

Source: Complete Business Statistics (4e) (TMH)
Confidence Interval
• It is an interval of values computed from the sample, that is almost sure to cover the true
population value.

• A 95% confidence interval means:

▪ If we took many samples and computed an interval from each one, about 95% of those intervals would contain the true population parameter.

▪ Informally, we say we are 95% confident that the true population parameter lies in the interval.
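The repeated-sampling reading of "95% confident" can be illustrated with a small simulation. This is only a sketch: the population values (mean 100, SD 15), the sample size and the number of trials are hypothetical, and the interval used is the known-SD Z interval covered in Scenario 1 of these slides.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, level, trials, seed=0):
    """Fraction of simulated Z intervals that contain the true mean mu."""
    rng = random.Random(seed)                      # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + level / 2)      # two-sided critical value
    half_width = z * sigma / n ** 0.5              # known-SD margin of error
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mean 100, SD 15, samples of size 25
rate = coverage(mu=100.0, sigma=15.0, n=25, level=0.95, trials=4000)
print(f"share of intervals covering the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true mean or it does not; the 95% describes the procedure across many samples.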

Calculation of Confidence Interval – Different Scenarios
• Confidence interval for population mean when population standard deviation is known

• Confidence interval for population mean when population standard deviation is unknown:
Student t-distribution

• Confidence interval for population proportion

• Confidence Intervals for the Population Variance: chi-square distribution

Scenario 1: Confidence Interval for the Population Mean When the Population Standard Deviation Is Known
Scenario 1: CI – For Population Mean when Population SD is Known

95% confidence interval: the Z table entry for an area of 0.475 (= 0.95/2) gives z = 1.96

Z-Table

Calculation of Confidence Interval – Example 1
What is the value of Z for the following:
• 95% confidence interval
• 90% confidence interval
• 99% confidence interval

Calculation of Confidence Interval – Example 1 - Solution
• 95% confidence interval => area 0.475 on each side of the mean => z = ±1.96
• 90% confidence interval => area 0.45 on each side of the mean => z = ±1.645
• 99% confidence interval => area 0.495 on each side of the mean => z = ±2.576
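The table lookups above can be cross-checked in code. A minimal sketch using Python's standard library; the helper name `z_critical` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: the z with area confidence/2 between 0 and z."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = ±{z_critical(level):.3f}")
```

The 99% value comes out near 2.576; printed Z tables often show it rounded to 2.57 or 2.58.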

CI – Example 2
Suppose the sample size is 25, the sample mean is 122, and the population SD is 20. Construct a 95% confidence interval for the population mean.

CI – Example 2 - Solution
95% CI = sample mean ± 1.96 * (population SD / sqrt(n)) = 122 ± 1.96 * (20 / sqrt(25)) = 122 ± 7.84, that is, [114.16, 129.84].

CI – Example 4
Comcast, the computer services company, is planning to invest heavily in online television service. As
part of the decision, the company wants to estimate the average number of online shows a family of
four would watch per day. A random sample of n =100 families is obtained, and in this sample the
average number of shows viewed per day is 6.5 and the population standard deviation is known to be
3.2. Construct a 95% confidence interval for the average number of online television shows watched
by the entire population of families of four.

CI – Example 4 - Solution
Answer: Comcast can be 95% confident that the average family of four within its population of
subscribers will watch an average daily number of online television shows between about 5.87 and
7.13.
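The interval in this answer can be reproduced with a few lines of code. A sketch, assuming the usual known-SD form (sample mean ± z * SD / sqrt(n)); the function name `z_interval` is illustrative. The same helper is applied to the numbers from Example 2.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population SD is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Example 4: n = 100 families, sample mean 6.5, population SD 3.2
low, high = z_interval(xbar=6.5, sigma=3.2, n=100)
print(f"Example 4: [{low:.2f}, {high:.2f}]")

# Example 2: n = 25, sample mean 122, population SD 20
low2, high2 = z_interval(xbar=122, sigma=20, n=25)
print(f"Example 2: [{low2:.2f}, {high2:.2f}]")
```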

Scenario 2: Confidence Interval for the Population Mean When the Population Standard Deviation Is Unknown – The t Distribution
Scenario 2: CI – For Population Mean when Population SD is Unknown – The t Distribution
• When the population standard deviation is not known, we may use the sample standard deviation S
in its place. If the population is normally distributed, the standardized statistic has a t distribution
with n-1 degrees of freedom. The degrees of freedom of the distribution are the degrees of freedom
associated with the sample standard deviation S.

• The t distribution is also called Student’s distribution, or Student’s t distribution.

• The t distribution is characterized by its degrees-of-freedom parameter df. For any integer value df
= 1, 2, 3, . . . , there is a corresponding t distribution. The t distribution resembles the standard
normal distribution Z: it is symmetric and bell-shaped. The t distribution, however, has wider tails
than the Z distribution.
[Confidence Interval Formula – Small Sample (n <= 30)]

• The mean of a t distribution is zero. For df > 2, the variance of the t distribution is equal to df/(df - 2).
• The t distribution thus reflects the uncertainty in two random variables, the sample mean and the sample SD, while Z reflects only the uncertainty due to the sample mean.
• As df increases, the t distribution approaches the Z distribution.
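One quick way to see the t distribution approaching Z is to watch its variance, df/(df - 2), fall toward 1 (the variance of Z) as df grows. A small sketch; the function name is illustrative.

```python
def t_variance(df):
    """Variance of Student's t with df degrees of freedom (defined only for df > 2)."""
    if df <= 2:
        raise ValueError("the variance df/(df - 2) is defined only for df > 2")
    return df / (df - 2)

for df in (3, 5, 10, 30, 100, 1000):
    print(f"df = {df:>4}: variance = {t_variance(df):.4f}")
```

The extra variance at small df is the "wider tails" mentioned above; by df = 30 the variance is already within about 7% of 1.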
For Large Sample – Where Pop SD is unknown – t Converges to Z-distribution

[Confidence Interval Formula – Large Sample (n > 30)]
T-Distribution Table

CI – Example 5 - Small Sample and Population SD Unknown

CI – Example 5 - Solution

Since the sample size is n =15, we need to use the t distribution with n -1 = 14 degrees of freedom. In
Table 3, in the row corresponding to 14 degrees of freedom and the column corresponding to a right-
tail area of 0.025 (this is alpha/2), we find that t(0.025) = 2.145

Thus, the analyst may be 95% sure that the average annualized return on the stock is anywhere from
8.43% to 12.31%.
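The interval reported here can be checked with a short sketch. The problem statement did not survive in these slides, so the sample mean (10.37) and sample SD (3.5) used below are assumptions chosen to be consistent with the reported interval; only n = 15 and t(0.025) = 2.145 come from the solution text.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """CI for the mean with unknown population SD: xbar ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# t(0.025) with 14 degrees of freedom, as read from the t table in the solution
T_CRIT_14DF = 2.145

# ASSUMED inputs (not stated in the slide text): sample mean 10.37, sample SD 3.5
low, high = t_interval(xbar=10.37, s=3.5, n=15, t_crit=T_CRIT_14DF)
print(f"95% CI: [{low:.2f}%, {high:.2f}%]")
```

With the Z value 1.96 in place of 2.145 the interval would be narrower, which would understate the uncertainty that comes from estimating the SD from only 15 observations.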

Null Hypothesis
Alternate and Null Hypothesis
• Hypothesis testing can be used to determine whether a statement about the value of a population
parameter should or should not be rejected.

• The null hypothesis, denoted by H0, is a tentative assumption about a population parameter (the status quo, or "no difference").

• A null hypothesis is an assertion about the value of a population parameter. It is an assertion that we hold as true unless we have sufficient statistical evidence to conclude otherwise.

• For example, a null hypothesis might assert that the population mean is equal to 100. Unless we obtain sufficient evidence that it is not 100, we will accept it as 100. We write the null hypothesis compactly as H0: mu = 100.

• The alternative hypothesis, denoted by Ha, is the opposite of what is stated in the null hypothesis. In this example, Ha: mu ≠ 100.

• The hypothesis testing procedure uses data from a sample to test the two competing statements
indicated by H0 and Ha. Examples - In a court case you look at the evidence, and convict the
person only if there is enough evidence that they are guilty. Your spam filter looks at each email and
rejects it if there is sufficient evidence that it is junk mail. When you give blood they test for HIV and
throw out the blood if there is evidence it is infected.

Alternate and Null Hypothesis
• Hypotheses about other parameters such as population proportion or population variance are also
possible.

• In addition, a hypothesis may assert that the parameter in question is at least or at most some
value.

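• For example, the null hypothesis may assert that the population proportion p is at least 40%. In this case, the null and alternative hypotheses are H0: p >= 0.40 and Ha: p < 0.40.

• As another example, if the null hypothesis asserts that the population variance is at most 50, then H0: sigma^2 <= 50 and Ha: sigma^2 > 50.

• In both scenarios, the equality sign lies in the null hypothesis.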

• Generally what the statistician aims to prove is the alternative hypothesis, the null hypothesis
standing for the status quo, do-nothing situation.

Develop Hypothesis – Null and Alternate
• Alternative Hypothesis as a Research Hypothesis

▪ Many applications of hypothesis testing involve an attempt to gather evidence in support of a research hypothesis.

▪ In such cases, it is often best to begin with the alternative hypothesis and make it the conclusion that the researcher hopes to support.

▪ The conclusion that the research hypothesis is true is made if the sample data provide sufficient evidence to show that the null hypothesis can be rejected.

Develop Hypothesis – Example 1
• Alternative Hypothesis as a Research Hypothesis

Statement 1: A new teaching method is developed that is believed to be better than the current
method

Alternative Hypothesis: The new teaching method is better

Null Hypothesis: The new method is no better than the old method.

Statement 2: A new sales force bonus plan is developed in an attempt to increase sales.

Alternative Hypothesis: The new sales force bonus plan will increase sales

Null Hypothesis: The new sales force bonus plan will not increase sales.

Develop Hypothesis – Example 2
A vendor claims that his company fills any accepted order, on the average, in at most six working days.
You suspect that the average is greater than six working days and want to test the same. How will you
set up the null and alternative hypotheses?

Develop Hypothesis – Example 2 - Solution
The claim is the null hypothesis and the suspicion is the alternative hypothesis. Thus, with mu denoting the average time to fill an order, H0: mu <= 6 days and Ha: mu > 6 days.

Develop Hypothesis – Example 3
A manufacturer of golf balls claims that the variance of the weights of the company’s golf balls is controlled to within 0.0028 oz^2. You suspect it may not be true. If you wish to test this claim, how will you set up the null and alternative hypotheses?

Develop Hypothesis – Example 3 - Solution
The claim is the null hypothesis. Thus, with sigma^2 denoting the variance, H0: sigma^2 <= 0.0028 oz^2 and Ha: sigma^2 > 0.0028 oz^2.

Develop Hypothesis – Example 4
At least 20% of the visitors to a particular commercial Web site where an electronic product is sold are
said to end up ordering the product. If you wish to test this claim, how will you set up the null and
alternative hypotheses? With p denoting the proportion of visitors ordering the product

Develop Hypothesis – Example 4 - Solution
The claim is the null hypothesis. With p denoting the proportion of visitors ordering the product, H0: p >= 0.20 and Ha: p < 0.20.

Concept of Hypothesis Testing
Concept of Hypothesis Testing
• Type I and Type II Errors

• P-value

• Significance Level (alpha)

• Optimal alpha and the Compromise between Type I and Type II Errors

• Beta and Power

• Sample Size

Concept of Hypothesis Testing – Type I and II Errors
Type I and Type II Errors
• Accept-or-reject decisions are often driven by sample-based evidence, for example:
• An inspector has to accept or reject a batch of parts supplied by a vendor, usually based on test results of a random sample.
• A recruiter has to accept or reject a job applicant, usually based on evidence gathered from a résumé and interview.
• A bank manager has to accept or reject a loan application, usually based on financial data on the application. Accepting an application that should have been rejected corresponds to a Type II error; rejecting one that should have been accepted corresponds to a Type I error.

• As long as such decisions are made based on evidence that does not provide 100% confidence, there will be chances for error.

• In the context of statistical hypothesis testing,
• Rejecting a true null hypothesis is known as a Type I error; and
• Accepting a false null hypothesis is known as a Type II error.

• How can Type I and Type II errors be minimized?
• Always accepting the null hypothesis makes the probability of a Type I error zero.
• Always rejecting the null hypothesis makes the probability of a Type II error zero.
Both of these options are impractical, because each one drives the probability of the other type of error to its maximum.

Type I & II Errors – Avoidance Options – p-value
Fundamentally, it is not possible to avoid Type I or Type II errors completely. Depending upon the context, one can make a choice that makes one type of error more probable than the other. The possibilities are as follows:

1. Concept of p-value - Definition
Given a null hypothesis and sample evidence with sample size n, the p-value is the probability of getting sample evidence that is equally or more unfavorable to the null hypothesis while the null hypothesis is actually true. The p-value is calculated giving the null hypothesis the maximum benefit of doubt.

Type I & II Errors – Avoidance Options – Significance Level
2. Significance Level
Points of Attention
Point 1: If we do not reject H0, this does not prove that H0 is true.

Point 2: Alpha is the maximum probability of type I error that we set for ourselves. Since alpha is the maximum p-value at which we reject H0, it is the maximum probability of committing a type I error. In other words, setting alpha to 5% means that we are willing to put up with up to a 5% chance of committing a type I error.

Point 3: Increasing the value of alpha will decrease the probability of type II error. For example, increasing alpha from 5% to 10% means that in those instances with a p-value in the range 5% to 10%, the H0 that would not have been rejected before would now be rejected.

Type I & II Errors – Avoidance Options – Optimal Alpha
• Selecting a value for alpha is a question of compromise between type I and type II error probabilities.

• This requires knowing the cost of each type of error, and cost estimation is difficult. Hence, the normal approach is to intuitively assign one of the three standard values, 1%, 5%, or 10%, to alpha.

• For example, suppose we are testing the average tensile strength of a large batch of bolts produced by a machine to see if it is above the minimum specified.
▪ Type I error => Rejecting a good batch of bolts. The cost of the error is roughly equal to the cost of the batch of bolts.

▪ Case 1: Type II error => Accepting a bad batch of bolts. Its cost can be high or low depending on how the bolts are used. If the bolts are used to hold together a structure, then the cost is high because defective bolts can result in the collapse of the structure, causing great damage. In this case, we should strive to reduce the probability of type II error more than that of type I error by keeping a large value of alpha, namely, 10%.

▪ Case 2: If the bolts are used to secure the lids on trash cans, then the cost of type II error is not high and we should strive to reduce the probability of type I error more than that of type II error. In such cases, where type I error is more costly, we keep a small value for alpha, namely, 1%.

▪ Case 3: Then there are cases where we are not able to determine which type of error is more costly. If the costs are roughly equal, or if we do not have much knowledge about the relative costs of the two types of errors, then we keep alpha at 5%.
Type I & II Errors – Avoidance Options – Beta & Sample Size
Beta and Power
• The symbol used for the probability of type II error is beta.
• The complement of beta (1 - beta) is known as the power of the test.
• The power of a test is the probability that a false null hypothesis will be detected by the test.

Sample Size and Type I / II Errors
• For a fixed alpha, a larger sample reduces beta. So when the costs of both types of error are high, the best policy is to have a large sample and a low alpha, such as 1%.
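The interplay of alpha, beta and sample size can be made concrete with a small power calculation. This is a sketch for one specific case, a left-tailed Z test with known population SD; the numbers (hypothesized mean 100, true mean 98, SD 10) are hypothetical and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power_left_tailed(mu0, mu_true, sigma, n, alpha):
    """Power (1 - beta) of the left-tailed Z test of H0: mu >= mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(alpha)            # negative cutoff: reject H0 when Z < z_alpha
    shift = (mu0 - mu_true) / (sigma / sqrt(n))    # distance of the true mean below mu0, in standard errors
    return std_normal.cdf(z_alpha + shift)

# Hypothetical setting: H0 says mu >= 100, but the true mean is 98 and the SD is 10
for n in (25, 100):
    for alpha in (0.01, 0.05, 0.10):
        pw = power_left_tailed(mu0=100, mu_true=98, sigma=10, n=n, alpha=alpha)
        print(f"n = {n:>3}, alpha = {alpha:.2f}: power = {pw:.3f}, beta = {1 - pw:.3f}")
```

Reading down the output, raising alpha lowers beta at a given n, and a larger n lowers beta at every alpha, which is why a large sample lets both error probabilities be kept small.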

Computing p value
How to get p-value
• The p-value is the probability of getting evidence that is equally or more unfavorable to H0, assuming H0 is true.
• Consider an example where the population SD is given, the population mean under H0 is 1000, and n > 30. We calculate the Z value (normal distribution) and then obtain the p-value from the Z table.

• On the basis of the p-value, we determine whether or not to reject H0 (by comparing it with alpha).

• In this scenario, since the calculation of Z determines the p-value, Z is also known as the test statistic.

• A test statistic is a random variable calculated from the sample evidence, which follows a well-known distribution and thus can be used to calculate the p-value.

• Similarly, the t value, Chi-Square value, and F value are also test statistics, as they can also be used to determine the p-value.

P-value calculation – Example 5
• Consider the example where the population SD is given, the population mean under H0 is 1000, and n > 30. We calculate the Z value (normal distribution) and then obtain the p-value from the Z table. The Z value is the test statistic here.

• Suppose the population standard deviation is 5 and the sample size n is 100.

P-value calculation – 1-Tailed (Left and Right) Tests

P-value calculation – 2-Tailed Test

P-value calculation – Example 6
In a hypothesis test, the test statistic Z = -1.86.
1. Find the p-value if the test is (a) left-tailed, (b) right-tailed, and (c) two-tailed.
2. In which of these three cases will H0 be rejected at an alpha of 5%?

P-value calculation – Example 6 - Solution
1(a) The area to the left of -1.86, from the tables, is 0.5 - 0.4686 = 0.0314, or the p-value is 3.14%.

1(b) The area to the right of -1.86, from the tables, is 0.5 + 0.4686 = 0.9686, or the p-value is 96.86%. (Such a
large p-value means that the evidence greatly favors H0, and there is no basis for rejecting H0.)

1(c) The value -1.86 falls on the left tail. The area to the left of -1.86 is 3.14%. Multiplying that by 2, we get
6.28%, which is the p-value.

2. Only in the case of a left-tailed test does the p-value fall below the alpha of 5%. Hence that is the only
case where H0 will be rejected.
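The three tail areas in this solution can be computed directly instead of being read from the Z table. A minimal sketch; the helper name and the dictionary layout are illustrative choices.

```python
from statistics import NormalDist

def p_values(z):
    """Left-tailed, right-tailed and two-tailed p-values for a Z test statistic."""
    left = NormalDist().cdf(z)          # area to the left of z
    right = 1 - left                    # area to the right of z
    two = 2 * min(left, right)          # double the smaller tail
    return {"left-tailed": left, "right-tailed": right, "two-tailed": two}

alpha = 0.05
pv = p_values(-1.86)
for tail, p in pv.items():
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{tail:>12}: p = {p:.4f} -> {decision}")
```

The computed values agree with the table-based ones to about three decimal places; small differences in the last digit come from table rounding.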

The Hypothesis Test – Population Mean
The Hypothesis Test
1. Tests of hypotheses about population means.

2. Tests of hypotheses about population proportions.

3. Tests of hypotheses about population variances.

Test Population Means
• When the null hypothesis is about a population mean, the test statistic can be either Z or t.

Cases in Which the Test Statistic Is Z – Two Scenarios
1. Population SD (Sigma) is known and the population is normal.
2. Population SD (Sigma) is known and the sample size is at least 30. (The population need not be normal.)

Here, the normality of the population may be established by direct tests, or it may be assumed based on the nature of the population. Z is calculated as:
Z = (x-bar - mu0) / (Sigma / sqrt(n)), where mu0 is the hypothesized population mean.

Cases in Which the Test Statistic Is t
The population is normal and the population SD (Sigma) is unknown, but the sample standard deviation S is known. In that case t = (x-bar - mu0) / (S / sqrt(n)), with n - 1 degrees of freedom.

Note: Since the t table provides only the critical values, it cannot be used to find exact p-values. For example, if the calculated value of t is 2.000 and the degrees of freedom are 24, we see from the tables that t(0.05) is 1.711 and t(0.025) is 2.064. Thus, the one-tailed p-value corresponding to t = 2.000 must be somewhere between 0.025 and 0.05, but we don’t know its exact value.
Test Population Means – Example 7
An automatic bottling machine fills cola into 2-liter (2,000-cm^3) bottles. A consumer advocate wants to test the null hypothesis that the average amount filled by the machine into a bottle is at least 2,000 cm^3. A random sample of 40 bottles coming out of the machine was selected and the exact contents of the selected bottles were recorded.
The sample mean was 1,999.6 cm^3. The population standard deviation is known from past experience to be 1.30 cm^3.
1. Test the null hypothesis at an alpha of 5%.

2. Assume that the population is normally distributed with the same population SD of 1.30 cm^3. Assume that the sample size is only 20 but the sample mean is the same 1,999.6 cm^3. Conduct the test once again at an alpha of 5%.

3. If there is a difference in the two test results, explain the reason for the difference.

Test Population Means – Example 7 - Solution
1. Test the null hypothesis at an alpha of 5%.

2. Assume that the population is normally distributed with the same population SD of 1.30 cm^3. Assume that the sample size is only 20 but the sample mean is the same 1,999.6 cm^3. Conduct the test once again at an alpha of 5%.

3. If there is a difference in the two test results, explain the reason for the difference.
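The worked solution is not reproduced in this text, so the following is a sketch of how the three parts could be answered, assuming a left-tailed Z test (H0: mu >= 2,000 against Ha: mu < 2,000) with the known population SD. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Z statistic, left-tail p-value and reject decision for H0: mu >= mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p = NormalDist().cdf(z)             # area to the left of z
    return z, p, p < alpha

# Part 1: n = 40
z40, p40, reject40 = left_tailed_z_test(1999.6, 2000, 1.30, 40)
print(f"n = 40: Z = {z40:.3f}, p = {p40:.4f}, reject H0: {reject40}")

# Part 2: n = 20, same sample mean and population SD
z20, p20, reject20 = left_tailed_z_test(1999.6, 2000, 1.30, 20)
print(f"n = 20: Z = {z20:.3f}, p = {p20:.4f}, reject H0: {reject20}")
```

Under these assumptions H0 is rejected with n = 40 but not with n = 20. For part 3, the sample mean is the same in both cases, but the smaller sample has a larger standard error (Sigma / sqrt(n)), so the same 0.4 cm^3 shortfall is weaker evidence against H0.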

The Hypothesis Test – Population Proportion
Test Population Proportion
• Hypotheses about population proportions can be tested using the binomial distribution or the normal approximation to calculate the p-value.

• The cases in which each approach is to be used are detailed below.

Scenario 1 - Cases in Which the Binomial Distribution Can Be Used
The binomial distribution can be used whenever we are able to calculate the necessary binomial probabilities. For calculations using tables, this means the sample size n and the population proportion p must be among the tabulated values.

When the binomial distribution is used, the number of successes X serves as the test statistic. The p-value is the appropriate tail area, determined by X (which follows a discrete distribution), of the binomial distribution defined by n and the hypothesized value of the population proportion p.

Scenario 2 - Cases in Which the Normal Approximation Is to Be Used
If the sample size n is too large (n > 500) to calculate binomial probabilities, then the normal approximation method is to be used.

Test Population Proportion – Binomial Distribution - Example 8
A coin is to be tested for fairness. It is tossed 25 times and only 8 heads are observed. Test if the coin
is fair at alpha = 5%

Test Population Proportion – Using Binomial Distribution - Example 8- Solution

P(X <= 8) = BINOM.DIST(8,25,0.5,TRUE) = 0.053876

The test for fairness is two-tailed (H0: p = 0.5, Ha: p ≠ 0.5), so the p-value is 2 * 0.053876 = 0.107752, which is > 5%. H0 (the coin is fair) is not rejected.
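The spreadsheet value can be reproduced without Excel. A sketch using only the Python standard library; `binom_cdf` is an illustrative helper, and doubling the tail for the two-tailed p-value relies on the symmetry of the binomial distribution when p = 0.5.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

alpha = 0.05
left_tail = binom_cdf(8, 25, 0.5)        # same quantity as BINOM.DIST(8,25,0.5,TRUE)
p_value = min(1.0, 2 * left_tail)        # two-tailed test of H0: p = 0.5

print(f"P(X <= 8) = {left_tail:.6f}")
print(f"two-tailed p-value = {p_value:.6f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Even the one-tailed area is just above 5%, so 8 heads in 25 tosses is not enough evidence to call the coin unfair at alpha = 5%.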

Test Population Proportion – Using Normal Distribution (n>500)
Cases in Which the Normal Approximation Is to Be Used
If the sample size n is too large (n > 500) to calculate binomial probabilities, then the normal approximation method is to be used.

Z = ((132/210) – 0.7) / SQRT((0.7*0.3)/210) = -2.259

Z table area for |Z| = 2.26 is 0.4881
Tail area = 0.5 – 0.4881 = 0.0119
2-tailed => p = 2 * 0.0119 = 0.0238, which is < 5%
H0 is rejected
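The same calculation can be sketched in code. The problem statement is not reproduced in these slides, so the inputs (132 successes out of 210, hypothesized proportion 0.7, two-tailed test) are taken from the arithmetic shown above; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Normal-approximation Z statistic and two-tailed p-value for H0: p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_two_tailed = 2 * NormalDist().cdf(-abs(z))   # double the tail beyond |z|
    return z, p_two_tailed

z, p = proportion_z_test(successes=132, n=210, p0=0.7)
print(f"Z = {z:.3f}, two-tailed p-value = {p:.4f}")
print("reject H0" if p < 0.05 else "do not reject H0")
```

The statistic comes out negative because the sample proportion 132/210 is below 0.7; the two-tailed p-value depends only on its magnitude.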

The Hypothesis Test – Population Variances
Test Population Variances

Test Population Variance – Example 9
A manufacturer of golf balls claims that the company controls the weights of the golf balls accurately so that the variance of the weights is not more than 1 mg^2. A random sample of 31 golf balls yields a sample variance of 1.62 mg^2. Is that sufficient evidence to reject the claim at an alpha of 5%?

Test Population Variance – Example 9 - Solution
H0: sigma^2 <= 1 and Ha: sigma^2 > 1 (right-tailed test)

Chi-Sq = (n-1)s^2/sigma^2
= (31-1)*1.62/1
= 48.6

At 30 df, the right-tail area beyond Chi-Sq = 48.6 is approximately 0.0173.

p = 0.0173, which is less than 0.05.
So, H0 is rejected at alpha = 0.05.
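The right-tail area can be checked without a chi-square table. The sketch below uses a closed form that holds only when the degrees of freedom are even (here df = 30): the tail area equals exp(-x/2) times the sum of (x/2)^i / i! for i from 0 to df/2 - 1. The function name is illustrative, and a statistics library would normally be used for general df.

```python
from math import exp

def chi2_sf_even_df(x, df):
    """Right-tail area P(X > x) for a chi-square variable with an even number of df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term = 1.0                      # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i            # builds (x/2)^i / i! from the previous term
        total += term
    return exp(-half) * total

n, s2, sigma2_0 = 31, 1.62, 1.0
chi_sq = (n - 1) * s2 / sigma2_0
p = chi2_sf_even_df(chi_sq, n - 1)
print(f"Chi-Sq = {chi_sq:.1f}, p-value = {p:.4f}")
print("reject H0" if p < 0.05 else "do not reject H0")
```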
