# 11.4 Test for homogeneity

## Try it

Do families and singles have the same distribution of cars? Suppose that 100 randomly selected families and 200 randomly selected singles were asked what type of car they drove: sport, sedan, hatchback, truck, or van/SUV. The results are shown in the table below. Test at a level of significance of 0.05.

| | Sport | Sedan | Hatchback | Truck | Van/SUV |
|---|---|---|---|---|---|
| Family | 5 | 15 | 35 | 17 | 28 |
| Single | 45 | 65 | 37 | 46 | 7 |

With a $p$-value of almost zero, we reject the null hypothesis. The data show that the distribution of cars is not the same for families and singles.
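The decision above can also be reached by comparing the test statistic with a critical value. The following is a minimal sketch in plain Python, not part of the original exercise. It assumes the usual expected count, E = (row total)(column total)/(grand total), and takes 9.488 as the chi-square table critical value for four degrees of freedom at α = 0.05.

```python
# Observed counts from the car-type table (rows: Family, Single).
observed = [
    [5, 15, 35, 17, 28],   # Family
    [45, 65, 37, 46, 7],   # Single
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square statistic: sum of (O - E)^2 / E over every cell,
# where E = row total * column total / grand total (assumed formula).
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

# Degrees of freedom: (rows - 1)(columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: table critical value for df = 4 at alpha = 0.05.
CRITICAL_VALUE = 9.488
reject_null = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
```

The statistic comes out near 62.9, far beyond the critical value, which is consistent with a $p$-value of almost zero.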

Both before and after a recent earthquake, surveys were conducted asking voters which of the three candidates they planned on voting for in the upcoming city council election. The table below shows the results of the surveys. Has there been a change in the distribution of voter preferences since the earthquake? Use a level of significance of 0.05.

| | Perez | Chung | Stevens |
|---|---|---|---|
| Before | 167 | 128 | 135 |
| After | 214 | 197 | 225 |

$H_0$: The distribution of voter preferences was the same before and after the earthquake.

$H_a$: The distribution of voter preferences was not the same before and after the earthquake.

Degrees of freedom ($df$):
$df$ = (number of rows - 1)(number of columns - 1) = (2 - 1)(3 - 1) = 2

Distribution for the test: ${\chi }_{2}^{2}$
Calculate the test statistic: $\chi^2$ = 3.2603 (calculator or computer)
Probability statement: $p$-value = $P(\chi^2 > 3.2603)$ = 0.1959

Press the MATRX key and arrow over to EDIT. Press 1:[A]. Press 2 ENTER 3 ENTER. Enter the table values by row, pressing ENTER after each. Press 2nd QUIT. Press STAT and arrow over to TESTS. Arrow down to C:χ2-TEST. Press ENTER. You should see Observed:[A] and Expected:[B]. Arrow down to Calculate. Press ENTER. The test statistic is 3.2603 and the $p$-value = 0.1959. Do the procedure a second time, but arrow down to Draw instead of Calculate.

Compare α and the $p$-value: α = 0.05 and the $p$-value = 0.1959, so α < $p$-value.

Make a decision: Since α < $p$-value, do not reject $H_0$.

Conclusion: At a 5% level of significance, from the data, there is insufficient evidence to conclude that the distribution of voter preferences was not the same before and after the earthquake.
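The calculator steps above can be cross-checked with a short script. This is a sketch rather than the textbook's method. The helper name `chi_square_statistic` is hypothetical, the expected count is assumed to be (row total)(column total)/(grand total), and the shortcut `exp(-x/2)` for the right-tail area is valid only when $df$ = 2.

```python
import math

# Observed survey counts (columns: Perez, Chung, Stevens).
observed = [
    [167, 128, 135],  # Before the earthquake
    [214, 197, 225],  # After the earthquake
]

def chi_square_statistic(table):
    """Return (statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected count
            stat += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

stat, df = chi_square_statistic(observed)

# For df = 2 only, the chi-square right-tail area is exp(-x / 2).
p_value = math.exp(-stat / 2)

alpha = 0.05
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"chi-square = {stat:.4f}, df = {df}, p-value = {p_value:.4f}: {decision}")
```

The script reproduces the statistic 3.2603 and the $p$-value 0.1959, so the decision is again not to reject the null hypothesis.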

## Try it

Ivy League schools receive many applications, but only some can be accepted. At the schools listed in the table below, two types of applications are accepted: regular and early decision.

| Application type accepted | Brown | Columbia | Cornell | Dartmouth | Penn | Yale |
|---|---|---|---|---|---|---|
| Regular | 2,115 | 1,792 | 5,306 | 1,734 | 2,685 | 1,245 |
| Early Decision | 577 | 627 | 1,228 | 444 | 1,195 | 761 |

We want to know if the number of regular applications accepted follows the same distribution as the number of early applications accepted. State the null and alternative hypotheses, the degrees of freedom, and the test statistic; sketch the graph of the $p$-value; and draw a conclusion about the test of homogeneity.

$H_0$: The distribution of regular applications accepted is the same as the distribution of early applications accepted.

$H_a$: The distribution of regular applications accepted is not the same as the distribution of early applications accepted.

$df$ = 5

$\chi^2$ test statistic = 430.06

Press the MATRX key and arrow over to EDIT. Press 1:[A]. Press 2 ENTER 6 ENTER (the table has two rows and six columns). Enter the table values by row, pressing ENTER after each. Press 2nd QUIT. Press STAT and arrow over to TESTS. Arrow down to C:χ2-TEST. Press ENTER. You should see Observed:[A] and Expected:[B]. Arrow down to Calculate. Press ENTER. The test statistic is 430.06 and the $p$-value = 9.80E-91. Do the procedure a second time, but arrow down to Draw instead of Calculate.
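With $df$ = 5 the right-tail area no longer has the one-line form available for two degrees of freedom. The sketch below uses only Python's standard `math` module. The function `chi_square_sf` is a hypothetical helper, not a library call. It evaluates the tail area for any positive integer $df$ from the finite-sum identities for the incomplete gamma function, and the expected counts are again assumed to be (row total)(column total)/(grand total).

```python
import math

def chi_square_sf(x, df):
    """Right-tail area P(chi-square with df degrees of freedom > x), integer df >= 1."""
    y = x / 2
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        return math.exp(-y) * sum(y ** k / math.factorial(k) for k in range(df // 2))
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{(df-1)/2} y^(k - 1/2) / Gamma(k + 1/2)
    series = sum(
        y ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * series

# Accepted applications (columns: Brown, Columbia, Cornell, Dartmouth, Penn, Yale).
observed = [
    [2115, 1792, 5306, 1734, 2685, 1245],  # Regular
    [577, 627, 1228, 444, 1195, 761],      # Early decision
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Sum (O - E)^2 / E over all twelve cells.
stat = sum(
    (o - row_totals[i] * col_totals[j] / total) ** 2
    / (row_totals[i] * col_totals[j] / total)
    for i, row in enumerate(observed)
    for j, o in enumerate(row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_sf(stat, df)

print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.2e}")
```

The result agrees with the calculator: a statistic of about 430.06 with five degrees of freedom and a $p$-value on the order of 10⁻⁹⁰, so the null hypothesis is rejected.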

## Chapter review

To assess whether two data sets come from the same distribution, which need not be known, you can apply the test for homogeneity, which uses the chi-square distribution. The null hypothesis for this test states that the populations of the two data sets follow the same distribution. The test compares the observed values against the values that would be expected if the two populations followed the same distribution. The test is right-tailed. Each cell category must have an expected value of at least five.
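The condition applies to the expected counts, not to the observed counts. As an illustrative sketch, assuming the usual expected count E = (row total)(column total)/(grand total), the car-type table from the first Try it can be checked as follows. One observed cell there is exactly 5 and another is 7, yet every expected count is well above five.

```python
# Observed counts from the car-type table (rows: Family, Single).
observed = [
    [5, 15, 35, 17, 28],   # Family
    [45, 65, 37, 46, 7],   # Single
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count for each cell under the null hypothesis of one shared distribution.
expected = [[r * c / total for c in col_totals] for r in row_totals]

smallest_expected = min(min(row) for row in expected)
condition_met = smallest_expected >= 5

print(f"smallest expected count = {smallest_expected:.2f}, condition met: {condition_met}")
```

Here the smallest expected count belongs to the Family, Van/SUV cell (100 × 35 / 300), so the test is appropriate for this table.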

## Formula review

$\chi^2 = \sum _{i\cdot j}\frac{{\left(O-E\right)}^{2}}{E}$ Homogeneity test statistic, where:
O = observed values
E = expected values
i = number of rows in data contingency table
j = number of columns in data contingency table

$df = (i-1)(j-1)$ Degrees of freedom
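The formulas above do not say how each expected value is obtained. As a hedged illustration, the standard choice under the null hypothesis (an assumption here, since this section does not state it) is the row total times the column total divided by the grand total. For the Before/Perez cell of the voter table, whose row total is 430, column total is 381, and grand total is 1,066:

$$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}} = \frac{430 \cdot 381}{1066} \approx 153.69, \qquad \frac{(O-E)^2}{E} = \frac{(167-153.69)^2}{153.69} \approx 1.15$$

Adding the corresponding term for each of the six cells gives the test statistic 3.2603 reported in the example.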

A math teacher wants to see if two of her classes have the same distribution of test scores. What test should she use?

test for homogeneity

What are the null and alternative hypotheses for the previous exercise?

A market researcher wants to see if two different stores have the same distribution of sales throughout the year. What type of test should he use?

test for homogeneity

A meteorologist wants to know if East and West Australia have the same distribution of storms. What type of test should she use?

What condition must be met to use the test for homogeneity?

All expected values in the table must be greater than or equal to five.

Use the following information to answer the next five exercises: Do private practice doctors and hospital doctors have the same distribution of working hours? Suppose that samples of 100 private practice doctors and 150 hospital doctors are selected at random and asked about the number of hours a week they work. The results are shown in the table below.

| | 20–30 | 30–40 | 40–50 | 50–60 |
|---|---|---|---|---|
| Private Practice | 16 | 40 | 38 | 6 |
| Hospital | 8 | 44 | 59 | 39 |

State the null and alternative hypotheses.

df = _______

3

What is the test statistic?

What is the $p$-value?

0.00005

What can you conclude at the 5% significance level?
