This module describes how the chi-square distribution is used to conduct a goodness-of-fit test.

In this type of hypothesis test, you determine whether the data "fit" a particular distribution or not. For example, you may suspect your unknown data fit a binomial distribution. You use a chi-square test (meaning the distribution for the hypothesis test is chi-square) to determine if there is a fit or not. The null and the alternate hypotheses for this test may be written in sentences or may be stated as equations or inequalities.

The test statistic for a goodness-of-fit test is:

$$\sum_{k} \frac{(O - E)^2}{E}$$

where:

  • O = observed values (data)
  • E = expected values (from theory)
  • k = the number of different data cells or categories

The observed values are the data values and the expected values are the values you would expect to get if the null hypothesis were true. There are k terms of the form $\frac{(O - E)^2}{E}$, one for each category.

The degrees of freedom are df = (number of categories - 1) .
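
As a minimal sketch of the formula and the degrees-of-freedom rule above, the Python function below adds up (O - E)^2 / E over the categories. The function name `chi_square_statistic` and the two-category counts are hypothetical, chosen only for illustration; they do not come from this module.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts (not from the text): 100 trials in two categories,
# with the null hypothesis expecting a 50/50 split.
observed = [45, 55]
expected = [50, 50]

statistic = chi_square_statistic(observed, expected)  # 25/50 + 25/50
df = len(observed) - 1                                # number of categories - 1

print(f"test statistic = {statistic}, df = {df}")
```

Each category contributes one term to the sum, so two categories give two terms and df = 1.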

The goodness-of-fit test is almost always right tailed. If the observed values and the corresponding expected values are not close to each other, then the test statistic can get very large and will be far out in the right tail of the chi-square curve.

The expected value for each cell needs to be at least 5 in order to use this test.

Absenteeism of college students from math classes is a major concern to math instructors because missing class appears to increase the drop rate. Suppose that a study was done to determine if the actual student absenteeism follows faculty perception. The faculty expected that a group of 100 students would miss class according to the following chart.

Number of absences per term | Expected number of students
0 - 2 | 50
3 - 5 | 30
6 - 8 | 12
9 - 11 | 6
12+ | 2

A random survey across all mathematics courses was then done to determine the actual number (observed) of absences in a course. The next chart displays the result of that survey.

Number of absences per term | Actual number of students
0 - 2 | 35
3 - 5 | 40
6 - 8 | 20
9 - 11 | 1
12+ | 4

Determine the null and alternate hypotheses needed to conduct a goodness-of-fit test.

$H_0$: Student absenteeism fits faculty perception.

The alternate hypothesis is the opposite of the null hypothesis.

$H_a$: Student absenteeism does not fit faculty perception.

Can you use the information as it appears in the charts to conduct the goodness-of-fit test?

No. Notice that the expected number of students for the "12+" entry is less than 5 (it is 2). Combine that group with the "9 - 11" group to create new tables in which the number of students for each entry is at least 5. The new tables are below.

Number of absences per term | Expected number of students
0 - 2 | 50
3 - 5 | 30
6 - 8 | 12
9+ | 8

Number of absences per term | Actual number of students
0 - 2 | 35
3 - 5 | 40
6 - 8 | 20
9+ | 5
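
The combining step can be sketched in Python as follows. The helper name `combine_small_tail_cells` is hypothetical, and the sketch assumes that only the last category can fall below the minimum, which holds for these tables but not in general.

```python
def combine_small_tail_cells(expected, observed, minimum=5):
    """Merge the last two cells until the last expected count is >= minimum.

    Only the trailing cell is checked; a small expected count elsewhere
    in the table would need separate handling.
    """
    expected, observed = list(expected), list(observed)
    while len(expected) > 1 and expected[-1] < minimum:
        expected[-2:] = [expected[-2] + expected[-1]]
        observed[-2:] = [observed[-2] + observed[-1]]
    return expected, observed


# Counts from the original tables, in the order 0 - 2, 3 - 5, 6 - 8, 9 - 11, 12+
expected = [50, 30, 12, 6, 2]
observed = [35, 40, 20, 1, 4]

new_expected, new_observed = combine_small_tail_cells(expected, observed)
print(new_expected)  # [50, 30, 12, 8]
print(new_observed)  # [35, 40, 20, 5]
```

The observed counts are merged in parallel with the expected counts so that both tables keep the same categories.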

What are the degrees of freedom ( df )?

There are 4 "cells" or categories in each of the new tables.

df = number of cells - 1 = 4 - 1 = 3
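
For illustration, the test statistic for the combined tables can be computed directly from the formula. The right-tail area below uses a closed form that is valid only for df = 3; the helper name `right_tail_area_df3` is hypothetical, and a general-purpose routine such as SciPy's `scipy.stats.chi2.sf` would be the usual choice if that library is available.

```python
import math


def right_tail_area_df3(x):
    """Right-tail area of the chi-square distribution for df = 3 only."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Combined tables from the text, in the order 0 - 2, 3 - 5, 6 - 8, 9+
observed = [35, 40, 20, 5]
expected = [50, 30, 12, 8]

# Sum of (O - E)^2 / E over the four categories
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

p_value = right_tail_area_df3(statistic)
print(f"test statistic = {statistic:.2f}, df = {df}, p-value = {p_value:.4f}")
```

The four terms are 4.5, 3.33, 5.33, and 1.125, so the statistic is about 14.29, which lies far out in the right tail of the chi-square curve with 3 degrees of freedom.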

Employers particularly want to know which days of the week employees are absent in a five-day work week. Most employers would like to believe that employees are absent equally during the week. Suppose a random sample of 60 managers was asked on which day of the week they had the highest number of employee absences. The results were distributed as follows:

Day of the week employees were most absent
Day | Monday | Tuesday | Wednesday | Thursday | Friday
Number of absences | 15 | 12 | 9 | 9 | 15
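
A minimal sketch of how these counts could be tested, assuming the null hypothesis is that absences are spread equally across the five days (the employers' belief stated above). Under that assumption each day has an expected count of 60 / 5 = 12. The right-tail area uses a closed form that is valid only for df = 4.

```python
import math

# Counts from the table, Monday through Friday
observed = [15, 12, 9, 9, 15]

# Assumed null hypothesis: absences are spread equally across the five days
total = sum(observed)                                # 60 managers
expected = [total / len(observed)] * len(observed)   # 12 per day

# Sum of (O - E)^2 / E over the five days
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Right-tail area of the chi-square distribution for df = 4 only
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)

print(f"test statistic = {statistic:.2f}, df = {df}, p-value = {p_value:.4f}")
```

Every expected count is 12, which satisfies the requirement that each cell have an expected value of at least 5.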

Source:  OpenStax, Collaborative statistics. OpenStax CNX. Jul 03, 2012 Download for free at http://cnx.org/content/col10522/1.40