This module describes how the chi-square distribution can be used to test for independence.

Tests of independence involve using a contingency table of observed (data) values. You first saw a contingency table when you studied probability in the Probability Topics chapter.

The test statistic for a test of independence is similar to that of a goodness-of-fit test:

$$\sum_{(i \cdot j)} \frac{(O - E)^2}{E}$$

where:

  • O = observed values
  • E = expected values
  • i = the number of rows in the table
  • j = the number of columns in the table

There are $i \cdot j$ terms of the form $\frac{(O - E)^2}{E}$.
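The sum can be sketched in a few lines of Python. This is a minimal illustration, not part of the original module: the function name `chi_square_statistic` and the 2 x 2 counts below are made up purely to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Add up (O - E)^2 / E over every cell of the table."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of rows")
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        if len(obs_row) != len(exp_row):
            raise ValueError("observed and expected need the same number of columns")
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Made-up 2 x 2 counts, for illustration only (not data from the text).
observed = [[12, 8], [8, 12]]
expected = [[10, 10], [10, 10]]
statistic = chi_square_statistic(observed, expected)
print(statistic)
```

With i = 2 rows and j = 2 columns, the loop adds four terms, one per cell.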

A test of independence determines whether two factors are independent or not. You first encountered the term independence in Chapter 3. As a review, consider the following example.

The expected value for each cell needs to be at least 5 in order to use this test.

Suppose A = a speeding violation in the last year and B = a cell phone user while driving. If A and B are independent, then P(A AND B) = P(A)P(B). A AND B is the event that a driver received a speeding violation last year and is also a cell phone user while driving.

Suppose that, in a study of drivers who received speeding violations in the last year and who use cell phones while driving, 755 people were surveyed. Out of the 755, 70 had a speeding violation and 685 did not; 305 were cell phone users while driving and 450 were not.

Let y = the expected number of drivers who use a cell phone while driving and received a speeding violation.

If A and B are independent, then P(A AND B) = P(A)P(B). By substitution,

$$\frac{y}{755} = \frac{70}{755} \cdot \frac{305}{755}$$

Solve for y:

$$y = \frac{70 \cdot 305}{755} = 28.3$$

About 28 people from the sample are expected to be cell phone users while driving and to receive speeding violations.
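The same arithmetic can be checked in Python. This is only a sketch of the calculation above; the variable names are chosen here for illustration, and the counts are the ones given in the example.

```python
# Counts from the example in the text.
n_surveyed = 755
n_speeding = 70      # drivers with a speeding violation (event A)
n_cell_users = 305   # cell phone users while driving (event B)

# Under independence, P(A AND B) = P(A) * P(B).
p_a = n_speeding / n_surveyed
p_b = n_cell_users / n_surveyed

# Expected number in both groups = P(A AND B) * sample size.
expected_both = p_a * p_b * n_surveyed
print(round(expected_both, 1))  # 28.3
```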

In a test of independence, we state the null and alternate hypotheses in words. Since the contingency table consists of two factors, the null hypothesis states that the factors are independent and the alternate hypothesis states that they are not independent (dependent). If we do a test of independence using the example above, then the null hypothesis is:

$H_o$: Being a cell phone user while driving and receiving a speeding violation are independent events.

If the null hypothesis were true, we would expect about 28 people to be cell phone users while driving and to receive a speeding violation.

The test of independence is always right-tailed because of the way the test statistic is calculated. If the expected and observed values are not close together, then the test statistic is very large and falls far out in the right tail of the chi-square curve, just as in a goodness-of-fit test.
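For a 2 x 2 table such as the cell phone example, which has one degree of freedom, the right-tail area can be sketched with the Python standard library alone. This relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The function name `right_tail_p_value_df1` is hypothetical, and the shortcut does not apply to larger tables, which need a chi-square table or a statistics package.

```python
import math


def right_tail_p_value_df1(statistic):
    """Area to the right of `statistic` under the chi-square curve with 1 df.

    Uses P(X > x) = erfc(sqrt(x / 2)), valid only for 1 degree of freedom.
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))
```

The larger the test statistic, the smaller the area to its right, which is why large gaps between observed and expected values count as evidence against independence.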

The degrees of freedom for the test of independence are:

df = (number of columns - 1)(number of rows - 1)
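As a quick sketch, the degrees-of-freedom formula translates directly into code. The function name `degrees_of_freedom` is an assumption made here for illustration.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df = (number of columns - 1)(number of rows - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (n_cols - 1) * (n_rows - 1)


# The cell phone / speeding violation example is a 2 x 2 table.
print(degrees_of_freedom(2, 2))  # 1
```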

The following formula calculates the expected number (E):

$$E = \frac{(\text{row total})(\text{column total})}{\text{total number surveyed}}$$
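Applied to every cell, the formula produces a full table of expected counts. The sketch below uses the row and column totals from the cell phone example (70 and 685 drivers with and without a speeding violation; 305 and 450 cell phone users and non-users). The function name `expected_counts` is hypothetical. The final line checks the requirement that every expected value be at least 5.

```python
def expected_counts(row_totals, col_totals):
    """Return E = (row total)(column total) / (total surveyed) for each cell."""
    total = sum(row_totals)
    if total != sum(col_totals):
        raise ValueError("row totals and column totals must add to the same total")
    return [[r * c / total for c in col_totals] for r in row_totals]


# Totals from the example: rows = violation / no violation,
# columns = cell phone user / not a cell phone user.
expected = expected_counts([70, 685], [305, 450])
for row in expected:
    print([round(e, 1) for e in row])

# The test requires every expected value to be at least 5.
all_at_least_5 = all(e >= 5 for row in expected for e in row)
print(all_at_least_5)
```

The top-left cell reproduces the value of about 28.3 found earlier for drivers who use a cell phone and received a speeding violation.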


In a volunteer group, adults 21 and older volunteer from one to nine hours each week to spend time with a disabled senior citizen. The program recruits among community college students, four-year college students, and nonstudents. The following table is a sample of the adult volunteers and the number of hours they volunteer per week.

Source:  OpenStax, Collaborative statistics. OpenStax CNX. Jul 03, 2012 Download for free at http://cnx.org/content/col10522/1.40