Try it

Two swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time for the 50 meter freestyle when compared to her team. Which swimmer had the fastest time when compared to her team?

Swimmer   Time (seconds)   Team mean time (seconds)   Team standard deviation (seconds)
Angie     26.2             27.2                       0.8
Beth      27.3             30.1                       1.4

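For Angie: $z = \frac{26.2 - 27.2}{0.8} = -1.25$

For Beth: $z = \frac{27.3 - 30.1}{1.4} = -2$

Beth's z-score is further below her team's mean than Angie's, so Beth had the fastest time when compared to her team.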
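The same comparison can be sketched in a few lines of Python. The helper name `z_score` and the `swimmers` dictionary are illustrative choices, not part of the original exercise; the numbers come from the table above.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev

# (time, team mean time, team standard deviation), all in seconds
swimmers = {
    "Angie": (26.2, 27.2, 0.8),
    "Beth": (27.3, 30.1, 1.4),
}

z_scores = {name: z_score(*stats) for name, stats in swimmers.items()}

# In a race a lower time is better, so the most negative z-score is the
# fastest time relative to the swimmer's own team.
fastest = min(z_scores, key=z_scores.get)

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
print(f"Fastest relative to her team: {fastest}")
```

Note that the direction of "better" depends on the measurement: for times the smallest z-score wins, while for something like test scores the largest would.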

The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data.

    For any data set, no matter what the distribution of the data is:

  • At least 75% of the data is within two standard deviations of the mean.
  • At least 89% of the data is within three standard deviations of the mean.
  • At least 95% of the data is within 4.5 standard deviations of the mean.
  • This is known as Chebyshev's Rule.
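The three percentages above are consistent with the usual statement of Chebyshev's Rule, which this section does not spell out: at least $1 - \frac{1}{k^2}$ of the data lies within $k$ standard deviations of the mean, for $k > 1$. A minimal sketch, assuming that form of the rule (the function name `chebyshev_lower_bound` is hypothetical):

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

# Reproduce the three cases listed above.
for k in (2, 3, 4.5):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```

These are guaranteed minimums for any distribution, so real data sets usually have a larger share of values inside each interval.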

    For data having a normal distribution, which we will examine in great detail later:

  • Approximately 68% of the data is within one standard deviation of the mean.
  • Approximately 95% of the data is within two standard deviations of the mean.
  • More than 99% of the data is within three standard deviations of the mean.
  • This is known as the Empirical Rule.
  • It is important to note that this rule only applies when the shape of the distribution of the data is bell-shaped and symmetric. We will learn more about this when studying the "Normal" or "Gaussian" probability distribution in later chapters.
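The Empirical Rule percentages can be checked numerically. The sketch below assumes an exactly normal distribution and uses the identity $P(|Z| < k) = \operatorname{erf}(k/\sqrt{2})$ for a standard normal variable $Z$; the function name `normal_fraction_within` is illustrative.

```python
import math

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_fraction_within(k):.2%}")
```

The computed values are slightly above 68% and 95% and above 99% for one, two, and three standard deviations, which is why the rule is stated with "approximately" and "more than".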

Coefficient of variation

Another useful way to compare distributions, besides simple comparisons of means or standard deviations, is to adjust for differences in the scale of the data being measured. Quite simply, a given amount of variation in data with a large mean is different from the same variation in data with a small mean. To adjust for the scale of the underlying data, the coefficient of variation (CV) has been developed. Mathematically:

$CV = \frac{s}{\bar{x}} \times 100$, conditioned upon $\bar{x} \neq 0$, where $s$ is the standard deviation of the data and $\bar{x}$ is the mean.

We can see that this measures the variability of the underlying data as a percentage of the mean value, the center of weight of the data set. This measure is useful in comparing risk where an adjustment is warranted because of differences in the scale of two data sets. In effect, the scale is changed to a common scale, percentage differences, which allows direct comparison of the magnitudes of variation of two or more data sets.
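As a small illustration, the sketch below computes the CV for two made-up data sets that have the same sample standard deviation but very different means. The data values and the function name `coefficient_of_variation` are hypothetical, chosen only to show the effect of scale.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is 0")
    return statistics.stdev(data) / mean * 100

# Hypothetical data: both sets have a sample standard deviation of 2.
small_scale = [8, 10, 12]     # mean 10
large_scale = [98, 100, 102]  # mean 100

print(f"small scale CV: {coefficient_of_variation(small_scale):.1f}%")
print(f"large scale CV: {coefficient_of_variation(large_scale):.1f}%")
```

The same absolute spread is 20% of the mean in the first set and only 2% in the second, which is the scale adjustment the CV is meant to capture.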

Chapter review

The standard deviation can help you measure the spread of data. There are different equations to use depending on whether you are calculating the standard deviation of a sample or of a population.

  • The Standard Deviation allows us to compare individual data or classes to the data set mean numerically.
  • $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\frac{\sum f (x - \bar{x})^2}{n - 1}}$ is the formula for calculating the standard deviation of a sample. To calculate the standard deviation of a population, we would use the population mean, $\mu$, and the formula $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ or $\sigma = \sqrt{\frac{\sum f (x - \mu)^2}{N}}$.
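The sample and population formulas differ only in the divisor, and the frequency versions weight each squared deviation by how often the value occurs. A minimal sketch covering all four cases follows; the function name `std_dev`, its parameters, and the example data are assumptions made for illustration.

```python
import math

def std_dev(values, freqs=None, sample=True):
    """Standard deviation of `values`, optionally weighted by frequencies.

    sample=True divides by n - 1 (the formula for s);
    sample=False divides by N (the formula for sigma).
    """
    if freqs is None:
        freqs = [1] * len(values)  # every value counted once
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return math.sqrt(sum_sq / (n - 1 if sample else n))

# Hypothetical frequency table: the value 2 occurs twice, 1 and 3 once each.
values = [1, 2, 3]
freqs = [1, 2, 1]

print(f"sample s:         {std_dev(values, freqs):.4f}")
print(f"population sigma: {std_dev(values, freqs, sample=False):.4f}")
```

Passing the raw list `[1, 2, 2, 3]` without frequencies gives the same results, since a frequency table is just a compact way of writing repeated values.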

Source:  OpenStax, Introductory statistics. OpenStax CNX. Aug 09, 2016 Download for free at http://legacy.cnx.org/content/col11776/1.26