Even More About Central Tendency

October 22, 2008 by  
Filed under Even More

If left unattended, data would cover the desks of researchers and gather dust. To be useful, it must be organized into a data matrix: a row-column table of scores. The spreadsheet of scores has a row for each subject. Each row contains all of the scores for that individual but neatly laid out in columns. A quick view of the spreadsheet will show if there are any missing scores (empty cells). Each row is a person; each column is a variable. Traditionally, the farthest column to the left contains the ID number of the subject.
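As a rough illustration of the data matrix described above, here is a minimal Python sketch. The column names, the scores, and the helper find_missing are all made up for the example; None stands in for an empty cell.

```python
# Hypothetical data matrix: one row per subject, one column per variable.
# The leftmost column is the subject ID; None marks a missing score.
COLUMNS = ["id", "anxiety", "memory"]
DATA = [
    [1, 12, 30],
    [2, 15, None],
    [3, 9, 27],
]


def find_missing(rows, columns):
    """Return (subject_id, column_name) for every empty cell."""
    missing = []
    for row in rows:
        for name, value in zip(columns, row):
            if value is None:
                missing.append((row[0], name))
    return missing


N = len(DATA)  # number of subjects (rows)
print(N, find_missing(DATA, COLUMNS))
```

Scanning every cell is the programmatic version of the "quick view" of the spreadsheet: anything it returns is a score that still has to be collected or handled.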

Once the data is organized, it’s time to describe it. The shortest description of a group is how many people are in it (N). A more graphic and informative description would be a frequency distribution. The “frequency” is indicated in the height of the graph: the more people who have the same score, the taller the graph is. The “distribution” (width) shows how many different scores there were.
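A frequency distribution can be sketched in a few lines of Python. The scores below are invented, and frequency_distribution is just an illustrative helper name; the row of stars stands in for the height of the graph.

```python
from collections import Counter

scores = [3, 4, 4, 5, 5, 5, 6, 6, 7]  # made-up scores for nine people


def frequency_distribution(values):
    """Map each distinct score to the number of people who earned it."""
    return dict(sorted(Counter(values).items()))


freq = frequency_distribution(scores)
for score, count in freq.items():
    # More people with the same score means a longer bar.
    print(f"{score}: {'*' * count}")
```

The number of keys in freq is the width of the distribution (how many different scores there were), and the counts add up to N.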

Here is a frequency distribution where everyone has a different score (no one has the same score).

Here is a frequency distribution of a constant: everyone has the same score:

Most frequency distributions look more like this:

Notice that this “bell-shaped” curve is symmetrical. There are more scores in the middle than at the ends. There are a few scores at the ends but most are in the middle. Philosophically, we believe this describes people well. If we measure them on almost anything, most will be in the middle of the distribution but a few will be at each end. Although there are a few very musical people, and a few very unmusical people, most are in the middle of the musical ability distribution. This is normal. It’s how we define average.

Sometimes the data doesn’t look like a normal bell-shaped curve. Usually, it’s because the researcher did something wrong (asked only highly gifted people) or limited the sample in some manner. The result is a skewed distribution: a normal curve with a long tail. The direction of the tail gives the distribution its name: a tail to the right (toward the high scores) is a positively-skewed distribution. Most folk scored low on the variable but a few (maybe only one person) scored quite high. Here’s a positively skewed distribution:

A negatively-skewed distribution is normal except for a long tail toward the negative end (the lowest scores). Most people scored high on the variable but a few scored quite low. It looks like this:
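One informal way to guess the direction of the tail without a graph is to compare the mean with the median, since the tail tends to pull the mean toward it. The sketch below assumes that rule of thumb, which is a common heuristic and not a formal skewness statistic; the function name and the scores are made up.

```python
from statistics import mean, median


def skew_direction(values):
    """Guess the direction of the tail by comparing mean and median."""
    m, md = mean(values), median(values)
    if m > md:
        return "positive"  # tail toward the high scores
    if m < md:
        return "negative"  # tail toward the low scores
    return "none"


high_outlier = [2, 3, 3, 4, 4, 4, 5, 20]
low_outlier = [1, 16, 17, 17, 18, 18, 18, 19]
symmetric = [3, 4, 4, 5, 5, 5, 6, 6, 7]

for data in (high_outlier, low_outlier, symmetric):
    print(skew_direction(data))
```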

MODE

You can use a frequency distribution to identify a single score to represent the group. The highest point of the distribution is called the mode. It is the most common (popular) score. But it’s hard to be accurate reading graphs, and the mode isn’t very useful for advanced statistical analysis. The general rule is: if it’s easy to calculate, it’s not very helpful.
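Instead of reading the mode off a graph, you can count. This is a minimal sketch with invented scores; the helper modes is an illustrative name, and it returns a list because two scores can tie for most popular.

```python
from collections import Counter


def modes(values):
    """Return the most common score(s), lowest first."""
    counts = Counter(values)
    top = max(counts.values())  # height of the tallest bar
    return sorted(score for score, count in counts.items() if count == top)


print(modes([3, 4, 4, 5, 5, 5, 6, 6, 7]))
print(modes([1, 1, 2, 2, 3]))
```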

MEDIAN

A better representative of the group is the median. The median is the middlemost score. If you start at the ends and count toward the middle, whichever score you end on is the median. It’s fairly easy to calculate but the scores have to be arranged from lowest to highest (or highest to lowest) in order to count toward the middle. And the median isn’t very useful for advanced statistical analysis.
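Here is a small Python sketch of finding the median by sorting and taking the middle. The scores are made up. When there is an even number of scores there is no single middlemost score; this sketch assumes the usual convention of averaging the two middle ones.

```python
def median(values):
    """Return the middlemost score of a non-empty list."""
    if not values:
        raise ValueError("median needs at least one score")
    ordered = sorted(values)  # lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd N: one score sits in the middle
    # Even N: assumed convention, average the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 5, 4, 6]))
print(median([3, 4, 5, 6]))
```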

MEAN

That leaves the mean (also called the average). Calculation is harder than pointing or counting, but not really all that tough. You add up all the scores and divide by the number of scores. Pretty simple.
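In code, the recipe is one line. The sketch below uses invented counts of children in five households; the variable names are only for illustration.

```python
def mean(values):
    """Add up all the scores and divide by the number of scores."""
    if not values:
        raise ValueError("mean needs at least one score")
    return sum(values) / len(values)


children_per_household = [2, 3, 2, 3, 2]  # made-up data
print(mean(children_per_household))
```

Notice that the result does not have to be a score anyone actually has: no household in the list has a fractional child.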

The mean represents the average, typical person. It’s the hypothetical middle point that balances the entire distribution, which is why we end up with 2.4 children or 3.1 cars. Unlike the median and mode, the mean is very sensitive to outlying scores.
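To see that sensitivity, the sketch below compares all three measures on a made-up set of scores, before and after one score is replaced by an extreme value. It leans on Python's standard statistics module; summarize is an illustrative helper.

```python
from statistics import mean, median, mode


def summarize(values):
    """Return (mean, median, mode) for a list of scores."""
    return mean(values), median(values), mode(values)


scores = [3, 4, 4, 5, 5, 5, 6, 6, 7]
with_outlier = [3, 4, 4, 5, 5, 5, 6, 6, 70]  # highest score replaced by an outlier

print(summarize(scores))
print(summarize(with_outlier))
```

One extreme score drags the mean from 5 to 12, while the median and mode both stay at 5.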

WHY CALCULATE CENTRAL TENDENCY

Nearly all statistics books tell you how to calculate measures of central tendency but few explain why. Why all the concern with central tendency? Why even calculate a mean? We use central tendency because there is one. If you take an armload of toys and drop them on the floor, they don’t line up in rows. They don’t arrange themselves in a triangle or circle. They fall in a heap. A pile that looks like a small mountain: a frequency distribution. We look for a score to represent an entire group because we believe that people are more alike than different. We believe that chance follows a pattern. And that pattern is a heap in the middle with less and less on the edges. We look for a mean because there is a center to a group. To understand most people, all we need to do is describe the middle of the group; the middle of the distribution is where most scores are. There is little difference between the mean and the score next to it. Everyone in the middle of the pack is about the same. It’s the way nature is built.


 
