Calculate: Range

November 22, 2008 by  
Filed under How To Calculate, Range

Range is the high score minus the low score.

It’s easy to calculate, and very helpful for identifying typing mistakes. If your 5-point rating scale has a range of 72, you know something is wrong. It should only have values from 1 to 5. So the range should only be 4.
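Here is a minimal Python sketch of that check. The `ratings` list is made up for illustration, and the 73 stands in for a typing mistake.

```python
def score_range(scores):
    """Return the high score minus the low score."""
    return max(scores) - min(scores)

# Hypothetical ratings on a 5-point scale; the 73 is a typo for 3.
ratings = [1, 4, 73, 5, 2, 3]

observed = score_range(ratings)   # 73 - 1 = 72
largest_possible = 5 - 1          # a 1-to-5 scale can span at most 4

# Any entry outside 1..5 is a likely data-entry error.
suspects = [r for r in ratings if not 1 <= r <= 5]

if observed > largest_possible:
    print(f"Range is {observed}; check these entries: {suspects}")
```

A range larger than 4 only tells you that something is wrong; listing the out-of-bounds entries tells you where to look.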

 


Calculate: Mode

November 5, 2008 by  
Filed under How To Calculate, Mode

The mode is the most popular score: the one given most often. There are two ways to find it.

First, the mode may be found by sorting the scores and selecting the one most frequently given.

The mode of this distribution is 5:

11
9
5
5
5
2

Second, and more practical in a distribution of many scores, the mode is the highest point on a frequency distribution. If a frequency distribution is accurately drawn, both approaches will yield the same result.

In this case, one person scored 2, three scored 5, one scored 9, and one scored 11. So the highest point of the graph (histogram), at 5, is the mode.
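Both approaches can be sketched in a few lines of Python. This is only an illustration using the six scores above; `Counter` builds the frequency distribution, and the text histogram is a rough stand-in for a drawn graph.

```python
from collections import Counter

scores = [11, 9, 5, 5, 5, 2]

# Frequency distribution: how many people gave each score.
frequencies = Counter(scores)

# The mode is every score tied for the highest frequency.
top = max(frequencies.values())
modes = sorted(score for score, count in frequencies.items() if count == top)

# Crude text histogram, one row per score, lowest score first.
for score in sorted(frequencies):
    print(f"{score:>2} | {'#' * frequencies[score]}")

print("mode:", modes)
```

Returning a list of modes is a design choice: a distribution can have more than one score tied for most frequent.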

 

When we make a distribution, the scores are arranged from left to right, with the lowest scores on the left and the highest scores on the right. When everyone has a different score, the distribution is a straight horizontal line. When more than one person has the same score, the scores are stacked vertically. Consequently, a distribution where everyone had the same score would be represented by a straight vertical line.


Calculate: Median

November 5, 2008 by  
Filed under How To Calculate, Median

Finding the median in a distribution of integers is relatively easy. When there is an odd number of scores, it is the one left over when counting in from either end. When there is an even number of scores, the median is the value of the middle two scores (if they are the same) or the halfway point between them (if they differ).

Medians are most often used when distributions are skewed. Indeed, when data are presented as medians, ask about the means. If the two are quite different, the distribution is highly skewed, and the sample may not be as representative as you would like.

To calculate the median, arrange the scores in order of magnitude from high to low or from low to high (it doesn’t matter which one you choose). Select the score in the middle.

Take these numbers, and arrange them from high to low:

2
9
4
7
8

Here they are arranged in a distribution:

9
8
7
4
2

Find the score in the middle. In the following numbers, the median is 7:

9
8
7
4
2
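The same procedure as a small Python sketch, covering both the odd and the even case. The second call adds a hypothetical sixth score (10) just to show what happens when there is no single middle score.

```python
def median(scores):
    """Sort the scores, then take the middle one (or the midpoint of the middle two)."""
    ordered = sorted(scores)           # low to high; high to low works equally well
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                     # odd number of scores: one is left over in the middle
        return ordered[middle]
    # even number of scores: halfway between the two middle-most scores
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([2, 9, 4, 7, 8]))        # 7
print(median([2, 9, 4, 7, 8, 10]))    # 7.5
```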


Calculate: Mean

November 5, 2008 by  
Filed under How To Calculate, Mean

To calculate the mean:

Sum (add) all of the scores in a variable.

Divide that sum by the number of scores in the distribution.

Calculate the mean of these numbers:

7
6
5
5
5
4
3

The sum of the variable called X is 35. That is:

$\sum X = 35$

N (number of scores) is 7.

The mean of these scores is calculated by dividing 35 by 7. So, the mean of these scores is 5. That is:

$\bar{X} = \frac{\sum X}{N} = \frac{35}{7} = 5$
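The two steps translate directly into Python. This sketch uses the seven scores above; the variable names are just for illustration.

```python
scores = [7, 6, 5, 5, 5, 4, 3]

total = sum(scores)    # step 1: add up all of the scores
n = len(scores)        # N, the number of scores
mean = total / n       # step 2: divide the sum by N

print(total, n, mean)  # 35 7 5.0
```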

 


 

Calculate: Standard Deviation

November 5, 2008 by  
Filed under Dispersion, How To Calculate

Regardless of how you obtained variance (by dividing by N or by N-1), you calculate the standard deviation by taking the square root of variance.

Take the variance you calculated, enter it in your calculator, and press the square root (√) button.

The result is the standard deviation.
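If you'd rather use Python than a calculator, `math.sqrt` does the same job. The sketch below assumes a Sum of Squares of 10 from 7 scores (the values you get for the scores 7, 6, 5, 5, 5, 4, 3) and shows both ways of obtaining variance.

```python
import math

def standard_deviation(variance):
    """Standard deviation is the square root of variance."""
    return math.sqrt(variance)

ss, n = 10, 7   # assumed Sum of Squares and number of scores

sd_n = standard_deviation(ss / n)                 # variance found by dividing by N
sd_n_minus_1 = standard_deviation(ss / (n - 1))   # variance found by dividing by N-1

print(round(sd_n, 2), round(sd_n_minus_1, 2))
```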


 

Calculate: Variance

November 5, 2008 by  
Filed under Dispersion, How To Calculate

The fourth measure of dispersion is variance. Variance is the Sum of Squares (SS) divided by N.

Variance of a population is always SS divided by N, regardless of whether it is a large population or a small population.

Variance of a large sample (where N is 30 or more) is calculated like a population: divide SS by N.

If the sample is small (less than 30), adjust for the small sample size by dividing SS by N-1.
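Here is a rough Python sketch of those three rules. The function names, the `is_population` flag, and the example scores are assumptions for illustration; the cutoff of 30 is the rule of thumb stated above.

```python
def sum_of_squares(scores):
    """Sum of the squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def variance(scores, is_population=False):
    """SS / N for a population or a large sample; SS / (N-1) for a small sample."""
    n = len(scores)
    ss = sum_of_squares(scores)
    if is_population or n >= 30:
        return ss / n
    return ss / (n - 1)

scores = [7, 6, 5, 5, 5, 4, 3]                 # SS = 10, N = 7

print(variance(scores, is_population=True))    # 10 / 7
print(variance(scores))                        # small sample: 10 / 6
```

For comparison, Python's built-in `statistics.pvariance` always divides by N and `statistics.variance` always divides by N-1, whatever the sample size.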

 


Calculate: MAD

November 5, 2008 by  
Filed under Dispersion, How To Calculate

Mean Absolute Deviation (MAD) is a measure of dispersion.
Start by subtracting the mean from each score. Put the answers in a column labeled d (use a lowercase d because these are deviations).
Take the absolute value of each d: ignore the sign (positive or negative), and add up the magnitudes.
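A short Python sketch of these steps, using hypothetical scores. The last step, dividing the total by the number of scores, is not spelled out above; it is assumed here because it is what the "mean" in Mean Absolute Deviation conventionally refers to.

```python
scores = [7, 6, 5, 5, 5, 4, 3]      # hypothetical scores

mean = sum(scores) / len(scores)    # 5.0

# The d column: each score minus the mean.
d = [x - mean for x in scores]      # 2, 1, 0, 0, 0, -1, -2

# Ignore the signs and add up the magnitudes.
total = sum(abs(value) for value in d)

# Assumed final step: average the absolute deviations.
mad = total / len(scores)

print(total, round(mad, 2))
```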

Calculate: Correlation

November 5, 2008 by  
Filed under Correlation, How To Calculate

For the sake of simplicity, we’ll restrict ourselves to the Pearson r, the most commonly used type of correlation. To calculate the Pearson r, three Sums of Squares are needed. The Pearson r is the ratio of SSxy to the square root of the product of SSx and SSy. Here is the formula:

$r = \frac{SS_{xy}}{\sqrt{SS_x \cdot SS_y}}$

For SSx, find the Sum of Squares of the X variable. Similarly, SSy is simply the Sum of Squares of Y. The SSxy, however, is a bit different. First, we have to make a new variable: XY. To do so, we multiply each X by its respective Y. Now we have 3 columns: X, Y and XY. Second, sum the XYs. Third, use this formula:

$SS_{xy} = \sum XY - \frac{(\sum X)(\sum Y)}{N}$

Notice that this formula is a lot like the regular formula for Sum of Squares; it’s a variation on the theme. It’s the sum of the XYs but we don’t have to square them (they’re already big enough). And we don’t square the Sum of X; we multiply the Sum of X and the Sum of Y together. Fourth, finish off the formula and the result is the Pearson r.

EXAMPLE

We create a new variable by multiplying every X by its Y partner. So this:

 

X   Y
2   17
13  3
10  4
3   18
2   19
12  11

becomes this:

X Y XY
2 17 34
13 3 39
10 4 40
3 18 54
2 19 38
12 11 132

Then, we sum each column. The sum of X = 42, the sum of Y = 72, and the sum of XY is 337.

Calculate the SS for X (136) and the SS for Y (256). Then calculate the SSxy. Multiply the sum of X by the sum of Y (42 * 72 = 3024). Now divide the result by N (the number of pairs of scores = 6): 3024/6 = 504. Subtract the result from the sum of the XYs (337 - 504 = -167).

Notice the SSxy is negative. It’s OK. The SSxy can be negative. It is the only Sum of Squares that can be negative. The SSx or the SSy are measures of dispersion from the variable’s mean. But we created the XY variable; it’s not a real variable when it comes to dispersion. The sign of SSxy indicates the direction of the relationship between X and Y. So we have a negative SSxy because X and Y have an inverse relationship.

Look at the original data: when X is small (2), Y is large (17). When X is large (13), Y is small (3). It is a consistent but inverse relationship. It’s like pushing the yoke down and the plane going up.

Let’s finish off the calculation of the Pearson r. Multiply the SSx by the SSy (136 * 256 = 34816). Take the square root of that number (the square root of 34816 = 186.59). Divide the SSxy by that result (-167/186.59 = -.895). Rounding to 2 decimal places, the Pearson r for this data set equals -.90. It is a strong, negative correlation.
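The whole example can be checked with a few lines of Python. This is a sketch, not the only way to do it; the helper `ss` is a made-up name, and it uses the computational form of the Sum of Squares (the sum of the squared scores minus the squared sum divided by N), which is assumed to be the "regular formula" referred to above.

```python
import math

x = [2, 13, 10, 3, 2, 12]
y = [17, 3, 4, 18, 19, 11]
n = len(x)                      # number of pairs of scores

def ss(values):
    """Sum of Squares: sum of squared scores minus (sum of scores) squared over N."""
    return sum(v * v for v in values) - sum(values) ** 2 / len(values)

sum_xy = sum(a * b for a, b in zip(x, y))    # the XY column, summed: 337

ss_x = ss(x)                                 # 136.0
ss_y = ss(y)                                 # 256.0
ss_xy = sum_xy - sum(x) * sum(y) / n         # 337 - 504 = -167.0

r = ss_xy / math.sqrt(ss_x * ss_y)

print(ss_x, ss_y, ss_xy, round(r, 2))
```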

 


 

Calculate: ANOR

November 5, 2008 by  
Filed under ANOR, How To Calculate

Getting from Sum of Squares (SS) to mean squares (ms)

The F test is a ratio of variances (understood/not understood). To find those variances, we begin by partitioning the Sum of Squares (SS) of the regression into explained and unexplained components. The explained portion is simply the SSy multiplied by r² (the coefficient of determination). The result is the SSregression (the understood portion of the regression).

The unexplained, not yet understood portion of the regression is found by multiplying the SSy by 1-r² (the coefficient of nondetermination). The result is the SSerror (the non-understood portion of the regression).

To get from Sum of Squares to variance, we divide each SS by its respective degrees of freedom. The resulting variance terms are called mean squares (a reminder that variance is the average of the squared deviations from a distribution’s mean).

Degrees of Freedom

The degrees of freedom (df) for Regression is k-1 (columns minus one). Since a simple linear regression has only 2 columns, the df for Regression always equals 1. The df for Error is N-k (number of people minus the number of columns). And the df for Total is N-1.

If it seems like it’s getting hard to keep track of all this, there is good news. An Analysis of Regression uses a summary table that organizes all of the important information. Simply fill in the blanks of the table and the hard part is done.

EXAMPLE

In order to calculate an Analysis of Regression for this data,

X Y
11          1
4          2
8          8
2        12
7        11
16          2

We fill in the blanks for the Analysis of Regression’s summary table:

              SS      df    ms
Regression    _____   ____  ____
Error         _____   ____  ____
Total         _____   ____  ____

Let’s start with the degrees of freedom. Since this is a simple linear regression, we know that dfregression = 1 (two columns minus 1). We know that dferror is equal to N-k (the number of people minus the number of columns), so 6-2 = 4. The total degrees of freedom is equal to N-1, so 6-1 = 5.

We know that SStotal equals SSy. In this example, the SSy = 122. We partition this into SSregression and SSerror by multiplying the SStotal by r² and 1-r², respectively. For this data, r² is about .3372, so 122 is partitioned into 41.14 (explained dispersion) and 80.86 (unexplained dispersion).

With this in mind, let’s update the summary table with what we know:

              SS      df    ms
Regression    41.14    1    ____
Error         80.86    4    ____
Total        122.00    5    ____

Variance (which in an F test is given the special designation of mean squares) is calculated by dividing each SS term by its respective degrees of freedom. Updating the summary table gives us:

              SS      df    ms
Regression    41.14    1    41.14
Error         80.86    4    20.21
Total        122.00    5    24.40

 

Testing F

The F statistic is the mean squares of Regression divided by the mean squares of Error. Use the mean squares from the summary table:

              SS      df    ms
Regression    41.14    1    41.14
Error         80.86    4    20.21
Total        122.00    5    24.40

So F = 41.14 / 20.21. When divided through, you get F = 2.04.

We test the significance of this F by comparing it to the critical value in the F Table. We enter the table by going across to the dfregression (1) and down to the dferror (in this case, 4). So the critical value (at the .05 level) = 7.71. In order to be significant, the F we calculated would have to be larger than 7.71. Since it isn’t, the pattern we see is likely to be due to chance.
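To double-check the example, here is a rough Python sketch that fills in the Regression and Error rows and computes F from the raw X and Y scores. The helper name `ss` is made up, r² is computed as SSxy squared over the product of SSx and SSy, and the critical value of 7.71 is simply copied from the F Table lookup above rather than computed.

```python
x = [11, 4, 8, 2, 7, 16]
y = [1, 2, 8, 12, 11, 2]
n, k = len(x), 2                   # 6 people, 2 columns

def ss(values):
    """Sum of Squares: sum of squared scores minus (sum of scores) squared over N."""
    return sum(v * v for v in values) - sum(values) ** 2 / len(values)

ss_x, ss_y = ss(x), ss(y)          # SSy is also SStotal
ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
r2 = ss_xy ** 2 / (ss_x * ss_y)    # coefficient of determination

ss_regression = ss_y * r2          # explained portion
ss_error = ss_y * (1 - r2)         # unexplained portion

df_regression, df_error, df_total = k - 1, n - k, n - 1

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f = ms_regression / ms_error
critical_value = 7.71              # from the F Table, df = (1, 4)

print(f"Regression {ss_regression:7.2f} {df_regression} {ms_regression:6.2f}")
print(f"Error      {ss_error:7.2f} {df_error} {ms_error:6.2f}")
print(f"Total      {ss_y:7.2f} {df_total}")
print(f"F = {f:.2f}, significant: {f > critical_value}")
```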

 


Calculate: Regression

November 5, 2008 by  
Filed under How To Calculate, Regression

Think of regression as a 5-step process.
First, calculate the mean of X and the mean of Y.
Second, calculate the slope of the line (called b). To find the slope, divide the SSxy by the SS of the predictor. That is: $b = \frac{SS_{xy}}{SS_x}$
Third, calculate the Y intercept (called a). The formula is: $a = \bar{Y} - b\bar{X}$
Fourth, make a prediction. Plug in your X. Since you’ve already calculated a and b, all you need is an X value and you can predict what Y will be. Use the formula for a straight line: $Y' = a + bX$
Fifth, estimate the accuracy of the prediction. Don’t worry, there’s a formula for that too: the standard error of estimate.
 
EXAMPLE
X     Y
2     5
4     7
6     6
9     8
12    14
15    11
15    12
 
First, calculate the mean of X and the mean of Y. Each mean = 9.
 
Second, calculate the slope of the line (called b). To find the slope, divide the SSxy by the SS of the predictor. The SSx is 164; the SSy is 68; and the SSxy is 92. So the slope (b) = 92/164 = .56.

Third, calculate the Y intercept (called a). So a = 9 - (.56 * 9) = 3.95 (using the unrounded slope, .561).

Fourth, make a prediction. Use the formula for a straight line. If the X value is 8, we would predict that Y (which we’ll call Y-prime so we know it’s a prediction) equals 8.44.

Fifth, estimate the accuracy of the prediction. The standard deviation of Y is 3.12, and r = .87. So the standard error of estimate is 1.81. This means that we’re 68% sure that the real score will be 8.44, plus or minus 1.81. In other words, we’re fairly sure the score will be between 6.63 and 10.25.
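The five steps can be sketched in Python as follows. The variable names and the `predict` helper are made up for illustration. The formula for the standard error of estimate is not shown in the text, so this sketch assumes the square root of SSerror divided by N-2, which reproduces the 1.81 reported above.

```python
import math

x = [2, 4, 6, 9, 12, 15, 15]
y = [5, 7, 6, 8, 14, 11, 12]
n = len(x)

# Step 1: the means of X and Y.
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of Squares, as deviations from the means.
ss_x = sum((v - mean_x) ** 2 for v in x)                          # 164
ss_y = sum((v - mean_y) ** 2 for v in y)                          # 68
ss_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))    # 92

# Step 2: slope. Step 3: Y intercept.
b = ss_xy / ss_x
a = mean_y - b * mean_x

# Step 4: predict Y for a given X with the straight-line formula.
def predict(x_value):
    return a + b * x_value

y_prime = predict(8)

# Step 5 (assumed formula): standard error of estimate.
r = ss_xy / math.sqrt(ss_x * ss_y)
ss_error = ss_y * (1 - r ** 2)
see = math.sqrt(ss_error / (n - 2))

print(f"b = {b:.2f}, a = {a:.2f}, Y' = {y_prime:.2f}, r = {r:.2f}")
print(f"68% band: {y_prime - see:.2f} to {y_prime + see:.2f}")
```

Note that a, b, and the prediction are computed from unrounded values; rounding b to .56 first would shift the later results slightly.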
 
 
 
