How to calculate r by hand
Pearson’s correlation coefficient, also known as ‘r’, is one of the most commonly used measures of the strength and direction of the linear relationship between two variables. In this article, we will walk through how to calculate ‘r’ by hand, step by step.
Step 1: Organize your data
Before you start the calculations, arrange your data into a clear format. Write down both variables (let’s call them X and Y) in separate columns. Make sure that each X value is paired with the Y value from the same observation, so that both columns contain the same number of values (n).
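As a minimal sketch of this layout in Python, the two columns can be held as lists of equal length and paired row by row. The values below are made up for illustration and do not come from a real dataset, and the names x, y, and pairs are assumptions used only for this example.

```python
# Illustrative data (invented for this example): five paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Every X value needs a matching Y value, so both columns must be the same length.
if len(x) != len(y):
    raise ValueError("X and Y must contain the same number of values")

# Pair the values row by row, as in a two-column table.
pairs = list(zip(x, y))
for xi, yi in pairs:
    print(xi, yi)
```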
Step 2: Calculate mean values
Find the mean (average) of both X and Y. To do this, add up all the values in each column and then divide the sum by the total number of values (n).
Mean of X = ΣX / n
Mean of Y = ΣY / n
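The two formulas above can be sketched in Python as follows. The data are illustrative values invented for this example, and the variable names are assumptions.

```python
# Illustrative data (invented for this example).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)             # number of paired observations
mean_x = sum(x) / n    # ΣX / n = 15 / 5
mean_y = sum(y) / n    # ΣY / n = 20 / 5

print(mean_x, mean_y)  # prints: 3.0 4.0
```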
Step 3: Subtract mean values from data points
Subtract the mean of each column from every value in that column. This gives the deviations from the mean for X and Y:
ΔX = xi - Mean of X
ΔY = yi - Mean of Y
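A short sketch of this step, again using invented illustrative data, might look like the following. A handy check when working by hand is that each column of deviations sums to zero (up to rounding), since the deviations are measured from the mean.

```python
# Illustrative data (invented for this example).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Deviation of each value from its column mean.
dx = [xi - mean_x for xi in x]
dy = [yi - mean_y for yi in y]

print(dx)  # prints: [-2.0, -1.0, 0.0, 1.0, 2.0]
print(dy)  # prints: [-2.0, 0.0, 1.0, 0.0, 1.0]

# Sanity check: the deviations in each column add up to zero.
print(sum(dx), sum(dy))  # prints: 0.0 0.0
```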
Step 4: Multiply deviations
Now multiply each ΔX value by its corresponding ΔY value.
ΔX * ΔY
Step 5: Square the deviations
Next, square each deviation in both columns:
(ΔX)^2 and (ΔY)^2
Step 6: Sum up products and squared deviations
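Add up each of the three new columns:
Σ(ΔX * ΔY), Σ[(ΔX)^2], and Σ[(ΔY)^2]
Square each deviation first and then add the squares. Do not add the deviations and then square the total, because the deviations themselves always sum to zero.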
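Steps 4 to 6 can be sketched together in Python. The data are the same kind of invented illustrative values, and names such as products and sum_dx_sq are assumptions chosen for readability.

```python
# Illustrative data (invented for this example).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
dx = [xi - mean_x for xi in x]
dy = [yi - mean_y for yi in y]

# Step 4: product of each pair of deviations.
products = [a * b for a, b in zip(dx, dy)]

# Step 5: squared deviations for each column.
dx_sq = [a ** 2 for a in dx]
dy_sq = [b ** 2 for b in dy]

# Step 6: column totals.
sum_products = sum(products)
sum_dx_sq = sum(dx_sq)
sum_dy_sq = sum(dy_sq)

print(sum_products, sum_dx_sq, sum_dy_sq)  # prints: 6.0 10.0 6.0
```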
Step 7: Calculate r
Finally, it’s time to find ‘r’. The formula for calculating Pearson’s correlation coefficient is:
r = Σ(ΔX * ΔY) / √( Σ[(ΔX)^2] * Σ[(ΔY)^2] )
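Insert the sums obtained in Step 6 into this formula and calculate the result. The value of ‘r’ always lies between -1 and +1, where -1 indicates a perfect negative linear relationship, +1 indicates a perfect positive linear relationship, and 0 signifies that there is no linear relationship. If your result falls outside this range, there is an arithmetic error in one of the earlier steps. Note that ‘r’ is undefined when all X values or all Y values are identical, because the denominator is then zero.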
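Putting all seven steps together, a minimal sketch in Python could look like this. The function name pearson_r and the sample data are assumptions made for illustration, not part of any particular library.

```python
import math


def pearson_r(x, y):
    """Pearson's r computed from deviations, following Steps 1 to 7."""
    if len(x) != len(y):
        raise ValueError("X and Y must contain the same number of values")
    if len(x) < 2:
        raise ValueError("at least two paired observations are needed")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]

    sum_products = sum(a * b for a, b in zip(dx, dy))
    sum_dx_sq = sum(a ** 2 for a in dx)
    sum_dy_sq = sum(b ** 2 for b in dy)

    denominator = math.sqrt(sum_dx_sq * sum_dy_sq)
    if denominator == 0:
        # Happens when every X value or every Y value is the same.
        raise ValueError("r is undefined when a variable has no variation")

    return sum_products / denominator


# Illustrative data (invented for this example): r = 6 / sqrt(10 * 6).
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))  # prints: 0.7746
```

For these sample values the sums are 6, 10, and 6, so the hand calculation is 6 / √60, which matches the printed result.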
Conclusion:
While statistical software can easily calculate Pearson’s correlation coefficient, understanding how to calculate ‘r’ by hand helps build a solid foundation in statistics. Using this step-by-step guide, you can compute ‘r’ by hand and assess the linear relationship between two variables in any set of paired numerical data.