How to calculate the correlation coefficient by hand
The correlation coefficient, often represented by the letter ‘r,’ measures the strength and direction of the linear relationship between two variables. The value of r ranges from -1 to 1, with -1 indicating a perfect negative correlation, 1 indicating a perfect positive correlation, and 0 indicating no linear correlation. In this article, we will learn how to calculate the correlation coefficient by hand using Pearson’s formula.
Step 1: Gather your data
The first step in calculating the correlation coefficient is to gather your data in pairs. Each pair should consist of one value for each of the two variables whose correlation you want to measure, so both datasets must contain the same number of values (n).
Example:
X: 3, 6, 8, 10, 12
Y: 4, 10, 12, 18, 24
Step 2: Calculate means
Calculate the mean of both datasets X and Y by adding up all the values in each dataset and dividing by the total number of individual values.
Mean_X = (ΣX) / n
Mean_Y = (ΣY) / n
Example:
Mean_X = (3+6+8+10+12)/5 = 39/5 = 7.8
Mean_Y = (4+10+12+18+24)/5 = 68/5 = 13.6
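If you want to check the hand calculation, the means can be reproduced with a few lines of Python. This is only a sketch; the variable names (x, y, mean_x, mean_y) are chosen here for illustration and are not part of the formula itself.

```python
# Paired sample data from the example above.
x = [3, 6, 8, 10, 12]
y = [4, 10, 12, 18, 24]

n = len(x)  # number of pairs; both lists must have the same length

# Mean = sum of the values divided by the number of values.
mean_x = sum(x) / n
mean_y = sum(y) / n

print(mean_x, mean_y)  # 7.8 13.6
```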
Step 3: Compute deviations
Calculate the deviation for each value in datasets X and Y from their respective means. The deviation is simply the difference between each value and its mean.
Deviation_X = (xi – Mean_X)
Deviation_Y = (yi – Mean_Y)
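As a rough illustration of this step, the deviations for the example data can be computed as follows. The names dev_x and dev_y are assumptions made for this sketch. Values are rounded only for display, because floating-point subtraction can leave tiny rounding residue.

```python
x = [3, 6, 8, 10, 12]
y = [4, 10, 12, 18, 24]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Deviation = value minus the mean of its own dataset.
dev_x = [xi - mean_x for xi in x]
dev_y = [yi - mean_y for yi in y]

print([round(d, 1) for d in dev_x])  # [-4.8, -1.8, 0.2, 2.2, 4.2]
print([round(d, 1) for d in dev_y])  # [-9.6, -3.6, -1.6, 4.4, 10.4]
```

A handy sanity check when working by hand: the deviations in each dataset should add up to zero (apart from rounding).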
Step 4: Calculate product of deviations
For each pair of values, multiply the deviations calculated in step 3.
Product_of_deviations = Deviation_X * Deviation_Y
Step 5: Find the sum of product of deviations
Add up all the products of deviations from Step 4.
Σ(Product_of_deviations)
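Steps 4 and 5 can be sketched together in Python. The names products and sum_products are illustrative assumptions, and the data is the example from Step 1.

```python
x = [3, 6, 8, 10, 12]
y = [4, 10, 12, 18, 24]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Step 4: multiply the two deviations of each pair.
products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]

# Step 5: add the products together.
sum_products = sum(products)

print([round(p, 2) for p in products])  # [46.08, 6.48, -0.32, 9.68, 43.68]
print(round(sum_products, 2))           # 105.6
```

A positive sum means the two variables tend to sit on the same side of their means at the same time, which is what produces a positive r.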
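Step 6: Compute the standard deviation of both datasets
For each dataset, square the deviations from Step 3, add the squares together, divide by n, and take the square root. This is the population form of the standard deviation (dividing by n rather than n - 1), which matches the n used in the formula in Step 7.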
Standard_Deviation(X) = √(Σ(xi – Mean_X)^2 / n)
Standard_Deviation(Y) = √(Σ(yi – Mean_Y)^2 / n)
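The standard deviation formula above can be sketched as a small helper function. The function name population_std is hypothetical, chosen for this example; it divides by n, exactly as the formula does.

```python
import math

def population_std(values):
    """Population standard deviation: divide the sum of squared deviations by n."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / n)

x = [3, 6, 8, 10, 12]
y = [4, 10, 12, 18, 24]

std_x = population_std(x)
std_y = population_std(y)

print(round(std_x, 3), round(std_y, 3))  # 3.124 6.859
```

Python's standard library offers statistics.pstdev for the same population form, which can serve as a cross-check.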
Step 7: Calculate the correlation coefficient (r)
Apply Pearson’s correlation coefficient formula:
r = Σ(Product_of_deviations) / (n * Standard_Deviation(X) * Standard_Deviation(Y))
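Putting Steps 2 through 7 together, a minimal sketch of the whole calculation might look like this. The function name pearson_r and its error handling are assumptions added for illustration; the arithmetic follows the formula above term by term.

```python
import math

def pearson_r(x, y):
    """Pearson's r following Steps 2-7: sum of products / (n * SD_x * SD_y)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    if n < 2:
        raise ValueError("at least two pairs are required")

    # Step 2: means.
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Step 3: deviations from the means.
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]

    # Steps 4 and 5: products of deviations and their sum.
    sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

    # Step 6: population standard deviations (divide by n).
    std_x = math.sqrt(sum(dx ** 2 for dx in dev_x) / n)
    std_y = math.sqrt(sum(dy ** 2 for dy in dev_y) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when one variable is constant")

    # Step 7: Pearson's formula.
    return sum_products / (n * std_x * std_y)

r = pearson_r([3, 6, 8, 10, 12], [4, 10, 12, 18, 24])
print(round(r, 3))  # 0.986
```

The check for a zero standard deviation is a design choice of this sketch: when one variable never changes, the denominator is zero and r has no defined value, so raising an error seemed clearer than returning a number.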
Example:
After completing steps 3 through 6, you will get:
Deviation_X (-4.8, -1.8, +0.2, +2.2, +4.2)
Deviation_Y (-9.6, -3.6, -1.6, +4.4, +10.4)
Product_of_deviations (46.08, 6.48, -0.32, 9.68, 43.68)
Σ(Product_of_deviations) = 105.6
Standard_Deviation(X) = √(48.8 / 5) ≈ 3.124
Standard_Deviation(Y) = √(235.2 / 5) ≈ 6.859
Now apply Pearson’s correlation coefficient formula to calculate r:
r = 105.6 / (5 * 3.124 * 6.859) = 105.6 / 107.14 ≈ 0.986
Conclusion:
In this example, the correlation coefficient ‘r’ is approximately equal to 0.986, which indicates a very strong positive linear relationship between the two datasets X and Y that we analyzed: as X increases, Y increases almost in a straight line. Calculating the correlation coefficient by hand is a helpful exercise for understanding the underlying concepts and how the coefficient is applied in fields such as finance, psychology, and statistics.