How to Calculate Sum of Squares Total (SST)
In the world of statistics, understanding various concepts like variance, correlation, and regression is crucial. One such essential concept is the Sum of Squares Total (SST). This article will guide you through understanding what SST is and how to calculate it.
What is Sum of Squares Total (SST)?
SST is a measure of the total variability within a dataset. It represents the sum of the squared differences between each data point and the mean of the dataset. SST helps in partitioning the variability in a dataset into different components, such as those due to regression and residual errors.
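As a rough illustration of the partitioning mentioned above, the sketch below fits a straight line to a small made-up dataset by ordinary least squares and checks that the total variability splits into a regression part and a residual part. The function name partition_sst, the labels SSR and SSE, and the x and y values are assumptions chosen for this example, not something defined in the article.

```python
def partition_sst(x, y):
    """Fit y = a + b*x by ordinary least squares; return (SST, SSR, SSE)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Least-squares slope and intercept for a single predictor
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    fitted = [a + b * xi for xi in x]

    # Total variability of y around its mean
    sst = sum((yi - y_mean) ** 2 for yi in y)
    # Part explained by the fitted line (regression)
    ssr = sum((fi - y_mean) ** 2 for fi in fitted)
    # Part left over (residual errors)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sst, ssr, sse


# Hypothetical data, used only to illustrate the identity SST = SSR + SSE
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = partition_sst(x, y)
print(abs(sst - (ssr + sse)) < 1e-9)
```

The identity holds for a least-squares line that includes an intercept; other kinds of fit are not guaranteed to split the total this way.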
Calculating SST
In order to calculate the SST, follow these steps:
1. Calculate the mean:
First, find the mean or average of your dataset by summing up all the data points and then dividing by the total number of data points.
Formula for mean: μ = Σx / n
Where μ represents the mean, Σ symbolizes summation, x denotes individual data points, and n is the total number of data points.
2. Compute the deviations:
Next, find the deviation for each data point by subtracting the mean from its value. The deviation measures how far each point in your dataset is from the overall mean. Deviations are positive for points above the mean and negative for points below it.
Deviation formula: d_i = x_i – μ
Where d_i is the deviation of data point i, x_i is the value of data point i, and μ is the mean.
3. Square deviations:
Deviations from the mean always sum to zero, because the positive and negative values cancel each other out. To prevent this cancellation, square each deviation. Squaring makes every value non-negative and gives larger deviations proportionally more weight.
Squared deviation formula: d_i^2 = (x_i – μ)^2
4. Sum squared deviations:
Finally, add up all squared deviations to get SST.
SST formula: SST = Σd_i^2
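The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name sum_of_squares_total is an assumption made for this example.

```python
def sum_of_squares_total(data):
    """Return SST for a non-empty sequence of numbers."""
    if len(data) == 0:
        raise ValueError("data must contain at least one value")

    # Step 1: mean of the dataset
    mean = sum(data) / len(data)
    # Step 2: deviation of each point from the mean (x_i - mean)
    deviations = [x - mean for x in data]
    # Step 3: square each deviation
    squared_deviations = [d ** 2 for d in deviations]
    # Step 4: add up the squared deviations
    return sum(squared_deviations)


print(sum_of_squares_total([1, 2, 3]))  # prints 2.0
```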
Alternatively, there is another popular formula for calculating SST by using the sum of squares of individual data points:
SST = Σx_i^2 – (Σx_i)^2 / n
In this method, Σx_i^2 is the sum of the squared data points, and (Σx_i)^2 / n is the square of the sum of the data points, divided by the number of data points.
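The shortcut formula can be sketched the same way. The function name sst_shortcut is an assumption for this example. One caveat: with floating-point data whose mean is large relative to its spread, subtracting two large, nearly equal sums can lose precision, so the deviation-based steps are usually the safer choice in code.

```python
def sst_shortcut(data):
    """Return SST using: sum of squares minus (square of the sum) / n."""
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one value")

    sum_of_values = sum(data)                # sum of the data points
    sum_of_squares = sum(x * x for x in data)  # sum of the squared data points
    return sum_of_squares - sum_of_values ** 2 / n


print(sst_shortcut([1, 2, 3]))  # prints 2.0
```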
Example:
Let’s consider a sample dataset with five data points: [4, 6, 8, 10, 12]
Mean: μ = (4 + 6 + 8 + 10 + 12) / 5 = 40 / 5 = 8
Squared deviations:
d1^2 = (4 – 8)^2 = (-4)^2 = 16
d2^2 = (6 – 8)^2 = (-2)^2 = 4
d3^2 = (8 – 8)^2 = (0)^2 = 0
d4^2 = (10 – 8)^2 = (2)^2 = 4
d5^2 = (12 – 8)^2 = (4)^2 = 16
SST: SST = Σd_i^2 = 16 + 4 + 0 + 4 + 16 = 40
The SST in this example is equal to 40.
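As a quick check of the worked example, the short script below recomputes the result for the same five data points, once from the squared deviations and once from the sum-of-squares formula. The variable names are chosen for this illustration only.

```python
data = [4, 6, 8, 10, 12]
n = len(data)

# Mean of the dataset
mean = sum(data) / n

# Squared deviation of each point from the mean
squared_deviations = [(x - mean) ** 2 for x in data]
sst_from_deviations = sum(squared_deviations)

# Same quantity from the sum of squares and the square of the sum
sst_from_shortcut = sum(x ** 2 for x in data) - sum(data) ** 2 / n

print(mean)                # prints 8.0
print(squared_deviations)  # prints [16.0, 4.0, 0.0, 4.0, 16.0]
print(sst_from_deviations, sst_from_shortcut)  # prints 40.0 40.0
```

Both approaches agree with the hand calculation.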
Knowing how to calculate SST, whether from the squared deviations or from the sum-of-squares formula, makes it easier to quantify the total variability in a dataset. It is also the starting point for splitting that variability into regression and residual components, which makes it a useful tool for any statistician or data analyst.