Mathematics/Statistics/Correlation Coefficients
The correlation coefficient is a value that describes how well a straight line fits the data. It is similar to covariance, except that it is normalized: a correlation coefficient always lies in the range [-1, 1].
A value of exactly 1 indicates a perfect positive linear correlation between x and y: every point lies on a line with positive slope. That is, as x increases, so does y. And as y increases, so does x. Values close to 1 indicate a strong positive correlation.
A value of exactly -1 indicates a perfect negative linear correlation between x and y: every point lies on a line with negative slope. That is, as x increases, y decreases. And as y increases, x decreases. Values close to -1 indicate a strong negative correlation.
As values approach 0, the linear correlation becomes weaker and weaker, with 0 indicating that there is no linear correlation between x and y. A value of 0 does not rule out a nonlinear relationship.
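To make these three cases concrete, here is a minimal Python sketch. It assumes the common definition of correlation as covariance divided by the product of the two standard deviations; the function name `correlation` and the sample data are made up for illustration.

```python
import math

def correlation(xs, ys):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [-2, -1, 0, 1, 2]
rising = correlation(xs, [2 * x + 1 for x in xs])    # points on a line sloping up
falling = correlation(xs, [-3 * x + 4 for x in xs])  # points on a line sloping down
curved = correlation(xs, [x ** 2 for x in xs])       # y depends on x, but not linearly

print(rising, falling, curved)
```

Up to floating-point rounding, `rising` is 1 and `falling` is -1. `curved` is 0 even though y is fully determined by x, because the relationship is not linear.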
There are a few ways to calculate a correlation coefficient.
Pearson's Correlation Coefficient
This is one of the most popular ways to calculate a correlation coefficient.
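The equation to calculate the Pearson Correlation Coefficient is

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

Where
- $n$ is the number of (x, y) pairs in our dataset.
- $\sum x$ is the sum of all our original x values in our dataset.
- $\sum y$ is the sum of all our original y values in our dataset.
- $\sum x^2$ is the sum of all our x values, after squaring them first.
- $\sum y^2$ is the sum of all our y values, after squaring them first.
- $\sum xy$ is the sum of all our x and y pairs, after multiplying together first.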
For additional explanation, see this YouTube video.
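The sums described above translate almost directly into code. The following is a minimal sketch, assuming the standard computational form of Pearson's coefficient built from those five sums plus the number of pairs; the function name `pearson_r` and the sample data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from the raw sums of x, y, x^2, y^2 and x*y."""
    n = len(xs)
    sum_x = sum(xs)                                # sum of the x values
    sum_y = sum(ys)                                # sum of the y values
    sum_x2 = sum(x * x for x in xs)                # sum of the squared x values
    sum_y2 = sum(y * y for y in ys)                # sum of the squared y values
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # sum of the x*y products

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

For this data the sums are 15, 20, 55, 86 and 66, so r = 30 / sqrt(50 * 30), which is about 0.7746. The sketch does not guard against a zero denominator, which occurs when all x values or all y values are identical.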
Sample Correlation Coefficient
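The equation to calculate the Sample Correlation Coefficient is

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$

Where
- $n$ is the number of (x, y) pairs in our dataset.
- $\bar{x}$ is the mean of our x values.
- $\bar{y}$ is the mean of our y values.
- $s_x$ is the sample standard deviation of our x values.
- $s_y$ is the sample standard deviation of our y values.

When the standard deviations are sample standard deviations (dividing by n - 1), this gives the same value as Pearson's coefficient. It expresses the same quantity in terms of standardized x and y values.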
For further explanation, see this Khan Academy video.
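As a rough illustration of the sample form, the sketch below averages the products of standardized x and y values, dividing by n - 1. It assumes the sample standard deviation, which is what Python's `statistics.stdev` computes; the function name `sample_r` and the data are made up for illustration.

```python
from statistics import mean, stdev

def sample_r(xs, ys):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 denominator)
    total = sum(
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (n - 1)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_sample = sample_r(xs, ys)
print(round(r_sample, 4))  # 0.7746
```

Here the means are 3 and 4, and the deviation products sum to 6, so the result is 6 / (4 * sqrt(2.5) * sqrt(1.5)), about 0.7746.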