Mathematics/Statistics/Correlation Coefficients

From Dev Wiki

Latest revision as of 17:21, 25 October 2020

Tip: This is often represented as r.

The Correlation Coefficient is a value that describes how well a straight line fits the data. It is similar to covariance, except that the covariance is divided by the standard deviations of x and y, so a correlation coefficient always lies in the range [-1, 1].

A value of exactly 1 indicates a perfect positive linear correlation between x and y: every point lies on a straight line with positive slope. That is, as x increases, so does y, and as y increases, so does x.

A value of exactly -1 indicates a perfect negative linear correlation between x and y: every point lies on a straight line with negative slope. That is, as x increases, y decreases, and as y increases, x decreases.

As the value approaches 0, the linear correlation becomes weaker and weaker, with 0 indicating no linear correlation between x and y. Note that x and y can still be related in a non-linear way when the coefficient is 0.
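To make these three cases concrete, here is a minimal Python sketch. The helper name `correlation` and the small datasets are made up for illustration; the helper uses the mean-deviation form of the coefficient (sum of products of deviations divided by the square root of the product of the summed squared deviations).

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r of paired samples (mean-deviation form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
r_up = correlation(xs, [2 * x + 1 for x in xs])     # points on a rising line
r_down = correlation(xs, [-3 * x + 4 for x in xs])  # points on a falling line
r_curve = correlation(xs, [x * x for x in xs])      # points on a parabola

print(r_up, r_down, r_curve)
```

The rising line gives 1, the falling line gives -1, and the parabola gives 0 even though y is completely determined by x there. The coefficient only measures how well a straight line fits.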

There are a few ways to calculate a correlation coefficient.

Pearson's Correlation Coefficient

This is one of the most popular ways of calculating a correlation coefficient.

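The equation to calculate the Pearson Correlation Coefficient is

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$

Where

  • $n$ is the number of (x, y) pairs in our dataset.
  • $\sum x$ is the sum of all our original x values in our dataset.
  • $\sum y$ is the sum of all our original y values in our dataset.
  • $\sum x^2$ is the sum of all our x values, after squaring them first.
  • $\sum y^2$ is the sum of all our y values, after squaring them first.
  • $\sum xy$ is the sum of all our x and y pairs, after multiplying together first.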

For additional explanation, see this YouTube video.
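As a rough sketch of how the five sums combine, the following Python assumes the usual computational form of Pearson's coefficient, r = (n Σxy − Σx Σy) / sqrt([n Σx² − (Σx)²] [n Σy² − (Σy)²]), where n is the number of pairs. The function name `pearson_r` and the sample data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the five running sums."""
    n = len(xs)
    sum_x = sum(xs)                              # sum of x values
    sum_y = sum(ys)                              # sum of y values
    sum_x2 = sum(x * x for x in xs)              # sum of squared x values
    sum_y2 = sum(y * y for y in ys)              # sum of squared y values
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # sum of x*y products
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

For this data the sums are Σx = 15, Σy = 20, Σx² = 55, Σy² = 86 and Σxy = 66, so r = 30 / sqrt(50 · 30) ≈ 0.7746. The sketch does not guard against a zero denominator, which occurs when all x values or all y values are identical.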


Sample Correlation Coefficient

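The equation to calculate the Sample Correlation Coefficient is

$$ r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right) $$

Where

  • $n$ is the number of (x, y) pairs in our sample.
  • $\bar{x}$ and $\bar{y}$ are the sample means of the x and y values.
  • $s_x$ and $s_y$ are the sample standard deviations of the x and y values.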

For further explanation, see this Khan Academy video.
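A minimal Python sketch of the sample correlation coefficient follows, assuming the standard z-score form: convert each x and y to a z-score using the sample mean and the sample standard deviation (with n - 1 in the denominator), multiply the paired z-scores, add them up, and divide by n - 1. The function name `sample_correlation` and the data are made up for illustration.

```python
import math

def sample_correlation(xs, ys):
    """Sample correlation coefficient as a scaled sum of z-score products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (n - 1 in the denominator).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Sum of the products of paired z-scores, then divide by n - 1.
    total = sum(
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (n - 1)

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = sample_correlation(xs, ys)
print(round(r, 4))  # 0.7746
```

This form is algebraically equivalent to the sums-based Pearson formula, so both give the same value on the same data.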