A Quick Note on Correlation

The correlation coefficient of two variables measures the strength and direction of the linear relationship between them.

Strength: Visually, this is how “line-like” a scatter plot of the two variables appears.  An ideal line is infinitely thin, so the more tightly the points hug a line, the stronger the relationship.  Numerically, strength is indicated by how close the absolute value of the correlation coefficient is to 1.

Direction: Do the variables rise and fall together?  Or does one variable fall as the other rises?  This is indicated by the sign of the correlation coefficient (positive or negative respectively).
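
To make strength and direction concrete, here is a minimal sketch in Python.  It assumes the coefficient in question is the usual Pearson coefficient; the function name pearson_r and the sample data are made up purely for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data only: one series rises with x, the other falls.
xs = [1, 2, 3, 4, 5]
rising = [2.1, 3.9, 6.2, 8.0, 9.8]
falling = [9.8, 8.0, 6.2, 3.9, 2.1]

r_rising = pearson_r(xs, rising)    # close to +1: strong, same direction
r_falling = pearson_r(xs, falling)  # close to -1: strong, opposite direction
print(r_rising, r_falling)
```

Both results have an absolute value near 1 (strong), and the sign alone tells the two cases apart (direction).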

Converse considerations.  While a non-zero correlation coefficient implies some degree of linear dependence between the 2 variables, a correlation coefficient of 0 does not imply independence.

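  • The relationship might be non-linear.  The correlation coefficient detects only linear relationships, so a strong non-linear dependence (for example, y = x² over values of x that are symmetric about 0) can still produce a coefficient of 0.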
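
A small sketch of that point, again assuming the Pearson coefficient and using made-up data: y is completely determined by x, yet the coefficient comes out as 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# y depends entirely on x, but the dependence is quadratic, not linear.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
print(r)  # zero: the positive and negative halves cancel out
```

The variables here are as dependent as they can be, so a coefficient of 0 only rules out a linear relationship, not a relationship.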

Why did I bother to write this note?

Given a linear relationship between two variables (or, put another way, a linear dependence of one variable on another), one variable might be used to predict the other.  If the variables in question represent measurements, then this can be incredibly valuable, because some measurements are much harder to perform than others.  Substituting a prediction based on a cheap measurement for an expensive actual measurement can yield cost savings.  Cost might be “measured” in dollars, time or some other way, so this statistic can turn out to be valuable in many industries and disciplines.
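
One way such a prediction might be sketched is with an ordinary least-squares line.  This is only an illustration under stated assumptions: the fitting method is my choice, and the names cheap, expensive, fit_line and predict, along with the numbers, are all hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired readings taken while both measurements were affordable.
cheap = [1.0, 2.0, 3.0, 4.0, 5.0]
expensive = [2.1, 3.9, 6.2, 8.0, 9.8]

slope, intercept = fit_line(cheap, expensive)

def predict(cheap_value):
    """Estimate the expensive measurement from a cheap one."""
    return slope * cheap_value + intercept

# From now on, take only the cheap measurement and estimate the other.
estimate = predict(6.0)
print(slope, intercept, estimate)
```

The closer the absolute value of the correlation coefficient is to 1, the more such an estimate can be trusted in place of the real measurement.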
