Correlation
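Proposition
The correlation of two random variables X and Y, denoted by $\rho(X,Y)$, is defined, as long as $\mathrm{Var}(X)\,\mathrm{Var}(Y)$ is positive, by
$$\rho(X,Y)=\frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$$
It satisfies $-1 \le \rho(X,Y) \le 1$.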
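In fact, since $\mathrm{Var}(Z)=0$ implies that Z is constant with probability 1, we see that $\rho(X,Y)=1$ implies that $Y=a+bX$, where $b=\sigma_y/\sigma_x>0$, and $\rho(X,Y)=-1$ implies that $Y=a+bX$, where $b=-\sigma_y/\sigma_x<0$. Here $\sigma_x$ and $\sigma_y$ denote the standard deviations of X and Y.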
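The step from a zero variance to a linear relation can be made explicit. The following is a sketch of the standard argument, assuming $\sigma_x^2=\mathrm{Var}(X)$ and $\sigma_y^2=\mathrm{Var}(Y)$ are both positive (this notation is introduced here for illustration):

$$0 \le \mathrm{Var}\!\left(\frac{X}{\sigma_x}+\frac{Y}{\sigma_y}\right)=\frac{\mathrm{Var}(X)}{\sigma_x^2}+\frac{\mathrm{Var}(Y)}{\sigma_y^2}+\frac{2\,\mathrm{Cov}(X,Y)}{\sigma_x\sigma_y}=2\,[1+\rho(X,Y)],$$

$$0 \le \mathrm{Var}\!\left(\frac{X}{\sigma_x}-\frac{Y}{\sigma_y}\right)=\frac{\mathrm{Var}(X)}{\sigma_x^2}+\frac{\mathrm{Var}(Y)}{\sigma_y^2}-\frac{2\,\mathrm{Cov}(X,Y)}{\sigma_x\sigma_y}=2\,[1-\rho(X,Y)].$$

The first line gives $\rho(X,Y)\ge -1$ and the second gives $\rho(X,Y)\le 1$. If $\rho(X,Y)=1$, the second variance is 0, so $X/\sigma_x-Y/\sigma_y=c$ with probability 1 for some constant $c$, that is, $Y=-c\,\sigma_y+(\sigma_y/\sigma_x)X$. If $\rho(X,Y)=-1$, the first variance is 0, so $X/\sigma_x+Y/\sigma_y=c$ and $Y=c\,\sigma_y-(\sigma_y/\sigma_x)X$.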
If $Y=a+bX$ with $b\neq 0$, then $\rho(X,Y)$ is either +1 or -1, depending on the sign of b.
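The correlation coefficient is a measure of the degree of linearity between X and Y. A value of $\rho(X,Y)$ near +1 or -1 indicates a high degree of linearity between X and Y, whereas a value near 0 indicates a lack of such linearity. A positive value of $\rho(X,Y)$ indicates that Y tends to increase when X does, whereas a negative value indicates that Y tends to decrease when X increases. If $\rho(X,Y)=0$, then X and Y are said to be uncorrelated.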
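A small numerical sketch may help make these cases concrete. It computes the correlation of paired data treated as equally likely outcomes, using a hypothetical helper `pearson` written for this illustration (it is not a library function). The data sets are made up: two exact linear relations with slopes of opposite sign, and the relation Y = X^2 on points symmetric about 0, where Y is completely determined by X yet the correlation is zero.

```python
import math

def pearson(xs, ys):
    """Correlation Cov(X, Y) / sqrt(Var(X) * Var(Y)) of paired data.

    The common 1/n factor in the covariance and the variances cancels,
    so plain sums of products of deviations are used.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x * var_y <= 0:
        raise ValueError("correlation is undefined when a variance is zero")
    return cov / math.sqrt(var_x * var_y)

# Illustrative data: five points symmetric about 0.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

rho_up = pearson(xs, [1 + 3 * x for x in xs])    # Y = a + bX with b > 0
rho_down = pearson(xs, [1 - 3 * x for x in xs])  # Y = a + bX with b < 0
rho_square = pearson(xs, [x * x for x in xs])    # Y = X^2: dependent on X

print(rho_up, rho_down, rho_square)
```

The last case shows that a correlation of zero rules out a linear trend only; it does not imply independence.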
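Although the sample mean $\bar{X}$ and the deviation $X_i-\bar{X}$ are uncorrelated, they are not, in general, independent. However, in the special case where the $X_i$ are normal random variables, it turns out that not only is $\bar{X}$ independent of a single deviation, but it is independent of the entire sequence of deviations $X_j-\bar{X}$, $j=1,\dots,n$. Consequently, the sample mean $\bar{X}$ and the sample variance $S^2/(n-1)$ are independent, with $S^2/\sigma^2$ having a chi-squared distribution with $n-1$ degrees of freedom. Here $S^2=\sum_{i=1}^{n}(X_i-\bar{X})^2$ and $\sigma^2$ is the common variance of the $X_i$.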
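The normal-sample result can be checked roughly by simulation. The sketch below assumes independent normal $X_i$ and takes $S^2$ to be the sum of squared deviations from the sample mean; the parameter values, the seed, and the function names are arbitrary choices made for illustration. A chi-squared variable with $n-1$ degrees of freedom has mean $n-1$ and variance $2(n-1)$, so the simulated values of $S^2/\sigma^2$ should have roughly those moments. A sample correlation near 0 between $\bar{X}$ and $S^2$ is consistent with independence but does not prove it.

```python
import math
import random

def simulate(n, mu, sigma, trials, seed):
    """Draw `trials` normal samples of size n; return sample means and S^2/sigma^2."""
    rng = random.Random(seed)
    means, scaled = [], []
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        s2 = sum((x - xbar) ** 2 for x in xs)  # S^2: sum of squared deviations
        means.append(xbar)
        scaled.append(s2 / sigma ** 2)
    return means, scaled

def average(values):
    return sum(values) / len(values)

def correlation(us, vs):
    """Sample correlation of two equally long lists."""
    mu_u, mu_v = average(us), average(vs)
    cov = sum((u - mu_u) * (v - mu_v) for u, v in zip(us, vs))
    var_u = sum((u - mu_u) ** 2 for u in us)
    var_v = sum((v - mu_v) ** 2 for v in vs)
    return cov / math.sqrt(var_u * var_v)

n = 5  # sample size, so S^2/sigma^2 should behave like chi-squared with 4 d.f.
means, scaled = simulate(n=n, mu=1.0, sigma=2.0, trials=20000, seed=0)

avg_scaled = average(scaled)                                   # expect about n - 1
var_scaled = average([(s - avg_scaled) ** 2 for s in scaled])  # expect about 2(n - 1)
rho_mean_s2 = correlation(means, scaled)                       # expect about 0

print(avg_scaled, var_scaled, rho_mean_s2)
```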