Correlation
Proposition
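The correlation of two random variables X and Y, denoted by $\rho(X,Y)$, is defined, as long as Var(X) Var(Y) is positive, by
$$\rho(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$
The correlation always satisfies $-1 \le \rho(X,Y) \le 1$.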
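In fact, since Var(Z)=0 implies that Z is constant with probability 1, we
see that $\rho(X,Y) = 1$ implies that Y=a+bX, where $b = \sigma_y/\sigma_x > 0$,
and $\rho(X,Y) = -1$ implies that Y=a+bX, where $b = -\sigma_y/\sigma_x < 0$.
Here $\sigma_x$ and $\sigma_y$ denote the standard deviations of X and Y, and the
implications follow because $\mathrm{Var}(X/\sigma_x - Y/\sigma_y) = 2[1 - \rho(X,Y)]$
and $\mathrm{Var}(X/\sigma_x + Y/\sigma_y) = 2[1 + \rho(X,Y)]$.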
If Y=a+bX, then $\rho(X,Y)$ is either +1 or -1, depending on the sign of b.
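As a small numerical check of the last statement, the sketch below computes the correlation directly from the definition for Y=a+bX with one positive and one negative slope. The helper name `correlation`, the data values, and the choices a=3, b=2 and b=-0.5 are illustrative assumptions, and each listed pair is treated as equally likely.

```python
import math

def correlation(xs, ys):
    """Correlation of paired values, treating each pair as equally likely."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 4.0, 7.0]  # arbitrary illustrative values of X

rho_up = correlation(xs, [3.0 + 2.0 * x for x in xs])    # slope b = 2 > 0
rho_down = correlation(xs, [3.0 - 0.5 * x for x in xs])  # slope b = -0.5 < 0
print(rho_up, rho_down)
```

Up to floating-point rounding, the two printed values are +1 and -1, matching the sign of b.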
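The correlation coefficient is a measure of the degree of linearity between X and Y. A value of
$\rho(X,Y)$ near +1 or -1 indicates a high degree of
linearity between X and Y, whereas a value near 0 indicates a lack of such
linearity. A positive value of $\rho(X,Y)$ indicates that Y tends to increase when X does,
whereas a negative value indicates that Y tends to decrease when X increases.
If $\rho(X,Y) = 0$, then X and Y are said to be uncorrelated.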
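To make concrete the point that the correlation measures linearity rather than dependence in general, here is a minimal sketch with an assumed toy distribution: X is uniform on {-2, -1, 0, 1, 2} and Y = X^2. Y is completely determined by X, yet the covariance, and hence the correlation, is exactly zero. The distribution and the helper `mean` are illustrative choices, not taken from the text.

```python
from fractions import Fraction

# Assumed toy distribution: X uniform on these five values, Y = X^2.
xs = [Fraction(v) for v in (-2, -1, 0, 1, 2)]
ys = [x * x for x in xs]

def mean(values):
    """Exact expectation under equally likely outcomes."""
    return sum(values) / len(values)

cov_xy = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
var_x = mean([x * x for x in xs]) - mean(xs) ** 2
var_y = mean([y * y for y in ys]) - mean(ys) ** 2

# Both variances are positive, so the correlation is defined,
# and it equals 0 because the covariance is 0.
print(cov_xy, var_x, var_y)
```

Exact rational arithmetic is used so that the zero covariance is not blurred by rounding.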
Although $\bar{X}$ and the deviation $X_i - \bar{X}$ are uncorrelated,
they are not, in general, independent. However, in the special case where the
$X_i$ are normal random variables, it turns out that not only is $\bar{X}$
independent of a single deviation, but it is independent of the entire sequence
of deviations $X_j - \bar{X}$, $j = 1, \ldots, n$.
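The first claim, uncorrelated but not independent, can be checked exactly on a tiny non-normal case. The sketch below assumes n = 2 and independent fair 0/1 coin flips X1, X2 (an illustrative choice, not from the text), enumerates the four equally likely outcomes, and compares the covariance with a joint probability. The helper names `expect`, `xbar`, and `dev` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All outcomes (X1, X2) of two fair 0/1 flips; each has probability 1/4.
outcomes = list(product((0, 1), repeat=2))
p = Fraction(1, len(outcomes))

def expect(f):
    """Exact expectation of f(X1, X2) over the equally likely outcomes."""
    return sum(p * f(x1, x2) for x1, x2 in outcomes)

def xbar(x1, x2):
    """Sample mean of the two observations."""
    return Fraction(x1 + x2, 2)

def dev(x1, x2):
    """Deviation of the first observation from the sample mean."""
    return x1 - xbar(x1, x2)

cov = expect(lambda a, b: xbar(a, b) * dev(a, b)) - expect(xbar) * expect(dev)

# Independence would require P(xbar = 0, dev = 0) = P(xbar = 0) * P(dev = 0).
p_joint = expect(lambda a, b: int(xbar(a, b) == 0 and dev(a, b) == 0))
p_xbar0 = expect(lambda a, b: int(xbar(a, b) == 0))
p_dev0 = expect(lambda a, b: int(dev(a, b) == 0))
print(cov, p_joint, p_xbar0 * p_dev0)
```

The covariance is 0, while the joint probability 1/4 differs from the product 1/8, so the sample mean and the deviation are uncorrelated but dependent in this example.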
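When the $X_i$ are independent normal random variables with common variance $\sigma^2$,
the sample mean $\bar{X}$ and the sample variance $S^2/(n-1)$ are independent, with
$S^2/\sigma^2$ having a chi-squared distribution with n-1 degrees of freedom, where
$S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2$.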
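A rough simulation can make this result plausible. The sketch below draws repeated normal samples and checks that $S^2/\sigma^2$ has mean close to n-1 and variance close to 2(n-1), the mean and variance of a chi-squared distribution with n-1 degrees of freedom, and that its sample correlation with the sample mean is close to 0. The values of n, mu, sigma, the number of trials, and the seed are arbitrary assumptions. A near-zero correlation is only consistent with independence and does not prove it.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
n, mu, sigma, trials = 5, 10.0, 2.0, 20000  # illustrative values, not from the text

xbars, scaled = [], []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample)  # S^2 as a sum of squared deviations
    xbars.append(xbar)
    scaled.append(s2 / sigma ** 2)

mean_scaled = statistics.fmean(scaled)     # compare with n - 1
var_scaled = statistics.pvariance(scaled)  # compare with 2 * (n - 1)

# Sample correlation between the sample mean and S^2 / sigma^2.
mean_xbar = statistics.fmean(xbars)
cov = sum((a - mean_xbar) * (b - mean_scaled) for a, b in zip(xbars, scaled)) / trials
rho = cov / (statistics.pstdev(xbars) * statistics.pstdev(scaled))
print(mean_scaled, var_scaled, rho)
```

With n = 5, the first two printed values should land near 4 and 8, and the third near 0, up to simulation noise.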