Correlation

Proposition
The correlation of two random variables X and Y, denoted by $\rho (X,Y)$, is defined, as long as $Var(X)Var(Y)$ is positive, by

$\rho(X,Y)=\displaystyle\frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}}$
It can be shown that $-1\leq\rho(X,Y)\leq 1$.
Proof:
Suppose that X and Y have variances given by $\sigma_x^2$ and $\sigma_y^2$, respectively. Then
$\begin{array}{rcl}
0&\leq&Var\left(\displaystyle\frac{X}{\sigma_x}+\frac{Y}{\sigma_y}\right) \\ \\
&=&\displaystyle\frac{Var(X)}{\sigma_x^2}+\frac{Var(Y)}{\sigma_y^2}+\frac{2Cov(X,Y)}{\sigma_x\sigma_y} \\ \\
&=&2[1+\rho(X,Y)]
\end{array}$
implying that $-1\leq\rho(X,Y)$. Similarly,
$0\leq Var\left(\displaystyle\frac{X}{\sigma_x}-\frac{Y}{\sigma_y}\right)=2[1-\rho(X,Y)]$
implying that $\rho(X,Y)\leq 1\qquad\rule[0.02em]{1.0mm}{1.5mm}$
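As a quick numerical illustration, the following Python sketch (assuming NumPy is available; the linear-plus-noise model and its constants are arbitrary choices, not from the text) estimates $\rho(X,Y)$ from simulated data and shows that it falls in $[-1,1]$:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 0.6 * x + rng.normal(size=10_000)      # hypothetical model: Y linear in X plus noise

cov_xy = np.cov(x, y, ddof=0)[0, 1]        # sample estimate of Cov(X, Y)
rho = cov_xy / np.sqrt(x.var() * y.var())  # the definition above
print(rho)                                  # a value in [-1, 1], here about 0.51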


In fact, since Var(Z)=0 implies that Z is constant with probability 1, we see that $\rho(X,Y)=1$ implies that Y=a+bX, where $b=\sigma_y/\sigma_x>0$, and $\rho(X,Y)=-1$ implies that Y=a+bX, where $b=-\sigma_y/\sigma_x<0$.

Conversely, if Y=a+bX, then $\rho (X,Y)$ is either +1 or -1, depending on the sign of b.

The correlation coefficient is a measure of the degree of linearity between X and Y. A value of $\rho (X,Y)$ near +1 or -1 indicates a high degree of linearity between X and Y, whereas a value near 0 indicates a lack of such linearity. A positive value of $\rho(X,Y)$ indicates that Y tends to increase when X does, whereas a negative value indicates that Y tends to decrease as X increases. If $\rho(X,Y)=0$, then X and Y are said to be uncorrelated.
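The extreme and null cases can be seen numerically in the minimal sketch below (NumPy assumed; the particular constants are illustrative): an exact linear relation with b<0 gives a correlation of -1, while an independent pair gives a correlation near 0.

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)

y_line = 3.0 - 2.0 * x                   # exact linear relation with b < 0
y_indep = rng.normal(size=10_000)        # generated independently of x

print(np.corrcoef(x, y_line)[0, 1])      # -1.0 (up to floating-point rounding)
print(np.corrcoef(x, y_indep)[0, 1])     # near 0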

Example
Let $I_A$ and $I_B$ be indicator variables for the events A and B. That is,
$I_A=\left\{
\begin{array}{ll}
1&\mbox{ if }A\mbox{ occurs } \\ \\
0&\mbox{ otherwise }
\end{array}\right .$ $I_B=\left\{
\begin{array}{ll}
1&\mbox{ if }B\mbox{ occurs } \\ \\
0&\mbox{ otherwise }
\end{array}\right .$
Then $E[I_A]=P(A)$, $E[I_B]=P(B)$, and $E[I_AI_B]=P(AB)$, so
$\begin{array}{rcl}
Cov(I_A,I_B)&=&P(AB)-P(A)P(B) \\ \\
&=&P(B)[P(A\vert B)-P(A)]
\end{array}$
Thus we obtain the quite intuitive result that the indicator variables for A and B are either positively correlated, uncorrelated, or negatively correlated depending on whether P(A|B) is greater than, equal to, or less than $P(A)\qquad\rule[0.02em]{1.0mm}{1.5mm}$
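A hedged simulation of this result (NumPy assumed; the nested events below are an illustrative choice, not from the text): since B implies A here, $P(A\vert B)=1>P(A)$, so the estimated covariance comes out positive.

import numpy as np

rng = np.random.default_rng(2)
u = rng.uniform(size=100_000)
ind_a = (u < 0.5).astype(float)          # indicator of the event A: u < 0.5
ind_b = (u < 0.3).astype(float)          # indicator of B: u < 0.3, so B implies A

cov = np.mean(ind_a * ind_b) - np.mean(ind_a) * np.mean(ind_b)
print(cov)   # approximately P(B)[P(A|B) - P(A)] = 0.3 * (1 - 0.5) = 0.15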

Example
Let $X_1,\ldots,X_n$ be independent and identically distributed random variables having variance $\sigma^2$, and let $\overline{X}=\displaystyle\frac{1}{n}\sum_{j=1}^{n}X_j$. Show that $Cov(X_i-\overline{X},\overline{X})=0$.
Solution:
$\begin{array}{rcl}
Cov(X_i-\overline{X},\overline{X})&=&Cov(X_i,\overline{X})-Cov(\overline{X},\overline{X}) \\ \\
&=&Cov\left(X_i,\displaystyle\frac{1}{n}\sum_{j=1}^{n}X_j\right)-Var(\overline{X}) \\ \\
&=&\displaystyle\frac{1}{n}\sum_{j=1}^{n}Cov(X_i,X_j)-\frac{\sigma^2}{n} \\ \\
&=&\displaystyle\frac{\sigma^2}{n}-\frac{\sigma^2}{n}=0
\end{array}$
the final equality follows since
$Cov(X_i,X_j)=\left\{
\begin{array}{ll}
0&\mbox{ if } j\neq i\mbox{ by independence} \\ \\
\sigma^2&\mbox{ if } j=i\mbox{ since } Var(X_i)=\sigma^2
\end{array}\right .$
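As a sanity check, the sketch below (NumPy assumed; the exponential sampling distribution and n=5 are arbitrary choices) estimates $Cov(X_1-\overline{X},\overline{X})$ over many replications:

import numpy as np

rng = np.random.default_rng(3)
n, reps = 5, 200_000
samples = rng.exponential(size=(reps, n))   # each row: an i.i.d. sample of size n

xbar = samples.mean(axis=1)                 # sample mean of each row
dev_1 = samples[:, 0] - xbar                # deviation of X_1 from the mean

print(np.cov(dev_1, xbar)[0, 1])            # near 0, as the example asserts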

Although $\overline{X}$ and the deviation $X_i-\overline{X}$ are uncorrelated, they are not, in general, independent. However, in the special case where the $X_i$ are normal random variables, it turns out that not only is $\overline{X}$ independent of a single deviation, but it is independent of the entire sequence of deviations $X_j-\overline{X}, j=1,\ldots,n$. In this case the sample mean $\overline{X}$ and the sample variance $S^2/(n-1)$, where $S^2=\sum_{j=1}^{n}(X_j-\overline{X})^2$, are independent, with $S^2/\sigma^2$ having a chi-squared distribution with $n-1$ degrees of freedom.
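A rough numerical sketch of this contrast (NumPy assumed; using the sample covariance of $\overline{X}$ and the sample variance as a crude diagnostic, since independence forces that covariance to zero): for normal samples the estimate is near 0, while for skewed exponential samples it is clearly positive, so $\overline{X}$ and the deviations cannot be independent there.

import numpy as np

rng = np.random.default_rng(4)
n, reps = 5, 200_000

for draw in (rng.normal, rng.exponential):
    data = draw(size=(reps, n))              # reps independent samples of size n
    xbar = data.mean(axis=1)                 # sample means
    s2 = data.var(axis=1, ddof=1)            # sample variances S^2/(n-1)
    print(draw.__name__, np.cov(xbar, s2)[0, 1])   # ~0 for normal, >0 for exponential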