Linear Predictor

It sometimes happens that the joint probability distribution of X and Y is not completely known; or, if it is known, the calculation of E[Y|X=x] is mathematically intractable. If, however, the means and variances of X and Y and their correlation are known, then we can at least determine the best linear predictor of Y with respect to X.

To obtain the best linear predictor of Y with respect to X, we need to choose a and b so as to minimize $E[(Y-(a+bX))^2]$. Now,

$\begin{array}{rcl}
E[(Y-(a+bX))^2]&=&E[Y^2-2aY-2bXY+a^2+2abX+b^2X^2] \\ \\
&=&E[Y^2]-2aE[Y]-2bE[XY]+a^2+2abE[X]+b^2E[X^2]
\end{array}$
Taking partial derivatives with respect to a and b, we obtain
$\begin{array}{rcl}
\displaystyle\frac{\partial}{\partial a}E[(Y-a-bX)^2]&=&-2E[Y]+2a+2bE[X] \\ \\
\displaystyle\frac{\partial}{\partial b}E[(Y-a-bX)^2]&=&-2E[XY]+2aE[X]+2bE[X^2]\qquad(5.3)
\end{array}$
Setting the partial derivatives in Equation (5.3) equal to 0 and solving for a and b yields the solutions
$\begin{array}{rcl}
b&=&\displaystyle\frac{E[XY]-E[X]E[Y]}{E[X^2]-(E[X])^2}=\frac{\mbox{Cov}(X,Y)}{\sigma_x^2}=\frac{\rho\sigma_y}{\sigma_x} \\ \\
a&=&E[Y]-bE[X]=E[Y]-\displaystyle\frac{\rho\sigma_yE[X]}{\sigma_x}\qquad (5.4)
\end{array}$
where $\rho=\mbox{ Correlation }(X,Y)$, $\sigma_y^2=\mbox{Var}(Y)$, and $\sigma_x^2=\mbox{Var}(X)$. It is easy to verify that the values of a and b from Equation (5.4) minimize $E[(Y-a-bX)^2]$, and thus the best (in the sense of mean square error) linear predictor of Y with respect to X is $\mu_y+\displaystyle\frac{\rho\sigma_y}{\sigma_x}(X-\mu_x)$, where $\mu_y=E[Y]$ and $\mu_x=E[X]$.
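As a check on this algebra, here is a minimal SymPy sketch (not part of the text): it treats the moments $E[X]$, $E[Y]$, $E[XY]$, $E[X^2]$ as free symbols, differentiates the expanded mean square error, and solves the system in Equation (5.3). The symbol names EX, EY, EXY, EY2, EX2 are my own choices for illustration.

```python
import sympy as sp

a, b = sp.symbols('a b')
EY, EX, EXY, EY2, EX2 = sp.symbols('EY EX EXY EY2 EX2')

# Expanded objective from the display above: E[(Y - (a + bX))^2]
mse = EY2 - 2*a*EY - 2*b*EXY + a**2 + 2*a*b*EX + b**2*EX2

# Equations (5.3): set both partial derivatives to zero and solve for a, b
sol = sp.solve([sp.diff(mse, a), sp.diff(mse, b)], [a, b], dict=True)[0]

print(sp.simplify(sol[b]))  # (EXY - EX*EY)/(EX2 - EX**2), i.e. Cov(X,Y)/Var(X)
print(sp.simplify(sol[a]))  # EY - b*EX with b as above, matching Equation (5.4)
```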

The mean square error of this predictor is given by

$\begin{array}{l}
E\left[\left(Y-\mu_y-\rho\displaystyle\frac{\sigma_y}{\sigma_x}(X-\mu_x)\right)^2\right] \\ \\
=E[(Y-\mu_y)^2]+\rho^2\displaystyle\frac{\sigma_y^2}{\sigma_x^2}E[(X-\mu_x)^2]-2\rho\displaystyle\frac{\sigma_y}{\sigma_x}E[(X-\mu_x)(Y-\mu_y)] \\ \\
=\sigma_y^2+\rho^2\sigma_y^2-2\rho^2\sigma_y^2 \\ \\
=\sigma_y^2(1-\rho^2)\qquad\qquad (5.5)
\end{array}$
We note from Equation (5.5) that if $\rho$ is near +1 or -1, then the mean square error of the best linear predictor is near 0; whereas if $\rho=0$, the predictor reduces to the constant $\mu_y$ and the mean square error is $\sigma_y^2$.
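To make Equation (5.5) concrete, the following sketch (not part of the text) simulates a bivariate normal pair with arbitrarily chosen moments, forms the predictor $\mu_y+\frac{\rho\sigma_y}{\sigma_x}(X-\mu_x)$, and compares its empirical mean square error with $\sigma_y^2(1-\rho^2)$. All parameter values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_x, mu_y = 1.0, 2.0          # assumed means
sigma_x, sigma_y = 1.5, 0.8    # assumed standard deviations
rho = 0.9                      # assumed correlation

# Covariance matrix of (X, Y): off-diagonal entry is Cov(X,Y) = rho*sigma_x*sigma_y
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]
x, y = rng.multivariate_normal([mu_x, mu_y], cov, size=1_000_000).T

# Best linear predictor of Y with respect to X
y_hat = mu_y + rho * (sigma_y / sigma_x) * (x - mu_x)

print(np.mean((y - y_hat)**2))    # empirical mean square error
print(sigma_y**2 * (1 - rho**2))  # sigma_y^2 (1 - rho^2), Equation (5.5)
```

With these values, both printed numbers should come out near $0.8^2(1-0.9^2)=0.1216$, and pushing $\rho$ closer to 1 drives both toward 0, as the remark above predicts.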