Linear Predictor
It sometimes happens that the joint probability distribution of X and Y is not completely known; or if it is known, it is such that the calculation of E[Y|X=x] is mathematically intractable. If, however, the means and variances of X and Y and the correlation of X and Y are known, then we can at least determine the best linear predictor of Y with respect to X.
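To obtain the best linear predictor of Y with respect to X, we need to choose a and b so as to minimize E[(Y - (a + bX))^2]. Now,
\[
E[(Y - (a + bX))^2] = E[Y^2] - 2aE[Y] - 2bE[XY] + a^2 + 2abE[X] + b^2E[X^2]
\]
Taking partial derivatives, we obtain
\[
\frac{\partial}{\partial a} E[(Y - a - bX)^2] = -2E[Y] + 2a + 2bE[X]
\]
\[
\frac{\partial}{\partial b} E[(Y - a - bX)^2] = -2E[XY] + 2aE[X] + 2bE[X^2]
\]
Setting both derivatives equal to 0 and solving for a and b yields
\[
b = \frac{E[XY] - E[X]E[Y]}{E[X^2] - (E[X])^2} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x^2} = \rho \frac{\sigma_y}{\sigma_x}
\]
\[
a = E[Y] - bE[X] = \mu_y - \rho \frac{\sigma_y}{\sigma_x} \mu_x
\]
where \mu_x = E[X], \mu_y = E[Y], \sigma_x^2 = Var(X), \sigma_y^2 = Var(Y), and \rho is the correlation of X and Y. Hence, the best linear predictor of Y with respect to X is
\[
\mu_y + \rho \frac{\sigma_y}{\sigma_x} (X - \mu_x)
\]
The mean square error of this predictor is given by
\[
E\left[\left(Y - \mu_y - \rho \frac{\sigma_y}{\sigma_x}(X - \mu_x)\right)^2\right]
= \sigma_y^2 + \rho^2 \sigma_y^2 - 2\rho^2 \sigma_y^2
= \sigma_y^2 (1 - \rho^2)
\]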
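As a numerical illustration, the following minimal sketch computes the coefficients of the best linear predictor a + bX from the five known quantities, using the standard result b = rho * sigma_y / sigma_x and a = mu_y - b * mu_x, together with the mean square error sigma_y^2 * (1 - rho^2). The function name best_linear_predictor, its parameter names, and the input values are hypothetical and chosen only for this example.

```python
import math


def best_linear_predictor(mu_x, mu_y, var_x, var_y, rho):
    """Return (a, b, mse) for the best linear predictor a + b*X of Y.

    Hypothetical helper for illustration: it needs only the means,
    variances, and correlation of X and Y, not their joint distribution.
    """
    if var_x <= 0 or var_y < 0:
        raise ValueError("var_x must be positive and var_y non-negative")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")

    # Slope: b = rho * sigma_y / sigma_x
    b = rho * math.sqrt(var_y / var_x)
    # Intercept: a = mu_y - b * mu_x
    a = mu_y - b * mu_x
    # Mean square error: sigma_y^2 * (1 - rho^2)
    mse = var_y * (1.0 - rho ** 2)
    return a, b, mse


# Illustrative inputs (assumed values, not taken from the text).
a, b, mse = best_linear_predictor(mu_x=2.0, mu_y=5.0, var_x=4.0, var_y=9.0, rho=0.5)
print(a, b, mse)  # 3.5 0.75 6.75
```

With these inputs the predictor is 3.5 + 0.75X, and its mean square error of 6.75 is smaller than Var(Y) = 9, the error incurred by predicting Y with the constant mu_y alone. As rho approaches +1 or -1, the mean square error approaches zero.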