Mean squared prediction error

In statistics, the mean squared prediction error (MSPE), also known as the mean squared error of the predictions, of a smoothing, curve fitting, or regression procedure is the expected value of the squared prediction errors (PE), the squared difference between the fitted values implied by the predictive function $\widehat{g}$ and the values of the (unobservable) true function $g$. It is an inverse measure of the explanatory power of $\widehat{g}$ and can be used in the process of cross-validation of an estimated model. Knowledge of $g$ would be required in order to calculate the MSPE exactly; in practice, MSPE is estimated.[1]

Formulation

If the smoothing or fitting procedure has projection matrix (i.e., hat matrix) $L$, which maps the observed values vector $y$ to the predicted values vector $\hat{y} = Ly$, then PE and MSPE are formulated as:

$$\operatorname{PE}_i = g(x_i) - \widehat{g}(x_i),$$

$$\operatorname{MSPE} = \operatorname{E}\left[\operatorname{PE}_i^2\right] = \sum_{i=1}^{n} \operatorname{PE}_i^2 \big/ n.$$
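As a concrete illustration, here is a minimal sketch in Python with NumPy, assuming a hypothetical straight-line data-generating function $g$ and an ordinary least-squares fit; it builds the hat matrix $L$, the fitted values, and the resulting PE and MSPE:

```python
import numpy as np

rng = np.random.default_rng(0)

# True (normally unobservable) regression function g.
g = lambda x: 1.0 + 2.0 * x

n = 50
x = rng.uniform(0, 1, n)
y = g(x) + rng.normal(0, 0.5, n)      # noisy observations

# Hat (projection) matrix L of a straight-line OLS fit: y_hat = L y.
X = np.column_stack([np.ones(n), x])  # design matrix
L = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = L @ y                         # fitted values g_hat(x_i)

pe = g(x) - y_hat                     # prediction errors PE_i
mspe = np.mean(pe ** 2)               # MSPE = sum of PE_i^2 over n
print(mspe)
```

In practice $g$ is unknown, so this quantity cannot be computed directly; the sections below describe how it is estimated.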

The MSPE can be decomposed into two terms: the squared bias (mean error) of the fitted values and the variance of the fitted values:

$$\operatorname{MSPE} = \operatorname{ME}^2 + \operatorname{VAR},$$

$$\operatorname{ME} = \operatorname{E}\left[\widehat{g}(x_i) - g(x_i)\right],$$

$$\operatorname{VAR} = \operatorname{E}\left[\left(\widehat{g}(x_i) - \operatorname{E}\left[\widehat{g}(x_i)\right]\right)^2\right].$$
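The decomposition can be checked numerically. The following Monte Carlo sketch (a hypothetical setup: a deliberately misspecified straight-line fit to a sinusoidal $g$) re-fits over many noise draws and compares MSPE with $\operatorname{ME}^2 + \operatorname{VAR}$:

```python
import numpy as np

rng = np.random.default_rng(1)
g = lambda x: np.sin(2 * np.pi * x)   # true function, known in this sketch

n, reps = 40, 5000
x = np.linspace(0, 1, n)
X = np.column_stack([np.ones(n), x])  # straight-line fit: deliberately biased
L = X @ np.linalg.inv(X.T @ X) @ X.T

fits = np.empty((reps, n))
for r in range(reps):
    y = g(x) + rng.normal(0, 0.3, n)
    fits[r] = L @ y                   # g_hat(x_i) for this noise draw

me = fits.mean(axis=0) - g(x)         # bias (mean error) at each x_i
var = fits.var(axis=0)                # variance of the fitted values
mspe = np.mean((g(x) - fits) ** 2)    # overall MSPE

# MSPE = ME^2 + VAR holds up to Monte Carlo error:
print(mspe, np.mean(me ** 2 + var))
```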

The quantity $\operatorname{SSPE} = n \cdot \operatorname{MSPE}$ is called the sum squared prediction error. The root mean squared prediction error is the square root of MSPE: $\operatorname{RMSPE} = \sqrt{\operatorname{MSPE}}$.

Computation of MSPE over out-of-sample data

Further information: Cross-validation (statistics)

The mean squared prediction error can be computed exactly in two contexts. First, with a data sample of length n, the data analyst may run the regression over only q of the data points (with q < n), holding back the other n − q data points with the specific purpose of using them to compute the estimated model's MSPE out of sample (i.e., not using data that were used in the model estimation process). Since the regression process is tailored to fit the q in-sample points, the in-sample MSPE will normally be smaller than the out-of-sample MSPE computed over the n − q held-back points. If the out-of-sample MSPE is only slightly larger than the in-sample MSPE, the model is viewed favorably. If two models are to be compared, the one with the lower MSPE over the n − q out-of-sample data points is viewed more favorably, regardless of the models' relative in-sample performances. The out-of-sample MSPE in this context is exact for the out-of-sample data points that it was computed over, but is merely an estimate of the model's MSPE for the mostly unobserved population from which the data were drawn.
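A minimal holdout sketch of this procedure, with hypothetical data and the held-back responses $y_i$ standing in for the unobservable $g(x_i)$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 100, 70                              # q in-sample, n - q held back
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)   # hypothetical data

# Fit on the first q points only; hold back the rest.
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X[:q], y[:q], rcond=None)[0]

mspe_in = np.mean((y[:q] - X[:q] @ beta) ** 2)   # in-sample
mspe_out = np.mean((y[q:] - X[q:] @ beta) ** 2)  # out-of-sample (held back)

# Typically mspe_in <= mspe_out; a small gap speaks well of the model.
print(mspe_in, mspe_out)
```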

Second, as time goes on more data may become available to the data analyst, and then the MSPE can be computed over these new data.

Estimation of MSPE over the population

When the model has been estimated over all available data with none held back, the MSPE of the model over the entire population of mostly unobserved data can be estimated as follows.

For the model $y_i = g(x_i) + \sigma\varepsilon_i$ where $\varepsilon_i \sim \mathcal{N}(0, 1)$, one may write

$$n \cdot \operatorname{MSPE}(L) = g^{\mathsf{T}}(I - L)^{\mathsf{T}}(I - L)g + \sigma^2 \operatorname{tr}\left(L^{\mathsf{T}}L\right).$$

Using in-sample data values, the first term on the right side is equivalent to

$$\sum_{i=1}^{n} \left(\operatorname{E}\left[g(x_i) - \widehat{g}(x_i)\right]\right)^2 = \operatorname{E}\left[\sum_{i=1}^{n} \left(y_i - \widehat{g}(x_i)\right)^2\right] - \sigma^2 \operatorname{tr}\left((I - L)^{\mathsf{T}}(I - L)\right).$$

Thus, since $\operatorname{tr}\left((I - L)^{\mathsf{T}}(I - L)\right) = n - 2\operatorname{tr}(L) + \operatorname{tr}\left(L^{\mathsf{T}}L\right)$,

$$n \cdot \operatorname{MSPE}(L) = \operatorname{E}\left[\sum_{i=1}^{n} \left(y_i - \widehat{g}(x_i)\right)^2\right] - \sigma^2\left(n - 2\operatorname{tr}(L)\right).$$

If $\sigma^2$ is known or well-estimated by $\widehat{\sigma}^2$, it becomes possible to estimate MSPE by

$$n \cdot \widehat{\operatorname{MSPE}}(L) = \sum_{i=1}^{n} \left(y_i - \widehat{g}(x_i)\right)^2 - \widehat{\sigma}^2\left(n - 2\operatorname{tr}(L)\right).$$
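A sketch of this estimator, under the assumptions above and with $\sigma^2$ treated as known so the estimate can be compared against the true MSPE (the polynomial smoother and data are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
g = lambda x: np.sin(2 * np.pi * x)
sigma = 0.3
n = 60
x = np.linspace(0, 1, n)
y = g(x) + sigma * rng.normal(size=n)

# Linear smoother: cubic polynomial fit, y_hat = L y.
X = np.column_stack([x ** k for k in range(4)])
L = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = L @ y

rss = np.sum((y - y_hat) ** 2)              # residual sum of squares
trL = np.trace(L)                           # equals 4, the number of parameters

# n * MSPE_hat(L) = RSS - sigma^2 * (n - 2 tr(L)); sigma^2 assumed known here.
mspe_hat = (rss - sigma ** 2 * (n - 2 * trL)) / n
true_mspe = np.mean((g(x) - y_hat) ** 2)    # computable only because g is known
print(mspe_hat, true_mspe)
```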

Colin Mallows advocated this method in the construction of his model selection statistic $C_p$, which is a normalized version of the estimated MSPE:

$$C_p = \frac{\sum_{i=1}^{n} \left(y_i - \widehat{g}(x_i)\right)^2}{\widehat{\sigma}^2} - n + 2p,$$

where $p$ is the number of estimated parameters and $\widehat{\sigma}^2$ is computed from the version of the model that includes all possible regressors.
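As a usage sketch, $C_p$ can be computed for a family of nested polynomial fits, taking $\widehat{\sigma}^2$ from the largest model (the data here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 80
x = rng.uniform(-1, 1, n)
y = 1.0 + x - 2.0 * x ** 2 + rng.normal(0, 0.4, n)  # true model is quadratic

def rss_for_degree(d):
    """Return RSS and parameter count p for a degree-d polynomial fit."""
    X = np.column_stack([x ** k for k in range(d + 1)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum((y - X @ beta) ** 2), d + 1

max_p = 8
rss_full, p_full = rss_for_degree(max_p - 1)
sigma2_hat = rss_full / (n - p_full)   # sigma^2 estimated from the largest model

for d in range(1, max_p):
    rss, p = rss_for_degree(d)
    cp = rss / sigma2_hat - n + 2 * p  # Mallows's C_p
    print(d, round(cp, 2))
# Models whose C_p is close to p (here the quadratic) are preferred.
```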

See also

Akaike information criterion
Bias-variance tradeoff
Mean squared error
Errors and residuals in statistics
Law of total variance
Mallows's Cp
Model selection

References

1. Pindyck, Robert S.; Rubinfeld, Daniel L. (1991). "Forecasting with Time-Series Models". Econometric Models & Economic Forecasts (3rd ed.). New York: McGraw-Hill. pp. 516–535. ISBN 0-07-050098-3.
