Bernstein–von Mises theorem

In Bayesian inference, the Bernstein–von Mises theorem provides the basis for using Bayesian credible sets for confidence statements in parametric models. It states that, under some conditions, a posterior distribution converges in the limit of infinite data to a multivariate normal distribution centered at the maximum likelihood estimator with covariance matrix given by $n^{-1} I(\theta_0)^{-1}$, where $\theta_0$ is the true population parameter and $I(\theta_0)$ is the Fisher information matrix at the true population parameter value:[1]

$$P(\theta \mid x_1, \dots, x_n) \approx \mathcal{N}\left(\theta_0,\; n^{-1} I(\theta_0)^{-1}\right) \quad \text{for } n \to \infty.$$

The Bernstein–von Mises theorem links Bayesian inference with frequentist inference. It assumes there is some true probabilistic process that generates the observations, as in frequentism, and then studies the quality of Bayesian methods of recovering that process and of making uncertainty statements about it. In particular, it states that Bayesian credible sets of a certain credibility level $\alpha$ will asymptotically be confidence sets of confidence level $\alpha$, which allows for a frequentist interpretation of Bayesian credible sets.

Heuristic statement

In a model $(P_\theta : \theta \in \Theta)$, under certain regularity conditions (finite-dimensional, well-specified, smooth, existence of tests), if the prior distribution $\Pi$ on $\theta$ has a density with respect to the Lebesgue measure which is smooth enough (near $\theta_0$, bounded away from zero), the total variation distance between the rescaled posterior distribution (by centring and rescaling to $\sqrt{n}(\theta - \theta_0)$) and a Gaussian distribution centred on any efficient estimator and with the inverse Fisher information as variance will converge in probability to zero.

Bernstein–von Mises and maximum likelihood estimation

If the maximum likelihood estimator is an efficient estimator, it can be used as the centring point of the Gaussian distribution, which recovers a common, more specific version of the Bernstein–von Mises theorem.
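The convergence in total variation can be checked numerically in a simple case. The following is a minimal sketch, not a proof: it assumes a Bernoulli model with a uniform Beta(1, 1) prior (so the posterior after k successes in n trials is Beta(k + 1, n − k + 1)), fixes the observed success fraction at 0.3, and uses the plug-in Fisher information at the maximum likelihood estimator instead of at the unknown true parameter. Total variation distance is unchanged by the centring and rescaling, so it is computed on the original parameter scale. All function names are illustrative.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x; zero outside the open interval (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm)

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def tv_posterior_vs_normal(n, k, cells=20000):
    """Approximate TV distance between the Beta(k+1, n-k+1) posterior and
    N(mle, mle * (1 - mle) / n), using a midpoint rule on mle +/- 10 sd."""
    mle = k / n
    # For a Bernoulli model I(theta) = 1 / (theta * (1 - theta)); plug in the MLE.
    sd = math.sqrt(mle * (1 - mle) / n)
    lo, hi = mle - 10 * sd, mle + 10 * sd
    width = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        x = lo + (i + 0.5) * width
        total += abs(beta_pdf(x, k + 1, n - k + 1) - normal_pdf(x, mle, sd))
    return 0.5 * total * width

# Assumed data: exactly 30% successes at every sample size.
tv_by_n = {n: tv_posterior_vs_normal(n, round(0.3 * n)) for n in (10, 100, 1000, 10000)}
for n, tv in tv_by_n.items():
    print(f"n={n:>6}  TV distance ~ {tv:.4f}")
```

Under these assumptions the printed distances should shrink as n grows, which is the behaviour the theorem describes for this well-specified, finite-dimensional model.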

Implications

The most important implication of the Bernstein–von Mises theorem is that Bayesian inference is asymptotically correct from a frequentist point of view. This means that, for large amounts of data, one can use the posterior distribution to make statements about estimation and uncertainty that are also valid from a frequentist point of view.
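A small simulation can illustrate this frequentist validity. The sketch below assumes a conjugate setting chosen only for convenience: observations drawn from N(θ0, 1) with θ0 = 2, and a N(0, 1) prior on θ that is deliberately centred away from the truth. It estimates how often the central 95% credible interval contains θ0 over repeated datasets. The function name, parameter values and seed are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, fmean

def credible_interval_coverage(n, theta0=2.0, prior_sd=1.0, level=0.95,
                               reps=2000, seed=0):
    """Fraction of simulated datasets whose central credible interval
    contains theta0, for data N(theta0, 1) and prior N(0, prior_sd^2)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Conjugate normal update with known unit data variance and prior mean 0.
    post_precision = n + 1.0 / prior_sd ** 2
    post_sd = math.sqrt(1.0 / post_precision)
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(theta0, 1.0) for _ in range(n)])
        post_mean = n * xbar / post_precision
        if abs(post_mean - theta0) <= z * post_sd:
            hits += 1
    return hits / reps

coverage_small = credible_interval_coverage(n=5)
coverage_large = credible_interval_coverage(n=500)
print(f"n=5:   coverage ~ {coverage_small:.3f}")
print(f"n=500: coverage ~ {coverage_large:.3f}")
```

With few observations the prior pulls the posterior away from θ0 and the interval undercovers; with many observations the influence of the prior fades and the coverage should approach the nominal 95%.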

History

The theorem is named after Richard von Mises and S. N. Bernstein, although the first proper proof was given by Joseph L. Doob in 1949 for random variables with finite probability space.[2] Later Lucien Le Cam, his PhD student Lorraine Schwartz, David A. Freedman and Persi Diaconis extended the proof under more general assumptions.[citation needed]

Limitations

In case of a misspecified model, the posterior distribution will also become asymptotically Gaussian with a correct mean, but not necessarily with the inverse Fisher information as the variance. This implies that Bayesian credible sets of level $\alpha$ cannot be interpreted as confidence sets of level $\alpha$.[3]
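The effect of misspecification on coverage can be sketched with a deliberately simple example. It assumes the analyst fits the model N(θ, 1) with a flat (improper) prior, so the posterior is N(x̄, 1/n), while the data are actually generated with standard deviation 2. The posterior still centres on the correct mean, but its spread is too small. The function name, sample size and seed are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage_under_misspecification(true_sd, n=200, theta0=0.0, level=0.95,
                                    reps=2000, seed=1):
    """Coverage of the central credible interval from the assumed N(theta, 1)
    model with a flat prior, when the data really have sd = true_sd."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    post_sd = 1.0 / math.sqrt(n)  # posterior sd implied by the assumed unit variance
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(theta0, true_sd) for _ in range(n)])
        if abs(xbar - theta0) <= z * post_sd:
            hits += 1
    return hits / reps

coverage_well_specified = coverage_under_misspecification(true_sd=1.0)
coverage_misspecified = coverage_under_misspecification(true_sd=2.0)
print(f"well-specified: coverage ~ {coverage_well_specified:.3f}")
print(f"misspecified:   coverage ~ {coverage_misspecified:.3f}")
```

In this setup the sampling standard deviation of x̄ is twice the posterior standard deviation, so the nominal 95% interval covers θ0 with probability P(|Z| < 0.98), about 0.67 in theory, no matter how large n becomes.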

In the case of nonparametric statistics, the Bernstein–von Mises theorem usually fails to hold, with the notable exception of the Dirichlet process.

A remarkable result was found by Freedman in 1965: the Bernstein–von Mises theorem does not hold almost surely if the random variable has a countably infinite probability space; however, this depends on allowing a very broad range of possible priors. In practice, the priors typically used in research do have the desirable property even with a countably infinite probability space.

Different summary statistics such as the mode and mean may behave differently in the posterior distribution. In Freedman's examples, the posterior density and its mean can converge on the wrong result, but the posterior mode is consistent and will converge on the correct result.

Notes

  1. ^ van der Vaart, A.W. (1998). "10.2 Bernstein–von Mises Theorem". Asymptotic Statistics. Cambridge University Press. ISBN 0-521-78450-6.
  2. ^ Doob, Joseph L. (1949). "Application of the theory of martingales". Colloq. Intern. Du C.N.R.S (Paris). 13: 23–27.
  3. ^ Kleijn, B.J.K.; van der Vaart, A.W. (2012). "The Bernstein-Von-Mises theorem under misspecification". Electronic Journal of Statistics. 6: 354–381. doi:10.1214/12-EJS675. hdl:1887/61499.

References

  • Hartigan, J. A. (1983). "Asymptotic Normality of Posterior Distributions". Bayes Theory. New York: Springer. doi:10.1007/978-1-4613-8242-3_11.
  • van der Vaart, A. W. (1998). "Bernstein–von Mises Theorem". Asymptotic Statistics. Cambridge University Press. ISBN 0-521-49603-9.
