Reduced chi-squared statistic

In statistics, the reduced chi-squared statistic is used extensively in goodness of fit testing. It is also known as mean squared weighted deviation (MSWD) in isotopic dating[1] and variance of unit weight in the context of weighted least squares.[2][3]

Its square root is called regression standard error,[4] standard error of the regression,[5][6] or standard error of the equation[7] (see Ordinary least squares § Reduced chi-squared).

Definition

It is defined as chi-square per degree of freedom:[8][9][10][11]: 85 [12][13][14][15]

$$\chi_\nu^2 = \frac{\chi^2}{\nu},$$
where the chi-squared is a weighted sum of squared deviations:
$$\chi^2 = \sum_i \frac{(O_i - C_i)^2}{\sigma_i^2}$$
with inputs: variance $\sigma_i^2$, observations O, and calculated data C.[8] The degree of freedom, $\nu = n - m$, equals the number of observations n minus the number of fitted parameters m.
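The definition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `reduced_chi_squared`, its argument names, and the sample numbers are all assumptions made here, and the errors are taken to be uncorrelated.

```python
def reduced_chi_squared(observed, calculated, sigma, n_params):
    """Chi-squared per degree of freedom for uncorrelated errors.

    observed, calculated, sigma: equal-length sequences (O_i, C_i, sigma_i).
    n_params: number of fitted parameters m.
    """
    if not (len(observed) == len(calculated) == len(sigma)):
        raise ValueError("inputs must have equal length")
    nu = len(observed) - n_params  # degrees of freedom: nu = n - m
    if nu <= 0:
        raise ValueError("need more observations than fitted parameters")
    # Weighted sum of squared deviations: sum_i (O_i - C_i)^2 / sigma_i^2
    chi2 = sum(((o - c) / s) ** 2 for o, c, s in zip(observed, calculated, sigma))
    return chi2 / nu


# Made-up data: four observations and a model with two fitted parameters.
observed = [1.0, 2.0, 3.0, 4.0]
calculated = [1.5, 2.0, 2.5, 4.0]
sigma = [0.5, 0.5, 0.5, 0.5]
print(reduced_chi_squared(observed, calculated, sigma, n_params=2))  # 1.0
```

Here two of the four deviations are exactly one standard deviation and two are zero, so the chi-squared is 2 and, with two degrees of freedom, the reduced value is 1.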

In weighted least squares, the definition is often written in matrix notation as

$$\chi_\nu^2 = \frac{r^\mathrm{T} W r}{\nu},$$
where r is the vector of residuals, and W is the weight matrix, the inverse of the input (diagonal) covariance matrix of observations. If W is non-diagonal, then generalized least squares applies.
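As a rough sketch of the matrix form, the quadratic form can be written out as a double sum in plain Python. The helper name `reduced_chi_squared_matrix` and the numbers are illustrative assumptions; W is built here as the inverse of a diagonal covariance matrix, but the same function accepts a non-diagonal W.

```python
def reduced_chi_squared_matrix(r, W, n_params):
    """Return r^T W r / nu for a residual vector r and weight matrix W."""
    n = len(r)
    nu = n - n_params  # degrees of freedom
    if nu <= 0:
        raise ValueError("need more observations than fitted parameters")
    # Quadratic form r^T W r written out as a double sum over i and j.
    quad = sum(r[i] * W[i][j] * r[j] for i in range(n) for j in range(n))
    return quad / nu


# Made-up residuals and standard deviations; W = diag(1 / sigma_i^2).
sigma = [0.5, 1.0, 2.0]
r = [0.5, -1.0, 2.0]
W = [[1.0 / sigma[i] ** 2 if i == j else 0.0 for j in range(3)] for i in range(3)]

value = reduced_chi_squared_matrix(r, W, n_params=1)
print(value)  # 1.5
```

With a diagonal W this reduces to the weighted sum of squared deviations; each residual here equals its standard deviation, so the quadratic form is 3 and the result is 3/2.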

In ordinary least squares, the definition simplifies to:

$$\chi_\nu^2 = \frac{\mathrm{RSS}}{\nu},$$
$$\mathrm{RSS} = \sum r^2,$$
where the numerator is the residual sum of squares (RSS).

When the fit is just an ordinary mean, then $\chi_\nu^2$ equals the sample variance, i.e. the square of the sample standard deviation.
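A quick numerical check of the mean-only case, using Python's standard `statistics` module. The data values are made up for illustration. Fitting a mean uses one parameter, so the degrees of freedom are n − 1, and RSS/(n − 1) is compared against the sample variance, with its square root compared against the sample standard deviation.

```python
import math
import statistics

# Made-up observations; the "model" is just their ordinary mean (m = 1).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(data) / len(data)
rss = sum((x - mean) ** 2 for x in data)  # residual sum of squares
nu = len(data) - 1                        # n observations minus 1 fitted parameter
reduced = rss / nu

# RSS / nu matches the sample variance; its square root matches the
# sample standard deviation.
print(reduced, statistics.variance(data))
print(math.sqrt(reduced), statistics.stdev(data))
```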

Discussion

As a general rule, when the variance of the measurement error is known a priori, a $\chi_\nu^2 \gg 1$ indicates a poor model fit. A $\chi_\nu^2 > 1$ indicates that the fit has not fully captured the data (or that the error variance has been underestimated). In principle, a value of $\chi_\nu^2$ around $1$ indicates that the extent of the match between observations and estimates is in accord with the error variance. A $\chi_\nu^2 < 1$ indicates that the model is "over-fitting" the data: either the model is improperly fitting noise, or the error variance has been overestimated.[11]: 89
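The dependence on the assumed error variance can be illustrated with a small sketch. The residuals and the three candidate standard deviations are made up, and a single shared standard deviation is assumed for simplicity. Because the statistic scales as 1/σ², halving the assumed error quadruples it and doubling the assumed error divides it by four.

```python
def reduced_chi_squared(residuals, sigma, n_params):
    """Reduced chi-squared when every residual shares one standard deviation."""
    nu = len(residuals) - n_params
    return sum((r / sigma) ** 2 for r in residuals) / nu


# Made-up residuals from a one-parameter fit; their squares sum to 4, nu = 4.
residuals = [1.0, -1.0, 1.0, -1.0, 0.0]

results = {s: reduced_chi_squared(residuals, s, n_params=1) for s in (0.5, 1.0, 2.0)}
for s, value in results.items():
    print(f"assumed sigma = {s}: reduced chi-squared = {value}")
# sigma = 0.5 gives 4.0 (errors underestimated), 1.0 gives 1.0, 2.0 gives 0.25
```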

When the variance of the measurement error is only partially known, the reduced chi-squared may serve as a correction estimated a posteriori.
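One common way to apply such a correction, sketched here under the assumption that only the relative sizes of the errors are trusted, is to multiply every standard deviation by the square root of the reduced chi-squared. The rescaled errors then give a reduced chi-squared of 1 by construction. All names and numbers below are illustrative.

```python
import math


def reduced_chi_squared(residuals, sigmas, n_params):
    """Chi-squared per degree of freedom for uncorrelated errors."""
    nu = len(residuals) - n_params
    return sum((r / s) ** 2 for r, s in zip(residuals, sigmas)) / nu


# Made-up residuals and a priori standard deviations from a one-parameter fit.
residuals = [0.3, -0.4, 0.5, -0.2]
sigmas = [0.1, 0.2, 0.1, 0.2]

before = reduced_chi_squared(residuals, sigmas, n_params=1)

# A posteriori correction: inflate (or deflate) all errors by a common factor.
scale = math.sqrt(before)
rescaled_sigmas = [s * scale for s in sigmas]

after = reduced_chi_squared(residuals, rescaled_sigmas, n_params=1)
print(before, after)
```

This correction cannot distinguish a wrong model from wrong error bars; it simply assumes the model is adequate and attributes the whole discrepancy to the error scale.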

Applications

Geochronology

In geochronology, the MSWD is a measure of goodness of fit that takes into account the relative importance of both the internal and external reproducibility, with most common usage in isotopic dating.[16][17][1][18][19][20]

In general:

MSWD = 1 if the age data fit a univariate normal distribution in t (for the arithmetic mean age) or log(t) (for the geometric mean age) space, or if the compositional data fit a bivariate normal distribution in [log(U/He),log(Th/He)]-space (for the central age).

MSWD < 1 if the observed scatter is less than that predicted by the analytical uncertainties. In this case, the data are said to be "underdispersed", indicating that the analytical uncertainties were overestimated.

MSWD > 1 if the observed scatter exceeds that predicted by the analytical uncertainties. In this case, the data are said to be "overdispersed". This situation is the rule rather than the exception in (U-Th)/He geochronology, indicating an incomplete understanding of the isotope system. Several reasons have been proposed to explain the overdispersion of (U-Th)/He data, including uneven U-Th distributions and radiation damage.

Often the geochronologist will determine a series of age measurements on a single sample, with the measured value $x_i$ having a weighting $w_i$ and an associated error $\sigma_{x_i}$ for each age determination. As regards weighting, one can either weight all of the measured ages equally, or weight them by the proportion of the sample that they represent. For example, if two thirds of the sample was used for the first measurement and one third for the second and final measurement, then one might weight the first measurement twice that of the second.

The arithmetic mean of the age determinations is

$$\overline{x} = \frac{\sum_{i=1}^N x_i}{N},$$
but this value can be misleading, unless each determination of the age is of equal significance.

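When each measured value can be assumed to have the same weighting, or significance, the biased and unbiased (or "population" and "sample", respectively) estimators of the variance are computed as follows:

$$\sigma^2 = \frac{\sum_{i=1}^N (x_i - \overline{x})^2}{N} \quad\text{and}\quad s^2 = \frac{N}{N-1} \cdot \sigma^2 = \frac{1}{N-1} \cdot \sum_{i=1}^N (x_i - \overline{x})^2.$$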

The standard deviation is the square root of the variance.
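For equally weighted determinations these quantities can be checked against Python's standard `statistics` module. The ages below are made-up numbers, chosen so that the arithmetic is easy to follow.

```python
import statistics

# Made-up age determinations (arbitrary units), all with equal weight.
ages = [100.0, 102.0, 98.0, 104.0, 96.0]
n = len(ages)

mean = sum(ages) / n
sum_sq_dev = sum((x - mean) ** 2 for x in ages)

biased_var = sum_sq_dev / n          # divides by N
unbiased_var = sum_sq_dev / (n - 1)  # divides by N - 1

print(mean, biased_var, unbiased_var)  # 100.0 8.0 10.0
# statistics.pvariance divides by N, statistics.variance by N - 1.
print(statistics.pvariance(ages), statistics.variance(ages))
```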

When individual determinations of an age are not of equal significance, it is better to use a weighted mean to obtain an "average" age, as follows:

$$\overline{x}^* = \frac{\sum_{i=1}^N w_i x_i}{\sum_{i=1}^N w_i}.$$

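The biased weighted estimator of variance, taken about the weighted mean $\overline{x}^* = \sum_i w_i x_i / \sum_i w_i$, can be shown to be

$$\sigma^2 = \frac{\sum_{i=1}^N w_i (x_i - \overline{x}^*)^2}{\sum_{i=1}^N w_i},$$
which can be computed as
$$\sigma^2 = \frac{\sum_{i=1}^N w_i x_i^2 \cdot \sum_{i=1}^N w_i - \big(\sum_{i=1}^N w_i x_i\big)^2}{\big(\sum_{i=1}^N w_i\big)^2}.$$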

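The unbiased weighted estimator of the sample variance, again taken about the weighted mean $\overline{x}^*$, can be computed as follows:

$$s^2 = \frac{\sum_{i=1}^N w_i}{\big(\sum_{i=1}^N w_i\big)^2 - \sum_{i=1}^N w_i^2} \cdot \sum_{i=1}^N w_i (x_i - \overline{x}^*)^2.$$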
Again, the corresponding standard deviation is the square root of the variance.
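The weighted mean and the two weighted variance estimators can be sketched as follows. The function names and the three made-up ages are assumptions for illustration; the weights 2, 1, 1 mimic weighting by the proportion of sample used.

```python
def weighted_mean(x, w):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def weighted_variance_biased(x, w):
    """Biased estimator: sum(w_i * (x_i - mean)^2) / sum(w_i)."""
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def weighted_variance_unbiased(x, w):
    """Unbiased estimator with the factor sum(w) / (sum(w)^2 - sum(w^2))."""
    m = weighted_mean(x, w)
    sum_w = sum(w)
    sum_w2 = sum(wi * wi for wi in w)
    weighted_ss = sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w))
    return sum_w / (sum_w ** 2 - sum_w2) * weighted_ss


# Made-up ages and weights.
x = [100.0, 104.0, 108.0]
w = [2.0, 1.0, 1.0]

print(weighted_mean(x, w))               # 103.0
print(weighted_variance_biased(x, w))    # 11.0
print(weighted_variance_unbiased(x, w))  # approximately 17.6
```

With equal weights the correction factor reduces to 1/(N − 1), so the unbiased weighted estimator falls back to the ordinary sample variance.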

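The unbiased weighted estimator of the sample variance can also be computed on the fly as follows:

$$s^2 = \frac{\sum_{i=1}^N w_i x_i^2 \cdot \sum_{i=1}^N w_i - \big(\sum_{i=1}^N w_i x_i\big)^2}{\big(\sum_{i=1}^N w_i\big)^2 - \sum_{i=1}^N w_i^2}.$$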
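Computing "on the fly" means keeping four running sums and never storing the individual measurements. A minimal sketch, with a hypothetical class name and made-up inputs:

```python
class RunningWeightedVariance:
    """Accumulates sum(w), sum(w^2), sum(w*x) and sum(w*x^2) one value at a time."""

    def __init__(self):
        self.sum_w = 0.0
        self.sum_w2 = 0.0
        self.sum_wx = 0.0
        self.sum_wx2 = 0.0

    def add(self, x, w=1.0):
        """Fold one measurement x with weight w into the running sums."""
        self.sum_w += w
        self.sum_w2 += w * w
        self.sum_wx += w * x
        self.sum_wx2 += w * x * x

    def mean(self):
        return self.sum_wx / self.sum_w

    def _numerator(self):
        # sum(w*x^2) * sum(w) - (sum(w*x))^2, shared by both estimators
        return self.sum_wx2 * self.sum_w - self.sum_wx ** 2

    def biased_variance(self):
        return self._numerator() / self.sum_w ** 2

    def unbiased_variance(self):
        return self._numerator() / (self.sum_w ** 2 - self.sum_w2)


# Made-up ages with weights 2, 1, 1, fed in one at a time.
acc = RunningWeightedVariance()
for age, weight in [(100.0, 2.0), (104.0, 1.0), (108.0, 1.0)]:
    acc.add(age, weight)

print(acc.mean(), acc.biased_variance(), acc.unbiased_variance())
```

The numerator subtracts two large, nearly equal numbers when the scatter is small relative to the values themselves, so in floating point this form can lose precision; subtracting a rough estimate of the mean from every value before accumulating is one way to reduce that effect.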

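The unweighted mean square of the weighted deviations (unweighted MSWD), taken about the arithmetic mean $\overline{x}$, can then be computed, as follows:

$$\text{MSWD}_u = \frac{1}{N-1} \cdot \sum_{i=1}^N \frac{(x_i - \overline{x})^2}{\sigma_{x_i}^2}.$$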

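By analogy, the weighted mean square of the weighted deviations (weighted MSWD), taken about the weighted mean $\overline{x}^* = \sum_i w_i x_i / \sum_i w_i$, can be computed as follows:

$$\text{MSWD}_w = \frac{\sum_{i=1}^N w_i}{\big(\sum_{i=1}^N w_i\big)^2 - \sum_{i=1}^N w_i^2} \cdot \sum_{i=1}^N \frac{w_i (x_i - \overline{x}^*)^2}{\sigma_{x_i}^2}.$$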
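Both MSWD variants can be sketched directly from these formulas. The function names, ages, weights and analytical errors below are made up for illustration; the unweighted form uses deviations from the arithmetic mean and the weighted form uses deviations from the weighted mean.

```python
def mswd_unweighted(x, sigma):
    """Unweighted MSWD: sum((x_i - mean)^2 / sigma_i^2) / (N - 1)."""
    n = len(x)
    mean = sum(x) / n
    return sum(((xi - mean) / si) ** 2 for xi, si in zip(x, sigma)) / (n - 1)


def mswd_weighted(x, w, sigma):
    """Weighted MSWD with the factor sum(w) / (sum(w)^2 - sum(w^2))."""
    sum_w = sum(w)
    sum_w2 = sum(wi * wi for wi in w)
    wmean = sum(wi * xi for xi, wi in zip(x, w)) / sum_w
    total = sum(wi * ((xi - wmean) / si) ** 2 for xi, wi, si in zip(x, w, sigma))
    return sum_w / (sum_w ** 2 - sum_w2) * total


# Made-up ages, weights and analytical errors (same units as the ages).
x = [100.0, 104.0, 108.0]
w = [2.0, 1.0, 1.0]
sigma = [2.0, 2.0, 2.0]

print(mswd_unweighted(x, sigma))   # 4.0
print(mswd_weighted(x, w, sigma))  # approximately 4.4
```

Both values are well above 1, so under the interpretation given earlier this made-up data set would count as overdispersed relative to the stated analytical errors.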

Rasch analysis

In data analysis based on the Rasch model, the reduced chi-squared statistic is called the outfit mean-square statistic, and the information-weighted reduced chi-squared statistic is called the infit mean-square statistic.[21]

References

  1. Wendt, I., and Carl, C., 1991, The statistical distribution of the mean squared weighted deviation, Chemical Geology, 275–285.
  2. Strang, Gilbert; Borre, Kae (1997). Linear algebra, geodesy, and GPS. Wellesley-Cambridge Press. p. 301. ISBN 9780961408862.
  3. Koch, Karl-Rudolf (2013). Parameter Estimation and Hypothesis Testing in Linear Models. Springer Berlin Heidelberg. Section 3.2.5. ISBN 9783662039762.
  4. Julian Faraway (2000), Practical Regression and Anova using R
  5. Kenney, J.; Keeping, E. S. (1963). Mathematics of Statistics. van Nostrand. p. 187.
  6. Zwillinger, D. (1995). Standard Mathematical Tables and Formulae. Chapman & Hall/CRC. p. 626. ISBN 0-8493-2479-3.
  7. Hayashi, Fumio (2000). Econometrics. Princeton University Press. ISBN 0-691-01018-8.
  8. Laub, Charlie; Kuhl, Tonya L. (n.d.), How Bad is Good? A Critical Look at the Fitting of Reflectivity Models using the Reduced Chi-Square Statistic (PDF), University California, Davis, archived from the original (PDF) on 6 October 2016, retrieved 30 May 2015
  9. Taylor, John Robert (1997), An introduction to error analysis, University Science Books, p. 268
  10. Kirkman, T. W. (n.d.), Chi-Square Curve Fitting, retrieved 30 May 2015
  11. Bevington, Philip R. (1969), Data Reduction and Error Analysis for the Physical Sciences, New York: McGraw-Hill
  12. Measurements and Their Uncertainties: A Practical Guide to Modern Error Analysis, By Ifan Hughes, Thomas Hase
  13. Dealing with Uncertainties: A Guide to Error Analysis, By Manfred Drosg
  14. Practical Statistics for Astronomers, By J. V. Wall, C. R. Jenkins
  15. Computational Methods in Physics and Engineering, By Samuel Shaw Ming Wong
  16. Dickin, A. P. 1995. Radiogenic Isotope Geology. Cambridge University Press, Cambridge, UK, ISBN 0-521-43151-4, ISBN 0-521-59891-5
  17. McDougall, I. and Harrison, T. M. 1988. Geochronology and Thermochronology by the 40Ar/39Ar Method. Oxford University Press.
  18. Lance P. Black, Sandra L. Kamo, Charlotte M. Allen, John N. Aleinikoff, Donald W. Davis, Russell J. Korsch, Chris Foudoulis 2003. TEMORA 1: a new zircon standard for Phanerozoic U–Pb geochronology. Chemical Geology 200, 155–170.
  19. M. J. Streule, R. J. Phillips, M. P. Searle, D. J. Waters and M. S. A. Horstwood 2009. Evolution and chronology of the Pangong Metamorphic Complex adjacent to the Karakoram Fault, Ladakh: constraints from thermobarometry, metamorphic modelling and U-Pb geochronology. Journal of the Geological Society 166, 919–932 doi:10.1144/0016-76492008-117
  20. Roger Powell, Janet Hergt, Jon Woodhead 2002. Improving isochron calculations with robust statistics and the bootstrap. Chemical Geology 185, 191–204.
  21. Linacre, J. M. (2002). "What do Infit and Outfit, Mean-square and Standardized mean?". Rasch Measurement Transactions. 16 (2): 878.
