
68–95–99.7 rule

In statistics, the 68–95–99.7 rule, also known as the empirical rule, is a shorthand used to remember the percentage of values that lie within an interval estimate in a normal distribution: 68%, 95%, and 99.7% of the values lie within one, two, and three standard deviations of the mean, respectively.

For an approximately normal data set, the values within one standard deviation of the mean account for about 68% of the set, the values within two standard deviations for about 95%, and the values within three standard deviations for about 99.7%. The percentages shown are rounded theoretical probabilities intended only to approximate the empirical data derived from a normal population.
Prediction interval (on the y-axis) given from the standard score (on the x-axis). The y-axis is logarithmically scaled (but the values on it are not modified).

In mathematical notation, these facts can be expressed as follows, where Pr() is the probability function,[1] X is an observation from a normally distributed random variable, μ (mu) is the mean of the distribution, and σ (sigma) is its standard deviation:

Pr(μ − 1σ ≤ X ≤ μ + 1σ) ≈ 68.27%
Pr(μ − 2σ ≤ X ≤ μ + 2σ) ≈ 95.45%
Pr(μ − 3σ ≤ X ≤ μ + 3σ) ≈ 99.73%

The usefulness of this heuristic especially depends on the question under consideration.

In the empirical sciences, the so-called three-sigma rule of thumb (or 3σ rule) expresses a conventional heuristic that nearly all values are taken to lie within three standard deviations of the mean, and thus it is empirically useful to treat 99.7% probability as near certainty.[2]

In the social sciences, a result may be considered "significant" if its confidence level is of the order of a two-sigma effect (95%), while in particle physics, there is a convention of a five-sigma effect (99.99994% confidence) being required to qualify as a discovery.

A weaker three-sigma rule can be derived from Chebyshev's inequality, stating that even for non-normally distributed variables, at least 88.8% of cases should fall within properly calculated three-sigma intervals. For unimodal distributions, the probability of being within the interval is at least 95% by the Vysochanskij–Petunin inequality. There may be certain assumptions for a distribution that force this probability to be at least 98%.[3]
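The three bounds just mentioned can be compared directly: the exact normal probability, the Chebyshev lower bound (any finite-variance distribution), and the Vysochanskij–Petunin lower bound (unimodal distributions). A minimal sketch using only Python's standard library:

```python
from math import erf, sqrt

def normal_within(k):
    """Exact probability that a normal variate lies within k standard deviations."""
    return erf(k / sqrt(2))

def chebyshev_lower_bound(k):
    """Chebyshev's inequality: holds for any distribution with finite variance."""
    return 1 - 1 / k**2

def vysochanskij_petunin_lower_bound(k):
    """Vysochanskij-Petunin inequality: holds for unimodal distributions
    (valid for k > sqrt(8/3) ~ 1.63)."""
    return 1 - 4 / (9 * k**2)

for k in (2, 3):
    print(f"k={k}: normal={normal_within(k):.4f}, "
          f"Chebyshev>={chebyshev_lower_bound(k):.4f}, "
          f"VP>={vysochanskij_petunin_lower_bound(k):.4f}")
```

At k = 3 this reproduces the figures above: the Chebyshev bound gives at least 88.89%, the Vysochanskij–Petunin bound at least 95.06%, versus the exact 99.73% for a normal distribution.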

Proof

We have that

Pr(μ − nσ ≤ X ≤ μ + nσ) = ∫ from μ−nσ to μ+nσ of (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²) dx.

Doing the change of variable u = (x − μ)/σ, we have

Pr(μ − nσ ≤ X ≤ μ + nσ) = (1/√(2π)) ∫ from −n to n of e^(−u²/2) du,

and this integral is independent of μ and σ. We only need to calculate each integral for the cases n = 1, 2, 3:

Pr(μ − 1σ ≤ X ≤ μ + 1σ) = (1/√(2π)) ∫ from −1 to 1 of e^(−u²/2) du ≈ 0.6827
Pr(μ − 2σ ≤ X ≤ μ + 2σ) = (1/√(2π)) ∫ from −2 to 2 of e^(−u²/2) du ≈ 0.9545
Pr(μ − 3σ ≤ X ≤ μ + 3σ) = (1/√(2π)) ∫ from −3 to 3 of e^(−u²/2) du ≈ 0.9973
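The integral in the proof can be checked numerically. A minimal sketch using a midpoint-rule quadrature from Python's standard library, compared against the closed form erf(n/√2):

```python
from math import exp, pi, sqrt, erf

def std_normal_mass(n, steps=100_000):
    """Midpoint-rule approximation of (1/sqrt(2*pi)) * integral of exp(-u^2/2)
    over [-n, n]."""
    h = 2 * n / steps
    total = sum(exp(-(-n + (i + 0.5) * h) ** 2 / 2) for i in range(steps))
    return total * h / sqrt(2 * pi)

for n in (1, 2, 3):
    # The quadrature and the error-function closed form agree to many digits.
    print(n, round(std_normal_mass(n), 4), round(erf(n / sqrt(2)), 4))
```

For n = 1, 2, 3 both columns reproduce 0.6827, 0.9545, and 0.9973.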

Cumulative distribution function

[Diagram showing the cumulative distribution function for the normal distribution with mean (μ) 0 and variance (σ²) 1]

These numerical values "68%, 95%, 99.7%" come from the cumulative distribution function of the normal distribution.

The prediction interval for any standard score z corresponds numerically to (1 − (1 − Φμ,σ²(z)) · 2).

For example, Φ(2) ≈ 0.9772, or Pr(X ≤ μ + 2σ) ≈ 0.9772, corresponding to a prediction interval of (1 − (1 − 0.97725)·2) = 0.9545 = 95.45%. This is not a symmetrical interval – this is merely the probability that an observation is less than μ + 2σ. To compute the probability that an observation is within two standard deviations of the mean (small differences due to rounding):

Pr(μ − 2σ ≤ X ≤ μ + 2σ) = Φ(2) − Φ(−2) ≈ 0.9772 − (1 − 0.9772) ≈ 0.9545

This is related to confidence interval as used in statistics: X̄ ± 2σ/√n is approximately a 95% confidence interval when X̄ is the average of a sample of size n.
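These CDF and interval calculations can be sketched with Python's standard library, using the identity Φ(z) = (1 + erf(z/√2))/2; the sample values and the assumed known σ in the confidence-interval part are purely illustrative:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return (1 + erf(z / sqrt(2))) / 2

# One-sided probability Pr(X <= mu + 2*sigma)
print(round(phi(2), 4))            # ~0.9772

# Symmetric two-sigma interval: Phi(2) - Phi(-2)
print(round(phi(2) - phi(-2), 4))  # ~0.9545

# Approximate 95% confidence interval for a sample mean,
# assuming a known population standard deviation (hypothetical data)
sample = [9.8, 10.1, 10.0, 9.9, 10.3, 10.2, 9.7, 10.0]
n = len(sample)
mean = sum(sample) / n
sigma = 0.2                        # assumed known
half_width = 2 * sigma / sqrt(n)
print(f"95% CI: ({mean - half_width:.3f}, {mean + half_width:.3f})")
```

The first two printed values show why the one-sided 97.72% figure and the two-sided 95.45% figure differ.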

Normality tests

The "68–95–99.7 rule" is often used to quickly get a rough probability estimate of something, given its standard deviation, if the population is assumed to be normal. It is also used as a simple test for outliers if the population is assumed normal, and as a normality test if the population is potentially not normal.

To pass from a sample to a number of standard deviations, one first computes the deviation, either the error or residual depending on whether one knows the population mean or only estimates it. The next step is standardizing (dividing by the population standard deviation), if the population parameters are known, or studentizing (dividing by an estimate of the standard deviation), if the parameters are unknown and only estimated.

To use as a test for outliers or a normality test, one computes the size of deviations in terms of standard deviations, and compares this to expected frequency. Given a sample set, one can compute the studentized residuals and compare these to the expected frequency: points that fall more than 3 standard deviations from the norm are likely outliers (unless the sample size is significantly large, by which point one expects a sample this extreme), and if there are many points more than 3 standard deviations from the norm, one likely has reason to question the assumed normality of the distribution. This holds ever more strongly for moves of 4 or more standard deviations.

One can compute more precisely, approximating the number of extreme moves of a given magnitude or greater by a Poisson distribution, but simply, if one has multiple 4 standard deviation moves in a sample of size 1,000, one has strong reason to consider these outliers or question the assumed normality of the distribution.
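The Poisson approximation above can be made concrete: the count of moves beyond a given threshold in n independent normal observations is approximately Poisson with mean n times the two-sided tail probability. A minimal sketch:

```python
from math import erf, exp, sqrt

def tail_prob(k):
    """Two-sided probability of a move beyond k standard deviations."""
    return 1 - erf(k / sqrt(2))

n = 1000                   # sample size
lam = n * tail_prob(4)     # Poisson mean: expected number of |z| > 4 moves

# Poisson probability of observing at least two such moves
p_at_least_two = 1 - exp(-lam) * (1 + lam)

print(f"expected 4-sigma moves in n={n}: {lam:.4f}")
print(f"P(>=2 such moves): {p_at_least_two:.4%}")
```

Under normality one expects only about 0.06 such moves in 1,000 points, and the chance of seeing two or more is around 0.2%, which is why multiple 4σ moves in a sample of this size are strong evidence against the assumed normality.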

For example, a 6σ event corresponds to a chance of about two parts per billion. For illustration, if events are taken to occur daily, this would correspond to an event expected every 1.4 million years. This gives a simple normality test: if one witnesses a 6σ event in daily data and significantly fewer than 1 million years have passed, then a normal distribution most likely does not provide a good model for the magnitude or frequency of large deviations in this respect.
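A minimal check of the 6σ arithmetic above, using only the standard library:

```python
from math import erf, sqrt

p = 1 - erf(6 / sqrt(2))   # two-sided 6-sigma tail probability
days = 1 / p               # expected waiting time, in days, for a daily event

print(f"p = {p:.3e}")                         # about two parts per billion
print(f"once every {days / 365.25:.2f} years")
```

This recovers both figures quoted in the text: a tail probability of roughly 2 × 10⁻⁹ and a recurrence of about 1.4 million years for a daily event.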

In The Black Swan, Nassim Nicholas Taleb gives the example of risk models according to which the Black Monday crash would correspond to a 36-σ event: the occurrence of such an event should instantly suggest that the model is flawed, i.e. that the process under consideration is not satisfactorily modeled by a normal distribution. Refined models should then be considered, e.g. by the introduction of stochastic volatility. In such discussions it is important to be aware of the problem of the gambler's fallacy, which states that a single observation of a rare event does not contradict that the event is in fact rare. It is the observation of a plurality of purportedly rare events that increasingly undermines the hypothesis that they are rare, i.e. the validity of the assumed model. A proper modelling of this process of gradual loss of confidence in a hypothesis would involve the designation of prior probability not just to the hypothesis itself but to all possible alternative hypotheses. For this reason, statistical hypothesis testing works not so much by confirming a hypothesis considered to be likely, but by refuting hypotheses considered unlikely.

Table of numerical values

Because of the exponentially decreasing tails of the normal distribution, odds of higher deviations decrease very quickly. From the rules for normally distributed data for a daily event:

Range | Expected fraction of population inside range | Expected fraction of population outside range | Approx. expected frequency outside range | Approx. frequency outside range for a daily event
μ ± 0.5σ | 0.382924922548026 | 6.171E-01 = 61.71% | 3 in 5 | Four or five times a week
μ ± σ | 0.682689492137086[4] | 3.173E-01 = 31.73% | 1 in 3 | Twice or thrice a week
μ ± 1.5σ | 0.866385597462284 | 1.336E-01 = 13.36% | 2 in 15 | Weekly
μ ± 2σ | 0.954499736103642[5] | 4.550E-02 = 4.550% | 1 in 22 | Every three weeks
μ ± 2.5σ | 0.987580669348448 | 1.242E-02 = 1.242% | 1 in 81 | Quarterly
μ ± 3σ | 0.997300203936740[6] | 2.700E-03 = 0.270% = 2.700‰ | 1 in 370 | Yearly
μ ± 3.5σ | 0.999534741841929 | 4.653E-04 = 0.04653% = 465.3 ppm | 1 in 2149 | Every 6 years
μ ± 4σ | 0.999936657516334 | 6.334E-05 = 63.34 ppm | 1 in 15787 | Every 43 years (twice in a lifetime)
μ ± 4.5σ | 0.999993204653751 | 6.795E-06 = 6.795 ppm | 1 in 147160 | Every 403 years (once in the modern era)
μ ± 5σ | 0.999999426696856 | 5.733E-07 = 0.5733 ppm = 573.3 ppb | 1 in 1744278 | Every 4776 years (once in recorded history)
μ ± 5.5σ | 0.999999962020875 | 3.798E-08 = 37.98 ppb | 1 in 26330254 | Every 72090 years (thrice in history of modern humankind)
μ ± 6σ | 0.999999998026825 | 1.973E-09 = 1.973 ppb | 1 in 506797346 | Every 1.38 million years (twice in history of humankind)
μ ± 6.5σ | 0.999999999919680 | 8.032E-11 = 0.08032 ppb = 80.32 ppt | 1 in 12450197393 | Every 34 million years (twice since the extinction of the dinosaurs)
μ ± 7σ | 0.999999999997440 | 2.560E-12 = 2.560 ppt | 1 in 390682215445 | Every 1.07 billion years (four occurrences in history of Earth)
μ ± 7.5σ | 0.999999999999936 | 6.382E-14 = 63.82 ppq | 1 in 15669601204101 | Once every 43 billion years (never in the history of the Universe, twice in the future of the Local Group before its merger)
μ ± 8σ | 0.999999999999999 | 1.244E-15 = 1.244 ppq | 1 in 803734397655348 | Once every 2.2 trillion years (never in the history of the Universe, once during the life of a red dwarf)
μ ± xσ | erf(x/√2) | 1 − erf(x/√2) | 1 in 1/(1 − erf(x/√2)) | Every 1/(1 − erf(x/√2)) days
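The closed forms in the last row of the table can regenerate every other row. A minimal sketch for the first several thresholds:

```python
from math import erf, sqrt

def inside(x):
    """Expected fraction of the population within mu +/- x*sigma."""
    return erf(x / sqrt(2))

for x in (0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4):
    p_out = 1 - inside(x)
    print(f"mu +/- {x}sigma: inside={inside(x):.15f}, "
          f"outside = 1 in {1 / p_out:,.0f}, every {1 / p_out:,.0f} days")
```

Running this reproduces the tabulated fractions (e.g. 0.682689… at 1σ, 0.954499… at 2σ) and the "1 in N" frequencies (e.g. 1 in 370 at 3σ).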

See also

  • p-value
  • Six Sigma § Sigma levels
  • Standard score
  • t-statistic

References

  1. ^ Huber, Franz (2018). A Logical Introduction to Probability and Induction. New York: Oxford University Press. p. 80. ISBN 9780190845414.
  2. ^ This usage of "three-sigma rule" entered common usage in the 2000s, e.g. cited in
    • Schaum's Outline of Business Statistics. McGraw Hill Professional. 2003. p. 359. ISBN 9780071398763
    • Grafarend, Erik W. (2006). Linear and Nonlinear Models: Fixed Effects, Random Effects, and Mixed Models. Walter de Gruyter. p. 553. ISBN 9783110162165.
  3. ^ See:
    • Wheeler, D. J.; Chambers, D. S. (1992). Understanding Statistical Process Control. SPC Press. ISBN 9780945320135.
    • Czitrom, Veronica; Spagon, Patrick D. (1997). Statistical Case Studies for Industrial Process Improvement. SIAM. p. 342. ISBN 9780898713947.
    • Pukelsheim, F. (1994). "The Three Sigma Rule". American Statistician. 48 (2): 88–91. doi:10.2307/2684253. JSTOR 2684253.
  4. ^ Sloane, N. J. A. (ed.). "Sequence A178647". The On-Line Encyclopedia of Integer Sequences. OEIS Foundation.
  5. ^ Sloane, N. J. A. (ed.). "Sequence A110894". The On-Line Encyclopedia of Integer Sequences. OEIS Foundation.
  6. ^ Sloane, N. J. A. (ed.). "Sequence A270712". The On-Line Encyclopedia of Integer Sequences. OEIS Foundation.

External links

  • "Calculate percentage proportion within x sigmas" at WolframAlpha
