Wikipedia

Binomial proportion confidence interval

In statistics, a binomial proportion confidence interval is a confidence interval for the probability of success calculated from the outcome of a series of success–failure experiments (Bernoulli trials). In other words, a binomial proportion confidence interval is an interval estimate of a success probability $p$ when only the number of experiments $n$ and the number of successes $n_S$ are known.

There are several formulas for a binomial confidence interval, but all of them rely on the assumption of a binomial distribution. In general, a binomial distribution applies when an experiment is repeated a fixed number of times, each trial of the experiment has two possible outcomes (success and failure), the probability of success is the same for each trial, and the trials are statistically independent. Because the binomial distribution is a discrete probability distribution (i.e., not continuous) and difficult to calculate for large numbers of trials, a variety of approximations are used to calculate this confidence interval, all with their own tradeoffs in accuracy and computational intensity.

A simple example of a binomial distribution is the set of various possible outcomes, and their probabilities, for the number of heads observed when a coin is flipped ten times. The observed binomial proportion is the fraction of the flips that turn out to be heads. Given this observed proportion, the confidence interval for the true probability of the coin landing on heads is a range of possible proportions, which may or may not contain the true proportion. A 95% confidence interval for the proportion, for instance, will contain the true proportion 95% of the times that the procedure for constructing the confidence interval is employed.[1]

Normal approximation interval or Wald interval

 
Plotting the normal approximation interval on an arbitrary logistic curve reveals problems of overshoot and zero-width intervals.[2]

A commonly used formula for a binomial confidence interval relies on approximating the distribution of error about a binomially-distributed observation, $\hat{p}$, with a normal distribution.[3] This approximation is based on the central limit theorem and is unreliable when the sample size is small or the success probability is close to 0 or 1.[4]

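Using the normal approximation, the success probability p is estimated as

$$\hat{p} \pm z \sqrt{\frac{\hat{p}\left(1 - \hat{p}\right)}{n}}$$

or the equivalent

$$\frac{n_S}{n} \pm \frac{z}{n\sqrt{n}} \sqrt{n_S n_F}$$

where $\hat{p} = n_S / n$ is the proportion of successes in a Bernoulli trial process, measured with $n$ trials yielding $n_S$ successes and $n_F = n - n_S$ failures, and $z$ is the $1 - \tfrac{\alpha}{2}$ quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate $\alpha$. For a 95% confidence level, the error $\alpha = 1 - 0.95 = 0.05$, so $1 - \tfrac{\alpha}{2} = 0.975$ and $z = 1.96$. From this one finds two problems. First, for $\hat{p}$ approaching unity (or zero), the interval narrows to zero width (implying certainty). Second, for values of $\hat{p} < \tfrac{z^2}{z^2 + n}$ (or, symmetrically, when $1 - \hat{p}$ falls below the same threshold), the interval boundaries exceed $[0, 1]$ (overshoot).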
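The formula and both failure modes can be illustrated with a minimal sketch that uses only the Python standard library. The function name `wald_interval` and the example counts are illustrative choices, not part of any particular package.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(n_s, n, alpha=0.05):
    """Normal approximation (Wald) interval for n_s successes in n trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # 1 - alpha/2 quantile (probit)
    p_hat = n_s / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


print(wald_interval(5, 10))   # an ordinary case
print(wald_interval(10, 10))  # p_hat = 1: the interval collapses to zero width
print(wald_interval(1, 10))   # p_hat below z^2 / (z^2 + n): lower bound is negative
```

With 10 trials the overshoot threshold $z^2/(z^2+n)$ is roughly 0.28, so a single success already produces a negative lower bound.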

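An important theoretical derivation of this confidence interval involves the inversion of a hypothesis test. Under this formulation, the confidence interval represents those values of the population parameter that would have large p-values if they were tested as a hypothesized population proportion. The collection of values, $\theta$, for which the normal approximation is valid can be represented as

$$\left\{ \theta \;\bigg|\; y \leq \frac{\hat{p} - \theta}{\sqrt{\frac{1}{n} \hat{p} \left(1 - \hat{p}\right)}} \leq z_{\alpha/2} \right\}$$

where $y$ is the $\tfrac{\alpha}{2}$ quantile of a standard normal distribution and $z_{\alpha/2} = -y$ is the corresponding upper critical value (the $z$ used above).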

Since the test in the middle of the inequality is a Wald test, the normal approximation interval is sometimes called the Wald interval or Wald method, after Abraham Wald, but it was first described by Pierre-Simon Laplace in 1812.[5]

Bracketing the confidence interval

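Extending the normal approximation and Wald-Laplace interval concepts, Michael Short has shown that inequalities on the approximation error between the binomial distribution and the normal distribution can be used to accurately bracket the estimate of the confidence interval around $\hat{p}$:[6]

$$\frac{k + C_{L1}}{n + z^2} - z \sqrt{\frac{nk - k^2 + C_{L2} n - C_{L3} k + C_{L4}}{n \left(n + z^2\right)^2}} \leq \hat{p} \leq \frac{k + C_{U1}}{n + z^2} + z \sqrt{\frac{nk - k^2 + C_{U2} n - C_{U3} k + C_{U4}}{n \left(n + z^2\right)^2}}$$

where $\hat{p}$ is again the (unknown) proportion of successes in a Bernoulli trial process, measured with $n$ trials yielding $k$ successes, $z$ is the $1 - \tfrac{\alpha}{2}$ quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate $\alpha$, and the constants $C_{L1}, C_{L2}, C_{L3}, C_{L4}, C_{U1}, C_{U2}, C_{U3}$ and $C_{U4}$ are simple algebraic functions of $z$.[6] For a fixed $\alpha$ (and hence $z$), the above inequalities give easily computed one- or two-sided intervals which bracket the exact binomial upper and lower confidence limits corresponding to the error rate $\alpha$.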

Standard error of a proportion estimation when using weighted data

Let there be a simple random sample $X_1, \ldots, X_n$ where each $X_i$ is i.i.d. from a Bernoulli(p) distribution and $w_i$ is the weight for each observation. Standardize the (positive) weights $w_i$ so they sum to 1. The weighted sample proportion is: $\hat{p} = \sum_{i=1}^{n} w_i X_i$. Since the $X_i$ are independent and each one has variance $\text{Var}(X_i) = p(1-p)$, the sampling variance of the proportion therefore is:[7]

$$\text{Var}(\hat{p}) = \sum_{i=1}^{n} \text{Var}(w_i X_i) = p(1-p) \sum_{i=1}^{n} w_i^2.$$

The standard error of $\hat{p}$ is the square root of this quantity. Because we do not know $p(1-p)$, we have to estimate it. Although there are many possible estimators, a conventional one is to use $\hat{p}$, the sample mean, and plug this into the formula. That gives:

$$\text{SE}(\hat{p}) \approx \sqrt{\hat{p}(1-\hat{p}) \sum_{i=1}^{n} w_i^2}$$

For unweighted data, $w_i = 1/n$, giving $\sum_{i=1}^{n} w_i^2 = 1/n$. The SE becomes $\sqrt{\hat{p}(1-\hat{p})/n}$, leading to the familiar formulas, showing that the calculation for weighted data is a direct generalization of them.
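The plug-in calculation can be sketched in a few lines of Python. The function name `weighted_proportion_se` and the sample data are hypothetical and serve only to show that equal weights recover the unweighted standard error.

```python
from math import sqrt


def weighted_proportion_se(x, w):
    """Weighted sample proportion and its plug-in standard error.

    x: 0/1 outcomes; w: positive weights (standardized here to sum to 1).
    """
    total = sum(w)
    w = [wi / total for wi in w]
    p_hat = sum(wi * xi for wi, xi in zip(w, x))
    se = sqrt(p_hat * (1 - p_hat) * sum(wi ** 2 for wi in w))
    return p_hat, se


x = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical outcomes
p_eq, se_eq = weighted_proportion_se(x, [1] * len(x))          # equal weights
p_w, se_w = weighted_proportion_se(x, [2, 1, 1, 1, 1, 1, 1, 2])  # unequal weights
print(p_eq, se_eq)
print(p_w, se_w)
```

Unequal weights increase $\sum w_i^2$ above $1/n$, so for the same $\hat{p}$ the weighted standard error is never smaller than the unweighted one.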

Wilson score interval

 
Wilson score intervals plotted on a logistic curve, revealing asymmetry and good performance for small n and where p is at or near 0 or 1.

The Wilson score interval is an improvement over the normal approximation interval in multiple respects. It was developed by Edwin Bidwell Wilson (1927).[8] Unlike the symmetric normal approximation interval (above), the Wilson score interval is asymmetric. It does not suffer from problems of overshoot and zero-width intervals that afflict the normal interval, and it may be safely employed with small samples and skewed observations.[3] The observed coverage probability is consistently closer to the nominal value, $1 - \alpha$.[2]

Like the normal interval, the interval can be computed directly from a formula.

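Wilson started with the normal approximation to the binomial:

$$z \approx \frac{\left(p - \hat{p}\right)}{\sigma_n}$$

with the analytic formula for the sample standard deviation given by

$$\sigma_n = \sqrt{\frac{p\left(1 - p\right)}{n}}$$

Combining the two, and squaring out the radical, gives an equation that is quadratic in p:

$$\left(\hat{p} - p\right)^2 = z^2 \cdot \frac{p\left(1 - p\right)}{n}$$

Transforming the relation into a standard-form quadratic equation for p, treating $\hat{p}$ and n as known values from the sample (see prior section), and using the value of z that corresponds to the desired confidence for the estimate of p gives this:

$$\left(1 + \frac{z^2}{n}\right) p^2 - \left(2\hat{p} + \frac{z^2}{n}\right) p + \left(\hat{p}^2\right) = 0$$

where all of the values in parentheses are known quantities. The solution for p estimates the upper and lower limits of the confidence interval for p. Hence the probability of success p is estimated by

$$p \approx \left(w^-, w^+\right) = \frac{1}{1 + \frac{z^2}{n}} \left(\hat{p} + \frac{z^2}{2n}\right) \pm \frac{z}{1 + \frac{z^2}{n}} \sqrt{\frac{\hat{p}\left(1 - \hat{p}\right)}{n} + \frac{z^2}{4n^2}}$$

or the equivalent

$$p \approx \frac{n_S + \tfrac{1}{2} z^2}{n + z^2} \pm \frac{z}{n + z^2} \sqrt{\frac{n_S n_F}{n} + \frac{z^2}{4}}$$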

The practical observation from using this interval is that it has good properties even for a small number of trials and / or an extreme probability.
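As a minimal sketch of the closed-form solution, the following uses only the Python standard library. The name `wilson_interval` and the example counts are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(n_s, n, alpha=0.05):
    """Wilson score interval (w-, w+) for n_s successes in n trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = n_s / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return centre - half_width, centre + half_width


print(wilson_interval(5, 10))  # symmetric about 0.5
print(wilson_interval(0, 10))  # lower bound at (numerically) 0, non-zero width
```

For zero successes the interval keeps a positive width and its lower bound sits at zero up to floating-point rounding, in contrast to the zero-width normal approximation interval.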

Intuitively, the center value of this interval is the weighted average of $\hat{p}$ and $\tfrac{1}{2}$, with $\hat{p}$ receiving greater weight as the sample size increases. Formally, the center value corresponds to using a pseudocount of $\tfrac{1}{2} z^2$, where $z$ is the number of standard deviations of the confidence interval: add this number to both the count of successes and of failures to yield the estimate of the ratio. For the common interval of two standard deviations in each direction (approximately 95% coverage, which itself corresponds to approximately 1.96 standard deviations), this yields the estimate $(n_S + 2)/(n + 4)$, which is known as the "plus four rule".
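The weighted-average reading can be made explicit by splitting the center value into its two parts; this is a direct rearrangement, with $\hat{p} = n_S/n$:

$$\frac{n_S + \tfrac{1}{2} z^2}{n + z^2} = \frac{n}{n + z^2}\,\hat{p} + \frac{z^2}{n + z^2} \cdot \frac{1}{2}$$

The weight on $\hat{p}$ is $n/(n + z^2)$, which tends to 1 as $n$ grows. Setting $z = 2$ gives $\left(n_S + 2\right)/\left(n + 4\right)$.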

Although the quadratic can be solved explicitly, in most cases Wilson's equations can also be solved numerically using the fixed-point iteration

$$p_{k+1} = \hat{p} \pm z \cdot \sqrt{\frac{p_k \cdot \left(1 - p_k\right)}{n}}$$

with $p_0 = \hat{p}$.
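The iteration can be sketched as follows, using only the Python standard library. The function name, the tolerance, the iteration cap, and the clamp to [0, 1] are assumptions added for the sketch; the clamp only guards the square root against a negative argument.

```python
from math import sqrt
from statistics import NormalDist


def wilson_fixed_point(n_s, n, alpha=0.05, tol=1e-12, max_iter=1000):
    """Wilson bounds via p_{k+1} = p_hat +/- z * sqrt(p_k (1 - p_k) / n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = n_s / n
    bounds = []
    for sign in (-1, 1):          # -1: lower bound, +1: upper bound
        p = p_hat                 # p_0 = p_hat
        for _ in range(max_iter):
            q = min(max(p, 0.0), 1.0)  # added safeguard, not part of the formula
            p_next = p_hat + sign * z * sqrt(q * (1 - q) / n)
            converged = abs(p_next - p) < tol
            p = p_next
            if converged:
                break
        bounds.append(p)
    return tuple(bounds)


print(wilson_fixed_point(30, 100))
```

The iteration is not guaranteed to converge. For example, with $\hat{p} = 0$ the start value makes the square root vanish, so the iterate never leaves 0, and for small counts the lower-bound iteration can oscillate; the closed-form solution is the safer choice in those cases.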

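The Wilson interval can also be derived from the single sample z-test or Pearson's chi-squared test with two categories. The resulting interval,

$$\left\{ \theta \;\bigg|\; y \leq \frac{\hat{p} - \theta}{\sqrt{\tfrac{1}{n} \theta \left(1 - \theta\right)}} \leq z \right\}$$

can then be solved for $\theta$ to produce the Wilson score interval. The test in the middle of the inequality is a score test.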

The interval equality principle

 
The probability density function for the Wilson score interval, plus pdfs at interval bounds. Tail areas are equal.

Since the interval is derived by solving from the normal approximation to the binomial, the Wilson score interval $\left(w^-, w^+\right)$ has the property of being guaranteed to obtain the same result as the equivalent z-test or chi-squared test.

This property can be visualised by plotting the probability density function for the Wilson score interval (see Wallis 2021: 297-313)[9] and then plotting a normal pdf at each bound. The tail areas of the resulting Wilson and normal distributions, representing the chance of a significant result in that direction, must be equal.

The continuity-corrected Wilson score interval and the Clopper-Pearson interval are also compliant with this property. The practical import is that these intervals may be employed as significance tests, with identical results to the source test, and new tests may be derived by geometry.[9]

Wilson score interval with continuity correction

The Wilson interval may be modified by employing a continuity correction, in order to align the minimum coverage probability, rather than the average coverage probability, with the nominal value, $1 - \alpha$.

Just as the Wilson interval mirrors Pearson's chi-squared test, the Wilson interval with continuity correction mirrors the equivalent Yates' chi-squared test.

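The following formulae for the lower and upper bounds of the Wilson score interval with continuity correction $\left(w_{cc}^-, w_{cc}^+\right)$ are derived from Newcombe (1998).[2]

$$\begin{aligned}
w_{cc}^- &= \max\left\{0, \frac{2n\hat{p} + z^2 - \left[z \sqrt{z^2 - \frac{1}{n} + 4n\hat{p}\left(1 - \hat{p}\right) + \left(4\hat{p} - 2\right)} + 1\right]}{2\left(n + z^2\right)}\right\} \\
w_{cc}^+ &= \min\left\{1, \frac{2n\hat{p} + z^2 + \left[z \sqrt{z^2 - \frac{1}{n} + 4n\hat{p}\left(1 - \hat{p}\right) - \left(4\hat{p} - 2\right)} + 1\right]}{2\left(n + z^2\right)}\right\}
\end{aligned}$$

However, if $\hat{p} = 0$, $w_{cc}^-$ must be taken as 0; and if $\hat{p} = 1$, $w_{cc}^+$ must be taken as 1.

Wallis (2021)[9] identifies a simpler method for computing continuity-corrected Wilson intervals that employs special functions. In Wallis' notation, for the lower bound, let $\mathrm{WilsonLower}\left(\hat{p}, n, \alpha/2\right) = w^-$, where $\alpha$ is the selected error level for $z$. Then $w_{cc}^- = \mathrm{WilsonLower}\left(\max\left(\hat{p} - \tfrac{1}{2n}, 0\right), n, \alpha/2\right)$. This method has the advantage of being further decomposable.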
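The two routes to the continuity-corrected lower bound can be compared numerically. The sketch below implements the Newcombe-style closed form and the shifted-argument method using only the Python standard library; all function names are illustrative, and the handling of $\hat{p} = 0$ and $\hat{p} = 1$ follows the special cases stated above.

```python
from math import sqrt
from statistics import NormalDist


def wilson_lower(p_hat, n, z):
    """Uncorrected Wilson lower bound w- for proportion p_hat."""
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    return centre - (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))


def wilson_cc(p_hat, n, alpha=0.05):
    """Continuity-corrected Wilson interval from the closed-form bounds."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    core = z * z - 1 / n + 4 * n * p_hat * (1 - p_hat)
    shift = 4 * p_hat - 2
    two_d = 2 * (n + z * z)
    lower, upper = 0.0, 1.0          # special cases p_hat = 0 and p_hat = 1
    if p_hat > 0:
        lower = max(0.0, (2 * n * p_hat + z * z - (z * sqrt(core + shift) + 1)) / two_d)
    if p_hat < 1:
        upper = min(1.0, (2 * n * p_hat + z * z + (z * sqrt(core - shift) + 1)) / two_d)
    return lower, upper


def wilson_cc_lower_shifted(p_hat, n, alpha=0.05):
    """Lower bound via WilsonLower(max(p_hat - 1/(2n), 0), n, alpha/2)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return wilson_lower(max(p_hat - 1 / (2 * n), 0.0), n, z)


print(wilson_cc(0.5, 10))
print(wilson_cc_lower_shifted(0.5, 10))
```

Expanding $4n p'(1 - p')$ with $p' = \hat{p} - \tfrac{1}{2n}$ reproduces the term under the square root in the closed form, which is why the two lower bounds agree up to rounding.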

Jeffreys interval

The Jeffreys interval has a Bayesian derivation, but it has good frequentist properties. In particular, it has coverage properties that are similar to those of the Wilson interval, but it is one of the few intervals with the advantage of being equal-tailed (e.g., for a 95% confidence interval, the probabilities of the interval lying above or below the true value are both close to 2.5%). In contrast, the Wilson interval has a systematic bias such that it is centred too close to p = 0.5.[10]

The Jeffreys interval is the Bayesian credible interval obtained when using the non-informative Jeffreys prior for the binomial proportion p. The Jeffreys prior for this problem is a Beta distribution with parameters (1/2, 1/2), which is a conjugate prior. After observing x successes in n trials, the posterior distribution for p is a Beta distribution with parameters (x + 1/2, n – x + 1/2).

When x ≠ 0 and x ≠ n, the Jeffreys interval is taken to be the 100(1 – α)% equal-tailed posterior probability interval, i.e., the α / 2 and 1 – α / 2 quantiles of a Beta distribution with parameters (x + 1/2, n – x + 1/2). These quantiles need to be computed numerically, although this is reasonably simple with modern statistical software.

In order to avoid the coverage probability tending to zero when p → 0 or 1, when x = 0 the upper limit is calculated as before but the lower limit is set to 0, and when x = n the lower limit is calculated as before but the upper limit is set to 1.[4]
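A minimal sketch of this recipe, assuming SciPy is available and using its `scipy.stats.beta.ppf` quantile function; the function name `jeffreys_interval` is illustrative.

```python
from scipy.stats import beta


def jeffreys_interval(x, n, alpha=0.05):
    """Equal-tailed Jeffreys interval for x successes in n trials."""
    # Quantiles of the Beta(x + 1/2, n - x + 1/2) posterior.
    lower, upper = beta.ppf([alpha / 2, 1 - alpha / 2], x + 0.5, n - x + 0.5)
    if x == 0:
        lower = 0.0  # boundary modification for x = 0
    if x == n:
        upper = 1.0  # boundary modification for x = n
    return float(lower), float(upper)


print(jeffreys_interval(5, 10))
print(jeffreys_interval(0, 10))
```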

Clopper–Pearson interval

The Clopper–Pearson interval is an early and very common method for calculating binomial confidence intervals.[11] This is often called an 'exact' method, as it attains the nominal coverage level in an exact sense, meaning that the coverage level is never less than the nominal $1 - \alpha$.[2]

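The Clopper–Pearson interval can be written as

$$S_{\leq} \cap S_{\geq}$$

or equivalently,

$$\left(\inf S_{\geq},\; \sup S_{\leq}\right)$$

with

$$S_{\leq} := \left\{ p \;\Big|\; P\left[\operatorname{Bin}(n; p) \leq x\right] > \frac{\alpha}{2} \right\} \quad \text{and} \quad S_{\geq} := \left\{ p \;\Big|\; P\left[\operatorname{Bin}(n; p) \geq x\right] > \frac{\alpha}{2} \right\},$$

where 0 ≤ x ≤ n is the number of successes observed in the sample and Bin(n; p) is a binomial random variable with n trials and probability of success p.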
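The definition can be turned into a direct numerical sketch: each limit is the value of p at which the relevant binomial tail probability equals α/2, found here by bisection. The code uses only the Python standard library; the helper names, the choice of bisection, and the iteration count are assumptions of the sketch, and the direct pmf sum is only suitable for moderate n.

```python
from math import comb


def binom_cdf(x, n, p):
    """P[Bin(n; p) <= x], summed directly from the binomial pmf."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def solve_increasing(g, target, iters=100):
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    # Lower limit: P[Bin >= x] = alpha/2, taken as 0 when x = 0.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: P[Bin <= x] = alpha/2, taken as 1 when x = n.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


print(clopper_pearson(20, 400))
print(clopper_pearson(0, 10))
```

For x = 0 the upper limit solves $(1 - p)^n = \alpha/2$, so the bisection result can be checked against $1 - (\alpha/2)^{1/n}$.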

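Equivalently we can say that the Clopper–Pearson interval is $\left(\frac{x}{n} - \varepsilon_1,\; \frac{x}{n} + \varepsilon_2\right)$ with confidence level $1 - \alpha$ if $\varepsilon_i$ is the infimum of those such that the following tests of hypothesis succeed with significance $\frac{\alpha}{2}$:

  1. H0: $p = \frac{x}{n} - \varepsilon_1$ with HA: $p > \frac{x}{n} - \varepsilon_1$
  2. H0: $p = \frac{x}{n} + \varepsilon_2$ with HA: $p < \frac{x}{n} + \varepsilon_2$.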

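Because of a relationship between the binomial distribution and the beta distribution, the Clopper–Pearson interval is sometimes presented in an alternate format that uses quantiles from the beta distribution.[12]

$$B\left(\frac{\alpha}{2};\, x,\, n - x + 1\right) < p < B\left(1 - \frac{\alpha}{2};\, x + 1,\, n - x\right)$$

where x is the number of successes, n is the number of trials, and B(p; v, w) is the pth quantile from a beta distribution with shape parameters v and w.

Thus, $p_{\min} < p < p_{\max}$, where:

$$\frac{\Gamma(n + 1)}{\Gamma(x)\,\Gamma(n - x + 1)} \int_0^{p_{\min}} t^{x - 1} (1 - t)^{n - x} \, dt = \frac{\alpha}{2}$$

$$\frac{\Gamma(n + 1)}{\Gamma(x + 1)\,\Gamma(n - x)} \int_0^{p_{\max}} t^{x} (1 - t)^{n - x - 1} \, dt = 1 - \frac{\alpha}{2}$$

The binomial proportion confidence interval is then $\left(p_{\min}, p_{\max}\right)$, as follows from the relation between the Binomial distribution cumulative distribution function and the regularized incomplete beta function.

When $x$ is either $0$ or $n$, closed-form expressions for the interval bounds are available: when $x = 0$ the interval is $\left(0,\; 1 - \left(\frac{\alpha}{2}\right)^{1/n}\right)$ and when $x = n$ it is $\left(\left(\frac{\alpha}{2}\right)^{1/n},\; 1\right)$.[12]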

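The beta distribution is, in turn, related to the F-distribution so a third formulation of the Clopper–Pearson interval can be written using F quantiles:

$$\left(1 + \frac{n - x + 1}{x \, F\left(\frac{\alpha}{2};\, 2x,\, 2(n - x + 1)\right)}\right)^{-1} < p < \left(1 + \frac{n - x}{(x + 1) \, F\left(1 - \frac{\alpha}{2};\, 2(x + 1),\, 2(n - x)\right)}\right)^{-1}$$

where x is the number of successes, n is the number of trials, and F(c; d1, d2) is the c quantile from an F-distribution with d1 and d2 degrees of freedom.[13]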

The Clopper–Pearson interval is an exact interval since it is based directly on the binomial distribution rather than any approximation to the binomial distribution. This interval never has less than the nominal coverage for any population proportion, but that means that it is usually conservative. For example, the true coverage rate of a 95% Clopper–Pearson interval may be well above 95%, depending on n and p.[4] Thus the interval may be wider than it needs to be to achieve 95% confidence, and wider than other intervals. In contrast, other confidence intervals may have coverage levels that are lower than the nominal $1 - \alpha$; e.g., the normal approximation (or "standard") interval, Wilson interval,[8] Agresti–Coull interval,[13] etc., with a nominal coverage of 95% may in fact cover less than 95%,[4] even for large sample sizes.[12]

The definition of the Clopper–Pearson interval can also be modified to obtain exact confidence intervals for different distributions. For instance, it can also be applied to the case where the samples are drawn without replacement from a population of a known size, instead of repeated draws of a binomial distribution. In this case, the underlying distribution would be the hypergeometric distribution.

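The interval boundaries are easily computed with numerical functions such as qbeta in R and scipy.stats.beta.ppf in Python.

```python
from scipy.stats import beta

k = 20        # number of successes
n = 400       # number of trials
alpha = 0.05  # target error rate (95% confidence)
# Lower (p_u) and upper (p_o) Clopper–Pearson limits from beta quantiles
p_u, p_o = beta.ppf([alpha / 2, 1 - alpha / 2], [k, k + 1], [n - k + 1, n - k])
```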

Agresti–Coull interval

The Agresti–Coull interval is another approximate binomial confidence interval.[13]

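Given $n_S$ successes in $n$ trials, define

$$\tilde{n} = n + z^2$$

and

$$\tilde{p} = \frac{1}{\tilde{n}} \left(n_S + \frac{z^2}{2}\right)$$

Then, a confidence interval for $p$ is given by

$$\tilde{p} \pm z \sqrt{\frac{\tilde{p}\left(1 - \tilde{p}\right)}{\tilde{n}}}$$

where $z = \Phi^{-1}\left(1 - \tfrac{\alpha}{2}\right)$ is the quantile of a standard normal distribution, as before (for example, a 95% confidence interval requires $\alpha = 0.05$, thereby producing $z = 1.96$). According to Brown, Cai, and DasGupta,[4] taking $z = 2$ instead of 1.96 produces the "add 2 successes and 2 failures" interval previously described by Agresti and Coull.[13]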
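The steps above can be sketched with the Python standard library alone; the function name `agresti_coull_interval` and the example counts are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull_interval(n_s, n, alpha=0.05):
    """Agresti–Coull interval for n_s successes in n trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_tilde = n + z * z                    # adjusted number of trials
    p_tilde = (n_s + z * z / 2) / n_tilde  # adjusted proportion
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half_width, p_tilde + half_width


print(agresti_coull_interval(5, 10))
print(agresti_coull_interval(0, 10))  # the lower bound can fall below 0
```

Because the final step is a normal approximation around the adjusted proportion, the bounds are not confined to [0, 1]; implementations commonly truncate them, which this sketch does not do.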

This interval can be summarised as employing the centre-point adjustment, $\tilde{p}$, of the Wilson score interval, and then applying the Normal approximation to this point.[3][4]

$$\tilde{p} = \frac{\hat{p} + \frac{z^2}{2n}}{1 + \frac{z^2}{n}}$$

Arcsine transformation

The arcsine transformation has the effect of pulling out the ends of the distribution.[14] While it can stabilize the variance (and thus confidence intervals) of proportion data, its use has been criticized in several contexts.[15]

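Let X be the number of successes in n trials and let p = X/n. The variance of p is

$$\operatorname{var}(p) = \frac{p(1 - p)}{n}$$

Using the arcsine transform, the variance of the arcsine of $p^{1/2}$ is[16]

$$\operatorname{var}\left(\arcsin \sqrt{p}\right) \approx \frac{\operatorname{var}(p)}{4p(1 - p)} = \frac{p(1 - p)}{4np(1 - p)} = \frac{1}{4n}$$

So, the confidence interval itself has the form

$$\sin^2\left(\arcsin \sqrt{p} - \frac{z}{2\sqrt{n}}\right) < \theta < \sin^2\left(\arcsin \sqrt{p} + \frac{z}{2\sqrt{n}}\right)$$

where $z$ is the $1 - \tfrac{\alpha}{2}$ quantile of a standard normal distribution.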

This method may be used to estimate the variance of p but its use is problematic when p is close to 0 or 1.
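A minimal sketch of the transformed interval, using only the Python standard library. The function name is illustrative, and clamping the transformed endpoints to $[0, \pi/2]$ is an assumption added here so that $\sin^2$ stays monotone near the boundaries; it is not part of the formula itself.

```python
from math import asin, pi, sin, sqrt
from statistics import NormalDist


def arcsine_interval(x, n, alpha=0.05):
    """Interval from the variance-stabilizing arcsine transform."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = asin(sqrt(x / n))       # transformed observed proportion
    half_width = z / (2 * sqrt(n))   # z times sqrt(1 / (4n))
    # Clamp to [0, pi/2] (added safeguard) before transforming back.
    lower = sin(max(centre - half_width, 0.0)) ** 2
    upper = sin(min(centre + half_width, pi / 2)) ** 2
    return lower, upper


print(arcsine_interval(5, 10))
print(arcsine_interval(0, 10))
```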

t_a transform

Let p be the proportion of successes. For 0 ≤ a ≤ 2,

$$t_a = \log\left(\frac{p^a}{(1 - p)^{2 - a}}\right) = a \log p - (2 - a) \log(1 - p)$$

This family is a generalisation of the logit transform which is a special case with a = 1 and can be used to transform a proportional data distribution to an approximately normal distribution. The parameter a has to be estimated for the data set.

Rule of three — for when no successes are observed

The rule of three is used to provide a simple way of stating an approximate 95% confidence interval for p, in the special case that no successes ($\hat{p} = 0$) have been observed.[17] The interval is (0, 3/n).

By symmetry, in the case of only successes ($\hat{p} = 1$), the interval is (1 − 3/n, 1).
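One common way to see where the 3 comes from is to ask for the largest p at which observing zero successes in n trials still has probability 0.05, i.e. to solve $(1 - p)^n = 0.05$. Since $-\ln 0.05 \approx 3$, the solution is close to 3/n for moderate n. The sketch below compares the two; the function names are illustrative.

```python
from math import log


def rule_of_three_upper(n):
    """Approximate 95% upper limit when no successes are observed."""
    return 3 / n


def exact_upper_no_successes(n, alpha=0.05):
    """Solves (1 - p)**n = alpha for p."""
    return 1 - alpha ** (1 / n)


print(-log(0.05))  # the constant that the rule rounds to 3
for n in (20, 100, 1000):
    print(n, rule_of_three_upper(n), exact_upper_no_successes(n))
```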

Comparison and discussion

There are several research papers that compare these and other confidence intervals for the binomial proportion.[3][2][18][19] Both Agresti and Coull (1998)[13] and Ross (2003)[20] point out that exact methods such as the Clopper–Pearson interval may not work as well as certain approximations. The Normal approximation interval and its presentation in textbooks has been heavily criticised, with many statisticians advocating that it not be used.[4] The principal problems are overshoot (bounds exceed [0, 1]), zero-width intervals at $\hat{p}$ = 0 and 1 (falsely implying certainty),[2] and overall inconsistency with significance testing.[3]

Of the approximations listed above, Wilson score interval methods (with or without continuity correction) have been shown to be the most accurate and the most robust,[3][4][2] though some prefer the Agresti–Coull approach for larger sample sizes.[4] Wilson and Clopper–Pearson methods obtain consistent results with source significance tests,[9] and this property is decisive for many researchers.
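Coverage claims of this kind can be checked exactly for a given n and p by summing the binomial probabilities of every outcome whose interval contains p. The sketch below does this for the normal approximation and Wilson intervals using only the Python standard library; the function names and the choice of n = 10, p = 0.1 are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # z for a nominal 95% interval


def wald(x, n, z=Z95):
    p_hat = x / n
    hw = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - hw, p_hat + hw


def wilson(x, n, z=Z95):
    p_hat = x / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    hw = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return centre - hw, centre + hw


def exact_coverage(interval, n, p):
    """Probability, under Bin(n; p), that the interval contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


print(exact_coverage(wald, 10, 0.1))
print(exact_coverage(wilson, 10, 0.1))
```

For this small-sample case the normal approximation interval covers the true proportion in roughly 65% of samples and the Wilson interval in roughly 93%, against a nominal 95%.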

Many of these intervals can be calculated in R using packages like "binom".

References

  1. ^ Sullivan, Lisa (2017-10-27). "Confidence Intervals". Boston University School of Public Health.
  2. ^ a b c d e f g Newcombe, R. G. (1998). "Two-sided confidence intervals for the single proportion: comparison of seven methods". Statistics in Medicine. 17 (8): 857–872. doi:10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E. PMID 9595616.
  3. ^ a b c d e f Wallis, Sean A. (2013). "Binomial confidence intervals and contingency tests: mathematical fundamentals and the evaluation of alternative methods" (PDF). Journal of Quantitative Linguistics. 20 (3): 178–208. doi:10.1080/09296174.2013.799918. S2CID 16741749.
  4. ^ a b c d e f g h i Brown, Lawrence D.; Cai, T. Tony; DasGupta, Anirban (2001). "Interval Estimation for a Binomial Proportion". Statistical Science. 16 (2): 101–133. CiteSeerX 10.1.1.50.3025. doi:10.1214/ss/1009213286. MR 1861069. Zbl 1059.62533.
  5. ^ Laplace, Pierre Simon (1812). Théorie analytique des probabilités (in French). Ve. Courcier. p. 283.
  6. ^ a b Short, Michael (2021-11-08). "On binomial quantile and proportion bounds: With applications in engineering and informatics". Communications in Statistics - Theory and Methods. 52 (12): 4183–4199. doi:10.1080/03610926.2021.1986540. ISSN 0361-0926. S2CID 243974180.
  7. ^ How to calculate the standard error of a proportion using weighted data?
  8. ^ a b Wilson, E. B. (1927). "Probable inference, the law of succession, and statistical inference". Journal of the American Statistical Association. 22 (158): 209–212. doi:10.1080/01621459.1927.10502953. JSTOR 2276774.
  9. ^ a b c d Wallis, Sean A. (2021). Statistics in Corpus Linguistics - a new approach. New York: Routledge. ISBN 9781138589384.
  10. ^ Cai, TT (2005). "One-sided confidence intervals in discrete distributions". Journal of Statistical Planning and Inference. 131 (1): 63–88. doi:10.1016/j.jspi.2004.01.005.
  11. ^ Clopper, C.; Pearson, E. S. (1934). "The use of confidence or fiducial limits illustrated in the case of the binomial". Biometrika. 26 (4): 404–413. doi:10.1093/biomet/26.4.404.
  12. ^ a b c Thulin, Måns (2014-01-01). "The cost of using exact confidence intervals for a binomial proportion". Electronic Journal of Statistics. 8 (1): 817–840. arXiv:1303.1288. doi:10.1214/14-EJS909. ISSN 1935-7524. S2CID 88519382.
  13. ^ a b c d e Agresti, Alan; Coull, Brent A. (1998). "Approximate is better than 'exact' for interval estimation of binomial proportions". The American Statistician. 52 (2): 119–126. doi:10.2307/2685469. JSTOR 2685469. MR 1628435.
  14. ^ Holland, Steven. "Transformations of proportions and percentages". strata.uga.edu. Retrieved 2020-09-08.
  15. ^ Warton, David I.; Hui, Francis K. C. (January 2011). "The arcsine is asinine: the analysis of proportions in ecology". Ecology. 92 (1): 3–10. doi:10.1890/10-0340.1. hdl:1885/152287. ISSN 0012-9658. PMID 21560670.
  16. ^ Shao J (1998) Mathematical statistics. Springer. New York, New York, USA
  17. ^ Steve Simon (2010) "Confidence interval with zero events", The Children's Mercy Hospital, Kansas City, Mo. (website: "Ask Professor Mean at Stats topics or Medical Research"). Archived October 15, 2011, at the Wayback Machine.
  18. ^ Reiczigel, J (2003). "Confidence intervals for the binomial parameter: some new considerations" (PDF). Statistics in Medicine. 22 (4): 611–621. doi:10.1002/sim.1320. PMID 12590417. S2CID 7715293.
  19. ^ Sauro J., Lewis J.R. (2005) "Comparison of Wald, Adj-Wald, Exact and Wilson intervals Calculator". Archived 2012-06-18 at the Wayback Machine. Proceedings of the Human Factors and Ergonomics Society, 49th Annual Meeting (HFES 2005), Orlando, FL, pp. 2100–2104
  20. ^ Ross, T. D. (2003). "Accurate confidence intervals for binomial proportion and Poisson rate estimation". Computers in Biology and Medicine. 33 (6): 509–531. doi:10.1016/S0010-4825(03)00019-2. PMID 12878234.

binomial, proportion, confidence, interval, statistics, binomial, proportion, confidence, interval, confidence, interval, probability, success, calculated, from, outcome, series, success, failure, experiments, bernoulli, trials, other, words, binomial, proport. In statistics a binomial proportion confidence interval is a confidence interval for the probability of success calculated from the outcome of a series of success failure experiments Bernoulli trials In other words a binomial proportion confidence interval is an interval estimate of a success probability p when only the number of experiments n and the number of successes nS are known There are several formulas for a binomial confidence interval but all of them rely on the assumption of a binomial distribution In general a binomial distribution applies when an experiment is repeated a fixed number of times each trial of the experiment has two possible outcomes success and failure the probability of success is the same for each trial and the trials are statistically independent Because the binomial distribution is a discrete probability distribution i e not continuous and difficult to calculate for large numbers of trials a variety of approximations are used to calculate this confidence interval all with their own tradeoffs in accuracy and computational intensity A simple example of a binomial distribution is the set of various possible outcomes and their probabilities for the number of heads observed when a coin is flipped ten times The observed binomial proportion is the fraction of the flips that turn out to be heads Given this observed proportion the confidence interval for the true probability of the coin landing on heads is a range of possible proportions which may or may not contain the true proportion A 95 confidence interval for the proportion for instance will contain the true proportion 95 of the times that the procedure for constructing the confidence interval is employed 1 Contents 1 Normal 
approximation interval or Wald interval 1 1 Bracketing the confidence interval 1 2 Standard error of a proportion estimation when using weighted data 2 Wilson score interval 2 1 The interval equality principle 2 2 Wilson score interval with continuity correction 3 Jeffreys interval 4 Clopper Pearson interval 5 Agresti Coull interval 6 Arcsine transformation 7 ta transform 8 Rule of three for when no successes are observed 9 Comparison and discussion 10 See also 11 ReferencesNormal approximation interval or Wald interval edit nbsp Plotting the normal approximation interval on an arbitrary logistic curve reveals problems of overshoot and zero width intervals 2 A commonly used formula for a binomial confidence interval relies on approximating the distribution of error about a binomially distributed observation p displaystyle hat p nbsp with a normal distribution 3 This approximation is based on the central limit theorem and is unreliable when the sample size is small or the success probability is close to 0 or 1 4 Using the normal approximation the success probability p is estimated as p z p 1 p n displaystyle hat p pm z sqrt frac hat p left 1 hat p right n nbsp or the equivalent n S n z n n n S n F displaystyle frac n S n pm frac z n sqrt n sqrt n S n F nbsp where p n S n displaystyle hat p n S n nbsp is the proportion of successes in a Bernoulli trial process measured with n displaystyle n nbsp trials yielding n S displaystyle n S nbsp successes and n F n n S displaystyle n F n n S nbsp failures and z displaystyle z nbsp is the 1 a 2 displaystyle 1 tfrac alpha 2 nbsp quantile of a standard normal distribution i e the probit corresponding to the target error rate a displaystyle alpha nbsp For a 95 confidence level the error a 1 0 95 0 05 displaystyle alpha 1 0 95 0 05 nbsp so 1 a 2 0 975 displaystyle 1 tfrac alpha 2 0 975 nbsp and z 1 96 displaystyle z 1 96 nbsp From this one finds two problems First for p displaystyle hat p nbsp approaching unit or zero the interval 
narrows to zero width implying certainty Second for values of p lt z 2 z 2 n displaystyle hat p lt tfrac z 2 z 2 n nbsp or equivalently for 1 p displaystyle 1 hat p nbsp the interval boundaries exceed 0 1 displaystyle 0 1 nbsp overshoot An important theoretical derivation of this confidence interval involves the inversion of a hypothesis test Under this formulation the confidence interval represents those values of the population parameter that would have large p values if they were tested as a hypothesized population proportion The collection of values 8 displaystyle theta nbsp for which the normal approximation is valid can be represented as 8 y p 8 1 n p 1 p z a 2 displaystyle left theta bigg y leq frac hat p theta sqrt frac 1 n hat p left 1 hat p right leq z alpha 2 right nbsp where y displaystyle y nbsp is the a 2 displaystyle tfrac alpha 2 nbsp quantile of a standard normal distribution Since the test in the middle of the inequality is a Wald test the normal approximation interval is sometimes called the Wald interval or Wald method after Abraham Wald but it was first described by Pierre Simon Laplace in 1812 5 Bracketing the confidence interval edit Extending the normal approximation and Wald Laplace interval concepts Michael Short has shown that inequalities on the approximation error between the binomial distribution and the normal distribution can be used to accurately bracket the estimate of the confidence interval around p displaystyle hat p nbsp 6 k C L 1 n z 2 z n k k 2 C L 2 n C L 3 k C L 4 n n z 2 2 p k C U 1 n z 2 z n k k 2 C U 2 n C U 3 k C U 4 n n z 2 2 displaystyle frac k C L1 n z 2 z sqrt frac nk k 2 C L2 n C L3 k C L4 n n z 2 2 leq hat p leq frac k C U1 n z 2 z sqrt frac nk k 2 C U2 n C U3 k C U4 n n z 2 2 nbsp where p displaystyle hat p nbsp is again the unknown proportion of successes in a Bernoulli trial process measured with n displaystyle n nbsp trials yielding k displaystyle k nbsp successes z displaystyle z nbsp is the 1 a 2 
displaystyle 1 tfrac alpha 2 nbsp quantile of a standard normal distribution i e the probit corresponding to the target error rate a displaystyle alpha nbsp and the constants C L 1 C L 2 C L 3 C L 4 C U 1 C U 2 C U 3 displaystyle C L1 C L2 C L3 C L4 C U1 C U2 C U3 nbsp and C U 4 displaystyle C U4 nbsp are simple algebraic functions of z displaystyle z nbsp 6 For a fixed a displaystyle alpha nbsp and hence z displaystyle z nbsp the above inequalities give easily computed one or two sided intervals which bracket the exact binomial upper and lower confidence limits corresponding to the error rate a displaystyle alpha nbsp Standard error of a proportion estimation when using weighted data edit Let there be a simple random sample X 1 X n displaystyle X 1 ldots X n nbsp where each X i displaystyle X i nbsp is i i d from a Bernoulli p distribution and weight w i displaystyle w i nbsp is the weight for each observation Standardize the positive weights w i displaystyle w i nbsp so they sum to 1 The weighted sample proportion is p i 1 n w i X i textstyle hat p sum i 1 n w i X i nbsp Since the X i displaystyle X i nbsp are independent and each one has variance Var X i p 1 p displaystyle text Var X i p 1 p nbsp the sampling variance of the proportion therefore is 7 Var p i 1 n Var w i X i p 1 p i 1 n w i 2 displaystyle text Var hat p sum i 1 n text Var w i X i p 1 p sum i 1 n w i 2 nbsp The standard error of p displaystyle hat p nbsp is the square root of this quantity Because we do not know p 1 p displaystyle p 1 p nbsp we have to estimate it Although there are many possible estimators a conventional one is to use p displaystyle hat p nbsp the sample mean and plug this into the formula That gives SE p p 1 p i 1 n w i 2 displaystyle text SE hat p sqrt hat p 1 hat p sum i 1 n w i 2 nbsp For unweighted data w i 1 n textstyle w i 1 n nbsp giving i 1 n w i 2 1 n textstyle sum i 1 n w i 2 1 n nbsp The SE becomes p 1 p n textstyle sqrt hat p 1 hat p n nbsp leading to the familiar 
formulas showing that the calculation for weighted data is a direct generalization of them Wilson score interval edit nbsp Wilson score intervals plotted on a logistic curve revealing asymmetry and good performance for small n and where p is at or near 0 or 1 The Wilson score interval is an improvement over the normal approximation interval in multiple respects It was developed by Edwin Bidwell Wilson 1927 8 Unlike the symmetric normal approximation interval above the Wilson score interval is asymmetric It does not suffer from problems of overshoot and zero width intervals that afflict the normal interval and it may be safely employed with small samples and skewed observations 3 The observed coverage probability is consistently closer to the nominal value 1 a displaystyle 1 alpha nbsp 2 Like the normal interval the interval can be computed directly from a formula Wilson started with the normal approximation to the binomial z p p s n displaystyle z approx frac left p hat p right sigma n nbsp with the analytic formula for the sample standard deviation given bys n p 1 p n displaystyle sigma n sqrt frac p left 1 p right n nbsp Combining the two and squaring out the radical gives an equation that is quadratic in p p p 2 z 2 p 1 p n displaystyle left hat p p right 2 z 2 cdot frac p left 1 p right n nbsp Transforming the relation into a standard form quadratic equation for p treating p displaystyle hat p nbsp and n as known values from the sample see prior section and using the value of z that corresponds to the desired confidence for the estimate of p gives this 1 z 2 n p 2 2 p z 2 n p p 2 0 displaystyle left 1 frac z 2 n right p 2 left 2 hat p frac z 2 n right p biggl hat p 2 biggr 0 nbsp where all of the values in parentheses are known quantities The solution for p estimates the upper and lower limits of the confidence interval for p Hence the probability of success p is estimated by p w w 1 1 z 2 n p z 2 2 n z 1 z 2 n p 1 p n z 2 4 n 2 displaystyle p approx w w frac 1 
1 frac z 2 n left hat p frac z 2 2n right pm frac z 1 frac z 2 n sqrt frac hat p 1 hat p n frac z 2 4n 2 nbsp or the equivalent p n S 1 2 z 2 n z 2 z n z 2 n S n F n z 2 4 displaystyle p approx frac n S tfrac 1 2 z 2 n z 2 pm frac z n z 2 sqrt frac n S n F n frac z 2 4 nbsp The practical observation from using this interval is that it has good properties even for a small number of trials and or an extreme probability Intuitively the center value of this interval is the weighted average of p displaystyle hat p nbsp and 1 2 displaystyle tfrac 1 2 nbsp with p displaystyle hat p nbsp receiving greater weight as the sample size increases Formally the center value corresponds to using a pseudocount of 1 2 z2 the number of standard deviations of the confidence interval add this number to both the count of successes and of failures to yield the estimate of the ratio For the common two standard deviations in each direction interval approximately 95 coverage which itself is approximately 1 96 standard deviations this yields the estimate n S 2 n 4 displaystyle n S 2 n 4 nbsp which is known as the plus four rule Although the quadratic can be solved explicitly in most cases Wilson s equations can also be solved numerically using the fixed point iteration p k 1 p z p k 1 p k n displaystyle p k 1 hat p pm z cdot sqrt frac p k cdot left 1 p k right n nbsp with p 0 p displaystyle p 0 hat p nbsp The Wilson interval can also be derived from the single sample z test or Pearson s chi squared test with two categories The resulting interval 8 y p 8 1 n 8 1 8 z displaystyle left theta bigg y leq frac hat p theta sqrt tfrac 1 n theta 1 theta leq z right nbsp can then be solved for 8 displaystyle theta nbsp to produce the Wilson score interval The test in the middle of the inequality is a score test The interval equality principle edit nbsp The probability density function for the Wilson score interval plus pdfs at interval bounds Tail areas are equal Since the interval is derived by 
solving from the normal approximation to the binomial, the Wilson score interval $\left(w^-, w^+\right)$ has the property of being guaranteed to obtain the same result as the equivalent z-test or chi-squared test.

This property can be visualised by plotting the probability density function for the Wilson score interval (see Wallis 2021: 297-313)[9] and then plotting a normal pdf at each bound. The tail areas of the resulting Wilson and normal distributions, representing the chance of a significant result in that direction, must be equal.

The continuity-corrected Wilson score interval and the Clopper–Pearson interval are also compliant with this property. The practical import is that these intervals may be employed as significance tests, with identical results to the source test, and new tests may be derived by geometry.[9]

Wilson score interval with continuity correction

The Wilson interval may be modified by employing a continuity correction, in order to align the minimum coverage probability, rather than the average coverage probability, with the nominal value, $1 - \alpha$.

Just as the Wilson interval mirrors Pearson's chi-squared test, the Wilson interval with continuity correction mirrors the equivalent Yates' chi-squared test.

The following formulae for the lower and upper bounds of the Wilson score interval with continuity correction $\left(w_{cc}^-, w_{cc}^+\right)$ are derived from Newcombe (1998):[2]

$$\begin{aligned}
w_{cc}^- &= \max\left\{ 0,\ \frac{2n\hat{p} + z^2 - \left[ z \sqrt{z^2 - \frac{1}{n} + 4n\hat{p}\left(1 - \hat{p}\right) + \left(4\hat{p} - 2\right)} + 1 \right]}{2\left(n + z^2\right)} \right\} \\
w_{cc}^+ &= \min\left\{ 1,\ \frac{2n\hat{p} + z^2 + \left[ z \sqrt{z^2 - \frac{1}{n} + 4n\hat{p}\left(1 - \hat{p}\right) - \left(4\hat{p} - 2\right)} + 1 \right]}{2\left(n + z^2\right)} \right\}
\end{aligned}$$

However, if $\hat{p} = 0$, $w_{cc}^-$ must be taken as 0; if $\hat{p} = 1$, then $w_{cc}^+$ is 1.

Wallis (2021)[9] identifies a simpler method for computing
continuity-corrected Wilson intervals that employs special functions. In Wallis' notation, for the lower bound, let

$$\mathrm{WilsonLower}\left(\hat{p}, n, \alpha/2\right) = w^-,$$

where $\alpha$ is the selected error level for $z$. Then

$$w_{cc}^- = \mathrm{WilsonLower}\left(\max\left(\hat{p} - \tfrac{1}{2n},\ 0\right),\ n,\ \alpha/2\right).$$

This method has the advantage of being further decomposable.

Jeffreys interval

The Jeffreys interval has a Bayesian derivation, but it has good frequentist properties. In particular, it has coverage properties that are similar to those of the Wilson interval, but it is one of the few intervals with the advantage of being equal-tailed (e.g., for a 95% confidence interval, the probabilities of the interval lying above or below the true value are both close to 2.5%). In contrast, the Wilson interval has a systematic bias such that it is centred too close to p = 0.5.[10]

The Jeffreys interval is the Bayesian credible interval obtained when using the non-informative Jeffreys prior for the binomial proportion p. The Jeffreys prior for this problem is a Beta distribution with parameters (1/2, 1/2); it is a conjugate prior. After observing x successes in n trials, the posterior distribution for p is a Beta distribution with parameters (x + 1/2, n − x + 1/2).

When x ≠ 0 and x ≠ n, the Jeffreys interval is taken to be the 100(1 − α)% equal-tailed posterior probability interval, i.e., the α/2 and 1 − α/2 quantiles of a Beta distribution with parameters (x + 1/2, n − x + 1/2). These quantiles need to be computed numerically, although this is reasonably simple with modern statistical software.

In order to avoid the coverage probability tending to zero when p → 0 or 1, when x = 0 the upper limit is calculated as before but the lower limit is set to 0, and when x = n the lower limit is calculated as before but the upper limit is set to 1.[4]

Clopper–Pearson interval

The Clopper–Pearson interval is an early and very common method for
calculating binomial confidence intervals.[11] This is often called an "exact" method, as it attains the nominal coverage level in an exact sense, meaning that the coverage level is never less than the nominal $1 - \alpha$.[2]

The Clopper–Pearson interval can be written as

$$S_{\leq} \cap S_{\geq}$$

or equivalently,

$$\left(\inf S_{\geq},\ \sup S_{\leq}\right)$$

with

$$S_{\leq} := \left\{ p \;\Big|\; P\left[\operatorname{Bin}\left(n; p\right) \leq x\right] > \frac{\alpha}{2} \right\} \text{ and } S_{\geq} := \left\{ p \;\Big|\; P\left[\operatorname{Bin}\left(n; p\right) \geq x\right] > \frac{\alpha}{2} \right\},$$

where 0 ≤ x ≤ n is the number of successes observed in the sample and Bin(n; p) is a binomial random variable with n trials and probability of success p.

Equivalently, we can say that the Clopper–Pearson interval is $\left(\frac{x}{n} - \varepsilon_1,\ \frac{x}{n} + \varepsilon_2\right)$ with confidence level $1 - \alpha$ if $\varepsilon_i$ is the infimum of those such that the following tests of hypothesis succeed with significance $\frac{\alpha}{2}$:

H0: $p = \frac{x}{n} - \varepsilon_1$ with HA: $p > \frac{x}{n} - \varepsilon_1$
H0: $p = \frac{x}{n} + \varepsilon_2$ with HA: $p < \frac{x}{n} + \varepsilon_2$

Because of a relationship between the binomial distribution and the beta distribution, the Clopper–Pearson interval is sometimes presented in an alternate format that uses quantiles from the beta distribution:[12]

$$B\left(\frac{\alpha}{2};\ x,\ n - x + 1\right) < p < B\left(1 - \frac{\alpha}{2};\ x + 1,\ n - x\right)$$

where x is the number of successes, n is the number of trials, and B(p; v, w) is the pth quantile from a beta distribution with shape parameters v and w.

Thus, $p_{\min} < p < p_{\max}$, where:

$$\frac{\Gamma(n + 1)}{\Gamma(x)\,\Gamma(n - x + 1)} \int_0^{p_{\min}} t^{x - 1} (1 - t)^{n - x} \, dt = \frac{\alpha}{2},$$

$$\frac{\Gamma(n + 1)}{\Gamma(x + 1)\,\Gamma(n - x)} \int_0^{p_{\max}} t^{x} (1 - t)^{n - x - 1} \, dt = 1 - \frac{\alpha}{2}.$$

The binomial proportion confidence interval is then $\left(p_{\min}, p_{\max}\right)$, as follows from the relation between the binomial cumulative distribution function and the regularized incomplete beta function.

When $x$ is either $0$ or $n$, closed-form expressions for the interval bounds are available: when $x = 0$ the interval is $\left(0,\ 1 - \left(\frac{\alpha}{2}\right)^{\frac{1}{n}}\right)$ and when $x = n$ it is $\left(\left(\frac{\alpha}{2}\right)^{\frac{1}{n}},\ 1\right)$.[12]

The beta distribution is, in turn, related to the F-distribution, so a third formulation of the Clopper–Pearson interval can be written using F quantiles:

$$\left(1 + \frac{n - x + 1}{x \, F\left[\frac{\alpha}{2};\ 2x,\ 2(n - x + 1)\right]}\right)^{-1} < p < \left(1 + \frac{n - x}{(x + 1) \, F\left[1 - \frac{\alpha}{2};\ 2(x + 1),\ 2(n - x)\right]}\right)^{-1}$$

where x is the number of successes, n is the number of trials, and F(c; d1, d2) is the c quantile from an F-distribution with d1 and d2 degrees of freedom.[13]

The Clopper–Pearson interval is an exact interval since it is based directly on the binomial distribution rather than any approximation to the binomial distribution. This interval never has less than the nominal coverage for any population proportion, but that means that it is usually conservative. For example, the true coverage rate of a 95% Clopper–Pearson interval may be well above 95%, depending on n and p.[4] Thus the interval may be wider than it needs to be to achieve 95% confidence, and wider than other intervals. In contrast, it is worth noting that other confidence intervals may
have coverage levels that are lower than the nominal $1 - \alpha$; that is, the normal approximation (or "standard") interval, Wilson interval,[8] Agresti–Coull interval,[13] etc., with a nominal coverage of 95% may in fact cover less than 95%,[4] even for large sample sizes.[12]

The definition of the Clopper–Pearson interval can also be modified to obtain exact confidence intervals for different distributions. For instance, it can also be applied to the case where the samples are drawn without replacement from a population of a known size, instead of repeated draws of a binomial distribution. In this case, the underlying distribution would be the hypergeometric distribution.

The interval boundaries are easily computed with numerical functions such as qbeta in R and scipy.stats.beta.ppf in Python.

```python
from scipy.stats import beta

k = 20        # number of successes
n = 400       # number of trials
alpha = 0.05  # error level; 1 - alpha is the nominal confidence level

# Lower (p_u) and upper (p_o) bounds from the beta quantiles
p_u, p_o = beta.ppf([alpha / 2, 1 - alpha / 2], [k, k + 1], [n - k + 1, n - k])
```

Agresti–Coull interval

The Agresti–Coull interval is another approximate binomial confidence interval.[13]

Given $n_S$ successes in $n$ trials, define

$$\tilde{n} = n + z^2$$

and

$$\tilde{p} = \frac{1}{\tilde{n}} \left(n_S + \frac{z^2}{2}\right).$$

Then, a confidence interval for $p$ is given by

$$\tilde{p} \pm z \sqrt{\frac{\tilde{p}}{\tilde{n}} \left(1 - \tilde{p}\right)}$$

where $z = \Phi^{-1}\left(1 - \frac{\alpha}{2}\right)$ is the quantile of a standard normal distribution, as before (for example, a 95% confidence interval requires $\alpha = 0.05$, thereby producing $z = 1.96$). According to Brown, Cai, and DasGupta,[4] taking $z = 2$ instead of 1.96 produces the "add 2 successes and 2 failures" interval previously described by Agresti and Coull.[13]

This interval can be summarised as employing the centre-point adjustment, $\tilde{p}$, of the Wilson score interval, and then applying the Normal approximation
to this point.[3][4]

$$\tilde{p} = \frac{\hat{p} + \frac{z^2}{2n}}{1 + \frac{z^2}{n}}$$

Arcsine transformation

Main article: Arcsine transformation
Further information: Cohen's h

The arcsine transformation has the effect of pulling out the ends of the distribution.[14] While it can stabilize the variance (and thus confidence intervals) of proportion data, its use has been criticized in several contexts.[15]

Let X be the number of successes in n trials and let p = X/n. The variance of p is

$$\operatorname{var}(p) = \frac{p(1 - p)}{n}.$$

Using the arcsine transform, the variance of the arcsine of $p^{1/2}$ is[16]

$$\operatorname{var}\left(\arcsin\sqrt{p}\right) \approx \frac{\operatorname{var}(p)}{4p(1 - p)} = \frac{p(1 - p)}{4np(1 - p)} = \frac{1}{4n}.$$

So, the confidence interval itself has the form

$$\left[\sin\left(\arcsin\sqrt{p} - \frac{z}{2\sqrt{n}}\right)\right]^2 < \theta < \left[\sin\left(\arcsin\sqrt{p} + \frac{z}{2\sqrt{n}}\right)\right]^2,$$

where $z$ is the $1 - \frac{\alpha}{2}$ quantile of a standard normal distribution.

This method may be used to estimate the variance of p, but its use is problematic when p is close to 0 or 1.

t_a transform

Let p be the proportion of successes. For 0 ≤ a ≤ 2,

$$t_a = \log\left(\frac{p^a}{(1 - p)^{2 - a}}\right) = a \log(p) - (2 - a) \log(1 - p)$$

This family is a generalisation of the logit transform, which is a special case with a = 1, and can be used to transform a proportional data distribution to an approximately normal distribution. The parameter a has to be estimated for the data set.

Rule of three for when no successes are observed

The rule of
three is used to provide a simple way of stating an approximate 95% confidence interval for p, in the special case that no successes ($\hat{p} = 0$) have been observed.[17] The interval is (0, 3/n).

By symmetry, in the case of only successes ($\hat{p} = 1$), the interval is (1 − 3/n, 1).

Comparison and discussion

There are several research papers that compare these and other confidence intervals for the binomial proportion.[3][2][18][19] Both Agresti and Coull (1998)[13] and Ross (2003)[20] point out that exact methods such as the Clopper–Pearson interval may not work as well as certain approximations. The Normal approximation interval and its presentation in textbooks have been heavily criticised, with many statisticians advocating that it not be used.[4] The principal problems are overshoot (bounds exceed [0, 1]), zero-width intervals at $\hat{p}$ = 0 and 1 (falsely implying certainty),[2] and overall inconsistency with significance testing.[3]

Of the approximations listed above, Wilson score interval methods (with or without continuity correction) have been shown to be the most accurate and the most robust,[3][4][2] though some prefer the Agresti–Coull approach for larger sample sizes.[4] Wilson and Clopper–Pearson methods obtain consistent results with source significance tests,[9] and this property is decisive for many researchers.

Many of these intervals can be calculated in R using packages like binom.

See also

Binomial distribution § Confidence intervals
Estimation theory
Pseudocount

References

1. Sullivan, Lisa (2017-10-27). "Confidence Intervals". Boston University School of Public Health.
2. Newcombe, R. G. (1998). "Two-sided confidence intervals for the single proportion: comparison of seven methods". Statistics in Medicine. 17 (8): 857–872. doi:10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E. PMID 9595616.
3. Wallis, Sean A. (2013). "Binomial confidence intervals and contingency tests: mathematical fundamentals and the evaluation of alternative methods" (PDF). Journal of Quantitative Linguistics. 20 (3): 178–208. doi:10.1080/09296174.2013.799918. S2CID 16741749.
4. Brown, Lawrence D.; Cai, T. Tony; DasGupta, Anirban (2001). "Interval Estimation for a Binomial Proportion". Statistical Science. 16 (2): 101–133. CiteSeerX 10.1.1.50.3025. doi:10.1214/ss/1009213286. MR 1861069. Zbl 1059.62533.
5. Laplace, Pierre Simon (1812). Théorie analytique des probabilités (in French). Ve. Courcier. p. 283.
6. Short, Michael (2021-11-08). "On binomial quantile and proportion bounds: With applications in engineering and informatics". Communications in Statistics - Theory and Methods. 52 (12): 4183–4199. doi:10.1080/03610926.2021.1986540. ISSN 0361-0926. S2CID 243974180.
7. "How to calculate the standard error of a proportion using weighted data?".
8. Wilson, E. B. (1927). "Probable inference, the law of succession, and statistical inference". Journal of the American Statistical Association. 22 (158): 209–212. doi:10.1080/01621459.1927.10502953. JSTOR 2276774.
9. Wallis, Sean A. (2021). Statistics in Corpus Linguistics: a new approach. New York: Routledge. ISBN 9781138589384.
10. Cai, T. T. (2005). "One-sided confidence intervals in discrete distributions". Journal of Statistical Planning and Inference. 131 (1): 63–88. doi:10.1016/j.jspi.2004.01.005.
11. Clopper, C.; Pearson, E. S. (1934). "The use of confidence or fiducial limits illustrated in the case of the binomial". Biometrika. 26 (4): 404–413. doi:10.1093/biomet/26.4.404.
12. Thulin, Måns (2014-01-01). "The cost of using exact confidence intervals for a binomial proportion". Electronic Journal of Statistics. 8 (1): 817–840. arXiv:1303.1288. doi:10.1214/14-EJS909. ISSN 1935-7524. S2CID 88519382.
13. Agresti, Alan; Coull, Brent A. (1998). "Approximate is better than 'exact' for interval estimation of binomial proportions". The American Statistician. 52 (2): 119–126. doi:10.2307/2685469. JSTOR 2685469. MR 1628435.
14. Holland, Steven. "Transformations of proportions and percentages". strata.uga.edu. Retrieved 2020-09-08.
15. Warton, David I.; Hui, Francis K. C. (January 2011). "The arcsine is asinine: the analysis of proportions in ecology". Ecology. 92 (1): 3–10. doi:10.1890/10-0340.1. hdl:1885/152287. ISSN 0012-9658. PMID 21560670.
16. Shao, J. (1998). Mathematical statistics. Springer. New York, New York, USA.
17. Simon, Steve (2010). "Confidence interval with zero events". The Children's Mercy Hospital, Kansas City, Mo. (website: "Ask Professor Mean at Stats topics or Medical Research"). Archived October 15, 2011, at the Wayback Machine.
18. Reiczigel, J. (2003). "Confidence intervals for the binomial parameter: some new considerations" (PDF). Statistics in Medicine. 22 (4): 611–621. doi:10.1002/sim.1320. PMID 12590417. S2CID 7715293.
19. Sauro, J.; Lewis, J. R. (2005). "Comparison of Wald, Adj-Wald, Exact and Wilson intervals Calculator". Archived 2012-06-18 at the Wayback Machine. Proceedings of the Human Factors and Ergonomics Society, 49th Annual Meeting (HFES 2005), Orlando, FL, pp. 2100–2104.
20. Ross, T. D. (2003). "Accurate confidence intervals for binomial proportion and Poisson rate estimation". Computers in Biology and Medicine. 33 (6): 509–531. doi:10.1016/S0010-4825(03)00019-2. PMID 12878234.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Binomial_proportion_confidence_interval&oldid=1186615456"
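
Supplementary example: the closed-form intervals discussed in this article (normal approximation, Wilson score and Agresti–Coull) can be compared with a short Python sketch. It is illustrative only and not a reference implementation. The function names (z_value, wald_interval, wilson_interval, agresti_coull_interval) are hypothetical, n_s stands for the number of successes n_S, and only the Python standard library (version 3.8 or later) is assumed.

```python
from math import sqrt
from statistics import NormalDist


def z_value(alpha: float) -> float:
    """Return the 1 - alpha/2 quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def wald_interval(n_s: int, n: int, alpha: float = 0.05) -> tuple:
    """Normal approximation interval: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n).

    The bounds are not clipped, so they may leave [0, 1] or collapse to zero width.
    """
    z = z_value(alpha)
    p_hat = n_s / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wilson_interval(n_s: int, n: int, alpha: float = 0.05) -> tuple:
    """Wilson score interval (w-, w+), without continuity correction."""
    z = z_value(alpha)
    p_hat = n_s / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return centre - half_width, centre + half_width


def agresti_coull_interval(n_s: int, n: int, alpha: float = 0.05) -> tuple:
    """Agresti-Coull interval: Wilson centre point plus a normal approximation."""
    z = z_value(alpha)
    n_tilde = n + z * z
    p_tilde = (n_s + z * z / 2) / n_tilde
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half_width, p_tilde + half_width


if __name__ == "__main__":
    # Hypothetical data: 1 success in 10 trials, nominal 95% coverage.
    for name, fn in [
        ("normal approximation", wald_interval),
        ("Wilson score", wilson_interval),
        ("Agresti-Coull", agresti_coull_interval),
    ]:
        lower, upper = fn(1, 10)
        print(f"{name}: ({lower:.4f}, {upper:.4f})")
```

With one success in ten trials, the lower bound of the normal approximation interval falls below 0 (overshoot), whereas both Wilson bounds stay inside (0, 1). The Wilson and Agresti–Coull intervals share the same centre point and differ only in width.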
