Wikipedia

Margin of error

The margin of error is a statistic expressing the amount of random sampling error in the results of a survey. The larger the margin of error, the less confidence one should have that a poll result would reflect the result of a census of the entire population. The margin of error will be positive whenever a population is incompletely sampled and the outcome measure has positive variance, which is to say, the measure varies.

Probability densities of polls of different sizes, each color-coded to its 95% confidence interval (below), margin of error (left), and sample size (right). Each interval reflects the range within which one may have 95% confidence that the true percentage may be found, given a reported percentage of 50%. The margin of error is half the confidence interval (also, the radius of the interval). The larger the sample, the smaller the margin of error. Also, the further from 50% the reported percentage, the smaller the margin of error.

The term margin of error is often used in non-survey contexts to indicate observational error in reporting measured quantities.

Concept

Consider a simple yes/no poll $P$ as a sample of $n$ respondents drawn from a population $N$ (with $n \ll N$), reporting the percentage $p$ of yes responses. We would like to know how close $p$ is to the true result of a survey of the entire population $N$, without having to conduct one. If, hypothetically, we were to conduct poll $P$ over subsequent samples of $n$ respondents (newly drawn from $N$), we would expect those subsequent results $p_1, p_2, \ldots$ to be normally distributed about $\overline{p}$. The margin of error describes the distance within which a specified percentage of these results is expected to vary from $\overline{p}$.

According to the 68-95-99.7 rule, we would expect that 95% of the results $p_1, p_2, \ldots$ will fall within about two standard deviations ($\pm 2\sigma_P$) either side of the true mean $\overline{p}$. This interval is called the confidence interval, and the radius (half the interval) is called the margin of error, corresponding to a 95% confidence level.

Generally, at a confidence level $\gamma$, a sample sized $n$ of a population having expected standard deviation $\sigma$ has a margin of error

$$MOE_\gamma = z_\gamma \times \sqrt{\frac{\sigma^2}{n}}$$

where $z_\gamma$ denotes the quantile (also, commonly, a z-score), and $\sqrt{\frac{\sigma^2}{n}}$ is the standard error.
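The general formula, margin of error = z_γ × √(σ²/n), can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `margin_of_error` and the sample inputs are assumptions made for the example, and the quantile is taken from the standard library's `statistics.NormalDist`, treating γ as a two-sided confidence level.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma: float, n: int, gamma: float = 0.95) -> float:
    """Return z_gamma * sqrt(sigma**2 / n) for a two-sided confidence level gamma."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must satisfy 0 <= gamma < 1")
    # Two-sided quantile: a fraction gamma of the probability lies within +/- z.
    z = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    standard_error = sqrt(sigma ** 2 / n)
    return z * standard_error


# Hypothetical inputs: sigma = 0.5 (the largest value a yes/no poll can have), n = 1000.
moe = margin_of_error(sigma=0.5, n=1000, gamma=0.95)
print(f"margin of error: ±{moe:.2%}")
```

Quadrupling `n` halves the result, since the sample size enters only through the square root.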

Standard deviation and standard error

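We would expect the normally distributed values $p_1, p_2, \ldots$ to have a standard deviation which somehow varies with $n$. The smaller $n$, the wider the margin. This is called the standard error $\sigma_{\overline{p}}$.

For the single result from our survey, we assume that $p = \overline{p}$, and that all subsequent results $p_1, p_2, \ldots$ together would have a variance $\sigma_P^2 = P(1-P)$.

$$\text{Standard error} = \sigma_{\overline{p}} \approx \sqrt{\frac{\sigma_P^2}{n}} \approx \sqrt{\frac{p(1-p)}{n}}$$

Note that $p(1-p)$ corresponds to the variance of a Bernoulli distribution.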
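The claim that repeated polls scatter with a standard deviation of about √(p(1−p)/n) can be checked with a small simulation. The sketch below assumes a true proportion of 0.5 and 2,000 hypothetical repeat polls; the helper name `simulate_polls` and the seed are illustrative choices, not part of the original text.

```python
import random
from math import sqrt
from statistics import pstdev


def simulate_polls(p_true: float, n: int, polls: int, seed: int = 0) -> list[float]:
    """Simulate `polls` yes/no polls of n respondents each; return the reported proportions."""
    rng = random.Random(seed)
    results = []
    for _ in range(polls):
        # Each respondent answers "yes" independently with probability p_true.
        yes = sum(1 for _ in range(n) if rng.random() < p_true)
        results.append(yes / n)
    return results


p_true, n = 0.5, 1013  # illustrative values
results = simulate_polls(p_true, n, polls=2000)
observed = pstdev(results)                    # spread of the simulated poll results
predicted = sqrt(p_true * (1 - p_true) / n)   # sqrt(p(1 - p) / n)
print(f"observed {observed:.4f}, predicted {predicted:.4f}")
```

The two figures should agree closely, and the agreement tightens as the number of simulated polls grows.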

Maximum margin of error at different confidence levels

 

For a confidence level $\gamma$, there is a corresponding confidence interval about the mean $\mu \pm z_\gamma\sigma$, that is, the interval $[\mu - z_\gamma\sigma, \mu + z_\gamma\sigma]$ within which values of $P$ should fall with probability $\gamma$. Precise values of $z_\gamma$ are given by the quantile function of the normal distribution (which the 68-95-99.7 rule approximates).

Note that $z_\gamma$ is undefined for $\gamma \geq 1$, that is, $z_{1.00}$ is undefined, as is $z_{1.10}$.

γ        z_γ               γ              z_γ
0.68     0.994457883210    0.999          3.290526731492
0.90     1.644853626951    0.9999         3.890591886413
0.95     1.959963984540    0.99999        4.417173413469
0.98     2.326347874041    0.999999       4.891638475699
0.99     2.575829303549    0.9999999      5.326723886384
0.995    2.807033768344    0.99999999     5.730728868236
0.997    2.967737925342    0.999999999    6.109410204869
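The tabulated quantiles can be reproduced from the normal quantile function. The following sketch uses `statistics.NormalDist().inv_cdf` from the Python standard library and assumes the two-sided convention, in which a confidence level γ maps to the (1 + γ)/2 quantile; the function name `z_score` is chosen for illustration.

```python
from statistics import NormalDist


def z_score(gamma: float) -> float:
    """Two-sided normal quantile z_gamma for a confidence level 0 <= gamma < 1."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("z_gamma is undefined unless 0 <= gamma < 1")
    return NormalDist().inv_cdf((1.0 + gamma) / 2.0)


for gamma in (0.68, 0.90, 0.95, 0.98, 0.99, 0.995, 0.997, 0.999):
    print(f"{gamma:<6} {z_score(gamma):.6f}")
```

Passing γ = 1 raises an error rather than returning a value, matching the note that the quantile is undefined there.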
 
Log-log graphs of $MOE_\gamma(0.5)$ vs sample size n and confidence level γ. The arrows show that the maximum margin of error for a sample size of 1000 is ±3.1% at 95% confidence level, and ±4.1% at 99%.
The inset parabola $\sigma_p^2 = p - p^2$ illustrates the relationship between $\sigma_p^2$ at $p = 0.71$ and $\sigma_{max}^2$ at $p = 0.5$. In the example, MOE95(0.71) ≈ 0.9 × ±3.1% ≈ ±2.8%.

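Since $\max \sigma_P^2 = \max P(1-P) = 0.25$ at $p = 0.5$, we can arbitrarily set $p = \overline{p} = 0.5$, calculate $\sigma_P$, $\sigma_{\overline{p}}$, and $z_\gamma\sigma_{\overline{p}}$ to obtain the maximum margin of error for $P$ at a given confidence level $\gamma$ and sample size $n$, even before having actual results. With $p = 0.5$, $n = 1013$:

$$MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx z_{0.95}\sqrt{\frac{\sigma_P^2}{n}} = 1.96\sqrt{\frac{0.25}{n}} = 0.98/\sqrt{n} = \pm 3.1\%$$
$$MOE_{99}(0.5) = z_{0.99}\sigma_{\overline{p}} \approx z_{0.99}\sqrt{\frac{\sigma_P^2}{n}} = 2.58\sqrt{\frac{0.25}{n}} = 1.29/\sqrt{n} = \pm 4.1\%$$

Also, usefully, for any reported $MOE_{95}$

$$MOE_{99} = \frac{z_{0.99}}{z_{0.95}} MOE_{95} \approx 1.3 \times MOE_{95}$$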
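The maximum margin of error for a sample of 1013 can be computed directly by fixing p at 0.5. The sketch below is illustrative only: the function name `max_margin_of_error` is an assumption, and it uses unrounded quantiles from `statistics.NormalDist`, so its output may differ in the last digit from figures obtained with z rounded to 1.96 and 2.58 (the 99% value comes out at roughly 4.05%).

```python
from math import sqrt
from statistics import NormalDist


def max_margin_of_error(n: int, gamma: float) -> float:
    """Margin of error at p = 0.5, where p(1 - p) reaches its maximum of 0.25."""
    z = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    return z * sqrt(0.25 / n)


n = 1013
moe95 = max_margin_of_error(n, 0.95)
moe99 = max_margin_of_error(n, 0.99)
print(f"MOE95 = ±{moe95:.2%}, MOE99 = ±{moe99:.2%}, ratio = {moe99 / moe95:.2f}")
```

The ratio of the two results does not depend on `n`, which is why a reported 95% margin can be rescaled to 99% by a constant factor of about 1.3.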

Specific margins of error

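If a poll has multiple percentage results (for example, a poll measuring a single multiple-choice preference), the result closest to 50% will have the highest margin of error. Typically, it is this number that is reported as the margin of error for the entire poll. Imagine poll $P$ reports $p_a, p_b, p_c$ as $71\%, 27\%, 2\%$, with $n = 1013$:

$$MOE_{95}(P_a) = z_{0.95}\sigma_{\overline{p_a}} \approx 1.96\sqrt{\frac{p_a(1-p_a)}{n}} = 0.89/\sqrt{n} = \pm 2.8\%$$ (as in the figure above)
$$MOE_{95}(P_b) = z_{0.95}\sigma_{\overline{p_b}} \approx 1.96\sqrt{\frac{p_b(1-p_b)}{n}} = 0.87/\sqrt{n} = \pm 2.7\%$$
$$MOE_{95}(P_c) = z_{0.95}\sigma_{\overline{p_c}} \approx 1.96\sqrt{\frac{p_c(1-p_c)}{n}} = 0.274/\sqrt{n} \approx \pm 0.9\%$$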

As a given percentage approaches the extremes of 0% or 100%, its margin of error approaches ±0%.
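The way the margin shrinks as a result moves away from 50% can be seen by evaluating the formula for each share of a three-way poll. This is a minimal sketch using a 71% / 27% / 2% split with n = 1013; the names `moe95` and `shares` are illustrative, and z is taken as the rounded value 1.96.

```python
from math import sqrt

Z_95 = 1.96  # rounded 95% quantile (assumption: same rounding as the worked figures)


def moe95(p: float, n: int) -> float:
    """95% margin of error for a reported proportion p from n respondents."""
    return Z_95 * sqrt(p * (1.0 - p) / n)


n = 1013
shares = {"a": 0.71, "b": 0.27, "c": 0.02}
margins = {name: moe95(p, n) for name, p in shares.items()}
for name, margin in margins.items():
    print(f"p_{name} = {shares[name]:.0%}: ±{margin:.2%}")
```

The share nearest 50% gets the widest margin, and `moe95(0.0, n)` and `moe95(1.0, n)` both evaluate to zero.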

Comparing percentages

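Imagine multiple-choice poll $P$ reports $p_a, p_b, p_c$ as $46\%, 42\%, 12\%$, with $n = 1013$. As described above, the margin of error reported for the poll would typically be $MOE_{95}(P_a)$, as $p_a$ is closest to 50%. The popular notion of statistical tie or statistical dead heat, however, concerns itself not with the accuracy of the individual results, but with that of the ranking of the results. Which is in first?

If, hypothetically, we were to conduct poll $P$ over subsequent samples of $n$ respondents (newly drawn from $N$), and report result $p_w = p_a - p_b$, we could use the standard error of difference to understand how $p_{w_1}, p_{w_2}, p_{w_3}, \ldots$ is expected to fall about $\overline{p_w}$. For this, we need to apply the sum of variances to obtain a new variance, $\sigma_{P_w}^2$,

$$\sigma_{P_w}^2 = \sigma_{P_a - P_b}^2 = \sigma_{P_a}^2 + \sigma_{P_b}^2 - 2\sigma_{P_a,P_b} = p_a(1-p_a) + p_b(1-p_b) + 2p_a p_b$$

where $\sigma_{P_a,P_b} = -P_a P_b$ is the covariance of $P_a$ and $P_b$.

Thus (after simplifying),

$$\text{Standard error of difference} = \sigma_{\overline{w}} \approx \sqrt{\frac{\sigma_{P_w}^2}{n}} = \sqrt{\frac{p_a + p_b - (p_a - p_b)^2}{n}} = 0.029, \quad P_w = P_a - P_b$$
$$MOE_{95}(P_a) = z_{0.95}\sigma_{\overline{p_a}} \approx \pm 3.1\%$$
$$MOE_{95}(P_w) = z_{0.95}\sigma_{\overline{w}} \approx \pm 5.8\%$$

Note that this assumes that $P_c$ is close to constant, that is, respondents choosing either A or B would almost never choose C (making $P_a$ and $P_b$ close to perfectly negatively correlated). With three or more choices in closer contention, choosing a correct formula for $\sigma_{P_w}^2$ becomes more complicated.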
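The margin of error of a lead can be computed from the variance of the difference, using a covariance of −p_a·p_b between the two shares. The sketch below applies this to a 46% / 42% result with n = 1013; the function name `se_of_difference` is illustrative, z is rounded to 1.96, and the same near-constant third option is assumed.

```python
from math import sqrt

Z_95 = 1.96  # rounded 95% quantile (assumption)


def se_of_difference(p_a: float, p_b: float, n: int) -> float:
    """Standard error of p_a - p_b for two shares measured in the same poll.

    Variance used: p_a(1 - p_a) + p_b(1 - p_b) + 2 p_a p_b,
    which simplifies to p_a + p_b - (p_a - p_b)**2.
    """
    variance = p_a * (1.0 - p_a) + p_b * (1.0 - p_b) + 2.0 * p_a * p_b
    return sqrt(variance / n)


p_a, p_b, n = 0.46, 0.42, 1013
lead = p_a - p_b
moe_lead = Z_95 * se_of_difference(p_a, p_b, n)
print(f"lead = {lead:.1%}, MOE95 of lead = ±{moe_lead:.1%}")
print("lead exceeds its margin of error:", lead > moe_lead)
```

Here the 4-point lead is smaller than the roughly ±5.8% margin on the difference, which is the situation usually described as a statistical tie.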

Effect of finite population size

The formulae above for the margin of error assume that there is an infinitely large population and thus do not depend on the size of population $N$, but only on the sample size $n$. According to sampling theory, this assumption is reasonable when the sampling fraction is small. The margin of error for a particular sampling method is essentially the same regardless of whether the population of interest is the size of a school, city, state, or country, as long as the sampling fraction is small.

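In cases where the sampling fraction is larger (in practice, greater than 5%), analysts might adjust the margin of error using a finite population correction to account for the added precision gained by sampling a much larger percentage of the population. FPC can be calculated using the formula[1]

$$\operatorname{FPC} = \sqrt{\frac{N-n}{N-1}}$$

...and so, if poll $P$ were conducted over 24% of, say, an electorate of 300,000 voters,

$$MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx \frac{0.98}{\sqrt{72{,}000}} = \pm 0.4\%$$
$$MOE_{95_{FPC}}(0.5) = z_{0.95}\sigma_{\overline{p}}\sqrt{\frac{N-n}{N-1}} \approx \frac{0.98}{\sqrt{72{,}000}}\sqrt{\frac{300{,}000 - 72{,}000}{300{,}000 - 1}} = \pm 0.3\%$$

Intuitively, for appropriately large $N$,

$$\lim_{n \to 0}\sqrt{\frac{N-n}{N-1}} \approx 1$$
$$\lim_{n \to N}\sqrt{\frac{N-n}{N-1}} = 0$$

In the former case, $n$ is so small as to require no correction. In the latter case, the poll effectively becomes a census and sampling error becomes moot.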
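The finite population correction √((N − n)/(N − 1)) is straightforward to apply in code. The following sketch reworks the case of 72,000 respondents drawn from an electorate of 300,000; the function name `fpc` and its input checks are illustrative assumptions.

```python
from math import sqrt


def fpc(population: int, sample: int) -> float:
    """Finite population correction sqrt((N - n) / (N - 1))."""
    if not 0 < sample <= population or population < 2:
        raise ValueError("need 0 < n <= N and N >= 2")
    return sqrt((population - sample) / (population - 1))


N = 300_000
n = 72_000                   # 24% of the electorate
moe = 0.98 / sqrt(n)         # 1.96 * sqrt(0.25 / n), uncorrected maximum margin
moe_fpc = moe * fpc(N, n)    # corrected for the large sampling fraction
print(f"uncorrected ±{moe:.2%}, corrected ±{moe_fpc:.2%}")
```

At the two extremes, `fpc(N, 1)` returns 1 (no correction) and `fpc(N, N)` returns 0 (a census).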

See also

  • Engineering tolerance
  • Key relevance
  • Measurement uncertainty
  • Random error

References

  1. Isserlis, L. (1918). "On the value of a mean as calculated from a sample". Journal of the Royal Statistical Society. Blackwell Publishing. 81 (1): 75–81. doi:10.2307/2340569. JSTOR 2340569. (Equation 1)

Sources

  • Sudman, Seymour and Bradburn, Norman (1982). Asking Questions: A Practical Guide to Questionnaire Design. San Francisco: Jossey Bass. ISBN 0-87589-546-8
  • Wonnacott, T.H.; R.J. Wonnacott (1990). Introductory Statistics (5th ed.). Wiley. ISBN 0-471-61518-8.

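External links

  • "Errors, theory of", Encyclopedia of Mathematics, EMS Press, 2001 [1994]
  • Weisstein, Eric W. "Margin of Error". MathWorld.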
