
Bernoulli sampling

In the theory of finite population sampling, Bernoulli sampling is a sampling process where each element of the population is subjected to an independent Bernoulli trial which determines whether the element becomes part of the sample. An essential property of Bernoulli sampling is that all elements of the population have equal probability of being included in the sample.[1]

Bernoulli sampling is therefore a special case of Poisson sampling. In Poisson sampling each element of the population may have a different probability of being included in the sample. In Bernoulli sampling, the probability is equal for all the elements.
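The relationship between the two designs can be illustrated with a minimal Python sketch. The function names `poisson_sample` and `bernoulli_sample` are hypothetical and chosen only for this illustration; the sketch assumes inclusion probabilities are given as floats in [0, 1].

```python
import random

def poisson_sample(items, probabilities, rng=random):
    """Include each item independently with its own inclusion probability."""
    return [item for item, p in zip(items, probabilities) if rng.random() < p]

def bernoulli_sample(items, p, rng=random):
    """Bernoulli sampling: Poisson sampling with one shared probability p."""
    items = list(items)
    return poisson_sample(items, [p] * len(items), rng)

# Hypothetical usage: every item gets the same 20% chance of inclusion.
sample = bernoulli_sample(range(100), 0.2)
```

Because `random.random()` returns values in [0, 1), a probability of 1.0 selects every item and a probability of 0.0 selects none.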

Because each element of the population is considered separately for the sample, the sample size is not fixed but rather follows a binomial distribution.
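As a short worked illustration, assume a population of n elements, a common inclusion probability p, and a resulting sample size k (these symbols are introduced here only for the illustration). The standard moments of the binomial distribution then give the expected sample size, its variance, and its relative spread:

```latex
\mathbb{E}[k] = n p, \qquad
\operatorname{Var}[k] = n p (1 - p), \qquad
\frac{\sqrt{\operatorname{Var}[k]}}{\mathbb{E}[k]} = \sqrt{\frac{1 - p}{n p}}
```

The last ratio shrinks as n grows, so the sample size becomes relatively more predictable for larger populations even though it is never fixed.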

Example

The most basic Bernoulli method generates n random variates to extract a sample from a population of n items. Suppose you want to extract a given percentage pct of the population. The algorithm can be described as follows:[2]

for each item in the set
    generate a random non-negative integer R
    if (R mod 100) < pct then
        select item
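The pseudocode above could be written in Python roughly as follows. This is a sketch rather than a reference implementation: the function name `bernoulli_sample_pct` and the upper bound of the random integer are assumptions made for this example, and `pct` is assumed to be an integer between 0 and 100.

```python
import random

def bernoulli_sample_pct(items, pct, rng=None):
    """Select each item independently with probability pct / 100."""
    rng = rng or random.Random()
    selected = []
    for item in items:
        # Random non-negative integer R. The bound is a multiple of 100
        # (an assumption of this sketch) so that R mod 100 is uniform.
        r = rng.randrange(100 * 10**6)
        if r % 100 < pct:
            selected.append(item)
    return selected

# Hypothetical usage: draw roughly 20% of a population of 1000 items.
sample = bernoulli_sample_pct(range(1000), 20, random.Random(42))
```

One random integer is drawn per item, so the loop generates as many variates as there are items in the population.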
 
Figure: Scaled f(k, n, 0.2) for four values of n.

A percentage of 20%, say, is usually expressed as a probability p = 0.2. In that case, random variates are generated in the unit interval, and an item is selected when its variate is less than p. After running the algorithm, a sample of size k will have been selected. One would expect $k \approx n \cdot p$, an approximation whose relative error tends to shrink as n grows. In fact, the probability of obtaining a sample of size k is given by the binomial distribution:

$$f(k, n, p) = \binom{n}{k} p^k (1 - p)^{n - k}$$

The first figure shows this function for four values of $n$ and $p = 0.2$. In order to compare the values for different values of $n$, the $k$'s on the abscissa are scaled from $[0, n]$ to the unit interval, while the value of the function, on the ordinate, is multiplied by the inverse factor $n$, so that the area under the graph maintains the same value; that area is related to the corresponding cumulative distribution function. The values are shown on a logarithmic scale.
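The binomial probability f(k, n, p) can be evaluated directly for moderate n. The following is a minimal sketch; the function name `binomial_pmf` is an assumption made for this example, and for very large n the powers can underflow, so a log-space formulation would be preferable there.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability that a Bernoulli sample from n items has exactly k items."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical usage: distribution of the sample size for n = 10, p = 0.2.
probabilities = [binomial_pmf(k, 10, 0.2) for k in range(11)]
```

The probabilities over k = 0, ..., n sum to 1, which gives a quick sanity check on any implementation.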

 
Figure: Values of n such that a Bernoulli sample size is within error in 95% of cases.

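The second figure shows the minimum values of $n$ that satisfy given error bounds with 95% probability. Given an error, the set of $k$'s within bounds can be described as follows:

$$K_{n,p} = \left\{ k \in \mathbb{N} : \left| \frac{k}{n} - p \right| < \mathrm{error} \right\}$$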

The probability of ending up within $K_{n,p}$ is again given by the binomial distribution:

$$\sum_{k \in K_{n,p}} f(k, n, p)$$

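The picture shows the lowest values of $n$ such that the sum is at least 0.95. For $p = 0.0$ and $p = 1.00$ the algorithm delivers exact results for all $n$'s. The $p$'s in between are obtained by bisection. Note that, if $100 \cdot p$ is an integer percentage, $\mathrm{error} = 0.005$ guarantees that $100 \cdot k / n$ rounds to $100 \cdot p$, since the two then differ by less than 0.5. Values as high as $n = 38400$ can be required for such an exact match.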
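The probability of staying within the error bound, and the lowest n that reaches a given level, can be sketched as follows. The function names, the `n_max` cutoff, and the use of a plain linear scan are assumptions of this example, not the method used to produce the figure. The probabilities are computed in log space via `math.lgamma` so that large n does not cause underflow, and the sketch assumes 0 < p < 1.

```python
from math import exp, lgamma, log

def log_binomial_pmf(k, n, p):
    """log f(k, n, p) for 0 < p < 1, computed in log space."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def coverage(n, p, error):
    """Probability that |k/n - p| < error for a Bernoulli sample of n items."""
    return sum(exp(log_binomial_pmf(k, n, p))
               for k in range(n + 1)
               if abs(k / n - p) < error)

def lowest_n(p, error, level=0.95, n_max=10_000):
    """Smallest n in 1..n_max whose coverage reaches level (linear scan)."""
    for n in range(1, n_max + 1):
        if coverage(n, p, error) >= level:
            return n
    return None  # level not reached within n_max

# Hypothetical usage: a loose 10-point error bound around p = 0.2.
n_needed = lowest_n(0.2, 0.1)
```

Each call to `coverage` costs time proportional to n, so the linear scan is only practical for loose error bounds; tight bounds such as 0.005 would need a faster search or a cumulative-distribution routine.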

See also

  • Poisson sampling
  • Bernoulli trial
  • Bernoulli process
  • Sampling design

References

  1. Carl-Erik Särndal; Bengt Swensson; Jan Wretman (1992). Model Assisted Survey Sampling. ISBN 978-0-387-97528-3.
  2. Voratas Kachitvichyanukul; Bruce W. Schmeiser (1 February 1988). "Binomial Random Variate Generation". Communications of the ACM. 31 (2): 216–222. doi:10.1145/42372.42381. S2CID 18698828.

External links

  • Faster Random Samples With Gap Sampling
