fbpx
Wikipedia

Box–Muller transform

The Box–Muller transform, by George Edward Pelham Box and Mervin Edgar Muller,[1] is a random number sampling method for generating pairs of independent, standard, normally distributed (zero expectation, unit variance) random numbers, given a source of uniformly distributed random numbers. The method was in fact first mentioned explicitly by Raymond E. A. C. Paley and Norbert Wiener in 1934.[2]

Visualisation of the Box–Muller transform — the coloured points in the unit square (u1, u2), drawn as circles, are mapped to a 2D Gaussian (z0, z1), drawn as crosses. The plots at the margins are the probability distribution functions of z0 and z1. z0 and z1 are unbounded; they appear to be in [−2.5, 2.5] due to the choice of the illustrated points. In the SVG file, hover over a point to highlight it and its corresponding point.

The Box–Muller transform is commonly expressed in two forms. The basic form as given by Box and Muller takes two samples from the uniform distribution on the interval (0, 1) and maps them to two standard, normally distributed samples. The polar form takes two samples from a different interval, [−1, +1], and maps them to two normally distributed samples without the use of sine or cosine functions.

The Box–Muller transform was developed as a more computationally efficient alternative to the inverse transform sampling method.[3] The ziggurat algorithm gives a more efficient method for scalar processors (e.g. old CPUs), while the Box–Muller transform is superior for processors with vector units (e.g. GPUs or modern CPUs).[4]

Basic form edit

Suppose U1 and U2 are independent samples chosen from the uniform distribution on the unit interval (0, 1). Let

 
and
 

Then Z0 and Z1 are independent random variables with a standard normal distribution.

The derivation[5] is based on a property of a two-dimensional Cartesian system, where X and Y coordinates are described by two independent and normally distributed random variables, the random variables for R2 and Θ (shown above) in the corresponding polar coordinates are also independent and can be expressed as

 
and
 

Because R2 is the square of the norm of the standard bivariate normal variable (X, Y), it has the chi-squared distribution with two degrees of freedom. In the special case of two degrees of freedom, the chi-squared distribution coincides with the exponential distribution, and the equation for R2 above is a simple way of generating the required exponential variate.

Polar form edit

 
Two uniformly distributed values, u and v are used to produce the value s = R2, which is likewise uniformly distributed. The definitions of the sine and cosine are then applied to the basic form of the Box–Muller transform to avoid using trigonometric functions.

The polar form was first proposed by J. Bell[6] and then modified by R. Knop.[7] While several different versions of the polar method have been described, the version of R. Knop will be described here because it is the most widely used, in part due to its inclusion in Numerical Recipes. A slightly different form is described as "Algorithm P" by D. Knuth in The Art of Computer Programming.[8]

Given u and v, independent and uniformly distributed in the closed interval [−1, +1], set s = R2 = u2 + v2. If s = 0 or s ≥ 1, discard u and v, and try another pair (u, v). Because u and v are uniformly distributed and because only points within the unit circle have been admitted, the values of s will be uniformly distributed in the open interval (0, 1), too. The latter can be seen by calculating the cumulative distribution function for s in the interval (0, 1). This is the area of a circle with radius  , divided by  . From this we find the probability density function to have the constant value 1 on the interval (0, 1). Equally so, the angle θ divided by   is uniformly distributed in the interval [0, 1) and independent of s.

We now identify the value of s with that of U1 and   with that of U2 in the basic form. As shown in the figure, the values of   and   in the basic form can be replaced with the ratios   and  , respectively. The advantage is that calculating the trigonometric functions directly can be avoided. This is helpful when trigonometric functions are more expensive to compute than the single division that replaces each one.

Just as the basic form produces two standard normal deviates, so does this alternate calculation.

 
and
 

Contrasting the two forms edit

The polar method differs from the basic method in that it is a type of rejection sampling. It discards some generated random numbers, but can be faster than the basic method because it is simpler to compute (provided that the random number generator is relatively fast) and is more numerically robust.[9] Avoiding the use of expensive trigonometric functions improves speed over the basic form.[6] It discards 1 − π/4 ≈ 21.46% of the total input uniformly distributed random number pairs generated, i.e. discards 4/π − 1 ≈ 27.32% uniformly distributed random number pairs per Gaussian random number pair generated, requiring 4/π ≈ 1.2732 input random numbers per output random number.

The basic form requires two multiplications, 1/2 logarithm, 1/2 square root, and one trigonometric function for each normal variate.[10] On some processors, the cosine and sine of the same argument can be calculated in parallel using a single instruction. Notably for Intel-based machines, one can use the fsincos assembler instruction or the expi instruction (usually available from C as an intrinsic function), to calculate complex

 
and just separate the real and imaginary parts.

Note: To explicitly calculate the complex-polar form use the following substitutions in the general form,

Let   and   Then

 

The polar form requires 3/2 multiplications, 1/2 logarithm, 1/2 square root, and 1/2 division for each normal variate. The effect is to replace one multiplication and one trigonometric function with a single division and a conditional loop.

Tails truncation edit

When a computer is used to produce a uniform random variable it will inevitably have some inaccuracies because there is a lower bound on how close numbers can be to 0. If the generator uses 32 bits per output value, the smallest non-zero number that can be generated is  . When   and   are equal to this the Box–Muller transform produces a normal random deviate equal to  . This means that the algorithm will not produce random variables more than 6.660 standard deviations from the mean. This corresponds to a proportion of   lost due to the truncation, where   is the standard cumulative normal distribution. With 64 bits the limit is pushed to   standard deviations, for which  .

Implementation edit

C++ edit

The standard Box–Muller transform generates values from the standard normal distribution (i.e. standard normal deviates) with mean 0 and standard deviation 1. The implementation below in standard C++ generates values from any normal distribution with mean   and variance  . If   is a standard normal deviate, then   will have a normal distribution with mean   and standard deviation  . The random number generator has been seeded to ensure that new, pseudo-random values will be returned from sequential calls to the generateGaussianNoise function.

#include <cmath> #include <limits> #include <random> #include <utility> //"mu" is the mean of the distribution, and "sigma" is the standard deviation. std::pair<double, double> generateGaussianNoise(double mu, double sigma) {  constexpr double two_pi = 2.0 * M_PI;  //initialize the random uniform number generator (runif) in a range 0 to 1  static std::mt19937 rng(std::random_device{}()); // Standard mersenne_twister_engine seeded with rd()  static std::uniform_real_distribution<> runif(0.0, 1.0);  //create two random numbers, make sure u1 is greater than zero  double u1, u2;  do  {  u1 = runif(rng);  }  while (u1 == 0);  u2 = runif(rng);  //compute z0 and z1  auto mag = sigma * sqrt(-2.0 * log(u1));  auto z0 = mag * cos(two_pi * u2) + mu;  auto z1 = mag * sin(two_pi * u2) + mu;  return std::make_pair(z0, z1); } 

Julia edit

"""  boxmullersample(N) Generate `2N` samples from the standard normal distribution using the Box-Muller method. """ function boxmullersample(N)  z = Array{Float64}(undef,N,2);  for i in axes(z,1)  z[i,:] .= sincospi(2 * rand());  z[i,:] .*= sqrt(-2 * log(rand()));  end  vec(z) end """  boxmullersample(n,μ,σ) Generate `n` samples from the normal distribution with mean `μ` and standard deviation `σ` using the Box-Muller method. """ function boxmullersample(n,μ,σ)  μ .+ σ*boxmullersample(cld(n,2))[1:n]; end 

See also edit

References edit

  1. ^ Box, G. E. P.; Muller, Mervin E. (1958). "A Note on the Generation of Random Normal Deviates". The Annals of Mathematical Statistics. 29 (2): 610–611. doi:10.1214/aoms/1177706645. JSTOR 2237361.
  2. ^ Raymond E. A. C. Paley and Norbert Wiener Fourier Transforms in the Complex Domain, New York: American Mathematical Society (1934) §37.
  3. ^ Kloeden and Platen, Numerical Solutions of Stochastic Differential Equations, pp. 11–12
  4. ^ Howes, Lee; Thomas, David (2008). GPU Gems 3 - Efficient Random Number Generation and Application Using CUDA. Pearson Education, Inc. ISBN 978-0-321-51526-1.
  5. ^ Sheldon Ross, A First Course in Probability, (2002), pp. 279–281
  6. ^ a b Bell, James R. (1968). "Algorithm 334: Normal random deviates". Communications of the ACM. 11 (7): 498. doi:10.1145/363397.363547.
  7. ^ Knop, R. (1969). "Remark on algorithm 334 [G5]: Normal random deviates". Communications of the ACM. 12 (5): 281. doi:10.1145/362946.362996.
  8. ^ Knuth, Donald (1998). The Art of Computer Programming: Volume 2: Seminumerical Algorithms. p. 122. ISBN 0-201-89684-2.
  9. ^ Everett F. Carter, Jr., The Generation and Application of Random Numbers, Forth Dimensions (1994), Vol. 16, No. 1 & 2.
  10. ^ The evaluation of 2πU1 is counted as one multiplication because the value of 2π can be computed in advance and used repeatedly.

External links edit

muller, transform, george, edward, pelham, mervin, edgar, muller, random, number, sampling, method, generating, pairs, independent, standard, normally, distributed, zero, expectation, unit, variance, random, numbers, given, source, uniformly, distributed, rand. The Box Muller transform by George Edward Pelham Box and Mervin Edgar Muller 1 is a random number sampling method for generating pairs of independent standard normally distributed zero expectation unit variance random numbers given a source of uniformly distributed random numbers The method was in fact first mentioned explicitly by Raymond E A C Paley and Norbert Wiener in 1934 2 Visualisation of the Box Muller transform the coloured points in the unit square u1 u2 drawn as circles are mapped to a 2D Gaussian z0 z1 drawn as crosses The plots at the margins are the probability distribution functions of z0 and z1 z0 and z1 are unbounded they appear to be in 2 5 2 5 due to the choice of the illustrated points In the SVG file hover over a point to highlight it and its corresponding point The Box Muller transform is commonly expressed in two forms The basic form as given by Box and Muller takes two samples from the uniform distribution on the interval 0 1 and maps them to two standard normally distributed samples The polar form takes two samples from a different interval 1 1 and maps them to two normally distributed samples without the use of sine or cosine functions The Box Muller transform was developed as a more computationally efficient alternative to the inverse transform sampling method 3 The ziggurat algorithm gives a more efficient method for scalar processors e g old CPUs while the Box Muller transform is superior for processors with vector units e g GPUs or modern CPUs 4 Contents 1 Basic form 2 Polar form 3 Contrasting the two forms 4 Tails truncation 5 Implementation 5 1 C 5 2 Julia 6 See also 7 References 8 External linksBasic form editSuppose U1 and U2 are independent samples chosen from the uniform distribution on the unit interval 0 1 LetZ 0 R cos 8 2 ln U 1 cos 2 p U 2 displaystyle Z 0 R cos Theta sqrt 2 ln U 1 cos 2 pi U 2 nbsp and Z 1 R sin 8 2 ln U 1 sin 2 p U 2 displaystyle Z 1 R sin Theta sqrt 2 ln U 1 sin 2 pi U 2 nbsp Then Z0 and Z1 are independent random variables with a standard normal distribution The derivation 5 is based on a property of a two dimensional Cartesian system where X and Y coordinates are described by two independent and normally distributed random variables the random variables for R2 and 8 shown above in the corresponding polar coordinates are also independent and can be expressed asR 2 2 ln U 1 displaystyle R 2 2 cdot ln U 1 nbsp and 8 2 p U 2 displaystyle Theta 2 pi U 2 nbsp Because R2 is the square of the norm of the standard bivariate normal variable X Y it has the chi squared distribution with two degrees of freedom In the special case of two degrees of freedom the chi squared distribution coincides with the exponential distribution and the equation for R2 above is a simple way of generating the required exponential variate Polar form editMain article Marsaglia polar method nbsp Two uniformly distributed values u and v are used to produce the value s R2 which is likewise uniformly distributed The definitions of the sine and cosine are then applied to the basic form of the Box Muller transform to avoid using trigonometric functions The polar form was first proposed by J Bell 6 and then modified by R Knop 7 While several different versions of the polar method have been described the version of R Knop will be described here because it is the most widely used in part due to its inclusion in Numerical Recipes A slightly different form is described as Algorithm P by D Knuth in The Art of Computer Programming 8 Given u and v independent and uniformly distributed in the closed interval 1 1 set s R2 u2 v2 If s 0 or s 1 discard u and v and try another pair u v Because u and v are uniformly distributed and because only points within the unit circle have been admitted the values of s will be uniformly distributed in the open interval 0 1 too The latter can be seen by calculating the cumulative distribution function for s in the interval 0 1 This is the area of a circle with radius s textstyle sqrt s nbsp divided by p displaystyle pi nbsp From this we find the probability density function to have the constant value 1 on the interval 0 1 Equally so the angle 8 divided by 2 p displaystyle 2 pi nbsp is uniformly distributed in the interval 0 1 and independent of s We now identify the value of s with that of U1 and 8 2 p displaystyle theta 2 pi nbsp with that of U2 in the basic form As shown in the figure the values of cos 8 cos 2 p U 2 displaystyle cos theta cos 2 pi U 2 nbsp and sin 8 sin 2 p U 2 displaystyle sin theta sin 2 pi U 2 nbsp in the basic form can be replaced with the ratios cos 8 u R u s displaystyle cos theta u R u sqrt s nbsp and sin 8 v R v s textstyle sin theta v R v sqrt s nbsp respectively The advantage is that calculating the trigonometric functions directly can be avoided This is helpful when trigonometric functions are more expensive to compute than the single division that replaces each one Just as the basic form produces two standard normal deviates so does this alternate calculation z 0 2 ln U 1 cos 2 p U 2 2 ln s u s u 2 ln s s displaystyle z 0 sqrt 2 ln U 1 cos 2 pi U 2 sqrt 2 ln s left frac u sqrt s right u cdot sqrt frac 2 ln s s nbsp and z 1 2 ln U 1 sin 2 p U 2 2 ln s v s v 2 ln s s displaystyle z 1 sqrt 2 ln U 1 sin 2 pi U 2 sqrt 2 ln s left frac v sqrt s right v cdot sqrt frac 2 ln s s nbsp Contrasting the two forms editThe polar method differs from the basic method in that it is a type of rejection sampling It discards some generated random numbers but can be faster than the basic method because it is simpler to compute provided that the random number generator is relatively fast and is more numerically robust 9 Avoiding the use of expensive trigonometric functions improves speed over the basic form 6 It discards 1 p 4 21 46 of the total input uniformly distributed random number pairs generated i e discards 4 p 1 27 32 uniformly distributed random number pairs per Gaussian random number pair generated requiring 4 p 1 2732 input random numbers per output random number The basic form requires two multiplications 1 2 logarithm 1 2 square root and one trigonometric function for each normal variate 10 On some processors the cosine and sine of the same argument can be calculated in parallel using a single instruction Notably for Intel based machines one can use the fsincos assembler instruction or the expi instruction usually available from C as an intrinsic function to calculate complexexp i z e i z cos z i sin z displaystyle exp iz e iz cos z i sin z nbsp and just separate the real and imaginary parts Note To explicitly calculate the complex polar form use the following substitutions in the general form Let r ln u 1 textstyle r sqrt ln u 1 nbsp and z 2 p u 2 textstyle z 2 pi u 2 nbsp Thenr e i z ln u 1 e i 2 p u 2 2 ln u 1 cos 2 p u 2 i sin 2 p u 2 displaystyle re iz sqrt ln u 1 e i2 pi u 2 sqrt 2 ln u 1 left cos 2 pi u 2 i sin 2 pi u 2 right nbsp The polar form requires 3 2 multiplications 1 2 logarithm 1 2 square root and 1 2 division for each normal variate The effect is to replace one multiplication and one trigonometric function with a single division and a conditional loop Tails truncation editWhen a computer is used to produce a uniform random variable it will inevitably have some inaccuracies because there is a lower bound on how close numbers can be to 0 If the generator uses 32 bits per output value the smallest non zero number that can be generated is 2 32 displaystyle 2 32 nbsp When U 1 displaystyle U 1 nbsp and U 2 displaystyle U 2 nbsp are equal to this the Box Muller transform produces a normal random deviate equal to d 2 ln 2 32 cos 2 p 2 32 6 660 textstyle delta sqrt 2 ln 2 32 cos 2 pi 2 32 approx 6 660 nbsp This means that the algorithm will not produce random variables more than 6 660 standard deviations from the mean This corresponds to a proportion of 2 1 F d 2 738 10 11 displaystyle 2 1 Phi delta simeq 2 738 times 10 11 nbsp lost due to the truncation where F d displaystyle Phi delta nbsp is the standard cumulative normal distribution With 64 bits the limit is pushed to d 9 419 displaystyle delta 9 419 nbsp standard deviations for which 2 1 F d lt 5 10 21 displaystyle 2 1 Phi delta lt 5 times 10 21 nbsp Implementation editC edit The standard Box Muller transform generates values from the standard normal distribution i e standard normal deviates with mean 0 and standard deviation 1 The implementation below in standard C generates values from any normal distribution with mean m displaystyle mu nbsp and variance s 2 displaystyle sigma 2 nbsp If Z displaystyle Z nbsp is a standard normal deviate then X Z s m displaystyle X Z sigma mu nbsp will have a normal distribution with mean m displaystyle mu nbsp and standard deviation s displaystyle sigma nbsp The random number generator has been seeded to ensure that new pseudo random values will be returned from sequential calls to the generateGaussianNoise function include lt cmath gt include lt limits gt include lt random gt include lt utility gt mu is the mean of the distribution and sigma is the standard deviation std pair lt double double gt generateGaussianNoise double mu double sigma constexpr double two pi 2 0 M PI initialize the random uniform number generator runif in a range 0 to 1 static std mt19937 rng std random device Standard mersenne twister engine seeded with rd static std uniform real distribution lt gt runif 0 0 1 0 create two random numbers make sure u1 is greater than zero double u1 u2 do u1 runif rng while u1 0 u2 runif rng compute z0 and z1 auto mag sigma sqrt 2 0 log u1 auto z0 mag cos two pi u2 mu auto z1 mag sin two pi u2 mu return std make pair z0 z1 Julia edit boxmullersample N Generate 2N samples from the standard normal distribution using the Box Muller method function boxmullersample N z Array Float64 undef N 2 for i in axes z 1 z i sincospi 2 rand z i sqrt 2 log rand end vec z end boxmullersample n m s Generate n samples from the normal distribution with mean m and standard deviation s using the Box Muller method function boxmullersample n m s m s boxmullersample cld n 2 1 n endSee also editInverse transform sampling Marsaglia polar method similar transform to Box Muller which uses Cartesian coordinates instead of polar coordinatesReferences edit Box G E P Muller Mervin E 1958 A Note on the Generation of Random Normal Deviates The Annals of Mathematical Statistics 29 2 610 611 doi 10 1214 aoms 1177706645 JSTOR 2237361 Raymond E A C Paley and Norbert Wiener Fourier Transforms in the Complex Domain New York American Mathematical Society 1934 37 Kloeden and Platen Numerical Solutions of Stochastic Differential Equations pp 11 12 Howes Lee Thomas David 2008 GPU Gems 3 Efficient Random Number Generation and Application Using CUDA Pearson Education Inc ISBN 978 0 321 51526 1 Sheldon Ross A First Course in Probability 2002 pp 279 281 a b Bell James R 1968 Algorithm 334 Normal random deviates Communications of the ACM 11 7 498 doi 10 1145 363397 363547 Knop R 1969 Remark on algorithm 334 G5 Normal random deviates Communications of the ACM 12 5 281 doi 10 1145 362946 362996 Knuth Donald 1998 The Art of Computer Programming Volume 2 Seminumerical Algorithms p 122 ISBN 0 201 89684 2 Everett F Carter Jr The Generation and Application of Random Numbers Forth Dimensions 1994 Vol 16 No 1 amp 2 The evaluation of 2p U1 is counted as one multiplication because the value of 2p can be computed in advance and used repeatedly External links editWeisstein Eric W Box Muller Transformation MathWorld How to Convert a Uniform Distribution to a Gaussian Distribution C Code Retrieved from https en wikipedia org w index php title Box Muller transform amp oldid 1201417277, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.