fbpx
Wikipedia

Statistical randomness

A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll or the digits of π exhibit statistical randomness.[1]

Statistical randomness does not necessarily imply "true" randomness, i.e., objective unpredictability. Pseudorandomness is sufficient for many uses, such as statistics, hence the name statistical randomness.

Global randomness and local randomness are different. Most philosophical conceptions of randomness are global—because they are based on the idea that "in the long run" a sequence looks truly random, even if certain sub-sequences would not look random. In a "truly" random sequence of numbers of sufficient length, for example, it is probable there would be long sequences of nothing but repeating numbers, though on the whole the sequence might be random. Local randomness refers to the idea that there can be minimum sequence lengths in which random distributions are approximated. Long stretches of the same numbers, even those generated by "truly" random processes, would diminish the "local randomness" of a sample (it might only be locally random for sequences of 10,000 numbers; taking sequences of less than 1,000 might not appear random at all, for example).

A sequence exhibiting a pattern is not thereby proved not statistically random. According to principles of Ramsey theory, sufficiently large objects must necessarily contain a given substructure ("complete disorder is impossible").

Legislation concerning gambling imposes certain standards of statistical randomness to slot machines.

Tests

The first tests for random numbers were published by M.G. Kendall and Bernard Babington Smith in the Journal of the Royal Statistical Society in 1938.[2] They were built on statistical tools such as Pearson's chi-squared test that were developed to distinguish whether experimental phenomena matched their theoretical probabilities. Pearson developed his test originally by showing that a number of dice experiments by W.F.R. Weldon did not display "random" behavior.

Kendall and Smith's original four tests were hypothesis tests, which took as their null hypothesis the idea that each number in a given random sequence had an equal chance of occurring, and that various other patterns in the data should be also distributed equiprobably.

  • The frequency test, was very basic: checking to make sure that there were roughly the same number of 0s, 1s, 2s, 3s, etc.
  • The serial test, did the same thing but for sequences of two digits at a time (00, 01, 02, etc.), comparing their observed frequencies with their hypothetical predictions were they equally distributed.
  • The poker test, tested for certain sequences of five numbers at a time (AAAAA, AAAAB, AAABB, etc.) based on hands in the game poker.
  • The gap test, looked at the distances between zeroes (00 would be a distance of 0, 030 would be a distance of 1, 02250 would be a distance of 3, etc.).

If a given sequence was able to pass all of these tests within a given degree of significance (generally 5%), then it was judged to be, in their words "locally random". Kendall and Smith differentiated "local randomness" from "true randomness" in that many sequences generated with truly random methods might not display "local randomness" to a given degree — very large sequences might contain many rows of a single digit. This might be "random" on the scale of the entire sequence, but in a smaller block it would not be "random" (it would not pass their tests), and would be useless for a number of statistical applications.

As random number sets became more and more common, more tests, of increasing sophistication were used. Some modern tests plot random digits as points on a three-dimensional plane, which can then be rotated to look for hidden patterns. In 1995, the statistician George Marsaglia created a set of tests known as the diehard tests, which he distributes with a CD-ROM of 5 billion pseudorandom numbers. In 2015, Yongge Wang distributed a Java software package [3] for statistically distance based randomness testing.

Pseudorandom number generators require tests as exclusive verifications for their "randomness," as they are decidedly not produced by "truly random" processes, but rather by deterministic algorithms. Over the history of random number generation, many sources of numbers thought to appear "random" under testing have later been discovered to be very non-random when subjected to certain types of tests. The notion of quasi-random numbers was developed to circumvent some of these problems, though pseudorandom number generators are still extensively used in many applications (even ones known to be extremely "non-random"), as they are "good enough" for most applications.

Other tests:

  • The Monobit test treats each output bit of the random number generator as a coin flip test, and determine if the observed number of heads and tails are close to the expected 50% frequency. The number of heads in a coin flip trail forms a binomial distribution.
  • The Wald–Wolfowitz runs test tests for the number of bit transitions between 0 bits, and 1 bits, comparing the observed frequencies with expected frequency of a random bit sequence.
  • Information entropy
  • Autocorrelation test
  • Kolmogorov–Smirnov test
  • Statistically distance based randomness test. Yongge Wang showed [4][5] that NIST SP800-22 testing standards are not sufficient to detect some weakness in randomness generators and proposed statistically distance based randomness test.
  • Spectral Density Estimation[6] - performing a Fourier transform on a "random" signal transforms it into a sum of periodic functions in order to detect non random repetitive trends
  • Maurer's Universal Statistical Test
  • The Diehard tests

See also

References

  1. ^ Pi seems a good random number generator – but not always the best, Chad Boutin, Purdue University
  2. ^ Kendall, M.G.; Smith, B. Babington (1938). "Randomness and Random Sampling Numbers". Journal of the Royal Statistical Society. 101 (1): 147–166. doi:10.2307/2980655. JSTOR 2980655.
  3. ^ Yongge Wang. Statistical Testing Techniques For Pseudorandom generation. http://webpages.uncc.edu/yonwang/liltest/
  4. ^ Yongge Wang: On the Design of LIL Tests for (Pseudo) Random Generators and Some Experimental Results. PDF
  5. ^ Wang, Yongge; Nicol, Tony (2015). "Statistical Properties of Pseudo Random Sequences and Experiments with PHP and Debian OpenSSL". Computers and Security. 53: 44–64. doi:10.1016/j.cose.2015.05.005.
  6. ^ Knuth, Donald (1998). The Art of Computer Programming Vol. 2 : Seminumerical Algorithms. Addison Wesley. pp. 93–118. ISBN 978-0-201-89684-8.

External links

  • DieHarder: A free (GPL) C Random Number Test Suite.
  • Generating Normal Distributed Random Numbers

statistical, randomness, numeric, sequence, said, statistically, random, when, contains, recognizable, patterns, regularities, sequences, such, results, ideal, dice, roll, digits, exhibit, statistical, randomness, does, necessarily, imply, true, randomness, ob. A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities sequences such as the results of an ideal dice roll or the digits of p exhibit statistical randomness 1 Statistical randomness does not necessarily imply true randomness i e objective unpredictability Pseudorandomness is sufficient for many uses such as statistics hence the name statistical randomness Global randomness and local randomness are different Most philosophical conceptions of randomness are global because they are based on the idea that in the long run a sequence looks truly random even if certain sub sequences would not look random In a truly random sequence of numbers of sufficient length for example it is probable there would be long sequences of nothing but repeating numbers though on the whole the sequence might be random Local randomness refers to the idea that there can be minimum sequence lengths in which random distributions are approximated Long stretches of the same numbers even those generated by truly random processes would diminish the local randomness of a sample it might only be locally random for sequences of 10 000 numbers taking sequences of less than 1 000 might not appear random at all for example A sequence exhibiting a pattern is not thereby proved not statistically random According to principles of Ramsey theory sufficiently large objects must necessarily contain a given substructure complete disorder is impossible Legislation concerning gambling imposes certain standards of statistical randomness to slot machines Contents 1 Tests 2 See also 3 References 4 External linksTests EditMain article Randomness test The first tests for random numbers were published by M G Kendall and Bernard Babington Smith in the Journal of the Royal Statistical Society in 1938 2 They were built on statistical tools such as Pearson s chi squared test that were developed to distinguish whether experimental phenomena matched their theoretical probabilities Pearson developed his test originally by showing that a number of dice experiments by W F R Weldon did not display random behavior Kendall and Smith s original four tests were hypothesis tests which took as their null hypothesis the idea that each number in a given random sequence had an equal chance of occurring and that various other patterns in the data should be also distributed equiprobably The frequency test was very basic checking to make sure that there were roughly the same number of 0s 1s 2s 3s etc The serial test did the same thing but for sequences of two digits at a time 00 01 02 etc comparing their observed frequencies with their hypothetical predictions were they equally distributed The poker test tested for certain sequences of five numbers at a time AAAAA AAAAB AAABB etc based on hands in the game poker The gap test looked at the distances between zeroes 00 would be a distance of 0 030 would be a distance of 1 02250 would be a distance of 3 etc If a given sequence was able to pass all of these tests within a given degree of significance generally 5 then it was judged to be in their words locally random Kendall and Smith differentiated local randomness from true randomness in that many sequences generated with truly random methods might not display local randomness to a given degree very large sequences might contain many rows of a single digit This might be random on the scale of the entire sequence but in a smaller block it would not be random it would not pass their tests and would be useless for a number of statistical applications As random number sets became more and more common more tests of increasing sophistication were used Some modern tests plot random digits as points on a three dimensional plane which can then be rotated to look for hidden patterns In 1995 the statistician George Marsaglia created a set of tests known as the diehard tests which he distributes with a CD ROM of 5 billion pseudorandom numbers In 2015 Yongge Wang distributed a Java software package 3 for statistically distance based randomness testing Pseudorandom number generators require tests as exclusive verifications for their randomness as they are decidedly not produced by truly random processes but rather by deterministic algorithms Over the history of random number generation many sources of numbers thought to appear random under testing have later been discovered to be very non random when subjected to certain types of tests The notion of quasi random numbers was developed to circumvent some of these problems though pseudorandom number generators are still extensively used in many applications even ones known to be extremely non random as they are good enough for most applications Other tests The Monobit test treats each output bit of the random number generator as a coin flip test and determine if the observed number of heads and tails are close to the expected 50 frequency The number of heads in a coin flip trail forms a binomial distribution The Wald Wolfowitz runs test tests for the number of bit transitions between 0 bits and 1 bits comparing the observed frequencies with expected frequency of a random bit sequence Information entropy Autocorrelation test Kolmogorov Smirnov test Statistically distance based randomness test Yongge Wang showed 4 5 that NIST SP800 22 testing standards are not sufficient to detect some weakness in randomness generators and proposed statistically distance based randomness test Spectral Density Estimation 6 performing a Fourier transform on a random signal transforms it into a sum of periodic functions in order to detect non random repetitive trends Maurer s Universal Statistical Test The Diehard testsSee also EditAlgorithmic randomness Checking Complete spatial randomness Normal number One time pad Random error Randomness Randomness tests Statistical hypothesis testing Seven states of randomness TestU01References Edit Pi seems a good random number generator but not always the best Chad Boutin Purdue University Kendall M G Smith B Babington 1938 Randomness and Random Sampling Numbers Journal of the Royal Statistical Society 101 1 147 166 doi 10 2307 2980655 JSTOR 2980655 Yongge Wang Statistical Testing Techniques For Pseudorandom generation http webpages uncc edu yonwang liltest Yongge Wang On the Design of LIL Tests for Pseudo Random Generators and Some Experimental Results PDF Wang Yongge Nicol Tony 2015 Statistical Properties of Pseudo Random Sequences and Experiments with PHP and Debian OpenSSL Computers and Security 53 44 64 doi 10 1016 j cose 2015 05 005 Knuth Donald 1998 The Art of Computer Programming Vol 2 Seminumerical Algorithms Addison Wesley pp 93 118 ISBN 978 0 201 89684 8 External links EditDieHarder A free GPL C Random Number Test Suite Generating Normal Distributed Random Numbers Retrieved from https en wikipedia org w index php title Statistical randomness amp oldid 1073021351, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.