Wikipedia

Nonparametric statistics

Nonparametric statistics is the type of statistics that is not restricted by assumptions concerning the nature of the population from which a sample is drawn. This is opposed to parametric statistics, for which a problem is restricted a priori by assumptions concerning the specific distribution of the population (such as the normal distribution) and parameters (such as the mean or variance). Nonparametric statistics is based on either not assuming a particular distribution or having a distribution specified but with the distribution's parameters not specified in advance (though a parameter may be generated by the data, such as the median). Nonparametric statistics can be used for descriptive statistics or statistical inference. Nonparametric tests are often used when the assumptions of parametric tests are evidently violated.[1]

Definitions

The term "nonparametric statistics" has been defined imprecisely in the following two ways, among others:

  1. The first meaning of nonparametric involves techniques that do not rely on data belonging to any particular parametric family of probability distributions.

    These include, among others:

    • Methods which are distribution-free, which do not rely on assumptions that the data are drawn from a given parametric family of probability distributions.
    • Statistics defined to be a function on a sample, without dependency on a parameter.

    An example is order statistics, which are based on the ordinal ranking of observations.

    The discussion following is taken from Kendall's Advanced Theory of Statistics.[2]

    Statistical hypotheses concern the behavior of observable random variables.... For example, the hypothesis (a) that a normal distribution has a specified mean and variance is statistical; so is the hypothesis (b) that it has a given mean but unspecified variance; so is the hypothesis (c) that a distribution is of normal form with both mean and variance unspecified; finally, so is the hypothesis (d) that two unspecified continuous distributions are identical.

    It will have been noticed that in the examples (a) and (b) the distribution underlying the observations was taken to be of a certain form (the normal) and the hypothesis was concerned entirely with the value of one or both of its parameters. Such a hypothesis, for obvious reasons, is called parametric.

    Hypothesis (c) was of a different nature, as no parameter values are specified in the statement of the hypothesis; we might reasonably call such a hypothesis non-parametric. Hypothesis (d) is also non-parametric but, in addition, it does not even specify the underlying form of the distribution and may now be reasonably termed distribution-free. Notwithstanding these distinctions, the statistical literature now commonly applies the label "non-parametric" to test procedures that we have just termed "distribution-free", thereby losing a useful classification.

  2. The second meaning of non-parametric involves techniques that do not assume that the structure of a model is fixed. Typically, the model grows in size to accommodate the complexity of the data. In these techniques, individual variables are typically assumed to belong to parametric distributions, and assumptions about the types of associations among variables are also made. These techniques include, among others:
    • non-parametric regression, which is modeling whereby the structure of the relationship between variables is treated non-parametrically, but where nevertheless there may be parametric assumptions about the distribution of model residuals.
    • non-parametric hierarchical Bayesian models, such as models based on the Dirichlet process, which allow the number of latent variables to grow as necessary to fit the data, but where individual variables still follow parametric distributions and even the process controlling the rate of growth of latent variables follows a parametric distribution.
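
The second meaning can be illustrated with a kernel smoother. The sketch below is a Nadaraya–Watson estimator with a Gaussian kernel, which is one common form of non-parametric regression. The function names, the toy data, and the bandwidth value are illustrative assumptions and do not come from the text above.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Kernel-weighted average of y_train, weighted by closeness to x_query."""
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in x_train]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Hypothetical toy data: y = x**2 observed without noise on a small grid.
x_train = [i / 10 for i in range(-20, 21)]
y_train = [x * x for x in x_train]

estimate_at_zero = nadaraya_watson(x_train, y_train, 0.0, bandwidth=0.2)
estimate_at_one = nadaraya_watson(x_train, y_train, 1.0, bandwidth=0.2)
```

No functional form such as a line or a parabola is fitted here. Every training point contributes to each prediction, so the effective size of the model grows with the data, while the kernel itself is still a fixed parametric shape.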

Applications and purpose

Non-parametric methods are widely used for studying populations that have a ranked order (such as movie reviews receiving one to four "stars"). The use of non-parametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences. In terms of levels of measurement, non-parametric methods result in ordinal data.
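
Many rank-based procedures start by replacing observations with their ranks. The sketch below assigns midranks (tied values share the average of the positions they occupy), which is one common convention for ties; the function name and the star ratings are hypothetical.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


# Hypothetical star ratings (1 to 4) given to six films.
stars = [3, 1, 4, 3, 2, 3]
star_ranks = midranks(stars)

# Any strictly increasing relabelling of the scale leaves the ranks unchanged.
squared_ranks = midranks([s * s for s in stars])
```

Because only the ordering matters, the ranks are the same whether the ratings are coded 1 to 4 or on any other increasing scale, which is why such methods suit data without a clear numerical interpretation.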

As non-parametric methods make fewer assumptions, their applicability is much more general than that of the corresponding parametric methods. In particular, they may be applied in situations where less is known about the application in question. Also, due to the reliance on fewer assumptions, non-parametric methods are more robust.

Another justification for the use of non-parametric methods is simplicity. In certain cases, even when the use of parametric methods is justified, non-parametric methods may be easier to use. Due both to this simplicity and to their greater robustness, non-parametric methods are considered by some statisticians as being less susceptible to improper use and misunderstanding.

The wider applicability and increased robustness of non-parametric tests come at a cost: in cases where a parametric test would be appropriate, non-parametric tests have less statistical power. In other words, a larger sample size can be required to draw conclusions with the same degree of confidence.
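
The loss of power can be seen in a small simulation. The sketch below compares a parametric z-test (which assumes normal data with known standard deviation) with the exact sign test on data that really are normal. The sample size, shift, number of repetitions, seed, and all function names are illustrative assumptions.

```python
import math
import random


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0, assuming normal data with known sigma."""
    z = sum(sample) / (sigma * math.sqrt(len(sample)))
    return math.erfc(abs(z) / math.sqrt(2.0))


def sign_test_p_value(sample):
    """Two-sided exact sign test p-value for H0: median = 0 (zeros are dropped)."""
    nonzero = [x for x in sample if x != 0]
    n = len(nonzero)
    positives = sum(1 for x in nonzero if x > 0)
    k = min(positives, n - positives)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def estimate_power(p_value_fn, shift, n, repetitions, rng, alpha=0.05):
    """Fraction of simulated N(shift, 1) samples in which H0 is rejected."""
    rejections = 0
    for _ in range(repetitions):
        sample = [rng.gauss(shift, 1.0) for _ in range(n)]
        if p_value_fn(sample) < alpha:
            rejections += 1
    return rejections / repetitions


rng = random.Random(0)  # fixed seed so the run is repeatable
power_z = estimate_power(z_test_p_value, shift=0.5, n=20, repetitions=2000, rng=rng)
power_sign = estimate_power(sign_test_p_value, shift=0.5, n=20, repetitions=2000, rng=rng)
```

Under these assumptions the z-test should reject the false null hypothesis noticeably more often than the sign test, because it uses the magnitudes of the observations and not only their signs. The comparison is fair only because the z-test's assumptions hold here by construction.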

Non-parametric models

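Non-parametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from data. The term non-parametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance.

  • A histogram is a simple nonparametric estimate of a probability distribution.
  • Kernel density estimation is another method to estimate a probability distribution.
  • Nonparametric regression and semiparametric regression methods have been developed based on kernels, splines, and wavelets.
  • Data envelopment analysis provides efficiency coefficients similar to those obtained by multivariate analysis without any distributional assumption.
  • KNNs classify the unseen instance based on the K points in the training set which are nearest to it.
  • A support vector machine (with a Gaussian kernel) is a nonparametric large-margin classifier.
  • The method of moments with polynomial probability distributions.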
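
A kernel density estimate is a familiar case of a model whose structure is determined from the data. The sketch below places one Gaussian bump on each observation and averages them; the data values, the bandwidth of 0.3, and the function name are illustrative assumptions.

```python
import math


def kernel_density_estimate(data, x, bandwidth):
    """Average of Gaussian bumps, one centred on each observation."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm


# Hypothetical sample with two clusters; no distributional family is assumed.
data = [1.0, 1.2, 0.9, 1.1, 4.0, 4.2, 3.9, 4.1]

density_near_cluster = kernel_density_estimate(data, 1.0, bandwidth=0.3)
density_between = kernel_density_estimate(data, 2.5, bandwidth=0.3)

# Riemann-sum check that the estimate integrates to about 1.
step = 0.01
grid = [-2.0 + step * i for i in range(901)]  # covers [-2, 7]
area = sum(kernel_density_estimate(data, g, 0.3) for g in grid) * step
```

The estimate has as many bump locations as there are observations, so the number of quantities describing it grows with the sample, while the single bandwidth remains a tuning choice made by the analyst.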

Methods

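Non-parametric (or distribution-free) inferential statistical methods are mathematical procedures for statistical hypothesis testing which, unlike parametric statistics, make no assumptions about the probability distributions of the variables being assessed. The most frequently used tests include:

  • Analysis of similarities
  • Anderson–Darling test: tests whether a sample is drawn from a given distribution
  • Statistical bootstrap methods: estimates the accuracy/sampling distribution of a statistic
  • Cochran's Q: tests whether k treatments in randomized block designs with 0/1 outcomes have identical effects
  • Cohen's kappa: measures inter-rater agreement for categorical items
  • Friedman two-way analysis of variance by ranks: tests whether k treatments in randomized block designs have identical effects
  • Empirical likelihood
  • Kaplan–Meier: estimates the survival function from lifetime data, modeling censoring
  • Kendall's tau: measures statistical dependence between two variables
  • Kendall's W: a measure between 0 and 1 of inter-rater agreement
  • Kolmogorov–Smirnov test: tests whether a sample is drawn from a given distribution, or whether two samples are drawn from the same distribution
  • Kruskal–Wallis one-way analysis of variance by ranks: tests whether > 2 independent samples are drawn from the same distribution
  • Kuiper's test: tests whether a sample is drawn from a given distribution, sensitive to cyclic variations such as day of the week
  • Logrank test: compares survival distributions of two right-skewed, censored samples
  • Mann–Whitney U or Wilcoxon rank sum test: tests whether two samples are drawn from the same distribution, as compared to a given alternative hypothesis
  • McNemar's test: tests whether, in 2 × 2 contingency tables with a dichotomous trait and matched pairs of subjects, row and column marginal frequencies are equal
  • Median test: tests whether two samples are drawn from distributions with equal medians
  • Pitman's permutation test: a statistical significance test that yields exact p values by examining all possible rearrangements of labels
  • Rank products: detects differentially expressed genes in replicated microarray experiments
  • Siegel–Tukey test: tests for differences in scale between two groups
  • Sign test: tests whether matched pair samples are drawn from distributions with equal medians
  • Spearman's rank correlation coefficient: measures statistical dependence between two variables using a monotonic function
  • Squared ranks test: tests equality of variances in two or more samples
  • Tukey–Duckworth test: tests equality of two distributions by using ranks
  • Wald–Wolfowitz runs test: tests whether the elements of a sequence are mutually independent/random
  • Wilcoxon signed-rank test: tests whether matched pair samples are drawn from populations with different mean ranks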
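
As one illustration of a distribution-free test, the sketch below runs an exact two-sample permutation test on the difference in means. It enumerates every relabelling of the pooled observations, which is feasible only for small samples; the measurements and the function name are hypothetical, and the null hypothesis assumed is that group labels are exchangeable.

```python
from itertools import combinations


def permutation_test(group_a, group_b):
    """Exact two-sided p-value for the absolute difference in group means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    total = sum(pooled)

    def abs_mean_difference(sum_a):
        mean_a = sum_a / n_a
        mean_b = (total - sum_a) / (len(pooled) - n_a)
        return abs(mean_a - mean_b)

    observed = abs_mean_difference(sum(group_a))
    count = 0
    extreme = 0
    # Enumerate every way of labelling n_a of the pooled values as "group A".
    for indices in combinations(range(len(pooled)), n_a):
        count += 1
        relabelled_sum = sum(pooled[i] for i in indices)
        # Small tolerance guards against floating-point ties being missed.
        if abs_mean_difference(relabelled_sum) >= observed - 1e-12:
            extreme += 1
    return extreme / count


# Hypothetical measurements for two small groups.
group_a = [12.1, 11.8, 12.6, 12.9]
group_b = [10.2, 10.9, 11.1, 10.5]
p_value = permutation_test(group_a, group_b)
```

The reference distribution is built from the observed values themselves, so no normality assumption enters the p-value. With 4 observations per group there are 70 relabellings, and the smallest p-value this two-sided test can produce is 2/70.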

History

Early nonparametric statistics include the median (13th century or earlier, use in estimation by Edward Wright, 1599; see Median § History) and the sign test by John Arbuthnot (1710) in analyzing the human sex ratio at birth (see Sign test § History).[3][4]

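See also

  • CDF-based nonparametric confidence interval
  • Parametric statistics
  • Resampling (statistics)
  • Semiparametric model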

Notes

  1. ^ Pearce, J; Derrick, B (2019). "Preliminary testing: The devil of statistics?". Reinvention: An International Journal of Undergraduate Research. 12 (2). doi:10.31273/reinvention.v12i2.339.
  2. ^ Stuart A., Ord J.K., Arnold S. (1999), Kendall's Advanced Theory of Statistics: Volume 2A: Classical Inference and the Linear Model, sixth edition, §20.2–20.3 (Arnold).
  3. ^ Conover, W.J. (1999), "Chapter 3.4: The Sign Test", Practical Nonparametric Statistics (Third ed.), Wiley, pp. 157–176, ISBN 0-471-16068-7
  4. ^ Sprent, P. (1989), Applied Nonparametric Statistical Methods (Second ed.), Chapman & Hall, ISBN 0-412-44980-3

General references

  • Bagdonavicius, V., Kruopis, J., Nikulin, M.S. (2011). "Non-parametric tests for complete data", ISTE & WILEY: London & Hoboken. ISBN 978-1-84821-269-5.
  • Corder, G. W.; Foreman, D. I. (2014). Nonparametric Statistics: A Step-by-Step Approach. Wiley. ISBN 978-1118840313.
  • Gibbons, Jean Dickinson; Chakraborti, Subhabrata (2003). Nonparametric Statistical Inference, 4th Ed. CRC Press. ISBN 0-8247-4052-1.
  • Hettmansperger, T. P.; McKean, J. W. (1998). Robust Nonparametric Statistical Methods. Kendall's Library of Statistics. Vol. 5 (First ed.). London: Edward Arnold. New York: John Wiley & Sons. ISBN 0-340-54937-8. MR 1604954. also ISBN 0-471-19479-4.
  • Hollander M., Wolfe D.A., Chicken E. (2014). Nonparametric Statistical Methods, John Wiley & Sons.
  • Sheskin, David J. (2003). Handbook of Parametric and Nonparametric Statistical Procedures. CRC Press. ISBN 1-58488-440-1.
  • Wasserman, Larry (2007). All of Nonparametric Statistics, Springer. ISBN 0-387-25145-6.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Nonparametric_statistics&oldid=1169699910"