
Efficiency (statistics)

In statistics, efficiency is a measure of quality of an estimator, of an experimental design,[1] or of a hypothesis testing procedure.[2] Essentially, a more efficient estimator needs fewer input data or observations than a less efficient one to achieve the Cramér–Rao bound. An efficient estimator is characterized by having the smallest possible variance, indicating that there is a small deviation between the estimated value and the "true" value in the L2 norm sense.[1]

The relative efficiency of two procedures is the ratio of their efficiencies, although often this concept is used where the comparison is made between a given procedure and a notional "best possible" procedure. The efficiencies and the relative efficiency of two procedures theoretically depend on the sample size available for the given procedure, but it is often possible to use the asymptotic relative efficiency (defined as the limit of the relative efficiencies as the sample size grows) as the principal comparison measure.
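
In symbols, a standard way to write this limit (the explicit sample-size argument $n$ is introduced here for illustration) is

$$\mathrm{ARE}(T_1, T_2) = \lim_{n \to \infty} e(T_1, T_2; n),$$

where $e(T_1, T_2; n)$ denotes the relative efficiency of $T_1$ with respect to $T_2$ computed from samples of size $n$.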

Estimators

The efficiency of an unbiased estimator, T, of a parameter θ is defined as[3]

$$e(T) = \frac{1/\mathcal{I}(\theta)}{\operatorname{var}(T)}$$

where $\mathcal{I}(\theta)$ is the Fisher information of the sample. Thus e(T) is the minimum possible variance for an unbiased estimator divided by its actual variance. The Cramér–Rao bound can be used to prove that e(T) ≤ 1.
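
As a worked instance of this definition (a standard textbook computation, spelled out here for concreteness): for $n$ i.i.d. observations from $N(\theta, \sigma^2)$ with $\sigma^2$ known, the Fisher information of the sample is $\mathcal{I}(\theta) = n/\sigma^2$, so

$$e(\bar{x}) = \frac{1/\mathcal{I}(\theta)}{\operatorname{var}(\bar{x})} = \frac{\sigma^2/n}{\sigma^2/n} = 1,$$

and the sample mean attains the bound exactly.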

Efficient estimators

An efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors of different magnitudes. The most common choice of the loss function is quadratic, resulting in the mean squared error criterion of optimality.[4]

In general, the spread of an estimator around the parameter θ is a measure of estimator efficiency and performance. This performance can be calculated by finding the mean squared error. More formally, let T be an estimator for the parameter θ. The mean squared error of T is the value $\operatorname{MSE}(T) = \operatorname{E}[(T - \theta)^2]$, which can be decomposed as a sum of its variance and squared bias:

$$\begin{aligned}\operatorname{MSE}(T) &= \operatorname{E}[(T-\theta)^2] = \operatorname{E}[(T - \operatorname{E}[T] + \operatorname{E}[T] - \theta)^2] \\ &= \operatorname{E}[(T - \operatorname{E}[T])^2] + 2\operatorname{E}[T - \operatorname{E}[T]](\operatorname{E}[T] - \theta) + (\operatorname{E}[T] - \theta)^2 \\ &= \operatorname{var}(T) + (\operatorname{E}[T] - \theta)^2\end{aligned}$$

An estimator T1 performs better than an estimator T2 if $\operatorname{MSE}(T_1) < \operatorname{MSE}(T_2)$.[5] For a more specific case, if T1 and T2 are two unbiased estimators for the same parameter θ, then the variance can be compared to determine performance. In this case, T2 is more efficient than T1 if the variance of T2 is smaller than the variance of T1, i.e. $\operatorname{var}(T_1) > \operatorname{var}(T_2)$ for all values of θ. This relationship can be determined by simplifying the more general case above for mean squared error; since the expected value of an unbiased estimator is equal to the parameter value, $\operatorname{E}[T] = \theta$. Therefore, for an unbiased estimator, $\operatorname{MSE}(T) = \operatorname{var}(T)$, as the $(\operatorname{E}[T] - \theta)^2$ term drops out for being equal to 0.[5]
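
The decomposition can be verified numerically. A minimal Monte Carlo sketch in Python, using a deliberately biased estimator (the sample mean shrunk by an arbitrary factor, a hypothetical choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n, reps = 5.0, 2.0, 20, 200_000

# Hypothetical biased estimator: the sample mean shrunk by a factor 0.9.
samples = rng.normal(theta, sigma, size=(reps, n))
T = 0.9 * samples.mean(axis=1)

mse = np.mean((T - theta) ** 2)       # direct Monte Carlo estimate of the MSE
var = T.var()                         # variance of the estimator
bias_sq = (T.mean() - theta) ** 2     # squared bias

print(f"MSE          = {mse:.5f}")
print(f"var + bias^2 = {var + bias_sq:.5f}")  # agrees up to simulation noise
```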

If an unbiased estimator of a parameter θ attains $e(T) = 1$ for all values of the parameter, then the estimator is called efficient.[3]

Equivalently, the estimator achieves equality in the Cramér–Rao inequality for all θ. The Cramér–Rao lower bound is a lower bound of the variance of an unbiased estimator, representing the "best" an unbiased estimator can be.

An efficient estimator is also the minimum variance unbiased estimator (MVUE). This is because an efficient estimator maintains equality on the Cramér–Rao inequality for all parameter values, which means it attains the minimum variance for all parameters (the definition of the MVUE). The MVUE estimator, even if it exists, is not necessarily efficient, because "minimum" does not mean equality holds on the Cramér–Rao inequality.

Thus an efficient estimator need not exist, but if it does, it is the MVUE.

Finite-sample efficiency

Suppose { Pθ | θ ∈ Θ } is a parametric model and X = (X1, …, Xn) are the data sampled from this model. Let T = T(X) be an estimator for the parameter θ. If this estimator is unbiased (that is, E[ T ] = θ), then the Cramér–Rao inequality states the variance of this estimator is bounded from below:

$$\operatorname{var}[T] \geq \mathcal{I}(\theta)^{-1}$$

where $\mathcal{I}(\theta)$ is the Fisher information matrix of the model at point θ. Generally, the variance measures the degree of dispersion of a random variable around its mean. Thus estimators with small variances are more concentrated: they estimate the parameters more precisely. We say that the estimator is a finite-sample efficient estimator (in the class of unbiased estimators) if it reaches the lower bound in the Cramér–Rao inequality above, for all θ ∈ Θ. Efficient estimators are always minimum variance unbiased estimators. However the converse is false: there exist point-estimation problems for which the minimum-variance mean-unbiased estimator is inefficient.[6]

Historically, finite-sample efficiency was an early optimality criterion. However, this criterion has some limitations:

  • Finite-sample efficient estimators are extremely rare. In fact, it was proved that efficient estimation is possible only in an exponential family, and only for the natural parameters of that family.[citation needed]
  • This notion of efficiency is sometimes restricted to the class of unbiased estimators. (Often it isn't.[7]) Since there are no good theoretical reasons to require that estimators are unbiased, this restriction is inconvenient. In fact, if we use mean squared error as a selection criterion, many biased estimators will slightly outperform the “best” unbiased ones. For example, in multivariate statistics for dimension three or more, the mean-unbiased estimator, the sample mean, is inadmissible: regardless of the outcome, its performance is worse than for example the James–Stein estimator.[citation needed]
  • Finite-sample efficiency is based on the variance, as a criterion according to which the estimators are judged. A more general approach is to use loss functions other than quadratic ones, in which case the finite-sample efficiency can no longer be formulated.[citation needed][dubious – discuss]

As an example, among the models encountered in practice, efficient estimators exist for: the mean μ of the normal distribution (but not the variance σ²), the parameter λ of the Poisson distribution, and the probability p in the binomial or multinomial distribution.

Consider the model of a normal distribution with unknown mean but known variance: { Pθ = N(θ, σ²) | θ ∈ R }. The data consists of n independent and identically distributed observations from this model: X = (x1, …, xn). We estimate the parameter θ using the sample mean of all observations:

$$T(X) = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

This estimator has mean θ and variance σ² / n, which is equal to the reciprocal of the Fisher information from the sample. Thus, the sample mean is a finite-sample efficient estimator for the mean of the normal distribution.
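
This claim is easy to check by simulation. A minimal sketch (parameter values chosen arbitrarily) compares the empirical variance of the sample mean with the Cramér–Rao bound σ²/n:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, sigma, n, reps = 0.0, 1.5, 50, 100_000

# Empirical variance of the sample mean over many replications.
means = rng.normal(theta, sigma, size=(reps, n)).mean(axis=1)
empirical_var = means.var()

# Cramér-Rao lower bound for unbiased estimators of the mean:
# the reciprocal of the Fisher information n / sigma^2.
crlb = sigma**2 / n

print(f"empirical var(mean) = {empirical_var:.6f}")
print(f"CRLB sigma^2/n      = {crlb:.6f}")
print(f"efficiency estimate = {crlb / empirical_var:.3f}")  # ~ 1.0
```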

Asymptotic efficiency

Asymptotic efficiency requires consistency, an asymptotically normal distribution of the estimator, and an asymptotic variance-covariance matrix no worse than that of any other estimator.[8]

Example: Median

Consider a sample of size $N$ drawn from a normal distribution of mean $\mu$ and unit variance, i.e., $X_n \sim \mathcal{N}(\mu, 1)$.

The sample mean, $\overline{X}$, of the sample $X_1, X_2, \ldots, X_N$, is defined as

$$\overline{X} = \frac{1}{N}\sum_{n=1}^{N} X_n \sim \mathcal{N}\left(\mu, \frac{1}{N}\right).$$

The variance of the mean, 1/N (the square of the standard error), is equal to the reciprocal of the Fisher information from the sample; thus, by the Cramér–Rao inequality, the sample mean is efficient in the sense that its efficiency is unity (100%).

Now consider the sample median, $\widetilde{X}$. This is an unbiased and consistent estimator for $\mu$. For large $N$ the sample median is approximately normally distributed with mean $\mu$ and variance $\pi/(2N)$:[9]

$$\widetilde{X} \sim \mathcal{N}\left(\mu, \frac{\pi}{2N}\right)$$

The efficiency of the median for large $N$ is thus

$$e\left(\widetilde{X}\right) = \left(\frac{1}{N}\right)\Big/\left(\frac{\pi}{2N}\right) = \frac{2}{\pi} \approx 0.64.$$

In other words, the relative variance of the median will be $\pi/2 \approx 1.57$, or 57% greater than the variance of the mean – the standard error of the median will be about 25% greater than that of the mean.[10]

Note that this is the asymptotic efficiency — that is, the efficiency in the limit as sample size $N$ tends to infinity. For finite values of $N$ the efficiency is higher than this (for example, a sample size of 3 gives an efficiency of about 74%).[citation needed]
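
A short Monte Carlo sketch (in Python; the sample size is an arbitrary illustrative choice) estimates this efficiency directly; for moderate N the empirical value should fall between the asymptotic 2/π ≈ 0.64 and the higher small-sample values noted above:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, N, reps = 0.0, 101, 50_000  # odd N so the median is a single order statistic

samples = rng.normal(mu, 1.0, size=(reps, N))
var_mean = samples.mean(axis=1).var()
var_median = np.median(samples, axis=1).var()

print(f"var(mean)   = {var_mean:.6f}  (theory: {1 / N:.6f})")
print(f"var(median) = {var_median:.6f}  (asymptotic: {np.pi / (2 * N):.6f})")
print(f"efficiency  = {var_mean / var_median:.3f}  (asymptotic: {2 / np.pi:.3f})")
```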

The sample mean is thus more efficient than the sample median in this example. However, there may be measures by which the median performs better. For example, the median is far more robust to outliers, so that if the Gaussian model is questionable or approximate, there may be advantages to using the median (see Robust statistics).

Dominant estimators

If $T_1$ and $T_2$ are estimators for the parameter $\theta$, then $T_1$ is said to dominate $T_2$ if:

  1. its mean squared error (MSE) is smaller for at least some value of $\theta$, and
  2. the MSE does not exceed that of $T_2$ for any value of $\theta$.

Formally, $T_1$ dominates $T_2$ if

$$\operatorname{E}\left[(T_1 - \theta)^2\right] \leq \operatorname{E}\left[(T_2 - \theta)^2\right]$$

holds for all $\theta$, with strict inequality holding somewhere.
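
As an illustration of dominance (the estimators here are chosen for the example, not taken from the source): in a normal location model, the sample mean and the first observation are both unbiased, and the mean's MSE is smaller for every θ:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 10, 100_000

for theta in (-2.0, 0.0, 3.0):            # compare MSE across several theta
    x = rng.normal(theta, 1.0, size=(reps, n))
    mse_mean = np.mean((x.mean(axis=1) - theta) ** 2)   # T1: sample mean
    mse_first = np.mean((x[:, 0] - theta) ** 2)         # T2: first observation
    print(f"theta={theta:+.1f}: MSE(mean)={mse_mean:.4f}, MSE(X1)={mse_first:.4f}")

# MSE(mean) ~ 1/n stays below MSE(X1) ~ 1 for every theta shown,
# consistent with the sample mean dominating the single observation.
```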

Relative efficiency

The relative efficiency of two unbiased estimators is defined as[11]

$$e(T_1, T_2) = \frac{\operatorname{E}\left[(T_2 - \theta)^2\right]}{\operatorname{E}\left[(T_1 - \theta)^2\right]} = \frac{\operatorname{var}(T_2)}{\operatorname{var}(T_1)}$$

Although $e$ is in general a function of $\theta$, in many cases the dependence drops out; if this is so, $e$ being greater than one would indicate that $T_1$ is preferable, regardless of the true value of $\theta$.
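
As a simple worked case (standard, included here for concreteness): if $T_1 = \bar{x}$ is the mean of $n$ i.i.d. observations with variance $\sigma^2$ and $T_2 = X_1$ is a single observation, both unbiased for the mean, then

$$e(T_1, T_2) = \frac{\operatorname{var}(X_1)}{\operatorname{var}(\bar{x})} = \frac{\sigma^2}{\sigma^2/n} = n,$$

so the sample mean is preferable for every $n > 1$, independently of the true parameter value.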

An alternative to relative efficiency for comparing estimators is the Pitman closeness criterion. This replaces the comparison of mean squared errors with a comparison of how often one estimator produces estimates closer to the true value than another estimator.

Estimators of the mean of u.i.d. variables

In estimating the mean of uncorrelated, identically distributed variables we can take advantage of the fact that the variance of the sum is the sum of the variances. In this case efficiency can be defined as the square of the coefficient of variation, i.e.,[12]

$$e \equiv \left(\frac{\sigma}{\mu}\right)^2$$

Relative efficiency of two such estimators can thus be interpreted as the relative sample size of one required to achieve the certainty of the other. Proof:

$$\frac{e_1}{e_2} = \frac{s_1^2}{s_2^2}$$

Now because $s_1^2 = n_1\sigma^2$ and $s_2^2 = n_2\sigma^2$, we have $\frac{e_1}{e_2} = \frac{n_1}{n_2}$, so the relative efficiency expresses the relative sample size of the first estimator needed to match the variance of the second.
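
The sample-size reading of relative efficiency can be illustrated with the mean/median pair from the earlier example: since the median's asymptotic efficiency relative to the mean is 2/π, the median needs roughly π/2 ≈ 1.57 times as many observations to match the variance of the mean. A minimal Monte Carlo sketch in Python (the sample sizes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
reps = 50_000
n_mean = 100
n_median = int(round(n_mean * np.pi / 2))   # ~157: inflate by pi/2

var_mean = rng.normal(size=(reps, n_mean)).mean(axis=1).var()
var_median = np.median(rng.normal(size=(reps, n_median)), axis=1).var()

print(f"var(mean of {n_mean})     = {var_mean:.6f}")
print(f"var(median of {n_median}) = {var_median:.6f}")  # roughly equal by design
```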

Robustness

Efficiency of an estimator may change significantly if the distribution changes, often dropping. This is one of the motivations of robust statistics – an estimator such as the sample mean is an efficient estimator of the population mean of a normal distribution, for example, but can be an inefficient estimator of a mixture distribution of two normal distributions with the same mean and different variances. For example, if a distribution is a combination of 98% N(μ, σ) and 2% N(μ, 10σ), the presence of extreme values from the latter distribution (often "contaminating outliers") significantly reduces the efficiency of the sample mean as an estimator of μ. By contrast, the trimmed mean is less efficient for a normal distribution, but is more robust to (i.e., less affected by) changes in the distribution, and thus may be more efficient for a mixture distribution. Similarly, the shape of a distribution, such as skewness or heavy tails, can significantly reduce the efficiency of estimators that assume a symmetric distribution or thin tails.
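
The mixture example can be made concrete with a small simulation. The following sketch (the 10% trimming fraction and the sample sizes are illustrative choices, not canonical ones) compares the variance of the sample mean and a trimmed mean on clean and contaminated data:

```python
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(5)
mu, sigma, n, reps = 0.0, 1.0, 100, 50_000

def draw(contaminated: bool) -> np.ndarray:
    x = rng.normal(mu, sigma, size=(reps, n))
    if contaminated:
        # Replace ~2% of observations with draws from N(mu, (10*sigma)^2).
        mask = rng.random((reps, n)) < 0.02
        x = np.where(mask, rng.normal(mu, 10 * sigma, size=(reps, n)), x)
    return x

for label, flag in [("clean normal", False), ("98/2 mixture", True)]:
    x = draw(flag)
    v_mean = x.mean(axis=1).var()
    v_trim = trim_mean(x, 0.1, axis=1).var()   # 10% trimmed from each tail
    print(f"{label:>12}: var(mean)={v_mean:.5f}  var(trimmed)={v_trim:.5f}")

# Clean data: the mean is (slightly) better. Contaminated data: the trimmed
# mean wins, illustrating the efficiency/robustness trade-off in the text.
```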

Uses of inefficient estimators

While efficiency is a desirable quality of an estimator, it must be weighed against other considerations, and an estimator that is efficient for certain distributions may well be inefficient for other distributions. Most significantly, estimators that are efficient for clean data from a simple distribution, such as the normal distribution (which is symmetric, unimodal, and has thin tails) may not be robust to contamination by outliers, and may be inefficient for more complicated distributions. In robust statistics, more importance is placed on robustness and applicability to a wide variety of distributions, rather than efficiency on a single distribution. M-estimators are a general class of solutions motivated by these concerns, yielding both robustness and high relative efficiency, though possibly lower efficiency than traditional estimators for some cases. These are potentially very computationally complicated, however.

A more traditional alternative is the class of L-estimators, which are very simple statistics that are easy to compute and interpret, in many cases robust, and often sufficiently efficient for initial estimates. See applications of L-estimators for further discussion.

Efficiency in statistics

Efficiency in statistics is important because it allows one to compare the performance of various estimators. Although an unbiased estimator is usually favored over a biased one, a more efficient biased estimator can sometimes be more valuable than a less efficient unbiased estimator. For example, this can occur when the values of the biased estimator gather around a number closer to the true value. Thus, estimator performance can easily be compared via mean squared errors or variances.
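
A standard concrete instance of a biased estimator beating an unbiased one on MSE is the sample variance with divisor n rather than n − 1, for normal data. A minimal simulation sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
sigma2, n, reps = 4.0, 10, 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2_unbiased = x.var(axis=1, ddof=1)   # divisor n-1: unbiased
s2_biased = x.var(axis=1, ddof=0)     # divisor n: biased downward

print(f"MSE unbiased (n-1): {np.mean((s2_unbiased - sigma2) ** 2):.4f}")
print(f"MSE biased   (n)  : {np.mean((s2_biased - sigma2) ** 2):.4f}")
# For normal data the biased divisor-n estimator has the smaller MSE.
```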

Hypothesis tests

For comparing significance tests, a meaningful measure of efficiency can be defined based on the sample size required for the test to achieve a given power.[13]

Pitman efficiency[14] and Bahadur efficiency (or Hodges–Lehmann efficiency)[15][16][17] relate to the comparison of the performance of statistical hypothesis testing procedures. The Encyclopedia of Mathematics provides a brief exposition of these three criteria.
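
As an illustrative sketch of test efficiency (the shift, sample size, and level below are arbitrary choices; scipy.stats.binomtest requires SciPy ≥ 1.7): under a normal shift alternative, the one-sample t-test achieves higher power than the sign test at the same sample size, reflecting the sign test's lower Pitman efficiency (2/π) at the normal distribution.

```python
import numpy as np
from scipy.stats import ttest_1samp, binomtest

rng = np.random.default_rng(7)
shift, n, reps, alpha = 0.5, 30, 2_000, 0.05

reject_t = reject_sign = 0
for _ in range(reps):
    x = rng.normal(shift, 1.0, size=n)       # H0: mean/median = 0
    if ttest_1samp(x, 0.0).pvalue < alpha:
        reject_t += 1
    # Sign test: count positive observations against Binomial(n, 1/2).
    if binomtest((x > 0).sum(), n, 0.5).pvalue < alpha:
        reject_sign += 1

print(f"power, t-test   : {reject_t / reps:.3f}")
print(f"power, sign test: {reject_sign / reps:.3f}")  # lower at the same n
```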

Experimental design

For experimental designs, efficiency relates to the ability of a design to achieve the objective of the study with minimal expenditure of resources such as time and money. In simple cases, the relative efficiency of designs can be expressed as the ratio of the sample sizes required to achieve a given objective.[18]

See also

  • Bayes estimator
  • Consistent estimator
  • Hodges' estimator
  • Optimal instruments

Notes

  1. ^ a b Everitt 2002, p. 128.
  2. ^ Nikulin, M.S. (2001) [1994], "Efficiency of a statistical procedure", Encyclopedia of Mathematics, EMS Press
  3. ^ a b Fisher, R (1921). "On the Mathematical Foundations of Theoretical Statistics". Philosophical Transactions of the Royal Society of London A. 222: 309–368. JSTOR 91208.
  4. ^ Everitt 2002, p. 128.
  5. ^ a b Dekking, F.M. (2007). A Modern Introduction to Probability and Statistics: Understanding Why and How. Springer. pp. 303–305. ISBN 978-1852338961.
  6. ^ Romano, Joseph P.; Siegel, Andrew F. (1986). Counterexamples in Probability and Statistics. Chapman and Hall. p. 194.
  7. ^ DeGroot; Schervish (2002). Probability and Statistics (3rd ed.). pp. 440–441.
  8. ^ Greene, William H. (2012). Econometric analysis (7th ed., international ed.). Boston: Pearson. ISBN 978-0-273-75356-8. OCLC 726074601.
  9. ^ Williams, D. (2001). Weighing the Odds. Cambridge University Press. p. 165. ISBN 052100618X.
  10. ^ Maindonald, John; Braun, W. John (2010-05-06). Data Analysis and Graphics Using R: An Example-Based Approach. Cambridge University Press. p. 104. ISBN 978-1-139-48667-5.
  11. ^ Wackerly, Dennis D.; Mendenhall, William; Scheaffer, Richard L. (2008). Mathematical statistics with applications (Seventh ed.). Belmont, CA: Thomson Brooks/Cole. p. 445. ISBN 9780495110811. OCLC 183886598.
  12. ^ Grubbs, Frank (1965). Statistical Measures of Accuracy for Riflemen and Missile Engineers. pp. 26–27.
  13. ^ Everitt 2002, p. 321.
  14. ^ Nikitin, Ya.Yu. (2001) [1994], "Efficiency, asymptotic", Encyclopedia of Mathematics, EMS Press
  15. ^ "Bahadur efficiency - Encyclopedia of Mathematics".
  16. ^ Arcones M. A. "Bahadur efficiency of the likelihood ratio test" preprint
  17. ^ Canay I. A. & Otsu, T. "Hodges–Lehmann Optimality for Testing Moment Condition Models"
  18. ^ Dodge, Y. (2006). The Oxford Dictionary of Statistical Terms. Oxford University Press. ISBN 0-19-920613-9.

References

  • Everitt, Brian S. (2002). The Cambridge Dictionary of Statistics. Cambridge University Press. ISBN 0-521-81099-X.
  • Lehmann, Erich L. (1998). Elements of Large-Sample Theory. New York: Springer-Verlag. ISBN 978-0-387-98595-4.
