Wikipedia

Sample size determination

Sample size determination is the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is usually determined based on the cost, time, or convenience of collecting the data, and the need for it to offer sufficient statistical power. In complicated studies there may be several different sample sizes: for example, in a stratified survey there would be different sizes for each stratum. In a census, data is sought for an entire population, hence the intended sample size is equal to the population. In experimental design, where a study may be divided into different treatment groups, there may be different sample sizes for each group.

Sample sizes may be chosen in several ways:

  • using experience – small samples, though sometimes unavoidable, can result in wide confidence intervals and risk of errors in statistical hypothesis testing.
  • using a target variance for an estimate to be derived from the sample eventually obtained, i.e., if a high precision is required (narrow confidence interval) this translates to a low target variance of the estimator.
  • using a target for the power of a statistical test to be applied once the sample is collected.
  • using a confidence level, i.e. the larger the required confidence level, the larger the sample size (given a constant precision requirement).

Introduction

Larger sample sizes generally lead to increased precision when estimating unknown parameters. For example, if we wish to know the proportion of a certain species of fish that is infected with a pathogen, we would generally have a more precise estimate of this proportion if we sampled and examined 200 rather than 100 fish. Several fundamental facts of mathematical statistics describe this phenomenon, including the law of large numbers and the central limit theorem.

In some situations, the increase in precision for larger sample sizes is minimal, or even non-existent. This can result from the presence of systematic errors or strong dependence in the data, or if the data follows a heavy-tailed distribution.

Sample sizes may be evaluated by the quality of the resulting estimates. For example, if a proportion is being estimated, one may wish to have the 95% confidence interval be less than 0.06 units wide. Alternatively, sample size may be assessed based on the power of a hypothesis test. For example, if we are comparing the support for a certain political candidate among women with the support for that candidate among men, we may wish to have 80% power to detect a difference in the support levels of 0.04 units.

Estimation

Estimation of a proportion

A relatively simple situation is estimation of a proportion. For example, we may wish to estimate the proportion of residents in a community who are at least 65 years old.

The estimator of a proportion is $\hat{p} = X/n$, where X is the number of 'positive' observations (e.g., the number of people out of the n sampled people who are at least 65 years old). When the observations are independent, this estimator has a (scaled) binomial distribution (and is also the sample mean of data from a Bernoulli distribution). The maximum variance of the underlying Bernoulli distribution is 0.25, which occurs when the true parameter is p = 0.5. In practice, since p is unknown, the maximum variance is often used for sample size assessments. If a reasonable estimate for p is known, the quantity $p(1-p)$ may be used in place of 0.25.

For sufficiently large n, the distribution of $\hat{p}$ will be closely approximated by a normal distribution.[1] Using this and the Wald method for the binomial distribution yields a confidence interval of the form

$$\left(\hat{p} - Z\sqrt{\frac{0.25}{n}},\ \hat{p} + Z\sqrt{\frac{0.25}{n}}\right),$$
where Z is a standard Z-score for the desired level of confidence (1.96 for a 95% confidence interval).

If we wish to have a confidence interval that is W units total in width (W/2 on each side of the sample proportion), we will solve

$$Z\sqrt{\frac{0.25}{n}} = W/2$$

for n, yielding the sample size

$$n = \frac{Z^2}{W^2},$$

in the case of using 0.5 as the most conservative estimate of the proportion. (Note: W/2 = margin of error.)

[Figure: sample sizes for binomial proportions given different confidence levels and margins of error]

Otherwise, the formula would be $Z\sqrt{\frac{p(1-p)}{n}} = W/2$, which yields $n = \frac{4Z^2 p(1-p)}{W^2}$.

For example, if we are interested in estimating the proportion of the US population that supports a particular presidential candidate, and we want the width of the 95% confidence interval to be at most 2 percentage points (0.02), then we would need a sample size of $(1.96)^2/(0.02)^2 = 9604$. It is reasonable to use the 0.5 estimate for p in this case because presidential races are often close to 50/50, and it is also prudent to use a conservative estimate. The margin of error in this case is 1 percentage point (half of 0.02).
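The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula $n = 4Z^2 p(1-p)/W^2$: the function name `proportion_sample_size` and its parameters are hypothetical, and the Z-score is taken from the standard library's normal distribution rather than rounded to 1.96.

```python
import math
from statistics import NormalDist


def proportion_sample_size(width, confidence=0.95, p=0.5):
    """Smallest n for which the Wald interval is at most `width` wide.

    Hypothetical helper: `width` is W, `p` is the assumed proportion
    (0.5 is the conservative choice).
    """
    # Two-sided critical value Z, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = 4 Z^2 p (1 - p) / W^2, rounded up to a whole observation.
    return math.ceil(4 * z**2 * p * (1 - p) / width**2)


# The polling example: W = 0.02, p = 0.5, 95% confidence.
print(proportion_sample_size(0.02))  # 9604
```

Passing a value of p other than 0.5 gives a smaller required sample size, which is why 0.5 is the conservative default.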

The foregoing is commonly simplified. The interval

$$\left(\hat{p} - 1.96\sqrt{\frac{0.25}{n}},\ \hat{p} + 1.96\sqrt{\frac{0.25}{n}}\right)$$

will form a 95% confidence interval for the true proportion. If this interval needs to be no more than W units wide, then, rounding 1.96 up to 2, the equation

$$4\sqrt{\frac{0.25}{n}} = W$$

can be solved for n, yielding[2][3] $n = 4/W^2 = 1/B^2$, where B is the error bound on the estimate, i.e., the estimate is usually given as within ± B. For B = 10% one requires n = 100, for B = 5% one needs n = 400, for B = 3% the requirement approximates to n = 1000, while for B = 1% a sample size of n = 10000 is required. These numbers are quoted often in news reports of opinion polls and other sample surveys. However, reported figures may not match the exact value, because sample sizes are rounded up: n is the minimum number of sample points needed to achieve the desired result, so the number of respondents must be at or above this minimum.
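As a quick check of the figures quoted above, the rule $n = 1/B^2$ can be evaluated directly. The helper below is a hypothetical sketch; it takes B in whole percentage points so that the arithmetic stays in integers.

```python
def simplified_sample_size(error_bound_percent):
    """n = 1 / B^2 with B given in whole percentage points, rounded up."""
    # B = b / 100, so 1 / B^2 = 10000 / b^2. Integer ceiling division
    # avoids floating-point rounding surprises.
    return -(-10_000 // error_bound_percent**2)


for b in (10, 5, 3, 1):
    print(f"B = {b}%: n = {simplified_sample_size(b)}")
# B = 10%: n = 100
# B = 5%: n = 400
# B = 3%: n = 1112
# B = 1%: n = 10000
```

For B = 3% the exact requirement is 1112, which is the value commonly approximated as 1000.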

Estimation of a mean

When estimating the population mean using an independent and identically distributed (iid) sample of size n, where each data value has variance $\sigma^2$, the standard error of the sample mean is:

$$\frac{\sigma}{\sqrt{n}}$$

This expression describes quantitatively how the estimate becomes more precise as the sample size increases. Using the central limit theorem to justify approximating the sample mean with a normal distribution yields a confidence interval of the form

$$\left(\bar{x} - \frac{Z\sigma}{\sqrt{n}},\ \bar{x} + \frac{Z\sigma}{\sqrt{n}}\right),$$
where Z is a standard Z-score for the desired level of confidence (1.96 for a 95% confidence interval).

If we wish to have a confidence interval that is W units total in width (W/2 being the margin of error on each side of the sample mean), we would solve

$$\frac{Z\sigma}{\sqrt{n}} = W/2$$

for n, yielding the sample size

$$n = \frac{4Z^2\sigma^2}{W^2}.$$

For example, if we are interested in estimating the amount by which a drug lowers a subject's blood pressure with a 95% confidence interval that is six units wide, and we know that the standard deviation of blood pressure in the population is 15, then the required sample size is $\frac{4 \times 1.96^2 \times 15^2}{6^2} = 96.04$, which would be rounded up to 97, because the obtained value is the minimum sample size, and sample sizes must be integers and must lie on or above the calculated minimum.
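The blood pressure example can be reproduced with a short Python sketch of $n = 4Z^2\sigma^2/W^2$. The function name `mean_sample_size` is an assumption made for illustration, and the population standard deviation is treated as known, as in the text.

```python
import math
from statistics import NormalDist


def mean_sample_size(width, sigma, confidence=0.95):
    """Smallest n giving a confidence interval for the mean at most `width` wide.

    Hypothetical helper: assumes a known standard deviation `sigma`
    and a normal approximation for the sample mean.
    """
    # Two-sided critical value Z, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = 4 Z^2 sigma^2 / W^2, rounded up because n must be an integer.
    return math.ceil(4 * z**2 * sigma**2 / width**2)


# Interval six units wide, standard deviation 15, 95% confidence.
print(mean_sample_size(6, 15))  # 97
```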

Required sample sizes for hypothesis tests

A common problem faced by statisticians is calculating the sample size required to yield a certain power for a test, given a predetermined Type I error rate α. As follows, this can be estimated by pre-determined tables for certain values, by Mead's resource equation, or, more generally, by the cumulative distribution function:

Tables

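Sample size per group, by desired power and Cohen's d[4]

Power   Cohen's d
        0.2    0.5    0.8
0.25     84     14      6
0.50    193     32     13
0.60    246     40     16
0.70    310     50     20
0.80    393     64     26
0.90    526     85     34
0.95    651    105     42
0.99    920    148     58

The table above can be used in a two-sample t-test to estimate the sample sizes of an experimental group and a control group that are of equal size; that is, the total number of individuals in the trial is twice the number given. The desired significance level is 0.05.[4] The parameters used are:

  • The desired statistical power of the trial, shown in the column to the left.
  • Cohen's d (the effect size), which is the expected difference between the means of the target values of the experimental group and the control group, divided by the expected standard deviation.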
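Tabulated values of this kind can be roughly reproduced with the normal approximation $n \approx 2(z_{1-\alpha/2} + z_{\text{power}})^2 / d^2$ per group. This approximation is an assumption introduced here for illustration, not the method used to build the table; because it ignores the extra uncertainty of the t-distribution, it can fall one or two below the tabulated figures. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_n_per_group(d, power, alpha=0.05):
    """Approximate per-group n for a two-sided, two-sample comparison.

    Normal approximation only; a t-based calculation gives slightly
    larger values for small samples.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d**2)


print(approx_n_per_group(0.2, 0.80))  # 393, the same as the table entry
print(approx_n_per_group(0.5, 0.80))  # 63, one below the table entry of 64
```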

Mead's resource equation

Mead's resource equation is often used for estimating sample sizes of laboratory animals, as well as in many other laboratory experiments. It may not be as accurate as using other methods in estimating sample size, but gives a hint of what is the appropriate sample size where parameters such as expected standard deviations or expected differences in values between groups are unknown or very hard to estimate.[5]

Every parameter in the equation is a number of degrees of freedom, so each underlying count is reduced by 1 before it is inserted into the equation.

The equation is:[5]

$$E = N - B - T$$

where:

  • N is the total number of individuals or units in the study (minus 1)
  • B is the blocking component, representing environmental effects allowed for in the design (minus 1)
  • T is the treatment component, corresponding to the number of treatment groups (including control group) being used, or the number of questions being asked (minus 1)
  • E is the degrees of freedom of the error component and should be somewhere between 10 and 20.

For example, if a study using laboratory animals is planned with four treatment groups (T=3), with eight animals per group, making 32 animals total (N=31), without any further stratification (B=0), then E would equal 28, which is above the cutoff of 20, indicating that sample size may be a bit too large, and six animals per group might be more appropriate.[6]
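The arithmetic of this example can be written as a small Python sketch. The function name and parameter names are hypothetical; the blocking term is passed directly as degrees of freedom (0 when there is no stratification, as in the example).

```python
def resource_equation_error_df(total_units, treatment_groups, blocking_df=0):
    """Error degrees of freedom E = N - B - T from Mead's resource equation."""
    n_df = total_units - 1        # N: total units minus 1
    t_df = treatment_groups - 1   # T: treatment groups minus 1
    return n_df - blocking_df - t_df


# Four groups of eight animals, no blocking.
print(resource_equation_error_df(32, 4))  # 28, above the suggested 10 to 20 range

# Four groups of six animals, no blocking.
print(resource_equation_error_df(24, 4))  # 20, at the top of the range
```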

Cumulative distribution function

Let $X_i$, i = 1, 2, ..., n be independent observations taken from a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$. Consider two hypotheses, a null hypothesis:

$$H_0: \mu = 0$$

and an alternative hypothesis:

$$H_a: \mu = \mu^*$$

for some 'smallest significant difference' $\mu^* > 0$. This is the smallest value for which we care about observing a difference. Now, if we wish to (1) reject $H_0$ with a probability of at least $1 - \beta$ when $H_a$ is true (i.e. a power of $1 - \beta$), and (2) reject $H_0$ with probability $\alpha$ when $H_0$ is true, then we need the following:

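If $z_\alpha$ is the upper $\alpha$ percentage point of the standard normal distribution, then

$$\Pr\left(\bar{x} > z_\alpha \sigma/\sqrt{n} \mid H_0\right) = \alpha$$

and so

'Reject $H_0$ if our sample average ($\bar{x}$) is more than $z_\alpha \sigma/\sqrt{n}$'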

is a decision rule which satisfies (2). (This is a 1-tailed test.)

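Now we wish for this to happen with a probability at least $1 - \beta$ when $H_a$ is true. In this case, our sample average will come from a normal distribution with mean $\mu^*$. Therefore, we require

$$\Pr\left(\bar{x} > z_\alpha \sigma/\sqrt{n} \mid H_a\right) \geq 1 - \beta$$

Through careful manipulation, this can be shown (see Statistical power Example) to happen when

$$n \geq \left(\frac{z_\alpha + \Phi^{-1}(1 - \beta)}{\mu^*/\sigma}\right)^2$$

where $\Phi$ is the normal cumulative distribution function.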
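The final inequality can be evaluated numerically. The sketch below is one possible implementation under the assumptions of this section (one-tailed test, known variance); the function name and the example values of $\mu^*$, $\sigma$, $\alpha$ and $\beta$ are chosen here purely for illustration.

```python
import math
from statistics import NormalDist


def one_sided_z_test_sample_size(mu_star, sigma, alpha=0.05, beta=0.20):
    """Smallest n with power at least 1 - beta for the one-tailed z-test.

    Implements n >= ((z_alpha + Phi^{-1}(1 - beta)) / (mu_star / sigma))^2.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # upper alpha point of N(0, 1)
    z_beta = std_normal.inv_cdf(1 - beta)    # Phi^{-1}(1 - beta)
    return math.ceil(((z_alpha + z_beta) / (mu_star / sigma)) ** 2)


# Illustrative values: detect a shift of half a standard deviation
# with 80% power at a 5% one-tailed significance level.
print(one_sided_z_test_sample_size(mu_star=0.5, sigma=1.0))  # 25
```

Halving $\mu^*$ roughly quadruples the required n, since the sample size scales with $(\sigma/\mu^*)^2$.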

Stratified sample size

With more complicated sampling techniques, such as stratified sampling, the sample can often be split up into sub-samples. Typically, if there are H such sub-samples (from H different strata) then each of them will have a sample size $n_h$, h = 1, 2, ..., H. These $n_h$ must conform to the rule that $n_1 + n_2 + \cdots + n_H = n$ (i.e., that the total sample size is given by the sum of the sub-sample sizes). Selecting these $n_h$ optimally can be done in various ways, using (for example) Neyman's optimal allocation.

There are many reasons to use stratified sampling:[7] to decrease variances of sample estimates, to use partly non-random methods, or to study strata individually. A useful, partly non-random method would be to sample individuals where easily accessible, but, where not, sample clusters to save travel costs.[8]

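In general, for H strata, a weighted sample mean is

$$\bar{x}_w = \sum_{h=1}^{H} W_h \bar{x}_h$$

with

$$\operatorname{Var}(\bar{x}_w) = \sum_{h=1}^{H} W_h^2 \operatorname{Var}(\bar{x}_h).$$[9]

The weights, $W_h$, frequently, but not always, represent the proportions of the population elements in the strata, and $W_h = N_h/N$. For a fixed sample size, that is $n = \sum n_h$,

$$\operatorname{Var}(\bar{x}_w) = \sum_{h=1}^{H} W_h^2 \operatorname{Var}(\bar{x}_h)\left(\frac{1}{n_h} - \frac{1}{N_h}\right),$$[10]

which can be made a minimum if the sampling rate within each stratum is made proportional to the standard deviation within each stratum: $n_h/N_h = kS_h$, where $S_h = \sqrt{\operatorname{Var}(\bar{x}_h)}$ and $k$ is a constant such that $\sum n_h = n$.

An "optimum allocation" is reached when the sampling rates within the strata are made directly proportional to the standard deviations within the strata and inversely proportional to the square root of the sampling cost per element within the strata, $C_h$:

$$\frac{n_h}{N_h} = \frac{K S_h}{\sqrt{C_h}},$$[11]

where $K$ is a constant such that $\sum n_h = n$, or, more generally, when

$$n_h = \frac{K' W_h S_h}{\sqrt{C_h}}.$$[12]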
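A minimal sketch of this allocation rule in Python follows. It assumes the stratum weights, standard deviations and per-element costs are already known; the function name and the example figures are hypothetical. The returned sizes are not rounded, so in practice they would still need to be converted to integers.

```python
import math


def optimum_allocation(n, weights, std_devs, costs=None):
    """Split a total sample size n across strata.

    Each n_h is proportional to W_h * S_h / sqrt(C_h), scaled so the
    n_h sum to n. With equal costs this reduces to allocation
    proportional to W_h * S_h.
    """
    if costs is None:
        costs = [1.0] * len(weights)  # equal cost per element in every stratum
    scores = [w * s / math.sqrt(c) for w, s, c in zip(weights, std_devs, costs)]
    total = sum(scores)
    return [n * score / total for score in scores]


# Two equally sized strata; the second is three times as variable.
print(optimum_allocation(100, [0.5, 0.5], [1.0, 3.0]))  # [25.0, 75.0]

# Same strata with equal variability, but the second costs four times as much.
print(optimum_allocation(90, [0.5, 0.5], [1.0, 1.0], costs=[1.0, 4.0]))
```

The more variable stratum receives the larger share of the sample, and a more expensive stratum receives a smaller one.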

Qualitative research

Sample size determination in qualitative studies takes a different approach. It is generally a subjective judgment, taken as the research proceeds.[13] One approach is to continue to include further participants or material until saturation is reached.[14] The number needed to reach saturation has been investigated empirically.[15][16][17][18]

There is a paucity of reliable guidance on estimating sample sizes before starting the research, with a range of suggestions given.[16][19][20][21] A tool akin to a quantitative power calculation, based on the negative binomial distribution, has been suggested for thematic analysis.[22][21]

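See also

  • Design of experiments
  • Engineering response surface example under Stepwise regression
  • Cohen's h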

References

  1. NIST/SEMATECH, "7.2.4.2. Sample sizes required", e-Handbook of Statistical Methods.
  2. "Inference for Regression". utdallas.edu.
  3. "Confidence Interval for a Proportion". Archived 2011-08-23 at the Wayback Machine.
  4. Chapter 13, page 215, in: Kenny, David A. (1987). Statistics for the social and behavioral sciences. Boston: Little, Brown. ISBN 978-0-316-48915-7.
  5. Kirkwood, James; Robert Hubrecht (2010). The UFAW Handbook on the Care and Management of Laboratory and Other Research Animals. Wiley-Blackwell. p. 29. ISBN 978-1-4051-7523-4.
  6. Isogenic.info, "Resource equation", by Michael FW Festing. Updated Sept. 2006.
  7. Kish (1965), Section 3.1.
  8. Kish (1965), p. 148.
  9. Kish (1965), p. 78.
  10. Kish (1965), p. 81.
  11. Kish (1965), p. 93.
  12. Kish (1965), p. 94.
  13. Sandelowski, M. (1995). "Sample size in qualitative research". Research in Nursing & Health. 18: 179–183.
  14. Glaser, B. (1965). "The constant comparative method of qualitative analysis". Social Problems. 12: 436–445.
  15. Francis, Jill J.; Johnston, Marie; Robertson, Clare; Glidewell, Liz; Entwistle, Vikki; Eccles, Martin P.; Grimshaw, Jeremy M. (2010). "What is an adequate sample size? Operationalising data saturation for theory-based interview studies" (PDF). Psychology & Health. 25 (10): 1229–1245. doi:10.1080/08870440903194015. PMID 20204937. S2CID 28152749.
  16. Guest, Greg; Bunce, Arwen; Johnson, Laura (2006). "How Many Interviews Are Enough?". Field Methods. 18: 59–82. doi:10.1177/1525822X05279903. S2CID 62237589.
  17. Wright, Adam; Maloney, Francine L.; Feblowitz, Joshua C. (2011). "Clinician attitudes toward and use of electronic problem lists: A thematic analysis". BMC Medical Informatics and Decision Making. 11: 36. doi:10.1186/1472-6947-11-36. PMC 3120635. PMID 21612639.
  18. Mason, Mark (2010). "Sample Size and Saturation in PhD Studies Using Qualitative Interviews". Forum Qualitative Sozialforschung. 11 (3): 8.
  19. Emmel, N. (2013). Sampling and choosing cases in qualitative research: A realist approach. London: Sage.
  20. Onwuegbuzie, Anthony J.; Leech, Nancy L. (2007). "A Call for Qualitative Power Analyses". Quality & Quantity. 41: 105–121. doi:10.1007/s11135-005-1098-1. S2CID 62179911.
  21. Fugard AJB; Potts HWW (10 February 2015). "Supporting thinking on sample sizes for thematic analyses: A quantitative tool" (PDF). International Journal of Social Research Methodology. 18 (6): 669–684. doi:10.1080/13645579.2015.1005453. S2CID 59047474.
  22. Galvin R (2015). "How many interviews are enough? Do qualitative interviews in building energy consumption research produce reliable knowledge?". Journal of Building Engineering. 1: 2–12.

General references

  • Bartlett, J. E., II; Kotrlik, J. W.; Higgins, C. (2001). "Organizational research: Determining appropriate sample size for survey research" (PDF). Information Technology, Learning, and Performance Journal. 19 (1): 43–50.
  • Kish, L. (1965). Survey Sampling. Wiley. ISBN 978-0-471-48900-9.
  • Smith, Scott (8 April 2013). "Determining Sample Size: How to Ensure You Get the Correct Sample Size". Qualtrics. Retrieved 19 September 2018.
  • Israel, Glenn D. (1992). "Determining Sample Size". University of Florida, PEOD-6. Retrieved 29 June 2019.
  • Rens van de Schoot, Milica Miočević (eds.). 2020. Small Sample Size Solutions (Open Access): A Guide for Applied Researchers and Practitioners. Routledge.

Further reading

  • NIST: Selecting Sample Sizes
  • ASTM E122-07: Standard Practice for Calculating Sample Size to Estimate, With Specified Precision, the Average for a Characteristic of a Lot or Process

External links

  • A MATLAB script implementing Cochran's sample size formula
