
F-test

An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Ronald Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.[1]

Common examples

Common examples of the use of F-tests include the study of the following cases:

  • The hypothesis that the means of a given set of normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and plays an important role in the analysis of variance (ANOVA).
  • The hypothesis that a proposed regression model fits the data well. See Lack-of-fit sum of squares.
  • The hypothesis that a data set in a regression analysis follows the simpler of two proposed linear models that are nested within each other.

In addition, some statistical procedures, such as Scheffé's method for multiple comparisons adjustment in linear models, also use F-tests.

F-test of the equality of two variances

Main article: F-test of equality of variances

The F-test is sensitive to non-normality.[2][3] In the analysis of variance (ANOVA), alternative tests include Levene's test, Bartlett's test, and the Brown–Forsythe test. However, when any of these tests are conducted to test the underlying assumption of homoscedasticity (i.e. homogeneity of variance), as a preliminary step to testing for mean effects, there is an increase in the experiment-wise Type I error rate.[4]
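As a brief illustration, the following is a minimal sketch of the two-sample variance-ratio F-test alongside Levene's test as a more robust alternative. It assumes numpy and scipy are available; the two samples are simulated purely for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=30)  # sample 1
y = rng.normal(loc=0.0, scale=1.5, size=25)  # sample 2

# Variance-ratio F statistic: ratio of the two sample variances,
# with (n_x - 1, n_y - 1) degrees of freedom.
f = np.var(x, ddof=1) / np.var(y, ddof=1)
dfx, dfy = len(x) - 1, len(y) - 1
# Two-sided p-value from the F distribution.
p = 2 * min(stats.f.cdf(f, dfx, dfy), stats.f.sf(f, dfx, dfy))
print(f"variance-ratio F = {f:.3f}, p = {p:.4f}")

# Levene's test is less sensitive to departures from normality.
w, p_lev = stats.levene(x, y)
print(f"Levene W = {w:.3f}, p = {p_lev:.4f}")
```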

Formula and calculation

Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. In order for the statistic to follow the F-distribution under the null hypothesis, the sums of squares should be statistically independent, and each should follow a scaled χ²-distribution. The latter condition is guaranteed if the data values are independent and normally distributed with a common variance.
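To make this construction concrete, a small simulation sketch (assuming numpy and scipy; the group count and sizes are arbitrary choices) draws independent normal data with equal means and a common variance, computes the two scaled sums of squares used in one-way ANOVA below, and checks that their ratio matches the F-distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
K, n = 4, 10          # K groups of n observations each
N = K * n
sims = []
for _ in range(10_000):
    data = rng.normal(size=(K, n))   # null: common mean and variance
    grand = data.mean()
    # Between-group sum of squares, scaled by its K - 1 degrees of freedom.
    between = n * ((data.mean(axis=1) - grand) ** 2).sum() / (K - 1)
    # Within-group sum of squares, scaled by its N - K degrees of freedom.
    within = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum() / (N - K)
    sims.append(between / within)

# The simulated upper quantile should agree with the F(K-1, N-K) quantile.
print(np.quantile(sims, 0.95))          # approximately 2.87
print(stats.f.ppf(0.95, K - 1, N - K))  # 2.866...
```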

Multiple-comparison ANOVA problems

The F-test in one-way analysis of variance (ANOVA) is used to assess whether the expected values of a quantitative variable within several pre-defined groups differ from each other. For example, suppose that a medical trial compares four treatments. The ANOVA F-test can be used to assess whether any of the treatments are on average superior, or inferior, to the others versus the null hypothesis that all four treatments yield the same mean response. This is an example of an "omnibus" test, meaning that a single test is performed to detect any of several possible differences. Alternatively, we could carry out pairwise tests among the treatments (for instance, in the medical trial example with four treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA F-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others, nor, if the F-test is performed at level α, can we state that the treatment pair with the greatest mean difference is significantly different at level α.

The formula for the one-way ANOVA F-test statistic is

    F = \frac{\text{explained variance}}{\text{unexplained variance}},

or

    F = \frac{\text{between-group variability}}{\text{within-group variability}}.

The "explained variance", or "between-group variability", is

    \frac{\sum_{i=1}^{K} n_i (\bar{Y}_{i\cdot} - \bar{Y})^2}{K - 1},

where \bar{Y}_{i\cdot} denotes the sample mean in the i-th group, n_i is the number of observations in the i-th group, \bar{Y} denotes the overall mean of the data, and K denotes the number of groups.

The "unexplained variance", or "within-group variability", is

    \frac{\sum_{i=1}^{K} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2}{N - K},

where Y_{ij} is the j-th observation in the i-th out of K groups and N is the overall sample size. This F-statistic follows the F-distribution with degrees of freedom d_1 = K - 1 and d_2 = N - K under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.

Note that when there are only two groups for the one-way ANOVA F-test, F = t^2, where t is the Student's t statistic.
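As a concrete sketch (assuming numpy and scipy; the three groups below are hypothetical data), the F statistic can be computed directly from these formulas and checked against scipy.stats.f_oneway:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for K = 3 groups of unequal size.
groups = [np.array([6.1, 5.8, 6.4, 6.0]),
          np.array([7.2, 6.9, 7.5]),
          np.array([5.1, 5.6, 5.3, 5.0, 5.4])]

K = len(groups)
N = sum(len(g) for g in groups)
grand = np.concatenate(groups).mean()

# Between-group mean square: sum of n_i (group mean - grand mean)^2 over K - 1.
between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups) / (K - 1)
# Within-group mean square: pooled squared deviations over N - K.
within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - K)

F = between / within
p = stats.f.sf(F, K - 1, N - K)
print(F, p)
print(stats.f_oneway(*groups))   # should give the same F and p
```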

Regression problems

Further information: Stepwise regression

Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has p1 parameters, and model 2 has p2 parameters, where p1 < p2, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.

One common context in this regard is that of deciding whether a model fits the data significantly better than does a naive model, in which the only explanatory term is the intercept term, so that all predicted values for the dependent variable are set equal to that variable's sample mean. The naive model is the restricted model, since the coefficients of all potential explanatory variables are restricted to equal zero.

Another common context is deciding whether there is a structural break in the data: here the restricted model uses all data in one regression, while the unrestricted model uses separate regressions for two different subsets of the data. This use of the F-test is known as the Chow test.
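A minimal sketch of a Chow test (assuming numpy and scipy; the break point at observation 30 and the simulated data are hypothetical) compares one pooled regression against separate regressions on the two subsets, using the F statistic defined at the end of this section:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 60
x = rng.normal(size=n)
# Simulated structural break: the slope changes after observation 30.
y = np.where(np.arange(n) < 30, 1.0 + 1.0 * x, 1.0 + 3.0 * x)
y = y + rng.normal(scale=0.5, size=n)

def rss(x, y):
    """Residual sum of squares of a simple linear regression of y on x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

k = 2                                    # parameters per regression
rss_pooled = rss(x, y)                   # restricted: one regression
rss_split = rss(x[:30], y[:30]) + rss(x[30:], y[30:])  # unrestricted

F = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
p = stats.f.sf(F, k, n - 2 * k)
print(F, p)   # a small p-value suggests a structural break
```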

The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a significantly better fit to the data. One approach to this problem is to use an F-test.

If there are n data points to estimate parameters of both models from, then one can calculate the F statistic, given by

    F = \frac{(\text{RSS}_1 - \text{RSS}_2) / (p_2 - p_1)}{\text{RSS}_2 / (n - p_2)},

where RSS_i is the residual sum of squares of model i. If the regression model has been calculated with weights, then replace RSS_i with χ², the weighted sum of squared residuals. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, F will have an F distribution with (p_2 − p_1, n − p_2) degrees of freedom. The null hypothesis is rejected if the F calculated from the data is greater than the critical value of the F-distribution for some desired false-rejection probability (e.g. 0.05). Since F is a monotone function of the likelihood ratio statistic, the F-test is a likelihood ratio test.
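A minimal sketch of this nested-model comparison (assuming numpy and scipy; the data, the extra regressor x2, and the sample size n = 50 are hypothetical) fits both models by ordinary least squares and applies the formula above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)   # x2 is irrelevant by construction

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

ones = np.ones(n)
X1 = np.column_stack([ones, x1])          # restricted model: p1 = 2
X2 = np.column_stack([ones, x1, x2])      # unrestricted model: p2 = 3
p1, p2 = X1.shape[1], X2.shape[1]

F = ((rss(X1, y) - rss(X2, y)) / (p2 - p1)) / (rss(X2, y) / (n - p2))
p_value = stats.f.sf(F, p2 - p1, n - p2)
print(F, p_value)   # a large p-value is expected: x2 adds nothing
```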

See also

  • Goodness of fit

References

  1. ^ Lomax, Richard G. (2007). Statistical Concepts: A Second Course. p. 10. ISBN 978-0-8058-5850-1.
  2. ^ Box, G. E. P. (1953). "Non-Normality and Tests on Variances". Biometrika. 40 (3/4): 318–335. doi:10.1093/biomet/40.3-4.318. JSTOR 2333350.
  3. ^ Markowski, Carol A; Markowski, Edward P. (1990). "Conditions for the Effectiveness of a Preliminary Test of Variance". The American Statistician. 44 (4): 322–326. doi:10.2307/2684360. JSTOR 2684360.
  4. ^ Sawilowsky, S. (2002). "Fermat, Schubert, Einstein, and Behrens–Fisher: The Probable Difference Between Two Means When σ₁² ≠ σ₂²". Journal of Modern Applied Statistical Methods. 1 (2): 461–472. doi:10.22237/jmasm/1036109940. Archived from the original on 2015-04-03. Retrieved 2015-03-30.

Further reading

  • Fox, Karl A. (1980). Intermediate Economic Statistics (Second ed.). New York: John Wiley & Sons. pp. 290–310. ISBN 0-88275-521-8.
  • Johnston, John (1972). Econometric Methods (Second ed.). New York: McGraw-Hill. pp. 35–38.
  • Kmenta, Jan (1986). Elements of Econometrics (Second ed.). New York: Macmillan. pp. 147–148. ISBN 0-02-365070-2.
  • Maddala, G. S.; Lahiri, Kajal (2009). Introduction to Econometrics (Fourth ed.). Chichester: Wiley. pp. 155–160. ISBN 978-0-470-01512-4.

External links

  • Table of F-test critical values
  • Free calculator for F-testing
  • The F-test for Linear Regression
  • Econometrics lecture (topic: hypothesis testing) on YouTube by Mark Thoma
