
Power transform

In statistics, a power transform is a family of functions applied to create a monotonic transformation of data using power functions. It is a data transformation technique used to stabilize variance, make the data more normal distribution-like, improve the validity of measures of association (such as the Pearson correlation between variables), and for other data stabilization procedures.

Power transforms are used in multiple fields, including multi-resolution and wavelet analysis,[1] statistical data analysis, medical research, modeling of physical processes,[2] geochemical data analysis,[3] epidemiology[4] and many other clinical, environmental and social research areas.

Definition

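The power transformation is defined as a continuous function of the power parameter λ, typically given in a piecewise form that makes it continuous at the point of singularity (λ = 0). For data vectors (y1, ..., yn) in which each yi > 0, the power transform is

$$y_i^{(\lambda)} = \begin{cases} \dfrac{y_i^\lambda - 1}{\lambda \, (\operatorname{GM}(y))^{\lambda - 1}}, & \text{if } \lambda \neq 0 \\[12pt] \operatorname{GM}(y) \ln y_i, & \text{if } \lambda = 0 \end{cases}$$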

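where

$$\operatorname{GM}(y) = \left( \prod_{i=1}^{n} y_i \right)^{1/n} = \sqrt[n]{y_1 y_2 \cdots y_n}$$

is the geometric mean of the observations y1, ..., yn. The case for λ = 0 is the limit as λ approaches 0. To see this, note that $y_i^\lambda = \exp(\lambda \ln y_i) = 1 + \lambda \ln y_i + O((\lambda \ln y_i)^2)$, using the Taylor series of the exponential. Then $(y_i^\lambda - 1)/\lambda = \ln y_i + O(\lambda)$, and everything but $\ln y_i$ becomes negligible for λ sufficiently small.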
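The limit argument above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the sample values are illustrative assumptions, not part of the original definition.

```python
import math

def geometric_mean(ys):
    """Geometric mean of strictly positive observations."""
    return math.exp(sum(math.log(y) for y in ys) / len(ys))

def normalized_power_transform(ys, lam):
    """Geometric-mean-normalized power transform of each y_i > 0."""
    if any(y <= 0 for y in ys):
        raise ValueError("all observations must be strictly positive")
    gm = geometric_mean(ys)
    if lam == 0:
        # Limiting case: GM(y) * ln(y_i)
        return [gm * math.log(y) for y in ys]
    scale = lam * gm ** (lam - 1)
    return [(y ** lam - 1) / scale for y in ys]

# Illustrative data (hypothetical, not taken from the article).
data = [0.5, 1.0, 2.0, 4.0]

at_zero = normalized_power_transform(data, 0.0)
near_zero = normalized_power_transform(data, 1e-6)

# The lambda = 0 branch should agree with a very small nonzero lambda.
max_gap = max(abs(a - b) for a, b in zip(at_zero, near_zero))
print("largest difference between lambda=0 and lambda=1e-6:", max_gap)
```

With λ = 1 the same function returns yi − 1, since the geometric-mean factor has exponent zero.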

The inclusion of the (λ − 1)th power of the geometric mean in the denominator simplifies the scientific interpretation of any equation involving $y_i^{(\lambda)}$, because the units of measurement do not change as λ changes.

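Box and Cox (1964) introduced the geometric mean into this transformation by first including the Jacobian of the rescaled power transformation

$$\frac{y^\lambda - 1}{\lambda}$$

with the likelihood. This Jacobian is as follows:

$$J(\lambda; y_1, \ldots, y_n) = \prod_{i=1}^{n} \left| \frac{d y_i^{(\lambda)}}{dy} \right| = \prod_{i=1}^{n} y_i^{\lambda - 1} = \operatorname{GM}(y)^{n(\lambda - 1)}$$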

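This allows the normal log likelihood at its maximum to be written as follows:

$$\begin{aligned} \log(\mathcal{L}(\hat\mu, \hat\sigma)) &= (-n/2)\left(\log(2\pi\hat\sigma^2) + 1\right) + n(\lambda - 1)\log(\operatorname{GM}(y)) \\[5pt] &= (-n/2)\left(\log\left(2\pi\hat\sigma^2 / \operatorname{GM}(y)^{2(\lambda - 1)}\right) + 1\right) \end{aligned}$$

From here, absorbing $\operatorname{GM}(y)^{2(\lambda - 1)}$ into the expression for $\hat\sigma^2$ produces an expression that establishes that minimizing the sum of squares of residuals from $y_i^{(\lambda)}$ is equivalent to maximizing the sum of the normal log likelihood of deviations from $(y^\lambda - 1)/\lambda$ and the log of the Jacobian of the transformation.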
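The equivalence can be illustrated numerically. The sketch below assumes the simplest possible model, a constant mean with no regressors, and uses hypothetical function names and data; it computes the maximized log likelihood once with the explicit Jacobian term and once from the geometric-mean-normalized data, and the two should coincide.

```python
import math

def profile_loglik_with_jacobian(ys, lam):
    """Maximized normal log likelihood of (y^lam - 1)/lam plus log Jacobian."""
    n = len(ys)
    zs = [math.log(y) if lam == 0 else (y ** lam - 1) / lam for y in ys]
    mean = sum(zs) / n
    sigma2 = sum((z - mean) ** 2 for z in zs) / n      # ML variance estimate
    log_gm = sum(math.log(y) for y in ys) / n
    return -n / 2 * (math.log(2 * math.pi * sigma2) + 1) + n * (lam - 1) * log_gm

def profile_loglik_normalized(ys, lam):
    """Same quantity computed from the GM-normalized transform alone."""
    n = len(ys)
    gm = math.exp(sum(math.log(y) for y in ys) / n)
    if lam == 0:
        ws = [gm * math.log(y) for y in ys]
    else:
        ws = [(y ** lam - 1) / (lam * gm ** (lam - 1)) for y in ys]
    mean = sum(ws) / n
    sigma2 = sum((w - mean) ** 2 for w in ws) / n
    # No separate Jacobian term: it has been absorbed into sigma2.
    return -n / 2 * (math.log(2 * math.pi * sigma2) + 1)

# Illustrative positive data (hypothetical).
sample = [0.8, 1.3, 2.1, 2.4, 3.9, 5.6, 9.2]

gaps = [
    abs(profile_loglik_with_jacobian(sample, lam) - profile_loglik_normalized(sample, lam))
    for lam in (-1.0, 0.0, 0.5, 2.0)
]
print("largest discrepancy:", max(gaps))
```

Because the second function depends on λ only through the residual sum of squares of the normalized data, comparing values of λ reduces to comparing sums of squares.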

For the rescaled form (Y^λ − 1)/λ, the value at Y = 1 is 0 for any λ, and the derivative with respect to Y there is 1 for any λ. Sometimes Y is a version of some other variable scaled to give Y = 1 at some sort of average value.

The transformation is a power transformation, but constructed so that it is continuous in the parameter λ at λ = 0. It has proved popular in regression analysis, including econometrics.

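Box and Cox also proposed a more general form of the transformation that incorporates a shift parameter α:

$$\tau(y_i; \lambda, \alpha) = \begin{cases} \dfrac{(y_i + \alpha)^\lambda - 1}{\lambda \, (\operatorname{GM}(y + \alpha))^{\lambda - 1}}, & \text{if } \lambda \neq 0 \\[12pt] \operatorname{GM}(y + \alpha) \ln(y_i + \alpha), & \text{if } \lambda = 0 \end{cases}$$

which holds if yi + α > 0 for all i. If τ(Y, λ, α) follows a truncated normal distribution, then Y is said to follow a Box–Cox distribution.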

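Bickel and Doksum eliminated the need to use a truncated distribution by extending the range of the transformation to all y, as follows:

$$\tau(y_i; \lambda, \alpha) = \begin{cases} \dfrac{\operatorname{sgn}(y_i + \alpha) \, |y_i + \alpha|^\lambda - 1}{\lambda \, (\operatorname{GM}(y + \alpha))^{\lambda - 1}}, & \text{if } \lambda \neq 0 \\[12pt] \operatorname{GM}(y + \alpha) \operatorname{sgn}(y_i + \alpha) \ln |y_i + \alpha|, & \text{if } \lambda = 0 \end{cases}$$

where sgn(.) is the sign function. This change in definition has little practical import as long as α is less than min(yi), which it usually is.[5]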
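A small sketch may help compare the shifted and the signed versions. To keep it short it deliberately omits the geometric-mean normalization, so it implements only the numerators and the factor 1/λ; the function names are hypothetical, and the λ = 0 branch of the signed version follows the formula as given in this article.

```python
import math

def sign(x):
    """Sign function: -1, 0 or 1."""
    return (x > 0) - (x < 0)

def shifted_box_cox(y, lam, alpha):
    """Shifted Box-Cox core (no GM normalization); needs y + alpha > 0."""
    x = y + alpha
    if x <= 0:
        raise ValueError("requires y + alpha > 0")
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam

def bickel_doksum(y, lam, alpha=0.0):
    """Signed-power extension (no GM normalization); accepts y + alpha < 0."""
    x = y + alpha
    if lam == 0:
        if x == 0:
            raise ValueError("logarithm is undefined at y + alpha = 0")
        return sign(x) * math.log(abs(x))
    return (sign(x) * abs(x) ** lam - 1) / lam

# On the positive half-line both versions agree.
print(shifted_box_cox(2.0, 0.5, 1.0), bickel_doksum(2.0, 0.5, 1.0))

# Only the signed version is defined when y + alpha is negative.
print(bickel_doksum(-3.0, 0.5, 1.0))
```

The two functions return the same value whenever y + α > 0, which is why the change matters only for observations at or below −α.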

Bickel and Doksum also proved that the parameter estimates are consistent and asymptotically normal under appropriate regularity conditions, though the standard Cramér–Rao lower bound can substantially underestimate the variance when parameter values are small relative to the noise variance.[5] However, this problem of underestimating the variance may not be a substantive problem in many applications.[6][7]

Box–Cox transformation

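The one-parameter Box–Cox transformations are defined as

$$y_i^{(\lambda)} = \begin{cases} \dfrac{y_i^\lambda - 1}{\lambda}, & \text{if } \lambda \neq 0 \\[8pt] \ln y_i, & \text{if } \lambda = 0 \end{cases}$$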

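and the two-parameter Box–Cox transformations as

$$y_i^{(\boldsymbol{\lambda})} = \begin{cases} \dfrac{(y_i + \lambda_2)^{\lambda_1} - 1}{\lambda_1}, & \text{if } \lambda_1 \neq 0 \\[8pt] \ln(y_i + \lambda_2), & \text{if } \lambda_1 = 0 \end{cases}$$

as described in the original article.[8][9] Moreover, the first transformations hold for $y_i > 0$, and the second for $y_i > -\lambda_2$.[8]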
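As an illustration, both forms can be written as one function in which the shift λ2 defaults to zero. This is a sketch with hypothetical names; the inverse function is an addition for convenience and is not described in the text.

```python
import math

def box_cox(y, lam1, lam2=0.0):
    """One-parameter (lam2 = 0) or two-parameter Box-Cox transform of a scalar."""
    x = y + lam2
    if x <= 0:
        raise ValueError("requires y > -lam2")
    return math.log(x) if lam1 == 0 else (x ** lam1 - 1) / lam1

def inverse_box_cox(z, lam1, lam2=0.0):
    """Map a transformed value back to the original scale."""
    if lam1 == 0:
        return math.exp(z) - lam2
    base = lam1 * z + 1
    if base <= 0:
        raise ValueError("z is outside the range of the transform")
    return base ** (1 / lam1) - lam2

# Round trip on an illustrative value for several parameter choices.
y = 3.7
for lam1 in (-0.5, 0.0, 0.5, 2.0):
    for lam2 in (0.0, 1.5):
        z = box_cox(y, lam1, lam2)
        print(lam1, lam2, z, inverse_box_cox(z, lam1, lam2))
```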

The parameter λ is estimated using the profile likelihood function and using goodness-of-fit tests.[10]
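A minimal sketch of the profile-likelihood approach, assuming a constant-mean normal model and a simple grid search (both are simplifying assumptions; statistical libraries such as SciPy offer optimizer-based routines for this). The synthetic data are constructed so that their logarithms are exact normal quantiles, so the estimate is expected to be close to zero. Goodness-of-fit-based estimation is not shown.

```python
import math
from statistics import NormalDist

def profile_loglik(ys, lam):
    """Profile log likelihood of lam under a constant-mean normal model."""
    n = len(ys)
    zs = [math.log(y) if lam == 0 else (y ** lam - 1) / lam for y in ys]
    mean = sum(zs) / n
    sigma2 = sum((z - mean) ** 2 for z in zs) / n            # ML variance
    log_jacobian = (lam - 1) * sum(math.log(y) for y in ys)  # log of prod y^(lam-1)
    return -n / 2 * (math.log(2 * math.pi * sigma2) + 1) + log_jacobian

def estimate_lambda(ys, grid):
    """Return the grid value with the highest profile log likelihood."""
    return max(grid, key=lambda lam: profile_loglik(ys, lam))

# Synthetic, deterministic data: log(y) follows standard normal quantiles.
n = 200
ys = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]

grid = [i / 100 for i in range(-200, 201)]   # lambda from -2 to 2 in steps of 0.01
lam_hat = estimate_lambda(ys, grid)
print("estimated lambda:", lam_hat)
```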

Confidence interval

A confidence interval for the Box–Cox transformation parameter can be constructed asymptotically by applying Wilks's theorem to the profile likelihood function, finding all values of λ that satisfy the following restriction:[11]

$$\ln L(\lambda) \geq \ln L(\hat\lambda) - \frac{1}{2} \chi^2_{1, 1 - \alpha}$$
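The restriction can be applied on a grid of candidate values. The sketch below makes several assumptions that are not in the text: a constant-mean normal model, a grid search, synthetic data, and a hard-coded 0.95 quantile of the chi-squared distribution with one degree of freedom (about 3.8415). It also assumes the accepted values form a single interval.

```python
import math
from statistics import NormalDist

CHI2_1_095 = 3.841458820694124   # 0.95 quantile of chi-squared with 1 d.f.

def profile_loglik(ys, lam):
    """Profile log likelihood of lam under a constant-mean normal model."""
    n = len(ys)
    zs = [math.log(y) if lam == 0 else (y ** lam - 1) / lam for y in ys]
    mean = sum(zs) / n
    sigma2 = sum((z - mean) ** 2 for z in zs) / n
    log_jacobian = (lam - 1) * sum(math.log(y) for y in ys)
    return -n / 2 * (math.log(2 * math.pi * sigma2) + 1) + log_jacobian

def wilks_interval(ys, grid, chi2_quantile=CHI2_1_095):
    """Grid values whose log likelihood is within chi2_quantile/2 of the maximum."""
    logliks = [profile_loglik(ys, lam) for lam in grid]
    cutoff = max(logliks) - 0.5 * chi2_quantile
    accepted = [lam for lam, ll in zip(grid, logliks) if ll >= cutoff]
    return min(accepted), max(accepted)

# Synthetic, deterministic data whose logarithm follows normal quantiles.
n = 200
ys = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]

grid = [i / 100 for i in range(-200, 201)]
lower, upper = wilks_interval(ys, grid)
print("approximate 95% interval for lambda:", (lower, upper))
```

The resolution of the interval is limited by the grid spacing; a root finder applied to the difference between the log likelihood and the cutoff would give sharper endpoints.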

Example

The BUPA liver data set[12] contains data on liver enzymes ALT and γGT. Suppose we are interested in using log(γGT) to predict ALT. A plot of the data appears in panel (a) of the figure. There appears to be non-constant variance, and a Box–Cox transformation might help.

 

The log-likelihood of the power parameter appears in panel (b). The horizontal reference line is at a distance of $\chi^2_{1, 0.95}/2 \approx 1.92$ from the maximum and can be used to read off an approximate 95% confidence interval for λ. It appears as though a value close to zero would be good, so we take logs.

Possibly, the transformation could be improved by adding a shift parameter to the log transformation. Panel (c) of the figure shows the log-likelihood. In this case, the maximum of the likelihood is close to zero, suggesting that a shift parameter is not needed. The final panel shows the transformed data with a superimposed regression line.

Note that although Box–Cox transformations can make big improvements in model fit, there are some issues that the transformation cannot help with. In the current example, the data are rather heavy-tailed so that the assumption of normality is not realistic and a robust regression approach leads to a more precise model.

Econometric application

Economists often characterize production relationships by some variant of the Box–Cox transformation.[13]

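Consider a common representation of production Q as dependent on services provided by a capital stock K and by labor hours N:

$$\tau(Q) = \alpha \tau(K) + (1 - \alpha) \tau(N)$$

Solving for Q by inverting the Box–Cox transformation we find

$$Q = \big( \alpha K^\lambda + (1 - \alpha) N^\lambda \big)^{1/\lambda}$$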

which is known as the constant elasticity of substitution (CES) production function.
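The inversion step can be written out explicitly. The derivation below assumes that τ denotes the one-parameter Box–Cox transformation with the same λ ≠ 0 applied to Q, K and N, which the text implies but does not state.

```latex
\begin{aligned}
\frac{Q^\lambda - 1}{\lambda}
  &= \alpha \, \frac{K^\lambda - 1}{\lambda} + (1 - \alpha) \, \frac{N^\lambda - 1}{\lambda} \\
Q^\lambda - 1
  &= \alpha K^\lambda + (1 - \alpha) N^\lambda - \alpha - (1 - \alpha)
   = \alpha K^\lambda + (1 - \alpha) N^\lambda - 1 \\
Q &= \big( \alpha K^\lambda + (1 - \alpha) N^\lambda \big)^{1/\lambda}
\end{aligned}
```

The constants cancel because the weights α and 1 − α sum to one.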

The CES production function is a homogeneous function of degree one.

When λ = 1, this produces the linear production function:

$$Q = \alpha K + (1 - \alpha) N$$

When λ → 0 this produces the famous Cobb–Douglas production function:

$$Q = K^\alpha N^{1 - \alpha}$$
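The two special cases and the degree-one homogeneity can be checked numerically. This is a sketch with a hypothetical function name and illustrative input values; the λ = 0 branch returns the Cobb–Douglas limit directly.

```python
def ces(K, N, alpha, lam):
    """CES production function; lam = 0 returns the Cobb-Douglas limit."""
    if lam == 0:
        return K ** alpha * N ** (1 - alpha)
    return (alpha * K ** lam + (1 - alpha) * N ** lam) ** (1 / lam)

K, N, alpha = 4.0, 9.0, 0.3   # illustrative inputs, not from the article

# lam = 1: linear production function alpha*K + (1 - alpha)*N
linear_gap = abs(ces(K, N, alpha, 1.0) - (alpha * K + (1 - alpha) * N))

# lam close to 0: approaches Cobb-Douglas K^alpha * N^(1 - alpha)
cobb_douglas_gap = abs(ces(K, N, alpha, 1e-6) - ces(K, N, alpha, 0.0))

# Homogeneity of degree one: doubling both inputs doubles output
homogeneity_gap = abs(ces(2 * K, 2 * N, alpha, 0.5) - 2 * ces(K, N, alpha, 0.5))

print(linear_gap, cobb_douglas_gap, homogeneity_gap)
```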

Activities and demonstrations

The SOCR resource pages contain a number of hands-on interactive activities[14] demonstrating the Box–Cox (power) transformation using Java applets and charts. These directly illustrate the effects of this transform on Q–Q plots, X–Y scatterplots, time-series plots and histograms.

Yeo–Johnson transformation

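The Yeo–Johnson transformation[15] also allows for zero and negative values of y. The parameter λ can be any real number, where λ = 1 produces the identity transformation. The transformation law reads:

$$y_i^{(\lambda)} = \begin{cases} \dfrac{(y_i + 1)^\lambda - 1}{\lambda}, & \text{if } \lambda \neq 0, \; y_i \geq 0 \\[4pt] \ln(y_i + 1), & \text{if } \lambda = 0, \; y_i \geq 0 \\[4pt] -\dfrac{(-y_i + 1)^{2 - \lambda} - 1}{2 - \lambda}, & \text{if } \lambda \neq 2, \; y_i < 0 \\[4pt] -\ln(-y_i + 1), & \text{if } \lambda = 2, \; y_i < 0 \end{cases}$$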
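The four cases translate directly into code. The following is a minimal scalar sketch with a hypothetical function name; it makes no attempt at estimating λ.

```python
import math

def yeo_johnson(y, lam):
    """Yeo-Johnson transform of a single real value y."""
    if y >= 0:
        if lam == 0:
            return math.log(y + 1)
        return ((y + 1) ** lam - 1) / lam
    # Negative branch: a power transform of (-y + 1) with exponent 2 - lam.
    if lam == 2:
        return -math.log(-y + 1)
    return -((-y + 1) ** (2 - lam) - 1) / (2 - lam)

# lam = 1 leaves values unchanged on both branches.
print(yeo_johnson(3.5, 1.0), yeo_johnson(-3.5, 1.0))

# The transform is increasing in y and maps 0 to 0 for any lam.
values = [yeo_johnson(y, 0.5) for y in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print(values)
```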

Notes

  1. ^ Gao, Peisheng; Wu, Weilin (2006). "Power Quality Disturbances Classification using Wavelet and Support Vector Machines". Sixth International Conference on Intelligent Systems Design and Applications. ISDA '06. Vol. 1. Washington, DC, USA: IEEE Computer Society. pp. 201–206. doi:10.1109/ISDA.2006.217. ISBN 9780769525280. S2CID 2444503.
  2. ^ Gluzman, S.; Yukalov, V. I. (2006-01-01). "Self-similar power transforms in extrapolation problems". Journal of Mathematical Chemistry. 39 (1): 47–56. arXiv:cond-mat/0606104. Bibcode:2006cond.mat..6104G. doi:10.1007/s10910-005-9003-7. ISSN 1572-8897. S2CID 118965098.
  3. ^ Howarth, R. J.; Earle, S. A. M. (1979-02-01). "Application of a generalized power transformation to geochemical data". Journal of the International Association for Mathematical Geology. 11 (1): 45–62. doi:10.1007/BF01043245. ISSN 1573-8868. S2CID 121582755.
  4. ^ Peters, J. L.; Rushton, L.; Sutton, A. J.; Jones, D. R.; Abrams, K. R.; Mugglestone, M. A. (2005). "Bayesian methods for the cross-design synthesis of epidemiological and toxicological evidence". Journal of the Royal Statistical Society, Series C. 54: 159–172. doi:10.1111/j.1467-9876.2005.00476.x. S2CID 121909404.
  5. ^ a b Bickel, Peter J.; Doksum, Kjell A. (June 1981). "An analysis of transformations revisited". Journal of the American Statistical Association. 76 (374): 296–311. doi:10.1080/01621459.1981.10477649.
  6. ^ Sakia, R. M. (1992), "The Box–Cox transformation technique: a review", The Statistician, 41 (2): 169–178, CiteSeerX 10.1.1.469.7176, doi:10.2307/2348250, JSTOR 2348250
  7. ^ Li, Fengfei (April 11, 2005), Box–Cox Transformations: An Overview (PDF) (slide presentation), Sao Paulo, Brazil: University of Sao Paulo, Brazil, retrieved 2014-11-02
  8. ^ a b Box, George E. P.; Cox, D. R. (1964). "An analysis of transformations". Journal of the Royal Statistical Society, Series B. 26 (2): 211–252. JSTOR 2984418. MR 0192611.
  9. ^ Johnston, J. (1984). Econometric Methods (Third ed.). New York: McGraw-Hill. pp. 61–74. ISBN 978-0-07-032685-9.
  10. ^ Asar, O.; Ilk, O.; Dag, O. (2017). "Estimating Box-Cox power transformation parameter via goodness-of-fit tests". Communications in Statistics - Simulation and Computation. 46 (1): 91–105. arXiv:1401.3812. doi:10.1080/03610918.2014.957839. S2CID 41501327.
  11. ^ Abramovich, Felix; Ritov, Ya'acov (2013). Statistical Theory: A Concise Introduction. CRC Press. pp. 121–122. ISBN 978-1-4398-5184-5.
  12. ^ BUPA liver disorder dataset
  13. ^ Zarembka, P. (1974). "Transformation of Variables in Econometrics". Frontiers in Econometrics. New York: Academic Press. pp. 81–104. ISBN 0-12-776150-0.
  14. ^ Power Transform Family Graphs, SOCR webpages
  15. ^ Yeo, In-Kwon; Johnson, Richard A. (2000). "A New Family of Power Transformations to Improve Normality or Symmetry". Biometrika. 87 (4): 954–959. doi:10.1093/biomet/87.4.954. JSTOR 2673623.

References

  • Box, George E. P.; Cox, D. R. (1964). "An analysis of transformations". Journal of the Royal Statistical Society, Series B. 26 (2): 211–252. JSTOR 2984418. MR 0192611.
  • Carroll, R. J.; Ruppert, D. (1981). "On prediction and the power transformation family" (PDF). Biometrika. 68 (3): 609–615. doi:10.1093/biomet/68.3.609.
  • DeGroot, M. H. (1987). "A Conversation with George Box". Statistical Science. 2 (3): 239–258. doi:10.1214/ss/1177013223.
  • Handelsman, D. J. (2002). "Optimal Power Transformations for Analysis of Sperm Concentration and Other Semen Variables". Journal of Andrology. 23 (5).
  • Gluzman, S.; Yukalov, V. I. (2006). "Self-similar power transforms in extrapolation problems". Journal of Mathematical Chemistry. 39 (1): 47–56. arXiv:cond-mat/0606104. Bibcode:2006cond.mat..6104G. doi:10.1007/s10910-005-9003-7. S2CID 118965098.
  • Howarth, R. J.; Earle, S. A. M. (1979). "Application of a generalized power transformation to geochemical data". Journal of the International Association for Mathematical Geology. 11 (1): 45–62. doi:10.1007/BF01043245. S2CID 121582755.

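External links

  • Nishii, R. (2001) [1994]. "Box–Cox transformation". Encyclopedia of Mathematics. EMS Press.
  • Sanford Weisberg, Yeo-Johnson Power Transformations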
