
Rayleigh quotient

In mathematics, the Rayleigh quotient[1] (/ˈreɪ.li/) for a given complex Hermitian matrix $M$ and nonzero vector $x$ is defined as:[2][3]

$$R(M,x) = \frac{x^{*} M x}{x^{*} x}.$$

For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugate transpose $x^{*}$ to the usual transpose $x^{\mathsf T}$. Note that $R(M, cx) = R(M, x)$ for any non-zero scalar $c$. Recall that a Hermitian (or real symmetric) matrix is diagonalizable with only real eigenvalues. It can be shown that, for a given matrix, the Rayleigh quotient reaches its minimum value $\lambda_{\min}$ (the smallest eigenvalue of $M$) when $x$ is $v_{\min}$ (the corresponding eigenvector).[4] Similarly, $R(M,x) \leq \lambda_{\max}$ and $R(M, v_{\max}) = \lambda_{\max}$.
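The definition and its scale invariance can be checked numerically. The following is a minimal sketch in plain Python; the function name `rayleigh_quotient` and the 2×2 Hermitian matrix (with eigenvalues 1 and 4) are assumptions chosen for illustration, not part of the original article.

```python
def rayleigh_quotient(M, x):
    """Return R(M, x) = (x* M x) / (x* x) for a square matrix M (list of rows)
    and a nonzero vector x (list of numbers, possibly complex)."""
    n = len(x)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    numerator = sum(complex(x[i]).conjugate() * Mx[i] for i in range(n))
    denominator = sum(abs(x[i]) ** 2 for i in range(n))
    if denominator == 0:
        raise ValueError("x must be nonzero")
    return numerator / denominator

# Illustrative Hermitian matrix (assumption): trace 5, determinant 4,
# so its eigenvalues are 1 and 4.
M = [[2, 1 - 1j], [1 + 1j, 3]]
x = [1 + 2j, -1j]

r = rayleigh_quotient(M, x)
# Scaling x by any nonzero complex scalar should leave the quotient unchanged.
r_scaled = rayleigh_quotient(M, [(2 - 3j) * xi for xi in x])
```

For a Hermitian matrix the imaginary part of `r` should vanish up to rounding, and the real part should lie between the smallest and largest eigenvalues.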

The Rayleigh quotient is used in the min-max theorem to get exact values of all eigenvalues. It is also used in eigenvalue algorithms (such as Rayleigh quotient iteration) to obtain an eigenvalue approximation from an eigenvector approximation.
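As a rough illustration of why the quotient is useful for turning an eigenvector approximation into an eigenvalue approximation, the sketch below perturbs a known eigenvector and measures the error of the resulting eigenvalue estimate. The matrix, the perturbation sizes, and the helper name are assumptions for illustration; this is not Rayleigh quotient iteration itself, only the estimation step it relies on.

```python
def rayleigh_quotient(M, x):
    """R(M, x) for a real symmetric matrix M (list of rows) and real vector x."""
    n = len(x)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * Mx[i] for i in range(n)) / sum(xi * xi for xi in x)

# Illustrative symmetric matrix (assumption): eigenvalues 1 and 3,
# with the eigenvector for 3 proportional to (1, 1).
M = [[2.0, 1.0], [1.0, 2.0]]

errors = []
for eps in (1e-1, 1e-2, 1e-3):
    x = [1.0, 1.0 + eps]          # eigenvector perturbed by eps
    err = abs(rayleigh_quotient(M, x) - 3.0)
    errors.append((eps, err))     # eigenvalue error shrinks roughly like eps**2
```

In this example an eigenvector error of size `eps` yields an eigenvalue error on the order of `eps**2`, which is the behaviour that makes the quotient attractive inside iterative eigenvalue algorithms.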

The range of the Rayleigh quotient (for any matrix, not necessarily Hermitian) is called a numerical range and contains its spectrum. When the matrix is Hermitian, the numerical radius is equal to the spectral norm. Still in functional analysis, $\lambda_{\max}$ is known as the spectral radius. In the context of $C^{\star}$-algebras or algebraic quantum mechanics, the function that to $M$ associates the Rayleigh–Ritz quotient $R(M,x)$ for a fixed $x$ and $M$ varying through the algebra would be referred to as a vector state of the algebra.

In quantum mechanics, the Rayleigh quotient gives the expectation value of the observable corresponding to the operator $M$ for a system whose state is given by $x$.

If we fix the complex matrix $M$, then the resulting Rayleigh quotient map (considered as a function of $x$) completely determines $M$ via the polarization identity; indeed, this remains true even if we allow $M$ to be non-Hermitian. However, if we restrict the field of scalars to the real numbers, then the Rayleigh quotient only determines the symmetric part of $M$.

Bounds for Hermitian M

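As stated in the introduction, for any vector $x$, one has $R(M,x) \in [\lambda_{\min}, \lambda_{\max}]$, where $\lambda_{\min}, \lambda_{\max}$ are respectively the smallest and largest eigenvalues of $M$. This is immediate after observing that the Rayleigh quotient is a weighted average of eigenvalues of $M$:

$$R(M,x) = \frac{x^{*} M x}{x^{*} x} = \frac{\sum_{i=1}^{n} \lambda_i |y_i|^2}{\sum_{i=1}^{n} |y_i|^2},$$

where $(\lambda_i, v_i)$ is the $i$-th eigenpair after orthonormalization and $y_i = v_i^{*} x$ is the $i$-th coordinate of $x$ in the eigenbasis. It is then easy to verify that the bounds are attained at the corresponding eigenvectors $v_{\min}, v_{\max}$.

The fact that the quotient is a weighted average of the eigenvalues can be used to identify the second, the third, ... largest eigenvalues. Let $\lambda_{\max} = \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n = \lambda_{\min}$ be the eigenvalues in decreasing order. If $n = 2$ and $x$ is constrained to be orthogonal to $v_1$, in which case $y_1 = v_1^{*} x = 0$, then $R(M,x)$ has maximum value $\lambda_2$, which is achieved when $x = v_2$.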
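The weighted-average form and the resulting bounds can be compared against the direct definition on a small real example. In the sketch below the matrix, its hand-computed orthonormal eigenpairs, the random samples, and all function names are assumptions for illustration.

```python
import math
import random

# Illustrative symmetric matrix (assumption) with orthonormal eigenpairs
# (3, (1, 1)/sqrt(2)) and (1, (1, -1)/sqrt(2)).
M = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]

def rayleigh_quotient(M, x):
    """Direct definition x^T M x / x^T x for real x."""
    n = len(x)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * Mx[i] for i in range(n)) / sum(xi * xi for xi in x)

def weighted_average(eigenpairs, x):
    """Weighted average of eigenvalues with weights y_i^2, y_i = v_i^T x."""
    ys = [sum(vi * xi for vi, xi in zip(v, x)) for _, v in eigenpairs]
    numerator = sum(lam * y * y for (lam, _), y in zip(eigenpairs, ys))
    denominator = sum(y * y for y in ys)
    return numerator / denominator

random.seed(0)
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(5)]
direct = [rayleigh_quotient(M, x) for x in samples]
averaged = [weighted_average(eigenpairs, x) for x in samples]

# Constraining x to be orthogonal to v_1 = (1, 1)/sqrt(2) leaves only lambda_2.
constrained = rayleigh_quotient(M, [1.0, -1.0])
```

Both computations should agree for every sample, every value should fall in the interval [1, 3], and the constrained quotient should equal the second eigenvalue.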

Special case of covariance matrices

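An empirical covariance matrix $M$ can be represented as the product $A^{\mathsf T} A$ of the data matrix $A$ pre-multiplied by its transpose $A^{\mathsf T}$. Being a positive semi-definite matrix, $M$ has non-negative eigenvalues, and orthogonal (or orthogonalisable) eigenvectors, which can be demonstrated as follows.

Firstly, that the eigenvalues $\lambda_i$ are non-negative:

$$\begin{aligned}
& M v_i = A^{\mathsf T} A v_i = \lambda_i v_i \\
\Rightarrow{} & v_i^{\mathsf T} A^{\mathsf T} A v_i = v_i^{\mathsf T} \lambda_i v_i \\
\Rightarrow{} & \left\| A v_i \right\|^2 = \lambda_i \left\| v_i \right\|^2 \\
\Rightarrow{} & \lambda_i = \frac{\left\| A v_i \right\|^2}{\left\| v_i \right\|^2} \geq 0.
\end{aligned}$$

Secondly, that the eigenvectors $v_i$ are orthogonal to one another:

$$\begin{aligned}
& M v_i = \lambda_i v_i \\
\Rightarrow{} & v_j^{\mathsf T} M v_i = v_j^{\mathsf T} \lambda_i v_i \\
\Rightarrow{} & \left( M v_j \right)^{\mathsf T} v_i = \lambda_j v_j^{\mathsf T} v_i \\
\Rightarrow{} & \lambda_j v_j^{\mathsf T} v_i = \lambda_i v_j^{\mathsf T} v_i \\
\Rightarrow{} & \left( \lambda_j - \lambda_i \right) v_j^{\mathsf T} v_i = 0 \\
\Rightarrow{} & v_j^{\mathsf T} v_i = 0
\end{aligned}$$

if the eigenvalues are different – in the case of multiplicity, the basis can be orthogonalized.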

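To now establish that the Rayleigh quotient is maximized by the eigenvector with the largest eigenvalue, consider decomposing an arbitrary vector $x$ on the basis of the eigenvectors $v_i$:

$$x = \sum_{i=1}^{n} \alpha_i v_i,$$

where

$$\alpha_i = \frac{x^{\mathsf T} v_i}{v_i^{\mathsf T} v_i} = \frac{\langle x, v_i \rangle}{\left\| v_i \right\|^2}$$

is the coordinate of $x$ orthogonally projected onto $v_i$. Therefore, we have:

$$\begin{aligned}
R(M,x) &= \frac{x^{\mathsf T} A^{\mathsf T} A x}{x^{\mathsf T} x} \\
&= \frac{\Bigl( \sum_{j=1}^{n} \alpha_j v_j \Bigr)^{\mathsf T} \left( A^{\mathsf T} A \right) \Bigl( \sum_{i=1}^{n} \alpha_i v_i \Bigr)}{\Bigl( \sum_{j=1}^{n} \alpha_j v_j \Bigr)^{\mathsf T} \Bigl( \sum_{i=1}^{n} \alpha_i v_i \Bigr)} \\
&= \frac{\Bigl( \sum_{j=1}^{n} \alpha_j v_j \Bigr)^{\mathsf T} \Bigl( \sum_{i=1}^{n} \alpha_i (A^{\mathsf T} A) v_i \Bigr)}{\sum_{i=1}^{n} \alpha_i^2 \, v_i^{\mathsf T} v_i} \\
&= \frac{\Bigl( \sum_{j=1}^{n} \alpha_j v_j \Bigr)^{\mathsf T} \Bigl( \sum_{i=1}^{n} \alpha_i \lambda_i v_i \Bigr)}{\sum_{i=1}^{n} \alpha_i^2 \left\| v_i \right\|^2}
\end{aligned}$$

which, by orthonormality of the eigenvectors, becomes:

$$\begin{aligned}
R(M,x) &= \frac{\sum_{i=1}^{n} \alpha_i^2 \lambda_i}{\sum_{i=1}^{n} \alpha_i^2} \\
&= \sum_{i=1}^{n} \lambda_i \frac{(x^{\mathsf T} v_i)^2}{(x^{\mathsf T} x)(v_i^{\mathsf T} v_i)^2} \\
&= \sum_{i=1}^{n} \lambda_i \frac{(x^{\mathsf T} v_i)^2}{x^{\mathsf T} x}.
\end{aligned}$$

The last representation establishes that the Rayleigh quotient is the sum of the squared cosines of the angles formed by the vector $x$ and each eigenvector $v_i$, weighted by corresponding eigenvalues.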

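If a vector $x$ maximizes $R(M,x)$, then any non-zero scalar multiple $kx$ also maximizes $R$, so the problem can be reduced to the Lagrange problem of maximizing $\sum_{i=1}^{n} \alpha_i^2 \lambda_i$ under the constraint that $\sum_{i=1}^{n} \alpha_i^2 = 1$.

Define: $\beta_i = \alpha_i^2$. This then becomes a linear program in the $\beta_i$ (maximize $\sum_{i=1}^{n} \beta_i \lambda_i$ subject to $\beta_i \geq 0$ and $\sum_{i=1}^{n} \beta_i = 1$), which always attains its maximum at one of the corners of the domain. A maximum point will have $\alpha_1 = \pm 1$ and $\alpha_i = 0$ for all $i > 1$ (when the eigenvalues are ordered by decreasing magnitude).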

Thus, the Rayleigh quotient is maximized by the eigenvector with the largest eigenvalue.
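The covariance case can be illustrated with a small data matrix. The sketch below uses the fact that the quotient for $A^{\mathsf T} A$ equals $\|Ax\|^2 / \|x\|^2$, and approximates the maximizing eigenvector by power iteration, a technique swapped in here purely for illustration (the article itself argues via the eigenbasis). The data matrix, the starting vector, and the function names are assumptions.

```python
import math

# Illustrative 3x2 data matrix (assumption); A^T A = [[2, 1], [1, 5]],
# whose eigenvalues are (7 +/- sqrt(13)) / 2.
A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

def matvec(B, x):
    return [sum(b * xi for b, xi in zip(row, x)) for row in B]

def transpose(B):
    return [list(col) for col in zip(*B)]

def rayleigh_quotient_cov(A, x):
    """R(A^T A, x) = ||A x||^2 / ||x||^2, which can never be negative."""
    Ax = matvec(A, x)
    return sum(a * a for a in Ax) / sum(xi * xi for xi in x)

def power_iteration(A, x, steps=100):
    """Repeatedly apply A^T A and renormalize to approach the top eigenvector."""
    At = transpose(A)
    for _ in range(steps):
        x = matvec(At, matvec(A, x))
        norm = math.sqrt(sum(xi * xi for xi in x))
        x = [xi / norm for xi in x]
    return x

v_max = power_iteration(A, [1.0, 0.0])
lam_max = rayleigh_quotient_cov(A, v_max)

# Quotients at a few other directions, for comparison with lam_max.
others = [rayleigh_quotient_cov(A, x) for x in ([1.0, 0.0], [0.0, 1.0], [1.0, -1.0])]
```

Under these assumptions `lam_max` should approach the largest eigenvalue of $A^{\mathsf T} A$, and the quotient at any other direction should be non-negative and no larger than `lam_max`.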

Formulation using Lagrange multipliers

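Alternatively, this result can be arrived at by the method of Lagrange multipliers. The first part is to show that the quotient is constant under scaling $x \to cx$, where $c$ is a non-zero scalar:

$$R(M, cx) = \frac{(cx)^{*} M \, cx}{(cx)^{*} cx} = \frac{c^{*} c}{c^{*} c} \frac{x^{*} M x}{x^{*} x} = R(M,x).$$

Because of this invariance, it is sufficient to study the special case $\|x\|^2 = x^{\mathsf T} x = 1$. The problem is then to find the critical points of the function

$$R(M,x) = x^{\mathsf T} M x,$$

subject to the constraint $\|x\|^2 = x^{\mathsf T} x = 1$. In other words, it is to find the critical points of

$$\mathcal{L}(x) = x^{\mathsf T} M x - \lambda \left( x^{\mathsf T} x - 1 \right),$$

where $\lambda$ is a Lagrange multiplier. The stationary points of $\mathcal{L}(x)$ occur at

$$\begin{aligned}
& \frac{d\mathcal{L}(x)}{dx} = 0 \\
\Rightarrow{} & 2 x^{\mathsf T} M - 2 \lambda x^{\mathsf T} = 0 \\
\Rightarrow{} & 2 M x - 2 \lambda x = 0 \quad \text{(taking the transpose of both sides and noting that } M \text{ is Hermitian)} \\
\Rightarrow{} & M x = \lambda x
\end{aligned}$$

and

$$\therefore R(M,x) = \frac{x^{\mathsf T} M x}{x^{\mathsf T} x} = \lambda \frac{x^{\mathsf T} x}{x^{\mathsf T} x} = \lambda.$$

Therefore, the eigenvectors $x_1, \ldots, x_n$ of $M$ are the critical points of the Rayleigh quotient and their corresponding eigenvalues $\lambda_1, \ldots, \lambda_n$ are the stationary values of $\mathcal{L}$. This property is the basis for principal components analysis and canonical correlation.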
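The critical-point property can be checked without the constraint by differentiating the quotient directly. For a real symmetric $M$, the gradient of $R(M,x)$ with respect to $x$ is $2\,(Mx - R(M,x)\,x) / (x^{\mathsf T} x)$, which vanishes exactly when $Mx = R(M,x)\,x$. The sketch below evaluates this expression on an illustrative matrix; the matrix and function names are assumptions.

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def rayleigh_quotient(M, x):
    Mx = matvec(M, x)
    return sum(a * b for a, b in zip(x, Mx)) / sum(xi * xi for xi in x)

def rayleigh_gradient(M, x):
    """Gradient of R(M, x) for real symmetric M: 2 (M x - R x) / (x^T x)."""
    xx = sum(xi * xi for xi in x)
    r = rayleigh_quotient(M, x)
    Mx = matvec(M, x)
    return [2.0 * (Mx[i] - r * x[i]) / xx for i in range(len(x))]

# Illustrative symmetric matrix (assumption): eigenvectors (1, 1) and (1, -1).
M = [[2.0, 1.0], [1.0, 2.0]]

grad_at_v1 = rayleigh_gradient(M, [1.0, 1.0])    # eigenvector for eigenvalue 3
grad_at_v2 = rayleigh_gradient(M, [1.0, -1.0])   # eigenvector for eigenvalue 1
grad_elsewhere = rayleigh_gradient(M, [1.0, 0.0])  # not an eigenvector
```

The gradient should be zero at both eigenvectors and non-zero at the third point, consistent with eigenvectors being the only critical points.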

Use in Sturm–Liouville theory

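Sturm–Liouville theory concerns the action of the linear operator

$$L(y) = \frac{1}{w(x)} \left( -\frac{d}{dx} \left[ p(x) \frac{dy}{dx} \right] + q(x) y \right)$$

on the inner product space defined by

$$\langle y_1, y_2 \rangle = \int_a^b w(x) \, y_1(x) \, y_2(x) \, dx$$

of functions satisfying some specified boundary conditions at $a$ and $b$. In this case the Rayleigh quotient is

$$\frac{\langle y, Ly \rangle}{\langle y, y \rangle} = \frac{\int_a^b y(x) \left( -\frac{d}{dx} \left[ p(x) \frac{dy}{dx} \right] + q(x) y(x) \right) dx}{\int_a^b w(x) \, y(x)^2 \, dx}.$$

This is sometimes presented in an equivalent form, obtained by separating the integral in the numerator and using integration by parts:

$$\begin{aligned}
\frac{\langle y, Ly \rangle}{\langle y, y \rangle}
&= \frac{\left\{ \int_a^b y(x) \left( -\frac{d}{dx} \left[ p(x) y'(x) \right] \right) dx \right\} + \left\{ \int_a^b q(x) \, y(x)^2 \, dx \right\}}{\int_a^b w(x) \, y(x)^2 \, dx} \\
&= \frac{\left\{ \left. -y(x) \left[ p(x) y'(x) \right] \right|_a^b \right\} + \left\{ \int_a^b y'(x) \left[ p(x) y'(x) \right] dx \right\} + \left\{ \int_a^b q(x) \, y(x)^2 \, dx \right\}}{\int_a^b w(x) \, y(x)^2 \, dx} \\
&= \frac{\left\{ \left. -p(x) \, y(x) \, y'(x) \right|_a^b \right\} + \left\{ \int_a^b \left[ p(x) \, y'(x)^2 + q(x) \, y(x)^2 \right] dx \right\}}{\int_a^b w(x) \, y(x)^2 \, dx}.
\end{aligned}$$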
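The integrated-by-parts form lends itself to a quick numerical estimate. The sketch below takes the simplest case $p = 1$, $q = 0$, $w = 1$ on $[0, \pi]$ with $y(0) = y(\pi) = 0$, so the boundary term vanishes and the smallest eigenvalue of $-y'' = \lambda y$ is 1 with eigenfunction $\sin x$. The choice of problem, the trial function $y = x(\pi - x)$, the midpoint rule, and the function name are all assumptions for illustration.

```python
import math

def rayleigh_quotient_sl(p, q, w, y, dy, a, b, n=2000):
    """Estimate the Sturm-Liouville Rayleigh quotient with the midpoint rule,
    using the integrated-by-parts form and assuming the boundary term
    -p y y' evaluated at a and b vanishes."""
    h = (b - a) / n
    numerator = 0.0
    denominator = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        numerator += (p(x) * dy(x) ** 2 + q(x) * y(x) ** 2) * h
        denominator += w(x) * y(x) ** 2 * h
    return numerator / denominator

def one(x):
    return 1.0

def zero(x):
    return 0.0

# Trial function that satisfies the boundary conditions but is not an eigenfunction.
estimate = rayleigh_quotient_sl(
    one, zero, one,
    y=lambda x: x * (math.pi - x),
    dy=lambda x: math.pi - 2.0 * x,
    a=0.0, b=math.pi,
)

# The true lowest eigenfunction sin(x) should reproduce the eigenvalue 1.
at_eigenfunction = rayleigh_quotient_sl(
    one, zero, one, y=math.sin, dy=math.cos, a=0.0, b=math.pi
)
```

For the polynomial trial function the integrals can be done by hand, giving $10/\pi^2 \approx 1.013$, slightly above the true smallest eigenvalue 1, as expected for a quotient that is minimized by the lowest eigenfunction.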

Generalizations

  1. For a given pair (A, B) of matrices, and a given non-zero vector x, the generalized Rayleigh quotient is defined as:
     $$R(A,B;x) := \frac{x^{*} A x}{x^{*} B x}.$$
    The generalized Rayleigh quotient can be reduced to the Rayleigh quotient $R(D, C^{*} x)$ through the transformation $D = C^{-1} A \, {C^{*}}^{-1}$, where $C C^{*}$ is the Cholesky decomposition of the Hermitian positive-definite matrix B.
  2. For a given pair (x, y) of non-zero vectors, and a given Hermitian matrix H, the generalized Rayleigh quotient can be defined as:
     $$R(H;x,y) := \frac{y^{*} H x}{\sqrt{y^{*} y \cdot x^{*} x}},$$
    which coincides with R(H,x) when x = y. In quantum mechanics, this quantity is called a "matrix element" or sometimes a "transition amplitude".
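The Cholesky reduction in the first generalization can be verified on a small real example, where the conjugate transpose is the ordinary transpose. The sketch below hard-codes the 2×2 case to stay dependency-free; the matrices, the vector, and all helper names are assumptions for illustration.

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matvec(P, x):
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def generalized_rq(A, B, x):
    """R(A, B; x) = x^T A x / x^T B x for real matrices and vectors."""
    return dot(x, matvec(A, x)) / dot(x, matvec(B, x))

def cholesky_2x2(B):
    """Lower-triangular C with C C^T = B, for a 2x2 symmetric positive-definite B."""
    c11 = math.sqrt(B[0][0])
    c21 = B[1][0] / c11
    c22 = math.sqrt(B[1][1] - c21 * c21)
    return [[c11, 0.0], [c21, c22]]

def inv_lower_2x2(C):
    """Inverse of a 2x2 lower-triangular matrix."""
    a, b, c = C[0][0], C[1][0], C[1][1]
    return [[1.0 / a, 0.0], [-b / (a * c), 1.0 / c]]

# Illustrative inputs (assumptions): A symmetric, B symmetric positive-definite.
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[4.0, 2.0], [2.0, 3.0]]
x = [1.0, 2.0]

C = cholesky_2x2(B)
C_inv = inv_lower_2x2(C)
D = matmul(matmul(C_inv, A), transpose(C_inv))  # D = C^{-1} A C^{-T}
z = matvec(transpose(C), x)                     # z = C^T x

lhs = generalized_rq(A, B, x)                   # generalized quotient
rhs = dot(z, matvec(D, z)) / dot(z, z)          # ordinary quotient R(D, C^T x)
```

The two values should agree up to rounding, since $z^{\mathsf T} D z = x^{\mathsf T} A x$ and $z^{\mathsf T} z = x^{\mathsf T} B x$.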

See also

  • Field of values
  • Min-max theorem
  • Rayleigh's quotient in vibrations analysis
  • Dirichlet eigenvalue

References

  1. Also known as the Rayleigh–Ritz ratio; named after Walther Ritz and Lord Rayleigh.
  2. Horn, R. A.; Johnson, C. R. (1985). Matrix Analysis. Cambridge University Press. pp. 176–180. ISBN 0-521-30586-1.
  3. Parlett, B. N. (1998). The Symmetric Eigenvalue Problem. Classics in Applied Mathematics. SIAM. ISBN 0-89871-402-8.
  4. Costin, Rodica D. (2013). "Midterm notes" (PDF). Mathematics 5102 Linear Mathematics in Infinite Dimensions, lecture notes. The Ohio State University.

Further reading

  • Shi Yu, Léon-Charles Tranchevent, Bart De Moor, Yves Moreau, Kernel-based Data Fusion for Machine Learning: Methods and Applications in Bioinformatics and Text Mining, Ch. 2, Springer, 2011.
