
Inverse iteration

In numerical analysis, inverse iteration (also known as the inverse power method) is an iterative eigenvalue algorithm. It allows one to find an approximate eigenvector when an approximation to a corresponding eigenvalue is already known. The method is conceptually similar to the power method. It appears to have originally been developed to compute resonance frequencies in the field of structural mechanics.[1]

The inverse power iteration algorithm starts with an approximation $\mu$ for the eigenvalue corresponding to the desired eigenvector and a vector $b_0$, either a randomly selected vector or an approximation to the eigenvector. The method is described by the iteration

$$b_{k+1} = \frac{(A - \mu I)^{-1} b_k}{C_k},$$

where $C_k$ are some constants usually chosen as $C_k = \left\| (A - \mu I)^{-1} b_k \right\|$. Since eigenvectors are defined only up to multiplication by a constant, the choice of $C_k$ can be arbitrary in theory; practical aspects of the choice of $C_k$ are discussed below.

At every iteration, the vector $b_k$ is multiplied by the matrix $(A - \mu I)^{-1}$ and normalized. It is exactly the same formula as in the power method, except that the matrix $A$ is replaced by $(A - \mu I)^{-1}$. The closer the approximation $\mu$ is to the eigenvalue, the faster the algorithm converges; however, an incorrect choice of $\mu$ can lead to slow convergence or to convergence to an eigenvector other than the desired one. In practice, the method is used when a good approximation for the eigenvalue is known, and hence one needs only a few (quite often just one) iterations.
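The following is a minimal sketch of the iteration above in Python/NumPy; the function name `inverse_iteration`, the iteration count, and the example matrix are illustrative choices, not part of the original description:

```python
import numpy as np

def inverse_iteration(A, mu, b0=None, num_iters=10):
    """One possible implementation of b_{k+1} = (A - mu*I)^{-1} b_k / C_k."""
    n = A.shape[0]
    b = np.random.default_rng(0).standard_normal(n) if b0 is None else np.asarray(b0, dtype=float)
    M = A - mu * np.eye(n)                 # shifted matrix A - mu*I
    for _ in range(num_iters):
        b = np.linalg.solve(M, b)          # apply (A - mu*I)^{-1} by solving a linear system
        b = b / np.linalg.norm(b)          # normalize: C_k = ||(A - mu*I)^{-1} b_k||
    return b

# Example: eigenvector of A for the eigenvalue closest to mu = 2.9 (here approximately 3.618)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
v = inverse_iteration(A, mu=2.9)
print(v, v @ A @ v)                        # the Rayleigh quotient approximates the eigenvalue
```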

Theory and convergence

The basic idea of the power iteration is choosing an initial vector $b$ (either an eigenvector approximation or a random vector) and iteratively calculating $Ab$, $A^2 b$, $A^3 b$, $\ldots$. Except for a set of measure zero, for any initial vector the result will converge to an eigenvector corresponding to the dominant eigenvalue.

The inverse iteration does the same for the matrix $(A - \mu I)^{-1}$, so it converges to the eigenvector corresponding to the dominant eigenvalue of the matrix $(A - \mu I)^{-1}$. The eigenvalues of this matrix are $(\lambda_1 - \mu)^{-1}, \ldots, (\lambda_n - \mu)^{-1}$, where $\lambda_i$ are the eigenvalues of $A$. The largest of these numbers corresponds to the smallest of $|\lambda_1 - \mu|, \ldots, |\lambda_n - \mu|$. The eigenvectors of $A$ and of $(A - \mu I)^{-1}$ are the same, since

$$Av = \lambda v \;\Leftrightarrow\; (A - \mu I)v = (\lambda - \mu)v \;\Leftrightarrow\; (\lambda - \mu)^{-1} v = (A - \mu I)^{-1} v.$$

Conclusion: the method converges to the eigenvector of the matrix $A$ corresponding to the eigenvalue closest to $\mu$.

In particular, taking $\mu = 0$ we see that $A^{-1} b_k$ converges to the eigenvector corresponding to the eigenvalue of $A^{-1}$ with the largest magnitude, $1/|\lambda_N|$, and thus can be used to determine the smallest-magnitude eigenvalue $\lambda_N$ of $A$, since the two are inversely related.
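As a quick check of this conclusion, the following snippet (reusing the `inverse_iteration` sketch from above, so the same caveats apply) uses the shift $\mu = 0$ to recover the eigenvector of the smallest-magnitude eigenvalue of a small symmetric matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

v = inverse_iteration(A, mu=0.0, num_iters=20)       # shift 0: dominant eigenvalue of A^{-1}
estimate = v @ A @ v                                  # Rayleigh quotient of the result
smallest = min(abs(np.linalg.eigvalsh(A)))            # reference value for comparison
print(estimate, smallest)                             # the two should agree closely
```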

Speed of convergence

Let us analyze the rate of convergence of the method.

The power method is known to converge linearly to the limit, more precisely:

$$\mathrm{Distance}\left(b^{\mathrm{ideal}},\, b^{k}_{\mathrm{Power\ Method}}\right) = O\!\left( \left| \frac{\lambda_{\mathrm{subdominant}}}{\lambda_{\mathrm{dominant}}} \right|^{k} \right),$$

hence for the inverse iteration method the analogous result reads:

$$\mathrm{Distance}\left(b^{\mathrm{ideal}},\, b^{k}_{\mathrm{Inverse\ iteration}}\right) = O\!\left( \left| \frac{\mu - \lambda_{\text{closest to } \mu}}{\mu - \lambda_{\text{second closest to } \mu}} \right|^{k} \right).$$

This is a key formula for understanding the method's convergence. It shows that if $\mu$ is chosen close enough to some eigenvalue $\lambda$, say $\mu - \lambda = \epsilon$, then each iteration reduces the error by a factor of roughly $\left| \epsilon / (\lambda + \epsilon - \lambda_{\text{closest to } \lambda}) \right|$. (We use that for small enough $\epsilon$, "closest to $\mu$" and "closest to $\lambda$" are the same.) For small enough $\epsilon$ this is approximately $\left| \epsilon / (\lambda - \lambda_{\text{closest to } \lambda}) \right|$. Hence, if one is able to find a $\mu$ such that $\epsilon$ is small enough, very few iterations may be satisfactory.
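A small numerical experiment (illustrative only; the matrix and shift are arbitrary choices) can confirm the predicted per-iteration contraction factor $|\mu - \lambda_{\text{closest}}| / |\mu - \lambda_{\text{second closest}}|$:

```python
import numpy as np

A = np.diag([1.0, 2.0, 5.0])                  # diagonal, so the eigenpairs are known exactly
mu = 2.1                                      # closest eigenvalue 2.0, second closest 1.0
predicted = abs(mu - 2.0) / abs(mu - 1.0)     # = 0.1 / 1.1, about 0.0909

target = np.array([0.0, 1.0, 0.0])            # exact eigenvector for the eigenvalue 2.0
b = np.ones(3) / np.sqrt(3.0)
M = A - mu * np.eye(3)
prev_err = None
for k in range(1, 6):
    b = np.linalg.solve(M, b)
    b = b / np.linalg.norm(b)
    err = np.linalg.norm(np.sign(b[1]) * b - target)   # fix the sign before comparing
    if prev_err is not None:
        print(f"iteration {k}: observed ratio {err / prev_err:.4f}, predicted {predicted:.4f}")
    prev_err = err
```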

Complexity

The inverse iteration algorithm requires solving a linear system or calculating the inverse matrix. For non-structured matrices (not sparse, not Toeplitz, ...) this requires $O(n^3)$ operations.

Implementation options

The method is defined by the formula:

$$b_{k+1} = \frac{(A - \mu I)^{-1} b_k}{C_k}.$$

There are, however, multiple options for its implementation.

Calculate inverse matrix or solve system of linear equations

We can rewrite the formula in the following way:

$$(A - \mu I)\, b_{k+1} = \frac{b_k}{C_k},$$

emphasizing that to find the next approximation $b_{k+1}$ we may solve a system of linear equations. There are two options: one may choose an algorithm that solves a linear system, or one may calculate the inverse $(A - \mu I)^{-1}$ and then apply it to the vector. Both options have complexity $O(n^3)$; the exact number of operations depends on the chosen method.

The choice also depends on the number of iterations. Naively, if at each iteration one solves a linear system, the complexity will be $k\,O(n^3)$, where $k$ is the number of iterations; similarly, calculating the inverse matrix and applying it at each iteration is of complexity $k\,O(n^3)$. Note, however, that if the eigenvalue estimate $\mu$ remains constant, then we may reduce the complexity to $O(n^3) + k\,O(n^2)$ with either method. Calculating the inverse matrix once and storing it to apply at each iteration is of complexity $O(n^3) + k\,O(n^2)$. Storing an LU decomposition of $(A - \mu I)$ and using forward and back substitution to solve the system of equations at each iteration is also of complexity $O(n^3) + k\,O(n^2)$.

Inverting the matrix will typically have a greater initial cost, but a lower cost at each iteration. Conversely, solving systems of linear equations will typically have a lesser initial cost, but require more operations for each iteration.
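As an illustration of the LU-based variant, here is a minimal sketch using SciPy (function and variable names are my own; it assumes the shift $\mu$ stays fixed across iterations):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration_lu(A, mu, num_iters=10, seed=0):
    """Inverse iteration that pays the O(n^3) factorization cost only once."""
    n = A.shape[0]
    b = np.random.default_rng(seed).standard_normal(n)
    lu, piv = lu_factor(A - mu * np.eye(n))   # O(n^3), done once because mu is fixed
    for _ in range(num_iters):
        b = lu_solve((lu, piv), b)            # forward/back substitution, O(n^2) per iteration
        b = b / np.linalg.norm(b)             # normalize: C_k = ||(A - mu*I)^{-1} b_k||
    return b
```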

Tridiagonalization, Hessenberg form

If it is necessary to perform many iterations (or few iterations, but for many eigenvectors), then it might be wise to bring the matrix to upper Hessenberg form first (for a symmetric matrix this will be tridiagonal form). This costs $\tfrac{10}{3} n^3 + O(n^2)$ arithmetic operations using a technique based on Householder reduction, with a finite sequence of orthogonal similarity transforms, somewhat like a two-sided QR decomposition.[2][3] (For QR decomposition, the Householder rotations are multiplied only on the left, but for the Hessenberg case they are multiplied on both the left and the right.) For symmetric matrices this procedure costs $\tfrac{4}{3} n^3 + O(n^2)$ arithmetic operations.[2][3]

Solving the system of linear equations for the tridiagonal matrix costs $O(n)$ operations, so the complexity grows like $O(n^3) + k\,O(n)$, where $k$ is the iteration number; this is better than for direct inversion. However, for few iterations such a transformation may not be practical.

Also, transformation to the Hessenberg form involves square roots and division operations, which are not universally supported by hardware.
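The following is a rough sketch of this approach for a symmetric matrix using SciPy's Hessenberg reduction and a banded solver (names and the choice of routines are mine, not prescribed by the text):

```python
import numpy as np
from scipy.linalg import hessenberg, solve_banded

def inverse_iteration_tridiag(A, mu, num_iters=10, seed=0):
    """Inverse iteration after reducing a symmetric A to (essentially) tridiagonal form.

    The one-time reduction costs O(n^3); each shifted solve then costs only O(n).
    """
    n = A.shape[0]
    T, Q = hessenberg(A, calc_q=True)        # for symmetric A, T is tridiagonal and A = Q T Q^T
    M = T - mu * np.eye(n)                   # shifted tridiagonal matrix

    # Pack the three diagonals of M into the banded storage used by solve_banded.
    ab = np.zeros((3, n))
    ab[0, 1:] = np.diag(M, 1)                # superdiagonal
    ab[1, :] = np.diag(M)                    # main diagonal
    ab[2, :-1] = np.diag(M, -1)              # subdiagonal

    b = np.random.default_rng(seed).standard_normal(n)
    for _ in range(num_iters):
        b = solve_banded((1, 1), ab, b)      # O(n) tridiagonal solve
        b = b / np.linalg.norm(b)
    return Q @ b                             # map the eigenvector back to the original basis
```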

Choice of the normalization constant $C_k$

On general-purpose processors (e.g. those produced by Intel) the execution time of addition, multiplication and division is approximately equal. But on embedded and/or low-power hardware (digital signal processors, FPGAs, ASICs) division may not be supported in hardware, and so should be avoided. Choosing $C_k = 2^{n_k}$ allows fast division without explicit hardware support, since division by a power of 2 may be implemented as either a bit shift (for fixed-point arithmetic) or subtraction of $n_k$ from the exponent (for floating-point arithmetic).

When implementing the algorithm using fixed-point arithmetic, the choice of the constant $C_k$ is especially important. Small values will lead to fast growth of the norm of $b_k$ and to overflow; large values of $C_k$ will cause the vector $b_k$ to tend toward zero.
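A minimal floating-point sketch of power-of-two normalization is shown below; `apply_inverse` is a hypothetical callable standing in for the shifted solve $(A - \mu I)^{-1} b$, and the scaling is done purely by exponent manipulation, so no general division is needed:

```python
import math
import numpy as np

def normalize_pow2(b):
    """Scale b by C_k = 2^{n_k}, a power of two near the largest component's magnitude.

    Division by 2^{n_k} is exact in floating point and amounts to subtracting n_k
    from the exponent (np.ldexp), so no general-purpose division is required.
    """
    n_k = math.frexp(float(np.max(np.abs(b))))[1]   # exponent of the largest component
    return np.ldexp(b, -n_k)                        # b * 2^(-n_k); max component now in [0.5, 1)

def inverse_iteration_pow2(apply_inverse, b, num_iters=10):
    # apply_inverse(b) should return (A - mu*I)^{-1} b, e.g. via a stored LU factorization.
    for _ in range(num_iters):
        b = normalize_pow2(apply_inverse(b))
    return b
```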

Usage

The main application of the method is the situation when an approximation to an eigenvalue is found and one needs to find the corresponding approximate eigenvector. In such a situation the inverse iteration is the main and probably the only method to use.

Methods to find approximate eigenvalues

Typically, the method is used in combination with some other method that finds approximate eigenvalues: the standard example is the bisection eigenvalue algorithm. Another example is Rayleigh quotient iteration, which is in fact the same inverse iteration with the approximate eigenvalue chosen as the Rayleigh quotient of the vector obtained on the previous step of the iteration.
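As a concrete illustration of the Rayleigh-quotient variant just mentioned, here is a minimal sketch for a symmetric matrix (the function name and iteration count are my own choices); note that the shifted matrix must be re-factored at every step because the shift changes:

```python
import numpy as np

def rayleigh_quotient_iteration(A, mu0, num_iters=5, seed=0):
    """Inverse iteration in which the shift is refreshed with the Rayleigh quotient."""
    n = A.shape[0]
    b = np.random.default_rng(seed).standard_normal(n)
    b = b / np.linalg.norm(b)
    mu = mu0
    for _ in range(num_iters):
        b = np.linalg.solve(A - mu * np.eye(n), b)   # one inverse-iteration step with shift mu
        b = b / np.linalg.norm(b)
        mu = b @ A @ b                               # Rayleigh quotient of the new vector
    return mu, b
```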

There are some situations where the method can be used by itself; however, they are quite marginal.

Norm of matrix as approximation to the dominant eigenvalue

The dominant eigenvalue can be easily estimated for any matrix. For any induced norm it is true that $\|A\| \geq |\lambda|$ for any eigenvalue $\lambda$. So taking the norm of the matrix as an approximate eigenvalue, one can see that the method will converge to the dominant eigenvector.
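For instance, one could use the induced $\infty$-norm (maximum absolute row sum) as the shift; the rough sketch below assumes the dominant eigenvalue is the one closest to that upper bound, which holds for this particular symmetric positive-definite example:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
mu = np.linalg.norm(A, ord=np.inf)       # induced infinity norm = 5, an upper bound on |lambda|
b = np.array([1.0, 0.0])
for _ in range(20):
    b = np.linalg.solve(A - mu * np.eye(2), b)
    b = b / np.linalg.norm(b)
print(b, b @ A @ b)                      # approximates the dominant eigenvector and eigenvalue (~4.618)
```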

Estimates based on statistics

In some real-time applications one needs to find eigenvectors for matrices at a rate of millions of matrices per second. In such applications the statistics of the matrices are typically known in advance, and one can take as an approximate eigenvalue the average eigenvalue over some large sample of matrices. Better, one may calculate the mean ratio of the eigenvalues to the trace or the norm of the matrix, and estimate the average eigenvalue as the trace or norm multiplied by the average value of that ratio. Clearly such a method can be used only with discretion, and only when high precision is not critical. This approach of estimating an average eigenvalue can be combined with other methods to avoid excessively large error.

See also

Power iteration
Rayleigh quotient iteration
List of eigenvalue algorithms

References

  1. Ernst Pohlhausen, "Berechnung der Eigenschwingungen statisch-bestimmter Fachwerke", ZAMM - Zeitschrift für Angewandte Mathematik und Mechanik 1, 28-42 (1921).
  2. Demmel, James W. (1997), Applied Numerical Linear Algebra, Philadelphia, PA: Society for Industrial and Applied Mathematics, ISBN 0-89871-389-7, MR 1463942.
  3. Lloyd N. Trefethen and David Bau, Numerical Linear Algebra, SIAM, 1997.
