fbpx
Wikipedia

Mercer's theorem

In mathematics, specifically functional analysis, Mercer's theorem is a representation of a symmetric positive-definite function on a square as a sum of a convergent sequence of product functions. This theorem, presented in (Mercer 1909), is one of the most notable results of the work of James Mercer (1883–1932). It is an important theoretical tool in the theory of integral equations; it is used in the Hilbert space theory of stochastic processes, for example the Karhunen–Loève theorem; and it is also used in the reproducing kernel Hilbert space theory where it characterizes a symmetric positive-definite kernel as a reproducing kernel.[1]

Introduction edit

To explain Mercer's theorem, we first consider an important special case; see below for a more general formulation. A kernel, in this context, is a symmetric continuous function

 

where symmetric means that   for all  .

K is said to be a positive-definite kernel if and only if

 

for all finite sequences of points x1, ..., xn of [ab] and all choices of real numbers c1, ..., cn. Note that the term "positive-definite" is well-established in literature despite the weak inequality in the definition.[2][3]

Associated to K is a linear operator (more specifically a Hilbert–Schmidt integral operator) on functions defined by the integral

 

For technical considerations we assume   can range through the space L2[ab] (see Lp space) of square-integrable real-valued functions. Since TK is a linear operator, we can talk about eigenvalues and eigenfunctions of TK.

Theorem. Suppose K is a continuous symmetric positive-definite kernel. Then there is an orthonormal basis {ei}i of L2[ab] consisting of eigenfunctions of TK such that the corresponding sequence of eigenvalues {λi}i is nonnegative. The eigenfunctions corresponding to non-zero eigenvalues are continuous on [ab] and K has the representation

 

where the convergence is absolute and uniform.

Details edit

We now explain in greater detail the structure of the proof of Mercer's theorem, particularly how it relates to spectral theory of compact operators.

  • The map KTK is injective.
  • TK is a non-negative symmetric compact operator on L2[a,b]; moreover K(x, x) ≥ 0.

To show compactness, show that the image of the unit ball of L2[a,b] under TK equicontinuous and apply Ascoli's theorem, to show that the image of the unit ball is relatively compact in C([a,b]) with the uniform norm and a fortiori in L2[a,b].

Now apply the spectral theorem for compact operators on Hilbert spaces to TK to show the existence of the orthonormal basis {ei}i of L2[a,b]

 

If λi ≠ 0, the eigenvector (eigenfunction) ei is seen to be continuous on [a,b]. Now

 

which shows that the sequence

 

converges absolutely and uniformly to a kernel K0 which is easily seen to define the same operator as the kernel K. Hence K=K0 from which Mercer's theorem follows.

Finally, to show non-negativity of the eigenvalues one can write   and expressing the right hand side as an integral well-approximated by its Riemann sums, which are non-negative by positive-definiteness of K, implying  , implying  .

Trace edit

The following is immediate:

Theorem. Suppose K is a continuous symmetric positive-definite kernel; TK has a sequence of nonnegative eigenvalues {λi}i. Then

 

This shows that the operator TK is a trace class operator and

 

Generalizations edit

Mercer's theorem itself is a generalization of the result that any symmetric positive-semidefinite matrix is the Gramian matrix of a set of vectors.

The first generalization [citation needed] replaces the interval [ab] with any compact Hausdorff space and Lebesgue measure on [ab] is replaced by a finite countably additive measure μ on the Borel algebra of X whose support is X. This means that μ(U) > 0 for any nonempty open subset U of X.

A recent generalization [citation needed] replaces these conditions by the following: the set X is a first-countable topological space endowed with a Borel (complete) measure μ. X is the support of μ and, for all x in X, there is an open set U containing x and having finite measure. Then essentially the same result holds:

Theorem. Suppose K is a continuous symmetric positive-definite kernel on X. If the function κ is L1μ(X), where κ(x)=K(x,x), for all x in X, then there is an orthonormal set {ei}i of L2μ(X) consisting of eigenfunctions of TK such that corresponding sequence of eigenvalues {λi}i is nonnegative. The eigenfunctions corresponding to non-zero eigenvalues are continuous on X and K has the representation

 

where the convergence is absolute and uniform on compact subsets of X.

The next generalization [citation needed] deals with representations of measurable kernels.

Let (X, M, μ) be a σ-finite measure space. An L2 (or square-integrable) kernel on X is a function

 

L2 kernels define a bounded operator TK by the formula

 

TK is a compact operator (actually it is even a Hilbert–Schmidt operator). If the kernel K is symmetric, by the spectral theorem, TK has an orthonormal basis of eigenvectors. Those eigenvectors that correspond to non-zero eigenvalues can be arranged in a sequence {ei}i (regardless of separability).

Theorem. If K is a symmetric positive-definite kernel on (X, M, μ), then

 

where the convergence in the L2 norm. Note that when continuity of the kernel is not assumed, the expansion no longer converges uniformly.

Mercer's condition edit

In mathematics, a real-valued function K(x,y) is said to fulfill Mercer's condition if for all square-integrable functions g(x) one has

 

Discrete analog edit

This is analogous to the definition of a positive-semidefinite matrix. This is a matrix   of dimension  , which satisfies, for all vectors  , the property

 .

Examples edit

A positive constant function

 

satisfies Mercer's condition, as then the integral becomes by Fubini's theorem

 

which is indeed non-negative.

See also edit

Notes edit

  1. ^ Bartlett, Peter (2008). "Reproducing Kernel Hilbert Spaces" (PDF). Lecture notes of CS281B/Stat241B Statistical Learning Theory. University of California at Berkeley.
  2. ^ Mohri, Mehryar (2018). Foundations of machine learning. Afshin Rostamizadeh, Ameet Talwalkar (Second ed.). Cambridge, Massachusetts. ISBN 978-0-262-03940-6. OCLC 1041560990.{{cite book}}: CS1 maint: location missing publisher (link)
  3. ^ Berlinet, A. (2004). Reproducing kernel Hilbert spaces in probability and statistics. Christine Thomas-Agnan. New York: Springer Science+Business Media. ISBN 1-4419-9096-8. OCLC 844346520.

References edit

  • Adriaan Zaanen, Linear Analysis, North Holland Publishing Co., 1960,
  • Ferreira, J. C., Menegatto, V. A., Eigenvalues of integral operators defined by smooth positive definite kernels, Integral equation and Operator Theory, 64 (2009), no. 1, 61–81. (Gives the generalization of Mercer's theorem for metric spaces. The result is easily adapted to first countable topological spaces)
  • Konrad Jörgens, Linear integral operators, Pitman, Boston, 1982,
  • Richard Courant and David Hilbert, Methods of Mathematical Physics, vol 1, Interscience 1953,
  • Robert Ash, Information Theory, Dover Publications, 1990,
  • Mercer, J. (1909), "Functions of positive and negative type and their connection with the theory of integral equations", Philosophical Transactions of the Royal Society A, 209 (441–458): 415–446, Bibcode:1909RSPTA.209..415M, doi:10.1098/rsta.1909.0016,
  • "Mercer theorem", Encyclopedia of Mathematics, EMS Press, 2001 [1994]
  • H. König, Eigenvalue distribution of compact operators, Birkhäuser Verlag, 1986. (Gives the generalization of Mercer's theorem for finite measures μ.)

mercer, theorem, mathematics, specifically, functional, analysis, representation, symmetric, positive, definite, function, square, convergent, sequence, product, functions, this, theorem, presented, mercer, 1909, most, notable, results, work, james, mercer, 18. In mathematics specifically functional analysis Mercer s theorem is a representation of a symmetric positive definite function on a square as a sum of a convergent sequence of product functions This theorem presented in Mercer 1909 is one of the most notable results of the work of James Mercer 1883 1932 It is an important theoretical tool in the theory of integral equations it is used in the Hilbert space theory of stochastic processes for example the Karhunen Loeve theorem and it is also used in the reproducing kernel Hilbert space theory where it characterizes a symmetric positive definite kernel as a reproducing kernel 1 Contents 1 Introduction 2 Details 3 Trace 4 Generalizations 5 Mercer s condition 5 1 Discrete analog 5 2 Examples 6 See also 7 Notes 8 ReferencesIntroduction editTo explain Mercer s theorem we first consider an important special case see below for a more general formulation A kernel in this context is a symmetric continuous function K a b a b R displaystyle K a b times a b rightarrow mathbb R nbsp where symmetric means that K x y K y x displaystyle K x y K y x nbsp for all x y a b displaystyle x y in a b nbsp K is said to be a positive definite kernel if and only if i 1 n j 1 n K x i x j c i c j 0 displaystyle sum i 1 n sum j 1 n K x i x j c i c j geq 0 nbsp for all finite sequences of points x1 xn of a b and all choices of real numbers c1 cn Note that the term positive definite is well established in literature despite the weak inequality in the definition 2 3 Associated to K is a linear operator more specifically a Hilbert Schmidt integral operator on functions defined by the integral T K f x a b K x s f s d s displaystyle T K varphi x int a b K x s varphi s ds nbsp For technical considerations we assume f displaystyle varphi nbsp can range through the space L2 a b see Lp space of square integrable real valued functions Since TK is a linear operator we can talk about eigenvalues and eigenfunctions of TK Theorem Suppose K is a continuous symmetric positive definite kernel Then there is an orthonormal basis ei i of L2 a b consisting of eigenfunctions of TK such that the corresponding sequence of eigenvalues li i is nonnegative The eigenfunctions corresponding to non zero eigenvalues are continuous on a b and K has the representation K s t j 1 l j e j s e j t displaystyle K s t sum j 1 infty lambda j e j s e j t nbsp where the convergence is absolute and uniform Details editWe now explain in greater detail the structure of the proof of Mercer s theorem particularly how it relates to spectral theory of compact operators The map K TK is injective TK is a non negative symmetric compact operator on L2 a b moreover K x x 0 To show compactness show that the image of the unit ball of L2 a b under TK equicontinuous and apply Ascoli s theorem to show that the image of the unit ball is relatively compact in C a b with the uniform norm and a fortiori in L2 a b Now apply the spectral theorem for compact operators on Hilbert spaces to TK to show the existence of the orthonormal basis ei i of L2 a b l i e i t T K e i t a b K t s e i s d s displaystyle lambda i e i t T K e i t int a b K t s e i s ds nbsp If li 0 the eigenvector eigenfunction ei is seen to be continuous on a b Now i 1 l i e i t e i s sup x a b K x x displaystyle sum i 1 infty lambda i e i t e i s leq sup x in a b K x x nbsp which shows that the sequence i 1 l i e i t e i s displaystyle sum i 1 infty lambda i e i t e i s nbsp converges absolutely and uniformly to a kernel K0 which is easily seen to define the same operator as the kernel K Hence K K0 from which Mercer s theorem follows Finally to show non negativity of the eigenvalues one can write l f f f T K f displaystyle lambda langle f f rangle langle f T K f rangle nbsp and expressing the right hand side as an integral well approximated by its Riemann sums which are non negative by positive definiteness of K implying l f f 0 displaystyle lambda langle f f rangle geq 0 nbsp implying l 0 displaystyle lambda geq 0 nbsp Trace editThe following is immediate Theorem Suppose K is a continuous symmetric positive definite kernel TK has a sequence of nonnegative eigenvalues li i Then a b K t t d t i l i displaystyle int a b K t t dt sum i lambda i nbsp This shows that the operator TK is a trace class operator and trace T K a b K t t d t displaystyle operatorname trace T K int a b K t t dt nbsp Generalizations editMercer s theorem itself is a generalization of the result that any symmetric positive semidefinite matrix is the Gramian matrix of a set of vectors The first generalization citation needed replaces the interval a b with any compact Hausdorff space and Lebesgue measure on a b is replaced by a finite countably additive measure m on the Borel algebra of X whose support is X This means that m U gt 0 for any nonempty open subset U of X A recent generalization citation needed replaces these conditions by the following the set X is a first countable topological space endowed with a Borel complete measure m X is the support of m and for all x in X there is an open set U containing x and having finite measure Then essentially the same result holds Theorem Suppose K is a continuous symmetric positive definite kernel on X If the function k is L1m X where k x K x x for all x in X then there is an orthonormal set ei i of L2m X consisting of eigenfunctions of TK such that corresponding sequence of eigenvalues li i is nonnegative The eigenfunctions corresponding to non zero eigenvalues are continuous on X and K has the representation K s t j 1 l j e j s e j t displaystyle K s t sum j 1 infty lambda j e j s e j t nbsp where the convergence is absolute and uniform on compact subsets of X The next generalization citation needed deals with representations of measurable kernels Let X M m be a s finite measure space An L2 or square integrable kernel on X is a function K L m m 2 X X displaystyle K in L mu otimes mu 2 X times X nbsp L2 kernels define a bounded operator TK by the formula T K f ps X X K y x f y ps x d m m y x displaystyle langle T K varphi psi rangle int X times X K y x varphi y psi x d mu otimes mu y x nbsp TK is a compact operator actually it is even a Hilbert Schmidt operator If the kernel K is symmetric by the spectral theorem TK has an orthonormal basis of eigenvectors Those eigenvectors that correspond to non zero eigenvalues can be arranged in a sequence ei i regardless of separability Theorem If K is a symmetric positive definite kernel on X M m then K y x i N l i e i y e i x displaystyle K y x sum i in mathbb N lambda i e i y e i x nbsp where the convergence in the L2 norm Note that when continuity of the kernel is not assumed the expansion no longer converges uniformly Mercer s condition editIn mathematics a real valued function K x y is said to fulfill Mercer s condition if for all square integrable functions g x one has g x K x y g y d x d y 0 displaystyle iint g x K x y g y dx dy geq 0 nbsp Discrete analog edit This is analogous to the definition of a positive semidefinite matrix This is a matrix K displaystyle K nbsp of dimension N displaystyle N nbsp which satisfies for all vectors g displaystyle g nbsp the property g K g g T K g i 1 N j 1 N g i K i j g j 0 displaystyle g Kg g T cdot Kg sum i 1 N sum j 1 N g i K ij g j geq 0 nbsp Examples edit A positive constant function K x y c displaystyle K x y c nbsp satisfies Mercer s condition as then the integral becomes by Fubini s theorem g x c g y d x d y c g x d x g y d y c g x d x 2 displaystyle iint g x c g y dx dy c int g x dx int g y dy c left int g x dx right 2 nbsp which is indeed non negative See also editKernel trick Representer theorem Reproducing kernel Hilbert space Spectral theoryNotes edit Bartlett Peter 2008 Reproducing Kernel Hilbert Spaces PDF Lecture notes of CS281B Stat241B Statistical Learning Theory University of California at Berkeley Mohri Mehryar 2018 Foundations of machine learning Afshin Rostamizadeh Ameet Talwalkar Second ed Cambridge Massachusetts ISBN 978 0 262 03940 6 OCLC 1041560990 a href Template Cite book html title Template Cite book cite book a CS1 maint location missing publisher link Berlinet A 2004 Reproducing kernel Hilbert spaces in probability and statistics Christine Thomas Agnan New York Springer Science Business Media ISBN 1 4419 9096 8 OCLC 844346520 References editAdriaan Zaanen Linear Analysis North Holland Publishing Co 1960 Ferreira J C Menegatto V A Eigenvalues of integral operators defined by smooth positive definite kernels Integral equation and Operator Theory 64 2009 no 1 61 81 Gives the generalization of Mercer s theorem for metric spaces The result is easily adapted to first countable topological spaces Konrad Jorgens Linear integral operators Pitman Boston 1982 Richard Courant and David Hilbert Methods of Mathematical Physics vol 1 Interscience 1953 Robert Ash Information Theory Dover Publications 1990 Mercer J 1909 Functions of positive and negative type and their connection with the theory of integral equations Philosophical Transactions of the Royal Society A 209 441 458 415 446 Bibcode 1909RSPTA 209 415M doi 10 1098 rsta 1909 0016 Mercer theorem Encyclopedia of Mathematics EMS Press 2001 1994 H Konig Eigenvalue distribution of compact operators Birkhauser Verlag 1986 Gives the generalization of Mercer s theorem for finite measures m Retrieved from https en wikipedia org w index php title Mercer 27s theorem amp oldid 1222387746, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.