
Sum-of-squares optimization

A sum-of-squares optimization program is an optimization problem with a linear cost function and a particular type of constraint on the decision variables: when the decision variables are used as coefficients in certain polynomials, those polynomials must have the polynomial SOS property. When the maximum degree of the polynomials involved is fixed, sum-of-squares optimization is also known as the Lasserre hierarchy of relaxations in semidefinite programming.

Sum-of-squares optimization techniques have been applied across a variety of areas, including control theory (in particular, for searching for polynomial Lyapunov functions for dynamical systems described by polynomial vector fields), statistics, finance and machine learning.[1][2][3][4]

Optimization problem

Given a vector $c \in \mathbb{R}^n$ and polynomials $a_{k,j}$ for $k = 1, \dots, N_s$, $j = 0, 1, \dots, n$, a sum-of-squares optimization problem is written as

$$\begin{aligned} \underset{u \in \mathbb{R}^n}{\text{maximize}} \quad & c^\top u \\ \text{subject to} \quad & a_{k,0}(x) + a_{k,1}(x)\,u_1 + \cdots + a_{k,n}(x)\,u_n \in \text{SOS} \quad (k = 1, \ldots, N_s). \end{aligned}$$

Here "SOS" represents the class of sum-of-squares (SOS) polynomials. The quantities $u \in \mathbb{R}^n$ are the decision variables. SOS programs can be converted to semidefinite programs (SDPs) using the duality of the SOS polynomial program and a relaxation for constrained polynomial optimization using positive-semidefinite matrices; see the following section.
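To make the SOS-to-SDP conversion concrete, here is a minimal sketch in Python, assuming the cvxpy package is available (its bundled solver handles semidefinite constraints); the univariate family $x^4 + u\,x^2 + 1$ and its encoding are illustrative choices, not taken from the article. With the monomial vector $z = (1, x, x^2)$, requiring the polynomial to be SOS is equivalent to the existence of a positive-semidefinite Gram matrix $Q$ with $z^\top Q z$ matching it coefficient by coefficient (see the background section below).

    import cvxpy as cp

    # Decision variable u and a Gram matrix Q indexed by the monomials 1, x, x^2.
    u = cp.Variable()
    Q = cp.Variable((3, 3), PSD=True)

    # Coefficient matching for z^T Q z = x^4 + u*x^2 + 1 with z = (1, x, x^2).
    constraints = [
        Q[0, 0] == 1,                      # constant term
        Q[0, 1] + Q[1, 0] == 0,            # coefficient of x
        Q[0, 2] + Q[2, 0] + Q[1, 1] == u,  # coefficient of x^2
        Q[1, 2] + Q[2, 1] == 0,            # coefficient of x^3
        Q[2, 2] == 1,                      # coefficient of x^4
    ]

    # One SOS program: maximize -u subject to x^4 + u*x^2 + 1 being SOS.
    problem = cp.Problem(cp.Maximize(-u), constraints)
    problem.solve()
    print(u.value)  # approximately -2, since x^4 - 2x^2 + 1 = (x^2 - 1)^2

The optimum $u = -2$ corresponds to $x^4 - 2x^2 + 1 = (x^2 - 1)^2$, the boundary of the non-negativity region for this family.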

Dual problem: constrained polynomial optimization

Suppose we have an $n$-variate polynomial $p(x) : \mathbb{R}^n \to \mathbb{R}$, and suppose that we would like to minimize this polynomial over a subset $A \subseteq \mathbb{R}^n$. Suppose furthermore that the constraints on the subset $A$ can be encoded using $m$ polynomial equalities of degree at most $2d$, each of the form $a_i(x) = 0$, where $a_i : \mathbb{R}^n \to \mathbb{R}$ is a polynomial of degree at most $2d$. A natural, though generally non-convex, program for this optimization problem is the following:

$$\min_{x \in \mathbb{R}^n} \; \langle C, x^{\leq d} (x^{\leq d})^\top \rangle$$

subject to:

$$\langle A_i, x^{\leq d} (x^{\leq d})^\top \rangle = 0 \quad \forall\, i \in [m], \qquad x_\emptyset = 1, \tag{1}$$

where $x^{\leq d}$ is the $n^{O(d)}$-dimensional vector with one entry for every monomial in $x$ of degree at most $d$, so that for each multiset $S \subseteq [n]$, $|S| \leq d$, we have $x_S = \prod_{i \in S} x_i$; $C$ is a matrix of coefficients of the polynomial $p(x)$ that we want to minimize, and $A_i$ is a matrix of coefficients of the polynomial $a_i(x)$ encoding the $i$-th constraint on the subset $A \subseteq \mathbb{R}^n$. The additional fixed constant index in our search space, $x_\emptyset = 1$, is added for the convenience of writing the polynomials $p(x)$ and $a_i(x)$ in a matrix representation.
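As a small illustration of this notation (a sketch under assumed names, not code from any reference), the following Python snippet builds the monomial vector $x^{\leq d}$ for $n = 2$, $d = 1$ and checks that a polynomial evaluates as $\langle C, x^{\leq d} (x^{\leq d})^\top \rangle$:

    import itertools
    import numpy as np

    def monomial_vector(x, d):
        # All monomials of degree <= d in x, indexed by multisets of variable indices.
        n = len(x)
        multisets = [()]  # the empty multiset gives the constant monomial 1
        for k in range(1, d + 1):
            multisets += list(itertools.combinations_with_replacement(range(n), k))
        return multisets, np.array([np.prod([x[i] for i in S]) for S in multisets])

    # n = 2, d = 1: x^{<=d} = (1, x1, x2). Encode p(x) = 3 + 2*x1*x2 - x2^2 as <C, X>.
    x = np.array([1.5, -0.5])
    multisets, z = monomial_vector(x, d=1)
    C = np.array([[3.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0],   # x1*x2 coefficient, split symmetrically
                  [0.0, 1.0, -1.0]])
    X = np.outer(z, z)               # the rank-one matrix x^{<=d} (x^{<=d})^T
    assert np.isclose(np.sum(C * X), 3 + 2 * x[0] * x[1] - x[1] ** 2)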

This program is generally non-convex, because the constraints (1) are not convex. One possible convex relaxation for this minimization problem uses semidefinite programming to replace the rank-one matrix of variables $x^{\leq d} (x^{\leq d})^\top$ with a positive-semidefinite matrix $X$: we index each monomial of size at most $2d$ by a multiset $S$ of at most $2d$ indices, $S \subseteq [n]$, $|S| \leq 2d$. For each such monomial, we create a variable $X_S$ in the program, and we arrange the variables $X_S$ to form the matrix $X \in \mathbb{R}^{[n]^{\leq d} \times [n]^{\leq d}}$, where $\mathbb{R}^{[n]^{\leq d} \times [n]^{\leq d}}$ is the set of real matrices whose rows and columns are identified with multisets of elements from $[n]$ of size at most $d$. We then write the following semidefinite program in the variables $X_S$:

$$\min_{X \in \mathbb{R}^{[n]^{\leq d} \times [n]^{\leq d}}} \; \langle C, X \rangle$$

subject to:

$$\langle A_i, X \rangle = 0 \quad \forall\, i \in [m],$$

$$X_\emptyset = 1,$$

$$X_{U \cup V} = X_{S \cup T} \quad \forall\, U, V, S, T \subseteq [n] \text{ with } |U|, |V|, |S|, |T| \leq d \text{ and } U \cup V = S \cup T,$$

$$X \succeq 0,$$

where again $C$ is the matrix of coefficients of the polynomial $p(x)$ that we want to minimize, and $A_i$ is the matrix of coefficients of the polynomial $a_i(x)$ encoding the $i$-th constraint on the subset $A \subseteq \mathbb{R}^n$.

The third constraint ensures that the value of a monomial that appears several times within the matrix is equal throughout the matrix, and is added to make $X$ respect the symmetries present in the quadratic form $x^{\leq d} (x^{\leq d})^\top$.
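As a concrete instance of this relaxation (a hedged sketch, again assuming cvxpy is available; the polynomial, the constraint, and all names are chosen for illustration), take $n = 2$, $d = 1$, and minimize $p(x) = x_1^2 + x_2^2$ subject to $a(x) = x_1 + x_2 - 1 = 0$. The moment matrix $X$ is indexed by the multisets $\emptyset, \{1\}, \{2\}$; the only nontrivial symmetry constraint at this level, $X_{\{1\},\{2\}} = X_{\{2\},\{1\}}$, holds automatically because the PSD variable is symmetric.

    import cvxpy as cp
    import numpy as np

    X = cp.Variable((3, 3), PSD=True)   # relaxes z z^T for z = (1, x1, x2)

    C = np.array([[0.0, 0.0, 0.0],      # p(x) = x1^2 + x2^2 = <C, z z^T>
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    A = np.array([[-1.0, 0.5, 0.5],     # a(x) = x1 + x2 - 1 = <A, z z^T>,
                  [0.5, 0.0, 0.0],      # with coefficients split symmetrically
                  [0.5, 0.0, 0.0]])

    constraints = [
        cp.trace(A @ X) == 0,  # <A_i, X> = 0: the relaxed polynomial equality
        X[0, 0] == 1,          # X_emptyset = 1
    ]
    problem = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
    problem.solve()
    print(problem.value)  # approximately 0.5, the true minimum at x1 = x2 = 1/2

Here the relaxation happens to be tight; in general $\langle C, X \rangle$ only lower-bounds the true minimum.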

Duality

One can take the dual of the above semidefinite program and obtain the following program:

$$\max_{y} \; y_0$$

subject to:

$$C - y_0\, e_\emptyset - \sum_{i \in [m]} y_i A_i - \sum_{(S \cup T) = (U \cup V)} y_{((S,T),(U,V))} \left( e_{(S,T)} - e_{(U,V)} \right) \succeq 0.$$

We have a variable $y_0$ corresponding to the constraint $\langle e_\emptyset, X \rangle = 1$ (where $e_\emptyset$ is the matrix with all entries zero save for the entry indexed by $(\emptyset, \emptyset)$), a real variable $y_i$ for each polynomial constraint $\langle X, A_i \rangle = 0$, $i \in [m]$, and, for each group of multisets $S, T, U, V \subseteq [n]$ with $|S|, |T|, |U|, |V| \leq d$ and $S \cup T = U \cup V$, a dual variable $y_{((S,T),(U,V))}$ for the symmetry constraint $\langle X, e_{(S,T)} - e_{(U,V)} \rangle = 0$. The positive-semidefiniteness constraint ensures that $p(x) - y_0$ is a sum of squares of polynomials over $A \subseteq \mathbb{R}^n$: by a characterization of positive-semidefinite matrices, for any positive-semidefinite matrix $Q \in \mathbb{R}^{m \times m}$ we can write $Q = \sum_{i \in [m]} f_i f_i^\top$ for vectors $f_i \in \mathbb{R}^m$. Thus for any $x \in A \subseteq \mathbb{R}^n$,

$$\begin{aligned}
p(x) - y_0 &= p(x) - y_0 - \sum_{i \in [m]} y_i a_i(x) && \text{(since } x \in A\text{)} \\
&= (x^{\leq d})^\top \left( C - y_0\, e_\emptyset - \sum_{i \in [m]} y_i A_i - \sum_{(S \cup T) = (U \cup V)} y_{((S,T),(U,V))} \left( e_{(S,T)} - e_{(U,V)} \right) \right) x^{\leq d} && \text{(by symmetry)} \\
&= (x^{\leq d})^\top \left( \sum_i f_i f_i^\top \right) x^{\leq d} \\
&= \sum_i \langle x^{\leq d}, f_i \rangle^2 \\
&= \sum_i f_i(x)^2,
\end{aligned}$$

where we have identified the vectors $f_i$ with the coefficients of a polynomial of degree at most $d$. This gives a sum-of-squares proof that $p(x) \geq y_0$ over $A$.
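The last chain of equalities also suggests how an explicit certificate can be read off numerically. The sketch below (numpy assumed; the Gram matrix is the one for $x^4 - 2x^2 + 1$ from the earlier illustration, not from the article) factors a positive-semidefinite $Q$ as $\sum_i f_i f_i^\top$ via an eigendecomposition and evaluates the resulting squares:

    import numpy as np

    Q = np.array([[1.0, 0.0, -1.0],   # Gram matrix of p(x) = x^4 - 2x^2 + 1
                  [0.0, 0.0, 0.0],    # with z = (1, x, x^2)
                  [-1.0, 0.0, 1.0]])

    eigvals, eigvecs = np.linalg.eigh(Q)   # Q PSD: all eigenvalues >= 0
    factors = [np.sqrt(lam) * v
               for lam, v in zip(eigvals, eigvecs.T) if lam > 1e-9]

    # Each factor f gives the square polynomial (f[0] + f[1]*x + f[2]*x^2)^2.
    x = 0.7
    p = x**4 - 2 * x**2 + 1
    sos = sum((f[0] + f[1] * x + f[2] * x**2) ** 2 for f in factors)
    assert np.isclose(p, sos)   # a single square here: (x^2 - 1)^2, up to sign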

The above can also be extended to regions $A \subseteq \mathbb{R}^n$ defined by polynomial inequalities.

Sum-of-squares hierarchy

The sum-of-squares hierarchy (SOS hierarchy), also known as the Lasserre hierarchy, is a hierarchy of convex relaxations of increasing power and increasing computational cost. For each natural number $d \in \mathbb{N}$, the corresponding convex relaxation is known as the $d$-th level or $d$-th round of the SOS hierarchy. The first round, when $d = 1$, corresponds to a basic semidefinite program, or to sum-of-squares optimization over polynomials of degree at most 2. To augment the basic convex program at the first level of the hierarchy to the $d$-th level, additional variables and constraints are added to have the program consider polynomials of degree at most $2d$.

The SOS hierarchy derives its name from the fact that the value of the objective function at the $d$-th level is bounded with a sum-of-squares proof using polynomials of degree at most $2d$ via the dual (see "Duality" above). Consequently, any sum-of-squares proof that uses polynomials of degree at most $2d$ can be used to bound the objective value, allowing one to prove guarantees on the tightness of the relaxation.

In conjunction with a theorem of Berg, this further implies that given sufficiently many rounds, the relaxation becomes arbitrarily tight on any fixed interval. Berg's result[5][6] states that every non-negative real polynomial within a bounded interval can be approximated within accuracy $\varepsilon$ on that interval by a sum of squares of real polynomials of sufficiently high degree. Thus, if $\mathrm{OBJ}(x)$ is the polynomial objective value as a function of the point $x$, and the inequality $c + \varepsilon - \mathrm{OBJ}(x) \geq 0$ holds for all $x$ in the region of interest, then there must be a sum-of-squares proof of this fact. Choosing $c$ to be the minimum of the objective function over the feasible region, we have the result.

Computational cost

When optimizing over a function in $n$ variables, the $d$-th level of the hierarchy can be written as a semidefinite program over $n^{O(d)}$ variables, and can be solved in time $n^{O(d)}$ using the ellipsoid method.
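For intuition about the $n^{O(d)}$ size (a back-of-the-envelope illustration, not a statement from the article), the moment matrix at level $d$ has one row and column per monomial of degree at most $d$, of which there are $\binom{n+d}{d}$; a short Python check:

    from math import comb

    for n in (5, 10, 20):
        for d in (1, 2, 3):
            size = comb(n + d, d)   # monomials of degree <= d in n variables
            print(f"n={n:2d}, d={d}: moment matrix is {size} x {size}")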

Sum-of-squares background

A polynomial $p$ is a sum of squares (SOS) if there exist polynomials $f_i$, $i = 1, \dots, m$, such that $p = \sum_{i=1}^m f_i^2$. For example,

$$p = x^2 + 4xy + 7y^2$$

is a sum of squares since

$$p = f_1^2 + f_2^2,$$

where

$$f_1 = x + 2y \quad \text{and} \quad f_2 = \sqrt{3}\, y.$$

Note that if $p$ is a sum of squares then $p(x) \geq 0$ for all $x \in \mathbb{R}^n$. Detailed descriptions of polynomial SOS are available.[7][8][9]

Quadratic forms can be expressed as $p(x) = x^\top Q x$, where $Q$ is a symmetric matrix. Similarly, polynomials of degree at most $2d$ can be expressed as

$$p(x) = z(x)^\top Q\, z(x),$$

where the vector $z(x)$ contains all monomials of degree at most $d$. This is known as the Gram matrix form. An important fact is that $p$ is SOS if and only if there exists a symmetric and positive-semidefinite matrix $Q$ such that $p(x) = z(x)^\top Q\, z(x)$. This provides a connection between SOS polynomials and positive-semidefinite matrices.
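A quick numerical check of this fact for the example above (a sketch assuming numpy; the Cholesky factorization plays the role of the decomposition $Q = \sum_i f_i f_i^\top$):

    import numpy as np

    Q = np.array([[1.0, 2.0],     # z^T Q z = x^2 + 4xy + 7y^2 with z = (x, y)
                  [2.0, 7.0]])

    L = np.linalg.cholesky(Q)     # succeeds because Q is positive definite
    # Rows of L^T give the squares: p = (x + 2y)^2 + 3y^2, matching f_1, f_2 above.
    x, y = 0.3, -1.2
    z = np.array([x, y])
    p = x**2 + 4 * x * y + 7 * y**2
    assert np.isclose(p, z @ Q @ z)
    assert np.isclose(p, sum((L.T @ z) ** 2))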

Software tools

  • SOSTOOLS, licensed under the GNU GPL. The reference guide is available at arXiv:1310.4716 [math.OC].
  • CDCS-sos, a package from CDCS (an augmented Lagrangian method solver) for large-scale SOS programs.
  • The SumOfSquares extension of JuMP for Julia.
  • TSSOS for Julia, a polynomial optimization tool based on sparsity-adapted moment-SOS hierarchies.
  • For the dual problem of constrained polynomial optimization, GloptiPoly for MATLAB/Octave, Ncpol2sdpa for Python and MomentOpt for Julia.

References

  1. ^ Parrilo, Pablo A.; Thomas, Rekha R., eds. (2020). Sum of Squares: Theory and Applications. AMS Short Course, January 14–15, 2019, Baltimore, Maryland. Providence, Rhode Island: American Mathematical Society. ISBN 978-1-4704-5025-0. OCLC 1157604983.
  2. ^ Tan, W., Packard, A., 2004. "Searching for control Lyapunov functions using sums of squares programming". In: Allerton Conf. on Comm., Control and Computing. pp. 210–219.
  3. ^ Tan, W., Topcu, U., Seiler, P., Balas, G., Packard, A., 2008. Simulation-aided reachability and local gain analysis for nonlinear dynamical systems. In: Proc. of the IEEE Conference on Decision and Control. pp. 4097–4102.
  4. ^ A. Chakraborty, P. Seiler, and G. Balas, "Susceptibility of F/A-18 Flight Controllers to the Falling-Leaf Mode: Nonlinear Analysis," AIAA Journal of Guidance, Control, and Dynamics, vol. 34 no. 1 (2011), pp. 73–85.
  5. ^ Berg, Christian (1987). Landau, Henry J. (ed.). "The multidimensional moment problem and semigroups". Proceedings of Symposia in Applied Mathematics. 37: 110–124. doi:10.1090/psapm/037/921086. ISBN 9780821801147.
  6. ^ Lasserre, J. (2007-01-01). "A Sum of Squares Approximation of Nonnegative Polynomials". SIAM Review. 49 (4): 651–669. arXiv:math/0412398. doi:10.1137/070693709. ISSN 0036-1445.
  7. ^ Parrilo, P. (2000). Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. Ph.D. thesis, California Institute of Technology.
  8. ^ Parrilo, P. (2003) "Semidefinite programming relaxations for semialgebraic problems". Mathematical Programming Ser. B 96 (2), 293–320.
  9. ^ Lasserre, J. (2001) "Global optimization with polynomials and the problem of moments". SIAM Journal on Optimization, 11 (3), 796–817.
