
Optimal control

Optimal control theory is a branch of control theory that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized.[1] It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the Moon with minimum fuel expenditure.[2] Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy.[3] A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory.[4][5]

Figure: Optimal control problem benchmark (Luus) with an integral objective, inequality, and differential constraint.

Optimal control is an extension of the calculus of variations, and is a mathematical optimization method for deriving control policies.[6] The method is largely due to the work of Lev Pontryagin and Richard Bellman in the 1950s, after contributions to calculus of variations by Edward J. McShane.[7] Optimal control can be seen as a control strategy in control theory.[1]

General method

Optimal control deals with the problem of finding a control law for a given system such that a certain optimality criterion is achieved. A control problem includes a cost functional that is a function of state and control variables. An optimal control is a set of differential equations describing the paths of the control variables that minimize the cost functional. The optimal control can be derived using Pontryagin's maximum principle (a necessary condition also known as Pontryagin's minimum principle or simply Pontryagin's principle),[8] or by solving the Hamilton–Jacobi–Bellman equation (a sufficient condition).

We begin with a simple example. Consider a car traveling in a straight line on a hilly road. The question is, how should the driver press the accelerator pedal in order to minimize the total traveling time? In this example, the term control law refers specifically to the way in which the driver presses the accelerator and shifts the gears. The system consists of both the car and the road, and the optimality criterion is the minimization of the total traveling time. Control problems usually include ancillary constraints. For example, the amount of available fuel might be limited, the accelerator pedal cannot be pushed through the floor of the car, speed limits must be obeyed, etc.

A proper cost functional for this problem would be a mathematical expression giving the traveling time as a function of the speed, geometrical considerations, and initial conditions of the system. Constraints are often interchangeable with the cost functional, as the related problems below illustrate.

Another related optimal control problem may be to find the way to drive the car so as to minimize its fuel consumption, given that it must complete a given course in a time not exceeding some amount. Yet another related control problem may be to minimize the total monetary cost of completing the trip, given assumed monetary prices for time and fuel.

A more abstract framework goes as follows.[1] Minimize the continuous-time cost functional

$$J[\textbf{x}(\cdot), \textbf{u}(\cdot), t_0, t_f] = E\,[\textbf{x}(t_0), t_0, \textbf{x}(t_f), t_f] + \int_{t_0}^{t_f} F\,[\textbf{x}(t), \textbf{u}(t), t] \, \mathrm{d}t$$

subject to the first-order dynamic constraints (the state equation)

$$\dot{\textbf{x}}(t) = \textbf{f}\,[\textbf{x}(t), \textbf{u}(t), t],$$

the algebraic path constraints

$$\textbf{h}\,[\textbf{x}(t), \textbf{u}(t), t] \leq \textbf{0},$$

and the endpoint conditions

$$\textbf{e}\,[\textbf{x}(t_0), t_0, \textbf{x}(t_f), t_f] = 0,$$

where $\textbf{x}(t)$ is the state, $\textbf{u}(t)$ is the control, $t$ is the independent variable (generally speaking, time), $t_0$ is the initial time, and $t_f$ is the terminal time. The terms $E$ and $F$ are called the endpoint cost and the running cost respectively. In the calculus of variations, $E$ and $F$ are referred to as the Mayer term and the Lagrangian, respectively. Furthermore, it is noted that the path constraints are in general inequality constraints and thus may not be active (i.e., equal to zero) at the optimal solution. It is also noted that the optimal control problem as stated above may have multiple solutions (i.e., the solution may not be unique). Thus, it is most often the case that any solution $[\textbf{x}(t), \textbf{u}(t), t_0, t_f]$ to the optimal control problem is locally minimizing.

Linear quadratic control

A special case of the general nonlinear optimal control problem given in the previous section is the linear quadratic (LQ) optimal control problem. The LQ problem is stated as follows. Minimize the quadratic continuous-time cost functional

$$J = \tfrac{1}{2} \mathbf{x}^\mathsf{T}(t_f) \mathbf{S}_f \mathbf{x}(t_f) + \tfrac{1}{2} \int_{t_0}^{t_f} \left[\, \mathbf{x}^\mathsf{T}(t) \mathbf{Q}(t) \mathbf{x}(t) + \mathbf{u}^\mathsf{T}(t) \mathbf{R}(t) \mathbf{u}(t) \,\right] \mathrm{d}t$$

subject to the linear first-order dynamic constraints

$$\dot{\mathbf{x}}(t) = \mathbf{A}(t) \mathbf{x}(t) + \mathbf{B}(t) \mathbf{u}(t)$$

and the initial condition

$$\mathbf{x}(t_0) = \mathbf{x}_0 .$$

A particular form of the LQ problem that arises in many control system problems is that of the linear quadratic regulator (LQR) where all of the matrices (i.e., $\mathbf{A}$, $\mathbf{B}$, $\mathbf{Q}$, and $\mathbf{R}$) are constant, the initial time is arbitrarily set to zero, and the terminal time is taken in the limit $t_f \to \infty$ (this last assumption is what is known as infinite horizon). The LQR problem is stated as follows. Minimize the infinite horizon quadratic continuous-time cost functional

$$J = \tfrac{1}{2} \int_{0}^{\infty} \left[\, \mathbf{x}^\mathsf{T}(t) \mathbf{Q} \mathbf{x}(t) + \mathbf{u}^\mathsf{T}(t) \mathbf{R} \mathbf{u}(t) \,\right] \mathrm{d}t$$

subject to the linear time-invariant first-order dynamic constraints

$$\dot{\mathbf{x}}(t) = \mathbf{A} \mathbf{x}(t) + \mathbf{B} \mathbf{u}(t)$$

and the initial condition

$$\mathbf{x}(t_0) = \mathbf{x}_0 .$$

In the finite-horizon case the matrices are restricted in that $\mathbf{Q}$ and $\mathbf{R}$ are positive semi-definite and positive definite, respectively. In the infinite-horizon case, however, the matrices $\mathbf{Q}$ and $\mathbf{R}$ are not only positive semi-definite and positive definite, respectively, but are also constant. These additional restrictions on $\mathbf{Q}$ and $\mathbf{R}$ in the infinite-horizon case are enforced to ensure that the cost functional remains positive. Furthermore, in order to ensure that the cost functional is bounded, the additional restriction is imposed that the pair $(\mathbf{A}, \mathbf{B})$ is controllable. Note that the LQ or LQR cost functional can be thought of physically as attempting to minimize the control energy (measured as a quadratic form).
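
The controllability requirement on $(\mathbf{A}, \mathbf{B})$ is straightforward to test numerically. Below is a minimal Python sketch (not from the article; the double-integrator data is a hypothetical example) that checks whether the controllability matrix $[\mathbf{B}, \mathbf{A}\mathbf{B}, \ldots, \mathbf{A}^{n-1}\mathbf{B}]$ has full rank:

```python
# A minimal numpy sketch (hypothetical double-integrator data) testing
# controllability of (A, B) via the rank of the controllability
# matrix [B, AB, ..., A^(n-1) B].
import numpy as np

def is_controllable(A: np.ndarray, B: np.ndarray) -> bool:
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks)) == n

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])    # double integrator
B = np.array([[0.0],
              [1.0]])         # force input
print(is_controllable(A, B))  # True
```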

The infinite horizon problem (i.e., LQR) may seem overly restrictive and essentially useless because it assumes that the operator is driving the system to zero-state and hence driving the output of the system to zero. This is indeed correct. However, once the problem of driving the output to zero is solved, the problem of driving the output to a desired nonzero level follows readily, and it can be proved that this secondary LQR problem can be solved in a very straightforward manner. It has been shown in classical optimal control theory that the LQ (or LQR) optimal control has the feedback form

$$\mathbf{u}(t) = -\mathbf{K}(t)\, \mathbf{x}(t),$$

where $\mathbf{K}(t)$ is a properly dimensioned matrix, given as

$$\mathbf{K}(t) = \mathbf{R}^{-1} \mathbf{B}^\mathsf{T} \mathbf{S}(t),$$

and $\mathbf{S}(t)$ is the solution of the differential Riccati equation. The differential Riccati equation is given as

$$\dot{\mathbf{S}}(t) = -\mathbf{S}(t)\mathbf{A} - \mathbf{A}^\mathsf{T}\mathbf{S}(t) + \mathbf{S}(t)\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^\mathsf{T}\mathbf{S}(t) - \mathbf{Q} .$$

For the finite horizon LQ problem, the Riccati equation is integrated backward in time using the terminal boundary condition

$$\mathbf{S}(t_f) = \mathbf{S}_f .$$
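
This backward sweep can be carried out with an off-the-shelf ODE integrator. The following Python sketch (hypothetical double-integrator data, continued from the example above) integrates the differential Riccati equation from $t_f$ back to $t_0$:

```python
# A minimal sketch of the backward sweep: integrate the differential
# Riccati equation from the terminal condition S(t_f) = S_f down to
# t_0 using scipy's solve_ivp.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                  # state weight
R = np.array([[1.0]])          # control weight
Sf = np.zeros((2, 2))          # terminal weight S_f
t0, tf = 0.0, 5.0

def riccati_rhs(t, s_flat):
    S = s_flat.reshape(2, 2)
    dS = -S @ A - A.T @ S + S @ B @ np.linalg.inv(R) @ B.T @ S - Q
    return dS.ravel()

# Passing t_span = (tf, t0) makes solve_ivp integrate backward.
sol = solve_ivp(riccati_rhs, (tf, t0), Sf.ravel(), dense_output=True)
S0 = sol.sol(t0).reshape(2, 2)
K0 = np.linalg.inv(R) @ B.T @ S0   # time-varying gain at t = t0
print(K0)
```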

For the infinite horizon LQR problem, the differential Riccati equation is replaced with the algebraic Riccati equation (ARE) given as

$$\mathbf{0} = -\mathbf{S}\mathbf{A} - \mathbf{A}^\mathsf{T}\mathbf{S} + \mathbf{S}\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^\mathsf{T}\mathbf{S} - \mathbf{Q} .$$

Understanding that the ARE arises from the infinite-horizon problem, the matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{Q}$, and $\mathbf{R}$ are all constant. It is noted that there are in general multiple solutions to the algebraic Riccati equation and the positive definite (or positive semi-definite) solution is the one that is used to compute the feedback gain. The LQ (LQR) problem was elegantly solved by Rudolf E. Kálmán.[9]
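
For the infinite-horizon case, SciPy exposes a solver for the ARE directly. A minimal sketch, reusing the hypothetical double-integrator data from above:

```python
# A minimal sketch of the infinite-horizon LQR gain; scipy's
# solve_continuous_are returns the stabilizing positive semi-definite
# solution of the algebraic Riccati equation.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

S = solve_continuous_are(A, B, Q, R)   # algebraic Riccati solution
K = np.linalg.inv(R) @ B.T @ S         # constant gain, u(t) = -K x(t)
print(K)                               # [[1.0, 1.7320...]] for this data
```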

Numerical methods for optimal control

Optimal control problems are generally nonlinear and therefore generally do not have analytic solutions (the linear-quadratic optimal control problem being a notable exception). As a result, it is necessary to employ numerical methods to solve optimal control problems. In the early years of optimal control (c. 1950s to 1980s) the favored approach for solving optimal control problems was that of indirect methods. In an indirect method, the calculus of variations is employed to obtain the first-order optimality conditions. These conditions result in a two-point (or, in the case of a complex problem, a multi-point) boundary-value problem. This boundary-value problem actually has a special structure because it arises from taking the derivative of a Hamiltonian. Thus, the resulting dynamical system is a Hamiltonian system of the form[1]

$$\begin{aligned} \dot{\textbf{x}} &= \frac{\partial H}{\partial \boldsymbol{\lambda}} \\ \dot{\boldsymbol{\lambda}} &= -\frac{\partial H}{\partial \textbf{x}} \end{aligned}$$

where

$$H = F + \boldsymbol{\lambda}^\mathsf{T} \textbf{f} + \boldsymbol{\mu}^\mathsf{T} \textbf{h}$$

is the augmented Hamiltonian, and in an indirect method the boundary-value problem is solved (using the appropriate boundary or transversality conditions). The beauty of using an indirect method is that the state and adjoint (i.e., $\boldsymbol{\lambda}$) are solved for and the resulting solution is readily verified to be an extremal trajectory. The disadvantage of indirect methods is that the boundary-value problem is often extremely difficult to solve (particularly for problems that span large time intervals or problems with interior point constraints). A well-known software program that implements indirect methods is BNDSCO.[10]
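
To make the indirect-method workflow concrete, here is a toy sketch (not from the article): minimize $\int_0^1 \tfrac{1}{2} u(t)^2 \, \mathrm{d}t$ subject to $\dot{x} = u$, $x(0) = 0$, $x(1) = 1$. Stationarity of the Hamiltonian $H = \tfrac{1}{2}u^2 + \lambda u$ gives $u = -\lambda$, so the canonical equations $\dot{x} = -\lambda$, $\dot{\lambda} = 0$ plus the two endpoint conditions form a two-point boundary-value problem, solved below with SciPy's solve_bvp:

```python
# A toy indirect-method sketch: the canonical (Hamiltonian) equations
# of a simple problem solved as a two-point boundary-value problem.
import numpy as np
from scipy.integrate import solve_bvp

def odes(t, y):                 # y[0] = x, y[1] = lambda
    x, lam = y
    return np.vstack((-lam, np.zeros_like(lam)))

def bc(ya, yb):                 # residuals of x(0) = 0 and x(1) = 1
    return np.array([ya[0], yb[0] - 1.0])

t = np.linspace(0.0, 1.0, 11)
y_guess = np.zeros((2, t.size))
sol = solve_bvp(odes, bc, t, y_guess)
print(sol.y[1, 0])              # costate = -1, so u(t) = -lambda = 1
```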

The approach that has risen to prominence in numerical optimal control since the 1980s is that of so-called direct methods. In a direct method, the state or the control, or both, are approximated using an appropriate function approximation (e.g., polynomial approximation or piecewise constant parameterization). Simultaneously, the cost functional is approximated as a cost function. Then, the coefficients of the function approximations are treated as optimization variables and the problem is "transcribed" to a nonlinear optimization problem of the form:

Minimize

$$F(\mathbf{z})$$

subject to the algebraic constraints

$$\begin{aligned} \mathbf{g}(\mathbf{z}) &= \mathbf{0} \\ \mathbf{h}(\mathbf{z}) &\leq \mathbf{0} \end{aligned}$$

Depending upon the type of direct method employed, the size of the nonlinear optimization problem can be quite small (e.g., as in a direct shooting or quasilinearization method), moderate (e.g., pseudospectral optimal control[11]), or quite large (e.g., a direct collocation method[12]). In the latter case (i.e., a collocation method), the nonlinear optimization problem may have thousands to tens of thousands of variables and constraints. Given the size of many NLPs arising from a direct method, it may appear somewhat counter-intuitive that solving the nonlinear optimization problem is easier than solving the boundary-value problem. It is, however, the case that the NLP is easier to solve than the boundary-value problem. The reason for the relative ease of computation, particularly of a direct collocation method, is that the NLP is sparse, and many well-known software programs exist (e.g., SNOPT[13]) to solve large sparse NLPs. As a result, the range of problems that can be solved via direct methods (particularly direct collocation methods, which are very popular these days) is significantly larger than the range of problems that can be solved via indirect methods. Indeed, direct methods have become so popular that many elaborate software programs employing them have been written, among them DIRCOL,[14] SOCS,[15] OTIS,[16] GESOP/ASTOS,[17] DITAN,[18] and PyGMO/PyKEP.[19] In recent years, due to the advent of the MATLAB programming language, optimal control software in MATLAB has become more common. Examples of academically developed MATLAB software tools implementing direct methods include RIOTS,[20] DIDO,[21] DIRECT,[22] FALCON.m,[23] and GPOPS,[24] while an example of an industry-developed MATLAB tool is PROPT.[25] These software tools have significantly increased the opportunity for people to explore complex optimal control problems both for academic research and for industrial applications.[26] Finally, it is noted that general-purpose MATLAB optimization environments such as TOMLAB have made coding complex optimal control problems significantly easier than was previously possible in languages such as C and FORTRAN.
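
To illustrate the transcription itself, the following sketch revisits the toy problem from the indirect-method example, with a control bound added. The variable layout and Euler discretization are illustrative choices, not any particular package's format: the decision vector $\mathbf{z}$ stacks the discretized state and control, "defect" equations supply $\mathbf{g}(\mathbf{z}) = \mathbf{0}$, the bound supplies $\mathbf{h}(\mathbf{z}) \leq \mathbf{0}$, and SciPy's SLSQP solver handles the resulting NLP:

```python
# A direct-transcription sketch: discretize, transcribe to an NLP,
# and solve. (SLSQP's inequality convention is fun(z) >= 0.)
import numpy as np
from scipy.optimize import minimize

N, T = 20, 1.0
dt = T / N

def unpack(z):
    return z[:N + 1], z[N + 1:]     # states x_0..x_N, controls u_0..u_{N-1}

def cost(z):                        # transcribed cost functional
    _, u = unpack(z)
    return 0.5 * dt * np.sum(u ** 2)

def defects(z):                     # g(z) = 0: dynamics and endpoints
    x, u = unpack(z)
    g = x[1:] - x[:-1] - dt * u     # Euler discretization of xdot = u
    return np.concatenate((g, [x[0], x[-1] - 1.0]))

cons = [{"type": "eq", "fun": defects},
        {"type": "ineq", "fun": lambda z: 2.0 - unpack(z)[1]}]  # u <= 2

res = minimize(cost, np.zeros(2 * N + 1), constraints=cons, method="SLSQP")
x_opt, u_opt = unpack(res.x)
print(np.round(u_opt, 3))           # approximately constant u = 1
```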

Discrete-time optimal control

The examples thus far have shown continuous time systems and control solutions. In fact, as optimal control solutions are now often implemented digitally, contemporary control theory is now primarily concerned with discrete time systems and solutions. The Theory of Consistent Approximations[27][28] provides conditions under which solutions to a series of increasingly accurate discretized optimal control problems converge to the solution of the original, continuous-time problem. Not all discretization methods have this property, even seemingly obvious ones.[29] For instance, using a variable step-size routine to integrate the problem's dynamic equations may generate a gradient which does not converge to zero (or point in the right direction) as the solution is approached. The direct method RIOTS is based on the Theory of Consistent Approximations.

Examples

A common solution strategy in many optimal control problems is to solve for the costate (sometimes called the shadow price) $\lambda(t)$. The costate summarizes in one number the marginal value of expanding or contracting the state variable next turn. The marginal value is not only the gains accruing to it next turn but also those associated with the remaining duration of the program. It is nice when $\lambda(t)$ can be solved analytically, but usually, the most one can do is describe it sufficiently well that the intuition can grasp the character of the solution and an equation solver can solve numerically for the values.

Having obtained $\lambda(t)$, the turn-$t$ optimal value for the control can usually be solved as a differential equation conditional on knowledge of $\lambda(t)$. Again it is infrequent, especially in continuous-time problems, that one obtains the value of the control or the state explicitly. Usually, the strategy is to solve for thresholds and regions that characterize the optimal control and use a numerical solver to isolate the actual choice values in time.

Finite time

Consider the problem of a mine owner who must decide at what rate to extract ore from their mine. They own rights to the ore from date $0$ to date $T$. At date $0$ there is $x_0$ ore in the ground, and the time-dependent amount of ore $x(t)$ left in the ground declines at the rate of $u(t)$ that the mine owner extracts it. The mine owner extracts ore at cost $u(t)^2/x(t)$ (the cost of extraction increasing with the square of the extraction speed and the inverse of the amount of ore left) and sells ore at a constant price $p$. Any ore left in the ground at time $T$ cannot be sold and has no value (there is no "scrap value"). The owner chooses the rate of extraction varying with time $u(t)$ to maximize profits over the period of ownership with no time discounting.

  1. Discrete-time version

    The manager maximizes profit $\Pi$:

    $$\Pi = \sum_{t=0}^{T-1} \left[\, p u_t - \frac{u_t^2}{x_t} \,\right]$$

    subject to the law of motion for the state variable $x_t$:

    $$x_{t+1} = x_t - u_t$$

    Form the Hamiltonian and differentiate:

    $$\begin{aligned} H &= p u_t - \frac{u_t^2}{x_t} - \lambda_{t+1} u_t \\ \frac{\partial H}{\partial u_t} &= p - \lambda_{t+1} - 2\,\frac{u_t}{x_t} = 0 \\ \lambda_{t+1} - \lambda_t &= -\frac{\partial H}{\partial x_t} = -\left( \frac{u_t}{x_t} \right)^2 \end{aligned}$$

    As the mine owner does not value the ore remaining at time $T$,

    $$\lambda_T = 0$$

    Using the above equations, it is easy to solve for the $x_t$ and $\lambda_t$ series

    $$\begin{aligned} \lambda_t &= \lambda_{t+1} + \frac{\left( p - \lambda_{t+1} \right)^2}{4} \\ x_{t+1} &= x_t\, \frac{2 - \left( p - \lambda_{t+1} \right)}{2} \end{aligned}$$

    and using the initial and turn-$T$ conditions, the $x_t$ series can be solved explicitly, giving $u_t$ (a numerical sketch of this recursion appears after this list).
  2. Continuous-time version

    The manager maximizes profit $\Pi$:

    $$\Pi = \int_0^T \left[\, p u(t) - \frac{u(t)^2}{x(t)} \,\right] \mathrm{d}t,$$

    where the state variable $x(t)$ evolves as follows:

    $$\dot{x}(t) = -u(t)$$

    Form the Hamiltonian and differentiate:

    $$\begin{aligned} H &= p u(t) - \frac{u(t)^2}{x(t)} - \lambda(t) u(t) \\ \frac{\partial H}{\partial u} &= p - \lambda(t) - 2\,\frac{u(t)}{x(t)} = 0 \\ \dot{\lambda}(t) &= -\frac{\partial H}{\partial x} = -\left( \frac{u(t)}{x(t)} \right)^2 \end{aligned}$$

    As the mine owner does not value the ore remaining at time $T$,

    $$\lambda(T) = 0$$

    Using the above equations, it is easy to solve for the differential equations governing $u(t)$ and $\lambda(t)$

    $$\begin{aligned} \dot{\lambda}(t) &= -\frac{\left( p - \lambda(t) \right)^2}{4} \\ u(t) &= x(t)\, \frac{p - \lambda(t)}{2} \end{aligned}$$

    and using the initial and turn-$T$ conditions, the functions can be solved to yield

    $$x(t) = \frac{(4 - pt + pT)^2}{(4 + pT)^2}\, x_0$$

    (a second sketch after this list checks this closed form numerically).
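
Both versions lend themselves to direct computation. First, the discrete-time recursions: a minimal Python sketch under the hypothetical data $p = 1$, $T = 5$, $x_0 = 100$ (values not from the article), running the costate recursion backward from $\lambda_T = 0$ and then rolling the state and control forward:

```python
# A numerical sketch of the discrete-time mine problem: backward pass
# for the costate, forward pass for the state and control.
T, p, x0 = 5, 1.0, 100.0

lam = [0.0] * (T + 1)                        # lambda_T = 0
for t in range(T - 1, -1, -1):
    lam[t] = lam[t + 1] + (p - lam[t + 1]) ** 2 / 4.0

x, u = [x0], []
for t in range(T):
    u.append(x[t] * (p - lam[t + 1]) / 2.0)  # optimal extraction u_t
    x.append(x[t] - u[t])                    # law of motion

print([round(v, 2) for v in u])              # extraction schedule
```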
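
Second, the continuous-time closed form for $x(t)$ can be checked against a direct numerical integration of the costate and state equations, using the same hypothetical data: SciPy's solve_ivp integrates $\lambda$ backward from $\lambda(T) = 0$ and then $x$ forward from $x(0) = x_0$.

```python
# A sketch checking the closed form for x(t) against numerical
# integration of the costate and state equations.
import numpy as np
from scipy.integrate import solve_ivp

p, T, x0 = 1.0, 5.0, 100.0

# Backward sweep: lambdadot = -(p - lambda)^2 / 4, lambda(T) = 0.
lam = solve_ivp(lambda t, y: -(p - y) ** 2 / 4.0, (T, 0.0), [0.0],
                dense_output=True, rtol=1e-10)

# Forward sweep: xdot = -u = -x (p - lambda(t)) / 2, x(0) = x_0.
ts = np.linspace(0.0, T, 6)
x = solve_ivp(lambda t, y: -y * (p - lam.sol(t)[0]) / 2.0, (0.0, T),
              [x0], t_eval=ts, rtol=1e-10)

closed_form = x0 * (4.0 - p * ts + p * T) ** 2 / (4.0 + p * T) ** 2
print(np.allclose(x.y[0], closed_form, rtol=1e-6))   # True
```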

See also

  • Active inference
  • Bellman equation
  • Bellman pseudospectral method
  • Brachistochrone
  • DIDO
  • DNSS point
  • Dynamic programming
  • Gauss pseudospectral method
  • Generalized filtering
  • GPOPS-II
  • CasADi
  • JModelica.org (Modelica-based open source platform for dynamic optimization)
  • Kalman filter
  • Linear-quadratic regulator
  • Model predictive control
  • Overtaking criterion
  • PID controller
  • PROPT (Optimal Control Software for MATLAB)
  • Pseudospectral optimal control
  • Pursuit-evasion games
  • Sliding mode control
  • SNOPT
  • Stochastic control
  • Trajectory optimization

References

  1. ^ a b c d Ross, Isaac (2015). A primer on Pontryagin's principle in optimal control. San Francisco: Collegiate Publishers. ISBN 978-0-9843571-0-9. OCLC 625106088.
  2. ^ Luenberger, David G. (1979). "Optimal Control". Introduction to Dynamic Systems. New York: John Wiley & Sons. pp. 393–435. ISBN 0-471-02594-1.
  3. ^ Kamien, Morton I. (2013). Dynamic Optimization: the Calculus of Variations and Optimal Control in Economics and Management. Dover Publications. ISBN 978-1-306-39299-0. OCLC 869522905.
  4. ^ Ross, I. M.; Proulx, R. J.; Karpenko, M. (6 May 2020). "An Optimal Control Theory for the Traveling Salesman Problem and Its Variants". arXiv:2005.03186 [math.OC].
  5. ^ Ross, Isaac M.; Karpenko, Mark; Proulx, Ronald J. (1 January 2016). "A Nonsmooth Calculus for Solving Some Graph-Theoretic Control Problems**This research was sponsored by the U.S. Navy". IFAC-PapersOnLine. 10th IFAC Symposium on Nonlinear Control Systems NOLCOS 2016. 49 (18): 462–467. doi:10.1016/j.ifacol.2016.10.208. ISSN 2405-8963.
  6. ^ Sargent, R. W. H. (2000). "Optimal Control". Journal of Computational and Applied Mathematics. 124 (1–2): 361–371. Bibcode:2000JCoAM.124..361S. doi:10.1016/S0377-0427(00)00418-0.
  7. ^ Bryson, A. E. (1996). "Optimal Control—1950 to 1985". IEEE Control Systems Magazine. 16 (3): 26–33. doi:10.1109/37.506395.
  8. ^ Ross, I. M. (2009). A Primer on Pontryagin's Principle in Optimal Control. Collegiate Publishers. ISBN 978-0-9843571-0-9.
  9. ^ Kalman, Rudolf. A new approach to linear filtering and prediction problems. Transactions of the ASME, Journal of Basic Engineering, 82:34–45, 1960
  10. ^ Oberle, H. J. and Grimm, W., "BNDSCO-A Program for the Numerical Solution of Optimal Control Problems," Institute for Flight Systems Dynamics, DLR, Oberpfaffenhofen, 1989
  11. ^ Ross, I. M.; Karpenko, M. (2012). "A Review of Pseudospectral Optimal Control: From Theory to Flight". Annual Reviews in Control. 36 (2): 182–197. doi:10.1016/j.arcontrol.2012.09.002.
  12. ^ Betts, J. T. (2010). Practical Methods for Optimal Control Using Nonlinear Programming (2nd ed.). Philadelphia, Pennsylvania: SIAM Press. ISBN 978-0-89871-688-7.
  13. ^ Gill, P. E., Murray, W. M., and Saunders, M. A., User's Manual for SNOPT Version 7: Software for Large-Scale Nonlinear Programming, University of California, San Diego Report, 24 April 2007
  14. ^ von Stryk, O., User's Guide for DIRCOL (version 2.1): A Direct Collocation Method for the Numerical Solution of Optimal Control Problems, Fachgebiet Simulation und Systemoptimierung (SIM), Technische Universität Darmstadt (2000, Version of November 1999).
  15. ^ Betts, J.T. and Huffman, W. P., Sparse Optimal Control Software, SOCS, Boeing Information and Support Services, Seattle, Washington, July 1997
  16. ^ Hargraves, C. R.; Paris, S. W. (1987). "Direct Trajectory Optimization Using Nonlinear Programming and Collocation". Journal of Guidance, Control, and Dynamics. 10 (4): 338–342. Bibcode:1987JGCD...10..338H. doi:10.2514/3.20223.
  17. ^ Gath, P.F., Well, K.H., "Trajectory Optimization Using a Combination of Direct Multiple Shooting and Collocation", AIAA 2001–4047, AIAA Guidance, Navigation, and Control Conference, Montréal, Québec, Canada, 6–9 August 2001
  18. ^ Vasile M., Bernelli-Zazzera F., Fornasari N., Masarati P., "Design of Interplanetary and Lunar Missions Combining Low-Thrust and Gravity Assists", Final Report of the ESA/ESOC Study Contract No. 14126/00/D/CS, September 2002
  19. ^ Izzo, Dario. "PyGMO and PyKEP: open source tools for massively parallel optimization in astrodynamics (the case of interplanetary trajectory optimization)." Proceed. Fifth International Conf. Astrodynam. Tools and Techniques, ICATT. 2012.
  20. ^ RIOTS, archived 16 July 2011 at the Wayback Machine; based on Schwartz, Adam (1996). Theory and Implementation of Methods based on Runge–Kutta Integration for Solving Optimal Control Problems (Ph.D.). University of California at Berkeley. OCLC 35140322.
  21. ^ Ross, I. M., Enhancements to the DIDO Optimal Control Toolbox, arXiv 2020. https://arxiv.org/abs/2004.13112
  22. ^ Williams, P., User's Guide to DIRECT, Version 2.00, Melbourne, Australia, 2008
  23. ^ FALCON.m, described in Rieck, M., Bittner, M., Grüter, B., Diepolder, J., and Piprek, P., FALCON.m - User Guide, Institute of Flight System Dynamics, Technical University of Munich, October 2019
  24. ^ GPOPS, archived 24 July 2011 at the Wayback Machine; described in Rao, A. V., Benson, D. A., Huntington, G. T., Francolin, C., Darby, C. L., and Patterson, M. A., User's Manual for GPOPS: A MATLAB Package for Dynamic Optimization Using the Gauss Pseudospectral Method, University of Florida Report, August 2008.
  25. ^ Rutquist, P. and Edvall, M. M., PROPT – MATLAB Optimal Control Software, 1260 S.E. Bishop Blvd Ste E, Pullman, WA 99163, USA: Tomlab Optimization, Inc.
  26. ^ I.M. Ross, Computational Optimal Control, 3rd Workshop in Computational Issues in Nonlinear Control, October 8th, 2019, Monterey, CA
  27. ^ E. Polak, On the use of consistent approximations in the solution of semi-infinite optimization and optimal control problems Math. Prog. 62 pp. 385–415 (1993).
  28. ^ Ross, I M. (1 December 2005). "A Roadmap for Optimal Control: The Right Way to Commute". Annals of the New York Academy of Sciences. 1065 (1): 210–231. Bibcode:2005NYASA1065..210R. doi:10.1196/annals.1370.015. ISSN 0077-8923. PMID 16510411. S2CID 7625851.
  29. ^ Fahroo, Fariba; Ross, I. Michael (September 2008). "Convergence of the Costates Does Not Imply Convergence of the Control". Journal of Guidance, Control, and Dynamics. 31 (5): 1492–1497. Bibcode:2008JGCD...31.1492F. doi:10.2514/1.37331. ISSN 0731-5090. S2CID 756939.

Further reading

  • Bertsekas, D. P. (1995). Dynamic Programming and Optimal Control. Belmont: Athena. ISBN 1-886529-11-6.
  • Bryson, A. E.; Ho, Y.-C. (1975). Applied Optimal Control: Optimization, Estimation and Control (Revised ed.). New York: John Wiley and Sons. ISBN 0-470-11481-9.
  • Fleming, W. H.; Rishel, R. W. (1975). Deterministic and Stochastic Optimal Control. New York: Springer. ISBN 0-387-90155-8.
  • Kamien, M. I.; Schwartz, N. L. (1991). Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management (Second ed.). New York: Elsevier. ISBN 0-444-01609-0.
  • Kirk, D. E. (1970). Optimal Control Theory: An Introduction. Englewood Cliffs: Prentice-Hall. ISBN 0-13-638098-0.

External links

  • Victor M. Becerra, ed. (2008). "Optimal control". Scholarpedia. Retrieved 31 December 2022.
  • Computational Optimal Control
  • Dr. Benoît Chachuat, Automatic Control Laboratory – Nonlinear Programming, Calculus of Variations and Optimal Control
  • DIDO - MATLAB tool for optimal control
  • GEKKO - Python package for optimal control
  • GESOP – Graphical Environment for Simulation and OPtimization
  • GPOPS-II – General-Purpose MATLAB Optimal Control Software
  • CasADi – Free and open source symbolic framework for optimal control
  • PROPT – MATLAB Optimal Control Software
  • OpenOCL – Open Optimal Control Library
  • Elmer G. Wiens: Optimal Control – Applications of Optimal Control Theory Using the Pontryagin Maximum Principle, with interactive models
  • On Optimal Control by Yu-Chi Ho
  • Pseudospectral optimal control: Part 1
  • Pseudospectral optimal control: Part 2
  • Lecture Recordings and Script by Prof. Moritz Diehl, University of Freiburg on Numerical Optimal Control
