Dynamic programming

Dynamic programming is both a mathematical optimization method and an algorithmic paradigm. The method was developed by Richard Bellman in the 1950s and has found applications in numerous fields, from aerospace engineering to economics.

In both contexts it refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner. While some decision problems cannot be taken apart this way, decisions that span several points in time do often break apart recursively. Likewise, in computer science, if a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, then it is said to have optimal substructure.
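As an illustration of optimal substructure, the following minimal Python sketch solves a small coin-change problem by recursion with memoization. The problem itself and the names `min_coins` and `solve` are assumptions chosen for illustration and do not come from the text above. The optimal answer for an amount is built from optimal answers for smaller amounts.

```python
from functools import lru_cache
from typing import Optional, Tuple


def min_coins(amount: int, coins: Tuple[int, ...]) -> Optional[int]:
    """Return the fewest coins that sum to `amount`, or None if impossible."""

    @lru_cache(maxsize=None)
    def solve(remaining: int) -> float:
        # Base case: nothing left to pay needs zero coins.
        if remaining == 0:
            return 0
        best = float("inf")
        for c in coins:
            if c <= remaining:
                # Optimal substructure: an optimal solution for `remaining`
                # is one coin plus an optimal solution for `remaining - c`.
                best = min(best, 1 + solve(remaining - c))
        return best

    result = solve(amount)
    return None if result == float("inf") else int(result)


print(min_coins(6, (1, 3, 4)))  # 2, using 3 + 3
```

Caching the sub-problem results means each remaining amount is solved only once, even though many different coin sequences lead to it.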

If sub-problems can be nested recursively inside larger problems, so that dynamic programming methods are applicable, then there is a relation between the value of the larger problem and the values of the sub-problems.^[1] In the optimization literature this relationship is called the Bellman equation.

Overview

Mathematical optimization

In terms of mathematical optimization, dynamic programming usually refers to simplifying a decision by breaking it down into a sequence of decision steps over time.

This is done by defining a sequence of value functions V₁, V₂, ..., V_n, each taking as its argument y, the state of the system at time i, for i from 1 to n.

The definition of V_n(y) is the value obtained in state y at the last time n.

The values V_i at earlier times i = n − 1, n − 2, ..., 2, 1 can be found by working backwards, using a recursive relationship called the Bellman equation.

For i = 2, ..., n, V_{i−1} at any state y is calculated from V_i by maximizing a simple function (usually the sum) of the gain from a decision at time i − 1 and the function V_i at the new state of the system if this decision is made.

Since V_i has already been calculated for the needed states, the above operation yields V_{i−1} for those states.
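Written out for the common case in which the "simple function" is a sum, the backward step takes the following form. The symbols $a$ (a decision), $A(y)$ (the decisions admissible in state $y$), $g_{i-1}$ (the gain at time $i-1$) and $T$ (the state transition) are notation assumed here for illustration only:

$$V_{i-1}(y)=\max _{a\in A(y)}\left\{g_{i-1}(y,a)+V_{i}\left(T(y,a)\right)\right\},\qquad i=2,\dots ,n$$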

Finally, V₁ at the initial state of the system is the value of the optimal solution. The optimal values of the decision variables can be recovered, one by one, by tracking back the calculations already performed.
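The backward procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function `backward_induction`, its parameters, and the "cake-eating" example (3 units of cake, a gain of the square root of the amount eaten, and everything left over consumed at the last time) are all hypothetical and not part of the original text.

```python
import math


def backward_induction(states, actions, step, gain, terminal_value, n):
    """Compute V[i][y] for i = 1..n and the maximizing decisions policy[i][y]."""
    # V_n(y) is the value obtained in state y at the last time n.
    V = {n: {y: terminal_value(y) for y in states}}
    policy = {}
    for i in range(n, 1, -1):  # i = n, n-1, ..., 2
        V[i - 1], policy[i - 1] = {}, {}
        for y in states:
            # Maximize (gain at time i-1) + V_i(new state) over the decisions.
            best_value, best_action = max(
                ((gain(i - 1, y, a) + V[i][step(y, a)], a) for a in actions(y)),
                key=lambda pair: pair[0],
            )
            V[i - 1][y] = best_value
            policy[i - 1][y] = best_action
    return V, policy


# Hypothetical example: state y = units of cake left, decision a = units eaten.
N_PERIODS = 3


def eat(y, a):
    return y - a


V, policy = backward_induction(
    states=range(4),
    actions=lambda y: range(y + 1),
    step=eat,
    gain=lambda i, y, a: math.sqrt(a),
    terminal_value=lambda y: math.sqrt(y),
    n=N_PERIODS,
)

# Recover the optimal decisions one by one from the stored maximizers.
y, plan = 3, []
for i in range(1, N_PERIODS):
    a = policy[i][y]
    plan.append(a)
    y = eat(y, a)

print(V[1][3], plan)  # 3.0 [1, 1]
```

In this example V₁ at the initial state 3 equals 3.0: one unit is eaten at each of the first two times and the last unit is consumed at the final time.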

Control theory

In control theory, a typical problem is to find an admissible control $\mathbf {u} ^{\ast }$ which causes the system ${\dot {\mathbf {x} }}(t)=\mathbf {g} \left(\mathbf {x} (t),\mathbf {u} (t),t\right)$ to follow an admissible trajectory $\mathbf {x} ^{\ast }$ on a continuous time interval $t_{0}\leq t\leq t_{1}$ that minimizes a cost function

$$J=b\left(\mathbf {x} (t_{1}),t_{1}\right)+\int _{t_{0}}^{t_{1}}f\left(\mathbf {x} (t),\mathbf {u} (t),t\right)\mathrm {d} t$$

The solution to this problem is an optimal control law or policy $\mathbf {u} ^{\ast }=h(\mathbf {x} (t),t)$ , which produces an optimal trajectory $\mathbf {x} ^{\ast }$ and a cost-to-go function $J^{\ast }$ . The latter obeys the fundamental equation of dynamic programming:

$$-J_{t}^{\ast }=\min _{\mathbf {u} }\left\{f\left(\mathbf {x} (t),\mathbf {u} (t),t\right)+J_{x}^{\ast {\mathsf {T}}}\mathbf {g} \left(\mathbf {x} (t),\mathbf {u} (t),t\right)\right\}$$

a partial differential equation known as the Hamilton–Jacobi–Bellman equation, in which $J_{x}^{\ast }={\frac {\partial J^{\ast }}{\partial \mathbf {x} }}=\left[{\frac {\partial J^{\ast }}{\partial x_{1}}}~~{\frac {\partial J^{\ast }}{\partial x_{2}}}~~\dots ~~{\frac {\partial J^{\ast }}{\partial x_{n}}}\right]^{\mathsf {T}}$ and $J_{t}^{\ast }={\frac {\partial J^{\ast }}{\partial t}}$. One finds the minimizing $\mathbf {u}$ in terms of $t$, $\mathbf {x}$, and the unknown function $J_{x}^{\ast }$, and then substitutes the result into the Hamilton–Jacobi–Bellman equation to obtain the partial differential equation to be solved with boundary condition $J\left(t_{1}\right)=b\left(\mathbf {x} (t_{1}),t_{1}\right)$.
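As a worked illustration of this procedure, consider a scalar example chosen here purely for illustration (it is an assumption, not taken from the text above): the system ${\dot {x}}=u$ with running cost $f=x^{2}+u^{2}$ and terminal cost $b=0$. The Hamilton–Jacobi–Bellman equation then reads

$$-J_{t}^{\ast }=\min _{u}\left\{x^{2}+u^{2}+J_{x}^{\ast }u\right\}$$

Setting the derivative of the bracket with respect to $u$ to zero gives $2u+J_{x}^{\ast }=0$, so the minimizing control is $u^{\ast }=-{\tfrac {1}{2}}J_{x}^{\ast }$. Substituting it back yields the partial differential equation

$$-J_{t}^{\ast }=x^{2}-{\tfrac {1}{4}}\left(J_{x}^{\ast }\right)^{2},\qquad J^{\ast }(x,t_{1})=0$$

Trying a solution of the form $J^{\ast }(x,t)=p(t)\,x^{2}$ reduces this to the ordinary differential equation $-{\dot {p}}=1-p^{2}$ with $p(t_{1})=0$, whose solution is $p(t)=\tanh(t_{1}-t)$. The resulting control law for this example is $u^{\ast }=-\tanh(t_{1}-t)\,x$.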


[1]