Online Optimal Control with Affine Constraints
Abstract
This paper studies online optimal control of a noisy linear dynamical system subject to linear constraints on the states and actions. The convex stage cost functions change adversarially and are unknown before the corresponding stage actions are selected. The dynamical system and the constraints are time-invariant and known beforehand. We propose an online control algorithm, Online Gradient Descent with Buffer Zone (OGD-BZ). OGD-BZ guarantees that the system satisfies all the constraints despite the random process noise. We investigate the policy regret of OGD-BZ, defined as the difference between OGD-BZ's total cost and the total cost of an optimal linear policy in hindsight. We show that OGD-BZ achieves $\tilde O(\sqrt T)$ policy regret, where $\tilde O(\cdot)$ hides logarithmic factors in $T$.
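For concreteness, the setting and the regret notion described above can be written out as follows. This is only a sketch: the symbols $A$, $B$, $w_t$, $D_x$, $d_x$, $D_u$, $d_u$, $c_t$, and the linear policy class $\mathcal K$ are notation assumed here for illustration, $T$ is taken to be the number of stages, and the constraints are written in inequality form as one plausible reading of "linear constraints".

$$
\begin{aligned}
&\text{dynamics:} && x_{t+1} = A x_t + B u_t + w_t, \\
&\text{constraints:} && D_x x_t \le d_x, \qquad D_u u_t \le d_u \quad \text{for all } t, \\
&\text{policy regret:} && \mathrm{Regret}_T = \sum_{t=0}^{T-1} c_t(x_t, u_t) \;-\; \min_{K \in \mathcal K} \sum_{t=0}^{T-1} c_t\big(x_t^{K}, u_t^{K}\big).
\end{aligned}
$$

Here $(x_t, u_t)$ denotes the trajectory generated by the online algorithm, and $(x_t^{K}, u_t^{K})$ denotes the trajectory generated by a fixed linear policy $u_t = -K x_t$ under the same noise sequence. The sign convention for $K$ and the exact membership conditions of $\mathcal K$ are likewise assumptions of this sketch.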