Source: pdf
Author: H. J. Kim, Andrew Y. Ng
Abstract: Learning algorithms have enjoyed numerous successes in robotic control tasks. In problems with time-varying dynamics, online learning methods have also proved to be a powerful tool for automatically tracking and/or adapting to the changing circumstances. However, for safety-critical applications such as airplane flight, the adoption of these algorithms has been significantly hampered by their lack of safety, such as “stability,” guarantees. Rather than trying to show difficult, a priori, stability guarantees for specific learning methods, in this paper we propose a method for “monitoring” the controllers suggested by the learning algorithm online, and rejecting controllers leading to instability. We prove that even if an arbitrary online learning method is used with our algorithm to control a linear dynamical system, the resulting system is stable.
[1] B. Anderson and J. Moore. Optimal Control: Linear Quadratic Methods. Prentice-Hall, 1989.
[2] Karl Astrom and Bjorn Wittenmark. Adaptive Control (2nd Edition). Addison-Wesley, 1994.
[3] V. D. Blondel and J. N. Tsitsiklis. The boundedness of all products of a pair of matrices is undecidable. Systems and Control Letters, 41(2):135–140, 2000.
[4] Michael S. Branicky. Analyzing continuous switching systems: Theory and examples. In Proc. American Control Conference, 1994.
[5] Michael S. Branicky. Stability of switched and hybrid systems. In Proc. 33rd IEEE Conf. Decision Control, 1994.
[6] G. Franklin, J. Powell, and A. Emami-Naeini. Feedback Control of Dynamic Systems. Addison-Wesley, 1995.
[7] M. Johansson and A. Rantzer. On the computation of piecewise quadratic Lyapunov functions. In Proceedings of the 36th IEEE Conference on Decision and Control, 1997.
[8] H. Khalil. Nonlinear Systems (3rd ed). Prentice Hall, 2001.
[9] Daniel Liberzon, João Hespanha, and A. S. Morse. Stability of switched linear systems: A Lie-algebraic condition. Systems and Control Letters, 37(3):117–122, 1999.
[10] J. Nakanishi, J. A. Farrell, and S. Schaal. A locally weighted learning composite adaptive controller with structure adaptation. In International Conference on Intelligent Robots, 2002.
[11] T. J. Perkins and A. G. Barto. Lyapunov design for safe reinforcement learning control. In Safe Learning Agents: Papers from the 2002 AAAI Symposium, pages 23–30, 2002.
[12] Jean-Jacques Slotine and Weiping Li. Applied Nonlinear Control. Prentice Hall, 1990.

Footnote 8: Checking all $k^N$ such combinations takes time exponential in $N$, but it is often possible to use very small values of $N$, sometimes including $N = 1$, if the states $x_t$ are linearly reparameterized ($x'_t = M x_t$) to minimize $\sigma_{\max}(D_0)$.