
E-book: Self-Learning Optimal Control of Nonlinear Systems: Adaptive Dynamic Programming Approach

  • Format - PDF+DRM
  • Price: 147.58 €*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID. More information here. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions. (This is a free application designed specifically for reading e-books. It should not be confused with Adobe Reader, which is probably already installed on your computer.)

    This e-book cannot be read on an Amazon Kindle.

This book presents a class of novel self-learning optimal control schemes based on adaptive dynamic programming (ADP) techniques, which quantitatively obtain the optimal control laws of the systems under consideration. It analyzes the properties of these iterative methods, including the convergence of the iterative value functions and the stability of the system under the iterative control laws, which help guarantee the effectiveness of the developed methods. When the system model is known, self-learning optimal control is designed on the basis of the model; when the system model is unknown, adaptive dynamic programming is implemented from system data, so that the performance of the system effectively converges to the optimum.

With various real-world examples to complement and substantiate the mathematical analysis, the book is a valuable guide for engineers, researchers, and students in control science and engineering.
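The central idea in this description, iterating a value function until it converges and then extracting a control law from it, is the value-iteration form of ADP listed in Sect. 1.3.1 and Chap. 3 of the contents below. The following minimal Python sketch illustrates that loop on a made-up scalar system x_{k+1} = f(x_k) + g(x_k)u_k with a quadratic stage cost; the dynamics, cost, and grids are illustrative assumptions, not examples taken from the book.

```python
import numpy as np

# Illustrative (assumed) dynamics x_{k+1} = f(x_k) + g(x_k) * u_k and stage cost U(x, u).
f = lambda x: 0.8 * np.sin(x)
g = lambda x: 1.0
U = lambda x, u: x**2 + u**2

states = np.linspace(-2.0, 2.0, 201)     # grid on which the value function is stored
controls = np.linspace(-1.5, 1.5, 151)   # candidate control actions
V = np.zeros_like(states)                # V_0(x) = 0, the usual initialisation

for i in range(500):                     # value-iteration loop
    # Successor states for every (state, control) pair on the grid.
    x_next = f(states[:, None]) + g(states[:, None]) * controls[None, :]
    # Interpolate V_i at the successor states (clamped at the grid edges).
    V_next = np.interp(x_next.ravel(), states, V).reshape(x_next.shape)
    Q = U(states[:, None], controls[None, :]) + V_next
    V_new = Q.min(axis=1)                # V_{i+1}(x) = min_u [U(x, u) + V_i(x_next)]
    if np.max(np.abs(V_new - V)) < 1e-6: # the iterative value functions converge
        V = V_new
        break
    V = V_new

def control_law(x):
    """Greedy control extracted from the converged value function."""
    q = U(x, controls) + np.interp(f(x) + g(x) * controls, states, V)
    return controls[np.argmin(q)]

print("approximately optimal control at x = 0.5:", control_law(0.5))
```

With the value function initialised to zero and a nonnegative stage cost, the iterated value functions form a monotone sequence whose convergence, together with the stability of the resulting control laws, is exactly the kind of property the book analyzes; neural-network approximators replace the grid in the model-free chapters.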

Reviews

"The book contains various real-world examples to illustrate the developed mathematical analysis. Thus, it is a valuable and important guide for engineers, researchers, and students in systems, decision and control science." (Savin Treanta, zbMATH 1403.49002, 2019)

1 Principle of Adaptive Dynamic Programming
1(18)
1.1 Dynamic Programming
1(2)
1.1.1 Discrete-Time Systems
1(1)
1.1.2 Continuous-Time Systems
2(1)
1.2 Original Forms of Adaptive Dynamic Programming
3(6)
1.2.1 Principle of Adaptive Dynamic Programming
4(5)
1.3 Iterative Forms of Adaptive Dynamic Programming
9(2)
1.3.1 Value Iteration
9(1)
1.3.2 Policy Iteration
10(1)
1.4 About This Book
11(3)
References
14(5)
2 An Iterative ε-Optimal Control Scheme for a Class of Discrete-Time Nonlinear Systems with Unfixed Initial State
19(28)
2.1 Introduction
19(1)
2.2 Problem Statement
20(1)
2.3 Properties of the Iterative Adaptive Dynamic Programming Algorithm
21(7)
2.3.1 Derivation of the Iterative ADP Algorithm
21(2)
2.3.2 Properties of the Iterative ADP Algorithm
23(5)
2.4 The ε-Optimal Control Algorithm
28(9)
2.4.1 The Derivation of the ε-Optimal Control Algorithm
28(4)
2.4.2 Properties of the ε-Optimal Control Algorithm
32(2)
2.4.3 The ε-Optimal Control Algorithm for Unfixed Initial State
34(3)
2.4.4 The Expressions of the ε-Optimal Control Algorithm
37(1)
2.5 Neural Network Implementation for the ε-Optimal Control Scheme
37(3)
2.5.1 The Critic Network
38(1)
2.5.2 The Action Network
39(1)
2.6 Simulation Study
40(2)
2.7 Conclusions
42(1)
References
43(4)
3 Discrete-Time Optimal Control of Nonlinear Systems via Value Iteration-Based Q-Learning
47(38)
3.1 Introduction
47(2)
3.2 Preliminaries and Assumptions
49(3)
3.2.1 Problem Formulations
49(1)
3.2.2 Derivation of the Discrete-Time Q-Learning Algorithm
50(2)
3.3 Properties of the Discrete-Time Q-Learning Algorithm
52(12)
3.3.1 Non-Discount Case
52(7)
3.3.2 Discount Case
59(5)
3.4 Neural Network Implementation for the Discrete-Time Q-Learning Algorithm
64(6)
3.4.1 The Action Network
65(2)
3.4.2 The Critic Network
67(2)
3.4.3 Training Phase
69(1)
3.5 Simulation Study
70(11)
3.5.1 Example 1
70(6)
3.5.2 Example 2
76(5)
3.6 Conclusion
81(1)
References
82(3)
4 A Novel Policy Iteration-Based Deterministic Q-Learning for Discrete-Time Nonlinear Systems
85(26)
4.1 Introduction
85(1)
4.2 Problem Formulation
86(1)
4.3 Policy Iteration-Based Deterministic Q-Learning Algorithm for Discrete-Time Nonlinear Systems
87(6)
4.3.1 Derivation of the Policy Iteration-Based Deterministic Q-Learning Algorithm
87(2)
4.3.2 Properties of the Policy Iteration-Based Deterministic Q-Learning Algorithm
89(4)
4.4 Neural Network Implementation for the Policy Iteration-Based Deterministic Q-Learning Algorithm
93(4)
4.4.1 The Critic Network
93(2)
4.4.2 The Action Network
95(1)
4.4.3 Summary of the Policy Iteration-Based Deterministic Q-Learning Algorithm
96(1)
4.5 Simulation Study
97(10)
4.5.1 Example 1
97(3)
4.5.2 Example 2
100(7)
4.6 Conclusion
107(1)
References
107(4)
5 Nonlinear Neuro-Optimal Tracking Control via Stable Iterative Q-Learning Algorithm
111(22)
5.1 Introduction
111(1)
5.2 Problem Statement
112(2)
5.3 Policy Iteration Q-Learning Algorithm for Optimal Tracking Control
114(1)
5.4 Properties of the Policy Iteration Q-Learning Algorithm
114(5)
5.5 Neural Network Implementation for the Policy Iteration Q-Learning Algorithm
119(2)
5.5.1 The Critic Network
120(1)
5.5.2 The Action Network
120(1)
5.6 Simulation Study
121(8)
5.6.1 Example 1
122(3)
5.6.2 Example 2
125(4)
5.7 Conclusions
129(1)
References
129(4)
6 Model-Free Multiobjective Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems with General Performance Index Functions
133(26)
6.1 Introduction
133(1)
6.2 Preliminaries
134(1)
6.3 Multiobjective Adaptive Dynamic Programming Method
135(10)
6.4 Model-Free Incremental Q-Learning Method
145(2)
6.4.1 Derivation of the Incremental Q-Learning Method
145(2)
6.5 Neural Network Implementation for the Incremental Q-Learning Method
147(3)
6.5.1 The Critic Network
148(1)
6.5.2 The Action Network
149(1)
6.5.3 The Procedure of the Model-Free Incremental Q-Learning Method
150(1)
6.6 Convergence Proof
150(3)
6.7 Simulation Study
153(4)
6.7.1 Example 1
153(2)
6.7.2 Example 2
155(2)
6.8 Conclusion
157(1)
References
157(2)
7 Multiobjective Optimal Control for a Class of Unknown Nonlinear Systems Based on Finite-Approximation-Error ADP Algorithm
159(26)
7.1 Introduction
159(1)
7.2 General Formulation
160(2)
7.3 Optimal Solution Based on Finite-Approximation-Error ADP
162(11)
7.3.1 Data-Based Identifier of Unknown System Dynamics
162(4)
7.3.2 Derivation of the ADP Algorithm with Finite Approximation Errors
166(2)
7.3.3 Convergence Analysis of the Iterative ADP Algorithm
168(5)
7.4 Implementation of the Iterative ADP Algorithm
173(2)
7.4.1 Critic Network
174(1)
7.4.2 The Action Network
174(1)
7.4.3 The Procedure of the ADP Algorithm
175(1)
7.5 Simulation Study
175(7)
7.5.1 Example 1
176(3)
7.5.2 Example 2
179(3)
7.6 Conclusions
182(1)
References
182(3)
8 A New Approach for a Class of Continuous-Time Chaotic Systems Optimal Control by Online ADP Algorithm
185(16)
8.1 Introduction
185(1)
8.2 Problem Statement
185(2)
8.3 Optimal Control Based on Online ADP Algorithm
187(8)
8.3.1 Design Method of the Critic Network and the Action Network
188(3)
8.3.2 Stability Analysis
191(4)
8.3.3 Online ADP Algorithm Implementation
195(1)
8.4 Simulation Examples
195(4)
8.4.1 Example 1
196(1)
8.4.2 Example 2
197(2)
8.5 Conclusions
199(1)
References
200(1)
9 Off-Policy IRL Optimal Tracking Control for Continuous-Time Chaotic Systems
201(14)
9.1 Introduction
201(1)
9.2 System Description and Problem Statement
201(2)
9.3 Off-Policy IRL ADP Algorithm
203(6)
9.3.1 Convergence Analysis of IRL ADP Algorithm
204(2)
9.3.2 Off-Policy IRL Method
206(2)
9.3.3 Methods for Updating Weights
208(1)
9.4 Simulation Study
209(4)
9.4.1 Example 1
209(2)
9.4.2 Example 2
211(2)
9.5 Conclusion
213(1)
References
213(2)
10 ADP-Based Optimal Sensor Scheduling for Target Tracking in Energy Harvesting Wireless Sensor Networks
215(1)
10.1 Introduction
215(1)
10.2 Problem Formulation
216(3)
10.2.1 NN Model Description of Solar Energy Harvesting
216(1)
10.2.2 Sensor Energy Consumption
217(1)
10.2.3 KF Technology
218(1)
10.3 ADP-Based Sensor Scheduling for Maximum WSNs Residual Energy and Minimum Measuring Accuracy
219(5)
10.3.1 Optimization Problem of the Sensor Scheduling
219(1)
10.3.2 ADP-Based Sensor Scheduling with Convergence Analysis
220(3)
10.3.3 Critic Network
223(1)
10.3.4 Implementation Process
224(1)
10.4 Simulation Study
224(2)
10.5 Conclusion
226(1)
References
227
Erratum to: Self-Learning Optimal Control of Nonlinear Systems 1(228)
Index 229
Qinglai Wei received his B.S. degree in Automation and Ph.D. degree in Control Theory and Control Engineering from Northeastern University, Shenyang, China, in 2002 and 2009, respectively. From 2009 to 2011, he was a postdoctoral fellow with The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China, where he is currently a professor. He has authored one book and published over 70 international journal papers. His research interests include adaptive dynamic programming, neural-network-based control, optimal control, nonlinear systems, and their industrial applications.

Dr. Wei is an associate editor of IEEE Transactions on Systems, Man, and Cybernetics: Systems, Information Sciences, Neurocomputing, Optimal Control Applications and Methods, and Acta Automatica Sinica, and held the same position for IEEE Transactions on Neural Networks and Learning Systems from 2014 to 2015. He has been the secretary of the IEEE Computational Intelligence Society (CIS) Beijing Chapter since 2015. He was registration chair of the 12th World Congress on Intelligent Control and Automation (WCICA 2016), the IEEE World Congress on Computational Intelligence (WCCI 2014), the International Conference on Brain Inspired Cognitive Systems (BICS 2013), and the 8th International Symposium on Neural Networks (ISNN 2011). He was the publication chair of the 5th International Conference on Information Science and Technology (ICIST 2015) and the 9th International Symposium on Neural Networks (ISNN 2012). He was the finance chair of the 4th International Conference on Intelligent Control and Information Processing (ICICIP 2013) and the publicity chair of the International Conference on Brain Inspired Cognitive Systems (BICS 2012). He has been a guest editor for several international journals. He received the Acta Automatica Sinica Outstanding Paper Award in 2011, the Chinese Control and Decision Conference (CCDC) Zhang Siying Outstanding Paper Award in 2015, and the Young Researcher Award of the Asia Pacific Neural Network Society (APNNS) in 2016.

Ruizhuo Song received her Ph.D. degree in Control Theory and Control Engineering from Northeastern University, Shenyang, China, in 2012. She is currently an associate professor at the School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China. Her research interests include optimal control, neural-network-based control, nonlinear control, wireless sensor networks, adaptive dynamic programming, and their industrial applications.