Decentralized tracking control design based on intelligent critic for an interconnected spring-mass-damper system
Abstract
In this paper, the decentralized tracking control (DTC) problem is investigated for a class of continuous-time nonlinear systems with external disturbances. First, the DTC problem is resolved by converting it into the design of optimal tracking controllers for augmented tracking isolated subsystems (ATISs). A cost function with a discount factor is taken into consideration. Then, in the presence of external disturbances, the DTC scheme is effectively constructed by adding an appropriate feedback gain to each ATIS. In addition, utilizing the approximation property of neural networks, a critic network is constructed to solve the Hamilton-Jacobi-Isaacs (HJI) equation, which yields the optimal tracking control law and the worst-case disturbance law. Moreover, the updating rule is improved during the weight learning process, which removes the requirement for an initial admissible control. Finally, a simulation example on an interconnected spring-mass-damper system is given to verify the effectiveness of the DTC scheme.
1. INTRODUCTION
For large-scale nonlinear interconnected systems, which are regarded as nonlinear plants composed of many interconnected subsystems, decentralized control has become a research hotspot in the last few decades [1–4]. Compared with centralized control, decentralized control has the advantages of simplifying the control structure and reducing the computational burden of the controller. Besides, each local controller depends only on the information of its local subsystem. Meanwhile, with the development of science and technology, interconnected engineering applications such as robotic systems [5] and power systems [6, 7] have become increasingly complex. In [8–10], it was shown that the decentralized control of a large-scale system is connected with the optimal control of its isolated subsystems, which means that optimal control methods [11–14] can be adopted to achieve the design purpose of decentralized controllers. However, the optimal control of nonlinear systems often requires solving the Hamilton-Jacobi-Bellman (HJB) or Hamilton-Jacobi-Isaacs (HJI) equation, which can be accomplished using the adaptive dynamic programming (ADP) method [15, 16]. Besides, in [13], Wang et al. investigated the latest intelligent critic framework for advanced optimal control. In [14], the optimal feedback stabilization problem was discussed with a discounted guaranteed cost for nonlinear systems. It follows that the interconnection plays a significant role in controller design; accordingly, control schemes can be classified as decentralized or distributed. There is a certain distinction between the two: in decentralized control, each sub-controller uses only local information, and the interconnections among subsystems are assumed to be weak in nature.
Compared with decentralized control, distributed control [17–19] can be introduced to improve the performance of the subsystems when the interconnections among them become strong. In [20], a distributed optimal observer was devised to estimate the nonlinear leader's state for all followers. In [21], distributed control was developed by means of online reinforcement learning for interconnected systems with exploration.
It is worth mentioning that the ADP algorithm has been extensively employed for dealing with various optimal regulation and tracking problems [22–24], where the goal is to make the actual signal track the reference signal under noisy and uncertain environments. In [25], Ha et al. proposed a novel cost function to explore the evaluation framework of the optimal tracking control problem. Furthermore, for complicated control systems, it is necessary to consider decentralized tracking control (DTC) problems [26–29]. The DTC systems can be transformed into nominal augmented tracking isolated subsystems (ATISs), which are composed of the tracking error and the reference signal. In [26], Qu et al. proposed a novel formulation of the DTC strategy consisting of a steady-state controller and a modified optimal feedback controller. Besides, asymptotic DTC was realized by introducing two integral bounded functions in [27]. In [28], Liu et al. proposed a finite-time DTC method for a class of nonstrict-feedback interconnected systems with disturbances. Moreover, an adaptive fuzzy output-feedback DTC design was investigated for switched large-scale systems in [29].
Game theory is a discipline that studies the interaction of corresponding strategies. It contains cooperative and noncooperative types, that is, zero-sum (ZS) games and non-ZS games. In particular, ZS games have been widely applied in many fields [30–33]. The objective of the ZS game is to derive the Nash equilibrium of nonlinear systems, which optimizes the cost function. In [31], the finite-horizon H-infinity state estimator design was studied for periodic neural networks over multiple fading channels. The noncooperative control problem was formulated as a two-player ZS game in [32]. In [33], Wang et al. investigated the stability of the general value iteration algorithm for ZS games. At the same time, we can also combine the ZS game problem with the tracking problem to make the system more stable while achieving trajectory tracking. In [34], Zhang et al. developed an online model-free integral reinforcement learning algorithm for solving the H-infinity optimal tracking problem for completely unknown systems. In [35], a general bounded L2-gain tracking problem was addressed for completely unknown continuous-time systems via off-policy reinforcement learning.
As can be seen from the above, there are few studies that combine the DTC problem with the ZS game problem. It is necessary to take a related discounted cost function into account for the DTC system, which can transform the DTC problem into an optimal control problem with disturbances. In practice, the existence of disturbances will exert an unpredictable impact on the plant. Hence, it is of vital importance to consider the stability of the DTC system. In the experimental simulation, it is a challenge to achieve effective online weight training, which is implemented under the tracking control law and the disturbance control law. Consequently, in this paper, we put forward a novel ADP-based method to resolve the DTC problem with external disturbances for continuous-time (CT) nonlinear systems. More importantly, for the sake of overcoming the difficulty of selecting initial admissible control policies, an additional term is added during the weight updating process. Remarkably, in this paper, we introduce the discount factor for maximizing and minimizing the corresponding cost function.
The contributions of this paper are as follows. First, considering the disturbance input in the DTC system, the strategy feasibility and the system stability are discussed through theoretical proofs. It is worth noting that the discount factor is introduced into the cost function. Moreover, in the process of online weight training, we can make the DTC system reach a stable state without selecting an initial admissible control law. Additionally, we present the experimental process on the spring-mass-damper system. Besides, we derive the desired tracking error curves as well as control strategy curves, which demonstrates that they are uniformly ultimately bounded (UUB).
The whole paper is divided into six sections. The first section introduces the relevant background knowledge of the research content. The second section states the basic problems of the two-player ZS game and the DTC strategy. In the third section, we design the decentralized tracking controller by using the optimal control method through solving the HJI equations. Meanwhile, the relevant lemma and theorem are given to validate the establishment of the DTC strategy. In the fourth section, the design method based on the adaptive critic is elaborated. Most importantly, an improved critic learning rule is implemented via critic networks. In the fifth section, the practicability of this method is validated on an interconnected spring-mass-damper system. Finally, the sixth section presents conclusions and summarizes the overall research content of the whole paper.
2. PROBLEM STATEMENT
Consider a CT nonlinear interconnected system with disturbances, which is composed of
where
where
In this paper, considering the nonlinear system Equation (1), a reference system is introduced as follows:
where
Noticing
where
We aim to design a pair of decentralized control policies
3. DTC DESIGN VIA OPTIMAL REGULATION
3.1. Optimal control and the HJI equations
In this section, the optimal DTC strategy of the ATIS with disturbance rejection is elaborated. It is addressed by solving the HJI equation with a discounted cost function. Then, we consider the nominal part of the augmented system Equation (5) as
We assume that
where
If Equation (11) is continuously differentiable, the nonlinear Lyapunov equation is the infinitely small form of Equation (11). The Lyapunov equation is as follows:
Define the Hamiltonian of the ith ATIS for the optimization problem as
To acquire the saddle point solution
Then, the optimal cost function
Due to the saddle point solution
Substituting the optimal tracking control strategy Equation (16) into Equation (15), the HJI equation for the
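For reference, a common form of this discounted zero-sum tracking design can be sketched as follows; the symbols used here ($z_i$, $F_i$, $G_i$, $K_i$, $Q_i$, $R_i$, $\gamma_i$, $\lambda_i$) are illustrative assumptions rather than the paper's exact notation:

```latex
% Augmented state of the i-th subsystem (tracking error stacked with the
% reference), driven by the control u_i and the disturbance v_i:
\dot{z}_i = F_i(z_i) + G_i(z_i)\,u_i + K_i(z_i)\,v_i,
\qquad
J_i(z_i) = \int_{t}^{\infty} e^{-\lambda_i(\tau-t)}
\bigl( z_i^{\top}Q_i z_i + u_i^{\top}R_i u_i
      - \gamma_i^{2}\, v_i^{\top} v_i \bigr)\,\mathrm{d}\tau .

% Stationarity of the Hamiltonian gives the saddle-point policies
u_i^{*} = -\tfrac{1}{2}R_i^{-1}G_i^{\top}\nabla J_i^{*},
\qquad
v_i^{*} = \tfrac{1}{2\gamma_i^{2}}K_i^{\top}\nabla J_i^{*},

% and substituting both policies back yields the discounted HJI equation
0 = z_i^{\top}Q_i z_i - \lambda_i J_i^{*}
  + (\nabla J_i^{*})^{\top}F_i
  - \tfrac{1}{4}(\nabla J_i^{*})^{\top}G_i R_i^{-1}G_i^{\top}\nabla J_i^{*}
  + \tfrac{1}{4\gamma_i^{2}}(\nabla J_i^{*})^{\top}K_i K_i^{\top}\nabla J_i^{*}.
```

Note the discount term $-\lambda_i J_i^{*}$: it is exactly the extra term produced by differentiating the exponential weight in the cost, and it is why the discounted case requires its own stability analysis.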
3.2. Establishment of the DTC strategy design
In the following, we present the DTC strategy by adding the feedback gain to the interconnected system Equation (5). Herein, the following lemma is given.
Lemma 1 Considering the ATIS Equation (9), the feedback control
can ensure the
Proof. The lemma can be proved by showing
Substituting Equations (18) and (19) into Equation (20), we can rewrite it as
Observing Equation (21), we can obtain that
Remark 1. It is worth mentioning that only when
Theorem 1 Taking Equation (2) and the interconnected augmented tracking system Equation (5) into account, there exist
Proof. Inspired by Lemma 1, we observe that
where
Considering Equation (2), the mentioned inequality
Herein, in order to transform Equation (24) to the compact form, we denote
Therefore, we introduce a 2
Next, Equation (24) can be transformed to the following compact form:
According to Equation (29), it can be concluded that when
Obviously, the key point of designing the DTC strategy is to obtain the optimal controller of the ATIS based on Theorem 1. Next, for the sake of getting hold of optimal controllers for the
4. OPTIMAL DTC DESIGN VIA NEURAL NETWORKS
4.1. Implementation procedure via neural networks
In this section, we show the process of finding the approximate optimal solution by employing the neural-network-based ADP method. The critic networks have the capability of approximating nonlinear mappings, and the approximate cost function can be derived for the DTC system. Hence,
where
Considering Equation (16), the optimal control policy for the
Utilizing Equations (31) and (32), the Hamiltonian associated with the
where
where
Based on Equation (35), we obtain the estimated value of
Considering Equations (34-36), the approximate Hamiltonian is expressed as
Then, we obtain an error function of the Hamiltonian, which is denoted as
where
where
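As a rough, self-contained sketch of this critic tuning idea, the cost function is approximated as $\hat{J}_i(z) \approx \hat{w}^{\top}\sigma(z)$ and the weights descend the squared Hamiltonian residual. The quadratic basis, the single training sample, and all parameter values below are illustrative assumptions, not the paper's design:

```python
import numpy as np

# Critic approximation J(z) ~= w^T sigma(z); sigma is a quadratic feature
# vector here, an illustrative choice rather than the paper's exact basis.
def sigma(z):
    z1, z2 = z
    return np.array([z1 * z1, z1 * z2, z2 * z2])

def grad_sigma(z):
    # Jacobian d sigma / d z, shape (3, 2).
    z1, z2 = z
    return np.array([[2 * z1, 0.0],
                     [z2, z1],
                     [0.0, 2 * z2]])

def hamiltonian_residual(w, z, z_dot, utility, lam):
    # e_H = U - lam * w^T sigma(z) + w^T (d sigma / d z) z_dot,
    # i.e. the discounted Hamiltonian evaluated with the current weights.
    return utility - lam * w @ sigma(z) + w @ (grad_sigma(z) @ z_dot)

def critic_step(w, z, z_dot, utility, lam, alpha=0.1):
    # Normalized gradient descent on 0.5 * e_H^2, the usual critic rule.
    e = hamiltonian_residual(w, z, z_dot, utility, lam)
    phi = grad_sigma(z) @ z_dot - lam * sigma(z)   # d e_H / d w
    return w - alpha * e * phi / (1.0 + phi @ phi) ** 2

# Drive the residual toward zero on one sample state.
w = np.zeros(3)
z = np.array([0.5, -0.2])
z_dot = np.array([-0.1, 0.3])
for _ in range(200):
    w = critic_step(w, z, z_dot, utility=z @ z, lam=0.1)
print(w)
```

Starting from zero weights, the residual shrinks monotonically on this sample; in the full scheme the state evolves under the current policies, so the residual is evaluated along the trajectory rather than at a fixed point.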
Usually, in the traditional weight training process, it is necessary to select an appropriate initial weight vector for effective training. To eliminate the requirement of an initial admissible control law, an improved critic learning rule is presented in the following.
4.2. Improved critic learning rule via neural networks
Herein, an additional Lyapunov function is introduced for the purpose of improving the critic learning mechanism. Then, the following reasonable assumption is given.
Assumption 1 Consider the dynamic of the
In other words, there exists a positive definite matrix
where
Remark 2. Herein, the motivation of selecting the cost function
When the condition occurs, that is,
Thus, we describe the improved learning rule as
where
It is found that when the derivative of
In accordance with
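The structure of such an improved rule can be sketched as follows: the standard normalized gradient step is kept, and an additional stabilizing term is switched on only when a Lyapunov-based condition signals instability. The function names, gains, and the shape of the extra term are illustrative assumptions, not the paper's exact expressions:

```python
import numpy as np

def improved_step(w, phi, e_H, extra_grad, unstable, alpha=0.1, alpha_s=0.05):
    """One critic weight update.

    w          -- current critic weights
    phi        -- gradient of the Hamiltonian residual w.r.t. the weights
    e_H        -- current Hamiltonian residual
    extra_grad -- stabilizing direction from the additional Lyapunov function
    unstable   -- True when the Lyapunov derivative test fails
    """
    # Standard term: normalized gradient descent on 0.5 * e_H^2.
    dw = -alpha * e_H * phi / (1.0 + phi @ phi) ** 2
    # Additional term: active only when instability is detected, steering the
    # weights toward a stabilizing region. This is what removes the need to
    # start from an admissible (stabilizing) initial control law.
    if unstable:
        dw += alpha_s * extra_grad
    return w + dw

w = np.zeros(3)
phi = np.array([1.0, -2.0, 0.5])
extra = np.array([0.1, 0.0, -0.1])
print(improved_step(w, phi, e_H=0.3, extra_grad=extra, unstable=True))
```

When the closed loop is already stable, the rule reduces to the conventional one, so nothing is lost in the nominal case.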
5. SIMULATION EXPERIMENT
In this section, we introduce a common mechanical vibration system, that is, the spring-mass-damper system. The structural diagram of the mechanical system is shown in Figure 2. From it,
In addition, let
For the object
and
where
where
and
Based on the online ADP algorithm, two critic networks are constructed as follows:
and
During the online learning process, we take basic learning rates and additional learning rates as
Herein, two probing noises are added within the first 400 steps to maintain the persistence of excitation condition of the ATIS. The weight convergence curves are shown in Figure 3. It can be seen that the weights have converged to certain values before the excitation condition is turned off, which confirms the validity of the improved weight update algorithm. From it, we find that the initial weights are selected as zero, which indicates that the requirement of an initial admissible control is eliminated.
Next, in order to make the system achieve the purpose of the optimal tracking, feedback gains are selected as
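As a hedged illustration only, the experiment can be mimicked by a minimal simulation of two interconnected spring-mass-damper subsystems. All masses, stiffnesses, gains, references, and disturbances below are assumed values, and the simple feedback tracking law stands in for the learned optimal control law:

```python
import numpy as np

# Illustrative spring-mass-damper parameters (assumed, not the paper's values).
m1, k1, c1 = 1.0, 2.0, 0.5   # mass, spring, damper of subsystem 1
m2, k2, c2 = 1.5, 1.0, 0.8   # mass, spring, damper of subsystem 2
kc = 0.3                     # weak spring interconnection between the masses

def step(state, u1, u2, d1, d2, dt=0.001):
    # One forward-Euler step of the coupled dynamics m*x'' = -k*x - c*x' + ...
    x1, v1, x2, v2 = state
    a1 = (-k1 * x1 - c1 * v1 + kc * (x2 - x1) + u1 + d1) / m1
    a2 = (-k2 * x2 - c2 * v2 + kc * (x1 - x2) + u2 + d2) / m2
    return np.array([x1 + dt * v1, v1 + dt * a1, x2 + dt * v2, v2 + dt * a2])

def control(x, v, r, r_dot, kp=25.0, kd=10.0):
    # Local feedback tracking law (a stand-in for the learned optimal law).
    return kp * (r - x) + kd * (r_dot - v)

state = np.zeros(4)
dt, T = 0.001, 10.0
errs = []
for i in range(int(T / dt)):
    t = i * dt
    r1, r1d = np.sin(t), np.cos(t)            # reference trajectory, mass 1
    r2, r2d = 0.5 * np.cos(t), -0.5 * np.sin(t)  # reference trajectory, mass 2
    d1 = 0.2 * np.sin(5 * t)                  # bounded external disturbances
    d2 = 0.1 * np.cos(3 * t)
    u1 = control(state[0], state[1], r1, r1d)
    u2 = control(state[2], state[3], r2, r2d)
    state = step(state, u1, u2, d1, d2, dt)
    errs.append((state[0] - r1, state[2] - r2))

final_errs = np.abs(np.array(errs[-1000:]))
print(final_errs.max())   # tracking errors remain small and bounded
```

Under these assumed gains the tracking errors settle into a small neighborhood of zero despite the persistent disturbances, which is the UUB-type behavior the theoretical results describe.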
6. CONCLUSION
In this paper, the optimal DTC strategy for CT nonlinear large-scale systems with external disturbances is proposed by employing the ADP algorithm. The approximate optimal control laws of the ATISs can achieve the trajectory tracking goal. Then, the establishment of the DTC strategy is derived by adding the appropriate feedback gains, whose feasibility has been proved via Lyapunov theory. Note that all the above-mentioned results are obtained by considering a cost function with a discount. Then, only a series of single critic networks are employed to solve the HJI equations of the ATISs.
DECLARATIONS
Authors' contributions
Made significant contributions to the conception and experiments: Fan W, Wang D
Made significant contributions to the writing: Fan W, Wang D
Made substantial contributions to the revision and translation: Liu A, Wang D
Availability of data and materials
Not applicable.
Financial support and sponsorship
This work was supported in part by the National Natural Science Foundation of China (No. 62222301; No. 61890930-5 and No. 62021003); in part by the National Key Research and Development Program of China (No. 2021ZD0112302; No. 2021ZD0112301 and No. 2018YFC1900800-5); and in part by the Beijing Natural Science Foundation (No. JQ19013).
Conflicts of interest
All authors declared that there are no conflicts of interest.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Copyright
© The Author(s) 2023.
REFERENCES
1. Saberi A. On optimality of decentralized control for a class of nonlinear interconnected systems. Automatica 1988;24:1101-4.
2. Mu CX, Sun CY, Wang D, Song AG, Qian CS. Decentralized adaptive optimal stabilization of nonlinear systems with matched interconnections. Soft Comput 2018;22:82705-15.
3. Mehraeen S, Jagannathan S. Decentralized optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Jacobi-Bellman formulation. IEEE Trans Neural Netw 2011;22:111715-96.
4. Yang X, He HB. Adaptive dynamic programming for decentralized stabilization of uncertain nonlinear large-scale systems with mismatched interconnections. IEEE Trans Syst Man Cybern Syst 2020;50:82870-82.
6. Xu Q, Yu C, Yuan X, Fu Z, Liu H. A distributed electricity energy trading strategy under energy shortage environment. Complex Eng Syst 2022;2:14.
7. Bian T, Jiang Y, Jiang ZP. Decentralized adaptive optimal control of large-scale systems with application to power systems. IEEE Trans Ind Electron 2015;62:42439-47.
8. Liu DR, Wang D, Li HL. Decentralized stabilization for a class of continuous-time nonlinear interconnected systems using online learning optimal control approach. IEEE Trans Neural Netw Learn Syst 2014;25:2418-28.
9. Sun KK, Sui S, Tong SC. Fuzzy adaptive decentralized optimal control for strict feedback nonlinear large-scale systems. IEEE Trans Cybern 2018;48:41326-39.
10. Wang XM, Feng ZG, Zhang GJ, Niu B, Yang D, et al. Adaptive decentralised control for large-scale non-linear non-strict-feedback interconnected systems with time-varying asymmetric output constraints and dead-zone inputs. IET Control Theory & Appl 2020;14:203417-27.
11. Wei QL, Liu DR, Lin Q, Song RZ. Discrete-time optimal control via local policy iteration adaptive dynamic programming. IEEE Trans Cybern 2017;47:103367-79.
12. Wang D, Ren J, Ha MM, Qiao JF. System stability of learning-based linear optimal control with general discounted value iteration. IEEE Trans Neural Netw Learn Syst 2022:Online ahead of print.
13. Wang D, Ha MM, Zhao MM. The intelligent critic framework for advanced optimal control. Artif Intell Rev 2022;55:11-22.
14. Wang D, Qiao JF, Cheng L. An approximate neuro-optimal solution of discounted guaranteed cost control design. IEEE Trans Cybern 2022;52:177-86.
15. Li YM, Liu YJ, Tong SC. Observer-based neuro-adaptive optimized control of strict-feedback nonlinear systems with state constraints. IEEE Trans Neural Netw Learn Syst 2022;33:73131-45.
16. Wang H, Yang CY, Liu XM, Zhou LN. Neural-network-based adaptive control of uncertain MIMO singularly perturbed systems with full-state constraints. IEEE Trans Neural Netw Learn Syst 2021; doi: 10.1109/TNNLS.2021.3123361.
17. Zhang H, Hong QQ, Yan HC, Yang FW, Guo G. Event-based distributed H-infinity filtering networks of 2-DOF quarter-car suspension systems. IEEE Trans Ind Inform 2017;13:1312-21.
18. Chen YG, Fei SM, Li YM. Robust stabilization for uncertain saturated time-delay systems: A distributed-delay-dependent polytopic approach. IEEE Trans Automat Contr 2017;62:73455-60.
19. Chen YG, Wang ZD. Local stabilization for discrete-time systems with distributed state delay and fast-varying input delay under actuator saturations. IEEE Trans Automat Contr 2021;66:31337-44.
20. Fu H, Chen X, Wu M. Distributed optimal observer design of networked systems via adaptive critic design. IEEE Trans Syst Man Cybern Syst 2021;51:116976-85.
21. Narayanan V, Jagannathan S. Event-triggered distributed control of nonlinear interconnected systems using online reinforcement learning with exploration. IEEE Trans Cybern 2018;48:92510-9.
22. Wang D, Zhao MM, Ha MM, Qiao JF. Intelligent optimal tracking with application verifications via discounted generalized value iteration. Acta Automatica Sinica 2022;48:1182-93.
23. Zhang HG, Zhang K, Cai YL, Han J. Adaptive fuzzy fault-tolerant tracking control for partially unknown systems with actuator faults via integral reinforcement learning method. IEEE Trans Fuzzy Syst 2019;27:101986-98.
24. Modares H, Lewis FL. Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning. Automatica 2014;50:71780-1792.
25. Ha MM, Wang D, Liu DR. Discounted iterative adaptive critic designs with novel stability analysis for tracking control. IEEE/CAA Journal of Automatica Sinica 2022;9:71262-1272.
26. Qu QX, Zhang HG, Feng T, Jiang H. Decentralized adaptive tracking control scheme for nonlinear large-scale interconnected systems via adaptive dynamic programming. Neurocomputing 2017;225:1-10.
27. Niu B, Liu JD, Wang D, Zhao XD, Wang HQ. Adaptive decentralized asymptotic tracking control for large-scale nonlinear systems with unknown strong interconnections. IEEE/CAA Journal of Automatica Sinica 2022;9:1173-86.
28. Liu JD, Niu B, Kao YG, Zhao P, Yang D. Decentralized adaptive command filtered neural tracking control of large-scale nonlinear systems: An almost fast finite-time framework. IEEE Trans Neural Netw Learn Syst 2021;32:83621-2.
29. Tong SC, Zhang LL, Li YM. Observed-based adaptive fuzzy decentralized tracking control for switched uncertain nonlinear large-scale systems with dead zones. IEEE Trans Syst Man Cybern Syst 2016;46:137-47.
30. Wang D, Hu LZ, Zhao MM, Qiao JF. Dual event-triggered constrained control through adaptive critic for discrete-time zero-sum games. IEEE Trans Syst Man Cybern Syst 2023;53:31584-9.
31. Li XM, Zhang B, Li PS, Zhou Q, Lu RQ. Finite-horizon H-infinity state estimation for periodic neural networks over fading channels. IEEE Trans Neural Netw Learn Syst 2020;31:51450-60.
32. Duan JJ, Xu H, Liu WX, Peng JC, Jiang H. Zero-sum game based cooperative control for onboard pulsed power load accommodation. IEEE Trans Ind Inform 2020;16:1238-47.
33. Wang D, Zhao MM, Ha MM, Qiao JF. Stability and admissibility analysis for zero-sum games under general value iteration formulation. IEEE Trans Neural Netw Learn Syst 2022; doi: 10.1109/TNNLS.2022.3152268.
34. Zhang HG, Cui XH, Luo YH, Jiang H. Finite-horizon H-infinity tracking control for unknown nonlinear systems with saturating actuators. IEEE Trans Neural Netw Learn Syst 2018;29:41200-12.
35. Modares H, Lewis FL, Jiang ZP. H-infinity tracking control of completely unknown continuous-time systems via off-policy reinforcement learning. IEEE Trans Neural Netw Learn Syst 2015;26:102550-62.
Cite This Article
How to Cite
Fan, W.; Liu A.; Wang D. Decentralized tracking control design based on intelligent critic for an interconnected spring-mass-damper system. Complex Eng. Syst. 2023, 3, 5. http://dx.doi.org/10.20517/ces.2023.04