REFERENCES

1. Aseltine J, Mancini A, Sarture C. A survey of adaptive control systems. IRE Trans Automat Contr 1958;6:102-8.

2. Stromer PR. Adaptive or self-optimizing control systems - a bibliography. IRE Trans Automat Contr 1959;AC-4:65-8.

3. Mishkin E, Ludwig BJ. Adaptive control systems. 1st ed. New York: McGraw-Hill; 1961.

4. Truxal JG. Adaptive control. IFAC Proceedings Volumes 1963;1:386-92.

5. Eveleigh VW. Adaptive control and optimization technique. 1st ed. New York: McGraw-Hill; 1967.

6. Wittenmark B. Stochastic adaptive control methods: a survey. Int J Control 1975;21:705-30.

7. Åström K, Borisson U, Ljung L, Wittenmark B. Theory and applications of self-tuning regulators. Automatica 1977;13:457-76.

8. Åström K. Theory and applications of adaptive control - a survey. Automatica 1983;19:471-86.

9. Jamali H. Adaptive control methods for mechanical manipulators: a comparative study. Monterey, CA: Naval Postgraduate School; 1989.

10. de Mathelin M, Lozano R. Robust adaptive identification of slowly time-varying parameters with bounded disturbances. Automatica 1999;35:1291-305.

11. Deisenroth MP, Rasmussen CE. PILCO: a model-based and data-efficient approach to policy search. Proceedings of the 28th International Conference on Machine Learning; 2011 Jun; Madison, WI, USA. 2011. p. 465-72.

12. Wang LY, Zhang JF. Fundamental limitations and differences of robust and adaptive control. Proceedings of the 2001 American Control Conference. (Cat. No.01CH37148); 2001 Jun 25-27; Arlington, VA, USA. IEEE; 2001. p. 4802-7.

13. Ioannou PA, Sun J. Robust adaptive control. Mineola, NY: Courier Corporation; 2012.

14. Lavretsky E. Adaptive output feedback design using asymptotic properties of LQG/LTR controllers. IEEE Trans Automat Contr 2012;57:1587-91.

15. Sastry S, Bodson M. Adaptive control: stability, convergence and robustness. Mineola, NY: Dover Publications; 2011.

16. Larminat P. On overall stability of certain adaptive control systems. IFAC Proceedings Volumes 1979;12:1153-9.

17. Narendra K, Lin YH. Stable discrete adaptive control. IEEE Trans Automat Contr 1980;25:456-61.

18. Peterson B, Narendra K. Bounded error adaptive control. IEEE Trans Automat Contr 1982;27:1161-8.

19. Fuchs J. Discrete adaptive control: a sufficient condition for stability and applications. IEEE Trans Automat Contr 1980;25:940-6.

20. Goodwin G, Ramadge P, Caines P. Discrete-time multivariable adaptive control. IEEE Trans Automat Contr 1980;25:449-56.

21. Egardt B. Global stability analysis of adaptive control systems with disturbances. Proceedings of the 1980 Joint Automatic Control Conference; San Francisco, CA, USA. 1980.

22. Rohrs CE, Valavani L, Athans M, Stein G. Robustness of adaptive control algorithms in the presence of unmodeled dynamics. 1982 21st IEEE Conference on Decision and Control; 1982 Dec 8-10; Orlando, FL, USA. IEEE; 1982. p. 3-11.

23. Åström KJ. Analysis of Rohrs counterexamples to adaptive control. The 22nd IEEE Conference on Decision and Control; 1983 Dec; San Antonio, TX, USA. 1983. p. 982-7.

24. Riedle B, Cyr B, Kokotovic P. Disturbance instabilities in an adaptive system. IEEE Trans Automat Contr 1984;29:822-4.

25. Ioannou P, Kokotovic P. Instability analysis and improvement of robustness of adaptive control. Automatica 1984;20:583-94.

26. Egardt B. Stability of adaptive controllers. Berlin Heidelberg: Springer; 1979.

27. Kreisselmeier G, Narendra K. Stable model reference adaptive control in the presence of bounded disturbances. IEEE Trans Automat Contr 1982;27:1169-75.

28. Samson C. Stability analysis of adaptively controlled systems subject to bounded disturbances. Automatica 1983;19:81-6.

29. Ioannou PA, Kokotovic PV. Adaptive systems with reduced models. New York, NY, USA: Springer-Verlag; 1983.

30. Peterson B, Narendra K. Bounded error adaptive control. IEEE Trans Automat Contr 1982;27:1161-8.

31. Narendra K, Annaswamy A. Robust adaptive control in the presence of bounded disturbances. IEEE Trans Automat Contr 1986;31:306-15.

32. Slotine JJE, Li W. Applied nonlinear control. Englewood Cliffs, NJ: Prentice Hall; 1991.

33. Bunich AL. Rapidly converging algorithm for the identification of a linear system with limited noise. Autom Remote Control 1983;44:1049-54.

34. Sastry SS. Model-reference adaptive control - stability, parameter convergence, and robustness. IMA J Math Control Info 1984;1:27-66.

35. Slotine JJE, Coetsee JA. Adaptive sliding controller synthesis for non-linear systems. Int J Control 1986;43:1631-51.

36. Adaptive control in the presence of disturbances. In: Ioannou PA, Kokotovic PV, editors. Adaptive systems with reduced models. Berlin/Heidelberg: Springer-Verlag; 1983. p. 81-90.

37. Ioannou P, Tsakalis K. A robust direct adaptive controller. IEEE Trans Automat Contr 1986;31:1033-43.

38. Ioannou P. Robust adaptive controller with zero residual tracking errors. IEEE Trans Automat Contr 1986;31:773-6.

39. Ioannou P. Robust direct adaptive control. The 23rd IEEE Conference on Decision and Control; 1984 Dec 12-14; Las Vegas, NV, USA. IEEE; 1984. p. 1015-20.

40. Tsakalis KS. The σ-modification in the adaptive control of linear time-varying plants. Proceedings of the 31st IEEE Conference on Decision and Control; 1992 Dec 16-18; Tucson, AZ, USA. IEEE; 1992. p. 694-8.

41. He Z, Huang D, Xu J. On the asymptotic property analysis for a class of adaptive control systems with σ-modification. Int J Adapt Control Signal Process 2013;27:620-34.

42. Li MY, Muldowney JS. A geometric approach to global-stability problems. SIAM J Math Anal 1996;27:14.

43. Narendra K, Annaswamy A. A new adaptive law for robust adaptation without persistent excitation. IEEE Trans Automat Contr 1987;32:134-45.

44. LaSalle J. Some extensions of Liapunov’s second method. IRE Trans Circuit Theory 1960;7:520-7.

45. Mattern DL. Practical applications and limitations of adaptive control. Available from: http://www.proquest.com/docview/303617884/abstract/FC4A275C8474474PQ/1 [Last accessed on 8 Mar 2022].

46. Kreisselmeier G, Anderson B. Robust model reference adaptive control. IEEE Trans Automat Contr 1986;31:127-33.

47. Davidson JM. Model reference adaptive control specification for a steam heated finned tube heat exchanger. Available from: https://www.proquest.com/docview/302770965/citation/9192D8E407D24AFBPQ/1 [Last accessed on 8 Mar 2022].

48. Davison E, Taylor P, Wright J. On the application of tuning regulators to control a commercial heat exchanger. IEEE Trans Automat Contr 1980;25:361-75.

49. Harrell RC, Kranzler GA, Hsu CS. Adaptive control of the fluid heat exchange process. J Dyn Syst Meas Control 1987;109:49-52.

50. Zhang Q, Tomizuka M. Multivariable direct adaptive control of thermal mixing processes. J Dyn Syst Meas Control 1985;107:278-83.

51. Lukas MP, Kaya A. Adaptive control of a heat exchanger using function blocks. Chem Eng Commun 2007;24:259-73.

52. Harris CJ, Billings SA. Self-tuning and adaptive control - theory and applications. 1st ed. London: Peter Peregrinus, Ltd; 1981.

53. Dubowsky S, Desforges DT. The application of model-referenced adaptive control to robotic manipulators. J Dyn Syst Meas Control 1979;101:193-200.

54. Dubowsky S. On the adaptive control of robotic manipulators: the discrete-time case. IEEE Trans Automat Contr 1981; doi: 10.1109/JACC.1981.4232298.

55. Nicosia S, Tomei P. Model reference adaptive control algorithms for industrial robots. Automatica 1984;20:635-44.

56. Koivo A, Guo TH. Adaptive linear controller for robotic manipulators. IEEE Trans Automat Contr 1983;28:162-71.

57. Horowitz R, Tomizuka M. An adaptive control scheme for mechanical manipulators - compensation of nonlinearity and decoupling control. J Dyn Syst Meas Control 1986;108:127-35.

58. Narendra KS, Parthasarathy K. Adaptive identification and control of dynamical systems using neural networks. Proceedings of the 28th IEEE Conference on Decision and Control; 1989 Dec 13-15; Tampa, FL, USA. 1989. p. 1737-8.

59. Lee C. Fuzzy logic in control systems: fuzzy logic controller. II. IEEE Trans Syst, Man, Cybern 1990;20:419-35.

60. Sutton RS, Barto AG, Williams RJ. Reinforcement learning is direct adaptive optimal control. IEEE Control Syst 1992;12:19-22.

61. Yechiel O. A survey of adaptive control. IRATJ 2017;3:0053.

62. Malik O. Amalgamation of adaptive control and AI techniques: applications to generator excitation control. Annu Rev Control 2004;28:97-106.

63. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci U S A 1982;79:2554-8.

64. Hopfield JJ, Tank DW. “Neural” computation of decisions in optimization problems. Biol Cybern 1985;52:141-52.

65. Burr D. Experiments on neural net recognition of spoken and written text. IEEE Trans Acoust, Speech, Signal Processing 1988;36:1162-8.

66. Gorman R, Sejnowski T. Learned classification of sonar targets using a massively parallel network. IEEE Trans Acoust, Speech, Signal Processing 1988;36:1135-40.

67. Sejnowski T, Rosenberg CR. Parallel networks that learn to pronounce English text. Complex Syst 1987;1:145-68.

68. Widrow B, Winter R, Baxter R. Layered neural nets for pattern recognition. IEEE Trans Acoust, Speech, Signal Processing 1988;36:1109-18.

69. Levin AU, Narendra KS. Control of nonlinear dynamical systems using neural networks: controllability and stabilization. IEEE Trans Neural Netw 1993;4:192-206.

70. Narendra KS, Parthasarathy K. Identification and control of dynamical systems using neural networks. IEEE Trans Neural Netw 1990;1:4-27.

71. Sontag ED. Feedback stabilization using two-hidden-layer nets. IEEE Trans Neural Netw 1992;3:981-90.

72. Barto AG. Connectionist learning for control: an overview. In: Miller WT, Sutton RS, Werbos PJ, editors. Neural networks for control. Cambridge, MA, USA: MIT Press; 1990. p. 5-58.

73. Dai SL, Wang C, Wang M. Dynamic learning from adaptive neural network control of a class of nonaffine nonlinear systems. IEEE Trans Neural Netw Learn Syst 2014;25:111-23.

74. Chen CL, Liu YJ, Wen GX. Fuzzy neural network-based adaptive control for a class of uncertain nonlinear stochastic systems. IEEE Trans Cybern 2014;44:583-93.

75. Dai S, Wang M, Wang C. Neural learning control of marine surface vessels with guaranteed transient tracking performance. IEEE Trans Ind Electron 2016;63:1717-27.

76. Li H, Bai L, Wang L, Zhou Q, Wang H. Adaptive neural control of uncertain nonstrict-feedback stochastic nonlinear systems with output constraint and unknown dead zone. IEEE Trans Syst Man Cybern, Syst 2017;47:2048-59.

77. Cheng L, Liu W, Hou Z, Yu J, Tan M. Neural-network-based nonlinear model predictive control for piezoelectric actuators. IEEE Trans Ind Electron 2015;62:7717-27.

78. Ren B, Ge SS, Su CY, Lee TH. Adaptive neural control for a class of uncertain nonlinear systems in pure-feedback form with hysteresis input. IEEE Trans Syst Man Cybern B Cybern 2009;39:431-43.

79. Luo B, Huang T, Wu HN, Yang X. Data-driven H∞ control for nonlinear distributed parameter systems. IEEE Trans Neural Netw Learn Syst 2015;26:2949-61.

80. Liu Y, Tong S. Barrier Lyapunov functions for Nussbaum gain adaptive control of full state constrained nonlinear systems. Automatica 2017;76:143-52.

81. Li Y, Qiang S, Zhuang X, Kaynak O. Robust and adaptive backstepping control for nonlinear systems using RBF neural networks. IEEE Trans Neural Netw 2004;15:693-701.

82. Zhang H, Luo Y, Liu D. Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints. IEEE Trans Neural Netw 2009;20:1490-503.

83. Barto AG, Sutton RS, Anderson CW. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans Syst, Man, Cybern 1983;SMC-13:834-46.

84. Michie D, Chambers RA. Boxes: an experiment in adaptive control. Edinburgh, UK: Oliver and Boyd; 1968. p. 137-52.

85. Michie D, Chambers RA. ‘Boxes’ as a model of pattern-formation. 1st ed. Edinburgh: Edinburgh University Press; 1968. p. 206-15.

86. Anderson CW. Strategy learning with multilayer connectionist representations. Proceedings of the Fourth International Workshop on Machine Learning. Elsevier; 1987. p. 103-14.

87. Anderson C. Learning to control an inverted pendulum using neural networks. IEEE Control Syst Mag 1989;9:31-7.

88. Lin CS, Kim H. CMAC-based adaptive critic self-learning control. IEEE Trans Neural Netw 1991;2:530-3.

89. Albus JS. Theoretical and experimental aspects of a Cerebellar Model. Available from: https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=820153 [Last accessed on 8 Mar 2022].

90. Albus JS. A new approach to manipulator control: the cerebellar model articulation controller (CMAC). J Dyn Syst Meas Control 1975;97:220-7.

91. Albus JS. Mechanisms of planning and problem solving in the brain. Math Biosci 1979;45:247-93.

92. Albus JS. Brains, behavior, and robotics. 1st ed. Peterborough: BYTE Books; 1981.

93. Huang SJ, Huang CL. Control of an inverted pendulum using grey prediction model. IEEE Trans on Ind Applicat 2000;36:452-8.

94. Pathak K, Franch J, Agrawal S. Velocity and position control of a wheeled inverted pendulum by partial feedback linearization. IEEE Trans Robot 2005;21:505-13.

95. Li Z, Luo J. Adaptive robust dynamic balance and motion controls of mobile wheeled inverted pendulums. IEEE Trans Contr Syst Technol 2009;17:233-41.

96. Chaoui H, Gueaieb W, Yagoub MCE. ANN-based adaptive motion and posture control of an inverted pendulum with unknown dynamics. 2009 3rd International Conference on Signals, Circuits and Systems (SCS); 2009 Nov 6-8; Medenine, Tunisia. IEEE; 2009. p. 1-6.

97. Guez A, Ahmad Z. Solution to the inverse kinematics problem in robotics by neural networks. IEEE 1988 International Conference on Neural Networks; 1988 Jul 24-27; San Diego, CA, USA. IEEE; 1988. p. 617-24.

98. Elsley. A learning architecture for control based on back-propagation neural networks. IEEE 1988 International Conference on Neural Networks; 1988 Jul 24-27; San Diego, CA, USA. IEEE; 1988. p. 587-94.

99. Jamshidi M, Horne B, Vadiee N. A neural network-based controller for a two-link robot. 29th IEEE Conference on Decision and Control; 1990 Dec 5-7; Honolulu, HI, USA. IEEE; 1990. p. 3256-7.

100. Karakasoglu A, Sundareshan MK. Decentralized variable structure control of robotic manipulators: neural computational algorithms. 29th IEEE Conference on Decision and Control; 1990 Dec 5-7; Honolulu, HI, USA. IEEE; 1990. p. 3258-9.

101. Xu G, Scherrer H, Schweitzer G. Application of neural networks on robot grippers. 1990 IJCNN International Joint Conference on Neural Networks; 1990 Jun 17-21; San Diego, CA, USA. IEEE; 1990. p. 337-42.

102. Wilhelmsen K, Cotter N. Neural network based controllers for a single-degree-of-freedom robotic arm. 1990 IJCNN International Joint Conference on Neural Networks; 1990 Jun 17-21; San Diego, CA, USA. IEEE; 1990. p. 407-13.

103. Miller WT, Glanz FH, Kraft LG. Application of a general learning algorithm to the control of robotic manipulators. Int J Rob Res 1987;6:84-98.

104. Miller W. Sensor-based control of robotic manipulators using a general learning algorithm. IEEE J Robot Automat 1987;3:157-65.

105. Miller WT. Real time learned sensor processing and motor control for a robot with vision. Neural Netw 1988;1:347.

106. Miller WT, Hewes RP. Real time experiments in neural network based learning control during high speed nonrepetitive robotic operations. Proceedings IEEE International Symposium on Intelligent Control 1988; 1988 Aug 24-26; Arlington, VA, USA. IEEE; 1988. p. 513-8.

107. Miller W. Real-time application of neural networks for sensor-based control of robots with vision. IEEE Trans Syst, Man, Cybern 1989;19:825-31.

108. Miller W, Glanz F, Kraft L. CMAC: an associative neural network alternative to backpropagation. Proc IEEE 1990;78:1561-7.

109. Liu H, Iberall T, Bekey GA. Building a generic architecture for robot hand control. IEEE 1988 International Conference on Neural Networks; 1988 Jul 24-27; San Diego, CA, USA. IEEE; 1988. p. 567-74.

110. Wang SD, Yeh HMS. Self-adaptive neural architectures for control applications. 1990 IJCNN International Joint Conference on Neural Networks; 1990 Jun 17-21; San Diego, CA, USA. IEEE; 1990. p. 309-14.

111. Seidl D, Lam SL, Putman J, Lorenz R. Neural network compensation of gear backlash hysteresis in position-controlled mechanisms. IEEE Trans on Ind Applicat 1995;31:1475-83.

112. Olsson H, Åström K, Canudas de Wit C, Gäfvert M, Lischinsky P. Friction models and friction compensation. Eur J Control 1998;4:176-95.

113. Katsura S, Suzuki J, Ohnishi K. Pushing operation by flexible manipulator taking environmental information into account. IEEE Trans Ind Electron 2006;53:1688-97.

114. Katsura S, Ohnishi K. Force servoing by flexible manipulator based on resonance ratio control. IEEE Trans Ind Electron 2007;54:539-47.

115. Ghorbel F, Hung J, Spong M. Adaptive control of flexible-joint manipulators. IEEE Control Syst Mag 1989;9:9-13.

116. Chien M, Huang A. Adaptive control for flexible-joint electrically driven robot with time-varying uncertainties. IEEE Trans Ind Electron 2007;54:1032-8.

117. Hauschild JP, Heppler GR. Control of harmonic drive motor actuated flexible linkages. Proceedings 2007 IEEE International Conference on Robotics and Automation; 2007 Apr 10-14; Rome, Italy. IEEE; 2007. p. 3451-6.

118. Kong K, Tomizuka M, Moon H, Hwang B, Jeon D. Mechanical design and impedance compensation of SUBAR (Sogang University’s Biomedical Assist Robot). 2008 IEEE/ASME International Conference on Advanced Intelligent Mechatronics; 2008 Jul 2-5; Xi’an, China. IEEE; 2008. p. 377-82.

119. Ghorbel F, Spong MW. Adaptive integral manifold control of flexible joint robot manipulators. Proceedings 1992 IEEE International Conference on Robotics and Automation; 1992 May 12-14; Nice, France. IEEE; 1992. p. 707-14.

120. Al-Ashoor R, Patel R, Khorasani K. Robust adaptive controller design and stability analysis for flexible-joint manipulators. IEEE Trans Syst, Man, Cybern 1993;23:589-602.

121. Ott C, Albu-Schaffer A, Hirzinger G. Comparison of adaptive and nonadaptive tracking control laws for a flexible joint manipulator. IEEE/RSJ International Conference on Intelligent Robots and Systems; 2002 Sep 30-Oct 4; Lausanne, Switzerland. IEEE; 2002. p. 2018-24.

122. Spong MW. Modeling and control of elastic joint robots. J Dyn Syst Meas Control 1987;109:310-8.

123. Ge SS, Postlethwaite I. Adaptive neural network controller design for flexible joint robots using singular perturbation technique. Trans Inst Meas Control 1995;17:120-31.

124. Taghirad HD, Khosravi MA. Design and simulation of robust composite controllers for flexible joint robots. 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422); 2003 Sep 14-19; Taipei, Taiwan. IEEE; 2003. p. 3108-13.

125. Huang L, Ge SS, Lee TH. Adaptive position/force control of an uncertain constrained flexible joint robots - singular perturbation approach. SICE 2004 Annual Conference; 2004 Aug 4-6; Sapporo, Japan; 2004. p. 220-5.

126. Chaoui H, Gueaieb W. Type-2 fuzzy logic control of a flexible-joint manipulator. J Intell Robot Syst 2008;51:159-86.

127. Karray F, Gueaieb W, Al-Sharhan S. The hierarchical expert tuning of PID controllers using tools of soft computing. IEEE Trans Syst Man Cybern B Cybern 2002;32:77-90.

128. Gueaieb W, Karray F, Al-Sharhan S. A robust adaptive fuzzy position/force control scheme for cooperative manipulators. IEEE Trans Contr Syst Technol 2003;11:516-28.

129. Kim E. Output feedback tracking control of robot manipulators with model uncertainty via adaptive fuzzy logic. IEEE Trans Fuzzy Syst 2004;12:368-78.

130. Chaoui H, Gueaieb W, Yagoub MCE, Sicard P. Hybrid neural fuzzy sliding mode control of flexible-joint manipulators with unknown dynamics. IECON 2006 - 32nd Annual Conference on IEEE Industrial Electronics; 2006 Nov 6-10; Paris, France. IEEE; 2006. p. 4082-7.

131. Chaoui H, Sicard P, Lakhsasi A. Reference model supervisory loop for neural network based adaptive control of a flexible joint with hard nonlinearities. Canadian Conference on Electrical and Computer Engineering 2004 (IEEE Cat. No.04CH37513); 2004 May 2-5; Niagara Falls, ON, Canada. IEEE; 2004. p. 2029-34.

132. Chaoui H, Sicard P, Lakhsasi A, Schwartz H. Neural network based model reference adaptive control structure for a flexible joint with hard nonlinearities. 2004 IEEE International Symposium on Industrial Electronics; 2004 May 4-7; Ajaccio, France. IEEE; 2004. p. 271-6.

133. Hui, Sun F, Sun Z. Observer-based adaptive controller design of flexible manipulators using time-delay neuro-fuzzy networks. J Intell Robot Syst 2002;34:453-66.

134. Subudhi B, Morris AS. Singular perturbation based neuro-H∞ control scheme for a manipulator with flexible links and joints. Robotica 2006;24:151-61.

135. Chaoui H, Sicard P, Gueaieb W. ANN-based adaptive control of robotic manipulators with friction and joint elasticity. IEEE Trans Ind Electron 2009;56:3174-87.

136. Hou ZG, Cheng L, Tan M. Multicriteria optimization for coordination of redundant robots using a dual neural network. IEEE Trans Syst Man Cybern B Cybern 2010;40:1075-87.

137. Li Z, Su CY. Neural-adaptive control of single-master-multiple-slaves teleoperation for coordinated multiple mobile manipulators with time-varying communication delays and input uncertainties. IEEE Trans Neural Netw Learn Syst 2013;24:1400-13.

138. Li Z, Xia Y, Sun F. Adaptive fuzzy control for multilateral cooperative teleoperation of multiple robotic manipulators under random network-induced delays. IEEE Trans Fuzzy Syst 2014;22:437-50.

139. He W, Huang H, Ge SS. Adaptive neural network control of a robotic manipulator with time-varying output constraints. IEEE Trans Cybern 2017;47:3136-47.

140. He W, Ouyang Y, Hong J. Vibration control of a flexible robotic manipulator in the presence of input deadzone. IEEE Trans Ind Inf 2017;13:48-59.

141. Zhu G, Ge S, Lee T. Simulation studies of tip tracking control of a single-link flexible robot based on a lumped model. Robotica 1999;17:71-8.

142. Sun C, He W, Hong J. Neural network control of a flexible robotic manipulator using the lumped spring-mass model. IEEE Trans Syst Man Cybern, Syst 2017;47:1863-74.

143. Bertsekas DP, Tsitsiklis JN. Neuro-dynamic programming. Belmont, MA: Athena Scientific; 1996.

144. Pitts W, McCulloch WS. How we know universals; the perception of auditory and visual forms. Bull Math Biophys 1947;9:127-47.

145. Liu R. Multispectral images-based background subtraction using Codebook and deep learning approaches. Available from: https://www.theses.fr/2020UBFCA013.pdf [Last accessed on 8 Mar 2022].

146. Liu W, Wang Z, Liu X, Zeng N, Liu Y, Alsaadi FE. A survey of deep neural network architectures and their applications. Neurocomputing 2017;234:11-26.

147. Silver D, Schrittwieser J, Simonyan K, et al. Mastering the game of Go without human knowledge. Nature 2017;550:354-9.

148. Laud AD. Theory and application of reward shaping in reinforcement learning. Available from: https://www.proquest.com/openview/bb29dc3d66eccbe7ab65560dd2c4147f/1?pq-origsite=gscholar&cbl=18750&diss=y [Last accessed on 8 Mar 2022].

149. Kober J, Bagnell JA, Peters J. Reinforcement learning in robotics: a survey. Int J Rob Res 2013;32:1238-74.

150. Digney BL. Nested Q-learning of hierarchical control structures. Proceedings of International Conference on Neural Networks (ICNN’96); 1996 Jun 3-6; Washington, DC, USA. IEEE; 1996. p. 161-6.

151. Schaal S. Learning from demonstration. Proceedings of the 9th International Conference on Neural Information Processing Systems; 1996 Dec; Cambridge, MA, USA. IEEE; 1996. p. 1040-6.

152. Kuan C, Young K. Reinforcement learning and robust control for robot compliance tasks. J Intell Robot Syst 1998;23:165-82.

153. Bucak IO, Zohdy MA. Application of reinforcement learning control to a nonlinear dexterous robot. Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304); 1999 Dec 7-10; Phoenix, AZ, USA. IEEE; 1999. p. 5108-13.

154. Bucak IO, Zohdy MA. Reinforcement learning control of nonlinear multi-link system. Eng Appl Artif Intell 2001;14:563-75.

155. Althoefer K, Krekelberg B, Husmeier D, Seneviratne L. Reinforcement learning in a rule-based navigator for robotic manipulators. Neurocomputing 2001;37:51-70.

156. Gaskett C. Q-learning for robot control. Available from: https://digitalcollections.anu.edu.au/bitstream/1885/47080/5/01front.pdf [Last accessed on 8 Mar 2022].

157. Smart WD, Kaelbling LP. Reinforcement learning for robot control. Proc SPIE 2002; doi: 10.1117/12.457434.

158. Izawa J, Kondo T, Ito K. Biological robot arm motion through reinforcement learning. Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292); 2002 May 11-15; Washington, DC, USA. IEEE; 2002. p. 3398-403.

159. Peters J, Vijayakumar S, Schaal S. Reinforcement learning for humanoid robotics. 3rd IEEE-RAS International Conference on Humanoid Robots; 2003 Sep 29-30; Karlsruhe, Germany. 2003.

160. Bhatnagar S, Sutton RS, Ghavamzadeh M, Lee M. Natural actor-critic algorithms. Automatica 2009;45:2471-82.

161. Theodorou E, Peters J, Schaal S. Reinforcement learning for optimal control of arm movements. Poster presented at 37th Annual Meeting of the Society for Neuroscience (Neuroscience 2007); San Diego, CA, USA. 2007.

162. Peters J, Schaal S. Natural actor-critic. Neurocomputing 2008;71:1180-90.

163. Atkeson CG, Schaal S. Learning tasks from a single demonstration. Proceedings of International Conference on Robotics and Automation; 1997 Apr 25; Albuquerque, NM, USA. IEEE; 1997. p. 1706-12.

164. Hoffmann H, Theodorou E, Schaal S. Behavioral experiments on reinforcement learning in human motor control. Available from: https://www.researchgate.net/publication/325463394 [Last accessed on 8 Mar 2022].

165. Peters J, Schaal S. Learning to control in operational space. Int J Rob Res 2008;27:197-212.

166. Buchli J, Theodorou E, Stulp F, Schaal S. Variable impedance control - a reinforcement learning approach. In: Matsuoka Y, Durrant-Whyte H, Neira J, editors. Robotics: Science and Systems VI. Cambridge: MIT Press; 2011.

167. Theodorou E, Buchli J, Schaal S. Reinforcement learning of motor skills in high dimensions: a path integral approach. 2010 IEEE International Conference on Robotics and Automation; 2010 May 3-7; Anchorage, AK, USA. IEEE; 2010. p. 2397-403.

168. Kappen HJ. Path integrals and symmetry breaking for optimal control theory. J Stat Mech 2005;2005:P11011.

169. Shah H, Gopal M. Reinforcement learning control of robot manipulators in uncertain environments. 2009 IEEE International Conference on Industrial Technology; 2009 Feb 10-13; Churchill, VIC, Australia. IEEE; 2009. p. 1-6.

170. Kim B, Kang B, Park S, Kang S. Learning robot stiffness for contact tasks using the natural actor-critic. 2008 IEEE International Conference on Robotics and Automation; 2008 May 19-23; Pasadena, CA, USA. IEEE; 2008. p. 3832-7.

171. Kim B, Park J, Park S, Kang S. Impedance learning for robotic contact tasks using natural actor-critic algorithm. IEEE Trans Syst Man Cybern B Cybern 2010;40:433-43.

172. Adam S, Busoniu L, Babuska R. Experience replay for real-time reinforcement learning control. IEEE Trans Syst, Man, Cybern C 2012;42:201-12.

173. Hafner R, Riedmiller M. Reinforcement learning in feedback control: challenges and benchmarks from technical process control. Mach Learn 2011;84:137-69.

174. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM 2017;60:84-90.

175. Levine S, Finn C, Darrell T, Abbeel P. End-to-end training of deep visuomotor policies. Available from: http://arxiv.org/abs/1504.00702 [Last accessed on 8 Mar 2022].

176. Levine S, Wagener N, Abbeel P. Learning contact-rich manipulation skills with guided policy search. Available from: http://arxiv.org/abs/1501.05611 [Last accessed on 8 Mar 2022].

177. Tai L, Zhang J, Liu M, Boedecker J, Burgard W. A survey of deep network solutions for learning control in robotics: from reinforcement to imitation. Available from: http://arxiv.org/abs/1612.07139 [Last accessed on 8 Mar 2022].

178. Vecerik M, Hester T, Scholz J, et al. Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards. Available from: http://arxiv.org/abs/1707.08817 [Last accessed on 8 Mar 2022].

179. Liu R, Nageotte F, Zanne P, de Mathelin M, Dresp-Langley B. Deep reinforcement learning for the control of robotic manipulation: a focussed mini-review. Robotics 2021;10:22.

180. Andrychowicz M, Wolski F, Ray A, et al. Hindsight experience replay. Available from: https://arxiv.org/abs/1707.01495v3 [Last accessed on 8 Mar 2022].

181. Gupta A, Savarese S, Ganguli S, Fei-Fei L. Embodied intelligence via learning and evolution. Nat Commun 2021;12:5721.

182. Rajeswaran A, Kumar V, Gupta A, et al. Learning complex dexterous manipulation with deep reinforcement learning and demonstrations. Available from: http://arxiv.org/abs/1709.10087 [Last accessed on 8 Mar 2022].

183. Matas J, James S, Davison AJ. Sim-to-real reinforcement learning for deformable object manipulation. Available from: http://arxiv.org/abs/1806.07851 [Last accessed on 8 Mar 2022].