Rotor resistance and excitation inductance estimation of an induction motor using deep-Q-learning algorithm
Introduction
Indirect field-oriented control (IFOC) is one of the most popular strategies for high-performance induction motor (IM) applications. Because IFOC exploits the inherent slip relation, it is essentially a feed-forward control method, in which accurate values of at least some motor parameters must be estimated to obtain robust control performance. For example, the rotor time constant may not be precisely known, or the correct slip may not be attained owing to motor heating, field weakening, and other variation-induced changes. This results in detuning of the controller and a loss of correct field orientation (Novotny and Lipo, 1996).
To avoid such situations, estimation methods must be introduced to equip the vector controller with accurate IM parameter values. Several methods have been proposed to estimate the motor parameters, including the model reference adaptive system (MRAS), extended Kalman filter (EKF), sliding mode observer, and recursive least squares (Zerdali and Barut, 2017; Yin et al., 2016a,b,c; Yang et al., 2017; Djadi et al., 2017). This class of methods can be summarized as model-based, since their performance depends strongly on the accuracy of the approximated IM model. Model-based methods generally suffer from two shortcomings. First, they are very sensitive to noise: the algorithms can become unstable under noise overlaying the IM model. For example, Shi et al. (2012) proposed an EKF algorithm to estimate the rotor speed and position of a permanent magnet synchronous motor; however, this algorithm is difficult to apply in practice owing to its sensitivity to noise. Second, in some situations, these methods may fail because of parameter perturbation. For example, MRAS cannot be used at low frequencies because of the voltage drop across the stator resistance (Maiti et al., 2008).
Recent advances in artificial intelligence have provided new possibilities for estimating motor parameters in a data-based manner. These strategies can employ a range of training architectures, including artificial neural networks (ANNs), support vector machines (SVMs), genetic algorithms, and particle swarm optimization (PSO) (Papa and Koroušić-Seljak, 2005; Liu et al., 2008; Woodley et al., 2005; Liu et al., 2017). Data-based methods are generally more robust and accurate because they are independent of the motor model. However, they face a significant challenge: most successful data-based applications require considerable hand-labelled training data, which are rather difficult to obtain in practical engineering applications. Karanayil et al. (2007) introduced an ANN to estimate the stator and rotor resistances in sensorless applications; however, no measured dataset is available for training the ANN, and the voltage model of the motor is used instead. In other words, the strategy is model-based rather than data-based. Bilski (2014) presented an SVM-based strategy, but how the dataset was obtained remained unclear. Sakthivel et al. (2010) investigated a multi-objective PSO (MOPSO) strategy to estimate the IM parameters, where the manufacturer data were used as a reference instead of a training dataset.
A feasible way to obtain the labelled data is reinforcement learning, in which the agent can learn the correct knowledge from the environment and generate the labelled data automatically. Recently, the industrial application of reinforcement learning has attracted considerable attention owing to theoretical breakthroughs. Khan et al. (2011) presented a humanoid robotic arm control strategy based on Q-learning and approximate dynamic programming. Yin et al. (2016a) proposed an approximate dynamic programming method to reduce the time delay of affected passengers in a congested metro line. Deisenroth and Rasmussen (2009) presented a Gaussian process reinforcement learning algorithm and applied it to inverted pendulum control in hardware. Yin et al. (2016b) developed an integrated train operation algorithm based on Q-learning to realize real-time train operations while adjusting the timetable online. Zhang et al. (2017) presented an energy-efficient scheduling method for real-time systems based on the deep Q-learning (DQL) model. Prashanth and Bhatnagar (2011) proposed a reinforcement learning algorithm with function approximation that can be applied to traffic signal control.
This paper proposes a data-based method in which the labelled data are obtained in a simple manner. With this objective, an estimation approach based on the DQL algorithm is proposed. In particular, the training dataset is generated automatically during the estimation procedure in DQL, and the network is trained on it simultaneously. With a careful design of the algorithm's architecture, DQL can obtain an unbiased estimate of the parameters after a few iterations of the training procedure.
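As a sketch of how such a self-labelled dataset can be accumulated, the following Python snippet stores the agent's interaction tuples in a replay memory that later serves as training data. This is a generic illustration, not the authors' implementation; the class and parameter names are hypothetical.

```python
from collections import deque
import random


class ReplayMemory:
    """Fixed-size buffer of (obs, action, reward, next_obs) tuples.

    The agent's own interactions serve as the training data, so no
    hand-labelled dataset is needed: each transition is a 'label'
    generated automatically while the estimator runs.
    """

    def __init__(self, capacity=10000):
        # Oldest transitions are discarded automatically once full.
        self.buffer = deque(maxlen=capacity)

    def push(self, obs, action, reward, next_obs):
        self.buffer.append((obs, action, reward, next_obs))

    def sample(self, batch_size, seed=None):
        # Uniform random minibatch for a training step.
        rng = random.Random(seed)
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

In a DQL-style loop, each estimation step would `push` the latest transition and periodically `sample` a minibatch to update the value network, which is how the dataset and the training proceed simultaneously.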
The main contributions of this paper can be summarized as follows.
(1) Although DQL has recently achieved breakthroughs in some domains, such as video games (Mnih et al., 2013), no study has applied DQL to motor control technology. In this study, DQL is successfully used for motor parameter estimation.
(2) To ensure the feasibility of DQL, the design of the DQL architecture is also discussed, including the design of the observation, action, and reward. Moreover, an indicator, namely Q-sensitivity, is introduced to guarantee the robustness and efficiency of the algorithm; it is demonstrated that an algorithm with high Q-sensitivity is more stable and converges at a much faster rate.
(3) Comparative experiments are designed to illustrate the performance of DQL. Through these experiments, DQL is verified to be more accurate and robust than some traditional methods. The experiments also demonstrate that DQL can generate a maximum-torque output in any state.
The remainder of this paper is organized as follows. In Section 2, the IM model with the parameters to be estimated is established. In Section 3, a new DQL architecture is designed, in which the observation, reward, and action are appropriately constructed. The experimental and comparative studies are illustrated in Section 4. Finally, the conclusion is drawn in Section 5.
Section snippets
IM model
The mathematical model of the IM in the d–q axis voltage form can be given as

$$u_{sd} = R_s i_{sd} + \frac{d\psi_{sd}}{dt} - \omega_e \psi_{sq}, \qquad u_{sq} = R_s i_{sq} + \frac{d\psi_{sq}}{dt} + \omega_e \psi_{sd}$$

$$0 = R_r i_{rd} + \frac{d\psi_{rd}}{dt} - (\omega_e - \omega_r) \psi_{rq}, \qquad 0 = R_r i_{rq} + \frac{d\psi_{rq}}{dt} + (\omega_e - \omega_r) \psi_{rd}$$

The motor torque is obtained as

$$T_e = \frac{3}{2} n_p \left( \psi_{sd} i_{sq} - \psi_{sq} i_{sd} \right)$$

where $\omega_e$ is the electrical angular velocity; $\omega_r$ is the rotor electrical angular velocity; $T_e$ is the electromagnetic torque; $n_p$ is the pole-pair number of the motor; $u_{sd}$, $u_{sq}$, $i_{sd}$, and $i_{sq}$ are the d–q axis stator voltages and currents; $R_s$ and $R_r$ indicate the stator and rotor resistances, respectively; and $L_m$ indicates the excitation inductance. Moreover, the stator and rotor flux linkages are given by $\psi_{sd} = L_s i_{sd} + L_m i_{rd}$, $\psi_{sq} = L_s i_{sq} + L_m i_{rq}$, $\psi_{rd} = L_r i_{rd} + L_m i_{sd}$, and $\psi_{rq} = L_r i_{rq} + L_m i_{sq}$, where $L_s$ and $L_r$ are the stator and rotor inductances.
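For illustration, the standard d–q IM relations can be evaluated numerically. The following Python sketch assumes the standard flux-linkage and torque expressions of the d–q model; the function and symbol names are our own choices, not taken from the paper.

```python
def flux_linkages(L_s, L_r, L_m, i_sd, i_sq, i_rd, i_rq):
    """Stator and rotor flux linkages of the standard d-q IM model.

    L_s, L_r: stator and rotor inductances; L_m: excitation inductance.
    """
    psi_sd = L_s * i_sd + L_m * i_rd
    psi_sq = L_s * i_sq + L_m * i_rq
    psi_rd = L_r * i_rd + L_m * i_sd
    psi_rq = L_r * i_rq + L_m * i_sq
    return psi_sd, psi_sq, psi_rd, psi_rq


def torque(n_p, psi_sd, psi_sq, i_sd, i_sq):
    """Electromagnetic torque T_e = (3/2) n_p (psi_sd*i_sq - psi_sq*i_sd)."""
    return 1.5 * n_p * (psi_sd * i_sq - psi_sq * i_sd)
```

For example, with two pole pairs, a d-axis stator flux of 0.9 Wb, and a q-axis current of 10 A, `torque(2, 0.9, 0.0, 0.0, 10.0)` evaluates to 27 N·m.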
Reinforcement learning and Markov decision process
The task of motor parameter estimation is shown in Fig. 1, where the algorithm interacts with the environment through a sequence of actions $a_t$, observations $o_t$, and rewards $r_t$. At each time step, the algorithm selects an action $a_t$ from the set of actions $\mathcal{A}$. The action is passed to the environment and modifies the observations, which are internal signals of the motor controller. Simultaneously, the algorithm receives a reward in the form of the motor electromagnetic torque. Note that all motor
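The interaction loop described here can be sketched generically in Python. This is an illustrative sketch only: the environment step function, Q-function, and action set below are placeholders, not the paper's actual DQL components.

```python
import random


def run_episode(env_step, q_values, actions, obs0, steps=50, eps=0.1, seed=0):
    """Generic agent-environment loop.

    At each step the agent picks an action (epsilon-greedy on the given
    Q-function), applies it via env_step(obs, action) -> (next_obs, reward),
    and records the transition (obs, action, reward, next_obs).
    """
    rng = random.Random(seed)
    obs = obs0
    transitions = []
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.choice(actions)          # explore
        else:
            a = max(actions, key=lambda act: q_values(obs, act))  # exploit
        next_obs, reward = env_step(obs, a)
        transitions.append((obs, a, reward, next_obs))
        obs = next_obs
    return transitions
```

In the paper's setting, `obs` would correspond to internal controller signals, `actions` to parameter adjustments, and `reward` to the measured electromagnetic torque; the recorded transitions then form the self-generated training set.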
Experiment environment
During the experiments, an IM benchmark is employed as the experimental equipment, as illustrated in Fig. 5. The benchmark consists of a test motor (design parameters are listed in Table 2), a dynamometer motor, a torque–speed transducer, a data collector, and two motor controllers. The test motor operates in the IFOC torque loop, while the dynamometer motor operates in the IFOC speed loop. The torque and current signals can be obtained from the torque–speed transducer
Conclusion
This paper proposes a parameter estimation method for the rotor resistance and excitation inductance in an IM's IFOC application. Owing to the demonstrated disadvantages of model-based methods, a model-free method based on DQL is proposed. In particular, the structure of the DQL parameter estimation is constructed, and the DQL elements, including the observation, reward, and action, are designed such that the feasibility of the algorithm is ensured. Moreover, a new concept of 'Q-sensitivity' is introduced to guarantee the robustness and efficiency of the algorithm.
References (28)

Bilski, 2014. Application of support vector machines to the induction motor parameters identification. Measurement.

Khan et al., 2011. A novel Q-learning based adaptive optimal controller implementation for a humanoid robotic arm. IFAC Proc.

Liu et al., 2008. Particle swarm optimization-based parameter identification applied to permanent magnet synchronous motors. Eng. Appl. Artif. Intell.

Papa and Koroušić-Seljak, 2005. An artificial intelligence approach to the efficiency improvement of a universal motor. Eng. Appl. Artif. Intell.

Sakthivel et al., 2010. Multi-objective parameter estimation of induction motor using particle swarm optimization. Eng. Appl. Artif. Intell.

Woodley et al., 2005. Neural network modeling of torque estimation and d-q transformation for induction machine. Eng. Appl. Artif. Intell.

Yin et al., 2016. Energy-efficient metro train rescheduling with uncertain time-variant passenger demands: An approximate dynamic programming approach. Transp. Res. B.

Deisenroth, M.P., Rasmussen, C.E., 2009. Efficient reinforcement learning for motor control. In: International PhD...

Djadi et al., 2017. Parameters identification of a brushless doubly fed induction machine using PRBS excitation signal for recursive least squares method. IET Electr. Power Appl.

Karanayil et al., 2007. Online stator and rotor resistance estimation scheme using artificial neural networks for vector controlled speed sensorless induction motor drive. IEEE Trans. Ind. Electron.

Parameter estimation of wound-rotor induction motors from transient measurements. IEEE Trans. Energy Convers.

Liu et al., 2017. GPU implementation of DPSO-RE algorithm for parameters identification of surface PMSM considering VSI nonlinearity. IEEE J. Emerg. Sel. Top. Power Electron.

Maiti et al., 2008. Model reference adaptive controller-based rotor resistance and speed estimation techniques for vector controlled induction motor drive utilizing reactive power. IEEE Trans. Ind. Electron.

Mnih et al., 2013. Playing Atari with deep reinforcement learning. Comput. Sci.