Rotor resistance and excitation inductance estimation of an induction motor using deep-Q-learning algorithm

https://doi.org/10.1016/j.engappai.2018.03.018

Abstract

To estimate the parameters of an induction motor in a data-based manner, this paper proposes a new offline method for estimating the rotor resistance and excitation inductance based on the deep-Q-learning approach. With this method, the parameters can be estimated without being affected by model error or the operating state. To achieve this goal, three key elements, namely the observation, action, and reward, are appropriately designed. To improve robustness and accelerate convergence, a new concept, denoted Q-sensitivity, is proposed and investigated in detail. The experimental results show that a high-Q-sensitivity design allows the proposed method to obtain a fast, torque-maximized estimation. Results from the comparative studies confirm the accuracy and robustness of the proposed method.

Introduction

Indirect field-oriented control (IFOC) is one of the most popular strategies used in high-performance induction motor (IM) applications. Because IFOC exploits an inherent slip relation, it is essentially a feed-forward control method, and accurate values of at least some motor parameters must be estimated to obtain robust control performance. For example, the rotor time constant may not be precisely known, or the correct slip may not be attained, owing to motor heating, field weakening, and other parameter variations. This results in detuning of the controller and a loss of correct field orientation (Novotny and Lipo, 1996).
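For concreteness, the feed-forward slip relation that IFOC exploits can be written as $\omega_{sl} = i_{qs}/(\tau_r i_{ds})$ with rotor time constant $\tau_r = L_r/R_r$. The following minimal sketch (with hypothetical numerical values, not taken from this paper) illustrates how an unmodelled rise in $R_r$ from motor heating detunes the commanded slip:

```python
# Minimal sketch (hypothetical values): slip detuning in IFOC caused
# by an unmodelled change in the rotor resistance R_r.
L_r = 0.035       # rotor inductance [H]
R_r_nom = 0.55    # rotor resistance assumed by the controller [ohm]
R_r_hot = 0.70    # actual rotor resistance after heating [ohm]
i_ds, i_qs = 4.0, 6.0   # commanded d/q-axis stator currents [A]

def slip_freq(R_r):
    """Feed-forward slip relation: w_sl = i_qs / (tau_r * i_ds)."""
    tau_r = L_r / R_r       # rotor time constant [s]
    return i_qs / (tau_r * i_ds)

w_cmd = slip_freq(R_r_nom)   # slip commanded by the detuned controller
w_req = slip_freq(R_r_hot)   # slip the heated motor actually requires
print(f"commanded {w_cmd:.1f} rad/s, required {w_req:.1f} rad/s "
      f"({100 * (w_req - w_cmd) / w_req:.0f}% detuning)")
```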

To avoid such situations, estimation methods must be introduced to equip the vector controller with accurate IM parameter values. Several methods have been proposed to estimate the motor parameters, including the model reference adaptive system (MRAS), extended Kalman filter (EKF), sliding mode observer, and recursive least squares (Zerdali and Barut, 2017; Yin et al., 2016a,b,c; Yang et al., 2017; Djadi et al., 2017). This class can be summarized as model-based methods, since their performance strongly depends on the accuracy of the approximated IM model. Model-based methods generally suffer from two shortcomings. First, they are sensitive to noise: the algorithms can become unstable when noise is superimposed on the IM model. For example, Shi et al. (2012) proposed an EKF algorithm to estimate the rotor speed and position of a permanent magnet synchronous motor; however, the algorithm is difficult to apply in practice owing to its noise sensitivity. Second, in some situations, these methods may fail because of parameter perturbation. For example, MRAS cannot be used at low frequencies because of the stator-resistance voltage drop (Maiti et al., 2008).
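To make the model-based class concrete, the sketch below shows one generic recursive-least-squares update step (an illustration of the technique, not any cited author's implementation) for a motor model written in the regression form $y \approx \varphi^{\top}\theta$; the gain $K$ multiplies the measured residual directly, which is why noise on $y$ propagates straight into the parameter estimate:

```python
import numpy as np

def rls_step(theta, P, phi, y, lam=0.99):
    """Generic recursive-least-squares update for y ~= phi @ theta.

    theta: parameter estimate (n,), P: covariance (n, n),
    phi: regressor (n,), y: scalar measurement, lam: forgetting factor.
    """
    K = P @ phi / (lam + phi @ P @ phi)      # update gain
    theta = theta + K * (y - phi @ theta)    # noise on y enters here
    P = (P - np.outer(K, phi @ P)) / lam     # covariance update
    return theta, P
```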

Recent advances in artificial intelligence have provided new possibilities for estimating motor parameters in a data-based manner. These strategies can utilize a range of training architectures, including artificial neural networks (ANNs), support vector machines (SVMs), genetic algorithms, and particle swarm optimization (PSO) (Papa and Koroušić-Seljak, 2005; Liu et al., 2008; Woodley et al., 2005; Liu et al., 2017). Data-based methods are generally more robust and accurate because they are independent of the motor model. However, they face a significant challenge: most successful data-based applications require considerable hand-labelled training data, which are difficult to obtain in practical engineering applications. Karanayil et al. (2007) introduced an ANN to estimate the stator and rotor resistances in sensorless applications; however, no measured dataset is available for training the ANN, and the voltage model of the motor is used instead. In other words, the strategy is effectively model-based rather than data-based. Bilski (2014) presented an SVM-based strategy, but the method of obtaining the dataset remained unclear. Sakthivel et al. (2010) investigated a multi-objective PSO (MOPSO) strategy to estimate the IM parameters, where manufacturer data were used as the reference instead of a training dataset.

A feasible way of obtaining hand-labelled data is to use reinforcement learning, in which the agent learns from the environment and generates the labelled data automatically. Recently, industrial applications of reinforcement learning have attracted considerable attention owing to theoretical breakthroughs. Khan et al. (2011) presented a humanoid robotic arm control strategy based on Q-learning and approximate dynamic programming. Yin et al. (2016a) proposed an approximate dynamic programming method to reduce the time delay of affected passengers in a congested metro line. Deisenroth and Rasmussen (2009) presented a Gaussian-process reinforcement learning algorithm and applied it to inverted pendulum control in hardware. Yin et al. (2016b) developed an integrated train operation algorithm based on Q-learning to realize real-time train operations while adjusting the timetable online. Zhang et al. (2017) presented an energy-efficient scheduling scheme for real-time systems based on the deep Q-learning (DQL) model. Prashanth and Bhatnagar (2011) proposed a reinforcement learning algorithm with function approximation that can be applied to traffic signal control.

This paper proposes a data-based method in which the labelled data are obtained in a simple manner. To this end, an estimation approach based on the DQL algorithm is proposed. In particular, the training dataset is generated automatically during the estimation procedure, and the estimator is trained on this dataset simultaneously. With a careful design of the algorithm’s architecture, DQL can obtain an unbiased estimate of the parameters after a few iterations of the training procedure.
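In a deep-Q-learning loop of this kind, the labelled training set is simply the replay buffer of (state, action, reward, next state) transitions collected while the agent interacts with the drive. The sketch below (with hypothetical interfaces `env` and `q_net`; it is not the authors' exact implementation) shows how the dataset is generated and consumed within the same procedure:

```python
import random
from collections import deque

# Hypothetical interfaces: env.step(a) applies a discrete action and
# returns (next_obs, reward); q_net(s) returns a vector of Q-values.
replay = deque(maxlen=10_000)   # the automatically generated "dataset"

def dql_episode(env, q_net, optimize, eps=0.1, steps=200, batch=32):
    s = env.reset()
    for _ in range(steps):
        # epsilon-greedy choice over the discrete action set A = {1..K}
        a = env.random_action() if random.random() < eps \
            else int(q_net(s).argmax())
        s_next, r = env.step(a)
        replay.append((s, a, r, s_next))   # label generated automatically
        if len(replay) >= batch:           # train on the same data ...
            optimize(random.sample(replay, batch))  # ... simultaneously
        s = s_next
```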

The main contributions of this paper can be summarized as follows.

(1) Although DQL has recently achieved breakthroughs in some domains, such as video games learned from raw visual input (Mnih et al., 2013), no previous study has applied DQL to motor control. In this study, DQL is used successfully for motor parameter estimation.

(2) To ensure the feasibility of DQL, the design of the DQL architecture, including the observation, action, and reward, is also discussed. Moreover, an indicator, namely Q-sensitivity, is introduced to guarantee the robustness and efficiency of the algorithm; a design with high Q-sensitivity is shown to be more stable and to converge considerably faster.

(3) Comparative experiments are designed to illustrate the performance of DQL. Through these experiments, DQL is verified to be more accurate and robust than several traditional methods; the experiments also demonstrate that DQL can generate the maximum torque output in any state.

The remainder of this paper is organized as follows. In Section 2, the mathematical model of the IM and its parameters are established. In Section 3, the new DQL architecture is designed, with the observation, reward, and action appropriately specified. The experimental and comparative studies are presented in Section 4. Finally, conclusions are drawn in Section 5.

Section snippets

IM model

The mathematical model of the IM in dq-axis voltage form can be given as
$$u_{ds} = R_s i_{ds} - \omega L_s i_{qs}, \qquad u_{qs} = R_s i_{qs} + L_s \frac{\mathrm{d}i_{qs}}{\mathrm{d}t} + \omega L_s i_{ds}.$$

The motor torque is obtained as
$$T_e = \frac{3}{2}\, n_p \frac{L_m^2}{L_r}\, i_{ds}\, i_{qs},$$
where $\omega$ is the electrical angular velocity; $T_e$ is the electromagnetic torque; $n_p$ is the number of pole pairs of the motor; $u_{ds}$, $u_{qs}$, $i_{ds}$, and $i_{qs}$ are the dq-axis stator voltages and currents; $R_s$ and $R_r$ denote the stator and rotor resistances, respectively; and $L_m$ denotes the excitation inductance. Moreover, the stator and rotor inductances are denoted by $L_s$ and $L_r$, respectively.
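A direct transcription of the torque relation (a sketch only; the numerical parameters below are hypothetical, not the test motor's values in Table 2) is:

```python
def torque(n_p, L_m, L_r, i_ds, i_qs):
    """Electromagnetic torque: T_e = (3/2) * n_p * (L_m**2 / L_r) * i_ds * i_qs."""
    return 1.5 * n_p * (L_m ** 2 / L_r) * i_ds * i_qs

# Hypothetical example: 2 pole pairs, L_m = 33 mH, L_r = 35 mH.
print(torque(n_p=2, L_m=0.033, L_r=0.035, i_ds=4.0, i_qs=6.0))  # ~2.24 N*m
```

The expression makes the estimation targets visible: the produced torque scales with $L_m^2/L_r$, so errors in the excitation inductance (and, through the slip relation, in $R_r$) directly degrade the torque output.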

Reinforcement learning and Markov decision process

The task of motor parameter estimation is shown in Fig. 1, where the algorithm interacts with the environment through a sequence of actions $A$, observations $S$, and rewards $R$. At each time step, the algorithm selects an action $a_t$ from the set of actions $A = \{1, \ldots, K\}$. The action is passed to the environment and modifies the observation, which is an internal signal of the motor controller. Simultaneously, the algorithm receives a reward in the form of the motor electromagnetic torque. Note that all motor …
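Under this formulation, each discrete action nudges the controller's internal parameter estimates, and the measured electromagnetic torque serves as the reward. The environment sketch below is a minimal illustration under these stated assumptions; the action set, step sizes, and `drive` API are hypothetical:

```python
# Hypothetical action set: K = 4 discrete moves over the (R_r, L_m)
# estimates, with step sizes chosen purely for illustration.
ACTIONS = [(+0.01, 0.0), (-0.01, 0.0), (0.0, +0.001), (0.0, -0.001)]

class ParamEstimationEnv:
    def __init__(self, drive, R_r0=0.5, L_m0=0.03):
        self.drive = drive              # wraps the IFOC controller + motor
        self.R_r, self.L_m = R_r0, L_m0

    def step(self, a):
        dR, dL = ACTIONS[a]             # apply the selected adjustment
        self.R_r += dR
        self.L_m += dL
        self.drive.set_params(self.R_r, self.L_m)  # hypothetical API
        reward = self.drive.measure_torque()       # torque as reward
        obs = (self.R_r, self.L_m)      # internal controller signal
        return obs, reward
```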

Experiment environment

In the experiments, an IM test bench is employed as the experimental equipment, as illustrated in Fig. 5. The bench consists of a test motor (design parameters are listed in Table 2), a dynamometer motor, a torque–speed transducer, a data collector, and two motor controllers. The test motor operates in the IFOC torque loop, while the dynamometer motor operates in the IFOC speed loop. The torque and current signals are obtained from the torque–speed transducer.

Conclusion

This paper proposes a parameter estimation method for the rotor resistance $R_r$ and excitation inductance $L_m$ in an IM’s IFOC application. Owing to the demonstrated disadvantages of model-based methods, a model-free method based on DQL is proposed. In particular, the structure of the DQL parameter estimation is constructed, and the DQL elements, including the observation, reward, and action, are designed such that the feasibility of the algorithm is ensured. Moreover, a new concept of ‘Q-sensitivity’ is introduced to improve the robustness and accelerate the convergence of the algorithm.

References (28)

  • Kojooyan-Jafari, H., et al. Parameter estimation of wound-rotor induction motors from transient measurements. IEEE Trans. Energy Convers. (2014).

  • Liu, Z.-H., et al. GPU implementation of DPSO-RE algorithm for parameters identification of surface PMSM considering VSI nonlinearity. IEEE J. Emerg. Sel. Top. Power Electron. (2017).

  • Maiti, S., et al. Model reference adaptive controller-based rotor resistance and speed estimation techniques for vector controlled induction motor drive utilizing reactive power. IEEE Trans. Ind. Electron. (2008).

  • Mnih, V., et al. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013).