HYPERPARAMETERS OF Q-LEARNING ALGORITHM ADAPTING TO THE DRIVING CYCLE BASED ON KL DRIVING CYCLE RECOGNITION

Electric, Fuel Cell, and Hybrid Vehicle

International Journal of Automotive Technology 2022;23(4): 967-981.
doi: https://doi.org/10.1007/s12239-022-0084-0

Yanli Yin ¹,², Xuejiang Huang ¹, Xiaoliang Pan ³, Sen Zhan ¹, Yongjuan Ma ¹, Xinxin Zhang ¹

¹School of Mechatronics & Automobile Engineering, Chongqing Jiao Tong University
²Beiben Trucks Group Co., Ltd.
³Chongqing Changan Automobile Stock Co., Ltd.

Corresponding author: Yanli Yin, Email: 990201200009@cqjtu.edu.cn

ABSTRACT

As an effective reinforcement learning (RL) algorithm, Q-learning has been applied to the energy management strategy of hybrid electric vehicles (HEVs) in recent years. In the existing literature, the values of the three Q-learning hyperparameters, namely the exploration rate ε, the discount factor γ and the learning rate α, are all fixed in advance. However, different hyperparameter values affect both the fuel economy of the vehicle and the offline computation speed. This paper proposes a method for optimizing the hyperparameters so that they adapt to the driving cycle. Firstly, a mathematical model relating each of the three hyperparameters to the number of iterations is established, based on the regular pattern by which each hyperparameter influences vehicle performance. Secondly, the optimal changing index k_index of the Q-learning iteration number is determined for each typical driving cycle. Finally, a simulation model for Yubei District in Chongqing is constructed based on Kullback-Leibler (KL) driving cycle identification. The simulation results indicate that the equivalent fuel consumption of the proposed strategy is reduced by 0.4 % and the offline computation time is reduced by 6 s. It can be concluded that the proposed strategy not only improves the fuel economy of the vehicle but also accelerates the computation.

Key Words: Q-Learning, Hyperparameters, Adapting to, Kullback-Leibler (KL), Computation speed, Fuel economy
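
The KL driving cycle identification mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which features the authors compare, so the sketch assumes a normalized speed histogram per cycle; the bin edges, the reference distributions and the function names (kl_divergence, speed_histogram, recognize_cycle) are all hypothetical and chosen only for illustration. The segment is assigned to the typical cycle whose distribution has the smallest KL divergence from it.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with p_i = 0 contribute nothing
            # eps guards against division by zero when q_i = 0
            total += pi * math.log(pi / max(qi, eps))
    return total


def speed_histogram(speeds_kmh, bin_edges):
    """Normalized histogram over half-open bins [edge_i, edge_{i+1})."""
    counts = [0] * (len(bin_edges) - 1)
    for v in speeds_kmh:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    n = sum(counts)
    if n == 0:
        raise ValueError("no samples fall inside the bins")
    return [c / n for c in counts]


def recognize_cycle(segment_hist, reference_hists):
    """Name of the reference cycle closest to the segment in KL divergence."""
    return min(
        reference_hists,
        key=lambda name: kl_divergence(segment_hist, reference_hists[name]),
    )


# Assumed (made-up) bin edges in km/h and reference distributions.
BIN_EDGES = [0, 20, 40, 60, 80, 200]
REFERENCE = {
    "urban": [0.50, 0.30, 0.15, 0.04, 0.01],
    "suburban": [0.20, 0.25, 0.35, 0.15, 0.05],
    "highway": [0.05, 0.05, 0.15, 0.35, 0.40],
}

segment = [5, 12, 18, 25, 31, 8, 0, 15, 22, 38]  # speed samples, km/h
hist = speed_histogram(segment, BIN_EDGES)
label = recognize_cycle(hist, REFERENCE)
print(label)
```

In a scheme like the one the abstract describes, the recognized label would then select the hyperparameter settings (for example the k_index value) tuned offline for that typical cycle; that lookup is not shown because the abstract gives no values for it.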
