Deep Recurrent Q-learning Method for Area Traffic Coordination Control
Saijiang Shi *
Department of Automation, University of Science and Technology of China, Hefei 230027, China.
Feng Chen
Department of Automation, University of Science and Technology of China, Hefei 230027, China.
*Author to whom correspondence should be addressed.
Abstract
To improve the performance of Deep Q-learning on area traffic control, which is a partially observable Markov decision process, this paper introduces Deep Recurrent Q-learning by replacing the fully connected network layers with LSTM layers. In addition, we use transfer learning to coordinate the multiple intersections in the area. In simulation experiments, we compare the average delay of our algorithm with that of the Deep Q-learning algorithm under three different saturation flows. We also compare our algorithm with two other popular traffic signal control algorithms, namely Q-learning and fixed-time control. The experimental results show that our improved Deep Recurrent Q-learning algorithm outperforms the other three algorithms.
Keywords: Area traffic coordination control, deep recurrent network, deep Q-learning, deep recurrent Q-learning.