Iranian Journal of Operations Research

Language: en
Title: Constrained Multi-Objective Deep Reinforcement Learning for Safe and Fair Urban Traffic Signal Control
Subject: Other
Article type: Original research

Abstract: This paper presents a constrained multi-objective deep reinforcement learning framework for urban traffic signal control. The problem is modeled as a constrained Markov decision process in which an agent optimizes efficiency objectives while respecting explicit safety and fairness constraints. A dueling double deep Q-network (D3QN) is combined with a Lagrangian cost estimator to approximate both the reward value function and the cumulative constraint costs. The state representation includes queue lengths, phase indicators, and elapsed green times, and the action space consists of a small set of interpretable decisions, such as extending the current green or switching to the next phase. The proposed controller is trained and evaluated in a SUMO-based microscopic simulation of a four-leg urban intersection under various traffic demand patterns. Its performance is compared with fixed-time, vehicle-actuated, and unconstrained DQN controllers. Simulation results show that the proposed method can substantially reduce average delay and maximum queue length while keeping queue spillback and delay imbalance within predefined limits. These findings indicate that constrained multi-objective deep reinforcement learning offers a promising and practically deployable framework for safe and fair traffic signal control in congested urban networks, and that it can be extended to more complex corridors and network-wide settings in future work.

Keywords: adaptive traffic signal control, deep reinforcement learning, constrained Markov decision process, safe reinforcement learning, multi-objective optimisation, SUMO
Pages: 46-62
URL: http://iors.ir/journal/browse.php?a_code=A-10-6070-2&slc_lang=en&sid=1
Author: Sara Motamed
Email: motamed.sarah@gmail.com
Author ID: 00031947532846003036
Corresponding author: Yes
Affiliation: Department of Computer Engineering, FSh.C., Islamic Azad University, Fouman, Iran
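
The abstract pairs a reward value estimate with a Lagrangian cost estimate but does not give the update rules. As a minimal sketch of how such a combination is commonly set up (a generic Lagrangian relaxation of a constrained Markov decision process, not necessarily the authors' implementation), the agent can pick the action that maximises the reward value minus a multiplier times the cost value, and the multiplier can be raised whenever the observed constraint cost exceeds its limit. All function names, parameters, and numbers below are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def lagrangian_action(q_reward: Sequence[float],
                      q_cost: Sequence[float],
                      lam: float) -> int:
    """Return the index of the action maximising Q_r(s, a) - lam * Q_c(s, a)."""
    scores = [qr - lam * qc for qr, qc in zip(q_reward, q_cost)]
    return max(range(len(scores)), key=scores.__getitem__)


def update_multiplier(lam: float, avg_cost: float, limit: float,
                      lr: float = 0.05) -> float:
    """Projected dual ascent: increase lam when the cost exceeds its limit,
    decrease it otherwise, and never let it fall below zero."""
    return max(0.0, lam + lr * (avg_cost - limit))


# Hypothetical value estimates for two actions in one state:
# action 0 = extend the current green, action 1 = switch to the next phase.
q_reward = [5.0, 4.0]   # assumed efficiency (reward) estimates
q_cost = [3.0, 0.5]     # assumed constraint-cost estimates (e.g. spillback)

unconstrained_choice = lagrangian_action(q_reward, q_cost, lam=0.0)
constrained_choice = lagrangian_action(q_reward, q_cost, lam=1.0)
```

With a zero multiplier the sketch behaves like an unconstrained Q-learning policy and prefers the higher-reward action; with a larger multiplier the costly action is penalised and the lower-cost action is selected instead. In a full controller the two value lists would come from the trained networks, and the multiplier would be updated from constraint costs measured in simulation.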