In recent years, the rapid development of intelligent transportation systems (ITS) and autonomous driving has made accurate modeling of human driving behavior critical for improving traffic safety, efficiency, and the adaptability of autonomous systems. Traditional rule-based or utility-centric models, however, fail to handle the complexity, randomness, and scenario dependence of real driving. This study therefore explores inverse reinforcement learning (IRL), a data-driven method, for driving behavior modeling. We first review major IRL variants, including Maximum Margin IRL, MaxEnt Deep IRL, GAIL, Bayesian IRL, and IAL, and analyze their strengths (such as MaxEnt Deep IRL's adaptability to large state spaces) and their limitations in ITS. We then propose two frameworks: 1) an AOAT strategy based on MaxEnt IRL, which uses the HighD dataset and reduces lateral deviations by 42.91%-55.35% compared with fixed-weight schemes; and 2) a multi-agent framework that integrates multi-modal data and uses Bradley-Terry regression and the PPO algorithm for real-time traffic signal optimization. Finally, we discuss IRL's challenges, such as dataset reliance, poor reward function interpretability, and high computational cost, and propose future directions including standardized datasets, hybrid reward structures, and algorithm optimization. This study demonstrates IRL's value for human-centric modeling, laying a foundation for safer, more adaptable ITS.