We are interested in the theory of control, learning, and optimization of dynamical systems, and in its applications to intelligent transportation systems (ITS). Our research develops methodological tools for ITS applications, drawing on the following three research areas:
- PDE/ODE model-based control: backstepping boundary control
- Model-free optimization: extremum seeking
- Data-driven and learning-based control: reinforcement learning
We apply these methodologies to ITS topics including:
- Traffic control and estimation
- Macroscopic traffic modeling
- Mixed autonomy traffic flow
- Connected and autonomous vehicles
- Energy and risk management of transportation networks
Model-based control: Boundary control of stop-and-go traffic
Stop-and-go traffic. Source: Ford Motor Company.
This line of work provides control tools, previously unavailable, for suppressing stop-and-go oscillations in congested traffic using actuation that is very sparsely located along the freeway, such as ramp metering or variable speed limits. Macroscopic PDE models are particularly well suited to large-scale, congested traffic flow patterns such as stop-and-go traffic: their aggregated states (density, speed, and flow rate) evolve over continuous time and space. Focusing on several macroscopic-level traffic problems, we developed a PDE model-based control framework for boundary actuation and estimation. We employ the backstepping method, which requires sensing and actuation only at the boundaries to drive the continuous in-domain states to those of a desired reference system. The approach is practically relevant because point actuation and sensing sidestep the technical and financial limits on deploying sensors and actuators across large-scale transportation systems. We also consider boundary control of multi-lane, multi-class, and multi-segment freeway traffic.
- H. Yu and M. Krstic, “Traffic congestion control of Aw-Rascle-Zhang model,” Automatica, vol. 100, pp. 38-51, 2019. DOI: 10.1016/j.automatica.2018.10.040.
- H. Yu and M. Krstic, “Output Feedback Control of Two-lane Traffic Congestion,” Automatica, vol. 125, 2021. DOI: 10.1016/j.automatica.2020.109379.
- M. Burkhardt, H. Yu, and M. Krstic, “Stop-and-Go Suppression in Two-Class Congested Traffic,” Automatica, vol. 125, 2021. DOI: 10.1016/j.automatica.2020.109381.
- H. Yu, J. Auriol, and M. Krstic, “Simultaneous Output Feedback Stabilization of Freeway Traffic Flow on Two Connected Roads,” Automatica, under review.
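For orientation, the Aw-Rascle-Zhang (ARZ) model named in the first reference above is commonly written as the second-order hyperbolic system below. The notation is the usual textbook one and is an assumption here rather than something stated on this page: ρ(x,t) is the density, v(x,t) the speed, p(ρ) an increasing traffic pressure function, V(ρ) the equilibrium speed-density relation, and τ the relaxation time. The flow rate is q = ρv.

```latex
\begin{aligned}
\partial_t \rho + \partial_x(\rho v) &= 0,\\
\partial_t v + \bigl(v - \rho\,p'(\rho)\bigr)\,\partial_x v &= \frac{V(\rho) - v}{\tau}.
\end{aligned}
```

In this setting, boundary control means that the input enters only through a boundary condition at one end of the road segment, for example a flow rate set by ramp metering, while the states inside the segment evolve according to the equations above.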
Work toward practical implementation includes validating traffic state estimation on freeway data and event-triggered control for digital implementation.
- H. Yu, Q. Gan, A. M. Bayen, and M. Krstic, “PDE Traffic Observer Validated on Freeway Data,” IEEE Transactions on Control Systems Technology, vol. 29, pp. 1048-1060, 2021. DOI: 10.1109/TCST.2020.2989101.
- N. Espitia, J. Auriol, H. Yu, and M. Krstic, “Traffic flow control on cascaded roads by event-triggered output feedback,” International Journal of Robust and Nonlinear Control, under review.
Model-free optimization: Extremum seeking of optimal throughput
For traffic dynamics that cannot be described accurately by a model, we use extremum seeking (ES), a real-time, model-free, adaptive optimization approach. We applied ES to a downstream traffic bottleneck problem. Congestion forms upstream of the bottleneck because the incoming flow rate exceeds the bottleneck's capacity. Since the traffic dynamics at the bottleneck are hard to model, the optimal input density for the bottleneck area is unknown and must be found in order to maximize the discharge flow rate from the bottleneck. A small excitation signal perturbs the input density being tuned and yields an estimate of the gradient of the cost function. The ES controller is designed with a predictor feedback that compensates for its delay.
- H. Yu, S. Koga, T. R. Oliveira, and M. Krstic, “Extremum seeking for traffic congestion control with a downstream bottleneck,” ASME Journal of Dynamic Systems, Measurement, and Control, vol. 143(3), 2021.
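The perturb-and-demodulate idea behind extremum seeking can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the controller from the paper cited above: the quadratic `bottleneck_flow` map, every parameter value, and the function names are hypothetical, the map is static, and the delay with its predictor-feedback compensation is left out entirely.

```python
import math


def bottleneck_flow(density, opt_density=30.0, max_flow=1800.0, curvature=1.5):
    """Hypothetical static map: discharge flow (veh/h) vs. input density (veh/km).

    The controller below never reads opt_density; it only sees measured flow.
    """
    return max_flow - curvature * (density - opt_density) ** 2


def extremum_seek(flow_map, init_estimate=20.0, amplitude=1.0, omega=5.0,
                  gain=0.05, washout=1.0, dt=0.001, horizon=60.0):
    """Perturbation-based extremum seeking for maximizing a static map.

    All tuning values are illustrative assumptions, not taken from the source.
    """
    estimate = init_estimate             # current guess of the best density
    flow_avg = flow_map(init_estimate)   # low-pass filter state (slow mean of flow)
    for step in range(int(horizon / dt)):
        t = step * dt
        probe = math.sin(omega * t)                     # small sinusoidal excitation
        flow = flow_map(estimate + amplitude * probe)   # measured output
        flow_avg += dt * washout * (flow - flow_avg)    # track the slow mean
        # Demodulation: the high-passed output times the probe averages to
        # roughly the local gradient of the map at the current estimate.
        grad_est = (2.0 / amplitude) * (flow - flow_avg) * probe
        estimate += dt * gain * grad_est                # gradient ascent (maximize)
    return estimate
```

Calling `extremum_seek(bottleneck_flow)` from the initial guess of 20 veh/km should settle close to the assumed optimum of 30 veh/km, up to a small ripple caused by the probing signal. The design relies on time-scale separation: the probing frequency `omega` is fast relative to the washout filter, which in turn is fast relative to the adaptation set by `gain`.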
Learning-based control: Reinforcement learning control of traffic flow
Model-based approaches usually rely on assumptions about, and knowledge of, the system dynamics. For traffic systems, calibrating model parameters can be laborious and time-consuming, and the result is often tied to particular transient traffic conditions. Given the uncertain dynamics and the variety of performance metrics, an approach that adapts to different problems with modest tuning is desirable. Recent developments in reinforcement learning (RL) have enabled model-free control of high-dimensional continuous systems through a completely data-driven process. Model-free RL makes no prior assumptions about the model structure and learns a control policy directly through interaction with the system. We developed RL state-feedback controllers for congested traffic on a freeway segment. We used proximal policy optimization (PPO), a deep neural network-based policy gradient algorithm, to obtain the controllers through iterative training with a macroscopic traffic simulator. In a traffic system with perfect knowledge of the model parameters, the RL controllers perform comparably to conventional feedback controllers. Notably, RL controllers obtained from stochastic training outperformed the conventional controllers in an uncertain environment.
- H. Yu, S. Park, A. M. Bayen, S. Moura, and M. Krstic, “Reinforcement Learning versus PDE Backstepping and PI Control for Congested Freeway Traffic,” IEEE Transactions on Control Systems Technology, vol. 30(4), pp. 1595-1611, 2022.
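The text above names proximal policy optimization without stating what it optimizes. The sketch below computes the standard clipped surrogate objective of PPO for a batch of samples, as a reading aid only. It is not the training code behind the cited paper: the function name, argument names, and the `clip_eps=0.2` default are assumptions made for illustration, and a real controller would obtain the log-probabilities from a neural network policy and the advantages from simulator rollouts.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    updated policy and under the policy that collected the data.
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)   # probability ratio pi_new / pi_old
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes any incentive to move the ratio
        # outside the interval [1 - clip_eps, 1 + clip_eps].
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the updated policy equals the data-collecting policy, every ratio is 1 and the objective reduces to the mean advantage. The clipping only takes effect once an update would change the action probabilities by more than `clip_eps` in the direction the advantage favors, which keeps each policy update small.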