A team from the University of California’s (UC) Lawrence Berkeley National Lab is working with the California Department of Transportation (Caltrans) to use high-performance computing (HPC) and machine learning to improve the agency’s real-time decision-making when highway incidents occur.
Urban traffic roughly follows a periodic pattern associated with the typical ‘9 to 5’ working week schedule. However, when an accident happens, traffic patterns are disrupted. Designing accurate traffic flow models for use during incidents is a major challenge for traffic engineers, who must adapt to unforeseen traffic scenarios in real time. The new research was done in conjunction with California Partners for Advanced Transportation Technology (PATH), part of UC Berkeley’s Institute of Transportation Studies (ITS), and Connected Corridors, a collaborative program to research, develop, and test an Integrated Corridor Management approach to controlling traffic on major routes in California.
Caltrans and Connected Corridors are implementing the new system on a trial basis in Los Angeles County through the I-210 pilot project. Using real-time data from partners in southern California at the city, county, and state level, the goal is to improve Caltrans’ real-time decision-making by executing coordinated multijurisdictional traffic incident response plans to limit the negative impacts of these events. The first iteration of this system will be deployed in the cities of Arcadia, Duarte, Monrovia, and Pasadena in 2020, with plans for future deployments around the state.
The new system uses ‘ensemble learning’, which combines a diverse set of learners (individual models) to improve, on the fly, the stability and predictive power of the overall model. Although machine learning researchers have explored the concept for some time, traffic flow modeling is a special case because of its temporal character: traffic flow measurements are correlated over time, as are the predictions from the different individual models. In the Berkeley Lab-Caltrans collaboration, the ensemble model takes into account the mutual dependency of the sub-models and assigns each a ‘share of the vote’ that balances its individual performance against its co-dependency with the others. The ensemble model also weights recent prediction performance more heavily than older historical performance. In testing, the combined model outperformed every single model in both prediction accuracy and stability.
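The weighting idea described above can be illustrated with a minimal sketch. It is not the Berkeley Lab algorithm, whose details the article does not give. It swaps in a classical minimum-variance forecast combination: each sub-model's share is derived from a recency-weighted matrix of past prediction errors, so that models with small recent errors gain weight and models whose errors move together share their weight. The function names, the `decay` and `ridge` parameters, and the clipping of negative shares are all assumptions made for illustration.

```python
import numpy as np


def ensemble_weights(errors, decay=0.9, ridge=1e-6):
    """Illustrative 'shares of vote' for M sub-models (hypothetical helper).

    errors: array of shape (T, M) holding past prediction errors,
            oldest row first, one column per sub-model.
    decay:  assumed recency factor; 1.0 treats all history equally,
            smaller values favor recent performance.
    ridge:  small diagonal term that keeps the linear solve well conditioned.
    """
    errors = np.asarray(errors, dtype=float)
    n_steps, n_models = errors.shape

    # Recency weights: the newest row gets weight 1 before normalization,
    # and each older row is discounted by another factor of `decay`.
    recency = decay ** np.arange(n_steps - 1, -1, -1)
    recency /= recency.sum()

    # Recency-weighted second-moment matrix of the errors. Diagonal entries
    # measure each model's own error; off-diagonal entries measure how
    # strongly two models err together (their co-dependency).
    cov = (errors * recency[:, None]).T @ errors
    cov += ridge * np.eye(n_models)

    # Minimum-variance combination: weights proportional to cov^{-1} * 1.
    raw = np.linalg.solve(cov, np.ones(n_models))

    # Heuristic (an assumption): clip negative shares so the result reads
    # as a vote, then normalize so the shares sum to 1.
    raw = np.clip(raw, 0.0, None)
    return raw / raw.sum()


def combine(predictions, weights):
    """Weighted vote over the sub-models' current predictions."""
    return float(np.dot(predictions, weights))
```

In use, `errors` would be refreshed after every prediction interval so that the shares track each sub-model's most recent behavior, for example `combine(current_predictions, ensemble_weights(recent_errors))`, where both argument names are placeholders.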
Using data collected from Caltrans sensors on California highways, the project yielded novel algorithms that achieved accurate prediction on a 15-minute rolling basis. The team then validated and integrated the new algorithms using real-time traffic data from the Connected Corridors system: a streaming-based, real-time transportation data hub in which Spark MLlib (a scalable machine-learning library) provides machine-learning models that can be used within the proposed ensemble-learning framework. Specifically, the implementation generated predicted traffic flows at points on the freeway where sensors were present. These predictions could in turn be used to predict traffic demands at freeway entrances and traffic flows at freeway exits.
“Many traffic-flow prediction methods exist, and each can be advantageous in the right situation,” said Sherry Li, a mathematician in Berkeley Lab’s Computational Research Division (CRD). “To alleviate the pain of relying on human operators who sometimes blindly trust one particular model, our goal was to integrate multiple models that produce more stable and accurate traffic predictions. We did this by designing an ensemble-learning algorithm that combines different sub-models.”