Newly released data from a collaboration between U of T Engineering, the University of Waterloo and Scale AI will help train future self-driving cars to handle the challenges of winter driving.
This week, Professors Steven Waslander (UTIAS) and Krzysztof Czarnecki (University of Waterloo) and their teams unveiled the Canadian Adverse Driving Conditions (CADC) dataset. Based on scans of real Canadian roads, the dataset acts as a virtual training course for the computer algorithms that enable cars to drive themselves.
“There are lots of great training datasets out there already, but they were collected on sunny, summer days,” says Waslander. “If you take algorithms trained on those datasets and try to use them in adverse conditions, they tend to get confused. They can misclassify objects — such as pedestrians and other vehicles — or even miss them entirely, all because of the changes in sensor data caused by snowfall.”
To counter this challenge, the two professors decided to create a dataset that would capture what Waslander describes as “some of the worst conditions that you might see while trying to drive in Canada.”
“We want to engage the research community to generate new ideas and enable innovation,” says Czarnecki. “This is how you can solve really hard problems, the problems that are just too big for anyone to solve on their own.”
The dataset was created with the Autonomoose, a Lincoln MKZ hybrid that has been equipped with a full suite of sensors, including eight onboard cameras, a lidar (light detection and ranging) scanner and a GPS tracker. Waslander and Czarnecki developed the vehicle as a test bed for self-driving software, but the Autonomoose also has a recording mode that captures data at a rate of 10 images or scans per second.
Over the past two winters, the teams have taken the Autonomoose around southwestern Ontario, recording data from more than 1,000 kilometres of driving. Of this, approximately 33 kilometres in harsh, snowy conditions were selected to form the basis of the CADC dataset.
The teams partnered with Scale AI, a San Francisco-based AI infrastructure company, to label the data. Through a combination of computer and human image recognition, Scale AI tagged more than 178,000 instances of passing vehicles and more than 83,000 instances of pedestrians, along with many other objects.
“Data is a critical bottleneck in current machine learning research,” says Alexandr Wang, founder and CEO of Scale AI. “Without reliable, high-quality data that captures the reality of driving in winter, it simply won’t be possible to build self-driving systems that work safely in these environments.”
Finally, the teams conducted statistical analysis, processing and validation, placing the data into a format that can be parsed by currently available software. The result: a virtual environment that represents Canadian winter driving at its finest.
In addition to the dataset, the teams have provided full documentation and support tools on GitHub, and have posted a scientific article publicly on arXiv. All are open-access, available free of charge to researchers. (An additional license is required to use them for the development of commercial products.)
“We’re hoping that both industry and academia go nuts with it,” says Waslander. “We want the world to be working on driving everywhere, and bad weather is a condition that is going to happen. We don’t want Canada to be 10 or 15 years behind simply because conditions can be a bit tougher up here.”
Waslander and his team will also be making extensive use of the data in their future work.
“In my lab, we’re building up a strong research program trying to resolve the issues around winter driving perception,” he says. “We hope that the techniques we develop to locate and track objects in adverse weather will eventually be incorporated into future autonomous vehicle software packages around the world, making them safer for everyone.”