High-quality data is the fuel that powers AI algorithms. Without a continual flow of labeled training data, bottlenecks can occur, and the algorithm will slowly degrade and add risk to the system.
From an article in TechCrunch by Kirsten Korosec.
It’s why labeled data is so critical for companies like Zoox, Cruise and Waymo, which use it to train machine learning models to develop and deploy autonomous vehicles. That need is what led to the creation of Scale AI, a startup that uses software and people to process and label image, lidar and map data for companies building machine learning algorithms. Companies working on autonomous vehicle technology make up a large swath of Scale’s customer base, although its platform is also used by Airbnb, Pinterest and OpenAI, among others.
The COVID-19 pandemic has slowed, or even halted, that flow of data as AV companies suspended testing on public roads — the means of collecting billions of images. Scale is hoping to turn the tap back on, and for free.
The company, in collaboration with lidar manufacturer Hesai, this week launched an open-source data set called PandaSet that can be used for training machine learning models for autonomous driving. The data set, which is free and licensed for academic and commercial use, includes data collected using Hesai’s forward-facing PandarGT lidar with image-like resolution, as well as its mechanical spinning lidar known as Pandar64. The data was collected while driving in urban areas in San Francisco and Silicon Valley before officials issued stay-at-home orders in the area, according to the company.
“AI and machine learning are incredible technologies with an incredible potential for impact, but also a huge pain in the ass,” Scale CEO and co-founder Alexandr Wang told TechCrunch in a recent interview. “Machine learning is definitely a garbage in, garbage out kind of framework — you really need high-quality data to be able to power these algorithms. It’s why we built Scale and it’s also why we’re using this data set today to help drive forward the industry with an open-source perspective.”
The goal with this lidar data set was to give free access to a dense and content-rich data set, which Wang said was achieved by using two kinds of lidars in complex urban environments filled with cars, bikes, traffic lights and pedestrians.
“The Zoox and the Cruises of the world will often talk about how battle-tested their systems are in these dense urban environments,” Wang said. “We wanted to really expose that to the whole community.”