Leading players in the autonomous vehicle industry have gathered millions of miles of data. Their challenge now is finding a scalable way to label that data accurately. Because perception depends heavily on the quality of the labeled data used to train neural networks, innovative annotation methods can speed up the labeling process while also improving accuracy.
We can leverage existing machine learning models to support humans in the labeling process. Training neural networks to perform time-consuming annotation tasks can increase the overall speed. For accuracy, we can implement a human-led quality assurance process. This hybrid model helps autonomous vehicle AI consistently maintain human-level perception.
Below are three innovative methods for data labeling.
Transfer learning for data labeling
Transfer learning is a machine learning method that mirrors the human ability to transfer skills. It uses the trained weights of a source model as the initial weights when training a target model on a new dataset. Much like a musician can learn a new instrument far more easily by drawing on existing knowledge, machine learning models can use knowledge learnt for one problem to aid in learning another. This makes transfer learning a prime method for speeding up the data annotation process.
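The idea of reusing source weights as initial weights can be illustrated with a deliberately tiny example. This is only a sketch under stated assumptions: a one-variable linear model stands in for a neural network, and the two made-up tasks (`source_ys`, `target_ys`), the learning rate, and the step counts are all hypothetical values chosen for illustration.

```python
def train(xs, ys, w, b, lr=0.05, steps=20):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data: two related tasks on the same inputs.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
source_ys = [2.0 * x + 1.0 for x in xs]  # source task
target_ys = [2.2 * x + 0.9 for x in xs]  # related target task

# Train the source model from scratch with a generous budget.
source_w, source_b = train(xs, source_ys, w=0.0, b=0.0, steps=2000)

# Give the target task the same small budget from two starting points:
# the source weights (transfer) and zeros (from scratch).
transfer = train(xs, target_ys, w=source_w, b=source_b, steps=20)
scratch = train(xs, target_ys, w=0.0, b=0.0, steps=20)

transfer_loss = mse(xs, target_ys, *transfer)
scratch_loss = mse(xs, target_ys, *scratch)
print(f"transfer: {transfer_loss:.4f}  scratch: {scratch_loss:.4f}")
```

With the same 20 training steps, the run that starts from the source weights ends with a lower error on the target task than the run that starts from zeros, because the two tasks are closely related.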
First, take an unlabeled image. Then feed the image into multiple specialized neural networks, each able to identify a specific kind of object, such as a car, lamp post, or pedestrian. These source neural networks create ‘pseudolabels’ on the unlabeled data. The resulting annotated data can then be fed into the target neural network. For self-driving cars, the target neural network is the main AI algorithm used to identify street objects.
Tolerance for error within self-driving AI algorithms is minimal, and inaccuracies of even a few percentage points can lead to an underperforming autonomous driving system. To prevent this and to ensure human-level accuracy, a trained human labeler needs to verify the pseudolabels, checking both the selections and the annotations. The human validation process can be further improved by ensuring the labelers are experts in their industry and are able to identify slight errors or misrepresentations.
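The pseudolabel-then-verify flow described above could be organized roughly as follows. This is a minimal sketch, not a real labeling tool: `PseudoLabel`, `pseudolabel`, and `human_review` are hypothetical names, the specialist detectors are assumed to be plain callables that return `(box, score)` pairs, and the reviewer is assumed to be a callable that returns a (possibly corrected) label or `None` to reject it.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass(frozen=True)
class PseudoLabel:
    category: str
    box: Box
    score: float            # confidence reported by the specialist model
    verified: bool = False  # set only after a human has checked the label


# Assumed interface: a specialist takes an image and returns (box, score) pairs.
Detector = Callable[[object], List[Tuple[Box, float]]]
# Assumed interface: a reviewer returns a corrected label, or None to reject.
Reviewer = Callable[[PseudoLabel], Optional[PseudoLabel]]


def pseudolabel(image: object, specialists: Dict[str, Detector]) -> List[PseudoLabel]:
    """Run every specialist model on the image and collect its pseudolabels."""
    labels = []
    for category, detect in specialists.items():
        for box, score in detect(image):
            labels.append(PseudoLabel(category, box, score))
    return labels


def human_review(labels: List[PseudoLabel], reviewer: Reviewer) -> List[PseudoLabel]:
    """Keep only labels that a human accepted or corrected, marked as verified."""
    verified = []
    for label in labels:
        decision = reviewer(label)
        if decision is not None:
            verified.append(replace(decision, verified=True))
    return verified
```

Only the output of `human_review` would be passed on as training data for the target network, so every pseudolabel is seen by a person before it is used.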
3D point cloud data annotation
Light Detection and Ranging (LiDAR) sensors are used in autonomous vehicles to identify 3D entities using point clouds. LiDAR uses light pulses to measure the distance between the sensor and other objects to form a 3D point cloud and determine the outline of an object.
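As a rough illustration of how a single pulse becomes a point in the cloud, the sketch below assumes a pulsed time-of-flight sensor that reports the round-trip time of each pulse together with the beam's azimuth and elevation angles. The function names and the axis convention (x forward, y left, z up) are assumptions made for this example.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(seconds: float) -> float:
    """Distance to the target: the pulse travels out and back, so halve it."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2.0


def to_point(range_m: float, azimuth_rad: float, elevation_rad: float):
    """Convert a range and two beam angles into an (x, y, z) point in meters."""
    horizontal = range_m * math.cos(elevation_rad)
    x = horizontal * math.cos(azimuth_rad)
    y = horizontal * math.sin(azimuth_rad)
    z = range_m * math.sin(elevation_rad)
    return (x, y, z)


# A pulse that returns after 200 nanoseconds hit something about 30 m away.
distance = range_from_round_trip(200e-9)
point = to_point(distance, azimuth_rad=math.radians(90), elevation_rad=0.0)
```

Repeating this for every pulse in a sweep yields the set of points that makes up the 3D point cloud.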
For a self-driving AI to recognize the objects captured in the 3D point clouds, the points must be segmented according to the object they belong to. As with other labeling techniques, this is very time consuming, especially considering the 3D nature of the data.
To increase the speed at which this can be completed, neural networks can be trained to segment data points into different objects, often by detecting clusters of points. Following segmentation, a human can then annotate each object. This removes a time-consuming portion of the work for the human labeler, but still requires the human to label each object correctly.
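To make the clustering step concrete, here is a small sketch that groups points by distance. Note that it swaps in a simple Euclidean region-growing method in place of a trained neural network, purely to show what "segmenting points into objects" produces. The function name, the `radius` parameter, and the sample points are hypothetical, and the brute-force neighbor search would be too slow for real point clouds.

```python
import math
from collections import deque


def euclidean_clusters(points, radius):
    """Group point indices so that each point is within `radius` of
    at least one other point in its group (single-linkage region growing)."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            current = queue.popleft()
            # Brute-force neighbor search over the points not yet assigned.
            neighbors = [
                i for i in unvisited
                if math.dist(points[current], points[i]) <= radius
            ]
            for i in neighbors:
                unvisited.remove(i)
                queue.append(i)
                cluster.append(i)
        clusters.append(sorted(cluster))
    return sorted(clusters)


# Two well-separated groups of points, standing in for two street objects.
points = [
    (0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.6, 0.1, 0.0),
    (5.0, 5.0, 0.0), (5.2, 5.1, 0.0),
]
segments = euclidean_clusters(points, radius=0.5)
```

Each resulting group of indices is one candidate object. The human labeler then only has to assign a class to each group instead of selecting the points by hand.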
Automatic multi-point selection via bounding boxes
Multi-point data annotation requires human labelers to draw the exact boundaries of an object within an image. Compared to bounding boxes, which only require a rectangle that encompasses the annotated object, multi-point annotation is more accurate but more time consuming.
We can achieve the speed of bounding boxes and the accuracy of multi-point selection by using a trained neural network to detect the edges of an object in an image. A human annotator provides a bounding box, and an application that uses the trained neural network creates a multi-point annotation for the object in the bounding box. The data annotator can then correct any mistakes in the annotation and add the label. This significantly increases the speed at which data annotators can perform data labeling.
For autonomous vehicles, simple bounding boxes trade data accuracy for annotation speed. They almost always contain background data that acts as noise to the machine learning model used in self-driving cars. Relying solely on bounding boxes will result in underperforming self-driving models. Multi-point selection, on the other hand, contains considerably less noise and is preferred for creating models whose output closely matches ground truth.
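One way to see how much background a bounding box pulls in is to compare its area with the area of the multi-point outline it encloses. The sketch below uses the shoelace formula for the polygon area. The function names and the example outline are hypothetical, and the area ratio is only a rough proxy for label noise.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its ordered (x, y) vertices (shoelace formula)."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def bounding_box(vertices):
    """Tightest axis-aligned box around the polygon: (x_min, y_min, x_max, y_max)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def background_fraction(vertices):
    """Share of the bounding box that lies outside the polygon outline."""
    x_min, y_min, x_max, y_max = bounding_box(vertices)
    box_area = (x_max - x_min) * (y_max - y_min)
    return 1.0 - polygon_area(vertices) / box_area


# A triangular outline fills exactly half of its bounding box.
triangle = [(0, 0), (4, 0), (0, 3)]
print(background_fraction(triangle))  # 0.5
```

For an irregular object such as a pedestrian or a cyclist, a large share of the box can be background, which is the noise that multi-point annotation avoids.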
Achieving human-level perception with hybrid data annotation models
All the methods described above leverage trained machine learning models to help humans in the labeling process. They all work off the assumption that the underlying model used for labeling is itself accurate and correct, only requiring human input for validation.
Here we reach a ‘chicken and egg’ problem: reliable ML algorithms are a powerful tool for labeling new data with minimal human supervision, but they need accurately labeled data in the first place in order to be trained and to mature.
Even with the initial human-labeled data, small errors by the labelers can add up to fundamental issues in the ML models. Rather than rushing toward automation, we can ensure better quality labeled data by upskilling labelers to become experts in specific areas.
iMerit’s services offer expert-level human labeling at >98% accuracy, which is guaranteed through a thorough Learning and Development process and industry-dedicated staff. iMerit has labeled millions of images, and its solution has two use cases for the autonomous vehicle industry:
- Train self-driving AI models with expert-labeled images in which all relevant information is annotated using multiple techniques
- Train data-labeling neural networks to create a fundamentally robust automated labeling process
With iMerit’s highly accurate labeled data, players in the autonomous vehicle industry can continue training their existing self-driving models to shorten their time to market, while also creating a sustainable automated labeling model by training neural networks for object recognition using either LiDAR or cameras.
Street data is much more valuable when labeled. Now that the industry has gathered a large amount of data, we can focus more of our efforts on annotating it with the highest level of precision and achieving superhuman safety in autonomous vehicles.