Data labeling tools provide comprehensive features to support labelers in the annotation process. Features as trivial as offering annotators a visual indicator of their previously assigned labels can increase labeling consistency by as much as 15%, as Microsoft found.
Regardless of the nature of your project, your data labeling tool must support all of the following:
- Ease of use – annotators must spend their energy annotating accurately rather than managing the labeling tool
- Speed – using a specialized tool must be faster than using all-purpose tools
- Consistency – enabling frameworks or guidelines to ensure consistency for each individual labeler and across teams
- Accuracy – allowing labelers to annotate with as little noise as possible
- Tracking – measuring performance indicators and other metrics for performance and project management
Below, we describe core features and labeling support features for each specific data type. We also present evaluation criteria applicable to all data labeling tools, regardless of data type.
Data Types for Machine Learning Projects
At a high level, there are six types of data that can be used for machine learning. Each has specific annotation types involving the identification and tagging of the data, such that the resulting labels will be a word or a short phrase.
- Image – these can range from human faces to plants and street signs, and are typically annotated with features such as bounding boxes or polygons
- Video – videos are treated as sequences of images and are typically approached in a similar way, with an added dependency on time
- Text – typically used for analyzing sentiment and intent. Text labeling identifies phrases, sentences, or keywords
- Audio – most common audio data use cases involve human speech. In this instance, labeling typically involves transcription
- Time-series – refers to data organized across time, such as stock market graphs or network performance
- Sensor – typically used in the autonomous vehicle industry, sensor data involves non-video information such as radar or Lidar
Tool Specialization for Data Types
Due to the innate difference between each type of data, tools have feature sets which apply to each specific use case. For example, computer vision annotation tools (CVAT) have features which support semantic segmentation, but are not applicable for sentiment analysis.
For each of the types of annotation below, we will describe the core features which the tool must provide.
Image Annotation Tool Features
Image annotation features typically have to support use cases such as image classification – where an AI algorithm can identify what the image is about; object detection – where an algorithm can identify and locate individual objects in a picture with multiple elements; and image segmentation – where the model can separate all the elements within an image.
To support high-quality labeling, image annotation tools provide features such as:
- Bounding boxes – an annotator will place a 2D rectangle around the object that requires labeling and correctly classify it.
- Cuboids – similar to bounding boxes, cuboids extend into 3D space by adding a depth dimension when labeling an object.
- Polygons – also informally known as ‘freehand drawing’, polygons allow annotators to add multiple points around the labeled object to accurately identify its borders without background noise.
- Semantic segmentation – classifies each pixel in an image and assigns it to an entity, such that every pixel in the image belongs to a specific group.
- Landmark annotation – landmarking is used for the precise detection of shapes, such as differently sized faces, in computer vision. The labeler adds a tag at points of interest or differentiation within an image. In the case of facial recognition, landmarks are used to determine the outline of the face and other key identifiers such as the eyes, mouth, and nose.
Industry-leading computer vision annotation tools (CVAT) can automate many labeling processes to improve the annotator’s workflow by taking a high-level input such as a bounding box and automatically creating more granular outputs such as polygons.
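To make the bounding-box-to-polygon idea concrete, here is a minimal sketch of how a box annotation might be stored and expanded. The field names follow the open COCO annotation format (`bbox` as `[x, y, width, height]`, `segmentation` as flat polygon coordinates); the specific IDs and values are purely illustrative:

```python
def bbox_to_polygon(bbox):
    """Expand a COCO-style [x, y, width, height] box into a rectangular
    polygon stored as flat coordinates [x1, y1, x2, y2, ...]."""
    x, y, w, h = bbox
    return [x, y, x + w, y, x + w, y + h, x, y + h]

# A COCO-style annotation record; IDs and coordinates are made up.
annotation = {
    "id": 1,
    "image_id": 42,
    "category_id": 3,  # e.g. "car" in this project's label map
    "bbox": [100.0, 50.0, 80.0, 40.0],
}

# Derive a coarse polygon from the box as a starting point for refinement.
annotation["segmentation"] = [bbox_to_polygon(annotation["bbox"])]
```

A real tool would then let the annotator drag the generated polygon's vertices to hug the object's true outline, rather than keeping the rectangular approximation.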
Video Annotation Tool Features
Because video files are just sequences of images, all the image annotation tool features described above apply. However, for an annotation tool to support video annotation, we recommend tools which support the following:
- Scene Classification – this is the practice of labeling a video by determining what is happening rather than the objects that are present. Consider an image with a car. A single image cannot help determine which way the car is moving or if it is stationary. This can only be determined via video, and scene classification allows labelers to classify selected scenes or entire clips by providing start and end timestamps.
- Object Tracking – continuously updating the position of an object within a video, frame by frame. Object tracking can be done with bounding boxes, cuboids, or polygons, and is a strong candidate for automation by the video annotation tool.
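One common way tools automate frame-to-frame tracking is intersection-over-union (IoU) matching: a box in the current frame is linked to the previous frame's box it overlaps most. This is a simplified sketch of that idea, not any particular vendor's implementation:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def link_boxes(prev_boxes, curr_boxes, threshold=0.5):
    """Greedily match each current-frame box to the previous-frame box
    with the highest IoU above the threshold; None means a new object."""
    links = {}
    for i, curr in enumerate(curr_boxes):
        best, best_iou = None, threshold
        for j, prev in enumerate(prev_boxes):
            score = iou(prev, curr)
            if score > best_iou:
                best, best_iou = j, score
        links[i] = best
    return links
```

Production trackers add motion models and appearance features on top of this, but the annotator's job stays the same: confirm or correct the proposed links.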
Text Annotation Tool Features for NLP
Text annotation in the context of natural language processing (NLP) refers to identifying underlying themes, sentiments, and phrases from a text written in natural language. A tool which can support the annotation of text datasets must provide features that enable the following:
- Sentiment analysis – this technique helps artificial intelligence algorithms determine whether a piece of text uses positive, negative or neutral phrasing. To do so, the labeled training datasets must identify keywords with respect to the context they are in.
- Parts of speech (POS) – this technique refers to labeling each word within a text with its adequate syntactic function. For example, labeling actions as verbs and objects as nouns. Part-of-speech tagging plays a critical role in disambiguation, such that an ML algorithm can correctly identify homonyms – words with more than one meaning – in a wider context.
Audio Annotation Tool Features for NLP
Audio annotation has some similarities to text annotation, but has an additional layer of complexity. Typically, audio annotation is used to help ML models understand speech, and then transcribe speech into text so that it can be further processed using text-based NLP. Audio annotation tools should have the following features:
- Audio transcription – perhaps the most straightforward way for an annotation platform to support audio annotation is to provide a transcription feature, where a labeler listens to an audio file and transforms speech into text. Further NLP activities are then carried out on the resulting text.
- Word and phrase tagging – compared to transcription, word or phrase tagging requires the labeler to assign a tag to a timestamped section of audio, such that a machine learning model can be trained to recognize the tagged speech.
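A timestamped audio tag boils down to a start time, an end time, a label, and optionally the transcribed text for that span. This is a minimal sketch of such a record, with a sanity check a tool would enforce; the field names and the example values are assumptions for illustration:

```python
def tag_segment(start_s, end_s, label, transcript=None):
    """Build a timestamped audio annotation segment. Times are in
    seconds; `transcript` is optional free text for that span."""
    if end_s <= start_s:
        raise ValueError("segment must have positive duration")
    return {"start": start_s, "end": end_s,
            "label": label, "transcript": transcript}

# e.g. tagging a hypothetical wake word inside a longer clip
segment = tag_segment(2.4, 3.1, "wake_word", transcript="hey assistant")
```

Rejecting zero- or negative-duration segments at entry time is a small example of the tool-side validation that keeps labeled audio data usable for training.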
Sensor Annotation Tool Features
Sensor data is becoming the standard approach in the autonomous vehicle space, where reliance on a single source, such as cameras, is not recommended. Other sectors, such as agriculture and government, also use AI models for use cases such as forecasting. The most common and reliable sensor data used today is provided by Light Detection and Ranging (lidar) sensors. Sensor annotation tools must support lidar labeling, with features such as:
- 3D point cloud annotation – for a self-driving AI to recognize the objects captured in 3D point clouds, the point clouds must be segmented according to the objects they belong to and accurately labeled.
- Polyline annotation – lidar polyline annotation helps autonomous vehicles detect street lanes on city roads and highways. Compared to polygons, polylines are open-ended, which makes them suitable for detecting long, narrow objects such as lane lines.
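The open-ended distinction is easy to see in code: a polygon closes back to its first vertex, a polyline does not. A small sketch (the coordinates are arbitrary; real lane lines would be 3D points from the lidar frame):

```python
import math

def polyline_length(points, closed=False):
    """Length of a polyline given as [(x, y), ...]. With closed=True the
    last point connects back to the first, i.e. a polygon perimeter."""
    pairs = list(zip(points, points[1:]))
    if closed:
        pairs.append((points[-1], points[0]))
    return sum(math.dist(a, b) for a, b in pairs)

lane_marking = [(0, 0), (3, 0), (3, 4)]          # open: a lane-line sketch
open_len = polyline_length(lane_marking)          # 3 + 4 = 7.0
ring_len = polyline_length(lane_marking, closed=True)  # adds the 5.0 closing edge
```

Because no closing edge is forced, a polyline can follow a lane marking off the edge of the annotated region, which a closed polygon cannot do cleanly.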
Time Series Analysis Tool Features
Time series datasets involve the representation of data points across time. The most common applications for time series AI models are in the finance industry, where stock and commodity prices are represented as single values at different points in time. Time series analysis and prediction applications can involve price forecasting based on trends. For an annotation tool to support time series analysis, it should provide a powerful user interface that allows granular control over the dataset, including:
- Multichannel labeling – this refers to having multiple time-series datasets within the tool’s workspace. With multichannel labeling, you can synchronize multiple channels to identify relationships that happen across individual datasets.
- UI performance – considering that time-series data may contain hundreds of thousands of data points, it is important for the data annotation tool to display, zoom and pan across the points without delays or loading times. UI performance is even more important when using multichannel labeling as the number of datapoints can multiply by orders of magnitude.
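One standard trick tools use to keep the UI responsive over hundreds of thousands of points is min-max downsampling: each bucket of raw samples is reduced to its minimum and maximum, so spikes stay visible at any zoom level. A simplified sketch of the idea (not a specific product's implementation):

```python
def minmax_downsample(values, bucket_size):
    """Reduce a long series for display by keeping each bucket's min and
    max, preserving spikes that plain striding would silently drop."""
    out = []
    for i in range(0, len(values), bucket_size):
        bucket = values[i:i + bucket_size]
        out.append(min(bucket))
        out.append(max(bucket))
    return out

# 6 raw samples -> 4 display samples, with the spike (9) preserved
display = minmax_downsample([1, 9, 2, 3, 8, 4], bucket_size=3)
```

When the user zooms in, the tool re-runs the reduction on the narrower window at a smaller bucket size, so annotations are always placed against the true underlying samples.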
Evaluation Criteria for All Annotation Tools
Whether your project requires image annotation or sensor data annotation, all AI annotation tools share a set of common criteria you should consider when evaluating labeling tools. The features applicable to all annotation tools include the following:
- File formats and extensions – the input and output formats of datasets. If your datasets come from multiple sources and have different formats, the annotation tool must be able to support all the file extensions. Similarly, the tool should be able to export the training data into the formats you need.
- Security – as a wider consideration, security can look at certifications such as SOC2, data storage policies, transmission security via VPNs, and user access controls.
- Learning curve – especially in instances where the labelers have not worked with the chosen annotation tool, the amount of time required to become proficient in using the tool is highly important. This is typically improved by simple user interfaces, technical documentation and vendor support.
- Integrations – this refers to the tool’s capability of connecting with other applications. Integrations can either be done in a bespoke way through API, SDK, or, in some cases, using out-of-the-box connectors provided by the supplier.
- Pricing – the tool’s pricing should be a reflection of all its features and use cases. In instances where your project requires a single annotation type (i.e. a computer vision annotation tool), the price should be lower than a comprehensive, all-purpose tool.
- Deployment model – this looks at how the tool will be deployed and where it will run from. Today, the most common deployment models include on-premise appliances (where you buy dedicated hardware to install and run), virtual appliances (where the tool runs as software on a local machine or in a virtual machine hosted in a public or private cloud), and Software-as-a-Service (where the tool is accessible through a web portal maintained by the provider).
- Project management and quality control – it is important to evaluate how datasets are managed, assigned to labelers, tracked for progress, and reviewed for quality assurance.
AI Annotation as a Service
Navigating a tool is challenging enough; selecting the humans to operate it can be even harder. iMerit equips teams of skilled labelers with comprehensive AI annotation tools to deliver high-quality data for all use cases. We support data scientists and artificial intelligence and machine learning engineers with a consistent flow of reliable datasets.
If you’re looking for more than just a tool for your next big project, then speak with an iMerit expert today.