How do geospatial data annotation projects begin and – most importantly – how are they guided to a successful conclusion? And what counts as a successful conclusion? An ongoing iMerit engagement (as of late 2020) applying semantic segmentation to fields of crops (and some weeds) offers insights into the art and science of data annotation – and the crucial role of annotation methodology.
Arguably the key differentiator between the teams of data analysts a full-service annotation specialist such as iMerit deploys and approaches that depend more heavily on automated or crowdsourced annotation is the breadth and depth of experience a trained human-in-the-loop team brings to large, multi-workflow projects. The results (and the definition of success) are largely shaped by how data accuracy evolves over the initial weeks and months, through feedback-loop models designed to eliminate individual errors and build consistency across a team of annotators that can number in the hundreds.
When an industry-leading precision agriculture AI technology developer approached iMerit in April 2018, the data scientists building an AI algorithm for crop management applications were spending roughly $10 million a year on annotation output from five data labeling services, including two that relied on crowdsourcing. Why so many vendors? The teams of generalists each vendor supplied had an understandably high error rate that forced a damage-control methodology: each segmented image was labeled three times by separate annotators before the results were merged through a consensus-building discussion. The collated labeled images were then sent to a team of 15 United States-based agronomist specialists for quality assurance and corrections.
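The article describes the consensus step as a discussion among annotators, but the same "label three times, then merge" idea is often automated as a per-pixel majority vote over the three segmentation masks. A minimal sketch of that vote (the function name, the toy 2x2 masks, and the 0/1/2 class ids for soil/crop/weed are all illustrative assumptions, not the client's actual pipeline):

```python
import numpy as np

def merge_by_majority_vote(masks):
    """Merge per-pixel class labels from several annotators.

    masks: list of equal-shape integer label arrays, one per annotator.
    Returns the most frequent label at each pixel; ties go to the
    lowest class id.
    """
    stacked = np.stack(masks)                  # shape: (annotators, H, W)
    labels = np.unique(stacked)                # candidate class ids, sorted
    # Count, for each candidate label, how many annotators chose it per pixel.
    votes = np.stack([(stacked == lab).sum(axis=0) for lab in labels])
    # Pick the label with the most votes at each pixel.
    return labels[votes.argmax(axis=0)]

# Three hypothetical 2x2 masks: 0 = soil, 1 = crop, 2 = weed.
a = np.array([[1, 0], [2, 1]])
b = np.array([[1, 0], [2, 0]])
c = np.array([[1, 2], [2, 1]])
print(merge_by_majority_vote([a, b, c]))  # -> [[1 0]
                                          #     [2 1]]
```

A pure vote, of course, still inherits any error that two of three generalist annotators share – which is why the workflow above also needed the downstream agronomist QA team.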
The results – 40 to 60 percent accuracy, based on scoring by the agronomists working the last steps of the production chain – were imperfect, but arguably a realistic expectation for the overall approach.
In a departure from the brute-force system in place, iMerit worked with the client to build expertise from the ground up. Following a three-week training and development initiation for an initial 20-analyst labeling team, each labeler worked through a feedback loop with the agronomists to steadily improve accuracy. The results were illuminating: iMerit's labelers reached a steady 80 percent accuracy (now 95 percent or greater, depending on crop type, image quality, lighting, and other factors).
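The article doesn't specify how those accuracy figures were computed; one common way to track a labeler's progress in a feedback loop like this is to score each submitted mask against the agronomist-corrected "gold" version, using per-pixel accuracy or mean intersection-over-union (IoU). A minimal sketch under that assumption (function names and the toy masks are illustrative):

```python
import numpy as np

def pixel_accuracy(pred, gold):
    """Fraction of pixels where the labeler's mask matches the gold mask."""
    pred, gold = np.asarray(pred), np.asarray(gold)
    return float((pred == gold).mean())

def mean_iou(pred, gold, num_classes):
    """Mean intersection-over-union across classes that appear in either mask."""
    pred, gold = np.asarray(pred), np.asarray(gold)
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gold == c).sum()
        union = np.logical_or(pred == c, gold == c).sum()
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

# Hypothetical labeler mask vs. agronomist-corrected gold mask.
pred = np.array([[1, 1], [0, 2]])
gold = np.array([[1, 1], [0, 1]])
print(pixel_accuracy(pred, gold))  # -> 0.75
```

Feeding scores like these back to each annotator per image is what makes the error rate measurable – and the steady climb from 40-60 percent toward 95 percent trackable.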
The impact of the shift in approach (and vendor) has reverberated well beyond the immediate annotation process. The initial iMerit team of 20 data labelers has grown tenfold and replaced the five separate vendors, and the client was able to reduce the team of agronomists acting as an (expensive) QA backstop to a single specialist. The overall result? A 40 percent reduction in spending on data annotation, with the freed budget reallocated to expanded data collection.