What Makes ML DataOps Successful?

April 18, 2022

At the iMerit MLDataOps Summit 2021, iMerit CEO Radha Basu spoke with Facebook AI’s Ragavan Srinivasan about AI data solutions and the key elements of successful ML DataOps. Together the two discussed: 

  • The importance of context in AI development
  • The need for a single source of truth in ML DataOps
  • How companies are future-proofing their AI projects

AI for Societal Applications

Local context has tremendous implications on the success of AI. This is because AI applications, depending on where they’re meant to be applied, must take their environments into consideration. 

For example, within the autonomous vehicle industry, driver behavior, pedestrian behavior, and regulations all differ with the local context. Driving in Southeast Asia is a considerably different experience from driving in North America. In addition to locality, ethics continues to be a challenge within AI. For AI models to truly be successful, they must accomplish some sort of greater societal good. That can happen through the right mixture of technology, talent, and techniques. This entails not only performant models, but also representative data and ethical frameworks within which AI must operate.

“I don’t want to look ten years from now or five years from now even and say, ‘oh, we need to bridge the AI divide.’ I want to work on that today.”

Radha Basu

Rather than focusing only on technology and working ethics into the models retrospectively, we need to embed ethics in the technology itself. Because AI touches every element of human life, equity must be designed into it, whether for precision agriculture, healthcare, finance, recruiting, or research.

Removing Supply Chain Friction

With a long and convoluted supply chain, the AI industry depends on seamless end-to-end solutions that deliver consistently high-quality data. There is huge room for innovation in tooling, particularly in rethinking it with data at the center. This can mean providing something like a command and control center where practitioners can visualize the data as it moves through the entire lifecycle, then tweak it and learn from it.

“So we did a survey among data scientists, and 95 percent said high-quality annotated data is essential to the long-term success of your ML.”

Radha Basu, CEO and Founder of iMerit

To rethink tooling with data at the center, the infrastructure needs to give AI leads, project managers, developers, and labelers a single view of the ML DataOps pipeline. With proper dashboard functionality, these innovators would finally be able to track validation, monitoring, and security in a single place.
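As a rough illustration of what a "single view" could mean in practice, the sketch below rolls per-batch status records up into one summary per pipeline stage. It is only a minimal sketch: the `BatchStatus` record, its fields, the stage names, and the `pipeline_view` function are all hypothetical and are not taken from any iMerit tooling.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class BatchStatus:
    """One batch of data at one pipeline stage (all fields are hypothetical)."""
    batch_id: str
    stage: str            # e.g. "label" or "validate" (assumed stage names)
    passed_checks: bool   # whether the batch passed that stage's quality checks


def pipeline_view(batches: list[BatchStatus]) -> dict[str, dict[str, int]]:
    """Summarize total batches and failed checks for each stage."""
    totals = Counter(b.stage for b in batches)
    failures = Counter(b.stage for b in batches if not b.passed_checks)
    # Counter returns 0 for stages that have no failures.
    return {
        stage: {"batches": totals[stage], "failed": failures[stage]}
        for stage in totals
    }


batches = [
    BatchStatus("b1", "label", True),
    BatchStatus("b2", "label", False),
    BatchStatus("b3", "validate", True),
]
view = pipeline_view(batches)
```

A dashboard would render a summary like `view` rather than leaving each team to query its own stage separately, which is the point of the single source of truth.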

The AI industry and community are still in the very early stages of coming up with design patterns for how to think about data in an end-to-end system. The autonomous vehicle industry is perhaps at the forefront of maturing design patterns. Other industries can learn from the AV vertical, distill some of these design patterns, integrate them, and automate them into tools. If you make each problem small enough and focused enough, you can automate and solve it.

This will really help the space to bootstrap and accelerate the adoption of ML in so many other industries. Codifying some of these design patterns and building them into this tooling is a huge opportunity.

The Three S’s of Future-Proofing AI Projects

To help tackle issues that crop up throughout an AI project lifecycle, we recommend taking the following factors into consideration:

  1. Scaling a project into wider and wider markets requires a proportionate flow of high-quality labeled data that takes into consideration local idiosyncrasies, edge cases, and societal changes.
  2. Safety is especially important in industries such as healthcare or autonomous vehicles. It entails that all human agents that an AI interacts with are protected.
  3. Security must be integrated at every step of the way. This includes everything from infrastructure such as network firewalls, to processes such as identity and access management, to people through education and secure practices.

“I call that my three S’s, and I put it up there every day and look at it, the safety, the security, and the scaling.”

Radha Basu, CEO and Founder of iMerit

Bringing those three elements together can help set any AI project on the right path. Building an application on the pillars of scaling, safety, and security makes it easier to overcome issues down the line, such as larger deployments, cyberattacks, or model performance degradation that would put humans in danger.

Conclusion

Efficient, ethical, and centralized ML DataOps are behind every high-performing AI model. Without the necessary infrastructure, ethics, and centralization of operations, AI leads and project managers will struggle to hit their mark when it comes to model performance. If companies spend more time focusing on how to monitor, ensure, and achieve the highest-quality data, then it’s time they reevaluated their ML DataOps strategy.

If you wish to learn more about iMerit’s data annotation services, please contact us to talk to an expert.