Former U.S. Chief Data Scientist DJ Patil and Radha Basu on AI for Societal Applications

June 21, 2022

At the iMerit ML Data Ops Summit, iMerit CEO Radha Basu was joined by former White House Chief Data Scientist DJ Patil. Aside from transforming the U.S. federal government into a data-driven enterprise, DJ also has vast corporate experience where he has connected the dots between data science and practical societal applications.

In this talk, Radha and DJ discuss scaling machine learning projects throughout the project lifecycle, the issue of interoperability and international cooperation for aggregating datasets, how edge cases can scale projects, and the importance of curiosity and positive company culture for creating successful ML applications.

Scaling ML Projects

One of DJ’s most widespread ideas is the scalability model for technology-oriented projects. He recommends the following:

  • Prototype for 1x
  • Build for 10x
  • Engineer for 100x

This simple framework balances the resource investment in AI projects at different scales in order to help build an application that can be widely used when deployed in production. Prototype, the first stage, requires experimentation. It is the point when an idea is explored to see whether it is feasible, worth considering, and technically achievable.

For this early proof of concept, DJ recommends that the underlying technology stack and processes do not need to be complex or rigidly defined. Managing a project in the prototype phase with spreadsheets is realistic and expected. Rather than investing time and money in tooling and process definition, researchers should use common out-of-the-box tools and focus most of their energy on technical validation.

Build, the second stage, requires AI teams to lay the foundation of strong infrastructure. Rather than managing data with spreadsheets, the AI application needs to use scalable databases that can handle increases in data and can be easily managed. Following that, the engineering stage must have serious architecture, infrastructure and processes in place to scale and manage costs, and ensure service level agreements.

“The whole ecosystem needs to come up at 100x scale to make this work.”

– DJ Patil

The most important aspect of this scaling schema is to ensure that the end-to-end solution must be scaled up at the same time. Going from 1x to 10x and then to 100x requires that people, process, and technology get scaled at the same time. Having 100 people involved to manage data on a spreadsheet will only bottleneck the workforce. Having only one person involved to manage 100x data in a scalable database is just as inefficient.

Aggregating Datasets

While most AI projects have a lack of high quality data, there are some low-hanging fruit which do not get picked due to interoperability and red tape issues. One illustrative area is cancer research. We have unbelievable amounts of data spread across institutions and political entities. However, this data is siloed and cannot be brought together due to lack of cooperation and/or system interoperability. It might be the case that if we can aggregate all this data, we may already have a solution for the disease. To bring data together, we must consider devising interoperability frameworks, cross-institutional collaboration while also keeping in mind equity and equality.

Edge Cases Define Model Performance

“Edge cases will power us to do 1000x scale.”

– DJ Patil

Edge cases may make up less than 10% of real-world scenarios, which is a huge number. Investing a lot of time and resources in creating a model can get you very far, but edge cases will be an inherent challenge to the model. The key for a 1000x scale, an application in production deployed worldwide, is to put a process in place to iteratively learn and handle new edge cases.

Perhaps the most efficient way of handling use cases is by implementing humans-in-the-loop. These agents are able to intervene where the model fails to ensure service delivery and help capture the shortcomings of the model for future training.

Another way of thinking about edge cases is to think about nuance. How do we put nuance in the situation? We must be able to create models which take into consideration the intricacies of the real world and be applicable across all use cases. We have seen shortcomings in existing models where an application can perform well for one population but terrible for another, such as predictive policing or educational assessments.

Humans Determine the Model Input

DJ Patil has observed that the most important quality of anyone working in the AI space is curiosity and encourages enterprises to cultivate curiosity within their workforce. With a ‘curiosity over judgment’ mindset, tech enterprises need more people who question how things are done, and can generate new insights that can alter the course of a company. This is one of the qualities highlighted at iMerit’s Learning Academy, which covers topics such as communication, comprehension, culture, and confidence to upskill employees and build the tech-propelled, advanced workforce of the future.


“Edge caser is the curious learner at the center of the academy.”

– Radha Basu

Radha cited a labeling project focused on identifying different types of smoke. During the project, one person challenged a consensus between data scientists that initially concluded that smoke can easily be categorized into just a few buckets. The challenger had a different background from the data scientists, and went on to highlight that there are as many as nine types of smoke.


Following the new types of smoke categories, the project gained a whole new set of workflows around smoke. These proved invaluable in critical applications such as drone delivery or drone landing, or different autonomous mobility scenarios in bad weather conditions, to name a few. All because of a kernel of curiosity around smoke.

AI Mission Statement

As part of their efforts to become a data-driven enterprise, the U.S. federal government adopted the following mission statement – ‘To responsibly unleash the power of data to benefit all Americans’. Enterprises and organizations across the world can further expand on this idea further to “responsibly unleash the power of data and AI to benefit everyone”. To do so, AI projects can use the scalability guide to get their project from 1x to 100x, and tackle edge cases to scale further to 1000x. This will be possible as enterprises encourage a curiosity mindset, where everybody has the possibility of bringing input to the table.

Subscribe to Our Newsletter for AI Updates