Why Machine Learning Projects Fail

Start typing “artificial intelligence will change” into a search engine and you’ll see suggested sentence endings like “the world”, “everything in your life”, and “the face of business in the next decade”. Look a little deeper and it becomes clear that AI and machine learning projects are not just driving breakthroughs, but are integral to business success. According to a study by Accenture, 85% of leaders in capital-intensive industries say they will not achieve their growth goals unless they scale AI.

At the same time, research from MIT Sloan suggests that the gap between organizations that are successfully leveraging data science and those that struggle to do so is widening. As we know, data science and machine learning are driving AI applications because it is through data processing that AI learns to interpret our world and react as we want. If AI is to have a real impact on businesses and their customers, businesses need a new approach to machine learning. As the MIT Technology Review concludes, “the way we train AI is fundamentally flawed.”

Numerous articles in publications such as Towards Data Science and Open Data Science go through the reasons machine learning projects fail with a fine-tooth comb, leaning heavily on technical jargon. These articles are great if you’re a data scientist, but not so helpful if you’re a business trying to figure out why the conversational assistant or personalization campaign you’ve invested thousands in has never taken off.

The reality is that your machine learning project probably didn’t fail because you messed up your approach to data versioning or model deployment. Most machine learning projects fail simply because companies didn’t have the right resources, expertise, or strategy in place from the start. McKinsey’s 2021 State of AI report corroborated this, reporting that the companies seeing the greatest bottom-line impact from AI adoption follow both baseline and industry-leading best practices, and spend on AI more effectively and efficiently than their peers.

Five Common AI Mistakes Businesses Make

Through our work on ML projects for some of the world’s largest companies, Applause has identified a set of common mistakes that reduce efficiency, increase costs, push back deadlines, and are ultimately the reasons machine learning projects fail.

Common Mistake 1: Misjudging the Resources Needed to Train ML Algorithms

The number one reason machine learning projects fail is that companies are unprepared and ill-equipped to carry them out. According to Dimensional Research, 8 out of 10 companies find machine learning projects more difficult than expected because they underestimate the work required to train models. This is why so few data science projects make it to production; without a clear understanding of the resources and expertise needed, businesses hit insurmountable obstacles or burn through budget due to inefficiencies. The thing they misjudge most is the effort it takes to get the training data right, which brings us to common mistake number two.

Common Mistake 2: Relying on Data Brokers to Provide One-Size-Fits-All Training Data

Companies don’t struggle to get training data. After all, there are many data vendors out there selling large volumes of training data at low prices. Machine learning projects fail because companies struggle to get high-quality training data.

By buying one-size-fits-all data from vendors, companies don’t get data specific enough for their machine learning project needs. To understand why, consider the example of an online exercise class provider building a digital personal trainer (PT). For the PT to recognize poor form and recommend improvements, it must be trained with data that goes beyond images of individuals in different exercise positions. It must also be able to recognize individuals at different levels of exhaustion and sweating, wearing different clothing, and with different levels of fitness and expertise.

There are many other issues with pre-packaged training datasets, including:

  • There is no guarantee that the data represents the balance of ages, genders, races, accents, etc. needed to reduce bias

  • The data was not annotated at all or was not annotated in a way that makes sense to the algorithm

  • The data has not been verified for compliance with data standards required by global AI regulations, such as the proposed European Artificial Intelligence Act (EU AIA)

  • Organizations cannot be sure that the correct data privacy and security measures have been followed, nor receive advice on how to protect data integrity in the future

To run truly successful machine learning projects, companies need to view training data as something they need to curate rather than simply source.
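The first two issues in the list above, unbalanced representation and missing annotations, can be checked before any training starts. The following is a minimal sketch in Python, not a complete bias audit. The record fields (`age_band`, `label`), the sample data, and the 20% minimum share are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def audit_dataset(records, attribute, min_share=0.10):
    """Report each group's share for one attribute and the unlabeled fraction."""
    if not records:
        raise ValueError("records must not be empty")
    total = len(records)
    # Count how many records fall into each group of the chosen attribute.
    counts = Counter(r.get(attribute, "unknown") for r in records)
    shares = {group: n / total for group, n in counts.items()}
    # Groups whose share falls below the minimum are flagged for more collection.
    underrepresented = sorted(g for g, s in shares.items() if s < min_share)
    # A record with a missing or empty label counts as not annotated.
    unlabeled = sum(1 for r in records if not r.get("label"))
    return {
        "shares": shares,
        "underrepresented": underrepresented,
        "unlabeled_fraction": unlabeled / total,
    }


# Hypothetical purchased dataset for the digital personal trainer example.
records = (
    [{"age_band": "18-30", "label": "good_form"}] * 6
    + [{"age_band": "31-50", "label": "poor_form"}] * 3
    + [{"age_band": "51+", "label": None}] * 1
)

report = audit_dataset(records, "age_band", min_share=0.2)
print(report["underrepresented"])
print(report["unlabeled_fraction"])
```

A check like this only surfaces gaps in attributes that were recorded in the first place; deciding which attributes matter for a given product remains a judgment call for the team.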

Common Mistake 3: Underestimating the Extent to Which AI Development Requires Constant Iteration

Buying data from vendors not only has ramifications for the quality of training data, but also makes the rest of the AI training process infinitely more difficult.

Training ML algorithms is not a one-time process. Once training is underway, developers will continually need to request changes to the collected data as the needs of the model become clearer. Indeed, training an AI algorithm is like trying to grocery shop and cook at the same time: you may think you have all the ingredients you need, but once you start cooking, you realize you forgot an ingredient, need to swap one out, or the balance of ingredients isn’t right, and you need to keep modifying your recipe accordingly.

In machine learning, it’s hard to know exactly what data you need until you start training the algorithm. You may realize that the training set is not large enough or that there has been a problem with the way the data was collected. Many data brokers have strict modification policies, or offer no ability to modify orders at all, leaving AI developers with data they cannot use and no choice but to buy another training set that meets their new requirements. This is a common bottleneck for many businesses that drives up costs, pushes back deadlines, and reduces efficiency. Ultimately, it is a major reason machine learning projects fail.

Common Mistake 4: Not Integrating QA Testing

Companies in all industries often fail to integrate quality assurance (QA) testing into all stages of the product development process. QA is wrongly treated as an add-on, a formality to double-check that a product works correctly, rather than as a tool for optimizing the product iteratively.

One of the reasons machine learning projects fail is that this attitude toward quality assurance testing is untenable given the realities of AI development. Unlike in traditional software development, you can’t fix bugs with a simple software update; errors discovered during the QA testing stage can only be corrected by redoing the entire process. If your AI isn’t working as expected, it’s probably because there was a problem with the training data or the training data skewed the model in the wrong direction. Either way, this means going back to the first stage and sourcing new training data.

Companies that don’t embed validation of results into all stages of the AI development process create more work for themselves. Rather than training the algorithm with a gigantic dataset and then testing the AI, companies need to train and test more iteratively. Adopting an agile and integrated approach to testing will help reduce unnecessary expense, speed up turnaround times, and enable more efficient allocation of resources.
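To make the idea of training and testing iteratively concrete, here is a minimal Python sketch under simplifying assumptions. The classifier is a deliberately toy one (a single threshold on one number), and the batches, held-out set, and 0.9 accuracy target are hypothetical. The point is only the loop structure: test after every batch of data instead of once at the end.

```python
def train_threshold(samples):
    """Fit a toy 1-D classifier: a threshold halfway between the class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    if not pos or not neg:
        return None  # cannot fit until both classes have been seen
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, samples):
    """Fraction of held-out samples classified correctly."""
    if threshold is None:
        return 0.0
    correct = sum(1 for x, y in samples if (x > threshold) == (y == 1))
    return correct / len(samples)


def train_iteratively(batches, holdout, target=0.9):
    """Add one batch at a time and test after each round, not only at the end."""
    seen, history = [], []
    for round_no, batch in enumerate(batches, start=1):
        seen.extend(batch)
        score = accuracy(train_threshold(seen), holdout)
        history.append((round_no, len(seen), score))
        if score >= target:
            break  # good enough: stop before collecting more data
    return history


# Hypothetical data: the first batch happens to contain only one class.
batches = [
    [(1.0, 0), (2.0, 0)],
    [(9.0, 1), (3.0, 0)],
    [(8.0, 1), (7.0, 1)],
]
holdout = [(2.5, 0), (4.0, 0), (6.5, 1), (8.5, 1)]

history = train_iteratively(batches, holdout)
for round_no, n_seen, score in history:
    print(f"round {round_no}: {n_seen} samples, accuracy {score:.2f}")
```

In this toy run the first round exposes a data problem (one class is missing) after two samples rather than after the whole dataset has been bought, and the loop stops as soon as the target is met, so the third batch is never needed.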

Common Mistake 5: Not Scheduling Frequent Reviews

The final reason machine learning projects fail is that companies celebrate their success too soon.

AI projects are never really finished. Even if an AI experience fully meets expectations for accuracy and performance, it has still only been trained on data that reflects society as it is today. The algorithm has learned to make decisions based on opinions, dialogues, and images that are already changing. Think of natural language processing (NLP) applications: they only know how to communicate because they have been trained on real conversations with people. Since around 5,400 new words are created each year in the English language alone, NLP applications will lose accuracy very quickly.

If AI experiences are to continue to be useful to customers, they must be retrained as social attitudes, technological developments, and terminologies change.
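One simple signal that a language model is falling behind is the share of words in recent customer messages that it never saw during training. The Python sketch below illustrates that idea only; it is an assumption about how a review could be automated, not a description of any particular product. The vocabulary, the sample messages (including the invented words), and the 15% threshold are hypothetical.

```python
def oov_rate(vocabulary, messages):
    """Share of tokens in the messages that are missing from the vocabulary."""
    tokens = [t for m in messages for t in m.lower().split()]
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in vocabulary)
    return unknown / len(tokens)


def needs_retraining(vocabulary, messages, threshold=0.15):
    """Flag the model for review once unknown words exceed the threshold."""
    return oov_rate(vocabulary, messages) > threshold


# Hypothetical vocabulary captured at training time.
vocabulary = {"book", "a", "class", "for", "me", "cancel", "my"}

# Messages from launch versus messages from a later review period.
launch_messages = ["Book a class for me", "Cancel my class"]
recent_messages = ["Book a hyrox class for me", "Cancel my doomscroll class"]

print(needs_retraining(vocabulary, launch_messages))
print(needs_retraining(vocabulary, recent_messages))
```

Running a check like this on a schedule turns the review into a routine task, although unknown-word rate is only a rough proxy and real accuracy should still be measured against freshly labeled data.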

How to Ensure the Success of Machine Learning Projects

What companies need is a programmatic approach to developing AI. Rather than viewing each step in the process as a separate project, companies should bring the steps together as part of a holistic program. AI development is an iterative, agile process in which teams must work in tandem, not in silos, all governed by a program manager accountable for the success of the program.

To learn more about how your business can implement a program approach to create truly valuable AI experiences for your customers, download our whitepaper: Create a Global ML/AI Data Collection and Quality Program.


Sherry J. Basler