Survey: 96% of Enterprises Encounter Training Data Quality and Labeling Challenges in Machine Learning Projects

Research finds artificial intelligence is still emerging, driving training data issues for AI and machine learning initiatives

AUSTIN, Texas -- IDC predicts worldwide spending on artificial intelligence (AI) systems will reach $35.8 billion in 2019, and 84% of enterprises believe investing in AI will lead to greater competitive advantages (Statista). However, nearly eight out of 10 enterprise organizations currently engaged in AI and machine learning (ML) report that projects have stalled, and 96% of these companies have run into problems with data quality, data labeling required to train AI, and building model confidence, according to information released today from Alegion.

Data issues are causing enterprises to quickly burn through AI project budgets and face project hurdles. The new report, “Artificial Intelligence and Machine Learning Projects Obstructed by Data Issues,” is based on a survey conducted by Dimensional Research. The findings include feedback from 227 participants, including data scientists and business stakeholders involved in active enterprise AI and ML projects, and address the maturity of ML in the enterprise, today’s ML project challenges, and the tools and resources used in these projects.

“The single largest obstacle to implementing machine learning models into production is the volume and quality of the training data,” said Nathaniel Gates, CEO and co-founder of Alegion, a training data platform for AI and ML initiatives. “This research reinforces our own experience, that data science teams new to building ROI-driven systems try to tackle training data preparation in house, and get overwhelmed.”

Large businesses with more than 100,000 employees are most likely to have an AI strategy – but only 50% of them currently have one, according to MIT Sloan Management Review. Alegion’s survey reinforces this finding that AI is still nascent in the enterprise:

  • 70% report that their first AI/ML investment was within the last 24 months
  • Over half of enterprises report they have undertaken fewer than four AI and ML projects
  • Only half of enterprises have released AI/ML projects into production

To get AI systems off the ground, training data must be voluminous and accurately labeled and annotated. With AI becoming a growing enterprise priority, data science teams are under tremendous pressure to deliver projects but frequently are challenged to produce training data at the required scale and quality.

Alegion’s survey respondents echoed these observations:

  • 78% of their AI/ML projects stall at some stage before deployment
  • 81% admit the process of training AI with data is more difficult than they expected
  • 76% combat this challenge by attempting to label and annotate training data on their own
  • 63% go so far as to try to build their own labeling and annotation automation technology
  • 71% of teams report that they ultimately outsource training data and other ML project activities


Enterprise data scientists, other AI technologists and business stakeholders involved in active AI and ML projects were invited to participate in a survey on their company’s use and development of AI and ML projects. The survey was administered electronically, and participants were offered a token compensation for their participation. A total of 227 participants completed the survey, representing five continents and 20 industries.

Download a free copy of the report here.

About Alegion

Alegion is an Austin-based technology company that provides the most powerful and flexible annotation platform for training data on the market. It accelerates model development for the most sophisticated and subjective use cases. It uses integrated ML and has unique capabilities, like conditional logic, iterative tasks, and multi-stage workflows, that are essential for high quality at scale. The entire process is managed by our highly experienced and consultative team that configures and executes the platform to meet your business needs.

For more information, visit

About Dimensional Research

Dimensional Research provides practical marketing research to help technology companies make smarter business decisions. Our researchers are experts in technology and understand how corporate IT organizations operate. Our qualitative research services deliver a clear understanding of customer and market dynamics.

For more information, visit


Caitlin Haskins
10Fold Communications

Release Summary

Alegion announces survey report: “Artificial Intelligence and Machine Learning Projects Obstructed by Data Issues” by Dimensional Research.

