Paxata Named an Innovator in the Use of Artificial Intelligence and Machine Learning for Data Integration and Preparation by EMA Research

New Research Finds Data Integration and Preparation a Priority for the Use of AI and ML in Analytics and Data Management

REDWOOD CITY, Calif. -- Paxata, the pioneer in self-service data preparation, today announced that it was named an innovator in the Enterprise Management Associates® (EMA™) "Innovation in the Use of Artificial Intelligence (AI) and Machine Learning (ML) for Data Integration and Preparation" Top 3 report. According to the findings, more than half of all participants (52 percent) said that using AI or ML to automate the data preparation or integration process is important to their organization. Because data integration and preparation play a prominent role in any analytics project, the report stated that AI-enablement should be a priority for analytics leaders at all levels, as it gives organizations the ability to overcome the constraints of legacy or less-automated data processing. The report is available as a complimentary download.

“The next major shift in the analytics, business intelligence, and data management markets is coming from the use of AI and ML across the entire information supply chain. Along with using machine learning to find the next-best offer, companies can now point algorithms at modern data platforms to find links between data sets, automate data preparation, or detect breaches in data governance,” said John Santaferraro, Research Director at EMA and lead author of the report. “Vendors like Paxata that excel in the use of AI and ML in their analytics, business intelligence, and data management platforms will create significant differentiation and barriers to entry that will change the face of all vertical industries.”

Asked which inherent capabilities are most important in a data integration and preparation tool, three out of four participants named automated data profiling as the top criterion, followed by data cleansing recommendations (60 percent). Data integration and preparation was considered the most time-consuming activity in every analytics project, with data profiling and cleansing the most time-consuming aspects of data integration. At the low end, only 14 percent of participants were willing to surrender control of data preparation to automated tasks.

EMA built a scoring model based on the priorities that 155 randomly selected participants set for the use of AI and ML in data integration and preparation platforms, and selected Paxata for its comprehensive coverage of the different AI-enabled capabilities. This is particularly important because, according to the report, organizations that are first in their industries to implement these capabilities can expect an advantage over their competitors.

More specifically, Paxata was recognized for:

  • Automated Data Profiling: Paxata Rapid Data Profiling provides a one-click profile button that scans an organization’s entire dataset and generates a summary scorecard showing an assessment of its content and quality. Paxata is unique in its ability to apply its algorithms across the entire body of the data, while sample-based solutions inherently profile subsets of the data and therefore fail to identify outliers and accurate patterns.
  • Data Cleansing Recommendation: Paxata uses algorithmic techniques to offer insight into data quality issues. Paxata is unique in its anomaly detection because it applies its algorithms across the entire body of the data (hundreds of millions of rows), while sample-based solutions inherently fail to surface data quality issues unless the user goes through many iterations and eventually processes the full data to reach the same level of confidence in data quality.
  • Data Structure Identification: Paxata’s Intelligent Ingest detects source types, compression formats, and schemas, and infers the content of extension-less files. Intelligent Ingest then transforms each structure into a tabular format suited to point-and-click interactive profiling and preparation.
  • Correlation or Relationship Recommendation: Paxata uses machine learning algorithms to detect joins and overlaps within different data sets. The algorithms work with the data content as well as metadata to provide a confidence score for the detected joins.
  • Automated Data Preparation: Paxata’s Intelligent Automation auto-discovers dependent data preparation projects and data sets and creates multi-project data flows that can be operationalized from a single point. The automation can run on demand or can be scheduled to run continuously without triggers.
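
The idea behind a relationship recommendation with a confidence score can be illustrated with a small, generic sketch. This is not Paxata's algorithm, which the report does not describe. It assumes a simple Jaccard overlap of distinct column values as the score, looks only at data content (not metadata), and uses hypothetical names (`overlap_score`, `suggest_joins`, `orders`, `customers`) chosen for illustration.

```python
from itertools import product


def overlap_score(left_values, right_values):
    """Return the Jaccard overlap of two columns' distinct non-null values."""
    left = {v for v in left_values if v is not None}
    right = {v for v in right_values if v is not None}
    if not left or not right:
        return 0.0
    return len(left & right) / len(left | right)


def suggest_joins(table_a, table_b, threshold=0.5):
    """Rank column pairs whose value overlap meets the threshold.

    Each table is a dict mapping a column name to a list of values.
    The threshold is an arbitrary illustrative cutoff, not a recommended value.
    """
    candidates = []
    for (name_a, col_a), (name_b, col_b) in product(table_a.items(), table_b.items()):
        score = overlap_score(col_a, col_b)
        if score >= threshold:
            candidates.append((name_a, name_b, score))
    # Highest-scoring candidate join first.
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Hypothetical sample data: two small tables that share customer identifiers.
orders = {"customer_id": [1, 2, 3, 4], "amount": [10.0, 25.5, 7.25, 40.0]}
customers = {"id": [1, 2, 3, 5], "region": ["east", "west", "east", "north"]}

suggestions = suggest_joins(orders, customers)
for left_col, right_col, score in suggestions:
    print(f"{left_col} <-> {right_col}: confidence {score:.2f}")
```

Here the only suggested join is `customer_id` with `id`, since three of the five distinct values across both columns are shared. Because the score is computed over every value in each column rather than a sample, a single non-matching key affects the result.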

“We are extremely excited to be recognized in this report for the innovation that is foundational to our platform. We see customers across the globe continue to demonstrate how AI-enabled data preparation and integration have become instrumental to achieving a competitive advantage, a sentiment that was shared by more than three out of five (66 percent) respondents,” said Prakash Nanduri, Co-Founder and CEO at Paxata. “Dating back to 2012, when we created the industry’s first self-service data preparation solution, we have been committed to helping business consumers visually discover, profile, and clean data themselves in order to achieve significant business results. As the report mentions, we will continue to innovate by leveraging modern approaches such as AI and ML to enhance our product and generate even greater value.”

Research Methodology

All research results in this report are based on EMA’s survey of 155 randomly selected North American enterprise and midmarket data and analytics professionals. EMA research identified trends, adoption, drivers, and priorities for the use of AI and ML in five categories: 1) data preparation and integration, 2) data warehousing and big data platforms, 3) business intelligence, 4) analytics and data science, and 5) data catalog, master data management, and data governance. For each of the five categories, EMA identified 10-12 key AI or ML capabilities. The research provided input regarding the capabilities that were most important to the participants.

About the EMA Top 3 reports

EMA Top 3 reports identify priorities organizations operationalize when overcoming challenges or achieving an unfair advantage in analytics or IT management focus areas. The intent of their reports is to inform and inspire influencers and decision makers in their portfolio planning and vendor selection process. While EMA internally conducted a detailed analysis of solutions that help support the identified analytics or IT management priorities, reports are not designed to provide a feature-by-feature comparison for the entire product category. Additionally, some popularly adopted approaches may not be represented in this report because EMA’s analysis did not indicate they are fully addressing emerging market requirements. This guide was developed as a resource for organizations to gain insights from EMA’s extensive experience conducting thousands of product briefings, case studies, and demonstrations. To learn more visit

About Paxata

At Paxata, we transform data into information on-demand to empower every person, process, and system in the organization to be more intelligent. Our Adaptive Information Platform provides business leaders and analysts with an enterprise-grade, self-service data preparation application to deliver better customer experiences, improve operational efficiencies, and comply with regulatory requirements. Built on Apache Spark™ and optimized to run in hybrid, multi-cloud environments, Paxata leverages algorithmic intelligence and distributed computing to deliver an immersive business consumer experience that accelerates and automates the data-to-insight pipeline. Paxata is headquartered in Redwood City, California with offices in New York, Ohio, Texas, and Singapore. Visit or engage with us on Twitter, LinkedIn, Facebook, or YouTube.


McCoin & Smith Communications Inc.
Chris McCoin, 508-429-5988,
Rick Smith, 978-433-3304,

