Flatiron Health Research on AI-Driven Cancer Progression Extraction Presented at AACR Special Conference in Cancer Research: Artificial Intelligence and Machine Learning 2025

  • Flatiron Health presents two new pieces of research demonstrating the potential of AI to advance oncology research across multiple tumor types

NEW YORK--(BUSINESS WIRE)--Flatiron Health today announced that the novel findings from its research, “Using large language models for scalable extraction of real-world progression events across multiple cancer types,” have been presented at the American Association for Cancer Research (AACR) Special Conference in Cancer Research: Artificial Intelligence and Machine Learning, which took place July 10-12, 2025, in Montreal, QC, Canada.

The research, led by a multidisciplinary Flatiron team of clinicians, research scientists, and machine learning engineers, demonstrates the power of large language models (LLMs) to accurately and efficiently extract real-world cancer progression events from unstructured electronic health record (EHR) data across 14 cancer types. The study found that LLMs achieved F1 scores similar to those of expert human abstractors and produced nearly identical real-world progression-free survival estimates, underscoring the potential of AI to scale high-quality clinical endpoint extraction for oncology research and care. This research used the recently published Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework to assess the quality of LLM-extracted real-world data, uniquely evaluating both the LLM and an expert human abstractor against a duplicate, expert-human-abstracted reference dataset to directly compare performance.
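
For readers unfamiliar with the metric, the comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a progression event is a (patient ID, date) pair and that an extracted event counts as correct only on an exact match with the reference; the study's actual matching rules are not described in this release. The function name and all data below are hypothetical and are not study results.

```python
from typing import Set, Tuple

# Hypothetical event key: (patient_id, progression_date). Assumption for
# illustration only; the release does not specify how events are matched.
Event = Tuple[str, str]


def f1_score(predicted: Set[Event], reference: Set[Event]) -> float:
    """Return F1, the harmonic mean of precision and recall, over exact-match events."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Made-up data: a reference set abstracted by one expert, plus two
# extractions to score against it (an LLM and a second expert abstractor).
reference = {
    ("p1", "2024-01-10"),
    ("p2", "2024-03-02"),
    ("p3", "2024-05-20"),
    ("p4", "2024-06-01"),
}
llm_events = {
    ("p1", "2024-01-10"),
    ("p2", "2024-03-02"),
    ("p3", "2024-05-20"),
    ("p5", "2024-07-07"),  # spurious event, lowers precision
}
human_events = {
    ("p1", "2024-01-10"),
    ("p2", "2024-03-02"),
    ("p4", "2024-06-01"),  # misses p3, lowers recall
}

llm_f1 = f1_score(llm_events, reference)
human_f1 = f1_score(human_events, reference)
print(f"LLM F1:   {llm_f1:.3f}")
print(f"Human F1: {human_f1:.3f}")
```

Scoring both extractors against the same reference set is what makes the two F1 values directly comparable; a real evaluation would likely allow a tolerance window around event dates rather than require exact matches.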

The LLM that Flatiron scientists optimized for this research was provided by Anthropic—the leading AI safety and research company that builds the Claude family of models.

“AI and machine learning are fundamentally transforming how we generate and use real-world evidence in oncology,” said Stephanie Reisinger, Senior Vice President & General Manager, Real-World Evidence at Flatiron Health. “This research exemplifies how Flatiron is harnessing AI and multimodal data to unlock new insights from oncology real-world data—accelerating clinical research, improving patient outcomes, and setting a new standard for evidence generation in cancer care.”

“We’re excited to share evidence that large language models can approach, and in some cases, may even exceed expert human performance in extracting critical and clinically nuanced cancer progression endpoints from EHRs,” said Aaron B. Cohen, MD, lead author, Head of Research Oncology, Clinical Data at Flatiron Health and practicing oncologist at Bellevue Hospital in New York City. “Scalable, high-quality extraction of such an important and complex endpoint like progression will open new doors for novel research, predictive modeling, and more personalized patient care.”

In addition to this work, Flatiron Health will also present “Fairness by design: End-to-end bias evaluation for LLM-generated data,” further highlighting the company’s leadership in applying AI and real-world data to advance oncology research. Specifically, this study demonstrates the VALID framework’s ability to evaluate both the quality and fairness of LLM-extracted data, highlighting the importance of ongoing bias assessment.

About Flatiron Health

Flatiron Health is a healthtech company expanding the possibilities for point of care solutions in oncology and using data for good to power smarter care for every person with cancer. Through machine learning and AI, real-world evidence, and breakthroughs in clinical trials, we continue to transform patients’ real-life experiences into knowledge and create a more modern, connected oncology ecosystem. Flatiron Health is an independent affiliate of the Roche Group.

Contacts

Media Contact:

Nina Toor
press@flatiron.com
