3Play Media's State of Automatic Speech Recognition (ASR) Report Finds ASR Technology Has Advanced Significantly

In its annual State of ASR report, 3Play Media finds automatic speech recognition technology has advanced, but human intervention is still necessary for accuracy in captioning use cases

BOSTON--(BUSINESS WIRE)--3Play Media, the leading media accessibility provider, released its annual State of Automatic Speech Recognition (ASR) report. The study looks at the general state of speech-to-text technology and evaluates how 9 major speech recognition engines perform at the task of captioning and transcription. According to the study, the accuracy of the technology has improved measurably since the company’s last report, published in January of 2021.

3Play Media tested all 9 engines using a large dataset representative of 3Play Media’s diverse customer base. Accuracy was evaluated against two measurements: Word Error Rate (WER) and Formatted Error Rate (FER), which includes formatting errors like grammar, speaker identification, and non-speech elements in addition to word errors.
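
As a rough illustration of the two measurements, the sketch below computes a word-level error rate as edit distance (substitutions, deletions, and insertions) divided by the number of reference words. This is the conventional definition of WER, not 3Play Media's published methodology. The `formatted_error_rate` function is a simplifying assumption: it only keeps capitalization and punctuation attached to each token, and does not model speaker identification or non-speech elements. All function names and sample strings are hypothetical.

```python
import re


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference token
                            curr[j - 1] + 1,      # insertion of a hypothesis token
                            prev[j - 1] + cost))  # substitution or exact match
        prev = curr
    return prev[-1]


def normalize(text):
    """Lowercase and drop punctuation so only word identity is compared."""
    return re.sub(r"[^\w\s']", " ", text.lower()).split()


def word_error_rate(reference, hypothesis):
    """Conventional WER: word-level edits divided by reference word count."""
    ref = normalize(reference)
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, normalize(hypothesis)) / len(ref)


def formatted_error_rate(reference, hypothesis):
    """Assumed FER stand-in: same ratio, but case and punctuation count."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hypothesis.split()) / len(ref)


if __name__ == "__main__":
    reference = "Hello, world. This is a test."
    hypothesis = "hello world this is a text"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
    print(f"FER: {formatted_error_rate(reference, hypothesis):.3f}")
```

Under this convention, accuracy is commonly reported as one minus the error rate, so a 99% accuracy target corresponds to an error rate of 1% or less.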

In both WER and FER measurements, Speechmatics with 3Play modeling and post-processing led the pack, followed by Speechmatics alone and Microsoft. Rev, Google VM, and Voicegain followed, each with respectable scores so close together that the vendors are hard to differentiate. Despite exciting improvement across the board, all engines performed well below the industry standard of 99% accuracy, confirming that ASR on its own still falls short of being “good enough” for compliance with closed captioning legal requirements.

“As the AI models driving ASR continue to evolve, many of the engines we evaluated have shown significant strides in their transcription accuracy over the last two years,” Chris Antunes, Co-CEO and Co-Founder, 3Play Media, said. “We run this report every year because we use ASR in our own transcription process, and we have a vested interest in making sure we’re utilizing the best engine on the market. Speechmatics remains a clear industry leader in both pre-recorded and live automated transcription, and applying 3Play’s mappings and post-processing resulted in an exciting improvement in word error rate of over 8%.”

The study showed a wide range in accuracy among the technology tested, with the highest and lowest performing engines differing by over 15 percentage points. This suggests that different engines are optimizing for different goals, and some ASR engines will not perform well for transcription. Compared to other uses of speech-to-text technology like automated assistants that are able to train on a specific voice, transcription is a very difficult task, with variables like diverse sentence structure and spontaneous speech, specialized terminology, and complex patterns including multiple speakers, accents, and background noise.

Accuracy is critical in captioning for a number of reasons, the most important being that individuals who are d/Deaf or hard of hearing rely on captions as an accommodation. Accurate captions also improve viewer engagement: studies show that captions improve watch time, brand recognition, and comprehension. And while customer experience has emerged as a critical driver for businesses, so has digital accessibility legislation: in 2021 alone, there were 10 accessibility lawsuits filed per day.

To download the report, please visit: https://go.3playmedia.com/rs-2021-asr.

About 3Play Media
3Play Media is an integrated media accessibility platform with patented solutions for closed captioning, transcription, live captioning, audio description, and subtitling. 3Play Media combines machine learning (ML) and automatic speech recognition (ASR) with human review to provide innovative, highly accurate services. Customers span multiple industries, including media & entertainment, corporate, ecommerce, fitness, higher education, government, and elearning.

Contacts

Media

Phil LeClare
phil.leclare@3playmedia.com
617-209-9406
www.3playmedia.com
@3playmedia

3Play Media

Details
Headquarters: Boston, Massachusetts
CEO: Chris Antunes
Employees: 50
Organization: PRI
