Study Finds Auto-Caption Mistakes on Major UC Platforms

Mistakes in auto-captions are letting down many video conferencing and social media apps, according to a study from Northeastern University and Pomona College.

Researchers worked with Consumer Reports (CR) to test auto-captions on seven popular platforms, including Zoom, Facebook, Google Meet and YouTube.

Mistakes were found on all of the platforms, and according to Consumer Reports, some of them got around one in 10 words wrong.

The video conferencing platforms that the study looked at were BlueJeans, Cisco Webex, Google Meet, and Zoom.

Consumer Reports also included Microsoft Stream, the company’s video streaming service, which uses the same voice technology as Microsoft Teams.

The study found that Webex had more mistakes than Google Meet, but within each tested platform, there were considerable differences.

Consumer Reports said Zoom’s “very best transcription had just two errors per 100 words, while at its worst the software mistranscribed nearly every third word”.

Kaveh Waddell of Consumer Reports said: “We controlled for the speaker’s age, gender, race and ethnicity, first language, and speech rate. As it turned out, only gender and first language status independently affected the variation in transcription mistakes.

“Though the accuracy differences we found between groups of speakers may not seem large, they can have a real impact on comprehension.

“Take Zoom’s average accuracy gap between a native and non-native speaker: It’s 3.6 per cent, which looks like a small number.

“But imagine if you misunderstood three or four extra words out of every hundred, on top of the roughly eight per cent of words that the auto-captions already bungled, according to our study.

“English is often spoken at about 150 words per minute, so those mistakes can pile up fast.”
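To put those figures on a per-minute footing, here is a rough back-of-the-envelope sketch. It assumes the 3.6 per cent gap simply adds to the roughly eight per cent baseline, as the quote suggests, and that speech runs at a steady 150 words per minute. The function name is illustrative and not part of the study.

```python
def caption_errors_per_minute(words_per_minute, error_rate_pct):
    """Expected number of mistranscribed words per minute of speech."""
    return words_per_minute * error_rate_pct / 100


# Figures quoted in the article; treating them as exact is an assumption.
WORDS_PER_MINUTE = 150
BASELINE_ERROR_PCT = 8.0   # "roughly eight per cent" of words bungled
NON_NATIVE_GAP_PCT = 3.6   # Zoom's average native vs. non-native gap

baseline = caption_errors_per_minute(WORDS_PER_MINUTE, BASELINE_ERROR_PCT)
extra = caption_errors_per_minute(WORDS_PER_MINUTE, NON_NATIVE_GAP_PCT)

print(f"Baseline errors per minute: {baseline:.1f}")
print(f"Extra errors per minute from the gap: {extra:.1f}")
print(f"Combined errors per minute: {baseline + extra:.1f}")
```

On these assumptions, a listener would meet about 12 caption errors a minute at the baseline rate, with the gap adding a little over five more.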

Responding to Consumer Reports, Microsoft confirmed that the findings align with its internal testing, which also revealed lower accuracy when transcribing men and second-language English speakers.

A Zoom spokesperson told CR: “We’re continuously enhancing our transcription feature to improve accuracy toward a variety of factors, including English dialects and accents.”

Google said it was working to “improve the accuracy of live captions and translations so even more users can participate and stay engaged using Google Meet”.

Cisco told CR that its auto-caption testing puts Webex ahead of two “best-in-class speech recognition engines” but wouldn’t tell the publication what those products were.

In Consumer Reports’ study, Webex had the highest error rate. A Cisco spokesperson suggested the discrepancy may arise because Webex’s captions are fine-tuned for video conferencing rather than other scenarios.

from UC Today https://ift.tt/zyL7vYd
