Oxford University AI Excels At Lip-Reading

A computer program can lip-read four times more accurately than a human expert, according to Oxford University researchers.

They worked with Google’s DeepMind to develop an AI system named Watch, Attend and Spell. The project used a neural network with image recognition tools to analyse 5,000 hours of TV news footage, comprising 118,000 sentences and 17,500 different words, and compared the mouth movements with the text from the subtitles.

The accuracy improved over time, largely because the system learned more about the context of individual words. As the BBC notes, one example was that in news footage “Prime” often immediately preceded “Minister”.

Once development was complete, the researchers ran tests on new silent footage. They found the software recognised 50 percent of the words correctly, compared with a success rate of just 12 percent for a lip-reading expert. However, the accuracy is likely to be this high only with specific types of speech, namely the language and style used by newsreaders, rather than general conversation. The system would also need to be sped up to cope with real-time “translation” rather than working on recorded footage.

While there’s plenty more work to be done, long-term uses could include more accurate transcription of video in which several people speak over one another; dubbing speech onto silent archive film; and improving speech recognition on smartphones in noisy environments.

JLister
