LipNet

LipNet is a deep neural network for audio-visual speech recognition (AVSR). It was created by University of Oxford researchers Yannis Assael, Brendan Shillingford, Shimon Whiteson, and Nando de Freitas.^[1] Audio-visual speech recognition has enormous practical potential, with applications such as improved hearing aids, better recovery and wellbeing for critically ill patients,^[2] and speech recognition in noisy environments,^[3] as implemented, for example, in Nvidia's autonomous vehicles.^[4]

[1]

[2]

[3]

[4]


References
