Face-to-Face Translation: Translating talking face videos to different languages
"How can we deliver a video of Obama's speech to a person in India who knows only Hindi?"
Released code: https://github.com/Rudrabha/LipGAN
Project Page: http://cvit.iiit.ac.in/research/projects/cvit-projects/facetoface-translation
Today, a substantial amount of the information on the internet is in the form of talking face videos: MOOC lectures, vlogs, instructional videos, speeches, and so on. However, these videos are often inaccessible to people who do not know the language being spoken. We can address this at different levels of translation.
As shown in the above figure, we can translate the content of the original video by overlaying translated subtitles. We can go further and use a text-to-speech system to generate speech from the translated subtitles. However, if this generated speech is overlaid directly, the lip movements in the video appear "out of sync" with the new audio. To correct the lip movements, we add a visual module called LipGAN.
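At a high level, the full system chains speech recognition, machine translation, text-to-speech, and lip synchronization. The sketch below shows only that composition; every stage is a hypothetical stand-in (the actual system plugs in trained ASR, MT, and TTS models and the LipGAN module), and all function names here are illustrative, not from the released code.

```python
# Hypothetical sketch of the face-to-face translation pipeline.
# Each stage is a stand-in; the real system uses trained models.

def transcribe(audio):
    """Speech recognition: source-language audio -> source text."""
    return "hello world"  # stand-in output

def translate(text, target_lang):
    """Machine translation: source text -> target-language text."""
    return f"[{target_lang}] {text}"  # stand-in output

def synthesize(text):
    """Text-to-speech: target-language text -> target-language audio."""
    return {"speech_for": text}  # stand-in "waveform"

def lip_sync(frames, audio):
    """LipGAN stage: regenerate the mouth region of each frame
    so the video matches the newly synthesized audio."""
    return [(frame, audio["speech_for"]) for frame in frames]  # stand-in

def face_to_face_translate(frames, audio, target_lang="hi"):
    text = transcribe(audio)
    translated = translate(text, target_lang)
    new_audio = synthesize(translated)
    return lip_sync(frames, new_audio), new_audio

synced, new_audio = face_to_face_translate(["frame0", "frame1"], audio=None)
print(new_audio["speech_for"])  # -> "[hi] hello world"
```

The key point of the design is that the visual correction is the last stage: everything upstream only changes the audio, and LipGAN then makes the video consistent with it.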
The architecture of LipGAN is described in the above figure.
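To make the input layout concrete: the generator is conditioned on a target face whose mouth region is masked, a reference frame, and the audio features for that frame, and training combines a reconstruction loss with an adversarial loss from a discriminator that judges whether a face and an audio window match. The toy tensors below illustrate that layout; the resolutions, feature sizes, and masking convention are assumptions for illustration, not LipGAN's exact configuration.

```python
import numpy as np

# Illustrative shapes (assumptions, not LipGAN's exact sizes).
H = W = 96        # face crop resolution
T, F = 27, 13     # audio window: time steps x spectral coefficients

face = np.random.rand(H, W, 3)       # target face crop
reference = np.random.rand(H, W, 3)  # reference (pose prior) frame
audio = np.random.rand(T, F)         # audio features for this frame

# Mask the lower half of the target face so the generator must
# inpaint the mouth region conditioned on the audio.
masked = face.copy()
masked[H // 2:, :, :] = 0.0

# Generator input: masked target and reference stacked channel-wise.
gen_input = np.concatenate([masked, reference], axis=-1)
print(gen_input.shape)  # -> (96, 96, 6)

# A trained generator would map (gen_input, audio) -> a synced face.
# Here a random stand-in output shows how the reconstruction (L1)
# loss against the ground-truth face would be computed.
fake_face = np.random.rand(H, W, 3)
l1_loss = float(np.abs(fake_face - face).mean())
```

The discriminator side (scoring face/audio pairs for synchronization) supplies the adversarial term that pushes the generated mouth shapes to actually match the audio, not just look realistic.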
Our system can be used for various applications, as shown above; some of them are demonstrated in the attached video.


