Turn your professor into Morgan Freeman or David Attenborough, using deep learning and Google Cloud libraries.
The original lecture MP4 file.
First, we split the audio from the video, then send the resulting FLAC file through Google Cloud's Speech-to-Text API. For this clip, the API returns the following transcription:
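For readers who want to try the extraction and transcription step themselves, here is a minimal sketch of how it could look. It is not the project's actual code. It assumes `ffmpeg` is installed and on the PATH, that the `google-cloud-speech` Python package is installed with credentials configured, and that the clip is short enough for the synchronous `recognize` call (longer lectures would need `long_running_recognize` with the audio uploaded to Cloud Storage). The function names, the 16 kHz mono settings, and the file names are illustrative choices, not taken from the project.

```python
import subprocess
from pathlib import Path
from typing import List

SAMPLE_RATE_HZ = 16000  # assumed sample rate; a common choice for speech recognition


def build_ffmpeg_command(video_path: str, flac_path: str) -> List[str]:
    """Build an ffmpeg command that drops the video stream and writes mono FLAC."""
    return [
        "ffmpeg", "-y",               # overwrite the output file if it exists
        "-i", video_path,             # input lecture video
        "-vn",                        # no video in the output
        "-ac", "1",                   # downmix to a single audio channel
        "-ar", str(SAMPLE_RATE_HZ),   # resample the audio
        "-c:a", "flac",               # encode the audio as FLAC
        flac_path,
    ]


def extract_flac(video_path: str, flac_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(video_path, flac_path), check=True)


def transcribe_flac(flac_path: str, language_code: str = "en-US") -> str:
    """Send a short FLAC clip to Speech-to-Text and join the top transcripts."""
    # Imported here so the ffmpeg helpers work without the Google client installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(content=Path(flac_path).read_bytes())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=SAMPLE_RATE_HZ,
        language_code=language_code,
    )
    response = client.recognize(config=config, audio=audio)
    # Each result covers a consecutive chunk of audio; keep the best alternative.
    return " ".join(r.alternatives[0].transcript.strip() for r in response.results)
```

With those helpers, a clip could be processed by calling `extract_flac("lecture.mp4", "lecture.flac")` followed by `transcribe_flac("lecture.flac")`. Downmixing to mono keeps the request simple, since Speech-to-Text expects single-channel audio unless told otherwise.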
We then generate funny voices by feeding the text through a neural network. Here's Mr. Ben Shapiro and Sir David Attenborough teaching you about recursion.
We also generate talking heads for you to follow along with, based on DinoMan's speech-driven animation.
Bored of real people? We can also do cartoon characters, such as SpongeBob or Sonic the Hedgehog.
We came up with this idea after talking about how painful it is to focus on Zoom lectures, and how often they get skipped as a result. Honestly, sitting in front of a computer and watching recorded lectures is one of the most painful ways to spend time, so we brainstormed ways to ease that frustration. We figured that if our professor sounded like he was narrating a fascinating episode of Planet Earth instead of discussing the 99th Cobb-Douglas function, lectures might be more engaging. Initially, we planned to convert only the audio of a professor's voice, but that worked better than we expected, so we decided to add video as well.
Here we have Sonic the Hedgehog, doing his best.