Neha Sunil | Text to Video

Project Details

In this blog post, we explain the mechanics behind the key building blocks of state-of-the-art text-to-video generation for a general audience. We provide detailed illustrations and interactive examples of these building blocks and highlight the key novelties and differences between two text-to-video models: Imagen Video and Make-A-Video. Finally, we show how the building blocks fit together into a complete text-to-video framework and note the models' current failure modes and limitations. Note that interactive page elements may take time to load.

Read more: Blogpost

Related Projects

Visuotactile Cloth Manipulation (CoRL 2022)

Visuotactile Representation Learning (Computer Vision Final Project)

Cable Following (RSS 2020)

Building Blocks of Text to Video Generation
