Phenaki Video is a website showcasing videos generated by the Phenaki model, which synthesizes realistic videos from sequences of text prompts. The model is trained on a mix of videos and images in varying proportions and can generate videos several minutes long. Examples on the website include a cat listening to music with headphones, a dog playing the piano, and an astronaut riding a horse on Mars with a sunset in the background. The model handles variable-length videos and can be conditioned on a sequence of prompts that changes over time, for example in the form of a story.
⚡Top 5 Phenaki Features:
- Text-to-Video Astronaut: Users can create videos about an astronaut by choosing combinations of context words for the video's content. The model can generate videos of various scenarios, such as riding a horse or a dinosaur, swimming, or standing on Mars with Earth in the background.
- Encoder-Decoder Model: Phenaki uses an encoder-decoder model that compresses videos to discrete embeddings, or tokens, with a tokenizer that can work with variable-length videos thanks to its use of causal attention in time.
- Transformer Model: A bi-directional masked transformer, conditioned on pre-computed text tokens, translates text embeddings into video tokens. The generated video tokens are then de-tokenized to create the actual video.
- Joint Training: Phenaki demonstrates that joint training on a large corpus of image-text pairs and a smaller number of video-text examples can result in generalization beyond what is available in the video datasets alone.
- Spatial Super-Resolution: Phenaki’s output can be fed to Imagen Video, which performs spatial super-resolution, incorporating the text into the super-resolution module to enhance the output.
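To make the "causal attention in time" idea from the feature list more concrete, here is a minimal sketch in plain Python. It is not the Phenaki implementation: the function names and the flat frame-by-frame token layout are assumptions made for illustration. It builds an attention mask in which each token may only attend to tokens from the same or earlier frames, and checks that the mask for a short video is the top-left corner of the mask for a longer one, which is the property that lets one tokenizer handle videos of different lengths.

```python
def causal_time_mask(num_frames, tokens_per_frame):
    """Return mask[i][j] == True when token i may attend to token j.

    Assumed layout (illustrative): tokens are stored frame by frame, so
    token i belongs to frame i // tokens_per_frame. Attention is allowed
    only to tokens in the same frame or in earlier frames.
    """
    total = num_frames * tokens_per_frame
    return [
        [(j // tokens_per_frame) <= (i // tokens_per_frame) for j in range(total)]
        for i in range(total)
    ]


def is_prefix_of(short_mask, long_mask):
    """Check that short_mask equals the top-left corner of long_mask."""
    n = len(short_mask)
    return all(long_mask[i][:n] == short_mask[i] for i in range(n))


# A 2-frame video and a 3-frame video, each with 2 tokens per frame.
short = causal_time_mask(num_frames=2, tokens_per_frame=2)
longer = causal_time_mask(num_frames=3, tokens_per_frame=2)

# Adding a frame never changes what earlier tokens can see.
prefix_ok = is_prefix_of(short, longer)
```

Because earlier frames never attend to later ones, appending frames leaves the attention pattern of the existing tokens unchanged in this sketch.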
⚡Top 5 Phenaki Use Cases:
- First Person View of Riding a Motorcycle: Phenaki can generate first-person videos of riding a motorcycle through a busy street, along a busy road in the woods, or very slowly through the woods.
- First Person View of Running: Phenaki can create videos of running through the woods, towards a beautiful house, or between houses with robots.
- First Person View of Flying: Create videos of flying over the sea above ships, zooming in towards a ship, or zooming out quickly from a coastal city.
- Timelapse of Sunset in the Modern City: Phenaki can create a timelapse of sunset in a modern city.
- Astronaut in the Blue Room: Generate videos of an astronaut in a blue room, typing on a keyboard, or leaving the keyboard and walking away.
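Several of the use cases above are really sequences of prompts played one after another. The sketch below shows one way such a time-variable prompt sequence could be represented. It is purely illustrative: `PromptSegment`, `prompt_at`, and the durations are hypothetical names and values, not part of any Phenaki interface, and the prompts are adapted from the astronaut example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptSegment:
    prompt: str
    seconds: float  # how long this prompt stays active (illustrative value)


def prompt_at(story, t):
    """Return the prompt that is active at time t (in seconds)."""
    if t < 0:
        raise ValueError("t must be non-negative")
    elapsed = 0.0
    for segment in story:
        elapsed += segment.seconds
        if t < elapsed:
            return segment.prompt
    raise ValueError("t is past the end of the story")


# A three-part story; each prompt takes over where the previous one ends.
story = [
    PromptSegment("An astronaut in a blue room", 2.0),
    PromptSegment("The astronaut is typing on a keyboard", 3.0),
    PromptSegment("The astronaut leaves the keyboard and walks away", 2.5),
]

total_seconds = sum(segment.seconds for segment in story)
```

A generator conditioned on such a schedule would look up the active prompt for each stretch of video, so the text can change while the video continues.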