OmniLottie: A Multimodal Model for Lottie Animation Generation

Motion graphics generation typically involves switching between tools, editing timelines, and running multiple export steps.

While this workflow is powerful, it slows down teams that need structured, reusable animation outputs for product interfaces or content pipelines.

OmniLottie on AIOZ AI simplifies this process by generating Lottie JSON animations directly from text, image, or video inputs, producing editable, vector-based output that is ready for integration.

About OmniLottie

OmniLottie is an end-to-end multimodal model purpose-built for generating structured Lottie animations.

Instead of producing raster video files, it outputs structured Lottie JSON that enables:

  • Resolution-independent animation
  • Editable motion assets
  • Direct integration into web and mobile applications

This structure makes OmniLottie particularly useful for teams working with reusable animation systems.
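
To make "structured" and "editable" concrete, here is a minimal sketch of a Lottie document written by hand as a Python dictionary. It is not OmniLottie output. The top-level keys (v, fr, ip, op, w, h, layers) and the animated-property shape ({"a": 1, "k": [...]}) follow the public Lottie format, but the document is trimmed for readability (for example, keyframe easing handles are omitted), so treat it as a structural illustration rather than a file guaranteed to render in every player. The helper names duration_seconds and set_final_rotation are made up for this example.

```python
import copy
import json

# A hand-written, minimal Lottie-style document (illustrative only).
animation = {
    "v": "5.7.0",  # format version string
    "fr": 30,      # frame rate
    "ip": 0,       # in point (first frame)
    "op": 60,      # out point (last frame)
    "w": 128,      # canvas width
    "h": 128,      # canvas height
    "layers": [
        {
            "ty": 4,  # 4 = shape layer
            "nm": "spinner",
            "ip": 0,
            "op": 60,
            "st": 0,
            "ks": {  # layer transform
                "o": {"a": 0, "k": 100},  # opacity, static
                "r": {"a": 1, "k": [      # rotation, animated
                    {"t": 0, "s": [0]},
                    {"t": 60, "s": [360]},
                ]},
                "p": {"a": 0, "k": [64, 64]},    # position
                "a": {"a": 0, "k": [0, 0]},      # anchor point
                "s": {"a": 0, "k": [100, 100]},  # scale (percent)
            },
            "shapes": [],
        }
    ],
}


def duration_seconds(doc):
    """Length of the animation, derived from frame range and frame rate."""
    return (doc["op"] - doc["ip"]) / doc["fr"]


def set_final_rotation(doc, layer_name, degrees):
    """Return a copy whose named layer ends its rotation at `degrees`."""
    edited = copy.deepcopy(doc)
    for layer in edited["layers"]:
        if layer.get("nm") == layer_name:
            layer["ks"]["r"]["k"][-1]["s"] = [degrees]
    return edited


# Editing motion is a data edit, not a re-render.
two_turns = set_final_rotation(animation, "spinner", 720)
serialized = json.dumps(two_turns)
```

Because the motion lives in plain data, changing the final keyframe or the frame rate is an edit to the JSON rather than a new export, which is what distinguishes this output from a raster video file.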

How It Works

OmniLottie follows this multimodal-to-structured animation pipeline:

  1. Accept input from text, image, or video.
  2. Interpret visual and language context through a VLM backbone.
  3. Generate structured animation tokens.
  4. Decode tokens into editable Lottie JSON.

This approach results in animations that are both visually coherent and structurally reusable, not just fixed video outputs.
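
The four steps above can be sketched as a toy data flow. This is only an illustration under loud assumptions: OmniLottie's real model interface and token vocabulary are not described in this article, so Conditions, generate_tokens, decode_tokens, and the tuple-based token format below are all hypothetical stand-ins. The "model" step returns a fixed token sequence; only the decoding step does real work, turning tokens into a Lottie-style dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conditions:
    """Step 1: the accepted inputs (hypothetical container)."""
    text: Optional[str] = None
    image_path: Optional[str] = None
    video_path: Optional[str] = None


def generate_tokens(cond):
    """Stand-in for steps 2-3 (VLM backbone and token generation).

    A real model would condition on the inputs; this stub returns a
    fixed toy sequence so the decoding step can be demonstrated.
    """
    if not (cond.text or cond.image_path or cond.video_path):
        raise ValueError("at least one input is required")
    return [
        ("canvas", 128, 128),
        ("timing", 30, 0, 60),
        ("layer", "dot"),
        ("rotate", 0, 0),
        ("rotate", 60, 360),
    ]


def decode_tokens(tokens):
    """Step 4: turn the toy token sequence into a Lottie-style dict."""
    doc = {"v": "5.7.0", "layers": []}
    for kind, *args in tokens:
        if kind == "canvas":
            doc["w"], doc["h"] = args
        elif kind == "timing":
            doc["fr"], doc["ip"], doc["op"] = args
        elif kind == "layer":
            doc["layers"].append(
                {"ty": 4, "nm": args[0], "ks": {"r": {"a": 1, "k": []}}, "shapes": []}
            )
        elif kind == "rotate":
            if not doc["layers"]:
                raise ValueError("rotate token before any layer token")
            frame, degrees = args
            doc["layers"][-1]["ks"]["r"]["k"].append({"t": frame, "s": [degrees]})
        else:
            raise ValueError(f"unknown token kind: {kind}")
    return doc


lottie = decode_tokens(generate_tokens(Conditions(text="a spinning dot")))
```

The point of the sketch is the shape of the pipeline: the generator emits a structured sequence rather than pixels, and a deterministic decoder maps that sequence onto named Lottie fields, so the result stays editable.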

Key Capabilities

OmniLottie focuses on practical, production-oriented features, such as:

  • Multimodal generation: Accepts text, image, and video as input conditions.
  • Vector-native output: Generates animations as editable Lottie JSON, not raster video files.
  • Workflow integration: Outputs are compatible with web and mobile animation pipelines.

Technical Profile

OmniLottie includes the following components:

  • Architecture: Qwen2.5-VL-3B-Instruct backbone.
  • Training data: MMLottie-2M, a dataset of 2 million annotated animations sourced from LottieFiles, IconScout, and Flaticon.
  • Tokenizer: Custom Parameterized Lottie Tokens for structured animation representation.
  • Evaluation: MMLottieBench benchmark, 900 samples.
  • Hardware: Single GPU, approximately 15.2 GB VRAM.
  • License: Apache-2.0, free for commercial use.
  • Recognition: Accepted to CVPR 2026.

Where It Fits Best

OmniLottie is well-suited for workflows such as:

  • UI animations: Generating animated icons and UI assets from text or image prompts.
  • Static-to-motion: Converting static images into editable vector animations.
  • Video-to-Lottie: Transforming video clips into lightweight, editable Lottie assets.
  • Design iteration: Accelerating animation cycles in product and design workflows.

Its structured output means every animation remains editable and reusable across projects.

Try OmniLottie on AIOZ AI

A practical way to explore OmniLottie is to test the same concept across text, image, and video inputs, then compare how each input influences the resulting animation structure.
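
One way to make that comparison less subjective is to summarize each generated file's structure. The sketch below assumes the outputs are available as parsed Lottie JSON (for example via json.load); the helper names count_animated and summarize are made up for this example, and the two sample documents are hand-written placeholders, not real OmniLottie results. Counting properties flagged with "a": 1 is a rough heuristic for "how much is animated", not an official metric.

```python
def count_animated(node):
    """Recursively count properties marked as animated ({"a": 1, "k": [...]})."""
    if isinstance(node, dict):
        own = 1 if node.get("a") == 1 and isinstance(node.get("k"), list) else 0
        return own + sum(count_animated(value) for value in node.values())
    if isinstance(node, list):
        return sum(count_animated(item) for item in node)
    return 0


def summarize(doc):
    """Small structural summary of a parsed Lottie document."""
    frame_rate = doc.get("fr", 0)
    frames = doc.get("op", 0) - doc.get("ip", 0)
    return {
        "layers": len(doc.get("layers", [])),
        "seconds": frames / frame_rate if frame_rate else 0.0,
        "animated_properties": count_animated(doc),
    }


# Hand-written placeholders standing in for two generations of one concept.
from_text = {
    "fr": 30, "ip": 0, "op": 60,
    "layers": [
        {"ks": {"r": {"a": 1, "k": [{"t": 0, "s": [0]}, {"t": 60, "s": [360]}]},
                "a": {"a": 0, "k": [0, 0]}}},
    ],
}
from_video = {
    "fr": 24, "ip": 0, "op": 48,
    "layers": [
        {"ks": {"r": {"a": 1, "k": [{"t": 0, "s": [0]}, {"t": 48, "s": [180]}]}}},
        {"ks": {"o": {"a": 1, "k": [{"t": 0, "s": [0]}, {"t": 48, "s": [100]}]}}},
    ],
}

for label, doc in (("text", from_text), ("video", from_video)):
    print(label, summarize(doc))
```

Running the same summary over the text, image, and video results for one concept gives a quick side-by-side of layer count, duration, and animated properties before opening the files in an editor.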

For teams building scalable animation systems, OmniLottie provides a direct path from multimodal input to editable motion assets.