
Wan 2.2: A Free Open-Source MoE Model for High-Fidelity Cinematic AI Video

Wan 2.2: Alibaba’s Tongyi Lab Releases the World’s First Open-Source MoE Video Generation Model

Key Features of Wan 2.2 – Next-Gen Open-Source AI Video Generation

Scalable AI Video Generation with Wan 2.2’s Mixture-of-Experts Architecture

Cinematic Aesthetic Control in Wan 2.2 for Professional-Grade Visuals

Unified Multi-Modal Video Creation with Wan2.2-T2V-A14B, I2V-A14B, and TI2V-5B

Fully Open-Source Wan 2.2 Models with ComfyUI Workflow Support

Wan2.2 Model Variants: T2V, I2V, and TI2V for Text, Image, and Hybrid Video Generation

Wan2.2-T2V-A14B: High-Fidelity Text-to-Video Generation with Cinematic Precision

Wan2.2-I2V-A14B: Stable and Stylized Image-to-Video Generation at 720P

Wan2.2-TI2V-5B: Lightweight Hybrid Text & Image-to-Video Model for Local Deployment

Wan 2.2 vs Wan 2.1: What’s New in Next-Gen Open-Source Video AI

| Feature | Wan 2.1 | Wan 2.2 |
| --- | --- | --- |
| Core Architecture | Dense diffusion | Mixture-of-Experts (MoE) diffusion with expert hand-off across timesteps |
| Model Variants | T2V (14B), I2V (14B) | T2V (14B), I2V (14B), TI2V Hybrid (5B) |
| Training Data | Baseline dataset | +65.6% more images, +83.2% more videos – richer motion and semantics |
| Aesthetic Control | Basic tags | Cinematic-level labels for lighting, color, composition |
| Motion Generation | Moderate, less controllable | High-complexity motion, improved camera logic (tilt, orbit, dolly, etc.) |
| Prompt Compliance | Limited accuracy | Strong prompt adherence with precise scene, motion & object control |
| Resolution & Frame Rate | Up to 720P (T2V/I2V), lower FPS | 720P@24fps, even on a single RTX 4090 (TI2V) |
| Performance on Consumer Hardware | Limited local feasibility | TI2V runs locally on 8GB+ GPUs (e.g., RTX 4090) |
| Use Case Flexibility | Text-to-video or image-to-video only | Unified hybrid generation + faster iteration in ComfyUI workflows |
| Overall Visual Quality | Acceptable for baseline content | Sharper frames, fewer artifacts, cinematic output polish |

How to Set Up and Use Wan2.2 for AI Video Generation

  • 01

    Option 1: Local Deployment of Wan 2.2 (a minimal generation sketch follows this list)

  • 02

    Option 2: Use Wan 2.2 Online via the Official Web Interface
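
For Option 1, a minimal local text-to-video sketch is shown below. It assumes the lightweight Wan2.2-TI2V-5B checkpoint is published in Hugging Face Diffusers format under the model ID used here; the ID, resolution, and frame count are illustrative assumptions, so check the official Wan 2.2 repository for the supported CLI, settings, and ComfyUI workflows.

```python
# Minimal local text-to-video sketch with the lightweight TI2V-5B variant.
# The model ID and generation settings below are assumptions; verify them
# against the official Wan 2.2 documentation before use.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

model_id = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"  # assumed Diffusers-format repo name
pipe = WanPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # helps fit the model on a single consumer GPU

prompt = (
    "A lone astronaut walking across a dusty red plain at golden hour, "
    "slow dolly-in, warm rim lighting, cinematic composition"
)

# Roughly 720p-class output; adjust num_frames to change clip length.
frames = pipe(
    prompt=prompt,
    height=704,
    width=1280,
    num_frames=121,
    guidance_scale=5.0,
).frames[0]

export_to_video(frames, "wan22_t2v_demo.mp4", fps=24)
```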

  • 4 Professional Tips for Creating High-Quality Video Content with Wan 2.2

    Write Visually Descriptive and Intentional Prompts

    Use Prompt Structures That Combine Scene, Style, and Emotion (see the example sketch after these tips)

    Design with Rhythm: Align Visuals to Audio Cues

    Iterate and Refine Through Prompt Feedback Loops
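
To make the prompt-structure tip concrete, the small sketch below assembles a prompt from separate scene, subject, camera, style, and emotion fields. The field names and wording are purely illustrative, not a format required by Wan 2.2.

```python
# Illustrative prompt builder: combine scene, style, camera, and emotion
# cues into one descriptive prompt. The structure is a suggestion, not an API.
parts = {
    "scene": "a rain-soaked neon street in Tokyo at night",
    "subject": "a courier cycling through traffic",
    "camera": "low-angle tracking shot, shallow depth of field",
    "style": "teal-and-orange grade, film grain, anamorphic flares",
    "emotion": "tense, urgent mood",
}

prompt = ", ".join(parts.values())
print(prompt)
# a rain-soaked neon street in Tokyo at night, a courier cycling through
# traffic, low-angle tracking shot, shallow depth of field, ...
```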

    Use Wan 2.2 in YesChat.AI: Create Cinematic AI Videos Online

    Beyond local tools like ComfyUI, Wan 2.2 is also available on YesChat.AI, an online platform for effortless, browser-based video creation. With no installation or hardware setup required, users can generate cinematic AI videos directly from text or image prompts in seconds. Ideal for rapid prototyping, creative experimentation, and mobile workflows, YesChat.AI lowers the entry barrier for creators and researchers looking to explore Wan 2.2’s capabilities in a fast, intuitive, and accessible environment.

    FAQs About Wan 2.2

    Q

    What is Wan 2.2 and how does it redefine AI video generation?

    Wan 2.2, developed by Alibaba’s Tongyi Lab, is the world’s first open-source Mixture-of-Experts (MoE) video generation model, purpose-built for AI video generation tasks such as text to video (T2V), image to video (I2V), and hybrid workflows. Compared to previous dense models, Wan 2.2 offers cinematic fidelity, smoother motion, and scalable performance, enabling 720p@24fps generation even on consumer GPUs like the RTX 4090.

    Q

    What are the main differences between the Wan 2.2 models: Wan2.2-T2V-A14B, Wan2.2-I2V-A14B, and Wan2.2-TI2V-5B?

    The Wan 2.2 models come in three targeted variants: Wan2.2-T2V-A14B (14B parameters, optimized for high-fidelity text to video generation), Wan2.2-I2V-A14B (14B parameters, designed for stylized and stable image to video synthesis), and Wan2.2-TI2V-5B (5B parameters, a lightweight hybrid model supporting both T2V and I2V tasks at 720p on a single GPU). Each is built on the MoE architecture and optimized for different creative and technical use cases.

    Q

    How does Wan2.2-T2V-A14B achieve cinematic-level text to video generation?

    Wan2.2-T2V-A14B converts natural language prompts into visually rich, motion-consistent 5-second clips at 720p using 14B MoE parameters. It supports fine-grained control over lighting, composition, camera motion, and emotional tone—making it ideal for storytelling, concept development, and previsualization in creative industries.

    Q

    What are the advantages of using Wan2.2-I2V-A14B for image to video generation?

    Wan2.2-I2V-A14B brings stability and visual coherence to image to video generation. It transforms static images into cinematic motion while preserving artistic style and spatial layout. Leveraging MoE-based denoising, it reduces flickering, jitter, and distortion—essential for applications in digital art, stylized content creation, and animated illustration.
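
As a rough illustration of this image-to-video path, the sketch below uses the Diffusers WanImageToVideoPipeline. The model ID, resolution, frame count, and frame rate are assumptions made for the example; confirm the recommended values in the official Wan 2.2 release notes.

```python
# Minimal image-to-video sketch: animate a still image with a text prompt.
# The model ID and settings below are assumptions for illustration only.
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.2-I2V-A14B-Diffusers",  # assumed Diffusers-format repo name
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

image = load_image("input_illustration.png")  # placeholder local image path
prompt = "The character slowly turns toward the camera as wind moves the leaves"

frames = pipe(
    image=image,
    prompt=prompt,
    height=720,
    width=1280,
    num_frames=81,
    guidance_scale=5.0,
).frames[0]

# Frame rate depends on the variant; adjust fps to match your settings.
export_to_video(frames, "wan22_i2v_demo.mp4", fps=16)
```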

    Q

    When should I use Wan2.2-TI2V-5B instead of the larger 14B models?

    Wan2.2-TI2V-5B is perfect for creators seeking fast, resource-efficient hybrid video generation. It handles both text to video and image to video tasks within a compressed architecture (16×16×4 VAE), runs smoothly at 720p on a single RTX 4090, and is well-suited for real-time preview, local prototyping, and ComfyUI-based workflows without sacrificing output quality.

    Q

    What makes Wan 2.2 unique among AI video generation models today?

    Wan 2.2 is the first open-source model to combine MoE architecture with multimodal video generation (T2V, I2V, and hybrid). Its cinematic-level control, open Apache 2.0 licensing, 720p support, and real-time performance on consumer hardware make Wan 2.2 a uniquely accessible and powerful tool for professionals in film, advertising, gaming, and digital design.

    Q

    How can I use Wan 2.2 with ComfyUI for local video generation workflows?

    Wan 2.2 offers full integration with ComfyUI, allowing users to create node-based pipelines for text to video, image to video, or hybrid tasks. After downloading the appropriate Wan 2.2 models, users can launch pre-built workflows (e.g., for Wan2.2-T2V-A14B or Wan2.2-TI2V-5B) and run local video synthesis at 720p within a visual interface—ideal for non-coders, artists, and fast iteration.

    Q

    Where can I download Wan 2.2 models and contribute to the open-source project?

    The entire Wan 2.2 model suite is open-source under the Apache 2.0 license and available on GitHub, Hugging Face, and ModelScope. Users can clone the repositories, download safetensors for Wan2.2-T2V-A14B, Wan2.2-I2V-A14B, or Wan2.2-TI2V-5B, and run them locally via CLI or ComfyUI. Community contributions are encouraged through GitHub issues and pull requests, enabling global innovation in Wan video creation and research.
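
As a concrete starting point, one common way to fetch the weights from Hugging Face is the huggingface_hub snapshot API sketched below. The repository IDs are assumed from the variant names above, so verify them on the Wan-AI organization page before downloading.

```python
# Download a Wan 2.2 checkpoint from Hugging Face for local use.
# Repo IDs are assumed from the variant names; confirm them on huggingface.co.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Wan-AI/Wan2.2-TI2V-5B",        # or Wan2.2-T2V-A14B / Wan2.2-I2V-A14B
    local_dir="./models/Wan2.2-TI2V-5B",    # where the safetensors will be placed
)
print("Model files downloaded to:", local_dir)
```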