HyperAI

ComfyUI HunyuanCustom Video Generation Workflow Tutorial

1. Tutorial Introduction

This tutorial runs on a single RTX 4090, and generating one video takes about 10 minutes. For better generation quality, a GPU with 80 GB of memory is recommended.

HunyuanCustom, released by Tencent's Hunyuan team on May 9, 2025, is a multimodal customized video generation framework. It is a conditionally controllable generative model built on the HunyuanVideo generation framework and centered on subject consistency: it generates videos in which the subject stays consistent, conditioned on text, image, audio, and video inputs. These multimodal capabilities support many downstream tasks. For example, given multiple images as input, HunyuanCustom can be used for virtual human advertising and virtual makeup try-ons. The accompanying research paper is "HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation".

This workflow uses the following model files:

hunyuan_video_custom_720p_fp8_scaled.safetensors
llava_llama3_fp16.safetensors
hunyuan_video_vae_bf16.safetensors
clip_l.safetensors
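
Before loading the workflow, it can help to confirm that all four files are in place. The following is a minimal sketch, not part of the original tutorial. It assumes the conventional ComfyUI layout in which the diffusion model sits under `diffusion_models`, the two text encoders under `text_encoders`, and the VAE under `vae`. These subfolder names and the helper `find_missing_models` are assumptions for illustration, so adjust them to match your container.

```python
from pathlib import Path

# Assumed mapping of ComfyUI model subfolders to the files listed above.
# The subfolder names are an assumption, not something this tutorial states.
EXPECTED_FILES = {
    "diffusion_models": ["hunyuan_video_custom_720p_fp8_scaled.safetensors"],
    "text_encoders": ["llava_llama3_fp16.safetensors", "clip_l.safetensors"],
    "vae": ["hunyuan_video_vae_bf16.safetensors"],
}


def find_missing_models(models_root, expected=EXPECTED_FILES):
    """Return the relative paths of expected model files that do not exist."""
    root = Path(models_root)
    missing = []
    for subfolder, names in expected.items():
        for name in names:
            if not (root / subfolder / name).is_file():
                # Report with forward slashes regardless of platform.
                missing.append((Path(subfolder) / name).as_posix())
    return missing
```

Calling `find_missing_models("ComfyUI/models")` (the path is hypothetical) returns an empty list when every file is present, and otherwise lists what still needs to be downloaded.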

2. Project Examples

Multimodal video customization

Various applications

3. Operation Steps

1. After starting the container, click the API address to open the web interface

If "Bad Gateway" is displayed, the model is still initializing. Because the model is large, wait about 2-3 minutes and then refresh the page.
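
Instead of refreshing by hand, the wait can be scripted. This is a minimal sketch under the assumption that the API address answers plain HTTP requests and returns a 5xx status (such as 502 Bad Gateway) while the model is initializing. The function names, the 10-second interval, and the 300-second limit are illustrative choices, not values from the tutorial.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; 502 Bad Gateway lands here.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(url, max_wait=300, interval=10,
                     fetch=fetch_status, sleep=time.sleep):
    """Poll url until it stops returning a 5xx status or max_wait seconds pass."""
    waited = 0
    while True:
        status = fetch(url)
        if status is not None and status < 500:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a real server. In normal use, call `wait_until_ready(api_address)` with the address shown for your container.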

2. Functional Demonstration

How to use

After cloning the tutorial for the first time, you need to import the workflow file manually before it can be loaded

Image-to-Video Generation

Select image

Input Prompt

Result Output
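
The same steps can also be driven without the browser. ComfyUI serves a `POST /prompt` endpoint that accepts a workflow exported in API format. The sketch below assumes that this endpoint is reachable at the container's API address and that the workflow has already been exported to a JSON file. The helper names and the file path are hypothetical, and the tutorial itself only covers the web interface.

```python
import json
import urllib.request


def build_prompt_request(base_url, workflow):
    """Build the POST request that queues an API-format workflow in ComfyUI."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(base_url, workflow_path):
    """Load an API-format workflow JSON file and submit it to the server."""
    with open(workflow_path, encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_prompt_request(base_url, workflow)) as resp:
        # The server replies with JSON describing the queued job.
        return json.load(resp)
```

To change the input image or the prompt programmatically, edit the corresponding node inputs in the loaded workflow dictionary before submitting it. The node IDs depend on the exported workflow, so they are not shown here.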

4. Discussion

🖌️ If you come across a high-quality project, please leave us a message to recommend it. We have also set up a tutorial discussion group: scan the QR code and add the note [SD Tutorial] to join, discuss technical issues, and share your results.

Citation Information

The citation information for this project is as follows:

@misc{hu2025hunyuancustom,
      title={HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation}, 
      author={Teng Hu and Zhentao Yu and Zhengguang Zhou and Sen Liang and Yuan Zhou and Qin Lin and Qinglin Lu},
      year={2025},
      eprint={2505.04512},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.04512}, 
}

