Sana, released in January 2025, is a text-to-image framework jointly led by NVIDIA, MIT, and Tsinghua University. It can efficiently generate images at resolutions up to 4096 × 4096, synthesizing high-resolution, high-quality images quickly with strong text-image alignment. The related paper, "SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers", has been accepted by ICLR 2025.
This tutorial demonstrates the Sana_1600M_1024px model, running on a single RTX 4090 GPU.
2. Operation steps
1. After starting the container, click the API address to enter the Web interface
If "Bad Gateway" is displayed, it means the model is initializing. Please wait for about 1-2 minutes and refresh the page.
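Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, not part of the tutorial itself: it assumes the API address answers plain HTTP GET requests and returns status 502 while the model is still initializing. The function names and the default of 24 attempts at 5-second intervals (about 2 minutes) are illustrative choices, and the URL in the usage comment is a placeholder.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the error.
        return err.code
    # Connection-level failures (urllib.error.URLError) are left to propagate.


def wait_until_ready(url, attempts=24, interval_s=5.0,
                     get_status=fetch_status, sleep=time.sleep):
    """Poll `url` until it stops answering 502 Bad Gateway.

    Returns True as soon as a status other than 502 is seen,
    and False if every attempt returned 502.
    """
    for attempt in range(attempts):
        if get_status(url) != 502:
            return True
        if attempt < attempts - 1:
            sleep(interval_s)  # wait before the next check
    return False


# Usage (placeholder URL; replace with the API address of your container):
#   if wait_until_ready("https://your-api-address.example"):
#       print("Web interface is ready")
```

`get_status` and `sleep` are parameters only so the loop can be exercised without a network connection or real delays.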
2. Usage demonstration
Citation Information
Thanks to GitHub user SuperYang for deploying this tutorial. The project's citation information is as follows:
@misc{Sana2025,
title={Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer},
author={Enze Xie and Junsong Chen and Junyu Chen and Han Cai and Haotian Tang and Yujun Lin and Zhekai Zhang and Muyang Li and Ligeng Zhu and Yao Lu and Song Han},
howpublished={\url{https://nvlabs.github.io/Sana/}},
note={GitHub Repository with Code, Model & Documentation},
year={2025}
}