JarvisArt-Preview Smart Photo Retouching Agent
Size: 11.96 MB
License: Apache 2.0
1. Tutorial Introduction

JarvisArt-Preview is an intelligent photo retouching agent model released on June 24, 2025, by institutions including Xiamen University, Hong Kong University of Science and Technology (Guangzhou), and Tsinghua University. On the Artistic Retouch Benchmark, the model achieved win-rate advantages of 68.31% and 61.51% over Adobe Firefly Retouch in the "Instruction Matching Accuracy" and "Professional Retouching Effect" categories, respectively. It also achieved state-of-the-art performance on traditional image editing benchmarks such as the Style Transfer Evaluation Suite and Human Preference Test. The model also offers features rarely seen in previous systems: end-to-end invocation of more than 200 Lightroom tools driven by natural language, intelligent fusion of cross-style elements (supporting mixed styles such as oil painting + sketch), interpretable backtracking of retouching steps (a natural-language description is generated for each step), and bidirectional iterative optimization between text and image (instruction deviations are corrected automatically based on the generated results). The related paper, "JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent", has been accepted to NeurIPS 2025.
This tutorial uses a single RTX 4090 graphics card. English is the only language supported.
2. Project Examples

3. Operation Steps
1. After starting the container, click the API address to enter the Web interface

2. Usage steps
If "Bad Gateway" is displayed, the model is still initializing. Because the model is large, wait approximately 2-3 minutes and then refresh the page. You will need Lightroom to view the generated files.
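The wait-and-refresh step above can also be scripted. The following is a minimal sketch, not part of the JarvisArt project: the helper names `wait_until_ready` and `http_probe`, the 10-second polling interval, and the 5-minute timeout are all assumptions chosen for illustration, and the URL must be replaced with the API address shown for your own container.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return the HTTP status code for url (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is still available.
        return err.code


def wait_until_ready(probe, timeout_s=300.0, interval_s=10.0, sleep=time.sleep):
    """Call probe() until it stops returning 502 (Bad Gateway).

    Raises TimeoutError if the service is still returning 502 after
    roughly timeout_s seconds of waiting.
    """
    waited = 0.0
    while True:
        status = probe()
        if status != 502:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"still 502 after {waited:.0f} seconds")
        sleep(interval_s)
        waited += interval_s
```

A call such as `wait_until_ready(lambda: http_probe("http://<your-api-address>"))` would block until the page stops returning 502; the address shown here is a placeholder.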

Parameter Description
- Advanced Generation Parameters:
  - Max New Tokens: Limits the maximum number of tokens the model can generate for retouching-related text (such as operation instructions and step descriptions). A larger value allows a more detailed description of the retouching logic or steps, at the cost of longer output.
  - Temperature: Controls the randomness of the retouching strategy. Lower values (e.g., close to 0.1) make the retouching ideas more stable and predictable; higher values (e.g., close to 2) make them more divergent and diverse, but unexpected adjustment logic may appear.
  - Top-K: At each generation step, the next token is sampled only from the K most probable tokens. Smaller values (e.g., 10) make the generated retouching instructions more focused and conservative; larger values (e.g., 100) allow more candidate retouching ideas to be considered.
  - Top-P (Nucleus Sampling): Controls output diversity through a cumulative probability threshold. Lower values (e.g., 0.5) concentrate the retouching logic, because sampling is restricted to a small set of high-probability tokens. Higher values (e.g., 0.9) admit lower-probability but more creative tokens, which increases the diversity of the results.
- Conservative / Creative / Balanced: Shortcuts for quickly switching between parameter combinations.
  - The "Conservative" mode tends to generate stable and predictable retouching strategies.
  - The "Creative" mode emphasizes divergent and diverse retouching ideas.
  - The "Balanced" mode strikes a balance between stability and creativity.
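To make the interaction between Temperature, Top-K, and Top-P concrete, the sketch below applies the three filters to a toy list of logits using only the Python standard library. It is a generic illustration of these sampling techniques, not JarvisArt's actual decoding code. The function names and the values in `PRESETS` are assumptions made up for this example; the real values behind the Conservative / Creative / Balanced shortcuts are not stated in this tutorial.

```python
import math
import random

# Illustrative values only; NOT the presets used by the JarvisArt interface.
PRESETS = {
    "conservative": {"temperature": 0.3, "top_k": 20, "top_p": 0.7},
    "balanced": {"temperature": 0.7, "top_k": 50, "top_p": 0.9},
    "creative": {"temperature": 1.2, "top_k": 100, "top_p": 0.95},
}


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, Top-K and Top-P."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # shift by max for stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-K: keep only the K most probable tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    kept_mass = sum(probs[i] for i in ranked)

    # Top-P: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / kept_mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving tokens sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng=random, **params):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, **params)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

For example, `sample_token([2.0, 1.0, 0.5, -1.0], **PRESETS["conservative"])` draws from a narrower set of candidates than the same call with `PRESETS["creative"]`, which mirrors the qualitative behavior described for the two modes.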
Citation Information
The citation information for this project is as follows:
@article{jarvisart2025,
title={JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent},
author={Yunlong Lin and Zixu Lin and Kunjie Lin and Jinbin Bai and Panwang Pan and Chenxin Li and Haoyu Chen and Zhongdao Wang and Xinghao Ding and Wenbo Li and Shuicheng Yan},
year={2025},
journal={arXiv preprint arXiv:2506.17612}
}