1. Tutorial Introduction

MinerU 2.5-2509-1.2B is a vision-language model released by OpenDataLab and the Shanghai AI Lab in September 2025, designed for high-precision, high-efficiency document parsing. It is the latest iteration of the MinerU series and focuses on converting complex document formats such as PDF into structured, machine-readable data (such as Markdown and JSON). The accompanying research paper is "MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing".

This tutorial runs on a single RTX 4090 GPU.

3. Operation steps

1. After starting the container, click the API address to open the web interface.

2. Usage steps

If "Bad Gateway" is displayed, the model is still initializing. Because the model is large, wait about 2-3 minutes and then refresh the page.
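The wait-and-refresh step above can also be automated. The following is a minimal sketch, not part of the tutorial itself: it polls an address until it answers with HTTP 200 instead of an error such as 502 Bad Gateway. The function names `http_status` and `wait_until_ready`, the 5-minute timeout, and the 10-second interval are all assumptions chosen for illustration; substitute the API address shown for your own container.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout_s: float = 10.0) -> int:
    """Return the HTTP status code for `url`, or 0 if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses such as 502 Bad Gateway are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_ready(
    probe: Callable[[], int],
    timeout_s: float = 300.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it returns 200 or `timeout_s` seconds were slept."""
    waited = 0.0
    while True:
        if probe() == 200:
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


# Hypothetical usage; replace the placeholder with your container's API address:
#   ready = wait_until_ready(lambda: http_status("http://<your-api-address>/"))
```

Passing the probe as a callable keeps the retry loop independent of the network, so it can be exercised with a stub before pointing it at a real container.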

Parameter Description

Enable formula recognition: When enabled, the system recognizes mathematical formulas in the document and converts them into LaTeX format.

Enable table recognition: When enabled, the system recognizes tables in the document and converts them into HTML format.

Language: Specifies the language of the document, which can improve OCR accuracy.

Force enable OCR: Forces OCR to run on the document.
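Because recognized tables are emitted as HTML, downstream code usually needs to turn them back into rows and cells. The sketch below uses only the Python standard library and assumes the parsed output is Markdown text with tables embedded as `<table>` markup; the class name `TableRows`, the helper `extract_tables`, and the sample string are hypothetical, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per table found
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside table cells (headings, paragraphs) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(markdown_text):
    """Return every HTML table in `markdown_text` as rows of cell strings."""
    parser = TableRows()
    parser.feed(markdown_text)
    parser.close()
    return parser.tables


# Illustrative input only; not actual MinerU output.
sample = (
    "# Report\n\n"
    "<table><tr><th>Model</th><th>Params</th></tr>"
    "<tr><td>MinerU2.5</td><td>1.2B</td></tr></table>\n"
)
tables = extract_tables(sample)
```

Each table comes back as a list of rows, so the first row of `tables[0]` holds the header cells and can be passed to `csv.writer` or a dataframe constructor as needed.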

Citation Information

The citation information for this project is as follows:

@misc{niu2025mineru25decoupledvisionlanguagemodel,
      title={MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing},
      author={Junbo Niu and Zheng Liu and Zhuangcheng Gu and Bin Wang and Linke Ouyang and Zhiyuan Zhao and Tao Chu and Tianyao He and Fan Wu and Qintong Zhang and Zhenjiang Jin and others},
      year={2025},
      eprint={2509.22186},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.22186},
}

Date: 4 months ago

Size: 708.79 MB

This notebook is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at support@hyper.ai for prompt review and removal.
