5 months ago

Efficient Medical VIE via Reinforcement Learning

Lijun Liu Ruiyang Li Zhaocheng Liu Chenglin Zhu Chong Li Jiehan Cheng Qiang Ju Jian Xie

Abstract

Visual Information Extraction (VIE) converts unstructured document images into structured formats like JSON, which is critical for medical applications such as report analysis and online consultations. Traditional methods rely on OCR and language models, while end-to-end multimodal models offer direct JSON generation. However, domain-specific schemas and high annotation costs limit their effectiveness in medical VIE. We base our approach on the Reinforcement Learning with Verifiable Rewards (RLVR) framework to address these challenges using only 100 annotated samples. Our approach combines a diverse dataset, a balanced precision-recall reward mechanism that reduces hallucinations and improves field coverage, and innovative sampling strategies that enhance reasoning capabilities. Fine-tuning Qwen2.5-VL-7B with our RLVR method, we achieve state-of-the-art performance on medical VIE tasks, significantly improving F1, precision, and recall. While our models excel on tasks similar to medical datasets, performance drops on dissimilar tasks, highlighting the need for domain-specific optimization. Case studies further demonstrate the value of reasoning during training and inference for VIE.
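
The abstract does not spell out the reward formula, so the following is only a minimal sketch of what a balanced precision-recall reward over JSON fields could look like. It assumes exact-match scoring on flattened (path, value) pairs and an F-beta combination; the function names `flatten` and `precision_recall_reward` and the `beta` parameter are hypothetical and are not taken from the paper.

```python
import json
from typing import Any, Set, Tuple


def flatten(obj: Any, prefix: str = "") -> Set[Tuple[str, str]]:
    """Flatten nested JSON into a set of (path, value) pairs."""
    items: Set[Tuple[str, str]] = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            items |= flatten(value, f"{prefix}/{key}")
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items |= flatten(value, f"{prefix}/{index}")
    else:
        # Leaf value: compare as a stripped string (an assumption of this sketch).
        items.add((prefix, str(obj).strip()))
    return items


def precision_recall_reward(pred_text: str, ref: Any, beta: float = 1.0) -> float:
    """Score a model's JSON output against a reference with an F-beta measure.

    Precision penalizes fields that are not in the reference (hallucinations);
    recall penalizes reference fields that are missing (poor coverage).
    Unparseable output receives a reward of 0.0.
    """
    try:
        pred = json.loads(pred_text)
    except (json.JSONDecodeError, TypeError):
        return 0.0

    pred_items = flatten(pred)
    ref_items = flatten(ref)
    if not pred_items or not ref_items:
        return 0.0

    true_positives = len(pred_items & ref_items)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(pred_items)
    recall = true_positives / len(ref_items)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    reference = {"patient": {"name": "A", "age": 40}, "diagnosis": "flu"}
    # One extra field lowers precision; recall stays at 1.0.
    output = '{"patient": {"name": "A", "age": 40}, "diagnosis": "flu", "drug": "x"}'
    print(round(precision_recall_reward(output, reference), 3))
```

Because the score is computed deterministically from the reference annotation, it can serve as a verifiable reward in the RLVR sense. Under this sketch, setting `beta` above 1 would weight recall (field coverage) more heavily, and setting it below 1 would weight precision (fewer hallucinated fields) more heavily.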
