FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain

Yanis Labrak Adrien Bazoge Richard Dufour Mickael Rouvier Emmanuel Morin Béatrice Daille Pierre-Antoine Gourraud

Abstract

This paper introduces FrenchMedMCQA, the first publicly available Multiple-Choice Question Answering (MCQA) dataset in French for the medical domain. It comprises 3,105 questions taken from real exams of the French medical specialization diploma in pharmacy, mixing single-answer and multiple-answer questions. Each instance of the dataset contains an identifier, a question, five possible answers and their manually established correct answer(s). We also propose initial baseline models for this MCQA task, both to report current performance and to highlight the difficulty of the task. A detailed analysis of the results shows that representations adapted to the medical domain or to the MCQA task are necessary: in our case, English specialized models yielded better results than generic French ones, even though FrenchMedMCQA is in French. The corpus, models and tools are available online.
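
To make the instance layout described in the abstract concrete, here is a minimal Python sketch of one record. The class name, the field names and the "a" to "e" option keys are illustrative assumptions and are not taken from the released corpus; only the four components (identifier, question, five answers, correct answer(s)) come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MCQAInstance:
    """Hypothetical container for one question; field names are assumptions."""

    identifier: str
    question: str
    options: dict[str, str]   # five candidate answers, keyed "a".."e" (assumed keys)
    correct: frozenset[str]   # keys of the correct answer(s), one or more

    def __post_init__(self) -> None:
        # The abstract states that every instance has five possible answers.
        if len(self.options) != 5:
            raise ValueError("expected exactly five possible answers")
        # At least one correct answer, and each must refer to an existing option.
        if not self.correct or not self.correct.issubset(self.options):
            raise ValueError("correct answers must be a non-empty subset of the options")

    @property
    def is_single_answer(self) -> bool:
        """True when the question has exactly one correct answer."""
        return len(self.correct) == 1


# Placeholder content only; this is not a real question from the dataset.
example = MCQAInstance(
    identifier="example-0001",
    question="(placeholder question text)",
    options={key: f"(placeholder answer {key})" for key in "abcde"},
    correct=frozenset({"a", "c"}),
)
```

Storing the correct answers as a set keeps single-answer and multiple-answer questions in one representation, which is convenient when both kinds are mixed in the same corpus.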

Benchmarks

Benchmark: multiple-choice-question-answering-mcqa-on-22

Methodology | Exact Match Accuracy | Hamming Score
DrBERT | 15.32 | 37.37
CamemBERT | 16.55 | 36.24
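
The two metrics reported above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation script: exact match is taken to mean that the predicted answer set equals the gold answer set, and the Hamming score is taken to be the size of the intersection of the two sets divided by the size of their union, averaged over questions. The function names are hypothetical.

```python
from typing import Iterable, Sequence


def exact_match(gold: Iterable[str], pred: Iterable[str]) -> float:
    """1.0 if the predicted answer set equals the gold set, else 0.0."""
    return 1.0 if set(gold) == set(pred) else 0.0


def hamming_score(gold: Iterable[str], pred: Iterable[str]) -> float:
    """Intersection over union of gold and predicted answer sets (assumed definition)."""
    gold_set, pred_set = set(gold), set(pred)
    union = gold_set | pred_set
    if not union:
        # Both sets empty: treat as a perfect match by convention.
        return 1.0
    return len(gold_set & pred_set) / len(union)


def evaluate(golds: Sequence[Iterable[str]], preds: Sequence[Iterable[str]]) -> dict[str, float]:
    """Average both metrics over a list of questions."""
    if len(golds) != len(preds):
        raise ValueError("golds and preds must have the same length")
    if not golds:
        raise ValueError("cannot evaluate an empty list of questions")
    n = len(golds)
    return {
        "exact_match": sum(exact_match(g, p) for g, p in zip(golds, preds)) / n,
        "hamming_score": sum(hamming_score(g, p) for g, p in zip(golds, preds)) / n,
    }


# Toy answer keys, not data from the benchmark.
golds = [{"a"}, {"a", "b"}, {"c", "d"}]
preds = [{"a"}, {"a"}, {"e"}]
scores = evaluate(golds, preds)
```

Under these definitions a partially correct prediction, such as the second question in the toy example, contributes nothing to exact match but still earns partial credit in the Hamming score, which would explain why the Hamming column can be much higher than the exact match column.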