BioMegatron: Larger Biomedical Domain Language Model

Hoo-Chang Shin, Yang Zhang, Evelina Bakhturina, Raul Puri, Mostofa Patwary, Mohammad Shoeybi, Raghav Mani

Abstract

There has been an influx of biomedical domain-specific language models, showing that language models pre-trained on biomedical text perform better on biomedical domain benchmarks than those trained on general-domain text corpora such as Wikipedia and Books. Yet most works do not study in depth the factors affecting each domain language application. Additionally, the effect of model size on domain-specific models has been mostly unstudied. We empirically study and evaluate several factors that can affect performance on domain language applications, such as the sub-word vocabulary set, model size, pre-training corpus, and domain transfer. We show consistent improvements on benchmarks with our larger BioMegatron model trained on a larger domain corpus, contributing to our understanding of domain language model applications. We demonstrate noticeable improvements over the previous state-of-the-art (SOTA) on standard biomedical NLP benchmarks of named entity recognition, relation extraction, and question answering. Model checkpoints and code are available at https://ngc.nvidia.com and https://github.com/NVIDIA/NeMo.

Code Repositories

NVIDIA/NeMo (official, PyTorch)

Benchmarks

Benchmark                                    | Methodology           | Metrics
named-entity-recognition-ner-on-ncbi-disease | BioMegatron BERT-cased | F1: 87.8
named-entity-recognition-on-bc5cdr-chemical  | BioMegatron            | F1: 92.9
named-entity-recognition-on-bc5cdr-disease   | BioMegatron            | F1: 88.5
relation-extraction-on-chemprot              | BioMegatron            | F1: 77.0
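
The F1 values listed above are harmonic means of precision and recall. As a rough illustration only, here is a minimal Python sketch of exact-match, entity-level F1. It rests on assumptions: this page does not give the scoring scripts used for these benchmarks, the function name `entity_f1` and the `(start, end, label)` span format are hypothetical, and the spans are invented toy data.

```python
def entity_f1(gold, predicted):
    """Exact-match entity-level precision, recall, and F1.

    gold / predicted: iterables of (start, end, label) tuples.
    This span format is a hypothetical choice for illustration.
    """
    gold, predicted = set(gold), set(predicted)
    # A prediction counts only if span boundaries and label all match.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, not taken from any benchmark.
gold_spans = {
    (0, 2, "Disease"),
    (5, 6, "Chemical"),
    (9, 11, "Disease"),
    (14, 15, "Chemical"),
}
pred_spans = {
    (0, 2, "Disease"),
    (5, 6, "Chemical"),
    (9, 10, "Disease"),  # wrong right boundary, so no exact match
}

precision, recall, f1 = entity_f1(gold_spans, pred_spans)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three predicted spans match two of the four gold spans, so precision is 2/3, recall is 1/2, and F1 is 4/7. Published benchmark scores may use different matching or averaging conventions, so this sketch should not be read as the evaluation behind the numbers in the table.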
