BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition

Cláudia Maria Cabral Moro Barra, Douglas Teodoro, Emerson Cabrera Paraiso, Lucas Ferro Antunes de Oliveira, Yohan Bonescki Gumiel, Jenny Copara, Lucas Emanuel Silva e Oliveira, Julien Knafou, João Vitor Andrioli de Souza, Elisa Terumi Rubel Schneider

Abstract

With the growing volume of electronic health record data, clinical NLP tasks have become increasingly relevant for unlocking valuable information from unstructured clinical text. Although contextualised language models have recently improved the performance of downstream NLP tasks, such as named-entity recognition (NER), on English corpora, less research is available for clinical texts in low-resource languages. Our goal is to assess a deep contextual embedding model for Portuguese, called BioBERTpt, to support clinical and biomedical NER. We transfer the information encoded in a multilingual BERT model to a corpus of clinical narratives and biomedical scientific papers in Brazilian Portuguese. To evaluate the performance of BioBERTpt, we ran NER experiments on two annotated corpora containing clinical narratives and compared the results with existing BERT models. Our in-domain model outperformed the baseline model in F1-score by 2.72%, achieving higher performance in 11 out of 13 assessed entities. We demonstrate that enriching contextual embedding models with domain literature can play an important role in improving performance on specific NLP tasks. The transfer learning process enhanced the Portuguese biomedical NER model by reducing the need for labeled data and for training an entirely new model.

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| named-entity-recognition-ner-on-semclinbr | pucpr/biobertpt-clin | Micro F1: 0.602 |
| named-entity-recognition-ner-on-semclinbr | pucpr/biobertpt-all | Micro F1: 0.604 |
| named-entity-recognition-ner-on-semclinbr | pucpr/biobertpt-bio | Micro F1: 0.602 |
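
The Micro F1 values listed above are micro-averaged scores. As a minimal sketch of how such a score is typically computed, assuming entity-level counts of true positives, false positives and false negatives per entity type, the counts are pooled across all types before precision and recall are combined. The function name `micro_f1`, the entity type names and the counts below are hypothetical and chosen for illustration only; they are not taken from the paper or its evaluation code.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one entity type
Counts = Tuple[int, int, int]


def micro_f1(per_type: Dict[str, Counts]) -> float:
    """Micro-averaged F1: pool the counts over all entity types, then score once."""
    tp = sum(c[0] for c in per_type.values())
    fp = sum(c[1] for c in per_type.values())
    fn = sum(c[2] for c in per_type.values())
    # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
    # of pooled precision and pooled recall.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for two made-up entity types (not results from the paper).
example = {
    "Disorder": (60, 30, 40),
    "Procedure": (30, 20, 30),
}

print(round(micro_f1(example), 3))  # prints 0.6
```

Because the counts are pooled, frequent entity types weigh more heavily in a micro-averaged score than rare ones, which is why it can differ from a per-type (macro) average.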
