Veysel Kocaman; David Talby

Abstract
Named entity recognition (NER) is a widely applicable natural language processing task and a building block of question answering, topic modeling, information retrieval, and related applications. In the medical domain, NER plays a crucial role by extracting meaningful chunks from clinical notes and reports, which are then fed to downstream tasks such as assertion status detection, entity resolution, relation extraction, and de-identification. By reimplementing a Bi-LSTM-CNN-Char deep learning architecture on top of Apache Spark, we present a single trainable NER model that obtains new state-of-the-art results on seven public biomedical benchmarks without using heavy contextual embeddings like BERT. These results include improving BC4CHEMD to 93.72% (4.1% gain), Species800 to 80.91% (4.6% gain), and JNLPBA to 81.29% (5.2% gain). In addition, this model is freely available within a production-grade code base as part of the open-source Spark NLP library; it can scale up for training and inference in any Spark cluster, has GPU support and libraries for popular programming languages such as Python, R, Scala, and Java, and can be extended to support other human languages with no code changes.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| named-entity-recognition-ner-on-bc5cdr | BLSTM-CNN-Char (SparkNLP) | F1: 89.73 |
| named-entity-recognition-ner-on-bc5cdr | Spark NLP | F1: 89.73 |
| named-entity-recognition-ner-on-jnlpba | Spark NLP | F1: 81.29 |
| named-entity-recognition-ner-on-jnlpba | BLSTM-CNN-Char (SparkNLP) | F1: 81.29 |
| named-entity-recognition-ner-on-ncbi-disease | BLSTM-CNN-Char (SparkNLP) | F1: 89.13 |
| named-entity-recognition-ner-on-ncbi-disease | Spark NLP | F1: 89.13 |
| named-entity-recognition-on-anatem | BLSTM-CNN-Char (SparkNLP) | F1: 89.13 |
| named-entity-recognition-on-bc2gm | Spark NLP | F1: 88.75 |
| named-entity-recognition-on-bc4chemd | BLSTM-CNN-Char (SparkNLP) | F1: 93.72 |
| named-entity-recognition-on-bc5cdr-chemical | Spark NLP | F1: 94.88 |
| named-entity-recognition-on-bionlp13-cg | BLSTM-CNN-Char (SparkNLP) | F1: 85.58 |
| named-entity-recognition-on-linnaeus | BLSTM-CNN-Char (SparkNLP) | F1: 86.26 |
| named-entity-recognition-on-linnaeus | Spark NLP | F1: 86.26 |
| named-entity-recognition-on-species-800 | Spark NLP | F1: 80.91 |
| named-entity-recognition-on-species800 | BLSTM-CNN-Char (SparkNLP) | F1: 80.91 |
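
The F1 values above are most naturally read as entity-level scores, where a predicted entity counts as correct only if its type and token boundaries both match the gold annotation exactly. This page does not state which scoring script was used, so the following is only a minimal sketch of that convention, assuming BIO-style tags. The function names `bio_to_spans` and `entity_f1` and the toy tag sequences are hypothetical and are not taken from the paper or from Spark NLP.

```python
from typing import List, Set, Tuple

# (entity type, start token index, end token index, end exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged token sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel flushes an entity that runs to the last token.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, kind = tag.partition("-")
        continues = prefix == "I" and kind == label
        if label is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        # Assumption: a stray "I-" tag with no open entity starts a new one.
        if prefix == "B" or (prefix == "I" and label is None):
            start, label = i, kind
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged exact-match precision, recall and F1 over sentences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Toy example (invented data): the prediction misses the Disease mention.
gold = [["B-Chemical", "I-Chemical", "O", "B-Disease"]]
pred = [["B-Chemical", "I-Chemical", "O", "O"]]
precision, recall, f1 = entity_f1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Under exact matching, a prediction that gets an entity's type right but its boundary wrong by one token earns no credit, which is one reason scores on boundary-heavy corpora can differ noticeably between systems.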