TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang

Abstract

Bidirectional Encoder Representations from Transformers (BERT) has recently achieved state-of-the-art performance on a broad range of NLP tasks, including sentence classification, machine translation, and question answering. The BERT model architecture is derived primarily from the transformer. Prior to the transformer era, the bidirectional Long Short-Term Memory (BLSTM) was the dominant modeling architecture for neural machine translation and question answering. In this paper, we investigate how these two modeling techniques can be combined to create a more powerful model architecture. We propose a new architecture, denoted Transformer with BLSTM (TRANS-BLSTM), which integrates a BLSTM layer into each transformer block, yielding a joint modeling framework for the transformer and BLSTM. We show that TRANS-BLSTM models consistently improve accuracy over BERT baselines on GLUE and SQuAD 1.1 experiments. Our TRANS-BLSTM model obtains an F1 score of 94.01% on the SQuAD 1.1 development set, which is comparable to the state-of-the-art result.
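The core idea — combining a per-position recurrent (BLSTM) pass with a transformer sub-layer inside each block — can be sketched in dependency-free pseudocode. The layer internals below are toy stand-ins (assumptions for illustration, not the paper's exact equations); the point is the block wiring, where the BLSTM output is summed with the attention output, one of the integration schemes the paper describes.

```python
def self_attention(seq):
    # Toy "attention": each position mixes in the sequence mean.
    mean = sum(seq) / len(seq)
    return [0.5 * x + 0.5 * mean for x in seq]

def lstm_pass(seq):
    # Toy recurrence: a decayed running sum stands in for an LSTM cell.
    h, out = 0.0, []
    for x in seq:
        h = 0.5 * h + x
        out.append(h)
    return out

def blstm(seq):
    # Bidirectional: forward pass plus a backward pass over the reversed
    # sequence, combined position-wise.
    fwd = lstm_pass(seq)
    bwd = list(reversed(lstm_pass(list(reversed(seq)))))
    return [f + b for f, b in zip(fwd, bwd)]

def trans_blstm_block(seq):
    # Joint block: sum the transformer sub-layer output with the BLSTM
    # output at every position.
    attn = self_attention(seq)
    rec = blstm(seq)
    return [a + r for a, r in zip(attn, rec)]

hidden = [1.0, 2.0, 3.0]
print(trans_blstm_block(hidden))  # one combined value per input position
```

In a real implementation each position would hold a hidden vector rather than a scalar, and the attention and LSTM layers would be learned; the block structure, however, is the same.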

Benchmarks

Benchmark | Methodology | Metric
natural-language-inference-on-qnli | TRANS-BLSTM | Accuracy: 94.08%
paraphrase-identification-on-quora-question | TRANS-BLSTM | Accuracy: 88.28%
text-classification-on-glue-mrpc | TRANS-BLSTM | Accuracy: 90.45%
text-classification-on-glue-rte | TRANS-BLSTM | Accuracy: 79.78%
text-classification-on-glue-sst2 | TRANS-BLSTM | Accuracy: 94.38%
