
Machine Translation on WMT2014 English-German

Metrics

BLEU score
Hardware Burden
Operations per network pass
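
Rankings on this benchmark are driven by the BLEU score, which compares system translations against reference translations using modified n-gram precision and a brevity penalty. Below is a minimal corpus-level BLEU sketch assuming the sacrebleu package; note that many of the papers listed here report tokenized BLEU (e.g. via multi-bleu.perl), so published numbers are not always directly comparable to sacrebleu output.

```python
# Minimal corpus-BLEU sketch; assumes the `sacrebleu` package is installed.
import sacrebleu

# Detokenized system outputs, one sentence per entry.
hypotheses = [
    "Der Hund lief über die Straße.",
    "Das Wetter ist heute schön.",
]
# One reference stream, aligned sentence-by-sentence with the hypotheses.
references = [[
    "Der Hund rannte über die Straße.",
    "Das Wetter ist heute schön.",
]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")  # corpus-level score on a 0-100 scale
```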

Results

Performance of various models on this benchmark, ranked by BLEU score.

Model | BLEU score | Hardware Burden | Paper Title
Transformer Cycle (Rev) | 35.14 | -- | Lessons on Parameter Sharing across Layers in Transformers
Noisy back-translation | 35.0 | 146G | Understanding Back-Translation at Scale
Transformer+Rep(Uni) | 33.89 | | Rethinking Perturbations in Encoder-Decoders for Fast Training
T5-11B | 32.1 | -- | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BiBERT | 31.26 | -- | BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
Transformer + R-Drop | 30.91 | 49G | R-Drop: Regularized Dropout for Neural Networks
Bi-SimCut | 30.78 | -- | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation
BERT-fused NMT | 30.75 | -- | Incorporating BERT into Neural Machine Translation
Data Diversification - Transformer | 30.7 | | Data Diversification: A Simple Strategy For Neural Machine Translation
SimCut | 30.56 | -- | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation
Mask Attention Network (big) | 30.4 | | Mask Attention Networks: Rethinking and Strengthen Transformer
Transformer (ADMIN init) | 30.1 | -- | Very Deep Transformers for Neural Machine Translation
PowerNorm (Transformer) | 30.1 | -- | PowerNorm: Rethinking Batch Normalization in Transformers
Depth Growing | 30.07 | 24G | Depth Growing for Neural Machine Translation
MUSE (Parallel Multi-scale Attention) | 29.9 | | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning
Evolved Transformer Big | 29.8 | -- | The Evolved Transformer
OmniNetP | 29.8 | | OmniNet: Omnidirectional Representations from Transformers
Local Joint Self-attention | 29.7 | | Joint Source-Target Self Attention with Locality Constraints
DynamicConv | 29.7 | -- | Pay Less Attention with Lightweight and Dynamic Convolutions
TaLK Convolutions | 29.6 | -- | Time-aware Large Kernel Convolutions
The table above lists the top 20 of 91 reported results on this benchmark.
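
For working with these results programmatically, the following small sketch (plain Python standard library) ranks leaderboard-style rows by BLEU. The CSV layout is a hypothetical export for illustration, not an official HyperAI format, and only a few rows from the table are included.

```python
# Illustrative only: rank leaderboard-style rows by BLEU score.
# The inline CSV is a hypothetical export, not an official HyperAI format.
import csv
import io

raw = """model,bleu,hardware_burden,paper
Transformer + R-Drop,30.91,49G,R-Drop: Regularized Dropout for Neural Networks
Transformer Cycle (Rev),35.14,,Lessons on Parameter Sharing across Layers in Transformers
Noisy back-translation,35.0,146G,Understanding Back-Translation at Scale
"""

rows = list(csv.DictReader(io.StringIO(raw)))
rows.sort(key=lambda r: float(r["bleu"]), reverse=True)  # highest BLEU first
for rank, row in enumerate(rows, start=1):
    print(rank, row["model"], row["bleu"])
```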