Machine Translation on WMT2014 English-German

Evaluation Metrics

BLEU score
Hardware Burden
Operations per network pass
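
The BLEU scores below are corpus-level scores on the newstest2014 English-German test set, reported on a 0-100 scale. As a minimal sketch, the snippet below shows how such a corpus BLEU score can be computed with the sacrebleu Python package; the hypothesis and reference sentences are illustrative placeholders, not leaderboard data. Note that many of the listed papers report tokenized BLEU (e.g. multi-bleu.perl with compound splitting), which is not always directly comparable to sacrebleu output.

```python
# Minimal sketch: corpus-level BLEU with the sacrebleu package.
# The sentences below are illustrative placeholders only.
import sacrebleu

# One detokenized system output (hypothesis) per source sentence.
hypotheses = [
    "The cat sits on the mat .",
    "There is a book on the table .",
]

# References are passed as one or more reference streams,
# each aligned one-to-one with the hypotheses.
references = [
    [
        "The cat is sitting on the mat .",
        "A book lies on the table .",
    ]
]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")  # corpus BLEU on the 0-100 scale
```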

Evaluation Results

Performance of each model on this benchmark

| Model | BLEU score | Hardware Burden | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| Transformer Cycle (Rev) | 35.14 | -- | Lessons on Parameter Sharing across Layers in Transformers | |
| Noisy back-translation | 35.0 | 146G | Understanding Back-Translation at Scale | |
| Transformer+Rep(Uni) | 33.89 | | Rethinking Perturbations in Encoder-Decoders for Fast Training | |
| T5-11B | 32.1 | -- | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | |
| BiBERT | 31.26 | -- | BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation | |
| Transformer + R-Drop | 30.91 | 49G | R-Drop: Regularized Dropout for Neural Networks | |
| Bi-SimCut | 30.78 | -- | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | |
| BERT-fused NMT | 30.75 | -- | Incorporating BERT into Neural Machine Translation | |
| Data Diversification - Transformer | 30.7 | | Data Diversification: A Simple Strategy For Neural Machine Translation | |
| SimCut | 30.56 | -- | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | |
| Mask Attention Network (big) | 30.4 | | Mask Attention Networks: Rethinking and Strengthen Transformer | |
| Transformer (ADMIN init) | 30.1 | -- | Very Deep Transformers for Neural Machine Translation | |
| PowerNorm (Transformer) | 30.1 | -- | PowerNorm: Rethinking Batch Normalization in Transformers | |
| Depth Growing | 30.07 | 24G | Depth Growing for Neural Machine Translation | |
| MUSE (Parallel Multi-scale Attention) | 29.9 | | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning | |
| Evolved Transformer Big | 29.8 | -- | The Evolved Transformer | |
| OmniNetP | 29.8 | | OmniNet: Omnidirectional Representations from Transformers | |
| Local Joint Self-attention | 29.7 | | Joint Source-Target Self Attention with Locality Constraints | |
| DynamicConv | 29.7 | -- | Pay Less Attention with Lightweight and Dynamic Convolutions | |
| TaLK Convolutions | 29.6 | -- | Time-aware Large Kernel Convolutions | |