Review — The OpenNMT Neural Machine Translation Toolkit: 2020 Edition

OpenNMT Website: https://opennmt.net

Jun 18, 2022

**OpenNMT: open source ecosystem for neural machine translation and neural sequence learning**

The OpenNMT Neural Machine Translation Toolkit: 2020 Edition,
OpenNMT: Neural Machine Translation Toolkit,
OpenNMT: Open-Source Toolkit for Neural Machine Translation
OpenNMT, by SYSTRAN, Ubiqus, and Harvard SEAS
2020 AMTA, 2018 AMTA, 2017 ACL, with over 20, 90, and 1600 citations, respectively (Sik-Ho Tsang @ Medium)
Natural Language Processing, NLP, Neural Machine Translation, NMT

OpenNMT is a multi-year open-source ecosystem for neural machine translation (NMT) and natural language generation (NLG).
OpenNMT has been used in several production MT systems.
This paper introduces the OpenNMT toolkit rather than a new NMT method.

Outline

OpenNMT
Experimental Results

1. OpenNMT

**Features implemented by OpenNMT-py (column py) and OpenNMT-tf (column tf)**

It supports a wide range of model architectures (ConvS2S, GPT-2, Transformer, etc.) and training procedures for neural machine translation as well as related tasks such as natural language generation and language modeling.

OpenNMT-py: A user-friendly and multimodal implementation benefiting from PyTorch's ease of use and versatility.
OpenNMT-tf: A modular and stable implementation powered by the TensorFlow 2 ecosystem.

OpenNMT was first released in late 2016 as a Torch7 implementation. The original demonstration paper in 2017 was awarded “Best Demonstration Paper Runner-Up” at ACL 2017.
After the release of PyTorch, the sunsetting of the Torch7 version began.
After more than 3 years (2017 to 2020) of active development, the OpenNMT projects have been starred by over 7,400 users. A community forum is also home to 970 users and more than 9,800 posts about NMT research and how to use OpenNMT effectively.

A live demo is also available.
Research: OpenNMT has been used for other tasks related to neural machine translation, such as summarization, data-to-text, image-to-text, automatic speech recognition, and semantic parsing.
Production: OpenNMT is also widely used in industry. Companies such as SYSTRAN, Booking.com, and Ubiqus are known to deploy OpenNMT models in production.
Framework: Organizations such as SwissPost and BNP Paribas have integrated it into their own frameworks, while NVIDIA used OpenNMT as a benchmark for the release of TensorRT 6.

2. Experimental Results

2.1. 2020 AMTA Results

**Model size and translation speed (target tokens per second) for a base English-German** **Transformer**

Dataset: English to German WMT19 task, with the addition of ParaCrawl v5 instead of v3.
Tokenization: 40,000 BPE merge operations, learned and applied with Tokenizer.
Model: Transformer Medium (12 heads, d_model = 768, d_ff = 3072).
Training: Trained with OpenNMT-py on 6 RTX 2080 Ti GPUs, using mixed precision. The initial batch size is around 50,000 tokens and the final batch size is around 200,000 tokens.
Inference: The reported scores are obtained with beam search of size 5 and the average length penalty.
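To make the inference setting concrete, here is a minimal sketch of beam search with an average length penalty. It reads "average length penalty" as dividing each hypothesis' cumulative log-probability by its length, which is an assumption about the exact formula. The names `beam_search`, `toy_step`, and the tiny probability table are hypothetical and stand in for a real model; this is not OpenNMT code.

```python
import math

EOS = "</s>"

def toy_step(prefix):
    """Hypothetical stand-in for a decoder: next-token log-probs for a prefix."""
    table = {
        (): {"a": 0.5, EOS: 0.5},
        ("a",): {"b": 0.9, EOS: 0.1},
        ("a", "b"): {EOS: 0.9, "c": 0.1},
    }
    probs = table.get(tuple(prefix), {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}

def beam_search(step_fn, beam_size=5, max_len=20, normalize=True):
    """Return (tokens, summed log-prob) of the best hypothesis found."""
    beams = [((), 0.0)]   # live hypotheses: (tokens, summed log-prob)
    finished = []         # hypotheses that already emitted EOS
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, logp in step_fn(tokens).items():
                candidates.append((tokens + (tok,), score + logp))
        # Keep only the beam_size highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            (finished if tokens[-1] == EOS else beams).append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len

    def final_score(hyp):
        tokens, score = hyp
        # Average length penalty (assumed form): log-prob per token.
        return score / max(len(tokens), 1) if normalize else score

    return max(finished, key=final_score)

if __name__ == "__main__":
    print(beam_search(toy_step, beam_size=5)[0])
    print(beam_search(toy_step, beam_size=5, normalize=False)[0])
```

In this toy example the unnormalized search prefers the hypothesis that stops immediately, while the length-normalized search prefers the longer one, which is the bias that length normalization is meant to counter.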

**OpenNMT system vs. some commercial systems**

During the WMT19 campaign, the best BLEU score for English to German was 44.9, but the best human-evaluated system, an ensemble of Big Transformers, scored only 42.7.
With the setup above, OpenNMT tools make it possible to reach superior performance.
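For readers unfamiliar with the metric behind these numbers, below is a minimal sketch of corpus-level BLEU on a 0 to 100 scale, assuming whitespace-tokenized input, a single reference per sentence, and no smoothing. The function names are hypothetical, and scores in evaluation campaigns normally come from a standardized tool, so this sketch only illustrates the formula (clipped n-gram precisions combined by a geometric mean, times a brevity penalty).

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus BLEU (0-100) with one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp, n)
            ref_ngrams = ngram_counts(ref, n)
            # Clipped matches: each n-gram counts at most as often as in the reference.
            matches[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            totals[n - 1] += sum(hyp_ngrams.values())
    if min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)

if __name__ == "__main__":
    hyp = "the cat sat on the mat".split()
    ref = "the cat sat on a mat".split()
    print(round(corpus_bleu([hyp], [ref]), 1))
```

Because BLEU only counts n-gram overlap with the reference, a system can score higher on BLEU and still be ranked lower by human evaluators, as in the WMT19 figures above.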

**OpenNMT English to French model performance on test sets of various domains**

2.2. 2018 AMTA Results

**Comparison with GNMT on EN→DE. ONMT used a 2-layer bi-RNN of size 1024, embedding size 512, dropout 0.1, and max length 100**

OpenNMT is also compared with GNMT and performs similarly.
(There are more results in these 3 papers; please feel free to read them directly if interested. OpenNMT was still being updated in 2021.)