Review — Convolutional Sequence to Sequence Learning (ConvS2S)

ConvS2S, a Fully Convolutional Network, Outperforms GNMT

Outline

1. ConvS2S: Network Architecture

Convolutional Sequence to Sequence Learning (ConvS2S) Network Architecture

To generate output y_{i+1}, the decoder computes a new hidden state h_{i+1} based on the previous state h_i, an embedding g_i of the previous target language word y_i, as well as a conditional input c_i derived from the encoder output z.
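As a rough illustration of how the conditional input can be derived from the encoder output z, here is a minimal sketch of one attention step for a single decoder position. It assumes dot-product attention over source positions, as described in the ConvS2S paper, and it leaves out the learned linear projection of the decoder state (identity assumed). The function and argument names (`conditional_input`, `h_i`, `g_i`, `z`, `e`) are illustrative choices, not names from any library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conditional_input(h_i, g_i, z, e):
    """Sketch of the conditional input for one decoder position.

    h_i: current decoder state (list of floats)
    g_i: embedding of the previous target word
    z:   encoder outputs, one vector per source position
    e:   encoder input embeddings, one vector per source position
    """
    # Combine the decoder state with the previous target embedding.
    # Assumption: the learned projection of h_i is omitted (identity).
    d_i = [a + b for a, b in zip(h_i, g_i)]

    # Attention weights: softmax of dot products with each encoder output.
    weights = softmax([dot(d_i, z_j) for z_j in z])

    # Weighted sum over source positions of (z_j + e_j).
    dim = len(z[0])
    c_i = [
        sum(w * (z_j[k] + e_j[k]) for w, z_j, e_j in zip(weights, z, e))
        for k in range(dim)
    ]
    return c_i, weights
```

In the full model this computation is repeated in every decoder layer (hence "multi-step" attention), and the resulting vector is added to that layer's decoder state.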

Position Embeddings e at Encoder
Position Embeddings g at Decoder
1D Convolution at Encoder
1D Convolution at Decoder
Gated Linear Unit (GLU) at Encoder (Output is z)
Gated Linear Unit (GLU) at Decoder (Output is h)
Multi-Step Attention
Predicted output y
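The encoder-side components listed above can be sketched in a few lines of plain Python. This is a toy version under stated assumptions: position embeddings are simply added to word embeddings, the 1D convolution maps a window of k input vectors of dimension d to 2d values, and the GLU gates the first half with the sigmoid of the second half. The residual connection follows the ConvS2S paper. The helper names (`add_position`, `conv_glu_block`) and the flattened weight layout are hypothetical choices made for readability.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def glu(y):
    """Split a 2d-dimensional vector into halves A and B; return A * sigmoid(B)."""
    d = len(y) // 2
    a, b = y[:d], y[d:]
    return [ai * sigmoid(bi) for ai, bi in zip(a, b)]


def add_position(word_emb, pos_emb):
    """Element-wise sum of word and position embeddings for each position."""
    return [[w + p for w, p in zip(wv, pv)] for wv, pv in zip(word_emb, pos_emb)]


def conv_glu_block(x, weight, bias, k):
    """One convolutional block with GLU and a residual connection.

    x:      list of m vectors, each of dimension d
    weight: 2d rows, each of length k * d (flattened window); assumed layout
    bias:   2d values
    k:      odd kernel width; zero padding on both sides (encoder style)
    """
    d = len(x[0])
    pad = [[0.0] * d for _ in range(k // 2)]
    padded = pad + x + pad
    out = []
    for i in range(len(x)):
        # Flatten the k vectors centred on position i into one window.
        window = [v for vec in padded[i:i + k] for v in vec]
        # Linear map from k*d inputs to 2d outputs.
        y = [sum(w * v for w, v in zip(row, window)) + b
             for row, b in zip(weight, bias)]
        gated = glu(y)
        # Residual connection: add the block input at the same position.
        out.append([g + r for g, r in zip(gated, x[i])])
    return out
```

The decoder uses the same kind of block, except that its padding is arranged so that a position cannot see future target words.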

2. Experimental Results

Accuracy on WMT tasks compared to previous work.

On WMT’16 English-Romanian translation, ConvS2S outperforms the WMT’16 winning entry by 1.9 BLEU.

On WMT’14 English-German translation, the ConvS2S model outperforms GNMT by 0.5 BLEU.

On WMT’14 English-French translation, ConvS2S improves over GNMT in the same setting by 1.6 BLEU on average. ConvS2S also outperforms GNMT’s reinforcement (RL) models by 0.5 BLEU.

Accuracy of ensembles with eight models

ConvS2S outperforms the best current ensembles on both datasets.


PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.
