What this paper is about
Deep Neural Networks (DNNs) are described as powerful models that have achieved excellent performance on difficult learning tasks. [S1] The paper states that DNNs work well whenever large labeled training sets are available. [S1] The paper also states that DNNs cannot be used to map sequences to sequences. [S1] The paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. [S1] The method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality. [S1] The method then uses another deep LSTM to decode the target sequence from that vector. [S1] The paper reports a main result on an English to French translation task from the WMT’14 dataset. [S1] The paper reports that the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set. [S1] The paper reports that this BLEU score was penalized on out-of-vocabulary words. [S1] The paper reports that the LSTM did not have difficulty on long sentences. [S1]
Core claims to remember
The paper states that DNNs are powerful models that have achieved excellent performance on difficult learning tasks. [S1] The paper states that DNNs work well whenever large labeled training sets are available. [S1] The paper states that DNNs cannot be used to map sequences to sequences. [S1] The paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. [S1] The paper’s approach uses a multilayered LSTM to map an input sequence into a vector of fixed dimensionality. [S1] The paper’s approach uses another deep LSTM to decode a target sequence from the fixed-dimensional vector. [S1] The paper reports a translation result on WMT’14 English-to-French in which the LSTM achieves a BLEU score of 34.8 on the entire test set. [S1] The paper reports that the BLEU score for the LSTM was penalized on out-of-vocabulary words. [S1] The paper reports that the LSTM did not have difficulty on long sentences. [S1]
Limitations and caveats
The paper reports that the BLEU score of 34.8 on the WMT’14 English-to-French test set was computed with a penalty on out-of-vocabulary words. [S1] The paper reports its main result on an English to French translation task from the WMT’14 dataset. [S1] The paper reports that the BLEU score of 34.8 is measured on the entire test set for that task. [S1] The paper reports that the LSTM did not have difficulty on long sentences, but it does not provide quantitative details about sentence length in the excerpted text. [S1]
How to apply this in study or projects
Read the paper’s statement that DNNs cannot be used to map sequences to sequences, and relate that statement to the paper’s proposed sequence-learning approach. [S1] Follow the described encoder step in which a multilayered LSTM maps an input sequence to a vector of fixed dimensionality. [S1] Follow the described decoder step in which another deep LSTM decodes the target sequence from the fixed-dimensional vector. [S1] Locate the reported WMT’14 English-to-French translation evaluation and note that the paper reports a BLEU score of 34.8 on the entire test set. [S1] Track how the paper reports penalizing BLEU on out-of-vocabulary words when presenting the 34.8 score. [S1] Review the paper’s reported observation that the LSTM did not have difficulty on long sentences, and connect it to the use of LSTMs for sequence learning as stated in the method description. [S1]