References

BCB14

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. In ICLR, 1–15. 2014. URL: http://arxiv.org/abs/1409.0473, arXiv:1409.0473.

GAG+17

Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. Convolutional sequence to sequence learning. CoRR, 2017. URL: http://arxiv.org/abs/1705.03122, arXiv:1705.03122.

LZA17

Tao Lei, Yu Zhang, and Yoav Artzi. Training RNNs as fast as CNNs. CoRR, 2017. URL: http://arxiv.org/abs/1709.02755, arXiv:1709.02755.

LL17

Yang Liu and Mirella Lapata. Learning structured text representations. CoRR, 2017. URL: http://arxiv.org/abs/1705.09207, arXiv:1705.09207.

LPM15

Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of EMNLP. 2015.

LSL+15

Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba. Addressing the Rare Word Problem in Neural Machine Translation. In Proceedings of ACL. 2015.

SLM17

Abigail See, Peter J. Liu, and Christopher D. Manning. Get to the point: summarization with pointer-generator networks. CoRR, 2017. URL: http://arxiv.org/abs/1704.04368, arXiv:1704.04368.

SH16

Rico Sennrich and Barry Haddow. Linguistic input features improve neural machine translation. arXiv preprint arXiv:1606.02892, 2016.

VSP+17

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. CoRR, 2017. URL: http://arxiv.org/abs/1706.03762, arXiv:1706.03762.

WSC+16

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, and others. Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.

ZXS18

Biao Zhang, Deyi Xiong, and Jinsong Su. Accelerating neural transformer via an average attention network. CoRR, 2018. URL: http://arxiv.org/abs/1805.00631, arXiv:1805.00631.