Pointing the unknown words

Caglar Gulcehre, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, Yoshua Bengio

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

77 Citations (Scopus)

Abstract

The problem of rare and unknown words is an important issue that can potentially affect the performance of many NLP systems, including traditional count-based and deep learning models. We propose a novel way to deal with rare and unseen words in neural network models using attention. Our model uses two softmax layers to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word in the shortlist vocabulary. At each timestep, the decision of which softmax layer to use is made adaptively by an MLP conditioned on the context. We motivate this work by psychological evidence that humans naturally tend to point towards objects in the context or the environment when the name of an object is not known. Using our proposed model, we observe improvements on two tasks: neural machine translation on the Europarl English to French parallel corpora and text summarization on the Gigaword dataset.
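The mechanism described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the scalar `switch_logit` standing in for the output of the MLP, and the choice to weight the two softmax distributions by the switch probability are all assumptions made here for illustration. The score values are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pointer_softmax_step(shortlist_scores, location_scores, switch_logit):
    """Combine the two softmax layers for one decoding timestep.

    shortlist_scores: unnormalised scores over the shortlist vocabulary.
    location_scores:  unnormalised scores over source-sentence positions.
    switch_logit:     hypothetical scalar standing in for the MLP output
                      that decides which softmax layer to use.
    """
    p_shortlist = softmax(shortlist_scores)
    p_location = softmax(location_scores)
    z = sigmoid(switch_logit)  # assumed: probability of using the shortlist
    word_probs = [z * p for p in p_shortlist]
    copy_probs = [(1.0 - z) * p for p in p_location]
    return word_probs, copy_probs


def predict(shortlist, source_tokens, word_probs, copy_probs):
    """Emit the shortlist word or the pointed-to source token, whichever scores higher."""
    best_w = max(range(len(word_probs)), key=word_probs.__getitem__)
    best_c = max(range(len(copy_probs)), key=copy_probs.__getitem__)
    if word_probs[best_w] >= copy_probs[best_c]:
        return shortlist[best_w]
    return source_tokens[best_c]


# Illustrative call: a negative switch logit favours pointing at the source.
shortlist = ["the", "a", "<unk>"]
source_tokens = ["Monsieur", "Gulcehre"]
word_probs, copy_probs = pointer_softmax_step([2.0, 0.5, 0.1], [0.1, 3.0], -2.0)
next_word = predict(shortlist, source_tokens, word_probs, copy_probs)
```

Because the two weighted distributions sum to one, the combined output is still a valid probability distribution over "shortlist word or source position", which is what lets a rare source word be copied instead of being replaced by an unknown-word token.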

Original language: English (US)
Title of host publication: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 140-149
Number of pages: 10
ISBN (Electronic): 9781510827585
State: Published - Jan 1 2016
Event: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: Aug 7 2016 to Aug 12 2016

Publication series

Name: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
Volume: 1

Other

Other: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country: Germany
City: Berlin
Period: 8/7/16 to 8/12/16

Fingerprint

neural network
vocabulary
language
learning
performance
evidence
Layer
Learning Model
Parallel Corpora
Language Model
Neural Network Model
Shortlist
Parallel Texts
Machine Translation
Natural Language Processing
Names
Psychological
Vocabulary
Summarization

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language

Cite this

Gulcehre, C., Ahn, S., Nallapati, R., Zhou, B., & Bengio, Y. (2016). Pointing the unknown words. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers (pp. 140-149). (54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers; Vol. 1). Association for Computational Linguistics (ACL).
Gulcehre, Caglar ; Ahn, Sungjin ; Nallapati, Ramesh ; Zhou, Bowen ; Bengio, Yoshua. / Pointing the unknown words. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers. Association for Computational Linguistics (ACL), 2016. pp. 140-149 (54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers).
@inproceedings{22d4e2be5f3044739870d3196043587f,
title = "Pointing the unknown words",
abstract = "The problem of rare and unknown words is an important issue that can potentially affect the performance of many NLP systems, including traditional count-based and deep learning models. We propose a novel way to deal with rare and unseen words in neural network models using attention. Our model uses two softmax layers to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word in the shortlist vocabulary. At each timestep, the decision of which softmax layer to use is made adaptively by an MLP conditioned on the context. We motivate this work by psychological evidence that humans naturally tend to point towards objects in the context or the environment when the name of an object is not known. Using our proposed model, we observe improvements on two tasks: neural machine translation on the Europarl English to French parallel corpora and text summarization on the Gigaword dataset.",
author = "Caglar Gulcehre and Sungjin Ahn and Ramesh Nallapati and Bowen Zhou and Yoshua Bengio",
year = "2016",
month = "1",
day = "1",
language = "English (US)",
series = "54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers",
publisher = "Association for Computational Linguistics (ACL)",
pages = "140--149",
booktitle = "54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers",

}

Gulcehre, C, Ahn, S, Nallapati, R, Zhou, B & Bengio, Y 2016, Pointing the unknown words. in 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, vol. 1, Association for Computational Linguistics (ACL), pp. 140-149, 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, Berlin, Germany, 8/7/16.

Pointing the unknown words. / Gulcehre, Caglar; Ahn, Sungjin; Nallapati, Ramesh; Zhou, Bowen; Bengio, Yoshua.

54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers. Association for Computational Linguistics (ACL), 2016. p. 140-149 (54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers; Vol. 1).

TY - GEN

T1 - Pointing the unknown words

AU - Gulcehre, Caglar

AU - Ahn, Sungjin

AU - Nallapati, Ramesh

AU - Zhou, Bowen

AU - Bengio, Yoshua

PY - 2016/1/1

Y1 - 2016/1/1

N2 - The problem of rare and unknown words is an important issue that can potentially affect the performance of many NLP systems, including traditional count-based and deep learning models. We propose a novel way to deal with rare and unseen words in neural network models using attention. Our model uses two softmax layers to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word in the shortlist vocabulary. At each timestep, the decision of which softmax layer to use is made adaptively by an MLP conditioned on the context. We motivate this work by psychological evidence that humans naturally tend to point towards objects in the context or the environment when the name of an object is not known. Using our proposed model, we observe improvements on two tasks: neural machine translation on the Europarl English to French parallel corpora and text summarization on the Gigaword dataset.

AB - The problem of rare and unknown words is an important issue that can potentially affect the performance of many NLP systems, including traditional count-based and deep learning models. We propose a novel way to deal with rare and unseen words in neural network models using attention. Our model uses two softmax layers to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word in the shortlist vocabulary. At each timestep, the decision of which softmax layer to use is made adaptively by an MLP conditioned on the context. We motivate this work by psychological evidence that humans naturally tend to point towards objects in the context or the environment when the name of an object is not known. Using our proposed model, we observe improvements on two tasks: neural machine translation on the Europarl English to French parallel corpora and text summarization on the Gigaword dataset.

UR - http://www.scopus.com/inward/record.url?scp=85012023666&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012023666&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85012023666

T3 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers

SP - 140

EP - 149

BT - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers

PB - Association for Computational Linguistics (ACL)

ER -

Gulcehre C, Ahn S, Nallapati R, Zhou B, Bengio Y. Pointing the unknown words. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers. Association for Computational Linguistics (ACL). 2016. p. 140-149. (54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers).