TY - GEN
T1 - Generating factoid questions with recurrent neural networks
T2 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
AU - Serban, Iulian Vlad
AU - García-Durán, Alberto
AU - Gulcehre, Caglar
AU - Ahn, Sungjin
AU - Chandar, Sarath
AU - Courville, Aaron
AU - Bengio, Yoshua
N1 - Publisher Copyright: © 2016 Association for Computational Linguistics.
PY - 2016
Y1 - 2016
N2 - Over the past decade, large-scale supervised learning corpora have enabled machine learning researchers to make substantial advances. However, to this date, there are no large-scale question-answer corpora available. In this paper we present the 30M Factoid Question-Answer Corpus, an enormous question-answer pair corpus produced by applying a novel neural network architecture on the knowledge base Freebase to transduce facts into natural language questions. The produced question-answer pairs are evaluated both by human evaluators and using automatic evaluation metrics, including well-established machine translation and sentence similarity metrics. Across all evaluation criteria the question-generation model outperforms the competing template-based baseline. Furthermore, when presented to human evaluators, the generated questions appear to be comparable in quality to real human-generated questions.
AB - Over the past decade, large-scale supervised learning corpora have enabled machine learning researchers to make substantial advances. However, to this date, there are no large-scale question-answer corpora available. In this paper we present the 30M Factoid Question-Answer Corpus, an enormous question-answer pair corpus produced by applying a novel neural network architecture on the knowledge base Freebase to transduce facts into natural language questions. The produced question-answer pairs are evaluated both by human evaluators and using automatic evaluation metrics, including well-established machine translation and sentence similarity metrics. Across all evaluation criteria the question-generation model outperforms the competing template-based baseline. Furthermore, when presented to human evaluators, the generated questions appear to be comparable in quality to real human-generated questions.
UR - http://www.scopus.com/inward/record.url?scp=85011954479&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85011954479&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85011954479
T3 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
SP - 588
EP - 598
BT - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
PB - Association for Computational Linguistics (ACL)
Y2 - 7 August 2016 through 12 August 2016
ER -