TY - GEN
T1 - Multimodal question answering over structured data with ambiguous entities
AU - Li, Huadong
AU - Wang, Yafang
AU - De Melo, Gerard
AU - Tu, Changhe
AU - Chen, Baoquan
N1 - Funding Information:
We thank all reviewers for their valuable comments. This work is supported by the National Natural Science Foundation of China (Grant No. 61503217); the National 973 Program (No. 2015CB352500); the Joint NSFC-ISF Research Program (No. 61561146397), jointly funded by the National Natural Science Foundation of China and the Israel Science Foundation; the China National Key Research and Development Project (No. 2016YFB1001403); and the Shandong Provincial Science and Technology Development Program (No. 2016GGX106001).
Publisher Copyright:
© 2017 International World Wide Web Conference Committee (IW3C2), published under Creative Commons CC BY 4.0 License.
PY - 2017
Y1 - 2017
N2 - In recent years, we have witnessed profound changes in the way people satisfy their information needs. For instance, with the ubiquitous 24/7 availability of mobile devices, the number of search engine queries on mobile devices has reportedly overtaken that of queries on regular personal computers. In this paper, we consider the task of multimodal question answering over structured data, in which a user supplies not just a natural language query but also an image. Our system addresses this by optimizing a non-convex objective function capturing multimodal constraints. Our experiments show that this enables it to answer even very challenging ambiguous entity queries with high accuracy.
KW - Multimedia knowledge bases
KW - Multimodal
KW - Question answering
UR - http://www.scopus.com/inward/record.url?scp=85027419208&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85027419208&partnerID=8YFLogxK
U2 - 10.1145/3041021.3054135
DO - 10.1145/3041021.3054135
M3 - Conference contribution
AN - SCOPUS:85027419208
T3 - 26th International World Wide Web Conference 2017, WWW 2017 Companion
SP - 79
EP - 88
BT - 26th International World Wide Web Conference 2017, WWW 2017 Companion
PB - International World Wide Web Conferences Steering Committee
T2 - 26th International World Wide Web Conference, WWW 2017 Companion
Y2 - 3 April 2017 through 7 April 2017
ER -