TY - GEN
T1 - Bridging gaps
T2 - 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019
AU - Mitsui, Matthew
AU - Shah, Chirag
PY - 2019/7/18
Y1 - 2019/7/18
N2 - Interactive information retrieval (IIR) researchers often conduct laboratory studies to understand the relationship between people seeking information and information retrieval systems. They develop extensive data collection methods and tools to create new understanding about the relationship between observable behaviors, searcher context, and underlying cognition, to better support people's information seeking. Yet aside from the problems of data size, realism, and demographics, laboratory studies are limited in the number and nature of phenomena they can study. Hence, data collected in laboratories contains different searcher populations and collects non-overlapping user and task characteristics. While research analyses and collection methods are isolated, how can we further IIR's mission of broad understanding? We approach this as a structure learning problem on incomplete data, determining the extent to which incomplete data can be used to predict user and task characteristics from interactions. In particular, we examine whether combining heterogeneous data sets is more effective than using a single data set alone in prediction. Our results indicate that adding external data significantly improves predictions of searcher characteristics, task characteristics, and behaviors, even when the data does not contain identical information about searchers.
AB - Interactive information retrieval (IIR) researchers often conduct laboratory studies to understand the relationship between people seeking information and information retrieval systems. They develop extensive data collection methods and tools to create new understanding about the relationship between observable behaviors, searcher context, and underlying cognition, to better support people's information seeking. Yet aside from the problems of data size, realism, and demographics, laboratory studies are limited in the number and nature of phenomena they can study. Hence, data collected in laboratories contains different searcher populations and collects non-overlapping user and task characteristics. While research analyses and collection methods are isolated, how can we further IIR's mission of broad understanding? We approach this as a structure learning problem on incomplete data, determining the extent to which incomplete data can be used to predict user and task characteristics from interactions. In particular, we examine whether combining heterogeneous data sets is more effective than using a single data set alone in prediction. Our results indicate that adding external data significantly improves predictions of searcher characteristics, task characteristics, and behaviors, even when the data does not contain identical information about searchers.
KW - Bayesian networks
KW - Interactive information retrieval
KW - Searcher behavior
KW - Structure learning
KW - Task classification
KW - Task prediction
KW - Task type
UR - http://www.scopus.com/inward/record.url?scp=85074216982&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074216982&partnerID=8YFLogxK
U2 - 10.1145/3331184.3331221
DO - 10.1145/3331184.3331221
M3 - Conference contribution
T3 - SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 415
EP - 424
BT - SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery, Inc
Y2 - 21 July 2019 through 25 July 2019
ER -