TY - GEN
T1 - Modeling spread of disease from social interactions
AU - Sadilek, Adam
AU - Kautz, Henry
AU - Silenzio, Vincent
PY - 2012
Y1 - 2012
N2 - Research in computational epidemiology to date has concentrated on coarse-grained statistical analysis of populations, often synthetic ones. By contrast, this paper focuses on fine-grained modeling of the spread of infectious diseases throughout a large real-world social network. Specifically, we study the roles that social ties and interactions between specific individuals play in the progress of a contagion. We focus on public Twitter data, where we find that for every health-related message there are more than 1,000 unrelated ones. This class imbalance makes classification particularly challenging. Nonetheless, we present a framework that accurately identifies sick individuals from the content of online communication. Evaluation on a sample of 2.5 million geo-tagged Twitter messages shows that social ties to infected, symptomatic people, as well as the intensity of recent co-location, sharply increase one's likelihood of contracting the illness in the near future. To our knowledge, this work is the first to model the interplay of social activity, human mobility, and the spread of infectious disease in a large real-world population. Furthermore, we provide the first quantifiable estimates of the characteristics of disease transmission on a large scale without active user participation, a step towards our ability to model and predict the emergence of global epidemics from day-to-day interpersonal interactions.
UR - http://www.scopus.com/inward/record.url?scp=84871981836&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84871981836&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84871981836
SN - 9781577355564
T3 - ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media
SP - 322
EP - 329
BT - ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media
T2 - 6th International AAAI Conference on Weblogs and Social Media, ICWSM 2012
Y2 - 4 June 2012 through 7 June 2012
ER -