TY - GEN
T1 - Detecting low self-esteem in youths from web search data
AU - Zaman, Anis
AU - Kautz, Henry
AU - Acharyya, Rupam
AU - Silenzio, Vincent
N1 - Funding Information:
We thank Adam Sadilek and other collaborators at Google and Verily for discussions, advice, and the use of anonymization software. This work was supported by the Goergen Institute for Data Science and Google.
Publisher Copyright:
© 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
PY - 2019/5/13
Y1 - 2019/5/13
N2 - Online behavior leaves a digital footprint that can be analyzed to reveal our cognitive and psychological state through time. Recognizing these subtle cues can help identify different aspects of mental health, such as low self-esteem, depression, and anxiety. Google's web search engine, used daily by millions of people, logs every search query made by a user, and these logs are accessible through a platform called Google Takeout. Previous researchers have made efforts to detect and predict behaviors associated with depression and anxiety from web data, but only at a population level. This paper fills that gap by looking for signs of low self-esteem, a condition that works in a vicious cycle with depression and anxiety, at an individual level using Google search history data. We target college students, a population prone to depression, anxiety, and low self-esteem, and ask them to take a mental health assessment survey and provide their individual search history. Textual analysis shows that search logs contain strong signals that can identify individuals with current low self-esteem. For example, participants with low self-esteem have fewer searches pertaining to family, friend, and money attributes; we also observed differences in the search category distribution over time when compared with individuals with moderate to high self-esteem. Using these markers, we were able to build a probabilistic classifier that can identify low self-esteem conditions, based on search history, with an average F1 score of 0.86.
AB - Online behavior leaves a digital footprint that can be analyzed to reveal our cognitive and psychological state through time. Recognizing these subtle cues can help identify different aspects of mental health, such as low self-esteem, depression, and anxiety. Google's web search engine, used daily by millions of people, logs every search query made by a user, and these logs are accessible through a platform called Google Takeout. Previous researchers have made efforts to detect and predict behaviors associated with depression and anxiety from web data, but only at a population level. This paper fills that gap by looking for signs of low self-esteem, a condition that works in a vicious cycle with depression and anxiety, at an individual level using Google search history data. We target college students, a population prone to depression, anxiety, and low self-esteem, and ask them to take a mental health assessment survey and provide their individual search history. Textual analysis shows that search logs contain strong signals that can identify individuals with current low self-esteem. For example, participants with low self-esteem have fewer searches pertaining to family, friend, and money attributes; we also observed differences in the search category distribution over time when compared with individuals with moderate to high self-esteem. Using these markers, we were able to build a probabilistic classifier that can identify low self-esteem conditions, based on search history, with an average F1 score of 0.86.
KW - College students
KW - Individual search logs
KW - Low self-esteem
KW - Mental health
KW - Youths
UR - http://www.scopus.com/inward/record.url?scp=85066914014&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066914014&partnerID=8YFLogxK
U2 - 10.1145/3308558.3313557
DO - 10.1145/3308558.3313557
M3 - Conference contribution
AN - SCOPUS:85066914014
T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
SP - 2270
EP - 2280
BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
PB - Association for Computing Machinery, Inc
T2 - 2019 World Wide Web Conference, WWW 2019
Y2 - 13 May 2019 through 17 May 2019
ER -