TY - GEN
T1 - Write It Like You See It: Detectable Differences in Clinical Notes by Race Lead to Differential Model Recommendations
T2 - 5th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, AIES 2022
AU - Adam, Hammaad
AU - Yang, Ming Ying
AU - Cato, Kenrick
AU - Baldini, Ioana
AU - Senteio, Charles
AU - Celi, Leo Anthony
AU - Zeng, Jiaming
AU - Singh, Moninder
AU - Ghassemi, Marzyeh
N1 - Funding Information:
This work is supported by the MIT-IBM Watson AI Lab. HA is funded by the MIT Jameel Clinic. KC is funded by the National Institute of Nursing Research through grant R01NR016941-01, Communicating Narrative Concerns Entered by RNs (CONCERN). LAC is funded by the National Institutes of Health through the NIBIB R01 grant EB017205. MG is funded by the CIFAR Azrieli Global Scholar award and the Helmholtz Professorship.
Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/7/26
Y1 - 2022/7/26
N2 - Clinical notes are becoming an increasingly important data source for machine learning (ML) applications in healthcare. Prior research has shown that deploying ML models can perpetuate existing biases against racial minorities, as bias can be implicitly embedded in data. In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes. Our work makes three key contributions. First, we find that models can identify patient self-reported race from clinical notes even when the notes are stripped of explicit indicators of race. Second, we determine that human experts are not able to accurately predict patient race from the same redacted clinical notes. Finally, we demonstrate the potential harm of this implicit information in a simulation study, and show that models trained on these race-redacted clinical notes can still perpetuate existing biases in clinical treatment decisions.
AB - Clinical notes are becoming an increasingly important data source for machine learning (ML) applications in healthcare. Prior research has shown that deploying ML models can perpetuate existing biases against racial minorities, as bias can be implicitly embedded in data. In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes. Our work makes three key contributions. First, we find that models can identify patient self-reported race from clinical notes even when the notes are stripped of explicit indicators of race. Second, we determine that human experts are not able to accurately predict patient race from the same redacted clinical notes. Finally, we demonstrate the potential harm of this implicit information in a simulation study, and show that models trained on these race-redacted clinical notes can still perpetuate existing biases in clinical treatment decisions.
KW - clinical notes
KW - health equity
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85137164516&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137164516&partnerID=8YFLogxK
U2 - 10.1145/3514094.3534203
DO - 10.1145/3514094.3534203
M3 - Conference contribution
AN - SCOPUS:85137164516
T3 - AIES 2022 - Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society
SP - 7
EP - 21
BT - AIES 2022 - Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society
PB - Association for Computing Machinery, Inc
Y2 - 1 August 2022 through 3 August 2022
ER -