Address inaccurate race and ethnicity data in medical AI
Inaccurate race and ethnicity data in electronic health records (EHRs) may harm patient care as artificial intelligence (AI) is increasingly integrated into healthcare. Because hospitals and providers collect such data inconsistently and struggle to classify individual patients accurately, AI systems trained on these data sets may inherit and perpetuate racial biases.
In a new publication in PLOS Digital Health, bioethics and legal experts call for immediate standardization of methods for collecting race and ethnicity data and for developers to guarantee the quality of race and ethnicity data in medical AI systems. The research synthesizes concerns about why patient race data in EHRs may not be accurate, identifies best practices for health systems and medical AI researchers to improve data accuracy, and provides a new template for medical AI developers to transparently justify the quality of their race and ethnicity data.
Lead author Alexandra Tsalidis, MBE, notes that "AI developers heeding our recommendation on how their racial and ethnic data was collected will not only advance transparency in medical AI, but also help patients and regulators assess the safety of the tools."
“Racial bias in AI models is a major problem as the technology becomes increasingly integrated into healthcare. This article provides a concrete method that can be implemented to address these concerns.”
Francis Shen, JD, PhD, senior author
While more work needs to be done, the article offers a starting point, according to co-author Lakshmi Bharadwaj, MBE. “An open dialogue about best practices is an essential step, and the approaches we propose could achieve significant improvements.”
The research was supported by the NIH Bridge to Artificial Intelligence (Bridge2AI) program and by an NIH BRAIN Neuroethics grant (R01MH134144).
Sources:
Tsalidis, A., Bharadwaj, L., & Shen, F. X. (2025). Standardization and accuracy of race and ethnicity data: Equity implications for medical AI. PLOS Digital Health. doi.org/10.1371/journal.pdig.0000807.