Study shows that saliency heatmaps may not be ready for prime time yet

Artificial intelligence models that interpret medical images promise to improve clinicians' ability to make accurate and timely diagnoses while reducing workload by allowing busy doctors to focus on critical cases and delegate routine tasks to AI.

But AI models that lack transparency into how and why a diagnosis is made can be problematic. This opaque reasoning, also known as “black box” AI, can undermine clinicians' confidence in the reliability of the AI tool and thus discourage its use. The lack of transparency could also lead clinicians to over-trust the tool's interpretation.

In the field of medical imaging, saliency assessments have been one way to make AI models more understandable and to demystify AI decision making: an approach that uses heatmaps to show whether a tool is correctly focusing only on the relevant parts of a given image or homing in on irrelevant parts of it.

Heatmaps work by highlighting the areas of an image that influenced the AI model's interpretation. This could help human doctors see whether the AI model is focusing on the same areas they are or is incorrectly focusing on irrelevant places in the image.
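For illustration, the sketch below shows how one widely used saliency technique, Grad-CAM, can produce such a heatmap. This is a minimal, assumption-laden example, not the study's pipeline: the DenseNet classifier, hooked layer, input size, and normalization are placeholders, and a real workflow would use a model fine-tuned on chest X-rays.

```python
import torch
import torch.nn.functional as F
from torchvision.models import densenet121

# Placeholder classifier; a real workflow would load a model fine-tuned on chest X-rays.
model = densenet121(weights=None).eval()

activations, gradients = {}, {}

def save_activation(module, inputs, output):
    activations["feat"] = output.detach()

def save_gradient(module, grad_input, grad_output):
    gradients["feat"] = grad_output[0].detach()

# Hook the final convolutional feature block (layer choice is an assumption).
model.features.register_forward_hook(save_activation)
model.features.register_full_backward_hook(save_gradient)

def grad_cam(image: torch.Tensor, class_idx: int) -> torch.Tensor:
    """Return a heatmap in [0, 1] with the same spatial size as `image`."""
    logits = model(image)              # image: (1, 3, H, W), already normalized
    model.zero_grad()
    logits[0, class_idx].backward()    # gradient of the target class score

    acts = activations["feat"]         # (1, C, h, w) feature maps
    grads = gradients["feat"]          # (1, C, h, w) gradients w.r.t. those maps
    weights = grads.mean(dim=(2, 3), keepdim=True)            # per-channel importance
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))   # (1, 1, h, w)
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False).squeeze()
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

# Example: heatmap for class index 0 on a random stand-in image.
heatmap = grad_cam(torch.randn(1, 3, 224, 224), class_idx=0)
```

The normalized heatmap can then be overlaid on the original radiograph so a clinician can see which regions drove the model's prediction.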

But a new study published Oct. 10 in Nature Machine Intelligence shows that, for all their promise, saliency heatmaps aren't ready for prime time yet.

The analysis, led by Harvard Medical School investigator Pranav Rajpurkar, Stanford's Matthew Lungren, and New York University's Adriel Saporta, quantified the validity of seven widely used saliency methods, determining how reliably and accurately each can localize pathologies associated with 10 conditions commonly diagnosed on chest X-rays, such as lung lesions, pleural effusions, edema, or enlarged cardiac structures. To gauge performance, the researchers compared the tools' output with human expert judgment.
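The exact metrics and evaluation pipeline are in the authors' released codebase. Purely as a rough illustration of the underlying idea, and not the authors' implementation, comparing a thresholded heatmap against an expert-drawn pathology mask could use an overlap score such as intersection-over-union; the function name and threshold below are hypothetical.

```python
import numpy as np

def heatmap_iou(heatmap: np.ndarray, expert_mask: np.ndarray,
                threshold: float = 0.5) -> float:
    """Binarize a [0, 1] heatmap and compute IoU against an expert segmentation."""
    pred = heatmap >= threshold              # region the saliency map deems important
    truth = expert_mask.astype(bool)         # radiologist-annotated pathology region
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0                           # both empty: treat as perfect agreement
    return float(np.logical_and(pred, truth).sum()) / union
```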

Ultimately, the saliency-based heatmaps consistently underperformed human radiologists in image assessment and in their ability to localize pathologic lesions.

The work represents the first comparative analysis between saliency maps and human expert performance in assessing multiple radiographic pathologies. The study also provides a detailed understanding of whether and how certain pathological features in an image can impact the performance of AI tools.

The saliency map feature is already used as a quality assurance tool by clinical practices that rely on AI in computer-aided detection workflows, such as reading chest X-rays. But in light of the new findings, the researchers say, this feature should be applied with caution and a healthy dose of skepticism.

“Our analysis shows that saliency maps are not yet reliable enough to validate individual clinical decisions made by an AI model. We have identified important limitations that raise serious safety concerns for use in current practice.”

Pranav Rajpurkar, Assistant Professor of Biomedical Informatics, HMS

The researchers caution that, because of the important limitations identified in the study, saliency-based heatmaps should be further refined before being widely adopted in clinical AI models.

The team's full codebase, data, and analysis are open and available to anyone interested in exploring this important aspect of clinical machine learning in medical imaging applications.

Source:

Harvard Medical School

Reference:

Saporta, A., et al. (2022) Benchmarking saliency methods for chest radiograph interpretation. Nature Machine Intelligence. doi.org/10.1038/s42256-022-00536-x.
