Human-Centered Tools for Coping with Imperfect Algorithms during Medical Decision-Making (1902.02960v1)

Published 8 Feb 2019 in cs.HC and cs.CY

Abstract: Machine learning (ML) is increasingly being used in image retrieval systems for medical decision making. One application of ML is to retrieve visually similar medical images from past patients (e.g. tissue from biopsies) to reference when making a medical decision with a new patient. However, no algorithm can perfectly capture an expert's ideal notion of similarity for every case: an image that is algorithmically determined to be similar may not be medically relevant to a doctor's specific diagnostic needs. In this paper, we identified the needs of pathologists when searching for similar images retrieved using a deep learning algorithm, and developed tools that empower users to cope with the search algorithm on-the-fly, communicating what types of similarity are most important at different moments in time. In two evaluations with pathologists, we found that these refinement tools increased the diagnostic utility of images found and increased user trust in the algorithm. The tools were preferred over a traditional interface, without a loss in diagnostic accuracy. We also observed that users adopted new strategies when using refinement tools, re-purposing them to test and understand the underlying algorithm and to disambiguate ML errors from their own errors. Taken together, these findings inform future human-ML collaborative systems for expert decision-making.

Summary

The paper introduces interactive refinement tools—refine-by-region, refine-by-example, and refine-by-concept—that enable clinicians to tailor ML image searches to clinical criteria.
The paper reports that, in two evaluations with pathologists, these tools increased the diagnostic utility of retrieved images and user trust in the algorithm, without a loss in diagnostic accuracy.
The paper points to future expansion of human-AI collaboration in medical imaging, suggesting broader application of adaptive tools across diverse diagnostic domains.

Overview of "Human-Centered Tools for Coping with Imperfect Algorithms During Medical Decision-Making"

The paper by Carrie J. Cai and colleagues examines the intersection of ML and medical decision-making, focusing on image retrieval systems used by pathologists. The authors identify a specific challenge: ML algorithms, particularly those built on deep neural networks (DNNs), do not always match a pathologist's notion of similarity when retrieving images. As a result, an image that is algorithmically similar to the query can still be clinically irrelevant, which reduces both trust in automated systems and their usefulness.

Human-Centered Design and Evaluation

The researchers developed a set of refinement tools integrated into a system named SMILY (Similar Medical Images Like Yours). These interactive tools empower pathologists to navigate and fine-tune search results, highlighting specific visual features relevant to a given medical diagnosis. The tools aim to address the limitations of ML systems by providing users with three major capabilities:

Refine-by-region: Allows users to isolate specific areas of the query image to emphasize their importance in similarity searches.
Refine-by-example: Enables users to select desired examples from search results, recalibrating future searches based on user-chosen criteria.
Refine-by-concept: Utilizes Concept Activation Vectors (CAVs) to adjust the presence of certain clinical concepts in search results.
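The refinement ideas above can be illustrated with a toy embedding search. The sketch below is not SMILY's implementation: it assumes images are represented as embedding vectors ranked by cosine similarity, that refine-by-concept shifts the query along a concept direction, and that refine-by-example averages the query with embeddings the user picked. The function names, the 2-D vectors, and the image ids are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def rank(query, database):
    """Return image ids sorted from most to least similar to the query."""
    return sorted(database, key=lambda k: cosine(query, database[k]), reverse=True)

def refine_by_concept(query, concept_direction, amount):
    """Assumed behavior: shift the query along a unit concept direction.
    A negative amount asks for less of the concept."""
    d = normalize(concept_direction)
    return [q + amount * x for q, x in zip(query, d)]

def refine_by_example(query, chosen):
    """Assumed behavior: average the query with embeddings the user marked as useful."""
    vectors = [query] + chosen
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# Hypothetical 2-D embeddings: axis 0 ~ general appearance, axis 1 ~ one clinical concept.
database = {
    "img_a": [1.0, 0.0],
    "img_b": [0.8, 0.6],
    "img_c": [0.0, 1.0],
}
query = [1.0, 0.1]
concept = [0.0, 1.0]

baseline = rank(query, database)
more_concept = rank(refine_by_concept(query, concept, 2.0), database)
by_example = rank(refine_by_example(query, [database["img_b"]]), database)

print("baseline:    ", baseline)
print("more concept:", more_concept)
print("by example:  ", by_example)
```

In this toy setup, raising the concept amount moves the concept-heavy image to the top, and selecting an example pulls results toward that example. Refine-by-region is omitted because it would require cropping the query image and re-embedding the crop with the underlying model.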

The tools were assessed in two evaluations with pathologists. They increased the diagnostic utility of the images found and users' trust in the algorithm, and pathologists preferred them over a traditional interface. The tools helped users steer results toward diagnostically relevant image features without a loss in diagnostic accuracy.

Implications and Future Directions

The findings underscore the potential of integrating interactive refinements into ML systems to enhance collaborative human-AI decision-making processes. These tools not only bridge the semantic gap between algorithmic outputs and human expert needs but also enable practitioners to explore and experiment with hypotheses dynamically. This dual role of refinement tools—as instruments for both improving algorithmic alignment and fostering deeper understanding of ML system behavior—indicates their value beyond simple algorithmic feedback.

Future research may extend these human-centered controls to other medical domains where similar image retrieval problems arise. Additionally, extending the concept sliders to cover a wider variety of medically relevant concepts, and exploring dynamic, on-the-fly concept training, could further improve system adaptability and user autonomy.
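As a rough illustration of what on-the-fly concept training might involve, the sketch below derives a concept direction from a handful of user-labeled embeddings. It deliberately swaps in a simpler technique: a Concept Activation Vector is normally obtained as the normal to a linear classifier's decision boundary, whereas this sketch uses the normalized difference of class means as a stand-in. The function names and example vectors are hypothetical and are not taken from the paper.

```python
import math

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def concept_direction(with_concept, without_concept):
    """Stand-in for a CAV: unit vector from the mean of embeddings lacking
    the concept to the mean of embeddings showing it."""
    diff = [p - n for p, n in zip(mean(with_concept), mean(without_concept))]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0:
        raise ValueError("the two example sets have identical means")
    return [x / norm for x in diff]

def concept_score(embedding, direction):
    """Projection of an embedding onto the concept direction."""
    return sum(e * d for e, d in zip(embedding, direction))

# Hypothetical embeddings a user labeled as showing / not showing a concept.
shows_concept = [[1.0, 2.0], [1.0, 4.0]]
lacks_concept = [[1.0, 0.0], [1.0, 2.0]]

direction = concept_direction(shows_concept, lacks_concept)
print(direction)
print(concept_score([5.0, 3.0], direction))
```

A direction learned this way could in principle drive a concept slider, but a production system would need many more examples and a validated classifier before the direction could be trusted.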

The paper shows how interactive tools can mitigate some of the inherent limitations of current ML systems used in medical diagnosis. It offers a practical path toward integrating AI into clinical settings by combining the capabilities of advanced image processing with the nuanced judgment of human experts. As AI spreads into more domains, the balance between automation and human oversight will remain a critical area of study and development.
