Self-learning deep learning algorithm can find similar cases in large pathology image repositories
Rare diseases are often difficult to diagnose, and predicting the best course of treatment can be challenging for doctors. Researchers at the Mahmood Lab at Brigham and Women's Hospital, a founding member of the Mass General Brigham health system, have developed a deep learning algorithm that teaches itself feature representations, which can then be used to find similar cases in large pathology image repositories. Known as SISH (Self-Supervised Image Search for Histology), the new tool acts like a search engine for pathology images and has many potential applications, including identifying rare diseases and helping doctors determine which patients are likely to respond to similar therapies. A paper introducing the self-learning algorithm was published in Nature Biomedical Engineering.
“We show that our system can help diagnose rare diseases and find cases with similar morphological patterns without the need for manual annotations and large datasets for supervised training. This system has the potential to improve pathology training, disease subtyping, tumor identification and identification of rare morphologies.”
Faisal Mahmood, PhD, senior author, Brigham's Department of Pathology
Modern electronic databases can store an immense number of digital records and reference images, particularly in pathology, where tissue is digitized as whole slide images (WSIs). However, because each WSI is gigapixel-sized and the number of images in large repositories keeps growing, searching and retrieving WSIs can be slow and complicated. Scalability therefore remains a key barrier to efficient use.
To solve this problem, researchers at the Brigham developed SISH, which teaches itself feature representations that can be used to find cases with analogous pathology features at a constant speed, regardless of the size of the database.
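The article does not describe how SISH achieves this. As a generic illustration only, the Python sketch below shows one common way a lookup can stay roughly independent of database size: each feature vector is reduced to a short integer code, and cases are stored in a hash table keyed by that code. The random-projection coding, the `CodeIndex` class and all parameter names are assumptions made for this example and are not taken from the paper.

```python
import random


def make_planes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes used to binarize feature vectors."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def code_of(vec, planes):
    """Map a feature vector to an integer code, one sign bit per hyperplane."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


class CodeIndex:
    """Hypothetical index (not SISH itself): buckets of case IDs keyed by code."""

    def __init__(self, dim, n_bits=16, seed=0):
        self.planes = make_planes(dim, n_bits, seed)
        self.buckets = {}

    def add(self, case_id, vec):
        self.buckets.setdefault(code_of(vec, self.planes), []).append(case_id)

    def query(self, vec):
        # One hash lookup; the cost does not depend on how many cases are stored,
        # as long as individual buckets stay small.
        return self.buckets.get(code_of(vec, self.planes), [])


index = CodeIndex(dim=4)
index.add("case-A", [1.0, 0.5, -0.2, 0.3])
matches = index.query([2.0, 1.0, -0.4, 0.6])  # same direction, so same code
```

In this sketch, query time is dominated by computing the code and a single dictionary lookup, so adding more cases to the index does not slow an individual search on average.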
In their study, researchers tested SISH's speed and ability to retrieve interpretable disease subtype information for common and rare cancers. The algorithm was able to quickly and accurately retrieve images from a database containing tens of thousands of whole slide images from over 22,000 patient cases with over 50 different disease types and over a dozen anatomical locations. The speed of retrieval exceeded other methods in many scenarios, including disease subtype retrieval, particularly as the size of the image database scaled to thousands of images. Even as the repositories grew larger, SISH was able to maintain a consistent search speed.
However, the algorithm has some limitations, including high memory requirements, limited context detection on large tissue sections, and the fact that it is limited to a single imaging modality.
Overall, the algorithm demonstrated the ability to efficiently retrieve images regardless of repository size and in different datasets. It also demonstrated competence in diagnosing rare disease types and the ability to serve as a search engine to detect specific areas of the image that may be relevant to diagnosis. This work can have a major impact on future disease diagnosis, prognosis and analysis.
“As image databases continue to grow, we hope that SISH will help make disease identification easier,” Mahmood said. “We believe an important future direction in this area is multimodal case finding, where pathology, radiology, genomic and electronic medical record data are shared to find similar patient cases.”
Reference:
Chen, C., et al. (2022) Fast and scalable whole slide image search through self-supervised deep learning. Nature Biomedical Engineering. doi.org/10.1038/s41551-022-00929-8.