Behind the Scenes: Building an AI Algorithm in Digital Pathology
When a pathologist opens a digital slide and clicks to get an accurate and rapid analysis, it seems effortless and magical. Within seconds, AI detects cells, measures biomarkers, or suggests a diagnostic grade. However, behind every prediction lies a long and complex process that relies on collaboration between medical expertise, data science, and technical precision. In this article, we take you behind the scenes to discover how these tools are built and how they are used to advance medicine.
It all starts with a real medical need
With increasing workloads, a shortage of pathologists, and rising demand for accurate and comprehensive diagnoses, pathological diagnosis faces growing challenges. Image complexity, inter-observer variability, continuous advances in medical practice, and regulatory pressure all add to the cognitive burden on specialists.
Here, AI emerges as a powerful tool to improve performance. It can automate certain tasks, provide greater objectivity, and increase efficiency. However, AI is not intended to replace human expertise, but rather to serve as a reliable assistant that enhances diagnostic accuracy and deepens the understanding of disease.
But for AI to be truly useful, it must address a clear medical need. The algorithm design process begins with a strategic question: What clinical problem are we trying to solve? The answer is weighed against the problem's importance, the potential impact of solving it, the availability of existing solutions, and technical feasibility.
The context of actual use matters as well: Will the tool save time? Will it improve accuracy? Will it provide new information? It must also integrate seamlessly into the daily workflow of physicians, within their technical constraints and economic costs.
Building a High-Quality Database
After identifying the medical need, the first practical step begins: building a robust database. This requires collecting a large number of diagnostic images that represent a wide variety of disease conditions and reflect differences in preparation and scanning methods.
The more complex the algorithm, the greater the need for data – sometimes reaching thousands of images. These images must be clear, free of artifacts, and scanned at an appropriate resolution. They must also conform to routine clinical standards to ensure the algorithm’s effectiveness in real medical environments.
Privacy and ethics are also essential elements: patient consent must be obtained, identities anonymized, and issues of bias and fairness addressed.
If a “supervised learning” approach is used, images must be accompanied by precise manual annotations that identify important elements such as positive cells, negative cells, mitoses, or tumor areas. This process requires a high level of expertise. The reference annotation is often established through consensus among several specialists, or confirmed with special stains or diagnostic tests. In some cases, patient clinical follow-up is used to confirm the diagnosis.
This step is lengthy and meticulous, but essential, as the algorithm learns from what it is presented with. Any error in the annotations can lead to bias or poor performance.
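To illustrate the consensus idea, here is a minimal sketch of a majority vote across annotators. It is only one possible way to combine annotations, not a description of any specific tool. The cell identifiers, the labels, and the function name consensus_label are hypothetical, and the rule "more than half of the annotators must agree" is an assumption chosen for the example.

```python
from collections import Counter


def consensus_label(votes):
    """Return the label chosen by a strict majority of annotators.

    Returns None when no label is chosen by more than half of them,
    so the item can be sent back to the specialists for review.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count * 2 > len(votes) else None


# Hypothetical annotations: cell id -> labels from three pathologists.
annotations = {
    "cell_001": ["positive", "positive", "negative"],
    "cell_002": ["negative", "negative", "negative"],
    "cell_003": ["positive", "negative", "mitosis"],
}

consensus = {cell: consensus_label(votes) for cell, votes in annotations.items()}

# Items without a strict majority are flagged instead of being guessed.
needs_review = [cell for cell, label in consensus.items() if label is None]
```

Flagging disagreements rather than forcing a label keeps ambiguous cases out of the training set until an expert has resolved them.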
Model Selection and Algorithm Training
After collecting images and their annotations, the next step is to select the appropriate type of AI model. In digital pathology, there are several options:
Classification: Distinguishing between different tissue types (e.g., malignant and benign).
Detection: Identifying specific regions in an image, such as suspicious foci.
Segmentation: Isolating fine structures such as cells or tissues.
Image Generation: Creating synthetic images to train models or simulate rare cases.
Then, the training method is chosen:
Fully Supervised: Requires precise annotations, is more transparent, but demands significant time and effort.
Weakly Supervised: Relies on coarse labels, such as a single diagnosis for a whole slide; saves annotation time but is generally less precise.
Unsupervised: Does not require annotations, but is less reliable and currently unsuitable for clinical use.
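As an illustration of the weakly supervised option, the sketch below shows one common formulation, multiple-instance learning with max pooling: only the slide carries a label, the slide is cut into tiles, and the slide-level score is taken from its most suspicious tile. This is an assumed example, not the method of any particular product. The tile scores stand in for the outputs of a hypothetical tile-level model, and the 0.5 threshold is arbitrary.

```python
def slide_score(tile_scores):
    """Max pooling: a slide is as suspicious as its most suspicious tile."""
    if not tile_scores:
        raise ValueError("a slide needs at least one tile")
    return max(tile_scores)


def predict_slide(tile_scores, threshold=0.5):
    """Turn tile-level scores into a single slide-level prediction."""
    return "malignant" if slide_score(tile_scores) >= threshold else "benign"


# Hypothetical tile scores (probability of malignancy for each tile).
slides = {
    "slide_A": [0.02, 0.10, 0.91, 0.05],  # one highly suspicious tile
    "slide_B": [0.03, 0.08, 0.12],        # no suspicious tile
}

predictions = {name: predict_slide(scores) for name, scores in slides.items()}
```

During training, only the slide-level prediction is compared with the slide-level label, which is why no tile needs to be annotated by hand.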
From Research to Product: The Scaling and Expansion Phase
Developing a robust algorithm in the lab is just the beginning. For clinical use, rigorous validation stages must be passed:
1. Analytical Validation (Internal): Testing the technical performance of the model on internal data held out from training.
2. Clinical Validation (External): Testing it on data from different hospitals to measure its generalizability.
3. Demonstrating Clinical Utility: Through clinical trials that show the tool genuinely improves patient care.
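The difference between internal and external validation can be made concrete by computing the same metrics on both cohorts. The sketch below compares sensitivity and specificity on an internal hold-out set and on an external set. All labels and predictions are invented for illustration (1 means disease present, 0 means absent), and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity and specificity for binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented results on an internal hold-out set (same hospital as training).
internal_true = [1, 1, 1, 1, 0, 0, 0, 0]
internal_pred = [1, 1, 1, 1, 0, 0, 0, 1]

# Invented results on an external set (a different hospital and scanner).
external_true = [1, 1, 1, 1, 0, 0, 0, 0]
external_pred = [1, 1, 0, 1, 0, 1, 0, 1]

internal_metrics = sensitivity_specificity(internal_true, internal_pred)
external_metrics = sensitivity_specificity(external_true, external_pred)

print("internal (sensitivity, specificity):", internal_metrics)
print("external (sensitivity, specificity):", external_metrics)
```

A drop in performance between the two cohorts, as in this invented example, is the kind of generalization gap that external validation is designed to reveal.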
Afterward, the process of obtaining regulatory approval can begin, such as the CE mark in Europe or FDA approval in the United States. Developers can also apply for reimbursement from health systems, although this is still rare in the field of pathology.
Conclusion
From algorithm design to clinical validation, every step is essential to ensure the reliability and utility of AI tools in digital pathology. Ultimately, AI becomes a true ally for both clinicians and patients.
