AI models hold profound promise for accelerating clinical evidence generation, but without a clear regulatory framework, sponsors and innovators risk investing in methods regulators cannot accept.
According to the position paper from the European Medicines Agency (EMA), sponsors planning high-impact uses of AI are encouraged to engage with regulatory authorities as early as the planning stage. That interaction should cover the intended context of use, generalizability, performance, robustness, transparency, and clinical applicability.
Context of use (COU)
The Virtual Phase III acts as a predictive tool that helps the trial sponsor and regulators commit to the most efficient and statistically robust confirmatory trial possible. By generating further evidence from Phase I/II data, these virtual trials can guide decisions on sample size, endpoint selection, and trial duration, potentially reducing cost and accelerating timelines. The Virtual Phase III predictions are based on two components:
- Biomarker trajectory modeling: Our model uses image-to-image medical image analysis to extract and track biomarkers such as the longest diameter (LD), together with additional segmentation-derived measures, such as the functional tumor volume (FTV), that increase the fidelity of the generated predictions. Unlike the one-dimensional LD, FTV is a three-dimensional measurement that tracks the functional, actively enhancing (and thus viable) portion of the tumor. Because cell death and necrosis can shrink the functional tissue well before the longest diameter changes appreciably, FTV can register a response earlier.
- Dose-reduction modeling: If the investigational drug shows a significantly lower risk of relative dose intensity (RDI) falling below 85%, this provides compelling evidence of a superior safety and tolerability profile, which translates directly into better adherence and real-world efficacy.
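The two quantities in the list above can be illustrated with a minimal sketch. It is not the model described here: the function names, the enhancement threshold of 0.7, the voxel size, and all numbers are hypothetical. FTV is assumed to be the volume of voxels inside a tumor mask whose contrast enhancement meets a threshold, and RDI is assumed to follow the common definition of delivered dose intensity divided by planned dose intensity.

```python
import numpy as np


def functional_tumor_volume(enhancement, tumor_mask, voxel_volume_mm3, threshold):
    """Volume (mm^3) of voxels inside the tumor mask that enhance at or above `threshold`."""
    functional = tumor_mask & (enhancement >= threshold)
    return float(functional.sum()) * voxel_volume_mm3


def relative_dose_intensity(delivered_mg, planned_mg, actual_days, planned_days):
    """Delivered dose intensity (mg/day) divided by planned dose intensity (mg/day)."""
    return (delivered_mg / actual_days) / (planned_mg / planned_days)


def risk_ratio_low_rdi(rdi_investigational, rdi_control, cutoff=0.85):
    """Ratio of the proportions of patients with RDI below `cutoff` in each arm.

    Assumes at least one control patient falls below the cutoff.
    """
    risk_inv = sum(r < cutoff for r in rdi_investigational) / len(rdi_investigational)
    risk_ctl = sum(r < cutoff for r in rdi_control) / len(rdi_control)
    return risk_inv / risk_ctl


# Toy tumor: a 4x4x4 enhancing cube inside a 6x6x6 volume (1 mm^3 voxels, assumed).
mask = np.zeros((6, 6, 6), dtype=bool)
mask[1:5, 1:5, 1:5] = True
enhancement = np.where(mask, 1.0, 0.0)
ftv_baseline = functional_tumor_volume(enhancement, mask, 1.0, threshold=0.7)

# A necrotic 2x2x2 core stops enhancing; the outer extent (and so the LD) is unchanged.
enhancement[2:4, 2:4, 2:4] = 0.1
ftv_followup = functional_tumor_volume(enhancement, mask, 1.0, threshold=0.7)

# One patient received 900 mg over 24 days against a plan of 1000 mg over 21 days.
rdi = relative_dose_intensity(delivered_mg=900, planned_mg=1000, actual_days=24, planned_days=21)

# Hypothetical per-patient RDI values in each arm.
ratio = risk_ratio_low_rdi([0.95, 0.90, 0.80, 0.88], [0.80, 0.70, 0.90, 0.82])
```

In this toy case the FTV drops from 64 to 56 mm^3 while the tumor mask, and therefore any diameter measured on it, stays the same. The example patient's RDI is below the 85% cutoff, and a risk ratio below 1 would favor the investigational arm.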
Generalizability
While breast cancer serves as our initial use case, the approach has broad applicability to diseases detectable via medical imaging. More specifically, certain submodules that encode biological knowledge of solid tumors enable narrow generalization to other solid-tumor indications.
Clinical endpoint performance
AI models must demonstrate that they are fit for purpose within their defined Context of Use (COU). For Virtual Phase III trials, this requires showing that the predicted effect size, such as the pathological complete response (pCR) rate, falls within a pre-specified, clinically acceptable margin of the observed value.
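One way such a margin check could be operationalized is sketched below. It rests on assumptions not stated in the text: the predicted pCR rate is compared against the rate observed in a reference cohort, the observed rate gets a Wald (normal-approximation) confidence interval, and the check passes only if the whole interval for the difference lies inside the margin. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pcr_within_margin(predicted_rate, observed_responders, observed_n, margin, alpha=0.05):
    """Return the CI for (predicted - observed) pCR rate and whether it lies within +/- margin."""
    observed_rate = observed_responders / observed_n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Wald standard error of the observed proportion (assumed adequate for this sample size).
    se = sqrt(observed_rate * (1 - observed_rate) / observed_n)
    low = predicted_rate - (observed_rate + z * se)
    high = predicted_rate - (observed_rate - z * se)
    return (low, high), (-margin <= low and high <= margin)


# Predicted pCR rate of 35% against 70/200 observed responders, margin of 10 percentage points.
interval_ok, passes_ok = pcr_within_margin(0.35, 70, 200, margin=0.10)

# Same prediction against 40/200 observed responders: the difference is too large.
interval_bad, passes_bad = pcr_within_margin(0.35, 40, 200, margin=0.10)
```

Requiring the full interval, rather than only the point estimate, to sit inside the margin is the stricter, equivalence-style reading. The margin itself would have to be pre-specified and clinically justified.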
Robustness and transparency
We need to show why a patient was predicted to have a favorable disease trajectory, so that the pathologist or oncologist can verify the model’s reasoning before making a critical decision. The model should therefore indicate which image regions, at which time points, drive its predictions. In tumor modeling, for example, it could highlight areas where FTV is shrinking or expanding and show that these areas influenced the predicted disease trajectory.
For robustness, the model outputs a distribution over possible predictions rather than a single predicted value (e.g., tumor volume or pCR probability). From this distribution, confidence intervals can be computed at each voxel (3D spatial unit) and each time point (temporal dimension). The result is a 4D uncertainty map showing which regions and time points are predicted with high versus low confidence.
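The uncertainty map described above can be sketched as follows, assuming the predictive distribution is available as a set of sampled predictions (for example from an ensemble or repeated stochastic forward passes; the text does not say how samples are obtained). The array layout, function name, and synthetic data are hypothetical. Interval bounds are taken as empirical percentiles across samples.

```python
import numpy as np


def uncertainty_map(samples, level=0.95):
    """Per-voxel, per-time-point interval from sampled predictions.

    samples: array of shape (n_samples, T, X, Y, Z).
    Returns lower, upper, and width arrays, each of shape (T, X, Y, Z).
    """
    tail = (1.0 - level) / 2.0 * 100.0
    lower, upper = np.percentile(samples, [tail, 100.0 - tail], axis=0)
    return lower, upper, upper - lower


# Synthetic samples: 200 draws, 3 time points, 4x4x4 voxels.
# The noise scale grows with time to mimic less certain long-range predictions.
rng = np.random.default_rng(0)
scales = np.array([0.01, 0.05, 0.2]).reshape(1, 3, 1, 1, 1)
samples = 0.5 + rng.normal(size=(200, 3, 4, 4, 4)) * scales

lower, upper, width = uncertainty_map(samples)

# Average interval width per time point: larger values mean lower confidence.
mean_width_per_time = width.mean(axis=(1, 2, 3))
```

The `width` array is the 4D map: thresholding it, or overlaying it on the image series, would flag the regions and time points where predictions deserve less trust.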
Clinical applicability
Clinical applicability demands that AI outputs meaningfully guide patient management and clinical decision-making. Our approach achieves this by matching therapies to patients through responder stratification informed by individualized benefit-risk assessments, representing a novel paradigm in precision oncology.
Conclusions
AI models have the potential to accelerate clinical evidence generation through greater use of synthetic trials, but realizing this promise requires developing regulatory strategies in close collaboration with authorities from the earliest prototype stages. Early regulatory engagement also enables new paradigms, such as precision oncology, that bring patient-level benefit-risk assessments into clinical practice.