Expert information for pharma and regulatory authorities

Cancer4D  MULTIMODAL GENAI THAT UNDERSTANDS CANCER OVER TIME

Medical images detect tumors. What if they helped clinicians look ahead too? 

Regulatory innovation for 4D generative AI

Reading time: 4 minutes

Regulatory compliance is an obstacle to the clinical adoption of multimodal AI in oncology. We propose to meet the need for regulatory innovation by addressing safety and efficacy through a concept we call virtual Phase III trials.

Finite samples require methods for minimizing statistical uncertainty
Much clinical research aims to estimate the effect of a treatment on a patient. The randomized clinical trial (RCT) is the gold standard for causal inference because randomization balances, in expectation, the effects of any unobserved confounders. However, clinical research must still contend with the statistical uncertainty inherent to finite samples.

As a result, novel approaches based on artificial intelligence are being designed to minimize the remaining uncertainty about the causal effect, so that conclusions drawn from the data are as reliable and reproducible as possible. The goal is to improve trial efficiency, not to replace a randomized control arm. Furthermore, there is an evidence gap between clinical trials and real-world use, caused by trial inclusion criteria and trial protocols. Here we present a few approaches that use artificial intelligence.

  • Disease trajectories from clinical data – mainly efficacy in Phase II/III
    These types of models take a clinical trial participant’s baseline variables (e.g. lab results, biomarkers, and other clinical features) and forecast how that participant would have progressed on control treatment. The European Medicines Agency has stated a favorable opinion on this type of method. Because these predictions focus on efficacy outcomes rather than safety, they are most relevant for Phase II and Phase III trials, where assessing treatment effect is the main objective. Safety, while important, is typically less predictable from baseline covariates alone, especially for rare or idiosyncratic adverse events. Hence, standard-of-care dosing is assumed and efficacy trajectories are predicted accordingly. Furthermore, predictions from clinical data focus primarily on continuous outcomes, such as lab values or repeated clinical assessments, which produce smooth trajectories. Such continuous outcomes are rare in oncology trials, which use binary endpoints (response yes/no) and time-to-event endpoints (e.g. overall survival, disease-free survival).
  • Prognostic scoring from snapshot imaging – mainly efficacy in Phase II/III
    Artificial intelligence can also be applied to imaging to produce prognostic scores. The models can be applied to the baseline imaging, and also to each image acquired during the clinical trial. Because the predictions are made from snapshots, the focus is yet again on efficacy and not on safety. Standard-of-care dosing is assumed and outcomes, e.g. overall survival, are predicted accordingly. The trend is for prognostic AI models, sometimes also called image-based risk models, to become multimodal, e.g. combining clinical data with baseline pathology imaging, for increased accuracy in prognostic scores and outcome predictions. Again, the focus on efficacy rather than safety results in a focus on Phase II and III.
  • Disease trajectories from multimodal and longitudinal data – combining efficacy and safety in Phase I/II
    With the advancement of generative AI models trained on multimodal and longitudinal data, there is now a unique possibility to include not only efficacy predictions but also safety predictions. The longitudinal data contain information about treatment dose, and dose reduction is the most common consequence of safety concerns. For breast cancer patients, dose reductions are common: one study found that around 40% of patients receive them.

Conclusions
As the discussion above shows, artificial intelligence can support clinical trials with two categories of prediction: efficacy and safety. There are three potential contexts of use for introducing artificial intelligence to benefit patients: Phase I/II, Phase III, and clinical use. In the EIC Advanced Innovation Challenge, the regulatory innovation goal for supporting clinical trials is faster and safer testing of new therapeutic interventions; our proposed approach therefore gives the highest priority to Phase I/II clinical trials.

Written by

Anna

Published on

Validation of oncology virtual Phase III trials

Reading time: 3 minutes

Virtual Phase III trials are conducted before the actual Phase III trial, in order to evaluate the data from the Phase I/II trial for safety and efficacy. Validating such virtual trials requires novel methods, and here we present three of them.

Matching patients using 4D generative AI
For a single-arm trial to be complemented with a control arm, it has to be shown that the complement matches the original arm. When using external control arms, this is done by, for example, propensity score matching on inclusion criteria, clinical data, and biomarkers. A European Medicines Agency reflection paper on evidence generation with external controls is under review. For virtual control arms, the challenge is more complex: the synthetic patients must not only be statistically similar to the real cohort, they must also behave like them across time. 4D generative models enable this by learning disease trajectories. The predictions from a 4D generative AI model can be classified into observable and non-observable treatment scenarios, also called factual and counterfactual treatment scenarios. The ground truth is only available for the observable treatment scenario. Changing any parameter, such as confounding variables, treatment dose, or therapy type, changes the disease trajectory, which then becomes non-observable and cannot be verified against a ground truth. For 4D generative AI, we list here the validation methods that could be applicable (more details found here):

  • Patient-level consistency checks
    We propose that patient-level verification can be done with a novel method we call the blind treatment assignment test. This method generates both observable and non-observable treatment scenarios, and the prediction closest to the ground truth is classified as the predicted treatment assignment. We call the method blind because the 4D generative AI model does not know in advance which therapy has been assigned. A correct assignment indicates that the model’s performance is at least within the risk-benefit difference window between the two therapies.
  • Population-level consistency checks
    We also propose a population-level verification method we call non-observable randomized clinical trial benchmarking. This method replicates historical clinical trials by letting the 4D generative AI model make only non-observable predictions for a new patient population. Agreement with the historical results indicates that the model takes the confounding variables into account, which is needed for reliable performance.
  • Subgroup-level consistency checks
    We also propose patient stratification to evaluate which subgroups are predicted to benefit the most from a new therapy. In this way we consider not only population-level benefits but also how the predictions perform on subgroups, which makes it possible to check for bias in the training data of the AI model.
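The blind treatment assignment test described in the list above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_predict` is a hypothetical stand-in for the 4D generative model, trajectories are reduced to a tumor diameter at three follow-up visits, and mean absolute error is an assumed choice of distance to the ground truth.

```python
from typing import Callable, Dict, List, Sequence


def mean_abs_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average absolute gap between a predicted and an observed trajectory."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def blind_assignment(
    predict: Callable[[float, str], List[float]],
    baseline: float,
    observed: Sequence[float],
    therapies: Sequence[str],
) -> str:
    """Generate one trajectory per candidate therapy (factual and counterfactual)
    and return the therapy whose trajectory is closest to the ground truth.
    The model is never told which therapy the patient actually received."""
    errors = {t: mean_abs_error(predict(baseline, t), observed) for t in therapies}
    return min(errors, key=errors.get)


def assignment_accuracy(
    predict: Callable[[float, str], List[float]],
    patients: Sequence[Dict],
    therapies: Sequence[str],
) -> float:
    """Fraction of patients whose real therapy is recovered by the blind test."""
    hits = sum(
        blind_assignment(predict, p["baseline"], p["observed"], therapies) == p["assigned"]
        for p in patients
    )
    return hits / len(patients)


# Hypothetical toy model (assumption): therapy "A" shrinks the tumor by 10%
# per visit and therapy "B" by 30% per visit.
def toy_predict(baseline: float, therapy: str) -> List[float]:
    rate = {"A": 0.9, "B": 0.7}[therapy]
    return [baseline * rate ** visit for visit in range(1, 4)]


# Made-up illustrative patients: baseline diameter, observed follow-ups, real arm.
patients = [
    {"baseline": 30.0, "observed": [27.0, 24.3, 21.9], "assigned": "A"},
    {"baseline": 40.0, "observed": [28.0, 19.6, 13.7], "assigned": "B"},
    {"baseline": 25.0, "observed": [17.5, 12.3, 8.6], "assigned": "B"},
]

accuracy = assignment_accuracy(toy_predict, patients, ["A", "B"])
print(f"blind assignment accuracy: {accuracy:.2f}")
```

In a real setting the accuracy would be compared with the chance level for the number of candidate therapies; how to set an acceptance threshold is left open here.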

Conclusions
4D generative AI provides a powerful framework for creating virtual control arms and simulating treatment outcomes in clinical trials. By leveraging patient-level, population-level, and subgroup-level consistency checks, it is possible to validate predictions in both observable (factual) and non-observable (counterfactual) scenarios. These validation methods help ensure that the model accurately captures disease trajectories over time, accounts for confounding variables and heterogeneity in patient populations, and minimizes bias across subgroups while reflecting realistic treatment effects.

Written by

Anna

Published on

4D generative AI in regulatory terms for EMA

Reading time: 3 minutes

AI models hold profound promise for accelerating clinical evidence generation, but without a clear regulatory framework, sponsors and innovators risk investing in methods regulators cannot accept.

According to the position paper from the European Medicines Agency, interaction with regulatory authorities is encouraged as early as the planning stage for high-impact uses of AI. Furthermore, the interaction with regulators should cover questions such as intended context of use, generalizability, performance, robustness, transparency, and clinical applicability.

Context of use (COU)
The Virtual Phase III trial acts as a predictive tool to help the trial sponsor and regulators commit to the most efficient and statistically robust confirmatory trial possible. By generating further evidence based on data from Phase I/II, these virtual trials can guide decision-making on sample size, endpoint selection, and trial duration, potentially reducing cost and accelerating timelines. Furthermore, the Virtual Phase III predictions are based on:

  • Biomarker trajectory modeling
    Our model relies on image-to-image medical image analysis to extract and track biomarkers such as the longest diameter (LD), and also other segmentations that increase the fidelity of the generated predictions, such as the Functional Tumor Volume (FTV). Unlike a simple 1D measurement (longest tumor diameter), FTV’s value lies in its 3D measurement that specifically tracks the functional, actively enhancing (and thus viable) portion of the tumor. Cell death and necrosis can cause the functional tissue to shrink significantly before the overall longest diameter changes much.
  • Dose-reduction modeling
    If the investigational drug shows a significantly lower risk of the relative dose intensity (RDI) falling below 85%, this provides compelling evidence of a superior safety/tolerability profile, which directly translates to better adherence and real-world efficacy.
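The RDI criterion used for dose-reduction modeling can be illustrated with a short calculation. The sketch below assumes a commonly used definition of RDI (delivered dose intensity divided by planned dose intensity, where intensity is total dose per unit time); the source does not state which definition it uses, and the dosing records are made-up numbers for illustration only.

```python
from typing import List, Sequence, Tuple


def relative_dose_intensity(
    delivered_dose: float, planned_dose: float, actual_days: float, planned_days: float
) -> float:
    """RDI under the assumed definition: delivered mg/day over planned mg/day.
    Both dose reductions and treatment delays lower the value."""
    delivered_intensity = delivered_dose / actual_days
    planned_intensity = planned_dose / planned_days
    return delivered_intensity / planned_intensity


def low_rdi_risk(rdis: Sequence[float], threshold: float = 0.85) -> float:
    """Share of patients whose RDI falls below the threshold (85% in the text)."""
    return sum(r < threshold for r in rdis) / len(rdis)


# Each record: (delivered dose, planned dose, actual days, planned days).
Record = Tuple[float, float, float, float]

control_records: List[Record] = [
    (600, 600, 84, 84),   # full dose on schedule
    (480, 600, 84, 84),   # dose reduced
    (600, 600, 105, 84),  # full dose but delayed
    (450, 600, 98, 84),   # reduced and delayed
]
investigational_records: List[Record] = [
    (600, 600, 84, 84),
    (600, 600, 84, 84),
    (540, 600, 84, 84),   # mild reduction, RDI stays above 85%
    (480, 600, 84, 84),
]

control_risk = low_rdi_risk([relative_dose_intensity(*r) for r in control_records])
investigational_risk = low_rdi_risk(
    [relative_dose_intensity(*r) for r in investigational_records]
)
risk_ratio = investigational_risk / control_risk

print(f"risk of RDI < 85%: control {control_risk:.2f}, "
      f"investigational {investigational_risk:.2f}, ratio {risk_ratio:.2f}")
```

With only four patients per arm the ratio says nothing about statistical significance; a real comparison would need a proper test or confidence interval on the risk difference or ratio.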

Generalizability
While breast cancer serves as our initial use case, the approach has broad applicability to diseases detectable via medical imaging. More specifically, certain submodules that encode biological knowledge of solid tumors enable narrow generalization to other solid-tumor indications.

Clinical endpoint performance
AI models must demonstrate that they are fit-for-purpose within their defined Context of Use (COU). For Virtual Phase III trials, this requires showing that the predicted effect size, such as pathological complete response (pCR) rate, falls within a pre-specified, clinically acceptable margin.
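As a rough illustration of such a fit-for-purpose check, the sketch below compares a predicted pCR rate with the rate observed in a trial and tests whether the gap stays inside a pre-specified margin. The margin, the counts, and the predicted rate are hypothetical, and the Wilson score interval is added only as one standard way to show the finite-sample uncertainty of the observed rate.

```python
import math
from typing import Tuple


def within_margin(predicted_rate: float, observed_rate: float, margin: float) -> bool:
    """True if the prediction is within the pre-specified absolute margin."""
    return abs(predicted_rate - observed_rate) <= margin


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Hypothetical numbers: 30 of 100 patients reached pCR in the real trial,
# the virtual trial predicted a pCR rate of 0.34, and the margin is 0.05.
observed_pcr, n_patients = 30, 100
predicted_rate = 0.34
margin = 0.05

observed_rate = observed_pcr / n_patients
ok = within_margin(predicted_rate, observed_rate, margin)
low, high = wilson_interval(observed_pcr, n_patients)

print(f"observed pCR rate {observed_rate:.2f} (interval {low:.3f} to {high:.3f})")
print(f"prediction within margin: {ok}")
```

Whether the margin should apply to the point estimate, to the interval, or to a between-arm difference is a design decision that would be agreed with regulators in advance.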

Robustness and Transparency
We need to show why a patient was predicted to have a favorable disease trajectory. This allows the pathologist or oncologist to verify the model’s reasoning before making a critical decision. The model should indicate which regions of the image, at which time points, drive its predictions. For example, in tumor modeling, it could highlight areas where FTV is shrinking or expanding and show that these areas influenced the predicted disease trajectory. For robustness, instead of outputting a single predicted value (e.g., tumor volume or pCR probability), the model outputs a distribution over possible predictions. From this distribution, confidence intervals can be computed at each voxel (3D spatial unit) and each time point (temporal dimension). The result is a 4D uncertainty map showing which regions or time points are predicted with high vs. low confidence.
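One way such a 4D uncertainty map could be computed from sampled predictions is sketched below. The representation is an assumption made for illustration: each sample is a dictionary keyed by `(t, z, y, x)`, and the interval at each voxel and time point is a normal approximation built from the sample mean and standard deviation. The source does not specify how its intervals are derived, and a real pipeline would likely use array libraries and empirical quantiles.

```python
import statistics
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int, int]  # (time point, z, y, x)


def interval(values: List[float], level: float = 0.95) -> Tuple[float, float]:
    """Normal-approximation interval around the mean of the sampled values."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * spread, mean + z * spread


def uncertainty_map(samples: List[Dict[Voxel, float]]) -> Dict[Voxel, float]:
    """Interval width per voxel and time point; wider means less confident."""
    widths = {}
    for key in samples[0]:
        low, high = interval([sample[key] for sample in samples])
        widths[key] = high - low
    return widths


# Four hypothetical samples from the model for a single voxel at two time points.
# At t=0 the samples agree; at t=1 they diverge.
samples = [
    {(0, 0, 0, 0): 1.0, (1, 0, 0, 0): 0.2},
    {(0, 0, 0, 0): 1.0, (1, 0, 0, 0): 0.8},
    {(0, 0, 0, 0): 1.0, (1, 0, 0, 0): 0.5},
    {(0, 0, 0, 0): 1.0, (1, 0, 0, 0): 0.5},
]

widths = uncertainty_map(samples)
for key, width in sorted(widths.items()):
    print(key, round(width, 3))
```

In this toy case the later time point gets the wider interval, which is the pattern one would expect when forecasting further ahead.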

Clinical applicability
Clinical applicability demands that AI outputs meaningfully guide patient management and clinical decision-making. Our approach achieves this by matching therapies to patients through responder stratification informed by individualized benefit-risk assessments, representing a novel paradigm in precision oncology.

Conclusions
AI models have the potential to accelerate clinical evidence generation through more synthetic trials, but realizing this promise requires developing regulatory strategies in close collaboration with authorities from the earliest prototype stages. Early regulatory engagement also enables new paradigms, such as precision oncology, bringing patient-level benefit-risk assessments into clinical practice.

Written by

Anna

Published on