Skill Guide

Demand forecasting for investigational products using time-series and probabilistic models

The application of statistical models (primarily ARIMA, Prophet, and Bayesian probabilistic methods) to forecast the supply and demand dynamics of investigational medicinal products (IMPs) across clinical trial phases, accounting for protocol amendments, enrollment volatility, and regulatory holds.

This skill directly mitigates the $2M-$50M cost of IMP overproduction or shortage per trial, protecting patient safety and trial timelines. It transforms clinical supply chain management from a reactive, guesswork function into a data-driven, risk-quantified strategic capability.

1 Careers

1 Categories

8.9 Avg Demand

18% Avg AI Risk

How to Learn Demand forecasting for investigational products using time-series and probabilistic models

1. Master the clinical trial supply chain lifecycle (Phase I-IV) and the unique demand drivers: randomization ratios, titration schedules, overage factors, and expiry windows.
2. Learn core time-series decomposition (trend, seasonality, residual) using trial enrollment data as your first dataset.
3. Understand basic probabilistic concepts: mean, standard deviation, confidence intervals, and the purpose of Monte Carlo simulation.
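The decomposition in step 2 can be tried by hand before reaching for a library. The sketch below is a classical additive decomposition written in plain Python; the function names and the 36-month series are made up for illustration, and the series is built from a known trend and yearly pattern so the result can be checked by eye.

```python
def centered_moving_average(values, period=12):
    """Trend estimate via a centered 2x12 moving average (None where undefined)."""
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        # Half-weight the two end points so the even-length window stays centered.
        trend[i] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def decompose(values, period=12):
    """Classical additive decomposition: value = trend + seasonal + residual."""
    trend = centered_moving_average(values, period)
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    seasonal_means = []
    for s in range(period):
        obs = [d for d in detrended[s::period] if d is not None]
        seasonal_means.append(sum(obs) / len(obs))
    offset = sum(seasonal_means) / period  # center the seasonal component on zero
    seasonal = [seasonal_means[i % period] - offset for i in range(len(values))]
    residual = [
        v - t - s if t is not None else None
        for v, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


# Synthetic 36-month enrollment series: linear trend plus a repeating yearly pattern.
pattern = [2, 1, 0, 0, 1, 0, -2, -3, 0, 1, 2, -2]  # made-up values, sums to zero
enrollment = [10 + 0.5 * i + pattern[i % 12] for i in range(36)]

trend, seasonal, residual = decompose(enrollment)
print("seasonal:", [round(s, 2) for s in seasonal[:12]])
print("trend at month index 10:", round(trend[10], 2))
```

The trend is undefined for the first and last six months, which is one reason short enrollment histories are hard to decompose reliably.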

1. Move from enrollment forecasting to integrated IMP demand forecasting, incorporating visit schedules and IP expiry dates.
2. Apply Prophet or SARIMA models to historical trial data, learning to tune hyperparameters like changepoint_prior_scale for protocol amendments.
3. Key mistake to avoid: ignoring site activation curves. Model site-level enrollment, not just global targets.
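To make the point about site activation curves concrete, here is a minimal sketch in which expected monthly enrollment is the number of active sites (a logistic S-curve) multiplied by a per-site rate. The site count, midpoint, steepness, and per-site rate are hypothetical values chosen only for illustration.

```python
import math


def active_sites(month, total_sites=40, midpoint=4.0, steepness=1.0):
    """Expected number of activated sites in a given month (logistic S-curve)."""
    return total_sites / (1.0 + math.exp(-steepness * (month - midpoint)))


def expected_enrollment(months, rate_per_site=1.25, **curve):
    """Expected new patients per month = active sites x per-site monthly rate."""
    return [active_sites(m, **curve) * rate_per_site for m in range(1, months + 1)]


monthly = expected_enrollment(12)

# Running total, to compare against the global enrollment target.
cumulative = []
total = 0.0
for value in monthly:
    total += value
    cumulative.append(total)

for m, (new, cum) in enumerate(zip(monthly, cumulative), start=1):
    print(f"month {m:2d}: {new:6.1f} new, {cum:7.1f} cumulative")
```

A flat global rate would spread the same total evenly across months, overstating early demand and understating the later peak.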

1. Architect a holistic forecasting system that integrates Bayesian Hierarchical Models (e.g., using Stan or PyMC) to borrow strength across similar trials or cohorts.
2. Implement closed-loop feedback by integrating real-time IRT (Interactive Response Technology) and EDC (electronic data capture) data streams to update forecasts probabilistically.
3. Lead strategic alignment by presenting forecasts with explicit risk buffers (P50, P80, P95) to inform go/no-go decisions on manufacturing batch sizes.

Practice Projects

Beginner

Project

Forecasting Phase I Enrollment with ARIMA

Scenario

Given 24 months of monthly patient enrollment data for a completed Phase I oncology trial (N=50), build a model to forecast the next 6 months of enrollment for a similar new trial.

How to Execute

1. Clean the data in Python (pandas), handling missing months.
2. Perform a stationarity test (ADF) and apply differencing if needed.
3. Fit candidate models with the `ARIMA` class from `statsmodels`, choosing orders (p,d,q) from ACF/PACF plots and AIC comparison.
4. Plot the forecast with confidence intervals and calculate MAE against a hold-out set.

Intermediate

Case Study/Exercise

Building a Probabilistic IMP Supply Plan

Scenario

A Phase II cardiology trial is planning for a 600-patient, 12-month recruitment period. The IMP has a 36-month shelf life and a 3-month manufacturing lead time. Management wants a supply plan that ensures <5% risk of stockout.

How to Execute

1. Model patient enrollment as a non-homogeneous Poisson process (NHPP) with a site activation curve (S-curve).
2. For each simulated patient, generate a treatment duration from a log-normal distribution.
3. Run 10,000 Monte Carlo simulations to produce a probability distribution of total demand over time.
4. From this distribution, extract the demand quantity at the 95th percentile (P95) to define your required supply quantity, accounting for lead time.
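A compact sketch of steps 1 to 4 follows, using only the standard library. Arrivals are drawn from the NHPP by thinning; the site count, per-site rate, S-curve shape, log-normal parameters, and kits-per-month figure are all hypothetical. It reports total demand only, so lead time and shelf life would still need to be layered on top.

```python
import math
import random

TARGET_PATIENTS = 600     # from the scenario
RECRUITMENT_MONTHS = 12   # from the scenario
TOTAL_SITES = 60          # hypothetical
RATE_PER_SITE = 1.2       # hypothetical patients per active site per month
KITS_PER_MONTH = 1        # hypothetical dispensing rate


def intensity(t):
    """Enrollment intensity (patients/month): logistic site activation x per-site rate."""
    active = TOTAL_SITES / (1.0 + math.exp(-1.5 * (t - 3.0)))
    return active * RATE_PER_SITE


MAX_INTENSITY = TOTAL_SITES * RATE_PER_SITE  # upper bound required by thinning


def simulate_total_demand(rng):
    """One simulated trial: NHPP arrivals by thinning, log-normal treatment duration."""
    t, patients, kits = 0.0, 0, 0
    while patients < TARGET_PATIENTS:
        t += rng.expovariate(MAX_INTENSITY)  # candidate arrival at the maximum rate
        if t > RECRUITMENT_MONTHS:
            break
        if rng.random() <= intensity(t) / MAX_INTENSITY:  # keep with prob. lambda(t)/max
            patients += 1
            months_on_treatment = rng.lognormvariate(math.log(6.0), 0.5)  # median 6 months
            kits += math.ceil(months_on_treatment * KITS_PER_MONTH)
    return kits


def percentile(sorted_values, p):
    """Empirical quantile: smallest value with at least a fraction p of runs at or below it."""
    index = min(len(sorted_values) - 1, math.ceil(p * len(sorted_values)) - 1)
    return sorted_values[max(index, 0)]


rng = random.Random(42)
N_SIMS = 2_000  # the exercise calls for 10,000; reduced here to keep the sketch quick
demand = sorted(simulate_total_demand(rng) for _ in range(N_SIMS))
p50, p95 = percentile(demand, 0.50), percentile(demand, 0.95)
print(f"P50 = {p50} kits, P95 = {p95} kits, buffer = {p95 - p50} kits")
```

Supplying the P95 quantity leaves roughly a 5% chance that total demand exceeds supply, which is the risk tolerance the scenario asks for.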

Advanced

Project

Implementing a Bayesian Hierarchical Forecasting Pipeline

Scenario

A portfolio of 5 similar oncology trials (varying in target population) is running. You have enrollment data from 10 past, analogous trials. You need to forecast demand for a new, sixth trial starting in 3 months.

How to Execute

1. Structure data hierarchically: trials within a disease area.
2. In PyMC, specify a Bayesian Hierarchical Model with hyperpriors for recruitment rate parameters (e.g., 'k' in a Poisson-Gamma model) that are shared across the portfolio.
3. Use MCMC sampling (NUTS) to fit the model to historical data, obtaining posterior distributions for each trial's parameters.
4. Forecast the new trial by drawing from the posterior predictive distribution, which naturally incorporates both the uncertainty from the new trial and the learned information from the portfolio.
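Before building the PyMC model, the idea of borrowing strength can be previewed with an empirical-Bayes Gamma-Poisson calculation. This is a deliberately simplified stand-in for the NUTS-fitted hierarchical model, not a substitute for it: the shared Gamma prior is fitted by method of moments instead of being given hyperpriors. The historical figures and function names are hypothetical.

```python
import math
import random
import statistics

# Hypothetical history: (patients enrolled, months of recruitment) for 10 past trials.
history = [(96, 12), (130, 14), (75, 10), (160, 16), (88, 11),
           (142, 15), (101, 12), (66, 9), (118, 13), (150, 18)]


def fit_gamma_prior(trials):
    """Method-of-moments Gamma(shape, rate) fit to per-trial monthly enrollment rates.

    Simplification: ignores the sampling noise in each trial's observed rate.
    """
    rates = [n / months for n, months in trials]
    mean, var = statistics.mean(rates), statistics.variance(rates)
    return mean * mean / var, mean / var


def posterior(shape, rate, enrolled=0, months=0):
    """Gamma posterior for one trial's rate; with no data it equals the shared prior."""
    return shape + enrolled, rate + months


def sample_poisson(mean, rng):
    """Knuth's multiplication method; fine for the moderate means used here."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def predictive_draws(shape, rate, horizon_months=6, draws=5_000, seed=7):
    """Posterior predictive enrollment over the horizon, sorted ascending."""
    rng = random.Random(seed)
    sims = []
    for _ in range(draws):
        monthly_rate = rng.gammavariate(shape, 1.0 / rate)  # uncertainty about the trial
        sims.append(sample_poisson(monthly_rate * horizon_months, rng))  # count noise
    return sorted(sims)


shape, rate = fit_gamma_prior(history)
sims = predictive_draws(shape, rate)  # new trial, no data yet
p50, p95 = sims[len(sims) // 2], sims[int(0.95 * len(sims))]
print(f"portfolio mean rate: {shape / rate:.2f} patients/month")
print(f"new trial, first 6 months: P50 = {p50}, P95 = {p95}")

# Partial pooling: after 2 months with 30 patients, the estimate moves toward 15/month
# but is still pulled back toward the portfolio mean.
post_shape, post_rate = posterior(shape, rate, enrolled=30, months=2)
print(f"updated mean rate: {post_shape / post_rate:.2f} patients/month")
```

In the full PyMC version the prior parameters would themselves be sampled, so uncertainty about the portfolio-level rate would also flow into the forecast.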

Tools & Frameworks

Software & Platforms

Python (statsmodels, PyMC, Prophet, pandas); R (forecast, rstanarm); Specialized Platforms (Kinaxis RapidResponse, Oracle Clinical One Planning); Interactive Response Technology (IRT) Systems

Python/R for custom model development and statistical analysis. Specialized SaaS platforms offer integrated workflow but can be black-box. IRT systems are the source of truth for real-time, blinded enrollment and dispensing data.

Mental Models & Methodologies

Monte Carlo Simulation; Bayesian Inference & Hierarchical Modeling; Time-Series Decomposition (STL); Enrollment Funnel Modeling (Site → Patient → Dose)

Monte Carlo for quantifying supply chain risk under uncertainty. Bayesian methods for incorporating prior knowledge and portfolio learning. Enrollment funnel modeling forces disaggregation of demand drivers, a critical practice for accuracy.

Interview Questions

Answer Strategy

The candidate must demonstrate they can decompose the problem beyond simple enrollment. They should talk about creating a dose-per-patient simulation model that accounts for different patient pathways (completers, dropouts, dose modifications) and then combining it with a probabilistic enrollment forecast. Sample answer: 'I'd first model the dosing regimen as a patient journey state machine. Using historical screen failure and discontinuation rates, I'd simulate individual patient dose histories. These simulations would then be aggregated over a probabilistic enrollment forecast (likely an NHPP fitted to the site activation plan) to yield a total dose demand distribution, not just a point estimate.'
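The patient journey state machine in the sample answer could be sketched as below. Every probability, kit count, and visit count is a hypothetical placeholder, and the triangular draw for screened patients merely stands in for a proper NHPP-based enrollment forecast.

```python
import random

# Hypothetical inputs; in practice these come from historical trial data.
P_SCREEN_FAILURE = 0.25   # screened patients who never receive IMP
P_DISCONTINUE = 0.05      # chance of dropping out before each visit
P_DOSE_REDUCTION = 0.10   # chance of moving to the reduced dose at each visit
KITS = {"full_dose": 2, "reduced_dose": 1}
PLANNED_VISITS = 8


def simulate_patient(rng):
    """Walk one patient through screening and dosing visits; return kits dispensed."""
    if rng.random() < P_SCREEN_FAILURE:
        return 0                       # screen failure: no IMP dispensed
    state, kits = "full_dose", 0
    for _ in range(PLANNED_VISITS):
        if rng.random() < P_DISCONTINUE:
            break                      # dropout: no further dispensing
        if state == "full_dose" and rng.random() < P_DOSE_REDUCTION:
            state = "reduced_dose"     # dose modification persists for later visits
        kits += KITS[state]
    return kits


def simulate_trial(rng):
    """One scenario: draw a screened-patient count, then sum the patient journeys."""
    screened = round(rng.triangular(90, 130, 110))  # placeholder enrollment forecast
    return sum(simulate_patient(rng) for _ in range(screened))


rng = random.Random(2024)
totals = sorted(simulate_trial(rng) for _ in range(1_000))
p50, p95 = totals[len(totals) // 2], totals[int(0.95 * len(totals))]
print(f"total kits: P50 = {p50}, P95 = {p95}")
```

The output is a distribution of total kits, so the gap between P50 and P95 can be read directly as the buffer implied by pathway and enrollment uncertainty.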

Answer Strategy

Tests risk communication and cross-functional leadership. The answer should move from numbers to business context and options. Sample answer: 'I would not simply recommend the 15,000-unit production run. I'd present the risk analysis: there's a 50% chance we over-produce by 2,000 units and a 5% chance we under-produce by 3,000. I'd then quantify the impact of a stockout (trial delay, patient impact) vs. write-off cost. In collaboration with supply chain, we'd evaluate mitigations: a phased production batch, a backup CMO, or adjusting site initiation pace to flatten the demand curve. The final decision would be documented with the chosen risk tolerance.'