Bayesian approaches for addressing missing data in Cost-Effectiveness Analysis (alongside Randomised Controlled Trials)

Gianluca Baio

Department of Statistical Science | University College London

22 July 2022

Check out our departmental podcast “Random Talks” on Soundcloud!   

Follow our departmental social media accounts         

Disclaimer

…Just so you know what you’re about to get into… 😉

Health technology assessment (HTA)

Objective: Combine costs and benefits of a given intervention into a rational scheme for allocating resources

Health technology assessment (HTA)

Probabilistic sensitivity analysis

Health technology assessment (HTA)

Probabilistic sensitivity analysis

Health technology assessment (HTA)

Probabilistic sensitivity analysis

Individual-level data

HTA alongside RCTs

Demographics
Clinical outcomes
HRQL data
Resource use data
ID
Trt
Sex
Age
\(\ldots\)
\(\color{olive}{y_0}\)
\(\color{olive}{y_1}\)
\(\color{olive}{\ldots}\)
\(\color{olive}{y_J}\)
\(\color{blue}{u_0}\)
\(\color{blue}{u_1}\)
\(\color{blue}{\ldots}\)
\(\color{blue}{u_J}\)
\(\color{red}{c_0}\)
\(\color{red}{c_1}\)
\(\color{red}{\ldots}\)
\(\color{red}{c_J}\)
1 1 M 23 \(\ldots\) \(y_{10}\) \(y_{11}\) \(\ldots\) \(y_{1J}\) 0.32 0.66 \(\ldots\) 0.44 103 241 \(\ldots\) 80
2 1 M 23 \(\ldots\) \(y_{20}\) \(y_{21}\) \(\ldots\) \(y_{2J}\) 0.12 0.16 \(\ldots\) 0.38 1204 1808 \(\ldots\) 877
3 2 F 19 \(\ldots\) \(y_{30}\) \(y_{31}\) \(\ldots\) \(y_{3J}\) 0.49 0.55 \(\ldots\) 0.88 16 23 \(\ldots\) 22
\(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\) \(\ldots\)

\(\color{olive}{y_{ij}} =\) Survival time, event indicator (eg CVD), number of events, continuous measurement (eg blood pressure), …

\(\color{blue}{u_{ij}}=\) Utility-based score to value health (eg EQ-5D, SF-36, Hospital & Anxiety Depression Scale), …

\(\color{red}{c_{ij}}=\) Use of resources (drugs, hospital, GP appointments), …

  • Usually aggregate longitudinal measurements into a cross-sectional summary and for each individual consider the pair \(\class{myblue}{(e_i,c_i)}\)
  • HTA preferably based on utility-based measures of effectiveness

  • Quality Adjusted Life Years (QALYs) are a measure of disease burden combining

    • Quantity of life (the amount of time spent in a given health state)
    • Quality of life (the utility attached to that state)

Use of resources (drugs, hospital, GP appointments), …

(“Standard”) Statistical modelling

  1. Compute individual QALYs and total costs as

\[\class{myblue}{e_i = \displaystyle\sum_{j=1}^{J} \left(u_{ij}+u_{i\hspace{.5pt}j-1}\right) \frac{\delta_{j}}{2}} \qquad \style{font-family:inherit;}{\text{ and }} \class{myblue}{\qquad c_i = \sum_{j=0}^J c_{ij}} \qquad \left[\style{font-family:inherit;}{\text{with: }} \class{myblue}{\delta_j = \frac{\style{font-family:inherit;}{\text{Time}}_j - \style{font-family:inherit;}{\text{Time}}_{j-1}}{\style{font-family:inherit;}{\text{Unit of time}}}}\right]\]

  1. (Often implicitly) assume normality and linearity and model independently individual QALYs and total costs by controlling for (centered) baseline values, eg \(\class{myblue}{u^∗ = (u − \bar{u} )}\) and \(\class{myblue}{c^∗ = (c − \bar{c} )}\) \[\begin{align} e_i & = \alpha_{e0} + \alpha_{e1} u^*_{0i} + \alpha_{e2} \style{font-family:inherit;}{\text{Trt}}_i + \varepsilon_{ei}\, [+ \ldots], \qquad \varepsilon_{ei} \sim \dnorm(0,\sigma_e) \\ c_i & = \alpha_{c0} + \alpha_{c1} c^*_{0i} + \alpha_{c2} \style{font-family:inherit;}{\text{Trt}}_i + \varepsilon_{ci}\, [+ \ldots], \qquad\hspace{2pt} \varepsilon_{ci} \sim \dnorm(0,\sigma_c) \end{align}\]
  1. Estimate population average cost and effectiveness differentials
    - Under this model specification, these are \(\class{myblue}{\Delta_e=\alpha_{e2}}\) and \(\class{myblue}{\Delta_c=\alpha_{c2}}\)
  1. Quantify impact of uncertainty in model parameters on the decision making process
    - In a fully frequentist analysis, this is done using resampling methods (eg bootstrap)

What’s wrong with that?…

  • Potential correlation between costs & clinical benefits [Individual level + Aggregated level Data]
    • Strong positive correlation: effective treatments are innovative and result from intensive and lengthy research \(\Rightarrow\) are associated with higher unit costs
    • Negative correlation: more effective treatments may reduce total care pathway costs e.g. by reducing hospitalisations, side effects, etc.
    • Because of the way in which standard models are set up, bootstrapping generally only approximates the underlying level of correlation – MCMC does a better job!
  • Joint/marginal normality not realistic [Mainly ILD]
    • Costs usually skewed and benefits may be bounded in \([0; 1]\) some ghost text to take it all the way to the end of the div so a coupl it to orange aside
    • Can use transformation (e.g. logs) – but care is needed when back transforming to the natural scale
    • Should use more suitable models (e.g. Beta, Gamma or log-Normal) – generally easier under a Bayesian framework
    • Particularly relevant in presence of partially observed data – more on this later!
  • Particularly as the focus is on decision-making (rather than just inference), we need to use all available evidence to fully characterise current uncertainty on the model parameters and outcomes [ILD + ALD]
    • A Bayesian approach is helpful in combining different sources of information
    • Propagating uncertainty is a fundamentally Bayesian operation!

Bayesian HTA in action

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = p(e)p(c\mid e) = p(c)p(e\mid c)}\)

Bayesian HTA in action

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = }\class{blue}{p(e)}\class{myblue}{p(c\mid e) = p(c)p(e\mid c)}\)

Bayesian HTA in action

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = }\class{blue}{p(e)}\class{red}{p(c\mid e)}\class{myblue}{ = p(c)p(e\mid c)}\)

Bayesian HTA in action

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = p(e)p(c\mid e) = p(c)p(e\mid c)}\)

- For example: \[\begin{align} \class{blue}{e_i \sim \style{font-family:inherit;}{\text{Normal}}(\phi_{ei},\tau_e)} && \class{blue}{\phi_{ei}} &\class{blue}{= \alpha_0 [+ \ldots]} && \class{blue}{\mu_e = \alpha_0} \\ \class{red}{c_i\mid e_i \sim \style{font-family:inherit;}{\text{Normal}}(\phi_{ci},\tau_c)} && \class{red}{\phi_{ci}} &\class{red}{= \beta_0 + \beta_1(e_i -\mu_e)[+\ldots]} && \class{red}{\mu_ c = \beta_0} \end{align}\]

Bayesian HTA in action

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = p(e)p(c\mid e) = p(c)p(e\mid c)}\)

- For example: \[\begin{align} \class{blue}{e_{i} \sim \style{font-family:inherit;}{\text{Beta}}(\phi_{ei}\tau_e,(1-\phi_{ei})\tau_e)} && \class{blue}{\style{font-family:inherit;}{\text{logit}}(\phi_{ei})} &\class{blue}{= \alpha_0 [+ \ldots]} && \class{blue}{\mu_e = \frac{\style{font-family:inherit;}{\text{exp}}(\alpha_0)}{1+\style{font-family:inherit;}{\text{exp}}(\alpha_0)}} \\ \class{red}{c_i\mid e_i \sim \style{font-family:inherit;}{\text{Gamma}}(\tau_c,\tau_c/\phi_{ci})} && \class{red}{\style{font-family:inherit;}{\text{log}}(\phi_{ci})} &\class{red}{= \beta_0 + \beta_1(e_i -\mu_e)[+\ldots]} && \class{red}{\mu_ c = \style{font-family:inherit;}{\text{exp}}(\beta_0)} \end{align}\]

  • Combining “modules” and fully characterising uncertainty about deterministic functions of random quantities is relatively straightforward using MCMC

  • Prior information can help stabilise inference (especially with sparse data!), eg

    • Cancer patients are unlikely to survive as long as the general population
    • ORs are unlikely to be greater than \(\pm \style{font-family:inherit;}{\text{5}}\)

Missing data in HTA

  • Missing data are complicated in any context

    • But are fairly established in medical/bio-statistical research
  • In HTA it’s even more complicated…

    • Bivariate outcome, usually correlated
    • Normality not reasonable (skewness)
    • Other features of the data (“spikes”)
    • Main objective: decision-making, not inference!

Missing data in HTA

Selection models: MCAR \((e,c)\)

  • \(\class{olive}{m_{ei}\sim\dbern(\pi_{ei}); \qquad \logit(\pi_{ei})=\gamma_{e0}}\)

  • \(\class{orange}{m_{ci}\sim\dbern(\pi_{ci}); \qquad \logit(\pi_{ci})=\gamma_{c0}}\)

Missing data in HTA

Selection models: MAR \((e,c)\)

  • \(\class{olive}{m_{ei}\sim\dbern(\pi_{ei}); \qquad \logit(\pi_{ei})=\gamma_{e0} + \sum_{k=1}^K \gamma_{ek}x_{eik}}\)

  • \(\class{orange}{m_{ci}\sim\dbern(\pi_{ci}); \qquad \logit(\pi_{ci})=\gamma_{c0} + \sum_{h=1}^H \gamma_{ch}x_{cih}}\)

Missing data in HTA

Selection models: MNAR \((e,c)\)

  • \(\class{olive}{m_{ei}\sim\dbern(\pi_{ei}); \qquad \logit(\pi_{ei})=\gamma_{e0} + \sum_{k=1}^K \gamma_{ek}x_{eik} + \gamma_{eK+1}e_i;} \class{blue}{\quad \gamma_{eK+1}\sim\style{font-family:inherit;}{\text{Informative Prior}}}\)

  • \(\class{orange}{m_{ci}\sim\dbern(\pi_{ci}); \qquad \logit(\pi_{ci})=\gamma_{c0} + \sum_{h=1}^H \gamma_{ch}x_{cih} + \gamma_{cH+1}c_i;} \class{blue}{\quad \gamma_{cK+1}\sim\style{font-family:inherit;}{\text{Informative Prior}}}\)

Motivating example: MenSS trial

  • The MenSS pilot RCT evaluates the cost-effectiveness of a new digital intervention to reduce the incidence of STI in young men with respect to the SOC
    • QALYs calculated from utilities (EQ-5D 3L)
    • Total costs calculated from different components (no baseline)

Partially observed data

Time Type of outcomes Observed (%) Observed (%)
Control (\(n_1=\) 75) Intervention (\(n_2=\) 84)
Baseline utilities 72 (96%) 72 (86%)
3 months utilities and costs 34 (45%) 23 (27%)
6 months utilities and costs 35 (47%) 23 (27%)
12 months utilities and costs 43 (57%) 36 (43%)
Complete cases utilities and costs 27 (44%) 19 (23%)

Motivating example: MenSS trial

  • The MenSS pilot RCT evaluates the cost-effectiveness of a new digital intervention to reduce the incidence of STI in young men with respect to the SOC
    • QALYs calculated from utilities (EQ-5D 3L)
    • Total costs calculated from different components (no baseline)

Skewness and “structural values”

Modelling

Gabrio et al (2018)

  1. Bivariate Normal
    • Simpler and closer to “standard” frequentist models
    • Accounts for correlation between QALYs and costs

Modelling

Gabrio et al (2018)

  1. Bivariate Normal
    • Simpler and closer to “standard” frequentist models
    • Accounts for correlation between QALYs and costs
  2. Beta-Gamma
    • Account for correlation between outcomes and model the relevant ranges: QALYs \(\in (0,1)\) and costs \(\in (0,\infty)\)
    • But: needs to rescale the observed data \(\class{myblue}{e^*_{it}=(e_{it}-\epsilon)}\) to avoid spikes at 1

Modelling

Gabrio et al (2018)

  1. Bivariate Normal
    • Simpler and closer to “standard” frequentist models
    • Accounts for correlation between QALYs and costs
  2. Beta-Gamma
    • Account for correlation between outcomes and model the relevant ranges: QALYs \(\in (0,1)\) and costs \(\in (0,\infty)\)
    • But: needs to rescale the observed data \(\class{myblue}{e^*_{it}=(e_{it}-\epsilon)}\) to avoid spikes at 1
  3. Hurdle model
    • Model \(e_{it}\) as a mixture to account for correlation between outcomes + relevant ranges + structural values
    • May expand further to account for partially observed baseline utilities \(u_{0it}\) (needs untestable assumptions!)

Modelling

Gabrio et al (2018)

  1. Bivariate Normal
    • Simpler and closer to “standard” frequentist models
    • Accounts for correlation between QALYs and costs
  2. Beta-Gamma
    • Account for correlation between outcomes and model the relevant ranges: QALYs \(\in (0,1)\) and costs \(\in (0,\infty)\)
    • But: needs to rescale the observed data \(\class{myblue}{e^*_{it}=(e_{it}-\epsilon)}\) to avoid spikes at 1
  3. Hurdle model
    • Model \(e_{it}\) as a mixture to account for correlation between outcomes + relevant ranges + structural values
    • May expand further to account for partially observed baseline utilities \(u_{0it}\) (needs untestable assumptions!)

Modelling

Gabrio et al (2018)

  1. Bivariate Normal
    • Simpler and closer to “standard” frequentist models
    • Accounts for correlation between QALYs and costs
  2. Beta-Gamma
    • Account for correlation between outcomes and model the relevant ranges: QALYs \(\in (0,1)\) and costs \(\in (0,\infty)\)
    • But: needs to rescale the observed data \(\class{myblue}{e^*_{it}=(e_{it}-\epsilon)}\) to avoid spikes at 1
  3. Hurdle model
    • Model \(e_{it}\) as a mixture to account for correlation between outcomes + relevant ranges + structural values
    • May expand further to account for partially observed baseline utilities \(u_{0it}\) (needs untestable assumptions!)

Bayesian multiple imputation (MAR)

missingHE

A R package to deal with missing data in HTA

Objectives: Run a set of complex models to account for different level of complexity and missingness

GitHub repository

Conclusions

  • A full Bayesian approach to handling missing data extends standard “imputation methods”

    • Can consider MAR and MNAR with relatively little expansion to the basic model
  • Particularly helpful in cost-effectiveness analysis, to account for

    • Asymmetrical distributions for the main outcomes
    • Correlation between costs & benefits
    • Structural values (eg spikes at 1 for utilities or spikes at 0 for costs)
  • Need specialised software + coding skills

    • R package missingHE to implement a set of general models
 priormodel<-list ("mu.prior.e"=mu.prior.e,"delta.prior.e"=delta.prior.e)
    run_model(data=data, model.eff=e~1, model.cost=c~1,
    dist_e="norm",dist_c="norm",type="MNAR_eff",stand=FALSE,
    program="JAGS",forward=FALSE,prob=c(0.05,0.95),n.chains=2,n.iter =20000,
    n.burnin=floor(20000/2),inits=NULL,n.thin=1,save_model=FALSE,prior=prior
 )
  • Link up with the “R-HTA-verse”

    • Other packages that can be used to analyse/post-process the output of economic models
    • BCEA, survHE, heemod, …