HTA the Bayesian way


Gianluca Baio

Department of Statistical Science   |   University College London

g.baio@ucl.ac.uk


https://gianluca.statistica.it

https://egon.stats.ucl.ac.uk/research/statistics-health-economics

https://github.com/giabaio   https://github.com/StatisticsHealthEconomics  

@gianlubaio@mas.to     gianluca-baio    


Karolinska Institute, Stockholm (Sweden)

Approaches to handling uncertainty in HTA

4 December 2025

Check out our departmental podcast “Random Talks” on Soundcloud!

Follow our departmental social media accounts + magazine “Sample Space”

To be or not to be?… (a Bayesian)

Practical benefits

  • Ability to synthesise multiple datasets / sources of evidence in a coherent manner
  • … Allows complexities of real-world data to be modelled (via MCMC methods)
  • Naturally provides predictions of future events
  • Full allowance for uncertainty in conclusions
  • Intuitive communication
    • Express uncertainty by probability statements about unknowns

Challenges

  • Can be harder to implement than classical “frequentist” methods
    • No free lunch… complex world requires complex modelling (whatever your approach…)
  • Extra source of information (the prior) – can be tricky to specify…

Bayesian inference

Basic ideas

  • A Bayesian model specifies a full probability distribution to describe uncertainty

  • This applies to

    • Data, which are subject to sampling variability
    • Parameters (or hypotheses), typically unobservable and thus subject to epistemic uncertainty
    • And even future, yet unobserved realisations of the observable variables (data)
  • Probability is the only language used in the Bayesian framework to describe any form of imperfect information or knowledge
    • No need to distinguish between probability and confidence
    • Before even seeing the data, we need to identify a suitable probability distribution to describe the overall uncertainty about the data \(\boldsymbol{y}\) and the parameters \(\boldsymbol\theta\)

\[p(\boldsymbol{y},\boldsymbol\theta)=p(\boldsymbol\theta)p(\boldsymbol y\mid\boldsymbol\theta) = p(\boldsymbol{y})p(\boldsymbol\theta\mid \boldsymbol{y})\] from which we derive Bayes’ theorem

\[ p(\boldsymbol\theta\mid \boldsymbol y) = \frac{p(\boldsymbol\theta)p(\boldsymbol y\mid\boldsymbol\theta)}{p(\boldsymbol y)} \]

  • Express beliefs in form of a probability distribution
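As a minimal worked illustration (with made-up data, not from the slides): with a conjugate Beta prior and Binomial data the posterior is available in closed form, so the updating in Bayes’ theorem can be checked directly in R.

```r
# Hypothetical example: prior theta ~ Beta(9.2, 13.8) (as used below) and
# made-up data y = 12 successes out of n = 20 Bernoulli trials.
# Conjugacy gives the posterior in closed form: Beta(a + y, b + n - y)
a <- 9.2; b <- 13.8
y <- 12; n <- 20
curve(dbeta(x, a, b), 0, 1, lty = 2, ylab = "Density")   # prior
curve(dbeta(x, a + y, b + n - y), add = TRUE)            # posterior
qbeta(c(0.025, 0.975), a + y, b + n - y)                 # 95% credible interval
```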

Prior information vs prior distribution

  • Before observing any new data or running our own study, we have information to suggest that a specific drug’s effectiveness is most likely included in the range \([0.2;0.6]\)
  • We can map this information into a number of distributions, which are consistent with this knowledge

\[ \phi \sim \dnorm(-0.405, 0.4137) \quad\Rightarrow\quad \theta = \logit^{-1}(\phi) = \frac{\exp(\phi)}{1+\exp(\phi)} \approx \dbeta(9.2,13.8) \]
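A quick Monte Carlo check of this mapping in R (values as on the slide): simulating \(\phi\) on the logit scale and back-transforming shows that the implied distribution for \(\theta\) sits essentially on \([0.2;0.6]\) and is closely matched by the approximating \(\dbeta(9.2,13.8)\).

```r
set.seed(1)
phi <- rnorm(1e5, mean = -0.405, sd = 0.4137)  # prior on the logit scale
theta <- plogis(phi)                # inverse logit: exp(phi) / (1 + exp(phi))
quantile(theta, c(0.025, 0.975))    # roughly (0.2, 0.6)
qbeta(c(0.025, 0.975), 9.2, 13.8)   # approximating Beta: very similar
```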

Prior distributions

  • The scale on which the distribution is defined is important
  • Whenever we have genuine information we should use it (parachute effect…)
  • When we don’t, we can still do better than “I have no idea about this parameter”…

  • For example… Consider a model with \(\logit(\pi_{i}) = \alpha_i + \delta x_i\) and set priors
    • \(\alpha_i\sim\dnorm(\text{mean}=0,\text{sd}=4)\)
    • \(\text{logOR}=\delta\sim\dnorm(\text{mean}=0,\text{sd}=2)\)
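The “scale matters” point is easy to check with a prior-predictive simulation (a sketch, not from the slides): a prior that looks vague on the logit scale, such as \(\dnorm(\text{mean}=0,\text{sd}=4)\), piles most of its mass near 0 and 1 on the probability scale.

```r
set.seed(1)
alpha <- rnorm(1e5, mean = 0, sd = 4)  # "vague" prior on the logit scale
pi <- plogis(alpha)                    # implied prior on the probability scale
hist(pi, breaks = 50)                  # bathtub shape: mass piles up at 0 and 1
mean(pi < 0.05 | pi > 0.95)            # a large share of the mass in the extremes
```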

Individual level data

Example — 10 Top Tips trial

| ID | Trt | Sex | Age | BMI | GP | \(u_0\) | \(u_3\) | \(u_6\) | \(u_{12}\) | \(u_{18}\) | \(u_{24}\) | \(c\) |
|----|-----|-----|-----|-----|----|---------|---------|---------|------------|------------|------------|---------|
| 2  | 1 | F | 66 | 1 | 21 | 0.088 | 0.848 | 0.689 | 0.088 | 0.691 | 0.587 | 4230.04 |
| 3  | 1 | M | 57 | 1 | 5  | 0.796 | 0.796 | 0.796 | 0.620 | 0.796 | 1.000 | 1584.88 |
| 4  | 1 | M | 49 | 2 | 5  | 0.725 | 0.727 | 0.796 | 0.848 | 0.796 | 0.291 | 331.27  |
| 12 | 1 | M | 64 | 2 | 14 | 0.850 | 0.850 | 1.000 | 1.000 | 0.848 | 0.725 | 1034.42 |
| 13 | 1 | M | 66 | 1 | 9  | 0.848 | 0.848 | 0.848 | 1.000 | 0.848 | 0.725 | 1321.30 |
| 21 | 1 | M | 64 | 1 | 3  | 0.848 | 1.000 | 1.000 | 1.000 | 0.850 | 1.000 | 520.98  |
  • Demographics:
    • BMI = Categorised body mass index
    • GP = Number of GP visits
  • HRQL data
    • QoL measurements at baseline, 3, 6, 12, 18 and 24 months
  • Costs
    • Total costs over the course of the study

(“Standard”) Statistical modelling

  1. Compute individual QALYs and total costs as

\[\class{myblue}{e_i = \displaystyle\sum_{j=1}^{J} \left(u_{ij}+u_{i\hspace{.5pt}j-1}\right) \frac{\delta_{j}}{2}} \qquad \txt{and} \class{myblue}{\qquad c_i = \sum_{j=0}^J c_{ij}} \qquad \left[\txt{with } \class{myblue}{\delta_j = \frac{\text{Time}_j - \text{Time}_{j-1}}{\txt{Unit of time}}}\right]\]

  2. (Often implicitly) assume normality and linearity and model individual QALYs and total costs independently, controlling for (centred) baseline values, eg \(u^*_i = (u_i - \bar{u})\) and \(c^*_i = (c_i - \bar{c})\) (a short R sketch of steps 1 and 2 follows this list)

\[\begin{align} e_i & = \alpha_{e0} + \alpha_{e1} \text{Trt}_i + \alpha_{e2} u^*_{0i} + \varepsilon_{ei}\, [+ \ldots], \qquad \varepsilon_{ei} \sim \dnorm(0,\sigma_e) \\ c_i & = \alpha_{c0} + \alpha_{c1} \text{Trt}_i + \alpha_{c2} c^*_{0i} + \varepsilon_{ci}\, [+ \ldots], \qquad\hspace{2pt} \varepsilon_{ci} \sim \dnorm(0,\sigma_c) \end{align}\]

  3. Estimate population average cost and effectiveness differentials
    • Under this model specification, these are \(\class{myblue}{\Delta_e=\alpha_{e1}}\) and \(\class{myblue}{\Delta_c=\alpha_{c1}}\)
  4. Quantify impact of uncertainty in model parameters on the decision making process
    • In a fully frequentist analysis, this is done using resampling methods (eg bootstrap)
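A minimal R sketch of steps 1 and 2, assuming a data frame `tenTT` with the variables shown in the 10TT extract above (the name and column labels are illustrative; the baseline-cost adjustment is omitted because the extract only records total costs):

```r
# Step 1: QALYs by the trapezium rule, with time measured in years
times <- c(0, 3, 6, 12, 18, 24) / 12
u <- as.matrix(tenTT[, c("u0", "u3", "u6", "u12", "u18", "u24")])
delta <- diff(times)
tenTT$e <- as.numeric((u[, -1] + u[, -ncol(u)]) %*% delta / 2)

# Step 2: independent normal regressions, centring the baseline utility
tenTT$u0c <- tenTT$u0 - mean(tenTT$u0)
m_e <- lm(e ~ Trt + u0c, data = tenTT)   # Delta_e = coefficient on Trt
m_c <- lm(c ~ Trt, data = tenTT)         # Delta_c = coefficient on Trt
# Steps 3-4 would read off the Trt coefficients and bootstrap the whole
# procedure to quantify uncertainty
```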

Modelling ILD in HTA

Normal/Normal independent model — setup

  • The “standard” modelling is equivalent to

    \[\begin{eqnarray*} e_i & \sim & \dnorm(\phi_{ei},\sigma_{et}) \\ \phi_{ei} & = & \alpha_0 + \alpha_1 (\text{Trt}_i - 1) + \alpha_2 (u_{0i}-\bar{u}_{0}) \end{eqnarray*}\] and \[\begin{eqnarray*} c_i & \sim & \dnorm(\phi_{ci},\sigma_{ct}) \\ \phi_{ci} & = & \beta_0 + \beta_1(\text{Trt}_i - 1) \end{eqnarray*}\] where

    • \(\text{Trt}_i \in \{1,2\}\) indexes the two intervention arms (\(t=1\) indicates the standard of care, while \(t=2\) is the active intervention)
    • \(u^*_{0i}=(u_{0i}-\bar{u}_{0})\) is the centred baseline QoL
  • Consequently
    • \(\mu_{et}=\alpha_0 + \alpha_1 (t-1)\) is the population average benefit
    • \(\mu_{ct}=\beta_0 + \beta_1(t-1)\) is the population average cost
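A hedged rjags sketch of this model (the variable names and vague priors are illustrative, not from the slides; note that JAGS parameterises the Normal in terms of precision, hence the `tau` terms):

```r
library(rjags)

nn_indep <- "model {
  for (i in 1:N) {
    e[i] ~ dnorm(phi.e[i], tau.e[Trt[i]])
    phi.e[i] <- alpha0 + alpha1 * (Trt[i] - 1) + alpha2 * u0c[i]
    c[i] ~ dnorm(phi.c[i], tau.c[Trt[i]])
    phi.c[i] <- beta0 + beta1 * (Trt[i] - 1)
  }
  # Vague priors (illustrative)
  alpha0 ~ dnorm(0, 1.0E-4); alpha1 ~ dnorm(0, 1.0E-4); alpha2 ~ dnorm(0, 1.0E-4)
  beta0 ~ dnorm(0, 1.0E-6);  beta1 ~ dnorm(0, 1.0E-6)
  for (t in 1:2) {
    sigma.e[t] ~ dunif(0, 10);    tau.e[t] <- pow(sigma.e[t], -2)
    sigma.c[t] ~ dunif(0, 1.0E4); tau.c[t] <- pow(sigma.c[t], -2)
    mu.e[t] <- alpha0 + alpha1 * (t - 1)   # population average benefit
    mu.c[t] <- beta0 + beta1 * (t - 1)     # population average cost
  }
}"

jm <- jags.model(textConnection(nn_indep),
                 data = list(e = tenTT$e, c = tenTT$c, Trt = tenTT$Trt,
                             u0c = tenTT$u0c, N = nrow(tenTT)))
sims <- coda.samples(jm, variable.names = c("mu.e", "mu.c"), n.iter = 5000)
```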

Marginal/conditional factorisation model

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = }\class{blue}{p(e)}\class{red}{p(c\mid e)}\class{myblue}{ = p(c)p(e\mid c)}\)

\[\begin{eqnarray*} e_i & \sim & p(e \mid \phi_{ei},\bm\xi_e) \\ g_e(\phi_{ei}) & = & \alpha_0 \, [+\ldots] \\ \mu_e & = & g_e^{-1}(\alpha_0) \\ && \\ \phi_{ei} & = & \text{location} \\ \bm\xi_e & = & \text{ancillary} \end{eqnarray*}\]

\[\begin{eqnarray*} c_i & \sim & p(c \mid e_i, \phi_{ci},\bm\xi_c) \\ g_c(\phi_{ci}) & = & \beta_0 +\beta_1(e_i - \mu_e) \, [+\ldots] \\ \mu_c & = & g_c^{-1}(\beta_0) \\ && \\ \phi_{ci} & = & \text{location} \\ \bm\xi_c & = & \text{ancillary} \end{eqnarray*}\]

Marginal/conditional factorisation model

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = p(e)p(c\mid e) = p(c)p(e\mid c)}\)

\[\begin{eqnarray*} e_i & \sim & \dnorm(\phi_{ei},\xi_e) \\ \phi_{ei} & = & \alpha_0 \, [+\ldots] \\ \mu_e & = & \alpha_0 \\ && \\ \phi_{ei} & = & \text{marginal mean} \\ \xi_e & = & \text{marginal sd} \\ g_e(\cdot) & = & \text{identity} \end{eqnarray*}\]

\[\begin{eqnarray*} c_i & \sim & \dnorm(\phi_{ci},\xi_c) \\ \phi_{ci} & = & \beta_0 +\beta_1(e_i - \mu_e) \, [+\ldots] \\ \mu_c & = & \beta_0 \\ && \\ \phi_{ci} & = & \text{conditional mean} \\ \xi_c & = & \text{conditional sd} \\ g_c(\cdot) & = & \text{identity} \end{eqnarray*}\]

  • Normal/Normal MCF — equivalent to

\[\left( \begin{array}{c} \varepsilon_{ei} \\ \varepsilon_{ci}\end{array} \right) \sim \dnorm\left( \left(\begin{array}{c} 0 \\ 0\end{array}\right), \left(\begin{array}{cc} \sigma^2_e & \rho\sigma_e\sigma_c \\ \rho\sigma_e\sigma_c & \sigma^2_c \end{array}\right) \right)\]

  • Can also write down analytically the marginal mean and sd for the costs (but that’s not so relevant…)
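For reference, the derivation is one line: since marginally \(e_i\sim\dnorm(\mu_e,\xi_e)\) and conditionally \(c_i\mid e_i \sim \dnorm\left(\beta_0+\beta_1(e_i-\mu_e),\xi_c\right)\),

\[ \E[c_i] = \beta_0, \qquad \text{sd}(c_i) = \sqrt{\beta_1^2\xi_e^2+\xi_c^2}, \qquad \rho = \frac{\beta_1\xi_e}{\sqrt{\beta_1^2\xi_e^2+\xi_c^2}} \]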

Marginal/conditional factorisation model

  • In general, can represent the joint distribution as \(\class{myblue}{p(e,c) = p(e)p(c\mid e) = p(c)p(e\mid c)}\)

\[\begin{eqnarray*} e_i & \sim & \dbeta(\phi_{ei}\xi_e,(1-\phi_{ei})\xi_e) \\ \logit(\phi_{ei}) & = & \alpha_0 \, [+\ldots] \\ \mu_e & = & \frac{\exp(\alpha_0)}{1+\exp(\alpha_0)} \\ && \\ \phi_{ei} & = & \text{marginal mean} \\ \xi_e & = & \text{marginal scale} \\ g_e(\cdot) & = & \text{logit} \end{eqnarray*}\]

\[\begin{eqnarray*} c_i & \sim & \dgamma(\xi_c,\xi_c/\phi_{ci}) \\ \log(\phi_{ci}) & = & \beta_0 +\beta_1(e_i - \mu_e) \, [+\ldots] \\ \mu_c & = & \exp(\beta_0) \\ && \\ \phi_{ci} & = & \text{conditional mean} \\ \xi_c & = & \text{shape} \\ \xi_c/\phi_{ci} & = & \text{rate} \\ g_c(\cdot) & = & \text{log} \end{eqnarray*}\]

  • Beta/Gamma MCF
    • Effects may be bounded in \([0;1]\) (e.g. QALYs on a one-year horizon)
    • Costs are positive and skewed
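A short forward simulation from this Beta/Gamma MCF (all parameter values made up for illustration) shows how the factorisation induces correlation between a bounded benefit measure and positive, skewed costs:

```r
set.seed(1)
n <- 1000
alpha0 <- 0.5; xi_e <- 20            # marginal model for e
beta0 <- 7; beta1 <- -2; xi_c <- 4   # conditional model for c given e
mu_e <- plogis(alpha0)               # inverse logit of alpha0
e <- rbeta(n, mu_e * xi_e, (1 - mu_e) * xi_e)       # bounded in [0, 1]
phi_c <- exp(beta0 + beta1 * (e - mu_e))            # conditional mean of c
c <- rgamma(n, shape = xi_c, rate = xi_c / phi_c)   # positive and skewed
cor(e, c)   # negative here: larger benefits go with lower costs
```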

Marginal/conditional factorisation model


  • Combining “modules” and fully characterising uncertainty about deterministic functions of random quantities is relatively straightforward using MCMC

  • Prior information can help stabilise inference (especially with sparse data!), eg

    • Cancer patients are unlikely to survive as long as the general population
    • Log-ORs are unlikely to be greater than \(\pm 5\)

To be or not to be?… (a Bayesian)

In principle, there’s nothing inherently Bayesian about MCF… BUT: there are many advantages in doing it in a Bayesian setup

  1. As the model becomes more and more realistic and its structure more and more complex, to account for skewness and correlation, the computational advantage of maximum likelihood estimation becomes increasingly small
    • Writing and optimising the likelihood function becomes more complex, analytically and even numerically, and might require the use of simulation algorithms
    • Bayesian models generally scale up with minimal changes
    • Computation may be more expensive, but the marginal computational cost is in fact diminishing
  2. Realistic models are usually based on highly non-linear transformations
    • From a Bayesian perspective, this does not pose a substantial problem: once we obtain simulations from the posterior distribution for \(\alpha_0\) and \(\beta_0\), we can simply rescale them to obtain samples from the posterior distribution of \(\mu_e\) and \(\mu_c\) (see the sketch after this list)
    • This allows us to fully characterise and propagate the uncertainty in the fundamental economic parameters to the rest of the decision analysis with essentially no extra computational cost
    • A frequentist/ML approach would resort to resampling methods, such as the bootstrap, to effectively produce an approximation to the joint posterior distribution for all the model parameters \(\bm\theta\) and any function thereof
  3. Prior information can help stabilise inference
    • Most often, we do have contextual information to mitigate limited evidence from the data and stabilise the estimates
    • Using a Bayesian approach allows us to use this contextual information in a principled way
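Point 2 in practice (a sketch: `alpha0_draws` and `beta0_draws` stand for vectors of posterior simulations, e.g. extracted from the MCMC output of the Beta/Gamma MCF above):

```r
# Rescale posterior draws through the inverse link functions to obtain
# draws from the posterior of the population averages
mu_e_draws <- plogis(alpha0_draws)   # mu_e = exp(alpha0) / (1 + exp(alpha0))
mu_c_draws <- exp(beta0_draws)       # mu_c = exp(beta0)
# Any function of (mu_e, mu_c), e.g. the incremental net benefit, then
# inherits a full (joint) posterior distribution at no extra cost
```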

N/N independent vs N/N MCF

[Figure: Normal/Normal independent (left) vs Normal/Normal MCF (right)]

Gamma/Gamma MCF — 10TT

  • Define \(e^*_i = (3 - e_i)\)
    • Rescales the observed QALYs and turns their distribution into a right-skewed one
    • Can then use a Gamma distribution, while accounting for the small number of individuals with negative QALYs, which indicate a very poor health state (“worse than death”)
  • Model

\[\begin{align*} e^*_{i} \sim \dgamma (\nu_e,\gamma_{ei}) && & \log(\phi_{ei}) = \alpha_0 + \alpha_1(\text{Trt}_i - 1) + \alpha_2 u^*_{0i}\\ c_i\mid e^*_i \sim \dgamma(\nu_c,\gamma_{ci}) && & \log(\phi_{ci}) = \beta_0 + \beta_1(\text{Trt}_i - 1) + \beta_2(e^*_i -\mu^*_e) \end{align*}\]

  • Because of the properties of the Gamma distribution
    • \(\phi_{ei}\) indicates the marginal mean for the QALYs,
    • \(\nu_e\) is the shape
    • \(\gamma_{ei}=\nu_e/\phi_{ei}\) is the rate
  • Similarly
    • \(\nu_c\) is the shape
    • \(\gamma_{ci}=\nu_c/\phi_{ci}\) is the rate of the conditional distribution for the costs given the benefits
    • \(\phi_{ci}\) is the conditional mean
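A hedged rjags sketch of this Gamma/Gamma MCF (priors and variable names are illustrative; JAGS’s `dgamma` takes shape and rate, and the log link can be written on the left-hand side):

```r
gg_mcf <- "model {
  for (i in 1:N) {
    estar[i] ~ dgamma(nu.e, nu.e / phi.e[i])
    log(phi.e[i]) <- alpha0 + alpha1 * (Trt[i] - 1) + alpha2 * u0c[i]
    c[i] ~ dgamma(nu.c, nu.c / phi.c[i])
    log(phi.c[i]) <- beta0 + beta1 * (Trt[i] - 1) + beta2 * (estar[i] - mustar.e)
  }
  mustar.e <- exp(alpha0)   # marginal mean of e* (reference arm)
  # Vague priors (illustrative)
  alpha0 ~ dnorm(0, 1.0E-4); alpha1 ~ dnorm(0, 1.0E-4); alpha2 ~ dnorm(0, 1.0E-4)
  beta0 ~ dnorm(0, 1.0E-4);  beta1 ~ dnorm(0, 1.0E-4);  beta2 ~ dnorm(0, 1.0E-4)
  nu.e ~ dunif(0, 100); nu.c ~ dunif(0, 100)
}"
# Compile with jags.model() and monitor the quantities of interest, as above
```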

N/N independent vs N/N MCF vs G/G MCF

[Figure: Normal/Normal independent (left) vs Normal/Normal MCF (centre) vs Gamma/Gamma MCF (right)]

10TT — Model selection

| Model | \(p_V\) | \(\dic\) (a) | \(p_D\) | \(\dic\) (b) |
|-------|---------|--------------|---------|--------------|
| Normal/Normal independent | 9.64 | 687 | 9.24 | 686 |
| Normal/Normal MCF | 10.16 | 674 | 9.97 | 674 |
| Gamma/Gamma MCF | 12.30 | 492 | 10.47 | 490 |

(a) Computed as \(p_V+\overline{D(\bm\theta)}\); (b) computed as \(p_D+\overline{D(\bm\theta)}\)
  • Both versions of the DIC favour the Gamma/Gamma MCF and suggest the two Normal/Normal models are basically indistinguishable

  • \(p_V\) is computed as larger than \(p_D\) for all models, most markedly for the Gamma/Gamma MCF

  • \(p_D\) has a nice interpretation as effective number of model parameters — more on this later

Cost-effectiveness model

  • Can use the simulations from the three models for \((\mu_e,\mu_c)\) and then run BCEA to obtain the economic analysis
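For instance, a sketch using the BCEA package (the simulation objects are illustrative: one column per intervention, one row per posterior draw; argument names may differ slightly across BCEA versions):

```r
library(BCEA)
# mu_e_draws, mu_c_draws: n.sims x 2 matrices of posterior simulations,
# one column per intervention (SoC, 10TT)
m <- bcea(mu_e_draws, mu_c_draws, ref = 2, interventions = c("SoC", "10TT"))
summary(m)        # ICER, EIB, etc.
ceplane.plot(m)   # cost-effectiveness plane
ceac.plot(m)      # cost-effectiveness acceptability curve
```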

Aggregate level data

Absolute vs Relative effects

  • Absolute effects
    • e.g. probabilities, mean scores, event rates
    • Typically what’s needed in a decision model
  • Relative effects
    • e.g. log-odds ratio, mean difference, hazard ratio
    • By design RCTs provide evidence on relative effects
    • \(\ldots\) and relative effects are typically more generalisable than absolute effects

Goal: Apply relative effects to reference absolute effect to obtain absolute effects

  • mean\(_1\) = mean\(_0\) + mean difference
  • log-odds\(_1\) = log-odds\(_0\) + log OR — equivalently: \(\logit(p_1)=\logit(p_0)+\txt{log OR}\)
    • Transform from the log-odds \(\mu\) to the probability \(p\), using the inverse logit function

\[\class{myblue}{p= \displaystyle\frac{\exp(\mu)}{1+\exp(\mu)} \iff \mu = \logit(p) = \log\left(\frac{p}{1-p}\right)}\]

  • log-hazard\(_1\) = log-hazard\(_0\) + log HR

NB: we need to fully propagate the uncertainty in the relative to the absolute effect! (Being Bayesian makes life much easier! 😉)
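In a simulation-based analysis this propagation is immediate (a sketch: `logit_p0_draws` and `logOR_draws` stand for simulations of the baseline log-odds and of the log-OR):

```r
# Combine the draws on the linear (log-odds) scale, then back-transform:
# each draw of p1 carries the joint uncertainty in both inputs
p0 <- plogis(logit_p0_draws)
p1 <- plogis(logit_p0_draws + logOR_draws)
```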

(“Standard”) cost-effectiveness modelling

  1. Build a population level model (e.g. decision tree or Markov model)

  • NB: in this case, the “data” are typically represented by summary statistics for the parameters of interest \(\bm\theta=(p_1,p_2,l,\ldots)\), but we may also have access to a combination of ILD and summaries
  2. Use point estimates for the parameters to build the “base-case” (average) evaluation
  3. Use resampling methods (eg bootstrap) to propagate uncertainty in the point estimates and perform uncertainty analysis

Bayesian cost-effectiveness modelling

  1. Build a population level model (e.g. decision tree or Markov model)

  • NB: in this case, the “data” are typically represented by summary statistics for the parameters of interest \(\bm\theta=(p_1,p_2,l,\ldots)\), but may also have access to a combination of ILD and summaries
  2. Estimate all model parameters and obtain MCMC simulations, so that the PSA and the decision analysis are run at once

Cost-effectiveness analysis

Module 1: Influenza incidence

  • \(H\) studies reporting number of patients who get influenza (\(x_h\)) in the sample (\(m_h\))

  • \(\beta_h=\) population probability of influenza from the \(h\)-th study: \(\logit(\beta_h)=\gamma_h\sim \dnorm(\mu_\gamma,\sigma_\gamma)\)

  • \(\mu_\gamma\sim\dnorm(0,v)\) \(=\) pooled average probability of infection (on the logit scale!) \(\Rightarrow \displaystyle \class{red}{p_1=\frac{\exp(\mu_\gamma)}{1+\exp(\mu_\gamma)}}\)

    — or equivalently \(\class{red}{\logit(p_1)=\mu_\gamma}\)

Cost-effectiveness analysis

Module 2: Prophylaxis effectiveness

  • \(S\) studies reporting number of infected patients \(r^{(t)}_s\) in a sample made of \(n^{(t)}_s\) subjects

  • \(\pi^{(t)}_s=\) study- and treatment-specific chance of contracting influenza

    \(\logit\left(\pi^{(1)}_s\right) = \alpha_s \sim \dnorm(0, 10)\)

    \(\logit\left(\pi^{(2)}_s\right) = \alpha_s + \delta_s\)

    \(\delta_s \sim \dnorm(\mu_\delta,\sigma_\delta)=\) study-specific treatment effect

  • \(\class{blue}{\mu_\delta\sim\dnorm(0,v)}=\) pooled log-odds ratio of influenza given treatment

Cost-effectiveness analysis

Decision analytic model

Can combine modules 1 and 2 to get \(\class{olive}{\logit(p_2)=\logit(p_1)+\mu_\delta}\)
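A hedged JAGS sketch of the two modules combined (data names illustrative; the last two lines map the pooled logit-scale parameters onto the decision-model inputs \(p_1\) and \(p_2\)):

```r
flu_model <- "model {
  # Module 1: influenza incidence (H studies)
  for (h in 1:H) {
    x[h] ~ dbin(beta[h], m[h])
    logit(beta[h]) <- gamma[h]
    gamma[h] ~ dnorm(mu.gamma, tau.gamma)
  }
  # Module 2: prophylaxis effectiveness (S two-arm studies)
  for (s in 1:S) {
    r1[s] ~ dbin(pi1[s], n1[s])
    r2[s] ~ dbin(pi2[s], n2[s])
    logit(pi1[s]) <- alpha[s]
    logit(pi2[s]) <- alpha[s] + delta[s]
    alpha[s] ~ dnorm(0, 0.01)             # sd = 10 on the logit scale
    delta[s] ~ dnorm(mu.delta, tau.delta) # study-specific treatment effect
  }
  mu.gamma ~ dnorm(0, 1.0E-4); sigma.gamma ~ dunif(0, 5)
  mu.delta ~ dnorm(0, 1.0E-4); sigma.delta ~ dunif(0, 5)
  tau.gamma <- pow(sigma.gamma, -2); tau.delta <- pow(sigma.delta, -2)
  # Combine the two modules into the decision-model inputs
  logit(p1) <- mu.gamma
  logit(p2) <- mu.gamma + mu.delta
}"
```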

Cost-effectiveness analysis

  • Baseline adjustment: \[ \logit\left(\pi_s^{(2)}\right) = \alpha_s + \delta_s^* +\gamma(\alpha_s - \bar{\alpha}) \]

  • Pooling: \[ \alpha_s, \gamma_h \sim \dnorm(\mu_\gamma,\sigma^2_\gamma) \]

Value of information

Knowledge is power?

(A tale of two stupid examples)

  • Example 1: Intervention \(t=1\) is more cost-effective, given current evidence
    • \(\class{myblue}{\Pr(t=1 \text{ is cost-effective}) = 0.51}\)
    • If we get it wrong:
      • Increase in population average costs \(=\) £3
      • Decrease in population average effectiveness \(=\) 0.000001 QALYs
    • Large uncertainty/negligible consequences \(\Rightarrow\) can afford uncertainty!
  • Example 2: Intervention \(t=1\) is more cost-effective, given current evidence
    • \(\class{myblue}{\Pr(t=1 \text{ is cost-effective}) = 0.999}\)
    • If we get it wrong:
      • Increase in population average costs \(=\) £1,000,000,000
      • Decrease in population average effectiveness \(=\) 999,999 QALYs
    • Tiny uncertainty/dire consequences \(\Rightarrow\) probably should think about it…!

Evidence based decision-making and VoI

Process inherently Bayesian!

VoI: Basic ideas

  • A new study will provide more data — aim: reducing (or even eliminating?…) uncertainty in a subset of the model parameters

  • Update the cost-effectiveness model

    • If optimal decision changes, gain in monetary net benefit (NB = utility) from using new optimal treatment. If optimal decision doesn’t change, no gain in NB
  • Expected VoI is the average gain in NB

  1. Expected value of Perfect Information (EVPI)
    • Value of completely resolving uncertainty in all input parameters to decision model
    • Infinite-sized, long-term follow-up trial measuring everything!…
    • Gives an upper bound on the value of the new study – low EVPI suggests we can make our decision based on existing information
  2. Expected value of Partial Perfect Information (EVPPI)
    • Value of eliminating uncertainty in subset of input parameters to decision model
    • e.g.: Infinite-sized trial measuring relative effects on 1-year survival
    • Useful to identify which parameters are responsible for decision uncertainty
  3. Expected value of Sample Information (EVSI)
    • Value of reducing uncertainty by conducting a specific study of a given design
    • Can compare the benefits and costs of a study with given design
    • Is the proposed study likely to be a good use of resources? What is the optimal design?

VoI: Basic ideas & relevant measures

In general, VoI measures are always expressed as something like

VoI measure \(=\) Some idealised decision-making process \(-\) current decision-making process

Complexity

  • There’s no natural upper bound
    • VoI measures are positive, but how low is low?
  • Need to account for other factors
    • How much would it cost to get to the point where we can implement the idealised decision-making process?
    • Who would that affect?
    • For how long?
  • Computational & modelling issues
    • You need to know what you’re doing (again, modelling fundamentally Bayesian)
    • And use suitable tools (basically, never use spreadsheets…)

Summarising uncertainty analysis (PSA)

Expected Value of Perfect Information

Parameter simulations

| Iteration | \(\pi_0\) | \(\rho\) | \(\ldots\) | \(\gamma\) |
|-----------|-----------|----------|------------|------------|
| 1 | 0.585 | 0.3814 | \(\ldots\) | 0.4194 |
| 2 | 0.515 | 0.0166 | \(\ldots\) | 0.0768 |
| 3 | 0.611 | 0.1373 | \(\ldots\) | 0.0592 |
| 4 | 0.195 | 0.7282 | \(\ldots\) | 0.7314 |
| \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) |
| 1000 | 0.0305 | 0.204 | \(\ldots\) | 0.558 |
  • Characterise uncertainty in the model parameters
    • In a full Bayesian setting, these are drawings from the posterior distribution of \(\boldsymbol\theta\)
    • In a frequentist setting, these are typically bootstrap draws from a set of univariate distributions that describe some level of uncertainty around the MLEs

Summarising uncertainty analysis (PSA)

Expected Value of Perfect Information

Parameter simulations and expected utilities

| Iteration | \(\pi_0\) | \(\rho\) | \(\ldots\) | \(\gamma\) | \(\nb_1(\boldsymbol\theta)\) | \(\nb_2(\boldsymbol\theta)\) |
|-----------|-----------|----------|------------|------------|------------------------------|------------------------------|
| 1 | 0.585 | 0.3814 | \(\ldots\) | 0.4194 | 77480.00 | 67795.00 |
| 2 | 0.515 | 0.0166 | \(\ldots\) | 0.0768 | 87165.00 | 106535.00 |
| 3 | 0.611 | 0.1373 | \(\ldots\) | 0.0592 | 58110.00 | 38740.00 |
| 4 | 0.195 | 0.7282 | \(\ldots\) | 0.7314 | 77480.00 | 87165.00 |
| \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) |
| 1000 | 0.0305 | 0.204 | \(\ldots\) | 0.558 | 48425.00 | 87165.00 |
| Average | | | | | 72365.35 | 77403.49 |
  • Uncertainty in the parameters induces a distribution of decisions
    • Typically based on the net benefits: \(\class{myblue}{\nb_t(\boldsymbol\theta)=k\mu_{et}-\mu_{ct}}\)
    • In each parameter configuration can identify the optimal strategy
  • Averaging over the uncertainty in \(\boldsymbol\theta\) provides \(t^*\), the overall optimal decision given current uncertainty (= choose the intervention associated with highest expected utility)

Summarising uncertainty analysis (PSA)

Expected Value of Perfect Information

Parameter simulations, expected utilities and opportunity losses

| Iteration | \(\pi_0\) | \(\rho\) | \(\ldots\) | \(\gamma\) | \(\nb_1(\boldsymbol\theta)\) | \(\nb_2(\boldsymbol\theta)\) | Maximum net benefit | Opportunity loss |
|-----------|-----------|----------|------------|------------|------------------------------|------------------------------|---------------------|------------------|
| 1 | 0.585 | 0.3814 | \(\ldots\) | 0.4194 | 77480.00 | 67795.00 | 77480.00 | 9685.00 |
| 2 | 0.515 | 0.0166 | \(\ldots\) | 0.0768 | 87165.00 | 106535.00 | 106535.00 | 0.00 |
| 3 | 0.611 | 0.1373 | \(\ldots\) | 0.0592 | 58110.00 | 38740.00 | 58110.00 | 19370.00 |
| 4 | 0.195 | 0.7282 | \(\ldots\) | 0.7314 | 77480.00 | 87165.00 | 87165.00 | 0.00 |
| \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) | \(\ldots\) |
| 1000 | 0.0305 | 0.204 | \(\ldots\) | 0.558 | 48425.00 | 87165.00 | 87165.00 | 0.00 |
| Average | | | | | 72365.35 | 77403.49 | 91192.02 | 13788.53 |
  • Expected value of “Perfect” Information (EVPI) summarises uncertainty in the decision
    • Defined as the average Opportunity Loss, or average maximum expected utility under “perfect” information \(-\) maximum expected utility overall:
    \[\class{olive}{\evpi} = \class{blue}{\E_\boldsymbol{\theta}\left[\max_t \nb_t(\boldsymbol\theta) \right]} - \class{magenta}{\max_t \E_\boldsymbol\theta \left[\nb_t(\boldsymbol\theta)\right]}\]
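With the PSA simulations arranged as in the table above, the EVPI is one line of R (a sketch: `nb` stands for an n.sims \(\times\) n.treatments matrix of net benefit draws):

```r
# E_theta[ max_t NB_t(theta) ]  -  max_t E_theta[ NB_t(theta) ]
evpi <- mean(apply(nb, 1, max)) - max(colMeans(nb))
# Equivalently, the average opportunity loss with respect to t*
tstar <- which.max(colMeans(nb))
mean(apply(nb, 1, max) - nb[, tstar])
```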

Summarising PSA + Research priority

Expected Value of Partial Perfect Information

  • \(\class{blue}{\bm\theta} =\) all the model parameters; can be split into two subsets
    • The “parameters of interest”, \(\class{myblue}{\bm\phi}\), e.g. prevalence of a disease, HRQL measures, length of stay in hospital, …
    • The “remaining” parameters, \(\class{olive}{\bm\psi}\), e.g. cost of treatment with other established medications, …
  • We are interested in quantifying the value of gaining more information on \(\class{myblue}{\bm\phi}\), while leaving current level of uncertainty on \(\class{olive}{\bm\psi}\) unchanged
  • First, consider the expected utility (EU) if we were able to learn \(\class{myblue}{\bm\phi}\) but not \(\class{olive}{\bm\psi}\)
  • If we knew \(\class{myblue}{\bm\phi}\) perfectly, best decision = the maximum of this EU
  • Of course, we cannot know \(\class{myblue}{\bm\phi}\) perfectly, so take the expected value
  • And compare this with the maximum expected utility overall
  • This is the EVPPI

\[\class{myblue}{\E_{\bm\psi\mid\bm\phi}[\nb_t(\bm\theta)]}\]

\[\class{myblue}{\max_t}\class{gray}{\E_{\bm\psi\mid\bm\phi}[\nb_t(\bm\theta)]}\]

\[\class{myblue}{\E_{\bm\phi}}\class{myblue}{\left [\class{gray}{\max_t\E_{\bm\psi\mid\bm\phi}[\nb_t(\bm\theta)]}\right]}\]

\[\class{gray}{\E_{\bm\phi}\left[\max_t\E_{\bm\psi\mid\bm\phi}[\nb_t(\bm\theta)]\right]}\class{myblue}{-\max_t \E_{\bm\theta}[\nb_t(\bm\theta)]}\]

\[\class{myblue}{\evppi = \E_{\bm\phi}\left[\class{red}{\max_t\E_{\bm\psi\mid\bm\phi}[\nb_t(\bm\theta)]}\right]-\max_t \E_{\bm\theta}[\nb_t(\bm\theta)]}\]

  • That is the difficult part!
    • Can do nested Monte Carlo, but it takes forever to get accurate results
    • Recent methods based on GAMs/Gaussian Process regression/spatial modelling are very efficient and quick!
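A minimal sketch of the regression-based (GAM) approach using mgcv, assuming `nb` is the matrix of net benefit draws and `phi` the corresponding PSA draws of the parameter of interest:

```r
library(mgcv)
# The fitted values of NB_t regressed on phi estimate E_{psi|phi}[NB_t]
fit <- apply(nb, 2, function(nb_t) fitted(gam(nb_t ~ s(phi))))
evppi <- mean(apply(fit, 1, max)) - max(colMeans(nb))
# Packages such as BCEA and voi wrap this and the GP/INLA-based methods
```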

Research priority

Expected value of sample information

Stolen from various presentations by Anna Heath

Research priority

Expected value of sample information (Jackson et al. 2021)

  • EVSI measures the value of reducing uncertainty by running a study of a given design

\[\evsi = \E_{\boldsymbol{X}} \left[ \max_t\ \color{blue}{\E_{\boldsymbol\theta \mid \boldsymbol{X}} \left[ \nb_t(\boldsymbol\theta) \right]} \right] - \max_{t}\color{magenta}{\E_{\boldsymbol\theta}\left[\nb_{t}(\boldsymbol\theta)\right]}\]


  • First term: value of the decision based on sample information (for a given study design)
  • Second term: value of the decision based on current information

  • Can compare the benefits and costs of a study with given design
    • To see if a proposed study is likely to be a good use of resources
    • To find the optimal study design

Research priority/study design

We can compute the Expected Net Benefit of Sampling \[ \text{ENBS} = \text{EVSI} - \text{cost of the study} \] to evaluate the actual economic value of a specific design
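As a final sketch (all names illustrative): if the EVSI is reported per person, it is usually scaled to the population expected to benefit over the decision horizon, discounted at an annual rate, before subtracting the study cost (the population scaling is an assumption here, not stated on the slide):

```r
# Effective (discounted) population over the decision horizon, in years
eff_pop <- pop * sum(1 / (1 + disc)^(0:(horizon - 1)))
enbs <- evsi * eff_pop - study_cost   # fund the study only if ENBS > 0
```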