TB Research

High-Dimensional Disease Risk Score for Dealing With Residual Confounding Bias in Estimating Treatment Effects With a Survival Outcome.

Md Belal Hossain, Hubert Wong, Mohsen Sadatsafavi, Victoria J Cook, James C Johnston, Mohammad Ehsanul Karim

Pharmacoepidemiology and drug safety · 2025-07

Abstract

PURPOSE: Health administrative databases often lack information on important confounders, leading to residual confounding in the effect estimate. We aimed to explore the performance of the high-dimensional disease risk score (hdDRS) in addressing residual confounding bias when estimating causal effects with survival outcomes.

METHODS: We used health administrative data on 49 197 individuals in British Columbia to examine the relationship between tuberculosis infection and time to development of cardiovascular disease (CVD). We designed a plasmode simulation to explore the performance of eight hdDRS methods that differed in how the risk score model was fitted. For comparison, we also examined the high-dimensional propensity score (hdPS) and traditional regression adjustment. The target parameter was the log-hazard ratio (log-HR), with a true value of log(3).
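The performance measures named in this abstract (bias, standard error, and bias-eliminated coverage) can be illustrated with a minimal sketch. It assumes the definitions commonly used in the simulation-study literature, which the abstract itself does not spell out: bias is the mean estimate minus the true log-HR, and bias-eliminated coverage is the share of Wald-type 95% intervals that contain the mean estimate rather than the true value. The function name `performance` and the toy numbers are hypothetical and are not taken from the authors' R code.

```python
import math
import statistics

# True value of the target parameter, as stated in the abstract.
TRUE_LOG_HR = math.log(3)


def performance(estimates, std_errors, truth=TRUE_LOG_HR, z=1.96):
    """Summarise log-HR estimates from simulation replicates.

    Assumed definitions (not confirmed by the abstract):
      bias                     = mean(estimates) - truth
      empirical_se             = sample standard deviation of the estimates
      bias_eliminated_coverage = share of intervals est +/- z*se that
                                 contain the mean estimate
    """
    mean_est = statistics.fmean(estimates)
    covered = [
        abs(est - mean_est) <= z * se
        for est, se in zip(estimates, std_errors)
    ]
    return {
        "bias": mean_est - truth,
        "empirical_se": statistics.stdev(estimates),
        "bias_eliminated_coverage": sum(covered) / len(covered),
    }


# Hypothetical toy replicates, for illustration only.
toy_estimates = [1.00, 1.05, 0.95, 1.10, 0.90]
toy_std_errors = [0.05] * 5
toy_summary = performance(toy_estimates, toy_std_errors)
print(toy_summary)
```

Centering the intervals' target on the mean estimate separates interval-width problems from bias, which is why a method can be biased and still show nominal bias-eliminated coverage.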

RESULTS: In the presence of strong unmeasured confounding, the bias observed was -0.11 for the traditional method and -0.047 for the hdPS method. The bias ranged from -0.051 to -0.058 for hdDRS methods when risk score models were fitted to the full cohort and -0.045 to -0.049 when risk score models were fitted only to unexposed individuals. All methods showed comparable standard errors and nominal bias-eliminated coverage probabilities. With weak unmeasured confounding, hdDRS and hdPS produced approximately unbiased estimates. Our data analysis, after addressing residual confounding, revealed an 8%-11% higher CVD risk associated with tuberculosis infection.
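Because the bias values above are on the log-HR scale, it can help to translate them to the hazard ratio scale. The sketch below does this arithmetic under the stated true HR of 3, using the identity exp(log(3) + bias) = 3 * exp(bias). The helper name `implied_hr` and the dictionary labels are hypothetical. Note that exponentiating a mean log-HR gives a geometric-mean-type summary, not the arithmetic mean of the estimated hazard ratios.

```python
import math

TRUE_HR = 3.0  # exp(log(3)), the true hazard ratio in the simulation

# Bias values on the log-HR scale as reported in the abstract
# (strong unmeasured confounding); ranges are given as (low, high) pairs.
reported_bias = {
    "traditional": (-0.11, -0.11),
    "hdPS": (-0.047, -0.047),
    "hdDRS, full cohort": (-0.058, -0.051),
    "hdDRS, unexposed only": (-0.049, -0.045),
}


def implied_hr(bias, true_hr=TRUE_HR):
    """Hazard ratio implied by a given bias on the log-HR scale."""
    return true_hr * math.exp(bias)


for method, (low, high) in reported_bias.items():
    print(f"{method}: implied HR {implied_hr(low):.2f} to {implied_hr(high):.2f}")
```

On this scale, a bias of -0.11 corresponds to a hazard ratio of about 2.69 instead of 3, while a bias of -0.047 corresponds to about 2.86.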

CONCLUSIONS: Our findings support the use of selected hdDRS methods to address residual confounding bias when estimating treatment effects with survival outcomes. In particular, the hdDRS method using rate-based risk score modeling on unexposed individuals consistently exhibited the least bias. However, the hdPS method showed comparable performance across most evaluated scenarios. We share reproducible R code to facilitate researchers' adoption and further evaluation of these methods.

MeSH terms

  • Humans
  • Cardiovascular Diseases
  • Bias
  • Confounding Factors, Epidemiologic
  • Female
  • Tuberculosis
  • Male
  • British Columbia
  • Middle Aged
  • Propensity Score
  • Databases, Factual
  • Proportional Hazards Models
  • Adult
  • Treatment Outcome
  • Aged
  • Cohort Studies
  • Risk Factors
  • Risk Assessment
  • Computer Simulation