"Imputation-based Q-learning for Optimizing Dynamic Treatment Regimes with Time-to-Event Data"

Public Health/Biostatistics
Abdus Wahed (advisor)

Abstract: Treatment of chronic diseases is often a multi-stage decision process in which the treatment course is modified adaptively to address a patient’s evolving health conditions. At each treatment decision point, health care providers choose treatments based on patient characteristics, treatment history, and health conditions. Such sequentially adaptive medical decision-making algorithms are known as dynamic treatment regimes (DTRs). An important clinical question is how to determine the optimal DTR, or equivalently, the sequence of treatment decisions that yields the best expected outcome. In this dissertation, we propose a new statistical method to estimate the optimal DTR with right-censored survival outcomes and extend this method to the competing risks framework. In the first project, we propose an imputation-based Q-learning (IQ-learning) method for optimizing DTRs in multi-stage decision making. A semiparametric Cox proportional hazards model is employed to estimate the optimal treatment rule for each stage, and then weighted hot-deck multiple imputation (MI) and direct-draw MI are used to predict optimal potential survival times. We extend the proposed optimal DTR estimation methods to an incomplete-data setting, where missing data are handled using inverse probability weighting and MI. We investigate the performance of IQ-learning via extensive simulations and show that it is robust to model misspecification, that, unlike parametric models, it imputes only plausible potential survival times, and that it provides more flexibility in terms of baseline hazard shape. We demonstrate IQ-learning by developing an optimal DTR for leukemia treatment based on a randomized trial with observational follow-up. In the second project, we extend the proposed IQ-learning method to identify DTRs that optimize right-censored competing risks outcomes. As in IQ-learning, in the optimization step we use a Cox model for the cause-specific hazard function to estimate the optimal treatment rule for each stage. Then, in the post-optimization prediction step, we propose three hot-deck MI-based methods to predict the counterfactual competing risk times for those who did not receive their optimal treatments. The performance of the proposed method is evaluated through simulation studies.

Public health significance: The statistical methods proposed in this dissertation contribute to the field of dynamic treatment regime optimization, which has a substantial impact on precision medicine, especially for the treatment of chronic diseases such as cancer, AIDS, and depression. The estimated individualized treatment rules can guide and shape health policies, which will ultimately improve overall public health and safety.

Event Details

Please let us know if you require an accommodation in order to participate in this event. Accommodations may include live captioning, ASL interpreters, and/or captioned media and accessible documents from recorded events. We recommend submitting requests at least 5 days in advance.


Register for Zoom information. 

University of Pittsburgh