About this Event
230 S Bouquet St, Pittsburgh, PA 15213
Dissertation defense titled "Two Statistical Methods for High-Dimensional Data: Model-Free Inference in Protein Mutation Studies and Robust Distance Correlation for Variable Screening". Data with far more features than samples arise frequently in modern statistical applications, ranging from genomic research and biomedical imaging to signal processing. In high-dimensional settings, statistical inference and variable selection are essential for extracting meaningful scientific insights. This thesis presents two methodological contributions aimed at addressing these challenges in distinct contexts.
The first work focuses on protein contact prediction. We propose a novel framework that recasts the task as a statistical hypothesis testing problem within the context of partial correlation graphs for categorical variables.
In the second work, a robust version of the distance correlation measure is presented, designed for variable screening in ultrahigh-dimensional data. This method addresses both model misspecification and tail robustness, and enjoys the so-called sure screening property. To further enhance its performance, we develop a new false discovery rate (FDR) control procedure based on the Reflection via Data Splitting (REDS) approach.
Committee Chair and Advisor: Dr. Zhao Ren
Please let us know if you require an accommodation in order to participate in this event. Accommodations may include live captioning, ASL interpreters, and/or captioned media and accessible documents from recorded events. We recommend making requests at least 5 days in advance.