
"Methods for Combining Frequent or Sparse Signals in Omics Application" - Public Health/Biostatistics,
Committee:
George Tseng (advisor and committee chair)
Abstract:
Combining p-values to aggregate multiple effects has been of long-standing interest. This dissertation focuses on three types of p-value combination scenarios for omics data applications, which are discussed in Chapters 2-4.
Chapter 2 considers combining independent and non-sparse signals in a small group of p-values, a scenario related to meta-analysis, but one in which the number of true signals among the p-values and their strengths can vary with heterogeneity. We propose an ensemble method, the Fisher ensemble (FE), that combines the existing Fisher and AFp methods. We show that FE achieves asymptotic Bahadur optimality and integrates the strengths of Fisher and AFp. We extend the Fisher ensemble to a variant with emphasized power for concordant effect size directions. A transcriptomic meta-analysis of the AGEMAP dataset shows the advantages of the proposed Fisher ensemble methods.
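For readers unfamiliar with the baseline, the classical Fisher method that the ensemble builds on can be sketched as follows. This is only the textbook procedure for independent p-values, not the proposed FE or the AFp method; the function name `fisher_combine` is illustrative and does not come from the dissertation.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with the classical Fisher method.

    Under the global null, T = -2 * sum(log p_i) follows a chi-square
    distribution with 2n degrees of freedom.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")

    n = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is T / 2

    # For even degrees of freedom 2n the chi-square survival function has
    # a closed form: exp(-T/2) * sum_{k=0}^{n-1} (T/2)^k / k!
    term = 1.0
    total = 1.0
    for k in range(1, n):
        term *= half_stat / k
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Two moderately small p-values reinforce each other.
    print(fisher_combine([0.05, 0.05]))
```

The closed-form survival function keeps the sketch free of third-party dependencies; in practice a statistics library would normally be used instead.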
Chapter 3 considers combining independent, weak, and sparse signals in a large group of p-values. We propose a simple yet truly adaptive modified Fisher's method for detecting sparse and heterogeneous signals. Our method achieves the optimal separating rate in a large-scale setup with sparse and heterogeneous signals. As a practical consideration, we also investigate the robustness of our method when the p-values are not exact and show that it still attains the optimal separating rate under mild conditions. The proposed method is applied to a neuroticism GWAS for a pathway-based association study.
Chapter 4 considers combining dependent, weak, and sparse signals in a large group of p-values. We study a family of p-value combination tests based on heavy-tailed distribution transformations. We derive the conditions under which a method in the family is robust to the unknown dependency structure and attains the optimal detection boundary for detecting weak and sparse signals. Only a class of tests equivalent to the Cauchy test possesses this robustness property. Applying our theoretical findings, we suggest a truncated Cauchy test that belongs to this class and addresses the large negative penalty issue of the Cauchy test. A neuroticism GWAS application demonstrates the theoretical findings and the advantages of the truncated Cauchy method.
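As a point of reference, the standard Cauchy combination test can be sketched as below. This is the untruncated test with equal weights, not the truncated variant proposed in the dissertation, whose definition is not given in this abstract. The function name `cauchy_combine` is illustrative. The last example reflects one reading of the negative penalty mentioned above, namely that a p-value close to 1 contributes a very large negative term; that interpretation is an assumption.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Standard Cauchy combination test (equal weights by default).

    Each p-value is mapped to a Cauchy-distributed score
    tan((0.5 - p) * pi); the weighted sum is compared against the
    standard Cauchy distribution.
    """
    n = len(p_values)
    if n == 0 or any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1)")
    if weights is None:
        weights = [1.0 / n] * n  # assumed: non-negative weights summing to 1

    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values))
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # One strong signal among nulls dominates the statistic.
    print(cauchy_combine([1e-6, 0.5]))
    # A p-value near 1 adds a huge negative term and cancels the signal.
    print(cauchy_combine([1e-6, 1 - 1e-6]))
```

In the second call the strong signal is almost exactly offset by the p-value near 1, so the combined p-value ends up close to 0.5.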
Contribution to public health:
Omics data integration plays a crucial role in contemporary biomedical research, leading to disease mechanism understanding and biomarker detection. As essential statistical methods for aggregating information from multiple sources, p-value combination approaches have been widely employed in omics data analysis. The methods developed in the three chapters of this thesis not only build a solid theoretical foundation but also provide practical, data-driven methodologies for integrative omics data analysis to advance disease understanding and public health.
Tuesday, April 4 at 9:30 a.m.
Public Health, 7139
130 Desoto Street, Pittsburgh, 15261