An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies

Dai Hongying*; Wu Guodong; Wu Michael; Zhi Degui

doi:10.1371/journal.pone.0152667

Abstract

Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker-single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency, $\lim_{\varepsilon \to 0} N^{(2)}/N^{(1)} = \phi_{12}(\theta)$, compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e. $\varepsilon \to 0$. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ($P_{N^{(i)}} < \varepsilon \to 0$). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol.
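The pathway-level combination step can be illustrated with a minimal sketch of the classical Lancaster procedure. The sketch assumes the gene-level p-values are independent, which is a simplification: the abstract describes a correlated Lancaster procedure, and the correlation adjustment is not reproduced here. The function name `lancaster_combine`, the example p-values, and the choice of weights are hypothetical and serve only as illustration. Each p-value is mapped to a chi-square quantile with its weight as the degrees of freedom, and the sum is referred to a chi-square distribution whose degrees of freedom equal the total weight.

```python
from scipy.stats import chi2


def lancaster_combine(p_values, weights):
    """Combine p-values with the classical (independent) Lancaster procedure.

    Assumption: the p-values are independent under the null hypothesis.
    Each weight acts as the chi-square degrees of freedom for its p-value.
    """
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    if any(w <= 0.0 for w in weights):
        raise ValueError("weights must be positive")

    # chi2.isf(p, df=w) is the chi-square quantile at upper-tail probability p,
    # so small p-values contribute large terms to the statistic.
    statistic = sum(chi2.isf(p, df=w) for p, w in zip(p_values, weights))

    # Under independence the statistic is chi-square with sum(weights) df.
    return float(chi2.sf(statistic, df=sum(weights)))


if __name__ == "__main__":
    # Hypothetical gene-level p-values (e.g. from SKAT) and illustrative weights.
    gene_p = [0.01, 0.20, 0.50]
    gene_w = [4.0, 2.0, 3.0]
    print(lancaster_combine(gene_p, gene_w))
```

With every weight set to 2 the procedure reduces to Fisher's combined probability test, which gives a quick sanity check for an implementation.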

Publication date: 2016-7-5

