A Unified Sparse Representation for Sequence Variant Identification for Complex Traits

Cao Shaolong; Qin Huaizhen; Deng Hong Wen; Wang Yu Ping*

doi:10.1002/gepi.21849

Abstract

Joint adjustment for cryptic relatedness and population structure is necessary to reduce bias in DNA sequence analysis; however, existing sparse regression methods model these two confounders separately. Incorporating prior biological information has great potential to enhance statistical power, but such information is often overlooked in existing sparse regression models. We developed a unified sparse regression (USR) to incorporate prior information and jointly adjust for cryptic relatedness, population structure, and other environmental covariates. Our USR models cryptic relatedness as a random effect and population structure as a fixed effect, and uses weighted penalties to incorporate prior knowledge. As demonstrated by extensive simulations, our USR algorithm can discover more true causal variants and maintain a lower false discovery rate than several commonly used feature selection methods. It can handle rare and common variants simultaneously. Applying our USR algorithm to DNA sequence data of Mexican Americans from GAW18, we replicated three hypertension pathways, demonstrating its effectiveness in identifying susceptibility genetic variants.
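
The weighted-penalty idea mentioned in the abstract can be illustrated with a generic weighted lasso. The sketch below is not the authors' USR algorithm: it is a minimal coordinate-descent solver written under the assumption that prior knowledge is encoded as per-variable penalty weights, with a weight of 0 leaving a column (for example a population-structure covariate) unpenalized, in the spirit of a fixed effect. The random effect for cryptic relatedness is omitted entirely, and the function names, weights, and simulated data are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Scalar soft-thresholding operator used by the lasso update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * sum_j weights[j] * |b_j|.

    weights[j] = 0 leaves column j unpenalized; a smaller weight makes a
    variable easier to select (assumed here to reflect stronger prior support).
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-column scale x_j'x_j / n
    resid = y.astype(float).copy()         # residual y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                   # skip all-zero columns
            # Correlation of column j with the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta


# Hypothetical toy data: column 0 plays the role of a covariate,
# columns 1..19 play the role of variants.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 20))
beta_true = np.zeros(20)
beta_true[[0, 1, 2]] = [1.0, 0.8, -0.6]
y_demo = X_demo @ beta_true + 0.5 * rng.standard_normal(200)

w_demo = np.ones(20)
w_demo[0] = 0.0        # covariate: never penalized
w_demo[[1, 2]] = 0.5   # variants with assumed prior support: lighter penalty
beta_hat = weighted_lasso(X_demo, y_demo, w_demo, lam=0.1)
print(np.nonzero(beta_hat)[0])
```

Setting a weight to 0 keeps that column in the model regardless of the penalty level, while weights below 1 lower the selection threshold for variables with prior support. How the weights would be derived from biological annotations is not specified in the abstract and is left open here.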

Publication date: 2014-12
Affiliation: Shanghai Center for Bioinformation Technology
