Abstract
This article addresses the challenge of efficiently capturing a high proportion of true signals for subsequent data analyses when sample sizes are relatively limited with respect to data dimension. We propose the signal missing rate (SMR) as a new measure for false-negative control to account for the variability of false-negative proportion. Novel data-adaptive procedures are developed to control SMR without incurring many unnecessary false positives under dependence. We justify the efficiency and adaptivity of the proposed methods via theory and simulation. The proposed methods are applied to GWAS on human height to effectively remove irrelevant single nucleotide polymorphisms (SNPs) while retaining a high proportion of relevant SNPs for subsequent polygenic analysis. Supplementary materials for this article are available online.
Original language | English |
---|---|
Pages (from-to) | 1787-1799 |
Number of pages | 13 |
Journal | Journal of the American Statistical Association |
Volume | 114 |
Issue number | 528 |
DOIs | |
Publication status | Published - Oct 2 2019 |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty