TY - JOUR
T1 - Copy Number Variation Identification on 3,800 Alzheimer’s Disease Whole Genome Sequencing Data from the Alzheimer’s Disease Sequencing Project
AU - Lee, Wan Ping
AU - Tucci, Albert A.
AU - Conery, Mitchell
AU - Leung, Yuk Yee
AU - Kuzma, Amanda B.
AU - Valladares, Otto
AU - Chou, Yi Fan
AU - Lu, Wenbin
AU - Wang, Li San
AU - Schellenberg, Gerard D.
AU - Tzeng, Jung Ying
N1 - Publisher Copyright:
Copyright © 2021 Lee, Tucci, Conery, Leung, Kuzma, Valladares, Chou, Lu, Wang, Schellenberg and Tzeng.
PY - 2021/11/4
Y1 - 2021/11/4
N2 - Alzheimer’s Disease (AD) is a progressive neurologic disease and the most common form of dementia. While the causes of AD are not completely understood, genetics plays a key role in the etiology of AD, and thus finding genetic factors holds the potential to uncover novel AD mechanisms. For this study, we focus on copy number variation (CNV) detection and burden analysis. Leveraging whole-genome sequence (WGS) data released by Alzheimer’s Disease Sequencing Project (ADSP), we developed a scalable bioinformatics pipeline to identify CNVs. This pipeline was applied to 1,737 AD cases and 2,063 cognitively normal controls. As a result, we observed 237,306 and 42,767 deletions and duplications, respectively, with an average of 2,255 deletions and 1,820 duplications per subject. The burden tests show that Non-Hispanic-White cases on average have 16 more duplications than controls do (p-value 2e-6), and Hispanic cases have larger deletions than controls do (p-value 6.8e-5).
AB - Alzheimer’s Disease (AD) is a progressive neurologic disease and the most common form of dementia. While the causes of AD are not completely understood, genetics plays a key role in the etiology of AD, and thus finding genetic factors holds the potential to uncover novel AD mechanisms. For this study, we focus on copy number variation (CNV) detection and burden analysis. Leveraging whole-genome sequence (WGS) data released by Alzheimer’s Disease Sequencing Project (ADSP), we developed a scalable bioinformatics pipeline to identify CNVs. This pipeline was applied to 1,737 AD cases and 2,063 cognitively normal controls. As a result, we observed 237,306 and 42,767 deletions and duplications, respectively, with an average of 2,255 deletions and 1,820 duplications per subject. The burden tests show that Non-Hispanic-White cases on average have 16 more duplications than controls do (p-value 2e-6), and Hispanic cases have larger deletions than controls do (p-value 6.8e-5).
UR - http://www.scopus.com/inward/record.url?scp=85119401906&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119401906&partnerID=8YFLogxK
U2 - 10.3389/fgene.2021.752390
DO - 10.3389/fgene.2021.752390
M3 - Article
AN - SCOPUS:85119401906
SN - 1664-8021
VL - 12
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 752390
ER -