TY - JOUR
T1 - The discrepancy among single nucleotide variants detected by DNA and RNA high throughput sequencing data
AU - Guo, Yan
AU - Zhao, Shilin
AU - Sheng, Quanhu
AU - Samuels, David C.
AU - Shyr, Yu
N1 - Publisher Copyright:
© 2017 The Author(s).
PY - 2017/10/3
Y1 - 2017/10/3
N2 - Background: High throughput sequencing technology enables both the human genome and transcriptome to be screened at the single nucleotide resolution. Tools have been developed to infer single nucleotide variants (SNVs) from both DNA and RNA sequencing data. To evaluate how much difference can be expected between DNA and RNA sequencing data, and among tissue sources, we designed a study to examine the single nucleotide difference among five sources of high throughput sequencing data generated from the same individual, including exome sequencing from blood, tumor and adjacent normal tissue, and RNAseq from tumor and adjacent normal tissue. Results: Through careful quality control and analysis of the SNVs, we found little difference between DNA-DNA pairs (1%-2%). However, between DNA-RNA pairs, SNV differences ranged anywhere from 10% to 20%. Conclusions: Only a small portion of these differences can be explained by RNA editing. Instead, the majority of the DNA-RNA differences should be attributed to technical errors from sequencing and post-processing of RNAseq data. Our analysis results suggest that SNV detection using RNAseq is subject to high false positive rates.
AB - Background: High throughput sequencing technology enables both the human genome and transcriptome to be screened at the single nucleotide resolution. Tools have been developed to infer single nucleotide variants (SNVs) from both DNA and RNA sequencing data. To evaluate how much difference can be expected between DNA and RNA sequencing data, and among tissue sources, we designed a study to examine the single nucleotide difference among five sources of high throughput sequencing data generated from the same individual, including exome sequencing from blood, tumor and adjacent normal tissue, and RNAseq from tumor and adjacent normal tissue. Results: Through careful quality control and analysis of the SNVs, we found little difference between DNA-DNA pairs (1%-2%). However, between DNA-RNA pairs, SNV differences ranged anywhere from 10% to 20%. Conclusions: Only a small portion of these differences can be explained by RNA editing. Instead, the majority of the DNA-RNA differences should be attributed to technical errors from sequencing and post-processing of RNAseq data. Our analysis results suggest that SNV detection using RNAseq is subject to high false positive rates.
UR - http://www.scopus.com/inward/record.url?scp=85030317591&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85030317591&partnerID=8YFLogxK
U2 - 10.1186/s12864-017-4022-x
DO - 10.1186/s12864-017-4022-x
M3 - Article
C2 - 28984205
AN - SCOPUS:85030317591
SN - 1471-2164
VL - 18
JO - BMC Genomics
JF - BMC Genomics
M1 - 690
ER -