DeepVariant-on-Spark: Small-Scale Genome Analysis Using a Cloud-Based Computing Framework

Po Jung Huang, Jui Huan Chang, Hou Hsien Lin, Yu Xuan Li, Chi Ching Lee, Chung Tsai Su, Yun Lung Li, Ming Tai Chang, Sid Weng, Wei Hung Cheng, Cheng Hsun Chiu, Petrus Tang

Research output: Article, peer-reviewed

5 citations (Scopus)

Abstract

Although sequencing a human genome has become affordable, identifying genetic variants from whole-genome sequence data remains a hurdle for researchers without adequate computing equipment or bioinformatics support. GATK has long been the gold-standard method for identifying genetic variants and has been widely used in genome projects and population genetic studies for many years. More recently, the Google Brain team developed DeepVariant, a method that uses deep neural networks to build an image classification model for identifying genetic variants. However, the superior accuracy of DeepVariant comes at the cost of computational intensity, which largely constrains its applications. Accordingly, we present DeepVariant-on-Spark to optimize resource allocation, enable multi-GPU support, and accelerate the processing of the DeepVariant pipeline. To make DeepVariant-on-Spark more accessible, we have deployed it to the Google Cloud Platform (GCP). Following our instructions, users can deploy DeepVariant-on-Spark on the GCP within 20 minutes and start to analyze at least ten whole-genome sequencing datasets using the free credits provided by the GCP. DeepVariant-on-Spark is freely available for small-scale genome analysis using a cloud-based computing framework, which makes it suitable for pilot testing or preliminary studies while retaining the flexibility and scalability needed for large-scale sequencing projects.

Original language: English
Article number: 7231205
Journal: Computational and Mathematical Methods in Medicine
2020
Publication status: Published - 2020

All Science Journal Classification (ASJC) codes

  • Modeling and Simulation
  • General Biochemistry, Genetics and Molecular Biology
  • General Immunology and Microbiology
  • Applied Mathematics
