A Comprehensive Computational Method for Cancer Risk Predisposition with Germline Copy Number Variation in Population Scale

  • 郭 泓億

Student thesis: Master's Thesis


Cancer has been the leading cause of death in Taiwan since 1982. With advances in medical technology, researchers now try to understand cancer at the DNA level. Next Generation Sequencing (NGS) makes it possible to obtain large amounts of human Whole Genome Sequencing (WGS) data quickly, and WGS data contain an individual's complete genetic information. Apart from environmental factors, cancer is generally considered to result from the accumulation of DNA sequence mutations. This research therefore explores whether there are fundamental differences between the genomes of cancer patients and those of healthy people, and whether these differences cause mutations to accumulate with age and lead to cancer. We also explore the relation between cancer predisposition and family cancer history.

The human genome, encoded as DNA within the 23 chromosome pairs, contains about three billion base pairs. There are four types of bases: thymine (T), adenine (A), cytosine (C), and guanine (G). Some of these base pairs make up about twenty to twenty-five thousand genes. Most research on human heredity has concentrated on single nucleotide polymorphisms (SNPs), which are variations in a single nucleotide of DNA. In contrast to such small-scale mutations, large-scale structural variation has recently received more attention. Copy number variation (CNV) is a common type of structural variation that affects thousands of base pairs or more and results in large-scale variation in chromosomes. It is not only a type of cumulative genetic variation; many studies have also indicated that it is highly correlated with disease.

This research designed a copy number comprehensive analysis system (CNCAS) to identify the crucial copy number variation genes. First, we used NGS to obtain WGS data from cancer patients and healthy people, focusing on the Taiwanese population. The cancer types include colorectal cancer, ovarian cancer, and endometrial cancer. We then analyzed large-scale CNVs in the WGS data, using the gene as the unit of analysis to judge mutation status. We used different methods and combined several CNV databases to explore whether the sequences of the two groups differ. We mapped the mutations to genes and identified the top-ranked ("winner") genes. We also investigated clinical data to establish a complete WGS analysis platform. Finally, we found several cancer risk genes and certain patterns in the cancer sequences. We hope the platform can be applied to clinical therapy, where it could assist doctors in making treatment decisions for cancer patients and reduce the threat of cancer.
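
The gene-level comparison described above can be illustrated with a minimal sketch. It is not the CNCAS implementation: the function names, the toy coordinates, the gene labels, and the ranking by difference in carrier frequency are all assumptions made for illustration. It maps each sample's CNV segments to genes by interval overlap and then ranks genes by how much more often they are hit in patients than in healthy controls.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def genes_hit(cnvs, genes):
    """Names of genes overlapped by any CNV segment of one sample."""
    hit = set()
    for chrom, start, end in cnvs:
        for name, g_chrom, g_start, g_end in genes:
            if chrom == g_chrom and overlaps(start, end, g_start, g_end):
                hit.add(name)
    return hit


def carrier_frequency(samples, genes):
    """Fraction of samples that carry a CNV in each gene."""
    counts = {name: 0 for name, *_ in genes}
    for cnvs in samples:
        for name in genes_hit(cnvs, genes):
            counts[name] += 1
    n = len(samples)
    return {name: (c / n if n else 0.0) for name, c in counts.items()}


def rank_genes(cases, controls, genes):
    """Sort genes by (case frequency - control frequency), largest first."""
    case_freq = carrier_frequency(cases, genes)
    ctrl_freq = carrier_frequency(controls, genes)
    diff = {g: case_freq[g] - ctrl_freq[g] for g in case_freq}
    return sorted(diff.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: (name, chromosome, start, end) for genes,
# and one list of (chromosome, start, end) CNV segments per sample.
genes = [("GENE_A", "chr1", 100, 200), ("GENE_B", "chr2", 500, 900)]
cases = [[("chr1", 150, 300)], [("chr1", 50, 120), ("chr2", 1000, 1100)]]
controls = [[("chr2", 600, 700)], []]

ranked = rank_genes(cases, controls, genes)
print(ranked)
```

A real analysis would replace the frequency difference with a proper statistical test and correct for multiple comparisons; the sketch only shows the idea of using the gene as the unit of analysis.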
Date of Award: 2017 Aug 11
Original language: English
Supervisor: Jung-Hsien Chiang
