Abstract
Haplotype data can be used to analyze the function of DNA, but collecting it directly requires significant effort. In practice, genotype data is usually collected instead, and the population haplotype inference (PHI) problem is then solved to infer haplotypes from the genotypes of a population. This paper investigates the PHI problem under the pure parsimony criterion (HIPP), which seeks the minimum number of distinct haplotypes that can resolve a given set of genotypes. We analyze the mathematical structure and properties of the HIPP problem, propose techniques to reduce the given genotype data to an equivalent instance of much smaller size, and analyze the relations among genotypes using a compatible graph. Based on the mathematical properties of the compatible graph, we propose a maximal clique heuristic to obtain an upper bound, and a new polynomial-sized integer linear programming formulation to obtain a lower bound, for the HIPP problem.
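As a rough illustration of the objects the abstract refers to, the sketch below checks whether a haplotype pair resolves a genotype and builds a compatible graph over a set of genotypes. It is not taken from the paper. It assumes the common encoding in which each site of a genotype is `0` or `1` (homozygous) or `2` (heterozygous), and it assumes that two genotypes are compatible when no site has `0` in one and `1` in the other, so that they could share a haplotype. The function names `resolves`, `compatible`, and `compatible_graph` are hypothetical, and the paper's exact definitions may differ.

```python
from itertools import combinations


def resolves(h1, h2, g):
    """Return True if haplotypes h1 and h2 (0/1 strings) explain genotype g.

    Assumed encoding: '0' or '1' means both haplotypes carry that allele,
    '2' means the two haplotypes differ at that site.
    """
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, s in zip(h1, h2, g):
        if s == "2":
            if a == b:  # heterozygous site needs two different alleles
                return False
        elif not (a == b == s):  # homozygous site needs both alleles equal to s
            return False
    return True


def compatible(g1, g2):
    """Assumed compatibility test: no site is '0' in one genotype and '1' in the other."""
    return all(a == b or "2" in (a, b) for a, b in zip(g1, g2))


def compatible_graph(genotypes):
    """Build an adjacency map: genotype index -> set of compatible genotype indices."""
    graph = {i: set() for i in range(len(genotypes))}
    for i, j in combinations(range(len(genotypes)), 2):
        if compatible(genotypes[i], genotypes[j]):
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Illustrative data only, not from the paper.
genotypes = ["0120", "0220", "1100"]
graph = compatible_graph(genotypes)
```

Under these assumptions, a clique in the graph is a group of pairwise compatible genotypes, which is the kind of structure a clique-based heuristic would examine when looking for haplotypes to reuse.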
Original language | English |
---|---|
Pages (from-to) | 120-125 |
Number of pages | 6 |
Journal | Mathematical Biosciences |
Volume | 231 |
Issue number | 2 |
Publication status | Published - 2011 Jun |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Modelling and Simulation
- General Biochemistry, Genetics and Molecular Biology
- General Immunology and Microbiology
- General Agricultural and Biological Sciences
- Applied Mathematics