TY - JOUR
T1 - Mathematical properties and bounds on haplotyping populations by pure parsimony
AU - Wang, I. Lin
AU - Chang, Chia Yuan
N1 - Funding Information:
I-Lin Wang was partly supported by the National Science Council of Taiwan under Grant NSC97-2221-E-006-173 .
PY - 2011/6
Y1 - 2011/6
N2 - Although the haplotype data can be used to analyze the function of DNA, due to the significant efforts required in collecting the haplotype data, usually the genotype data is collected and then the population haplotype inference (PHI) problem is solved to infer haplotype data from genotype data for a population. This paper investigates the PHI problem based on the pure parsimony criterion (HIPP), which seeks the minimum number of distinct haplotypes to infer a given genotype data. We analyze the mathematical structure and properties for the HIPP problem, propose techniques to reduce the given genotype data into an equivalent one of much smaller size, and analyze the relations of genotype data using a compatible graph. Based on the mathematical properties in the compatible graph, we propose a maximal clique heuristic to obtain an upper bound, and a new polynomial-sized integer linear programming formulation to obtain a lower bound for the HIPP problem.
AB - Although the haplotype data can be used to analyze the function of DNA, due to the significant efforts required in collecting the haplotype data, usually the genotype data is collected and then the population haplotype inference (PHI) problem is solved to infer haplotype data from genotype data for a population. This paper investigates the PHI problem based on the pure parsimony criterion (HIPP), which seeks the minimum number of distinct haplotypes to infer a given genotype data. We analyze the mathematical structure and properties for the HIPP problem, propose techniques to reduce the given genotype data into an equivalent one of much smaller size, and analyze the relations of genotype data using a compatible graph. Based on the mathematical properties in the compatible graph, we propose a maximal clique heuristic to obtain an upper bound, and a new polynomial-sized integer linear programming formulation to obtain a lower bound for the HIPP problem.
UR - http://www.scopus.com/inward/record.url?scp=79954667306&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79954667306&partnerID=8YFLogxK
U2 - 10.1016/j.mbs.2011.02.008
DO - 10.1016/j.mbs.2011.02.008
M3 - Article
C2 - 21354185
AN - SCOPUS:79954667306
SN - 0025-5564
VL - 231
SP - 120
EP - 125
JO - Mathematical Biosciences
JF - Mathematical Biosciences
IS - 2
ER -