TY - JOUR
T1 - The analysis of inconsistencies between cytogenetic annotations and sequence mapping by defining the imprecision zones of cytogenetic banding
AU - Yen, Kuo Ho
AU - Ho, Chung Liang
AU - Lee, Chiang
N1 - Funding Information:
Department of Health, Taiwan (DOH-TD-B-111-004 to C.-Y. Chou); National Science Council, Taiwan (NSC94-2320-B-006-002 to C.-L.H.).
PY - 2009
Y1 - 2009
N2 - Motivation: In current databases, many genes have inconsistent mapping positions between their cytogenetic annotations and their sequence map positions. However, not all inconsistencies are alike. Some are problematic and should be corrected, while others result from the imprecise nature of chromosomal banding and may be tolerable. It is therefore important to stratify cytogenetic position information into confidence groups that account for the imprecision of cytogenetic banding. Results: When cytogenetic annotations are plotted against sequence map positions on a 2D plane, genes with consistent positions tend to form a compact linear distribution, while genes with inconsistent positions are more scattered. The areas where these two groups overlap are defined as the tolerable imprecision zones by linear regression and distance analysis. The system was implemented using sequence information from NCBI Map Viewer Build 36.3 and cytogenetic annotations from NCBI Entrez Gene. Each gene's position information is classified into one of five confidence groups: inconsistent-intolerable, inconsistent-tolerable, consistent-imprecise, consistent-precise and consistent-rough. With NCBI Map Viewer Build 36.3 and NCBI Entrez Gene, the percentages of these confidence groups are 1.4%, 7.0%, 54.0%, 35.4% and 2.2%, respectively. With NCBI Map Viewer Build 36.3 and NCBI Online Mendelian Inheritance in Man (OMIM), the percentages are 3.7%, 16.9%, 49.0%, 19.0% and 11.4%, respectively. These two results were combined to construct a confidence table of genes' position information.
AB - Motivation: In current databases, many genes have inconsistent mapping positions between their cytogenetic annotations and their sequence map positions. However, not all inconsistencies are alike. Some are problematic and should be corrected, while others result from the imprecise nature of chromosomal banding and may be tolerable. It is therefore important to stratify cytogenetic position information into confidence groups that account for the imprecision of cytogenetic banding. Results: When cytogenetic annotations are plotted against sequence map positions on a 2D plane, genes with consistent positions tend to form a compact linear distribution, while genes with inconsistent positions are more scattered. The areas where these two groups overlap are defined as the tolerable imprecision zones by linear regression and distance analysis. The system was implemented using sequence information from NCBI Map Viewer Build 36.3 and cytogenetic annotations from NCBI Entrez Gene. Each gene's position information is classified into one of five confidence groups: inconsistent-intolerable, inconsistent-tolerable, consistent-imprecise, consistent-precise and consistent-rough. With NCBI Map Viewer Build 36.3 and NCBI Entrez Gene, the percentages of these confidence groups are 1.4%, 7.0%, 54.0%, 35.4% and 2.2%, respectively. With NCBI Map Viewer Build 36.3 and NCBI Online Mendelian Inheritance in Man (OMIM), the percentages are 3.7%, 16.9%, 49.0%, 19.0% and 11.4%, respectively. These two results were combined to construct a confidence table of genes' position information.
UR - http://www.scopus.com/inward/record.url?scp=63549121122&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=63549121122&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btn649
DO - 10.1093/bioinformatics/btn649
M3 - Article
C2 - 19098301
AN - SCOPUS:63549121122
SN - 1367-4803
VL - 25
SP - 845
EP - 852
JO - Bioinformatics
JF - Bioinformatics
IS - 7
ER -