TY - JOUR
T1 - Length distributions of simple tandem repeats in genomes
AU - Nyeo, Su Long
AU - Yu, Jui Ping
N1 - Funding Information:
This work was supported by the National Science Council of the Republic of China under contract no. NSC94-2112-M006-015 and by the National Center for Theoretical Sciences of the Republic of China.
PY - 2007/9
Y1 - 2007/9
N2 - The length distributions of simple tandem repeats in the genomes of several organisms are evaluated and found to exhibit long-range correlations in repeats related to the A and T nucleotide bases for most eukaryotes. In particular, the length distributions of the mononucleotide A/T repeat units have longer tails than those of the C/G repeat units. Also, the length distributions of the dinucleotide repeat unit CG show a simple, rapid monotonic decrease, while those of the repeat units AT, AG and AC have complicated structures at larger repeat lengths, especially for human, mouse and rat chromosomes. These distributional behaviors are due to the CpG deficiency in different genomes with different methylation activities. In particular, methyltransferases in vertebrates appear to methylate specifically the cytosine in CpG dinucleotides, and the methylated cytosine is prone to mutate to thymine by spontaneous deamination. The dinucleotide CpG would thus gradually decay into TpG and CpA. In addition, there is a peak in the distributions of repeat unit A at a repeat-repeat separation of 153 nt for humans and chimpanzees. We show that the long-tail behavior of the mononucleotide repeat unit A and the peak at a repeat separation of 153 nt are due to the interspersed repetitive DNA sequences in humans and chimpanzees.
AB - The length distributions of simple tandem repeats in the genomes of several organisms are evaluated and found to exhibit long-range correlations in repeats related to the A and T nucleotide bases for most eukaryotes. In particular, the length distributions of the mononucleotide A/T repeat units have longer tails than those of the C/G repeat units. Also, the length distributions of the dinucleotide repeat unit CG show a simple, rapid monotonic decrease, while those of the repeat units AT, AG and AC have complicated structures at larger repeat lengths, especially for human, mouse and rat chromosomes. These distributional behaviors are due to the CpG deficiency in different genomes with different methylation activities. In particular, methyltransferases in vertebrates appear to methylate specifically the cytosine in CpG dinucleotides, and the methylated cytosine is prone to mutate to thymine by spontaneous deamination. The dinucleotide CpG would thus gradually decay into TpG and CpA. In addition, there is a peak in the distributions of repeat unit A at a repeat-repeat separation of 153 nt for humans and chimpanzees. We show that the long-tail behavior of the mononucleotide repeat unit A and the peak at a repeat separation of 153 nt are due to the interspersed repetitive DNA sequences in humans and chimpanzees.
UR - http://www.scopus.com/inward/record.url?scp=34748845667&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34748845667&partnerID=8YFLogxK
U2 - 10.1142/S0218339007002246
DO - 10.1142/S0218339007002246
M3 - Article
AN - SCOPUS:34748845667
SN - 0218-3390
VL - 15
SP - 299
EP - 312
JO - Journal of Biological Systems
JF - Journal of Biological Systems
IS - 3
ER -