Abstract
The evolutionary features based on the distributions of k-mers in the DNA sequences of various organisms are studied. The organisms are classified into three groups based on their evolutionary periods: (a) E. coli and T. pallidum; (b) yeast, zebrafish, A. thaliana, and fruit fly; (c) mouse, chicken, and human. The distributions of 6-mers of these three groups are shown to be, respectively, (a) unimodal, (b) unimodal with peaks generally shifted to smaller frequencies of occurrence, and (c) bimodal. To describe the bimodal feature of the k-mer distributions of group (c), a model based on the cytosine-guanine "CG" content of the DNA sequences is introduced and shown to provide reasonably good agreement.
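As an illustration of what a distribution of k-mers means here, the following minimal sketch counts overlapping k-mers in a sequence and then tallies how many distinct k-mers share each frequency of occurrence. It is not the authors' code: the function names `kmer_counts` and `occurrence_distribution` are hypothetical, and the choices to count overlapping windows on a single strand and to skip windows containing non-ACGT characters are assumptions.

```python
from collections import Counter


def kmer_counts(seq, k=6):
    """Count every overlapping k-mer that consists only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def occurrence_distribution(counts, k=6):
    """Map each frequency of occurrence to the number of distinct k-mers with it."""
    dist = Counter(counts.values())
    # k-mers that never appear have frequency 0; there are 4**k possible k-mers.
    dist[0] = 4 ** k - len(counts)
    return dist
```

Plotting the result of `occurrence_distribution` for a whole genome (number of distinct k-mers against frequency of occurrence) gives the kind of histogram whose shape, one peak or two, is what the abstract calls unimodal or bimodal.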
Original language | English |
---|---|
Article number | 011908 |
Journal | Physical Review E - Statistical, Nonlinear, and Soft Matter Physics |
Volume | 72 |
Issue number | 1 |
Publication status | Published - 2005 Jul |
All Science Journal Classification (ASJC) codes
- Statistical and Nonlinear Physics
- Statistics and Probability
- Condensed Matter Physics