Model for the distributions of k-mers in DNA sequences

Yaw Hwang Chen, Su Long Nyeo, Chiung Yuh Yeh

Research output: Contribution to journal › Article › peer-review



The evolutionary features based on the distributions of k-mers in the DNA sequences of various organisms are studied. The organisms are classified into three groups based on their evolutionary periods: (a) E. coli and T. pallidum, (b) yeast, zebrafish, A. thaliana, and fruit fly, and (c) mouse, chicken, and human. The distributions of 6-mers of these three groups are shown to be, respectively, (a) unimodal, (b) unimodal with peaks generally shifted to smaller frequencies of occurrence, and (c) bimodal. To describe the bimodal feature of the k-mer distributions of group (c), a model based on the cytosine-guanine "CG" content of the DNA sequences is introduced and shown to provide reasonably good agreement.
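As a rough illustration of what a "distribution of k-mers" means here, the sketch below counts overlapping k-mers in a sequence and then tabulates how many distinct k-mers share each frequency of occurrence. It is not the authors' code or their CG-based model. The function names (`count_kmers`, `occurrence_distribution`, `split_by_cg`) are hypothetical, and splitting k-mers by whether they contain the dinucleotide "CG" is only an assumed, simplified way to look at CG dependence.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def count_kmers(seq, k):
    """Count overlapping k-mers; windows with symbols outside ACGT are skipped."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set(ALPHABET):
            counts[kmer] += 1
    return counts


def occurrence_distribution(counts, k):
    """Map each frequency of occurrence to the number of distinct k-mers
    that occur that many times (k-mers that never occur count under 0)."""
    dist = Counter()
    for kmer in map("".join, product(ALPHABET, repeat=k)):
        dist[counts.get(kmer, 0)] += 1
    return dict(dist)


def split_by_cg(counts, k):
    """Assumed illustration only: separate the occurrence counts of k-mers
    that contain the dinucleotide 'CG' from those that do not."""
    with_cg, without_cg = [], []
    for kmer in map("".join, product(ALPHABET, repeat=k)):
        target = with_cg if "CG" in kmer else without_cg
        target.append(counts.get(kmer, 0))
    return with_cg, without_cg


if __name__ == "__main__":
    toy = "ACGTACGT"  # toy input, not real genomic data
    counts = count_kmers(toy, 2)
    print(occurrence_distribution(counts, 2))
    print(split_by_cg(counts, 2)[0])
```

For a real genome one would call `count_kmers(sequence, 6)` and plot the result of `occurrence_distribution` as a histogram. A bimodal shape of the kind reported for group (c) would then appear as two separate peaks, and comparing the two lists returned by `split_by_cg` is one simple way to check whether CG-containing 6-mers account for the low-frequency peak.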

Original language: English
Article number: 011908
Journal: Physical Review E - Statistical, Nonlinear, and Soft Matter Physics
Issue number: 1
Publication status: Published - 2005 Jul

All Science Journal Classification (ASJC) codes

  • Statistical and Nonlinear Physics
  • Statistics and Probability
  • Condensed Matter Physics

