Characterization and distribution of repetitive elements in association with genes in the human genome

Research output: Article

7 Citations (Scopus)

Abstract

Repetitive elements constitute more than 50% of the human genome. Recent studies imply that the complexity of living organisms is not simply a direct outcome of the number of coding sequences; repetitive elements, which do not encode proteins, may also play a significant role. Although scattered studies have shown that repetitive elements in the regulatory regions of a gene can control gene expression, no systematic survey has characterized the types and distribution of these elements in the human genome. Sequences from the 5′ and 3′ untranslated regions and from the regions upstream and downstream of each gene were downloaded from the Ensembl database. Repetitive elements in the neighborhood of each gene were identified and classified using cross-matching as implemented in RepeatMasker. The annotation and distribution of the distinct classes of repetitive elements associated with each gene were then collected, and genes were characterized by their associated repeat types using a systems biology program. We identified a total of 1,068,400 repetitive elements, belonging to 37 class families and 1,235 subclasses, that are associated with 33,761 genes and 57,365 transcripts. In addition, we found that tandem repeats are preferentially located proximal to the transcription start site (TSS) of genes, and that these genes are mainly involved in developmental processes. In contrast, interspersed repetitive elements tended to accumulate in regions distal to the TSS, and genes containing interspersed repeats were involved in catabolic/metabolic processes. Results from the distribution analysis were used to construct a gene-based repetitive element database (GBRED; http://www.binfo.ncku.edu.tw/GBRED/index.html). A user-friendly web interface provides information on the repetitive elements associated with any particular gene(s).
This is the first study to focus on gene-associated repetitive elements in the human genome. Our data show that distinct genes are associated with different kinds of repetitive elements and imply that such combinations may shape the function of these genes. Beyond the conventional view of these elements in genome evolution, the results of this study offer a systematic overview to facilitate exploitation of these elements in genome function.
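
The proximal/distal comparison described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 1,000 bp proximal cutoff, the grouping of RepeatMasker-style class labels into tandem and interspersed sets, the record layout, and the toy data.

```python
from collections import Counter
from typing import NamedTuple

# Assumed grouping of RepeatMasker-style class labels; the abstract does not
# state which classes the authors counted as tandem or interspersed.
TANDEM_CLASSES = {"Simple_repeat", "Satellite"}
INTERSPERSED_CLASSES = {"SINE", "LINE", "LTR", "DNA"}

PROXIMAL_BP = 1000  # assumed cutoff for "proximal to the TSS"


class Repeat(NamedTuple):
    gene: str
    repeat_class: str
    start: int  # inclusive genomic coordinate
    end: int    # inclusive genomic coordinate


def distance_to_tss(repeat: Repeat, tss: int) -> int:
    """Shortest distance in bp from the repeat to the TSS (0 if it overlaps)."""
    if repeat.start <= tss <= repeat.end:
        return 0
    return min(abs(repeat.start - tss), abs(repeat.end - tss))


def kind_of(repeat_class: str) -> str:
    """Map a class label to 'tandem', 'interspersed', or 'other'."""
    if repeat_class in TANDEM_CLASSES:
        return "tandem"
    if repeat_class in INTERSPERSED_CLASSES:
        return "interspersed"
    return "other"


def tally(repeats, tss_by_gene):
    """Count repeats per (kind, zone), where zone is 'proximal' or 'distal'."""
    counts = Counter()
    for r in repeats:
        d = distance_to_tss(r, tss_by_gene[r.gene])
        zone = "proximal" if d <= PROXIMAL_BP else "distal"
        counts[(kind_of(r.repeat_class), zone)] += 1
    return counts


# Made-up toy records, for illustration only.
tss_by_gene = {"GENE_A": 10_000}
repeats = [
    Repeat("GENE_A", "Simple_repeat", 9_900, 9_950),
    Repeat("GENE_A", "SINE", 4_000, 4_300),
    Repeat("GENE_A", "LINE", 9_500, 10_200),
]
counts = tally(repeats, tss_by_gene)
```

With real data, the records would come from parsed RepeatMasker output and the TSS positions from a gene annotation. Comparing the tallies across many genes is one simple way to look for the kind of positional preference the abstract reports.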

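Original language: English
Pages (from-to): 29-38
Number of pages: 10
Journal: Computational Biology and Chemistry
Volume: 57
DOI: 10.1016/j.compbiolchem.2015.02.007
Publication status: Published - 16 May 2015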

Fingerprint

Human Genome
Genome
Genes
Gene
Transcription Initiation Site
Human
Transcription
Databases
Interspersed Repetitive Sequences
Distinct
Tandem Repeat Sequences
Systems Biology
Nucleic Acid Regulatory Sequences
5' Untranslated Regions
3' Untranslated Regions
Exploitation
Gene Expression
Annotation
Coding

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Organic Chemistry
  • Computational Mathematics

Cite this

@article{b29a653633bd4173bbcbb38b7ad5ee40,
title = "Characterization and distribution of repetitive elements in association with genes in the human genome",
abstract = "Repetitive elements constitute more than 50{\%} of the human genome. Recent studies imply that the complexity of living organisms is not simply a direct outcome of the number of coding sequences; repetitive elements, which do not encode proteins, may also play a significant role. Although scattered studies have shown that repetitive elements in the regulatory regions of a gene can control gene expression, no systematic survey has characterized the types and distribution of these elements in the human genome. Sequences from the 5′ and 3′ untranslated regions and from the regions upstream and downstream of each gene were downloaded from the Ensembl database. Repetitive elements in the neighborhood of each gene were identified and classified using cross-matching as implemented in RepeatMasker. The annotation and distribution of the distinct classes of repetitive elements associated with each gene were then collected, and genes were characterized by their associated repeat types using a systems biology program. We identified a total of 1,068,400 repetitive elements, belonging to 37 class families and 1,235 subclasses, that are associated with 33,761 genes and 57,365 transcripts. In addition, we found that tandem repeats are preferentially located proximal to the transcription start site (TSS) of genes, and that these genes are mainly involved in developmental processes. In contrast, interspersed repetitive elements tended to accumulate in regions distal to the TSS, and genes containing interspersed repeats were involved in catabolic/metabolic processes. Results from the distribution analysis were used to construct a gene-based repetitive element database (GBRED; http://www.binfo.ncku.edu.tw/GBRED/index.html). A user-friendly web interface provides information on the repetitive elements associated with any particular gene(s).
This is the first study to focus on gene-associated repetitive elements in the human genome. Our data show that distinct genes are associated with different kinds of repetitive elements and imply that such combinations may shape the function of these genes. Beyond the conventional view of these elements in genome evolution, the results of this study offer a systematic overview to facilitate exploitation of these elements in genome function.",
author = "Liang, {Kai Chiang} and Ta-Chien Tseng and Shaw-Jenq Tsai and Hsiao-Fang Sun",
year = "2015",
month = "5",
day = "16",
doi = "10.1016/j.compbiolchem.2015.02.007",
language = "English",
volume = "57",
pages = "29--38",
journal = "Computational Biology and Chemistry",
issn = "1476-9271",
publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Characterization and distribution of repetitive elements in association with genes in the human genome

AU - Liang, Kai Chiang

AU - Tseng, Ta-Chien

AU - Tsai, Shaw-Jenq

AU - Sun, Hsiao-Fang

PY - 2015/5/16

Y1 - 2015/5/16

AB - Repetitive elements constitute more than 50% of the human genome. Recent studies imply that the complexity of living organisms is not simply a direct outcome of the number of coding sequences; repetitive elements, which do not encode proteins, may also play a significant role. Although scattered studies have shown that repetitive elements in the regulatory regions of a gene can control gene expression, no systematic survey has characterized the types and distribution of these elements in the human genome. Sequences from the 5′ and 3′ untranslated regions and from the regions upstream and downstream of each gene were downloaded from the Ensembl database. Repetitive elements in the neighborhood of each gene were identified and classified using cross-matching as implemented in RepeatMasker. The annotation and distribution of the distinct classes of repetitive elements associated with each gene were then collected, and genes were characterized by their associated repeat types using a systems biology program. We identified a total of 1,068,400 repetitive elements, belonging to 37 class families and 1,235 subclasses, that are associated with 33,761 genes and 57,365 transcripts. In addition, we found that tandem repeats are preferentially located proximal to the transcription start site (TSS) of genes, and that these genes are mainly involved in developmental processes. In contrast, interspersed repetitive elements tended to accumulate in regions distal to the TSS, and genes containing interspersed repeats were involved in catabolic/metabolic processes. Results from the distribution analysis were used to construct a gene-based repetitive element database (GBRED; http://www.binfo.ncku.edu.tw/GBRED/index.html). A user-friendly web interface provides information on the repetitive elements associated with any particular gene(s).
This is the first study to focus on gene-associated repetitive elements in the human genome. Our data show that distinct genes are associated with different kinds of repetitive elements and imply that such combinations may shape the function of these genes. Beyond the conventional view of these elements in genome evolution, the results of this study offer a systematic overview to facilitate exploitation of these elements in genome function.

UR - http://www.scopus.com/inward/record.url?scp=84939598978&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84939598978&partnerID=8YFLogxK

U2 - 10.1016/j.compbiolchem.2015.02.007

DO - 10.1016/j.compbiolchem.2015.02.007

M3 - Article

C2 - 25748288

AN - SCOPUS:84939598978

VL - 57

SP - 29

EP - 38

JO - Computational Biology and Chemistry

JF - Computational Biology and Chemistry

SN - 1476-9271

ER -