OrchidBase: A collection of sequences of the transcriptome derived from orchids

Chih Hsiung Fu, Yun Wen Chen, Yu Yun Hsiao, Zhao Jun Pan, Zhong Jian Liu, Yueh Min Huang, Wen Chieh Tsai, Hong Hwa Chen

Research output: Contribution to journal › Article

61 Citations (Scopus)

Abstract

Orchids are one of the most ecologically and evolutionarily significant plants, and the Orchidaceae is one of the most abundant families of the angiosperms. Genetic databases will be useful not only for gene discovery but also for future genomic annotation. For this purpose, OrchidBase was established from 37,979,342 sequence reads collected from 11 in-house Phalaenopsis orchid cDNA libraries. Among them, 41,310 expressed sequence tags (ESTs) were obtained by using Sanger sequencing, whereas 37,908,032 reads were obtained by using next-generation sequencing (NGS) including both Roche 454 and Solexa Illumina sequencers. These reads were assembled into 8,501 contigs and 76,116 singletons, resulting in 84,617 non-redundant transcribed sequences with an average length of 459 bp. The analysis pipeline of the database is an automated system written in Perl and C#, and consists of the following components: automatic pre-processing of EST reads, assembly of raw sequences, annotation of the assembled sequences and storage of the analyzed information in SQL databases. A web application was implemented with HTML and a Microsoft .NET Framework C# program for browsing and querying the database, creating dynamic web pages on the client side, analyzing gene ontology (GO) and mapping annotated enzymes to KEGG pathways. The online resources for putative annotation can be searched either by text or by using BLAST, and the results can be explored on the website and downloaded. Consequently, the establishment of OrchidBase will provide researchers with a high-quality genetic resource for data mining and facilitate efficient experimental studies on orchid biology and biotechnology. The OrchidBase database is freely available at http://lab.fhes.tn.edu.tw/est.

Original language: English
Pages (from-to): 238-243
Number of pages: 6
Journal: Plant and Cell Physiology
Volume: 52
Issue number: 2
Publication status: Published - 2011 Feb 1

All Science Journal Classification (ASJC) codes

  • Physiology
  • Plant Science
  • Cell Biology
