CSAR: A contig scaffolding tool using algebraic rearrangements

Kun Tze Chen, Chia Liang Liu, Shang Hao Huang, Hsin Ting Shen, Yi Kung Shieh, Hsien-Tai Chiu, Chin Lung Lu

Research output: Contribution to journal, Article

3 Citations (Scopus)

Abstract

Advances in next-generation sequencing have generated massive amounts of short reads. However, assembling genome sequences from short reads remains a challenging task. Due to errors in reads and large repeats in the genome, many current assembly tools produce only collections of contigs whose relative positions and orientations along the genome being sequenced are unknown. To address this issue, a scaffolding process that orders and orients the contigs of a draft genome is needed to complete the genome sequence. In this work, we propose a new scaffolding tool called CSAR that can efficiently and more accurately order and orient the contigs of a given draft genome based on a reference genome of a related organism. In particular, the reference genome required by CSAR need not be complete in sequence. Our experimental results on real datasets show that CSAR outperforms similar tools such as Projector2, OSLay and Mauve Aligner in terms of average sensitivity, precision, F-score, genome coverage, NGA50 and running time.
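The abstract compares tools by sensitivity, precision and F-score. As a rough illustration of how these three quantities relate, the sketch below scores a predicted contig order by the joins (pairs of adjacent contigs) it shares with a reference order. The join-based counting, the function names and the toy contig labels are assumptions made for this example and are not taken from the paper; contig orientation is ignored. Only the definition of the F-score as the harmonic mean of precision and sensitivity is standard.

```python
def joins(scaffold):
    """Return the set of unordered adjacent contig pairs in a scaffold."""
    return {frozenset(pair) for pair in zip(scaffold, scaffold[1:])}


def score(predicted, reference):
    """Compute (sensitivity, precision, F-score) over shared contig joins.

    Assumption: a predicted join counts as correct if the same two contigs
    are adjacent in the reference order, regardless of orientation.
    """
    pred, ref = joins(predicted), joins(reference)
    correct = len(pred & ref)
    sensitivity = correct / len(ref) if ref else 0.0
    precision = correct / len(pred) if pred else 0.0
    total = sensitivity + precision
    # F-score is the harmonic mean of precision and sensitivity.
    f_score = 2 * sensitivity * precision / total if total else 0.0
    return sensitivity, precision, f_score


# Hypothetical toy data: c3 and c4 are swapped in the prediction.
reference = ["c1", "c2", "c3", "c4", "c5"]
predicted = ["c1", "c2", "c4", "c3", "c5"]
print(score(predicted, reference))
```

In this toy case two of the four reference joins are recovered and two of the four predicted joins are correct, so all three values come out equal.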

Original language: English
Pages (from-to): 109-111
Number of pages: 3
Journal: Bioinformatics
Volume: 34
Issue number: 1
DOI: https://doi.org/10.1093/bioinformatics/btx543
Publication status: Published - 2018 Jan 1

Fingerprint

Rearrangement
Genome
Genes
Sequencing
Coverage
Experimental Results

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Chen, K. T., Liu, C. L., Huang, S. H., Shen, H. T., Shieh, Y. K., Chiu, H-T., & Lu, C. L. (2018). CSAR: A contig scaffolding tool using algebraic rearrangements. Bioinformatics, 34(1), 109-111. https://doi.org/10.1093/bioinformatics/btx543
Chen, Kun Tze; Liu, Chia Liang; Huang, Shang Hao; Shen, Hsin Ting; Shieh, Yi Kung; Chiu, Hsien-Tai; Lu, Chin Lung. / CSAR: A contig scaffolding tool using algebraic rearrangements. In: Bioinformatics. 2018; Vol. 34, No. 1. pp. 109-111.
@article{b5f86de63eab4a62911be336c12d24ed,
title = "CSAR: A contig scaffolding tool using algebraic rearrangements",
author = "Chen, {Kun Tze} and Liu, {Chia Liang} and Huang, {Shang Hao} and Shen, {Hsin Ting} and Shieh, {Yi Kung} and Chiu, {Hsien-Tai} and Lu, {Chin Lung}",
year = "2018",
month = "1",
day = "1",
doi = "10.1093/bioinformatics/btx543",
language = "English",
volume = "34",
pages = "109--111",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "1",

}

Chen, KT, Liu, CL, Huang, SH, Shen, HT, Shieh, YK, Chiu, H-T & Lu, CL 2018, 'CSAR: A contig scaffolding tool using algebraic rearrangements', Bioinformatics, vol. 34, no. 1, pp. 109-111. https://doi.org/10.1093/bioinformatics/btx543

TY - JOUR
T1 - CSAR
T2 - A contig scaffolding tool using algebraic rearrangements
AU - Chen, Kun Tze
AU - Liu, Chia Liang
AU - Huang, Shang Hao
AU - Shen, Hsin Ting
AU - Shieh, Yi Kung
AU - Chiu, Hsien-Tai
AU - Lu, Chin Lung
PY - 2018/1/1
Y1 - 2018/1/1
UR - http://www.scopus.com/inward/record.url?scp=85040034372&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85040034372&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btx543
DO - 10.1093/bioinformatics/btx543
M3 - Article
C2 - 28968788
AN - SCOPUS:85040034372
VL - 34
SP - 109
EP - 111
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 1
ER -