Tsukuba BB: A branch and bound algorithm for local multiple sequence alignment

研究成果: Conference contribution

2 引文 (Scopus)

摘要

In this paper we present a branch and bound algorithm for local gapless multiple sequence alignment (motif alignment) and its implementation. This is the first program to exploit the fact that the motif alignment problem is easier for short motifs. Indeed for a fixed motif width the running time of the algorithm is asymptotically linear in the size of the input. We tested the performance of the program on a dataset of 300 E.coli promoter sequences. For a motif width of 4 the optimal alignment of the entire set of sequences can be found. For the more natural motif width of 6 the program can align 19 sequences of length 100; more than twice the number of sequences which can be aligned by the best previous exact algorithm. The algorithm can relax the constraint of requiring each sequence to be aligned, and align 100 of the 300 promoter sequences with a motif width of 6. We also compare the effectiveness of the Gibbs sampling and beam search heuristics on this problem and show that in some cases our branch and bound algorithm can find the optimal solution, with proof of optimality, when those heuristics fail to find the optimal solution.

原文English
主出版物標題Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings
編輯David Sankoff, Raffaele Giancarlo
發行者Springer Verlag
頁面84-98
頁數15
ISBN(電子)3540676333, 9783540676331
出版狀態Published - 2000 一月 1
事件11th Annual Symposium on Combinatorial Pattern Matching, CPM 2000 - Montreal, Canada
持續時間: 2000 六月 212000 六月 23

出版系列

名字Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
1848
ISSN(列印)0302-9743
ISSN(電子)1611-3349

Other

Other11th Annual Symposium on Combinatorial Pattern Matching, CPM 2000
國家Canada
城市Montreal
期間00-06-2100-06-23

指紋

Multiple Sequence Alignment
Branch and Bound Algorithm
Alignment
Promoter
Optimal Solution
Heuristics
Beam Search
Asymptotically Linear
Gibbs Sampling
Escherichia coli
Exact Algorithms
Escherichia Coli
Optimality
Sampling
Entire

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

引用此文

Paul, B. H. I. (2000). Tsukuba BB: A branch and bound algorithm for local multiple sequence alignment. 於 D. Sankoff, & R. Giancarlo (編輯), Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings (頁 84-98). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 卷 1848). Springer Verlag.
Paul, Brice Horton Ii. / Tsukuba BB : A branch and bound algorithm for local multiple sequence alignment. Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings. 編輯 / David Sankoff ; Raffaele Giancarlo. Springer Verlag, 2000. 頁 84-98 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{14c1240ae89642fea79a29d4caa9c5d8,
title = "Tsukuba BB: A branch and bound algorithm for local multiple sequence alignment",
abstract = "In this paper we present a branch and bound algorithm for local gapless multiple sequence alignment (motif alignment) and its implementation. This is the first program to exploit the fact that the motif alignment problem is easier for short motifs. Indeed for a fixed motif width the running time of the algorithm is asymptotically linear in the size of the input. We tested the performance of the program on a dataset of 300 E.coli promoter sequences. For a motif width of 4 the optimal alignment of the entire set of sequences can be found. For the more natural motif width of 6 the program can align 19 sequences of length 100; more than twice the number of sequences which can be aligned by the best previous exact algorithm. The algorithm can relax the constraint of requiring each sequence to be aligned, and align 100 of the 300 promoter sequences with a motif width of 6. We also compare the effectiveness of the Gibbs sampling and beam search heuristics on this problem and show that in some cases our branch and bound algorithm can find the optimal solution, with proof of optimality, when those heuristics fail to find the optimal solution.",
author = "Paul, {Brice Horton Ii}",
year = "2000",
month = "1",
day = "1",
language = "English",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "84--98",
editor = "David Sankoff and Raffaele Giancarlo",
booktitle = "Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings",
address = "Germany",

}

Paul, BHI 2000, Tsukuba BB: A branch and bound algorithm for local multiple sequence alignment. 於 D Sankoff & R Giancarlo (編輯), Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 卷 1848, Springer Verlag, 頁 84-98, 11th Annual Symposium on Combinatorial Pattern Matching, CPM 2000, Montreal, Canada, 00-06-21.

Tsukuba BB : A branch and bound algorithm for local multiple sequence alignment. / Paul, Brice Horton Ii.

Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings. 編輯 / David Sankoff; Raffaele Giancarlo. Springer Verlag, 2000. p. 84-98 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 卷 1848).

研究成果: Conference contribution

TY - GEN

T1 - Tsukuba BB

T2 - A branch and bound algorithm for local multiple sequence alignment

AU - Paul, Brice Horton Ii

PY - 2000/1/1

Y1 - 2000/1/1

N2 - In this paper we present a branch and bound algorithm for local gapless multiple sequence alignment (motif alignment) and its implementation. This is the first program to exploit the fact that the motif alignment problem is easier for short motifs. Indeed for a fixed motif width the running time of the algorithm is asymptotically linear in the size of the input. We tested the performance of the program on a dataset of 300 E.coli promoter sequences. For a motif width of 4 the optimal alignment of the entire set of sequences can be found. For the more natural motif width of 6 the program can align 19 sequences of length 100; more than twice the number of sequences which can be aligned by the best previous exact algorithm. The algorithm can relax the constraint of requiring each sequence to be aligned, and align 100 of the 300 promoter sequences with a motif width of 6. We also compare the effectiveness of the Gibbs sampling and beam search heuristics on this problem and show that in some cases our branch and bound algorithm can find the optimal solution, with proof of optimality, when those heuristics fail to find the optimal solution.

AB - In this paper we present a branch and bound algorithm for local gapless multiple sequence alignment (motif alignment) and its implementation. This is the first program to exploit the fact that the motif alignment problem is easier for short motifs. Indeed for a fixed motif width the running time of the algorithm is asymptotically linear in the size of the input. We tested the performance of the program on a dataset of 300 E.coli promoter sequences. For a motif width of 4 the optimal alignment of the entire set of sequences can be found. For the more natural motif width of 6 the program can align 19 sequences of length 100; more than twice the number of sequences which can be aligned by the best previous exact algorithm. The algorithm can relax the constraint of requiring each sequence to be aligned, and align 100 of the 300 promoter sequences with a motif width of 6. We also compare the effectiveness of the Gibbs sampling and beam search heuristics on this problem and show that in some cases our branch and bound algorithm can find the optimal solution, with proof of optimality, when those heuristics fail to find the optimal solution.

UR - http://www.scopus.com/inward/record.url?scp=84937402994&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84937402994&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84937402994

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 84

EP - 98

BT - Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings

A2 - Sankoff, David

A2 - Giancarlo, Raffaele

PB - Springer Verlag

ER -

Paul BHI. Tsukuba BB: A branch and bound algorithm for local multiple sequence alignment. 於 Sankoff D, Giancarlo R, 編輯, Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings. Springer Verlag. 2000. p. 84-98. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).