Effects of spaced k-mers on alignment-free genotyping

Hartmut Häntze, Paul Horton

Research output: Contribution to journal, Article, peer-review

2 Citations (Scopus)

Abstract

Motivation: Alignment-free, k-mer based genotyping methods are a fast alternative to alignment-based methods and are particularly well suited for genotyping larger cohorts. The sensitivity of algorithms that work with k-mers can be increased by using spaced seeds; however, the application of spaced seeds in k-mer based genotyping methods has not yet been studied.

Results: We add spaced seed functionality to the genotyping software PanGenie and use it to calculate genotypes. This significantly improves sensitivity and F-score when genotyping SNPs, indels, and structural variants on reads with low (5×) and high (30×) coverage. The improvements are greater than what could be achieved by merely increasing the length of contiguous k-mers. Effect sizes are particularly large for low-coverage data. If applications implement effective algorithms for hashing spaced k-mers, spaced k-mers have the potential to become a useful technique in k-mer based genotyping.
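
For readers unfamiliar with spaced seeds, the idea can be illustrated with a minimal Python sketch. It is not taken from PanGenie; the function name `spaced_kmers` and the mask notation (a string in which '1' marks a position that is kept and '0' a position that is ignored) are assumptions made for this illustration only.

```python
def spaced_kmers(sequence, mask):
    """Return the spaced k-mers of `sequence` under a binary `mask`.

    Illustrative only: `mask` is a string of '1' (keep this position)
    and '0' (ignore this position). This notation is an assumption of
    the sketch, not PanGenie's interface.
    """
    if not mask or set(mask) - {"0", "1"}:
        raise ValueError("mask must be a non-empty string of '0' and '1'")

    # Offsets inside each window that contribute to the spaced k-mer.
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    span = len(mask)

    result = []
    # Slide a window of length `span` over the sequence.
    for start in range(len(sequence) - span + 1):
        window = sequence[start:start + span]
        result.append("".join(window[i] for i in keep))
    return result


# Two windows that differ only at an ignored position give the same
# spaced k-mer, whereas the contiguous 4-mers "ACGT" and "ACTT" differ.
same = spaced_kmers("ACGT", "1101") == spaced_kmers("ACTT", "1101")
```

Because a mismatch at an ignored position does not change the spaced k-mer, a read carrying a sequencing error or a nearby variant at that position can still match. This is the intuition behind the sensitivity gain described in the abstract.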

Original language: English
Pages (from-to): I213-I221
Journal: Bioinformatics
Volume: 39
Publication status: Published - 2023 Jun 1

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
