Background: T cells and B cells are essential to adaptive immunity: they recognize antigens by expressing T cell receptors and immunoglobulins, respectively. To recognize a wide variety of antigens, a highly diverse repertoire of receptors is generated through complex recombination of the receptor genes. Accordingly, the frequencies of recombination events have been shown to predict immune diseases and to provide insights into the development of immunity. High-throughput sequencing has further advanced the field, and several computational tools have been released to analyze the recombined sequences. However, all current tools assume regular recombination of the receptor genes, an assumption that does not always hold for data prepared using a RACE approach. Compared with the traditional multiplex PCR approach, RACE is free of primer bias and can therefore provide accurate estimates of recombination frequencies. Handling non-regular recombination events requires a new computational program.
Results: We propose TRIg to handle non-regular T cell receptor and immunoglobulin sequences. Unlike all current programs, TRIg aligns sequences to the whole receptor gene rather than only to the coding regions. This introduces new computational challenges, such as ambiguous alignments caused by multiple hits to repetitive regions. To reduce ambiguity, TRIg applies a heuristic strategy and incorporates gene annotation to identify authentic alignments. On our own and public RACE datasets, TRIg correctly identified non-regularly recombined sequences, which current programs could not. TRIg also works well for regularly recombined sequences.
Conclusions: TRIg takes into account non-regular recombination of T cell receptor and immunoglobulin genes and is therefore suitable for analyzing RACE data. Such analysis will provide accurate estimates of recombination events, which will directly benefit various immune studies. In addition, TRIg is suitable for studying aberrant recombination in immune diseases. TRIg is freely available at https://github.com/TLlab/trig.
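The idea of using gene annotation to choose among ambiguous alignments can be illustrated with a minimal Python sketch. It is not TRIg's actual heuristic, which the abstract does not specify. It assumes that each candidate hit is a reference interval with an alignment score, that annotated gene segments are intervals on the same reference, and that the hit overlapping an annotated segment most is preferred, with ties broken by score. The names `Hit`, `Feature`, `pick_hit`, and `annotate`, and all coordinates, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hit:
    """One candidate alignment of a read segment to the receptor locus (hypothetical)."""
    start: int    # 0-based reference start, inclusive
    end: int      # reference end, exclusive
    score: float  # alignment score reported by the aligner


@dataclass(frozen=True)
class Feature:
    """One annotated gene segment on the reference (hypothetical)."""
    name: str
    start: int
    end: int


def overlap(hit: Hit, feat: Feature) -> int:
    """Number of reference bases shared by a hit and an annotated segment."""
    return max(0, min(hit.end, feat.end) - max(hit.start, feat.start))


def pick_hit(hits: Sequence[Hit], features: Sequence[Feature]) -> Hit:
    """Assumed rule: prefer the hit with the largest annotation overlap, then the best score."""
    def key(hit: Hit):
        best_overlap = max((overlap(hit, f) for f in features), default=0)
        return (best_overlap, hit.score)
    return max(hits, key=key)


def annotate(hit: Hit, features: Sequence[Feature]) -> Optional[str]:
    """Name of the annotated segment the hit overlaps most, or None if it overlaps none."""
    best = max(features, key=lambda f: overlap(hit, f), default=None)
    if best is None or overlap(hit, best) == 0:
        return None
    return best.name


# Made-up example: two equally scoring hits, only one inside an annotated segment.
features = [Feature("V_example", 1000, 1300), Feature("J_example", 9000, 9050)]
hits = [Hit(4000, 4100, 95.0), Hit(1150, 1250, 95.0)]
chosen = pick_hit(hits, features)
print(chosen, annotate(chosen, features))
```

A hit that overlaps no annotated segment is still returned when nothing better exists, and `annotate` then reports `None`. In this sketch, that is how a read mapping to a non-coding part of the gene would be kept rather than discarded.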
All Science Journal Classification (ASJC) codes
- Structural Biology
- Molecular Biology
- Computer Science Applications
- Applied Mathematics