Summary: Even after the introduction of high-throughput sequencing, genotyping arrays remain a viable platform for large-scale genetic studies. Illumina is currently one of the largest manufacturers of genotyping arrays. One technical issue that has long plagued the post-processing of Illumina genotyping array data is the definition of strand. Against convention, Illumina uses its own definition of strand, which is inconsistent with the standard reference forward and reverse strand definition. This inconsistency has been a major obstacle to consistent reporting, meta-analysis, and correct interpretation of phenotype association results. To date, the strand issue has not been adequately addressed, which prompted us to develop StrandScript, a tool that can convert all genotyping data generated from Illumina genotyping arrays to the reference forward strand. StrandScript works independently of the Illumina array version and is future-proof for newer Illumina array designs. Furthermore, StrandScript can examine an Illumina genotyping array manifest file and detect all problematic SNPs, including SNPs with incorrect rs IDs and SNPs with mismatched probe sequences. Here, we introduce StrandScript's design and development and demonstrate its effectiveness using real genotyping data.