Developmental Validation of Probabilistic Genotyping Software for NGS-Generated aSTR Profiles

Thursday September 21st, 2023 // 11:20 am - 11:40 am // Hyatt Regency at the Colorado Convention Center, Centennial Ballroom

Forensic DNA analysis is vital in criminal investigations, and next-generation sequencing (NGS), also called massively parallel sequencing (MPS), has revolutionised the field by offering new methodologies for interpreting DNA evidence. NGS enables applications such as genetic genealogy, missing-persons identification, and paternity testing. A specific area of interest is the technology's potential to improve the interpretation of autosomal short tandem repeats (aSTRs) in forensic mixtures. Unlike traditional capillary-electrophoresis methods, NGS can detect sequence variation within STRs, providing sequence-level information that enhances the ability to discriminate between individuals.


Despite these advantages, interpreting NGS-generated aSTR profiles presents challenges due to increased complexity and vast data output. Published studies have explored the behaviour of STRs in NGS-DNA profiles and proposed quantitative models to describe expected NGS-DNA profiles. This growing understanding of the behaviour of NGS profiles has facilitated the development of probabilistic genotyping solutions for the interpretation of aSTR mixtures created with sequencing technologies. Building upon biological models introduced by Cheng et al. (2021), which were adapted from Bright et al. (2013) and Vilsen et al. (2013), a prototype probabilistic genotyping solution was created.


This probabilistic genotyping tool is designed to address some of these challenges by providing a method for interpreting forensic aSTR DNA profiles generated using NGS technology. We present the results of the developmental validation of this software. The validation involved assigning likelihood ratios (LRs) to over one hundred DNA profiles, each deliberately constructed to contain up to four contributors, and assessing how well the LRs distinguish known donors from non-donors. The results illustrate the behaviour of LRs, the impact of assuming a known donor within the mixture, and the consequences of under- or overestimating the number of contributors in a mixture. Additionally, we discuss limitations of this probabilistic genotyping solution, including sequence artifacts and degradation issues.
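
For readers unfamiliar with the term, the likelihood ratio mentioned above is conventionally written as a ratio of two conditional probabilities. The notation below is a generic sketch and is not taken from the presentation: the symbols O (the observed profile), H_1 and H_2 (the two competing propositions), and I (background information, such as the assumed number of contributors) are assumptions introduced here for illustration.

```latex
% Generic form of a likelihood ratio; symbol names are illustrative assumptions.
% O   : observed DNA profile
% H_1 : the person of interest is a contributor to the mixture
% H_2 : an unknown, unrelated individual is a contributor instead
% I   : background information (e.g. the assumed number of contributors)
\[
  \mathrm{LR} = \frac{\Pr(O \mid H_1, I)}{\Pr(O \mid H_2, I)}
\]
```

Under this convention, an LR above 1 supports H_1 and an LR below 1 supports H_2, so a validation would generally expect LRs above 1 for known donors and below 1 for non-donors.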


Kevin Cheng

Scientist Developer, Institute of Environmental Science and Research (ESR)

Kevin Cheng is a Scientist Developer at the Institute of Environmental Science and Research (ESR). He is involved in developmental research and validation for probabilistic genotyping software, and teaches the concepts of probabilistic genotyping when required.
