Sample Analysis Using Prediction Modeling for Early Mixture Detection

Analysis of evidentiary samples containing DNA from multiple contributors (“mixtures”) is a time-intensive process for a forensic analyst, and the contributor status of a sample (single source versus mixture) is not revealed until the end of the traditional forensic workflow.


Often, at this stage, retesting or additional testing of mixture samples may not be possible, particularly if only trace amounts of a contributor’s DNA are present.


Thus, a new method that allows quick and accurate identification of single-source (versus mixture) samples before the end point of short tandem repeat (STR) analysis would be beneficial.


To meet this need, a high-resolution melt (HRM) screening assay has been developed and integrated into an existing step in the laboratory evidence workflow: the real-time PCR-based DNA quantification step.


Using the developed assay, the resulting HRM data is coupled with prediction modeling approaches so that the contributor status of an evidence item can be identified, and a genetic comparison made, without additional steps or delays in processing.
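
As a rough illustration of how melt data might feed a prediction model, the sketch below converts a raw fluorescence-versus-temperature curve to its negative derivative and assigns it to the closest reference curve. The nearest-reference rule, the function names, and the toy curves are assumptions made for illustration only; the specific prediction models used in this work are not stated here, and real HRM curves are more complex than these toy shapes.

```python
import math

def negative_derivative(temps, fluor):
    """Return -dF/dT between successive temperature points."""
    return [
        -(fluor[i + 1] - fluor[i]) / (temps[i + 1] - temps[i])
        for i in range(len(temps) - 1)
    ]

def euclidean(a, b):
    """Euclidean distance between two equal-length curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(curve, references):
    """Return the label of the reference curve closest to `curve`."""
    return min(references, key=lambda label: euclidean(curve, references[label]))

def toy_melt(temps, tm, width=0.5):
    """Hypothetical sigmoid melt: fluorescence falls from 1 to 0 around tm."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

temps = [70.0 + 0.1 * i for i in range(201)]  # 70.0 to 90.0 degrees C

# Toy references (not real data): one transition vs. the average of two.
single = toy_melt(temps, 80.0)
mixture = [(a + b) / 2 for a, b in zip(toy_melt(temps, 78.0), toy_melt(temps, 82.0))]
references = {
    "single source": negative_derivative(temps, single),
    "mixture": negative_derivative(temps, mixture),
}

# An "unknown" toy sample with a single transition near 80 degrees C.
unknown = negative_derivative(temps, toy_melt(temps, 80.2))
predicted = classify(unknown, references)
```

In practice a trained model would replace the distance rule, but the flow is the same: melt curve in, contributor-status label out.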


The integrated quantification-HRM mixture screening assay adds two new amplification targets (STR loci D5S818 and D18S51) and an intercalating dye to two existing commercial human DNA quantification chemistries, the Investigator Quantiplex® kit and the Quantifiler™ Trio kit. Each modified chemistry is then run with added transition and melt steps post-amplification on the QuantStudio™ 6 Flex.


To implement this assay in forensic laboratories, the dataset used for modeling must be expanded to encompass HRM data from all common genotypes for both STR loci. To that end, synthetic melt curve data will be generated for each common genotype not already represented in the data sets for the integrated HRM assays. The synthetic data will then be incorporated into the final reference dataset used to train the prediction models, expanding the applicability of the assay.
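
One simple way to picture synthetic melt curve generation is to sum one melt transition per allele and add measurement noise. The sketch below does this with a logistic transition. The allele-to-Tm mapping, the curve shape, and the noise level are all hypothetical placeholders, not values or methods taken from this work.

```python
import math
import random

# Hypothetical melting temperatures (degrees C) per allele; placeholders only.
ALLELE_TM = {"12": 79.0, "13": 79.6, "14": 80.2}

def synthetic_melt_curve(genotype, temps, width=0.4, noise_sd=0.005, seed=0):
    """Average one logistic melt transition per allele, then add Gaussian noise."""
    rng = random.Random(seed)  # seeded so the curve is reproducible
    curve = []
    for t in temps:
        signal = sum(
            1.0 / (1.0 + math.exp((t - ALLELE_TM[a]) / width)) for a in genotype
        ) / len(genotype)
        curve.append(signal + rng.gauss(0.0, noise_sd))
    return curve

temps = [75.0 + 0.1 * i for i in range(101)]  # 75.0 to 85.0 degrees C
curve = synthetic_melt_curve(("12", "14"), temps)
```

Varying the seed would yield multiple noisy replicates per genotype, which is one plausible way to populate a training set.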


To date, synthetic data has been generated for the D18S51 locus and is being tested, increasing the training set used for prediction modeling from 7 genotypes to 66 common genotypes.
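
For a sense of scale, a genotype at a single locus is an unordered pair of alleles, so n alleles give n(n+1)/2 possible genotypes, and 11 alleles give exactly 66. Whether the 66 common genotypes here were defined this way is an assumption, and the allele labels below are illustrative rather than the alleles actually used.

```python
from itertools import combinations_with_replacement

# 11 illustrative allele labels (hypothetical, not taken from the source).
alleles = [str(n) for n in range(10, 21)]

# Unordered pairs, including homozygous pairs such as ("10", "10").
genotypes = list(combinations_with_replacement(alleles, 2))
genotype_count = len(genotypes)  # 11 * 12 / 2 = 66
```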