Nov 29 2017

Courtroom Testimony for Probabilistic Genotyping


“Ladies and Gentlemen of the Jury, Jane Doe’s DNA matches the DNA found on the handle of the knife; therefore, she touched the knife and used it to stab John Doe.”  As much as prosecutors would like that statement to come out of our mouths, making it is not the role of the DNA Analyst.  We cannot say when DNA came to be on an item or what action resulted in the DNA being deposited there.  As DNA Analysts, our job is to tell the jury whether DNA was detected on an item from the crime scene and who may or may not be a contributor to the DNA profile, and then to support that statement with a statistic.

Most individuals working in the legal system are now familiar with the Random Match Probability statistic.  They understand that we can explain how rare a DNA profile is within the population, and they often want to hear how many planet Earths it would take to expect to see that profile.  However, that comparison is not an appropriate analogy when you are reporting a Likelihood Ratio (LR).  With the advent of probabilistic genotyping software, the challenge is not only how to present the “new” statistic, but also how to explain probabilistic genotyping in general.



Written by: Samantha O. Wandzek, DNA Labs International



At DNA Labs International, we primarily use STRmix™ software for mixture interpretation.  Testimony training has two main focuses.  The first is to know the material: learn how the software works, know the algorithms behind it and know the basic foundations.  The second is especially important for testimony: you need to be able to take what you know and explain it simply to a jury.  Keep it as simple as possible, but be able to go into further detail if asked.

There are two main types of comparisons that can occur, the first being the more straightforward scenario.  You have a known sample and you can compare the DNA profile of that individual to the mixed DNA profile to see if he or she can be included as a possible contributor.  At DNA Labs International, we exclude individuals without running STRmix™ when appropriate.  For example, if none of an individual’s alleles are present, a software program is not going to change that; the individual can be excluded.  If an individual cannot be outright excluded as a contributor, the known and evidence samples are run through STRmix™.  What can we say in court about the statistic generated?


Jane Doe could not be ruled out as a contributor to the mixed DNA profile obtained from the knife handle.  A statistical tool was used to generate a statistic called a Likelihood Ratio.  A Likelihood Ratio is a ratio of two probabilities giving a numerical value that shows strength of support for one scenario over another.  In the case of DNA, we are usually using it to give strength of support for a known individual to be part of a mixture as opposed to the mixture being comprised of unknown random individuals in the population.


After that, you state the statistic, and the jury will almost certainly look at you, confused.  What does one million times more likely mean?  It is then your job to put that number into perspective based on the verbal scale you use and/or what was seen during your internal validation.  At DNA Labs International, we use the verbal scale that is published in the STRmix™ manuals and is used in the United States as well as internationally, including by the Institute of Environmental Science and Research, the developers of STRmix™.  In addition, we have an “uninformative” range that may be reported.  What does it mean to have an uninformative statistic and how can you explain that to a jury?


The uninformative range is the statistical range where you cannot meaningfully include or exclude an individual.  You can have a true contributor, which is someone who is actually in the mixture, with a Likelihood Ratio falsely supporting exclusion as well as a non-contributor, which is someone who is not in the mixture, with a Likelihood Ratio falsely supporting inclusion.


What does that mean?  It means that qualitatively the known individual could not be ruled out as a contributor; however, the quantitative statistic generated could be a false inclusion or exclusion and is thus uninformative.  You are back to where you left off before the statistic was generated.  The uninformative range is used to avoid reporting low Likelihood Ratios that internal validation has shown to be unreliable.
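For analysts who find it easier to reason from a worked calculation, the Likelihood Ratio and the uninformative range described above can be sketched as follows.  This is only an illustration: the two probabilities, the bounds of the uninformative range and the function names are hypothetical values chosen for the example, not figures taken from STRmix™ or from any laboratory’s validation.  Each laboratory would substitute the range supported by its own internal validation.

```python
def likelihood_ratio(p_evidence_given_h1: float, p_evidence_given_h2: float) -> float:
    """Ratio of two probabilities: Pr(evidence | H1) / Pr(evidence | H2).

    H1: the known individual is a contributor to the mixture.
    H2: the mixture is made up of unknown individuals from the population.
    """
    if p_evidence_given_h2 <= 0:
        raise ValueError("Pr(evidence | H2) must be greater than zero")
    return p_evidence_given_h1 / p_evidence_given_h2


# Hypothetical bounds for illustration only; a real range comes from
# the laboratory's internal validation.
UNINFORMATIVE_LOW = 0.01
UNINFORMATIVE_HIGH = 100.0


def describe(lr: float) -> str:
    """Classify an LR relative to the (hypothetical) uninformative range."""
    if lr <= 0:
        raise ValueError("A likelihood ratio must be greater than zero")
    if UNINFORMATIVE_LOW <= lr <= UNINFORMATIVE_HIGH:
        return "uninformative"
    if lr > UNINFORMATIVE_HIGH:
        return "supports inclusion"
    return "supports exclusion"


# Made-up probabilities whose ratio is about one million.
lr = likelihood_ratio(1e-12, 1e-18)
print(lr, describe(lr))
```

Read this way, “one million times more likely” means the DNA evidence is about one million times more probable if the known individual is a contributor than if the mixture came from unknown individuals.  It is a statement about the evidence, not about the probability that the individual is the source.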

The second type of comparison that can occur is one where you do not have a known sample.  The evidence sample is run in STRmix™ without a reference sample, and a profile can be determined that meets requirements for database upload.  Laboratories vary on requirements for profile determination.  At DNA Labs International, we require a genotypic weight of greater than 99% at a locus, and a sufficient number of loci must meet that requirement before the determined profile is used for comparison.  What happens when the determined profile ends up with a database hit?


The DNA profile obtained from the knife handle is a mixture of at least two people.  A DNA profile was determined and matched the DNA profile obtained from Jane Doe.


Once a confirmation standard from the database individual has been submitted, a database hit to a determined profile would warrant an additional analysis in STRmix™ with that individual considered in your proposition.  After that, you can simply go back to the LR explanation and give the statistic.  You do not even need to use the words “probabilistic genotyping software,” explain Markov chain Monte Carlo (MCMC) or use the word “algorithm.”
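The profile determination rule mentioned earlier (a genotypic weight of greater than 99% at a locus, with a sufficient number of loci meeting that requirement) can be sketched in a few lines.  The data layout, the weights and the minimum number of loci below are assumptions made for illustration; they do not reproduce the STRmix™ report format or any laboratory’s actual policy.

```python
WEIGHT_THRESHOLD = 0.99  # from the text: weight must be greater than 99%
MIN_LOCI = 3             # hypothetical; each laboratory sets its own requirement


def determine_profile(weights, threshold=WEIGHT_THRESHOLD):
    """Keep the top-weighted genotype at each locus whose weight exceeds the threshold.

    `weights` maps a locus name to a dict of {genotype: weight}.
    """
    profile = {}
    for locus, genotypes in weights.items():
        best_genotype, best_weight = max(genotypes.items(), key=lambda kv: kv[1])
        if best_weight > threshold:
            profile[locus] = best_genotype
    return profile


def usable_for_comparison(profile, min_loci=MIN_LOCI):
    """True when enough loci met the weight requirement."""
    return len(profile) >= min_loci


# Made-up genotype weights for one contributor at four loci.
example_weights = {
    "D3S1358": {(15, 16): 0.995, (15, 15): 0.005},
    "vWA": {(17, 18): 0.62, (17, 17): 0.38},
    "FGA": {(21, 24): 0.999, (21, 21): 0.001},
    "TH01": {(6, 9.3): 0.992, (6, 6): 0.008},
}

profile = determine_profile(example_weights)
print(profile, usable_for_comparison(profile))
```

In this made-up example the vWA locus is left out of the determined profile because no single genotype exceeds the threshold there, which is the kind of result an analyst should be ready to explain if asked why a locus is missing.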

At some point, you may need to define some of these things, but by keeping it basic at first, you will have the best chance of getting the jury to understand.  As a DNA Analyst, you review all of the data, evaluate whether STRmix™ is performing as expected, and ensure that the determined profiles are intuitively correct.  STRmix™ is just another tool that we use to help interpret the sample, and it does not replace the role of the DNA Analyst.

What happens with more complex questions?  The STRmix™ Users Manual has a Frequently Asked Questions section that is a must-read.  In addition, be prepared to answer questions about the STRmix™ report that is likely contained within your case file.  Some common questions that have been asked are:

  • How does STRmix™ work?
  • Who determines the number of contributors?
  • How are the mixture proportions determined?
  • What are iterations?
  • What is the acceptance rate?
  • What is allele variance?
  • What is stutter variance?
  • What is the effective sample size?

Rather than fear these questions, be prepared.  Prior to every testimony, go over your answers, write them down and learn them.  Sit down and talk with experienced analysts at your laboratory to go over possible case-specific questions and answers.  Eventually, testifying for cases involving probabilistic genotyping will be the new normal.  The more you testify to probabilistic genotyping software results, the more comfortable you will become.  Remember: don’t overstep your bounds as a scientist.  Know the limitations of the software and how far your opinion should go, which should be based on published literature as well as your own training and internal validations.

