Forensic Genetic Genealogy (FGG) has recently become a valuable tool in the forensic science community and is having a great impact on the resolution of unresolved cases, including homicides, sexual assaults, and Unidentified Human Remains (UHRs) cases. In forensic investigations, when traditional forensic DNA (STR) analysis and CODIS upload (within the United States) fail to produce a candidate match, FGG can produce investigative leads to identify an unknown individual. FGG employs SNP data uploaded to genetic genealogy databases (i.e., FamilyTreeDNA® and GEDmatch PRO®) to identify genetic relatives (i.e., genetic matches) of the unknown individual. Family trees are then constructed from the genetic matches to reach a possible candidate identity for the unknown individual. SNP genotyping (e.g., by SNP microarray) typically requires high-quality, high-quantity DNA samples. Degraded DNA samples, however, are regularly encountered in forensic investigations. A critical analysis of degraded DNA/SNP data is therefore necessary to understand its downstream effects on FGG analysis within the genetic genealogy databases.
Addressing this potential issue, Justin will present a study that investigates how manually degraded SNP DNA data files affect the top ten genetic matches generated in GEDmatch during his poster presentation at ISHI 33.
We talked with Justin to learn a little bit more about his poster presentation. If you’ll be at ISHI 33 this year, be sure to stop by poster #8 to learn more!
Briefly describe your work/area of interest.
Forensic Genetic Genealogy (FGG) has recently become a valuable tool in the forensic science community and is having a great impact on the resolution of unresolved cases, including homicides, sexual assaults, and Unidentified Human Remains (UHRs) cases. If there are no candidate matches following traditional forensic DNA (STR) analysis and CODIS upload of DNA evidence, FGG can produce investigative leads to identify an unknown individual. FGG employs SNP data that is uploaded to genetic genealogy databases (FamilyTreeDNA® and GEDmatch PRO®) to identify genetic relatives (i.e., genetic matches) of the unknown individual.
Key collaborators on this project include Dr. Claire Glynn, Dr. San Pietro, Melinde Byrne CG, and Julia Dollen MSFS.
Three volunteers provided their own downloaded raw DNA SNP microarray data. The data files were anonymized and subjected to a randomized manual deletion protocol that removed increasing percentages of the overall SNP data profile (5%, 10%, 15%, 20%, 25%, 30%, and 50% deletion). A total of nine modified files were uploaded to GEDmatch as “Research Files”, and a list of the top ten genetic matches based on shared DNA was produced for each. Each modified file was examined using autosomal One-to-Many matching, autosomal One-to-One Q-Matching, and Segment Searching to investigate how values and top matches changed as more data was deleted.
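For readers who want a feel for what a randomized deletion step looks like, here is a minimal Python sketch. It is not the protocol used in the study (that deletion was done manually); it assumes a tab-separated raw data file in which comment or header lines start with `#` and every other non-empty line is one SNP. The function name `delete_random_snps` and the toy profile are made up for illustration.

```python
import random


def delete_random_snps(lines, fraction, seed=0):
    """Return a copy of `lines` with about `fraction` of the SNP rows removed.

    Assumption: lines starting with '#' are comments/headers and are always
    kept; every other non-empty line is treated as one SNP record.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Positions of the SNP rows that are candidates for deletion.
    data_idx = [
        i for i, line in enumerate(lines)
        if line.strip() and not line.startswith("#")
    ]
    n_drop = round(len(data_idx) * fraction)
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    dropped = set(rng.sample(data_idx, n_drop))
    return [line for i, line in enumerate(lines) if i not in dropped]


# Toy profile: one header line plus 100 invented SNP rows.
header = ["# rsid\tchromosome\tposition\tgenotype"]
rows = [f"rs{i}\t1\t{1000 + i}\tAG" for i in range(100)]
profile = header + rows

for pct in (5, 10, 15, 20, 25, 30, 50):
    degraded = delete_random_snps(profile, pct / 100, seed=pct)
    print(f"{pct}% deleted: {len(degraded) - 1} SNP rows remain")
```

Random deletion like this only thins out the data evenly. Real degradation may not behave that way, so the sketch should be read as an approximation of the deletion step, not of a degraded sample.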
The results show various changes among the top matches, including, but not limited to: increases and decreases in total shared cM values, increases and decreases in the quality scores of matching segments on a one-to-one basis, and changes to the percentage confidence in predicted relationships. Additionally, the ranking of each donor’s top ten genetic matches changed as the deleted percentage increased, with some matches moving up in rank, some moving down, and some dropping out of the top ten entirely when compared to the original full DNA SNP data file.
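The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name `compare_top_matches`, the match IDs, and the cM values are all invented, and each match list is assumed to be already sorted from most to least shared DNA.

```python
def compare_top_matches(original, degraded):
    """Compare two ranked match lists of (match_id, shared_cM) tuples.

    Returns a dict keyed by each match in `original`. The value is
    (rank_change, cM_change), where a positive rank_change means the match
    moved up the list, or None if the match no longer appears in `degraded`.
    """
    new_rank = {mid: r for r, (mid, _) in enumerate(degraded, start=1)}
    new_cm = dict(degraded)
    changes = {}
    for rank, (mid, cm) in enumerate(original, start=1):
        if mid not in new_rank:
            changes[mid] = None  # lost from the list
        else:
            changes[mid] = (rank - new_rank[mid], round(new_cm[mid] - cm, 1))
    return changes


# Invented values for illustration only; these are not study results.
full_file = [("A", 250.0), ("B", 180.5), ("C", 95.0)]
degraded_file = [("B", 171.0), ("A", 160.5), ("D", 90.0)]

for match_id, change in compare_top_matches(full_file, degraded_file).items():
    print(match_id, "lost from list" if change is None else change)
```

In this toy example, match B moves up one place, match A moves down one place, and match C drops out of the list.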
How did you get interested in this work? Why did this particular project appeal?
I grew interested in this work because of how innovative the tool is and how inspiring the success stories of FGG have been. After learning about it through the literature, news articles, and hands-on learning, I saw how incredible it was to use genetic relatedness to generate leads on an individual’s identity. I was inspired not only by the success stories but also by how passionate forensic genetic genealogists are about the emerging field.
This project appealed to me immediately because it is very empirical. There is still so much to be discovered about how SNP technology and DNA data files are affected by low-quality DNA samples and how that affects the FGG process. In essence, this project addresses the fact that DNA samples from crime scenes are far from pristine, whereas the people in genetic databases submitted high-quality samples. I think the more research is done to understand how genetic matches, shared cM values, and segmental analysis are affected by imperfect DNA data files, the better we will be at interpreting the initial results from a sample upload.
Can you summarize the impact of your work for the audience (ISHI attendees and some general forensic enthusiasts)? How might this advance the field?
The impact of this work is that it helps us understand how crime scene samples behave within the genetic databases. Looking at the bigger picture, as FGG use grows, it is important to understand how to assess the information coming from a subject’s matches, particularly when dealing with degraded DNA samples. This research will provide insight into how poor-quality samples affect the FGG process.