STR and SNP Genetic Analyses of the Yavapai Native Americans using Massively Parallel Sequencing

The forensic science community is moving rapidly towards implementing massively parallel sequencing (MPS) in routine casework, and a key limitation in the field can now be addressed relatively quickly.



Submitted by: Frank Wendt, UNTHSC



Arguably all of the work done by forensic scientists in casework or research depends on underlying frequencies of the target amplicons of choice. A critical component of bringing new short tandem repeat (STR) kits and methods, such as MPS, into the field is establishing those frequencies using major population groups relative to geographic location. While reading population genetics papers relevant to the field, it is common to see phrases like “major United States populations” or “major global populations” followed by some combination of African/African American, Hispanic, Caucasian, etc. Large samplings from these populations are sufficient for capturing the bulk of genetic diversity at the STR markers used for forensic science. The community then statistically corrects for the presence of population substructure to avoid prejudice towards a defendant who may harbor rare alleles not yet observed.
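To make the substructure correction concrete, here is a minimal Python sketch of one widely used form, the theta-corrected genotype match probability from the NRC II report (equation 4.10). This is an illustration only: the post does not say which correction was applied in the study, and the function name, allele frequencies, and theta values below are hypothetical.

```python
def match_probability(p_i, p_j=None, theta=0.01):
    """Theta-corrected genotype match probability (NRC II, eq. 4.10).

    p_i, p_j : allele frequencies (pass only p_i for a homozygote).
    theta    : coancestry coefficient; theta = 0 reduces to
               Hardy-Weinberg expectations (p^2 or 2pq).
    All values used in the example calls are illustrative, not study data.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if p_j is None:
        # Homozygote: both alleles are the same type
        return (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i) / denom
    # Heterozygote: two different allele types
    return 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j) / denom


# Hypothetical rare allele at frequency 0.02, homozygous profile
uncorrected = match_probability(0.02, theta=0.0)
corrected = match_probability(0.02, theta=0.03)
print(f"uncorrected: {uncorrected:.6f}  corrected: {corrected:.6f}")
```

A larger theta raises the match probability for rare alleles, which is the conservative direction for a defendant.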

Statistical correction is appropriate for now, but producing empirical frequency data for substructured populations is still the most effective way to account for inbreeding. Native Americans are a particularly interesting sample type because they likely exhibit greater genetic diversity than subgroups of major populations, making them valuable for completing the allele-pool puzzle and characterizing alleles that may be considered rare in other populations.

Through collaboration with Dr. Sree Kanthaswamy and his groups at Arizona State University and University of California at Davis, we acquired samples from the Yavapai population. Historically, the Yavapai were semi-nomadic hunter-gatherers spanning west-central Arizona. Currently, the tribe numbers approximately 900 individuals and occupies nearly 2,000 acres around the Verde River north of the Gila River.

We chose the ForenSeq™ DNA Signature Prep Kit from Illumina, Inc. to generate a large amount of data for this population: length- and sequence-based information for autosomal, X-chromosomal, and Y-chromosomal STRs, as well as human-identification single nucleotide polymorphism (iSNP), ancestry-informative SNP (aSNP), and phenotypic SNP (pSNP) information.

To summarize, this study served as the first investigation into Yavapai population genetics with respect to forensically relevant loci as well as the first set of population data reported using the ForenSeq™ DNA Signature Prep Kit. The autosomal STR combined random match probability (RMP) was substantially lower than in previous reports, Y-STR haplotype diversity was relatively high (0.95), and the combined RMPs for autosomal STRs and iSNPs were exceedingly low. Biogeographic ancestry estimates assigned East Asian and Admixed American ancestry to all but one sample. All complete pSNP profiles were predicted, unsurprisingly, to have black hair and brown eyes.
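For readers less familiar with these summary statistics, the sketch below shows how they are commonly computed. It assumes Nei's estimator for haplotype diversity and a per-locus RMP taken as the sum of squared genotype frequencies, multiplied across independent loci. The post does not state the exact formulas used in the study, and every function name and number here is hypothetical toy data.

```python
from collections import Counter
from math import prod


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def locus_rmp(genotype_freqs):
    """Per-locus match probability: sum of squared genotype frequencies."""
    return sum(f ** 2 for f in genotype_freqs)


def combined_rmp(per_locus_rmps):
    """Combined RMP across loci, assuming the loci are independent."""
    return prod(per_locus_rmps)


# Toy Y-STR haplotypes, written as tuples of repeat counts (hypothetical)
y_haplotypes = [(14, 30), (14, 30), (15, 29), (13, 31), (16, 28)]
print(f"haplotype diversity: {haplotype_diversity(y_haplotypes):.3f}")

# Toy genotype frequencies for two loci (hypothetical)
locus_a = locus_rmp([0.25, 0.50, 0.25])
locus_b = locus_rmp([0.10, 0.20, 0.30, 0.40])
print(f"combined RMP: {combined_rmp([locus_a, locus_b]):.4f}")
```

Because the per-locus values are multiplied, each additional independent marker pushes the combined RMP lower, which is why a kit that reports many STRs and iSNPs together can reach very small values.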




While focusing on a relatively unique population, the work presented highlights the utility of the ForenSeq™ DNA Signature Prep Kit for casework applications. Sequencing captures a great deal of additional descriptive information that can be of value for casework. Most notable is the collection of autosomal, X-, and Y-chromosomal STRs and iSNPs, aSNPs, and pSNPs all in one tube, which is truly remarkable considering that the field started with just a few markers of interest.

Perhaps less obvious is our characterization of STR motif sequence variation. This variation will likely become invaluable for mixture deconvolution and kinship assessments once it is better characterized. Essentially, we are finding that some markers are highly polymorphic by sequence but maintain a relatively small amplicon size. Ultimately, this can help overcome some of the amplification biases seen with loci that have large differences in amplicon sizes (like FGA or SE33).

I am currently focusing heavily on my dissertation project. I am characterizing a number of opiate-metabolism and analgesic-response polymorphisms that can be used to predict the metabolizer phenotype of an individual. Ultimately, this work will be able to guide prescription medication practices and aid molecular autopsy of individuals who may have died due to drug-related complications. In addition, I plan to continue working with this Native American population to explore STR and SNP flanking region variation and full mitochondrial genome variation.