A Cost-effective Workflow for Massively Parallel Sequencing of Drug Metabolizing Enzymes

After taking a course in pharmacogenomics as part of my PhD program, I found that the transdisciplinary approach the subject requires, combining genetics and drug response, was something I wanted to explore. When the opportunity to work in this area came up as a capstone project, I was eager to jump aboard.


Submitted by: Ryan Gutierrez, Sam Houston State University



As background, many widely prescribed drugs do not have uniform effects on those who take them. Some people tolerate a wide range of doses, while others cannot tolerate even an amount that is usually sub-therapeutic. Often this variability in response to medication can be correlated with genetic differences in the regions of the genome that code for proteins that interact with the drug while it is in the body. Of particular interest in this project are drug metabolizing enzymes. Differences between individual genomes can translate into metabolizing proteins with altered function, and individuals can sometimes be categorized accordingly as normal, rapid, or poor metabolizers. In some cases metabolizer status is important because levels of a prescribed drug can quickly range from ineffective to therapeutic to toxic.

In this research, a cost-effective workflow was developed to identify polymorphisms in the gene encoding the drug metabolizing enzyme carboxylesterase 1 (CES1). The aim is to correlate these polymorphisms with individual responses to methylphenidate among patients being treated with the drug for ADHD. While metabolizer status for methylphenidate will not lead to any deadly outcomes, individuals taking the drug can experience a number of side effects that lower quality of life.

In designing our methods, we planned to use long amplicon PCR to amplify our regions of interest so that we could obtain high sequencing coverage for our samples. By using a massively parallel sequencing (MPS) method rather than single base extension or Sanger sequencing, we were able to get more data from our samples, potentially uncovering variants not yet described in the literature. We designed and optimized primers to enrich the 31.5 kilobase region of DNA in two overlapping 16.5 kilobase amplicons. We then prepared libraries with Nextera XT and sequenced them using the Illumina MiSeq Reagent Kit v3. Samples were multiplexed, with over 80 samples run together, saving both time and money.
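As a rough illustration of the arithmetic behind this design, the Python sketch below works out how much the two amplicons overlap and how mean read depth scales with the number of multiplexed samples. The function names and the `run_yield_bases` and `on_target_fraction` parameters are hypothetical and chosen only for this example; the yield passed in at the bottom is a placeholder, not a figure from this project or from the kit specification.

```python
def amplicon_overlap_kb(region_kb, amplicon_kb, n_amplicons=2):
    """Total overlap (kb) among tiled amplicons that together span region_kb."""
    return n_amplicons * amplicon_kb - region_kb


def mean_depth(run_yield_bases, n_samples, target_bases, on_target_fraction=1.0):
    """Approximate mean depth per sample when one run is shared evenly.

    Assumes every sample receives an equal share of the run, which real
    multiplexed libraries only approximate.
    """
    usable_bases = run_yield_bases * on_target_fraction
    return usable_bases / (n_samples * target_bases)


# Two 16.5 kb amplicons covering a 31.5 kb region share 1.5 kb.
overlap = amplicon_overlap_kb(region_kb=31.5, amplicon_kb=16.5)
print(f"Amplicon overlap: {overlap} kb")

# Placeholder yield of 1 Gb, for illustration only.
depth = mean_depth(run_yield_bases=1_000_000_000, n_samples=80, target_bases=31_500)
print(f"Approximate mean depth per sample: {depth:.0f}x")
```

Because the target is only 31.5 kilobases, even a modest share of a run leaves each sample with deep coverage, which is why pooling many samples is affordable for this kind of targeted design.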

We successfully identified both known and novel polymorphisms in the CES1 gene in our sample population. While we are still working on correlating these polymorphisms with the clinical outcomes of the sample population on methylphenidate, this work serves as proof that long amplicon PCR combined with MiSeq sequencing can provide valuable information about drug metabolism.

To finish this project, further downstream data analysis is necessary. Although the samples have been sequenced effectively, the variants still need to be correlated with clinical outcomes for our sample population. Developing a good way to handle the volume of data produced by massively parallel sequencing remains a challenge, and one that I plan to continue working on.

While standard forensic DNA laboratories may not currently be involved in sequencing drug metabolizing enzymes, the metabolizer status of a deceased victim can be important in forensic cases and could be discovered during molecular autopsy. This information can help distinguish an accidental overdose from a suicide. It could also help distinguish a malicious poisoning from a therapeutic dose of a drug that happened to cause harm to one individual.