Statistical Modeling of Short-Tandem Repeat Capillary Electrophoresis Profiles

Slim Karkar, Lauren E. Alfonse, Catherine M. Grgicak, Desmond S. Lun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Objective: Interrogating multiple polymorphic short tandem repeat (STR) loci by PCR and capillary electrophoresis (CE) is the chief technique by which laboratories determine whether an individual contributed DNA to biological material retrieved from the environment. In theory, the CE signal carries substantial information about the length and number of the DNA fragments amplified. However, environmental samples are challenging to interpret because little is known about the quantity or quality of the DNA, and the allele signal is often obscured by noise and by the PCR artifact known as stutter. A signal model that captures the components of the STR signal without relying on a priori knowledge of DNA quantity or quality is therefore warranted.

Results: We first develop a strategy that quantifies the quality of a profile by examining the degree to which the signal changes with amplicon size. Second, we develop a separate model for each component of the signal: allele, stutter artifact, and noise. By examining the out-of-sample prediction error, we identify a model that can be used effectively for downstream interpretation.

Significance: The model is selected using a large, diverse collection of profiles obtained under 144 distinct laboratory conditions and over a wide range of DNA template masses, from a single copy of DNA to hundreds of copies. Because it is a Gaussian mixture model, it can be readily applied to the analysis of complex DNA samples.
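The two ideas named in the abstract, measuring how the signal changes with amplicon size and comparing candidate models by out-of-sample prediction error, can be illustrated with a small sketch. This is not the authors' model or code. It assumes, purely for illustration, that profile quality is summarized by the slope of log peak height against amplicon size, and that two toy candidates (a flat model and a log-linear model) are compared by leave-one-out error. The function names and the data values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_error(xs, ys, use_slope):
    """Leave-one-out mean squared prediction error.

    use_slope=False predicts the training mean (signal independent of size);
    use_slope=True predicts from a line fitted to the training points.
    """
    total = 0.0
    for i in range(len(xs)):
        # Hold out point i and fit on the remaining points.
        tx = xs[:i] + xs[i + 1:]
        ty = ys[:i] + ys[i + 1:]
        if use_slope:
            a, b = fit_line(tx, ty)
            pred = a + b * xs[i]
        else:
            pred = sum(ty) / len(ty)
        total += (ys[i] - pred) ** 2
    return total / len(xs)


# Made-up amplicon sizes (bp) and peak heights (RFU), for illustration only.
sizes = [100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0]
heights = [2100.0, 1500.0, 1250.0, 800.0, 640.0, 420.0, 330.0]
log_heights = [math.log(h) for h in heights]

# Assumed quality summary: change in log signal per base pair.
intercept, slope = fit_line(sizes, log_heights)

# Out-of-sample comparison of the two toy candidates.
err_flat = loo_error(sizes, log_heights, use_slope=False)
err_line = loo_error(sizes, log_heights, use_slope=True)

print(f"slope of log height per bp: {slope:.4f}")
print(f"LOO error, flat: {err_flat:.4f}; log-linear: {err_line:.4f}")
```

In this toy data the signal falls with amplicon size, so the fitted slope is negative and the log-linear candidate has the lower held-out error. A real analysis would replace both candidates with the component models described in the paper.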

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Editors: Harald Schmidt, David Griol, Haiying Wang, Jan Baumbach, Huiru Zheng, Zoraida Callejas, Xiaohua Hu, Julie Dickerson, Le Zhang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 869-876
Number of pages: 8
ISBN (Electronic): 9781538654880
State: Published - Jan 21 2019
Event: 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018 - Madrid, Spain
Duration: Dec 3 2018 - Dec 6 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018

Conference

Conference: 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Country: Spain
City: Madrid
Period: 12/3/18 - 12/6/18

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Health Informatics
