TY - JOUR
T1 - Analytical validation of whole exome and whole genome sequencing for clinical applications
AU - Linderman, Michael D.
AU - Brandt, Tracy
AU - Edelmann, Lisa
AU - Jabado, Omar
AU - Kasai, Yumi
AU - Kornreich, Ruth
AU - Mahajan, Milind
AU - Shah, Hardik
AU - Kasarskis, Andrew
AU - Schadt, Eric E.
PY - 2014/4/23
Y1 - 2014/4/23
N2 - Background: Whole exome and genome sequencing (WES/WGS) is now routinely offered as a clinical test by a growing number of laboratories. As part of the test design process, each laboratory must determine the performance characteristics of the platform, test and informatics pipeline. This report documents one such characterization of WES/WGS. Methods: Whole exome and whole genome sequencing was performed on multiple technical replicates of five reference samples using the Illumina HiSeq 2000/2500. The sequencing data was processed with a GATK-based genome analysis pipeline to evaluate intra-run, inter-run, inter-mode, inter-machine and inter-library consistency; concordance with orthogonal technologies (microarray, Sanger); and sensitivity and accuracy relative to known variant sets. Results: Concordance with high-density microarrays consistently exceeds 97% (and typically exceeds 99%), and concordance between sequencing replicates also exceeds 97%, with no observable differences between flow cells, runs, machines or modes. Sensitivity relative to high-density microarray variants exceeds 95%. In a detailed study of a 129 kb region, sensitivity was lower, with some validated single-base insertions and deletions "not called". Different variants are not called in each replicate: of all variants identified in WES data from the NA12878 reference sample, 74% of indels and 89% of SNVs were called in all seven replicates; in NA12878 WGS, 52% of indels and 88% of SNVs were called in all six replicates. Key sources of non-uniformity are variance in depth of coverage, artifactual variants resulting from repetitive regions and larger structural variants.
AB - Background: Whole exome and genome sequencing (WES/WGS) is now routinely offered as a clinical test by a growing number of laboratories. As part of the test design process, each laboratory must determine the performance characteristics of the platform, test and informatics pipeline. This report documents one such characterization of WES/WGS. Methods: Whole exome and whole genome sequencing was performed on multiple technical replicates of five reference samples using the Illumina HiSeq 2000/2500. The sequencing data was processed with a GATK-based genome analysis pipeline to evaluate intra-run, inter-run, inter-mode, inter-machine and inter-library consistency; concordance with orthogonal technologies (microarray, Sanger); and sensitivity and accuracy relative to known variant sets. Results: Concordance with high-density microarrays consistently exceeds 97% (and typically exceeds 99%), and concordance between sequencing replicates also exceeds 97%, with no observable differences between flow cells, runs, machines or modes. Sensitivity relative to high-density microarray variants exceeds 95%. In a detailed study of a 129 kb region, sensitivity was lower, with some validated single-base insertions and deletions "not called". Different variants are not called in each replicate: of all variants identified in WES data from the NA12878 reference sample, 74% of indels and 89% of SNVs were called in all seven replicates; in NA12878 WGS, 52% of indels and 88% of SNVs were called in all six replicates. Key sources of non-uniformity are variance in depth of coverage, artifactual variants resulting from repetitive regions and larger structural variants.
UR - http://www.scopus.com/inward/record.url?scp=84900476010&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84900476010&partnerID=8YFLogxK
U2 - 10.1186/1755-8794-7-20
DO - 10.1186/1755-8794-7-20
M3 - Article
C2 - 24758382
AN - SCOPUS:84900476010
SN - 1755-8794
VL - 7
JO - BMC Medical Genomics
JF - BMC Medical Genomics
IS - 1
M1 - 20
ER -