Abstract
We developed an approach for identifying groups, or families, of Staphylococcus aureus bacteria from genotype data. With the emergence of drug-resistant strains, S. aureus represents a significant threat to human health, and identifying family types quickly and efficiently is crucial in community settings. Our approach is a hybrid of sequence algorithms that types this bacterium using only its spa gene. Two of the sequence algorithms we used are well established, whereas the third, the Best Common Gap-Weighted Sequence (BCGS), is novel. We combined the sequence algorithms with a weighted match/mismatch algorithm for the ends of the spa sequence. From these we derived normalized similarity scores and distances between sequences, which we used as input to unsupervised clustering methods. The resulting spa groupings correlated strongly with the groups defined by the well-established multilocus sequence typing (MLST) method. spa typing is preferable to MLST because it requires typing only one gene rather than seven. Furthermore, our spa clustering methods can be fine-tuned to be more discriminating than MLST, identifying new strains that MLST may miss. Finally, we applied multidimensional scaling to our distance matrices to visualize the relationships between isolates. The proposed methodology provides a promising new approach to molecular epidemiology.
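The pipeline described above (sequence similarity, normalization, distance, unsupervised clustering) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the longest common subsequence stands in for one of the well-established sequence algorithms, normalization divides by the longer profile length, and grouping is a simple single-linkage threshold rule. The BCGS algorithm and the weighted end matching are not reproduced here, and the isolate names and repeat codes are hypothetical.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two repeat lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """1 minus LCS similarity, normalized by the longer profile (an assumption)."""
    if not a and not b:
        return 0.0
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))


def threshold_clusters(profiles, cutoff):
    """Single-linkage grouping: link isolates whose distance is <= cutoff."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, shortening the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in combinations(names, 2):
        if normalized_distance(profiles[u], profiles[v]) <= cutoff:
            parent[find(u)] = find(v)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical spa repeat profiles; real profiles are ordered repeat codes.
profiles = {
    "iso_A": ["r1", "r2", "r3", "r4", "r5"],
    "iso_B": ["r1", "r2", "r4", "r5"],
    "iso_C": ["r7", "r8", "r9"],
    "iso_D": ["r7", "r8", "r9", "r9"],
}
clusters = threshold_clusters(profiles, cutoff=0.3)
print(clusters)
```

Lowering `cutoff` splits the isolates into finer groups, which loosely mirrors the idea that the clustering can be tuned to be more or less discriminating.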
Original language | English (US) |
---|---|
Pages (from-to) | 693-704 |
Number of pages | 12 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 4 |
Issue number | 4 |
State | Published - Oct 2007 |
All Science Journal Classification (ASJC) codes
- Biotechnology
- Genetics
- Applied Mathematics
Keywords
- Clustering
- Genotyping
- Molecular epidemiology
- Sequence algorithms
- Staphylococcus aureus