Abstract
Post-translational modifications are considered important molecular interactions in protein science. One of these modifications is “sumoylation”, whose computational detection has recently become a challenge. In this paper, we propose a new computational predictor that uses the sine and cosine of backbone torsion angles, together with the accessible surface area, to predict sumoylation sites. These features were computed for all the proteins in our benchmark dataset, and a training matrix consisting of sumoylation and non-sumoylation sites was created. The training matrix was balanced by undersampling the majority class (non-sumoylation sites) using the NearMiss method. Finally, an AdaBoost classifier was used to discriminate between sumoylation and non-sumoylation sites. We named our predictor “C-iSumo” because of its effective use of circular functions. In a comparison with another predictor, C-iSumo performed better on statistical metrics such as sensitivity (0.734), accuracy (0.746) and Matthews correlation coefficient (0.494).
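As a rough illustration of two ideas in the abstract, the sketch below encodes torsion angles with circular functions and computes the three reported metrics from a confusion matrix. It is not the authors' implementation: the function names `encode_torsion` and `binary_metrics`, the example angles, and the confusion-matrix counts are all hypothetical, and the NearMiss and AdaBoost steps are left out. The point of the sine/cosine encoding is that angles such as -179° and 179°, which are numerically far apart but nearly identical on the circle, end up close together in feature space.

```python
import math


def encode_torsion(phi_deg, psi_deg):
    """Map two backbone torsion angles (in degrees) to four sine/cosine features."""
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    return [math.sin(phi), math.cos(phi), math.sin(psi), math.cos(psi)]


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, accuracy, Matthews correlation coefficient)."""
    sensitivity = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report MCC as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, accuracy, mcc


# Hypothetical angles: -179 and 179 degrees differ by 358 as raw numbers,
# but are only 2 degrees apart on the circle.
near_a = encode_torsion(-179.0, 60.0)
near_b = encode_torsion(179.0, 60.0)
far = encode_torsion(0.0, 60.0)

wrap_gap = math.dist(near_a, near_b)  # small: the encoding respects wrap-around
far_gap = math.dist(near_a, far)      # large: the angles really are opposite

# Hypothetical confusion-matrix counts, not taken from the paper.
sens, acc, mcc = binary_metrics(tp=70, fp=25, tn=75, fn=30)
print(f"wrap_gap={wrap_gap:.3f} far_gap={far_gap:.3f}")
print(f"sensitivity={sens:.3f} accuracy={acc:.3f} mcc={mcc:.3f}")
```

In a fuller pipeline, these four values per residue would presumably be combined with accessible surface area before balancing and classification, but the exact feature layout is not stated in the abstract.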
Original language | English (US) |
---|---|
Article number | 107235 |
Journal | Computational Biology and Chemistry |
Volume | 87 |
State | Published - Aug 2020 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Structural Biology
- Biochemistry
- Organic Chemistry
- Computational Mathematics
Keywords
- AdaBoost
- Amino acids
- Computational prediction
- Proteins
- Sumoylation