Are there any differences between features of proteins expressed in malignant and benign breast cancers?

Mansour Ebrahimi, Esmaeil Ebrahimie, Narges Shamabadi, Mahdi Ebrahimi

Abstract


  • BACKGROUND: Breast cancer is the most common cancer among women and the second leading cause of cancer death in women. Many approaches have been used to analyze and distinguish benign and malignant forms of the disease, and understanding the features of proteins expressed by various types of breast cancer is crucial.
  • METHODS: Features of proteins expressed in malignant cancers, in benign cancers, and in both were compared using different screening techniques, clustering methods, decision tree models and generalized rule induction (GRI) algorithms to look for patterns of similarity in the two groups (benign and malignant breast cancer).
  • RESULTS: The findings showed that the N-terminal amino acid was Met and that 57 of 838 protein features were ranked as important (p > 0.05). The depth of the trees induced by the tree induction models varied from 5 branches (in the Quest model) to 2 branches (in the C5.0 model). The best performance evaluation was obtained with the C&RT model and the worst with the CHAID model. Applying feature selection to the datasets produced no significant difference in the percentage of correctness, performance evaluation, or mean correctness of the tree induction algorithms, but it significantly reduced the number of peer groups (p < 0.05).
  • CONCLUSIONS: The frequency of Ile-Ile was the most important protein attribute in all tree and rule induction models. The importance of sequence-based classification and of the Ile-Ile frequency in predicting malignant and benign breast cancer is discussed.
  • KEYWORDS: Bioinformatics, Modeling, Breast Cancer, Malignant, Benign.
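
The abstract does not state how the Ile-Ile frequency was computed. As a minimal sketch, assuming it is the number of overlapping Ile-Ile pairs ("II" in one-letter code) divided by the number of dipeptide positions in the sequence, the feature could be derived as follows. The function name and the example sequence are hypothetical and are not taken from the study.

```python
def dipeptide_frequency(sequence: str, dipeptide: str = "II") -> float:
    """Fraction of overlapping dipeptide positions that match `dipeptide`.

    Assumption: frequency = matching positions / (sequence length - 1).
    The normalization used in the study may differ.
    """
    seq = sequence.upper()
    n_positions = len(seq) - 1  # number of overlapping dipeptides
    if n_positions < 1:
        return 0.0
    count = sum(1 for i in range(n_positions) if seq[i:i + 2] == dipeptide)
    return count / n_positions


# Made-up five-residue sequence: dipeptides are MI, II, II, IA.
example = "MIIIA"
print(dipeptide_frequency(example))
```

A value computed this way for each protein could serve as one column of the feature table passed to the tree and rule induction models.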
