Feature Selection & Biomarkers

Scientific Research Project | 2018

Summary

In this project we evaluated candidate proteins as biomarkers for oral cancer. The work combined many analyses and approaches from discovery and targeted proteomics and was carried out in collaboration with the Brazilian Bioscience National Laboratory. My part was to apply statistical and machine learning tools to evaluate how well candidate protein and peptide signatures discriminate between the N0 and N+ oral cancer conditions described in the article.

Our work was published in Nature Communications, was covered on radio, TV and in magazines, and received an award for innovation in integrating Computer Science methods and Biology for Health.

Category

Machine Learning
Feature Selection
Statistics

My job

Develop
Research
Report

Technology

R
Python
Random Forests
Statistical tests
False discovery rate

Publication

Carnielli, C. M., Macedo, C. C. S., De Rossi, T., Granato, D. C., Rivera, C., Domingues, R. R., … Paes Leme, A. F. (2018).
Combining discovery and targeted proteomics reveals a prognostic signature in oral cancer.
Nature Communications, 9(1), 3598.

Gallery

Machine Learning analysis overview

Step by step from creating signatures to evaluating them through cross-validation.
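
The cross-validation step can be illustrated with a small sketch. This is not the project's actual pipeline: the function names, the fold count and the toy labels below are assumptions made for illustration. It only shows how samples might be split into folds that keep the N0/N+ proportions similar, so that each candidate signature is scored on samples it was not fitted on.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds


def train_test_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    for test in stratified_folds(labels, k, seed):
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        yield train, sorted(test)


# Hypothetical labels: 6 patients without (N0) and 4 with (N+) lymph node metastasis.
labels = ["N0"] * 6 + ["N+"] * 4
folds = stratified_folds(labels, k=2)

for train, test in train_test_splits(labels, k=2):
    print("train:", train, "test:", test)
```

In a full evaluation, a classifier would be fitted on each training split and scored on the matching test split, and the scores averaged across folds.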

Univariate prediction power

Classifiers built with a single protein or peptide were assessed to show how well each one performs on its own in discriminating between the classes N0 and N+.
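
A simplified way to picture a univariate assessment is to use each feature's raw abundance as the score and compute its ROC AUC directly. Note that this swaps the single-feature classifiers described above for a plain rank-based AUC, and the feature names and values are made up for illustration.

```python
def auc_single_feature(values, labels, positive="N+"):
    """ROC AUC of one feature: the probability that a positive sample has a
    higher value than a negative one, counting ties as one half."""
    pos = [v for v, y in zip(values, labels) if y == positive]
    neg = [v for v, y in zip(values, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(table, labels, positive="N+"):
    """Sort features by how far their AUC is from 0.5 (chance level)."""
    scores = {
        name: auc_single_feature(values, labels, positive)
        for name, values in table.items()
    }
    # A feature that is lower in the positive class (AUC near 0) separates
    # the classes as well as one that is higher (AUC near 1).
    return sorted(scores.items(), key=lambda kv: abs(kv[1] - 0.5), reverse=True)


# Hypothetical abundances for three features across six samples.
labels = ["N0", "N0", "N0", "N+", "N+", "N+"]
table = {
    "protein_A": [1, 2, 3, 4, 5, 6],
    "protein_B": [3, 1, 4, 2, 5, 3],
    "protein_C": [6, 5, 4, 3, 2, 1],
}
ranking = rank_features(table, labels)
for name, auc in ranking:
    print(f"{name}: AUC = {auc:.3f}")
```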

ROC plots

ROC curves and AUC values computed for the best candidate signatures (sets of features).
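
For readers unfamiliar with how such a plot is built, the sketch below traces a ROC curve from classifier scores and integrates it with the trapezoid rule. The scores and labels are invented, and the function names are illustrative rather than taken from the project code.

```python
def roc_curve_points(scores, labels, positive="N+"):
    """Return (false positive rate, true positive rate) points obtained by
    lowering the decision threshold one distinct score at a time."""
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == positive:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical predicted probabilities of the N+ class for six samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = ["N+", "N+", "N0", "N+", "N0", "N0"]

points = roc_curve_points(scores, labels)
auc = trapezoid_auc(points)
print(points)
print(f"AUC = {auc:.3f}")
```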

Distribution of ROC AUC

All possible signatures were evaluated. The plot shows the distribution of the computed ROC AUC values and highlights the positions of the best signatures S1, S2, S3 and S4.
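
The idea of scoring every possible signature and then looking at the distribution of AUC values can be sketched as follows. Several things here are assumptions: the feature pool, the size limit, and especially the scoring rule, which simply averages the member features instead of training a classifier under cross-validation.

```python
from itertools import combinations


def pairwise_auc(scores, labels, positive="N+"):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_all_signatures(table, labels, max_size):
    """Score every feature subset of size 1..max_size."""
    names = sorted(table)
    results = {}
    for size in range(1, max_size + 1):
        for signature in combinations(names, size):
            # Stand-in scoring rule (an assumption): average the member features.
            combined = [
                sum(table[name][i] for name in signature) / size
                for i in range(len(labels))
            ]
            results[signature] = pairwise_auc(combined, labels)
    return results


def histogram(values, bins=5):
    """Count values in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts


# Hypothetical abundances for three features across six samples.
labels = ["N0", "N0", "N0", "N+", "N+", "N+"]
table = {
    "protein_A": [1, 2, 3, 4, 5, 6],
    "protein_B": [3, 1, 4, 2, 5, 3],
    "protein_C": [2, 2, 1, 3, 3, 5],
}

results = evaluate_all_signatures(table, labels, max_size=3)
counts = histogram(results.values())
best = max(results, key=results.get)
print("AUC histogram:", counts)
print("best signature:", best, "AUC =", round(results[best], 3))
```

The number of subsets grows exponentially with the size of the feature pool, so an exhaustive search like this is only practical for a short list of candidates.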
