Projects
Cancer Subtype Multi-Class Classification in Gene Expression Data
This project addresses a multi-class classification task: distinguishing five tumor types from RNA-seq gene expression data. It introduces participants to data cleaning, clustering, feature selection methods, and machine learning modeling.
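To give a feel for the kind of workflow involved, here is a minimal sketch that chains feature selection with a multi-class classifier. Everything in it is an assumption made for illustration: the data are synthetic (not the project's RNA-seq dataset), the function names are hypothetical, and the variance filter and nearest-centroid classifier are simple stand-ins chosen so the example runs with only the Python standard library. They are not necessarily the methods used in the project.

```python
import random
from statistics import mean, pvariance


def simulate_expression(n_per_class=20, n_classes=5, n_genes=50, shift=5.0, seed=0):
    """Synthetic 'expression' matrix: gene i is shifted upward in class i only."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            sample = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            sample[label] += shift  # marker gene for this class
            X.append(sample)
            y.append(label)
    return X, y


def top_variance_genes(X, k):
    """Feature selection: indices of the k genes with the highest variance."""
    n_genes = len(X[0])
    variances = [pvariance([row[g] for row in X]) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: variances[g], reverse=True)[:k]


def fit_centroids(X, y, genes):
    """Per-class mean expression over the selected genes."""
    centroids = {}
    for label in sorted(set(y)):
        members = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(row[g] for row in members) for g in genes]
    return centroids


def predict(sample, centroids, genes):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((sample[g] - c) ** 2 for g, c in zip(genes, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


X, y = simulate_expression()

# Alternate samples between train and test so every class appears in both.
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

# Select features on the training split only, to avoid leaking test information.
genes = top_variance_genes(X_train, k=5)
centroids = fit_centroids(X_train, y_train, genes)

predictions = [predict(sample, centroids, genes) for sample in X_test]
accuracy = mean(int(p == t) for p, t in zip(predictions, y_test))

print("selected genes:", sorted(genes))
print(f"test accuracy: {accuracy:.2f}")
```

Selecting features on the training split alone is the design choice worth noting: choosing genes with the test samples included would make the reported accuracy optimistic.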
Modeling Complex Binary-Class Associations in Simulated Genomic Data
This project addresses binary classification tasks (case/control) across a variety of simulated single nucleotide polymorphism (SNP) datasets. Each dataset has a different underlying form of complex association (e.g., multivariate additive effects, epistasis, or genetic heterogeneity). It introduces participants to basic data preparation, feature selection methods, machine learning (ML) modeling algorithms, automated machine learning, and model explanation and interpretation.
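As an illustration of why an epistatic association is harder to detect than an additive one, the sketch below simulates a toy dataset in which case status depends only on the combination of two SNPs. It is a hedged example under stated assumptions, not the project's simulation: genotypes are simplified to 0/1 (rather than the usual 0/1/2 coding), the interaction is a pure XOR with no noise, and the function names `simulate_epistatic_snps` and `case_rate` are hypothetical.

```python
import random


def simulate_epistatic_snps(n_samples=2000, n_noise=3, seed=0):
    """Simulate 0/1 genotypes where SNP 0 and SNP 1 interact (pure XOR epistasis)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_samples):
        # Simplification: each SNP is coded 0/1, not the usual 0/1/2.
        snps = [rng.randint(0, 1) for _ in range(2 + n_noise)]
        # Case status (1) depends only on the combination of SNP 0 and SNP 1.
        labels.append(snps[0] ^ snps[1])
        rows.append(snps)
    return rows, labels


def case_rate(rows, labels, predicate):
    """Fraction of cases among samples whose genotype satisfies `predicate`."""
    selected = [y for x, y in zip(rows, labels) if predicate(x)]
    return sum(selected) / len(selected)


rows, labels = simulate_epistatic_snps()

# Marginal view: looking at either SNP alone, the case rate stays near 0.5.
marginal_snp0 = case_rate(rows, labels, lambda x: x[0] == 1)
marginal_snp1 = case_rate(rows, labels, lambda x: x[1] == 1)

# Joint view: the pair of SNPs determines case status exactly.
joint_discordant = case_rate(rows, labels, lambda x: x[0] != x[1])
joint_concordant = case_rate(rows, labels, lambda x: x[0] == x[1])

print(f"case rate given SNP0 = 1:     {marginal_snp0:.2f}")
print(f"case rate given SNP1 = 1:     {marginal_snp1:.2f}")
print(f"case rate given SNP0 != SNP1: {joint_discordant:.2f}")
print(f"case rate given SNP0 == SNP1: {joint_concordant:.2f}")
```

Because neither SNP shows a marginal effect here, a feature selection method that scores each SNP independently would tend to rank both alongside the noise SNPs, even though together they predict the outcome perfectly.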