Integration of Bio-Chemical data
Predicting Genetic Regulatory Response Using Classification
In this joint work with Christina Leslie, Chris Wiggins, and their students, we used a boosting-based classification method called Alternating Decision Trees to combine gene regulation data obtained from microarrays with information about the binding sites associated with each gene. The resulting model is a classification rule that predicts the state of each gene from the state of the regulatory genes and the motifs that appear in the regulatory sequence upstream of the gene.
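To make the idea of such a classification rule concrete, here is a minimal sketch in Python. It is not the Alternating Decision Tree method used in the work; it substitutes plain discrete AdaBoost over single-feature rules, which is simpler to show. The feature encoding (one binary feature per motif/regulator pair), the function names, and the toy data are all assumptions made for illustration.

```python
import math
from itertools import product

def make_features(motifs, regulators_up):
    # Hypothetical encoding: one feature per (motif, regulator) pair, switched
    # on when the gene carries the motif and the regulator is up in the experiment.
    return [int(m and r) for m in motifs for r in regulators_up]

def stump_vote(x, j, polarity):
    # A weak rule votes `polarity` when feature j is on and -polarity otherwise.
    return polarity if x[j] else -polarity

def train_adaboost(X, y, rounds=3):
    """Discrete AdaBoost over single-feature rules; labels are +1 (up) / -1 (down)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        # Pick the rule with the lowest weighted error.
        best = None
        for j in range(len(X[0])):
            for polarity in (+1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump_vote(xi, j, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, polarity, alpha))
        # Raise the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_vote(xi, j, polarity))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    score = sum(alpha * stump_vote(x, j, polarity) for j, polarity, alpha in model)
    return 1 if score >= 0 else -1

# Toy data: 2 motifs x 2 regulators; a gene is "up" only when it carries
# motif 0 and regulator 0 is up (an invented rule, not a biological result).
X, y = [], []
for m0, m1, r0, r1 in product([0, 1], repeat=4):
    X.append(make_features([m0, m1], [r0, r1]))
    y.append(1 if (m0 and r0) else -1)

model = train_adaboost(X, y)
predictions = [predict(model, x) for x in X]
```

On this toy set the first rule selected is the (motif 0, regulator 0) pair, which is exactly the pattern used to generate the labels.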
Identifying metabolic enzymes with multiple types of association evidence
In this joint work with Dennis Vitkup, George Church, and his students, we used boosting to identify “orphan” genes in metabolic networks. The goal of the method is to rank-sort all genes in the organism according to the likelihood that the gene corresponds to a particular unidentified metabolic enzyme. The method combines a diverse set of databases to generate the ranking. Because each database on its own provides only weak evidence, combining them with boosting is a natural approach, and it proves effective.
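The ranking step can be sketched as follows. This is only an illustration of how an already-fitted boosted ensemble turns several weak evidence sources into one ordering; the evidence source names, thresholds, vote weights, and gene names below are hypothetical and are not taken from the work itself.

```python
# Hypothetical weak rules of the kind a boosting run might produce:
# (evidence source, threshold, vote weight). Values are illustrative only.
WEAK_RULES = [
    ("coexpression", 0.5, 0.8),
    ("phylo_profile", 0.6, 0.6),
    ("chrom_proximity", 0.3, 0.4),
]

def boosted_score(evidence, rules=WEAK_RULES):
    """Sum of weighted +/-1 votes; a missing evidence source casts no vote."""
    score = 0.0
    for source, threshold, alpha in rules:
        if source in evidence:
            score += alpha if evidence[source] > threshold else -alpha
    return score

def rank_candidates(candidates, rules=WEAK_RULES):
    """Return gene names sorted from highest to lowest combined score."""
    return sorted(candidates,
                  key=lambda gene: boosted_score(candidates[gene], rules),
                  reverse=True)

# Invented candidate genes with per-source evidence scores in [0, 1].
candidates = {
    "geneA": {"coexpression": 0.9, "phylo_profile": 0.2, "chrom_proximity": 0.1},
    "geneB": {"coexpression": 0.4, "phylo_profile": 0.7, "chrom_proximity": 0.5},
    "geneC": {"coexpression": 0.7, "phylo_profile": 0.8},
    "geneD": {"coexpression": 0.1, "phylo_profile": 0.1, "chrom_proximity": 0.2},
}

ranking = rank_candidates(candidates)
```

In this toy example, geneA has the strongest co-expression evidence yet ranks below geneB and geneC once the other sources vote, which is the sense in which individually weak evidence is combined into a single ranking.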