Yoav Freund



Yoav Freund is a professor of Computer Science and Engineering at UC San Diego. His work is in machine learning, computational statistics, and their applications in biology, image processing, and signal processing. Dr. Freund is an internationally known researcher in machine learning, a field that bridges computer science and statistics. He is best known for his joint work with Dr. Robert Schapire on the AdaBoost algorithm. For this work they were awarded the 2003 Gödel Prize in Theoretical Computer Science and the Kanellakis Prize in 2004. In 2008 Dr. Freund was elected an AAAI Fellow.

Listed in the Thomson ISI directory of highly cited researchers.

My research these days is in applications of machine learning, in particular, applications to biology and medicine. Modern experiments in biology generate massive amounts of raw data. Transforming this raw data into knowledge is a central challenge.

Image analysis

For example, using fluorescence in situ hybridization (FISH), researchers are able to identify the regulatory state of individual cells in a Drosophila embryo. However, converting the 3D images generated by the confocal microscope into a list describing the location and state of each cell is still, by and large, a slow manual process.

Significant progress in automating this process has been achieved using computer vision methods. However, applying existing computer vision methods to a new experimental setting often produces disappointing results. Significant tuning of the method is usually needed, and because this tuning requires a good understanding of computer vision, the task is usually beyond the abilities of the experimentalist.

My approach to this problem is to use machine learning to create an interface between the computer vision algorithm and the experimentalist. Rather than asking the experimentalists to adjust the parameters of the computer vision algorithm, we provide them with a graphical interface through which they can annotate the image or identify the locations where the computer vision method made an error. Using this feedback, the machine learning algorithm adjusts the parameters of the computer vision method until the performance of the method is satisfactory. Here are some examples of using this approach.
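To make the feedback loop concrete, here is a toy sketch in Python. It is only an illustration and not the actual system or any of the examples referred to above: the "computer vision method" is assumed to be a single brightness threshold, the annotations are assumed to be a per-pixel yes/no label, and the names `detect`, `count_errors`, and `tune_threshold` are hypothetical. A real system would tune many parameters with a proper learning algorithm rather than a small grid search.

```python
from typing import List, Sequence


def detect(intensities: Sequence[float], threshold: float) -> List[bool]:
    """Hypothetical stand-in for a vision method: flag pixels brighter than threshold."""
    return [x > threshold for x in intensities]


def count_errors(predicted: Sequence[bool], annotated: Sequence[bool]) -> int:
    """Number of positions where the detector disagrees with the annotator."""
    return sum(p != a for p, a in zip(predicted, annotated))


def tune_threshold(
    intensities: Sequence[float],
    annotated: Sequence[bool],
    candidates: Sequence[float],
) -> float:
    """Pick the candidate threshold that disagrees least with the annotations."""
    return min(
        candidates,
        key=lambda t: count_errors(detect(intensities, t), annotated),
    )


# Made-up data: pixel intensities and the experimentalist's yes/no marks.
intensities = [0.1, 0.8, 0.3, 0.9, 0.2, 0.7]
annotated = [False, True, False, True, False, True]
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]

best = tune_threshold(intensities, annotated, candidates)
print(best, count_errors(detect(intensities, best), annotated))
```

The point of the sketch is the division of labor: the experimentalist only supplies `annotated`, and the parameter search happens without them ever touching `threshold` directly.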

Data integration for inference in biochemical networks

As biological systems are complex, so is the experimental data. In order to study biochemical processes in vivo, one has to simultaneously measure the concentration levels of many different molecules. Microarrays provide a way of measuring the RNA concentration levels for thousands of genes at once. In addition, there is usually a variety of other sources of information, including DNA sequence information, phylogenetic information, and data on protein-protein interactions. Each of these data sources is limited in its reliability and coverage. One of the central challenges for research in systems biology is to devise methods for combining such diverse information sources and constructing meaningful and statistically significant models of the biochemical processes. Machine learning in general, and boosting in particular, have proved to be useful in this context. Here are some papers on this subject.

Protein Sequence Analysis

This is a new home page that is under construction. My current web page is here.