Online Learning, Theory and applications

From SeedWiki

Jump to: navigation, search

The subject of this course is online learning. A previous course focused on the theoretical aspects of online learning, this course will emphasize applications of this approach to real world problems.



CSE254, Winter 2010 Teacher: Yoav Freund Time: TuTh 12:30p - 1:50p Room: EBU3B 2154 Pre-requisites: CSE103: introduction to probability and statistics or equivalent. Project: Grades for the class will be based on a project. The project will be managed through this wiki (see list below) Students are expected to devote 8 hours per week to work on the project.

Office Hours: Tuesdays from 3:30 to 5:30 PM

Student Projects

Simulation based projects will consist of a simulation of a "plant". A plant can be a physical system such as a pendulum, an RC circuit or a bouncing ball, or it can be a system that is part of a computer. The main part of the project would be to design and experiment with a controller for the plant. Students can choose the language to work in and whether to work in a group or as individuals. All students taking the course must have a project wiki page by January 19.

Theoretical foundations

  • Regret-minimization as the basis for online learning and online decision making.
  • the exponential weights algorithm.
  • NormalHedge.
  • A solution to the multiple arms bandit problem.
  • Basic game theory: The min-max theorem. Convergence to optimal mixed strategies in the zero sum game. Convergence to Correlated Equilibrium in the non zero sum game.

Resource allocation and System management

Computer systems of all sizes, from embedded systems to data centers are getting increasingly complex and increasingly resource limited. Traditional resources are computer time and memory, resources that are rapidly increasing in importance are energy consumption, heat dispersion, I/O bandwidth and computer-to-computer communication bandwidth.

Programmers have very limited knowledge of the resources consumed by the programs they write. They develop their program within a development framework, where the environment of the program is highly simplified so that they can make sure that the program is correct. When the program is then distributed and installed into the context in which it is used the environment of the program becomes infinitely more complex. The resources that are consumed depend on the combination of program that are used, it is not a simple sum of the resources used by each program alone. The performance is also highly dependent on the distribution of computational tasks placed on the system. A consequence of this is that resource management and optimization has to be done after the programs have been combined and the complete system is performing it's job.

Online learning is a framework for solving this resource management and optimization problem. Instead of committing ahead of time to a particular policy for managing resources, the idea is to consider a large and constantly evolving set of resource management policies (experts). Resource allocation decisions are made using those policies whose decisions have resulted in lower resource allocation in the recent past. The mathematical analysis of this approach is based on the game theory and provides a solid link between resource management decisions and the monetary cost that is the consequence of these decisions.

Currently, as far as I know, there is no application of this approach in practical systems. The goal of this course is to introduce the concept to students that are working in applied fields such as systems, networking and embedded systems. The students will create prototype systems to test the concepts taught in class in their area of expertise.

Examples of possible projects:

  • Simple:
    • Disk Spin-Down.
    • maintaining a wireless connection
    • putting computer into sleep mode.
  • Complex:
    • Energy management: in data centers, on computer chips.
    • QoS management in computer networks.
    • Cognitive wireless communication.

Tracking of objects in sensor networks

One of the fundamental tasks when analyzing data in sensor networks is to detect and track physical objects as they move through space. To perform this task reliably one wants to integrate the sensory data coming from all of the sensors that can sense the object. This is a very challanging task because each sensor has different capabilities, different noise characteristics and different failure modes. Some of these characteristics can be changed dynamically by external commands (for example, a PTZ video camera can change it's pan, tilt and zoom).

An example of such a system is the automatic cameraman

The application of online learning to the task of tracking is based on the idea of particle filters. It is different from traditional particle filters in that it is based on an online algorithm (NormalHedge) rather than on a Bayesian analysis. The advantage of this new approach is that it is more robust against arbitrary noise and requires a smaller number of particles to achieve good performance.

Here is a paper about out current work in this area:

Personal tools