NormalHedge and Active Learning
- Scenario: We are given a dataset of n examples, only m of which are labeled (labels are binary). We assume that the examples are generated IID according to a joint distribution D over instances and labels (we think of each unlabeled example as having a label which has yet to be revealed). We assume that m is much smaller than n.
- We are given a concept class C.
- Using the unlabeled examples, and ignoring computational complexity, we can compute an ε-cover of C of size N. If we put a weight of 1/N on each concept in the ε-cover, we get a distribution over concepts that we call the prior induced by the ε-cover (see the sketch below).
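To make this construction concrete, here is a minimal sketch of a greedy ε-cover, assuming C is given as a finite (or enumerable) list of concepts and using empirical disagreement on the unlabeled sample as the metric. The names `greedy_epsilon_cover` and `empirical_distance` are illustrative, not from the notes.

```python
import numpy as np

def empirical_distance(h1, h2, unlabeled):
    """Fraction of unlabeled points on which two concepts disagree."""
    preds1 = np.array([h1(x) for x in unlabeled])
    preds2 = np.array([h2(x) for x in unlabeled])
    return np.mean(preds1 != preds2)

def greedy_epsilon_cover(concepts, unlabeled, eps):
    """Greedily pick centers until every concept is within eps
    (in empirical disagreement) of some chosen center."""
    cover = []
    uncovered = list(concepts)
    while uncovered:
        center = uncovered[0]  # any uncovered concept can serve as a center
        cover.append(center)
        uncovered = [h for h in uncovered
                     if empirical_distance(h, center, unlabeled) > eps]
    return cover

# The induced prior puts weight 1/N on each of the N cover elements.
```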
- We say that the concept class C is smooth with respect to the distribution D if there exists an "ideal" prior distribution P over C such that sampling N(ε) concepts IID according to P will, with high probability, yield an ε-cover of C.
- First task: suppose we don't know the ideal prior P, but we do know some base measure (call it Q) such that, for any measurable set of concepts, the ratio between its probability under Q and its probability under P is bounded in the range [ε, 1/ε]. Devise an algorithm that uses the unlabeled examples to create a sample from the ideal prior (one natural direction, rejection sampling, is sketched below).
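As a starting point for the task, here is a minimal rejection-sampling sketch. It assumes access to a function `density_ratio` returning (or estimating, e.g., from the unlabeled examples) the pointwise ratio r(h) = P(h)/Q(h); both function names and the availability of such an estimate are assumptions, not part of the notes. Since the ratio is bounded by 1/ε, the constant M = 1/ε dominates r, so accepted draws are distributed according to P.

```python
import random

def rejection_sample_from_ideal_prior(sample_base, density_ratio, eps, rng=random):
    """Draw one concept from the ideal prior P by rejection sampling.

    sample_base   : draws a concept h from the base measure Q (assumed available).
    density_ratio : returns r(h) = P(h) / Q(h); by the boundedness assumption,
                    eps <= r(h) <= 1/eps, so M = 1/eps upper-bounds r.
    The expected number of proposals per accepted sample is at most M = 1/eps.
    """
    M = 1.0 / eps
    while True:
        h = sample_base()
        if rng.random() < density_ratio(h) / M:  # accept with probability r(h)/M
            return h
```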
- Basic learning algorithm: