Informational Complexity Notion of Reduction for Concept Classes
Shai Ben-David, Cornell University and Technion
Joint work with Ami Litman, Technion

Measures of the Informational Complexity of a class
- The VC-dimension of the class.
- The sample complexity for learning the class from random examples.
- The optimal mistake bound for learning the class online (or the query complexity of learning the class using membership and equivalence queries).
- The size of the minimal compression scheme for the class.
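To make the first of these measures concrete, here is a minimal brute-force sketch (mine, not part of the talk) that computes the VC-dimension of a finite concept class presented as a membership relation; the interval class in the demo is a hypothetical stand-in.

```python
from itertools import combinations

def vc_dimension(domain, concepts, relation):
    """Largest d such that some d-element subset of the domain is shattered.

    relation(x, y) is True iff point x is labeled positive by concept y.
    Pure brute force: only feasible for small finite classes.
    """
    best = 0
    for d in range(1, len(domain) + 1):
        shattered_some = False
        for subset in combinations(domain, d):
            # All labelings the class induces on this subset.
            patterns = {tuple(relation(x, y) for x in subset) for y in concepts}
            if len(patterns) == 2 ** d:  # every labeling realized => shattered
                best, shattered_some = d, True
                break
        if not shattered_some:  # no d-set shattered => no larger set is either
            break
    return best

# Demo: intervals [a, b] over the domain {0, ..., 5}.
domain = list(range(6))
concepts = [(a, b) for a in range(6) for b in range(a, 6)]
relation = lambda x, y: y[0] <= x <= y[1]
print(vc_dimension(domain, concepts, relation))  # prints 2
```

Intervals shatter pairs but no triple, since the label pattern (1, 0, 1) on three ordered points is unrealizable, so the demo prints 2.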

Outline of the talk
- Defining our reductions, and the induced notion of complete concept classes.
- Introducing a specific family of classes that contains many natural concept classes.
- Proving that the class of half-spaces is complete w.r.t. that family.
- Demonstrating some non-reducibility results.
- Corollaries concerning the existence of compression schemes.

Defining Reductions
We consider pairs (X, Y) where X is a domain and Y is a set of concepts. A concept class is a relation R over X × Y (so each y ∈ Y can be viewed as the subset {x : (x, y) ∈ R} of X).
- An embedding of C = (X, Y, R) into C' = (X', Y', R') is a pair of functions φ : X → X' and ψ : Y → Y' such that (x, y) ∈ R iff (φ(x), ψ(y)) ∈ R'.
- C reduces to C', denoted C ≤ C', if such an embedding exists.
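For finite classes, the embedding condition can be checked mechanically. The sketch below (mine, not from the talk; the toy classes and the names phi and psi are assumptions) embeds the class of singletons over a 3-point domain into the class of intervals.

```python
def is_embedding(C, C_prime, phi, psi):
    """Check that (x, y) is in R iff (phi(x), psi(y)) is in R'.

    C and C_prime are (domain, concepts, relation) triples, with each
    relation given as a set of (point, concept) pairs.
    """
    X, Y, R = C
    _, _, R_prime = C_prime
    return all(((x, y) in R) == ((phi(x), psi(y)) in R_prime)
               for x in X for y in Y)

# C: singletons over {0, 1, 2}; concept y is the set {y}.
X, Y = [0, 1, 2], [0, 1, 2]
R = {(x, y) for x in X for y in Y if x == y}

# C': intervals [a, b] over {0, ..., 5}.
Xp = list(range(6))
Yp = [(a, b) for a in range(6) for b in range(a, 6)]
Rp = {(x, ab) for x in Xp for ab in Yp if ab[0] <= x <= ab[1]}

phi = lambda x: 2 * x            # map points into the larger domain
psi = lambda y: (2 * y, 2 * y)   # a singleton becomes a degenerate interval
print(is_embedding((X, Y, R), (Xp, Yp, Rp), phi, psi))  # True
```

Note that neither map needs to be surjective; the definition only asks that membership be preserved in both directions.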

Relationship to Info Complexity
If C ≤ C' then, for each of the complexity parameters mentioned above, C' is at least as complex as C. E.g., if C ≤ C', then for every ε and δ, the sample complexity of (ε, δ)-learning C is at most that needed for learning C'. (This is in the agnostic prediction model.)

Immediate observations
- If we take into account the computational complexity of the embedding functions, then we can also bound the computational complexity of learning C by that of learning C'.
- For every k, the class of all binary functions on a k-size domain is minimal w.r.t. the family of all classes having VC-dimension k.

Universal Classes
We say that a concept class C is universal for a family of classes F if every member of F reduces to C. Universal classes play a role analogous to that of, say, NP-hard decision problems: they are as complex as any member of the family F.

Some important classes
- For an integer k, let HS_k denote the class of half spaces over ℝ^k. That is, HS_k = (ℝ^k, ℝ^{k+1}, H), where ((x_1,…,x_k), (a_1,…,a_{k+1})) ∈ H iff Σ_i a_i x_i + a_{k+1} ≥ 0.
- Let PHS_k denote the class of positive half spaces, that is, half spaces in which a_1 = 1.
- Finally, let HS_k^0 denote the class of homogeneous half spaces (i.e., those having a_{k+1} = 0), and PHS_k^0 the class of positive and homogeneous half spaces.
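As a small sketch (mine, not from the slides) of these definitions in code, with a half-space concept represented by its coefficient vector (a_1,…,a_{k+1}):

```python
import numpy as np

def in_half_space(x, a):
    """HS_k membership: x in R^k, a in R^(k+1), last entry is the bias."""
    return float(np.dot(a[:-1], x) + a[-1]) >= 0.0

def is_positive(a):       # PHS_k: first coefficient fixed to 1
    return a[0] == 1.0

def is_homogeneous(a):    # HS_k^0: zero bias term
    return a[-1] == 0.0

x = np.array([0.5, -1.0])
a = np.array([1.0, 2.0, 0.0])  # a positive, homogeneous half space over R^2
print(in_half_space(x, a), is_positive(a), is_homogeneous(a))
```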

Half Spaces and Completeness
The first family of classes that comes to mind is the family VC_n, the family of all concept classes having VC-dimension n.
Theorem: For any n > 2, no class HS_k is universal for VC_n. (This holds even if we consider only finite classes.)

Dudley Classes (1)
Next, we define a rich subfamily of VC_n for which classes of half spaces are universal. Let F be a family of real-valued functions over some domain set X, and let h be any real-valued function over X. Define a concept class D_{F,h} = (X, F, R_{F,h}) where R_{F,h} = {(x, f) : f(x) + h(x) ≥ 0}. (Note that all the PPDs defined by Adam yesterday were of this form.)

Dudley Classes (2)
Classes of the form D_{F,h} = (X, F, R_{F,h}) are called Dudley classes if the family of functions F is a vector space over the reals (with respect to point-wise addition and scalar multiplication).
Examples of Dudley classes: HS_k, PHS_k, HS_k^0, PHS_k^0, and the class of all balls in any Euclidean space ℝ^k.
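To see why balls qualify (a worked example of mine, not spelled out on the slide): the ball {x : ‖x − c‖² ≤ r²} can be rewritten as f(x) + h(x) ≥ 0 with the fixed function h(x) = −‖x‖² and f(x) = 2⟨c, x⟩ + (r² − ‖c‖²), and such f range over the vector space spanned by the k coordinate functions and the constants. A quick numeric sanity check:

```python
import numpy as np

rng = np.random.default_rng(0)
k, c, r = 3, np.array([1.0, -2.0, 0.5]), 1.7

h = lambda x: -np.dot(x, x)                              # fixed offset function
f = lambda x: 2 * np.dot(c, x) + (r**2 - np.dot(c, c))   # lives in span{x_i, 1}

for _ in range(1000):
    x = rng.normal(size=k) * 3
    in_ball = np.dot(x - c, x - c) <= r**2
    in_dudley = f(x) + h(x) >= 0
    assert in_ball == in_dudley
print("ball membership == Dudley-class membership on all samples")
```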

Dudley’s Theorem
Theorem: If a family of functions F is a vector space, then, for every h, the VC dimension of D_{F,h} equals the (linear) dimension of the vector space F.
Corollary: Easy calculations of the VC dimension of the classes HS_k, PHS_k, HS_k^0, PHS_k^0, and k-dimensional balls.
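For instance (standard calculations, spelled out here rather than taken from the slide): for HS_k the relevant F is the space of affine functions on ℝ^k, of dimension k+1, so VCdim(HS_k) = k+1; for HS_k^0, F is the k-dimensional space of homogeneous linear functions, so VCdim(HS_k^0) = k; fixing a_1 = 1 moves one summand into h and drops the dimension by one, giving VCdim(PHS_k) = k and VCdim(PHS_k^0) = k−1; and for balls in ℝ^k, F is spanned by the k coordinate functions and the constants, giving VC dimension k+1.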

A Completeness Theorem
Theorem: For every k, PHS_{k+1}^0 is universal (and therefore complete) for the family of all k-dimensional Dudley classes.
Proof: Let f_1,…,f_k be a basis for the vector space F. Define φ : X → ℝ^{k+1} and ψ : F → ℝ^{k+2} by φ(x) = (f_1(x),…,f_k(x), h(x)), and, for f = Σ a_i f_i, ψ(f) = (a_1,…,a_k, 1, 0).
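A sketch of this construction in code (mine; it reuses the balls-in-ℝ³ instance from the earlier example, so F has dimension k = 4, and all names are assumptions). Since ψ(f) has a fixed coefficient 1 and a zero bias, evaluating the embedded half space reduces to a plain inner product:

```python
import numpy as np

rng = np.random.default_rng(1)

# Dudley class of balls in R^3: F = span{x_1, x_2, x_3, 1}, h(x) = -||x||^2.
basis = [lambda x, i=i: x[i] for i in range(3)] + [lambda x: 1.0]
h = lambda x: -np.dot(x, x)

def phi(x):
    """phi(x) = (f_1(x), ..., f_k(x), h(x))."""
    return np.array([f(x) for f in basis] + [h(x)])

def psi(coeffs):
    """psi(sum a_i f_i) = (a_1, ..., a_k, 1); the trailing 0 bias is left
    implicit because we evaluate the half space as a plain inner product."""
    return np.append(coeffs, 1.0)

# Concept: the ball of radius r around c, i.e. f + h >= 0 with
# f = 2<c, .> + (r^2 - ||c||^2), so coefficients a = (2c, r^2 - ||c||^2).
c, r = np.array([0.3, -1.0, 2.0]), 2.5
a = np.append(2 * c, r**2 - np.dot(c, c))

for _ in range(1000):
    x = rng.normal(size=3) * 3
    in_concept = np.dot(a[:3], x) + a[3] + h(x) >= 0  # f(x) + h(x) >= 0
    in_half_space = np.dot(psi(a), phi(x)) >= 0       # after embedding
    assert in_concept == in_half_space
print("embedding preserves membership on all samples")
```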

Corollaries
- k-size compression schemes for any k-dimensional Dudley class.
- Learning algorithms for all Dudley classes.
- An easy proof of Dudley's theorem (show that, for any k-dimensional F, the class HS_k^0 is embeddable into D_{F,h} for h = 0).