
1 Version Space Machine Learning Fall 2018

2 VERSION SPACE Concept Learning by Induction Learning has been classified into several types: deductive, inductive, analytical, etc. Much of human learning involves acquiring general concepts from specific training examples (this is called inductive learning)

3 VERSION SPACE Concept Learning by Induction Example: the concept “ball”
* red, round, small
* green, round, small
* red, round, medium
A more complicated concept: “situations in which I should study more to pass the exam”

4 VERSION SPACE Concept Learning by Induction Each concept can be thought of as a Boolean-valued function whose value is true for some inputs and false for all the rest (e.g. a function defined over all animals, whose value is true for birds and false for all other animals). This lecture is about the problem of automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of that concept. This task is called concept learning: approximating (inferring) a Boolean-valued function from examples.

5 VERSION SPACE Concept Learning by Induction Target concept to be learnt: “Days on which Aldo enjoys his favorite water sport” The training examples are:

Example | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1       | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2       | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3       | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4       | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes

6 VERSION SPACE Concept Learning by Induction The training examples are described by the values of seven attributes. The task is to learn to predict the value of the attribute EnjoySport for an arbitrary day, based on the values of the other six attributes.

7 VERSION SPACE Concept Learning by Induction: Hypothesis Representation The possible concepts are called Hypotheses and we need an appropriate representation for the hypotheses Let the hypothesis be a conjunction of constraints on the attribute-values

8 VERSION SPACE Concept Learning by Induction: Hypothesis Representation If (sky = sunny) ∧ (temp = warm) ∧ (humidity = ?) ∧ (wind = strong) ∧ (water = ?) ∧ (forecast = same) then EnjoySport = Yes, else EnjoySport = No. Alternatively, this can be written as: {sunny, warm, ?, strong, ?, same}

9 VERSION SPACE Concept Learning by Induction: Hypothesis Representation For each attribute, the hypothesis will have one of:
? (any value is acceptable)
a single specified value, e.g. warm (only that value is acceptable)
∅ (no value is acceptable)

10 VERSION SPACE Concept Learning by Induction: Hypothesis Representation If some instance (example/observation) satisfies all the constraints of a hypothesis, then it is classified as positive (belonging to the concept). The most general hypothesis is {?, ?, ?, ?, ?, ?}; it would classify every example as positive. The most specific hypothesis is {∅, ∅, ∅, ∅, ∅, ∅}; it would classify every example as negative.
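This representation is easy to implement. Below is a minimal Python sketch (the name matches is my own, not from the lecture) in which a hypothesis is a tuple of six constraints and "0" stands in for the ∅ constraint:

```python
# A hypothesis is a tuple of six constraints: "?" matches any value,
# "0" (standing in for the empty-set constraint) matches no value,
# and anything else matches only that exact value.
def matches(hypothesis, instance):
    """Return True if the instance satisfies every constraint."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

most_general = ("?",) * 6    # classifies every day as positive
most_specific = ("0",) * 6   # classifies every day as negative

day = ("sunny", "warm", "normal", "strong", "warm", "same")
print(matches(most_general, day))   # True
print(matches(most_specific, day))  # False
```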

11 VERSION SPACE Concept Learning by Induction: Hypothesis Representation An alternative hypothesis representation could have been a disjunction of several conjunctions of constraints on the attribute values. Example: {sunny, warm, normal, strong, warm, same} ∨ {sunny, warm, high, strong, warm, same} ∨ {sunny, warm, high, strong, cool, change}

12 VERSION SPACE Concept Learning by Induction: Hypothesis Representation Another alternative hypothesis representation could have been a conjunction of constraints on the attribute values, where each constraint may be a disjunction of values. Example: {sunny, warm, normal ∨ high, strong, warm ∨ cool, same ∨ change}

13 VERSION SPACE Concept Learning by Induction: Hypothesis Representation Yet another alternative hypothesis representation could have incorporated negations. Example: {sunny, warm, ¬(normal ∨ high), ?, ?, ?}

14 VERSION SPACE Concept Learning by Induction: Hypothesis Representation By selecting a hypothesis representation, the space of all hypotheses (that the program can ever represent and therefore can ever learn) is implicitly defined. In our example, the instance space X contains 3·2·2·2·2·2 = 96 distinct instances. There are 5·4·4·4·4·4 = 5120 syntactically distinct hypotheses (each attribute may take one of its values, ?, or ∅). Since every hypothesis containing even one ∅ classifies every instance as negative, the number of semantically distinct hypotheses is 1 + 4·3·3·3·3·3 = 973.
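These counts can be verified with a few lines of Python, assuming the attribute domain sizes 3, 2, 2, 2, 2, 2 from the lecture's example:

```python
# Domain sizes: Sky has 3 values, the other five attributes have 2 each.
domain_sizes = [3, 2, 2, 2, 2, 2]

instances = 1   # |X| = 3*2*2*2*2*2
syntactic = 1   # each slot: its values, plus "?" and the empty constraint
semantic = 1    # each slot: its values plus "?"
for d in domain_sizes:
    instances *= d
    syntactic *= d + 2
    semantic *= d + 1
semantic += 1   # all hypotheses containing an empty constraint collapse into one

print(instances, syntactic, semantic)  # 96 5120 973
```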

15 VERSION SPACE Concept Learning by Induction: Hypothesis Representation Most practical learning tasks involve much larger, sometimes infinite, hypothesis spaces

16 VERSION SPACE Concept Learning by Induction: Search in Hypotheses Space Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation The goal of this search is to find the hypothesis that best fits the training examples

17 VERSION SPACE Concept Learning by Induction: Basic Assumption Once a hypothesis that best fits the training examples is found, we can use it to predict the class label of new examples The basic assumption while using this hypothesis is: Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples

18 VERSION SPACE Concept Learning by Induction: General to Specific Ordering If we view learning as a search problem, then it is natural that our study of learning algorithms will examine different strategies for searching the hypothesis space Many algorithms for concept learning organize the search through the hypothesis space by relying on a general to specific ordering of hypotheses

19 VERSION SPACE Concept Learning by Induction: General to Specific Ordering Example: Consider h1 = {sunny, ?, ?, strong, ?, ?} h2 = {sunny, ?, ?, ?, ?, ?} any instance classified positive by h1 will also be classified positive by h2 (because it imposes fewer constraints on the instance) Hence h2 is more general than h1 and h1 is more specific than h2

20 VERSION SPACE Concept Learning by Induction: General to Specific Ordering Consider the three hypotheses
h1 = {sunny, ?, ?, strong, ?, ?}
h2 = {sunny, ?, ?, ?, ?, ?}
h3 = {sunny, ?, ?, ?, cool, ?}

21 VERSION SPACE Concept Learning by Induction: General to Specific Ordering Neither h1 nor h3 is more general than the other h2 is more general than both h1 and h3 Note that the “more-general-than” relationship is independent of the target concept. It depends only on which instances satisfy the two hypotheses and not on the classification of those instances according to the target concept
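The more-general-than-or-equal-to relation can be checked slot by slot: g is at least as general as s when every instance matched by s is also matched by g. A minimal Python sketch (the function name is my own) for ∅-free hypotheses:

```python
def more_general_or_equal(g, s):
    """True if hypothesis g matches every instance that hypothesis s matches."""
    # Each slot of g must either be "?" or agree exactly with s's slot.
    return all(gc == "?" or gc == sc for gc, sc in zip(g, s))

h1 = ("sunny", "?", "?", "strong", "?", "?")
h2 = ("sunny", "?", "?", "?", "?", "?")
h3 = ("sunny", "?", "?", "?", "cool", "?")

print(more_general_or_equal(h2, h1))  # True: h2 drops h1's wind constraint
print(more_general_or_equal(h1, h3))  # False
print(more_general_or_equal(h3, h1))  # False: h1 and h3 are incomparable
```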

22 VERSION SPACE Find-S Algorithm How to find a hypothesis consistent with the observed training examples? - A hypothesis is consistent with the training examples if it correctly classifies these examples One way is to begin with the most specific possible hypothesis, then generalize it each time it fails to cover a positive training example (i.e. classifies it as negative) The algorithm based on this method is called Find-S

23 VERSION SPACE Find-S Algorithm We say that a hypothesis covers a positive training example if it correctly classifies the example as positive A positive training example is an example of the concept to be learnt Similarly a negative training example is not an example of the concept

24 VERSION SPACE Find-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x:
   For each attribute constraint ai in h:
      If the constraint ai is satisfied by x, do nothing
      Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

25 VERSION SPACE Find-S Algorithm Tracing Find-S on the training examples:
h0 = {∅, ∅, ∅, ∅, ∅, ∅}
After example 1: h = {sunny, warm, normal, strong, warm, same}
After example 2: h = {sunny, warm, ?, strong, warm, same}
Example 3 is negative, so h is unchanged
After example 4: h = {sunny, warm, ?, strong, ?, ?}

26 VERSION SPACE Find-S Algorithm The nodes shown in the diagram are the possible hypotheses allowed by our hypothesis representation scheme Note that our search is guided by the positive examples and we consider only those hypotheses which are consistent with the positive training examples The search moves from hypothesis to hypothesis, searching from the most specific to progressively more general hypotheses

27 VERSION SPACE Find-S Algorithm At each step, the hypothesis is generalized only as far as necessary to cover the new positive example Therefore, at each stage the hypothesis is the most specific hypothesis consistent with the training examples observed up to this point Hence, it is called Find-S

28 VERSION SPACE Find-S Algorithm Note that the algorithm simply ignores every negative example. However, since at each step our current hypothesis is maximally specific, it will never cover (falsely classify as positive) any negative example. In other words, it will always be consistent with each negative training example. However, the data must be noise-free, and our hypothesis representation should be such that the true concept can be described by it.
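Putting the pieces together, here is a minimal Python sketch of Find-S run on the four EnjoySport training days, with hypotheses as tuples, "0" standing in for ∅, and "?" for any value:

```python
# The four training days: (instance, is EnjoySport = Yes?)
examples = [
    (("sunny", "warm", "normal", "strong", "warm", "same"),   True),
    (("sunny", "warm", "high",   "strong", "warm", "same"),   True),
    (("rainy", "cold", "high",   "strong", "warm", "change"), False),
    (("sunny", "warm", "high",   "strong", "cool", "change"), True),
]

def find_s(examples):
    h = ["0"] * 6                  # start with the most specific hypothesis
    for x, positive in examples:
        if not positive:
            continue               # negative examples are simply ignored
        for i, value in enumerate(x):
            if h[i] == "0":
                h[i] = value       # first positive example: adopt its values
            elif h[i] != value:
                h[i] = "?"         # generalize only as far as necessary
    return tuple(h)

print(find_s(examples))  # ('sunny', 'warm', '?', 'strong', '?', '?')
```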

29 VERSION SPACE Find-S Algorithm Problems with Find-S: Has the learner converged to the true target concept? Why prefer the most specific hypothesis? Are the training examples consistent with each other? What if there are several maximally specific consistent hypotheses?

30 VERSION SPACE Definition: Version Space The version space is the set of all hypotheses consistent with the training examples of a problem. The Find-S algorithm finds one hypothesis present in the version space; however, there may be others.
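For a hypothesis space this small, the version space can be computed by brute force: enumerate all 973 semantically distinct hypotheses and keep those consistent with the training data. A sketch, assuming the attribute domains and the four training days used in the lecture's example:

```python
from itertools import product

domains = [
    ["sunny", "cloudy", "rainy"],   # Sky
    ["warm", "cold"],               # AirTemp
    ["normal", "high"],             # Humidity
    ["strong", "weak"],             # Wind
    ["warm", "cool"],               # Water
    ["same", "change"],             # Forecast
]

examples = [
    (("sunny", "warm", "normal", "strong", "warm", "same"),   True),
    (("sunny", "warm", "high",   "strong", "warm", "same"),   True),
    (("rainy", "cold", "high",   "strong", "warm", "change"), False),
    (("sunny", "warm", "high",   "strong", "cool", "change"), True),
]

def matches(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def consistent(h, examples):
    # A hypothesis is consistent if it classifies every example correctly.
    return all(matches(h, x) == positive for x, positive in examples)

# 972 hypotheses without the empty constraint, plus the single all-empty one.
candidates = list(product(*[d + ["?"] for d in domains])) + [("0",) * 6]
version_space = [h for h in candidates if consistent(h, examples)]
print(len(version_space))  # 6
for h in version_space:
    print(h)
```

The six surviving hypotheses range from the most specific, which Find-S would output, up to the most general ones still excluding the negative example.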

