1
Machine Learning

Learning is "any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population" (Herbert Simon, 1983)

Different types of AI learning models:
  Inductive learning
  Explanation-based learning
  Supervised learning
  Unsupervised learning
  Parallel distributed processing (PDP) models
  Neural networks
2
An Inductive Learning Framework
Data and goals of the learning task
Representation of learned knowledge
  Specific instance of the concept "ball": size(obj1, small) ∧ color(obj1, red) ∧ shape(obj1, round)
  General concept "ball": size(X, Y) ∧ color(X, Z) ∧ shape(X, round)
Operations on data
Concept space: the combination of representations and operations
Heuristic search
3
Version Space

A version space is the set of concept descriptions consistent with the training data.

Generalization and specialization operations:
  Replace a constant with a variable: shape(box, cube) → shape(X, cube)
  Drop a conjunct: size(X, large) ∧ shape(X, round) → shape(X, round)
  Add a disjunct: color(X, blue) ∧ shape(X, cube) → color(X, blue) ∧ (shape(X, cube) ∨ shape(X, rectangle))

"Theory" of generalization: given predicate sentences p and q, let P and Q be the sets of all sentences that satisfy p and q, respectively. Expression p is more general than q iff P ⊇ Q.
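For concreteness, "more general" can be read as set inclusion over a finite instance space. The sketch below is illustrative, not from the slides: the helper names are assumptions, and the obj(size, color, shape) space is the one used in the later example slides. It checks P ⊇ Q by enumeration.

from itertools import product

SIZES, COLORS, SHAPES = ("small", "large"), ("red", "white", "blue"), ("ball", "cube", "brick")
SPACE = list(product(SIZES, COLORS, SHAPES))
VARS = {"X", "Y", "Z"}

def covers(hypothesis, instance):
    # A variable matches any value; a constant must match exactly.
    return all(h in VARS or h == i for h, i in zip(hypothesis, instance))

def coverage(hypothesis):
    return {inst for inst in SPACE if covers(hypothesis, inst)}

def more_general(p, q):
    return coverage(p) >= coverage(q)        # set superset test: P contains Q

print(more_general(("X", "Y", "cube"), ("small", "Y", "cube")))   # True
print(more_general(("small", "Y", "cube"), ("X", "Y", "cube")))   # False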
4
Concept Space Searches
Specific to general
  Goal: build a set S of maximally specific generalizations (hypotheses)
  Maintain a set NL of previously observed negative examples (initially empty)
  Initialize S to the first positive training instance
  For each positive instance p:
    For every s in S, if s does not match p, replace s with the most specific generalization of s that matches p
    Remove from S all hypotheses more general than some other hypothesis in S
    Remove from S all hypotheses that match a previously observed negative example in NL
  For each negative instance n:
    Remove from S every hypothesis that matches n
    Add n to NL

General to specific
  Goal: build a set G of maximally general concepts
  Maintain a set PL of previously observed positive examples (initially empty)
  Initialize G to the most general concept in the space
  For each negative instance n:
    For every g in G, if g matches n, replace g with the most general specializations of g that do not match n
    Remove from G all hypotheses more specific than some other hypothesis in G
    Remove from G all hypotheses that fail to match a previously observed positive example in PL
  For each positive instance p:
    Remove from G every hypothesis that fails to match p
    Add p to PL
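A minimal Python sketch of the specific-to-general search over the obj(size, color, shape) representation used on the next slide. The helper names, and the simplification of processing all positives before all negatives while keeping a single hypothesis in S, are assumptions for illustration, not part of the slides.

VARS = ("X", "Y", "Z")   # position i generalizes to variable VARS[i]

def matches(h, inst):
    return all(a in VARS or a == b for a, b in zip(h, inst))

def generalize(h, p):
    # Most specific generalization of h that also matches the positive example p.
    return tuple(a if (a == b or a in VARS) else VARS[i]
                 for i, (a, b) in enumerate(zip(h, p)))

def specific_to_general(positives, negatives):
    S, NL = [positives[0]], []               # S starts at the first positive instance
    for p in positives[1:]:
        S = [s if matches(s, p) else generalize(s, p) for s in S]
        S = [s for s in S if not any(matches(s, n) for n in NL)]
    for n in negatives:
        S = [s for s in S if not matches(s, n)]
        NL.append(n)
    return S

print(specific_to_general(
    [("small", "red", "ball"), ("small", "white", "ball"), ("large", "blue", "ball")],
    [("small", "red", "brick")]))            # -> [('X', 'Y', 'ball')]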
5
Concept Space Search Examples
X = {small, large}, Y = {red, white, blue}, Z = {ball, cube, brick}

Specific => General:
  S: {}
  Positive: obj(small, red, ball)   →  S: {obj(small, red, ball)}
  Positive: obj(small, white, ball) →  S: {obj(small, Y, ball)}
  Positive: obj(large, blue, ball)  →  S: {obj(X, Y, ball)}

General => Specific:
  G: {obj(X, Y, Z)}
  Negative: obj(small, red, brick)  →  G: {obj(large, Y, Z), obj(X, white, Z), obj(X, blue, Z), obj(X, Y, ball), obj(X, Y, cube)}
  Positive: obj(large, white, ball) →  G: {obj(large, Y, Z), obj(X, white, Z), obj(X, Y, ball)}
  Negative: obj(large, blue, cube)  →  G: {obj(large, white, Z), obj(X, white, Z), obj(X, Y, ball)}
  Positive: obj(small, blue, ball)  →  G: {obj(X, Y, ball)}
6
Version Space Convergence
Generalization and specialization lead to version space convergence:
  specialization of the general models (G)
  generalization of the specific models (S)

Candidate elimination algorithm: a bi-directional search that combines the previous two search techniques.
  If S and G converge to the same singleton set (S = G and |S| = |G| = 1), a single goal concept has been found.
  If S and G become empty, there is no single concept that covers all of the positive instances and none of the negative instances.
7
Candidate Elimination Algorithm
Initial:                          G: {obj(X, Y, Z)}                      S: {}
Positive: obj(small, red, ball)   →  G: {obj(X, Y, Z)}                   S: {obj(small, red, ball)}
Negative: obj(small, blue, brick) →  G: {obj(X, red, Z), obj(X, Y, ball)}  S: {obj(small, red, ball)}
Positive: obj(large, red, ball)   →  G: {obj(X, red, Z), obj(X, Y, ball)}  S: {obj(X, red, ball)}
Negative: obj(large, red, cube)   →  G: {obj(X, red, ball)}               S: {obj(X, red, ball)}
(G = S: the version space has converged on a single concept)
8
Explanation-Based Learning
The explanation-based algorithm takes four inputs:
Target concept: the agent must find an effective definition of this concept
Training example: an instance of the target concept
Domain theory: rules of the form premise(X) → conclusion(X), e.g.
  liftable(X) ∧ container(X) → cup(X)
  part(Z, W) ∧ concave(W) ∧ points_up(W) → container(Z)
  light(Y) ∧ part(Y, handle) → liftable(Y)
  small(A) → light(A)
  made_of(A, feathers) → light(A)
Operational criteria: a means of describing the form that concept definitions may take
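The proof that drives EBL can be sketched directly in code. Below is a minimal, hypothetical encoding (not from the slides) of the cup domain theory as Python functions over the training-example facts of the next slide; proving cup(obj1) touches only the relevant facts, which is exactly what the explanation-based generalization keeps.

facts = {
    ("small", "obj1"), ("part", "obj1", "handle"), ("part", "obj1", "bottom"),
    ("part", "obj1", "bowl"), ("concave", "bowl"), ("points_up", "bowl"),
    ("owns", "bob", "obj1"), ("color", "obj1", "red"),
}

def light(a):       # small(A) -> light(A);  made_of(A, feathers) -> light(A)
    return ("small", a) in facts or ("made_of", a, "feathers") in facts

def liftable(y):    # light(Y) ^ part(Y, handle) -> liftable(Y)
    return light(y) and ("part", y, "handle") in facts

def container(z):   # part(Z, W) ^ concave(W) ^ points_up(W) -> container(Z)
    parts = [f[2] for f in facts if len(f) == 3 and f[0] == "part" and f[1] == z]
    return any(("concave", w) in facts and ("points_up", w) in facts for w in parts)

def cup(x):         # liftable(X) ^ container(X) -> cup(X)
    return liftable(x) and container(x)

print(cup("obj1"))  # True; owns(bob, obj1) and color(obj1, red) never enter the proof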
9
Explanation-Based Learning Example
Specific and Generalized Proof Trees

Training example: small(obj1), part(obj1, handle), owns(bob, obj1), part(obj1, bottom), part(obj1, bowl), points_up(bowl), concave(bowl), color(obj1, red)

Specific proof tree:
  cup(obj1)
    liftable(obj1)
      light(obj1) ← small(obj1)
      part(obj1, handle)
    container(obj1)
      part(obj1, bowl)
      concave(bowl)
      points_up(bowl)

Generalized proof tree:
  cup(X)
    liftable(X)
      light(X) ← small(X)
      part(X, handle)
    container(X)
      part(X, W)
      concave(W)
      points_up(W)
10
Benefits of Explanation-Based Learning
The domain theory allows the learner to select the relevant aspects of the training instance and to ignore irrelevant aspects such as the color of the cup.
EBL forms generalizations that are known to be relevant to specific goals and consistent with the domain theory; without it, many instances would allow numerous possible generalizations that are meaningless or wrong.
EBL allows the learner to learn from a single instance.
EBL allows the learner to hypothesize unstated relationships between its goals and its experience.
11
Unsupervised Learning
Supervised learning assumes the existence of an external method to correctly classify the training data.
Unsupervised learning requires that the learner evaluate concepts on its own.
AM is an early example of a discovery program:
  Discovered the natural numbers by modifying its notion of "bags" (multisets)
  Worked out addition, multiplication, division, and prime numbers by evaluating "interesting" concepts
  Failed beyond rudimentary number theory: the search space grew combinatorially and the percentage of interesting concepts diminished
12
Concept Clustering

The goal of the clustering problem is to organize a collection of objects into a hierarchy of classes that meets some standard of quality.
It is necessary to have some means of measuring similarity between objects.
Numeric taxonomy represents objects as collections of features and assigns numeric values to these features.
The similarity metric treats an object as a point in n-dimensional space, where n is the number of features; the similarity between two objects is then measured by the Euclidean distance between them (smaller distance means greater similarity).
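For example (the values are illustrative, not from the slides), two objects with feature vectors (1, 2, 0) and (4, 2, 4) are at distance sqrt((1-4)^2 + (2-2)^2 + (0-4)^2) = sqrt(9 + 0 + 16) = 5, while an object at (1, 2, 1) is at distance 1 from the first and would therefore be judged much more similar to it.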
13
Agglomerative Clustering
A bottom-up approach to the clustering problem (see the sketch below):
  Examine all pairs of objects and make the pair with the highest degree of similarity a cluster
  Define the features of this cluster as some function (e.g. the average) of the features of its members, and replace the component objects with this definition
  Repeat until all objects have been reduced to a single cluster
It is difficult to compare objects defined by symbolic rather than numeric features:
  Similarity can be defined as the proportion of common features
  However, this does not adequately take into account underlying semantic knowledge, goals, or background knowledge
Traditional algorithms are extensional: they enumerate all members of a cluster but provide no intensional definition that could classify both known and future members.
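A minimal sketch of the bottom-up loop (hypothetical function names and illustrative data, not from the slides): merge the closest pair of clusters and replace it by the average of its members until one cluster remains; the merge order gives the hierarchy.

import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(objects):
    # Each cluster is (members, centroid); start with one cluster per object.
    clusters = [([o], o) for o in objects]
    merges = []
    while len(clusters) > 1:
        # Pick the pair with the highest similarity, i.e. the smallest distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: euclidean(clusters[ij[0]][1], clusters[ij[1]][1]))
        members = clusters[i][0] + clusters[j][0]
        centroid = tuple(sum(v) / len(v) for v in zip(*members))
        merges.append(members)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [(members, centroid)]
    return merges

for step in agglomerate([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9)]):
    print(step)   # the two tight pairs merge first, then everything merges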
14
Parallel Distributed Processing (PDP)
Also known as subsymbolic approaches/models.
Views intelligence as the behavior of a collection of large numbers of simple interacting components.
Symbolic systems suffer from brittleness, a consequence of their two-valued (all-or-nothing) nature: human performance degrades as a problem gets harder but almost always produces some answer, whereas expert systems either perform perfectly or not at all.
Neural networks: a connectionist approach inspired by biological brains.
15
Neural Networks

Neurons:
  Input values x_i: usually discrete or real values, often in {0, 1} or {-1, 1}
  Weights w_i: usually real valued
  Activation value: the output of the neuron
  Activation function: F

A neural network is characterized by:
  Network topology
  Learning algorithm
  Environment
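A minimal sketch of the neuron model above (the names and the example threshold function are assumptions for illustration): the activation is F applied to the weighted sum of the inputs.

def neuron(inputs, weights, F):
    # activation = F( sum_i x_i * w_i )
    return F(sum(x * w for x, w in zip(inputs, weights)))

step = lambda net: 1 if net >= 0 else 0               # a hard-threshold activation function
print(neuron([1, -1, 1], [0.5, 0.2, -0.4], step))     # 0.5 - 0.2 - 0.4 = -0.1 -> 0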
16
Simple Neural Networks
McCulloch-Pitts Neuron (McCulloch and Pitts, 1943)
Inputs are either +1 or -1; the activation function multiplies each input by its weight and sums the results.
If the sum is greater than or equal to 0, the output is 1; otherwise the output is 0.

Example neurons (the third input line, fixed at +1, acts as a bias):
  Output X ∧ Y:  inputs X, Y, +1  weights +1, +1, -2
  Output X ∨ Y:  inputs X, Y, +1  weights +1, +1, -1
  Output ¬X:     input X          weight -1
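A minimal sketch of the three neurons above, assuming a 0/1 input encoding for the truth table (the slide states ±1 inputs); the weights are the ones listed, with the bias modeled as a third input fixed at +1.

def mp_neuron(inputs, weights):
    # Output 1 if the weighted sum is >= 0, else 0.
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= 0 else 0

def mp_and(x, y): return mp_neuron([x, y, 1], [+1, +1, -2])
def mp_or(x, y):  return mp_neuron([x, y, 1], [+1, +1, -1])
def mp_not(x):    return mp_neuron([x], [-1])

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "AND:", mp_and(x, y), "OR:", mp_or(x, y), "NOT x:", mp_not(x))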
17
Perceptrons
Devised by Frank Rosenblatt in the late 1950s
A single-layer network where all inputs and activation values are either 0 or 1 and the weights are real valued.
The activation function is a simple linear threshold:
  output = 1 if ∑ x_i w_i > t, and 0 otherwise
Learning is supervised; the perceptron changes its weights based on whether its output was correct:
  If the output is correct, do nothing
  If the output is 0 and should be 1, increment the weights on the active lines (inputs of 1) by some amount d
  If the output is 1 and should be 0, decrement the weights on the active lines by some amount d
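A minimal sketch of this learning rule (the data, threshold t, and learning rate d are illustrative assumptions): weights on active input lines are raised when the output should have been 1 and lowered when it should have been 0. Logical AND is linearly separable, so the loop converges.

def train_perceptron(examples, n_inputs, t=0.5, d=0.1, epochs=20):
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, target in examples:
            output = 1 if sum(xi * wi for xi, wi in zip(x, w)) > t else 0
            if output == target:
                continue                       # correct output: do nothing
            for i, xi in enumerate(x):
                if xi == 1:                    # only the active lines change
                    w[i] += d if target == 1 else -d
    return w

and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_examples, n_inputs=2)
print(w, [1 if x[0] * w[0] + x[1] * w[1] > 0.5 else 0 for x, _ in and_examples])   # -> [0, 0, 0, 1]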
18
Limits of Perceptrons

Single-layer networks are only capable of learning classes that are linearly separable.
A classification of points in n-dimensional space is linearly separable if the classes can be separated by a single (n-1)-dimensional hyperplane.
For example, exclusive-or is not linearly separable and thus cannot be represented by a perceptron: no single line in the plane separates the points where X xor Y = 0 ((0,0) and (1,1)) from the points where X xor Y = 1 ((0,1) and (1,0)).
19
Modern Machine Learning Topics
Asymptotic Model Selection for Naive Bayesian Networks
Dimension Reduction in Text Classification with Support Vector Machines
Stability of Randomized Learning Algorithms
Diffusion Kernels on Statistical Manifolds
Multiclass Boosting for Weak Classifiers
Denoising Source Separation
Learning with Decision Lists of Data-Dependent Features
Generalization Bounds and Complexities Based on Sparsity and Clustering for Convex Combinations of Functions from Random Classes
Characteristics of a Family of Algorithms for Generalized Discriminant Analysis of Undersampled Problems
Journal of Machine Learning Research,