Introduction to Machine Learning
Dmitriy Dligach
Representations
– Objects: real-life phenomena viewed as objects and their properties (features)
– Feature vectors: (f0, f1, …, fn)
– Examples: text classification, face recognition, WSD
Supervised Learning
– Vector–value pairs: (x0, y0), (x1, y1), …, (xn, yn)
– Task: learn the function y = f(x)
– Algorithms: KNN, decision trees, neural networks, SVMs
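As a concrete illustration of one listed algorithm, here is a minimal k-nearest-neighbors sketch in Python; the toy training pairs, the choice of Euclidean distance, and k=3 are assumptions made for this example, not part of the original slides.

from collections import Counter
import math

def knn_predict(train, query, k=3):
    # Sort (x, y) training pairs by Euclidean distance from x to the query vector.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    # Majority vote over the labels of the k closest pairs.
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Toy training pairs: 2-d feature vectors with two classes (made up).
train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(knn_predict(train, (0.2, 0.1)))  # prints "A"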
Issues in Supervised Learning
– Training data: what are we learning from?
– Test data: unseen data, used to evaluate the learned function
– Overfitting: fitting noise in the training data reduces performance on unseen data
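The overfitting point can be made concrete with a small numeric sketch: fit the same noisy data with a simple and a needlessly flexible model and compare error on held-out points. Everything below (the synthetic data, the polynomial degrees 1 and 9) is made up for illustration.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + rng.normal(0.0, 0.2, x.size)   # linear signal plus noise
x_tr, y_tr = x[::2], y[::2]                  # training half
x_te, y_te = x[1::2], y[1::2]                # held-out test half

for degree in (1, 9):
    coef = np.polyfit(x_tr, y_tr, degree)    # fit on training data only
    test_mse = np.mean((np.polyval(coef, x_te) - y_te) ** 2)
    print(f"degree {degree}: test MSE {test_mse:.3f}")   # degree 9 typically does worse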
Unsupervised Learning
– Only feature vectors are given: x0, x1, …, xn (no labels)
– Task: group the feature vectors into clusters
– Algorithms: clustering (k-means, mixture of Gaussians), Principal Component Analysis, sequence labeling (HMMs)
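As a sketch of one listed technique, Principal Component Analysis can be computed from the SVD of the centered data matrix; the toy 2-d data below is made up for illustration.

import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
Xc = X - X.mean(axis=0)                  # center each feature at zero
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
print(Vt[0])                             # first principal component (unit vector)
print(Xc @ Vt[0])                        # data projected onto that component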
Supervised Example: Decision Trees
A Tree
Word Sense Disambiguation (WSD)
– Example of ambiguity: bat (noun), which can mean the animal or the club used in baseball
Another DT Example: Word Sense Disambiguation
– Given an occurrence of a word, decide which sense, or meaning, was intended.
– Example: run
  – run1: move swiftly (I ran to the store.)
  – run2: operate (I run a store.)
  – run3: flow (Water runs from the spring.)
  – run4: length of torn stitches (Her stockings had a run.)
WSD: Word Sense Disambiguation
– Categories: the word sense labels (run1, run2, etc.)
– Features describe the context of the word:
  – near(w): is the given word near word w?
  – pos: the word's part of speech
  – left(w): is the word immediately preceded by w?
  – etc.
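A minimal sketch of how these context features might be extracted for a target word; the function name wsd_features and the window size of 3 are assumptions for illustration, and pos is passed in because a real system would obtain it from a part-of-speech tagger.

def wsd_features(tokens, i, pos, window=3):
    # pos would come from a part-of-speech tagger; here it is supplied by the caller.
    feats = {"pos": pos}
    for w in tokens[max(0, i - window): i + window + 1]:
        if w != tokens[i]:
            feats[f"near({w})"] = True          # near(w): target is near word w
    if i > 0:
        feats[f"left({tokens[i - 1]})"] = True  # left(w): w immediately precedes target
    return feats

tokens = "i saw john run a race by a river".split()
print(wsd_features(tokens, tokens.index("run"), pos="verb"))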
Using a Decision Tree
(Figure: a decision tree for run. The root tests pos; interior nodes test near(race), near(stocking), and near(river); the leaves are senses such as run1, run3, and run4.)
Given an event (= a list of feature values):
– Start at the root.
– At each interior node, follow the outgoing arc for the feature value that matches our event.
– When we reach a leaf node, return its category.
Example: “I saw John run a race by a river.”
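A sketch of this traversal in Python, with the tree encoded as nested dicts. The branch structure below is a guess reconstructed to be consistent with the sample training data on the next slide, not necessarily the exact tree in the figure.

# Interior node: (feature, {value: subtree}); leaf: a sense label.
# Hypothetical branch structure, consistent with the sample training data.
tree = ("pos", {
    "noun": "run4",
    "verb": ("near(race)", {
        "yes": "run2",
        "no": ("near(river)", {"yes": "run3", "no": "run1"}),
    }),
})

def classify(node, event):
    while isinstance(node, tuple):        # interior nodes are (feature, branches)
        feature, branches = node
        node = branches[event[feature]]   # follow the arc matching our event
    return node                           # reached a leaf: return its category

event = {"pos": "verb", "near(race)": "no", "near(river)": "yes"}
print(classify(tree, event))              # prints "run3"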
WSD: Sample Training Data

POS    near(race)  near(river)  near(stockings)  Sense
Noun   No          –            –                run4
Verb   No          –            –                run1
Verb   No          Yes          No               run3
Noun   Yes         –            –                run4
Verb   No          –            Yes              run1
Verb   Yes         –            No               run2
Verb   No          Yes          –                run3
Unsupervised Example: K-Means
– Distance between two objects: cosine distance or Euclidean distance
– Algorithm:
  – Pick cluster centers at random
  – Assign the data points to the nearest clusters
  – Re-compute the cluster centers
  – Re-assign the data points
  – Continue until the clusters settle
– Hard clustering vs. soft clustering
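A minimal sketch of this hard-clustering loop with Euclidean distance; the synthetic 2-d data and k=2 are made up, and the re-computation step assumes no cluster ever goes empty.

import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # pick centers at random
    for _ in range(iters):
        # Hard assignment: each point goes to its nearest center (Euclidean).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Re-compute each center as the mean of its points (assumes none is empty).
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):                # clusters have settled
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
labels, centers = kmeans(X, k=2)
print(centers)   # two centers, roughly (0, 0) and (4, 4)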
Interactive Demos
– K-Means
– SVMs
ML Reference
– Tom Mitchell, Machine Learning (McGraw-Hill, 1997)