
1 Introduction to Machine Learning Approach Lecture 5

2 Concept of Learning Learning is an essential human property. Learning IS NOT learning by heart. Learning means improvement (according to a given criterion) when a similar situation appears. Any computer can learn by heart; the challenge is to generalise a behaviour to a novel situation.

3 Definitions of Learning “Learning denotes changes in a system that enable a system to do the same task more efficiently the next time.”…Herbert Simon “Learning is constructing or modifying representations of what is being experienced.”…Ryszard Michalski “Learning is making useful changes in our minds.”…Marvin Minsky

4 What is Machine Learning? The objective is to develop computational models that implement various forms of learning, in particular mechanisms capable of extracting knowledge from examples. Conventionally: program = algorithm + data. With machine learning: program = algorithm + data + domain knowledge, where the domain knowledge is represented in suitable data structures.

5 The Machine Learning Task [Diagram: Examples and Background Knowledge are fed into the Learning Algorithm, which produces a Concept Description.]

6 The Machine Learning Task Examples can be positive, negative, or drawn from many categories. Background knowledge contains information about the language used to describe the examples and the concept. Example: in NLP, words can be described using POS tags, named-entity type, the previous and next words and their types, syntactic constraints, etc. The learning algorithm then builds on the type of examples, the relevance of the background knowledge, and the presumed nature of the concept (see the feature sketch below).
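
A minimal illustration (not from the slides) of such a feature description in Python; the function name, feature names, and toy sentence are made-up assumptions for illustration only:

```python
# Hypothetical sketch: describing one word by POS, neighbouring words,
# and a crude named-entity cue, as background knowledge for a learner.

def word_features(words, pos_tags, i):
    """Describe words[i] using its POS tag, neighbours, and capitalisation."""
    return {
        "word":      words[i].lower(),
        "pos":       pos_tags[i],                                   # part-of-speech tag
        "prev_word": words[i - 1].lower() if i > 0 else "<START>",
        "next_word": words[i + 1].lower() if i < len(words) - 1 else "<END>",
        "is_capitalised": words[i][0].isupper(),                    # rough named-entity cue
    }

words = ["John", "lives", "in", "Paris"]
pos   = ["NNP", "VBZ", "IN", "NNP"]
print(word_features(words, pos, 0))
```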

7 Nature of Learning Algorithms Some learning algorithms are classified as black-box: the internal description cannot be interpreted by the user and provides neither insight into nor an explanation of the recognition process. Examples: neural networks, mathematical statistics, HMM models, etc.

8 Supervised vs. Unsupervised Learning Supervised Learning: “Given a training set of examples with their desired outputs, the computational model processes each entry in the training set and learns from the examples.” Or “The learner seeks to develop a concept description from examples that have been pre-classified by the teacher.”
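
A minimal supervised-learning sketch, assuming scikit-learn is available; the tiny data set and the choice of logistic regression are illustrative assumptions, not part of the lecture:

```python
# Sketch: learning a concept description from pre-classified examples.
from sklearn.linear_model import LogisticRegression

# Pre-classified examples: each input is a feature vector, each output a label
# supplied by the "teacher".
X_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_train = [0, 0, 1, 1]

model = LogisticRegression()
model.fit(X_train, y_train)        # learn from the labelled examples
print(model.predict([[1, 0.5]]))   # apply the learned concept to a new input
```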

9 Supervised vs. Unsupervised Learning Unsupervised Learning: “The machine simply receives inputs and does not obtain supervised target outputs. The goal of the machine is to build representations from that input which can be used for making decisions, predicting things, etc.” Unsupervised algorithms can be probabilistic models. Generally, the model estimates similarities and differences based on the provided input alone.
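
For contrast, a minimal unsupervised sketch, again assuming scikit-learn; k-means and the toy data are illustrative assumptions:

```python
# Sketch: grouping inputs by similarity without any target outputs.
from sklearn.cluster import KMeans

X = [[1.0, 1.1], [0.9, 1.0], [8.0, 8.2], [7.9, 8.1]]   # unlabelled inputs
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(clusters)   # e.g. [0 0 1 1]: cluster identities, not pre-defined labels
```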

10 Supervised vs. Unsupervised Learning In Supervised Learning: –if inputs are missing, the model will not be able to infer anything, or –if the training examples are wrongly labelled, we are actually building an incorrect model. In Unsupervised Learning: –if the inputs themselves are modelled, then missing inputs cause no problem.

11 Supervised vs. Unsupervised Learning Supervised approaches are usually better than unsupervised approaches. –Pereira and Schabes (1992) showed that SCFGs estimated using treebanks outperformed SCFGs estimated from raw sentences. –Elworthy found that POS taggers trained on annotated sentences outperformed taggers trained on raw sentences. However, creating annotated material is costly in terms of both human and machine resources: –Penn Treebank: 14 annotators over (at least) three years.

12 Supervised vs. Unsupervised Learning We know that systems usually improve with more training material. Brill (2001) argued that billions of words might be necessary. Honestly, we cannot manually annotate this amount of data. What to do?

13 Self Training One solution is to get the learner to annotate the data itself. Self-training: –A single learner is trained using the annotated material. –The learner then labels a large pool of unannotated material. –The learner is retrained using all of the material. Charniak tried this with 30 million words of newswire text and obtained a slight improvement. A schematic sketch of the loop is given below.
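
A schematic Python sketch of the self-training loop; the learner object and the labelled/unlabelled variables are hypothetical placeholders (any model with fit/predict), not a specific tagger or parser:

```python
# Sketch of self-training: train on annotated data, label raw data,
# then retrain on everything.

def self_train(learner, labelled, unlabelled, rounds=1):
    X_lab, y_lab = labelled
    learner.fit(X_lab, y_lab)                          # 1. train on annotated material
    for _ in range(rounds):
        pseudo_labels = learner.predict(unlabelled)    # 2. label the unannotated pool
        X_all = list(X_lab) + list(unlabelled)         # 3. pool annotated + pseudo-labelled
        y_all = list(y_lab) + list(pseudo_labels)
        learner.fit(X_all, y_all)                      # 4. retrain on all the material
    return learner
```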

14 Methodology The methodology in machine learning applications, and particularly in NLP, is an iterative cycle. [Diagram: Training Set → Model → Evaluate (against the Testing Set) → Revise, repeated until the Final Result.]

15 Training Set Models are estimated using a training set. –The training set should be representative of the task in terms of the feature set. –It should be as large as possible. –It should be well understood. The training set is used both to specify the parameters and to estimate them. Sometimes features are noisy; noisy features can cause false estimation of the model, so such features should be ignored.

16 Testing Set Estimated models are evaluated using a testing set. –The testing set should be disjoint from the training set. –It should be large enough for results to be reliable. –It should be unseen.
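
A minimal sketch of keeping the training and testing sets disjoint, assuming scikit-learn; the data set and model are illustrative only:

```python
# Sketch: hold out unseen material for testing; the model never sees it
# during training.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X = [[i] for i in range(100)]
y = [0] * 50 + [1] * 50

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("accuracy on the unseen test set:", model.score(X_test, y_test))
```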

17 Information Retrieval Consider a set of documents D = {D1, D2, D3, …, Dn}. The input is a query q given as a list of keywords. The similarity sim(q, Di) between the query and each document Di is then calculated. This similarity measure is a set-membership function describing the likelihood that the document is relevant to the user’s interest.
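
A minimal sketch of one common choice for sim(q, Di): TF-IDF vectors compared by cosine similarity. This assumes scikit-learn; the documents and query are made up:

```python
# Sketch: rank documents by cosine similarity between TF-IDF vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = ["machine learning for text",
             "information retrieval and search",
             "cooking recipes and food"]
query = "text retrieval"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)          # one vector per Di
query_vector = vectorizer.transform([query])               # vector for q

scores = cosine_similarity(query_vector, doc_vectors)[0]   # sim(q, Di) for each Di
for doc, score in sorted(zip(documents, scores), key=lambda p: -p[1]):
    print(f"{score:.3f}  {doc}")
```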

18 Performance Measures The last, and a considerably important, part is evaluation. It is useful to consider the confusion matrix for a 2 × 2 classification problem, also known as a contingency table:

                  Predicted Positive     Predicted Negative
Actual Positive   True Positive (TP)     False Negative (FN)
Actual Negative   False Positive (FP)    True Negative (TN)

19 Performance Measures Accuracy: the proportion of examples that are classified correctly, i.e. the number of correct classifications divided by the total number of examples. Accuracy = (TP + TN) / (TP + TN + FP + FN) Disadvantage: accuracy does not distinguish between the two error types, FP and FN.

20 Performance Measures Precision: the proportion of the selected (positively classified) examples that are actually correct. Precision = TP / (TP + FP) Recall: the proportion of the actually correct (positive) examples that are selected. Recall = TP / (TP + FN)

21 Performance Measures Different systems trade precision against recall over a variable range, and it is easier said than done to decide which of the two is more valuable. We therefore need to compute a single numerical value that incorporates both precision and recall.

22 Performance Measures F-score: a weighted harmonic mean of precision and recall, proposed by van Rijsbergen (1979): F-score = (β² + 1) × Precision × Recall / (β² × Precision + Recall), where β is normally set to 1.
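
A small sketch that computes all of the above measures; the confusion-matrix counts are made-up numbers for illustration:

```python
# Sketch: accuracy, precision, recall, and F-score from confusion-matrix counts.
TP, FN, FP, TN = 40, 10, 5, 45

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)

beta = 1.0   # beta = 1 weights precision and recall equally (the usual F1)
f_score = (beta**2 + 1) * precision * recall / (beta**2 * precision + recall)

print(accuracy, precision, recall, f_score)
```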

