Knowledge Acquisition and Machine Learning
Reading: textbook chapters 10 and 20
INFO 629, Dr. Rosina Weber
Knowledge Acquisition
INFO 612, 3/18/2003, Dr. R. Weber
Knowledge acquisition: outline
Knowledge engineering
Knowledge acquisition & elicitation
Knowledge elicitation – steps, recommendations, issues, results
KA tools and techniques: manual methods, interactive methods, automated methods
Knowledge engineering (diagram): sources of expertise (books, documents, humans) supply facts to the knowledge engineer, who builds the knowledge base and inference procedures. The process spans problem assessment, knowledge acquisition, knowledge representation, design, testing, and documentation.
Knowledge engineering (diagram): the knowledge engineer acquires facts from sources of expertise (books, documents, humans) and, through knowledge acquisition and knowledge representation, builds the knowledge-based system.
Knowledge acquisition
The transference of expertise from a knowledge source to a program; the capture of expertise from a domain expert to be represented in a program.
Knowledge elicitation (KE).
Types of Knowledge (from Durkin, 1994)
Sources of Knowledge
Experts, end-users, multiple experts, reports, books, regulations, guidelines
First steps in KE
The knowledge engineer must:
– obtain a general view of the domain
– identify a framework to structure the new domain
– capture the reasoning style of the experts in the domain
First meeting with experts
– explain what a KBS is
– state the goals of the system
– establish commitments (e.g., confidentiality)
– give experts the choice to leave
Recommendations
– meet only once a week with each expert
– limit meetings to 40 minutes at most
– keep 2/3 of the interview on technical topics and 1/3 on general topics
– process each interview before the next one
– limit total meetings to 3 hours a day
– never mention other experts' views
– employ the same methods in the same order with all experts
– be consistent and provide a convenient environment
From [DIA89]
Interviews
Unstructured interviews, structured interviews, observational, retrospective
Issues in KE
Compiled knowledge: knowledge that can be executed but whose internal structure cannot easily be understood (e.g., riding a bike); knowledge that has become so obvious that humans cannot explain it.
When asked something, an expert might try to answer with things that are unknown, or that are compiled knowledge.
Psychologists do not identify a direct association between verbal reports and cognition.
Problems with KE
Plausible lines of reasoning can have little to do with actual problem-solving.
Academic knowledge may be obtained in place of compiled knowledge.
Experts may be insecure: they could be afraid of losing their jobs; they may not want computers encroaching on their "private domain"; they may not want to expose their problem-solving methods to the scrutiny of colleagues or the general public.
Interpersonal interviewing problems can result when knowledge engineers are not trained in interviewing techniques.
Protocol analysis (observational & retrospective) is labor-intensive, error-prone, and yields a series of random behavior samples that must be synthesized by the knowledge engineer.
Results of KE
Low productivity:
– knowledge engineers need to study the field
– it is hard to find a framework to structure the new domain
– experts reason at a low level of specificity
KA tools and techniques
1. Manual methods
2. Interactive methods
3. Automated methods
From: Boose, J. H. (1990). Knowledge acquisition tools, methods, and mediating representations. In Motoda, H., Mizoguchi, R., Boose, J. H., & Gaines, B. R. (Eds.), Proceedings of the First Japanese Knowledge Acquisition for Knowledge-Based Systems Workshop (JKAW-90). Japan: Ohmsha.
Manual methods (i)
Brainstorming – rapidly generate a large number of ideas
Interviewing:
– unstructured (general questions)
– semi-structured (open questions + topics)
– structured (strict agenda)
– Neurolinguistic Programming (eye movement, body language)
– tutorial
Manual methods (ii)
Knowledge organization techniques:
– card sorting
– ethnoscience techniques (names & categories)
– knowledge analysis
– mediating representations
– overcoming bias
– psychological scaling
– uncertain information elicitation and representation
Hoffman (1987) describes various methods to elicit expertise, each with different advantages and disadvantages.
Manual methods (iii)
Protocol analysis techniques:
– participant observation
– protocol analysis (retrospective)
User interface techniques:
– in the Wizard of Oz technique, an expert simulates the behavior of a future system
Interactive methods
Problem-to-method relationship – usually either a domain-specific problem employing a highly specialized method using much domain knowledge, or a general problem employing a general method with little domain knowledge (e.g., interdependency models).
Representation languages – for defining and describing problems and methods (e.g., method ontologies).
Intelligent editors – that help AI programmers construct large knowledge bases (e.g., CYC).
Automated Methods (i)
Analogy – apply knowledge from old situations to similar new situations
Apprenticeship learning – learn by watching experts solve problems
Neural networks
Discovery – learn by experimentation and observation
The picnic game
Let's practice learning rules.
Inductive Learning: Definition
According to Michalski (1983), "A theory and methodology of inductive learning" (in Machine Learning, chapter 4): "inductive learning is a heuristic search through a space of symbolic descriptions (i.e., generalizations) generated by the application of rules to training instances."
Inductive Learning
Learning by generalization
Performance of classification tasks (also categorization)
Rules indicate categories
Goal: characterize a concept
Concept Learning is a Form of Inductive Learning
The learner uses:
– positive examples (instances that ARE examples of a concept), and
– negative examples (instances that are NOT examples of a concept)
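Learning from positive and negative examples can be sketched with a minimal Find-S-style learner: the hypothesis is generalized just enough to cover each positive example, and the negative examples check that it did not over-generalize. The attributes and data here are made up for illustration.

```python
# Minimal sketch of concept learning from positive and negative
# examples (Find-S-style; attribute names are hypothetical).

def generalize(hypothesis, example):
    """Minimally generalize a conjunctive hypothesis to cover an example."""
    if hypothesis is None:            # first positive example: most specific
        return list(example)
    return [h if h == e else "?"      # '?' = any value allowed
            for h, e in zip(hypothesis, example)]

def covers(hypothesis, example):
    return all(h == "?" or h == e for h, e in zip(hypothesis, example))

def learn(examples):
    """examples: list of (attribute_tuple, is_positive)."""
    h = None
    for attrs, positive in examples:
        if positive:
            h = generalize(h, attrs)
    # negative examples validate the result: none should be covered
    for attrs, positive in examples:
        if not positive and covers(h, attrs):
            raise ValueError("hypothesis covers a negative example")
    return h

# Toy data: (sky, temperature) -> is it a picnic day?
data = [(("sunny", "warm"), True),
        (("sunny", "cold"), True),
        (("rainy", "cold"), False)]
print(learn(data))   # ['sunny', '?']
```

The learned hypothesis says the concept depends only on the sky attribute; the temperature slot was generalized away by the second positive example.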
Concept Learning
Needs empirical validation.
Whether the data are dense or sparse determines the quality of different methods.
Validation of Concept Learning (i)
The learned concept should be able to correctly classify new instances of the concept:
– when it succeeds on a real instance of the concept, it finds true positives
– when it fails on a real instance of the concept, it finds false negatives
Validation of Concept Learning (ii)
The learned concept should be able to correctly classify new instances of the concept:
– when it succeeds on a counterexample, it finds true negatives
– when it fails on a counterexample, it finds false positives
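The four validation outcomes above can be tallied directly; a minimal sketch, assuming predictions and actual labels are boolean lists:

```python
# Tally the four validation outcomes (true/false positives/negatives)
# for a learned classifier's predictions against the actual labels.

def confusion_counts(predictions, actuals):
    """Count TP, FN, TN, FP given predicted and actual labels (booleans)."""
    tp = fn = tn = fp = 0
    for pred, real in zip(predictions, actuals):
        if real:                  # a real instance of the concept
            if pred: tp += 1      # success -> true positive
            else:    fn += 1      # failure -> false negative
        else:                     # a counterexample
            if pred: fp += 1      # failure -> false positive
            else:    tn += 1      # success -> true negative
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}

print(confusion_counts([True, False, True, False],
                       [True, True,  False, False]))
# {'TP': 1, 'FN': 1, 'TN': 1, 'FP': 1}
```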
Rule Learning
Learning algorithms widely used in data mining:
– decision trees
– neural networks
Decision trees
A knowledge representation formalism.
Represent mutually exclusive rules (a disjunction).
A way of breaking up a data set into classes or categories.
Classification rules determine, for each instance with attribute values, whether it belongs to one class or another.
Decision trees consist of:
– leaf nodes (classes)
– decision nodes (tests on attribute values)
– from decision nodes, branches grow for each possible outcome of the test
From Cawsey, 1997
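The structure above can be sketched as nested dictionaries: decision nodes name the attribute they test, branches map outcomes to subtrees, and leaves hold a class. The tree and attribute names are hypothetical.

```python
# A decision tree as a nested dict: decision nodes test an attribute,
# branches are outcomes of the test, leaf nodes carry a class.

tree = {"attribute": "outlook",
        "branches": {"sunny": {"class": "picnic"},
                     "rainy": {"class": "stay home"}}}

def classify(node, instance):
    """Follow the branch matching the instance's attribute value
    at each decision node until a leaf (class) is reached."""
    while "class" not in node:
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node["class"]

print(classify(tree, {"outlook": "sunny"}))  # picnic
```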
Decision tree induction
The goal is to correctly classify all example data.
Several algorithms induce decision trees: ID3 (Quinlan, 1979), CLS, ACLS, ASSISTANT, IND, C4.5.
ID3 constructs the decision tree from past data; it is not incremental.
It attempts to find the simplest tree (not guaranteed, because it is based on heuristics).
ID3 algorithm
From:
– a set of target classes
– training data containing objects of more than one class
ID3 uses tests to refine the training data set into subsets that contain objects of only one class each.
Choosing the right test is the key.
How does ID3 choose tests?
Information gain, or "minimum entropy".
Maximizing information gain corresponds to minimizing entropy.
Predictive features (good indicators of the outcome).
Choosing tests
Information gain is a statistical property.
Compute entropy: how best to classify the training instances.
Predictive features (good indicators of the outcome).
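Entropy and information gain, as ID3 uses them, can be sketched as follows; rows here are hypothetical (attribute-dict, class-label) pairs:

```python
# Entropy H = -sum(p * log2(p)) over class proportions, and
# information gain = entropy before a split minus the weighted
# entropy of the subsets the split produces.
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attribute):
    """rows: list of (attribute_dict, class_label) pairs."""
    labels = [label for _, label in rows]
    before = entropy(labels)
    subsets = {}
    for attrs, label in rows:               # partition labels by attribute value
        subsets.setdefault(attrs[attribute], []).append(label)
    after = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return before - after

rows = [({"refund": "yes"}, "no"), ({"refund": "yes"}, "no"),
        ({"refund": "no"}, "yes"), ({"refund": "no"}, "no")]
print(round(information_gain(rows, "refund"), 3))
```

ID3 evaluates every candidate attribute this way and splits on the one with the highest gain, i.e., the one whose subsets have the lowest remaining entropy.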
ID3 algorithm: worked example
(Training data with attributes Refund, Marital Status, Taxable Income; target class: Cheat = Yes/No.)
Class counts by Refund:
– Refund = Yes: No 3 times
– Refund = No: No 4 times, Yes 3 times
Class counts by Marital Status:
– Single: No 2 times, Yes 2 times
– Married: No 3 times
– Divorced: No 1 time, Yes 1 time
The tree is built step by step:
– root test: Refund? yes → No; no → continue
– next test: Marital Status? married → No; single, divorced → continue
– next test: Taxable Income? < 80K → No; > 80K → Yes
What rules can you derive from this decision tree?
Refund? yes → No; no → Marital Status? married → No; single, divorced → Taxable Income? < 80K → No; > 80K → Yes
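One rule can be read off each root-to-leaf path of the tree. Written as a classifier (a sketch using the example's attribute names):

```python
# The decision tree as rules: one if-branch per root-to-leaf path.

def cheat(refund, marital_status, taxable_income):
    if refund == "yes":
        return "No"           # IF refund = yes THEN cheat = no
    if marital_status == "married":
        return "No"           # IF refund = no AND married THEN cheat = no
    if taxable_income < 80_000:
        return "No"           # IF refund = no AND single/divorced AND income < 80K THEN no
    return "Yes"              # IF refund = no AND single/divorced AND income > 80K THEN yes

print(cheat("no", "single", 90_000))  # Yes
```

The rules are mutually exclusive: exactly one path, and therefore one rule, applies to any instance.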
Knowledge Discovery
1. Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and understandable patterns in data (R. Feldman, 2000).
1.1 Data mining is one step in the KDD process.
1.2 Text mining applies data mining techniques to unstructured text.
2. Knowledge discovery from processes.
Automated Methods (ii)
Example selection – select an appropriate set of examples for various learning techniques
Explanation-based learning – deduce a general rule from a single example by relating it to an existing theory
Function induction – learn functions from input data
Genetic algorithms – crossover, mutation
Automated Methods (iii)
Performance feedback – performance feedback is used to reinforce behavior
Rule induction
Similarity-based learning – learn similarities from sets of positive examples and differences from sets of negative examples
Systemic principles derivation – use general principles to derive specific laws
References
Boose, J. H. (1990). Knowledge acquisition tools, methods, and mediating representations. In Motoda, H., Mizoguchi, R., Boose, J. H., & Gaines, B. R. (Eds.), Proceedings of the First Japanese Knowledge Acquisition for Knowledge-Based Systems Workshop (JKAW-90). Japan: Ohmsha.
Buchanan, B. G., & Wilkins, D. C. (Eds.). Readings in Knowledge Acquisition and Learning: Automating the Construction and Improvement of Expert Systems.
Diaper, D. (1989). Knowledge Elicitation: Principles, Techniques and Applications. Chichester: John Wiley & Sons, pp. 96-97.
Hoffman, R. R. (1987). The problem of extracting the knowledge of experts from the perspective of experimental psychology. AI Magazine, pp. 53-67.