 Rosina Weber Knowledge acquisition and machine learning Reading textbook chapters 10 and 20 INFO 629 Dr. R. Weber.


1  Rosina Weber Knowledge acquisition and machine learning Reading textbook chapters 10 and 20 INFO 629 Dr. R. Weber

2  Rosina Weber Knowledge acquisition INFO 612 3/18/2003 Dr. R. Weber

3  Rosina Weber Knowledge acquisition (outline): knowledge engineering; knowledge acquisition & elicitation; knowledge elicitation (steps, recommendations, issues, results); KA tools and techniques (manual methods, interactive methods, automated methods)

4  Rosina Weber [Diagram: KNOWLEDGE ENGINEERING. Labels: source of expertise (books, documents, humans, facts); knowledge engineer; knowledge base + inference procedures. Steps shown: problem assessment, knowledge acquisition, knowledge representation, design, testing, documentation.]

5  Rosina Weber Knowledge engineering [Diagram: KNOWLEDGE ENGINEERING. Labels: source of expertise (books, documents, humans, facts); knowledge engineer; knowledge acquisition; knowledge representation; knowledge-based system.]

6  Rosina Weber Knowledge acquisition: transfer of expertise from a knowledge source to a program. Knowledge elicitation (KE): capture of expertise from a domain expert to be represented in a program.

7  Rosina Weber Types of Knowledge From Durkin 1994

8  Rosina Weber Sources of Knowledge: experts; end-users; multiple experts; reports; books; regulations; guidelines

9  Rosina Weber First steps in KE: The knowledge engineer must: obtain a general view of the domain; identify a framework to structure the new domain; capture the reasoning style of the experts in the domain

10  Rosina Weber First meeting with experts: explain what a KBS is; state the goals of the system; agree on commitments (e.g., confidentiality); give the expert the choice to leave

11  Rosina Weber Recommendations: meet only once a week with each expert; limit meetings to 40 min at most; keep 2/3 of the interview to technical topics and 1/3 to general topics; process each interview before the next one; limit total meetings to 3 hours a day; be sure never to mention other experts’ views; employ the same methods in the same order with all experts; be consistent and provide a convenient environment. From [DIA89]

12  Rosina Weber Interviews: unstructured interviews; structured interviews; observational; retrospective

13  Rosina Weber Issues in KE: Compiled knowledge can be executed, but its internal structure cannot easily be understood (e.g., riding a bike); it is knowledge that has become so obvious that humans cannot explain it. When asked a question, an expert might try to answer about things that are unknown or that may be compiled knowledge. Psychologists do not identify an association between a verbal report and cognition.

14  Rosina Weber Problems with KE: Plausible lines of reasoning can have little to do with actual problem-solving. Academic knowledge may be obtained in place of compiled knowledge. Experts may be insecure: they could be afraid of losing their jobs; they may not want computers encroaching on their "private domain"; they may not want to expose their problem-solving methods to the scrutiny of colleagues or of the general public. Interpersonal interviewing problems can result when knowledge engineers are not trained in interviewing techniques. Protocol analysis (observational and retrospective) is labor intensive, error-prone, and results in a series of random behavior samples that must be synthesized by the knowledge engineer.

15  Rosina Weber Results of KE: low productivity: knowledge engineers need to study the field; it is hard to find a framework to structure the new domain; experts reason at a low level of specificity

16  Rosina Weber KA tools and techniques: 1. Manual methods 2. Interactive methods 3. Automated methods. From: John H. Boose, Knowledge Acquisition Tools, Methods, and Mediating Representations. Copyright © 1990, John H. Boose. In Motoda, H., Mizoguchi, R., Boose, J. H., and Gaines, B. R. (Eds.) (1990). Proceedings of the First Japanese Knowledge Acquisition for Knowledge-Based Systems Workshop: JKAW-90, Ohmsha, Ltd: Japan.

17  Rosina Weber Manual methods (i): Brainstorming – rapidly generate a large number of ideas. Interviewing – unstructured (general questions); semi-structured (open questions + topics); structured (strict agenda); Neurolinguistic Programming (eye movement, body language); tutorial

18  Rosina Weber Manual methods (ii): Knowledge Organization Techniques – card sorting; ethnoscience techniques (names & categories); knowledge analysis; mediating representations; overcoming bias; psychological scaling; uncertain information elicitation and representation. Hoffman (1987) describes various methods to elicit expertise, each with different advantages and disadvantages.

19  Rosina Weber Manual methods (iii): Protocol Analysis Techniques – participant observation; protocol analysis (retrospective). User Interface Techniques – in the Wizard of Oz technique, an expert simulates the behavior of a future system.

20  Rosina Weber Interactive methods: problem-to-method relationship – usually a domain-specific problem employing a highly specialized method using much domain knowledge, or a general problem employing a general method with little domain knowledge (e.g., interdependency models); representation languages – for defining and describing problems and methods, e.g., method ontologies; intelligent editors – that help AI programmers construct large knowledge bases, e.g., CYC

21  Rosina Weber Automated Methods (i): Analogy – apply knowledge from old situations in similar new situations. Apprenticeship Learning – learn by watching experts solve problems. Neural Networks. Discovery – learn by experimentation and observation.

22  Rosina Weber The picnic game: let’s practice how to learn rules.

23  Rosina Weber Inductive Learning Definition: According to Michalski (1983), “A theory and methodology of inductive learning,” in Machine Learning, chapter 4, “inductive learning is a heuristic search through a space of symbolic descriptions (i.e., generalizations) generated by the application of rules to training instances.”

24  Rosina Weber Inductive Learning: learning by generalization; performance of classification tasks (also categorization); rules indicate categories; goal: characterize a concept

25  Rosina Weber Concept Learning is a Form of Inductive Learning: the learner uses positive examples (instances that ARE examples of a concept) and negative examples (instances that ARE NOT examples of a concept)

26  Rosina Weber Concept Learning: needs empirical validation; whether the data are dense or sparse determines the quality of different methods

27  Rosina Weber Validation of Concept Learning (i): The learned concept should be able to correctly classify new instances of the concept. When it succeeds on a real instance of the concept, the result is a true positive; when it fails on a real instance of the concept, the result is a false negative.

28  Rosina Weber Validation of Concept Learning (ii): The learned concept should be able to correctly classify new instances of the concept. When it succeeds on a counterexample, the result is a true negative; when it fails on a counterexample, the result is a false positive.
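
The four outcomes named on slides 27 and 28 can be tallied with a few lines of code. The sketch below is only an illustration: the function name `confusion_counts` and the six example labels are hypothetical and do not come from the slides.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Tally true/false positives and negatives for a binary concept.

    actual[i] is True when instance i really is an example of the concept;
    predicted[i] is True when the learned concept classifies it as one.
    """
    counts = Counter()
    for is_member, said_member in zip(actual, predicted):
        if is_member and said_member:
            counts["true_positive"] += 1    # succeeded on a real instance
        elif is_member and not said_member:
            counts["false_negative"] += 1   # failed on a real instance
        elif not is_member and not said_member:
            counts["true_negative"] += 1    # succeeded on a counterexample
        else:
            counts["false_positive"] += 1   # failed on a counterexample
    return counts

# Hypothetical labels for six new instances (illustrative only).
actual = [True, True, True, False, False, False]
predicted = [True, True, False, False, True, False]
counts = confusion_counts(actual, predicted)
print(dict(counts))
```

Keeping the four counts separate, rather than reporting a single accuracy figure, shows whether the learned concept errs mostly on real instances or mostly on counterexamples.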

29  Rosina Weber Rule Learning: learning algorithms widely used in data mining: decision trees; neural networks

30  Rosina Weber Decision trees: a knowledge representation formalism; represent mutually exclusive rules (disjunction); a way of breaking up a data set into classes or categories; classification rules that determine, for each instance with attribute values, whether it belongs to one or another class

31 Decision trees consist of: leaf nodes (classes); decision nodes (tests on attribute values); branches that grow from each decision node, one for each possible outcome of the test. From Cawsey, 1997
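
One possible way to represent the three ingredients from slide 31 in code is sketched below. The class names `Leaf` and `Decision`, the `classify` helper, and the small weather-style tree are all hypothetical; they are assumptions made for illustration, not part of the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

@dataclass
class Leaf:
    label: str  # the class assigned to every instance that reaches this node

@dataclass
class Decision:
    attribute: str  # the attribute whose value is tested at this node
    # One branch per possible outcome of the test, keyed by attribute value.
    branches: Dict[str, "Node"] = field(default_factory=dict)

Node = Union[Leaf, Decision]

def classify(node, instance):
    """Follow the branch matching the instance's value until a leaf is reached."""
    while isinstance(node, Decision):
        node = node.branches[instance[node.attribute]]
    return node.label

# Hypothetical tree, for illustration only.
tree = Decision("outlook", {
    "rainy": Leaf("stay in"),
    "sunny": Decision("windy", {"yes": Leaf("stay in"), "no": Leaf("play")}),
})
result = classify(tree, {"outlook": "sunny", "windy": "no"})
print(result)
```

Because every instance follows exactly one path from the root to a leaf, the rules encoded by such a tree are mutually exclusive.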

32  Rosina Weber Decision tree induction: the goal is to correctly classify all example data; several algorithms induce decision trees: ID3 (Quinlan 1979), CLS, ACLS, ASSISTANT, IND, C4.5; constructs a decision tree from past data; not incremental; attempts to find the simplest tree (not guaranteed, because it is based on heuristics)

33  Rosina Weber ID3 algorithm: From: – a set of target classes – training data containing objects of more than one class. ID3 uses tests to refine the training data set into subsets that contain objects of only one class each. Choosing the right test is the key.
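
The refinement described on slide 33 can be sketched as a short recursive procedure. This is a simplified illustration under stated assumptions, not Quinlan's full algorithm: it handles only categorical attributes, represents the tree as nested dictionaries, and uses a made-up five-row data set. The names `id3`, `best_attribute`, and the attributes `sky`, `wind`, `play` are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_attribute(rows, attributes, target):
    """Pick the test whose subsets have the lowest weighted entropy."""
    def remainder(attr):
        groups = {}
        for row in rows:
            groups.setdefault(row[attr], []).append(row[target])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    # Lowest remaining entropy is the same choice as highest information gain.
    return min(attributes, key=remainder)

def id3(rows, attributes, target):
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:   # subset holds objects of one class only: leaf
        return labels[0]
    if not attributes:          # no tests left: fall back to the majority class
        return Counter(labels).most_common(1)[0][0]
    attr = best_attribute(rows, attributes, target)
    remaining = [a for a in attributes if a != attr]
    tree = {attr: {}}
    for value in sorted({row[attr] for row in rows}):
        subset = [row for row in rows if row[attr] == value]
        tree[attr][value] = id3(subset, remaining, target)
    return tree

# Hypothetical training data, for illustration only.
rows = [
    {"sky": "sunny", "wind": "weak", "play": "yes"},
    {"sky": "sunny", "wind": "strong", "play": "yes"},
    {"sky": "sunny", "wind": "strong", "play": "yes"},
    {"sky": "rainy", "wind": "weak", "play": "yes"},
    {"sky": "rainy", "wind": "strong", "play": "no"},
]
tree = id3(rows, ["sky", "wind"], "play")
print(tree)
```

On this toy data the `sky` test is chosen first because it leaves one subset already pure; the mixed `rainy` subset is then refined with `wind`.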

34  Rosina Weber How does ID3 choose tests? Information gain or ‘minimum entropy’: maximizing information gain corresponds to minimizing entropy; predictive features (good indicators of the outcome)

35  Rosina Weber Choosing tests: information gain is a statistical property; compute entropy; how to best classify the training instances; predictive features (good indicators of the outcome)
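
As a worked illustration of the entropy computation, the sketch below calculates the information gain of a Refund test using the counts shown later on slide 40 (3, 4, and 3 occurrences). Reading those counts as (Refund, class) pairs is an assumption, and the function names `entropy` and `information_gain` are hypothetical.

```python
import math

def entropy(class_counts):
    """Shannon entropy in bits, given the number of instances in each class."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)

def information_gain(parent_counts, subsets):
    """Entropy before the test minus the weighted entropy of its subsets."""
    total = sum(parent_counts)
    remainder = sum(sum(s) / total * entropy(s) for s in subsets)
    return entropy(parent_counts) - remainder

# Class counts are listed as [No, Yes]; the pairing with Refund is assumed.
parent = [7, 3]      # all ten instances
refund_yes = [3, 0]  # Refund = yes: class No 3 times
refund_no = [4, 3]   # Refund = no: class No 4 times, class Yes 3 times
gain = information_gain(parent, [refund_yes, refund_no])
print(round(gain, 3))
```

Under these assumptions the gain is about 0.19 bits: the Refund = yes subset is already pure (entropy 0), so all remaining uncertainty sits in the Refund = no subset.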

36  Rosina Weber ID3 algorithm

37  Rosina Weber ID3 algorithm

38  Rosina Weber ID3 algorithm

39  Rosina Weber ID3 algorithm (cont’d)

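40  Rosina Weber ID3 algorithm (cont’d) [Table of (Refund, class) counts: (Yes, No) 3 times; (No, No) 4 times; (No, Yes) 3 times]

41  Rosina Weber ID3 algorithm (cont’d) [Table of (Refund, class) counts: (Yes, No) 3 times; (No, No) 4 times; (No, Yes) 3 times. Table of (Marital Status, class) counts: (Single, No) 2 times; (Married, No) 3 times; (Divorced, No) 1 time; (Divorced, Yes) 1 time; (Single, Yes) 2 times]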

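42  Rosina Weber ID3 algorithm (cont’d) [Table of (Refund, class) counts: (Yes, No) 3 times; (No, No) 4 times; (No, Yes) 3 times. Tree so far: Refund? (yes: No; no: still to be split)]

43  Rosina Weber ID3 algorithm (cont’d) [Tree so far: Refund? (yes: No; no: still to be split)]

44  Rosina Weber ID3 algorithm (cont’d) [Tree so far: Refund? (yes: No; no: still to be split)]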

45  Rosina Weber ID3 algorithm (cont’d) [Table of (Marital Status, class) counts: (Single, No) 2 times; (Married, No) 3 times; (Divorced, No) 1 time; (Divorced, Yes) 1 time; (Single, Yes) 2 times. Tree so far: Refund? (yes: No; no: Marital Status? (married: No))]

46  Rosina Weber ID3 algorithm (cont’d) [Table of (Marital Status, class) counts: (Single, No) 2 times; (Married, No) 3 times; (Divorced, No) 1 time; (Divorced, Yes) 1 time; (Single, Yes) 2 times. Tree so far: Refund? (yes: No; no: Marital Status? (married: No; single, divorced: Taxable Income?))]

47  Rosina Weber ID3 algorithm (cont’d) [Table of (Marital Status, class) counts: (Single, No) 2 times; (Married, No) 3 times; (Divorced, No) 1 time; (Divorced, Yes) 1 time; (Single, Yes) 2 times. Tree so far: Refund? (yes: No; no: Marital Status? (married: No; single, divorced: Taxable Income? (< 80K: No)))]

48  Rosina Weber ID3 algorithm (cont’d) [Table of (Marital Status, class) counts: (Single, No) 2 times; (Married, No) 3 times; (Divorced, No) 1 time; (Divorced, Yes) 1 time; (Single, Yes) 2 times. Tree: Refund? (yes: No; no: Marital Status? (married: No; single, divorced: Taxable Income? (< 80K: No; > 80K: Yes)))]

49  Rosina Weber What rules can you derive from this decision tree? [Tree: Refund? (yes: No; no: Marital Status? (married: No; single, divorced: Taxable Income? (< 80K: No; > 80K: Yes)))]
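
One way to answer the question on slide 49 is to read off one IF-THEN rule per path from the root to a leaf. The sketch below encodes the slide's tree as nested tuples and enumerates those paths; the tuple encoding and the function name `tree_to_rules` are assumptions made for illustration.

```python
def tree_to_rules(node, conditions=()):
    """Yield (conditions, class) pairs, one per root-to-leaf path."""
    if isinstance(node, str):  # a leaf holds just the class label
        yield conditions, node
        return
    attribute, branches = node
    for outcome, child in branches:
        yield from tree_to_rules(child, conditions + ((attribute, outcome),))

# The tree from slide 49, written as (attribute, [(outcome, subtree), ...]).
tree = ("Refund", [
    ("yes", "No"),
    ("no", ("Marital Status", [
        ("married", "No"),
        ("single or divorced", ("Taxable Income", [
            ("< 80K", "No"),
            ("> 80K", "Yes"),
        ])),
    ])),
])

rules = list(tree_to_rules(tree))
for conditions, label in rules:
    tests = " AND ".join(f"{attribute} is {outcome}" for attribute, outcome in conditions)
    print(f"IF {tests} THEN class = {label}")
```

The tree yields four rules, and only the path through Refund = no, Marital Status = single or divorced, and Taxable Income > 80K leads to the class Yes.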

50  Rosina Weber Knowledge Discovery: 1. Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and understandable patterns in data (R. Feldman, 2000). 1.1 Data mining is one step in the KDD method. 1.2 Text mining concerns applying data mining techniques to unstructured text. 2. Knowledge Discovery from Processes.

51  Rosina Weber Automated Methods (ii): Example Selection – select an appropriate set of examples for various learning techniques. Explanation-Based Learning – deduce a general rule from a single example by relating it to an existing theory. Function Induction – learn functions from input data. Genetic Algorithm – crossing-over, mutation.
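
The two genetic operators named on slide 51 can be illustrated on bit strings. This is a minimal sketch assuming one-point crossover and bit-flip mutation, which are common textbook variants; the slide does not say which variants are meant, and the function names and parameters are hypothetical.

```python
import random

def one_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length parents at the given cut point."""
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]

rng = random.Random(0)  # seeded so that repeated runs give the same mutant
parent_a = [0, 0, 0, 0, 0, 0]
parent_b = [1, 1, 1, 1, 1, 1]
child_1, child_2 = one_point_crossover(parent_a, parent_b, 2)
mutant = mutate(child_1, 0.1, rng)
print(child_1, child_2, mutant)
```

Crossover recombines material already present in the parents, while mutation is the only operator here that can introduce a bit value absent from both.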

52  Rosina Weber Automated Methods (iii): Performance Feedback – performance feedback is used to reinforce behavior. Rule Induction. Similarity-Based Learning – learn similarities from sets of positive examples and differences from sets of negative examples. Systemic Principles Derivation – use general principles to derive specific laws.

53  Rosina Weber References: Boose, John H. (1990). Knowledge Acquisition Tools, Methods, and Mediating Representations. In Motoda, H., Mizoguchi, R., Boose, J. H., and Gaines, B. R. (Eds.), Proceedings of the First Japanese Knowledge Acquisition for Knowledge-Based Systems Workshop: JKAW-90. Ohmsha, Ltd: Japan. Buchanan, Bruce G. & Wilkins, David C. (Eds.). Readings in Knowledge Acquisition and Learning: Automating the Construction and Improvement of Expert Systems. Diaper, D. (1989). Knowledge Elicitation: Principles, Techniques and Applications. Chichester: John Wiley & Sons, pp. 96-97. Hoffman, R. R. (1987). The problem of extracting the knowledge of experts from the perspective of experimental psychology. AI Magazine, pp. 53-67.

