1 Knowledge Engineering for Bayesian Networks
Ann Nicholson
School of Computer Science and Software Engineering, Monash University

2 Overview
- The BN knowledge engineering process
  - focus on combining expert elicitation and automated methods
- Case Study I: Seabreeze prediction
- Case Study II: Intelligent Tutoring System for decimal misconceptions
- Conclusions

3 Elicitation from experts
- Variables
  - important variables? values/states?
- Structure
  - causal relationships?
  - dependencies/independencies?
- Parameters (probabilities)
  - quantify relationships and interactions?
- Preferences (utilities) (for decision networks)

4 Expert Elicitation Process
- These stages are done iteratively
- Stops when further expert input is no longer cost-effective
- Process is difficult and time-consuming

5 Knowledge discovery
- There is much interest in automated methods for learning BNs from data
  - parameters, structure (causal discovery)
- Computationally complex problem, so current methods have practical limitations
  - e.g. they limit the number of states, require variable-ordering constraints, do not specify all arc directions, don't handle hidden variables
- Evaluation methods

6 The knowledge engineering process
1. Building the BN
   - variables, structure, parameters, preferences
   - combination of expert elicitation and knowledge discovery
2. Validation/Evaluation
   - case-based evaluation, sensitivity analysis, accuracy testing
3. Field Testing
   - alpha/beta testing, acceptance testing
4. Industrial Use
   - collection of statistics
5. Refinement
   - updating procedures, regression testing

7 Case Study: Seabreeze prediction
- Joint project with the Bureau of Meteorology
  - (Kennett, Korb & Nicholson, PAKDD'2001)
- Goal: proof of concept; test ideas about integrating automated learners and elicitation
- What is a seabreeze? (separate picture)

8 Rule-based predictor and data
- The Bureau of Meteorology's (BOM) rule-based system achieved about 67% predictive accuracy; it is currently in use:

  If   the wind component is offshore
  and  the wind component is < 23 knots
  and  the forecast period is in the afternoon
  then a seabreeze is likely

- Seabreeze data:
  - 30MB from October 1997 to October 1999 from Sydney, Australia; 7% had missing attribute values
  - Three types of sensor-site data:
    - Automatic weather stations: ground-level data, time
    - Olympic sites (for sailing, etc.): rain, temperature, humidity, wind
    - Balloon data: gradient-level readings
  - Predicted variables: wind speed and direction
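The BOM rule above can be written as a simple predicate. This is a minimal sketch: the function name and the boolean/numeric encoding of the conditions are assumptions for illustration, not BOM's actual implementation.

```python
def seabreeze_likely(wind_offshore, wind_knots, is_afternoon):
    """Return True when the BOM-style rule predicts a seabreeze:
    offshore wind, under 23 knots, afternoon forecast period."""
    return wind_offshore and wind_knots < 23 and is_afternoon

# Offshore wind at 15 knots in the afternoon: seabreeze likely
print(seabreeze_likely(True, 15, True))
# Wind too strong (30 knots): no seabreeze predicted
print(seabreeze_likely(True, 30, True))
```

The rule's fixed 23-knot threshold is exactly the kind of hard boundary that a Bayesian network replaces with a probability distribution over outcomes.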

9 Methodology
1. Expert elicitation: using the variables with available data, forecasters provided causal relations between them.
2. Tetrad II (Spirtes et al., 1993) uses the Verma-Pearl algorithm (1991) with significance testing to recover causal structure. (NB: usability problems)
3. CaMML (Wallace and Korb, 1999) uses Minimum Message Length (MML) to discover causal structure.
- BNs for seabreeze prediction (see separate slide)
  - All parameterisation was performed by Netica, keeping the different methods on an equal footing.
  - Netica uses simple counting over the training data to estimate conditional probabilities (Spiegelhalter & Lauritzen, 1990).
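The counting parameterisation can be sketched as follows. This is an illustration of counting-based CPT estimation with uniform pseudo-counts, in the spirit of Spiegelhalter & Lauritzen (1990); the function, record format, and variable names are assumptions, not Netica's API.

```python
from collections import Counter

def estimate_cpt(records, child, parents, child_states, prior=1.0):
    """Estimate P(child | parents) by counting co-occurrences in the
    data, smoothed with `prior` pseudo-counts per child state.
    `records` is a list of dicts mapping variable name -> state."""
    counts = Counter()
    parent_keys = set()
    for r in records:
        key = tuple(r[p] for p in parents)
        parent_keys.add(key)
        counts[(key, r[child])] += 1
    cpt = {}
    for key in parent_keys:
        total = sum(counts[(key, s)] for s in child_states)
        total += prior * len(child_states)
        cpt[key] = {s: (counts[(key, s)] + prior) / total
                    for s in child_states}
    return cpt

# Toy data with hypothetical variable names, not the seabreeze dataset
data = [{"wind": "on", "sb": "yes"},
        {"wind": "on", "sb": "yes"},
        {"wind": "off", "sb": "no"}]
cpt = estimate_cpt(data, "sb", ["wind"], ["yes", "no"])
```

With the uniform pseudo-count, two of two "yes" observations for `wind=on` yield P(sb=yes | wind=on) = (2+1)/(2+0+2) = 0.75 rather than a hard 1.0, which matters for the 7% of records with missing or sparse values.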

10 Predictive accuracy
- Instead of predicting seabreeze existence, we substituted the more demanding task of predicting wind direction at ground level
- From this (together with gradient-level wind direction), seabreezes can be inferred
- Training/testing regime:
  - randomly select 80% of the data for training
  - use the remainder for testing accuracy
- Results: see separate slide (comparison of airport site-type network versions)
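The 80/20 regime above can be sketched as a seeded random split (a minimal sketch; the function name and fixed seed are assumptions added for reproducibility):

```python
import random

def split_train_test(data, train_frac=0.8, seed=0):
    """Randomly shuffle the records and split them into training and
    test sets, as in the 80%/20% regime described on the slide."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

train, test = split_train_test(range(100))
```

A fixed seed makes the comparison between network versions repeatable; averaging over several random splits would tighten the reported ~10% confidence intervals.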

11 Predictive accuracy: conclusions
- Elicited and discovered nets (MML + Tetrad II) are systematically superior to the BOM rule-based system
- Discovered networks are superior to elicited nets in the first 3 hours (confidence intervals are ~10%)
- Strong time component to accuracy

12 Adaptation: incremental learning
- Learn structure from the first year's data (using MML)
- Reparameterise the nets over the second year's data, while predicting seabreezes
  - greedy search yielded a time-decay factor of e^(-0.05t)
- Results (see separate slides)
  - comparison of incremental and normal training methods, by BN type and by time of year
  - incremental training performed better
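The incremental reparameterisation can be sketched as exponentially discounted counting: before each new batch of observations is added, the old counts are scaled down, so evidence from t steps ago carries weight e^(-0.05t). The data structure and function name are illustrative assumptions, not the actual system.

```python
import math

def decayed_update(counts, new_obs, decay=0.05):
    """One incremental step: discount existing counts by e**(-decay)
    per time step, then add the new observations with full weight."""
    factor = math.exp(-decay)
    counts = {k: v * factor for k, v in counts.items()}
    for k in new_obs:
        counts[k] = counts.get(k, 0.0) + 1.0
    return counts

# Hypothetical tallies for one CPT cell before a new observation arrives
c = {"sb=yes": 10.0, "sb=no": 10.0}
c = decayed_update(c, ["sb=yes"])
```

Applying the same discount every step means the effective sample size stops growing, so the parameters can track the seasonal drift noted on the previous slide instead of freezing on the first year's statistics.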

13 Case Study II: Intelligent tutoring
- Tutoring domain: primary and secondary school students' misconceptions about decimals
- Based on the Decimal Comparison Test (DCT)
  - students are asked to choose the larger of pairs of decimals
  - different types of pairs reveal different misconceptions
- The ITS is built around computer games involving decimals
- This research also combines expert elicitation and automated methods

14 Expert classification of Decimal Comparison Test (DCT) results

15 The ITS architecture (diagram)
Components and labels recoverable from the diagram:
- Inputs (all optional): decimal comparison test, classroom diagnostic test results, information about the student (e.g. age)
- Adaptive Bayesian network: generic BN model of the student; diagnose misconception, predict outcomes, identify most useful information
- Computer games: Hidden Number, Flying Photographer, Decimaliens, Number Between, ....
- System controller module: sequencing tactics (select next item type, decide to present help, decide to change to a new game, identify when expertise is gained); exchanges items, answers, help and feedback with the student
- Teacher: receives a report on the student; classroom teaching activities

16 Expert Elicitation
- Variables
  - two classification nodes: fine and coarse
  - item types: (i) H/M/L, (ii) 0-N
- Structure
  - arcs from classification to item type
  - item types independent given classification
- Parameters
  - careless mistake (3 different values)
  - expert ignorance: "-" in table (uniform distribution)
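The careless-mistake parameter can be illustrated as a softening of a deterministic answer model: a student whose class implies a correct answer still slips with some small probability, and one whose class implies an error still gets it right by accident. The class names and probability values below are illustrative assumptions, not the elicited numbers.

```python
def item_cpt(p_correct_given_class, careless=0.05):
    """Soften P(correct | class) with a careless-mistake probability:
    intended-correct answers fail with prob `careless`, and
    intended-wrong answers succeed with prob `careless`."""
    return {cls: p * (1 - careless) + (1 - p) * careless
            for cls, p in p_correct_given_class.items()}

# Hypothetical deterministic model: experts always right on this item
# type, students with this misconception always wrong
cpt = item_cpt({"expert": 1.0, "misconception": 0.0})
```

Softening every deterministic entry this way keeps the likelihood of any observed answer sequence nonzero, so a single slip cannot rule a classification out entirely.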

17 Expert Elicited BN

18 Evaluation process
- Case-based evaluation
  - experts checked individual cases
  - sometimes, if the prior was low, the 'true' classification did not have the highest posterior (but it usually had the biggest change in ratio)
- Adaptiveness evaluation
  - priors change after each set of evidence
- Comparison evaluation
  - differences in evaluation between the BN and the expert rule
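The case-based observation, that a 'true' class with a low prior may not have the highest posterior even when its probability ratio changes most, can be shown with a plain Bayes update. The class labels and numbers below are made-up illustrative values, not the elicited network's parameters.

```python
def posterior(prior, likelihood):
    """Bayes update over classes: P(c | e) proportional to P(e | c) P(c)."""
    unnorm = {c: prior[c] * likelihood[c] for c in prior}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

prior = {"A": 0.9, "B": 0.1}          # hypothetical class priors
evidence = {"A": 0.2, "B": 0.9}       # hypothetical P(answers | class)
post = posterior(prior, evidence)
```

Here class B's posterior jumps from 0.1 to 1/3 (ratio > 3), while A's shrinks, yet A still ranks highest: exactly the behaviour the experts flagged during case-based checking.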

19 Comparison: expert BN vs rule (table: Undesirable / Desirable / Same)

20 Results (table: Undesirable / Desirable / Same)
- varying the probability of a careless mistake
- varying the granularity of the item type: 0-N and H/M/L

21 Automated methods: classification
- Applied the SNOB classification program, based on MML
- Using data from 2437 students and 30 items, SNOB produced 14 classes
  - 10 corresponded to expert classes
  - 2 expert classes, LRV and AU, were not found
  - 4 classes were mainly combinations of AU and UN
  - 0.5% of students could not be classified
- Using pre-processed data (0-N or H/M/L) on 6 item types, SNOB found only 5 or 6 classes

22 Automated methods
- Parameters
  - again, used the Netica counting method
- Structure
  - applied CaMML to the pre-processed data (0-N and H/M/L)
    1. constrained so that the classification node was the parent of the item-type nodes
    2. unconstrained
  - many different network structures were found, all with arcs between item-type nodes, of varying complexity

23 Results from automated methods

24 Conclusions
- Automated methods yielded BNs whose quantitative results were comparable to or better than the elicited BNs
  - a validation of the automated methods (?)
- Undertaking both elicitation and automated KE resulted in additional domain analysis (e.g. 0-N vs H/M/L)
- A hybrid of expert and automated approaches is feasible
  - a methodology for combining them is needed
  - evaluation measures and methods are needed (and may be domain-specific)