
Bayesian Networks October 9, 2008 Sung-Bae Cho

Agenda
- Bayesian Network Introduction
- Inference of Bayesian Network
- Modeling of Bayesian Network
- Bayesian Network Application
- Bayesian Network Application Example: Life Log
- Summary & Review

Probabilities
Probability distribution P(X | ξ): X is a random variable, ξ is the background state of information.
Discrete random variable: finite set of possible outcomes.
Continuous random variable: probability distribution (density function) over continuous values.

Rules of Probability
Product rule: P(X, Y) = P(X | Y) P(Y)
Marginalization: P(X) = Σ_y P(X, Y = y); for a binary Y this is P(X) = P(X, Y=0) + P(X, Y=1)
Bayes rule: P(Y | X) = P(X | Y) P(Y) / P(X)

Rules of Probability
Chain rule of probability: P(X_1, …, X_n) = P(X_1) P(X_2 | X_1) P(X_3 | X_1, X_2) … P(X_n | X_1, …, X_{n-1})

The Joint Distribution Recipe for making a joint distribution of M variables: make a truth table listing all combinations of values of your variables; for each combination of values, say how probable it is; the resulting numbers must sum to 1.

Using the Joint Once you have the joint distribution, you can ask for the probability of any logical expression E involving your attributes: P(E) = Σ_{rows matching E} P(row).
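
To make the recipe and the query step concrete, here is a minimal Python sketch. The three variables and all probabilities are made up for illustration, not taken from the slides: the joint is a table keyed by value combinations, and a query sums the rows that satisfy a logical expression.

```python
# Hypothetical joint distribution over three binary variables (A, B, C).
# Each entry is P(A=a, B=b, C=c); the values are made up and sum to 1.
joint = {
    (0, 0, 0): 0.30, (0, 0, 1): 0.05, (0, 1, 0): 0.10, (0, 1, 1): 0.05,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10, (1, 1, 0): 0.10, (1, 1, 1): 0.25,
}
assert abs(sum(joint.values()) - 1.0) < 1e-9   # the numbers must sum to 1

def prob(expr):
    """P(expr): sum the probabilities of all rows where expr is true."""
    return sum(p for (a, b, c), p in joint.items() if expr(a, b, c))

# P(A=1)
print(prob(lambda a, b, c: a == 1))
# P(A=1 or B=1) -- any logical expression over the attributes works
print(prob(lambda a, b, c: a == 1 or b == 1))
# Inference with the joint: P(C=1 | A=1) = P(C=1, A=1) / P(A=1)
print(prob(lambda a, b, c: c == 1 and a == 1) / prob(lambda a, b, c: a == 1))
```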

Using the Joint / Inference with the Joint
[Figures: worked examples of querying the joint table — P(E) by summing the matching rows, and conditional probabilities P(E1 | E2) = P(E1, E2) / P(E2)]

Joint Distributions
Good news: once you have a joint distribution, you can ask important questions about stuff that involves a lot of uncertainty.
Bad news: impossible to create for more than about ten attributes because there are so many numbers needed when you build them.

Bayesian Networks
In general, P(X1, …, Xn) needs at least 2^n − 1 numbers to specify the joint probability: exponential storage and inference.
Bayesian networks overcome the problem of exponential size by exploiting conditional independence.
Real joint probability distribution (2^n − 1 numbers) → graphical representation of the joint probability distribution (Bayesian network), via conditional independence (domain knowledge or derived from data).

Bayesian Networks?
Nodes: Visit to Asia (A), Smoking (S), Tuberculosis (T), Lung Cancer (L), Bronchitis (B), Chest X-ray (C), Dyspnoea (D)
P(A, S, T, L, B, C, D) = P(A) P(S) P(T|A) P(L|S) P(B|S) P(C|T,L) P(D|T,L,B)
Example CPD, P(D | T, L, B):
T L B | D=0 D=1
0 0 0 | 0.1 0.9
0 0 1 | 0.7 0.3
0 1 0 | 0.8 0.2
0 1 1 | 0.9 0.1
...
Conditional independencies → efficient representation [Lauritzen & Spiegelhalter, 95]
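
A quick way to see the saving: count the independent numbers needed for the full joint over these seven variables versus the factored form above. A minimal Python sketch, assuming (as in the network) that every variable is binary:

```python
# Parents of each node in the Asia network, read off the factorization
# P(A) P(S) P(T|A) P(L|S) P(B|S) P(C|T,L) P(D|T,L,B).
parents = {
    "A": [], "S": [],
    "T": ["A"], "L": ["S"], "B": ["S"],
    "C": ["T", "L"], "D": ["T", "L", "B"],
}

n = len(parents)
full_joint = 2 ** n - 1                                   # independent numbers in the full joint
factored = sum(2 ** len(pa) for pa in parents.values())   # one free number per parent configuration

print(f"full joint: {full_joint} numbers")   # 127
print(f"factored:   {factored} numbers")     # 1 + 1 + 2 + 2 + 2 + 4 + 8 = 20
```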

Bayesian Networks? Structured, graphical representation of probabilistic relationships between several random variables Explicit representation of conditional independencies Missing arcs encode conditional independence Efficient representation of the joint pdf (probability distribution function) Allows arbitrary queries to be answered, e.g. P(lung cancer=yes | smoking=no, dyspnoea=yes) = ?

Bayesian Networks? Also called belief networks, and (directed acyclic) graphical models Bayesian network Directed acyclic graph Nodes are variables (discrete or continuous) Arcs indicate dependence between variables Conditional Probabilities (local distributions)

Bayesian Networks
[Figure: two-node network Smoking → Cancer]
P(S): S=no 0.80, S=light 0.15, S=heavy 0.05
P(C | S):
Smoking =      no    light  heavy
C = none       0.96  0.88   0.60
C = benign     0.03  0.08   0.25
C = malignant  0.01  0.04   0.15

Product Rule
P(C, S) = P(C | S) P(S)

Bayes Rule
P(S | C):
Cancer =    none   benign  malignant
S = no      0.821  0.522   0.421
S = light   0.141  0.261   0.316
S = heavy   0.037  0.217   0.263
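
The posterior table above follows from the two previous slides by the product rule, marginalization and Bayes rule. A short sketch that recomputes it; the values come out close to, though not exactly matching, the table, which may reflect slightly different rounding or inputs in the original slide:

```python
# Prior P(S) and conditional P(C | S) from the Smoking/Cancer tables above.
P_S = {"no": 0.80, "light": 0.15, "heavy": 0.05}
P_C_given_S = {
    "no":    {"none": 0.96, "benign": 0.03, "malignant": 0.01},
    "light": {"none": 0.88, "benign": 0.08, "malignant": 0.04},
    "heavy": {"none": 0.60, "benign": 0.25, "malignant": 0.15},
}

def posterior_S_given_C(c):
    """Bayes rule: P(S=s | C=c) = P(C=c | S=s) P(S=s) / sum_s' P(C=c | S=s') P(S=s')."""
    joint = {s: P_C_given_S[s][c] * P_S[s] for s in P_S}   # product rule
    evidence = sum(joint.values())                         # marginalization
    return {s: joint[s] / evidence for s in P_S}

for c in ("none", "benign", "malignant"):
    print(c, posterior_S_given_C(c))
```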

Missing Arcs Represent Conditional Independence
[Figure: chain Battery → Engine Turns Over → Start]
Start and Battery are independent, given Engine Turns Over.
General product (chain) rule for Bayesian networks: P(X_1, …, X_n) = Π_i P(X_i | Parents(X_i))

Bayesian Network
[Figure: example network over Age, Gender, Exposure to Toxics, Smoking, Cancer, Serum Calcium, Lung Tumor]

Bayesian Network Knowledge Engineering Objective: Construct a model to perform a defined task Participants: Collaboration between domain expert and BN modeling expert Process: iterate until “done” Define task objective Construct model Evaluate model

The KE Process What are the variables? What are their values/states? What is the graph structure? What is the local model structure? What are the parameters (Probabilities)? What are the preferences (utilities)?

The Knowledge Acquisition Task
Variables: collectively exhaustive, mutually exclusive values; clarity test: a value should be knowable in principle.
Structure: can be learned if data are available, or constructed by hand (using "expert" knowledge); variable ordering matters: causal knowledge usually simplifies.
Probabilities: can be learned from data; the second decimal usually does not matter, relative probabilities do; sensitivity analysis.

What are the Variables? “Focus” or “query” variables Variables of interest “Evidence” or “Observation” variables What sources of evidence are available? “Context” variables Sensing conditions, background causal conditions Start with query variables and spread out to related variables

What are the Values/States? Variables/values must be exclusive and exhaustive Naive modelers sometimes create separate (often Boolean) variables for different states of the same variable Types of variables: binary (2-valued, including Boolean), qualitative, numeric discrete, numeric continuous Dealing with infinite and continuous domains: some BN software requires that continuous variables be discretized; discretization should be based on differences in effect on related variables (i.e., not just even-sized chunks)

Values versus Probabilities
What is a variable? Collectively exhaustive, mutually exclusive values.
[Figure: values of a variable (e.g., Error Occurred / No Error) versus a probability statement about a variable (e.g., Risk of Smoking vs. the variable Smoking)]

What is the Graph Structure? Goals in specifying graph structure Minimize probability elicitation: fewer nodes, fewer arcs, smaller state spaces Maximize fidelity of model Sometimes requires more nodes, arcs, and states Tradeoff between more accurate model and cost of additional modeling Too much detail can decrease accuracy

Agenda
- Bayesian Network Introduction
- Inference of Bayesian Network
- Modeling of Bayesian Network
- Bayesian Network Application
- Bayesian Network Application Example: Life Log
- Summary & Review

Inference We now have compact representations of probability distributions: Bayesian Networks Network describes a unique probability distribution P How do we answer queries about P? We use inference as a name for the process of computing answers to such queries

Inference - The Good & Bad News We can do inference We can compute any conditional probability P( Some variables | Some other variable values ) The sad, bad news Conditional probabilities by enumerating all matching entries in the joint are expensive: Exponential in the number of variables Sadder and worse news General querying of Bayesian networks is NP-complete Hardness does not mean we cannot solve inference It implies that we cannot find a general procedure that works efficiently for all networks For particular families of networks, we can have provably efficient procedures

Example
[Figure: network over H, M, G, S, J — H and M are parents of G; G is the parent of S and of J]
P(H=T) = 0.7, P(H=F) = 0.3
P(M=T) = 0.3, P(M=F) = 0.7
P(G | H, M):
(H, M) =  (T,T)  (T,F)  (F,T)  (F,F)
G=T       0.8    0.6    0.7    0.3
G=F       0.2    0.4    0.3    0.7
P(S | G): S=T: 0.6 (G=T), 0.7 (G=F); S=F: 0.4 (G=T), 0.3 (G=F)
P(J | G): J=T: 0.8 (G=T), 0.6 (G=F); J=F: 0.2 (G=T), 0.4 (G=F)
Joint probability (JP) computation: P(H, M, G, S, J) = P(H) P(M) P(G|H,M) P(S|G) P(J|G)

Queries: Likelihood
There are many types of queries we might ask; most of these involve evidence.
Evidence e is an assignment of values to a set E of variables in the domain.
Without loss of generality, E = { X_{k+1}, …, X_n }.
Simplest query: compute the probability of the evidence, P(e) = Σ_{x_1} … Σ_{x_k} P(x_1, …, x_k, e).
This is often referred to as computing the likelihood of the evidence.

Queries
Often we are interested in the conditional probability of a variable given the evidence: P(X | e) = P(X, e) / P(e).
This is the a posteriori belief in X, given evidence e.
A related task is computing the term P(X, e), i.e., the likelihood of e and X = x for each value x of X; we can recover the a posteriori belief by P(X = x | e) = P(X = x, e) / Σ_{x'} P(X = x', e).

A Posteriori Belief This query is useful in many cases: Prediction: what is the probability of an outcome given the starting condition? The target is a descendant of the evidence. Diagnosis: what is the probability of a disease/fault given symptoms? The target is an ancestor of the evidence. As we shall see, the direction of the arcs between variables does not restrict the direction of the queries. Probabilistic inference can combine evidence from all parts of the network.

Approaches to Inference
Exact inference: inference in simple chains; variable elimination; clustering / join tree algorithms.
Approximate inference: stochastic simulation / sampling methods; Markov chain Monte Carlo methods; mean field theory.

Inference in Simple Chains
Chain X1 → X2: how do we compute P(X2)? P(x2) = Σ_{x1} P(x1) P(x2 | x1)
Chain X1 → X2 → X3: how do we compute P(X3)? We already know how to compute P(X2): P(x3) = Σ_{x2} P(x2) P(x3 | x2)

Inference in Simple Chains
Chain X1 → X2 → X3 → … → Xn: how do we compute P(Xn)?
Compute P(X1), P(X2), P(X3), …, computing each term from the previous one: P(x_{i+1}) = Σ_{x_i} P(x_i) P(x_{i+1} | x_i).
Complexity: each step costs O(|Val(X_i)| × |Val(X_{i+1})|) operations.
Compare to naïve evaluation, which requires summing over the joint values of n−1 variables.
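
A minimal sketch of this forward pass, assuming binary variables and one made-up transition table shared along the chain; each step is a small local summation rather than a sum over the full joint:

```python
# Forward inference in a chain X1 -> X2 -> ... -> Xn of binary variables.
# Hypothetical numbers: P(X1) and one shared transition table P(X_{i+1} | X_i).
p_x1 = [0.6, 0.4]                      # P(X1 = 0), P(X1 = 1)
p_next_given_prev = [[0.9, 0.1],       # P(X_{i+1} | X_i = 0)
                     [0.2, 0.8]]       # P(X_{i+1} | X_i = 1)

def chain_marginal(n):
    """Compute P(Xn) with n-1 local summations instead of summing the full joint."""
    p = list(p_x1)
    for _ in range(n - 1):
        # P(X_{i+1} = j) = sum_i P(X_i = i) * P(X_{i+1} = j | X_i = i)
        p = [sum(p[i] * p_next_given_prev[i][j] for i in range(2)) for j in range(2)]
    return p

print(chain_marginal(2))   # P(X2)
print(chain_marginal(10))  # P(X10): still only 9 small local steps
```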

Elimination in Chains
[Figure: chain A → B → C → D → E]
P(e) = Σ_d Σ_c Σ_b Σ_a P(a) P(b|a) P(c|b) P(d|c) P(e|d)
Eliminating A: Σ_a P(a) P(b|a) = p(b), so P(e) = Σ_d Σ_c Σ_b p(b) P(c|b) P(d|c) P(e|d)
Eliminating B next gives p(c) = Σ_b p(b) P(c|b), and so on along the chain.

Variable Elimination
General idea: write the query in the form P(X_1, e) = Σ_{x_n} … Σ_{x_3} Σ_{x_2} Π_i P(x_i | pa_i).
Iteratively: move all irrelevant terms outside of the innermost sum; perform the innermost sum, getting a new term; insert the new term into the product.
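
Below is a compact sketch of variable elimination over factors, assuming binary variables and a made-up three-node chain A → B → C; it follows the loop described above: multiply the factors that mention the variable, sum the variable out, and put the resulting factor back.

```python
from itertools import product

# Minimal factor representation: (variables, table) where table maps a tuple of
# values (in the same order as variables) to a number. All variables are binary here.
def multiply(f, g):
    fv, ft = f
    gv, gt = g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for vals in product((0, 1), repeat=len(out_vars)):
        assign = dict(zip(out_vars, vals))
        table[vals] = (ft[tuple(assign[v] for v in fv)] *
                       gt[tuple(assign[v] for v in gv)])
    return out_vars, table

def sum_out(f, var):
    fv, ft = f
    out_vars = tuple(v for v in fv if v != var)
    table = {}
    for vals, p in ft.items():
        key = tuple(v for v, name in zip(vals, fv) if name != var)
        table[key] = table.get(key, 0.0) + p
    return out_vars, table

def eliminate(factors, order):
    """Variable elimination: for each variable, multiply the factors mentioning it,
    sum that variable out, and insert the new factor back into the product."""
    for var in order:
        involved = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        prod = involved[0]
        for f in involved[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Hypothetical chain A -> B -> C; CPT numbers are made up.
P_A = (("A",), {(0,): 0.6, (1,): 0.4})
P_B_given_A = (("B", "A"), {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.3, (1, 1): 0.7})
P_C_given_B = (("C", "B"), {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.25, (1, 1): 0.75})

# P(C): eliminate A, then B.
print(eliminate([P_A, P_B_given_A, P_C_given_B], ["A", "B"]))
```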

Stochastic Simulation Suppose you are given values for some subset of the variables, G, and want to infer values for unknown variables, U Randomly generate a very large number of instantiations from the BN Generate instantiations for all variables – start at root variables and work your way “forward” Only keep those instantiations that are consistent with the values for G Use the frequency of values for U to get estimated probabilities Accuracy of the results depends on the size of the sample (asymptotically approaches exact results)
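
A sketch of this rejection-sampling idea on a made-up two-node network (Cloudy → Rain); the structure and numbers are illustrative only:

```python
import random

random.seed(0)

# Hypothetical network Cloudy -> Rain with made-up probabilities.
def sample_instance():
    cloudy = random.random() < 0.5
    rain = random.random() < (0.8 if cloudy else 0.1)
    return {"Cloudy": cloudy, "Rain": rain}

def estimate(query_var, evidence, n_samples=100_000):
    """Generate forward samples from roots onward, keep only those consistent
    with the evidence, and use the frequency of the query variable's values."""
    kept = 0
    hits = 0
    for _ in range(n_samples):
        inst = sample_instance()
        if all(inst[v] == val for v, val in evidence.items()):
            kept += 1
            hits += inst[query_var]
    return hits / kept if kept else float("nan")

# Estimate P(Cloudy=true | Rain=true); the exact value is 0.8*0.5 / (0.8*0.5 + 0.1*0.5) ~= 0.889
print(estimate("Cloudy", {"Rain": True}))
```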

Markov Chain Monte Carlo Methods So called because Markov chain – each instance generated in the sample is dependent on the previous instance Monte Carlo – statistical sampling method Perform a random walk through variable assignment space, collecting statistics as you go Start with a random instantiation, consistent with evidence variables At each step, for some non-evidence variable, randomly sample its value, consistent with the other current assignments Given enough samples, MCMC gives an accurate estimate of the true distribution of values
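
A sketch of Gibbs sampling, one common MCMC scheme, on a made-up three-node network; each step resamples one non-evidence variable from its conditional given the current values of all the others, which for this tiny network can be computed directly from the joint:

```python
import random

random.seed(0)

# Hypothetical network Cloudy -> Sprinkler, Cloudy -> Rain (all binary, made-up CPTs).
def joint(a):
    """Joint probability of a full assignment under the hypothetical CPTs."""
    p_c = 0.5                               # P(Cloudy = T)
    p_s = 0.1 if a["Cloudy"] else 0.5       # P(Sprinkler = T | Cloudy)
    p_r = 0.8 if a["Cloudy"] else 0.2       # P(Rain = T | Cloudy)
    p = p_c if a["Cloudy"] else 1 - p_c
    p *= p_s if a["Sprinkler"] else 1 - p_s
    p *= p_r if a["Rain"] else 1 - p_r
    return p

def gibbs(query, evidence, n_steps=50_000, burn_in=1_000):
    """Random walk over assignments to the non-evidence variables, resampling
    one variable at a time from its conditional given the rest."""
    variables = ["Cloudy", "Sprinkler", "Rain"]
    state = dict(evidence)
    hidden = [v for v in variables if v not in evidence]
    for v in hidden:
        state[v] = random.random() < 0.5        # random start, consistent with evidence
    hits = kept = 0
    for step in range(n_steps):
        for v in hidden:
            # P(v | all other current values) is proportional to the joint.
            p_true = joint({**state, v: True})
            p_false = joint({**state, v: False})
            state[v] = random.random() < p_true / (p_true + p_false)
        if step >= burn_in:
            kept += 1
            hits += state[query]
    return hits / kept

# Estimate P(Cloudy=true | Rain=true); the exact value is 0.8*0.5 / (0.8*0.5 + 0.2*0.5) = 0.8
print(gibbs("Cloudy", {"Rain": True}))
```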

Agenda
- Bayesian Network Introduction
- Inference of Bayesian Network
- Modeling of Bayesian Network
- Bayesian Network Application
- Bayesian Network Application Example: Life Log
- Summary & Review

Why Learning?
Knowledge-based (expert systems): Answer Wizard, Office 95, 97, & 2000; Troubleshooters, Windows 98 & 2000.
Data-based: causal discovery; data visualization; concise model of data; prediction.

Why Learning? Knowledge acquisition bottleneck: knowledge acquisition is an expensive process, and often we don't have an expert. Data is cheap: the amount of available information is growing rapidly, and learning allows us to construct models from raw data. Conditional independencies & the graphical language capture the structure of many real-world distributions. The graph structure provides much insight into the domain and allows "knowledge discovery". The learned model can be used for many tasks. Supports all the features of probabilistic learning: model selection criteria, dealing with missing data & hidden variables.

Why Struggle for Accurate Structure?
[Figure: true network over Earthquake, Burglary, Alarm Set, Sound, next to versions with an added arc and with a missing arc]
Adding an arc: increases the number of parameters to be fitted; makes wrong assumptions about causality and domain structure.
Missing an arc: cannot be compensated for by accurate fitting of parameters; also misses causality and domain structure.

Learning Bayesian Networks from Data
[Figure: a data table over variables X1, X2, X3, … with numeric and boolean entries, plus prior/expert information, fed to a Bayesian network learner that outputs a network over X1 … X9]

Learning Bayesian Networks Known Structure, Complete Data Network structure is specified Inducer needs to estimate parameters Data does not contain missing values Unknown Structure, Complete Data Network structure is not specified Inducer needs to select arcs & estimate parameters Data does not contain missing values Known Structure, Incomplete Data Data contains missing values Need to consider assignments to missing values Unknown Structure, Incomplete Data Need to consider assignments to missing values

Two Types of Methods for Learning BNs Constraint based Finds a Bayesian network structure whose implied independence constraints “match” those found in the data Scoring methods (Bayesian, MDL, MML) Find the Bayesian network structure that can represent distributions that “match” the data (i.e. could have generated the data) Practical considerations The number of possible BN structures is super exponential in the number of variables. How do we find the best graph(s)?

Approaches to Learning Structure Constraint based Perform tests of conditional independence Search for a network that is consistent with the observed dependencies and independencies Pros & Cons Intuitive, follows closely the construction of BNs Separates structure learning from the form of the independence tests Sensitive to errors in individual tests Computationally hard

Approaches to Learning Structure Score based Define a score that evaluates how well the (in)dependencies in a structure match the observations Search for a structure that maximizes the score Pros & Cons Statistically motivated Can make compromises Takes the structure of conditional probabilities into account Computationally hard

Score-based Learning
Define a scoring function that evaluates how well a structure matches the data; search for a structure that maximizes the score.
[Figure: a data set over E, B, A — e.g. <Y, Y, Y>, <N, N, Y>, <N, Y, Y>, … — and several candidate structures over E, B, A being scored against it]

Structure Search as Optimization Input: training data; scoring function; set of possible structures. Output: a network that maximizes the score. Key computational property: decomposability — the score is a sum of local (per-family) scores, score(G) = Σ_i FamScore(X_i, Pa_i(G)).

Tree Structured Networks At most one parent per variable Why trees? Elegant math We can solve the optimization problem Sparse parameterization Avoid overfitting

Beyond Trees When we consider more complex network, the problem is not as easy Suppose we allow at most two parents per node A greedy algorithm is no longer guaranteed to find the optimal network In fact, no efficient algorithm exists

Model Search
Finding the BN structure with the highest score among those structures with at most k parents is NP-hard for k > 1 (Chickering, 1995).
Heuristic methods: greedy, greedy with restarts, MCMC methods.
[Figure: greedy search loop — initialize structure; score all possible single changes; if any change is better, perform the best change and repeat; otherwise return the saved structure]

Algorithm B

Scoring Functions: MDL
Minimum Description Length (MDL): learning ⟺ data compression.
MDL(Model | Data) = DL(Model) + DL(Data | Model)
Other: MDL = −BIC (Bayesian Information Criterion); the Bayesian score (BDe) is asymptotically equivalent to MDL.
[Figure: a data table with missing entries, e.g. <9.7 0.6 8 14 18>, <0.2 1.3 5 ?? ??>, <1.3 2.8 ?? 0 1>, <?? 5.6 0 10 ??>, …]
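
As an illustration of a decomposable score, here is a sketch of a BIC-style family score (maximized log-likelihood minus a complexity penalty; MDL is its negative under the sign convention on the slide) for complete discrete data. The tiny data set is made up:

```python
import math
from collections import Counter

def family_score(child, parents, data, arity=2):
    """BIC-style family score: log-likelihood of the child given its parents,
    minus a penalty of (#free parameters / 2) * log N."""
    n = len(data)
    joint_counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint_counts.items())
    n_params = (arity - 1) * (arity ** len(parents))
    return loglik - 0.5 * n_params * math.log(n)

def network_score(structure, data):
    """Decomposable score of a whole structure: the sum of its family scores."""
    return sum(family_score(child, parents, data) for child, parents in structure.items())

# Tiny made-up dataset over two binary variables A and B.
data = ([{"A": 0, "B": 0}] * 40 + [{"A": 0, "B": 1}] * 10 +
        [{"A": 1, "B": 1}] * 35 + [{"A": 1, "B": 0}] * 15)

print(network_score({"A": [], "B": []}, data))      # independent model
print(network_score({"A": [], "B": ["A"]}, data))   # model with arc A -> B
```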

Typical Operations
Local changes to a structure over S, C, E, D: add an arc (e.g., add C → D), delete an arc (e.g., delete C → E), reverse an arc (e.g., reverse C → E).

Exploiting Decomposability in Local Search Caching: to update the score of a structure after a local change, we only need to re-score the families that were changed in the last move.

Greedy Hill-Climbing Simplest heuristic local search. Start with a given network: the empty network, the best tree, or a random network. At each iteration: evaluate all possible changes; apply the change that leads to the best improvement in score; reiterate. Stop when no modification improves the score. Each step requires evaluating approximately n new changes.
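
A self-contained sketch of this loop, using a BIC-style family score (as in the earlier scoring sketch) and a made-up data set; arc additions, deletions and reversals are the candidate changes, and acyclicity is checked before scoring. This is an illustration of the search scheme, not any particular published implementation.

```python
import math
from collections import Counter
from itertools import permutations

def family_score(child, parents, data, arity=2):
    """BIC-style family score (log-likelihood minus complexity penalty)."""
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    par = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(c * math.log(c / par[pa]) for (pa, _), c in joint.items())
    return loglik - 0.5 * (arity - 1) * (arity ** len(parents)) * math.log(len(data))

def score(structure, data):
    return sum(family_score(x, ps, data) for x, ps in structure.items())

def is_acyclic(structure):
    seen, done = set(), set()
    def visit(x):
        if x in done: return True
        if x in seen: return False          # back edge -> cycle
        seen.add(x)
        ok = all(visit(p) for p in structure[x])
        done.add(x)
        return ok
    return all(visit(x) for x in structure)

def neighbours(structure):
    """All structures one arc addition, deletion, or reversal away."""
    for x, y in permutations(structure, 2):
        s = {v: list(ps) for v, ps in structure.items()}
        if x in s[y]:
            s[y].remove(x)
            yield s                          # delete x -> y
            s2 = {v: list(ps) for v, ps in s.items()}
            s2[x].append(y)
            yield s2                         # reverse x -> y
        else:
            s[y].append(x)
            yield s                          # add x -> y

def hill_climb(variables, data):
    """Start from the empty network, repeatedly apply the single arc change that
    most improves the score, stop when no change improves it."""
    structure = {v: [] for v in variables}
    current = score(structure, data)
    while True:
        best_cand, best_score = None, current
        for cand in neighbours(structure):
            if not is_acyclic(cand):
                continue
            s = score(cand, data)
            if s > best_score + 1e-9:
                best_cand, best_score = cand, s
        if best_cand is None:
            return structure, current
        structure, current = best_cand, best_score

# Tiny made-up dataset over three binary variables.
data = ([{"A": 0, "B": 0, "C": 0}] * 30 + [{"A": 0, "B": 1, "C": 1}] * 10 +
        [{"A": 1, "B": 1, "C": 1}] * 35 + [{"A": 1, "B": 0, "C": 0}] * 25)

print(hill_climb(["A", "B", "C"], data))
```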

Greedy Hill-Climbing: Possible Pitfalls Greedy hill-climbing can get stuck in: Local maxima: all one-edge changes reduce the score. Plateaus: some one-edge changes leave the score unchanged; this happens because equivalent networks receive the same score and are neighbors in the search space. Both occur during structure search. Standard heuristics can escape both: random restarts, TABU search.

Agenda
- Bayesian Network Introduction
- Inference of Bayesian Network
- Modeling of Bayesian Network
- Bayesian Network Application
- Bayesian Network Application Example: Life Log
- Summary & Review

What are BNs useful for?
Predictive inference (cause → effect): prediction, P(symptom | cause) = ?
Diagnostic reasoning (effect → cause): diagnosis, P(cause | symptom) = ?
Classification: P(class | data)
Decision-making (given a cost function)
Data mining: induce the best model from data
Typical domains: medicine, bio-informatics, computer troubleshooting, stock market, text classification, speech recognition

Why use BNs? Explicit management of uncertainty Modularity (modular specification of a joint distribution) implies maintainability Better, flexible and robust decision making – MEU (Maximization of Expected Utility), VOI (Value of Information) Can be used to answer arbitrary queries - multiple fault problems (General purpose “inference” algorithm) Easy to incorporate prior knowledge Easy to understand

Example from Medical Diagnostics
[Figure: network with patient information (Visit to Asia, Smoking), medical difficulties (Tuberculosis, Lung Cancer, Bronchitis, Tuberculosis or Cancer) and diagnostic tests (X-Ray Result, Dyspnea)]
The network represents a knowledge structure that models the relationship between medical difficulties, their causes and effects, patient information and diagnostic tests.

Example from Medical Diagnostics
[Figure: the same network with conditional probability tables attached, e.g. P(Tub or Can | Tuberculosis, Lung Cancer) and P(Dyspnea | Tub or Can, Bronchitis) with entries such as 0.90, 0.70, 0.80, 0.10, 0.10, 0.30, 0.20]
Relationship knowledge is modeled by deterministic functions, logic and conditional probability distributions.

Example from Medical Diagnostics Propagation algorithm processes relationship information to provide an unconditional or marginal probability distribution for each node The unconditional or marginal probability distribution is frequently called the belief function of that node

Example from Medical Diagnostics As a finding is entered, the propagation algorithm updates the beliefs attached to each relevant node in the network Interviewing the patient produces the information that “Visit to Asia” is “Visit” This finding propagates through the network and the belief functions of several nodes are updated

Example from Medical Diagnostics Further interviewing of the patient produces the finding “Smoking” is “Smoker” This information propagates through the network

Example from Medical Diagnostics Finished with interviewing the patient, the physician begins the examination The physician now moves to specific diagnostic tests such as an X-Ray, which results in a “Normal” finding which propagates through the network Note that the information from this finding propagates backward and forward through the arcs

Example from Medical Diagnostics The physician also determines that the patient is having difficulty breathing, the finding “Present” is entered for “Dyspnea” and is propagated through the network The doctor might now conclude that the patient has bronchitis and does not have tuberculosis or lung cancer

Applications
Industrial: processor fault diagnosis (Intel); auxiliary turbine diagnosis (GEMS, by GE); diagnosis of space shuttle propulsion systems (VISTA, by NASA/Rockwell); situation assessment for nuclear power plants (NRC)
Military: automatic target recognition (MITRE); autonomous control of unmanned underwater vehicles (Lockheed Martin); assessment of intent
Medical diagnosis: internal medicine; pathology diagnosis (Intellipath, by Chapman & Hall); Breast Cancer Manager with Intellipath
Commercial: financial market analysis; information retrieval; software troubleshooting and advice (Windows 95 & Office 97); pregnancy and child care (Microsoft); software debugging (American Airlines' SABRE online reservation system)

Agenda
- Bayesian Network Introduction
- Inference of Bayesian Network
- Modeling of Bayesian Network
- Bayesian Network Application
- Bayesian Network Application Example: Life Log
- Summary & Review

Modeling Users in Location-Tracking Applications [Abdelsalam 2004]
Taxi-calling service: customers are served by tracking roving users and dispatching a suitable taxi.
The user's position is sent at a specified time interval, so the position within the interval is uncertain.
General inputs: direction, last position, velocity.
Proposed solution: consider additional evidence with a Bayesian network — the user's habits, behavior and environment.
The proposed method takes unique goals, personalities, and tasks into account for location reasoning.
Goal: build location-tracking applications that take into account the individual characteristics, habits, and preferences of the users.
W. Abdelsalam, Y. Ebrahim, "Managing uncertainty: Modeling users in location-tracking applications," IEEE Pervasive Computing, 2004.

Modeling Users Temporal variables Time information Ex) Events occurred, time of year, day of the week, time of day Spatial variables Information about locations of user Ex) Building, town, certain part of town, certain road or highway Environmental variables Environmental information Ex) Weather conditions, road conditions, special events Behavioral variables Behavioral feature & information Ex) Typical speeds, resting patterns, preferred work areas, common reactions in certain situations

Collecting User Data
User behavior log: user-specific data, environment-specific data, user location data.
Data collection frequency is feature-driven: location is collected periodically; events & occurrences are non-periodic.

Building the User Model with Bayesian Networks
Flexible and automatic techniques are needed → Bayesian networks (cf. logic-based methods).
Two major purposes of using a BN: reasoning about causes from observations, and handling changed evidence with reduced computation.

Using Bayesian Networks Taxicab location-tracking application
Event (time of day): morning rush hour, lunch hour, evening rush hour, late night, etc.
Source: start location (airport, downtown, …)
Destination: end location
Weather conditions: sunny, rainy, snowing, …
Route: all available routes are considered; whether a route is local or highway is taken into account
Speed: speed variation range
[Figure: network fragment relating Last location, Route, Speed and Future location]
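
Purely as an illustration of how the variables listed above could be wired together, here is a hypothetical structure sketch in Python; the arcs below are an assumption for the example and are not given in the slide or the cited paper.

```python
# Illustrative sketch only: one plausible wiring of the listed variables into a
# network predicting the taxi's future location. The arcs are assumptions.
taxicab_bn = {
    # child: list of parents
    "Route":           ["Source", "Destination", "Event"],
    "Speed":           ["Route", "Weather conditions", "Event"],
    "Future location": ["Last location", "Route", "Speed"],
    # root nodes
    "Event": [], "Source": [], "Destination": [],
    "Weather conditions": [], "Last location": [],
}

example_states = {
    "Event": ["morning rush hour", "lunch hour", "evening rush hour", "late night", "etc"],
    "Weather conditions": ["sunny", "rainy", "snowing"],
}

# Each non-root node needs a conditional distribution given its parents.
for child, parents in taxicab_bn.items():
    if parents:
        print(f"P({child} | {', '.join(parents)})")
```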

Experimental Results (1)
Comparison with the LSR (Last Speed Reported) method: predicted distance = velocity × time.
The simulation uses artificially generated data simulating roving users, with simplified speed values for routes and weather: 45% less than 10 km/h, 25% at 20 km/h, 20% at 50 km/h, 10% at 100 km/h.
200 routes are generated; local or highway is randomly selected; each route is a set of trip segments (TSs), each with a speed and a duration.
Weather: good or bad.

Experimental Results (2) Standard deviation of the distance error for each reporting interval

Agenda
- Bayesian Network Introduction
- Inference of Bayesian Network
- Modeling of Bayesian Network
- Bayesian Network Application
- Bayesian Network Application Example: Life Log
- Summary & Review

Summary & Review
Bayesian networks: inference & modeling methods; applications; a Life Log application example using a BN for inference from user modeling and observation.
Ongoing research: more detailed modeling using SOM; route reduction using key and secondary route segments; application to LBS: utilization of personalized context for location-based services.