Reasoning under Uncertainty Eugene Fink LTI Seminar November 16, 2007.



Challenges The available knowledge about the real world is inherently uncertain. We usually make decisions based on incomplete and partially inaccurate data.

Challenges Representation of uncertainty. Fast reasoning based on uncertain knowledge. Elicitation of critical additional data. Contingency reasoning. Learning of reasonable default assumptions.

Projects RADAR / Space-Time (2003–2008) “Reflective Agent with Distributed Adaptive Reasoning” Scheduling and resource allocation under uncertainty. RAPID (2007–2011) “Representation and Analysis of Probabilistic Intelligence Data” Analysis of uncertain military- intelligence data and planning of future data collection.

Outline Representation of uncertainty Reasoning based on uncertain knowledge Elicitation of missing data Future research challenges Representation of uncertainty

Alternative representations Approximations: Mary’s weight is about 150. Mary’s cell phone is probably in her purse. Ranges or sets of possible values: Mary’s weight is between 140 and 160. Mary’s cell phone may be in her purse, office, home, or car. Probability distributions: Phone location: 95% purse, 2% home, 2% office, 1% car. Weight: probability density.

Approximations Simple and intuitive approach, which usually does not require changes to standard algorithms. BUT… We assume that small input changes do not cause large output changes. We may need to modify standard algorithms to ensure that they do not violate this assumption.

Approximations Example: Selecting an amount of medication. [Plot: amount of medication vs. patient weight] Since small input changes translate into small output changes, we can use an approximate weight value.

Approximations Example: Loading an elevator. [Plot: chance of overloading vs. load weight; figure labels: 155 LB, 140 LB] We can adapt this procedure to the use of approximate weights by subtracting a safety margin from the weight limit.

Approximations Example: Playing the “exact weight” game. If your weight is exactly 150 lb, you are a winner! [Plot: prize vs. player weight] If we use approximate weight values, we cannot determine the chances of winning.

Ranges or sets of possible values Explicit representation of a margin of error. Moderate changes to standard algorithms. BUT… We may lose the accuracy of computation, and we cannot evaluate the probabilities of different possible values.

Ranges or sets of possible values Example: Selecting an amount of medication. [Plot: amount of medication vs. patient weight] We obtain a range that includes the correct amount of medication. If the range width is within the acceptable margin of error, we can use it to select an appropriate amount.

Ranges or sets of possible values Example: Loading an elevator. [Plot: chance of overloading vs. load weight] We identify the danger of overloading, but we cannot determine its probability.

Ranges or sets of possible values Example: Playing the “exact weight” game. [Plot: prize vs. player weight] We still cannot determine the chances of winning.
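
The range-based examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RADAR / RAPID code: the `Range` class, the proportional dosing rule with `mg_per_lb`, and the 1500 lb elevator limit are all hypothetical. It shows that a range propagates through a monotone dose function by mapping its endpoints, and that for the elevator we can tell whether overloading is possible or certain, but not how probable it is.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Range:
    """A value known only to lie between lo and hi (inclusive)."""
    lo: float
    hi: float

    def __add__(self, other: "Range") -> "Range":
        # The sum of two ranges spans from the sum of the lower bounds
        # to the sum of the upper bounds.
        return Range(self.lo + other.lo, self.hi + other.hi)


def dose_range(weight: Range, mg_per_lb: float) -> Range:
    """Hypothetical dosing rule: dose proportional to weight.

    The rule is monotone for mg_per_lb >= 0, so mapping the two
    endpoints gives the full range of possible doses.
    """
    return Range(weight.lo * mg_per_lb, weight.hi * mg_per_lb)


def overload_status(loads: Iterable[Range], limit: float) -> Tuple[bool, bool]:
    """Return (overload is possible, overload is certain) for the total load."""
    total = Range(0.0, 0.0)
    for load in loads:
        total = total + load
    return total.hi > limit, total.lo > limit


mary = Range(140, 160)                      # weight range from the slides
dose = dose_range(mary, mg_per_lb=0.5)      # hypothetical rate
possible, certain = overload_status([mary] * 10, limit=1500)  # hypothetical limit
print(dose, possible, certain)
```

With ten such passengers the total load lies between 1400 and 1600 lb, so the sketch reports that overloading is possible but not certain, which is exactly the limitation the slide points out.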

Probability distributions Accurate analysis of possible values and their probabilities. BUT… Major changes to standard algorithms. Major increase of the running time.

Probability distributions Example: Playing the “exact weight” game. [Plot: prize vs. player weight] We can determine possible outcomes and evaluate their probabilities.

RADAR / RAPID approach to uncertainty representation [Diagram: ranges or sets of values; probability distributions; ranges or sets with probabilities] We approximate a probability density function by a set of uniform distributions, and represent it as a set of ranges with probabilities. Weight: 0.1 chance: [ ] 0.8 chance: [ ] 0.1 chance: [ ] [Plot: probability density vs. weight]

Uncertain data Nominal values An uncertain nominal value is a set of possible values and their probabilities. Phone location: 0.95 chance: purse 0.02 chance: home 0.02 chance: office 0.01 chance: car
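
An uncertain nominal value of this kind maps naturally onto a dictionary from possible values to probabilities. The sketch below uses the phone-location numbers from the slide; the helper names `validate`, `prob_in`, and `most_likely` are hypothetical and not taken from the RADAR / RAPID library.

```python
from typing import Dict, Hashable, Iterable

# Phone location from the slide: possible values and their probabilities.
phone_location: Dict[str, float] = {
    "purse": 0.95,
    "home": 0.02,
    "office": 0.02,
    "car": 0.01,
}


def validate(dist: Dict[Hashable, float], tol: float = 1e-9) -> None:
    """Check that the probabilities are non-negative and sum to 1."""
    if any(p < 0 for p in dist.values()):
        raise ValueError("negative probability")
    if abs(sum(dist.values()) - 1.0) > tol:
        raise ValueError("probabilities do not sum to 1")


def prob_in(dist: Dict[Hashable, float], values: Iterable[Hashable]) -> float:
    """Probability that the uncertain value is one of the given values."""
    wanted = set(values)
    return sum(p for v, p in dist.items() if v in wanted)


def most_likely(dist: Dict[Hashable, float]) -> Hashable:
    """Value with the highest probability."""
    return max(dist, key=dist.get)


validate(phone_location)
print(most_likely(phone_location))
print(prob_in(phone_location, ["home", "office"]))
```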

Uncertain data Nominal values; integers and reals. An uncertain numeric value is a probability-density function represented by a set of uniform distributions. Weight: 0.1 chance: [ ] 0.8 chance: [ ] 0.1 chance: [ ] [Plot: probability density vs. weight]
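
A minimal sketch of this numeric representation follows. The transcript does not preserve the actual range endpoints, so the ranges [140, 145], [145, 155], and [155, 160] below are purely hypothetical, as are the names `Piece` and `UncertainNumber`. The sketch shows how a set of uniform pieces supports the two queries that plain ranges cannot answer: the expected value and the probability of staying under a threshold.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Piece:
    """One uniform distribution: probability mass `prob` spread over [lo, hi]."""
    prob: float
    lo: float
    hi: float


class UncertainNumber:
    """Probability density approximated by a set of uniform pieces."""

    def __init__(self, pieces: Iterable[Piece]):
        self.pieces = list(pieces)
        if any(p.hi <= p.lo or p.prob < 0 for p in self.pieces):
            raise ValueError("each piece needs lo < hi and prob >= 0")
        if abs(sum(p.prob for p in self.pieces) - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to 1")

    def mean(self) -> float:
        # The mean of a uniform piece is its midpoint.
        return sum(p.prob * (p.lo + p.hi) / 2 for p in self.pieces)

    def prob_at_most(self, x: float) -> float:
        """P(value <= x): full mass of pieces below x, partial mass of a piece containing x."""
        total = 0.0
        for p in self.pieces:
            if x >= p.hi:
                total += p.prob
            elif x > p.lo:
                total += p.prob * (x - p.lo) / (p.hi - p.lo)
        return total


# Hypothetical endpoints; only the 0.1 / 0.8 / 0.1 chances come from the slide.
weight = UncertainNumber([
    Piece(0.1, 140, 145),
    Piece(0.8, 145, 155),
    Piece(0.1, 155, 160),
])
print(weight.mean(), weight.prob_at_most(155))
```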

Uncertain data Nominal values; integers and reals; strings. An uncertain string is a regular expression with probabilities.

Uncertain data Nominal values; integers and reals; strings; spatial regions. An uncertain region is a set of rectangular regions and their probabilities. [Figure: rectangles in the x-y plane]
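
One possible reading of this definition, sketched below, is that the true region is exactly one of several alternative rectangles, each with a given probability. That interpretation is an assumption, and the `Rect` class, the `prob_contains` helper, and the coordinates are hypothetical. Under this reading, the probability that a point lies in the region is the total probability of the rectangles that contain it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle with corners (x1, y1) and (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, x: float, y: float) -> bool:
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2


# An uncertain region: alternative rectangles and their probabilities.
UncertainRegion = List[Tuple[float, Rect]]


def prob_contains(region: UncertainRegion, x: float, y: float) -> float:
    """Probability that the point (x, y) lies inside the uncertain region."""
    return sum(p for p, rect in region if rect.contains(x, y))


# Hypothetical data: two overlapping alternatives.
region: UncertainRegion = [
    (0.7, Rect(0, 0, 4, 4)),
    (0.3, Rect(2, 2, 6, 6)),
]
print(prob_contains(region, 1, 1), prob_contains(region, 3, 3))
```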

Uncertain data Nominal values; integers and reals; strings; spatial regions; functions. An uncertain function is a piecewise-linear function with uncertain y-coordinates, or a set of possible functions and their probabilities. [Plots: amount of medication vs. patient weight; alternative functions labeled 0.8 chance and 0.2 chance]

Outline Representation of uncertainty Reasoning based on uncertain knowledge Elicitation of missing data Future research challenges

Uncertainty arithmetic We have developed a library of basic operations on uncertain data, which input and output uncertain values. Arithmetic operations. Logical operations (≤, ≠, ¬). Function application. Analysis of distributions (μ, σ).

Uncertainty arithmetic Allows extension of standard algorithms to reasoning with uncertain values. Supports control of the trade-off between speed and accuracy. BUT… Approximate and relatively slow. Assumes that all probability distributions are independent.
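
To make the independence assumption and the approximation concrete, here is a rough sketch of adding two uncertain numbers, each given as a list of (probability, low, high) ranges. It is not the library described in the talk: the `add` and `mean` functions and all the numbers are hypothetical. Each pair of ranges is combined into one range whose probability is the product of the two probabilities, which is valid only if the operands are independent. Treating the resulting range as uniform is an approximation, since the sum of two uniform values is not uniform.

```python
from itertools import product
from typing import List, Tuple

# One piece is (probability, low, high).
Piece = Tuple[float, float, float]


def add(a: List[Piece], b: List[Piece]) -> List[Piece]:
    """Approximate sum of two independent uncertain numbers."""
    return [
        (pa * pb, la + lb, ha + hb)          # independence: multiply probabilities
        for (pa, la, ha), (pb, lb, hb) in product(a, b)
    ]


def mean(x: List[Piece]) -> float:
    """Expected value, using the midpoint of each range."""
    return sum(p * (lo + hi) / 2 for p, lo, hi in x)


# Hypothetical operands: a passenger weight and a luggage weight.
person = [(0.1, 140, 145), (0.8, 145, 155), (0.1, 155, 160)]
luggage = [(0.5, 10, 20), (0.5, 20, 30)]

total = add(person, luggage)
print(len(total), mean(total))
```

The result has as many pieces as there are pairs of input pieces, so repeated operations grow the representation quickly. Merging neighboring pieces would trade accuracy for speed.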

RADAR application Scheduling and resource allocation based on uncertain knowledge of scheduling constraints, preferences, and available resources. Uncertain room and event properties. Uncertain resource availability and prices. Uncertain utility functions. We use an optimization algorithm that searches for a schedule with the greatest expected quality.
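
The idea of searching for the schedule with the greatest expected quality can be illustrated with a toy brute-force search. This is not RADAR's optimizer, which the slides do not describe. The rooms, capacities, event sizes, and the quality measure (the average probability that each room is large enough) are hypothetical.

```python
from itertools import permutations
from typing import Dict, Tuple


def prob_fits(capacity: Dict[int, float], attendees: int) -> float:
    """Probability that a room with uncertain capacity holds the event."""
    return sum(p for cap, p in capacity.items() if cap >= attendees)


def best_schedule(
    event_size: Dict[str, int],
    room_capacity: Dict[str, Dict[int, float]],
) -> Tuple[Dict[str, str], float]:
    """Try every assignment of distinct rooms to events; keep the best expected quality."""
    events = list(event_size)
    if not events:
        raise ValueError("need at least one event")
    best_plan: Dict[str, str] = {}
    best_quality = -1.0
    for rooms in permutations(room_capacity, len(events)):
        quality = sum(
            prob_fits(room_capacity[r], event_size[e]) for e, r in zip(events, rooms)
        ) / len(events)
        if quality > best_quality:
            best_plan, best_quality = dict(zip(events, rooms)), quality
    return best_plan, best_quality


# Hypothetical data: room A's capacity is uncertain, room B's is known.
room_capacity = {"A": {50: 0.5, 200: 0.5}, "B": {100: 1.0}}
event_size = {"talk": 100, "posters": 40}

plan, quality = best_schedule(event_size, room_capacity)
print(plan, quality)
```

Exhaustive enumeration is feasible only for tiny problems. Conference-scale instances would need a heuristic search.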

RADAR results Scheduling of conference events. [Charts: schedule quality of manual and auto scheduling vs. problem size (numbers of rooms and events, e.g. 13 rooms and 84 events), without and with uncertainty; search time in seconds]

RAPID application Analysis of military intelligence, which usually includes uncertain and partially inaccurate data. Relational database with uncertain data. Retrieval of approximate and probabilistic matches for given queries. Automated inferences, verification of given hypotheses, and search for novel patterns.

Outline Representation of uncertainty Reasoning based on uncertain knowledge Elicitation of missing data Future research challenges

Elicitation challenge Identification of critical missing data. Analysis of the trade-off between the cost of data acquisition and the expected performance improvements. Planning of effective data collection.

RADAR / RAPID approach to elicitation of additional data For each candidate question, estimate the probabilities of possible answers. For each possible answer, compute its cost, as well as its impact on the utility of reasoning or optimization. For each question, compute its expected impact on the overall utility, and select the questions with the best expected impact.
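
The question-selection procedure can be sketched as an expected-value computation. The code below is an illustration only: the `Question` class, the single per-question cost, and all probabilities and utilities are hypothetical, although the question texts follow the conference-scheduling example in the talk. For each question, the expected impact is the probability-weighted utility gain over its possible answers, minus the cost of asking.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Question:
    text: str
    cost: float
    # answer -> (probability of the answer, utility after using the answer)
    answers: Dict[str, Tuple[float, float]]


def expected_impact(q: Question, current_utility: float) -> float:
    """Probability-weighted utility gain over all answers, minus the asking cost."""
    gain = sum(p * (u - current_utility) for p, u in q.answers.values())
    return gain - q.cost


def select_questions(qs: List[Question], current_utility: float, k: int) -> List[Question]:
    """Keep up to k questions with positive expected impact, best first."""
    ranked = sorted(qs, key=lambda q: expected_impact(q, current_utility), reverse=True)
    return [q for q in ranked if expected_impact(q, current_utility) > 0][:k]


current = 0.68  # hypothetical utility of the current schedule
questions = [
    # No answer can change the schedule, so asking only incurs the cost.
    Question("Invited talk: needs a projector?", 0.01,
             {"yes": (0.9, 0.68), "no": (0.1, 0.68)}),
    Question("Poster session: needs a larger room?", 0.01,
             {"yes": (0.3, 0.68), "no": (0.7, 0.68)}),
    # One answer would allow a better schedule.
    Question("Poster session: needs a projector?", 0.01,
             {"yes": (0.5, 0.78), "no": (0.5, 0.68)}),
]

chosen = select_questions(questions, current, k=2)
print([q.text for q in chosen])
```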

RADAR / RAPID approach to elicitation of additional data [Diagram components: Top-Level Control, Model Construction, Model Evaluation, Question Selection, Data Collection, Reasoning or Optimization; arrow labels: current model, model utility and limitations, questions, answers]

RADAR application The system identifies critical missing knowledge, sends related questions to the user, and improves the world model based on the user’s answers. Elicitation of additional data about scheduling constraints, preferences, and available resources.

RADAR application Elicitation of additional data about scheduling constraints, preferences, and available resources. [Diagram components: User; Graphical user interface; Parser; Optimizer; Info elicitor; Top-level control and learning. Labels: Process new info; Update resource allocation; Choose and send questions]

RADAR example: Initial schedule Requests: Invited talk, 9–10am: Needs a large room. Poster session, 9–11am: Needs a room. Missing info: Invited talk: Projector need. Poster session: Room size; Projector need. Assumptions: Invited talk: Needs a projector. Poster session: Small room is OK; Needs no projector. Available rooms: Room num.; Area (feet²): ,000 1,000 1,000; Projector: Yes No Yes. Initial schedule: [Figure: Talk, Posters]

RADAR example: Choice of questions Requests: Invited talk, 9–10am: Needs a large room. Poster session, 9–11am: Needs a room. Initial schedule: [Figure: Talk, Posters] Candidate questions: Invited talk: Needs a projector? (× Useless info: There are no large rooms w/o a projector.) Poster session: Needs a larger room? (× Useless info: There are no unoccupied larger rooms.) Poster session: Needs a projector? (√ Potentially useful info.)

RADAR example: Improved schedule Requests: Invited talk, 9–10am: Needs a large room Poster session, 9–11am: Needs a room Initial schedule: Talk Posters Info elicitation: System: Does the poster session need a projector? User: A projector may be useful, but not really necessary New schedule: Talk Posters

RADAR results Repairing a conference schedule after a “crisis” loss of rooms. Schedule quality, manual and auto repair: After Crisis 0.50; Manual Repair 0.61; Auto w/o Elicitation 0.68; Auto with Elicitation 0.72. [Plot: dependency of the schedule quality on the number of questions]

RAPID application Identification of critical uncertainties, based on given tasks and priorities Planning of intelligence collection, based on the analysis of cost/benefit trade-offs and related risks Proactive collection of military intelligence.

RAPID application Proactive collection of military intelligence. [Diagram labels: Knowledge entry and editing; Goals, queries, and hypotheses; Uncertain facts; Uncertain inference rules; Learned inference rules; Inferred facts; Query matches; Evaluation of hypotheses; Critical uncertainties; Prioritized plans for proactive data collection]

Outline Representation of uncertainty Reasoning based on uncertain knowledge Elicitation of missing data Future research challenges

Future work Learning of defaults and “common-sense” rules. Contingency reasoning. Theory of proactive learning.

Default assumptions Learning to make reasonable common-sense assumptions in the absence of specific data. Example assumptions: Almost all people weigh less than 500 lb. Tall people usually weigh more than short people. For people under eighteen years old, the expected weight increases with age.

Default assumptions Learning to make reasonable common-sense assumptions in the absence of specific data. Representation of general uncertain assumptions, context-based assumptions, and uncertain dependencies. Passive and active learning of these assumptions and dependencies. Unsupervised learning of relevant contexts.

Contingency reasoning Analysis of possible future developments and preparation for likely developments. Identification of critical uncertainties and their discretization into specific scenarios. Compact representation of scenario spaces. Construction of related contingency plans.

Proactive learning General theory for the development and analysis of related learning techniques. Integration of learning with follow-up reasoning: Integration of learning algorithms with reasoning engines that use the learned knowledge. [Diagram components: Top-Level Control, Model Construction, Model Evaluation, Question Selection, Data Collection, Reasoning or Optimization; arrow labels: current model, model utility and limitations, questions, answers]

Proactive learning General theory for the development and analysis of related learning techniques. Integration of learning with follow-up reasoning. Automated selection of learning examples: Active selection of examples based on the trade-off among their cost, expected accuracy, and impact on the learned-knowledge utility.

Proactive learning General theory for the development and analysis of related learning techniques. Integration of learning with follow-up reasoning. Automated selection of learning examples. Automated selection of high-level strategies: Intelligent choice and guidance of learning strategies, in order to reduce the cost and time of learning.

Proactive learning General theory for the development and analysis of related learning techniques. Integration of learning with follow-up reasoning. Automated selection of learning examples. Automated selection of high-level strategies. Proactive analysis of future needs: Automated evaluation of future needs for the learned knowledge, and adaptation of the learning process to both expected and sudden changes in these needs.