PRISM: A Probabilistic Language for Modeling and Learning
Neng-Fa Zhou
Joint work with Taisuke Sato (Tokyo Institute of Technology)

What is PRISM?
- PRISM = Probabilistic Prolog
- Three execution modes:
  - Sample execution
  - Probability calculation
  - Learning

direction(D) :- msw(coin,Face), (Face==head -> D=left ; D=right).
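Read generatively, the clause makes one probabilistic choice (msw(coin,Face)) and derives a deterministic consequence. Here is a commented version of the same program; the values/2 declaration for the coin switch is taken from the "PRISM: the Language" slide below:

values(coin,[head:0.5,tail:0.5]).     % switch 'coin' with outcomes head/tail, probability 0.5 each

direction(D) :-
    msw(coin,Face),                   % probabilistic choice: Face is an outcome of switch 'coin'
    (Face==head -> D=left ; D=right). % deterministic consequence of the choice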

Features
- Uses logic programs to describe probabilistic choices and their consequences
- Probability distributions (the parameters of switches) can be learned automatically from samples
- Tabling is used in probabilistic computation and learning (it resembles dynamic programming)
- A high-level yet efficient modeling language (subsumes HMMs, PCFGs, and discrete Bayesian networks)
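To make the last point concrete, here is a minimal sketch, not from the slides, of how a two-node discrete Bayesian network, P(Rain) and P(Wet | Rain), can be encoded with switches; the switch names rain and wet(R) and all probabilities are invented for illustration:

values(rain,[yes:0.2,no:0.8]).      % P(Rain)
values(wet(yes),[yes:0.9,no:0.1]).  % P(Wet | Rain=yes)
values(wet(no),[yes:0.1,no:0.9]).   % P(Wet | Rain=no)

world(R,W) :-
    msw(rain,R),    % choose the parent value
    msw(wet(R),W).  % the parameterized switch wet(R) conditions Wet on Rain

(The HMM program on a later slide uses the same parameterized-switch idiom with out(S) and tr(S).)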

Applications
- Probabilistic modeling and learning for problem domains that involve randomness or uncertainty:
  - Stochastic language processing
  - Gene sequence analysis
  - Game analysis
  - Optimization (e.g., performance tuning)

PRISM: the Language
- Probability distributions and switches
- Sample execution: sample(Goal)
- Probability calculation: prob(Goal,P)
- Learning: learn(Facts)

values(coin,[head:0.5,tail:0.5]).
direction(D) :- msw(coin,Face), (Face==head -> D=left ; D=right).
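A sketch of the sample and probability-calculation modes on this program (learning is illustrated on the Learning slide; the shown answers assume the 0.5/0.5 parameters above, and sampled output naturally varies):

?- sample(direction(D)).      % run the program forward, tossing the coin
D = left                      % or D = right, with equal probability

?- prob(direction(left),P).   % sum the probabilities of all explanations of the goal
P = 0.5                       % direction(left) has the single explanation msw(coin,head)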

Assumptions
- Distribution assumption: for each sample space declaration values(I,[o1:p1,…,on:pn]), the parameters sum to one (Σi pi = 1), and msw(I,V) always succeeds when V is a variable.
- Independence assumption: prob(A ∧ B) = prob(A) × prob(B)
- Exclusiveness assumption: prob(A ∨ B) = prob(A) + prob(B)
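A worked instance on the coin program: direction(left) has the single explanation msw(coin,head), so prob(direction(left)) = 0.5; direction(left) and direction(right) are exclusive, so prob(direction(left) ∨ direction(right)) = 0.5 + 0.5 = 1.0.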

Learning
Given a set of observed facts F, determine the probability distributions (the parameters) of the switches that maximize the likelihood of F.
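A minimal sketch of such a call for the coin program, assuming fully observed facts (show_sw, assumed here to be PRISM's utility for printing switch parameters, does not appear on the slides):

% observed data: direction(left) twice, direction(right) once
?- learn([direction(left),direction(left),direction(right)]).

?- show_sw.   % inspect the learned parameters of switch 'coin'
% expected: head ≈ 0.667, tail ≈ 0.333, since direction(left) is explained
% only by msw(coin,head) and direction(right) only by msw(coin,tail)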

Using Tabling (Dynamic Programming) in Learning

values(init,[s0,s1]).
values(out(_),[a,b]).
values(tr(_),[s0,s1]).

hmm(L) :- msw(init,Si), hmm(Si,L).
hmm(_S,[]).
hmm(S,[C|L]) :- msw(out(S),C), msw(tr(S),NextS), hmm(NextS,L).

(The slide depicts the subgoal table built for the query hmm([a,b,a]): hmm(s0,[a,b,a]), hmm(s1,[a,b,a]), hmm(s0,[b,a]), hmm(s1,[b,a]), hmm(s0,[a]), hmm(s1,[a]), hmm(s0,[]), and hmm(s1,[]). Each tabled subgoal is computed once and its result is reused by all callers, as in dynamic programming.)
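Thanks to tabling, each distinct subgoal hmm(S,Suffix) is evaluated once and shared, so probability calculation over this HMM behaves like the forward algorithm: the cost grows linearly with the sequence length rather than exponentially with the number of state sequences. A sketch of a query, with the answer computed under the uniform 0.5 parameters declared above:

?- prob(hmm([a,b,a]),P).   % sums over all 16 init/transition choices via the tabled subgoals
P = 0.125                  % with uniform emissions, any length-3 string has probability (1/2)^3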

Papers
1. T. Sato: A Statistical Learning Method for Logic Programs with Distribution Semantics, Proc. ICLP, 1995.
2. T. Sato and Y. Kameya: Parameter Learning of Logic Programs for Symbolic-Statistical Modeling, Journal of Artificial Intelligence Research, 2001.
3. N.-F. Zhou, T. Sato, and K. Hasida: Toward a High-Performance System for Symbolic and Statistical Modeling, Proc. IJCAI Workshop on Learning Statistical Models from Relational Data, 2003.