Download presentation
Presentation is loading. Please wait.
Published byCharity Shana Gordon Modified over 9 years ago
1
Lifted First-Order Probabilistic Inference Rodrigo de Salvo Braz, Eyal Amir and Dan Roth This research is supported by ARDA’s AQUAINT Program, by NSF grant ITR-IIS-0085980 and DAF grant FA8750-04-2-0222 (DARPA REAL program). AAAI’06, IJCAI’05 Goal A general system to reason with symbolic data in a probabilistic fashion Being able to describe objects of arbitrary structure: lists, trees, objects with various fields and refering to other objects; Being able to describe probabilistic facts. Previous approaches - Logic Powerful data representation, powerful knowledge declaration, but lack of probabilities; Prevents us to say things as simple as: “most birds fly”, “most capitalized words are proper nouns” or “rectangular objects are common”. Lifted First-Order Probabilistic Inference Allows for declaration of probabilistic knowledge (KB) that applies to many objects: Prob(properNoun(Word) | noun(Word), capitalized(Word), similarTo(Word, Word2), knownName(Word2)) = 0.7 holds for all words involved. Does not build BN in advance of inference. Instead, performs inference using KB items as necessary. Does not consider individual objects separately unless this is really necessary: similar to theorem-proving, but with probabilities. Example: given capitalized(Beautiful) and adjective(Beautiful), the algorithm does not have to consider the whole list of known names in order to decide that Beautiful is not a proper noun, because it is not a noun in the first place. An algorithm constructing a model before doing inference would build a node per known name! Example A group of people, each with data such as name, age, profession, as well as relationships between them: Prob(friends(X,Z) | friends(X,Y), friends(Y,Z)) = 0.6 Prob(age(Person) > 20 | degree(Person)) = 0.9. Key ideas Each atom in a rule is a parameterized random variable. noun(Word) actually stands for all its instance random variables noun(w 1 ), noun(w 2 ),..., word(w n ). We use a First-Order Variable Elimination algorithm, where often all instances of a parameterized random variable can be eliminated at once, regardless of the domain size, since they all have the same structure. This is an idea introduced by David Poole in 2003 that we formalized and call Inversion Elimination (IE). However, IE only works when these instances are independent of each other. For more general cases, we introduced Counting Elimination, which however does depend on the domain size. Contributions A First-Order Probabilistic Inference algorithm which: provides a language for both structured data and probabilistic information; takes advantage of first-order level information by eliminating many objects in a single step, as opposed to previous work. Comparing Lifted and Propositional Inferences We compare lifted methods (inversion and counting elimination) to their propositional counterparts (which produce identical numerical results), for P(sick(Person) | epidemics), P(death | sick(Person)) in (I) and P(p(X),p(Y),r) in (II). Example Web pages, with their words, subjects and links, and how they relate to each other. Prob(subject(Page,coffe) | wordIn(Page,java)) = 0.2 Prob(subject(Page,programming) | wordIn(Page,java)) = 0.8 Prob(subject(Page,Subj) | link(Page,Page2), subject(Page2, Subj)) = 0.7. Example Smoking facts in a large population: Prob(friends(X,Y)) = 0.1 Prob(smoker(X)) = 0.3 Prob(smoker(Y) | friends(X,Y), smoker(X)) = 0.6 Prob(cancer(X) | smoker(X)) = 0.6, X john Prob(cancer(john)) = 0.01 A possible query would be: in a population of 40,000, what is the probability of a male having cancer? Previous approaches - Probabilistic models Powerful probabilistic capabilities in models such as Bayesian networks and Markov models, but lacks structured data such as lists, trees, etc. Cannot automatically apply knowledge to more than one object at the same time (for example, probabilistic properties of nouns to all nouns in a piece of text). Previous approach - Knowledge-driven Construction Constructs a propositional probabilistic model based on logic-like probabilistic knowledge; Builds model before starting inference, often building unnecessary parts. Where we are We have: predicates (relational structure); no individuals split unnecessarily. But we do not have yet: function symbols, so no data structures such as lists and trees; does not take full advantage of inference during construction. This will require approximation.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.