The Phylogenetic Indian Buffet Process: A Non-Exchangeable Nonparametric Prior for Latent Features By: Kurt T. Miller, Thomas L. Griffiths and Michael I. Jordan

Presentation transcript:

The Phylogenetic Indian Buffet Process: A Non-Exchangeable Nonparametric Prior for Latent Features
By: Kurt T. Miller, Thomas L. Griffiths and Michael I. Jordan
ICML 2008
Presented by: John Paisley, Duke University, ECE

Motivation
Nonparametric models are often used with the assumption of exchangeability.
– The Indian Buffet Process (IBP) is an example.
Sometimes, non-exchangeable models might be more appropriate.
– The Phylogenetic Indian Buffet Process (pIBP) is similar to the IBP, but uses additional information about how related the diners are to each other.
– These relationships are captured in a tree structure.

Indian Buffet Process

Phylogenetic Indian Buffet Process
Uses a tree to model the columns z_k. This is done as follows:
– Assign the root node to be zero.
– Along an edge of length t, let this change to a 1 with probability 1 − exp(−γ_k t), where γ_k = −log(1 − p_k) and p_k is the probability of feature k. The distance from every leaf to the root is 1, so each leaf is 1 with marginal probability p_k.
– If a 0 is changed to a 1 along a path to a node, all subsequent nodes are 1 and therefore so are the leaves.
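The generative steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: it assumes the edge-flip probability 1 − exp(−γ_k t) with γ_k = −log(1 − p_k), and the tree, its edge lengths, and the names `flip_probability` and `sample_column` are hypothetical.

```python
import math
import random


def flip_probability(t, p_k):
    """Probability that a 0 turns into a 1 along an edge of length t.

    Assumed form: 1 - exp(-gamma_k * t) with gamma_k = -log(1 - p_k),
    so a path of total length 1 gives probability p_k.
    """
    gamma_k = -math.log(1.0 - p_k)
    return 1.0 - math.exp(-gamma_k * t)


def sample_column(children, edge_length, root, p_k, rng):
    """Sample one feature column z_k on a rooted tree.

    children:    dict mapping a node to the list of its child nodes
    edge_length: dict mapping a child node to the length of the edge
                 from its parent
    Returns a dict mapping every node to 0 or 1.
    """
    z = {root: 0}  # the root is always 0
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if z[node] == 1:
                z[child] = 1  # once a 1 appears, it is inherited below
            else:
                q = flip_probability(edge_length[child], p_k)
                z[child] = int(rng.random() < q)
            stack.append(child)
    return z


# Hypothetical tree: every leaf (x1, x2, x3) is at distance 1 from the root.
children = {"root": ["a", "b"], "a": ["x1", "x2"], "b": ["x3"]}
edge_length = {"a": 0.6, "b": 0.3, "x1": 0.4, "x2": 0.4, "x3": 0.7}

z = sample_column(children, edge_length, "root", p_k=0.5, rng=random.Random(0))
leaves = {name: z[name] for name in ("x1", "x2", "x3")}
```

Leaves that share a long path from the root (here `x1` and `x2`) tend to share the feature, which is how the tree makes the prior non-exchangeable.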

Sampling Issues For (1), use the sum-product algorithm (Pearl, 1988). For (2), use the chain rule of probability. An MCMC inference algorithm is given in detail.
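The quantities labeled (1) and (2) are not preserved in this transcript, so the following is only a generic illustration of sum-product message passing on a tree, not the inference algorithm from the paper. It computes the probability of one column's leaf values by passing messages from the leaves up to the root, under the assumed edge-flip probability 1 − exp(−γ_k t) with γ_k = −log(1 − p_k). The function name `leaf_probability` and the example tree are hypothetical.

```python
import math


def leaf_probability(children, edge_length, root, leaf_values, p_k):
    """P(leaf values of one column) via an upward sum-product pass.

    children:    dict mapping a node to the list of its child nodes
    edge_length: dict mapping a child node to the length of the edge
                 from its parent
    leaf_values: dict mapping each leaf to its observed 0/1 value
    """
    gamma_k = -math.log(1.0 - p_k)

    def message(node):
        # Returns (P(leaves below | node = 0), P(leaves below | node = 1)).
        kids = children.get(node, [])
        if not kids:
            v = leaf_values[node]
            return (1.0 - v, float(v))
        m0, m1 = 1.0, 1.0
        for child in kids:
            c0, c1 = message(child)
            q = 1.0 - math.exp(-gamma_k * edge_length[child])
            # From state 0 the child stays 0 w.p. (1 - q) or flips w.p. q.
            m0 *= (1.0 - q) * c0 + q * c1
            # From state 1 the child is 1 with certainty.
            m1 *= c1
        return (m0, m1)

    # The root is fixed at 0, so only the state-0 message is needed.
    return message(root)[0]


# Hypothetical tree: every leaf is at distance 1 from the root.
children = {"root": ["a", "b"], "a": ["x1", "x2"], "b": ["x3"]}
edge_length = {"a": 0.6, "b": 0.3, "x1": 0.4, "x2": 0.4, "x3": 0.7}

prob = leaf_probability(children, edge_length, "root",
                        {"x1": 1, "x2": 1, "x3": 0}, p_k=0.5)
```

The cost is linear in the number of nodes, whereas summing over all internal-node assignments directly would be exponential.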

Experimental Results
Elimination by Aspects (EBA) model
– A choice model.
Let the objects be indexed by i, and let z_ik indicate whether the i-th object has the k-th feature. Let each feature have a weight w_k.
The EBA model defines the probability of choosing object i over object j as
p_ij = Σ_k w_k z_ik (1 − z_jk) / [ Σ_k w_k z_ik (1 − z_jk) + Σ_k w_k (1 − z_ik) z_jk ].
The likelihood of an observation matrix X is built from these choice probabilities.
This has previously been modeled using the IBP.
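As a worked illustration of the EBA choice rule, the sketch below computes the probability of choosing object i over object j from binary feature vectors and weights: only features that one object has and the other lacks contribute. Several parts are assumptions rather than content of the slides: that entry X[i][j] counts how often i was chosen over j, that choices are independent (giving a log-likelihood up to an additive constant), and that a pair with no distinguishing features is a 50/50 tie. The function names are hypothetical.

```python
import math


def eba_choice_probability(z_i, z_j, w):
    """Probability of choosing object i over object j under EBA."""
    only_i = sum(wk for zi, zj, wk in zip(z_i, z_j, w) if zi and not zj)
    only_j = sum(wk for zi, zj, wk in zip(z_i, z_j, w) if zj and not zi)
    if only_i + only_j == 0:
        return 0.5  # assumption: no distinguishing feature means a tie
    return only_i / (only_i + only_j)


def eba_log_likelihood(X, Z, w):
    """Log-likelihood of choice counts X, up to an additive constant.

    Assumes X[i][j] is the number of times i was chosen over j and
    that individual choices are independent.
    """
    total = 0.0
    n = len(Z)
    for i in range(n):
        for j in range(n):
            if i == j or X[i][j] == 0:
                continue
            p = eba_choice_probability(Z[i], Z[j], w)
            if p == 0.0:
                return float("-inf")  # an observed choice had probability 0
            total += X[i][j] * math.log(p)
    return total


# Two objects, three features with weights 1, 2 and 6.
Z = [[1, 1, 0],
     [1, 0, 1]]
w = [1.0, 2.0, 6.0]
p_01 = eba_choice_probability(Z[0], Z[1], w)  # 2 / (2 + 6) = 0.25

X = [[0, 1],
     [3, 0]]
ll = eba_log_likelihood(X, Z, w)
```

The shared first feature cancels out of the comparison, so only the weights 2 and 6 matter for this pair.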

Experimental Results
Consider now an underlying tree structure for this model.
Preference trees: for 9 personalities (3 movie stars, 3 athletes and 3 politicians), people made the 36 pairwise choices of whom they would rather spend time with.
Here, L is the length of the edge from each general category to a leaf.
A soft version of this tree is modeled with the pIBP, using data generated from this model with L = 0.1.

Experimental Results
Example results: As the number of samples decreases, the pIBP is able to infer the structure better than the IBP, because its prior already encodes the tree structure.

Experimental Results As can be seen, the additional structure in the model produces better results.