Ideal Parent Structure Learning. School of Engineering & Computer Science, The Hebrew University, Jerusalem, Israel. Gal Elidan with Iftach Nachman and Nir Friedman.

Presentation transcript:

Ideal Parent Structure Learning
Gal Elidan with Iftach Nachman and Nir Friedman
School of Engineering & Computer Science, The Hebrew University, Jerusalem, Israel

Learning Structure
Input: Data (instances x variables). Output: A structure over the variables.
[Figure: candidate structures over the variables S, C, E, D]
Init: Start with an initial structure.
1. Consider local changes.
2. Score each candidate.
3. Apply the best modification.
Problems:
- Need to score many candidates.
- Each one requires costly parameter optimization.
- Structure learning is often impractical.
The Ideal Parent Approach:
- Approximate the improvements of changes (fast).
- Optimize & score promising candidates (slow).
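The search loop on this slide can be sketched in a few lines of Python. This is a minimal sketch that only considers "add parent" moves; `family_score(child, parents)` is a hypothetical callback standing in for whatever decomposable score is used, and the names `creates_cycle` and `greedy_add_search` are illustrative, not taken from the talk.

```python
from itertools import permutations

def creates_cycle(parents, child, new_parent):
    """True if adding the edge new_parent -> child would close a directed cycle."""
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:          # child is already an ancestor of new_parent
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def greedy_add_search(variables, family_score, max_steps=100):
    """Greedy hill climbing over 'add parent' moves only (an assumption)."""
    parents = {v: frozenset() for v in variables}   # Init: empty structure
    for _ in range(max_steps):
        best_gain, best_move = 0.0, None
        # 1. Consider local changes, 2. score each candidate
        for child, cand in permutations(variables, 2):
            if cand in parents[child] or creates_cycle(parents, child, cand):
                continue
            gain = (family_score(child, parents[child] | {cand})
                    - family_score(child, parents[child]))
            if gain > best_gain:
                best_gain, best_move = gain, (child, cand)
        if best_move is None:      # no move improves the score
            break
        # 3. Apply the best modification
        child, cand = best_move
        parents[child] = parents[child] | {cand}
    return parents
```

Every pass calls `family_score` for all ordered pairs of variables, which is the cost the slide flags as the bottleneck when each call hides a parameter optimization.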

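Linear Gaussian Networks
[Figure: example network over the variables A, B, C, D, E, with the conditional density P(E|C) shown for the edge from C to E]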
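As an illustration of a linear Gaussian CPD such as P(E|C), the sketch below fits a single-parent model by maximum likelihood. It assumes the form child ~ N(theta0 + theta1 * parent, sigma2); the function names and the intercept term are choices made for this example, not details from the slide.

```python
import math

def fit_linear_gaussian(parent, child):
    """ML fit of child ~ N(theta0 + theta1 * parent, sigma2) for one parent."""
    m = len(child)
    mean_p = sum(parent) / m
    mean_c = sum(child) / m
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(parent, child))
    var = sum((p - mean_p) ** 2 for p in parent)
    theta1 = cov / var                      # slope
    theta0 = mean_c - theta1 * mean_p       # intercept
    rss = sum((c - theta0 - theta1 * p) ** 2 for p, c in zip(parent, child))
    return theta0, theta1, rss / m          # ML variance divides by m

def log_likelihood(parent, child, theta0, theta1, sigma2):
    """Gaussian log-likelihood of the child values given the parent values."""
    m = len(child)
    rss = sum((c - theta0 - theta1 * p) ** 2 for p, c in zip(parent, child))
    return -0.5 * m * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)
```

At the fitted parameters the log-likelihood reduces to -(m/2)(log(2*pi*sigma2) + 1), so the score of a family depends on the data only through the residual variance.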

The Ideal Parent Idea
Goal: Score only promising candidates.
[Figure: across instances, the child profile of X next to Pred(X|U), the prediction from the profile of its current parent U]

The Ideal Parent Idea
Goal: Score only promising candidates.
Step 1: Compute the optimal hypothetical parent Y (the ideal profile), which gives Pred(X|U,Y).
Step 2: Search the potential parents Z1, Z2, Z3, Z4 for a parent with a similar profile.
[Figure: the ideal profile of Y next to the parent profile of U and the child profile of X, across instances]

The Ideal Parent Idea
Goal: Score only promising candidates.
Step 1: Compute the optimal hypothetical parent Y (the ideal profile), which gives Pred(X|U,Y).
Step 2: Search the potential parents Z1, Z2, Z3, Z4 for a parent with a similar profile.
Step 3: Add the new parent (Z2 in the example) and optimize the parameters, which gives Predicted(X|U,Z).
[Figure: the ideal profile of Y, the profiles of the parents U and Z2, and the child profile of X, across instances]
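The three steps can be sketched for the linear Gaussian case. This is a sketch under stated assumptions: the ideal profile is taken to be the residual between the child profile and the current prediction, similarity is measured with the squared cosine as an illustrative stand-in, and Step 3 fits only the coefficient of the new parent instead of re-optimizing all parameters. All function names are hypothetical.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ideal_profile(child, predicted):
    """Step 1: the profile that, added to the prediction, reproduces the child."""
    return [x - p for x, p in zip(child, predicted)]

def most_similar(ideal, candidates):
    """Step 2: name of the candidate whose profile is best aligned with the ideal.

    Assumes neither the ideal profile nor any candidate profile is all zeros.
    """
    def cos2(z):
        return dot(ideal, z) ** 2 / (dot(ideal, ideal) * dot(z, z))
    return max(candidates, key=lambda name: cos2(candidates[name]))

def add_parent(child, predicted, candidates):
    """Step 3 (simplified): add the chosen parent and fit its coefficient only."""
    y = ideal_profile(child, predicted)
    name = most_similar(y, candidates)
    z = candidates[name]
    theta = dot(y, z) / dot(z, z)           # least-squares coefficient for z
    new_pred = [p + theta * v for p, v in zip(predicted, z)]
    return name, theta, new_pred
```

Only Step 3 touches the parameters; Steps 1 and 2 need a single pass over the instances per candidate, which is what makes the screening cheap.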

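Choosing the best parent Z
Our goal: Choose the Z that maximizes the improvement in the likelihood of X when Z is added to its parents U.
[Figure: the family U -> X before and after adding Z, and the angle between the ideal profile y and the candidate profile z]
Theorem: the likelihood improvement when only z is optimized [equation not preserved in the transcript].
We define: [equation not preserved in the transcript]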
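The equations on this slide did not survive in the transcript. For the linear Gaussian case, two measures of this kind can be derived directly; the derivation below is a reconstruction under assumptions, not the slide's own text. It assumes that y is the residual (ideal) profile, z the candidate profile, sigma^2 the current variance, M the number of instances, and phi the angle between y and z. C1 holds the variance fixed and optimizes only the coefficient of z; C2 also re-estimates the variance by maximum likelihood before and after the change.

```latex
\begin{aligned}
\theta^{*} &= \arg\min_{\theta} \lVert \vec{y} - \theta \vec{z} \rVert^{2}
            = \frac{\vec{y}\cdot\vec{z}}{\vec{z}\cdot\vec{z}}, \\
C_1(\vec{y},\vec{z}) &= \frac{1}{2\sigma^{2}}
   \left( \lVert \vec{y} \rVert^{2} - \lVert \vec{y} - \theta^{*}\vec{z} \rVert^{2} \right)
   = \frac{1}{2\sigma^{2}} \, \frac{(\vec{y}\cdot\vec{z})^{2}}{\vec{z}\cdot\vec{z}}, \\
C_2(\vec{y},\vec{z}) &= \frac{M}{2} \log
   \frac{\lVert \vec{y} \rVert^{2}}{\lVert \vec{y} - \theta^{*}\vec{z} \rVert^{2}}
   = -\frac{M}{2} \log \sin^{2} \phi_{\vec{y},\vec{z}} .
\end{aligned}
```

Both quantities depend on the data only through the inner products of y and z, so each candidate costs one pass over the instances.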

Similarity vs. Score
[Figure: two scatter plots of score against similarity, one for C1 and one for C2, with the annotation "effect of fixed variance is large"]
- C2 is more accurate.
- C1 will be useful later.
We now have an efficient approximation for the score.
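To make the comparison concrete, the sketch below computes two similarity measures for the linear Gaussian case and checks them against exact log-likelihood differences. The formulas are an assumption consistent with the slide's description (C1 with the variance held fixed, C2 with the variance re-estimated); the names `c1`, `c2` and `exact_gains` are illustrative.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def gaussian_loglik(residual, sigma2):
    """Log-likelihood of zero-mean Gaussian residuals with variance sigma2."""
    m = len(residual)
    return (-0.5 * m * math.log(2 * math.pi * sigma2)
            - dot(residual, residual) / (2 * sigma2))

def c1(y, z, sigma2):
    """Gain when only the coefficient of z is optimized (variance fixed)."""
    return dot(y, z) ** 2 / (2 * sigma2 * dot(z, z))

def c2(y, z):
    """Gain when the variance is re-estimated as well."""
    cos2 = dot(y, z) ** 2 / (dot(y, y) * dot(z, z))
    return -0.5 * len(y) * math.log(1.0 - cos2)

def exact_gains(y, z):
    """Exact log-likelihood gains, computed the long way for comparison."""
    m = len(y)
    theta = dot(y, z) / dot(z, z)
    after = [a - theta * b for a, b in zip(y, z)]
    s0 = dot(y, y) / m                      # ML variance before adding z
    s1 = dot(after, after) / m              # ML variance after adding z
    fixed = gaussian_loglik(after, s0) - gaussian_loglik(y, s0)
    refit = gaussian_loglik(after, s1) - gaussian_loglik(y, s0)
    return fixed, refit
```

With the variance fixed at its current maximum-likelihood value, C1 equals (M/2) cos^2(phi) while C2 equals -(M/2) log(1 - cos^2(phi)), so C1 never exceeds C2 and the gap widens as the candidate aligns with the ideal profile.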

Ideal Parent in Search
Structure search involves:
- O(N^2) Add parent
- O(NE) Replace parent
- O(E) Delete parent
- O(E) Reverse edge
[Figure: candidate structures over the variables S, C, E, D]
The vast majority of evaluations are replaced by the ideal approximation. Only K candidates per family are optimized and scored.
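The "only K candidates per family" filter can be sketched as a two-stage selection. Here `approx_gain` stands for the cheap ideal-parent approximation and `exact_gain` for the full optimize-and-score step; both are hypothetical callbacks supplied by the caller.

```python
def choose_parent(candidates, approx_gain, exact_gain, k):
    """Rank all candidates cheaply, then fully score only the top k."""
    ranked = sorted(candidates, key=approx_gain, reverse=True)
    shortlist = ranked[:k]                       # the K promising candidates
    best = max(shortlist, key=exact_gain)        # costly step, run k times only
    return best, shortlist
```

With N candidates per family this replaces N costly evaluations by N cheap ones plus K costly ones. The trade-off is that a candidate ranked below K by the approximation is never seen by the exact score.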

Gene Expression Experiment
4 gene expression datasets with 44 (Amino), 89 (Metabolism) and 173 (2x Conditions) variables.
[Figure: test -log-likelihood versus K and speedup versus K for Amino, Metabolism, Conditions (AA) and Conditions (Met), compared with greedy search]
Only 0.4%-3.6% of the changes are evaluated.

Scope
Conditional probability distribution (CPD) of the form X = g(U) + noise, where g is the link function and the noise is white noise.
General requirement: g(U) can be any function that is invertible with respect to each u_i.
Examples: Linear Gaussian, Chemical Reaction, Sigmoid Gaussian.
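The invertibility requirement is what allows the ideal profile to be computed in closed form. As a sketch, assume a logistic link g(w) = 1 / (1 + exp(-w)) applied to the summed parent input w, with the ideal parent entering additively inside the link; then the ideal value y solves g(w + y) = x. The parameterization and the names below are assumptions made for illustration.

```python
import math

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

def logit(x):
    """Inverse of the logistic sigmoid, defined for 0 < x < 1."""
    return math.log(x / (1.0 - x))

def ideal_profile_sigmoid(child, parent_input):
    """Per-instance y such that sigmoid(parent_input + y) equals the child value."""
    return [logit(x) - w for x, w in zip(child, parent_input)]
```

Child values at or outside the range of the link (0 or 1 here) have no finite ideal value, so a practical version would need to clip or otherwise handle them.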

Sigmoid Gaussian CPD
Problem: No simple form for the similarity measures.
[Figure: the sigmoid X = g(z) with the ideal values Y(0.5) and Y(0.85) marked, and the likelihoods P(X=0.5|Z) and P(X=0.85|Z) as functions of Z, exact versus approximate]
Linear approximation around Y=0.
Solution: The sensitivity to Z depends on the gradient at the specific instance.

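Sigmoid Gaussian CPD
[Figure: equi-likelihood potential for Z (X=0.5) and Z (X=0.85), before and after gradient correction; the corrected axes are Z x 0.25 (gradient at X=0.5) and Z x (gradient at X=0.85)]
After gradient correction, we can now use the same measure.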
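One way to read the gradient correction is as a per-instance rescaling of both profiles before applying the linear Gaussian measure. The sketch below assumes a logistic link whose gradient is taken at the point where the link output equals the observed value, so g' = x(1 - x); this matches the factor 0.25 shown for X = 0.5 on the slide, but the exact form used in the talk is not preserved, so treat this as an assumption.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def corrected_c1(child, ideal, candidate, sigma2):
    """Fixed-variance similarity on gradient-scaled profiles (logistic link)."""
    grad = [x * (1.0 - x) for x in child]        # logistic gradient where g = x
    y_t = [g * y for g, y in zip(grad, ideal)]   # scaled ideal profile
    z_t = [g * z for g, z in zip(grad, candidate)]  # scaled candidate profile
    return dot(y_t, z_t) ** 2 / (2.0 * sigma2 * dot(z_t, z_t))
```

Instances on the flat parts of the sigmoid get small weights, which reflects that a change in Z there barely moves the prediction.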

Sigmoid Gene Expression
4 gene expression datasets with 44 (Amino), 89 (Metabolism) and 173 (Conditions) variables.
[Figure: test -log-likelihood versus K and speedup versus K for Amino, Metabolism, Conditions (AA) and Conditions (Met), compared with greedy search]
Only 2.2%-6.1% of the moves are evaluated.

Adding New Hidden Variables
Idea: Introduce a hidden parent for nodes with similar ideal profiles.
[Figure: ideal profiles Y1, ..., Y5 of the nodes X1, ..., X5 across instances; a hidden parent H is introduced for X1, X2 and X4]
For the Linear Gaussian case: [bound not preserved in the transcript]
Challenge: Find the h that maximizes this bound.

Scoring a parent
The score of h is the Rayleigh quotient of the matrix YY^T and h, where Y is the matrix whose columns are the ideal profiles.
h* is the eigenvector of YY^T with the largest eigenvalue, and it must lie in the span of Y.
Setting h = Yv and A = Y^T Y, and using the above (with A invertible), finding h* amounts to solving an eigenvector problem of size |A| x |A|, where |A| = size of the cluster.
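The reduction to a small eigenvector problem can be sketched with plain power iteration. The sketch assumes unit variances for all cluster members, so the cluster score is the sum over members of (y_i . h)^2 / (h . h); it works on the small Gram matrix of the profiles (one row per cluster member) and then maps the leading eigenvector back to a profile over instances. Function names are illustrative.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def best_hidden_profile(profiles, iterations=500):
    """Profile h maximizing the cluster score, via power iteration.

    profiles: one ideal profile (list of M values) per cluster member.
    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    n = len(profiles)
    # Gram matrix of size n x n, where n is the cluster size
    gram = [[dot(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    v = [1.0 / (i + 1) for i in range(n)]        # asymmetric start vector
    for _ in range(iterations):
        w = [dot(row, v) for row in gram]
        norm = math.sqrt(dot(w, w))
        v = [c / norm for c in w]
    m = len(profiles[0])
    # Map back to instance space: h is a combination of the profiles
    return [sum(v[i] * profiles[i][k] for i in range(n)) for k in range(m)]

def cluster_score(profiles, h):
    """Sum of (y . h)^2 / (h . h) over the cluster, assuming unit variances."""
    return sum(dot(y, h) ** 2 for y in profiles) / dot(h, h)
```

The iteration runs on a matrix whose side is the cluster size, not the number of instances, which is the point of the reduction when clusters are small and instances are many.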

Finding the best Cluster
[Figure: quantities over the nodes X1, X2, X3, X4 are computed only once and then used to compute the scores of the candidate pairs (X1, X2), (X1, X3) and (X3, X4)]

Finding the best Cluster
[Figure: quantities over the nodes X1, X2, X3, X4 are computed only once; starting from the scored pairs, the cluster (X1, X3) is extended with X2 and then with X4]
- Select the cluster with the highest score.
- Add a hidden parent and continue with the search.

Bipartite Network
Instances from a biological expert network with 7 (hidden) parents and 141 (observed) children.
[Figure: train log-likelihood and test log-likelihood versus number of instances for Greedy, Ideal K=2, Ideal K=5 and Gold]
Speedup is roughly x10. Greedy takes over 2.5 days!

Summary
- New method for significantly speeding up structure learning in continuous variable networks
- Offers a promising time vs. performance tradeoff
- Guided insertion of new hidden variables
Future work
- Improve cluster identification for the non-linear case
- Explore additional distributions and the relation to GLM
- Combine the ideal parent approach as a plug-in with other search approaches