Learning Conference Reviewer Assignments Adith Swaminathan Guide : Prof. Soumen Chakrabarti Department of Computer Science and Engineering, Indian Institute.

Presentation transcript:

Learning Conference Reviewer Assignments Adith Swaminathan Guide : Prof. Soumen Chakrabarti Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Future Work (from BTP1)
- Given WWW2010’s assignments, learn Affinity_Param, Topic_Param and Irritation
- Citations as edge features
- Load-Constrained Partial Assignments
- Better estimation of Assignment Quality

Background
- Conference Reviewer-Paper Assignment as a many-to-many matching [1]
- Minimum Cost Network Flow (MCF)

Conference Reviewer Assignment
- Set of Reviewers, R; reviewer i takes at most L_i papers
- Set of Papers, P; each paper needs at least K reviews
- Assumption: only the number of reviews is required, not their quality
- Suppose we have cost function A_ij(y) for
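One common way to write the problem above as an integer linear program is sketched here. It assumes the cost is linear in binary indicators y_ij (1 if paper j is assigned to reviewer i, 0 otherwise); that linear form is an assumption for illustration, since the slide only names a cost function A_ij(y).

```latex
\begin{aligned}
\min_{y} \quad & \sum_{i \in R} \sum_{j \in P} A_{ij}\, y_{ij} \\
\text{s.t.} \quad & \sum_{j \in P} y_{ij} \le L_i \quad \forall i \in R \\
& \sum_{i \in R} y_{ij} \ge K \quad \forall j \in P \\
& y_{ij} \in \{0, 1\} \quad \forall i \in R,\ j \in P
\end{aligned}
```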

ILP -> Assumption -> MCF

Two problems
- Integer Linear Programs are NP-hard in general!
  - Relax?
  - More assumptions?
- How to determine A_ij?
  - M * N ~
  - Multimodal clues

ILP -> Assumption -> MCF
- Enforce structure on A_ij
  - Better model multimodality
  - Fewer parameters to fix
- “Learn” A_ij using Structured Learning Techniques
- A_ij = w^T Φ(R_i, P_j, y_ij)
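A minimal sketch of the structured cost A_ij = w^T Φ(R_i, P_j, y_ij): each reviewer-paper pair is described by a feature vector, and a single weight vector w is shared across all pairs. The function name `edge_cost` and the three feature values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def edge_cost(w: Sequence[float], phi: Sequence[float]) -> float:
    """Return the inner product w^T phi for one reviewer-paper pair."""
    if len(w) != len(phi):
        raise ValueError("w and phi must have the same length")
    return sum(wk * fk for wk, fk in zip(w, phi))


# Hypothetical weights and features for a single (reviewer, paper) pair.
w = [0.5, 2.0, 1.0]
phi = [0.2, 1.0, 0.0]
cost = edge_cost(w, phi)
```

With M reviewers and N papers, only len(w) parameters have to be set instead of M * N separate costs.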

Ramifications of Structured Costs
- Costs decompose over pairs
  - Decomposable Preference Auction (DPA)
  - Polynomial Algorithms for DPAs [2]
- Restricted notion of optimality
  - Per-reviewer/Per-paper constraint could be combinatorial
  - Stability?

ILP -> Assumption -> MCF

Minimum Cost Network Flow
- Directed graph G = (V, E), capacities u(E) >= 0, costs c(E)
- Nodes have numbers b(V) with Sum(b(V)) = 0
- Task: find a function f: E -> R+ which satisfies the b-flow at minimum cost
- Successive Shortest Path Algorithm
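The following is a minimal sketch of successive shortest paths applied to a toy reviewer assignment, not the implementation used in the talk. It assumes the b-flow is rewritten in source/sink form (source -> reviewer with capacity L_i, reviewer -> paper with capacity 1 and cost A_ij, paper -> sink with capacity K) and uses a queue-based Bellman-Ford search for each augmenting path. The class name, function name, and cost matrix are hypothetical.

```python
from collections import deque


class MinCostFlow:
    def __init__(self, n):
        self.n = n
        # Each edge is [to, residual capacity, cost, index of reverse edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t, demand):
        """Push up to `demand` units from s to t; return (flow, cost)."""
        flow = total = 0
        while flow < demand:
            dist = [float("inf")] * self.n
            prev = [None] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:  # Bellman-Ford with a queue; handles negative residual costs
                u = queue.popleft()
                in_queue[u] = False
                for k, (v, cap, cost, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, k)
                        if not in_queue[v]:
                            queue.append(v)
                            in_queue[v] = True
            if prev[t] is None:
                break  # no augmenting path left
            push, v = demand - flow, t
            while v != s:  # bottleneck capacity along the path
                u, k = prev[v]
                push = min(push, self.graph[u][k][1])
                v = u
            v = t
            while v != s:  # update residual capacities
                u, k = prev[v]
                edge = self.graph[u][k]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            total += push * dist[t]
        return flow, total


def assign_reviewers(costs, loads, k):
    """costs[i][j] = A_ij, loads[i] = L_i, k = reviews needed per paper."""
    n_rev, n_pap = len(costs), len(costs[0])
    s, t = 0, 1 + n_rev + n_pap
    net = MinCostFlow(t + 1)
    for i in range(n_rev):
        net.add_edge(s, 1 + i, loads[i], 0)
        for j in range(n_pap):
            net.add_edge(1 + i, 1 + n_rev + j, 1, costs[i][j])
    for j in range(n_pap):
        net.add_edge(1 + n_rev + j, t, k, 0)
    flow, total = net.solve(s, t, k * n_pap)
    if flow < k * n_pap:
        raise ValueError("infeasible: not enough reviewer capacity")
    # A saturated reviewer -> paper edge means the pair is assigned.
    pairs = [(i, e[0] - 1 - n_rev) for i in range(n_rev)
             for e in net.graph[1 + i] if e[0] > n_rev and e[1] == 0]
    return total, pairs


# Toy instance: 2 reviewers, 3 papers, one review per paper.
total, pairs = assign_reviewers([[1, 4, 5], [3, 2, 2]], loads=[2, 1], k=1)
```

In this toy instance the second reviewer can take only one paper, so the third augmenting path reroutes an earlier assignment through a negative-cost residual edge, which is the step that distinguishes min-cost flow from a greedy assignment.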

Node features and Edge features
- Node features: Reviewer (Profile, Topics); Paper (Contents, Topics)
- Edge features: Affinity, Bid, Topic Overlap, Cites

The Loss Function
- L_ij = w_1 * exp(-Affinity_ij) + w_2 * [[1 - Topic_Overlap_ij]] + w_3 * Bid_Cost
- Bid_Cost = Potential(R_i, P_j, y_ij)
- Irritation (I) and Disappointment (D) need to be set
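A small sketch of how the per-pair loss could be evaluated. Two things are assumptions made for illustration and are not stated on the slide: Topic_Overlap is treated as a yes/no indicator, and the bid potential charges Irritation when a negatively bid paper is assigned and Disappointment when a positively bid paper is not. The function names are hypothetical.

```python
import math


def bid_cost(bid, assigned, irritation, disappointment):
    """Hypothetical bid potential: bid is +1, -1, or 0 (no bid)."""
    if assigned and bid < 0:
        return irritation       # reviewer gets a paper they bid against
    if not assigned and bid > 0:
        return disappointment   # reviewer misses a paper they wanted
    return 0.0


def pair_loss(affinity, topic_overlap, bid_cost_value, w):
    """L_ij = w_1 * exp(-affinity) + w_2 * [no topic overlap] + w_3 * bid cost."""
    w1, w2, w3 = w
    topic_mismatch = 0.0 if topic_overlap else 1.0
    return w1 * math.exp(-affinity) + w2 * topic_mismatch + w3 * bid_cost_value


# Example: zero affinity, no shared topic, and a negative bid that was overridden.
loss = pair_loss(0.0, False, bid_cost(-1, True, 2.0, 1.0), w=(1.0, 0.5, 2.0))
```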

Assignment Quality Measures
- Number of Bids Violated?
  - Not a reliable measure.
- +ve Bids Violated
- -ve Bids Violated
- Assignments satisfying Topic Match
- Confidence?
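The two bid-violation counts can be sketched as below. The definitions are an assumption: a positive bid counts as violated when the pair is not assigned, and a negative bid counts as violated when the pair is assigned. The function name and data layout are hypothetical.

```python
def bid_violations(assignment, bids):
    """Count violated bids.

    assignment: set of (reviewer, paper) pairs that were assigned.
    bids: dict mapping (reviewer, paper) to a signed bid (+1 or -1).
    Returns (positive bids violated, negative bids violated).
    """
    pos = sum(1 for pair, b in bids.items() if b > 0 and pair not in assignment)
    neg = sum(1 for pair, b in bids.items() if b < 0 and pair in assignment)
    return pos, neg


assignment = {("r1", "p1"), ("r2", "p2")}
bids = {("r1", "p1"): 1, ("r1", "p2"): 1, ("r2", "p2"): -1, ("r2", "p1"): -1}
pos_violated, neg_violated = bid_violations(assignment, bids)
```

Reporting the two counts separately avoids folding them into a single "bids violated" number.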

Confidence == Quality?
- Very sparse
  - Fewer than 5% observed
  - Extrapolated Confidence?
- Reliable
  - Bids as a precursor of Confidence [3]
  - Confidence-Augmented Loss?

Learning w’s
- Transductive Ordinal Regression
  - Assume: Assignments are independent (Naïve)
  - Heuristic: Augment observed dataset
  - Extrapolate observed Confidence [4]
  - Learn w over extrapolated dataset
- Support Vector Machine for Structured Outputs
  - Cast as soft-margin SVM formulation [5]
  - Upper-bound objective with a convex function (Optimality?)
  - Minimize, using Cutting Plane (Approximate)

Transductive Ordinal Regression [6]

SVM Struct. [7]
- Loss Augmented Inference ~ Most Violated Constraint
- Loss is decomposable -> Modified MCF
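To illustrate why a decomposable loss keeps loss-augmented inference inside MCF, the derivation below assumes a Hamming loss between the observed assignment y* and a candidate y, with binary y_ij and a score equal to the negative total cost. The choice of Hamming loss is an assumption for illustration; the slide does not name the loss used.

```latex
\begin{aligned}
\Delta(y^*, y) &= \sum_{i,j} \bigl[ y^*_{ij} + (1 - 2 y^*_{ij})\, y_{ij} \bigr] \\
\hat{y} &= \arg\max_{y} \Bigl[ \Delta(y^*, y) - \sum_{i,j} A_{ij}\, y_{ij} \Bigr] \\
&= \arg\min_{y} \sum_{i,j} \bigl( A_{ij} - 1 + 2 y^*_{ij} \bigr)\, y_{ij}
\end{aligned}
```

Under these assumptions the most violated constraint is found by running the same MCF with each edge cost shifted by +1 on observed pairs and by -1 on all other pairs.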

PARA : Paper Assignment to Reviewers Apparatus

Results

Bimodal Behaviour
- Reviewer either gets few or L_i papers
- Load Penalties [8]
  - Introduce more parameters
  - Infer using modified MCF
  - Learning parameters?
- Load Rebalancing
  - Tradeoff between MCF optimum and old assignment
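One standard way to fold a convex load penalty into MCF is to replace the single source -> reviewer edge by L_i parallel unit-capacity edges whose costs are the marginal penalties. The sketch below computes those marginal costs; whether the talk used exactly this construction is not stated, and the function name and the quadratic penalty are hypothetical.

```python
def marginal_load_costs(max_load, penalty):
    """Return the cost of the k-th unit of load for k = 1..max_load.

    penalty: function mapping a load (int) to its total penalty.
    Each returned value would become the cost of one unit-capacity
    source -> reviewer edge in the flow network.
    """
    costs = [penalty(k) - penalty(k - 1) for k in range(1, max_load + 1)]
    # Convexity check: marginal costs must not decrease, otherwise the
    # flow could use the k-th unit edge before the (k-1)-th one.
    if any(b < a for a, b in zip(costs, costs[1:])):
        raise ValueError("penalty must be convex in the load")
    return costs


# Hypothetical quadratic penalty on the number of papers per reviewer.
unit_costs = marginal_load_costs(3, lambda load: load * load)
```

Because the marginal costs are non-decreasing, a minimum-cost flow fills the cheap edges first, so the total edge cost paid by a reviewer with load k equals penalty(k) - penalty(0).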

Penalise Reviewer Loads

Load Constrained Assignments

Avenues for Future Work
- Document Modelling for Affinity Scores
- Objective Assignment Evaluation
- Transitive Citation Scores
- Load Penalty Parameter Estimation

References
1. The Conference Paper Assignment Problem, J. Goldsmith, R.H. Sloan
2. MultiAgent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, Y. Shoham, K. Leyton-Brown
3. Automating the Assignment of Submitted Manuscripts to Reviewers, S.T. Dumais, J. Nielsen
4. Semisupervised Regression with Cotraining Algorithms, Z. Zhou, M. Li, 2007

References – contd.
5. Learning Structured Prediction Models: A Large Margin Approach, B. Taskar, et al.
6. Ologit: Ordinal Logistic Regression for Zelig, G. King, et al.
7. SVM Learning for Interdependent and Structured Output Spaces, I. Tsochantaridis, et al.
8. Word Alignment via Quadratic Assignment, S. Lacoste-Julien, et al., 2006