October Rotation Paper-Reviewer Matching Dina Elreedy Supervised by: Prof. Sanmay Das
Agenda
o Problem Definition and Motivation
o System Design
o Datasets
o Experiments and Results
o Extension
Problem Definition and Motivation
o Large conferences receive hundreds (or thousands) of papers and have hundreds of reviewers!
o Constraints on reviewer load and matching quality make this a challenging task!
o Automatic reviewer-paper matching systems try to maximize overall reviewers' preferences through the assignment.
Problem Formulation
Given: a matrix A of paper-reviewer preferences.
Goal: find a matching Y satisfying the constraints and maximizing total affinity.
Challenge: the input preference matrix A is very sparse!
We follow the structure of the Toronto Paper Matching System [1], which is widely used in large AI conferences (NIPS, ICML, UAI, AISTATS, etc.).
[1] L. Charlin and R. S. Zemel, "The Toronto Paper Matching System: an automated paper-reviewer assignment system," in International Conference on Machine Learning (ICML), 2013.
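The goal above can be written as an integer program (a sketch; the per-paper reviewer count r and the per-reviewer load cap L are assumed symbols, not taken from the slides):

```latex
\max_{Y \in \{0,1\}^{N_p \times N_r}} \; \sum_{i=1}^{N_p} \sum_{j=1}^{N_r} A_{ij} Y_{ij}
\quad \text{s.t.} \quad
\sum_{j} Y_{ij} = r \;\; \forall i,
\qquad
\sum_{i} Y_{ij} \le L \;\; \forall j,
```

where Y_ij = 1 means reviewer j is assigned to paper i.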
System Design
Given Preferences A → Prediction Module (BPMF) → Filled Preferences A' → Optimization Module → Matching Y
Prediction Module
o Collaborative filtering is successfully used in recommender systems.
o We use Bayesian Probabilistic Matrix Factorization (BPMF), the publicly available implementation of [2].
o Matrix factorization: A = UᵀV, where U is the matrix of papers' latent variables and V is the matrix of reviewers' latent variables.
o BPMF assumes Gaussian prior distributions for U and V.
[2] R. Salakhutdinov and A. Mnih, "Bayesian probabilistic matrix factorization using Markov chain Monte Carlo," in Proceedings of the 25th International Conference on Machine Learning. ACM, 2008, pp. 880–887.
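To make the factorization concrete, here is a minimal sketch of the non-Bayesian (MAP/SGD) variant of PMF that BPMF generalizes; it fills a sparse preference matrix by fitting A ≈ UᵀV on the observed entries. Function and hyperparameter names (`pmf_fill`, `rank`, `lr`, `reg`) are illustrative, not from the slides, and the actual system uses Gibbs-sampled BPMF rather than this simplified version.

```python
import numpy as np

def pmf_fill(A, mask, rank=5, lr=0.01, reg=0.1, epochs=200, seed=0):
    """Fill a sparse preference matrix A (papers x reviewers) by
    factorizing A ~ U.T @ V on the observed entries (mask == 1).
    This is the MAP / non-Bayesian variant of PMF; BPMF instead
    samples U and V with Gibbs sampling (see [2])."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    U = 0.1 * rng.standard_normal((rank, n))  # paper latent factors
    V = 0.1 * rng.standard_normal((rank, m))  # reviewer latent factors
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = A[i, j] - U[:, i] @ V[:, j]
            Ui = U[:, i].copy()  # keep old value for V's update
            # SGD steps with L2 regularization
            U[:, i] += lr * (err * V[:, j] - reg * U[:, i])
            V[:, j] += lr * (err * Ui - reg * V[:, j])
    return U.T @ V  # dense filled preference matrix A'
```

The filled matrix A' is then passed to the optimization module in place of the sparse input.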
Bayesian PMF
(In the figure from [2], A is denoted R.)
Gibbs sampling is used to estimate the posterior distributions of U and V.
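For reference, the Gibbs step in [2] draws each paper factor U_i from a Gaussian conditional posterior (notation follows the paper: α is the rating precision, I_ij the observation indicator, and μ_U, Λ_U the sampled hyperparameters):

```latex
p(U_i \mid R, V, \mu_U, \Lambda_U, \alpha)
  = \mathcal{N}\!\left(U_i \,\middle|\, \mu_i^{*}, \left[\Lambda_i^{*}\right]^{-1}\right),
\qquad
\Lambda_i^{*} = \Lambda_U + \alpha \sum_{j} I_{ij} V_j V_j^{\top},
\qquad
\mu_i^{*} = \left[\Lambda_i^{*}\right]^{-1}
  \Big( \alpha \sum_{j} I_{ij} R_{ij} V_j + \Lambda_U \mu_U \Big),
```

with a symmetric update for the reviewer factors V_j.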
System Design Prediction Module (BPMF) Optimization Module Given Preferences A Filled Preferences A’ Matching Y
Optimization Module
The matching problem can be solved with linear programming using Taylor's formulation [3].
[3] C. J. Taylor, "On the optimal assignment of conference papers to reviewers," 2008.
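A minimal sketch of this LP, assuming SciPy's `linprog` as the solver; the function name `match_papers` and the parameters `reviewers_per_paper` and `max_load` are illustrative. Because the constraint matrix of this assignment problem is totally unimodular, the LP relaxation of the 0/1 problem has an integral optimum, which is the key point of Taylor's formulation.

```python
import numpy as np
from scipy.optimize import linprog

def match_papers(affinity, reviewers_per_paper, max_load):
    """Maximize total affinity subject to each paper receiving exactly
    `reviewers_per_paper` reviewers and each reviewer being assigned
    at most `max_load` papers, via the LP relaxation."""
    n, m = affinity.shape          # papers x reviewers
    c = -affinity.ravel()          # linprog minimizes, so negate
    # Equality constraints: each paper i gets exactly r reviewers.
    A_eq = np.zeros((n, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    b_eq = np.full(n, float(reviewers_per_paper))
    # Inequality constraints: each reviewer j gets at most L papers.
    A_ub = np.zeros((m, n * m))
    for j in range(m):
        A_ub[j, j::m] = 1.0
    b_ub = np.full(m, float(max_load))
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs")
    # Total unimodularity makes the vertex solution integral.
    return res.x.reshape(n, m).round().astype(int)
```

The returned 0/1 matrix Y is the matching produced from the filled preference matrix A'.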
Datasets
NIPS 2006 Paper-Reviewer Dataset
o 148 papers submitted to NIPS 2006 and 364 reviewers.
o Reviewers' preferences range from 0 to 3.
o Data sparsity (percentage of known ratings) = 393/(148×364) ≈ 0.73%.
Netflix Movie Rating Dataset
o We use a small portion of the Netflix dataset, since the full dataset (6000 movies and 3500 users) makes the optimization intractable.
o We consider a subproblem of 300 users and 500 movies.
o Data sparsity = 4035/(500×300) = 2.69%.
Experiments and Results
o For each dataset, we evaluate both prediction accuracy (RMSE) and matching quality (affinity score).
[Figures: results on the NIPS dataset and the Netflix dataset]
Results (cont.)
[Figures: further results on the NIPS dataset and the Netflix dataset]
Extension
o Develop active learning strategies for rating elicitation to enhance matching quality.
o We aim to select the most useful paper-reviewer pairs for the matching process (November Rotation).
Thank you