Presenter: Libin Zheng, Yongqi Zhang Department of Computer Science and Engineering HKUST Date: 24/11/2015 Crowd-aided course selection on MOOC.

Presentation transcript:


1. Motivation
MOOC: Massive Open Online Course.
New York Times: “2012 is the Year of MOOCs”. New platforms keep emerging: Coursera, edX, Udacity, ...
Function: provide online courses that are openly accessible via the web. Online students who complete a course can earn a certificate for a fee.
Problem: low course completion rates and high student dropout rates.
Top ten reasons for dropping out include poor course design, hidden costs, excessive workload, and lecture fatigue.

1. Motivation
Platforms lack support for student ratings and comments on courses, so students have little information about course quality before enrollment.
[Diagram labels: crowd (experienced students), course reputation, user, course query, courses, course selection, enroll, highly rated courses.]
Crowd-aided course selection: with the help of experienced students, users do not have to enroll in multiple courses in order to compare them themselves.

2. Problem formulation
User: "I am interested in Machine Learning and want to register for an ML course on Coursera. Which course should I select?"
Comparison task post: "Which course do you recommend?" with options A and B.

2. Problem formulation: Task assignment constraints
Constraint: for a comparison task (A, B), only students who have enrolled in both A and B can do the comparison.
Some comparison tasks (A, B) cannot be assigned because there are no valid responders.
Even if a task (A, B) is assigned, we may not receive an answer, because the student simply does not respond to the vote or takes an unacceptably long time to respond.
Only partial evidence is available.
The votes are collected in a vote matrix W, where W(i, j) is the number of votes for o_j being greater than o_i.
Judgment problem: given the vote matrix W, what is the best estimate of the max?
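The vote matrix can be illustrated with a minimal Python sketch. The names `build_vote_matrix`, `wins`, and `losses`, and the (winner, loser) encoding of a vote, are assumptions made for this example and do not come from the slides; only the convention that W(i, j) counts votes for o_j over o_i is taken from them.

```python
def build_vote_matrix(n, votes):
    """Build W for n objects, where W[i][j] counts votes for o_j over o_i.

    `votes` is an iterable of (winner, loser) index pairs (hypothetical encoding).
    Pairs that were never answered simply leave their entries at zero.
    """
    W = [[0] * n for _ in range(n)]
    for winner, loser in votes:
        if winner == loser:
            raise ValueError("an object cannot be compared with itself")
        W[loser][winner] += 1
    return W


def wins(W, i):
    """Number of votes that o_i received (column sum)."""
    return sum(row[i] for row in W)


def losses(W, i):
    """Number of votes cast against o_i (row sum)."""
    return sum(W[i])


# Five votes over three courses; course 0 is preferred in every task it appears in.
example_votes = [(0, 1), (0, 1), (0, 2), (1, 2), (2, 1)]
example_W = build_vote_matrix(3, example_votes)
```

With this layout, wins(i) is a column sum and losses(i) is a row sum, which is the form the scoring methods in the later slides rely on.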

3. Methods: Maximum Likelihood (ML)
ML uses Bayes' rule to determine the max element, i.e. the object j that is most likely to be the true max element given the votes.
This requires enumerating all permutations of the objects (NP-hard, not practical).
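A brute-force version of the ML idea can be sketched for very small inputs. The slide's formula did not survive, so the model below is an assumption: a uniform prior over all orderings, and every vote independently correct with a fixed worker accuracy p. The function name `ml_max_posterior` is hypothetical.

```python
from itertools import permutations


def ml_max_posterior(W, p):
    """Posterior probability that each object is the true max.

    Assumed model (not stated on the slide): uniform prior over orderings,
    each vote independently correct with probability p, 0 < p < 1.
    W[i][j] counts votes for o_j over o_i.
    Cost grows as n!, so this is only usable for a handful of objects.
    """
    n = len(W)
    posterior = [0.0] * n
    for perm in permutations(range(n)):
        # perm lists the objects from largest to smallest.
        rank = {obj: r for r, obj in enumerate(perm)}
        likelihood = 1.0
        for i in range(n):
            for j in range(n):
                if W[i][j]:
                    # These votes claim o_j > o_i; they are right if o_j ranks higher.
                    correct = rank[j] < rank[i]
                    likelihood *= (p if correct else 1.0 - p) ** W[i][j]
        posterior[perm[0]] += likelihood
    total = sum(posterior)
    return [x / total for x in posterior]


ml_example_W = [[0, 0, 0], [2, 0, 1], [1, 1, 0]]
ml_example = ml_max_posterior(ml_example_W, 0.75)
```

The loop over `permutations` is what makes the approach impractical beyond a few objects, which matches the slide's remark about cost.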

3. Methods: Local strategy (Local)
Define a score for each object, initialized as score(i) = wins(i) – losses(i).
Then differentiate the votes by considering the strength of the objects that o_i was compared against: o_i receives a bonus if it beats a strong candidate.
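One way the local strategy could be realised is sketched below. The slide only states that beating a strong candidate earns a bonus; the exact bonus used here (the beaten opponent's number of wins) and the symmetric penalty for losing to an opponent (that opponent's number of losses) are assumptions, as is the function name `local_scores`.

```python
def local_scores(W):
    """Local-strategy scores, with W[i][j] = votes for o_j over o_i.

    Starts from wins(i) - losses(i). The bonus and penalty terms are an
    assumed concrete form of the "strength of the opponent" idea.
    """
    n = len(W)
    wins = [sum(W[i][j] for i in range(n)) for j in range(n)]
    losses = [sum(W[i]) for i in range(n)]
    scores = []
    for i in range(n):
        score = wins[i] - losses[i]
        for j in range(n):
            if j == i:
                continue
            if W[j][i] > W[i][j]:
                # o_i beat o_j by majority: the bonus grows with o_j's strength.
                score += wins[j]
            elif W[i][j] > W[j][i]:
                # o_i lost to o_j by majority: the penalty grows with o_j's weakness.
                score -= losses[j]
        scores.append(score)
    return scores


local_example = local_scores([[0, 0, 0], [2, 0, 1], [1, 1, 0]])
```

The object with the highest score is returned as the estimated max; ties between a pair (equal votes in both directions) contribute neither bonus nor penalty in this sketch.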

3. Methods: Iterative strategy (ITR)
Define a score for each object: dif(i) = wins(i) – losses(i).
Iteratively prune the half of the objects with the lower scores, then recompute the scores with only the remaining objects considered.
The final survivor is taken as the max.
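The pruning loop can be sketched as follows. The slide does not say how to round when the number of objects is odd or how to break ties, so keeping the upper half rounded up and preferring the lower index on ties are assumptions; `iterative_max` is a hypothetical name.

```python
def iterative_max(W):
    """Iterative strategy, with W[i][j] = votes for o_j over o_i.

    Repeatedly keeps the better half of the objects by dif(i) = wins(i) - losses(i),
    counting only votes between objects that are still in the running.
    """
    if not W:
        raise ValueError("need at least one object")
    remaining = list(range(len(W)))
    while len(remaining) > 1:
        dif = {
            i: sum(W[j][i] - W[i][j] for j in remaining)
            for i in remaining
        }
        # Stable sort: on equal scores the lower index stays ahead (assumed tie-break).
        remaining.sort(key=lambda i: dif[i], reverse=True)
        keep = (len(remaining) + 1) // 2  # upper half, rounded up (assumed)
        remaining = remaining[:keep]
    return remaining[0]


# One correct vote per pair for the true order o_0 > o_1 > o_2 > o_3.
itr_example_W = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]
itr_example = iterative_max(itr_example_W)
```

Recomputing dif on the survivors is the point of the method: votes won against objects that were already pruned no longer inflate a candidate's score.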

3. Methods: PageRank
A modification of the classical PageRank algorithm.
The probability that object i is the max element is the sum of the probabilities passed on by the objects it defeated.
For each object, first calculate its period, then compute its average probability over that period; this average serves as the final score.
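A simplified sketch of the PageRank-style scoring is given below. Several details are assumptions rather than the slide's method: each object splits its probability among the objects that beat it in proportion to the votes, an undefeated object keeps its own mass, no damping factor is used, and instead of computing each object's period the sketch averages over a fixed window of the last iterations. The name `pagerank_scores` and its parameters are hypothetical.

```python
def pagerank_scores(W, iterations=50, average_over=10):
    """PageRank-style scores, with W[i][j] = votes for o_j over o_i.

    Simplification: averages the last `average_over` iterations rather than
    computing an exact period per object.
    """
    n = len(W)
    if not 0 < average_over <= iterations:
        raise ValueError("average_over must be between 1 and iterations")
    # Row-normalise W: o_i hands its probability to the objects that beat it.
    P = []
    for i in range(n):
        total = sum(W[i])
        if total == 0:
            # Never defeated: keep the probability (assumed self-loop).
            P.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            P.append([W[i][j] / total for j in range(n)])
    pr = [1.0 / n] * n
    acc = [0.0] * n
    for step in range(iterations):
        # New probability of o_j is the sum of what its defeated opponents pass on.
        pr = [sum(pr[i] * P[i][j] for i in range(n)) for j in range(n)]
        if step >= iterations - average_over:
            acc = [a + x for a, x in zip(acc, pr)]
    return [a / average_over for a in acc]


pr_example = pagerank_scores([[0, 0, 0], [2, 0, 1], [1, 1, 0]])
```

Averaging over a window is a cheap stand-in for the period computation: when votes form cycles the probabilities can oscillate, and a single snapshot would depend on where the iteration happened to stop.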

4.1 Synthetic Experiment: Experiment setup
Control variable: values
[variable name missing]: from 1 coverage to 5 coverage
Worker accuracy: from 0.55 to 0.95
Number of runs: 5000
[Diagram labels: construction of W; average as the result.]

4.1 Synthetic Experiment: Evaluation criteria
Evaluation criterion: formulation
Precision: ratio of the true max elements among all returns.
MRR (Mean Reciprocal Rank): the inverse rank of the true maximum in the predicted ranking.
A higher value indicates better performance.
[Example figure labels: max, descending, true order, Reciprocal Rank = 5/7.]

4.1 Synthetic Experiment
[Figures: precision versus coverage; MRR versus coverage.]
We compare ML with the other methods. Since running ML is very costly, we only consider an input of 5 objects with worker accuracy = 0.75.

4.1 Synthetic Experiment
Number of objects: 100. Coverage: from 1 to 10.
[Figure panels: P = 0.55, P = 0.75, P = 0.95.]

4.2 An off-line experiment
The questionnaire: from 16 volunteers, we obtain a vote matrix consisting of 46 votes, which is more than 2 coverage.
Result:

4.2 An off-line experiment
ITR: rank = [value missing]
Local: rank = [value missing]
The reasonable way to define the ground truth.

Thanks!