Jifan Yu, Chenyu Wang, Gan Luo,

Presentation transcript:

Course Concept Expansion in MOOCs with External Knowledge and Interactive Game
Jifan Yu, Chenyu Wang, Gan Luo, Lei Hou, Jie Tang, Juanzi Li and Zhiyuan Liu
Tsinghua University
Good morning, ladies and gentlemen. I am Jifan from Tsinghua University. It is a great pleasure to be able to attend this conference. My work is about expanding the content of MOOCs through a small game. To be honest, I have longed to play digital games in class ever since my childhood.

Learning in a Game
Find out the irrelevant orange candidate(s)!
Heap, Christmas Tree, Binary Tree, Tango Tree, Huffman Tree
So how do games expand course knowledge, and why do we need to expand knowledge in MOOCs? Maybe this simple little game can give us some inspiration. The rule is simple: given a few orange candidates, we need to find out which of them are irrelevant to a given blue concept. Okay, now I need a volunteer to help me find the irrelevant candidate.

Learning in a Game
Find out the irrelevant orange candidate(s)!
Heap, Data Structures, Christmas Tree ✔️, Binary Tree, Tango Tree, Huffman Tree
The game helps students review old knowledge and get to new knowledge in a relaxing way.
“Tango Tree is a type of Binary Search Tree that a competitive ratio…” ---Wikipedia
Nice job! Thank you. The answer is Christmas Tree, because the other candidates are all concepts from "data structures". Beyond the answer to this game, I think we have all noticed an unfamiliar concept, Tango Tree. Actually, I did not know it before I googled it; it is an advanced binary search tree structure.

In real MOOCs
“Top-Student Game” in XuetangX
Over 50,000 user operations
Course: “Introduction to Psychology”
Top-Student Game: Users delete irrelevant orange concepts and gain a bonus.
In fact, that is how we actually use it on XuetangX, one of China’s largest MOOC education sites. It already has thousands of users interacting with it. The game, which we call “Top-Student”, is placed below the videos and presents content related to the current video.

Challenges
A New Task: Course Concept Expansion
Find out concepts related to the course.
Interact with MOOC users via our Game.
Set Expansion (Wang, 2007; Adrian and Manna, 2018)
Courses are not typical “categories”. Semantic Drifts
How to find high-quality expansion candidates?
How to involve user behavior in Top-Student Game?
Typical Category: Cities (Florence, New York, Paris, Beijing, …)
MOOC Course: “Psychology in Work” (Work, Psychology)
Therefore, we designed a new task named Course Concept Expansion, which aims at finding concepts related to the course from external sources, and we present the expansion results to users through our game. The closest existing task is Set Expansion: given a set, use external resources to expand the entities of the set as far as possible. However, these methods cannot be used directly on MOOCs, because MOOC courses are often combinations of categories, and this may lead to severe semantic drift.

Framework
Input: MOOC, External Knowledge Base
Models + Game
Output: Expanded Concepts
1-2. Prevent Semantic Drifts
3. Interact with Users
To overcome these challenges, we build a three-stage workflow that expands course concepts using an external knowledge base. First, we extract course concepts from the MOOC and select the entities that have relations with them in the KB as candidates. Finally, we put them into the game to show them to MOOC users and collect feedback for further optimization.

Method: Candidate Generation
First: Course Concept Extraction
Assumption: A course is a concept space which contains one or more concept clusters.
Each course, e.g. “Data Structure and Algorithm”: Graph algorithms, Trees, Sort algorithms.
Using clusters to delineate the semantic boundaries of the course
The first step, Candidate Generation, is very important. Because candidate generation directly determines the quality and quantity of the resulting expansions presented to the user, we want to minimize semantic drift in this step. Through observation, we find that the concepts of a course tend to cluster together when represented by vectors. For example, in the course “Data Structure and Algorithm”, concepts can be roughly divided into three categories, Graph Algori

Method: Candidate Generation
The concept space boundary is fitted while searching for new candidates (in the KB).
Confidence Score: each newly found concept’s confidence score is provided by its nearest cluster.
Once a potential cluster’s size reaches τ… (it can provide enough seeds)
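
The slide says only that a new concept's confidence score comes from its nearest cluster; it does not give the formula. The sketch below assumes one plausible reading: each concept is an embedding vector, each cluster is summarized by its centroid, and the score is the cosine similarity to the closest centroid. The function names, the cosine choice, and the toy two-dimensional vectors are all illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def confidence_score(candidate, clusters):
    """Return (nearest_cluster_name, score) for a candidate vector.

    Assumption: the score is the cosine similarity to the nearest cluster centroid.
    """
    best_name, best_score = None, -1.0
    for name, members in clusters.items():
        score = cosine(candidate, centroid(members))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Toy embeddings (made up for illustration): two concept clusters of one course.
clusters = {
    "trees": [[1.0, 0.1], [0.9, 0.2]],
    "sorting": [[0.1, 1.0], [0.2, 0.9]],
}
name, score = confidence_score([0.8, 0.3], clusters)
print(name, round(score, 3))
```

A candidate far from every cluster gets a low score under this reading, which is one way the cluster boundaries could limit semantic drift.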

Method: Candidate Generation
Link the course concepts into the KB.
Search for the concepts and entities among their neighbors.
Figure: A part of the KB in Candidate Generation (link the course concept, then search the neighbors as candidates)
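
The link-then-search step can be sketched as a one-hop neighbor lookup over a small graph. This is a minimal illustration under stated assumptions: the KB is reduced to a list of (head, tail) pairs, entity linking is simplified to exact name matching, and the edges are toy data rather than content from XLORE. The name `generate_candidates` is hypothetical.

```python
def generate_candidates(course_concepts, kb_edges):
    """Collect one-hop KB neighbors of the linked course concepts as candidates."""
    # Build an undirected adjacency map from (head, tail) pairs.
    neighbors = {}
    for head, tail in kb_edges:
        neighbors.setdefault(head, set()).add(tail)
        neighbors.setdefault(tail, set()).add(head)

    seeds = set(course_concepts)
    candidates = set()
    for concept in seeds:
        # Assumption: "linking" is exact name matching against KB entities.
        for entity in neighbors.get(concept, ()):
            if entity not in seeds:
                candidates.add(entity)
    return candidates


# Toy KB fragment (illustrative only).
kb_edges = [
    ("Binary Tree", "Binary Search Tree"),
    ("Binary Search Tree", "Tango Tree"),
    ("Binary Tree", "Huffman Tree"),
    ("Heap", "Priority Queue"),
]
candidates = generate_candidates(["Binary Tree", "Heap"], kb_edges)
print(sorted(candidates))
```

In this toy graph, "Tango Tree" is two hops from the seeds, so a single pass does not reach it; it would only appear if "Binary Search Tree" were later accepted and used as a seed.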

Method: Classification
Feature Engineering
Confidence Score (S)
Search Path Encoding (Cho et al., 2014) (PE)
Prerequisite Features (Pan, 2017) (Ps)
User Deletion Rate (From Game) (Dr)
Feature types: Corpus-based Feature, Knowledge Base Feature, Human Efforts, Domain-Specific Feature

Method: Game-based Optimization
“Top-Student Game” in XuetangX
As a feedback collector
As an online evaluation

Method: Game-based Optimization
“Top-Student Game” in XuetangX
Multi-level optimization
For Candidate Generation: adjust the Confidence Score
For Classification: serve as a feature
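
The slide does not say how game feedback adjusts the confidence score, so the following is only a sketch of one possible rule. It assumes the deletion rate of a concept is deletions divided by the number of times it was shown, and that the score is scaled down linearly as that rate grows. The function names, the `weight` parameter, and the counts are hypothetical.

```python
def deletion_rate(deletions, impressions):
    """Fraction of times users deleted a concept when it was shown (0.0 if never shown)."""
    return deletions / impressions if impressions else 0.0


def adjust_confidence(score, rate, weight=0.5):
    """Hypothetical rule: scale the score down linearly as the deletion rate grows."""
    return score * (1.0 - weight * rate)


# Made-up feedback: (deletions, impressions) per concept, plus prior scores.
feedback = {"Tango Tree": (2, 40), "Christmas Tree": (36, 40)}
scores = {"Tango Tree": 0.80, "Christmas Tree": 0.70}

adjusted = {
    concept: adjust_confidence(scores[concept], deletion_rate(*feedback[concept]))
    for concept in scores
}
print(adjusted)
```

The same `deletion_rate` value could also be passed to the classifier unchanged, matching the slide's second use of the feedback as a feature.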

Experiment
Dataset: MOOC data from XuetangX and Coursera
For each course, we select the top 800 expanded concepts from 100,000 candidates.
Knowledge Base: XLORE (Jin et al., 2018)

Experiment
Baselines:
PR (Graph Based Method)
SEISA (He and Xin, 2011)
EBM (Embedding Based Method) (Mamou et al., 2018)
PUL (PU-Learning methods) (Wang et al., 2017)
Evaluation Metrics: MAP (Mean Average Precision)
Results: Our model achieves a good result both with and without the game.
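
For readers unfamiliar with the metric, MAP can be computed as below. This is the common definition, with average precision normalized by the number of relevant items; the slide does not state which variant the experiments use, and the example rankings are made up.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Two made-up courses: a ranked expansion list and the set judged relevant.
runs = [
    (["a", "x", "b"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["x", "c"], {"c"}),            # AP = 1/2
]
print(round(mean_average_precision(runs), 4))
```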

Experiments
Parameter Analysis
Feature Contribution
Each of the features we designed is useful.

Experiments
Online Evaluation
Cr is the rate of user deletion (a larger Cr indicates lower expansion quality).
The expansion results satisfied real MOOC users.

MOOCdata: http://moocdata.cn
We are THU MOOC Team!
Data and other interesting work, all at http://moocdata.cn
MOOC: A perfect platform for AI in Education
Easy to interact with users
Easy to build fancy functions
Our publications in AAAI, NIPS…

MOOCdata: http://moocdata.cn
Our Data and Other Interesting Work!!
Students’ behavior
Course Recommendation
QA system
Prerequisite Relation Discovery
Xiaomu: A learning assistant in XuetangX.

Thank you! Questions?
Knowledge Engineering Group, Tsinghua University
THU MOOC Team: http://moocdata.cn
yujifan0326@gmail.com