Intelligent Database Systems Lab N.Y.U.S.T. I. M. A language modeling framework for expert finding Presenter : Lin, Shu-Han Authors : Krisztian Balog,


Intelligent Database Systems Lab N.Y.U.S.T. I. M. A language modeling framework for expert finding Presenter : Lin, Shu-Han Authors : Krisztian Balog, Leif Azzopardi, Maarten de Rijke Information Processing and Management (IPM) 45 (2009) 1–19

Outline
- Motivation
- Objective
- Methodology
- Experiments
- Conclusion
- Comments

Motivation
Expert finding: finding experts on a given topic.
Yellow Pages approach:
- Profiles: employees self-assess their skills.
- Keywords, e.g., marketing.
Problems:
- Information: outdated.
- Keywords: restrictive.

Objectives
Within the organization:
- Mine published intranet documents.
- Search for all kinds of expertise.
Example query: 'Who are the experts on the topic "Internet marketing and internet advertising" in my organization?'

Methodology: Overview
Goal: capture the association between a candidate expert and an area of expertise, i.e., "What is the probability of a candidate ca being an expert given the query topic q?"
By Bayes' theorem, p(ca | q) = p(q | ca) · p(ca) / p(q). Here p(q) is constant for a given query and p(ca) is assumed uniform, so candidates can be ranked by p(q | ca).
- Model 1, candidate-based (query-independent) approach: build a profile of each candidate expert, then rank the candidates against the query.
- Model 2, document-based (query-dependent) approach: find the documents relevant to the query, then associate them with experts.

Methodology: Model 1
Build a textual representation (model) θ_ca of a person's knowledge from that person's documents, then estimate the probability of the query given the candidate's model.
e.g., p(Internet marketing | θ_ca) = p("Internet" | θ_ca) · p("marketing" | θ_ca)
e.g., p(Internet marketing and internet advertising | θ_ca) = p("Internet" | θ_ca)^2 · p("marketing" | θ_ca) · p("and" | θ_ca) · p("advertising" | θ_ca)
Annotations on the slide's equations: "(smoothed)" and "(weighted)".
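
The Model 1 examples above can be sketched in Python. This is a minimal illustration, not the authors' implementation. It assumes that "(weighted)" means a sum over the candidate's documents weighted by p(d | ca), and that "(smoothed)" means linear interpolation with a background term probability p(t). The function names, the smoothing weight `lam`, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def term_prob_given_candidate(term, docs, p_doc_given_ca):
    # Assumed "weighted" step: sum_d p(t | d) * p(d | ca),
    # with p(t | d) taken as the term's relative frequency in d.
    total = 0.0
    for doc_id, weight in p_doc_given_ca.items():
        tokens = docs[doc_id]
        total += weight * tokens.count(term) / len(tokens)
    return total

def log_query_likelihood(query, docs, p_doc_given_ca, background, lam=0.5):
    # log p(q | theta_ca) = sum_t n(t, q) * log p(t | theta_ca),
    # where n(t, q) is the number of times t occurs in the query.
    score = 0.0
    for term, n in Counter(query).items():
        p_ca = term_prob_given_candidate(term, docs, p_doc_given_ca)
        # Assumed "smoothed" step: interpolate with the background model.
        p_smoothed = (1 - lam) * p_ca + lam * background[term]
        score += n * math.log(p_smoothed)
    return score

# Toy data (hypothetical): tokenized documents and background probabilities.
docs = {
    "d1": ["internet", "marketing", "and", "internet", "advertising"],
    "d2": ["database", "systems", "and", "indexing"],
}
background = {"internet": 0.2, "marketing": 0.1, "and": 0.3, "advertising": 0.1}
query = ["internet", "marketing", "and", "internet", "advertising"]

john = log_query_likelihood(query, docs, {"d1": 1.0}, background)
mary = log_query_likelihood(query, docs, {"d2": 1.0}, background)
print(john > mary)
```

Working in log space avoids underflow for long queries. The repeated term "internet" contributes twice, matching the squared factor in the slide's second example.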

Methodology: Model 1B
Estimate p(t | d, ca) using:
- the candidate identifier (where the candidate is mentioned in the document);
- a window size w around that mention.
e.g., p("Internet" | "Mail.No.43", "John"): "… John is a major in marketing. …"
Note: the closer a term is to the candidate's mention, the more weight it carries (weighted).
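
One way to read the window idea is sketched below. It assumes that p(t | d, ca) is a mixture over several window sizes, each contributing the relative frequency of terms found within w tokens of the candidate's mention. Under that assumption, terms close to the mention fall inside more windows and so receive more weight. The function names, the window weights, and the sample sentence are hypothetical.

```python
from collections import Counter

def window_term_probs(tokens, candidate_id, w):
    # Relative frequency of terms within w tokens of each candidate mention.
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != candidate_id:
            continue
        lo, hi = max(0, i - w), min(len(tokens), i + w + 1)
        for j in range(lo, hi):
            if j != i:  # skip the mention itself
                counts[tokens[j]] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}

def mixed_window_probs(tokens, candidate_id, window_weights):
    # Assumed combination: sum_w p(w) * p(t | d, ca, w).
    mixed = Counter()
    for w, p_w in window_weights.items():
        for term, p in window_term_probs(tokens, candidate_id, w).items():
            mixed[term] += p_w * p
    return dict(mixed)

tokens = ["john", "is", "a", "major", "in", "marketing",
          "at", "the", "internet", "company"]
probs = mixed_window_probs(tokens, "john", {2: 0.5, 8: 0.5})
print(probs["is"], probs["marketing"])
```

In this toy document, "is" sits inside both windows while "marketing" sits only inside the larger one, so "is" ends up with the higher probability. A real system would also need to resolve name variants and e-mail addresses to a single candidate identifier, which this sketch does not attempt.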

Methodology: Model 2
[Equations not preserved in the transcript; the surviving annotation is "(smoothed)".]
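
Going by the slides' description of Model 2 (find the query-relevant documents, then associate them with experts), a candidate's score can be sketched as a sum over documents of the document's query likelihood times the document-candidate association p(d | ca). The smoothing form (linear interpolation with a background probability), the function names, and the toy data are assumptions made for illustration only.

```python
import math
from collections import Counter

def query_likelihood(query, tokens, background, lam=0.5):
    # p(q | theta_d) = prod_t p(t | theta_d)^n(t, q), with an assumed
    # smoothed document model: (1 - lam) * p(t | d) + lam * p(t).
    log_p = 0.0
    for term, n in Counter(query).items():
        p = (1 - lam) * tokens.count(term) / len(tokens) + lam * background[term]
        log_p += n * math.log(p)
    return math.exp(log_p)

def model2_score(query, docs, p_doc_given_ca, background, lam=0.5):
    # p(q | ca) = sum_d p(q | theta_d) * p(d | ca)
    return sum(
        query_likelihood(query, docs[doc_id], background, lam) * weight
        for doc_id, weight in p_doc_given_ca.items()
    )

# Toy data (hypothetical).
docs = {
    "d1": ["internet", "marketing", "and", "advertising"],
    "d2": ["database", "systems", "and", "indexing"],
}
background = {"internet": 0.2, "marketing": 0.1}
query = ["internet", "marketing"]

john = model2_score(query, docs, {"d1": 1.0}, background)
mary = model2_score(query, docs, {"d1": 0.2, "d2": 0.8}, background)
print(john > mary)
```

Because each document is scored against the query on its own, this sketch needs only per-document term statistics, which fits the conclusion that Model 2 requires just a regular document index.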

Methodology: Model 2B
[Side-by-side equations for Model 2 and Model 2B; not preserved in the transcript.]

Methodology: Document-candidate associations
- Boolean model
- TF-IDF
Annotations on the slide: "(document importance)" and "(senior member of organization)".
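
The two association schemes named above can be sketched as follows. The slide does not preserve the exact formulas, so the TF-IDF form here is an assumption: the candidate's share of mentions in a document, multiplied by the log of the inverse fraction of documents that mention the candidate. Normalizing the raw scores into p(d | ca) is likewise an assumption. All names and the mention counts are hypothetical.

```python
import math

def boolean_assoc(mentions, candidate):
    # a(d, ca) = 1 if the candidate is mentioned in d at all.
    return {d: 1.0 for d, counts in mentions.items() if counts.get(candidate, 0) > 0}

def tfidf_assoc(mentions, candidate):
    # Assumed form: tf = share of mentions in d; idf = log(|D| / df(ca)).
    df = sum(1 for counts in mentions.values() if counts.get(candidate, 0) > 0)
    if df == 0:
        return {}
    idf = math.log(len(mentions) / df)
    scores = {}
    for d, counts in mentions.items():
        n = counts.get(candidate, 0)
        if n > 0:
            scores[d] = n / sum(counts.values()) * idf
    return scores

def normalize(scores):
    # Turn raw association scores into p(d | ca); empty if all scores are zero.
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()} if total > 0 else {}

# Hypothetical mention counts: document -> {candidate: number of mentions}.
mentions = {
    "d1": {"john": 3, "mary": 1},
    "d2": {"john": 1},
    "d3": {"mary": 2},
}
p_bool = normalize(boolean_assoc(mentions, "john"))
p_tfidf = normalize(tfidf_assoc(mentions, "john"))
print(p_bool, p_tfidf)
```

With these counts the Boolean scheme splits John's weight evenly between d1 and d2. The frequency-based scheme favors d2, where John is the only candidate mentioned.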

Experiments
Evaluation measures:
- MAP (mean average precision)
- MRR (mean reciprocal rank), e.g., for three queries whose first correct expert appears at ranks 3, 2, and 1: (1/3 + 1/2 + 1)/3 = 11/18
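
Both measures are standard and short to compute. The sketch below reproduces the slide's MRR arithmetic with exact fractions. The rankings and relevance sets are made-up toy data, and the function names are illustrative.

```python
from fractions import Fraction

def reciprocal_rank(ranking, relevant):
    # 1 / rank of the first relevant item, or 0 if none is retrieved.
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return Fraction(1, rank)
    return Fraction(0)

def average_precision(ranking, relevant):
    # Mean of precision@k over the ranks k at which a relevant item appears.
    hits = 0
    total = Fraction(0)
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += Fraction(hits, rank)
    return total / len(relevant) if relevant else Fraction(0)

def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

# Three toy queries whose first correct expert appears at ranks 3, 2, and 1.
runs = [
    (["a", "b", "c"], {"c"}),
    (["a", "b", "c"], {"b"}),
    (["a", "b", "c"], {"a"}),
]
mrr = mean(reciprocal_rank(ranking, rel) for ranking, rel in runs)
map_score = mean(average_precision(ranking, rel) for ranking, rel in runs)
print(mrr)  # 11/18
```

When every query has exactly one relevant candidate, as in this toy data, MAP and MRR coincide. They diverge once a query has several relevant experts.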

Experiments
- Model 1 vs. Model 2
- Window-based models

Experiments
- Association methods
- Parameter sensitivity

Conclusions
Model 1: build a profile of each candidate expert, then rank the candidates against the query.
Model 2: find the documents relevant to the query, then associate them with experts.
Model 2 is to be preferred over Model 1:
- Effectiveness: better in terms of average precision and reciprocal rank.
- Implementation: requires only a regular document index.
The window-based extensions improved:
- Effectiveness: especially on top of Model 1.
Frequency-based (TF-IDF) document-candidate associations are helpful.

Comments
Advantage
- Integrates ideas
Drawback
- …
Application
- …