Text Retrieval and Data Mining in SI - An Introduction

© 2010 University of Michigan
Presentation transcript:

Text Retrieval and Data Mining in SI - An Introduction
Qiaozhu Mei
School of Information
Computer Science and Engineering
University of Michigan
qmei@umich.edu

Challenge of Data Mining
Published content: 3-4G/day
User-generated data: 8-10G/day
Private text data: 3T/day
- Ramakrishnan and Tomkins 2007

What Do We Do in This Battle?
Contextual Text Mining, Social Data Mining, Information Retrieval, Social Network Mining, Health Informatics, Bioinformatics, Statistical Topic Modeling, Web Search
Social networks, online communities, academic networks, information networks, query logs, social bookmarks, scientific literature, news articles, blogs, tweets, Web pages, social media, EHR
Crowd, time, location, authorship, sentiments, impact, event
Topics, User, Content, Context

Personalization vs. Diversification
MSR
PageRank: Mountain Safety Research + MSR Tents + MSR Wheels + Microsoft Research … ?
Personalized Rank: Microsoft Research + Microsoft Research Redmond + Microsoft Research Asia … ?
Diverse Rank: Mountain Safety Research + Microsoft Research + Metropolis Street Racer … ?
- Joint work with Jian Guo, Qian Zhen
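To make the contrast between plain PageRank and a personalized ranking concrete, here is a minimal sketch of textbook personalized PageRank by power iteration. It is not the method from the joint work cited on the slide and it does not implement the diverse ranking. The toy link graph, the page names, the preference weights, and the damping factor of 0.85 are all assumptions chosen for illustration.

```python
from typing import Dict, List


def personalized_pagerank(
    links: Dict[str, List[str]],
    preference: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 100,
) -> Dict[str, float]:
    """Power iteration for PageRank with a teleport (preference) distribution.

    Every link target must also appear as a key in `links`.
    """
    nodes = sorted(links)
    total = sum(preference.get(n, 0.0) for n in nodes)
    # Normalize the preference weights into a probability distribution.
    teleport = {n: preference.get(n, 0.0) / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Teleport step: jump to a preferred page with probability 1 - damping.
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = links[src]
            if targets:
                # Follow step: split the damped rank evenly over out-links.
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Dangling page: redistribute its rank via the teleport vector.
                for n in nodes:
                    new_rank[n] += damping * rank[src] * teleport[n]
        rank = new_rank
    return rank


# Hypothetical toy graph for the ambiguous query "MSR" (names are made up).
links = {
    "hub": ["mountain_safety", "microsoft_research"],
    "mountain_safety": ["msr_tents", "hub"],
    "msr_tents": ["mountain_safety"],
    "microsoft_research": ["msr_asia", "hub"],
    "msr_asia": ["microsoft_research"],
}

# Uniform preference corresponds to ordinary PageRank.
uniform_rank = personalized_pagerank(links, {n: 1.0 for n in links})

# A user assumed to care only about the Microsoft Research pages.
personal_rank = personalized_pagerank(
    links, {"microsoft_research": 1.0, "msr_asia": 1.0}
)

for page in sorted(personal_rank, key=personal_rank.get, reverse=True):
    print(f"{page:20s} {personal_rank[page]:.3f}")
```

In this symmetric toy graph the uniform ranking cannot separate the two senses of "MSR", while the personalized preference shifts rank toward the Microsoft Research pages. A diversified ranking would instead try to cover several senses near the top of the list.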

Topic Evolution and Trends
Hot Topics in SIGMOD
Mention “context” more.
Time as context
What’s hot in literature/Twitter?

Modeling Spatiotemporal Topic Diffusion
One Week Later
How does discussion spread?
Topic = “government response in Hurricane Katrina”

Summarizing and Tracking Opinions
The Da Vinci Code
Blogs; customer reviews
Tom Hanks, who is my favorite movie star act the leading role.
protesting... will lose your faith by watching the movie.
... so sick of people making such a big deal about a fiction book
a good book to past time.
What is good and what is bad?

Topical Community Detection
Social/Academic Network
Information retrieval community
Machine learning community
Data mining
Who works together on what?
Text Content

Thanks!
Joint work with Cheng Zhai, Ken Church, Bruce Schatz, Ravi Kumar, Andrew Tomkins, Denny Zhou, Jian Guo, Qian Zhen, Xu Ling, Duo Zhang, Deng Cai, Dong Xin, Chao Liu ...