Multimodal Information Access and Synthesis A DHS Institute of Discrete Science UIUC Dan Roth Department of Computer Science University of Illinois.

Slides:



Advertisements
Similar presentations
Language Technologies Reality and Promise in AKT Yorick Wilks and Fabio Ciravegna Department of Computer Science, University of Sheffield.
Advertisements

 Copyright 2007 STI - INTERNATIONAL Semantic Technology Institute International PlanetData - Ensuring Impact.
Education, Outreach and Training. Specifications Document Overall objective: Better integration of ecoinformatics, in general, and SEEK tools, specifically,
Web Mining Research: A Survey Authors: Raymond Kosala & Hendrik Blockeel Presenter: Ryan Patterson April 23rd 2014 CS332 Data Mining pg 01.
2008 © ChengXiang Zhai Dragon Star Lecture at Beijing University, June 21-30, Introduction to IR Research ChengXiang Zhai Department of Computer.
WebMiningResearch ASurvey Web Mining Research: A Survey Raymond Kosala and Hendrik Blockeel ACM SIGKDD, July 2000 Presented by Shan Huang, 4/24/2007.
Mastering the Internet, XHTML, and JavaScript Chapter 7 Searching the Internet.
Web Mining Research: A Survey
Advanced Topics COMP163: Database Management Systems University of the Pacific December 9, 2008.
Web Mining Research: A Survey
WebMiningResearchASurvey Web Mining Research: A Survey Raymond Kosala and Hendrik Blockeel ACM SIGKDD, July 2000 Presented by Shan Huang, 4/24/2007 Revised.
The American Marketing Association (AMA). 2 The American Marketing Association (AMA) is one of the largest professional associations in the world with.
© 2013 Association for Computing Machinery Honeywell Introduction to the ACM Digital Library January 16, 2013 Honeywell Introduction to the ACM Digital.
Overview of Search Engines
GL12 Conf. Dec. 6-7, 2010NTL, Prague, Czech Republic Extending the “Facets” concept by applying NLP tools to catalog records of scientific literature *E.
ASIDIC Spring Conference ‘Smart Content’ Uncovering the Value and Benefits of Semantic Technology Richard C. Fusco Director, Content Strategy – McGraw-Hill.
LÊ QU Ố C HUY ID: QLU OUTLINE  What is data mining ?  Major issues in data mining 2.
OLAM and Data Mining: Concepts and Techniques. Introduction Data explosion problem: –Automated data collection tools and mature database technology lead.
CONTI’2008, 5-6 June 2008, TIMISOARA 1 Towards a digital content management system Gheorghe Sebestyen-Pal, Tünde Bálint, Bogdan Moscaliuc, Agnes Sebestyen-Pal.
Key integrating concepts Groups Formal Community Groups Ad-hoc special purpose/ interest groups Fine-grained access control and membership Linked All content.
CS598CXZ Course Summary ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign.
Challenges in Information Retrieval and Language Modeling Michael Shepherd Dalhousie University Halifax, NS Canada.
Page 1 Dan Roth Department of Computer Science University of Illinois at Urbana-Champaign.
Data Management Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition.
Computer Jobs 2013 Bob Nielson. Average Wage The average wages of all jobs in America >>>> $45,790 > $80,180.
Research paper: Web Mining Research: A survey SIGKDD Explorations, June Volume 2, Issue 1 Author: R. Kosala and H. Blockeel.
1 Data Mining Books: 1.Data Mining, 1996 Pieter Adriaans and Dolf Zantinge Addison-Wesley 2.Discovering Data Mining, 1997 From Concept to Implementation.
Computer Jobs 2014 Bob Nielson. Average Wage The average wages of all jobs in America >>>> $45,790 > $80,180.
IL Step 1: Sources of Information Information Literacy 1.
2008 © ChengXiang Zhai Dragon Star Lecture at Beijing University, June 21-30, Prepare Yourself for IR Research ChengXiang Zhai Department of Computer.
CS523 INFORMATION RETRIEVAL COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY.
CS598CXZ (CS510) Advanced Topics in Information Retrieval (Fall 2014) Instructor: ChengXiang (“Cheng”) Zhai 1 Teaching Assistants: Xueqing Liu, Yinan Zhang.
Chapter 1 Introduction to Data Mining
How to get the most out of the survey task + suggested survey topics for CS512 Presented by Nikita Spirin.
Introduction to Web Mining Spring What is data mining? Data mining is extraction of useful patterns from data sources, e.g., databases, texts, web,
Automatically Extracting Data Records from Web Pages Presenter: Dheerendranath Mundluru
Computing & Information Sciences Kansas State University Boulder, Colorado First International Conference on Weblogs And Social Media (ICWSM-2007) Structural.
Flexible Text Mining using Interactive Information Extraction David Milward
CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY.
What is Cyberinfrastructure? Russ Hobby, Internet2 Clemson University CI Days 20 May 2008.
KJC001 (sp2015.ppt – May 12, 2015) – Industry senior project presentation Industry-based Senior Project in the Department of Computer Science and Engineering.
Data Mining By Dave Maung.
Introduction to Science Informatics Lecture 1. What Is Science? a dependence on external verification; an expectation of reproducible results; a focus.
2007. Software Engineering Laboratory, School of Computer Science S E Web-Harvest Web-Harvest: Open Source Web Data Extraction tool 이재정 Software Engineering.
NGA Demo Participant Collaboration Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community
Cyberinfrastructure What is it? Russ Hobby Internet2 Joint Techs, 18 July 2007.
Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.
The Interplay Between Mathematics/Computation and Analytics Haesun Park Division of Computational Science and Engineering Georgia Institute of Technology.
Data and Applications Security Developments and Directions Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #15 Secure Multimedia Data.
ICT-enabled Agricultural Science for Development Scenarios, Opportunities, Issues by ICTs transforming agricultural science, research & technology generation.
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
Web Information Retrieval Prof. Alessandro Agostini 1 Context in Web Search Steve Lawrence Speaker: Antonella Delmestri IEEE Data Engineering Bulletin.
Search Engine using Web Mining COMS E Web Enhanced Information Mgmt Prof. Gail Kaiser Presented By: Rupal Shah (UNI: rrs2146)
Cyberinfrastructure Overview Russ Hobby, Internet2 ECSU CI Days 4 January 2008.
Cyberinfrastructure: Many Things to Many People Russ Hobby Program Manager Internet2.
Waqas Haider Bangyal. 2 Source Materials “ Data Mining: Concepts and Techniques” by Jiawei Han & Micheline Kamber, Second Edition, Morgan Kaufmann, 2006.
Text Information Management ChengXiang Zhai, Tao Tao, Xuehua Shen, Hui Fang, Azadeh Shakery, Jing Jiang.
Toward Entity Retrieval over Structured and Text Data Mayssam Sayyadian, Azadeh Shakery, AnHai Doan, ChengXiang Zhai Department of Computer Science University.
Enriched Doctoral Training in the Mathematical Sciences
Computer Science Department, University of Missouri, Columbia
Introduction to IR Research
中国计算机学会学科前沿讲习班:信息检索 Course Overview
CS598CXZ (CS510) Advanced Topics in Information Retrieval (Fall 2016)
Course Summary (Lecture for CS410 Intro Text Info Systems)
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Data Mining: Concepts and Techniques Course Outline
Data Warehousing and Data Mining
Course Summary ChengXiang “Cheng” Zhai Department of Computer Science
Web Mining Department of Computer Science and Engg.
Promising “Newer” Technologies to Cope with the
Presentation transcript:

Multimodal Information Access and Synthesis A DHS Institute of Discrete Science UIUC Dan Roth Department of Computer Science University of Illinois at Urbana/Champaign

M I A S Multi-Modal Information Access & Synthesis 2 Most of the data today is unstructured books, newspaper articles, journal publications, field reports, images, and audio and video streams. How to deal with the huge amount of unstructured data as if it was organized in a database with a known schema. how to locate, access, organize, analyze and synthesize unstructured data. MIAS Mission: develop the theories, algorithms, and tools for analysts to access a variety of data formats and models integrate them with existing resources transform raw data into useful and understandable information. MIAS Mission

M I A S Multi-Modal Information Access & Synthesis 3 Task Perspective In the next decade, intelligence analysts will need to monitor a huge number of interesting events and entities formulate and evaluate hypotheses with respect to them. Analysts must interact, at the appropriate level of semantic abstraction, with a system that can synthesize, summarize and interpret vast amounts of multimodal information, integrate observed data with domain models and background information in multiple formats, propose hypotheses, and help verify them.

M I A S Multi-Modal Information Access & Synthesis 4 Scenario Consider an intelligence analyst researching a problem Iranian nuclear program – generate a list of Iranian nuclear scientists, affiliations, specialties, biographies, photos, and notable recent activities. Medical treatment – what is known about it; who are the experts; what do users say about it; what side effects have been reported Current technologies have solved the problem of collecting and storing huge amounts of information; it would be reasonable to assume that the information she is after does exist; However, multiple barriers exist on the way to a successful completion of the analysis, each posing a significant research challenge.

M I A S Multi-Modal Information Access & Synthesis 5 Multimodal Information Access & Synthesis Focused Multimodal Data Retrieval News Articles Field reports Specific Web sites Text Repositories Relational databases Surveillance Videos... News Articles Field reports Specific Web sites Text Repositories Relational databases Surveillance Videos... Online Data Sources Web pages Text documents * * * * * * * * * U. of Tehran Elkhan Factory * * * * * * visited Saeed Zakeri Relational data Images Saeed Zakeri attended Infer Metadata: Semantic entities Discover Relations Between Semantic Entities Support Information Analysis, Knowledge Discovery, Monitoring Discover unusual events, entities, and associations. Continuous monitoring of events, entities & associations Rapid retrieval of all info. about a particular entity Efficient search, querying, question answering, browsing, mining, loc = Northern Iran name = Elkhan Factory topics = fertilizer, enrichment Semantic categories Temporal categories Subjectivity/Opinions Semantic Disambiguation & Integration across multiple sources and modalities

M I A S Multi-Modal Information Access & Synthesis 6 MIAS Processes Focused data retrieval and integration Identify and collect relevant data from multiple sources Semantic data enrichment Infer semantics from unstructured data and images; Allow navigation and search across disparate data modalities; augment KBs Entity identification and relations discovery Identify real-world entities and relations among them Relate them to existing institutional resources for information integration Knowledge discovery and hypotheses generation and verification Construct the rich semantic structure and hidden networks of entity linkages Foundations Machine learning, database and data mining, natural language processing, inference and optimization and computer vision techniques Required for and driven by the aforementioned problems.

M I A S Multi-Modal Information Access & Synthesis 7 Develop diverse human resources to enhance the scientific research, educational, and governmental workforce in MIAS Educational and Outreach Initiatives: Target students from small research programs & minority-serving Expose them to the national labs Open opportunities for bigger impact A comprehensive education program designed to increase participation in the study and practice of MIAS topics: Provide substantive training for a new generation of experts in the field, Serve as a tool for recruiting an experienced group of undergraduates into graduate study in one of the broad fields of information science Be an intellectual community center, where participants at all levels of expertise come together in an enriched environment of collaboration. Integrated Mission: Research & Education

M I A S Multi-Modal Information Access & Synthesis 8 Intensive Course in The Math of Data Sciences 1.Probability and Statistics 2.Linear Algebra 3.Data Structures and Algorithms 4.Optimization 5.Learning & Clustering Data Science Summer Institute at UIUC Research Projects Led by co-PIs and Grad Students Topics: 1.Virtual Web: Focused Crawling 2.Relations and Entities 3.Text and Images Advanced MIAS Related Tutorials 1st DSSI: May 2007 Huge Success Speaker Series 8 weeks course 25 students Faculty from UIUC, Kansas State, UTSA, UTEP

M I A S Multi-Modal Information Access & Synthesis 9 Machine Learning & Data Mining Information Retrieval & Web Information Access Computer Vision Natural Language Processing & Information Extraction Databases & Information Integration Data Science Summer Institute at UIUC Advanced MIAS Related Tutorials

M I A S Multi-Modal Information Access & Synthesis 10 Data Science Summer Institute at UIUC

M I A S Multi-Modal Information Access & Synthesis 11 Our Team Leading researchers in intelligent information access & analysis and its foundations A large number of affiliates/consultants covering all areas of interest to the MIAS center.

M I A S Multi-Modal Information Access & Synthesis 12 Kevin C. Chang: Data Integration and Retrieval Deep Web MetaQuerier: Large-scale integration over deep Web pioneered the “holistic integration” paradigm Widely published at SIGMOD, VLDB, ICDE System building and demo at CIDR, SIGMOD, VLDB, … PC/editor/organizers of SIGMOD, ICDE, WWW, SIGKDD special issue, WIRI06, IIWeb06 workshops Awards: NSF CAREER, IBM Faculty, NCSA Faculty VLDB00 Best Paper Selection

M I A S Multi-Modal Information Access & Synthesis 13 AnHai Doan: Data Integration ACM Doctoral Dissertation Award 2004 Sloan Fellowship 2006 Edited special issues on Data Integration Co-chaired workshops on data integration, Web technologies, machine learning Co-writing a book titled “Data Integration” Monitoring People and Events Data integration Matching Schemas, Ontologies, Entities Integrating databases and text

M I A S Multi-Modal Information Access & Synthesis 14 David Forsyth: Computer Vision and Learning Linking Text and Images Labeling images via (a lot of) caption text Leading Computer Vision researcher, Over 110 papers on vision, graphics, learning applications Program Chair, CVPR 2000, General chair CVPR 2006, regular member of PC in all major vision conferences IEEE Technical Achievement Award, 2006 Lead author of main textbook, widely adopted

M I A S Multi-Modal Information Access & Synthesis 15 Jiawei Han: Data Mining Mining patterns and knowledge discovery from massive data Research focus: Mining data streams, frequent patterns, sequential patterns, graph patterns, and their applications Privacy preserving Data Mining Developed many popular data mining algorithms, e.g., FPgrowth, PrefixSpan, gSpan, StarCubing, CrossMine, RankingCube, and CrossClus Over 300 research papers published in conferences and journals Editor-in-Chief, ACM Transactions on Knowledge Discovery from Data Textbook, “Data mining: Concepts and Techniques,” adopted worldwide

M I A S Multi-Modal Information Access & Synthesis 16 UIUC CS Department Director of Diversity Programs Lecturer for Discrete Math, Data Structures courses at UIUC One of the leading teachers and educational leaders at UIUC. Research in Data Mining and Data bases Speaker and regular presenter at conferences for young women, including: GAMES and WYSE summer camps Expanding Your Horizons careers conference Grace Hopper Celebration of Women in Computing panel on best practices for recruiting women into undergraduate programs in CS. SIGCSE workshop, “How to host your own Small Regional Celebration of Women in Computing.” Cinda Heeren: Data Mining, Education Summer School MIAS Summer School Director Mathematical Foundation of Data Science Discrete

M I A S Multi-Modal Information Access & Synthesis 17 ChengXiang Zhai: Information Retrieval and Text Mining Probabilistic Paradigms for Information Retrieval Personalized/Context Dependent Search Relation Identification and Data Integration Leading expert in information retrieval and search technologies Recipient of the 2004 Presidential Early Career Award for Scientists and Engineers (PECASE), Main architect and key contributor of the Lemur Toolkit (being used by many research groups and IR companies around the world) ACM SIGIR’04 best paper award Selected services include Program Chair of ACM CIKM 2004; HLT/NAACL 06; SIGIR’09

M I A S Multi-Modal Information Access & Synthesis 18 Leading Researcher in Machine Learning, NLP, AI Developed Popular Machine Learning system and machine learning based NLP tools used in industry and NLP classes. Program Chair, ACL’03, CoNLL’02; Regular senior PC member in all major Machine Learning, NLP and AI conferences Associate Editor: Journal of Artificial Intelligence Research; Machine Learning Multiple papers awards Dan Roth: Machine Learning, NLP Semantic Analysis and Data enrichment Entity and Relation Identification and Integration Textual Entailment Machine Learning Methods for NLP and IE

M I A S Multi-Modal Information Access & Synthesis 19 Information Access & Synthesis Processes Focused data retrieval and integration, Identify and collect relevant data from multiple sources Semantic data enrichment, Infer semantics from unstructured data and images; Allow navigation and search across disparate data modalities; augment KBs Entity identification and relations discovery, Identify real-world entities and relations among them Relate them to existing institutional resources for information integration Knowledge discovery and hypotheses generation and verification, Construct the rich semantic structure and hidden networks of entity linkages Foundations Machine learning, database and data mining, natural language processing, inference and optimization and computer vision techniques Called for and driven by the aforementioned problems. Tools Text Processing& Analysis Semantic Analysis & Information Extraction Information Integration Machine Learning & Data Mining Integrating Text & Images