Pedro DeRose University of Wisconsin-Madison The DBLife Prototype System in The Cimple Project on Community Information Management.

Community Information Management
Numerous Web communities
– database researchers, movie fans, legal professionals, bioinformatics, etc.
– enterprise intranets, tech support groups
Each community = many data sources + many members
Members often want to integrate data, query, and discover community information
– any interesting connection between researchers X and Y?
– find all citations of this paper in the past one week on the Web
– what is new in the past 24 hours in the database community?
– what are current hot topics? who has moved where?

Cimple (Wisconsin/Yahoo! Research)
Structured community portal, driven by extraction + integration + mass collaboration
[Figure: system overview]
Data sources: Researcher Homepages, Conference Pages, Group Pages, DBworld mailing list, DBLP, Web pages, Text documents
Example extracted relationship: Jim Gray, give-talk, SIGMOD-04
Services: Keyword search, SQL querying, Question answering, Browse, Mining, Alert/Monitor, News summary
User input: Personalize system, provide feedback

The Research Team
Core Members
– Pedro DeRose
– Warren Shen
– AnHai Doan
– Raghu Ramakrishnan
Supporting Members
– Fei Chen
– Yoonkyong Lee
– Doug Burdick
– Mayssam Sayyadian
– Xiaoyong Chai
– Ting Chen

Prototype System: DBLife
Integrate data of the DB research community
Live at dblife-labs.cs.wisc.edu
1,075 data sources
– 463 researcher homepages
– 103 department homepages
– 54 conference homepages
– 99 faculty hubs
– 56 database group pages
– 203 project homepages
– 85 colloquia
– 11 event pages
– DBWorld
– DBLP
Crawled daily, pages = 160+ MB / day

Information Extraction

Data Integration Raghu Ramakrishnan co-authors = A. Doan, Divesh Srivastava,...
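
The co-author list above mixes an abbreviated name ("A. Doan") with a full one ("Divesh Srivastava"), so integration has to decide when two mentions refer to the same person. A minimal sketch of one naive approach follows, assuming mentions are grouped by first initial plus last name. The functions `name_key` and `group_mentions` and the sample mention list are hypothetical illustrations, not DBLife's actual matching method.

```python
def name_key(name: str) -> tuple[str, str]:
    """Reduce a person name to (first initial, last name), lowercased.

    ASSUMPTION: this key is an illustrative heuristic only. It will wrongly
    merge distinct people who share an initial and a last name.
    """
    parts = name.replace(".", " ").split()
    return (parts[0][0].lower(), parts[-1].lower())


def group_mentions(mentions: list[str]) -> dict[tuple[str, str], list[str]]:
    """Group name mentions that share the same (initial, last name) key."""
    groups: dict[tuple[str, str], list[str]] = {}
    for mention in mentions:
        groups.setdefault(name_key(mention), []).append(mention)
    return groups


# Hypothetical mentions as they might appear across different pages.
mentions = [
    "AnHai Doan",
    "A. Doan",
    "Divesh Srivastava",
    "D. Srivastava",
    "Raghu Ramakrishnan",
]
groups = group_mentions(mentions)
for key, variants in groups.items():
    print(key, variants)
```

A key this coarse trades precision for recall, so a real system would need additional evidence (shared co-authors, affiliations, and so on) before merging mentions.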

Resulting ER Graph
[Figure: entity-relationship graph]
Nodes: "Proactive Re-optimization", Jennifer Widom, Shivnath Babu, SIGMOD 2005, David DeWitt, Pedro Bizarro
Edge labels: coauthor, advise, write, PC-Chair, PC-member
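
A graph like this can answer questions such as "any interesting connection between researchers X and Y?" with a plain path search. The sketch below stores the graph as (source, relation, target) triples and runs a breadth-first search. The node and relation labels are taken from the slide, but which nodes each relation connects is not recoverable from the transcript, so the edge list is an assumption made for illustration. `find_connection` is a hypothetical helper, not part of DBLife.

```python
from collections import deque

# (source, relation, target) triples.
# ASSUMPTION: labels come from the slide; the pairing of nodes per edge
# is a guess, since the transcript lost the figure layout.
EDGES = [
    ("Shivnath Babu", "write", "Proactive Re-optimization"),
    ("Pedro Bizarro", "write", "Proactive Re-optimization"),
    ("David DeWitt", "write", "Proactive Re-optimization"),
    ("Shivnath Babu", "coauthor", "Pedro Bizarro"),
    ("Jennifer Widom", "advise", "Shivnath Babu"),
    ("Jennifer Widom", "PC-Chair", "SIGMOD 2005"),
    ("David DeWitt", "PC-member", "SIGMOD 2005"),
]


def build_adjacency(edges):
    """Index edges in both directions so connections can be followed either way."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
        adjacency.setdefault(target, []).append((relation, source))
    return adjacency


def find_connection(edges, start, goal):
    """Return a shortest path as [node, relation, node, ...], or None if unconnected."""
    adjacency = build_adjacency(edges)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for relation, neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [relation, neighbor])
    return None


print(find_connection(EDGES, "Jennifer Widom", "Pedro Bizarro"))
```

Edges are traversed in both directions here because the goal is to discover that a connection exists. A query that cares about direction (who advises whom) would keep the triples directed.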

Provide Services

Mass Collaboration: An Example

Summary
Community Information Management
– increasingly crucial problem
The Cimple project
– sample challenges: information extraction, data integration, mass collaboration
– extends the footprints of DB technologies to Web data
– develops new DB technologies
DBLife prototype
– more at dblife.cs.wisc.edu, latest features (e.g., wiki) at dblife-labs.cs.wisc.edu
– research/education tool, community service, benchmark, challenge problem