Published by Isabella Fletcher. Modified over 8 years ago.
1
The DBLife Prototype System in The Cimple Project on Community Information Management
Pedro DeRose
University of Wisconsin-Madison
2
Community Information Management
Numerous Web communities
–database researchers, movie fans, legal professionals, bioinformatics, etc.
–enterprise intranets, tech support groups
Each community = many data sources + many members
Members often want to integrate data, query, and discover community information
–any interesting connection between researchers X and Y?
–find all citations of this paper in the past one week on the Web
–what is new in the past 24 hours in the database community?
–what are current hot topics? who has moved where?
3
Cimple Project @ Wisconsin/Yahoo! Research
[Figure: system overview]
Sources: researcher homepages, conference pages, group pages, DBworld mailing list, DBLP, Web pages, text documents
Example entities: Jim Gray, SIGMOD-04; example relationship: give-talk
Services: keyword search, SQL querying, question answering, browse, mining, alert/monitor, news summary
Users: personalize system, provide feedback
Structured community portal, driven by extraction + integration + mass collaboration
4
The Research Team
Core Members
–Pedro DeRose
–Warren Shen
–AnHai Doan
–Raghu Ramakrishnan
Supporting Members
–Fei Chen
–Yoonkyong Lee
–Doug Burdick
–Mayssam Sayyadian
–Xiaoyong Chai
–Ting Chen
5
Prototype System: DBLife
Integrate data of the DB research community
Live at dblife-labs.cs.wisc.edu
1,075 data sources
–463 researcher homepages
–103 department homepages
–54 conference homepages
–99 faculty hubs
–56 database group pages
–203 project homepages
–85 colloquia
–11 event pages
–DBWorld
–DBLP
Crawled daily, 11000+ pages = 160+ MB / day
6
Information Extraction
7
Data Integration
[Figure: example of integrated data]
Raghu Ramakrishnan co-authors = A. Doan, Divesh Srivastava,...
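The slide gives only an example of integrated data and does not say how mentions are matched. As an illustration of the kind of problem data integration has to solve here, the following is a minimal sketch, assuming a simple hand-written rule for deciding whether an abbreviated mention such as "A. Doan" could refer to a full name such as "AnHai Doan". The function names and the rule itself are hypothetical and are not taken from DBLife.

```python
def normalize(name: str) -> list[str]:
    """Lowercase a name, treat periods as spaces, and split it into tokens."""
    return name.lower().replace(".", " ").split()


def may_refer_to_same_person(mention: str, full_name: str) -> bool:
    """Hypothetical matching rule (an assumption, not DBLife's method).

    Two names are considered compatible when their last tokens are equal
    and their first tokens are either equal or one is a single-letter
    initial of the other.
    """
    m, f = normalize(mention), normalize(full_name)
    if len(m) < 2 or len(f) < 2:
        return False  # need at least a first and a last name on both sides
    if m[-1] != f[-1]:
        return False  # last names differ
    first_m, first_f = m[0], f[0]
    if len(first_m) == 1 or len(first_f) == 1:
        return first_m[0] == first_f[0]  # compare initials only
    return first_m == first_f


if __name__ == "__main__":
    print(may_refer_to_same_person("A. Doan", "AnHai Doan"))
    print(may_refer_to_same_person("A. Doan", "Divesh Srivastava"))
```

A rule this loose will also accept false matches (any first name starting with "A" and the last name "Doan"), which is one reason matching mentions across many sources is listed among the project's challenges.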
8
Resulting ER Graph
[Figure: entity-relationship graph]
Entities: “Proactive Re-optimization”, Jennifer Widom, Shivnath Babu, SIGMOD 2005, David DeWitt, Pedro Bizarro
Relationships: coauthor, advise, write, PC-Chair, PC-member
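An entity-relationship graph of this kind can be sketched as a list of labeled edges. The entity names and relationship labels below are the ones on the slide, but the transcript does not preserve which entities each edge connects, so the specific edges are assumptions made for illustration. The helper names `build_index` and `related` are likewise hypothetical.

```python
from collections import defaultdict

# Edges as (source, relationship, target).
# ASSUMPTION: entity names and labels come from the slide; the pairing of
# entities on each edge is illustrative and not recoverable from the transcript.
EDGES = [
    ("Shivnath Babu", "write", "Proactive Re-optimization"),
    ("Pedro Bizarro", "write", "Proactive Re-optimization"),
    ("David DeWitt", "write", "Proactive Re-optimization"),
    ("Shivnath Babu", "coauthor", "Pedro Bizarro"),
    ("Jennifer Widom", "advise", "Shivnath Babu"),
    ("Jennifer Widom", "PC-Chair", "SIGMOD 2005"),
    ("David DeWitt", "PC-member", "SIGMOD 2005"),
]


def build_index(edges):
    """Map each source entity to its outgoing (relationship, target) pairs."""
    index = defaultdict(list)
    for source, relationship, target in edges:
        index[source].append((relationship, target))
    return index


def related(index, entity, relationship):
    """Return the targets linked to `entity` by `relationship`, in insertion order."""
    return [t for r, t in index.get(entity, []) if r == relationship]


if __name__ == "__main__":
    index = build_index(EDGES)
    print(related(index, "Jennifer Widom", "advise"))
    print(related(index, "David DeWitt", "write"))
```

Storing edges as plain triples keeps the sketch independent of any graph library; a real system would also need entity identifiers rather than raw name strings.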
9
Provide Services
10
Mass Collaboration: An Example
11
Summary
Community Information Management
–increasingly crucial problem
The Cimple project
–sample challenges: information extraction, data integration, mass collaboration
–extends the footprints of DB technologies to Web data
–develops new DB technologies
DBLife prototype
–more at dblife.cs.wisc.edu, latest features (e.g., wiki) at dblife-labs.cs.wisc.edu
–research/education tool, community service, benchmark, challenge problem