Functions of a Web Warehouse


Functions of a Web Warehouse
Kai Cheng, Yahiko Kambayashi, Seok Tae Lee
Graduate School of Informatics, Kyoto University, Japan
and Mukesh Mohania
Western Michigan University, USA

13-16 November 2000, ICDL

Table of Contents
• Survival from “Information Explosion”
• Warehouse-Mediated Content Delivery
• Community-Oriented Web Warehouses
• Technical Issues
• Warehouse-Enhanced Web Caching
• Related Work
• Concluding Remarks

Survival from “Information Explosion”
• Web Traffic Doubles Every 3-6 Months
• Exponential Growth of the Web
  – 1 Billion Pages, January 2000
  – 2 Billion Pages, June 2000
  – 100-Fold Increase Expected in the Next 2 Years
Information Overload for Both Networks and Users
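As a rough check on the growth figures above, the doubling time implied by a growth factor can be worked out directly. This is a minimal sketch; the function name `doubling_time_months` is an illustrative assumption, and the inputs are only the page counts and the 2-year projection quoted on the slide.

```python
import math

def doubling_time_months(growth_factor: float, period_months: float) -> float:
    """Doubling time implied by growing `growth_factor`-fold over `period_months`."""
    return period_months / math.log2(growth_factor)

# 1 billion pages (January 2000) to 2 billion (June 2000): one doubling in about 5 months.
observed = doubling_time_months(2, 5)

# A 100-fold increase over the next 2 years would need a much shorter doubling time.
projected = doubling_time_months(100, 24)

print(f"observed: {observed:.1f} months, projected: {projected:.1f} months")
```

Under these assumptions the projected 100-fold increase corresponds to a doubling roughly every 3.6 months, which falls inside the 3-6 month range quoted for traffic.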

Scale up the Web and Internet
• More Bandwidth
  – Never Keeps Pace with the Traffic Growth
• More Server Capacity
  – How to Deal with “Hot-Spots”?
• Site Replication
  – Benefits Only the Replicated Servers?

Our Approach
• Tame the Chaotic Information Streams
  – Saving Redundant Data Transfers
• Unite the Individual Users
  – Sharing Findings and Efforts of Each Other

Warehouse-Mediated Content Delivery
• Direct Delivery
  – QoS: Server, Network → Overloaded
  – Personalized Services → Unrealistic
  – Information Hunting → Difficult
[Figure label: Internet]

Indirect Content Delivery
[Figure: a Web Warehouse between the WWW (Input) and the Output side. Labels: Resource Discovery, Buffering, Storage, Transformation, Analysis, Notification, Clustering, Searching, Navigation, Filtering]

Community-Oriented Web Warehousing
Sharing ↔ Contribution
The Community of Users: People with Special Information Needs/Interests

Examples of User Communities
• Sports Fans
• Patients
• Businessmen
• Researchers

Real/Cyber Communities
(a) Real Communities: Dependent on Location
(b) Cyber Communities: Independent of Location

Technical Issues
• Functions of a Web Warehouse
• Web Caching vs. Web Warehousing
• Data Warehousing vs. Web Warehousing
• Dynamic Hierarchical Web Warehouses

Functions of a Web Warehouse
• Buffering
• Transformation
  1. Transcoding
  2. Summarizing
• Content Analysis
• Notification
[Figure labels: Resource Discovery, Storage, Reusing; Format A → Transform → Format B; Content A → Transform → Content B; Data/Information → Analysis → Knowledge]
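The four functions listed above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the class `WebWarehouse`, its method names, the keyword-based notification, and the word-count analysis are all hypothetical stand-ins and are not taken from the presentation.

```python
from collections import Counter
from typing import Callable, Dict, List

class WebWarehouse:
    """Toy model of buffering, transformation, content analysis and notification."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}  # buffering: url -> document text
        self.subscribers: Dict[str, List[Callable[[str], None]]] = {}

    def subscribe(self, keyword: str, callback: Callable[[str], None]) -> None:
        """Register interest in documents that mention `keyword`."""
        self.subscribers.setdefault(keyword.lower(), []).append(callback)

    def buffer(self, url: str, text: str) -> None:
        """Buffering: store a discovered document for reuse, then notify."""
        self.store[url] = text
        self._notify(url)

    def summarize(self, url: str, max_words: int = 5) -> str:
        """Transformation (summarizing): here, just the first `max_words` words."""
        return " ".join(self.store[url].split()[:max_words])

    def analyze(self, top_n: int = 3) -> List[str]:
        """Content analysis: most frequent words over all stored documents."""
        counts = Counter(
            word.lower() for text in self.store.values() for word in text.split()
        )
        return [word for word, _ in counts.most_common(top_n)]

    def _notify(self, url: str) -> None:
        """Notification: call back every subscriber whose keyword appears."""
        text = self.store[url].lower()
        for keyword, callbacks in self.subscribers.items():
            if keyword in text:
                for callback in callbacks:
                    callback(url)

warehouse = WebWarehouse()
notified: List[str] = []
warehouse.subscribe("tennis", notified.append)
warehouse.buffer("http://example.org/a", "Tennis results and tennis news")
warehouse.buffer("http://example.org/b", "Skiing conditions today")
print(notified, warehouse.summarize("http://example.org/a", 2), warehouse.analyze(1))
```

A real warehouse would replace the truncation with genuine transcoding or summarization and the word count with proper indexing; the sketch only shows how the four functions fit around one shared store.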

Web Caching Research Program
[Figure labels: Content Analysis, Transformation, Warehousing]

From Web Caching to Web Warehousing

              Web Caching     Web Warehousing
Object        Data            Information
Objective     Reusing         Sharing
Storage       Bounded         Bound-Free
Population    Responses       Web View
Model         FS Dependent    Hypermedia

From Data Warehousing to Web Warehousing

   Items         Data WH                 Web WH
1  Objective     Decision Support        Information Sharing
2  Model         RDB/OORDB               Hypermedia
3  Population    View Materialization    Resource Discovery, Content Localization
4  Resource      Operational Data        Web Documents
5  Data Type     Structured              Semi-/Un-structured
6  Tie to Web    DWH [?] Web             WWH [?] Web

Warehouse as Shared Information Repository
• Real Communities
  – Centralized Management of Warehouses
  – Unicast Data Transfer
• Cyber Communities
  – Distributed Management of Warehouses
  – Multicast Data Transfer

Hierarchy of Web Warehouses
[Figure: warehouse nodes labeled HP Design, Sports, Skiing and Tennis, each with a member list such as “Mr. A, Ms. C, Mrs. D, …” or “Mr. A, Mr. D, …”]

Dynamic Formation of Web Warehouses (Split)
[Figure labels: Sports; Tennis; Skiing; users A and B]

Dynamic Formation of Web Warehouses (Union)
[Figure labels: Painting; Drawing; Painting & Drawing; users A and B]
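The split and union operations on the two slides above can be sketched as operations on a simple membership structure. Everything here is an assumption for illustration: the `Warehouse` dataclass, the representation of members as a mapping from user to interests, and the rule that a split assigns each user to every sub-topic they are interested in. The slides do not specify how the formation is actually decided.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class Warehouse:
    topic: str
    # user -> sub-topics the user is interested in (hypothetical representation)
    members: Dict[str, Set[str]] = field(default_factory=dict)

def split(parent: Warehouse, subtopics: List[str]) -> List[Warehouse]:
    """Split one warehouse into one warehouse per sub-topic."""
    children = []
    for sub in subtopics:
        interested = {
            user: {sub}
            for user, interests in parent.members.items()
            if sub in interests
        }
        children.append(Warehouse(sub, interested))
    return children

def union(a: Warehouse, b: Warehouse) -> Warehouse:
    """Merge two warehouses into one that covers both topics."""
    merged = Warehouse(f"{a.topic} & {b.topic}")
    for warehouse in (a, b):
        for user, interests in warehouse.members.items():
            merged.members.setdefault(user, set()).update(interests)
    return merged

sports = Warehouse("Sports", {"A": {"Tennis", "Skiing"}, "B": {"Tennis", "Skiing"}})
tennis, skiing = split(sports, ["Tennis", "Skiing"])

painting = Warehouse("Painting", {"A": {"Painting"}, "B": {"Painting"}})
drawing = Warehouse("Drawing", {"A": {"Drawing"}, "B": {"Drawing"}})
both = union(painting, drawing)
print(tennis.topic, sorted(tennis.members), both.topic, sorted(both.members))
```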

Current Status: Content-Sensitive Caching
[Figure labels: Web Caching, Warehousing, Content-Sensitive Caching]

Content-Sensitive Cache Replacement Policy
• Cache Replacement: Keep? Replace?
• Traditional Caching
  – Long-Time Observation → Replacement Decision
  – 60% One-Access Objects → How to Differentiate?
• Content-Sensitive Caching: LRU-SP+

LRU-SP+: Content-Sensitive, Size-Adjusted & Popularity-Aware LRU
• Daily Indexing: Cache Content → Indices
• Indices → Popular Topics
• How Similar? New Document ↔ Popular Topics
• Benefit/Size Model: “Observed” Popularity + “Inherent” Popularity
• Implement this Model
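The benefit/size idea above can be sketched as a cache that evicts the entry with the lowest value per byte, where the value adds an observed hit count to a topic-similarity score. This is not the LRU-SP+ algorithm itself: the slide does not give its formula, so the Jaccard similarity, the additive combination, the `weight` parameter, and all class and function names below are assumptions, and the recency (LRU) component is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Set

@dataclass
class Entry:
    size: int        # document size in bytes (assumed > 0)
    hits: int        # "observed" popularity: accesses seen so far
    inherent: float  # "inherent" popularity: similarity to popular topics, 0..1

def topic_similarity(words: Set[str], popular: Set[str]) -> float:
    """Jaccard overlap, used here as a stand-in similarity measure."""
    if not words and not popular:
        return 0.0
    return len(words & popular) / len(words | popular)

class ContentSensitiveCache:
    def __init__(self, capacity: int, popular_topics: Set[str], weight: float = 1.0):
        self.capacity = capacity
        self.popular = popular_topics  # would come from periodic indexing
        self.weight = weight           # hypothetical mixing weight
        self.entries: Dict[str, Entry] = {}
        self.used = 0

    def value(self, entry: Entry) -> float:
        """Benefit per byte: (observed + weighted inherent popularity) / size."""
        return (entry.hits + self.weight * entry.inherent) / entry.size

    def access(self, url: str, size: int, words: Set[str]) -> bool:
        """Return True on a hit; on a miss, admit the document, evicting if needed."""
        entry = self.entries.get(url)
        if entry is not None:
            entry.hits += 1
            return True
        if size > self.capacity:
            return False  # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda u: self.value(self.entries[u]))
            self.used -= self.entries.pop(victim).size
        self.entries[url] = Entry(size, 1, topic_similarity(words, self.popular))
        self.used += size
        return False
```

The point of the inherent term is that a document seen only once can still be kept if its content resembles topics that are already popular, which addresses the one-access objects that observation alone cannot rank.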

Related Work
• LSAM’s Proxy Cache (Push)
  – Multicast-Based Virtual Cache
  – Affinity Groups and Push Channels
• INTELSAT’s Wormhole Content Delivery
  – Warehouse-Kiosk Model
  – Satellite-Based Delivery Platform

Concluding Remarks
• Proposed Web Warehouse-Mediated Content Delivery to Cope with the Scaling Problems
• Discussed the Basic Functions of a Web Warehouse: Buffering, Transformation, Notification and Content Analysis
• Introduced Our Current Work: Warehouse-Enhanced Web Caching