Dynamic-Content Web Caching with Cooperative Proxy Scheme
Manolis Veliskakis (Βελισκάκης Μανώλης)
National Technical University of Athens
Dept. of Electrical & Computer Engineering
Knowledge and Database Systems Laboratory
DBLAB Meeting, Tuesday, 20 January 2004
Outline
Problem Definition
Dynamic-Data Web Caching vs Cooperative Schemes
Proposed Web Caching Algorithm
Current and Future Work
Discussion
Problem Definition – What?
Query Results
Dynamic Data for personalization purposes
Problem Definition – Where?
Client
Proxy
Edge-of-net
Internet Service Provider
Edge-of-Enterprise
Application Server
Web Server
DBMS
Problem Definition – How?
Current Approaches
Exact matching query
Materialized Views
DB Characteristics to Proxies
Problem Definition – Topology Scheme
Broadcast queries
Hierarchical Caching
URL Hashing
Directory based Cooperation
Problem Definition – Issues
Replacement Policy
Cache Consistency
Proxy Communication
Web objects placement
Dynamic-Data Web Caching vs Cooperative Schemes
Exact matching query
Materialized Views
DB Characteristics to Proxies
Broadcast queries
Hierarchical Caching
URL Hashing
Directory based Cooperation
Replacement Policy
Proxy Communication
Web objects placement
Dynamic-Data Web Caching vs Cooperative Schemes – Conclusions (?)
Exact Matching Query
–Common Web Caching Issues
–Not interesting
DB Characteristics to Proxies
–Common DB Replication Issues
–Interesting Issue: Create Cache Tables knowing that there is a cooperative proxy scheme
Dynamic-Data Web Caching vs Cooperative Schemes – Conclusions (?)
Materialized Views
–Many interesting issues
Query rewriting
Replacement Algorithm
Appropriate Cooperative Scheme
Web Objects exchange between Proxies
Consideration of DBMS structure
Dynamic or a priori definition of Materialized Views
Giving DB capabilities to Proxies (queries on Materialized Views)
Communication between Proxies
Proposed Web Caching Algorithm – Hybrid Topology (Hierarchical-Directory Based)
[Figure: Proxies 1a, 1b, 1c and 2a, 2b, 2c, each containing a Directory, a Q.M and a Cache, placed between the Clients and the Web Server with its Database Servers.]
Proposed Web Caching Algorithm – Web Objects description
There are 3 different ways to refer to a Web Object
–URL
–QTag
–QTag + Query Result (Whole Web Object)
Proposed Web Caching Algorithm – Web Objects description
QTAG
<QTag
  ID: Number,                  // Unique identifier for every QTag
  Query: String,               // The query that was asked to the back-end database
  LocationOfWebServer: URL,    // The URL of the Web Server that stands in front of the database
  DatabaseID: Number,          // The ID of the database where the query was asked
  TimeToLive: Number (sec),    // The period during which the query is valid and can satisfy requests
  Weight: Number,              // The significance (weight) of the query
  Relationships: List of QTag.ID  // Web Objects that are frequently used together with this one to satisfy query requests
/>
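The QTag record above could be held in memory roughly as follows. This is only a sketch: the Python field names mirror the slide, while `created_at` and `is_valid` are assumptions added here to show how TimeToLive might be checked; the slides do not say how validity is tracked.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QTag:
    """Hypothetical in-memory form of the QTag record from the slide."""
    qtag_id: int                  # ID: unique identifier
    query: str                    # query asked to the back-end database
    location_of_web_server: str   # URL of the Web Server in front of the database
    database_id: int              # ID of the database that was queried
    time_to_live: int             # validity period in seconds
    weight: float                 # significance of the query
    relationships: List[int] = field(default_factory=list)  # related QTag IDs
    # Assumption (not in the slides): remember when the entry was created.
    created_at: float = field(default_factory=time.time)

    def is_valid(self, now: Optional[float] = None) -> bool:
        """Return True while the TimeToLive window is still open."""
        current = time.time() if now is None else now
        return current - self.created_at < self.time_to_live
```

A cache could call `is_valid()` before serving a stored result and treat an expired QTag as a miss.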
Proposed Web Caching Algorithm – Web Objects description
QTAG + Query Results
<QTag ID= ,
  Query="Select name, surname, age from Customers where Name='John'",
  LocationOfWebServer="
  DatabaseID=1,
  TimeToLive=1000,
  Weight=0.65
  Relationships=" , , ,
/>
Query Result:
John, Manolopoulos, 28
John, Nikolaidis, 35
...
John, Fissas, 40
Proposed Web Caching Algorithm – Proxy Structure
[Figure: proxy structure. Components: Main Cache, Query Rewriter, Cache Directory, Cooperative-Scheme Directory, Weight Calculator, URL/QTag Transformer; the proxy connects to the rest of the cooperative scheme.]
Proposed Web Caching Algorithm – Proxy Structure – URL/QTag Transformer
Proxies manipulate Web Objects (Query Results) through their
Extracts from a Web Object's URL the
–Query (knowing the CGI that produces the Query Result)
–LocationOfWebServer
–DatabaseID
1-1 correspondence between URLs and QTags
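One way the URL/QTag Transformer's extraction step might look is sketched below. Everything concrete here is hypothetical: the host `shop.example`, the CGI path, the `db` and `name` parameters, and the `CGI_TEMPLATES` table stand in for "knowing the CGI that produces the Query Result"; the slides do not specify a URL layout.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical knowledge about one CGI script: its path maps to a query template.
CGI_TEMPLATES = {
    "/cgi-bin/customers":
        "Select name, surname, age from Customers where Name='{name}'",
}


def url_to_qtag_fields(url):
    """Derive Query, LocationOfWebServer and DatabaseID from a request URL."""
    parts = urlsplit(url)
    # parse_qs returns a list per parameter; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    template = CGI_TEMPLATES[parts.path]
    return {
        # The string is used only as a description of the request,
        # it is not executed against a database here.
        "Query": template.format(name=params["name"]),
        "LocationOfWebServer": f"{parts.scheme}://{parts.netloc}",
        "DatabaseID": int(params["db"]),
    }
```

Because the mapping is deterministic, the same URL always yields the same three fields, which is what a 1-1 correspondence between URLs and QTags requires.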
Proposed Web Caching Algorithm – Proxy Structure – Query Rewriter
The Query Rewriter rewrites a requested Web Object (query) when no exact match for it is cached but it can be satisfied from other, already cached Web Objects (queries).
The Query Rewriter will follow standard query-rewriting methods and techniques that are already used in database systems and environments.
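A very small illustration of answering a request from a cached result rather than from an exact match: assume the result of `Select name, surname, age from Customers where Name='John'` is cached and a request arrives that adds an age condition. The function name, the age condition, and the use of only the three rows visible in the example slide are assumptions made for this sketch; a real rewriter would rely on the standard query-rewriting techniques mentioned above.

```python
# Cached Web Object: rows shown in the slides for
#   Select name, surname, age from Customers where Name='John'
cached_rows = [
    ("John", "Manolopoulos", 28),
    ("John", "Nikolaidis", 35),
    ("John", "Fissas", 40),
]


def answer_from_cache(rows, min_age):
    """Answer "... where Name='John' and age > min_age" from the cached rows.

    The narrower query only adds a condition to the cached one, so its
    result is a subset of the cached rows and can be computed by filtering.
    """
    return [row for row in rows if row[2] > min_age]
```

For example, `answer_from_cache(cached_rows, 30)` keeps the rows for Nikolaidis and Fissas without contacting the Web Server.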
Proposed Web Caching Algorithm – Proxy Structure – Weight Calculator
Every Web Object is characterized by a Weight W, which is determined by the following factors:
S (the Web Object's size)
Πs (the percentage by which factor S influences the Weight)
CS (the Web Object's retrieval cost)
Πcs (the percentage by which factor CS influences the Weight)
P (the Web Object's popularity)
Πp (the percentage by which factor P influences the Weight)
R (the Web Object's significance with respect to its relationships)
Πr (the percentage by which factor R influences the Weight)
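The slides list the factors but give no formula for W. One plausible reading, shown here purely as an assumption, is a weighted sum in which each factor is first normalized to the range 0 to 1 and the four influence percentages add up to 1. The equal default percentages are placeholders, not values from the source.

```python
def weight(s, cs, p, r, pi_s=0.25, pi_cs=0.25, pi_p=0.25, pi_r=0.25):
    """Assumed weighted-sum form: W = Πs*S + Πcs*CS + Πp*P + Πr*R.

    s, cs, p, r are the size, retrieval-cost, popularity and relationship
    factors, each assumed to be already normalized to [0, 1].
    """
    if abs(pi_s + pi_cs + pi_p + pi_r - 1.0) > 1e-9:
        raise ValueError("influence percentages must sum to 1")
    return pi_s * s + pi_cs * cs + pi_p * p + pi_r * r
```

How S should enter the sum (for instance, whether larger objects should score lower) is left open by the slides, so any normalization of S is a further design choice.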
Proposed Web Caching Algorithm
[Flowchart, in reading order]
The request arrives at a Proxy.
The URL/QTag Transformer finds the QTag that best describes the incoming URL.
The QTag is sent to the Query Rewriter.
The Query Rewriter rewrites the query and produces Sub-QTags.
The Query Rewriter asks the Cache Directory whether any of these Sub-QTags is already cached in the Main Cache.
–Possible outcomes: all, some, or none of the Sub-QTags are cached.
The Query Rewriter retrieves the locally cached Web Objects.
The Query Rewriter asks the Cooperative-Scheme Directory whether the remaining Sub-QTags are cached in other Proxies.
–All of the remaining Sub-QTags are cached in other Proxies: the Proxy retrieves the cached Web Objects from the other Proxies and sends them to the Query Rewriter.
–Not all of the remaining Sub-QTags are cached in other Proxies: the Proxy sends the request to the Web Server and caches the response.
The Query Rewriter combines the Sub-QTags and the Proxy sends the response.
The Proxy caches the retrieved Web Objects locally.
The Weight Calculator refreshes the weight value and parameters of the Sub-QTags.
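The lookup order in the flowchart (local cache, then cooperating proxies, then the Web Server) can be sketched as below. This is a simplification under stated assumptions: plain dictionaries stand in for the Main Cache, the Cache Directory and the Cooperative-Scheme Directory, `fetch_from_web_server` is a hypothetical callable, and the weight refresh step is left out.

```python
def resolve(sub_qtags, main_cache, peer_caches, fetch_from_web_server):
    """Return (web objects keyed by Sub-QTag, label of where they came from)."""
    # Step 1: what is already in the local Main Cache?
    local = {q: main_cache[q] for q in sub_qtags if q in main_cache}
    missing = [q for q in sub_qtags if q not in local]
    if not missing:
        return local, "local"

    # Step 2: look for the remaining Sub-QTags in the other proxies.
    remote = {}
    for q in missing:
        for cache in peer_caches:
            if q in cache:
                remote[q] = cache[q]
                break

    if len(remote) == len(missing):
        main_cache.update(remote)  # cache the retrieved objects locally
        return {**local, **remote}, "cooperative"

    # Step 3: not everything is available, so ask the Web Server
    # for the whole request and cache its response.
    fetched = fetch_from_web_server(sub_qtags)
    main_cache.update(fetched)
    return fetched, "web server"
```

In this sketch a partial hit across the proxies is treated as a miss and the full request goes to the Web Server; fetching only the missing Sub-QTags would be an alternative design.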
Current and Future Work
Study and testing of the proposed new approaches
Definition of the workload
Better definition and testing of the proposed algorithm
Discussion
Efficiency of Testing Tools (Simulator)
Ideas for efficient Web Caching for Dynamic Data
Comments
Thank You