Mobile Data Access
Replication, Caching, Prefetching and Hoarding for Mobile Computing

Definitions
- Replication: maintaining multiple (consistent) copies of a data item.
  - Static replication: the number and location of copies are fixed in advance (at compile time or design time).
  - Dynamic replication: the number and location of copies are determined at run time.
- Caching: maintaining a temporary copy of the data in fast (local) memory. The copy is fetched when the data is first accessed.
- Prefetching: obtaining a temporary copy of the data before it is accessed, to hide access latency.
- Hoarding: preloading a copy of a data object so that the mobile client can keep working while disconnected from the network (i.e., prefetching to tolerate disconnections).

Data Access Model
- On-demand
- Broadcast channel

Motivation
- Caching (prefetching/hoarding) at mobile clients is crucial to improving the performance of information access and database querying.
- Issues:
  - Read-only data: currency guarantees
  - Read/write data: consistency in the presence of disconnected operation
  - Server load and scalability in the presence of numerous clients

Mobile Database Querying: Requirements
- Minimize query delay
- Maximize the number of queries answered per unit time (system throughput)
- Handle client disconnection
- Conserve wireless bandwidth and battery power
- Minimize server load
- Handle mobility

Advantages of Caching in a Mobile Environment
- Helps reduce the latency caused by narrow-bandwidth wireless links
- Enables limited functionality on mobile hosts even in disconnected mode
- Helps conserve battery power by reducing the number of uplink queries
- Conserves bandwidth

Problems in Maintaining a Consistent Cache
- Classic solutions do not work:
  - Mobile clients may be disconnected for long periods, so invalidations may be lost.
  - Upon reconnection, mobile clients have to revalidate their caches, which wastes energy and bandwidth.
- New solutions are needed.

Challenges to an Efficient Caching Scheme
- An efficient caching scheme should take into account:
  - Data access pattern
  - Data update rate
  - Communication/access cost
  - Mobility pattern of the clients
  - Connectivity characteristics
    - Disconnection frequency
    - Available bandwidth
  - Data currency requirements
  - Location dependence of information

General Issues in Designing Caching Schemes
- Where to cache? How many levels of caching to use?
- What to cache (when to cache a data item, and for how long)?
- How to invalidate cached items? Who is responsible for invalidations? At what granularity are invalidations done?
- What data currency guarantees can the system provide to the user? What are the costs involved? How should the user be charged?
- What is the effect of the caching scheme on query delay (response time) and system throughput (query completion rate)?

Classification of Cache Invalidation Schemes
- Who is in charge of invalidations?
  - Server or client (push or pull): callbacks or validation checks
- Does the server maintain per-client state information?
  - Stateless or stateful server
- How does the server send invalidation reports?
  - Synchronously or asynchronously
- What kind of information is sent in the invalidation report?
  - State-based or history-based
- How is the information organized in invalidation reports?
  - Uncompressed or compressed

Cache Maintenance Schemes
- Broadcasting invalidation reports [Barbara Sigmod 94]
- Disconnected operation in CODA (Satyanarayanan et al.)
  - Hoarding (prefetching)
- AS (Asynchronous Stateful) caching scheme (Kahol et al., ICDCS 00)

Broadcasting Invalidation Reports
- Uses stateless servers and synchronous broadcasts [Barbara Sigmod 94].
- Clients maintain local caches and use the information in invalidation reports to update them.
- The server broadcasts an invalidation report every L time units; each report contains the ids of all data items that changed during the past w = kL time units.
- A query is answered only after the client has received the next invalidation report.
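
The report-driven cache maintenance described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from [Barbara Sigmod 94]: the class names, the `fetch` callback standing in for an uplink request, and the use of plain numbers for time are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvalidationReport:
    timestamp: float        # broadcast time T
    changed_ids: frozenset  # ids of items updated in the window (T - w, T]


class Server:
    """Stateless with respect to clients: it only logs its own updates."""

    def __init__(self, L, k):
        self.L = L          # broadcast period
        self.w = k * L      # window covered by each report
        self.last_update = {}  # item id -> time of most recent update

    def update(self, item_id, now):
        self.last_update[item_id] = now

    def build_report(self, now):
        changed = frozenset(
            i for i, t in self.last_update.items() if now - self.w < t <= now
        )
        return InvalidationReport(now, changed)


class Client:
    def __init__(self, w, now=0.0):
        self.w = w
        self.cache = {}
        self.last_report = now  # time of the last report heard
        self.pending = []       # queries waiting for the next report

    def query(self, item_id):
        # Queries are not answered immediately; they wait for the next report.
        self.pending.append(item_id)

    def on_report(self, report, fetch):
        if report.timestamp - self.last_report > self.w:
            # The gap exceeds the window, so some updates may be unreported.
            self.cache.clear()
        else:
            for item_id in report.changed_ids:
                self.cache.pop(item_id, None)
        self.last_report = report.timestamp
        answers = {}
        for item_id in self.pending:
            if item_id not in self.cache:
                # `fetch` is an assumed stand-in for an uplink request.
                self.cache[item_id] = fetch(item_id)
            answers[item_id] = self.cache[item_id]
        self.pending.clear()
        return answers
```

Because the server keeps no per-client state, the client alone decides whether its cache is still trustworthy, using only the gap since the last report it heard.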

Broadcasting IR: Variations
- If a client is disconnected from the network and misses k consecutive invalidation reports, it has to discard its entire cache.
- Two variations:
  1. Timestamp Strategy (TS): invalidation reports contain the ids of data items modified over a large window (k > 1).
  2. Amnesic Terminal (AT): invalidation reports contain the ids of only those data items that changed since the last broadcast (k = 1).
- TS is better when clients are "sleepers" (often disconnected); AT is better when clients are "workaholics" (mostly connected).
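
The trade-off between the two variations comes down to the window multiplier k: a larger k means longer reports but more tolerance for sleep. The sketch below follows the slide's simplified description (reports carry only ids); the function names and the sample update times are assumptions made for illustration.

```python
def report_ids(update_times, now, L, k):
    """Ids a report broadcast at `now` would carry for window w = k * L."""
    w = k * L
    return {i for i, t in update_times.items() if now - w < t <= now}


def cache_survives(last_heard, now, L, k):
    """True if a client that last heard a report at `last_heard` may keep
    its cache after hearing the report broadcast at `now`."""
    return now - last_heard <= k * L


# Hypothetical update log: item id -> time of its latest update.
updates = {"a": 3, "b": 14, "c": 27}
L = 10

at_report = report_ids(updates, now=30, L=L, k=1)  # AT: last period only
ts_report = report_ids(updates, now=30, L=L, k=3)  # TS: last three periods

# A client that slept through the report at time 20:
at_keeps_cache = cache_survives(last_heard=10, now=30, L=L, k=1)
ts_keeps_cache = cache_survives(last_heard=10, now=30, L=L, k=3)
```

In this example the AT report is shorter, but the client that missed one report must drop its cache under AT and may keep it under TS.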

Disconnected Operation in CODA
- Goal: COnstant Data Availability
- Mechanisms: server replication and disconnected operation.
- Caching scheme (asynchronous, stateful):
  - Uses callbacks while a client is reachable from a server.
  - During disconnections, permits access to possibly stale data.
  - Upon reconnection, the client performs a validity check on each cached volume.
- Uses hoarding to improve data availability.
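
A callback-based scheme of this kind can be sketched roughly as below. This is not Coda's actual code or API: the class and method names are hypothetical, and validation by comparing per-volume version counters is an assumption chosen to keep the example small.

```python
class CallbackServer:
    def __init__(self):
        self.versions = {}   # volume -> version counter
        self.callbacks = {}  # volume -> clients promised a callback (per-client state)

    def fetch(self, client, volume):
        # Register a callback promise and return the current version.
        self.callbacks.setdefault(volume, set()).add(client)
        return self.versions.setdefault(volume, 0)

    def update(self, volume):
        self.versions[volume] = self.versions.get(volume, 0) + 1
        # Break callbacks; unreachable clients simply miss the notification.
        for client in self.callbacks.pop(volume, set()):
            if client.connected:
                client.break_callback(volume)


class CachingClient:
    def __init__(self):
        self.connected = True
        self.cached = {}  # volume -> version held in the local cache

    def fetch(self, server, volume):
        self.cached[volume] = server.fetch(self, volume)

    def break_callback(self, volume):
        self.cached.pop(volume, None)

    def reconnect(self, server):
        """Validate every cached volume; return the ones found stale."""
        self.connected = True
        stale = [v for v, ver in self.cached.items()
                 if server.versions.get(v) != ver]
        for v in stale:
            del self.cached[v]
        for v in self.cached:
            # Re-establish callbacks for volumes that are still valid.
            server.callbacks.setdefault(v, set()).add(self)
        return stale
```

While disconnected, the client keeps reading whatever is in `cached`, which may be stale; the cost of that freedom is the per-volume check in `reconnect`.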

Drawbacks
- Drawbacks of Barbara's scheme:
  - Poor delay characteristics, because a query must wait for the next invalidation report before it is answered.
  - Poor network utilization, because queries are answered in bursts.
  - Does not support arbitrary disconnection patterns.
- Drawbacks of the CODA caching scheme:
  - The server has to keep the cache state of each client, which affects scalability.
  - A client has to perform a volume-by-volume validation check after each reconnection.

AS Caching Scheme (Kahol et al.)
- Maintains a Home Location Cache (HLC) at the home MSS of a mobile client.
- An HLC contains the state of the cache at an MH.
- Uses asynchronous transfer of invalidation reports.
- Supports arbitrary disconnection durations by keeping, in the HLC, the timestamp of the last invalidation report destined for the MH.

An Example for the AS Scheme
- Each cache is associated with a cache timestamp, which is the timestamp of the last invalidation report received.
- When a mobile client reconnects, it sends a probe message to its home MSS to determine whether it missed any invalidation reports while it was disconnected.
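
The probe exchange can be sketched as follows. This is a simplified illustration, not the protocol as specified by Kahol et al.: the names `HomeLocationCache`, `MobileHost`, and `on_update`, the integer timestamps, and the choice to drop an HLC entry once the MH has been told about it are all assumptions.

```python
class HomeLocationCache:
    """State kept at the home MSS for one mobile host (MH)."""

    def __init__(self):
        # item id -> timestamp of its latest invalidation (0 = still valid)
        self.entries = {}

    def record_fetch(self, item_id):
        self.entries[item_id] = 0

    def invalidate(self, item_id, ts):
        """Note an invalidation; return True if the MH caches this item."""
        if item_id in self.entries:
            self.entries[item_id] = ts
            return True
        return False

    def answer_probe(self, cache_ts):
        """Reply to a probe with every invalidation newer than cache_ts."""
        missed = {i for i, ts in self.entries.items() if ts > cache_ts}
        report_ts = max([cache_ts] + [self.entries[i] for i in missed])
        for i in missed:
            del self.entries[i]  # the MH will drop these items
        return missed, report_ts


class MobileHost:
    def __init__(self):
        self.cache = {}
        self.cache_ts = 0  # timestamp of the last invalidation report applied

    def apply_report(self, ids, ts):
        for i in ids:
            self.cache.pop(i, None)
        self.cache_ts = max(self.cache_ts, ts)


def on_update(hlc, mh, item_id, ts, connected):
    """Home MSS handling of an update: forward at once if the MH is reachable."""
    if hlc.invalidate(item_id, ts) and connected:
        mh.apply_report({item_id}, ts)
        del hlc.entries[item_id]
```

On reconnection the MH would call `hlc.answer_probe(mh.cache_ts)` and apply the result, so only the invalidations it actually missed are resent, however long it was away.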

Hoarding
- Planned and accidental disconnections are not considered failures.
- Hoarding is a technique to reduce the cost of cache misses during disconnection: load the necessary data before disconnecting.
- Hoarding techniques:
  - User-provided information (client-initiated disconnection):
    - explicitly specify which data (files, tables) to hoard
    - implicitly, based on the specified application
  - Access-structure-based (uses past history):
    - e.g., tree-based in file systems, access paths (joins) in databases
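
One simple way to combine the two sources of hoarding information is to take the user's explicit list first and fill the remaining space from the access history. The sketch below assumes a fixed item-count capacity and ranks history by raw access frequency; both are illustrative choices, and `build_hoard_set` is a hypothetical name.

```python
from collections import Counter


def build_hoard_set(explicit, access_log, capacity):
    """Pick items to load before disconnecting.

    explicit   -- items the user asked for, in priority order
    access_log -- past accesses (one entry per access)
    capacity   -- maximum number of items the client can hold
    """
    hoard = list(dict.fromkeys(explicit))[:capacity]  # dedupe, keep order
    chosen = set(hoard)
    # Fill what is left with the most frequently accessed items.
    for item, _count in Counter(access_log).most_common():
        if len(hoard) >= capacity:
            break
        if item not in chosen:
            hoard.append(item)
            chosen.add(item)
    return hoard


selection = build_hoard_set(
    explicit=["a.txt", "b.txt"],
    access_log=["c", "c", "a.txt", "d", "c", "d", "e"],
    capacity=4,
)
```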

Hoarding versus Prefetching
- Both fetch data in advance, in anticipation of future use.
- Prefetching
  - The objective is to improve performance (throughput or response time).
  - A cache miss is not catastrophic.
- Hoarding
  - The objective is to fetch all needed data into the MU cache prior to disconnection, so the goal is to facilitate disconnected operation.
  - A cache miss is catastrophic.
  - It is OK to overfetch.
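
The difference in what a miss costs can be made concrete with a toy client cache. The class below is an illustrative sketch only; `ClientCache`, its `connected` flag, and the use of `ConnectionError` to model a miss during disconnection are assumptions.

```python
class ClientCache:
    def __init__(self, fetch):
        self.fetch = fetch     # assumed callable that reads from the server
        self.store = {}
        self.connected = True
        self.misses = 0        # misses that had to wait for the server

    def get(self, key):
        if key in self.store:
            return self.store[key]
        if not self.connected:
            # Hoarding case: nothing can be fetched, so the miss is fatal.
            raise ConnectionError(f"{key!r} was not hoarded")
        # Prefetching case: a miss only costs one round trip.
        self.misses += 1
        self.store[key] = self.fetch(key)
        return self.store[key]

    def preload(self, keys):
        """Used for both prefetching and hoarding; only the intent differs."""
        for key in keys:
            if key not in self.store:
                self.store[key] = self.fetch(key)
```

With this model, overfetching before a disconnection wastes some space and bandwidth, whereas underfetching leaves `get` with no way to recover.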

Hoarding in Database Systems
- Granularity of hoarding
  - RDBMS: ranges from tables and sets of tables to whole relations
  - OO DBMS: objects, sets of objects, or a class
- Hoard by issuing queries or materialized views
  - The user may explicitly issue hoarding queries
    - e.g., Create View with Update-On clause [Lauzac 98]
    - OO query to describe hoarding profiles [Gruber 94]
  - History of past references, both queries and data objects
  - Hoard keys, an extended database organization [Badrinath 98]
    - hoard keys are used to partition a relation into disjoint logical horizontal fragments
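
Partitioning by a hoard key can be pictured as grouping rows by the value of one attribute, so that each row lands in exactly one fragment. The sketch below illustrates only that idea and is not the organization proposed in [Badrinath 98]; the relation, the `region` attribute, and both function names are made up for the example.

```python
from collections import defaultdict


def fragment_by_hoard_key(rows, hoard_key):
    """Split a relation into disjoint horizontal fragments, one per key value."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[hoard_key]].append(row)
    return dict(fragments)


def hoard(fragments, wanted_keys):
    """Collect the fragments a client wants to take offline."""
    return [row for key in wanted_keys for row in fragments.get(key, [])]


# Hypothetical relation with `region` as the hoard key.
customers = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
    {"id": 3, "region": "north"},
]
fragments = fragment_by_hoard_key(customers, "region")
offline_rows = hoard(fragments, ["north"])
```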

References
- D. Barbara and T. Imielinski, "Sleepers and Workaholics: Caching Strategies in Mobile Environments," VLDB Journal, 4.
- A. Kahol, S. Khurana, S. K. S. Gupta, and P. K. Srimani, "A Strategy to Manage Cache Consistency in a Disconnected Distributed Environment," IEEE Transactions on Parallel and Distributed Systems, 12(7), July 2001.