CSE 535 – Mobile Computing Lecture 8: Data Dissemination Sandeep K. S. Gupta School of Computing and Informatics Arizona State University.


Data Dissemination

Communications Asymmetry
- Network asymmetry: in many cases, downlink bandwidth far exceeds uplink bandwidth
- Client-to-server ratio: large client population, few servers
- Data volume: small requests for info, large responses (again, downlink bandwidth is more important)
- Update-oriented communication: updates likely affect a number of clients

Disseminating Data to Wireless Hosts
- Broadcast-oriented dissemination makes sense for many applications
- Can be one-way or with feedback
- Examples: sports, stock prices, new software releases (e.g., Netscape), chess matches, music, election coverage, weather…

Dissemination: Pull
- Pull-oriented dissemination can run into trouble when demand is extremely high
- Web servers crash
- Bandwidth is exhausted

Dissemination: Push
- Server pushes data to clients; no need to ask for data
- Ideal for broadcast-based media (wireless)

Broadcast Disks
- Server transmits a schedule of data blocks

Broadcast Disks: Scheduling
- Round Robin Schedule
- Priority Schedule

Priority Scheduling (2)
- Random: randomize the broadcast schedule, broadcasting "hotter" items more frequently
- Periodic: create a schedule that broadcasts hotter items more frequently, but the schedule is fixed
  - The "Broadcast Disks: Data Management…" paper uses this approach
- Simplifying assumptions:
  - Data is read-only
  - Schedule is computed and doesn't change, which means access patterns are assumed to stay the same
  - A fixed schedule allows mobile hosts to sleep

"Broadcast Disks: Data Management…"
- Order pages from "hottest" to coldest
- Partition into ranges ("disks"): pages in a range have similar access probabilities
- Choose a broadcast frequency for each "disk"
- Split each disk into "chunks":
  maxchunks = LCM(relative frequencies)
  numchunks(J) = maxchunks / relativefreq(J)
- The broadcast program is then:
  for I = 0 to maxchunks - 1
    for J = 1 to numdisks
      Broadcast( C(J, I mod numchunks(J)) )
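The schedule-generation loop above can be sketched in Python. This is a minimal sketch: the function name is ours, and it assumes each disk's page count divides evenly by its number of chunks (as in the paper's examples).

```python
from functools import reduce
from math import lcm  # Python 3.9+

def broadcast_program(disks, rel_freqs):
    """Generate one major cycle of a broadcast-disk program.

    disks     -- disks[j] is the list of pages on disk j, hottest disk first
    rel_freqs -- relative broadcast frequency of each disk (hottest highest)
    """
    max_chunks = reduce(lcm, rel_freqs)
    # Split each disk J into max_chunks / rel_freq(J) equal chunks.
    chunks = []
    for pages, freq in zip(disks, rel_freqs):
        n = max_chunks // freq
        size = len(pages) // n  # assumes len(pages) is divisible by n
        chunks.append([pages[k * size:(k + 1) * size] for k in range(n)])
    # Minor cycle I broadcasts chunk (I mod numchunks(J)) of every disk J.
    program = []
    for i in range(max_chunks):
        for disk_chunks in chunks:
            program.extend(disk_chunks[i % len(disk_chunks)])
    return program
```

With relative frequencies 4, 2, 1 and disks of 1, 2, and 8 pages, this reproduces the interleaving pattern of the paper's sample schedule: the single hottest page appears in every minor cycle, the middle disk alternates, and the cold disk cycles through slowest.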

Sample Schedule, From Paper (relative frequencies 4, 2, 1)

Broadcast Disks: Research Questions (from Vaidya)
- How to determine the demand for various information items?
- Given demand information, how to schedule the broadcast?
- What happens if there are transmission errors?
- How should clients cache information? (A user might want a data item immediately after its transmission…)

Hot For You Ain't Hot for Me
- The hottest data items are not necessarily the ones most frequently accessed by a particular client:
  - Access patterns may have changed
  - Higher priority may be given to other clients
  - This might be the only client that considers this data important
- Thus: need to consider not only probability of access (standard caching), but also broadcast frequency
- A bug in the soup: hot items are more likely to be cached! (Reduce their frequency?)

Broadcast Disks Paper: Caching
- Under traditional caching schemes, you usually want to cache the "hottest" data
- What to cache with broadcast disks?
  - Hottest? Probably not: that data will come around soon!
  - Coldest? Not necessarily…
- Cache data whose access probability is significantly higher than its broadcast frequency

Caching, Cont.
- PIX algorithm (Acharya): eject the page from the local cache with the smallest value of
  (probability of access) / (broadcast frequency)
- Means that pages that are more frequently accessed may still be ejected if they are expected to be broadcast frequently
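The PIX eviction rule above fits in a few lines. The page names, access probabilities, and broadcast frequencies below are made up for illustration:

```python
def pix_victim(cache, p_access, bcast_freq):
    """Eject the cached page with the smallest PIX value, where
    PIX = probability of access / broadcast frequency."""
    return min(cache, key=lambda page: p_access[page] / bcast_freq[page])

# A frequently accessed page that is also broadcast very often can still be
# the eviction victim: waiting for its next broadcast is cheap.
victim = pix_victim(
    cache=["hot", "warm"],
    p_access={"hot": 0.40, "warm": 0.10},  # hypothetical access probabilities
    bcast_freq={"hot": 8, "warm": 1},      # broadcasts per major cycle
)
# PIX(hot) = 0.40/8 = 0.05 < PIX(warm) = 0.10/1, so "hot" is ejected
```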

Broadcast Disks: Issues
- User profiles: provide information about the data needs of particular clients
- "Back channel" for clients to inform the server of needs:
  - Either advise the server of data needs…
  - …or provide "relevance feedback"
- Dynamic broadcast: changing data values introduces interesting consistency issues
  - If processes read values at different times, are the values the same?
  - Simply guarantee that data items within a particular broadcast period are identical?

Hybrid Push/Pull
- "Balancing Push and Pull for Data Broadcast" (Acharya et al., SIGMOD '97)
- "Pull Bandwidth" (PullBW): portion of bandwidth dedicated to pull-oriented requests from clients
  - PullBW = 0%: "pure" push; clients needing a page simply wait
  - PullBW = 100%: schedule is totally request-based

Interleaved Push and Pull (IPP)
- Mixes push and pull
- Allows a client to send requests to the server for missed (or absent) data items
- The broadcast disk transmits the program plus requested data items (interleaved)
- A fixed threshold ThresPerc limits use of the back channel by a particular client:
  - A client sends a pull request for page p only if the number of slots before p will be broadcast is greater than ThresPerc
  - ThresPerc is a percentage of the cycle length
  - Also controls server load: as ThresPerc approaches 100%, the server is protected
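The back-channel rule can be sketched as a single predicate; the slot counts and cycle length below are illustrative, not values from the paper:

```python
def should_pull(slots_until_broadcast, cycle_length, thres_perc):
    """IPP back-channel rule: issue a pull request for a missed page only
    when the wait for its next scheduled broadcast exceeds ThresPerc,
    expressed as a percentage of the broadcast cycle length."""
    return slots_until_broadcast > (thres_perc / 100.0) * cycle_length

# With ThresPerc = 25 and a 1000-slot cycle, only pages more than 250 slots
# away get pulled. At ThresPerc = 100, no page scheduled within the current
# cycle is ever pulled, which protects the server from back-channel load.
```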

CSIM-based Simulation
- Measured Client (MC): the client whose performance is being measured
- Virtual Client (VC): models the "rest" of the clients as a single entity, chewing up bandwidth and making requests
- Assumptions:
  - Front channel and back channel are independent
  - Broadcast program is static; no dynamic profiles
  - Data is read-only

Simulation (1) No feedback to clients!

Simulation (2)
- Can control the ratio of VC to MC requests
- Noise controls the similarity of the access patterns of VC and MC; Noise == 0 means the same access pattern
- The PIX algorithm is used to manage the client cache
- The VC's access pattern is used to generate the broadcast (since the VC represents a large population of clients)
- Goal of the simulation is to measure tradeoffs between push and pull under broadcast

Simulation (3)
- CacheSize pages are maintained in a local cache
- SteadyStatePerc models the number of clients in the VC population that have "filled" caches, i.e., their most important pages are cached
- ThinkTimeRatio models the intensity of VC request generation relative to the MC; a high ThinkTimeRatio means more activity on the part of the virtual clients

Simulation (4)

Experiment 1: Push vs. Pull
- Important! PullBW is set at 50% in 3a; if the server's pull queue fills, requests are dropped!
- At PullBW = 10%, the reduction in bandwidth hurts push and is insufficient for pull requests: server death!
- Light loads: pull is better

Experiment 2: Cache Warmup Time for MC Low server load: pull better. High server load: push better.

Experiment 3: Noise: Are you (VC) like me (MC)? On the left, pure push vs. pure pull. On the right, pure push vs. IPP !!!

Experiment 4: Limiting Greed If there’s plenty of bandwidth, limiting greed isn’t a good idea On the other hand…

Experiment 5: Incomplete Broadcasts
- Not all pages are broadcast; non-broadcast pages must be explicitly pulled
- Lesson: must provide adequate bandwidth or response time will suffer! The server is overwhelmed and requests are being dropped!
- In 7b, making clients wait longer before requesting helps…

Incomplete Broadcast: More Lesson: Careful! At high server loads with lots of pages not broadcast, IPP can be worse than push or pull!

Experimental Conclusions
- Light server load: pull is better
- Push provides a safety cushion in case a pull request is dropped, but only if all pages are broadcast
- Limits on pull provide a safety cushion that prevents the server from being crushed
- Broadcasting all pages can be wasteful, but you must provide adequate bandwidth to pull the omitted pages; otherwise, at high load, IPP can be worse than pull!
- Overall: push and pull each beat IPP in certain circumstances, but IPP tends to have reasonable performance over a wide variety of system loads
- Punchline: IPP is a good compromise in a wide range of circumstances

Mobile Caching: General Issues
Mobile user/application issues:
- Data access pattern (reads? writes?)
- Data update rate
- Communication/access cost
- Mobility pattern of the client
- Connectivity characteristics: disconnection frequency, available bandwidth
- Data freshness requirements of the user
- Context dependence of the information

Mobile Caching (2)
Research questions:
- How can client-side latency be reduced?
- How can consistency be maintained among all caches and the server(s)?
- How can we ensure high data availability in the presence of frequent disconnections?
- How can we achieve high energy/bandwidth efficiency?
- How do we determine the cost of a cache miss, and how do we incorporate this cost into the cache management scheme?
- How do we manage location-dependent data in the cache?
- How do we enable cooperation between multiple peer caches?

Mobile Caching (3)
Cache organization issues:
- Where do we cache? (client? proxy? service?)
- How many levels of caching do we use (in the case of hierarchical caching architectures)?
- What do we cache (i.e., when do we cache a data item, and for how long)?
- How do we invalidate cached items? Who is responsible for invalidations? At what granularity is invalidation done?
- What data currency guarantees can the system provide to users?
- What are the (real $$$) costs involved? How do we charge users?
- What is the effect on query delay (response time) and system throughput (query completion rate)?

Weak vs. Strong Consistency
- Strong consistency: the value read is the most current value in the system
  - Invalidation on each write can expire outdated values, but disconnections may cause loss of invalidation messages
  - Can also poll on every access, but polling is impossible while disconnected!
- Weak consistency: the value read may be "somewhat" out of date
  - A TTL (time to live) is associated with each value
  - Can combine TTL with polling, e.g., background polling to update the TTL or retrieve a new copy of a data item that is out of date
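A minimal TTL-based weak-consistency cache might look like the sketch below. The class name, API, and single per-cache TTL are assumptions for illustration; the `now` parameter is injectable so the expiry logic can be exercised without real clock delays.

```python
import time

class TTLCache:
    """Weak-consistency cache sketch: each entry carries a time-to-live,
    so a read may return a value up to `ttl` seconds stale."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.store = {}  # key -> (value, expiry time)

    def put(self, key, value, now=None):
        now = time.time() if now is None else now
        self.store[key] = (value, now + self.ttl)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.store.get(key)
        if entry is None:
            return None            # miss: caller fetches from the server
        value, expiry = entry
        if now >= expiry:
            del self.store[key]    # expired: treat as a miss and refetch
            return None
        return value
```

A background poller, as the slide suggests, would simply call `put` again with a fresh copy (or a renewed expiry) before entries run out.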

Disconnected Operation
- Disconnected operation is very desirable for mobile units
- Idea: attempt to cache/hoard data so that when disconnections occur, work (or play) can continue
- Major issues:
  - What data items (files) do we hoard?
  - When and how often do we perform hoarding?
  - How do we deal with cache misses?
  - How do we reconcile the cached version of a data item with the version at the server?

One Slide Case Study: Coda
- Coda: a file system developed at CMU that supports disconnected operation
  - Cache/hoard files and resolve needed updates upon reconnection
  - Replicate servers to improve availability
- What data items (files) do we hoard? The user selects and prioritizes; hoard walking ensures that the cache contains the "most important" stuff
- When and how often do we perform hoarding? Often, while connected

Coda (2) (OK, two slides)
- How do we deal with cache misses? If disconnected, we cannot
- How do we reconcile the cached version of a data item with the version at the server?
  - When connection is possible, check before updating
  - When disconnected, use local copies
  - Upon reconnection, resolve updates; if there are hard conflicts, the user must intervene (i.e., it's manual and requires a human brain)
- Coda reduces the cost of checking items for consistency by grouping them into volumes: if a file within one of these groups is modified, the volume is marked modified, and the individual files within can then be checked
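The volume trick can be sketched as a two-level dirty check. This is an illustration of the idea only, not Coda's actual data structures; all names here are hypothetical.

```python
class Volume:
    """Two-level consistency check in the spirit of Coda's volumes:
    a per-volume version number lets a client skip checking every file
    when nothing in the volume has changed."""

    def __init__(self, files):
        self.version = 0
        self.files = dict(files)  # file name -> per-file version

    def modify(self, name):
        self.files[name] += 1
        self.version += 1         # any write marks the whole volume modified

def files_to_check(client_volume_version, server_volume):
    """Return the files a client must compare individually; empty when the
    volume version still matches (one cheap check instead of many)."""
    if client_volume_version == server_volume.version:
        return []
    return list(server_volume.files)
```

One comparison of volume versions replaces a per-file round trip in the common case where nothing changed.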

WebExpress
Housel, B. C., Samaras, G., and Lindquist, D. B., "WebExpress: A Client/Intercept Based System for Optimizing Web Browsing in a Wireless Environment," Mobile Networks and Applications 3:419–431.
A system that intercepts web browsing, providing sophisticated caching and bandwidth-saving optimizations for web activity in mobile environments.
Major issues:
- Disconnected operation
- Verbosity of the HTTP protocol: perform protocol reduction
- TCP connection setup time: try to re-use TCP connections
- Low bandwidth in wireless networks: caching
- Many responses from web servers are very similar to those seen previously: use differencing rather than returning complete responses, particularly for CGI-based interactions
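The differencing idea can be sketched with Python's standard difflib. This is not WebExpress's actual wire encoding, just an illustration of shipping only the changed bytes of a CGI response against a base copy both sides have cached:

```python
import difflib

def make_delta(base, new):
    """Intercept-side sketch: encode `new` as edit operations against the
    cached `base` response, so only the delta crosses the wireless link."""
    sm = difflib.SequenceMatcher(a=base, b=new)
    delta = []
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op == "equal":
            delta.append(("copy", i1, i2))      # reuse bytes from base
        else:
            delta.append(("data", new[j1:j2]))  # ship only the changed bytes
    return delta

def apply_delta(base, delta):
    """Client-side sketch: rebuild the full response from base + delta."""
    out = []
    for item in delta:
        if item[0] == "copy":
            _, i1, i2 = item
            out.append(base[i1:i2])
        else:
            out.append(item[1])
    return "".join(out)
```

For a CGI response where only a stock price changed, the delta carries a few characters instead of the whole page, which is exactly the saving the slide describes.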

WebExpress (2)
- Two intercepts: one on the client side and one in the wired network
- Caching on both the client and in the wired network, plus differencing
- One TCP connection
- Reduce redundant HTTP header info; reinsert the removed HTTP header info on the server side