Data Staging on Untrusted Surrogates Jason Flinn Shafeeq Sinnamohideen Niraj Tolia Mahadev Satyanarayanan Intel Research Pittsburgh, University of Michigan,


Data Staging on Untrusted Surrogates
Jason Flinn, Shafeeq Sinnamohideen, Niraj Tolia, Mahadev Satyanarayanan
Intel Research Pittsburgh, University of Michigan, Carnegie Mellon University

Mobile Data Access: Expectation vs. Reality
Mobile computers are increasingly connected
- Expectation of ubiquitous data access
- Distributed file systems can help
Does reality match expectations?
- Size, weight, and energy constraints
- Less storage, processing power, etc.
How to match reality and expectations?
- Use untrusted, unmanaged infrastructure!

Problem: Limited Storage
Latency is often the real performance-killer
- File systems: many sequential RPCs
- Network latency not improving (much)!
What if one can't cache all files of interest?
- Borrow storage from a nearby surrogate
- Use it as an "L2 file cache"
[Figure: client, surrogate, file server]

Problem: Limited Battery Energy
File system consumes a lot of energy:
- Network communication
- Storage (disk spin-ups, reads, writes)
Surrogate helps preserve client battery
- Use surrogate cache to avoid disk spin-ups
- Prefetch updates to surrogate, not client

Problem: Limited Bandwidth
How to fetch large updates in a short window?
- Example: passing through an airport gate
- 11 Mbps (or more) local wireless bandwidth
- Wide-area Internet bandwidth often less
InfoStation (Wu, Badrinath, et al.)
- Cache updates before the mobile user arrives
- Blast data as the user passes through the cell
Surrogate: a mechanism for caching file data.

Location, Location, Location
Requirement: surrogate located near the client!
- Must be opportunistic (use what's there)
Vision: surrogates ubiquitously deployed
- Computers getting ever cheaper
- Already b wireless networks in cafes
- Can't trust or assume good behavior!

Outline
- Motivation
- Architecture and design
- Implementation
- Evaluation
- Related work and conclusions

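Data Staging Architecture
[Figure: the wimpy client (file client plus client proxy), the surrogate (staging server), the desktop (data pump), and the file server. Coda file system traffic between client and file server crosses a high-latency link. Labeled flows: staged reads (client and surrogate); modifications and unstaged reads (client and file server); files (file server and data pump); encrypted files (data pump and surrogate); file keys and hashes, via secure channel (data pump and client).]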

Trust (or Lack Thereof)
Trusted: client, file server, desktop, file system
Untrusted: surrogate, network
How to deal with an untrusted surrogate?
- End-to-end encryption (privacy)
- Cryptographic hashes (authenticity)
- Read-only data (can't "lose" updates)
- Monitor performance (mitigate DoS)
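The authenticity check can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes SHA-256 as the hash (the slides say only "cryptographic hashes"), and the function name `is_authentic` is hypothetical. The trusted side computes the hash before the data reaches the surrogate, and the client recomputes it over whatever the surrogate returns.

```python
import hashlib
import hmac


def is_authentic(staged_bytes: bytes, trusted_hash: bytes) -> bool:
    """Return True if bytes fetched from the surrogate match the trusted hash.

    Assumption: SHA-256 stands in for the unspecified cryptographic hash.
    """
    actual = hashlib.sha256(staged_bytes).digest()
    # compare_digest compares in constant time.
    return hmac.compare_digest(actual, trusted_hash)


# The trusted side hashes the data before it is handed to the surrogate.
original = b"encrypted file contents"
trusted = hashlib.sha256(original).digest()

# A surrogate that alters even one byte is detected by the client.
tampered = b"encrypted file c0ntents"
print(is_authentic(original, trusted))
print(is_authentic(tampered, trusted))
```

When the check fails, the client can discard the staged copy and read the file from the file server instead, so a misbehaving surrogate costs time but not correctness.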

Ease of Management
Can't require a system administrator!
- Build on commodity software
  - Apache with Perl scripts (643 LoC)
- No long-term state
  - OK to trip over power cord!
- Allow file system diversity
  - Minimalist API
  - Currently support Coda and NFS

Surrogate API
Call          Purpose
Register()    Get lease, quota for surrogate
Renew()       Renew a lease
Deregister()  Explicitly stop using surrogate
Stage()       Put data on the surrogate
Unstage()     Remove data from surrogate
Get()         Retrieve data from surrogate
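To make the six calls concrete, here is a small in-memory model. It is a sketch under assumptions, not the Apache/Perl implementation: the class name `Surrogate`, the `holder` identifier, the per-holder quota accounting, and the rule that every call except register and deregister requires an unexpired lease are all illustrative choices that the slides do not specify.

```python
import time
from typing import Callable, Dict, Tuple


class Surrogate:
    """In-memory model of the six-call surrogate API (illustrative only)."""

    def __init__(self, quota_bytes: int, lease_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.quota_bytes = quota_bytes
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.expiry: Dict[str, float] = {}              # holder -> lease expiry
        self.store: Dict[Tuple[str, str], bytes] = {}   # (holder, file id) -> data

    def register(self, holder: str) -> Tuple[float, int]:
        """Grant a lease and report the quota."""
        self.expiry[holder] = self.clock() + self.lease_seconds
        return self.lease_seconds, self.quota_bytes

    def renew(self, holder: str) -> float:
        """Extend an unexpired lease."""
        self._check_lease(holder)
        self.expiry[holder] = self.clock() + self.lease_seconds
        return self.lease_seconds

    def deregister(self, holder: str) -> None:
        """Drop the lease and all data staged under it."""
        self.expiry.pop(holder, None)
        for key in [k for k in self.store if k[0] == holder]:
            del self.store[key]

    def stage(self, holder: str, file_id: str, data: bytes) -> None:
        """Store data, refusing writes that would exceed the quota."""
        self._check_lease(holder)
        used = sum(len(v) for (h, f), v in self.store.items()
                   if h == holder and f != file_id)
        if used + len(data) > self.quota_bytes:
            raise ValueError("quota exceeded")
        self.store[(holder, file_id)] = data

    def unstage(self, holder: str, file_id: str) -> None:
        self._check_lease(holder)
        self.store.pop((holder, file_id), None)

    def get(self, holder: str, file_id: str) -> bytes:
        self._check_lease(holder)
        return self.store[(holder, file_id)]

    def _check_lease(self, holder: str) -> None:
        if self.expiry.get(holder, float("-inf")) <= self.clock():
            raise PermissionError("no active lease")
```

Because all state lives in two dictionaries, restarting the model loses nothing that the client cannot re-stage, which matches the "no long-term state" goal.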

Which Files to Stage?
Must predict the files most likely to be accessed
Prediction is orthogonal to data staging
- Client proxy has hooks for prediction code
- Hoarding: user manually specifies files, dirs
- Clustering: per-activity LRU caching
[Figure: prediction approaches from less transparent to more transparent: manual copy, Coda hoarding, user-driven clustering, SEER]

Client Proxy Data Structures
Client proxy is the final arbiter of validity
For each staged file, maintains:
- Valid bit
- Data length
- Encryption key and secure hash

File id  Valid?  Length  Key      Hash
0x3fdc   Yes     32,558  0xeabc…  0xea67…
0x3fe6   No      23,458  0xabc3…  0x7345…
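The per-file record can be modeled as below. This is a sketch with hypothetical names (`StagedEntry`, `ClientProxyTable`); the slides give only the fields. Clearing the valid bit when the client learns a file has changed is an assumption about when entries become invalid, and the key and hash values are placeholders because the slide truncates the real ones.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StagedEntry:
    valid: bool     # may the staged copy be used?
    length: int     # data length in bytes
    key: bytes      # per-file encryption key
    digest: bytes   # secure hash of the staged data


class ClientProxyTable:
    """Client-side record of which staged files may be read from the surrogate."""

    def __init__(self) -> None:
        self.entries: Dict[int, StagedEntry] = {}

    def record_staged(self, file_id: int, length: int,
                      key: bytes, digest: bytes) -> None:
        self.entries[file_id] = StagedEntry(True, length, key, digest)

    def invalidate(self, file_id: int) -> None:
        # Assumption: called when the file is known to have been modified.
        entry = self.entries.get(file_id)
        if entry is not None:
            entry.valid = False

    def lookup(self, file_id: int) -> Optional[StagedEntry]:
        """Return the entry only if the staged copy is still valid."""
        entry = self.entries.get(file_id)
        return entry if entry is not None and entry.valid else None


table = ClientProxyTable()
# File ids and lengths mirror the example rows; key and digest are placeholders.
table.record_staged(0x3FDC, 32558, key=b"placeholder-key", digest=b"placeholder-hash")
table.record_staged(0x3FE6, 23458, key=b"placeholder-key", digest=b"placeholder-hash")
table.invalidate(0x3FE6)
```

A `lookup` that returns `None` covers both a file that was never staged and one whose valid bit is cleared; in either case the client reads from the file server.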

Staging Data
Client proxy sends list of files to data pump
For each file, data pump:
- Reads file and attributes from file system
- Encrypts file, generates hash over data
- Sends encrypted data to surrogate
- Sends key, hash, length to client
Staging is asynchronous with client file accesses
- If file staged, client gets it from surrogate
- Otherwise, gets it from file server
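The data pump's per-file loop above might look like this. It is a sketch under stated assumptions: `pump_files` and `FileInfo` are hypothetical names; the cipher, file reader, and surrogate transport are passed in as callables rather than tied to a real library; a fresh 16-byte key per file and a SHA-256 hash computed over the ciphertext are illustrative choices, since the slide does not say which data the hash covers.

```python
import hashlib
import secrets
from typing import Callable, Dict, Iterable, NamedTuple


class FileInfo(NamedTuple):
    key: bytes      # per-file encryption key
    digest: bytes   # hash of the encrypted data (assumption: over ciphertext)
    length: int     # length of the original file


def pump_files(
    file_ids: Iterable[str],
    read_file: Callable[[str], bytes],
    encrypt: Callable[[bytes, bytes], bytes],
    send_to_surrogate: Callable[[str, bytes], None],
) -> Dict[str, FileInfo]:
    """Stage each file and return the metadata destined for the client.

    The caller is expected to deliver the returned metadata to the client
    over the secure channel; only ciphertext goes to the surrogate.
    """
    infos: Dict[str, FileInfo] = {}
    for file_id in file_ids:
        plaintext = read_file(file_id)              # read from the file system
        key = secrets.token_bytes(16)               # fresh random key per file
        ciphertext = encrypt(key, plaintext)        # caller-supplied cipher
        digest = hashlib.sha256(ciphertext).digest()
        send_to_surrogate(file_id, ciphertext)      # untrusted side sees no key
        infos[file_id] = FileInfo(key, digest, len(plaintext))
    return infos
```

Keeping the cipher and transport as parameters leaves the loop independent of any particular file system or crypto library; in practice `encrypt` would wrap a vetted cipher implementation.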

Experimental Setup
[Figure: Coda file server; Ethernet; Client: IPAQ MB Coda cache; b Wireless Access Point; 30 ms delay; Surrogate]
- Cold cache: no data on client or surrogate
- Warm cache: data initially on client and surrogate

Benchmark: Image Trace
Record accesses to a digital photo library in Coda
- Take the first 10,148 accesses
- 150 MB unique data, 401 MB total data read
- Replay trace as fast as possible (DFSTrace)
Variables:
- Wastage ratio: extra data prefetched
- Miss ratio: amount of data never prefetched
- Assume wastage ratio 33%, miss ratio 0%
- Then do sensitivity analysis

Baseline Image Results
Staging reduces execution time by 45-48%!

Sensitivity Analysis
A higher miss ratio has a relatively greater effect

Longer-Duration File Traces
Used Mummert's Coda file system traces
- Traces of client activity (open, mkdir, etc.)
- Duration: hours
- Working set size: MB
Methodology:
- Keep inter-request delays when prefetching
- Eliminate delays afterwards

File Trace Results
Up to 48% reduction in cumulative file access delay

Request Latency Breakdown

Related Work
Web Caching (Akamai, Squid)
- Different data access patterns, consistency
Fluid Replication (Kim02)
- Assume more trust and management
OceanStore (Kubiatowicz02)
- Staging minimalist, file-system agnostic
Builds on work in file prefetching, InfoStations

Conclusion
It is possible to significantly improve distributed file system performance with untrusted, unmanaged infrastructure!
Future work:
- Grow set of supported file systems
- Surrogate discovery and migration
- Support for energy-awareness