Nectar: Efficient Management of Computation and Data in Data Centers
Pradeep Kumar Gunda, Lenin Ravindranath, Chandramohan A. Thekkath, Yuan Yu, and Li Zhuang
Presented by: Hien Nguyen

 Introduction  Related work  System design overview  Implementation details  Experimental evaluation  Discussion and conclusions

 Major challenges in realizing the full potential of data-intensive distributed computing within data centers:  a large fraction of computations in a data center are redundant  many datasets are obsolete or seldom used, wasting vast amounts of storage in a data center.

 Wasted storage in a 240-node experimental Dryad/DryadLINQ cluster: around 50% of the files were not accessed in the last 250 days.

 Execution statistics of 25 production clusters running data-parallel apps:  on one such cluster, over 7000 hours/day of redundant computation can be eliminated by caching intermediate results.  equivalent to shutting off 300 machines daily.  cumulatively, over all clusters, this figure is over 35,000 hours per day.

 Nectar: a system that manages the execution environment of a data center and is designed to address these problems:  Automatically caches computation results.  Automatically manages data: storage, retrieval, garbage collection.  Maintains the dependencies between data and programs.

 Computations running in a Nectar-managed data center are LINQ programs: LINQ comprises a set of operators to manipulate datasets of .NET objects.  All of these operators are functional: they transform input datasets into new output datasets.

 LINQ example
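The original slide shows the example as an image. A minimal, hypothetical LINQ query in the same spirit (counting word occurrences over a collection of documents; not the slide's exact code) might look like:

  using System.Collections.Generic;
  using System.Linq;

  // Hypothetical example: count how often each word occurs across a set of documents.
  IEnumerable<string> docs = new[] { "the quick brown fox", "the lazy dog" };
  var wordCounts = docs
      .SelectMany(doc => doc.Split(' '))      // break each document into words
      .GroupBy(word => word)                  // group identical words together
      .Select(g => new { Word = g.Key, Count = g.Count() });

Each operator here is functional: it consumes its input dataset and produces a new one, which is what lets Nectar treat sub-expressions as cacheable units.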

 Data stored in a Nectar-managed data center:  Primary: created once and accessed many times, referenced by conventional pathnames.  Derived: results produced by computations running on primary and other derived datasets; all access is mediated by Nectar, and they are referenced by pathnames containing a simple indirection (like a UNIX symbolic link) to the actual LINQ programs that produce them.

 A Nectar-managed data center offers:  Efficient space utilization: storage, retrieval, and eviction of the results of all computations.  Reuse of shared sub-computations: automatically caches the results of sub-computations.  Incremental computations: caching enables reusing results computed on old data; only the newly arriving data needs to be processed incrementally.  Ease of content management: derived datasets are uniquely named by LINQ expressions and automatically managed by Nectar.

 Inspired by the Vesta system's distinction between primary and derived data, but the application domains differ.  DryadInc: an early attempt to eliminate redundant computations via caching; its caching approach is quite similar to Nectar's, but it was too general and low-level for the system we wanted to build.

 Stateful bulk processing systems and Comet mainly focus on the same problem of incremental computation. However, Nectar:  automatically manages data and computation  is transparent to users  allows sharing of sub-computations without requiring jobs to be submitted at the same time.

 DryadLINQ: a simple, powerful, and elegant programming environment for writing large-scale data-parallel applications running on large PC clusters:  Dryad provides reliable, distributed computing for large-scale data-parallel applications.  LINQ enables developers to write and debug applications in a SQL-like query language, relying on the .NET libraries and using Visual Studio.

 DryadLINQ translates LINQ programs into distributed Dryad computations:  C# and LINQ data objects become distributed partitioned files.  LINQ queries become distributed Dryad jobs.  C# methods become code running on the vertices of a Dryad job.

 A DryadLINQ program’s input and output are expected to be streams:  a stream consists of an ordered sequence of extents, each of which stores a sequence of objects of some data type  streams are required to be append-only: new contents are added by either appending to the last extent or adding a new extent.

 Nectar uses a fault-tolerant, distributed file system called TidyFS, which maintains two namespaces:  Program store: keeps all DryadLINQ programs that have ever executed successfully.  Data store: stores all derived streams generated by DryadLINQ programs.

 The Nectar cache server provides cache hits to the program rewriter on the client side.  Any stream in the data store that is not referenced by any cache entry is deleted permanently by the garbage collector.  Programs in the program store are never deleted and are used to recreate a deleted derived stream if needed.

 Client-side library: 3 main components  Cache Key Calculation  Rewriter  Cost Estimator

 Cache Key Calculation  A computation is uniquely identified by its program and inputs.  Use the Rabin fingerprint of the program and the input datasets as the cache key for a computation.  The fingerprint of a DryadLINQ program must be able to detect any changes to the code the program depends on: ▪ a static dependency analyzer computes the transitive closure of all the code reachable from the program. ▪ the fingerprint is then formed from all the reachable code.
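A minimal sketch of the key construction under stated assumptions: SHA-256 stands in here for the Rabin fingerprints Nectar actually uses, and the inputs (reachable code bytes, input extents) are hypothetical parameters, not Nectar's API.

  using System;
  using System.Collections.Generic;
  using System.Linq;
  using System.Security.Cryptography;

  static class CacheKey
  {
      // Sketch: combine a fingerprint of all code reachable from the program with
      // fingerprints of the input datasets' contents to form the cache key.
      public static string Compute(byte[] reachableCode, IEnumerable<byte[]> inputExtents)
      {
          using var sha = SHA256.Create();
          byte[] combined = sha.ComputeHash(reachableCode)                     // program fingerprint
              .Concat(inputExtents.SelectMany(e => sha.ComputeHash(e)))        // input fingerprints
              .ToArray();
          return Convert.ToHexString(sha.ComputeHash(combined));               // final cache key
      }
  }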

 Rewriter: supports 3 rewriting scenarios  Common sub-expressions  Incremental query plans: given P(D1), now compute P(D1 + D2): find an operator C such that P(D1 + D2) = C(P(D1), D2)  Incremental query plans for sliding windows: the same program is repeatedly run on the following sequence of inputs: ▪ D1 = d1 + d2 + ... + dn, ▪ D2 = d2 + d3 + ... + dn+1, ▪ D3 = d3 + d4 + ... + dn+2, ...
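As a hedged illustration of the combining operator C (not Nectar's actual code): for a word-count query, P(D1 + D2) can be obtained by merging the cached counts P(D1) with the counts computed only on the new data D2.

  using System.Collections.Generic;

  static class IncrementalCombine
  {
      // Sketch: merge cached per-word counts (P(D1)) with counts over new data (P(D2)),
      // yielding the same result as recomputing P over D1 + D2.
      public static Dictionary<string, int> Combine(
          Dictionary<string, int> cachedCounts, Dictionary<string, int> newCounts)
      {
          var merged = new Dictionary<string, int>(cachedCounts);
          foreach (var kv in newCounts)
              merged[kv.Key] = merged.TryGetValue(kv.Key, out var c) ? c + kv.Value : kv.Value;
          return merged;
      }
  }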

 Cost Estimator:  chooses an optimal cache hit to rewrite the program.  uses execution statistics collected from past executions and saved in the cache server.

 Datacenter-Wide Service  Cache Service: keeps bookkeeping information about DryadLINQ programs and the location of their results ▪ serves the cache lookup requests issued by the Nectar rewriter ▪ deletes the cache entries of least value.  Garbage collector: ▪ identifies datasets unreachable from any cache entry and deletes them. ▪ uses a lease to protect newly created derived datasets.

 Caching Computations  Cache and Programs  The Rewriting Algorithm  Cache Insertion Policy  Managing Derived Data  Data Store for Derived Data  Garbage Collection

 Cache and Programs  Cache entry:  The fingerprints of the program and its inputs serve as the primary key: together they uniquely determine the result of the computation.  Fingerprint of the inputs: based on the actual content of the datasets.  Fingerprint of the program: computed by the static dependency analyzer, which captures all the dependencies of an expression.

 Cache and Programs (cont.)  Statistics kept in the cache entry are used by the rewriter to find an optimal execution plan; they include information such as the cumulative execution time.  Supporting operations: ▪ Lookup ▪ Inquire ▪ AddEntry
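A hedged sketch of what these operations might look like as a client-facing interface. Only the names Lookup, Inquire, and AddEntry come from the slide; the types and signatures below are assumptions.

  using System.Collections.Generic;

  // Hypothetical entry type; the real cache entry holds fingerprints, the result's
  // location, and execution statistics.
  record CacheEntry(string ResultUri, double CumulativeExecSeconds);

  interface ICacheServer
  {
      CacheEntry? Lookup(string cacheKey);                // exact lookup for a program+input key
      IEnumerable<CacheEntry> Inquire(string programFP);  // ask about possible hits, e.g. on sub-expressions
      void AddEntry(string cacheKey, CacheEntry entry);   // register a newly produced derived result
  }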

 The Rewriting Algorithm  For a given expression, we may get cache hits on any possible sub-expression and any subset of the input dataset ▪ considering all of them in the rewriting is not tractable ▪ only consider cache hits on prefix sub-expressions over segments of the input dataset ▪ e.g., for D.Where(P).Select(F), only consider cache hits for the sub-expressions S.Where(P) and S.Where(P).Select(F), where S is a subsequence of extents in D.

 Rewriting algorithm: GroupBy-Select example
   var groups = source.GroupBy(KeySelect);
   var reduced = groups.Select(Reduce);

 Rewriting algorithm: start from the largest prefix sub-expression (the root of the expression)  Step 1: probe the cache server to obtain all the possible hits  Step 2: find the best hits  Step 3: choose a subset of hits.
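A very simplified, hypothetical sketch of the largest-prefix-first search. Here a query is reduced to a chain of operator names and the cache to a set of already-computed prefixes; all names are illustrative, not Nectar's implementation.

  using System.Collections.Generic;
  using System.Linq;

  static class PrefixSearch
  {
      // Sketch: find the largest cached prefix of the operator chain; whatever
      // follows the hit still has to be executed.
      public static (string? cachedPrefix, string[] remaining) FindLargestCachedPrefix(
          string[] operatorChain, ISet<string> cachedPrefixes)
      {
          for (int len = operatorChain.Length; len > 0; len--)          // largest prefix first
          {
              string prefix = string.Join(".", operatorChain.Take(len));
              if (cachedPrefixes.Contains(prefix))
                  return (prefix, operatorChain.Skip(len).ToArray());   // hit: reuse it, run the rest
          }
          return (null, operatorChain);                                 // no hit: run everything
      }
  }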

 Cache Insertion Policy: consider every prefix sub-expression to be a candidate for caching  always create a cache entry for the final result of a computation, as we get it for free  for sub-expression candidates, cache only when the result is predicted to be useful in the future: ▪ based on previous statistics ▪ based on runtime information
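A hedged sketch of such a decision; the signals mirror the two bullets above (previous statistics and runtime information), but the parameters and thresholds are illustrative assumptions rather than Nectar's real policy.

  static class InsertionPolicy
  {
      // Sketch: cache a sub-expression's result only if past statistics suggest it
      // will be requested again and the result is cheap enough to store.
      public static bool ShouldCacheSubExpression(
          int timesExecutedBefore, long outputBytes, long inputBytes)
      {
          bool seenRepeatedly = timesExecutedBefore >= 2;   // signal from previous statistics
          bool compactResult  = outputBytes <= inputBytes;  // signal from runtime information
          return seenRepeatedly && compactResult;
      }
  }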

 Data Store for Derived Data: all derived datasets are stored in a data store inside a distributed, fault-tolerant file system.  The actual location of a derived dataset is completely opaque to programmers.  Accessing an existing derived dataset must go through the cache server.  New derived datasets can only be created as the results of computations.

 Garbage Collection: automatically deletes the derived datasets that are considered least useful in the future.  The cache server deletes entries that are determined to have the least value.  The datasets referred to by these deleted cache entries are then considered garbage.  The eviction policy is based on a cost-to-benefit ratio.  Newly created cache entries are given a chance to demonstrate their usefulness: leases are used to protect them.
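A hedged sketch of cost-to-benefit eviction with leases; the record fields and the exact ordering criterion are illustrative assumptions.

  using System;
  using System.Collections.Generic;
  using System.Linq;

  record DerivedEntry(string Name, long SizeBytes, double SavedSeconds, DateTime LeaseExpiry);

  static class Eviction
  {
      // Sketch: evict entries whose storage cost is high relative to the computation
      // time they save, skipping entries still protected by a lease.
      public static IEnumerable<DerivedEntry> ChooseVictims(
          IEnumerable<DerivedEntry> entries, long bytesToFree)
      {
          long freed = 0;
          var candidates = entries
              .Where(e => e.LeaseExpiry < DateTime.UtcNow)                            // respect leases
              .OrderByDescending(e => e.SizeBytes / Math.Max(1e-3, e.SavedSeconds));  // worst ratio first
          foreach (var e in candidates)
          {
              if (freed >= bytesToFree) yield break;
              freed += e.SizeBytes;
              yield return e;                                                         // delete this dataset
          }
      }
  }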

 Production Clusters: logs from 25 different clusters are used to evaluate the usefulness of Nectar  Benefits from Caching: 20% to 65% of jobs in a cluster benefit from caching. 30% of jobs in 17 clusters had a cache hit; on average, more than 35% of jobs benefited from caching.

 Production Clusters (cont.)  Ease of Program Development: ▪ Nectar automatically supports incremental computation, so programmers do not need to code it explicitly ▪ debugging time is significantly reduced due to cache hits

 Production Clusters (cont.)  Managing Storage: production clusters create large amounts of derived datasets which, if not properly managed, can create significant storage pressure.

 System Deployment Experience  Each machine in our 240-node research cluster has two dual-core 2.6 GHz AMD Opteron 2218 HE CPUs, 16 GB RAM, four 750 GB SATA drives, and runs the Windows Server 2003 operating system.  Datasets ▪ WordDoc Dataset ▪ ClickLog Dataset ▪ SkyServer Dataset

 System Deployment Experience (cont.)  Sub-computation Evaluation: 4 programs ▪ WordAnalysis parses the dataset to compute the number of occurrences of each word and the number of documents it appears in. ▪ TopWord looks for the ten most commonly used words in all documents. ▪ MostDoc looks for the ten words appearing in the largest number of documents. ▪ TopWordRatio finds the percentage of occurrences of the ten most commonly used words among all words.

 System Deployment Experience (cont.)  Incremental Computation:

 System Deployment Experience (cont.)  Debugging Experience: SkyServer

 The most popular comment from our users is that the system makes program debugging much more interactive and fun.  Most of the Nectar developers use Nectar to develop Nectar on a daily basis and have found a big increase in productivity.  Nectar is a complex distributed system with multiple interacting policies. Devising the right policies and fine-tuning their parameters to find the right tradeoffs are essential to making the system work in practice.