SANDIE: Optimizing NDN for Data Intensive Science

Slides:

Advertisements

Similar presentations

COPYRIGHT © 2011 ALCATEL-LUCENT. ALL RIGHTS RESERVED. INFORM: a dynamic INterest FORwarding Mechanism for Information Centric Networking Raffaele Chiocchetti,

Advertisements

Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.

Scaling Distributed Machine Learning with the BASED ON THE PAPER AND PRESENTATION: SCALING DISTRIBUTED MACHINE LEARNING WITH THE PARAMETER SERVER – GOOGLE,

Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.

Peer-to-Peer Based Multimedia Distribution Service Zhe Xiang, Qian Zhang, Wenwu Zhu, Zhensheng Zhang IEEE Transactions on Multimedia, Vol. 6, No. 2, April.

NAMED DATA NETWORKING IN CLIMATE RESEARCH AND HEP APPLICATIONS CHEP2015, APRIL , OKINAWA, JAPAN Susmit Shannigrahi*, Artur Barczyk^, Christos Papadopoulos*,

Multipath Protocol for Delay-Sensitive Traffic Jennifer Rexford Princeton University Joint work with Umar Javed, Martin Suchara, and Jiayue He

CS Spring 2012 CS 414 – Multimedia Systems Design Lecture 34 – Media Server (Part 3) Klara Nahrstedt Spring 2012.

Active Network Applications Tom Anderson University of Washington.

Storage Allocation in Prefetching Techniques of Web Caches D. Zeng, F. Wang, S. Ram Appeared in proceedings of ACM conference in Electronic commerce (EC’03)

Hopkins Storage Systems Lab, Department of Computer Science A Workload-Driven Unit of Cache Replacement for Mid-Tier Database Caching Xiaodan Wang, Tanu.

Seaborg Cerise Wuthrich CMPS Seaborg  Manufactured by IBM  Distributed Memory Parallel Supercomputer  Based on IBM’s SP RS/6000 Architecture.

1 Moshe Shadmon ScaleDB Scaling MySQL in the Cloud.

A Dynamic Data Grid Replication Strategy to Minimize the Data Missed Ming Lei, Susan Vrbsky, Xiaoyan Hong University of Alabama.

Scalable Web Server on Heterogeneous Cluster CHEN Ge.

E VALUATION OF F AIRNESS IN ICN X. De Foy, JC. Zuniga, B. Balazinski InterDigital

Martin-1 CSE 5810 CSE 5810 Individual Research Project: Integration of Named Data Networking for Improved Healthcare Data Handling Robert Martin Computer.

Your university or experiment logo here Caitriana Nicholson University of Glasgow Dynamic Data Replication in LCG 2008.

1 Distributed Energy-Efficient Scheduling for Data-Intensive Applications with Deadline Constraints on Data Grids Cong Liu and Xiao Qin Auburn University.

Introduction to dCache Zhenping (Jane) Liu ATLAS Computing Facility, Physics Department Brookhaven National Lab 09/12 – 09/13, 2005 USATLAS Tier-1 & Tier-2.

Congestion Control in CSMA-Based Networks with Inconsistent Channel State V. Gambiroza and E. Knightly Rice Networks Group

Adaptive Web Caching CS411 Dynamic Web-Based Systems Flying Pig Fei Teng/Long Zhao/Pallavi Shinde Computer Science Department.

1 ACTIVE FAULT TOLERANT SYSTEM for OPEN DISTRIBUTED COMPUTING (Autonomic and Trusted Computing 2006) Giray Kömürcü.

Data Replication and Power Consumption in Data Grids Susan V. Vrbsky, Ming Lei, Karl Smith and Jeff Byrd Department of Computer Science The University.

Caitriana Nicholson, CHEP 2006, Mumbai Caitriana Nicholson University of Glasgow Grid Data Management: Simulations of LCG 2008.

MiddleMan: A Video Caching Proxy Server NOSSDAV 2000 Brian Smith Department of Computer Science Cornell University Ithaca, NY Soam Acharya Inktomi Corporation.

Data Consolidation: A Task Scheduling and Data Migration Technique for Grid Networks Author: P. Kokkinos, K. Christodoulopoulos, A. Kretsis, and E. Varvarigos.

Latest Improvements in the PROOF system Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers, Gerri Ganis, Jan Iwaszkiewicz CERN.

Latest Improvements in the PROOF system Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers, Gerri Ganis, Jan Iwaszkiewicz CERN.

1 Traffic Engineering By Kavitha Ganapa. 2 Introduction Traffic engineering is concerned with the issue of performance evaluation and optimization of.

Meeting with University of Malta| CERN, May 18, 2015 | Predrag Buncic ALICE Computing in Run 2+ P. Buncic 1.

PACMan: Coordinated Memory Caching for Parallel Jobs Ganesh Ananthanarayanan, Ali Ghodsi, Andrew Wang, Dhruba Borthakur, Srikanth Kandula, Scott Shenker,

Presenter: Yue Zhu, Linghan Zhang A Novel Approach to Improving the Efficiency of Storing and Accessing Small Files on Hadoop: a Case Study by PowerPoint.

DATA ACCESS and DATA MANAGEMENT CHALLENGES in CMS.

William Stallings Data and Computer Communications

Auction-based in-network caching in Information-centric networks Workshop ACROSS, 16th of September 2016 | Lucia D’Acunto.

BD-Cache: Big Data Caching for Datacenters

Architecture and Algorithms for an IEEE 802

Constraint-Based Routing

Notes Onur Ascigil, Vasilis Sourlas, Ioannis Psaras, and George Pavlou

Virtual laboratories in cloud infrastructure of educational institutions Evgeniy Pluzhnik, Evgeniy Nikulchev, Moscow Technological Institute

The Impact of Replacement Granularity on Video Caching

Tohoku University, Japan

Authors: Sajjad Rizvi, Xi Li, Bernard Wong, Fiodar Kazhamiaka

Distributed Data Access and Resource Management in the D0 SAM System

Outline Introduction Routing in Mobile Ad Hoc Networks

Wireless Sensor Network Architectures

Grid Computing.

NDN Day: Introduction and Perspective

PA an Coordinated Memory Caching for Parallel Jobs

Named Data Networking in science Applications

CSCE 990: Advanced Distributed Systems

An Equal-Opportunity-Loss MPLS-Based Network Design Model

Edge computing (1) Content Distribution Networks

ExaO: Software Defined Data Distribution for Exascale Sciences

INFOCOM 2013 – Torino, Italy Content-centric wireless networks with limited buffers: when mobility hurts Giusi Alfano, Politecnico di Torino, Italy Michele.

Junaid Ahmed Khan, Cedric Westphal, J. J

Xiaoyang Zhang1, Yuchong Hu1, Patrick P. C. Lee2, Pan Zhou1

The Globus Toolkit™: Information Services

Coded Caching in Information-Centric Networks

Cooperative Caching, Simplified

Consideration on applying ICN to Edge Computing

Group Based Management of Distributed File Caches

Javad Ghaderi, Tianxiong Ji and R. Srikant

Backbone Traffic Engineering

Project Overview Konstantinos Tserpes, ICCS/NTUA Final Review Meeting

EE 122: Lecture 22 (Overlay Networks)

2019/9/14 The Deep Learning Vision for Heterogeneous Network Traffic Control Proposal, Challenges, and Future Perspective Author: Nei Kato, Zubair Md.

Contention-Aware Resource Scheduling for Burst Buffer Systems

L. Glimcher, R. Jin, G. Agrawal Presented by: Leo Glimcher

Presentation transcript:

SANDIE: Optimizing NDN for Data Intensive Science Edmund Yeh

Data-intensive Science and LHC Data-intensive science: extract knowledge from massive datasets with growing scale and complexity. Global data distribution, processing, access, analysis; coordinated use of large but limited computing, storage and network resources. NDN a natural fit for data-intensive science. Large Hadron Collider (LHC) high energy physics program is world’s largest data intensive application, handles about 1 Exabyte (1 million terabytes) by 2018. LHC network connects CERN to 500 tiered sites worldwide. LHC network traffic projected to grow by another 2 orders of magnitude by 2026. Increased complexity of data. Present system cannot scale to meet needs of HC Run3 (2021-23). SANDIE: NDN for Data Intensive Science

SANDIE Project $1M, 2-year US NSF CC* grant starting July 2017. Northeastern (lead), Caltech, Colorado State. Use NDN principles to redesign high energy physics (HEP) Large Hadron Collider (LHC) program network. Will deploy NDN edge caches with SSDs and 40G/100G network interfaces at 7 sites (Northeastern, Caltech, CSU, Fermilab, ..) Combine with larger core caches in strategic locations. Coordinate with compact preprocessing of data objects (e.g. MiniAOD, MicroAOD, data scouting, n-tuples). Leverages NDN-based tested for climate applications at CSU. SANDIE: NDN for Data Intensive Science

Project Aims Derive NDN-based operational model for data distribution, processing, gathering, analysis to benefit LHC and other major data-intensive scientific programs (e.g. LSST, SKA). Simultaneous optimization of caching (“hot” datasets), forwarding, and congestion control in network core and site edges; leverages PI previous work. Network core and edge node design and performance optimization. Development of naming scheme and attributes for fast access and efficient communication in HEP and other fields. Scalability of NDN system in amount and type of data supported as well as geographic extent. SANDIE: NDN for Data Intensive Science

Optimized NDN Algorithms State-of-the-art, theoretically grounded, high-performance optimization frameworks for optimizing NDN. Joint optimization of caching, forwarding, and congestion control. VIP (Virtual Interest Packet) framework (Yeh, et al. 2014). Adaptive caching and routing framework (Ioannidis and Yeh 2016, 2017). Algorithms have optimality guarantees. Algorithms are distributed, adaptive, dynamic. Superior performance in delay, cache hits, cache evictions. SANDIE: NDN for Data Intensive Science

VIP Framework VIP (Virtual Interest Packet) framework for optimal caching and forwarding for NDN (first published at ACM ICN 2014). First comprehensive framework for jointly optimal caching, forwarding, and congestion control for ICN networks. Stochastic control approach: models queuing and congestion. Maximizes user demand rate satisfied by NDN network. Superior performance in user delay, cache hit rate, cache evictions. Distributed, adaptive algorithms for changing content, user demands, and network conditions. Project supported by NSF FIA NDN grant 2010-2014, NSF NeTS grants 2014-2020, Cisco grants 2013-2018. SANDIE: NDN for Data Intensive Science

VIP Framework For each Interest Packet for data object k entering network, generate corresponding VIP count for object k. VIP counts: locally measured demand/popularity for data objects. VIP counts incremented at request entry nodes, removed at data sources and caching points. VIP counts are common metric for determining caching and forwarding decisions. Forwarding: at each time and on each link, use all capacity to send data object with largest VIP count difference. Dynamic, multipath forwarding algorithm. Caching: at each time and each node, cache data objects (with all chunks) with highest VIP counts. Number of replicas dynamically optimized. Both caching and forwarding are adaptive and distributed. Can incorporate content-centric congestion control. SANDIE: NDN for Data Intensive Science

VIP Evaluation SANDIE: NDN for Data Intensive Science

VIP Evaluation: Delay SANDIE: NDN for Data Intensive Science

VIP Evaluation: Cache Hits SANDIE: NDN for Data Intensive Science

VIP Evaluation: Cache Evictions SANDIE: NDN for Data Intensive Science

LHC Model Map SANDIE: NDN for Data Intensive Science

LHC Model Data Access Pattern Files Total Size Min Size Max Size Mean Size Median Size 58035992 149.86 PB 0.00 MB 163.82 GB 2.58 GB 2.59 GB SANDIE: NDN for Data Intensive Science

VIP Evaluation for LHC Model Data object: (MiniAOD) dataset (2 TB), each with 100 data block chunks (20 GB). VIP counts maintained for 200 most popular (hot) datasets, accounting for 87% of requests among 1500 active datasets. Dataset popularity distribution fitted with Zipf(1.2). Dataset requests arrive at Tier 1/Tier 2/Tier 3 node at rate l, according to Zipf(1.2) distribution. VIP-controlled 60 TByte caches at CHIC, LOSA, ATLA and AMST core nodes. Source nodes for all datasets: CERN and FNL. 800 TB repositories at all Tier 2 sites: store all 200 hot datasets. Scenarios: 1) no caching, no tier 2 repositories; 2) caching at 4 core nodes, no repositories at tier 2 nodes; 3) no caching, tier 2 repositories at all tier 2 nodes; 4) caching at 4 core nodes, repositories at all tier 2 nodes. SANDIE: NDN for Data Intensive Science

VIP Evaluation for LHC Model SANDIE: NDN for Data Intensive Science

Adaptive Caching and Routing New approach for optimizing caching and routing in general multi-hop networks by Ioannidis and Yeh (ACM SIGMETRICS 2016, ACM ICN 2017 Best Paper Award). Optimize caching and routing to maximize gain in reducing link costs incurred. Distributed, adaptive algorithm. Performance guarantee: converges to point within 1-1/e of optimal (NP-hard) solution. Stochastic projected gradient ascent on concave relaxation. Route-to-Nearest-Server combined with LRU, LFU, FIFO, … can be arbitrarily suboptimal. In heterogeneous networks, can decrease network costs by 3 orders of magnitude. Project supported by NSF NeTS grant 2017-2020, Intel grant 2018-2019. SANDIE: NDN for Data Intensive Science

Optimal Distributed Computation and Storage with NDN LHC network: distributed computation and data storage network. Need to optimize usage of bandwidth, storage, and computation resources for performance. Powerful machines may be congested, may not have needed datasets. Caching required data where jobs are processed enhances performance. Ongoing work: use NDN approach to jointly optimize workflow (forwarding and processor scheduling) and data caching. SANDIE: NDN for Data Intensive Science

Conclusion and Ongoing Work NDN holds great promise for optimizing data-intensive science networks. State-of-the-art, theoretically grounded, high-performance frameworks (VIP, adaptive caching/routing) developed for jointly optimizing caching, forwarding, and congestion control for NDN. Algorithms are distributed, adaptive, dynamic. Superior performance in delay, cache hits. VIP optimization shows great performance gains for LHC model network. Ongoing work: optimization with storage management (SSD, disks) Ongoing work: joint optimization of bandwidth, storage, and computation resources for performance. Ongoing work: proactive caching strategies using predictions on future demand. ACM ICN 2018 at Northeastern University, Boston, Sept. 21-23. SANDIE: NDN for Data Intensive Science