Percipient StorAGe for Exascale Data Centric Computing: Exascale Storage Architecture based on the "Mero" Object Store. Giuseppe Congiu, Seagate Systems UK.


Percipient StorAGe for Exascale Data Centric Computing
Exascale Storage Architecture based on the "Mero" Object Store
Giuseppe Congiu, Seagate Systems UK
DKRZ I/O Workshop, 22/03/2017

Per-cip-i-ent adj. Having the power of perceiving, especially perceiving keenly and readily. n. One that perceives.

This presentation outlines the architecture of the SAGE percipient storage system.

© Seagate Technology LLC

Storage Problems at Extreme Scale
- Storage cannot keep up with compute: far too much data, and far too much energy spent moving it
- How best to use new storage devices is still unclear
- Opportunity: Big Data Analytics and Extreme Computing overlap
1. Denser compute components (MIC, GPGPUs)
2. Larger datasets and larger, more complex workflows
3. Workflows need rethinking (HPC vs. BDA)
4. Parallel file systems are too dumb; we need percipient storage that can serve both HPC and BDA (BDEC)
5. Data movement must be minimized at exascale
6. New storage devices are becoming available

SAGE Project Goal
- Extreme computing is changing I/O needs; HDDs alone cannot keep up
- Big (massive!) data analysis: avoid data movement; manage and process extremely large data sets
- We need exascale data-centric computing systems: Big Data Extreme Computing (BDEC) systems
- SAGE validates a BDEC system that can ingest, store, process and manage extreme amounts of data
- US labs are starting to draw up requirements for BDEC systems; SAGE is actually building one

Research Areas (the SAGE ecosystem)
- Percipient storage for function offloading
- Programming techniques supporting function shipping
- Next-generation media for deeper storage hierarchies

Percipient Storage Overview
Goal: build the data-centric computing platform
Methodology:
- Advanced object storage
- New NVRAM technologies in the I/O stack
- Ability for the I/O system to accept computation
- Memory included as part of the storage tiers
- API for massive data ingest and extreme I/O
- Commodity server and computing components in the I/O stack
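To make the "advanced object storage" idea concrete, here is a toy sketch of a Clovis-like object interface: objects are addressed by an ID and accessed with explicit create/write/read operations rather than POSIX files. All names and the in-memory back-end are illustrative only; this is not the real Mero/Clovis API.

```python
# Toy in-memory stand-in for an object store back-end (illustrative,
# NOT the actual Mero/Clovis API).
class ObjectStore:
    def __init__(self):
        self._objects = {}

    def create(self, oid):
        """Create an empty object; object IDs must be unique."""
        if oid in self._objects:
            raise KeyError(f"object {oid} already exists")
        self._objects[oid] = bytearray()

    def write(self, oid, offset, data):
        """Write bytes at an arbitrary offset, growing the object as needed."""
        buf = self._objects[oid]
        if len(buf) < offset + len(data):
            buf.extend(b"\0" * (offset + len(data) - len(buf)))
        buf[offset:offset + len(data)] = data

    def read(self, oid, offset, length):
        return bytes(self._objects[oid][offset:offset + length])

store = ObjectStore()
store.create((0, 1))                 # a (container, key) pair as a toy object ID
store.write((0, 1), 0, b"sensor payload")
print(store.read((0, 1), 0, 6))      # b'sensor'
```

The point of the object abstraction is that byte-addressable, ID-keyed access maps naturally onto multiple storage tiers, unlike a directory hierarchy.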

Percipient Storage Modes of Operation

Percipient Storage Architecture
- Emphasizes optimization tools (ADDB, Allinea)

Percipient Storage Architecture (Detailed)

Perspective on Monitoring
Primary goal: provide monitoring tools with telemetry data
Methodology:
- Mero has an "ADDB" (Analysis and Diagnostics Data Base) infrastructure: structured telemetry records produced by the various subsystems
- ADDB will now be exposed through the "Clovis" management interface
- ADDB records can feed into tools; Allinea will consume ADDB records in the SAGE project
- Records can be kept on local Mero nodes, and Map-Reduce style analysis can be performed on them
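The "Map-Reduce style analysis" mentioned above can be sketched as follows: telemetry records stay on the node that produced them, each node maps over its local records, and only the small partial results are reduced centrally. The record fields here are invented for illustration, not the actual ADDB schema.

```python
from collections import Counter

# Per-node ADDB-style records: (subsystem, latency_us). Fields are
# illustrative only, not the real ADDB record layout.
node_records = [
    [("io", 120), ("net", 30), ("io", 80)],   # node 0
    [("io", 200), ("md", 15)],                # node 1
]

def map_node(records):
    """Local map phase: count operations per subsystem on one node."""
    return Counter(sub for sub, _ in records)

def reduce_counts(partials):
    """Reduce phase: merge the per-node partial counts."""
    total = Counter()
    for p in partials:
        total.update(p)
    return total

totals = reduce_counts(map_node(r) for r in node_records)
print(totals["io"])   # 3
```

Shipping the map function to the data instead of shipping records to an analysis node is exactly the data-movement saving the function-shipping research area targets.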

SAGE Co-Design
Primary goal: demonstrate the use cases and co-design the system
Methodology: obtain requirements from:
- Satellite data processing
- Bio-informatics
- Space weather
- Nuclear fusion (ITER)
- Synchrotron experiments
- Data analytics applications
Detailed profiling supported by tools; requirements fed back into the platform

SAGE Co-Design (cont.)
- First phase of the co-design process completed

Services, Systemware & Tools
Goal: explore tools and services on top of Mero
Methodology:
- "HSM": methods to automatically move data across tiers
- "pNFS": parallel file system access on Mero
- Scale-out object storage
- Integrity checking service provision
- Allinea performance analysis tools provision
Status:
- Completed design of HSM
- Completed design of pNFS
- Completed scoping/architecture of data integrity checking
- Completed scoping/architecture of performance analysis tools
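The HSM idea, automatically moving data across tiers, can be sketched as a simple recency-based policy: objects idle beyond a threshold are demoted to the slower tier, recently touched ones stay in the fast tier. Tier names and the threshold are invented for illustration and are not the SAGE HSM design.

```python
# Minimal sketch of an HSM-style tiering policy (illustrative only;
# not the SAGE HSM design). Tiers and threshold are assumptions.
FAST, SLOW = "nvram", "disk"

def rebalance(objects, now, cold_after=60):
    """Demote objects idle longer than cold_after seconds; keep the rest fast."""
    for obj in objects:
        idle = now - obj["last_access"]
        obj["tier"] = SLOW if idle > cold_after else FAST
    return objects

objs = [
    {"name": "ckpt-001", "last_access": 100, "tier": FAST},  # idle 900 s -> demote
    {"name": "mesh-in",  "last_access": 995, "tier": SLOW},  # idle 5 s -> promote
]
rebalance(objs, now=1000)
print([o["tier"] for o in objs])   # ['disk', 'nvram']
```

A real HSM would of course also account for object size, migration cost, and pinning hints from applications; the sketch only shows the policy loop.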

Programming Models and Analytics
Goal: explore usage of SAGE by programming models, runtimes and data analytics solutions
Methodology:
- Usage of SAGE through MPI and PGAS: adapt MPI-IO for SAGE; adapt PGAS for SAGE
- Runtimes for SAGE: pre/post-processing, volume rendering, exploiting the caching hierarchy
- Data analytics methods on top of Clovis: Apache Flink over Clovis, looking beyond Hadoop; exploit NVRAM as an extension of memory
Status:
- MPI-IO for SAGE design available
- Runtimes design available
- Framework for data analytics available
- PGAS for SAGE being studied
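"Exploit NVRAM as an extension of memory" can be illustrated with a bounded fast-tier cache that spills least-recently-used entries to a larger, slower tier instead of dropping them. The capacities and the dict-backed "NVRAM" are stand-ins for illustration, not the SAGE runtime design.

```python
from collections import OrderedDict

# Sketch of a DRAM cache spilling to an NVRAM-like tier (illustrative
# only). Capacities and the dict-backed "nvram" are assumptions.
class TieredCache:
    def __init__(self, dram_slots):
        self.dram = OrderedDict()   # fast tier, bounded, LRU-ordered
        self.nvram = {}             # slower, larger spill tier
        self.dram_slots = dram_slots

    def put(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        while len(self.dram) > self.dram_slots:
            old_key, old_val = self.dram.popitem(last=False)  # evict LRU entry
            self.nvram[old_key] = old_val                     # spill, don't drop

    def get(self, key):
        if key in self.dram:
            self.dram.move_to_end(key)  # refresh recency
            return self.dram[key]
        return self.nvram[key]          # slower path: fetch from spill tier

cache = TieredCache(dram_slots=2)
for i in range(4):
    cache.put(i, i * i)
print(sorted(cache.nvram))   # [0, 1] spilled; 2 and 3 still in DRAM
```

The design point is that eviction becomes demotion: data falls down the hierarchy rather than out of it, which is what makes memory usable as just another storage tier.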

Integration & Demonstration
Goal: hardware definition, integration and demonstration
Methodology:
- Design and bring-up of the SAGE hardware (Seagate hardware, Atos hardware)
- Integration of all the software components from WP2/3/4
- Integration at the Jülich Supercomputing Centre
- Demonstrate the use cases; extrapolate performance to exascale
- Study other object stores vis-à-vis Mero
Status:
- Design and definition of the base SAGE hardware complete
- Prototype being tested at Seagate

SAGE Prototype: ready for shipping to Jülich

Overall Project Timeline
- M1 (Sept 2015): project start
- M3: draft system architecture
- M9: co-design inputs from applications complete
- M12: design of key software components (1st draft)
- M18: prototype system available (Seagate)
- M27: design of all software components (2nd draft)
- M18–M28: prototype system testing and integration at Jülich
- M28–M36: application use cases and performance tests of the SAGE platform at Jülich

Consortium
- Coordinator: Seagate

Questions?
giuseppe.congiu@seagate.com
sai.narasimhamurthy@seagate.com