Platform Disaggregation Lightning talk Openlab Major review 16th October 2014


Background: the CERN *aaS catalog

Services hosted on CERN IT platforms include: OpenStack, LSF, s/w build, interactive services, AFS, Ceph, CASTOR, EOS, Hadoop, Netapp, Exchange, DFS, TS, AD, Lync, JIRA, Drupal, Twiki, Indico, CATIA, Ansys, Oracle, CDS, Kibana, ElasticSearch, TSM, Puppet, Foreman, Git, BOINC, CERNBOX, CVMFS, Grid Services, LQCD, …

Background: CERN IT assets

Server & storage                    Meyrin    Wigner
Number of cores                     93,937    20,544
Number of drives                    65,716    10,921
Number of memory modules            66,167    10,247
Number of 10G NICs                   3,708     1,211
Number of 1G NICs                   18,776     2,292
Number of processors                18,018     2,570
Number of servers                    9,808     1,288
Total disk space (TB)               99,329    32,584
Total memory capacity (TB)             344        83

Tape & network                      Meyrin & Wigner
Tape drives                         141
Tape cartridges                     50,622
Data volume on tape (TB)            98,552
Free space on tape (TB)             13,582
Routers (GPN)                       135
Routers (TN)                        24
Routers (others)                    101
Star points                         631
Switches                            3,225
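As a quick sanity check on the table above, the per-site figures can be combined into totals across the two computer centres. A minimal sketch in Python (the dictionary keys are illustrative; the figures are taken directly from the table):

```python
# Per-site asset counts from the slide, Meyrin and Wigner columns.
assets = {
    "cores":         {"Meyrin": 93_937, "Wigner": 20_544},
    "drives":        {"Meyrin": 65_716, "Wigner": 10_921},
    "servers":       {"Meyrin": 9_808,  "Wigner": 1_288},
    "disk_space_tb": {"Meyrin": 99_329, "Wigner": 32_584},
}

# Sum the two sites for each metric.
totals = {metric: sum(sites.values()) for metric, sites in assets.items()}

for metric, total in totals.items():
    print(f"{metric}: {total:,}")
# cores come out to 114,481 and servers to 11,096 across both sites
```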

Background: Platform customizations

Platforms:
– Base node (cloud, batch worker, web services, …): 2x CPU (e.g. Intel E5-26xx v2/v3), 64GB RAM, 2x 2TB HDD, on-board 1GbE + dedicated IPMI
– Disk storage (JBOD) front-end: base node + SAS HBA (LSI e) + 10GbE (SFP+)
– Tape server: base node + 10GbE + 8Gb/s FC
– TSM front-end: base node + SAS HBA + 10GbE + 8Gb/s FC (tape)
– Oracle DB server: base node + 64GB RAM + 2x dual-port 10GbE (SFP+) + RHEL certification
– HPC: base node + 64GB (or 192GB) RAM + low-latency 10GbE
– Windows server: base node + RAID (int or int/ext)
– "Fat" cloud node: base node + 64GB RAM + 10GbE (SFP+ or RJ45) + SSD
– …

Challenge: achieve all of these customizations starting from monolithic base platforms.
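The catalogue above is essentially one base configuration plus a small delta per platform. That composition pattern can be sketched as follows (the class, component names, and catalogue entries are illustrative, not CERN's actual provisioning tooling):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Platform:
    """A hardware platform expressed as the base node plus add-on parts."""
    name: str
    addons: frozenset = frozenset()

# Components shared by every platform (the "base node").
BASE = {"2x CPU", "64GB RAM", "2x 2TB HDD", "1GbE", "IPMI"}

# Each customization is the base node plus a small delta.
CATALOG = [
    Platform("base node"),
    Platform("tape server", frozenset({"10GbE", "8Gb/s FC"})),
    Platform("oracle db",   frozenset({"+64GB RAM", "2x dual-port 10GbE SFP+"})),
    Platform("hpc",         frozenset({"+64GB RAM", "low-latency 10GbE"})),
]

def bill_of_materials(p: Platform) -> set:
    """Full component list for one platform: base plus its add-ons."""
    return BASE | set(p.addons)

print(sorted(bill_of_materials(CATALOG[1])))
```

The point of the sketch is the challenge stated on the slide: every entry shares the same base, so a monolithic SKU per platform duplicates most of the configuration.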

Could Open Compute help?

The Open Compute Project (OCP) is an interesting new direction with:
– Potential far-reaching impact for industry and data centres
– A constantly growing provider community and private customer space

Encouraging results from our small-scale tests with two twin systems:
– Sufficiently interesting to motivate launching a project for larger deployment

The platform is still monolithic, except for rack-level power distribution (see "Open Compute at CERN", HEPiX, 21/05/2014).

Breaking Up the Monolith?

An intriguing paragraph from an Open Compute announcement (OCP Summit, January 2013):

"But most exciting of all are a series of new developments that will enable us to take some big steps forward toward better utilization of these technologies. One of the challenges we face as an industry is that much of the hardware we build and consume is highly monolithic -- our processors are inextricably linked to our motherboards, which are in turn linked to specific networking technology, and so on. This leads to poorly configured systems that can't keep up with rapidly evolving software and waste lots of energy and material. To fix this, we need to break up some of these monolithic designs -- to disaggregate some of the components of these technologies from each other so we can build systems that truly fit the workloads they run and whose components can be replaced or updated independently of each other. Several members of the Open Compute Project have come together today to take the first steps toward this kind of disaggregation." (*)

(*) More technical details in the "Design guide for photonic architecture" contributed by Intel to the Open Compute Project in 2013.

Openlab-V scope

Disaggregation for enabling provisioning of customized hardware platforms?

Long-term vision: Software Defined Platform
– Flexible provisioning / sustainment: commissioning; repurposing; connectivity and routing domains
– Flexible component lifecycle management: trays of network components (NICs, switches, fabrics, …); trays of storage (HDD, SSD, NVMe, …); trays of memory (RAM, NVRAM, …); trays of processor sockets (Xeon, Xeon Phi, Atom, ARM, …)
– Scalable, open and affordable: scalable performance and manageability; open, competitive manufacturer / supplier ecosystem; affordable also underneath the hyper-scale pedestal
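The "trays of components" idea above can be illustrated as provisioning from shared pools instead of selecting a fixed monolithic SKU. A toy sketch under assumed pool sizes (nothing here reflects a real inventory or a real provisioning API):

```python
# Hypothetical pools of disaggregated components in one rack.
pools = {"cpu_socket": 16, "ram_64gb": 32, "hdd_2tb": 48, "nic_10gbe": 24}

def provision(request: dict) -> bool:
    """Reserve components for one platform; return False if any pool is short."""
    if any(pools.get(part, 0) < count for part, count in request.items()):
        return False  # reject without touching the pools
    for part, count in request.items():
        pools[part] -= count
    return True

# Compose a "fat cloud node": 2 sockets, 2x 64GB RAM, one 10GbE NIC.
ok = provision({"cpu_socket": 2, "ram_64gb": 2, "nic_10gbe": 1})
print(ok, pools)
```

With this model, repurposing a node means returning its components to the pools and drawing a different mix, which is the flexibility the slide describes.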

Objectives and timescale

Phase 1, ~PM12: Disaggregated ToR model
– OCP rack enhanced with a disaggregated Top-of-Rack switch
Phase 2, ~PM24: Disaggregated storage
– Prototype enhanced with SSD storage sleds added to the switch fabric
Phase 3, ~PM36 or beyond: Disaggregated system memory
– Add a second-level memory hierarchy on NVRAM sleds?

Would other research organisations want to participate?