Low Latency Analytics HPC Clusters

Low Latency Analytics HPC Clusters
Devashish Paul, Director of Strategic Marketing, IDT

Agenda
- OCP HPC Project Overview
- Heterogeneous Computing
- RapidIO Low Latency Interconnect Technology
- Computing and Low Latency Switching Platforms for OCP
- CERN Use Case

IDT Company Overview: Mixed-Signal Application-Specific Solutions
- Founded: 1980
- Workforce: approximately 1,800 employees
- Headquarters: San Jose, California
- #1 in serial switching: 100% of 4G infrastructure uses RapidIO
- #1 in memory interfaces: industry leader in DDR4
- #1 in silicon timing devices: broadest portfolio
- 800+ issued and pending patents worldwide

Clustering Fabric Needs
- Lowest deterministic system latency
- Scalability: peer to peer, any topology
- Embedded endpoints
- Energy efficiency
- Cost per performance
- Hardware reliability and determinism
PCIe and Ethernet each cover only part of the low latency, hardware termination, guaranteed delivery, and scalability picture (Ethernet terminates its protocol in software only). RapidIO Interconnect combines the best attributes of PCIe® and Ethernet in a multi-processor fabric.

RapidIO Heterogeneous Computing Clusters
- Heterogeneous compute workloads: ARM, x86, OpenPOWER
- No CPU cycles spent on protocol termination
- ~100 ns switch latency
- Distributed switching
- Energy efficiency
- 20 to 50 Gbps embedded interconnect
- Mission-critical reliability
- Build HPC systems tuned to the workload
- Flexible solutions, from appliance to rack scale
(Diagram: compute nodes combining CPU, FPGA/GPU, and storage over a local interconnect fabric, linked through backplane and cable/connector to a rack-scale interconnect fabric and other nodes.)

- 10/20/40/50 Gbps per port, with 6.25/10/12.5 Gbaud lanes; 100+ Gbps interconnect in definition (a rough sanity check on these rates follows below)
- Over 15 million RapidIO switches shipped, more than 2x the shipments of 10GbE Ethernet switches
- Over 110 million 10-20 Gbps ports shipped
- 100% share of the 4G interconnect market; 60% share of 3G, 100% of China 3G
- Embedded RapidIO NIC on processors, DSPs, FPGAs and ASICs
- Hardware termination at the PHY layer: 3-layer protocol
- Lowest latency interconnect, ~100 ns
- Inherently scales to large systems with 1000s of nodes
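As a rough sanity check on the per-port figures, the sketch below assumes x4 ports, 8b/10b encoding on the 6.25 Gbaud lanes, and 64b/67b encoding on the faster lanes; these are illustrative assumptions, not IDT datasheet values.

```python
# Illustrative check of RapidIO x4 port rates (assumed lane rates and encodings,
# not vendor datasheet figures).
LANES_PER_PORT = 4

def port_gbps(lane_gbaud, encoding_efficiency):
    """Raw lane signaling rate x encoding efficiency x lane count."""
    return lane_gbaud * encoding_efficiency * LANES_PER_PORT

print(port_gbps(6.25, 8 / 10))      # ~20 Gbps per port (Gen2)
print(port_gbps(10.3125, 64 / 67))  # ~39.4 Gbps per port (marketed as 40)
print(port_gbps(12.5, 64 / 67))     # ~47.8 Gbps per port (marketed as 50)
```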

Supported by a Large Ecosystem (© 2015 RapidIO.org)

Open Compute Project: HPC and Low Latency Interconnect. Supports the Opencompute.org HPC initiative.

HPC/Supercomputing Interconnect 'Check-In' Requirements (scored across RapidIO, InfiniBand, Ethernet, and PCIe):
- The meaning of low latency: switch silicon ~100 ns, memory to memory < 1 µs (a worked latency-budget sketch follows below)
- Scalability: ecosystem supports any topology and 1000s of nodes
- Integrated HW termination: available integrated into SoCs, and implements guaranteed in-order delivery without software
- Power efficient: all 3 protocol layers terminated in hardware, integrated into SoCs
- Fault tolerant: ecosystem supports hot swap and fault tolerance
- Deterministic: guaranteed in-order delivery, deterministic flow control
- Top-line bandwidth: ecosystem supports > 8 Gbps per lane
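To make the "memory to memory < 1 µs" target concrete, here is a minimal latency-budget sketch; the per-stage numbers other than the ~100 ns switch figure are assumptions for illustration, not measured values.

```python
# Hypothetical memory-to-memory latency budget for a two-hop fabric path.
# Endpoint overhead is an assumed figure; only the switch latency comes from the slide.
SWITCH_HOP_NS = 100   # ~100 ns per switch, as quoted above
ENDPOINT_NS = 300     # assumed send/receive endpoint + DMA overhead per side
HOPS = 2

total_ns = 2 * ENDPOINT_NS + HOPS * SWITCH_HOP_NS
print(f"Estimated memory-to-memory latency: {total_ns} ns")  # 800 ns, under 1 us
```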

RapidIO Heterogeneous Switch + Server
- External panel with S-RIO and 10 GigE: 4x 20 Gbps RapidIO external ports, 4x 10 GigE external ports
- 4 processing mezzanine cards in the chassis
- 320 Gbps of switching, with 3 ports to each processing mezzanine (the port math is checked below)
- Compute nodes can use a PCIe to S-RIO NIC
- Compute nodes with ARM/PPC/DSP/FPGA are natively RapidIO connected, with a small switching option on the card
- Co-located storage over SATA
- 10 GbE added for ease of migration
(Diagram: 19-inch chassis with a RapidIO switch and PCIe to S-RIO bridging connecting GPU, DSP/FPGA, ARM CPU, and x86 CPU nodes.)
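As a rough consistency check, the internal and external port counts add up to the quoted switching capacity, assuming every port runs at 20 Gbps (an assumption, not a chassis spec).

```python
# Rough check of the quoted 320 Gbps chassis switching capacity.
PORT_GBPS = 20
external_ports = 4           # 4x 20 Gbps RapidIO external ports
mezzanine_ports = 4 * 3      # 4 mezzanines x 3 ports each

print(PORT_GBPS * (external_ports + mezzanine_ports))  # 320 Gbps
```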

Quad-Socket x86 HPC/Analytics Appliance (ordering now)
- 4x 20 Gbps RapidIO external ports
- 4x 10 GigE external ports
- 4x Broadwell sockets in the chassis
- 320 Gbps of switching, with 3 ports to each processing mezzanine
- Compute nodes can use a PCIe to S-RIO NIC
- Co-located storage over SATA
- 10 GbE added for ease of migration

Acceleration with Mobile GPU Compute Node
Scalable architecture: GPU + interconnect for exascale
- 4x Tegra K1 GPUs on a RapidIO network
- 140 Gbps of embedded RapidIO switching
- 4x PCIe2 to RapidIO NIC silicon
- 384 Gigaflops per GPU, >1.5 Teraflops per node
- 6 Teraflops per 1U, 0.25 Petaflops per rack (aggregation worked out below)
- GPU compute acceleration
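The per-node and per-rack figures follow from simple aggregation; the GPUs-per-node count comes from the slide, while the nodes-per-1U and rack-unit values below are assumptions chosen to reproduce the quoted totals.

```python
# Reproducing the slide's throughput figures by aggregation.
GFLOPS_PER_GPU = 384
GPUS_PER_NODE = 4
NODES_PER_1U = 4     # assumption
RACK_UNITS = 42      # assumption (standard rack)

node_tflops = GFLOPS_PER_GPU * GPUS_PER_NODE / 1000      # ~1.5 TFLOPS per node
per_1u_tflops = node_tflops * NODES_PER_1U               # ~6 TFLOPS per 1U
rack_pflops = per_1u_tflops * RACK_UNITS / 1000          # ~0.25 PFLOPS per rack
print(node_tflops, per_1u_tflops, rack_pflops)
```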

Proposed: OCP Telco Low Latency Switch for Edge Computing
Target workloads: 5G, mobile edge computing, HPC, video analytics, low latency financial trading
- 0.75 Tbps, 1U 19-inch, 100 ns switch with 20 Gbps ports: 38x 20 Gbps ports, sub-200 W switching power, supports 42U rack-level scale-out
- Scale-out roadmap to 4.8 Tbps: 2U, 100 ns switch with 50 Gbps ports, 96x 50 Gbps ports, sub-400 W switching power, supports redundant ports for 42U rack and intra-rack scale-out (capacity check below)
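The headline capacities are simply port count times per-port rate, using the figures quoted on the slide:

```python
# Aggregate switch capacity = port count x per-port rate (quoted slide figures).
print(38 * 20 / 1000)   # 0.76 Tbps, marketed as 0.75 Tbps (1U, 20 Gbps ports)
print(96 * 50 / 1000)   # 4.8 Tbps (2U scale-out roadmap, 50 Gbps ports)
```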

OCP HPC Platform Use Case with RapidIO Fabric
- RapidIO low latency interconnect fabric for real-time data center analytics at CERN openlab
- Desire to leverage multi-core heterogeneous computing: x86, ARM, GPU, FPGA, DSP
- Collapse workloads and detect events with more certainty in real time, e.g. particle collisions, fraud detection
- Use industry-standard hardware from OCP
- A market is emerging for real-time compute analytics in the data center

Use of OCP HPC Designed Platforms

Accelerator Card: Networking + Acceleration
System topology: server + ToR + networking + acceleration, with 20 Gbps RapidIO top-of-rack switching.
IDT-FPGA reference design and system: connects the CPU to a 50 Gbps RapidIO network (a bandwidth comparison follows below).
(Block diagram: standard server socket, x86/ARM/OpenPower, in a 19-inch 1U or 2U rack-mounted server; PCIe Gen3 x16 through a PCIe connector to an FPGA carrying NIC RTL, IDT S-RIO 50 Gbps RTL, hard-core PCIe, MIP, DDR4, and 1588 software; IDT RXS 1632 switch with 6x 50 Gbps ports; IDT Tsi721 PCIe2 to S-RIO bridge; IDT XO and UFT timing devices.)
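For a sense of scale, here is a back-of-envelope comparison of the host link against the fabric-facing ports; the PCIe figures use the published Gen3 signaling rate and 128b/130b encoding, and the RapidIO side assumes all six switch ports run at the quoted 50 Gbps.

```python
# Back-of-envelope bandwidth comparison for the accelerator card above.
pcie_gen3_x16_gbps = 8 * (128 / 130) * 16   # ~126 Gbps per direction to the host
rapidio_ports_gbps = 6 * 50                 # 300 Gbps of fabric-facing ports
print(pcie_gen3_x16_gbps, rapidio_ports_gbps)
```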

Wrap Up
- Low latency: ~100 ns switching
- Heterogeneous computing, OCP HPC ready
- Suited to emerging analytics workloads
- Energy-efficient, high-bandwidth interconnect
- Ideal for HPC and HPC-like applications such as financial computing
- 20 Gbps per port now, 50 Gbps released, 100 Gbps in definition
(Diagram, as earlier: compute nodes with CPU, FPGA/GPU, and storage on a local interconnect fabric, connected over backplane and cable/connector to a rack-scale interconnect fabric and other nodes.)