Low Latency Analytics and Edge Connected Vehicles with RapidIO Interconnect Devashish Paul, Director Strategic Marketing June 2016.

Presentation transcript:

Low Latency Analytics and Edge Connected Vehicles with RapidIO Interconnect Devashish Paul, Director Strategic Marketing June 2016 © Integrated Device Technology

IDT Company Overview. Mixed-signal application-specific solutions. #1 Serial Switching: 100% 4G Infrastructure with RapidIO; #1 Memory Interface: Industry Leader DDR4; #1 Silicon Timing Devices: Broadest Portfolio; 800+ Issued and Pending Patents Worldwide.

Low Latency Analytics and Edge Connected Vehicles with RapidIO. Agenda: Computing Trends/Distributed Computing; RapidIO Gbps Technology; 100 ns Low Latency Analytics Architectures; CERN OpenLab collaboration; Analytics examples with low latency RapidIO; Connected Vehicles Use Case.

5G Mobile Edge Computing + Edge Connected Appliances. The Network is the Data Center. Ecommerce; Fleet Management; Semi Autonomous Vehicles; Traffic Management. Low latency, energy efficient analytics workloads at the network edge. RapidIO; IEEE 1588 Timing; RF Products; Memory Interface; Retimers; Sensors.

The Network is the Data Center. Ubiquitous Computing: IDT connects, synchronizes, times and makes sense of the human-and-machine connected world.

Network and Data Center Convergence. Apps moving from the data center to co-locate with the access node (base station or wired access node); Supporting real time communication to mobile devices (phones, cars, IoT); Tight time synchronization between apps running on distributed servers and in the data center; Need low latency interconnect; Edge Computing an essential element of 5G rollouts. Diagram labels: Wired and Wireless Access Nodes; Servers/Analytics.

Mobile Edge Computing. Towards 5G/MEC: co-located CPU and accelerators. Diagram labels: PPC; GPU; Data Center Servers; Cloud RAN; 4G BTS.

RapidIO Interconnect combines the best attributes of PCIe® and Ethernet in a multi-processor fabric. Attributes shown for Ethernet and PCIe: Scalability; Low Latency; Hardware Terminated; Guaranteed Delivery; Ethernet in software only. Clustering Fabric Needs: Lowest Deterministic System Latency; Scalability; Peer to Peer / Any Topology; Embedded Endpoints; Energy Efficiency; Cost per performance; HW Reliability and Determinism.

Heterogeneous Edge Analytics with RapidIO. Heterogeneous compute workloads; No protocol termination CPU cycles; Energy efficiency; 20 to 50 Gbps embedded interconnect; Scalable fat node connects multiple boards in an Edge Appliance; Push video encode/decode, streaming, rendering and analytics into the network edge; Edge analytics require low latency compute for HPC-like compute. Diagram labels: Video server; Board level RapidIO Interconnect Fabric; FPGA/GPU accelerators; CPU; Storage; Cable / connector; Backplane; RapidIO Backplane Interconnect Fabric; RapidIO Rack Scale Interconnect Fabric; Other nodes.

10/20/40/50 Gbps per port, 6.25/10/12.5 Gbps per lane; 100+ Gbps interconnect in definition; Hardware termination at PHY layer: 3 layer protocol; Lowest latency interconnect, ~100 ns; Inherently scales to large systems with 1000s of nodes; Over 15 million RapidIO switches shipped, > 2x Ethernet (10GbE); Over 110 million Gbps ports shipped; 100% 4G interconnect market share.

Supported by Large Ecosystem (02/09/16). © 2015 RapidIO.org

RapidIO Ecosystem and Market Progression. Performance evolves for multiple system platform generations: 1st Gen Standard; 2nd Gen Standard; 3rd Gen Standard; 4th Gen Standard. Space Interconnect; Bridging; Data Center Computing; Storage. RapidIO Gen3 (10xN) released with path to 25 Gbaud.

NASA Space Interconnect Standard. Next Generation Spacecraft Interconnect Standard (NGSIS); RapidIO selected from InfiniBand / Ethernet / Fibre Channel / PCIe. NGSIS members: BAE, Honeywell, Boeing, Lockheed Martin, Sandia, Cisco, Northrop Grumman, Loral, LGS, Orbital Sciences, JPL, Raytheon, AFRL.

Open Compute Project High Performance Computing (02/09/16). IDT co-chairs the Opencompute.org HPC Initiative.

RapidIO.org ARM 64-bit Scale Out Group. 10s to 100s of cores and sockets; ARM AMBA® protocol mapping to RapidIO protocols: AMBA 4 AXI4/ACE mapping to RapidIO protocols, AMBA 5 CHI mapping to RapidIO protocols; Migration path from AXI4/ACE to CHI and future ARM protocols; Supports heterogeneous computing; Supports Wireless/HPC/Data Center applications. Latency; Scalability; Hardware Termination; Energy Efficiency. Source: Linley Processor Conference. www.rapidio.org

Better Link Utilization. RapidIO offers better link utilization. RapidIO Link Rate: Gbps (Gen2 x4) without encoding overhead; >90% utilization.

Low CPU Overhead. RapidIO keeps the CPU available for the actual user application: RapidIO keeps CPU overhead below 1%. Test setup: IDT RapidIO 20G PCIe-RapidIO NIC and Intel x86 Ethernet; Ethernet Intel driver software with TCP/IP; Intel x86 CPU x5560 and Intel X520 2x10GbE PCIe-Ethernet NIC. RapidIO CPU overhead is less than 1%, leaving CPU capacity available for user processing.

Significant Power Savings. RapidIO offers significant power savings: low power PCIe-RapidIO NIC with low CPU overhead. Test setup: IDT RapidIO 20G PCIe-RapidIO NIC and Intel x86 Ethernet; Ethernet Intel driver software; Intel x86 CPU x5560 and Intel X520 2x10GbE PCIe-Ethernet NIC. RapidIO saves ~15 W per CPU, a power saving of around 1.2 kW per rack (80 sockets).
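The rack-level saving quoted on this slide follows from the per-CPU saving by multiplication. A minimal sketch of that arithmetic, using only the two figures on the slide; the function name is illustrative and does not come from the deck:

```python
def rack_power_saving_kw(saving_per_cpu_w: float, sockets_per_rack: int) -> float:
    """Return the rack-level power saving in kW for a given per-socket saving in watts."""
    return saving_per_cpu_w * sockets_per_rack / 1000.0


# Figures quoted on the slide: ~15 W saved per CPU, 80 sockets per rack.
print(f"{rack_power_saving_kw(15.0, 80):.1f} kW per rack")
```

With the slide's inputs this gives 1.2 kW per rack, matching the quoted figure.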

20 Gbps RapidIO2 Analytics Kit. PCIe-RapidIO NIC (Tsi721); RapidIO Switch Unit (CPS1432/CPS1848); Driver Software, generally available May 20, 2016. Software stack diagram labels: User Space; Kernel Space; Fabric Management; TCP/IP; Remote Memory Access; Device Drivers; Hardware (NIC/Switch).

RapidIO Interconnect at CERN OpenLab

RapidIO at CERN OpenLab: LHC and Data Center. RapidIO low latency interconnect fabric; Heterogeneous computing; Large scalable multi-processor systems; Desire to leverage multi-core x86, ARM, GPU, FPGA and DSP with a uniform fabric; Desire for programmable upgrades during operations before shutdowns.

CERN OpenLab ROOT Analytics on RapidIO

CERN OpenLab Event Building Use Case on RapidIO

Analytics Optimized 100 ns 50 Gbps Switch. RXS2448: 600 Gbps full-duplex Serial RapidIO® switch; 50 Gbps per port; 33 x 33 mm package; 48 lanes at 12.5 Gbps; Up to 24 Serial RapidIO ports; RapidIO Specification (Rev 3.2) compliant. RXS1632: 400 Gbps full-duplex Serial RapidIO switch; 50 Gbps per port; 29 x 29 mm package; 32 lanes at 12.5 Gbps; Up to 16 Serial RapidIO ports; RapidIO Specification (Rev 3.2) compliant. Block diagram labels: Clocking & Reset; Power Mgmt; JTAG; I2C; Maintenance; Buffer and Control; Scheduler and Switch Fabric; Stack Port0/1 SerDes (repeated). 50 Gbps per port | 300 mW per 10 Gbps data | 100 ns latency. ©2016 IDT.
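The headline bandwidth figures for the two switches can be cross-checked from the lane counts and lane rates on the slide. The sketch below assumes that a 50 Gbps port is four lanes wide (12.5 Gbps x 4), which the slide does not state explicitly, and it extrapolates the "300 mW per 10 Gbps" figure to a fully loaded device, which is an illustration rather than a datasheet value. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchSpec:
    name: str
    lanes: int                 # SerDes lanes, from the slide
    lane_rate_gbps: float      # per-lane rate, from the slide
    lanes_per_port: int = 4    # ASSUMPTION: a 50 Gbps port is a x4 port

    @property
    def aggregate_gbps(self) -> float:
        """Total bandwidth across all lanes."""
        return self.lanes * self.lane_rate_gbps

    @property
    def port_rate_gbps(self) -> float:
        """Bandwidth of one port at the assumed width."""
        return self.lanes_per_port * self.lane_rate_gbps

    @property
    def full_width_ports(self) -> int:
        """Number of ports available at the assumed width."""
        return self.lanes // self.lanes_per_port

    def estimated_power_w(self, mw_per_10gbps: float = 300.0) -> float:
        """ASSUMPTION: scale the per-10-Gbps power figure linearly to full load."""
        return self.aggregate_gbps / 10.0 * mw_per_10gbps / 1000.0


for spec in (SwitchSpec("RXS2448", 48, 12.5), SwitchSpec("RXS1632", 32, 12.5)):
    print(spec.name, spec.aggregate_gbps, spec.full_width_ports,
          spec.port_rate_gbps, spec.estimated_power_w())
```

Under these assumptions the RXS2448 works out to 600 Gbps across twelve 50 Gbps ports and the RXS1632 to 400 Gbps across eight, consistent with the aggregate figures on the slide. The "up to 24" and "up to 16" port counts would then correspond to narrower two-lane ports.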

Scalable Low Latency Analytics Server Fat Node. 20 to 50 Gbps embedded ports; 300 mW per 10 Gbps; 100 ns latency; Distributed switching; Direct connection to BBU. Diagram labels: Edge Node with I/O Hub; Edge Node Native Interconnect; CPU/Accelerator; FPGA/GPU; CPU; Storage; I/O Hub; Local Interconnect Fabric; Cable / Connector; 20 to 50 Gbps RapidIO Ports; RapidIO Top of Rack Switching.

Edge Analytics Systems. Application issues solved: 50 Gbps per port with 95% link utilization; 100 ns latency; Power efficient, 300 mW per 10 Gbps. Distributed low latency switching, optimized for the needs of edge analytics. Diagram labels: Basestation (RXS, SoC, FPGA, DSP, Connector, BBU); C-RAN Backplane; RXS Fabric Cluster for Rack Switching; Server (RXS, CPU, FPGA/GPU); Server Backplane; Edge Computing Appliance.

Low Latency Switch for Edge Computing Scale Out. Available now: 0.75 Tbps 1U 19 inch 100 ns switch with 20 Gbps ports; x 20 Gbps ports; Sub 200 W switching power; Supports 42U rack level scale out. Roadmap to 4.8 Tbps 2U 100 ns switch with 50 Gbps ports: 96 x 50 Gbps ports; Sub 400 W switching power; Supports redundant ports to 42U rack and intra rack scale out; 2H. G | Mobile Edge Computing | HPC | Video Analytics | Low Latency Financial Trading.
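The Tbps figures for the two switch boxes are port count times port rate. A short sketch of that check follows. The port count for the 1U switch is missing from this slide's text, so the value 38 used below is an assumption taken from the "38x20" ToR switch described on later slides; the function name is illustrative.

```python
def switch_capacity_tbps(ports: int, port_rate_gbps: float) -> float:
    """Aggregate switching capacity in Tbps for a given port count and per-port rate."""
    return ports * port_rate_gbps / 1000.0


# ASSUMPTION: 38 ports, borrowed from the "38x20" ToR switch on later slides.
current_1u = switch_capacity_tbps(38, 20.0)
# 96 x 50 Gbps ports is stated on this slide for the 2U roadmap switch.
roadmap_2u = switch_capacity_tbps(96, 50.0)

print(f"1U switch: {current_1u:.2f} Tbps, 2U roadmap switch: {roadmap_2u:.1f} Tbps")
```

This yields 0.76 Tbps for the 1U switch, which the slide rounds to 0.75 Tbps, and 4.8 Tbps for the roadmap switch.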

12x 50 Gbps per port Switch Silicon

x86 Edge Based Analytics Appliance. Key components: 19 inch 1U ToR, 38x20 port ToR switch; 19 inch 1U networking server with built-in 320 Gbps RapidIO switching; Quad Intel Broadwell sockets; 4 TB storage in 8 bays; 320 Gbps built-in switching in video server; 100 ns latency per switch hop; Up to 160 sockets per rack with any to any connectivity; TCP/IP software driver for RapidIO + IDT open source software; Available Q. Analytics Server and ToR. RapidIO | x86 | Low Latency | Energy Efficient.

GPU Acceleration Nodes (02/09/16). Low Latency GPU Analytics Module. Computing: 4 x Nvidia Tegra K1 GPU per compute slot; 16x per 19 inch 1U server; Analytics acceleration or cloud based rendering of gaming content. Interconnect fabric: RapidIO switching + 4x RapidIO Gen2 NIC. Performance: 1.28 Tflops per compute slot (4 GPU); 140 Gbps switching fabric; 100 ns RapidIO switching latency.
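The compute figures on this slide can be scaled from slot level to server level. The sketch below assumes that "16x per 19 Inch 1U server" means 16 GPUs per server, that is, four compute slots of four GPUs each. The slide is ambiguous on this point, so treat the server-level number as illustrative. Function names are hypothetical.

```python
def per_gpu_tflops(slot_tflops: float, gpus_per_slot: int) -> float:
    """Throughput of a single GPU, derived from the per-slot figure."""
    return slot_tflops / gpus_per_slot


def server_tflops(slot_tflops: float, gpus_per_slot: int, gpus_per_server: int) -> float:
    """Throughput of a whole server, scaling the per-GPU figure linearly."""
    return per_gpu_tflops(slot_tflops, gpus_per_slot) * gpus_per_server


# From the slide: 1.28 Tflops per compute slot of 4 GPUs.
# ASSUMPTION: "16x" means 16 GPUs per 1U server.
print(f"{per_gpu_tflops(1.28, 4):.2f} Tflops per GPU")
print(f"{server_tflops(1.28, 4, 16):.2f} Tflops per 1U server")
```

Under that reading, each GPU contributes 0.32 Tflops and a full 1U server reaches 5.12 Tflops.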

GPU+ARM RapidIO Video Analytics. RapidIO Switch Appliance; Low power ARM+GPU cluster with RapidIO; ~29 frames/sec/node; ~100 ns RapidIO switching latency; > TFlops processing/node; ~4 W per GPU node.

Edge Video Analytics: IBM Power8/Nvidia Tesla/RapidIO Switching. Building the OpenPower Edge Node: Low latency RapidIO ToR switching, 38x20 20 Gbps; 100 ns IDT RapidIO switch latency, 95% link utilization; Multi socket Power8 servers; 38 sockets per rack, any to any compute for video analytics; Option to add Nvidia Tesla class GPU for acceleration per Power8 node; IDT Versaclock5 timing. Roadmap to 4.8 Tbps 2U 100 ns switch with 50 Gbps ports.

Edge Social Data Analytics. Analyze user impressions on the World Cup Final 2014 (Germany/Argentina): HPAC Lab project to analyze World Cup 2014 Twitter data using Hadoop and visualize it using Tableau Public on the HPAC Platform. Orange Silicon Valley.

5G Lab Germany: Edge Analytics for Autonomous Vehicles. Low Latency Edge Computing Key to Tactile Internet.

Automotive Sensor and Sensor Fusion. Keys to next-generation automotive designs: IDT now offers an even greater array of solutions; Automotive sales channel provides immediate leverage for existing IDT products.

5G Lab Germany: Edge Analytics for Autonomous Vehicles. Network edge connected vehicles: HPC-like workload and network edge; GPU/x86/ARM/OpenPower based analytics; Low latency RapidIO fabric; In-vehicle sensor fusion in real time with low latency; Leverage OCP innovations (Edge Appliance and ToR); Multi-vehicle sensors analysed at the network edge by "Supercomputing at the Edge". Diagram labels: Autonomous Vehicle; Video Analytics/Object Recognition; Deep Learning/Object Analytics.

Connected Vehicle Edge Analytics Workloads. Offload use cases: Fleet Management; Municipal Self Driving Snow Removal; Police Services; Traffic Management; Weather/Hazard Avoidance. Low latency, energy efficient analytics workloads at the network edge: Many workloads are HPC-like, requiring low latency; Requires mission critical reliability at the appliance level; Leverages the 5G network and associated handoffs between 5G nodes; RapidIO is a key enabler for this edge use case.

RapidIO Analytics and Connected Vehicles. Low latency servers: 20 to 50 Gbps RapidIO interconnect with 100 ns latency. Heterogeneous computing: x86, OpenPower, ARM, with GPU and FPGA assist. Push video encode/decode, streaming, rendering and analytics into the network edge. HPC-like workloads in the data center, edge and vehicles. Diagram labels: Video server; Board level RapidIO Interconnect Fabric; FPGA/GPU accelerators; CPU; Storage; Cable / connector; Backplane; RapidIO Backplane Interconnect Fabric; RapidIO Rack Scale Interconnect Fabric; Other nodes.

Backup

Packet Structure – PHY Layer PHY Layer Fields

Packet Structure – Transport Layer Transport Layer Fields

Packet Structure – Logical Layer Logical Layer Fields