Department of Particle & Particle Astrophysics Sea-Of-Flash Interface (SOFI): introduction and status The PetaCache Review Michael Huffer, Stanford Linear Accelerator Center, November 2, 2006

Department of Particle & Particle Astrophysics 2 Outline
- Background: history of PPA involvement; synergy with current activities
- Requirements: usage model; system requirements; individual client requirements
- Implementation: abstract model and features; building blocks
- Deliverables: packaging
- Schedule: status; milestones
- Summary: reuse; conclusions

Department of Particle & Particle Astrophysics 3 Background
- The Research Engineering Group (REG) supports a wide range of activities with limited resources: LSST, SNAP, ILC, SiD, EXO, LHC, LCLS, etc.
- Utilizing these resources most effectively requires understanding the group's core competencies and the requirements of future electronics systems.
- Two imperatives for REG: support upcoming experiments, and build for the future by advancing core competencies.
- What are more detailed examples of a couple of upcoming experiments, and what are the necessary core competencies?

Department of Particle & Particle Astrophysics 4 LSST
- SLAC/KIPAC is the lead institution for the camera; KIPAC delivers the camera DAQ system.
- The camera contains > 3 gigapixels, i.e. > 6 gigabytes of data per image; readout time is 1-2 seconds.
"The Large Synoptic Survey Telescope (LSST) is a proposed ground-based 8.4-meter, 10 square-degree-field telescope that will provide digital imaging of faint astronomical objects across the entire sky, night after night. In a relentless campaign of 15 second exposures, LSST will cover the available sky every three nights, opening a movie-like window on objects that change or move on rapid timescales: exploding supernovae, potentially hazardous near-Earth asteroids, and distant Kuiper Belt Objects. The superb images from the LSST will also be used to trace billions of remote galaxies and measure the distortions in their shapes produced by lumps of Dark Matter, providing multiple tests of the mysterious Dark Energy."

Department of Particle & Particle Astrophysics 5 SNAP
- SLAC is the lead institution for all non-FPA related electronics.
- One ground contact every 24 hours requires data to be stored on board the instrument; storage capacity is roughly 1 Terabyte (including redundancy).
- Examining NAND flash as the solution to the storage problem.
"The Supernova/Acceleration Probe (SNAP) satellite observatory is capable of measuring thousands of distant supernovae and mapping hundreds to thousands of square degrees of the sky for gravitational lensing each year. The results will include a detailed expansion history of the universe over the last 10 billion years, determination of its spatial curvature to provide a fundamental test of inflation - the theoretical mechanism that drove the initial formation of structure in the universe, precise measures of the amounts of the key constituents of the universe, Ω_M and Ω_Λ, and the behavior of the dark energy and its evolution over time."

Department of Particle & Particle Astrophysics 6 Core competencies
- System on Chip (SOC): integrated processors and functional blocks on an FPGA.
- Small-footprint, high-performance, persistent memory systems: NAND flash.
- Open-source R/T kernels: RTEMS (Real-Time Executive for Multiprocessor Systems).
- High-performance serial data transport and switching: MGTs (Multi-Gigabit Transceivers).
- Modern networking protocols: 10 Gigabit Ethernet, InfiniBand, PCI-Express.

Department of Particle & Particle Astrophysics 7 Is PetaCache consistent with the mission?
Does the project use core technology?
Project | SOC | Memory | R/T kernels | H/S transport
LSST | yes | no | yes | yes
SNAP | no | yes | no | no
PetaCache | yes | yes | yes | yes
Main Entry: syn·er·gy. Function: noun. Inflected form(s): plural -gies. Etymology: New Latin synergia, from Greek synergos, working together. 1: SYNERGISM; broadly: combined action or operation. 2: a mutually advantageous conjunction or compatibility of distinct business participants or elements (as resources or efforts).

Department of Particle & Particle Astrophysics 8 Usage model
"Lots of storage, shared concurrently by many clients, distributed over a large number of hosts" (host/client, with data storage, distribution, transport & management in between).
- System requirements: scalable in both storage capacity and number of concurrent clients; large address space; random access; support for population evolution.
- Features: changes are quasi-adiabatic ("write once, read many"), so the system can be treated as read-only.
- Requirements not addressed in this phase: access control, redundancy, cost.

Department of Particle & Particle Astrophysics 9 Client Requirements
- Uniform access time to fetch a "fixed" amount of data from storage implies a deterministic and relatively "small" round-trip latency, where "fixed" is O(8 Kbytes) and "small" is O(200 microseconds).
- This requires approximately 40 Mbytes/sec between client and storage.
- Access time scales independently of address and of the number of concurrent clients.
- There are two contributions to latency: storage access time, and distribution, transport, and management overhead. The PetaCache project focuses on the latter alone; the SOFI architecture attempts to address both.
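The bandwidth figure follows directly from the fetch size and the latency budget; a quick arithmetic check (an illustrative sketch, not project code):

```python
# One "fixed" fetch of O(8 Kbytes) returned within a "small" O(200
# microsecond) round trip implies the per-client bandwidth requirement.
page_bytes = 8 * 1024        # fetch unit
round_trip_s = 200e-6        # latency budget per fetch

mbytes_per_sec = page_bytes / round_trip_s / 1e6
print(round(mbytes_per_sec))  # ~41, i.e. the ~40 Mbytes/sec quoted above
```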

Department of Particle & Particle Astrophysics 10 Abstract model
Key features:
- Available concurrency and bandwidth scale with storage capacity: many individual "memory servers", each with 16 GBytes of memory, 40 Mbytes/sec, and an access granularity of 8 Kbytes.
- Load leveling: data is randomly distributed over the memory servers; multicast is used for concurrent addressing; caching on both the client and server side.
- Two address spaces: physical page access and logical block access; the latter hides the data distribution from the client.
- Network-attached storage: clients reach the memory servers (Flash Memory Controller, FMC) through content-addressable switching.
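The load-leveling idea — data pseudo-randomly but deterministically spread over many memory servers, so the client never needs the physical layout — can be sketched as follows (the function name and hashing scheme are illustrative assumptions, not the SOFI implementation):

```python
import hashlib

def server_for_block(partition: str, block: int, n_servers: int) -> int:
    # Deterministic pseudo-random placement: hash the logical address,
    # then reduce modulo the number of memory servers.
    key = f"{partition}/{block}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_servers

# The same logical block always lands on the same server...
assert server_for_block("p0", 7, 64) == server_for_block("p0", 7, 64)
# ...while a run of blocks spreads across (nearly) all 64 servers.
hit = {server_for_block("p0", b, 64) for b in range(1000)}
print(len(hit))
```

Adding servers then increases available concurrency and bandwidth in proportion, which is the scaling property the slide claims.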

Department of Particle & Particle Astrophysics 11 Building Blocks
[Diagram: one of n hosts, each with a client interface (SOFI) and an application-specific host interconnect, attaches over 1 Gigabit Ethernet (0.1 GByte/sec) to the network-attached storage; a Cluster Inter-Connect Module (CIM) provides 8 x 10G-Ethernet (8 GByte/sec); each Slice Access Module (SAM) connects over 10G-Ethernet and speaks PGP (Pretty Good Protocol) at 1 GByte/sec to a Four Slice Module (FSM) holding 256 GBytes of flash.]

Department of Particle & Particle Astrophysics 12 Four Slice Module (FSM)
[Block diagram, 1 x 4 slices: an FPGA provides clock, configuration, PGP & command encode/decode, in-bound/out-bound transfer and encode/decode, CRC-in/CRC-out, and in-bound/out-bound arbiters; it serves four FMCs, each an initiator with a PHY to its DIMM (8 devices, 32 GBytes).]

Department of Particle & Particle Astrophysics 13 Flash Memory Controller (FMC)
- Implemented as an IP core.
- Controls 16 GBytes of memory (4 devices) in units of pages (8 Kbytes) and blocks (512 Kbytes).
- Queues operations: read page (in units of 128-byte chunks), write page, erase block, read statistics counters, read device attributes.
- Transfers data at 40 Mbytes/sec.
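The page, block, and chunk units above imply the following counts per FMC (simple arithmetic, shown as a sketch):

```python
GB = 1 << 30
fmc_bytes = 16 * GB           # one FMC controls 16 GBytes (4 devices)
page_bytes = 8 * 1024         # page unit
block_bytes = 512 * 1024      # erase-block unit
chunk_bytes = 128             # read-page transfer chunk

pages_per_fmc = fmc_bytes // page_bytes      # 2**21 pages
pages_per_block = block_bytes // page_bytes  # 64 pages per erase block
chunks_per_page = page_bytes // chunk_bytes  # 64 chunks per page read
print(pages_per_fmc, pages_per_block, chunks_per_page)  # 2097152 64 64
```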

Department of Particle & Particle Astrophysics 14 Universal Protocol Adapter (UPA)
[Block diagram: a Xilinx XC4VFX60 FPGA (SOC) with 200 DSPs and lots of gates; a PPC-405 (450 MHz); configuration memory (128 Mbytes, Samsung K9F5608); memory (512 Mbytes, Micron RLDRAM II); 8 lanes of Multi-Gigabit Transceivers (MGT); fabric and MGT clocks; 100baseT; reset and reset options; JTAG.]
The SAM is 1/2 of a UPA pair.

Department of Particle & Particle Astrophysics 15 UPA Features
- "Fat" memory subsystem: sustains 8 Gbytes/sec, with a "plug-in" DMA interface (PIC).
- Designed as a set of IP cores, made to work in conjunction with MGT and protocol cores.
- Bootstrap loader (with up to 16 boot options and images) and an interface to configuration memory.
- Open-source R/T kernel (RTEMS), 100baseT Ethernet interface, full network stack.
"Think of the UPA as a Single Board Computer (SBC) which interfaces to one or more busses through its MGTs."

Department of Particle & Particle Astrophysics 16 UPA Customization for SAM
- Implements two cores: PGP and 10-GE; all 8 MGT lanes are used (4 lanes for the PGP core, 4 lanes for 10-GE).
- A network driver interfaces the 10G-E core to the network stack.
- Executes application code to satisfy the server side of the SOFI client interface (physical-to-logical translation, server-side caching) and the FSM management software (proxies the FMC command set, maintains bad blocks, maintains available blocks).

Department of Particle & Particle Astrophysics 17 (Cluster Inter-connect Module (CIM) (21)(16) (21)(22) (4) (8) to SAMs (high-speed) to SAMs (low-speed) to host inter-connect (management-network)to host inter-connect (data-network) High Speed Switch Data (24 x 10-GE) Fulcrum FM2224 Switch management (UPA) 10 GE (XUI) 10 GE (XUI) Low-speed Switch Management (24 x FE + 4 x GE) Zarlink ZL baseT 100 baseT 1000 baseT 100 baseT

Department of Particle & Particle Astrophysics 18 Client/Server Interface
- The client interface resides on the host; the servers reside on the SAMs. Any one client on any one host has uniform access to all flash storage, reached through the network interconnect.
- Abstract interconnect model; the delivered implementation is IP (UDP and multicast services).
- The interface delivers three types of services: random read access to objects within the store; population of objects within the store (write and erase access); access to performance metrics.
- The client interface is object-oriented (C++): a class library distributed as a set of binaries and header files.
- Two address spaces (physical & logical): the client accesses information only in logical space and is not sensitive to the actual physical location of information; population distribution is pseudo-random (static load leveling).
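The shape of the client interface — logical-only addressing, read and populate services, client-side caching, and a performance metric — might look like this in outline (the delivered interface is a C++ class library; the Python class and its names here are purely illustrative):

```python
class SofiClient:
    """Illustrative sketch of the client interface; not the real API."""

    def __init__(self, store):
        self._store = store   # stands in for the UDP/multicast transport
        self._cache = {}      # client-side cache
        self.fetches = 0      # a performance metric

    def read(self, partition, block):
        # Random read access: the client names only logical addresses.
        key = (partition, block)
        if key not in self._cache:
            self.fetches += 1
            self._cache[key] = self._store[key]
        return self._cache[key]

    def populate(self, partition, block, data):
        # Population service ("write once, read many").
        self._store[(partition, block)] = data

client = SofiClient({})
client.populate("p0", 0, b"payload")
assert client.read("p0", 0) == b"payload"
assert client.read("p0", 0) == b"payload" and client.fetches == 1  # cached
```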

Department of Particle & Particle Astrophysics 19 Interconnect Addressing
- Physical addressing (1 page = 8 Kbytes): interconnect / controller / manager / slice / page fields, giving 2^0 x 2^32 x 2^2 x 2^2 x 2^21 = 2^57 pages (128 peta-pages, i.e. 1M peta-bytes).
- Logical addressing (1 block = 8 Kbytes): interconnect / partition / bundle / block fields.
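Taking the physical-address field widths on this slide at face value (2^0 x 2^32 x 2^2 x 2^2 x 2^21; the exact field-to-width assignment below is an assumption of this sketch), the totals can be checked:

```python
# Assumed field widths in bits, read from the slide's physical-address
# breakdown; the page field alone (2**21 pages of 8 Kbytes) matches the
# 16 GBytes controlled by one FMC.
field_bits = {"controller": 0, "interconnect": 32,
              "manager": 2, "slice": 2, "page": 21}

total_pages = 1 << sum(field_bits.values())   # 2**57 pages
total_bytes = total_pages * 8 * 1024          # 8 Kbyte pages
PB = 1 << 50
print(total_bytes // PB)  # 2**20 PB, the "1M peta-bytes" of the slide
```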

Department of Particle & Particle Astrophysics 20 Using the interface
- A partition is a management tool: it logically segments storage into disjoint sets. There is a one-to-one correspondence between a partition and a server, and one SAM may host more than one server.
- A bundle is an organizational tool: a bundle belongs to one (and only one) partition, and is an access-pattern hint that allows fetch look-ahead and optimization of overlapping fetches from different clients.
- Both partitions and bundles are assigned unique identifiers (over all time); identifiers may have character names (aliases), assigned at population time.
- A client query is composed of partition/bundle/offset/length, where the offset is expressed in units of blocks and the length in units of bytes. A client may query by either identifier or alias.
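Since the offset is given in blocks and the length in bytes, resolving a query to a byte span is a small conversion (a hypothetical helper, not part of the delivered API):

```python
BLOCK_BYTES = 8 * 1024  # one logical block

def query_span(offset_blocks: int, length_bytes: int) -> tuple[int, int]:
    # A query names partition/bundle/offset/length; the partition and
    # bundle select the object, and this converts offset/length into
    # the byte range covered within it.
    start = offset_blocks * BLOCK_BYTES
    return start, start + length_bytes

print(query_span(3, 10_000))  # (24576, 34576)
```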

Department of Particle & Particle Astrophysics 21 Deliverables
- Two FSMs (8 slices): 1/2 TByte.
- Two SAMs: enough to support FSM operations.
- The client/server interface (SOFI), targeted to Linux.
How will the hardware be packaged? Here packaging is defined as how the building blocks are partitioned, plus the specification of the electro-mechanical interfaces.

Department of Particle & Particle Astrophysics 22 The "Chassis"
[Mechanical sketch: an 8U chassis with a supervisor card (8U), line cards (4U), a passive backplane, X2 (XENPAK MSA) optics, a 1U fan tray, a 1U air inlet, and a 1U air outlet; accepts DC power.]
- 2 FSMs/card: 1/2 TByte
- 16 cards/bank: 8 TBytes
- 2 banks/chassis (64 SAMs, 1 CIM): 16 TBytes
- 3 chassis/rack: 48 TBytes
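The packaging numbers multiply out as claimed; a quick check:

```python
fsm_gb = 256                  # one Four Slice Module
card_gb = 2 * fsm_gb          # 2 FSMs/card    -> 1/2 TByte
bank_gb = 16 * card_gb        # 16 cards/bank  -> 8 TBytes
chassis_gb = 2 * bank_gb      # 2 banks/chassis -> 16 TBytes
rack_gb = 3 * chassis_gb      # 3 chassis/rack -> 48 TBytes
print(chassis_gb // 1024, rack_gb // 1024)  # 16 48
```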

Department of Particle & Particle Astrophysics 23 48 TByte facility
[Diagram: one chassis and the xRootD servers attach to the SOFI hosts (1 x 96) through a Catalyst 6500 (3 x 4 10GE, 2 x 48 1GE).]

Department of Particle & Particle Astrophysics 24 Schedule/Status
Methodology:
- Hardware: implement 3 "platforms", one for each type of module, decoupling packaging from architectural & implementation issues. This evaluates layout issues concerning high-speed signals, evaluates potential packaging solutions, and allows concurrent development of VHDL & CPU code: the IP protocol implementation, the client API, logical/physical translation, cache management, the FSM interface, and the "wire", on both the SAM and host sides.
- Software: emulate the FSM component of the server software, allowing it to be completed and debugged in the absence of hardware and giving clients an "early look" at the interface.

Department of Particle & Particle Astrophysics 25 Evaluation platforms
- UPA: memory subsystem; bootstrap loader; configuration memory; RTEMS; network stack/network driver interface issues.
- CIM: low- and high-speed management; evaluation of different physical interfaces (including X2).
- FSM line card (depending on packaging, this could be the production prototype): FMC debug; PGP debug.

Department of Particle & Particle Astrophysics 26 Schedule
[Gantt chart, October through March: design-specification, implementation, schematic, layout, debug, and spin/load activities covering the chassis/mechanical work, PIC, RTEMS/UPA, UPA/PGP, UPA/10GE driver, UPA/10GE MAC, and SOFI; products: UPA platform (1), CIM platform (2), line card PCB (3), supervisor PCB (4), backplane (5).]

Department of Particle & Particle Astrophysics 27 Milestones
Milestone | Date
RTEMS running on UPA evaluation platform | 2nd week of December 2006
SOFI (emulation) ready | 3rd week of January 2007
Supervisor PCB ready for debug | 3rd week of January 2007
Chassis & PCBs complete | 3rd week of February 2007
Start test & integration | 2nd week of March 2007

Department of Particle & Particle Astrophysics 28 Status
Products (specification / design / implementation):
- SOFI: in progress
- DIMM: -
- FCS: -
- FSM: in progress
- SAM: in progress
- CIM: in progress
- UPA: in progress
- PGP core: specification ✓
- 10-GE core: specification ✓, design ✓
- The "chassis": specification ✓, design ✓

Department of Particle & Particle Astrophysics 29 Products & Reuse
Targeted for use? (Petacache / LSST Camera DAQ / SNAP / LCLS DAQ / Atlas Trigger Upgrade)
- UPA: yes / yes / no / yes / yes
- 10-GE core: yes / yes / no / yes / yes
- PGP core: yes / yes / no / yes / yes
- FCS: yes / yes / no / yes / no
- CIM: yes / yes / no / yes / yes
- FSM: yes / no / no / no / no
- SAM: yes / no / no / no / no
- DIMM: yes / no / no / no / no
- SOFI: yes / no / no / no / no
- The "chassis": yes / maybe / no / no / maybe

Department of Particle & Particle Astrophysics 30 Conclusions
- A robust and well-developed architecture: concurrency and bandwidth scale as storage is added; the logical address space hides the actual data distribution from the client; network-attached storage; scalable in both size and number of users.
- The packaging solution may need an iteration.
- The schedule is somewhat unstable; however, the sequence and activities are to a large degree correct, the risk is in the development of 10-GE, and implementation is well along.
- Well-developed synergy between PetaCache and the current activities of ESE: a great mechanism to develop core competencies, and many of the project deliverables are directly usable in other experiments.