EISCAT-3D Data System Specifications and Possible Solutions
Ian McCrea, Rutherford Appleton Laboratory, Chilton, Oxfordshire, United Kingdom
25 May 2009

Principles

EISCAT-3D is very different to EISCAT
- Much more low-level data
- Continuous operation, unattended remote sites
- Interferometry as well as standard incoherent scatter
- Lots of supporting instruments

Store data at the lowest practical level
- Analysis can be done directly from samples
- Any pre-processing reduces flexibility
- A wide range of applications and techniques

Data volumes are very large
- We can't store the lowest-level data forever
- Keep them until they are "optimally processed"
- Keep a set of correlated data forever (as now)

Types of Data

Incoherent scatter
- Continuous, complex, amplitude-domain data
- Two polarisation streams per beam
- 80 MHz sampling at 16 bits
- Bandpassing is possible, but the limit is set by the modulation bandwidth

Interferometry
- Continuous on a limited number of baselines
- Don't record if nothing is happening, but we need the ability to "backspace" and "run on" around events (sketched below)
- Save data until the optimum brightness function has been made and transferred to the archive

Supporting instruments
- EISCAT-3D will attract many supporting instruments, using the same data system
- Some data sets are big (e.g. imagers) but not always interesting
- Suitable for a mixture of short-term buffer and permanent archive
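How "backspace and run on" could work is illustrated by the minimal Python sketch below: samples are held in a short rolling pre-trigger buffer, and only when a detection threshold is crossed are the buffered lead-in samples, the event itself and a post-event "run on" window passed to storage. The function and parameter names, and the threshold logic, are illustrative assumptions, not the project's detection algorithm.

    from collections import deque

    def capture_events(samples, threshold, pre_len, post_len):
        """Yield windows of (pre-trigger + event + run-on) samples.

        pre_len samples of "backspace" are kept in a rolling buffer so the
        lead-in to an event is not lost; post_len samples of "run on" are
        appended after the last above-threshold sample before the event closes.
        """
        pre = deque(maxlen=pre_len)        # rolling pre-trigger buffer
        event, run_on = [], 0
        for s in samples:
            if event:                      # an event is open: keep running on
                event.append(s)
                run_on = post_len if abs(s) > threshold else run_on - 1
                if run_on <= 0:
                    yield event
                    event = []
            elif abs(s) > threshold:       # trigger: start the event with the backspace
                event = list(pre) + [s]
                run_on = post_len
            pre.append(s)
        if event:                          # flush an event still open at the end
            yield event

    # Example: a noise stream containing one burst
    import random
    stream = [random.gauss(0, 1) for _ in range(1000)]
    stream[500:520] = [10.0] * 20
    for ev in capture_events(stream, threshold=5.0, pre_len=50, post_len=50):
        print("captured event of", len(ev), "samples")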

Types of Archive

Ring buffer
- High volume (~100 TB), short duration (hours to days)
- Data accumulate constantly; the oldest data are over-written (sketched below)
- Records IS data, and interferometry data when events are detected
- Needs to hold latent archive data in the event of a network outage

Interferometry system
- Small storage area (~100 GB), holding only the past few minutes of data
- Data accumulate constantly and are tested against a threshold
- If an event is detected, divert to the ring buffer (for backspacing); otherwise delete

Permanent archive
- Large-capacity (~1 PB) permanent archive
- Mid- and high-level data at 200 TB/year
- Tiered storage, connected to a multi-user computing facility
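The over-writing behaviour of the ring buffer is just that of a fixed-capacity circular buffer, as in the minimal Python sketch below; the class name, block size and slot count are illustrative choices, not part of the specification.

    class RingBuffer:
        """Fixed-capacity buffer: once full, each write over-writes the oldest block."""
        def __init__(self, capacity_blocks):
            self.buf = [None] * capacity_blocks
            self.next = 0                  # index of the slot to over-write next
            self.count = 0

        def write(self, block):
            self.buf[self.next] = block
            self.next = (self.next + 1) % len(self.buf)
            self.count = min(self.count + 1, len(self.buf))

        def oldest_first(self):
            """Return the stored blocks from oldest to newest."""
            if self.count < len(self.buf):
                return self.buf[:self.count]
            return self.buf[self.next:] + self.buf[:self.next]

    # Example: 48 slots of "hourly" ~1 TB blocks, i.e. roughly two days of full-rate data
    rb = RingBuffer(48)
    for hour in range(100):
        rb.write(f"hour-{hour}")
    print(rb.oldest_first()[0])            # "hour-52": hours 0-51 have been over-written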

Data System Overview

Some Data Rates

Lowest-level data
- 80 MHz at 16 bits: 2.56 Gb/s per element, 4x10^13 b/s for the whole array (!)
- Impossible to store
- Combine by group (49 antennas), then into <10 beams
- Each beam is ~25 TB/day (still the same order as the LHC); the figures are worked through below

Central site
- Only one signal beam (because of the transmitter)
- Calibration beam(s) will be a small data volume
- Approximately 1 TB/hour (320 MB/s)

Remote sites
- 5-10 beams, but limited by the beam intersections
- The challenge is of the same order as at the central site
- Identical ring buffers are needed at all sites
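The headline rates on this slide hang together, as the short Python check below shows; only the sampling rate, bit depth, two polarisation streams and the quoted array total are taken from the slide, the rest is arithmetic.

    # Per-element rate: 80 MHz sampling, 16 bits, two polarisation streams
    element_rate_bps = 80e6 * 16 * 2               # 2.56e9 b/s = 2.56 Gb/s per element
    elements_implied = 4e13 / element_rate_bps     # ~1.6e4 elements behind the quoted 4x10^13 b/s

    # One beam carries the same order of rate once antenna groups are combined
    beam_rate_Bps = element_rate_bps / 8           # 3.2e8 B/s = 320 MB/s
    per_hour_TB = beam_rate_Bps * 3600 / 1e12      # ~1.15 TB/hour
    per_day_TB = beam_rate_Bps * 86400 / 1e12      # ~27.6 TB/day, i.e. the quoted ~25 TB/day order

    print(f"{element_rate_bps/1e9:.2f} Gb/s per element, ~{elements_implied:.0f} elements implied")
    print(f"{beam_rate_Bps/1e6:.0f} MB/s per beam = {per_hour_TB:.2f} TB/hour = {per_day_TB:.1f} TB/day")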

Band-passing and Bandwidths

80 MHz sampling oversamples a 30 MHz band
- But not all of this band contains data
- N ion lines + 2N plasma lines
- Ion lines are ~50 kHz wide, plasma lines ~100 kHz
- So it seems that we can bandpass

Bandpassing is limited by the modulation bandwidth
- The received spectrum is the convolution of the backscatter spectrum and the pulse spectrum
- Shorter pulses/bauds have a higher bandwidth
- Some codes at EISCAT have a 500 kHz modulation bandwidth

Bandpassing depends on the coding
- Worthwhile to bandpass for standard codes
- But we need an algorithm to set the pass bands (a rough budget is sketched below)
- Design for the worst case (no bandpass)
- Don't forget interferometry
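A rough pass-band budget, as a Python sketch: each line has to be widened by the modulation bandwidth, so the fraction of the 30 MHz band that must be kept depends strongly on the code. The number of beams/frequencies N and the narrow-code bandwidth are illustrative assumptions; the 500 kHz figure is the worst case quoted on the slide.

    N = 10                                   # assumed number of beams/frequencies (illustrative)
    ion_width, plasma_width = 50e3, 100e3    # ~50 kHz ion lines, ~100 kHz plasma lines
    sampled_band = 30e6                      # the 30 MHz band covered by 80 MHz sampling

    for mod_bw in (25e3, 500e3):             # an assumed narrow standard code vs a 500 kHz code
        total = N * (ion_width + mod_bw) + 2 * N * (plasma_width + mod_bw)
        print(f"mod. bandwidth {mod_bw/1e3:.0f} kHz: pass bands ~{total/1e6:.2f} MHz "
              f"({100*total/sampled_band:.0f}% of the 30 MHz band)")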

Higher-Level Data Rates

Interferometry data
- 19 modules tested (202 MB/s, 17 TB/day)
- But maybe only 5% of samples are above threshold
- Five minutes of data = 60 GB (the lead-in); checked below

Permanent archive
- Continue to store lag profiles
- Ability to store limited raw data
- Data from supporting instruments
- The current archive grows by a few TB/year
- We want better time/range resolution
- Allow the archive to grow at >1 TB/year
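These interferometry numbers are internally consistent, as the quick Python check below shows; only the 202 MB/s rate and the 5% figure are taken from the slide.

    rate_MBps = 202                               # measured rate from the 19-module test
    per_day_TB = rate_MBps * 1e6 * 86400 / 1e12   # ~17.5 TB/day at full rate
    five_min_GB = rate_MBps * 1e6 * 300 / 1e9     # ~60.6 GB held as the five-minute lead-in
    kept_TB = per_day_TB * 0.05                   # ~0.9 TB/day if only ~5% of samples are kept

    print(f"{per_day_TB:.1f} TB/day, {five_min_GB:.1f} GB per five minutes, ~{kept_TB:.1f} TB/day kept")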

Supporting Instruments

We examined the data rates from a variety of instruments
- Other radars (coherent scatter, meteor)
- Lidars and advanced sounders
- Passive optical (high-resolution imagers)
- Radio instruments (riometers, VLF, GPS)
- Magnetometers

Advanced instruments at the central site only
- The remote sites are unattended, therefore:
- No instruments needing manual intervention
- No huge data sets allowed

Data volumes can still be large
- High-resolution cameras can produce hundreds of GB/day
- But not all of the data are interesting
- Design for 150 GB/day at the central site and 30 GB/day at the remotes

Approach to Vendors

The ring buffer and central archive are very different
- They need completely different technical solutions

The interferometer is a subset of the ring buffer
- The same problem, but with smaller data sets

Based on the data rates, we produced specifications
- Two questions for manufacturers:
- Are the requirements achievable now?
- What kinds of technologies can be used to achieve them?

Approaches to manufacturers began at Storage Expo 2007
- ~20 companies contacted
- Significant discussions with ~10

Specifications: Ring Buffer

A minimum of 56 TB of short-term storage (sizing checked below)
- IS and interferometry both produce ~1 TB/hour
- ~2 days of full-bandwidth ISR data
- ~2 weeks of bandpassed data
- Several months of high-level data (weather latency)

4 input and 8 output channels at 160 MB/s
- Somebody needs to read these data!

38 input and 38 output channels at 6 MB/s
- For less demanding tasks, including monitoring

Identical systems at the central site and the remotes
Power draw <300 kW
System management, monitoring tools, warranties
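The buffer sizing follows from the data rates quoted earlier, as the Python check below shows; the only new step is the assumption that "two weeks of bandpassed data" fills the same 56 TB.

    buffer_TB = 56
    full_rate_TB_per_hour = 1.0                     # ~1 TB/hour of full-bandwidth IS data

    full_days = buffer_TB / (full_rate_TB_per_hour * 24)
    print(f"{full_days:.1f} days of full-bandwidth data fit in {buffer_TB} TB")   # ~2.3 days

    # Working backwards, two weeks of bandpassed data in the same buffer implies
    bandpassed_TB_per_day = buffer_TB / 14
    print(f"a bandpassed rate of ~{bandpassed_TB_per_day:.0f} TB/day")            # ~4 TB/day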

Specifications: Central Archive

1 PB of usable initial storage
- Initially a five-year archive
- 400 TB on-line
- 600 TB "near on-line", e.g. a tape library
- Extensible at 200 TB/year

Input of 300 MB/s over all channels
- >20 TB/day, which allows fast filling (checked below)

Output of 1 TB/s over all channels
- 100 output channels
- Assume we have lots of simultaneous users
- 2 high-volume output channels

Power draw <300 kW
System management, monitoring tools, warranties
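Two quick consistency checks on these figures (a Python sketch using only the numbers on this slide; whether the 1 PB was in fact sized as five years of growth is my reading, not a stated rule):

    input_MBps = 300
    per_day_TB = input_MBps * 1e6 * 86400 / 1e12
    print(f"300 MB/s sustains {per_day_TB:.1f} TB/day")            # ~25.9 TB/day, i.e. >20 TB/day

    growth_TB_per_year, years = 200, 5
    print(f"{growth_TB_per_year * years} TB over {years} years")   # 1000 TB, consistent with the 1 PB five-year figure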

Solutions: Ring Buffer

Two solutions for multiple input channels
- Keep each channel separate (lots of channels)
- Multiplex into a few high-rate channels
- The second solution is probably better
- e.g. 20 channels multiplexed into 8 links of 6 GB/s

Multiple drives, multiple enclosures
- 10 drive enclosures
- Each enclosure 50% filled (300 x 300 GB SAS drives in total)
- Resulting capacity 90 TB, of which 72 TB is directly usable (checked below)

Expansion and degradation
- Half-filled cabinets allow expansion
- RAID 6 allows graceful degradation
- Power draw and temperature range are within spec
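The quoted capacities are consistent with RAID 6 in (8+2) groups, as the small Python check below shows; the grouping is an assumption on my part, not a quoted detail of the vendor solution.

    drives, drive_GB = 300, 300
    raw_TB = drives * drive_GB / 1000
    print(f"raw capacity: {raw_TB:.0f} TB")                 # 90 TB

    # With RAID 6 in assumed (8 data + 2 parity) groups, 8 of every 10 disks hold user data
    usable_TB = raw_TB * 8 / 10
    print(f"usable capacity: {usable_TB:.0f} TB")           # 72 TB, as quoted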

Solutions: Central Archive

Central site only
- Can be a staffed system, so maintenance is easier
- A mix of on-line and near-on-line storage
- User access should look immediate, even for historic data

MAID (Massive Array of Idle Disks)
- Large disk arrays with a "sleep mode"
- Disks spin down if unused for a given time (5-330 mins); a toy model is sketched below

Example system: a 1.2 PB archive
- 24 GB/s bandwidth over 20 channels, 20 TB/hour backup
- RAID 6 graceful degradation, modular "hot swap" components
- A single system supports 1200 disk drives
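The MAID behaviour can be pictured with the toy Python model below: each disk remembers its last access and is spun down once an idle timeout (anywhere in the configurable 5-330 minute range) expires. The class and method names are invented for illustration and do not describe any particular vendor product.

    import time

    class MaidDisk:
        """Toy model of a MAID disk: spins down after idle_timeout seconds unused."""
        def __init__(self, idle_timeout=300):   # e.g. a 5-minute timeout
            self.idle_timeout = idle_timeout
            self.spinning = False
            self.last_access = 0.0

        def read(self, now=None):
            now = time.time() if now is None else now
            if not self.spinning:
                self.spinning = True            # a real disk pays a spin-up delay here
            self.last_access = now

        def tick(self, now=None):
            """Called periodically: spin down if idle for longer than the timeout."""
            now = time.time() if now is None else now
            if self.spinning and now - self.last_access > self.idle_timeout:
                self.spinning = False

    # Example: a disk read once is spun down after the idle timeout elapses
    d = MaidDisk(idle_timeout=300)
    d.read(now=0)
    d.tick(now=100); print(d.spinning)          # True: still inside the idle window
    d.tick(now=400); print(d.spinning)          # False: spun down after >300 s idle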

Network Issues

The big issue is data transfer from the remote sites
- One beam is 320 MB/s, and the remotes will have multiple beams
- Supporting instruments add ~30% overhead

We need to recover from interruptions quickly
- Otherwise we may never catch up (a catch-up estimate is given below)
- Interruptions might last days or weeks

Fast links are already practical
- Protocols for 10 GB/s links already exist
- We may need to provide some of the networking ourselves

A back-up option is needed if the network fails
- Something to tell us whether the site is alive
- ...and how cold it is
- Options are mobile phone, satellite or a microwave link
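How quickly a backlog clears depends on the spare capacity of the link, as in the Python estimate below; the 420 MB/s production rate (one beam plus ~30% supporting-instrument overhead) and the 1 GB/s link are illustrative numbers, not project figures.

    def catchup_days(outage_days, produce_MBps, link_MBps):
        """Days needed to clear the backlog left by an outage, assuming the link can
        run flat out while new data keep arriving at produce_MBps in the meantime."""
        spare = link_MBps - produce_MBps
        if spare <= 0:
            return float("inf")                 # with no spare capacity we never catch up
        backlog = outage_days * produce_MBps    # backlog expressed in day*(MB/s) units
        return backlog / spare

    print(f"{catchup_days(7, 420, 1000):.1f} days to recover from a week-long outage")   # ~5.1 days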

Some Other Ideas

Project Blackbox
- A containerised data centre that travels on a truck
- 1.5 PB of disk storage
- 7 TB of memory
- 250 servers, 2000 cores, 8000 threads
- When full, drive it back to HQ and put in another one
- Turns out to be a very high-bandwidth solution, provided you can integrate the data rate over time (see the figure worked out below)
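The effective bandwidth of "driving the data back" is easy to work out (a Python sketch; the 24-hour trip time is my assumption, only the 1.5 PB capacity comes from the slide):

    capacity_PB = 1.5
    trip_hours = 24                              # assumed driving time back to HQ

    effective_GBps = capacity_PB * 1e15 / (trip_hours * 3600) / 1e9
    print(f"effective bandwidth ~{effective_GBps:.0f} GB/s")   # ~17 GB/s for a 24-hour trip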

Visualisation

What kinds of visualisation will we need?
- This work has transferred from RAL to UiT
- It is being carried out by Bjorn Gustavsson

Learn from other radars
- The AMISRs already have the same problem
- Jicamarca has lots of imaging software
- The software is open source and adaptable

Allow users to bring their own routines
- Develop an open-source library
- Make full use of cuts, movies, etc.

Don't try to be too smart
- The human brain can only interpret so much!

Risk and Return

Some questions to consider:
- Should we provide a data system like this ourselves?
- Why not let a commercial data centre do this?
- Should we explore other funding sources for this, e.g. the EU e-infrastructures programme?
- What is the cost/benefit trade-off between the data system and (say) an extra site?
- What about metadata and services, which have been almost forgotten so far?

Summary and Conclusions

Based on an analysis of the expected performance and data rates of the new radar, we proposed a data system with three distinct elements:
- Cyclic buffers (short-duration storage for low-level data)
- An interferometry system (with the ability to back-space to the start of an event)
- A permanent archive (the ultimate home of all summary data and the centre of user analysis)
- The data system also handles the supporting instruments

Storage, I/O and other specifications were drawn up for all system elements and discussed with vendors:
- The data volumes and rates are challenging, but can already be handled now
- Appropriate systems can be obtained to provide the functionality we need

Implementation depends on many things (funding, politics, scientific priorities, etc.)
- These will be better defined during the next phase of the project