The Science Data Processor and Regional Centre Overview
Paul Alexander, UK Science Director of the SKA Organisation and Leader of the Science Data Processor Consortium

SKA: A Leading Big Data Challenge for the 2020 Decade
Signal chain: Antennas → Digital Signal Processing (DSP) → High Performance Computing Facility (HPC)
Transfer from antennas to DSP (over 10s to 1000s of km): 2020: 5,000 PBytes/day; 2030: 100,000 PBytes/day
Transfer from DSP to HPC for processing (over 10s to 1000s of km): 2020: 50 PBytes/day; 2030: 10,000 PBytes/day
HPC processing: 2020: 300 PFlops; 2028: 30 EFlops
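
To put these rates in context, here is a back-of-the-envelope conversion of the quoted PBytes/day figures into sustained data rates. This is an illustrative sketch only; the daily volumes come from the slide above, while the decimal unit conventions are assumed.

```python
# Illustrative only: convert the slide's PBytes/day figures into sustained rates.
PB = 1e15                    # bytes per petabyte (decimal convention assumed)
SECONDS_PER_DAY = 86_400

def sustained_rate_tb_per_s(pbytes_per_day: float) -> float:
    """Average sustained rate in TB/s for a given daily volume in PBytes."""
    return pbytes_per_day * PB / SECONDS_PER_DAY / 1e12

print(f"Antennas -> DSP (2020): ~{sustained_rate_tb_per_s(5_000):.0f} TB/s")
print(f"DSP -> HPC (2020): ~{sustained_rate_tb_per_s(50):.1f} TB/s")
```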

Standard interferometer
Signal chain: sky → detect & amplify → digitise & delay → correlate → process (calibrate, grid, FFT) → integrate → image
The astronomical signal (an EM wave) from direction s is received by antenna pairs 1, 2 separated by baseline B; the geometric delay is B·s/c.
Visibility: $V(\mathbf{B}) = \langle E_1 E_2^{*} \rangle = \int I(\mathbf{s})\, e^{i \omega \mathbf{B} \cdot \mathbf{s} / c}\, d\Omega$
Resolution is determined by the maximum baseline: $\theta_{\rm max} \sim \lambda / B_{\rm max}$
Field of view (FoV) is determined by the size of each dish: $\theta_{\rm dish} \sim \lambda / D$
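
As a concrete illustration of the visibility relation above, the sketch below evaluates V(B) for a toy sky of point sources using the equivalent phase term exp(i 2π B·s/λ) (recall ω/c = 2π/λ). The baseline, wavelength and source list are all assumed for illustration; this is not code from the SDP.

```python
# Minimal sketch (illustrative values only): evaluate V(B) = sum_k I_k exp(i 2π B·s_k / λ)
# for a toy sky of point sources, showing how one baseline samples one Fourier
# component of the sky brightness.
import numpy as np

wavelength = 0.21                      # metres (assumed: 21 cm line)
baseline = np.array([1000.0, 0.0])     # east-west baseline in metres (assumed)

# Toy sky: (flux, direction offset (l, m) in radians from the field centre).
sources = [
    (1.0, np.array([0.0, 0.0])),
    (0.5, np.array([1e-4, 5e-5])),
]

def visibility(baseline_m, sources, wavelength_m):
    """Discrete point-source version of V(B) = ∫ I(s) exp(i 2π B·s/λ) dΩ."""
    v = 0.0 + 0.0j
    for flux, offset in sources:
        phase = 2.0 * np.pi * np.dot(baseline_m, offset) / wavelength_m
        v += flux * np.exp(1j * phase)
    return v

print(visibility(baseline, sources, wavelength))
```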

Image formation is computationally intensive
Images are formed by an inverse algorithm similar to that used in MRI
Typical image sizes: up to 30k x 30k x 64k voxels
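
To see why such cubes are demanding, here is a simple estimate of the memory footprint of one 30k x 30k x 64k cube; the single-precision voxel size and decimal units are assumptions.

```python
# Illustrative arithmetic: footprint of a 30k x 30k x 64k image cube.
nx, ny, nchan = 30_000, 30_000, 64_000
bytes_per_voxel = 4                              # float32, assumed

voxels = nx * ny * nchan                         # ~5.8e13 voxels
terabytes = voxels * bytes_per_voxel / 1e12
print(f"{voxels:.2e} voxels -> ~{terabytes:.0f} TB per cube")
```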

Must also solve for calibration errors

Challenge Very Dependent on Experiment

One SDP, Two Telescopes

Telescope     Ingest (GB/s)
SKA1_Low      500
SKA1_Mid      1000

In total we eventually need to deploy a system that is close to 0.5 EFlops of processing

… and regional centres

Scope of the SDP

The SDP System

Imaging Processing Model (pipeline): Correlator → RFI excision and phase rotation

Performance Requirements
Imaging and calibration determine system sizing
Data is buffered after ingest: double buffered, buffer size >100 PByte
Imaging must account for a very large field of view
o Algorithms are complex and computationally expensive
o Algorithms will need to evolve during the life of the telescope
Data products will be calibrated multi-dimensional images and time-series data
The volume of potential data products is very large; we may only be able to archive data specific to the observation requested
Commensal fast-imaging mode for slow transients

Telescope     Processing to be achieved (PFlops)     Ingest (GB/s)
SKA1_Low      25                                     500
SKA1_Mid

Detailed analysis is complex
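
As a rough illustration of where a buffer of this order comes from, the sketch below sizes a double-buffered ingest buffer from an ingest rate and an assumed observation length. The 8-hour observation is an assumption, and the real sizing presumably also covers both telescopes and intermediate data products, which is consistent with the >100 PByte figure above.

```python
# Simplistic sketch: a double-buffered ingest buffer must hold at least one
# observation's raw data while the previous observation is still being processed.
def buffer_size_pbytes(ingest_gb_per_s: float, obs_hours: float, n_buffers: int = 2) -> float:
    bytes_total = ingest_gb_per_s * 1e9 * obs_hours * 3600 * n_buffers
    return bytes_total / 1e15

# e.g. SKA1_Mid at 1000 GB/s for an assumed 8-hour observation, double buffered:
print(f"~{buffer_size_pbytes(1000, 8):.0f} PBytes")    # lower bound only
```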

Illustrative Computing Requirements
~100 PetaFLOPS total achieved
~200 PetaByte/s aggregate bandwidth to fast working memory
~50 PetaByte storage
~1 TeraByte/s sustained write to storage
~10 TeraByte/s sustained read from storage
~ FLOPS/byte read from storage
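
For orientation, the arithmetic intensity implied by the figures above can be computed directly; this is an illustrative ratio derived from those numbers, not the value quoted on the original slide.

```python
# Illustrative ratio implied by the figures above (not the slide's own number).
total_flops = 100e15     # ~100 PetaFLOPS achieved
read_bw = 10e12          # ~10 TeraByte/s sustained read from storage
print(f"~{total_flops / read_bw:.0e} FLOPS per byte read from storage")
```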

SDP High-Level Architecture

How do we get performance and manage data volume?
Graph-based approach
Hadoop

Data Driven Architecture
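
To make the graph-based, data-driven idea concrete, here is a toy task graph in which each step runs as soon as its input data exist, so execution is driven by data availability rather than by a fixed control flow. All task names are illustrative; this is a sketch of the general pattern, not SDP code.

```python
# Toy data-driven task graph (illustrative names only, not the SDP implementation):
# each task declares its inputs and fires as soon as all of them are available.

graph = {
    "vis":     ([],                   lambda: "raw visibilities"),
    "flagged": (["vis"],              lambda vis: f"rfi_excised({vis})"),
    "gains":   (["flagged"],          lambda flagged: f"calibration({flagged})"),
    "image":   (["flagged", "gains"], lambda flagged, gains: f"image({flagged}, {gains})"),
}

data = {}                               # data products produced so far
while len(data) < len(graph):
    for name, (inputs, fn) in graph.items():
        if name not in data and all(i in data for i in inputs):
            data[name] = fn(*(data[i] for i in inputs))   # fire when inputs are ready

print(data["image"])
```

A real data-flow engine would distribute these tasks across many nodes and manage the movement of data products between them; the point here is only the execution model.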

o Smaller FFT size at the cost of data duplication
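
Assuming this bullet refers to a facet-style split of the image plane, the sketch below compares one large 2-D FFT against a grid of smaller per-facet FFTs: each transform becomes much smaller and fits local memory, while total arithmetic barely changes and the visibility data has to be handled once per facet (the data duplication). The sizes are arbitrary illustrative values.

```python
# Hedged illustration of the trade-off: smaller per-facet FFTs vs. data duplication.
import math

def fft2_cost(n):
    """Rough operation count for an n x n complex 2-D FFT (~5 n^2 log2(n^2))."""
    return 5 * n * n * math.log2(n * n)

n_full, facets = 32_768, 4          # illustrative image size and facet split
n_facet = n_full // facets

full_cost = fft2_cost(n_full)
facet_cost = facets * facets * fft2_cost(n_facet)
print(f"per-transform size drops {n_full} -> {n_facet}, "
      f"total FFT cost ratio ~{facet_cost / full_cost:.2f}")
```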

SDP Data Flow Approach: Next Generation Data Science Engine?

Hardware Platform

Wilkes
Built November 2013 by HPCS/SDP with Dell, NVIDIA and Mellanox; the UK's fastest academic cluster
128 Dell T620 servers, 256 NVIDIA K20 GPUs, 256 Mellanox Connect-IB InfiniBand cards plus switches
Performance of 240 TFlops, with a Top500 position of 166
Funded by STFC, SKA, and industrial sponsorship from Rolls-Royce and Mitsubishi Heavy Industries
The most energy-efficient air-cooled supercomputer in the world: second in the Green500 at 3,631 MFlops/W

3–30 EBytes/year of fully processed data for SKA2
Data rates and processing increase by a factor of ~100 for SKA2

END