e-Science for the Square Kilometre Array


e-Science for the Square Kilometre Array
IAU 2012 Data Intensive Astronomy Symposium (SpS15), Beijing, August 29th 2012
Juan de Dios Santander Vela (IAA-CSIC), on behalf of the AMIGA team

Talk Overview
- The Square Kilometre Array (SKA)
- The SKA Challenge
- AMIGA & SKA
- e-Science Tools for the SKA
- SKA Computing Synergies
- Conclusions
(Speaker note: reorder the talk following the structure of the abstract. [Done.])

The Square Kilometre Array

The Square Kilometre Array
- The embodiment of the Hydrogen Array concept
- Thousands of antennas, with up to 1 sq km of collecting area
- Distributed across thousands of kilometres of terrain
- Enormous simultaneous bandwidth to increase survey speed
- Can be built incrementally
(See Michael Kramer's talk on the SKA in SpS9 for the reason for the large bandwidth.)
A continental-scale, distributed sensor network

SKA Antennas: Combination of Different Antenna Kinds
Low-Frequency Aperture Arrays
- Sparse aperture arrays, 70–450 MHz
- Multibeaming
- Central core only, i.e., ~100 km baselines
- SKA1: 2016–2019
Mid-Frequency Dishes
- 13 m Gregorian-offset dishes
- 450 MHz – 3 GHz
- Surface accuracy up to 10–25 GHz

SKA Antennas
Mid-Frequency Aperture Arrays
- Dense aperture arrays, 200–500 MHz
- 200 deg² field of view
- Central core only, i.e., ~100 km baselines
- SKA2: 2018–2023
Focal Plane Arrays
- Multibeam radio camera
- 12 m antennas, 700 MHz – 1.8 GHz
- Surface accuracy up to 10 GHz

SKA Site Selection
South Africa & Australia/New Zealand joint site

SKA Site Selection
(Map of component placement: ANZ hosts SKA1 Survey, SKA1 Low, and SKA2 Low; RSA hosts SKA1 Mid, SKA2 Mid, and the SKA2 aperture arrays.)

The SKA Challenge

Massive Data Flow, Storage & Processing
(Courtesy A. Faulkner)

SKA Architecture Diagram (SKA1 High-Level Overview)

Software Driven
The possible science will depend on software capabilities.
(Speaker note: we are also interested in the interfaces and in processing provenance.)

Interface to the Users

But, how can we do our science, then?

AMIGA & SKA
That is the question my group, AMIGA, has been asking for the last 10 years: how should science be done?


AMIGA: Analysis of the interstellar Medium of Isolated GAlaxies
PI: L. Verdes-Montenegro (see her talk for the main results: SpS3, Secular Evolution)
- Multi-wavelength, multi-object study of isolated galaxies with strict isolation criteria
- Careful curation of data
- Very careful processing of new parameters, from the group's own observing programmes and data reduction, literature table scanning, and Virtual Observatory table harvesting and parsing
- Emphasis on marrying astronomy and computer science, and buy-in of the VO
(Speaker note: moving from e-Science users to e-Science developers: detailed studies, provenance checking, data sharing, radio data archiving. AMIGA works precisely on solving e-Science problems: manipulation of thousands of datasets from large telescopes; see Felix Stoehr's talk on the ALMA data deluge.)
e-Science Users ➡ e-Science Developers!

AMIGA & SKA
- AMIGA project goal: providing a baseline for galaxy properties to compare with other environments
- Interaction-free sample, ideal for tracing HI infall: we can use CIG galaxies to detect the cosmic web
- Need for very sensitive telescopes able to resolve faint HI: the Square Kilometre Array & its pathfinders
We need tools for our own science analysis
⤷ Participating in the SKA.TEL.SDP proto-consortium

e-Science Tools & SKA

e-Science Tools & SKA
The SKA is a pure e-Science project. Science computing, for scientific discussion & science extraction:
- Distributed computing: move computation to the data
- Computing services
- Collaborative environments
- Linked data

Defining Computations
- Events & processes
- Dependencies
- Resources, local & remote
- Processes: sequences, concurrences, triggers
Formally specified, or at least machine-readable
(Speaker note: talk about Communicating Sequential Processes, the Actor Model, etc.)
➡ Workflow Definition Languages
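A minimal sketch of what "machine readable" can mean here, assuming nothing about any particular workflow definition language: computations declared as named tasks with explicit dependencies, executed in a dependency-respecting order. The `Workflow` class and task names are illustrative only.

```python
# Illustrative sketch only: a workflow as a machine-readable dependency graph,
# run in topological order (no specific workflow language is implied).
from collections import defaultdict

class Workflow:
    def __init__(self):
        self.tasks = {}                 # task name -> callable
        self.deps = defaultdict(set)    # task name -> prerequisite names

    def task(self, name, func, after=()):
        self.tasks[name] = func
        self.deps[name] |= set(after)

    def run(self):
        done, order = set(), []
        while len(done) < len(self.tasks):
            # a task is ready when all its prerequisites have completed
            ready = [n for n in self.tasks
                     if n not in done and self.deps[n] <= done]
            if not ready:
                raise ValueError("cycle or missing dependency")
            for n in sorted(ready):     # deterministic execution order
                self.tasks[n]()
                done.add(n)
                order.append(n)
        return order

wf = Workflow()
log = []
wf.task("calibrate", lambda: log.append("calibrate"))
wf.task("image", lambda: log.append("image"), after=["calibrate"])
wf.task("source-find", lambda: log.append("source-find"), after=["image"])
print(wf.run())  # tasks execute in dependency order
```

The same triggers-and-dependencies idea is what formalisms such as CSP or the Actor Model capture rigorously; this sketch only shows the dependency-ordering core.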

AMIGA Contributions
Wf4Ever
- Workflows for process & scientific methodology specification
- Web and command-line tools for data preservation, methodology preservation, reuse, repurposing, & collaboration
- Extra tools for astronomical data processing & services
- Linked-data philosophy
AMIGA4GAS
- Use workflows as process abstraction engines
- Use federation and supercomputing models for Taverna
- Adapt Taverna (& workflows) to those computing models

Wf4Ever: EU-funded FP7 STREP Project (December 2010 – December 2013)
(STREP: Specific Targeted REsearch Project)
Partners:
- Intelligent Software Components (iSOCO, Spain)
- University of Manchester (UNIMAN, UK)
- Universidad Politécnica de Madrid (UPM, Spain)
- Poznań Supercomputing and Networking Centre (PSNC, Poland)
- University of Oxford (OXF, UK)
- Instituto de Astrofísica de Andalucía (IAA, Spain)
- Leiden University Medical Centre (LUMC, NL)

Wf4Ever Partners, Goals & Competencies
Partners: one SME, six public organisations
Goals:
- Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines
- Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities
- Creation of scientific communities to collaboratively share, reuse, and evolve workflows and their parts, stimulating the development of new scientific knowledge
Core competencies (tech): digital libraries, workflow management, Semantic Web, integrity & authenticity, provenance, information quality
Case studies: astronomy (IAA-CSIC); genome-wide analysis and biobanking
Targeting already established communities: myExperiment, Virtual Observatory

AMIGA4GAS: 3D Kinematical Modelling (GIPSY workflow)
Input files: 1 data cube, 1 velocity map, 1 ROTCUR config file, 1 GALMOD config file
Variable parameters: INSET, RADII, WIDTHS, WEIGHT, TOLERANCE, DENS, NV, Z0, VDISP
- ROTCUR + GALMOD: 12 possible combinations of input parameters → 12 ASCII files → 12 runs → 12 cubes (4 approaching, 4 receding, 4 both)
- MOMENTS: 8 cubes (4 approaching + receding, 4 both) → 8 velocity maps
- SUB: 8 residual cubes, 8 residual maps
- MNMX: 8 values for peaks in cubes, 8 values for peaks in maps
(ROTCUR, GALMOD, MOMENTS, SUB, and MNMX are GIPSY tasks. GIPSY is used by Kapteyn for Westerbork and by ATNF for ATCA, and can be used for the VLA.)
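The 12-run fan-out over parameter combinations is exactly the pattern a workflow engine automates. A hedged sketch of that pattern, where `run_gipsy_task` is a hypothetical placeholder rather than a real GIPSY invocation, and the parameter values are invented:

```python
# Illustrative sketch only: fan a parameter grid out into independent runs,
# as in the ROTCUR/GALMOD stage. run_gipsy_task is a hypothetical stand-in.
from itertools import product

def run_gipsy_task(task, side, vdisp, z0):
    # Hypothetical: a real driver would invoke the GIPSY task here.
    return f"{task}:{side}:vdisp={vdisp}:z0={z0}"

sides = ["approaching", "receding", "both"]
vdisps = [8.0, 10.0]   # invented values for the VDISP parameter
z0s = [0.1, 0.2]       # invented values for the Z0 parameter

# 3 sides x 2 VDISP x 2 Z0 = 12 independent runs, one per combination
runs = [run_gipsy_task("GALMOD", s, v, z)
        for s, v, z in product(sides, vdisps, z0s)]
print(len(runs))
```

Because the runs share no state, a workflow engine can schedule all 12 in parallel on whatever infrastructure is available.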

AMIGA4GAS: AMIGA for the GTC, ALMA, and SKA Pathfinders
Technical part, devoted to computing & data federation:
- Heterogeneous computing federation: local computing cluster, grid, cloud computing
Main goals:
- Porting the Taverna workflow engine to supercomputing environments
- Development of an integration layer for automatic workflow deployment
In partnership with BSC and FCSCL
Direct relevance to the SKA Science Data Processor

Fed4AMIGA: Federation of Infrastructures
A resource manager federating grid, supercomputer (via COMPSs), and cloud infrastructures
Open questions:
- How to integrate the infrastructures in a federated system?
- How to authenticate the users?
- How to implement business rules to decide on which infrastructure a task should run?
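One way the third question could be approached, sketched with entirely hypothetical rules (this is not Fed4AMIGA's actual policy):

```python
# Hypothetical sketch: route a task description to an infrastructure
# using simple business rules; field names and thresholds are invented.
def choose_infrastructure(task):
    if task.get("needs_mpi"):            # tightly coupled parallel jobs
        return "supercomputer"
    if task.get("data_gb", 0) > 100:     # keep large data near grid storage
        return "grid"
    return "cloud"                       # small, loosely coupled tasks

print(choose_infrastructure({"needs_mpi": True}))   # supercomputer
print(choose_infrastructure({"data_gb": 500}))      # grid
print(choose_infrastructure({}))                    # cloud
```

In a real federation such rules would sit in the resource manager, behind the common authentication layer the slide asks about.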

Wf4Ever + CyberSKA
- Workflows can be used to formally specify processing tasks
- Especially good when computing tasks exist as services
- Even better if data can be referenced, instead of being sent over the wire
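The "reference, not copy" point can be made concrete with a tiny sketch; the service name, URI, and request shape below are all hypothetical:

```python
# Hypothetical sketch: a service request that carries only a data
# *reference*, so the dataset itself never travels over the wire.
def make_request(service, dataset_uri, params):
    # Only the URI crosses the network; the service resolves it against
    # an archive sitting next to its own compute resources.
    return {"service": service, "input": dataset_uri, "params": params}

req = make_request("moment-map", "ivo://example.org/cig96/cube", {"moment": 0})
print(req["input"])
```

For SKA-scale data volumes this inversion, moving the computation to the data, is the only workable option.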

Wf4Ever + CyberSKA
Complementary to CyberSKA (infrastructure); Wf4Ever places more emphasis on:
- End-user tools: process creation, data manipulation, annotation, interdisciplinary algorithm repurposing
- Long-term preservation and quality assurance
VERY much science-oriented

SKA Computing Synergies

SKA Computing Synergies
- The SKA computing power amounts to being able to sift through the entire Internet more than 100 times per day
- Citizens can be empowered, through SKA-like tools, to process city-wide, regional, or national data for insight
- Intelligent sensor networks can provide tools for better, instantaneous resource planning
In line with H2020 priorities

Conclusions

Conclusions
- The SKA is the proverbial e-Science instrument
- Workflows can be used both for machine-readable, formal process description and for human-readable scientific tool development
- Federated, transparent workflow computing
- Wf4Ever & AMIGA4GAS are complementary to CyberSKA and to SKA.TEL.SDP work

Conclusions
It is a long road towards the SKA, but we have to get involved in these problems now.

Thank you!
Julián Garrido, José Enrique Ruiz del Mazo, Susana Sánchez Expósito, Juan de Dios Santander-Vela <jdsant@iaa.es>, Lourdes Verdes-Montenegro