1
Data Transport to the Cloud
David Aikema, University of Cape Town
2
Outline
Brad Frank talked about ARCADE and MeerKAT
Rob Simmonds discussed SKA regional centres and delivery system objectives
Now a brief outline of a prototype for staging data from the MeerKAT archive for further analysis
  Scenario
  Why schedule data transfers?
  Related work
  Architecture
  Software
3
Scenario
MeerKAT archive at CHPC
Much of the data analysis to be done elsewhere
  IDIA / ARC
  ASTRON (Netherlands)
Need to store the Science Products produced at these facilities back in the archive
4
Why schedule data transfers?
Allows priorities to be set on which data is moved next
Adhere to user/project resource allocations
Avoid starvation
Manage network to maximize performance
Handle congestion, particularly on long-distance links (ASTRON)
Ensures that WAN is kept busy by keeping data in flight
Use efficient WAN data transfer protocols
Allows checks to see if data is available at other locations
Support subscriptions to datasets
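The first points above (priorities on which data moves next, avoiding starvation) can be illustrated with a small scheduler. This is a minimal sketch, not the prototype's code: the `TransferRequest` fields, the `TransferScheduler` class, and the `aging_rate` parameter are all hypothetical names chosen for illustration. Aging is one common way to avoid starvation; the slides do not say which technique the prototype uses.

```python
from dataclasses import dataclass


@dataclass
class TransferRequest:
    dataset: str      # hypothetical dataset identifier
    project: str      # project the request belongs to
    priority: int     # lower number means more urgent
    submitted: float  # submission time, in seconds


class TransferScheduler:
    """Pick the next transfer by priority, aging waiting requests."""

    def __init__(self, aging_rate=0.1):
        # Assumed tuning knob: priority points gained per second of waiting.
        self.aging_rate = aging_rate
        self.pending = []

    def submit(self, request):
        self.pending.append(request)

    def effective_priority(self, request, now):
        # The longer a request waits, the lower (more urgent) its score.
        waited = now - request.submitted
        return request.priority - self.aging_rate * waited

    def next_transfer(self, now):
        """Remove and return the most urgent request, or None if idle."""
        if not self.pending:
            return None
        best = min(
            self.pending,
            key=lambda r: (self.effective_priority(r, now), r.submitted),
        )
        self.pending.remove(best)
        return best
```

With `aging_rate=0` this reduces to strict priority ordering, under which a steady stream of urgent requests could starve low-priority ones indefinitely. A positive rate guarantees that every request is eventually selected.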
5
Related work
CERN tools
  PhEDEx, Rucio, FTS, …
  Somewhat relevant but closely tied to specific projects
LIGO Data Replicator
GridFTP / Globus
NGAS
Apache OODT
(HT)Condor / Stork
6
Components
Twisted framework (Python)
RabbitMQ queuing system
Globus (Software-as-a-Service)
7
Overview
[Architecture diagram; component labels:]
Archive Interface
Incoming request
Request Handler
Staging Queue
Staging Agent
Staging Buffer
Remote Storage
Distribution Policy
Transfer Queue
Transfer Agent
Globus
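One plausible reading of the diagram labels above is a two-stage pipeline: requests enter a staging queue, a staging agent copies data from the archive into a staging buffer, a distribution policy picks destinations, and a transfer agent hands the resulting transfers to Globus. The sketch below follows that reading using only the Python standard library. The `StagingPipeline` class, its method names, and the `policy` callable are hypothetical. In-memory deques stand in for RabbitMQ queues, and the `submitted` list stands in for calls to the Globus service.

```python
from collections import deque


class StagingPipeline:
    """In-memory stand-in for the queue/agent flow in the diagram."""

    def __init__(self, policy):
        self.staging_queue = deque()   # stand-in for a RabbitMQ queue
        self.transfer_queue = deque()  # stand-in for a RabbitMQ queue
        self.staging_buffer = {}       # dataset name -> staged data
        self.policy = policy           # dataset name -> list of destinations
        self.submitted = []            # stand-in for transfers sent to Globus

    def handle_request(self, dataset):
        """Request handler: accept an incoming request and queue it."""
        self.staging_queue.append(dataset)

    def run_staging_agent(self, archive):
        """Staging agent: copy queued datasets from the archive into the
        buffer, then queue one transfer per destination from the policy."""
        while self.staging_queue:
            dataset = self.staging_queue.popleft()
            self.staging_buffer[dataset] = archive[dataset]
            for destination in self.policy(dataset):
                self.transfer_queue.append((dataset, destination))

    def run_transfer_agent(self):
        """Transfer agent: drain the transfer queue in FIFO order."""
        while self.transfer_queue:
            self.submitted.append(self.transfer_queue.popleft())


# Usage: stage one dataset and send it to two hypothetical destinations.
archive = {"obs1": b"visibilities"}
pipeline = StagingPipeline(policy=lambda dataset: ["idia", "astron"])
pipeline.handle_request("obs1")
pipeline.run_staging_agent(archive)
pipeline.run_transfer_agent()
```

Separating staging from transfer with a queue between them lets each agent run at its own pace, which is presumably why the diagram shows two queues rather than one.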