Data Transport to the Cloud

Presentation transcript:

Data Transport to the Cloud
David Aikema, University of Cape Town

Outline
- Brad Frank talked about ARCADE and MeerKAT
- Rob Simmonds discussed SKA regional centres and delivery system objectives
- Now a brief outline of a prototype for staging data from the MeerKAT archive for further analysis:
  - Scenario
  - Why schedule data transfers?
  - Related work
  - Architecture
  - Software

Scenario
- MeerKAT archive at CHPC
- Much of the data analysis to be done elsewhere:
  - IDIA / ARC
  - ASTRON (Netherlands)
- Need to store the science products produced at these facilities back in the archive

Why schedule data transfers?
- Allows priorities to be set on which data is moved next
- Adhere to user/project resource allocations
- Avoid starvation
- Manage network to maximize performance
- Handle congestion, particularly on long-distance links (ASTRON)
- Ensures that WAN is kept busy by keeping data in flight
- Use efficient WAN data transfer protocols
- Allows checks to see if data is available at other locations
- Support subscriptions to datasets
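A minimal sketch of how priorities, per-project allocations, and starvation avoidance could fit together in one scheduler. This is not the prototype's code: the class name `TransferScheduler`, the quota unit (GB), and the aging rule are all assumptions made for illustration. Each request's sort key is its negated priority plus a penalty that grows with the logical time of submission, so older low-priority requests eventually overtake newer high-priority ones.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class _Entry:
    sort_key: float
    seq: int
    dataset: str = field(compare=False)
    project: str = field(compare=False)
    size_gb: float = field(compare=False)


class TransferScheduler:
    """Hypothetical scheduler: priority order, project quotas, simple aging."""

    def __init__(self, quotas_gb, aging_per_tick=1.0):
        self.quotas_gb = dict(quotas_gb)  # remaining allocation per project
        self.aging_per_tick = aging_per_tick
        self.clock = 0                    # logical time, advanced on each pick
        self._heap = []
        self._seq = itertools.count()     # tie-breaker: first come, first served

    def submit(self, dataset, project, priority, size_gb):
        # Lower sort_key is served first. Later submissions carry a larger
        # clock term, so waiting requests cannot be starved indefinitely.
        key = -priority + self.aging_per_tick * self.clock
        entry = _Entry(key, next(self._seq), dataset, project, size_gb)
        heapq.heappush(self._heap, entry)

    def next_transfer(self):
        """Return the next dataset whose project still has quota, else None."""
        self.clock += 1
        skipped, chosen = [], None
        while self._heap:
            entry = heapq.heappop(self._heap)
            if self.quotas_gb.get(entry.project, 0.0) >= entry.size_gb:
                self.quotas_gb[entry.project] -= entry.size_gb
                chosen = entry
                break
            skipped.append(entry)         # over quota: leave it queued
        for entry in skipped:
            heapq.heappush(self._heap, entry)
        return chosen.dataset if chosen else None
```

Requests that exceed their project's remaining allocation stay queued rather than being dropped, so they can run once the allocation is topped up.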

Related work
- CERN tools: Phedex, Rucio, FTS, … (somewhat relevant but closely tied to specific projects)
- LIGO Data Replicator
- GridFTP / Globus
- NGAS
- Apache OODT
- (HT)Condor / Stork

Components
- Twisted framework (Python)
- RabbitMQ queuing system
- Globus (Software-as-a-Service)
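As a rough illustration of how a staging request might be placed on a RabbitMQ queue from Python, the sketch below uses the `pika` client in blocking mode. The slides do not say which client library or message format the prototype uses, so the JSON schema, the queue name `staging`, and the helper names are all hypothetical; a Twisted-based service would more likely use an asynchronous connection than the blocking one shown here.

```python
import json


def encode_request(dataset_id, destination, priority=0):
    """Serialise a staging request as a JSON message body (hypothetical schema)."""
    return json.dumps(
        {"dataset": dataset_id, "destination": destination, "priority": priority},
        sort_keys=True,
    ).encode("utf-8")


def decode_request(body):
    """Inverse of encode_request, as a consumer would apply it."""
    return json.loads(body.decode("utf-8"))


def publish_request(body, queue="staging", host="localhost"):
    """Publish one message to a durable queue (needs a running broker)."""
    import pika  # imported lazily so the helpers above work without pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    try:
        channel = connection.channel()
        # A durable queue survives a broker restart.
        channel.queue_declare(queue=queue, durable=True)
        channel.basic_publish(
            exchange="",          # default exchange routes by queue name
            routing_key=queue,
            body=body,
            properties=pika.BasicProperties(delivery_mode=2),  # persistent message
        )
    finally:
        connection.close()
```

Declaring the queue durable and marking messages persistent means queued requests are not lost if the broker restarts, which matters when staging requests can wait a long time before being served.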

Overview
Diagram labels: Archive Interface, Incoming request, Request Handler, Staging Queue, Staging Agent, Staging Buffer, Remote Storage, Distribution Policy, Transfer Queue, Transfer Agent, Globus
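The component names in the overview suggest a two-stage pipeline, which can be sketched with in-memory queues. The order of the stages is inferred from the names alone (the diagram's arrows are not in the transcript), and `run_pipeline`, the dictionary-based archive, and the `policy` callback are hypothetical stand-ins. The final step only simulates handing a transfer to Globus.

```python
from collections import deque


def run_pipeline(requests, archive, policy):
    """Move each requested dataset through staging and transfer stages."""
    staging_queue = deque()
    transfer_queue = deque()
    staging_buffer = {}
    remote_storage = {}

    # Request handler: accept incoming requests the archive can satisfy.
    for dataset in requests:
        if dataset in archive:
            staging_queue.append(dataset)

    # Staging agent: copy data from the archive into the staging buffer.
    while staging_queue:
        dataset = staging_queue.popleft()
        staging_buffer[dataset] = archive[dataset]
        # Distribution policy: decide which remote sites receive the dataset.
        for site in policy(dataset):
            transfer_queue.append((dataset, site))

    # Transfer agent: stand-in for submitting the transfer to Globus.
    while transfer_queue:
        dataset, site = transfer_queue.popleft()
        remote_storage.setdefault(site, {})[dataset] = staging_buffer[dataset]

    return remote_storage
```

Separating the staging queue from the transfer queue lets reading from the archive and sending over the WAN proceed at different rates, with the staging buffer absorbing the difference.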