The Data Logistics Toolkit
Martin Swany
Professor, School of Informatics and Computing
Executive Associate Director, Center for Research in Extreme Scale Computing (CREST)
Indiana University

The Data Logistics Toolkit
Logistics: the management of the flow of resources from the point of origin to the point of consumption
The DLT integrates local and distributed storage infrastructure, file transfer software, and performance monitoring and tuning
The DLT software distribution supports the creation of network-optimized data nodes

DLT Overview
Set of packages with configuration scripts, etc.
Allows the configuration of
  – DTN with GridFTP
  – IBP storage depot for content distribution
  – Phoebus WAN accelerator
  – On-ramp for Internet2 AL2S using XSP
Includes Periscope/perfSONAR monitoring
Automatic network tuning
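The slides do not say what the automatic network tuning adjusts. One common ingredient of data-transfer-node tuning is sizing TCP buffers to the bandwidth-delay product (BDP) of the path, so the sketch below only illustrates that calculation. The function name and the example RTT values are assumptions, not part of the DLT.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_ms: float) -> int:
    """Bandwidth-delay product: bytes that must be in flight to fill the path.

    Hypothetical helper for illustration; not a DLT function.
    """
    # bits/s * ms / 1000 gives bits in flight; divide by 8 for bytes.
    return round(bandwidth_bits_per_s * rtt_ms / 8000)


if __name__ == "__main__":
    # Example RTTs are assumed values for a 10 Gb/s path.
    for rtt_ms in (4, 50, 100):
        size = bdp_bytes(10e9, rtt_ms)
        print(f"10 Gb/s, {rtt_ms} ms RTT: {size / 2**20:.1f} MiB of buffer")
```

A socket buffer smaller than the BDP caps a single TCP stream below line rate, which is why long-latency paths need much larger buffers than LAN defaults.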

DTN with AL2S On-Ramp
Working with the Globus team at U. Chicago and Argonne
Leveraging our eXtensible Session Protocol (XSP) to create end-to-end “sessions”
  – user-network interface (UNI)
XSP daemon acts as network controller
  – signals AL2S/OESS, OSCARS, OpenFlow
GridFTP XIO driver, updating to use the Globus Transfer Network Controller API
Generic, transparent on-ramp to circuit networks like AL2S

WAN Acceleration
A key reason the Science DMZ model “works” is the separation of lossy access networks from high-bandwidth, long-latency links
Termination of TCP connections in “middleboxes” can increase throughput by reducing the RTT that each TCP connection sees
Protocol translation
Storage in the network to buffer and burst
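A rough way to see why terminating TCP at a middlebox helps is the well-known Mathis steady-state approximation for loss-based TCP, throughput ≈ (MSS / RTT) · (C / √p) with C ≈ √(3/2). The sketch below applies it to one end-to-end connection and to a path split into an edge segment and a WAN segment. The 4 ms edge RTT and 0.001% edge loss match the figures quoted on the performance slide; the 50 ms WAN RTT and the near-zero WAN loss rate are assumed values, and the model ignores line-rate caps and how Phoebus itself actually behaves.

```python
import math

MATHIS_C = math.sqrt(3.0 / 2.0)  # constant in the Mathis et al. approximation


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state TCP throughput in bits per second."""
    return (mss_bytes * 8 / rtt_s) * (MATHIS_C / math.sqrt(loss_rate))


def split_path_throughput_bps(segments, mss_bytes: int = 1460) -> float:
    """Bottleneck rate when each (rtt_s, loss_rate) segment runs its own TCP connection."""
    return min(mathis_throughput_bps(mss_bytes, rtt, loss) for rtt, loss in segments)


if __name__ == "__main__":
    edge = (0.004, 1e-5)  # 4 ms RTT, 0.001% loss
    wan = (0.050, 1e-9)   # assumed: 50 ms RTT, nearly loss-free circuit

    # A single end-to-end connection sees the summed RTT and the combined loss.
    e2e_rtt = edge[0] + wan[0]
    e2e_loss = 1 - (1 - edge[1]) * (1 - wan[1])
    e2e = mathis_throughput_bps(1460, e2e_rtt, e2e_loss)
    split = split_path_throughput_bps([edge, wan])

    print(f"end-to-end: {e2e / 1e6:.0f} Mb/s, split at middlebox: {split / 1e6:.0f} Mb/s")
```

Under these assumptions the lossy edge segment still limits the split path, but because it recovers from loss over a 4 ms RTT rather than the full path RTT, the bound rises roughly in proportion to the RTT reduction.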

Distributed Storage for Content Distribution
IBP provides a primitive, scalable, in-network storage service
File-like abstractions can be built on top of this
Uses a data structure known as an exNode (like a Unix inode) to track allocations
These basic building blocks can be used to build various instances
  – Parallel filesystem
  – Distributed RAID-like storage
  – Content distribution network
  – BitTorrent-like peer-to-peer transfers
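To make the inode analogy concrete, here is a minimal sketch of an exNode-like structure: a logical file described as a list of extents, each pointing at an allocation on some storage depot. The class names, field names, and depot hostnames are hypothetical and do not reflect the real exNode format or the IBP API. Replicas are modeled simply as overlapping extents on different depots.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Extent:
    offset: int      # byte offset within the logical file
    length: int      # number of bytes this extent covers
    depot: str       # hypothetical depot address
    allocation: str  # hypothetical opaque handle for the stored bytes


@dataclass
class ExNode:
    name: str
    size: int
    extents: List[Extent] = field(default_factory=list)

    def lookup(self, offset: int, length: int) -> List[Extent]:
        """Return every extent (including replicas) overlapping the byte range."""
        end = offset + length
        return [e for e in self.extents
                if e.offset < end and offset < e.offset + e.length]

    def is_complete(self) -> bool:
        """True if the extents together cover every byte in [0, size)."""
        covered = 0
        for e in sorted(self.extents, key=lambda e: e.offset):
            if e.offset > covered:
                return False  # gap before this extent
            covered = max(covered, e.offset + e.length)
        return covered >= self.size


if __name__ == "__main__":
    scene = ExNode("scene.tif", 2000, [
        Extent(0, 1000, "depot-a.example.org", "alloc-1"),
        Extent(1000, 1000, "depot-b.example.org", "alloc-2"),
        Extent(0, 1000, "depot-c.example.org", "alloc-3"),  # replica of first half
    ])
    print(scene.is_complete(), [e.depot for e in scene.lookup(900, 200)])
```

Striping extents across depots gives the parallel-filesystem and RAID-like layouts listed above, while replicating extents near consumers gives the content-distribution case.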

Architecture
Unified Network Information Service (UNIS)
  – Descendant of perfSONAR Lookup and Topology Services
  – Network and service “graph”
Intelligent Data Movement Service (IDMS)
  – Data dispatcher
  – Operates on UNIS data
  – Spawn storage services dynamically in GENI
Periscope/perfSONAR
  – Monitoring for operational integrity and optimization, BLiPP
Storage Services
  – IBP, prototype based on Ceph
Other services
  – Data transfer (GridFTP), WAN acceleration

Earth Observation Depot Network (EODN)
An open, community-specific content distribution network for remote sensing data

Landsat data
Landsat 8 launched February 11, 2013
Covers the entire land surface of the Earth every 16 days
  – 8-day offset from Landsat 7
  – ~700 scenes each day
Each scene contains a GeoTIFF product: high-resolution sensor images
  – ~1 GB compressed, 2 GB uncompressed
Traditionally used for environmental monitoring and for land use and land cover change studies
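The approximate figures on this slide imply the data volume a distribution network has to carry. The back-of-the-envelope sketch below uses only the slide's numbers (about 700 scenes per day, about 1 GB compressed per scene, a 16-day cycle); the function names are illustrative, and treating 1 GB as 10^9 bytes is an assumption.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_gb(scenes_per_day: float, gb_per_scene: float) -> float:
    """Data produced per day, in GB."""
    return scenes_per_day * gb_per_scene


def sustained_rate_mbps(gb_per_day: float) -> float:
    """Average rate in Mb/s needed to move one day's data within one day."""
    # Assumes 1 GB = 1e9 bytes.
    return gb_per_day * 1e9 * 8 / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    compressed = daily_volume_gb(700, 1.0)
    uncompressed = daily_volume_gb(700, 2.0)
    print(f"per day: {compressed:.0f} GB compressed, {uncompressed:.0f} GB uncompressed")
    print(f"per 16-day cycle: {16 * 700} scenes, {16 * compressed / 1000:.1f} TB compressed")
    print(f"average harvest rate: {sustained_rate_mbps(compressed):.1f} Mb/s")
```

The average rate is modest; the harder problem is that many consumers want the same scenes quickly, which is the case a content distribution network addresses.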

[Figure: EODN workflow diagram. Components: EODN Client, EODN (DLT) depots at WISC, IU, NYSER and MIZZ, RealEarth (UW-Madison), UNIS, DMS, EODN Harvester, web GUI, Landsat Ground Network. Labeled steps: (1) subscribe, (2) harvest, (3) stage sensing data, (4) publish, (5) fast download, (6) Processing…, (7) WMS upload. Additional label: discover / measure.]

Cisco Appliance Platform
In collaboration with Internet2, Cisco and Fusion-io
Cisco C220 server
  – 2x Intel® Xeon® E5-2680, 16 64GB DDR3 RAM
  – Fusion-io ioDrive2 1.2 TB
CentOS 6.4 Linux with DLT RPMs and tuning for data transfer throughput

Acknowledgements
Staff Scientist Dr. Ezra Kissel leads the DLT development efforts, PI of the GENI IDMS effort
CC-NIE integration project with U. Tennessee and Vanderbilt U.
CC-NIE integration project with the Globus team at U. Chicago and Argonne Nat’l Lab
EODN development with AmericaView, U. Wisconsin

Phoebus-SLaBS performance
GridFTP transfers over dedicated 10G path, increasing WAN latency, 4 ms LAN RTT and 0.001% edge loss