Torrent-based software distribution

Slides:



Advertisements
Similar presentations
2000 Making DADS distributed a Nordunet2 project Jochen Hollmann Chalmers University of Technology.
Advertisements

ALICE G RID SERVICES IP V 6 READINESS
Incentives Build Robustness in BitTorrent Bram Cohen.
BitTorrent Join the swarm! BY: Joe Petruska. What is BitTorrent? a peer-to-peer file sharing protocol used for distributing large amounts of data.
Torrent base of software distribution by ALICE at RDIG V.V. Kotlyar (IHEP, Protvino), E.A. Ryabinkin (RRC-KI, Moscow), I.A. Tkachenko (RRC-KI, Moscow),
CompSci 356: Computer Network Architectures Lecture 21: Content Distribution Chapter 9.4 Xiaowei Yang
Peer to Peer (P2P) Networks and File sharing. By: Ryan Farrell.
Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.
Presented by Stephen Kozy. Presentation Outline Definition and explanation Comparison and Examples Advantages and Disadvantages Illegal and Legal uses.
Background Info The UK Mirror Service provides mirror copies of data and programs from many sources all over the world. This enables users in the UK to.
CS CS 5150 Software Engineering Lecture 13 System Architecture and Design 1.
Torrent-based Software Distribution in ALICE.
CVMFS: Software Access Anywhere Dan Bradley Any data, Any time, Anywhere Project.
FIREWALL TECHNOLOGIES Tahani al jehani. Firewall benefits  A firewall functions as a choke point – all traffic in and out must pass through this single.
ALICE DATA ACCESS MODEL Outline ALICE data access model - PtP Network Workshop 2  ALICE data model  Some figures.
BitTorrent Presentation by: NANO Surmi Chatterjee Nagakalyani Padakanti Sajitha Iqbal Reetu Sinha Fatemeh Marashi.
BitTorrent Internet Technologies and Applications.
BitTorrent How it applies to networking. What is BitTorrent P2P file sharing protocol Allows users to distribute large amounts of data without placing.
SUSE Linux Enterprise Server Administration (Course 3037) Chapter 4 Manage Software for SUSE Linux Enterprise Server.
Bit Torrent A good or a bad?. Common methods of transferring files in the internet: Client-Server Model Peer-to-Peer Network.
Networking By Nachiket Agrawal 10DD Contents Network Stand Alone LAN Advantages and Disadvantages of LAN Advantages and Disadvantages of LAN Cabled LAN.
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES P. Saiz (IT-ES) AliEn job agents.
1Offline Weekly Meeting May May 2006 AliRoot Build Integration and (Testing) System Peter Hristov Vagner Morais.
1 / 22 AliRoot and AliEn Build Integration and Testing System.
ROOT and Federated Data Stores What Features We Would Like Fons Rademakers CERN CC-IN2P3, Nov, 2011, Lyon, France.
A P2P-Based Architecture for Secure Software Delivery Using Volunteer Assistance Purvi Shah, Jehan-François Pâris, Jeffrey Morgan and John Schettino IEEE.
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES A. Abramyan, S. Bagansco, S. Banerjee, L. Betev, F. Carminati,
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES A. Abramyan, S. Bagansco, S. Banerjee, L. Betev, F. Carminati,
Xrootd Monitoring and Control Harsh Arora CERN. Setting Up Service  Monalisa Service  Monalisa Repository  Test Xrootd Server  ApMon Module.
Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Tools and techniques for managing virtual machine images Andreas.
JAliEn Java AliEn middleware A. Grigoras, C. Grigoras, M. Pedreira P Saiz, S. Schreiner ALICE Offline Week – June 2013.
AliEn central services Costin Grigoras. Hardware overview  27 machines  Mix of SLC4, SLC5, Ubuntu 8.04, 8.10, 9.04  100 cores  20 KVA UPSs  2 * 1Gbps.
ALICE DATA ACCESS MODEL Outline 05/13/2014 ALICE Data Access Model 2  ALICE data access model  Infrastructure and SE monitoring.
+ AliEn site services and monitoring Miguel Martinez Pedreira.
Bit Torrent Nirav A. Vasa. Topics What is BitTorrent? Related Terms How BitTorrent works Steps involved in the working Advantages and Disadvantages.
T3 data access via BitTorrent Charles G Waldman USATLAS/University of Chicago USATLAS T2/T3 Workshop Aug
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
A. Gheata, ALICE offline week March 09 Status of the analysis framework.
Solutions for the Fourth Problem Set COSC 6360 Fall 2014.
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES L. Betev, A. Grigoras, C. Grigoras, P. Saiz, S. Schreiner AliEn.
Testing CernVM-FS scalability at RAL Tier1 Ian Collier RAL Tier1 Fabric Team WLCG GDB - September
Christmas running post- mortem (Part III) ALICE TF Meeting 15/01/09.
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES P. Saiz The future of AliEn.
A DΙgital Library Infrastructure on Grid EΝabled Technology SAPIR – Search in Audio Visual Content.
36 th LHCb Software Week Pere Mato/CERN.  Provide a complete, portable and easy to configure user environment for developing and running LHC data analysis.
AAF tips and tricks Arsen Hayrapetyan Yerevan Physics Institute, Armenia.
1 Policy Based Systems Management with Puppet Sean Dague
Virtual machines ALICE 2 Experience and use cases Services at CERN Worker nodes at sites – CNAF – GSI Site services (VoBoxes)
Availability of ALICE Grid resources in Germany Kilian Schwarz GSI Darmstadt ALICE Offline Week.
Federating Data in the ALICE Experiment
Installation of the ALICE Software
An example of peer-to-peer application
Introduction to BitTorrent
Dag Toppe Larsen UiB/CERN CERN,
Decentralized peer discovery performance in swarm-protocols
Dag Toppe Larsen UiB/CERN CERN,
Latest WMS news and more
Report PROOF session ALICE Offline FAIR Grid Workshop #1
ALICE FAIR Meeting KVI, 2010 Kilian Schwarz GSI.
BDII Performance Tests
Torrent-based software distribution
Storage elements discovery
AliEn central services (structure and operation)
CMSC 611: Advanced Computer Architecture
ALICE-Grid Activities in Bologna
Distributed P2P File System
Alice Software Demonstration
The BitTorrent Protocol
Content Distribution Networks + P2P File Sharing
Content Distribution Networks + P2P File Sharing
Presentation transcript:

Torrent-based software distribution Costin Grigoras Pablo Saiz ALICE Offline Week – 24.06.2009

Current way of distributing sw SLC4 32bit SLC4 64bit SLC5 32bit SLC5 64bit Build servers AliRoot & deps SLC4 Itanium Mac 32bit Mac 64bit Ubuntu 64bit Catalogue ALICE::CERN::SE AliEn Shared software area NFS/AFS/... VoBox PackMan Grid Site X Worker nodes

Current way of distributing sw Advantages A single service/site manages the installation of required packages Disadvantages Shared software area is a single point of failure / bottleneck Difficult to update packages keeping the version number Need to keep a short list of active software packages

How can we avoid using a shared software area ? Worker nodes are independent Self-consistent software packages are required No site-local software repository Avoid overloading central software repositories Would be nice to be able to quickly update software packages if needed We are trying to use BitTorrent technology to solve all the above

Preparing for torrent package.tar.bz2 Chunks of equal size package.tar.bz2.torrent (tens of KB) Metadata info of the original file: - SHA1 hashes of chunks - SHA1 hash of the entire file * uniquely identifies the file - Tracker location (entry point)

Data flow in torrent networks Tracker Discovery service: keeps track of who has which files/chunks. HTTP-based protocol Seeder Seeder Client Client Clients that have the complete file and serve it Are in the process of downloading the file. Cooperate to download faster.

Implementation in AliEn SLC4 32bit SLC4 64bit SLC5 32bit SLC5 64bit Build servers AliRoot & deps SLC4 Itanium Mac 32bit Mac 64bit Ubuntu 64bit Catalogue torrent://... Seeder alitorrent:8092 Tracker alitorrent:8088 AliEn http://alitorrent.cern.ch VoBox Grid Site X Worker nodes

Implementation in AliEn Worker nodes keep seeding the packages that they have downloaded Other worker nodes will fetch the content mostly from local nodes Worker nodes from site A are usually firewalled from site B, so no inter-site traffic If initial download is not possible via torrent, fall back to wget and then seed the fetched files Multiple versions of the same file can co-exist since they will have different hash codes; old ones will be graciously phased out.

Current status AliEn itself is packaged in a small (35MB) archive AliRoot, Root & deps. packaged in single archives: max. 300MB/job Subatech is used as testbed LDAP flag to switch modes: name=Subatech-CREAM,ou=CE,ou=Services,ou=Subatech,ou=Sites,o=alice,dc=cern,dc=ch installMethod=Torrent Production jobs work fine Analysis jobs fail to load a particular library; most probably a configuration issue that is currently tracked You can download precompiled packages from http://alitorrent.cern.ch/

Future plans Full-scale testing of the solution Evaluate the need for caching On worker nodes, as files On VoBox, as seeder Regional seeders All these would require managers Try to use the solution for distributing data files or pre-compiled PAR files Latest version would be fetched at every execution, no cleanup required for previous ones