1
the Virtual Data Toolkit
distributed by the Open Science Grid
Richard Jones, University of Connecticut
CAT project meeting, June 24, 2008
2
the UConn Grendl cluster
  62 dual-processor nodes
  mix of old and newer cpus
  7 TB of shared storage
  condor job management
  heavy reliance on nfs
  home-built processing workflow package called “openShop”
3
the UConn Grendl cluster
  Efficient MPI job scheduling using the condor “parallel universe”
  Large datasets staged on large distributed “parallel virtual file system” (pvfs) volumes
    high throughput
    low cost: no dedicated file servers
    reduced coupling between cpu and data location
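As a rough illustration of the condor “parallel universe” named on this slide, the sketch below renders a submit description for an MPI job from Python. The submit commands it emits (universe, executable, arguments, machine_count, output, error, log, queue) are standard condor ones. The wrapper script name mpi_wrapper.sh, the program name pwa_fit, and the node count are hypothetical placeholders, not details from the talk.

```python
def parallel_submit(wrapper, program, machine_count, tag="mpi"):
    """Render a condor submit description for a parallel-universe job.

    wrapper, program and tag are placeholder names (assumptions),
    not files from the Grendl cluster.
    """
    if machine_count < 1:
        raise ValueError("machine_count must be at least 1")
    lines = [
        "universe = parallel",
        f"executable = {wrapper}",
        f"arguments = {program}",
        f"machine_count = {machine_count}",
        # $(NODE) expands to the index of each node within the job
        f"output = {tag}.$(NODE).out",
        f"error = {tag}.$(NODE).err",
        f"log = {tag}.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example: an 8-node MPI job started through a wrapper script.
    print(parallel_submit("mpi_wrapper.sh", "pwa_fit", 8), end="")
```

The rendered text would be saved to a file and handed to condor_submit. Requesting the nodes through machine_count is what lets the scheduler start all ranks of the job together.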
4
Obstacles to scaling
  nfs
    servers x clients = N^2 problem
    1 server down hangs/drags all N clients
    starts to be a problem with 62 nodes
    cross-site nfs is an admin nightmare!
  pvfs
    1 server down hangs entire volume
    poor recovery, compared to nfs
    invasive installation procedure
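The two scaling obstacles above can be put into numbers with a small sketch. The first function counts server-client mount relationships when every node both exports and mounts storage. The second estimates how often a striped volume is usable if any single server outage hangs it. The independence of failures, the 0.99 per-server uptime, and the 8-server volume are illustrative assumptions, not figures from the talk.

```python
def nfs_mount_pairs(servers, clients):
    """Number of server-client mount relationships to administer."""
    return servers * clients


def striped_volume_availability(per_server_uptime, servers):
    """Probability that a volume striped over all servers is usable.

    Assumes independent failures and no redundancy (an assumption):
    the volume is up only when every one of its servers is up.
    """
    return per_server_uptime ** servers


if __name__ == "__main__":
    n = 62  # node count quoted for the Grendl cluster
    print(nfs_mount_pairs(n, n))  # 62 * 62 = 3844
    # Hypothetical uptime and server count, for illustration only.
    print(round(striped_volume_availability(0.99, 8), 3))
```

With every one of the 62 nodes acting as both server and client, the mount count is already in the thousands, which is consistent with the slide's remark that problems start at this size.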
5
CAT project scaling
  data
    large base-input datasets
      non-volatile
      non-replicated
    relatively compact PWA event lists
      volatile
      replicated
    complex workflow pattern
    global management scheme is needed
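To make the two data classes on this slide concrete, here is a minimal sketch of the bookkeeping a global management scheme would need at the very least: which sites hold a copy of a dataset, and which copies go stale when a volatile dataset changes. The Dataset record, the stale_sites helper, and the site names are all hypothetical. The talk does not describe an implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str        # hypothetical logical dataset name
    volatile: bool   # True if contents change during the analysis
    sites: list = field(default_factory=list)  # sites holding a copy

    @property
    def replicated(self):
        """A dataset is replicated when more than one site holds it."""
        return len(self.sites) > 1


def stale_sites(ds, updated_site):
    """Sites whose copy must be refreshed after ds changes at updated_site."""
    if not ds.volatile:
        return []  # non-volatile data never changes, so nothing goes stale
    return [s for s in ds.sites if s != updated_site]


if __name__ == "__main__":
    base = Dataset("base-input", volatile=False, sites=["uconn"])
    events = Dataset("pwa-events", volatile=True, sites=["uconn", "site-b"])
    print(base.replicated, stale_sites(base, "uconn"))
    print(events.replicated, stale_sites(events, "uconn"))
```

The large base-input datasets need only a location record, while the volatile, replicated event lists are the ones that require cross-site tracking.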
6
CAT project scaling
  processor
    co-scheduling
    cpu resource allocation in clusters
    network latency
    allocation persistence
    not tied to client location
    access independent of local userid
    global resource monitoring required