DUCKS – Distributed User-mode Chirp-Knowledgeable Server
Joe Thompson
Jay Doyle
DUCKS Motivation
Performing distributed jobs on many files comes with many problems:
  Storage space.
  If all files are stored on one machine, many concurrent jobs hammer that single machine with I/O.
Solution: Distribute the files to multiple machines.
  This presents a file management problem (confusing).
Proposed solution: A tracking system that abstracts file location from the user.
DUCKS Features
Bring together the functionality of Condor and CHIRP in an easy-to-use package.
Abstract the Condor and CHIRP interfaces.
Intelligently distribute files over CHIRP servers.
Provide a semantic per-user namespace for file storage.
Provide a simple interface using Condor for a Program-to-Data model.
Provide a simple interface using Condor for a Data-to-Program model.
DUCKS Components
DUCKS Server:
  C Server:
    Event-based transaction handler process to interface with clients.
    Background CHIRP monitoring process.
    Background cleanup process.
  MySQL Database:
    Handles the metadata of the file system.
    Handles user permissions and state.
DUCKS Client:
  Standalone command line scripts that interface with the DUCKS server, CHIRP file system, and Condor.
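The metadata role of the database can be illustrated with a small sketch. This is not the DUCKS schema: the table names, column names, and helper functions are hypothetical, and Python's built-in sqlite3 stands in for MySQL so the example runs without a database server. It only shows how a per-user semantic name might map to a CHIRP server and path.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE files (
    file_id    INTEGER PRIMARY KEY,
    owner_id   INTEGER NOT NULL REFERENCES users(user_id),
    ducks_name TEXT NOT NULL,                    -- semantic name the user sees
    chirp_host TEXT NOT NULL,                    -- CHIRP server holding the file
    chirp_path TEXT NOT NULL,                    -- unique path on that server
    state      TEXT NOT NULL DEFAULT 'pending',  -- assumed state column
    UNIQUE (owner_id, ducks_name)                -- one namespace per user
);
"""


def open_metadata_db():
    """Create an in-memory database (sqlite3 used as a stand-in for MySQL)."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def register_file(db, username, ducks_name, chirp_host, chirp_path):
    """Record where a user's file was stored."""
    db.execute("INSERT OR IGNORE INTO users (username) VALUES (?)", (username,))
    (owner_id,) = db.execute(
        "SELECT user_id FROM users WHERE username = ?", (username,)
    ).fetchone()
    db.execute(
        "INSERT INTO files (owner_id, ducks_name, chirp_host, chirp_path) "
        "VALUES (?, ?, ?, ?)",
        (owner_id, ducks_name, chirp_host, chirp_path),
    )
    db.commit()


def locate_file(db, username, ducks_name):
    """Return (chirp_host, chirp_path), or None if the user has no such file."""
    return db.execute(
        "SELECT f.chirp_host, f.chirp_path FROM files f "
        "JOIN users u ON u.user_id = f.owner_id "
        "WHERE u.username = ? AND f.ducks_name = ?",
        (username, ducks_name),
    ).fetchone()
```

The uniqueness constraint on (owner_id, ducks_name) is one way to get a per-user namespace: two users can use the same semantic name without colliding.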
DUCKS File Input
User calls a client side script.
Request to store goes to the server.
Server chooses CHIRP server and unique pathname.
Server sends this info back to the client.
Client utilizes chirp_put to store the file.
Client responds to the server.
Server updates the database.
[Diagram: the client exchanges SEND: DUCKS Name and GET: CHIRP Path with DUCKS, then performs STORE: CHIRP Path against CHIRP. Key: Data Transfer, Database Query.]
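The store sequence above can be sketched as follows. This is a minimal illustration under stated assumptions, not the DUCKS implementation: the function names, the in-memory `catalog` dictionary standing in for the database, and the least-loaded placement policy are all hypothetical. The `chirp_put` argument order (local file, host, remote path) is assumed from the cctools command line tool and should be checked against the installed version.

```python
import uuid


def choose_placement(server_load, username):
    """Server side: pick a CHIRP server and a unique pathname.

    Assumption: "least loaded" is the placement policy; the slides do not
    say how DUCKS actually chooses.
    """
    host = min(server_load, key=server_load.get)
    path = f"/ducks/{username}/{uuid.uuid4().hex}"
    return host, path


def chirp_put_command(local_file, host, path):
    """Client side: build the chirp_put call (assumed argument order)."""
    return ["chirp_put", local_file, host, path]


def store(catalog, server_load, username, ducks_name, local_file, run=None):
    """Walk the store sequence once and return the command the client would run."""
    # Server chooses CHIRP server and unique pathname, then tells the client.
    host, path = choose_placement(server_load, username)
    # Client stores the file with chirp_put.
    cmd = chirp_put_command(local_file, host, path)
    if run is not None:
        run(cmd)  # e.g. lambda c: subprocess.run(c, check=True)
    # Client responds; server updates its records.
    catalog[(username, ducks_name)] = (host, path)
    server_load[host] += 1
    return cmd
```

Passing `run=None` keeps the sketch side-effect free, which makes it easy to inspect the generated command before pointing it at a real CHIRP server.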
DUCKS File Retrieval
User calls a client side script.
Request for semantic name goes to the server.
Server responds with CHIRP server and path.
Client utilizes chirp_get to retrieve the file.
[Diagram: the client exchanges SEND: DUCKS Name and GET: CHIRP Path with DUCKS, then REQUEST: CHIRP Path and GET: File with CHIRP. Key: Data Transfer, Database Query.]
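The retrieval sequence above can be sketched in a few lines. The names here are hypothetical: `catalog` is a plain dictionary standing in for the DUCKS server lookup, and the `chirp_get` argument order (host, remote path, local file) is assumed from the cctools command line tool rather than taken from the slides.

```python
def chirp_get_command(host, path, local_file):
    """Client side: build the chirp_get call (assumed argument order)."""
    return ["chirp_get", host, path, local_file]


def retrieve(catalog, username, ducks_name, local_file):
    """Resolve a semantic name to its CHIRP location and build the fetch command.

    `catalog` maps (username, ducks_name) -> (chirp_host, chirp_path); it is a
    stand-in for asking the DUCKS server.
    """
    try:
        host, path = catalog[(username, ducks_name)]
    except KeyError:
        raise FileNotFoundError(
            f"no DUCKS entry for {ducks_name!r} owned by {username!r}"
        ) from None
    return chirp_get_command(host, path, local_file)
```

A client script would then execute the returned list, for example with `subprocess.run(cmd, check=True)`.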
DUCKS Job Execution
2 Modes: Job-to-Data & Data-to-Job
Job-to-Data:
  User sends file list to the server (with executable).
  Server builds a Condor script for all requested files on a given machine and sends the script back to the user.
  The user submits the job requests.
Data-to-Job:
  User script builds a Condor wrapper script that requests files from DUCKS on any machine it gets to use.
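For the Job-to-Data mode, one possible shape of the generated Condor script is sketched below. The function name and input format are hypothetical, and the sketch assumes that each CHIRP server's hostname matches the Condor `Machine` attribute of the node holding the files, which the slides do not state. It groups files by machine and emits one queued job per file, pinned to that machine with a `requirements` expression.

```python
from collections import defaultdict


def build_submit_description(executable, file_locations):
    """Build a Condor submit description that runs jobs where the data lives.

    file_locations: iterable of (chirp_host, chirp_path) pairs.
    Assumption: chirp_host is also the Condor Machine name of that node.
    """
    by_host = defaultdict(list)
    for host, path in file_locations:
        by_host[host].append(path)

    lines = ["universe = vanilla", f"executable = {executable}"]
    for host in sorted(by_host):
        # Settings stay in effect for every following queue statement
        # until they are overridden.
        lines.append(f'requirements = (Machine == "{host}")')
        for path in by_host[host]:
            lines.append(f"arguments = {path}")
            lines.append("queue")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_submit_description(
        "analyze",
        [("a.example.org", "/ducks/1"), ("b.example.org", "/ducks/2")],
    ))
```

The returned text would be written to a file and handed to `condor_submit` by the user, matching the last step of the Job-to-Data flow.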
DUCKS File & Job Input
[Diagram with the labels: User 1, DUCKS Client, DUCKS Server, DUCKS Database, and CHIRP servers Chirp 1, Chirp 2, and Chirp 3, with three files and one job shown among them. Key: Data Transfer, Database Query.]
DUCKS Progress
Server 60% complete.
  All functionality has been outlined with skeleton code.
  Connect, Disconnect, & Store messages are mostly functional.
  CHIRP Tracker 100% functional.
  Background Retry process 100% functional.
  Database table structure and interface are mostly functional.
Client 30% complete.
  Connect and Disconnect messages 100%.
  Store 70%.
  Retrieve, Remove, and LS have skeleton code.
  Condor execution scripts are designed but need porting.
DUCKS Demo