Tevfik Kosar Computer Sciences Department University of Wisconsin-Madison Managing and Scheduling Data.


1 Managing and Scheduling Data Placement (DaP) Requests
Tevfik Kosar
Computer Sciences Department, University of Wisconsin-Madison
kosart@cs.wisc.edu
http://www.cs.wisc.edu/condor

2 Outline
› Motivation
› DaP Scheduler
› Case Study: DAGMan
› Conclusions

3 Demand for Storage
› Applications require access to larger and larger amounts of data
  - Database systems
  - Multimedia applications
  - Scientific applications
    e.g. High Energy Physics & Computational Genomics
    Currently terabytes, soon petabytes of data

4 Is Remote Access Good Enough?
› Huge amounts of data (mostly on tape)
› Large number of users
› Distance / low bandwidth
› Different platforms
› Scalability and efficiency concerns
=> Middleware is required

5 Two Approaches
› Move job/application to the data
  - Less common
  - Insufficient computational power on storage site
  - Not efficient
  - Does not scale
› Move data to the job/application

6 Move Data to the Job
[Diagram labels: huge tape library (terabytes); remote staging area; WAN; local storage area (e.g. local disk, NeST server); LAN; compute cluster]

7 Main Issues
› 1. Insufficient local storage area
› 2. CPU should not wait long for I/O
› 3. Crash recovery
› 4. Different platforms & protocols
› 5. Make it simple

8 Data Placement Scheduler (DaPS)
› Intelligently manages and schedules Data Placement (DaP) activities/jobs
› DaPS is to DaP jobs what Condor is to computational jobs
› Just submit a bunch of DaP jobs and then relax.

9 DaPS Architecture
[Diagram labels: DaPS Client; DaPS Server (Accept, Exec., Sched.); Req. Buffer; Req. Queue; Local; Remote; GridFTP Server; NeST Server; SRB Server; Local Disk; GridFTP Server; SRM Server; third-party transfer; Get; Put]

10 DaPS Client Interface
› Command line:
  - dap_submit
› API:
  - dapclient_lib.a
  - dapclient_interface.h

11 DaP Jobs
› Defined as ClassAds
› Currently four types:
  - Reserve
  - Release
  - Transfer
  - Stage

12 DaP Job ClassAds
[
  Type = Reserve;
  Server = nest://turkey.cs.wisc.edu;
  Size = 100MB;
  reservation_no = 1;
  ……
]
[
  Type = Transfer;
  Src_url = srb://ghidorac.sdsc.edu/kosart.condor/x.dat;
  Dst_url = nest://turkey.cs.wisc.edu/kosart/x.dat;
  reservation_no = 1;
  ......
]
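A minimal sketch of how the two ClassAds on this slide could be generated from a program. The helper `format_classad` is hypothetical and is not part of the DaPS client library; only the attributes visible on the slide are included, and the elided ones are left out.

```python
# Hypothetical helper: the real DaPS client API is not shown in the slides.
def format_classad(attrs):
    """Render a dict as a bracketed list of 'Name = value;' lines."""
    lines = [f"  {name} = {value};" for name, value in attrs.items()]
    return "[\n" + "\n".join(lines) + "\n]"


# Attribute names and values copied from the slide.
reserve = {
    "Type": "Reserve",
    "Server": "nest://turkey.cs.wisc.edu",
    "Size": "100MB",
    "reservation_no": 1,
}
transfer = {
    "Type": "Transfer",
    "Src_url": "srb://ghidorac.sdsc.edu/kosart.condor/x.dat",
    "Dst_url": "nest://turkey.cs.wisc.edu/kosart/x.dat",
    "reservation_no": 1,
}

if __name__ == "__main__":
    print(format_classad(reserve))
    print(format_classad(transfer))
```

Both ads carry the same reservation_no, which is what ties the transfer to the space reserved on the NeST server.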

13 Supported Protocols
› Currently supported:
  - FTP
  - GridFTP
  - NeST (chirp)
  - SRB (Storage Resource Broker)
› Very soon:
  - SRM (Storage Resource Manager)
  - GDMP (Grid Data Management Pilot)
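A rough sketch of how a transfer request could be checked against the currently supported protocols by looking at the URL scheme. The scheme-to-protocol table is an assumption: `nest` and `srb` appear in the example URLs in these slides, while `ftp` and `gsiftp` (the scheme commonly used for GridFTP) are guesses at what DaPS accepts. The function names are hypothetical.

```python
from urllib.parse import urlsplit

# Assumed mapping from URL scheme to the protocols listed as "currently supported".
SUPPORTED_SCHEMES = {
    "ftp": "FTP",
    "gsiftp": "GridFTP",
    "nest": "NeST (chirp)",
    "srb": "SRB",
}


def protocol_for(url):
    """Return the protocol name for a URL, or None if its scheme is not supported."""
    return SUPPORTED_SCHEMES.get(urlsplit(url).scheme)


def can_transfer(src_url, dst_url):
    """A transfer is accepted only if both endpoints use a supported protocol."""
    return protocol_for(src_url) is not None and protocol_for(dst_url) is not None
```

With this table, a request whose source or destination uses SRM or GDMP would be rejected until those protocols are added.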

14 Case Study: DAGMan
[Diagram labels: .dag file; DAGMan; Condor job queue; jobs A, B, C, D]

15 Current DAG Structure
› All jobs are assumed to be computational jobs
[Diagram labels: Job A; Job B; Job C; Job D]

16 Current DAG Structure
› If data transfer to/from remote sites is required, this is performed via pre- and post-scripts attached to each job.
[Diagram labels: Job A; PRE, Job B, POST; Job C; Job D]

17 New DAG Structure
Add DaP jobs to the DAG structure
[Diagram labels: PRE, Job B, POST; Reserve in & out; Transfer in; Job B; Transfer out; Release in; Release out]
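A small sketch of the idea on this slide: the work formerly hidden in the PRE and POST scripts becomes explicit DaP nodes around Job B. The node names and the dependency edges are an assumed reading of the diagram (reserve space, transfer in, run the job, then transfer out and release), not taken from a real .dag file.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of nodes that must finish before it can start.
# These edges are an assumption based on the labels in the slide's diagram.
dependencies = {
    "transfer_in": {"reserve"},
    "job_B": {"transfer_in"},
    "release_in": {"job_B"},
    "transfer_out": {"job_B"},
    "release_out": {"transfer_out"},
}


def execution_order(deps):
    """Return one valid order in which the nodes could be run."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(execution_order(dependencies))
```

Any order produced this way starts with the reservation and runs Job B only after its input has been transferred in.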

18 New DAGMan Architecture
[Diagram labels: .dag file; DAGMan; Condor job queue; DaPS job queue; jobs A, B, C, D, X, Y]
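A hedged sketch of the routing this architecture implies: computational jobs go to the Condor job queue and DaP jobs go to the DaPS job queue. The `dispatch` function and the "Compute" type label are hypothetical; the four DaP types are the ones listed earlier in the slides.

```python
from collections import deque

# The four DaP job types named in the slides.
DAP_TYPES = {"Reserve", "Release", "Transfer", "Stage"}


def dispatch(jobs):
    """Split (name, job_type) pairs into a Condor queue and a DaPS queue."""
    condor_queue, daps_queue = deque(), deque()
    for name, job_type in jobs:
        if job_type in DAP_TYPES:
            daps_queue.append(name)
        else:
            # Anything that is not a DaP job is treated as a computational job.
            condor_queue.append(name)
    return condor_queue, daps_queue


if __name__ == "__main__":
    ready = [("A", "Compute"), ("X", "Transfer"), ("B", "Compute"), ("Y", "Release")]
    condor, daps = dispatch(ready)
    print(list(condor), list(daps))
```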

19 Conclusion
› More intelligent management of remote data transfer & staging
  - increase local storage utilization
  - maximize CPU throughput

20 Future Work
› Enhanced interaction with DAGMan
› Data Level Management instead of File Level Management
› Possible integration with Kangaroo to keep the network pipeline full

21 Thank You for Listening & Questions
› For more information
  - Drop by my office anytime
    Room: 3361, Computer Science & Stats. Bldg.
  - Email to: condor-admin@cs.wisc.edu

