
1 AMUN: A Practical Application Using the Nile Distributed Operating System
2/8/00 CHEP2000
Authors:
R. Baker (Cornell University, Ithaca, NY USA)
L. Zhou (University of Florida, Gainesville, FL USA)
J. Duboscq (Ohio State University, Columbus, OH USA)
Presented by: D. Mimnagh (University of Texas, Austin, TX USA)

2 Overview
What is Nile?
What is AMUN?
Results
Conclusions

3 What is Nile?
Nile: Distributed computing solution for CLEO
  – fault-tolerant (recover from resource failure)
  – self-managing (sophisticated resource scheduling)
  – heterogeneous (will run anything anywhere)
Designed for HEP
  – track reconstruction
  – data analysis
  – simulation
But very generic

4 Nile Architecture

5 What is AMUN?
Advanced Monte Carlo Under Nile
CLEO II.V signal Monte Carlo
  – τ lepton pair events
Testbed
  – Nile control system using RMI (see E272)
  – Borrowed workstation program

6 Managing Loaned Workstations
Prototype
  – csh scripts
  – list of machine owners
Must be easy and honest
  – simple creation of configuration files
  – monitor usage remotely and locally
  – allow preemption for unexpected usage
  – need local space for intermediate results
Will be integrated with Nile in Java
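
The requirements on this slide (a simple per-machine configuration, usage monitoring, and preemption when the owner needs the machine) can be sketched roughly as follows. This is only an illustration in Python; the prototype itself used csh scripts, and every field name, threshold, and host value here is a hypothetical stand-in rather than anything taken from AMUN.

```python
from dataclasses import dataclass


@dataclass
class LoanedHost:
    # All field names are hypothetical; the slide only lists the requirements.
    hostname: str
    owner: str              # entry from the list of machine owners
    scratch_dir: str        # local space for intermediate results
    max_owner_load: float   # owner load above which the borrowed job yields


def should_preempt(host: LoanedHost, owner_logged_in: bool, owner_load: float) -> bool:
    """Yield the machine if the owner is present or their own work is busy."""
    return owner_logged_in or owner_load > host.max_owner_load


# Made-up example entry for one borrowed workstation.
host = LoanedHost("ws01", "jdoe", "/tmp/amun", max_owner_load=0.5)
```

Keeping the decision in one small function is one way to make the policy easy for machine owners to inspect, which fits the "easy and honest" goal.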

7 Results: Nile performance
Very stable
  – weeks of uninterrupted use
Heterogeneity
  – as many as 60 machines, Alpha Linux + Unix
  – SpecInt ranging from 1 to 25
Scaling
  – linear
  – network topology issues can break linearity
  – 1-3 seconds to reschedule a CPU
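
As a rough illustration of what linear scaling across machines with very different SpecInt ratings implies, the sketch below divides a batch of events in proportion to each machine's rating, so that all machines would finish at about the same time. The function name and the three ratings are made up for illustration; this is not Nile's scheduler.

```python
def share_events(total_events, specints):
    """Split total_events across machines in proportion to their SpecInt rating."""
    total = sum(specints)
    shares = [total_events * s // total for s in specints]
    # Integer division can leave a few events over; give them to the fastest machine.
    fastest = specints.index(max(specints))
    shares[fastest] += total_events - sum(shares)
    return shares


# Three hypothetical machines spanning the 1 to 25 SpecInt range.
print(share_events(1000, [1, 4, 25]))  # prints [33, 133, 834]
```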

8 Scaling with Total SpecInt

9 Events Generated
Job construction requirements:
  – choose subjob size
  – collection script
25 million τ events generated, as many as 1 million a day
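
Choosing a subjob size amounts to cutting the requested event count into fixed-size pieces whose output is later gathered by the collection script. A minimal sketch, assuming subjobs are described by a starting event index and a count (both the representation and the function names are hypothetical):

```python
import math


def split_job(total_events, subjob_size):
    """Return (first_event, n_events) pairs that together cover total_events."""
    return [
        (first, min(subjob_size, total_events - first))
        for first in range(0, total_events, subjob_size)
    ]


def min_days(total_events, peak_events_per_day):
    """Lower bound on running time if every day reached the peak rate."""
    return math.ceil(total_events / peak_events_per_day)


print(split_job(10, 4))                  # prints [(0, 4), (4, 4), (8, 2)]
print(min_days(25_000_000, 1_000_000))   # prints 25
```

Since 1 million events a day is the peak rate quoted on the slide, 25 days is only a lower bound on how long the 25 million events took to generate.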

10 Conclusion
Successful implementation of Nile in RMI
CPU resources used efficiently
  – loaned CPU
To do:
  – rewrite scripts in Java
  – admin tools
  – GUI tools

