AMUN: A Practical Application Using the Nile Distributed Operating System (CHEP 2000, 2/8/00)


AMUN: A Practical Application Using the Nile Distributed Operating System
CHEP 2000, 2/8/00

Authors:
- R. Baker (Cornell University, Ithaca, NY USA)
- L. Zhou (University of Florida, Gainesville, FL USA)
- J. Duboscq (Ohio State University, Columbus, OH USA)

Presented by: D. Mimnagh (University of Texas, Austin, TX USA)

Overview
- What is Nile?
- What is AMUN?
- Results
- Conclusions

What is Nile?
- Nile: distributed computing solution for CLEO
  - fault-tolerant (recover from resource failure)
  - self-managing (sophisticated resource scheduling)
  - heterogeneous (will run anything anywhere)
- Designed for HEP
  - track reconstruction
  - data analysis
  - simulation
- But very generic

Nile Architecture [figure]

What is AMUN?
- Advanced Monte Carlo Under Nile
- CLEO II.V signal Monte Carlo
  - τ lepton pair events
- Testbed
  - Nile control system using RMI (see E272)
  - Borrowed workstation program

Managing Loaned Workstations
- Prototype
  - csh scripts
  - list of machine owners
- Must be easy and honest
  - simple creation of configuration files
  - monitor usage remotely and locally
  - allow preemption for unexpected usage
  - need local space for intermediate results
- Will be integrated with Nile in Java
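The loaned-workstation requirements above (a simple configuration file, preemption when the machine is needed, local space for intermediate results) can be illustrated with a minimal sketch. This is not the AMUN prototype, which used csh scripts; the `key = value` file format, the field names, and the load threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LoanPolicy:
    """Hypothetical per-machine loan settings (not the actual AMUN format)."""
    owner: str
    max_owner_load: float  # load above which the machine is assumed to be needed
    scratch_dir: str       # local space for intermediate results


def parse_policy(text: str) -> LoanPolicy:
    """Parse 'key = value' lines; blank lines and '#' comments are skipped."""
    fields = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return LoanPolicy(
        owner=fields["owner"],
        max_owner_load=float(fields["max_owner_load"]),
        scratch_dir=fields["scratch_dir"],
    )


def should_preempt(policy: LoanPolicy, owner_logged_in: bool,
                   load_from_others: float) -> bool:
    """Give the machine back when the owner appears or other load is high."""
    return owner_logged_in or load_from_others > policy.max_owner_load


# Hypothetical configuration for one loaned workstation.
EXAMPLE = """
# assumed format, for illustration only
owner = jdoe
max_owner_load = 0.5
scratch_dir = /tmp/amun
"""

policy = parse_policy(EXAMPLE)
```

Keeping the policy in a flat text file is one way to satisfy the "easy and honest" requirement: a machine owner can read and change it without knowing anything about the scheduler.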

Results: Nile performance
- Very stable
  - weeks of uninterrupted use
- Heterogeneity
  - as many as 60 machines, Alpha Linux + Unix
  - SpecInt ranging from 1 to 25
- Scaling
  - linear
  - network topology issues can break linearity
  - 1-3 seconds to reschedule CPU
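The linear-scaling result can be read as a simple model: aggregate throughput is proportional to the summed SpecInt rating of the machines in use. The sketch below assumes that model and nothing more; the ratings and the events-per-day constant are made-up illustrative values, not measurements from the talk.

```python
def expected_rate(ratings, events_per_day_per_specint):
    """Linear model: throughput is proportional to total SpecInt."""
    return events_per_day_per_specint * sum(ratings)


def work_shares(ratings):
    """Fraction of the work each machine would do under the linear model."""
    total = sum(ratings)
    return [r / total for r in ratings]


# Hypothetical ratings inside the 1 to 25 range quoted in the talk.
ratings = [1, 5, 25]
shares = work_shares(ratings)

# Under the model, doubling the pool doubles the expected rate.
rate_one_pool = expected_rate(ratings, 100.0)
rate_two_pools = expected_rate(ratings * 2, 100.0)
```

The model ignores the network topology effects that the talk says can break linearity, so it is best treated as an upper bound.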

Scaling with Total SpecInt [figure]

Events Generated
- Job construction requirements:
  - choose subjob size
  - collection script
- 25 million τ events generated
  - as many as 1 million a day
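Job construction amounts to cutting the requested event count into subjobs of a chosen size, so that a collection step can later gather the pieces. The sketch below shows one way to do that split, plus the arithmetic implied by the quoted numbers: 25 million events at a peak of 1 million a day take at least 25 days. The function names and the small example sizes are hypothetical.

```python
import math


def split_job(total_events: int, subjob_size: int):
    """Return (start, count) pairs covering total_events with no gaps or overlap."""
    if total_events < 0 or subjob_size <= 0:
        raise ValueError("total_events must be >= 0 and subjob_size > 0")
    return [
        (start, min(subjob_size, total_events - start))
        for start in range(0, total_events, subjob_size)
    ]


def min_days(total_events: int, peak_events_per_day: int) -> int:
    """Lower bound on production time if every day ran at the peak rate."""
    return math.ceil(total_events / peak_events_per_day)


# Small illustrative split: 10 events in subjobs of at most 4.
subjobs = split_job(10, 4)

# Figures from the talk: 25 million events, up to 1 million per day.
days = min_days(25_000_000, 1_000_000)
```

A smaller subjob size loses less work when a loaned machine is preempted, at the cost of more pieces for the collection script to handle.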

Conclusion
- Successful implementation of Nile in RMI
- CPU resources used efficiently
  - loaned CPU
- To do:
  - rewrite scripts in Java
  - admin tools
  - GUI tools