Massive Ray Tracing in Fusion Plasmas on EGEE
J.L. Vázquez-Poletti, E. Huedo, R.S. Montero and I.M. Llorente
Distributed Systems Architecture Group, Universidad Complutense de Madrid (Spain)
EGEE User Forum, 1st to 3rd March, CERN (Geneva)

What are we going to see?
• MA-RA-TRA/G: a computational view
• Execution using the LCG-2 Resource Broker
• Execution using the GridWay Meta-Scheduler
• Comparison
• Conclusions
• “Our two cents”

MA-RA-TRA/G: a computational view
• MA-RA-TRA/G activity: “Massive Ray Tracing in Fusion Plasmas on Grids”
• Application profile:
  – Sizes:
    • Executable (Truba): 1.8 MB
    • Input files: 70 KB
    • Output files: about 459 KB
  – Execution time: about 26 minutes on a Pentium 4 (3.2 GHz)
  – 1 execution = 1 ray traced

Execution using the LCG-2 Resource Broker
• lcg2.1.9 User Interface C++ API
• 1 job = 1 ray
• Procedure (see the command-line sketch after this list):
  – Launcher script:
    • Generates JDL files
  – Framework over LCG-2:
    • Launches them simultaneously
    • Queries each job's state periodically
    • Retrieves each job's output (Sandbox)
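As a minimal sketch of this procedure: the authors drove LCG-2 through the lcg2.1.9 C++ API, but the same steps can be illustrated with the equivalent LCG-2 command-line tools. Every name below (truba, input_*.dat, ray_*.jdl, jobids.txt, the ray count) is hypothetical; this is an illustration of the launcher's logic, not the actual framework.

#!/bin/bash
# Sketch of the procedure above, using the LCG-2 CLI rather than
# the lcg2.1.9 C++ API actually used. All names are hypothetical.
NRAYS=100   # hypothetical number of rays; 1 job = 1 ray

# Launcher script: generate one JDL file per ray
for i in $(seq 1 $NRAYS); do
cat > ray_${i}.jdl <<EOF
Executable    = "truba";
Arguments     = "${i}";
StdOutput     = "ray_${i}.out";
StdError      = "ray_${i}.err";
InputSandbox  = {"truba", "input_${i}.dat"};
OutputSandbox = {"ray_${i}.out", "ray_${i}.err", "result_${i}.dat"};
EOF
done

# Framework: launch all jobs (near-)simultaneously, keeping job IDs
for i in $(seq 1 $NRAYS); do
  edg-job-submit --vo swetest -o jobids.txt ray_${i}.jdl
done

# ... query each job's state periodically ...
edg-job-status -i jobids.txt

# ... and retrieve each finished job's output sandbox
edg-job-get-output -i jobids.txt --dir ./results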

Execution using the LCG-2 Resource Broker
[Diagram: the MRT Launcher generates JDL files, launches Jobs, queries their state and retrieves their output]

Execution using the LCG-2 Resource Broker
[Figure: the SWETEST VO testbed]

Execution using the LCG-2 Resource Broker
• Total Time: 220 minutes (3.67 hours)
• Execution Time:
  – Average: minutes
  – Std. Deviation: minutes
• Transfer Time:
  – Average: 0.42 minutes
  – Std. Deviation: 0.06 minutes
• Avg. Productivity: Jobs/hour
• Avg. Overhead: 1.82 minutes/job

Execution using the LCG-2 Resource Broker
[Chart: CPU normalized to a reference value of 1000 SpecInt2000 (Pentium 4, 3.2 GHz); 42.86]

Execution using the LCG-2 Resource Broker
• As in the real world, some jobs failed
  – Jobs affected: 31
  – Max. resubmissions/job: 1
• Problems encountered:
  – LCG-2 infrastructure:
    • Lack of opportunistic migration
    • No slowdown detection
    • Jobs assigned to busy resources
  – The API itself:
    • Fails when submitting more than 80 jobs in a Collection (found empirically)
    • Not a standard

Execution using the GridWay Meta-Scheduler
• Open-source meta-scheduling framework
• Works on top of Globus services
• Performs:
  – Job execution management
  – Resource brokering
• Allows unattended, reliable and efficient execution of:
  – single jobs, array jobs and complex jobs
  – on heterogeneous, dynamic and loosely-coupled grids

Execution using the GridWay Meta-Scheduler
• Works transparently to the end user
• Adapts job execution to changing Grid conditions:
  – Fault recovery
  – Dynamic scheduling
  – Migration on request
• Schedules using Information System data (GLUE schema) from LCG-2
• Runs on the client side

Execution using the GridWay Meta-Scheduler
• Execution as seen by GridWay (see the job-template sketch after this list):
  – Prolog: prepares the remote system
    • Creates a working directory
    • Transfers input files and the executable
  – Wrapper: executes the job and gets its exit code
  – Epilog: finalizes the remote system
    • Transfers output files
    • Cleans up the directory
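A minimal sketch of the same workflow on the GridWay side: one ray described as a GridWay job template and submitted from the client with the GridWay CLI. The attribute names follow GridWay's job template format; the file names (truba, input_1.dat, result_1.dat), the requirement threshold and the argument are hypothetical, carried over from the earlier sketch.

#!/bin/bash
# Sketch: one ray as a GridWay job template, submitted with the
# GridWay CLI. GridWay itself then runs the Prolog, Wrapper and
# Epilog stages described above on our behalf.
# REQUIREMENTS and RANK are evaluated against the GLUE data that
# GridWay harvests from the LCG-2 Information System.
cat > ray_1.jt <<EOF
EXECUTABLE   = truba
ARGUMENTS    = 1
INPUT_FILES  = input_1.dat
OUTPUT_FILES = result_1.dat
STDOUT_FILE  = ray_1.out
STDERR_FILE  = ray_1.err
REQUIREMENTS = CPU_MHZ > 1000
RANK         = CPU_MHZ
EOF

gwsubmit -t ray_1.jt   # submit; GridWay handles scheduling, recovery, migration
gwps                   # watch the job move through pend/prol/wrap/epil/done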

Execution using the GridWay Meta-Scheduler
[Figure: the SWETEST VO testbed]

Execution using the GridWay Meta-Scheduler
• Total Time: 123.6 minutes (2.06 hours)
• Execution Time:
  – Average: 36.8 minutes
  – Std. Deviation: minutes
• Transfer Time:
  – Average: 0.87 minutes
  – Std. Deviation: 0.51 minutes
• Avg. Productivity: Jobs/hour
• Avg. Overhead: 0.52 minutes/job

Execution using the GridWay Meta-Scheduler
[Chart: CPU normalized to a reference value of 1000 SpecInt2000 (Pentium 4, 3.2 GHz); 48.54]

Execution using the GridWay Meta-Scheduler
• Also with GridWay, some jobs failed: 1
• Reschedules: 21
  – Max./job: 4

Comparison

REMEMBER: With GridWay, only 1 job failed (and was resubmitted)

Comparison

Conclusions
• GridWay obtains higher productivity
  – Reduces the number of nodes and stages
  – Provides mechanisms LCG-2 lacks:
    • Opportunistic migration
    • Performance slowdown detection
• APIs:
  – LCG-2's relies on specific middleware
  – The DRMAA implementation doesn't:
    • GGF standard
    • Job synchronization, termination and suspension

“Our two cents”
• Data from the Information System should:
  – be updated more frequently
  – represent the “real” situation

Thank you for your attention!
Want to give GridWay a try? Download it!