EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks BiG: A Grid Service to Distribute Large BLAST.


BiG: A Grid Service to Distribute Large BLAST Runs
Ignacio Blanquer, Valencia University of Technology (Universidad Politécnica de Valencia)

Contents
Problem Addressed: BLAST.
Requirements and Design Objectives.
Architecture
– Security.
– Load Balancing.
– Accessibility.
Performance and Usage.
Conclusions.
2nd EGEE User Forum - Manchester

BLAST
BLAST (Basic Local Alignment Search Tool) is a Bioinformatics Procedure Applied to Identify Compatible Protein and Nucleotide Sequences in Protein and DNA Databases.
BLAST can be Applied, Among Other Uses, to Annotate the Estimated Function of Unknown Sequences.
BLAST is Computationally Intensive.

Design Objectives and Requirements I
Easy Interface with High Compatibility (Web Service + NCBI Based)
– Same Parameters as BLAST.
– User-friendly and Intuitive.
Support for Searching Simultaneously on Multiple Databases
– Parallel Processing of Multiple Database Queries.
Architecture Exportable to Other Common Problems
– Modular Structure of the System Components.
Secure and Efficient
– Simple but Effective Control of Users.
– No Exposure of Credentials.

Design Objectives and Requirements II
Scalability
– Data Partition in Grid Approach Gives Scalability with Huge Quantities of Data.
High Performance
– Grid Computing + MPI Parallel Jobs in Dedicated Clusters.
Robust
– Fault Tolerance on Server and Client.
Interoperable
– Accessible Through Web Services, Stand-Alone Applications and Web Portals.
Portable
– Sessions Could Be Independent of the Server.

Architecture
[Figure: architecture diagram. Labels: FASTA File (Input Sequence); Execution Parameters; Output Matches; Protein Database (e.g. Non-Redundant).]

BiG Security - Authentication
Double-Credentials Level
– Instead of Storing a Portal Certificate Private Key or Transferring User Private Keys (Even Securely), a MyProxy Certificate Server is Used.
– Certificates in the MyProxy Server are Renewed Manually and Periodically (Planned Weekly), and Short-Lived Certificates are Retrieved by the UI when Required.
– This Enhances the Security and Does not Expose Credentials, Even in Secure Environments.
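
The retrieval step described on this slide can be illustrated with a minimal sketch. The slides do not say how BiG actually calls MyProxy, so this only assumes the standard `myproxy-logon` client and its `-s` (server), `-l` (username), `-t` (lifetime in hours) and `-o` (output file) options. The server name, user name and output path are hypothetical placeholders.

```python
def myproxy_logon_command(server, username, lifetime_hours, out_path):
    """Build the argument list for retrieving a short-lived proxy.

    Assumption: the standard `myproxy-logon` client is installed; the
    values passed in are illustrative and not taken from the slides.
    """
    if lifetime_hours <= 0:
        raise ValueError("lifetime_hours must be positive")
    return [
        "myproxy-logon",
        "-s", server,               # MyProxy server holding the long-lived credential
        "-l", username,             # account name under which it was stored
        "-t", str(lifetime_hours),  # lifetime of the retrieved short-lived proxy
        "-o", out_path,             # where the delegated proxy is written
    ]


# Hypothetical values, for illustration only.
command = myproxy_logon_command("myproxy.example.org", "big-portal", 12, "/tmp/x509up_big")
print(" ".join(command))
```

The list could be handed to `subprocess.run` on a machine with the MyProxy client installed; only the short-lived proxy ever reaches the UI, which is the point the slide makes.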

BiG Security - Authorisation
Alternatives and Problems
– Uploading VOMS Credentials: A MyProxy Credential is Uploaded to a Proxy Server.
  ▪ It is Suggested that VOMS Attributes Should be Added After the Retrieval of a Delegated Copy of the Proxy > It Does not Work.
  ▪ VOMS Attributes Could not be Uploaded With Standard MyProxy Commands > Use an Updated Version from INFN.
– VOMS Credentials Duration and Proxy Renewal: A Delegated MyProxy Credential Needs to Be Renewed for a Long-Living Job.
  ▪ It Does Not Work with VOMS Credentials. VOMS Life-Time is 24 Hours > Unsolved Problem for Long-Living Executions.
  ▪ Incorrect Configuration of Automatic Renewal on RBs.
Proposals
– Upload VOMS-Extended MyProxy Credentials and Do not Use Renewal.
– Do not Use VOMS if Automatic Renewal is Required.
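
The two proposals amount to a simple decision rule, sketched below. It assumes that automatic renewal is needed exactly when a job's expected wall time (queue wait included) reaches the 24-hour VOMS lifetime quoted on the slide. The function name and the returned labels are hypothetical and not part of BiG.

```python
VOMS_LIFETIME_HOURS = 24  # VOMS attribute lifetime stated on the slide


def choose_credential_strategy(expected_hours, voms_lifetime_hours=VOMS_LIFETIME_HOURS):
    """Pick one of the two proposals for a job of the given expected duration.

    Assumption: a job needs automatic renewal only if it may outlive the
    VOMS attributes. The returned labels are illustrative names.
    """
    if expected_hours < 0:
        raise ValueError("expected_hours must not be negative")
    if expected_hours < voms_lifetime_hours:
        # Proposal 1: VOMS-extended MyProxy credential, no renewal.
        return "voms-extended-no-renewal"
    # Proposal 2: renewal is required, so VOMS is not used.
    return "no-voms-with-renewal"


print(choose_credential_strategy(2))   # short run
print(choose_credential_strategy(30))  # long-living run
```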

BiG - Load Balancing
BiG Provides a Grid Interface to MPIBlast
– MPIBlast Scalability Depends Highly on the Efficiency of the MPI Version.
– It is Configured on a Per-Site Basis.
– The Value Currently Used is 20 CPUs.
– Databases are Pre-Distributed to Reduce Overhead.
Larger Scalability is Managed Through Splitting the Input Sequences into Multiple Jobs
– Multiple Parallel Jobs are Scheduled.
– Embarrassingly Parallel Approach.
Multiple Databases can be Searched in Parallel
– Directly Multiplies Performance.
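
The splitting scheme on this slide can be sketched as follows. This is an illustration under stated assumptions, not BiG's implementation: the chunk size, the database names (`nr`, `swissprot`) and the job dictionary layout are hypothetical, and only the 20-CPU figure comes from the slide.

```python
from itertools import product


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_records(records, per_job):
    """Group records into consecutive chunks of at most per_job sequences."""
    if per_job < 1:
        raise ValueError("per_job must be at least 1")
    return [records[i:i + per_job] for i in range(0, len(records), per_job)]


def plan_jobs(records, databases, per_job, cpus=20):
    """Return one independent job description per (chunk, database) pair."""
    chunks = split_records(records, per_job)
    jobs = []
    for (index, chunk), database in product(enumerate(chunks), databases):
        jobs.append({
            "chunk": index,
            "database": database,
            "cpus": cpus,  # MPI processes per job; 20 is the value on the slide
            "sequences": [name for name, _ in chunk],
        })
    return jobs


# Five toy sequences, two per job, searched against two hypothetical databases.
sample = ">s1\nAGTA\nCGTA\n>s2\nTGCT\n>s3\nAGTA\n>s4\nATGC\n>s5\nGGCC\n"
jobs = plan_jobs(read_fasta(sample), ["nr", "swissprot"], per_job=2)
print(len(jobs))  # 3 chunks x 2 databases = 6
```

Because no job depends on another, the number of schedulable jobs is the number of chunks times the number of databases, which is why searching several databases multiplies the available parallelism.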

BiG - Accessibility
Users Access the System Through Stand-Alone Applications or Web Portals.
Currently
– BLAST2GO:
– Web from Cecalcula:
– Even CLI.

BiG: Usage Report
Period: Jul’06-Dec’06.
Usage Statistics:
– Number of Jobs: 284.
– CPU Consumed: 173 CPU/Days.
– Resources Used: ramses.dsic.upv.es:2119/jobmanager-pbs-biomedg.
– BiG is Being Used at the University of Los Andes to Work on the Complete Genome of Plasmodium falciparum for the Identification of DHFR Antigenic Proteins.
[Figure: monthly chart of CPU hours, time and number of jobs, Jul-06 to Dec-06.]
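
As a quick reading of these figures, the average cost of a job can be derived from the two totals. The sketch assumes that "173 CPU/Days" means 173 CPU-days in total; the per-job average is derived here and is not reported on the slide.

```python
JOBS = 284      # number of jobs in the reporting period
CPU_DAYS = 173  # total CPU consumed, read as CPU-days (assumption)


def average_cpu_hours_per_job(cpu_days, jobs):
    """Convert total CPU-days to CPU-hours and divide by the job count."""
    if jobs <= 0:
        raise ValueError("jobs must be positive")
    return cpu_days * 24 / jobs


average = average_cpu_hours_per_job(CPU_DAYS, JOBS)
print(round(average, 1))  # 14.6
```

Under that reading, a typical job consumed roughly 14.6 CPU-hours, spread over the CPUs allocated to it.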

BiG Usage Agreements
The Greatest Difficulty is Dealing with the Quality of Service
– Users Do not Understand Waiting Times and Unpredictable Response Times.
– Lack of MPI Resources Reduces the Availability of the System.
Resources are Available for Short Executions (Below 15 Minutes in Total).
Larger Executions Require Pre-Reservation of Resources
– And Human Supervision due to Potential Instability.
– Users Negotiate the Experiments with the Resource Providers (UPV).
A General Adoption of Such Mechanisms Inside the Infrastructure will be Necessary in the Long Term.

BiG Current Actions
The Work on BiG is Currently Focused on Three Areas
– Improve Technical Issues
  ▪ Better Management of Errors to Ease Recovery in the Client Applications.
  ▪ Migration of Sessions Among Different Portals.
  ▪ Enhanced Robustness.
– Foster the Usage
  ▪ A New Portal is Being Developed.
  ▪ New Users have been Identified in Computational Biology and Pharmacoepidemiology.
– Generalisation of the Service Model and Extension to Other Problems.

Conclusions
BiG is Service-Oriented, Being Interoperable with Many Application Models (Portals, Applications or Scripts).
BiG is Intended for Processing Big Sets of Sequences, Although it Works Efficiently Even with Short Sequences.
A Complete Genome Screening Implies Tens of Thousands of Sequences and Could Take More Than 30 Hours in a Conventional Computer.
This is Done Periodically to Check the New Versions of the Target Databases.

Contact
Vicente Hernández / Ignacio Blanquer
Universidad Politécnica de Valencia
Camino de Vera s/n, Valencia, Spain
Tel: Fax