INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Introduction to Grid Computing and EGEE Fabio Scibilia INFN Catania Catania, 08.02.2006.



Enabling Grids for E-sciencE INFSO-RI EGEE Tutorial, Roma, Fundamentals of Grid Computing

Grid Idea By A Simple Analogy
Some power stations dispersed everywhere produce the electrical power
The produced power is distributed over a power network
One consumer wants to access that power
He/she comes to an agreement with the power company
The power company provides a new socket into which the user can plug
Now the user is able to access the power grid
The user:
– Does not need to know anything about what lies beyond the socket
– Can draw all the power he/she wants, according to the agreement
The power company:
– Can modify production technologies at any moment
– Manages the power network as it wants
– Defines the terms and conditions of the agreement

In the same way...
Some computing farms produce the computing power
Computing power is made available over the Internet
One user wants to access intensive computational power
He/she comes to an agreement with an organization that offers grid services
The organization provides grid facilities that give the user access to its grid resources, together with the proper tools
Now the user accesses grid facilities as a grid user
The user:
– Does not need to know what lies beyond the user interface
– Can access a massive amount of computational power through a simple terminal
The organization:
– Can extend grid facilities at any moment
– Manages the architecture of the grid
– Defines policies and rules for accessing grid resources

What about Grid Computing
The Grid Computing paradigm is an emerging way of thinking about distributed environments as a global-scale infrastructure to:
– Share data
– Distribute computation
– Coordinate work
– Access remote instrumentation

Why Computing Grids now?
Because the amount of computational power needed by many applications is becoming huge
– Thousands of CPUs working at the same time on the same task
Because the amount of data requires massive and complex distributed storage systems
– From hundreds of gigabytes to petabytes (10^15 bytes) produced by the same application
To make cooperation easier among people and resources belonging to different organizations
– People from several organizations working together to achieve a common goal
To access particular instrumentation that is not easily reachable in any other way
– Because it cannot be moved or replicated, or because its cost is too high
Because it is the next step in the evolution of distributed computation
– To create a marketplace of computational power and storage over the Internet

Who is interested in Grids?
– The research community, to obtain important results from experiments that involve very many people and massive amounts of resources
– Enterprises, which can run huge computations without extending their current IT infrastructure
– Businesses, which can provide computational power and data storage under contract or for rent

Properties of Grids
Transparency
– The complexity of the Grid architecture is hidden from the end user
– The user must be able to use a Grid as if it were a single virtual supercomputer
– Resources must be accessible regardless of their location
Openness
– Each subcomponent of the Grid is accessible independently of the other components
Heterogeneity
– Grids are composed of many different kinds of resources
Scalability
– Resources can be added to and removed from the Grid dynamically
Fault Tolerance
– Grids must be able to work even if a component fails or a system crashes
Concurrency
– Different processes on different nodes must be able to work at the same time

Challenging Issues in Grids (i)
Security
– Authentication and authorization of users
– Confidentiality and non-repudiation
Information Services
– To discover and monitor Grid resources
– To check the health status of resources
– As a basis for decision-making processes
File Management
– Creation, modification and deletion of files
– Replication of files to improve access performance
– Ability to access files without moving them to where the code runs
Administration
– Systems to administer Grid resources while respecting local administration policies

Challenging Issues in Grids (ii)
Resource Brokering
– To schedule tasks across different resources
– To make optimal or suboptimal decisions
– To reserve resources and network bandwidth in advance
Naming services
– To name resources in an unambiguous way within the Grid scope
Friendly User Interfaces
– Because most Grid users are not computer scientists (physicians, chemists...)
– Graphical User Interfaces (GUIs)
– Grid Portals (very similar to classical Web Portals)
– Command Line Interfaces (CLIs) for experts
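
To make the resource brokering item more concrete, here is a minimal sketch of a greedy matchmaker. It is only an illustration under stated assumptions: the `Resource` class, the `broker` function and the site names are hypothetical, and real Grid brokers weigh far more than free CPU counts (data location, policies, queue lengths). The greedy rule is one example of a "suboptimal decision": it is cheap to compute but does not guarantee the best overall placement.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str        # hypothetical site name
    free_cpus: int   # CPUs currently available at that site


def broker(jobs, resources):
    """Greedy matchmaking: place each job on the resource with the most free CPUs.

    jobs is a list of (job_name, cpus_needed) pairs. Returns a dict mapping
    job_name to a resource name, or to None when no resource can host the job.
    """
    placement = {}
    # Handle the largest jobs first so they are not starved by small ones.
    for job_name, cpus_needed in sorted(jobs, key=lambda j: -j[1]):
        candidates = [r for r in resources if r.free_cpus >= cpus_needed]
        if not candidates:
            placement[job_name] = None
            continue
        best = max(candidates, key=lambda r: r.free_cpus)
        best.free_cpus -= cpus_needed   # reserve the CPUs for this job
        placement[job_name] = best.name
    return placement


sites = [Resource("site-a", 8), Resource("site-b", 4)]
plan = broker([("render", 6), ("fit", 4), ("merge", 3)], sites)
print(plan)
```

In this run the two larger jobs each get a site and the smallest job is left unplaced, which shows why a broker may also need reservation or queueing.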

Virtual Organizations (VOs)
– A Virtual Organization is a collection of people and resources that work in a coordinated way to achieve a common goal
– To use Grid facilities, any user MUST join a Virtual Organization as a member
– Each person or resource can be a member of several VOs at the same time
– Each VO can contain people or resources belonging to different administrative domains
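
The membership rules on this slide can be sketched as a small data structure. This is an illustrative model only, not the way any particular Grid middleware stores VO membership: the `VORegistry` class, its method names, and the user, VO and host names are all assumptions made for the example.

```python
from collections import defaultdict


class VORegistry:
    """Toy registry: people and resources may belong to several VOs."""

    def __init__(self):
        self._members = defaultdict(set)    # VO name -> user names
        self._resources = defaultdict(set)  # VO name -> resource names

    def add_member(self, vo, user):
        self._members[vo].add(user)

    def add_resource(self, vo, resource):
        self._resources[vo].add(resource)

    def vos_of(self, user):
        """Return the set of VOs the user belongs to (possibly empty)."""
        return {vo for vo, users in self._members.items() if user in users}

    def can_use(self, user, resource):
        """A user may use a resource only through a VO that contains both."""
        return any(resource in self._resources.get(vo, ())
                   for vo in self.vos_of(user))


registry = VORegistry()
registry.add_member("biomed", "alice")
registry.add_member("atlas", "alice")      # one person, two VOs
registry.add_member("atlas", "bob")
registry.add_resource("biomed", "ce01.example.org")

print(registry.can_use("alice", "ce01.example.org"))  # shared VO: biomed
print(registry.can_use("bob", "ce01.example.org"))    # no shared VO
print(registry.can_use("carol", "ce01.example.org"))  # not in any VO
```

A user without any VO membership is refused everywhere, which mirrors the rule that joining a VO is mandatory.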

Virtual Laboratory
– A new way of cooperating in experiments
– A platform that allows scientists to work together in the same “Virtual” Laboratory
– Closely related to Grids and Virtual Organizations

Globus Alliance
The Globus Alliance
– Is a community of people and organizations involved in the design and development of Grid technologies
– University of Illinois, Argonne National Laboratory, University of Edinburgh, EPCC, etc.
The Globus Toolkit (GT)
– It is a de facto standard
– It is a bag of services
– It is at its fourth release (GT4)
– It now adopts Web Services interfaces
The Global Grid Forum
– It is a forum of grid researchers
– It works to define standards and protocols for grid technologies
– It is divided into Working Groups (WGs)

Globus Services

Hourglass Reference Model
Fabric
– Manages resources locally
Connectivity
– Network communications (IP, DNS etc.)
– Security: authentication, authorization, certification
– Single Sign-On
Resource
– Allocation, reservation and monitoring of resources
– Data access and transport
– Gathering of information on resources
Collective
– View of services as collections
– Discovery and allocation
– Replication and cataloguing of data
– Workflow management
Application
– User applications
– Tools and interfaces

An Example: The project
– Searches for Extra Terrestrial Intelligence (SETI)
  • Collecting samples of microwaves coming from the Universe through a telescope
  • Scheduling tasks spread over Grid nodes to analyse these samples
– Uses desktop computers as Grid nodes
– Working nodes are dynamically added to and removed from the grid
– The owner of the desktop machine decides how to contribute to the project by offering its computational power
To contribute to the project
– Download and install the client
– Your machine will work as a Grid node when it is idle (in place of your screensaver)
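
The pattern described in this example (samples split into tasks, idle desktops asking for work, nodes joining and leaving at any time) can be sketched as a pull-based work queue. This is a simplified illustration, not the project's actual client or protocol: `make_work_units`, `Coordinator`, the node names and the use of `max` as a stand-in for signal analysis are all assumptions.

```python
from collections import deque


def make_work_units(samples, unit_size):
    """Split the collected samples into fixed-size work units."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


class Coordinator:
    """Hands out work units to idle nodes and requeues units of departed nodes."""

    def __init__(self, work_units):
        self.pending = deque(enumerate(work_units))  # (unit_id, unit) pairs
        self.in_progress = {}                        # unit_id -> (node, unit)
        self.results = {}                            # unit_id -> result

    def request_work(self, node):
        """Called by an idle node; returns (unit_id, unit) or None."""
        if not self.pending:
            return None
        unit_id, unit = self.pending.popleft()
        self.in_progress[unit_id] = (node, unit)
        return unit_id, unit

    def submit_result(self, unit_id, result):
        self.in_progress.pop(unit_id)
        self.results[unit_id] = result

    def node_left(self, node):
        """Requeue every unit held by a node that was removed from the grid."""
        for unit_id, (owner, unit) in list(self.in_progress.items()):
            if owner == node:
                del self.in_progress[unit_id]
                self.pending.append((unit_id, unit))


coord = Coordinator(make_work_units(list(range(10)), 4))
uid_a, unit_a = coord.request_work("desktop-a")
uid_b, unit_b = coord.request_work("desktop-b")
coord.submit_result(uid_a, max(unit_a))  # max() stands in for the real analysis
coord.node_left("desktop-b")             # its unit goes back to the queue

task = coord.request_work("desktop-a")
while task is not None:
    uid, unit = task
    coord.submit_result(uid, max(unit))
    task = coord.request_work("desktop-a")
print(coord.results)
```

Because nodes pull work only when idle, the owner of each machine stays in control of how much it contributes, and a node that disappears costs only the units it was holding.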

Application Areas (i)
Physical Science Applications
– GriPhyN
– Particle Physics DataGrid (PPDG)
– GridPP
– AstroGrid
Life Science Applications
– Protein Data Bank (PDB)
– Biomedical Informatics Research Network (BIRN)
– Telemicroscopy
– myGrid

Application Areas (ii)
Engineering Oriented Applications
– NASA Information Power Grid (IPG)
– Grid Enabled Optimization and Design Search for Engineering (GEODISE)
Commercial Applications
– Butterfly Grid
– Everquest
E-Utility
– ClimatePrediction experiment

EGEE Project

EGEE Partners
– CERN
– Central Europe, including Austria, Czech Republic, Hungary, Poland, Slovakia and Slovenia
– France
– Germany and Switzerland
– Ireland and the United Kingdom
– Italy
– Northern Europe, including Belgium, Denmark, Finland, The Netherlands, Norway and Sweden
– Russia
– South-East Europe, including Bulgaria, Cyprus, Greece, Israel and Romania
– South-West Europe, including Portugal and Spain
– NRENs (National Research and Education Networks)
– United States

The largest e-Infrastructure: EGEE
Objectives
– consistent, robust and secure service grid infrastructure
– improving and maintaining the middleware
– attracting new resources and users from industry as well as science
Structure
– 71 leading institutions in 27 countries, federated in regional Grids
– leveraging national and regional grid activities worldwide
– funded by the EU with ~32 M Euros for the first 2 years, starting 1st April 2004

EGEE Activities
– 48 % service activities (Grid Operations, Support and Management, Network Resource Provision)
– 24 % middleware re-engineering (Quality Assurance, Security, Network Services Development)
– 28 % networking (Management, Dissemination and Outreach, User Training and Education, Application Identification and Support, Policy and International Cooperation)
Emphasis in EGEE is on operating a production grid and supporting the end-users

EGEE
Enabling Grids for E-sciencE (EGEE) in Europe
– Funded by the European Union (EU)
– Involves 26 countries and more than 70 institutions
EGEE infrastructure
– Runs over the GEANT European communication network
– LHC Computing Grid (LCG) middleware
– Moving towards the complete adoption of the new gLite middleware
[Figure: middleware evolution, LCG-1 and LCG-2 (Globus 2 based) followed by gLite-1 and gLite-2 (Web services based)]

Large Hadron Collider
– It is a particle accelerator built in Geneva
– The biggest instrument ever built
– Data is collected at a few points along the LHC and distributed across many computing sites
[Figure: aerial view with Mont Blanc (4810 m) and downtown Geneva labelled]

The LHC Experiments
Large Hadron Collider (LHC):
– four experiments:
  • ALICE
  • ATLAS
  • CMS
  • LHCb
– 27 km tunnel
– Start-up in 2007

The LHC Experiments
[Figure: the ATLAS, CMS and LHCb detectors]
– ~10-15 petabytes/year
– ~10^8 events/year
– ~10^3 batch and interactive users
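
As a rough sense of scale, the yearly data volume quoted on this slide can be converted into an average sustained transfer rate. The sketch below is plain arithmetic under two assumptions that are not on the slide: data is produced evenly over a 365-day year, and a petabyte is taken as 10^15 bytes. The function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, ignoring leap years
PETABYTE = 10 ** 15                 # bytes, decimal definition


def sustained_rate_mb_per_s(petabytes_per_year):
    """Average rate in MB/s (10**6 bytes) needed to move a yearly volume."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / 10 ** 6


low = sustained_rate_mb_per_s(10)
high = sustained_rate_mb_per_s(15)
print(f"{low:.0f} to {high:.0f} MB/s")
```

Under these assumptions the average comes out at roughly 317 to 476 MB/s, every second of the year, before any replication to other sites is counted.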

Grid monitoring
Operation of Production Service: real-time display of grid operations
Accounting information
Selection of monitoring tools:
– GIIS Monitor + Monitor Graphs
– Sites Functional Tests
– GOC Data Base
– Scheduled Downtimes
– Live Job Monitor
– GridIce: VO + Fabric View
– Certificate Lifetime Monitor

BioMed Overview
Infrastructure
– ~3000 CPUs
– ~12 TB of disk
– in 9 countries
– 15 resource centres, 17 CEs, 16 SEs
>50 users in 7 countries working with 12 applications
18 research labs
[Figure: number of jobs per month; sites labelled include PADOVA and BARI]

Biomed Virtual Organisation
– ~70 users, 9 countries
– >12 applications (medical image processing, bioinformatics)
– ~3000 CPUs, ~12 TB disk space
– ~100 CPU years, ~500K jobs in the last 6 months
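
The usage figures on this slide can be turned into two derived numbers: the average CPU time per job and the average number of CPUs kept busy. This is a back-of-the-envelope sketch that treats the approximate values on the slide as exact and assumes a 365-day year; the variable names are chosen for the example.

```python
HOURS_PER_YEAR = 365 * 24   # 8760 h, ignoring leap years

cpu_years = 100             # ~100 CPU years consumed
jobs = 500_000              # ~500K jobs
period_years = 0.5          # over the last 6 months

# Average CPU time consumed by one job, in hours.
avg_job_cpu_hours = cpu_years * HOURS_PER_YEAR / jobs

# Average number of CPUs busy at any moment during the period.
avg_busy_cpus = cpu_years / period_years

print(f"{avg_job_cpu_hours:.2f} CPU hours per job")
print(f"{avg_busy_cpus:.0f} CPUs busy on average")
```

The result is about 1.75 CPU hours per job and about 200 CPUs busy on average, which fits a workload made of many short, independent jobs.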

Bioinformatics
Grid Protein Sequence Analysis
– Gridified version of the NPSA web portal
  • Offers protein databases and sequence analysis algorithms to bioinformaticians (3000 hits per day)
  • Needs large databases and a large number of short jobs
– Objective: increased computing power
– Status: 9 bioinformatics software packages gridified
– Grid added value: open to a wider community with larger bioinformatics computations
xmipp_MLrefine
– 3D structure analysis of macromolecules
  • From (very noisy) electron microscopy images
  • Maximum likelihood approach to find the optimal model
– Objective: study molecule interaction and chemical properties
– Status: algorithm being optimised and ported to 3D
– Grid added value: parallel computation of independent jobs on different resources

Contacts
– EGEE Website
– How to join
– How to test
– EGEE Project Office