Uni Innsbruck Informatik - 1 The Grid: From Parallel to Virtualized Parallel Computing Michael Welzl DPS NSG Team.

Presentation transcript:

Uni Innsbruck Informatik - 1 The Grid: From Parallel to Virtualized Parallel Computing Michael Welzl DPS NSG Team Institute of Computer Science University of Innsbruck Habilitation talk TU Darmstadt 14 June 2007

Uni Innsbruck Informatik - 2 Outline Grid introduction Middleware –first step towards virtualization Research efforts –further steps towards virtualization Conclusion

Uni Innsbruck Informatik - 3 Grid Computing A brief introduction

Uni Innsbruck Informatik - 4 Introducing the Grid History: parallel processing at a growing scale –Parallel CPU architectures –Multiprocessor machines –Clusters –(“Massively distributed”) computers on the Internet Grid: logical consequence of HPC; metaphor: power grid. Just plug in, don't care where the (processing) power comes from, don't care how it reaches you –Common definition: “The real and specific problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” [Ian Foster, Carl Kesselman and Steven Tuecke, “The Anatomy of the Grid – Enabling Scalable Virtual Organizations”, International Journal on Supercomputer Applications, 2001]

Uni Innsbruck Informatik - 5 Scope Definition quite broad (“resource sharing”) –Reasonable - e.g., computers also have hard disks –But also led to some confusion - e.g., new research areas / buzzwords: Wireless Grid, Data Grid, Semantic / Knowledge Grid, Pervasive Grid, [this space reserved for your favorite research area] Grid Example of confusion due to broad Grid interpretation: “One of the first applications of Grid technologies will be in remote training and education. Imagine the productivity gains if we had routine access to virtual lecture rooms! (..) What if we were able to walk up to a local ‘power wall’ and give a lecture fully electronically in a virtual environment with interactive Web materials to an audience gathered from around the country - and then simply walk back to the office instead of going back to a hotel or an airplane?” [I. Foster, C. Kesselman (eds): “The Grid: Blueprint for a New Computing Infrastructure”, 2nd edition, Elsevier Inc. / MKP, 2004] → A clear, narrower scope is advisable for thinking/talking about the Grid Traditional goal: processing power –Grid people = parallel computing people; thus, the main goal has not changed much

Uni Innsbruck Informatik - 6 The next Web? Ways of looking at the Internet –Communication medium (e-mail) –Truly large kiosk (web) The Grid way of looking at the Internet –Infrastructure for Virtual Teams Most of the time... –the “real and specific goal” is High Performance Computing –Virtual Organizations and Virtual Teams are well defined, i.e. not an “open” system; e.g. security is a big issue Virtual Teams –Geographically distributed –Organizationally distributed –Yet work on a common problem It has been called “the next web” But Web 2.0 is already here :-)

Uni Innsbruck Informatik - 7 Virtual Organizations and Virtual Teams Distributed resources and people Linked by networks, crossing admin domains Sharing resources, common goals Dynamic R R R R R R R R R R VO-B VO-A R R R R Source: Globus presentation by Ian Foster

Uni Innsbruck Informatik - 8 Austrian Grid E-science Grid applications Medical Sciences –Distributed Heart Simulation –Virtual Lung Biopsy –Virtual Eye Surgery –Medical Multimedia Data Management and Distribution –Virtual Arterial Tree Tomography and Morphometry High-Energy Physics –CERN experiment analyses Applied Numerical Simulation –Distributed Scientific Computing: Advanced Computational Methods in Life Science –Computational Engineering –High Dimensional Improper Integration Procedures Astrophysical Simulations and Solar Observations –Astrophysical Simulations –Hydrodynamic Simulations –Federation of Distributed Archives of Solar Observation Meteorological Simulations Environmental GRID Applications

Uni Innsbruck Informatik - 9 Example: CERN Large Hadron Collider Largest machine built by humans: particle accelerator and collider with a circumference of 27 kilometers Will generate 10 Petabytes (10^7 Gigabytes) of information per year … starting 2007! This information must be processed and stored somewhere Beyond the scope of a single institution to manage this problem –Projects: LCG (LHC Computing Grid), EGEE (Enabling Grids for E-sciencE) Source: Globus presentation by Ian Foster

Uni Innsbruck Informatik - 10 Complexity Grid poses difficult problems –Heterogeneity and dynamicity of resources –Secure access to resources with different users in various roles, belonging to VTs which belong to VOs –Efficient assignment of data and tasks to machines (“scheduling”) [Figure labels: Systems with shared memory; Systems with distributed memory; Massively parallel systems; Cluster with dedicated network links; Heterogeneous distributed systems; GRID]

Uni Innsbruck Informatik - 11 Grid requirements Computer scientists can tackle these problems –Grid application users and programmers are often not computer scientists Important goal: ease of use –Programmer should not worry (too much) about the Grid –User should worry even less –Ultimate goal: write and use an application as if using a single computer (power grid metaphor) How do computer scientists simplify? Abstraction. We build layers. In a Grid, we typically have Middleware.

Uni Innsbruck Informatik - 12 Grid Middleware

Uni Innsbruck Informatik - 13 Grid computing without middleware Example manual Grid application execution
1. scp code to 10 machines
2. log in to the 10 machines via ssh and start “application > result” everywhere
3. Estimate running time, or let application tell you that it's done (e.g. via TCP/IP communication in app code)
4. retrieve result files via scp
Tedious process - so write a script file Do this again for every application / environment? What if your colleagues need something similar? Standards needed, tools introduced
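The four manual steps above can be sketched as a small script. This is only an illustration under stated assumptions: the host names, the file names `application` and `result`, and the function `manual_grid_commands` are hypothetical, and the sketch builds the `scp`/`ssh` command lines instead of executing them. Step 3 (waiting for completion) is application specific and left out.

```python
import shlex

def manual_grid_commands(hosts, code="application", result="result"):
    """Return the command lines for the manual steps 1, 2 and 4, in order."""
    commands = []
    for host in hosts:
        # Step 1: copy the code to every machine.
        commands.append(["scp", code, f"{host}:{code}"])
    for host in hosts:
        # Step 2: start the application remotely, redirecting stdout to a file.
        remote = f"./{shlex.quote(code)} > {shlex.quote(result)}"
        commands.append(["ssh", host, remote])
    # Step 3 (estimating or signalling completion) is omitted here.
    for host in hosts:
        # Step 4: fetch each result file under a per-host local name.
        commands.append(["scp", f"{host}:{result}", f"{result}.{host}"])
    return commands

if __name__ == "__main__":
    # Hypothetical host names for the 10 machines.
    hosts = [f"node{i:02d}.example.org" for i in range(1, 11)]
    for cmd in manual_grid_commands(hosts):
        print(shlex.join(cmd))
```

Even this short script hard-codes the environment (host list, paths, login method), which is the maintenance problem the slide points at.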

Uni Innsbruck Informatik - 14 Toolkits Most famous: Globus Toolkit –Evolution from GT2 via GT3 to GT4 influenced the whole Grid community –Reference implementation of Open Grid Forum (OGF) standards Other well-known examples –Condor Has existed since the mid-1980s No Grid back then - system gradually evolved towards it Traditional goal: harvest CPU power of normal user workstations → many Grid issues always had to be addressed anyway Special interfaces now enable Condor-Globus communication (“Condor-G”) –Unicore (used in D-Grid) –gLite (used in EGEE) Issues that these middlewares (should) address –Load balancing, error management –Authentication, Authorization and Accounting (AAA) –Resource discovery, naming –Resource access and monitoring –Resource reservation and QoS management

Uni Innsbruck Informatik - 15 Grid Resource Allocation Manager (GRAM) Globus tool for job execution –Unified, resource independent replacement for steps in “manual Grid“ example Unified way to set environment variables: Resource Specification Language (RSL) (stdout = x, arguments = y,..) Steps 1-4 become –Blocking: “globus-job-run -stage hostname applicationname“ -stage option copies code to remote machine Different architectures: recompilation needed – but not supported! –Nonblocking: scp code, then “globus-job-submit hostname applicationname“ (staging not yet supported) Obtain unique URL, continuously use it to query job status When done, use “globus-job-get-output URL stdout“ to retrieve stdout More complex systems are built on top of GRAM –E.g. Message Passing Interface (MPI) for the Grid: MPICH-G2
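The nonblocking variant (submit, poll, fetch output) can be sketched as follows. The command names `globus-job-submit` and `globus-job-get-output` and the argument order are taken from the slide; the use of `globus-job-status` for polling, the state names `DONE` and `FAILED`, and the helper names `run` and `submit_and_wait` are assumptions of this sketch and should be checked against the installed toolkit.

```python
import subprocess
import time

def run(cmd):
    """Run a command and return its standard output without trailing whitespace."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.strip()

def submit_and_wait(host, app, runner=run, poll_seconds=5.0, sleep=time.sleep):
    """Submit a job, poll until it finishes, then return its stdout."""
    # Submission prints a unique URL that identifies the job.
    url = runner(["globus-job-submit", host, app])
    while True:
        # Assumption: the status command prints a single state word.
        state = runner(["globus-job-status", url])
        if state in ("DONE", "FAILED"):
            break
        sleep(poll_seconds)
    if state == "FAILED":
        raise RuntimeError(f"job {url} failed")
    # Argument order as written on the slide.
    return runner(["globus-job-get-output", url, "stdout"])
```

Passing `runner` and `sleep` as parameters is a design choice of the sketch: it allows the polling logic to be exercised with a fake runner on a machine without Globus.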

Uni Innsbruck Informatik - 16 GRAM /2 GRAM leaves a lot of questions unanswered –How to recompile an application for different architectures? (automatically + in a unified way) –What if your computer's IP address changes? –What if the IP addresses of the 10 accessed computers change? –What if two of the computers become unavailable? –What if 3 other users start to work with 5 of the 10 computers? A tool for each problem... –General-purpose Architecture for Reservation and Allocation (GARA) Integrated QoS via “advance reservation” of resources (CPU, Disk, Network) –Monitoring and Discovery System (MDS) for locating and monitoring resources –Resource Broker (Globus: do it yourself; Condor: “matchmaker”) translates requirement specification (CPU, memory,..) into IP address Diversity of complex tools standardized + available in Globus, addressing some but not all of the issues → need for an architecture

Uni Innsbruck Informatik - 17 Evolution: moving towards an architecture OGSI / OGSA: Open Grid Service Infrastructure / Architecture –Open Grid Forum (OGF) standards –OGSA = service-oriented architecture; key concept for virtualization use a resource = call a service –OGSI = Web Services + state management failed: too complex, not compliant with Web Service standards GT3 GT4 Source: Globus presentation by Ian Foster

Uni Innsbruck Informatik - 18 Research towards the power outlet

Uni Innsbruck Informatik - 19 Current state of the art Standards are only specified when mechanisms are known to work –Globus only includes such working elements Lots of important features missing Practical issues with existing middlewares –Submitting a Globus job is very slow (Austrian Grid: approx. 20 seconds) → significant granularity limit for parallelization! –Globus is a huge piece of software Currently, some confusion about the right location of features –On top of middleware? (research on top of Globus) –In middleware? (other middleware projects) –In the OS? (XtreemOS) → Upcoming slides concern mechanisms which are mostly on top of and partially within middleware
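Why a submission delay of about 20 seconds limits granularity can be seen with a little arithmetic. The sketch below assumes a deliberately simple model that is not from the slides: every job pays the full submission overhead once, and nothing overlaps with it. The function names are hypothetical.

```python
def parallel_efficiency(task_seconds, submit_overhead_seconds=20.0):
    """Fraction of a job's wall-clock time that is useful computation."""
    return task_seconds / (task_seconds + submit_overhead_seconds)

def min_task_seconds(target_efficiency, submit_overhead_seconds=20.0):
    """Shortest task that still reaches the target efficiency.

    Solves t / (t + o) = E for t, which gives t = o * E / (1 - E).
    """
    return submit_overhead_seconds * target_efficiency / (1.0 - target_efficiency)

if __name__ == "__main__":
    for seconds in (2, 20, 180, 1800):
        print(seconds, round(parallel_efficiency(seconds), 3))
```

Under this model a 20-second task spends half of its time in submission, and a task must run for about 180 seconds to reach 90% efficiency, so fine-grained tasks are not worth submitting individually.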

Uni Innsbruck Informatik - 20 Automatic parallelization in Grids Scheduling: important issue for the “power outlet” goal! –Automatic distribution of tasks and inter-task data transmissions = scheduling Grid scheduling encompasses –Resource Discovery: Authorization Filtering, Application Requirement Definition, Minimal Requirement Filtering –System Selection: Dynamic Information Gathering, System Selection –Job Execution: (optional) Advance Reservation, Job Submission, Preparation Tasks, Monitoring Progress, Job Completion, Clean-up Tasks So far, most scheduling efforts consider embarrassingly parallel applications - typically parameter sweeps (no dependencies)
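A parameter sweep is easy to schedule because no run depends on another. The following sketch illustrates this on a single machine with a thread pool standing in for Grid resources; `simulate` is a hypothetical placeholder for the real application, and the parameter names are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def simulate(alpha, beta):
    """Hypothetical stand-in for one independent application run."""
    return alpha * alpha + beta

def parameter_sweep(alphas, betas, workers=4):
    """Run simulate() for every parameter combination and collect the results."""
    points = list(product(alphas, betas))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # No task depends on another, so any assignment order is valid.
        results = list(pool.map(lambda p: simulate(*p), points))
    return dict(zip(points, results))

if __name__ == "__main__":
    print(parameter_sweep([1, 2], [0, 10]))
```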

Uni Innsbruck Informatik - 21 Condor case study Application name, parameters, etc. + requirements specified in ClassAds –“Requirements = Memory >= 256 && Disk > 10000; Rank = (KFLOPS*10000) + Memory” → only use computers which match the requirements (else error), order them by rank –Explicit support for parameter sweeps: loop variables Resources are registered with a description; the “central manager” checks the pool against application ClassAds (“matchmaking”) every 5 minutes and assigns jobs Checkpointing in Condor: need to recompile applications, link with special library (redirects syscalls) –Save current state for fault tolerance or for vacating jobs (because preempted by a higher priority job, machine busy, or user demands it) Used in Grid Application Development Software Project (GrADS) for rescheduling (dynamic scheduling) and metascheduling (negotiation between multiple applications); ClassAds language extended –e.g., aggregation functions such as Max, Min, Sum
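The matchmaking idea (filter by requirements, then order by rank) can be sketched in a few lines. This is not Condor's implementation: the Requirements and Rank expressions from the slide are rewritten by hand as Python callables, and the machine descriptions in `pool` are hypothetical.

```python
def matchmake(machines, requirements, rank):
    """Return the machines that satisfy the requirements, best rank first."""
    matches = [m for m in machines if requirements(m)]
    if not matches:
        # Mirrors the slide's "else error" for an empty match.
        raise LookupError("no machine matches the requirements")
    return sorted(matches, key=rank, reverse=True)

# Requirements = Memory >= 256 && Disk > 10000
requirements = lambda m: m["Memory"] >= 256 and m["Disk"] > 10000
# Rank = (KFLOPS*10000) + Memory
rank = lambda m: m["KFLOPS"] * 10000 + m["Memory"]

# Hypothetical resource descriptions, as a pool might register them.
pool = [
    {"Name": "a", "Memory": 512, "Disk": 20000, "KFLOPS": 30},
    {"Name": "b", "Memory": 128, "Disk": 50000, "KFLOPS": 90},
    {"Name": "c", "Memory": 256, "Disk": 15000, "KFLOPS": 40},
]

if __name__ == "__main__":
    print([m["Name"] for m in matchmake(pool, requirements, rank)])
```

Machine `b` is the fastest but is filtered out by the memory requirement; `c` then ranks above `a` because KFLOPS dominates the rank expression.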

Uni Innsbruck Informatik - 22 Grid workflow applications Dependencies between applications (or large parts of applications) typically specified in a Directed Acyclic Graph (DAG) –Condor: DAG manager (DAGMan) uses a .dag file for simple dependencies –“Do not run job ‘B’ until job ‘A’ has completed successfully” DAGMan scheduling: for all tasks do... –Find task with earliest starting time –Allocate it to processor with Earliest Finish Time –Remove task from list GriPhyN (Grid Physics Network) facilitates workflow design with “Pegasus” (Planning for Execution in Grids) framework –Specification of abstract workflow: identify application components, formulate workflow specifying the execution order, using logical names for components and files –Automatic generation of concrete workflow (map components to resources) –Concrete workflow submitted to Condor-G/DAGMan
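The dependency rule quoted above (“do not run B until A has completed successfully”) amounts to executing jobs in a topological order of the DAG. A minimal sketch, assuming Python 3.9+ for the standard `graphlib` module; `run_workflow` and the job callback are hypothetical names, and this is not how DAGMan itself is implemented.

```python
from graphlib import TopologicalSorter

def run_workflow(parents, run_job):
    """Run jobs so that a job starts only after all its parents succeeded.

    parents maps each job name to the set of jobs it depends on;
    run_job(name) returns True on success.
    """
    sorter = TopologicalSorter(parents)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    order = []
    while sorter.is_active():
        for job in sorter.get_ready():
            if not run_job(job):
                raise RuntimeError(f"job {job} failed; dependants are not started")
            order.append(job)
            sorter.done(job)  # unlocks jobs whose parents are now all done
    return order

if __name__ == "__main__":
    diamond = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
    print(run_workflow(diamond, lambda job: True))
```

Jobs returned together by `get_ready()` have no mutual dependencies, so a real system could dispatch them to different machines in parallel.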

Uni Innsbruck Informatik - 23 Grid Workflow Applications /2 Components are built, Web (Grid) Services are defined, Activities are specified Several projects (e.g. K-WF Grid) and systems (e.g. ASKALON) exist Most applications have simple workflows –E.g. Montage: dissects space image, distributes processing, merges results Legacy codes Components Web Services Grid Workflows based on activities MPI HPF OMP HPF OMP MPI Java Legacy Codes Descriptor Generation Component Interaction Optimization, Adaptation Service Description Discovery, Selection Deployment, Invocation Dynamic Instantiation Service Orchestration Quality of Service Source: presentation by Thomas Fahringer

Uni Innsbruck Informatik - 24 Scheduling example: HEFT algorithm Step 1 - task prioritizing Rank of a task: longest “distance” to the end (mean processing + transfer costs) Tasks are sorted by decreasing rank order [Tables: processing cost of each task on P1 and P2; rank calculation per task, e.g. max(12.5+3, …); resulting task ranks. The numeric values are not recoverable from the transcript.]
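The ranking step can be written down compactly: the rank of a task is its mean processing cost plus the largest value of (transfer cost + rank) over its successors. The sketch below follows that description; the three-task graph and all cost values are hypothetical and are not the numbers from the slide's lost table.

```python
from functools import lru_cache

def upward_ranks(cost, succ, transfer):
    """Compute the rank of every task.

    cost[t]: processing cost of task t on each processor
    succ[t]: successors of t in the DAG (missing key = exit task)
    transfer[(t, s)]: mean data transfer cost from t to s
    """
    @lru_cache(maxsize=None)
    def rank(t):
        mean = sum(cost[t]) / len(cost[t])
        # Longest remaining "distance" to the end over all successors.
        tail = max((transfer[(t, s)] + rank(s) for s in succ.get(t, ())), default=0.0)
        return mean + tail
    return {t: rank(t) for t in cost}

# Hypothetical example: T1 feeds T2 and T3; two processors.
cost = {"T1": (1, 1), "T2": (2, 4), "T3": (3, 1)}
succ = {"T1": ["T2", "T3"]}
transfer = {("T1", "T2"): 1.0, ("T1", "T3"): 2.0}

if __name__ == "__main__":
    ranks = upward_ranks(cost, succ, transfer)
    print(sorted(ranks, key=ranks.get, reverse=True), ranks)
```

Sorting by decreasing rank always places a task before its successors, so the resulting list is a valid execution order for the next step.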

Uni Innsbruck Informatik - 25 Step 2 - processor selection (EFT)
[Table: processing cost of each task on P1 and P2; values lost in the transcript]
FT(T1, P1) = 1; FT(T1, P2) = 1
FT(T2, P1) = 1+0.5 = 1.5; FT(T2, P2) = … = 5.5
FT(T4, P1) = … = 3; FT(T4, P2) = … = 6
FT(T3, P1) = 3+2 = 5; FT(T3, P2) = … = 4.5
FT(T5, P1) = … = 7; FT(T5, P2) = … = 10.5
[Figure: Gantt chart for P1 and P2; legend: processor idle + task ready, data transfer, task processing]
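Processor selection by earliest finish time (EFT) can be sketched as below. The sketch is a simplification: it appends each task after the last task on a processor and does not try to insert tasks into idle gaps, which a full HEFT implementation may do. Task names, costs and transfer times are hypothetical and do not reproduce the slide's example.

```python
def schedule_eft(order, cost, pred, transfer):
    """Assign tasks, taken in priority order, to the processor that finishes them first.

    cost[t][p]: run time of task t on processor p
    pred[t]: tasks whose output t needs (missing key = entry task)
    transfer[(u, t)]: transfer time if u and t run on different processors
    """
    n_procs = len(next(iter(cost.values())))
    free_at = [0.0] * n_procs  # time at which each processor becomes idle
    placed = {}                # task -> (processor index, finish time)
    for t in order:
        best = None
        for p in range(n_procs):
            ready = 0.0
            for u in pred.get(t, ()):
                up, finish = placed[u]
                # Data is available immediately on the producer's own processor.
                ready = max(ready, finish + (0.0 if up == p else transfer[(u, t)]))
            finish_time = max(ready, free_at[p]) + cost[t][p]
            if best is None or finish_time < best[1]:
                best = (p, finish_time)
        placed[t] = best
        free_at[best[0]] = best[1]
    return placed

if __name__ == "__main__":
    cost = {"T1": (1, 1), "T2": (2, 4), "T3": (3, 1)}
    pred = {"T2": ["T1"], "T3": ["T1"]}
    transfer = {("T1", "T2"): 1.0, ("T1", "T3"): 2.0}
    print(schedule_eft(["T1", "T2", "T3"], cost, pred, transfer))
```

In this example T3 is placed on the second processor even though it must wait for the data transfer, because that still finishes earlier than queueing behind T2 on the first processor.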

Uni Innsbruck Informatik - 26 HEFT discussion HEFT is not a solution, just a heuristic –problem is known to be NP-complete Outperformed competitors (DAGMan scheduling, genetic algorithm) in ASKALON real-life experiments –Still, many improvements possible e.g., other functions than mean, and extension for rescheduling suggested Heterogeneous network capacities and traffic interactions ignored Tasks = {T1, T2, T3, T4} Resources = {R1, R2, R3, R4} Data transfers = {D1, D2, D3, D4} Not detected!

Uni Innsbruck Informatik - 27 Conclusion

Uni Innsbruck Informatik - 28 How far have we come? Remember: systems on the last slides are still research –Not standardized, not part of reference middleware implementations –Right place (OS / Middleware / App) for some functions still undecided A lot is still manual –Basically three choices for deploying an application on the Grid: Simply use it if it's a parameter sweep; “Gridify” it (rewrite using customized allocation - e.g. MPICH-G2); Utilize a workflow tool Convergence between P2P systems and Grids has only just begun Several issues and possible improvements –The large number of layers is a mismatch for high performance demands –Network usage is simplistic, no customized mechanisms

Uni Innsbruck Informatik - 29 Open issues: layering inefficiency Example: loss of “connection” semantics [Figure: protocol stack, bottom to top: IP (stateless); TCP (connection state); HTTP 1.0 (stateless: can't reuse connections!) / HTTP 1.1 (connection state: reuses connections); SOAP (doesn't care, can do both); Web Service (stateless: can't reuse connections!); Grid Service, WS-RF (stateful: could reuse connections, but doesn't!). Annotation: breaking the chain]

Uni Innsbruck Informatik - 30 Open issues Strangely, parallel processing background seems to be ignored –E.g., work on task-processor mapping + P2P overlays such as hypercube = ? [Figure: axes Granularity and Complexity (Dependencies); regions: Microcode, Instruction level parallelism, Arbitrary parallel applications (manual aid needed), Parameter sweeps, Workflow applications; remaining area marked “Untouched!”]

Uni Innsbruck Informatik - 31 Thank you! Questions?

Uni Innsbruck Informatik - 32 Backup slides

Uni Innsbruck Informatik - 33 Enriched with customised network mechanisms Original Internet technology Traditional Internet applications (web browser, ftp,..) Real-time multimedia applications (VoIP, video conference,..) Today‘s Grid applications Driving a racing car on a public road Applications with special network properties and requirements Bringing the Grid to its full potential ! EC-GIN EC-GIN enabled Grid applications Research gap: Grid-specific network enhancements

Uni Innsbruck Informatik - 34 Grid-network peculiarities Special behavior –Predictable traffic pattern - this is totally new to the Internet! –Web: users create traffic –FTP download: starts... ends –Streaming video: either CBR or depends on content! (head movement,..) Could be exploited by congestion control mechanisms –Distinction: Bulk data transfer (e.g. GridFTP) vs. control messages (e.g. SOAP) –File transfers are often “pushed“ and not “pulled“ –Distributed System which is active for a while overlay based network enhancements possible –Multicast –P2P paradigm: “do work for others for the sake of enhancing the whole system (in your own interest)“ can be applied - e.g. act as a PEP,... sophisticated network measurements possible –can exploit longevity and distributed infrastructure Special requirements –file transfer delay predictions note: useless without knowing about shared bottlenecks –QoS, but for file transfers only (“advance reservation“)

Uni Innsbruck Informatik - 35 What is EC-GIN? European project: Europe-China Grid InterNetworking –STREP in IST FP6 Call 6 –2.2 MEuro, 11 partners (7 Europe + 4 China) –Networkers developing mechanisms for Grids

Uni Innsbruck Informatik - 36 Research Challenges Research Challenges: –How to model Grid traffic? Much is known about web traffic (e.g. self-similarity) - but the Grid is different! –How to simulate a Grid-network? Necessary for checking various environment conditions May require traffic model (above) Currently, Grid-Sim / Net-Sim are two separate worlds (different goals, assumptions, tools, people) –How to specify network requirements? Explicit or implicit, guaranteed or “elastic“, various possible levels of granularity –How to align network and Grid economics? Combined usage based pricing for various resources including the network –What P2P methods are suitable for the Grid? What is the right means for storing short-lived performance data?

Uni Innsbruck Informatik - 37 Problem: How Grid people see the Internet Abstraction - simply use what is available –still: performance = main goal Wrong. Quote from a paper review: “In fact, any solution that requires changing the TCP/IP protocol stack is practically unapplicable to real-world scenarios, (..).“ How to change this view –Create awareness - e.g. GGF GHPN-RG published documents such as “net issues with grids“, “overview of transport protocols“ –Develop solutions and publish them! (EC-GIN, GridNets) Just like Web Service community Absolutely not like Web Service community ! Conflict! Existing transport system (TCP/IP + Routing +..) works well QoS makes things better, the Grid needs it! –we now have a chance for that, thanks to IPv6

Uni Innsbruck Informatik - 38 A time-to-market issue Research begins (Real-life) coding begins Real-life tests begin Thesis writing Result: thesis + running code; tests in collaboration with different research areas Result: thesis + simulation code; perhaps early real-life prototype (if students did well) Research begins (Simulation) coding begins Thesis writing Typical Grid project Typical Network project Ideal

Uni Innsbruck Informatik - 39 Machine-only communication Trend in networks: from support of Human-Human Communication –e-mail, chat via Human-Machine Communication –web surfing, file downloads (P2P systems), streaming media to Machine-Machine Communication –Growing number of commercial web service based applications –New “hype” technologies: Sensor nets, Autonomic Computing vision Semantic Web (Services): first big step for supporting machine-only communication at a high level So far, no steps at a lower level –This would be like RTP, RTCP, SIP, DCCP,... for multimedia apps: not absolutely necessary, but advantageous

Uni Innsbruck Informatik - 40 The long-term value of Grid-net research A subset of Grid-net developments will be useful for other machine-only communication systems! Grid Sensor nets Web service applications Future work Key for achieving this: change viewpoint from “what can we do for the Grid“ to “what can the Grid do for us“ (or from “what does the Grid need“ to “what does the Grid mean to us“)

Uni Innsbruck Informatik - 41 Large stacks Source: DoD IP TCP HTTP SOAP Middleware WS-RF Grid apps

Uni Innsbruck Informatik - 42 The Grid and P2P systems Look quite similar –Goal in both cases: resource sharing Major difference: clearly defined VOs / VTs –No incentive considerations –Availability not such a big problem as in the P2P case: it is an issue, but at larger time scales (e.g. computers in student labs should be available after 22:00, but are sometimes shut down by tutors) –Scalability not such a big issue as in the P2P case... so far! → convergence as Grids grow [Figure: “coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” applies to both Grid and P2P]

Uni Innsbruck Informatik - 43 How the tools are applied in practice Web Browser Compute Server Data Catalog Data Viewer Tool Certificate authority Chat Tool Credential Repository Web Portal Compute Server Resources implement standard access & management interfaces Collective services aggregate &/or virtualize resources Users work with client applications Application services organize VOs & enable access to other services Database service Database service Database service Simulation Tool Camera Telepresence Monitor Registration Service Source: Globus presentation by Ian Foster

Uni Innsbruck Informatik - 44 Data MgmtSecurity Common Runtime Execution Mgmt Info Services Web Services Components Non-WS Components Pre-WS Authentication Authorization GridFTP Pre-WS Grid Resource Alloc. & Mgmt Pre-WS Monitoring & Discovery C Common Libraries Authentication Authorization Reliable File Transfer Data Access & Integration Grid Resource Allocation & Management Index Java WS Core Community Authorization Replica Location eXtensible IO (XIO) Credential Mgmt Community Scheduling Framework Delegation Example: Globus Toolkit version 4 (GT4) Data Replication Trigger C WS Core Python WS Core WebMDS Workspace Management Grid Telecontrol Protocol Contrib/ Preview Core Depre- cated Example Source: Globus presentation by Ian Foster

Uni Innsbruck Informatik - 45 Automatic parallelization Has been addressed in the past Microcode parallelism (pipelining in CPU) –Relatively easy: simple dependencies Instruction level parallelism –More complex dependencies –Can automatically be analyzed by compiler: reordering, loop unrolling,.. Example (Intel C++ compiler):
/* Original loop */
for (i=1; i<100; i++) a[i] = a[i] + b[i] * c[i];
/* Thread 1 */
for (i=1; i<50; i++) a[i] = a[i] + b[i] * c[i];
/* Thread 2 */
for (i=50; i<100; i++) a[i] = a[i] + b[i] * c[i];
Source: Wikipedia

Uni Innsbruck Informatik - 46 Automatic parallelization /2 Parallel Computing: complete applications parallelized –Very complex dependencies –Decomposition methods + mapping of tasks onto processors: usually not automatic (depends on problem and interconnection network) –Algorithm specific methods developed (matrix operations, sorting,..) –Some parts can be automated, but not everything → explicit parallelism (OpenMP) and even explicit allocation (MPI) quite popular Some research efforts on semi-automatic parallelization (“manual” aid) –Programmer knows about problem-specific locality needs (interacting code elements) –Examples: Java extensions such as JavaSymphony [Thomas Fahringer, Alexandru Jugravu]; HPF+ HALO concept [Siegfried Benkner]
