Service Orientation Considerations PoC R1 David Groep, Scaling and Validation Programme
Service Orientation
Service Oriented Architecture (SOA): "an application architecture within which all functions are defined as independent services with well defined invocable interfaces which can be called in defined sequences to form business processes".
Web services are a collection of technologies, including XML, SOAP, WSDL, and UDDI.
Thus Web services do not equal service-oriented architecture.
Source: UvA Grid Master IGC2005, Adam Belloum
Service Decomposition Source: UvA Grid Master IGC2005, Adam Belloum
Service Orientation
All functions are defined as services.
All services are independent.
  They operate as "black boxes":
    o external components neither know nor care how the boxes are executed,
    o merely that they return the expected result.
The interfaces are invocable.
  At an architectural level, it is irrelevant
    o whether they are local or remote,
    o what interconnect scheme or protocol is used to effect the invocation,
    o what infrastructure components are required to make the connection.
Source: UvA Grid Master IGC2005, Adam Belloum
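The black-box idea can be sketched in a few lines of Java. This is a minimal illustration only: TemperatureService, LocalTemperatureService and Thermostat are hypothetical names invented for this example, not part of any VL-e or GT4 API. The consumer is written against the interface alone, so a remote (e.g. SOAP-backed) implementation could be substituted without changing it.

```java
import java.util.Locale;

/** Hypothetical service contract: callers see only this interface. */
interface TemperatureService {
    double toFahrenheit(double celsius);
}

/** One possible implementation, running inside the caller's own process. */
class LocalTemperatureService implements TemperatureService {
    @Override
    public double toFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }
}

/** The consumer neither knows nor cares how or where the service is executed. */
class Thermostat {
    private final TemperatureService service;

    Thermostat(TemperatureService service) {
        this.service = service;
    }

    String report(double celsius) {
        // Only the expected result matters to the caller.
        return String.format(Locale.ROOT, "%.1f F", service.toFahrenheit(celsius));
    }
}

class ServiceOrientationDemo {
    public static void main(String[] args) {
        // Swapping in a remote implementation would not change Thermostat.
        Thermostat thermostat = new Thermostat(new LocalTemperatureService());
        System.out.println(thermostat.report(100.0));
    }
}
```

Passing the service in through the constructor is what keeps the local/remote decision out of the consumer.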
Service Orientation
Typically, within a business environment, "service" means business functions, business transactions, and system services.
The types of services differ:
  Business functions are, from the application's perspective, non-system functions that are effectively atomic.
  Services might be low-level or complex high-level functions (fine-grained or coarse-grained).
Source: UvA Grid Master IGC2005, Adam Belloum
Services are not Servers
Web Servers (Apache, Tomcat, IIS)
  Talk HTTP(S)
  Host web sites and pages
  May be dynamic (.php, .asp, .jsp)
  Are seen by people
Web Services (Axis, GT4, .NET, …)
  Bind to a protocol (HTTP(S), SMTP, …)
  May be hosted in a container
  Have a syntax and semantics definition
  Are seen and used by programs
Defining and building Services in WS-*
[See Introduction to Web Services Tutorial]
1. Start with the WSDL
2. Then, generate your interfaces
   E.g. with GT4 tools or JWSDP
   See appropriate tutorial(s): “Technologies for Building Grids” or “Sotomayor”
3. Fill the stubs with an implementation
4. Deploy your services in a container
EGEE NA3 Training Home has plenty of material.
Part 2: Web Services Caveats and Deployment Hints
Hints, caveats and constraints for service deployment in the VL-e PoC Environment
Some general caveats
Re-use services
  Use generic services where possible: ‘job submission service’, ‘data location service’
Don’t over-do it
  WS are not suitable for low-latency HPC or bulk file transfer
  We’re not running a course on WS-RF …
Not all services (nor architectures) are mature yet
  Be prepared to make compromises …
  … as long as it’s in line with our long-term goals
Test that it works with your favourite workflow system!
Web Services hints
Web Services are just a syntax
  Define proper semantics and document them to make your service reusable
  Match up the semantics with related services
Make sure your services are compliant
  o with relevant standards (like WS-I Basic Profile)
  o with the chosen hosting system (GT4)
SOA Editor
[Figure: WSDL shown in the Cape Clear SOA Editor, Cape Clear 2003. © EGEE Consortium and partners]
Style
There are multiple ways to bind a portType
RPC/Encoded
  In the WSDL: style="rpc", use="encoded"
  In Java, methods are like: public void method(String in1, BigInteger myNumber)
Document/Literal (or wrapped)
  In the WSDL: style="document", use="literal"
  In Java, methods are like: public void method(PurchaseOrder po)
  Just because it’s no trouble: name the XML schema element referenced by an operation (through a message element) the same as the operation itself
    o This is the only requirement of document/wrapped which isn't enforced by the WS-I requirements.
    o Note that document/wrapped is a valid document/literal definition.
See also the specific presentation on WSDL styles on the web page
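The difference between the two binding styles shows up directly in the Java signatures. The sketch below is illustrative only: PurchaseOrder, RpcStyleOrders, DocumentStyleOrders and OrderStyles are hypothetical names, and no SOAP toolkit is involved. It simply contrasts "one argument per message part" with "one schema-defined document as the single argument".

```java
import java.math.BigInteger;

/** Hypothetical document type: the whole request is one schema-defined element. */
class PurchaseOrder {
    final String item;
    final BigInteger quantity;

    PurchaseOrder(String item, BigInteger quantity) {
        this.item = item;
        this.quantity = quantity;
    }
}

/** RPC-style signature: every parameter is a separate part of the message. */
interface RpcStyleOrders {
    String submit(String item, BigInteger quantity);
}

/** Document-style signature: a single parameter carrying the whole document. */
interface DocumentStyleOrders {
    String submit(PurchaseOrder po);
}

class OrderStyles {
    // Toy implementation of the RPC-style operation.
    static final RpcStyleOrders RPC = (item, quantity) -> quantity + " x " + item;

    // The document-style operation unpacks the document and reuses the same logic.
    static final DocumentStyleOrders DOC = po -> RPC.submit(po.item, po.quantity);

    public static void main(String[] args) {
        System.out.println(RPC.submit("disk", BigInteger.valueOf(3)));
        System.out.println(DOC.submit(new PurchaseOrder("disk", BigInteger.valueOf(3))));
    }
}
```

With the document style, adding a field to PurchaseOrder leaves the operation signature unchanged, which is one reason it tends to evolve more gracefully.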
Services Hosting
Usually, a service lives in a container
The container takes care of
  Translating incoming SOAP messages to appropriate objects (Java objects, C structures, …)
  Enforcing access control
  Invoking methods
  Packaging the results
  Destroying the invoked objects
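The container's responsibilities listed above can be mimicked with a toy dispatcher. This is a sketch under loud assumptions: MiniContainer is a hypothetical class, the "operation:argument" message format stands in for SOAP, and a simple caller allow-list stands in for real access control. It does not reflect how Axis or GT4 are implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Toy stand-in for a hosting container (not the Axis or GT4 API). */
class MiniContainer {
    private final Map<String, Function<String, String>> operations = new HashMap<>();
    private final Set<String> allowedCallers;

    MiniContainer(Set<String> allowedCallers) {
        this.allowedCallers = allowedCallers;
    }

    /** Deploy an operation under a name. */
    void deploy(String operation, Function<String, String> implementation) {
        operations.put(operation, implementation);
    }

    /** Handle one request; the message format "operation:argument" is an assumption. */
    String handle(String caller, String message) {
        // 1. Translate the incoming message into objects.
        int separator = message.indexOf(':');
        if (separator < 0) {
            return "fault:malformed message";
        }
        String operation = message.substring(0, separator);
        String argument = message.substring(separator + 1);

        // 2. Enforce access control before touching the service.
        if (!allowedCallers.contains(caller)) {
            return "fault:access denied";
        }

        // 3. Invoke the method.
        Function<String, String> implementation = operations.get(operation);
        if (implementation == null) {
            return "fault:unknown operation";
        }
        String result = implementation.apply(argument);

        // 4. Package the result. The per-request objects go out of scope on
        //    return, which is this sketch's version of "destroying" them.
        return "ok:" + result;
    }
}
```

A usage example: deploy a "reverse" operation with `container.deploy("reverse", s -> new StringBuilder(s).reverse().toString())` and call `container.handle("alice", "reverse:grid")`.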
Containers
Many containers, but not all the same
Vary in what they support, e.g.,
  Platform: JVM, .NET, Unix
  Language (Java, C, Python)
  WSDL bindings (rpc or document/literal)
  Security mechanisms (none or GSI)
VL-e supplied container
  GT4 (Axis 2.0-RC2++)
  Modified to do WS-RF, -Addressing, -Notification
  With GSI Security framework
  Almost like standard Axis 2
Implementation
Remember that web services are stateless
  Model state via Resources
  Don’t keep running after the service invocation is complete
  Use database- or memory-based storage, not process-based state retention
    o Much more friendly on the CPU
    o Resilient to machine/VM restarts
The ultimate horror scenario: R-GMA producer & consumer threads, see Concurrency P&E paper on the web
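A minimal sketch of "stateless service plus resources", assuming a hypothetical ResourceStore abstraction (the in-memory map below merely stands in for a database; none of these names come from WS-RF or GT4). The service object holds no per-client state, so any instance, including one created after a restart, can serve a request as long as the store is still there.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical storage abstraction; in practice this would be a database. */
interface ResourceStore {
    void save(String id, long value);
    Long load(String id);
}

/** In-memory stand-in for the store, used only to keep the sketch self-contained. */
class InMemoryStore implements ResourceStore {
    private final Map<String, Long> data = new ConcurrentHashMap<>();

    @Override
    public void save(String id, long value) {
        data.put(id, value);
    }

    @Override
    public Long load(String id) {
        return data.get(id);
    }
}

/** Stateless service: all state lives in resources addressed by an id. */
class CounterService {
    private final ResourceStore store;

    CounterService(ResourceStore store) {
        this.store = store;
    }

    /** Create a new resource and hand its id back to the client. */
    String create() {
        String id = UUID.randomUUID().toString();
        store.save(id, 0L);
        return id;
    }

    /** Load the resource, update it, store it, and return; nothing keeps running. */
    long add(String id, long delta) {
        Long current = store.load(id);
        if (current == null) {
            throw new IllegalArgumentException("unknown resource: " + id);
        }
        long updated = current + delta;
        store.save(id, updated); // not atomic; a real store would use a transaction
        return updated;
    }
}
```

Because the client carries only the resource id, no thread or process has to stay alive between invocations.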
Security model
We use SOAP over TLS (transport-level security)
  This is the default in GT4
  Orders of magnitude faster than MLS (message-level security)
  But cannot do secure message forwarding
PoC Environment Hardware and Software
RP, the P4CTB and the PoC
[Figure: the three VL-e environments and the release flow between them]
Environments | Usage | Characteristics
  VL-e Proof of Concept Environment (NL-Grid production cluster, central mass-storage facilities + SURFnet, initial compute platform) | Application development | Stable, reliable, tested; certified releases of Grid MW & VL-software
  VL-e Certification Environment (NL-Grid Fabric Research Cluster) | Test & certification of Grid MW & VL-software; compatibility | Flexible, test environment
  VL-e Rapid Prototyping Environment (DAS-2, local resources) | Virtual Lab. rapid prototyping (interactive simulation) | Flexible, ‘unstable’
Release flow labels: Developer CVS; Nightly builds; Unit tests; Developers environment; Tagged Release Candidates; Integration tests; Functionality tests; Release Candidate n+1; Adventurous application people; stable, tested releases; PoC Release n; Download Repository; PoC Installer; Cluster Tools
PoC: Hardware and Software
The PoC Distribution is a set of software
The PoC Environment is the ensemble of systems running the PoC Distribution
The PoC Central Facilities (“CF”) are those systems managed by the P4 partners and designated as such
PoC Software Distribution
OS (supported for all operations: RHEL3 & compatibles; will work on a variety of similar systems)
  Red Hat Enterprise Edition R3
  PoC installer
Middleware (services & hosting environments)
  Globus Toolkit version 2.x as supported by VDT
  Current grid services from LCG, EGEE
  Globus Toolkit 4.0 (WSRF components)
  Storage Resource Broker (version 3.3.1)
PoC Software Distribution
Integrative
  Three systems: Kepler, Taverna, Triana
  OGSA-DAI as installed with GT4
  IBIS (version 1.2)
Tools & Libraries
  ParaView (version 2.2.1)
  VTK (version 4.4)
  MESA (version 6.4 with GLUT)
  ITK (version 1.4)
  FSL (version 3.2)
  MRICRO (version 1.39)
  java SDK x deployment/install
  octave (version )
  MPItb (v2.1.6x)
  MatLabMPI (v1.2)
  lamMPI (version x.x.x)
  LUCENE (version 1.4.3)
  Ant (version )
  SWI-Prolog (version )
  R Grid Weka (version 3.4.2)
  Nimrod client software (v3.0.0)
  GAT
  Sesame (client)
PoC Central Facility topology
Service Deployment constraints CF
Farm of GT4 containers provided on the CF
  Based on the concept of “VO Boxes”
  For now: login to these boxes via gsissh
  Not for compute-intensive work
You cannot run a container continuously on worker nodes
  No inbound IP connectivity
  Resource management and utilization
And you should not want to, because
  All services in a container contend for CPU and (disk) bandwidth
  The JVM does not insulate services from each other
Other constraints on the CF
Worker nodes are allocated in a transient way
  Jobs run in a one-off scratch directory
  No shared file system for $HOME (and if it happens to be there, don’t use it!)
Jobs will be queued
  o Short jobs get to run faster
  o Small jobs get to run faster
  o Priority depends on your VO affiliation
You can connect out, but cannot listen
Your job is limited in wall time
Central Hosting
A common hosting environment is offered on a set of VO Boxes
Provided:
  GT4.0 WS (as specified for R1)
  Visible from the outside world
  Highly monitored systems
  Can talk to the ‘inside’
Requirements for deployment on the CF
  Appropriate AuthN/AuthZ must be used for all services (this is available by default in GT4)
  Allow for request traceability (e.g. via log files)
High performance/throughput Services
‘I need lots of * for my service’
Parallelize it
  Use fine-grained service decomposition
  Use MPI (e.g. via IBIS)
Use dedicated programs for processing
  Submit via a resource broker to the compute service
Submit jobs to the compute service to run dedicated GT4 hosting environments with your service
  Keep in mind to submit these to the cluster where your front-end is running on a VO Box
Complex Services: job submission
Use a job submission service
  GT4 WS-GRAM: not (yet) supported on the CF; only single-cluster, security model not complete without use of the WSS
  gLite CE WS (“CREAM”): also single cluster, compatible security model, deployment in 2006Q3
  Brokered submission
    o Select appropriate resources via GLUE schema constraints & JDL
    o Submit directly via RB API code on client side
    o Use RB job submission WS by Bart
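To make "GLUE schema constraints & JDL" concrete, the sketch below renders a small job description as JDL text. JdlSketch is a hypothetical helper, not part of any RB API. The attribute names (Executable, StdOutput, OutputSandbox, Requirements) and the GLUE attribute GlueCEPolicyMaxWallClockTime are given as commonly used in LCG/gLite-era JDL and GLUE 1.x; treat them as assumptions and check them against the documentation of the resource broker actually deployed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that renders attribute/value pairs as a JDL document. */
class JdlSketch {

    /** Render attributes in insertion order as "[ name = value; ... ]". */
    static String render(Map<String, String> attributes) {
        StringBuilder jdl = new StringBuilder("[\n");
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            jdl.append("  ")
               .append(attribute.getKey())
               .append(" = ")
               .append(attribute.getValue())
               .append(";\n");
        }
        return jdl.append("]").toString();
    }

    /** Example job: run /bin/hostname on a resource allowing at least 60 minutes wall time. */
    static Map<String, String> exampleJob() {
        Map<String, String> job = new LinkedHashMap<>();
        job.put("Executable", "\"/bin/hostname\"");
        job.put("StdOutput", "\"std.out\"");
        job.put("OutputSandbox", "{\"std.out\"}");
        // Resource selection via a GLUE schema constraint (attribute name assumed).
        job.put("Requirements", "other.GlueCEPolicyMaxWallClockTime >= 60");
        return job;
    }

    public static void main(String[] args) {
        System.out.println(render(exampleJob()));
    }
}
```

The rendered text is what would be handed to the broker, whether through client-side RB API code or through a job submission web service.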
Brokered Job Submission Service
Submit brokered jobs in JDL through a WS interface to the Matrix Resource Broker
WSDL specification
For detailed info: ask Bart Heupers
A future version of the basic Grid middleware will have a native WMS WS interface
Questions and Discussion!