Silver to Grid Data Services Session III: Deploying a Data Service on caGrid and using caGrid Service APIs caBIG™ Annual Meeting June 23-25, 2008.


1 Silver to Grid Data Services Session III: Deploying a Data Service on caGrid and using caGrid Service APIs caBIG™ Annual Meeting June 23-25, 2008

2 Overview of Sessions I - III
Divided into three sessions of 1 hour 15 minutes each
  Session I (Tuesday 10:15 - 11:30 a.m.; Wednesday 10:00 - 11:15 a.m.): Overview of the Silver to Grid training program. Presentation and Live Demo of caGrid Semantic Interoperability.
  Session II, Lessons 1-5 (Tuesday 12:45 - 2:00 p.m.; Wednesday 12:30 - 1:45 p.m.): Developing a Silver-Level Compatible Data Service API
  Session III, Lessons 6-9 (Tuesday 2:15 - 3:30 p.m.; Wednesday 2:00 - 3:15 p.m.): Deploying a Data Service on caGrid and using caGrid Service APIs

3 Acknowledgements Peter McGarvey Baris Suzek Mike Keller Dianne Reeves George Komatsoulis Avinash Shanbhag Becky Angeles Jennifer Brush Jamie Parker Claire Wolfe Ken Smith Sal Mungal Virginia Hetrick Shannon Hastings Architecture/VCDE workspace participants

4 Session III: Deploying a Data Service on caGrid and using caGrid Service APIs

5 Session III: Lessons Lesson 6: Installing caGrid node for Data Service Deployment Lesson 7: Deploying a caGrid Data Service Lesson 8: Using caGrid Data Services Lesson 9: Using caGrid Metadata Service APIs

6 Lesson 6: Installing caGrid for Data Service Deployment

7 Installing caGrid for Data Service Deployment

8 Outline Overview caGrid caGrid Infrastructure Step-by-step caGrid Installation for Data Service Deployment

9 What is caGrid? Development project of Architecture Workspace Service oriented infrastructure that supports caBIG™ An architecture that allows building a grid of your own Enables collaborating institutions to share information and analytical resources efficiently and securely

10 caGrid Community Involvement
  caGrid itself provides no real “data” or “analysis” to caBIG™; it is the enabling infrastructure which allows the community to develop Analytical Services and Data Services
  Community members add value to the grid as applications, services (data/analytical), and processes
  caGrid provides the necessary core services, APIs, and tooling
  Community members develop end user applications/clients which consume the resources provided on the grid

11 caGrid Infrastructure Client and service APIs are object oriented, and operate over well-defined and curated data types Objects are defined in UML and converted into ISO/IEC 11179 Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR) Object definitions are drawn from controlled terminology and the vocabulary is registered in the Enterprise Vocabulary Services (EVS), and their relationships are thus semantically described Objects are serialized to XML that adhere to XML schemas registered in the Global Model Exchange (GME)

12 caGrid Infrastructure – cont’d Service and the hosting center metadata is registered in Index Service

13 caGrid Installation: Before starting
  Download the caGrid 1.2 Installer: http://gforge.nci.nih.gov/frs/download.php/3738/caGrid-installer-1.2.zip
  Set the environment variables:
    JAVA_HOME: Location of Java JDK 1.5.X
    ANT_HOME: Location of Ant 1.6.5
    CATALINA_HOME: Location of Tomcat ver. 5.0.28
    GLOBUS_LOCATION: Location of Globus Toolkit ver. 4.0.3
  If Ant, Globus Toolkit and/or Tomcat are not available, the caGrid Installer installs them
  Unzip caGrid-installer-1.2.zip
  Run the caGrid installer:
    java -jar caGrid-installer-1.2.jar
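The environment check on this slide can be sketched in a few lines of Java. This is an illustrative helper, not part of the caGrid installer: the class name CaGridEnvCheck and its method are hypothetical, and only the four variable names come from the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-install check; not part of the caGrid installer. */
final class CaGridEnvCheck {
    // Variable names as listed on the slide.
    static final List<String> REQUIRED = Arrays.asList(
            "JAVA_HOME", "ANT_HOME", "CATALINA_HOME", "GLOBUS_LOCATION");

    /** Returns the listed variables that are unset or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<String>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> notSet = missing(System.getenv());
        if (notSet.isEmpty()) {
            System.out.println("All four variables are set.");
        } else {
            System.out.println("Not set: " + notSet);
        }
    }
}
```

Passing the environment in as a map keeps the check easy to try with sample values before running it against the real environment.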

14 caGrid Installation: Installation Types Choose any combination of installation types to install one or more caGrid components For data service deployment, choose options “Install caGrid” and “Configure Container”

15 caGrid Installation: Service Container Choose Tomcat or Globus as service containers

16 caGrid Installation: Prerequisites Install (or reinstall) prerequisite software Ant Tomcat Globus Toolkit

17 caGrid Installation: Location Provide the directory where caGrid will be installed

18 caGrid Installation: Target Grid
Each target grid uses different URLs for the caGrid core services. For instance, the service URLs for the OSU Training Grid are:
    cagrid.master.index.service.url=http://training03.cagrid.org:6080/wsrf/services/DefaultIndexService
    cagrid.master.cadsr.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/CaDSRService
    cagrid.master.gme.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/GlobalModelExchange
    cagrid.master.gridgrouper.service.url=https://training03.cagrid.org:6443/wsrf/services/cagrid/GridGrouper
    cagrid.master.dorian.service.url=https://dorian.cagrid.org:6443/wsrf/services/cagrid/Dorian
Choose one of the available grids:
  NCICB Development
  NCICB Production
  NCICB QA
  OSU Development
  OSU Training
  and more
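Since the target-grid settings are plain key=value pairs, they can be read with the standard java.util.Properties class. The sketch below assumes the settings are available as text in that format; the class name TargetGridUrls and the idea of indexing the URLs by the middle part of the key are illustrative choices, not something the installer does.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper that lists core-service URLs from target-grid properties. */
final class TargetGridUrls {
    static final String PREFIX = "cagrid.master.";
    static final String SUFFIX = ".service.url";

    /** Maps a short service name (e.g. "index") to its URL. */
    static Map<String, String> parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties text", e);
        }
        Map<String, String> urls = new TreeMap<String, String>();
        for (String key : props.stringPropertyNames()) {
            boolean longEnough = key.length() > PREFIX.length() + SUFFIX.length();
            if (longEnough && key.startsWith(PREFIX) && key.endsWith(SUFFIX)) {
                String service = key.substring(PREFIX.length(), key.length() - SUFFIX.length());
                urls.put(service, props.getProperty(key));
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Two of the OSU Training Grid entries from the slide.
        String sample =
              "cagrid.master.index.service.url=http://training03.cagrid.org:6080/wsrf/services/DefaultIndexService\n"
            + "cagrid.master.cadsr.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/CaDSRService\n";
        for (Map.Entry<String, String> entry : parse(sample).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```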

19 caGrid Installation: Container Configuration Securing container is needed to host secure services. Secure services are those that require clients to use one of the Globus Security Infrastructure (GSI) authentication mechanisms.

20 caGrid Installation: Completion

21 Additional Information caGrid Wiki: http://www.cagrid.org/mwiki/index.php?title=CaGrid caBIG™ Architecture WS caGrid Web Page: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/

22 Lesson 7: Deploying a caGrid Data Service

23 Deploying a caGrid Data Service

24 Outline Overview Major steps for deployment Introduce Toolkit Step-by-step deployment of a Data Service; gridPIR

25 caGrid Data Service Deployment – Major steps
  Provide client and service APIs that are object oriented
  Provide objects that are defined in UML and registered in the Cancer Data Standards Repository (caDSR)
  Provide object definitions drawn from controlled terminology and vocabulary registered in the Enterprise Vocabulary Services (EVS)
  Provide XML schemas for XML serialization of objects (may be registered in the Global Model Exchange)
  Provide service metadata about the center where the service is deployed

26 caGrid Data Service Deployment – Major steps Register service metadata about the service and the center where service is deployed

27 Service Metadata (Domain Model Portion) …..

28 Introduce: Grid Service Authoring Toolkit An open-source and extensible toolkit Supports easy development and deployment of WS/WSRF compliant Grid services by hiding low level details of the Globus Toolkit Enables the implementation of strongly-typed Grid services Facilitates caGrid data service development using caCORE SDK artifacts through pluggable service styles

29 Deploying a caGrid data service using Introduce: Grid-enablement of Protein Information Resource (gridPIR) A data service to provide comprehensive and fully annotated protein related information for genomic and proteomic cancer research Developed using model driven approach and caCORE SDK 3.2.1 All data is public so no security layer implemented

30 Introduce: Create a caGrid Service ant introduce Modify an existing service Deploy an existing service Browse Data Types from caDSR or GME

31 Introduce: Enter service information
  An analytical service exposes operation(s) with input/output objects
  A data service exposes objects that represent the data resource

32 Introduce: Data Service Configuration Different Service Styles (including caCORE SDK) supported. gridPIR is generated using caCORE SDK v3.2.1 Optional extensions for Bulk Data Transfer or Web Services Enumeration

33 Introduce: caCORE SDK-generated Client Selection Two options for client selection: Option 1: Use remote API if data service caCORE-like system (API) and caGrid Data Service are on the different machines Option 2: Use local API if both caCORE-like system (API) and caGrid Data Service are deployed on the same machine

34 Introduce: Remote API Selection Library folder (including client jar) generated by caCORE SDK

35 Introduce: Remote API Selection Treat all queries case-insensitive Use Common Security Module Enter URL for remote caCORE-like gridPIR API (publicly accessible)

36 Introduce: Choosing objects (model) service exposes 4. Add selected packages 1. Fetch models from caDSR 2. Select gridPIR model v1.2 3. Select package from gridPIR model

37 Introduce: Choosing XML Schema Find schemas from GME (if registered) OR Resolve schemas manually

38 Introduce: Choosing XML Schema – Manual Resolution (cont’d) XSD generated by caCORE SDK

39 Introduce: Entering Service Description 1. Select Metadata Tab 2. Select ServiceMetadata row 3. Edit Property

40 Introduce: Entering Service Metadata (cont’d) Enter: POC Hosting Center Address

41 Introduce: Deploy gridPIR Data Service Deploy an existing service

42 Introduce: Selecting Data Service Location in the file system Compiled service stubs Metadata files Library files XML schemas Source code for service stubs

43 Introduce: Selecting Data Service Location in the file system Container information Register to Index Service? URL for Index Service

44 Verifying Deployment URL for deployed service

45 Outcome

46 Additional Information caGrid Wiki: http://www.cagrid.org/mwiki/index.php?title=CaGrid Introduce Toolkit Wiki: http://www.cagrid.org/mwiki/index.php?title=Introduce caGrid Data Services Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services caBIG™ Architecture WS caGrid Web Page: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/

47 Lesson 8: Using caGrid Data Services

48 Outline Overview CQL Using caGrid Portal Using Generic Data Service Client CQL Examples using gridPIR Data Service

49 Executing a Data Service Query Query Results

50 caBIG Query Language (CQL)
  CQL Query: A simple wrapper element at the head of every CQL query document; it contains the target.
  Target: The Target element is of the type Object, and describes the data type which the query will return.
  QueryModifier: An optional element modifying the returned result set. This modifier has a required attribute ‘countOnly’ and optionally allows for a choice of a list of Attribute Names or a single Distinct Attribute to return.
  Object: Contains the required attribute ‘name.’ This attribute’s value defines the caDSR class of the object. When the Object is the top level target of a CQL query, it identifies the data type that will be returned by the caGrid Data Service. The Object allows for a choice between three child elements: Attribute, Association, and Group. Objects may have at most one of these child elements. Groups also have an attribute ‘logicOperator,’ an enumeration of the values “AND” and “OR.”
http://www.cagrid.org/mwiki/index.php?title=Data_Services:CQL
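To make the nesting concrete, the sketch below builds a minimal CQL document (a CQLQuery wrapper, a Target, and one Attribute child) with the standard Java DOM API. Several things here are assumptions rather than facts from the slides: the query namespace is inferred by analogy with the result-set namespace shown later in this session, the Attribute element is assumed to carry name, predicate and value as XML attributes, and the class name example.domain.Gene is a placeholder. The CQL wiki page above remains the authority on the schema.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Illustrative builder for a one-attribute CQL document; not a caGrid API. */
final class CqlDocumentSketch {
    // Assumed namespace, by analogy with the CQLResultSet namespace.
    static final String CQL_NS = "http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery";

    /** Builds CQLQuery > Target > Attribute for a single attribute restriction. */
    static Document build(String targetClass, String attrName, String predicate, String value) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element query = doc.createElementNS(CQL_NS, "CQLQuery");
            doc.appendChild(query);

            // The Target names the class of object the query returns.
            Element target = doc.createElementNS(CQL_NS, "Target");
            target.setAttribute("name", targetClass);
            query.appendChild(target);

            // An Object may have at most one child; here it is an Attribute.
            Element attribute = doc.createElementNS(CQL_NS, "Attribute");
            attribute.setAttribute("name", attrName);
            attribute.setAttribute("predicate", predicate);
            attribute.setAttribute("value", value);
            target.appendChild(attribute);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build CQL document", e);
        }
    }

    /** Serializes the document to a string without the XML declaration. */
    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize CQL document", e);
        }
    }

    public static void main(String[] args) {
        // "example.domain.Gene" is a hypothetical class name.
        System.out.println(toXml(build("example.domain.Gene", "name", "EQUAL_TO", "BRCA1")));
    }
}
```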

51 caGrid Portal http://cagrid-portal.nci.nih.gov/

52 caGrid Portal Discovery Data Service Query Portal Allows Discovery Exploration of: Domain Models Semantic Metadata Hosting Center Info Data Queries Example: Query on Gene Objects

53 caGrid Portal 1) Select “Edit Query Modifiers” 2) Select “Object” then “Update” 3) Update then “Add Criterion” 4) Select attribute “name” 5) Set “name” EQUAL_TO “BRCA1” 6) Update

54 caGrid Portal 7) Submit Query 8) When query is finished Select View Results 9) Query returns 11 Gene Objects and attribute values for each

55 caGrid Portal See Query and Results as XML

56 Generic Data Service Client
  Can be used for caGrid data services that are based on caCORE-like systems, since such services usually expose only the query method in their public API
  For services that have methods other than query, specific clients generated by the Introduce Toolkit need to be used
  Typically a query involves three steps:
    Initialization
    Creating and submitting a CQL query
    Processing the results

57 Generic Data Service Client: Initializing the client
For gridPIR Data Service:
    String serviceURL = "http://141.161.25.20:8080/wsrf/services/cagrid/GridPIR";
    DataServiceClient client = new DataServiceClient(serviceURL);

58 Generic Data Service Client: Creating a CQL query
Option 1: Create CQLQuery object programmatically:
    // CQL query to retrieve all BRCA1 genes
    CQLQuery query = new CQLQuery();
    Object target = new Object();
    target.setName(Gene.class.getName());
    Attribute nameAttribute = new Attribute("name", Predicate.EQUAL_TO, "BRCA1");
    target.setAttribute(nameAttribute);
    query.setTarget(target);

59 Generic Data Service Client: Creating a CQL query
Option 2: Load CQLQuery object from an XML string or file:
    // from a string
    CQLQuery query2 = (CQLQuery) Utils.deserializeObject(
        new StringReader(" "), CQLQuery.class);
    // from a file
    CQLQuery query3 = (CQLQuery) Utils.deserializeObject(
        new FileReader(cqlFile), CQLQuery.class);

60 Generic Data Service Client: Submitting the CQL Query
Results are returned as CQLQueryResults object:
    try {
        CQLQueryResults results = client.query(query);
    } catch (QueryProcessingExceptionType ex) {
        // handle processing exception
    } catch (MalformedQueryExceptionType ex) {
        // handle malformed query
    } catch (RemoteException ex) {
        // handle remote exception
    }

61 Generic Data Service Client: Processing the results
Option 1: Results can be iterated as single items using CQLQueryResultsIterator, a specialized implementation of the standard Java Iterator interface:
    Iterator iter = new CQLQueryResultsIterator(results,
        GridPIRClient.class.getResourceAsStream("client-config.wsdd"));
    while (iter.hasNext()) {
        Gene gene = (Gene) iter.next();
        System.out.println(gene.getName());
    }

62 Generic Data Service Client: Processing the results
Option 2: Results can be serialized to a string (or to a file for future processing):
    StringWriter w = new StringWriter();
    Utils.serializeObject(
        results,
        new QName("http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLResultSet", "CQLResultSet"),
        w);
    System.out.println(w.getBuffer());

63 Find Protein objects for Human Breast cancer 1 (BRCA1): CQL Example: Query involving three classes

64 Results :

65 Additional Information caGrid Data Services Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services caGrid Data Service Client API Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services:Client_API CQL Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services:CQL

66 Lesson 9: Using caGrid Metadata Service APIs

67 Outline Overview caGrid Metadata Service APIs caDSR Service API EVS Service API Discovery API MetadataUtils

68 caGrid Data Service Metadata Overview - gridPIR UML Modeling Semantic Annotation using EVS concepts caDSR Load/CDE Creation EVS Concept Codes CDE Public ID/version UML Class/attribute CDE Long Name

69 Metadata Services on caGrid - caDSR

70 caDSR Grid Service API Client Provides access to information in the caDSR such as semantically annotated UML model as registered in caDSR Has capability to generate caGrid standard metadata instances such as domain models

71 caDSR Grid Service API Client - Operations
Provides access to the UML-like view of caDSR registered items:
  findAllProjects
  findPackagesInProject
  findClassesInPackage
  findAttributesInClass
Enables clients to generate caGrid standard Data Service metadata:
  generateDomainModelForProject
  generateDomainModelForPackage
  generateDomainModelForClasses
  generateDomainModelForClassesWithExcludes
Provides clients the ability to augment a ServiceMetadata (standard caGrid service metadata) skeleton instance with the information extracted from caDSR:
  annotateServiceMetadata

72 caDSR Grid Service API Client – Retrieving list of projects from caDSR
    // caDSR service on production caGrid
    String serviceURL = "http://cagrid-service.nci.nih.gov:8080/wsrf/services/cagrid/CaDSRService";
    // create a CaDSRServiceClient instance
    CaDSRServiceClient client = new CaDSRServiceClient(serviceURL);
    // get list of projects from caDSR
    Project[] projects = client.findAllProjects();
    // processing results
    for (int i = 0; i < projects.length; i++) {
        Project p = projects[i];
        System.out.println(i + " Short name:" + p.getShortName() + " / Long name:" + p.getLongName());
    }

73 caDSR Grid Service API Client – Retrieving list of projects from caDSR Result:

74 caDSR Grid Service API Client – Retrieving classes/attributes registered for a gridPIR from caDSR Result:

75 caDSR Grid Service API Client – Retrieving classes/attributes registered for gridPIR from caDSR
    // Retrieve the list of classes for a registered model
    UMLClassMetadata[] classArray = client.findClassesInProject(gridPIRProject);
    // Retrieve the list of attributes for a class
    UMLAttributeMetadata[] attributeArray = client.findAttributesInClass(gridPIRProject, classArray[i]);
    // Retrieve the value domain for an attribute
    ValueDomain valueDomain = client.findValueDomainForAttribute(gridPIRProject, attributeArray[j]);

76 Metadata Services on caGrid – Enterprise Vocabulary Services (EVS)

77 EVS Grid Service API Client Provides information on vocabularies and terms/ concepts presented by the NCI Metathesaurus/ Thesaurus

78 EVS Grid Service API Client - Methods
Provides a list of programmatically accessible vocabularies:
  getVocabularyNames
Provides access to concepts/terms from the vocabularies:
  searchDescLogicConcept
Provides the complete history for concepts: their evolution as they are created, merged, modified, split, or retired:
  getHistoryRecords
Provides access to concepts that are supported by the NCI Metathesaurus:
  searchMetaThesaurus
  searchSourceByCode

79 EVS Grid Service API Client – Retrieve list of vocabularies provided by EVS
    // URL for production grid EVS Grid Service
    String serviceURL = "http://cagrid-service.nci.nih.gov:8080/wsrf/services/cagrid/EVSGridService";
    // create an EVSGridServiceClient instance
    EVSGridServiceClient client = new EVSGridServiceClient(serviceURL);
    // retrieve list of vocabularies the service provides
    DescLogicConceptVocabularyName[] vocabularyNames = client.getVocabularyNames();
    // list the names of vocabularies
    for (int i = 0; i < vocabularyNames.length; i++) {
        System.out.println(i + ": " + vocabularyNames[i].getVocabularyName());
    }

80 EVS Grid Service API Client - Retrieve list of vocabularies provided by EVS Result:

81 EVS Grid Service API Client – Retrieve EVS concept code for a term
    // Set the search criteria
    EVSDescLogicConceptSearchParams evsSearchParams = new EVSDescLogicConceptSearchParams();
    // searching in NCI_Thesaurus
    evsSearchParams.setVocabularyName("NCI_Thesaurus");
    // searching the concept code for term protein
    evsSearchParams.setSearchTerm("protein");
    // set maximum number of returned terms/concepts to 10
    evsSearchParams.setLimit(10);
    // run query against the EVS grid service
    DescLogicConcept[] descLogicConcepts = client.searchDescLogicConcept(evsSearchParams);
    // process results

82 EVS Grid Service API Client – Retrieve EVS concept code for a term Result:

83 Metadata Services on caGrid – Index Service

84 Discovery API Client Provides methods to query the Index Service and used to discover services of interest

85 Discovery API Client - Methods
  Searches based on service level metadata, e.g. discoverServicesByResearchCenter
  Searches based on semantic annotation, e.g. discoverDataServicesByModelConceptCode
  Searches based on operation metadata (for analytical services), e.g. discoverServicesByOperationName
  Searches based on information model metadata (for data services), e.g. discoverDataServicesByExposedClass
  Searches based on XPath leveraging Service Metadata XML: discoverByFilter (String xpathPredicate)
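The idea behind an XPath-predicate search can be illustrated with the standard javax.xml.xpath API. This is a sketch of the general technique only, not the Discovery API: the XML layout (services/service with center and exposedClass children), the example.org URLs, and the method name filterByPredicate are all hypothetical, and real service metadata is namespaced and considerably richer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrates filtering service entries with an XPath predicate; not the Discovery API. */
final class XPathFilterSketch {
    // Hypothetical, simplified stand-in for registered service metadata.
    static final String SAMPLE =
          "<services>"
        + "<service url='http://host-a.example.org/wsrf/services/cagrid/A'>"
        + "<center>Center One</center><exposedClass>Gene</exposedClass></service>"
        + "<service url='http://host-b.example.org/wsrf/services/cagrid/B'>"
        + "<center>Center Two</center><exposedClass>Protein</exposedClass></service>"
        + "</services>";

    /** Returns the url attribute of every service entry matching the predicate. */
    static List<String> filterByPredicate(String xml, String xpathPredicate) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The caller's predicate is applied to each service element.
            String expression = "/services/service[" + xpathPredicate + "]/@url";
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> urls = new ArrayList<String>();
            for (int i = 0; i < hits.getLength(); i++) {
                urls.add(hits.item(i).getNodeValue());
            }
            return urls;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate predicate: " + xpathPredicate, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(filterByPredicate(SAMPLE, "exposedClass='Protein'"));
    }
}
```

A predicate-based search is the most flexible of the options on this slide, since any condition expressible over the metadata XML can be used, at the cost of requiring the caller to know the metadata structure.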

86 Discovery API Client – Discovering services using a keyword
    // Index service on production caGrid
    String serviceURL = "http://cagrid-index.nci.nih.gov:8080/wsrf/services/DefaultIndexService";
    // create a DiscoveryClient instance
    DiscoveryClient client = new DiscoveryClient(serviceURL);
    // discover services by keyword
    EndpointReferenceType[] endPointReferenceArr = client.discoverServicesBySearchString("Protein");
    // list URLs for returned services
    for (int i = 0; i < endPointReferenceArr.length; i++) {
        System.out.println("Address: " + endPointReferenceArr[i].getAddress());
    }

87 Discovery API Client – Discovering services using a keyword Result:

88 Metadata API - MetadataUtils Used to access and manipulate instances of service metadata Complements the Discovery API; Once a service is discovered MetadataUtils’s methods can be used to access and inspect the full metadata for the service

89 Metadata API – MetadataUtils Methods Retrieves the service metadata or domain model from the specified service. getServiceMetadata Writes/reads the XML representation of the service metadata to/from the specified writer/reader: serializeServiceMetadata deserializeServiceMetadata Writes/reads the XML representation of the domain model to/from the specified writer/reader: serializeDomainModel deserializeDomainModel

90 Metadata API – MetadataUtils - Example
    // discover services by concept code used to annotate the model, such as C17021 (Protein)
    EndpointReferenceType[] endPointReferenceArr = client.discoverServicesByModelConceptCode("C17021");
    for (int i = 0; i < endPointReferenceArr.length; i++) {
        // retrieve service metadata for the service
        ServiceMetadata serviceMetadata = MetadataUtils.getServiceMetadata(endPointReferenceArr[i]);
        // print hosting center information
        System.out.println("Hosting Center: "
            + serviceMetadata.getHostingResearchCenter().getResearchCenter().getDisplayName());
        // retrieve domain model for the service
        DomainModel domainModel = MetadataUtils.getDomainModel(endPointReferenceArr[i]);
        // print domain model name
        System.out.println("Domain Model/Project Short Name: " + domainModel.getProjectShortName());
        System.out.println("Address: " + endPointReferenceArr[i].getAddress());
    }

91 Metadata API – MetadataUtils - Example Result:

92 Additional Information
  caGrid Wiki: http://www.cagrid.org/mwiki/index.php?title=CaGrid
  caBIG™ Architecture WS caGrid Web Page: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/

93 Session III: Deploying a Data Service on caGrid and using caGrid Service APIs

94 Questions

95 Closing Remarks

96 Additional Resources
  caBIG™ web site: https://cabig.nci.nih.gov/
  caBIG™ gForge site: https://gforge.nci.nih.gov/
  caGrid Wiki: http://www.cagrid.org/
  caBIG™ Learning Management System: http://ncicbtraining.nci.nih.gov/TP2005/tp2000web.dll/NCICBTraining
  caCORE SDK: http://ncicb.nci.nih.gov/infrastructure/cacoresdk
  caBIG™ Compatibility Guidelines: https://gforge.nci.nih.gov/docman/index.php?group_id=233&selected_doc_group_id=1138&language_id=1
  Upcoming boot camps

