Published by Frederick Carter. Modified over 9 years ago.
1
Silver to Grid Data Services Session III: Deploying a Data Service on caGrid and using caGrid Service APIs caBIG™ Annual Meeting June 23-25, 2008
2
Overview of Sessions I - III
Divided into three sessions of 1 hour 15 minutes each, offered on both Tuesday and Wednesday:
Session I (Tuesday 10:15 - 11:30 a.m.; Wednesday 10:00 - 11:15 a.m.): Overview of the Silver to Grid training program. Presentation and Live Demo of caGrid Semantic Interoperability.
Session II, Lessons 1-5 (Tuesday 12:45 - 2:00 p.m.; Wednesday 12:30 - 1:45 p.m.): Developing a Silver-Level Compatible Data Service API
Session III, Lessons 6-9 (Tuesday 2:15 - 3:30 p.m.; Wednesday 2:00 - 3:15 p.m.): Deploying a Data Service on caGrid and using caGrid Service APIs
3
Acknowledgements Peter McGarvey Baris Suzek Mike Keller Dianne Reeves George Komatsoulis Avinash Shanbhag Becky Angeles Jennifer Brush Jamie Parker Claire Wolfe Ken Smith Sal Mungal Virginia Hetrick Shannon Hastings Architecture/VCDE workspace participants
4
Session III: Deploying a Data Service on caGrid and using caGrid Service APIs
5
Session III: Lessons Lesson 6: Installing caGrid node for Data Service Deployment Lesson 7: Deploying a caGrid Data Service Lesson 8: Using caGrid Data Services Lesson 9: Using caGrid Metadata Service APIs
6
Lesson 6: Installing caGrid for Data Service Deployment
7
Installing caGrid for Data Service Deployment
8
Outline Overview caGrid caGrid Infrastructure Step-by-step caGrid Installation for Data Service Deployment
9
What is caGrid? Development project of Architecture Workspace Service oriented infrastructure that supports caBIG™ An architecture that allows building a grid of your own Enables collaborating institutions to share information and analytical resources efficiently and securely
10
caGrid Community Involvement caGrid itself provides no real “data” or “analysis” to caBIG™; it is the enabling infrastructure that allows the community to develop Analytical Services and Data Services. Community members add value to the grid as applications, services (data/analytical), and processes. caGrid provides the necessary core services, APIs, and tooling. Community members develop end user applications/clients which consume the resources provided on the grid.
11
caGrid Infrastructure Client and service APIs are object oriented, and operate over well-defined and curated data types Objects are defined in UML and converted into ISO/IEC 11179 Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR) Object definitions are drawn from controlled terminology and the vocabulary is registered in the Enterprise Vocabulary Services (EVS), and their relationships are thus semantically described Objects are serialized to XML that adhere to XML schemas registered in the Global Model Exchange (GME)
12
caGrid Infrastructure – cont’d Metadata about the service and its hosting center is registered in the Index Service
13
caGrid Installation: Before starting
Download the caGrid 1.2 Installer: http://gforge.nci.nih.gov/frs/download.php/3738/caGrid-installer-1.2.zip
Set the environment variables:
  JAVA_HOME: location of Java JDK 1.5.x
  ANT_HOME: location of Ant 1.6.5
  CATALINA_HOME: location of Tomcat 5.0.28
  GLOBUS_LOCATION: location of Globus Toolkit 4.0.3
If Ant, Globus Toolkit, or Tomcat is not available, the caGrid installer installs it.
Unzip caGrid-installer-1.2.zip
Run the caGrid installer: java -jar caGrid-installer-1.2.jar
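As a rough pre-flight check before running the installer, the sketch below reports which of the four variables named on this slide are unset. The class name `CaGridEnvCheck` and its `missing` helper are hypothetical and are not part of caGrid; the sketch only reads the environment and does not verify versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class CaGridEnvCheck {
    // Variable names as listed on the slide above.
    static final String[] REQUIRED = {
        "JAVA_HOME", "ANT_HOME", "CATALINA_HOME", "GLOBUS_LOCATION"
    };

    /** Returns the required variables that are unset or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> unset = new ArrayList<String>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().length() == 0) {
                unset.add(name);
            }
        }
        return unset;
    }

    public static void main(String[] args) {
        List<String> unset = missing(System.getenv());
        if (unset.isEmpty()) {
            System.out.println("All four variables are set.");
        } else {
            // Per the slide, the installer can install Ant, Globus Toolkit and Tomcat itself.
            System.out.println("Not set: " + unset);
        }
    }
}
```

Passing the environment in as a map keeps the check easy to exercise without touching the real shell environment.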
14
caGrid Installation: Installation Types Choose any combination of installation types to install one or more caGrid components For data service deployment, choose options “Install caGrid” and “Configure Container”
15
caGrid Installation: Service Container Choose Tomcat or Globus as service containers
16
caGrid Installation: Prerequisites Install (or reinstall) prerequisite software Ant Tomcat Globus Toolkit
17
caGrid Installation: Location Provide the directory where caGrid will be installed
18
caGrid Installation: Target Grid
Each target grid uses its own set of URLs for the caGrid core services. For instance, the service URLs for the OSU Training Grid are:
  cagrid.master.index.service.url=http://training03.cagrid.org:6080/wsrf/services/DefaultIndexService
  cagrid.master.cadsr.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/CaDSRService
  cagrid.master.gme.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/GlobalModelExchange
  cagrid.master.gridgrouper.service.url=https://training03.cagrid.org:6443/wsrf/services/cagrid/GridGrouper
  cagrid.master.dorian.service.url=https://dorian.cagrid.org:6443/wsrf/services/cagrid/Dorian
Choose one of the available grids: NCICB Development, NCICB Production, NCICB QA, OSU Development, OSU Training, and more
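The target grid settings are plain key=value pairs, so they can be inspected with the standard `java.util.Properties` class. The sketch below loads two of the OSU Training Grid entries quoted on this slide and reports which ones use https. The class `TargetGridConfig` and its helpers are hypothetical, and where caGrid actually stores these properties on disk is not assumed here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.util.Properties;

class TargetGridConfig {
    // Two entries copied from the OSU Training Grid listing on the slide.
    static final String OSU_TRAINING =
          "cagrid.master.index.service.url="
        + "http://training03.cagrid.org:6080/wsrf/services/DefaultIndexService\n"
        + "cagrid.master.dorian.service.url="
        + "https://dorian.cagrid.org:6443/wsrf/services/cagrid/Dorian\n";

    /** Parses key=value text into a Properties object. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return props;
    }

    /** True when the URL stored under the key uses the https scheme. */
    static boolean isSecure(Properties props, String key) {
        String url = props.getProperty(key);
        return url != null && "https".equals(URI.create(url).getScheme());
    }

    public static void main(String[] args) {
        Properties props = load(OSU_TRAINING);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " secure=" + isSecure(props, key));
        }
    }
}
```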
19
caGrid Installation: Container Configuration The container must be secured in order to host secure services. Secure services are those that require clients to use one of the Globus Security Infrastructure (GSI) authentication mechanisms.
20
caGrid Installation: Completion
21
Additional Information caGrid Wiki: http://www.cagrid.org/mwiki/index.php?title=CaGrid caBIG™ Architecture WS caGrid Web Page: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/
22
Lesson 7: Deploying a caGrid Data Service
23
Deploying a caGrid Data Service
24
Outline Overview Major steps for deployment Introduce Toolkit Step-by-step deployment of a Data Service: gridPIR
25
caGrid Data Service Deployment – Major steps Provide client and service APIs that are object oriented Provide objects that are defined in UML and registered in the Cancer Data Standards Repository (caDSR) Provide object definitions drawn from controlled terminology and vocabulary registered in the Enterprise Vocabulary Services (EVS) Provide XML schemas for XML serialization of objects (may be registered in the Global Model Exchange) Provide service metadata about the center where the service is deployed
26
caGrid Data Service Deployment – Major steps Register service metadata about the service and the center where service is deployed
27
Service Metadata (Domain Model Portion) …..
28
Introduce: Grid Service Authoring Toolkit An open-source and extensible toolkit Supports easy development and deployment of WS/WSRF compliant Grid services by hiding low level details of the Globus Toolkit Enables the implementation of strongly-typed Grid services Facilitates caGrid data service development using caCORE SDK artifacts through pluggable service styles
29
Deploying a caGrid data service using Introduce: Grid-enablement of Protein Information Resource (gridPIR) A data service to provide comprehensive and fully annotated protein related information for genomic and proteomic cancer research Developed using model driven approach and caCORE SDK 3.2.1 All data is public so no security layer implemented
30
Introduce: Create a caGrid Service ant introduce Modify an existing service Deploy an existing service Browse Data Types from caDSR or GME
31
Introduce: Enter service information An analytical service exposes operation(s) with input/output objects A data service exposes objects that represent the data resource
32
Introduce: Data Service Configuration Different Service Styles (including caCORE SDK) supported. gridPIR is generated using caCORE SDK v3.2.1 Optional extensions for Bulk Data Transfer or Web Services Enumeration
33
Introduce: caCORE SDK-generated Client Selection Two options for client selection: Option 1: Use the remote API if the caCORE-like system (API) and the caGrid Data Service are deployed on different machines Option 2: Use the local API if both the caCORE-like system (API) and the caGrid Data Service are deployed on the same machine
34
Introduce: Remote API Selection Library folder (including client jar) generated by caCORE SDK
35
Introduce: Remote API Selection Treat all queries case-insensitive Use Common Security Module Enter URL for remote caCORE-like gridPIR API (publicly accessible)
36
Introduce: Choosing the objects (model) the service exposes 1. Fetch models from caDSR 2. Select gridPIR model v1.2 3. Select package from gridPIR model 4. Add selected packages
37
Introduce: Choosing XML Schema Find schemas from GME (if registered) OR Resolve schemas manually
38
Introduce: Choosing XML Schema – Manual Resolution (cont’d) XSD generated by caCORE SDK
39
Introduce: Entering Service Description 1. Select Metadata Tab 2. Select ServiceMetadata row 3. Edit Property
40
Introduce: Entering Service Metadata (cont’d) Enter: POC Hosting Center Address
41
Introduce: Deploy gridPIR Data Service Deploy an existing service
42
Introduce: Selecting Data Service Location in the file system Compiled service stubs Metadata files Library files XML schemas Source code for service stubs
43
Introduce: Selecting Data Service Location in the file system Container information Register to Index Service? URL for Index Service
44
Verifying Deployment URL for deployed service
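The service URLs that appear in these slides share one shape: scheme, host, port, then `/wsrf/services/cagrid/` followed by the service name. The sketch below builds a URL of that shape so it can be pasted into a browser or client when verifying a deployment. The class `ServiceUrlSketch` is hypothetical, and the path prefix is an assumption inferred from the URLs in this deck; a container configured differently may publish the service elsewhere.

```java
class ServiceUrlSketch {
    /** Builds a service URL following the pattern of the service URLs shown in these slides. */
    static String serviceUrl(String scheme, String host, int port, String serviceName) {
        return scheme + "://" + host + ":" + port + "/wsrf/services/cagrid/" + serviceName;
    }

    public static void main(String[] args) {
        // localhost:8080 is a placeholder for the container that hosts the service.
        System.out.println(serviceUrl("http", "localhost", 8080, "GridPIR"));
    }
}
```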
45
Outcome
46
Additional Information caGrid Wiki: http://www.cagrid.org/mwiki/index.php?title=CaGrid Introduce Toolkit Wiki: http://www.cagrid.org/mwiki/index.php?title=Introduce caGrid Data Services Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services caBIG™ Architecture WS caGrid Web Page: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/
47
Lesson 8: Using caGrid Data Services
48
Outline Overview CQL Using caGrid Portal Using Generic Data Service Client CQL Examples using gridPIR Data Service
49
Executing a Data Service Query Query Results
50
caBIG Query Language (CQL)
CQLQuery: A simple wrapper element at the head of every CQL query document; it contains the Target.
Target: The Target element is of the type Object, and describes the data type which the query will return.
QueryModifier: An optional element modifying the returned result set. This modifier has a required attribute ‘countOnly’ and optionally allows for a choice of a list of Attribute Names or a single Distinct Attribute to return.
Object: Contains the required attribute ‘name’. This attribute’s value defines the caDSR class of the object. When the Object is the top level target of a CQL query, it identifies the data type that will be returned by the caGrid Data Service. The Object allows for a choice between three child elements: Attribute, Association, and Group. Objects may have at most one of these child elements.
Group: Groups also have an attribute ‘logicOperator’, an enumeration of the values “AND” and “OR”.
http://www.cagrid.org/mwiki/index.php?title=Data_Services:CQL
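To make the element structure concrete, the sketch below assembles a minimal CQL document with a Target that carries a single Attribute restriction, then parses it back to confirm it is well formed. Several details are assumptions and should be checked against the CQL schema: the namespace string (chosen by analogy with the result-set namespace used later in this deck), the XML attribute names `predicate` and `value`, and the class name `example.Gene`. The class `CqlDocumentSketch` is hypothetical and uses only the JDK XML parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class CqlDocumentSketch {
    // Assumed namespace; verify against the CQL schema before relying on it.
    static final String CQL_NS = "http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery";

    /** Builds a CQLQuery whose Target has exactly one Attribute child. */
    static String attributeQuery(String targetClass, String attribute,
                                 String predicate, String value) {
        return "<CQLQuery xmlns=\"" + CQL_NS + "\">"
             + "<Target name=\"" + escape(targetClass) + "\">"
             + "<Attribute name=\"" + escape(attribute) + "\""
             + " predicate=\"" + escape(predicate) + "\""
             + " value=\"" + escape(value) + "\"/>"
             + "</Target></CQLQuery>";
    }

    /** Escapes the characters that are unsafe inside a double-quoted XML attribute. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    /** Parses the document and returns its root element; fails if the XML is malformed. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // example.Gene is a placeholder for the model's fully qualified class name.
        System.out.println(attributeQuery("example.Gene", "name", "EQUAL_TO", "BRCA1"));
    }
}
```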
51
caGrid Portal http://cagrid-portal.nci.nih.gov/
52
caGrid Portal Discovery Data Service Query Portal Allows Discovery Exploration of: Domain Models Semantic Metadata Hosting Center Info Data Queries Example: Query on Gene Objects
53
caGrid Portal 1) Select “Edit Query Modifiers” 2) Select “Object” then “Update” 3) Update then “Add Criterion” 4) Select attribute “name” 5) Set “name” EQUAL_TO “BRCA1” 6) Update
54
caGrid Portal 7) Submit Query 8) When query is finished Select View Results 9) Query returns 11 Gene Objects and attribute values for each
55
caGrid Portal See Query and Results as XML
56
Generic Data Service Client Can be used for caGrid data services that are based on caCORE-like systems, since such services usually expose only the query method in their public API. For services that have methods other than query, the specific clients generated by the Introduce Toolkit need to be used. Typically a query involves three steps: Initialization; Creating and submitting a CQL query; Processing the results
57
Generic Data Service Client: Initializing the client
For gridPIR Data Service:
String serviceURL = "http://141.161.25.20:8080/wsrf/services/cagrid/GridPIR";
DataServiceClient client = new DataServiceClient(serviceURL);
58
Generic Data Service Client: Creating a CQL query
Option 1: Create CQLQuery object programmatically:
// CQL query to retrieve all BRCA1 genes
CQLQuery query = new CQLQuery();
// Object here is the CQL Object type, not java.lang.Object
Object target = new Object();
target.setName(Gene.class.getName());
Attribute nameAttribute = new Attribute("name", Predicate.EQUAL_TO, "BRCA1");
target.setAttribute(nameAttribute);
query.setTarget(target);
59
Generic Data Service Client: Creating a CQL query
Option 2: Load CQLQuery object from an XML string or file:
// from a string (cqlXml holds the CQL query document as XML text)
CQLQuery query2 = (CQLQuery) Utils.deserializeObject(new StringReader(cqlXml), CQLQuery.class);
// from a file
CQLQuery query3 = (CQLQuery) Utils.deserializeObject(new FileReader(cqlFile), CQLQuery.class);
60
Generic Data Service Client: Submitting the CQL Query
Results are returned as a CQLQueryResults object:
try {
    CQLQueryResults results = client.query(query);
} catch (QueryProcessingExceptionType ex) {
    // handle processing exception
} catch (MalformedQueryExceptionType ex) {
    // handle malformed query
} catch (RemoteException ex) {
    // handle remote exception
}
61
Generic Data Service Client: Processing the results
Option 1: Results can be iterated as single items using CQLQueryResultsIterator, a specialized implementation of the standard Java Iterator interface:
Iterator iter = new CQLQueryResultsIterator(results, GridPIRClient.class.getResourceAsStream("client-config.wsdd"));
while (iter.hasNext()) {
    Gene gene = (Gene) iter.next();
    System.out.println(gene.getName());
}
62
Generic Data Service Client: Processing the results
Option 2: Results can be serialized to a string (or to a file for future processing):
StringWriter w = new StringWriter();
Utils.serializeObject(results, new QName("http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLResultSet", "CQLResultSet"), w);
System.out.println(w.getBuffer());
63
Find Protein objects for Human Breast cancer 1 (BRCA1): CQL Example: Query involving three classes
64
Results :
65
Additional Information caGrid Data Services Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services caGrid Data Service Client API Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services:Client_API CQL Wiki: http://www.cagrid.org/mwiki/index.php?title=Data_Services:CQL
66
Lesson 9: Using caGrid Metadata Service APIs
67
Outline Overview caGrid Metadata Service APIs caDSR Service API EVS Service API Discovery API MetadataUtils
68
caGrid Data Service Metadata Overview - gridPIR UML Modeling Semantic Annotation using EVS concepts caDSR Load/CDE Creation EVS Concept Codes CDE Public ID/version UML Class/attribute CDE Long Name
69
Metadata Services on caGrid - caDSR
70
caDSR Grid Service API Client Provides access to information in the caDSR such as semantically annotated UML model as registered in caDSR Has capability to generate caGrid standard metadata instances such as domain models
71
caDSR Grid Service API Client - Operations Provides access to the UML-like view of caDSR registered items: findAllProjects findPackagesInProject findClassesInPackage findAttributesInClass Enables clients to generate caGrid standard Data Service metadata: generateDomainModelForProject generateDomainModelForPackage generateDomainModelForClasses generateDomainModelForClassesWithExcludes Provides clients the ability to augment a ServiceMetadata (standard caGrid service metadata) skeleton instance with the information extracted from caDSR annotateServiceMetadata
72
caDSR Grid Service API Client – Retrieving list of projects from caDSR
// caDSR service on production caGrid
String serviceURL = "http://cagrid-service.nci.nih.gov:8080/wsrf/services/cagrid/CaDSRService";
// create a CaDSRServiceClient instance
CaDSRServiceClient client = new CaDSRServiceClient(serviceURL);
// get list of projects from caDSR
Project[] projects = client.findAllProjects();
// process results
for (int i = 0; i < projects.length; i++) {
    Project p = projects[i];
    System.out.println(i + " Short name:" + p.getShortName() + " / Long name:" + p.getLongName());
}
73
caDSR Grid Service API Client – Retrieving list of projects from caDSR Result:
74
caDSR Grid Service API Client – Retrieving classes/attributes registered for a gridPIR from caDSR Result:
75
caDSR Grid Service API Client – Retrieving classes/attributes registered for gridPIR from caDSR
// Retrieve the list of classes for a registered model
UMLClassMetadata[] classArray = client.findClassesInProject(gridPIRProject);
// Retrieve the list of attributes for a class
UMLAttributeMetadata[] attributeArray = client.findAttributesInClass(gridPIRProject, classArray[i]);
// Retrieve the value domain for an attribute
ValueDomain valueDomain = client.findValueDomainForAttribute(gridPIRProject, attributeArray[j]);
76
Metadata Services on caGrid – Enterprise Vocabulary Services (EVS)
77
EVS Grid Service API Client Provides information on vocabularies and terms/ concepts presented by the NCI Metathesaurus/ Thesaurus
78
EVS Grid Service API Client - Methods Provides a list of programmatically accessible vocabularies: getVocabularyNames Provides access to concepts/terms from the vocabularies: searchDescLogicConcept Provides the complete history of concepts, that is, their evolution as they are created, merged, modified, split, or retired: getHistoryRecords Provides access to concepts that are supported by the NCI Metathesaurus: searchMetaThesaurus, searchSourceByCode
79
EVS Grid Service API Client – Retrieve list of vocabularies provided by EVS
// URL for production grid EVS Grid Service
String serviceURL = "http://cagrid-service.nci.nih.gov:8080/wsrf/services/cagrid/EVSGridService";
// create an EVSGridServiceClient instance
EVSGridServiceClient client = new EVSGridServiceClient(serviceURL);
// retrieve list of vocabularies the service provides
DescLogicConceptVocabularyName[] vocabularyNames = client.getVocabularyNames();
// list the names of vocabularies
for (int i = 0; i < vocabularyNames.length; i++) {
    System.out.println(i + ": " + vocabularyNames[i].getVocabularyName());
}
80
EVS Grid Service API Client - Retrieve list of vocabularies provided by EVS Result:
81
EVS Grid Service API Client – Retrieve EVS concept code for a term
// set the search criteria
EVSDescLogicConceptSearchParams evsSearchParams = new EVSDescLogicConceptSearchParams();
// search in NCI_Thesaurus
evsSearchParams.setVocabularyName("NCI_Thesaurus");
// search for the concept code of the term "protein"
evsSearchParams.setSearchTerm("protein");
// set maximum number of returned terms/concepts to 10
evsSearchParams.setLimit(10);
// run query against the EVS grid service
DescLogicConcept[] descLogicConcepts = client.searchDescLogicConcept(evsSearchParams);
// process results
82
EVS Grid Service API Client – Retrieve EVS concept code for a term Result:
83
Metadata Services on caGrid – Index Service
84
Discovery API Client Provides methods to query the Index Service; used to discover services of interest
85
Discovery API Client - Methods Searches based on service level metadata, e.g. discoverServicesByResearchCenter Searches based on semantic annotation, e.g. discoverDataServicesByModelConceptCode Searches based on operation metadata (for analytical services), e.g. discoverServicesByOperationName Searches based on information model metadata (for data services), e.g. discoverDataServicesByExposedClass Searches based on XPath, leveraging the Service Metadata XML: discoverByFilter (String xpathPredicate)
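The XPath-based search is easiest to picture on a small document. The sketch below applies a caller-supplied XPath predicate to a made-up registry of three services using the JDK's `javax.xml.xpath` API and returns the addresses of the matches. The XML layout, the element and attribute names, the hosts, and the class `XPathFilterSketch` are all hypothetical; the real Service Metadata XML and the way the Index Service evaluates the predicate are not reproduced here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathFilterSketch {
    // Hypothetical, simplified stand-in for registered service metadata.
    static final String REGISTRY =
          "<services>"
        + "<service address='http://host-a.example.org:8080/wsrf/services/cagrid/ServiceA'>"
        + "<center>Center A</center></service>"
        + "<service address='http://host-b.example.org:8080/wsrf/services/cagrid/ServiceB'>"
        + "<center>Center B</center></service>"
        + "<service address='http://host-c.example.org:8080/wsrf/services/cagrid/ServiceC'>"
        + "<center>Center A</center></service>"
        + "</services>";

    /** Returns the address of every service element that satisfies the predicate. */
    static List<String> addressesMatching(String xml, String predicate) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "/services/service[" + predicate + "]",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> addresses = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                addresses.add(((Element) nodes.item(i)).getAttribute("address"));
            }
            return addresses;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath predicate: " + predicate, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addressesMatching(REGISTRY, "center='Center A'"));
    }
}
```

A predicate gives the caller more freedom than the fixed discover methods, at the cost of needing to know the metadata document structure.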
86
Discovery API Client – Discovering services using a keyword
// Index service on production caGrid
String serviceURL = "http://cagrid-index.nci.nih.gov:8080/wsrf/services/DefaultIndexService";
// create a DiscoveryClient instance
DiscoveryClient client = new DiscoveryClient(serviceURL);
// discover services by keyword
EndpointReferenceType[] endPointReferenceArr = client.discoverServicesBySearchString("Protein");
// list URLs for returned services
for (int i = 0; i < endPointReferenceArr.length; i++) {
    System.out.println("Address: " + endPointReferenceArr[i].getAddress());
}
87
Discovery API Client – Discovering services using a keyword Result:
88
Metadata API - MetadataUtils Used to access and manipulate instances of service metadata Complements the Discovery API; Once a service is discovered MetadataUtils’s methods can be used to access and inspect the full metadata for the service
89
Metadata API – MetadataUtils Methods Retrieves the service metadata or domain model from the specified service. getServiceMetadata Writes/reads the XML representation of the service metadata to/from the specified writer/reader: serializeServiceMetadata deserializeServiceMetadata Writes/reads the XML representation of the domain model to/from the specified writer/reader: serializeDomainModel deserializeDomainModel
90
Metadata API – MetadataUtils - Example
// discover services by a concept code used to annotate the model, such as C17021 (Protein)
EndpointReferenceType[] endPointReferenceArr = client.discoverDataServicesByModelConceptCode("C17021");
for (int i = 0; i < endPointReferenceArr.length; i++) {
    // retrieve service metadata for the service
    ServiceMetadata serviceMetadata = MetadataUtils.getServiceMetadata(endPointReferenceArr[i]);
    // print hosting center information
    System.out.println("Hosting Center: " + serviceMetadata.getHostingResearchCenter().getResearchCenter().getDisplayName());
    // retrieve domain model for the service
    DomainModel domainModel = MetadataUtils.getDomainModel(endPointReferenceArr[i]);
    // print domain model name
    System.out.println("Domain Model/Project Short Name: " + domainModel.getProjectShortName());
    System.out.println("Address: " + endPointReferenceArr[i].getAddress());
}
91
Metadata API – MetadataUtils - Example Result:
92
Additional Information caGrid Wiki: http://www.cagrid.org/mwiki/index.php?title=CaGrid caBIG™ Architecture WS caGrid Web Page: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/
93
Session III: Deploying a Data Service on caGrid and using caGrid Service APIs
94
Questions
95
Closing Remarks
96
Additional Resources
caBIG™ web site: https://cabig.nci.nih.gov/
caBIG™ gForge site: https://gforge.nci.nih.gov/
caGrid Wiki: http://www.cagrid.org/
caBIG™ Learning Management System: http://ncicbtraining.nci.nih.gov/TP2005/tp2000web.dll/NCICBTraining
caCORE SDK: http://ncicb.nci.nih.gov/infrastructure/cacoresdk
caBIG™ Compatibility Guidelines: https://gforge.nci.nih.gov/docman/index.php?group_id=233&selected_doc_group_id=1138&language_id=1
Upcoming boot camps