LexBIG/LexGrid Services for LexBIG 2.3 Model and API for the Grid
Objectives Provide a context and overview of LexBIG. Follow a path of execution through the vital model portions of LexBIG. Demonstrate the relationship between the separate service and data models of LexBIG/LexGrid. Mention the particular problems posed in model compliance. Provide a brief presentation of code samples. Describe the analytical service in the context of the Grid service implementation.
What are LexGrid and LexBIG? LexGrid = Common information model capable of representing multiple vocabulary resources, and common foundation for vocabulary tools and services. LexBIG = caBIG API based on the LexGrid model, delivered as part of NCI EVS version 4.x. Used as the basis for the Distributed LexBIG API, and provides infrastructure to implement legacy EVS services based on EVS model 3.x. LexEVS = Externalization of the LexGrid model and LexBIG API as the next generation of EVS interfaces. Introduced as Distributed API and grid-level services in EVS version 4.x, and completing the transition to caCORE SDK-generated interfaces (RESTful API, SOAP services, etc.) in EVS 5.0.
Conceptual Overview The LexGrid Data Model provides a base for terminology service data loads. The basic service layer is LexBIG, which sits on top of data loaded according to the LexGrid model. The EVS caCORE APIs form the next interface layer, and grid services are built on top of them. Each layer above the data model layer can be accessed via public interfaces. (Diagram layers: Grid Services; EVS caCORE APIs; LexBIG Java API; LexGrid Model & Storage; Browsers and Applications)
The Service Layer The service layer comprises three main service classes: the LexBIG service, the LexBIG Metadata Service, and the Service convenience methods. The main LexBIG service class provides the main entry point into the terminology service and allows the user to access nodes and edges from the graph-like hierarchy of terms within a terminology. The Metadata service provides users access to separately stored metadata for a given coding scheme. The convenience methods provide a more user-friendly interface over the common service access methods available for node sets and graphs.
The Service Layer A LexBIGService Interface provides references to a terminology service, the vocabularies in those services, concepts contained in the vocabularies, and the relationships that exist between these concepts. When a user “gets” a set of Coded Nodes from the service the nodes are returned as a reference to all the possible concept nodes in the service. No actual node content is returned. getCodingSchemeConcepts(CodingSchemeIdentification, CodingSchemeVersionOrTag) : CodedNodeSet
Query Reference Interfaces A Coded Node Set is a reference point: a dynamic list of query options built by the user to customize query results. It contains no actual coded nodes. It provides set manipulations such as a union (with another coding scheme). Restrictions can narrow results to a more usable result set. When it is finally resolved to a ResolvedConceptReferenceList or Iterator, it provides useful results in the form of data objects. Coded Node Graphs contain both nodes and the connecting edges defined by any hierarchy of relationships between concepts in a given terminology.
Query Reference Interfaces CodedNodeSet methods are a prelude to the resolution of a query
Query Reference Interfaces A restriction method such as restrictToStatus, when called on CodedNodeSet, starts a list of pending query operations.
Query Reference Interfaces Restrictions can be added to the list of pending query operations until a result set is outlined that is most specific to a user’s needs.
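The deferred behaviour described on these slides can be illustrated with a minimal sketch. It is not the LexBIG implementation: PendingQuerySketch, Concept, LazyNodeSet, and restrictToNameContaining are hypothetical stand-ins, the restrictToStatus signature is simplified, and a small in-memory list with made-up sample data replaces the terminology store. Each restriction only appends a predicate to a pending list; nothing is evaluated until resolve() is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

// Hypothetical stand-in types; these are NOT the real LexBIG classes.
class PendingQuerySketch {

    static class Concept {
        final String code;
        final String name;
        final String status;
        Concept(String code, String name, String status) {
            this.code = code;
            this.name = name;
            this.status = status;
        }
    }

    // Holds a reference to the store plus a list of pending restrictions.
    static class LazyNodeSet {
        private final List<Concept> store;
        private final List<Predicate<Concept>> pending = new ArrayList<>();

        LazyNodeSet(List<Concept> store) { this.store = store; }

        // Appends a restriction; no concept is examined yet.
        LazyNodeSet restrictToStatus(String status) {
            pending.add(c -> c.status.equals(status));
            return this;
        }

        // Assumed helper for illustration: case-insensitive substring match.
        LazyNodeSet restrictToNameContaining(String text) {
            String needle = text.toLowerCase(Locale.ROOT);
            pending.add(c -> c.name.toLowerCase(Locale.ROOT).contains(needle));
            return this;
        }

        int pendingCount() { return pending.size(); }

        // Only here are the pending restrictions applied to the store.
        List<Concept> resolve() {
            List<Concept> out = new ArrayList<>();
            for (Concept c : store) {
                boolean keep = true;
                for (Predicate<Concept> p : pending) {
                    if (!p.test(c)) { keep = false; break; }
                }
                if (keep) out.add(c);
            }
            return out;
        }
    }

    // Made-up sample data for illustration only.
    static List<Concept> sampleStore() {
        return Arrays.asList(
            new Concept("C1", "Blood", "active"),
            new Concept("C2", "Blood Vessel", "retired"),
            new Concept("C3", "Bone", "active"));
    }

    public static void main(String[] args) {
        LazyNodeSet cns = new LazyNodeSet(sampleStore())
            .restrictToStatus("active")
            .restrictToNameContaining("blood");
        for (Concept c : cns.resolve()) {
            System.out.println(c.code + " " + c.name);
        }
    }
}
```

Returning the node set from each restriction keeps the calls chainable, which matches the way the slides describe building up a query before resolving it.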
Query Reference Interfaces A policy parameter applied at list resolution provides users with one last opportunity to tailor results to their needs.
The Connecting Classes A vertical slice from the service model into the data model. Once a list of ResolvedConceptReferences is resolved from a CodedNodeSet, the user has access to those elements defined first in the service layer…
The Connecting Classes … but with references into the data representation layer
The Data Model The model is designed to provide a universal container for ontologies and terminologies. The central class is a “concept”, which complementary classes allow to function as a graph node.
Data Model Relationships The concept-containing nodes can be linked as a graph with model relationships. The data model of relationships provides for an Association, which contains a reference to a container of all the concepts that are source concepts in this association. This container, in turn, has a reference to all the concepts that are targets of these source concepts.
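The containment described above (association, then sources, then targets) can be sketched as a small data structure. This is an assumption-laden illustration, not the generated LexGrid model: the class names Association, Source, and Target, the edges helper, and the sample codes are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins that mirror the shape described on the slide;
// these are NOT the generated LexGrid model classes.
class DataModelRelationsSketch {

    // A target concept of one source within an association.
    static class Target {
        final String conceptCode;
        Target(String conceptCode) { this.conceptCode = conceptCode; }
    }

    // A source concept together with the container of its targets.
    static class Source {
        final String conceptCode;
        final List<Target> targets = new ArrayList<>();
        Source(String conceptCode, String... targetCodes) {
            this.conceptCode = conceptCode;
            for (String t : targetCodes) targets.add(new Target(t));
        }
    }

    // The association owns the container of source concepts.
    static class Association {
        final String name;
        final List<Source> sources = new ArrayList<>();
        Association(String name, Source... sources) {
            this.name = name;
            this.sources.addAll(Arrays.asList(sources));
        }
    }

    // Walks association -> sources -> targets and lists every edge.
    static List<String> edges(Association a) {
        List<String> out = new ArrayList<>();
        for (Source s : a.sources) {
            for (Target t : s.targets) {
                out.add(s.conceptCode + " -" + a.name + "-> " + t.conceptCode);
            }
        }
        return out;
    }

    // Made-up sample data for illustration only.
    static Association sample() {
        return new Association("hasSubtype",
            new Source("A", "B", "C"),
            new Source("B", "D"));
    }

    public static void main(String[] args) {
        for (String e : edges(sample())) System.out.println(e);
    }
}
```

In this shape the edges are reachable only by starting from the association, which is the main contrast with the service model.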
Service Model Relationships The service model maintains relationships somewhat differently. A resolved concept reference has two references to containers of associations. One set contains the associations in which the concept reference is the source of a relationship with other concepts. The other set contains the associations in which it is the target.
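A minimal sketch of the service-side shape, under the assumption that each concept reference carries its own two association containers. ConceptRef, AssociationRef, and collect are hypothetical names chosen for illustration and the data is made up; the sketch only shows that both directions can be walked starting from the node itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; these are NOT the real LexBIG service classes.
class ServiceModelRelationsSketch {

    // One named association with the codes of the concepts at its far end.
    static class AssociationRef {
        final String name;
        final List<String> conceptCodes;
        AssociationRef(String name, String... conceptCodes) {
            this.name = name;
            this.conceptCodes = Arrays.asList(conceptCodes);
        }
    }

    // A resolved concept reference with its two association containers.
    static class ConceptRef {
        final String code;
        final List<AssociationRef> sourceOf = new ArrayList<>();
        final List<AssociationRef> targetOf = new ArrayList<>();
        ConceptRef(String code) { this.code = code; }
    }

    // Flattens one container into the list of associated concept codes.
    static List<String> collect(List<AssociationRef> associations) {
        List<String> out = new ArrayList<>();
        for (AssociationRef a : associations) out.addAll(a.conceptCodes);
        return out;
    }

    // Made-up sample: B is the source of an edge to D and the target of an edge from A.
    static ConceptRef sample() {
        ConceptRef b = new ConceptRef("B");
        b.sourceOf.add(new AssociationRef("hasSubtype", "D"));
        b.targetOf.add(new AssociationRef("hasSubtype", "A"));
        return b;
    }

    public static void main(String[] args) {
        ConceptRef b = sample();
        System.out.println(b.code + " is source of edges to " + collect(b.sourceOf));
        System.out.println(b.code + " is target of edges from " + collect(b.targetOf));
    }
}
```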
Parameter and Return Objects In order to provide some semantic relevance to input and output objects, many Java objects were wrapped with “policy” objects.
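One way to picture such wrapping is a pair of small value classes around plain strings. The sketch below borrows the parameter names shown elsewhere in these slides, but the class bodies, the factory methods ofVersion and ofTag, and the describe helper are assumptions made for illustration and do not reproduce the actual grid-level classes. The sample values are illustrative.

```java
// Simplified, hypothetical wrappers; NOT the actual LexBIG or grid classes.
class PolicyWrapperSketch {

    // Wraps the plain string that names a coding scheme.
    static final class CodingSchemeIdentification {
        final String name;
        CodingSchemeIdentification(String name) {
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("coding scheme name is required");
            }
            this.name = name;
        }
    }

    // Wraps either a version string or a tag string, never both.
    static final class CodingSchemeVersionOrTag {
        final String version;
        final String tag;
        private CodingSchemeVersionOrTag(String version, String tag) {
            this.version = version;
            this.tag = tag;
        }
        static CodingSchemeVersionOrTag ofVersion(String version) {
            return new CodingSchemeVersionOrTag(version, null);
        }
        static CodingSchemeVersionOrTag ofTag(String tag) {
            return new CodingSchemeVersionOrTag(null, tag);
        }
    }

    // The signature states what each argument means, unlike (String, String).
    static String describe(CodingSchemeIdentification csi, CodingSchemeVersionOrTag csvt) {
        String qualifier = csvt.version != null ? "version " + csvt.version : "tag " + csvt.tag;
        return csi.name + " (" + qualifier + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(
            new CodingSchemeIdentification("NCI Thesaurus"),
            CodingSchemeVersionOrTag.ofVersion("08.03d")));
    }
}
```

With wrappers like these, swapping the two arguments is a compile-time error rather than a silent mistake, which is one plausible reading of the "semantic relevance" the slide refers to.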
Model Compliance for an Existing Model Multiple schemas and diagrams for portions of a data model and its service extension made for unusual complexity. Interface API layers required wrappers for Java objects to ensure semantic relevance. The existing model needed to be reviewed and corrected for classes that would not be exposed by the API.
Model Compliance continued Minor adjustments were required in the UML definitions of attributes. Some association multiplicities required adjustment to allow proper representation in the UML browser. Long definitions had to be shortened and/or given special tags before they could be applied to class and attribute definitions.
Exercising the LexBIG API Query optimizing through use of “Restriction” methods. Lucene Query Syntax adds text searching tools, maximizing text search capabilities. Additional tools supplement this functionality (for stemmed, double metaphone, contains, and regular expression capable searches).
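To make the difference between search styles concrete, here is a plain-Java sketch that approximates three of them with standard string and regular expression operations. It does not use Lucene and does not reproduce the LexBIG search extensions: the algorithm names passed as strings, the matches and search helpers, and the sample designations are all assumptions for illustration. Stemmed and double metaphone matching are omitted because they need a text analysis library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Illustrative approximation only; no Lucene and no LexBIG classes are used.
class MatchAlgorithmSketch {

    // Returns true when the designation satisfies the criteria under the named style.
    static boolean matches(String algorithm, String criteria, String designation) {
        String d = designation.toLowerCase(Locale.ROOT);
        String c = criteria.toLowerCase(Locale.ROOT);
        switch (algorithm) {
            case "startsWith":
                return d.startsWith(c);
            case "contains":
                return d.contains(c);
            case "regExp":
                // The whole designation must match the expression.
                return Pattern.compile(criteria, Pattern.CASE_INSENSITIVE)
                              .matcher(designation).matches();
            default:
                throw new IllegalArgumentException("Unknown algorithm: " + algorithm);
        }
    }

    // Filters a list of designations with the chosen style.
    static List<String> search(String algorithm, String criteria, List<String> designations) {
        List<String> out = new ArrayList<>();
        for (String d : designations) {
            if (matches(algorithm, criteria, d)) out.add(d);
        }
        return out;
    }

    // Made-up sample designations for illustration only.
    static List<String> sample() {
        return Arrays.asList("Blood", "Blood Vessel", "Whole Blood", "Bone");
    }

    public static void main(String[] args) {
        System.out.println(search("contains", "blood", sample()));
        System.out.println(search("startsWith", "blood", sample()));
        System.out.println(search("regExp", "b.*d", sample()));
    }
}
```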
LexGrid Query Optimizing Get a coded node reference from the LexBIG Grid Service. Restrict or enhance the possible returned values before resolving the code reference (a set or graph). Resolve the code reference when a useful set has been built using the query structure.
Getting a Coded Node Set Get a coded node set from the terminology service, then restrict the set to reasonable boundaries. CodedNodeSet cns = lbs_.getCodingSchemeConcepts(CodingSchemeIdentification csi, CodingSchemeVersionOrTag csvt); Example values: “NCI Thesaurus”, “08.03d”. cns = cns.restrictToMatchingDesignations(MatchCriteria, SearchDesignationOption, ExtensionIdentification, InternationalDesignation); Example values: “Blood”, PREFERRED_ONLY, Lucene Query, en.
Resolving the List allows the user to further narrow the query, restrict the content of the objects returned, and arbitrarily restrict the size of the returned list. This resolve method accepts a policy object, which contains a group of parameter objects that define limits and restrict content and size.
Resolve the List (finally make the query) This representation of a method call is an example of possible input values for the parameters at the grid service level. The string values are wrapped in semantically significant objects. The sort option is “code”, meaning results are sorted on the code value of the concept. “CONCEPT_NAME” is the property name filter. The filter option is null and not in effect. The property type filter is “PRESENTATION”. The maximum to return is 100. The Boolean option for fully resolving a set of nodes is set to true, ensuring that all values associated with a concept node are returned rather than a summary. ResolvedConceptReferenceList rcl = cns.resolveToList(SetResolutionPolicy); SetResolutionPolicy: SortOptionList:SortOptions = “code”; LocalNameList:propertyNames = “CONCEPT_NAME”; LocalNameList:filterOptions = null; PropertyType:propertyTypeOption = “PRESENTATION”; Integer:MaximumToReturn = 100; Boolean:ResolveConcepts = true.
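The effect of a resolution policy can be sketched with a simplified stand-in. The class below is hypothetical and carries only three of the six fields named on the slide (sort option, maximum to return, and the resolve-concepts flag); the property name, filter, and property type options are omitted for brevity. The sample concepts and the resolveToList helper are assumptions for illustration, not the grid service implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical, reduced stand-in for the policy object; NOT the real class.
class ResolutionPolicySketch {

    static class Concept {
        final String code;
        final String name;
        Concept(String code, String name) { this.code = code; this.name = name; }
    }

    static class SetResolutionPolicy {
        final String sortOption;       // "code" sorts on the concept code; null keeps input order
        final int maximumToReturn;     // upper bound on the size of the returned list
        final boolean resolveConcepts; // true returns full detail, false a summary (code only)
        SetResolutionPolicy(String sortOption, int maximumToReturn, boolean resolveConcepts) {
            this.sortOption = sortOption;
            this.maximumToReturn = maximumToReturn;
            this.resolveConcepts = resolveConcepts;
        }
    }

    // Applies sorting, the size limit, and the level of detail in that order.
    static List<String> resolveToList(List<Concept> matches, SetResolutionPolicy policy) {
        List<Concept> sorted = new ArrayList<>(matches);
        if ("code".equals(policy.sortOption)) {
            sorted.sort(Comparator.comparing((Concept c) -> c.code));
        }
        List<String> out = new ArrayList<>();
        for (Concept c : sorted) {
            if (out.size() >= policy.maximumToReturn) break;
            out.add(policy.resolveConcepts ? c.code + ": " + c.name : c.code);
        }
        return out;
    }

    // Made-up sample data, deliberately out of code order.
    static List<Concept> sampleMatches() {
        return Arrays.asList(
            new Concept("C3", "Bone"),
            new Concept("C1", "Blood"),
            new Concept("C2", "Blood Vessel"));
    }

    public static void main(String[] args) {
        SetResolutionPolicy policy = new SetResolutionPolicy("code", 100, true);
        System.out.println(resolveToList(sampleMatches(), policy));
    }
}
```

Bundling the options into one object keeps the resolve call to a single parameter, which suits a grid interface where each argument must be a described, serializable type.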
Results This graphical representation of a result set shows a list of returned values, a portion of the associated properties for a single value, and the concept’s place in a given hierarchy. This is as it appears in a prototype application.
Lucene Queries and Other Text Searching Mechanisms of the LexBIG API. Matching Designation restrictions may use Lucene Query Syntax in the text value carried by MatchingCriteria. Some of the details of the syntax are posted here. sersyntax.html
LexEVS Grid Services Analytical Service. Uses the existing LexEVS Castor-generated Data Model. Stateful. Extensive use of Service Contexts and Resources.
Implementation (Model) Because LexEVS has an existing Castor-generated model, we needed to annotate and submit this model for Silver Level Compliance. Introduce was configured to use the existing model Castor Beans. New Serializers/Deserializers needed to be built.
Implementation (API) Services were created to mirror the existing LexEVS API
Implementation (API) cont. Grid services operate on a multi-tiered, chained client-server basis. Invocations of a service call can be propagated across these chains and are nearly identical to local Java invocations. (Diagrams: Implementation Flow; Sample Implementation)
Service Contexts and State LexEVS Grid Services need the ability to make stateful calls to the server –Example: Create a query on the server, add restrictions and limits with subsequent calls, and finally execute the query and retrieve the results. How did we implement this in caGrid?
Some Terms Resource: A stateful container, created by the server hosting the caGrid Service, used to hold objects used by Services and Service Contexts. Service Contexts: Additional Grid Services that are acquired through the main service. Not meant to be called directly. Main Service: The set of Grid Services we want to directly expose to the user.
Main Service The main access point for the Grid Services
Accessing Service Contexts Notice the Grid Service ‘getCodingSchemeConcepts’. This service returns a ‘CodedNodeSetReference’, which is a reference to a set of CodedNodeSet operations, or the CodedNodeSet Service Context.
Accessing Service Contexts (cont.) With this Reference, the user can call any of the CodedNodeSet Service Context Grid Services
Why Service Contexts? Service Contexts allow us to maintain state on the server. This is important because many of the LexBIG API calls that we wish to expose via Grid Services use state (for example, restricting a query before a resolve). They also allow us to be consistent with the LexBIG API. The LexBIG API allows the user to place restrictions on a query before asking the database for the actual content. The Grid Services reflect this pattern.
Putting it all together How Service Contexts and Resources are used. (Diagram: Client, Grid Services, Distributed LexBIG)
Putting it all together Step 1: The client calls ‘getCodingSchemeConcepts’ on the Main Service.
Putting it all together Step 2: The Grid Service receives the ‘getCodingSchemeConcepts’ call and forwards it to Distributed LexBIG.
Putting it all together Step 3: Distributed LexBIG returns the requested CodedNodeSet object to the Grid Services server.
Putting it all together Step 4: The Grid Services server now creates a stateful “Resource” to hold this CodedNodeSet object.
Putting it all together Step 5: The Grid Services server then returns to the client a CodedNodeSetReference to the created Resource. This CodedNodeSetReference is simply an object containing URL, port, and connection information for the Service Context and Resource.
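The five steps can be sketched as a server that keeps resources in a map and hands out lightweight references. Everything here is a hypothetical stand-in rather than caGrid or Introduce code: GridServer, CodedNodeSetResource, the resource key format, and the example URL are assumptions, and the forwarding to Distributed LexBIG (Steps 2 and 3) is reduced to creating a local object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; NOT the caGrid, Introduce, or LexEVS grid classes.
class ServiceContextSketch {

    // What the client holds: addressing information only, no query state.
    static class CodedNodeSetReference {
        final String serviceUrl;
        final String resourceKey;
        CodedNodeSetReference(String serviceUrl, String resourceKey) {
            this.serviceUrl = serviceUrl;
            this.resourceKey = resourceKey;
        }
    }

    // Server-side state standing in for the held CodedNodeSet object.
    static class CodedNodeSetResource {
        final String codingScheme;
        final List<String> restrictions = new ArrayList<>();
        CodedNodeSetResource(String codingScheme) { this.codingScheme = codingScheme; }
    }

    static class GridServer {
        private final String url;
        private final Map<String, CodedNodeSetResource> resources = new HashMap<>();
        private int nextKey = 1;

        GridServer(String url) { this.url = url; }

        // Steps 1 to 5: create the resource and return a reference to it.
        CodedNodeSetReference getCodingSchemeConcepts(String codingScheme) {
            String key = "cns-" + nextKey++;
            resources.put(key, new CodedNodeSetResource(codingScheme));
            return new CodedNodeSetReference(url, key);
        }

        // Service Context call: the reference selects which resource to update.
        void restrict(CodedNodeSetReference ref, String restriction) {
            lookup(ref).restrictions.add(restriction);
        }

        String describe(CodedNodeSetReference ref) {
            CodedNodeSetResource r = lookup(ref);
            return r.codingScheme + " " + r.restrictions;
        }

        private CodedNodeSetResource lookup(CodedNodeSetReference ref) {
            CodedNodeSetResource r = resources.get(ref.resourceKey);
            if (r == null) throw new IllegalStateException("No resource for " + ref.resourceKey);
            return r;
        }
    }

    public static void main(String[] args) {
        GridServer server = new GridServer("https://example.org/grid"); // placeholder URL
        CodedNodeSetReference ref = server.getCodingSchemeConcepts("NCI Thesaurus");
        server.restrict(ref, "status=active");
        server.restrict(ref, "text=blood");
        System.out.println(server.describe(ref));
    }
}
```

Because the state lives in the server's map, two clients that each call getCodingSchemeConcepts get independent resources, and each later call only needs to carry the reference.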
Using the Service Context The client now makes calls through the CodedNodeSet Service Context. These calls operate on the CodedNodeSet Resource held by the Grid Services: add restrictions, union, intersect, etc.
Problems Encountered Exposing an existing API as Grid Services. Loading and annotating an existing Data Model. Primitives in method calls and return values.