LexEVS 101
Craig Stancl, Rick Kiefer
February 2010

LexGrid Model Overview
The LexGrid Model is Mayo’s proposed mechanism for standard storage of controlled vocabularies and ontologies. The model:
- Defines HOW vocabularies should be formatted and represented
- Is flexible enough to accurately represent a wide variety of vocabularies and other lexically based resources
- Defines several different server storage mechanisms and an XML format
- Provides the core representation for all data managed and retrieved through the LexEVS system
Once vocabulary information is represented in a standardized format, it becomes possible to build common repositories to store vocabulary content, and common programming interfaces and tools to access and manipulate that content. Terminologies from widely varying sources such as RRF, OWL, and OBO can be loaded into a single database management system and accessed with a single API.
The LexGrid model stands alone as a complete terminology model.

LexBIG Model Overview
LexBIG provides references into the data model without requiring resolution of a complete terminology node set or graph. As such, it functions as a kind of lazy-loading mechanism, similar to what can be found in Hibernate. Elements of LexBIG that are resolved in a minimal manner can often avoid database calls by referring to a Lucene index, which saves response time.

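The lazy-loading idea can be illustrated with a minimal sketch. It does not use the LexEVS or LexBIG classes: LazyReference, CountingStore, and the sample codes are hypothetical names invented for this example, and the "store" is an in-memory map standing in for a database. The point is only that holding a reference costs nothing until the details are actually requested, and that a resolved value is reused afterwards.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class LazyReferenceSketch {

    /** Hypothetical lightweight reference: holds a code, resolves details on demand. */
    static final class LazyReference {
        private final String code;
        private final Function<String, String> resolver;
        private String description;   // cached after the first resolution
        private boolean resolved;

        LazyReference(String code, Function<String, String> resolver) {
            this.code = code;
            this.resolver = resolver;
        }

        /** Returns the code without touching the backing store. */
        String getCode() {
            return code;
        }

        /** Resolves the description on first use, then serves it from the cache. */
        String getDescription() {
            if (!resolved) {
                description = resolver.apply(code);
                resolved = true;
            }
            return description;
        }
    }

    /** Simulated backing store (made-up data) that counts how often it is queried. */
    static final class CountingStore {
        private final Map<String, String> rows = new HashMap<>();
        int lookups = 0;

        CountingStore() {
            rows.put("C1", "Example concept one");
            rows.put("C2", "Example concept two");
        }

        String fetch(String code) {
            lookups++;
            return rows.get(code);
        }
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        LazyReference ref = new LazyReference("C1", store::fetch);

        System.out.println(ref.getCode() + " lookups=" + store.lookups);
        System.out.println(ref.getDescription() + " lookups=" + store.lookups);
        ref.getDescription(); // second call is answered from the cache
        System.out.println("lookups after second call=" + store.lookups);
    }
}
```

In this sketch the store is queried once, on the first call to getDescription(), and never for getCode().
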
LexGrid, LexBIG and LexEVS
- LexEVS: Optimizing query code that retrieves LexBIG objects.
- LexBIG: How the terminology service looks as objects returned to the user.
- LexGrid: How the terminology service looks in a database.
LexEVS uses the LexBIG model in conjunction with the LexGrid model.
[Diagram labels: LexGrid database; LexBIG objects; LexEVS API]

LexEVS Environment Architecture
LexEVS consists of the LexGrid Model & Storage, the LexEVS Java API, the LexEVS Distributed Service, and the LexEVS caGrid Service.
[Diagram layers: LexEVS caGrid Service; LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]

LexEVS API - Local/Direct
Now we will discuss the LexEVS API.
[Diagram layers: LexEVS Java API; LexGrid Model & Storage]

LexEVS API - Local/Direct
The direct/local API consists of LexEVS on a local system (LexEVS installed). The API uses JDBC to query the LexEVS database.
[Diagram labels: Direct; LexEVS on Local System (LexEVS Install); JDBC; Database Server]

LexEVS API - Local/Direct
The LexEVS Local Installation is the foundation of the LexEVS system as a whole. All of the other environments rely on it being available and configured properly. Some characteristics of the Local installation:
- Java based
- Installed via a GUI install program or the command line
- Some indexes (Lucene-based) are held on the local file system
- Includes the LexEVS GUI
- Includes a full set of administration scripts to maintain the server
- Optionally includes testing resources, source code, JavaDocs, and more

LexEVS API - Local/Direct
In a local environment, an application uses the Java API to access content in LexEVS.
[Architecture diagram labels: Client Application; Core API; Data Source; Java API; Distributed Java (RMI); LexEVS caCORE SDK APIs; LexEVS caGrid API; Web/Grid Service (SOAP/HTTP/REST); Model Objects; LexGrid MySQL DB; Lucene Index Files]

LexEVS API – Distributed / SDK
Now we will discuss the LexEVS Distributed API.
[Diagram layers: LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]

LexEVS API – Distributed / SDK
The distributed Java/SDK environment consists of a client system that uses RMI to communicate with a distributed LexEVS server where the LexEVS API is installed. The API uses JDBC to query the LexEVS database.
[Diagram labels: Direct; Distributed / SDK; Client System; LexEVS Client Proxy; RMI; Distributed LexEVS Server; LexBIG API Proxy; LexEVS Install; JDBC; Database Server]

LexEVS API – Distributed / SDK
The LexEVS Distributed Installation is actually two services combined as one application:
- Remote access (via Remote Method Invocation) to the local LexEVS API
- A caCORE SDK Service conforming to all of the caCORE Service Interfaces
The key feature of the Distributed environment is that it exposes the full LexEVS API to clients while centralizing the actual vocabulary content in one place. Users share a single set of loaded ontologies instead of maintaining multiple sets for multiple users, which reduces maintenance and increases usability.
The Distributed layer is also the first LexEVS environment to employ any type of ontology-based security. It uses security tokens to restrict licensed ontologies (e.g., MedDRA).

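As a rough illustration of token-based restriction, the sketch below gates access to a coding scheme on a token supplied by the client. This is not the LexEVS security implementation: TokenGateSketch, protect, canAccess, and the token value are all hypothetical, and the real service may check tokens quite differently. It only shows the general shape of the idea: unrestricted content is open to everyone, licensed content requires a matching token.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class TokenGateSketch {

    // Coding scheme name -> token a client must present (hypothetical structure).
    private final Map<String, String> requiredTokens = new HashMap<>();

    /** Marks a coding scheme as licensed, so it requires the given token. */
    void protect(String codingScheme, String token) {
        requiredTokens.put(codingScheme, token);
    }

    /** Returns true if the scheme is unrestricted or the client holds the right token. */
    boolean canAccess(String codingScheme, Map<String, String> clientTokens) {
        String required = requiredTokens.get(codingScheme);
        if (required == null) {
            return true; // no license restriction registered for this scheme
        }
        return required.equals(clientTokens.get(codingScheme));
    }

    public static void main(String[] args) {
        TokenGateSketch gate = new TokenGateSketch();
        gate.protect("MedDRA", "example-token"); // made-up token value

        Map<String, String> noTokens = Collections.emptyMap();
        Map<String, String> licensed = Collections.singletonMap("MedDRA", "example-token");

        System.out.println("NCI Thesaurus, no token: " + gate.canAccess("NCI Thesaurus", noTokens));
        System.out.println("MedDRA, no token: " + gate.canAccess("MedDRA", noTokens));
        System.out.println("MedDRA, with token: " + gate.canAccess("MedDRA", licensed));
    }
}
```
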
LexEVS API – Distributed / SDK
Remote access (via Remote Method Invocation) to the local LexEVS API: any method that can be called locally is also available remotely, with the exception of certain administration functionality, which is disabled for security purposes.

LexEVS API – Distributed / SDK
In a distributed environment, the client application uses either the Distributed Java API (RMI) to access content in LexEVS, or the caCORE SDK Services, which include REST, SOAP, and RMI interfaces for QBE, HQL, Hibernate Detached Criteria, SDK CQL, and caGrid CQL.

LexEVS API – Distributed / SDK
A caCORE SDK Service conforming to all of the caCORE Service Interfaces. This includes:
- REST-ful service
- caCORE-SDK SOAP Web Service
- Query By Example (QBE) Java RMI Interfaces
- A Web-based interface to the REST-ful service

LexEVS API – caGrid Service
Now we will discuss the LexEVS caGrid Service.
[Diagram layers: LexEVS caGrid Service; LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]

LexEVS API – caGrid Service
The caGrid Service consists of a client system, a caGrid Host Server, a Distributed LexEVS Server, and a Database Server.
[Diagram labels: Direct; Distributed / SDK; Grid; Client System; LexEVS Client Proxy; caGrid Host Server; LexEVS Proxy; Distributed LexEVS Server; LexBIG API Proxy; LexEVS Install; Database Server; TCP; RMI; JDBC]

LexEVS API – caGrid Service
LexEVS has two deployed caGrid Services, one Analytical Service and one Data Service. Both are available and discoverable through the caGrid Portal/Index Service.
- Analytical Service: Exposes the LexEVS API in much the same way as the Local and Distributed environments do, except as a Grid Service. A user may again use familiar interfaces (LexBIGService, CodedNodeSet, CodedNodeGraph, etc.) to interact with the Analytical Grid Service.
- Data Service: Exposes the LexGrid model as a caGrid Data Service. As with any standard caGrid Data Service, CQL queries are used to query the data source.

LexEVS API – caGrid Service
In a grid services environment, the client application calls the grid service interfaces, which in turn call the distributed Java API to access content in LexEVS.

Choosing an Environment
LexEVS Environments: Which one to use?
Choosing the right environment for your needs is important. Each environment adds complexity and maintenance to the system, and each added environment also adds performance overhead.
- Local: Best performance, easiest installation. Use this when performance is critical and there is no need to directly expose the LexEVS API to other users.
- Distributed: Use this to directly expose the LexEVS API to multiple users while sharing only one set of loaded ontologies. The RMI overhead decreases performance slightly compared with the Local environment. This environment is also required if caCORE SDK-like functionality is needed.
- Grid: The most complex to set up. Use this if users need a functioning caGrid node. It adds another layer of overhead, so performance is impacted the most in this environment.

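The guidance above can be condensed into a small decision helper. This is only a sketch of the slide's reasoning, not part of LexEVS: the class, enum, and parameter names are hypothetical, and real deployments will weigh more factors than these three yes/no questions.

```java
public class EnvironmentChooserSketch {

    /** The three environments described on the slide. */
    enum Environment { LOCAL, DISTRIBUTED, GRID }

    /**
     * Picks the simplest environment that satisfies the stated needs.
     * Each step up adds setup complexity and runtime overhead, so the
     * checks go from the most demanding requirement to the least.
     */
    static Environment choose(boolean needsCaGridNode,
                              boolean sharesApiWithOtherUsers,
                              boolean needsCaCoreSdkServices) {
        if (needsCaGridNode) {
            return Environment.GRID;
        }
        if (sharesApiWithOtherUsers || needsCaCoreSdkServices) {
            return Environment.DISTRIBUTED;
        }
        return Environment.LOCAL;
    }

    public static void main(String[] args) {
        System.out.println(choose(false, false, false));
        System.out.println(choose(false, true, false));
        System.out.println(choose(true, true, true));
    }
}
```
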
Services Overview
The LexEVS Service is designed to run standalone or as part of a larger network of services. It is composed of four primary subsystems:
- Service Manager
- Service Metadata
- Query Service
- Extensions

Services: Service Manager
The service manager provides a centralized access point for administrative functions, including write and update access for a service's content. For example, the service manager allows new coding schemes to be validated and loaded, existing coding schemes to be retired and removed, and the status of coding schemes to be updated.

Services: Metadata Service
The Service Metadata provides external clients with information about the vocabulary content (e.g., NCI Thesaurus) and appropriate licensing information.

Services: Query Operations
The Query Operations provide numerous functions for querying and traversing vocabulary content. The Query Service is composed of:
- Lexical Operations
- Graph Operations
- Metadata Operations
- History Operations

Query Service: Lexical Set Operations
Lexical Set Operations provide methods to return lists or iterators of coded entries. Supported query criteria include the application of match/filter algorithms, sorting algorithms, and property restrictions. Support is also provided to resolve the union, intersection, or difference of two node sets.

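To make the three set operations concrete, here is a minimal sketch that treats a node set as a plain set of concept codes. It deliberately avoids the LexEVS classes: NodeSetOpsSketch, its method names, and the sample codes are assumptions made for illustration, and a real node set would carry far more than a code string.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class NodeSetOpsSketch {

    /** Codes present in either set. */
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.addAll(b);
        return out;
    }

    /** Codes present in both sets. */
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.retainAll(b);
        return out;
    }

    /** Codes present in the first set but not in the second. */
    static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.removeAll(b);
        return out;
    }

    /** Convenience builder for a node set of made-up codes. */
    static Set<String> setOf(String... codes) {
        return new LinkedHashSet<>(Arrays.asList(codes));
    }

    public static void main(String[] args) {
        Set<String> first = setOf("C1", "C2");
        Set<String> second = setOf("C2", "C3");

        System.out.println("union:        " + union(first, second));
        System.out.println("intersection: " + intersection(first, second));
        System.out.println("difference:   " + difference(first, second));
    }
}
```

Each operation copies its first argument, so the input sets are left unchanged and results can be combined further.
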
Query Service: Graph Operations
Graph Operations support the subsetting of concepts according to relationship and distance, identification of relation source and target concepts, and graph traversal. Additional operations include enumeration and traversal of concepts by relation, walking of directed acyclic graphs (DAGs), enumeration of source and target concepts for a relation, and enumeration of relations for a concept.

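Two of these operations, subsetting by distance and enumerating source concepts, can be sketched over a simple adjacency map. This is not the LexEVS graph API: GraphSubsetSketch, targetsWithin, sourcesOf, and the sample graph are hypothetical, and the sketch assumes a single relation whose edges run from source concept to target concept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GraphSubsetSketch {

    /** Breadth-first walk: concepts reachable from start in at most maxDistance hops. */
    static Set<String> targetsWithin(Map<String, List<String>> edges, String start, int maxDistance) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        List<String> frontier = Collections.singletonList(start);
        for (int depth = 0; depth < maxDistance && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String node : frontier) {
                for (String target : edges.getOrDefault(node, Collections.<String>emptyList())) {
                    if (visited.add(target)) {
                        next.add(target); // first time seen, expand it on the next level
                    }
                }
            }
            frontier = next;
        }
        visited.remove(start); // report only the concepts reached, not the starting point
        return visited;
    }

    /** Concepts that have a direct edge pointing at the given target. */
    static List<String> sourcesOf(Map<String, List<String>> edges, String target) {
        List<String> sources = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : edges.entrySet()) {
            if (entry.getValue().contains(target)) {
                sources.add(entry.getKey());
            }
        }
        return sources;
    }

    /** Made-up graph: A -> B, A -> C, B -> D, D -> E. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> edges = new LinkedHashMap<>();
        edges.put("A", Arrays.asList("B", "C"));
        edges.put("B", Arrays.asList("D"));
        edges.put("D", Arrays.asList("E"));
        return edges;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = sampleGraph();
        System.out.println("within 1 hop of A:  " + targetsWithin(graph, "A", 1));
        System.out.println("within 2 hops of A: " + targetsWithin(graph, "A", 2));
        System.out.println("sources of D:       " + sourcesOf(graph, "D"));
    }
}
```

The visited set also guards against revisiting a concept, so the walk terminates even if the input is not a DAG.
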
Query Service: Metadata Operations
Metadata Operations allow for the query and resolution of registered code system metadata according to specified coding scheme references, property names, or values.

Query Service: History
History provides vocabulary-specific information about concept insertions, modifications, splits, merges, and retirements when supplied by the content provider.

Services: Extensions
The Extensions component provides a mechanism to extend specific service functions, such as Loaders, or to re-wrap specific query operations into convenience methods.

For More Information…
- Vocabulary Knowledge Center Wiki
- Vocabulary Knowledge Center Forums
- Vocabulary Knowledge Center