Phase II Additions to LSG
Search capability added to Gene Browser
–Through GUI in Gene Browser
BLAST plugin that invokes remote EBI BLAST service
Working set manager
–State retention between sessions
Provenance viewer
–Displays annotated provenance
–Accepts RDF representations
Knowledge Discovery through Provenance Collection, Representation, and Use in the Life Science Grid (LSG)
Phase II Final Report: Architectural and Technical Details
Beth Plale
Director, Center for Data and Search Informatics
Indiana University
Key contribution to LSG proper (provenance aside): introduction of state
[Architecture diagram. Labels shown: LSG Space; user; Entrez; Gene Ontology; Gene Browser; BLAST; Working Set Manager; RDF Viewer; Provenance Graphs; Lilly CAB Bus; Lilly to Karma Events Reflector; Karma Events Bus (WS+LSG); Karma Framework; Provenance DB; Events Capture; OPM* RDF Interface; Karma Services; S-OGSA Service Semantic Binding; Annotated Provenance Graph; Resources (public + private); Proxy; SAWSDL Registry; Service Annotations; Services and data ontology (myGrid); Karma structure ontology.]
*Open Provenance Model v1.01
BLAST support
Working Set: support for user state
Demo I: Phase II Use Case
Select “gene” from the database list; the list is shown in Gene Browser
Submit gene to NCBI
Open the BLAST plugin tab and download the FASTA sequence
Run BLAST
Add results to Working Set
Annotate Working Set
The Working Set Manager listens to the CAB bus. It takes an Entrez ID and/or all or part of a BLAST result as input to a working set, or imports a CSV file into a working set. Working set WS2 was generated from working set WS1 by Delete Rows. WS2 can be exported as a CSV file.
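The derivation of WS2 from WS1 and the CSV export can be sketched roughly as below. This is only an illustration: the `WorkingSet` class, its field names, the column names, and the sample rows are all hypothetical and are not taken from the actual Working Set Manager.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkingSet:
    """Hypothetical model: a named table of rows plus a note on how it was derived."""
    name: str
    columns: list
    rows: list = field(default_factory=list)
    derived_from: Optional[str] = None  # name of the parent working set, if any
    operation: Optional[str] = None     # operation that produced this set

    def delete_rows(self, new_name, predicate):
        """Return a new working set without the rows for which predicate is true."""
        kept = [row for row in self.rows if not predicate(row)]
        return WorkingSet(new_name, list(self.columns), kept,
                          derived_from=self.name, operation="Delete Rows")

    def to_csv(self):
        """Serialize the working set as CSV text with a header row."""
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=self.columns,
                                lineterminator="\n")
        writer.writeheader()
        writer.writerows(self.rows)
        return buffer.getvalue()


# Made-up sample rows, for illustration only.
ws1 = WorkingSet("WS1", ["entrez_id", "blast_hit", "e_value"], [
    {"entrez_id": "672", "blast_hit": "hit_A", "e_value": 1e-50},
    {"entrez_id": "672", "blast_hit": "hit_B", "e_value": 0.5},
])

# WS2 is WS1 with the weak hits deleted; WS1 itself is left unchanged.
ws2 = ws1.delete_rows("WS2", lambda row: row["e_value"] > 1e-5)
print(ws2.derived_from, ws2.operation)
print(ws2.to_csv())
```

Recording `derived_from` and `operation` on the new set, rather than mutating WS1 in place, is one simple way to keep the lineage between working sets available for later provenance queries.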
Demo II: Query provenance database
Query BLAST-related data.

Query 1: get the latest Blast_Plugin process.

    create or replace view v1 as
      select process_id, service_id, process_initialization_time
      from process
      where service_id like '%Blast_Plugin'
        and process_initialization_time =
            (select max(process_initialization_time)
             from process
             where service_id like '%Blast_Plugin');
    select * from v1;

Query 2: get the service (Blast_Ebi_Web_Service) invoked by Blast_Plugin.

    select invoker_id, invokee_id, p.service_id
    from invocation, process p, v1
    where invoker_id = v1.process_id
      and invokee_id = p.process_id;
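Queries 1 and 2 can be tried out against a throwaway in-memory database, as in the sketch below. The table layout is inferred only from the column names used in the queries, the column types, `urn:example:` service identifiers, and timestamps are invented, and SQLite stands in for whatever database the provenance store actually uses. SQLite has no `create or replace view`, so the sketch drops and recreates the view instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal schema, inferred from the columns the queries reference.
    CREATE TABLE process (
        process_id INTEGER PRIMARY KEY,
        service_id TEXT,
        process_initialization_time TEXT
    );
    CREATE TABLE invocation (invoker_id INTEGER, invokee_id INTEGER);

    -- Invented sample data: two plugin runs, each invoking the web service once.
    INSERT INTO process VALUES
        (1,  'urn:example:Blast_Plugin',          '2009-01-01 10:00:00'),
        (2,  'urn:example:Blast_Ebi_Web_Service', '2009-01-01 10:00:01'),
        (10, 'urn:example:Blast_Plugin',          '2009-01-02 09:00:00'),
        (11, 'urn:example:Blast_Ebi_Web_Service', '2009-01-02 09:00:01');
    INSERT INTO invocation VALUES (1, 2), (10, 11);

    -- Query 1: view holding the most recently initialized Blast_Plugin process.
    DROP VIEW IF EXISTS v1;
    CREATE VIEW v1 AS
        SELECT process_id, service_id, process_initialization_time
        FROM process
        WHERE service_id LIKE '%Blast_Plugin'
          AND process_initialization_time = (
              SELECT MAX(process_initialization_time)
              FROM process
              WHERE service_id LIKE '%Blast_Plugin');
""")

latest = conn.execute("SELECT * FROM v1").fetchall()

# Query 2: services invoked by the latest Blast_Plugin process.
invoked = conn.execute("""
    SELECT invoker_id, invokee_id, p.service_id
    FROM invocation, process p, v1
    WHERE invoker_id = v1.process_id
      AND invokee_id = p.process_id
""").fetchall()

print(latest)
print(invoked)
```

With this sample data the view selects process 10, and Query 2 returns the single invocation from process 10 to process 11.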
Demo II: Query provenance database (continued)

Query 3: get input to Blast_Ebi_Web_Service.

    select p.service_id, artifact_id, artifact_value
    from artifact_used au, artifact a, process p, v1
    where au.artifact_no = a.artifact_no
      and au.process_id = p.process_id
      and p.process_id = v1.process_id + 1
      and p.service_id like '%Blast_Ebi_Web_Service';

Query 4: get output from Blast_Ebi_Web_Service.

    select p.service_id, artifact_id, artifact_value
    from artifact_generated ag, artifact a, process p, v1
    where ag.artifact_no = a.artifact_no
      and ag.process_id = p.process_id
      and p.process_id = v1.process_id + 2
      and p.service_id like '%Blast_Ebi_Web_Service';
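Queries 3 and 4 locate the web service processes by adding 1 and 2 to the plugin's process_id, which presumes that process IDs are assigned consecutively within one run. A possible variant, sketched below, follows the invocation table instead. This is an alternative under stated assumptions, not the query used in the demo: the schema is inferred from the column names in the queries, the helper `artifacts` is hypothetical, and all sample values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal schema, inferred from the columns the queries reference.
    CREATE TABLE process (process_id INTEGER PRIMARY KEY, service_id TEXT);
    CREATE TABLE invocation (invoker_id INTEGER, invokee_id INTEGER);
    CREATE TABLE artifact (artifact_no INTEGER PRIMARY KEY,
                           artifact_id TEXT, artifact_value TEXT);
    CREATE TABLE artifact_used (artifact_no INTEGER, process_id INTEGER);
    CREATE TABLE artifact_generated (artifact_no INTEGER, process_id INTEGER);

    -- Placeholder data: plugin process 10 invokes web service processes 11 and 12.
    INSERT INTO process VALUES
        (10, 'urn:example:Blast_Plugin'),
        (11, 'urn:example:Blast_Ebi_Web_Service'),
        (12, 'urn:example:Blast_Ebi_Web_Service');
    INSERT INTO invocation VALUES (10, 11), (10, 12);
    INSERT INTO artifact VALUES
        (1, 'input-1',  '>query placeholder FASTA'),
        (2, 'output-1', 'placeholder BLAST report');
    INSERT INTO artifact_used VALUES (1, 11);
    INSERT INTO artifact_generated VALUES (2, 12);
""")


def artifacts(conn, link_table, plugin_process_id):
    """Artifacts linked (via link_table) to web service processes that the
    given plugin process invoked."""
    # The table name cannot be a bound parameter, so only two known names
    # are accepted before it is formatted into the statement.
    if link_table not in ("artifact_used", "artifact_generated"):
        raise ValueError(link_table)
    sql = f"""
        SELECT p.service_id, a.artifact_id, a.artifact_value
        FROM invocation i
        JOIN process p ON p.process_id = i.invokee_id
        JOIN {link_table} l ON l.process_id = p.process_id
        JOIN artifact a ON a.artifact_no = l.artifact_no
        WHERE i.invoker_id = ?
          AND p.service_id LIKE '%Blast_Ebi_Web_Service'
    """
    return conn.execute(sql, (plugin_process_id,)).fetchall()


inputs = artifacts(conn, "artifact_used", 10)        # counterpart of Query 3
outputs = artifacts(conn, "artifact_generated", 10)  # counterpart of Query 4
print(inputs)
print(outputs)
```

In practice the plugin's process_id would come from the view v1 rather than being passed as a literal; it is hard-coded here only to keep the sketch short.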
Suggested Future Work
Next steps:
–Engagement of users, or
–Expanded functionality set
Build in development time if we must do this ourselves
Represent combined visualization and process provenance to the user
Write research-quality paper
–Requires user study, or comparison, or …
Formally integrate provenance collection tools into non-public LSG
Suggested Future Work: Technical
Support BLAST in asynchronous mode
Extend NCBI Entrez to work on other NCBI databases
Design rich provenance queries
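One common shape for asynchronous BLAST support is submit, poll, then fetch, so that the plugin is not blocked while a job runs. The sketch below shows only that control flow. `StubBlastService` is an in-process stand-in with a hypothetical interface; it is not the EBI BLAST service API, and the job IDs and reports are placeholders.

```python
import asyncio
import itertools


class StubBlastService:
    """In-process stand-in for a remote BLAST service (hypothetical interface)."""

    def __init__(self, polls_until_done=2):
        self._ids = itertools.count(1)
        self._remaining = {}  # job_id -> status polls left before the job is done
        self._polls_until_done = polls_until_done

    async def submit(self, sequence):
        """Register a job and return its ID immediately."""
        job_id = f"job-{next(self._ids)}"
        self._remaining[job_id] = self._polls_until_done
        return job_id

    async def status(self, job_id):
        """Report RUNNING until the job has been polled enough times."""
        self._remaining[job_id] -= 1
        return "DONE" if self._remaining[job_id] <= 0 else "RUNNING"

    async def result(self, job_id):
        return f"placeholder report for {job_id}"


async def run_blast(service, sequence, poll_interval=0.01):
    """Submit a sequence, poll until the job finishes, then fetch the report."""
    job_id = await service.submit(sequence)
    while await service.status(job_id) != "DONE":
        await asyncio.sleep(poll_interval)  # yields control to other jobs
    return await service.result(job_id)


async def main():
    service = StubBlastService()
    # Two searches proceed concurrently instead of one after the other.
    return await asyncio.gather(run_blast(service, "ACGT"),
                                run_blast(service, "TTGA"))


reports = asyncio.run(main())
print(reports)
```

Because each job is identified by an ID returned at submission, the same ID could also serve as a key for recording the submit and fetch steps as separate provenance events.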