Applying the Semantic Web at UCHSC - Center for Computational Pharmacology
Ian Wilson

Projects with semantics at UCHSC-CCP
- Integrated Neuroscience Initiative on Alcoholism (INIA) Analysis Suite
  - Ongoing project to support the INIA consortium (30+ geographically dispersed universities)
  - First release at the Neuroscience 2004 conference – microarrays only at the moment
- NLP Enrichment Opportunities
- Recent funding September 2004 – NLM
- Data integration framework for life science knowledge-bases

INIA SemWeb Opportunities
- Conversion of LISP-CM based signal transduction knowledge-base to OWL
- Application framework to link the semantics of our data – e.g. MAGE, fMRI, etc.
- Exploring ‘scientific workflow’ tools to enable composition of semantically annotated web services – easy UI for the investigator
  - myGrid project – also presenting at the conference
- Issues with the granularity of semantics

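The idea of composing semantically annotated services can be illustrated with a minimal sketch. It assumes each service declares one input concept and one output concept, and that two services can be chained only when those concepts match exactly. The `Service` class, the service names, and the `ex:` concept names are all hypothetical; none of them comes from myGrid or from the INIA suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """A web service annotated with the concepts it consumes and produces."""
    name: str
    consumes: str  # concept identifier of the input (hypothetical)
    produces: str  # concept identifier of the output (hypothetical)


def can_chain(first, second):
    """Two services compose if the first's output concept is the second's input."""
    return first.produces == second.consumes


def validate_workflow(services):
    """Return the (upstream, downstream) name pairs whose annotations do not match."""
    problems = []
    for upstream, downstream in zip(services, services[1:]):
        if not can_chain(upstream, downstream):
            problems.append((upstream.name, downstream.name))
    return problems


# Hypothetical services and concept names, for illustration only.
normalize = Service("normalize", "ex:RawMicroarray", "ex:NormalizedExpression")
cluster = Service("cluster", "ex:NormalizedExpression", "ex:GeneClusters")
annotate = Service("annotate", "ex:GeneList", "ex:AnnotatedGenes")

print(validate_workflow([normalize, cluster]))
print(validate_workflow([normalize, cluster, annotate]))
```

The exact-match rule is deliberately naive. Whether `ex:GeneClusters` should be accepted where `ex:GeneList` is expected is one instance of the granularity issue noted on the slide; an ontology-backed check would likely test subsumption rather than string equality.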
NLP Enrichment Opportunities
- Using Direct Memory Access Parsing (DMAP) – ‘conceptual parsing’ supported by ontologies
- Developing Protege plug-in to support NLP annotations – gold standard development
- Text sources
  - Entrez GeneRIFs
    - 255 character summary of gene function derived from PubMed
  - Gene Ontology Definitions

Data Integration Framework
- Creating RDF wrappers for several bioinformatics data sources
  - Using NCBI, GO, UniProt, etc. as test cases
- Alignment of several bio-ontologies – extending when appropriate
- Investigating/benchmarking several triple stores and browsers
  - Kowari, Jena, Sesame
  - Lightweight JSP, Longwell
- Mappings are not always straightforward

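As a rough illustration of what an RDF wrapper does, the sketch below turns one flat record into N-Triples lines using only the Python standard library. The `http://example.org/bio/` namespace, the property names, and the record itself are made up for the example; they are not the project's actual vocabulary or data, and a real wrapper would map fields onto terms from the aligned ontologies.

```python
# Hypothetical base namespace, for illustration only.
BASE = "http://example.org/bio/"


def escape_literal(value):
    """Escape a string so it can be written as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def record_to_ntriples(record_id, fields):
    """Wrap one flat record (a dict) as a list of N-Triples lines.

    Each field becomes one triple: the record is the subject, the field
    name becomes the predicate, and the field value becomes a plain literal.
    """
    subject = f"<{BASE}gene/{record_id}>"
    lines = []
    for key, value in sorted(fields.items()):
        predicate = f"<{BASE}prop/{key}>"
        lines.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return lines


# A made-up record; the identifier and values are placeholders.
triples = record_to_ntriples(
    "GENE0001",
    {"symbol": "EXAMPLE1", "organism": "Homo sapiens"},
)
for line in triples:
    print(line)
```

Treating every value as a plain literal is the simplest possible mapping. It also shows why mappings are not always straightforward: a value such as an organism name would arguably be better represented as a link to a shared resource than as a string.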
Why integrate?
- Current architecture is not maintainable
  - Web tiered databases
    - Data models in flux
    - Web client interfaces in flux
    - Everyone has a different client interface and data model design
  - CLI tools
- 500+ services/databases and growing
- Cutting and pasting
  - Large number of steps
  - Frequently repeated – info now rapidly added to public databases
  - Don’t always get results

Semantic Web Concerns
- Inference
  - Modeling default reasoning and negation in OWL?
  - Reification is not sufficient for context
    - Quads
    - Named graphs
  - Description logics (DLs) are good for certain tasks, but
    - Need other logics in the life sciences
    - Closed world reasoning – e.g. rules
- Scalability
  - Need to constrain search in RDF space

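The quad and named-graph idea can be sketched in a few lines: each statement carries a fourth element naming the graph it belongs to, so context such as provenance attaches to the graph rather than to reified statements. The sketch below is an in-memory toy, not any particular triple store's API, and every identifier in it (`gene:G1`, `ex:regulates`, the graph names) is hypothetical.

```python
from collections import namedtuple

# A quad is a triple plus the name of the graph (context) that contains it.
Quad = namedtuple("Quad", ["subject", "predicate", "object", "graph"])

# Hypothetical statements, for illustration only.
store = [
    Quad("gene:G1", "ex:regulates", "gene:G2", "graph:curated"),
    Quad("gene:G1", "ex:regulates", "gene:G3", "graph:text-mined"),
    # Context is stated once about the graph, not once per statement.
    Quad("graph:text-mined", "ex:confidence", "low", "graph:provenance"),
]


def match(quads, subject=None, predicate=None, obj=None, graph=None):
    """Return quads matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj, graph)
    return [
        q for q in quads
        if all(p is None or p == v for p, v in zip(pattern, q))
    ]


# Restricting the graph constrains the search to one context.
curated = match(store, predicate="ex:regulates", graph="graph:curated")
everything = match(store, subject="gene:G1")
print(curated)
print(len(everything))
```

Fixing the graph in a query is also one simple way to constrain search in RDF space: only statements from the chosen context are considered.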
Conclusion
- Always looking for collaborators
- Questions?