Second Order Semantic Enrichment and the role of Wikis for Professionals
Peter Bram 't Hoen, Ellen Sterrenburg, Herman van Haagen, Allessandro Botelho-Bovo, Judith Boer, Johan den Dunnen, Gert Jan van Ommen, Erik van Mulligen, Martijn Schuemie, Rob Jelier, Antoine Velthoven, Christina Hettne, Jan Kors, Johan van der Lei, Christine Chichester, Erik van Mulligen, Marc Weeber, Kevin Kalupsen, Reuben Christie, Jacintha van Beemen, Nickolas Barris, Albert Mons, Gerard Meijssen, Erik Moeller, Peter Jan Roes, Karsten Uil, Siebrand Mazeland, Sabine Cretella, Barend Mons
CD40 ligand and tumor necrosis factor alpha, the cells acquire a mature phenotype of dendritic cells that is characterized by up-regulation of human leukocyte antigen (CD80, CD86, CD40 and CD54 and appearance of CD83. These ...
The Consortium Open Access Semantic Support Technology For on-line Knowledge Tracking, Discovery and Management
WikiProfessional Semantic Web workspaces for scientists enabling real time knowledge exchange and exploration
The Million Minds Approach: Why?
Many challenges in current biomedical research
– Volume of data (both high-throughput and text)
– Complexity
– Distributed systems and databases
– Incompatible data formats
– Multi-disciplinarity
– Multi-linguality
– Ambiguity of terminology
– Inability to share knowledge
– Globalization of knowledge
Too much to read: major trends foreseen
– From Reading to Consulting
– From Reading to Meta-Analysis
– From Texts to Facts
– To Central AND COMMUNITY Annotation
Repetition of facts is of great value for the readability of individual papers, but the fact itself is a single unit of information, and needs no repetition.
The Million Minds Approach
– A defining characteristic of wiki technology is the ease with which pages can be created and updated. Generally, there is no review before modifications are accepted.
Websites such as www.dmd.nl are increasingly cited in the literature (personal communication, Johan den Dunnen).
The majority of (SP) proteins have more than one associated research group
So... can we use wikis for this?
Contextual annotation of web pages for interactive browsing. van Mulligen E, Diwersy M, Schijvenaars B, Weeber M, van der Eijk CC, Jelier R, Schuemie M, Kors J, Mons B. Medinfo 2004; 11:94-8.
Which gene did you mean? Mons B. BMC Bioinformatics 2005 Jun 7; 6:142.
First order semantic enrichment
The Knowlet
2nd order S.E.
What does a Knowlet look like under the hood?
Relationship sources:
– Database facts (multiple attributes)
– Community annotations (WikiProf)
– Co-occurrence (sentence)
– Co-occurrence (abstract)
– Concept profile match
– Sequence similarity (BLAST score; genes and proteins only)
– Co-expression (with genes from expression databases)
Diagram levels: Knowlet building block; Knowlet of core concept; Knowlet space
Relationship types: factual, associative, co-occurrence
– Rules to combine different sources of information into a single relationship
– Time-stamped information
– The relationship to the original texts or database entries
The Knowlet
A Knowlet represents a unit of thought interconnected with other units of thought, or in other words: a cloud of concepts that have one or more relationship types with the central (selected) concept.
The interconnection reflects a semantic relationship derived:
– From facts in a database
– From co-occurrence in a text
– From other associations
Relations have a strength:
– Based on the source of the relationship
– Based on the amount of "evidence"
Knowlets belong to one or more semantic classes: proteins, diseases, authors, organizations, journals, experiments, etc.
Each Knowlet is uniquely identified by a URL or URI (Uniform Resource Identifier).
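The description above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the `urn:example:` identifiers, the per-source weights, and the saturating strength formula are all hypothetical, since the slides say only that strength depends on the source of a relationship and on the amount of evidence.

```python
from dataclasses import dataclass, field
from datetime import date
import math

# Hypothetical weights per relationship type (not taken from the slides).
SOURCE_WEIGHT = {"factual": 1.0, "co-occurrence": 0.5, "associative": 0.25}


@dataclass
class Evidence:
    kind: str      # "factual", "co-occurrence" or "associative"
    source: str    # pointer back to the original text or database entry
    seen_on: date  # time stamp of the observation


@dataclass
class Relation:
    target_uri: str
    evidence: list = field(default_factory=list)

    def strength(self) -> float:
        # Assumed rule: count the evidence per type, let each count
        # saturate via 1 - exp(-n), then add the weighted contributions.
        counts = {}
        for ev in self.evidence:
            counts[ev.kind] = counts.get(ev.kind, 0) + 1
        return sum(SOURCE_WEIGHT[k] * (1.0 - math.exp(-n))
                   for k, n in counts.items())


@dataclass
class Knowlet:
    uri: str                # unique identifier of the core concept
    semantic_classes: set   # e.g. {"disease"}
    relations: dict = field(default_factory=dict)

    def add_evidence(self, target_uri: str, ev: Evidence) -> None:
        rel = self.relations.setdefault(target_uri, Relation(target_uri))
        rel.evidence.append(ev)

    def ranked(self) -> list:
        # Related concepts, strongest relationship first.
        return sorted(self.relations.values(),
                      key=lambda r: r.strength(), reverse=True)


# Illustrative usage with made-up identifiers and sources.
malaria = Knowlet("urn:example:malaria", {"disease"})
malaria.add_evidence("urn:example:chloroquine",
                     Evidence("factual", "db:example-entry-1", date(2008, 1, 1)))
malaria.add_evidence("urn:example:mosquitoes",
                     Evidence("co-occurrence", "abstract:example-42", date(2008, 1, 1)))
```

Keeping every piece of evidence, rather than only a running score, preserves the time stamp and the link to the original text or database entry, so the strength can be recomputed if the combination rule changes.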
3. Building an association matrix of large data sources
Figure labels: person; organisation; Object 1: gene; Object 2: disease; Object 3: drug; 1 Million
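One way to picture the association matrix is as a sparse table of co-occurrence counts. The sketch below is an assumption-laden illustration, not the pipeline used by the authors: it takes each document as a list of already recognised concept names (hypothetical input) and counts how many documents mention each pair.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_matrix(documents):
    """Count, for each unordered concept pair, the documents that mention both."""
    counts = Counter()
    for concepts in documents:
        # Deduplicate within a document and sort so (a, b) has one canonical order.
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


def association(counts, a, b):
    """Look up a pair regardless of argument order; unseen pairs count as 0."""
    return counts[tuple(sorted((a, b)))]


# Illustrative usage with made-up documents.
docs = [
    ["malaria", "chloroquine"],
    ["malaria", "chloroquine", "mosquitoes"],
    ["mosquitoes", "malaria"],
]
matrix = cooccurrence_matrix(docs)
```

Only non-zero pairs are stored. If the "1 Million" label on the slide refers to the number of concepts (an assumption), a dense matrix of that size would be impractical, which is why a sparse mapping is used here.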
Assignment of protein function and discovery of new nucleolar proteins based on automatic analysis of MEDLINE. Martijn Schuemie, Christine Chichester, Frederique Lisaceck, Yohann Coute, Peter-Jan Roes, Jean Charles Sanchez, Barend Mons. Special issue on Systems Biology in Proteomics, 2008 (accepted for publication).
Figure labels: SRP, PARN
Cluster studies on basis of Homologene IDs
Kappa-based clustering based on Gene ID (GeneSet Clusterer, Rob Jelier, Erasmus MC)
Cluster 1: mdx mice; dysferlin-deficient mice
Cluster 2: myositis
Cluster 3: DMD
Cluster 4: EOM-specific genes in mdx
Cluster 5: Development of EOM muscle and rat atrophy
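The slide says only that the clustering is "kappa-based" on gene IDs. As a hedged illustration, the sketch below assumes Cohen's kappa computed over gene membership in two gene lists; whether the GeneSet Clusterer uses exactly this statistic is not stated in the source, and the function name and integer gene IDs are hypothetical.

```python
def kappa(set_a, set_b, universe):
    """Cohen's kappa for agreement of two gene lists over a shared gene universe."""
    universe = set(universe)
    set_a = set(set_a) & universe
    set_b = set(set_b) & universe
    n = len(universe)
    both = len(set_a & set_b)      # genes in both lists
    only_a = len(set_a - set_b)    # genes only in list A
    only_b = len(set_b - set_a)    # genes only in list B
    neither = n - both - only_a - only_b
    observed = (both + neither) / n
    expected = ((both + only_a) * (both + only_b)
                + (only_b + neither) * (only_a + neither)) / (n * n)
    if expected == 1.0:
        # Degenerate case (both lists empty or both equal to the universe):
        # kappa is undefined, so 0.0 is reported by convention here.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Illustrative usage with made-up gene IDs.
all_genes = range(1, 11)
identical = kappa({1, 2, 3, 4}, {1, 2, 3, 4}, all_genes)
disjoint = kappa({1, 2}, {3, 4}, all_genes)
```

A pairwise kappa matrix over all studies could then be handed to any standard clustering routine. Kappa corrects the raw overlap for the agreement expected by chance, which matters when gene lists differ greatly in length.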
Clustering of genes based on similarity of concept profiles (GeneSet Clusterer, Rob Jelier, Erasmus MC)
Cluster 1: atrophy and myopathy
Cluster 2: extraocular muscle of mdx
Cluster 3: human and mouse muscular dystrophies and myositis
Cluster 4: long gene lists
Cluster 5: muscle differentiation; Ky-mutant and Fxr-/- mice
Cluster 6: ageing and sarcopenia
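To illustrate what "similarity of concept profiles" could mean in practice, the sketch below treats a concept profile as a sparse mapping from concept name to weight and compares two profiles with cosine similarity. The choice of cosine, the weights, and the gene names are assumptions for illustration; the slides do not name the measure used.

```python
import math


def cosine(profile_p, profile_q):
    """Cosine similarity between two sparse concept profiles (dict: concept -> weight)."""
    dot = sum(w * profile_q[c] for c, w in profile_p.items() if c in profile_q)
    norm_p = math.sqrt(sum(w * w for w in profile_p.values()))
    norm_q = math.sqrt(sum(w * w for w in profile_q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an empty profile is similar to nothing
    return dot / (norm_p * norm_q)


# Illustrative usage: two hypothetical genes with different IDs
# but overlapping associated concepts.
gene_x = {"muscle atrophy": 1.0, "myopathy": 1.0}
gene_y = {"myopathy": 1.0}
gene_z = {"sarcopenia": 1.0}
```

Because the comparison runs over associated concepts rather than identifiers, two gene sets that share no gene ID can still score as similar.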
Evaluate biological processes that bring studies together (DatasetComparer, Rob Jelier, Erasmus MC)
– No overlap on GeneID level
– Many associations on concept profile level
Annotate
OmegaWiki (terminology system)
– Wiki Authors
– Wiki Medical/Clinical
– Wiki Proteins
– Wiki Chemicals
– Wiki Etc.
Allow for:
– Community Annotation
– Quick growth of terminology systems
– Semantic Linking between concepts
Knowlet; Association Matrix; Meta-analysis; Expert Challenge; WikiZ/P; Expert comments; Peer to Peer Review; Final Approval; U.W. Fingerprint Update; Literature; Protein A
New publications or annotations; Solid (a); Liquid (b); Gas (c); 1st order Semantic enrichment; Reduction False Positives; Discussion; Voting in Wiki; Meta-analysis; Proximity measures; Proposals to Databases?; Central Annotation
REGISTRATION (1X); Unique Author ID; Address PHP/userpage; People Knowlets; Unique concept ID; Language variants; Homonyms; Definitions (brief); Object Knowlets; Science Wikis; UID from WiktionaryZ; Research information; Talk-page; Liquid Threads; Object Knowlets; UID from WiktionaryZ; Articles about UIDs; Encyclopaedic/NPOV; Anonymous allowed
Dr. Johan den Dunnen; Wiki-Authors; OMIM; NPOV; DMD (Hs); MEI; Wiki-Proteins; DMD (Hs); AOI
Nature News, February 15, 2007
What do we do?
Next generation semantic support technology, or BEYOND Text Mining
– Semantic Exploration: navigating between the various Wikis for Professionals and exploring the constructed Knowledge Space View
– Knowledge Discovery: finding new associations between (medical and biological) entities
– Newness Alerting: monitoring all semantic connections regarding a particular topic and providing alerts when significant changes occur
– Science Management: communicating with other scientists, maintaining and organizing your personal library, supporting project management, etc.
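Newness Alerting, as described above, amounts to comparing the semantic connections of a topic at two points in time. The sketch below is a minimal illustration under assumptions: each snapshot is a hypothetical mapping from related concept to relationship strength, and the threshold for a "significant" change is an arbitrary placeholder.

```python
def newness_alerts(old, new, threshold=0.2):
    """Compare two snapshots (dict: concept -> strength) and list notable changes."""
    alerts = []
    for concept in sorted(set(old) | set(new)):
        before = old.get(concept, 0.0)
        after = new.get(concept, 0.0)
        if concept not in old:
            alerts.append((concept, "new", after))
        elif concept not in new:
            alerts.append((concept, "removed", after))
        elif abs(after - before) >= threshold:
            alerts.append((concept, "changed", after))
    return alerts


# Illustrative usage with made-up strengths for the topic "malaria".
last_month = {"chloroquine": 0.9, "primaquine": 0.5}
this_month = {"chloroquine": 0.95, "primaquine": 0.8, "new drug": 0.3}
alerts = newness_alerts(last_month, this_month)
```

In this example the small shift for chloroquine stays below the threshold and raises no alert, while the newly connected concept and the larger shift for primaquine are both reported.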
The Knowlet Visualization
Step 2: select concepts of interest
Core concept: Malaria (mean distance 5)
Semantic distance to Malaria shown for: chloroquine, primaquine, Para-amino-benzoic acid, New Drug ????, Cellular Membrane (GO), Mosquitoes, Plasmodium chabaudi
Relationship labels: F+, C+, A+; F-, C+, A+; F-, C-, A+
Core concept: Malaria (mean distance 5). Concepts shown: chloroquine, primaquine, New Drug ????, Mosquitoes, Para-amino-benzoic acid, Cellular Membrane (GO)