On our way to Information Overload?
Or to prevent it by appropriate use of technology?
Consolidated knowledge
Collexis Fingerprints (CFPs)
People: medical researchers around the world
Activities in electronic text, such as projects, publications, Medline abstracts...

Multilingual Thesaurus Indexer
Matches keywords, translates them to identical numbers and ranks them by their relevance
English: Disease:#12674 Malaria:#24530 Hospital:#19994
French: Maladie:#12674 Paludisme:#24530 Hôpital:#19994
Spanish: Enfermedad:#12674 Paludismo:#24530 Hospital:#

The Common Language
Each activity is represented as a set of keyword numbers ranked by their relevance
#4256 : 1.0 #3627 : 0.8 #19994 : 0.5 #28746 : 0.3 #32874 : 0.1
#14325 : 1.0 #3627 : 0.8 #19994 : 0.5 #28746 : 0.3 #32874 : 0.1
#85643 : 1.0 #3627 : 0.8 #19994 : 0.5 #28746 : 0.3 #32874 : 0.1
#17345 : 1.0 #3627 : 0.8 #19994 : 0.5 #28746 : 0.3 #32874 : 0.1
#1c8456 : 0.1 #00356 : 0.1

„Collexion“ of activities
You: #17345 : 1.0 #3627 : 0.8 #19994 : 0.5 #28746 : 0.3 #32874 : 0.1
Your activity as text: submitted and indexed to keyword numbers
Find similar activities and the people behind them
Cross-language networking
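The indexing and matching steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the Collexis implementation: the thesaurus entries are the ones shown on the slide, while the function names `fingerprint` and `similarity`, the frequency-based relevance weighting and the use of cosine similarity are all assumptions made for the example.

```python
from collections import Counter
import math
import re

# Thesaurus entries copied from the slide; the lower-cased lookup is an assumption.
THESAURUS = {
    "disease": 12674, "maladie": 12674, "enfermedad": 12674,
    "malaria": 24530, "paludisme": 24530, "paludismo": 24530,
    "hospital": 19994, "hôpital": 19994,
}

def fingerprint(text):
    """Map known keywords to concept numbers, weighted relative to the most frequent one."""
    words = re.findall(r"\w+", text.lower())
    counts = Counter(THESAURUS[w] for w in words if w in THESAURUS)
    if not counts:
        return {}
    top = max(counts.values())
    # Assumed relevance measure: frequency divided by the highest frequency.
    return {concept: n / top for concept, n in counts.items()}

def similarity(fp_a, fp_b):
    """Cosine similarity between two fingerprints (an assumed matching measure)."""
    dot = sum(w * fp_b.get(c, 0.0) for c, w in fp_a.items())
    norm_a = math.sqrt(sum(w * w for w in fp_a.values()))
    norm_b = math.sqrt(sum(w * w for w in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

english = fingerprint("Malaria is a disease; the hospital treats malaria.")
french = fingerprint("Le paludisme est une maladie; l'hôpital traite le paludisme.")
print(english)
print(similarity(english, french))
```

Because both sentences are reduced to the same concept numbers before comparison, the match does not depend on the language of the original text, which is the point of the "common language" on the slide.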
The early evolution of fingerprint manipulation
Content fingerprints; add people fingerprints; add organization fingerprints
Jobs; CVs, Skills; Articles, books; s, Word; RFPs
BIOSEMANTICS
“Cellese”: the language that cells use to communicate internally and externally.
The Molecular Language and its biological MEANING
The Group
–Jan Kors PhD
–Erik van Mulligen PhD
–Bob Schijvenaars PhD
–Marc Weeber PhD
–Christiaan v.d. Eyck MSc
–Rob Jelier PhD
–Barend Mons PhD
–Johan van der Lei PhD
A consortium to combine state-of-the-art information and knowledge mining technologies
To support:
–Thesaurus and ontology enrichment
–Disambiguation of concepts
–Semantic meta-analysis of massive information
To enable:
–Information-based discovery
–Evidence-based policy making
Thesaurus and Ontology Enrichment
New concepts; Synonyms; Homonyms; Genes, Proteins; Pictures
Diagram labels (steps numbered 1 to 4 in the original figure): Free text; Fingerprints (known concepts); NLP; Unexplained text (XML); Potential concepts; Validation; FUA
Thesauri: MeSH, HUGO, SwissProt, SAGE, others
Partners and organisations named: E-BioSci, EMBO, Elsevier, TNO, LUMC, HUGONC, Genebio, AMC, EUR, UVA, SERENDIP
Too much to read. Major trends foreseen:
–From Reading to Consulting
–From Reading to Meta-analysis
–From Text to Knowledge Representations
The first step: to the Conceptual Semantic Network
Semantic types; co-occurrence data
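One way to picture how co-occurrence data becomes a network is to count, for every pair of concepts, how many documents mention both. The sketch below does only that. The function names and the sample documents are hypothetical, and semantic types are left out entirely.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_network(documents):
    """Count, for every unordered concept pair, the number of documents containing both."""
    edges = Counter()
    for concepts in documents:
        # sorted(set(...)) gives each pair one canonical (low, high) key per document.
        for a, b in combinations(sorted(set(concepts)), 2):
            edges[(a, b)] += 1
    return edges

def neighbours(edges, concept):
    """Concepts linked to `concept`, strongest link first (ties ordered by concept number)."""
    linked = {}
    for (a, b), n in edges.items():
        if a == concept:
            linked[b] = n
        elif b == concept:
            linked[a] = n
    return sorted(linked.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical documents, each reduced to the concept numbers of its fingerprint.
docs = [
    [24530, 12674, 19994],
    [24530, 12674],
    [24530, 19994],
    [12674, 3627],
]
network = cooccurrence_network(docs)
print(neighbours(network, 24530))
```

The edge counts act as link strengths. A real network would presumably also normalise them and attach a semantic type to each node.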
Figure: concepts associated with AGS1 in the semantic network
SwissProt: Activator of G-protein signaling 1 (AGS1)
OMIM: AICARDI-GOUTIERES SYNDROME 1 (AGS1)
Concepts shown: Calcium deposition; Pleocytosis; Basal Ganglia; Encephalopathy; Cerebrospinal Fluid; Tomography, X-Ray Computed; Parents; Family; Aicardi Goutieres syndrome; Ferrocalcinotic deposition; Spastic quadriplegia; Fahr disease; Microcephaly
Concepts shown: G-protein coupled receptors; G-substrate; Lipoid dermatoarthritis; Receptors; Complement Factor B; RNA, Complementary; Xenopus oocyte
Concepts shown: Aicardi Goutieres syndrome 1; Heterogeneity; Linkage (Genetics); Clinical diagnosis; Family; Lod Score; Genetic Heterogeneity analysis; Toxoplasmosis; Calcium deposition; Encephalopathy; Cadmium; Genus: Human cytomegalovir...; Cerebrospinal fluid abnorm.; Interferon-alpha; Chromosomes; Viral; Child; Head; Tricuspid Valve Stenosis
Fingerprinting disambiguation ACS META-ANALYSIS
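The AGS1 example, where SwissProt and OMIM use the same symbol for different concepts, suggests how a fingerprint can disambiguate a homonym: compare the concepts around the ambiguous term with a profile for each sense and pick the best match. The sketch below is a hypothetical illustration. The concept labels come from the AGS1 slide, but assigning them to one sense or the other, the overlap score and the function names are assumptions.

```python
def overlap(context, sense_profile):
    """Number of context concepts that also appear in the sense profile."""
    return len(set(context) & set(sense_profile))

def disambiguate(context, senses):
    """Return the sense whose profile shares most concepts with the context.

    On a tie, the sense listed first in `senses` wins.
    """
    return max(senses, key=lambda name: overlap(context, senses[name]))

# Labels are from the slide; which sense each belongs to is an assumption.
SENSES = {
    "Activator of G-protein signaling 1 (SwissProt)": {
        "G-protein coupled receptors", "G-substrate", "Receptors",
        "Xenopus oocyte", "RNA, Complementary",
    },
    "Aicardi-Goutieres syndrome 1 (OMIM)": {
        "Calcium deposition", "Encephalopathy", "Microcephaly",
        "Basal Ganglia", "Cerebrospinal Fluid",
    },
}

# Hypothetical context: concepts found near "AGS1" in some abstract.
context = ["Microcephaly", "Encephalopathy", "Child"]
print(disambiguate(context, SENSES))
```

A weighted fingerprint could replace the plain overlap count, so that highly relevant concepts influence the choice more than incidental ones.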
Applications
Cross-language, jargon and cross-system matching (implemented):
–Information-based discovery (Research)
–Community building (Experts, Policy Making)
–Trendwatching and Indicators (Policy Making)
Seed-Term based Conceptual Semantic Networks
Clustering of genes on the fly?
Predicting new knowledge?
III = Distribution over distance categories of concept pairs without co-occurrence in the learning set.
IV = Distance categories of concept pairs, related to the probability that there is no explicit relationship or co-occurrence in Medline (zero ratio). A ratio of 0 means that an automatic Medline query joining the two concepts with “AND” returns 0 hits.
New drug discovery?
Semantic Filtering
Diagram labels: Name: A; Institute; contact details; IBC; text; Acronym; Organisation; contact details; DOI; metadata; text; Title; metadata; DOI
The knowlet: content, people, organisations
Publications; Molecular Databases; Image databases; Patents; Events; Calls
Knowledge Maps, Nature Biotechnology Map
Knowledge Maps: Medline Bioterrorism Map 1997
Knowledge Maps: Medline Bioterrorism Map 2001
Private; Research; DC; Public; E-BioSci; Pharma etc.; ORIEL; SERENDIP; FP6 etc.; I-Research; Ministries; WHO, FAO etc.; SHARED; BIREME/VHL; EDCTP; Oxford initiative etc.