Tera-flops, Peta-bytes and Exa-links: Where is the Web in the Web of Science? Professor James Hendler, University of Maryland
UK e-science, 9/03 MIND SWAP: Maryland Information and Network Dynamics Laboratory, Semantic Web and Agents Project. J. Hendler, B. Parsia, Jennifer Golbeck, Aditya Kalyanpur, Grecia Lapizco-Encinas, Katy Newton, Evren Sirin, Ronald Alford, Ross Baker, Amy Alford, Matt Westhoff, Kendall Clark, Nada Hashmi. Corporate Research Partners: Fujitsu Laboratory of America, College Park; Lockheed Martin Advanced Technology Laboratories; NTT Corp; SAIC Corp. Govt Funding: US Army Research Laboratory, NSF, DARPA, IC. (OWL-powered Semantic Web page)
Outline. Justify the title: Exa-links? Conjecture: Future of science lies in interdisciplinary work. How scientists communicate. Models and Web of Semantics. Hypothesis: A little bit of semantics goes a long way. Cf. Web service composition (and its challenges). Outrageous claim: E-science fails without semantics. What's next?
ExaLinks. Paths through graphs grow exponentially. Computer scientists have been cursing this for years. Even if P=NP. Imagine a graph of >4e10 nodes: all possible paths are an unbelievably large number, O(2^(4e10)). Paths of length 4: Avg links 1/10e7 x links = 1.2e22. Web Graph? Google™ on the WWW.
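The growth claim on this slide can be illustrated with a minimal sketch. It is not the slide's own calculation. It counts walks (routes that may revisit nodes) rather than simple paths, and it uses a hypothetical toy graph built by `ring_with_chords`, in which every node links to a fixed number of successors. Under those assumptions the count is multiplied by the out-degree with each extra step.

```python
from typing import Dict, List


def count_walks(adj: Dict[int, List[int]], length: int) -> int:
    """Count directed walks with exactly `length` edges, over all start nodes."""
    # walks_ending[v] = number of walks of the current length that end at v
    walks_ending = {v: 1 for v in adj}
    for _ in range(length):
        extended = {v: 0 for v in adj}
        for u, successors in adj.items():
            for v in successors:
                extended[v] += walks_ending[u]
        walks_ending = extended
    return sum(walks_ending.values())


def ring_with_chords(n: int, degree: int) -> Dict[int, List[int]]:
    """Hypothetical toy graph: node i links to the next `degree` nodes on a ring."""
    return {i: [(i + j) % n for j in range(1, degree + 1)] for i in range(n)}


graph = ring_with_chords(n=10, degree=3)
counts = [count_walks(graph, k) for k in range(5)]
print(counts)  # 10 nodes, out-degree 3: the count triples at every step
```

Counting simple paths (no repeated nodes) is harder in general, so the walk count here should be read only as a rough indication of how quickly the possibilities multiply.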
The new challenge. As we increase the number of processors, the types of data, and the kinds of services: the graph of possibilities grows exponentially, and the paths through this graph grow exponentially.
(US) Grid computing emphases. Tera-flops: Despite Moore's law, we are compute-bound. These problems are HARD. Peta-bytes: We can collect more than we can process. These problems are BIG. Japanese Supercomputer. Something w/ much data. NSF Cyberinfrastructure report focuses on moving lots of data to ever larger computers.
Semantic “Depth” and Complexity. Explicit semantics: used for interoperability across domains. “Deepest” semantics required when no shared domain. Implicit semantics: within a given domain. [Diagram labels: “A”, “B”, “C”; C, A+B+C] Each circle represents the semantics of a service/source, and the overlap is the common semantics/terms. Examples: a hard-coded routine has high implicit semantics; a highly negotiated, annotated, composable service has high explicit semantics, etc. “C” represents the explicit common formats and assumptions required for effective interoperability between different domains. “Overlap factor” is
Analytic Models of Information Fluidity (Jiang, Cybenko, Hendler). [Plot panels: “100 Ontologies, 100 Systems/Ontology”; “Ontologies, 100 Systems/Ontology”] “Fluidity” is the largest number of fully interoperable systems over the total number of systems. Ontology mappings occur with probability proportional to the number of systems using an ontology. Based on recent random graph theory (Chung and Lu). Quantifiable, measurable in real systems.
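The fluidity definition on this slide can be made concrete with a rough sketch. It is not the Jiang, Cybenko and Hendler model itself, and it rests on two assumptions that the slide does not state: systems count as fully interoperable when their ontologies are connected by a chain of mappings, and a mapping between ontologies i and j appears with probability `scale * w_i * w_j / sum(w)`, a Chung-Lu-style rule in which `w_i` is the number of systems using ontology i. The names `fluidity`, `random_mappings` and `scale` are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def fluidity(systems: Sequence[int], mappings: Sequence[Tuple[int, int]]) -> float:
    """Largest group of systems joined by ontology mappings, over all systems.

    systems[i] is the number of systems using ontology i; each mapping (i, j)
    is assumed to make the systems of ontologies i and j interoperable.
    """
    parent = list(range(len(systems)))

    def find(i: int) -> int:
        # Union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in mappings:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Sum the system counts of each connected group of ontologies
    group_size: Dict[int, int] = {}
    for i, count in enumerate(systems):
        root = find(i)
        group_size[root] = group_size.get(root, 0) + count
    return max(group_size.values()) / sum(systems)


def random_mappings(systems: Sequence[int], scale: float,
                    rng: random.Random) -> List[Tuple[int, int]]:
    """Assumed Chung-Lu-style rule: P(i~j) = min(1, scale * w_i * w_j / sum(w))."""
    total = sum(systems)
    pairs = []
    for i in range(len(systems)):
        for j in range(i + 1, len(systems)):
            p = min(1.0, scale * systems[i] * systems[j] / total)
            if rng.random() < p:
                pairs.append((i, j))
    return pairs


# 100 ontologies with 100 systems each, as in the plot panel title
rng = random.Random(0)
systems = [100] * 100
sample = fluidity(systems, random_mappings(systems, scale=0.0002, rng=rng))
```

With no mappings, fluidity is just the share of the most popular ontology; as `scale` grows, more ontologies merge into one group and fluidity approaches 1.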
Analytic Models of Deep Semantics and Markup Complexity. [Plots of reduction in complexity vs zmax and wmax; panels: t=0.5, levels=1; t=0.5, levels=2; label “C”] Within subdomains, there is much overlap. Across subdomains, the need for deep semantics to achieve interoperability is much lower. Markup complexity is measured by the amount of unique markup required. Wmax and zmax are parameters capturing intra- and inter-domain semantics.