
1 Weaving a Semantic Web: Using Linked Open Data from Institutional and National Sources David Eichmann School of Library and Information Science University of Iowa

2 Thanks to: Noshir Contractor, Northwestern University; Holly J. Falk-Krzesinski, Elsevier & Northwestern University; Melissa Haendel, Oregon Health & Science University; Michael Conlon, University of Florida

3 Thesis of Talk We are well into the first generation of research networking tools, but we are still conceptualizing what successive generations might look like. –As support for this, consider that the majority of participating institutions are still in the process of fully deploying their first research networking system.

4 Research Networking Programmatic support for discovery and use of research and scholarly information regarding people and resources. They are essentially special purpose institutional knowledge management systems.

5 Example RN Systems Profiles (Harvard) VIVO (VIVO Consortium, led by Florida) Loki (Iowa) SciVal Experts (commercial – Elsevier) A number of others

6 A Sample Profiles Page

7 A Sample VIVO Page

8 A Sample Loki Page

9 Some Loki Project History Originally conceived as an electronic research interest “brochure” for Research Week 2006 Scaled quickly into something more permanent –Publications –Grants –Biosketches

10 Some Loki Project History Auto population of PubMed citations Integration with Sponsored Programs funding database Integration with HR database Auto population of NIH Funding Opportunity Announcements

11 Some Loki Project History Integration with eCV project –Digital Measures’ ActivityInsight Auto population of Web of Knowledge citations Out-bound linkage to –PubMed and PubMed Central –Web of Science –Publishers (via DOIs)

12 Loki Demographics 2011
College or Org Unit           N
Medicine                    682
Other Units                 130
Liberal Arts and Sciences    90
Nursing                      68
Dentistry                    59
Pharmacy                     50
Public Health                43
University Hospitals         25
Engineering                  23
ICTS                         13

13 Current Loki Demographics

14 Target Demographics All faculty and interested staff The campus eCV project is serving as the catalyst for this, with PubMed/WoK and DSP data serving as a major point of work avoidance for faculty The campus institutional repository may come to play a role here as well

15 Why Bother with VIVO? Words in a profile are just sequences of characters carrying no meaning –Try asking Google Scholar what grant funded a given hit… With structure and relationship comes meaning, aka semantics –Enter the Semantic Web!

16 More… Science –Looking through the current concepts of publication and grant to the science being done Organizational Context –Aggregate concepts as well as the investigators that comprise them Labs, courses, centers, …

17 More… Information –We have more information available to us about our graduate students (via Facebook and Twitter) than about our (potential) colleagues and collaborators

18 Connecting the Dots The real challenge here is translation of information already in existence in scattered sources –Research networking tools –Citation databases (e.g., PubMed) –Award databases (e.g., NIH RePORTER) –Curated archives (e.g., GenBank) –Locked up in text (the research literature)

19 Swanson’s bibliographic linkage 1986: "Fish oil, Raynaud's syndrome, and undiscovered public knowledge." Perspectives in Biology and Medicine 30(1): 7-18.

20 Swanson’s bibliographic linkage 1986: "Fish oil, Raynaud's syndrome, and undiscovered public knowledge." Perspectives in Biology and Medicine 30(1): 7-18. 1987: "Two medical literatures that are logically but not bibliographically connected." Journal of the American Society for Information Science 38(4): 228-233. 1988: "Migraine and magnesium: Eleven neglected connections." Perspectives in Biology and Medicine 31(4): 526-557. Connecting to the Science

21 Linked Open Data Architecture

22 Linked Open Data Appeal Models –Low-hanging fruit: mapping our own data –The big payoff: level 5 LOD and deep questions “Which investigators are studying genes implicated in breast cancer?” –The inference chains that are possible: Loki – MEDLINE – RefSeq – GenBank
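
The inference chain on this slide can be pictured as a walk over linked triples. What follows is a minimal sketch, assuming a toy in-memory list of triples; every identifier, predicate name, and data value is hypothetical and only stands in for what Loki, MEDLINE, RefSeq, and GenBank would actually publish.

```python
# Toy triples (subject, predicate, object). All values are invented for
# illustration; real sources would expose URIs, not short strings.
TRIPLES = [
    ("loki:person/1", "authored", "medline:pmid/A"),
    ("loki:person/2", "authored", "medline:pmid/B"),
    ("medline:pmid/A", "mentions", "refseq:gene/X"),
    ("medline:pmid/B", "mentions", "refseq:gene/Y"),
    ("refseq:gene/X", "sequenceIn", "genbank:record/G1"),
    ("genbank:record/G1", "implicatedIn", "disease:breast-cancer"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked from subject by predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def follow_chain(start, predicates, triples=TRIPLES):
    """Follow a fixed sequence of predicates from start; return the end nodes."""
    frontier = {start}
    for predicate in predicates:
        frontier = {o for node in frontier for o in objects(node, predicate, triples)}
    return frontier


def investigators_for(disease, triples=TRIPLES):
    """Find people whose person -> paper -> gene -> record chain reaches disease."""
    chain = ["authored", "mentions", "sequenceIn", "implicatedIn"]
    people = sorted({s for s, p, _ in triples if p == "authored"})
    return [person for person in people if disease in follow_chain(person, chain, triples)]
```

Calling investigators_for("disease:breast-cancer") answers the slide's question over the toy data. In a real deployment each hop would cross a dataset boundary, which is why shared identifiers between the sources matter.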

23 Linked Open Data Appeal Independence with equivalency –Build out a Loki ontology –Low-hanging fruit: RDF triple generation using D2R –Generate ontological equivalences to VIVO, etc.
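
As a rough illustration of what relational-to-RDF mapping produces, the sketch below turns one database row into N-Triples lines and emits a class equivalence. It is plain Python, not D2R's own mapping language. The http://example.org/ namespaces, the class names, and the row layout are assumptions made for the example; only the rdf:type, owl:equivalentClass, and foaf:name URIs are standard.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
LOKI = "http://example.org/loki/"    # hypothetical namespace for a Loki ontology
OTHER = "http://example.org/other/"  # hypothetical stand-in for VIVO or similar


def uri(value):
    """Wrap a URI in N-Triples angle brackets."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal (escapes only backslash and double quote)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def row_to_triples(row):
    """Map one relational row (a dict with 'id' and 'name') to N-Triples lines."""
    subject = uri(f"{LOKI}person/{row['id']}")
    return [
        f"{subject} {uri(RDF_TYPE)} {uri(LOKI + 'Faculty')} .",
        f"{subject} {uri(FOAF_NAME)} {literal(row['name'])} .",
    ]


def equivalence(local_class, other_class):
    """State that a local class is equivalent to a class in another ontology."""
    return f"{uri(LOKI + local_class)} {uri(OWL_EQUIVALENT_CLASS)} {uri(OTHER + other_class)} ."
```

Keeping the local ontology separate and publishing equivalences alongside it is what gives the independence the slide refers to: the local model can evolve without waiting on the external one.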

24 Linked Open Data Appeal Addressing conceptual dissonance –The VIVO concept of investigator maps to two related, but distinct, Loki concepts Faculty Researcher –Who’s “right?” It’s a multi-ontology world – all semantics are relative
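
One way to live with this kind of mismatch is to map an external concept to a set of local concepts rather than to a single equivalent. The sketch below assumes a hand-written mapping table and invented records; the concept labels follow the slide, but the data structure is purely illustrative.

```python
# Hypothetical mapping: one external concept, two local Loki concepts.
CONCEPT_MAP = {"Investigator": {"Faculty", "Researcher"}}

# Invented records, each tagged with a local concept.
PEOPLE = [
    {"name": "Person A", "concept": "Faculty"},
    {"name": "Person B", "concept": "Researcher"},
    {"name": "Person C", "concept": "Staff"},
]


def find_by_external_concept(concept, people=PEOPLE, concept_map=CONCEPT_MAP):
    """Expand an external concept to its local concepts, then filter people."""
    local_concepts = concept_map.get(concept, set())
    return [p["name"] for p in people if p["concept"] in local_concepts]
```

A query phrased in the external vocabulary then returns members of both local concepts, without forcing either ontology to give up its own distinction.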

25 Looking Beyond Research Networking

26 Supporting the Research Lifecycle Project conceptualization –“Framing the problem” à la Goodman –Literature review Funding opportunity identification –NIH FOA alerts Team identification/formation –RN tool stock-in-trade

27 Supporting the Research Lifecycle Proposal preparation –Biosketch management Research process support –If wikis are the answer, what was the question? Outcomes dissemination and curation –Institutional repositories, Dataverse, etc.

28 Current Approaches to Team Identification Survey the target community –Suffers from issues of scale and detection Quantitatively analyze a surrogate information source –Publication/Grant co-authorship –Temporally offset from actual collaboration –Only the ‘winners’ are detected –Serious information loss re true expertise

29 Some Early Data on CTSA Consortium Collaboration
Org. Cornell NW OHSU UCSF Fla Iowa
Cornell
NW 0
OHSU 066
UCSF 01944258
Fla 025440
Iowa 02675235336
Inter-Institutional co-authorship pair counts
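
The slide does not say how the pair counts were computed. One plausible definition, sketched here under that assumption, is to count every unordered pair of co-authors on a paper and file it under the pair of institutions involved. The input layout is hypothetical.

```python
from collections import Counter
from itertools import combinations


def institution_pair_counts(publications):
    """Count co-author pairs by institution pair.

    publications: iterable of author lists; each author is (name, institution).
    Every unordered pair of authors on a paper adds one to the cell for their
    two institutions (same-institution pairs land on the diagonal).
    """
    counts = Counter()
    for authors in publications:
        for (_, inst_a), (_, inst_b) in combinations(authors, 2):
            counts[tuple(sorted((inst_a, inst_b)))] += 1
    return counts
```

Sorting each institution pair before counting makes the matrix symmetric, so only one triangle needs to be displayed.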

30 Social Networking Linkages Holly comes across the new service out of LinkedIn Labs, visualization of your LinkedIn connections –http://inmaps.linkedinlabs.com Holly relates this coolness to Dave, who can’t resist poking about to see if he can scrape the data Having done so, he twists arms of selected colleagues to cough up their maps

31 Dave’s Map

32 Mike’s Map

33 Melissa’s Map

34 Holly’s Map

35 Nosh’s Map

36 Phase 1 Acquisition of graph structure –Nodes, edges, coordinates, cluster membership Acquisition of node characteristics –Person name, URL, public ID

37 Aggregate Graph Statistics
Person    # Nodes   # Edges   Ave. Edges/Node
Dave          339     2,816               8.3
Holly       1,835    13,930               7.6
Melissa       272     2,577               9.5
Michael       461     5,571              12.1
Noshir      2,373    27,203              11.5
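
The ratio reported on this slide is simply edges divided by nodes. The short sketch below recomputes it from the node and edge counts given on the slide; the variable and function names are chosen for the example.

```python
# (nodes, edges) per person, copied from the slide.
GRAPHS = {
    "Dave": (339, 2816),
    "Holly": (1835, 13930),
    "Melissa": (272, 2577),
    "Michael": (461, 5571),
    "Noshir": (2373, 27203),
}


def average_edges_per_node(nodes, edges):
    """Edges divided by nodes, rounded to one decimal place."""
    return round(edges / nodes, 1)


averages = {name: average_edges_per_node(n, e) for name, (n, e) in GRAPHS.items()}
```

Note that for an undirected graph the mean degree would be twice this ratio, since each edge touches two nodes.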

38 Subgraph Size
Subgraph   Dave   Holly   Melissa   Mike   Nosh
0            96     127        45     96    189
1            37     124        41    103    155
2            36     123        40     69    217
3            34     123        33     72    212
4            25     108        26     81    143
5            23     105        26     24    247
6            18     102        15      4    144

39 Overall Population Characteristics Total distinct individuals: 4959 –Shared by 2 or more: 246 –Shared by 3 or more: 43 –Shared by 4 or more: 22 –Shared by 5 or more: 10
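
Counts of this "shared by k or more" kind fall out of a simple membership tally. The sketch below assumes each person's map has already been reduced to a set of public IDs; the function name and input layout are hypothetical.

```python
from collections import Counter


def shared_counts(maps, max_k=None):
    """Count individuals that appear in at least k maps, for each k.

    maps: dict mapping a map owner to the set of person IDs in that map.
    Returns a dict {k: number of distinct individuals found in >= k maps}.
    """
    membership = Counter(pid for ids in maps.values() for pid in set(ids))
    top = max_k or len(maps)
    return {k: sum(1 for n in membership.values() if n >= k) for k in range(1, top + 1)}
```

The entry for k = 1 is the total number of distinct individuals, and each later entry can only shrink or stay the same.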

40 For the 5 or more The primary participants (except Nosh!) Others: –Bill Barnett –Ying Ding –Kristi Holmes –Warren Kibbe –Titus Schleyer –Griffin Weber

41 Phase 2 Screen scrape the public page for a person –# of connections (capped at 500) –Organizational affiliations –Expertise endorsements

42 Subgraph Intersection (Dave)
Subgraph Holly Melissa Mike Nosh
0 28243
1 - - - -
2 614252
3 - - 1 -
4 - - - -
5 - - - -
6 - - - -
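
An intersection table like this one can be built by comparing each of one person's clusters against the full node set of every other map. This is a minimal sketch assuming nodes are identified by a shared public ID; the names and input layout are hypothetical.

```python
def subgraph_intersections(my_clusters, other_maps):
    """Overlap between each of my clusters and each other person's map.

    my_clusters: dict cluster_id -> set of person IDs in that cluster.
    other_maps: dict owner -> set of person IDs in that owner's whole map.
    Returns dict cluster_id -> dict owner -> number of shared IDs.
    """
    return {
        cluster_id: {owner: len(members & ids) for owner, ids in other_maps.items()}
        for cluster_id, members in my_clusters.items()
    }
```

Cells that come out as zero correspond to the dashes in the slide's table.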

43 Subgraph Expertise Characterization Cluster 0

44 Subgraph Expertise Characterization Cluster 2

45 Subgraph Expertise Characterization Cluster 3

46 Pattuelli’s Spectrum of Relationships (2012)

47 RN Tools

48 Pattuelli’s Spectrum of Relationships (2012) RN Tools LinkedIn

49 Pattuelli’s Spectrum of Relationships (2012) Ontologies used –foaf (Friend of a Friend) –rel (Relationship) –mo (Music) Echoes of Trigg’s link taxonomy –Trigg, R. 1983. A Network-Based Approach to Text Handling for the Online Scientific Community. Ph.D. dissertation, Department of Computer Science, University of Maryland, technical report TR-1346

50 Observations N = 5 ! LinkedIn expertise endorsements are an ad hoc folksonomy –Melding this with the typically controlled vocabulary of the research networking tools should prove interesting –These characteristics don’t show up in the RN meta-data

51 Questions? Email: david-eichmann@uiowa.edu Thanks to my co-authors and the Research Networking Affinity Group Supported in part by NIH grants 2 UL1 TR000442-06 and UL1 RR024979

