LIBER: Knowledge Discovery in the Digital Age. Turin, 29th November 2016. Susan Reilly, Executive Director, LIBER. @skreilly
Let’s talk about knowledge discovery
What is Knowledge Discovery? The ultimate goal is to extract high-level knowledge from low-level data. It allows analysis across disciplines, surfaces “undiscovered public knowledge” (Swanson), and identifies patterns in the data to produce new knowledge. It’s not a new thing; digital information just makes it a whole lot more powerful and relevant!
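Undiscovered Public Knowledge. Research into Raynaud’s disease: a symptom of the disease is high blood viscosity. Research into fish oil: it thins the blood. Read together, the two findings suggest a hypothesis that neither literature states on its own: fish oil might help patients with Raynaud’s disease.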
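The Raynaud’s and fish oil case can be sketched as a search for terms shared by two bodies of literature that do not otherwise refer to each other. The snippet below is only a minimal illustration of that idea, not Swanson's actual procedure: the function name `linking_terms`, the stopword list, and the two toy sentences (paraphrased from the slide) are all assumptions made for the example.

```python
# Toy sketch of literature-based discovery: find terms that link two literatures.
# All names and sample sentences are hypothetical, chosen for illustration only.

STOPWORDS = frozenset({"is", "with", "the", "a", "of", "and"})


def vocabulary(docs, stopwords=STOPWORDS):
    """Collect the lower-cased words used in a list of documents."""
    terms = set()
    for doc in docs:
        for word in doc.split():
            terms.add(word.strip(".,;:").lower())
    return terms - stopwords


def linking_terms(literature_a, literature_c, stopwords=STOPWORDS):
    """Return the terms that appear in both literatures."""
    return vocabulary(literature_a, stopwords) & vocabulary(literature_c, stopwords)


# Toy stand-ins for the two research literatures mentioned on the slide.
raynaud_docs = ["Raynaud's disease is associated with high blood viscosity."]
fish_oil_docs = ["Dietary fish oil lowers blood viscosity."]

print(sorted(linking_terms(raynaud_docs, fish_oil_docs)))
```

On real corpora the shared vocabulary would be far noisier, so a practical system would need term weighting and filtering; the sketch only shows where the link between the two literatures comes from.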
Why now? “the discovery by computer of new, previously unknown information, by automatically extracting and relating information from different (…)resources, to reveal otherwise hidden meanings” (Hearst, 1999). Data: 16 trillion gigabytes of data by 2020 (236% growth). Computing power: doubles every 2 years (Moore’s Law, 1965). ICT availability: over 80% of EU citizens have internet access (Eurostat 2014).
A nice example of how text and data mining (TDM) can replace or supplement a literature review is HypothesisFinder. A hypothesis is a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation. Malhotra A, Younesi E, Gurulingappa H, Hofmann-Apitius M (2013) ‘HypothesisFinder:’ A Strategy for the Detection of Speculative Statements in Scientific Text. PLoS Comput Biol 9(7): e1003117. doi:10.1371/journal.pcbi.1003117
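To give a feel for what detecting speculative statements can involve, here is a deliberately simple sketch that flags sentences containing hedging cue words. It is not the method of the HypothesisFinder paper: the cue list, the function name `speculative_sentences`, and the sample text are assumptions made for illustration.

```python
import re

# Hypothetical cue list for illustration; not the dictionary used by HypothesisFinder.
SPECULATION_CUES = ("may", "might", "could", "suggest", "suggests",
                    "hypothesize", "possibly", "appears to")

# Match any cue as a whole word or phrase, ignoring case.
CUE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(cue) for cue in SPECULATION_CUES) + r")\b",
    re.IGNORECASE,
)


def speculative_sentences(text):
    """Return the sentences in `text` that contain at least one cue."""
    # Naive sentence splitting on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if CUE_PATTERN.search(s)]


sample = ("Fish oil lowers blood viscosity. "
          "These results suggest that fish oil might benefit patients "
          "with Raynaud's disease.")

for sentence in speculative_sentences(sample):
    print(sentence)
```

A fixed cue list like this will miss many speculative phrasings and flag some factual ones, so it serves as a starting point for reading candidates rather than as a classifier.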
Automated Biography of a Nation http://annalyn-ng.com/sg50/chart.html
TDM is Fun! Hedonometer
Economics & Competitiveness (Europe). TDM is potentially worth 5.3 billion euro a year to the European research budget (2%). The knock-on effect would be a minimum 32.5 billion euro increase in GDP. The US is responsible for over half the articles and patents on TDM: 1100 US patents compared to 39 EU patents by 2013. Non-English-speaking countries are falling behind.
http://www.flickr.com/photos/21042103@N03/3136430416/ Funny Kitten
Does copyright apply? We are not seeking free access! We are not seeking to steal content. We are seeking to convert content that we have legal access to into a machine-readable format so that our computers can ‘read’ the content and extract facts, data and ideas. This strikes at the heart of what libraries do.
What we need “A mandatory pan-European exception for text and data mining (and analogous activities), which cannot be overridden by contract, and is not limited to non-commercial activity.” Photo: Howard Lake https://www.flickr.com/photos/howardlake/5540462170
Copyright in the Digital Single Market
Proposed exception for TDM: covers text and data; cannot be overridden by contract provisions; only research organisations benefit; status of public-private partnerships unclear; allows use of “appropriate technical protection measures” (TPMs); best practice for TPMs to be agreed at member state level.
What can you do? 1. INTELLECTUAL PROPERTY WAS NOT DESIGNED TO REGULATE THE FREE FLOW OF FACTS, DATA AND IDEAS, BUT HAS AS A KEY OBJECTIVE THE PROMOTION OF RESEARCH ACTIVITY http://thehaguedeclaration.com
Thank You! Any questions? @skreilly www.thehaguedeclaration.com www.openminted.eu www.futureTDM.eu www.libereurope.eu