TEXT AND DATA MINING IN PUBLIC RESEARCH

TEXT AND DATA MINING IN PUBLIC RESEARCH Rob Johnson – 13/12/2016

1. Why does TDM matter?
2. Why isn’t it used more widely in public research?
3. How do we change this?

Study aims
Assess economic impact of TDM on public research in France via:
- Case studies (France, UK, Europe)
- Analysis of the relevance of a copyright exception for TDM
http://adbu.fr/etude-tdm/

€16 billion: Value of R&D performed within French universities and public research bodies (source: Eurostat)
France: €6.4 billion R&D in government sector, €10 billion in HE
UK: €3 billion in government, €9 billion
6-fold return: €6 contribution to EU economy for each €1 directly generated by research universities (source: Biggar Economics)
20% per annum: Estimated rate of return to public investment in science and innovation (source: Frontier Economics)

2.5 quintillion bytes: Data produced each day
2.4 million: Scientific articles per annum
Zero: Number of researchers who can keep up

What is TDM?
“Any automated analytical technique aiming to analyse text and data in digital form in order to generate information such as patterns, trends and correlations.”
European Commission, Proposal for a Directive of the European Parliament and of the Council on copyright in the Digital Single Market
A number of studies indicate that TDM can increase the efficiency of research:
- Increase coverage of literature reviews
- Cut down manual work
- Automate information retrieval
- Accelerate drug discovery
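To make the definition concrete, here is a minimal sketch of one very simple TDM technique: counting how often pairs of terms appear in the same document, which is one way of surfacing "patterns" and "correlations" in text. The mini-corpus, the term list and the function names are all hypothetical and chosen for illustration only; real TDM pipelines work on far larger collections and use much richer language processing.

```python
from __future__ import annotations

import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus standing in for article abstracts (illustrative only).
DOCS = [
    "Aspirin reduces inflammation and pain in adults.",
    "Inflammation is linked to pain and fatigue.",
    "Aspirin may lower fever; fever often accompanies inflammation.",
]

# Hypothetical vocabulary of terms to track (an assumption, not from the slides).
TERMS = {"aspirin", "inflammation", "pain", "fever", "fatigue"}


def terms_in(doc: str) -> set[str]:
    """Return the tracked terms that occur in one document."""
    tokens = re.findall(r"[a-z]+", doc.lower())
    return TERMS.intersection(tokens)


def cooccurrences(docs: list[str]) -> Counter:
    """Count how many documents mention each pair of tracked terms together."""
    counts: Counter = Counter()
    for doc in docs:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(terms_in(doc)), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    # Print the most frequent term pairs across the corpus.
    for pair, n in cooccurrences(DOCS).most_common(3):
        print(pair, n)
```

Even this toy version shows why lawful access to full collections matters: the counts only mean something when the technique can be run across every relevant document.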

BASE CAMP
Where are we now, and how did we get here?

What is the problem?
“…countries, in which academic researchers must acquire the express consent of rights holders to conduct lawful data mining, exhibit a significantly lower share of data mining research output relative to total research output”
Handke, Guibault and Vallbé, Is Europe Falling Behind in Data Mining? (2015)

What is the result?
“The European ecosystem for engaging in text and data mining remains highly problematic… The end result: Europe is being leapfrogged by rising interest in other regions, notably Asia.”
Filippov, S. & Hofheinz, P., Text and Data Mining for Research and Innovation (2016)

Legislative options
Diagram: options on a scale from 1 to 4, with an intermediate point marked “1.5?”
- Industry self-regulation
- Mandatory exceptions to copyright: non-commercial research only; commercial research, beneficiaries restricted; commercial research purpose, beneficiaries unrestricted
Annotations: Loi pour une République Numérique (Loi LEMAIRE), 28 September 2016; 2014; 2017?

Using a TDM exception
Restrictions (France):
- No lawful access
- Not scientific literature
- Not public research
- Commercial purpose
- Conservation not by designated body
Note: Conservation requirements could be a positive in terms of reproducibility

1. ACHIEVING LEGAL CLARITY

Summit: Researchers embrace TDM
Camp 4: Skills and support
Camp 3: Technical infrastructure
Camp 2: Access to content
Camp 1: Legal clarity
Base Camp: EC Directive, copyright exception

“The exception has made a massive difference...”
Petr Knoth, Open University, UK

“…the definition of commercial and non-commercial research is creating uncertainty”
Petr Knoth, Open University, UK

EC Proposed Directive
- Consistent with the existing EU copyright legal framework
- Could help resolve uncertainty over commercial partnerships
- Currently out for consultation
Source: http://www.comodinicachia.com/timeline.html

What needs to happen?
- Communicate legal provisions for TDM with certainty and clarity
- Clarify the exception’s scope where public researchers collaborate with commercial partners
- Monitor the interaction of the copyright exception with digital rights management (DRM), licensing and other relevant legal regimes

Any questions?

2. SECURING ACCESS

“I scaled down my TDM research, and had to exclude two publishers… I couldn’t do what I set out to do”
Chris Hartgerink, Tilburg University, Netherlands

“I had to ask too many publishers for the right to download … it takes a lot of time and … the publishers’ servers frequently block us.”
Mathieu Andro, INRA, France

What is the problem with access?
- Technical protection measures (TPMs)
- Crawler traps
- Restricted access to application programming interfaces (APIs)
- A many-to-many problem

What needs to happen?
- Incorporate TDM clauses into model licence agreements
- Educate researchers on their rights
- Maintain dialogue with publishers
- Improve access through better infrastructure…

3. INFRASTRUCTURE & TOOLS Image: National Geographic

“…Every time you have a new project or data source… you hit issues about how the documents are structured, oddities of formatting, and so on.”
Mark Greenwood, GATE, UK

The TDM Landscape (figure; source: OpenMinTeD)

What needs to happen?
- Invest in TDM infrastructure
- Make TDM accessible to non-specialists
- Streamline access
- Open standards and harmonised data formats

4. SKILLS & SUPPORT

“…We have algorithms to answer questions, but we do not have algorithms to ask questions”
François Rioult, GREYC Laboratory, Université de Caen, France

What is the role of the librarian? Edmund Hillary and Tenzing Norgay Photo: REUTERS

“The library needs to be able to say: ‘If you’ve got a question about TDM, come to us’”
Danny Kingsley, Head of Scholarly Communications, University of Cambridge, UK

Library support for TDM
- Advocacy for the benefits of TDM at all levels of the organisation
- Copyright advice on using the TDM exception
- Access to legal expertise
- Skills development (indexing and metadata curation) and access to technical training (coding and high performance computing)
- Advice on data sources and tools

5. EMBRACING TDM

Why? "Because it's there"

“There are so many obstructions in the way of doing this research, and doing it well. It is just too hard and so people do other things”
Ross Mounce, University of Cambridge, UK

What needs to happen?
- Endorsement by senior research leaders
- Funding and incentives linked to TDM
- Alignment with moves to open science

1. Why does TDM matter?
2. Why isn’t it used more widely in public research?
3. How do we change this?

Why does TDM matter?
- Public research is valuable
- TDM makes research more efficient
- TDM is worth investing in

1. Why does TDM matter?
2. Why isn’t it used more widely in public research?
3. How do we change this?

Summit: Researchers embrace TDM
Camp 4: Skills and support
Camp 3: Technical infrastructure
Camp 2: Access to content
Camp 1: Legal clarity
Base Camp: EC Directive, copyright exception

1. Why does TDM matter?
2. Why isn’t it used more widely in public research?
3. How do we change this?

Making TDM a reality
Libraries
- Monitor researchers’ experiences
- Develop case studies and guidance
- Involve the national library
- Invest in TDM support
- Incorporate TDM clauses into licence agreements

Making TDM a reality
Legislators
- Provide certainty
- Enable public/private partnerships
- Monitor interaction with other legislation (e.g. DRM)
Institutions/research leaders
- Endorse TDM
- Invest in library services
- Explore knowledge exchange opportunities
Research funders
- Invest in infrastructure
- Forum to improve access
- Link TDM to Open Science
Publishers & providers
- Cloud services for TDM
- Streamline access
- Open, harmonised standards

Thank you
Rob Johnson
Full report available at: http://adbu.fr/etude-tdm/
rob.johnson@research-consulting.com
www.research-consulting.com