Virtual Research Environments The story of Scratchpads Dimitris Koureas Natural History Museum London Coordinator, Distributed System of Scientific Collections (DiSSCo) Chair, Biodiversity Information Standards (TDWG) Co-chair, DCF IG, BDI IG & ASDOC WG @DimitrisKoureas
1 2 4 3 Facilitate the research data lifecycle Data collection & generation Facilitate the research data lifecycle Functional and operational characteristics differ significantly according to the communities of practice addressed 1 2 3 4 Data curation VREs enable communities to transfer their typical scientific practice into the digital realm Data publishing Data analysis
Successful examples of VREs across disciplines Past Projects (EC-funded) Current Projects (EC-funded) VI-SEEM MuG OpenDreamKit BlueBRIDGE VRE4EIC West-Life Despite their success for Virtual Research Environments to continue serving their communities we need to reinvent their role within the ever developing research e-infrastructures landscape
The challenges with VREs Built in isolation Developed redundant technological solutions for storage, authentication, computing Failed to develop good sustainability models (failed to attract users)
Build it and they will come? No! Confidence Commitment Longevity Agility Adaptability User monitoring Marketing Visibility Intuitive interface
650 Scratchpads Communities by 7,500 active registered users covering 125,000 taxa in 1,200,000 pages. An example from the Biodiversity domain Per month unique visitors to Scratchpads sites In total more than 5,000,000 visitors 70,000 unique visitors/month
What are Scratchpads? Hosted websites for biodiversity data Virtual research & publication platform Completely open access & open source Modular & flexible What are Scratchpads? They are hosted websites for biodiversity data, but the specifics and the kind of data are entirely up to the user. Scratchpads are designed to be a virtual research and publication platform, a place where you should be able to store, save, publish and reuse your work. They are completely open access, open source and free to use. They are modular and flexible. It is possible for other developers to work with us to write modules for specific functions or communities.
The Scratchpads concept Your data External data & services Scratchpads are fed by your data. Scratchpads help you structure your data in a way that makes them both human and machine readable. Allows you to contribute to global biodiversity databases and also aggregates all related to your data information from external resources.
The Scratchpads concept
What is the challenges today? Hard to benefit from technological advances and external services (e.g. cloud storage and computing) Locked-in certain technologies leads to substantial limitations (e.g. recruiting) The hybrid sustainability model (core funds and community in-kind support) not always working Continuity and uninterrupted service provision, while changing underlying architecture is difficult (expensive)
The future of Scratchpads Retain a thin layer of interfaces that: Enable standardised capture and share of data Collaboration and communication tools Robustly link to underlying e-infrastructures Cloud storage and cloud computing AAI services
The future of Scratchpads JSON Schema Microservice API CRUD Validation OpenAPI Docs JSON-LD The future of Scratchpads Create a database & custom schema matching internally modelled data Input data Map the custom schema Linked Open Data Schemas. Output Linked Open Data
Vertical Integration Utilise the common services of the European Open Science Cloud Currently: 7,000 Target: 15,000 Domain aggregators & repositories Scientific Outputs Authentication Storage Computing Backbone services
Horizontal Integration Interoperability across tools and services Science-Policy IPBES - EBV
The future of Scratchpads Merge with other similar community solutions Shift ownership from the institution to the corresponding Research Infrastructures
Not all research communities are ‘created equal’ For instance, researchers working on genomics, physics or astronomy have long appreciated the value of data sharing. By nurturing a culture of shared physical and computational infrastructure, open-source software and open data, they have embraced the principles of open science. The need to respect diversity and continuously developing needs Research communities are developing new practices organically and that they are best placed to explore which of these contribute to the advancement of their discipline The challenge starts with identifying how professionals work within their research communities and understanding the processes that lead to innovation becoming embedded into common practice (May and Finch 2009). Providing einfrastructures that seamlessly couple with the work practices of a particular professionrequires layers that abstract from a technical level and use the language of each profession. Such layers are typically web-based applications that address elements across the lifecycle of data and research, i.e. data mobilisation and discovery, experimentation, analyses, publication, and open collaboration. Such “Virtual Research Environments” (VREs) should act as intuitive and responsive interventions between researchers and core services.
European Investments in e-infrastructures 2007-2015 more than €740 Million Digital Agenda for Europe Over the last years investments have focused towards robust, reliable and interoperable services that generate global solutions for data sharing and preservation, high performance and cloud computing, user-authorisation and authentication
A global priority of changing the modus operandi of science Investments in development of e-infrastructures Provision of tools and services Training and incentives Digital Agenda for Europe Digital Collaborative Data-intensive research
Developing the European Open Science Cloud A set of robust, sustainable infrastructures that provide cloud services (authorisation, storage, computing) An ecosystem of common backbone services
Researchers are still reluctant to engage European Grid Infrastructure (a flagship initiative that delivers integrated computing services to European researchers) announced (Dec 2015) a user base of 35,959 (European Grid Infrastructure 2015)
Developing the Backbone infrastructure (core services) is not enough Services need to reach the client’s ‘house’
The last mile challenge for research e-infrastructures Koureas D. et al. 10.3897/rio.2.e9933 The last mile challenge for research e-infrastructures Borrowed from telecommunications and transportation Cost and complexity of ‘connecting’ end-users to the backbone (core) infrastructures is high when compared to the core infrastructure itself. VREs act more as the socio-cultural and less as technical brokers. Intermediate environments that connect communities of practice to a common e-infrastructure landscape.
Socio-cultural distance from open digital practices VRE Backbone service Bus (e.g. EOSC) VRE VRE
The development of EOSC does not reduce the importance of VREs benefits In fact, it strengthens it. As they become the needed mechanisms that enable communities of practice to engage with core services VREs benefit The EOSC reduces the operational costs of VRE up keeping and hardwires interoperability minimising fragmentation and redundancy
Developing the next generation of VREs User communities need to be able to: articulate and communicate their community-specific needs in regards to data and services, and translate these needs into clear functional requirements that will drive the development of VREs.
Developing the next generation of VREs VRE operators need to: develop VREs looking beyond the ephemeral timeframes of project-based approaches, invest early in building public-public and public-private partnerships that ensure sustainability and, robustly link VREs with existing underlying e-infrastructure, building on top of available backbone services.
Developing the next generation of VREs Funders need to: further acknowledge the pivotal role of VREs in support of user community engagement and, develop, with a particular eye to long-term sustainability, dedicated VRE funding programmes with targeted calls to discipline-specific communities.
Thank you @DimitrisKoureas d.koureas@nhm.ac.uk