David De Roure Eindhoven Edition

Presentation transcript:

David De Roure Eindhoven Edition

Due to the complexity of the software and the backend infrastructural requirements, e-Science projects usually involve large teams managed and developed by research laboratories, large universities or governments. e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.

How do we know when e-Science has succeeded?
A. When everyone is using the Grid
B. When there are routine scientific advances that would not have happened otherwise
Not just accelerated but new

How do we move from heroic scientists doing heroic science with heroic infrastructure to everyday scientists doing science they couldn’t do before?
Not just scientists and science: humanists, archaeologists, geographers, musicologists... researchers doing research!
It’s the democratisation of e-Research

[Figure: The social process of science. Labels: scientists; experimentation; Data, Metadata, Provenance, Workflows, Ontologies; Certified Experimental Results & Analyses; Preprints & Metadata; Peer-Reviewed Journal & Conference Papers; Technical Reports; Reprints; Local Web Repositories; Digital Libraries; Virtual Learning Environment; Graduate Students; Undergraduate Students]

Between 19th October and 23rd November 2007 I attended six international meetings related to e-Science:
Grid 2007
Scientific and Scholarly Workflows
e-Social Science 2007
W3C
Open Grid Forum
Microsoft e-Science
This is what I found

1. Everyday researchers doing everyday research
Not just a specialist few doing heroic science with heroic infrastructure
Chemists are blogging the lab
Everyone is mashing up
Everyday hardware – multicore machines and mobile devices

2. A data-centric perspective, like researchers
Data is large, rich, complex and real-time
There is new value in data, through new digital artefacts and through metadata, e.g. context, provenance, workflows
This isn’t “anti-computation” – design interaction around data

3. Collaborative and participatory
The social process of science revisited in the digital age
Collaborative tools – blogs and Wikis
e-Science now focuses on publishing as well as consuming
Scholarly lifecycle perspective

4. Benefitting from the scale of digital science activity to support science
This is new and powerful!
Community intelligence
Review
Usage informing recommendation
e.g. OpenWetWare
e.g. myExperiment

5. Increasingly open
Preprint servers and institutional repositories
Open journals
Open access to data
Science Commons
Object Reuse & Exchange

6. Better not Perfect
The technologies people are using are not perfect
They are better
They are easy to use
They are chosen by scientists

7. Empowering researchers
The success stories come from the researchers who have learned to use ICT
Domain ICT experts are delivering the solutions
Anything that takes away autonomy will be resisted

8. About pervasive computing
e-Science is about the intersection of the digital and physical worlds
Sensor networks
Mobile handheld devices

Signs of the Times
1. Everyday researchers doing everyday research
2. A data-centric perspective, like researchers
3. Collaborative and participatory
4. Benefitting from the scale of digital science activity to support science
5. Increasingly open
6. Better not Perfect
7. Empowering researchers
8. About pervasive computing

Onward and Upward
e-Science is now enabling researchers to do some completely new stuff!
As the individual pieces become easy to use, researchers can bring them together in new ways and ask new questions
“The next level”
“Standing on the shoulders of giants” (Everyday researchers are giants too)

Note to Reader. The next slides are not intended to be anti-grid. Everyone working on Grid is doing great work.

The Grid Problem
Everyday researchers doing everyday research BUT heroic Grid infrastructure not being adopted
A data-centric perspective, like researchers BUT Grid gives APIs to computation not data
Collaborative and participatory BUT Grid has deeply rooted service provider mindset
Better not Perfect BUT Grid aims to provide well-engineered perfect solution
Giving autonomy to researchers BUT Grid has feel of institutional control (at this time)
About pervasive computing BUT Grid is about portals, not the next generation of users

[Figure: The Arrow Problem. e-Science Pipeline (Malcolm Atkinson). Labels: CS Research; EE Research; Applications Research; e-Science Technology Creators & Integrators; e-Science bespoke tailoring; Mass Use by Researchers; Socio-economic & Commercial Innovation; 5 years; 10s of integrators; 100s of embedded consultants; 1000s of research users]
NB This isn’t wrong!

Don’t think rollout of technologies...
Think roll-in of researchers...
Mass Use by Researchers
Knowledge co-production vs Service Delivery!

[Figure labels: Web Services; RESTful APIs; cmd lines; ssh; http; Web Browser; Mobile phone; iPod; Car; Equipment; PDA; P2P; mashups; workflows; services; applications; Subject ICT experts; Computer Scientists; Software Companies; Workflow tools; Ruby on Rails ecosystem; Scientists; open source; Software Engineers; nesc; OeRC]

For a flourishing ecosystem...
It’s about empowerment as well as provision
People power – the new instrument of scale!
Hence usability:
– Simple/familiar interfaces for users
– Simple/familiar interfaces for developers
– No need for a summer school!
Step into user space and look back
Computer Scientists as facilitators and problem solvers(?)

But what about Web 2.0?!
Wikis
Mashups
REST APIs
Google Maps
Technologies:
– AJAX, JSON, Ruby on Rails,...
Social networking
Web as a distributed application platform
– Amazon S3 and EC2

Web 2.0 patterns
The Long Tail
Data is the Next Intel Inside
Users add value
Network effects by default
Some Rights Reserved
The Perpetual Beta
Cooperate, don’t Control
Software above the level of the single device

Signs of the Times
1. Everyday researchers doing everyday research
2. A data-centric perspective, like researchers
3. Collaborative and participatory
4. Benefitting from the scale of digital science activity
5. Increasingly open
6. Better not Perfect
7. Empowering researchers
8. About pervasive computing

use Web 2.0 here? Grid

use Web 2.0 here Grid cloud HPC

Service-Oriented Knowledge Utility
A utility is a directly and immediately useable service with established functionality, performance and dependability, illustrating the emphasis on user needs and issues such as trust
Services are knowledge-assisted (‘semantic’) to facilitate automation and advanced functionality, the knowledge aspect reinforced by the emphasis on delivering high level services to the user
The architecture comprises services which may be instantiated and assembled dynamically, hence the structure, behaviour and location of software is changing at run-time
semanticgrid.org/NGG3

If you peel back the label and it says “Grid” or “OGSA” underneath… it's not a cloud.
If you need to send a 40 page requirements document to the vendor then… it is not cloud.
If you can’t buy it on your personal credit card… it is not a cloud.
If they are trying to sell you hardware… it's not a cloud.
If there is no API… it's not a cloud.
If you need to rearchitect your systems for it… it's not a cloud.
If it takes more than ten minutes to provision… it's not a cloud.
If you can’t deprovision in less than ten minutes… it's not a cloud.
If you know where the machines are… it's not a cloud.
If there is a consultant in the room… it's not a cloud.
If you need to specify the number of machines you want upfront… it's not a cloud.
If it only runs one operating system… it's not a cloud.
If you can’t connect to it from your own machine… it's not a cloud.
If you need to install software to use it… it's not a cloud.
If you own all the hardware… it's not a cloud.
James Governor

“Multicore chips will offer so much performance that we need not cobble together heterogeneous resources but rather can deploy simple powerful systems”
Geoffrey Fox

Myths
Web 2.0 is not high performance
– It improves the performance of science and people!
Web 2.0 is not a properly engineered solution
– Scientists want better, not perfect. And agility.
Web 2.0 is not secure
– People do lots of “secure” things on the Web
Web 2.0 is a fad that will pass
– It’s inevitable and it’s already happened!
Web 2.0 works for teenagers but it won’t for scientists
– See OpenWetWare
Web 2.0 lets the oiks in and this is a bad thing
– Now we can do peer review even better!

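[Figure: two groups of N connected point to point: N² connections]

[Figure: two groups of N connected through one middleware: 2N connections]

[Figure: two groups (N1, N2) connected through several middleware (M): a polynomial involving N1, N2 and M]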
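
The arithmetic behind the N² and 2N figures above can be sketched in a few lines. This is an illustration only, not something from the slides: the function names `direct_links` and `middleware_links` are hypothetical, and it assumes N1 parties on one side and N2 on the other, with every pairing needing its own connection in the point-to-point case. The several-middleware case (M) is left out because the slide gives no formula for it.

```python
def direct_links(n1: int, n2: int) -> int:
    """Connections needed when every party on one side talks directly
    to every party on the other side (assumed point-to-point model)."""
    return n1 * n2


def middleware_links(n1: int, n2: int) -> int:
    """Connections needed when every party connects once to a single
    shared middleware (assumed hub model)."""
    return n1 + n2


# Compare the two models for equal-sized groups (N1 = N2 = N).
for n in (2, 10, 100):
    print(n, direct_links(n, n), middleware_links(n, n))
```

With N1 = N2 = N the first function gives N² and the second 2N, so the gap grows quickly as more parties join.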

Workflows are the new rock and roll
Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources
The era of Service Oriented Applications
Repetitive and mundane boring stuff made easier
E. Science laboris
Carole Goble

Recycling, Reuse, Repurposing
Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle.
Paul meets Jo. Jo is investigating Whipworm in mouse.
Jo reuses one of Paul’s workflows without change.
Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite.
Previously a manual two year study by Jo had failed to do this.
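
The reuse story above can be illustrated with a toy sketch in which a workflow is an ordered chain of steps that runs unchanged on different input data. This is not Taverna's API or Paul's actual workflow: `make_workflow`, the three step functions and the identifiers in the sample inputs are all hypothetical placeholders, and real workflow steps would typically call remote services rather than local functions.

```python
from typing import Callable, Iterable, List

# A step takes a list of identifiers and returns a new list.
Step = Callable[[List[str]], List[str]]


def make_workflow(steps: Iterable[Step]) -> Step:
    """Compose steps into one callable that runs them in order."""
    step_list = list(steps)

    def run(data: List[str]) -> List[str]:
        for step in step_list:
            data = step(data)
        return data

    return run


# Hypothetical local stand-ins for what would be service calls.
def normalise(ids: List[str]) -> List[str]:
    return [i.strip().upper() for i in ids]


def drop_empty(ids: List[str]) -> List[str]:
    return [i for i in ids if i]


def deduplicate(ids: List[str]) -> List[str]:
    return sorted(set(ids))


# One workflow definition, written once...
pathway_workflow = make_workflow([normalise, drop_empty, deduplicate])

# ...and reused without change on two different (made-up) datasets.
cattle_result = pathway_workflow([" abc1", "xyz2 ", "abc1"])
mouse_result = pathway_workflow(["il4", "", "Il4", "tnf"])
print(cattle_result, mouse_result)
```

The point of the sketch is only that the workflow definition is separate from the data it runs on, which is what makes sharing and reuse possible.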

Taverna downloads per day taverna.sourceforge.net

The Superclient
Run on your laptop – no sysadmin required
Access independent third party world-wide service providers of applications, tools and datasets
– 850 databases, 166 web servers (Nucleic Acids Research, Jan 2006)
My local applications, tools and datasets. In the Enterprise. In the laboratory.
Easily incorporate new services without coding

Kepler Triana BPEL Ptolemy II

myExperiment.org is...
“Facebook for Scientists”... but different to Facebook!
A community social network
A gateway to other publishing environments
A federated repository
A platform for launching workflows
Publishing self-describing Encapsulated myExperiment Objects
Mindful publication
Started March 2007
Closed beta since July 2007
Open beta November 2007

Google Gadget

Ownership and Attribution

24/5/2007 | myExperiment | Slide 46

[Figure: Snapshot map of resources with their relationships and versions. Labels: users; descriptions; groups; friendships; tags; Enactor; blobs; workflows; HTML; XML]

[Figure: The social process of science 2.0. Labels: scientists; experimentation; Data, Metadata, Provenance, Workflows, Ontologies; Certified Experimental Results & Analyses; Preprints & Metadata; Peer-Reviewed Journal & Conference Papers; Technical Reports; Reprints; Local Web Repositories; Digital Libraries; Virtual Learning Environment; Graduate Students; Undergraduate Students]

Take Homes 2.0
e-Research is about doing new research
Grid is just one part of the solution
Users are not just consumers of infrastructure. Empower them.
Web 2.0 is a set of design patterns
Think Web 2.0 coupling Grid and other services
Workflows make e-Science easier, and Web 2 makes workflows easier

Contact: David De Roure, Carole Goble
Thanks: Malcolm Atkinson, Geoffrey Fox, Jeremy Frey, Savas Parastatidis, The myGrid Family

Provenance Harvesting myExperiment metadata bus ORE RDF Store Encapsulated myExperiment Object (EMO) Metadata

OAI-ORE: Object Reuse and Exchange
ReM = Resource Map, A = Aggregation, AR = Aggregated Resource

Anatomy of an EMO
EMO Metadata: creator, modified, rights
URIs into myExperiment(s) with types and comments: workflow, data, description
URIs to external resources, with alternates, types, comments, versions
Optional annotations of URIs and their relationships
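
The parts listed in the EMO anatomy above could be modelled roughly as follows. This is a hedged sketch of the structure only, not myExperiment's actual schema or an OAI-ORE serialisation: the class names `ResourceRef`, `Annotation` and `EMO`, every field name, and the example.org URIs are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResourceRef:
    """A URI aggregated by the EMO, with its type and comment."""
    uri: str
    kind: str                      # e.g. "workflow", "data", "description"
    comment: str = ""
    alternates: List[str] = field(default_factory=list)  # external refs only
    version: Optional[str] = None                        # external refs only


@dataclass
class Annotation:
    """Optional annotation of a URI, or of a relationship between two URIs."""
    subject_uri: str
    text: str
    object_uri: Optional[str] = None  # set when annotating a relationship


@dataclass
class EMO:
    """Hypothetical in-memory shape of an Encapsulated myExperiment Object."""
    creator: str
    modified: str
    rights: str
    internal: List[ResourceRef] = field(default_factory=list)  # URIs into myExperiment
    external: List[ResourceRef] = field(default_factory=list)  # URIs to external resources
    annotations: List[Annotation] = field(default_factory=list)

    def aggregated_uris(self) -> List[str]:
        """All URIs the object aggregates, internal first."""
        return [ref.uri for ref in self.internal + self.external]


# Made-up example values, for illustration only.
emo = EMO(
    creator="https://example.org/users/1",
    modified="2007-11-23",
    rights="some rights reserved",
    internal=[ResourceRef("https://example.org/workflows/1", "workflow")],
    external=[ResourceRef("https://example.org/data/1", "data", version="2")],
)
print(emo.aggregated_uris())
```

In OAI-ORE terms, the EMO plays the role of the aggregation and each `ResourceRef` an aggregated resource; how these are actually serialised in a Resource Map is not shown on the slide and is not attempted here.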

Linked Data

RESEARCH DAILY
TAVERNA FUNCTIONAL LANGUAGE SHOCK!
British Scientists revealed today that Taverna is in fact a functional language. In a police statement, Taverna creator Tom Oinn said “it’s a fair cop guv”...
Advertisement: New Improved Closurize and Concentrate™. Add Lambda Calculus to your Lambda Network! Satisfaction guaranteed in several different colours

Quality Workflows
Paolo Missier
[Figure labels: Original workflow; High-level design of quality filter; Declarative specification; Compilation to quality workflow; New quality filter; Integration; Quality-aware workflow]
Declarative spec is formal (XML)
Compilation is automated
QW follows predictable pattern → integration also automated

Malcolm Atkinson