Published by Curtis Leeke. Modified over 10 years ago.
1
Cloud Computing for e-Science with CARMEN Paul Watson Newcastle University
2
e-Science: “e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it” (John Taylor, former Director General of the UK Research Councils)
3
Two Strands to talk...
4
Research Challenge: Understanding the brain is the greatest informatics challenge. Enormous implications for science: –Medicine –Biology –Computer Science
5
Collecting the Evidence 100,000 neuroscientists generate huge quantities of data – molecular (genomic/proteomic) – neurophysiological (time-series activity) – anatomical (spatial) – behavioural
6
Neuroinformatics Problems. Data is: –expensive to collect but rarely shared –in proprietary formats & locally described. The result is: –a shortage of analysis techniques that can be applied across neuronal systems –limited interaction between research centres with complementary expertise
7
Data in Science: Bowker’s “Standard Scientific Model” 1. Collect data 2. Publish papers 3. Gradually lose the original data (The New Knowledge Economy & Science & Technology Policy, G.C. Bowker) Problems: –papers often draw conclusions from data that is not published –inability to replicate experiments –data cannot be re-used
8
Codes in Science: Three stages for codes 1. Write code and apply to data 2. Publish papers 3. Gradually lose the original codes Problems: –papers often draw conclusions from codes that are not published –inability to replicate experiments –codes cannot be re-used
9
CARMEN enables sharing and collaborative exploitation of data, analysis code and expertise that are not physically collocated
10
CARMEN Project: UK EPSRC e-Science Pilot, £5M (2006-10), 20 Investigators. Sites: Stirling, St. Andrews, Newcastle, York, Sheffield, Cambridge, Imperial, Plymouth, Warwick, Leicester, Manchester
11
CARMEN Consortium. Newcastle: Colin Ingram, Paul Watson, Stuart Baker, Marcus Kaiser, Phil Lord, Evelyne Sernagor, Tom Smulders, Miles Whittington. York: Jim Austin, Tom Jackson. Stirling: Leslie Smith. Plymouth: Roman Borisyuk. Cambridge: Stephen Eglen. Warwick: Jianfeng Feng. Sheffield: Kevin Gurney, Paul Overton. Manchester: Stefano Panzeri. Leicester: Rodrigo Quian Quiroga. Imperial: Simon Schultz. St. Andrews: Anne Smith.
12
Industry & Associates
13
Focus on Neural Activity: cracking the neural code. Raw voltage signal data (figure traces: neurone 1, neurone 2, neurone 3), typically collected using single or multi-electrode array recording.
14
Epilepsy Exemplar Data analysis guides surgeon removing brain tissue WARNING! The next 2 Slides show an exposed brain
15
Epilepsy Exemplar: Data analysis guides surgeon removing brain tissue. –Recording from removed tissue (up to 20 GB/h) –On-line analysis by distributed collaborators will enable experiment to be defined during data collection –Repository will enable integration of rare case types from different labs –Advances in Treatment
16
e-Science Requirements Summary Sharing –data –code Capacity –vast data storage (100TB+ in CARMEN) –support data intensive analysis
17
CARMEN Cloud Architecture Data storage and analysis User access over Internet (typically via browser) Users upload data & services Users run analyses
18
e-Science Cloud Services Amazon (& Google) offer cloud computing –Basic storage & compute services –e.g. Amazon S3 & EC2 e-Science needs a set of higher-level services to support user needs Which services?....
19
CARMEN Cloud (CAIRN) Search for Data & Analysis Code Raw & Derived Data Store Structured Metadata Store Enabling Search & Annotation Analysis Code Store
20
Dynasoar: Code Repository and Deployment –long term storage. Code factored as Web Services –Standard (WS-I) interface –Internals not important: Java, MATLAB, C, C#, C++, ... Deployers for a variety of service types –.war files (Tomcat), Virtual Machines (VMware, Virtual PC), .NET assemblies, database stored procedures
21
Dynasoar: Dynamic Deployment. A request to s4. The deployed service remains in place and can be re-used, unlike job scheduling.
22
Dynasoar: A request for s2 is routed to an existing deployment of the service
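The routing behaviour on the two Dynasoar slides can be sketched roughly as follows. This is a minimal illustration, not Dynasoar's actual API: the `Host` and `Broker` classes, the least-loaded placement rule, and the node and file names are all assumptions made for the example. Only the idea comes from the slides: deploy a service on first request, leave it in place, and route later requests to the existing deployment.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A compute node that can hold deployed services (hypothetical)."""
    name: str
    deployed: set = field(default_factory=set)


class Broker:
    """Routes requests to hosts, deploying a service on first use (hypothetical)."""

    def __init__(self, hosts, repository):
        self.hosts = hosts              # available compute nodes
        self.repository = repository    # service name -> deployable artefact
        self.deployments = 0            # number of deploy operations performed

    def route(self, service):
        # Prefer a host that already has the service deployed.
        for host in self.hosts:
            if service in host.deployed:
                return host
        # Otherwise fetch the code from the repository and deploy it.
        # Placement on the host with the fewest services is an assumption.
        if service not in self.repository:
            raise KeyError(f"unknown service: {service}")
        target = min(self.hosts, key=lambda h: len(h.deployed))
        target.deployed.add(service)    # the deployment stays in place
        self.deployments += 1
        return target


broker = Broker(
    [Host("node-a"), Host("node-b")],
    {"s2": "s2.war", "s4": "s4.war"},
)
first = broker.route("s4")    # not deployed anywhere yet: triggers a deployment
second = broker.route("s4")   # routed to the existing deployment, no redeploy
```

In contrast to a job scheduler, which would ship the code with every job, the second call here costs no deployment at all.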
23
Performance Gains
24
Scalability
25
CARMEN Cloud (CAIRN) Search for Data & Analysis Code Raw Signal Data Search & Visualisation Enactment of scientific analysis processes Raw & Derived Data Store Security Policies Controlling Access to Data & Code Structured Metadata Store Enabling Search & Annotation Analysis Code Store
26
Controlled Sharing. Scientist: “Only I am allowed to see this data”, then “My collaborators can now see it”, then “Everyone can see it”
27
Security Solution XACML – standard way to encode rules as (subject, action, resource) triples Rules checked on each access
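A rule check of this kind might look like the sketch below. It is a deliberately simplified stand-in for XACML, not the real policy language: the `Rule` tuple, the `"*"` wildcard, the first-match-wins evaluation with default deny, and the subject and resource names are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    subject: str    # who the rule applies to ("*" matches anyone)
    action: str     # e.g. "read"
    resource: str   # e.g. a dataset identifier
    effect: str     # "Permit" or "Deny"


def matches(pattern: str, value: str) -> bool:
    """A pattern matches if it is the wildcard or equals the value."""
    return pattern == "*" or pattern == value


def is_permitted(rules, subject, action, resource) -> bool:
    """Check a request on access: first matching rule wins, default is deny."""
    for rule in rules:
        if (matches(rule.subject, subject)
                and matches(rule.action, action)
                and matches(rule.resource, resource)):
            return rule.effect == "Permit"
    return False


# "Only I am allowed to see this data"
rules = [
    Rule("scientist", "read", "recording-1", "Permit"),
    Rule("*", "read", "recording-1", "Deny"),
]
before = is_permitted(rules, "collaborator", "read", "recording-1")

# "My collaborators can now see it": add a Permit ahead of the blanket Deny.
rules.insert(1, Rule("collaborator", "read", "recording-1", "Permit"))
after = is_permitted(rules, "collaborator", "read", "recording-1")
```

Because the check runs on every access, widening or narrowing sharing only requires editing the rule list.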
28
Controlled Sharing - conflicts. Scientist: “Only I am allowed to see this data”, then “My collaborators can now see it”. Funder: “All data must be accessible to everyone after the end of the project”
29
Addressing Conflicts Each party expresses policy as XACML rules Rules are converted to formal language –XACML -> VDM++ Run formal model to detect conflicts
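To make the idea of a policy conflict concrete, here is a brute-force pairwise check. It is not the XACML to VDM++ translation named on the slide; it swaps in a much simpler technique and ignores time conditions such as "after the end of the project". The tuple layout `(subject, action, resource, effect)`, the `"*"` wildcard, and the example policies are assumptions.

```python
from itertools import product


def overlaps(a: str, b: str) -> bool:
    """Two fields can apply to the same request if either is a wildcard or they are equal."""
    return a == "*" or b == "*" or a == b


def find_conflicts(policy_a, policy_b):
    """Return rule pairs that target an overlapping request but disagree on the effect."""
    conflicts = []
    for rule_a, rule_b in product(policy_a, policy_b):
        same_target = all(overlaps(x, y) for x, y in zip(rule_a[:3], rule_b[:3]))
        if same_target and rule_a[3] != rule_b[3]:
            conflicts.append((rule_a, rule_b))
    return conflicts


# Each party expresses its policy as (subject, action, resource, effect) rules.
scientist_policy = [
    ("scientist", "read", "dataset-1", "Permit"),
    ("*", "read", "dataset-1", "Deny"),
]
funder_policy = [
    ("*", "read", "*", "Permit"),
]

conflicts = find_conflicts(scientist_policy, funder_policy)
```

Here the scientist's blanket Deny and the funder's blanket Permit both cover a stranger reading `dataset-1`, so that pair is reported.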
30
CARMEN CAIRN [diagram of reused technologies] OMII: Grimoire; DAME: Signal Data Explorer; OMII/myGrid: Taverna; OGSA-DAI, SRB, DAME; Gold: Role & Task based Security; myGrid & CISBAN: Dynasoar
31
Using CARMEN for a typical scenario 1. Data Collection from a Multi-Electrode Array 2. Data Visualisation and Exploration 3. Spike Detection 4. Spike Sorting 5. Analysis 6. Visualisation of Analysis Results. Currently, this is a semi-manual process. CARMEN has automated this...
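As an illustration of step 3 in the scenario above, spike detection is often done by thresholding the voltage trace. The sketch below is one common approach, not necessarily the one used by CARMEN's services: the function name, the multiplier `k`, and the toy signal are assumptions. The noise level is estimated from the median absolute deviation so that the spikes themselves do not inflate the threshold.

```python
from statistics import median


def detect_spikes(signal, k=5.0):
    """Return the sample indices at which the signal rises above the threshold."""
    # Robust noise estimate: median absolute deviation, scaled by 0.6745
    # so that it approximates the standard deviation of Gaussian noise.
    centre = median(signal)
    sigma = median([abs(x - centre) for x in signal]) / 0.6745
    threshold = centre + k * sigma

    spikes = []
    above = False
    for i, x in enumerate(signal):
        # Record only the upward crossing, not every sample above threshold.
        if x > threshold and not above:
            spikes.append(i)
        above = x > threshold
    return spikes


# Toy single-channel trace: low-amplitude noise with two large deflections.
trace = [0.1, -0.2, 0.0, 0.2, -0.1, 8.0, 7.5, 0.1, -0.1, 0.0, 9.0, 0.2]
spike_indices = detect_spikes(trace)
```

The detected indices would then feed the spike sorting step, which assigns each spike to a putative neurone.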
32
Web Portal
33
Raw Data Exploration with Signal Data Explorer
34
Defining the process with Workflow
35
Running a Workflow
36
Running the Workflow [architecture diagram; labels: External Client, Security, Workflow Engine, TAVERNA, Registry, Available Services, Query, Repository, SRB, FileSystem, RDBMS, INPUT Data, OUTPUT Metadata, Dynamically Deployed Services in Dynasoar, Spike Sorting Service, Reporting]
37
Graphical Output
38
Movie Output
39
CARMEN (www.carmen.org.uk): –is delivering an e-Science infrastructure that can be applied across a diverse range of applications –uses a Cloud/Software as a Service architecture –enables cooperation and interdisciplinary working –aims to deliver new results in neuroscience, computer science and medicine