Published by Christiana Cross. Modified over 9 years ago.
1
4th Annual EPSRC e-science meeting
The need for e-Science: An industrial perspective
Stephen Calvert – VP Cheminformatics, GSK
Yike Guo – Imperial College
2
What is the “industrial” world like?
Historically
– Low volume: 30-50 cmpds/yr/chemist; 10,000s assay wells/yr
– Low information diversity: scientists generally dealt with limited types of data
– Reductionist approach: limited information per experiment
– Interpretation critical for next step
Scientists required:
– simple systems to assist in information monitoring
– decision making resides with the scientist
3
What is the “industrial” world like?
What happened in the last 5 years?
– “Industrialisation”: application of “principles of industrialisation” to drug discovery
– High volume: 10,000 cmpd/yr/chemist; 100+ million wells/yr
– Biology revolution: Human genome
– “System biology”: holistic view and interpretation
– High content data: images
– Multiple result types from each experiment: bio-markers, pathways
– Knowledge integration: scientific discipline integration
Scientists required:
– complex systems, algorithms, statistics, ...
– decision making shared between systems and scientists
“Informatics” essential: partnership, not service
4
How have we (IT) tackled the transition?
Business as usual
– problem-centric view
– build applications
– integrate applications
Educate scientists in the realms of IT
– “Now I need to be an IT expert alongside chemistry, biology, genetics, robotics, engineering ...”
– interesting time scale: generations
Technology is our saviour!
– client server, web services, Java, C#, CORBA, OO programming, extreme programming, grid computing, ...
5
What are the results?
“Islands” of process & data
– complex integration problem
“Spaghetti” joins our worlds
– unsustainable
– cost
Control with “IT”
– mismatch in cycle time to change
– engineered out serendipity
– service role reversed
[Diagram labels: chemistry; infrastructure; samples; screening; “library” design; data]
6
How could we do it differently?
Result in:
– handing control of science back to the scientist
– match cycle times to change
– simplify
How can we merge the 2 worlds?
– physical, information
7
Doodling in knowledge and experiment space
– no predefined steps
– capture what was done
– don’t restrict what can be done
– don’t restrict the non-obvious
[Diagram labels: Information Resources; IC50 Assay; Exclusion Lists; Structure Validation; Other Assay...]
Q: are these results real?
Q: what do I know about these compounds?
Q: what other data can I acquire?
This is workflow, isn’t it?
Physical & information worlds merge
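One way to picture "no predefined steps" together with "capture what was done" is a session object that lets the scientist apply any function at any point and records each application as it happens. The following is only a minimal sketch under that assumption: `Session`, `Step`, `ic50_assay` and `exclusion_list` are hypothetical names, the compound values are made up for illustration, and nothing here is taken from the system described in the talk.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Step:
    """One recorded action: what was run, on what, and what came back."""
    name: str
    inputs: tuple
    output: Any


@dataclass
class Session:
    """Hypothetical notebook that records ad hoc steps in the order taken."""
    history: List[Step] = field(default_factory=list)

    def run(self, func: Callable[..., Any], *inputs: Any) -> Any:
        # No predefined pipeline: any callable can be applied at any time.
        output = func(*inputs)
        # Capture what was done so the path can be reviewed later.
        self.history.append(Step(func.__name__, inputs, output))
        return output

    def trail(self) -> List[str]:
        """Names of the steps taken so far, in order."""
        return [step.name for step in self.history]


# Illustrative stand-ins for boxes on the slide; thresholds are invented.
def ic50_assay(compounds: dict) -> dict:
    return {c: v for c, v in compounds.items() if v < 1.0}


def exclusion_list(hits: dict, excluded: set) -> dict:
    return {c: v for c, v in hits.items() if c not in excluded}


session = Session()
hits = session.run(ic50_assay, {"cmpd-1": 0.2, "cmpd-2": 5.0, "cmpd-3": 0.7})
kept = session.run(exclusion_list, hits, {"cmpd-3"})
print(session.trail())
print(kept)
```

The recorded trail is what would let a colleague see how a decision was reached, without the tool dictating the order of steps in advance.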
8
Doodling in knowledge & experiment space
Need access to world-class scientific algorithms and tools
Need access to disparate data sources from multiple locations
Intuitive & flexible GUI design/analysis
Framework needs to be very generic
Ability to construct a “just-in-time” application
Need to serve the requirements of a varied user community
– both in terms of scientific and technical know-how
Capture and dissemination of “best practice” within a creative environment to enhance efficiency company-wide
9
Discovery Net Overview
Funding:
– One of the Eight UK National e-Science Projects (£2.4M)
Key Features:
– Allow Scientists to Construct, Share and Execute Complex Knowledge Discovery Processes & Services
– Allow Institutions to Manage and Utilise the Compositional Services as their Intellectual Property
Applications:
– Life Science
– Environmental Modelling
– Geo-hazard Prediction
Achievement:
– For the First Time, Discovery Net Realises the Dynamic Construction of Compositional Services on GRID for Real Time Knowledge Discovery and Decision Making
Goal: Constructing the World’s First Infrastructure for Global Knowledge Discovery on the Grid of Web Services
[Diagram labels: Using GRID Resources; Scientific Information; Scientific Discovery; In Real Time; Literature; Databases; Operational Data; Images; Instrument Data; Real Time Data Integration; Dynamic Application Integration; Discovery Services; Process Knowledge Management; Workflow = Compositional Service]
10
Enterprise Wide Integrative Scientific Decision Making Platform with Discovery Net Workflow
Constructing a ubiquitous workflow: by scientists
– Integrate information resources/software applications cross-domain
– Support innovation and capture the best practice of your scientific research
Warehousing workflows: for scientists
– Manage discovery processes within an organisation
– Construct an enterprise process knowledge bank
Deployment workflow: to scientists
– Turn workflows into reusable applications/services
– Turn every scientist into a solution builder
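The construct, warehouse and deploy cycle above rests on one idea: a workflow built from services is itself a service, so it can be stored under a name and reused as a step elsewhere. The sketch below illustrates that idea with plain Python functions. It is an assumption-laden toy, not Discovery Net's API: `compose`, `deploy`, `registry` and the three stand-in steps are all hypothetical names introduced for illustration.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

# Assumption: a "service" is modelled as a function of one argument.
Service = Callable[[Any], Any]


def compose(*steps: Service) -> Service:
    """Chain services left to right; the result is itself a service."""
    def workflow(data: Any) -> Any:
        return reduce(lambda value, step: step(value), steps, data)
    return workflow


# Hypothetical in-memory "warehouse" of named workflows.
registry: Dict[str, Service] = {}


def deploy(name: str, workflow: Service) -> None:
    """Store a workflow under a name so that others can reuse it."""
    registry[name] = workflow


# Stand-in steps; real steps would wrap data sources or analysis tools.
def parse(text: str) -> List[float]:
    return [float(x) for x in text.split(",")]


def drop_negatives(values: List[float]) -> List[float]:
    return [v for v in values if v >= 0]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


# Construct a workflow, then deploy it as a reusable service.
deploy("clean_mean", compose(parse, drop_negatives, mean))

# A deployed workflow can serve as one step of a larger workflow.
report = compose(registry["clean_mean"], lambda m: f"mean={m:.2f}")
print(report("1,-2,3"))
```

Because `compose` returns the same kind of object it accepts, workflows nest without any special casing, which is the property the "Workflow = Compositional Service" label points at.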
11
An Integrative Analysis Example: Interactive Scientific Discovery with Workflow
[Figure panels: Relational data mining; Text mining; Spectrum data mining; Chemical sequence data model; Chemical data model; Visualizing relational data clusters; Visualizing multidimensional data; Visualizing sequence data; Visualizing pathway data; Text mining visualization; Visualizing cluster statistics; Visualizing serial/spectrum data; Decision tree model of metabonomic profile; Chemical structure visualization]
12
Discovery Net Commercialisation
Discovery Net Research
– CS: Workflow for Informatics on SOA
– Sensor: Sensor Data Processing and Mining
– Application: Life, Environmental and Geo-physical Sciences
Commercialisation (Imperial College Spin Out Companies)
[Diagram labels: DeltaDot; Research; Workflow technology; HT sensor processing; KDE Informatics Platform; Label Free HT bioSensors; Life Science Industry]
13
Library design - GSK
Process of selecting the molecules I want to make from the universe of molecules
Toolbox: scientific models, chemical handling, chemical properties, data access, statistics, data visualisation, ...
Scientists can doodle in chemical space
– Capture how scientists made decisions
New algorithms, data sources added in < 1 hour
14
The 2003 SARS outbreak
KDE Example 2: SARS Genome Annotation
[Diagram labels: Relationship between SARS and other virus; Mutual regions identification; Homology search against viral genome DB; Annotation using Artemis and GenSense; Gene prediction; Phylogenetic analysis; Exon prediction; Splice site prediction; Immunogenetics; Multiple sequence alignment; Microarray analysis; Bibliographic databases; Key word search; GeneSense Ontology; D-Net: Integration, interpretation, and discovery; Epidemiological analysis; Predicted genes; SARS patients diagnosis; Homology search against protein DB; Homology search against motif DB; Protein localization site prediction; Protein interaction prediction; Relationship between SARS virus and human receptors prediction; Classification and secondary structure prediction; GenBank]
China SARS Virtual Lab based on Discovery Net
Achievement: Dynamic Construction of Compositional Services
– Rapid construction of applications via composition of existing web services using workflow
– Instant deployment of analytical workflows as new web services with resource mapping
– Integrated workflow, provenance and service management
– Collaborative construction of workflows by large numbers of researchers
Requirements:
– Rapid construction and sharing of mission-critical discovery services
– Integration of diverse bioinformatics applications
– Support for collaborative research between geographically distributed researchers
– Deployment of services as easy-to-use tools for real-time decision making
15
Compositional Services for SARS Mutation Analysis
50 data resources
> 200 software applications and services
Designed on top of the web service environment
Used by more than 200 scientists
Result published in >
16
Future Challenge: GSK-InforSense & IC e-Science Collaboration
Workflow Fusion: applying advanced performance programming technology for dynamic optimization of workflow execution
Workflow Abstraction: investigating abstraction mechanisms for building workflow hierarchies and higher-order composition forms
Dynamic Service Composition: investigating service ontology for dynamically composing services with workflow
Workflow Metadata Model: building up a generic metadata model for scientific workflow management and workflow warehousing
Man-machine interface: free scientists from IT speak
17
How can you help?
Encourage focused research on the key issues SCIENTISTS are facing in industry
Catalyse joint work in these focused fields between academics, industry and commercial software vendors
Facilitate solution-oriented communication between computer scientists and domain scientists in both academia and industry
18
e-Science
A politician’s view: ‘[The e-Science platform] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’ Tony Blair
A scientist’s view: [The e-Science platform] should help me to do my scientific research free from the complexity of IT