Presentation is loading. Please wait.

Presentation is loading. Please wait.

E-Science and Cyberinfrastructure Tony Hey Corporate VP for Technical Computing Microsoft Corporation.

Similar presentations


Presentation on theme: "E-Science and Cyberinfrastructure Tony Hey Corporate VP for Technical Computing Microsoft Corporation."— Presentation transcript:

1 e-Science and Cyberinfrastructure Tony Hey Corporate VP for Technical Computing Microsoft Corporation

2 Licklider’s Vision “Lick had this concept – all of the stuff linked together throughout the world, that you can use a remote computer, get data from a remote computer, or use lots of computers in your job” “Lick had this concept – all of the stuff linked together throughout the world, that you can use a remote computer, get data from a remote computer, or use lots of computers in your job” Larry Roberts – Principal Architect of the ARPANET

3 A New Science Paradigm  Thousand years ago: Experimental Science - description of natural phenomena - description of natural phenomena  Last few hundred years: Theoretical Science - Newton’s Laws, Maxwell’s Equations … - Newton’s Laws, Maxwell’s Equations …  Last few decades: Computational Science - simulation of complex phenomena - simulation of complex phenomena  Today: e-Science or Data-centric Science - unify theory, experiment, and simulation - unify theory, experiment, and simulation - using data exploration and data mining - using data exploration and data mining Data captured by instruments Data captured by instruments Data generated by simulations Data generated by simulations Data generated by sensor networks Data generated by sensor networks  Scientist analyzes databases/files (With thanks to Jim Gray)

4 e-Science  e-Science is about data-driven, multidisciplinary science and the technologies to support such distributed, collaborative scientific research  Many areas of science are now being overwhelmed by a ‘data deluge’ from new high-throughput devices, sensor networks, satellite surveys …  Areas such as bioinformatics, genomics, drug design, engineering and healthcare require collaboration between different domain experts  ‘e-Science’ is a shorthand for a set of technologies to support collaborative networked science  HPC and Information Management are key technologies to support this e-Science revolution

5 http://www.neptune.washington.edu/

6 Undersea Sensor Network Connected & Controllable Over the Internet

7 Visual Programming Persistent Distributed Storage

8 Distributed Computation Interoperability & Legacy Support via Web Services

9 Live Documents Searching & Visualization Reputation & Influence

10 Two examples of e-Science  An Astronomy DataGrid – The IVO  Chemistry – The Comb-e-Chem Project

11 IVO: An Astronomy Data Grid  Working to build world-wide telescope  All astronomy data and literature  online and cross indexed  Tools to analyze it  Built SkyServer.SDSS.org  Built Analysis system  MyDB  CasJobs (batch job)  OpenSkyQuery Federation of ~20 observatories.  Results:  It works and is used every day  Spatial extensions in SQL 2005  A good example of Data Grid  A good example of Web Services

12 The Multiwavelength Crab Nebulae X-ray, optical, infrared, and radio views of the nearby Crab Nebula, which is now in a state of chaotic expansion after a supernova explosion first sighted in 1054 A.D. by Chinese Astronomers. Slide courtesy of Robert Brunner @ CalTech. Crab star 1053 AD

13 The Comb-e-Chem Project National X-Ray Service Data Mining and Analysis Automatic Annotation Combinatorial Chemistry Wet Lab HPC Simulation Video Data Stream Diffractometer Middleware Structures Database

14 Crystallographic e-Prints Direct Access to Raw Data from scientific papers Raw data sets can be very large - stored at UK National Datastore using SRB software

15 Cyberinfrastructure  Cyberinfrastructure and e-Infrastructure  In the US, Europe and Asia there is a common vision for the ‘cyberinfrastructure’ required to support the e-Science revolution  Set of Middleware Services supported on top of high bandwidth academic research networks  Software, hardware and organizations that support e-Science  Similar to vision of the Grid as a set of services that allows scientists – and industry – to routinely set up ‘Virtual Organizations’ for their research – or business  The ‘Microsoft Grid’ vision is as much about integrating and managing data and information than about compute cycles

16 Grids for Virtual Organizations

17 Service-Orientation for building Distributed Systems

18 Progress in Grid Standards?  The GGF/EGA merger gives great opportunity for the new Open Grid Forum (OGF) to standardize a small set of basic Grid services based on generally accepted Web Services  Harness the power of the world-wide Grid community to develop robust open source reference implementations  Grid research community needs to propose and explore new features in real experiments  OGF can reassure industry about progress in Grid standards and grow the market for all

19 Scholarly Communication  Global Movement towards permitting ‘Open Access’ to scholarly publications  Libraries can no longer afford publisher subscriptions  Principle that results of publicly funded research should be available to all  Mandates for Open Access  US Proposal – Cornyn-Lieberman Bill  Supported by most top US research universities  EU Proposals  UK, France and German initiatives

20 Technical Computing at Microsoft  Advanced Computing for Science and Engineering  Application of new algorithms, tools and technologies to scientific and engineering problems  High Performance Computing  Application of high performance clusters and database technologies to industrial and scientific applications  Radical Computing  Research in potential breakthrough technologies

21 Fighting HIV with Computer Science Nebojsa Jojic and David Heckerman  A major problem: Over 40 million infected  Drug treatments are effective but are an expensive life commitment  Vaccine needed for third world countries  Effective vaccine could eradicate disease  Methods from computer science are helping with the design of vaccine  Machine learning: Finding biological patterns that may stimulate the immune system to fight the HIV virus  Optimization methods: Compressing these patterns into a small, effective vaccine

22 Developed Set of Specialist Tools  Chromatogram deconvolution  Pathway analysis/association/causal models  Clustering/Trees (phylo, haplotypes etc.)  Protein binding and folding  Sequence diversity models (epitomes)  Image analysis/classification  Evolution modeling and inference  Epitope prediction

23 Top 500 Architectures / Systems

24 Kyril Faenov, Director of HPC Microsoft Corporation http://www.microsoft.com/hpc

25 Windows Compute Cluster Server 2003 Faster time-to-insight through simplified cluster deployment, job submission and status monitoring Better integration with existing Windows infrastructure allowing customers to leverage existing technology and skill-sets Familiar development environment allows developers to write parallel applications from within the powerful Visual Studio IDE

26 Radical Computing  Quantum Computing: Project Q  Multidisciplinary Institute at UCSB with Director Michael Freedman from MSR  Looking at novel material physics and use of topology to protect quantum coherence  Exploration of Many Core architectures, applications and languages  Burton Smith leading Many Core architecture activity  Working with the HPC and CS communities to explore impact of Many Core on research applications

27 10,0001,000100101 ‘70‘80‘90‘00‘10 Power Density (W/cm 2 ) 4004 8008 8080 8085 8086 286 386 486 Pentium ® Hot Plate Nuclear Reactor Rocket Nozzle Sun’s Surface Intel Developer Forum, Spring 2004 - Pat Gelsinger

28 The Future is Parallel: The End of Moore’s Law as We Know It  Future of silicon chips  “100’s of cores on a chip in 2015” (Justin Rattner, Intel) (Justin Rattner, Intel)  Challenge for IT industry and Computer Science community  Can we make parallel computing on a chip easier than message-passing?  Challenge for the Scientific Community  How will the Multi-Core transition affect scientific computing?

29 Summary Microsoft wishes to work with the university research and library communities to: Microsoft wishes to work with the university research and library communities to: develop interoperable high-level services, work flows, tools and data services develop interoperable high-level services, work flows, tools and data services accelerate progress in a small number of societally important scientific applications accelerate progress in a small number of societally important scientific applications assist in the development of interoperable repositories and new models of scholarly publishing assist in the development of interoperable repositories and new models of scholarly publishing explore radical new directions in computing and ways and applications to exploit on-chip parallelism explore radical new directions in computing and ways and applications to exploit on-chip parallelism  How can Microsoft best collaborate with the scientific community?

30 © 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.


Download ppt "E-Science and Cyberinfrastructure Tony Hey Corporate VP for Technical Computing Microsoft Corporation."

Similar presentations


Ads by Google