Virtual Observatory & Grid Technique ZHAO Yongheng (National Astronomical Observatories of China) CANS2002.


1 Virtual Observatory & Grid Technique ZHAO Yongheng (National Astronomical Observatories of China) CANS2002

2 Computational Science
The Third Science Branch is Evolving
In the beginning science was empirical.
Then theoretical branches evolved.
Now, we have computational branches.
–Has primarily been simulation
–Growth area: data analysis/visualization of peta-scale instrument data.
Analysis & Visualization tools
–Help both simulation and instruments.
–Are primitive today.

3 Computational Science
Traditional Empirical Science
–Scientist gathers data by direct observation
–Scientist analyzes data
Computational Science
–Data captured by instruments, or data generated by simulator
–Processed by software
–Placed in a database
–Scientist analyzes database
Concern: Scalability

4 Astronomy Data Growth
In the “old days” astronomers took photos.
Starting in the 1960s they began to digitize.
New instruments are digital (100s of GB/night)
Detectors are following Moore’s law.
Data avalanche: volume doubles every 2 years
Figure: total area of 3m+ telescopes in the world in m², and total number of CCD pixels in megapixels, as a function of time. Growth over 25 years is a factor of 30 in glass, 3000 in pixels.
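The growth factors quoted on this slide can be turned into doubling times. The sketch below is a rough check that assumes steady exponential growth over the 25 years; the function name `doubling_time` is chosen here for illustration and does not come from the presentation.

```python
import math


def doubling_time(growth_factor: float, years: float) -> float:
    """Years needed to double, assuming steady exponential growth."""
    return years * math.log(2) / math.log(growth_factor)


# Figures from the slide: x30 in glass and x3000 in pixels over 25 years.
glass_doubling = doubling_time(30, 25)
pixel_doubling = doubling_time(3000, 25)

print(f"collecting area doubles every {glass_doubling:.1f} years")
print(f"CCD pixels double every {pixel_doubling:.1f} years")
```

Under that assumption the pixel count doubles roughly every two years, in line with the "data avalanche" rate on the slide, while collecting area takes about five years to double.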

5 Universal Access to Astronomy Data
Astronomers have a few Petabytes now.
–1 pixel (byte) / sq arc second ~ 4TB
–Multi-spectral, temporal, … → 1PB
They mine it looking for
–new (kinds of) objects, or more of interesting ones (quasars)
–density variations in 400-D space
–correlations in 400-D space
Data doubles every 2 years.
Data is public after 2 years.
So, 50% of the data is public.
Some have private access to 5% more data.
So: 50% vs 55% access for everyone
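The "50% of the data is public" step follows from the two rates on this slide. A minimal sketch, assuming the archive grows as a smooth exponential and that everything older than the embargo period is public; `public_fraction` is a name introduced here for illustration.

```python
def public_fraction(doubling_years: float, embargo_years: float) -> float:
    """Share of all data taken so far that is older than the embargo.

    If the total volume is V(t) = V0 * 2**(t / T), the data older than
    E years is V(t - E), so the public share is 2**(-E / T).
    """
    return 2.0 ** (-embargo_years / doubling_years)


# Slide values: data doubles every 2 years, and is public after 2 years.
share = public_fraction(doubling_years=2, embargo_years=2)
print(f"public share: {share:.0%}")
```

With both periods equal to two years, exactly one doubling separates "all data" from "public data", which gives the one-half figure.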

6 The Changing Style of Observational Astronomy
The Old Way: pointed, heterogeneous observations (~ MB - GB); small samples of objects (~ 10^0 - 10^3)
Now: large, homogeneous sky surveys (multi-TB, ~ 10^6 - 10^9 sources); archives of pointed observations (~ TB)
Future: multiple, federated sky surveys and archives (~ PB); Virtual Observatory

7 Why Astronomy Data?
It has no commercial value
–No privacy concerns
–Can freely share results with others
–Great for experimenting with algorithms
It is real and well documented
–High-dimensional data (with confidence intervals)
–Spatial data
–Temporal data
Many different instruments from many different places and many different times
Federation is a goal
The questions are interesting
–How did the universe form?
There is a lot of it (petabytes)
Figure panels: IRAS 100 µm, ROSAT ~keV, DSS Optical, 2MASS 2 µm, IRAS 25 µm, NVSS 20cm, WENSS 92cm, GB 6cm

8 Figure panels: Chandra, Hubble, MMT, Sub-mm array, VLA, Antarctica sub-mm, Magellan 6.5m, Whipple γ-ray, SIRTF, Oak Ridge 1.2m CO
Virtual Observatory == World-Wide Telescope

9 Virtual Observatory
Premise: Most data is (or could be) online
So, the Internet is the world’s best telescope:
–It has data on every part of the sky
–In every measured spectral band: optical, x-ray, radio, ...
–As deep as the best instruments (2 years ago).
–It is up when you are up. The “seeing” is always great (no working at night, no clouds, no moons, no ...).
–It’s a smart telescope: links objects and data to literature on them.

10 Why is VO a Good Scientific Prospect?
Technological revolutions as the drivers/enablers of the bursts of scientific growth
Historical examples in astronomy:
–1960s: the advent of electronics and access to space: Quasars, CMBR, x-ray astronomy, pulsars, GRBs, …
–1980s - 1990s: computers, digital detectors (CCDs etc.): Galaxy formation and evolution, extrasolar planets, CMBR fluctuations, dark matter and energy, GRBs, …
–2000s and beyond: information technology: The next golden age of discovery in astronomy?
VO is the mechanism to effect this process

11 Diagram labels: Surveys, Observatories, Missions; Survey and Mission Archives; Follow-Up Telescopes and Missions; Results; Data Services (Data Mining and Analysis, Target Selection); Digital libraries; Primary Data Providers; VO; Secondary Data Providers; SDSS (USA); LAMOST (China)

12 Virtual Observatory & the Public
The universe at anyone’s fingertips
Educational activities involving real data
New discoveries made by schoolchildren
Interactive exhibits based on archived data
Astronomy as a motivator for learning about computing
Real Astronomy Experience

13 Virtual Observatory Challenges
Size: multi-Petabyte
40,000 square degrees is 2 Trillion pixels
–One band (at 1 sq arcsec): 4 Terabytes
–Multi-wavelength: 10-100 Terabytes
–Time dimension: >> 10 Petabytes
–Need auto parallelism tools
Unsolved MetaData problem
–Hard to publish data & programs
–How to federate Archives
–Hard to find/understand data & programs
Current tools inadequate
–new analysis & visualization tools
–Data Federation is problematic
Transition to the new astronomy
–Sociological issues
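The size figures on this slide can be checked with a few lines of arithmetic. This is a rough sketch: the slide does not state a pixel scale or a pixel depth, so the 0.5 arcsecond pixels and 2 bytes per pixel used below are assumptions chosen here because they reproduce the quoted 2 trillion pixels and roughly 4 Terabytes.

```python
ARCSEC_PER_DEGREE = 3600
SKY_SQ_DEGREES = 40_000  # sky area quoted on the slide


def sky_pixels(pixel_scale_arcsec: float) -> float:
    """Number of pixels needed to tile the sky at a given pixel scale."""
    sq_arcsec = SKY_SQ_DEGREES * ARCSEC_PER_DEGREE ** 2
    return sq_arcsec / pixel_scale_arcsec ** 2


def terabytes(pixels: float, bytes_per_pixel: float) -> float:
    """Storage for one band, in units of 10**12 bytes."""
    return pixels * bytes_per_pixel / 1e12


# One pixel per square arcsecond: about half a trillion pixels.
print(f"{sky_pixels(1.0):.3e} pixels at 1.0 arcsec")

# Assumed values (not from the slide): 0.5 arcsec pixels, 2 bytes each.
pixels = sky_pixels(0.5)
print(f"{pixels:.3e} pixels at 0.5 arcsec")
print(f"{terabytes(pixels, 2):.1f} TB per band")
```

At exactly one 1-byte pixel per square arcsecond the same sky comes out near 0.5 TB, so the slide's totals presumably allow for finer sampling and deeper pixels.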

14 Astronomical Strategies
PROBLEM | SOLUTION
Slow CPU growth | Distributed Computing
Limited storage | Distributed Data
Limited bandwidth | Information Hierarchies - Move only what you need
Data diversity | Interoperability
VO

15 Grids
Diagram labels: GRID MIDDLEWARE; Visualization; Supercomputer, PC-Cluster; Data-storage, Sensors, Experiments; Internet, networks; Desktop; Mobile Access
Hoffmann, Reinefeld, Putzer

16 the Virtual Observatory concept
Aim to make all archives speak the same language
–all searchable and analysable by the same tools
–all data sources accessible through a uniform interface
–all data held in distributed databases that appear as one
archives form the Digital Sky
–eventual interface to real observatories
the archive is the sky
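The idea of a uniform interface over distributed archives that "appear as one" can be sketched in a few lines. Everything here is hypothetical: the `Archive` protocol, the `search_box` method, the in-memory toy archives and their names are illustrations only, not an API from the presentation or from any VO standard.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol, Tuple


@dataclass(frozen=True)
class Source:
    archive: str
    ra_deg: float
    dec_deg: float


class Archive(Protocol):
    """Hypothetical uniform interface that every archive agrees to offer."""

    def search_box(self, ra_min: float, ra_max: float,
                   dec_min: float, dec_max: float) -> List[Source]: ...


class InMemoryArchive:
    """Toy archive holding a short list of (ra, dec) positions."""

    def __init__(self, name: str, positions: Iterable[Tuple[float, float]]):
        self._sources = [Source(name, ra, dec) for ra, dec in positions]

    def search_box(self, ra_min, ra_max, dec_min, dec_max):
        return [s for s in self._sources
                if ra_min <= s.ra_deg <= ra_max
                and dec_min <= s.dec_deg <= dec_max]


class FederatedSky:
    """Presents many archives as one by forwarding the same query to each."""

    def __init__(self, archives: Iterable[Archive]):
        self._archives = list(archives)

    def search_box(self, ra_min, ra_max, dec_min, dec_max):
        hits: List[Source] = []
        for archive in self._archives:
            hits.extend(archive.search_box(ra_min, ra_max, dec_min, dec_max))
        return hits


optical = InMemoryArchive("optical", [(10.0, 20.0), (150.0, -5.0)])
radio = InMemoryArchive("radio", [(10.1, 20.1)])
sky = FederatedSky([optical, radio])

hits = sky.search_box(9.5, 10.5, 19.5, 20.5)
print([h.archive for h in hits])
```

Because `FederatedSky` offers the same `search_box` method as a single archive, a tool written against the interface does not need to know how many archives sit behind it.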

17 the Grid concept
shared managed distributed resources
–documents + data + software + storage + cycles + expertise
network: ability to pass messages
web: transparent document system
computational grid: transparent CPU
datagrid: transparent data access and services
information grid, knowledge grid ... ?
Virtual Organisations ?
a supercomputer on your desktop
everybody can be a power user

18 Three Layer GRID Abstraction
Diagram labels: Information Grid; Knowledge Grid; Computation/Data Grid; Data to Knowledge; Control; Automation; E-Science

19 What’s needed?
Science Data & Questions: Scientists
Database to store data and execute queries: Plumbers
Data Mining Algorithms: Miners
Question & Answer, Visualization Tools

20 obstacles to overcome
sociology
internet technology
i/o bottleneck
network bottleneck

21 obstacles to overcome (1)
sociology
–need agreed formats for data, metadata, provenance
–need standardised semantics ("ontology")
internet technology
–need protocols for publishing and exchanging data
–need registry for publishing service availability and semantics
–need method of transmitting authentication/authorisation
–need methods for managing distributed resources

22 obstacles to overcome (2)
i/o bottleneck
–need database supercomputers
–need innovative search and analysis algorithms
network bottleneck
–data centers must provide analysis service
–facility class analysis code needed
shift the results not the data
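"Shift the results not the data" can be illustrated by comparing what crosses the network in each case. A minimal sketch with a made-up catalog: the record layout, the catalog size and the two function names are assumptions for illustration, and JSON stands in for whatever wire format a real data center would use.

```python
import json
import random
import statistics

random.seed(0)

# Made-up catalog held at the data center: 10,000 objects with a magnitude.
catalog = [{"id": i, "mag": random.uniform(14.0, 22.0)} for i in range(10_000)]


def download_everything() -> bytes:
    """Ship the whole catalog to the user, who then analyses it locally."""
    return json.dumps(catalog).encode()


def analyse_at_archive() -> bytes:
    """Run the analysis next to the data and ship only the summary."""
    mags = [row["mag"] for row in catalog]
    summary = {"count": len(mags), "mean_mag": statistics.fmean(mags)}
    return json.dumps(summary).encode()


raw_bytes = len(download_everything())
result_bytes = len(analyse_at_archive())

print(f"moving the data:    {raw_bytes} bytes")
print(f"moving the results: {result_bytes} bytes")
```

The gap between the two numbers grows with the catalog, while the summary stays the same size, which is the case for data centers providing an analysis service.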

23 Distributed Computing at Work
Virtual and collaborative exploration of the Universe
 | Total | Last 24 Hours
Users | 3675440 | 5075
Results received | 491854017 | 1092374
Total CPU time | 954229.737 years | 1662.448 years
Floating Point Operations | 1.502416e+21 | 4.260259e+18 (49.31 TFLOPs/sec)
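The rate quoted on this slide can be recovered from the operation count. The sketch below reads 4.260259e+18 as the operations completed in the last 24 hours and 1662.448 years as the CPU time contributed in the same period; the 365.25-day year is an assumption made here.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumed length of a year

flop_last_24h = 4.260259e18      # floating point operations, last 24 hours
cpu_years_last_24h = 1662.448    # CPU time contributed, last 24 hours

# Sustained rate over the day, in units of 10**12 operations per second.
tflops = flop_last_24h / SECONDS_PER_DAY / 1e12

# CPU-days delivered per day, i.e. processors busy around the clock.
equivalent_cpus = cpu_years_last_24h * DAYS_PER_YEAR

print(f"sustained rate: {tflops:.2f} TFLOPs/sec")
print(f"equivalent full-time processors: {equivalent_cpus:,.0f}")
```

The first figure matches the 49.31 TFLOPs/sec shown on the slide; the second is a derived estimate, not a number from the presentation.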

24 SkyQuery
Won 2nd prize in Microsoft .NET Contest

25 National Virtual Observatory Data Grid
Diagram labels: Compute Resources; Catalogs; Data Archives; Information Discovery; Metadata delivery; Data Discovery; Data Delivery; Catalog Mediator; Data mediator; 1. Portals and Workbenches; Bulk Data Analysis; Catalog Analysis; Metadata View; Data View; 4. Grid Security, Caching, Replication, Backup, Scheduling; 2. Knowledge & Resource Management; Standard Metadata format, Data model, Wire format; Catalog/Image Specific Access; Standard APIs and Protocols; Concept space; Derived Collections

26

27 AVO STATUS
AVO approved with EU funds ~2 Million € (total budget ~ 4M €)
Contract start on 15 November 2001 - 3 Year Phase A study
9 NEW POSITIONS for 3 years over 6 institutions - total 18 FTE (~ 50 people)
Total VO funding AVO+NVO+ASTROGRID = $21 million (US)
3 Year target: Build VO 1.0 among the 6 partner archive sets by
–Defining and executing trial science cases
–Defining, developing and deploying new interoperability standards and tools
–Developing and deploying new Grid-based services

28 Data-Rich Astronomy and Other Fields
Technical and methodological challenges facing the VO are common to most data-intensive sciences today, and beyond (commerce, industry, finance, etc.)
Interdisciplinary exchanges (e.g., with physics, biology, earth sciences, etc.): intellectual cross-fertilization, avoid wasteful duplication of efforts
Partnerships and collaborations with applied CS/IT are essential, may lead to significant technological advances
High-energy physics: WWW!, The Grid
Astronomy (VO): ???

29 Scaling the VO Mountain
Diagram labels: Discoveries; Data Mining, Visualization; Data Services; Existing Centers and Archives; "We are here"
Thank you!

