1 Computing Challenges for the Square Kilometre Array Mathai Joseph & Harrick Vin Tata Research Development & Design Centre Pune, India CHEP Mumbai 16.

Slides:



Advertisements
Similar presentations
Dominik Stokłosa Pozna ń Supercomputing and Networking Center, Supercomputing Department INGRID 2008 Lacco Ameno, Island of Ischia, ITALY, April 9-11 Workflow.
Advertisements

Configuration management
Provenance-Aware Storage Systems Margo Seltzer April 29, 2005.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 12 Slide 1 Distributed Systems Design 2.
Visibility Information Exchange Web System. Source Data Import Source Data Validation Database Rules Program Logic Storage RetrievalPresentation AnalysisInterpretation.
® IBM India Research Lab © 2006 IBM Corporation Challenges in Building a Strategic Information Integration Infrastructure Mukesh Mohania IBM India Research.
Network Management Overview IACT 918 July 2004 Gene Awyzio SITACS University of Wollongong.
Distributed Systems Architectures
GenSpace: Exploring Social Networking Metaphors for Knowledge Sharing and Scientific Collaborative Work Chris Murphy, Swapneel Sheth, Gail Kaiser, Lauren.
Office of Science U.S. Department of Energy Grids and Portals at NERSC Presented by Steve Chan.
Designing Control System Software for Radio Telescopes S. Chaudhuri, A. Ahuja, S. Natrajan, and H.M. Vin Presenter: Harrick M. Vin Vice President and Chief.
Presentation Outline  Project Aims  Introduction of Digital Video Library  Introduction of Our Work  Considerations and Approach  Design and Implementation.
Advanced Topics COMP163: Database Management Systems University of the Pacific December 9, 2008.
OCT1 Principles From Chapter One of “Distributed Systems Concepts and Design”
Tools and Services for the Long Term Preservation and Access of Digital Archives Joseph JaJa, Mike Smorul, and Sangchul Song Institute for Advanced Computer.
eGovernance Under guidance of Dr. P.V. Kamesam IBM Research Lab New Delhi Ashish Gupta 3 rd Year B.Tech, Computer Science and Engg. IIT Delhi.
Business Intelligence Dr. Mahdi Esmaeili 1. Technical Infrastructure Evaluation Hardware Network Middleware Database Management Systems Tools and Standards.
© 2006, Cognizant Technology Solutions. All Rights Reserved. The information contained herein is subject to change without notice. Automation – How to.
H-1 Network Management Network management is the process of controlling a complex data network to maximize its efficiency and productivity The overall.
LÊ QU Ố C HUY ID: QLU OUTLINE  What is data mining ?  Major issues in data mining 2.
VAP What is a Virtual Application ? A virtual application is an application that has been optimized to run on virtual infrastructure. The application software.
By Mihir Joshi Nikhil Dixit Limaye Pallavi Bhide Payal Godse.
4.x Performance Technology drivers – Exascale systems will consist of complex configurations with a huge number of potentially heterogeneous components.
Framework for Automated Builds Natalia Ratnikova CHEP’03.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 12 Slide 1 Distributed Systems Architectures.
Distributed Systems 1 CS- 492 Distributed system & Parallel Processing Sunday: 2/4/1435 (8 – 11 ) Lecture (1) Introduction to distributed system and models.
Parallel Processing CS453 Lecture 2.  The role of parallelism in accelerating computing speeds has been recognized for several decades.  Its role in.
A Lightweight Platform for Integration of Resource Limited Devices into Pervasive Grids Stavros Isaiadis and Vladimir Getov University of Westminster
DISTRIBUTED COMPUTING
Database System Concepts and Architecture
Exploring the Applicability of Scientific Data Management Tools and Techniques on the Records Management Requirements for the National Archives and Records.
A Metadata Based Approach For Supporting Subsetting Queries Over Parallel HDF5 Datasets Vignesh Santhanagopalan Graduate Student Department Of CSE.
File Processing - Database Overview MVNC1 DATABASE SYSTEMS Overview.
1 Supporting the development of distributed systems CS606, Xiaoyan Hong University of Alabama.
Ohio State University Department of Computer Science and Engineering 1 Cyberinfrastructure for Coastal Forecasting and Change Analysis Gagan Agrawal Hakan.
By: Ashish Gohel 8 th sem ISE.. Why Cloud Computing ? Cloud Computing platforms provides easy access to a company’s high-performance computing and storage.
CRISP & SKA WP19 Status. Overview Staffing SKA Preconstruction phase Tiered Data Delivery Infrastructure Prototype deployment.
LSST: Preparing for the Data Avalanche through Partitioning, Parallelization, and Provenance Kirk Borne (Perot Systems Corporation / NASA GSFC and George.
1 Geospatial and Business Intelligence Jean-Sébastien Turcotte Executive VP San Francisco - April 2007 Streamlining web mapping applications.
Presented by: Reem Alshahrani. Outlines What is Virtualization Virtual environment components Advantages Security Challenges in virtualized environments.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
WEB MINING. In recent years the growth of the World Wide Web exceeded all expectations. Today there are several billions of HTML documents, pictures and.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Abstract A Structured Approach for Modular Design: A Plug and Play Middleware for Sensory Modules, Actuation Platforms, Task Descriptions and Implementations.
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
August 2003 At A Glance The IRC is a platform independent, extensible, and adaptive framework that provides robust, interactive, and distributed control.
Applications and Requirements for Scientific Workflow May NSF Geoffrey Fox Indiana University.
CERN - IT Department CH-1211 Genève 23 Switzerland t High Availability Databases based on Oracle 10g RAC on Linux WLCG Tier2 Tutorials, CERN,
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
Features Of SQL Server 2000: 1. Internet Integration: SQL Server 2000 works with other products to form a stable and secure data store for internet and.
Jens Hartmann York Sure Raphael Volz Rudi Studer The OntoWeb Portal.
Applications and Requirements for Scientific Workflow May NSF Geoffrey Fox Indiana University.
Mick Badran Using Microsoft Service Fabric to build your next Solution with zero downtime – Lvl 300 CLD32 5.
Lecture On Introduction (DBMS) By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
SQL Server 2012 Session: 1 Session: 4 SQL Azure Data Management Using Microsoft SQL Server.
Efficient Opportunistic Sensing using Mobile Collaborative Platform MOSDEN.
IIS 645 Database Management Systems DDr. Khorsheed Today’s Topics 1. Course Overview 22. Introduction to Database management 33. Components of Database.
Introduction: Databases and Database Systems Lecture # 1 June 19,2012 National University of Computer and Emerging Sciences.
Cofax Scalability Document Version Scaling Cofax in General The scalability of Cofax is directly related to the system software, hardware and network.
System Software Laboratory Databases and the Grid by Paul Watson University of Newcastle Grid Computing: Making the Global Infrastructure a Reality June.
Distributed Systems Architectures. Topics covered l Client-server architectures l Distributed object architectures l Inter-organisational computing.
Discovering Computers 2010: Living in a Digital World Chapter 14
Clouds , Grids and Clusters
Joseph JaJa, Mike Smorul, and Sangchul Song
Database Management Systems
Model-Driven Analysis Frameworks for Embedded Systems
COMPASS Database SPACE TELESCOPE SCIENCE INSTITUTE Gretchen Greene
Database management systems
L. Glimcher, R. Jin, G. Agrawal Presented by: Leo Glimcher
Presentation transcript:

1 Computing Challenges for the Square Kilometre Array Mathai Joseph & Harrick Vin Tata Research Development & Design Centre Pune, India CHEP Mumbai 16 February 2006

2 Software for the SKA There are different ways to view the complex SKA instrument and its software complement: View 1 SKA hardware + control software + data management software + analysis software View 2 A software ‘telescope’ that is implemented by the SKA hardware Taking View 2 SKA software is like other complex software systems built today for many different applications.

3 CHEP Mumbai 16 February 2006 SKA Software Three dimensions of SKA software –Instrument management –Data indexing, archival, retrieval and dissemination –Data and signal processing – astronomy calculations The last dimension is of greatest interest to radio astronomers. However, today they are handling all three dimensions.

4 CHEP Mumbai 16 February 2006 SKA Software Three dimensions of SKA software –Instrument management –Data indexing, archival, retrieval and dissemination –Data and signal processing – astronomy calculations Main problem: Scale and system complexity –Large number of antennas (heterogeneous) –Complex control and data communication networks –Large volume of data storage –Diverse and evolving usage models Together, these pose a significant software challenge!

5 CHEP Mumbai 16 February 2006 SKA Software & the Software Process Like other complex software systems, the SKA software will evolve with time. –Initial design and implementation, –Growth in functionality and performance, –Version management, & –Backup, recovery and observational continuity. Software maintenance can account for 50—75% of the total effort and cost of the software system.

6 CHEP Mumbai 16 February 2006 Manage Complexity Through Abstraction Virtual Telescope - I

7 CHEP Mumbai 16 February 2006 Manage Complexity Through Abstraction Virtual Telescope - I Virtual Telescope - II

8 CHEP Mumbai 16 February 2006 Abstraction, virtualization, and hierarchical composition  hide complexity, simplify management and upgrade Abstraction, virtualization, and hierarchical composition  hide complexity, simplify management and upgrade Manage Complexity Through Abstraction Configurable Data Source Configurable Data Source Configurable Data Source Configurable Data Source Configurable Data Source Configurable Data Source Configurable Data Source Configurable Data Source Configurable Data Source Configurable Data Source Virtual Telescope - I Virtual Telescope - II Configurable Data Source Abstraction also simplifies data management

9 CHEP Mumbai 16 February 2006 Virtual Telescope: A Usage Scenario (I) User sets up an experiment using a virtual telescope console: –Describe telescope configuration (e.g., sky region, time, …) –Describe data processing requirements Experiment scheduled and configured automatically Data streams collected, processed, stored Data quality monitored automatically; resources re-configured as needed without manual intervention –Control engineers informed of any abnormality Data post-processing done; data and results published

10 CHEP Mumbai 16 February 2006 Virtual Telescope: Usage Scenario (II) Data model to capture data lineage Rich methods for search and access –Data collected from a particular sky region over an interval –Data collected from a data source over an interval –Content-based retrieval Data with particular pattern –…

11 CHEP Mumbai 16 February 2006 Virtual Telescope: Usage Scenario (III) Support multiple consoles for different user communities E-Museum Console SKA tour, images/documents, history, search Astronomer’s Console Set-up experiment, monitor, visualize, process data; store/analyze results Control Engineer’s Console configuration, monitoring, fault diagnosis, reliability Data Administrator’s Console Archival, retrieval, disaster recovery

12 CHEP Mumbai 16 February 2006 Virtual Telescope Consoles Data Management Instrument Management Virtual Telescope: Architecture SKA

13 CHEP Mumbai 16 February 2006 Virtual Telescope: Goals Usability and configurability: –Support for multiple simultaneous experiments Automation: –Scheduling of experiments and instrument configuration Availability: –Fault detection, isolation and recovery –Disruption-free maintenance and upgrade Data management: –Storage, retrieval and search of data –Multiple views and data models Extensibility: –Incorporate emerging technology and new usage models

14 CHEP Mumbai 16 February 2006 Instrument Management: CS Challenges Analogy –Emulab: Emulation facility built from a cluster of servers –Usage model Schedule and configure experiments Fault detection, isolation, and recovery –Solution methodology Component virtualization: server operating system System virtualization: cluster operating system Design of instrument management system: –‘Operating System for Radio Astronomy Telescopes’ –Wide variety of instrument components Antenna array, control systems, computational cluster, network & storage –Define a hierarchy of abstractions

15 CHEP Mumbai 16 February 2006 The Data Management Problem Once published, data must remain available –Data may be used long after data collection –Allow others to reproduce work and/or do new science Goals: –Easy to search, access and process data –Ease of adding data and findings to corpus Challenge: Data avalanche –Vast volume of data: Multi-petabyte storage Raw data, processed and filtered data, imagery, … –Lack of data models and descriptors for publication –Capturing data lineage is laborious

16 CHEP Mumbai 16 February 2006 Data Management: CS Challenge Data models and descriptors –When, where, how was data captured? Experimental configuration –Data lineage: how was the data processed/produced? Association of raw, processed and image data Description of data processing applications –Observations and results derived from datasets –…–… System: Federated data repository + search portal –Data visualization – capture queries and present results –Data analysis – data mining, machine learning, etc. –Distributed software systems – data and systems management

17 CHEP Mumbai 16 February 2006 Is the SKA Software Unique? The SKA application is challenging & unique. But many problems are similar to those seen elsewhere, e.g.: –Service-oriented transactional application –Data storage integrity –Redundancy and continuity –Technology independent applications There are also many problems that need to be studied. Success will need collaboration from radio astronomers and computer scientists.

18 CHEP Mumbai 16 February 2006 Technology Independence Large applications can be built to be independent of the hardware infrastructure: –Platform independent –Database, middleware independent –Language independent Similar abstractions can be used to allow for evolution of antennas. Model-based software development standards are emerging -- can provide a framework for the SKA software development.

19 CHEP Mumbai 16 February 2006 Conclusions Constructing the SKA software will be a challenging task –Many of the problems have solutions –Others need investigation. Important to use a mature software process –To permit growth and evolution. There should be some pilot development projects –to evaluate performance, scalability. Need to separate the software system architecture from the astronomy-specific analysis software.

20 CHEP Mumbai 16 February 2006 Thank you.