An Introduction to Scientific Data Grid LUO Ze Computer Network Information Centre, Chinese Academy of Sciences.

Slides:



Advertisements
Similar presentations
Inetrconnection of CNGrid and European Grid Infrastructure Depei Qian Beihang University Feb. 20, 2006.
Advertisements

CPSCG: Constructive Platform for Specialized Computing Grid Institute of High Performance Computing Department of Computer Science Tsinghua University.
Data Management Expert Panel - WP2. WP2 Overview.
Kensington Oracle Edition: Open Discovery Workflow Meets Oracle 10g Professor Yike Guo.
Development of China-VO ZHAO Yongheng NAOC, Beijing Nov
ASCR Data Science Centers Infrastructure Demonstration S. Canon, N. Desai, M. Ernst, K. Kleese-Van Dam, G. Shipman, B. Tierney.
1.1 © 2004 Pearson Education, Inc. Exam Managing and Maintaining a Microsoft® Windows® Server 2003 Environment Lesson 1: Introducing Windows Server.
Connect. Communicate. Collaborate Click to edit Master title style MODULE 1: perfSONAR TECHNICAL OVERVIEW.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
23/04/2008VLVnT08, Toulon, FR, April 2008, M. Stavrianakou, NESTOR-NOA 1 First thoughts for KM3Net on-shore data storage and distribution Facilities VLV.
Slides for Grid Computing: Techniques and Applications by Barry Wilkinson, Chapman & Hall/CRC press, © Chapter 1, pp For educational use only.
1-2.1 Grid computing infrastructure software Brief introduction to Globus © 2010 B. Wilkinson/Clayton Ferner. Spring 2010 Grid computing course. Modification.
The Open Grid Service Architecture (OGSA) Standard for Grid Computing Prepared by: Haoliang Robin Yu.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Copyright © 2006 by The McGraw-Hill Companies,
Introduction to Scientific Data Grid Kai Nan Computer Network Information Center, CAS
AgriDrupal - a “suite of solutions” for agricultural information management and dissemination, built on the Drupal CMS; - the community of practice around.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
05/24/2004The 12th UN/ESA Workshop on Basic Space Science Chinese Virtual Observatory (China-VO) Chenzhou Cui Chinese Virtual Observatory Project National.
Scientific Data Infrastructure in CAS Dr. Jianhui Scientific Data Center Computer Network Information Center Chinese Academy of Sciences.
Enterprise Manager
CNGI Applications in CSTNET QingHua Zhang CSTNET January 2007.
C Copyright © 2009, Oracle. All rights reserved. Appendix C: Service-Oriented Architectures.
Cloud Computing 1. Outline  Introduction  Evolution  Cloud architecture  Map reduce operation  Platform 2.
SITools Enhanced Use of Laboratory Services and Data Romain Conseil
Scientific Data Grid on NGI Kai Nan Computer Network Information Center Chinese Academy of Sciences CANS 2004, Miami.
INFSO-RI Enabling Grids for E-sciencE SA1: Cookbook (DSA1.7) Ian Bird CERN 18 January 2006.
OOI CI LCA REVIEW August 2010 Ocean Observatories Initiative OOI Cyberinfrastructure Architecture Overview Michael Meisinger Life Cycle Architecture Review.
11 CORE Architecture Mauro Bruno, Monica Scannapieco, Carlo Vaccari, Giulia Vaste Antonino Virgillito, Diego Zardetto (Istat)
The Grid System Design Liu Xiangrui Beijing Institute of Technology.
1 Introduction to Microsoft Windows 2000 Windows 2000 Overview Windows 2000 Architecture Overview Windows 2000 Directory Services Overview Logging On to.
Database Design and Management CPTG /23/2015Chapter 12 of 38 Functions of a Database Store data Store data School: student records, class schedules,
GEM Portal and SERVOGrid for Earthquake Science PTLIU Laboratory for Community Grids Geoffrey Fox, Marlon Pierce Computer Science, Informatics, Physics.
DataNet – Flexible Metadata Overlay over File Resources Daniel Harężlak 1, Marek Kasztelnik 1, Maciej Pawlik 1, Bartosz Wilk 1, Marian Bubak 1,2 1 ACC.
Hands-On Microsoft Windows Server Implementing Microsoft Internet Information Services Microsoft Internet Information Services (IIS) –Software included.
Copyright © cs-tutorial.com. Overview Introduction Architecture Implementation Evaluation.
The GRelC Project: architecture, history and a use case in the environmental domain G. Aloisio - S. Fiore The Climate-G testbed is an interdisciplinary.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
IODE Ocean Data Portal - ODP  The objective of the IODE Ocean Data Portal (ODP) is to facilitate and promote the exchange and dissemination of marine.
GO-ESSP Workshop, LLNL, Livermore, CA, Jun 19-21, 2006, Center for ATmosphere sciences and Earthquake Researches Construction of e-science Environment.
Small Projects Bridging IVO Standards and Best Practice Bridging IVOA and Domestic Community Chenzhou Cui, Yongheng Zhao Chinese Virtual Observatory National.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Scientific Data Grid & China-VO Kai Nan Computer Network Information Center Chinese Academy of Sciences November 27, 2003.
ESFRI & e-Infrastructure Collaborations, EGEE’09 Krzysztof Wrona September 21 st, 2009 European XFEL.
Grid Based Chinese Virtual Observatory System Design Chenzhou CUI, Yongheng ZHAO National Astronomical Observatories, Chinese Academy of Sciences
EGEE User Forum Data Management session Development of gLite Web Service Based Security Components for the ATLAS Metadata Interface Thomas Doherty GridPP.
Introduction to Grids By: Fetahi Z. Wuhib [CSD2004-Team19]
1 e-Science AHM st Aug – 3 rd Sept 2004 Nottingham Distributed Storage management using SRB on UK National Grid Service Manandhar A, Haines K,
A Collaborative Research Environment for Avian Flu Research Luo Ze Computer Network Information Center, CAS
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
CISC 849 : Applications in Fintech Namami Shukla Dept of Computer & Information Sciences University of Delaware A Cloud Computing Methodology Study of.
Institute for the Protection and Security of the Citizen HAZAS – Hazard Assessment ECCAIRS Technical Course Provided by the Joint Research Centre - Ispra.
INFSO-RI JRA2 Test Management Tools Eva Takacs (4D SOFT) ETICS 2 Final Review Brussels - 11 May 2010.
PARALLEL AND DISTRIBUTED PROGRAMMING MODELS U. Jhashuva 1 Asst. Prof Dept. of CSE om.
Introduction to Core Database Concepts Getting started with Databases and Structure Query Language (SQL)
ChinaGrid: National Education and Research Infrastructure Hai Jin Huazhong University of Science and Technology
Grid Services for Digital Archive Tao-Sheng Chen Academia Sinica Computing Centre
Grid Activities in the Philippines Rey Vincent P. Babilonia Advanced Science and Technology Institute Department of Science and Technology PHILIPPINES.
Bob Jones EGEE Technical Director
Clouds , Grids and Clusters
SuperComputing 2003 “The Great Academia / Industry Grid Debate” ?
Security Requirements for ChinaGrid Applications - What the current grid security solutions cannot do Hai Jin Huazhong University of Science and Technology.
The Open Grid Service Architecture (OGSA) Standard for Grid Computing
University of Technology
Synthesizing knowledge During Project
Goals Introduce the Windows Server 2003 family of operating systems
Chapter 2: The Linux System Part 1
A Network Operating System Edited By Maysoon AlDuwais
Google Sky.
Grid Computing Software Interface
Presentation transcript:

An Introduction to Scientific Data Grid LUO Ze Computer Network Information Centre, Chinese Academy of Sciences

Outline  1. Background  Background information about Scientific Database (SDB) and Scientific Data Grid (SDG)  Target of Scientific Data Grid project  2. System Platform  Introduce the status of data resource, storage resource, computing resource of SDG system platform  3. SDG Middleware  A brief Introduction about architecture, module of SDG middleware and its current status  4. Applications  Brief introduction of three domain application supported by SDG

Background  As China’s natural science research centre, Chinese Academy of Sciences (CAS) has produced and accumulated a great store of scientific data and materials in its long history of scientific research and practice.  In 1982, Chinese Academy of Sciences proposed the program of “The Scientific Database and Information System”, which was intended to integrate the scattered databases of different specialties for sharing through utilizing the ever-developing computer, database and network technologies.  Through two decades continuous development, the Scientific Database (SDB) has already become the most characterized scientific database resource on China Science and Technology Network (CSTNET). It provides scientific data service to scientific research, national macro decision-making, as well as to the public.

Background  Scientific Data Grid (SDG) is one of application grids of China National Grid, which is supported by the "High Performance Computer and its Kernel Software " project, which is a key project in the National High-Tech Research and Development Program  SDG is mainly undertaken by the Computer Network Information Center (CNIC), Chinese Academy of Sciences (CAS). CNIC is a subsidiary research institute of the Chinese Academy of Sciences (CAS), engaged mainly in the construction, operation and supporting service of informatization of CAS, R&D of computer network technology, database technology as well as scientific engineering computation.

Background  Scientific Data Grid aims at scientific data resources sharing and collaboration. It integrates different resources in informatization environment of scientific research, i.e. scientific data and computing capacity for data analysis and process, connect more than 40 institutes under Chinese Academy of Sciences via data resources in SDB, realize effective sharing of distributed and heterogeneous data resources by applying Grid technology, especially data Grid technology, and develop some application systems that have practical importance for scientific research.

Background  The target is to resolve following key problems through the research of SDG:  1. How to access large scale, distributed and heterogeneous scientific data uniformly, promote convenient sharing of scientific data resource and enhance efficiency and utility of sharing data resource.  2. How to integrate heterogeneous databases by metadata technology, implement sharing and service of relative information by Grid information service. Further, how to make advanced application systems based on Grid thinking and technology possible by way of combining metadata and information of data resource.  3. Via some application systems of special domain, provide Grid application framework of science research fields, explore main technical difficulties and problems in spreading Grid application of science research field and create elementarily a Gird application standard in some fields.

System Platform  The system platform for SDG consists of scientific data resources, network storage resources and computing resources  By the end of October 2004, the SDB has established 388 databases of different specialties, and increased its gross data volume to 13TB, 7.7TB are available on the Internet, and 45 websites of different domains now provides on-line service with most of the data.  Storage resource includes 20TB network storage and 50TB tape system. SDG provides more than 1TFLOPS computing capability. Storage and computing resources are mainly provided by 59 nodes of the super data server, SDB6800, situating at the data centre of Computer Network Information Centre under Chinese Academy of Sciences

System Platform  SDB6800 is the core component of SDG system platform  Is composed of 59 nodes of DeepComp6800. Each node includes four IA64 II 1.3G processors.  Nodes are connected by Quadrics network and GB ethernet  Take SAN architecture with 20TB disk array and 50TB tape lib.  OS: Linux, Windows  DBMS: Oracle10g, MS SQL Server 2000, mysql.

SDG Middleware  Architecture  SDG middleware is composed of two parts, core services and application-oriented services.

SDG Middleware  Information Service, Data Access Service, Security Infrastructure and Storage Service comprise the core services  Information Service Used for resource discovery and resource locating. On the basis of metadata built for Scientific Databases, the Information Service, including Information and Metadata Service (IMS) and SDGFinder, a web based resource finding tool, supplies information service for SDG and advanced application systems.

SDG Middleware  Data Access Service (DAS) DAS is designed to realize uniform access to massive, distributed, heterogeneous and autonomous databases. At present, we can access, via DAS, a wide range of relation databases, such as Oracle, Microsoft SQL Server and MySQL, and file systems. Through the interface provided by DAS, client can acquire metadata of data resource and execute query. The DAS is implemented by OGSA compliant grid services.

SDG Middleware  Security Infrastructure Security Infrastructure implements primary functions of Certificate administration and access control. We implement software for constructing a Certificate Authority (CA) in a simple manner. CA is an entity in Public Key Infrastructure (PKI), which is responsible for establishing and vouching the authenticity of public keys.

SDG Middleware  Storage Service Storage Service are made up of file storage service, database service and Internet publishing service, provides a series of storage service tools with the utilities of data transfer, storage management and quota assignment.

SDG Middleware  Application-oriented services include Statistics and Analysis Tool, Universal Metadata Management Tool, CA Management Tool, Access Control Toolbox, Storage Sharing Tools and Portal  CA Management Tool We provide a client-side tool, called CertUtility. This tool simplifies the integration and interaction between application and security infrastructure.

SDG Middleware  Statistics and Analysis Tool Statistics and Analysis Tool is installed and deployed in data centre and Institute that participated in SDG. According to the Interface provided by Statistics and Analysis Tool, we can get dynamically data volume information about data resource provided by particular organization. Data volume information could be processed and visualized to demonstrate the state of data resource. This tool is implemented by OGSA compliant grid service.

SDG Middleware  Universal Metadata Management Tool This tool is used for integrating metadata provided by different field. We adopt XML to exchange information among different modules of SDG middleware. This tool implements some management function for metadata, including add, remove and modify operation, and of course, supporting metadata query.

SDG Middleware  Access Control Toolbox By using Access Control Toolbox, user can configure flexibly access right for given user, customize the mapping between account and role. The toolbox provides a way to control the user’s access in a fine granularity manner. Currently, this toolbox supports RDBMS like Oracle, MySql, etc.  Storage Sharing Tools Based on open source software JFtp, we developed Storage Sharing Tools with two important enhancements. First, we enforce the security function and make data transport reliable. Second, these tools support quota assignment.

SDG Middleware  Portal In our SDG Portal, we integrated grid service in the portlets. Every portlet service is compounded by one or more grid service. Portal has a few portlets which can provide service to users. Currently, the basic portlets have been developed.

SDG Middleware  Current Status  After 3 years research and development, SDG middleware gained some important achievements. We released SDG middleware version 1.0 by the end of 2003, and released version 2.0 by the end of The software package was installed and deployed on Institutes that participated in SDG project after special annually training. The prototype of SDG now comes into being.

Applications  One of the primary goals of SDG is to develop and run scientific application based on Grid technologies, as an illustration of e-Science enabled by Grid technologies. In SDG, we currently support three domain applications: China Virtual Observatory; International Cosmic Ray Data Pre- processing Centre; and Chinese Herbal Medicine Virtual Academe.

Applications  China Virtual Observatory.  In SDG, Computer Network Information Centre of CAS collaborates with National Astronomical Observatories of CAS to develop China Virtual Observatory as one of scientific application systems. Currently, services, including Statistical Analysis of Fe Abundances Gradients in the Galaxy, The Decoding Grid Service and Query Grid Service for some catalogue, DSS image retrieval grid service, and Basic Astronomical Computing Service, have been set up.

Applications  Based on layered GRID infrastructure, China Virtual Observatory mainly addresses following three tasks: (1) astronomical data interoperation; (2) spectrum auto-process; (3) VO-enabled LAMOST. LAMOST, means Large Sky Area Multi-Object Fibre Spectroscopic Telescope, is a meridian reflecting Schmidt telescope, using active optics technique to control its reflecting corrector makes it a unique astronomical instrument in combining large aperture with wide field of view.

LAMOST

Applications  International Cosmic Ray Data Pre-processing Centre  YBJ International Cosmic Ray Observatory is located at 90°26'E and 30°13'N in Yangbajing (YBJ) valley of Tibetan highland. The ARGO -YBJ Project is a Sino-Italian cooperation started its detector installation in It aims at the research of the origin of high energy cosmic rays. It explores the approximately 100 GeV uncultivated land and measuring the antiproton/proton ratio by cosmic ray moon shadow.

Applications  The ARGO-YBJ project will be full operational in 2007 and will generate more than 200TB of raw data each year. The raw data will be transferred from Tibet to Beijing Institute of High Energy Physics and processed in to reconstructed data. The physicists will work on the reconstructed data for physics researches. For this purpose a grid based computing system will be built with about 400 CPUs, mass storage system and broad band network links among Tibet, Beijing and institutes in Italy.

Cosmic ray air-shower array detectors

Applications  Chinese Herbal Medicine Virtual Academe.  Based on databases of Chinese herbal medicine information distributed around China, Chinese Herbal Medicine Virtual Academe constructs a Chinese herbal medicine application grid, which implements interconnection and interoperability of Chinese herbal medicine information databases and high degree sharing of Chinese herbal medicine resources, supports the scientific research of Chinese herbal medicine and pushes the process of Chinese herbal medicine modernization.

Thanks