Experience of a low-maintenance distributed data management system W.Takase 1, Y.Matsumoto 1, A.Hasan 2, F.Di Lodovico 3, Y.Watase 1, T.Sasaki 1 1. High.

Presentation transcript:

Experience of a low-maintenance distributed data management system
W. Takase (1), Y. Matsumoto (1), A. Hasan (2), F. Di Lodovico (3), Y. Watase (1), T. Sasaki (1)
1. High Energy Accelerator Research Organization (KEK), Japan
2. University of Liverpool, UK
3. Queen Mary, University of London, UK

Contents
KEK iRODS system
– Running in production for over 2 years
– Rules enable files to be stored efficiently
– Federation with QMUL
iRODS applications
– SCALA: Visualization tool for iRODS
– iRODS XOR-based backup
Summary

iRODS overview
Distributed data management system
Client-server architecture
Allows data management policies to be enforced on the server side
Provides an interface to many different types of storage
Clients can access iRODS via
– i-commands: command-line utilities
– iRODS Browser: web interface

KEK iRODS Systems
iRODS servers
– RHEL 5.6
– iRODS 2.5 ⇒ 3.2
– PostgreSQL
– In operation for over 2 years
4 iRODS zones
– KEK-T2K
– KEK-MLF
– KEKZone
– demoKEKZone
Storage resources
– HPSS (High Performance Storage System)
– Disk system

Data Management for T2K
Tokai to Kamioka (T2K) neutrino experimental group
The experimental data is stored in KEK storage
The group needed an easy way to quickly access the collected data from outside KEK in order to evaluate data quality
iRODS provided the solution
(Image: content/uploads/t2kmap.gif)

Data Management for T2K
The KEK-T2K zone for the experimental group started operation in October 2010
Detected data are processed and then transferred to KEK iRODS
Members of the group became able to access the stored data easily and quickly via
– i-commands
– iRODS Browser

iRODS Rules for KEK-T2K Zone: bundle and replicate the data
Each experimental data file is small (around several MB), whereas HPSS prefers large files
(Diagram labels: Client, T2K data server, iRODS server, rodsweb, DB, disk, Disk system, HPSS, file, tar file)
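The bundling idea on this slide can be illustrated with a minimal Python sketch. This is not the iRODS rule used at KEK, which the slides do not show. It only demonstrates packing many small files into one tar archive, so that a tape-backed store sees one large file, and reading a single member back on request. The function names `bundle_small_files` and `read_member` and the size threshold are assumptions made for illustration.

```python
import tarfile
from pathlib import Path


def bundle_small_files(src_dir, tar_path, max_file_bytes=10 * 1024 * 1024):
    """Pack every regular file in src_dir that is at most max_file_bytes
    into a single uncompressed tar archive; return the bundled file names.

    tar_path should live outside src_dir so the archive does not pick
    itself up. The threshold is a hypothetical value, not one from the talk.
    """
    bundled = []
    with tarfile.open(tar_path, "w") as tar:
        for path in sorted(Path(src_dir).iterdir()):
            if path.is_file() and path.stat().st_size <= max_file_bytes:
                tar.add(path, arcname=path.name)
                bundled.append(path.name)
    return bundled


def read_member(tar_path, name):
    """Answer a request for one file by reading only that member
    out of the bundle and returning its bytes."""
    with tarfile.open(tar_path, "r") as tar:
        handle = tar.extractfile(name)
        if handle is None:
            raise ValueError(f"{name} is not a regular file in the bundle")
        return handle.read()
```

In a real deployment the archive would then be replicated to the tape-backed resource, and a request for one file would trigger staging of the archive before the member is read.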

iRODS Rules for KEK-T2K Zone: response to request
(Diagram labels: Client, request, iRODS server, rodsweb, DB, disk, Disk system, HPSS, tar file, file, T2K data server)

Federation with QMUL
Data replication between the 2 sites
Each site shares its data with the other
– KEK-T2K: experimental data
– QMULZone: analytical data

Amount of data in KEK-T2K
The T2K group started data taking on 22nd Dec, 2011

SCALA: Visualization tool for iRODS
SCALA stands for Statistical Charts And Log Analyzer
iRODS lacked an interface for viewing usage statistics and for debugging problems
We developed a web interface that visualizes an overview of iRODS status
– Statistical Charts page
– Log Analyzer page
SCALA has been installed on KEK iRODS

SCALA Overview
Input: iRODS outputs (resource usage, log files)
Output: daily system status visualized as charts
(Diagram: iRODS resource usage and log files → Parse → Summarize → Display; a database holds the parsed table and the summarized table)
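The parse and summarize stages named on this slide can be sketched in Python. SCALA's actual code and the real iRODS log format are not shown in the slides, so the log line format, the regular expression, and the function names `parse` and `summarize` below are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical log line format (NOT the real iRODS format), e.g.
# "2012-03-14 10:02:11 ERROR: connection refused"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} (\w+): (.*)$")


def parse(lines):
    """Turn raw log lines into rows of a 'parsed table'.
    Lines that do not match the assumed format are skipped."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            rows.append(
                {"day": match.group(1), "level": match.group(2), "message": match.group(3)}
            )
    return rows


def summarize(rows):
    """Build a 'summarized table': the number of log entries per (day, level)."""
    return Counter((row["day"], row["level"]) for row in rows)
```

A display stage would then read the per-day counts and draw one bar per day, which is the kind of chart the Log Analyzer page lets the user click through.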

Statistical Charts
Visualizes daily iRODS operational data

Log Analyzer
Provides an error debugging tool
1. User clicks a bar
2. Error detail displayed
3. User clicks an error message
4. Related log displayed

Download SCALA

iRODS XOR-based backup
Full-file replication
– The current method for storing data reliably is to replicate it
– If a disk or server fails, a copy is still available
– Requires much storage space
– If a portion of a file becomes corrupt, the full file has to be replaced
XOR-based backup
– Reduces the storage space with the same robustness
– Splits a file into blocks and creates parity blocks
– If a block becomes corrupt, only the corrupted block has to be recreated

XOR-based backup: 100% recovery when any 2 servers fail
Full-file replication uses 3 servers and needs 300GB
XOR-based backup uses 4 servers but only needs 200GB
An iRODS rule enables automatic processing
Server 1: A, E = B + C
Server 2: B, F = C + D
Server 3: C, G = A + D
Server 4: D, H = A + B

XOR-based backup: Decoding flow
Server 1: A, E = B + C
Server 2: B, F = C + D
Server 3: C, G = A + D
Server 4: D, H = A + B
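The layout on these slides can be checked with a short Python sketch, reading "+" as bytewise XOR. This is an illustration of the scheme as drawn, not the iRODS rule or microservice used at KEK. The function names and the zero-padding of the last block are assumptions. The storage figures on the slide are consistent with a 100GB file (an inference, since the file size is not stated): three full copies take 300GB, while four data blocks plus four parity blocks of 25GB each take 200GB.

```python
from itertools import combinations

# Layout from the slide: server -> (data block, parity block, parity operands)
LAYOUT = {
    1: ("A", "E", ("B", "C")),
    2: ("B", "F", ("C", "D")),
    3: ("C", "G", ("A", "D")),
    4: ("D", "H", ("A", "B")),
}


def xor_bytes(x, y):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))


def encode(data):
    """Split data into blocks A..D (zero-padded to a multiple of 4 bytes)
    and return {server: {block name: bytes}} following LAYOUT."""
    padded = data + b"\0" * ((-len(data)) % 4)
    size = len(padded) // 4
    blocks = {name: padded[i * size:(i + 1) * size] for i, name in enumerate("ABCD")}
    return {
        server: {d: blocks[d], p: xor_bytes(blocks[x], blocks[y])}
        for server, (d, p, (x, y)) in LAYOUT.items()
    }


def recover(surviving, length):
    """Rebuild the original data from the servers that are still alive."""
    known, parities = {}, {}
    for server, stored in surviving.items():
        d, p, operands = LAYOUT[server]
        known[d] = stored[d]
        parities[operands] = stored[p]
    progress = True
    while len(known) < 4 and progress:
        progress = False
        for (x, y), parity in parities.items():
            # A parity block plus one of its operands yields the other operand.
            if x in known and y not in known:
                known[y] = xor_bytes(parity, known[x])
                progress = True
            elif y in known and x not in known:
                known[x] = xor_bytes(parity, known[y])
                progress = True
    if len(known) < 4:
        raise ValueError("not enough surviving blocks to recover the file")
    return b"".join(known[name] for name in "ABCD")[:length]


if __name__ == "__main__":
    data = b"T2K neutrino event data"
    servers = encode(data)
    for failed in combinations(servers, 2):
        alive = {s: blocks for s, blocks in servers.items() if s not in failed}
        assert recover(alive, len(data)) == data
    print("recovered the file for every pair of failed servers")
```

The loop in `recover` mirrors the decoding flow: each surviving parity block is combined with a known data block to produce a missing one, and the step repeats until all four data blocks are available.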

Summary
The KEK iRODS system has been running in production for over 2 years
iRODS gives a way to quickly and easily access data from outside KEK
The rule that bundles and replicates the data stores files efficiently
Federation with QMUL enables the two sites to share and back up each other's data
SCALA is a visualization tool and has been installed on KEK iRODS
– It leads to better management of the overall iRODS service
XOR-based backup provides data reliability at a lower storage cost than replication
– An iRODS rule enables automatic processing

Thank you for your attention!
Wataru