Polish Infrastructure for Supporting Computational Science in the European Research Space
QoS provisioning for data-oriented applications in PL-Grid
D. Król, B. Kryza, K. Skalkowski, D. Nikolow, R. Slota and J. Kitowski
ACC Cyfronet AGH, Kraków, Poland; Institute of Computer Science AGH-UST, Kraków, Poland
Cracow Grid Workshop 2010, Kraków

Agenda
1. Research and implementation goals
2. Data intensive applications
3. Non-functional requirements in data management
4. FiVO/QStorMan toolkit
5. Architecture overview
6. Main use case
7. Implementation status
8. Conclusions

Research and implementation goals
The main objective of the presented research is to automate the creation and management of a Virtual Organization (VO) using knowledge, especially in the data storage area, through the following concepts:
 allowing users to define non-functional requirements for storage devices explicitly,
 exploiting a knowledge base of the VO extended with descriptions of storage elements,
 exploiting information from storage monitoring systems and the VO knowledge base to find the most suitable storage device compliant with the defined requirements.

Data intensive applications
 Generate gigabytes (or more) of data per day.
 Use different types of data, which require different types of storage.
 Make heavy use of read/write operations.
 The run time of such an application depends heavily on storage access time and transfer speed rather than on computation time.
Examples (from Wikipedia):
 The LHC experiment produces 15 PB/year = ~42 TB/day = ~1 GB/s.
 The German Climate Computing Center (DKRZ) has a storage capacity of 60 petabytes of climate data.

Non-functional requirements in data management
 Data intensive applications may have different requirements, e.g. important data should be replicated.
 The abstraction of storage elements prevents users from influencing the actual location of their data.
 Data should be distributed among the available storage elements according to the defined requirements.
 A way is needed to check, for each storage element, the degree to which it fulfils the requirements.

FiVO/QStorMan toolkit
 On the user side: a programming library (libSES) which provides functions for managing files in a distributed storage environment.
 On the server side:
 A service (Storage Element Selection service) which finds the most suitable storage element according to the defined requirements and the current workload.
 A knowledge base (GOM) which stores the configuration of the storage environment along with the non-functional requirements defined by the users.
 A monitoring system (SMED) which monitors storage resources and provides information about current or average values of different QoS parameters.
 A portal where a user can define non-functional requirements for the storage environment.

Architecture overview

Main use case
Actors and components: User, Portal, GOM, SMED monitoring system, Application, SES library, SEs (distributed storage system).
 The user defines non-functional requirements in the Portal.
 The requirements definition is stored in GOM.
 The application issues a classical „write” operation to a concrete SE.
 The SES library intercepts the operation.
 The requirements for the application are retrieved from GOM, along with the configuration information of the Lustre installation.
 Actual storage system parameter values are obtained from the monitoring information gathered by SMED.
 The „write” operation is redirected to the most suitable SE.

Implementation status – the user side (libSES)
 The library is implemented as a shared library in C++.
 Communication with the server side is implemented with the libcurl library (exploiting the REST model).
 For file storage we use the Lustre filesystem and its pool mechanism.
 The most important part of the library API is:
 createFile(fileName : char*, policy : StoragePolicy*) : int
 openFile(fileName : char*) : int
 closeFile(fileName : char*) : void
 changeStoragePolicy(fileName : char*, policy : StoragePolicy*) : void

Implementation status – Storage Element Selection service
 The finder service is implemented in Python.
 Storage elements are ranked with a similarity function. [The formula itself was an image and did not survive the transcript; only its legend remains: an if/else comparison, for each attribute i, of the current value of the attribute against the required value of the attribute.]
 Communication with the monitoring system is implemented with the REST model.
 Integration with the VO knowledge base is implemented with a SOAP-based web service.

Implementation status – GOM
 Allows the use of multiple available storage and reasoning mechanisms.
 Provides several interfaces for querying and modifying the managed ontologies.
 The communication protocols currently supported by GOM include Java RMI and SOAP.

Implementation status – SMED
 The SMED architecture is based on an Enterprise Service Bus.
 The current version of the system supports monitoring of:
 local system hard drives,
 disk arrays,
 hierarchical storage management systems,
 distributed file systems, e.g. Lustre.

Implementation status – non-functional requirements
The current implementation of the StoragePolicy class includes:
 preferredDeviceType – a device type preferred by the user, e.g. a fast writable or highly available device
 capacity – the free storage space required
 averageReadTransferRate – the mean read transfer rate from the last monitoring series
 averageWriteTransferRate – the mean write transfer rate from the last monitoring series
 throughput – the numerical value of the required throughput
 throughputLevel – an abstract level of required throughput, e.g. LOW, MEDIUM or HIGH
The user can choose only those aspects of a policy which are important from their point of view.

Conclusions
 The goal of the presented research is to develop new approaches to storage management in the Grid environment.
 Explicit definitions of non-functional requirements are necessary in data intensive applications.
 Based on the given requirements, the toolkit can find the most suitable storage element within a distributed file system, or a server where a grid job should be scheduled.
 The presented solution is easy to use (a standard C++ shared library), to extend (storage elements are described in ontological form) and to understand (a clean algorithm for finding a storage element).

Do you want to know more?