Polish Infrastructure for Supporting Computational Science in the European Research Space Policy Driven Data Management in PL-Grid Virtual Organizations.


Polish Infrastructure for Supporting Computational Science in the European Research Space
Policy Driven Data Management in PL-Grid Virtual Organizations
Dariusz Król, Darin Nikolow, Włodzimierz Funika, Renata Słota, Jacek Kitowski
ACC Cyfronet AGH, ul. Nawojki 11, , Kraków, Poland
INGRID 2010 Poznań,

Agenda
1. Goals of the PL-Grid project
2. Research and implementation goals
3. Non-functional requirements in data management
4. Description of the proposed solution
5. Architecture overview
6. Sample use cases
7. Implementation status
8. Future work
9. Conclusions

PL-Grid
- Polish national grid initiative ( )
- The main goal is to provide the Polish scientific community with an IT infrastructure based on the Grid
- Extend the computing resources by approximately 215 TFLOPS of computing power and 2500 TB of storage capacity
- Develop grid-oriented applications and tools in the VO and data management areas, e.g. FiVO - Grid Virtual Organization Semantic Framework

Research and implementation goals
- Allow users to define non-functional requirements for storage devices explicitly (more on the next slide)
- Extend the VO knowledge base with descriptions of storage elements
- Exploit information from storage monitoring systems and the VO knowledge base to find the most suitable storage device compliant with the defined requirements
- Integrate the developed solution with the PL-Grid infrastructure, e.g. the Lustre file system
- Make the solution easy to use and extend

Non-functional requirements in data management
- Data-intensive applications may have different requirements, e.g. important data should be replicated
- Abstraction of storage elements prevents users from influencing the actual location of data
- Data should be distributed among the available storage elements according to the defined requirements
- A way is needed to measure how well each storage element fulfills the requirements

Proposed solution
- On the user side: a programming library that provides functions for managing files in a distributed storage environment
- On the server side: a service that finds the most suitable storage element according to the defined requirements and the current workload
- The user specifies a storage policy declaratively in the application code, but the actual location is determined at runtime
- The whole computation is done on the server side, and only the identifier of the selected storage element is returned to the user side

Architecture overview

Use case 1 – user-level requirements
1. A StoragePolicy instance is created in the user application along with a LustreManager instance.
2. The createFile(, ) function of the LustreManager instance is called.
3. The LustreManager instance sends a request to the finder service to find the most suitable storage element that meets the provided requirements.
4. The finder service retrieves information about available storage elements from a VO knowledge base.
5. The finder service asks a monitoring system for the current values of the attributes contained in the storage policy object.
6. The fulfilling function is computed with each available storage device as an argument.
7. Information about suitable storage elements is returned to the user side.
8. The file is created on the most suitable storage element, and a file descriptor is returned to the user application, or an error code if something went wrong.
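The server-side part of the steps above (steps 4 to 7) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PL-Grid code: the real finder service is written in Python and talks to remote systems, whereas here the VO knowledge base, the monitoring system and the fulfilling function are replaced by hypothetical callbacks (`FinderDependencies`), and `findMostSuitable` and `Attributes` are made-up names.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

using Attributes = std::map<std::string, double>;  // attribute name -> value

// Hypothetical callbacks standing in for the remote systems of steps 4-6.
struct FinderDependencies {
    // Step 4: names of the storage elements known to the VO knowledge base.
    std::function<std::vector<std::string>()> listElements;
    // Step 5: current monitored values of the policy's attributes for one element.
    std::function<Attributes(const std::string& element,
                             const Attributes& policy)> currentValues;
    // Step 6: fulfilling function; assumed here that a higher result is a better match.
    std::function<double(const Attributes& policy,
                         const Attributes& current)> fulfilling;
};

// Steps 4-7: score every known element and return the name of the best one.
// Returns an empty string when no storage element is known; ties keep the first.
std::string findMostSuitable(const Attributes& policy,
                             const FinderDependencies& deps) {
    std::string best;
    double bestScore = 0.0;
    bool found = false;
    for (const std::string& element : deps.listElements()) {
        const Attributes current = deps.currentValues(element, policy);
        const double score = deps.fulfilling(policy, current);
        if (!found || score > bestScore) {
            best = element;
            bestScore = score;
            found = true;
        }
    }
    return best;
}
```

Passing the three dependencies in as callbacks keeps the selection loop testable without a knowledge base or a monitoring system; the client library would then create the file on the element whose identifier is returned.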

Use case 2 – VO-level requirements
1. The application uses a standard programming library to create a new file.
2. This request is intercepted at the filesystem level and delegated to the storage-management library.
3. The library sends a request to find the most suitable storage element for this user.
4. The finder service retrieves the default storage policy of the VO to which the user belongs.
5. The rest of the use case is similar to the previous one:
6. The finder service retrieves information about available storage elements from a VO knowledge base.
7. The finder service asks a monitoring system for the current values of the attributes contained in the storage policy object.
8. The fulfilling function is computed with each available storage device as an argument.
9. Information about suitable storage elements is returned to the user side.
10. The file is created on the most suitable storage element, and a file descriptor is returned to the user application, or an error code if something went wrong.
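What distinguishes this use case from the first one is step 4, the lookup of a default policy. A minimal sketch of that step is given below, assuming a hypothetical in-memory `VoKnowledgeBase` (the real one is reached through a web service) and a hypothetical `resolvePolicy` function. The rule that an explicitly supplied policy takes precedence over the VO default is an assumption, since the slides do not say how the two cases are combined.

```cpp
#include <map>
#include <stdexcept>
#include <string>

using Policy = std::map<std::string, double>;  // attribute name -> required value

// Hypothetical in-memory stand-in for the VO knowledge base.
struct VoKnowledgeBase {
    std::map<std::string, std::string> voOfUser;  // user name -> VO name
    std::map<std::string, Policy> defaultPolicy;  // VO name -> default policy
};

// Step 4: pick the policy the finder service should work with.
// explicitPolicy is non-null when the application supplied its own policy
// (use case 1); otherwise the default policy of the user's VO is used.
// Throws std::runtime_error when no policy can be determined.
const Policy& resolvePolicy(const Policy* explicitPolicy,
                            const std::string& user,
                            const VoKnowledgeBase& kb) {
    if (explicitPolicy != nullptr) {
        return *explicitPolicy;  // assumed precedence rule
    }
    const auto vo = kb.voOfUser.find(user);
    if (vo == kb.voOfUser.end()) {
        throw std::runtime_error("user does not belong to any known VO");
    }
    const auto policy = kb.defaultPolicy.find(vo->second);
    if (policy == kb.defaultPolicy.end()) {
        throw std::runtime_error("VO has no default storage policy");
    }
    return policy->second;
}
```

Once the policy is resolved, the remaining steps are the same as in the first use case.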

Implementation status – the user side
- The library is implemented as a shared library in C++.
- Communication with the server side is implemented with the libcurl library (following the REST model).
- For file storage we use the Lustre filesystem and its pool mechanism.
- The most important part of the library API is:
  - createFile(fileName : char*, policy : StoragePolicy*) : int
  - openFile(fileName : char*) : int
  - closeFile(fileName : char*) : void
  - changeStoragePolicy(fileName : char*, policy : StoragePolicy*) : void
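As a rough illustration, the four API operations could be declared as a C++ interface like the one below. Only the operation names, parameter order and return types come from the slide. The interface name `FileManager`, the `const` qualifiers, the one-field `StoragePolicy` placeholder, the use of -1 as the error code, and the `InMemoryManager` stand-in (which keeps files in a map instead of talking to Lustre or the finder service) are all assumptions made for the sketch.

```cpp
#include <map>
#include <string>

// Placeholder only; the real class carries the attributes listed on a later slide.
struct StoragePolicy {
    double capacityTB = 0.0;
};

// The four operations named on the slide, expressed as an abstract interface.
class FileManager {
public:
    virtual ~FileManager() = default;
    virtual int createFile(const char* fileName, const StoragePolicy* policy) = 0;
    virtual int openFile(const char* fileName) = 0;
    virtual void closeFile(const char* fileName) = 0;
    virtual void changeStoragePolicy(const char* fileName,
                                     const StoragePolicy* policy) = 0;
};

// Hypothetical in-memory stand-in, useful for trying out calling code.
class InMemoryManager : public FileManager {
public:
    // Returns a new descriptor, or -1 (assumed error code) on bad input
    // or when the file already exists.
    int createFile(const char* fileName, const StoragePolicy* policy) override {
        if (fileName == nullptr || policy == nullptr) return -1;
        if (files_.count(fileName) != 0) return -1;
        const int descriptor = nextDescriptor_++;
        files_.emplace(fileName, Entry{descriptor, *policy});
        return descriptor;
    }
    // Returns the descriptor of an existing file, or -1 when it is unknown.
    int openFile(const char* fileName) override {
        if (fileName == nullptr) return -1;
        const auto it = files_.find(fileName);
        return it == files_.end() ? -1 : it->second.descriptor;
    }
    void closeFile(const char*) override {}  // nothing to release in this fake
    // Replaces the stored policy; unknown files and null policies are ignored.
    void changeStoragePolicy(const char* fileName,
                             const StoragePolicy* policy) override {
        if (fileName == nullptr || policy == nullptr) return;
        const auto it = files_.find(fileName);
        if (it != files_.end()) it->second.policy = *policy;
    }
    // Inspection helper for tests; not part of the API on the slide.
    const StoragePolicy* policyOf(const char* fileName) const {
        if (fileName == nullptr) return nullptr;
        const auto it = files_.find(fileName);
        return it == files_.end() ? nullptr : &it->second.policy;
    }

private:
    struct Entry {
        int descriptor;
        StoragePolicy policy;
    };
    std::map<std::string, Entry> files_;
    int nextDescriptor_ = 0;
};
```

Coding against the abstract interface would let application code be exercised without a Lustre installation; whether the actual library exposes such an interface is not stated in the slides.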

Implementation status – the server side
- The finder service is implemented in Python.
- The function that determines similarity is defined piecewise (if / else) in terms of the current value and the required value of each attribute i. [The formula itself is not preserved in this transcript.]
- Communication with the monitoring system follows the REST model.
- Integration with the VO knowledge base uses a SOAP-based web service.
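The similarity formula from this slide did not survive in the transcript; only its legend (current value and required value of attribute i, combined in an if / else form) is left. The sketch below is therefore NOT the authors' function. It is one plausible piecewise rule, written in C++ rather than the service's Python for consistency with the usage example, and it assumes that every attribute is "higher is better". The name `shortfall` and the `Attributes` alias are hypothetical.

```cpp
#include <map>
#include <string>

using Attributes = std::map<std::string, double>;  // attribute name -> value

// ASSUMED scoring rule, not the formula from the slide. For each attribute i:
//   term_i = 0                                        if current_i >= required_i
//   term_i = (required_i - current_i) / required_i    otherwise
// The result is the sum of all terms: 0 means every requirement is met,
// and larger values mean a worse match. An attribute without a monitored
// value is treated as having the current value 0.
double shortfall(const Attributes& required, const Attributes& current) {
    double total = 0.0;
    for (const auto& entry : required) {
        const double requiredValue = entry.second;
        if (requiredValue <= 0.0) continue;  // nothing to fulfil
        const auto it = current.find(entry.first);
        const double currentValue = (it != current.end()) ? it->second : 0.0;
        if (currentValue < requiredValue) {
            total += (requiredValue - currentValue) / requiredValue;
        }
    }
    return total;
}
```

With such a rule the finder service would pick the storage element with the smallest result. Dividing by the required value keeps attributes with different units (TB, MB/s) comparable, which is a design choice of this sketch rather than something the slides state.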

Implementation status – non-functional requirements
The current implementation of the StoragePolicy class includes:
- preferredDeviceType: the device type preferred by the user, e.g. a fast writable device or a highly available device
- capacity: free storage space required
- averageReadTransferRate: mean read transfer rate over the last monitoring series
- averageWriteTransferRate: mean write transfer rate over the last monitoring series
- currentReadTransferRate: mean read transfer rate from the last measurement
- currentWriteTransferRate: mean write transfer rate from the last measurement
- throughput: numerical value of the required throughput
- throughputLevel: required throughput level on a device, e.g. LOW, MEDIUM or HIGH, which will be mapped onto a numerical value
Users set only those aspects of a policy that matter to them.
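Two points from this slide lend themselves to a small sketch: a policy holds only the aspects the user has set, and a throughput level is mapped onto a number. The code below is a hypothetical illustration, not the actual StoragePolicy class. The class name `StoragePolicySketch`, the `Attribute` enumeration and, in particular, the MB/s values chosen for LOW, MEDIUM and HIGH are assumptions; the slide does not give the mapping.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>

enum class ThroughputLevel { LOW, MEDIUM, HIGH };

// ASSUMED mapping in MB/s; the slide only says a level "will be mapped"
// onto a numerical value.
double toThroughput(ThroughputLevel level) {
    switch (level) {
        case ThroughputLevel::LOW:    return 20.0;
        case ThroughputLevel::MEDIUM: return 80.0;
        case ThroughputLevel::HIGH:   return 200.0;
    }
    throw std::invalid_argument("unknown throughput level");
}

// Numeric aspects of a policy, named after the attributes on the slide.
enum class Attribute {
    Capacity,
    AverageReadTransferRate,
    AverageWriteTransferRate,
    CurrentReadTransferRate,
    CurrentWriteTransferRate,
    Throughput
};

// Stores only the aspects that were explicitly set, so the finder service
// can ignore everything the user did not ask for.
class StoragePolicySketch {
public:
    void set(Attribute attribute, double value) { values_[attribute] = value; }
    // A level is stored as the numerical throughput it maps onto.
    void setThroughputLevel(ThroughputLevel level) {
        set(Attribute::Throughput, toThroughput(level));
    }
    bool has(Attribute attribute) const { return values_.count(attribute) != 0; }
    // Throws std::out_of_range when the aspect was never set.
    double get(Attribute attribute) const { return values_.at(attribute); }
    std::size_t size() const { return values_.size(); }

private:
    std::map<Attribute, double> values_;
};
```

Keeping unset aspects out of the map, instead of storing sentinel values such as 0 or -1, means that an unset aspect cannot be mistaken for a requirement of zero.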

Implementation status – usage example

    LustreManager manager;
    StoragePolicyFactory factory;

    // creating a new policy object with default parameters for the given device type
    StoragePolicy policy = factory.createPolicyForDeviceType(HIGHAVAILABLEDEVICE);

    // setting additional parameters
    policy.setCapacity(1.1);                     // [TB]
    policy.setThroughput(80.0);                  // [MB/s]
    policy.setAverageWriteTransferRate(100.0);   // [MB/s]

    // creating a new file
    int descriptor = manager.createFile("test_file", &policy);

Future work
- Support for multiple, geographically distributed Lustre installations
- Support for different storage systems (e.g. dCache)
- Performance tests on the PL-Grid testbed (not available yet)
- New algorithms for determining similarity
- Implementation of the second presented use case (the one with a default storage policy defined at the VO level)

Conclusions
- One of the goals of the PL-Grid project is to develop new approaches to storage management in the Grid environment
- Explicit definitions of non-functional requirements are necessary in data-intensive applications
- The presented solution is easy to use (a standard C++ shared library), to extend (storage elements are described in an ontological form) and to understand (a clean algorithm for finding a storage element)
- There is still much to be done

Do you want to know more?