Published by Alban Stevenson. Modified over 9 years ago.
1 eScience on Distributed Infrastructure in Poland. Marian Bubak, AGH University of Science and Technology, ACC Cyfronet, Krakow, Poland, dice.cyfronet.pl. PLAN-E, the Platform of National eScience/Data Research Centers in Europe, 29-30 September 2014, Amsterdam.
2 Outline: ACC Cyfronet AGH; PL-Grid Consortium and Programme; Focus on users: training and support; Platforms and tools: towards PL-ecosystem; International cooperation, conferences; Summary.
3 Credits. ACC Cyfronet AGH: Michał Turała, Krzysztof Zieliński, Karol Krawentek, Agnieszka Szymańska, Maciej Twardy, Angelika Zaleska-Walterbach, Andrzej Oziębło, Zofia Mosurska, Marcin Radecki, Renata Słota, Tomasz Gubała, Darin Nikolow, Aleksandra Pałuk, Patryk Lasoń, Marek Magryś, Łukasz Flis. ICM: Marek Niezgódka, Piotr Bała, Maciej Filocha. PCSS: Maciej Stroiński, Norbert Meyer, Krzysztof Kurowski, Bartek Palak, Tomasz Piontek, Dawid Szejnfeld, Paweł Wolniewicz. WCSS: Paweł Tykierko, Paweł Dziekoński, Bartłomiej Balcerek. TASK: Rafał Tylman, Mścislaw Nakonieczny, Jarosław Rybicki. … and many other domain experts.
4 ACC Cyfronet AGH: 40 years of expertise. High Performance Computing; High Performance Networking; Centre of Competence. Participation and coordination of national and international scientific projects. Computational power, storage and libraries for scientific research. Coordinator of PL-Grid Infrastructure Development. Main node of Cracow MAN. South Poland main node of PIONIER network. Access to GEANT network. TOP500 (VI.2014): rank 176; site: Cyfronet, Poland; system: Cluster Platform, Infiniband, Hewlett-Packard; cores: 25,468; Rmax: 266.9 Tflops; Rpeak: 373.9 Tflops.
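As a quick reading of the TOP500 figures on this slide, the ratio Rmax/Rpeak gives the fraction of theoretical peak reached in the benchmark run. The following is a minimal Python sketch that uses only the three numbers quoted above; the constant and function names are illustrative and do not come from the presentation.

```python
# Figures quoted on the slide (TOP500 entry, VI.2014 list).
CORES = 25_468
RMAX_TFLOPS = 266.9   # measured benchmark performance
RPEAK_TFLOPS = 373.9  # theoretical peak performance


def efficiency(rmax: float, rpeak: float) -> float:
    """Fraction of theoretical peak achieved in the benchmark run."""
    return rmax / rpeak


def peak_gflops_per_core(rpeak_tflops: float, cores: int) -> float:
    """Average theoretical peak per counted core, in Gflops (1 Tflops = 1000 Gflops)."""
    return rpeak_tflops * 1000.0 / cores


if __name__ == "__main__":
    print(f"efficiency: {efficiency(RMAX_TFLOPS, RPEAK_TFLOPS):.1%}")
    print(f"peak per core: {peak_gflops_per_core(RPEAK_TFLOPS, CORES):.2f} Gflops")
```

The per-core value is only an average over the core count listed in the ranking; it says nothing about how the peak is split between different node types.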
5 Motivation and background. Experiments in silico: advanced, distributed computing; big international collaboration; e-Science and e-Infrastructure interaction. World progress in Big Science: Theory, Experiment, Simulation. Data intensive computing; numerically intensive computing. Computational Science problems to be addressed: algorithms, environments and deployment. 4th paradigm, Big Data, Data Farming. Needs: increase of resources; support for making science.
6 PL-Grid Consortium Consortium creation – January 2007 a response to requirements from Polish scientists due to ongoing Grid activities in Europe (EGEE, EGI_DS) Aim: significant extension of amount of computing resources provided to the scientific community (start of the PL-Grid Programme) Development based on: projects funded by the European Regional Development Fund as part of the Innovative Economy Program close international collaboration (EGI, ….) previous projects (5FP, 6FP, 7FP, EDA…) National Network Infrastructure available: Pionier National Project computing resources: Top500 list Polish scientific communities: ~75% highly rated Polish publications in 5 Communities PL-Grid Consortium members: 5 High Performance Computing Polish Centres, representing Communities, coordinated by ACC Cyfronet AGH
7 PL-Grid and PLGrid Plus in short. PL-Grid Project (2009–2012): budget total 21 M€, from EU 17 M€; outcome: common base infrastructure, National Grid Infrastructure (NGI_PL); resources: 230 Tflops, 3.6 PB. PLGrid Plus Project (2011–2014): budget total ca. 18 M€, from EU ca. 15 M€; expected outcome: focus on users, specific computing environments, QoS by SLM; extension of resources and services by 500 Tflops, 4.4 PB; keeping diversity for users: clusters (thin and thick nodes, GPU), SMP, vSMP, clouds.
8 PL-Grid project PL-Grid aimed at significantly extending the amount of computing resources provided to the Polish scientific community (by approximately 215 TFlops of computing power and 2500 TB of storage capacity) and constructing a Grid system that would facilitate effective and innovative use of the available resources. Polish Infrastructure for Supporting Computational Science in the European Research Space – PL-Grid Budget: total 21 m€, from EC 17m€ Duration: 1.1.2009 – 31.3.2012 Managed by the PL-Grid Consortium made up of 5 Polish supercomputing and networking centres Project coordinator: Academic Computer Centre Cyfronet AGH, Krakow, Poland Project web site: projekt.plgrid.pl Main Project Objectives: Common (compatible) base infrastructure Capacity to construct specialized, domain Grid systems for specific applications Efficient use of available financial resources Focus on HPC and Scalability Computing for domain specific Grids
9 PL-Grid project – results. Publication of the book presenting the scientific and technical achievements of the Polish NGI, published by Springer in March 2012: "Building a National Distributed e-Infrastructure – PL-Grid", Lecture Notes in Computer Science, Vol. 7136, subseries: Information Systems and Applications. Content: 26 articles describing the experience and the scientific results obtained by the PL-Grid project partners as well as the outcome of research and development activities carried out within the Project. First working NGI in Europe in the framework of EGI.eu (since March 31, 2010). Number of users (March 2012): 900+. Number of jobs per month: 750,000 - 1,500,000. Resources available: computing power ca. 230 TFlops; storage ca. 3600 TBytes. High level of availability and reliability of the resources. Facilitating effective use of these resources by providing: innovative grid services and end-user tools like Efficient Resource Allocation, Experimental Workbench and Grid Middleware; Scientific Software Packages; user support: helpdesk system, broad training offer. Various well-executed dissemination activities, carried out at national and international levels, which contributed to increasing awareness and knowledge of the Project and of grid technology in Poland.
10 PLGrid Plus project Domain-oriented services and resources of Polish Infrastructure for Supporting Computational Science in the European Research Space – PLGrid Plus Budget: total ca. 18 M€ including funding from the EC: ca.15 M€ Duration: 1.10.2011 – 31.12.2014 Five PL-Grid Consortium Partners Project Coordinator: ACC CYFRONET AGH The main aim of the PLGrid Plus project is to increase potential of the Polish Science by providing the necessary IT services for research teams in Poland, in line with European solutions. Preparation of specific computing environments so called domain grids i.e. solutions, services and extended infrastructure (including software), tailored to the needs of different groups of scientists. Domain-specific solutions created for 13 groups of users, representing the strategic areas and important topics for the Polish and international science:
11 PLGrid Plus project – activities. Integration Services: national and international levels; dedicated portals and environments; unification of distributed databases; virtual laboratories; remote visualization; service value = utility + warranty; SLA management. Computing Intensive Solutions: specific computing environments; adoption of suitable algorithms and solutions; workflows; cloud computing; porting scientific packages. Data Intensive Computing: access to distributed scientific databases; homogeneous access to distributed data; data discovery, processing, visualization, validation…; 4th paradigm of scientific research. Instruments in Grid: remote transparent access to instruments; sensor networks. Organizational: organizational backbone; professional support for specific disciplines and topics.
12 PLGrid Plus project – results. New domain-specific services for 13 identified scientific domains. Extension of the resources available in the PL-Grid Infrastructure by ca. 500 TFlops of computing power and ca. 4.4 PBytes of storage capacity. Design and start-up of support for new domain grids. Deployment of a Quality of Service system for users by introducing SLAs. Deployment of new infrastructure services. Deployment of Cloud infrastructure for users. Broad consultancy, training and dissemination offer. Publication of the book presenting the scientific and technical achievements of PLGrid Plus, published by Springer in September 2014: "eScience on Distributed Computing Infrastructure", Lecture Notes in Computer Science, Vol. 8500, subseries: Information Systems and Applications. Content: 36 articles describing the experience and the scientific results obtained by the PLGrid Plus project partners as well as the outcome of research and development activities carried out within the Project. Huge effort of 147 authors, 76 reviewers and the editors' team in Cyfronet.
13 PLGrid NG project New generation domain-specific services in the PL-Grid infrastructure for Polish Science Budget: total ca. 14 889 773,23 PLN, including funding from the EC: 12 651 715,38 PLN Duration: 01.01.2014 – 31.10.2015 Five PL-Grid Consortium Partners Project Coordinator: ACC CYFRONET AGH Meteorology Biology Personalized Medicine Complex Networks Mathematics UNRES Medicine Computational Chemistry eBaltic-Grid Hydrology Nuclear Power and CFD OpenOxides Geoinformatics Metal Processing Technologies The aim of the PLGrid NG project is to provide a set of dedicated, domain-specific computing services for 14 new groups of researchers and implementation of these services in the PL-Grid national computing infrastructure.
14 PLGrid NG project – activities. Tasks: Additional groups of experts involved − identified 14 communities/scientific topics. Development and maintenance of the IT infrastructure, in line with the best IT Service Management (ITSM) practices, such as ITIL or ISO-20000. Security of new applications, audits: in the development stage, before deployment and during exploitation. Optimization of resource usage − IT experts, Operation Center. Optimization of application porting. User support: first-line support, Helpdesk, domain experts, training. [Figure labels: Grid infrastructure (Grid services) PL-Grid; Application; Clusters; High Performance Computers; Data repositories; National Computer Network PIONIER; Domain Grid; New Advanced Service Platforms.]
15 PLGrid Core project Competence Centre in the Field of Distributed Computing Grid Infrastructures Budget: total 104 949 901,16 PLN, including funding from the EC : 89 207 415,99 PLN Duration: 01.01.2014 – 31.11.2015 Project Coordinator: Academic Computer Centre CYFRONET AGH The main objective of the project is to support the development of ACC Cyfronet AGH as a specialized competence centre in the field of distributed computing infrastructures, with particular emphasis on grid technologies, cloud computing and infrastructures supporting computations on big data.
16 PLGrid Core project – services Basic infrastructure services Uniform access to distributed data PaaS Cloud for scientists Applications maintenance environment of MapReduce type End-user services Technologies and environments implementing the Open Science paradigm Computing environment for interactive processing of scientific data Platform for development and execution of large-scale applications organized in a workflow Automatic selection of scientific literature Environment supporting data farming mass computations
17 Focus on users Computer centres Hardware/Software User friendly Services Domain Experts Real Users Help Desk QoS/SLM Grants
18 User support Interdisciplinary team of IT experts with extensive knowledge on different programming methods used in research: parallel, distributed and GPGPU cards programming various scientific software the specifics of work with HPC/Cloud systems various aspects of work with large data sets Support methods PL-Grid Infrastructure user support systems (Helpdesk, User’s Forum) documentation services, PL-Grid User’s Manual f2f meetings and consultations in ACC Cyfronet AGH and users' home institutions International cooperation cooperation with various institutions and initiatives dedicated to scientists’ training: Software Sustainability Institute (UK), Software Carpentry, Data Carpentry, Mozilla Science Lab, ELIXIR UK Cyfronet is making every effort to become a Software Carpentry regional center in Poland or Central Europe Users of the Cyfronet computing resources are provided with support and professional help in solving any problems related to access and effective use of these resources.
19 Training. Training on basic and advanced services: traditional − in ACC Cyfronet AGH or in the interested users' home scientific institutions; remote − using a teleconference platform (Adobe Connect) and e-learning platforms (Blackboard Learn – currently; Moodle – planned). Courses are prepared based on the experts' experience gained, among others, during previous projects. A survey assessing the training is performed after each course.
20 PL-Grid Infrastructure users. [Chart; legend: PL-Grid Users, All accounts, Employees.]
21 Grid users of global services
22 PL-Grid users of domain-specific services
23 GridSpace: a platform for e-Science applications Experiment: an e-science application composed of code fragments (snippets), expressed in either general-purpose scripting programming languages, domain-specific languages or purpose-specific notations. Each snippet is evaluated by a corresponding interpreter. GridSpace2 Experiment Workbench: a web application - an entry point to GridSpace2. It facilitates exploratory development, execution and management of e-science experiments. Embedded Experiment: a published experiment embedded in a web site. GridSpace2 Core: a Java library providing an API for development, storage, management and execution of experiments. Records all available interpreters and their installations on the underlying computational resources. Computational Resources: servers, clusters, grids, clouds and e-infrastructures where the experiments are computed.
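The experiment model described on this slide (snippets in different languages, each evaluated by a corresponding interpreter) can be pictured with a small Python sketch. It is a toy illustration under stated assumptions, not the GridSpace2 Core API, which the slide describes as a Java library: the `Snippet` class, the interpreter registry and the two toy "languages" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Snippet:
    """One code fragment of an experiment, tagged with the interpreter it needs."""
    interpreter: str
    code: str


def run_sum(code: str) -> str:
    # Toy "domain-specific notation": whitespace-separated integers to be added up.
    return str(sum(int(token) for token in code.split()))


def run_shout(code: str) -> str:
    # Toy "purpose-specific notation": echo the text in upper case.
    return code.upper()


# Hypothetical registry of available interpreters (name -> evaluator).
INTERPRETERS: Dict[str, Callable[[str], str]] = {
    "sum": run_sum,
    "shout": run_shout,
}


def run_experiment(snippets: List[Snippet]) -> List[str]:
    """Evaluate each snippet with its own interpreter, in order."""
    results = []
    for snippet in snippets:
        if snippet.interpreter not in INTERPRETERS:
            raise ValueError(f"no interpreter registered for {snippet.interpreter!r}")
        results.append(INTERPRETERS[snippet.interpreter](snippet.code))
    return results


if __name__ == "__main__":
    experiment = [Snippet("sum", "1 2 3"), Snippet("shout", "done")]
    print(run_experiment(experiment))
```

The point of the sketch is the dispatch step: the experiment itself stays a plain list of fragments, and the registry decides which evaluator handles each one.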
24 Collage: executable e-Science publications Goal: Extending the traditional scientific publishing model with computational access and interactivity mechanisms; enabling readers (including reviewers) to replicate and verify experimentation results and browse large-scale result spaces. Challenges: Scientific: A common description schema for primary data (experimental data, algorithms, software, workflows, scripts) as part of publications; deployment mechanisms for on-demand reenactment of experiments in e-Science. Technological: An integrated architecture for storing, annotating, publishing, referencing and reusing primary data sources. Organizational: Provisioning of executable paper services to a large community of users representing various branches of computational science; fostering further uptake through involvement of major players in the field of scientific publishing.
25 DataNet: collaborative metadata management. Objectives: Provide means for ad-hoc metadata model creation and deployment of corresponding storage facilities. Create a research space for metadata model exchange and discovery with associated data repositories with access restrictions in place. Support different types of storage sites and data transfer protocols. Support the exploratory paradigm by making the models evolve together with data. Architecture: The Web Interface is used by users to create, extend and discover metadata models. Model repositories are deployed in the PaaS Cloud layer for scalable and reliable access from computing nodes through REST interfaces. Data items from Storage Sites are linked from the model repositories.
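To make the REST-based access from computing nodes more concrete, here is a minimal Python sketch of how a job might prepare a metadata record for a model repository. Everything specific in it is an assumption: the base URL, the `/models/<model>/<entity>` path layout and the record fields are hypothetical, since the slide does not describe the actual DataNet endpoints. The request is only built, not sent.

```python
import json
from urllib.request import Request

# Hypothetical repository address; the real DataNet URLs are not given in the slides.
BASE_URL = "https://datanet.example.org"


def build_entity_request(model: str, entity: str, record: dict) -> Request:
    """Build (but do not send) a POST request that would store one metadata record."""
    url = f"{BASE_URL}/models/{model}/{entity}"  # assumed path layout
    body = json.dumps(record).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_entity_request(
        "extrusion", "runs", {"velocity": 2.5, "data_item": "run-001.dat"}
    )
    print(request.get_method(), request.full_url)
```

Keeping the record as plain JSON fits the exploratory idea on the slide: a field can be added to the record without changing the calling code.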
26 Cloud Platform: resource allocation management. The Atmosphere Cloud Platform is a one-stop management service for hybrid cloud resources, ensuring optimal deployment of application services on the underlying hardware. Customized applications may directly interface with Atmosphere via its RESTful API called the Cloud Facade. [Architecture diagram labels: VPH-Share Master Int.; Admin; Developer; Scientist; Development Mode; VPH-Share Core Services Host; OpenStack/Nova Computational Cloud Site; Worker Node; Head Node; Image store (Glance); Cloud Facade (secure RESTful API); Other CS; Amazon EC2; Atmosphere Management Service (AMS); Cloud stack plugins (Fog); Atmosphere Internal Registry (AIR); Cloud Manager; Generic Invoker; Workflow management; External application; Cloud Facade client.]
27 InSilicoLab science gateway framework. Goals: complex computations done in a non-complex way; separating users from the concept of jobs and the infrastructure; modelling the computation scenarios in an intuitive way; different granularity of the computations; interactive nature of applications; dependencies between applications. Summary: the framework proved to be an easy way to integrate new domain-specific scenarios, even if done by external teams; it natively supports multiple types of computational resources, including private resources, e.g. private clouds; it supports various types of computations. Figure: architecture of the InSilicoLab framework: Domain Layer, Mediation Layer with its Core Services, and Resource Layer. In the Resource Layer, Workers ("W") of different kinds (marked with different colors) are shown.
28 Scalarm (panels: "What problems are addressed with Scalarm?" and "Scalarm overview"). Self-scalable platform adapting to experiment size and simulation type. Exploratory approach for conducting experiments. Supporting online analysis of partial experiment results. Integrates with clusters, Grids, Clouds. Data farming experiments with an exploratory approach. Parameter space generation with support of design of experiment methods. Accessing heterogeneous computational infrastructure. Self-scalability of the management part.
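Parameter space generation for a data farming experiment can be illustrated with the simplest design of experiment method, a full factorial design, in which every combination of parameter values becomes one simulation run. The Python sketch below is a generic illustration, not Scalarm code; the function name and the sample parameters are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence


def full_factorial(parameters: Dict[str, Sequence]) -> List[dict]:
    """Return one input point per combination of the given parameter values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combination)) for combination in product(*value_lists)]


if __name__ == "__main__":
    # Hypothetical simulation parameters with a few levels each.
    space = full_factorial({"velocity": [1, 2, 3], "temperature": [300, 400]})
    for point in space:
        print(point)
```

The number of points grows as the product of the level counts, which is why such experiments are usually spread over many workers and why sparser designs are used when the full grid is too large.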
29 VeilFS. Functionalities provided by VeilFS: a system operating in user space (i.e. FUSE) that virtualizes organizationally distributed, heterogeneous storage systems to obtain uniform and efficient access to data. End users access the data stored within VeilFS through one of the provided user interfaces: a FUSE client, which implements a file system in user space to hide the data location and exposes a standard POSIX file system interface; a Web-based GUI, which allows data management via any Internet browser; a REST API.
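Because the FUSE client exposes a standard POSIX file system interface, existing programs can use ordinary file operations without knowing where the data is stored. The Python sketch below shows such unmodified code; a temporary directory stands in for the mount point, since the actual mount path is not given in the slides, and the helper name is hypothetical.

```python
import os
import tempfile


def describe_file(path: str, head_bytes: int = 16) -> tuple:
    """Return (size in bytes, first bytes) using only ordinary file operations."""
    size = os.stat(path).st_size
    with open(path, "rb") as handle:
        head = handle.read(head_bytes)
    return size, head


if __name__ == "__main__":
    # Stand-in for a mounted directory; with a real FUSE mount only the path changes.
    with tempfile.TemporaryDirectory() as mount_point:
        sample = os.path.join(mount_point, "result.dat")
        with open(sample, "wb") as handle:
            handle.write(b"hello")
        print(describe_file(sample))
```

Nothing in `describe_file` refers to the storage system, which is the practical benefit of the POSIX interface: only the path passed in would differ.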
30 Chemistry InSilicoLab for chemistry The service aims to support the launch of complex computational quantum chemistry experiments in the PL-Grid Infrastructure. Experiments of this service facilitate planning sequential computation schemes that require the preparation of series of data files, based on a common schema.
31 Metallurgy. Simulations of extrusion process in 3D. Main objective: optimization of the metallurgical process of profile extrusion. Optimization includes: shape of foramera, channel position on a die, calibration stripes, extrusion velocity, ingot temperatures, tools. The proposed grid-based software simulates extrusion of thin profiles and rods of special alloys of magnesium, containing calcium supplements. These alloys are characterized by extremely low technological plasticity during metal forming. A FEM mathematical model was developed.
32 Life Science. Integromics – a system for researchers from biomedicine and biotechnology. The system was developed to allow: data collection from experiments, laboratory diagnostics, diagnostic imaging, instrumental analysis and medical interviews; integration, management, processing and analysis of the collected data using specialized software and selected data mining techniques; hypothesis generation; data sharing and presentation of the results. Example: the diagram of an artificial neural network used to classify patients based on the expression of selected genes. The method makes it possible to raise new hypotheses about the influence of individual genes on changes in the organisms.
33 SynchroGrid. Elegant − the service for those involved in the design and operation of a synchrotron. The developed service consists of: provision of the elegant (ELEctron Generation ANd Tracking) application in the parallel version on a cluster; configuring the Matlab software to read output files produced by this application in the Self Describing Data Sets (SDDS) format and to generate the final results in the form of drawings. Objectives: Preparation of tools needed for synchrotron deployment and running, aimed at operations and research of the beam line. Addressing the estimated users' needs in this scientific area, focusing on data access and management, especially the metadata for the experimental data gathered during the beam time.
34 International cooperation – EU funded projects. ACC Cyfronet AGH is involved in numerous projects co-financed by EU funds and the Polish government. Research conducted in Cyfronet focuses on: grid and cloud environments, programming paradigms, research portals, efficient use of computing and storage resources, reconfigurable FPGA and GPGPU computing systems.
35 National projects
36 Organization of conferences. For many years Cyfronet has been organizing national and international conferences, workshops and seminars, which bring together computer scientists and researchers involved in the creation, development and application of information technologies, as well as the users of these technologies. The Centre has also initiated a series of conferences: the CGW Workshop, held yearly since 2001, and the ACC Cyfronet AGH Users' Conference, held yearly since 2008. It has also organized the International Conference on Computational Science (ICCS) twice, in 2004 and 2008. http://www.cyfronet.krakow.pl/cgw14/
37 Organization of conferences CGW Workshop Proceedings
38 Summary: what we offer. We develop and deploy research e-infrastructure in three dimensions: Network & Future Internet; HPC/GRID/CLOUDs; Data & Knowledge layer. Deployments have national scope, with close European links. Developments are oriented towards end-users and projects. Synergy between research projects and e-infrastructures is achieved by close cooperation and by offering relevant services. Durability of at least 5 years after finishing the projects, confirmed in contracts. Future plans: continuation of current policy with support from EU Structural Funds; Center of Excellence in Life Science; CGW as a place to exchange experience and for collaboration between eScience centers in Europe.
39 More information www.cyfronet.krakow.pl/en www.plgrid.pl/en www.cyfronet.krakow.pl/cgw14 www.cyfronet.krakow.pl/kdm14 dice.cyfronet.pl