Published by Jasper Potter. Modified over 9 years ago.
1
9-Sept-2003, CAS2003, Annecy, France, WFS
Distributed Data Management at DKRZ
Wolfgang Sell, Hartmut Fichtel
Deutsches Klimarechenzentrum GmbH
sell@dkrz.de, fichtel@dkrz.de
2
Table of Contents
- DKRZ - a German HPC Center
- HPC System Architecture suited for Earth System Modeling
- The HLRE Implementation at DKRZ
- Implementing IA64/Linux based Distributed Data Management
- Some Results
- Summary
3
DKRZ - a German HPCC
- Mission of DKRZ
- DKRZ and its Organization
- DKRZ Services
- Model and Data Services
4
Mission of DKRZ
In 1987 DKRZ was founded with the mission to:
- Provide state-of-the-art supercomputing and data services to the German scientific community for conducting top-of-the-line Earth System and Climate Modelling.
- Provide associated services, including high-level visualization.
5
DKRZ and its Organization (1)
Deutsches KlimaRechenZentrum = DKRZ (German Climate Computer Center)
- organised under private law (GmbH) with 4 shareholders
- investments funded by federal government, operations funded by shareholders
- usage: 50 % shareholders and 50 % community
6
DKRZ and its Organization (2)
DKRZ internal structure
- 3 departments for
  - systems and networks
  - visualisation and consulting
  - administration
- 20 staff in total
- until restructuring at the end of 1999, a fourth department supported climate model applications and climate data management
7
DKRZ Services
operations center: DKRZ
- technical organization of computational resources (compute, data and network services, infrastructure)
- advanced visualisation
- assistance for parallel architectures (consulting and training)
8
Model & Data Services
competence center: Model & Data
- professional handling of community models
- specific scenario runs
- scientific data handling
- Model & Data Group external to DKRZ, administered by MPI for Meteorology, funded by BMBF
9
HPC System Architecture suited for Earth System Modeling
- Principal HPC System Configuration
- Links between Different Services
- The Data Problem
10
Principal HPC System Configuration
11
Link between Compute Power and Non-Computing Services
Functionality and Performance Requirements for Data Service
- Transparent Access to Migrated Data
- High Bandwidth for Data Transfer
- Shared Filesystem
- Possibility for Adaptation in Upgrade Steps due to Changes in Usage Profile
12
Compute server power
13
Adaptation Problem for Data Server
14
Pros of Shared Filesystem Coupling
- High Bandwidth between the Coupled Servers
- Scalability supported by Operating System
- No Need for Multiple Copies
- Record Level Access to Data with High Performance
- Minimized Data Transfers
15
Cons of Shared Filesystem Coupling
- Proprietary Software needed
- Standardisation still missing
- Limited Number of Vendors whose Systems can be connected
16
HLRE Implementation at DKRZ
HöchstLeistungsRechnersystem für die Erdsystemforschung = HLRE (High Performance Computer System for Earth System Research)
- Principal HLRE System Configuration
- HLRE Installation Phases
- IA64/Linux based Data Services
- Final HLRE Configuration
17
Principal HLRE System Configuration
18
HLRE Phases

Date                                                     Feb 2002   4Q 2002   3Q 2003
Nodes                                                    8          16        24
CPUs                                                     64         128       192
Expected Sustained Performance [Gflops]                  ca. 200    ca. 350   ca. 500
Expected Increase in Thruput compared to CRAY C916       ca. 40     ca. 75    ca. 100
Main Memory [Tbytes]                                     0.5        1.0       1.5
Disk Capacity [Tbytes]                                   ca. 30     ca. 50    ca. 60
Mass Storage Capacity [Tbytes]                           >720       >1400     >3400
19
DS phase 1: basic structure
- CS performance increase f = 37
- F = f^(3/4) = 15
- minimal component performance indicated in diagram
- explicit user access: ftp, scp...
- CS disks with local copies
- DS disks for cache
- physically distributed DS
- NAS architecture
Diagram labels: CS, client(s), DS, other clients; GE; 180 MB/s, 45 MB/s, 150 MB/s, 375 MB/s; 16.5 TB, ~ PB, 11 TB
20
Adaptation Option for Data Server
21
DS phases 2,3: basic structure
- CS performance increase f = 63/100
- F = f^(3/4) = 22.4/31.6
- minimal component performance indicated in diagram
- implicit user access: local UFS commands
- CS disks with local copies
- shared disks (GFS)
- DS disks for IO buffer cache
- Intel/Linux platforms
- homogeneous HW
- technological challenge
Diagram labels: CS, client(s), DS, other clients; GE, FC; 270/325 MB/s, 70/80 MB/s, 225/270 MB/s, 560/675 MB/s; 16.5 TB, ~ PB, 25/30 TB, 11 TB
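The relation F = f^(3/4) quoted for the three phases can be checked with a short Python sketch. The slides do not define F explicitly; it is read here as the scaling factor derived from the compute server (CS) performance increase f, and the function name is purely illustrative.

```python
def derived_factor(f: float) -> float:
    """Return F = f**(3/4) for a compute-server performance increase f.

    Interpreting F as the data-server scaling factor is an assumption;
    the slides only state the formula and the resulting numbers.
    """
    return f ** 0.75


# CS performance increase factors f as quoted on the slides.
phases = {"phase 1": 37, "phase 2": 63, "phase 3": 100}

for name, f in phases.items():
    print(f"{name}: f = {f:>3}, F = {derived_factor(f):.1f}")
```

Rounded to one decimal place, this reproduces the values 15, 22.4 and 31.6 given on the slides.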
22
Implementing IA64/Linux based Distributed Data Management
- Overall Phase 1 Configurations
- Introducing Linux based Distributed HSM
- Introducing Linux based Distributed DBMS
- Final Overall Phase 3 Configuration
23
Proposed final phase 3 configuration
[Diagram; recoverable labels listed below, link multiplicities omitted]
- Compute server: SX-6, 24 nodes, IXS
- Networks: HS/MS LAN, Gigabit Ethernet (GE), FE x 2/node, Fibre Channel (FC), Silkworm 12000
- AsAmA 16way, GFS/Server (UVDM, UDSN)
- AsAmA 4way, GFS/Client, Oracle (UDSN/UDNL)
- AzusA 16way, GFS/Server, post processing system (UCFM/UDSN); Disk FC x 8, Tape FC x 6
- GFS Disk (Polestar): 0.28 x 53 = 14.8 TB
- Disk Cache (Polestar): 0.57 TB x 15 = 8.5 TB
- Disk Cache (DDN): 0.69 TB x 12 = 8.3 TB
- Local Disk FC-RAID: 0.28 TB x 20 = 5.6 TB
- Local Disk (Polestar): 0.14 x 2 = 0.28 TB
- Oracle DB (DDN): 2 TB x 4 = 8 TB
- 9940B x 20, 9840B x 0, 9840C x 5
- Oracle Application Server, SQLNET, Sun 4CPU, AsamA 4CPU, The Internet
- Migration upon market availability of components
24
Some Results
- Growth of the Data Archive
- Growth of Transfer Rate
- Observed Transfer Rates for HLRE
- FLOPS Rates
25
DS archive capacity [TB]
26
DS archive capacity (2001-2003)
27
DS transfer rates [GB/day]
28
DS transfer rates (2001-2003)
29
DS transfer rates (2001-2003)
30
Observed Transfer Rates for HLRE

Link                                  Single Stream [MB/s]   Aggregate [MB/s]
CS -> DS via ftp (12.1 SUPER-UX)      13                     100
CS -> DS via ftp (12.2 SUPER-UX)      25                     200
CS -> local disk (12.1 SUPER-UX)      40 - 50                > 2000
CS -> GFS disk (13.1 SUPER-UX)        up to 90               3900
DS -> GFS disk (Linux)                up to 80               500 per node
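To put these rates in perspective, the following Python sketch estimates how long it would take to move a given data volume at a sustained aggregate rate. The 1 TB volume, the function name and the convention 1 TB = 10^6 MB are illustrative assumptions, and the aggregate rates are taken as 200 MB/s (ftp under SUPER-UX 12.2) and 3900 MB/s (GFS disk), with the thousands figure read without a decimal point.

```python
def transfer_seconds(volume_tb: float, rate_mb_per_s: float) -> float:
    """Seconds needed to move volume_tb terabytes at rate_mb_per_s.

    Assumes 1 TB = 1,000,000 MB and a perfectly sustained rate,
    which real transfers will not achieve.
    """
    return volume_tb * 1_000_000 / rate_mb_per_s


# Aggregate rates as read from the table (see assumptions above).
aggregate_rates = {
    "CS -> DS via ftp (12.2 SUPER-UX)": 200,
    "CS -> GFS disk (13.1 SUPER-UX)": 3900,
}

for link, rate in aggregate_rates.items():
    minutes = transfer_seconds(1, rate) / 60
    print(f"{link}: about {minutes:.1f} min per TB")
```

Under these assumptions, the shared filesystem path moves a terabyte in a few minutes where the ftp path needs well over an hour.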
31
Observed FLOPS rates for HLRE
- 4 node performance > approx. 100 GFLOPS (about 40 % efficiency) for
  - ECHAM (70-75)
  - MOM
  - Radar Reflection on Sea Ice
- 24 node performance for Turbulence Code about 470 GFLOPS (30+ % efficiency)
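The quoted efficiencies can be roughly reproduced with a short Python sketch. It assumes a nominal peak of 8 GFLOPS per SX-6 CPU, which is not stated on the slides, and 8 CPUs per node, as implied by the HLRE phases table (8 nodes, 64 CPUs). The function and constant names are illustrative.

```python
PEAK_GFLOPS_PER_CPU = 8.0  # assumption: nominal SX-6 peak, not given on the slides
CPUS_PER_NODE = 8          # implied by the HLRE phases table (8 nodes = 64 CPUs)


def efficiency(sustained_gflops: float, nodes: int) -> float:
    """Fraction of nominal peak performance that is sustained on `nodes` nodes."""
    peak_gflops = nodes * CPUS_PER_NODE * PEAK_GFLOPS_PER_CPU
    return sustained_gflops / peak_gflops


# Sustained rates quoted on the slide.
print(f"4 nodes, 100 GFLOPS:  {efficiency(100, 4):.0%}")
print(f"24 nodes, 470 GFLOPS: {efficiency(470, 24):.0%}")
```

With these assumptions the 4 node case comes out at about 39 % and the 24 node case at about 31 %, in line with the "about 40 %" and "30+ %" quoted above.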
32
Summary
- DKRZ provides Computing Resources for Climate Research in Germany at an internationally competitive level
- The HLRE System Architecture is suited to cope with a data-intensive Usage Profile
- Shared Filesystems today are operational in Heterogeneous System Environments
- Standardisation Efforts for Shared Filesystems needed
33
Thank you for your attention!
34
Tape transfer rates (2001-2003)
35
DS transfer requests (2001-2003)
36
DS archive capacity (2001-2003)
37
DS archive capacity (2001-2003)