1 Computer System Replacement at KEK / K. Murakami (KEK/CRC) / FJPPL (KEK/CRC - CC/IN2P3) meeting, 2012/Mar/14

2 Outline
- Overview
- Introduction of the new Central Computing System (KEKCC)
  - CPU
  - Storage
- Operation Aspects

3 Overview

4 Computing Facility at KEK
- Two systems
  - Supercomputer system
  - Central computer system: Linux cluster, plus support for IT infrastructure (mail / web)
  - Both systems are currently being replaced
- Rental systems
  - Replaced every 3-5 years through international bidding
  - The RFI / RFP cycle and system introduction take about 2 years

5 KEK Supercomputer System (KEKSC)
KEKSC is now in service and will soon be fully installed.
- For large-scale numerical simulations
- System A alone: Sep 2011 - Jan 2012; System A+B: from March 2012
- System A: Hitachi SR16000 model M1
  - POWER7, 54.9 TFlops, 14 TB memory
  - 56 nodes: 960 GFlops and 256 GB per node
  - Automatic parallelization within a single node (32 cores)
- System B: IBM Blue Gene/Q
  - 6 racks (3 from Mar 2012, 3 from Oct 2012)
  - 1.258 PFlops and 96 TB memory in total
  - Per rack: 1024 nodes, 5D torus network, 209.7 TFlops, 16 TB memory
- Scientific subjects
  - Large-scale simulation program (http://ohgata-s.kek.jp/)
- More information: http://scwww.kek.jp/

6 New Central Computer System
The new KEKCC merges the current Central Computer System (KEKCC) and the B-Factory Computer System into a single system. The rental period of the current systems ends next February; the new system goes into service in Apr/2012.

7 New KEKCC

8 Features of the New KEKCC
- Main contractor:
- 3.5-year rental system (until Aug/2015)
- 4000 CPU cores
  - Linux cluster (SL5)
  - Interactive / batch servers
  - Grid (gLite) deployed
- Storage system for big data
  - 7 PB disk storage (DDN)
  - Tape library with a maximum capacity of 16 PB
  - High-speed I/O, high scalability

9 CPU: IBM System x iDataPlex
- Work servers & batch servers
  - Xeon 5670 (2.93 GHz, 3.33 GHz with Turbo Boost, 6 cores)
  - 282 nodes with 4 GB/core; 58 nodes with 8 GB/core
  - 2 CPUs per node; 4080 cores in total
- Interconnect
  - InfiniBand 4x QDR (4 GB/s), RDMA
  - Connects to the storage system
- Job scheduler
  - LSF (version 8)
  - Scales up to 1M jobs
- Grid deployment
  - gLite
  - Work servers serve as Grid UI, batch servers as Grid WN (submission sketch below)
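Since the new KEKCC batch system runs LSF, here is a minimal sketch of how a user might submit work from a work server with `bsub`. The queue name, core count, and script name are hypothetical examples, not the actual KEKCC configuration.

```python
# Minimal sketch of submitting a batch job to LSF from a work server.
# Queue name "short" and the job script are hypothetical examples.
import subprocess

def submit_job(command, queue="short", cores=1, logfile="job.%J.out"):
    """Submit `command` to LSF with bsub and return the raw bsub output."""
    bsub_cmd = [
        "bsub",
        "-q", queue,          # target queue
        "-n", str(cores),     # number of job slots (cores)
        "-o", logfile,        # stdout/stderr log; %J expands to the job ID
        command,
    ]
    result = subprocess.run(bsub_cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()   # e.g. "Job <12345> is submitted to queue <short>."

if __name__ == "__main__":
    print(submit_job("./run_analysis.sh", cores=6))
```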

10 Disk System: DDN SFA10000
- DDN SFA10K x 6
  - Capacity: 1152 TB x 6 = 6.9 PB effective (arithmetic sketched below)
  - Throughput: 12 GB/s x 6
  - Used for GPFS and GHI
- GPFS file system
  - Parallel file system
  - Total throughput: > 50 GB/s
  - Optimized for massive access: many file servers, no interconnect bottleneck (RDMA-enabled), a separate metadata area, and a large block size
- Performance
  - > 500 MB/s for single-file I/O in benchmark tests
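A quick check of how the quoted disk totals follow from the per-controller figures; the decimal (powers-of-ten) unit convention is an assumption made for illustration.

```python
# Back-of-the-envelope check of the disk-system numbers quoted above.
NUM_CONTROLLERS = 6
CAPACITY_PER_UNIT_TB = 1152      # effective TB per SFA10K
THROUGHPUT_PER_UNIT_GBS = 12     # GB/s per SFA10K

total_capacity_pb = NUM_CONTROLLERS * CAPACITY_PER_UNIT_TB / 1000
total_throughput_gbs = NUM_CONTROLLERS * THROUGHPUT_PER_UNIT_GBS

print(f"Total effective capacity : {total_capacity_pb:.1f} PB")   # ~6.9 PB
print(f"Aggregate raw throughput : {total_throughput_gbs} GB/s")  # 72 GB/s
```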

11 Tape System: IBM TS3500 library, IBM TS1140 drives
- Tape library
  - Maximum capacity: 16 PB (cartridge-count estimate below)
- Tape drives
  - TS1140: 60 drives, the latest enterprise drive
  - We do not use LTO because of its lower reliability.
  - Only two vendors remain for enterprise drives: IBM and StorageTek.
- Tape media
  - JC: 4 TB, 250 MB/s
  - JB: 1.6 TB (repacked), 200 MB/s
  - The magnetic tape itself is produced by Fujifilm for both IBM and StorageTek media.
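For a sense of scale, this sketch estimates how many cartridges the 16 PB maximum corresponds to for each media type; the library's actual slot count and media mix are not specified here.

```python
# Cartridge count implied by the 16 PB maximum capacity, per media type.
MEDIA_TB = {"JC": 4.0, "JB": 1.6}   # native capacity per cartridge (TB)
TARGET_PB = 16

for media, tb in MEDIA_TB.items():
    cartridges = TARGET_PB * 1000 / tb
    print(f"{media}: about {cartridges:,.0f} cartridges for {TARGET_PB} PB")
# JC: about 4,000 cartridges; JB: about 10,000 -- one reason JC is the main media.
```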

12 HSM (Hierarchical Storage Management)
- HPSS
  - Disk (first layer) + tape (second layer)
  - Operational experience from the former KEKCC
- Improvements over the former system
  - More tape drives and faster tape-drive I/O
  - Stronger interconnect (10 GbE, IB)
  - Better staging area (capacity, access speed)
  - Integration with the GPFS file system (GHI)
- GHI (GPFS-HPSS Interface): new!
  - GPFS is used as the staging area
  - Fully coherent with ordinary GPFS access (POSIX I/O); no HPSS client API is needed (see the sketch below)
  - Replaces the current VFS interface
  - Inherits the high-performance I/O of GPFS
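To illustrate the "no HPSS client API" point, here is a minimal sketch of reading a GHI-managed file with ordinary POSIX I/O. The mount point and file path are hypothetical, and the timing is only a crude way to notice a tape recall.

```python
# Files under the GPFS/GHI namespace are accessed with plain POSIX I/O;
# GHI/HPSS stages data back from tape transparently when needed.
import shutil
import time

SRC = "/ghi/experiment/run001/raw.dat"   # hypothetical path; may live only on tape
DST = "/tmp/raw.dat"

start = time.time()
shutil.copyfile(SRC, DST)                # ordinary copy; no HPSS client API calls
elapsed = time.time() - start
print(f"copied in {elapsed:.1f} s; a long delay usually means a tape recall")
```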

13 GHI Data Flow
(Diagram: numbered write and read paths between the Linux cluster on the lab LAN, the GPFS NSD servers and GPFS disk behind a SAN switch, and the HPSS side consisting of the core server, movers, HPSS disk, and the tape library behind a second SAN switch.)

14 Internal Cloud Service
- Motivation
  - Requests for group-specific systems (experiments, groups, communities) and for testing new operating systems
  - Efficient resource management (servers on demand)
  - A PaaS-type service
- Cloud middleware
  - Platform ISF + ISF Adaptive Cluster (coherent with LSF)
  - In the future, an open solution (e.g. OpenStack)
- Provisioning tools
  - KVM as the VM solution (query sketch below); xCAT for node-by-node system reinstallation
- Virtualization technology is not yet mature enough
  - CPU virtualization is fine, but I/O virtualization is still insufficient.
  - The choice between 10 GbE and IB must take I/O virtualization technology (nPAR, SR-IOV) into account.
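As a small illustration of the KVM layer (not of the actual Platform ISF middleware), this sketch lists the guests on a provisioned node using the libvirt Python bindings, assuming a standard local KVM/QEMU setup.

```python
# List KVM guests and their state on the local hypervisor via libvirt.
import libvirt   # libvirt-python bindings

conn = libvirt.open("qemu:///system")    # connect to the local KVM/QEMU hypervisor
try:
    for dom in conn.listAllDomains():
        state, _reason = dom.state()
        label = "running" if state == libvirt.VIR_DOMAIN_RUNNING else "not running"
        print(f"{dom.name():24s} {label}")
finally:
    conn.close()
```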

15 Operation Aspects

16 Effect of the 3.11 Earthquake
- Earthquake intensity at KEK (Tsukuba)
  - Lower 6 on the Japanese scale (maximum 7)
  - VIII on the Modified Mercalli Intensity (MMI) scale
- Hardware damage was minimal
  - Some racks swayed.
  - Some HDDs failed, with minimal data loss.
  - The UPS was of little help; we are introducing an automatic shutdown mechanism that runs while the UPS is still alive, especially for the disk system.
- Electricity supply crisis
  - Accident at the Fukushima nuclear power plant
  - Almost all nuclear power plants are off-line pending stress-test reviews.
  - Potential risk of blackouts on summer daytimes
  - Mandated electricity saving: a 30% power cut was imposed.
  - Electricity rates will rise by about 15%.

17 Electricity Saving in the New System
- Energy-saving products
  - IBM iDataPlex: high density and about 40% better cooling efficiency
  - Power-supply efficiency of 80 PLUS Silver or better
  - Tape is a green device; the disk system is not.
  - No MAID, because of the failure risk and the low transfer rate for Grid access
- Electrical power visualization
  - The power consumption of all components is monitored.
  - IBM Systems Director, intelligent PDUs, power clamp meters
- Power capping
  - IBM Active Energy Manager caps server power by controlling the CPU frequency.
  - The maximum power consumption can be set between 220 W and 350 W per server (rough cluster-level numbers below).
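A rough illustration of what the 220-350 W per-server cap means at cluster scale, assuming the 340 worker nodes from the CPU slide all run at the cap; these are illustrative bounds, not measured figures.

```python
# Worst-case compute-node power draw if every server runs at its cap.
NODES = 340   # 282 + 58 worker nodes from the CPU slide

def cluster_power_kw(cap_watts_per_server: float) -> float:
    """Upper bound on worker-node power draw for a given per-server cap."""
    return NODES * cap_watts_per_server / 1000.0

for cap in (350, 300, 250, 220):
    print(f"cap {cap:3d} W/server -> at most {cluster_power_kw(cap):5.1f} kW for the worker nodes")
# Lowering the cap from 350 W to 220 W bounds the draw at ~75 kW instead of ~119 kW.
```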

18 Challenges for the Future
- Facility
  - Electricity: the new system needs 350 kW + 400 kW for air cooling; the current PUE is above 2; the next system will reach the megawatt scale.
  - Cooling: water cooling
  - Space: a new building or a data-center container
- Data management
  - Big (exascale) data management: exabytes in the near future
  - Data must be copied at every system replacement: about 5 PB now, about 20 PB next, ... (rough copy-time estimate below)
  - Strategy for tape drives and libraries (IBM / StorageTek)
  - Tape generations evolve very quickly.
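A rough estimate of the copy time at a system replacement, assuming an idealized aggregate rate of all 60 TS1140 drives streaming JC media at the quoted 250 MB/s with no overhead; real migrations are much slower.

```python
# Idealized lower bound on the time needed to copy data between systems.
def copy_days(data_pb: float, drives: int = 60, mb_per_s: float = 250.0) -> float:
    """Days needed to move `data_pb` petabytes at the ideal aggregate rate."""
    aggregate_gb_s = drives * mb_per_s / 1000.0          # GB/s
    seconds = data_pb * 1e6 / aggregate_gb_s             # 1 PB = 1e6 GB
    return seconds / 86400.0

for pb in (5, 20):
    print(f"{pb:2d} PB -> about {copy_days(pb):.1f} days at the ideal rate")
# ~3.9 days for 5 PB and ~15.4 days for 20 PB, before any real-world overhead.
```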

19 Summary
- Computing facility at KEK
  - Supercomputer system
  - Central computer system (KEKCC): a Linux cluster
- New KEKCC
  - A merged system (former KEKCC + B-Factory computer system)
  - In service from Apr/2012
  - 4000 CPU cores, Linux cluster (Scientific Linux 5.6), Grid environment (gLite)
- Storage system
  - 7 PB DDN disk storage with the GPFS file system
  - Tape library with 16 PB capacity
  - HPSS with GHI (GPFS-HPSS Interface) as the HSM
  - High-speed access and high scalability for big data
- Challenges for the future
  - How to design the next system

