National Research Grid Initiative (NAREGI)
Satoshi Matsuoka
Sub-Project Leader, NAREGI Project; Visiting Professor, National Institute of Informatics; Professor, GSIC, Tokyo Inst. Technology
Inter-university Computer Centers (excl. National Labs), circa 2002:
- Hokkaido University: HITACHI SR8000, HP Exemplar V2500, HITACHI MP5800/160, Sun Ultra Enterprise 4000
- Tohoku University: NEC SX-4/128H4 (soon SX-7), NEC TX7/AzusA
- University of Tokyo: HITACHI SR8000, HITACHI SR8000/MPP, others (in institutes)
- University of Tsukuba: FUJITSU VPP5000, CP-PACS 2048 (SR8000 proto)
- Tokyo Inst. Technology (Titech): NEC SX-5/16, Origin2K/256, HP GS320/64
- Nagoya University: FUJITSU VPP5000/64, FUJITSU GP7000F model 900/64, FUJITSU GP7000F model 600/12
- Kyoto University: FUJITSU VPP800, FUJITSU GP7000F model 900/32, FUJITSU GS8000
- Osaka University: NEC SX-5/128M8, HP Exemplar V2500/N
- Kyushu University: FUJITSU VPP5000/64, HP GS320/32, FUJITSU GP7000F 900/64
Q: How can the Grid become a ubiquitous national research computing infrastructure? Simply extend the Campus Grid?
- ~100,000 users/machines, networking over 1000s of km, PetaFlops/Petabytes... problems!
- Grid software stack deficiencies:
  - Large-scale resource management
  - Large-scale Grid programming
  - User support tools: PSE, visualization, portals
  - Packaging, distribution, troubleshooting
  - High-performance networking vs. firewalls
  - Large-scale security management
  - "Grid-enabling" applications
  - Manufacturer experience and support
National Research Grid Initiative (NAREGI) Project: Overview
- A new Japanese MEXT national Grid R&D project: ~US$17M in FY'03 (similar funding through FY'07), plus US$45M
- One of two major Japanese government Grid projects (c.f. the "BusinessGrid" project)
- Collaboration of national labs, universities, and major computing and nanotechnology companies
- Acquisition of computer resources underway (FY2003)
- MEXT: Ministry of Education, Culture, Sports, Science and Technology
National Research Grid Infrastructure (NAREGI): Petascale Grid Infrastructure R&D for Future Deployment
- US$45M + US$16M x 5 = US$125M total
- Hosted by the National Institute of Informatics (NII) and the Institute for Molecular Science (IMS)
- PL: Ken Miura (Fujitsu/NII); SLs: Sekiguchi (AIST), Matsuoka (Titech), Shimojo (Osaka-U), Hirata (IMS), ...
- Participation by multiple (>= 3) vendors
- Resource contributions by university centers as well
(Diagram: National Research Grid Middleware R&D at NII with Fujitsu, NEC, Hitachi, Titech, Osaka-U, U-Kyushu, U-Tokyo, and AIST; a 15 TF-100 TF Grid R&D infrastructure over SuperSINET with Grid and network management; focused "Grand Challenge" Grid application areas, led by the "NanoGrid" at IMS (~10 TF), plus biotech Grid apps (BioGrid, RIKEN) and other apps at various partners.)
National Research Grid Initiative (NAREGI) Project: Goals
(1) R&D in Grid middleware: a Grid software stack for "petascale" nation-wide "Research Grid" deployment
(2) Testbed validating a 100+ TFlops (2007) Grid computing environment for nanoscience apps on the Grid
  - Initially a dedicated testbed of ~17 Teraflops and ~3000 CPUs
  - SuperSINET (>10 Gbps research AON backbone)
(3) International collaboration with similar projects (U.S., Europe, Asia-Pacific incl. Australia)
(4) Standardization activities, esp. within the GGF
NAREGI Research Organization and Collaboration
(Organization diagram: MEXT oversees the Grid R&D Program Management Committee, advised by a Grid R&D Advisory Board. The Center for Grid Research & Development at the National Institute of Informatics, led by Project Leader K. Miura (NII), hosts group leaders for Grid middleware and upper-layer R&D, Grid networking R&D, and operations; joint R&D with AIST (GTRC), national supercomputing centers, universities, and research labs (Titech, Osaka-U, Kyushu-U, etc.), including coordination in network research, refinement of network technology, and utilization of SuperSINET. The Computational Nano-science Center at the Institute for Molecular Science, directed by Dr. Hirata (IMS), leads R&D of grand-challenge Grid applications with ISSP, Tohoku-U, AIST, industrial partners, etc., providing technical requirements and utilizing the computing resources. Further collaboration with the ITBL Project (JAERI) and the Consortium for Promotion of Grid Applications in Industry. Testbed resources (acquisition in FY2003): NII ~5 Tflop/s, IMS ~11 Tflop/s.)
Participating Organizations
- National Institute of Informatics (NII) (Center for Grid Research & Development)
- Institute for Molecular Science (IMS) (Computational Nano-science Center)
- Universities and national labs (joint R&D): AIST Grid Tech. Center, Titech GSIC, Osaka-U Cybermedia, Kyushu-U, Kyushu Inst. Tech., etc.
- Project collaborations: ITBL Project, SC center Grid deployment projects, etc.
- Participating vendors (IT and nanotech)
- Consortium for Promotion of Grid Applications in Industry
NAREGI R&D Assumptions & Goals
- Future Research Grid metrics:
  - 10s of institutions/centers, various project VOs
  - >100,000 users, >100,000 CPUs/machines; machines very heterogeneous: SCs, clusters, desktops
  - 24/7 usage, production deployment
  - Server Grid, Data Grid, metacomputing...
- Do not reinvent the wheel:
  - Build on, collaborate with, and contribute to the "Globus, Unicore, Condor" trilogy
  - Scalability and dependability are the key
- Win the support of users:
  - Application and experimental deployment essential
  - However, do not let the apps get a "free ride"
  - R&D for production-quality (free) software
NAREGI Work Packages
- WP-1: National-Scale Grid Resource Management: Matsuoka (Titech), Kohno (ECU), Aida (Titech)
- WP-2: Grid Programming: Sekiguchi (AIST), Ishikawa (AIST)
- WP-3: User-Level Grid Tools & PSE: Miura (NII), Sato (Tsukuba-U), Kawata (Utsunomiya-U)
- WP-4: Packaging and Configuration Management: Miura (NII)
- WP-5: Networking, National-Scale Security & User Management: Shimojo (Osaka-U), Oie (Kyushu Tech.)
- WP-6: Grid-Enabling Nanoscience Applications: Aoyagi (Kyushu-U)
NAREGI Software Stack (for a 100 Tflops-class science Grid environment)
- WP6: Grid-Enabled Apps
- WP3: Grid PSE, Grid Workflow, Grid Visualization
- WP1: SuperScheduler, Grid Monitoring & Accounting, Grid VM
- WP2: Grid Programming (Grid RPC, Grid MPI)
- WP5: Grid PKI, High-Performance Grid Networking
- WP4: Packaging
- (Underlying: Globus, Condor, UNICORE, OGSA)
WP-1: National-Scale Grid Resource Management
- Build on Unicore, Condor, and Globus, bridging their gaps (Condor-U and Unicore-C); OGSA in the future
- SuperScheduler
- Monitoring & Auditing/Accounting
- Grid Virtual Machine
- PKI and Grid account management (WP5)
(Diagram: interoperability bridges among the three systems: EU GRIP between Unicore and Globus, Condor-G / the Globus universe between Condor and Globus, plus the planned Unicore-C and Condor-U bridges.)
WP1: SuperScheduler (Fujitsu)
- Hierarchical super-scheduling structure, scalable to 100,000s of users, nodes, and jobs among 20+ sites
- Fault tolerance
- Workflow engine
- NAREGI resource schema (joint w/ Hitachi)
- Resource brokering w/ resource policy and advance reservation (NAREGI Broker); a toy sketch of this step follows below
- Initially prototyped on the Unicore AJO/NJS/TSI stack (OGSA in the future)
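Purely as an illustration of the brokering step above (this is not NAREGI code), resource selection can be thought of as matching a job's requirements against per-site resource descriptions and policies and preferring the earliest advance-reservation slot; every type and field name in the C sketch below is hypothetical.

/* Toy sketch of the decision a SuperScheduler-style resource broker makes:
 * filter sites by requested CPUs/architecture and reservation policy, then
 * prefer the earliest advance-reservation start time.
 * Purely illustrative; all types and fields are hypothetical, not NAREGI
 * interfaces. */
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *arch;              /* e.g. "ia64", "x86" */
    int  free_cpus;
    long earliest_start;           /* earliest reservation start (epoch secs) */
    int  policy_allows_reservation;
} site_t;

typedef struct {
    const char *arch;
    int  cpus;
} job_req_t;

/* Return the index of the best matching site, or -1 if none qualifies. */
static int broker_select(const site_t *sites, int nsites, const job_req_t *req)
{
    int best = -1;
    for (int i = 0; i < nsites; i++) {
        if (strcmp(sites[i].arch, req->arch) != 0) continue;
        if (sites[i].free_cpus < req->cpus) continue;
        if (!sites[i].policy_allows_reservation) continue;
        if (best < 0 || sites[i].earliest_start < sites[best].earliest_start)
            best = i;
    }
    return best;
}

int main(void)
{
    site_t sites[] = {
        { "siteA", "ia64", 256, 1000, 1 },
        { "siteB", "ia64", 512,  500, 1 },
        { "siteC", "x86",  128,  100, 1 },
    };
    job_req_t req = { "ia64", 300 };
    int k = broker_select(sites, 3, &req);
    printf("selected: %s\n", k >= 0 ? sites[k].name : "(none)");
    return 0;
}

In the real system this matching is driven by the CIM-based resource schema and policy repository described on the next slide, not by hard-coded structures.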
WP1: SuperScheduler (Fujitsu) (Cont'd)
(Architecture diagram: Workflow descriptions from the WP3 PSE are converted to UNICORE DAGs and enter via the Gateway using UPL (the Unicore Protocol Layer) over SSL, authenticated against the NAREGI PKI (WP5, NEC) and the UUDB. The Broker NJS (Network Job Supervisor) invokes NAREGI BROKER-S (Fujitsu) through the resource-broker interface for resource discovery, selection, and reservation with analysis and prediction (CheckQoS & SubmitJob), mapping resource requirements in RSL (or JSDL) onto a CIM-based policy repository; the policy engine uses the "Ponder" policy description language as a management application. Execution NJSs drive Target System Interfaces (TSIs) over FNTP (the Fujitsu European Laboratories NJS-to-TSI protocol); NAREGI BROKER-L (Fujitsu) handles local scheduling toward Condor (DRMAA?) and Globus (via GRIP). Planned monitoring (Hitachi) is built around a CIMOM (TOG OpenPegasus, derived from the SNIA CIMOM) with CIM providers for batch queues (NQS), Condor ClassAds, and Globus MDS/GARA; events such as queue changes become CIM indications fed to GMA sensors, with CIM carried as XML over HTTP or via CIM-to-LDAP, and an OGSI portType under consideration. Comparable commercial CIM implementations: MS WMI (Windows Management Instrumentation), IBM Tivoli, Sun WBEM Services, etc. C.f. the EuroGrid broker (Manchester University) and Imperial College London; used in the CGS-WG demo at GGF7. (U) = UNICORE (Uniform Interface to Computing Resources); (G) = GRIP (Grid Interoperability Project).)
WP1: Grid Virtual Machine (NEC & Titech)
- "Portable" and thin VM layer for the Grid
- Various VM functions: access control, access transparency, FT support, resource control, etc.
- Also provides co-scheduling across clusters
- Respects Grid standards, e.g., GSI, OGSA (future)
- Various prototypes on Linux
(Diagram: GridVM functions: access control & virtualization, secure resource access control, checkpoint support, job migration, resource usage rate control, co-scheduling & co-allocation, job control, node virtualization & access transparency, resource control, FT support.)
WP1: Grid Monitoring & Auditing/Accounting (Hitachi & Titech)
- Scalable Grid monitoring, accounting, and logging
- Define a CIM-based unified resource schema
- Distinguish end users vs. administrators
- Prototype based on the GT3 Index Service, a CIMOM, etc.
- Self-configuring monitoring (Titech)
WP-2: Grid Programming
- Grid Remote Procedure Call (RPC): Ninf-G2
- Grid Message Passing Programming: GridMPI
WP-2: Grid Programming - GridRPC/Ninf-G2 (AIST/GTRC)
- Programming model using RPC on the Grid
- High-level, tailored for scientific computing (c.f. SOAP-RPC)
- GridRPC API standardization by the GGF GridRPC WG
- Ninf-G Version 2: a reference implementation of the GridRPC API
  - Implemented on top of the Globus Toolkit 2.0 (3.0 experimental)
  - Provides C and Java APIs
- Demo available at the AIST/Titech booth
(Diagram: client/server call sequence: the client (1) requests and (2) receives interface information (generated from the IDL file and published as an LDIF file via MDS), then (3) invokes the remote executable through GRAM, which is forked on the server side and (4) connects back to the client; the remote executable is generated from the numerical library by the IDL compiler.)
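To make the programming model concrete, here is a minimal sketch of a GridRPC client in C following the GGF GridRPC API function names that Ninf-G implements (grpc_initialize, grpc_function_handle_init, grpc_call, ...). The configuration file name, the server host, and the remote routine "mylib/mmul" are hypothetical placeholders; a real Ninf-G deployment would first register the routine on the server via its IDL compiler.

/* Minimal GridRPC client sketch (GGF GridRPC C API as implemented by Ninf-G).
 * "client.conf", "grid-server.example.org", and "mylib/mmul" are hypothetical
 * placeholders for illustration only. */
#include <stdio.h>
#include <stdlib.h>
#include "grpc.h"

#define N 100

int main(int argc, char *argv[])
{
    grpc_function_handle_t handle;
    double *A = malloc(N * N * sizeof(double));
    double *B = malloc(N * N * sizeof(double));
    double *C = malloc(N * N * sizeof(double));

    /* ... fill A and B with input data ... */

    /* Read the client configuration (server list, protocol settings). */
    if (grpc_initialize("client.conf") != GRPC_NO_ERROR) {
        fprintf(stderr, "grpc_initialize failed\n");
        return 1;
    }

    /* Bind a handle to a remote executable on a given server. */
    grpc_function_handle_init(&handle, "grid-server.example.org", "mylib/mmul");

    /* RPC with the same calling convention as a local library call;
     * argument marshalling follows the IDL registered on the server. */
    if (grpc_call(&handle, A, B, C, N) != GRPC_NO_ERROR)
        fprintf(stderr, "grpc_call failed\n");

    grpc_function_handle_destruct(&handle);
    grpc_finalize();

    free(A); free(B); free(C);
    return 0;
}

The appeal for scientific users is that the client looks like an ordinary library call, while discovery, staging, and invocation on the remote resource are handled by the GridRPC runtime.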
WP-2: Grid Programming - GridMPI (AIST and U-Tokyo)
- Provides users an environment to run MPI applications efficiently on the Grid
- Flexible and heterogeneous process invocation on each compute node
- Grid ADI and latency-aware communication topology, optimizing communication over non-uniform latencies and hiding the differences between various lower-level communication libraries
- Extremely efficient implementation based on MPI on SCore (not MPICH-PM)
(Diagram: layering: the MPI core over the Grid ADI and latency-aware communication topology; point-to-point communication over PMv2, vendor MPI, TCP/IP (IMPI), and other communication libraries; process invocation via RSH, SSH, GRAM, and RIM.)
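Because GridMPI's goal is to run existing MPI codes across clusters unmodified, the point can be illustrated with a plain MPI-1 program in C; nothing below is GridMPI-specific, the library only changes how the processes are launched and how wide-area traffic is carried underneath.

/* Plain MPI program; under GridMPI the same source runs unchanged with the
 * ranks spread across clusters, the library handling the wide-area links
 * (IMPI/TCP) and latency-aware collective topologies underneath. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;
    double local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank contributes a partial result; the reduction is exactly the
     * kind of collective where a latency-aware topology matters when the
     * ranks span two sites. */
    local = (double)rank;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks = %f\n", size, global);

    MPI_Finalize();
    return 0;
}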
WP-3: User-Level Grid Tools & PSE
- Grid workflow: workflow language definition; GUI (task-flow representation)
- Visualization tools: real-time volume visualization on the Grid
- PSE/portals: multiphysics/coupled simulation, application pool, collaboration with the nanotech applications group
(Diagram: the PSE toolkit, PSE portal, and PSE application pool interacting with the SuperScheduler, application server, information service, and workflow within the problem solving environment.)
WP-4: Packaging and Configuration Management
- Collaboration with WP1 on configuration management
- Issues:
  - Selection of packagers to use (RPM, GPTK?)
  - Interface with autonomous configuration management (WP1)
  - Test procedure and harness
  - Testing infrastructure (c.f. NSF NMI packaging and testing)
WP-5: Grid High-Performance Networking
- Traffic measurement on SuperSINET
- Optimal routing algorithms for Grids
- Robust TCP/IP control for Grids
- Grid CA / user Grid account management and deployment
- Collaboration with WP-1
WP-6: Adaptation of Nano-science Applications to the Grid Environment
- Analysis of typical nanoscience applications: parallel structure, granularity, resource requirements, latency tolerance
- Development of a coupled-simulation data exchange format and framework
- Collaboration with IMS
WP6 and Grid Nano-Science and Technology Applications: Overview
- Participating organizations: Institute for Molecular Science, Institute for Solid State Physics, AIST, Tohoku University, Kyoto University, industry (materials, nano-scale devices), Consortium for Promotion of Grid Applications in Industry
- Research topics and groups: electronic structure; magnetic properties; functional nano-molecules (CNT, fullerene, etc.); bio-molecules and molecular electronics; simulation software integration platform; etc.
Example: WP6 and IMS Grid-Enabled Nanotechnology
- IMS RISM-FMO coupled simulation on the Grid:
  - RISM: Reference Interaction Site Model
  - FMO: Fragment Molecular Orbital method
- WP6 will develop the application-level middleware, including the "Mediator" component (see the coupling sketch below)
(Diagram: RISM running on an SMP supercomputer and FMO on a cluster, coupled over the Grid with GridMPI etc.; the Mediator exchanges the solvent distribution, solute structure, and in-sphere correlation between the two codes.)
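As a hedged sketch of the coupling pattern only: two solver groups inside one MPI job exchange boundary data each iteration through a mediating exchange step, with the solvent distribution flowing one way and the solute structure the other. The array sizes, tags, and group split below are invented for illustration and are not the WP6 Mediator API.

/* Schematic of a RISM/FMO-style coupled run inside one MPI world: ranks are
 * split into two groups and the group leaders exchange coupling data each
 * iteration.  Run with at least 2 ranks.  Sizes, tags, and the split are
 * illustrative only; this is not the WP6 Mediator interface. */
#include <stdio.h>
#include <mpi.h>

#define NDATA 1024
#define NITER 10

int main(int argc, char *argv[])
{
    int world_rank, world_size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* color 0 = "RISM" half, color 1 = "FMO" half of the ranks. */
    int color = (world_rank < world_size / 2) ? 0 : 1;
    MPI_Comm solver;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &solver);

    int srank;
    MPI_Comm_rank(solver, &srank);

    /* World ranks of the two group leaders. */
    int rism_leader = 0, fmo_leader = world_size / 2;

    double solvent[NDATA] = {0}, solute[NDATA] = {0};

    for (int it = 0; it < NITER; it++) {
        /* ... each group runs its own solver step on `solver` ... */

        /* Leaders exchange coupling data (the "Mediator" step). */
        if (color == 0 && srank == 0) {
            MPI_Sendrecv(solvent, NDATA, MPI_DOUBLE, fmo_leader, 100,
                         solute,  NDATA, MPI_DOUBLE, fmo_leader, 200,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (color == 1 && srank == 0) {
            MPI_Sendrecv(solute,  NDATA, MPI_DOUBLE, rism_leader, 200,
                         solvent, NDATA, MPI_DOUBLE, rism_leader, 100,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        /* Each leader broadcasts the data it received within its own group. */
        MPI_Bcast(color == 0 ? solute : solvent, NDATA, MPI_DOUBLE, 0, solver);
    }

    if (world_rank == 0) printf("coupled iterations done\n");

    MPI_Comm_free(&solver);
    MPI_Finalize();
    return 0;
}

The real Mediator additionally handles data format conversion between the codes and runs over GridMPI across sites; the MPI-level exchange above only shows where such a component sits in the control flow.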
SuperSINET: AON Production Research Network (separate funding)
- 10 Gbps general backbone
- GbE bridges for peer connection
- Very low latency: Titech-Tsukuba 3-4 ms round trip
- Operation of Photonic Cross Connect (PXC) for fiber/wavelength switching
- 6,000+ km of dark fiber, 100+ end-to-end lambdas, and 300+ Gb/s
- Operational from January 2002 until March 2005
(Map: connected sites include KEK, U. of Tokyo, Tokyo Institute of Tech., Tsukuba U., Tohoku U., Hokkaido U., Nagoya U., Kyoto U., Osaka U., Kyushu U., NIFS, NIG, ISAS, NAO, the Okazaki Research Institutes, Waseda U., Doshisha U., and NII (operation and R&D); application areas carried include middleware for computational Grid, DataGrid for high-energy science, nanotechnology Grid applications, bio-informatics, and OC-48+ transmission for radio telescopes.)
SuperSINET: Network Topology (10 Gbps photonic backbone network), as of October 2002 (source: National Institute of Informatics)
(Map: Tokyo, Nagoya, and Osaka hubs interconnecting NII Hitotsubashi, NII Chiba, U Tokyo, Titech, Waseda U, KEK, Tsukuba U, Tohoku U, Hokkaido U, NAO, ISAS, NIG, Nagoya U, NIFS, IMS (Okazaki), Kyoto U, Kyoto U Uji, Osaka U, Doshisha U, and Kyushu U; NAREGI Grid R&D sites highlighted.)
The NAREGI Phase 1 Testbed ($45M, 1Q2004): ~3000 processors, ~17 TFlops
- Center for Grid R&D at NII (Tokyo): ~5 TFlops software testbed, plus small test application clusters (x 6)
- Computational Nano-science Center at IMS (Okazaki): ~11 TFlops application testbed
- Connected via SuperSINET (10 Gbps MPLS), ~400 km apart
- Related resources: Titech Campus Grid (~1.8 TFlops), AIST SuperCluster (~11 TFlops), Osaka-U BioGrid, U-Tokyo; total ~6500 processors, ~30 TFlops
- Note: NOT a production Grid system (c.f. TeraGrid)
NAREGI Software R&D Grid Testbed (Phase 1)
- Under procurement; installation March 2004
- 3 SMPs, 128 procs total (SparcV + IA64 + Power4)
- 6 x 128-proc PC clusters: 2.8 GHz dual Xeon + GbE (blades); 3.06 GHz dual Xeon + InfiniBand
- 10 + 37 TB file servers
- Multi-gigabit networking and WAN simulation to simulate a Grid environment
- >5 Teraflops; NOT a production system (c.f. TeraGrid)
- To form a Grid with the IMS NAREGI application testbed infrastructure (>10 Teraflops, March 2004) and other national centers via SuperSINET
NAREGI R&D Grid NII
AIST (National Institute of Advanced Industrial Science & Technology) SuperCluster
- Challenge: huge computing power to support various research within AIST, including life science and nanotechnology
- Solution: Linux cluster of IBM eServer 325 nodes
  - P32: 2116-CPU AMD Opteron; M64: 520-CPU Intel Madison
  - Myrinet networking
  - SCore cluster OS
  - Globus Toolkit 3.0 to allow shared resources
- World's most powerful Linux-based supercomputer: more than 11 TFLOPS, ranked as the third most powerful supercomputer in the world
- Operational March 2004
(Diagram: the Grid Technology / Advanced Computing Center collaborating over LAN and Internet with government, academia, corporations, and other research institutes in life science and nanotechnology.)
NII Center for Grid R&D (Jinbo-cho, Tokyo)
- Mitsui Office Bldg., 14th floor: 700 m2 of office space (100 m2 machine room)
- (Map: located near the Imperial Palace, Tokyo Station, and Akihabara)
Towards a Petascale Grid: a Proposal
- Resource diversity (松竹梅, "Shou-Chiku-Bai": pine, bamboo, plum):
  - 松 ("shou", pine): ES-like centers, TeraFlops x (a few), TeraFlops total
  - 竹 ("chiku", bamboo): medium-sized machines at SCs, 5-10 TeraFlops x 5, TeraFlops aggregate / center, TeraFlops total
  - 梅 ("bai", plum): small clusters and PCs spread out throughout campus in a campus Grid, x 5k-10k, TeraFlops / center, PetaFlops total
- Division of labor between "big" centers like the ES and university centers: large, medium, and small resources
- Utilize the Grid software stack developed by NAREGI and other Grid projects
(Diagram: university SCs alongside ES-class centers.)
Collaboration Ideas
- Data (Grid): NAREGI deliberately does not handle data
- Unicore components: "Unicondore" (Condor-U, Unicore-C)
- NAREGI middleware: GridRPC, GridMPI, networking, resource management (e.g., the CIM resource schema)
- International testbed
- Other ideas? Application areas as well