Nov. 2002, NERSC/LBNL

Climate Modeling: Coupling Component Models by MPH for Distributed Multi-Component Environment

Chris Ding and Yun (Helen) He
NERSC Division, Lawrence Berkeley National Laboratory

Motivation
- Application problems grow in scale & complexity
- Effective organization of the simulation software system is a major issue
- Software lasts much longer than a computer!

Multi-Component Approach
- Build from (semi-)independent programs
- Coupled Climate System = Atmosphere + Ocean + Sea-Ice + Land-Surface + Flux-Coupler
- Components developed by different groups at different institutions
  - Maximum flexibility and independence
  - Algorithm and implementation depend on individual groups, practicality, time-to-completion, etc.
- Components communicate through well-defined interface data structures
- Software industry trend (CORBA, DCE, DCOM)
- Common Component Architecture (DOE project)
- ESMF

Distributed Components on HPC Systems
- Use MPI for high performance
- MPH: establish a multi-component environment
  - MPI communicator for each component
  - Component name registration
  - Resource allocation for each component
  - Support for different job execution modes
  - Standard-out / standard-in redirect
  - Complete flexibility
- Similar to PVM, but much smaller, simpler, and more scalable, and supports more execution / integration modes

A climate simulation system consists of many independently developed components on a distributed-memory multi-processor computer
- Single-component executable: each component is a stand-alone executable
- Multi-component executable: several components compiled into one executable

Multi-component system execution modes:
- Single-Component executable Multi-Executable system (SCME)
- Multi-Component executable Single-Executable system (MCSE)
- Multi-Component executable Multi-Executable system (MCME)

Component Integration / Job Execution Modes

Single-Component exec. Multi-Executable system (SCME):
- Each component is an independent executable image
- Components run on separate subsets of SMP nodes
- Max flexibility in language, data structures, etc.
- Industry standard approach

Multi-Component exec. Single-Executable system (MCSE):
- Each component is a module
- All components compiled into a single executable
- Many issues: name conflicts, static allocations, etc.
- Stand-alone component
- Easy to understand and coordinate

Component Integration / Job Execution Modes

Multi-Component exec. Multi-Executable system (MCME):
- Several components compiled into one executable
- Multiple executables form a single system
- Different executables run on different processors
- Different components within the same executable can run on separate subsets of processors
- Maximum flexibility
- Includes SCME and MCSE as special cases
- Easy to adopt for concurrent ensemble simulations

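The relation between the three modes can be sketched in a few lines of Python. This is only an illustration of the definitions on these slides, not part of MPH: the function name `execution_mode` and the list-of-lists layout (one inner list of component names per executable) are assumptions made for the example.

```python
def execution_mode(executables):
    """Classify a system layout given as a list of executables,
    each executable being a list of component names."""
    if not executables or any(not exe for exe in executables):
        raise ValueError("every executable needs at least one component")
    multi_executable = len(executables) > 1
    multi_component = any(len(exe) > 1 for exe in executables)
    if multi_executable and multi_component:
        return "MCME"
    if multi_executable:
        return "SCME"   # several executables, one component each
    if multi_component:
        return "MCSE"   # one executable holding several components
    return "stand-alone"  # one executable, one component


scme = [["atmosphere"], ["ocean"], ["coupler"]]
mcse = [["atmosphere", "ocean", "coupler"]]
mcme = [["coupler"], ["ocean", "ice"], ["atmosphere", "land", "chemistry"]]
```

Under this reading, SCME and MCSE are the two degenerate layouts of MCME: all executables hold one component, or there is only one executable.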
Multi-Component Single-Executable (MCSE)

master.F (PCM):

    call MPH_setup_MCSE (
         'atmosphere',     ! 'atmosphere' registered
         'ocean',          ! 'ocean' registered
         'coupler',        ! 'coupler' registered
         )                 ! Add more components

    if (PE_in_component ('ocean', comm)) call ocean_v1 (comm)
    if (PE_in_component ('atmosphere', comm)) call atmosphere (comm)
    if (PE_in_component ('coupler', comm)) call coupler_v2 (comm)

PROCESSOR_MAP:

    atmosphere 0 7
    ocean 8 13
    coupler 14 15

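The processor map on this slide can be read as a table from component name to a range of processor ranks. The following is a minimal Python sketch of that lookup, not MPH code. The helper names `parse_processor_map`, `component_of` and `global_id` are hypothetical, and the sketch assumes that the two integers on each line are the first and last global rank of the component, inclusive.

```python
def parse_processor_map(text):
    """Parse lines of the form '<name> <first_rank> <last_rank>' into a dict."""
    proc_map = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, first, last = fields[0], int(fields[1]), int(fields[2])
        proc_map[name] = range(first, last + 1)  # inclusive range of ranks
    return proc_map


def component_of(proc_map, rank):
    """Return the name of the component that owns a global rank, or None."""
    for name, ranks in proc_map.items():
        if rank in ranks:
            return name
    return None


def global_id(proc_map, name, local_rank):
    """Map a rank local to a component onto its global rank
    (assumes each component occupies a contiguous block)."""
    ranks = proc_map[name]
    if not 0 <= local_rank < len(ranks):
        raise ValueError(f"{name} has no local rank {local_rank}")
    return ranks[local_rank]


PROCESSOR_MAP = """
atmosphere 0 7
ocean 8 13
coupler 14 15
"""

proc_map = parse_processor_map(PROCESSOR_MAP)
```

With this map, a processor with global rank 9 would take the `ocean` branch of the `if` chain in master.F, and local rank 3 of `ocean` would correspond to global rank 11 under the stated assumption.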
Single-Component Multi-Executable (SCME)

Coupled System = Atmosphere + Ocean + Flux-Coupler

    atm.F:      call MPH_setup ("atmosphere", atmosphere_World)
    ocean.F:    call MPH_setup ("ocean", ocean_World)
    coupler.F:  call MPH_setup ("coupler", coupler_World)

Component Registration File:

    atmosphere
    ocean
    coupler

MPH3: Multi-Component Multi-Executable (MCME)

    mpi_exe_world = MPH_components (name1='ocean', name2='ice', …)

Component Registration File:

    BEGIN
    coupler
    Multi_Comp_Start 2
    ocean 0 3
    ice 4 10
    Multi_Comp_End
    Multi_Comp_Start 3
    atmosphere 0 10
    land
    chemistry
    Multi_Comp_End
    END

Launch parallel job on IBM:

    task_geometry = {(5,2)(1,3)(4,6,0)}

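To make the structure of the registration file concrete, here is a hedged Python sketch that groups its entries by executable. It is not the MPH parser. The function name `parse_registration` is hypothetical, and the sketch assumes that the number after `Multi_Comp_Start` is the count of components in that executable, that a bare name outside a block is a single-component executable, and that the two optional integers are a processor range local to the executable. Entries shown without a range on the slide are kept without one.

```python
def parse_registration(text):
    """Group component entries by executable.

    Returns a list of executables; each executable is a list of
    (name, first, last) tuples, with first/last set to None when the
    entry carries no processor range.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.split()]
    if not lines or lines[0] != ["BEGIN"] or lines[-1] != ["END"]:
        raise ValueError("registration file must be wrapped in BEGIN ... END")

    executables = []
    current = None    # components of the multi-component block being read
    expected = None   # count announced on the Multi_Comp_Start line
    for fields in lines[1:-1]:
        if fields[0] == "Multi_Comp_Start":
            current, expected = [], int(fields[1])
        elif fields[0] == "Multi_Comp_End":
            if current is None or len(current) != expected:
                raise ValueError("malformed multi-component block")
            executables.append(current)
            current = None
        else:
            if len(fields) == 3:
                entry = (fields[0], int(fields[1]), int(fields[2]))
            else:
                entry = (fields[0], None, None)
            if current is None:
                executables.append([entry])  # single-component executable
            else:
                current.append(entry)
    if current is not None:
        raise ValueError("Multi_Comp_Start without Multi_Comp_End")
    return executables


REGISTRATION = """
BEGIN
coupler
Multi_Comp_Start 2
ocean 0 3
ice 4 10
Multi_Comp_End
Multi_Comp_Start 3
atmosphere 0 10
land
chemistry
Multi_Comp_End
END
"""

executables = parse_registration(REGISTRATION)
```

Read this way, the file describes three executables: one holding only `coupler`, one holding `ocean` and `ice`, and one holding `atmosphere`, `land` and `chemistry`.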
Joining two components
- MPH_comm_join ("atmosphere", "ocean", comm_new)
  - comm_new contains all nodes in "atmosphere" and "ocean"
  - "atmosphere" nodes rank 0~7, "ocean" nodes rank 8~11
- MPH_comm_join ("ocean", "atmosphere", comm_new)
  - "ocean" nodes rank 0~3, "atmosphere" nodes rank 4~11
- Afterwards, data remapping with "comm_new"

Direct communication between two components
- MPI_send (…, MPH_global_id ("ocean", 3), …) for query / probe tasks

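The effect of argument order on rank numbering in the joined communicator can be checked with a short Python sketch. It models only the numbering described on this slide and makes no MPI or MPH calls. The function name `join_ranks` is hypothetical, and the component sizes (8 atmosphere processors, 4 ocean processors) are taken from the rank ranges quoted above.

```python
def join_ranks(sizes, first, second):
    """Return {component: range of ranks in the joined communicator}.

    The component named first gets the low ranks, the second one
    follows immediately after it.
    """
    n_first, n_second = sizes[first], sizes[second]
    return {
        first: range(0, n_first),
        second: range(n_first, n_first + n_second),
    }


sizes = {"atmosphere": 8, "ocean": 4}
atm_first = join_ranks(sizes, "atmosphere", "ocean")
ocean_first = join_ranks(sizes, "ocean", "atmosphere")
```

In `atm_first` the ocean processors occupy ranks 8 to 11, while in `ocean_first` they occupy ranks 0 to 3 and the atmosphere processors move to ranks 4 to 11, matching the two cases on the slide.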
MPH Inquiry Functions
- MPH_global_id()
- MPH_local_id()
- MPH_component_name()
- MPH_total_components()

Status and Users
- Completed MPH1, MPH2, MPH3
- Software available free online
- Complete user manual
- MPH runs on
  - IBM SP
  - SGI Origin
  - HP Compaq clusters
  - PC Linux clusters
- MPH users
  - NCAR CCSM
  - CSU Icosahedral Grid Coupled Climate Model
- People who have expressed clear interest in using MPH
  - SGI/NASA: Irene Carpenter / Jim Laft, on SGI for coupled models
  - UK ECMWF, for ensemble simulations
  - Johannes Diemer, Germany, for coupled model on HP clusters

Summary
- Multi-Component Approach for large & complex application software
- MPH glues together distributed components
  - Name registration, resource allocation, standard-out
  - Single-Component Multi-Executable (SCME)
  - Multi-Component Single-Executable (MCSE)
  - Multi-Component Multi-Executable (MCME)