Parallelization work to be completed: 1. Finalize the domain decomposition scheme for parallelization. 2. timcom preprocessor with f95 and dynamically allocated memory (inmets, indata, bounds). 3. timcom main code.

Presentation transcript:

Parallelization work to be completed
1. Finalize the domain decomposition scheme for parallelization.
2. timcom preprocessor with f95 and dynamically allocated memory (inmets, indata, bounds).
3. timcom main code with f95 and dynamically allocated memory.
4. EVP solver with f95 and dynamically allocated memory.
5. Subroutines a2o/o2a with cross-CPU-core data exchange, or
6. Rewrite timcom so that each CPU core can process the northern and southern hemispheres at the same time.
7. NetCDF input and output.

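Domain decomposition options
1) Adopt the timcom layout. Subroutines such as a2o/o2a must be modified so that timcom and echam data can be exchanged across nodes. Advantage: the ghost-zone traffic in the y direction is half that of option 2). Disadvantage: an additional (llon*llat-2*ng*llat) of data must be exchanged across cores.
2) Adopt the echam layout. In this case every CPU core must compute the ocean domain of both the northern and southern hemispheres at the same time, which requires modifying part of the timcom code. Advantage: with the same number of CPUs it will be faster than 1), because cross-core data exchange is needed in fewer cases (the llon>2 condition).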
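To get a feeling for the cost quoted for option 1), the expression llon*llat-2*ng*llat can be evaluated for a sample subdomain. The sketch below is only an illustration: the function name and the sample sizes are hypothetical, and llon, llat and ng are assumed to be the local longitude count, local latitude count and ghost-zone width.

```python
def extra_exchange_points(llon: int, llat: int, ng: int) -> int:
    """Extra points exchanged across cores under option 1).

    Evaluates llon*llat - 2*ng*llat, the expression quoted on the slide.
    """
    return llon * llat - 2 * ng * llat


# Hypothetical local subdomain: 32 x 16 points with a ghost zone of width 2.
print(extra_exchange_points(llon=32, llat=16, ng=2))  # 32*16 - 2*2*16 = 448
```

The expression factors as llat*(llon-2*ng), so the extra traffic grows with the interior width of the subdomain and vanishes when llon equals 2*ng.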

[Figure: example decomposition with nproca = 4 and nprocb = 3; axes glon and glat, equator (EQ) marked.]

[Figure: timcom grid layout with ghost zone width ng. Index labels: I0, I1 (ioe0), J0, J1 (jon0), 2 (jos0), 2 (jow0), 1. Latitude labels: YDEG, YVDEG, Y, YV, Y0DEG, Y1DEG. Longitude labels: X0DEG, X1DEG. Grid spacings: DX, DY, DYV. Axes: X and Y.]

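Parallel considerations. Many settings in the current version are still problematic, so it crashes almost immediately, but it is still worth trying. In addition, if possible, we recommend merging the subroutines in the current mo_ocean that have the same function as those in the original timcom into the standalone parallel timcom version, so that they can be tested easily for correct behavior. In particular, we hope to develop an ocean-only model in f90 with dynamically allocated memory that supports ng>=2. These tests will help us when we later merge the code into echam.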

Information for the whole ECHAM domain
- nlon: number of longitudes of the global domain
- nlat: number of latitudes of the global domain
- nlev: number of levels of the global domain
Information valid for all processes of a model instance
- nproca: number of processors for the dimension that counts longitudes
- nprocb: number of processors for the dimension that counts latitudes
- d_nprocs: number of processors used in the model domain (nproca × nprocb)
- spe, epe: index numbers of the first and last processor that handle this model domain
- mapmesh(ib,ia): array mapping from a logical 2-D mesh to the processor index numbers within the decomposition table global_decomposition; ib = 1, nprocb; ia = 1, nproca

General local information
- pe: processor identifier. This number is used in the MPI send and receive routines.
- set_b: index of the processor in the direction of longitudes. This number determines the location within the array mapmesh. Processors with ascending numbers handle subdomains with increasing longitudes.
- set_a: index of the processor in the direction of latitudes. This number determines the location within the array mapmesh. Processors with ascending numbers handle subdomains with decreasing values of absolute latitude.
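The relation between mapmesh, pe, set_a and set_b can be sketched in a few lines. This is an illustration only, not ECHAM code: the helper names build_mapmesh and locate are hypothetical, and the numbering order (ib varying fastest, starting at spe) is an assumption made for the example.

```python
def build_mapmesh(nproca: int, nprocb: int, spe: int = 1):
    """Return mapmesh as nested lists indexed [ib][ia] (0-based here).

    Assumption for this sketch: processors spe, spe+1, ... are assigned
    with ib varying fastest. The real ordering may differ.
    """
    mapmesh = [[0] * nproca for _ in range(nprocb)]
    pe = spe
    for ia in range(nproca):
        for ib in range(nprocb):
            mapmesh[ib][ia] = pe
            pe += 1
    return mapmesh


def locate(mapmesh, pe: int):
    """Return the 1-based (set_b, set_a) position of processor pe."""
    for ib, row in enumerate(mapmesh):
        for ia, candidate in enumerate(row):
            if candidate == pe:
                return ib + 1, ia + 1
    raise ValueError(f"processor {pe} is not part of this decomposition")


mesh = build_mapmesh(nproca=4, nprocb=3)
d_nprocs = 4 * 3             # nproca x nprocb
spe, epe = 1, d_nprocs       # first and last processor index
print(mesh)                  # 3 rows (ib) by 4 columns (ia)
print(locate(mesh, 5))       # position of processor 5 in the mesh
```

With the assumed ordering, processor 5 sits at set_b = 2, set_a = 2.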

Grid space decomposition
- nglat, nglon: number of latitudes and longitudes in grid space handled by this processor.
- nglpx: number of longitudes allocated.
- glats(1:2), glate(1:2): start and end values of global latitude indices.
- glons(1:2), glone(1:2): start and end values of global longitude indices.
- glat(1:nglat): global latitude index.
- glon(1:nglon): offset to global longitude index.
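The two-element arrays glats(1:2) and glate(1:2) suggest that each processor owns two latitude segments, one per hemisphere. A minimal sketch of such a layout follows. It rests on assumptions that are not stated in the slides: latitude index 1 is at the North Pole, the number of latitudes divides evenly, and the southern segment mirrors the northern one. The function names and the parameter nsets_lat (number of processor sets along latitude) are hypothetical.

```python
def latitude_segments(nlat: int, nsets_lat: int, set_idx: int):
    """Return (glats, glate) for the processor set set_idx (1-based).

    Each tuple holds the northern and the southern segment. Set 1 is
    closest to the poles, matching the rule that ascending set numbers
    handle decreasing absolute latitudes.
    """
    if nlat % (2 * nsets_lat) != 0:
        raise ValueError("nlat must be divisible by 2 * nsets_lat in this sketch")
    rows = nlat // (2 * nsets_lat)           # rows per hemisphere segment
    north_start = (set_idx - 1) * rows + 1
    north_end = set_idx * rows
    south_start = nlat - north_end + 1       # mirror image of the northern band
    south_end = nlat - north_start + 1
    return (north_start, south_start), (north_end, south_end)


def global_latitude_index(glats, glate):
    """Build glat(1:nglat): the global index of every local latitude row."""
    return [j for start, end in zip(glats, glate) for j in range(start, end + 1)]


glats, glate = latitude_segments(nlat=48, nsets_lat=3, set_idx=1)
glat = global_latitude_index(glats, glate)
print(glats, glate)      # start and end indices of the two segments
print(len(glat))         # nglat for this processor
```

Pairing a polar band with its mirror image gives every processor rows from both hemispheres, which is the property option 2) of the domain decomposition relies on.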

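Variables in echam's memory_g3b and similar modules (such as sitwt and sitwu) are all local variables. They are not scattered from a main process and then collected from the individual processors; each node computes them separately. However, echam's data layout still differs from that of timcom.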

The Lin-Rood Finite Volume (FV) Dynamical Core: Tutorial. Christiane Jablonowski, National Center for Atmospheric Research, Boulder, Colorado. NCAR Tutorial, May 31, 2005.

Topics that we discuss today
The Lin-Rood Finite Volume (FV) dynamical core
- History: where, when, who, …
- Equations & some insights into the numerics
- Algorithm and code design
The grid
- Horizontal resolution
- Grid staggering: the C-D grid concept
- Vertical grid and remapping technique
Practical advice when running the FV dycore
- Namelist and netcdf variables (input & output)
- Dynamics - physics coupling
Hybrid parallelization concept
- Distributed-shared memory parallelization approach: MPI and OpenMP
Everything you would like to know

Who, when, where, …
- FV transport algorithm developed by S.-J. Lin and Ricky Rood (NASA GSFC) in 1996
- 2D shallow water model in 1997
- 3D FV dynamical core around 1998/1999
- Until 2000: FV dycore mainly used in the data assimilation system at NASA GSFC
- Also: transport scheme in ‘Impact’, offline tracer transport
- In 2000: FV dycore was added to NCAR’s CCM3.10 (now CAM3)
- Today (2005), the FV dycore might become the default in CAM3, is used in WACCM, and is used in the climate model at GFDL

Dynamical cores of General Circulation Models. [Figure: the ‘Dynamics’ and ‘Physics’ components of a GCM.] FV: no explicit diffusion (besides divergence damping).

The NASA/NCAR finite volume dynamical core
3D hydrostatic dynamical core for climate and weather prediction:
- 2D horizontal equations are very similar to the shallow water equations
- 3rd dimension in the vertical direction is a floating Lagrangian coordinate: pure 2D transport with vertical remapping steps
Numerics: finite volume approach
- conservative and monotonic 2D transport scheme
- upwind-biased orthogonal 1D fluxes, operator splitting in 2D
- van Leer second order scheme for time-averaged numerical fluxes
- PPM third order scheme (piecewise parabolic method) for prognostic variables
- Staggered grid (Arakawa D-grid for prognostic variables)

The 3D Lin-Rood Finite-Volume Dynamical Core
- Momentum equation in vector-invariant form
- Continuity equation
- Thermodynamic equation, also for tracers (replace Θ)
- The prognostic variables are: δp: pressure thickness, Θ = T p^(-κ): scaled potential temperature
- Pressure gradient term in finite volume form
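The equations named on this slide are not part of the transcript. As a hedged reconstruction, they can be written in the form used by Lin (2004), here in vector notation. The symbols not defined on the slide are assumptions of this sketch: V is the horizontal wind, ζ + f the absolute vorticity, K the kinetic energy per unit mass, Φ the geopotential, D the horizontal divergence and ν the divergence damping coefficient.

```latex
\begin{aligned}
\frac{\partial \mathbf{V}}{\partial t}
  &= -(\zeta + f)\,\hat{\mathbf{k}} \times \mathbf{V}
     - \nabla\left(K + \Phi - \nu D\right)
     - \frac{1}{\rho}\nabla p, \\
\frac{\partial\,\delta p}{\partial t}
  + \nabla\cdot\left(\mathbf{V}\,\delta p\right) &= 0, \\
\frac{\partial\,(\delta p\,\Theta)}{\partial t}
  + \nabla\cdot\left(\mathbf{V}\,\delta p\,\Theta\right) &= 0,
  \qquad \Theta = T\,p^{-\kappa}.
\end{aligned}
```

For a tracer, Θ in the last equation is replaced by the tracer mixing ratio. The νD term is the divergence damping mentioned as the only explicit diffusion in the FV core.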

Finite volume principle
- Continuity equation in flux form
- Integrate over one time step Δt and the 2D finite volume Ω with area A
- Integrate and rearrange: this yields the time-averaged numerical flux and the spatially-averaged pressure thickness

Finite volume principle (continued)
- Apply the Gauss divergence theorem (n: unit normal vector)
- Discretize
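The steps listed on the two "Finite volume principle" slides can be sketched as the following derivation. It is a reconstruction under assumptions, since the slide equations are not in the transcript: the last line uses a rectangular Cartesian cell of size Δx by Δy (so A = ΔxΔy) rather than the spherical form, and F and G denote the time-averaged fluxes of δp through the x and y faces.

```latex
\begin{aligned}
&\frac{\partial\,\delta p}{\partial t} + \nabla\cdot(\mathbf{V}\,\delta p) = 0 \\[4pt]
&\iint_{\Omega}\delta p(t+\Delta t)\,dA - \iint_{\Omega}\delta p(t)\,dA
   = -\int_{t}^{t+\Delta t}\iint_{\Omega}\nabla\cdot(\mathbf{V}\,\delta p)\,dA\,dt \\[4pt]
&\overline{\delta p}^{\,n+1} = \overline{\delta p}^{\,n}
   - \frac{1}{A}\int_{t}^{t+\Delta t}\oint_{\partial\Omega}
     \delta p\,\mathbf{V}\cdot\mathbf{n}\,dl\,dt,
   \qquad \overline{\delta p} = \frac{1}{A}\iint_{\Omega}\delta p\,dA \\[4pt]
&\overline{\delta p}^{\,n+1}_{i,j} = \overline{\delta p}^{\,n}_{i,j}
   - \frac{\Delta t}{\Delta x}\left(F_{i+1/2,j} - F_{i-1/2,j}\right)
   - \frac{\Delta t}{\Delta y}\left(G_{i,j+1/2} - G_{i,j-1/2}\right)
\end{aligned}
```

Because each interface flux appears once with a plus sign and once with a minus sign in neighboring cells, summing the last line over all cells leaves the total mass unchanged.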

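Orthogonal fluxes across cell interfaces. [Figure: cell (i,j) with fluxes F(i-1/2,j) and F(i+1/2,j) in x and G(i,j-1/2) and G(i,j+1/2) in y, and the wind direction.] F: fluxes in x direction. G: fluxes in y direction. Flux form ensures mass conservation. Upwind-biased: the flux at an interface is taken from the cell on the upwind side, according to the wind direction.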
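The two properties named on this slide, upwind bias and mass conservation through the flux form, can be illustrated in one dimension. The sketch below uses the simplest choice, a first-order upwind flux on a periodic grid. It is not the FV core's scheme (which uses van Leer and PPM fluxes), and the function name is hypothetical.

```python
def upwind_flux_update(q, u, dt, dx):
    """Advance cell means q by one flux-form step on a periodic 1D grid.

    u[i] is the wind at the left interface (i-1/2) of cell i. The flux
    through that interface takes its value from the upwind cell.
    """
    n = len(q)
    # q[i - 1] with i = 0 wraps to the last cell, which gives periodicity.
    flux = [u[i] * (q[i - 1] if u[i] >= 0.0 else q[i]) for i in range(n)]
    # Each cell changes by the difference of the fluxes through its two faces.
    return [q[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]


q0 = [0.0, 1.0, 0.0, 0.0]
q1 = upwind_flux_update(q0, u=[1.0] * 4, dt=0.5, dx=1.0)
print(q1)                  # the pulse spreads downwind
print(sum(q0), sum(q1))    # the total is unchanged
```

Every interface flux is subtracted from one cell and added to its neighbor, so the sum over all cells is conserved regardless of which flux formula is used.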

Quasi semi-Lagrange approach in x direction. [Figure: cell (i,j) with fluxes G(i,j-1/2), G(i,j+1/2), F(i+1/2,j) and F(i-5/2,j).] CFL_x = u Δt/Δx > 1 is possible: implemented as an integer shift and a fractional flux calculation. CFL_y = v Δt/Δy < 1 is required.
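The "integer shift and fractional flux" idea can be sketched for the simplest case: a constant wind u >= 0, a periodic 1D grid and a constant subgrid distribution. This is an illustration under those assumptions, not the FV core's implementation, and the function name is hypothetical.

```python
import math


def ffsl_step_constant_wind(q, courant):
    """One flux-form step that stays valid for Courant numbers above 1.

    The Courant number is split into an integer part (whole cells swept
    past an interface) and a fractional part (a partial cell).
    """
    n = len(q)
    shift = math.floor(courant)      # integer shift
    frac = courant - shift           # fractional Courant number, 0 <= frac < 1

    def swept_mass(i):
        # Mass (in units of dx) crossing interface i-1/2 during the step:
        # 'shift' whole upstream cells plus a fraction of the next one.
        whole = sum(q[(i - k) % n] for k in range(1, shift + 1))
        return whole + frac * q[(i - shift - 1) % n]

    return [q[i] - (swept_mass(i + 1) - swept_mass(i)) for i in range(n)]


q0 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
q1 = ffsl_step_constant_wind(q0, courant=2.5)
print(q1)   # the pulse moves 2.5 cells and is split over two cells
```

With a constant subgrid distribution the update reduces to q_new[i] = (1 - frac) * q[i - shift] + frac * q[i - shift - 1], which is a shift followed by a first-order upwind step with the fractional Courant number.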

Numerical fluxes & subgrid distributions
- 1st order upwind: constant subgrid distribution
- 2nd order van Leer: linear subgrid distribution
- 3rd order PPM (piecewise parabolic method): parabolic subgrid distribution
- ‘Monotonicity’ versus ‘positive definite’ constraints
- Numerical diffusion
Explicit time stepping scheme: requires short time steps that are stable for the fastest waves (e.g. gravity waves).
CGD web page for CAM3:

Subgrid distributions: constant (1st order). [Figure: piecewise constant distribution over cells x1 to x4, advected with wind u.]

Subgrid distributions: piecewise linear (2nd order), van Leer. [Figure: piecewise linear distribution over cells x1 to x4, advected with wind u.] See details in van Leer 1977.

Subgrid distributions: piecewise parabolic (3rd order), PPM. [Figure: piecewise parabolic distribution over cells x1 to x4, advected with wind u.] See details in Carpenter et al. 1990 and Colella and Woodward 1984.

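Monotonicity constraint (van Leer). [Figure: limited linear distribution over cells x1 to x4, advected with wind u; an unlimited slope is marked ‘not allowed’.] The monotonicity constraint results in discontinuities at the cell edges, prevents over- and undershoots, and adds diffusion. See details of the monotonicity constraint in van Leer 1977.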
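One common way to write a monotonicity constraint for the linear subgrid distribution is the limiter below, often called the monotonized central limiter. It is shown as an illustration of the idea, on the assumption that it is close to what the slide depicts. The function name is hypothetical.

```python
import math


def limited_slope(q_left, q_center, q_right):
    """Return the limited difference across one cell (slope times dx).

    The centered difference is kept unless it would push the linear
    profile at a cell edge beyond a neighboring cell mean.
    """
    d_left = q_center - q_left
    d_right = q_right - q_center
    if d_left * d_right <= 0.0:
        return 0.0                       # local extremum: use a flat profile
    central = 0.5 * (d_left + d_right)   # unconstrained centered difference
    bound = min(abs(central), 2.0 * abs(d_left), 2.0 * abs(d_right))
    return math.copysign(bound, central)


print(limited_slope(0.0, 1.0, 2.0))   # smooth data: centered slope is kept
print(limited_slope(0.0, 1.0, 0.0))   # extremum: slope is removed
print(limited_slope(0.0, 0.1, 2.0))   # steep gradient: slope is reduced
```

In the third case the edge values q_center ± slope/2 are 0.0 and 0.2, which stay between the neighboring cell means. The reduced slope is what prevents over- and undershoots, at the price of extra diffusion.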

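Simplified flow chart. [Flow chart: stepon calls dynpkg and physpkg. dynpkg contains cd_core (subcycled), trac2d and te_map (vertical remapping). cd_core contains c_sw (1/2 Δt only: compute C-grid time-mean winds) and d_sw (full Δt: update all D-grid variables). d_p_coupling and p_d_coupling connect dynamics and physics.]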

Grid staggerings (after Arakawa). [Figure: placement of the wind components u and v and of the scalars on the A, B, C and D grids.]

Regular latitude - longitude grid
- Converging grid lines at the poles decrease the physical spacing Δx
- Digital and Fourier filters remove unstable waves at high latitudes
- Pole points are mass-points

Typical horizontal resolutions
- The time step is the ‘physics’ time step: dynamics are subcycled using the time step Δt/nsplit
- ‘nsplit’ is typically 8 or 10
- CAM3: check (dtime=1800s due to physics ?)
- WACCM: check (nsplit = 4, dtime=1800s for 2° x 2.5° ?)
Defaults (table columns: Δφ x Δλ, Lat x Lon, Max. Δx (km), Δt (s), ≈ spectral; the Lon, Max. Δx and Δt entries are not in the transcript):
- 4° x 5°: Lat x Lon = 46 x [missing], ≈ T21 (32x64)
- 2° x 2.5°: Lat x Lon = 91 x [missing], ≈ T42 (64x128)
- 1° x 1.25°: Lat x Lon = 181 x [missing], ≈ T85 (128x256)
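Some of the quantities in this table follow from the grid spacing alone and can be estimated with a few lines of arithmetic. The sketch below is a rough illustration: the function names are hypothetical, the Earth radius of 6371 km is an assumed value, and the grid-point count assumes a regular grid with mass points at both poles. The computed numbers are estimates, not the values from the original slide.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed value


def grid_points(dlat_deg, dlon_deg):
    """Number of latitudes and longitudes of a regular grid with pole points."""
    nlat = round(180.0 / dlat_deg) + 1   # both poles are grid points
    nlon = round(360.0 / dlon_deg)       # periodic in longitude
    return nlat, nlon


def max_dx_km(dlon_deg):
    """Largest zonal grid spacing, which occurs at the equator."""
    return math.radians(dlon_deg) * EARTH_RADIUS_KM


def dynamics_time_step(dtime, nsplit):
    """Subcycled dynamics time step for a given physics time step."""
    return dtime / nsplit


for dlat, dlon in [(4.0, 5.0), (2.0, 2.5), (1.0, 1.25)]:
    print(grid_points(dlat, dlon), round(max_dx_km(dlon)))

print(dynamics_time_step(dtime=1800.0, nsplit=4))  # seconds per dynamics step
```

The latitude counts reproduce the 46, 91 and 181 listed above. With dtime = 1800 s and nsplit = 4, the combination mentioned for WACCM, the dynamics step comes out as 450 s.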

Idealized baroclinic wave test case Jablonowski and Williamson 2005 The coarse resolution does not capture the evolution of the baroclinic wave

Idealized baroclinic wave test case Finer resolution: Clear intensification of the baroclinic wave

Idealized baroclinic wave test case Finer resolution: Clear intensification of the baroclinic wave, it starts to converge

Idealized baroclinic wave test case Baroclinic wave pattern converges

Idealized baroclinic wave test case: convergence of the FV dynamics. The solution starts converging at 1°. [Figure: global L2 error norms of p_s; the shaded region indicates the uncertainty of the reference solution.]

Floating Lagrangian vertical coordinate
- 2D transport calculations with moving finite volumes (Lin 2004)
- Layers are material surfaces, no vertical advection
- Periodic re-mapping of the Lagrangian layers onto the reference grid
- WACCM: 66 vertical levels with model top around 130 km
- CAM3: 26 levels with model top around 3 hPa (40 km)
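The periodic re-mapping step can be illustrated with the simplest conservative choice: treat each deformed Lagrangian layer as piecewise constant and average it over the reference layers. This is only a sketch under that assumption. The FV core uses higher-order subgrid distributions for this step (selected by KORD), and the function name and sample numbers are hypothetical.

```python
def remap_conservative(src_edges, src_means, dst_edges):
    """Remap layer means from a deformed grid onto a reference grid.

    Edges are given in increasing order (for example pressure) and both
    grids must span the same column. Each destination mean is the
    overlap-weighted average of the source means, so the column integral
    is preserved.
    """
    dst_means = []
    for k in range(len(dst_edges) - 1):
        lo, hi = dst_edges[k], dst_edges[k + 1]
        total = 0.0
        for j, mean in enumerate(src_means):
            overlap = min(hi, src_edges[j + 1]) - max(lo, src_edges[j])
            if overlap > 0.0:
                total += mean * overlap
        dst_means.append(total / (hi - lo))
    return dst_means


# Two deformed layers remapped onto two equally thick reference layers.
print(remap_conservative([0.0, 30.0, 100.0], [2.0, 4.0], [0.0, 50.0, 100.0]))
```

In the example the column integral is 2*30 + 4*70 = 340 before the remap and 2.8*50 + 4*50 = 340 after it.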

Physics - Dynamics coupling
- Prognostic data are vertically remapped (in cd_core) before dp_coupling is called (in dynpkg)
- The vertical remapping routine computes the vertical velocity ω and the surface pressure p_s
- d_p_coupling and p_d_coupling (module dp_coupling) are the interfaces to the CAM3/WACCM physics package
- Copy / interpolate the data from the ‘dynamics’ data structure to the ‘physics’ data structure (chunks), A-grid
- Time-split physics coupling: instantaneous updates of the A-grid variables; the order of the physics parameterizations matters; physics tendencies for u & v updates on the D grid are collected

Practical tips
- What do IORD, JORD, KORD mean?
- IORD and JORD at the model top are different (see cd_core.F90)
- Relationship between dtime, nsplit (what happens if you don’t select nsplit or set nsplit = 0? The default is computed in the routine d_split in dynamics_var.F90) and the time interval for the physics & vertical remapping step
- Namelist variables:
- Input / Output:
- Initial conditions: staggered wind components US and VS required (D-grid)
- Wind at the poles is not predicted but derived
- User’s Guide:

Practical tips
IORD, JORD, KORD determine the numerical scheme:
- IORD: scheme for flux calculations in x direction
- JORD: scheme for flux calculations in y direction
- KORD: scheme for the vertical remapping step
Available options:
- -2: linear subgrid, van Leer, unconstrained
- 1: constant subgrid, 1st order
- 2: linear subgrid, van Leer, monotonicity constraint (van Leer 1977)
- 3: parabolic subgrid, PPM, monotonic (Colella and Woodward 1984)
- 4: parabolic subgrid, PPM, monotonic (Lin and Rood 1996, see FFSL3)
- 5: parabolic subgrid, PPM, positive definite constraint
- 6: parabolic subgrid, PPM, quasi-monotone constraint
Defaults: 4 (PPM) on the D grid (d_sw), -2 on the C grid (c_sw)
Namelist variables:

‘Hybrid’ Computer Architecture
- SMP: symmetric multi-processor
- Hybrid parallelization technique possible: shared memory (OpenMP) within a node, distributed memory approach (MPI) across nodes
- Example: NCAR’s Bluesky (IBM) with 8-way and 32-way nodes

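Schematic parallelization technique. 1D distributed memory parallelization (MPI) across the latitudes. [Figure: latitude bands between the South Pole (SP), the equator (Eq.) and the North Pole (NP) assigned to processors; the longitude axis is labeled from 0 to 340.]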

Schematic parallelization technique. Each MPI domain contains ‘ghost cells’ (halo regions): copies of the neighboring data that belong to different processors. [Figure: the latitude band of Proc. 2 between SP, Eq. and NP, with the ghost cells for PPM marked along its edges.]
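The 1D latitude decomposition with ghost cells can be sketched as index arithmetic. The code below is an illustration only: the function names are hypothetical, the bands are made as equal as possible, and the halo width of 3 rows is an assumed value for the example, not a number taken from the slides.

```python
def latitude_bands(nlat, nprocs):
    """Split latitudes 1..nlat into contiguous bands, one per MPI process.

    Any remainder is spread over the first processes so that band sizes
    differ by at most one row.
    """
    base, extra = divmod(nlat, nprocs)
    bands, start = [], 1
    for proc in range(nprocs):
        size = base + (1 if proc < extra else 0)
        bands.append((start, start + size - 1))
        start += size
    return bands


def with_ghost_rows(band, nlat, ng):
    """Extend a band by ng ghost rows on each side, clipped at the poles."""
    first, last = band
    return max(1, first - ng), min(nlat, last + ng)


bands = latitude_bands(nlat=91, nprocs=4)
print(bands)                                     # rows owned by each process
print(with_ghost_rows(bands[1], nlat=91, ng=3))  # owned rows plus halo
```

Before each transport step, the ghost rows of a band are filled with copies of the rows owned by the neighboring processes, so the flux calculation near the band edges has the data it needs.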

Schematic parallelization technique. Shared memory parallelization (in CAM3 most often) in the vertical direction via OpenMP compiler directives.
Typical loop:
    do k = 1, plev
      …
    enddo
Can often be parallelized with OpenMP (check dependencies):
    !$OMP PARALLEL DO …
    do k = 1, plev
      …
    enddo

Schematic parallelization technique. Shared memory parallelization (in CAM3 most often) in the vertical direction via OpenMP compiler directives, e.g.: assume 4 parallel ‘threads’ and a 4-way SMP node (4 CPUs).
    !$OMP PARALLEL DO …
    do k = 1, plev
      …
    enddo
[Figure: the levels k = 1 to plev distributed over the CPUs.]
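The structure of the parallel loop over k can be mimicked with a thread pool: each level is handed to one of 4 workers, which is valid only because the levels do not depend on each other. This is an analogy for the loop structure, not a performance model, and it is written in Python rather than Fortran with OpenMP. The function names and the stand-in computation are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def update_level(level_values):
    """Stand-in for the work done on one model level inside the k loop."""
    return [2.0 * value for value in level_values]


def update_all_levels(field, num_threads=4):
    """Apply update_level to every level, using num_threads worker threads.

    pool.map returns the results in level order, like the serial loop.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(update_level, field))


plev = 26                                            # number of levels
field = [[float(k), float(k) + 0.5] for k in range(1, plev + 1)]
result = update_all_levels(field, num_threads=4)
print(result[0], result[-1])                         # first and last level
```

If one level needed the result of another, as in a vertical sum, the iterations could not be distributed this way. That is the dependency check the slide asks for before adding the directive.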

Thank you! Any questions? Tracer transport? Fortran code …

References
Carpenter, R. L., K. K. Droegemeier, P. W. Woodward and C. E. Hanem, 1990: Application of the Piecewise Parabolic Method (PPM) to Meteorological Modeling. Mon. Wea. Rev., 118.
Colella, P., and P. R. Woodward, 1984: The piecewise parabolic method (PPM) for gas-dynamical simulations. J. Comput. Phys., 54.
Jablonowski, C., and D. L. Williamson, 2005: A baroclinic instability test case for atmospheric model dynamical cores. Submitted to Mon. Wea. Rev.
Lin, S.-J., and R. B. Rood, 1996: Multidimensional Flux-Form Semi-Lagrangian Transport Schemes. Mon. Wea. Rev., 124.
Lin, S.-J., and R. B. Rood, 1997: An explicit flux-form semi-Lagrangian shallow water model on the sphere. Quart. J. Roy. Meteor. Soc., 123.
Lin, S.-J., 1997: A finite volume integration method for computing pressure gradient forces in general vertical coordinates. Quart. J. Roy. Meteor. Soc., 123.
Lin, S.-J., 2004: A ‘Vertically Lagrangian’ Finite-Volume Dynamical Core for Global Models. Mon. Wea. Rev., 132.
van Leer, B., 1977: Towards the ultimate conservative difference scheme. IV. A new approach to numerical convection. J. Comput. Phys.