

Parallelization work to be completed
1. Settle on the domain decomposition scheme for the parallelization.
2. TIMCOM preprocessor in F95 with dynamically allocated memory (inmets, indata, bounds).
3. TIMCOM main code in F95 with dynamically allocated memory.
4. EVP solver in F95 with dynamically allocated memory.
5. Subroutines a2o/o2a with data exchange across CPU cores, or
6. TIMCOM rewritten so that each CPU core can handle the northern and southern hemispheres at the same time.
7. NetCDF input and output.

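Domain decomposition options
1) Adopt the TIMCOM layout. Subroutines such as a2o/o2a must be modified so that they can exchange TIMCOM and ECHAM data across nodes. Pro: the ghost-zone traffic in the y direction is half that of option 2). Con: an additional (llon*llat-2*ng*llat) of data must be exchanged across cores.
2) Adopt the ECHAM layout. Each CPU core must then compute the ocean domain of both the northern and the southern hemisphere, which requires modifying part of the TIMCOM code. Pro: with the same number of CPUs this is faster than 1), because under the condition llon>2 less data has to be exchanged across cores.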
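To get a feel for the cost quoted for option 1), the expression llon*llat-2*ng*llat can be evaluated for a few sample sizes. The sketch below assumes that llon and llat are the numbers of local longitude and latitude points on one core and that ng is the ghost-zone width; those readings, the function name and the sample values are illustrative assumptions, not taken from the TIMCOM code.

```python
def extra_exchange_points(llon: int, llat: int, ng: int) -> int:
    """Evaluate the slide's expression llon*llat - 2*ng*llat.

    Assumed meaning (not confirmed by the source): llon, llat are local
    grid sizes on one core and ng is the ghost-zone width.
    """
    if llon <= 0 or llat <= 0 or ng < 0:
        raise ValueError("llon and llat must be positive, ng non-negative")
    return llon * llat - 2 * ng * llat


# Hypothetical local domain of 32 x 16 points with two ghost-zone widths.
sample = {ng: extra_exchange_points(32, 16, ng) for ng in (1, 2)}
```

The expression factors as llat*(llon-2*ng), so the extra volume shrinks as the ghost zone widens relative to the local longitude count.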

[Figure: ECHAM domain decomposition with nproca = 4 and nprocb = 3; axes glon and glat, with the equator marked EQ.]

[Figure: TIMCOM grid layout in X and Y with ghost-zone width ng. Labels include the index limits I0, I1, J0, J1; jon0, jos0, jow0, ioe0; the latitude arrays YDEG/Y and YVDEG/YV; the spacings DX, DY, DYV; and the domain limits X0DEG, X1DEG, Y0DEG, Y1DEG.]

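Parallel considerations
Many settings in the current version are still wrong, so it crashes almost immediately. Trying it out is still worthwhile.
In addition, if possible, the subroutines in the current mo_ocean that do the same job as those in the original TIMCOM should be merged into the standalone parallelized TIMCOM version. This makes it easier to test whether they work correctly. In particular, the goal is an ocean-only model with ng>=2, written in F90 with dynamically allocated memory. These tests will help when the code is later merged into ECHAM.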

Information for the whole ECHAM domain
nlon : number of longitudes of the global domain
nlat : number of latitudes of the global domain
nlev : number of levels of the global domain
Information valid for all processes of a model instance
nproca : number of processors for the dimension that counts longitudes
nprocb : number of processors for the dimension that counts latitudes
d_nprocs : number of processors used in the model domain (nproca × nprocb)
spe, epe : index numbers of the first and last processor that handle this model domain
mapmesh(ib,ia) : array mapping from a logical 2-d mesh to the processor index numbers within the global decomposition table; ib = 1, nprocb ; ia = 1, nproca
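The role of mapmesh can be illustrated with a small sketch that builds such a table for a given nproca and nprocb. The numbering used here (consecutive processor indices starting at spe, with ib varying fastest) is an assumption made for illustration; the actual ordering in ECHAM may differ, and build_mapmesh is a hypothetical helper name.

```python
def build_mapmesh(nproca: int, nprocb: int, spe: int = 1):
    """Return (mapmesh, epe) for a logical nprocb x nproca processor mesh.

    mapmesh is a dict keyed by 1-based (ib, ia), mirroring mapmesh(ib,ia).
    Assumption: processor indices run consecutively from spe, ib fastest.
    """
    if nproca < 1 or nprocb < 1:
        raise ValueError("nproca and nprocb must be at least 1")
    mapmesh = {}
    pe = spe
    for ia in range(1, nproca + 1):
        for ib in range(1, nprocb + 1):
            mapmesh[(ib, ia)] = pe
            pe += 1
    epe = pe - 1  # index of the last processor of this model domain
    return mapmesh, epe


# Values shown in the decomposition figure: nproca = 4, nprocb = 3.
mesh, last_pe = build_mapmesh(nproca=4, nprocb=3)
```

With these values the table holds d_nprocs = nproca × nprocb = 12 entries, and spe and epe bracket the processor indices of the model instance.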

General local information
pe : processor identifier. This number is used in the MPI send and receive routines.
set_b : index of the processor in the direction of longitudes. This number determines the location within the array mapmesh. Processors with ascending numbers handle subdomains with increasing longitudes.
set_a : index of the processor in the direction of latitudes. This number determines the location within the array mapmesh. Processors with ascending numbers handle subdomains with decreasing values of absolute latitude.

Grid space decomposition
nglat, nglon : number of latitudes and longitudes in grid space handled by this processor.
nglpx : number of longitudes allocated.
glats(1:2), glate(1:2) : start and end values of the global latitude indices.
glons(1:2), glone(1:2) : start and end values of the global longitude indices.
glat(1:nglat) : global latitude index.
glon(1:nglon) : offset to the global longitude index.
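The two-element arrays glats(1:2) and glate(1:2) suggest that each processor holds two latitude bands. One way this could work is sketched below, under the assumption that latitudes are indexed 1..nlat from pole to pole, that nlat is even, and that each processor row set_a receives one band per hemisphere, mirrored about the equator. The helper names and the block-splitting rule are hypothetical and not taken from the ECHAM source.

```python
def split_range(n: int, nparts: int, part: int):
    """1-based inclusive (start, end) of block `part` when n indices are
    split into nparts nearly equal contiguous blocks."""
    base, extra = divmod(n, nparts)
    start = (part - 1) * base + min(part - 1, extra) + 1
    size = base + (1 if part <= extra else 0)
    return start, start + size - 1


def latitude_bands(nlat: int, nproca: int, set_a: int):
    """Return (glats, glate): start and end indices of the two bands.

    Assumption: band 1 lies in the first hemisphere, band 2 is its mirror
    image, so set_a = 1 holds the rows closest to the poles.
    """
    if nlat % 2:
        raise ValueError("this sketch assumes an even number of latitudes")
    start, end = split_range(nlat // 2, nproca, set_a)
    glats = (start, nlat - end + 1)
    glate = (end, nlat - start + 1)
    return glats, glate


bands = [latitude_bands(48, 4, a) for a in (1, 2, 3, 4)]
```

In this sketch ascending set_a moves from the polar bands towards the equator, which matches the statement that ascending processor numbers handle decreasing absolute latitudes.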

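Variables in ECHAM's memory_g3b (such as sitwt and sitwu) are all local variables. They are not scattered from a main process and then collected from the individual processors; each node computes them separately. The ECHAM data layout nevertheless still differs from that of TIMCOM.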

The Lin-Rood Finite Volume (FV) Dynamical Core: Tutorial
Christiane Jablonowski
National Center for Atmospheric Research, Boulder, Colorado
NCAR Tutorial, May 31, 2005

Topics that we discuss today
The Lin-Rood Finite Volume (FV) dynamical core
–History: where, when, who, …
–Equations & some insights into the numerics
–Algorithm and code design
The grid
–Horizontal resolution
–Grid staggering: the C-D grid concept
–Vertical grid and remapping technique
Practical advice when running the FV dycore
–Namelist and netcdf variables (input & output)
–Dynamics - physics coupling
Hybrid parallelization concept
–Distributed-shared memory parallelization approach: MPI and OpenMP
Everything you would like to know

Who, when, where, …
FV transport algorithm developed by S.-J. Lin and Ricky Rood (NASA GSFC) (see Lin and Rood 1996)
2D shallow water model (see Lin and Rood 1997)
3D FV dynamical core around 1998/1999
Until 2000: FV dycore mainly used in the data assimilation system at NASA GSFC
Also: transport scheme in ‘Impact’, offline tracer transport
In 2000: FV dycore was added to NCAR’s CCM3.10 (now CAM3)
Today (2005): The FV dycore
–might become the default in CAM3
–is used in WACCM
–is used in the climate model at GFDL

Dynamical cores of General Circulation Models
[Figure labels: Dynamics, Physics.]
FV: No explicit diffusion (besides divergence damping)

The NASA/NCAR finite volume dynamical core
3D hydrostatic dynamical core for climate and weather prediction:
–2D horizontal equations are very similar to the shallow water equations
–3rd dimension in the vertical direction is a floating Lagrangian coordinate: pure 2D transport with vertical remapping steps
Numerics: Finite volume approach
–conservative and monotonic 2D transport scheme
–upwind-biased orthogonal 1D fluxes, operator splitting in 2D
–van Leer second order scheme for time-averaged numerical fluxes
–PPM third order scheme (piecewise parabolic method) for prognostic variables
–Staggered grid (Arakawa D-grid for prognostic variables)

The 3D Lin-Rood Finite-Volume Dynamical Core
The governing equations are the momentum equation in vector-invariant form, the continuity equation, and the thermodynamic equation (also used for tracers, replacing Θ).
The prognostic variables are δp (pressure thickness) and Θ = T p^(-κ) (scaled potential temperature).
The pressure gradient term is computed in finite volume form.

Finite volume principle
Start from the continuity equation in flux form.
Integrate over one time step Δt and the 2D finite volume Ω with area A.
Integrate and rearrange to obtain an update that relates the spatially-averaged pressure thickness to the time-averaged numerical flux.

Finite volume principle
Apply the Gauss divergence theorem, which introduces the unit normal vector on the cell boundary, then discretize.
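The steps named on the two slides above can be written out as follows. The equations on the original slides did not survive, so this is a reconstruction under stated assumptions: δp is the pressure thickness, V the horizontal wind vector, Ω a cell with area A and boundary ∂Ω, and the calligraphic 𝓕 is introduced here as a symbol for the time-averaged flux. It is not the F of the x-direction fluxes on the following slide.

```latex
% Continuity equation in flux form
\frac{\partial\,\delta p}{\partial t} + \nabla\cdot\left(\vec{V}\,\delta p\right) = 0

% Integrate over one time step \Delta t and the 2D finite volume \Omega
\int_{t}^{t+\Delta t}\!\iint_{\Omega} \frac{\partial\,\delta p}{\partial t}\,dA\,dt
 + \int_{t}^{t+\Delta t}\!\iint_{\Omega} \nabla\cdot\left(\vec{V}\,\delta p\right)dA\,dt = 0

% Spatially-averaged pressure thickness and time-averaged numerical flux
\overline{\delta p} = \frac{1}{A}\iint_{\Omega}\delta p\,dA ,
\qquad
\vec{\mathcal{F}} = \frac{1}{\Delta t}\int_{t}^{t+\Delta t}\vec{V}\,\delta p\,dt

% Integrate and rearrange
\overline{\delta p}^{\,n+1} = \overline{\delta p}^{\,n}
 - \frac{\Delta t}{A}\iint_{\Omega}\nabla\cdot\vec{\mathcal{F}}\,dA

% Apply the Gauss divergence theorem (\vec{n}: outward unit normal vector)
\overline{\delta p}^{\,n+1} = \overline{\delta p}^{\,n}
 - \frac{\Delta t}{A}\oint_{\partial\Omega}\vec{\mathcal{F}}\cdot\vec{n}\,dl

% Discretize over the four edges of a cell, each of length \Delta l_m
\overline{\delta p}^{\,n+1} \approx \overline{\delta p}^{\,n}
 - \frac{\Delta t}{A}\sum_{m=1}^{4}\left(\vec{\mathcal{F}}\cdot\vec{n}\right)_{m}\Delta l_{m}
```

Because every edge flux leaves one cell and enters its neighbour with the opposite sign, summing the last line over all cells leaves the total mass unchanged.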

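Orthogonal fluxes across cell interfaces
[Figure: cell (i,j) with the fluxes F_{i-1/2,j}, F_{i+1/2,j}, G_{i,j-1/2} and G_{i,j+1/2} across its four interfaces; the wind direction is indicated.]
F: fluxes in x direction
G: fluxes in y direction
Flux form ensures mass conservation
The fluxes are upwind-biased with respect to the wind direction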
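The link between flux form and mass conservation can be seen in a one-dimensional sketch. The example below uses first-order upwind fluxes with a constant wind on a periodic grid, which is a deliberate simplification of the upwind-biased fluxes described on the slide; the function name and test values are illustrative.

```python
def upwind_step(q, u, dt, dx):
    """Advance q by one flux-form step with first-order upwind fluxes.

    Simplifying assumptions: 1-D periodic grid, constant wind u,
    Courant number |u*dt/dx| <= 1.
    """
    n = len(q)
    c = u * dt / dx
    if abs(c) > 1.0:
        raise ValueError("this sketch requires |CFL| <= 1")
    # flux[i] is the (scaled) flux through the left interface i-1/2 of cell i,
    # taken from the upwind side of that interface.
    if u >= 0:
        flux = [c * q[i - 1] for i in range(n)]  # q[-1] wraps around
    else:
        flux = [c * q[i] for i in range(n)]
    # Each interface flux is added to one cell and subtracted from its neighbour.
    return [q[i] - (flux[(i + 1) % n] - flux[i]) for i in range(n)]


q0 = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
q1 = upwind_step(q0, u=1.0, dt=0.5, dx=1.0)
```

Since the update only moves mass between neighbouring cells, sum(q1) equals sum(q0) up to rounding, whatever the wind direction.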

Quasi semi-Lagrange approach in x direction
[Figure: cell (i,j) with the fluxes G_{i,j-1/2}, G_{i,j+1/2}, F_{i+1/2,j} and an upstream flux F_{i-5/2,j}.]
CFL_x = u * Δt/Δx > 1 possible: implemented as an integer shift and fractional flux calculation
CFL_y = v * Δt/Δy < 1 required
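The integer shift plus fractional flux idea can be sketched in one dimension. The Courant number c is split into an integer part (whole cells swept through an interface) and a fractional part (a partial contribution from the next upstream cell). The sketch assumes a constant positive wind, a periodic grid and a constant subgrid distribution, so it is a first-order stand-in for the scheme in the dycore, not a reproduction of it.

```python
import math


def ffsl_step(q, c):
    """One flux-form step that stays stable for Courant numbers c > 1.

    Assumptions: 1-D periodic grid, constant wind with c = u*dt/dx >= 0,
    constant (1st order) subgrid distribution.
    """
    if c < 0:
        raise ValueError("this sketch only handles c >= 0")
    n = len(q)
    shift = math.floor(c)   # integer part: whole cells crossing an interface
    frac = c - shift        # fractional part: partial upstream cell
    flux = []
    for i in range(n):
        # Mass crossing the left interface of cell i during the step.
        whole = sum(q[(i - k) % n] for k in range(1, shift + 1))
        part = frac * q[(i - shift - 1) % n]
        flux.append(whole + part)
    return [q[i] - (flux[(i + 1) % n] - flux[i]) for i in range(n)]


profile = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
moved = ffsl_step(profile, 2.5)
```

For an integer Courant number the step reduces to an exact shift of the profile. For c = 2.5 each new value is the average of the two cells located two and three positions upstream.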

Numerical fluxes & subgrid distributions
1st order upwind
–constant subgrid distribution
2nd order van Leer
–linear subgrid distribution
3rd order PPM (piecewise parabolic method)
–parabolic subgrid distribution
‘Monotonicity’ versus ‘positive definite’ constraints
Numerical diffusion
Explicit time stepping scheme: Requires short time steps that are stable for the fastest waves (e.g. gravity waves)
CGD web page for CAM3:

Subgrid distributions: constant (1st order)
[Figure: constant subgrid distribution in cells x1 to x4, with wind u.]

Subgrid distributions: piecewise linear (2nd order)
[Figure: van Leer piecewise linear subgrid distribution in cells x1 to x4, with wind u.]
See details in van Leer 1977

Subgrid distributions: piecewise parabolic (3rd order)
[Figure: PPM piecewise parabolic subgrid distribution in cells x1 to x4, with wind u.]
See details in Carpenter et al. 1990 and Colella and Woodward 1984

Monotonicity constraint
[Figure: van Leer subgrid distribution in cells x1 to x4, with wind u; one configuration is marked ‘not allowed’.]
The monotonicity constraint
–results in discontinuities
–prevents over- and undershoots
–adds diffusion
See details of the monotonicity constraint in van Leer 1977
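How a monotonicity constraint prevents over- and undershoots can be illustrated with a limited linear subgrid slope. The sketch below uses the widely used monotonized-central form of the limiter with a constant positive wind on a periodic 1-D grid. Whether this is exactly the constraint coded in the FV dycore is an assumption, and the function names are illustrative.

```python
def limited_slope(qm, q0, qp):
    """Limited slope in a cell from the left, centre and right cell means.

    The slope is set to zero at local extrema and otherwise capped at twice
    the smaller one-sided difference (monotonized-central limiter).
    """
    dl = q0 - qm
    dr = qp - q0
    if dl * dr <= 0.0:
        return 0.0
    central = 0.5 * (dl + dr)
    mag = min(abs(central), 2.0 * abs(dl), 2.0 * abs(dr))
    return mag if central > 0 else -mag


def van_leer_step(q, c):
    """One flux-form step with a piecewise linear subgrid distribution.

    Assumptions: periodic grid, constant wind with 0 <= c = u*dt/dx <= 1.
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("this sketch requires 0 <= c <= 1")
    n = len(q)
    slope = [limited_slope(q[i - 1], q[i], q[(i + 1) % n]) for i in range(n)]
    # flux[i]: time-averaged flux through the right interface of cell i.
    flux = [c * (q[i] + 0.5 * (1.0 - c) * slope[i]) for i in range(n)]
    return [q[i] - (flux[i] - flux[i - 1]) for i in range(n)]


ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 0.0]
advected = van_leer_step(ramp, 0.5)
```

The slope vanishes wherever a cell is a local extremum or borders a plateau, so the scheme falls back to first-order upwind there. That fallback is the source of the added diffusion mentioned on the slide.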

Simplified flow chart
[Flow chart boxes: stepon, dynpkg, physpkg, cd_core, te_map, trac2d, p_d_coupling, d_p_coupling, c_sw, d_sw.]
c_sw: 1/2 Δt only, compute C-grid time-mean winds
d_sw: full Δt, update all D-grid variables
The c_sw and d_sw steps are subcycled
Vertical remapping

Grid staggerings (after Arakawa)
[Figure: A grid, B grid, C grid and D grid, showing where the wind components u and v and the scalars are located in each cell.]

Regular latitude - longitude grid
Converging grid lines at the poles decrease the physical spacing Δx
Digital and Fourier filters remove unstable waves at high latitudes
Pole points are mass-points

Typical horizontal resolutions
The time step Δt is the ‘physics’ time step:
–Dynamics are subcycled using the time step Δt/nsplit
–‘nsplit’ is typically 8 or 10
–CAM3: check (dtime=1800s due to physics ?)
–WACCM: check (nsplit = 4, dtime=1800s for 2° x 2.5° ?)
Defaults (columns: resolution, Lat x Lon, Max. Δx (km), Δt (s), ≈ spectral):
4° x 5°: 46 x …, ≈ T21 (32x64)
2° x 2.5°: 91 x …, ≈ T42 (64x128)
1° x 1.25°: 181 x …, ≈ T85 (128x256)

Idealized baroclinic wave test case (Jablonowski and Williamson 2005)
The coarse resolution does not capture the evolution of the baroclinic wave

Idealized baroclinic wave test case Finer resolution: Clear intensification of the baroclinic wave

Idealized baroclinic wave test case Finer resolution: Clear intensification of the baroclinic wave, it starts to converge

Idealized baroclinic wave test case Baroclinic wave pattern converges

Idealized baroclinic wave test case: Convergence of the FV dynamics
Solution starts converging at 1°
[Figure: global L2 error norms of p_s.]
Shaded region indicates the uncertainty of the reference solution

Floating Lagrangian vertical coordinate
2D transport calculations with moving finite volumes (Lin 2004)
Layers are material surfaces, no vertical advection
Periodic re-mapping of the Lagrangian layers onto the reference grid
WACCM: 66 vertical levels with model top around 130 km
CAM3: 26 levels with model top around 3 hPa (40 km)

Physics - Dynamics coupling
Prognostic data are vertically remapped (in cd_core) before dp_coupling is called (in dynpkg)
Vertical remapping routine computes the vertical velocity ω and the surface pressure p_s
d_p_coupling and p_d_coupling (module dp_coupling) are the interfaces to the CAM3/WACCM physics package
Copy / interpolate the data from the ‘dynamics’ data structure to the ‘physics’ data structure (chunks), A-grid
Time-split physics coupling:
–instantaneous updates of the A-grid variables
–the order of the physics parameterizations matters
–physics tendencies for u & v updates on the D grid are collected

Practical tips
Namelist variables:
–What do IORD, JORD, KORD mean?
–IORD and JORD at the model top are different (see cd_core.F90)
–Relationship between dtime, nsplit, and the time interval for the physics & vertical remapping step (what happens if you don’t select nsplit or set nsplit = 0: the default is computed in the routine d_split in dynamics_var.F90)
Input / Output:
–Initial conditions: staggered wind components US and VS required (D-grid)
–Wind at the poles not predicted but derived
User’s Guide:

Practical tips
Namelist variables: IORD, JORD, KORD determine the numerical scheme
–IORD: scheme for flux calculations in x direction
–JORD: scheme for flux calculations in y direction
–KORD: scheme for the vertical remapping step
Available options:
-2: linear subgrid, van Leer, unconstrained
1: constant subgrid, 1st order
2: linear subgrid, van Leer, monotonicity constraint (van Leer 1977)
3: parabolic subgrid, PPM, monotonic (Colella and Woodward 1984)
4: parabolic subgrid, PPM, monotonic (Lin and Rood 1996, see FFSL3)
5: parabolic subgrid, PPM, positive definite constraint
6: parabolic subgrid, PPM, quasi-monotone constraint
Defaults: 4 (PPM) on the D grid (d_sw), -2 on the C grid (c_sw)

‘Hybrid’ Computer Architecture
SMP: symmetric multi-processor
Hybrid parallelization technique possible:
–Shared memory (OpenMP) within a node
–Distributed memory approach (MPI) across nodes
Example: NCAR’s Bluesky (IBM) with 8-way and 32-way nodes

Schematic parallelization technique
1D distributed memory parallelization (MPI) across the latitudes
[Figure: latitude bands between SP, Eq. and NP assigned to processors; longitude axis from 0 to 340.]

Schematic parallelization technique
Each MPI domain contains ‘ghost cells’ (halo regions): copies of the neighboring data that belong to different processors
[Figure: latitude band of Proc. 2 between SP, Eq. and NP along the longitudes, with the ghost cells for PPM on either side.]
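The ghost-cell idea can be mimicked without MPI by splitting a latitude-by-longitude field into bands and copying the neighbouring rows. The sketch below works on plain Python lists in a single process. The function names, the equal band sizes and the ghost width ng = 2 are illustrative assumptions, and the copies stand in for what would be MPI send and receive calls in the model.

```python
def split_rows(field, nprocs):
    """Split the latitude rows of `field` into nprocs equal bands."""
    if len(field) % nprocs:
        raise ValueError("this sketch assumes equally sized bands")
    rows = len(field) // nprocs
    return [[row[:] for row in field[p * rows:(p + 1) * rows]]
            for p in range(nprocs)]


def exchange_halo(owned, ng):
    """Return, per processor, the (south, north) ghost rows.

    Each ghost row is a copy of a row owned by the neighbouring processor.
    The first and last bands have no neighbour beyond the pole.
    """
    if any(ng > len(block) for block in owned):
        raise ValueError("ghost width exceeds the band size")
    halos = []
    for p in range(len(owned)):
        south = [r[:] for r in owned[p - 1][-ng:]] if p > 0 else []
        north = [r[:] for r in owned[p + 1][:ng]] if p < len(owned) - 1 else []
        halos.append((south, north))
    return halos


# 12 latitude rows, 4 longitudes, 3 processors, ghost width 2 (all hypothetical).
grid = [[10 * j + i for i in range(4)] for j in range(12)]
bands_owned = split_rows(grid, 3)
ghosts = exchange_halo(bands_owned, ng=2)
```

After every update of the owned rows the exchange has to be repeated, because the ghost rows are only copies and become stale once the neighbour changes its data.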

Schematic parallelization technique
Shared memory parallelization (in CAM3 most often) in the vertical direction via OpenMP compiler directives.
Typical loop:
    do k = 1, plev
      …
    enddo
Can often be parallelized with OpenMP (check dependencies):
    !$OMP PARALLEL DO …
    do k = 1, plev
      …
    enddo

Schematic parallelization technique
Shared memory parallelization (in CAM3 most often) in the vertical direction via OpenMP compiler directives, e.g. assume 4 parallel ‘threads’ and a 4-way SMP node (4 CPUs):
    !$OMP PARALLEL DO …
    do k = 1, plev
      …
    enddo
[Figure: the levels k = 1, …, plev are distributed over the CPUs.]
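The work-sharing pattern behind the directive can be imitated in Python with a thread pool that hands out vertical levels to four workers. This only illustrates how independent levels are distributed: it is not OpenMP, Python threads do not give the same speed-up for numerical loops, and update_level is a hypothetical placeholder for the loop body.

```python
from concurrent.futures import ThreadPoolExecutor


def update_level(level):
    """Placeholder for the loop body: work on one level k only.

    The level must not depend on other levels, which is the dependency
    check the slide asks for before parallelizing the loop.
    """
    return [2.0 * value for value in level]


def update_all_levels(field, nthreads=4):
    """Apply update_level to every level, shared among nthreads workers."""
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        # map() preserves the order of the levels in the result.
        return list(pool.map(update_level, field))


plev = 26  # number of levels quoted for CAM3 earlier in the tutorial
column = [[float(k + i) for i in range(3)] for k in range(plev)]
updated = update_all_levels(column, nthreads=4)
```

With 26 levels and 4 workers the levels cannot be divided evenly, so some workers process more levels than others. The same load imbalance can occur with a static OpenMP schedule.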

Thank you! Any questions?
Tracer transport? Fortran code …

References
Carpenter, R. L., K. K. Droegemeier, P. R. Woodward, and C. E. Hane, 1990: Application of the Piecewise Parabolic Method (PPM) to Meteorological Modeling. Mon. Wea. Rev., 118.
Colella, P., and P. R. Woodward, 1984: The piecewise parabolic method (PPM) for gas-dynamical simulations. J. Comput. Phys., 54.
Jablonowski, C., and D. L. Williamson, 2005: A baroclinic instability test case for atmospheric model dynamical cores. Submitted to Mon. Wea. Rev.
Lin, S.-J., and R. B. Rood, 1996: Multidimensional Flux-Form Semi-Lagrangian Transport Schemes. Mon. Wea. Rev., 124.
Lin, S.-J., and R. B. Rood, 1997: An explicit flux-form semi-Lagrangian shallow water model on the sphere. Quart. J. Roy. Meteor. Soc., 123.
Lin, S.-J., 1997: A finite volume integration method for computing pressure gradient forces in general vertical coordinates. Quart. J. Roy. Meteor. Soc., 123.
Lin, S.-J., 2004: A ‘Vertically Lagrangian’ Finite-Volume Dynamical Core for Global Models. Mon. Wea. Rev., 132.
van Leer, B., 1977: Towards the ultimate conservative difference scheme. IV. A new approach to numerical convection. J. Comput. Phys.