University of Veszprém Department of Image Processing and Neurocomputing Emulated Digital CNN-UM Implementation of a 3-dimensional Ocean Model on FPGAs.


1 Emulated Digital CNN-UM Implementation of a 3-dimensional Ocean Model on FPGAs
Zoltán Nagy, Péter Szolgay
University of Veszprém, Department of Image Processing and Neurocomputing

2 Nagy 2 MAPLD 2005/153
- Introduction
- Cellular Neural/Nonlinear Networks Universal Machine (CNN-UM)
- Ocean modeling
- Results
- Conclusions

3 Cellular Neural/Nonlinear Networks (CNN)
- 2 or N dimensional grid
- Locally connected
- Analog processing elements
- State value is continuous in time

4 Structure of a CNN cell
- u_ij: input
- x_ij: state
- y_ij: output
- z_ij: constant bias
- A_ij,kl: feedback template
- B_ij,kl: feed-forward template
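
The cell diagram from this slide did not survive in the transcript. As a rough sketch only, assuming the standard Chua-Yang cell model (dx/dt = -x + sum of A·y + sum of B·u + z, with a piecewise-linear output), one forward-Euler update of a small cell array might look like the following. The function names, the 3x3 template size, the zero boundary condition and the step size h are illustrative assumptions, not details taken from the presentation.

```python
def output(x):
    """Standard piecewise-linear CNN output: clamps the state to [-1, 1]."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def euler_step(x, u, A, B, z, h):
    """One forward-Euler step of x' = -x + A*y + B*u + z on a 2D grid.

    x, u: lists of rows (state and input); A, B: 3x3 templates;
    z: constant bias; h: time step. Cells outside the grid are treated
    as 0 (an assumed boundary condition).
    """
    rows, cols = len(x), len(x[0])
    y = [[output(v) for v in row] for row in x]
    new_x = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = -x[i][j] + z
            for k in (-1, 0, 1):
                for l in (-1, 0, 1):
                    ni, nj = i + k, j + l
                    if 0 <= ni < rows and 0 <= nj < cols:
                        acc += A[k + 1][l + 1] * y[ni][nj]  # feedback term
                        acc += B[k + 1][l + 1] * u[ni][nj]  # feed-forward term
            new_x[i][j] = x[i][j] + h * acc
    return new_x
```

An emulated digital CNN-UM evaluates this kind of discretized update in hardware; the pure-Python loops here only show the data flow between input, state, output and templates.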

5 CNN-UM implementations
- Software simulation
  - Easy to implement
  - Slow, even when using processor-specific instructions
- Emulated digital VLSI
  - Specialized digital architecture
  - Selectable computing precision (Castle architecture: 1, 6, 12 bit)
  - Orders of magnitude faster than software simulation
  - Long design time
- Analog VLSI
  - Huge computing power (~TeraOP/s)
  - Low accuracy (7-8 bit)
  - Noise and temperature sensitivity

6 Structure of the Falcon emulated digital CNN-UM
- Mixer
  - Contains cell values for the next updates
- Memory unit
  - Contains a belt of the cell array
- Template memory
- Arithmetic unit
- Processors can be connected on a grid
  - Linear speedup

7 Structure of the arithmetic unit
- Cell update in row-wise order
- Cycle time depends on template size
- Fully pipelined

8 Configurable parameters
- State, template and constant width from 2 to 64 bits
- Number of templates
- Size of the templates
- Width of the cell array slice
- Number of layers
- Number and arrangement of the processor cores

9 Example: Solution of a simple PDE on CNN
- The wave equation
- Spatial discretization
- 2-layer CNN
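
The equations on this slide are not preserved in the transcript. The following is a minimal sketch, assuming the wave equation is rewritten as the first-order system du/dt = v, dv/dt = c^2 * laplacian(u), so that u and v form the two CNN layers, and assuming the common 5-point Laplacian template for the spatial discretization. The time integrator (semi-implicit Euler), the zero boundary and all names are assumptions made for illustration.

```python
# 5-point Laplacian written as a 3x3 template (assumed discretization).
LAPLACE = [[0.0, 1.0, 0.0],
           [1.0, -4.0, 1.0],
           [0.0, 1.0, 0.0]]


def laplacian(u, i, j, dx):
    """Apply the Laplacian template at cell (i, j); outside cells count as 0."""
    rows, cols = len(u), len(u[0])
    total = 0.0
    for k in (-1, 0, 1):
        for l in (-1, 0, 1):
            ni, nj = i + k, j + l
            if 0 <= ni < rows and 0 <= nj < cols:
                total += LAPLACE[k + 1][l + 1] * u[ni][nj]
    return total / (dx * dx)


def wave_step(u, v, c, dx, dt):
    """Advance both layers by one step.

    Layer v (time derivative) is updated first from the Laplacian of u,
    then layer u is updated from the new v (semi-implicit Euler).
    """
    rows, cols = len(u), len(u[0])
    new_v = [[v[i][j] + dt * c * c * laplacian(u, i, j, dx)
              for j in range(cols)] for i in range(rows)]
    new_u = [[u[i][j] + dt * new_v[i][j]
              for j in range(cols)] for i in range(rows)]
    return new_u, new_v
```

For example, starting from a single displaced cell in the middle of a 5x5 grid, one call to `wave_step` lowers the centre value and raises its four neighbours, which is the expected outward spread of the disturbance.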

10 Ocean models
- Barotropic model
- Baroclinic models
  - z-coordinate model
  - σ-coordinate model
  - isopycnal
- Fine resolution models
  - Real-time forecast
  - Fishing industry
  - Search and rescue
- Coarse resolution models
  - Long term predictions
  - Climate modeling

11 The Princeton Ocean Model (POM)
- Sigma coordinate model
  - Vertical coordinate is scaled on the water column depth
- Second moment turbulence closure sub-model
  - Provides vertical mixing coefficients
- Solution technique: Mode splitting
  - Internal mode (3D)
    - Vertical structure equations
    - Implicit solution
  - External mode (2D)
    - Vertically integrated equations
    - Explicit solution (Leapfrog method)
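
The explicit leapfrog method named for the external mode can be illustrated on a scalar problem. This is a generic sketch of the scheme, not the model's actual time-stepping code; the function name, the scalar state and the forward-Euler start-up step are assumptions.

```python
def leapfrog(f, y0, dt, steps):
    """Integrate dy/dt = f(t, y) with the leapfrog (centered) scheme.

    y[n+1] = y[n-1] + 2*dt*f(t[n], y[n]). The scheme needs two starting
    values, so the first step is taken with forward Euler (an assumption).
    Returns the list [y0, y1, ..., y_steps].
    """
    ys = [y0]
    if steps == 0:
        return ys
    ys.append(y0 + dt * f(0.0, y0))  # start-up step
    for n in range(1, steps):
        t_n = n * dt
        ys.append(ys[n - 1] + 2.0 * dt * f(t_n, ys[n]))
    return ys
```

Because each new value depends on the two previous time levels, a hardware implementation has to keep two copies of every state variable, which is one reason memory layout matters for this kind of solver.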

12 Governing equations of the external (2D) mode
- u_x, u_y: mass transport
- η: free surface elevation
- Ω: angular rotation of the Earth
- Θ: latitude
- H: depth of the ocean
- g: gravitational acceleration
- τ_w, τ_b: wind and bottom stress
- A: lateral viscosity

13 Solution on CNN
- Spatial discretization on a uniform grid
- 3-layer CNN structure
- Non-linear template required for advection term
- Cannot be solved on analog VLSI CNN chips
- Solvable on the modified Falcon architecture
  - Support of non-linearity
  - Specialized cell model

14 The modified arithmetic unit of the Falcon architecture

15 Implementation on FPGA
- Complicated arithmetic unit
- Fixed-point number representation
- Configurable precision
- High level hardware description language required (e.g. Handel-C)
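
Fixed-point representation with configurable precision can be illustrated in software. The sketch below is not the authors' hardware design; it assumes a signed two's-complement format with saturation on overflow, and the helper names and the split into total and fractional bits are hypothetical.

```python
def to_fixed(value, frac_bits, total_bits):
    """Quantize a real value to a signed fixed-point integer, saturating."""
    scaled = int(round(value * (1 << frac_bits)))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def from_fixed(raw, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return raw / (1 << frac_bits)


def fixed_mul(a, b, frac_bits, total_bits):
    """Multiply two fixed-point values and rescale to the same format.

    The right shift discards the extra fractional bits (rounding toward
    negative infinity); the result is saturated to the word width.
    """
    product = (a * b) >> frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, product))
```

With these helpers, `from_fixed(to_fixed(0.1, 20, 24), 20)` lies much closer to 0.1 than the same value stored with 4 fractional bits in an 8-bit word, which shows how the chosen width trades accuracy against hardware cost.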

16 Performance

17 The Seamount problem

18 Results after 72 hours
- Circulation pattern
- Elevation

19 Error of the solution

20 Error of the solution

21 Memory requirements of the internal (3D) equations
- Extended memory hierarchy
  - New level stores 3 cross sectional slices from the 3D array
    - Large memory required (e.g. 512x512x64 sized grid, 3x512x64 elements per state variable)
    - Cannot be stored on-chip
    - Off-chip storage requires huge I/O bandwidth
- Processor array should be used
  - The 3D array is divided between the processors
  - Optimal data set for on chip storage: 2048 elements per cross sectional slice (512x32x64 sized grid per processor)
  - Each processor located on a separate FPGA
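
The element counts quoted on this slide can be checked with a few lines of arithmetic. The sketch assumes that a cross sectional slice spans one horizontal dimension and the full depth, and that the 512x512x64 grid is split along a single horizontal axis; the function names are hypothetical.

```python
def slice_elements(width, depth):
    """Elements in one cross sectional slice (width x depth)."""
    return width * depth


def buffered_elements(width, depth, slices=3):
    """Elements held when `slices` neighbouring slices are buffered."""
    return slices * slice_elements(width, depth)


# Full 512x512x64 grid: three 512x64 slices per state variable.
full_grid_buffer = buffered_elements(512, 64)

# 512x32x64 sub-grid per processor: one 32x64 slice.
per_processor_slice = slice_elements(32, 64)

# Number of 32-wide strips needed to cover a 512-wide grid
# (assumes the split is along one horizontal axis only).
processors = 512 // 32
```

Under these assumptions the full-grid buffer holds 98304 elements per state variable, while each per-processor slice holds the 2048 elements quoted on the slide, and 16 processors would be needed to cover the grid.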

22 Solution of the internal (3D) equations
- Implicit solution
  - Fixed-point solution
    - Requires large precision to avoid rounding errors
    - Seems to be impractical
  - Floating-point solution
    - Requires large area (especially add/sub)
- Explicit solution
  - Smaller timestep
  - Simpler arithmetic unit

23 Conclusions
- Ocean modeling using emulated digital CNN is very promising
- Moderate precision is required in 2D mode
  - 1% accuracy using 24 bits
- Expected speedup (compared to an Athlon64 2GHz microprocessor)
  - 80 times on our RC200 prototyping board
  - 3700 times on the largest available FPGA

