
Using Distributed Computing in Computational Fluid Dynamics
Zoran Constantinescu, Jens Holmen, Pavel Petrovic
13-15 May 2003, Moscow
Norwegian University of Science and Technology




1 Using Distributed Computing in Computational Fluid Dynamics
Zoran Constantinescu, Jens Holmen, Pavel Petrovic
13-15 May 2003, Moscow
Norwegian University of Science and Technology

2 outline
- need more processing power: simulations, data processing, complex algorithms
- supercomputers, clusters of PCs, distributed computing
- existing systems
- QADPZ project: description, advantages, applications, current status
- CFD application: Navier-Stokes solver, existing solver using Diffpack and mpich, small MPI replacement, results and comparison
- future work

3 problems
- larger and larger amounts of data are generated every day (simulations, measurements, etc.)
- the software applications that handle this data require more and more CPU power: simulation, visualization, data processing
- complex algorithms, e.g. evolutionary algorithms: large populations, very time-consuming evaluations
⇒ need for parallel processing

4 solution 1
use parallel supercomputers
- access to tens/hundreds of CPUs, e.g. NOTUR/NTNU = embla (512) + gridur (384)
- high-speed interconnect, shared memory
- usually for batch-mode processing
- very expensive (price, maintenance, upgrades)
- these CPUs are no longer so powerful, e.g. 500 MHz RISC vs. today's 2.5 GHz Pentium 4

5 solution 2
use clusters of PCs (Beowulf)
- network of personal computers, usually running the Linux operating system
- powerful CPUs (Pentium 3/4, Athlon)
- high-speed networking (100 Mbps, 1 Gbps, Myrinet)
- much cheaper than supercomputers, but still quite expensive (installation, upgrades, maintenance)
- trade higher availability and/or greater performance for lower cost

6 solution 3
use distributed computing
- uses existing networks of workstations (PCs from labs and offices, connected by LAN)
- usually running Windows or Linux (also MacOS, Solaris, IRIX, BSD, etc.)
- powerful CPUs (Pentium 3/4, AMD Athlon), high-speed networking (100 Mbps)
- computers are already installed – very cheap
- maintenance already exists
- easy to reach a network of tens/hundreds of computers

7 distributed computing
- specialized applications run on each individual computer in the LAN
- they talk to one or more central servers: download a task, solve it, and send back the results
- better suited (easier) for task-parallel applications, where the application can be decomposed into independent tasks
- can also be used for data-parallel applications
- the number of available CPUs is more dynamic

8 existing systems
SETI@home
- search for extraterrestrial intelligence: analysis of data from radio telescopes
- the client application is very specialized
- uses the Internet to connect clients to the server, and to download/upload tasks/results
- no framework for other types of applications
- no source code available

9 existing systems
distributed.net
- one of the largest "computers" in the world (~20 TFlops)
- used for solving computational challenges: RC5 cryptography, Optimal Golomb Ruler
- the client application is very specialized
- uses the Internet to connect clients to the server
- no framework for other applications
- no source code available

10 existing systems
Condor project (Univ. of Wisconsin)
- more research-oriented computational projects
- more advanced features, user applications
- very difficult to install; problems with some OSes (started from a Unix environment)
- restrictive license (closed system)
other commercial projects: Entropia, Parabon
- expensive for research, only certain OSes

11 a new system
QADPZ project (NTNU)
- initial application domains: large-scale visualization, genetic algorithms, neural networks
- a prototype in early 2001 was abandoned (too visualization-oriented)
- started in July 2001; first release v0.1 in Aug 2001
- now close to a new release, v0.8 (May 2003)
- system independent of any specific application domain
- open source project on SourceForge.net

12 http://qadpz.sourceforge.net

13 QADPZ description
QADPZ project (NTNU)
- similar in many ways to Condor (submit computing tasks to idle computers running in a network)
- easy to install, use, and maintain
- modular and extensible
- open source project, implemented in C++
- support for many OSes (Linux, Windows, Unix, …)
- support for multiple users, encryption
- logging and statistics

14 QADPZ architecture
client ↔ master ↔ slave
- client: user interface; submits tasks/data
- master: management of slaves; scheduling, monitoring, and controlling tasks
- slave: background process; reports status, downloads tasks/data, executes tasks

15 communication
- control flow and data flow between client, master, and slaves
- data repository: an external server (web, ftp, …) or the internal lightweight web server

16 the client
- the interface to the system
- describes the job (project file), prepares the task code and input files, submits the tasks
- communicates with the master
- two modes:
  - interactive: stay connected to the master and wait for the results
  - batch: detach from the master and get the results later; the master keeps all the messages

17 jobs, tasks, subtasks
- job: consists of groups of tasks executed sequentially or in parallel
- a task can consist of subtasks (the same executable is run, but with different input data)
- each task is submitted individually; the user specifies the OS and minimum resource requirements (disk, memory)
- the master allocates the most suitable slave for executing each task and notifies the client
- when a task is finished, the results are stored as specified in the project and the client is notified

18 the tasks
- normal binary executable code: no modifications required, but must be compiled for each of the platforms used
- normal interpreted program: shell script, Perl, Python; a Java program requires an interpreter/VM on each slave
- dynamically loaded slave library (our API): better performance, more flexibility

19 the master
- keeps account of all existing slaves (status, specifications)
- usually one master is enough; more can be used if there are too many slaves (the communication protocol allows one master to act as another's client, but this is not fully implemented yet)
- keeps account of all submitted jobs/tasks
- keeps a list of accepted users (based on username/password)
- gathers statistics about slaves and tasks
- can optionally run the internal web server for the data and code repository

20 the slave …

21 OO design

22 communication
- layered communication protocol
- exchanged messages are in XML (with compression + encryption)
- uses UDP with a reliable layer on top

23 configuration
- slave: qadpz_slave (daemon) + slave.cfg
- master: qadpz_master (daemon) + master.cfg + qadpz_admin + users.txt + privkey
- client: qadpz_run + client.cfg + pubkey
- platforms: Linux, Win32 (9x, 2K, XP), SunOS, IRIX, FreeBSD, Darwin/MacOS X

24 installation
one of our computer labs (Rose lab)
- ~80 PCs, Pentium 3, 733 MHz, 128 MBytes
- dual boot: Win2000 and FreeBSD
- running for several months
- when a student logs in to a computer, the slave running on it is set into disabled mode (no new computing tasks are accepted; any current task is killed and/or restarted on another computer)
our Beowulf cluster (ClustIS)
- 40 nodes, AMD Athlon, 1466 MHz, 512/1024 MBytes
- 100 Mbps network, Linux

25 status

26 CFD application
existing solver for the incompressible Navier-Stokes equations:
- using the projection method, the equations for velocity and pressure are solved sequentially at each time step
- using the Galerkin finite element method to discretize in space
- the pressure Poisson equation is preconditioned with a mixed additive/multiplicative domain decomposition method: one iteration of overlapping Schwarz combined with a coarse-grid correction
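The sequential velocity/pressure solve mentioned above is the projection step; as an illustrative sketch (a standard Chorin-type splitting, not necessarily the exact variant used in the solver):

```latex
% Chorin-type projection step for the incompressible Navier-Stokes equations
\begin{aligned}
\frac{\mathbf{u}^{*}-\mathbf{u}^{n}}{\Delta t}
  + (\mathbf{u}^{n}\cdot\nabla)\mathbf{u}^{n}
  &= \nu\,\Delta\mathbf{u}^{n}
  && \text{(tentative velocity, pressure omitted)} \\
\Delta p^{\,n+1}
  &= \frac{1}{\Delta t}\,\nabla\cdot\mathbf{u}^{*}
  && \text{(pressure Poisson equation)} \\
\mathbf{u}^{n+1}
  &= \mathbf{u}^{*} - \Delta t\,\nabla p^{\,n+1}
  && \text{(projection, giving } \nabla\cdot\mathbf{u}^{n+1}=0\text{)}
\end{aligned}
```

The Poisson solve in the second line is the expensive, globally coupled part, which is why it gets the domain decomposition preconditioner.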

27 NS solver
- developed in C++ using the Diffpack object-oriented numerical library for differential equations
- used the parallel version, based on standard MPI (Message Passing Interface) for communication
- we used mpich
- tested on our 40-node PC cluster

28 MPI implementation
needed to replace the MPI calls used by Diffpack – a subset of the MPI standard:
- MPI_Init, MPI_Finalize
- MPI_Comm_size, MPI_Comm_rank
- MPI_Send, MPI_Recv
- MPI_Isend, MPI_Irecv
- MPI_Bcast, MPI_Wait
- MPI_Reduce, MPI_Allreduce
- MPI_Barrier
- MPI_Wtime

29 test case
- oscillating flow around a fixed cylinder
- 3D domain with 8-node isoparametric elements
- coarse grid: 2 184 nodes, 1 728 elements
- fine grid: 81 600 nodes, 76 800 elements
- 10 time steps
- running time ~30 minutes with 2 CPUs (AMD Athlon 1466 MHz)

30 coarse grid

31 running time

32 monitor

33 future work
- local caching of binaries on the slaves
- different scheduling protocols on the master
- dynamic load balancing of tasks
- connect to the Grid
- web interface to the client:
  - creating jobs more easily, with input data
  - starting/stopping jobs
  - monitoring the execution of jobs
  - easy access to the output of an execution
  - should decrease the learning effort for using the system

34 the work
cooperation between
- Norwegian University of Science and Technology, Division of Intelligent Systems (DIS), http://dis.idi.ntnu.no
- SINTEF, http://www.sintef.no

35 thank you
questions? http://qadpz.sourceforge.net


