
1 What Programming Paradigms and Algorithms for Petascale Scientific Computing? A Tentative Hierarchical Programming Methodology. Serge G. Petiton, June 23rd, 2008. Japan-French Informatic Laboratory (JFIL)

2 Outline
1. Introduction
2. Present Petaflops, on the Road to Future Exaflops
3. Experiments, toward Models and Extrapolations
4. Conclusion

3 Outline
1. Introduction
2. Present Petaflops, on the Road to Future Exaflops
3. Experiments, toward Models and Extrapolations
4. Conclusion

4 Introduction
- The Petaflops frontier was crossed during the night of May 25-26 (TOP500).
- Sustained Petaflops performance will soon be available on a large number of computers.
- As anticipated since the 90s, no large technological gap had to be crossed to reach Petaflops computers.
- Languages and tools have not changed much since the first SMPs.
- What about languages, tools, and methods for sustained 10 Petaflops?
- Exaflops will probably require new technological advances and new ecosystems.
- On the road toward Exaflops, we will soon face difficult challenges, and we have to anticipate new problems around the 10 Petaflops frontier.

5 Outline
1. Introduction
2. Present Petaflops, on the Road to Future Exaflops
3. Experiments, toward Models and Extrapolations
4. Conclusion

6 Hyper Large Scale Hierarchical Distributed Parallel Architectures
- Many-cores ask for new programming paradigms, such as data parallelism.
- Message passing would be efficient for gangs of clusters.
- Workflow and Grid-like programming may be a solution for the highest programming level.
- Accelerators, vector computing.
- Energy consumption optimization.
- Optical networks; "inter" and "intra" (chip, cluster, gang, ...) communications.
- Distributed/shared-memory computer on a chip.
A minimal sketch of the two lower programming levels follows this list.
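To make the lower two levels of this hierarchy concrete, here is a minimal hybrid sketch, assuming a generic sum reduction as the workload: OpenMP data parallelism across the cores of a chip, MPI message passing across the nodes of a cluster. All names and the workload are illustrative assumptions, not code from the presentation.

    /* Hybrid MPI + OpenMP sketch: data parallelism on the chip,
       message passing between nodes. Compile e.g. with
       mpicc -fopenmp hybrid.c -o hybrid */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Intra-node level: a data-parallel loop over the cores. */
        double local = 0.0;
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < 1000000; i++)
            local += 1.0 / (1.0 + i + rank);   /* stand-in for real work */

        /* Inter-node level: message passing across the cluster. */
        double global = 0.0;
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);
        if (rank == 0)
            printf("global result over %d nodes: %f\n", size, global);

        MPI_Finalize();
        return 0;
    }

The third, highest level — the workflow — would then coordinate many such programs as components, which is exactly the role YML plays in the experiments below.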

7 On the road from Petaflops toward Exaflops
- Multiple programming and execution paradigms.
- Technological and software challenges: compilers, systems, middleware, schedulers, fault tolerance, ...
- New applications and numerical methods.
- Arithmetic and elementary functions (multiple and hybrid precision).
- Data distributed on networks and grids.
- Education challenges: we have to educate scientists.

8 ... and the road will be difficult ...
- Multi-level programming paradigms.
- Component technologies.
- Mixed data migration and computing, with large instrument control.
- We have to use end-user expertise.
- Non-deterministic distributed computing; component dependence graphs.
- Middleware and platform independence.
- "Time to solution" minimization; new metrics.
- We have to allow end-users to propose scheduler assistance and to give advice that anticipates data migration.

9 Outline
1. Introduction
2. Present Petaflops, on the Road to Future Exaflops
3. Experiments, toward Models and Extrapolations
4. Conclusion

10 YML Language
Front end: depends only on the application.
Back end: depends on the middleware, e.g. XtremWeb (France), OmniRPC (Japan), and Condor (USA).
http://yml.prism.uvsq.fr/

11 Components/Tasks Dependency Graph
(Diagram legend: begin node, end node, graph node, dependence, generic component node, result A.)

    par
      compute tache1(..); signal(e1);
    //
      compute tache2(..); migrate matrix(..); signal(e2);
    //
      wait(e1 and e2);
      par
        compute tache3(..); signal(e3);
      //
        compute tache4(..); signal(e4);
      //
        compute tache5(..); control robot(..); signal(e5);
        visualize mesh(...);
      end par
    //
      wait(e3 and e4 and e5);
      compute tache6(..);
      compute tache7(..);
    end par

Branches separated by // run in parallel; signal/wait on events (e1, ..., e5) expresses the dependences between components.

12 LAKe Library (Nahid Emad, UVSQ)

13 YML/LAKe

14 Block Gauss-Jordan, 101-processor cluster, Grid 5000: YML versus YML/OmniRPC (with Maxime Hugues (TOTAL and LIFL)). Block size = 1500.

    Block number   Task number   Overhead (%)
    2x2            8             22.41
    3x3            27            14.78
    4x4            64            28.37
    5x5            125           40.82
    6x6            216           65.60
    7x7            343           97.01
    8x8            512           138.24

The task count grows as the cube of the block-grid dimension. We optimize the "time to solution"; several middlewares may be chosen. [Plot: time versus number of blocks.]

15 Grid 5000, BGJ, 10 and 101 nodes: YML versus YML/OmniRPC. Block size = 1500.

    Block number   Overhead (%), 101 nodes   Overhead (%), 10 nodes
    2x2            22.41                     21.67
    3x3            14.78                     11.57
    4x4            28.37                     12.12
    5x5            40.82                     22.60
    6x6            65.60                     50.00
    7x7            97.01                     63.98
    8x8            138.24                    133.69

16 BGJ, YML/OmniRPC versus YML. Block size = 1500.

    Block number   Overhead (%), 101 nodes, Grid 5000   Overhead (%), cluster of clusters
    2x2            22.41                                17.58
    3x3            14.78                                14.22
    4x4            28.37                                25.17
    5x5            40.82                                24.64
    6x6            65.60                                62.86
    7x7            97.01                                40.12
    8x8            138.24                               99.79

17 Asynchronous Restarted Iterative Methods on multi-node computers. With Guy Bergère, Zifan Li, and Ye Zhang (LIFL).

18 Convergence on Grid 5000. [Plot: residual norm versus time (seconds).]

19 One or two distributed sites, same number of processors: communication overlay. [Plots: one site versus two sites.]

20 Cell/GPU at CEA/DEN: with Christophe Calvin and Jérôme Dubois (CEA/DEN Saclay).
- MINOS/APOLLO3 solver, neutronic transport problem.
- Power method to compute the dominant eigenvalue: slow convergence, large number of floating-point operations.
- Experiments on: dual quad-core Xeon, 2.83 GHz (45 GFlops); Cell blade (CINES, Montpellier) (400 GFlops); GPU Quadro FX 4600 (240 GFlops).
A minimal sketch of the power-method kernel follows.
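As a reminder of the kernel being benchmarked, here is a minimal power-method sketch in plain C. The small dense test matrix, tolerance, and iteration cap are assumptions made for illustration; the real solver applies the same iteration to a large sparse transport operator, which is why the floating-point operation count is so high.

    /* Power method: repeated multiplication and normalization converge
       to the dominant eigenpair. */
    #include <math.h>
    #include <stdio.h>

    #define N 3

    int main(void) {
        /* Assumed test matrix; its dominant eigenvalue is 2 + sqrt(2). */
        double A[N][N] = {{2, 1, 0}, {1, 2, 1}, {0, 1, 2}};
        double x[N] = {1, 0, 0}, y[N], lambda = 0.0;

        for (int it = 0; it < 1000; it++) {
            double norm = 0.0, next = 0.0;
            for (int i = 0; i < N; i++) {          /* y = A x */
                y[i] = 0.0;
                for (int j = 0; j < N; j++)
                    y[i] += A[i][j] * x[j];
                norm += y[i] * y[i];
            }
            norm = sqrt(norm);
            for (int i = 0; i < N; i++) {
                next += y[i] * x[i];   /* Rayleigh quotient x'Ax (x unit) */
                x[i] = y[i] / norm;    /* normalized next iterate */
            }
            if (fabs(next - lambda) < 1e-12 * fabs(next)) {
                lambda = next;
                break;                 /* converged */
            }
            lambda = next;
        }
        printf("dominant eigenvalue estimate: %.12f\n", lambda);
        return 0;
    }

The slow convergence mentioned on the slide is governed by the ratio of the two largest eigenvalues: the closer it is to one, the more of these matrix-vector products are needed, which is what makes accelerators attractive here.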

21 Power method: performance. [Plot: performance versus matrix size.]

22 Power method: arithmetic accuracy. [Plot: accuracy difference between platforms.]

23 Arnoldi projection: performance. [Plot: performance versus matrix size.]

24 Arnoldi projection: arithmetic accuracy. [Plot: orthogonalization degradation.] A sketch of how such degradation can be measured follows.
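As a sketch of how the orthogonalization degradation named here can be observed, the fragment below builds an orthonormal basis with classical Gram-Schmidt in single precision, as a single-precision accelerator would, and reports the worst inner product between distinct basis vectors. The sizes and random test data are assumptions; this is not the CEA experiment itself.

    /* Measure loss of orthogonality: after classical Gram-Schmidt in
       float, max |q_a . q_b| (a != b) drifts above machine epsilon as
       the basis grows. */
    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N 10000   /* vector length (assumed) */
    #define M 30      /* basis size, e.g. an Arnoldi subspace (assumed) */

    int main(void) {
        static float q[M][N], v[N], h[M];
        for (int k = 0; k < M; k++) {
            for (int i = 0; i < N; i++)        /* random candidate vector */
                v[i] = (float)rand() / RAND_MAX - 0.5f;
            /* Classical Gram-Schmidt: all coefficients are taken against
               the original v, then subtracted in one sweep. */
            for (int j = 0; j < k; j++) {
                h[j] = 0.0f;
                for (int i = 0; i < N; i++) h[j] += q[j][i] * v[i];
            }
            for (int j = 0; j < k; j++)
                for (int i = 0; i < N; i++) v[i] -= h[j] * q[j][i];
            float nrm = 0.0f;
            for (int i = 0; i < N; i++) nrm += v[i] * v[i];
            nrm = sqrtf(nrm);
            for (int i = 0; i < N; i++) q[k][i] = v[i] / nrm;
        }
        /* Orthogonality check: largest off-diagonal entry of Q'Q. */
        float worst = 0.0f;
        for (int a = 0; a < M; a++)
            for (int b = a + 1; b < M; b++) {
                float d = 0.0f;
                for (int i = 0; i < N; i++) d += q[a][i] * q[b][i];
                if (fabsf(d) > worst) worst = fabsf(d);
            }
        printf("max |q_a . q_b|, a != b: %e (float eps is about 1.2e-7)\n",
               worst);
        return 0;
    }

Repeating the run in double precision, or re-orthogonalizing, brings the value back toward machine epsilon — the usual remedies when a projection method such as Arnoldi is ported to single-precision hardware.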

25 Outline
1. Introduction
2. Present Petaflops, on the Road to Future Exaflops
3. Experiments, toward Models and Extrapolations
4. Conclusion

26 Conclusion
- We plan to extrapolate, from Grid 5000 and our multi-core experiments, some behaviors of future hierarchical large Petascale computers, using YML for the highest level.
- We need to propose new high-level languages to program large Petaflops computers, to be able to minimize "time to solution" and energy consumption, with system and middleware independence.
- MPI will probably be very difficult to dethrone, and other important codes will still be carefully "hand-optimized".
- Several programming paradigms, one for each level, have to be mixed; the interfaces have to be well specified.
- End-users have to be able to contribute expertise that helps middleware management, such as scheduling, and to choose libraries.
- New asynchronous hybrid methods have to be introduced.

