1 Key Objectives Parallel Programming Model and Tools desesperatly needed for the masses (New Scientist, New SME) for new architectures (Multi-cores) As Effective as possible: Efficient However Programmer/User Productivity is first Key For both Multi-cores and Distributed Actually the way around Some Handling of ``Large-scale’’ (Grid, Clouds)
2 Intech, Jeudi 2 juillet 2009
3
D. Caromel, et al. Overview of Cloud, Parallel Computing and ProActive PACA Grid Speed: Application + Development: Productivity
5
6 1. Background: OASIS team 2. Cloud Computing 3. ProActive Parallel Suite: Programming, Optimizing Scheduling 4. CPER ProActive PACA GRID 5. Use Cases & Demos Agenda
7 1. Background Parallel & Distributed
8 OASIS Team, INRIA-UNSA-I3S/CNRS A joint team, about 35 persons Parallelism and Distribution, Proof, Verification ProActive Parallel Suite From Multi-cores to Enterprise GRIDs
9 OASIS Team Composition (35) Researchers (5): D. Caromel (UNSA, Det. INRIA) E. Madelaine (INRIA) F. Baude (UNSA) F. Huet (UNSA) L. Henrio (CNRS) PhDs (11): Antonio Cansado (INRIA, Conicyt) Brian Amedro (SCS-Agos) Cristian Ruz (INRIA, Conicyt) Elton Mathias (INRIA-Cordi) Imen Filali (SCS-Agos / FP7 SOA4All) Marcela Rivera (INRIA, Conicyt) Muhammad Khan (STIC-Asia) Paul Naoumenko (INRIA/Région PACA) Viet Dung Doan (FP6 Bionets) Virginie Contes (SOA4ALL) Guilherme Pezzi (AGOS, CIFRE SCP) + Visitors + Interns PostDoc (1): Regis Gascon (INRIA) Engineers (10): Elaine Isnard (AGOS) Fabien Viale (ANR OMD2, Renault ) Franca Perrina (AGOS) Germain Sigety (INRIA) Yu Feng (ETSI, FP6 EchoGrid) Bastien Sauvan (ADT Galaxy) Florin-Alexandru.Bratu (INRIA CPER) Igor Smirnov (Microsoft) Fabrice Fontenoy (AGOS) Open position (Thales) Trainee (2): Etienne Vallette d’Osia (Master 2 ISI) Laurent Vanni (Master 2 ISI) Assistants (2): Sylvie Lelaidier (INRIA) Sandra Devauchelle (I3S) An international team with about 10 nationalities
10 2. Cloud Computing
11 Clouds: Basic Definition Dynamically scalable, often virtualized resourcesscalablevirtualized Provided as a service over the Internetas a serviceInternet Users need not have knowledge of, expertise in, or control over the technology infrastructure Software as a service (SaaS), CRM, ERP Software as a service Platform as a service (PaaS), Google App Engine Platform as a service Infrastructure as a service (IaaS), Amazon EC2 Infrastructure as a service
12 Clouds in Picture From Joseph Kent LangleyJoseph Kent Langley
13 From Grids to Clouds Grid Computing Several administrative Domains Virtual Organizations Trading not based on Currency (Too) Hard Still a strong need for Sharing: Under used machines, Green IT pressure TCO, Electric Bill Services: Accessing a Hosted Software Distributed, //, & Grid Technologies for Clouds
Multi-Core Push
15 Symetrical Multi-Core: 8-ways Niagara II 8 cores 4 Native threads per core Linux see 32 cores!
16 Multi-Cores: A Few Key Points Moore’s Law rephrased: Nb. of Cores double every 18 to 24 months Key expected Milestones: Cores per Chips (OTS) 2010: 32 to 64 2012: 64 to 128 2014: 128 to 256 1 Million Cores Parallel Machines in 2012 100 M cores coming in 2020 Multi-Cores are NUMA, and turning Heterogeneous (GPU) They are turning into SoC with NoC: NOT SMP!
17 ProActive PACA Grid in Cloud Context
18 3. ProActive Parallel Suite
19 Parallel Acceleration Toolkit in Java: Parallelism: Multi-Core+Distributed
20
21
22 A ProActive : Active objects Proxy Java Object A ag = newActive (“A”, […], VirtualNode) V v1 = ag.foo (param); V v2 = ag.bar (param);... v1.bar(); //Wait-By-Necessity V Wait-By-Necessity is a Dataflow Synchronization JVM A Active Object Future Object Request Req. Queue Thread v1 v2 ag WBN!
23 Standard system at Runtime: No Sharing NoC: Network On Chip Proofs of Determinism
24 TYPED ASYNCHRONOUS GROUPS
25 Broadcast and Scatter JVM ag cg ag.bar(cg); // broadcast cg ProActive.setScatterGroup(cg) ; ag.bar(cg); // scatter cg c1 c2 c3 c1 c2 c3 c1 c2 c3 c1 c2 c3 c1 c2 c3 c1 c2 c3 s c1 c2 c3 s Broadcast is the default behavior Use a group as parameter, Scattered depends on rankings
26 Optimizing
27
28 IC2D
29 IC2D
30 ChartIt
31 Pies for Analysis and Optimization
Video 1: IC2D Optimizing Monitoring, Debugging, Optimizing
33 Scheduling
34
35 Scheduler: User Interface
Video 2: Scheduler, Resource Manager
37 4. ProActive PACA GRID
38 The ProActive PACA Grid Platform (1) Dell Blades 160 cores DELL PowerEdge LAME BLADE Intel Xeon E Ghz quad core 2×4 Mo 16GB 667MHZ FBD 2 hdd 73Go SAS 15Krpm Linux fedora Core 7 Kernel Storage server: Dell PowerEdge P Intel Xeon E Ghz quad core 2×4 Mo 6×500Go SATA 7.2Krpm RAID0
39 The ProActive PACA Grid Platform (2) HP HPCS Windows 64 cores 8 nodes : HP ProLiant BL460c 2 Intel Xeon E5320 quad core 1.86 GHZ 8 Mo 8GB 667MHZ FBD 2 hdd 72Go hot plug 10Krpm RAID 0 Windows HPC bits
40 The ProActive PACA Grid Platform (3) Dell Blades 384 cores Total: 608 Cores available Today Potential Extension: Grid 5000
Video 3: CPER ProActive PACA Grid
43 Use Cases & Demos Downstairs AGOS, SOA, BPEL processes in parallel on the Grid, Franca Perrina Life technologies: Genomic, Transcriptome Parallel Analysis, IPMC, Emil Salageanu Price-It Excel, Finance, Vladimir Bodnartchouk IC2D : An Eclipse GUI to Debug and Optimize your ProActive Application, Brian Amedro CPER ProActive Paca Grid, Germain Sigety Web Start for accessing ProActive Paca Grid, Florin Alexandru-Bratu ProActive PACA GRID Visu, Debug 3 Demos Applicatives
Summary
45 Conclusion: Available in PACA Grid Future Developments: Multi-Core + Distributed
The Future ProActive PACA Grid + Coeur Interactive + Mesocentre (OCA) + Clouds For: Science Labs and Local Industries (Large and SME)
47 Intech, Jeudi 2 juillet 2009
48
49 ProActive PACA Grid in Context
50
4. Use Case: IPMC
52 Use case: SOLiD and ProActive SOLiD from Applied Biosystems (USA) As part of a project with the IPMC research institute, the SOLiD Corona Lite software has been upgraded by integrating ProActive to enable the distribution of parallel tasks on lab desktops in order to accelerate the processing At the moment, only the first pipeline, Matching, has been upgraded by distributing the Mapreads function Constraints Requirements set by IPMC: keep the current software architecture ProActive has been integrated on top of PBS Matching Pairing SNP/Consensus calling ProActive PBS Resource Manager
53 Resources set up Environment 16 nodes Additional external nodes can be easily and dynamically added!
54 Mapreads optimization The Reads are split into smaller files Each Reads subset is compared to one chromosome The resulting files are merged
55 Optimized Mapreads Performances The distributed version with ProActive of Mapreads has been tested on the INRIA cluster with two settings: the Reads file is split in either 30 or 10 slices Use case: matching 31 millions sequences with the human genome (M=2, L=25) Reference point with 16 cores (same as in SOLiD machine) 4 Time faster from 20 to 100 Speed Up of 80 / Th. Sequential 50 Hours 35 Minutes
56 Key benefits of this solution Higher throughput Reduced execution time Scalable Depending on the input and reference data size, the user can chose to increase or reduce the number of extra resources used Solution is ready for next generation reads file Flexible The run can be paused or resumed by the user when needed Priorities between jobs can be easily set by the users Easy nodes acquisition and hot plugging Simplified maintenance ProActive directly supports common schedulers like PBS, LSF, SGE, and W HPCS 08: time consuming adaptations of Corona Lite software are no longer needed Reduced costs for Applied Biosystems customers Optimizing available hardware resources Free use of ProActive Parallel Suite® Easy to install and use: save time Supported by experts in parallel computing
Accelerate & Scale up with ProActive Parallel Suite® 02/07/2009
Presentation overview Item1 Item 2 Item3
Thank you! 60
Co-developing, Support for ProActive Parallel SuiteProActive Parallel Suite Worldwide Customers: Fr, UK, Boston USA Startup Company Born of INRIA & UNSA