Published by Alexander Leblanc. Modified over 10 years ago.
1
Barcelona Supercomputing Center
2
The BSC-CNS objectives:
    R&D in Computer Sciences, Life Sciences and Earth Sciences.
    Supercomputing support to external research.

BSC-CNS is a consortium that includes:
    the Spanish Government (MEC): 51%
    the Catalonian Government (DIUE): 37%
    the Technical University of Catalonia (UPC): 12%

300 people
3
Research areas

Influence the way machines are built, programmed and used
Through demonstration, ideas, cooperation with manufacturers & products

e-science

Programming models
    Evolving standards (OpenMP x.y)
    Prototyping infrastructure (mercurium, nanos library, …)
    Dependences/data-flow (StarSs for Cell, SMP, GPU, Grid)
    Hierarchical/hybrid (MPI/SMPSs, NestedSs, …)
    Software Distributed Shared Memory
    Use of Transactional memory

Resource management
    OS scheduling: resource/power aware job scheduling, dynamic load balancing
    Scalable file systems
    Efficient execution on distributed computing environments: GRIDSs @ MN/RES, Grid I/O, heterogeneous workloads
    Management for next-generation data centers: virtualization

Performance analysis
    Tracing: scalable/online, sampling
    Visualization: Paraver
    Automatic analysis: spectral, clustering, …
    Methodologies and training material
    Integration with other tools

Prediction and evaluation infrastructure
    Dimemas: multiscale simulation
    Interconnection network: overlap, contention, …
    Node and microarchitecture level simulators: MPsim, TaskSim
    Architecture support for programming models and runtimes

Users
    Earth Sciences
    Life Sciences
    Engineering apps
4
Programming models

    Implementations on top of other low level run times, FPGAs, OpenCL
    Granularity control
    Locality aware scheduling
    Application porting
    Hybrid MPI/StarSs and comparison with other models
    Load balancing in nested/hybrid implementations
    Instrumentation and analysis for task based systems

StarSs
    CellSs
    SMPSs
    GPUSs
    GridSs
    ClearSpeedSs
    ClusterSs
    CompSs (Java)

    #pragma css task input(A, B) output(C)
    void vadd3 (float A[BS], float B[BS], float C[BS]);

    #pragma css task input(sum, A) output(B)
    void scale_add (float sum, float A[BS], float B[BS]);

    #pragma css task input(A) inout(sum)
    void accum (float A[BS], float *sum);

    for (i=0; i<N; i+=BS)    // C=A+B
        vadd3 (&A[i], &B[i], &C[i]);
    ...
    for (i=0; i<N; i+=BS)    // sum(C[i])
        accum (&C[i], &sum);
    ...
    for (i=0; i<N; i+=BS)    // B=sum*A
        scale_add (sum, &E[i], &B[i]);
    ...
    for (i=0; i<N; i+=BS)    // A=C+D
        vadd3 (&C[i], &D[i], &A[i]);
    ...
    for (i=0; i<N; i+=BS)    // E=C+F
        vadd3 (&C[i], &F[i], &E[i]);
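The input/output/inout annotations on the tasks above are what let a runtime work out the data-flow between task instances. The following is a minimal sketch of that idea, not the StarSs implementation: the names access_mode, access_t, task_t and depends_on are hypothetical, blocks are identified only by their start address, and every conflicting access is treated as an ordering constraint (a real runtime may remove some of these false dependences, for example by renaming).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not part of any StarSs API. */
typedef enum { ACC_INPUT, ACC_OUTPUT, ACC_INOUT } access_mode;

typedef struct {
    const void *addr;   /* start address of the block the task touches */
    access_mode mode;   /* how the task declared it in its pragma */
} access_t;

#define MAX_ACCESSES 4

typedef struct {
    const char *name;
    access_t acc[MAX_ACCESSES];
    size_t n_acc;
} task_t;

/* An access writes its block unless it was declared input-only. */
static bool writes(access_mode m) { return m != ACC_INPUT; }

/* Returns true if `later` must wait for `earlier`: both touch the same
 * block and at least one of the two accesses writes it. */
bool depends_on(const task_t *later, const task_t *earlier)
{
    assert(later->n_acc <= MAX_ACCESSES && earlier->n_acc <= MAX_ACCESSES);
    for (size_t i = 0; i < later->n_acc; i++)
        for (size_t j = 0; j < earlier->n_acc; j++)
            if (later->acc[i].addr == earlier->acc[j].addr &&
                (writes(later->acc[i].mode) || writes(earlier->acc[j].mode)))
                return true;
    return false;
}
```

Under this sketch, an accum instance that reads a block of C depends on the vadd3 instance that produced that block, while two tasks that only read the same block of A stay independent and could run in parallel.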
5
Performance tools

Analysis of applications at large scale
    Maximize ratio of captured information / emitted data
    Intelligent on line data reduction
    Mixed instrumentation and sampling

Advanced modeling/prediction of sequential computation behavior
    Memory behavior
    Use classification techniques of hardware counter metrics to identify potentially interesting transformations
    CPI STACK model for sequential computation parts
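The slide only names the CPI stack model, so the following is a generic sketch of the idea rather than the BSC tooling: measured cycles per instruction are split into a contribution per stall source plus a residual base component. The type stall_source, the function cpi_stack and the fixed per-event penalty are assumptions made for illustration; real models derive the components from platform-specific hardware counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one stall source. */
typedef struct {
    const char *name;
    double events;          /* event count read from a hardware counter */
    double penalty_cycles;  /* assumed average stall cycles per event */
} stall_source;

/* Writes the CPI contribution of each source into cpi_out[0..n-1] and
 * returns the residual base CPI (total CPI minus all contributions). */
double cpi_stack(double cycles, double instructions,
                 const stall_source *src, size_t n, double *cpi_out)
{
    assert(instructions > 0.0);
    double total_cpi = cycles / instructions;
    double explained = 0.0;
    for (size_t i = 0; i < n; i++) {
        cpi_out[i] = src[i].events * src[i].penalty_cycles / instructions;
        explained += cpi_out[i];
    }
    return total_cpi - explained;
}
```

Stacking the returned base value with the entries of cpi_out reproduces the total CPI, which makes it easy to see which stall source dominates a sequential computation phase.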