Communication Framework

Presentation transcript:

Communication Framework
Sameer Kumar

Communication: The Ultimate Hurdle
- Understanding communication overheads
- Optimizing communication patterns in applications

Three-Pronged Approach
- Optimizing communication using features of architectures
- Better algorithms for optimizing communication patterns
- Automatic optimization framework

Areas of Research
- Parallel algorithms
  - Collective communication optimization schemes
- Interconnect network simulation
  - Understanding networks
  - Possible development of new networks through simulation
- Software engineering
  - Comlib framework currently has about 80 classes and 4-5 design patterns

Class Hierarchy
- Each strategy defines its own learner
[Class diagram. Labels: UserProg., Handle, ComlibMgr, ConvComlibMgr, Strategy, Charm Strategy, Learner. Methods shown: insertMessage(), doneInserting(). One relation is labeled "delegation".]
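The class hierarchy slide names a Strategy with insertMessage() and doneInserting(), a Learner owned by each strategy, and a Handle that reaches the strategy through ComlibMgr by delegation. The sketch below is one possible reading of that design, written in Python for brevity (Charm++ itself is C++). Only the names Strategy, Learner, ComlibMgr, Handle, insertMessage and doneInserting come from the slide. Everything else is a hypothetical illustration and not the Comlib API: DirectStrategy, CombiningStrategy, make_learner, observe, register, the send callback, and the choice of message combining as the optimization.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class Learner:
    """Hypothetical learner: records traffic statistics for one strategy."""

    def __init__(self):
        self.message_count = 0
        self.total_bytes = 0

    def observe(self, size):
        self.message_count += 1
        self.total_bytes += size


class Strategy(ABC):
    """Interface named on the slide; each strategy owns its own learner."""

    def __init__(self, send):
        self.send = send                    # assumed callback: send(dest, messages)
        self.learner = self.make_learner()  # hypothetical factory hook

    def make_learner(self):
        return Learner()

    @abstractmethod
    def insertMessage(self, dest, payload):
        ...

    @abstractmethod
    def doneInserting(self):
        ...


class DirectStrategy(Strategy):
    """Hypothetical baseline: forwards each message as soon as it is inserted."""

    def insertMessage(self, dest, payload):
        self.learner.observe(len(payload))
        self.send(dest, [payload])

    def doneInserting(self):
        pass  # nothing is buffered


class CombiningStrategy(Strategy):
    """Hypothetical optimization: one combined send per destination."""

    def __init__(self, send):
        super().__init__(send)
        self.pending = defaultdict(list)

    def insertMessage(self, dest, payload):
        self.learner.observe(len(payload))
        self.pending[dest].append(payload)

    def doneInserting(self):
        for dest in sorted(self.pending):
            self.send(dest, self.pending[dest])
        self.pending.clear()


class ComlibMgr:
    """Holds the registered strategies and hands out handles to them."""

    def __init__(self):
        self.strategies = []

    def register(self, strategy):
        self.strategies.append(strategy)
        return Handle(self, len(self.strategies) - 1)


class Handle:
    """What the user program holds; every call is delegated via the manager."""

    def __init__(self, mgr, index):
        self.mgr = mgr
        self.index = index

    def insertMessage(self, dest, payload):
        self.mgr.strategies[self.index].insertMessage(dest, payload)

    def doneInserting(self):
        self.mgr.strategies[self.index].doneInserting()


def demo():
    wire = []  # stands in for the network: a list of (dest, messages) sends
    mgr = ComlibMgr()
    strategy = CombiningStrategy(lambda dest, msgs: wire.append((dest, msgs)))
    handle = mgr.register(strategy)
    for dest, payload in [(1, b"ab"), (0, b"c"), (1, b"def")]:
        handle.insertMessage(dest, payload)
    handle.doneInserting()
    return wire, strategy.learner


if __name__ == "__main__":
    print(demo()[0])
```

In this sketch the user program only ever touches the Handle, so swapping CombiningStrategy for DirectStrategy changes how messages reach the wire without changing the calling code. That separation is the point of the strategy pattern the slide alludes to.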

Future Tasks
- InfiniBand port
- Blue Gene and IBM SP performance
- Design of a new communication interconnect suitable for message-driven programming
- Support for more collectives (reductions)
- Need motivated students!