Template for IXPUG EMEA Ostrava, 2016

Template for IXPUG EMEA Ostrava, 2016 (feel free to use your own presentation template)
23.11.2018, IXPUG EMEA 2016 Ostrava Workshop Submission

Topics (not exclusively)
- Real-world experiences: success stories and open issues with real-world workloads from all application areas, including software-defined visualization and machine learning.
- Optimization techniques:
  - General
  - Vectorization: tough cases, systemic and strategic issues, language interfaces
  - Data preconditioning
  - Thread management, task dependencies
  - Tools for code transformations
  - Performance portability, open standard alternatives
- Experiences with programming and runtime models:
  - hStreams
  - Runtime optimization techniques
- Multi-device and multi-node scalability:
  - OpenMP
  - MPI, including experiences with MPI 3, non-blocking collectives
  - Offload over fabric
- Preparing workloads for KNL:
  - Optimization for MCDRAM
  - On-device process & thread scalability

<presenter name>
<presenter title>
<affiliation>

What’s unique about my tuning work
- <App name, brief description>
- <Application domain (seeking diversity)>
- <Execution mode: native, offloaded, symmetric MPI, cluster>
- <Tools used for development, analysis and debugging (seeking diversity of experiences and tools)>
- <Alignment with vectorization effectiveness and/or memory tuning>

Performance
- <Compelling performance: with vs. without MIC>
- <Competitive performance, if available>
- <How much each of the original host code and the code with MIC was sped up (e.g. Xeon 1.3x, MIC 2.2x)>
- <List of optimizations that yielded performance improvements, how much each gave (order dependent), why I thought it would help, and how generalizable I think each optimization is>

Insights
- <What we learned>
- <What we recommend and how we would have done it differently>
- <Which tools and optimizations were most useful and why?>
- <Biggest surprises>
- <Key remaining challenges, where we might want help>
- <Questions we’d like to raise>
- <Who we’re thinking of collaborating with>

References / Codes
- <Where to get the code>
- <How to reproduce measurements: compile, run (links to external documents)>