Software & Services Group Developer Products Division Copyright© 2011, Intel Corporation. All rights reserved. *Other brands and names are the property.

Presentation transcript:

Software & Services Group Developer Products Division Copyright© 2011, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners.

Building parallel applications using Guided Auto-Parallelization
Om P Sachan, Intel Compiler and Languages

Optimization Notice

Intel compilers, associated libraries and associated development tools may include or utilize options that optimize for instruction sets that are available in both Intel and non-Intel microprocessors (for example SIMD instruction sets), but do not optimize equally for non-Intel microprocessors. In addition, certain compiler options for Intel compilers, including some that are not specific to Intel micro-architecture, are reserved for Intel microprocessors. For a detailed description of Intel compiler options, including the instruction sets and specific microprocessors they implicate, please refer to the "Intel Compiler User and Reference Guides" under "Compiler Options." Many library routines that are part of Intel compiler products are more highly optimized for Intel microprocessors than for other microprocessors. While the compilers and libraries in Intel compiler products offer optimizations for both Intel and Intel-compatible microprocessors, depending on the options you select, your code and other factors, you likely will get extra performance on Intel microprocessors. Intel compilers, associated libraries and associated development tools may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming SIMD Extensions 3 (Intel® SSE3), and Supplemental Streaming SIMD Extensions 3 (Intel SSSE3) instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.
While Intel believes our compilers and libraries are excellent choices to assist in obtaining the best performance on Intel and non-Intel microprocessors, Intel recommends that you evaluate other compilers and libraries to determine which best meet your requirements. We hope to win your business by striving to offer the best performance of any compiler or library; please let us know if you find we do not. Notice revision #

Agenda
Introduction to Guided Auto-parallelization.
Run Guided Auto-parallelization.
Analyze Guided Auto-parallelization reports.
Implement Guided Auto-parallelization recommendations.

Parallelization in Mainstream
Performance gains coming from more cores per die
  – Increasing clock frequencies play a smaller role
Exposes parallelism to the programmer
Every computer is a parallel computer
  – Implies most programs must execute in parallel
Parallelism successful in HPC, servers, graphics, ...
  – Not widespread in the client domain
Client apps focused on
  – Quality user experience
  – Scalability
  – Programmer productivity (critical for time-to-market)
Development of multi-threaded apps is hard
Need for a low-cost and effective way of threading apps

Parallelization in Mainstream (contd.)
Requires multi-pronged approach:
  – Simpler parallel programming models and abstractions
  – Domain-specific parallel libraries
  – Compiler auto-parallelization, auto-vectorization, and data-transformation
  – Advise user on how to parallelize
  – Good debugging tools
  – Easy-to-use tools for performance analysis
Tradeoffs between scalability and productivity
Compiler can play an important role in enabling parallelism

Workflow with Compiler as a Tool
Simplifies programmer effort in application tuning
[Workflow diagram]
  – Application Source (C/C++/Fortran) -> Compiler -> Application Binary + Opt Reports
  – Performance Tools: identify hotspots, problems
  – Application Source + Hotspots -> Compiler in advice-mode -> Advice messages
  – Modified Application Source -> Compiler (extra options) -> Improved Application Binary

Compiler as a Tool
Use compiler as a tool to give selective advice
Initially targets:
  – Automatic parallelization of loop-nests
  – Automatic vectorization of inner-loops
  – Data transformation suggestions
Programmer writes serial code, then follows the compiler advice to assert new properties
  – Does not require a lot of extra time and effort from user
Code remains performance-portable
Programmer reasons about application properties
Tool based on expertise of common pitfalls
  – Conservative disambiguation assumptions
  – Compiler assumes upper-bound is changing inside loop
  – ...
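The "conservative disambiguation assumptions" pitfall can be illustrated with a small sketch. The function names `add_one` and `overlapping_call` and the data values are hypothetical and do not come from the slides. Because `dst` and `src` are plain pointers, a compiler generally has to assume they may overlap, and when they do overlap the serial result depends on the iteration order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not from the slides: dst[i] = src[i] + 1.
 * Without extra information the compiler must assume dst and src may
 * overlap, so it has to preserve the serial, element-by-element order. */
void add_one(float *dst, const float *src, size_t n)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; ++i) {
        dst[i] = src[i] + 1.0f;
    }
}

/* Overlapping call: dst is src shifted by one element, so iteration i
 * reads the value written by iteration i - 1 (a cross-iteration
 * dependency). Serial execution turns {1, 0, 0, 0} into {1, 2, 3, 4}. */
void overlapping_call(float a[4])
{
    add_one(a + 1, a, 3);
}
```

If all three loads were performed before any store, as a vector unit of width three or more would do when told the arrays are independent, the same call would leave {1, 2, 1, 1}. This is the kind of property the tool asks the programmer to confirm rather than assume.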

How it Works
Targeted for Mainstream and HPC Users
Advice may involve
  – suggestions for source-change
  – adding pragmas
  – adding new options
Simple source changes that assert new properties
  – Add a new pragma for loop if semantics are satisfied
  – Use a local-variable for the upper-bound of a loop
  – Initialize scalar variable unconditionally at top of loop
  – Reorder fields of a structure (or split into two)
Desired behavior
  – Each advice is specific using source-level variable names
  – User does semantic analysis: apply or reject each advice
  – Advice should be as localized as possible
  – Following the advice should result in better optimizations
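One of the simple source changes listed above, initializing a scalar unconditionally at the top of the loop, might look like the following sketch. The function names and the computation are hypothetical and chosen only for illustration; the slides do not show this code.

```c
#include <assert.h>
#include <stddef.h>

/* Before: t is assigned only on some paths, so from the compiler's point
 * of view its value might be carried over from an earlier iteration. */
void scale_positive_before(float *out, const float *in, size_t n)
{
    float t = 0.0f;
    size_t i;
    assert(out != NULL && in != NULL);
    for (i = 0; i < n; ++i) {
        if (in[i] > 0.0f) {
            t = in[i] * 2.0f;
        }
        out[i] = (in[i] > 0.0f) ? t : 0.0f;
    }
}

/* After: t is initialized unconditionally at the top of every iteration,
 * which makes it visibly private to that iteration. */
void scale_positive_after(float *out, const float *in, size_t n)
{
    size_t i;
    assert(out != NULL && in != NULL);
    for (i = 0; i < n; ++i) {
        float t = 0.0f;
        if (in[i] > 0.0f) {
            t = in[i] * 2.0f;
        }
        out[i] = (in[i] > 0.0f) ? t : 0.0f;
    }
}
```

Both versions compute the same values here; the change only makes the per-iteration lifetime of `t` explicit. Whether that holds in a real loop is the semantic check the user is expected to perform before applying the advice.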

Vectorization Example

    void mul(NetEnv* ne, Vector* rslt, Vector* den, Vector* flux1, Vector* flux2, Vector* num)
    {
        float *r, *d, *n, *s1, *s2;
        int i;
        r = rslt->data;
        d = den->data;
        n = num->data;
        s1 = flux1->data;
        s2 = flux2->data;
        for (i = 0; i < ne->len; ++i) {
            r[i] = s1[i] * s2[i] + n[i] * d[i];
        }
    }

Create an assignment statement to store the upper-bound (ne->len) of loop at line 29 to a local variable if this does not alter program semantics. [VERIFY] Make sure that the upper-bound does not change during the execution of the loop.

Use "#pragma ivdep" to vectorize the loop at line 29, if these arrays in the loop do not have unsafe cross-iteration dependencies: r, s1, s2, n, d. [VERIFY] A cross-iteration dependency exists if a memory location is modified in an iteration of a loop and accessed (a read or a write) in another iteration of the loop. Make sure that there are no such dependencies, or that any cross-iteration dependencies can be safely ignored.

The compiler guides the user on the source change, on what pragma to insert, and on how to determine whether that pragma is correct for this case.
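Applying both pieces of advice to the example might produce something like the sketch below. The slide does not define `NetEnv` or `Vector`, so the struct layouts here are assumptions containing only the `len` and `data` fields the loop uses. The pragma is guarded so that compilers other than Intel's simply skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed minimal layouts; the slide only implies the len and data fields. */
typedef struct { int len; } NetEnv;
typedef struct { float *data; } Vector;

void mul(NetEnv *ne, Vector *rslt, Vector *den,
         Vector *flux1, Vector *flux2, Vector *num)
{
    float *r = rslt->data;
    float *d = den->data;
    float *n = num->data;
    float *s1 = flux1->data;
    float *s2 = flux2->data;
    int i;

    /* Advice 1: copy the upper bound into a local variable. This is only
     * valid if ne->len does not change while the loop runs. */
    int len = ne->len;
    assert(len >= 0);

    /* Advice 2: assert that r, s1, s2, n and d carry no unsafe
     * cross-iteration dependencies. The caller must guarantee this. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma ivdep
#endif
    for (i = 0; i < len; ++i) {
        r[i] = s1[i] * s2[i] + n[i] * d[i];
    }
}
```

The source changes are local to the loop, and the code still compiles and behaves the same with other compilers, which is in line with the "performance-portable" goal stated earlier.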

Activity 1
Prepare and run sample code
Use lab document

Usage Model
Two main usage models:
  – Users compiling with auto-parallelization enabled
  – Users compiling without auto-parallelization, who can still gain from improved vectorization
User can specify regions of a file or routine that are considered hot
  – Advice will be restricted to the hot region
  – Default is to provide advice on entire compilation-unit
Under guide-mode, no executable code is generated
  – Only output is a set of advice messages
User not required to use advanced options (IPO, PGO), but advice may change based on options
User may apply all (or a subset) of the advice
  – Recompiling in normal-mode then enables better optimizations

Usage Model (contd.)
Advice targeted only for improving application performance
  – Use tool during the performance-tuning part of the software development cycle
Each advice has a VERIFY part
  – User is responsible for checking whether it is safe to apply each suggestion
User not required to use advanced options (IPO, PGO)
  – When IPO is ON in guide-mode, advice will get emitted as part of link-step
There may be multiple messages targeting the same loop
  – User has to apply ALL to get desired optimization
Default debug mode generates no GAP messages
  – /Zi implies /Od; override by adding /O2 explicitly

Limitations
User may have to deal with lots of messages
  – Duplicate messages
  – If no hot region is specified
User is responsible for semantic verification, so bugs are possible
  – Adding an ivdep pragma in a loop is an assertion by the user
  – May lead to errors if user is not diligent with the verification
  – Good documentation with examples can help mitigate this
More vector/par-loops does not always guarantee performance gains
Tool does not guide the user on how to write parallel code
Not a general purpose mechanism to achieve maximum performance
  – Turning on GAP will not vectorize EVERY loop
  – Only a subset where compiler can do an intelligent workaround
Not a panacea for all problems related to parallelization
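Since an ivdep pragma is an assertion by the user, one way to be more diligent is to check the asserted property in debug builds. The sketch below is an assumption about how a user might do that; `ranges_overlap` and `saxpy_checked` are hypothetical names, and nothing like this is part of GAP itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the float ranges [a, a + n) and
 * [b, b + n) share any bytes, 0 otherwise. Addresses are compared as
 * integers so that pointers into different arrays can be compared. */
static int ranges_overlap(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa < pb + bytes && pb < pa + bytes;
}

/* y[i] += a * x[i], with the no-overlap assumption checked in debug
 * builds (the assert disappears when NDEBUG is defined). */
void saxpy_checked(float *y, const float *x, float a, size_t n)
{
    size_t i;
    assert(!ranges_overlap(y, x, n));
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma ivdep
#endif
    for (i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

The check is deliberately conservative: it rejects any overlap, including cases that happen to be safe for a particular loop. It covers only the aliasing part of the VERIFY step and does not replace reasoning about the loop body.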

How to Use GAP
Targeting Windows and Linux (IA32 & Intel64)
With normal options for the app (-O2 and above), add:
  – -Qguide:3 (Mainstream)
  – -Qguide:4 (HPC)
No code generation in gap-mode (no executable generated)
Can be used with and without the -Qparallel option

Activity 2
Implementing Guided Auto-parallelization recommendations, using sample code
Use lab document

Summary
Learned Guided Auto-parallelization.
Analyzed Guided Auto-parallelization reports.
Implemented Guided Auto-parallelization recommendations.


Legal Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED AS IS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference

BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Atom, Centrino Atom Inside, Centrino Inside, Centrino logo, Cilk, Core Inside, FlashFile, i960, InstantIP, Intel, the Intel logo, Intel386, Intel486, IntelDX2, IntelDX4, IntelSX2, Intel Atom, Intel Atom Inside, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, Viiv Inside, vPro Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © Intel Corporation.