Tuning Python Applications Can Dramatically Increase Performance Vasilij Litvinov Software Engineer, Intel.

Slides:



Advertisements
Similar presentations
11 Auto Regression Analysis Shuang He Intel Linux Graphics Validation Team Open Source Technology Center
Advertisements

SpreadsheetML Basics.
Software & Services Group, Developer Products Division Copyright© 2010, Intel Corporation. All rights reserved. *Other brands and names are the property.
Intel ® Xeon ® Processor E v2 Product Family Ivy Bridge Improvements *Other names and brands may be claimed as the property of others. FeatureXeon.
© 2014 Microsoft Corporation. All rights reserved.
Software and Services Group Optimization Notice Advancing HPC == advancing the business of software Rich Altmaier Director of Engineering Sept 1, 2011.
Perceptual Computing SDK Q2, 2013 Update Building Momentum with the SDK 1 Barry Solomon, Senior Product Manager, Intel Xintian Wu, Architect, Intel.
Software & Services Group Developer Products Division Copyright© 2013, Intel Corporation. All rights reserved. *Other brands and names are the property.
Intel® Education Fluid Math™
HEVC Commentary and a call for local temporal distortion metrics Mark Buxton - Intel Corporation.
Intel ® Server Platform Transitions Nov / Dec ‘07.
Intel® Education Read With Me Intel Solutions Summit 2015, Dallas, TX.
Intel® Education Learning in Context: Science Journal Intel Solutions Summit 2015, Dallas, TX.
Getting Reproducible Results with Intel® MKL 11.0
Intel® Solid-State Drive Data Center TCO Calculator The data in this presentation is based on your analysis and business assumptions when using the Intel®
Visit our Focus Rooms Evaluation of Implementation Proposals by Dynamics AX R&D Solution Architecture & Industry Experts Gain further insights on Dynamics.
Software & Services Group, Developer Products Division Copyright © 2010, Intel Corporation. All rights reserved. *Other brands and names are the property.
Vital Signs: Performance Monitoring Windows Server
OpenMP * Support in Clang/LLVM: Status Update and Future Directions 2014 LLVM Developers' Meeting Alexey Bataev, Zinovy Nis Intel.
Orion Granatir Omar Rodriguez GDC 3/12/10 Don’t Dread Threads.
Evaluation of a DAG with Intel® CnC Mark Hampton Software and Services Group CnC MIT July 27, 2010.
® IBM Software Group © 2012 IBM Corporation OPTIM Data Studio – Jon Sayles, IBM/Rational November, 2012.
IBIS-AMI and Direction Indication February 17, 2015 Updated Feb. 20, 2015 Michael Mirmak.
1 Intel® Many Integrated Core (Intel® MIC) Architecture MARC Program Status and Essentials to Programming the Intel ® Xeon ® Phi ™ Coprocessor (based on.
Conditions and Terms of Use
Copyright © 2013 Intel Corporation. All rights reserved. Digital Signage for Growing Businesses November 2013.
Intel® Education Learning in Context: Concept Mapping Intel Solutions Summit 2015, Dallas, TX.
Enterprise Platforms & Services Division (EPSD) JBOD Update October, 2012 Intel Confidential Copyright © 2012, Intel Corporation. All rights reserved.
Copyright © 2002, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners
Introduction to OpenCL* Ohad Shacham Intel Software and Services Group Thanks to Elior Malul, Arik Narkis, and Doron Singer 1.
IBIS-AMI and Direction Decisions
IBIS-AMI and Direction Indication February 17, 2015 Michael Mirmak.
Copyright © 2006 Intel Corporation. WiMAX Wireless Broadband Access: The World Goes Wireless Michael Chen Director of Product & Platform Marketing Group.
Recognizing Potential Parallelism Introduction to Parallel Programming Part 1.
Overview of Silverlight Mike Taulty Developer & Platform Group Microsoft Ltd
The Drive to Improved Performance/watt and Increasing Compute Density Steve Pawlowski Intel Senior Fellow GM, Architecture and Planning CTO, Digital Enterprise.
Copyright© 2011, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. 1 How Does The Intel® Parallel.
Copyright © 2011 Intel Corporation. All rights reserved. Openlab Confidential CERN openlab ICT Challenges workshop Claudio Bellini Business Development.
Programming with touchdevelop teacher’s checklist go over this before starting the course Disclaimer: This document is provided “as-is”. Information and.
Boxed Processor Stocking Plans Server & Mobile Q1’08 Product Available through February’08.
How to Enforce Reproducibility with your Existing Intel ® Math Kernel Library Code Noah Clemons Technical Consulting Engineer Intel ® Developer Products.
Installation of Storage Foundation for Windows High Availability 5.1 SP2 1 Daniel Schnack Principle Technical Support Engineer.
Template Library for Vector Loops A presentation of P0075 and P0076
INTEL CONFIDENTIAL Intel® Smart Connect Technology Remote Wake with WakeMyPC November 2013 – Revision 1.2 CDI/IBP #:
Building faster data applications on spark* clusters using Intel® DAAL.
Tuning Threaded Code with Intel® Parallel Amplifier.
This document is provided for informational purposes only and Microsoft makes no warranties, either express or implied, in this document. Information.
Work smarter, keep connected with Lotus Software Jon Crouch | Senior Technical Specialist, Lotus Software Matt Newton | Senior Technical Specialist, Lotus.
Intel® Many Integrated Core Architecture Software & Services Group, Developer Relations Division Copyright© 2011, Intel Corporation. All rights reserved.
1 Game Developers Conference 2008 Comparative Analysis of Game Parallelization Dmitry Eremin Senior Software Engineer, Intel Software and Solutions Group.
SUSE Studio: Building distributions By Cool Person From openSUSE.
System Software EIT, © Author Gay Robertson, 2016.
BLIS optimized for EPYCTM Processors
Developing Drivers in Visual Studio
8/3/2018 7:11 AM © 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered.
Installation and database instance essentials
TechEd /9/ :26 PM © 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks.
Many-core Software Development Platforms
Reaching more customers with accessible Metro style apps using HTML5
Git Version Control for Everyone
Intel® Parallel Studio and Advisor
A Proposed New Standard: Common Privacy Vulnerability Scoring System (CPVSS) Jonathan Fox, Privacy Office/PDIT Harold A. Toomey, PSG/ISecG Jason M. Fung,
TechEd /21/2018 3:13 PM © 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered.
12/26/2018 5:07 AM Leap forward with fast, agile & trusted solutions from Intel & Microsoft* Eman Yarlagadda (for Christine McMonigal) Hybrid Cloud – Product.
Ideas for adding FPGA Accelerators to DPDK
Enabling TSO in OvS-DPDK
By Vipin Varghese Application Engineer (NCSD)
A Scalable Approach to Virtual Switching
Expanded CPU resource pool with
Presentation transcript:

Tuning Python Applications Can Dramatically Increase Performance Vasilij Litvinov Software Engineer, Intel

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 2 Legal Disclaimer & Optimization Notice INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Copyright © 2015, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries. Optimization Notice Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 3 Why do we need Python optimization? How one finds the code to tune? Overview of existing tools An example Prototype capabilities and comparison Q & A Agenda 3

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 4 Python is used to power wide range of software, including those where application performance matters Some Python code may not scale well, but you won’t know it unless you give it enough workload to chew on Sometimes you are just not happy with the speed of your code All in all, there are times when you want to make your code run faster, be more responsive, (insert your favorite buzzword here). So, you need to optimize (or tune) your code. 4 Why do we need Python optimization?

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 5 Hard stare = Often wrong 5 How one finds the code to tune – measuring vs guessing Profile = Accurate, Easy Logging = Analysis is tedious

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 6 There are different profiling techniques, for example: Event-based Example: built-in Python cProfile profiler Instrumentation-based Usually requires modifying the target application (source code, compiler support, etc.) Example: line_profiler Statistical Accurate enough & less intrusive Example: Intel® VTune Amplifier 6 Not All Profilers Are Equal

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 7 ToolDescriptionPlatfo`rm s Profile level Avg. overhead cProfile (built-in) Text interactive mode: “pstats” (built- in) GUI viewer: RunSnakeRun Open Source AnyFunction1.3x-5x Python Tools Visual Studio (2010+) Open Source WindowsFunction~2x PyCharmNot free cProfile/yappi based AnyFunction1.3x-5x (same as cProfile) line_profilerPure Python Open Source Text-only viewer AnyLineUp to 10x or more Most Profilers – High Overhead, No Line Info 7

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 8 class Encoder: CHAR_MAP = {'a': 'b', 'b': 'c'} def __init__(self, input): self.input = input def process_slow(self): result = '' for ch in self.input: result += self.CHAR_MAP.get(ch, ch) return result def process_fast(self): result = [] for ch in self.input: result.append(self.CHAR_MAP.get(ch, ch)) return ''.join(result) 8 Python example to profile: demo.py

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 9 import demo import time def slow_encode(input): return demo.Encoder(input).process_slow() def fast_encode(input): return demo.Encoder(input).process_fast() if __name__ == '__main__': input = 'a' * # 10 millions of 'a' start = time.time() s1 = slow_encode(input) slow_stop = time.time() print 'slow: %.2f sec' % (slow_stop - start) s2 = fast_encode(input) print 'fast: %.2f sec' % (time.time() - slow_stop) 9 Python sample to profile: run.py No profiling overhead - a baseline for tools’ overhead comparison slow: 9.15 sec = 1.00x fast: 3.16 sec = 1.00x

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 10 > python -m cProfile -o run.prof run.py > python -m pstats run.prof run.prof% sort time run.prof% stats Tue Jun 30 18:43: run.prof function calls in seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) demo.py:6(process_slow) demo.py:12(process_fast) {method 'get' of 'dict' objects} {method 'append' of 'list' objects} {method 'join' of 'str' objects} run.py:7(fast_encode) run.py:1( ) run.py:4(slow_encode) cProfile + pstats UI example 10 slow: sec = 1.12x fast: 5.34 sec = 1.69x

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 11 cProfile + RunSnakeRun 11

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 12 cProfile in PyCharm slow: sec = 1.10x fast: 5.60 sec = 1.77x

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 13 Total time: s File: demo_lp.py Function: process_slow at line 6 Line # Hits Time Per Hit % Time Line Contents ========================================================== 7 def process_slow(self): result = '' for ch in self.input: result += self.CHAR_MAP.get( return result Total time: s File: demo_lp.py Function: process_fast at line 13 Line # Hits Time Per Hit % Time Line Contents ========================================================== 14 def process_fast(self): result = [] for ch in self.input: result.append(self.CHAR_MAP.get( return ''.join(result) line_profiler results 13 slow: sec = 2.66x fast: sec = 8.03x

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 14 Python Tools GUI example 14 Note: Wallclock time (not CPU) slow: sec = 1.90x fast: sec = 3.82x

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 15 Prototype GUI example 15 slow: sec = 1.23x fast: 4.45 sec = 1.40x

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 16 Prototype GUI – source view 16

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 17 Line-level profiling details  Uses sampling profiling technique  Average overhead ~1.1x-1.6x (on certain benchmarks) Cross-platform  Windows and Linux  Python 32- and 64-bit; 2.7.x and 3.4.x branches Rich Graphical UI Supported workflows  Start application, wait for it to finish  Attach to application, profile for a bit, detach Our Prototype: Accurate & Easy 17

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 18 Low Overhead and Line-Level Info ToolDescriptionPlatformsProfile level Avg. overhead Our prototype Rich GUI viewerWindows Linux Line~ x cProfile (built-in) Text interactive mode: “pstats” (built-in) GUI viewer: RunSnakeRun (Open Source) PyCharm AnyFunction1.3x-5x Python ToolsVisual Studio (2010+) Open Source WindowsFunction~2x line_profilerPure Python Open Source Text-only viewer AnyLineUp to 10x or more 18

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 19 One widely-used web page in our internally set up Buildbot service: 3x speed up (from 90 seconds to 28) Report generator – from 350 sec to <2 sec for 1MB log file Distilled version was the base for demo.py Internal SCons-based build system: several places sped up 2x or more Loading all configs from scratch tuned from 6 minutes to 3 minutes We’ve Had Success Tuning Our Python Code 19

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 20 Profiling Chromium gyp script

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 21 Profiling NumPy-based simple raytracing code

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 22 Visit us at our table or catch me during sprints Technical Preview & Beta Participation – us at We’re making proof-of-concept work of Intel-accelerated Python (e.g. NumPy/SciPy, etc.). Sign up! Check out Intel Developer Zone – software.intel.comsoftware.intel.com Check out Intel® Software Development tools Qualify for Free Intel® Software Development tools 22 Sign Up with Us to Give the Profiler a Try & Check out Intel® Software Development Tools

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 23 Free Intel® Software Development Tools Intel Performance Libraries for academic research Visit us at

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice … and again: For Tech Preview and Beta, drop us an at Check out free Intel® software – just google for “free intel tools” to see if you’re qualified 24 Q & A

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 25 Profiling GPAW example – calculation of Hydrogen atom energies

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice 26 Performance Starts Here! 26 You are a:Products Available ++ Support Model Price Commercial Developer or Academic Researcher Intel® Parallel Studio XE (Compilers, Performance Libraries & Analyzers) Intel® Premier Support $699** - $2949** Academic Researcher + Intel® Performance Libraries Intel® Math Kernel Library Intel® MPI Library Intel® Threading Building Blocks Intel® Integrated Performance Primitives Forum only support Free! Student + Intel® Parallel Studio XE Cluster Edition Educator + Open Source Contributor + Intel® Parallel Studio XE Professional Edition +Subject to qualification ++ OS Support varies by product **Single Seat Pricing