Intel® Direct Sparse Solver for Clusters, a research project for solving large sparse systems of linear algebraic equations
Alexander Kalinkin Anders Anton Anders Roman
Software & Services Group Developer Products Division Copyright© 2013, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners.

Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference
BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Atom, Centrino Atom Inside, Centrino Inside, Centrino logo, Cilk, Core Inside, FlashFile, i960, InstantIP, Intel, the Intel logo, Intel386, Intel486, IntelDX2, IntelDX4, IntelSX2, Intel Atom, Intel Atom Inside, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, Viiv Inside, vPro Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries. *Other names and brands may be claimed as the property of others.
Copyright © Intel Corporation.

Agenda
Intro
Algorithm
Reordering step
Factorization step
Experiments
Conclusion

Problem statement
Cons:
No extra data is available for the matrix beyond some global properties (positive definite, Hermitian, ...).
The matrix is huge.
Pros:
Clusters with modern Intel® CPUs.
The Intel® MKL library with optimized BLAS, LAPACK, and PARDISO functionality.

Algorithm (Ax=b)
Input: matrix A, vector b; special parameters.
1. Matrix reordering and symbolic factorization: reorder matrix A to reduce fill-in in the factor L, and create a dependency-tree representation of matrix A.
2. Numeric factorization: compute the decomposition A = LL^T, LDL^T, or LU. This is the most time-consuming part.
3. Forward and backward substitution: solve Ly = b (forward step), Dz = y (diagonal step), then L^T x = z (backward step).
Output: vector x.
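The numeric factorization and the three solve steps can be illustrated on a small dense matrix. The following is a minimal sketch, not the solver's implementation: it assumes a dense, symmetric, row-major matrix with non-zero pivots (as for a positive definite matrix), performs no reordering and no pivoting, and uses the hypothetical names ldlt_factor and ldlt_solve.

```c
#include <assert.h>
#include <stddef.h>

/* Dense LDL^T factorization of a symmetric matrix stored row-major in a[n*n].
 * On exit the strict lower triangle holds L (unit diagonal implied) and the
 * diagonal holds D. Assumption: no pivoting is needed (non-zero pivots). */
void ldlt_factor(double *a, size_t n)
{
    assert(a != NULL && n > 0);
    for (size_t j = 0; j < n; ++j) {
        /* d_j = a_jj - sum_k l_jk^2 * d_k */
        double d = a[j * n + j];
        for (size_t k = 0; k < j; ++k)
            d -= a[j * n + k] * a[j * n + k] * a[k * n + k];
        assert(d != 0.0);
        a[j * n + j] = d;
        /* l_ij = (a_ij - sum_k l_ik * l_jk * d_k) / d_j */
        for (size_t i = j + 1; i < n; ++i) {
            double s = a[i * n + j];
            for (size_t k = 0; k < j; ++k)
                s -= a[i * n + k] * a[j * n + k] * a[k * n + k];
            a[i * n + j] = s / d;
        }
    }
}

/* Solve L D L^T x = b in place: x holds b on entry, the solution on exit. */
void ldlt_solve(const double *ld, size_t n, double *x)
{
    assert(ld != NULL && x != NULL && n > 0);
    /* forward step: L y = b */
    for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < i; ++k)
            x[i] -= ld[i * n + k] * x[k];
    /* diagonal step: D z = y */
    for (size_t i = 0; i < n; ++i)
        x[i] /= ld[i * n + i];
    /* backward step: L^T x = z */
    for (size_t i = n; i-- > 0;)
        for (size_t k = i + 1; k < n; ++k)
            x[i] -= ld[k * n + i] * x[k];
}
```

Calling ldlt_factor once and then ldlt_solve for each right-hand side mirrors the split between the expensive factorization and the cheaper substitution steps.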

Reordering step
[Figure: matrix A after reordering (example of 4 leaves/process), with blocks labeled A to G; a filled square marks a non-zero block.]
[Figure: tree representation of matrix A after reordering, with nodes A to G.]
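Why the ordering matters for fill-in can be seen on a tiny pattern. The sketch below counts the new non-zeros created by symbolic elimination for a given elimination order; it is only an illustration under stated assumptions (a dense boolean pattern, a hypothetical size cap MAX_N, a hypothetical function name count_fill_in) and says nothing about which reordering method the solver itself uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_N = 16 }; /* illustrative size cap, not from the slides */

/* Count fill-in entries (lower triangle) created by symbolic elimination.
 * pattern[i*n+j] is true where the symmetric matrix has a non-zero;
 * perm[k] is the original index eliminated at step k. */
size_t count_fill_in(const bool *pattern, const size_t *perm, size_t n)
{
    assert(pattern != NULL && perm != NULL && n > 0 && n <= MAX_N);
    bool p[MAX_N * MAX_N];
    /* apply the symmetric permutation to a private copy of the pattern */
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            p[i * n + j] = pattern[perm[i] * n + perm[j]];

    size_t fill = 0;
    for (size_t k = 0; k < n; ++k)
        for (size_t i = k + 1; i < n; ++i) {
            if (!p[i * n + k])
                continue;
            /* rows i and j both depend on column k, so (j, i) fills in */
            for (size_t j = i + 1; j < n; ++j)
                if (p[j * n + k] && !p[j * n + i]) {
                    p[j * n + i] = true;
                    p[i * n + j] = true;
                    ++fill;
                }
        }
    return fill;
}
```

For an "arrowhead" pattern, where one unknown is coupled to all others, eliminating that unknown first fills the whole remaining block, while eliminating it last creates no fill-in at all.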

Factorization step
[Figure: matrix A after reordering (example of 4 leaves/process), with blocks labeled A to G; a filled square marks a non-zero block.]
[Figure: tree representation of matrix A after reordering; an arrow means that the L-block updates the R-block (the right block depends on the left block).]
Both tree and tree-node parallelization are used.
All computations within a node are based on functionality from Intel® MKL.
Computation of the leaves and updates of a block are independent on each process.
Data is distributed uniformly between processes.
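Tree parallelism follows from the dependency tree: a node can be factorized once all of its children are done, and nodes at the same height do not depend on each other. The sketch below computes those heights from a parent array. The node numbering, the NO_PARENT marker, and the function name tree_heights are assumptions made for illustration; the slides do not show how the solver stores or schedules its tree.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PARENT ((size_t)-1) /* marks the root; hypothetical convention */

/* height[v] is 0 for a leaf, otherwise 1 + the largest child height.
 * Assumption: every child has a smaller index than its parent, so one
 * pass in increasing index order sees each node after all its children. */
void tree_heights(const size_t *parent, size_t n, size_t *height)
{
    assert(parent != NULL && height != NULL && n > 0);
    for (size_t v = 0; v < n; ++v)
        height[v] = 0;
    for (size_t v = 0; v < n; ++v) {
        size_t p = parent[v];
        if (p == NO_PARENT)
            continue;
        assert(p > v && p < n);
        if (height[p] < height[v] + 1)
            height[p] = height[v] + 1;
    }
}
```

For a seven-node binary tree with four leaves, the leaves get height 0 and can be processed concurrently, the two inner nodes get height 1, and the root gets height 2.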

Implementation of LU decomposition in a “node”
[Figure: node G.]
Choosing one thread per process allows us to “mask” data transfer time behind computation.

Current status/interface
Supported as 2 additional libraries, Linux and Windows 64-bit only.
Portable across different MPI implementations via a user-compiled wrapper.

C, Intel® MKL PARDISO:

    {
        ….
        PARDISO (pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, b, x, &error);
        …
    }

C, cluster version:

    {
        ….
        comm = MPI_Comm_c2f(MPI_COMM_WORLD);
        CPARDISO (pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, b, x, comm, &error);
        …
    }

Fortran:

    ….
    Call PARDISO(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, idum, nrhs, iparm, msglvl, b, x, error)
    …
    Call CPARDISO(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, idum, nrhs, iparm, msglvl, b, x, comm, error)
    …
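The a, ia, and ja arguments in the calls above describe the matrix in compressed sparse row (CSR) form. As a hedged illustration, the helper below builds those arrays from a small dense symmetric matrix, storing only the upper triangle with one-based indices, which is the PARDISO default for symmetric matrices. The function name dense_to_csr_upper is hypothetical, plain int stands in for MKL_INT, and always storing the diagonal is a choice made here for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Build one-based CSR arrays for the upper triangle (diagonal included)
 * of a dense symmetric row-major matrix.
 * ia needs n+1 entries; ja and a need up to n*(n+1)/2 entries.
 * Returns the number of stored entries. */
size_t dense_to_csr_upper(const double *dense, int n, int *ia, int *ja, double *a)
{
    assert(dense != NULL && ia != NULL && ja != NULL && a != NULL && n > 0);
    size_t nnz = 0;
    for (int i = 0; i < n; ++i) {
        ia[i] = (int)nnz + 1;              /* one-based start of row i */
        for (int j = i; j < n; ++j) {
            double v = dense[(size_t)i * (size_t)n + (size_t)j];
            if (v != 0.0 || j == i) {      /* keep the diagonal even if zero */
                ja[nnz] = j + 1;           /* one-based column index */
                a[nnz] = v;
                ++nnz;
            }
        }
    }
    ia[n] = (int)nnz + 1;                  /* one past the last entry */
    return nnz;
}
```

The resulting arrays could then be passed as a, ia, and ja, with n as the matrix order, to either call shown above; the remaining parameters (pt, iparm, phase, and so on) are not covered by this sketch.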

Experiments (scalability of time)
Additional processes reduce computational time.

Experiments (scalability of memory)
Additional processes decrease memory size per host.

Conclusion
Intel® Direct Sparse Solver for Clusters, based on Intel® MKL functionality, results in:
Good scaling of computational time
Good scaling of memory per node

Q & A

Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #