Physics tables and multi-core J. Apostolakis. Motivation Limited reuse of memory in Multi-Processing – A forked process shares all pages of memory which.

Slides:



Advertisements
Similar presentations
Lecture 6: Multicore Systems
Advertisements

Miss Penalty Reduction Techniques (Sec. 5.4) Multilevel Caches: A second level cache (L2) is added between the original Level-1 cache and main memory.
ELEC 6200, Fall 07, Oct 29 McPherson: Vector Processors1 Vector Processors Ryan McPherson ELEC 6200 Fall 2007.
Multithreading and Dataflow Architectures CPSC 321 Andreas Klappenecker.
Chapter 17 Parallel Processing.
Synergistic Processing In Cell’s Multicore Architecture Michael Gschwind, et al. Presented by: Jia Zou CS258 3/5/08.
Презентація за розділом “Гумористичні твори”
Центр атестації педагогічних працівників 2014
Галактики і квазари.
Характеристика ІНДІЇ.
Процюк Н.В. вчитель початкових класів Боярської ЗОШ І – ІІІ ст №4
Objectives Overview Discovering Computers 2014: Chapter 6 See Page 248
F1020/F1031 COMPUTER HARDWARE MEMORY. Read-only Memory (ROM) Basic instructions for booting the computer and loading the operating system are stored in.
Chapter 6 Inside Computers and Mobile Devices Discovering Computers Technology in a World of Computers, Mobile Devices, and the Internet.
IEEE Nuclear Science Symposium and Medical Imaging Conference Short Course The Geant4 Simulation Toolkit Sunanda Banerjee (Saha Inst. Nucl. Phys., Kolkata,
GEant4 Parallelisation J. Apostolakis. Session Overview Part 1: Geant4 Multi-threading C++ 11 threads: opportunity for portability ? Open, revised and.
1 Memory Management Memory Management COSC513 – Spring 2004 Student Name: Nan Qiao Student ID#: Professor: Dr. Morteza Anvari.
Objectives Overview Describe the various computer and mobile device cases and the contents they protect Describe multi-core processors, the components.
1 Multi-core processors 12/1/09. 2 Multiprocessors inside a single chip It is now possible to implement multiple processors (cores) inside a single chip.
GEant4 Parallelisation J. Apostolakis. Session Overview Part 1: Geant4 Multi-threading C++ 11 threads: opportunity for portability ? Open, revised and.
Mariusz Witek - L1-HLT Workshop1 STL limitations Simple measurements.
Copyright © 1997 – 2014 Curt Hill Concurrent Execution of Programs An Overview.
Detector Simulation on Modern Processors Vectorization of Physics Models Philippe Canal, Soon Yung Jun (FNAL) John Apostolakis, Mihaly Novak, Sandro Wenzel.
Thread-safety and shared geometry data J. Apostolakis.
1 Chapter 4 Processes R. C. Chang. 2 Linux Processes n Each process is represented by a task_struct data structure (task and process are terms that Linux.
Духовні символи Голосіївського району
Lecture Topics: 11/24 Sharing Pages Demand Paging (and alternative) Page Replacement –optimal algorithm –implementable algorithms.
DNA REPLICATION. DNA replication video DNA and Chromosomes In _________cells, DNA is located in the cytoplasm. Most prokaryotes have a __________ DNA.
Copyright © Curt Hill Parallelism in Processors Several Approaches.
A Thread-Parallel Geant4 with Shared Geometry Gene Cooperman and Xin Dong College of Computer and Information Science Northeastern University 360 Huntington.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Operating Systems Processes and Threads.
1 Becoming More Effective with C++ … Day Two Stanley B. Lippman
Computing Performance Recommendations #10, #11, #12, #15, #16, #17.
Interpolation Variable Interpolation, Backslash Interpolation.
1 Lecture: SMT, Cache Hierarchies Topics: SMT processors, cache access basics and innovations (Sections B.1-B.3, 2.1)
Special Topics in Computer Engineering OpenMP* Essentials * Open Multi-Processing.
Multi-threading and other parallelism options J. Apostolakis Summary of parallel session. Original title was “Technical aspects of proposed multi-threading.
Rules for Drawing Energy Level Diagrams
Process API COMP 755.
(Friday, 8/18) Notes Review Step 1: Finish YESTERDAY’S NOTES
Architecture and Design of AlphaServer GS320
Multi-core processors
(Friday, 8/26) Notes Review Step 1: Finish YESTERDAY’S NOTES
Chapter Overview Understanding the Database Architecture
Figure 11.1 A basic personal computer system
Discovering Computers 2014: Chapter6
Проф. д-р Васил Цанов, Институт за икономически изследвания при БАН
ЗУТ ПРОЕКТ на Закон за изменение и допълнение на ЗУТ
О Б Щ И Н А С И Л И С Т Р А П р о е к т Б ю д ж е т г.
Електронни услуги на НАП
Боряна Георгиева – директор на
РАЙОНЕН СЪД - БУРГАС РАБОТНА СРЕЩА СЪС СЪДЕБНИТЕ ЗАСЕДАТЕЛИ ПРИ РАЙОНЕН СЪД – БУРГАС 21 ОКТОМВРИ 2016 г.
Сътрудничество между полицията и другите специалисти в България
Съобщение Ръководството на НУ “Христо Ботев“ – гр. Елин Пелин
НАЦИОНАЛНА АГЕНЦИЯ ЗА ПРИХОДИТЕ
ДОБРОВОЛЕН РЕЗЕРВ НА ВЪОРЪЖЕНИТЕ СИЛИ НА РЕПУБЛИКА БЪЛГАРИЯ
Съвременни софтуерни решения
ПО ПЧЕЛАРСТВО ЗА ТРИГОДИШНИЯ
от проучване на общественото мнение,
Васил Големански Ноември, 2006
Програма за развитие на селските райони
ОПЕРАТИВНА ПРОГРАМА “АДМИНИСТРАТИВЕН КАПАЦИТЕТ”
БАЛИСТИКА НА ТЯЛО ПРИ СВОБОДНО ПАДАНЕ В ЗЕМНАТА АТМОСФЕРА
МЕДИЦИНСКИ УНИВЕРСИТЕТ – ПЛЕВЕН
Стратегия за развитие на клъстера 2015
Моето наследствено призвание
Правна кантора “Джингов, Гугински, Кючуков & Величков”
Безопасност на движението
Concurrency: Processes CSE 333 Summer 2018
January 15, 2004 Adrienne Noble
Presentation transcript:

Physics tables and multi-core J. Apostolakis

Motivation Limited reuse of memory in Multi-Processing – A forked process shares all pages of memory which read-only (called ‘Copy-on-write’=COW) – With Geant4 less than 30% can be shared Leading reasons: – Caching in physics tables – Replicas’ copy numbers. With simple changes could increase reuse

Inside a Physics Table Physics Vector (material 1) Physics Vector (material 2) Physics Vector (material 3) Physics Table

Inside a Physics Vector Energy Vector Value Vector 2 nd Derivative Vector Last bin Last value Last 2 nd Deriv. Scalars Cache: Read/WriteRead-only after filling (initialization) 3 doublesTypically double values each Typically created in consecutive areas of the heap. The result is: By writing the 3 doubles (cache) a process creates copy of page(s) which containing ~ 300 doubles

Multiprocessing and ‘timing’ BeamOn is called Initialization of geometry Initialization of physics processes First event is processed Worker processes (or threads) are created Fork waits until the physics tables are initialized!

Requirement for multiprocessing Use separate areas of memory – One for the scalars (which are rewritten) – A different one for the vectors Fork waits until the tables are initialized & shared.

Complications? Other large arrays used by physics processes Static arrays in Brems, pair production, Goldsmith-Saunderson MSc C-arrays in – Hadron Elastic (no caching) – CHIPS Elastic (?caching?)

Outcome Revision of design of physics vector – To separate areas of memory for scalars and vectors – Hisaya was present – he maintains phys. vector