Published by Anna Løken. Modified over 6 years ago.
1
LoGPC: Modeling Network Contention in Message-Passing Programs
Csaba Andras Moritz Matthew I. Frank Laboratory for Computer Science Massachusetts Institute of Technology 2/16/2019 Andras
2
Introduction
LogP and LogGP are good models for capturing first-order system costs.
Our new model, LoGPC, extends LogP and LogGP to capture pipelining and network contention.
Results preview: 3 applications, 50-76% contention found.
3
Motivation - why do we care?
Regular, tightly synchronized communication patterns are successfully modeled with LogP and LogGP.
Important classes of applications have irregular communication patterns, are not tightly synchronized, or use large messages.
4
Outline of presentation
LoGPC methodology
Contention-free models: LogP, LogGP
Pipelining model
Network contention model
Applications
This presentation is organized as follows: first we introduce the three models on top of which we define an optimization process for finding optimized grain and balance in Raw systems. We exemplify by showing some aspects of this process for the Jacobi 2D. Finally, we end with conclusions and recommendations for Raw systems.
5
LoGPC framework
[Diagram. Boxes (two of them marked optional): Platform (network interface, communication layer, network); Pipelining model; Contention-free models; Performance signature; Contention models; Application; Application performance.]
6
Short messages: LogP (Culler et al.)
4 parameters:
L = latency
o = overheads (send and receive)
g = gap: minimum time interval between consecutive sends and receives
P = processors
[Timeline diagram labels: g, g, o_send, L, o_rec]
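A minimal sketch of how the LogP parameters combine into a message cost, assuming the usual reading of the model (one send overhead, one network latency, one receive overhead per short message). The function names are hypothetical and the time unit is arbitrary; none of this is taken from the slides beyond the parameter meanings.

```python
def logp_message_time(latency: float, o_send: float, o_recv: float) -> float:
    """End-to-end time of one short message: o_send + L + o_recv."""
    return o_send + latency + o_recv


def logp_burst_time(n: int, latency: float, o_send: float,
                    o_recv: float, gap: float) -> float:
    """Time until the last of n back-to-back short messages is received.

    Assumption: the sender can inject a new message every max(g, o_send)
    time units, so the last message starts (n - 1) such intervals late.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    issue_interval = max(gap, o_send)
    return (n - 1) * issue_interval + logp_message_time(latency, o_send, o_recv)
```

With L = 10, o_send = 2, o_recv = 3 and g = 4, a single message costs 15 time units and a burst of three costs 23, because the gap rather than the overhead limits the injection rate.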
7
Long messages: LogGP (Alexandrov et al.)
A new parameter is introduced:
G = gap per byte for long messages
[Timeline diagram: sender and receiver exchanging a k-byte message; labels o_s, (k-1)G, L, o_r, G]
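The timeline labels suggest the standard LogGP cost of a long message: send overhead, then (k-1)G to stream the remaining bytes, then latency and receive overhead. A small sketch under that assumption (the function name and the sample values are illustrative only):

```python
def loggp_message_time(k: int, latency: float, o_send: float,
                       o_recv: float, gap_per_byte: float) -> float:
    """End-to-end time of one k-byte message: o_s + (k - 1) * G + L + o_r."""
    if k < 1:
        raise ValueError("message must contain at least one byte")
    return o_send + (k - 1) * gap_per_byte + latency + o_recv
```

For k = 1 the expression reduces to the LogP short-message cost, which is why LogGP is described as an extension of LogP rather than a replacement.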
8
LoGPC framework
[Diagram repeated from slide 5.]
9
Pipelining model
Network interface: Alewife
[Diagram labels: MEMORY, NETWORK, PROCESSOR, Data Cache bus, IRQ, DMA1, DMA2, INPUT Q, OUTPUT Q, IPI in, IPI out, CONTROLLER]
10
Pipelining model
[Timeline diagram: sender and receiver; labels o, dma, interrupt, (k-1)G, L; stages: memory transfer, network transfer]
11
LoGPC framework
[Diagram repeated from slide 5.]
12
Network contention model
Performance signature: {L, o, G} (Active Messages on MIT Alewife)
Application specific: inter-message time, average distance
Contention model: network dimension, network distance
Contention delay per message
Application performance
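One of the application-specific inputs is the average message distance. As an illustration only, the sketch below computes it by brute force for uniformly random destinations on a k-ary n-dimensional mesh without wraparound links. Both the uniform traffic pattern and the mesh topology are assumptions made for this example, and the function name is hypothetical.

```python
from itertools import product


def average_mesh_distance(k: int, n: int = 2) -> float:
    """Mean Manhattan distance between distinct nodes of a k^n mesh.

    Assumes every node sends to every other node with equal probability
    and that the mesh has no wraparound (torus) links.
    """
    if k < 2:
        raise ValueError("need at least two nodes per dimension")
    nodes = list(product(range(k), repeat=n))
    total_hops = 0
    pairs = 0
    for src in nodes:
        for dst in nodes:
            if src != dst:
                total_hops += sum(abs(a - b) for a, b in zip(src, dst))
                pairs += 1
    return total_hops / pairs
```

A mapping with good locality would give a smaller average distance than this random-traffic baseline, which is the quantity the contention model is sensitive to.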
13
Contention per message: Cn
Start with the open network model by Agarwal for expressing contention per message.
Cn = network contention
L = network latency
[Figure label: L + Cn]
14
Contention delay per message: Cn
Close the model: a P-customer system.
Apply Little's equation and solve for m.
T0: inter-message time
P: processors
m: message rate
Cn: contention delay
L: network latency
[Figure labels: P, Pm, T0, L + Cn]
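A sketch of what "close the model and solve for m" can look like numerically. It assumes each processor has at most one message in flight, so Little's equation gives a per-processor cycle of T0 + L + Cn(m) and therefore m * (T0 + L + Cn(m)) = 1. The contention function is passed in as a parameter; `toy_contention` below is a made-up queueing-style placeholder and is not Agarwal's formula, which does not survive in this transcript.

```python
import math
from typing import Callable


def solve_message_rate(t0: float, latency: float,
                       contention: Callable[[float], float],
                       tol: float = 1e-12) -> float:
    """Solve m * (t0 + latency + contention(m)) = 1 for m by bisection.

    Assumes contention(m) is non-negative and non-decreasing in m, so the
    left-hand side is increasing and the root lies in (0, 1 / (t0 + latency)].
    """
    lo, hi = 0.0, 1.0 / (t0 + latency)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        cycle = t0 + latency + contention(mid)
        if mid * cycle < 1.0:
            lo = mid   # rate too low: processors could send faster
        else:
            hi = mid   # rate too high: contention would slow them down
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def toy_contention(m: float, service: float = 4.0) -> float:
    """Hypothetical placeholder: delay grows as utilization m * service nears 1."""
    rho = m * service
    if rho >= 1.0:
        return math.inf
    return service * rho / (1.0 - rho)
```

Calling `solve_message_rate(100.0, 20.0, toy_contention)` returns the self-consistent rate; plugging it back into `toy_contention` gives the contention delay per message Cn for that operating point.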
15
LoGPC step-by-step
1. Extract the communication signature {L, o, G}.
2. Estimate the inter-message time(s) {T0} based on the application's communication pattern(s).
3. Estimate application locality (= average message distance).
4. Use the contention model to obtain the contention delay per message {Cn}.
5. Estimate the runtime based on the critical path.
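The last step above can be sketched as follows, under simplifying assumptions that are not in the slides: computation and communication on the critical path do not overlap, every message pays the same overhead o on both sides, and every message sees the same contention delay Cn. Function and parameter names are hypothetical.

```python
from typing import Sequence


def estimate_runtime(compute_time: float, message_sizes: Sequence[int],
                     latency: float, overhead: float,
                     gap_per_byte: float, contention_delay: float) -> float:
    """Critical-path estimate: compute time plus per-message LogGP cost plus Cn."""
    comm_time = sum(
        2 * overhead + (k - 1) * gap_per_byte + latency + contention_delay
        for k in message_sizes
    )
    return compute_time + comm_time


def contention_fraction(compute_time: float, message_sizes: Sequence[int],
                        latency: float, overhead: float,
                        gap_per_byte: float, contention_delay: float) -> float:
    """Share of the estimated runtime that is spent waiting on contention."""
    total = estimate_runtime(compute_time, message_sizes, latency,
                             overhead, gap_per_byte, contention_delay)
    return contention_delay * len(message_sizes) / total
```

Setting `contention_delay` to zero gives the contention-free prediction, so comparing the two estimates shows how much of the runtime the model attributes to contention.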
16
Applications: All-to-all remap
[Plot legend: Measured, LoGPC, No contention]
17
Diamond DAG
Used in DNA chain comparison.
[Plot, random mapping: Measured, LoGPC]
[Plot, perfect mapping: Measured, LogGP = LoGPC]
18
Em3d - hot-spot elimination
Propagation of electromagnetic waves in solids.
Asynchronous communication pattern with bulk transfer.
LoGPC was used to eliminate performance bugs: we improved performance by 20%, reducing contention by up to 70%.
[Diagram labels: Synch, Communication, Synch, Comp]
19
Summary
We found network contention significant:
all-to-all remap: 50%
Diamond DAG: up to 56%
EM3D: up to 70% (overall performance improved 20%)
LoGPC: a simple way to evaluate how much locality matters for an application.
LoGPC: a simple way to evaluate whether network contention is significant for an application.
We used applications that are parallel. We can observe the following division of resources: 25% memory, 75% processing and communication. The cost-performance optimal Raw configuration is the following: We also used the framework to compare designs with different assumptions. Using DRAM inside a tile and a simple 2-byte FIFO router, we obtained better performance for the same cost.