1
Hyperthread Support in OpenVMS V8.3
What to do about Montecito?
September 17, 2018
2
Pre-Summary
We added some features to help you manage hyperthreads
- SHOW CPU/BRIEF displays thread info
- SET CPU/NOCOTHREAD
- [SYSTEST]HTHREAD.EXE
We added some features to reduce hyperthreads hurting or confusing you
- Scheduler change
- Accounting change
You need to experiment with your own application mix to see if hyperthreads help you
3
Definitions of terms
- Processor: A chip or package
- Core: A ‘thing’ within a processor that physically executes programs
- Hyperthread: A ‘thing’ within a core that logically executes programs
- CPU: The OpenVMS abstraction for a ‘thing’ that executes programs
- Thread of execution: Software concept of what a CPU executes
4
What is “Hyperthreading” vs “Dual Core”?
- Both are features of new “Montecito” Itanium chips
- Both abstracted as CPUs on OpenVMS
- Very different in implementation
5
Dual Core
- Two (nearly) complete CPUs on one chip
  - Think two older CPU chips glued together :-)
  - Separate cache, separate processing units, separate state. (Share bus interface)
- Both cores executing simultaneously
6
Montecito Micrograph
[Figure: Montecito die micrograph. Labels: Dual-core; 2 Way Multi-threading; 1MB L2I; Power Management/Frequency Boost (Foxton); 2x12MB L3 caches with Pellston; Soft Error Detection/Correction; Arbiter.]
Notes: For people who like pictures and chip micrographs. Thing to note: left and right halves are flipped mirror images of each other.
7
Dual Cores
[Figure: block diagram of the two cores.]
Notes: If you prefer block diagrams, the thing to note is that everything is duplicated except the system interface logic.
8
Hyperthreading
- Hyperthread: A set of state (e.g. user registers, control registers, IP, etc.) in a core
- Shares execution resources with other threads
- Only one hyperthread active (i.e. executing a program) at once on Montecito
- When a hyperthread blocks, the other hyperthread activates
- Also swaps on a timer
9
Montecito Multi-threading
[Figure: timelines for Serial Execution (Ai, Idle, Ai+1, Bi, Idle, Bi+1) and Montecito Multi-threaded Execution (Ai, Idle, Ai+1 on one hyperthread; Bi, Bi+1 on the other).]
Multi-threading decreases stalls and increases performance
Notes: “A” is one thread of execution, “B” is another thread of execution. In the Montecito multi-threaded section, the top line is one hyperthread and the bottom line is another hyperthread. The serial execution section assumes that there is only a single hyperthread (or an un-threaded core): thread A executes both its part i and part i+1, then the OS swaps in thread B, which executes its part i and part i+1. The Montecito multi-threaded section assumes that one hyperthread is ready to execute thread of execution A and the other is ready to execute thread of execution B. Since A’s stall time can overlap B’s execution time, we get increased performance.
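The overlap argument in the notes above can be checked with a toy timing model. This is only a sketch under stated assumptions, not a description of Montecito hardware: both threads have the same list of compute bursts, every gap between bursts is a stall of fixed length, the core switches hyperthreads only when the running one stalls, and the function names and the `switch_cost` parameter are hypothetical.

```python
def serial_time(bursts, stall):
    """One un-threaded core: thread A runs to completion, then thread B.

    bursts: non-empty list of compute times (assumed identical for A and B).
    stall:  idle time between consecutive bursts of the same thread.
    """
    per_thread = sum(bursts) + stall * (len(bursts) - 1)
    return 2 * per_thread


def multithreaded_time(bursts, stall, switch_cost=0):
    """One core with two hyperthreads; only one executes at any instant.

    When the running hyperthread stalls, the core runs whichever
    hyperthread becomes ready first (toy policy, assumed for illustration).
    """
    remaining = [list(bursts), list(bursts)]  # bursts left for A and B
    ready_at = [0, 0]                         # time each thread can next run
    now = 0
    current = None
    while remaining[0] or remaining[1]:
        candidates = [t for t in (0, 1) if remaining[t]]
        nxt = min(candidates, key=lambda t: ready_at[t])
        now = max(now, ready_at[nxt])         # core idles if nobody is ready
        if current is not None and nxt != current:
            now += switch_cost                # cost of swapping hyperthreads
        now += remaining[nxt].pop(0)          # execute one burst
        ready_at[nxt] = now + stall           # thread stalls after the burst
        current = nxt
    return now


print(serial_time([3, 3], 4), multithreaded_time([3, 3], 4))
```

With two bursts of 3 time units and a stall of 4, B's bursts fit inside A's stall, so the multi-threaded total is shorter than the serial total.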
10
Dynamic Thread Switching
- Speculate that a long latency event will stall execution
  - L3 miss
  - Uncached accesses
- Time outs ensure fairness
- gives software control
- OS has no knowledge or control of hyperthread switches
11
Hyperthread Abstraction in VMS
- Reminder: 1 processor (or package or chip) has
  - 2 Cores
  - 4 Threads
- Each hyperthread appears in OpenVMS as a CPU
- CPUs that share the same core are called “Cothread CPUs”
- Note: Cores that share a processor (or package or chip) are not named or treated differently
12
Identifying CoThread CPUs on OpenVMS
$ show cpu/brief
System: XXXXXX, HP rx4640 (1.40GHz/12.0MB)
CPU 0 State: RUN CPUDB: 8202A Handle: 00005D70 Owner: C Current: C8 Partition 0 Cothd:
CPU 1 State: RUN CPUDB: 820FDF Handle: 00005E80 Cothd:
CPU 2 State: RUN CPUDB: 820FFC Handle: 00005F90 Cothd:
CPU 3 State: RUN CPUDB: 82101A Handle: A0 Cothd:
13
Tradeoffs with Hyperthreads: Basics
- One core with two threads MAY perform better than one core with one thread (but not always)
- One core with two threads NEVER performs as well as two cores
14
Montecito Multi-threading
[Figure: timelines for Serial Execution (Ai, Idle, Ai+1, Bi, Idle, Bi+1) and Montecito Multi-threaded Execution (Ai, Idle, Ai+1 on one hyperthread; Bi, Bi+1 on the other).]
Multi-threading decreases stalls and increases performance
Notes: “A” is one thread of execution, “B” is another thread of execution. In the Montecito multi-threaded section, the top line is one hyperthread and the bottom line is another hyperthread. The serial execution section assumes that there is only a single hyperthread (or an un-threaded core): thread A executes both its part i and part i+1, then the OS swaps in thread B, which executes its part i and part i+1. The Montecito multi-threaded section assumes that one hyperthread is ready to execute thread of execution A and the other is ready to execute thread of execution B. Since A’s stall time can overlap B’s execution time, we get increased performance.
15
Montecito Multi-threading (No Stalls)
[Figure: timelines for Serial Execution (Ai, Ai+1, Bi, Bi+1) and Montecito Multi-threaded Execution (Ai, Ai+1 on one hyperthread; Bi, Bi+1 on the other), with the swap time shown in yellow.]
Notes: But suppose A and B don’t have much stall time. For example, they are carefully designed to stay within their cache. In that case, the two hyperthreads swap because of the timer rather than because of stalling. Since the swap takes some time (the yellow), serial execution could be faster. (Note that we are not assuming any time for the OS to swap the threads in the serial execution case. Not realistic, but illustrative only.)
16
Multi-threading vs Two Cores
[Figure: timelines for Execution on Two Cores (Ai, Ai+1 on one core; Bi, Bi+1 on the other) and Montecito Multi-threaded Execution (Ai, Ai+1 on one hyperthread; Bi, Bi+1 on the other).]
Notes: We said two cores is always faster. That’s because two cores can execute A and B simultaneously. (There is less difference between these cases if we use the stalling version of the threads.)
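The no-stall comparison on the last two slides can be put into numbers with a small sketch. The assumptions are loud ones: each thread needs the same amount of uninterrupted compute, the hyperthreads swap only on a fixed timer quantum, each swap costs a fixed time, the OS swap in the serial case costs nothing (as the notes say), and the names `no_stall_times`, `quantum` and `switch_cost` are made up for illustration.

```python
def no_stall_times(work, quantum, switch_cost):
    """Return (serial, hyperthreaded, two_cores) completion times.

    work:        compute time each of threads A and B needs (no stalls).
    quantum:     timer interval after which the core swaps hyperthreads.
    switch_cost: time lost on each hyperthread swap.
    Assumes work is a whole number of quanta to keep the arithmetic exact.
    """
    if work % quantum != 0:
        raise ValueError("work must be a multiple of quantum in this sketch")
    serial = 2 * work                     # A then B, free OS swap assumed
    slices = 2 * work // quantum          # A and B alternate quantum by quantum
    hyperthreaded = serial + (slices - 1) * switch_cost
    two_cores = work                      # A and B run simultaneously
    return serial, hyperthreaded, two_cores


print(no_stall_times(work=6, quantum=2, switch_cost=1))
```

Under these assumptions the hyperthreaded case is never faster than serial when there are no stalls to hide, and two cores finish in half the serial time.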
17
VMS support for Hyperthreading
Three categories of support
- Managing/getting info
- Reducing “waste” of hyperthread cycles
- Scheduling
18
Managing/Getting Info
- Hyperthread to CPU mapping
  - First thread of all cores followed by second threads
  - Ex: 2 processor system. CPU 0,1,2,3 are all separate cores. CPU 4,5,6,7 are cothreads of 0,1,2,3
- SHOW CPU/BRIEF and /FULL
  - Notes the CPU that is the cothread of the displayed CPU
- SET CPU/[NO]COTHREAD
  - Stops one of the cothreads on the core associated with this CPU
- Accounting
  - Only charge a process ½ the CPU time if the CPU’s cothread is busy
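The numbering rule and the accounting rule above can be sketched as follows. This is an illustration built only from the slide's example (first threads of all cores, then second threads) and its ½-charge statement; the function names are hypothetical and the real OpenVMS accounting granularity is not described here.

```python
def cothread_of(cpu_id, core_count):
    """CPU ID of the other hyperthread on the same core.

    Assumes two hyperthreads per core and the numbering from the slide:
    CPUs 0..core_count-1 are first threads, the rest are second threads.
    """
    if not 0 <= cpu_id < 2 * core_count:
        raise ValueError("cpu_id out of range")
    return (cpu_id + core_count) % (2 * core_count)


def charged_cpu_time(elapsed, cothread_busy):
    """Charge half the elapsed CPU time if the cothread was busy."""
    return elapsed / 2 if cothread_busy else elapsed


# The slide's example: 2 processors x 2 cores = 4 cores, CPUs 0..7.
for cpu in range(8):
    print(cpu, "<->", cothread_of(cpu, core_count=4))
```

For the 2 processor example, CPU 0 pairs with CPU 4 and CPU 7 pairs with CPU 3, matching the slide.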
19
Managing
- EFI command: cpuconfig threads on/off
  - Supported part of EFI
  - Requires two resets: one to get to EFI; one to make the thread command take effect.
- [systest]hthread.exe
  - Like RADCHECK, an unsupported but helpful little utility
  - Check and modify firmware state of hyperthreading
    $ hthread -show
    $ hthread -on
    $ hthread -off
  - Change takes effect after the next reboot (i.e. only a single reset)
20
Reducing Hyperthread Cycle Waste
Main point: A hyperthread spinning in halt or idle still uses cycles that its cothread might have used
- Idle loop between each check for busy
- Power saver mode as usual
- STOP/CPU while halted
- Future possibilities: while spinning on locks?
Tradeoffs abound!
Notes: Reduce waste? Yes, because if a hyperthread is doing something, even spinning in a loop, its cothread cannot be doing useful work. Thus we want to reduce spinning.
21
Scheduler Changes
- Two cores always better than two hyperthreads on the same core, so:
  - Attempt to schedule processes on CPUs without a busy cothread
- Ties in with waste reduction since an idle hyperthread will give up its cycles to its cothread
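The scheduling preference on this slide can be sketched as a simple selection rule. This is not the OpenVMS scheduler; it assumes two hyperthreads per core, the "first threads, then second threads" CPU numbering, and a hypothetical tie-break of lowest CPU ID.

```python
def cothread_of(cpu_id, core_count):
    """Other hyperthread on the same core (numbering assumed as above)."""
    return (cpu_id + core_count) % (2 * core_count)


def pick_cpu(idle_cpus, busy_cpus, core_count):
    """Choose an idle CPU, preferring one whose cothread is not busy.

    Falls back to any idle CPU, and returns None if nothing is idle.
    """
    idle = sorted(idle_cpus)
    for cpu in idle:
        if cothread_of(cpu, core_count) not in busy_cpus:
            return cpu            # whole core is free for this process
    return idle[0] if idle else None


# Two cores, CPUs 0..3; CPU 0 is busy, so its cothread (CPU 2) is avoided.
print(pick_cpu(idle_cpus={1, 2, 3}, busy_cpus={0}, core_count=2))
```

The fallback matters: once every core has a busy hyperthread, a process still gets a CPU, just one that shares its core.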
22
Question you are too polite to ask
Why didn’t you change the scheduler to make good use of hyperthreads?
Answer: We don’t know how.
Seriously, it is VERY application mix dependent.
23
Tradeoffs with Hyperthreading
- Imagine you want to make best use of hyperthreads
- What threads of execution do you run on the same core?
24
Who shares a core?
- Threads that share the same memory space (e.g. kernel threads within a process)
  - They might share some cache and require fewer cache fills and thus perform better!
  - But if they stall less, hyperthreads are less advantageous!
- Threads that have nothing to do with each other
  - More cache misses, so hyperthreads help more
  - But more cache misses means poorer individual performance!
- Clearly there is a tradeoff somewhere, but we can’t make it automatically
25
My recommendation
- Even without threads, Montecito works well
  - Try it with threads off; you will likely be happy
- Experiment with processes on threads
  - Use affinity to group different processes on cothreads, or to avoid cothreads
- Experiment with fastpath CPUs on threads
  - Do you get better throughput spreading I/O across all threads or only using one thread per core?
26
Other features - Soon
- ar.ruc
- NUMA
- Power control
27
Other features further out
- User mode rfi
  - Might allow one to go to an instruction within a bundle
  - Useful for AST returns (maybe?)