Published by Bertram Shelton. Modified over 9 years ago.
1
Peter Lawrey CEO of Higher Frequency Trading JAX Finance 2015 Low Latency: The best way to high throughput
2
Peter Lawrey Java Developer/Consultant for hedge fund and trading firms for 6 years. Most answers for Java and JVM on stackoverflow.com Founder of the Performance Java User’s Group. Architect of Chronicle Software
3
Agenda: Little’s law and concurrency; co-ordinated omission; why should you use fewer servers?
4
Little’s law Little’s law states: the long-term average number of customers in a stable system, L, is equal to the long-term average effective arrival rate, λ, multiplied by the (Palm-)average time a customer spends in the system, W; or expressed algebraically: L = λW
5
Little’s law as work. The number of active workers must be at least the average arrival rate of tasks multiplied by the average time to complete those tasks. workers >= tasks/second * seconds to perform task. Or throughput <= workers / latency.
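The two inequalities on this slide can be sketched in a few lines of Java. This is only an illustration: the class name, method names and the example figures (50,000 tasks/second, 200 µs and 20 µs latencies) are assumptions, not part of the talk. Latency is passed in whole microseconds so the rounding up is done with integer arithmetic.

```java
/** Little's law as work: workers >= tasks/second * seconds per task. Names are illustrative. */
class LittlesLaw {
    /** Minimum number of busy workers for a given arrival rate and per-task latency. */
    static long workersNeeded(long tasksPerSecond, long latencyMicros) {
        long workMicrosPerSecond = tasksPerSecond * latencyMicros;
        // Round up: a fraction of a worker still needs a whole worker.
        return (workMicrosPerSecond + 999_999) / 1_000_000;
    }

    /** Upper bound on throughput: workers / latency. */
    static double maxThroughputPerSecond(int workers, long latencyMicros) {
        return workers * 1_000_000.0 / latencyMicros;
    }

    public static void main(String[] args) {
        // Assumed figures: 50,000 tasks/second, each taking 200 microseconds.
        System.out.println("workers needed: " + workersNeeded(50_000, 200));
        // Same 10 workers, but with latency cut from 200 us to 20 us.
        System.out.println("max throughput at 200 us: " + maxThroughputPerSecond(10, 200));
        System.out.println("max throughput at 20 us:  " + maxThroughputPerSecond(10, 20));
    }
}
```

With the worker count fixed, the only term left to change in throughput <= workers / latency is the latency, which is the argument the following slides build on.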
6
Consequences of Little’s law If you have a problem with a high degree of independent tasks, you can throw more workers at the problem to handle the load, e.g. web services. If you have a problem with a low degree of independent tasks, adding more workers just means more of them will be idle, e.g. many trading systems. There, the solution is to reduce latency to increase throughput.
7
Consequences of Little’s law Average latency is a function, sometimes the inverse, of the throughput. Throughput focuses on the average experience. The worst cases are often the ones that hurt you, but averages are very good at hiding them, e.g. long GC pauses. Testing with co-ordinated omission also hides worst-case latencies.
8
Co-ordinated omission A term coined by Gil Tene. Co-ordinated omission occurs when the system being tested is allowed to apply back pressure on the system doing the testing. When the system being tested is slow, it can effectively pause the test, which distorts the results especially when averages or latency percentiles are considered.
9
Co-ordinated omission: Example A shop is open 10 hours a day between 8 AM and 6 PM. A customer comes every 5 minutes, waits to be served and leaves. When the shopkeeper is there, he takes 1 minute to serve. But if he takes a 2 hour lunch break, how does this affect the average latency or the 98th percentile?
10
How not to measure latency. You have one person go to the shop and time how long she has to wait. Once per day she has to wait 2 hours and 1 minute, but the rest of the day it only takes 1 minute. The average of 97 tests is 2.2 minutes. Had the shopkeeper been there all day, there would have been 120 tests; instead there are 97, one of which took over 2 hours. Not great, but 2.2 minutes doesn’t sound much worse than 1 minute. The 98th percentile is 1 minute.
11
Avoiding co-ordinated omission You have as many people as you need. Most of the time, only one is waiting; however, over the lunch break 30 people are delayed by 121, 117, 113, 109 … 5 minutes. The average of 120 tests is a 16.5 minute wait. This is much higher than the 2.2 minutes calculated previously. The 98th percentile is 111 minutes, instead of 1 minute in the previous test.
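The shop example can be replayed as a small simulation. The sketch below is one possible reading of it, not the speaker's own model: it assumes the lunch break starts at noon (minute 240), that the single tester makes her next visit 5 minutes after her previous arrival or as soon as she has been served, whichever is later, and that percentiles use linear interpolation between ranks. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model of the shop example; times are minutes after opening. */
class ShopLatency {
    static final int OPEN_MINUTES = 600; // 8 AM to 6 PM
    static final int INTERVAL = 5;       // one visit every 5 minutes
    static final int SERVICE = 1;        // 1 minute to serve
    static final int LUNCH_START = 240;  // assumption: lunch starts at noon
    static final int LUNCH_END = 360;    // 2 hours later

    /** When service can begin for someone who reaches the counter at time t. */
    static int serviceStart(int t) {
        return (t >= LUNCH_START && t < LUNCH_END) ? LUNCH_END : t;
    }

    /** Co-ordinated omission: one tester cannot start a visit until her last one ends. */
    static List<Integer> oneTester() {
        List<Integer> waits = new ArrayList<>();
        int arrival = 0;
        while (arrival < OPEN_MINUTES) {
            int finish = serviceStart(arrival) + SERVICE;
            waits.add(finish - arrival);
            arrival = Math.max(arrival + INTERVAL, finish); // the shop delays the test
        }
        return waits;
    }

    /** No omission: customers arrive on schedule regardless, and queue if needed. */
    static List<Integer> independentCustomers() {
        List<Integer> waits = new ArrayList<>();
        int counterFree = 0;
        for (int arrival = 0; arrival < OPEN_MINUTES; arrival += INTERVAL) {
            int finish = serviceStart(Math.max(arrival, counterFree)) + SERVICE;
            waits.add(finish - arrival);
            counterFree = finish;
        }
        return waits;
    }

    static double average(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).average().orElse(0);
    }

    /** Percentile by linear interpolation between ranks (one of several common definitions). */
    static double percentile(List<Integer> xs, double p) {
        List<Integer> sorted = new ArrayList<>(xs);
        Collections.sort(sorted);
        double rank = p * (sorted.size() - 1);
        int lo = (int) Math.floor(rank);
        int hi = Math.min(lo + 1, sorted.size() - 1);
        return sorted.get(lo) + (rank - lo) * (sorted.get(hi) - sorted.get(lo));
    }

    public static void main(String[] args) {
        for (List<Integer> waits : List.of(oneTester(), independentCustomers())) {
            System.out.printf("tests=%d average=%.1f p98=%.1f%n",
                    waits.size(), average(waits), percentile(waits, 0.98));
        }
    }
}
```

Under these assumptions the one-tester run records 97 visits averaging about 2.2 minutes with a 98th percentile of 1 minute, while the independent-customer run records 120 visits averaging 16.5 minutes with an interpolated 98th percentile near 111 minutes. Other percentile definitions give slightly different values for the second run.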
12
Why use fewer servers? You can buy commodity mid-range servers with 38 cores and 512 GB of memory for a reasonable price (under £20K each). An increasing number of libraries support off-heap storage, allowing you to keep much larger datasets in memory.
13
Why use fewer servers? Deploying to one server lowers the cost of development. The cost of development is often higher than the cost of the hardware. Deploying to one server also reduces the network latency, increasing the throughput.
14
Even latencies you can’t see add up

Data passing                Latency                      Human scale            Throughput one at a time
Method call                 Inlined: 0, real call: 50 ns Eye blink              20,000,000/sec
Shared memory               200 ns                       Mouse click            5,000,000/sec
SYSV shared memory          2 µs                         Drop a phone           500,000/sec
Low latency network         8 µs                         Flying a paper plane   125,000/sec
Typical LAN network         30 µs                        Half a minute          30,000/sec
Typical data grid system    800 µs                       Running three miles    1,250/sec
60 Hz power flickers        8 ms                         A football game        120/sec
4G request latency in UK    55 ms                        A summer’s day         18/sec
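The throughput column follows from the latency column when messages are passed one at a time: throughput is roughly 1 / latency. A minimal sketch, with a hypothetical class name and the latencies taken from the table, is below. Integer division truncates, so some results differ slightly from the table, whose figures for the LAN and 60 Hz rows appear to be rounded.

```java
/** One-at-a-time throughput implied by a per-message latency. Names are illustrative. */
class OneAtATimeThroughput {
    /** Messages per second when each message must finish before the next starts. */
    static long perSecond(long latencyNanos) {
        return 1_000_000_000L / latencyNanos;
    }

    public static void main(String[] args) {
        String[] names = {
            "Method call (real call)", "Shared memory", "SYSV shared memory",
            "Low latency network", "Typical LAN network", "Typical data grid system",
            "60 Hz power flicker", "4G request latency in UK"
        };
        long[] latencyNanos = {
            50, 200, 2_000, 8_000, 30_000, 800_000, 8_000_000, 55_000_000
        };
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-26s %,12d ns %,14d/sec%n",
                    names[i], latencyNanos[i], perSecond(latencyNanos[i]));
        }
    }
}
```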
15
Doesn’t the GC stop the world? The GC only pauses the JVM when it has some work to do. Produce less garbage and it will pause less often. Produce less than 1 GB/hour of garbage and you can get less than one pause per day (with a 24 GB Eden).
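The "less than one pause per day" figure is simple arithmetic: time between young collections is roughly Eden size divided by allocation rate. The sketch below assumes, as a simplification, that a collection is triggered only when Eden fills and that all garbage is allocated in Eden; real collectors can be triggered for other reasons. The class and method names are hypothetical.

```java
/** Rough time for a steady allocation rate to fill Eden. A simplification, not a GC model. */
class EdenFillTime {
    static double hoursToFillEden(double edenGb, double garbageGbPerHour) {
        return edenGb / garbageGbPerHour;
    }

    public static void main(String[] args) {
        // Figures from the slide: 24 GB Eden, 1 GB/hour of garbage.
        System.out.println("hours between collections: " + hoursToFillEden(24, 1));
        // Anything below 1 GB/hour takes longer than a day to fill the same Eden.
        System.out.println("at 0.5 GB/hour: " + hoursToFillEden(24, 0.5));
    }
}
```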
16
Do I need to avoid all objects? In Java 8 you can have very short-lived objects placed on the stack. This requires your code to be inlined and escape analysis to kick in. When this happens, no garbage is created and the code is faster. You can have very long-lived objects, provided you don’t have too many. The rest of your data you can place in native memory (off heap). You can create 1 GB/hour of garbage and still not GC.
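As an illustration of the kind of code escape analysis can help with, the sketch below allocates a small object per loop iteration that never leaves the method. The class names are hypothetical. Whether the allocation is actually eliminated is not guaranteed: it depends on the JVM, on the JIT having compiled and inlined the code, and on the JVM's settings. In HotSpot the mechanism is scalar replacement of the object's fields rather than a literal stack allocation.

```java
/** A short-lived object that is a candidate for elimination by escape analysis. */
class EscapeAnalysisSketch {
    static final class Point {
        final double x, y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }

        double lengthSquared() {
            return x * x + y * y;
        }
    }

    static double sumOfSquares(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            // p is never stored in a field, returned, or passed to another thread,
            // so once this is JIT-compiled and inlined the allocation may be removed.
            Point p = new Point(i, i + 1);
            sum += p.lengthSquared();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Run enough iterations for the JIT to have a chance to compile the loop.
        System.out.println(sumOfSquares(10_000_000));
    }
}
```

The result is the same whether or not the allocation is eliminated; only the amount of garbage produced changes, which can be observed with GC logging rather than from the program's output.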
17
Low Latency with lots of Lambdas Chronicle Wire is an API for generic serialization and deserialization. You determine what you want to read/write, but the exact wire format can be injected. This works for Yaml, Binary Yaml, and raw data. It will support XML, FIX, JSON and BSON. This uses lambdas extensively but the objects associated can be eliminated.
18
Low Latency with lots of Lambdas

    wire.writeDocument(false, out -> out.write(() -> "put")
            .marshallable(m -> m.write(() -> "key").int64(n)
                    .write(() -> "value").text(words[n])));

As Yaml

    --- !!data
    put: { key: 1, value: hello }

As Binary Yaml

    ⒗ ٠٠٠ Ãput\u0082 ⒎ ٠٠٠ ⒈ åhello
19
Next Steps Chronicle is open source so you can start right away! Working with clients to produce Chronicle Enterprise. Support contract for Chronicle and consultancy.
20
Q & A Peter Lawrey @PeterLawrey http://chronicle.software http://vanillajava.blogspot.com