Over the past couple of years I’ve helped a number of financial institutions identify and eliminate or significantly reduce sources of latency and jitter in their systems. In the city there’s currently something akin to an arms race, as banks seek to gain a competitive edge by shaving microseconds off transaction times. It’s all about beating the competition to make the right trade at the right price. A saving of one millisecond can be worth millions in profit. And the regulatory bodies need to be able to keep up with this new low latency world too.
Code path length is only part of the picture (although an important one). However, processor architectures with challenging single thread performance (such as Sun’s T-series systems) are still able to offer competitive advantage in scenarios where quick access to CPU resource is a bigger factor. Your mileage will vary.
When it comes to jitter I’ve seen a fair amount of naivety. Just because I have 32 cores in a system doesn’t mean I won’t see issues such as preemption, interrupt pinning, thundering herds and thread migration. Thankfully, DTrace provides the kind of observability I need to identify and quantify such issues, and Solaris often has the features needed to ameliorate them, often without the need to change application code.
I generally find that there is a lot of “low hanging fruit”, and am often able to demonstrate a dramatic reduction in jitter and absolute latency in a short amount of time. You may have seen some pretty big claims for DTrace, but in my experience it is hard to over-hype what can be achieved. It’s not just about shaving milliseconds of transaction times, but about reducing the amount of hardware that needs to be thrown at the problem.
could you maybe show one detailed example (if there is one you are allowed to make public) of how you found the cause of latency and removed it?