Cpp | System & Software Engineering

The Kernel Tax: IPC Latency and the Cost of Boundary Crossings

Every pipe round-trip crosses the user/kernel boundary four times. Each crossing costs a minimum of 101 ns on modern hardware. Shared memory crosses it zero times. A forensic breakdown of what the kernel actually charges per message, and why Tachyon reaches 56.5 ns where pipes cost 6.2 µs.

The Instruction Count Lie: Latency, Throughput, and the Dependency Graph

Counting instructions is not measuring execution time. A deep dive into how Read-After-Write hazards serialize the Out-of-Order engine, and how restructuring dependencies unlocks the superscalar machine.
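As a rough illustration of the idea (not taken from the article itself), the sketch below sums the same data two ways. The function names `sum_serial` and `sum_split` are hypothetical. The first keeps one accumulator, so each add waits on the previous one; the second uses four independent accumulators, which gives the out-of-order engine four chains to overlap. Whether this is actually faster depends on the compiler and the hardware, and an optimizing compiler may already perform this transformation for integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One accumulator: every add depends on the result of the previous add,
// forming a single serial Read-After-Write chain.
std::uint64_t sum_serial(const std::vector<std::uint64_t>& v) {
    std::uint64_t acc = 0;
    for (std::uint64_t x : v) acc += x;
    return acc;
}

// Four accumulators: four independent dependency chains that a
// superscalar core is free to execute in parallel.
std::uint64_t sum_split(const std::vector<std::uint64_t>& v) {
    std::uint64_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    for (; i < n; ++i) a0 += v[i];  // leftover elements
    return (a0 + a1) + (a2 + a3);
}
```

Both functions execute roughly the same number of adds; only the shape of the dependency graph differs.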

The Invisible Lock: Cache Coherency and the Physics of False Sharing

You removed the mutex. The CPU added a hardware lock. A deep dive into how the MESI protocol and false sharing silently destroy multi-core scaling.
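A minimal layout sketch of the problem, assuming 64-byte cache lines (typical on x86-64, not universal). The struct names and the `kCacheLine` constant are hypothetical and chosen for illustration only. In the packed version two threads that each write their own counter still fight over one cache line; in the padded version each counter sits on its own line.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine = 64;

// Both counters will usually share one cache line, so independent writers
// keep invalidating each other's copy of that line.
struct PackedCounters {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// Each counter is aligned to its own cache line, so the writers no longer
// touch the same line.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<long> a{0};
    alignas(kCacheLine) std::atomic<long> b{0};
};

static_assert(sizeof(PaddedCounters) == 2 * kCacheLine,
              "expected one cache line per counter");
```

The trade is memory for scaling: the padded struct is several times larger, which only pays off when different cores really do write the two fields concurrently.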

The O(1) Illusion: Why Pointer Chasing is the Death of Throughput

Algorithmic complexity assumes memory is flat and fast. It isn’t. A deep dive into why contiguous arrays destroy linked lists on modern superscalar CPUs.

The 'If' Is an Admission of Failure: When Algebra Replaces Decision

Why your Clean Code is poison for the CPU pipeline. Analyzing branch misprediction costs.
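To make "algebra replaces decision" concrete, here is a small sketch with hypothetical function names. It computes a maximum once with an `if` and once with a mask built from the comparison result. It assumes two's-complement integers, which C++20 guarantees. Compilers often turn the branchy form into a conditional move on their own, so treat this as an illustration of the technique rather than a guaranteed speedup.

```cpp
#include <cstdint>

// Branchy: the CPU must predict the outcome of the comparison.
std::int32_t max_branchy(std::int32_t a, std::int32_t b) {
    if (a < b) return b;
    return a;
}

// Branchless: the comparison becomes a mask (all ones or all zeros),
// and the result is selected with bitwise arithmetic.
std::int32_t max_branchless(std::int32_t a, std::int32_t b) {
    const std::int32_t mask = -static_cast<std::int32_t>(a < b);  // -1 if a < b, else 0
    return (b & mask) | (a & ~mask);
}
```

The branchless form does the same work on every call, which is exactly why it wins on unpredictable data and can lose on predictable data.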