Today, after some time-killing IETF debates, I started to analyze the in-kernel execution time of different BPF filters. I began with no filter, which is translated into a single @BPF_RET|BPF_K@ opcode, and worked up to some more complex programs. The average execution time is around 300ns for no filter and somewhat above 350ns for a simple ICMP filter with 17 CPU instructions on my x86_64 machine (excluding call overhead).

The next image illustrates this (statistically sampled data):

images/bpf-complex.png
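
For reference, the "no filter" case mentioned above can be written down as a one-instruction classic BPF program. This is only a minimal sketch using the definitions from @linux/filter.h@; the names @ACCEPT_ALL@, @accept_all_insns@ and @accept_all_prog@ are made up for illustration, and the returned snapshot length is an assumption (a socket filter's return value is the number of bytes to keep, and 0 drops the packet).

```c
#include <linux/filter.h>   /* struct sock_filter, struct sock_fprog, BPF_STMT */

/* Assumed accept length: a nonzero return keeps up to that many bytes. */
#define ACCEPT_ALL 0xffffffffu

/* The whole "no filter" program: a single BPF_RET|BPF_K instruction. */
static struct sock_filter accept_all_insns[] = {
    BPF_STMT(BPF_RET | BPF_K, ACCEPT_ALL),
};

/* Hypothetical helper: wrap the instruction array in a sock_fprog. */
static struct sock_fprog accept_all_prog(void)
{
    struct sock_fprog prog = {
        .len    = sizeof(accept_all_insns) / sizeof(accept_all_insns[0]),
        .filter = accept_all_insns,
    };
    return prog;
}
```

A program like this could then be attached to a socket with @setsockopt()@ and @SO_ATTACH_FILTER@; the measurements above were taken inside the kernel, so this only shows what the filter itself looks like, not how the timing was done.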