LinkedIn
Leon Vak Embedded Division Manager @abra
22/04/2024

Introduction

There’s something comforting about starting a kernel module with a DEFINE_KFIFO. 
It gives you the feeling that things will be simple. Efficient. Boring, even. 
Just a ring buffer between an interrupt handler and a worker thread. No mutexes. No allocations. No surprises. 

And so I did. I had one DMA channel, in cyclic scatter-gather (SG) mode, feeding me data at an extremely fast rate. The interrupt callback would read the current descriptor index, derive a block number between 0 and 64k, and push it into a kfifo. The receiving thread would pop from it, map the number back into a memory block, and ship it over TCP using kernel sockets.
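The index-to-block mapping is the only arithmetic in that pipeline. Here is a minimal user-space sketch of it, assuming the DMA area is one contiguous region split into equal-sized blocks; struct dma_region and dma_block_addr() are hypothetical names for illustration, not code from the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one cyclic DMA region split into equal blocks. */
struct dma_region {
	uint8_t *base;      /* start of the receive area */
	size_t block_size;  /* bytes per block */
	uint32_t nr_blocks; /* number of blocks in the cycle */
};

/* Map a block index popped from the FIFO back to the block's address. */
static uint8_t *dma_block_addr(const struct dma_region *r, uint16_t idx)
{
	assert(idx < r->nr_blocks);
	return r->base + (size_t)idx * r->block_size;
}
```

Because only a 16-bit index travels through the FIFO, the producer side stays tiny: the interrupt path pushes two bytes and leaves.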

Worked fine. No drama. 

 

When "working fine" is a trap 

Then came the second DMA. 

Smaller buffers. Only 128 of them. 

But otherwise – same concept. New thread, new socket, new base address. 

And since I'm the type who dislikes duplicating logic for fun, I decided to make the code generic: 

A configuration array holding per-DMA settings, including buffer sizes, sockets, and… FIFOs. 

You can probably guess where this is going. 

 

The first mistake: trying to be clean 

I started by defining two kfifos statically, just like I did before: 

DEFINE_KFIFO(dma0_fifo, u16, 65536); 
DEFINE_KFIFO(dma1_fifo, u16, 128); 

And then I wanted to reference both in my DMA config array. Not the data inside them – the FIFOs themselves. 

But it didn’t work. The compiler wasn’t happy. 
At first I thought I just got something wrong: type mismatch, pointer level, whatever. But no. Turns out, each DEFINE_KFIFO(…) expands into a different struct type. Not a typedef, not a macro alias – a real, unique C struct based on the type and size. 

So I couldn’t point to both of them with a single struct kfifo *, and a void * would discard the very type information the kfifo macros depend on. There was no way around it without breaking type safety, or jumping through hoops I didn’t want to know existed.
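The effect is easy to reproduce outside the kernel. The following user-space sketch uses a hypothetical DEFINE_TOY_FIFO macro that only mimics the shape of DEFINE_KFIFO (it is not the kernel's definition): each use declares a variable of a fresh anonymous struct type, so two FIFOs never share a type even when their element type matches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for DEFINE_KFIFO: every use declares a variable of a
 * brand-new anonymous struct type whose layout depends on type and size. */
#define DEFINE_TOY_FIFO(name, type, size) \
	struct { \
		unsigned int in; \
		unsigned int out; \
		type buf[size]; \
	} name

DEFINE_TOY_FIFO(dma0_fifo, uint16_t, 65536);
DEFINE_TOY_FIFO(dma1_fifo, uint16_t, 128);

/* Same element type, different size: the two objects have different layouts. */
static_assert(sizeof(dma0_fifo) != sizeof(dma1_fifo), "distinct layouts");

/* void * can hold either address, but it erases the information that
 * type-generic macros rely on (element type and capacity). */
void *toy_fifo_table[] = { &dma0_fifo, &dma1_fifo };

/* Capacity is recoverable only while the concrete type is still visible. */
#define TOY_FIFO_CAPACITY(fifo) (sizeof((fifo).buf) / sizeof((fifo).buf[0]))
```

TOY_FIFO_CAPACITY(dma1_fifo) works, but nothing comparable can be applied to toy_fifo_table[1]: once the address is stored as void *, the macro has no type left to inspect.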

The second mistake: trusting it worked just because it ran 

So I switched to DECLARE_KFIFO_PTR() and kfifo_alloc(), acknowledging that my use case now required runtime configuration. 

I wanted to pass FIFO pointers through my DMA config array, just like everything else. The code compiled, the FIFOs were allocated, and the whole system appeared to work. Threads launched, data moved, and I started feeling optimistic.

Then, the kernel crashed. 

No exotic edge case. No obscure race. Just a FIFO that wrapped around – and chaos. 

That’s when I realized: the macro-based type system in kfifo wasn’t just for show. Even though everything looked like it lined up, I was breaking the implicit contract. 

Going deeper (into the underscore abyss)

Naturally, I did what any self-respecting kernel dev would do: I tried to outsmart the system. 

I thought, “Well, if the macros are generating a struct, surely I can just use the same struct myself, right?” 

So I dug into <linux/kfifo.h>, and found this gem: 

struct kfifo __STRUCT_KFIFO_PTR(unsigned char, 0, void); 

Which is built from this macro:

#define __STRUCT_KFIFO_PTR(type, recsize, ptrtype) \
{ \
	__STRUCT_KFIFO_COMMON(type, recsize, ptrtype); \
	type		buf[0]; \
}

And deeper:

#define __STRUCT_KFIFO_COMMON(datatype, recsize, ptrtype) \
	union { \
		struct __kfifo	kfifo; \
		datatype	*type; \
		const datatype	*const_type; \
		char		(*rectype)[recsize]; \
		ptrtype		*ptr; \
		ptrtype const	*ptr_const; \
	}

So eventually, the real internal data lives here: 

struct __kfifo {
	unsigned int	in;
	unsigned int	out;
	unsigned int	mask;
	unsigned int	esize;
	void		*data;
};

All prefixed with double underscores. All very much meant to be private. And yes, I still thought it’d be fine to cast my way through it. Maybe wrap it all in a void *, or just reach into the .kfifo union member directly. 

Needless to say, it compiled. And it even ran. For a while. 

But the moment the FIFO wrapped (probably the first time an index ran past the end of the buffer and the mask arithmetic kicked in), the kernel threw its hands in the air.

That crash was earned. 
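For context on what "wrapping" means here: as far as I can tell from reading the header, in and out are free-running counters, and mask (capacity minus one) is applied only when a counter is turned into a buffer position. The user-space toy below models that arithmetic with hypothetical names. It illustrates the indexing scheme only and is not a reconstruction of the crash.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of free-running ring indices; names are illustrative only. */
struct toy_idx {
	uint32_t in;   /* total elements ever written */
	uint32_t out;  /* total elements ever read */
	uint32_t mask; /* capacity - 1, capacity must be a power of two */
};

static struct toy_idx toy_idx_init(uint32_t capacity)
{
	/* The masking trick only works for power-of-two capacities. */
	assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
	return (struct toy_idx){ .in = 0, .out = 0, .mask = capacity - 1 };
}

/* Number of queued elements; unsigned subtraction survives counter overflow. */
static uint32_t toy_len(const struct toy_idx *f)
{
	return f->in - f->out;
}

/* Slot in the backing array that the next write would land in. */
static uint32_t toy_write_slot(const struct toy_idx *f)
{
	return f->in & f->mask;
}
```

With a capacity of 128, a counter value of 130 lands in slot 2. Everything depends on mask, the element size, and the data pointer agreeing with the buffer that was actually allocated, which is exactly what a careless cast can silently break.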

 

When I started looking around 

At that point, I figured maybe I was doing something unusual. 

So I started searching through the kernel tree for examples of people using kfifo the way I wanted to – dynamically, generically, or at least with more than one. 

I didn’t find much. 

Most uses were exactly what you’d expect: 

One FIFO, one device. Fixed size. Fixed type. Statistically boring, but functionally solid. Even the official kernel samples under samples/kfifo/ stick to simple cases. No shared logic, no dynamic behavior. Certainly no abstractions. 

Eventually I stumbled across a post by Stefani Seibold, the original author of kfifo, bluntly stating: 

“You should avoid using struct kfifo. Use the macro DECLARE_KFIFO_PTR, DECLARE_KFIFO or DEFINE_KFIFO instead.” 

Which, in kernel-speak, means: 

“We made this efficient and type-safe. Don’t try to make it flexible.” 

 

 

Why it’s like this, and why it’s not your fault 

Now, to be fair, this whole design wasn’t accidental. 

The kfifo API was deliberately built this way, and for good reasons. 

By generating unique types for each buffer (based on both the element type and buffer size), the macros let the compiler inline the hot paths. Little function call overhead. Full type checking. Tiny fast paths. In performance-critical or lock-free paths, which are common in drivers and embedded code, that kind of optimization really matters.

Stefani Seibold described the code as: “complete hand crafted and optimized” for speed and small size. 

And honestly, that wasn’t wrong. For the use cases it was designed for (one buffer, one producer, one consumer, statically declared) it works beautifully. Most in-kernel users follow this exact model. 

The problems only start when you step outside that box. 

Want to dynamically allocate FIFOs? Use different sizes? Pass them around in arrays or structs? 

That’s when the wheels start to wobble, because there’s no abstraction layer to help you.

There’s no official wrapper, no interface to unify them. The macros do all the heavy lifting, and they don’t leave a lot of room for flexibility. 

Even the kernel samples quietly reinforce this idea: use a kfifo, but use it the way it wants to be used. Don’t try to be creative. 

So, if you find yourself in a situation where kfifo doesn't quite fit, you're not wrong. You’re just not in its target audience anymore. 

 

What kfifo is (and isn’t) 

- Let’s be fair. kfifo is solid engineering. Lockless, efficient, and type-safe.

- Perfect when you need a single-producer/single-consumer buffer.

- Excellent when used in the narrow lane it was designed for.

But it is also highly specialized. 

You may find yourself stepping into macro labyrinths, wondering if typeof((fifo)+1) is casting your soul into undefined behavior. You will scroll through <linux/kfifo.h> looking for salvation, only to find unions and typedefs that look like compiler incantations. And worst of all, you might convince yourself you almost made it work – until a kernel panic tells you otherwise. 

At that point, writing your own simple FIFO (even a limited one) starts to feel not just reasonable, but responsible. 

That’s what I did. It was small, readable, and didn’t try to be anything more than it had to be. No compile-time type generation. No hidden traps. Just a ring buffer. The kind you’d write in college, and then quietly thank later in life for not being clever. The irony is that this is exactly what kfifo was supposed to prevent: endless private FIFO implementations scattered across the kernel codebase.
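For a sense of scale, here is a minimal sketch of such a ring buffer. It is not the code from the driver: it is a user-space approximation using C11 atomics, with hypothetical names (struct idx_ring and friends), assuming one producer, one consumer, 16-bit elements, and caller-provided storage with a power-of-two capacity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* One concrete type for every FIFO, whatever its capacity. */
struct idx_ring {
	_Atomic uint32_t head; /* advanced only by the producer */
	_Atomic uint32_t tail; /* advanced only by the consumer */
	uint32_t mask;         /* capacity - 1 */
	uint16_t *slots;       /* caller-provided storage */
};

static void idx_ring_init(struct idx_ring *r, uint16_t *slots, uint32_t capacity)
{
	assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
	r->mask = capacity - 1;
	r->slots = slots;
}

/* Producer side: returns false when the ring is full. */
static bool idx_ring_push(struct idx_ring *r, uint16_t v)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail > r->mask)
		return false;
	r->slots[head & r->mask] = v;
	/* Publish the slot before the consumer can observe the new head. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool idx_ring_pop(struct idx_ring *r, uint16_t *v)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return false;
	*v = r->slots[tail & r->mask];
	/* Hand the slot back to the producer. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}
```

Because capacity is a runtime field rather than part of the type, a 65536-entry ring and a 128-entry ring are both plain struct idx_ring objects and can sit side by side in one config array. A kernel version would presumably swap the C11 atomics for smp_load_acquire() and smp_store_release(), but that part is an assumption, not something shown here.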

 
