Essential Linux Device Drivers

Sreekrishnan Venkateswaran

Mentioned 8

One of the world's most experienced Linux driver developers demonstrates how to develop reliable Linux drivers for virtually any device. This resource is for any programmer with a working knowledge of operating systems and C, including programmers who have never written drivers before.

More on

Mentioned in questions and answers.

I want to learn linux kernel programming.

What would be the starting points for that ? What could be some of the simpler problems to target ?

thanks in advance

**TODO** +editPic: Linux Kernel Developer -> (Ring Layer 0)
         +addSection: Kernel Virtualization Engine

KERN_WARN_CODING_STYLE: Do not Loop unless you absolutely have to.

Recommended Books for the Uninitialized void *i

"Men do not understand books until they have a certain amount of life, or at any rate no man understands a deep book, until he has seen and lived at least part of its contents". –Ezra Pound

A journey of a thousand code-miles must begin with a single step. If you are in confusion about which of the following books to start with, don't worry, pick any one of your choice. Not all those who wander are lost. As all roads ultimately connect to highway, you will explore new things in your kernel journey as the pages progress without meeting any dead ends, and ultimately connect to the code-set. Read with alert mind and remember: Code is not Literature.

What is left is not a thing or an emotion or an image or a mental picture or a memory or even an idea. It is a function. A process of some sort. An aspect of Life that could be described as a function of something "larger". And therefore, it appears that it is not really "separate" from that something else. Like the function of a knife - cutting something - is not, in fact, separate from the knife itself. The function may or may not be in use at the moment, but it is potentially NEVER separate.

Solovay Strassen Derandomized Algorithm for Primality Test:

Solovay Strassen Derandomized Algorithm for Primality Test

Read not to contradict and confute; nor to believe and take for granted; nor to find talk and discourse; but to weigh and consider. Some books are to be tasted, others to be swallowed, and some few to be chewed and digested: that is, some books are to be read only in parts, others to be read, but not curiously, and some few to be read wholly, and with diligence and attention.

static void tasklet_hi_action(struct softirq_action *a)
        struct tasklet_struct *list;

        list = __this_cpu_read(tasklet_hi_vec.head);
        __this_cpu_write(tasklet_hi_vec.head, NULL);
        __this_cpu_write(tasklet_hi_vec.tail, this_cpu_ptr(&tasklet_hi_vec.head));

        while (list) {
                struct tasklet_struct *t = list;

                list = list->next;

                if (tasklet_trylock(t)) {
                        if (!atomic_read(&t->count)) {
                                if (!test_and_clear_bit(TASKLET_STATE_SCHED,

                t->next = NULL;
                *__this_cpu_read(tasklet_hi_vec.tail) = t;
                __this_cpu_write(tasklet_hi_vec.tail, &(t->next));

Core Linux ( 5 -> 1 -> 3 -> 2 -> 7 -> 4 -> 6 )

“Nature has neither kernel nor shell; she is everything at once” -- Johann Wolfgang von Goethe

Reader should be well versed with operating system concepts; a fair understanding of long running processes and its differences with processes with short bursts of execution; fault tolerance while meeting soft and hard real time constraints. While reading, it's important to understand and n/ack the design choices made by the linux kernel source in the core subsystems.

Threads [and] signals [are] a platform-dependent trail of misery, despair, horror and madness (~Anthony Baxte). That being said you should be a self-evaluating C expert, before diving into the kernel. You should also have good experience with Linked Lists, Stacks, Queues, Red Blacks Trees, Hash Functions, et al.

volatile int i;
int main(void)
    int c;
    for (i=0; i<3; i++) {
        c = i&&&i;
        printf("%d\n", c);    /* find c */
    return 0;

The beauty and art of the Linux Kernel source lies in the deliberate code obfuscation used along. This is often necessitated as to convey the computational meaning involving two or more operations in a clean and elegant way. This is especially true when writing code for multi-core architecture.

Video Lectures on Real-Time Systems, Task Scheduling, Memory Compression, Memory Barriers, SMP

#ifdef __compiler_offsetof
#define offsetof(TYPE,MEMBER) __compiler_offsetof(TYPE,MEMBER)
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
  1. Linux Kernel Development - Robert Love
  2. Understanding the Linux Kernel - Daniel P. Bovet, Marco Cesati
  3. The Art of Linux KerneL Design - Yang Lixiang
  4. Professional Linux Kernel Architecture - Wolfgang Mauerer
  5. Design of the UNIX Operating System - Maurice J. Bach
  6. Understanding the Linux Virtual Memory Manager - Mel Gorman
  7. Linux Kernel Internals - Tigran Aivazian
  8. Embedded Linux Primer - Christopher Hallinan

Linux Device Drivers ( 1 -> 2 -> 4 -> 3 -> 8 -> ... )

"Music does not carry you along. You have to carry it along strictly by your ability to really just focus on that little small kernel of emotion or story". -- Debbie Harry

Your task is basically to establish a high speed communication interface between the hardware device and the software kernel. You should read the hardware reference datasheet/manual to understand the behavior of the device and it's control and data states and provided physical channels. Knowledge of Assembly for your particular architecture and a fair knowledge of VLSI Hardware Description Languages like VHDL or Verilog will help you in the long run.

Q: But, why do I have to read the hardware specs?

A: Because, "There is a chasm of carbon and silicon the software can't bridge" - Rahul Sonnad

However, the above doesn't poses a problem for Computational Algorithms (Driver code - bottom-half processing), as it can be fully simulated on a Universal Turing Machine. If the computed result holds true in the mathematical domain, it's a certainty that it is also true in the physical domain.

Video Lectures on Linux Device Drivers (Lec. 17 & 18), Anatomy of an Embedded KMS Driver, Pin Control and GPIO Update, Common Clock Framework, Write a Real Linux Driver - Greg KH

static irqreturn_t phy_interrupt(int irq, void *phy_dat)
         struct phy_device *phydev = phy_dat;

         if (PHY_HALTED == phydev->state)
                 return IRQ_NONE;                /* It can't be ours.  */

         /* The MDIO bus is not allowed to be written in interrupt
          * context, so we need to disable the irq here.  A work
          * queue will write the PHY to disable and clear the
          * interrupt, and then reenable the irq line.

         queue_work(system_power_efficient_wq, &phydev->phy_queue);

         return IRQ_HANDLED;
  1. Linux Device Drivers - Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman
  2. Essential Linux Device Drivers - Sreekrishnan Venkateswaran
  3. Writing Linux Device Drivers - Jerry Cooperstein
  4. The Linux Kernel Module Programming Guide - Peter Jay Salzman, Michael Burian, Ori Pomerantz
  5. Linux PCMCIA Programmer's Guide - David Hinds
  6. Linux SCSI Programming Howto - Heiko Eibfeldt
  7. Serial Programming Guide for POSIX Operating Systems - Michael R. Sweet
  8. Linux Graphics Drivers: an Introduction - Stéphane Marchesin
  9. Programming Guide for Linux USB Device Drivers - Detlef Fliegl
  10. The Linux Kernel Device Model - Patrick Mochel

Kernel Networking ( 1 -> 2 -> 3 -> ... )

“Call it a clan, call it a network, call it a tribe, call it a family: Whatever you call it, whoever you are, you need one.” - Jane Howard

Understanding a packet walk-through in the kernel is a key to understanding kernel networking. Understanding it is a must if we want to understand Netfilter or IPSec internals, and more. The two most important structures of linux kernel network layer are: struct sk_buff and struct net_device

static inline int sk_hashed(const struct sock *sk)
        return !sk_unhashed(sk);
  1. Understanding Linux Network Internals - Christian Benvenuti
  2. Linux Kernel Networking: Implementation and Theory - Rami Rosen
  3. UNIX Network Programming - W. Richard Stevens
  4. The Definitive Guide to Linux Network Programming - Keir Davis, John W. Turner, Nathan Yocom
  5. The Linux TCP/IP Stack: Networking for Embedded Systems - Thomas F. Herbert
  6. Linux Socket Programming by Example - Warren W. Gay
  7. Linux Advanced Routing & Traffic Control HOWTO - Bert Hubert

Kernel Debugging ( 1 -> 4 -> 9 -> ... )

Unless in communicating with it one says exactly what one means, trouble is bound to result. ~Alan Turing, about computers

Brian W. Kernighan, in the paper Unix for Beginners (1979) said, "The most effective debugging tool is still careful thought, coupled with judiciously placed print statements". Knowing what to collect will help you to get the right data quickly for a fast diagnosis. The great computer scientist Edsger Dijkstra once said that testing can demonstrate the presence of bugs but not their absence. Good investigation practices should balance the need to solve problems quickly, the need to build your skills, and the effective use of subject matter experts.

There are times when you hit rock-bottom, nothing seems to work and you run out of all your options. Its then that the real debugging begins. A bug may provide the break you need to disengage from a fixation on the ineffective solution.

Video Lectures on Kernel Debug and Profiling, Core Dump Analysis, Multicore Debugging with GDB, Controlling Multi-Core Race Conditions, Debugging Electronics

/* Buggy Code -- Stack frame problem
 * If you require information, do not free memory containing the information
char *initialize() {
  char string[80];
  char* ptr = string;
  return ptr;

int main() {
  char *myval = initialize();
/*  “When debugging, novices insert corrective code; experts remove defective code.”
 *     – Richard Pattis
 printk("The above can be considered as Development and Review in Industrial Practises");
  1. Linux Debugging and Performance Tuning - Steve Best
  2. Linux Applications Debugging Techniques - Aurelian Melinte
  3. Debugging with GDB: The GNU Source-Level Debugger - Roland H. Pesch
  4. Debugging Embedded Linux - Christopher Hallinan
  5. The Art of Debugging with GDB, DDD, and Eclipse - Norman S. Matloff
  6. Why Programs Fail: A Guide to Systematic Debugging - Andreas Zeller
  7. Software Exorcism: A Handbook for Debugging and Optimizing Legacy Code - Bill Blunden
  8. Debugging: Finding most Elusive Software and Hardware Problems - David J. Agans
  9. Debugging by Thinking: A Multidisciplinary Approach - Robert Charles Metzger
  10. Find the Bug: A Book of Incorrect Programs - Adam Barr

File Systems ( 1 -> 2 -> 6 -> ... )

"I wanted to have virtual memory, at least as it's coupled with file systems". -- Ken Thompson

On a UNIX system, everything is a file; if something is not a file, it is a process, except for named pipes and sockets. In a file system, a file is represented by an inode, a kind of serial number containing information about the actual data that makes up the file. The Linux Virtual File System VFS caches information in memory from each file system as it is mounted and used. A lot of care must be taken to update the file system correctly as data within these caches is modified as files and directories are created, written to and deleted. The most important of these caches is the Buffer Cache, which is integrated into the way that the individual file systems access their underlying block storage devices.

Video Lectures on Storage Systems, Flash Friendly File System

long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
        struct open_flags op;
        int fd = build_open_flags(flags, mode, &op);
        struct filename *tmp;

        if (fd)
                return fd;

        tmp = getname(filename);
        if (IS_ERR(tmp))
                return PTR_ERR(tmp);

        fd = get_unused_fd_flags(flags);
        if (fd >= 0) {
                struct file *f = do_filp_open(dfd, tmp, &op);
                if (IS_ERR(f)) {
                        fd = PTR_ERR(f);
                } else {
                        fd_install(fd, f);
        return fd;

SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
        if (force_o_largefile())
                flags |= O_LARGEFILE;

        return do_sys_open(AT_FDCWD, filename, flags, mode);
  1. Linux File Systems - Moshe Bar
  2. Linux Filesystems - William Von Hagen
  3. UNIX Filesystems: Evolution, Design, and Implementation - Steve D. Pate
  4. Practical File System Design - Dominic Giampaolo
  5. File System Forensic Analysis - Brian Carrier
  6. Linux Filesystem Hierarchy - Binh Nguyen
  7. BTRFS: The Linux B-tree Filesystem - Ohad Rodeh
  8. StegFS: A Steganographic File System for Linux - Andrew D. McDonald, Markus G. Kuhn

Security ( 1 -> 2 -> 8 -> 4 -> 3 -> ... )

"UNIX was not designed to stop its users from doing stupid things, as that would also stop them from doing clever things". — Doug Gwyn

No technique works if it isn't used. Ethics change with technology.

"F × S = k" the product of freedom and security is a constant. - Niven's Laws

Cryptography forms the basis of trust online. Hacking is exploiting security controls either in a technical, physical or a human-based element. Protecting the kernel from other running programs is a first step toward a secure and stable system, but this is obviously not enough: some degree of protection must exist between different user-land applications as well. Exploits can target local or remote services.

“You can't hack your destiny, brute need a back door, a side channel into Life." ― Clyde Dsouza

Computers do not solve problems, they execute solutions. Behind every non-deterministic algorithmic code, there is a determined mind. -- /var/log/dmesg

Video Lectures on Cryptography and Network Security, Namespaces for Security, Protection Against Remote Attacks, Secure Embedded Linux

env x='() { :;}; echo vulnerable' bash -c "echo this is a test for Shellsock"
  1. Hacking: The Art of Exploitation - Jon Erickson
  2. The Rootkit Arsenal: Escape and Evasion in the Dark Corners of the System - Bill Blunden
  3. Hacking Exposed: Network Security Secrets - Stuart McClure, Joel Scambray, George Kurtz
  4. A Guide to Kernel Exploitation: Attacking the Core - Enrico Perla, Massimiliano Oldani
  5. The Art of Memory Forensics - Michael Hale Ligh, Andrew Case, Jamie Levy, AAron Walters
  6. Practical Reverse Engineering - Bruce Dang, Alexandre Gazet, Elias Bachaalany
  7. Practical Malware Analysis - Michael Sikorski, Andrew Honig
  8. Maximum Linux Security: A Hacker's Guide to Protecting Your Linux Server - Anonymous
  9. Linux Security - Craig Hunt
  10. Real World Linux Security - Bob Toxen

Kernel Source ( 0.11 -> 2.4 -> 2.6 -> 3.18 )

"Like wine, the mastery of kernel programming matures with time. But, unlike wine, it gets sweeter in the process". --Lawrence Mucheka

You might not think that programmers are artists, but programming is an extremely creative profession. It's logic-based creativity. Computer science education cannot make anybody an expert programmer any more than studying brushes and pigment can make somebody an expert painter. As you already know, there is a difference between knowing the path and walking the path; it is of utmost importance to roll up your sleeves and get your hands dirty with kernel source code. Finally, with your thus gained kernel knowledge, wherever you go, you will shine.

Immature coders imitate; mature coders steal; bad coders deface what they take, and good coders make it into something better, or at least something different. The good coder welds his theft into a whole of feeling which is unique, utterly different from that from which it was torn.

Video Lectures on Kernel Recipes

├── boot
│   ├── bootsect.s      head.s      setup.s
├── fs
│   ├── bitmap.c    block_dev.c buffer.c        char_dev.c  exec.c
│   ├── fcntl.c     file_dev.c  file_table.c    inode.c     ioctl.c
│   ├── namei.c     open.c      pipe.c          read_write.c
│   ├── stat.c      super.c     truncate.c
├── include
│   ├── a.out.h     const.h     ctype.h     errno.h     fcntl.h
│   ├── signal.h    stdarg.h    stddef.h    string.h    termios.h
│   ├── time.h      unistd.h    utime.h
│   ├── asm
│   │   ├── io.h    memory.h    segment.h   system.h
│   ├── linux
│   │   ├── config.h    fdreg.h fs.h    hdreg.h     head.h
│   │   ├── kernel.h    mm.h    sched.h sys.h       tty.h
│   ├── sys
│   │   ├── stat.h      times.h types.h utsname.h   wait.h
├── init
│   └── main.c
├── kernel
│   ├── asm.s       exit.c      fork.c      mktime.c    panic.c
│   ├── printk.c    sched.c     signal.c    sys.c       system_calls.s
│   ├── traps.c     vsprintf.c
│   ├── blk_drv
│   │   ├── blk.h   floppy.c    hd.c    ll_rw_blk.c     ramdisk.c
│   ├── chr_drv
│   │   ├── console.c   keyboard.S  rs_io.s
│   │   ├── serial.c    tty_io.c    tty_ioctl.c
│   ├── math
│   │   ├── math_emulate.c
├── lib
│   ├── close.c  ctype.c  dup.c     errno.c  execve.c  _exit.c
│   ├── malloc.c open.c   setsid.c  string.c wait.c    write.c
├── Makefile
├── mm
│   ├── memory.c page.s
└── tools
    └── build.c
  1. Beginner's start with Linux 0.11 source (less than 20,000 lines of source code). After 20 years of development, compared with Linux 0.11, Linux has become very huge, complex, and difficult to learn. But the design concept and main structure have no fundamental changes. Learning Linux 0.11 still has important practical significance.
  2. Mandatory Reading for Kernel Hackers => Linux_source_dir/Documentation/*
  3. You should be subscribed and active on at-least one kernel mailing list. Start with kernel newbies.
  4. You do not need to read the full source code. Once you are familiar with the kernel API's and its usage, directly start with the source code of the sub-system you are interested in. You can also start with writing your own plug-n-play modules to experiment with the kernel.
  5. Device Driver writers would benefit by having their own dedicated hardware. Start with Raspberry Pi.

I just wrote a RTC driver for an NXP RTC chip on my board, it works great. This chip also has some battery backed RAM that I'd like to make available to a user space application. The RTC framework doesn't support this. It's only 512 bytes but I'm tossed between doing a seekable CHAR driver or a full blown BLOCK driver. I've never done a block driver before but it appears to require a bit more information than a simple CHAR.

I could also interface with IOCTLS but that doesn't feel as clean as it could be. What feels like the best way to make these bytes available to userland?

[EDIT] I forgot to mention that that the RTC chip is hanging off an I2C port, it's not mapped into memory, thus not making it a good candidate for mmaping. [/EDIT]

I think a character device driver implementing mmap should be adequate. Linux Device Drives covers that in chapter 15.


Well, i2c is a serial bus, so mmap is not an option. I will refer you to Essential Linux Device Drivers book. I believe it has a sample i2c EEPROM char device driver in Chapter 8. Hope this helps.

Do you know if new editions of ULK or R.Love's books are going to be re-released? Or maybe another book is in writings?

Latest books are based on 2.6.18 kernels, so I'm looking if anything newer is coming.

There are two good and mostly still accurate books on the Linux kernel. I'm not aware of anyone writing a new book just now.

If you just care about higher structures, how the scheduler works and things like that, use the Robert Love 3rd Edition.

If you want to know about all the various driver subsystems, choose the Venkateswaran book. Note that the book is now exactly 3 years old and is starting to show its age.

All other kernel books (including Jonathan Corbet's, Bovet/Cesati and others) are no longer worth reading: too much details have changed.

Especially anything pre 2.6.24 should be avoided because the updated timer framework that got finalized at that revision had quite a big ripple effect.

I want to code drivers in C in linux os, though I think its very tough. Can I get some hints as to how to start or books to follow? Drivers can be from my USB port to graphics card!!

I know as to where I can search for books, I would like to know as to what the basic knowledge I should start with. Do I need to have hardware knowledge and which specific books are good for novice like me?

Before you jump into designing drivers you should first get exceptional C skills and probably some Linux Kernel know-how. Desigining drivers is not trivial and might scare you off if you are not used to programming on a low-level.

I might recommend The C programming Language if you are not accustomed to C as it is, in my opinion, the primer on C if you have some programming background.

Several texts:

"Linux Device Drivers" (the O'Reilley book) by Rubini and Corbet is the definitive book for Linux Device Drivers.

Cool! see the free pdf version in Roddy's answer & kristina's comment!

I'm porting / debugging a device driver (that is used by another kernel module) and facing a dead end because dma_sync_single_for_device() fails with an kernel oops.

I have no clue what that function is supposed to do and googling does not really help, so I probably need to learn more about this stuff in total.

The question is, where to start?

Oh yeah, in case it is relevant, the code is supposed to run on a PowerPC (and the linux is OpenWRT)

EDIT: On-line resources preferrable (books take a few days to be delivered :)


Anatomy of the Linux slab allocator

Understanding the Linux Virtual Memory Manager

Linux Device Drivers, Third Edition

The Linux Kernel Module Programming Guide

Writing device drivers in Linux: A brief tutorial


Linux Kernel Development (2nd Edition)

Essential Linux Device Drivers ( Only the first 4 - 5 chapters )

Useful Resources:

the Linux Cross Reference ( Searchable Kernel Source for all Kernels )

API changes in the 2.6 kernel series

dma_sync_single_for_device calls dma_sync_single_range_for_cpu a little further up in the file and this is the source documentation ( I assume that even though this is for arm the interface and behavior are the same ):

 380 * dma_sync_single_range_for_cpu
 381 * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
 382 * @handle: DMA address of buffer
 383 * @offset: offset of region to start sync
 384 * @size: size of region to sync
 385 * @dir: DMA transfer direction (same as passed to dma_map_single)
 386 *
 387 * Make physical memory consistent for a single streaming mode DMA
 388 * translation after a transfer.
 389 *
 390 * If you perform a dma_map_single() but wish to interrogate the
 391 * buffer using the cpu, yet do not wish to teardown the PCI dma
 392 * mapping, you must call this function before doing so.  At the
 393 * next point you give the PCI dma address back to the card, you
 394 * must first the perform a dma_sync_for_device, and then the
 395 * device again owns the buffer.
 396 */

I've searched on this a good deal, though I definitely may have missed something, and I'm coming from reading the and this, but I'm having trouble reading out large arrays I've created in my linux kernel module ~200kB a piece. I'm developing on the ARM based ti-omap3530. I allocate the arrays like this:

static unsigned int* captured_params;
static unsigned int* time_params;

captured_params=vmalloc(3500*7*sizeof(unsigned int));
time_params=vmalloc(3500*7*sizeof(unsigned int));

My read function looks like this (I realize that I'm not doing this probably the best way):

ssize_t tgdev_read(struct file *file, char *buf, size_t count, loff_t *ppos){
  /*This file will expect reads in 8-byte chunks, 4 for the data parameter
  and 4 for the time parameter*/

  struct tg_dev *tg_devp = file->private_data;
  unsigned int num_pairs=sample_counter;
  static unsigned int pairs_out=0;
  static unsigned int sent_successfully=0;
  unsigned int* param_ptr;
  unsigned int* time_ptr;

  if(pairs_out==num_pairs){//all returned end of file
    return 0; //EOF
    //update data pointers
    //send user time first
    //send param value for that time
    //update number of pairs sent to user
    //update file_pointer
    tg_devp->current_pointer +=8;
  return pairs_out*8;

And I read it with a user space program like this:

int fp;
char* mode="r";
char* fname="/dev/my_device";
unsigned int param[3500*7]={0};
unsigned int time[3500*7]={0};
unsigned int* param_ptr;
unsigned int* time_ptr;
char buff[3500*7*4*2];
int i=0;
int rc=0;
int completed_reads;


  printf("failed to open file EXITING\n");
  return 1;

printf("rc=%i cr=%i\n",rc,completed_reads);


The userspace read always reports reading the correct number of bytes, but I don't get the right data.

This solution seems to work fine if I replace the 3500s above with something smaller (up to 1000 behaves just fine), but above that I get strange behavior where the array I read out zero pads the last N of each array (same number of elements at the same point in each of the 2 arrays) and starts late in the series e.g. time/param[0]= some values much farther down the original arrays I wanted to read out.

I'm thinking this is because I don't understand the memory handling well enough, but I don't know how to make this do what I want to, which is store these data arrays in the module until I want to read them out into user space. Any suggestions or ideas where I'm going wrong would be greatly appreciated.

Thanks in advance for your time, efforts and patience in my regard.

Where I can found textbooks about architecture, implementation of the dma, iommu? Are you know good links for learning?

  1. Linux Device Drivers - Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman
  2. Essential Linux Device Drivers - Sreekrishnan Venkateswaran
  3. Writing Linux Device Drivers - Jerry Cooperstein