The Linux Programming Interface

Michael Kerrisk


The Linux Programming Interface is the definitive guide to the Linux and UNIX programming interface, the interface employed by nearly every application that runs on a Linux or UNIX system. In this authoritative work, Linux programming expert Michael Kerrisk provides detailed descriptions of the system calls and library functions that you need in order to master the craft of system programming, and accompanies his explanations with clear, complete example programs. You'll find descriptions of over 500 system calls and library functions, and more than 200 example programs, 88 tables, and 115 diagrams. You'll learn how to:

  • Read and write files efficiently
  • Use signals, clocks, and timers
  • Create processes and execute programs
  • Write secure programs
  • Write multithreaded programs using POSIX threads
  • Build and use shared libraries
  • Perform interprocess communication using pipes, message queues, shared memory, and semaphores
  • Write network applications with the sockets API

While The Linux Programming Interface covers a wealth of Linux-specific features, including epoll, inotify, and the /proc file system, its emphasis on UNIX standards (POSIX.1-2001/SUSv3 and POSIX.1-2008/SUSv4) makes it equally valuable to programmers working on other UNIX platforms. The Linux Programming Interface is the most comprehensive single-volume work on the Linux and UNIX programming interface, and a book that's destined to become a new classic.


Mentioned in questions and answers.

Think MUDs/MUCKs but maybe with avatars or locale illustrations. My language of choice is Ruby.

I need to handle multiple persistent connections with data being asynchronously transferred between the server and its various clients. A single database must be kept up-to-date based on activity occurring in the client sessions. Activity in each client session may require multiple other clients to be immediately updated (a user enters a room; a user sends another user a private message).

This is a goal project and a learning project, so my intention is to re-invent a wheel or two to learn more about concurrent network programming. However, I am new to both concurrent and network programming; previously I have worked almost exclusively in the world of non-persistent, synchronous HTTP requests in web apps. So, I want to make sure that I'm reinventing the right wheels.

Per emboss's excellent answer, I have started looking at the internals of certain HTTP servers, since web apps can usually avoid threading concerns due to how thoroughly the issue is abstracted away by the servers themselves.

I do not want to use EventMachine or GServer because I don't yet understand what they do. Once I have a general sense of how they work, what problems they solve and why they're useful, I'll feel comfortable using them. My goal here is not "write a game", but "write a game and learn how some of the lower-level stuff works". I'm also unclear on the boundaries of certain terms; for example, is "I/O-unbound apps" a superset of "event-driven apps"? Or vice versa?

I am of course interested in the One Right Way to achieve my goal, if it exists, but overall I want to understand why it's the right way and why other ways are less preferable.

Any books, ebooks, online resources, sample projects or other tidbits you can suggest are what I'm really after.

The way I am doing things right now is by using IO#select to block on the list of connected sockets, with a timeout of 0.1 seconds. It pushes any information read into a thread-safe read queue, and then whenever it hits the timeout, it pulls data from a thread-safe write queue. I'm not sure if the timeout should be shorter. There is a second thread which polls the socket-handling thread's read queue and processes the "requests". This is better than how I had it working initially, but still might not be ideal.

I posted this question on Hacker News and got linked to a few resources that I'm working through; anything similar would be great:

Although you probably don't like to hear it, I would still recommend starting by investigating HTTP servers first. Although programming for them seemed boring, synchronous, and non-persistent to you, that's only because the creators of the servers did their job of hiding the gory details from you so tremendously well. If you think about it, a web server is not at all synchronous (it's not as if millions of people have to wait to read this post until you are done with it... concurrency :), and because these beasts do their job so well (yeah, I know we yell at them a lot, but at the end of the day most HTTP servers are outstanding pieces of software), this is the definitive starting point if you want to learn about efficient multi-threading. Operating systems and implementations of programming languages or games are another good source, but maybe a bit further away from what you intend to achieve.

If you really intend to get your fingers dirty, I would suggest orienting yourself around something like WEBrick first - it ships with Ruby and is entirely implemented in Ruby, so you will learn all about Ruby threading concepts there. But be warned, you'll never get close to the performance of a Rack solution that sits on top of a web server implemented in C, such as thin.

So if you really want to be serious, you would have to roll your own server implementation in C(++) and probably make it support Rack, if you intend to support HTTP. Quite a task, I would say, especially if you want your end result to be competitive. C code can be blazingly fast, but it's all too easy to make it blazingly slow as well; that lies in the nature of low-level stuff. And we haven't discussed memory management and security yet. But if it's really your desire, go for it, though I would first dig into well-known server implementations for inspiration. See how they work with threads (pooling) and how they implement 'sessions' (you wanted persistence). All the things you desire can be done with HTTP, even better when combined with a clever REST interface; existing applications that support all the features you mentioned are living proof of that. So going in that direction would not be entirely wrong.

If you still want to invent your own proprietary protocol, base it on TCP/IP as the lowest acceptable common denominator. Going beyond that would end up in a project that your grand-children would probably still be coding on. That's really as low as I would dare to go when it comes to network programming.

Whether you end up using it as a library or not, look into EventMachine and its conceptual model. Overlooking event-driven ('non-blocking') I/O on your journey would be negligent in the context of learning about/reinventing the right wheels. A good appetizer for event-driven programming is any write-up explaining the benefits of node.js as a web server.

Based on your requirements: asynchronous communication, multiple "subscribers" reacting to "events" that are centrally published; well that really sounds like a good candidate for an event-driven/message-based architecture.


Some books that may be helpful on your journey (Linux/C only, but the concepts are universal):

(Those were the classics)

  • The Linux Programming Interface - if you just intend to buy one book, let it be this one; I'm not entirely through it yet, but it is truly amazing and covers all the topics you need to know about for your adventure

Projects you may want to check out:

I am writing a device driver on linux-2.6.26. I want to have a DMA buffer mapped into userspace for sending data from the driver to a userspace application. Please suggest a good tutorial on it.

Thanks

OK, this is what I did. Disclaimer: I'm a hacker in the pure sense of the word and my code ain't the prettiest. I read LDD3 and the InfiniBand source code and other predecessor stuff and decided that "get_user_pages" and pinning them and all that other rigmarole was just too painful to contemplate while hungover. Also, I was working with the other person across the PCIe bus and I was also responsible for "designing" the user space application.

I wrote the driver such that at load time it preallocates as many buffers as it can, of the largest size, by calling myAddr[i] = pci_alloc_consistent(blah,size,&pci_addr[i]) until it fails (on failure, myAddr[i] is NULL, I think; I forget). I was able to allocate around 2.5GB of buffers, each 4MiB in size, on my meagre machine which only has 4GiB of memory. The total number of buffers varies depending on when the kernel module is loaded, of course; load the driver at boot time and the most buffers are allocated. Each individual buffer's size maxed out at 4MiB on my system, not sure why. I catted /proc/buddyinfo to make sure I wasn't doing anything stupid, which is of course my usual starting pattern.

The driver then proceeds to give the array of pci_addr values to the PCIe device along with their sizes, and then just sits there waiting for the interrupt storm to begin. Meanwhile, in userspace, the application opens the driver, queries the number of allocated buffers (n) and their sizes (using ioctls or reads, etc.) and then proceeds to call mmap() multiple (n) times. Of course mmap() must be properly implemented in the driver, and LDD3 pages 422-423 were handy. Userspace now has n pointers to n areas of driver memory.

As the driver is interrupted by the PCIe device, it's told which buffers are "full" or "available" to be sucked dry. The application in turn is pending on a read() or ioctl() to be told which buffers are full of useful data. The tricky part was to manage the userspace-to-kernel-space synchronization such that buffers which are being DMA'd into by the PCIe device are not also being modified by userspace, but that's what we get paid for. I hope this makes sense and I'd be more than happy to be told I'm an idiot, but please tell me why.

I recommend this book as well, by the way: http://www.amazon.com/Linux-Programming-Interface-System-Handbook/dp/1593272200 . I wish I had had that book seven years ago when I wrote my first Linux driver. There is another type of trickery possible by adding even more memory, not letting the kernel use it, and mmapping on both sides of the userspace/kernelspace divide, but the PCI device must also support higher than 32-bit DMA addressing. I haven't tried it, but I wouldn't be surprised if I'm eventually forced to.
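For illustration only, a rough sketch of the allocation loop described above might look like the following. This is my own sketch, not the original driver; names such as MAX_BUFS, BUF_SIZE, my_addr and pci_addr are placeholders, and it assumes an older (2.6.x) kernel where pci_alloc_consistent() is still available.

#include <linux/pci.h>

#define MAX_BUFS 1024
#define BUF_SIZE (4 * 1024 * 1024)      /* 4 MiB per buffer */

static void      *my_addr[MAX_BUFS];
static dma_addr_t pci_addr[MAX_BUFS];
static int        nbufs;

static void alloc_dma_buffers(struct pci_dev *pdev)
{
    int i;

    /* Keep allocating coherent DMA buffers until the allocator refuses. */
    for (i = 0; i < MAX_BUFS; i++) {
        my_addr[i] = pci_alloc_consistent(pdev, BUF_SIZE, &pci_addr[i]);
        if (my_addr[i] == NULL)
            break;                      /* pool exhausted */
    }
    nbufs = i;

    /* pci_addr[0..nbufs-1] are handed to the PCIe device;
       my_addr[0..nbufs-1] are what the driver later exposes via mmap(). */
}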

I am very much interested in Unix and want to learn it inside and out. Can you guys help me by listing some books that can make me a wizard? Ultimately I want to become a Unix programmer.

I am not a novice user in Unix.

You want system administration knowledge, or programming knowledge?

For programming:

For system administration:

As other responders have noted, Advanced Programming in the Unix Environment (APUE) is indispensable.

Other books that you might want to consider (these have more of a Linux focus, but are a good way to become familiar with Unix internals):

Possible Duplicate:
shifting from windows to *nix programming platform

Does anyone know a good, compact resource that would allow me to migrate from Windows programming to Linux programming?

I managed to get simple apps running, checked daemon architecture, but somehow I don't know where to begin to get a better understanding of the best practices and common solutions for architecture in general.

I guess all the threading, mutex, critical section, I/O, and (named?) pipe stuff is probably quite different from Windows development, but I can't find good, compact documentation.

Daemons in Linux seem to be way simpler than in Windows, but I already stumbled upon the fork() function, which is completely unfamiliar, and I guess there are other things like that.

Also, what about that whole POSIX compliance thing? I heard it's supposed to be platform-agnostic, but I also read that it's not fully supported under some distributions.

Take a look at Unix Systems Programming by Robbins. It's really more about POSIX rather than just UNIX, and covers quite a bit of material with some very nice in-depth examples as well. The POSIX factor means that it translates quite nicely to Linux as well as some other UNIX variants such as BSD and OSX. After reading it, you'll definitely get a very good overview of how a POSIX system works, as well as an excellent survey of the major areas you'd use the API in such as threading, sockets, and file I/O.

Beginning Linux Programming and Advanced Linux Programming are two good resources to start with.

The Linux Programming Interface is an amazing book that I am reading now:

http://www.amazon.com/Linux-Programming-Interface-System-Handbook/dp/1593272200

Just look at its outstanding customer reviews; it is a really excellent Linux programming book.

While using POSIX message queues, I noticed that some files were being created on the file system with the names I was giving the queues. My questions:

Q1. Do message queues store the messages on the hard disk rather than in RAM?

Q2. If so, shouldn't this be very slow, since it involves the hard disk?


Edit:

I read this in the book The Linux Programming Interface :

On Linux, POSIX message queues are implemented as i-nodes in a virtual file system, and message queue descriptors and open message queue descriptions are implemented as file descriptors and open file descriptions, respectively. However, these are implementation details that are not required by SUSv3 and don’t hold true on some other UNIX implementations.

Even if it is a virtual file system, it is still stored on the hard disk, right?

With this information in mind, can someone comment on the second question now? (And/or the first one as well, if there is something more to add.)
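For anyone who wants to see the virtual file system the quoted passage refers to, here is a minimal sketch of my own (not from the question). It creates a queue; on Linux the queue then shows up as an entry in the mqueue file system, typically mounted at /dev/mqueue, and the entry lives in kernel memory rather than on disk. Compile with -lrt.

#include <fcntl.h>
#include <sys/stat.h>
#include <mqueue.h>
#include <stdio.h>

int main(void)
{
    /* Create (or open) a POSIX message queue named /demo_queue. */
    mqd_t mqd = mq_open("/demo_queue", O_CREAT | O_RDWR, 0600, NULL);
    if (mqd == (mqd_t) -1) {
        perror("mq_open");
        return 1;
    }

    /* "ls /dev/mqueue" should now show an entry named demo_queue; the
       entry belongs to a virtual (in-memory) file system, not the disk. */
    mq_close(mqd);
    mq_unlink("/demo_queue");
    return 0;
}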

I've read Kerrisk's The Linux Programming Interface: A Linux and UNIX System Programming Handbook, Chapter 31 on Threads. The chapter includes Thread Specific Data (Section 31.3.4) and Thread Local Storage (Section 31.4). The topics are covered on pages 663-669.

Thread Specific Data (pthread_key_create, pthread_setspecific, pthread_getspecific, and friends) looks more powerful, but appears to be a little more cumbersome to use, and appears to use the memory manager more frequently.

Thread Local Storage (__thread on static and global declarations) looks a little less powerful since it's limited to compile time, but it appears to be easier to use and appears to stay out of the memory manager at runtime.

I could be wrong about the runtime memory manager since there could be code behind the scenes that calls pthread_key_create when it encounters __thread variables.

Kerrisk did not offer a compare/contrast of the two strategies, and he did not make a recommendation on when to use which in a given situation.

To add context to the question: I'm evaluating a 3rd party library. The library uses globals, does not utilize locking, and I want to use it in a multi-threaded program. The program uses threading to minimize network latencies.

Is there a hands down winner? Or are there different scenarios that warrant using one or the other?

pthread_key_create and friends are much older, and thus supported on more systems.

__thread is a relative newcomer; it is generally much more convenient to use, and (according to Wikipedia) it is supported by the compilers that matter on most POSIX systems: Solaris Studio C/C++, IBM XL C/C++, GNU C, Clang and Intel C++ Compiler (on Linux).

__thread also has a significant advantage: it is usable from signal handlers (with the exception of __thread variables in a dlopen()ed shared library; see this bug), because its use does not involve malloc (with the same exception).
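To make the contrast concrete, here is a small sketch of my own (not from the answer) showing both mechanisms side by side; error handling is omitted for brevity.

#include <pthread.h>
#include <stdlib.h>

/* Thread-local storage: one instance per thread, set up by the
   compiler/runtime with no explicit code on our side. */
static __thread int tls_counter;

/* Thread-specific data: one key shared by all threads; each thread
   attaches its own (typically heap-allocated) value to it. */
static pthread_key_t  tsd_key;
static pthread_once_t tsd_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&tsd_key, free);   /* free() runs at thread exit */
}

static int *tsd_counter(void)
{
    pthread_once(&tsd_once, make_key);
    int *p = pthread_getspecific(tsd_key);
    if (p == NULL) {                      /* first use in this thread */
        p = calloc(1, sizeof *p);
        pthread_setspecific(tsd_key, p);
    }
    return p;
}

void bump_counters(void)
{
    tls_counter++;          /* __thread: a plain variable access          */
    (*tsd_counter())++;     /* TSD: key lookup plus a possible allocation */
}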

I'm curious if someone can point me in the right direction here. I'm learning about computer systems programming (the basics) and I'm trying to trace code through different levels to see how each interacts with the other. An example would be calling the fgets() function in C or getline() in C++ or similar. Both of those would make calls to the system right? Is there an easy way to look at the code that is called?

I'm working on Unix (Ubuntu). Is this something that is proprietary with Windows and Apple? Any good resources out there for this kind of thing? As always, thanks guys!

At least in the UNIX world, the answer is fairly easy: "Use the Source, Luke".

In your example, you would look at the source for, say, fgets(). That's in the C standard library, and the easiest way to find the source is to Google something like "C library fgets() source".

When you get that source, you'll see a bunch of code handling buffers and so on, and a system call, probably to read(2). The "2" there tells you it is documented in section 2 of the manual (e.g., you can find it with man 2 read).

The system call is implemented in the kernel, so then you need to read the kernel source. Proceed from there.

Now, what you need in order to find all of this without reading randomly around in the sources (although that's the way a lot of people have learned it, it's not very efficient) is to get hold of a book on Linux like Kerrisk's The Linux Programming Interface, which explains these things at a somewhat higher level than the source itself.
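As a tiny illustration of the layering described above (my own sketch, not from the answer), the program below uses the stdio call and then the system call it ultimately boils down to; running it under strace shows the read() calls the kernel actually sees.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[64];

    /* Library level: buffered I/O implemented in the C library. */
    if (fgets(buf, sizeof buf, stdin) == NULL)
        return 1;

    /* System-call level: what the C library itself ends up invoking. */
    ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
    (void) n;

    return 0;
}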

I have a basic understanding of the following:

  • How to read hardware device Data Sheets
  • How hardware devices work in theory
  • What a device driver does
  • General concepts of C programming
  • Linux OS

I have always believed that if I can understand all the code of a given device driver and can ultimately write the same code from scratch with just the help of the above (without looking at the source code), I will be able to get a very good understanding of the C language and of how device drivers work and interact with the OS. For me, this would be a major milestone in my career.

So, over the last few years, I have repeatedly set out to learn how to write device drivers (mostly for Linux). I always start (this has happened at least 6-7 times) with great enthusiasm, pick a few good online resources and read them. I then take an existing driver from the Linux kernel (say, an Ethernet driver), obtain its datasheet and start to read the driver code, but after reading a few lines I get confused and ultimately give up because I can't follow the rest of the code.

My question: I know such a tutorial is too good to be true, but I still want to ask - does anybody know a good resource which explains how a Linux device driver was actually written, starting from the data sheet, with detailed references showing how the existing lines of code relate to it, and then explaining each function/block of code: why it exists and what exactly it does?

There are at least two good books on Linux device driver development:

I have personally read LDD3 and use it as a reference, but the second one is also very good according to other fellow developers.

When (and if) you read LDD3, you'll find that it describes everything in good detail and has code snippets following the development process from step #1 to the end. The book, however, doesn't include the full code itself (which is a good thing, or else it would become bloated), but I recommend you actually download the examples and look at them.

It will most definitely not only get you started; you will be able to write a driver for just about any device, be that an Ethernet network driver, a fancy kernel extension for your specialized user application, or a full-blown kernel-bypass strategy with DMA buffers mapped into user space.

However, you will probably not find a full step-by-step story of any real-world device driver. This is because of a few things: it is a lot harder to write a book like that, harder than writing the driver itself, and chances are it wouldn't sell well due to being extremely device-specific. So when it comes to details like working nicely with the DMA engine of some device, or Ethernet LSO, you will either have to have experience with it, study some existing device and its driver, or at least ask specific questions (here or somewhere else).

I'd say that the most straightforward way for you, when you come to that point, is to join a team that does exactly that, work closely with people, and keep gaining more and more experience, until one day you can pull your 10G NIC off the shelf, sit down, and write an industrial-grade driver (well, or until your interests change).

You may also try some open-source projects. For example, take a look at PF_RING DNA or similar projects. It is very interesting because you take existing drivers and make a few adjustments so they work with the PF_RING infrastructure. In my personal opinion, though, open-source projects are usually a little less effective at teaching and helping you gain real-world experience, because there no one sits next to you, etc.

So... just do it!

I am learning these concepts; please help me understand why the following code is throwing a segmentation fault. My intention in this code is to print the capital letter D and move to the next address. Please explain. Thank you.

#include <stdio.h>

int main(void)
{
    char *ptr = "C programming";
    printf(" %c \n", ++*ptr);   /* increments the first character the pointer refers to */
    return 0;
}

Reason for your error

You are trying to modify a string literal, which is a non-modifiable object. That's why you get a segmentation fault.

Even if you simply call ptr[0]++, it will be a segmentation fault.

Solution

One solution is to change the declaration to:

char ptr[] = "C programming";

Then you can modify the char array.

They look similar, right? The string literal is still non-modifiable, but now you have declared an array with its own space, which is initialized from the string literal; the array is stored on the stack and is thus modifiable.

Example

Here is a full code example:

#include <stdio.h>

int test() {
    char str[]="C programming";
    char *ptr = str;

    while(*ptr != '\0') {
        // print original char,
        printf("%c \n", *(ptr++));

        // print original char plus 1, but don't change array,
        // printf("%c \n", (*(ptr++))+1);

        // modify char of array to plus 1 first, then print,
        // printf("%c \n", ++(*(ptr++)));
    }

    return 0;
}

int main(int argc, char * argv[]) {
    test();
    return 0;
}

Tip: you should enable only one of the printf() lines at a time, because each one applies the ++ operator to ptr.

More tips

Note that we declare both ptr and str because we can't apply ++ to an array (an array's address can't be changed); ++str produces a compile error, while ++ptr is fine.


@Update - About memory

(To answer your comment)

  • A string literal is usually stored in a read-only area of a process; refer to C string literals: Where do they go?
  • A char array, if declared inside a function, is allocated on the stack, so you can modify it; when you initialize a char array from a string literal, there are actually two copies of the same value: one is read-only, one is modifiable, and the char array is initialized from the read-only copy.
  • A char pointer stores a single address; the pointer itself is allocated on the stack in this case, and you can modify it.

You might also want to know more about pointers, addresses, arrays, Linux process memory layout, or the data sections of a C program; try searching on Google, or refer to books like The C Programming Language, 2nd Edn and The Linux Programming Interface (though this is a question about C rather than Linux). There's also The Definitive C Book Guide and List on Stack Overflow.

There is a difference between re-entrant and thread-safe functions, and I don't know whether the Linux functions ending with _r are thread-safe, re-entrant (I mean async-signal-safe), or both.

They are thread-safe.

Stevens/Rago APUE teaches the distinction between thread-safe functions (reentrant with respect to being called by multiple threads) and async-signal-safe functions (reentrant with respect to signal handlers, so they can be called safely from within a signal handler).

APUE ch 12.5 Reentrancy lists ~79 functions which are not thread-safe; ~11 of them have reentrant equivalents, and those are the *_r functions. That means those 11 can be called by multiple threads at the same time.

APUE ch 10.6 Reentrant Functions lists ~135 functions which are async-signal-safe; they block signal delivery when needed, so you can use them in signal-handler code. Note that async-signal-safety only matters for functions called inside a signal handler. That may motivate one not to write signal-handler code at all, as the further details are tricky.

Kerrisk TLPI ch 21 Signals: Signal Handlers has its own table of functions which are async-signal-safe. Interestingly it is not quite same as APUE.

None of the *_r functions are listed as async-signal-safe by either of these references.
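As a quick illustration of what a *_r variant buys you (my own example, not from the answer): strtok() keeps hidden static state and is therefore not thread-safe, while strtok_r() makes the caller hold that state, so two threads can tokenize independently. Note that being thread-safe still does not make it async-signal-safe.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[] = "one two three";
    char *saveptr;                      /* per-caller state, instead of a hidden static */

    for (char *tok = strtok_r(line, " ", &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, " ", &saveptr))
        printf("%s\n", tok);

    return 0;
}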

Where is a good place to start if one is interested in Unix systems programming?

Any recommended reading, tutorials etc that are aimed at the beginner?

What knowledge is needed to start with systems programming?

Start with Mark Rochkind's "Advanced Unix Programming" if you can find it. Then graduate to Stevens' "Advanced Programming in the Unix Environment".

I discovered this too, for anyone interested. Apparently it is the "new standard" for Linux programming:

The Linux Programming Interface: A Linux and UNIX System Programming Handbook

Stevens is the bible. Read and understand this and his other books and you have most of what you need.

I have good knowledge of C++ and after reading The Elements of Computing Systems I have a basic knowledge of how computers work. I made a list of topics I want to learn next and books I want to buy on those topics. One of them is operating systems, but the one on the top of the list is Game development.

I am still a noob on those topics, so I wonder if I should know how an operating system (Unix specifically) works before trying to learn game programming (OpenGL, etc.). On operating systems I have the book Operating Systems by Tanenbaum, and I want to buy The Linux Programming Interface by Michael Kerrisk.

On game development I was planning on buying Game Engine Architecture and Game Coding Complete to acquire a general concept of game programming and how engines work, and then learn OpenGL.

I am really lost on what to do first, and I hope this is an appropriate question. What should I learn first, what books should I read, and in what order? Should I learn how a VGA works before trying OpenGL? Are there any other topics I should know before delving into game programming? I am asking this because I like to know what I am coding and what the functions I am calling do under the hood; I don't like holes in my knowledge.

Thanks.

Fluffy opinion answer incoming. Take with grain of salt.

The nice thing about programming is that you don't need to learn everything about everything to do one thing effectively. Knowing exactly how to implement a video driver isn't required for using OpenGL effectively. The point of OpenGL is to abstract that out so you don't have to worry about it.

Since you want to do game development, make a project, like recreating Asteroids using OpenGL for graphics and writing all the game logic yourself, and set about completing it. In the process you'll learn much more than by simply reading; use books as references. At least that's what I've found works for me.

The Operating Systems book is pretty good; it's the one I read in college. But the concepts presented in it, though interesting, are not something you'll have trouble learning alongside game development or anything else.

Also, you should read this: http://www.linuxforu.com/tag/linux-device-drivers-series/. It's a great article series that teaches Linux driver development and operating system concepts in the process.

I'm currently porting some code from Linux to Windows (with MinGW).

From what I understand, MinGW doesn't support poll(), which was used in the original, so I'm rewriting everything for select().

And now I stumbled upon if (pfd[i].revents & (POLLERR|POLLHUP))...

How can I get the equivalent of this condition with select() - or alternatively, with whatever the Winsock API or MinGW provides? The POLLERR part is simple enough: if (FD_ISSET(i, &error_fd_set)); but I'm at a loss about the POLLHUP part.

According to my copy of The Linux Programming Interface, kernel poll events are mapped to select() events like so:

// Ready for reading
#define POLLIN_SET (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP | POLLERR)
// Ready for writing
#define POLLOUT_SET (POLLWRBAND | POLLWRNORM | POLLOUT | POLLERR)
// Exceptional condition
#define POLLEX_SET (POLLPRI)

So this suggests that you need to check the 'ready for reading' set: with select(), a hangup shows up as the descriptor being reported readable. To actually distinguish between POLLHUP, POLLIN, and POLLIN | POLLHUP, you can use the following chart from the book:

--------- Condition or event ---------
Data in pipe?   Write end open?     select()    poll()
no              no                  r           POLLHUP
yes             yes                 r           POLLIN
yes             no                  r           POLLIN | POLLHUP
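In practice the usual select() idiom for spotting a hangup follows from that table: the descriptor becomes readable and the subsequent read() returns 0. Here is a rough sketch of my own (POSIX names; on Winsock the same idea applies with recv()):

#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if data was read, 0 on hangup (peer closed), -1 on error. */
static int wait_and_check(int fd)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    if (select(fd + 1, &rfds, NULL, NULL, NULL) == -1)
        return -1;

    if (FD_ISSET(fd, &rfds)) {
        char buf[512];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            return 0;       /* end-of-file: what poll() reports as POLLHUP */
        if (n < 0)
            return -1;      /* roughly the POLLERR case */
        /* ... process n bytes from buf ... */
    }
    return 1;
}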

I have a couple of structs with pointers to one another allocated on the heap. I'm converting a multi-threaded program to a multi-process program, so I have to move those heap structs into shared memory. So far, I've run into nothing but problems on top of problems. My TA suggested I use memcpy, but I'm not sure that's going to work. Is there any way to convert a set of structs on the heap into shared memory?

Structs I'm using:

 struct SharedData {
    int da;
    int         isopen;
    int     refcount;   // reference count:  number of threads using this object
    unsigned int    front;      // subscript of front of queue
    unsigned int    count;      // number of chars in queue
    unsigned int    bufsize;
    pthread_cond_t buffer_full;
    pthread_cond_t buffer_empty;
    pthread_mutex_t mtex;
    fifo_t* queue;
    sem_t       empty_count;

    sem_t       full_count;
    sem_t       use_queue;  // mutual exclusion
};

struct OverSharedData{
    struct SharedData ** rep;
    int rop;
};

I malloc'd OverSharedData, the SharedData structs, and the fifo_t queue, along with multiple char pointers later on. Do they all have to be placed in shared memory?

Just as malloc() requests memory from the heap, there are system calls (e.g. shmget()) to request/create a shared memory segment. If your request is successful, you can copy whatever you like over there. (Yes, you can use memcpy.) But be careful with pointers: a pointer valid for one process, kept in its shared memory, is not necessarily valid for another process using that shared memory segment.

The shared memory is accessible to all attached processes for reading and/or writing. If multiple processes are reading from and writing to a shared memory segment then, needless to say, some synchronization technique (e.g. a semaphore) needs to be applied.
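For reference, a minimal sketch of my own (not from the answer) of the shmget()/shmat()/memcpy() sequence just described; the key, sizing, and error handling are simplified, and the comment about pointer members is the important part:

#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

struct SharedData;                      /* as defined in the question */

void *copy_to_shm(const struct SharedData *src, size_t size)
{
    /* Create a segment big enough for the struct. */
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid == -1)
        return NULL;

    void *addr = shmat(shmid, NULL, 0); /* attach it into this process */
    if (addr == (void *) -1)
        return NULL;

    memcpy(addr, src, size);            /* the raw copy itself is fine... */

    /* ...but pointer members (queue, etc.) still point into this
       process's private heap and mean nothing to another process; such
       data must itself live in the shared segment, ideally addressed
       by offsets rather than raw pointers. */
    return addr;
}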

Please read up on shared memory. An excellent source is "The Linux Programming Interface" by Michael Kerrisk.

Reference: