UNIX Network Programming

W. Richard Stevens, Bill Fenner, Andrew M. Rudoff

* Previous editions sold over 160,000 units! The Second Edition (1998) sold over 53,000 in retail alone! * Updates coverage of programming standards, debugging techniques, and operating systems, including Red Hat 9, Solaris 9, HP-UX, FreeBSD 4.8/5.0, AIX 5.x, and Mac OS X. * Bill Fenner (AT&T Labs) and Andrew Rudoff (Sun) carry on the tradition of this great work.

Mentioned in questions and answers.

Are there any good books for a relatively new, but not totally new, *nix user to get a bit more in-depth knowledge (so no "Linux for Dummies")? For the most part, I'm not looking for something to read from start to finish; I'd rather have something I can pick up and read in chunks when I need to know how to do something, or whenever I have one of those "how do I do that again?" moments. Some areas that I'd like to see are:

  • command line administration
  • bash scripting
  • programming (although I'd like something that isn't just relevant for C programmers)

I'd like this to be as platform-independent as possible (meaning it has info that's relevant for any Linux distro as well as BSD, Solaris, OS X, etc.), but the Unix systems I use the most are OS X and Debian/Ubuntu. So if I would benefit the most from a more platform-dependent book, those are the platforms to target.

If I can get all this in one book, great, but I'd rather have a bit more in-depth material than coverage of everything. So if there are any books that cover just one of these areas, post it. Hell, post it even if it's not relevant to any of those areas and you think it's something that a person in my position should know about.

I've wiki'd this post - could those with sufficient rep please add items to it?

System administration, general usage books

Programming:

Specific tools (e.g. Sendmail)

A number of books from O'Reilly and other publishers cover specific topics. Some of the key ones are:

Some of these books have been in print for quite a while and are still relevant. Consequently they are also often available secondhand at much less than list price. Amazon marketplace is a good place to look for such items. It's quite a good way to do a shotgun approach to topics like this for not much money.

As an example, in New Zealand technical books are usuriously expensive due to a weak "kiwi peso" (as the NZ$ is affectionately known in expat circles) and a tortuously long supply chain. You could spend 20% of a week's after-tax pay for a starting graduate on a single book. When I was living there, just out of university, I used this type of market a lot, often buying books for a quarter of their list price - including the cost of shipping to New Zealand. If you're not living in a location with tier-1 incomes, I recommend this.

E-Books and on-line resources (thanks to israkir for reminding me):

  • The Linux Documentation Project (www.tldp.org) has many specific topic guides known as HOWTOs that also often concern third-party OSS tools and will be relevant to other Unix variants. It also has a series of FAQs and guides.

  • Unix Guru's Universe is a collection of unix resources with a somewhat more old-school flavour.

  • Google. There are many, many unix and linux resources on the web. Search strings like unix commands or learn unix will turn up any amount of online resources.

  • Safari. This is a subscription service, but you can search the texts of quite a large number of books. I can recommend this as I've used it. They also do site licences for corporate customers.

Some of the philosophy of Unix:

I recommend the Armadillo book from O'Reilly for command line administration and shell scripting.

Jason,

The Unix Programming Environment by Kernighan and Pike will give you solid foundations on all things Unix and should cover most of your questions regarding shell and command-line scripting, etc.

The Armadillo book by O'Reilly will add the administration angle. It has served me well!

Good luck!

The aforementioned Unix Power Tools is a must. Other classics are sed & awk and Mastering Regular Expressions. I also like some books from the O'Reilly "Cookbook" series:

Tornado and Nginx are popular web servers at the moment, and many benchmarks show that they perform better than Apache under certain circumstances. So my question is:

Is epoll the most essential reason that makes them so fast? And what can I learn from it if I want to write a good socket server?

If you're looking to write a socket server, a good starting point is Dan Kegel's C10K article from a few years back:

http://www.kegel.com/c10k.html

I also found Beej's Guide to Network Programming to be pretty handy:

http://beej.us/guide/bgnet/

Finally, if you need a great reference, there's UNIX Network Programming by W. Richard Stevens et al.:

http://www.amazon.com/Unix-Network-Programming-Sockets-Networking/dp/0131411551/ref=dp_ob_title_bk

Anyway, to answer your question, the main difference between Apache and Nginx is that Apache uses one thread per client with blocking I/O, whereas Nginx is single-threaded with non-blocking I/O. Apache's worker pool does reduce the overhead of starting and destroying processes, but it still makes the CPU switch between several threads when serving multiple clients. Nginx, on the other hand, handles all requests in one thread. When one request needs to make a network request (say, to a backend), Nginx attaches a callback to the backend request and then works on another active client request. In practice, this means it returns to the event loop (epoll, kqueue, or select) and asks for file descriptors that have something to report. Note that the system call in the main event loop is actually a blocking operation, because there's nothing to do until one of the file descriptors is ready for reading or writing.

So that's the main reason Nginx and Tornado are efficient at serving many simultaneous clients: there's only ever one process (thus saving RAM) and only one thread (thus saving CPU from context switches). As for epoll, it's just a more efficient version of select. If there are N open file descriptors (sockets), it lets you pick out the ones ready for reading in O(1) instead of O(N) time. In fact, Nginx can use select instead of epoll if you compile it with the --with-select_module option, and I bet it will still be more efficient than Apache. I'm not as familiar with Apache internals, but a quick grep shows it does use select and epoll -- probably when the server is listening to multiple ports/interfaces, or if it does simultaneous backend requests for a single client.

Incidentally, I got started with this stuff trying to write a basic socket server and wanted to figure out how Nginx was so freaking efficient. After poring through the Nginx source code and reading those guides/books I linked to above, I discovered it'd be easier to write Nginx modules instead of my own server. Thus was born the now-semi-legendary Emiller's Guide to Nginx Module Development:

http://www.evanmiller.org/nginx-modules-guide.html

(Warning: the Guide was written against Nginx 0.5-0.6 and APIs may have changed.) If you're doing anything with HTTP, I'd say give Nginx a shot because it's worked out all the hairy details of dealing with stupid clients. For example, the small socket server that I wrote for fun worked great with all clients -- except Safari, and I never figured out why. Even for other protocols, Nginx might be the right way to go; the eventing is pretty well abstracted from the protocols, which is why it can proxy HTTP as well as IMAP. The Nginx code base is extremely well-organized and very well-written, with one exception that bears mentioning. I wouldn't follow its lead when it comes to hand-rolling a protocol parser; instead, use a parser generator. I've written some stuff about using a parser generator (Ragel) with Nginx here:

http://www.evanmiller.org/nginx-modules-guide-advanced.html#parsing

All of this was probably more information than you wanted, but hopefully you'll find some of it useful.

I will do a few small projects over the next few months and need some books (preferably) or URLs to learn some basic concepts.

In general one PC or embedded device (which varies by project) collects some user input or data from an external hardware device and transmits it to a remote PC which will enter it into a database.

The back-end will be coded in Delphi using Indy socket components. The front-end might be a PC running a Delphi app using the same Indy sockets, but it might equally be a small controller board, probably programmed in C (with neither Windows nor Linux as an o/s, but with some unforeseeable socket support).

So, what I need is

  1. something - probably language agnostic - to get me up to speed on sockets programming
  2. confirmation that I can just use a stream and read/write to define my own (very simple) protocol over TCP/IP
  3. some overview of general networking (TCP?) concepts; maybe a little on security, and general client/server stuff (for instance, I can send something from clients to the server and get a reply, but am not so sure about server-initiated communication to a single client or a broadcast to all clients)
  4. anything else?

Any recommendations to get me up to speed, at least enough for a small project that would let me learn on the job?

Thanks in advance

This is the book to learn TCP/IP from, no matter what language you will be using:

W. Richard Stevens, TCP/IP Illustrated, Volume 1: The Protocols

The following is the C network programmer's bible, highly recommended:

W. Richard Stevens, Unix Network Programming, Volume 1: The Sockets Networking API

Out of online resources, Beej's Guide to Network Programming tops the list.

On most UNIX systems, passing an open file descriptor between processes is easy for parent/child processes via fork(); however, I need to share an fd after the child has already been forked.

I've found some web pages telling me that sendmsg() may work for arbitrary processes, but that seems very OS-dependent and complex. The portlisten example seems like the best I can find, but I'd prefer a good wrapper library like libevent that hides all the magic of kqueue, poll, ....

Does anyone know if there's some library (and portable way) to do this?

There is a Unix domain socket-based mechanism for transferring file descriptors (such as sockets - which cannot be memory mapped, of course) between processes - using the sendmsg() system call.

You can find more in Stevens (as mentioned by Curt Sampson), and also at Wikipedia.

You can find a much more recent question with working code at Sending file descriptor by Linux socket.

Over the years I've developed a small mass of C++ server/client applications for Windows using WinSock (Routers, Web/Mail/FTP Servers, etc... etc...).

I’m starting to think more and more of creating an IPv6 version of these applications (While maintaining the original IPv4 version as well, of course).

Questions:

  1. What pitfalls might I run into?
  2. Is the porting/conversion difficult?
  3. Is the conversion worth it?


For reference (or for fun), you can sneak a peek at the IPv4 code at the core of my applications.

Ulrich Drepper, the maintainer of glibc, has a good article on the topic:

http://people.redhat.com/drepper/userapi-ipv6.html

But don't forget Richard Stevens's book, Unix Network Programming, Volume 1: The Sockets Networking API, for good practice.

Think MUDs/MUCKs but maybe with avatars or locale illustrations. My language of choice is ruby.

I need to handle multiple persistent connections with data being asynchronously transferred between the server and its various clients. A single database must be kept up-to-date based on activity occurring in the client sessions. Activity in each client session may require multiple other clients to be immediately updated (a user enters a room; a user sends another user a private message).

This is a fun project and a learning project, so my intention is to re-invent a wheel or two to learn more about concurrent network programming. However, I am new to both concurrent and network programming; previously I have worked almost exclusively in the world of non-persistent, synchronous HTTP requests in web apps. So, I want to make sure that I'm reinventing the right wheels.

Per emboss's excellent answer, I have been starting to look at the internals of certain HTTP servers, since web apps can usually avoid threading concerns due to how thoroughly the issue is abstracted away by the servers themselves.

I do not want to use EventMachine or GServer because I don't yet understand what they do. Once I have a general sense of how they work, what problems they solve and why they're useful, I'll feel comfortable with it. My goal here is not "write a game", but "write a game and learn how some of the lower-level stuff works". I'm also unclear on the boundaries of certain terms; for example, is "I/O-unbound apps" a superset of "event-driven apps"? Vice-versa?

I am of course interested in the One Right Way to achieve my goal, if it exists, but overall I want to understand why it's the right way and why other ways are less preferable.

Any books, ebooks, online resources, sample projects or other tidbits you can suggest are what I'm really after.

The way I am doing things right now is by using IO#select to block on the list of connected sockets, with a timeout of 0.1 seconds. It pushes any information read into a thread-safe read queue, and then whenever it hits the timeout, it pulls data from a thread-safe write queue. I'm not sure if the timeout should be shorter. There is a second thread which polls the socket-handling thread's read queue and processes the "requests". This is better than how I had it working initially, but still might not be ideal.

I posted this question on Hacker News and got linked to a few resources that I'm working through; anything similar would be great:

Although you probably don't like to hear it, I would still recommend starting by investigating HTTP servers. Programming for them seemed boring, synchronous, and non-persistent to you only because the creators of those servers did their job of hiding the gory details so tremendously well. If you think about it, a web server is anything but synchronous (it's not as if millions of people have to wait to read this post until you are done... concurrency :) ). And because these beasts do their job so well (yeah, I know we yell at them a lot, but at the end of the day most HTTP servers are outstanding pieces of software), they are the definitive starting point to look into if you want to learn about efficient multi-threading. Operating systems, implementations of programming languages, and games are other good sources, but maybe a bit further away from what you intend to achieve.

If you really intend to get your fingers dirty, I would suggest orienting yourself around something like WEBrick first - it ships with Ruby and is entirely implemented in Ruby, so you will learn all about Ruby threading concepts there. But be warned: you'll never get close to the performance of a Rack solution that sits on top of a web server implemented in C, such as Thin.

So if you really want to be serious, you would have to roll your own server implementation in C(++), and probably make it support Rack if you intend to support HTTP. Quite a task, I would say, especially if you want the end result to be competitive. C code can be blazingly fast, but it's all too easy to make it blazingly slow as well; that lies in the nature of low-level stuff. And we haven't even discussed memory management and security yet. But if it's really your desire, go for it - I would just dig into well-known server implementations for inspiration first. See how they work with threads (pooling) and how they implement 'sessions' (you wanted persistence). All the things you desire can be done with HTTP, even better when combined with a clever REST interface; existing applications that support all the features you mentioned are living proof of that. So going in that direction would not be entirely wrong.

If you still want to invent your own proprietary protocol, base it on TCP/IP as the lowest acceptable common denominator. Going beyond that would end up in a project that your grandchildren would probably still be coding on. That's really as low as I would dare to go when it comes to network programming.

Whether you are using it as a library or not, look into EventMachine and its conceptual model. Overlooking event-driven ('non-blocking') IO in your journey would be negligent in the context of learning about/reinventing the right wheels. An appetizer for event-driven programming explaining the benefits of node.js as a web server.

Based on your requirements: asynchronous communication, multiple "subscribers" reacting to "events" that are centrally published; well that really sounds like a good candidate for an event-driven/message-based architecture.


Some books that may be helpful on your journey (Linux/C only, but the concepts are universal):

(Those were the classics)

  • The Linux Programming Interface - if you intend to buy just one book, make it this one. I'm not entirely through it yet, but it is truly amazing and covers all the topics you need to know about for your adventure.

Projects you may want to check out:

Are there any libraries or guides for how to read and parse binary data in C?

I am looking at some functionality that will receive TCP packets on a network socket and then parse that binary data according to a specification, turning the information into a more useable form by the code.

Are there any libraries out there that do this, or even a primer on performing this type of thing?

The standard way to do this in C/C++ is really casting to structs, as 'gwaredd' suggested.

It is not as unsafe as one would think. You first cast to the struct that you expected, as in his/her example, and then you test that struct for validity. You have to test for max/min values, termination sequences, etc.

Whatever platform you are on, you must read Unix Network Programming, Volume 1: The Sockets Networking API. Buy it, borrow it, steal it (the victim will understand; it's like stealing food or something...), but do read it.

After reading Stevens, most of this will make a lot more sense.

Basically, the suggestions about casting to structs work, but please be aware that numbers can be represented differently on different architectures.

To deal with endianness issues, network byte order was introduced: common practice is to convert numbers from host byte order to network byte order before sending data, and to convert back to host order on receipt. See the functions htonl, htons, ntohl, and ntohs.

And really consider kervin's advice - read UNP. You won't regret it!

I'm a beginning C++ programmer / network admin, but I figure I can learn how to do this if someone points me in the right direction. Most of the tutorials are demonstrated using old code that no longer works for some reason.

Since I'm on Linux, all I need is an explanation on how to write raw Berkeley sockets. Can someone give me a quick run down?

Read Unix Network Programming by Richard Stevens. It's a must. It explains how it all works, gives you code, and even gives you helper methods. You might want to check out some of his other books, too. Advanced Programming in the UNIX Environment is a must for lower-level programming in Unix in general. I don't even do stuff on the Unix stack anymore, and the stuff from these books still helps how I code.

For TCP client side:

Use gethostbyname to look up a DNS name and get an IP; it will return a hostent structure. Let's call this returned value host.

hostent *host = gethostbyname(HOSTNAME_CSTR);

Fill the socket address structure:

sockaddr_in sock;
sock.sin_family = AF_INET;
sock.sin_port = htons(REMOTE_PORT);
sock.sin_addr.s_addr = ((struct in_addr *)(host->h_addr))->s_addr;

Create a socket and call connect:

s = socket(AF_INET, SOCK_STREAM, 0); 
connect(s, (struct sockaddr *)&sock, sizeof(sock))

For TCP server side:

Set up a socket.

Bind your address to that socket using bind.

Start listening on that socket with listen.

Call accept to get a connected client. <-- at this point you spawn a new thread to handle the connection while you make another call to accept to get the next connected client.

General communication:

Use send and recv to read and write between the client and server.

Source code example of BSD sockets:

You can find some good example code for this at Wikipedia.

Further reading:

I highly recommend this book and this online tutorial:

I want to create an extremely simple iPhone program that will open a telnet session on a lan-connected device and send a sequence of keystrokes. Most of the code I've seen for sockets is overwhelming and vast overkill for what I want to do:

  1. open telnet socket to IP address
  2. send ascii keystrokes

Any simple code examples out there I can play with?

Do yourself a favor: go read at least the first six chapters of this Stevens book, in which you'll find plenty of simple examples and much advice on how to avoid common pitfalls in network programming. Without that, you will end up with a buggy, slow, and incomplete client.

What are Async Sockets? How are they different from normal sockets (Blocking and Non-Blocking)?

Any pointers in that direction or any links to tutorials will be helpful.

Thanks.

A comparison of the following five models for I/O from UNIX Network Programming: The Sockets Networking API would be helpful:

Blocking

Nonblocking

I/O multiplexing

Signal-driven I/O

Asynchronous I/O

I'm just cleaning up some code we wrote a while back and noticed that for a UDP socket, 0 is being treated as "connection closed".

I'm quite sure this was the result of porting the recv loop from the equivalent TCP version. But it makes me wonder: can recv return 0 for UDP? On TCP it signals that the other end has closed the connection. UDP doesn't have the concept of a connection, so can it return 0? And if it can, what is its meaning?

Note: the man page on Linux does not distinguish UDP and TCP for a return code of zero, which may be why we kept the check in the code.

UDP doesn't have the concept of a connection, so can it return 0? And if it can, what is its meaning?

It means a 0-length datagram was received. From the great UNP:

Writing a datagram of length 0 is acceptable. In the case of UDP, this results in an IP datagram containing an IP header (normally 20 bytes for IPv4 and 40 bytes for IPv6), an 8-byte UDP header, and no data. This also means that a return value of 0 from recvfrom is acceptable for a datagram protocol: It does not mean that the peer has closed the connection, as does a return value of 0 from read on a TCP socket. Since UDP is connectionless, there is no such thing as closing a UDP connection.

I am trying to send a file descriptor over a Linux socket, but it does not work. What am I doing wrong? How is one supposed to debug something like this? I tried putting perror() everywhere possible, but it claimed that everything was OK. Here is what I've written:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <fcntl.h>

void wyslij(int socket, int fd)  // send fd by socket
{
    struct msghdr msg = {0};

    char buf[CMSG_SPACE(sizeof fd)];

    msg.msg_control = buf;
    msg.msg_controllen = sizeof buf;

    struct cmsghdr * cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof fd);

    *((int *) CMSG_DATA(cmsg)) = fd;

    msg.msg_controllen = cmsg->cmsg_len;  // why does example from man need it? isn't it redundant?

    sendmsg(socket, &msg, 0);
}


int odbierz(int socket)  // receive fd from socket
{
    struct msghdr msg = {0};
    recvmsg(socket, &msg, 0);

    struct cmsghdr * cmsg = CMSG_FIRSTHDR(&msg);

    unsigned char * data = CMSG_DATA(cmsg);

    int fd = *((int*) data);  // here program stops, probably with segfault

    return fd;
}


int main()
{
    int sv[2];
    socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);

    int pid = fork();
    if (pid > 0)  // in parent
    {
        close(sv[1]);
        int sock = sv[0];

        int fd = open("./z7.c", O_RDONLY);

        wyslij(sock, fd);

        close(fd);
    }
    else  // in child
    {
        close(sv[0]);
        int sock = sv[1];

        sleep(0.5);
        int fd = odbierz(sock);
    }

}

Stevens (et al) UNIX® Network Programming, Vol 1: The Sockets Networking API describes the process of transferring file descriptors between processes in Chapter 15 Unix Domain Protocols and specifically §15.7 Passing Descriptors. It's fiddly to describe in full, but it must be done on a Unix domain socket (AF_UNIX or AF_LOCAL), and the sender process uses sendmsg() while the receiver uses recvmsg().

I got this mildly modified (and instrumented) version of the code from the question to work for me on Mac OS X 10.10.1 Yosemite with GCC 4.9.1:

#include "stderr.h"
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static
void wyslij(int socket, int fd)  // send fd by socket
{
    struct msghdr msg = { 0 };
    char buf[CMSG_SPACE(sizeof(fd))];
    memset(buf, '\0', sizeof(buf));
    struct iovec io = { .iov_base = "ABC", .iov_len = 3 };

    msg.msg_iov = &io;
    msg.msg_iovlen = 1;
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);

    struct cmsghdr * cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(fd));

    *((int *) CMSG_DATA(cmsg)) = fd;

    msg.msg_controllen = cmsg->cmsg_len;

    if (sendmsg(socket, &msg, 0) < 0)
        err_syserr("Failed to send message\n");
}

static
int odbierz(int socket)  // receive fd from socket
{
    struct msghdr msg = {0};

    char m_buffer[256];
    struct iovec io = { .iov_base = m_buffer, .iov_len = sizeof(m_buffer) };
    msg.msg_iov = &io;
    msg.msg_iovlen = 1;

    char c_buffer[256];
    msg.msg_control = c_buffer;
    msg.msg_controllen = sizeof(c_buffer);

    if (recvmsg(socket, &msg, 0) < 0)
        err_syserr("Failed to receive message\n");

    struct cmsghdr * cmsg = CMSG_FIRSTHDR(&msg);

    unsigned char * data = CMSG_DATA(cmsg);

    err_remark("About to extract fd\n");
    int fd = *((int*) data);
    err_remark("Extracted fd %d\n", fd);

    return fd;
}

int main(int argc, char **argv)
{
    const char *filename = "./z7.c";

    err_setarg0(argv[0]);
    err_setlogopts(ERR_PID);
    if (argc > 1)
        filename = argv[1];
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        err_syserr("Failed to create Unix-domain socket pair\n");

    int pid = fork();
    if (pid > 0)  // in parent
    {
        err_remark("Parent at work\n");
        close(sv[1]);
        int sock = sv[0];

        int fd = open(filename, O_RDONLY);
        if (fd < 0)
            err_syserr("Failed to open file %s for reading\n", filename);

        wyslij(sock, fd);

        close(fd);
        nanosleep(&(struct timespec){ .tv_sec = 1, .tv_nsec = 500000000}, 0);
        err_remark("Parent exits\n");
    }
    else  // in child
    {
        err_remark("Child at play\n");
        close(sv[0]);
        int sock = sv[1];

        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000}, 0);

        int fd = odbierz(sock);
        printf("Read %d!\n", fd);
        char buffer[256];
        ssize_t nbytes;
        while ((nbytes = read(fd, buffer, sizeof(buffer))) > 0)
            write(1, buffer, nbytes);
        printf("Done!\n");
        close(fd);
    }
    return 0;
}

The output from the instrumented but unfixed version of the original code was:

$ ./fd-passing
fd-passing: pid=1391: Parent at work
fd-passing: pid=1391: Failed to send message
error (40) Message too long
fd-passing: pid=1392: Child at play
$ fd-passing: pid=1392: Failed to receive message
error (40) Message too long

Note that the parent finished before the child, so the prompt appeared in the middle of the output.

The output from the 'fixed' code was:

$ ./fd-passing
fd-passing: pid=1046: Parent at work
fd-passing: pid=1048: Child at play
fd-passing: pid=1048: About to extract fd
fd-passing: pid=1048: Extracted fd 3
Read 3!
This is the file z7.c.
It isn't very interesting.
It isn't even C code.
But it is used by the fd-passing program to demonstrate that file
descriptors can indeed be passed between sockets on occasion.
Done!
fd-passing: pid=1046: Parent exits
$

The primary significant changes were adding the struct iovec to the data in the struct msghdr in both functions, and providing space in the receive function (odbierz()) for the control message. I reported an intermediate step in debugging where I provided the struct iovec to the parent and the parent's "message too long" error was removed. To prove it was working (a file descriptor was passed), I added code to read and print the file from the passed file descriptor. The original code had sleep(0.5) but since sleep() takes an unsigned integer, this was equivalent to not sleeping. I used C99 compound literals to have the child sleep for 0.5 seconds. The parent sleeps for 1.5 seconds so that the output from the child is complete before the parent exits. I could use wait() or waitpid() too, but was too lazy to do so.

I have not gone back and checked that all the additions were necessary.

The "stderr.h" header declares the err_*() functions. It's code I wrote (first version before 1987) to report errors succinctly. The err_setlogopts(ERR_PID) call prefixes all messages with the PID. For timestamps too, err_setlogopts(ERR_PID|ERR_STAMP) would do the job.

Alignment issues

Nominal Animal suggests in a comment:

May I suggest you modify the code to copy the descriptor int using memcpy() instead of accessing the data directly? It is not necessarily correctly aligned — which is why the man page example also uses memcpy() — and there are many Linux architectures where unaligned int access causes problems (up to SIGBUS signal killing the process).

And not only Linux architectures: both SPARC and Power require aligned data and often run Solaris and AIX respectively. Once upon a time, DEC Alpha required that too, but they're seldom seen in the field these days.

The code in the manual page cmsg(3) related to this is:

struct msghdr msg = {0};
struct cmsghdr *cmsg;
int myfds[NUM_FD]; /* Contains the file descriptors to pass. */
char buf[CMSG_SPACE(sizeof myfds)];  /* ancillary data buffer */
int *fdptr;

msg.msg_control = buf;
msg.msg_controllen = sizeof buf;
cmsg = CMSG_FIRSTHDR(&msg);
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS;
cmsg->cmsg_len = CMSG_LEN(sizeof(int) * NUM_FD);
/* Initialize the payload: */
fdptr = (int *) CMSG_DATA(cmsg);
memcpy(fdptr, myfds, NUM_FD * sizeof(int));
/* Sum of the length of all control messages in the buffer: */
msg.msg_controllen = cmsg->cmsg_len;

The assignment to fdptr appears to assume that CMSG_DATA(cmsg) is sufficiently well aligned to be converted to an int * and the memcpy() is used on the assumption that NUM_FD is not just 1. With that said, it is supposed to be pointing at the array buf, and that might not be sufficiently well aligned as Nominal Animal suggests, so it seems to me that the fdptr is just an interloper and it would be better if the example used:

memcpy(CMSG_DATA(cmsg), myfds, NUM_FD * sizeof(int));

And the reverse process on the receiving end would then be appropriate. This program only passes a single file descriptor, so the code is modifiable to:

memmove(CMSG_DATA(cmsg), &fd, sizeof(fd));  // Send
memmove(&fd, CMSG_DATA(cmsg), sizeof(fd));  // Receive

I also seem to recall historical issues on various OSes with ancillary data that has no normal payload data, avoided by sending at least one dummy byte too, but I cannot find any references to verify that, so I may be remembering it wrongly.

Given that Mac OS X (which has a Darwin/BSD basis) requires at least one struct iovec, even if that describes a zero-length message, I'm willing to believe that the code shown above, which includes a 3-byte message, is a good step in the right general direction. The message should perhaps be a single null byte instead of 3 letters.

I've revised the code to read as shown below. It uses memmove() to copy the file descriptor to and from the cmsg buffer. It transfers a single message byte, which is a null byte.

It also has the parent process read (up to) 32 bytes of the file before passing the file descriptor to the child. The child continues reading where the parent left off. This demonstrates that the file descriptor transferred includes the file offset.

The receiver should do more validation on the cmsg before treating it as a file descriptor passing message.

#include "stderr.h"
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static
void wyslij(int socket, int fd)  // send fd by socket
{
    struct msghdr msg = { 0 };
    char buf[CMSG_SPACE(sizeof(fd))];
    memset(buf, '\0', sizeof(buf));

    /* On Mac OS X, the struct iovec is needed, even if it points to minimal data */
    struct iovec io = { .iov_base = "", .iov_len = 1 };

    msg.msg_iov = &io;
    msg.msg_iovlen = 1;
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);

    struct cmsghdr * cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(fd));

    memmove(CMSG_DATA(cmsg), &fd, sizeof(fd));

    msg.msg_controllen = cmsg->cmsg_len;

    if (sendmsg(socket, &msg, 0) < 0)
        err_syserr("Failed to send message\n");
}

static
int odbierz(int socket)  // receive fd from socket
{
    struct msghdr msg = {0};

    /* On Mac OS X, the struct iovec is needed, even if it points to minimal data */
    char m_buffer[1];
    struct iovec io = { .iov_base = m_buffer, .iov_len = sizeof(m_buffer) };
    msg.msg_iov = &io;
    msg.msg_iovlen = 1;

    char c_buffer[256];
    msg.msg_control = c_buffer;
    msg.msg_controllen = sizeof(c_buffer);

    if (recvmsg(socket, &msg, 0) < 0)
        err_syserr("Failed to receive message\n");

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    err_remark("About to extract fd\n");
    int fd;
    memmove(&fd, CMSG_DATA(cmsg), sizeof(fd));
    err_remark("Extracted fd %d\n", fd);

    return fd;
}

int main(int argc, char **argv)
{
    const char *filename = "./z7.c";

    err_setarg0(argv[0]);
    err_setlogopts(ERR_PID);
    if (argc > 1)
        filename = argv[1];
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        err_syserr("Failed to create Unix-domain socket pair\n");

    int pid = fork();
    if (pid < 0)
        err_syserr("Failed to fork\n");
    if (pid > 0)  // in parent
    {
        err_remark("Parent at work\n");
        close(sv[1]);
        int sock = sv[0];

        int fd = open(filename, O_RDONLY);
        if (fd < 0)
            err_syserr("Failed to open file %s for reading\n", filename);

        /* Read some data to demonstrate that file offset is passed */
        char buffer[32];
        int nbytes = read(fd, buffer, sizeof(buffer));
        if (nbytes > 0)
            err_remark("Parent read: [[%.*s]]\n", nbytes, buffer);

        wyslij(sock, fd);

        close(fd);
        nanosleep(&(struct timespec){ .tv_sec = 1, .tv_nsec = 500000000}, 0);
        err_remark("Parent exits\n");
    }
    else  // in child
    {
        err_remark("Child at play\n");
        close(sv[0]);
        int sock = sv[1];

        nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 500000000}, 0);

        int fd = odbierz(sock);
        printf("Read %d!\n", fd);
        char buffer[256];
        ssize_t nbytes;
        while ((nbytes = read(fd, buffer, sizeof(buffer))) > 0)
            write(1, buffer, nbytes);
        printf("Done!\n");
        close(fd);
    }
    return 0;
}

And a sample run:

$ ./fd-passing
fd-passing: pid=8000: Parent at work
fd-passing: pid=8000: Parent read: [[This is the file z7.c.
It isn't ]]
fd-passing: pid=8001: Child at play
fd-passing: pid=8001: About to extract fd
fd-passing: pid=8001: Extracted fd 3
Read 3!
very interesting.
It isn't even C code.
But it is used by the fd-passing program to demonstrate that file
descriptors can indeed be passed between sockets on occasion.
And, with the fully working code, it does indeed seem to work.
Extended testing would have the parent code read part of the file, and
then demonstrate that the child code continues where the parent left off.
That has not been coded, though.
Done!
fd-passing: pid=8000: Parent exits
$

I recently started attending two classes in school that focus on networking, one regarding distributed systems and another regarding computer networks in general. After completing the first labs for the two classes, I now have a pretty good understanding of network protocols and socket concepts in both C and Java.

Now I'm trying to move beyond the basic concepts and become better at communication class and object design, network design patterns, intermediate socket/stream management conventions, important libraries, and general *nix network programming intermediate techniques in either C or OO languages.

Can you suggest any resources that you've had success with?

Unix network programming by Richard Stevens is a must-have book which discusses many advanced network programming techniques. I've been doing network programming for years, and even now hardly a day goes by without me looking up something in this great reference.

Stevens' know-it-all book about network programming is too detailed to start with, and it uses a library that encapsulates the real socket programming.

I would recommend starting with Beej's Guide to Network Programming and then moving on to TCP/IP Sockets in C. They give a lot of good basics about network programming and provide a platform from which to finally go through the Stevens book.

Stevens' other books, like the TCP/IP Illustrated series, cover the conceptual side of networks.

I learned network programming from the book Linux Socket Programming by Example by Warren W. Gay.

I am very much interested in Unix and want to learn it inside and out. Can you help me by listing some books that can make me a wizard? Ultimately I want to become a Unix programmer.

I am not a novice user in Unix.

You want system administration knowledge, or programming knowledge?

For programming:

For system administration:

As other responders have noted, Advanced Programming in the Unix Environment (APUE) is indispensable.

Other books that you might want to consider (these have more of a Linux focus, but are a good way to become familiar with Unix internals):

How do I migrate to a *nix platform after spending more than 10 years on Windows? Which flavor will be easiest to handle and make me comfortable, so that maybe later I can switch over to more standard *nix flavors? I have been postponing this for a while now. Help me with the extra push.

Linux is the most accessible and has the most mature desktop functionality. BSD (in its various flavours) has less userspace baggage and would be easier to understand at a fundamental level; in this regard it is more like a traditional Unix than a modern Linux distribution. Some might view this as a good thing (and from certain perspectives it is), but it will be more alien to someone familiar with Windows.

The main desktop distributions are Ubuntu and Fedora. These are both capable systems but differ somewhat in their userspace architecture. The tooling for the desktop environment and the default configuration for system security work a bit differently on Ubuntu than on most other Linux or Unix flavours, but this is of little relevance to development. From a user perspective, either of these would be a good start.

From the perspective of a developer, all modern flavours of Unix and Linux are very similar and share essentially the same developer tool chain. If you want to learn about the system from a programmer's perspective, there is relatively little to choose between them.

Most Unix programming can be accomplished quite effectively with a programmer's editor such as vim or emacs, both of which come in text-mode and windowing flavours. These editors are very powerful but have rather quirky user interfaces - the interfaces are unusual, yet contribute significantly to the power of the tools. If you are not comfortable with these tools, this posting discusses several other editors that offer a user experience closer to common Windows tooling.

There are several IDEs such as Eclipse that might be of more interest to someone coming off Windows/Visual Studio.

Some postings on Stackoverflow that discuss linux/unix resources are:

If you have the time and want to do a real tour of the nuts and bolts Linux From Scratch is a tutorial that goes through building a linux installation by hand. This is quite a good way to learn in depth.

For programming, get a feel for C/unix from K&R and some of the resources mentioned in the questions linked above. The equivalent of Petzold, Prosise and Richter in the Unix world are W Richard Stevens' Advanced Programming in the Unix Environment and Unix Network Programming vol. 1 and 2.

Learning one of the dynamic languages such as Perl or Python, if you are not already familiar with them, is also a useful thing to do. As a bonus, you can get good Windows ports of both from ActiveState, which means that these skills are useful on both platforms.

If you're into C++, take a look at Qt. This is arguably the best cross-platform GUI toolkit on the market and (again) has the benefit of a skill set and tool chain that is transferable back to Windows. There are also several good books on the subject and (as a bonus) it also works well with Python.

Finally, Cygwin is a unix emulation layer that runs on Windows and gives a substantially unix-like environment. Architecturally, Cygwin is a port of glibc and the C runtime (the GNU tool chain's base libraries) as an adaptor on top of Win32. This emulation layer makes it easy to port unix/linux apps to Cygwin. The platform comes with a pretty complete set of software - essentially a full linux distribution hosted on a Windows kernel. It allows you to work in a unix-like way on Windows without having to maintain a separate operating system installation. If you don't want to run VMs, multiple boots or multiple PCs, it may be a way of easing into unix.

I was just thrust into Linux programming (Red Hat) after several years of C++ on Win32. So I am not looking for the basics of programming. Rather I am looking to get up to speed with things unique to the Linux programming world, such as packages, etc. In other words, I need to know everything in https://www.redhat.com/courses/rhd251_red_hat_linux_programming/details/ without spending 3K. Any ideas of how I can acquire that knowledge quickly (and relatively cheaply)?

Update: The things that I am used to doing on Windows, like building .exe and .dll files using VC++, creating install scripts, etc., are just done differently on Linux, with tools like yum, make and make install. Tools like Dependency Walker that I take for granted in the Windows world constantly send me to Google while working on Linux. Is there a 'set' of new skills somewhere that I can browse, or is this more of a learn-as-you-go thing?

The primary problem is this: as a very experienced programmer on Windows, I am having to ask simple questions like "what's the difference between /usr/bin and /usr/local/bin?" and I would like to be prepared.

For POSIX and such I can recommend Advanced Programming in the UNIX Environment and having a bookmark to The single UNIX Specification.

For GCC/GDB and those tools I'm afraid I can't give you any good recommendation.

Hope that helps anyway.

Edit: Duck was slightly faster.

Edited because I had to leave a meeting when I originally submitted this, but wanted to complete the information

Half of that material is learning about development in a Unix-like environment, and for that, I'd recommend a book since it's tougher to filter out useful information from the start.

I'd urge you to go to a bookstore and browse through these books:

  • Advanced Programming in the Unix Environment by Stevens and Rago - this book covers threads, networking, IPC, signals, files, process management
  • Unix Network Programming, Volume 1 by Stevens - This book is focused on network programming techniques, design - you might not need this until much later
  • Unix/Linux System Administration - This book covers the more system administrator side of stuff, like directory structure of most Unix and Linux file systems (Linux distributions are more diverse than their Unix-named counterparts in how they might structure their file system)

    Other information accessible online:

  • GCC Online Manual - the comprehensive GNU GCC documentation

  • Beej's network programming guide - A really well written tutorial to network programming with the use of the BSD API. If you have done work with winsock, this should be mostly familiar to you.
  • Red Hat Enterprise Linux 5's Deployment Guide - talks specifically about Red Hat EL 5's basic administrative/deployment, like installing with package manager, a Red Hat system's directory structure...
  • make - Wikipedia article that will have links to the various make documentation out there
  • binutils - These are the Linux tools used for manipulating object/binaries.
  • GNU Build System - Wikipedia article about the traditional build system of GNU software, using autoconf/automake/autogen

Additionally, you will want to learn about ldd, which is like dependency walker in Windows. It lists a target binary's dependencies, if it has any.

And for Debugging, check out this StackOverflow thread which talks about a well written GDB tutorial and also links to an IBM guide.

Happy reading.

Recently I started working through this guide to get myself started on downloading files from the internet. I read it and came up with the following code to download the HTTP body of a website. The only problem is, it's not working. The code stops at the recv() call. It does not crash; it just keeps on running. Is this my fault? Am I using the wrong approach? I intend to use the code not just to download the contents of .html files, but also to download other files (zip, png, jpg, dmg ...). I hope there's somebody who can help me. This is my code:

#include <stdio.h>
#include <sys/socket.h> /* SOCKET */
#include <netdb.h> /* struct addrinfo */
#include <stdlib.h> /* exit() */
#include <string.h> /* memset() */
#include <errno.h> /* errno */
#include <unistd.h> /* close() */
#include <arpa/inet.h> /* IP Conversion */

#include <stdarg.h> /* va_list */

#define SERVERNAME "developerief2.site11.com"
#define PROTOCOL "80"
#define MAXDATASIZE 1024*1024

void errorOut(int status, const char *format, ...);
void *get_in_addr(struct sockaddr *sa);

int main (int argc, const char * argv[]) {
    int status;

    // GET ADDRESS INFO
    struct addrinfo *infos; 
    struct addrinfo hints;

    // fill hints
    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;
    hints.ai_family = AF_UNSPEC;

    // get address info
    status = getaddrinfo(SERVERNAME, 
                         PROTOCOL, 
                         &hints, 
                         &infos);
    if(status != 0)
        errorOut(-1, "Couldn't get addres information: %s\n", gai_strerror(status));

    // MAKE SOCKET
    int sockfd;

    // loop, use first valid
    struct addrinfo *p;
    for(p = infos; p != NULL; p = p->ai_next) {
        // CREATE SOCKET
        sockfd = socket(p->ai_family, 
                        p->ai_socktype, 
                        p->ai_protocol);
        if(sockfd == -1)
            continue;

        // TRY TO CONNECT
        status = connect(sockfd, 
                         p->ai_addr, 
                         p->ai_addrlen);
        if(status == -1) {
            close(sockfd);
            continue;
        }

        break;
    }

    if(p == NULL) {
        fprintf(stderr, "Failed to connect\n");
        return 1;
    }

    // LET USER KNOW
    char printableIP[INET6_ADDRSTRLEN];
    inet_ntop(p->ai_family,
              get_in_addr((struct sockaddr *)p->ai_addr),
              printableIP,
              sizeof(printableIP));
    printf("Connection to %s\n", printableIP);

    // GET RID OF INFOS
    freeaddrinfo(infos);

    // RECEIVE DATA
    ssize_t receivedBytes;
    char buf[MAXDATASIZE];
    printf("Start receiving\n");
    receivedBytes = recv(sockfd, 
                         buf, 
                         MAXDATASIZE-1, 
                         0);
    printf("Received %d bytes\n", (int)receivedBytes);
    if(receivedBytes == -1)
        errorOut(1, "Error while receiving\n");

    // null terminate
    buf[receivedBytes] = '\0';

    // PRINT
    printf("Received Data:\n\n%s\n", buf);

    // CLOSE
    close(sockfd);

    return 0;
}

void *get_in_addr(struct sockaddr *sa) {
    // IP4
    if(sa->sa_family == AF_INET)
        return &(((struct sockaddr_in *) sa)->sin_addr);

    return &(((struct sockaddr_in6 *) sa)->sin6_addr);
}

void errorOut(int status, const char *format, ...) {
    va_list args;
    va_start(args, format);
    vfprintf(stderr, format, args);
    va_end(args);
    exit(status);
}

If you want to grab files using HTTP, then libcURL is probably your best bet in C. However, if you are using this as a way to learn network programming, then you are going to have to learn a bit more about HTTP before you can retrieve a file.

What you are seeing in your current program is that you need to send an explicit request for the file before you can retrieve it. I would start by reading through RFC2616. Don't try to understand it all - it is a lot to read for this example. Read the first section to get an understanding of how HTTP works, then read sections 4, 5, and 6 to understand the basic message format.

Here is an example of what an HTTP request for the stackoverflow Questions page looks like:

GET http://stackoverflow.com/questions HTTP/1.1\r\n
Host: stackoverflow.com:80\r\n
Connection: close\r\n
Accept-Encoding: identity, *;q=0\r\n
\r\n

I believe that is a minimal request. I added the CRLFs explicitly to show that a blank line terminates the request header block, as described in RFC2616. If you leave out the Accept-Encoding header, the result document will probably be transferred as a gzip-compressed stream, since HTTP explicitly allows this unless you tell the server that you do not want it.

The server response also contains HTTP headers for the meta-data describing the response. Here is an example of a response from the previous request:

HTTP/1.1 200 OK\r\n
Server: nginx\r\n
Date: Sun, 01 Aug 2010 13:54:56 GMT\r\n
Content-Type: text/html; charset=utf-8\r\n
Connection: close\r\n
Cache-Control: private\r\n
Content-Length: 49731\r\n
\r\n
\r\n
\r\n
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" ... 49,667 bytes follow

This simple example should give you an idea of what you are getting into if you want to grab files using HTTP. This is the best-case, simplest example. It isn't something that I would undertake lightly, but it is probably the best way to learn and appreciate HTTP.

If you are looking for a simple way to learn network programming, this is a decent way to start. I would recommend picking up a copy of TCP/IP Illustrated, Volume 1 and UNIX Network Programming, Volume 1. These are probably the best way to really learn how to write network-based applications. I would probably start by writing an FTP client since FTP is a much simpler protocol to start with.

If you are trying to learn the details associated with HTTP, then:

  1. Buy HTTP: the Definitive Guide and read it
  2. Read RFC2616 until you understand it
    • Try examples using telnet server 80 and typing in requests by hand
    • Download the cURL client and use the --verbose and --include command line options so that you can see what is happening
  3. Read Fielding's dissertation until HTTP really makes sense.

Just don't plan on writing your own HTTP client for enterprise use. You do not want to do that, trust me as one who has been maintaining such a mistake for a little while now...

Could anyone give me some pointers as to the best way in which to learn how to do very low latency programming? I have many programming books but I've never seen one which focused (or helped) on writing extremely fast code. Or are books not the best way forward?

Some advice from an expert would be really appreciated!

EDIT: I think I'm referring more to CPU/Memory bound.

[C++ programmer]:

Ultra-low-latency programming is hard. Much harder than people suspect when they first start down the path. There are some techniques and "tricks" you can employ. Like IO Completion ports, multi core utilization, highly optimized synchronization techniques, shared memory. The list goes on forever. (edit) It's not as simple as "code-profile-refactor-repeat" because you can write excellent code that is robust and fast, but will never be truly ultra-low latency code.

Unfortunately there is no one single resource I know of that will show you how it's done. Programmers specializing in (and good at) ultra low-latency code are among the best in the business and the most experienced. And with good reason. Because if there is a silver bullet solution to becoming a good low-latency programmer, it is simply this: you have to know a lot about everything. And that knowledge is not easy to come by. It takes years (decades?) of experience and constant study.

As far as the study itself is concerned, here's a few books I found useful or especially insightful for one reason or another:

I'm writing a concurrent TCP server that has to handle multiple connections with the 'thread per connection' approach (using a thread pool). My question is about the optimal way for every thread to get a different file descriptor.

I found that the next two methods are the most recommended:

  1. A main thread that accept()s all the incoming connections and stores their descriptors in a data structure (e.g. a queue). Every worker thread then gets an fd from the queue.
  2. accept() is called directly from every thread. (Recommended in Unix Network Programming V1)

Problems I find to each of them:

  1. The static data structure that stores all the fds must be locked (mutex_lock) before a thread can read from it, so if a considerable number of threads want to read at exactly the same moment, I don't know how much time would pass before all of them got what they needed.
  2. I've been reading that the thundering herd problem related to simultaneous accept() calls has not been totally solved on Linux yet, so maybe I would need to create an artificial solution to it that would end up making the application at least as slow as approach 1.

Sources:

Some links talking about approach 2: does-the-thundering-herd-problem-exist-on-linux-anymore - and one article I found about it (outdated): linux-scalability/reports/accept.html

And an SO answer that recommends approach 1: can-i-call-accept-for-one-socket-from-several-threads-simultaneously


I'm really interested on the matter, so I will appreciate any opinion about it :)

As mentioned in the StackOverflow answer you linked, a single thread calling accept() is probably the way to go. You mention concerns about locking, but these days you will find lockfree queue implementations available in Boost.Lockfree, Intel TBB, and elsewhere. You could use one of those if you like, but you might just use a condition variable to let your worker threads sleep and wake one of them when a new connection is established.

As there are several ways of connecting multiple clients to the server, such as fork, select, threads, etc., I would be glad if you could describe which is best for connecting multiple clients to the server.

Chapter 30 of Unix Network Programming introduces the alternatives for client/server design. The UNP book is the best start for network programming.

I am trying to make a simple client-server chat program. On the client side I spin off another thread to read any incoming data from the server. The problem is, I want to gracefully terminate that second thread when a person logs out from the main thread. I was trying to use a shared variable 'running' to terminate; the problem is, the socket read() call is a blocking call, so if I do while(running == 1), the server has to send something before read() returns and the while condition can be checked again. I am looking for a method (with common Unix sockets only) to do a non-blocking read - basically some form of peek() would work, so I can continually check the loop to see if I'm done.

The reading thread loop is below, right now it does not have any mutex's for the shared variables, but I plan to add that later don't worry! ;)

void *serverlisten(void *vargp)
{
    while(running == 1)
    {
        read(socket, readbuffer, sizeof(readbuffer));
        printf("CLIENT RECEIVED: %s\n", readbuffer);
    }
    pthread_exit(NULL);
}

This may not be the specific answer you're looking for, but it may be the place you could find it. I'm currently reading:

Unix Network Programming, Volume 1: The Sockets Networking API, 3rd Edition

And there are a lot of examples of multi-threaded, non-blocking servers and clients. They also explain a lot of the reasoning and trade-offs involved in choosing between the different methods.

Hope that helps...

I would like to implement a telnet server in C. How would I proceed with this? Which RFCs should I look at? This is important to me, and I would appreciate any help.

If you are serious about network programming, I would highly recommend W. Richard Stevens' "UNIX Network Programming Vol 1" - it's much better reading than RFCs, with great examples.

It is a very expensive book, but there are cheap paperback editions available on eBay. Even if you get the expensive hard-cover edition, it is worth every penny you paid.

I don't quite understand the purpose of the first argument to the select() function. Wikipedia describes it as the maximum file descriptor across all the sets, plus 1. Why +1, and why does select() need this information?

This is a happenstance detail of the (original) Berkeley sockets implementation. Basically, the implementation used the number of file descriptors as a sizing variable for some temporary internal bit arrays. Since Unix descriptors start with zero, the largest descriptor would be one less than the size of any array with a one-slot-per-descriptor semantic. Hence the "largest-plus-one" requirement. This plus-1 adjustment could have been absorbed into the system call itself, but wasn't.

Ancient history, that's all. The result is that the correct interpretation of the first argument has less to do with descriptor values than with the number of them (i.e. the maximum number of descriptors to be tested). See Section 6.3 of Stevens et al. (This is a revised and updated version of Rich Stevens' classic text. If you don't have it, get it!)

I'm thinking of purchasing this book to learn more about C/C++ network programming:

I was just wondering whether there is a Windows-equivalent book, and whether I need one (I do plan to code on both OSes). I'm not sure how Windows sockets diverged from BSD sockets as they evolved.

I'm mainly buying this to eventually write code which will stream data between computers.

For my application, I need to have a central process responsible for interacting with many client processes. The client processes need a way to identify and communicate with the central process. Additionally, the central process may not be running, and the client process needs a way to identify that fact. This application will be running on a Unix-like system, so I'd thought about using sockets for the task. How would I, specifically, go about using sockets for this task (actual code would be greatly appreciated!)? If sockets are not ideal, are there better alternatives?

You need to read one of the standard text books on the subject - probably Stevens 'UNIX Network Programming, Vol 1, 3rd Edn'.

You have to make a number of decisions, including:

  • Does the central server handle all the messages itself, or does it accept a connection from the other process, fork, and delegate the work to its child while it resumes listening for new connections?
  • Does the central server need to keep a session running, or does it just respond to 'one (short) message at a time'?

These decisions radically impact the code that's required. If the process is dealing with one short message at a time, it might be appropriate to consider UDP instead of TCP. If the process is dealing with extended conversations with several correspondents and is not forking a child to handle them, then you can consider whether threaded programming is appropriate. If not, you need to consider timeliness and whether the central process can be held up indefinitely.

As indicated in a comment, you also need to consider how a process should deal with the central process not running - and you need to consider how fast that response should be. You also have to protect against the central process being so busy that it looks like it is not running, even though it is running but is just very busy dealing with other connections. How much trouble do you get into if the process that can't connect to the central process tries to kick off a new copy of it?

I am new to iOS and Swift. I want to develop a UDP client and send data in Swift.

I have reference the following link:

Swift: Receive UDP with GCDAsyncUdpSocket

Retrieving a string from a UDP server message

Swift UDP Connection

But I could not find a good way to implement UDP in Swift.

Can someone teach me how to implement a UDP client and send data in Swift on the iPhone?

Thanks in advance.

This looks like a dupe to Swift UDP Connection.

Refactoring the example a little bit:

let INADDR_ANY = in_addr(s_addr: 0)

udpSend("Hello World!", address: INADDR_ANY, port: 1337)

func udpSend(textToSend: String, address: in_addr, port: CUnsignedShort) {
  func htons(value: CUnsignedShort) -> CUnsignedShort {
    return (value << 8) + (value >> 8);
  }

  let fd = socket(AF_INET, SOCK_DGRAM, 0) // DGRAM makes it UDP

  var addr = sockaddr_in(
    sin_len:    __uint8_t(sizeof(sockaddr_in)),
    sin_family: sa_family_t(AF_INET),
    sin_port:   htons(port),
    sin_addr:   address,
    sin_zero:   ( 0, 0, 0, 0, 0, 0, 0, 0 )
  )

  textToSend.withCString { cstr -> Void in
    withUnsafePointer(&addr) { ptr -> Void in
      let addrptr = UnsafePointer<sockaddr>(ptr)
      sendto(fd, cstr, strlen(cstr), 0, addrptr, socklen_t(addr.sin_len))
    }
  }

  close(fd)
}

If you have the target IP as a string, use inet_pton() to convert it to an in_addr. Like so:

var addr = in_addr()
inet_pton(AF_INET, "192.168.0.1", &addr)

Feel free to steal code from over here: SwiftSockets

Oh, and if you plan to do any serious network programming, grab this book: Unix Network Programming

I have a network application that I need to convert so that it works on an IPv6 network. Could you please let me know what I need to do (which socket APIs to replace)?

One more thing, how can I test my application?

Thanks.

The 3rd edition of "Unix Network Programming" has numerous examples and a whole chapter on IPv4/IPv6 interoperability.
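The heart of such a conversion is usually replacing hard-coded sockaddr_in / AF_INET code with getaddrinfo() and a loop over its results, which handles IPv4 and IPv6 uniformly. A sketch of my own, resolving localhost just to show the loop:

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void)
{
    struct addrinfo hints, *res, *p;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6, whichever exists */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo("localhost", "80", &hints, &res) != 0)
        return 1;

    /* A real client would try socket()/connect() on each result in turn
    ** until one succeeds, instead of just printing the candidates */
    for (p = res; p != NULL; p = p->ai_next) {
        char host[64];
        if (getnameinfo(p->ai_addr, p->ai_addrlen, host, sizeof(host),
                        NULL, 0, NI_NUMERICHOST) == 0)
            printf("candidate (%s): %s\n",
                   p->ai_family == AF_INET6 ? "IPv6" : "IPv4", host);
    }
    freeaddrinfo(res);
    printf("done\n");
    return 0;
}
```

For testing, a dual-stack machine with both an IPv4 and an IPv6 loopback is enough to exercise both paths; the book's examples cover the server side (binding an IPv6 socket that also accepts IPv4 clients) in detail.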

I've just been looking at the following books:

  • K&R C
  • The Complete Reference C++
  • The Complete Reference C
  • Deitel How to Program C
  • Deitel How to Program C++

and not one of them covers any networking, such as how to create sockets. Is there any 'definitive' reference for C network programming? Google wasn't particularly helpful.

I'm mainly considering Windows and Unix platforms.

"C books" usually describe the language and the standard library. Sockets aren't in the standard library.

If you want a good book on sockets, UNP by Stevens is arguably the best.

Recently, one of my tasks has been to write a network stack in C++ for an OS developed by my team that is totally different from Linux. I think a deep understanding of the Linux network stack would be helpful for designing and implementing a good one.

Any advice or helpful material?

Unix Network Programming by W. Richard Stevens

I'd like to do some hobby development of command line applications for UNIX in C. To narrow that down, I'd like to focus on the BSD family, specifically FreeBSD as my development machine is a Mac OS X 10.7 Lion box.

Searches for UNIX development have returned some titles from Addison-Wesley, but I cannot find adequate documentation for FreeBSD. If there is a good general book on developing for either BSD or AT&T UNIX, I would be interested in that. Please note I prefer books, as I learn best that way.

Thanks,

Scott

Stevens "Advanced Programming in the Unix Environment". It covers FreeBSD but it's not FreeBSD specific. It is Unix specific, and covers all the bases you require.

I guess you could take a look at these:

Programming with POSIX Threads

The sockets Networking API

Interprocess Communications

Advanced Programming in the UNIX environment

The first three are very specific and would serve only if you need to focus on that particular subject. The last link is a highly rated book on Amazon that you may be interested in.

All in all, if you already have a grip on threads, IPC, networking, and filesystems, all you need is the internet, because there is widely available documentation about the POSIX API.

I'm going to graduate soon in electronics and tlc engineering and I have some decent OO programming experience with PHP and Java.
Now I would like to try starting a career as a C programmer.

I'm interested in C since this is, I think, the most suited language, Assembly aside, to develop device drivers, firmware, and other low-level software in. In particular I hope to be able to work on network-related topics. I want to work quite close to the hardware, since I suppose this is the only way I'll be able to fruitfully spend my degree while at the same time finding gratification in being a programmer.

So I'd like to ask what you think I should read considering that I can already write something in C, nothing fancy though, and that I've read a couple of times the K&R.

If you know of any tools or libraries (like libevent and libev) that are de facto standards in the field of low-level, network related, C programming that would be nice to know as well.

In chapter 5 in the book Unix Network Programming by Stevens et al, there are a server program and a client program as follows:

server

mysignal(SIGCHLD, sig_child);
for(;;)
{
    connfd = accept(listenfd, (struct sockaddr *)&ca, &ca_len);

    pid = fork();
    if(pid == 0)
    {
        //sleep(60);
        close(listenfd);
        str_echo(connfd);
        close(connfd);
        exit(0);
    }
    close(connfd);
}

function sig_child works to handle signal SIGCHLD; the code is as follows:

void sig_child(int signo)
{
    pid_t pid;
    int stat;
    static int i = 1;
    i++;
    while(1)
    {
        pid = wait(&stat);
        if(pid > 0)
        {
            printf("ith: %d, child %d terminated\n", i, pid);
        }
        else
        {
            break;
        }
    }   
    //pid = wait(&stat);
    return;
}

client

for(i = 0 ; i < 5; i++)
{
    sockfd[i] = socket(AF_INET, SOCK_STREAM, 0);
    if(sockfd[i] < 0)
    {
        perror("create error");
        exit(-1);
    }

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(5900);

    if(inet_pton(AF_INET, argv[1], &sa.sin_addr) != 1)
        {
                perror("inet_pton error");
                exit(-1);
        }
    connect(sockfd[i], (struct sockaddr *)&sa, sizeof(sa));
}
str_cli(sockfd[0], stdin);
exit(0);

As you can see in the source code of the client, the program establishes five connections to the server end, but only one connection is used; after str_cli finishes, exit(0) is called. All the connections are then closed, the five child processes on the server exit, and SIGCHLD is sent to the parent process, which handles it in sig_child. The while loop should ensure that all the child processes are waited on by the parent correctly. I have tested the program several times; it works well and all the children are reaped.

But in the book, the authors wrote that "wait could not work properly, because function wait may become blocked before all the child processes exit". So is the statement in the book right? If so, could you please explain it in more detail? (PS: I think wait in a while loop would properly handle the exit of all the child processes.)

The problem is not with wait, but with signal delivery. The sig_chld function in the book doesn't have the while loop; it only waits for one child:

void sig_child(int signo)
{
    pid_t pid;
    int stat;
    pid = wait(&stat);
    printf("child %d terminated\n", pid);
    return;
}

When the client exits, all connections are closed and all the children eventually terminate. Now, the first SIGCHLD signal is delivered and upon entering the signal handler, the signal is blocked. Any further signal won't be queued and is therefore lost, causing zombie children in the server.

You can fix this by wrapping wait in some loop, as you did. Another solution is to ignore SIGCHLD explicitly, which is valid when you don't need the exit status of your children.


While wait in a loop eventually waits for all children, it has the drawback that wait blocks if there are still children running. This means the process is stuck in the signal handler until all children have terminated.

The solution in the book is to use waitpid with option WNOHANG in a loop

while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
    printf("child %d terminated\n", pid);

This loop waits for all terminated children, but exits as soon as possible, even if there are running children.


To reproduce the server hanging in the signal handler, you must do the following

  • start server
  • start first client
  • start second client
  • close one of the clients
  • start a third client
  • enter text in the third client
    You won't get a response

Is there a good book for ruby sockets programming or we have to rely on Unix C books for theory and source ?.

If the goal is to learn socket programming I would recommend using C and reading through some Unix system programming books. The books will probably be much better and go into more detail than a sockets book that is Ruby specific (mainly because C, Unix and sockets have been around much longer than Ruby and my quick Googling didn't find a Ruby specific book).

Books I would recommend:

After getting a handle on the sockets in general (even if you don't write any C) the Ruby API for sockets will make much more sense.

The TCP server needs to serve many clients. If each client gets its own server port, with one server thread listening on each port, I want to know whether that is faster.

If one port is good, could someone explain the difference between one port and multiple ports in this case? Thanks!

The problem with using multiple ports to achieve this is that each of your clients will have a specific port number. Depending on the number of clients, there could be a tremendous amount of bookkeeping involved.

Typically, for a tcp server that is to serve multiple clients, you have a "main" thread which listens to a port and accepts connections on that port. That thread then passes the connected socket off to another thread for processing and goes back to listening.

For a wealth of Unix network programming knowledge check out "The Stevens Book"

What is the best way to coordinate the accepts of a listenning socket between multiple processes?

I am thinking of one of the two ways:

  • Have a "Master" process that will send a message to each process when it is its turn to start accepting connections.

    So the sequence will be:

    Master process gives token to Worker A. Worker A accepts connection, gives token back to Master process. Master process gives token to Worker B. etc.

  • Each process will have an accepting thread that will be spinning around a shared mutex. Lock the mutex, accept a connection, free the lock.

Any better ideas?

  • When a connection comes in, ALL processes get woken up. Before accepting the connection, they try to lock a shared mutex. The one that locks the mutex first gets to accept the connection.

1) I'm not sure why you wouldn't want multiple "threads" instead of "processes".

But should you require a pool of worker processes (vs. "worker threads"), then I would recommend:

2) The master process binds, listens ... and accepts all incoming connections

3) Use a "Unix socket" to pass the accepted connection from the master process to the worker process.

4) As far as "synchronization" - easy. The worker simply blocks reading the Unix socket until there's a new file descriptor for it to start using.

5) You can set up a shared memory block for the worker to communicate "busy/free" status to the master.

Here's a discussion of using a "Unix domain socket":

Stevens "Network Programming" is also an excellent resource.

Environment: a RedHat-like distro, 2.6.39 kernel, glibc 2.12.

I fully expect that if a signal was delivered while accept() was in progress, accept should fail, leaving errno==EINTR. However, mine doesn't do that, and I'm wondering why. Below are the sample program, and strace output.

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <signal.h>
#include <errno.h>
#include <arpa/inet.h>
#include <string.h>

static void sigh(int);

int main(int argc, char ** argv) {

    int s;
    struct sockaddr_in sin;

    if ((s = socket(AF_INET, SOCK_STREAM, 0))<0) {
        perror("socket");
        return 1;
    }
    memset(&sin, 0, sizeof(struct sockaddr_in));
    sin.sin_family = AF_INET;
    if (bind(s, (struct sockaddr*)&sin, sizeof(struct sockaddr_in))) {
        perror("bind"); 
        return 1;
    }
    if (listen(s, 5)) {
        perror("listen");
    }

    signal(SIGQUIT, sigh);

    while (1) {
        socklen_t sl = sizeof(struct sockaddr_in);
        int rc = accept(s, (struct sockaddr*)&sin, &sl);
        if (rc<0) {
            if (errno == EINTR) {
                printf("accept restarted\n");
                continue;
            }
            perror("accept");
            return 1;
        }
        printf("accepted fd %d\n", rc);
        close(rc);
    }

}

void sigh(int s) {

    signal(s, sigh);

    unsigned char p[100];
    int i = 0;
    while (s) {
        p[i++] = '0'+(s%10);
        s/=10;
    }
    write(1, "sig ", 4);
    for (i--; i>=0; i--) {
        write(1, &p[i], 1);
    }
    write(1, "\n", 1);

}

strace output:

execve("./accept", ["./accept"], [/* 57 vars */]) = 0
<skipped>
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(3, 5)                            = 0
rt_sigaction(SIGQUIT, {0x4008c4, [QUIT], SA_RESTORER|SA_RESTART, 0x30b7e329a0}, {SIG_DFL, [], 0}, 8) = 0
accept(3, 0x7fffe3e3c500, [16])         = ? ERESTARTSYS (To be restarted)
--- SIGQUIT (Quit) @ 0 (0) ---
rt_sigaction(SIGQUIT, {0x4008c4, [QUIT], SA_RESTORER|SA_RESTART, 0x30b7e329a0}, {0x4008c4, [QUIT], SA_RESTORER|SA_RESTART, 0x30b7e329a0}, 8) = 0
write(1, "sig ", 4sig )                     = 4
write(1, "3", 13)                        = 1
write(1, "\n", 1
)                       = 1
rt_sigreturn(0x1)                       = 43
accept(3, ^C <unfinished ...>

Within Unix Network Programming book, there is a section which says:

We used the term "slow system call" to describe accept, and we use this term for any system call that can block forever. That is, the system call need never return. Most networking functions fall into this category. For example, there is no guarantee that a server's call to accept will ever return, if there are no clients that will connect to the server. Similarly, our server's call to read in Figure 5.3 will never return if the client never sends a line for the server to echo. Other examples of slow system calls are reads and writes of pipes and terminal devices. A notable exception is disk I/O, which usually returns to the caller (assuming no catastrophic hardware failure).

The basic rule that applies here is that when a process is blocked in a slow system call and the process catches a signal and the signal handler returns, the system call can return an error of EINTR. Some kernels automatically restart some interrupted system calls. For portability, when we write a program that catches signals (most concurrent servers catch SIGCHLD), we must be prepared for slow system calls to return EINTR. Portability problems are caused by the qualifiers "can" and "some," which were used earlier, and the fact that support for the POSIX SA_RESTART flag is optional. Even if an implementation supports the SA_RESTART flag, not all interrupted system calls may automatically be restarted. Most Berkeley-derived implementations, for example, never automatically restart select, and some of these implementations never restart accept or recvfrom.

I am trying to set up multicast sources for an application on linux using source specific multicast (SSM) and the code is going ok (using the C interface) but I would like to verify that the system will behave as I expect it to.

Setup:
Multicast address - 233.X.X.X:9876
Source1 - 192.X.X.1
Source2 - 192.X.X.2
Interface1 - 192.X.X.100
Interface2 - 192.X.X.101

Steps

  1. Configure so that only Source1 is sending to the multicast address
  2. Start a reader (reader1) that binds to the multicast address and joins the multicast with ssm src as Source1 and interface as Interface1
  3. Observe that data is seen on reader1
  4. Do the same (reader2) but using Source2 and Interface2

Desired Outcome:
Reader1 can see the data from the multicast.
Reader2 can't see the data from the multicast.

I am concerned that the above will not be the case as in my testing using non source specific multicast an IP_ADD_MEMBERSHIP has global effect. So reader2's socket sees data because it is bound to the unique multicast address which has been joined to an interface seeing data. The info at this link under "Joining a Multicast" matches up with my observations.

It may well be that IP_ADD_SOURCE_MEMBERSHIP behaves differently to IP_ADD_MEMBERSHIP but the documentation is sparse and not specific in this regard.

Specific questions:

  1. Is a multicast join using IP_ADD_SOURCE_MEMBERSHIP global i.e. will that cause any socket bind()'d to the multicast address to receive packets from that source.
  2. How is SSM supposed to be used in general? does it make sense to have one multicast address with N sources?

I am inexperienced with network programming so please forgive any shortcomings in my understanding.

Thanks for any assistance.

I've worked through this and after obtaining a copy of Unix Network Programming the behaviour at least seems clear and understandable.

  1. The answer is yes all multicast joins are global whether they be SSM or otherwise. The reason for this is that the join actually takes effect a couple of layers down from a process issuing a join request. Basically, it tells the IP layer to accept multicast packets from the source specified and provide them to any process bound to the socket with the multicast address.

  2. SSM was actually introduced because of the limited address space of IPv4. When using multicast on the internet there are not nearly enough unique multicast addresses for each person who wants to use one to have a unique address. SSM pairs a source address with a multicast address, which as a pair form a globally unique identifier, i.e. a shared multicast address (e.g. 239.10.5.1) and source (192.168.1.5). So the reason SSM exists is purely to facilitate multicast in a limited address space. In the environment our software works in (Cisco), SSM is being used for redundancy and convenience of transmission: stacking multiple streams of data on the same IP:port combo and having downstream clients select the stream they want. This all works just fine until a given host wants access to more than one stream in the multicast: because they're all on the same multicast address, all subscribed processes get all the data, and this is unavoidable due to the way the network stack works.

  3. Final solution
    Now that the behaviour has been understood the solution is straightforward, but does require additional code in each running process. Each process must filter the incoming data from the multicast address and only read data from the source(s) that they are interested in. I had hoped that there was some "magic" in built into SSM to do this automatically, but there is not. recvfrom() already provides the senders address so doing this is relatively low cost.

I am wondering if there is a simpler way to broadcast my own IP to all devices on my network on a specific port using C or C++

I don't know too much about socket programming just cause most of my applications don't need a network. I looked into it and I found one piece of code that looks promising although it doesn't exactly broadcast since it sends to specific IP addresses.

Is there a way to broadcast to addresses, say, between 192.168.1.0/255 in one fell swoop, or do I need to loop through the addresses and send a packet to each one myself?

EDIT: I'm asking about c++ implementation, not network infrastructure. That is why I linked the above link.

This is pretty much what IP multicast was invented for.

Edit 0:

Tons of examples on the internet. Just a couple for you:

Many more out there, but if you are at all serious about doing network programming, get this book and you'll never regret it.

I can send data through the UDP protocol with either the UdpClient.Send(byte array) or the UdpClient.Client.Send(stream) method. Both methods work; the only difference is that in one method I pass a byte array and in the other I pass a stream.

quick example:

UdpClient udpClient = new UdpClient(localEndPoint);
// I can either send data as:
udpClient.Send(new byte[] { 0, 1, 2 }, 3);
udpClient.Client.Send(new byte[5]);

Also, which method will ensure that my data reaches its destination without losing information? I have read that the UDP protocol does not ensure that all bytes reach their destination, and thus is better for streaming video and audio but not for transferring files like I am doing. The reason why I am using UDP instead of TCP is that it is very complicated to establish a TCP connection between two users that happen to be behind routers. I know it would be possible if one of the users enabled port forwarding on his router. I managed to send data by doing what is called UDP hole punching, which enables you to establish a connection between two users behind routers with the help of a server. It would take long to explain how that works here; you can find lots of information if you google it. Anyway, I just wanted to let you know why I was using UDP instead of TCP. I don't know if it will be possible to send a file with this protocol while making sure that no data is lost. Maybe I have to create an algorithm. Or maybe the UdpClient.Client.Send method ensures that data will be received and the UdpClient.Send method does not.

UDP guarantees neither delivery nor ordering of datagrams. It only guarantees that if you receive a packet successfully, the packet is complete. You need to make your network communication reliable with your own implementation. The two methods should not make any difference.

UNIX Network Programming has a chapter on this topic (22.5, Adding Reliability to a UDP Application). You can also take a look at libjingle, which supports NAT traversal (with STUN or a relay) and reliable communication.

This article, Reliability and Flow Control, might also help you to understand one possible way to implement it. Good Luck!

I have a client using select() to check if there is anything to be received; otherwise it times out and the user is able to send(). Which works well enough. However, the program locks up waiting for user input, so it can't recv() again until the user has sent something. I am not having much luck with threads either, as I cannot seem to find a good resource that shows me how to use them.

I have tried a creating two threads (using CreateThread) for send and recv functions which are basically two functions using while loops to keep sending and receiving. And then the two CreateThreads() are wrapped up in a while loop, because otherwise it just seemed to drop out.

I have basically no previous experience with threads, so my description of what I've been doing will probably sound ridiculous. But I would appreciate any help in using them properly for this kind of task; an alternative method would also be great.

Can't find a good resource on socket programming? Bah. Read the bible:

I have a link error when trying to use sctp_get_no_strms function. I am running Slackware 14.1 and have lksctp-tools installed.

# ls /var/log/packages | grep sctp
lksctp-tools-1.0.16-x86_64-1_SBo

However libsctp symbol list does not include this function.

# nm -D /usr/lib64/libsctp.so | grep sctp_get
0000000000001100 T sctp_getaddrlen
00000000000010e0 T sctp_getladdrs
00000000000010c0 T sctp_getpaddrs

Is sctp_get_no_strms not supported by lksctp-tools?

The compilation command is as follows:

gcc -o srv sctpserv01.o sctp_wrapper.o -L../lib -lsock -lsctp

When using functions from a library, the appropriate header file needs to be #include'd — for example #include "unp.h" since the function sctp_get_no_strms() is described in Stevens Unix Network Programming, Volume 1: The Sockets API, 3rd Edn and is specific to that book.

Note that unp.h is not a standard system header, which is why I used "unp.h" rather than <unp.h> (with angle brackets around the header file name).

I have an IP = '127.0.0.1', port number 'port', and trying

class AClass(asyncore.dispatcher):
    def __init__(self, ip, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((ip, port))

    def handle_connect(self):
        print 'connected'   

    def handle_read(self):        
        data = self.recv(8192)
        print data   

    def handle_close(self):
        self.close()

Nothing is being printed though.

The sending is being handled by another process. How can I check? Is the above correct?

Edit: For sending: a different python program, running separately:

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((addr, port))
packet="some string"
sock.sendall(packet+'\r\n') 

Nothing happens

If I put print sock.recv(4096) on the server side (!!!) I see the packet printed, but still nothing happens on the client side.

I assume that you mean for AClass to be the server-side. In that case, AClass should not be doing a connect(). There are two sides to socket communications. On the server-side, you typically create the socket, bind it to an address and port, set the backlog (via listen()), and accept() connections. Typically, when you have a new connection from a client, you spawn off some other entity to handle that client.

This code implements an echo server:

import asyncore
import socket


class EchoHandler(asyncore.dispatcher_with_send):
    def handle_read(self):
        self.out_buffer = self.recv(1024)
        if not self.out_buffer:
            self.close()
        print "server:", repr(self.out_buffer)

    def handle_close(self):
        self.close()


class EchoServer(asyncore.dispatcher):
    def __init__(self, ip, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind((ip, port))
        self.listen(1)

    def handle_accept(self):
        sock, addr = self.accept()

        print "Connection from", addr

        EchoHandler(sock)

s = EchoServer('127.0.0.1', 21345)
asyncore.loop()

Notice that in the __init__() method, I'm binding the socket and setting the backlog. handle_accept() is what handles the new incoming connection. In this case, we get the new socket object and address returned from accept(), and create a new async handler for that connection (we supply the socket to EchoHandler). EchoHandler then does the work of reading from the socket, and puts that data into out_buffer. asyncore.dispatcher_with_send will then notice that data is ready to send and write it to the socket behind the scenes for us, which sends it back to the client. So there we have both sides: we've read data from the client, and then turned around and sent the same data back to the client.

You can check this implementation in a couple of ways. Here is some python code to open a connection, send a message, read the response, and exit:

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((ip, port))
client.sendall("Hello world!\r\n")
print "client:", repr(client.recv(1024))
client.close()

You can also just use telnet as your client using the command telnet localhost 21345, for this example. Type in a string, hit enter, and the server will send that string back. Here's an example session I did:

:: telnet 127.0.0.1 21345
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hello!
Hello!
aouaoeu aoeu aoeu
aouaoeu aoeu aoeu
^]
telnet> quit
Connection closed.

In the example, the first "Hello!" is the one I typed in my client, the second one is the echo from the server. Then I tried another string, and it was echoed back as well.

If you've not done socket communications before, I really can't recommend Richard Stevens' UNIX Network Programming enough. A 3rd edition is now in print and available on Amazon. It doesn't, however, cover using Python or asyncore. And, unfortunately, asyncore is one module that's really not covered very well in the Python documentation. There are some examples out there in the wild that are pretty decent, though.

Hopefully, this gets you moving in the right direction.

EDIT: here's a asyncore based client:

class Client(asyncore.dispatcher_with_send):
    def __init__(self, ip, port, message):
        asyncore.dispatcher_with_send.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((ip, port))
        self.out_buffer = message

    def handle_connect(self):
        print 'connected'

    def handle_read(self):
        data = self.recv(8192)
        print "client:", repr(data)

    def handle_close(self):
        self.close()

I have currently been working on a multiplayer game using Node.js and Socket.io. I've done okay and I understand the basics, but the further I get into it, the more I realize I really don't understand how networking programming works. As the codebase grows and more features are added, my code is starting to become horribly inefficient and very hard to maintain.

The only resources I've really been able to find online cover very small applications and the methods used don't seem to have much scalability to them.

I'm wondering if anyone has a good book, or perhaps some online videos or articles that cover more advanced aspects or best practices of programming a large multiplayer game. I'm not new to game development, however I am new to the multiplayer and networking side of it.

Unix Network Programming, by Stevens is the best I have seen on network programming. It is incredibly complete and thorough and clear. You can go as deep as you want to with this book.

Also check out the excellent Beej's Guide to Network Programming online.

I have two software components working in tandem - a Visual Studio C# chat application I made, which connects via a TCP NetworkStream to a Node.js application which is hosting a very simple TCP server.

The C# application sends messages via the TCP NetworkStream to the server, and everything works well, until I close the application (the actual process is relatively unimportant, but I've listed it here and included a short gif for clarity):

  • First, I start up the Node.js TCP server

  • Then the C# Application starts and opens up a TCP NetworkStream to the Node.js server

  • I type the message 'Hello!' into the C# app's input and hit 'Send'

  • The message is received by the TCP server

  • However when I close my C# application, I get an ECONNRESET error


I'm closing the NetworkStream on the client side using NetworkStream.Close() which, as @RonBeyer pointed out, is ill-advised. According to MSDN it:

Closes the current stream and releases any resources (such as sockets and file handles) associated with the current stream. Instead of calling this method, ensure that the stream is properly disposed.

I assume this is done with the using keyword in C# (which I am pretty unfamiliar with, given my novice understanding of the language).

BUT I DIGRESS. What I know is that an ECONNRESET error is caused by an abrupt closure of the socket / stream. The peculiar thing is, disregarding the ECONNRESET error on the server allows the server to accept new connections seemingly unperturbed.

So my question is:

  • does ignoring the error even matter in this context? Are there any possible implications I'm not seeing here?
  • and if it does/there are, what problems is it causing under the surface (undisposed resources etc.)?

I say this because the function of the server (to accept new TCP connections) remains uninterrupted, so it seems to make no difference superficially. Any expertise on the subject would be really helpful.

Pastebin to my C# code

Thanks in advance!

I suggest reading this post and then deciding for yourself based on your own knowledge of your code whether or not it's OK to ignore the ECONNRESET. It sounds like your Node app may be trying to write to the closed connection (heartbeats being sent?). Proper closing of the connection from your C# app would probably take care of this, but I have no knowledge of C#.

You may have a problem if you get lots of users and if ECONNRESET causes the connection to go into TIME_WAIT. That will tie up the port for 1-2 minutes. You would use netstat on Linux to look for that but I'm sure there is an equivalent Windows app.

If you really want to get into the nitty gritty of socket communications I suggest the excellent Unix Network Programming, by Stevens.

I'm coding a TCP Server class based on the I/O multiplexing (select) way. The basic idea is explained in this chunk of code:

GenericApp.cpp

TServer *server = new Tserver(/*parameters*/);
server->mainLoop();

For now the behavior of the server is independent of the context, but in a way that I need to improve.

Actual Status

 receive(sockFd , buffer);
 MSGData * msg=     MSGFactory::getInstance()->createMessage(Utils::getHeader(buffer,1024));
 EventHandler * rightHandler =eventBinder->getHandler(msg->type());
 rightHandler->callback(msg);

In this version, the main loop reads from the socket, instantiates the right type of message object, and calls the appropriate handler (something may not work properly; it compiles, but I have not tested it). As you can notice, this allows a programmer to define his own message types and appropriate handlers, but once the main loop is started nothing can be changed. I need to make this part of the server more customizable, to adapt this class to a bigger set of problems.

MainLoop Code

void TServer::mainLoop()
{

    int sockFd;
    int connFd;
    int maxFd;
    int maxi;
    int i;
    int nready; 

    maxFd = listenFd;
    maxi = -1;

     for(i = 0 ; i< FD_SETSIZE ; i++) clients[i] = -1; //Should be in the constructor?

     FD_ZERO(&allset); //Should be in the constructor?
     FD_SET(listenFd,&allset); //Should be in the constructor?


         for(;;)
          {
             rset = allset;
             nready = select (maxFd + 1 , &rset , NULL,NULL,NULL);

             if(FD_ISSET( listenFd , &rset ))
             {
                cliLen = sizeof(cliAddr);
                connFd = accept(listenFd , (struct sockaddr *) &cliAddr, &cliLen);

                 for (i = 0; i < FD_SETSIZE; i++)
                 {  
                     if (clients[i] < 0) 
                    {
                         clients[i] = connFd;   /* save descriptor */
                         break;
                    }
                 }

                 if (i == FD_SETSIZE)
                 {
                     //!!HANDLE ERROR: too many clients
                 }

                 FD_SET(connFd, &allset);   /* add new descriptor to set */

                 if (connFd > maxFd) maxFd = connFd;            /* for select */

                 if (i > maxi) maxi = i;                /* max index in client[] array  */

                 if (--nready <= 0)  continue;  
             }

            for (i = 0; i <= maxi; i++)     
             {  
                /* check all clients for data */
                if ( (sockFd = clients[i]) < 0) continue;

                if (FD_ISSET(sockFd, &rset)) 
                {
                    //!!SHOULD CLEAN BUFFER BEFORE READ
                    receive(sockFd , buffer);
                    MSGData * msg =  MSGFactory::getInstance()->createMessage(Utils::getHeader(buffer,1024));
                    EventHandler * rightHandler =eventBinder->getHandler(msg->type());
                    rightHandler->callback(msg);
                }
                   if (--nready <= 0)   break;              /* no more readable descriptors */
                }
           }
 }

Do you have any suggestions on a good way to do this? Thanks.

Your question requires more than just a Stack Overflow answer. You can find good ideas in these books:

Basically what you're trying to do is a reactor. You can find open-source libraries implementing this pattern. For instance:

If you want your handlers to have the possibility to do more processing, you could give them a reference to your TCP server and a way to register a socket for the following events:

  • read, the socket is ready for read
  • write, the socket is ready for write
  • accept, the listening socket is ready to accept (read with select)
  • close, the socket is closed
  • timeout, the time given to wait for the next event expired (select allows you to specify a timeout)

So that the handler can implement all kinds of protocols half-duplex or full-duplex:

  • In your example there is no way for a handler to answer the received message. This is the role of the write event: to let a handler know when it can send on the socket.
  • The same is true for the read event. It should not be in your main loop but in the socket's read handler.
  • You may also want to add the possibility to register a handler for an event with a timeout so that you can implement timers and drop idle connections.

This leads to some problems:

  • Your handler will have to implement a state-machine to react to the network events and update the events it wants to receive.
  • Your handler may want to create and connect new sockets (think about a web proxy server, an IRC client with DCC, an FTP server, and so on...). For this to work it must have the possibility to create a socket and to register it in your main loop. This means the handler may now receive callbacks for either of its two sockets, and there should be a parameter telling the callback which socket it is. Or you will have to implement a handler for each socket, and they will communicate through a queue of messages. The queue is needed because the readiness of one socket is independent of the readiness of the other: you may read something on one and not be ready to send it on the other.
  • You will have to manage the timeouts specified by each handler, which may differ. You may end up with a priority queue for timeouts.

As you can see, this is no simple problem. You may want to reduce the genericity of your framework to simplify its design (for instance, by handling only half-duplex protocols like simple HTTP).

I'm looking for resources and books one can use to get started with IPv4 and IPv6 network development. The most relevant book I've come up with so far is "Unix Network Programming, Volume 1: The Sockets Networking API (3rd Edition)", which covers both protocols, but apart from that I did not find very much.

The information I'm looking for is how both protocols work in detail, how IPv6 and its handling differs to IPv4 and how to use the APIs (Windows or *nix) to set up basic communication between applications across both protocols.

Is above mentioned book already the right starting point or are there other resources and books one can use to get started with this topic?

Douglas Comer
Apart from programming, if you're looking for TCP/IP (v4-v6) and other stack-related queries and design rationale, his books are the ultimate references. Of course you can dig as much as you want, reading papers online, but from basic to intermediate level his books serve best.
To start with, read

Internetworking with TCP/IP Vol-1, 4e.

This is a must, if I may say. After that, if you would like to look at the stack details, follow

Internetworking with TCP/IP Vol-2 (ANSCI and BSD)

For programming on a *nix machine, UNP by Stevens is unbeatable. The underlying concepts are almost the same for Unix/Linux/Windows/Mac -- mostly everything is based on the BSD sockets design. So I think UNP is best for programming, and these three books should solve your purpose. If you like collecting books, then you can add another one to your library, by Stevens again:

http://www.kohala.com/start/tcpipiv2.html

Some excellent video tutorials on networking:

http://www.ecse.rpi.edu/Homepages/shivkuma/teaching/video_index.html

I'm new to C programming and Linux at this time, still struggling my way through a whole lot of things. I'm trying to understand whether or not there's any official documentation available for all the C-related structures and functions available on Linux. I'm aware that we have Linux man pages for almost everything; however, I'm unable to find any structure definitions in there.

For example, I wish to find out more about the sockaddr_in and sockaddr structures. When I search Google I come up with lots of results, including an MSDN documentation page describing this structure. However, I'm looking for something more official (something similar to MSDN) for Linux-specific C programming (structures, functions, etc.). I'm unsure if any such documentation exists, or if I'm misunderstanding something. Please help me find it in case it exists.

Many Thanks in advance !

If you can buy books, here are some that explain the most commonly used structures in C well. I would recommend you start with The Linux Documentation Project so you can begin with the basic structures:

The Linux Documentation Project's Programming using C-API

Richard Stevens's book Unix Network Programming, Volume 1: The Sockets Networking API

Robert Love's Linux System Programming

Robert Love's Linux Kernel Development, 3rd Edition

Richard Stone's Beginning Linux Programming, 4th Edition

Michael Kerrisk The Linux Programming Interface

Besides these books, the man pages and the GNU libc manual can be read for more information.

I hope that helps. Cheers

I'm new to socket programming, so forgive me if this question is basic; I couldn't find an answer anywhere.

What constitutes requiring a new socket?

For example, it appears possible to send and receive with the same socket fd on the same port. Could you send on port XXXX and receive on port YYYY with one socket? If not, then are sockets specific to host/port combinations?

Thanks for the insight!

A socket establishes an "endpoint", which consists of an IP address and a port:

Yes, a single socket is specific to a single host/port combination.

READING RECOMMENDATION:

Beej's Guide to Network Programming:

Unix Network Programming: Stevens et al:

I'm currently learning how to program a basic webserver in C. My server, depending on certain inputs, will send various lines of text, all of which end in a blank line, and I need to receive them on the client side somehow to display one line at a time.

Right now the server does something like

send(socket, sometext, strlen(sometext), 0)
send ...
send(socket, "\n", 2, 0);

....

Client:

Currently what I am doing is using fdopen to wrap the socket file descriptor and read with fgets:

FILE *response = fdopen(someSocket, "r");
while(1){
            fgets(input, 500, response);

        if(strcmp(input, "\n") != 0){
            //do some stuff
        }
        else {
            // break out of the loop
            break;
        }
}

This all works well except my program needs to fulfil one condition: it must never break the socket connection. Because I opened a file, in order to avoid a resource leak the next line after is:

fclose(response);

This however is the root of my problems, because it not only closes the file but also closes the socket, which I can't have. Can anyone give me any guidance on simple methods to work around this? I was thinking about maybe using recv(), but doesn't that have to take in the whole message all at once? Is there any way to recreate the behavior of fgets using recv?

I'm really a newbie at all this, thanks for all the help!

Your only real alternative is to write a function that:

1) calls recv() with a fixed buffer and a loop,

2) only returns when you get a complete line (or end-of-connection), and

3) saves any "leftover" characters for the next read of the next line of text.

IIRC, Stevens has such a function:

PS: Any self-respecting higher-level web programming library will give you a "getNextLine()" function basically for free.

Just finished self-studying C with "The C Programming Language, 2nd Ed." by Brian Kernighan and Dennis Ritchie, and am looking for a good follow-up book that can kind of pick up where that book left of - specifically pertaining to programming C under Linux (sockets, threading, IPC, etc.). I did the search thing, but there are so many books (some of them quite expensive) that it is hard to pick one.

I know the Kernighan/Ritchie book is frequently used in C programming courses, so I was kind of curious what schools (and other self-learners) have been using after it?

You can read Deep C Secrets as a follow-up. But I also strongly recommend reading the comp.lang.c Frequently Asked Questions. As K&R is very old, you also need to brush up on C's newer standards, C99/C11.

For network programming you can follow Unix Network Programming.

Ok so I have been at this bug all day, and I think I've got it narrowed down to the fundamental problem.

Background:

I am working on an app that has required me to write my own versions of NSNetService and NSNetServiceBrowser to allow for Bonjour over Bluetooth in iOS 5. It has been a great adventure, as I knew nothing of network programming before I started this project. I have learned a lot from various example projects and from the classic Unix Network Programming textbook. My implementation is based largely on Apple's DNSSDObjects sample project. I have added code to actually make the connection between devices once a service has been resolved. An NSInputStream and an NSOutputStream are attained with CFStreamCreatePairWithSocketToHost( ... ).

Problem:

I am trying to send some data over this connection. The data consists of an integer, a few NSStrings and an NSData object archived with NSKeyedArchiver. The size of the NSData is around 150kb so the size of the whole message is around 160kb. After sending the data over the connection I am getting the following exception when I try to unarchive...

Terminating app due to uncaught exception 'NSInvalidArgumentException', 
reason: '*** -[NSKeyedUnarchiver initForReadingWithData:]: incomprehensible archive 

After further exploration I have noticed that the received data is only about 2kb.. The message is being truncated, thus rendering the archive "incomprehensible."

Potentially relevant code:

The method that sends the data to all connected devices

- (void) sendMessageToPeers:(MyMessage *)msg
{
  NSEnumerator *e = [self.peers objectEnumerator];

  //MyMessage conforms to NSCoding, messageAsData getter calls encodeWithCoder:
  NSData *data = msg.messageAsData; 

  Peer *peer;
  while (peer = [e nextObject]) {
    if (![peer sendData:data]) {
      NSLog(@"Could not send data to peer..");
    }
  }
}

The method in the Peer class that actually writes data to the NSOutputStream

- (BOOL) sendData:(NSData *)data
{
  if (self.outputStream.hasSpaceAvailable) {
    [self.outputStream write:data.bytes maxLength:data.length];
    return YES;
  }
  else {
    NSLog(@"PEER DIDN'T HAVE SPACE!!!");
    return NO;
  }
}

NSStreamDelegate method for handling stream events ("receiving" the data)

The buffer size in this code is 32768 because that's what was in whatever example code I learned from. Is it arbitrary? I tried changing it to 200000, thinking that the problem was just that the buffer was too small, but it didn't change anything. I don't think I fully understand what's happening.

- (void)stream:(NSStream *)aStream handleEvent:(NSStreamEvent)eventCode
{
  switch (eventCode) {

    case NSStreamEventHasBytesAvailable: {
      NSInteger       bytesRead;
      uint8_t         buffer[32768];      // is this the issue?
//    uint8_t         buffer[200000];   //this didn't change anything

      bytesRead = [self.inputStream read:buffer maxLength:sizeof(buffer)];
      if (bytesRead == -1) ...;
      else if (bytesRead == 0) ...;
      else {
        NSData  *data = [NSData dataWithBytes:buffer length:bytesRead];
        [self didReceiveData:data];
      }
    } break;

    /*omitted code for other events*/
  }
}

NSStream over a network like that will be using a TCP connection. It can vary, but the maximum packet size is often around 2k. As the message you’re sending is actually 160k, it will be split up into multiple packets.

TCP abstracts this away to just be a stream of data, so you can be sure all these packets will arrive in the correct order.

However, the stream:handleEvent: delegate method is probably being called when only the first 2k of data has arrived – there’s no way for it to know that there’s more coming until it does.

Note the method read:maxLength: doesn't guarantee you'll always get that max length – in this case it seems to be only giving you up to 2k.

You should count up the actual bytes received, and concatenate all the data together until you receive the total amount you're waiting for.

How does the receiver know how much data to expect? You might want to design your protocol so that before sending data, you send an integer of defined size indicating the length of the coming data. Alternatively, if you're only ever sending one message over the socket, you could simply close it when finished, and have the receiver only unarchive after the socket is closed.