Mentioned in questions and answers.

I want to read a large file (anywhere from 30 MB to 1 GB) with 10 threads, process each line, and write the results to another file, also with 10 threads. If I use only one thread for the I/O, the other threads are blocked. Processing a line takes roughly as long as reading it from the file system. There is one more constraint: the data in the output file must be in the same order as in the input file.

I want your thoughts on the design of this system. Is there any existing API to support concurrent access to files?

Also, writing to the same file from several threads may lead to deadlock.

Please suggest how to achieve this, given that I am concerned about processing time.

One option is to create a single thread that reads the input file and puts the lines into a blocking queue. Several worker threads then take lines from this queue and process them.
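The single-reader arrangement can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `BlockingQueue` and `process_in_order` are hypothetical names, the queue is unbounded, and all results are collected in memory and indexed by line number so that the output order matches the input order.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <istream>
#include <mutex>
#include <queue>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal unbounded blocking queue (hypothetical helper, not a library type).
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }
    // Signals that no more items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until an item is available. Returns false once the queue is
    // closed and empty.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Reads lines on the calling thread, processes them on `workers` threads,
// and returns the processed lines in input order.
std::vector<std::string> process_in_order(
    std::istream& input, std::size_t workers,
    const std::function<std::string(const std::string&)>& process) {
    assert(workers > 0);
    BlockingQueue<std::pair<std::size_t, std::string>> queue;
    std::vector<std::string> results;
    std::mutex results_mutex;

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            std::pair<std::size_t, std::string> item;
            while (queue.pop(item)) {
                std::string out = process(item.second);  // slow part, runs unlocked
                std::lock_guard<std::mutex> lock(results_mutex);
                if (results.size() <= item.first) results.resize(item.first + 1);
                results[item.first] = std::move(out);  // slot = line number
            }
        });
    }

    std::string line;
    for (std::size_t index = 0; std::getline(input, line); ++index) {
        queue.push(std::make_pair(index, line));
    }
    queue.close();
    for (auto& worker : pool) worker.join();
    return results;
}
```

For a 1 GB input, a real version would probably bound the queue and write each result out as soon as all earlier lines have been written, instead of keeping everything in memory.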

Another option is to split the file into chunks and assign each chunk to a separate thread.
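A sketch of the chunking idea, assuming for simplicity that the data is already in memory as a string (with a real file one would seek to the approximate offset and scan forward to the next newline). The function name `split_on_lines` is hypothetical. The point it illustrates is that chunk boundaries have to be moved to line ends so that no line is cut in half.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Splits `data` into at most `chunks` ranges [start, end). Every range ends
// just after a '\n' or at the end of the data, so no line is split.
std::vector<std::pair<std::size_t, std::size_t>> split_on_lines(
    const std::string& data, std::size_t chunks) {
    assert(chunks > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    const std::size_t target = data.size() / chunks + 1;  // approximate chunk size
    std::size_t start = 0;
    while (start < data.size()) {
        std::size_t end = start + target;
        if (end >= data.size()) {
            end = data.size();
        } else {
            // Extend the chunk to the end of the line that contains its last byte.
            const std::size_t newline = data.find('\n', end - 1);
            end = (newline == std::string::npos) ? data.size() : newline + 1;
        }
        ranges.emplace_back(start, end);
        start = end;
    }
    return ranges;
}
```

Each thread would then process one range, and the per-range outputs would be concatenated in range order to keep the output in input order.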

To avoid blocking, you can use asynchronous I/O. You may also want to look at the Proactor pattern from Pattern-Oriented Software Architecture Volume 2.

I have multiple client handler threads. These threads need to pass a received object to a server queue, and the server queue will pass another type of object back to the sending thread. The server queue is started when the server starts and keeps running. I am not sure which threading mechanism to use so that a client handler thread is notified when an object is sent back. I don't intend to use sockets or write to a file.

A book tip if you want to dive deeper into communication between threads (or processes, or systems): Pattern-Oriented Software Architecture Volume 2: Patterns for Concurrent and Networked Objects.
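One possible mechanism for the request/reply exchange described in the question, sketched in C++ with `std::promise` and `std::future`. This is only an illustration under assumptions: `Request`, `Reply`, and `ServerQueue` are hypothetical names, and the reply text is made up. Each handler thread submits a request together with a promise, then waits on the matching future until the server thread fulfils it.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical message types: handlers send a Request and get back a Reply.
struct Request { int value; };
struct Reply { std::string text; };

class ServerQueue {
public:
    ServerQueue() : worker_([this] { run(); }) {}

    // Finishes the requests already queued, then stops the server thread.
    ~ServerQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        worker_.join();
    }

    // Called from any client handler thread. The returned future becomes
    // ready once the server thread has produced the reply.
    std::future<Reply> submit(Request request) {
        std::promise<Reply> promise;
        std::future<Reply> future = promise.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!stopping_);  // submitting during shutdown is a caller bug
            pending_.emplace(request, std::move(promise));
        }
        wake_.notify_one();
        return future;
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
            if (pending_.empty()) return;  // stopping and fully drained
            auto job = std::move(pending_.front());
            pending_.pop();
            lock.unlock();
            // Placeholder for the real work; wakes the waiting handler thread.
            job.second.set_value(Reply{"handled " + std::to_string(job.first.value)});
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::pair<Request, std::promise<Reply>>> pending_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

A handler thread would call `server.submit(request).get()`, which blocks only that handler until its own reply arrives.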

What is the correct way to accept sockets in a multi-connection environment in .NET? Will the following be enough even if the load is high?

while (true)
{
   // block until a socket is accepted
   var socket = tcpListener.AcceptSocket();
   DoStuff(socket); // e.g. spawn a thread and read data
}

That is, can I accept sockets in a single thread and then handle them in a thread, a dataflow, or whatever? The question is only about the accept part.

Take a look at either the Reactor or the Proactor pattern, depending on whether you want to block or not. I recommend the Patterns for Concurrent and Networked Objects book.

I am learning C++. I don't know much about this topic, except that design patterns are necessary when working on large projects. I hope that is correct to some extent.

Are design patterns common to all object-oriented languages, or do I need to look specifically into C++ design patterns?

Also, how do they help you? Is it really important to learn them as a C++ programmer?

Please advise.

You will hear conflicting opinions about design patterns in the programming community at large.

In my opinion, patterns certainly encapsulate abstractions that are really useful (factory, singleton, delegate, etc.). I use patterns a lot, but I am sometimes puzzled by the apparent lack of depth or insight in a pattern description. This goes along with the proliferation of design patterns specialized for every kind of thing.

When they are useful, they are a very good means of communication, and they certainly guide you through the process of designing or defining the architecture of your app. They are useful for both small and large projects, and they can be applied at different levels of granularity.

Patterns are a generic concept, and all programming languages support them. Still, if you work in C++, a book focusing on it is best, because you will get the patterns adapted to the characteristics of the language.

In my opinion, the really fundamental books about design patterns are:

GoF, Design Patterns: Elements of Reusable Object-Oriented Software

VV.AA., Pattern-Oriented Software Architecture Volume 1: A System of Patterns

VV.AA., Pattern-Oriented Software Architecture Volume 2: Patterns for Concurrent and Networked Objects

Design patterns are solutions to commonly occurring problems in the design phase of a project. They provide solutions that can be used independently of the programming language. For example, the Singleton design pattern ensures that there is only one instance of a class. There are numerous occasions on which this may be required, and you can apply the pattern's solution in your own code.
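As an illustration of the Singleton idea in C++, here is a minimal sketch using a function-local static. The class name `Settings` and its members are hypothetical and chosen only for the example; other ways of implementing the pattern exist.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical example class; the name and members are illustrative only.
class Settings {
public:
    // Returns the single shared instance, created on first use.
    // Since C++11, initialization of a function-local static is thread-safe.
    static Settings& instance() {
        static Settings only;
        return only;
    }

    // Copying would create a second instance, so it is disabled.
    Settings(const Settings&) = delete;
    Settings& operator=(const Settings&) = delete;

    void set_name(std::string name) {
        assert(!name.empty());
        name_ = std::move(name);
    }
    const std::string& name() const { return name_; }

private:
    Settings() = default;  // private: only instance() can construct
    std::string name_ = "default";
};
```

Every call to `Settings::instance()` returns a reference to the same object, so a change made through one reference is visible through all others.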

They provide reusability in software development. Put simply, design patterns help a designer get a design right faster.

For a better understanding, you could refer to Design Patterns: Elements of Reusable Object-Oriented Software.

I currently have an application that has a large number of threads and this makes the application very large. Each thread is long-running, basically an infinite loop of polling for new emails then handling them. Each thread holds on to one SSL connection, which is why threading works well for the application.

I want to use thread pooling. The simplest approach is to just fix the number of threads then add say 10 users per thread but even at that point it does not seem to be balancing the work as uniformly as 1 user / thread as each loop is fairly long to process. Plus this isn't actually a thread pool.

My question is - what is the proper design pattern here (as it's certainly more intelligent than what I wrote above), and is there a C++ library that handles this well? It would also be helpful to point me to a Java utility as in my experience it's pretty easy to work out a design pattern from a Java utility.

There are numerous patterns that could solve the problem, but the key point in making the decision is understanding the pattern's consequences. The ACE website has some papers for some concurrency patterns that may be suitable, such as Active Object, Proactor, and Reactor. The examples are C++, but there are diagrams and descriptions that will be helpful if you pursue your own implementation.

The proactor may provide an elegant solution or serve as the foundation for another pattern. For C++ implementations, I would recommend the boost::asio library:

  • Fairly well documented, with examples.
  • SSL support.
  • The HTTP Server 3 example shows how to use boost::asio with a thread pool.

If you do not want a dependency on boost, then the asio library is also packaged separately here. For more patterns and information, consider this book.
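To make the thread pool idea concrete without depending on any library, here is a minimal sketch that uses only the C++ standard library. It is not boost::asio and not the HTTP Server 3 example; `ThreadPool` and `post` are hypothetical names. The assumption is that one poll of one user's mailbox becomes a short task, so a fixed number of threads can share the work evenly instead of each thread owning a fixed set of users.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t threads) {
        assert(threads > 0);
        for (std::size_t i = 0; i < threads; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    // Finishes every task already queued, then joins the workers.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    // Queues a task; whichever worker is free next will run it.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so workers do not serialize
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last: workers use the members above
};
```

In the mail-polling application, a task could poll one user and then post itself again, which keeps all threads busy regardless of how long an individual user takes.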

Being interested in high frequency trading/High performance computing I came across 'ACE':

http://www.cs.wustl.edu/~schmidt/ACE-overview.html

However, I noticed that a lot of the papers on the website are from the 1995 era, and I wondered whether this framework is still used and, if not, what its replacement was.

Or has boost replaced this? Does ACE contain desirable libraries that boost doesn't?

If you have a look at their subversion repository, it does not seem that ACE is undergoing much development nowadays, possibly just bug fixing or minor extensions. On the other hand, ACE is the foundation of other frameworks by the same group that are indeed more active. Anyway, the discussion forum shows relevant activity and constant interest in ACE.

As to your question about ACE vs. boost, I don't think that the two libraries are on a par. ACE is aimed at enabling cross-platform advanced networking (even on real-time and embedded systems), offering specific patterns like reactor, service configurator, completion tokens, memory management and so on. The "portability" layer (ACEOS, if I am not wrong) is just a basic layer, but it is not, in my opinion, the real value proposition of ACE nowadays, rather it is there to enable the other subsystems.

Overall, I think that for advanced networking patterns, like those described in POSA2, ACE is a good choice. If you need just an abstraction layer over the OS, boost is the way to go (more modern and widely adopted).

I'm coding a TCP Server class based on the I/O multiplexing (select) way. The basic idea is explained in this chunk of code:

GenericApp.cpp

TServer *server = new TServer(/*parameters*/);
server->mainLoop();

For now the behavior of the server is independent of the context, but in a way that I need to improve.

Actual Status

 receive(sockFd, buffer);
 MSGData *msg = MSGFactory::getInstance()->createMessage(Utils::getHeader(buffer, 1024));
 EventHandler *rightHandler = eventBinder->getHandler(msg->type());
 rightHandler->callback(msg);

In this version the main loop reads from the socket, instantiates the right type of message object, and calls the appropriate handler (something may not work properly, because it compiles but I have not tested it). As you can see, this allows a programmer to define their own message types and handlers, but once the main loop is started nothing can be changed. I need to make this part of the server more customizable so that the class can be adapted to a wider range of problems.

MainLoop Code

void TServer::mainLoop()
{
    int sockFd;
    int connFd;
    int maxFd = listenFd;   /* highest descriptor, for select */
    int maxi = -1;          /* highest index in use in clients[] */
    int i;
    int nready;

    for (i = 0; i < FD_SETSIZE; i++) clients[i] = -1; // Should be in the constructor?

    FD_ZERO(&allset);           // Should be in the constructor?
    FD_SET(listenFd, &allset);  // Should be in the constructor?

    for (;;)
    {
        rset = allset;  /* select modifies the set it is given */
        nready = select(maxFd + 1, &rset, NULL, NULL, NULL);
        if (nready < 0) continue;   /* select failed (e.g. EINTR): retry */

        if (FD_ISSET(listenFd, &rset))
        {
            cliLen = sizeof(cliAddr);
            connFd = accept(listenFd, (struct sockaddr *) &cliAddr, &cliLen);

            if (connFd >= 0)
            {
                for (i = 0; i < FD_SETSIZE; i++)
                {
                    if (clients[i] < 0)
                    {
                        clients[i] = connFd;    /* save descriptor */
                        break;
                    }
                }

                if (i == FD_SETSIZE)
                {
                    close(connFd);  //!!HANDLE ERROR: client table full, connection refused
                }
                else
                {
                    FD_SET(connFd, &allset);            /* add new descriptor to set */
                    if (connFd > maxFd) maxFd = connFd; /* for select */
                    if (i > maxi) maxi = i;             /* max index in clients[] array */
                }
            }

            if (--nready <= 0) continue;    /* no more readable descriptors */
        }

        for (i = 0; i <= maxi; i++)
        {
            /* check all clients for data */
            if ((sockFd = clients[i]) < 0) continue;

            if (FD_ISSET(sockFd, &rset))
            {
                //!!SHOULD CLEAN BUFFER BEFORE READ
                receive(sockFd, buffer);
                MSGData *msg = MSGFactory::getInstance()->createMessage(Utils::getHeader(buffer, 1024));
                EventHandler *rightHandler = eventBinder->getHandler(msg->type());
                rightHandler->callback(msg);

                if (--nready <= 0) break;   /* no more readable descriptors */
            }
        }
    }
}

Do you have any suggestions on a good way to do this? Thanks.

Your question needs more than a single Stack Overflow answer. You can find good ideas in these books:

Basically, what you're trying to build is a reactor. You can find open source libraries implementing this pattern. For instance:

If you want your handlers to be able to do more processing, you could give them a reference to your TCPServer and a way to register a socket for the following events:

  • read, the socket is ready for read
  • write, the socket is ready for write
  • accept, the listening socket is ready to accept (read with select)
  • close, the socket is closed
  • timeout, the time given to wait for the next event expired (select allows a timeout to be specified)

So that the handler can implement all kinds of protocols half-duplex or full-duplex:

  • In your example there is no way for a handler to answer the received message. This is the role of the write event: it lets a handler know when it can send on the socket.
  • The same is true for the read event. It should not be in your main loop but in the socket read handler.
  • You may also want to add the possibility to register a handler for an event with a timeout so that you can implement timers and drop idle connections.
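The registration interface described above could look roughly like the following sketch. It is an assumption-laden illustration, not the design of any particular library: `Reactor`, `on_readable`, `on_writable`, `remove`, and `poll_once` are hypothetical names, only read and write events are covered, and the timeout is a single value passed to `select` rather than a per-handler timer.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sys/select.h>
#include <utility>

class Reactor {
public:
    using Handler = std::function<void(int)>;  // receives the ready descriptor

    void on_readable(int fd, Handler handler) { add(readers_, fd, std::move(handler)); }
    void on_writable(int fd, Handler handler) { add(writers_, fd, std::move(handler)); }
    void remove(int fd) { readers_.erase(fd); writers_.erase(fd); }

    // Waits up to timeout_ms for events and runs the matching handlers.
    // Returns the number of handlers invoked (0 on timeout or select error).
    int poll_once(int timeout_ms) {
        assert(timeout_ms >= 0);
        fd_set rset, wset;
        FD_ZERO(&rset);
        FD_ZERO(&wset);
        int max_fd = -1;
        for (const auto& entry : readers_) {
            FD_SET(entry.first, &rset);
            if (entry.first > max_fd) max_fd = entry.first;
        }
        for (const auto& entry : writers_) {
            FD_SET(entry.first, &wset);
            if (entry.first > max_fd) max_fd = entry.first;
        }

        struct timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select(max_fd + 1, &rset, &wset, nullptr, &tv) <= 0) return 0;

        int dispatched = 0;
        dispatched += dispatch(readers_, rset);
        dispatched += dispatch(writers_, wset);
        return dispatched;
    }

private:
    using Table = std::map<int, Handler>;

    static void add(Table& table, int fd, Handler handler) {
        assert(fd >= 0 && fd < FD_SETSIZE);  // select cannot watch other values
        table[fd] = std::move(handler);
    }

    // Iterates over a copy because a handler may register or remove handlers;
    // an entry removed by an earlier handler in the same round is skipped.
    static int dispatch(Table& table, fd_set& ready) {
        int count = 0;
        const Table snapshot = table;
        for (const auto& entry : snapshot) {
            if (FD_ISSET(entry.first, &ready) && table.count(entry.first)) {
                entry.second(entry.first);
                ++count;
            }
        }
        return count;
    }

    Table readers_;
    Table writers_;
};
```

With this shape, the listening socket is just another descriptor registered with `on_readable`, and its handler calls `accept` and registers the new connection, so the main loop shrinks to repeated calls to `poll_once`.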

This leads to some problems:

  • Your handler will have to implement a state-machine to react to the network events and update the events it wants to receive.
  • Your handler may want to create and connect new sockets (think about a Web proxy server, an IRC client with DCC, an FTP server, and so on). For this to work, it must be able to create a socket and register it in your main loop. This means the handler may now receive callbacks for either of the two sockets, so there should be a parameter telling the callback which socket it is. Alternatively, you will have to implement a handler for each socket and have them communicate through a queue of messages. The queue is needed because the readiness of one socket is independent of the readiness of the other: you may read something on one and not be ready to send it on the other.
  • You will have to manage the timeouts specified by each handler, which may be different. You may end up with a priority queue for timeouts.
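The priority queue for timeouts mentioned in the last point can be sketched as follows. The names `TimerQueue`, `add`, `until_next`, and `run_expired` are hypothetical, and the current time is passed in explicitly to keep the sketch easy to test. The idea is that the earliest deadline decides how long `select` may sleep, and after each wake-up the expired callbacks are run.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

class TimerQueue {
public:
    using Clock = std::chrono::steady_clock;
    using Callback = std::function<void()>;

    void add(Clock::time_point deadline, Callback callback) {
        assert(callback != nullptr);
        timers_.push(Timer{deadline, std::move(callback)});
    }

    // How long the event loop may sleep before the earliest timer is due,
    // truncated to whole milliseconds. Returns -1 ms when no timer is pending.
    std::chrono::milliseconds until_next(Clock::time_point now) const {
        if (timers_.empty()) return std::chrono::milliseconds(-1);
        if (timers_.top().deadline <= now) return std::chrono::milliseconds(0);
        return std::chrono::duration_cast<std::chrono::milliseconds>(
            timers_.top().deadline - now);
    }

    // Runs every callback whose deadline has passed; returns how many ran.
    int run_expired(Clock::time_point now) {
        int ran = 0;
        while (!timers_.empty() && timers_.top().deadline <= now) {
            Callback callback = timers_.top().callback;  // copy: top() is const
            timers_.pop();
            callback();
            ++ran;
        }
        return ran;
    }

private:
    struct Timer {
        Clock::time_point deadline;
        Callback callback;
    };
    // Orders the heap so that the earliest deadline is on top.
    struct Later {
        bool operator()(const Timer& a, const Timer& b) const {
            return a.deadline > b.deadline;
        }
    };
    std::priority_queue<Timer, std::vector<Timer>, Later> timers_;
};
```

Cancelling a timer (for example when a connection closes before its idle timeout) is not covered here and would need an extra mechanism such as a cancellation flag per entry.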

As you can see, this is no simple problem. You may want to make your framework less generic to simplify its design (for instance, by handling only half-duplex protocols like simple HTTP).

Are there any new design patterns besides those covered by the GoF book and Head First Design Patterns? Has any of you used your own design patterns in your projects? Please let me know. If possible, give some UML diagrams. Thanks in advance.

The "sequel" to the GoF book is Pattern Hatching by John Vlissides. It does not publish really new patterns, but variations of some included in the original GoF book. Its great value is rather in that it shows the thought and design process involved in applying the patterns.

Although this is not an answer to your question in the strict sense, there are also lots of other kinds of patterns relevant to our field: