Timothy G. Mattson, Beverly A. Sanders, Berna L. Massingill
The Parallel Programming Guide for Every Software Developer
From grids and clusters to next-generation game consoles, parallel computing is going mainstream. Innovations such as Hyper-Threading Technology, HyperTransport Technology, and multicore microprocessors from IBM, Intel, and Sun are accelerating the movement's growth. Only one thing is missing: programmers with the skills to meet the soaring demand for parallel software. That's where Patterns for Parallel Programming comes in. It's the first parallel programming guide written specifically to serve working software developers, not just computer scientists. The authors introduce a complete, highly accessible pattern language that will help any experienced developer "think parallel" and start writing effective parallel code almost immediately. Instead of formal theory, they deliver proven solutions to the challenges faced by parallel programmers, and pragmatic guidance for using today's parallel APIs in the real world.
Coverage includes:
Understanding the parallel computing landscape and the challenges faced by parallel developers
Finding the concurrency in a software design problem and decomposing it into concurrent tasks
Managing the use of data across tasks
Creating an algorithm structure that effectively exploits the concurrency you've identified
Connecting your algorithmic structures to the APIs needed to implement them
Specific software constructs for implementing parallel programs
Working with today's leading parallel programming environments: OpenMP, MPI, and Java
Patterns have helped thousands of programmers master object-oriented development and other complex programming technologies. With this book, you will learn that they're the best way to master parallel programming too.
I'm currently working on a wireless networking application in C++ and it's coming to a point where I'm going to want to multi-thread pieces of software under one process, rather than have them all in separate processes. Theoretically, I understand multi-threading, but I've yet to dive in practically.
What should every programmer know when writing multi-threaded code in C++?
Part of my graduate study area relates to parallelism.
I read this book and found it a good summary of approaches at the design level.
At the basic technical level, you have two basic options: threads or message passing. Threaded applications are the easiest to get off the ground, since pthreads, Windows threads, or Boost threads are ready to go. However, threading brings with it the complexity of shared memory.
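To make the shared-memory complexity concrete, here is a minimal sketch using the standard C++11 `std::thread` and `std::mutex` rather than any of the libraries named above. The function name `count_with_threads` and its parameters are made up for illustration. Several threads increment one counter, and the mutex is what keeps the read-modify-write from racing.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: num_threads threads each add 1 to a shared counter
// increments_per_thread times and the final value is returned.
long count_with_threads(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    long counter = 0;          // shared state, visible to every thread
    std::mutex counter_mutex;  // guards counter

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, &counter_mutex, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Without this lock, two threads could read the same old
                // value and one of the increments would be lost.
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;
            }
        });
    }
    for (std::thread& w : workers) w.join();  // wait for all threads to finish
    return counter;
}
```

Calling `count_with_threads(4, 10000)` should return 40000. Taking the lock on every increment is deliberately naive; it shows the correctness problem, not a fast design.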
Message-passing usability seems mostly limited at this point to the MPI API. It sets up an environment where you can run jobs and partition your program between processors. It's more for supercomputer/cluster environments where there's no intrinsic shared memory. You can achieve similar results with sockets and so forth.
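The message-passing style can also be imitated inside a single process. The sketch below is not MPI. It is a hypothetical, minimal `Channel` class built from standard C++ primitives, where a sender and a receiver share no data directly and communicate only through messages. All names here are assumptions for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical one-way channel carrying int messages between threads.
class Channel {
public:
    void send(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(value);
        }
        ready_.notify_one();
    }
    // Signals that no further messages will be sent.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until a message arrives. Returns false once the channel is
    // closed and every pending message has been delivered.
    bool receive(int& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> queue_;
    bool closed_ = false;
};

// Sends 1..count to a receiver thread, which adds up what it receives.
long sum_via_messages(int count) {
    assert(count >= 0);
    Channel channel;
    long total = 0;  // touched only by the receiver until join() returns
    std::thread receiver([&channel, &total] {
        int message = 0;
        while (channel.receive(message)) total += message;
    });
    for (int i = 1; i <= count; ++i) channel.send(i);
    channel.close();
    receiver.join();
    return total;
}
```

The locking still exists, but it is confined to the channel, so the code on either side never has to reason about shared variables.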
At another level, you can use language-level pragmas; the popular one today is OpenMP. I've not used it, but it appears to build threads in via compiler directives backed by a runtime library.
The classic problem here is synchronization; all the problems in multiprogramming come from the non-deterministic nature of multiprograms, which cannot be avoided.
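For a flavor of the pragma approach, here is a small sketch of an OpenMP work-sharing loop with a `reduction` clause. The function name `sum_of_squares` is made up for the example. With GCC or Clang it has to be compiled with `-fopenmp` to actually run in parallel. Without that flag the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>

// Hypothetical example: sums i*i for i in 1..n.
// The reduction clause gives each thread a private copy of sum and
// combines the copies when the loop ends, so no explicit lock is needed.
long long sum_of_squares(int n) {
    assert(n >= 0);
    long long sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 1; i <= n; ++i) {
        sum += static_cast<long long>(i) * i;
    }
    return sum;
}
```

For example, `sum_of_squares(100)` returns 338350 whether or not OpenMP is enabled.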
See the Lamport timing methods for a further discussion of synchronizations and timing.
Multithreading is not something that only Ph.D.s and gurus can do, but you will have to be pretty decent to do it without making insane bugs.
This is an open-ended question. What approaches should I consider?
Your first step is to find and understand the parallelism in your problem. It is really easy to write multi-threaded code that performs no better than the single-threaded code it replaces. "Patterns for Parallel Programming" (Amazon) is a great introduction to the key concepts.
Once you have a workable design, start reading the articles in the "Concurrency" topic in the MSDN Magazine archives (link), particularly anything written by Jeff Richter. Those will give you the nuts and bolts stuff on the threading constructs specific to Windows and .NET. (The multi-threading section in Richter's "CLR via C#" (Amazon) is short, but very insightful - highly recommended.)
I have built an application in C# that I would like to optimize for multiple cores. I have some threads; should I do more?
Updated for more detail
Understanding the parallelism (or potential for parallelism) in the problem(s) you are trying to solve, your application and its algorithms is much more important than any details of thread synchronization, libraries, etc.
Start by reading Patterns for Parallel Programming (which focuses on 'finding concurrency' and higher-level design issues), and then move on to The Art of Multiprocessor Programming (practical details starting from a theoretical basis).
There are certain common library functions in Erlang that are much slower than their C equivalents.
Is it possible to have C code do the binary parsing and number crunching, and have Erlang spawn processes to run the C code?
If you are really looking for speed, you should try the OpenMP or MPI parallel programming frameworks for C and C++. I recommend taking a look at Patterns for Parallel Programming (link to amazon.com) for the details of OpenMP and MPI programming patterns.
The erl_nif section of the Erlang ERTS reference manual will be helpful.
I'm working on a project where we need more performance. Over time we've continued to evolve the design to work more in parallel (both threaded and distributed). The latest step has been to move part of it onto a new machine with 16 cores. I'm finding that we need to rethink how we do things to scale to that many cores in a shared memory model. For example, the standard memory allocator isn't good enough.
What resources would people recommend?
So far I've found Sutter's column in Dr. Dobb's to be a good start. I just got The Art of Multiprocessor Programming and the O'Reilly book on Intel Threading Building Blocks.
A couple of other books that are going to be helpful are:
Also, consider relying less on sharing state between concurrent processes. You'll scale much, much better if you can avoid it because you'll be able to parcel out independent units of work without having to do as much synchronization between them.
Even if you need to share some state, see if you can partition the shared state from the actual processing. That will let you do as much of the processing as possible in parallel, independently from the integration of the completed units of work back into the shared state. Obviously this doesn't work if you have dependencies among units of work, but it's worth investigating instead of just assuming that the state is always going to be shared.
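As a rough illustration of separating the processing from the shared state, here is a minimal C++ sketch. The function `bucket_counts` and its parameters are hypothetical, and it assumes the input values are non-negative. Each thread counts its own slice into a private vector and only takes the lock once, when it folds its finished result back into the shared totals.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example: counts how many values fall into each bucket
// (value % num_buckets). Assumes every value in data is non-negative.
std::vector<long> bucket_counts(const std::vector<int>& data,
                                int num_buckets, int num_threads) {
    assert(num_buckets > 0 && num_threads > 0);
    std::vector<long> shared(num_buckets, 0);  // the only shared state
    std::mutex shared_mutex;                   // guards shared
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&, begin, end] {
            // Processing phase: private data only, no synchronization.
            std::vector<long> local(num_buckets, 0);
            for (std::size_t i = begin; i < end; ++i) {
                ++local[data[i] % num_buckets];
            }
            // Integration phase: one short critical section per thread.
            std::lock_guard<std::mutex> lock(shared_mutex);
            for (int b = 0; b < num_buckets; ++b) shared[b] += local[b];
        });
    }
    for (std::thread& w : workers) w.join();
    return shared;
}
```

The lock is acquired once per thread instead of once per element, so contention stays low no matter how large the input is.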
If you plan to write a very computationally intensive parallel application, what guidelines would you use to design your objects (whether classes or structs, or anything else) to maximize your potential of getting the most out of the parallelism?
I am thinking of an application that, say, interprets or compiles a tree-like graph of objects, which requires creating things, passing them to another object to be processed, and so on, in a tree-like structure.
What should one consider from the early design process?
The pattern Jorge Córdoba describes above is just one approach. The following is definitely worth a read:
The best way to decompose your problem depends very much on the dependencies between your data. For example, patterns like Master-Worker and single program multiple data (SPMD) tend to be very simple approaches if your problem lends itself to such an approach.
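To give a feel for the Master-Worker pattern, here is a minimal sketch under the assumption that the tasks are fully independent. The names `process_task` and `run_master_worker` are invented for the example, and squaring a number stands in for real work. The master sets up the task list, and each worker repeatedly claims the next unclaimed task through an atomic counter until none remain.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical unit of work: any computation that depends only on its input.
long long process_task(int input) {
    return static_cast<long long>(input) * input;
}

// The master hands out tasks dynamically; results keep the task order.
std::vector<long long> run_master_worker(const std::vector<int>& tasks,
                                         int num_workers) {
    assert(num_workers > 0);
    std::vector<long long> results(tasks.size());
    std::atomic<std::size_t> next_task{0};  // index of the next unclaimed task

    auto worker = [&] {
        for (;;) {
            const std::size_t i = next_task.fetch_add(1);  // claim one task
            if (i >= tasks.size()) break;                  // nothing left
            // Each result slot is written by exactly one worker,
            // so no lock is needed here.
            results[i] = process_task(tasks[i]);
        }
    };

    std::vector<std::thread> workers;
    for (int w = 0; w < num_workers; ++w) workers.emplace_back(worker);
    for (std::thread& t : workers) t.join();
    return results;
}
```

Because workers pull tasks as they finish, uneven task costs balance out on their own. An SPMD version would instead give each thread a fixed slice of the input based on its rank.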
I want to start learning about parallelism. Appropriately,
I am collecting resources to do the same. Here is some stuff that I have found:
Just requesting to share your resources.
You should start by first learning about how different parallel architectures work. There are plenty of freely available and quality materials on the Internet.
Lawrence Livermore National Laboratory (LLNL) provides a nice set of HPC tutorials.
Edinburgh Parallel Computing Centre (EPCC) also provides a large amount of free course training materials.
I leave this answer as a community wiki so anyone can modify it and add other sources.