The Algorithm Design Manual: Text

Steven S. Skiena


This volume helps take some of the "mystery" out of identifying and dealing with key algorithms. Drawing heavily on the author's own real-world experiences, the book stresses design and analysis. Coverage is divided into two parts, the first being a general guide to techniques for the design and analysis of computer algorithms. The second is a reference section, which includes a catalog of the 75 most important algorithmic problems. By browsing this catalog, readers can quickly identify what the problem they have encountered is called, what is known about it, and how they should proceed if they need to solve it. This book is ideal for the working professional who uses algorithms on a daily basis and has need for a handy reference. This work can also readily be used in an upper-division course or as a student reference guide.

THE ALGORITHM DESIGN MANUAL comes with a CD-ROM that contains:

  • a complete hypertext version of the full printed book;
  • the source code and URLs for all cited implementations;
  • over 30 hours of audio lectures on the design and analysis of algorithms, all keyed to on-line lecture notes.


Mentioned in questions and answers.

I can't figure out the principles of dynamic programming, and I really want to. DP is very powerful; it can solve problems like this:

Getting the lowest possible sum from numbers' difference

So, can you suggest good books or articles (preferably with real code examples) that would explain to me what dynamic programming is? I really want simple examples first of all; then I'll move on.

In short, dynamic programming is a method for solving complex problems by breaking them down into simpler steps, that is, solving the problem step by step.

  1. Dynamic programming;
  2. Introduction to Dynamic Programming;
  3. MIT's Introduction to Algorithms, Lecture 15: Dynamic Programming;
  4. Algorithm Design (book).

I hope these links will help at least a bit.

Dynamic programming is a useful type of algorithm that can be used to optimize hard problems by breaking them up into smaller subproblems. By storing and re-using partial solutions, it manages to avoid the pitfalls of using a greedy algorithm. There are two kinds of dynamic programming: bottom-up and top-down.

In order for a problem to be solvable using dynamic programming, the problem must possess the property known as optimal substructure. This means that an optimal solution to the overall problem can be built from optimal solutions to its subproblems: if you break the problem into a series of subproblems and find the optimal solution to each one, you can combine those solutions into an optimal solution for the whole problem. A problem that does not have this structure cannot be solved with dynamic programming.

Top-Down

Top-down is better known as memoization. It is the idea of storing past calculations in order to avoid re-calculating them each time.

Given a recursive function, say:

fib(n) = 0 if n = 0
         1 if n = 1
         fib(n - 1) + fib(n - 2) if n >= 2

We can easily write this recursively from its mathematical form as:

function fib(n)
  if(n == 0 || n == 1)
    return n
  else
    return fib(n-1) + fib(n-2)

Now, anyone who has been programming for a while or knows a thing or two about algorithmic efficiency will tell you that this is a terrible idea. The reason is that the same values fib(i), for i in 2..n-2, are re-calculated over and over again on every branch of the recursion, giving exponential running time.

A more efficient approach is to store these values as they are computed, creating a top-down dynamic programming algorithm.

m = map(int, int)
m[0] = 0
m[1] = 1
function fib(n)
  if(m[n] does not exist)
    m[n] = fib(n-1) + fib(n-2)
  return m[n]

By doing this, we calculate fib(i) at most once.
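
To make the idea concrete, here is a minimal runnable sketch of the memoized version in C (not from any of the references above; it uses a plain array as the cache and assumes n is at most 92 so the result fits in a 64-bit integer):

#include <stdio.h>

#define MAXN 93   /* fib(92) is the largest Fibonacci number that fits in a signed 64-bit integer */

static long long memo[MAXN];   /* memo[i] == 0 means "not computed yet"; fib(0) is handled as a base case */

long long fib(int n)
{
    if (n <= 1)
        return n;              /* base cases: fib(0) = 0, fib(1) = 1 */
    if (memo[n] != 0)
        return memo[n];        /* already computed: reuse the stored value */
    memo[n] = fib(n - 1) + fib(n - 2);
    return memo[n];
}

int main(void)
{
    printf("fib(50) = %lld\n", fib(50));   /* prints 12586269025 */
    return 0;
}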


Bottom-Up

Bottom-up relies on the same idea of storing sub-problem results that top-down does. The difference, however, is that bottom-up builds the solutions iteratively, starting from the smallest subproblems and combining them through recurrence relations to reach your final result.

In most bottom-up dynamic programming problems, you are trying to either minimize or maximize something. At any given point you are given two (or more) options, and you have to decide which one is better for the problem you're trying to solve. These decisions, however, are based on the choices you made earlier.

By making the best decision at each point (each subproblem), you ensure that your overall result is optimal.

The most difficult part of these problems is finding the recurrence relationships for solving your problem.

To pay for a bunch of algorithm textbooks, you plan to rob a store that has n items. The problem is that your tiny knapsack can only hold at most W kg. Knowing the weight (w[i]) and value (v[i]) of each item, you want to maximize the value of your stolen goods, which all together must weigh at most W. For each item, you must make a binary choice - take it or leave it.

Now, you need to find what the subproblem is. Being a very bright thief, you realize that the maximum value obtainable from the first i items with a weight limit w can be represented as m[i, w]. In addition, m[0, w] (0 items at most weight w) and m[i, 0] (i items with 0 max weight) will always be equal to 0 value.

so,

m[i, w] = 0 if i = 0 or w = 0

With your thinking full-face mask on, you notice that an item can only be considered if its weight is no more than the remaining capacity of the bag. And even when it does fit, it is only worth taking if its value plus the best you can do with the remaining capacity beats the best you can do without it.

 m[i, w] = 0 if i = 0 or w = 0
           m[i - 1, w] if w[i] > w
           max(m[i - 1, w], m[i - 1, w - w[i]] + v[i]) if w[i] <= w

These are the recurrence relations described above. Once you have these relations, writing the algorithm is very easy (and short!).

v = values from item1..itemn
w = weights from item1..itemn
n = number of items
W = maximum weight of knapsack

m = array[0..n, 0..W] of int
function knapsack
  for w=0..W
    m[0, w] = 0
  for i=1 to n
    m[i, 0] = 0
    for w=1..W
      if w[i] <= w
        if v[i] + m[i-1, w - w[i]] > m[i-1, w]
           m[i, w] = v[i] + m[i-1, w - w[i]]
        else
           m[i, w] = m[i-1, w]
      else
        m[i, w] = m[i-1, w]

  return m[n, W]
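
Here is a minimal runnable sketch of that table-filling algorithm in C. The item weights and values are made up purely for illustration:

#include <stdio.h>

#define N 4     /* number of items (hypothetical example data) */
#define W 10    /* knapsack capacity */

int main(void)
{
    /* index 0 is a dummy so that items run 1..N, matching the recurrence */
    int w[N + 1] = {0, 5, 4, 6, 3};     /* weights */
    int v[N + 1] = {0, 10, 40, 30, 50}; /* values  */
    int m[N + 1][W + 1];

    /* base cases: no items or no capacity -> value 0 */
    for (int cap = 0; cap <= W; cap++)
        m[0][cap] = 0;

    for (int i = 1; i <= N; i++) {
        m[i][0] = 0;
        for (int cap = 1; cap <= W; cap++) {
            if (w[i] <= cap) {
                int take  = v[i] + m[i - 1][cap - w[i]]; /* take item i  */
                int leave = m[i - 1][cap];               /* leave item i */
                m[i][cap] = take > leave ? take : leave;
            } else {
                m[i][cap] = m[i - 1][cap];               /* item i does not fit */
            }
        }
    }

    printf("best value = %d\n", m[N][W]);  /* 90: items 2 and 4 (weights 4 + 3, values 40 + 50) */
    return 0;
}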

Additional Resources

  1. Introduction to Algorithms
  2. Programming Challenges
  3. Algorithm Design Manual

Example Problems

Luckily, dynamic programming has become really popular in competitive programming. Check out Dynamic Programming on UVAJudge for some practice problems that will test your ability to implement and find recurrences for dynamic programming problems.

As part of a correspondence Mathematics MSc I did a course based on the book http://www.amazon.co.uk/Introduction-Programming-International-mathematics-computer/dp/0080250645/ref=sr_1_4?ie=UTF8&qid=1290713580&sr=8-4. It takes more of a mathematical angle than a programming angle, but if you can spare the time and effort, it is a very thorough introduction, and it worked for me as a course that was run pretty much out of the book.

I also have an early version of the book "Algorithms" by Sedgewick, and there is a very readable short chapter on dynamic programming in there. He now seems to sell a bewildering variety of expensive books. Looking on amazon, there seems to be a chapter of the same name at http://www.amazon.co.uk/gp/product/toc/0201361205/ref=dp_toc?ie=UTF8&n=266239

Almost all introductory algorithm books have some chapter for dynamic programming. I'd recommend:

Start with

If you want to test yourself, my choices of online judges are

and of course

You can also check good universities' algorithms courses

After all, if you can't solve a problem, ask on SO; plenty of algorithm addicts are here

Lately I've been interested in the topic of genetic algorithms, but I couldn't find any good resources. If you know of any good resource, book, or site, I would appreciate it. I have solid knowledge of algorithms and AI, but I'm looking for something with a good introduction to genetic programming.

Best references for me so far:

Also, if you're an absolute beginner, I'd suggest you start with the Hello World of Genetic Algorithms. There's nothing like a nice clean example to get started.

I found Melanie Mitchell's book, An Introduction to Genetic Algorithms, to be very good. For a wider coverage of evolutionary computation topics, Introduction to Evolutionary Computing by Eiben and Smith is also worthwhile.

If you're just starting out, I recently wrote an introductory article that may be of use.

There are further links both in that article and also on the home page for my evolutionary computation framework.

If I may plug one of my favorite books, The Algorithm Design Manual by Steve Skiena has a great section on genetic algorithms (plus a lot of other interesting heuristics for solving various types of problems).

I have a technical interview on Monday and they were kind enough to give me a heads-up to brush up on my basic algorithms. It's been years since I looked at that kind of stuff and I'm pretty weak on it to begin with so I generally have a bad feeling about this. What's the best way to review the basics and get some practice in before Monday?

  1. TopCoder Algorithm Tutorials

  2. Get the Algorithm Design Manual and look at the reference section. It has a nice "Problem -> Algorithm" cheat sheet.

I have been coding for a while now and am comfortable with a few programming languages, but now I feel that I need to improve my problem solving or program design technique.

What skills do I need to have to be able to design a program as a solution to a problem?

More: The question is about how to design a solution rather than how to code better.

For problem solving technique, look at how other people have solved problems. I recommend The Algorithm Design Manual and Algorithms in a Nutshell. Both of these books take you from a problem statement through several iterations of solutions to show you the thought process, rather than going directly to the final solution.

Talking about books, I suggest this great one, by Martin Fowler

Refactoring: Improving the Design of Existing Code

I had a painful experience with the "Analysis of Algorithms" classes back in college but have recently found a need for it in the real world. -- Anyway, I'm looking for a simple-yet-effective crash course. Any ideas?

Related Sidenote: It sure would be nice if there were a "Cartoon Guide to Algorithm Analysis", taught by Dilbert.

UPDATE: A very similar question can be found at: How to get started on ALGORITHMS?

You don't say a lot about the rest of your background. For straight-out analysis of algorithms -- the methods by which you evaluate an algorithm to find its order statistics and behavior -- if you're comfortable with mathematics in general (say you've had two years of calculus, or a good abstract algebra course), then you can't really do much better than to read Knuth Volume One.

The usual "Analysis of Algorithms" course is also a data structures course, so a data structures text might be better if you also need to learn about lists, trees, etc. My favorite in graduate school was Aho, Hopcroft and Ullman.

I like Cormen as a reference; it also serves as an admirable doorstop, for the execution of large icky bugs, for clamping small glue joints (the slick cover releases most wood glues), and as an end book because it holds a bookend in place. Wouldn't recommend it as an intro text.

I recommend Data Structures and Algorithms by Adam Drozdek available in Java and C++ editions.

The single most helpful tool I've had for algorithms is Introduction to Algorithms.

It's the single best resource I know of for algorithms. It covers so many topics in good depth and has examples throughout. I still refer to it quite frequently.

Without that book my algorithms analysis class would have been a pain.

I like Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein. It's a bit heavy, but I found it to be a decent reference text.

There are a lot of good books on the subject. I like An Introduction to the Analysis of Algorithms. Also check out the algorithms course on MIT OpenCourseWare (using CLRS as the course text). It's a little bit deep, but having it online allows you to go at your own pace.

A couple of other books that I've started reading recently are Algorithms in a Nutshell and the Algorithm Design Manual. They both take a lighter approach than most algorithms books. Instead of heavy math and formal proofs these books give you realistic problem statements and show you the steps taken to refine an algorithm. They also show you how to estimate and measure the complexity of a solution. I would highly recommend either book.

When faced with a problem in software I usually see a solution right away. Of course, what I see is usually somewhat off, and I always need to sit down and design (admittedly, I usually don't design enough), but I get a certain intuition right away.

My problem is I don't get that same intuition when it comes to advanced algorithms. I feel much more up to the task of building another Facebook than building another Google search, or a Music Genome project. It's probably because I've been building software for quite some time, but I have little experience with composing algorithms.

I would like the community's advice on what to read and what projects to undertake to be better at composing algorithms.

(This question has nothing to do with Algorithmic composition. Well, almost nothing)

Steve Yegge referred to "The Algorithm Design Manual" in one of his rants. I haven't seen it myself, but it sounds like it's just the ticket from his description.

My absolute favorite for this kind of interview preparation is Steven Skiena's The Algorithm Design Manual. More than any other book it helped me understand just how astonishingly commonplace (and important) graph problems are – they should be part of every working programmer's toolkit. The book also covers basic data structures and sorting algorithms, which is a nice bonus. But the gold mine is the second half of the book, which is a sort of encyclopedia of 1-pagers on zillions of useful problems and various ways to solve them, without too much detail. Almost every 1-pager has a simple picture, making it easy to remember. This is a great way to learn how to identify hundreds of problem types.

When we start getting into algorithm design and more discrete computer science topics, we end up having to prove things all of the time. Every time I've seen somebody ask how to become really good at proofs, the common (and possibly lazy) answer is "practice".

Practicing is all fine if you have the basics down, but how do you get into the mind set for mathematical proofs? When did induction click? What resources are best for teaching these topics? What foundation topics should be researched prior to indulging in proof-writing?

I'll start off my answer by admitting that as a CS student, I had a really tough time grasping a formal way of thinking, and it's never easy, unless you have a talent for it.

I'm afraid there is no better answer than practice and study.

A formal mathematical and algorithmic way of thinking about and visualizing problems is a skill which first demands a very deep understanding of the subjects you are dealing with. Second, it requires that you have good knowledge of existing proofs. Try to envision yourself as one of the great scientists who came up with the algorithms you are studying. Understand how you would have tried to tackle that specific problem. Then see how they proved the correctness of their algorithm.

I can only recommend the greatest textbook in this subject which is Intro to Algorithms by CLRS. If you go through it from start to finish, including every exercise, you will enhance your skills.

I'm afraid that "practice" really is the best answer here.

It's very similar to programming: once you get the hang of it, you find patterns which solve problems particularly well, and you can create a picture of the high-level design of novel systems which you've never implemented before. However, neophyte programmers aren't aware of patterns: they hack away at code until they accidentally stumble on some solution which appears to "work".

When you're given a problem to prove, you can usually identify properties ("Do I have a set of distinct objects?", "Am I generating permutations?", "Am I looking to minimize/maximize some value?", etc). Sooner or later, proofs will clump together into vaguely similar groups, where techniques used to solve one problem can easily be applied to novel variations.

Recommended reading:

After reading an introductory book on algorithms and data structures, I am now craving examples of how to combine these for optimal efficiency.

For instance, you can combine hashmaps with specific sorting algorithms to create a simple text search program.

Is there any good book or online resource for this?

(I have already ordered Programming Pearls, which looks great, but I want to learn more about this.)

Good book (worked for me):

Data Structures and Algorithm Analysis in Java (Second Edition)

Published by Addison-Wesley, 2007

ISBN: 0-321-37013-9

To answer my own question, it seems I just have to read up on a lot of algorithms and real world use cases.

As for books, I've ordered

Any good algorithms book is going to have a chapter or two on the importance of choosing the right data structures. I recommend the following books:

I also recommend you check out the Stony Brook Algorithm Repository, particularly the lectures.

http://www.amazon.com/Structure-Interpretation-Computer-Programs-Second/dp/0070004846/ref=sr_1_1?ie=UTF8&qid=1301994609&sr=8-1

I can warmly recommend this book. It is rather abstract with examples in Scheme (a Lisp dialect) but it will really change the way you think about programs, data and algorithms.

In algorithms, I've mostly been self-taught and that's largely been fine. However, I'm having trouble grasping graph algorithms. I'm looking for some kind of reference that has concepts and actual code so I can not only learn the theory (which I usually do OK with) but also get a feel for how graphs are represented and manipulated in practice (what I usually have a harder time grasping). Can SO deliver? Anything from books, to links, to existing projects would be great as long as they have both concept and implementation.

This is language agnostic, but I'm most familiar with python and don't have much experience with FP.

Steve Yegge says this is a terrific book on algorithms that uses graphs extensively.

Both of these are very good:

Free Stuff:

I'm interested in finding out what people would consider the most useful data structures to know in programming. What data structure do you find yourself using all the time?

Answers to this post should help new programmers interested in finding a useful data structure for their problem. Answers should probably include the data structure, information about it or a relevant link, the situation it is being used in, and why it is a good choice for this problem (e.g. ideal computational complexity, simplicity, ease of understanding, etc.).

Each answer should be about one data structure only.

Thanks for any pearls of wisdom and experience people can share.

Graphs are a very powerful but often overlooked data structure.

A lot of problems can be solved by constructing a graph modeling your problem, then using a well-known algorithm on the graph. Some examples: natural language processing (edge weights between nodes can represent how likely one word is to follow another), video games (graphs are used to determine shortest paths for AI characters), and network topology.

I learned about graphs from The Algorithm Design Manual, which was recommended by Steve Yegge in a blog post.

Dijkstra's is typically used to find the shortest distance between two nodes in a graph. Can it be used to find a minimum spanning tree? If so, how?

Edit: This isn't homework, but I am trying to understand a question on an old practice exam.

Strictly, the answer is no. Dijkstra's algorithm finds the shortest path between 2 vertices on a graph. However, a very small change to the algorithm produces another algorithm which does efficiently produce an MST.
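
To make the "very small change" concrete, here is a rough sketch of Prim's algorithm (the MST variant) in C, using a simple O(V^2) adjacency-matrix implementation rather than Skiena's code. The only line that differs from Dijkstra is the key update, which uses the weight of the single edge instead of the accumulated distance from the source:

#include <stdio.h>
#include <limits.h>

#define V 5   /* number of vertices in this toy example */

/* Prim's MST on an adjacency matrix; 0 means "no edge". */
void prim(int g[V][V])
{
    int key[V], parent[V];
    int in_tree[V] = {0};

    for (int i = 0; i < V; i++) { key[i] = INT_MAX; parent[i] = -1; }
    key[0] = 0;   /* grow the tree from vertex 0 */

    for (int count = 0; count < V; count++) {
        /* pick the cheapest vertex not yet in the tree */
        int u = -1;
        for (int i = 0; i < V; i++)
            if (!in_tree[i] && (u == -1 || key[i] < key[u]))
                u = i;
        in_tree[u] = 1;

        for (int y = 0; y < V; y++) {
            if (g[u][y] && !in_tree[y] && g[u][y] < key[y]) {
                key[y] = g[u][y];   /* Prim: key is the weight of the single edge        */
                /* Dijkstra would instead relax the distance from the source:            */
                /*   if (key[u] + g[u][y] < key[y]) key[y] = key[u] + g[u][y];           */
                parent[y] = u;
            }
        }
    }

    for (int i = 1; i < V; i++)
        printf("edge %d - %d (weight %d)\n", parent[i], i, g[i][parent[i]]);
}

int main(void)
{
    int g[V][V] = {
        {0, 2, 0, 6, 0},
        {2, 0, 3, 8, 5},
        {0, 3, 0, 0, 7},
        {6, 8, 0, 0, 9},
        {0, 5, 7, 9, 0},
    };
    prim(g);
    return 0;
}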

The Algorithm Design Manual is the best book I've found to answer questions like this one.

What are the most common problems that can be solved with both these data structures?

It would be good for me to have also recommendations on books that:

  • Implement the structures
  • Implement and explain the reasoning of the algorithms that use them

The Algorithm Design Manual contains some interesting case studies with creative use of graphs. Despite its name, the book is very readable and even entertaining at times.

The first thing I think about when I read this question is: what types of things use graphs/trees? and then I think backwards to how I could use them.

For example, take two common uses of a tree:

  • The DOM
  • File systems

The DOM, and XML for that matter, resemble tree structures.

It makes sense, too. It makes sense because of how this data needs to be arranged. A file system, too. On a UNIX system there's a root node, and branching down below. When you mount a new device, you're attaching it onto the tree.

You should also be asking yourself: does the data fall into this type of structure? Create data structures that make sense to the problem and the rest will follow.

As far as being easier, I think that's relative. Are you good with recursive functions to traverse a tree/graph? What if you need to balance the tree?

Think about a program that solves a word search puzzle. You could map out all the letters of the word search into a graph and check surrounding nodes to see if that string is matching any of the words. But couldn't you just do the same with a single array? All you really need to do is move the index by one to check letters to the left and right, and by the row width to check the letters above and below. Solving this problem with a graph isn't difficult, but it can create a lot of extra work and difficulty if you're not comfortable with using them - of course that shouldn't discourage you from doing it, especially if you are learning about them.
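
As a rough illustration of that indexing trick (a made-up 4x4 puzzle, nothing from the book), a grid stored as one flat array lets you reach the neighbouring letters of cell i with the offsets -1, +1, -width and +width:

#include <stdio.h>

#define WIDTH  4
#define HEIGHT 4

/* Does `word` appear starting at index i, stepping by `step` each time?
 * step = +1/-1 walks right/left, step = +WIDTH/-WIDTH walks down/up.
 * For simplicity, wrap-around at the ends of rows is not checked. */
static int match_from(const char *grid, int i, int step, const char *word)
{
    for (int k = 0; word[k] != '\0'; k++, i += step) {
        if (i < 0 || i >= WIDTH * HEIGHT || grid[i] != word[k])
            return 0;
    }
    return 1;
}

int main(void)
{
    /* 4x4 puzzle flattened row by row */
    const char *grid = "CATS"
                       "XDOG"
                       "XXXX"
                       "BIRD";
    const char *word = "DOG";

    for (int i = 0; i < WIDTH * HEIGHT; i++) {
        if (match_from(grid, i, +1, word))
            printf("found \"%s\" going right from index %d\n", word, i);
        if (match_from(grid, i, +WIDTH, word))
            printf("found \"%s\" going down from index %d\n", word, i);
    }
    return 0;
}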

I hope that helps you think about these structures. As for a book recommendation, I'd have to go with Introduction to Algorithms.

@DavidJoiner / all:

FWIW: A new version of the Algorithm Design Manual is due out any day now.

The entire course that Prof. Skiena developed this book for is also available on the web:

http://www.cs.sunysb.edu/~algorith/video-lectures/2007-1.html

Algorithms for Java: Part 5 by Robert Sedgewick is all about graph algorithms and data structures. This would be a good first book to work through if you want to implement some graph algorithms.

Possible Duplicate:
Are there any open source C libraries with common data structures?

What is the best source of algorithm and data structure implementations for C programmers? Links to an open source library or a book will do. (Please don't point to Sedgewick, as I already have his book).

I'm going through the data structures chapter in The Algorithm Design Manual and came across Suffix Trees.

The example states:

Input:

XYZXYZ$
 YZXYZ$
  ZXYZ$
   XYZ$
    YZ$
     Z$
      $

Output:

(figure: the resulting suffix tree)

I'm not able to understand how that tree gets generated from the given input string. Suffix trees are used to find a given substring in a given string, but how does the given tree help towards that? I do understand the other given example of a trie shown below, but if the trie below gets compacted to a suffix tree, then what would it look like?

(figure: the example trie)

I've read about Big O notation from many sources, including Skiena and the Wikipedia entry, the Example section of which states:

In typical usage, the formal definition of O notation is not used directly; rather, the O notation for a function f(x) is derived by the following simplification rules:

  • If f(x) is a sum of several terms, the one with the largest growth rate is kept, and all others omitted.

  • If f(x) is a product of several factors, any constants (terms in the product that do not depend on x) are omitted.

The solution to problem 2.2 is O((n^3)/3). Shouldn't the "/3" be omitted, or am I missing something?

You are correct. 1/3 is a constant factor, and therefore it should be omitted. Formally, n^3/3 <= c * n^3 for all n >= 1 with c = 1/3, so O((n^3)/3) and O(n^3) describe exactly the same set of functions.

I want to find connected components of a Directed Acyclic Graph using a set of nodes. What would be the most efficient way to solve this problem?

Connected component: if one of the nodes is a predecessor or successor of another node, they are in the same connected component.

For example, let's say I have the following graph and vector = [2,4,5,6,3]. For this vector, there are two connected components, as below.

C1 = [2,4,5,6]
C2 = [3]

(figure: the example DAG)

My solution:

  • Sort the nodes using their depth value
  • Pick a node
  • Check the other nodes if they are successors or not. If so, keep looking. If not, stop and go to step 2.

What do you think?

The way to find connected components is something like the following:

while(any_root_vertex_left_to_start_from) {
    vertex v = pick_a_vertex;
    remove v from root_vertex_list;
    initialize connected_comp_list_of_vertices;
    do a breadth-first traversal from this vertex, adding every
    vertex encountered in the BFS to the connected_comp_list_of_vertices
    of that root
}
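
Below is a minimal runnable sketch of that idea in C (not from the book). The graph is stored as an adjacency matrix, and because the question is about a DAG, the BFS follows each edge in both directions, which amounts to finding the connected components of the underlying undirected graph. The edge list here is made up only so that the result matches the C1/C2 example above:

#include <stdio.h>

#define N 7   /* vertices 1..6 are used; index 0 is unused for convenience */

int adj[N][N];       /* adj[u][v] = 1 if there is an edge u -> v             */
int component[N];    /* 0 = not assigned yet, otherwise the component id     */

/* BFS from start, following edges in either direction, labelling with id */
void bfs_label(int start, int id)
{
    int queue[N], head = 0, tail = 0;
    queue[tail++] = start;
    component[start] = id;

    while (head < tail) {
        int v = queue[head++];
        for (int y = 1; y < N; y++) {
            if ((adj[v][y] || adj[y][v]) && component[y] == 0) {
                component[y] = id;
                queue[tail++] = y;
            }
        }
    }
}

int main(void)
{
    /* hypothetical edges: 2->4, 4->5, 5->6, plus the isolated vertex 3 */
    adj[2][4] = adj[4][5] = adj[5][6] = 1;

    int vertices[] = {2, 4, 5, 6, 3};   /* the vector from the question */
    int ncomp = 0;

    for (int i = 0; i < 5; i++)
        if (component[vertices[i]] == 0)
            bfs_label(vertices[i], ++ncomp);

    for (int i = 0; i < 5; i++)
        printf("vertex %d is in component C%d\n", vertices[i], component[vertices[i]]);
    return 0;
}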

I would suggest reading some more about graph processing. Finding connected components and doing BFS/DFS traversals are fairly standard techniques. An excellent book on this subject is Skiena's Algorithm Design Manual.

My copy of The Design and Analysis of Computer Algorithms arrived today. In the first chapter, the author introduces Turing machines. I have two other algorithms textbooks, Introduction to Algorithms and The Algorithm Design Manual, but neither of them talks about Turing machines, even though both are famous books on the subject of algorithms and data structures.

I would like to understand what the relation is between Turing machines and algorithms/data structures. Is it really important to understand Turing machines to become an expert in algorithms?

Turing machines are just theoretical tools to analyze computation, i.e. we can specify an algorithm by creating a Turing machine which computes it. They are very useful in the study of computability, that is, whether it is possible at all to compute a function. Turing machines and several other formal language constructs are discussed in the classic book by Hopcroft and Ullman. Turing machines also appear when discussing NP-completeness, for instance in the book by Garey and Johnson.

Both books, and Turing machines in general, are pretty theoretical. If you are interested in algorithms in an academic way, I'd recommend them. However, if you want a practical understanding of algorithms implemented on real computers and run on real data, then I'd say it's only important to have a cursory understanding of Turing machines.

The reason that Turing machines are of importance when describing data structures and algorithms is that they provide a mathematical model in which we can describe what an algorithm is. Most of the time, algorithms are described using high-level language or pseudocode. For example, I might describe an algorithm to find the maximum value in an array like this:

Set max = -infinity
For each element in the array:
    If that element is greater than max:
        Set max equal to that element.

From this description it's easy to see how the algorithm works, and it would be easy to translate it into source code. However, suppose that I had written out this description:

Guess the index at which the maximum element occurs.
Output the element at that position.

Is this a valid algorithm? That is, can we say "guess the index" and rigorously define what it means? If we do allow this, how long will it take to do this? If we don't allow it, why not? What's different about the first description from the second?

In order to have a mathematically rigorous definition of an algorithm, we need to have some formal model of how a computer works and what it can and cannot do. The Turing machine is one common way to formally define computation, though there are others that can be used as well (register machines, string rewriting systems, Church's lambda calculus, etc.) Once we have defined a mathematical model of computation, we can start talking about what sorts of algorithmic descriptions are valid - namely, those that could be implemented using our model of computation.

Many modern algorithms depend critically on the properties of the underlying model of computation. For example, cache-oblivious algorithms assume that the model of computation has some memory buffer of an unknown size and a two-tiered memory. Some algorithms require that the underlying machine be transdichotomous, meaning that the size of a machine word must be at least large enough to hold the size of any problem. Randomized algorithms require a formal definition of randomness and how the machine can use random values. Nondeterministic algorithms require a means of specifying a nondeterministic computation. Algorithms based on circuits must know what circuit primitives are and are not allowed. Quantum computers need a formal definition of what operations are and are not allowed, along with what the definition of an algorithm is given that the output is probabilistic. Distributed algorithms need a formal definition of communication across machines.

In short, it's important to be explicit about what is and is not allowed when describing an algorithm. However, to be a good programmer or to have a solid grasp of algorithms, you don't necessarily need to know Turing machines inside and out, nor do you need to know about the specific details of how you'd encode particular problems using them. What you should know, though, is what the model of computation can and cannot do, and what the cost is per operation. This way, you can reason about how efficient algorithms are and how much of various resources (time, space, memory, communication, randomness, nondeterminism, etc.) they use. But that said, don't panic if you don't understand the underlying model.

There is one other reason to think about the underlying model of computation - discussing its limitations. Every model of computation has its limits, and in some cases you can prove that certain algorithms cannot possibly exist for certain problems, or that any algorithm that solves some problem necessarily must use some amount of a given resource. The most common example where this comes up in algorithm design is the notion of NP-hardness. Some problems are conjectured to be extremely "difficult" to solve, but the formal definitions of what this difficulty means rely on knowledge of Turing machines and nondeterministic Turing machines. Understanding the model is useful in this case because it allows you to reason about the computational feasibility of certain problems.

Hope this helps!

In The Algorithm Design Manual, the author provides an algorithm for two-coloring a graph. It's similar to the algorithm that counts the number of components, in that it iterates over all available vertices, and then colors and performs a BFS on that vertex only if it is not discovered:

for(i = 1; i <= (g->nvertices); i++) {
    if(discovered[i] == FALSE) {
        color[i] = WHITE;
        bfs(g, i);
    }
}

The BFS calls a process_edge function when y in the edge x -> y is not processed, or if the graph is directed. The BFS looks like this:

void bfs(graph *g, int start) {

    queue q; //queue of vertices to visit
    int v; //current vertex
    int y; //successor vertex
    edgenode* p; //temporary pointer used to traverse adjacency list

    init_queue(&q);
    enqueue(&q, start);
    discovered[start] = TRUE;

    while(empty_queue(&q) == FALSE) {
        v = dequeue(&q);

        process_vertex_early(v); //function that handles early processing
        processed[v] = TRUE;
        p = g->edges[v];

        while(p != NULL) {
            y = p->y;

            //If the node hasn't already been processed, or if the graph is directed
            //process the edge. This part bothers me with undirected graphs
            //because how would you process an edge that was wrongly colored
            //in an earlier iteration when it will have processed[v] set to TRUE?
            if((processed[y] == FALSE) || g->directed) {
                process_edge(v, y); //coloring happens here
            }

            if(discovered[y] == FALSE) {
                enqueue(&q, y);
                discovered[y] = TRUE;
                parent[y] = v;
            }

            p = p->next;
        }

        process_vertex_late(v); //function that handles late processing
    }
}

The process_edge function looks like this:

void process_edge(int x, int y) {
    if(color[x] == color[y]) {
        bipartite = FALSE;
        printf("Warning: not bipartite due to (%d, %d)\n", x, y);
    }

    color[y] = complement(color[x]);
}

Now assume we have a graph like this:

(figure: the example graph)

We can two-color it like this:

(figure: a valid two-coloring of the graph)

But if we are traversing it by vertex order, then we will initially start with node 1, and color it to WHITE. Then we will find node 13 and color it to BLACK. In the next iteration of the loop, we are looking at node 5, which is undiscovered and so we will color it WHITE and initiate a BFS on it. While doing this, we will discover a conflict between nodes 5 and 1 because 1 should be BLACK, but it was previously set to WHITE. We will then discover another conflict between 1 and 13, because 13 should be WHITE, but it was set to BLACK.

When performing a normal traversal of a graph through all components (connected or not), order will not matter since we will end up visiting all the nodes anyway; however, the order does seem to matter when coloring the graph. I didn't see a mention of this in the book, and I only ran across this issue when trying to two-color a randomly-generated graph like the one above. I was able to make a small change to the existing algorithm, which eliminated this problem:

for(i = 1; i <= (g->nvertices); i++) {
    //Only initiate a BFS on undiscovered vertices and vertices that don't
    //have parents.
    if(discovered[i] == FALSE && parent[i] == NULL) {
        color[i] = WHITE;
        bfs(g, i);
    }
}

Does this change make sense, or is it a hack due to my not understanding some fundamental concept?

UPDATE

Based on G. Bach's answer, assume we have the following graph:

(figure: the example graph)

I'm still confused as to how this would end up being two-colored properly. With the original algorithm, the first iteration will initiate a BFS with node 1 to give us a graph that is colored like this:

(figure: the coloring after the BFS from node 1)

In the next iteration, we will initiate a BFS with node 5 to give us a graph that is colored like this:

(figure: the coloring after the BFS from node 5)

The next iteration will initiate a BFS with node 6 to give us a graph that is colored like this:

(figure: the coloring after the BFS from node 6)

But now we won't re-color 5 because we have already visited it and so this leaves us with a graph that hasn't been colored properly.

The directed nature of the graph has no bearing on the bipartite coloring problem you posed, unless you define a different problem where the direction would indeed begin to matter. So you can convert the graph you used in your example into an undirected graph and run the algorithm as given in the textbook.

While the textbook does not explicitly mention that the graph should be undirected, edge direction has no bearing on the common coloring problems we study. However, you could define problems which take into account edge directions (http://www.labri.fr/perso/sopena/pmwiki/index.php?n=TheOrientedColoringPage.TheOrientedColoringPage).
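
In code terms, "converting to an undirected graph" just means storing each edge on both adjacency lists, so the BFS (and therefore the coloring) can walk it in either direction. Here is a rough sketch using stand-ins for the edgenode/graph types from the code above; insert_undirected_edge is a hypothetical helper, not a function from the book:

#include <stdlib.h>

/* Minimal stand-ins for the types used in the question's code. */
typedef struct edgenode {
    int y;                      /* adjacent vertex */
    struct edgenode *next;      /* next edge in the adjacency list */
} edgenode;

typedef struct {
    edgenode *edges[1000];      /* adjacency lists, indexed by vertex */
    int nvertices;
    int directed;               /* always false for the coloring problem */
} graph;

/* Push vertex y onto x's adjacency list. */
static void insert_arc(graph *g, int x, int y)
{
    edgenode *p = malloc(sizeof(edgenode));
    p->y = y;
    p->next = g->edges[x];
    g->edges[x] = p;
}

/* Store the edge in both directions so BFS/coloring can walk it either way. */
void insert_undirected_edge(graph *g, int x, int y)
{
    insert_arc(g, x, y);
    insert_arc(g, y, x);
}

int main(void)
{
    graph g = { {0}, 3, 0 };            /* 3 vertices, undirected */
    insert_undirected_edge(&g, 1, 2);
    insert_undirected_edge(&g, 2, 3);
    return 0;
}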

Note: I intended to write this as a comment, but as a newbie, I am not allowed to do so, until I accumulate a few reputation points.

Lately I have been stuck on improving my algorithmic skills, and at this point I am finding myself out of good material for solving grid problems based on DFS and BFS. I somehow managed to do http://www.spoj.pl/problems/POUR1/ with brute-force logic, but I recently googled around and found out that the problem can be done by BFS. However, I can't figure out exactly how to go about it. Can someone please provide some text to read, or some kind of explanation of the above-mentioned problem, so I can add this to my skill set? It would be extremely kind if you could also help me out with these techniques for problems like http://www.codechef.com/problems/MMANT/. Please help as soon as possible; I am really stuck on this kind of problem and can't move on. It would also be really kind if you could provide a list of good questions about Binary Indexed Trees and segment trees, and some more examples of their usage.

Thanks for the help!! :)

One resource I've found useful is The Algorithmist:

The Algorithmist is a resource dedicated to anything algorithms - from the practical realm to the theoretical realm. There are also links to, and explanations of, problem sets.

Also The Algorithm Design Manual by Steve Skiena is extremely useful, especially the second part.

What algorithms should a Java developer (or, maybe a better question, a software developer in general) know? I have Introduction to Algorithms by Cormen and Algorithms by Richard Johnsonbaugh at hand. Would going through their content be overkill?

CLRS is probably the most widely used one, and reading it is definitely not overkill; just skip the proofs (I would read them just for enjoyment, but they have little practical use). You may also want to check this one.

With problems and suggested solutions? Such as: various graph/network-based problems, comparisons of advanced sorting algorithms, dynamic programming for various problems, and optimised BSTs and their implementations for solving various problems.

Check out the Algorithm Design Manual by Steven Skiena. The first half is a tutorial with "war stories" interspersed throughout (real-life problems and solutions), while the second half is a comprehensive reference of various problems, known algorithms, and their implementations.

It's helpful whether you're trying to prepare for job interviews, learn algorithms, or just as a reference.

On Amazon: http://www.amazon.com/Algorithm-Design-Manual-Steve-Skiena/dp/0387948600

There's an online PDF version.