Introduction to Automata Theory, Languages, and Computation

John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman

Mentioned 5

This classic book on formal languages, automata theory, and computational complexity has been updated to present theoretical concepts in a concise and straightforward manner with the increase of hands-on, practical applications. This new edition comes with Gradiance, an online assessment tool developed for computer science. Gradiance is the most advanced online assessment tool developed for the computer science discipline. With its innovative underlying technology, Gradiance turns basic homework assignments and programming labs into an interactive learning experience for students. By using a series of “root questions” and hints, it not only tests a student’s capability, but actually simulates a one-on-one teacher-student tutorial that allows for the student to more easily learn the material. Through the programming labs, instructors are capable of testing, tracking, and honing their students’ skills, both in terms of syntax and semantics, with an unprecedented level of assessment never before offered.

More on

Mentioned in questions and answers.

It is said that true mastery of a subject is achieved when you can explain it simply. Unfortunately as an Electronics Engineer I lack some of the more formal aspects of computer science.

Taking into consideration that there is some background in math, how would you explain computational complexity theory to the naïve? Where to start to get into this very important topic of CS? I understand some of the concepts involved, but I lack a general overview that allows me to drop into the details.

Edit: Big O notation is clear to me. What I am looking more is an explanation of what is this P = NP question. What is a problem P? What is NP? what is a NP-Hard?

Edit 2: Some really great answers! now a lot of things make more sense. The problem is that sometimes wikipedia is written as if the reader already understood all concepts involved. Now with this quick overview many of these articles make a lot more sense

Unfortunately, the best two books I am aware of (Garey and Johnson and Hopcroft and Ullman) both start at the level of graduate proof-oriented mathematics. This is almost certainly necessary, as the whole issue is very easy to misunderstand or mischaracterize. Jeff nearly got his ears chewed off when he attempted to approach the matter in too folksy/jokey a tone.

Perhaps the best way is to simply do a lot of hands-on work with big-O notation using lots of examples and exercises. See also this answer. Note, however, that this is not quite the same thing: individual algorithms can be described by asymptotes, but saying that a problem is of a certain complexity is a statement about every possible algorithm for it. This is why the proofs are so complicated!

  1. How do you find the minimal Deterministic FSM?
  2. Is there a way to normalize non-deterministic FSMs?
  3. Is there a linear time bound algorithm to find the minimal FSM for a given machine?
  4. Is there a way to see if two FSMs are equivalent?

This is not a homework question. I was watching this lecture series and just got curious.

Some informal answers to give you the ideas, for detailed proofs read a good book on Automata, for example this one or the ones mentioned in the other answers. And I am pretty sure there are online materials you could find answering all of your questions.

  • How do you find the minimal Deterministic FSM?

The procedure is to eliminate duplicated states (or to merge equivalent states). You know that state and transitions are the keys to generate strings. Basically, duplicated states do not contribute in making the language generated any larger or less. The algorithm starts from final states that always have the ability of generating the lamda (empty string), and recursively update a table which indicate the generating ability of a state, and finally merge those states making no difference.

  • Is there a way to normalize non-deterministic FSMs?

The normalized DFA for the NFA is using different collections of the states of NFA as DFA's states, for example, {state0} -(1)-> {state1, state2} to remove the non-deterministic part, there is no way to avoid the state explosion as DFA has to do that to represent the language.

  • Is there a linear time bound algorithm to find the minimal FSM for a given machine?

I remember the best one is O(NLogN) by doing some tricks of reusing information in some paper by a Western Ontario University Professor, and doubt there exists better ones. I believe the classical one is O(N^2).

  • Is there a way to see if two FSMs are equivalent?

Yes. Get the minimal one, code the state by their accessing string (a string that could reach the state from the start state, this is pretty much the real "name" of a state there), and verify the transition map. There might be better ways though, but there would be no big difference on the bigO.

What would be a regex (PHP) to replace/remove (using preg_replace()) END where its not been preceded by an unended START?

Here are a few examples to portray what I mean better:

Example 1:




sometext.... //because theres no START, therefore no need for the excess END

Example 2:




STARTsometext....END //because its preceded by a START

Example 3:




STARTsometext....END....... //because the END is not preceded by a START

Hoping someone can help?

Thank You.

This is a textbook example of a non-regular language (START and END are the equivalent of opening and closing parentheses). That means you cannot match this language with a simple regular expression. You can do it to some specific depth with a complicated regex, but not arbitrary depth.

You need to write a language parser.

Related reading:

Please suggest me some good books on "Formal languages and Automata Theory".


The book here is Introduction to Automata Theory, Languages and Computation by Hopcroft, Motwani and Ullman (Ullman is one of the dragon book guys). (I recommend finding an older edition in your library if you can; the older editions were shorter and I don't see much value in the additional material in the new editions).

Another great book is Introduction to the Theory of Computation by Sipser.

You can not go wrong with one of those two.

My copy of The Design and Analysis of Computer Algorithms has arrived today. In the first chapter, the author introduced Turing Machines. I have two other algorithms textbooks, Introduction to Algorithms and The Algorithm Design Manual, but none of them talks about Turing machines, even though they are famous on the subject of algorithms and data structures.

I would like to understand What is the relation between Turing Machine and Algorithm/Datastructure. Is is really important to understand Turing machines to become expert in algorithms?

Turing machines are just theoretical tools to analyze computation, ie. we can specify an algorihm by creating a turing machine which computes it. They are very useful in the study of computability, that is, if it is possible at all to compute a function. Turing machines and several other formal language constructs are discuessed in the classic book by Hopcroft and Ullmann. Turing machines also appear when discussing NP-completeness for instance in this book, by Garey and Johnson.

Both books and turing machines in general are pretty theoretical. If you are interested in algorihhms in an academic way, I'd recommend them. However, if you want a practical understanding of algorihms implemented on real computers and run on real data then I'd say that it's only important to have a cursory understanding of turing machines.

The reason that Turing machines are of importance when describing data structures and algorithms is that they provide a mathematical model in which we can describe what an algorithm is. Most of the time, algorithms are described using high-level language or pseudocode. For example, I might describe an algorithm to find the maximum value in an array like this:

Set max = -infinity
For each element in the array:
    If that element is greater than max:
        Set max equal to that element.

From this description it's easy to see how the algorithm works, and it would be easy to translate it into source code. However, suppose that I had written out this description:

Guess the index at which the maximum element occurs.
Output the element at that position.

Is this a valid algorithm? That is, can we say "guess the index" and rigorously define what it means? If we do allow this, how long will it take to do this? If we don't allow it, why not? What's different about the first description from the second?

In order to have a mathematically rigorous definition of an algorithm, we need to have some formal model of how a computer works and what it can and cannot do. The Turing machine is one common way to formally define computation, though there are others that can be used as well (register machines, string rewriting systems, Church's lambda calculus, etc.) Once we have defined a mathematical model of computation, we can start talking about what sorts of algorithmic descriptions are valid - namely, those that could be implemented using our model of computation.

Many modern algorithms depend critically on the properties of the underlying model of computation. For example, cache-oblivious algorithms assume that the model of computation has some memory buffer of an unknown size and a two-tiered memory. Some algorithms require that the underlying machine be transdichotomous, meaning that the size of a machine word must be at least large enough to hold the size of any problem. Randomized algorithms require a formal definition of randomess and how the machine can use random values. Nondeterministic algorithms require a means of specifying a nondeterministic computation. Algorithms based on circuits must know what circuit primitives are and are not allowed. Quantum computers need a formal definition of what operations are and are not allowed, along with what the definition of an algorithm is given that the output is probabilistic. Distributed algorithms need a formal definition of communication across machines.

In short, it's important to be explicit about what is and is not allowed when describing an algorithm. However, to be a good programmer or to have a solid grasp of algorithms, you don't need to necessarily know Turing machines inside and out, nor do you need to know about the specific details of how you'd encode particular problems using them. What you should know, though, is what the model of computation can and cannot do, and what the cost is per operation. This way, you can reason about how efficient algorithms are, how much of various resources (time, space, memory, communication, randomess, nondeterminism, etc.) they use. But that said, don't panic if you don't understand the underlying model.

There is one other reason to think about the underlying model of computation - discussing its limitations. Every model of computation has its limits, and in some cases you can prove that certain algorithms cannot possibly exist for certain problems, or that any algorithm that would solve some problem necessarily must use some amount of a given resource. The most common example where this comes up in algorithm design the notion of NP-hardness. Some problems are conjectured to be extremely "difficult" to solve, but the formal definitions of what this difficulty is relies on knowledge of Turing machines and nondeterministic Turing machines. Understanding the model is useful in this case because it allows you to reason about the computational feasibility of certain problems.

Hope this helps!