Compiler Construction

Kenneth C. Louden


This compiler design and construction text introduces students to the concepts and issues of compiler design, and features a comprehensive, hands-on case study project for constructing an actual, working compiler.


Mentioned in questions and answers.

I've been wanting to play around with writing my own language for a while now (ostensibly for the learning experience) and as such need to be relatively grounded in the construction of Parsers, Interpreters, and Compilers. So:

  • Does anyone know of any good resources on constructing Parsers, Interpreters, and Compilers?

EDIT: I'm not looking for compiler-compilers/parser-compilers such as Lex, Yacc and Bison...

Compiler Construction: Principles and Practice is one of the best books on the subject.

I recommend Compiler Design in C, which unfortunately you would have to find on a used-book site. The only real problem with the book is that it was written back when speed of compilation was an important factor, so the compiler is written in C. C is low-level enough that the theory is sometimes buried under the implementation code.

You mentioned both Interpreters and Compilers. I'd actually recommend starting with an Interpreter rather than a Compiler. It's much easier to get started with an Interpreter and they tend to be more fun to work on because you can get immediate feedback on how you're doing.
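To give a feel for how small a first interpreter can be, here is a minimal sketch of a recursive-descent evaluator for arithmetic expressions. It is not taken from any of the books mentioned; the grammar and the names `tokenize`, `Interpreter`, and `evaluate` are assumptions made up for illustration.

```python
import re


def tokenize(text):
    """Split an arithmetic expression into number and operator tokens.

    Sketch-level simplification: characters that match nothing are skipped.
    """
    return re.findall(r"\d+\.?\d*|[-+*/()]", text)


class Interpreter:
    """Recursive-descent evaluator: each grammar rule is one method."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or not token[0].isdigit():
            raise SyntaxError(f"unexpected token: {token!r}")
        return float(token)


def evaluate(text):
    """Tokenize, parse, and evaluate in one pass; reject trailing tokens."""
    parser = Interpreter(tokenize(text))
    result = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token: {parser.peek()!r}")
    return result
```

Calling `evaluate("(1 + 2) * 3")` from a prompt is exactly the kind of immediate feedback described above: there is no code generation step between the parser and the result.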

Are the stages of compilation of a C++ program specified by the standard?

If so, what are they?

If not, an answer for a widely-used compiler (I'd prefer MSVS) would be great.

I'm talking about preprocessing, tokenization, parsing and such. What is the order in which they are executed and what do they do in particular?

EDIT: I know what compilation, linking and preprocessing do, I'm mostly interested in the others and the order. Explanations for these are, of course, also welcomed since I might not be the only one interested in an answer.

The C++ specification is intentionally vague in many respects, mostly to remain implementation independent. Many of the areas where the language is vague aren't a large concern anymore; for example, you can usually rely on a char being 8 bits. However, other issues, such as the layout of structures that use multiple inheritance, are a real concern, as are the implications of virtual functions on classes. These issues affect the compatibility of code generated by different compilers. The Application Binary Interface (ABI) of C++ isn't rigorously defined, and as a result you occasionally have to dip into C where this becomes problematic. Writing a plugin interface is a good example.

Similarly, although the standard does define abstract phases of translation, it doesn't give a detailed description of how a compiler should be built, because there are many key decisions and features that differentiate compilers. For example, MSVC can perform partial builds (allowing edit and continue), which GCC doesn't. Generally speaking, though, all compilers perform similar stages: preprocessing, syntax parsing, determining program flow, producing a symbol table, and producing a linear series of instructions (object code). Finally, the object files are linked to produce an executable, which is usually done by a separate linker.
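As a rough illustration of how those stages hand data to each other, here is a toy pipeline for statements of the form `name = expression`. It is only a sketch: it is not how MSVC or GCC are structured, it skips preprocessing and linking entirely, and the function names and the `PUSH`/`LOAD`/`STORE`/`ADD`/`MUL` instruction set are invented for the example.

```python
import re


def tokenize(source):
    """Lexical analysis: characters -> tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+*()]", source)


def parse(tokens):
    """Syntax analysis: tokens -> tree for 'name = expression'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():  # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():  # term := atom ('*' atom)*
        node = atom()
        while peek() == "*":
            advance()
            node = ("*", node, atom())
        return node

    def atom():  # atom := NUMBER | NAME | '(' expr ')'
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return ("num", int(token))
        if token.isidentifier():
            return ("var", token)
        raise SyntaxError(f"unexpected token {token!r}")

    target = advance()
    if target is None or not target.isidentifier() or advance() != "=":
        raise SyntaxError("expected 'name = expression'")
    tree = ("assign", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def generate(tree, symbols):
    """Code generation: tree -> linear stack-machine instructions.

    `symbols` is the symbol table, mapping each variable name to a slot number.
    """
    kind = tree[0]
    if kind == "num":
        return [("PUSH", tree[1])]
    if kind == "var":
        if tree[1] not in symbols:
            raise NameError(f"undefined variable {tree[1]!r}")
        return [("LOAD", symbols[tree[1]])]
    if kind == "assign":
        code = generate(tree[2], symbols)  # right-hand side first
        slot = symbols.setdefault(tree[1], len(symbols))
        return code + [("STORE", slot)]
    op = "ADD" if kind == "+" else "MUL"
    return generate(tree[1], symbols) + generate(tree[2], symbols) + [(op,)]


def compile_program(lines):
    """Run every stage over each line and concatenate the instructions."""
    symbols, code = {}, []
    for line in lines:
        code += generate(parse(tokenize(line)), symbols)
    return code, symbols
```

For example, `compile_program(["x = 2 + 3 * 4"])` yields the instruction list `PUSH 2, PUSH 3, PUSH 4, MUL, ADD, STORE 0` together with the symbol table `{"x": 0}`. Real compilers add semantic analysis, optimization passes, and a target-specific back end between these steps.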

I had a brief look, and it's rather hard to find descriptions of individual compilers. I doubt there's much out there on commercial compilers like Microsoft's offering, purely for commercial reasons. GCC is your best bet, although Microsoft is happy to describe the process. This is pretty banal stuff though: compilers all work pretty much the same way. The real gold is in how they execute these stages: the algorithms and data structures they use. In that respect, I recommend this book. I bought a brand new copy for a university course a few years back, even though I borrowed most of my textbooks from the library :).

I'm taking several classes this fall for my master's, and one of them is Compiler Design and Construction. I am pretty well versed in most things related to computer technology, but I have not had much experience with how compilers do the dirty work; I just use them when I need to. I am not usually nervous about classes, but I kind of feel like I am walking into this one naked. If anyone can recommend some good reading or provide a short list of basic principles that I can research to bring me up to speed quickly, I would be most grateful.

UPDATE:

Well, I did great in the class, and the textbook we used was actually very good. This site also helped me visualize and test my regular expressions (which I now believe are the best thing to master when learning about compilers). I picked up the basics of LEX pretty quickly, but YACC (for some reason) was a bit harder for me. Simply looking up examples online helped with both of them.
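For anyone wondering why regular expressions matter so much here: a lexer is essentially an ordered list of (token name, regular expression) rules, which is the idea behind LEX. Below is a minimal sketch of that idea using Python's `re` module instead of LEX itself. The token names and the tiny keyword set are assumptions chosen for the example.

```python
import re

# Each rule pairs a token name with a regular expression, much like a LEX rule.
# Order matters: earlier alternatives win, so KEYWORD is listed before IDENT.
TOKEN_RULES = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("KEYWORD", r"\b(?:if|else|while)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|<=|>=|[-+*/=<>(){};]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),  # catch-all so bad input is reported, not skipped
]

# Combine all rules into one pattern with a named group per rule.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES)
)


def tokenize(source):
    """Return a list of (kind, text) pairs, dropping whitespace."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {text!r} at offset {match.start()}"
            )
        tokens.append((kind, text))
    return tokens
```

Running `tokenize("while (count <= 10) count = count + 1;")` produces a `KEYWORD` token for `while`, `IDENT` tokens for `count`, an `OP` token for `<=`, and so on. The word boundaries in the `KEYWORD` rule keep an identifier such as `whileloop` from being split into a keyword plus a suffix.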

Check the Dragon Book

Possible Duplicate:
Learning to write a compiler

I need to come up with a dummy SQL-like language that has very limited features. I have never done any compiler or parsing work before. Can anyone point me to a good place to start, maybe a link or an example? I am so clueless.

I will be using this dummy language with C/C++ as my primary language.

Thanks

I did a compiler construction course last year and we used the book

Compiler Construction by Kenneth C. Louden

It is very detailed, with a good theoretical background. At the same time, the author gives enough examples and uses very informative figures, so that you're never lost while learning. The later chapters list a complete compiler, written in C, for a toy language.

I really liked it!

The Dragon Book is often considered a good starting point. However, I would also recommend the ANTLR book.
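For a very limited SQL-like language like the one asked about, a hand-written parser is often enough to get started before reaching for a book or a generator. The sketch below is in Python for brevity, but the same structure carries over to C or C++. The grammar (`SELECT columns FROM table [WHERE column = literal]`) and all function names are assumptions invented for this example, not a real SQL dialect.

```python
import re

# One token per match: an integer, a 'quoted string', a word, or a symbol.
TOKEN = re.compile(r"\s*(?:(\d+)|'([^']*)'|(\w+)|([,=*]))")


def tokenize(query):
    tokens, pos = [], 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if not match:
            raise SyntaxError(f"bad input at offset {pos}")
        number, string, word, symbol = match.groups()
        if number is not None:
            tokens.append(("NUM", int(number)))
        elif string is not None:
            tokens.append(("STR", string))
        elif word is not None:
            tokens.append(("WORD", word))
        else:
            tokens.append(("SYM", symbol))
        pos = match.end()
    return tokens


def parse(query):
    """Parse: SELECT (* | col, col, ...) FROM table [WHERE col = literal]."""
    tokens, pos = tokenize(query), 0

    def at(kind, value):
        # Keywords are matched case-insensitively.
        return (pos < len(tokens) and tokens[pos][0] == kind
                and str(tokens[pos][1]).upper() == value)

    def take(kind, value=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of query")
        if tokens[pos][0] != kind or (value is not None and not at(kind, value)):
            raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
        pos += 1
        return tokens[pos - 1][1]

    take("WORD", "SELECT")
    if at("SYM", "*"):
        take("SYM", "*")
        columns = None  # None means "all columns"
    else:
        columns = [take("WORD")]
        while at("SYM", ","):
            take("SYM", ",")
            columns.append(take("WORD"))
    take("WORD", "FROM")
    table = take("WORD")
    where = None
    if at("WORD", "WHERE"):
        take("WORD", "WHERE")
        column = take("WORD")
        take("SYM", "=")
        kind = tokens[pos][0] if pos < len(tokens) else None
        if kind not in ("NUM", "STR"):
            raise SyntaxError("expected a literal after '='")
        where = (column, take(kind))
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return {"columns": columns, "table": table, "where": where}


def execute(query, tables):
    """Interpret a parsed query against in-memory tables (lists of dicts)."""
    plan = parse(query)
    rows = tables[plan["table"]]
    if plan["where"] is not None:
        column, value = plan["where"]
        rows = [row for row in rows if row[column] == value]
    if plan["columns"] is None:
        return [dict(row) for row in rows]
    return [{name: row[name] for name in plan["columns"]} for row in rows]
```

With `tables = {"users": [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]}`, the call `execute("SELECT name FROM users WHERE id = 2", tables)` returns `[{"name": "bob"}]`. Once the grammar grows beyond a handful of rules, a generator such as ANTLR starts to pay off.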

I want to implement a debugger for another language on .NET (it's just for academic reasons, so it only needs to implement part of a language). Specifically, I would like to implement NS2 (Network Simulator 2) scripting for .NET, so that anybody can write an NS2 script and debug it with .NET.

I read this article on Stack Overflow, and it is far from what I'm looking for.

Here are the requirements:

  • have some predefined keywords (e.g. for, while, if ...)
  • check the correct form of the statements (e.g. for(start;end;counter){commands} ...)
  • different colours for different types of statements
  • ability to add it to any IDE (e.g. implemented as an add-in, as a DLL, or ... (I have no idea))
  • many other things that are not necessary for now

How can I do this?

Update: I'm not sure that you got my point. Take a look at this; it is very close to what I am looking for.

It will not be an easy task. However: The Dragon Book is probably a good place to start (assuming you've got sufficient computer science background for a compiler theory book to make much sense to you). Compiler Construction: Principles and Practice is also a good text.

You'll want to compile to CIL (Common Intermediate Language). This handy wiki article outlines the CIL instruction set. Debugging your intermediate code against the CLR... well, that's where the Stack Overflow article you've linked will come in handy =)

That'll cover your first two bullets (and consume a big chunk of your life).
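To make those first two bullets a little more concrete, here is a minimal sketch of a keyword table plus a shape check for the `for(start;end;counter){commands}` form from the question. It only checks bracket structure and the number of header parts; it does not parse the parts themselves and does not ignore brackets inside string literals. The keyword set and the function names are assumptions for illustration.

```python
import re

# Assumed keyword set; the question only lists "for, while, if ...".
KEYWORDS = {"for", "while", "if"}


def classify(word):
    """First bullet: decide whether a word is a predefined keyword."""
    return "keyword" if word in KEYWORDS else "identifier"


def matching_close(text, open_pos):
    """Index of the bracket that closes the one at open_pos, or -1."""
    pairs = {"(": ")", "{": "}"}
    opener, closer = text[open_pos], pairs[text[open_pos]]
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == opener:
            depth += 1
        elif text[i] == closer:
            depth -= 1
            if depth == 0:
                return i
    return -1


def check_for_statement(text):
    """Second bullet: True if text looks like for(start;end;counter){commands}."""
    text = text.strip()
    header = re.match(r"for\s*\(", text)
    if header is None:
        return False
    open_paren = header.end() - 1
    close_paren = matching_close(text, open_paren)
    if close_paren == -1:
        return False
    # The header must contain exactly three ';'-separated parts.
    if len(text[open_paren + 1:close_paren].split(";")) != 3:
        return False
    rest = text[close_paren + 1:].lstrip()
    if not rest.startswith("{"):
        return False
    # The brace that closes the body must be the last character.
    return matching_close(rest, 0) == len(rest) - 1
```

A real front end would replace `check_for_statement` with a proper parser built from a grammar, which is what the books above teach, but a check like this is enough to experiment with error reporting early on.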

The next two are different issues, but the easiest way to 'make it go' would probably be to define a syntax for an existing text editor, and set up a macro in the program to call your compiler. I'd recommend TextPad, though I'm sure opinions on a configurable general-purpose text editor will vary among the community ;)

Designing a full IDE with all of the features you've come to know and love in your environment could be quite a task ... or you could try to build an Eclipse plugin. Personally (assuming you can design your language and learn something from it), I'd just stick with syntax highlighting in TextPad.