Data Structures and Algorithms

Alfred V. Aho

Mentioned 8

Data -- Data Structures.

More on

Mentioned in questions and answers.

Can you recommend me a book or (better!) a site with many hard problems and exercises about data structures?

I'm already answering project Euler questions, but these questions are about interesting, but uncommon algorithms. I hardly used even a simple tree. Maybe there is a site with exercises like: hey, you need to calculate this: ... . Do it using a tree. Now do it using a zipper. Upload your C (Haskell, Lisp, even Pascal or Fortress go) solution. Oh, your solution is so slow!

Self-education is very hard then you trying to learn very common, fundamental things. How can I help myself with them without attending to courses or whatever?

If you want an enlightening alternative for learning algorithms you can always try: Rabhi F., Lapalme G. Algorithms.. a functional programming approach.

The authors challenge more traditional methods of teaching algorithms by using a functional programming context, with Haskell as the implementation language. This leads to smaller, clearer and more elegant programs which enable the programmer to understand the algorithm itself more quickly and to use that understanding to explore alternative solutions. Placing the emphasis on program development rather than the mathematical properties of algorithms, the book uses a succession of practical programming examples to develop in the reader problem-solving skills which can be easily transferred to other language other language paradigms.

As for a site with (hard) exercises you can always try to solve, I recommend: spoj .

Introduction to Algorithms, Third Edition, by Cormen, Leiserson, Rivest and Stein is a good intro to algorithms and data structures. It has many exercises at the end of each chapter. most of them are simple, but there are some more difficult.

This has to be a duplicate.

I'd recommend the MIT open courseware site here. There's algorithms courses in the "Electrical Engineering and Computer Science" section some way down the page.

6.006 - Introduction to Algorithms
6.046J - Introduction to Algorithms (SMA 5503)

I recommend the latter. The materials are on the site. The videos are probably best accessed from YouTube here - search for "mit algorithms". The textbook is well respected. Third edition is just out, second edition matches the course. The first edition was also included as part of Dr Dobbs Algorithms and Data Structures CD ROM.

Niklaus Wirth has an Algorithms and Data Structures book available for download from his personal site. I have the Modula 2 print version, and while it's not a a substitute for Cormen (or aho hopcroft ullman, etc) it's a nice book to have.

CS 61B: Data Structures - Fall 2006

Instructor: Jonathan Shewchuk

Fundamental dynamic data structures, including linear lists, queues, trees, and other linked structures; arrays strings, and hash tables. Storage management. Elementary principles of software engineering. Abstract data types. Algorithms for sorting and searching. Introduction to the Java programming language.

Also you can read this book for Algorithms..

Apart from the aforementioned Cormen, Leiserson and Rivest, there is also a very new book by Peter Brass, "Advanced Data Structures". It has relatively ugly example code in C, and the author is somewhat fanatic about performance (for example, he doesn't use recursion), but the theoretical content of that book is brilliant and unique, it hardly intersects with Cormen. I expect it to become a classic.

I had a painful experience with the "Analysis of Algorithms" classes back in college but have recently found a need for it in the real world. -- Anyway, I'm looking for a simple-yet-effective crash course. Any ideas?

Related Sidenote: It sure would be nice if there were a "Cartoon Guide to Algorithm Analysis", taught by Dilbert.

UPDATE: A very similar question can be found at: How to get started on ALGORITHMS?

I recommend Data Structures and Algorithms by Adam Drozdek available in Java and C++ editions.

You don't say a lot about the remainder of you background. For straight out analysis of algorithms, the methods by which you evaluate an algorithm to find its order statistics and behavior, If you're comfortable with mathematics in general -- say you've had two years of calculus, or a good abstract algebra course -- then you can't really do much better than to read Knuth Volume One.

The usual "Analysis of Algorithms" course is also a data structures course, so a data structures text might be better if you also need to learn about lists, trees, etc. My favorite in graduate school was Aho, Hopcroft and Ullman.

I like Cormen as a reference; it also serves as an admirable doorstop, for the execution of large icky bugs, for clamping small glue joints (the slick cover releases most wood glues), and as an end book because it holds a bookend in place. Wouldn't recommend it as an intro text.

The single most helpful tool I've had for algorithms is Introduction to Algorithms.

It's the single best resource I know of for algorithms. It covers so many topics in good depth and has examples throughout. I still refer to it quite frequently.

Without that book my algorithms analysis class would have been a pain.

I like Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein. It's a bit heavy, but I found it to be a decent reference text.

There are a lot of good books on the subject. I like An Introduction to the Analysis of Algorithms. Also check out the algorithms course on MIT OpenCourseWare (using CLRS as the course text). It's a little bit deep, but having it online allows you to go at your own pace.

A couple of other books that I've started reading recently are Algorithms in a Nutshell and the Algorithm Design Manual. They both take a lighter approach than most algorithms books. Instead of heavy math and formal proofs these books give you realistic problem statements and show you the steps taken to refine an algorithm. They also show you how to estimate and measure the complexity of a solution. I would highly recommend either book.

What is the best place or a link to learn algorithms in C? How do you know when and where to use the implementation of algorithms by just looking into the problems?

Algorithms in C by Sedgewick is a great place to start the investigation. Once you are familiar with what algorithms are available and what the performance characteristics of each are, you'll be able to see where to use each of them.

Algorithms aren't necessarily tied to a specific language, just to clarify, so any algorithms book will work great as long as you can understand the concept being the data structure/algorithm.

That said, this seems like a good choice: Algorithms in C. I have the C++ equivalent on my shelf.

There is also a book that seems language agnostic (correct me if I'm wrong) called Data Structures & Algorithm's, though I hear it's a bit dated, so you'll miss out on more recent structures.

Don't forget the internet has a plethora of information available to you. However, books are usually better for these sorts of things. This is because internet resources tend to focus on one thing at a time. For example, you need to understand what Big-O notation is before you can understand what it means when we say a List has O(1) [constant time] removal.

A book will cover these things in the correct order, but an internet resource will focus on either Big-O notation or data structures, but often won't easily connect the two.

When it comes to using it, you'll mostly make the connection when it comes to what you'll be doing with the data.

For example, you might want a vector (array) if you just need ordered elements, but if you need ordered elements and removal from any place (but can sacrifice random access), then a list would be more appropriate, due to it's constant-time removal.

I read Pointers on C by Kenneth Reek recently. I thought I was pretty well versed in C, but this book gave me a few epiphanies, despite being aimed at beginners. The code examples are things of beauty (but not the fastest code on a x86-like CPU). It provide good implementations of many of the most common algorithms and data-structures that are in use, with excellent explanations about why they are implemented as they are (and sometimes code or suggestions for alternative implementations).

On the same page as your question: patterns for creating reusable code in C (that is what we all want, isn't it?), C Interfaces and Implementations: Techniques for Creating Reusable Software, by David R. Hanson. It has been a few years since I read it, and I don't have a copy to verify what I recall is correct, but if I remember correctly it deals with how to create good C API:s to data structures and algorithms, as well as giving example implementations of some of the most common algorithms.

Of topic: As I have mostly written throw-away programs in C for private use, this one helped me get rid of some bad coding habits as well as being an excellent C reference: C: A reference Manual. Reminds me that I ought to buy that one.

I have learned about data structures a long time ago. I need to refresh my knowledge of basic topics in data structure for a job interview. Can anyone provide me with some resources or links about this topic?

Introduction to algorithms by Corman, Leiserson, Rivest and Stein.

Buy it at Amazon, read it online or borrow it at your local library.

I'd recommend all times classic: "Data Structures and Algorithms" by Aho, Ullman and Hopcroft.

While reading the topic on priority queues in Data Structures And Algorithms I came across the following paragraph...

one possible way to favour short processes yet not lock the long ones is to give process P a priority 100 t(used)-t(init) where t(used) is the time taken by a process and t(init) is the time at which the process initiated , measured from some time 0. Note that 100 is a magic number as it is selected to be somewhat larger than the largest number of processes we expect to be active at once. The reader may observe that if we always pick the process with the smallest priority number and there are not too many short processes in the mix, then in the long run a process that does not finish quickly will receive 1% of the processor's time.

can anyone explain how the process takes up 1%? it would be good if you could derive the result mathematically.

Sure, so initially ever process has a negative priority value. It doesn't matter what it is, only that it is negative and based on current time, however that is represented. For simplicity, let's assume it's just an integer value.

Process A starts with a priority of 0 (assume we are at t=0). It executes for, say 10 time units but isn't finished, hence it needs to be queued to continue processing at some future point. So the priority will be, based on the formula

priority = priority + 100*(endtime - starttime)

priorityA = 0 + 100*(10-0)
priorityA = 1000

Process B's initial priority is initialized at t = 5, so it is -5. It's got the lowest of the two priorities in the queue so it will get some time too. Assume that it also runs for 10 time units. After it is done, B will have a priority of:

priorityB = -5 + 100*(20-10)
priorityB = 995

and so it'll get queued up again. And let's assume it again runs for 10 units. Following it's timeslice it's new priority will be:

priorityB = 995 + 100*(30-20)
priorityB = 1995

So that'll reposition B after A in the priority queue. A then runs but only runs for 5 time units this time. It's new priority is:

priorityA = 1000 + 100*(35-30)
priorityA = 1500

And process A will again be at the top of the queue and get attention. Suppose it runs for 5 time units again, it's new priority is:

priorityA = 1500 + 100(40-35)
priorityA = 2000

which will position it after process B and so B will get some processor time. Let's assume it uses 5 time units this time.

priorityB = 1995 + 100(45-40)
priorityB = 2495

Now it it is A's turn again. See how these two process effectively get 50% of the processor's attention each? If we add a third long-running process ("long running" like both A and B are "long running" in the sense that they didn't run for 10 units and then finish but rather were requeued) to the mix, those processes will each likely get roughly 33% of the processor's time since each time one runs and doesn't finish, it get's its priority adjusted based on how long it spent running. New processes always run first in this scenario because they come in with a negative priority value and they'll actually end up with a higher (lower numeric value) priority for awhile. However, that won't last forever - the new processes will start to get a larger and larger priority value.

But since we've got, based on the assumptions made for the book's example, around 100 processes at any one time awaiting some processing time and also the assumption is that there aren't too many short-running processes, there will generally be about 100 processes vying for processor attention and so each one will get about the same amount of time and soon enough their relative numeric priority values will all cluster around each other (meaning there's not much difference numerically from the one with the highest priority and the one with the lowest priority) and so after the "top" process runs, it'll get enough added to its priority to push it to the bottom (or near the bottom). Rinse and repeat and you essentially get a round-robin thing going, assuming they all use about the same amount of time each go around. There's about 100 at any one time so, again assuming few short-running processes exist, each one gets 1/100 of the processor's attention.

Are there books you know that covers data structure? how data should be properly structured (eg, in array or database).


data structure is for how you want to manage your data in tree,stack,queue or etc. database is a software that keep 'related' data that can be any structure:table form, Object Oriented form of any form. here related has main meaning in database. database is for keeping any data that data that are related to each other data

data structure books are: Data structure and algorithm or

I'm planning to take DataStructures course this semester. I had gone through the internet for material related to Data Strucutures. I would like to have an idea on the concepts of data structures before my semester begins. Can anyone suggest me any material or book ?

There are several books which will offer you a great deal of learning . To Name a few,

1.Data structures and algorithms made easy - A very good read from an Indian author

2.Data structures and algorthims by Alfred Aho and Ulman

3.Fundamental of Data structures by Horowitz and Sahni

to name a few on learning elementary data structures. Apart from that there are other books which are very good for algorithms. But these books should be enough to give you an introduction.

I wanted to study algorithms and data structures in detail (in the quest of becoming a better programmer :P ), i have been programming for 2-3 years (c++, java, python)

searching on google confused me between two type of books/web-resources

should i go for the books that are language specific like

or a generic book like

these are just examples, the main question is what type of resource should i choose, something language specific or generic? would there be a difference in anyway?

also, suggest a good web-resource/book (free is better) where i can accumulate good amount of knowledge in detail regarding algorithms and data structures. math is no issue

Note: i know there are many similar questions but my doubt is, would your study of algorithms and data structures depend on what programming language you use?

thanks, Shobhit,

There are pluses and minuses to learning about algorithms and data structures in a language-specific way. On the plus side, the examples, exercises, and explanations can be very concrete. Provided you have access to an execution environment for that language, you can experiment on your own with the concepts as you are learning them. This is very powerful.

On the minus side, it is harder to distinguish between the core concepts (e.g., nodes and links in a tree) and the language-specific methods for implementing them (e.g., structs and pointers in C). Other languages may be so different (e.g., Prolog), that the core concepts may be totally unrecognizable if you haven't learned how to separate them from the language-specific aspects of what you have learned. There's also the problem that there are usually lots of language-specific stuff that are entirely a distraction from the core concepts. (malloc/free in C; constructors and destructors in C++, etc., -- unless you're studying memory management algorithms.)

One way to have the benefits of a language-specific presentation and also address its shortcomings is to study the same material presented for two radically different languages. The entire family of Algol-like languages (C, C++, Pascal, Algol, Java, C#, etc.) are basically equivalent for this purpose. I mentioned Prolog before; you might also consider Lisp. Another interesting approach might be a 4GL language like SQL. If you can implement a good balanced tree in a C program and also in an SQL schema and set of queries, then you can be confident that you have mastered the underlying concepts involved in balanced trees.

You should get CLRS to start with. Language agnostic. In terms of algorithms, language doesnt make much difference. It s the theory/concept and how the algorithm works. that s what you need to learn.

Once you learn the concept, you can write the algorithm in any language of your choice.

This is due to the Complexity. Languages doesnt make a difference in terms of algorithms and complexity but the actual algorithm.

In respond to @birryree, Skienna's book, if you ask me is great to prepare for exams. I felt like there were just lot of puzzles. There is also Kleinberg, and computer algorithms. CLRS is the bible.

Most Algorithms and Datastructures book are very language-independent. Some use pseudocode, some don't even have that much code and one of my favourite books used Pascal, of all things.