Programming Pearls

Jon Louis Bentley

A guide to practical programming techniques and design principles, with information on such topics as testing, debugging and timing, set representations, and string problems.


If you could go back in time and tell yourself to read a specific book at the beginning of your career as a developer, which book would it be?

I expect this list to be varied and to cover a wide range of things.


Applying UML and Patterns by Craig Larman.

The title of the book is slightly misleading; it does deal with UML and patterns, but it covers so much more. The subtitle of the book tells you a bit more: An Introduction to Object-Oriented Analysis and Design and Iterative Development.

Masters of Doom. As far as motivation and love for your profession go, it won't get any better than what's described in this book. A truly inspiring story!

Beginning C# 3.0: An Introduction to Object Oriented Programming

This is the book for those who want to understand the whys and hows of OOP using C# 3.0. You don't want to miss it.

Mastery: The Keys to Success and Long-Term Fulfillment, by George Leonard

It's about what mindsets are required to reach mastery in any skill, and why. It's just awesome, and an easy read too.

Pro Spring is a superb introduction to the world of Inversion of Control and Dependency Injection. If you're not aware of these practices and their implications, the balance of topics and technical detail in Pro Spring is excellent. It builds a great case and, from that, a solid personal foundation.

Another book I'd suggest would be Robert Martin's Agile Software Development (ASD). Code smells, agile techniques, test driven dev, principles ... a well-written balance of many different programming facets.

More traditional classics would include the infamous GoF Design Patterns, Bertrand Meyer's Object-Oriented Software Construction, Booch's Object-Oriented Analysis and Design, Scott Meyers' "Effective C++" series, and a lesser-known book I enjoyed by Gunderloy, Coder to Developer.

And while books are nice ... don't forget radio!

... let me add one more thing. If you haven't already discovered Safari, take a look. It is more addictive than Stack Overflow :-) I've found that with my Google-style habits, I need the more expensive subscription so I can look at any book at any time, but I'd recommend the trial to anyone even remotely interested.

(ah yes, a little obj-C today, cocoa tomorrow, patterns? soa? what was that example in that cookbook? What did Steve say in the second edition? Should I buy this book? ... a subscription like this is great if you'd like some continuity and context to what you're googling ...)

Database System Concepts is one of the best books you can read on understanding good database design principles.

Algorithms in C++ was invaluable to me in learning Big O notation and the ins and outs of the various sort algorithms. This was published before Sedgewick decided he could make more money by dividing it into 5 different books.

C++ FAQs is an amazing book that really shows you what you should and shouldn't be doing in C++. The backward compatibility of C++ leaves a lot of landmines about and this book helps one carefully avoid them while at the same time being a good introduction into OO design and intent.

Here are two I haven't seen mentioned:
I wish I had read "Ruminations on C++" by Koenig and Moo much sooner. That was the book that made OO concepts really click for me.
And I recommend Michael Abrash's "Zen of Code Optimization" for anyone else planning on starting a programming career in the mid 90s.

Perfect Software: And Other Illusions about Testing

Perfect Software: And Other Illusions about Testing by Gerald M. Weinberg

ISBN-10: 0932633692

ISBN-13: 978-0932633699

Rapid Development by McConnell

The most influential programming book for me was Enough Rope to Shoot Yourself in the Foot by Allen Holub.

Oh well, how long ago that was.

I have a few good books that strongly influenced me that I've not seen on this list so far:

The Psychology of Everyday Things by Donald Norman. The general principles of design for other people. This may seem mostly relevant to UI, but if you think about it, it applies almost anywhere there is an interface that someone besides the original developer has to work with; e.g., an API, where you want to design the interface so that other developers form the correct mental model and get appropriate feedback from the API itself.

The Art of Software Testing by Glen Myers. A good, general introduction to testing software; good for programmers to read to help them think like a tester, i.e., think of what may go wrong and prepare for it.

By the way, I realize the question was the "Single Most Influential Book" but the discussion seems to have changed to listing good books for developers to read so I hope I can be forgiven for listing two good books rather than just one.

C++ How to Program is good for beginners. It's an excellent, thorough book, weighing in at 1500 pages.

Effective C++ and More Effective C++ by Scott Meyers.

Inside the C++ object model by Stanley Lippman

I bought this when I was a complete newbie, and it took me from only knowing that Java existed to being a reliable team member in a short time.

Not a programming book, but still a very important book every programmer should read:

Orbiting the Giant Hairball by Gordon MacKenzie

The Pragmatic programmer was pretty good. However one that really made an impact when I was starting out was :

Windows 95 System Programming Secrets

I know - it sounds and looks a bit cheesy on the outside and has probably dated a bit - but this was an awesome explanation of the internals of Win95 based on the author's (Matt Pietrek) investigations using his own tools - the code for which came with the book. Bear in mind this was before the whole open source thing and Microsoft was still pretty cagey about releasing documentation of internals - let alone source. There was some quote in there like "If you are working through some problem and hit some sticking point, then you need to stop and really look deeply into that piece and really understand how it works". I've found this to be pretty good advice - particularly these days when you often have the source for a library and can go take a look. It's also inspired me to enjoy diving into the internals of how systems work, something that has proven invaluable over the course of my career.

Oh, and I'd also throw in Essential .NET - a great internals explanation of .NET from Don Box.

I recently read Dreaming in Code and found it to be an interesting read. Perhaps more so since the day I started reading it Chandler 1.0 was released. Reading about the growing pains and mistakes of a project team of talented people trying to "change the world" gives you a lot to learn from. Also Scott brings up a lot of programmer lore and wisdom in between that's just an entertaining read.

Beautiful Code had one or two things that made me think differently, particularly the chapter on top down operator precedence.

K&R

@Juan: I know Juan, I know - but there are some things that can only be learned by actually getting down to the task at hand. Speaking in abstract ideals all day simply makes you into an academic. It's in the application of the abstract that we truly grok the reason for their existence. :P

@Keith: Great mention of "The Inmates are Running the Asylum" by Alan Cooper - an eye opener for certain, any developer that has worked with me since I read that book has heard me mention the ideas it espouses. +1

I found the The Algorithm Design Manual to be a very beneficial read. I also highly recommend Programming Pearls.

This one isn't really a book for the beginning programmer, but if you're looking for SOA design books, then SOA in Practice: The Art of Distributed System Design is for you.

For me it was Design Patterns Explained. It provided an "Oh, that's how it works" moment for me in regards to design patterns and has been very useful when teaching design patterns to others.

Code Craft by Pete Goodliffe is a good read!

The first book that made a real impact on me was Mastering Turbo Assembler by Tom Swan.

Other books that have had an impact was Just For Fun by Linus Torvalds and David Diamond and of course The Pragmatic Programmer by Andrew Hunt and David Thomas.

In addition to other people's suggestions, I'd recommend either acquiring a copy of SICP, or reading it online. It's one of the few books that I've read that I feel greatly increased my skill in designing software, particularly in creating good abstraction layers.

A book that is not directly related to programming, but is also a good read for programmers (IMO) is Concrete Mathematics. Most, if not all of the topics in it are useful for programmers to know about, and it does a better job of explaining things than any other math book I've read to date.

For me, "Memory as a Programming Concept in C and C++" really opened my eyes to how memory management really works. If you're a C or C++ developer, I consider it a must-read. You will definitely learn something or remember things you might have forgotten along the way.

http://www.amazon.com/Memory-Programming-Concept-C/dp/0521520436

Agile Software Development with Scrum by Ken Schwaber and Mike Beedle.

I used this book as the starting point to understanding Agile development.

Systemantics: How Systems Work and Especially How They Fail. Get it used cheap. But you might not get the humor until you've worked on a few failed projects.

The beauty of the book is the copyright year.

Probably the most profound takeaway "law" presented in the book:

The Fundamental Failure-Mode Theorem (F.F.T.): Complex systems usually operate in failure mode.

The idea being that there are failing parts in any given piece of software that are masked by failures or validations in other parts. For a real-world example, see the Therac-25 radiation machine, whose software flaws were masked by hardware failsafes. When the hardware failsafes were removed, the software race condition that had gone undetected all those years resulted in the machine killing 3 people.
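The masking effect is easy to reproduce in miniature. Below is a hypothetical Python sketch (all names invented for illustration, not taken from Therac-25): a buggy dose calculation whose flaw is hidden by an independent safety clamp, so the system "works" right up until the failsafe is removed.

```python
def compute_dose(requested, calibration):
    # Buggy "software" component: nothing validates the calibration
    # factor, so a tiny value blows the dose up enormously.
    return requested / calibration

def hardware_failsafe(dose, maximum=200.0):
    # Independent "hardware" interlock that silently masks the bug.
    return min(dose, maximum)

def deliver(requested, calibration, failsafe_installed=True):
    dose = compute_dose(requested, calibration)
    return hardware_failsafe(dose) if failsafe_installed else dose
```

With the interlock, a call like `deliver(100, 0.125)` is clamped to 200.0; remove the interlock and the same call returns 800.0. The flaw was there all along: the system was operating in failure mode.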

It seems most people have already touched on some very good books. One which really helped me out was Effective C#: 50 Ways to Improve Your C#. I'd be remiss if I didn't mention The Tao of Pooh. Philosophy books can be good for the soul, and the code.

Discrete Mathematics For Computer Scientists

Discrete Mathematics For Computer Scientists by J.K. Truss.

While this doesn't teach you programming, it teaches you the fundamental mathematics that every programmer should know. You may remember this stuff from university, but really, doing predicate logic will improve your programming skills, and you need to learn set theory if you want to program using collections.

There really is a lot of interesting information in here that can get you thinking about problems in different ways. It's handy to have, just to pick up once in a while to learn something new.
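To make the collections point concrete, here is a small Python sketch (the names are invented for illustration) of how set theory and predicate logic show up directly in everyday code:

```python
# Set operations map one-to-one onto the theory:
backend_devs = {"alice", "bob", "carol"}
frontend_devs = {"carol", "dave"}

full_stack = backend_devs & frontend_devs    # intersection
everyone = backend_devs | frontend_devs      # union
backend_only = backend_devs - frontend_devs  # set difference

# Predicate logic appears as quantifiers over collections:
all_short = all(len(name) <= 5 for name in everyone)  # universal: for all x, P(x)
has_dave = any(name == "dave" for name in everyone)   # existential: there exists x, P(x)
```

Here `full_stack` comes out as `{"carol"}` and `backend_only` as `{"alice", "bob"}`; once you see comprehensions and `all`/`any` as quantifiers, a lot of collection code reads like the math.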

I saw a review of Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools on a blog talking also about XI-Factory. I read it, and I must say this book is a must-read. Although not specifically targeted at programmers, it explains very clearly what is happening in the programming world right now with Model-Driven Architecture and so on.

Solid Code: Optimizing the Software Development Life Cycle

Although the book is only 300 pages and favors Microsoft technologies, it still offers some good language-agnostic tidbits.

Managing Gigabytes is an instant classic for thinking about the heavy lifting of information.

My vote is "How to Think Like a Computer Scientist: Learning With Python" It's available both as a book and as a free e-book.

It really helped me to understand the basics of not just Python but programming in general. Although it uses Python to demonstrate concepts, they apply to most, if not all, programming languages. Also: IT'S FREE!

Object-Oriented Programming in Turbo C++. Not super popular, but it was the one that got me started, and was the first book that really helped me grok what an object was. Read this one waaaay back in high school. It sort of brings a tear to my eye...

My high school math teacher lent me a copy of Are Your Lights On?: How to Figure Out What the Problem Really Is, which I have re-read many times. It has been invaluable, as a developer, and in life generally.

I'm reading Agile Software Development, Principles, Patterns and Practices now. For those interested in XP and Object-Oriented Design, this is a classic read.

Kernighan & Plauger's Elements of Programming Style. It illustrates the difference between gimmicky-clever and elegant-clever.

To get advanced in Prolog, I like these two books:

The Art of Prolog

The Craft of Prolog

They really open the mind for logic programming and recursion schemes.

Here's an excellent book that is not as widely applauded, but is full of deep insight: Agile Software Development: The Cooperative Game, by Alistair Cockburn.

What's so special about it? Well, clearly everyone has heard the term "Agile", and it seems most are believers these days. Whether you believe or not, though, there are some deep principles behind why the Agile movement exists. This book uncovers and articulates these principles in a precise, scientific way. Some of the principles are (btw, these are my words, not Alistair's):

  1. The hardest thing about team software development is getting everyone's brains to have the same understanding. We are building huge, elaborate, complex systems which are invisible in the tangible world. The better you are at getting more peoples' brains to share deeper understanding, the more effective your team will be at software development. This is the underlying reason that pair programming makes sense. Most people dismiss it (and I did too initially), but with this principle in mind I highly recommend that you give it another shot. You wind up with TWO people who deeply understand the subsystem you just built ... there aren't many other ways to get such a deep information transfer so quickly. It is like a Vulcan mind meld.
  2. You don't always need words to communicate deep understanding quickly. And a corollary: too many words, and you exceed the listener/reader's capacity, meaning the understanding transfer you're attempting does not happen. Consider that children learn how to speak language by being "immersed" and "absorbing". Not just language either ... he gives the example of some kids playing with trains on the floor. Along comes another kid who has never even SEEN a train before ... but by watching the other kids, he picks up the gist of the game and plays right along. This happens all the time between humans. This along with the corollary about too many words helps you see how misguided it was in the old "waterfall" days to try to write 700 page detailed requirements specifications.

There is so much more in there too. I'll shut up now, but I HIGHLY recommend this book!

The Back of the Napkin, by Dan Roam.

A great book about visual thinking techniques. There is also an expanded edition now. I can't speak to that version, as I do not own it yet.

Agile Software Development by Alistair Cockburn

Do users ever touch your code? If you're not doing solely back-end work, I recommend About Face: The Essentials of User Interface Design — now in its third edition (linked). I used to think my users were stupid because they didn't "get" my interfaces. I was, of course, wrong. About Face turned me around.

"Writing Solid Code: Microsoft's Techniques for Developing Bug-Free C Programs (Microsoft Programming Series)" by Steve Maguire.

Interesting what a large proportion the books mentioned here are C/C++ books.

While not strictly a software development book, I would highly recommend that Don't Make Me Think! be considered in this list.

As so many people have listed Head First Design Patterns, which I agree is a very good book, I would like to see whether as many people are aware of a title called Design Patterns Explained: A New Perspective on Object-Oriented Design.

This title deals with design patterns excellently. The first half of the book is very accessible, and the remaining chapters require only a firm grasp of the content already covered. The reason I feel the second half of the book is less accessible is that it covers patterns that I, as a young developer admittedly lacking in experience, have not used much.

This title also introduces the concept behind design patterns, covering everything from Christopher Alexander's initial work in architecture to the GoF first documenting patterns in Smalltalk.

I think that anyone who enjoyed Head First Design Patterns but still finds the GoF very dry, should look into Design Patterns Explained as a much more readable (although not quite as comprehensive) alternative.

Even though I've never programmed a game, this book helped me understand a lot of things in a fun way.

How influential a book is often depends on the reader and where they were in their career when they read the book. I have to give a shout-out to Head First Design Patterns. Great book and the very creative way it's written should be used as an example for other tech book writers. I.e. it's written in order to facilitate learning and internalizing the concepts.

97 Things Every Programmer Should Know

This book pools together the collective experiences of some of the world's best programmers. It is a must read.

Extreme Programming Explained: Embrace Change by Kent Beck. While I don't advocate a hardcore XP-or-the-highway take on software development, I wish I had been introduced to the principles in this book much earlier in my career. Unit testing, refactoring, simplicity, continuous integration, cost/time/quality/scope - these changed the way I looked at development. Before Agile, it was all about the debugger and fear of change requests. After Agile, those demons did not loom as large.

One of my personal favorites is Hacker's Delight, because it was as much fun to read as it was educational.

I hope the second edition will be released soon!

You.Next(): Move Your Software Development Career to the Leadership Track by Michael C. Finley and Honza Fedák.

I've been around a while, so most books that I have found influential don't necessarily apply today. I do believe it is universally important to understand the platform that you are developing for (both hardware and OS). I also think it's important to learn from other people's mistakes. So two books I would recommend are:

Computing Calamities and In Search of Stupidity: Over Twenty Years of High Tech Marketing Disasters

Working Effectively with Legacy Code is a really amazing book that goes into great detail about how to properly unit test your code and what the true benefit of it is. It really opened my eyes.

Are there any good books for a relatively new but not totally new *nix user to get a bit more in-depth knowledge (so no "Linux for Dummies")? For the most part, I'm not looking for something to read through from start to finish. Rather, I'd like something that I can pick up and read in chunks when I need to know how to do something or whenever I have one of those "how do I do that again?" moments. Some areas that I'd like to see are:

  • command line administration
  • bash scripting
  • programming (although I'd like something that isn't just relevant for C programmers)

I'd like this to be as platform-independent as possible (meaning it has info that's relevant for any linux distro as well as BSD, Solaris, OS X, etc), but the unix systems that I use the most are OS X and Debian/Ubuntu. So if I would benefit the most from having a more platform-dependent book, those are the platforms to target.

If I can get all this in one book, great, but I'd rather have a bit more in-depth material than coverage of everything. So if there are any books that cover just one of these areas, post it. Hell, post it even if it's not relevant to any of those areas and you think it's something that a person in my position should know about.

I've wiki'd this post - could those with sufficient rep add in items to it.

System administration, general usage books

Programming:

Specific tools (e.g. Sendmail)

Various of the books from O'Reilly and other publishers cover specific topics. Some of the key ones are:

Some of these books have been in print for quite a while and are still relevant. Consequently they are also often available secondhand at much less than list price. Amazon marketplace is a good place to look for such items. It's quite a good way to do a shotgun approach to topics like this for not much money.

As an example, in New Zealand technical books are usuriously expensive due to a weak kiwi peso (as the $NZ is affectionately known in expat circles) and a tortuously long supply chain. You could spend 20% of a week's after-tax pay for a starting graduate on a single book. When I was living there just out of university, I used this type of market a lot, often buying books for 1/4 of their list price - including the cost of shipping to New Zealand. If you're not living in a location with tier-1 incomes, I recommend this.

E-Books and on-line resources (thanks to israkir for reminding me):

  • The Linux Documentation project (www.tldp.org), has many specific topic guides known as HowTos that also often concern third party OSS tools and will be relevant to other Unix variants. It also has a series of FAQ's and guides.

  • Unix Guru's Universe is a collection of unix resources with a somewhat more old-school flavour.

  • Google. There are many, many unix and linux resources on the web. Search strings like unix commands or learn unix will turn up any amount of online resources.

  • Safari. This is a subscription service, but you can search the texts of quite a large number of books. I can recommend this as I've used it. They also do site licences for corporate customers.

Some of the philosophy of Unix:

I recommend the Armadillo book from O'Reilly for command line administration and shell scripting.

Jason,

Unix Programming Environment by Kernighan and Pike will give you solid foundations on all things Unix and should cover most of your questions regarding shell command line scripting etc.

The Armadillo book by O'Reilly will add the administration angle. It has served me well!

Good luck!

The aforementioned Unix Power Tools is a must. Other classics are sed & awk and Mastering Regular Expressions. I also like some books from the O'Reilly "Cookbook" series:

Whenever people ask about the halting problem as it pertains to programming, people respond with "If you just add one loop, you've got the halting problem and therefore you can't automate the task."

Makes sense. If your program has an infinite loop, then when your program is running, you have no way of knowing whether the program is still crunching input, or if it is just looping infinitely.

But some of this seems counterintuitive. What if I were writing a halting-problem solver, which takes source code as its input?

rascher@localhost$ ./haltingSolver source.c

If my code (source.c) looks like this:

for (;;) {  /* infinite loop */  }

It seems like it'd be pretty easy for my program to see this. "Look at the loop, and look at the condition. If the condition is based only on literals and no variables, then you always know the outcome of the loop. If there are variables (e.g. while (x < 10)), see if those variables are ever modified. If not, then you always know the outcome of the loop."
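The easy cases described here really can be automated. The sketch below (a rough illustration using Python's ast module on Python source rather than C) flags the two trivially decidable patterns: a constant-true condition, and a condition whose variables are never written inside the loop body. It says nothing about the general case, which is the whole point of the halting problem.

```python
import ast

def trivially_infinite(source):
    """Flag the two easy cases: `while <truthy literal>:` and a `while`
    whose condition variables are never assigned in the loop body (so,
    once entered, the loop can never exit). False means "don't know"."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.While):
            continue
        # Case 1: condition is a literal, e.g. `while True:` or `while 1:`
        if isinstance(node.test, ast.Constant):
            if node.test.value:
                return True
            continue
        # Case 2: condition reads variables that the body never writes
        read = {n.id for n in ast.walk(node.test) if isinstance(n, ast.Name)}
        written = {n.id for stmt in node.body for n in ast.walk(stmt)
                   if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
        if read and not (read & written):
            return True
    return False
```

So `trivially_infinite("x = 0\nwhile x < 10:\n    pass")` is True, while the version that increments x falls through to "don't know": exactly the gap the question is probing.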

Granted, these checks would not be trivial (pointer arithmetic, etc.), but it does not seem impossible. E.g.:

int x = 0;
while (x < 10) {}

could be detected. along with - albeit not trivially:

int x = 0;
while (x < 10)
{
   x++;
   if (x == 10)
   {
      x = 0;
   }
}
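Even that trickier reset loop can be decided mechanically, provided the entire program state is finite and observable: simulate it and watch for a repeated state, since a repeated state in a deterministic program means it will cycle forever. A rough Python sketch (helper names are invented for illustration):

```python
def halts(initial_state, step, done):
    """Decide halting for a deterministic loop whose entire state is
    captured in `state`: run it, and if any state repeats, it loops forever."""
    seen = set()
    state = initial_state
    while not done(state):
        if state in seen:
            return False  # revisited a state: guaranteed infinite loop
        seen.add(state)
        state = step(state)
    return True

def reset_body(x):
    # The loop body above: increment, then reset to 0 on reaching 10.
    x += 1
    if x == 10:
        x = 0
    return x
```

`halts(0, reset_body, lambda x: x >= 10)` correctly returns False, while the plain increment version returns True. The catch, and the heart of the halting problem, is that real program state (memory, input, the stack) is unbounded, so "watch for a repeated state" stops being a finite procedure.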

Now what about user input? That is the kicker, that is what makes a program unpredictable.

int x = 0;
while (x < 10) 
{
   scanf("%d", &x); /* ignoring infinite scanf loop oddities */
}

Now my program can say: "If the user enters a 10 or greater, the program will halt. On all other input, it will loop again."

Which means that, even with hundreds of inputs, one ought to be able to list the conditions on which the program will stop. Indeed, when I write a program, I always make sure someone has the ability to terminate it! I am not saying that the resulting list of conditions is trivial to create, but it doesn't seem impossible to me. You could take input from the user, use it to calculate pointer indexes, etc. - but that just adds to the number of conditions needed to ensure the program will terminate; it doesn't make it impossible to enumerate them.

So what exactly is the halting problem? What am I not understanding about the idea that we cannot write a program to detect infinite loops? Or, why are "loops" such an oft-cited example?

UPDATE

So, let me change the question a little bit: what is the halting problem as it applies to computers? And then I will respond to some of the comments:

Many people have said that the program must be able to deal with "any arbitrary input." But in computers, there isn't ever any arbitrary input. If I only input a single byte of data, then I only have 2^8 possible inputs. So, as an example:

int c = getchar();

switch (c) {
   case 'q':
      /* quit the program */
      break;
}

All of a sudden, I have just accounted for all of the possibilities. If c has the bit pattern 0x71, it does one thing. For all other patterns, it does something else. Even a program that accepts arbitrary string input is never really "arbitrary", since resources are finite, which means that while the theory of "arbitrary" applies... it isn't exactly one-to-one with the practice.
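For this particular program the exhaustive case analysis really is feasible. A Python sketch (the C program modeled as a hypothetical function):

```python
def program(c):
    # Models the switch above: byte 0x71 ('q') quits, anything else continues.
    return "quit" if c == ord("q") else "continue"

# A single input byte means exactly 2**8 cases, so classify them all:
outcomes = {c: program(c) for c in range(256)}
quitting = [c for c, result in outcomes.items() if result == "quit"]
```

Here `quitting` comes out as `[0x71]` and every other byte falls into the "continue" bucket. The trouble starts when the input, and the state it feeds, is unbounded: the case analysis never finishes.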

The other example people cited is this:

while (n != 1)
    if ((n & 1) == 1)
        n = 3 * n + 1;
    else
        n /= 2;

If n is a 32-bit integer... then I can visually tell you whether or not this will halt.
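For bounded n the claim is at least checkable by brute force: iterate the 3n+1 rule and confirm each starting value reaches 1. A sketch (checking a small range for speed; the same finite argument extends in principle to all 32-bit values, though "visually" is optimistic):

```python
def collatz_halts(n, max_steps=100_000):
    # Iterate the 3n+1 rule; give up (without concluding divergence)
    # if it runs suspiciously long.
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
        if steps > max_steps:
            return False
    return True

# Every starting value in a modest range reaches 1:
verified = all(collatz_halts(n) for n in range(1, 10_000))
```

This is exactly the finite-state-space argument again: it settles the question for any fixed bound on n, but proves nothing about unbounded integers, where whether this loop always halts is the famous open Collatz conjecture.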

I guess this edit isn't asking anything, but the most convincing example I've seen is this one:

Assume that you have your magical program/method to determine that a program halts.

public bool DeterminesHalt(string filename, string[] args){
    //runs whatever program you tell it to, passing along any args
    //returns true if the program halts, false if it doesn't
}

Now let's say we write a small piece of code such as...

public static void Main(string[] args){
    string filename = Console.ReadLine(); //read in file to run from user
    if(DeterminesHalt(filename, args))
        for(;;);
    else
        return;
}

So for this example, we can write a program that does the exact opposite of what our magical halting method does. If we somehow determine that a given program will halt, we just hop into an infinite loop; otherwise, if we determine that the program is in an infinite loop, we end the program.
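The same contradiction can be written down in a few lines of Python. Given any claimed oracle halts(f) that predicts whether calling f() ever returns, this construction defeats it (the names here are invented for illustration):

```python
def contrary(halts):
    # Build a program that does the opposite of whatever the oracle predicts.
    def troll():
        if halts(troll):
            while True:  # oracle said "halts": loop forever instead
                pass
        # oracle said "loops forever": return immediately instead
    return troll
```

Whatever halts answers about troll, it is wrong. For instance, an oracle that always answers "never halts" is refuted immediately, because troll() then returns at once. Since the construction works for any candidate oracle, no correct one can exist.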

Then again, if you intentionally write a program which contains an infinite loop... "solving the halting problem" is kind of moot, isn't it?

From Programming Pearls, by Jon Bentley

4.6 Problems

5. Prove that this program terminates when its input x is a positive integer.

while x != 1 do
    if even(x)
        x = x/2
    else
        x = 3*x +1

A lot of interesting specific examples/analogies so far. If you want to read deeper into the background, there's a good book on Turing's original paper, The Annotated Turing, by Charles Petzold.

In a related, sideways-sorta, vein, there's a really neat essay up on the web, Who Can Name the Bigger Number? which brushes on Turing machines and Ackermann functions.

I've always been a largely independent learner gleaning what I can from Wikipedia and various books. However, I fear that I may have biased my self-education by inadvertent omission of topics and concepts. My goal is to teach myself the equivalent of an undergraduate degree in Computer Science from a top university (doesn't matter which one).

To that end, I've purchased and started reading a few academic textbooks:

As well as a few textbooks I have left over from classes I've taken at a mediocre-at-best state university:

My questions are:

  • What topics aren't covered by this collection?
  • Are there any books that are more rigorous or thorough (or even easier to read) than a book listed here?
  • Are there any books that are a waste of my time?
  • In what order should I read the books?
  • What does an MIT or Stanford (or UCB or CMU ...) undergrad learn that I might miss?

Software engineering books are welcome, but in the context of academic study only please. I'm aware of Code Complete and the Pragmatic Programmer, but I'm looking for a more theoretical approach. Thanks!

I think you can use most of the other books for reference and just absorb Programming Pearls in its entirety. Doing so would make you better than 90% of the programmers I've ever met.

The "Gang of Four" Design Patterns book. The Design Patterns course I took in college was probably the most beneficial class I've ever taken.

First, I wouldn't worry about it. But if you'd like a book to learn some of the abstract CS ideas, I'd recommend The Turing Omnibus or Theoretical Introduction to Programming.

If I were deciding between hiring two programmers and neither had much experience, but one had a CS degree and the other didn't, I'd hire the one with the CS degree. But when you get to comparing two programmers with a dozen years of experience, the degree hardly matters.

I'm in the same boat: studying computer science in my free time after work. These are some of the books I have on my shelf right now:

  1. Applying UML and patterns - Larman
  2. Introduction to algorithms - Cormen
  3. Discrete mathematics and its applications - Rosen
  4. Software Engineering
  5. Advanced Programming in the UNIX Environment

Will update this list further as soon as I finish them... :-)

File Structures: An object oriented approach with C++

A lot of good info about block devices and file structuring which you won't find in any of the books you listed. It got a few critical reviews on Amazon because people didn't like his code examples, but the point of the book is to teach the concepts, not give cut and paste code examples.

Also make sure to get a book on compilers

Biggest two omissions I see:

For operating systems I prefer the Tanenbaum instead of the Silberschatz but both are good:

And about the order, that would depend on your interests. There aren't many prerequisites, automata for compilers is the most obvious one. First read the automata book and then the dragon one.

I don't know all the books you have, but the ones I know are good enough so that may mean the others are decent as well.

You are missing some logic and discrete math books as well.

And let's not forget some database theory books!

In an attempt to be a better programmer, I am planning to read a lot of books and learn at least one new language (which I think is going to be python) during the 3-month long holiday that I am going to have.

The list of books that I am planning to read --

Ebooks:

The list of things that I want to do --

  • Start using Linux (probably starting with Ubuntu).
  • Learning to use the bunch of tools mentioned here.
  • Setup a blog (hopefully).

I enjoy watching lectures as well, so, along with Introduction to Algorithms I am going to watch a bunch of Stanford Courses.

A little background: I am a 17-year-old guy and I really enjoy programming and every aspect related to it. I have been programming in C and C++ for a while now; my C is stronger than my C++. I am hoping to utilize the time that I have on hand thoroughly. Any changes needed to the plan, or any additions?

EDIT: Did not mention programming projects.

  1. Making a game using Allegro.
  2. Using QT4 to create a GUI-based database for my school.

Do not just passively read all that information; instead, practice after every book chapter or lecture. Practice by writing those algorithms or regexes yourself, checking your previous code in light of what Code Complete has taught you, and so on.

If that doesn't give you enough time to do it all, it doesn't matter; you are better off learning properly than swallowing material as fast as you can. And all of that is probably too much for only three months anyway, so prioritize by your interests. Take one book and one set of lectures and work through them until finished, then pick up the next, never forgetting to put the concepts shown into practice.

Along with the excellent books you and others listed here I recommend The Little Schemer which will give you a perspective of functional programming. Do not lock yourself into imperative languages (C, C++, C#, Pascal, Java,.. ) while you can dive into different paradigms easily. 17 is a great age :-) Enjoy your journey!

These probably won't fit within your three month plan, but should you wish to master C and Unix, long have I been glad that I spent the time to follow The Loginataka. After that, I found Lions' Commentary on UNIX 6th Edition to be deeply enlightening.

I would add the following two books if you haven't read them already:

Programming Pearls by Jon Bentley

Code by Charles Petzold

Even as it stands, you're going to have a very busy, hopefully productive break. Good luck!

Response to question edit: If you're interested in learning about databases, then I recommend Database in Depth by Chris Date. I hope by "create a GUI-based database" you mean implementing a front-end application for an existing database back end. There are plenty of database solutions out there, and it will be well worth it for your future career to learn a few of them.

Programming Collective Intelligence is an awesome way to get your feet wet in machine learning. I am looking for similar books which have small but interesting programming projects. Do you have any recommendations?

Edit: It need not be related to machine learning. It could be any programming project-based books. Thanks.

Edit2: Collective Intelligence in Action is one more book that looks at some interesting CS topics. Do you have any similar recommendations?

Programming Game AI by Example by Mat Buckland has a lot of little cool AI related projects in it.


Programming Game AI by Example provides a comprehensive and practical introduction to the bread and butter AI techniques used by the game development industry, leading the reader through the process of designing, programming, and implementing intelligent agents for action games using the C++ programming language. Techniques covered include state- and goal-based behavior, inter-agent communication, individual and group steering behaviors, team AI, graph theory, search, path planning and optimization, triggers, scripting, scripted finite state machines, perceptual modeling, goal evaluation, goal arbitration, and fuzzy logic.

Might be up your alley. Take a look at the Table of Contents.

I would check out Programming Pearls by Jon Bentley. It has lots of smaller problems to get your programming brain going.

While Project Euler is not a book per se, it does contain a large number of "small but interesting programming projects". It's a great way to expand your math skills as well as try out new languages. (Code Kata seems similar but more CS-oriented; I've not yet dived into it.)

I also have fond memories of Parallel and Distributed Simulation Systems; the book itself may be a bit dry, but it's very example-driven, and applies to everything from cellphone tower switching to airport scheduling to weather simulation to video games. It's uncannily fun to write rollback-capable code, too.

Best of Ruby Quiz is a book of a bunch of small interesting projects such as create a self learning Tic Tac Toe AI. The projects can definitely be done in any language, so don't let the fact that it says Ruby in the title deter you from it.

Making Things Talk: Practical Methods for Connecting Physical Objects by Tom Igoe

It is a great way to expand your programming knowledge in how to communicate with hardware. The book has lots of fun and interesting projects with good code examples. It's a great starting point for future projects that you may want to do on your own.

There are several relevant answers at this SO question:

  • Peter Norvig, Paradigms of AI Programming
  • Mark Jason Dominus, Higher-Order Perl
  • Abelson and diSessa, Turtle Geometry
  • Kernighan and Plauger, Software Tools in Pascal
  • Paul Graham, On Lisp
  • Peter Seibel, Practical Common Lisp

Links and descriptions over there. Also, Etudes for Programmers was the original CS project book, still worth sampling even at over 30 years old. (It isn't listed at the other question because it presents no source code of its own for the projects, except one.)

In an interview, one of my friends was asked to find the subarray of an array with the maximum sum. This is my solution to the problem; how can I make it more optimal? Should I consider doing it in a recursive fashion?

def get_max_sum_subset(x):
        max_subset_sum = 0
        max_subset_i = 0
        max_subset_j = 0
        for i in range(0,len(x)+1):
            for j in range(i+1,len(x)+1):
                current_sum = sum(x[i:j])
                if current_sum > max_subset_sum:
                    max_subset_sum = current_sum
                    max_subset_i = i
                    max_subset_j = j
        return max_subset_sum,max_subset_i,max_subset_j

Your solution is O(n^3): two nested loops, with an O(n) slice sum inside the inner one. The optimal solution is linear. It works by scanning the array from left to right, taking note of the best sum and the current sum:

def get_max_sum_subset(x):
    bestSoFar = 0
    bestNow = 0
    bestStartIndexSoFar = -1
    bestStopIndexSoFar = -1
    bestStartIndexNow = -1
    for i in xrange(len(x)):
        value = bestNow + x[i]
        if value > 0:
            if bestNow == 0:
                bestStartIndexNow = i
            bestNow = value
        else:
            bestNow = 0

        if bestNow > bestSoFar:
            bestSoFar = bestNow
            bestStopIndexSoFar = i
            bestStartIndexSoFar = bestStartIndexNow

    return bestSoFar, bestStartIndexSoFar, bestStopIndexSoFar

This problem was also discussed thoroughly in Programming Pearls: Algorithm Design Techniques (highly recommended). There you can also find a recursive solution, which is not optimal (O(n log n)), but better than O(n^2).

I want to create a set of random numbers without duplicates in Java.

For example I have an array to store 10,000 random integers from 0 to 9999.

Here is what I have so far:

import java.util.Random;
public class Sort{

    public static void main(String[] args){

        int[] nums = new int[10000];

        Random randomGenerator = new Random();

        for (int i = 0; i < nums.length; ++i){
            nums[i] = randomGenerator.nextInt(10000);
        }
    }
}

But the above code creates duplicates. How can I make sure the random numbers do not repeat?

A simple algorithm that gives you random numbers without duplicates can be found in the book Programming Pearls p. 127.

Attention: The resulting array contains the numbers in order! If you want them in random order, you have to shuffle the array, either with Fisher–Yates shuffle or by using a List and call Collections.shuffle().

The benefit of this algorithm is that you do not need to create an array with all the possible numbers, and the runtime is still linear in the size of the range, O(end - start).

public static int[] sampleRandomNumbersWithoutRepetition(int start, int end, int count) {
    Random rng = new Random();

    int[] result = new int[count];
    int cur = 0;
    int remaining = end - start;
    for (int i = start; i < end && count > 0; i++) {
        double probability = rng.nextDouble();
        // keep i with probability (numbers still needed) / (candidates remaining)
        if (probability < ((double) count) / (double) remaining) {
            count--;
            result[cur++] = i;
        }
        remaining--;
    }
    return result;
}

I recently posted this question about codes for a gift-card-like voucher that users can redeem online. I wanted to find the best tradeoff between large keyspace, low guessability, and human readability. Now that I'm into implementation I realize I've got another problem altogether, more of an algorithmic challenge.

Let's assume I adopt some code format - say 10 characters from A to Z for simplicity, and I start generating vouchers. What is the correct algorithm to do this?!

My first approach is to number all possible codes from 0 to 26^10 - 1 (about 1.4 x 10^14), then start generating random numbers in that range. This obviously has a big problem though: I have to check my random number against all previously generated voucher codes, and if it collides with an existing one I'll have to discard it and try another. As the system accumulates more data it will slow down. At the extreme, when there is only one code left, it will be nearly impossible for the system to guess it correctly.

I could pre-generate all codes and shuffle them, then consume them in order. But this means I have to store many codes, and in fact my real keyspace is bigger than the one I described, so we're talking about a very large amount of data. So that's not desirable either.

So this leaves me with using the codes sequentially. I do not want guessable voucher codes though. The user who buys voucher "AAAAAAAAAY" should not have a good chance of getting another valid code if they type in "AAAAAAAAAZ".

I can shuffle my alphabet and my positions so that instead of

'ABCDEFGHIJKLMNOPQRSTUVWXYZ' i use

'LYFZTGKBNDRAPWEOXQHVJSUMIC'

and so that instead of positions

9 8 7 6 5 4 3 2 1 0 the positions are

1 8 0 7 5 4 3 9 2 6

Using this logic, given the code

LNWHDTECMA

the next code would be

LNEHDTECMA

This is definitely way less guessable. But they're still only one character off from each other, and given just two of these vouchers you would know which position is incrementing, and you would have a 90% chance of getting the next code in 24 guesses or less.
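As a quick sketch of that scheme (my own illustration in Python; the helper name and the exact digit-to-position mapping are assumptions, not from the question):

```python
# Hypothetical sketch of the scrambled-alphabet + shuffled-positions scheme.
SHUFFLED = "LYFZTGKBNDRAPWEOXQHVJSUMIC"    # stands in for ABC...Z
POSITIONS = [1, 8, 0, 7, 5, 4, 3, 9, 2, 6]  # output slot for each digit

def encode(n, length=10):
    """Map a sequential integer n to a 10-character voucher code."""
    digits = []
    for _ in range(length):                  # base-26 digits, least significant first
        digits.append(n % 26)
        n //= 26
    digits.reverse()                         # most significant first
    chars = [SHUFFLED[d] for d in digits]    # substitute through the shuffled alphabet
    out = [""] * length
    for src, dst in enumerate(POSITIONS):    # scatter characters to shuffled slots
        out[dst] = chars[src]
    return "".join(out)
```

Note that consecutive values of n still produce codes differing in exactly one character, which is precisely the weakness described above.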

My "escape hatch" is to ditch all this and go with GUIDs. They have more characters than I wanted my users to have to type in, and contain similar characters like I/1 and O/0, but they magically make all of the above headaches go away. Still, I'm having fun thinking about this, maybe you are too. I'd love to hear some alternate suggestions. What have you got?

Thanks!

Programming Pearls has several examples of algorithms to generate sets of random numbers, you should read it if you're interested in this kind of problem.

The book shows that if you generate m random numbers with value less than n, the simple approach of generating numbers and throwing out duplicates will, in expectation, generate no more than 2m random numbers when m < n / 2. Here it is, in C++:

void gensets(int m, int n)
{
    set<int> S;
    set<int>::iterator i;
    while (S.size() < m) {
        int t = bigrand() % n;  // bigrand(): the book's helper for random ints beyond RAND_MAX
        S.insert(t);
    }
    for (i = S.begin(); i != S.end(); ++i)
        cout << *i << "\n";
}

Obviously, if you're worried about people guessing values, you will want m to be much less than n / 2.

There's even a set-based algorithm, due to Floyd, that generates m random numbers less than n with each value equally likely, no duplicates, and a guarantee not to generate more than m random numbers:

void genfloyd(int m, int n)
{
    set<int> S;
    set<int>::iterator i;
    for (int j = n-m; j < n; j++) {
        int t = bigrand() % (j+1);
        if (S.find(t) == S.end())
            S.insert(t); // t not in S
        else
            S.insert(j); // t in S
    }
    for (i = S.begin(); i != S.end(); ++i)
        cout << *i << "\n";
}

The order of the numbers isn't random, though, so this is probably not a good choice for you.
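If the order matters, one fix (a sketch of my own in Python, not from the book) is to run the same Floyd-style sampling and then shuffle the result:

```python
import random

def sample_floyd(m, n, rng=random):
    """Return m distinct integers drawn uniformly from [0, n), in random order."""
    seen = set()
    result = []
    for j in range(n - m, n):
        t = rng.randrange(j + 1)         # uniform in [0, j]
        pick = j if t in seen else t     # on collision, take j itself (always unseen)
        seen.add(pick)
        result.append(pick)
    rng.shuffle(result)                  # remove the biased insertion order
    return result
```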

It's an interview question:

There are 1 billion cell-phone numbers, each with 11 digits, stored in random order in a file; for example 12345678910, where the first digit has to be 1. Go through these numbers to see whether any duplicate exists: if a duplicate is found, return True, otherwise return False. Only 10 MB of memory is allowed.

Here is my solution:

Hash all these numbers into 1000 files using hash(num) % 1000; then any duplicates will fall into the same file.

After the hashing, I get 1000 small files, each of which contains at most 1 million numbers, right? I'm not sure about this; I simply did 1 billion / 1000 = 1 million.

Then for each file, build a hash table to store each number and a flag representing its occurrence.

I guess it will take 5 B to represent each number: 4 B for the lower 8 digits and 1 B for the upper 3 digits. Actually 1 bit would suffice for the flag, because I just need to find out whether a duplicate exists, not how many times it occurs. But how can I attach a 1-bit flag to each number? I'm stumped, so I chose bool as the flag, taking 1 B. So finally, each number in the hash table takes 5 B (number) + 1 B (flag) + 4 B (next pointer) = 10 B, and each file takes 10 MB for its hash table.

That's my stupid solution, Please give me a better one.

Thanks.

FOLLOW UP:

If there are no duplicates in these 1 billion phone numbers, given one phone number, how to find out the given one is or is not in these 1 billion numbers? Use as few memory as possible.

I came up with 2 solutions,

  1. The phone number can be represented using 5 B as I said above. Scan through the file, reading one number at a time, and XOR the given number with each one read from the file; if the result is 0, the given number is in the file. This takes O(n) time, right?

  2. Partition these numbers into 2 smaller files according to the leading bit: numbers with a leading 1-bit go to one file, those with a leading 0-bit to another, counting along the way how many numbers land in each file. If the given number falls into the 1-bit file and that file's count is not full, partition the 1-bit file again on the next bit and check the given number recursively; if the 1-bit file is full, the given number has to be in it. This takes O(log n) time, right?

After the hashing, I got 1000 small files, each of which contains 1 million numbers at most, right

Not true; in the extreme case it's possible that one file contains all the numbers.

Create the files based on the first or last x digits of the numbers (ignore the starting 1). When creating those files you can actually chop those digits because they are equal within a file. This is a lot better than hashing because although all the numbers can still end up in one file, now the range of those numbers is limited, so you can fit it into 10MB.

Each number can be represented by a single bit, because the only information you need is whether the number occurred previously. You don't have to store the actual numbers: the address of the bit is the number. In 10 MB you can store 80M bits, so you would need 1G/80M = 12.5 files; but remember, those leading digits must differ within a file, so actually you will need 100 files (x = 2).

Finally, you don't have to create those files at all; you can instead scan the whole input file multiple times. In that case you can even keep several bitmaps in memory at once, as one doesn't occupy the whole 10 MB.
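A toy sketch of that multi-pass bitmap idea in Python (my own illustration; with real data the input would be streamed from the file, and the pass count chosen so each bitmap fits in 10 MB):

```python
def has_duplicate(numbers, lo, hi, passes):
    """Scan `numbers` once per pass; pass p only tracks values in its own
    slice of [lo, hi), so only (hi - lo) / passes bits live in memory at once."""
    span = (hi - lo + passes - 1) // passes
    for p in range(passes):
        base = lo + p * span
        bits = bytearray(span // 8 + 1)      # one bit per value in this slice
        for n in numbers:
            if base <= n < base + span:
                idx = n - base
                mask = 1 << (idx & 7)
                if bits[idx >> 3] & mask:    # bit already set: duplicate found
                    return True
                bits[idx >> 3] |= mask
    return False
```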

I strongly suggest reading this book; it starts with an almost identical example: http://www.amazon.co.uk/Programming-Pearls-ACM-Press-Bentley/dp/0201657880

Recently, I tried to solve the Max Double Slice Sum problem on Codility, which is a variant of the max slice problem. My approach was to look for a slice that has maximum value when its minimum element is taken out: I implemented max slice, but removed the minimum number from the current slice.

My score was 61 of 100, as it failed some of the tests, mainly those on arrays containing both negative and positive numbers.

Could you help me to figure out why the code failed or if there is a better solution for the problem?

The problem is as follows:

A non-empty zero-indexed array A consisting of N integers is given.
A triplet (X, Y, Z), such that 0 ≤ X < Y < Z < N, is called a double slice.
The sum of double slice (X, Y, Z) is the total of A[X + 1] + A[X + 2] + ... + A[Y − 1]+ A[Y + 1] + A[Y + 2] + ... + A[Z − 1].
For example, array A such that:
A[0] = 3
A[1] = 2
A[2] = 6
A[3] = -1
A[4] = 4
A[5] = 5
A[6] = -1
A[7] = 2
contains the following example double slices:
 double slice (0, 3, 6), sum is 2 + 6 + 4 + 5 = 17,
 double slice (0, 3, 7), sum is 2 + 6 + 4 + 5 − 1 = 16,
 double slice (3, 4, 5), sum is 0.
The goal is to find the maximal sum of any double slice.
Write a function:
class Solution { public int solution(int[] A); }
that, given a non-empty zero-indexed array A consisting of N integers, returns the maximal sum of any double slice.
For example, given:
 A[0] = 3
 A[1] = 2
 A[2] = 6
 A[3] = -1
 A[4] = 4
 A[5] = 5
 A[6] = -1
 A[7] = 2
the function should return 17, because no double slice of array A has a sum of greater than 17.
Assume that:
 N is an integer within the range [3..100,000];
 each element of array A is an integer within the range [−10,000..10,000].
Complexity:
 expected worst-case time complexity is O(N);
 expected worst-case space complexity is O(N), beyond input storage (not counting the    storage required for input arguments).
Elements of input arrays can be modified.
Copyright 2009–2013 by Codility Limited. All Rights Reserved. Unauthorized copying, publication or disclosure prohibited.

And my code is as follows:

public class Solution {
public int solution(int[] A) {
    int currentSliceTotal=0; 
    Integer currentMin=null, SliceTotalBeforeMin =0;
    int maxSliceTotal= Integer.MIN_VALUE;
    for(int i= 1; i<A.length-1; i++){
        if( currentMin==null || A[i] < currentMin ){
            if(currentMin!=null ){
                if(SliceTotalBeforeMin+currentMin <0){
                    currentSliceTotal-=SliceTotalBeforeMin;
                } else {
                    currentSliceTotal += currentMin;
                }
            }                
            currentMin = A[i];
            SliceTotalBeforeMin  =currentSliceTotal;

            if( SliceTotalBeforeMin<0){
                SliceTotalBeforeMin = 0;
                currentMin = null;
                currentSliceTotal = 0;
            }
        } else {
            currentSliceTotal+= A[i];
        }

        maxSliceTotal = Math.max(maxSliceTotal, currentSliceTotal);
    }

    return maxSliceTotal;
}
}

This is a Java 100/100 Solution: https://codility.com/demo/results/demoVUMMR9-JH3/

class Solution {
    public int solution(int[] A) {
        // maxStartingHere[i]: best sum of a (possibly empty) slice starting at i
        // maxEndingHere[i]:   best sum of a (possibly empty) slice ending at i
        int[] maxStartingHere = new int[A.length];
        int[] maxEndingHere = new int[A.length];
        int maxSum = 0, len = A.length;

        for(int i = len - 2; i > 0; --i ) {            
            maxSum = Math.max(0, A[i] + maxSum);
            maxStartingHere[i] = maxSum;
        }
        maxSum = 0;
        for(int i = 1; i < len - 1; ++i ) {            
            maxSum = Math.max(0, A[i] + maxSum);
            maxEndingHere[i] = maxSum;
        }
        int maxDoubleSlice = 0;

        // A[i + 1] plays the role of the removed middle element Y
        for(int i = 0; i < len - 2; ++i) {
            maxDoubleSlice = Math.max(maxDoubleSlice, maxEndingHere[i] + maxStartingHere[i+2]);
        }

        return maxDoubleSlice;

    }
}

You can find more information going to this Wikipedia link and in the Programming Pearls book.

Possible Duplicate:
Sorting in linear time?

Suppose we are given a sequence S of n elements, each of which is an integer in the range [0,n^2-1]. Can we sort it in O(n) time?

Please don't mind me asking so many interview questions. I am just eager to learn.

Programming Pearls covers variations on this problem. If you can assume there are no duplicates, a simple approach is to allocate a bit vector of length n^2 and use it to mark the presence or absence of each number in the range. Note, though, that merely scanning a bit vector of length n^2 costs O(n^2) time, so on its own this does not meet the O(n) bound.
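For a bound of O(n) that holds even with duplicates, the standard trick is a two-pass radix sort in base n, since every key has exactly two base-n digits. A sketch of mine, not from the book:

```python
def radix_sort_base_n(seq):
    """Sort n integers in [0, n^2) in O(n) via two stable counting passes."""
    n = len(seq)
    if n == 0:
        return []

    def stable_pass(a, key):
        buckets = [[] for _ in range(n)]     # one bucket per base-n digit
        for v in a:
            buckets[key(v)].append(v)
        return [v for b in buckets for v in b]

    a = stable_pass(seq, lambda v: v % n)    # low digit first
    return stable_pass(a, lambda v: v // n)  # then high digit (stability preserves ties)
```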

I am a mid level developer. Currently I work in C# ASP.NET but I have worked in PHP, classic ASP, and a tiny bit in Java (would like to get more into this in the future). I am looking for a few suggestions on good books to help me continue growing in my coding techniques.

Basically a book with lots of best practices, coding tricks, and rarely used but very useful features and operators (e.g. the ?? operator), stuff like that. I am NOT looking for a book to teach me a specific language. I don't particularly care what language the code samples are in, but C or Java would probably be best. The suggestions don't need to be limited to pure code topics either; books with information about database connection management etc. would be good.

Thanks for your suggestions.

For your Java skills:

and in general

I strongly recommend reading Effective Java before Clean Code, because it will make you more sensitive to a few of the (in my personal opinion not so optimal) code changes suggested in Clean Code.

The first book that comes to mind for me is Code Complete. It's fantastic. Absolutely required reading for a serious developer.

If by rarely used stuff you mean bitwise operations, Hacker's Delight is very well regarded.

I wonder if anyone knows an (optimal?) algorithm for finding the longest recurring non-overlapping substring.

For example, in the string

ABADZEDGBADEZ

the longest recurring substring would be "BAD". Incidentally, if there is no such substring, the algorithm should report that this has occurred. My guess is that this involves suffix trees.

I'm sure this must exist somewhere already. Thanks for the help!

A suffix array is a good data structure for solving this problem.

There is a solution to this problem in Programming Pearls by Jon Bentley.
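Short of building a suffix tree or suffix array, the non-overlapping variant also falls to a simple O(n^2) dynamic program; here is a sketch of my own in Python:

```python
def longest_repeated_nonoverlapping(s):
    """dp[i][j] = length of the longest common suffix of s[:i] and s[:j],
    capped at j - i so the two occurrences cannot overlap."""
    n = len(s)
    dp = [[0] * (n + 1) for _ in range(n + 1)]
    best, end = 0, 0
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if s[i - 1] == s[j - 1]:
                dp[i][j] = min(dp[i - 1][j - 1] + 1, j - i)  # the cap prevents overlap
                if dp[i][j] > best:
                    best, end = dp[i][j], i
    return s[end - best:end]  # empty string means no repeat exists
```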

Actually this is an interesting topic from Programming Pearls: sorting 10-digit telephone numbers in limited memory with an efficient algorithm. You can find the whole story here.

What I am interested in is just how fast an implementation can be in Python. I have done a naive implementation with the BitVector module. The code is as follows:

from BitVector import BitVector
import random
import time
import sys

def sort(input_li):
        return sorted(input_li)

def vec_sort(input_li):
        bv = BitVector( size = len(input_li) )
        for i in input_li:
                bv[i] = 1

        res_li = []
        for i in range(len(bv)):
                if bv[i]:
                        res_li.append(i)

        return res_li

if __name__ == "__main__":
        test_data = range(int(sys.argv[1]))
        print 'test_data size is:', sys.argv[1]
        random.shuffle(test_data)

        start = time.time()
        sort(test_data)
        elapsed = (time.time() - start)
        print "sort function takes " + str(elapsed)

        start = time.time()
        vec_sort(test_data)
        elapsed = (time.time() - start)
        print "vec_sort function takes " + str(elapsed)

I have tested array sizes from 1,000 to 10,000,000 on my MacBook (2 GHz Intel Core 2 Duo, 2 GB RAM); the results are as follows:


  • test_data size is: 1000
    sort function takes 0.000274896621704
    vec_sort function takes 0.00383687019348

  • test_data size is: 10000
    sort function takes 0.00380706787109
    vec_sort function takes 0.0371489524841

  • test_data size is: 100000
    sort function takes 0.0520560741425
    vec_sort function takes 0.374383926392

  • test_data size is: 1000000
    sort function takes 0.867373943329
    vec_sort function takes 3.80475401878

  • test_data size is: 10000000
    sort function takes 12.9204008579
    vec_sort function takes 38.8053860664

What disappoints me is that even when the test_data size is 10,000,000, the built-in sort function is still faster than vec_sort. Is there any way to accelerate the vec_sort function?

Given an array of integers, how can you find two indices, i and j, such that the sum of the elements in the subarray starting and ending at the indices is maximized, in linear time?

From my copy of Programming Pearls:

maxsofar = 0
maxendinghere = 0
for i = [0, n)
    /* invariant: maxendinghere and maxsofar
       are accurate for x[0..i-1] */
    maxendinghere = max(maxendinghere + x[i], 0)
    maxsofar = max(maxsofar, maxendinghere)
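Since the question asks for the indices as well, here is a Python rendering of that scan, extended (my addition, not Bentley's) to track where the best subarray starts and ends:

```python
def max_subarray(x):
    """Return (best_sum, i, j) with j exclusive; (0, 0, 0) if every sum is <= 0."""
    best = (0, 0, 0)
    max_ending_here = 0
    start = 0
    for i, v in enumerate(x):
        if max_ending_here + v > 0:
            if max_ending_here == 0:
                start = i                  # a new candidate subarray begins here
            max_ending_here += v
        else:
            max_ending_here = 0            # drop any prefix summing to <= 0
        if max_ending_here > best[0]:
            best = (max_ending_here, start, i + 1)
    return best
```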

I am an awk newbie, using the Windows GNU gawk from UnxUtils.

Have 2 kinds of records arranged sequentially in date and time order in my file(s), 30-field Order records (start with "O") where quantity is the 15th field, and 18-field Trade records (start with "T") where quantity is the 8th field. The underlying research data is historical-archival Indian stock market data spanning 15 days in April 2006, about 1000 firms, and comprising in all about 100 million separate order or trade records. My test data is 500 records for 2 dates, and some 200 firms.

My objective at this point is only to compute for each firm and each date, that firm-date's cumulative order quantity and trade quantity.

The raw data IS ordered by date and time (firms obviously jumbled up, just like voters who don't usually vote in alphabetical order!). And I do now have two separate text files, one containing a list of just the distinct firm symbols; and the other, the distinct dates, one per line.

I want to complete the computations in a way that does not require going through all of the records over and over again for each of the firms and dates. The basic computations given a firm=FIRM_1 and a date=DATE_1 are easy; e.g., what I have resembles

# For each order record with firm_symbol = FIRM_1, date = DATE_1, 
# cumulate its Order quantity ($15).

( /^O/ && $4~/FIRM_1/ ) && $2~/DATE_1/ 
            { Order_Q[FIRM_1_DATE_1] = Order_Q[FIRM_1_DATE_1] + $15 }

# For each trade record with firm_symbol = FIRM_1, date = DATE_1, 
#cumulate its Trade quantity ($8).

( /^T/ && $4~/FIRM_1/ ) && $2~/DATE_1/ 
            { Trade_Q[FIRM_1_DATE_1] = Trade_Q[FIRM_1_DATE_1] + $8 }

END { print "FIRM_1 ", "DATE_1 ", Order_Q[FIRM_1_DATE_1], Trade_Q[FIRM_1_DATE_1] }

The question is how to construct an intelligent loop over all firms and dates, given the size of the underlying data. There are several related questions.

  1. I know the name FIRM_1 need not be hard-coded inside this awk script, but could be given as a command-line parameter. But can one go one step further and get awk to take that name sequentially from a list of names in a separate file, one per line? (If that's feasible, then taking dates from a list of dates would also be possible.)

  2. I constructed the array key names to hold order quantity and trade quantity knowing FIRM_1 and DATE_1. If we succeed in resolving 1 above, can one construct array keys like FIRM_1_DATE_1 on the fly, inside awk, while it is running? Is string concatenation allowed to help form a key?

  3. I realize that I could use an editor macro or some such method, to combine my 2 keys, FIRM (1000 values) and DATE (15 values) into one FIRM_DATE key (15000 values) before doing any of this, in a separate step. If 2 above is feasible, I'm assuming there's no value to doing this. Would it help anyway?

  4. In principle we are looking to hold in memory perhaps 1000 firms x 15 days x 2 variables = 30,000 cell entries across 2 arrays, ORDER_Q and TRADE_Q. Is this a lot? I use a modest Windows desktop with, I think, 8 GB of RAM.

Any suggestion or reference or example that will help reduce having to go over the original large input data several times will be very welcome. If something involves learning more not just about awk but about shell scripts, that will also be very welcome.

Use associative arrays. Your sketch shows the firm symbol in $4 and the date in $2, so:

awk '/^O/ { order_qty[$4,$2] += $15 }
     /^T/ { trade_qty[$4,$2] += $8  }
     END  { for (key in order_qty) { print key, "O", order_qty[key]; }
            for (key in trade_qty) { print key, "T", trade_qty[key]; }
          }'

That does not give you a defined order for the companies or dates in the output; there are techniques to fix that. It makes a single pass over the data, accumulating the results for all the companies and all the dates at once.

awk '     { if (date[$2]++ == 0) date_list[d++] = $2; # Dates appear in order
            if (firm[$4]++ == 0) firm_list[f++] = $4; # Firms appear out of order
          }
     /^O/ { order_qty[$4,$2] += $15 }
     /^T/ { trade_qty[$4,$2] += $8  }
     END  { for (i = 0; i < f; i++)
            {
                for (j = 0; j < d; j++)
                {
                    if ((qty = order_qty[firm_list[i],date_list[j]]) > 0)
                        print firm_list[i], date_list[j], "O", qty
                    if ((qty = trade_qty[firm_list[i],date_list[j]]) > 0)
                        print firm_list[i], date_list[j], "T", qty
                }
            }
          }'

If you want the firms in a specific (e.g. sorted) order, sort the firm list before printing. GNU awk provides built-in sort functions. Otherwise, you'll have to write an awk function to do it. (See Programming Pearls or More Programming Pearls (or both) for more information on writing sort functions in awk.)

Warning: untested code.

I have a program that is running slower than I'd like it to.

I've done some profiling, and I've found the section that is taking up the vast majority of the processing time:

        DO K = 0, K_MAX
            WRITE(EIGENVALUES_IO, *) K * 0.001 * PI, (W_UP(J), J=1, ATOM_COUNT)
            DCMPLXW_UP(:) = DCMPLX(W_UP(:))
            DO E = 1, ENERGY_STEPS
                ENERGY = MIN_ENERGY + ENERGY_STEP * REAL(E, DP)
                ZV = DCMPLX(ENERGY, DELTA)
                ON_SITE_SINGLE = DCMPLX(0.0_DP)
                DO Q = 1, ATOM_COUNT
                    DO J = 1, ATOM_COUNT
                        ON_SITE_SINGLE(J) = ON_SITE_SINGLE(J) + (MATRIX_UP(J, Q) * MATRIX_UP_CONJG(J, Q)) / (ZV - DCMPLXW_UP(Q))
                    END DO
                END DO
                DOS_DOWN(E) = DOS_DOWN(E) - WEIGHTS(K) * SUM(IMAG(ON_SITE_SINGLE))
            END DO
        END DO

The line

ON_SITE_SINGLE(J) = ON_SITE_SINGLE(J) + (MATRIX_UP(J, Q) * MATRIX_UP_CONJG(J, Q)) / (ZV - DCMPLXW_UP(Q))

is the one that is doing the damage.

I'm fairly new at this; is there some way of speeding this up? AFAIK, the same principles apply in C, so any help from you C guys would be nice too.

The arrays are all COMPLEX

K_MAX is 1000

ENERGY_STEPS is 1000

ATOM_COUNT is low ( < 50)

The factors you divide by, namely

(ZV - DCMPLXW_UP(Q))

do not depend on J, only on Q. Hence, I would move this calculation up to the Q loop. Better still, calculate:

1/(ZV - DCMPLXW_UP(Q))

in the Q loop, and multiply by it instead of dividing inside the J loop (multiplication is generally faster than division). Also, check that your loop order matches your array layout, so that the inner loops traverse contiguous memory as much as possible. As a rule, if you can improve the algorithm, that will yield the greatest run-time improvement.
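The same transformation in plain Python (an illustration of the principle only; the names are mine, not the original Fortran):

```python
def on_site_naive(m, m_conj, w, zv):
    """Direct translation of the hot loop: one complex division per (q, j) pair."""
    n = len(w)
    out = [0j] * n
    for q in range(n):
        for j in range(n):
            out[j] += m[j][q] * m_conj[j][q] / (zv - w[q])
    return out

def on_site_hoisted(m, m_conj, w, zv):
    """Hoisted version: one reciprocal per q, reused across the whole j loop."""
    n = len(w)
    out = [0j] * n
    for q in range(n):
        inv = 1.0 / (zv - w[q])          # loop-invariant for the inner loop
        for j in range(n):
            out[j] += m[j][q] * m_conj[j][q] * inv
    return out
```

With ATOM_COUNT around 50, this replaces roughly 2500 divisions per energy step with 50 divisions plus cheap multiplications.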

Programming Pearls has a great description of similar optimizations.

I have to implement quicksort. From Programming Pearls here is the code:

public class Quick{
    public static void quicksort(int x[], int l, int u)
    {
        if (l>=u)
            return;
        int t = x[l];
        int i = l;
        int j = u;

        do
        {
            i++;
        } while (i<=u && x[i]<t);

        do
        {
            j--;
            if (i>=j) break;
        } while (x[j]>t);

        swap(x, i, j);
        swap(x, l, j);

        quicksort(x, l  , j-1);
        quicksort(x, j+1, u  );
    }

    public static void main(String[]args)
    {
        int x[] = new int[]{55,41,59,26,53,58,97,93};
        quicksort(x, 0, x.length-1);
        for (int i=0; i<x.length; i++)
        {
            System.out.println(x[i]);
        }
    }

    public static void swap(int x[], int i, int j)
    {
        int s = x[i];
        x[i] = x[j];
        x[j] = s;
    }
}

It does not work. Here is the output:

59
41
55
26
53
97
58
93

Why doesn't it work?

Should be:

int t=x[l]; 
int  i=l; 
->    int  j=u + 1; 

In addition, you have translated the pseudo-code incorrectly. Here it is in C# (very similar to C; just change the array declarations):

public static class Sort
{
    public static void quicksort(int[] x, int l, int u)
    {
        if (l >= u)
            return;

        int t = x[l];
        int i = l;
        int j = u + 1;

        while (true)  // In C, make this while(1)
        {
            do
            {
                i++;
            } while (i <= u && x[i] < t);

            do
            {
                j--;
            } while (x[j] > t);

            if (i >= j)
                break;

            swap(x, i, j);
        }

        swap(x, l, j);

        quicksort(x, l, j - 1);
        quicksort(x, j + 1, u);
    }


    public static void swap(int[] x, int i, int j)
    {
        int s = x[i];
        x[i] = x[j];
        x[j] = s;
    }
}

Calling with this:

    static void Main(string[] args)
    {
        int[] x = new int[] { 55, 41, 59, 26, 53, 58, 97, 93 };

        Sort.quicksort(x, 0, x.Length - 1);
        for (int i = 0; i < x.Length; i++)
        {
            Console.WriteLine(x[i]);
        }
    }

Produces:

26
41
53
55
58
59
93
97

Suppose we have a loop that runs for 100 times. Does using unsigned char instead of int for its counter make a difference? And also using i += 1U instead of i++? Or do compilers take care of that?

Performance figures vary from compiler to compiler, and they can vary from OS to OS as well.

A better approach is to write a small program that measures the performance of the operations you are interested in.

Programming Pearls covers the performance of primitive operations in detail. It also shows how to write small programs to measure the performance of those operations.

Yesterday I attended an interview. The interviewer gave me a few programming questions to solve. When I solved them, he said they could be done with better time complexity. I was so depressed that I could not write the programs with the best time complexity, and in the end I did not get through the interview process. What I want to know is: how can we achieve the best time for any problem? What should my approach be to reach that state? I know the obvious answer is practice, but I still want to know how and in what ways to write a program so that it runs in less time and uses memory well. What books should I read? What problems should I practice?

P.S: I know this is not a technical question. But please let me know how can I do that.

One of the best books about algorithms, data structures, time and space complexity is Introduction to Algorithms. I can also recommend you to read following books for good preparation for an interview:

  1. Cracking the Coding Interview: 150 Programming Questions and Solutions
  2. Programming Interviews Exposed: Secrets to Landing Your Next Job
  3. Programming Pearls

I'm having trouble with this binary search algorithm. Here are explanations of the variables.

value: the number being searched within the array

values[]: the array that is being searched

n: number of elements in the array

high: highest element (by zero indexed position) of portion of the array being searched

low: lowest element (by zero indexed position) the portion of the array being searched

My problem isn't the recursion. The portion of the array being searched narrows in around "value", and the conditions identified below are being met. The problem is that my if statements don't seem to recognize that they are. I know the conditions are being met because when I print out values[high], values[middle], and values[low] for each recursion, it shows that they are.

int search(int value, int values[], int n, int high, int low)
 {   
   if (n <= 0)
   {
    return 1;
   }

   int middle = (high + low) / 2;

     ///condition #1
   if (value == values[middle])
   {
     return 0;
   }

   //conditions #2 and #3 (account for the maxes and mins of the array because the operation to find middle truncates)
  else if ( values[middle]==values[low] || values[middle]==values[high])
    {
     return 0;
    }

  else if (value > values[middle])
   {
        low = middle;
        search(value, values, n, high, low);
   }

  else if (value < values[middle])
   {
      high = middle;
      search(value, values, n, high, low);
   }

    return 2;
   } 

What's wrong here?

Quoth cb3k

That seemed to make it work...what might the other problems be?

Here's your code with the minimal (necessary, but not sufficient) fix diagnosed by templatetypedef and a test harness.

#include <stdio.h>

static
int search(int value, int values[], int n, int high, int low)
{
    if (n <= 0)
    {
        return 1;
    }

    int middle = (high + low) / 2;

    ///condition #1
    if (value == values[middle])
    {
        return 0;
    }

    // conditions #2 and #3 (account for the maxes and mins of the array because the operation to find middle truncates)
    else if (values[middle] == values[low] || values[middle] == values[high])
    {
        return 0;
    }

    else if (value > values[middle])
    {
        low = middle;
        return search(value, values, n, high, low);
    }

    else if (value < values[middle])
    {
        high = middle;
        return search(value, values, n, high, low);
    }

    return 2;
}

int main(void)
{
    int data[15];
    for (int i = 0; i < 15; i++)
        data[i] = 2 * i + 1;

    printf("Data:");
    for (int i = 0; i < 15; i++)
        printf("%3d", data[i]);
    putchar('\n');

    for (int i = -1; i < 2 * 15 + 3; i++)
        printf("Search for %2d - result %d\n", i, search(i, data, 15, 14, 0));
    return 0;
}

Here's the output:

Data:  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29
Search for -1 - result 0
Search for  0 - result 0
Search for  1 - result 0
Search for  2 - result 0
Search for  3 - result 0
Search for  4 - result 0
Search for  5 - result 0
Search for  6 - result 0
Search for  7 - result 0
Search for  8 - result 0
Search for  9 - result 0
Search for 10 - result 0
Search for 11 - result 0
Search for 12 - result 0
Search for 13 - result 0
Search for 14 - result 0
Search for 15 - result 0
Search for 16 - result 0
Search for 17 - result 0
Search for 18 - result 0
Search for 19 - result 0
Search for 20 - result 0
Search for 21 - result 0
Search for 22 - result 0
Search for 23 - result 0
Search for 24 - result 0
Search for 25 - result 0
Search for 26 - result 0
Search for 27 - result 0
Search for 28 - result 0
Search for 29 - result 0
Search for 30 - result 0
Search for 31 - result 0
Search for 32 - result 0

It is returning 0 regardless of whether the value sought is present in the array or not. This is incorrect behaviour.

You should take time out to study Programming Pearls by Jon Bentley. It covers the testing of binary searches in a variety of forms; the test harness shown here is a variant on what he describes. Also take the time to read Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken. Maybe you should take reassurance that lots of other people have got binary search wrong over time. (IIRC, the first versions of binary search were published in the 1950s, but it wasn't until the early 1960s that a correct version was published; and then there's the Extra, Extra information from 2006, too.)

When I added a printf() in the block after else if (values[middle] == values[low] || values[middle] == values[high]), it printed on every search that should have failed. Note that the interface makes it hard to spot what's happening — it doesn't report where the element is found, just whether it is found. You can add the debugging and code changes necessary to deal with the residual problems. (Hint: that condition is probably not part of the solution. However, when you do remove it, the code goes into a permanent loop because you don't eliminate the value known not to be in the range from the range that you check recursively.)

This seems to work. Note that return 2; is never executed (because the final else if is never false).

#include <stdio.h>

static
int search(int value, int values[], int n, int high, int low)
{
    //if (n <= 0)
    if (n <= 0 || high < low)
    {
        return 1;
    }

    int middle = (high + low) / 2;

    ///condition #1
    if (value == values[middle])
    {
        return 0;
    }

#if 0
    // conditions #2 and #3 (account for the maxes and mins of the array because the operation to find middle truncates)
    else if (values[middle] == values[low] || values[middle] == values[high])
    {
        //printf(" (#2 || #3) ");
        return 0;
    }
#endif

    else if (value > values[middle])
    {
        //low = middle;
        low = middle + 1;
        return search(value, values, n, high, low);
    }

    else if (value < values[middle])
    {
        //high = middle;
        high = middle - 1;
        return search(value, values, n, high, low);
    }

    return 2;
}

int main(void)
{
    int data[15];
    for (int i = 0; i < 15; i++)
        data[i] = 2 * i + 1;

    printf("Data:");
    for (int i = 0; i < 15; i++)
        printf("%3d", data[i]);
    putchar('\n');

    for (int i = -1; i < 2 * 15 + 3; i++)
        printf("Search for %2d - result %d\n", i, search(i, data, 15, 14, 0));
    return 0;
}

Output:

Data:  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29
Search for -1 - result 1
Search for  0 - result 1
Search for  1 - result 0
Search for  2 - result 1
Search for  3 - result 0
Search for  4 - result 1
Search for  5 - result 0
Search for  6 - result 1
Search for  7 - result 0
Search for  8 - result 1
Search for  9 - result 0
Search for 10 - result 1
Search for 11 - result 0
Search for 12 - result 1
Search for 13 - result 0
Search for 14 - result 1
Search for 15 - result 0
Search for 16 - result 1
Search for 17 - result 0
Search for 18 - result 1
Search for 19 - result 0
Search for 20 - result 1
Search for 21 - result 0
Search for 22 - result 1
Search for 23 - result 0
Search for 24 - result 1
Search for 25 - result 0
Search for 26 - result 1
Search for 27 - result 0
Search for 28 - result 1
Search for 29 - result 0
Search for 30 - result 1
Search for 31 - result 1
Search for 32 - result 1

I want to write a function ContigSum(i,j) that calculates the sum of the contiguous elements a[i] through a[j], where i<=j and a[] contains positive and negative numbers.

Could you please tell me a time-efficient solution for finding the maximum contiguous sum in the array?

This is discussed in Column 7 of the 1st Edition or Column 8 of the 2nd Edition of 'Programming Pearls' by Jon Bentley.

int recurbinarysearch(int * oringalary, int low, int high, int target) {

    if (low > high) {
        return -1;
    } else {
        int mid = (low + high) / 2;
        if (target == * (oringalary + mid)) return mid;
        if (target > * (oringalary + mid)) return recurbinarysearch(oringalary, mid + 1, high, target);
        if (target < * (oringalary + mid)) return recurbinarysearch(oringalary, low, mid - 1, target);
    }
}

Does anyone see an error in my recursive binary search algorithm? Occasionally it returns an index that is incorrect (most often it is off by one), but once in a while it is way off. Usually, however, it is right. I do not see the issue and would appreciate the help.

For a thorough discussion of binary search, study Jon Bentley's Programming Pearls.

Here's a test harness for your code, very much inspired by Programming Pearls, plus an instrumented version of your code. The only change I made is adding the (now commented out) debug printing in the binary search. The output from the test code is almost perfect (the harness says everything passes, but it is not quite correct):

N =  0: 
search for  0 in  0 entries - returned  0 found  0 PASS
N =  1: [0] = 1;
search for  0 in  1 entries - returned -1          PASS
search for  1 in  1 entries - returned  0 found  1 PASS
search for  2 in  1 entries - returned -1          PASS
N =  2: [0] = 1;[1] = 3;
search for  0 in  2 entries - returned -1          PASS
search for  1 in  2 entries - returned  0 found  1 PASS
search for  2 in  2 entries - returned -1          PASS
search for  3 in  2 entries - returned  1 found  3 PASS
search for  4 in  2 entries - returned -1          PASS
N =  3: [0] = 1;[1] = 3;[2] = 5;
search for  0 in  3 entries - returned -1          PASS
search for  1 in  3 entries - returned  0 found  1 PASS
search for  2 in  3 entries - returned -1          PASS
search for  3 in  3 entries - returned  1 found  3 PASS
search for  4 in  3 entries - returned -1          PASS
search for  5 in  3 entries - returned  2 found  5 PASS
search for  6 in  3 entries - returned -1          PASS
N =  4: [0] = 1;[1] = 3;[2] = 5;[3] = 7;
search for  0 in  4 entries - returned -1          PASS
search for  1 in  4 entries - returned  0 found  1 PASS
search for  2 in  4 entries - returned -1          PASS
search for  3 in  4 entries - returned  1 found  3 PASS
search for  4 in  4 entries - returned -1          PASS
search for  5 in  4 entries - returned  2 found  5 PASS
search for  6 in  4 entries - returned -1          PASS
search for  7 in  4 entries - returned  3 found  7 PASS
search for  8 in  4 entries - returned -1          PASS
N =  5: [0] = 1;[1] = 3;[2] = 5;[3] = 7;[4] = 9;
search for  0 in  5 entries - returned -1          PASS
search for  1 in  5 entries - returned  0 found  1 PASS
search for  2 in  5 entries - returned -1          PASS
search for  3 in  5 entries - returned  1 found  3 PASS
search for  4 in  5 entries - returned -1          PASS
search for  5 in  5 entries - returned  2 found  5 PASS
search for  6 in  5 entries - returned -1          PASS
search for  7 in  5 entries - returned  3 found  7 PASS
search for  8 in  5 entries - returned -1          PASS
search for  9 in  5 entries - returned  4 found  9 PASS
search for 10 in  5 entries - returned -1          PASS
N =  6: [0] = 1;[1] = 3;[2] = 5;[3] = 7;[4] = 9;[5] = 11;
search for  0 in  6 entries - returned -1          PASS
search for  1 in  6 entries - returned  0 found  1 PASS
search for  2 in  6 entries - returned -1          PASS
search for  3 in  6 entries - returned  1 found  3 PASS
search for  4 in  6 entries - returned -1          PASS
search for  5 in  6 entries - returned  2 found  5 PASS
search for  6 in  6 entries - returned -1          PASS
search for  7 in  6 entries - returned  3 found  7 PASS
search for  8 in  6 entries - returned -1          PASS
search for  9 in  6 entries - returned  4 found  9 PASS
search for 10 in  6 entries - returned -1          PASS
search for 11 in  6 entries - returned  5 found 11 PASS
search for 12 in  6 entries - returned -1          PASS
N =  7: [0] = 1;[1] = 3;[2] = 5;[3] = 7;[4] = 9;[5] = 11;[6] = 13;
search for  0 in  7 entries - returned -1          PASS
search for  1 in  7 entries - returned  0 found  1 PASS
search for  2 in  7 entries - returned -1          PASS
search for  3 in  7 entries - returned  1 found  3 PASS
search for  4 in  7 entries - returned -1          PASS
search for  5 in  7 entries - returned  2 found  5 PASS
search for  6 in  7 entries - returned -1          PASS
search for  7 in  7 entries - returned  3 found  7 PASS
search for  8 in  7 entries - returned -1          PASS
search for  9 in  7 entries - returned  4 found  9 PASS
search for 10 in  7 entries - returned -1          PASS
search for 11 in  7 entries - returned  5 found 11 PASS
search for 12 in  7 entries - returned -1          PASS
search for 13 in  7 entries - returned  6 found 13 PASS
search for 14 in  7 entries - returned -1          PASS
N =  8: [0] = 1;[1] = 3;[2] = 5;[3] = 7;[4] = 9;[5] = 11;[6] = 13;[7] = 15;
search for  0 in  8 entries - returned -1          PASS
search for  1 in  8 entries - returned  0 found  1 PASS
search for  2 in  8 entries - returned -1          PASS
search for  3 in  8 entries - returned  1 found  3 PASS
search for  4 in  8 entries - returned -1          PASS
search for  5 in  8 entries - returned  2 found  5 PASS
search for  6 in  8 entries - returned -1          PASS
search for  7 in  8 entries - returned  3 found  7 PASS
search for  8 in  8 entries - returned -1          PASS
search for  9 in  8 entries - returned  4 found  9 PASS
search for 10 in  8 entries - returned -1          PASS
search for 11 in  8 entries - returned  5 found 11 PASS
search for 12 in  8 entries - returned -1          PASS
search for 13 in  8 entries - returned  6 found 13 PASS
search for 14 in  8 entries - returned -1          PASS
search for 15 in  8 entries - returned  7 found 15 PASS
search for 16 in  8 entries - returned -1          PASS
N =  9: [0] = 1;[1] = 3;[2] = 5;[3] = 7;[4] = 9;[5] = 11;[6] = 13;[7] = 15;[8] = 17;
search for  0 in  9 entries - returned -1          PASS
search for  1 in  9 entries - returned  0 found  1 PASS
search for  2 in  9 entries - returned -1          PASS
search for  3 in  9 entries - returned  1 found  3 PASS
search for  4 in  9 entries - returned -1          PASS
search for  5 in  9 entries - returned  2 found  5 PASS
search for  6 in  9 entries - returned -1          PASS
search for  7 in  9 entries - returned  3 found  7 PASS
search for  8 in  9 entries - returned -1          PASS
search for  9 in  9 entries - returned  4 found  9 PASS
search for 10 in  9 entries - returned -1          PASS
search for 11 in  9 entries - returned  5 found 11 PASS
search for 12 in  9 entries - returned -1          PASS
search for 13 in  9 entries - returned  6 found 13 PASS
search for 14 in  9 entries - returned -1          PASS
search for 15 in  9 entries - returned  7 found 15 PASS
search for 16 in  9 entries - returned -1          PASS
search for 17 in  9 entries - returned  8 found 17 PASS
search for 18 in  9 entries - returned -1          PASS

Almost all of that is fine; the only problem child is the very first search, which should fail because no value should be found in an empty array.

The test code is:

#include <stdio.h>

int recurbinarysearch(int *oringalary, int low, int high, int target)
{
    //printf("-->> %d..%d: ", low, high);
    if (low > high)
    {
        //printf("<<-- %d ", -1);
        return -1;
    }
    else
    {
        int mid = (low + high) / 2;
        if (target == * (oringalary + mid))
        {
            //printf("<<-- %d ", mid);
            return mid;
        }
        if (target > * (oringalary + mid))
        {
            int r = recurbinarysearch(oringalary, mid + 1, high, target);
            //printf("<<-- %d ", r);
            return r;
        }
        if (target < * (oringalary + mid))
        {
            int r = recurbinarysearch(oringalary, low, mid - 1, target);
            //printf("<<-- %d ", r);
            return r;
        }
    }
}

int main(void)
{
    for (int i = 0; i < 10; i++)
    {
        int a[i+1]; // No zero-size arrays in C
        printf("N = %2d: ", i);
        for (int j = 0; j < i; j++)
        {
            a[j] = 2 * j + 1;
            printf("[%d] = %d;",j, a[j]);
        }
        putchar('\n');

        for (int j = 0; j < 2*i+1; j++)
        {
            int f = recurbinarysearch(a, 0, i, j);
            //putchar('\n');  // debug
            printf("search for %2d in %2d entries - returned %2d",
                    j, i, f);
            if (f >= 0 && f <= i)
            {
                printf(" found %2d", a[f]);
                printf(" %s", (a[f] == j) ? "PASS" : "FAIL");
            }
            else
                printf(" %8s %s", "", (j % 2 == 0) ? "PASS" : "FAIL");
            putchar('\n');
        }
    }
    return(0);
}

I'm going to leave you to work out how to deal with the empty array case.

An excerpt from Programming Pearls:

A Simple Design: Antoine de Saint-Exupéry,
the French writer and aircraft designer, said that
*"A designer knows he has arrived at perfection
not when there is no longer anything to add,
but when there is no longer anything to take away."*
More programmers should judge their work by this criterion.

Can any one elaborate this, please?

What does the author mean when he say "...TAKE AWAY"

The "take away" part means that a design can be considered simple when all that remains are essential components: if you take anything away, it won't work.

"Designing the right algorithm for a given application is a difficult job. It requires a major creative act, taking a problem and pulling a solution out of the ether. This is much more difficult than taking someone else's idea and modifying it or tweaking it to make it a little better. The space of choices you can make in algorithm design is enormous, enough to leave you plenty of freedom to hang yourself".

I have studied several basic algorithm design techniques such as divide and conquer, dynamic programming, greedy, backtracking, etc.

But I always fail to recognize which principles to apply when I come across certain programming problems. I want to master the design of algorithms.

So can anyone suggest the best place to understand the principles of algorithm design in depth?

I suggest Programming Pearls, 2nd edition, by Jon Bentley. He talks a lot about algorithm design techniques and provides examples of real world problems, how they were solved, and how different algorithms affected the runtime.

Throughout the book, you learn algorithm design techniques and program verification methods to ensure your algorithms are correct, and you also learn a little bit about data structures. It's a very good book and I recommend it to anyone who wants to master algorithms. Go read the reviews on Amazon: http://www.amazon.com/Programming-Pearls-2nd-Edition-Bentley/dp/0201657880

You can have a look at some of the book's contents here: http://netlib.bell-labs.com/cm/cs/pearls/

Enjoy!

I have implemented quicksort, but this code is giving me a segmentation fault.

#include<stdio.h>
#include<stdlib.h>
void Quick_Sort(int *a,int p,int r);
int partition(int *a,int p,int r);
int main(){
    int testcase,num,search;
    int i,j,*array;
    scanf("%d",&testcase);
    for(i=0;i<testcase;i++){
        scanf("%d",&num);
        array=malloc(sizeof(int)*num);
        for(j=0;j<num;j++){
            scanf("%d",(array+j));
        }
        Quick_Sort(array,0,num-1);
        scanf("%d",&search);
        /*for(j=0;j<num;j++){
            if(search==*(array+j)){
                printf("%d\n",j+1 );
                break;
            }

        }*/
    }
    for(i=0;*(array+i)!='\0';++i){
        printf("%d ",*(array+i));
    }
}
void Quick_Sort(int *a,int p,int r){
    int q;
    if(p<r){
        q=partition(a,p,r);
        Quick_Sort(a,p,q);
        Quick_Sort(a,q+1,r);
    }
}
int partition(int *a,int p,int r){
    int i,j,x,temp;
    i=p-1;
    j=r+1;
    x=*(a+p);
    while(1){
        do{
        j=j-1;
    }
    while(x<=(*(a+j)));

    do{
        i=i+1;
    }
    while(x>=(*(a+i)));

    if(i<j){
        temp=*(a+i);
        *(a+i)=*(a+j);
        *(a+j)=temp;
    }else{
        return j;
    }
    }
}

Reading the data

Your data file seems to be structured as:

T
N1 V11 V12 V13 ... V1N S1
N2 V21 V22 V23 ... V2N S2
...

You have a total number of test cases, T. Then for each test case, you have a number of entries in the array (N1, N2, etc), each time followed by the appropriate number of values, V11 .. V1N, and a spare value which you're supposed to search for (all described using 1-based array notations; in C, you'll be using 0-based arrays). Although I've shown the data for each test set all on a single line, the actual data file could have the numbers laid out in any sequence — everything could be on one line, or each number on a line of its own possibly with blank lines separating them, or any mixture of these formats.

The main() program you show does not do a particularly good job of reading that data. It lacks all error checking. It uses the (legal but) bizarre *(array + i) notations instead of the simpler to understand array[i] notation, apparently in the belief that it will be more efficient. When you use pointers for efficiency, you don't keep on adding i to the value before dereferencing the pointer. Your code dynamically allocates memory but never frees it, leaking horribly.

Reading the data using subscript notation

In this revised code, I'm using return 1; to exit from the program. It should print an error message too, but I'm moderately lazy.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int testcase, num, search;
    int i, j, *array;
    if (scanf("%d", &testcase) != 1)
        return 1;

    for (i = 0; i < testcase; i++)
    {
        if (scanf("%d", &num) != 1)
            return 1;
        array = malloc(sizeof(int)*num);
        if (array == 0)
            return 1;
        for (j = 0; j < num; j++)
        {
            if (scanf("%d", &array[j]) != 1)
                return 1;
        }
        //Quick_Sort(array, 0, num-1);
        if (scanf("%d", &search) != 1)
            return 1;
        for (j = 0; j < num; j++)
        {
            if (search == array[j])
            {
                printf("%d\n", j+1);
                break;
            }
        }
        // Print the array - best encapsulated in a small function
        for (j = 0; j < num; j++)
        {
            printf(" %d", array[j]);
            if (j % 10 == 9)
                putchar('\n');
        }
        if (j % 10 != 0)
            putchar('\n');
        // Prevent memory leaks
        free(array);
    }
    return 0;
}

I note in passing that the search loop will work whether or not the QuickSort sorts the data; it makes no use of the fact that the array is sorted. You could print the data before the sort and after the sort. You should tag the output to identify what you're printing. For example, the search code might write:

printf("Found %d at %d\n", search, j);

You also do not identify when the value being searched for is not found.

It is also often a good idea to print the data you read after you've read it and before you process it, just to make sure that your program is getting the data you expect it to get. It can lead to confusion if the program isn't working on the data you think it is working on.

Note that this code does not make any assumptions about the values in the arrays beyond 'they are valid integers'. And it does check every input operation. Tedious though it may seem, it is necessary to head off trouble.

Reading the data using pointer notation

Here is code making more or less idiomatic use of pointers:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int testcase;
    if (scanf("%d", &testcase) != 1)
        return 1;

    for (int i = 0; i < testcase; i++)
    {
        int num;
        if (scanf("%d", &num) != 1)
            return 1;
        int *array = malloc(sizeof(int)*num);
        if (array == 0)
            return 1;
        int *end = array + num;
        int *ptr;
        for (ptr = array; ptr < end; ptr++) 
        {
            if (scanf("%d", ptr) != 1)
                return 1;
        }

        //Quick_Sort(array, 0, num-1);

        int search;
        if (scanf("%d", &search) != 1)
            return 1;
        for (ptr = array; ptr < end; ptr++) 
        {
            if (search == *ptr)
            {
                break;
            }
        }
        if (ptr < end)
            printf("Found %d at %td\n", search, ptr - array + 1);
        else
            printf("Missing %d\n", search);

        // Print the array - best encapsulated in a small function
        printf("Array (%d):", num);
        int j;
        for (j = 0; j < num; j++)
        {
            printf(" %d", array[j]);
            if (j % 10 == 9)
                putchar('\n');
        }
        if (j % 10 != 0)
            putchar('\n');

        // Prevent memory leaks
        free(array);
    }
    return 0;
}

The printing loop is simplest if written using indexing, so I didn't change it to use pointers. It could be done, but using (ptr - array) instead of j in the 'is it time to print a newline' code makes it less worthwhile. The code uses C99 features like declaring variables as they're needed and the t qualifier in %td for the ptrdiff_t value. It could be written to use a VLA instead of malloc(), too.

Sample input data

3
2 1 2 1
3 3 2 0 1
4 5 4 3 2 4

Sample output

Found 1 at 1
Array (2): 1 2
Missing 1
Array (3): 3 2 0
Found 4 at 2
Array (4): 5 4 3 2

Working Quicksort Code

Your partition algorithm was defective. It is fixed, with key changes marked. The debugging scaffolding that I used while sorting out the issues is left in place, with many of the printing operations that guided me commented out. Read Programming Pearls by Jon Bentley, especially 'Column 11: Sorting' (chapter 11, but the chapters were originally columns in the Communications of the ACM, hence the designation Column 11). It was an invaluable guide while fixing the problems.

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void Quick_Sort(int *a, int p, int r);
static int  partition(int *a, int p, int r);

static void dump_partition(char const *tag, int const *data, int lo, int hi);

/* Debugging functions */
static void check_sorted(int const *data, int lo, int hi);
static int *copy_partition(int const *data, int lo, int hi);
static void check_consistency(int const *a1, int const *a2, int lo, int hi);

int main(void)
{
    int testcase, num, search;
    int i, j, *array;
    if (scanf("%d", &testcase) != 1)
        return 1;

    for (i = 0; i < testcase; i++)
    {
        if (scanf("%d", &num) != 1)
            return 1;
        array = malloc(sizeof(int)*num);
        if (array == 0)
            return 1;
        for (j = 0; j < num; j++)
        {
            if (scanf("%d", &array[j]) != 1)
                return 1;
        }

        printf("\nData set %d:\n", i);
        int *copy = copy_partition(array, 0, num-1);
        dump_partition("Before:", array, 0, num-1);
        //dump_partition("Copy", copy, 0, num-1);
        Quick_Sort(array, 0, num-1);
        dump_partition("After: ", array, 0, num-1);
        check_sorted(array, 0, num-1);
        check_consistency(array, copy, 0, num-1);
        free(copy);

        if (scanf("%d", &search) != 1)
            return 1;
        for (j = 0; j < num; j++)
        {
            if (search == array[j])
                break;
        }
        if (j < num && search == array[j])
            printf("Found %d at %d\n", search, j+1);
        else
            printf("Missing %d\n", search);

        // Prevent memory leaks
        free(array);
    }
    return 0;
}

/* Although we're interested in data[lo]..data[hi], the copy must have data[0]..data[lo-1] too */
static int *copy_partition(int const *data, int lo, int hi)
{
    assert(lo <= hi);
    size_t nbytes = (hi + 1) * sizeof(int);
    int *space = (int *)malloc(nbytes);
    if (space == 0)
    {
        fputs("Out of memory\n", stderr);
        exit(1);
    }
    memmove(space, data, nbytes);
    return(space);
}

/* Check that the two arrays contain the same sets of data */
/* Each value in a1 must be present in a2 and vice versa */
static void check_consistency(int const *a1, int const *a2, int lo, int hi)
{
    int *a3 = copy_partition(a1, lo, hi);
    int  a3_lo = lo;
    int  a3_hi = hi;
    //printf("-->> check_consistency:\n");
    //dump_partition("a1", a1, lo, hi);
    //dump_partition("a2", a2, lo, hi);
    //dump_partition("a3", a3, lo, hi);
    for (int i = lo; i <= hi; i++)
    {
        int found = 0;
        for (int j = a3_lo; j <= a3_hi; j++)
        {
            if (a2[i] == a3[j])
            {
                /* Found a match for a2[i] at a3[j] */
                /* Move a3[j] to end of array and ignore it from here on */
                //printf("Found a2[%d] = %d at a3[%d] = %d\n", i, a2[i], j, a3[j]);
                int t = a3[a3_hi];
                a3[a3_hi] = a3[j];
                a3[j] = t;
                a3_hi--;
                //dump_partition("a3-free", a3, a3_lo, a3_hi);
                //dump_partition("a3-used", a3, a3_hi+1, hi);
                found = 1;
                break;
            }
        }
        if (!found)
        {
            printf("No value %d for a2[%d] in a1\n", a2[i], i);
            dump_partition("a2", a2, lo, hi);
            dump_partition("a1-free", a3, a3_lo, a3_hi);
            dump_partition("a1-used", a3, a3_hi+1, hi);
        }
    }
    free(a3);
    //printf("<<-- check_consistency\n");
}

static void dump_partition(char const *tag, int const *data, int lo, int hi)
{
    printf("%s [%d..%d]%s", tag, lo, hi, (hi - lo) > 10 ? "\n" : "");
    int i;
    for (i = lo; i <= hi; i++)
    {
        printf(" %2d", data[i]);
        if ((i - lo) % 10 == 9)
            putchar('\n');
    }
    if ((i - lo) % 10 != 0 || i == lo)
        putchar('\n');
}

static void check_sorted(int const *data, int lo, int hi)
{
    //printf("-->> check_sorted:\n");
    for (int i = lo; i < hi; i++)
    {
        if (data[i] > data[i+1])
            printf("Out of order: a[%d] = %d bigger than a[%d] = %d\n", i, data[i], i+1, data[i+1]);
    }
    //printf("<<-- check_sorted\n");
}

void Quick_Sort(int *a, int p, int r)
{
    if (p < r)
    {
        int q = partition(a, p, r);
        //dump_partition("Lo Range", a, p, q-1);
        //printf("Pivot: a[%d] = %d\n", q, a[q]);
        //dump_partition("Hi Range", a, q+1, r);
        Quick_Sort(a, p, q-1);          // JL: Optimization
        Quick_Sort(a, q+1, r);
    }
}

static int partition(int *a, int p, int r)
{
    assert(p <= r);
    if (p == r)                         // JL: Key change
        return p;                       // JL: Key change
    int i = p;                          // JL: Key change
    int j = r + 1;
    int x = a[p];
    //printf("-->> partition: lo = %d, hi = %d, pivot = %d\n", p, r, x);
    while (1)
    {
        do
        {
            j--;
            //printf("---- partition 1: a[%d] = %d\n", j, a[j]);
        }   while (x < a[j]);           // JL: Key change

        do
        {
            i++;
            //printf("---- partition 2: a[%d] = %d\n", i, a[i]);
        }   while (i <= r && x > a[i]); // JL: Key change

        if (i <= j)                     // JL: Key change
        {
            //printf("---- partition: swap a[%d] = %d with a[%d] = %d\n", i, a[i], j, a[j]);
            int temp = a[i];
            a[i] = a[j];
            a[j] = temp;
        }
        else
        {
            // This swap step is crucial.
            int temp = a[p];            // JL: Key change
            a[p] = a[j];                // JL: Key change
            a[j] = temp;                // JL: Key change
            //dump_partition("a-lo", a, p, j-1);
            //printf("a-pivot[%d] = %d\n", j, a[j]);
            //dump_partition("a-hi", a, j+1, r);
            //printf("<<-- partition: return j = %d; a[%d] = %d; (i = %d; a[%d] = %d)\n", j, j, a[j], i, i, a[i]);
            return j;
        }
    }
}

Extended Sample Input

10

2 1 2 1
3 3 2 0 1
4 5 4 3 2 4
4 3 1 9 3 8
5 3 4 1 9 3 8
9 3 6 4 9 5 1 9 3 3 8
10 3 6 4 9 6 5 1 9 3 3 8

16
3 6 4 9 6 5 1 9 3 3
8 2 1 7 3 5
3

26
3 6 4 9 6 5 1 9 3 3
2 7 8 2 0 8 4 4 7 5
8 2 1 7 3 5
7

96
3 6 4 9 6 5 1 9 3 3
4 0 5 0 7 5 6 3 6 0
1 2 0 7 3 1 7 6 2 3
0 4 6 6 9 8 9 5 3 4
1 9 2 9 2 7 5 9 8 9
4 7 5 8 7 8 5 8 2 7
5 8 2 9 8 3 7 6 5 3
9 1 2 0 3 4 6 5 1 0
2 7 8 2 0 8 4 4 7 5
8 2 1 7 3 5
6

Extended Sample Output

Data set 0:
Before: [0..1]  1  2
After:  [0..1]  1  2
Found 1 at 1

Data set 1:
Before: [0..2]  3  2  0
After:  [0..2]  0  2  3
Missing 1

Data set 2:
Before: [0..3]  5  4  3  2
After:  [0..3]  2  3  4  5
Found 4 at 3

Data set 3:
Before: [0..3]  3  1  9  3
After:  [0..3]  1  3  3  9
Missing 8

Data set 4:
Before: [0..4]  3  4  1  9  3
After:  [0..4]  1  3  3  4  9
Missing 8

Data set 5:
Before: [0..8]  3  6  4  9  5  1  9  3  3
After:  [0..8]  1  3  3  3  4  5  6  9  9
Missing 8

Data set 6:
Before: [0..9]  3  6  4  9  6  5  1  9  3  3
After:  [0..9]  1  3  3  3  4  5  6  6  9  9
Missing 8

Data set 7:
Before: [0..15]
  3  6  4  9  6  5  1  9  3  3
  8  2  1  7  3  5
After:  [0..15]
  1  1  2  3  3  3  3  4  5  5
  6  6  7  8  9  9
Found 3 at 4

Data set 8:
Before: [0..25]
  3  6  4  9  6  5  1  9  3  3
  2  7  8  2  0  8  4  4  7  5
  8  2  1  7  3  5
After:  [0..25]
  0  1  1  2  2  2  3  3  3  3
  4  4  4  5  5  5  6  6  7  7
  7  8  8  8  9  9
Found 7 at 19

Data set 9:
Before: [0..95]
  3  6  4  9  6  5  1  9  3  3
  4  0  5  0  7  5  6  3  6  0
  1  2  0  7  3  1  7  6  2  3
  0  4  6  6  9  8  9  5  3  4
  1  9  2  9  2  7  5  9  8  9
  4  7  5  8  7  8  5  8  2  7
  5  8  2  9  8  3  7  6  5  3
  9  1  2  0  3  4  6  5  1  0
  2  7  8  2  0  8  4  4  7  5
  8  2  1  7  3  5
After:  [0..95]
  0  0  0  0  0  0  0  0  1  1
  1  1  1  1  1  2  2  2  2  2
  2  2  2  2  2  3  3  3  3  3
  3  3  3  3  3  3  4  4  4  4
  4  4  4  4  5  5  5  5  5  5
  5  5  5  5  5  5  6  6  6  6
  6  6  6  6  6  7  7  7  7  7
  7  7  7  7  7  7  8  8  8  8
  8  8  8  8  8  8  9  9  9  9
  9  9  9  9  9  9
Found 6 at 57

The code was also tested on some bigger data sets (2097 random entries, among others). Automated check functions are crucial when the data is that big, hence check_sorted() and check_consistency(). The first verifies that the data comes out in sorted order; the second verifies the conservation property: every value in the input appears in the output, as often as it appeared in the input. Sorting is not supposed to add new data or remove pre-existing data.