Joe Celko's SQL for Smarties

Joe Celko

Mentioned 11

An industry consultant shares his most useful tips and tricks for advanced SQL programming to help the working programmer gain performance and work around system deficiencies.

More on

Mentioned in questions and answers.

If you could go back in time and tell yourself to read a specific book at the beginning of your career as a developer, which book would it be?

I expect this list to be varied and to cover a wide range of things.

To search: Use the search box in the upper-right corner. To search the answers of the current question, use inquestion:this. For example:

inquestion:this "Code Complete"

Applying UML and Patterns by Craig Larman.

The title of the book is slightly misleading; it does deal with UML and patterns, but it covers so much more. The subtitle of the book tells you a bit more: An Introduction to Object-Oriented Analysis and Design and Iterative Development.

Masters of doom. As far as motivation and love for your profession go: it won't get any better than what's been described in this book, truthfully inspiring story!

Beginning C# 3.0: An Introduction to Object Oriented Programming

This is the book for those who want to understand the whys and hows of OOP using C# 3.0. You don't want to miss it.

alt text

Mastery: The Keys to Success and Long-Term Fulfillment, by George Leonard

It's about about what mindsets are required to reach mastery in any skill, and why. It's just awesome, and an easy read too.

Pro Spring is a superb introduction to the world of Inversion of Control and Dependency Injection. If you're not aware of these practices and their implications - the balance of topics and technical detail in Pro Spring is excellent. It builds a great case and consequent personal foundation.

Another book I'd suggest would be Robert Martin's Agile Software Development (ASD). Code smells, agile techniques, test driven dev, principles ... a well-written balance of many different programming facets.

More traditional classics would include the infamous GoF Design Patterns, Bertrand Meyer's Object Oriented Software Construction, Booch's Object Oriented Analysis and Design, Scott Meyer's "Effective C++'" series and a lesser known book I enjoyed by Gunderloy, Coder to Developer.

And while books are nice ... don't forget radio!

... let me add one more thing. If you haven't already discovered safari - take a look. It is more addictive than stack overflow :-) I've found that with my google type habits - I need the more expensive subscription so I can look at any book at any time - but I'd recommend the trial to anyone even remotely interested.

(ah yes, a little obj-C today, cocoa tomorrow, patterns? soa? what was that example in that cookbook? What did Steve say in the second edition? Should I buy this book? ... a subscription like this is great if you'd like some continuity and context to what you're googling ...)

Database System Concepts is one of the best books you can read on understanding good database design principles.

alt text

Algorithms in C++ was invaluable to me in learning Big O notation and the ins and outs of the various sort algorithms. This was published before Sedgewick decided he could make more money by dividing it into 5 different books.

C++ FAQs is an amazing book that really shows you what you should and shouldn't be doing in C++. The backward compatibility of C++ leaves a lot of landmines about and this book helps one carefully avoid them while at the same time being a good introduction into OO design and intent.

Here are two I haven't seen mentioned:
I wish I had read "Ruminations on C++" by Koenig and Moo much sooner. That was the book that made OO concepts really click for me.
And I recommend Michael Abrash's "Zen of Code Optimization" for anyone else planning on starting a programming career in the mid 90s.

Perfect Software: And Other Illusions about Testing


Perfect Software: And Other Illusions about Testing by Gerald M. Weinberg

ISBN-10: 0932633692

ISBN-13: 978-0932633699

Rapid Development by McConnell

The most influential programming book for me was Enough Rope to Shoot Yourself in the Foot by Allen Holub.

Cover of the book

O, well, how long ago it was.

I have a few good books that strongly influenced me that I've not seen on this list so far:

The Psychology of Everyday Things by Donald Norman. The general principles of design for other people. This may seem to be mostly good for UI but if you think about it, it has applications almost anywhere there is an interface that someone besides the original developer has to work with; e. g. an API and designing the interface in such a way that other developers form the correct mental model and get appropriate feedback from the API itself.

The Art of Software Testing by Glen Myers. A good, general introduction to testing software; good for programmers to read to help them think like a tester i. e. think of what may go wrong and prepare for it.

By the way, I realize the question was the "Single Most Influential Book" but the discussion seems to have changed to listing good books for developers to read so I hope I can be forgiven for listing two good books rather than just one.

alt text

C++ How to Program It is good for beginner.This is excellent book that full complete with 1500 pages.

Effective C++ and More Effective C++ by Scott Myers.

Inside the C++ object model by Stanley Lippman

I bough this when I was a complete newbie and took me from only knowing that Java existed to a reliable team member in a short time

Not a programming book, but still a very important book every programmer should read:

Orbiting the Giant Hairball by Gordon MacKenzie

The Pragmatic programmer was pretty good. However one that really made an impact when I was starting out was :

Windows 95 System Programming Secrets"

I know - it sounds and looks a bit cheesy on the outside and has probably dated a bit - but this was an awesome explanation of the internals of Win95 based on the Authors (Matt Pietrek) investigations using his own own tools - the code for which came with the book. Bear in mind this was before the whole open source thing and Microsoft was still pretty cagey about releasing documentation of internals - let alone source. There was some quote in there like "If you are working through some problem and hit some sticking point then you need to stop and really look deeply into that piece and really understand how it works". I've found this to be pretty good advice - particularly these days when you often have the source for a library and can go take a look. Its also inspired me to enjoy diving into the internals of how systems work, something that has proven invaluable over the course of my career.

Oh and I'd also throw in effective .net - great internals explanation of .Net from Don Box.

I recently read Dreaming in Code and found it to be an interesting read. Perhaps more so since the day I started reading it Chandler 1.0 was released. Reading about the growing pains and mistakes of a project team of talented people trying to "change the world" gives you a lot to learn from. Also Scott brings up a lot of programmer lore and wisdom in between that's just an entertaining read.

Beautiful Code had one or two things that made me think differently, particularly the chapter on top down operator precedence.


@Juan: I know Juan, I know - but there are some things that can only be learned by actually getting down to the task at hand. Speaking in abstract ideals all day simply makes you into an academic. It's in the application of the abstract that we truly grok the reason for their existence. :P

@Keith: Great mention of "The Inmates are Running the Asylum" by Alan Cooper - an eye opener for certain, any developer that has worked with me since I read that book has heard me mention the ideas it espouses. +1

I found the The Algorithm Design Manual to be a very beneficial read. I also highly recommend Programming Pearls.

This one isnt really a book for the beginning programmer, but if you're looking for SOA design books, then SOA in Practice: The Art of Distributed System Design is for you.

For me it was Design Patterns Explained it provided an 'Oh that's how it works' moment for me in regards to design patterns and has been very useful when teaching design patterns to others.

Code Craft by Pete Goodliffe is a good read!

Code Craft

The first book that made a real impact on me was Mastering Turbo Assembler by Tom Swan.

Other books that have had an impact was Just For Fun by Linus Torvalds and David Diamond and of course The Pragmatic Programmer by Andrew Hunt and David Thomas.

In addition to other people's suggestions, I'd recommend either acquiring a copy of SICP, or reading it online. It's one of the few books that I've read that I feel greatly increased my skill in designing software, particularly in creating good abstraction layers.

A book that is not directly related to programming, but is also a good read for programmers (IMO) is Concrete Mathematics. Most, if not all of the topics in it are useful for programmers to know about, and it does a better job of explaining things than any other math book I've read to date.

For me "Memory as a programming concept in C and C++" really opened my eyes to how memory management really works. If you're a C or C++ developer I consider it a must read. You will defiantly learn something or remember things you might have forgotten along the way.

Agile Software Development with Scrum by Ken Schwaber and Mike Beedle.

I used this book as the starting point to understanding Agile development.

Systemantics: How Systems Work and Especially How They Fail. Get it used cheap. But you might not get the humor until you've worked on a few failed projects.

The beauty of the book is the copyright year.

Probably the most profound takeaway "law" presented in the book:

The Fundamental Failure-Mode Theorem (F.F.T.): Complex systems usually operate in failure mode.

The idea being that there are failing parts in any given piece of software that are masked by failures in other parts or by validations in other parts. See a real-world example at the Therac-25 radiation machine, whose software flaws were masked by hardware failsafes. When the hardware failsafes were removed, the software race condition that had gone undetected all those years resulted in the machine killing 3 people.

It seems most people have already touched on the some very good books. One which really helped me out was Effective C#: 50 Ways to Improve your C#. I'd be remiss if I didn't mention The Tao of Pooh. Philosophy books can be good for the soul, and the code.

Discrete Mathematics For Computer Scientists

Discrete Mathematics For Computer Scientists by J.K. Truss.

While this doesn't teach you programming, it teaches you fundamental mathematics that every programmer should know. You may remember this stuff from university, but really, doing predicate logic will improve you programming skills, you need to learn Set Theory if you want to program using collections.

There really is a lot of interesting information in here that can get you thinking about problems in different ways. It's handy to have, just to pick up once in a while to learn something new.

I saw a review of Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools on a blog talking also about XI-Factory, I read it and I must say this book is a must read. Altough not specifically targetted to programmers, it explains very clearly what is happening in the programming world right now with Model-Driven Architecture and so on..

Solid Code Optimizing the Software Development Life Cycle

Although the book is only 300 pages and favors Microsoft technologies it still offers some good language agnostic tidbits.

Managing Gigabytes is an instant classic for thinking about the heavy lifting of information.

My vote is "How to Think Like a Computer Scientist: Learning With Python" It's available both as a book and as a free e-book.

It really helped me to understand the basics of not just Python but programming in general. Although it uses Python to demonstrate concepts, they apply to most, if not all, programming languages. Also: IT'S FREE!

Object-Oriented Programming in Turbo C++. Not super popular, but it was the one that got me started, and was the first book that really helped me grok what an object was. Read this one waaaay back in high school. It sort of brings a tear to my eye...

My high school math teacher lent me a copy of Are Your Lights Figure Problem that I have re-read many times. It has been invaluable, as a developer, and in life generally.

I'm reading now Agile Software Development, Principles, Patterns and Practices. For those interested in XP and Object-Oriented Design, this is a classic reading.

alt text

Kernighan & Plauger's Elements of Programming Style. It illustrates the difference between gimmicky-clever and elegant-clever.

to get advanced in prolog i like these two books:

The Art of Prolog

The Craft of Prolog

really opens the mind for logic programming and recursion schemes.

Here's an excellent book that is not as widely applauded, but is full of deep insight: Agile Software Development: The Cooperative Game, by Alistair Cockburn.

What's so special about it? Well, clearly everyone has heard the term "Agile", and it seems most are believers these days. Whether you believe or not, though, there are some deep principles behind why the Agile movement exists. This book uncovers and articulates these principles in a precise, scientific way. Some of the principles are (btw, these are my words, not Alistair's):

  1. The hardest thing about team software development is getting everyone's brains to have the same understanding. We are building huge, elaborate, complex systems which are invisible in the tangible world. The better you are at getting more peoples' brains to share deeper understanding, the more effective your team will be at software development. This is the underlying reason that pair programming makes sense. Most people dismiss it (and I did too initially), but with this principle in mind I highly recommend that you give it another shot. You wind up with TWO people who deeply understand the subsystem you just built ... there aren't many other ways to get such a deep information transfer so quickly. It is like a Vulcan mind meld.
  2. You don't always need words to communicate deep understanding quickly. And a corollary: too many words, and you exceed the listener/reader's capacity, meaning the understanding transfer you're attempting does not happen. Consider that children learn how to speak language by being "immersed" and "absorbing". Not just language either ... he gives the example of some kids playing with trains on the floor. Along comes another kid who has never even SEEN a train before ... but by watching the other kids, he picks up the gist of the game and plays right along. This happens all the time between humans. This along with the corollary about too many words helps you see how misguided it was in the old "waterfall" days to try to write 700 page detailed requirements specifications.

There is so much more in there too. I'll shut up now, but I HIGHLY recommend this book!

alt text

The Back of the Napkin, by Dan Roam.

The Back of the Napkin

A great book about visual thinking techniques. There is also an expanded edition now. I can't speak to that version, as I do not own it; yet.

Agile Software Development by Alistair Cockburn

Do users ever touch your code? If you're not doing solely back-end work, I recommend About Face: The Essentials of User Interface Design — now in its third edition (linked). I used to think my users were stupid because they didn't "get" my interfaces. I was, of course, wrong. About Face turned me around.

"Writing Solid Code: Microsoft's Techniques for Developing Bug-Free C Programs (Microsoft Programming Series)" by Steve MacGuire.

Interesting what a large proportion the books mentioned here are C/C++ books.

While not strictly a software development book, I would highly recommend that Don't Make me Think! be considered in this list.

As so many people have listed Head First Design Patterns, which I agree is a very good book, I would like to see if so many people aware of a title called Design Patterns Explained: A New Perspective on Object-Oriented Design.

This title deals with design patterns excellently. The first half of the book is very accessible and the remaining chapters require only a firm grasp of the content already covered The reason I feel the second half of the book is less accessible is that it covers patterns that I, as a young developer admittedly lacking in experience, have not used much.

This title also introduces the concept behind design patterns, covering Christopher Alexander's initial work in architecture to the GoF first implementing documenting patterns in SmallTalk.

I think that anyone who enjoyed Head First Design Patterns but still finds the GoF very dry, should look into Design Patterns Explained as a much more readable (although not quite as comprehensive) alternative.

Even though i've never programmed a game this book helped me understand a lot of things in a fun way.

How influential a book is often depends on the reader and where they were in their career when they read the book. I have to give a shout-out to Head First Design Patterns. Great book and the very creative way it's written should be used as an example for other tech book writers. I.e. it's written in order to facilitate learning and internalizing the concepts.

Head First Design Patterns

97 Things Every Programmer Should Know

alt text

This book pools together the collective experiences of some of the world's best programmers. It is a must read.

Extreme Programming Explained: Embrace Change by Kent Beck. While I don't advocate a hardcore XP-or-the-highway take on software development, I wish I had been introduced to the principles in this book much earlier in my career. Unit testing, refactoring, simplicity, continuous integration, cost/time/quality/scope - these changed the way I looked at development. Before Agile, it was all about the debugger and fear of change requests. After Agile, those demons did not loom as large.

One of my personal favorites is Hacker's Delight, because it was as much fun to read as it was educational.

I hope the second edition will be released soon!

You.Next(): Move Your Software Development Career to the Leadership Track ~ Michael C. Finley (Author), Honza Fedák (Author) link text

I've been arounda while, so most books that I have found influential don't necessarily apply today. I do believe it is universally important to understand the platform that you are developing for (both hardware and OS). I also think it's important to learn from other peoples mistakes. So two books I would recommend are:

Computing Calamities and In Search of Stupidity: Over Twenty Years of High Tech Marketing Disasters

Working Effectively with Legacy Code is a really amazing book that goes into great detail about how to properly unit test your code and what the true benefit of it is. It really opened my eyes.

Looking at a job descriptions where "advanced SQL" is a requirement. I can write basic queries as well as anyone, and have worked with MySQL databases in a professional setting, but what would I be getting into with these jobs, if I were to be hired? What are examples of advanced SQL and where am I along the scale of SQL noob to SQL master?

Check out SQL For Smarties. I thought I was pretty good with SQL too, until I read that book... Goes into tons of depth, talks about things I've not seen elsewhere (I.E. difference between 3'rd and 4'th normal form, Boyce Codd Normal Form, etc)...

So lets say that you want to learn some stuff about database internals. What's the best source code to look at? the best books to buy?

I was talking about this with a buddy the other day and he recommended:
Art of Computer Programming, Volume 3: Sorting and Searching

What other books would help me learn about all the File IO and memory issues, pages, locking, etc.... ?

A colleague and I got a great deal of information out of Database in Depth: Relational Theory for Practitioners Very low level stuff but it sounds like that is the sort of thing you are looking for.

Take a look at Database Systems: The Complete Book by by Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer D. Widom. It is specifically about the internals of the DBMS.

The answer by SquareCog also contains sensible suggestions; I've not looked at the two books mentioned (though the Stonebreaker "Architecture" book is only 136 pages according to Amazon, which seems a tad lightweight).

Textbook: Database Management Systems by Ramakrishnan and Gehrke.

Or: Architecture of a Database System by Hellerstein, Stonebraker, and Hamilton.

Production Code: PostgreSQL

(I like the PG code better than SQLite , it's far more complete and, I think, better organized. SQLite is awesome for what it does, but there is a lot it doesn't take on).

Extra Credit: Readings in Database Systems, 4th edition edited by Hellerstein.

Not everybody likes his style, but I find that Joe Celko does a fine job of explaining the set-based logic that drives SQL databases. If you already have a little SQL experience under your belt, you should read SQL for Smarties.

In depth information about internals is database specific, here's a source on SQL Server 2008:

My SQL skills are rather limited and since I find myself working with a DB (Oracle) a lot lately I would like to get beyond the basic select statements.

I want to write queries that do things like get data from multiple tables, sum quantities, compare dates, group, filter etc.

What sites can you recommend for getting SQL reporting skills to a more advanced level?

Pick up Joe Celcko's SQL For Smarties. That's one of the definitive take-your-sql-to-the-next-level books. Otherwise, just keep writing queries.

Make sure you understand joins. Since the beginning of time, my SQL methodology has always been row-count driven - In other words, as I write a complicated query, I'm always #1 making sure it returns the correct number of rows. If your rowcount is correct, then your sums/groups/aggregates will all be correct. And they are VERY easy to mess up.

Make sure you understand the data. Make sure you understand keys and uniqueness so that you can enforce your joins.

You can also read for a lot of really cool SQL trickery. Laurent Schneider is also very cutting-edge sql-wise. I wouldn't be half the DBA/Developer I am today had I not set asktom as my home page for the last 5-6 years.

Finally, make sure you understand set based operations. Think of the result set as a whole, not just a collection of rows. It'll click as you do it. This relates back to the row-count-driven methodology.

So far I have encountered adjacency list, nested sets and nested intervals as models for storing tree structures in a database. I know these well enough and have migrated trees from one to another.

What are other popular models? What are their characteristics? What are good resources (books, web, etc) on this topic?

I'm not only looking for db storage but would like to expand my knowledge on trees in general. For example, I understand that nested sets/intervals are especially favorable for relational database storage and have asked myself, are they actually a bad choice in other contexts?

The seminal resource for this are chapters 28-30 of SQL for Smarties.

(I've recommended this book so much I figure Celko owes me royalties by now!)

I'm looking for something similar to SQL for Smarties by Joe Celko. More specifically I'm interested in .Net 3.5 or 4.0.

I have a loginAudit table and I am trying to get a count for all logins for each day. What I'd like to do is have days where there are no logins return their day and a login count of 0. Currently no row is returned for days with no logins.

Could be this isn't possible and have to fill in empty groups in the app after query results returned?

  SELECT DATEADD(day, DATEDIFF(day,0,LoginAudit.LoginDateTime), 0) as LoginDate,  
         COUNT(DISTINCT LoginAudit.LoginAuditID) AS Logins  
    FROM LoginAudit 
GROUP BY DATEADD(day, DATEDIFF(day,0,LoginAudit.LoginDateTime), 0)

Essentially what you're asking is to join your table to a "table" of dates. The date table would have no gaps and you would group on the date value. So how to create a table of dates?

In SQL for Smarties, it's suggested that you keep a table of integers around for cases when you need a gapless sequence to join to. Then you can select whatever sequence you need by joining your table to it.

So if you had an integer table with values going as many days back from NOW() as required you might do the following:

       COUNT(DISTINCT LoginAudit.LoginAuditID) AS logins
FROM i LEFT JOIN dual ON (DATE_SUB(NOW(), INTERVAL i.intvalue DAY)= day)

ETA, for mysql:

//create an integer table

create table i(i integer not null primary key);
insert into i values (0),(1),(2) ... (9);

if I need 0-99 consecutive numbers:

SELECT 10*t.i + u.i AS number
  FROM i AS u
  i AS t
ORDER BY number;

if I need consecutive dates:

SELECT date_sub(curdate(), interval (10*t.i + u.i) DAY) as thedate 
  FROM i AS u
  i AS t
ORDER BY thedate;

I have the following table

User ID    Start Date   End Date
John Doe   Mar 11 2011  May 28 2011
Robret S   Mar 21 2011  Jun 29 2011
Tina T     Feb 01 2011  August 20 2011

I want to show how many people I have available for the past 6 months, even if the month has no people. How can this be possible. I know I have to do grouping and use distinct count.

Expected Output:

February  = 1 Resource
March     = 3 Resources
April     = 3 Resources
May       = 3 Resources
June      = 2 Resources
July      = 1 Resource
August    = 1 Resource

I don't think you can do what you want using a "simple" select statement (even using GROUPing etc.) The following is off the top of my head, so you'll have to experiment with it a little, and preferably read Joe Celko's excellent SQL for Smarties book.

You need to create a second table that contains all of your months (start/end dates). You only need one table for all types of similar queries, and it must contain all the months in the date ranges your interested in querying:

CREATE TABLE months (id, start DATE, end DATE);

INSERT INTO months (id, start, end) 
          values ( (1, 2011-01-01, 2011-01-31), 
                   (2, 2011-02-01, 2011-02-28), ...);

You then LEFT OUTER JOIN from your user table to this month table. That will give you a row for each user for each month they were available, which you can GROUP as required:

  FROM months LEFT OUTER JOIN users 
         ON user.start_date < months.end 
        AND user.end_date > months.start 

Hope that helps.

Given a tree stored as a relation:


How can I get all the descendants of a given node? For example, for 1, I want (1, 2, 3, 4, 5, 6) and for 3 I want (3, 4, 5), and for 7 I want (7, 8, 9).

I'm doing this from a script (Python, but that doesn't matter) so I can do something like:

   nodes = SELECT child FROM relation WHERE parent=p
   for each node in nodes
        nodes += children(node)
   return nodes

nodes = children(root)

But if there's some funky SQL that lets me do that in one query, that would be awesome.

If you have the ability to change the table definition then using a nested set rather than a direct parent link makes this problem much easier to solve. Joe Celko's SQL for Smarties covers this in some detail.

I'm trying to store a hierarchical tree in SQL. In my case, the same son can have many fathers (the tree represents a VLSI design where the same cells can be used several times in different designs). All models I've found on the web describe the employee/manager relationship where each employee has one manager. In my case, the number of fathers can be quite large and if I try to store all of them in a table field, they can exceed the character limit of the field. Can anyone suggest a better method for storing this tree ?



You find some clever methods for tree handling in the book of Joe Celko:

...however, I don't know if it covers your problem

I want to learn SQL/TSQL queries. Can somebody tell me how to learn it? I have learned syntax and some basics too, but I'm not finding any good stuff to make practice and learn it. Does anybody have any ideas?