Database Design for Mere Mortals

Michael James Hernandez

Mentioned 17

“This book takes the somewhat daunting process of database design and breaks it into completely manageable and understandable components. Mike’s approach whilst simple is completely professional, and I can recommend this book to any novice database designer.” –Sandra Barker, Lecturer, University of South Australia, Australia “Databases are a critical infrastructure technology for information systems and today’s business. Mike Hernandez has written a literate explanation of database technology–a topic that is intricate and often obscure. If you design databases yourself, this book will educate you about pitfalls and show you what to do. If you purchase products that use a database, the book explains the technology so that you can understand what the vendor is doing and assess their products better.” –Michael Blaha, consultant and trainer, author of A Manager’s Guide to Database Technology “If you told me that Mike Hernandez could improve on the first edition of Database Design for Mere Mortals I wouldn’t have believed you, but he did! The second edition is packed with more real-world examples, detailed explanations, and even includes database-design tools on the CD-ROM! This is a must-read for anyone who is even remotely interested in relational database design, from the individual who is called upon occasionally to create a useful tool at work, to the seasoned professional who wants to brush up on the fundamentals. Simply put, if you want to do it right, read this book!” –Matt Greer, Process Control Development, The Dow Chemical Company “Mike’s approach to database design is totally common-sense based, yet he’s adhered to all the rules of good relational database design. I use Mike’s books in my starter database-design class, and I recommend his books to anyone who’s interested in learning how to design databases or how to write SQL queries.” –Michelle Poolet, President, MVDS, Inc. “Slapping together sophisticated applications with poorly designed data will hurt you just as much now as when Mike wrote his first edition, perhaps even more. Whether you’re just getting started developing with data or are a seasoned pro; whether you've read Mike’s previous book or this is your first; whether you're happier letting someone else design your data or you love doing it yourself–this is the book for you. Mike’s ability to explain these concepts in a way that’s not only clear, but fun, continues to amaze me.” –From the Foreword by Ken Getz, MCW Technologies, coauthor ASP.NET Developer's JumpStart “The first edition of Mike Hernandez’s book Database Design for Mere Mortals was one of the few books that survived the cut when I moved my office to smaller quarters. The second edition expands and improves on the original in so many ways. It is not only a good, clear read, but contains a remarkable quantity of clear, concise thinking on a very complex subject. It’s a must for anyone interested in the subject of database design.” –Malcolm C. Rubel, Performance Dynamics Associates “Mike’s excellent guide to relational database design deserves a second edition. His book is an essential tool for fledgling Microsoft Access and other desktop database developers, as well as for client/server pros. I recommend it highly to all my readers.” –Roger Jennings, author of Special Edition Using Access 2002 “There are no silver bullets! Database technology has advanced dramatically, the newest crop of database servers perform operations faster than anyone could have imagined six years ago, but none of these technological advances will help fix a bad database design, or capture data that you forgot to include! Database Design for Mere Mortals™, Second Edition, helps you design your database right in the first place!” –Matt Nunn, Product Manager, SQL Server, Microsoft Corporation “When my brother started his professional career as a developer, I gave him Mike’s book to help him understand database concepts and make real-world application of database technology. When I need a refresher on the finer points of database design, this is the book I pick up. I do not think that there is a better testimony to the value of a book than that it gets used. For this reason I have wholeheartedly recommended to my peers and students that they utilize this book in their day-to-day development tasks.” –Chris Kunicki, Senior Consultant, OfficeZealot.com “Mike has always had an incredible knack for taking the most complex topics, breaking them down, and explaining them so that anyone can ‘get it.’ He has honed and polished his first very, very good edition and made it even better. If you're just starting out building database applications, this book is a must-read cover to cover. Expert designers will find Mike’s approach fresh and enlightening and a source of great material for training others.” –John Viescas, President, Viescas Consulting, Inc., author of Running Microsoft Access 2000 and coauthor of SQL Queries for Mere Mortals “Whether you need to learn about relational database design in general, design a relational database, understand relational database terminology, or learn best practices for implementing a relational database, Database Design for Mere Mortals™, Second Edition, is an indispensable book that you’ll refer to often. With his many years of real-world experience designing relational databases, Michael shows you how to analyze and improve existing databases, implement keys, define table relationships and business rules, and create data views, resulting in data integrity, uniform access to data, and reduced data-entry errors.” –Paul Cornell, Site Editor, MSDN Office Developer Center Sound database design can save hours of development time and ensure functionality and reliability. Database Design for Mere Mortals™, Second Edition, is a straightforward, platform-independent tutorial on the basic principles of relational database design. It provides a commonsense design methodology for developing databases that work. Database design expert Michael J. Hernandez has expanded his best-selling first edition, maintaining its hands-on approach and accessibility while updating its coverage and including even more examples and illustrations. This edition features a CD-ROM that includes diagrams of sample databases, as well as design guidelines, documentation forms, and examples of the database design process. This book will give you the knowledge and tools you need to create efficient and effective relational databases.

More on Amazon.com

Mentioned in questions and answers.

I am pretty well versed with SQL Server, MySQL, Oracle etc but putting these Database products aside, is there a resource that will help me design relational databases well? Is there something like patterns or best practices for database design?

I have seen a few times that database is often not scalable; people have personal preferences with keeping columns like isChecked column which is boolean in nature but stored as Char(1) with values like 'Y' and 'N' instead of 0 and 1 which to me sounds better. Ways not to commit common mistakes while doing database design?

Links to books or articles will be highly appreciated.

Thanks in advance.

The best book I've ever read in regards to database design is "Database Design for Mere Mortals" by Michael J Hernandez. The name sounds like a beginners book, but people at any level could gain knowledge from it. It's also platform independent as it deals with looking at the data itself and how to properly organize it - not the technology being used.

He also wrote a book on writing queries named "SQL Queries for Mere Mortals" that I've heard (haven't read this one myself yet) is quite good.

Database Design for Mere Mortals

I'm looking for a book/site/tutorial on best practices for relational database design, tuning for performance etc. It turns out this kind of resource is a bit difficult to find; there's a lot of "here's normalization, here's ER diagrams, have at it," but not much in the way of real examples. Anyone have any ideas?

What are your steps and guidelines for database normalization? If given a bunch of people with good computer and logic skills, how do you explain to them how to normalize a database design? What is your thought process for taking a list of attributes (fields) and turning that list into a database in 1NF? 2NF? 3NF?

Check out some of these resources:

or get your folks the best book I've found for learning database design:

Database Design for Mere Mortals(R): A Hands-On Guide to Relational Database Design (2nd Edition) by Mike Hernandez

enter image description here

Database design question for y'all. I have a form (like, the paper kind) that has several entry points for data. This form has changed, and is expected to change over years. It is being turned into a computer app, so that we can, among other things, quit wasting paper. (And minor things, like have all the data in one central store that can be queried, etc.) I'd like to store all of the forms data in a database, and have it be pretty agnostic as to the changes.

Originally, I was just considering each field to be a string -- and I had a table something like this:

FormId int (FK)
FieldName nvarchar(64)
FieldValue nvarchar(128)

...something like that. It was actually a bit more 3NFy in that FieldName was in another table, associated with an artificial key, so that the field names weren't duplicated all over the place.

However, I'd like to extend this to numeric and drop-down data. I could just store numeric data as strings, but that seems like a pretty crappy idea. Same with drop downs.

I could stop using a table, and actually use columns on the main form table (the one that FormId above references), but that means adding a column for each new item as they come along, and older forms would just be null. (And, unless I stored it, I wouldn't know when that column was created. With the string table above, it's implicit.)

I could extend the table above to something like:

FormId int (FK)
FieldName nvarchar(64)
FieldValueType int -- enum as to which of the columns below are valid (or just let nulls imply that)
FieldValue nvarchar(128)
FieldValueInt int

Combos would have to be in a OTLT (one true lookup table), which I have reservations about, but perhaps it's needed here?

Any advice on StackOverflow? I'm using MSSQL, but this is really a more general question.

Use Nulls. Proper database design is a complicated subject; you may do well to pick up a good reference and do some research on the whole thing (I gather this is a good book on the topic). In general, it sounds like you would be well served by starting with a single table that encapsulates all the fields in your form, and then putting it through the normalization process. And yes, use nulls and do NOT use an int to enumerate which columns are set to valid values; that is exactly what nulls are for.

My questions is regarding Database Modeling. I tried to look for this question amongst other Database Designing questions on SO but haven't found it and so here am asking about:

What are the general guidelines and best practices to keep in mind while designing database for an application ?

What are the best resources/books/University Lectures available on Database Design Concepts ?

Thanks.

While I agree with others that your question right now is much too broad and can't really be answered (except for the "it depends" approach :-)), there is one book I would wholeheartedly recommend for anyone beginning database design in general:

Michael Hernandez: Database Design for Mere Mortals(R): A Hands-On Guide to Relational Database Design

alt text

It's a really hands-on, no-frills, down to earth book and introduces all the major and important concepts in a very understandable, very approachable fashion. Well written, interesting, very sound and useful - highly recommended!

Marc

I'm dabbling in a bit of php coding with databases and am getting a bit stuck with relationships and the like. Can anybody reccomend some books, online or real, that would be a good start for someone new(ish) to php and mysql databases?

I'm very new to SQL Server. I got a task from my boss where it involves SQL Server. He gave me a little introduction about it but I would like to know if there are other ways on how to deal with it.

I have a little background on Access but not that so intensive.

I would like to ask your opinion, where to start if you're dealing with SQL Server data bank.

Would appreciate your ideas.

If you're looking for books on the subject, I really enjoyed and learned a great deal from the two books by Michael Hernandez, SQL Queries for Mere Mortals and Database Design for Mere Mortals. Both highly recommended!

Marc

I've written an Android game and have reached the point where I need to store the game state on my game server. I am pretty much a novice at network programming, and am not sure what balance of security/technology is right for the above mentioned application. My current approach which I've got up and running over the last 2 days is as follows:

  • RESTful web service written in PHP on the server.
  • Use of .htaccess to enable nice PUT / GET address commands.
  • MySQL database with user name, password, and data fields, e.g. in game purchased items, in game currency.
  • Salted & hashed passwords.

Since I'm totally new to this the first part of my question would be to ask whether this approach is acceptable, i.e. is it secure for the purpose I have, or perhaps it's overkill; is there anything missing in general with respect to security?

The seconds part of my question concerns the MySQL database. Should I store id, username, and salted/hashed passwords in a users database, and then store user data, e.g. in app purchased items, in game currency, in a separate data database, which is indexed by the id in the user database?

Thanks for any help regarding this.

Update

In addition, I will be implementing HTTPS so communication between client/server is secure, and authorization tokens so that I know my server is communicating with whom it should be (This seems a good tutorial)

NB - using authorization tokens means I probably don't need to store/deal with username and passwords directly - will see.

Security:

You have some elements of best practices, but there are some things missing.

Database design:

You seem to be using the word "database" where I would expect you to say "table." Yes, you can and should have multiple tables on your MySQL server. The application I support has over 120 tables, and there are certainly many applications with many more tables. It depends on the complexity of the data you need to store.

Here's a good book to start with: Database Design for Mere Mortals

If you want to get more into the theory behind the practice, I enjoyed SQL and Relational Theory: How to Write Accurate SQL Code 3rd Edition.

And I have to plug my own book: SQL Antipatterns: Avoiding the Pitfalls of Database Programming.

want to learn details about how to do a better database design any suggested recommended for this topic? thx very much

There's a vendor-neutral intro course online:

And here are two book recommendations:

Well, I am going to design a database structure, but I don't know where to start. Can you guys help me by answering the following questions? My database is going to store about hundreds of thousands rows/data and serving 1000 ++ users at the same time. It needs to be well designed.

Design mysql database structure questions:

Q1: I know this is a stupid question, but how do you guys design a mysql database structure? Must I study normalization first? I suck at this topic :(

Q2: Is there any tips or techniques when design a database structure?

Q3: What are the important things in designing mysql database structure?

Q4: Okay, another stupid question, what are the differences using mysql database and xml?

Q5: Is there any downside/disadvantages using mysql database?

FYI: I am a new mysql developer.

It's commendable that you're willing to jump in to this project, but in my experience trying to do a real project at the same time as you're learning the basics of any technical subject is likely to result in a project that doesn't work. And it takes ten times as long.

So I would recommend spending some time educating yourself first, by doing some reading and some experimentation. Design a practice database and the code to use it, but not the database you need for your project.

Here's a well-regarded book that might help you:

After you are more comfortable with the basics of database design, then you can start your real project.

Basically my job is to develop web applications using a database as backend. What I have been doing till now is,
Basded on the requirement of the client,

  • I draw a basic sketch of what the tables are ,how they look like
  • fields in those tables and some one-to-one or many-to-one or many-to-many relations

Although I am not perfect at these things, I try to figure out how the relations should be from my past projects that I worked on. But there are still some doubts about this in my mind.

If the client asks that he wants a particular data, I try to achieve it either through a direct SQL query or thought the scritp (in most cases PHP), if I am unable to figure out a query at all for that particular request.

Now, here comes my question.

Based on the relationships that I figured out while developing tables, are there any limitations to what a client can ask? What I mean to say by this is, the client will ask that he wants list all the indidual products, their counts, associated category, all the counts of
category, the products in each category and the their prices, sum of all the category prices and the total prices so on so forth.

This is just an example of one request to explain my situation.

Now, if there is any request that can potentially take longer time for the exection, can the developer satisfy this request by breaking down the request? Do I need to tell him why is this break down necessary? What if he feels that I am not capable of doing it in a single shot?

Is every report that he asks for need to be in single query? or will there be any need to itake the help of PHP to proces one loop and based on the values that I get, I put some conditions to apply rules that the client wants?

What is the better way to do this kind of job?

Any views?

Thanks.

It sounds like you just need to improve your data design skills. A properly designed/ normalized database, as astander suggests, won't run into the issues you're worried about. But it takes a lot of time to learn the Right Way to design a database if you just keep learning from mistakes. When I was starting out as a web dev, I found Database Design for Mere Mortals a huge help in showing you how to avoid painting yourself into corners. There's a companion book about how to write good queries on your databases as well. The two books won't teach you everything there is to know, but they give you a great foundation.

I am currently working on an application which references the Microsoft Business Contact Manager SQL Server database add-on for Outlook. The main Contact table in the db has several sub tables which have identical keys to the main table and a small number of columns; for example, an email address table which has 3 columns: contactID, EmailAddress, and EmailDisplayAs. Because of this approach, you need to use a view with multiple outer joins just to see all the columns which make up each Contact record. I have never seen a database with this kind of structure; it seems messy to say the least. I would be interested to see comments on why one would take this approach.

This sounds like basic normalization to me. Contact is a parent entity, and has a number of attributes that make up one-to-many relationships. Novice designers sometimes store these in the same table, and you get columns like phone1, phone2, phone3, email1, email2, etc.

Normalization allows for efficient storage of information, and provides a structure whereby you can get back to the multiple attribute record (the view in the question).

So I would expect to see a contact table(with a primary key of contactId), and a phone table(with a foreign key of contactId) and an email table(with a foreign key of contactId).

If you are not familiar with normalization, primary keys and foreign keys I recommend you pick up a book like Database Design for Mere Mortals.

I'm currently writing a little django app to get some practice with the framework. The app lets user log in, write entries and see a list of their entries. How should I assign the entries to the user that created them? Is it a good idea to create a table for every new user and save the entries there or should I just add an additional field in the entry model (e.g. 'created_by') and filter the items to be displayed in the list accordingly?

One thing thats need to be considered, is that there should be absolutely no way that a user sees entries other than his own (e.g someones uses the app to write a diary). Is this given with both ways?

I've newer really worked with databases before, so I would appreciate an explanation why one way is better than the other.

Like the others have said, even trying to create a table per user in django would be difficult.

Generally when working with relational databases, if you try the one table per user, you'll only try it once. :) Imagine if you started scaling to a few hundred users and now you have a requirement to get a count of all entries. You want to do 300 queries for that? Maybe a massive query with 300 UNION ALL statements? Not only will performance suffer, but you'll get tired of maintaining the app as well.

Because you haven't worked with databases much, you might want to read up a little on the relational data model a little bit. You could pick up a decent book on it, Database Design For Mere Mortals or maybe a tutorial online like: http://www.15seconds.com/issue/020522.htm. Get some background on RDBMS design and then come back to the django tutorial and it will make sense.

Some people tell me null is evil within a relational database is this true or just subjective?

Answer:

I believe in black and white answers... Yes, Null values in a relational database are bad. Some individuals (including senior level DBAs) will argue that some NULL values are fine for this that and other, but this is like saying some water in a gas tank is OK. Yes, technically a car will still run, albeit poorly, with some water in the tank, but wouldn't you avoid it if you could? The more NULL values in a database the more SQL you will write for your reports.

If data is optional then the business model obviously does not "need" it, and so you're better off either leaving it out or enforcing it so that it must be captured.

Recomendation

For individuals looking for answers to this and similar questions, I cannot recommend Database Design for Mere Mortals: second edition strong enough. It was recommended to me by a software architect and I have yet to meet anyone who recommends anything else. I actually purchased two copies just in case.

There is also a newer Database Design for Mere Mortals: third edition by the same author. I have not purchased this yet, so I can't say better or worse.

Personal Note: As far as this Question & answer being "primarily opinion-based" goes, please bear in mind that an opinion without logical evidence to the contrary is not opinion at all- It's a fact. If another SSRS or Crystal Reports Analyst/Developer comes in here and says "NULL values are no problem at all" then I will remove my answer.

A book that is an introduction to databases and geared towards web developers.

I really liked this book, was clear, concise and informative.

http://www.amazon.com/PHP-MySQL-Web-Development-4th/dp/0672329166

Naturally, you want The Manga Guide to Databases. :D

I am intrested to know a little bit more about databases then i currently know. I know how to setup a database backend for any webapp that i happen to be creating but that is all. For example if i was creating three different apps i would simply create three different databases and then configure each database for the particular app. This is all simple knowledge and i would now like to have a deeper understanding of how databases actually work.

Lets say that I developed an application for example that needed lot of space and processing power.This database would then have to be spread over numerous machines. How exactly would a database be spread across numerous machines and still be able to write records and then retreieve them. Would each table get their own machine and what software is needed to make sure that the different machines have all performed their transactions successfully.

As you can see i am quite a database ignoramus lol.

Any help in clearing this up would be greatly appreciated.

Database Design for Mere Mortals

This is the best book about the subject if you don't have any experience with databases. It's got historical background and practical examples. Most books often skip the historical stuff because they assume you know what a db is, or it doesn't matter, and jump right to the practical. This book gives you the complete picture.

I don't know what RDBMS you're using but I have two book suggestions.

For theory (which should come first, in my opinion): Database in Depth: Relational Theory for Practitioners

For implementation: High Performance MySQL: Optimization, Backups, Replication, and More

I own both these books and they are both pretty great, especially the first one.

I want to do an ECommerce app. I want to add properties to categories and search them dynamicly category within properties.

I mean:

  CATEGORIES             PROPERTIES
| Computer properties  | MotherBoard, GraphicCart etc.
| Phone properties     | Cam, Connecting properties, bluetoot etc.
| Digital Camera       | Zoom option , Lcd, Battery etc.

I have allready done this wiht relational tables. This is the way I think:

enter image description here

So, my question is this way true? Or We have other way?

You have asked a question that cannot be answered by one simple little post on SO. There are some basic design concepts that are fundamental to good database design. There is a lot of thought that goes into this type of activity.

I recommend you read the book Database Design for Mere Mortals by Michael Hernandez ISBN: 0-201-69471-9. Amazon Listing I noticed he has a second edition.

Link to Second edition

He walks you through the entire process (from start to finish) of designing a database. I recommend you start with this book.

You have to learn to look at things in groups or chunks. Database design has simple building blocks just like programming does. If you gain a thorough understanding of these simple building blocks you can tackle any database design.

The simpler you make things the better. A database is nothing more than a place where you put data into cubbie holes. Start by identifying what these cubbie holes are and what kind of stuff you want in them.

You are never going to create the perfect database design the first time you try. This is a fact. Your design will go through several refinements during the process. Sometimes things won't seem apparent until you start entering data, and then you have an ahh ha moment.