Facts and Fallacies of Software Engineering

Robert L. Glass


Regarding the controversial and thought-provoking assessments in this handbook, many software professionals might disagree with the author, but all will embrace the debate. Glass identifies many of the key problems hampering success in this field. Each fact is supported by insightful discussion and detailed references.


Mentioned in questions and answers.

I think back to Joel Spolsky's article about never rewriting code from scratch. To sum up his argument: the code doesn't get rusty, and while it may not look pretty after many maintenance releases, if it works, it works. The end user doesn't care how pretty the code is.

You can read the article here: Things You Should Never Do

I've recently taken over a project, and after looking through the code, it's pretty awful. It immediately reminded me of prototypes I had built before, which I had explicitly stated should not be used in any production environment. But of course, people don't listen.

The code is built as a website, has no separation of concerns, no unit testing, and code duplication everywhere. No Data layer, no real business logic, unless you count a bunch of classes in App_Code.

I've made the recommendation to the stakeholders that, while we should keep the existing code and do bug fix releases and some minor feature releases, we should start rewriting it immediately with Test Driven Development in mind and with clear separation of concerns. I'm thinking of going the ASP.NET MVC route.

My only concern, of course, is the length of time it might take to rewrite from scratch. It's not entirely complicated, a pretty run-of-the-mill web application with membership, etc.

Have any of you come across a similar problem? Any particular steps you took?

Thanks a bunch!

UPDATE:

So... what did I end up doing? I took Matt's approach and decided to refactor many areas.

  • Since App_Code was getting rather large and thus slowing down the build time, I removed many of the classes and converted them into a Class Library.
  • I created a very simple Data Access Layer, which contained all of the ADO calls, and created a SqlHelper object to execute these calls (the idea is sketched below).
  • I implemented a cleaner logging solution, which is much more concise.
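
For illustration, here is a minimal sketch of that SqlHelper idea, assuming a C++/SQLite stand-in for the project's actual ADO.NET code (the class name comes from the post; everything else here is invented). The point is the shape, not the platform: the connection lives in one object, and every call site gets error checking for free.

    #include <sqlite3.h>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for the post's SqlHelper (the original used
    // ADO.NET; SQLite's C API is used here): one object owns the
    // connection, so connection handling and error checking are not
    // duplicated at each call site.
    class SqlHelper {
    public:
        explicit SqlHelper(const std::string& path) {
            if (sqlite3_open(path.c_str(), &db_) != SQLITE_OK)
                throw std::runtime_error(sqlite3_errmsg(db_));
        }
        ~SqlHelper() { sqlite3_close(db_); }
        SqlHelper(const SqlHelper&) = delete;
        SqlHelper& operator=(const SqlHelper&) = delete;

        // Execute a statement that returns no rows (INSERT/UPDATE/DDL).
        void execute(const std::string& sql) {
            char* err = nullptr;
            if (sqlite3_exec(db_, sql.c_str(), nullptr, nullptr, &err) != SQLITE_OK) {
                std::string msg = err ? err : "unknown error";
                sqlite3_free(err);
                throw std::runtime_error(msg);
            }
        }

        // Run a query and collect the first column of each row as text.
        std::vector<std::string> queryStrings(const std::string& sql) {
            sqlite3_stmt* stmt = nullptr;
            if (sqlite3_prepare_v2(db_, sql.c_str(), -1, &stmt, nullptr) != SQLITE_OK)
                throw std::runtime_error(sqlite3_errmsg(db_));
            std::vector<std::string> rows;
            while (sqlite3_step(stmt) == SQLITE_ROW) {
                const unsigned char* text = sqlite3_column_text(stmt, 0);
                rows.emplace_back(text ? reinterpret_cast<const char*>(text) : "");
            }
            sqlite3_finalize(stmt);
            return rows;
        }

    private:
        sqlite3* db_ = nullptr;
    };

Centralizing calls this way is what removes the duplication described above: a fix to connection or error handling lands in one place.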

While I no longer work on this project [funding, politics, blah blah], I think it gave me enormous insight into how badly some projects can be written, and the steps one developer can take to make things a lot cleaner, more readable, and just flat-out better with small, incremental steps over time.

Thanks again to everyone who commented.

The book Facts and Fallacies of Software Engineering states this fact: "Modification of reused code is particularly error-prone. If more than 20 to 25 percent of a component is to be revised, it is more efficient and effective to rewrite it from scratch." The numbers come from statistical studies performed on the subject. I think the numbers may vary with the quality of the code base, so in your case, going by this rule, it seems more efficient and effective to rewrite from scratch.

My firm just got its first large-scale development project inquiry and I would like to use an Agile process. The client has a vision for the application but openly admits to having very few requirements and recognizes that we will have to charge by the hour. Because of this, I've all but sold him on an Agile approach.

The problem is that he wants a figure to budget around. I've read a number of articles that pretty much advocate against giving an up-front estimate, because the client will budget for that number, and even as requirements change, the number in their head and on the books doesn't.

I've read there are a number of ways to factor pricing into the contract, but almost all of them (save one) incorporate an up-front number. This just seems to violate the entire set of principles of Agile development.

So my question is, if you're an Agile shop, how do you manage to circumvent this Catch-22 of Agile development?

This is a really tough issue. See what Robert Glass has to say on the subject in his book "Facts and Fallacies of Software Engineering". (Note that Amazon has books available new for less than they're available second-hand! [as of 2009-01-05 12:20 -08:00]; also at Google books.)

These answers are really great. I don't have a lot of experience to share, but I thought of an analogy:

Last year my wife and I remodeled our kitchen. The contractor proposed billing on time & materials instead of giving an estimate, because we didn't know what we'd discover with respect to plumbing, subfloor damage, and so on. We agreed, because I'm working from home and I was available continuously to answer questions. The contractor and I had a good rapport so when something came up, he felt free to knock on my office door and we'd discuss it. Decisions got made quickly and the work was done the way I wanted it. That seems similar to the Agile priority on frequent customer review.

By contrast, my wife's cousin is also remodeling his house. He's an ER doctor and he is not available on site during the remodel. So he described being regularly surprised by the contractors making major decisions without consulting him ("What? I didn't want that picture window blocked off! Tear out the wall and reframe the window the way it was!"). This is of course painfully expensive and blows the schedule out of the water. Definitely not Agile.

So one way to convince a client that a time & materials model will work may be to assure them you'll make frequent progress reports to get their feedback. I believe this is already a core recommendation of most Agile methodologies. Of course, this requires the customer to trust the consultant from the outset.

As a separate resource, I searched and found a book on this subject: "Agile Estimating and Planning" by Mike Cohn. Read the praise quotes on that page, from an impressive set of Agile experts.

One of the most frequent arguments I hear for not adhering to the SOLID principles in object-oriented design is YAGNI (although the arguer often doesn't call it that):

"It is OK that I put both feature X and feature Y into the same class. It is so simple why bother adding a new class (i.e. complexity)."

"Yes, I can put all my business logic directly into the GUI code it is much easier and quicker. This will always be the only GUI and it is highly unlikely that significant new requirements will ever come in."

"If in the unlikely case of new requirements my code gets too cluttered I still can refactor for the new requirement. So your 'What if you later need to…' argument doesn't count."

What would be your most convincing arguments against such practice? How can I really show that this is an expensive practice, especially to somebody who doesn't have much experience in software development?

Design is the management and balance of trade-offs. YAGNI and SOLID aren't conflicting: the former says when to add features, the latter says how, but they both guide the design process. My responses, below, to each of your specific quotes use principles from both YAGNI and SOLID.

  1. It is three times as difficult to build reusable components as single use components.
  2. A reusable component should be tried out in three different applications before it will be sufficiently general to accept into a reuse library.

  — Robert Glass' Rules of Three, Facts and Fallacies of Software Engineering

Refactoring into reusable components has the key element of first finding the same purpose in multiple places, and then moving it. In this context, YAGNI applies by inlining that purpose where needed, without worrying about possible duplication, instead of adding generic or reusable features (classes and functions).

The best way, in the initial design, to show when YAGNI doesn't apply is to identify concrete requirements. In other words, do some refactoring before writing code to show that duplication is not merely possible, but already exists: this justifies the extra effort.
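
As a small invented illustration of that step: once the same formatting logic demonstrably appears in several functions, extracting one shared helper is justified under the Rule of Three, and every later fix lands everywhere at once.

    #include <iomanip>
    #include <sstream>
    #include <string>

    // Invented example: this money-formatting logic had been pasted into
    // several report functions before being extracted. The duplication
    // already existed, which is what justified the reusable helper.
    std::string formatMoney(double amount) {
        std::ostringstream out;
        out << '$' << std::fixed << std::setprecision(2) << amount;
        return out.str();
    }

    // Former duplicates, now one line each.
    std::string invoiceLine(const std::string& item, double price) {
        return item + ": " + formatMoney(price);
    }

    std::string refundLine(const std::string& item, double price) {
        return "Refund " + item + ": " + formatMoney(-price);
    }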


Yes, I can put all my business logic directly into the GUI code; it is much easier and quicker. This will always be the only GUI, and it is highly unlikely that significant new requirements will ever come in.

Is it really the only user interface? Is there a background batch mode planned? Will there ever be a web interface?

What is your testing plan, and will you be testing back-end functionality without a GUI? What will make the GUI easy for you to test, given that you usually don't want to be testing outside code (such as platform-generic GUI controls) but would rather concentrate on your own project?

It is OK that I put both feature X and feature Y into the same class. It is so simple; why bother adding a new class (i.e., complexity)?

Can you point out a common mistake that needs to be avoided? Some things are simple enough, such as squaring a number (x * x vs. squared(x)), to take an overly simple example. But if you can point out a concrete mistake someone made, especially in your project or by someone on your team, you can show how a common class or function will avoid it in the future.
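
For example, one invented but very common case: comparing doubles with == is a concrete mistake most teams have made at least once, and a tiny shared helper encodes the lesson so it doesn't have to be relearned.

    #include <algorithm>
    #include <cassert>
    #include <cmath>

    // Invented example of a "concrete mistake" helper: direct == on doubles
    // famously fails (0.1 + 0.2 != 0.3), so the team's one shared helper
    // compares with a relative-plus-absolute tolerance instead.
    bool approxEqual(double a, double b,
                     double relTol = 1e-9, double absTol = 1e-12) {
        return std::fabs(a - b) <=
               std::max(absTol, relTol * std::max(std::fabs(a), std::fabs(b)));
    }

    int main() {
        assert(0.1 + 0.2 != 0.3);            // the mistake waiting to happen
        assert(approxEqual(0.1 + 0.2, 0.3)); // the shared fix
    }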

If, in the unlikely case of new requirements, my code gets too cluttered, I can still refactor for the new requirement. So your "What if you later need to..." argument doesn't count.

The problem here is the assumption of "unlikely". Do you agree it's unlikely? If so, you're in agreement with this person. If not, your idea of the design doesn't agree with this person's—resolving that discrepancy will solve the problem, or at least show you where to go next. :)

[Edit:] Earlier I asked this as a perhaps poorly-framed question about when to use OOP versus when to use procedural programming - some responses implied I was asking for help understanding OOP. On the contrary, I have used OOP a lot but want to know when to use a procedural approach. Judging by the responses, I take it that there is a fairly strong consensus that OOP is usually a better all-round approach but that a procedural language should be used if the OOP architecture will not provide any reuse benefits in the long term.

However, my experience as a Java programmer has been otherwise. I saw a massive Java program that I architected get rewritten by a Perl guru in a tenth of the code I had written, and it was seemingly just as robust as my model of OOP perfection. My architecture saw a significant amount of reuse, and yet a more concise procedural approach had produced a superior solution.

So, at the risk of repeating myself, I'm wondering in what situations I should choose a procedural over an object-oriented approach. How would you identify in advance a situation in which an OOP architecture is likely to be overkill and a procedural approach more concise and efficient?

Can anyone suggest examples of what those scenarios would look like?

What is a good way to identify in advance a project that would be better served by a procedural programming approach?

I like Glass' rules of 3 when it comes to Reuse (which seems to be what you're interested in).

1) It is 3 times as difficult to build reusable components as single use components
2) A reusable component should be tried out in three different applications before it will be sufficiently general to accept into a reuse library

From this I think you can extrapolate these corollaries:

a) If you don't have the budget for 3 times the time it would take you to build a single use component, maybe you should hold off on reuse. (Assuming Difficulty = Time)
b) If you don't have 3 places where you'd use the component you're building, maybe you should hold off on building the reusable component.

I still think OOP is useful for building the single use component, because you can always refactor it into something that is really reusable later on. (You can also refactor from PP to OOP but I think OOP comes with enough benefits regarding organization and encapsulation to start there)
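
A sketch of that progression, with invented names: a stateless transformation is complete as a free function, and the class only earns its keep once there is state and an invariant (here, slug uniqueness) to protect.

    #include <cctype>
    #include <string>
    #include <vector>

    // Single-use, procedural version: a stateless transformation needs no
    // class at all -- a free function is the whole design.
    std::string slugify(const std::string& title) {
        std::string slug;
        for (char c : title) {
            if (std::isalnum(static_cast<unsigned char>(c)))
                slug += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
            else if (!slug.empty() && slug.back() != '-')
                slug += '-';
        }
        if (!slug.empty() && slug.back() == '-') slug.pop_back();
        return slug;
    }

    // Only when state and an invariant appear (uniqueness across calls)
    // does wrapping the function in a class start paying for its weight.
    class SlugRegistry {
    public:
        std::string add(const std::string& title) {
            std::string slug = slugify(title);
            std::string candidate = slug;
            int n = 2;
            while (contains(candidate)) candidate = slug + "-" + std::to_string(n++);
            used_.push_back(candidate);
            return candidate;
        }
    private:
        bool contains(const std::string& s) const {
            for (const auto& u : used_) if (u == s) return true;
            return false;
        }
        std::vector<std::string> used_;
    };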

I recently forced myself to study C++, and I just finished reading the book C++: The Complete Reference, by Herbert Schildt. I liked the book and think I got more or less the big picture. I noticed, though, that when I check the code I write using this material with other people, it is usually considered non-idiomatic and superseded by an STL way of doing it that is safer and easier (well, the book doesn't cover the STL or the Boost libraries).

So I'd like to ask: what are good sources to learn the patterns of a good C++ program? Where can I learn basic patterns from the "C++ way" to do things and not just repeating C patterns in C++?

I'd be particularly interested in sources that included STL and Boost stuff.
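
To make "repeating C patterns in C++" concrete, here is a small invented before-and-after: the same task written first with C habits, then the idiomatic STL way the question is asking about.

    #include <numeric>
    #include <vector>

    // C habits carried into C++: raw pointers, manual sizes, manual loops.
    int sumOfSquaresC(const int* data, int n) {
        int total = 0;
        for (int i = 0; i < n; ++i) total += data[i] * data[i];
        return total;
    }

    // The "STL way": containers own their memory, algorithms replace raw
    // loops, and there is no size/pointer bookkeeping to get wrong.
    int sumOfSquares(const std::vector<int>& data) {
        return std::accumulate(data.begin(), data.end(), 0,
                               [](int acc, int x) { return acc + x * x; });
    }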

I'd (also) recommend:

  • Effective C++ and Effective STL by Scott Meyers. They are easy to digest, yet very valuable - sometimes even illuminating.
  • Code Complete (the 1993 edition is available cheaply and is not that much different). It's lengthy, but it walks across the entire field, from what it means to be a programmer to how a for loop should look. (It would be better if it were shorter; as it is, it covers so much ground that it's hard to place.) I hold it dear because it illustrates two points very well:
    • code is compromise
    • there are known facts (but we still manage to get by on gut feel)
  • C++ FAQ Lite / C++ FAQ.
  • I'd throw in Facts and Fallacies by Robert Glass - it doesn't fit your request very well, but go read it.

It's natural that you are unhappy with other people's code. That's typical for programming - heck, even my own code of five years ago was written by a total n00b. That might be more pronounced for C++, since it caters to different styles and often puts freedom ("you can") over guidelines ("that's the way").

Still, it helps to mull over existing code - yours or others' - and consider how it can be improved. Figuring out why it is the way it is also sometimes helps.


(I remember a few TheDailyWTF threads where everyone would chime in about how stupid and unreasonable something was - yet somewhere, buried among the me-toos, was someone with domain experience explaining convincingly under what circumstances it was better than the obvious solution.)

You might want to check out The Definitive C++ Book Guide and List.

For your purposes I would especially recommend:

They are in no particular order; also, you might want to read and code something in between them.

(Note: as noted by @John Dibling, the Boost book might be a bit out of date; I do not have experience with that one myself.)

Say you have a program that currently functions the way it is supposed to. The application has very poor code behind it, eats up a lot of memory, is unscalable and would take major rewriting to implement any changes in functionality.

At what point does refactoring become less logical than a total rebuild?

Joel wrote a nice essay about this very topic:

Things You Should Never Do, Part 1

The key lesson I got from this is that although the old code is horrible, hurts your eyes and offends your aesthetic sense, there's a pretty good chance that a lot of that code is patching undocumented errors and problems. I.e., it has a lot of domain knowledge embedded in it that will be difficult or impossible for you to replicate. You'll constantly be hitting bugs of omission.

A book I found immensely useful is Working Effectively With Legacy Code by Michael C. Feathers. It offers strategies and methods for approaching even truly ugly legacy code.
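
One technique Feathers describes is the characterization test: before changing ugly legacy code, write tests that pin down what it actually does today, quirks included, so that refactoring can't silently change behavior. A minimal sketch with an invented legacy function:

    #include <cassert>
    #include <cctype>
    #include <string>

    // Stand-in for the legacy function (invented): quirky but in
    // production. Note the quirk: it uppercases the first character but
    // leaves a leading space untouched -- nobody knows if callers depend
    // on that, so we must not change it yet.
    std::string legacyCapitalize(std::string s) {
        if (!s.empty() && std::isalpha(static_cast<unsigned char>(s[0])))
            s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
        return s;
    }

    // Characterization tests: assert what the code *does* today, quirks
    // included, so later refactoring can't silently change behavior.
    int main() {
        assert(legacyCapitalize("bob") == "Bob");
        assert(legacyCapitalize("") == "");          // observed, not designed
        assert(legacyCapitalize(" bob") == " bob");  // quirk pinned on purpose
    }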

When seeking advice on good programming practices, the typical answers are variations of agile software development, test-driven development, or design patterns. However, as far as I know, none of these has been proven using the scientific method (if I'm wrong on this, feel free to correct me).

I wonder, are there any good resources on the topic of evidence-based development practices?

The best comprehensive reference I know of for scientific evidence about software engineering is Facts and Fallacies of Software Engineering. The book is concise, with references to the original sources (or it plainly says there are none), well written, and not expensive.

The second-best reference is Code Complete, but it is much longer and not as focused on the evidence itself. It is nevertheless a must-read book.

Once you have read these two books, it is also worth looking at the "Voice of Evidence" series of articles from IEEE Software magazine.

Ok, I think this question is at the wrong place and I'll head over to http://programmers.stackexchange.com/ to read/ask about it. Thanks all for your answers up to this point. :)


Apologies ;) I'm sorry if this question is a little bit subjective, but I could not come up with a better title. I'll correct it if you suggest something better.

In my organization there is a lot of buzz about this whole automated testing and continuous integration thing, but one argument I constantly hear is this:

How should I develop good, clean, easy to maintain code and write unit tests, if the deadline is already set and it is only half of my estimate?

I'm a developer myself, so I can understand this. But I always try to respond that not only the developers need a paradigm shift, but the management too.

If you are a developer and your estimates are cut in half, then no matter what you estimate, you are not going anywhere, however complex or trivial your problems are. You need the backing of the management guys, the one guy who is giving the money.

Conclusion?

Can you give me some help, may it be a good URL to read about this development/management conflict, a book or maybe a personal insight? Did you survive a large process shift like this in a Waterfall company that is now doing Lean development? Or do you know this argument and have a clever answer to it?

And please, help me rename or move this question. :-)

Update

Thanks for all the answers already! :) I think I have to make clear that my point wasn't the "do it twice as fast" statement from management. It's about the negative point of view that comes with this statement from a developer.

Is there anything I can do to help people understand that this is not the default in software development? That the PM is not actively preventing good code from being written, and that maybe both sides need a bit more education about the pros and cons of clean code bases, good coverage, and lots of automated tests?

One good example is technical debt. It's manager-friendly. Imagine your credit card: if you accrue debt for a few weeks, that can be helpful. You don't need to carry around cash for daily purchases, and you pay it off at the end of the month.

This is like a crunch before a release: you take on some debt and then pay it back soon. If you keep charging things and never pay off that debt, it starts to compound. That new feature you want is more difficult to build because the foundation you're building on is unsound. The debt you've accumulated keeps you from acting quickly, and if you're over your limit, even typical small purchases won't go through.

You might also want to take a look at Facts and Fallacies of Software Engineering. It talks about estimates and the trouble they can cause when they're not revisited as the project evolves.