Writing Solid Code

Steve Maguire


A Microsoft developer examines the problem of programming "bugs," showing how and where developers make mistakes along the development process and providing ways users can detect errors early. Original.


Mentioned in questions and answers.

If you could go back in time and tell yourself to read a specific book at the beginning of your career as a developer, which book would it be?

I expect this list to be varied and to cover a wide range of things.


Applying UML and Patterns by Craig Larman.

The title of the book is slightly misleading; it does deal with UML and patterns, but it covers so much more. The subtitle of the book tells you a bit more: An Introduction to Object-Oriented Analysis and Design and Iterative Development.

Masters of Doom. As far as motivation and love for your profession go, it won't get any better than what's described in this book. A truly inspiring story!

Beginning C# 3.0: An Introduction to Object Oriented Programming

This is the book for those who want to understand the whys and hows of OOP using C# 3.0. You don't want to miss it.


Mastery: The Keys to Success and Long-Term Fulfillment, by George Leonard

It's about what mindsets are required to reach mastery in any skill, and why. It's just awesome, and an easy read too.

Pro Spring is a superb introduction to the world of Inversion of Control and Dependency Injection. If you're not aware of these practices and their implications, the balance of topics and technical detail in Pro Spring is excellent. It builds a great case and a solid personal foundation.

Another book I'd suggest would be Robert Martin's Agile Software Development (ASD). Code smells, agile techniques, test driven dev, principles ... a well-written balance of many different programming facets.

More traditional classics would include the infamous GoF Design Patterns, Bertrand Meyer's Object Oriented Software Construction, Booch's Object Oriented Analysis and Design, Scott Meyers's "Effective C++" series and a lesser-known book I enjoyed by Gunderloy, Coder to Developer.

And while books are nice ... don't forget radio!

... let me add one more thing. If you haven't already discovered safari - take a look. It is more addictive than stack overflow :-) I've found that with my google type habits - I need the more expensive subscription so I can look at any book at any time - but I'd recommend the trial to anyone even remotely interested.

(ah yes, a little obj-C today, cocoa tomorrow, patterns? soa? what was that example in that cookbook? What did Steve say in the second edition? Should I buy this book? ... a subscription like this is great if you'd like some continuity and context to what you're googling ...)

Database System Concepts is one of the best books you can read on understanding good database design principles.


Algorithms in C++ was invaluable to me in learning Big O notation and the ins and outs of the various sort algorithms. This was published before Sedgewick decided he could make more money by dividing it into 5 different books.

C++ FAQs is an amazing book that really shows you what you should and shouldn't be doing in C++. The backward compatibility of C++ leaves a lot of landmines about and this book helps one carefully avoid them while at the same time being a good introduction into OO design and intent.

Here are two I haven't seen mentioned:
I wish I had read "Ruminations on C++" by Koenig and Moo much sooner. That was the book that made OO concepts really click for me.
And I recommend Michael Abrash's "Zen of Code Optimization" for anyone else planning on starting a programming career in the mid 90s.

Perfect Software: And Other Illusions about Testing


Perfect Software: And Other Illusions about Testing by Gerald M. Weinberg

ISBN-10: 0932633692

ISBN-13: 978-0932633699

Rapid Development by McConnell

The most influential programming book for me was Enough Rope to Shoot Yourself in the Foot by Allen Holub.


Oh well, how long ago that was.

I have a few good books that strongly influenced me that I've not seen on this list so far:

The Psychology of Everyday Things by Donald Norman. The general principles of design for other people. This may seem to be mostly good for UI, but if you think about it, it has applications almost anywhere there is an interface that someone besides the original developer has to work with; e.g. an API, where you design the interface in such a way that other developers form the correct mental model and get appropriate feedback from the API itself.

The Art of Software Testing by Glenford Myers. A good, general introduction to testing software; good for programmers to read to help them think like a tester, i.e. think of what may go wrong and prepare for it.

By the way, I realize the question was the "Single Most Influential Book" but the discussion seems to have changed to listing good books for developers to read so I hope I can be forgiven for listing two good books rather than just one.


C++ How to Program. It is good for beginners. This is an excellent, comprehensive book of about 1500 pages.

Effective C++ and More Effective C++ by Scott Meyers.

Inside the C++ object model by Stanley Lippman

I bought this when I was a complete newbie, and it took me from only knowing that Java existed to being a reliable team member in a short time.

Not a programming book, but still a very important book every programmer should read:

Orbiting the Giant Hairball by Gordon MacKenzie

The Pragmatic Programmer was pretty good. However, one that really made an impact when I was starting out was:

Windows 95 System Programming Secrets

I know - it sounds and looks a bit cheesy on the outside and has probably dated a bit - but this was an awesome explanation of the internals of Win95 based on the author's (Matt Pietrek's) investigations using his own tools - the code for which came with the book. Bear in mind this was before the whole open source thing and Microsoft was still pretty cagey about releasing documentation of internals - let alone source. There was some quote in there like "If you are working through some problem and hit some sticking point then you need to stop and really look deeply into that piece and really understand how it works". I've found this to be pretty good advice - particularly these days when you often have the source for a library and can go take a look. It's also inspired me to enjoy diving into the internals of how systems work, something that has proven invaluable over the course of my career.

Oh, and I'd also throw in Essential .NET, a great explanation of .NET internals from Don Box.

I recently read Dreaming in Code and found it to be an interesting read. Perhaps more so since the day I started reading it Chandler 1.0 was released. Reading about the growing pains and mistakes of a project team of talented people trying to "change the world" gives you a lot to learn from. Also Scott brings up a lot of programmer lore and wisdom in between that's just an entertaining read.

Beautiful Code had one or two things that made me think differently, particularly the chapter on top down operator precedence.

K&R

@Juan: I know Juan, I know - but there are some things that can only be learned by actually getting down to the task at hand. Speaking in abstract ideals all day simply makes you into an academic. It's in the application of the abstract that we truly grok the reason for their existence. :P

@Keith: Great mention of "The Inmates are Running the Asylum" by Alan Cooper - an eye opener for certain, any developer that has worked with me since I read that book has heard me mention the ideas it espouses. +1

I found The Algorithm Design Manual to be a very beneficial read. I also highly recommend Programming Pearls.

This one isn't really a book for the beginning programmer, but if you're looking for SOA design books, then SOA in Practice: The Art of Distributed System Design is for you.

For me it was Design Patterns Explained. It provided an 'Oh, that's how it works' moment for me with regard to design patterns, and it has been very useful when teaching design patterns to others.

Code Craft by Pete Goodliffe is a good read!


The first book that made a real impact on me was Mastering Turbo Assembler by Tom Swan.

Other books that have had an impact were Just For Fun by Linus Torvalds and David Diamond and of course The Pragmatic Programmer by Andrew Hunt and David Thomas.

In addition to other people's suggestions, I'd recommend either acquiring a copy of SICP, or reading it online. It's one of the few books that I've read that I feel greatly increased my skill in designing software, particularly in creating good abstraction layers.

A book that is not directly related to programming, but is also a good read for programmers (IMO) is Concrete Mathematics. Most, if not all of the topics in it are useful for programmers to know about, and it does a better job of explaining things than any other math book I've read to date.

For me "Memory as a Programming Concept in C and C++" really opened my eyes to how memory management really works. If you're a C or C++ developer I consider it a must read. You will definitely learn something or remember things you might have forgotten along the way.

http://www.amazon.com/Memory-Programming-Concept-C/dp/0521520436

Agile Software Development with Scrum by Ken Schwaber and Mike Beedle.

I used this book as the starting point to understanding Agile development.

Systemantics: How Systems Work and Especially How They Fail. Get it used cheap. But you might not get the humor until you've worked on a few failed projects.

The beauty of the book is the copyright year.

Probably the most profound takeaway "law" presented in the book:

The Fundamental Failure-Mode Theorem (F.F.T.): Complex systems usually operate in failure mode.

The idea being that there are failing parts in any given piece of software that are masked by failures in other parts or by validations in other parts. See a real-world example at the Therac-25 radiation machine, whose software flaws were masked by hardware failsafes. When the hardware failsafes were removed, the software race condition that had gone undetected all those years resulted in the machine killing 3 people.

It seems most people have already touched on some very good books. One which really helped me out was Effective C#: 50 Ways to Improve your C#. I'd be remiss if I didn't mention The Tao of Pooh. Philosophy books can be good for the soul, and the code.

Discrete Mathematics For Computer Scientists

Discrete Mathematics For Computer Scientists by J.K. Truss.

While this doesn't teach you programming, it teaches you fundamental mathematics that every programmer should know. You may remember this stuff from university, but really, doing predicate logic will improve your programming skills, and you need to learn set theory if you want to program using collections.

There really is a lot of interesting information in here that can get you thinking about problems in different ways. It's handy to have, just to pick up once in a while to learn something new.

I saw a review of Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools on a blog that also discussed XI-Factory. I read it, and I must say this book is a must-read. Although not specifically targeted at programmers, it explains very clearly what is happening in the programming world right now with Model-Driven Architecture and so on.

Solid Code: Optimizing the Software Development Life Cycle

Although the book is only 300 pages and favors Microsoft technologies it still offers some good language agnostic tidbits.

Managing Gigabytes is an instant classic for thinking about the heavy lifting of information.

My vote is "How to Think Like a Computer Scientist: Learning With Python". It's available both as a book and as a free e-book.

It really helped me to understand the basics of not just Python but programming in general. Although it uses Python to demonstrate concepts, they apply to most, if not all, programming languages. Also: IT'S FREE!

Object-Oriented Programming in Turbo C++. Not super popular, but it was the one that got me started, and was the first book that really helped me grok what an object was. Read this one waaaay back in high school. It sort of brings a tear to my eye...

My high school math teacher lent me a copy of Are Your Lights On?: How to Figure Out What the Problem Really Is, which I have re-read many times. It has been invaluable, as a developer, and in life generally.

I'm now reading Agile Software Development, Principles, Patterns and Practices. For those interested in XP and Object-Oriented Design, this is a classic read.

Kernighan & Plauger's Elements of Programming Style. It illustrates the difference between gimmicky-clever and elegant-clever.

To get advanced in Prolog, I like these two books:

The Art of Prolog

The Craft of Prolog

They really open the mind to logic programming and recursion schemes.

Here's an excellent book that is not as widely applauded, but is full of deep insight: Agile Software Development: The Cooperative Game, by Alistair Cockburn.

What's so special about it? Well, clearly everyone has heard the term "Agile", and it seems most are believers these days. Whether you believe or not, though, there are some deep principles behind why the Agile movement exists. This book uncovers and articulates these principles in a precise, scientific way. Some of the principles are (btw, these are my words, not Alistair's):

  1. The hardest thing about team software development is getting everyone's brains to have the same understanding. We are building huge, elaborate, complex systems which are invisible in the tangible world. The better you are at getting more peoples' brains to share deeper understanding, the more effective your team will be at software development. This is the underlying reason that pair programming makes sense. Most people dismiss it (and I did too initially), but with this principle in mind I highly recommend that you give it another shot. You wind up with TWO people who deeply understand the subsystem you just built ... there aren't many other ways to get such a deep information transfer so quickly. It is like a Vulcan mind meld.
  2. You don't always need words to communicate deep understanding quickly. And a corollary: too many words, and you exceed the listener/reader's capacity, meaning the understanding transfer you're attempting does not happen. Consider that children learn how to speak language by being "immersed" and "absorbing". Not just language either ... he gives the example of some kids playing with trains on the floor. Along comes another kid who has never even SEEN a train before ... but by watching the other kids, he picks up the gist of the game and plays right along. This happens all the time between humans. This along with the corollary about too many words helps you see how misguided it was in the old "waterfall" days to try to write 700 page detailed requirements specifications.

There is so much more in there too. I'll shut up now, but I HIGHLY recommend this book!


The Back of the Napkin, by Dan Roam.


A great book about visual thinking techniques. There is also an expanded edition now. I can't speak to that version, as I do not own it yet.

Agile Software Development by Alistair Cockburn

Do users ever touch your code? If you're not doing solely back-end work, I recommend About Face: The Essentials of User Interface Design — now in its third edition (linked). I used to think my users were stupid because they didn't "get" my interfaces. I was, of course, wrong. About Face turned me around.

"Writing Solid Code: Microsoft's Techniques for Developing Bug-Free C Programs (Microsoft Programming Series)" by Steve Maguire.

Interesting what a large proportion of the books mentioned here are C/C++ books.

While not strictly a software development book, I would highly recommend that Don't Make me Think! be considered in this list.

As so many people have listed Head First Design Patterns, which I agree is a very good book, I would like to see whether as many people are aware of a title called Design Patterns Explained: A New Perspective on Object-Oriented Design.

This title deals with design patterns excellently. The first half of the book is very accessible, and the remaining chapters require only a firm grasp of the content already covered. The reason I feel the second half of the book is less accessible is that it covers patterns that I, as a young developer admittedly lacking in experience, have not used much.

This title also introduces the concept behind design patterns, covering everything from Christopher Alexander's initial work in architecture to the GoF first documenting patterns in Smalltalk.

I think that anyone who enjoyed Head First Design Patterns but still finds the GoF very dry, should look into Design Patterns Explained as a much more readable (although not quite as comprehensive) alternative.

Even though I've never programmed a game, this book helped me understand a lot of things in a fun way.

How influential a book is often depends on the reader and where they were in their career when they read the book. I have to give a shout-out to Head First Design Patterns. Great book and the very creative way it's written should be used as an example for other tech book writers. I.e. it's written in order to facilitate learning and internalizing the concepts.


97 Things Every Programmer Should Know


This book pools together the collective experiences of some of the world's best programmers. It is a must read.

Extreme Programming Explained: Embrace Change by Kent Beck. While I don't advocate a hardcore XP-or-the-highway take on software development, I wish I had been introduced to the principles in this book much earlier in my career. Unit testing, refactoring, simplicity, continuous integration, cost/time/quality/scope - these changed the way I looked at development. Before Agile, it was all about the debugger and fear of change requests. After Agile, those demons did not loom as large.

One of my personal favorites is Hacker's Delight, because it was as much fun to read as it was educational.

I hope the second edition will be released soon!

You.Next(): Move Your Software Development Career to the Leadership Track, by Michael C. Finley and Honza Fedák

I've been around a while, so most books that I have found influential don't necessarily apply today. I do believe it is universally important to understand the platform that you are developing for (both hardware and OS). I also think it's important to learn from other people's mistakes. So two books I would recommend are:

Computing Calamities and In Search of Stupidity: Over Twenty Years of High Tech Marketing Disasters

Working Effectively with Legacy Code is a really amazing book that goes into great detail about how to properly unit test your code and what the true benefit of it is. It really opened my eyes.

For example, I recently came across this in the linux kernel:

/* Force a compilation error if condition is true */
#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))

So, in your code, if you have some structure which must be, say a multiple of 8 bytes in size, maybe because of some hardware constraints, you can do:

BUILD_BUG_ON((sizeof(struct mystruct) % 8) != 0);

and it won't compile unless the size of struct mystruct is a multiple of 8, and if it is a multiple of 8, no runtime code is generated at all.
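
A minimal sketch of the macro in use, for anyone who wants to try it. The struct packet_header and the function packet_header_size() are made up for illustration and are not from the kernel. It also shows C11's static_assert, which does the same job at file scope and gives a readable error message.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Force a compilation error if condition is true (macro quoted in the post). */
#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))

/* Hypothetical structure that some hardware requires to be a multiple of 8 bytes. */
struct packet_header {
    uint32_t id;
    uint32_t length;
    uint64_t timestamp;
};

/* C11 alternative: the same check at file scope, with a message. */
static_assert(sizeof(struct packet_header) % 8 == 0,
              "packet_header must be a multiple of 8 bytes");

/* BUILD_BUG_ON expands to an expression, so it has to sit inside a function. */
size_t packet_header_size(void)
{
    BUILD_BUG_ON((sizeof(struct packet_header) % 8) != 0);
    return sizeof(struct packet_header);
}
```

Adding, say, a single char member to the struct should make both checks fail at compile time on typical ABIs if the padded size stops being a multiple of 8; neither check costs anything at run time.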

Another trick I know is from the book "Graphics Gems". It lets a single header file both define and initialize variables in one module while merely declaring them as externs in the other modules that use it.

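#ifdef DEFINE_MYHEADER_GLOBALS
/* Defining module: emit real definitions with initializers. */
#define GLOBAL
#define INIT(x, y) (x) = (y)
#else
/* Every other module: emit extern declarations and drop the initializer. */
#define GLOBAL extern
#define INIT(x, y) (x)
#endif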

GLOBAL int INIT(x, 0);
GLOBAL int somefunc(int a, int b);

With that, the code which defines x and somefunc does:

#define DEFINE_MYHEADER_GLOBALS
#include "the_above_header_file.h"

while code that's merely using x and somefunc() does:

#include "the_above_header_file.h"

So you get one header file that provides the definitions of the globals and the function prototypes in the one place they are needed, and the corresponding extern declarations everywhere else.

So, what are your favorite C programming tricks along those lines?

Two good source books for this sort of stuff are The Practice of Programming and Writing Solid Code. One of them (I don't remember which) says: Prefer enum to #define where you can, because enum gets checked by the compiler.
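
One way to read that advice, as a rough sketch: an enum constant is a real identifier the compiler knows about (it has a scope and shows up in the debugger), while a #define is replaced by the preprocessor before the compiler sees it. The names color, color_names and color_name() below are invented for the example.

```c
#include <assert.h>   /* static_assert (C11) */

/* The #define version would be plain text substitution:
 *   #define RED 0
 *   #define GREEN 1
 *   #define BLUE 2
 * Nothing ties those three numbers together. */

/* The enum version groups the constants and lets the compiler count them. */
enum color { RED, GREEN, BLUE, COLOR_COUNT };

static const char *const color_names[] = { "red", "green", "blue" };

/* Compile-time check that the table keeps pace with the enum. */
static_assert(sizeof color_names / sizeof color_names[0] == COLOR_COUNT,
              "color_names is out of sync with enum color");

/* Returns the name of a color, or "unknown" for an out-of-range value. */
const char *color_name(enum color c)
{
    if ((int)c < 0 || c >= COLOR_COUNT)
        return "unknown";
    return color_names[c];
}
```

If someone adds a new enumerator before COLOR_COUNT and forgets the table, the build stops instead of the program reading past the end of the array.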

I am debugging a (native) multi-threaded C++ application under Visual Studio 2008. On seemingly random occasions, I get a "Windows has triggered a break point..." error with a note that this might be due to a corruption in the heap. These errors won't always crash the application right away, although it is likely to crash shortly after.

The big problem with these errors is that they pop up only after the corruption has actually taken place, which makes them very hard to track and debug, especially on a multi-threaded application.

  • What sort of things can cause these errors?

  • How do I debug them?

Tips, tools, methods, insights... are welcome.

Application Verifier combined with Debugging Tools for Windows is an amazing setup. You can get both as a part of the Windows Driver Kit or the lighter Windows SDK. (Found out about Application Verifier when researching an earlier question about a heap corruption issue.) I've used BoundsChecker and Insure++ (mentioned in other answers) in the past too, although I was surprised how much functionality was in Application Verifier.

Electric Fence (aka "efence"), dmalloc, valgrind, and so forth are all worth mentioning, but most of these are much easier to get running under *nix than Windows. Valgrind is ridiculously flexible: I've debugged large server software with many heap issues using it.

When all else fails, you can provide your own global operator new/delete and malloc/calloc/realloc replacements -- how to do so will vary a bit depending on compiler and platform -- and this will be a bit of an investment -- but it may pay off over the long run. The desirable feature list should look familiar from dmalloc and Electric Fence, and from the surprisingly excellent book Writing Solid Code:

  • sentry values: allow a little more space before and after each alloc, respecting maximum alignment requirement; fill with magic numbers (helps catch buffer overflows and underflows, and the occasional "wild" pointer)
  • alloc fill: fill new allocations with a magic non-0 value -- Visual C++ will already do this for you in Debug builds (helps catch use of uninitialized vars)
  • free fill: fill in freed memory with a magic non-0 value, designed to trigger a segfault if it's dereferenced in most cases (helps catch dangling pointers)
  • delayed free: don't return freed memory to the heap for a while, keep it free filled but not available (helps catch more dangling pointers, catches proximate double-frees)
  • tracking: being able to record where an allocation was made can sometimes be useful

Note that in our local homebrew system (for an embedded target) we keep the tracking separate from most of the other stuff, because the run-time overhead is much higher.
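
To make the first three items concrete, here is a rough sketch of a debug wrapper around malloc/free. It is not the code from dmalloc, Electric Fence or the book; the names dbg_malloc, dbg_check and dbg_free and the fill values are all made up, and delayed free and tracking are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 16    /* bytes of sentry on each side of the user block */
#define GUARD_BYTE 0xFD  /* sentry pattern (arbitrary choice) */
#define ALLOC_BYTE 0xCD  /* fill for fresh allocations */
#define FREE_BYTE  0xDD  /* fill for freed memory */

/* Layout: [size slot: GUARD_SIZE][front sentry: GUARD_SIZE][user: n][rear sentry] */
enum { PREFIX = 2 * GUARD_SIZE };

/* Assumption: offsetting by PREFIX keeps malloc's alignment guarantee. */
static_assert(GUARD_SIZE % _Alignof(max_align_t) == 0,
              "GUARD_SIZE must be a multiple of the maximum alignment");

void *dbg_malloc(size_t n)
{
    if (n > SIZE_MAX - PREFIX - GUARD_SIZE)
        return NULL;
    unsigned char *raw = malloc(PREFIX + n + GUARD_SIZE);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &n, sizeof n);                         /* remember the user size */
    memset(raw + GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);  /* front sentry */
    memset(raw + PREFIX, ALLOC_BYTE, n);               /* alloc fill */
    memset(raw + PREFIX + n, GUARD_BYTE, GUARD_SIZE);  /* rear sentry */
    return raw + PREFIX;
}

/* Returns 1 if both sentries are intact, 0 if something wrote past either end. */
int dbg_check(const void *p)
{
    const unsigned char *user = p;
    const unsigned char *raw = user - PREFIX;
    size_t n;
    memcpy(&n, raw, sizeof n);
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (raw[GUARD_SIZE + i] != GUARD_BYTE || user[n + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

void dbg_free(void *p)
{
    if (p == NULL)
        return;
    assert(dbg_check(p) && "heap sentry overwritten");
    unsigned char *raw = (unsigned char *)p - PREFIX;
    size_t n;
    memcpy(&n, raw, sizeof n);
    /* Free fill. An optimizing compiler may drop this store because the block
     * is released right away; a delayed-free list would keep it observable. */
    memset(raw, FREE_BYTE, PREFIX + n + GUARD_SIZE);
    free(raw);
}
```

Calling dbg_check() on suspect blocks at interesting points (or on every block, if you also keep a list of them) narrows down when the overwrite happened, which is exactly what the original error message doesn't tell you.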


If you're interested in more reasons to overload these allocation functions/operators, take a look at my answer to "Any reason to overload global operator new and delete?"; shameless self-promotion aside, it lists other techniques that are helpful in tracking heap corruption errors, as well as other applicable tools.

Can we check whether a pointer passed to a function is allocated with memory or not in C?

I have written my own function in C which accepts a character pointer, buf (a pointer to a buffer), and a size, buf_siz (the buffer size). Before calling this function, the user has to create a buffer and allocate buf_siz bytes of memory for it.

Since there is a chance that the user might forget to allocate the memory and simply pass the pointer to my function, I want to check this. So is there any way I can check in my function whether the pointer passed really has buf_siz bytes of memory allocated to it?

EDIT1: It seems there is no standard library function to check it, but is there any dirty hack to check it?

EDIT2: I do know that my function will be used by a good C programmer, but I want to know whether we can check or not. If we can, I would like to hear about it.

Conclusion: So it is impossible to check within a function whether a particular pointer has memory allocated to it.

As everyone else said, there isn't a standard way to do it.

So far, no-one else has mentioned 'Writing Solid Code' by Steve Maguire. Although castigated in some quarters, the book has chapters on the subject of memory management, and discusses how, with care and complete control over all memory allocation in the program, you can do as you ask and determine whether a pointer you are given is a valid pointer to dynamically allocated memory. However, if you plan to use third party libraries, you will find that few of them allow you to change the memory allocation routines to your own, which greatly complicates such analysis.
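
As a rough illustration of what "complete control over all memory allocation" buys you, here is a small sketch of an allocation registry. It is written from scratch for this answer, not copied from the book, and tracked_malloc, tracked_free and is_valid_range are hypothetical names. It only knows about blocks that went through tracked_malloc, which is precisely the limitation with third-party libraries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One record per live block; a real system would want a faster lookup. */
struct block_info {
    struct block_info *next;
    unsigned char *start;
    size_t size;
};

static struct block_info *live_blocks = NULL;

void *tracked_malloc(size_t size)
{
    struct block_info *info = malloc(sizeof *info);
    if (info == NULL)
        return NULL;
    info->start = malloc(size ? size : 1);
    if (info->start == NULL) {
        free(info);
        return NULL;
    }
    info->size = size;
    info->next = live_blocks;
    live_blocks = info;
    return info->start;
}

void tracked_free(void *p)
{
    for (struct block_info **link = &live_blocks; *link; link = &(*link)->next) {
        if ((*link)->start == p) {
            struct block_info *dead = *link;
            *link = dead->next;
            free(dead->start);
            free(dead);
            return;
        }
    }
    assert(p == NULL && "tracked_free: pointer did not come from tracked_malloc");
}

/* Returns 1 if [p, p + size) lies entirely inside one live tracked block. */
int is_valid_range(const void *p, size_t size)
{
    uintptr_t addr = (uintptr_t)p;  /* integers, so unrelated pointers can be compared */
    for (const struct block_info *b = live_blocks; b != NULL; b = b->next) {
        uintptr_t start = (uintptr_t)b->start;
        if (addr >= start && addr - start <= b->size &&
            size <= b->size - (addr - start))
            return 1;
    }
    return 0;
}
```

The function from the question could then start with something like assert(is_valid_range(buf, buf_siz)) in debug builds. A pointer to a stack or static buffer fails the check too, so this only makes sense if the interface really demands heap memory from your allocator.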

For OOP languages, there are many books describing how to design software, and design patterns are mainly for OOP languages.

I am wondering whether there are any books or good articles teaching how to use C in a big project; for example, that it is good practice to make a function static when it is only used in a single file.

You must read Expert C Programming by Peter van der Linden.

Code Complete, 1st edition, by Steve McConnell is more oriented towards C, so it may be worth a look as well. At any rate, his books are great reading for any professional programmer.

G'day,

While heavily focused on C++, John Lakos's excellent book "Large-Scale C++ Software Design" has a lot of information that is very relevant to the design of software written in C.

Edit: Oooh. After seeing @Jackson's suggestion for the excellent "The Practice of Programming" I'd also highly recommend Eric Raymond's excellent book "The Art of UNIX Programming". Thanks for the reminder @Jackson.

HTH

cheers,

  1. C FAQ
  2. K & R
  3. Linux kernel source code

Lately, I discover more and more that it's good to have extensive knowledge of programming fundamentals. Sadly, I am (one of the many) self-taught PHP developers and have no regrets choosing that path.

However, I still think I should extend my knowledge to some "real" programming languages starting from zero and build up my knowledge from there. I have no intention of changing my career path, but I do think it would be good to think out of the web-development box.

I prefer not taking classes or courses, because I simply do not have the time for this. So:

  • What is the best way to teach myself C step by step, starting from level zero?

  • As my main goal is to learn more programming fundamentals, is C even a good choice for this?

  • If not, what language would be?


Summary so far:

First of all, thanks for all the great responses. These will be quite helpful. Although most people seem to agree that starting off with C is not a bad choice, I have also seen people state that it is probably a better idea to skip C and go with C++ or even C#, since these languages are more current.

My personal opinion is still that it would be good to start from level zero, even if the language itself does not directly contribute to the things I make. I still believe it will indirectly make me a better programmer. But then again, as I said, my knowledge of these languages is quite limited, so I'd love to hear your thoughts on the matter as well.

I have to disagree with the previous two answers, which recommend the famous "K&R" guide. I was completely unable to learn anything from that book; I simply gave up after reading the first third of the book about three times. Maybe I'm just dumb.

I suggest, instead, this wonderful book: C Programming: A Modern Approach (disclaimer: amazon link)

I've learned everything I need to know about C from that book, and it covers the history as much as needs to be done, while still keeping a "modern" point of view.

Caveat: I didn't come to C "for C", I passed through it on the way to my eventual goal, Objective-C and Cocoa programming for desktop applications on Apple's Mac OS X. If you really want a very deep knowledge of C, it may not hurt to get both of the above-mentioned books, and read the K&R guide after reading Modern C.

First (as always), I want to apologize for my English; it may not be clear enough.

I'm not that good at C programming, and I was asked to read a "string" input with undefined length.

This is my solution

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *newChar();
char *addChar(char *, char);
char *readLine(void);

int main() {
  char *palabra;
  palabra = newChar();

  palabra = readLine();
  printf("palabra=%s\n", palabra);

  return 0;
}

char *newChar() {
  char *list = (char *) malloc(0 * sizeof (char));
  *list = '\0';
  return list;
}

char *addChar(char *lst, char num) {
  int largo = strlen(lst) + 1;
  realloc(&lst, largo * sizeof (char));
  *(lst + (largo - 1)) = num;
  *(lst + largo) = '\0';
  return lst;
}

char *readLine() {
  char c;
  char *palabra = newChar();

  c = getchar();
  while (c != '\n') {
    if (c != '\n') {
      palabra = addChar(palabra, c);
    }
    c = getchar();
  }
  return palabra;
}

Please, I'd appreciate your help in telling me whether this is a good idea, or in giving me some other idea (and also in telling me whether this is a "correct" use of pointers).

Thanks in advance


EDIT: Well, thanks for your answers, they were very useful. I'm now posting edited (and I hope better) code; maybe it could be useful for someone new to C (like me), and I'd welcome feedback on it again.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


void reChar(char **, int *);
void readLine(char **, int *);

int main() {
    char *palabra = NULL;
    int largo = 0;

    reChar(&palabra, &largo);
    readLine(&palabra, &largo);
    printf("palabra=%s\n", palabra);

    system("pause");
    return 0;
}

void reChar(char **lst, int *largo) {
    (*largo) += 4;
    char *temp = (char*) realloc(*lst, (*largo) * sizeof (char));

    if (temp != NULL) {
        *lst = temp;
    } else {
        free(*lst);
        puts("error (re)allocating memory");
        exit(1);
    }
}

void readLine(char **lst, int *largo) {
    int c;
    int pos = 0;

    c = getchar();
    while (c != '\n' && c != EOF) {
        if ((pos + 1) % 4 == 0) {
            reChar(lst, largo);
        }
        (*lst)[pos] =(char) c;
        pos++;
        c = getchar();
    }
    (*lst)[pos] = '\0';
}

PS:

  • It seems to be enough to increase the size of "palabra" slowly.

  • I'm not sure whether capturing getchar() into an int and then casting it to a char is the correct way to handle the EOF pitfall.

  1. Look up the definition of POSIX getline().

  2. Remember that you need to capture the return value from realloc(); it is not guaranteed that the new memory block starts at the same position as the old one.

  3. Know that malloc(0) may return a null pointer, or it may return a non-null pointer that is unusable (because it points to zero bytes of memory).

  4. You may not write '*list = '\0';' when list points to zero bytes of allocated memory; you don't have permission to write there. If you get a NULL back, you are likely to get a core dump. In any case, you are invoking undefined behaviour, which is 'A Bad Idea™'. (Thanks)

  5. The palabra = newChar(); in main() leaks memory - assuming that you fix the other problems already discussed.

  6. The code in readLine() doesn't consider the possibility of getting EOF before getting a newline; that is bad and will result in a core dump when memory allocation (finally) fails.

  7. Your code will exhibit poor performance because it allocates one character at a time. Typically, you should allocate considerably more than one extra character at a time; starting with an initial allocation of perhaps 4 bytes and doubling the allocation each time you need more space might be better. Keep the initial allocation small so that the reallocation code is properly tested.

  8. The return value from getchar() is an int, not a char. On most machines, it can return 256 different positive character values (even if char is a signed type) and a separate value, EOF, that is distinct from all the char values. (The standard allows it to return more than 256 different characters if the machine has bytes that are bigger than 8 bits each.) (Thanks) The C99 standard §7.19.7.1 says of fgetc():

    If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).

    (Emphasis added.) It defines getchar() in terms of getc(), and it defines getc() in terms of fgetc().

  9. (Borrowed: Thanks). The first argument to realloc() is the pointer to the start of the currently allocated memory, not a pointer to the pointer to the start of the currently allocated memory. If you didn't get a compilation warning from it, you are not compiling with enough warnings set on your compiler. You should turn up the warnings to the maximum. You should heed the compiler warnings - they are normally indicative of bugs in your code, especially while you are still learning the language.

  10. It is often easier to keep the string without a null terminator until you know you have reached the end of the line (or end of input). When there are no more characters to be read (for the time being), then append the null so that the string is properly terminated before it is returned. These functions do not need the string to be properly terminated while they are reading, as long as you keep track of where you are in the string. Do make sure you have enough room at all times to add the NUL '\0' to the end of the string, though.
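Several of the points above (int for the character, capturing the result of realloc(), stopping at EOF, growing by doubling, terminating only at the end) can be put together in one short function. This is only a sketch under stated assumptions: the name read_line and its return convention (NULL on allocation failure, or when EOF arrives before any character) are invented here and are not taken from the question or from POSIX getline().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from fp into a malloc'd buffer.
 * Returns NULL on allocation failure or if EOF arrives before any
 * character was read. The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 4;            /* small start so the growth path gets tested */
    size_t len = 0;
    char *buf;
    int c;                     /* int, not char, so EOF stays distinct */

    assert(fp != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {  /* always keep room for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { /* the old block is still valid: release it */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';           /* terminate once, at the very end */
    return buf;
}
```

A caller would typically loop with something like `while ((line = read_line(stdin)) != NULL)` and free each line after use.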

See Kernighan & Pike 'The Practice of Programming' for a lot of relevant discussions. I also think Maguire 'Writing Solid Code' has relevant advice to offer, for all it is somewhat dated. However, you should be aware that there are those who excoriate the book. Consequently, I recommend TPOP over WSC (but Amazon has WSC available from $0.01 + p&p, whereas TPOP starts at $20.00 + p&p -- this may be the market speaking).


TPOP was previously at http://plan9.bell-labs.com/cm/cs/tpop and http://cm.bell-labs.com/cm/cs/tpop but both are now (2015-08-10) broken. See also Wikipedia on TPOP.

Possible Duplicate:
When and why will an OS initialise memory to 0xCD, 0xDD, etc. on malloc/free/new/delete?

Why is memory I haven't initialized set to 0xCC?

Setting the memory to 0xCC will decrease performance, so there must be a reason for filling the memory with this byte.

The compiler does this for you in debug mode, so that if you accidentally read uninitialized memory, you'll see the distinctive 0xCC value and recognize that you (probably) read uninitialized memory. The 0xCC value has other useful properties; for example, on x86 it is the opcode of the INT 3 breakpoint instruction, so accidentally executing uninitialized memory stops in the debugger.

The basic principle: make it easy to identify values that come from reading uninitialized memory.

This doesn't happen in your release builds.

This technique was introduced in Writing Solid Code.
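The same idea can be applied by hand to heap memory. The following is a minimal sketch, not the compiler's mechanism and not code from the book: debug_malloc and FILL_UNINIT are names made up for this example, and the choice to treat a zero-byte request as a caller bug is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FILL_UNINIT 0xCC   /* assumed fill byte, mirroring the value discussed above */

/* Hypothetical wrapper: allocates size bytes and, in debug builds only,
 * fills them with a recognisable pattern so that reads of uninitialized
 * memory stand out. */
void *debug_malloc(size_t size)
{
    void *p;

    assert(size != 0);     /* this sketch treats zero-byte requests as a bug */
    p = malloc(size);
#ifndef NDEBUG
    if (p != NULL)
        memset(p, FILL_UNINIT, size);
#endif
    return p;
}
```

With NDEBUG defined (a typical release build) the fill disappears, which matches the point that the cost is paid only while debugging.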

I am using realloc in every iteration of a for loop that iterates more than 10000 times.

Is this a good practice? Will realloc cause an error if it was called a lot of times?

In C:

Used properly, there's nothing wrong with realloc. That said, it's easy to use it incorrectly. See Writing Solid Code for an in-depth discussion of all the ways to mess up calling realloc and for the additional complications it can cause while debugging.

If you find yourself reallocating the same buffer again and again with only a small incremental size bump, be aware that it's usually much more efficient to allocate more space than you need, and then keep track of the actual space used. If you exceed the allocated space, allocate a new buffer at a larger size, copy the contents, and free the old buffer.
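As a rough illustration of tracking capacity separately from the space actually used, here is a small sketch of a growable array of ints. The type IntVec and the function intvec_push are hypothetical names, and doubling from an initial capacity of 4 is just one reasonable growth policy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of ints. */
typedef struct {
    int   *data;
    size_t used;       /* elements in use */
    size_t capacity;   /* elements allocated */
} IntVec;

/* Appends value. Returns 1 on success, 0 if memory could not be
 * obtained (in which case the vector is left unchanged). */
int intvec_push(IntVec *v, int value)
{
    assert(v != NULL);
    assert(v->used <= v->capacity);

    if (v->used == v->capacity) {
        size_t new_cap = (v->capacity == 0) ? 4 : v->capacity * 2;
        int *tmp = realloc(v->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return 0;              /* v->data is still valid */
        v->data = tmp;
        v->capacity = new_cap;
    }
    v->data[v->used++] = value;
    return 1;
}
```

Starting from `IntVec v = { NULL, 0, 0 };`, ten thousand pushes trigger only a dozen calls to realloc instead of ten thousand.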

In C++:

You probably should avoid realloc (as well as malloc and free). Whenever possible, use a container class from the standard library (e.g., std::vector). They are well-tested and well-optimized and relieve you of the burden of a lot of the housekeeping details of managing the memory correctly (like dealing with exceptions).

C++ doesn't have the concept of reallocating an existing buffer. Instead, a new buffer is allocated at the new size, contents are copied, and the old buffer is deleted. This is what realloc does when it cannot satisfy the new size at the existing location, which makes it seem like C++'s approach is less efficient. But it's rare that realloc can actually take advantage of an in-place reallocation. And the standard C++ containers are quite smart about allocating in a way that minimizes fragmentation and about amortizing the cost across many updates, so it's generally not worth the effort to pursue realloc if your goal is to increase performance.

I'm trying to make my code easily understood by future readers.

I've always had issues with how to word my if statement comments to make it the most understandable.

Maybe it seems trivial, but it's something that's always bothered me

Here's an example:

if ( !$request ) {
    $request = $_SERVER['REQUEST_URI'];
}

Here are some ways I can think of commenting it:

// If request doesn't exist
if ( !$request ) {
    // Set request to current request_uri
    $request = $_SERVER['REQUEST_URI'];
}

// Check for a request
if ( !$request ) {
    $request = $_SERVER['REQUEST_URI'];
}

// Request doesn't exist
if ( !$request ) {
    // Set request
    $request = $_SERVER['REQUEST_URI'];
}

Not the best example, but the way I see it there are infinite ways to word it.

I've never really worked on a team so I don't have much experience on other people reading my code.

What are your experiences of the best way to word that to make it readable for future coders?

Read the book Clean Code by Robert Martin.

http://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882

Comments should be only used to explain concepts that the code can't explain itself.

As said before, your example may be too simple to really need a comment, but here are a couple of general suggestions:

  • "positive" conditions are usually easier to read than negations (not condition)

  • don't hesitate to extract methods with a detailed name that communicates the intent and thus avoids the need for some of the comments. In your case, say you create:

    function getRequest( $req ) {
        if( isset( $req ) ) {
            return $req;
        } else {
            return $_SERVER['REQUEST_URI'];
        }
    }
    

    But again, we would need a more appropriate example; in your case it might be overkill.

  • must read :

I am studying for a test, and I was wondering if any of these are equivalent to free(ptr):

 malloc(NULL); 

 calloc(ptr); 

 realloc(NULL, ptr); 

 calloc(ptr, 0); 

 realloc(ptr, 0);

From what I understand, none of these will work because the free() function actually tells C that the memory after ptr is available again for it to use. Sorry that this is kind of a noob question, but help would be appreciated.

Actually, the last of those is equivalent to a call to free(). Read the specification of realloc() very carefully, and you will find it can allocate data anew, or change the size of an allocation (which, especially if the new size is larger than the old, might move the data around), and it can release memory too. In fact, you don't need the other functions; they can all be written in terms of realloc(). Not that anyone in their right mind would do so...but it could be done.

See Steve Maguire's "Writing Solid Code" for a complete dissection of the perils of the malloc() family of functions. See the ACCU web site for a complete dissection of the perils of reading "Writing Solid Code". I'm not convinced it is as bad as the reviews make it out to be - though its complete lack of a treatment of const does date it (back to the early 90s, when C89 was still new and not widely implemented in full).


D McKee's notes about MacOS X 10.5 (BSD) are interesting...

The C99 standard says:

7.20.3.3 The malloc function

Synopsis

#include <stdlib.h>
void *malloc(size_t size);

Description

The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.

Returns

The malloc function returns either a null pointer or a pointer to the allocated space.

7.20.3.4 The realloc function

Synopsis

#include <stdlib.h>
void *realloc(void *ptr, size_t size);

Description

The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.

If ptr is a null pointer, the realloc function behaves like the malloc function for the specified size. Otherwise, if ptr does not match a pointer earlier returned by the calloc, malloc, or realloc function, or if the space has been deallocated by a call to the free or realloc function, the behavior is undefined. If memory for the new object cannot be allocated, the old object is not deallocated and its value is unchanged.

Returns

The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.


Apart from editorial changes because of extra headers and functions, the ISO/IEC 9899:2011 standard says the same as C99, but in section 7.22.3 instead of 7.20.3.


The Solaris 10 (SPARC) man page for realloc says:

The realloc() function changes the size of the block pointed to by ptr to size bytes and returns a pointer to the (possibly moved) block. The contents will be unchanged up to the lesser of the new and old sizes. If the new size of the block requires movement of the block, the space for the previous instantiation of the block is freed. If the new size is larger, the contents of the newly allocated portion of the block are unspecified. If ptr is NULL, realloc() behaves like malloc() for the specified size. If size is 0 and ptr is not a null pointer, the space pointed to is freed.

That's a pretty explicit 'it works like free()' statement.

However, that MacOS X 10.5 or BSD says anything different reaffirms the "No-one in their right mind" part of my first paragraph.


There is, of course, the C99 Rationale...It says:

7.20.3 Memory management functions

The treatment of null pointers and zero-length allocation requests in the definition of these functions was in part guided by a desire to support this paradigm:

OBJ * p; // pointer to a variable list of OBJs
    /* initial allocation */
p = (OBJ *) calloc(0, sizeof(OBJ));
     /* ... */
     /* reallocations until size settles */
 while(1) {
    p = (OBJ *) realloc((void *)p, c * sizeof(OBJ));
         /* change value of c or break out of loop */
 }

This coding style, not necessarily endorsed by the Committee, is reported to be in widespread use.

Some implementations have returned non-null values for allocation requests of zero bytes. Although this strategy has the theoretical advantage of distinguishing between “nothing” and “zero” (an unallocated pointer vs. a pointer to zero-length space), it has the more compelling theoretical disadvantage of requiring the concept of a zero-length object. Since such objects cannot be declared, the only way they could come into existence would be through such allocation requests.

The C89 Committee decided not to accept the idea of zero-length objects. The allocation functions may therefore return a null pointer for an allocation request of zero bytes. Note that this treatment does not preclude the paradigm outlined above.

QUIET CHANGE IN C89

A program which relies on size-zero allocation requests returning a non-null pointer will behave differently.

[...]

7.20.3.4 The realloc function

A null first argument is permissible. If the first argument is not null, and the second argument is 0, then the call frees the memory pointed to by the first argument, and a null argument may be returned; C99 is consistent with the policy of not allowing zero-sized objects.

A new feature of C99: the realloc function was changed to make it clear that the pointed-to object is deallocated, a new object is allocated, and the content of the new object is the same as that of the old object up to the lesser of the two sizes. C89 attempted to specify that the new object was the same object as the old object but might have a different address. This conflicts with other parts of the Standard that assume that the address of an object is constant during its lifetime. Also, implementations that support an actual allocation when the size is zero do not necessarily return a null pointer for this case. C89 appeared to require a null return value, and the Committee felt that this was too restrictive.


Thomas Padron-McCarthy observed:

C89 explicitly says: "If size is zero and ptr is not a null pointer, the object it points to is freed." So they seem to have removed that sentence in C99?

Yes, they have removed that sentence because it is subsumed by the opening sentence:

The realloc function deallocates the old object pointed to by ptr

There's no wriggle room there; the old object is deallocated. If the requested size is zero, then you get back whatever malloc(0) might return, which is often (usually) a null pointer but might be a non-null pointer that can also be returned to free() but which cannot legitimately be dereferenced.
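Because the result of a zero-size request differs between implementations, code that wants predictable behaviour can keep the "release" case away from realloc() altogether. The following is a sketch only; resize_block is a hypothetical name, and the convention of returning 1 or 0 and updating the caller's pointer is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: resizes the block at *pp to new_size bytes.
 * A new_size of 0 frees the block explicitly and sets *pp to NULL,
 * so nothing depends on what realloc(ptr, 0) returns.
 * Returns 1 on success, 0 on failure (the old block stays valid). */
int resize_block(void **pp, size_t new_size)
{
    void *tmp;

    assert(pp != NULL);

    if (new_size == 0) {
        free(*pp);             /* free(NULL) is a no-op */
        *pp = NULL;
        return 1;
    }
    tmp = realloc(*pp, new_size);  /* behaves like malloc when *pp is NULL */
    if (tmp == NULL)
        return 0;
    *pp = tmp;
    return 1;
}
```

The caller's pointer is either a live block or NULL after every call, which avoids having to reason about a non-null pointer to zero bytes.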

This is my first question after a long time reading this marvelous website.

My question is probably a little silly, but I want to know others' opinions about this. Which is better: to create several specific methods or, on the other hand, only one generic method? Here is an example...

unsigned char *Method1(CommandTypeEnum command, ParamsCommand1Struct *params)
{
if(params == NULL) return NULL;

// Construct a string (command) with those specific params (params->element1, ...)

return buffer; // buffer is a member of the class 
}

unsigned char *Method2(CommandTypeEnum command, ParamsCommand2Struct *params)
{
...
}

unsigned char *Method3(CommandTypeEnum command, ParamsCommand3Struct *params)
{
...
}
unsigned char *Method4(CommandTypeEnum command, ParamsCommand4Struct *params)
{
...
}

or

unsigned char *Method(CommandTypeEnum command, void *params)
{
switch(command)
{
case CMD_1:
{
if(params == NULL) return NULL;

ParamsCommand1Struct *value = (ParamsCommand1Struct *) params;

// Construct a string (command) with those specific params (params->element1, ...)

return buffer;
}
break;

// ...

default:
break;
}
}

The main thing I do not really like about the latter option is this:

ParamsCommand1Struct *value = (ParamsCommand1Struct *) params;

because "params" might not be a pointer to "ParamsCommand1Struct" but a pointer to "ParamsCommand2Struct" or something else.

I really appreciate your opinions!

General Answer

In Writing Solid Code, Steve Maguire's advice is to prefer distinct functions (methods) for specific situations. The reason is that you can assert conditions that are relevant to the specific case, and you can debug more easily because you have more context.

An interesting example is the standard C run-time's functions for dynamic memory allocation. Most of it is redundant, as realloc can actually do (almost) everything you need. If you have realloc, you don't need malloc or free. But when you have such a general function, used for several different types of operations, it's hard to add useful assertions, harder to write unit tests, and harder to see what's happening when debugging. Maguire takes it a step further and suggests that not only should realloc do just reallocation, but it should probably be two distinct functions: one for growing a block and one for shrinking a block.

While I generally agree with his logic, sometimes there are practical advantages to having one general-purpose method (often when the operation is highly data-driven). So I usually decide on a case-by-case basis, with a bias toward creating very specific methods rather than overly general-purpose ones.

Specific Answer

In your case, I think you need to find a way to factor out the common code from the specifics. The switch is often a signal that you should be using a small class hierarchy with virtual functions.

If you like the single-method approach, then it probably should be just a dispatcher to the more specific methods. In other words, each of the cases in the switch statement simply calls the appropriate Method1, Method2, etc. If you want the user to see only the general-purpose method, then you can make the specific implementations private methods.
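A dispatcher of that kind might look roughly like the sketch below, written in plain C rather than as class methods. The struct fields, the output format, and the use of char instead of unsigned char are all assumptions made for illustration; static functions stand in for private methods.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { CMD_1, CMD_2 } CommandTypeEnum;

/* Field names are invented for this sketch. */
typedef struct { int element1; } ParamsCommand1Struct;
typedef struct { int first; int second; } ParamsCommand2Struct;

static char buffer[64];   /* stands in for the class member in the question */

/* Specific builders: each can assert what matters for its own case. */
static char *Method1(const ParamsCommand1Struct *params)
{
    assert(params != NULL);
    snprintf(buffer, sizeof buffer, "CMD1 %d", params->element1);
    return buffer;
}

static char *Method2(const ParamsCommand2Struct *params)
{
    assert(params != NULL);
    snprintf(buffer, sizeof buffer, "CMD2 %d %d", params->first, params->second);
    return buffer;
}

/* General-purpose entry point: it only dispatches. */
char *Method(CommandTypeEnum command, const void *params)
{
    switch (command) {
    case CMD_1: return Method1(params);
    case CMD_2: return Method2(params);
    }
    assert(!"unknown command");
    return NULL;
}
```

The void pointer is still unchecked at the dispatch point, so this keeps the convenience of one entry point without removing the type-safety concern raised in the question; calling the specific functions directly avoids it.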

I am trying to allocate a single block of shorts, fwrite it to a file, and then read it back. But the data that gets written into the file doesn't match what is coming out. I've isolated the problem to the following bit of code. Any ideas what I'm doing wrong?

#define CHUNK_SIZE 1000
void xwriteStructuresToFile(FILE *file, void * structureData)
{
    assert((fwrite(structureData, sizeof(short), CHUNK_SIZE, file)) == CHUNK_SIZE);

}

void wwbuildPtxFiles(void)
{   
    FILE *file = fopen("s:\\tv\\run32\\junky.bin", WRITE_BINARY);
    int count = 10;
    short *ptx = (short *) calloc(CHUNK_SIZE * count, sizeof(short ) );

    memset(ptx, '3', sizeof(short) * CHUNK_SIZE * count);
    for (int dayIndex = 0; dayIndex < count; ++dayIndex)
        xwriteStructuresToFile(file, (void *) &ptx[ CHUNK_SIZE * sizeof(short) * dayIndex ]);

    free(ptx);
    fclose(file);

    file = fopen("s:\\tv\\run32\\junky.bin", READ_BINARY);
    int xcount = CHUNK_SIZE * count * sizeof(short );
    for (int i = 0; i < xcount; ++i)
    {
        char x;
        if ((x = getc(file)) != '3')
            assert(false);
    }
}

You're writing the 'data' beyond the end of the array!

xwriteStructuresToFile(file, (void *) &ptx[ CHUNK_SIZE * sizeof(short) * dayIndex ]);

You should be using:

xwriteStructuresToFile(file, &ptx[CHUNK_SIZE * dayIndex]);

The C compiler scales by sizeof(short) automatically. If you have an array of integers, you don't write array[i * sizeof(int)] to access the ith member of the array; similarly, here you don't need to scale the index by sizeof(short). Indeed, it is crucial that you don't because you are writing twice (on the assumption that sizeof(short) == 2) as far through the memory as you expected.

You also should not use assert() around a function call that must be executed. You use assert() in a separate statement that can be omitted from the program without affecting its function. This is discussed at some length in 'Writing Solid Code' by Steve Maguire, which is a bit dated in places, but sound on this point, at least.
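To illustrate the point about assert(), here is a sketch of the write helper with the call that must happen kept outside the assertion. The name write_chunk and its 1/0 return convention are assumptions for this example, not the asker's code.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK_SIZE 1000

/* Writes CHUNK_SIZE shorts from data to file.
 * Returns 1 on success, 0 on a short write. */
int write_chunk(FILE *file, const short *data)
{
    size_t written;

    /* Pure checks with no side effects: safe to compile out with NDEBUG. */
    assert(file != NULL);
    assert(data != NULL);

    /* The fwrite call is an ordinary statement, so it always runs. */
    written = fwrite(data, sizeof *data, CHUNK_SIZE, file);
    return written == CHUNK_SIZE;
}
```

If the original version were compiled with NDEBUG defined, the whole fwrite call inside assert() would vanish and nothing would be written at all.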

Is there any way in C to know if a memory block has previously been freed with free()? Can I do something like...

if(isFree(pointer))
{ 
    //code here
}

For a platform-specific solution, you may be interested in the Win32 function IsBadReadPtr (and others like it). This function will be able to (almost) predict whether you will get a segmentation fault when reading from a particular chunk of memory.

Note: IsBadReadPtr has been deprecated by Microsoft.

However, this does not protect you in the general case, because the operating system knows nothing of the C runtime heap manager, and if a caller passes in a buffer that isn't as large as you expect, then the rest of the heap block will continue to be readable from an OS perspective.

Pointers have no information with them other than where they point. The best you can do is say "I know how this particular compiler version allocates memory, so I'll dereference memory, move the pointer back 4 bytes, check the size, makes sure it matches..." and so on. You cannot do it in a standard fashion, since memory allocation is implementation defined. Not to mention they might have not dynamically allocated it at all.

On a side note, I recommend reading 'Writing Solid Code' by Steve Maguire. Excellent sections on memory management.
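Since the pointer itself cannot tell you whether its block was freed, a common convention is to record that fact yourself by nulling the pointer at the moment you free it. The sketch below assumes that convention; FREE_AND_NULL and first_element are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees the block and nulls the caller's pointer in one step, so a later
 * test against NULL is meaningful for that particular variable. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Example consumer: code that must only receive live pointers can
 * assert on it and catch use-after-release in debug builds. */
int first_element(const int *values)
{
    assert(values != NULL);
    return values[0];
}
```

This only protects the variable passed to the macro. Any other copy of the pointer still holds the old address and is just as dangerous as before.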

I have to do a project in C where I have to constantly allocate memory for big data structures and then free it. Is there a library with a function that helps keep track of memory usage so I can be sure I am doing things correctly? (I'm new to C)

For example, a function that returns: A) The total of memory used by the program at the moment, OR B) The total of memory left, would do the job. I already googled for that and searched in other answers.

Thanks!

Although some people excoriate it, the book "Writing Solid Code" by Steve Maguire has a lot of reasonable ideas about how to track your memory usage without modifying the system memory allocation functions. Basically, instead of calling the raw malloc() etc functions directly, you call your own memory allocation API built on top of the standard one. Your API can track allocations and frees, detect double frees, frees of non-allocated memory, unreleased (leaked) memory, complete dumps of what is allocated, etc. You either need to crib the code from the book or write your own equivalent code. One interesting problem is providing a stack trace for each allocation; there isn't a standard way to determine the call stack. (The book is a bit dated now; it was written just a few years after the C89 standard was published and does not exploit const qualifiers.)

Some will argue that these services can be provided by the system malloc(); indeed, they can, and these days often are. You should look carefully at the manual provided for your version of malloc(), and decide whether it provides enough for you. If not, then the wrapper API mechanism is reasonable. Note that using your own API means you track what you explicitly allocate, while leaving library functions not written to use your API using the system services - as, indeed, does your code, under the covers.
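A very small version of such a wrapper API might look like the sketch below. It is not the code from the book: the names track_malloc, track_free, track_live_blocks and track_live_bytes are invented here, and it only counts live blocks and bytes rather than detecting double frees or recording call sites.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static size_t live_blocks;   /* blocks allocated and not yet freed */
static size_t live_bytes;    /* bytes requested in those blocks */

/* Header stored in front of each user block. The extra union members
 * are there only to keep the user data suitably aligned. */
typedef union {
    size_t      size;
    long double ld;
    long long   ll;
    void       *ptr;
} Header;

void *track_malloc(size_t size)
{
    Header *h;

    assert(size != 0);                     /* this sketch rejects zero-byte requests */
    if (size > (size_t)-1 - sizeof *h)     /* guard against overflow */
        return NULL;
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    live_blocks++;
    live_bytes += size;
    return h + 1;                          /* user data starts after the header */
}

void track_free(void *p)
{
    Header *h;

    if (p == NULL)
        return;
    h = (Header *)p - 1;
    assert(live_blocks > 0 && live_bytes >= h->size);
    live_blocks--;
    live_bytes -= h->size;
    free(h);
}

size_t track_live_blocks(void) { return live_blocks; }
size_t track_live_bytes(void)  { return live_bytes; }
```

Checking that track_live_blocks() returns 0 just before the program exits gives a cheap leak test for everything allocated through the wrapper. Memory allocated directly with malloc() or inside library functions is not counted.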

You should also look into valgrind. It does a superb job tracking memory abuses, and in particular will report leaked memory (memory that was allocated but not freed). It also spots when you read or write outside the bounds of an allocated space, spotting buffer overflows.

Nevertheless, ultimately, you need to be disciplined in the way you write your code, ensuring that every time you allocate memory, you know when it will be released.

I have a project to write a cache simulation in C, and so far I am stuck on allocating space for an array: the program crashes. Please help me out. I have to use pointers in this program. When I simply created the array in the main function, there were no problems, but after making the pointer to it global, the program started to crash, and it has to be global. It's one of the requirements.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
#include <assert.h>

int INDEXLENGTH; // global variable for num of bits in index
int OFFSETLENGTH; // global variable for num of bits in byte offset
int TAGBITS; // global variable for num of bits in the tag
int misses=0; //variable for total number of misses
int hits=0;
int **tagAr; // tagArray pointer
int **lruAr; // LRUArray pointer
int cacheSize; // user specified size of cache
int blockSize; // user specified size of each block
int linesperset;
int sets;

int main(void)
{
    int i; // simply a few variables for future for loops
    int j;

    printf("Welcome to Lab A Cache Simulator by Divya, Alex and Jenn.\n");
    printf("To begin, please input the size of the cache you will be simulating in bytes.\n");
    scanf(" %d", &cacheSize); // (cache size in bytes) is read in from the user
    printf("You have inputed this: %d\n", cacheSize);

    printf("Next, please input the size of each cache line in bytes.\n");
    scanf(" %d", &blockSize); // blocksize is read in from the user
    printf("You have inputed this: %d\n", blockSize);

    printf("Finally, please input the number of ways of associativity for this cache.\n");
    printf("Also, input the number as a power of 2.\n");
    scanf(" %d", &linesperset); // linesperset is read in from the user
    printf("You have inputed this: %d\n", linesperset);

    sets = (cacheSize/(blockSize*linesperset));
    printf("Variable sets is: %d\n", sets);


    tagAr=(int **)malloc(sets*sizeof(int *)); // allocating space for "l" array pointers
    for (i=0; i<sets; i++) tagAr[i]=(int *)malloc(linesperset*sizeof(int)); //allocates space for k columns for each p[i]

    lruAr=(int **)malloc(sets*sizeof(int *)); // allocating space for "l" array pointers
    for (i=0; i<sets; i++) lruAr[i]=(int *)malloc(linesperset*sizeof(int)); //allocates space for k columns for each p[i]


    for (i=0; i<sets; i++)
        {
            for (j=0; j<blockSize; j++)
            {
                tagAr[i][j] = -1;
                lruAr[i][j] = -1;
            }
        }

    for (i = 0; i < sets; i++) //This part of the code prints array for debuging purposes
        {
            for (j = 0; j < blockSize; j++)
            {
                printf(" %d", lruAr[i][j]);
            }
            printf("\n");
            printf("\n");
        }

    printf("This is the value of IndexLength before setting, should be 0 : %d\n", INDEXLENGTH); //only for debuging purposes
    setIndexLength();
    printf("This is the value of IndexLength after setting: %d\n", INDEXLENGTH); //only for debuging purposes
    printf("This is the value of OffsetLength before setting, should be 0 : %d\n", OFFSETLENGTH); //only for debuging purposes
    offsetLength();
    printf("This is the value of OffsetLength after setting: %d\n", OFFSETLENGTH); //only for debuging purposes





    return misses;
}

This isn't really an answer but rather a follow-up to Pankrates' very good suggestion.

If you run the program under valgrind (after compiling it with debug information and no optimizations, just to be safe)

$ gcc -O0 -g -o test.o -W -Wall test.c   # Compile

$ valgrind ./test.o     # run simple Valgrind

you get:

==26726== Use of uninitialised value of size 8
==26726==    at 0x4008C5: main (test.c:54)
==26726==
==26726== Invalid write of size 4
==26726==    at 0x4008C5: main (test.c:54)
==26726==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==26726==
==26726==

which tells you that in the source file, at line 54, you are writing to an uninitialized variable. Something you did not allocate. Valgrind tells you what and where. To discover the why, the line is:

lruAr[i][j] = -1;

so you check where you allocated lruAr (valgrind says it's not allocated!), and you quickly find out what looks like a copy-and-paste bug:

tagAr=(int **)malloc(sets*sizeof(int *));
for (i=0; i<sets; i++) {
    tagAr[i]=(int *)malloc(linesperset*sizeof(int)); // <-- ORIGINAL LINE
}

lruAr=(int **)malloc(sets*sizeof(int *));
for (i=0; i<sets; i++) {
    tagAr[i]=(int *)malloc(linesperset*sizeof(int)); // <-- THE COPY ("tagAr"?)
}

The fix is the one described by Charlie Burns.

Valgrind can help you do much more than this, and will help you intercept much subtler bugs - bugs that are not so kind as to crash your application immediately and reproducibly.

For the same reason, do not use casts (or do so only as a last resort). By casting, you are telling the compiler "I know better". Truth is, the compiler usually knows better than you and me and mostly everybody else. By not using casts, you let it stand up and speak whenever you unwittingly do something suspicious, rather than go on dumbly and do what you told it to do instead of what you meant it to do.

(Sorry for the long and patronizing rant. But I got bitten so many times, I thought maybe I might save you some skin :-) )

Update

"Still crashes". Okay, so we run valgrind again after updating the source, and this time we get

==28686== Invalid write of size 4
==28686==    at 0x4008B8: main (test.c:54)
==28686==  Address 0x51e00a0 is 0 bytes after a block of size 16 alloc'd
==28686==    at 0x4C2C27B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28686==    by 0x4007F3: main (test.c:43)
==28686==
==28686== Invalid write of size 4
==28686==    at 0x4008E2: main (test.c:55)
==28686==  Address 0x51e0190 is 0 bytes after a block of size 16 alloc'd
==28686==    at 0x4C2C27B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28686==    by 0x400852: main (test.c:46)

Notice: we are writing 0 bytes after a block. This means we have a buffer overrun: we did allocate an array, but it is too small. Or our write is too large.

The array was allocated at lines 43 and 46 (the two cycled mallocs) and is written here:

            tagAr[i][j] = -1;
            lruAr[i][j] = -1;

...so this means that tagAr[i] has a width, but blockSize is bigger.

This might be due to me inserting nonsensical data, so you'd have to check yourself. But it is apparent that you alloc linesperset elements, and write blockSize of them. At the very least, blockSize must never be allowed to exceed linesperset.

Finally

The code below passes valgrind tests. I have also added a check that the memory has indeed been allocated: it is almost always unnecessary, and without it, the program will almost never crash and coredump. If almost never and shouldn't aren't good enough for you, you'd better check.

// allocating space for "l" array pointers
tagAr=malloc(sets*sizeof(int *));
lruAr=malloc(sets*sizeof(int *));

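if ((NULL == tagAr)||(NULL == lruAr)) {
    fprintf(stderr, "out of memory\n");
    exit(EXIT_FAILURE);   // do not go on to dereference a null pointer
}

for (i=0; i < sets; i++) {
    tagAr[i] = malloc(linesperset*sizeof(int));
    lruAr[i] = malloc(linesperset*sizeof(int));
    if ((NULL == tagAr[i])||(NULL == lruAr[i])) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    // initialize exactly as many elements as were allocated
    for (j=0; j < linesperset; j++) {
         tagAr[i][j] =
         lruAr[i][j] = -1;
    }
}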

For the same reason, even if many call it being paranoid and anal-retentive, and modern garbage collecting languages laugh at it, it is a good habit to free every allocated pointer once you're done, and set it to NULL for good measure (not to prevent loitering - but to ensure you're not using an obsoleted value somewhere).

// To free:
for (i=sets; i > 0; ) {
    i--;
    // To be really sure, you might want to bzero()
    // lruAr[i] and tagAr[i], or better still, memset()
    // them to an invalid and easily recognizable value.
    // I usually use 0xDEADBEEF, 0xDEAD or 0xA9.
    free(lruAr[i]); lruAr[i] = NULL;
    free(tagAr[i]); tagAr[i] = NULL;
}
free(lruAr); lruAr = NULL;
free(tagAr); tagAr = NULL;

(The above practices, including the 0xA9 value, from Writing Solid Code).

void insertAtTail( List *list, int val )
{
  //double pointer to head
  Node **link = &( list->head );
  //move to end of list
  while(*link != NULL) {
    *link = (*link)->next;
  }
  //create new node
  Node *n = (Node *)malloc(sizeof(Node));
  n->value = val;
  n->next = NULL;
  //insert node at end
  *link = n;
}

This code is supposed to add each value from the input file to the end of the list (to reverse the list), but my output is just 29 followed by a segmentation fault. It seems to be overwriting each value as it is added and then writing past the end of the list. How can I insert each value at the end of the list without overwriting the previous ones? The input follows:

72 19 47 31 8 36 12 88 15 75 51 29

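The main problem is the loop body: *link = (*link)->next; assigns through link, so it overwrites list->head on every pass and discards the existing nodes instead of walking past them. To step through the list without rewriting its links, advance link itself with link = &(*link)->next;. With the setup shown below, where the list always starts with a head node, the loop should also stop when (*link)->next is NULL, and the new node is attached with (*link)->next = n;.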

You didn't include an MCVE (Minimal, Complete, Verifiable Example) so it isn't clear how you create an empty list or otherwise set things up. The code below depends on the setup shown — where the empty list has a pointer to a valid head node that itself contains a pointer to null. The notation &(Node){ 0 } is a compound literal, a feature of C99 and C11 but not in C89. There may be a better way of initializing the list; I haven't got the time right now to go check the book where I know the information is available (Writing Solid Code by Steve Maguire).

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Node Node;

struct Node
{
    int value;
    Node *next;
};

typedef struct List List;

struct List
{
    Node *head;
};

static void print_list(const char *tag, List *list);
static void free_list_data(List *list);

static void insertAtTail(List *list, int val)
{
    // double pointer to head
    assert(list != NULL);
    Node **link = &list->head;
    // move to end of list
    while ((*link)->next != NULL)
    {
        // *link = (*link)->next;   // Old code
        link = &(*link)->next;      // Key change
    }
    // create new node
    Node *n = (Node *)malloc(sizeof(Node));
    if (n == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    n->value = val;
    n->next = NULL;
    // insert node at end
    // *link = n;                   // Old code
    (*link)->next = n;              // Key change
}

int main(void)
{
    List list = { &(Node){ 0 } };
    print_list("Empty", &list);
    static const int values[] =
    {
        85, 10, 77, 51, 12, 66, 88, 79, 60, 87,
        15, 83, 76, 44, 15, 53, 69, 85, 81, 12,
    };
    enum { NUM_VALUES = sizeof(values) / sizeof(values[0]) };

    for (int i = 0; i < NUM_VALUES; i++)
    {
        char buffer[16];
        snprintf(buffer, sizeof(buffer), "Added %d", values[i]);
        insertAtTail(&list, values[i]);
        print_list(buffer, &list);
    }

    free_list_data(&list);

    return 0;
}

static void free_list_data(List *list)
{
    Node *node = list->head->next;
    while (node != NULL)
    {
        Node *next = node->next;
        free(node);
        node = next;
    }
}

static void print_list(const char *tag, List *list)
{
    assert(list != NULL && list->head != NULL);
    printf("%s:\n", tag);
    int i = 0;
    for (Node *node = list->head->next; node != 0; node = node->next)
    {
        printf("%4d", node->value);
        if (++i % 10 == 0)
            putchar('\n');
    }
    if (i % 10 != 0)
        putchar('\n');
}

Example output:

Empty:
Added 85:
  85
Added 10:
  85  10
Added 77:
  85  10  77
Added 51:
  85  10  77  51
Added 12:
  85  10  77  51  12
Added 66:
  85  10  77  51  12  66
Added 88:
  85  10  77  51  12  66  88
Added 79:
  85  10  77  51  12  66  88  79
Added 60:
  85  10  77  51  12  66  88  79  60
Added 87:
  85  10  77  51  12  66  88  79  60  87
Added 15:
  85  10  77  51  12  66  88  79  60  87
  15
Added 83:
  85  10  77  51  12  66  88  79  60  87
  15  83
Added 76:
  85  10  77  51  12  66  88  79  60  87
  15  83  76
Added 44:
  85  10  77  51  12  66  88  79  60  87
  15  83  76  44
Added 15:
  85  10  77  51  12  66  88  79  60  87
  15  83  76  44  15
Added 53:
  85  10  77  51  12  66  88  79  60  87
  15  83  76  44  15  53
Added 69:
  85  10  77  51  12  66  88  79  60  87
  15  83  76  44  15  53  69
Added 85:
  85  10  77  51  12  66  88  79  60  87
  15  83  76  44  15  53  69  85
Added 81:
  85  10  77  51  12  66  88  79  60  87
  15  83  76  44  15  53  69  85  81
Added 12:
  85  10  77  51  12  66  88  79  60  87
  15  83  76  44  15  53  69  85  81  12
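If the list instead starts out truly empty (list->head == NULL, with no dummy head node), the original loop condition and the final *link = n; can stay as they were; only the stepping line has to change. This is a minimal sketch under that assumption; the name insertAtTailNoDummy and the int return code are illustrative and not from the question:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node Node;
struct Node
{
    int value;
    Node *next;
};

typedef struct List
{
    Node *head;   /* NULL when the list is empty (assumption) */
} List;

/* Appends val at the tail. Returns 0 on success, -1 if malloc fails. */
static int insertAtTailNoDummy(List *list, int val)
{
    assert(list != NULL);
    Node **link = &list->head;
    /* Walk to the NULL pointer that terminates the list. Only the local
     * variable `link` moves; the list's own pointers are left untouched. */
    while (*link != NULL)
        link = &(*link)->next;
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = val;
    n->next = NULL;
    *link = n;    /* overwrite the terminating NULL with the new node */
    return 0;
}
```

Because link always points at the pointer that needs updating (either list->head or some node's next), the empty-list case needs no special handling.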