Inside the C++ Object Model

Stanley B. Lippman

Mentioned 29

There is a lot of misinformation and myth about the overhead and costs associated with C++. Now Stan Lippman, the acclaimed author of the C++ Primer, answers the call for a book that gives strategy guidelines for C++ programming. Inside the C++ Object Model explains where overhead costs reside and what they actually consist of. The author explains which parts vary by implementation and which are invariant. He tells how the various implementation models arose, points out areas where they are likely to evolve, and explains why they are what they are. This book is a must for C++ programmers who want to understand the semantic implications of the C++ object model and how the model affects their programs.

Mentioned in questions and answers.

This question attempts to collect the few pearls among the dozens of bad C++ books that are published every year.

Unlike many other programming languages, which are often picked up on the go from tutorials found on the Internet, C++ is a language that few people can pick up quickly without studying a well-written book. It is far too big and complex for that. In fact, it is so big and complex that there are very many very bad C++ books out there. And we are not talking about bad style, but about books sporting glaringly obvious factual errors and promoting abysmally bad programming styles.

Please edit the accepted answer to provide quality books and an approximate skill level — preferably after discussing your addition in the C++ chat room. (The regulars might mercilessly undo your work if they disagree with a recommendation.) Add a short blurb/description about each book that you have personally read/benefited from. Feel free to debate quality, headings, etc. Books that meet the criteria will be added to the list. Books that have reviews by the Association of C and C++ Users (ACCU) have links to the review.

Note: FAQs and other resources can be found in the C++ tag info. There is also a similar post for C: The Definitive C Book Guide and List.

Beginner

Introductory, no previous programming experience

  • Programming: Principles and Practice Using C++ (Bjarne Stroustrup) (updated for C++11/C++14) An introduction to programming using C++ by the creator of the language. A good read, that assumes no previous programming experience, but is not only for beginners.

Introductory, with previous programming experience

  • C++ Primer * (Stanley Lippman, Josée Lajoie, and Barbara E. Moo) (updated for C++11) Coming in at about 1,000 pages, this is a very thorough introduction to C++ that covers just about everything in the language in a very accessible format and in great detail. The fifth edition (released August 16, 2012) covers C++11. [Review]

  • A Tour of C++ (Bjarne Stroustrup) (EBOOK) The “tour” is a quick (about 180 pages and 14 chapters) tutorial overview of all of standard C++ (language and standard library, and using C++11) at a moderately high level for people who already know C++ or at least are experienced programmers. This book is an extended version of the material that constitutes Chapters 2-5 of The C++ Programming Language, 4th edition.

  • Accelerated C++ (Andrew Koenig and Barbara Moo) This basically covers the same ground as the C++ Primer, but does so in a quarter of the space. This is largely because it does not attempt to be an introduction to programming, but an introduction to C++ for people who've previously programmed in some other language. It has a steeper learning curve, but, for those who can cope with this, it is a very compact introduction to the language. (Historically, it broke new ground by being the first beginner's book to use a modern approach to teaching the language.) [Review]

  • Thinking in C++ (Bruce Eckel) A free, tutorial-style set of introductory books in two volumes. Downloads: vol 1, vol 2. Unfortunately they're marred by a number of trivial errors (e.g. maintaining that temporaries are automatically const), with no official errata list. A partial third-party errata list is available at http://www.computersciencelab.com/Eckel.htm, but it's apparently not maintained.

* Not to be confused with C++ Primer Plus (Stephen Prata), with a significantly less favorable review.

Best practices

  • Effective C++ (Scott Meyers) This was written with the aim of being the best second book C++ programmers should read, and it succeeded. Earlier editions were aimed at programmers coming from C; the third edition changes this and targets programmers coming from languages like Java. It presents ~50 easy-to-remember rules of thumb along with their rationale in a very accessible (and enjoyable) style. For C++11 and C++14 the examples and a few issues are outdated and Effective Modern C++ should be preferred. [Review]

  • Effective Modern C++ (Scott Meyers) This is basically the new version of Effective C++, aimed at C++ programmers making the transition from C++03 to C++11 and C++14.

  • Effective STL (Scott Meyers) This aims to do for the part of the standard library that came from the STL what Effective C++ did for the language as a whole: it presents rules of thumb along with their rationale. [Review]

Intermediate

  • More Effective C++ (Scott Meyers) Even more rules of thumb than Effective C++. Not as important as the ones in the first book, but still good to know.

  • Exceptional C++ (Herb Sutter) Presented as a set of puzzles, this has one of the best and most thorough discussions of proper resource management and exception safety in C++ through Resource Acquisition is Initialization (RAII), in addition to in-depth coverage of a variety of other topics including the pimpl idiom, name lookup, good class design, and the C++ memory model. [Review]

  • More Exceptional C++ (Herb Sutter) Covers additional exception safety topics not covered in Exceptional C++, in addition to discussion of effective object oriented programming in C++ and correct use of the STL. [Review]

  • Exceptional C++ Style (Herb Sutter) Discusses generic programming, optimization, and resource management; this book also has an excellent exposition of how to write modular code in C++ by using nonmember functions and the single responsibility principle. [Review]

  • C++ Coding Standards (Herb Sutter and Andrei Alexandrescu) “Coding standards” here doesn't mean “how many spaces should I indent my code?” This book contains 101 best practices, idioms, and common pitfalls that can help you to write correct, understandable, and efficient C++ code. [Review]

  • C++ Templates: The Complete Guide (David Vandevoorde and Nicolai M. Josuttis) This is the book about templates as they existed before C++11. It covers everything from the very basics to some of the most advanced template metaprogramming, explains every detail of how templates work (both conceptually and in terms of how they are implemented), and discusses many common pitfalls. Has excellent summaries of the One Definition Rule (ODR) and overload resolution in the appendices. A second edition is scheduled for 2017. [Review]


Advanced

  • Modern C++ Design (Andrei Alexandrescu) A groundbreaking book on advanced generic programming techniques. Introduces policy-based design, type lists, and fundamental generic programming idioms, then explains how many useful design patterns (including small object allocators, functors, factories, visitors, and multimethods) can be implemented efficiently, modularly, and cleanly using generic programming. [Review]

  • C++ Template Metaprogramming (David Abrahams and Aleksey Gurtovoy)

  • C++ Concurrency In Action (Anthony Williams) A book covering C++11 concurrency support including the thread library, the atomics library, the C++ memory model, locks and mutexes, as well as issues of designing and debugging multithreaded applications.

  • Advanced C++ Metaprogramming (Davide Di Gennaro) A pre-C++11 manual of TMP techniques, focused more on practice than theory. There are a ton of snippets in this book, some of which are made obsolete by type traits, but the techniques are nonetheless useful to know. If you can put up with the quirky formatting/editing, it is easier to read than Alexandrescu, and arguably more rewarding. For more experienced developers, there is a good chance that you may pick up something about a dark corner of C++ (a quirk) that usually only comes about through extensive experience.


Reference Style - All Levels

  • The C++ Programming Language (Bjarne Stroustrup) (updated for C++11) The classic introduction to C++ by its creator. Written to parallel the classic K&R, it indeed reads very much like it and covers just about everything from the core language to the standard library, to programming paradigms, to the language's philosophy. [Review]

  • C++ Standard Library Tutorial and Reference (Nicolai Josuttis) (updated for C++11) The introduction and reference for the C++ Standard Library. The second edition (released on April 9, 2012) covers C++11. [Review]

  • The C++ IO Streams and Locales (Angelika Langer and Klaus Kreft) There's very little to say about this book except that, if you want to know anything about streams and locales, then this is the one place to find definitive answers. [Review]

C++11/14 References:

  • The C++ Standard (INCITS/ISO/IEC 14882-2011) This, of course, is the final arbiter of all that is or isn't C++. Be aware, however, that it is intended purely as a reference for experienced users willing to devote considerable time and effort to its understanding. As usual, the first release was quite expensive ($300+ US), but it has now been released in electronic form for $60 US.

  • The C++14 standard is available, but seemingly not in an economical form – directly from the ISO it costs 198 Swiss Francs (about $200 US). For most people, the final draft before standardization is more than adequate (and free). Many will prefer an even newer draft, documenting new features that are likely to be included in C++17.

  • Overview of the New C++ (C++11/14) (PDF only) (Scott Meyers) (updated for C++1y/C++14) These are the presentation materials (slides and some lecture notes) of a three-day training course offered by Scott Meyers, who's a highly respected author on C++. Even though the list of items is short, the quality is high.

  • The C++ Core Guidelines (C++11/14/17/…) (edited by Bjarne Stroustrup and Herb Sutter) is an evolving online document consisting of a set of guidelines for using modern C++ well. The guidelines are focused on relatively higher-level issues, such as interfaces, resource management, memory management and concurrency affecting application architecture and library design. The project was announced at CppCon'15 by Bjarne Stroustrup and others and welcomes contributions from the community. Most guidelines are supplemented with a rationale and examples as well as discussions of possible tool support. Many rules are designed specifically to be automatically checkable by static analysis tools.

  • The C++ Super-FAQ (Marshall Cline, Bjarne Stroustrup and others) is an effort by the Standard C++ Foundation to unify the C++ FAQs previously maintained individually by Marshall Cline and Bjarne Stroustrup and also incorporating new contributions. The items mostly address issues at an intermediate level and are often written with a humorous tone. Not all items might be fully up to date with the latest edition of the C++ standard yet.

  • cppreference.com (C++03/11/14/17/…) (initiated by Nate Kohl) is a wiki that summarizes the basic core-language features and has extensive documentation of the C++ standard library. The documentation is very precise but is easier to read than the official standard document and provides better navigation due to its wiki nature. The project documents all versions of the C++ standard and the site allows filtering the display for a specific version. The project was presented by Nate Kohl at CppCon'14.


Classics / Older

Note: Some information contained within these books may not be up to date or may no longer be considered best practice.

  • The Design and Evolution of C++ (Bjarne Stroustrup) If you want to know why the language is the way it is, this book is where you find answers. This covers everything before the standardization of C++.

  • Ruminations on C++ (Andrew Koenig and Barbara Moo) [Review]

  • Advanced C++ Programming Styles and Idioms (James Coplien) A predecessor of the pattern movement, it describes many C++-specific “idioms”. It's certainly a very good book and might still be worth a read if you can spare the time, but quite old and not up-to-date with current C++.

  • Large Scale C++ Software Design (John Lakos) Lakos explains techniques to manage very big C++ software projects. Certainly a good read, if only it were up to date. It was written long before C++98, and misses many features (e.g. namespaces) important for large scale projects. If you need to work in a big C++ software project, you might want to read it, although you need to take more than a grain of salt with it. The first volume of a new edition is expected in 2015.

  • Inside the C++ Object Model (Stanley Lippman) If you want to know how virtual member functions are commonly implemented and how base objects are commonly laid out in memory in a multi-inheritance scenario, and how all this affects performance, this is where you will find thorough discussions of such topics.

  • The Annotated C++ Reference Manual (Bjarne Stroustrup, Margaret A. Ellis) This book is quite outdated in that it explores the 1989 C++ 2.0 version: templates, exceptions, namespaces and new casts had not yet been introduced. That said, this book goes through the entire C++ standard of the time, explaining the rationale, the possible implementations and the features of the language. This is not a book for learning programming principles and patterns in C++, but for understanding every aspect of the C++ language.
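As a small taste of the layout questions that the Inside the C++ Object Model entry above refers to, here is a minimal sketch that you can try on your own compiler. The types `Left`, `Right` and `Both` and the helper functions are hypothetical names made up for this illustration; they are not taken from any of the books. The offset it reports is implementation-specific, so treat it as an observation about one compiler rather than a rule of the language.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types, named only for this illustration.
struct Left  { virtual ~Left() = default;  int l = 1; };
struct Right { virtual ~Right() = default; int r = 2; };
struct Both : Left, Right { int b = 3; };

// Byte distance from the start of a Both object to its Right subobject.
// The value is implementation-specific; with two non-empty bases it is
// commonly nonzero, because both bases cannot start at the same address.
inline std::ptrdiff_t right_offset(const Both& obj) {
    const Right* as_right = &obj;  // implicit upcast; the compiler adjusts the address
    return reinterpret_cast<const char*>(as_right) -
           reinterpret_cast<const char*>(&obj);
}

// dynamic_cast<void*> applied to a pointer to a polymorphic base yields the
// address of the most-derived object, whatever the layout happens to be.
inline bool recovers_full_object(Both& obj) {
    Right* as_right = &obj;
    return dynamic_cast<void*>(as_right) == static_cast<void*>(&obj);
}

// Small self-check that relies only on behaviour the standard guarantees.
inline void demo() {
    Both both;
    assert(recovers_full_object(both));
}
```

Printing `right_offset(both)` for a local `Both both;` shows how far the second base sits from the start of the object on your implementation, which is the kind of detail the book discusses at length.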

If you could go back in time and tell yourself to read a specific book at the beginning of your career as a developer, which book would it be?

I expect this list to be varied and to cover a wide range of things.

To search: Use the search box in the upper-right corner. To search the answers of the current question, use inquestion:this. For example:

inquestion:this "Code Complete"

Applying UML and Patterns by Craig Larman.

The title of the book is slightly misleading; it does deal with UML and patterns, but it covers so much more. The subtitle of the book tells you a bit more: An Introduction to Object-Oriented Analysis and Design and Iterative Development.

Masters of Doom. As far as motivation and love for your profession go, it won't get any better than what's described in this book. A truly inspiring story!

Beginning C# 3.0: An Introduction to Object Oriented Programming

This is the book for those who want to understand the whys and hows of OOP using C# 3.0. You don't want to miss it.

Mastery: The Keys to Success and Long-Term Fulfillment, by George Leonard

It's about what mindsets are required to reach mastery in any skill, and why. It's just awesome, and an easy read too.

Pro Spring is a superb introduction to the world of Inversion of Control and Dependency Injection. If you're not aware of these practices and their implications - the balance of topics and technical detail in Pro Spring is excellent. It builds a great case and consequent personal foundation.

Another book I'd suggest would be Robert Martin's Agile Software Development (ASD). Code smells, agile techniques, test driven dev, principles ... a well-written balance of many different programming facets.

More traditional classics would include the infamous GoF Design Patterns, Bertrand Meyer's Object Oriented Software Construction, Booch's Object Oriented Analysis and Design, Scott Meyers's "Effective C++" series and a lesser known book I enjoyed by Gunderloy, Coder to Developer.

And while books are nice ... don't forget radio!

... let me add one more thing. If you haven't already discovered safari - take a look. It is more addictive than stack overflow :-) I've found that with my google type habits - I need the more expensive subscription so I can look at any book at any time - but I'd recommend the trial to anyone even remotely interested.

(ah yes, a little obj-C today, cocoa tomorrow, patterns? soa? what was that example in that cookbook? What did Steve say in the second edition? Should I buy this book? ... a subscription like this is great if you'd like some continuity and context to what you're googling ...)

Database System Concepts is one of the best books you can read on understanding good database design principles.

Algorithms in C++ was invaluable to me in learning Big O notation and the ins and outs of the various sort algorithms. This was published before Sedgewick decided he could make more money by dividing it into 5 different books.

C++ FAQs is an amazing book that really shows you what you should and shouldn't be doing in C++. The backward compatibility of C++ leaves a lot of landmines about and this book helps one carefully avoid them while at the same time being a good introduction into OO design and intent.

Here are two I haven't seen mentioned:
I wish I had read "Ruminations on C++" by Koenig and Moo much sooner. That was the book that made OO concepts really click for me.
And I recommend Michael Abrash's "Zen of Code Optimization" for anyone else planning on starting a programming career in the mid 90s.

Perfect Software: And Other Illusions about Testing

Perfect Software: And Other Illusions about Testing by Gerald M. Weinberg

ISBN-10: 0932633692

ISBN-13: 978-0932633699

Rapid Development by McConnell

The most influential programming book for me was Enough Rope to Shoot Yourself in the Foot by Allen Holub.

O, well, how long ago it was.

I have a few good books that strongly influenced me that I've not seen on this list so far:

The Psychology of Everyday Things by Donald Norman. The general principles of design for other people. This may seem to be mostly good for UI, but if you think about it, it has applications almost anywhere there is an interface that someone besides the original developer has to work with, e.g. an API, and designing the interface in such a way that other developers form the correct mental model and get appropriate feedback from the API itself.

The Art of Software Testing by Glenford Myers. A good, general introduction to testing software; good for programmers to read to help them think like a tester, i.e. think of what may go wrong and prepare for it.

By the way, I realize the question was the "Single Most Influential Book" but the discussion seems to have changed to listing good books for developers to read so I hope I can be forgiven for listing two good books rather than just one.

C++ How to Program. It is good for beginners. This is an excellent, comprehensive book of about 1500 pages.

Effective C++ and More Effective C++ by Scott Meyers.

Inside the C++ Object Model by Stanley Lippman

I bought this when I was a complete newbie, and it took me from only knowing that Java existed to being a reliable team member in a short time.

Not a programming book, but still a very important book every programmer should read:

Orbiting the Giant Hairball by Gordon MacKenzie

The Pragmatic Programmer was pretty good. However, one that really made an impact when I was starting out was:

Windows 95 System Programming Secrets

I know - it sounds and looks a bit cheesy on the outside and has probably dated a bit - but this was an awesome explanation of the internals of Win95 based on the author's (Matt Pietrek's) investigations using his own tools, the code for which came with the book. Bear in mind this was before the whole open source thing, and Microsoft was still pretty cagey about releasing documentation of internals, let alone source. There was some quote in there like "If you are working through some problem and hit some sticking point then you need to stop and really look deeply into that piece and really understand how it works". I've found this to be pretty good advice, particularly these days when you often have the source for a library and can go take a look. It also inspired me to enjoy diving into the internals of how systems work, something that has proven invaluable over the course of my career.

Oh, and I'd also throw in Essential .NET - a great explanation of .NET internals from Don Box.

I recently read Dreaming in Code and found it to be an interesting read. Perhaps more so since the day I started reading it Chandler 1.0 was released. Reading about the growing pains and mistakes of a project team of talented people trying to "change the world" gives you a lot to learn from. Also Scott brings up a lot of programmer lore and wisdom in between that's just an entertaining read.

Beautiful Code had one or two things that made me think differently, particularly the chapter on top down operator precedence.

K&R

@Juan: I know Juan, I know - but there are some things that can only be learned by actually getting down to the task at hand. Speaking in abstract ideals all day simply makes you into an academic. It's in the application of the abstract that we truly grok the reason for their existence. :P

@Keith: Great mention of "The Inmates are Running the Asylum" by Alan Cooper - an eye opener for certain, any developer that has worked with me since I read that book has heard me mention the ideas it espouses. +1

I found The Algorithm Design Manual to be a very beneficial read. I also highly recommend Programming Pearls.

This one isn't really a book for the beginning programmer, but if you're looking for SOA design books, then SOA in Practice: The Art of Distributed System Design is for you.

For me it was Design Patterns Explained. It provided an "Oh, that's how it works" moment for me with regard to design patterns and has been very useful when teaching design patterns to others.

Code Craft by Pete Goodliffe is a good read!

The first book that made a real impact on me was Mastering Turbo Assembler by Tom Swan.

Other books that have had an impact were Just For Fun by Linus Torvalds and David Diamond and of course The Pragmatic Programmer by Andrew Hunt and David Thomas.

In addition to other people's suggestions, I'd recommend either acquiring a copy of SICP, or reading it online. It's one of the few books that I've read that I feel greatly increased my skill in designing software, particularly in creating good abstraction layers.

A book that is not directly related to programming, but is also a good read for programmers (IMO) is Concrete Mathematics. Most, if not all of the topics in it are useful for programmers to know about, and it does a better job of explaining things than any other math book I've read to date.

For me "Memory as a Programming Concept in C and C++" really opened my eyes to how memory management really works. If you're a C or C++ developer I consider it a must read. You will definitely learn something or remember things you might have forgotten along the way.

http://www.amazon.com/Memory-Programming-Concept-C/dp/0521520436

Agile Software Development with Scrum by Ken Schwaber and Mike Beedle.

I used this book as the starting point to understanding Agile development.

Systemantics: How Systems Work and Especially How They Fail. Get it used cheap. But you might not get the humor until you've worked on a few failed projects.

The beauty of the book is the copyright year.

Probably the most profound takeaway "law" presented in the book:

The Fundamental Failure-Mode Theorem (F.F.T.): Complex systems usually operate in failure mode.

The idea being that there are failing parts in any given piece of software that are masked by failures in other parts or by validations in other parts. See a real-world example at the Therac-25 radiation machine, whose software flaws were masked by hardware failsafes. When the hardware failsafes were removed, the software race condition that had gone undetected all those years resulted in the machine killing 3 people.

It seems most people have already touched on some very good books. One which really helped me out was Effective C#: 50 Ways to Improve your C#. I'd be remiss if I didn't mention The Tao of Pooh. Philosophy books can be good for the soul, and the code.

Discrete Mathematics For Computer Scientists by J.K. Truss.

While this doesn't teach you programming, it teaches you fundamental mathematics that every programmer should know. You may remember this stuff from university, but really, doing predicate logic will improve your programming skills, and you need to learn set theory if you want to program using collections.

There really is a lot of interesting information in here that can get you thinking about problems in different ways. It's handy to have, just to pick up once in a while to learn something new.

I saw a review of Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools on a blog that also talked about XI-Factory. I read it, and I must say this book is a must read. Although not specifically targeted at programmers, it explains very clearly what is happening in the programming world right now with Model-Driven Architecture and so on.

Solid Code: Optimizing the Software Development Life Cycle

Although the book is only 300 pages and favors Microsoft technologies it still offers some good language agnostic tidbits.

Managing Gigabytes is an instant classic for thinking about the heavy lifting of information.

My vote is "How to Think Like a Computer Scientist: Learning With Python" It's available both as a book and as a free e-book.

It really helped me to understand the basics of not just Python but programming in general. Although it uses Python to demonstrate concepts, they apply to most, if not all, programming languages. Also: IT'S FREE!

Object-Oriented Programming in Turbo C++. Not super popular, but it was the one that got me started, and was the first book that really helped me grok what an object was. Read this one waaaay back in high school. It sort of brings a tear to my eye...

My high school math teacher lent me a copy of Are Your Lights On? How to Figure Out What the Problem Really Is, which I have re-read many times. It has been invaluable, as a developer, and in life generally.

I'm now reading Agile Software Development, Principles, Patterns and Practices. For those interested in XP and object-oriented design, this is a classic read.

Kernighan & Plauger's Elements of Programming Style. It illustrates the difference between gimmicky-clever and elegant-clever.

To get advanced in Prolog, I like these two books:

The Art of Prolog

The Craft of Prolog

They really open the mind to logic programming and recursion schemes.

Here's an excellent book that is not as widely applauded, but is full of deep insight: Agile Software Development: The Cooperative Game, by Alistair Cockburn.

What's so special about it? Well, clearly everyone has heard the term "Agile", and it seems most are believers these days. Whether you believe or not, though, there are some deep principles behind why the Agile movement exists. This book uncovers and articulates these principles in a precise, scientific way. Some of the principles are (btw, these are my words, not Alistair's):

  1. The hardest thing about team software development is getting everyone's brains to have the same understanding. We are building huge, elaborate, complex systems which are invisible in the tangible world. The better you are at getting more peoples' brains to share deeper understanding, the more effective your team will be at software development. This is the underlying reason that pair programming makes sense. Most people dismiss it (and I did too initially), but with this principle in mind I highly recommend that you give it another shot. You wind up with TWO people who deeply understand the subsystem you just built ... there aren't many other ways to get such a deep information transfer so quickly. It is like a Vulcan mind meld.
  2. You don't always need words to communicate deep understanding quickly. And a corollary: too many words, and you exceed the listener/reader's capacity, meaning the understanding transfer you're attempting does not happen. Consider that children learn how to speak language by being "immersed" and "absorbing". Not just language either ... he gives the example of some kids playing with trains on the floor. Along comes another kid who has never even SEEN a train before ... but by watching the other kids, he picks up the gist of the game and plays right along. This happens all the time between humans. This along with the corollary about too many words helps you see how misguided it was in the old "waterfall" days to try to write 700 page detailed requirements specifications.

There is so much more in there too. I'll shut up now, but I HIGHLY recommend this book!

The Back of the Napkin, by Dan Roam.

A great book about visual thinking techniques. There is also an expanded edition now. I can't speak to that version, as I do not own it yet.

Agile Software Development by Alistair Cockburn

Do users ever touch your code? If you're not doing solely back-end work, I recommend About Face: The Essentials of User Interface Design — now in its third edition (linked). I used to think my users were stupid because they didn't "get" my interfaces. I was, of course, wrong. About Face turned me around.

"Writing Solid Code: Microsoft's Techniques for Developing Bug-Free C Programs (Microsoft Programming Series)" by Steve Maguire.

Interesting what a large proportion of the books mentioned here are C/C++ books.

While not strictly a software development book, I would highly recommend that Don't Make Me Think! be considered in this list.

As so many people have listed Head First Design Patterns, which I agree is a very good book, I would like to know whether as many people are aware of a title called Design Patterns Explained: A New Perspective on Object-Oriented Design.

This title deals with design patterns excellently. The first half of the book is very accessible, and the remaining chapters require only a firm grasp of the content already covered. The reason I feel the second half of the book is less accessible is that it covers patterns that I, as a young developer admittedly lacking in experience, have not used much.

This title also introduces the concept behind design patterns, from Christopher Alexander's initial work in architecture to the GoF first documenting patterns in Smalltalk.

I think that anyone who enjoyed Head First Design Patterns but still finds the GoF very dry, should look into Design Patterns Explained as a much more readable (although not quite as comprehensive) alternative.

Even though I've never programmed a game, this book helped me understand a lot of things in a fun way.

How influential a book is often depends on the reader and where they were in their career when they read the book. I have to give a shout-out to Head First Design Patterns. Great book and the very creative way it's written should be used as an example for other tech book writers. I.e. it's written in order to facilitate learning and internalizing the concepts.

97 Things Every Programmer Should Know

This book pools together the collective experiences of some of the world's best programmers. It is a must read.

Extreme Programming Explained: Embrace Change by Kent Beck. While I don't advocate a hardcore XP-or-the-highway take on software development, I wish I had been introduced to the principles in this book much earlier in my career. Unit testing, refactoring, simplicity, continuous integration, cost/time/quality/scope - these changed the way I looked at development. Before Agile, it was all about the debugger and fear of change requests. After Agile, those demons did not loom as large.

One of my personal favorites is Hacker's Delight, because it was as much fun to read as it was educational.

I hope the second edition will be released soon!

You.Next(): Move Your Software Development Career to the Leadership Track ~ Michael C. Finley (Author), Honza Fedák (Author)

I've been around a while, so most books that I have found influential don't necessarily apply today. I do believe it is universally important to understand the platform that you are developing for (both hardware and OS). I also think it's important to learn from other people's mistakes. So two books I would recommend are:

Computing Calamities and In Search of Stupidity: Over Twenty Years of High Tech Marketing Disasters

Working Effectively with Legacy Code is a really amazing book that goes into great detail about how to properly unit test your code and what the true benefit of it is. It really opened my eyes.

C++ supports dynamic binding through the virtual mechanism. But as I understand it, the virtual mechanism is an implementation detail of the compiler, and the standard just specifies what should happen under specific scenarios. Most compilers implement the virtual mechanism through the virtual table and virtual pointer. And yes, I am aware of how this works, so my question is not about the implementation details of virtual pointers and tables. My questions are:

  1. Are there any compilers which implement the virtual mechanism in any way other than the virtual pointer and virtual table mechanism? As far as I have seen, most (read g++, Microsoft Visual Studio) implement it through the virtual table and pointer mechanism. So practically, are there any other compiler implementations at all?
  2. The sizeof of any class with just a virtual function will be the size of a pointer (the vptr inside this) on that compiler. So given that the virtual ptr and tbl mechanism itself is a compiler implementation detail, will the statement I made above always be true?

  1. I don't think there are any modern compilers with an approach other than vptr/vtable. Indeed, it would be hard to figure out something else that is not just plain inefficient.

    However, there is still a pretty large room for design tradeoffs within that approach. Maybe especially regarding how virtual inheritance is handled. So it makes sense to make this implementation-defined.

    If you are interested in this kind of stuff, I strongly suggest reading Inside the C++ Object Model.

  2. sizeof class depends on the compiler. If you want portable code, don't make any assumptions.

I am basically wondering how C++ lays out the object in memory. So, I hear that dynamic casts simply adjust the object's pointer in memory with an offset; and reinterpret kind of allows us to do anything with this pointer. I don't really understand this. Details would be appreciated!

If you really want to go in depth this is the best book: Inside C++ object model

Say I have a class, something like the following;

class MyClass
{
public:
  MyClass();
  int a,b,c;
  double x,y,z;
};

#define  PageSize 1000000

MyClass Array1[PageSize],Array2[PageSize];

If my class has not pointers or virtual methods, is it safe to use the following?

memcpy(Array1,Array2,PageSize*sizeof(MyClass));

The reason I ask is that I'm dealing with very large collections of paged data, as described here, where performance is critical, and memcpy offers significant performance advantages over iterative assignment. I suspect it should be ok, as the 'this' pointer is an implicit parameter rather than anything stored, but are there any other hidden nasties I should be aware of?

Edit:

As per sharptooth's comments, the data does not include any handles or similar reference information.

As per Paul R's comment, I've profiled the code, and avoiding the copy constructor is about 4.5 times faster in this case. Part of the reason here is that my templated array class is somewhat more complex than the simplistic example given, and calls a placement 'new' when allocating memory for types that don't allow shallow copying. This effectively means that the default constructor is called as well as the copy constructor.

Second edit

It is perhaps worth pointing out that I fully accept that use of memcpy in this way is bad practice and should be avoided in general cases. The specific case in which it is being used is as part of a high performance templated array class, which includes a parameter 'AllowShallowCopying', which will invoke memcpy rather than a copy constructor. This has big performance implications for operations such as removing an element near the start of an array, and paging data in and out of secondary storage. The better theoretical solution would be to convert the class to a simple structure, but given this involves a lot of refactoring of a large code base, avoiding it is not something I'm keen to do.

According to the Standard, if no copy constructor is provided by the programmer for a class, the compiler will synthesize a constructor which exhibits default memberwise initialization. (12.8.8) However, in 12.8.1, the Standard also says,

A class object can be copied in two ways, by initialization (12.1, 8.5), including for function argument passing (5.2.2) and for function value return (6.6.3), and by assignment (5.17). Conceptually, these two operations are implemented by a copy constructor (12.1) and copy assignment operator (13.5.3).

The operative word here is "conceptually," which, according to Lippman gives compiler designers an 'out' to actually doing memberwise initialization in "trivial" (12.8.6) implicitly defined copy constructors.

In practice, then, compilers have to synthesize copy constructors for these classes that exhibit behavior as if they were doing memberwise initialization. But if the class exhibits "Bitwise Copy Semantics" (Lippman, p. 43) then the compiler does not have to synthesize a copy constructor (which would result in a function call, possibly inlined) and can do a bitwise copy instead. This claim is apparently backed up in the ARM, but I haven't looked this up yet.

Using a compiler to validate that something is Standard-compliant is always a bad idea, but compiling your code and viewing the resulting assembly seems to verify that the compiler is not doing memberwise initialization in a synthesized copy constructor, but doing a memcpy instead:

#include <cstdlib>

class MyClass
{
public:
    MyClass(){};
  int a,b,c;
  double x,y,z;
};

int main()
{
    MyClass c;
    MyClass d = c;

    return 0;
}

The assembly generated for MyClass d = c; is:

000000013F441048  lea         rdi,[d] 
000000013F44104D  lea         rsi,[c] 
000000013F441052  mov         ecx,28h 
000000013F441057  rep movs    byte ptr [rdi],byte ptr [rsi] 

...where 28h is the sizeof(MyClass).

This was compiled under MSVC9 in Debug mode.

EDIT:

The long and the short of this post is that:

1) So long as doing a bitwise copy will exhibit the same side effects as memberwise copy would, the Standard allows trivial implicit copy constructors to do a memcpy instead of memberwise copies.

2) Some compilers actually do memcpys instead of synthesizing a trivial copy constructor which does memberwise copies.

Single inheritance is easy to implement. For example, in C, the inheritance can be simulated as:

struct Base { int a; };
struct Descendant { struct Base parent; int b; };

But with multiple inheritance, the compiler has to arrange multiple parents inside newly constructed class. How is it done?

The problem I see arising is: should the parents be arranged in AB or BA, or maybe even other way? And then, if I do a cast:

SecondBase * base = (SecondBase *) &object_with_base1_and_base2_parents;

The compiler must consider whether to alter or not the original pointer. Similar tricky things are required with virtuals.

With non-virtual inheritance this is less tricky than you might think - at the point where the cast is compiled, the compiler knows the exact layout of the derived class (after all, the compiler did the layout). Usually all that happens is a fixed offset (which may be zero for one of the base classes) is added/subtracted from the derived class pointer.

With virtual inheritance it is maybe a bit more complex - it may involve grabbing an offset from a vtbl (or similar).

Stan Lippman's book, "Inside the C++ Object Model" has very good descriptions of how this stuff might (and often actually does) work.

I am considering using virtual inheritance in a real-time application. Does using virtual inheritance have a performance impact similar to that of calling a virtual function? The objects in question would only be created at start up but I'm concerned if all functions from the hierarchy would be dispatched via a vtable or if only those from the virtual base class would be.

Common implementations will make access to data members of virtual base classes use an additional indirection.

As James points out in his comments, calling a member function of a base class in a multiple inheritance scenario will need adjustment of the this pointer, and if that base class is virtual, then the offset of the base class sub-object in the derived class's object depends on the dynamic type of the derived class and will need to be calculated at runtime.

Whether this has any visible performance impact on real-world applications depends on many things:

  • Do virtual bases have data members at all? Often, it's abstract base classes that need to be derived from virtually, and abstract bases that have any data members are often a code smell anyway.

  • Assuming you have virtual bases with data members, are those accessed in a critical path? If a user clicking on some button in a GUI results in a few dozen additional indirections, nobody will notice.

  • What would be the alternative if virtual bases are avoided? Not only might the design be inferior, it is also likely that the alternative design has a performance impact, too. It has to achieve the same goal, after all, and TANSTAAFL. Then you traded one performance loss for another plus an inferior design.


Additional note: Have a look at Stan Lippman's Inside the C++ Object Model, which answers such questions quite thoroughly.

Yesterday, my colleague and I weren't sure why the language forbids this conversion

struct A { int x; };
struct B : virtual A { };

int A::*p = &A::x;
int B::*pb = p;

Not even a cast helps. Why does the Standard not support converting a pointer to a member of a base class into a pointer to a member of a derived class when the base class is a virtual base class?

Relevant C++ standard reference:

A prvalue of type “pointer to member of B of type cv T”, where B is a class type, can be converted to a prvalue of type “pointer to member of D of type cv T”, where D is a derived class (Clause 10) of B. If B is an inaccessible (Clause 11), ambiguous (10.2), or virtual (10.1) base class of D, or a base class of a virtual base class of D, a program that necessitates this conversion is ill-formed.

Both function and data member pointers are affected.

Lippman's "Inside the C++ Object model" has a discussion about this:

[there] is the need to make the virtual base class location within each derived class object available at runtime. For example, in the following program fragment:

class X { public: int i; }; 
class A : public virtual X { public: int j; }; 
class B : public virtual X { public: double d; }; 
class C : public A, public B { public: int k; }; 
// cannot resolve location of pa->X::i at compile-time 
void foo( const A* pa ) { pa->i = 1024; } 

main() { 
 foo( new A ); 
 foo( new C ); 
 // ... 
} 

the compiler cannot fix the physical offset of X::i accessed through pa within foo(), since the actual type of pa can vary with each of foo()'s invocations. Rather, the compiler must transform the code doing the access so that the resolution of X::i can be delayed until runtime.

Essentially, the presence of a virtual base class invalidates bitwise copy semantics.

Here is an example of polymorphism from http://www.cplusplus.com/doc/tutorial/polymorphism.html (edited for readability):

// abstract base class
#include <iostream>
using namespace std;

class Polygon {
    protected:
        int width;
        int height;
    public:
        void set_values(int a, int b) { width = a; height = b; }
        virtual int area(void) =0;
};

class Rectangle: public Polygon {
    public:
        int area(void) { return width * height; }
};

class Triangle: public Polygon {
    public:
        int area(void) { return width * height / 2; }
};

int main () {
    Rectangle rect;
    Triangle trgl;
    Polygon * ppoly1 = &rect;
    Polygon * ppoly2 = &trgl;
    ppoly1->set_values (4,5);
    ppoly2->set_values (4,5);
    cout << ppoly1->area() << endl; // outputs 20
    cout << ppoly2->area() << endl; // outputs 10
    return 0;
}

My question is how does the compiler know that ppoly1 is a Rectangle and that ppoly2 is a Triangle, so that it can call the correct area() function? It could find that out by looking at the "Polygon * ppoly1 = &rect;" line and knowing that rect is a Rectangle, but that wouldn't work in all cases, would it? What if you did something like this?

cout << ((Polygon *)0x12345678)->area() << endl;

Assuming that you're allowed to access that random area of memory.

I would test this out but I can't on the computer I'm on at the moment.

(I hope I'm not missing something obvious...)

Chris Jester-Young gives the basic answer to this question.

Wikipedia has a more in depth treatment.

If you want to know the full details for how this type of thing works (and for all type of inheritance, including multiple and virtual inheritance), one of the best resources is Stan Lippman's "Inside the C++ Object Model".

Besides the normal explanation of being visible or not to derived classes, is there any other difference?

If you make it more visible, does it take more or less memory, does it slow things down or...?

Apart from the accessibility of members outside or to the derived classes, access specifiers might affect the object layout.

Quoting from my other answer:

Usually, the memory addresses of data members increase in the order they're defined in the class. But this order may be disrupted at any place where the access-specifiers (private, protected, public) are encountered. This has been discussed in great detail in Inside the C++ Object Model by Lippman.

An excerpt from C/C++ Users Journal,

The compiler isn't allowed to do this rearrangement itself, though. The standard requires that all data that's in the same public:, protected:, or private: must be laid out in that order by the compiler. If you intersperse your data with access specifiers, though, the compiler is allowed to rearrange the access-specifier-delimited blocks of data to improve the layout, which is why some people like putting an access specifier in front of every data member.

Interesting, isn't it?

how are virtual tables stored in memory? their layout?

e.g.

class A{
    public:
         virtual void doSomeWork();
};

class B : public A{
    public:
         virtual void doSomeWork();
};

How will be the layout of virtual tables of class A and class B in memory?

As others already wrote, there is no general approach. (Heck, nobody even mandates that virtual tables are used at all.)

However, I believe they are most likely implemented as a hidden pointer at a certain offset in the object which references a table of function pointers. Certain virtual functions' addresses occupy certain offsets in that table. Usually there's also a pointer to the dynamic type's std::type_info object.

If you're interested in things like this, read Lippman's "Inside the C++ Object Model". However, unless your interest is academic (or you're trying to write a C++ compiler -- but then you shouldn't need to ask), you shouldn't bother. It's an implementation detail you don't need to know and should never rely on.

I want to know the in-memory representation of .NET constructs such as "interface", "class", "struct", etc. There's an excellent book on the C++ object model, Inside the C++ Object Model by Stanley Lippman. I want a similar book for .NET and C#.

I have read some books about .NET, but they are mostly about the logical usage of .NET. None of them talks about the physical in-memory layout info. I think it's necessary to know at least one implementation of .NET.

I have read "Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects". Could someone provide some hints about more in-depth books and articles?

If this info is not publicly available, a shared-source implementation like Mono or Shared Source CLI could be an option.

Many thanks.

It is almost certainly deliberate on Microsoft's part that this information is not easily available.

Microsoft created the .NET Framework and the CLR so you do not have to (unduly) worry yourself about where/how your objects are stored in memory (implementation details). This "ignorance" is actually one of the biggest benefits of using .NET; you do not need to worry about issues such as manual memory allocation, processor/memory models etc.

The other benefit of this is that it improves security, ie it makes writing malicious code that much harder, although not impossible of course.

CLR Via C# by Jeff Richter is probably the best current book for "under the hood" type .NET information. Chapters 4, 5, 20 and 21 would probably be of most interest regarding layout of .NET types, although, as explained above, you will not find the same level of detail as the C++ object model.

I have following code as below

class A {
  private:
    int i;
 };
class B:public A {
 private:
  int j;
 };

When I take the sizeof a derived class object, I see the size of the base class plus the size of the derived class. But as per inheritance behaviour, the private members of the base class are not inherited by the derived class, right? Why does the size of the derived class include both classes?

You either misunderstand sizeof or you misunderstand the layout (in memory) of C++ objects.

For performance reasons (to avoid the cost of indirection), compilers will often implement Derivation using Composition:

// A
+---+
| i |
+---+

// B
+---+---+
| A | j |
+---+---+

Note that if private, B cannot peek in A even though it contains it.

The sizeof operator will then return the size of B, including the necessary padding (for alignment correction) if any.

If you want to learn more, I heartily recommend Inside the C++ Object Model by Stanley B. Lippman. While compiler dependent, many compilers do in fact use the same basic principles.

I have two questions to ask...

a)

class A{
int a;
public:
virtual void f(){}
};

class B {
int b;
public:
virtual void f1(){}
};

class C: public A, public B {

int c;
public:
virtual void f(){} // Virtual is optional here
virtual void f1(){} // Virtual is optional here

virtual void f2(){}

};

class D: public C {
int d;
public:
void f2(){}

};

Now C++ says that there won't be 3 virtual pointers in C's instance but only 2. And then, how could a call to say,

C* c = new D();

c->f2(); // Since there is no virtual pointer corresponding to the virtual function defined in f2(). How is the late binding done ?..

I read that the virtual pointer to this function is added in the virtual pointer of the first super class of C. Why is that so ?.. Why is there no virtual table ?...

sizeof(*c); // It would be 24 and not 28.. Why ?...

Also say, considering the above code, i do this ,

void (C::*a)() = &C::f;
void (C::*b)() = &C::f1;

printf("%u", a); 
printf("%u",b);

// Both the above printf() statements print the same address. Why is that so ?...
// Now consider this,

C* c1 = new C();

(c1->*a)();

(c1->*b)();

// In spite of a and b having the same address, the function invoked is different. How is the definition of the function bound here ?...

Hope I get a reply soon.

A good read if you would like to understand what is going on under the hood: Inside the C++ Object Model, by Stanley Lippman. The content starts to show its age, but it provides a comprehensive presentation of some techniques that were (and sometimes still are) used to implement C++ features such as inheritance, polymorphism, templates, etc.

Now, to answer your question: first of all, you should know that the way a vendor must implement a given feature is usually not specified by the C++ standard. This is the case here: an implementation is not required to use virtual method tables at all (even though they often do).

That being said, we can still try to guess what is happening here. First, let's see what the memory would look like if we created an A instance:

A someA;
    ________________               ----------------                  
    | @A_vtable    | vptr -------->|     @A::f    |                   
    ________________               ----------------                  
    | [some value] | a             A_vtable
    ________________
    someA

You can see that an instance of A contains a virtual table pointer (vptr) in addition to its member variable. This vptr points to A's virtual table, which contains the address of A's implementation of f.

An instance of B should be quite similar, so I won't bother drawing one. Let's see now what would a C instance look like:

C someC;
    ________________         ------->----------------                  
    | @C_A_vtable  | A_vptr /        |     @C::f    |                   
    ________________                 ----------------                  
    | [some value] | a               |     @C::f2   |
    ----------------                 ---------------- 
    | @C_B_vtable  | B_vptr \         C_A_vtable
    ________________         \         
    | [some value] | b        \
    ________________           \      
    someC                       ---->----------------
                                     |     @C::f1   |
                                     ----------------
                                     C_B_vtable

You can see that someC contains an A part and a B part, both containing a vptr. This way, we can cast a C into an A or a B simply by using an offset into the class. Now, regarding the method added by C, you'll notice that I placed its address at the end of the existing vtable for A: instead of creating an entirely new table which would require an additional vptr, I simply extended the existing one. A call to f2 will simply fetch the right address in the table pointed to by A_vptr, and call it, in a way completely similar to the other virtual methods.

D's instances just need to set their two vptrs to point to the correct tables (one containing the address of C::f (since f is not overridden) and D::f2, and the other one containing the address of C::f1).

I'm reading the book Inside the C++ Object Model. In the book there's an example like:

struct Base1
{
    int v1;
};

struct Base2
{
    int v2;
};

class Derived : public Base1, public Base2 {};

printf("&Derived::v1 = %p\n", &Derived::v1);        // Print 0 in VS2008/VS2012
printf("&Derived::v2 = %p\n", &Derived::v2);        // Print 0 in VS2008/VS2012

In the previous code, the printed addresses of Derived::v1 and Derived::v2 are both 0. However, if I print the same addresses via a variable:

int Derived::*p;
p = &Derived::v1;
printf("p = %p (&Derived::v1)\n", p);        // Print 0 in VS2008/VS2012 as before
p = &Derived::v2;
printf("p = %p (&Derived::v2)\n", p);        // Print 4 in VS2008/VS2012

By examining the size of &Derived::v1 and p, I get 4 in both.

// Both are 4
printf("Size of (&Derived::v1) is %d\n", sizeof(&Derived::v1));
printf("Size of p is %d\n", sizeof(p));

The address of Derived::v1 will be 0, but the address of Derived::v2 will be 4. I don't understand why &Derived::v2 became 4 when I assign it to a variable.

Examining the assembly code, when I query the address of Derived::v2 directly it is translated to 0, but when I assign it to a variable it is translated to 4.

I tested it on both VS2008 & VS2012, and the result is the same. So I think there must be some reason that made Microsoft choose such a design.

And, if you do like this:

d1.*(&Derived::v2) = 1;

Apparently &Derived::v2 is not 0. Why does the compiler distinguish these two cases?

Can anyone please explain what happens behind the scenes? Thank you!

--Edit--

For those who think &Derived::v1 doesn't yield a valid address: haven't you ever done this?

Derived d1, d2;
d1.*p = 1;
d2.*p = 1;

The poster asked me about this, and at first I also suspected similar wrong causes. This is not specific to VC++.

It turns out that what's happening is that the type of &Derived::v2 is not int Derived::*, but int Base2::*, which naturally does have an offset of zero because it's the offset with respect to Base2. When you explicitly convert it to an int Derived::*, the offset is corrected.

Try this code on VC++ or GCC or Clang... I'm sticking with stdio/printf as the poster was using.

struct Base1 { int a; };
struct Base2 { int b; };
struct Derived : Base1, Base2 { };

#include <cassert>
#include <cstdio>
#include <typeinfo>
using namespace std;

int main () {

   printf( "%s\n", typeid(&Derived::a).name() );  // mentions Base1
   printf( "%s\n", typeid(&Derived::b).name() );  // mentions Base2

   int Derived::* pdi = &Derived::b;  // OK
   int Base2::*   p2i = &Derived::b;  // OK
   //int Base1::* p1i = &Derived::b;  // ERROR

   assert( sizeof(int*) == sizeof(pdi) );
   printf( "%p %p", p2i, pdi );  // prints "(nil) 0x4" using GCC 4.8 at liveworkspace.org

}

In C++ I presume the C++ standard has nothing to do with how data members are arranged within a class, in terms of memory layout? Would I be right in thinking this is down to the compiler in question?

I'm very interested in learning how objects and other C++ entities (structs etc) are represented in physical memory (I know things like lists are node to node and arrays are contiguous memory, but I mean all the other aspects of the language).

EDIT: Would learning x86 assembler help with this and understanding C++ better?

Yes, the standard doesn't say how objects are to be represented in memory. To get an idea of how C++ objects are normally represented, read this book: Inside the C++ Object Model.

I was reading the Wikipedia article on virtual inheritance. I followed the whole article but I could not really follow the last paragraph

This is implemented by providing Mammal and WingedAnimal with a vtable pointer (or "vpointer") since, e.g., the memory offset between the beginning of a Mammal and of its Animal part is unknown until runtime. Thus Bat becomes (vpointer,Mammal,vpointer,WingedAnimal,Bat,Animal). There are two vtable pointers, one per inheritance hierarchy that virtually inherits Animal. In this example, one for Mammal and one for WingedAnimal. The object size has therefore increased by two pointers, but now there is only one Animal and no ambiguity. All objects of type Bat will have the same vpointers, but each Bat object will contain its own unique Animal object. If another class inherits from Mammal, such as Squirrel, then the vpointer in the Mammal object in a Squirrel will be different from the vpointer in the Mammal object in a Bat, although they can still be essentially the same in the special case that the Squirrel part of the object has the same size as the Bat part, because then the distance from the Mammal to the Animal part is the same. The vtables are not really the same, but all essential information in them (the distance) is.

Can someone please shed some more light on this.

As mkluwe suggested, vpointers are not really a part of the language. However, knowing about implementation techniques might be useful, especially in a low-level language like C++.

If you really want to learn this, I would recommend Inside the C++ Object Model, which explains this and a lot of other things in detail.

Respected Sir!

i should tell you that what i know and what i don't know about the asked question so that you can address the weak area of my understanding.

I know that C++ implements polymorphism by using the vtable, which is an array of pointers where each pointer points to a virtual function of the class; each class in the hierarchy has a vtable. Now suppose I have the following class

#include <cstring>
#include <iostream>
using namespace std;

class person
{
    char name[20];
public:
    person(const char* pname)
    {
        strcpy(name,pname);
    }

    virtual void show()
    {
        cout<<"inside person show method, Name: "<<name;
    }
};

class teacher:public person
{
     int scale;

public:  // the constructor must be accessible from main
     teacher(const char* pname, int s):person(pname)
     {
         scale=s;
     }

     void show()
     {
         cout<<"inside the teacher show method, Scale: "<<scale;
     }
};

now suppose i write in main program

person *ptr;
ptr=new teacher("Zia",16);
ptr->show();

Now I am confused at this point: the call will go to the show function of the base class, and as it is a virtual function it in turn calls the appropriate function. I know I am wrong here. I am confused about what the sequence of calls would be. What is the role of the vtable and how does it work? Please elaborate.

I think you should take a look at Stanley B. Lippman's book "Inside the C++ Object Model".

Let's look at the internal representation of your classes:

Virtual Table for person and teacher

|---------------|  +---> |------------------------|
|  name         |  |     | "type_info" for person |
|---------------|  |     |------------------------|
|__vptr__person |--+     | "person::~person"      |
|---------------|        |------------------------|
person p;                | "person::show"         |
                         |------------------------|

|----------------|  +---> |-------------------------|
|person subobject|  |     | "type_info" for teacher |
|----------------|  |     |-------------------------|
|__vptr__teacher |--+     | "teacher::~teacher"     |
|----------------|        |-------------------------|
teacher t;                | "teacher::show"         |
                          |-------------------------|               

In general, we don't know the exact type of the object ptr addresses at each invocation of show(). We do know, however, that through ptr we can access the virtual table associated with the object's class.

Although we don't know which instance of show() to invoke, we know that each instance's address is contained in slot 2.

This information allows the compiler to internally transform the call into

( *ptr->vptr[ 2 ] )( ptr ); 

In this transformation, vptr represents the internally generated virtual table pointer inserted within each class object and 2 represents show()'s assigned slot within the virtual table associated with the person hierarchy. The only thing that has to happen at runtime is loading the vptr stored in the object that ptr addresses and calling through that slot; the dynamic type never has to be computed via RTTI.

I am a VC++ developer but I spend most of my time learning C++. What are all the things I should know as a VC developer?

I don't understand why people here post things about WinAPI, .NET, MFC and ATL.

You really must know the language. Another benefit would be the cross-platform libraries. C++ is not about GUI or Win32 programming. You can write multi-platform applications with libraries like Boost, Qt, wxWidgets (and maybe some XML parser libraries).

Visual C++ is a great IDE for developing C++ applications, and Microsoft is trying hard to make Visual C++ more standard-conforming. Learning the standard language without dialects (the MS dialect included) will give you the advantage of a rapid development environment combined with multi-platform portability. There are many abstraction libraries out there which work equally well on Windows, Linux, Unix or Mac OS. The debugger is a great tool in VC++, but not the first thing to start with. Try to write unit tests for your application. They will ensure on later modifications that you did not break other parts of the tested (or maybe debugged :) code.

Do not try to learn MFC or ATL from scratch; try to understand the STL first. MFC is old, and newer versions are more or less a wrapper around ATL. ATL is a somewhat strange library which tries to marry STL idioms (and sometimes the STL itself) with WinAPI. But using ATL concepts without knowing what is behind them will make you unproductive as well. Some ATL idioms are very questionable and might be replaced by better ones from Boost or similar libraries.

The most important things to learn are the language philosophy and concepts. I suggest you dive into the language and read some serious books:

Once you get here you will be a very advanced C++ developer. The next books will make a guru out of you:

Remember one important rule: if you have a question, try to find an answer to it in the ISO C++ Standard (i.e. the Standard document) first. Doing so, you will come across many other similar things, which will make you think about the language design.

I hope that book list helps you. You will see concepts from these books in all well-designed modern C++ frameworks.

With Kind Regards,
Ovanes

I keep hearing this statement: switch..case is evil for code maintenance, but it provides better performance (since the compiler can inline things, etc.). Virtual functions are very good for code maintenance, but they incur a performance penalty of two pointer indirections.

Say I have a base class with two subclasses (X and Y) and one virtual function, so there will be two virtual tables. The object has a pointer, based on which it will choose a virtual table. So for the compiler, it is more like

switch( object's function ptr )
{

   case 0x....:

       X->call();

       break;

   case 0x....:

       Y->call();
};

So why should a virtual function cost more if it can be implemented this way, since the compiler can do the same inlining and other optimizations here? Or can you explain why it was decided not to implement virtual function calls this way?

Thanks, Gokul.

Your statement about branching when calling a virtual function is wrong. There is no such thing in the generated code. Taking a look at the assembly code will give you a better idea.

In a nutshell, one general simplified implementation of C++ virtual functions is this: each class has a virtual table (vtbl), and each instance of the class has a virtual table pointer (vptr). The virtual table is basically a list of function pointers.

When you are calling a virtual function, say it is like:

class Base {
public:
    virtual void someVirtualFunction() {}
};
class Derived : public Base {};
Base* pB = new Derived();
pB->someVirtualFunction();

The 'someVirtualFunction()' will have a corresponding index in the vtbl. And the call

pB->someVirtualFunction(); 

will be converted to something like:

pB->vptr[k](pB); // k is the index of 'someVirtualFunction'; pB is passed as the hidden this argument

In this way the function is actually called indirectly and it has the polymorphism.

I suggest you read 'Inside the C++ Object Model' by Stanley Lippman.

Also, the statement that a virtual function call is slower than a switch-case is not accurate. It depends. As you can see above, a virtual function call costs just one extra dereference compared to a regular function call. With switch-case branching you have extra comparison logic (which introduces the chance of cache misses for the CPU), and that also consumes CPU cycles. I would say that in most cases, if not all, a virtual function call should be faster than a switch-case.

In c++, does inheritance occur at run time or compile time?

Examples?

In C++, inheritance in itself (without polymorphism) is a compile-time feature. In the compiled code, there will be little or no difference between

struct foo : bar {};

and

struct foo { bar b; };

Except for offsets to access their members, there will not be any "knowledge" about how bar relates to foo in the compiled binary.


This is different, however, when you add polymorphism (i.e., virtual functions, allowing dynamic_cast<>()) to the picture. It allows late binding: which exact function will be called is decided at runtime. Of course, this requires data structures to perform this (usually employing so-called virtual tables), and those data structures are accessed at runtime in order to determine which function to call.

Also, virtual base classes require runtime support in order to access them within derived objects.


If you are interested in the runtime costs of certain C++ features, you might want to try to get hold of a copy of Inside the C++ Object Model by Stanley Lippman. It's an old book, but if you want to know how virtual member functions are commonly implemented and how base objects are commonly laid out in memory in a multi-inheritance scenario, and how all this affects performance, this is where you will find thorough discussions of such topics.

I am trying to learn object-oriented concepts by studying a real-world example in C++. This example should illustrate all the concepts, like inheritance, encapsulation, overloading, polymorphism etc.

Judging by a quite popular post here on SO, there are a lot of solutions to your problem; probably the best one is this book.

I would also like to recommend another path: pick one of Obj-C and Java, if you can deviate a little from the original focus.

The reason why I'm suggesting this is the long existence of Java, combined with its great popularity in both production and education, and the fact that Obj-C is an object-oriented language that "exposes" the way these mechanisms work: there are a lot of pointers and really simple but powerful concepts that can help you understand this.

It's also possible to use Obj-C on platforms that are not Mac OS based, but you should use clang, not gcc; gcc is a little bit behind on Obj-C support, at least in my experience.

There is also the usual list of free resources with a lot of goodies, which always helps.

#include<cstdio>
#include<iostream>
using namespace std;
class A
{
public:
    int x;
};
class B: public A
{
};
int main()
{
    B b;
    b.x=5;
    cout<<b.x<<endl;

    return 0;
}

I have the above code, and it works fine. But I want to know: when I inherit class B from class A, is the member variable x declared in class B too, just as in A, or does class B just get access to the member variable x of class A?
Are there two variables with the same name in two different classes, or is there only one variable that objects of both classes have access to?
If there are two different variables with the same name in two different classes, then why is the constructor of the base class called when an object of the derived class is declared?

When you create an object of the derived class, a base class sub-object is embedded in the memory layout of the derived class object. So, to answer your question, there is only one variable, and it is part of the derived object. Since we are only talking about non-static members here, each derived object gets its own base-class sub-object laid out in memory. When you create a base class object, it is a different piece of memory representing a different object, and it has nothing to do with the derived object created earlier.

Hope it clarifies your doubt!

This is a great book to understand C++ object model:

http://www.amazon.com/Inside-Object-Model-Stanley-Lippman/dp/0201834545/ref=sr_1_1?ie=UTF8&qid=1412535828&sr=8-1&keywords=inside+c%2B%2B+object+model

I want to know how class objects (not instances, but the classes themselves) are stored in memory.

class A {
public:
    int a;
    virtual void f();
    virtual ~A();
};

class B : public A {
public:
    int b;
    void f() final override;
};

I know, that usually (not strongly described by standard) in case of this inheritance (B derived from A) we have:

memory: ....AB...

where AB is a class object of B (if I understand it correctly). If we go deeper (tried with clang and gcc), we can see something like (again, not strongly described in standard):

A
    vtptr*
    int a
B
    vtptr*
    int b

Okay, now we see where the a and b members are stored. And we also see the pointer to the virtual method table. But where is the virtual method table that vtptr* points to actually stored? Why not near with the classes? Or is it?

Also, here is another question: I was able to change virtual method tables by changing the pointers (simple logic). Can I also safely change a pointer to its methods?

P.S. In your answers you may answer for gcc and clang. P.P.S. If I am wrong somewhere, please point that out too in your answers.

The C++ standard does not prescribe how the virtual function mechanism should be implemented. In practice all C++ implementations use a virtual function table per class, and a virtual function table pointer in each object of a class with virtual functions (called a polymorphic class). Yet the details can differ, in particular for multiple inheritance and virtual inheritance.

You can read about the common choices in Stanley Lippman's classic book Inside The C++ Object Model.

It doesn't make much sense to ask “where” a virtual function table is stored. It's much like any static variable: its location depends on the implementation and is pretty much arbitrary. And regarding

Why not near with the classes?

… classes as such are not stored anywhere, they are not objects, so this doesn't make sense, sorry.

You can ask more meaningfully where is the vtable pointer stored in each object, for a given implementation?

And usually that's at the start of the object, but if you derive from a non-polymorphic class, and add a virtual function, then you might get the vtable pointer somewhere else. Or not. The latter possibility is much of the reason why a static_cast of Derived* to Base* (or vice versa) can do an address adjustment, i.e. is different from a simple reinterpret_cast.

class base {
public:
  virtual void fn(){}
};


class der: public base {
public:
  void fn(){}
};

der d;

base *b = &d;
b->fn();

When the compiler encounters the statement b->fn(), the following information is available to the compiler:

  1. b is a pointer to the class base,
  2. the base class has a virtual function as well as a vptr.

My question is: how does the vptr of class der come into picture at run time?

The Holy Standard does not require a vptr or a vptr table. However in practice that’s the only way this is implemented.

So here’s pseudo-code for what happens:

  1. a_base_compatible_vtable_ptr = b->__vtable_ptr__
  2. a_func_ptr = a_base_compatible_vtable_ptr[INDEX_FOR_fn]
  3. a_func_ptr( b )

A main insight is that for an object of the der class the vtable pointer in the object will point to the der class’ vtable, which is compatible with the base class’ vtable, but contains pointers pointing to the der class’ function implementations.

Thus, the der implementation of the function is called.

In practice, passing the this pointer in point (3) is typically optimized specially, by passing it in a dedicated processor register instead of on the machine stack.

For a more in-depth discussion see the literature on the C++ object model, e.g. Stanley Lippman’s book Inside the C++ Object Model.

Cheers & hth.,

When we declare an object of a class, is its memory layout successive (one member after the other)? If it is successive, does padding occur in it (like structure padding)? Please help me out with the concepts of memory layout for a class.

Thanks in advance.

When we declare an object of a class, is its memory allocation successive (one after the other)?

The Standard doesn't give any such guarantee. Object memory layout is implementation-defined.

Usually, the memory addresses of data members increase in the order they're defined in the class. But this order may be disrupted at any place where an access specifier (private, protected, public) is encountered. This is discussed in great detail in Inside the C++ Object Model by Lippman.

An excerpt from C/C++ Users Journal,

The compiler isn't allowed to do this rearrangement itself, though. The standard requires that all data that's in the same public:, protected:, or private: must be laid out in that order by the compiler. If you intersperse your data with access specifiers, though, the compiler is allowed to rearrange the access-specifier-delimited blocks of data to improve the layout, which is why some people like putting an access specifier in front of every data member.

Interesting, isn't it?

Where can I find a good explanation of C++ stateful virtual base?

I looked in the JSF C++ Coding Standard and read their explanation, but was looking for some additional information.

Thank you for any additional details provided.

Stanley Lippman's book "Inside the C++ Object Model" has a great run-through of the subject (though dated, but still valid).

#include <cstdio>

struct Base1 {
    int value1;
    Base1() : value1(1) {}
};

struct Base2 {
    int value2;
    Base2() : value2(2) {}
};

struct Derived : public Base1, public Base2 {};

void func(int Derived::*pmf, Derived *d)
{
    printf("%d\n", d->*pmf);
}

void func2()
{
    Derived d;
    int Base2::*b  = &Base2::value2;
    func(b, &d);
}

int main()
{
    func2();
}

Output is 2

Hello everyone. I am reading the pointers-to-data-members chapter of the book Inside the C++ Object Model. The code above is given there: a pointer to a data member of the second base class is passed to a function that expects a pointer to a member of the derived class. I don't really understand what happens behind the scenes. Such a conversion does not make much sense to me, especially if you think in terms of pointers to classes, where such a thing is not allowed unless an explicit cast is used, although the book says that the compiler will adjust the pointer passed to the function. So my two questions are:

1) what happens behind the scenes in this situation.

2) If it is just that the compiler adjusts the pointer, is it always the case that compiler knows types of the pointers, and there is no way when pointer type is not known until runtime?

Update: Fixed the code and added initial values to value1 and value2. After the call of a func2() the output is '2', which is the Base2::value2, so apparently compiler did adjust the pointer.

Behind the scenes, the compiler stores a pointer to data member as an integer offset into the class object. (I'm ignoring virtual base classes for the sake of simplicity.)

So the line int Base2::*b = &Base2::value2; looks under the hood like initializing a number to zero, since value2 is at the start of Base2.

When you pass b to func, the compiler implicitly converts from the type int Base2::* to the type int Derived::*. This is allowed and safe because anything that is a member of Base2 is also a member of Derived. But the Base2 subobject of Derived is (probably) not at the beginning of Derived, so the offset needs to be changed when doing this conversion. If we suppose that the Base2 subobject begins 4 bytes into a Derived object, the behavior under the hood would look like adding 4 to b when it gets passed to func.

All func does, then, is add the offset it received to the address represented by the pointer d to find the int member being dereferenced.

If it is just that the compiler adjusts the pointer, is it always the case that compiler knows types of the pointers, and there is no way when pointer type is not known until runtime?

There is no such thing as a pointer type that is not known until runtime. Every variable has a type known at compile time. (Polymorphism doesn't directly come into play with pointers to data members.)

Possible Duplicate:
How are objects stored in memory in C++?

Hello All,

I am improving the question a bit. I was asked a C++ question about how C++ objects are stored in memory when an object is created in stack memory and in dynamic memory.

When an object is created on the heap, what happens to the member variables of the class, e.g. POD data types? Will their memory be destroyed after the object is deleted, or is there any different behavior?

Please provide your comments.

Thanks

This book will help - "Inside the C++ Object Model"

http://www.amazon.com/Inside-Object-Model-Stanley-Lippman/dp/0201834545