Henry S. Warren

Mentioned 36

" "This is the first book that promises to tell the deep, dark secrets of computer arithmetic, and it delivers in spades. It contains every trick I knew plus many, many more. A godsend for library developers, compiler writers, and lovers of elegant hacks, it deserves a spot on your shelf right next to Knuth.""--Josh Bloch" "When I first saw the title, I figured that the book must be either a cookbook for breaking into computers (unlikely) or some sort of compendium of little programming tricks. It's the latter, but it's thorough, almost encyclopedic, in its coverage." "--Guy Steele These are the timesaving techniques relished by computer hackers--those devoted and persistent code developers who seek elegant and efficient ways to build better software. The truth is that much of the computer programmer's job involves a healthy mix of arithmetic and logic. In "Hacker's Delight," veteran programmer Hank Warren shares the tricks he has collected from his considerable experience in the worlds of application and system programming. Most of these techniques are eminently practical, but a few are included just because they are interesting and unexpected. The resulting work is an irresistible collection that will help even the most seasoned programmers better their craft. Topics covered include: A broad collection of useful programming tricks Small algorithms for common tasksPower-of-2 boundaries and bounds checkingRearranging bits and bytesInteger division and division by constantsSome elementary functions on integersGray codeHilbert's space-filling curveAnd even formulas for prime numbers! This book is for anyone who wants to create efficient code. "Hacker's Delight" will help you learn to program at a higher level--well beyond what is generally taught in schools and training courses--and will advance you substantially further than is possible through ordinary self-study alone. 0201914654B06272002

Mentioned in questions and answers.

If you could go back in time and tell yourself to read a specific book at the beginning of your career as a developer, which book would it be?

I expect this list to be varied and to cover a wide range of things.

**To search:** Use the search box in the upper-right corner. To search the answers of the current question, use `inquestion:this`

. For example:

```
inquestion:this "Code Complete"
```

Applying UML and Patterns by Craig Larman.

The title of the book is slightly misleading; it does deal with UML and patterns, but it covers so much more. The subtitle of the book tells you a bit more: An Introduction to Object-Oriented Analysis and Design and Iterative Development.

Masters of doom. As far as motivation and love for your profession go: it won't get any better than what's been described in this book, truthfully inspiring story!

Beginning C# 3.0: An Introduction to Object Oriented Programming

This is the book for those who want to understand the whys and hows of OOP using C# 3.0. You don't want to miss it.

Mastery: The Keys to Success and Long-Term Fulfillment, by George Leonard

It's about about what mindsets are required to reach mastery in any skill, and why. It's just awesome, and an easy read too.

Adding to the great ones mentioned above:

Pro Spring is a superb introduction to the world of Inversion of Control and Dependency Injection. If you're not aware of these practices and their implications - the balance of topics and technical detail in Pro Spring is excellent. It builds a great case and consequent personal foundation.

Another book I'd suggest would be Robert Martin's Agile Software Development (ASD). Code smells, agile techniques, test driven dev, principles ... a well-written balance of many different programming facets.

More traditional classics would include the infamous GoF Design Patterns, Bertrand Meyer's Object Oriented Software Construction, Booch's Object Oriented Analysis and Design, Scott Meyer's "Effective C++'" series and a lesser known book I enjoyed by Gunderloy, Coder to Developer.

And while books are nice ... don't forget radio!

... let me add one more thing. If you haven't already discovered safari - take a look. It is more addictive than stack overflow :-) I've found that with my google type habits - I need the more expensive subscription so I can look at any book at any time - but I'd recommend the trial to anyone even remotely interested.

(ah yes, a little obj-C today, cocoa tomorrow, patterns? soa? what was that example in that cookbook? What did Steve say in the second edition? Should I buy this book? ... a subscription like this is great if you'd like some continuity and context to what you're googling ...)

Database System Concepts is one of the best books you can read on understanding good database design principles.

Algorithms in C++ was invaluable to me in learning Big O notation and the ins and outs of the various sort algorithms. This was published before Sedgewick decided he could make more money by dividing it into 5 different books.

C++ FAQs is an amazing book that really shows you what you should and shouldn't be doing in C++. The backward compatibility of C++ leaves a lot of landmines about and this book helps one carefully avoid them while at the same time being a good introduction into OO design and intent.

Here are two I haven't seen mentioned:

I wish I had read "Ruminations on C++" by Koenig and Moo much sooner. That was the book that made OO concepts really click for me.

And I recommend Michael Abrash's "Zen of Code Optimization" for anyone else planning on starting a programming career in the mid 90s.

Perfect Software: And Other Illusions about Testing

**Perfect Software: And Other Illusions about Testing** *by Gerald M. Weinberg*

ISBN-10: 0932633692

ISBN-13: 978-0932633699

Rapid Development by McConnell

The most influential programming book for me was *Enough Rope to Shoot Yourself in the Foot* by Allen Holub.

O, well, how long ago it was.

I have a few good books that strongly influenced me that I've not seen on this list so far:

*The Psychology of Everyday Things* by Donald Norman. The general principles of design for other people. This may seem to be mostly good for UI but if you think about it, it has applications almost anywhere there is an interface that someone besides the original developer has to work with; e. g. an API and designing the interface in such a way that other developers form the correct mental model and get appropriate feedback from the API itself.

*The Art of Software Testing* by Glen Myers. A good, general introduction to testing software; good for programmers to read to help them think like a tester i. e. think of what may go wrong and prepare for it.

By the way, I realize the question was the "Single Most Influential Book" but the discussion seems to have changed to listing good books for developers to read so I hope I can be forgiven for listing two good books rather than just one.

C++ How to Program It is good for beginner.This is excellent book that full complete with 1500 pages.

Effective C++ and More Effective C++ by Scott Myers.

Inside the C++ object model by Stanley Lippman

I bough this when I was a complete newbie and took me from only knowing that Java existed to a reliable team member in a short time

Not a programming book, but still a very important book every programmer should read:

The Pragmatic programmer was pretty good. However one that really made an impact when I was starting out was :

Windows 95 System Programming Secrets"

I know - it sounds and looks a bit cheesy on the outside and has probably dated a bit - but this was an awesome explanation of the internals of Win95 based on the Authors (Matt Pietrek) investigations using his own own tools - the code for which came with the book. Bear in mind this was before the whole open source thing and Microsoft was still pretty cagey about releasing documentation of internals - let alone source. There was some quote in there like "If you are working through some problem and hit some sticking point then you need to stop and really look deeply into that piece and really understand how it works". I've found this to be pretty good advice - particularly these days when you often have the source for a library and can go take a look. Its also inspired me to enjoy diving into the internals of how systems work, something that has proven invaluable over the course of my career.

Oh and I'd also throw in effective .net - great internals explanation of .Net from Don Box.

I recently read Dreaming in Code and found it to be an interesting read. Perhaps more so since the day I started reading it Chandler 1.0 was released. Reading about the growing pains and mistakes of a project team of talented people trying to "change the world" gives you a lot to learn from. Also Scott brings up a lot of programmer lore and wisdom in between that's just an entertaining read.

Beautiful Code had one or two things that made me think differently, particularly the chapter on top down operator precedence.

@Juan: I know Juan, I know - but there are some things that can only be learned by actually getting down to the task at hand. Speaking in abstract ideals all day simply makes you into an academic. It's in the application of the abstract that we truly grok the reason for their existence. :P

@Keith: Great mention of "The Inmates are Running the Asylum" by Alan Cooper - an eye opener for certain, any developer that has worked with me since I read that book has heard me mention the ideas it espouses. +1

I found the The Algorithm Design Manual to be a very beneficial read. I also highly recommend Programming Pearls.

This one isnt really a book for the beginning programmer, but if you're looking for SOA design books, then SOA in Practice: The Art of Distributed System Design is for you.

Advanced Programming in the UNIX Environment by W. Richard Stevens.

For me it was Design Patterns Explained it provided an 'Oh that's how it works' moment for me in regards to design patterns and has been very useful when teaching design patterns to others.

I'm a big fan of most titles by Robert C. Martin, especially Agile Software Development, Principles, and Practices and Clean Code: A Handbook of Agile Software Craftsmanship.

Code Craft by Pete Goodliffe is a good read!

The first book that made a real impact on me was *Mastering Turbo Assembler* by Tom Swan.

Other books that have had an impact was *Just For Fun* by Linus Torvalds and David Diamond and of course *The Pragmatic Programmer* by Andrew Hunt and David Thomas.

In addition to other people's suggestions, I'd recommend either acquiring a copy of SICP, or reading it online. It's one of the few books that I've read that I feel *greatly* increased my skill in designing software, particularly in creating good abstraction layers.

A book that is not *directly* related to programming, but is also a good read for programmers (IMO) is Concrete Mathematics. Most, if not all of the topics in it are useful for programmers to know about, and it does a better job of explaining things than any other math book I've read to date.

For me "Memory as a programming concept in C and C++" really opened my eyes to how memory management really works. If you're a C or C++ developer I consider it a must read. You will defiantly learn something or remember things you might have forgotten along the way.

http://www.amazon.com/Memory-Programming-Concept-C/dp/0521520436

Agile Software Development with Scrum by Ken Schwaber and Mike Beedle.

I used this book as the starting point to understanding Agile development.

Systemantics: How Systems Work and Especially How They Fail. Get it used cheap. But you might not get the humor until you've worked on a few failed projects.

The beauty of the book is the copyright year.

Probably the most profound takeaway "law" presented in the book:

*The Fundamental Failure-Mode Theorem (F.F.T.): Complex systems usually operate in failure mode.*

The idea being that there are failing parts in any given piece of software that are masked by failures in other parts or by validations in other parts. See a real-world example at the Therac-25 radiation machine, whose software flaws were masked by hardware failsafes. When the hardware failsafes were removed, the software race condition that had gone undetected all those years resulted in the machine killing 3 people.

It seems most people have already touched on the some very good books. One which really helped me out was Effective C#: 50 Ways to Improve your C#. I'd be remiss if I didn't mention The Tao of Pooh. Philosophy books can be good for the soul, and the code.

Discrete Mathematics For Computer Scientists by J.K. Truss.

While this doesn't teach you programming, it teaches you fundamental mathematics that every programmer should know. You may remember this stuff from university, but really, doing predicate logic will improve you programming skills, you need to learn Set Theory if you want to program using collections.

There really is a lot of interesting information in here that can get you thinking about problems in different ways. It's handy to have, just to pick up once in a while to learn something new.

I saw a review of Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools on a blog talking also about XI-Factory, I read it and I must say this book is a must read. Altough not specifically targetted to programmers, it explains very clearly what is happening in the programming world right now with Model-Driven Architecture and so on..

Solid Code Optimizing the Software Development Life Cycle

Although the book is only 300 pages and favors Microsoft technologies it still offers some good language agnostic tidbits.

Managing Gigabytes is an instant classic for thinking about the heavy lifting of information.

My vote is "How to Think Like a Computer Scientist: Learning With Python" It's available both as a book and as a free e-book.

It really helped me to understand the basics of not just Python but programming in general. Although it uses Python to demonstrate concepts, they apply to most, if not all, programming languages. Also: IT'S FREE!

Object-Oriented Programming in Turbo C++. Not super popular, but it was the one that got me started, and was the first book that really helped me grok what an object was. Read this one waaaay back in high school. It sort of brings a tear to my eye...

My high school math teacher lent me a copy of Are Your Lights Figure Problem that I have re-read many times. It has been invaluable, as a developer, and in life generally.

I'm reading now Agile Software Development, Principles, Patterns and Practices. For those interested in XP and Object-Oriented Design, this is a classic reading.

Kernighan & Plauger's Elements of Programming Style. It illustrates the difference between gimmicky-clever and elegant-clever.

to get advanced in prolog i like these two books:

really opens the mind for logic programming and recursion schemes.

Here's an excellent book that is not as widely applauded, but is full of deep insight: Agile Software Development: The Cooperative Game, by Alistair Cockburn.

What's so special about it? Well, clearly everyone has heard the term "Agile", and it seems most are believers these days. Whether you believe or not, though, there are some deep principles behind why the Agile movement exists. This book uncovers and articulates these principles in a precise, scientific way. Some of the principles are (btw, these are my words, not Alistair's):

- The hardest thing about team software development is getting everyone's brains to have the same understanding. We are building huge, elaborate, complex systems which are invisible in the tangible world. The better you are at getting more peoples' brains to share deeper understanding, the more effective your team will be at software development.
**This is the underlying reason that pair programming makes sense. Most people dismiss it (and I did too initially), but with this principle in mind I highly recommend that you give it another shot. You wind up with TWO people who deeply understand the subsystem you just built ... there aren't many other ways to get such a deep information transfer so quickly. It is like a Vulcan mind meld.** - You don't always need words to communicate deep understanding quickly. And a corollary: too many words, and you exceed the listener/reader's capacity, meaning the understanding transfer you're attempting does not happen. Consider that children learn how to speak language by being "immersed" and "absorbing". Not just language either ... he gives the example of some kids playing with trains on the floor. Along comes another kid who has never even
**SEEN**a train before ... but by watching the other kids, he picks up the gist of the game and plays right along. This happens all the time between humans. This along with the corollary about too many words helps you see how misguided it was in the old "waterfall" days to try to write 700 page detailed requirements specifications.

There is so much more in there too. I'll shut up now, but I HIGHLY recommend this book!

The Back of the Napkin, by Dan Roam.

A great book about visual thinking techniques. There is also an expanded edition now. I can't speak to that version, as I do not own it; yet.

Agile Software Development by Alistair Cockburn

Do users ever touch your code? If you're not doing solely back-end work, I recommend About Face: The Essentials of User Interface Design — now in its third edition (linked). I used to think my users were stupid because they didn't "get" my interfaces. I was, of course, wrong. About Face turned me around.

"Writing Solid Code: Microsoft's Techniques for Developing Bug-Free C Programs (Microsoft Programming Series)" by Steve MacGuire.

Interesting what a large proportion the books mentioned here are C/C++ books.

While not strictly a software development book, I would highly recommend that Don't Make me Think! be considered in this list.

As so many people have listed Head First Design Patterns, which I agree is a very good book, I would like to see if so many people aware of a title called Design Patterns Explained: A New Perspective on Object-Oriented Design.

This title deals with design patterns excellently. The first half of the book is very accessible and the remaining chapters require only a firm grasp of the content already covered The reason I feel the second half of the book is less accessible is that it covers patterns that I, as a young developer admittedly lacking in experience, have not used much.

This title also introduces the concept behind design patterns, covering Christopher Alexander's initial work in architecture to the GoF first implementing documenting patterns in SmallTalk.

I think that anyone who enjoyed Head First Design Patterns but still finds the GoF very dry, should look into Design Patterns Explained as a much more readable (although not quite as comprehensive) alternative.

Even though i've never programmed a game this book helped me understand a lot of things in a fun way.

How influential a book is often depends on the reader and where they were in their career when they read the book. I have to give a shout-out to Head First Design Patterns. Great book and the very creative way it's written should be used as an example for other tech book writers. I.e. it's written in order to facilitate learning and internalizing the concepts.

97 Things Every Programmer Should Know

This book pools together the collective experiences of some of the world's best programmers. It is a must read.

Extreme Programming Explained: Embrace Change by Kent Beck. While I don't advocate a hardcore XP-or-the-highway take on software development, I wish I had been introduced to the principles in this book much earlier in my career. Unit testing, refactoring, simplicity, continuous integration, cost/time/quality/scope - these changed the way I looked at development. Before Agile, it was all about the debugger and fear of change requests. After Agile, those demons did not loom as large.

One of my personal favorites is Hacker's Delight, because it was as much fun to read as it was educational.

I hope the second edition will be released soon!

You.Next(): Move Your Software Development Career to the Leadership Track ~ Michael C. Finley (Author), Honza Fedák (Author) link text

I've been arounda while, so most books that I have found influential don't necessarily apply today. I do believe it is universally important to understand the platform that you are developing for (both hardware and OS). I also think it's important to learn from other peoples mistakes. So two books I would recommend are:

Computing Calamities and In Search of Stupidity: Over Twenty Years of High Tech Marketing Disasters

Working Effectively with Legacy Code is a really amazing book that goes into great detail about how to properly unit test your code and what the true benefit of it is. It really opened my eyes.

Today I needed a simple algorithm for checking if a number is a power of 2.

The algorithm needs to be:

- Simple
- Correct for any
`ulong`

value.

I came up with this simple algorithm:

```
private bool IsPowerOfTwo(ulong number)
{
if (number == 0)
return false;
for (ulong power = 1; power > 0; power = power << 1)
{
// This for loop used shifting for powers of 2, meaning
// that the value will become 0 after the last shift
// (from binary 1000...0000 to 0000...0000) then, the 'for'
// loop will break out.
if (power == number)
return true;
if (power > number)
return false;
}
return false;
}
```

But then I thought, how about checking if `log`

is an exactly round number? But when I checked for 2^63+1, _{2} x`Math.Log`

returned exactly 63 because of rounding. So I checked if 2 to the power 63 is equal to the original number - and it is, because the calculation is done in `double`

s and not in exact numbers:

```
private bool IsPowerOfTwo_2(ulong number)
{
double log = Math.Log(number, 2);
double pow = Math.Pow(2, Math.Round(log));
return pow == number;
}
```

This returned `true`

for the given wrong value: `9223372036854775809`

.

Is there a better algorithm?

Some sites that document and explain this and other bit twiddling hacks are:

- http://graphics.stanford.edu/~seander/bithacks.html

(http://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2) - http://bits.stephan-brumme.com/

(http://bits.stephan-brumme.com/isPowerOfTwo.html)

And the grandaddy of them, the book "Hacker's Delight" by Henry Warren, Jr.:

As Sean Anderson's page explains, the expression `((x & (x - 1)) == 0)`

incorrectly indicates that 0 is a power of 2. He suggests to use:

```
(!(x & (x - 1)) && x)
```

to correct that problem.

Shifting bits left and right is apparently faster than multiplication and division operations on most, maybe even all, CPUs if you happen to be using a power of 2. However, it can reduce the clarity of code for some readers and some algorithms. Is bit-shifting really necessary for performance, or can I expect the compiler or VM to notice the case and optimize it (in particular, when the power-of-2 is a literal)? I am mainly interested in the Java and .NET behavior but welcome insights into other language implementations as well.

Most compilers today will do more than convert multiply or divide by a power-of-two to shift operations. When optimizing, many compilers can optimize a multiply or divide with a compile time constant even if it's not a power of 2. Often a multiply or divide can be decomposed to a series of shifts and adds, and if that series of operations will be faster than the multiply or divide, the compiler will use it.

For division by a constant, the compiler can often convert the operation to a multiply by a 'magic number' followed by a shift. This can be a major clock-cycle saver since multiplication is often much faster than a division operation.

Henry Warren's book, Hacker's Delight, has a wealth of information on this topic, which is also covered quite well on the companion website:

See also a discussion (with a link or two ) in:

Anyway, all this boils down to allowing the compiler to take care of the tedious details of micro-optimizations. It's been years since doing your own shifts outsmarted the compiler.

I must say I have never had cause to use bitwise operators, but I am sure there are some operations that I have performed that would have been more efficiently done with them. How have "shifting" and "OR-ing" helped you solve a problem more efficiently?

I have not read the book (yet), but I have been told that the Book Hacker's Delight shows a number of tricks in working with bits.

I am confused as to when I should use a Boolean vs Bitwise operators

```
and vs &, or vs |
```

Could someone enlighten me as to when do i use each and when will using one over the other affect my results?

The hint is in the name:

- Boolean operators are for performing
**logical operations**(truth testing common in programming and formal logic) - Bitwise operators are for
**"bit-twiddling"**(low level manipulation of bits in byte and numeric data types)

While it is possible and indeed sometimes desirable (typically for efficiency reasons) to perform logical operations with bitwise operators, you should generally avoid them for such purposes to prevent subtle bugs and unwanted side effects.

If you need to manipulate bits, then the bitwise operators are purpose built. The fun book: Hackers Delight contains some cool and genuinely useful examples of what can be achieved with bit-twiddling.

I would like to learn more about low level code optimization, and how to take advantage of the underlying machine architecture. I am looking for good pointers on where to read about this topic.

**More details:**

I am interested in optimization in the context of scientific computing (which is a lot of number crunching but *not only*) in low level languages such as C/C++. I am in particular interested in optimization methods that are not obvious unless one has a good understanding of how the machine works (which I don't---yet).

For example, it's clear that a better algorithm is faster, without knowing anything about the machine it's run on. It's not at all obvious that it matters if one loops through the columns or the rows of a matrix first. (It's better to loop through the matrix so that elements that are stored at adjacent locations are read successively.)

Basic advice on the topic or pointers to articles are most welcome.

**Answers**

Got answers with lots of great pointers, a lot more than I'll ever have time to read. Here's a list of all of them:

- The software optimization cookbook from Intel (book)
- What every programmer should know about memory (pdf book)
- Write Great Code, Volume 2: Thinking Low-Level, Writing High-Level (book)
- Software optimization resources by Agner Fog (five detailed pdf manuals)

I'll need a bit of skim time to decide which one to use (not having time for all).

An interesting book about bit manipulation and smart ways of doing low-level things is Hacker's Delight.

This is definitely worth a read for everyone interested in low-level coding.

I learned a lot from the book Inner Loops. It's ancient now, in computer terms, but it's very well written and Rick Booth is so enthusiastic about his subject I would still say it's worth looking at to see the kind of mindset you need to make a CPU fly.

Drepper's What Every Programmer Should Know About Memory [pdf] is a good reference to one aspect of low-level optimisation.

It's been a few years since I read it, but Write Great Code, Volume 2: Thinking Low-Level, Writing High-Level by Randall Hyde was quite good. It gives good examples of how C/C++ code translates into assembly, e.g. what really happens when you have a big `switch`

statement.

Also, altdevblogaday.com is focused on game development, but the programming articles might give you some ideas.

According to wiki shifts can be used to calculate powers of 2:

A left arithmetic shift by n is equivalent to multiplying by 2^n (provided the value does not overflow), while a right arithmetic shift by n of a two's complement value is equivalent to dividing by 2^n and rounding toward negative infinity.

I was always wondering if any other bitwise operators (`~`

,`|`

,`&`

,`^`

) make any mathematical sense when applied to base-10? I understand how they work, but do results of such operations can be used to calculate anything useful in decimal world?

Yes, there are other useful operations, but they tend to be oriented towards operations involving powers of 2 (for obvious reasons), e.g. test for odd/even, test for power of 2, round up/down to nearest power of 2, etc.

See Hacker's Delight by Henry S. Warren.

I just started reading Hacker's Delight and it defines abs(-2^{31}) as -2^{31}. Why is that?

I tried `printf("%x", abs(0x80000000))`

on a few different systems and I get back 0x80000000 on all of them.

For a 32bit datatype there is no expression of +2^31, because the biggest number is 2^31-1 ... read more about the two's complement ...

When using SSE2 instructions such as `PADDD`

(i.e., the `_mm_add_epi32`

intrinsic), is there a way to check whether any of the operations overflowed?

I thought that maybe a flag on the MXCSR control register may get set after an overflow, but I don't see that happening. For example, `_mm_getcsr()`

prints the same value in both cases below (8064):

```
#include <iostream>
#include <emmintrin.h>
using namespace std;
void main()
{
__m128i a = _mm_set_epi32(1, 0, 0, 0);
__m128i b = _mm_add_epi32(a, a);
cout << "MXCSR: " << _mm_getcsr() << endl;
cout << "Result: " << b.m128i_i32[3] << endl;
__m128i c = _mm_set_epi32((1<<31)-1, 3, 2, 1);
__m128i d = _mm_add_epi32(c, c);
cout << "MXCSR: " << _mm_getcsr() << endl;
cout << "Result: " << d.m128i_i32[3] << endl;
}
```

Is there some other way to check for overflow with SSE2?

Here is a somewhat more efficient version of @drhirsch's `sum_and_overflow`

function:

```
void sum_and_overflow(__v4si a, __v4si b, __v4si& sum, __v4si& overflow)
{
__v4si sa, sb;
sum = _mm_add_epi32(a, b); // calculate sum
sa = _mm_xor_si128(sum, a); // compare sign of sum with sign of a
sb = _mm_xor_si128(sum, b); // compare sign of sum with sign of b
overflow = _mm_and_si128(sa, sb); // get overflow in sign bit
overflow = _mm_srai_epi32(overflow, 31); // convert to SIMD boolean (-1 == TRUE, 0 == FALSE)
}
```

It uses an expression for overflow detection from Hacker's Delight page 27:

```
sum = a + b;
overflow = (sum ^ a) & (sum ^ b); // overflow flag in sign bit
```

Note that the overflow vector will contain the more conventional SIMD boolean values of -1 for TRUE (overflow) and 0 for FALSE (no overflow). If you only need the overflow in the sign bit and the other bits are "don't care" then you can omit the last line of the function, reducing the number of SIMD instructions from 5 to 4.

I have seen this topic here about John Carmack's magical way to calculate square root, which refers to this article: http://www.codemaestro.com/reviews/9. This surprised me a lot, I just didn't ever realized that calculating sqrt could be so faster.

I was just wondering what other examples of "magic" exist out there that computer games use to run faster.

**UPDATE**:
John Carmack is not the author of the magic code. This article tells more. Thanks @moocha.

There is a book which gathers many of those 'magic tricks' and that may be interesting for you: The Hacker's Delight.

You have for example many tricks like bit twiddling hacks etc... (you have several square root algorithms for example that you can see on the google books version)

I was wondering if the above was at all possible. For example:

```
Math.Sqrt(myVariableHere);
```

When looking at the overload, it requires a double parameter, so I'm not sure if there is another way to replicate this with decimal datatypes.

I don't understand why all the answers to that question are the same.

There are several ways to calculate the square root from a number. One of them was proposed by Isaac Newton. I'll only write one of the simplest implementations of this method. I use it to improve the accuracy of double's square root.

```
// x - a number, from which we need to calculate the square root
// epsilon - an accuracy of calculation of the root from our number.
// The result of the calculations will differ from an actual value
// of the root on less than epslion.
public static decimal Sqrt(decimal x, decimal epsilon = 0.0M)
{
if (x < 0) throw new OverflowException("Cannot calculate square root from a negative number");
decimal current = (decimal)Math.Sqrt((double)x), previous;
do
{
previous = current;
if (previous == 0.0M) return 0;
current = (previous + x / previous) / 2;
}
while (Math.Abs(previous - current) > epsilon);
return current;
}
```

About speed: in the worst case (epsilon = 0 and number is decimal.MaxValue) the loop repeats less than a three times.

If you want to know more, read this (*Hacker's Delight* by Henry S. Warren, Jr.)

There is a lot of information on how to find the next power of 2 of a given value (see refs) but I cannot find any to get the previous power of two.

The only way I find so far is to keep a table with all power of two up to 2^64 and make a simple lookup.

From Hacker's Delight, a nice *branchless* solution:

```
uint32_t flp2 (uint32_t x)
{
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >> 16);
return x - (x >> 1);
}
```

This typically takes 12 instructions. You can do it in fewer if your CPU has a "count leading zeroes" instruction.

I am lookin for a method to have number of 1's in 32 bit number without using a loop in between. can any body help me and provide me the code or algorithm to do so. Thanks in advance.

See `Integer.bitCount(int)`

. You can refer to the source code if you want to see how it works; many of the `Integer`

class's bit twiddling routines are taken from *Hacker's Delight.*

If I have an assignment

```
Long c = a + b;
```

Is there an easy way to check that `a + b`

is not bigger/smaller than `Long.MAX_VALUE`

/`Long.MIN_VALUE`

?

Using Guava, it's as simple as

```
long c = LongMath.checkedAdd(a, b); // throws an ArithmeticException on overflow
```

which is, I'd like to think, very readable indeed. (LongMath Javadoc here.)

For the sake of fairness, I'll mention that Apache Commons provides `ArithmeticUtils.addAndCheck(long, long)`

.

If you want to know how they work, well, the answer is one line of bit-hackery for Guava: the result doesn't overflow if `(a ^ b) < 0 | (a ^ (a + b)) >= 0`

. This is based on the trick that the bitwise XOR of two numbers is nonnegative iff they have the same sign.

So `(a ^ b) < 0`

is true if `a`

and `b`

have different signs, and if that's the case it'll never overflow. Or, if `(a ^ (a + b)) >= 0`

, then `a + b`

has the same sign as `a`

, so it didn't overflow and become negative.

(For more tricks like this, investigate the lovely book *Hacker's Delight*.)

Apache uses more complicated casework based on the sign of `a`

and `b`

.

I'm looking for the fastest way to popcount on large buffer of 512 or more bytes. I can guarantee any required alignment, and the buffer size is always a power of 2. The buffer corresponds to block allocations, so typically the bits are either all set, none set, or mostly set favoring the "left" of the buffer, with occasional holes.

Some solutions I've considered are:

I'm interested in the fastest solution, it must work on 32bit x86 chipset belonging to core2 or more recent. SSE and SIMD are of great interest. I'll be testing on the following quad core CPU:

```
matt@stanley:~/anacrolix/public/stackoverflow$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 1600.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow vnmi flexpriority
bogomips : 4800.21
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
```

I would suggest implementing one of the optimised 32 bit popcnt routines from Hacker's Delight, but do it for 4 x 32 bit integer elements in an SSE vector. You can then process 128 bits per iteration, which should give you around 4x throughput compared to an optimised 32 bit scalar routine.

How would i go about finding the number of 'zero' bits in C++. Suppose I have an integer;

```
int value = 276;
```

For which I have the bits 100010100, but how do I count the zeros?

There is a great book for this kind of stuff : Hacker's Delight (yeah, the name sucks : it has nothing to do with security but exclusively bit-twiddling). It provides several algorithms to count '1' bits, the best can also be found here (although the book has explanations that this website doesn't).

Once you know the '1' bits count, just subtract it to the number of bits in your type representation.

Recently I discovered, that if I need to see if variable is even ( or odd ), I could just see if last bit of variable equals 0. This discovery, when implemented, replaced few *modulo 2* calculations and thus, whole function ran faster.

Are there any other "tricks" like this one, where working with bits could replace other calculations, leading to improved function execution time?

Well, there is a great book on this topic: Hacker's Delight (Google books).

On Amazon.com: Hacker's Delight.

**Is there any standard way for converting an (any) equation into bit-shift operations?**

By this I mean converting any thing that is not a + or - into bit-shifts, so the end equation contains only the operands **<<, >>, +, and -**. This is in the interest of making formulas less processor intensive.

Obviously these resultant equations will only be approximations, giving better accuracy with the more orders considered (first-order, second-order e.t.c).

I have scoured the web for any information on this but can't find any, except for stuff on specific formulas (sin, cos, inv e.t.c).

I was envisioning something like a polynomial or Taylor's expansion procedure, and then converting that to bit-shift operations.

Software multiplication with bit-wise operations is not likely to beat hardware multiplication on modern CPUs.

Usually going down to bit-wise manipulation can yield better performances if it allows to avoid 1) loops; and 2) branching.

A good online cookbook for bit hacking. Otherwise there is A Hacker's delight.

- What are some of the ways a compiler eliminates repeated subexpressions recomputations? How do you keep track of the sub-expressions? And How do you identify the repeated ones?
- Besides the usage of bitwise operators, what are some of the strength reduction techniques common compilers use?

I would highly recommend two printed references on these subjects:

- Advanced Compiler Design & Implementation by Steven S. Muchnick
- Building an Optimizing Compiler by Robert Morgan

The Muchnick book is on the formal side but is very readable and has good descriptions of all of the important optimization techniques. The Morgan book has a much more hands-on feel and would be a great basis for a compiler project focused on optimization techniques. Neither book has much to say about lexical analysis or parsing, knowledge of these subjects is assumed.

To add one more book to the list of recommendations, check out "Hacker's Delight" by Henry S. Warren. It's a great compendium of techniques for optimizing common operations, like transforming integer divisions into multiplications.

Bitcounting can be done in several ways, eg. with set bit iterator, unset bit iterator, pre-computed bits with lookup tables or parallel counting. As I have figured out by searching the web, unset bit iterator is fast when there are less unset bits, and set bit iterator the opposite. But when should you use parallel counting, MIT HAKMEM (seen below) in particular? It seems quite fast, although probably slower then lookup tables. Is it always better compared to set/unset bit in terms of speed? Are there some other conserns regarding which one to choose than speed and memory?

```
int BitCount(unsigned int u) {
unsigned int uCount;
uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
return ((uCount + (uCount >> 3)) & 030707070707) % 63;
}
```

Why choose one bit counting method over another? Well it really depends on your machine and the problem you're trying to solve. Note that all the instruction counts I give below are for a basic RISC processor and might not translate well to a more complicated beast like x86.

The HAKMEM algorithm you quoted will execute in 13 instructions but is unlikely to be very fast due to the modulus operator. By eye-balling it, it does look like it has some pretty good instruction level parallelism which should help if your processor is capable of exploiting that.

The algorithm Bo Persson presented is quite fast (`2 + 5*pop(x)`

instructions) but only if the word is sparsely populated. It can also be modified to work on densely populated words. It also contain branches and doesn't have any significant instruction level parallelism.

**EDIT:** The table lookup method can also be very fast but does make memory accesses. If the entire table is in the L1 cache then it's probably one of the fastest algorithms. If the table isn't in cache then it's almost certainly one of the slowest.

The algorithm below is a variation of one of the HAKMEM algorithm and is presented in the book Hacker's Delight (I highly recommend this book if you like this sort of things). It executes in 19 instructions and is branch-free. It also doesn't use a division but does have a multiplication. It's also very economical in the way it uses registers by re-using the same mask as much as possible. Still no significant instruction level parallelism here that I can see.

```
int pop(unsigned x) {
unsigned n;
n = (x >> 1) & 0x77777777;
x = x - n;
n = (n >> 1) & 0x77777777;
x = x - n;
n = (n >> 1) & 0x77777777;
x = x - n;
x = (x + (x >> 4)) & 0x0F0F0F0F;
x = x * 0x01010101;
return x >> 24;
}
```

The Hacker's Delight book also presents a couple of even more specialised algorithms for 9-8-7 bit wide fields or using floating point operators. Note that most of the analysis I presented above were also partially taken from that book as well.

The fact is that there's a truck load of methods and the only way to be sure which works best in your particular situation is to measure and compare. I do realise that this is a pretty canned answer but the alternative is to know your processor and compiler inside out.

Could anyone give me some pointers as to the best way in which to learn how to do very low latency programming? I have many programming books but I've never seen one which focused (or helped) on writing extremely fast code. Or are books not the best way forward?

Some advice from an expert would be really appreciated!

EDIT: I think I'm referring more to CPU/Memory bound.

[C++ programmer]:

Ultra-low-latency programming is hard. Much harder than people suspect when they first start down the path. There are some techniques and "tricks" you can employ. Like IO Completion ports, multi core utilization, highly optimized synchronization techniques, shared memory. The list goes on forever. (edit) It's not as simple as "code-profile-refactor-repeat" because you can write excellent code that is robust and fast, but will never be truly ultra-low latency code.

Unfortunately there is no one single resource I know of that will show you how it's done. Programmers specializing in (and good at) ultra low-latency code are among the best in the business and the most experienced. And with good reason. Because if there is a silver bullet solution to becoming a good low-latency programmer, it is simply this: you have to know **a lot about everything**. And that knowledge is not easy to come by. It takes years (decades?) of experience and constant study.

As far as the study itself is concerned, here's a few books I found useful or especially insightful for one reason or another:

i just stumbled onto this comment.

```
public static int lowestOneBit(int i) {
// HD, Section 2-1
return i & -i;
}
```

in the 1.5 java source. what does this comment mean? is it a reference to a book? a spec?

HD probably refers to Hacker's Delight by Henry S. Warren. Indeed, this formula appears on page 11, section 2-1.

Possible Duplicate:

Best algorithm to count the number of set bits in a 32-bit integer?

I want to find out how many 1s are there in binary representation of a number.I have 2 logic .

`int count =0; int no = 4; while(no!=0){ int d = no%2; if(d==1) count++; no = no/2; str = str+ d; }`

Now second logic is to keep on masking number iteratively with 1,2,4,8,32 and check if result is 1,2,4, 8..... Am not geting what should be ending condition for this loop.

Use Java API(java 5 or above).

```
Integer.bitCount(int);
Long.bitCount(long);
```

NOTE: The above java methods are based on hacker's delight

Best in PHP,

for example,

11011111 ==> 11111011

Try to get this book, there is whole chapter about bits reversion: Hacker's Delight. But please check content first if this suits you.

I am a mid level developer. Currently I work in C# ASP.NET but I have worked in PHP, classic ASP, and a tiny bit in Java (would like to get more into this in the future). I am looking for a few suggestions on good books to help me continue growing in my coding techniques.

Basically a book with lots of best practices, coding tricks, rarely used but very useful features and operators (e.g. the ?? operator), stuff like that. I am NOT looking for a book to teach me a specific language. I don't particularly care about what language the code samples are in but C or Java would probably be best. The suggestions don't need to be limited to pure code objects either. Books with information about database connection management ect. would be good.

Thanks for your suggestions.

More Recent

For your java Skills:

and in general

I strongly recommend to read Effective Java **before** reading Clean Code. -- Because it will make you more sensitive for a few (I personal belive not so optimal) code changes suggested in clean code.

The first book that comes to mind for me is Code Complete. It's fantastic. Absolutely required reading for a serious developer.

If by rarely used stuff, you think Bitwise operations, Hacker's Delight is supposed to be really well regarded.

I am trying to find some books or resources talking about bit in detail so that for example I would be able to translate a number (like 16) into bits. I am currently a high school student and whenever reading a programming books I can understand almost everything except the bit/bitwise operators part. I just do not know how it works and why do people even invent bit & byte :(. Therefore, I hope that you guys can give me some resources suggestions talking about how to translate number/characters into bits.

Thank you for answering my question and have a great day!

One of the best books I've ever read on binary math and bit shifting is Hacker's Delight. It's practically THE book on anything to do with superoptimization. I'd highly recommend reading it and if the material is too complex, working through it very slowly. I've had to rewrite standard library functions like strlen() for a hobby OS before and this book saved my life.

How can I unset the most significant setted bit of a word (e.g. 0x00556844 -> 0x00156844)? There is a `__builtin_clz`

in gcc, but it just counts the zeroes, which is unneeded to me. Also, how should I replace __builtin_clz for msvc or intel c compiler?

Current my code is

```
int msb = 1<< ((sizeof(int)*8)-__builtin_clz(input)-1);
int result = input & ~msb;
```

UPDATE: Ok, if you says that this code is rather fast, I'll ask you, how should I add a portability to this code? This version is for GCC, but MSVC & ICC?

Just round down to the nearest power of 2 and then XOR that with the original value, e.g. using `flp2()`

from Hacker's Delight:

```
uint32_t flp2(uint32_t x) // round x down to nearest power of 2
{
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >>16);
return x - (x >> 1);
}
uint32_t clr_msb(uint32_t x) // clear most significant set bit in x
{
msb = flp2(x); // get MS set bit in x
return x ^ msb; // XOR MS set bit to clear it
}
```

I have an array of uint64 and for all unset bits (0s), I do some evaluations.

The evaluations are not terribly expensive, but very few bits are unset. Profiling says that I spend a lot of time in the finding-the-next-unset-bit logic.

Is there a faster way (on a Core2duo)?

My current code can skip lots of high 1s:

```
for(int y=0; y<height; y++) {
uint64_t xbits = ~board[y];
int x = 0;
while(xbits) {
if(xbits & 1) {
... with x and y
}
x++;
xbits >>= 1;
}
}
```

(And any discussion about how/if to SIMD/CUDA-ise this would be an intriguing tangent!)

Hacker's Delight suggests a loop-unrolled binary search. Not pretty, but fast for sparse unset bits because it skips dwords/bytes/nibbles/etc. with every bit set.

If you can get a Phenom with SSE4a (not Core2 Duo, unfortunately) you can use POPCNT to write a fast number-of-set-bits function. Then you can get the index of the next unset bit with:

```
pop(x & (~x-1))
```

`x & (~x-1)`

clears the set bits above the next zero bit; then you just have to count the remaining bits with POPCNT.

Here's a worked example with a byte:

```
01101111 x
10010000 ~x
10001111 ~x-1
00001111 x & ~x-1
pop(00001111) => 4
```

I want to use a version of the well known MIT bitcount algorithm to count neighbors in Conway's game of life using SSE2 instructions.

Here's the MIT bitcount in c, extended to count bitcounts > 63 bits.

```
int bitCount(unsigned long long n)
{
unsigned long long uCount;
uCount = n – ((n >> 1) & 0×7777777777777777)
- ((n >> 2) & 0×3333333333333333)
- ((n >> 3) & 0×1111111111111111);
return ((uCount + (uCount >> 4))
& 0x0F0F0F0F0F0F0F0F) % 255;
}
```

Here's a version in Pascal

```
function bitcount(n: uint64): cardinal;
var ucount: uint64;
begin
ucount:= n - ((n shr 1) and $7777777777777777)
- ((n shr 2) and $3333333333333333)
- ((n shr 3) and $1111111111111111);
Result:= ((ucount + (count shr 4))
and $0F0F0F0F0F0F0F0F) mod 255;
end;
```

I'm looking to count the bits in this structure in parallel.

```
32-bit word where the pixels are laid out as follows.
lo-byte lo-byte neighbor
0 4 8 C 048C 0 4 8 C
+---------------+
1|5 9 D 159D 1|5 9 D
| |
2|6 A E 26AE 2|6 A E
+---------------+
3 7 B F 37BF 3 7 B F
|-------------| << slice A
|---------------| << slice B
|---------------| << slice C
```

Notice how this structure has 16 bits in the middle that need to be looked up.
I want to **calculate** neighbor counts for each of the 16 bits in the middle using SSE2.
In order to do this I put slice A in XMM0 low-dword, slice B in XXM0-dword1 etc.

I copy XMM0 to XMM1 and I mask off bits `012-456-89A`

for bit `5`

in the low word of XMM0, do the same for word1 of XMM0, etc. using different slices and masks to make sure each word in XMM0 and XMM1 holds the neighbors for a different pixel.

**Question**

How do I tweak the MIT-bitcount to end up with a bitcount per word/pixel in each XMM word?

**Remarks**

I don't want to use a lookup table, because I already have that approach and I want to
test to see if SSE2 will speed up the process by not requiring memory accesses to the lookup table.

An answer using SSE assembly would be optimal, because I'm programming this in Delphi and I'm thus using x86+SSE2 assembly code.

The MIT algorithm would be tough to implement in SSE2, since there is no integer modulus instruction which could be used for the final `... % 255`

expression. Of the various popcnt methods out there, the one that lends itself to SSE most easily and efficiently is probably the first one in Chapter 5 of "Hackers Delight" by Henry S. Warren, which I have implemented here in C using SSE intrinsics:

```
#include <stdio.h>
#include <emmintrin.h>
__m128i _mm_popcnt_epi16(__m128i v)
{
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x5555)), _mm_and_si128(_mm_srli_epi16(v, 1), _mm_set1_epi16(0x5555)));
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x3333)), _mm_and_si128(_mm_srli_epi16(v, 2), _mm_set1_epi16(0x3333)));
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x0f0f)), _mm_and_si128(_mm_srli_epi16(v, 4), _mm_set1_epi16(0x0f0f)));
v = _mm_add_epi16(_mm_and_si128(v, _mm_set1_epi16(0x00ff)), _mm_and_si128(_mm_srli_epi16(v, 8), _mm_set1_epi16(0x00ff)));
return v;
}
int main(void)
{
__m128i v0 = _mm_set_epi16(7, 6, 5, 4, 3, 2, 1, 0);
__m128i v1;
v1 = _mm_popcnt_epi16(v0);
printf("v0 = %vhd\n", v0);
printf("v1 = %vhd\n", v1);
return 0;
}
```

Compile and test as follows:

```
$ gcc -Wall -msse2 _mm_popcnt_epi16.c -o _mm_popcnt_epi16
$ ./_mm_popcnt_epi16
v0 = 0 1 2 3 4 5 6 7
v1 = 0 1 1 2 1 2 2 3
$
```

It looks like around 16 arithmetic/logical instructions so it should run at around 16 / 8 = 2 clocks per point.

You can easily convert this to raw assembler if you need to - each intrinsic maps to a single instruction.

Suppose I have an increasing sequence of unsigned integers `C[i]`

. As they increase, it's likely that they will occupy increasingly many bits. I'm looking for an efficient conditional, based purely on two consecutive elements of the sequence `C[i]`

and `C[i+1]`

(past and future ones are not observable), that will evaluate to true either exactly or approximately once for every time the number of bits required increases.

An obvious (but slow) choice of conditional is:

```
if (ceil(log(C[i+1])) > ceil(log(C[i]))) ...
```

and likewise anything that computes the number of leading zero bits using special cpu opcodes (much better but still not great).

I suspect there may be a nice solution involving an expression using just bitwise or and bitwise and on the values `C[i+1]`

and `C[i]`

. Any thoughts?

I think you just need `clz(C[i+1]) < clz(C[i])`

where `clz`

is a function which returns the number of leading zeroes ("count leading zeroes"). Some CPU families have an instruction for this (which may be available as an instrinsic). If not then you have to roll your own (it typically only takes a few instructions) - see Hacker's Delight.

i got really confused every time when i encounterd bit operations,especially those shifts,rotates,overflow things etc.I wonder if there's any book/article on the web introducing boolean algebra,which could give me a solid background of boolean algebra,thanks!

At university we used a book called Introduction to Logic Design. Covered everything from boolean algebra up to FPGA stuff. Pretty comprehensive and it has a fair amount of exercises.

Two really great books come to mind.

Also, online you can read Bit Twiddling Hacks.

I have a `byte`

I am using to store bit flags. I need to compute the position of the most significant set bit in the byte.

**Example Byte:** `00`

`1`

`01101`

=> `6`

is the position of the most significant set bit

**Compact Hex Mapping:**

```
[0x00] => 0x00
[0x01] => 0x01
[0x02,0x03] => 0x02
[0x04,0x07] => 0x03
[0x08,0x0F] => 0x04
[0x10,0x1F] => 0x05
[0x20,0x3F] => 0x06
[0x40,0x7F] => 0x07
[0x80,0xFF] => 0x08
```

**TestCase in C:**

```
#include <stdio.h>
unsigned char check(unsigned char b) {
unsigned char c = 0x08;
unsigned char m = 0x80;
do {
if(m&b) { return c; }
else { c -= 0x01; }
} while(m>>=1);
return 0; //never reached
}
int main() {
unsigned char input[256] = {
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2f,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3d,0x3e,0x3f,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0x5b,0x5c,0x5d,0x5e,0x5f,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x7b,0x7c,0x7d,0x7e,0x7f,
0x80,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8a,0x8b,0x8c,0x8d,0x8e,0x8f,
0x90,0x91,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0x9b,0x9c,0x9d,0x9e,0x9f,
0xa0,0xa1,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xad,0xae,0xaf,
0xb0,0xb1,0xb2,0xb3,0xb4,0xb5,0xb6,0xb7,0xb8,0xb9,0xba,0xbb,0xbc,0xbd,0xbe,0xbf,
0xc0,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xcb,0xcc,0xcd,0xce,0xcf,
0xd0,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,0xd7,0xd8,0xd9,0xda,0xdb,0xdc,0xdd,0xde,0xdf,
0xe0,0xe1,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xeb,0xec,0xed,0xee,0xef,
0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0xfa,0xfb,0xfc,0xfd,0xfe,0xff };
unsigned char truth[256] = {
0x00,0x01,0x02,0x02,0x03,0x03,0x03,0x03,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,
0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,
0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,
0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08};
int i,r;
int f = 0;
for(i=0; i<256; ++i) {
r=check(input[i]);
if(r !=(truth[i])) {
printf("failed %d : 0x%x : %d\n",i,0x000000FF & ((int)input[i]),r);
f += 1;
}
}
if(!f) { printf("passed all\n"); }
else { printf("failed %d\n",f); }
return 0;
}
```

I would like to simplify my `check()`

function to not involve looping (or branching preferably). Is there a bit twiddling hack or hashed lookup table solution to compute the position of the most significant set bit in a byte?

I'm sure everyone else has long since moved on to other topics but there was something in the back of my mind suggesting that there had to be a more efficient branch-less solution to this than just unrolling the loop in my other posted solution. A quick trip to my copy of Warren put me on the right track: Binary search.

Here's my solution based on that idea:

```
Pseudo-code:
// see if there's a bit set in the upper half
if ((b >> 4) != 0)
{
offset = 4;
b >>= 4;
}
else
offset = 0;
// see if there's a bit set in the upper half of what's left
if ((b & 0x0C) != 0)
{
offset += 2;
b >>= 2;
}
// see if there's a bit set in the upper half of what's left
if > ((b & 0x02) != 0)
{
offset++;
b >>= 1;
}
return b + offset;
```

Branch-less C++ implementation:

```
static unsigned char check(unsigned char b)
{
unsigned char adj = 4 & ((((unsigned char) - (b >> 4) >> 7) ^ 1) - 1);
unsigned char offset = adj;
b >>= adj;
adj = 2 & (((((unsigned char) - (b & 0x0C)) >> 7) ^ 1) - 1);
offset += adj;
b >>= adj;
adj = 1 & (((((unsigned char) - (b & 0x02)) >> 7) ^ 1) - 1);
return (b >> adj) + offset + adj;
}
```

Yes, I know that this is all academic :)

Hello is have a question for a school assignment i need to :

Read a round number, and with the internal binaire code with bit 0 on the right and bit 7 on the left.

Now i need to change: bit 0 with bit 7 bit 1 with bit 6 bit 2 with bit 5 bit 3 with bit 4

by example :

if i use hex F703 becomes F7C0 because 03 = 0000 0011 and C0 = 1100 0000 (only the right byte (8 bits) need to be switched. The lession was about bitmanipulation but i can't find a way to make it correct for al the 16 hexnumbers.

I`am puzzling for a wile now,

i am thinking for using a array for this problem or can someone say that i can be done with only bitwise ^,&,~,<<,>>, opertors ???

Study the following two functions:

```
bool GetBit(int value, int bit_position)
{
return value & (1 << bit_position);
}
void SetBit(int& value, int bit_position, bool new_bit_value)
{
if (new_bit_value)
value |= (1 << bit_position);
else
value &= ~(1 << bit_position);
}
```

So now we can read and write arbitrary bits just like an array.

```
1 << N
```

gives you:

```
000...0001000...000
```

Where the 1 is in the Nth position.

So

```
1 << 0 == 0000...0000001
1 << 1 == 0000...0000010
1 << 2 == 0000...0000100
1 << 3 == 0000...0001000
...
```

and so on.

Now what happens if I BINARY AND one of the above numbers with some other number Y?

```
X = 1 << N
Z = X & Y
```

What is Z going to look like? Well every bit apart from the Nth is definately going to be 0 isnt it? because those bits are 0 in X.

What will the Nth bit of Z be? It depends on the value of the Nth bit of Y doesn't it? So under what circumstances is Z zero? Precisely when the Nth bit of Y is 0. So by converting Z to a bool we can seperate out the value of the Nth bit of Y. Take another look at the GetBit function above, this is exactly what it is doing.

Now thats reading bits, how do we set a bit? Well if we want to set a bit on we can use BINARY OR with one of the (1 << N) numbers from above:

```
X = 1 << N
Z = Y | X
```

What is Z going to be here? Well every bit is going to be the same as Y except the Nth right? And the Nth bit is always going to be 1. So we have set the Nth bit on.

What about setting a bit to zero? What we want to do is take a number like 11111011111 where just the Nth bit is off and then use BINARY AND. To get such a number we just use BINARY NOT:

```
X = 1 << N // 000010000
W = ~X // 111101111
Z = W & Y
```

So all the bits in Z apart from the Nth will be copies of Y. The Nth will always be off. So we have effectively set the Nth bit to 0.

Using the above two techniques is how we have implemented SetBit.

So now we can read and write arbitrary bits. Now we can reverse the bits of the number just like it was an array:

```
int ReverseBits(int input)
{
int output = 0;
for (int i = 0; i < N; i++)
{
bool bit = GetBit(input, i); // read ith bit
SetBit(output, N-i-1, bit); // write (N-i-1)th bit
}
return output;
}
```

Please make sure you understand all this. Once you have understood this all, please close the page and implement and test them without looking at it.

If you enjoyed this than try some of these:

http://graphics.stanford.edu/~seander/bithacks.html

And/or get this book:

I came across a StackOverflow answer that gives the following code to efficiently count the number of bits that are set to `1`

in a 32-bit `int`

:

```
int NumberOfSetBits(int i)
{
i = i - ((i >> 1) & 0x55555555);
i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
```

But I had lot of issues in understanding this. I couldn't find a link where it's explained properly. Can anyone help me out here in understanding this piece of code, or provide a link which could be more helpful?

Answering somewhat indirectly: a great reference on how bit twiddling routines like this (and hundreds of others) work is the book "Hacker's Delight" by Henry Warren. I highly recommend it -- it belongs on every programmer's bookshelf. http://www.amazon.com/Hackers-Delight-Henry-S-Warren/dp/0201914654

I'm curious to know what usage patterns people have established for themselves to utilize bit shifting in a modern day context such as client/server, web, desktop development. What's a usage of bit shifting in software development other than device drivers, asm?

What bitwise operations / shifting techniques should I be using? What are some algorithms I can reuse?

I guess The Art of Computer Programming, Vol. IV, Fasc. 1 will be right up your alley.

It really depends on the language, application your developing etc. That being said, I would look at Hacker's Delight. I think it will have what you are looking for.

i am studing different methods about bit counting ,or population count methods fopr given integer, during this days,i was trying to figure out how following algorithms works

```
pop(x)=-sum(x<<i) where i=0:31
```

i think that after calculate each value of x,we will get

```
x+2*x+4*x+8*x+16*x+..............+2^31*x =4294967294*x
```

if we multiply it by -1,we get `-4294967294*x`

,but how it counts number of bits?please help me to understand this method well.thanks

I believe you mean

as seen in the cover of the book *Hacker's Delight*, where the symbol means left-*rotation* not left-*shift* which will produce the wrong results and downvotes.

This method works because the rotation will cause all binary digits of *x* to appear in every possible bits in all terms, and because of 2's complement.

Take a simpler example. Consider numbers with only 4 binary digits, where the digits can be represented as `ABCD`

, then the summation means:

```
ABCD // x <<rot 0
+ BCDA // x <<rot 1
+ CDAB // x <<rot 2
+ DABC // x <<rot 3
```

We note that every column has all of A, B, C, D. Now, `ABCD`

actually means "2³ A + 2² B + 2¹ C + 2⁰ D", so the summation is just:

```
2³ A + 2² B + 2¹ C + 2⁰ D
+ 2³ B + 2² C + 2¹ D + 2⁰ A
+ 2³ C + 2² D + 2¹ A + 2⁰ B
+ 2³ D + 2² A + 2¹ B + 2⁰ C
——————————————————————————————————————————————————————
= 2³(A+B+C+D) + 2²(B+C+D+A) + 2¹(C+D+A+B) + 2⁰(D+A+B+C)
= (2³ + 2² + 2¹ + 2⁰) × (A + B + C + D)
```

The (A + B + C + D) is the population count of *x* and (2³ + 2² + 2¹ + 2⁰) = 0b1111 is -1 in 2's complement, so the summation is the negative of the population count.

The argument can be easily extended to 32-bit numbers.

Realated tags

.net32-bitalgebraalgorithmarchitectureasp.netassemblybitbit-manipulationbit-shiftbitarraybitcountbitflagsbitsbitwise-operatorsbooleancc#c++commentscompiler-constructioncompiler-developmentconways-game-of-lifedecimaldesigndoublegccintegerjavalanguage-agnosticlong-integerlow-latencymathmicro-optimizationnumericaloptimizationperformancephppythonresourcessimdssesse2typesx86