Artificial Intelligence

Stuart Jonathan Russell, Peter Norvig

Mentioned 16

Artificial intelligence: A Modern Approach, 3e,is ideal for one or two-semester, undergraduate or graduate-level courses in Artificial Intelligence. It is also a valuable resource for computer professionals, linguists, and cognitive scientists interested in artificial intelligence. The revision of this best-selling text offers the most comprehensive, up-to-date introduction to the theory and practice of artificial intelligence.

More on

Mentioned in questions and answers.

I'm struggling my way through Artificial Intelligence: A Modern Approach in order to alleviate my natural stupidity. In trying to solve some of the exercises, I've come up against the "Who Owns the Zebra" problem, Exercise 5.13 in Chapter 5. This has been a topic here on SO but the responses mostly addressed the question "how would you solve this if you had a free choice of problem solving software available?"

I accept that Prolog is a very appropriate programming language for this kind of problem, and there are some fine packages available, e.g. in Python as shown by the top-ranked answer and also standalone. Alas, none of this is helping me "tough it out" in a way as outlined by the book.

The book appears to suggest building a set of dual or perhaps global constraints, and then implementing some of the algorithms mentioned to find a solution. I'm having a lot of trouble coming up with a set of constraints suitable for modelling the problem. I'm studying this on my own so I don't have access to a professor or TA to get me over the hump - this is where I'm asking for your help.

I see little similarity to the examples in the chapter.

I was eager to build dual constraints and started out by creating (the logical equivalent of) 25 variables: nationality1, nationality2, nationality3, ... nationality5, pet1, pet2, pet3, ... pet5, drink1 ... drink5 and so on, where the number was indicative of the house's position.

This is fine for building the unary constraints, e.g.

The Norwegian lives in the first house:

nationality1 = { :norway }.

But most of the constraints are a combination of two such variables through a common house number, e.g.

The Swede has a dog:

nationality[n] = { :sweden } AND pet[n] = { :dog }

where n can range from 1 to 5, obviously. Or stated another way:

    nationality1 = { :sweden } AND pet1 = { :dog } 
XOR nationality2 = { :sweden } AND pet2 = { :dog } 
XOR nationality3 = { :sweden } AND pet3 = { :dog } 
XOR nationality4 = { :sweden } AND pet4 = { :dog } 
XOR nationality5 = { :sweden } AND pet5 = { :dog } 

...which has a decidedly different feel to it than the "list of tuples" advocated by the book:

( X1, X2, X3 = { val1, val2, val3 }, { val4, val5, val6 }, ... )

I'm not looking for a solution per se; I'm looking for a start on how to model this problem in a way that's compatible with the book's approach. Any help appreciated.

Thanks to everyone for some helpful information!

The hint I really needed came to me in a traffic jam. Rather than assigning nationalities, pets etc. to houses (variables named country1, country2, pet1, pet2), what I needed to do was assign houses to the elements of the domain! Example:

(9) norway = 1        ; unary constraint: The Norwegian lives in the 1st house
(2) britain = dog     ; binary constraint: Dog is in same house as the Brit
(4) green - ivory = 1 ; relative positions

This allowed me to find simple formulations for my constraints, like this:

(def constraints
   [:con-eq :england :red]
   [:con-eq :spain :dog]
   [:abs-pos :norway 1]
   [:con-eq :kools :yellow]
   [:next-to :chesterfields :fox]
   [:next-to :norway :blue]
   [:con-eq :winston :snails]
   [:con-eq :lucky :oj]
   [:con-eq :ukraine :tea]
   [:con-eq :japan :parliaments]
   [:next-to :kools :horse]
   [:con-eq :coffee :green]
   [:right-of :green :ivory]
   [:abs-pos :milk 3]

I'm not done yet (puttering at this only part time) but I will post a complete solution once I work it out.

Update: About 2 weeks later, I've come up with a working solution in Clojure:

(ns houses
  [:use [htmllog] clojure.set]  

  [ 1] The Englishman lives in the red house.
  [ 2] The Spaniard owns the dog.
  [ 3] The Norwegian lives in the first house on the left.
  [ 4] Kools are smoked in the yellow house.
  [ 5] The man who smokes Chesterfields lives in the house next to the man with the fox.
  [ 6] The Norwegian lives next to the blue house.
  [ 7] The Winston smoker owns snails.
  [ 8] The Lucky Strike smoker drinks orange juice.
  [ 9] The Ukrainian drinks tea.
  [10] The Japanese smokes Parliaments.
  [11] Kools are smoked in the house next to the house where the horse is kept.
  [12] Coffee is drunk in the green house.
  [13] The Green house is immediately to the right (your right) of the ivory house.
  [14] Milk is drunk in the middle house.

  “Where does the zebra live, and in which house do they drink water?”

(def positions #{1 2 3 4 5})

(def categories {
          :country #{:england :spain :norway :ukraine :japan}
          :color #{:red :yellow :blue :green :ivory}
          :pet #{:dog :fox :snails :horse :zebra}
          :smoke #{:chesterfield :winston :lucky :parliament :kool}
          :drink #{:orange-juice :tea :coffee :milk :water}

(def constraints #{
                    ; -- unary
          '(at :norway 1) ; 3
          '(at :milk 3) ; 14
                    ; -- simple binary
          '(coloc :england :red) ; 1
          '(coloc :spain :dog) ; 2
          '(coloc :kool :yellow) ; 4
          '(coloc :winston :snails) ; 7
          '(coloc :lucky :orange-juice) ; 8
          '(coloc :ukraine :tea) ; 9
          '(coloc :japan :parliament) ; 10
          '(coloc :coffee :green) ; 12
                    ; -- interesting binary
          '(next-to :chesterfield :fox) ; 5
          '(next-to :norway :blue) ; 6
          '(next-to :kool :horse) ; 11
          '(relative :green :ivory 1) ; 13

; ========== Setup ==========

(doseq [x (range 3)] (println))

(def var-cat    ; map of variable -> group 
      ; {:kool :smoke, :water :drink, :ivory :color, ... 
    (apply hash-map (apply concat 
        (for [cat categories vari (second cat)] 
      [vari (first cat)]))))

(prn "var-cat:" var-cat)

(def initial-vars    ; map of variable -> positions
      ; {:kool #{1 2 3 4 5}, :water #{1 2 3 4 5}, :ivory #{1 2 3 4 5}, ...
    (apply hash-map (apply concat 
        (for [v (keys var-cat)] [v positions]))))

(prn "initial-vars:" initial-vars)

(defn apply-unary-constraints
   "This applies the 'at' constraint. Separately, because it only needs doing once." 
   (let [update (apply concat
      (for [c constraints :when (= (first c) 'at) :let [[v d] (rest c)]]
   [v #{d}]))]
      (apply assoc vars update)))

(def after-unary (apply-unary-constraints initial-vars))

(prn "after-unary:" after-unary)

(def binary-constraints (remove #(= 'at (first %)) constraints))

(prn "binary-constraints:" binary-constraints)

; ========== Utilities ==========

(defn dump-vars
   "Dump map `vars` as a HTML table in the log, with `title`." 
   [vars title]
  (letfn [
        (vars-for-cat-pos [vars var-list pos]
          (apply str (interpose "<br/>" (map name (filter #((vars %) pos) var-list)))))]
      (log-tag "h2" title)
    (log "<table border='1'>")
    (log "<tr>")
    (doall (map #(log-tag "th" %) (cons "house" positions)))
    (log "</tr>")
    (doseq [cat categories]
      (log "<tr>")
          (log-tag "th" (name (first cat)))
          (doseq [pos positions]
          (log-tag "td" (vars-for-cat-pos vars (second cat) pos)))
      (log "</tr>")
    (log "</table>")))

(defn remove-values
   "Given a list of key/value pairs, remove the values from the vars named by key." 
   [vars kvs]
   (let [names (distinct (map first kvs))
      delta (for [n names]
      [n (set (map second (filter #(= n (first %)) kvs)))])
      update (for [kv delta
         :let [[cname negative] kv]]
      [cname (difference (vars cname) negative)])]
      (let [vars (apply assoc vars (apply concat update))]

(defn siblings
   "Given a variable name, return a list of the names of variables in the same category."
   (disj (categories (var-cat vname)) vname))

(defn contradictory?
   "Checks for a contradiction in vars, indicated by one variable having an empty domain." 
   (some #(empty? (vars %)) (keys vars)))

(defn solved?
   "Checks if all variables in 'vars' have a single-value domain."
   (every? #(= 1 (count (vars %))) (keys vars)))

(defn first-most-constrained
   "Finds a variable having the smallest domain size > 1."
   (let [best-pair (first (sort (for [v (keys vars) :let [n (count (vars v))] :when (> n 1)] [n v])))]
      (prn "best-pair:" best-pair)
      (second best-pair)))   

;========== Constraint functions ==========

      These functions make an assertion about the domains in map 'bvars', 
      and remove any positions from it for which those assertions do not hold. 
      They all return the (hopefully modified) domain space 'bvars'.)

   (declare bvars coloc next-to relative alldiff solitary)

   (defn coloc
      "Two variables share the same location." 
      [vname1 vname2]
      (if (= (bvars vname1) (bvars vname2)) bvars
      (let [inter (intersection (bvars vname1) (bvars vname2))]
         (apply assoc bvars [vname1 inter vname2 inter])))))

   (defn next-to 
      "Two variables have adjoining positions"
      [vname1 vname2]
      ; (prn "doing next-to" vname1 vname2)
      (let [v1 (bvars vname1) v2 (bvars vname2)
            bad1 (for [j1 v1 :when (not (or (v2 (dec j1)) (v2 (inc j1))))] [vname1 j1])
        bad2 (for [j2 v2 :when (not (or (v1 (dec j2)) (v1 (inc j2))))] [vname2 j2])
         allbad (concat bad1 bad2)]
   (if (empty? allbad) bvars 
         (remove-values bvars allbad)))))

   (defn relative
      "(position vname1) - (position vname2) = diff"  
      [vname1 vname2 diff]
      (let [v1 (bvars vname1) v2 (bvars vname2)
       bad1 (for [j1 v1 :when (not (v2 (- j1 diff)))] [vname1 j1])
         bad2 (for [j2 v2 :when (not (v1 (+ j2 diff)))] [vname2 j2])
         allbad (concat bad1 bad2)]
   (if (empty? allbad) bvars
         (remove-values bvars allbad)))))

   (defn alldiff
      "If one variable of a category has only one location, no other variable in that category has it."
      (let [update (apply concat
   (for [c categories v (val c) :when (= (count (bvars v)) 1) :let [x (first (bvars v))]]
      (for [s (siblings v)]
         [s x])))]
   (remove-values bvars update)))

   (defn solitary
      "If only one variable of a category has a location, then that variable has no other locations."
      (let [loners (apply concat
   (for [c categories p positions v (val c) 
      :when (and 
         ((bvars v) p)
         (> (count (bvars v)) 1)
         (not-any? #((bvars %) p) (siblings v)))]
      [v #{p}]))]
      (if (empty? loners) bvars
      ; (prn "loners:" loners)
      (apply assoc bvars loners)))))

;========== Solving "engine" ==========


(dump-vars initial-vars "Initial vars")

(dump-vars after-unary "After unary")

(def rules-list (concat (list '(alldiff)) binary-constraints (list '(solitary))))

(defn apply-rule
   "Applies the rule to the domain space and checks the result." 
   [vars rule]
      (nil? vars) nil
      (contradictory? vars) nil
   (binding [bvars vars]
   (let [new-vars (eval rule)]
         (contradictory new-vars) (do 
      (prn "contradiction after rule:" rule) 
         (= new-vars vars) vars  ; no change
         :else (do 
      (prn "applied:" rule)
      (log-tag "p" (str "applied: " (pr-str rule))) 
      (prn "result: " new-vars) 

(defn apply-rules 
   "Uses 'reduce' to sequentially apply all the rules from 'rules-list' to 'vars'."
   (reduce apply-rule vars rules-list))

(defn infer
   "Repeatedly applies all rules until the var domains no longer change." 
   (loop [vars vars]
      (let [new-vars(apply-rules vars)]
      (if (= new-vars vars) (do 
         (prn "no change")
      (do (recur new-vars))))))

(def after-inference (infer after-unary))

(dump-vars after-inference "Inferred")

(prn "solved?" (solved? after-inference))

(defn backtrack
   "solve by backtracking."
      (nil? vars) nil
      (solved? vars) vars
      (let [fmc (first-most-constrained vars)]
   (loop [hypotheses (seq (vars fmc))]
      (if (empty? hypotheses) (do
         (prn "dead end.")
         (log-tag "p" "dead end.")
         (let [hyp (first hypotheses) hyp-vars (assoc vars fmc #{hyp})]
      (prn "hypothesis:" fmc hyp)
      (log-tag "p" (str "hypothesis: " hyp))
      (dump-vars hyp-vars (str "Hypothesis: " fmc " = " hyp))
      (let [bt (backtrack (infer hyp-vars))]
         (if bt (do
      (prn "success!")
         (dump-vars bt "Solved")
      (recur (rest hypotheses))))))))))

(prn "first-most-constrained:" (first-most-constrained after-inference))

(def solution (backtrack after-inference))

(prn "solution:" solution)


(println "houses loaded.")

This is 292 lines, but there's a lot of debug/diagnostic coding in there. In all, I'm pretty happy to have managed a reasonably short solution in Clojure. Functional programming made for a bit of a challenge but I managed to maintain a pretty consistent functional style.

Criticism welcome though!

For anyone who cares, here's the solution:

house       1       2               3       4             5
country     norway  ukraine         england spain         japan
color       yellow  blue            red     ivory         green
pet         fox     horse           snails  dog           zebra
smoke       kool    chesterfield    winston lucky         parliament
drink       water   tea             milk    orange-juice  coffee

Warning: I'm not sure this is what are you searching for, because I haven't read Artificial Intelligence: A Modern Approach, but I think what follow is interesting nonetheless.

Edi Weitz has an interesting page on this riddle, with explained source in Common Lisp and other sources in C++ and Common Lisp without detailed comments. I found the C++ source by Klaus Betzler particularly interesting (reformatted a little for enhanced clarity):

//  einstein.cpp  (c) Klaus Betzler 20011218


//  `Einstein's Riddle´, the rules:

//  1 The Brit lives in the red house 
//  2 The Swede keeps dogs as pets 
//  3 The Dane drinks tea 
//  4 The green house is on the left of the white house 
//  5 The green house's owner drinks coffee 
//  6 The person who smokes Pall Mall rears birds 
//  7 The owner of the yellow house smokes Dunhill 
//  8 The man living in the centre house drinks milk 
//  9 The Norwegian lives in the first house 
// 10 The person who smokes Marlboro lives next to the one who keeps cats 
// 11 The person who keeps horses lives next to the person who smokes Dunhill 
// 12 The person who smokes Winfield drinks beer 
// 13 The German smokes Rothmans 
// 14 The Norwegian lives next to the blue house 
// 15 The person who smokes Marlboro has a neigbor who drinks water 

#undef WIN32           // #undef for Linux

#include <stdio.h>
#ifdef WIN32
  #include <windows.h>

inline unsigned long BIT(unsigned n) {return 1<<n;}

const unsigned long 
  yellow    = BIT( 0), 
  blue      = BIT( 1),
  red       = BIT( 2),
  green     = BIT( 3),
  white     = BIT( 4),

  norwegian = BIT( 5),
  dane      = BIT( 6),
  brit      = BIT( 7),
  german    = BIT( 8),
  swede     = BIT( 9),

  water     = BIT(10),
  tea       = BIT(11),
  milk      = BIT(12),
  coffee    = BIT(13),
  beer      = BIT(14),

  dunhill   = BIT(15),
  marlboro  = BIT(16),
  pallmall  = BIT(17),
  rothmans  = BIT(18),
  winfield  = BIT(19),

  cat       = BIT(20),
  horse     = BIT(21),
  bird      = BIT(22),
  fish      = BIT(23),
  dog       = BIT(24);

const char * Label[] = {
  "Yellow",   "Blue",    "Red",     "Green",   "White",
  "Norwegian","Dane",    "Brit",    "German",  "Swede",
  "Water",    "Tea",     "Milk",    "Coffee",  "Beer",
  "Dunhill",  "Marlboro","Pallmall","Rothmans","Winfield",
  "Cat",      "Horse",   "Bird",    "Fish",    "Dog"

const unsigned long color   = yellow   +blue    +red     +green   +white;
const unsigned long country = norwegian+dane    +brit    +german  +swede;
const unsigned long drink   = water    +tea     +milk    +coffee  +beer;
const unsigned long cigar   = dunhill  +marlboro+pallmall+rothmans+winfield;
const unsigned long animal  = cat      +horse   +bird    +fish    +dog;

unsigned long house [5] = {norwegian, blue, milk, 0, 0};  // rules 8,9,14
unsigned long result[5];

const unsigned long comb[] = { // simple rules
  brit+red,                    // 1
  swede+dog,                   // 2
  dane+tea,                    // 3
  green+coffee,                // 5
  pallmall+bird,               // 6
  yellow+dunhill,              // 7
  winfield+beer,               // 12
  german+rothmans              // 13

const unsigned long combmask[] = { // corresponding selection masks

inline bool SimpleRule(unsigned nr, unsigned which)
  if (which<8) {
    if ((house[nr]&combmask[which])>0)
      return false;
    else {
      return true;
  else {           // rule 4
    if ((nr==4)||((house[nr]&green)==0))
      return false;
      if ((house[nr+1]&color)>0)
        return false;
      else {
        return true;

inline void RemoveSimple(unsigned nr, unsigned which)
  if (which<8) 

inline bool DunhillRule(unsigned nr, int side)  // 11
  if (((side==1)&&(nr==4))||((side==-1)&&(nr==0))||((house[nr]&dunhill)==0))
    return false;
  if ((house[nr+side]&animal)>0)
    return false;
  return true;

inline void RemoveDunhill(unsigned nr, unsigned side)

inline bool MarlboroRule(unsigned nr)    // 10 + 15
  if ((house[nr]&cigar)>0)
    return false;
  if (nr==0) {
    if ((house[1]&(animal+drink))>0)
      return false;
    else {
      return true;
  if (nr==4) {
    if ((house[3]&(animal+drink))>0)
      return false;
    else {
      return true;
  int i,k;
  for (i=-1; i<2; i+=2) {
    if ((house[nr+i]&animal)==0) {
      for (k=-1; k<2; k+=2) {
        if ((house[nr+k]&drink)==0) {
          return true;
  return false;

void RemoveMarlboro(unsigned m)
  if (m>0)
  if (m<4)

void Recurse(unsigned recdepth)
  unsigned n, m;
  for (n=0; n<5; n++) {
    if (recdepth<9) {    // simple rules
      if (SimpleRule(n, recdepth)) {
        RemoveSimple(n, recdepth);
    else {               // Dunhill and Marlboro
      for (int side=-1; side<2; side+=2)
        if (DunhillRule(n, side)) {
          for (m=0; m<5; m++) 
            if (MarlboroRule(m))
              for (int r=0; r<5; r++)
                result[r] = house[r];
          RemoveDunhill(n, side);

int main()
  int index, i;
#ifdef WIN32
  LARGE_INTEGER time0, time1, freq;
#ifdef WIN32
  printf("\nComputation Time: %ld microsec\n\n", 
  if (result[0]==0) {
    printf("No solution found !?!\n");
    return 1;
  for (i=0; i<5; i++)
    if ((result[i]&animal)==0)
      for (index=0; index<25; index++)
        if (((result[i]&country)>>index)==1)
          printf("Fish Owner is the %s !!!\n\n", Label[index]);
  for (i=0; i<5; i++) {
    printf("%d: ",i+1);
    for (index=0; index<25; index++)
      if (((result[i]>>index)&1)==1)
  return 0;

I've recently come across Intelligent Agents by reading this book : link text

I'm interested in finding a good book for beginners, so I can start to implement such a system. I've also tried reading "Multiagent Systems : A modern approach to distributed artificial intelligence" (can't find it on amazon) but it's not what I'm looking for. Thanks for the help :).

There is numerous classical books:

The first two are the easiest, the second one covers more than machine learning. However, there is little "pragmatic" or "engineering" stuff in there. And the math is quite demanding, but so is the whole field. I guess you will do best with O'Reilly's programming collective intelligence because it has its focus on programming.

we are working on a little Java game, based on the game Blokus. Blokus-Manual

I'm a Java-beginner and plan to implement an advanced artificial intelligence. We already have a random AI (picks a random valid move) and an AI with a simple move-rating mechanism. We also want an AI which should be as good as possible (or at least very good ;) ).

The question is: Which AI-concept would be suitable for our purpose? The minimax-algorithm seems to be a valid choice, but how do you adapt it to a 4-player-game? Are there better concepts for a game like blokus?

Thanks already :)

I would recommend minimax algorithm. One thing you can add to make it more efficient (meaning you should be able go more moves deep into the future) is alpha-beta pruning.

The problem with minimax search is that the number of games states it has to examine is exponential in the depth of the tree. Unfortunately, we can't eliminate the exponent, but it turns out we can effectively cut it in half.

The quote is from Chapter 5.3 of Artificial Intelligence: A Modern Approach third edition by Stuart Russel and Peter Norvig. It was holding up my monitor, and I used it in a few of my classes in college. I know people don't often reference books on SO, but it's extremely relevant. I have used it extensively, and I do really recommend it for both being understandable, and covering a wide range of AI content.

It is available on amazon for $104, or * cough cough * I'm sure you can find it online if you don't have that kind of money for a textbook floating around. Looking up the minimax algorithm and alpha beta pruning online should also get you good results.

I think the only circumstance that would make Minimax a poor option for you is if the game state is only partially observable to any given player (they don't know everything about what's going on), or if the game is non-deterministic (it has random elements). Because neither of these are the case for Blokus, I think you made an excellent choice with Minimax.

The area of AI is called Adversarial Search in the textbook (Chapter 5: Adversarial Search), so looking up more info online with that term may get you more helpful information, or help you find an example Java implementation. I do not consider this a beginner's task, but it sounds like you are up to it, if you made the game and can pick random valid moves. Keep up the good work!

Where can I find a proof for the following theorem:

Theorem: If h(n) is consistent, A* using GRAPH-SEARCH is optimal


You can find it in this book on page 95-97:

The basic outline of the proof is:

First we define these functions:

g(n) = cost to reach node from start node

f(n) = g(n) + h(n)


  1. Establish that the values of f(n) along any path are nondecreasing, if h(n) is consistent.

  2. Prove that whenever A* selects a node for expansion, the optimal path to that node has been found.

Step 1 follows directly from the definition of consistency.

Step 2 is proved by seeing, that if it wasn't true, there would have to be another frontier node n' on the optimal path from the start node to n, but this cannot be, as paths are nondecreasing and hence that node would have lower f cost than than n. I.e. f(n) = g(n) + h(n) > f(n') = g(n') + h(n')

I am working on a space invaders clone in unity game engine. I want to make the enemies intelligent. One approach I tried was using a min max algorithm. I took the x-coordinates of player and made the min max tree from it and the used it to make the enemy turn away from that position after specific time intervals. However this did not work well. Now I want the enemies to learn and evolve to avoid the player using a neural network. How can I implement this in space invaders? Are there other algorithms better then neural networks to use in space invaders?

Almost any learning algorithm would be better than neural networks. However it is a very deep topic -- what you need is a book and a few months, not a quick answer on SO.

So I'll recommend a few:

I don't recommend neural networks for this because they take a very long time to learn, even with the most modern learning-optimization techniques (which I don't think you're going to find in any book yet anyway). You want these invaders to learn on the fly, so you need something much more responsive.

I would probably use a decision tree, and keep a limited-length memory vector so that the critters can adapt quickly to changes in the player's strategy.

I'm about to take a course in AI and I want to practice before. I'm using a book to learn the theory, but resources and concrete examples in any language to help with the practice would be amazing. Can anyone recommend me good sites or books with plenty of examples and tutorials ?

Thanks !

Edit: My course will deal with Perceptrons, Neural networks and Bayesian AI.

Russel & Norvig have a good survey of the broad field.

Personally I would recommend this M.Tim.Jones book on AI.

Has many many topics on AI and almost every type of AI discussion is followed by C example code. Very pragmatic book on AI indeed !!

I'm going to do a research about:

"The ways Artificial Intelligence can reduce the time gap between the player action and the computer/game reaction." my AI course research.

I'm interested in Human Computer Interaction(HCI) studies, so I decided to choose this topic for my AI research as well.

But unfortunately I couldn't find any useful resource on the web,till now.

Could anyone introduce me some resources?

(I would be glad if they're for beginners, cause I'm new to AI and all its related works)

I would suggest reading this: Artificial-Intelligence-Modern-Approach-3rd

I want to cover the equivalent of a typical CS undergrad course in material, so I'm making a list of books to cover the typical topics. I've split the list into topics that, from the research I did, I think are compulsory and optional. I would like some help to confirm if the topics are split correctly, and if the books are of the correct level. Also, please let me know if I left out any important topics, or if any are beyond undergrad level.

Thank you for your time!

Edit regarding the on hold status: I do not believe this question is off-topic as I am not asking for a recommendation for books - I am asking if the topics I have listed are indicative of a typical CS course, and if any important topics are missing. The books links are only there in case the books I have chosen are not correct for the topic, and can be removed if necessary.


Operating Systems: Operating System Concepts

Networks: Computer Networking: A Top-Down Approach

Discrete Mathematics: Concrete Mathematics

Data Structures and Algorithms: Introduction to Algorithms

Computer Architecture: Computer Systems: A Programmer's Perspective

Automata Theory: Introduction to the Theory of Computation

Compilers: Engineering a Compiler was recommended to me over the dragon book.

Database Theory: An Introduction to Database Systems

Programming Language Concepts and Design: Programming Language Pragmatics


Cryptography: Cryptography Engineering: Design Principles and Practical Applications

Functional Programming: Learn You a Haskell for Great Good!

Artificial Intelligence: Artificial Intelligence: A Modern Approach

Computer Graphics: Real-Time Rendering

Your list is very good on the subjects directly related to computer science. However, it is light on math. In my own B.Sc. in Computer Science I also had a ton of calculus, linear algebra, algebra (groups, rings, etc), statistics, analytic geometry and numerical analysis. Some applications of computer science rely heavily on those:

  • Machine learning depends on lots of linear algebra, calculus, and statistics;
  • Computer graphics depends a lot on analytic geometry and linear algebra;
  • Scientific computation relies on calculus and numerical analysis.

I never used much of the ton of Algebra I had, but I hear it is important for cryptography. :-)

For a programmer developing more regular applications your list is very good, but for those interested in these more specialized areas (which are still pretty important), these subjects are vital.

I would like to implement a naive bayes classifier for spam filtering from scratch as a learning exercise. What would be the best langauge of the following to try this out in?

  1. Java
  2. Ruby
  3. C++
  4. C
  5. something else

Please give reasons (it would help greatly!)

I would do it in C#, but that's only because it's the language that I'm most familiar with at the moment, and because I know it's got strong string handling. It can also be done in C++ with stl::string classes, Ruby, Java, etc.

If I were building a naive bayes classifier, I'd start with a simple example, like the one in Russell & Norvig's book (the one I learned off of way back when, in the second edition of the book) or the one in Mitchell's book (I used his because he taught the class). Make your learner generate rules in a general fashion; that is, given input data, produce output rules, and have the input data be a generalizable thing (could be a block of text that for spam detection, could be a weather report to predict if someone's going to play tennis).

If you're trying to learn Bayes classifiers, a simple example like this is better to start with than a full-blown spam filter. Language parsing is hard in and of itself, and then determining whether or not there's garbage language is also difficult. Better to have a simple, small dataset, one where you can derive how your learner should learn and make sure that your program matches what you want it to do. Then, you can grow your dataset, or modify your program to incorporate things like language parsing.

Learning Clips, while I don't mind the syntax I'm finding it difficult to derive rules from facts. Is there a tip on how to structure rules given a knowledge base? A non trivial example would be nice, thanks.

Basically this is the entire field of "knowledge engineering;" it's really as broad a question as "given a domain, how do I architect application software for that domain?" Whole books have been written on this topic. Here are three that treat this topic well IMO: Peter Norvig's "Artificial Intelligence, A Modern Approach", Giarratano and Riley's "Expert Systems: Principles and Programming", and my "Jess In Action".

currently I am trying to develop an AI for a game similar to chess where I basically try to calculate a few steps ahead then to get the best possible payoff with alpha beta.

Currently my strategy is to , for every 1 move, make 4 best outcome moves, change "board" temporarily, then for those 4 moves, call 4 more again, then run the moves, call 4 more per each move.....

This seems very natural for a recursive function, I am trying to dig all the way down 10 moves or so. But so far I've wrote 3 steps (my turn, your, mine) and my code is already blowing up. I need to code the part where FOR EACH new move, calculate new move. Are there anyway I can truncate the below code so it can be something like recursive Call( ...) with a max level check as base case?

Here is my current iterative 3 - step code, assume there is a tree structure, inside every move contains an arraylist of 4 best counter moves.

 for (i = 0; i < root.children.size(); i++) {
        for (int lvl1 : root.children.get(i).currOtherPieces) {
            root.children.get(i).addChild(lvl1,root.children.get(i).level + 1);         
        for (j = 0; j < root.children.get(i).children.size(); j++) {
            System.out.print("    ");
            for (int lvl2 : root.children.get(i).children.get(j).currSelfPieces) {
                root.children.get(i).children.get(j).addChild(lvl2, root.children.get(i).children.get(j).level + 1);
            for (k = 0; k < root.children.get(i).children.get(j).children.size(); k++) {
                System.out.print("            ");

As you can see I can keep writing 7 more layers of for loop, but maybe there is a clever way to recursively put this. Any help? Thanks!

(The sys.out with spaces are just for me to debug, can leave that out...)

what are you doing is so called "behaviour tree traversal" or "solution space search", i.e. enumerating all the possible next moves and evaluate them. Behaviour tree traversal is just an application of tree traversal methods.

While a suggest you to get a good knowledge of breadth first tree traversal and deep first tree traversal strategies, I can give you a quick answer to your question.

When dealing with tree traversal, you might prefer iterative algorithm to recursive ones, as you can have troubles with stack overflow if you are not doing tail recursion optimization.

When traversing a tree you should use a queue, called "fringe", that contains the next nodes to visit.

Queue<Node> fringe = new LinkedList<Node>();

At first the fringe will contain only the root node.


Then, you can iterate over the fringe, in this way:

while(fringe.isEmpty() == false) {
  Node current = fringe.poll();

  // Evaluate the current move, e.g. if it leads to an improvement of your gameplay
  boolean result = doSomething(current); 

  if(result && !current.getChildren().isEmpty()) {

This algorithm is a trivial implementation of the BST (Breadth First Traversal), which means you visit al children of a given tree level before move to the next. The DFS (Deep First Traversal) version uses a FIFO data structure for the fringe.

Stack<Node> fringe = new LinkedList<Node>();

while(fringe.isEmpty() == false) {
  Node current = fringe.pop();

  // Evaluate the current move, e.g. if it leads to an improvement of your gameplay
  boolean result = doSomething(current); 

  if(result && !current.getChildren().isEmpty()) {
     for(Node child: current.getChildren()) {

There are a ton of optimisations of these two algorithms, such as ones that process children with different order according to some metrics (heuristic methods, such A star), others that keep track of already visited nodes to avoid cycles, and so on.

To get a better grasp of solution space visit, I suggest you a reading of "Artificial Intelligence: A Modern Approach" by Stuart Russel and Peter Norvig.

For keeping count of the deepness of your search you can extend the Node object with a field called level, that for root is 0. When exploring node's children, before insert them in the fringe, just set their level at parent's level + 1, and if this is greater than your max level, just avoid to put them in the fringe.

I would like to know how to use the Manhattan Distance heuristic to drive my search in NxN 2d array. I have the following manhattan distance:

private int manhattan(int[] pos, int tile) {
        int[] dest = new int[] {
            (tile - 1) % BOARDSIZE, (tile - 1) / BOARDSIZE
        return Math.abs(dest[0] - pos[0]) + Math.abs(dest[1] - pos[1]);

I will be moving tiles to the empty tile to LEFT, RIGHT, UP or DOWN. How do I use the above function to select neighbours of a node in order to add to a queue? Do Ii have to put it in a double for loop or? I am using f = g+h

I am a beginner in puzzles so am struggling a little bit to understand. I need help.

I can see you've rewritten your earlier question. The question you pose is explored in detail in Russell and Norvig's Artificial Intelligence: A Modern Approach. Read the 3rd chapter. Check out their website at They even have code for A* there, with a link to a demo of the 8 puzzle.

Pathfinding generally refers to the problem of finding the shortest route between two points, subject to any obstacles. Pathfinding has application in a wide range of areas including robotics and game development. Algorithms for pathfinding tend to be closely related to graph and tree search algorithms.

Significant Literature

External Links

Related Tags

So I've spent the last 5 hours googling DFS, BFS, A*, Bellman-ford and others.

I have roads=[('a', [('b', 5.0), ('c', 8.0)]), ('b', [('a', 5.0), ('d', 6.0)]), ('c', [('a', 8.0), ('d', 2.0)]), ('d', [('b', 6.0), ('c', 2.0), ('e', 12.0), ('f', 2.0)]), ('e', [('d', 12.0), ('g', 3.0)]), ('f', [('d', 2.0), ('g', 7.0)]), ('g', [('e', 3.0), ('f', 7.0)])]

weighted dictionary. I found this code:

def find_all_paths(graph, start, end, path=[]):
    path = path + [start]
    if start == end:
        return [path]
    if not start in graph:
        return []
    paths = []
    for node in graph[start]:
        if node not in path:
            newpaths = find_all_paths(graph, node, end, path)
            for newpath in newpaths:
    return paths

It never finds start in the roads and returns [].

I'm so lost. I get what these algorithms do but have no idea how to code them. I just want to find the shortest path.

The code you've copied won't solve your problem. Note, for starters, that its name tells you that it does something else. Also, it doesn't even look at the edge weights.

Finally, it isn't designed to work on your graph representation at all. Note that the start parameter is used as a key in what I assume is a dictionary, and it expects the value to be an iterable of reachable nodes. But you're not feeding it a dictionary, you're feeding it a list.

As for the empty-list return value, if you're calling the function like this:

path = find_all_paths(graph, 'a', 'f')

...then it is correct. 'a' is not a node in your graph at all. Your nodes are tuples, starting with an identifier like 'a', but also containing a list of 2-tuples. You'd have to type this mess every time:

path = find_all_paths(
  start=('a', [('b', 5.0), ('c', 8.0)]),
  end=('f', [('d', 2.0), ('g', 7.0)]),


As a final insult, those 2-tuples like ('d', 2.0) are also not nodes. Even if you fixed the rest of it, these would still fail the if start not in graph: test once one of them got passed as the new start argument into the recursive function call. And if you fixed that, they wouldn't match the designated end node. This code can't do what you want.

My suggestions:

  1. Don't plug random data into functions you found lying around the Internet. Start simple: Dijkstra's Algorithm will return the shortest path, which is what you want. It's A* without the heuristic function (you said you've already read up on A*). Code that, and then worry about getting complex. (Note that you will probably need an adjustable priority queue --- you can build one using the heapq module.)

  2. You need to access arbitrary nodes by an immutable ID, so your graph should be a dictionary, not a list. The keys are your one-letter strings, and the values are lists (or tuples) of edges. You can then iterate through all of a node's edges with something like: for edge in graph['a']:.

  3. Your life will probably be much easier if you create a collections.namedtuple called Edge with two fields: dest and weight. (You could do 3 fields if you find you also need the origin: orig, dest, and weight.)

  4. Look for other chunks of data that go together, and consider writing another namedtuple or a class for them, too. (First contender: Path, with a weight method.)

  5. Consider using the pprint.pprint function for displaying large, nested structures like your graph. It will make print-to-console debugging much easier.

PS: I found that the first couple chapters of Stuart Russell's and Peter Norvig's Artificial Intelligence: A Modern Approach, 3rd Ed were an excellent introduction to graph search algorithms like this... Unfortunately, the majority of the book would probably be off-topic for you.

I'm trying to learn more about how answer / inference engines work, the code behind it.

Are there any famous or well done algorithms, good books, or papers on this topic?

How do systems like Google Now ( The answer not predictive part ), Siri, and Wolfram | Alpha work?

I know they use Natural Language Processing and Machine Leaning, but how do they answer questions based from a collection of knowledge / facts?

You ask a very broad question. There are many implementations of inference engines, but they would all rely on natural language processing and searching algorithms at their core so I would focus on that.

Try the book Artifical Intelligence : A Modern Approach. It has sections on both NLP and Search and is very good.