9. Constraints: Visual Object Recognition

Borders and faces orientation

Discerning borders of objects and face orientation with vectors, an initial computer vision theory seemed plausible but too difficult to implement.

Orthographic projection

In orthographic projection, the correspondence of a system of points of three known objects and one unknown object creates a system of equations with a unique solution of parameters. If this solution can be applied  to all points of the unknown object, the object is recognized.

This works with manufactured objects but not so well with natural objects.

“Goldilocks principle”

Don’t search for features that are too big / complex, and not too small / simple. Not too big, not too small.

Face recognition

Use an integral of different recognizable points on a face: search for a correlation of 2 eyes + 1 nose, 1 nose + 1 mouth, etc.

8. Constraints: Search, Domain Reduction

Domain Reduction Algorithm

Vocabulary

  • Variable v: something that can have an assignment
  • Value x: something that can be assigned
  • Domain d: set of all different values
  • Constraint c: limit on variable values

With a depth-first search, the domain reduction algorithm goes back up one node when it is unable to comply with a constraint. It reduces the domain of available variables at node n-1 to make sure it will have an unused variable in the next node n.

By considering more variables (propagating to neighboring states in the US map coloring example), the domain can be reduced to a single value at a certain node.

Solving the problem faster

By working on the most constraints nodes first, the algorithm doesn’t have to work back up as much as if it starts with least constraints first.

Obviously, the main constraint of using only a small number of values (colors in the example) is the most limiting one. Being able to increase the number of values can drastically simplify the problem.

In case of uncertainty on the exact number of values, find a range of values that solve the problem (such as in the airplane assignment example).

Resources allocations

  1. Used most constraints first
  2. Propagate through domain reduced to single algorithm
  3. In the absence of definite result, find a narrow range of max values where it doesn’t work, and min values where it does

7. Constraints: Interpreting Line Drawings

Computer vision

Empirical approach

Using lines on pictures of real-world objects, the edges between shapes could serve to identify the number of objects in it.

The different intersections possible generally form two types of trihedral vertexes to identify shapes:

  • arrow vertexes
  • fork vertexes

Theoretical approach

A second approach uses convex and concave lines and boundaries between objects in a more theoretical domain, to identify trihedral vertexes between 3 faces, where objects are always considered in a general position of perspective where unusual cases are not considered.

These constraints create 18 different possibilities of junctions. With this catalog of positions, shapes can be identified to know if an object can exist in the domain defined above… but not in the real world.

Towards robot vision

Adding cracks, shadows, non-trihedral vertexes and light to the theoretical approach, the domain complexity increased with more than a 1000 junctions possibilities.

However, in the same manner as the theoretical approach, an algorithm that identifies junctions one by one with this catalog, a much wider array of objects can be identified to determine what their actual shape is and or if some ambiguity is left.

6. Search: Games, Minimax, and Alpha-Beta

How a computer can play games (like Chess)

  1. How human play chess: mixing analysis, strategy, tactics to choose a move
  2. If-Then rules
  3. Look ahead and evaluate
  4. British museum algorithm

To evaluate the potential of the next situation of the board, (methods 2 and 3) we can use a linear scoring polynomial to give number to situations and choose the best.

In the resulting tree of possibilities:

  • the branching factor is the number b of branches at each at each level
  • the depth is the number d of levels
  • the leaves or terminal nodes are the resulting b^d possibilities

In Chess, there are about 10^120 leaf nodes, making it impossible to compute.

Minimax

Method: using a scoring system for final leaves of the trees, the algorithm simulates a maximizing player who chooses higher values, while a minimizing player who chooses lower values.

Alpha-Beta

Alpha-Beta is a method to discard useless branches to reduce computations for Minimax.

From terminal nodes, the algorithm evaluates the best choice for the maximizing and minimizing players going up one node at a time. It then cuts off branches that would not be chosen by either players at the nodes they play.

Progressive Deepening

Depending on the game and time available, the branching factor of the tree, the tree can be extended not completely but to a certain number of levels only, ensuring that a good answer can always be reached in time, even though it may not be the single best answer.

Deep Blue

Deep Blue essentially used all these methods, plus some extra improvements.

Deep Blue = Minimax + Alpha-Beta + Progressive Deepening

+ Parallel Computing + Opening Book + End Game Tricks

+ Uneven Tree Development

Uneven Tree Development: some situation may need to expand the search further to ensure better choices (take the opponents’ Queen)

5. Search: Optimal, Branch and Bound, A*

Optimal search trees

Finding the best possible sequence of choices.

Getting closer to the goal is generally considered good, but it may lead to dead ends or non-optimal choices.

Oracle

Knowing the minimum path length to the goal, the search algorithm records the length of path already extended and always extends the shortest path first until all paths are longer than the known shortest length.

Branch & Bound

With the same principle, if the shortest path is not known yet (no “Oracle”) paths are extended to the goal and recorded. Other paths are extended until the are as long or longer than the current shortest path to the goal.

The algorithms extends the first path and then sorts them by length.

Branch & Bound + Extended List

In addition to the branch and bound algorithm, new branches that lead back to node previously extended with a longer path are discarded.

Branch & Bound + Admissible Heuristic

The estimated remaining distance to the goal is added to the path extended. Only the shortest path extended + remaining estimated distance is extended until the goal is reached. All longer paths are discarded.

A* = Branch & Bound + Extended List + Admissible Heuristic

In the algorithm, instead of sorting extensions to put the shortest paths (leading to many calculations), only test if the shortest path leads to the goal.

Consistency Heuristic

In certain cases (especially not about maps), the Admissible Heuristic may lead to problems. The consistency heuristic uses a stronger condition, namely that the distance to the goal from a given node x, minus the distance to the goal from another node y, in absolute value, has to be inferior to the distance between x and y.