15. Learning: Near Misses, Felicity Conditions

One-shot learning

Learning in a human-like way, in one shot: learning something definite from each example.

The evolving model

The evolving model starts from an initial example, the seed. Each time the seed is compared with a near miss or with another example, the model learns one important characteristic from the difference.

From these comparisons the evolving model develops a set of heuristics to describe the seed. Near misses specialize the description (reducing the potential matches), and examples generalize it (broadening the potential matches).

  • Require link heuristic: specialization
  • Forbid link heuristic: specialization
  • Extend set heuristic: generalization
  • Drop link heuristic: generalization
  • Climb tree heuristic: generalization
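
As a rough illustration of three of these heuristics, here is a minimal Python sketch. The representation is an assumption made for this example only: a description is a set of (subject, relation, object) links plus a set of allowed types per part, and the names `new_model`, `learn_from_near_miss` and `learn_from_example` are hypothetical. The drop link and climb tree heuristics are not covered.

```python
def new_model(seed_links, seed_types):
    """Build the initial model from the seed (hypothetical representation)."""
    return {
        "links": set(seed_links),    # links observed in the seed
        "required": set(),           # links every match must have
        "forbidden": set(),          # links no match may have
        "types": {part: {kind} for part, kind in seed_types.items()},
    }


def learn_from_near_miss(model, near_miss_links):
    """Specialize: learn from a near miss that differs by exactly one link."""
    near_miss_links = set(near_miss_links)
    missing = model["links"] - near_miss_links   # in the seed, not in the near miss
    extra = near_miss_links - model["links"]     # in the near miss, not in the seed
    if len(missing) == 1 and not extra:
        model["required"] |= missing             # require link heuristic
    elif len(extra) == 1 and not missing:
        model["forbidden"] |= extra              # forbid link heuristic
    return model


def learn_from_example(model, example_types):
    """Generalize: widen the set of allowed types for each part."""
    for part, kind in example_types.items():
        model["types"].setdefault(part, set()).add(kind)  # extend set heuristic
    return model


# An arch-like seed: posts support a lintel, and the lintel is a brick.
seed_links = {("posts", "support", "lintel")}
model = new_model(seed_links, {"lintel": "brick"})

learn_from_near_miss(model, set())                                      # no support
learn_from_near_miss(model, seed_links | {("post", "touches", "post")}) # posts touch
learn_from_example(model, {"lintel": "wedge"})                          # wedge on top
```

Each call changes the model in exactly one way, which is the one-shot idea: something definite is learned from each example.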

Felicity conditions

The teacher and learner must know about each other to achieve the best learning. The learner must talk to himself to understand what he is doing.

How to package ideas better

The following 5 characteristics make the communication of ideas to others more effective and lead to better results.

  • Symbol: makes the idea easy to remember
  • Slogan: focuses the idea
  • Surprise: catches the attention
  • Salient: makes one thing stand out
  • Story: helps transmit the idea to people

12b. Deep Neural Nets

Image recognition by a deep neural net

Convolution: a neuron looks for patterns in a small portion (10×10 px) of an image (256×256 px). The process is repeated by moving this small area across the image little by little.
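
The sliding-window step can be sketched in plain Python as follows. This is a minimal illustration, not the method used in the lecture: the function name `convolve2d`, the `stride` parameter and the tiny 4×4 image are assumptions, and the kernel is applied without flipping (strictly a cross-correlation, which is what most neural net libraries call convolution).

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` and return one weighted sum per position."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = (len(image) - k_rows) // stride + 1
    out_cols = (len(image[0]) - k_cols) // stride + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            top, left = r * stride, c * stride   # corner of the current window
            row.append(sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            ))
        output.append(row)
    return output


# A 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
# A 2x2 kernel that responds where brightness increases from left to right.
edge_kernel = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, edge_kernel)
# feature_map == [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The response is largest in the middle column, where the window straddles the edge.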

Pooling: The convolution produces one point for each portion of the image analyzed. In a similar step-by-step process, each small group of neighboring points is reduced to a single value by keeping the maximum (“max pooling”).

The convolution and pooling process is repeated multiple times (100x) and the results are fed to a neural net, which computes how likely it is that the initial image belongs to each known category.
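
Max pooling can be sketched in the same style. The function name `max_pool`, the window `size` and the `stride` parameter are assumptions for this example; by default the windows do not overlap, while a stride of 1 gives the step-by-step sliding described above.

```python
def max_pool(feature_map, size=2, stride=None):
    """Keep the maximum of each size x size window of `feature_map`."""
    stride = size if stride is None else stride   # default: non-overlapping windows
    out_rows = (len(feature_map) - size) // stride + 1
    out_cols = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


points = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool(points)
# pooled == [[4, 2], [2, 8]]
```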


A small number of neurons (~2), the “hidden layer”, forms a bottleneck between two columns of multiple neurons (~10). The net is trained to produce output values z[n] that are the same as the input values x[n].

Such a result implies that a form of generalization is accomplished by the hidden layer, or rather a form of encoded generalization, as the actual parameters of the bottleneck neurons are not easy to interpret.

Final layer of neurons

As the neural net is trained, its parameters and thresholds are adjusted so that the shape and corresponding equation of the sigmoid function properly separate positive and negative examples, by maximizing the probability of sorting the examples properly.
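
A minimal sketch of this idea for a single input value, assuming the sigmoid takes the form 1 / (1 + e^-(w·x - t)) with one weight w and one threshold t (the names `sigmoid` and `likelihood`, and the four sample points, are hypothetical). The likelihood is the probability that every example is sorted properly; training would search for the w and t that maximize it.

```python
import math


def sigmoid(x, weight=1.0, threshold=0.0):
    """Probability that x is a positive example, under the assumed form."""
    return 1.0 / (1.0 + math.exp(-(weight * x - threshold)))


def likelihood(examples, weight, threshold):
    """Probability that all (x, is_positive) examples are sorted properly."""
    p = 1.0
    for x, is_positive in examples:
        s = sigmoid(x, weight, threshold)
        p *= s if is_positive else (1.0 - s)
    return p


# Negative examples on the left of 0, positive examples on the right.
examples = [(-2, False), (-1, False), (1, True), (2, True)]

steep = likelihood(examples, weight=3.0, threshold=0.0)     # sharp, well placed
shallow = likelihood(examples, weight=0.5, threshold=0.0)   # too gradual
shifted = likelihood(examples, weight=3.0, threshold=9.0)   # boundary misplaced
```

Here `steep` is the largest of the three values: changing the weight changes how sharp the curve is, and changing the threshold moves it left or right.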


Instead of reporting only the category with the maximum value, the final output is an array of the most probable categories (~5 categories).
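
One common way to produce such a ranked list is to turn the raw output values into probabilities with a softmax and keep the top few. The notes do not name the normalization, so the softmax here is an assumption, as are the function names and the sample labels and scores.

```python
import math


def softmax(scores):
    """Turn raw output values into probabilities that sum to 1."""
    highest = max(scores)                        # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(labels, scores, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(zip(labels, softmax(scores)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


labels = ["cat", "dog", "car"]
scores = [2.0, 1.0, 0.1]      # hypothetical final-layer outputs
best_two = top_k(labels, scores, k=2)
```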


The problem with neural nets is that they can get stuck at local maxima. To prevent this, at each computation a neuron is deactivated, or dropped out, to check whether its behavior is skewing the neural net. At each new computation a different neuron is dropped out, so that all neurons are eventually checked.

Wider neural nets are also less likely to get jammed at a local maximum, as the additional parameters give them more ways to move past it.
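
A minimal sketch of dropout on one layer's activations. It follows the variant usually found in practice, which is an assumption relative to these notes: each neuron is dropped independently with probability `drop_prob` rather than one neuron at a time, and the surviving values are rescaled so that their expected sum is unchanged. The function name and parameters are hypothetical.

```python
import random


def dropout(activations, drop_prob=0.5, rng=random):
    """Zero each activation with probability drop_prob; rescale the rest."""
    keep_prob = 1.0 - drop_prob                  # assumes drop_prob < 1
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]


activations = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(0)                           # seeded so runs are repeatable
dropped = dropout(activations, drop_prob=0.5, rng=rng)
```

A different random subset is dropped on every call during training; at test time all neurons are kept.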

See also: 

Boltzmann machine