image recognition Archives

August 13, 2020 (updated August 4, 2022)

Finding the dominant colors in an image with k-means

Working with images can be very time-consuming, especially when there are many of them. Machine learning can be a great time-saver for image analysis and editing tasks, such as finding the dominant colors of an image with the k-means clustering algorithm.
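As a rough illustration of the idea, here is a minimal k-means sketch in plain Python. It assumes the image has already been loaded as a list of (R, G, B) tuples with at least k distinct colors; the function name `dominant_colors` and its parameters are hypothetical and not taken from the article.

```python
import random

def dominant_colors(pixels, k=2, iterations=10, seed=0):
    """Return k cluster centers of RGB pixels, largest cluster first."""
    rng = random.Random(seed)
    # Start from k distinct pixel colors picked at random.
    centroids = rng.sample(sorted(set(pixels)), k)

    def nearest(pixel):
        # Index of the centroid with the smallest squared RGB distance.
        return min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centroids[i])),
        )

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for pixel in pixels:
            clusters[nearest(pixel)].append(pixel)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(channel) / len(cluster) for channel in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]

    # Rank the centroids by how many pixels they attract.
    counts = [0] * k
    for pixel in pixels:
        counts[nearest(pixel)] += 1
    order = sorted(range(k), key=lambda i: counts[i], reverse=True)
    return [centroids[i] for i in order]

# Toy "image": six reddish pixels and two blue ones.
pixels = [(255, 0, 0)] * 3 + [(250, 10, 10)] * 3 + [(0, 0, 255)] * 2
result = dominant_colors(pixels, k=2)
```

On this toy input the reddish cluster comes first because it holds more pixels. A real image would typically be downsampled before clustering to keep the loop fast.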


July 10, 2018 (updated October 23, 2019)

Guide to real Machine Learning applications

This series of articles dives deeper into real applications of Machine Learning that are in use today in many technological processes and devices.


In the series of posts entitled “Machine Learning is Fun!”, Adam Geitgey guides us step by step through the concepts, data, algorithms, code, results and pitfalls of machine learning applications, from image, face and speech recognition to language translation and more. The series also gathers several different sources for more details on each application and its development.


This series is dense with detailed code, but it is also explained very clearly, step by step, with detailed illustrations. It notably covers Convolutional Neural Networks (including their use in Generative Adversarial Networks) and Recurrent Neural Networks, together with some of their most prominent applications in daily life. It is a real course not to be missed by any ML developer!

Here is the list of posts with direct links:

Part 1: The world’s easiest introduction to Machine Learning
Part 2: Using Machine Learning to generate Super Mario Maker levels
Part 3: Deep Learning and Convolutional Neural Networks
Part 4: Modern Face Recognition with Deep Learning
Part 5: Language Translation with Deep Learning and the Magic of Sequences
Part 6: How to do Speech Recognition with Deep Learning
Part 7: Abusing Generative Adversarial Networks to Make 8-bit Pixel Art
Part 8: How to Intentionally Trick Neural Networks

December 6, 2017 (updated October 23, 2019)

12b: Deep Neural Nets

Image recognition by a deep neural net

Convolution: a neuron looks for patterns in a small portion (10×10 px) of an image (256×256 px). The process is repeated by moving this small area across the image little by little.
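The sliding-window step can be sketched in plain Python as follows. This is only an illustration: the function name `convolve2d`, the `stride` parameter and the tiny 2×2 edge-detecting kernel are assumptions, chosen so the result is easy to check by hand rather than to match the 10×10 px window mentioned above.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` and return one value per window position."""
    kh, kw = len(kernel), len(kernel[0])
    output = []
    for top in range(0, len(image) - kh + 1, stride):
        row = []
        for left in range(0, len(image[0]) - kw + 1, stride):
            # Weighted sum of the pixels under the kernel at this position.
            row.append(sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)
            ))
        output.append(row)
    return output

# Dark-to-bright vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds to a dark column followed by a bright column.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
```

The feature map is largest where the window sits exactly on the edge and zero where the window covers a uniform area.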

Pooling: the result of the convolution is computed as one point for each portion analyzed. In a similar step-by-step process, each small set of neighboring points is then reduced to a single value by keeping the maximum (“max pooling”).
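A minimal sketch of max pooling, assuming non-overlapping square windows; the function name `max_pool` and the window size of 2 are illustrative choices, not values from the lecture.

```python
def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    return [
        [
            max(
                feature_map[r + i][c + j]
                for i in range(size) for j in range(size)
            )
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool(feature_map)
```

Each 2×2 block of the 4×4 map collapses to its largest value, so the output here is a 2×2 grid.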

Repeating this process many times (~100x) and feeding the results to a neural net yields the likelihood that the initial image belongs to a known category.

Autocoding

A small number of neurons (~2), the “hidden layer”, forms a bottleneck between two columns of more numerous neurons (~10). The net is trained to produce output values z[n] that are the same as the input values x[n].

Such results imply that a form of generalization is accomplished by the hidden layer, or rather a form of encoded generalization, as the actual parameters of the bottleneck neurons are not obvious to interpret.
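The bottleneck idea can be sketched with a very small autoencoder in plain Python. Several simplifications here are assumptions and not part of the lecture: the units are linear (no sigmoid), training is plain stochastic gradient descent on the squared error between z and x, and there are 4 inputs instead of ~10. The name `train_autoencoder` and all its parameters are hypothetical.

```python
import random

def train_autoencoder(samples, hidden=2, epochs=500, lr=0.05, seed=0):
    """Train x -> h (bottleneck) -> z so that z is close to x."""
    rng = random.Random(seed)
    n = len(samples[0])
    w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden)]
    w_dec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n)]

    def forward(x):
        # Encode into the bottleneck, then decode back to the input size.
        h = [sum(w * xi for w, xi in zip(row, x)) for row in w_enc]
        z = [sum(w * hj for w, hj in zip(row, h)) for row in w_dec]
        return h, z

    def loss():
        # Total squared reconstruction error over all samples.
        return sum(
            (zi - xi) ** 2
            for x in samples
            for zi, xi in zip(forward(x)[1], x)
        )

    before = loss()
    for _ in range(epochs):
        for x in samples:
            h, z = forward(x)
            err = [zi - xi for zi, xi in zip(z, x)]
            # Error propagated back to the bottleneck (uses the old decoder weights).
            grad_h = [sum(w_dec[i][j] * err[i] for i in range(n)) for j in range(hidden)]
            for i in range(n):
                for j in range(hidden):
                    w_dec[i][j] -= lr * err[i] * h[j]
            for j in range(hidden):
                for k in range(n):
                    w_enc[j][k] -= lr * grad_h[j] * x[k]
    return forward, before, loss()

# Four-dimensional inputs that only vary along two underlying directions.
samples = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 1, 1], [0.5, 0, 0.5, 0]]
forward, loss_before, loss_after = train_autoencoder(samples)
code, reconstruction = forward(samples[0])
```

Here `code` holds the two bottleneck values for the first sample and `reconstruction` the four outputs. As noted above, the individual bottleneck values are not easy to interpret, even when the reconstruction is good.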

Final layer of neurons

As the neural net is trained, its parameters and thresholds are adjusted, so the shape and corresponding equation of the sigmoid function are adapted to properly sort positive and negative examples, by maximizing the probability of classifying the examples correctly.
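To make this concrete, the sketch below scores two candidate sigmoid settings on a handful of one-dimensional examples. The parameterization `weight * x - threshold`, the function names and the sample values are assumptions made for illustration only.

```python
import math

def sigmoid(x, weight=1.0, threshold=0.0):
    """Probability that x is a positive example for the given parameters."""
    return 1.0 / (1.0 + math.exp(-(weight * x - threshold)))

def likelihood(positives, negatives, weight, threshold):
    """Probability of sorting every example correctly (independent examples assumed)."""
    p = 1.0
    for x in positives:
        p *= sigmoid(x, weight, threshold)
    for x in negatives:
        p *= 1.0 - sigmoid(x, weight, threshold)
    return p

positives = [2.0, 3.0]
negatives = [-1.0, 0.0]
# Decision boundary at x = 1, between the two groups.
good = likelihood(positives, negatives, weight=2.0, threshold=2.0)
# Decision boundary at x = 4, to the right of every example.
bad = likelihood(positives, negatives, weight=2.0, threshold=8.0)
```

Training amounts to searching for the weight and threshold that make this likelihood as large as possible; here the first setting scores far higher than the second.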

Softmax

Instead of reporting only the category with the maximum value, the final output is an array of the most probable categories (~5 categories), each with its probability.
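A short sketch of this step, using the standard softmax formula to turn raw scores into probabilities and then keeping the k most probable categories. The function names, labels and scores are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_categories(scores, labels, k=5):
    """Return the k most probable (label, probability) pairs."""
    ranked = sorted(zip(labels, softmax(scores)), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

labels = ["cat", "dog", "car", "tree"]
scores = [1.0, 3.0, 2.0, 0.5]
best = top_categories(scores, labels, k=2)
```

With these scores the two entries kept are "dog" and then "car", each paired with its probability.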

Dropout

A problem with neural nets is that they can get stuck at local maxima. To prevent this, at each computation one neuron is deactivated to check whether its behavior is skewing the neural net. At each new computation another neuron is shut down, or dropped out, until all neurons have been checked.
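The description above can be sketched as a mask that silences one neuron per computation, cycling through the layer. This follows the notes literally; the usual formulation of dropout instead deactivates each neuron at random with some probability. The function name `drop_one` and the activation values are hypothetical.

```python
def drop_one(activations, step):
    """Zero out one neuron's output, choosing a different neuron at each step."""
    dropped = step % len(activations)
    return [0.0 if i == dropped else a for i, a in enumerate(activations)]

activations = [0.2, 0.7, 0.9]
first = drop_one(activations, step=0)   # neuron 0 is silenced
later = drop_one(activations, step=4)   # 4 % 3 == 1, so neuron 1 is silenced
```

During training the masked activations would be passed on to the next layer in place of the original ones, so no single neuron can dominate the result.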

Wider neural networks are also less likely to get stuck at a local maximum, since their additional parameters give them more ways around it.
