A key element of Artificial Intelligence, Natural Language Processing is the manipulation of textual data by a machine in order to “understand” it, that is to say, to analyze it to obtain insights and/or to generate new text. In Python, this is often done with NLTK, the Natural Language Toolkit.
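To make the idea of analyzing text a little more concrete, here is a minimal sketch that splits a sentence into tokens with NLTK and counts them. The sample sentence is invented for the example, and `wordpunct_tokenize` is used on the assumption that no extra NLTK data has been downloaded, since it is a purely regex-based tokenizer.

```python
from nltk import FreqDist
from nltk.tokenize import wordpunct_tokenize

# Sample sentence made up for this illustration.
text = "NLTK splits text into tokens. Tokens can then be counted, tagged or parsed."

# Split into word and punctuation tokens, then keep lowercase alphabetic words only.
tokens = [t.lower() for t in wordpunct_tokenize(text) if t.isalpha()]

# FreqDist behaves like a Counter over the tokens.
freq = FreqDist(tokens)
print(freq.most_common(1))  # [('tokens', 2)]
```

Counting tokens is only a first step, but most analyses start from a token list like this one.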
Creating a stock trading bot is both a very interesting and a very challenging task. To build an algorithm that makes money, there are a number of potential trading strategies from which value can be created. This post attempts to list the most obvious strategies in a formal and systematic way, in order to structure the testing of the different ideas methodically. It will also be updated over time as more strategies are added and the most promising ideas are tested.
Mean Shift is an unsupervised machine learning algorithm for data clustering. It is a mode-seeking algorithm: rather than being told how many clusters to look for, it finds the number of clusters a feature space should be divided into, as well as the location of the clusters and their centers. It works by grouping data points according to a “bandwidth”, a radius around data points, and iteratively shifting the clusters’ centers towards the densest regions of data.
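As a rough illustration of the idea, here is a minimal sketch of Mean Shift with a flat kernel, written in NumPy. The function name `mean_shift`, the toy points and the bandwidth of 2 are assumptions made for this example; library implementations are more elaborate.

```python
import numpy as np

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift (illustrative sketch). Returns (centers, labels)."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()  # one moving "center candidate" per data point
    for _ in range(max_iter):
        moved = 0.0
        for i, p in enumerate(shifted):
            # Original data points lying within the bandwidth of the candidate.
            near = points[np.linalg.norm(points - p, axis=1) <= bandwidth]
            new_p = near.mean(axis=0)  # shift towards the local mean
            moved = max(moved, np.linalg.norm(new_p - p))
            shifted[i] = new_p
        if moved < tol:  # every candidate has stopped moving
            break
    # Candidates that converged to (almost) the same spot form one cluster.
    centers = []
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(shifted):
        for k, c in enumerate(centers):
            if np.linalg.norm(p - c) < bandwidth:
                labels[i] = k
                break
        else:
            centers.append(p.copy())
            labels[i] = len(centers) - 1
    return np.array(centers), labels

# Two small, well separated groups of toy points.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
centers, labels = mean_shift(points, bandwidth=2.0)
print(len(centers))     # 2
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]
```

Note that the number of clusters is never passed in: it falls out of the bandwidth, which is why choosing that radius matters so much in practice.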
K Means is a popular unsupervised machine learning algorithm for data clustering. A typical starting point for flat clustering, the K Means algorithm requires the number K of clusters to be chosen in advance. Given K, the algorithm then iteratively looks for the “centroids” to cluster the data around, so that each data point ends up as close as possible to the centroid of its cluster.
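The loop behind this can be sketched in a few lines of NumPy. This is a simplified version that assumes the initial centroids are passed in by hand (so K is simply their count); real implementations usually pick them at random or with a smarter seeding scheme. The name `k_means` and the toy data are made up for the example.

```python
import numpy as np

def k_means(points, init_centroids, n_iter=100):
    """Plain K Means loop (illustrative sketch). K is len(init_centroids)."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(init_centroids, dtype=float)
    k = len(centroids)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):  # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, labels

# Toy data: two groups, with K = 2 rough initial guesses.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
centroids, labels = k_means(points, init_centroids=[[0.0, 0.0], [6.0, 6.0]])
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]
```

The result depends on the starting centroids, which is why library versions typically run several initializations and keep the best one.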
Support Vector Machine is one of the most commonly used supervised machine learning algorithms for data classification. A binary classifier, the support vector machine algorithm works in vector space to sort data points by finding the best hyperplane separating them into two groups, that is, the one that leaves the widest margin to the nearest points of each group. Combined with kernel functions, it finds frontiers between groups of data points even in nonlinear patterns and feature spaces of high dimensions.
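For the linear case, the search for a separating hyperplane can be sketched as subgradient descent on the regularized hinge loss. This is only one simple way to train a linear SVM, not necessarily what a given library does; the function names, the hyperparameters (`lam`, `lr`, `epochs`) and the toy data are assumptions made for the example.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=500):
    """Fit a hyperplane w.x + b = 0 by subgradient descent on the hinge loss."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)  # labels must be -1 or +1
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # points inside the margin or on the wrong side
        # Subgradient of lam/2 * |w|^2 + mean(max(0, 1 - margin)).
        grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(X)
        grad_b = -y[viol].sum() / len(X)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Classify points by the side of the hyperplane they fall on."""
    return np.where(np.asarray(X, dtype=float) @ w + b >= 0, 1, -1)

# Toy data: two linearly separable groups labelled -1 and +1.
X = [[-2, -1], [-1, -2], [-2, -2], [2, 1], [1, 2], [2, 2]]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
preds = predict(X, w, b)
print(preds.tolist())  # [-1, -1, -1, 1, 1, 1]
```

This sketch only draws straight frontiers. Handling nonlinear patterns requires a kernel, which is beyond what these few lines show.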