NLTK - AILEPHANT

NLTK (Natural Language Toolkit) is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) in the Python programming language.

Developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania, NLTK makes it easy to perform the following operations:

Lexical analysis: Word and text tokenizer
n-gram and collocations
Part-of-speech tagger
Tree model and Text chunker for capturing
Named-entity recognition
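
As a rough illustration of a few of these operations, here is a minimal sketch that tokenizes a sentence, builds bigrams, and chunks noun phrases. It assumes NLTK is installed and deliberately uses only components that need no downloaded models: the regex-based wordpunct_tokenize, the ngrams helper, and RegexpParser. The sample sentence, the hand-written part-of-speech tags, and the NP grammar are illustrative assumptions, not output from NLTK's trained tagger.

```python
from nltk import RegexpParser
from nltk.tokenize import wordpunct_tokenize
from nltk.util import ngrams

text = "The quick brown fox jumps over the lazy dog."

# Lexical analysis: split the text into word and punctuation tokens.
tokens = wordpunct_tokenize(text)
print(tokens)
# ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog', '.']

# n-grams: every pair of adjacent tokens.
bigrams = list(ngrams(tokens, 2))
print(bigrams[:3])

# Chunking: the tags below are written by hand for this example
# (an assumption); a trained tagger would normally supply them.
tagged = [
    ("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
    ("dog", "NN"), (".", "."),
]

# Illustrative grammar: optional determiner, any adjectives, then a noun.
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN>}")
tree = chunker.parse(tagged)

# Collect the words of each NP subtree of the resulting parse tree.
noun_phrases = [
    " ".join(word for word, _tag in subtree.leaves())
    for subtree in tree.subtrees(lambda t: t.label() == "NP")
]
print(noun_phrases)
# ['The quick brown fox', 'the lazy dog']
```

Trained components such as the default sentence tokenizer, the part-of-speech tagger, and the named-entity chunker follow the same pattern but require their data packages to be fetched first with nltk.download().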

NLTK also provides access to more than 50 corpora and lexical resources, including WordNet, as well as a number of other NLP resources.

For more information, see the NLTK website and the NLTK Wikipedia page. A complete, free book on NLTK, covering NLTK 3 and Python 3, is also available online, with the following chapters:

Preface
Language Processing and Python
Accessing Text Corpora and Lexical Resources
Processing Raw Text
Writing Structured Programs
Categorizing and Tagging Words (minor fixes still required)
Learning to Classify Text
Extracting Information from Text
Analyzing Sentence Structure
Building Feature Based Grammars
Analyzing the Meaning of Sentences (minor fixes still required)
Managing Linguistic Data (minor fixes still required)
Afterword: Facing the Language Challenge

Bibliography
Term Index
