AILEPHANT

Artificial Intelligence Lab


Tokenization

In natural language processing, tokenization is the process of splitting a string of text into smaller units (tokens), typically words and/or sentences. The resulting list of tokens can then be analyzed further, for example to measure word frequency, which is often a first step toward understanding what a text is about.
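As a minimal sketch of the idea, the example below splits text into sentences and words using only Python's standard library (regular expressions and `collections.Counter`). These naive rules are assumptions for illustration; production NLP code usually relies on a library such as NLTK instead.

```python
import re
from collections import Counter

def sentence_tokenize(text):
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]

def word_tokenize(text):
    # Lowercase the text and extract simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())

text = "Tokenization splits text into tokens. Tokens can be words or sentences!"
sentences = sentence_tokenize(text)
words = word_tokenize(text)
freq = Counter(words)  # word frequency: a first step in analyzing the text
```

Here `freq["tokens"]` would be 2, showing how tokenization feeds directly into word-frequency analysis.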

Check the basics of NLP with NLTK to implement tokenization in Python.

More on Tokenization on Wikipedia.


Copyright © 2025 AILEPHANT - All rights reserved.