In natural language processing, an N-gram is a contiguous sequence of n items (phonemes, syllables, letters, words, or base pairs) from a given sample of text or speech.
Example trigrams from the Google n-gram corpus:
- ceramics collectables collectibles
- ceramics collectables fine
- ceramics collected by
- serve as the incoming
- serve as the incubator
- serve as the independent
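Extracting n-grams like those above from raw text is a simple sliding-window operation. The following is a minimal sketch (the sample sentence and the function name `ngrams` are illustrative, not part of the Google corpus tooling):

```python
def ngrams(tokens, n):
    # Slide a window of length n over the token list and
    # collect each contiguous n-item sequence as a tuple.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "serve as the incoming mail server".split()
trigrams = ngrams(tokens, 3)
# First trigram: ('serve', 'as', 'the')
```

The same function yields bigrams with `n=2` or character n-grams if given a list of letters instead of words.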
An N-gram model is a type of probabilistic language model for predicting the next item in a sequence, such as the next word in a string of text. Such models analyze sequences of words to compute the frequency of word collocations and thereby predict the most likely next word in a given context.
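The frequency-based prediction described above can be sketched as a simple bigram model: count how often each word follows each context word, then predict the most frequent follower. This is an illustrative toy (the tiny corpus and the names `train_bigram_model` and `predict_next` are made up for the example; a real model would also apply smoothing for unseen pairs):

```python
from collections import Counter, defaultdict

def train_bigram_model(tokens):
    # Count how often each word follows each preceding word.
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    # Return the most frequent follower of `word`, or None if unseen.
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None

tokens = "the cat sat on the mat the cat ran".split()
model = train_bigram_model(tokens)
predict_next(model, "the")  # → 'cat' ("cat" follows "the" twice, "mat" once)
```

Normalizing each `Counter` by its total turns these raw counts into the conditional probabilities P(next | previous) that the model assigns.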
More on N-grams can be found on Wikipedia.