massOfai

Vocabulary (Vocab)

NLP & Text

Set of tokens the model recognizes

What is Vocabulary (Vocab)?

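Vocabulary size affects model capacity and out-of-vocabulary (OOV) handling: a larger vocabulary means a larger embedding table but shorter token sequences, while a smaller one means longer sequences. Subword vocabularies handle rare words better than word-level ones, because an unseen word can be split into known pieces instead of being mapped to a single unknown token.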
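The difference in OOV handling can be illustrated with a minimal sketch. It assumes a toy, made-up vocabulary and a greedy longest-match split loosely in the style of WordPiece. The names VOCAB, word_level_ids, and subword_tokens are hypothetical and do not come from any particular library.

```python
# Toy vocabulary: token -> integer id. Every entry is invented for illustration.
VOCAB = {"<unk>": 0, "token": 1, "ize": 2, "r": 3, "s": 4, "un": 5, "the": 6}


def word_level_ids(words, vocab):
    """Map whole words to ids; any word missing from the vocabulary becomes <unk>."""
    return [vocab.get(w, vocab["<unk>"]) for w in words]


def subword_tokens(word, vocab):
    """Greedily split one word into the longest matching vocabulary pieces.

    Returns ["<unk>"] if some part of the word cannot be covered.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate span until it matches a vocabulary entry.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:  # no piece matches at this position
            return ["<unk>"]
        pieces.append(word[start:end])
        start = end
    return pieces


# Word-level lookup loses the rare word entirely.
print(word_level_ids(["the", "tokenizers"], VOCAB))
# Subword lookup recovers it from smaller known pieces.
print(subword_tokens("tokenizers", VOCAB))
```

With this toy vocabulary, the word-level lookup collapses "tokenizers" to the unknown id, while the subword split covers it with pieces that are already in the vocabulary. Real tokenizers learn their pieces from data and differ in the details of how they split.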

Real-World Examples

  • A vocabulary of about 30k tokens for an LLM, e.g. BERT's 30,522-entry WordPiece vocabulary