Tokens
Intermediate · 30 min
LLMs do not read characters or words; they read tokens. Tokenization determines context limits, pricing, and a surprising number of model quirks.
Tokenization · Subword units · Token counting · Cost estimation
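To make the link between subword units, token counts, and cost concrete, here is a minimal sketch. It assumes a tiny invented vocabulary and a greedy longest-match split, which is a simplification and not the byte-pair encoding that production tokenizers use. `TOY_VOCAB`, `tokenize`, `estimate_cost`, `fits_context`, the price, and the context limit are all hypothetical names and values chosen for illustration.

```python
# Toy vocabulary, invented for illustration. Real vocabularies are learned
# from data and typically hold tens of thousands of entries.
TOY_VOCAB = {"token", "ization", "s", " ", "un", "believ", "able"}


def tokenize(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Split text by greedy longest match against the vocabulary.

    Characters that match no vocabulary entry become single-character tokens.
    """
    tokens = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens


def estimate_cost(n_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost scales linearly with token count, not with characters or words."""
    return n_tokens * usd_per_million_tokens / 1_000_000


def fits_context(n_tokens: int, context_limit: int) -> bool:
    """Context limits are also measured in tokens."""
    return n_tokens <= context_limit


tokens = tokenize("unbelievable tokens")
print(tokens)  # ['un', 'believ', 'able', ' ', 'token', 's']
print(len(tokens))  # 6 tokens for 2 words and 19 characters

# Hypothetical price and limit; check your provider's actual figures.
print(estimate_cost(len(tokens), usd_per_million_tokens=2.0))
print(fits_context(len(tokens), context_limit=8))  # True
```

The same two-word string would produce a different token count under a different vocabulary, which is why counts and costs should be measured with the tokenizer of the model you actually call.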
Learn from these
Let's build the GPT Tokenizer
Andrej Karpathy · 2 hr
Explained: Tokens and Embeddings in LLMs
Article · The Research Nest
Tiktokenizer: see tokenization live
Playground · Tiktokenizer

