Token Count API - Estimate LLM Tokens for GPT, Claude, Llama
Approximates token count using family-specific characters-per-token ratios (GPT, Claude, Llama, Mistral, Gemini). Returns token count, the model used, the estimation method, character count, and word count. Accurate enough for budgeting prompts and splitting long inputs.
Code examples
curl -X POST https://api.botoi.com/v1/token/count \
-H "Content-Type: application/json" \
-d '{"text":"Summarize this article in 3 bullet points.","model":"claude-3.5-sonnet"}'

When to use this API
Budget LLM costs before the request
Estimate tokens on user input, multiply by the model's per-token price, and display expected cost in your UI. Prevents surprise bills on long prompts.
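The budgeting flow above can be sketched locally. The per-token prices and the flat 4-characters-per-token ratio below are illustrative assumptions for the sketch, not values returned by the API (the service applies family-specific ratios):

```python
import math

# Hypothetical input prices in USD per 1K tokens -- check your provider's pricing.
PRICE_PER_1K_INPUT_TOKENS = {
    "gpt-4o": 0.0025,
    "claude-3.5-sonnet": 0.003,
}

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Local stand-in for the token-count endpoint: characters / ratio, rounded up."""
    return math.ceil(len(text) / chars_per_token)

def estimated_cost_usd(text: str, model: str) -> float:
    """Estimated tokens times the model's per-token input price."""
    return estimate_tokens(text) / 1000 * PRICE_PER_1K_INPUT_TOKENS[model]

prompt = "Summarize this article in 3 bullet points."
print(f"~{estimate_tokens(prompt)} tokens, ~${estimated_cost_usd(prompt, 'claude-3.5-sonnet'):.6f}")
```

In production you would call the endpoint instead of the local heuristic, since it picks the ratio matching the model you pass.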
Fit prompts in the context window
Combined with /v1/token/truncate, check whether your assembled prompt fits and chop intelligently if it overshoots. Especially useful for RAG pipelines with variable retrieved-context lengths.
Rate-limit by token spend, not request count
Enforce per-user LLM spending limits by summing the estimated tokens of each request. Stops a single user with long prompts from exhausting your monthly quota.
Frequently asked questions
How accurate is the estimate?
Why is method "estimated" and not "exact"?
Which model should I pass?
Does this count system prompts and tool schemas?
Can I pass multiple strings?
Get your API key
Free tier includes 5 requests per minute with no credit card required. Upgrade for higher limits.