Text Similarity API - Cosine & Jaccard Scoring
Computes a numeric similarity score between two strings using either cosine similarity on term-frequency vectors or Jaccard similarity on token sets. Both methods tokenize input by matching word characters, lowercase everything, and ignore punctuation. The response rounds the score to four decimal places.
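The scoring described above can be approximated locally. The following Python sketch is not the service's implementation. It assumes that "matching word characters" corresponds to the regex `\w+`, that term-frequency vectors are raw counts, and that empty input scores 0.0. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: "word characters" means the regex class \w.
    # Lowercasing first makes matching case-insensitive; punctuation is dropped.
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a, text_b):
    # Term-frequency vectors as raw token counts (assumed weighting).
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    # Assumption: a text with no tokens scores 0.0 instead of dividing by zero.
    return round(dot / norm, 4) if norm else 0.0


def jaccard_similarity(text_a, text_b):
    # Token sets: repeated words count once.
    a, b = set(tokenize(text_a)), set(tokenize(text_b))
    union = a | b
    return round(len(a & b) / len(union), 4) if union else 0.0
```

Under these assumptions, both functions depend only on which tokens appear (and, for cosine, how often), so reordering the words in either input leaves the score unchanged. The two methods diverge when words repeat: cosine weights a term by its count, while Jaccard only records whether it is present.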
Code examples
curl -X POST https://api.botoi.com/v1/text/similarity \
-H "Content-Type: application/json" \
-d '{"text_a":"The quick brown fox jumps over the lazy dog","text_b":"A quick brown fox leaps over a sleepy dog","method":"cosine"}'
When to use this API
Detect duplicate support tickets
Score incoming tickets against open issues so agents get a "likely duplicate" flag when similarity exceeds 0.8, cutting resolution time on repeat reports.
Deduplicate user-generated content
Compare new forum posts or product reviews against recent entries before publishing. Block near-identical spam while letting genuine paraphrases through.
Route FAQ answers in a chatbot
Match a user question against a library of canned answers and serve the top-scoring match instead of calling a more expensive LLM for every query.
Grade short-answer quiz responses
Score student submissions against a reference answer to auto-grade within a threshold and flag borderline cases for human review.
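The use cases above share two patterns: pick the top-scoring candidate, then act on a threshold. The Python sketch below illustrates both with a pluggable scorer. The local `jaccard` function is only a stand-in for a call to the API, and `best_match`, `triage`, and the 0.5 review cutoff are hypothetical. Only the 0.8 flag threshold comes from the ticket example above.

```python
import re


def jaccard(text_a, text_b):
    # Local stand-in scorer (assumption); swap in a function that calls the API.
    a = set(re.findall(r"\w+", text_a.lower()))
    b = set(re.findall(r"\w+", text_b.lower()))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(query, candidates, score=jaccard):
    # Return (candidate, score) for the highest-scoring candidate.
    # Raises ValueError if candidates is empty.
    return max(((c, score(query, c)) for c in candidates), key=lambda pair: pair[1])


def triage(score_value, accept=0.8, review=0.5):
    # accept=0.8 mirrors the duplicate-ticket example; review=0.5 is a
    # hypothetical lower bound for the "flag for human review" band.
    if score_value >= accept:
        return "match"
    if score_value >= review:
        return "review"
    return "no_match"
```

For example, `best_match(question, canned_questions)` selects the closest FAQ entry, and `triage` on its score decides whether to serve the canned answer, hand off to a person, or fall back to another path. Suitable cutoffs depend on the text length and the chosen method, so they are worth tuning on real data.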
Frequently asked questions
What is the difference between cosine and Jaccard?
How is the score scaled?
Is this semantic similarity?
How is text tokenized?
Does word order matter?
Get your API key
Free tier includes 5 requests per minute with no credit card required. Upgrade for higher limits.