POST /v1/nationality/estimate

Nationality Prediction API - Estimate Origin Country from Name

Estimates the most probable countries of origin for a given name using statistical models trained on global demographic data. Returns an array of countries ranked by probability with ISO 3166-1 alpha-2 codes. Useful for localization, demographic analysis, and internationalization.

Parameters

name (string, required)

Name (first or last) to predict nationality for.

Code examples

curl -X POST https://api.botoi.com/v1/nationality/estimate \
  -H "Content-Type: application/json" \
  -d '{"name":"Tanaka"}'
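For readers calling the endpoint from Python, the same request could be sketched with the standard library as below. The URL, method, header, and body come from the curl example; the function names `build_request` and `estimate_nationality` are illustrative, and no API key header is included because the page does not show how a key is passed.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.botoi.com/v1/nationality/estimate"


def build_request(name: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def estimate_nationality(name: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body as a dict.

    The response field names are not documented on this page, so the
    caller is left to inspect the returned dict.
    """
    with urllib.request.urlopen(build_request(name), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping `build_request` separate from the network call makes the request easy to inspect or unit test without hitting the API.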

When to use this API

Automatic locale and language selection

When a new user signs up with only their name, predict their likely country of origin and default the interface language and locale settings accordingly. A user named "Tanaka" would see Japanese as the suggested language, saving them a manual selection step.
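A minimal sketch of that defaulting logic might look like the following. It assumes the response has already been reduced to `(country_code, probability)` pairs, since the exact response field names are not shown here. The `COUNTRY_TO_LOCALE` table, the 0.5 threshold, and the `en-US` fallback are illustrative choices, not part of the API.

```python
from typing import Sequence, Tuple

# Hypothetical mapping from ISO 3166-1 alpha-2 codes to default locales.
COUNTRY_TO_LOCALE = {
    "JP": "ja-JP",
    "BR": "pt-BR",
    "DE": "de-DE",
    "FR": "fr-FR",
}


def suggest_locale(
    predictions: Sequence[Tuple[str, float]],
    min_probability: float = 0.5,  # assumed threshold, tune for your product
    fallback: str = "en-US",       # assumed default locale
) -> str:
    """Return a suggested locale for the most probable country.

    Falls back when there are no predictions, when the top prediction is
    weak, or when the country has no entry in the mapping.
    """
    if not predictions:
        return fallback
    code, probability = max(predictions, key=lambda p: p[1])
    if probability < min_probability:
        return fallback
    return COUNTRY_TO_LOCALE.get(code.upper(), fallback)
```

Treating the result as a suggestion that the user can override keeps a wrong guess cheap.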

Market research and audience analysis

Analyze the nationality distribution of your user base or email list by running names through this endpoint in batch. Identify which countries your product resonates with most, and allocate marketing budgets to the highest-opportunity regions.
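One way to aggregate such a batch is sketched below. It assumes one call per name (the page does not describe a bulk endpoint) and takes a `predict` callable that returns `(country_code, probability)` pairs, so the HTTP details and response field names stay outside the sketch. The function name `top_country_counts` and the `"unknown"` bucket are illustrative.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence, Tuple

Prediction = Tuple[str, float]


def top_country_counts(
    names: Iterable[str],
    predict: Callable[[str], Sequence[Prediction]],
) -> Counter:
    """Count how often each country is the top prediction across names.

    `predict` is a caller-supplied function (hypothetical here) that wraps
    the API call and returns (country_code, probability) pairs.
    """
    counts: Counter = Counter()
    for name in names:
        predictions = predict(name)
        if predictions:
            top_code, _ = max(predictions, key=lambda p: p[1])
            counts[top_code] += 1
        else:
            # No usable prediction for this name.
            counts["unknown"] += 1
    return counts
```

Calling `counts.most_common()` on the result gives countries ordered from most to least frequent.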

KYC and identity verification pre-screening

During KYC onboarding, compare the predicted nationality against the declared country of residence or passport country. A large mismatch is not fraud on its own, but it can trigger additional verification steps in your risk model.
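The mismatch check could be sketched roughly as follows, again assuming predictions have been extracted as `(country_code, probability)` pairs. The `min_support` threshold of 0.05 and the function name are assumptions for illustration. The result is meant as one input to a risk model, not a fraud verdict.

```python
from typing import Sequence, Tuple


def needs_extra_verification(
    predictions: Sequence[Tuple[str, float]],
    declared_country: str,
    min_support: float = 0.05,  # assumed threshold, not defined by the API
) -> bool:
    """Return True when the declared country has little support in the predictions.

    An empty prediction list carries no signal, so it does not trigger review.
    """
    if not predictions:
        return False
    declared = declared_country.strip().upper()
    # Sum the probability assigned to the declared country (normally one entry).
    support = sum(p for code, p in predictions if code.upper() == declared)
    return support < min_support
```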

Frequently asked questions

Can I use first names, last names, or full names?
The endpoint accepts a first name, a last name, or a full name. Last names (surnames) often carry a more distinctive nationality signal than first names, and single-token inputs (one name at a time) tend to be most accurate.
What does the probability score mean?
Each country in the response has a probability between 0 and 1 representing the likelihood that a person with that name originates from that country. A probability of 0.85 for JP means 85% of people with that name in the dataset are from Japan.
How many countries are returned?
The response includes up to 5 countries ranked by probability. Only countries with non-negligible probability (above ~0.01) are included. Names strongly associated with one country may return fewer results.
Why does "Tanaka" show Brazil as a secondary match?
Brazil has the largest Japanese diaspora outside Japan. Names like "Tanaka" appear in Brazilian records due to Japanese immigration in the early 20th century. The model reflects real-world name distributions across countries.
Is the name normalized before processing?
Yes. Names are lowercased and trimmed of whitespace. Diacritical marks are preserved and contribute to the prediction, since accented characters carry nationality signals (e.g., "Müller" vs "Mueller").
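Because inputs that differ only in case or surrounding whitespace are normalized to the same name, a client can deduplicate them before sending and save rate-limited requests. The sketch below mirrors the described normalization using Python's `str.strip` and `str.lower`. This approximates the server's behavior and is not a documented guarantee, and both function names are illustrative.

```python
from typing import Iterable, List


def normalize_name(name: str) -> str:
    """Lowercase and trim whitespace, leaving diacritical marks intact."""
    return name.strip().lower()


def dedupe_names(names: Iterable[str]) -> List[str]:
    """Return unique normalized names in first-seen order, skipping blanks."""
    normalized = (normalize_name(n) for n in names)
    # dict.fromkeys keeps insertion order while dropping duplicates.
    return list(dict.fromkeys(n for n in normalized if n))
```

Accented and unaccented spellings stay distinct, so "Müller" and "Muller" still produce two separate lookups.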

Get your API key

Free tier includes 5 requests per minute with no credit card required. Upgrade for higher limits.
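To stay within the free-tier limit, a client-side throttle along these lines could be used. It assumes a sliding 60-second window. How the server actually counts requests is not stated here, so treat the window logic and the class name `MinuteRateLimiter` as assumptions.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteRateLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(
        self,
        max_calls: int = 5,      # free-tier limit stated above
        period: float = 60.0,    # assumed sliding window, in seconds
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()

    def wait(self) -> None:
        """Block until another call is allowed, then record it."""
        now = self._clock()
        # Forget calls that have left the window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        if len(self._sent) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._sent[0]))
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
```

Calling `limiter.wait()` immediately before each request keeps a batch job under the limit. The injectable `clock` and `sleep` exist so the behavior can be tested without real delays.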