GILT Ninjas

Ninja Power in Globalization, Internationalization, Localization and Translation


What’s in a name?

At GILT Ninjas, we’ve covered many complex topics over time. But there’s one seemingly simple concept we haven’t touched on, and it’s one of the most overlooked (and surprisingly tough) challenges for companies expanding globally: how to handle names.

Whether it’s a UI element, an email greeting, or an SMS, addressing your customer by name is a basic feature. But doing it right across cultures is anything but basic.

To understand the problem, let’s start with how many English-centric systems handle names. 

A typical assumption is:

[Given Name] [Middle Name] [Family Name]

With expected lengths like:

Given name: ~6 characters

Middle name: ~2 characters

Family name: ~7 characters

So, name schemas often look like this:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "ClassicalEnglishName",
  "type": "object",
  "properties": {
    "given_name":  { "type": "string", "minLength": 1, "maxLength": 15 },
    "middle_name": { "type": "string", "maxLength": 15 },
    "family_name": { "type": "string", "minLength": 1, "maxLength": 15 }
  },
  "required": ["given_name", "family_name"]
}

But even within English-speaking markets, this is limiting. People change names due to marriage, divorce, adoption, or other legal reasons. So a key early question is:

  • Should your system store previous names?

If so, the schema above is insufficient. And if you plan to support international markets or want to respect cultural diversity, things get even more complex.

Let’s look at some global considerations that should influence your schema.

Characters beyond ASCII

Names may include symbols, numbers and characters beyond A–Z:

English: Ke$ha

Nordic: Bjørn

French: Chloë

Spanish: Iñaki

Portuguese: João

… and many more.
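To make this concrete, here is a minimal Python sketch of what goes wrong with an A–Z-only check, and of one common mitigation (Unicode NFC normalization). The `ASCII_ONLY` pattern and the `normalize_name` helper are hypothetical names we made up for illustration; they are not taken from any particular product.

```python
import re
import unicodedata

# Hypothetical, overly strict validator of the kind found in English-centric forms.
ASCII_ONLY = re.compile(r"^[A-Za-z]+$")


def normalize_name(raw: str) -> str:
    """Trim surrounding whitespace and apply NFC normalization.

    NFC makes visually identical strings (precomposed vs. combining
    characters) compare equal; it does not strip or replace any letters.
    """
    return unicodedata.normalize("NFC", raw.strip())


names = ["Ke$ha", "Bjørn", "Chloë", "Iñaki", "João"]

# Every example from the list above fails the A-Z-only check.
rejected = [n for n in names if not ASCII_ONLY.match(n)]

# The same name can arrive in two different encodings of "ã".
decomposed = "Joa\u0303o"  # 'a' followed by a combining tilde
composed = normalize_name(decomposed)  # single precomposed 'ã' (U+00E3)

print(rejected)
print(len(decomposed), len(composed))
```

The point of the sketch is that validation should accept these characters, while normalization keeps comparisons and lookups consistent.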

Multiple given or family names

In Hispanic cultures, names are often multi-part:

Example: Pablo Diego José Francisco de Paula Juan Nepomuceno Crispín Crispiniano María de los Remedios de la Santísima Trinidad Ruiz Picasso

(Yes, that’s Picasso’s full name!)

Here, Ruiz Picasso is the family name, and even that is composed of multiple parts. It is therefore safer to set a very high character limit for names.
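As a quick illustration, the sketch below checks Picasso's name against the 15-character limit from the English-centric schema and against a far more generous one. The constant names, the `fits` helper and the value 256 are assumptions chosen for this example only.

```python
CLASSIC_MAX = 15    # maxLength used in the English-centric schema above
GENEROUS_MAX = 256  # assumed generous limit, for illustration only

given_names = (
    "Pablo Diego José Francisco de Paula Juan Nepomuceno Crispín Crispiniano "
    "María de los Remedios de la Santísima Trinidad"
)
family_name = "Ruiz Picasso"  # kept together: it is one family name with two parts


def fits(value: str, max_length: int) -> bool:
    """Return True if the value is non-empty and within the character limit."""
    return 1 <= len(value) <= max_length


print(fits(given_names, CLASSIC_MAX))   # the given names do not fit in 15 characters
print(fits(given_names, GENEROUS_MAX))  # they do fit under the generous limit
print(fits(family_name, CLASSIC_MAX))   # the family name alone happens to fit
```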

Name order varies

In English: [Given Name] [Family Name]

In Japanese: [Family Name] [Given Name]

Parsing these into individual fields can introduce errors, especially if you’re guessing from the “full name” field.

Keep in mind that these order variations also have to be reflected in the UI where needed.
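One way to avoid guessing is to store the parts separately together with an explicit display order, and only join them at render time. The following is a small sketch under that assumption; the `Name` class, its fields and the `separator` option are hypothetical and only meant to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Name:
    given: str
    family: str
    order: str             # "given_family" or "family_given"
    separator: str = " "   # assumed: some scripts join parts without a space


def display_name(name: Name) -> str:
    """Join the stored parts in the order recorded for this person."""
    if name.order == "family_given":
        return f"{name.family}{name.separator}{name.given}"
    return f"{name.given}{name.separator}{name.family}"


print(display_name(Name("Pablo", "Ruiz Picasso", "given_family")))
print(display_name(Name("一郎", "鈴木", "family_given", separator="")))
```

Because the order is stored rather than inferred, the same record can be rendered correctly regardless of the UI language the viewer happens to use.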

Honorifics and declensions

Some cultures use honorifics (e.g., Japanese Suzuki-san) or require names to decline grammatically depending on context (common in Slavic languages). These variations add complexity that can’t be solved by a single static field.
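A rough sketch of what this means in practice: instead of dropping one name string into every template, the template picks a grammatical form or an honorific per locale. Everything here (the `greeting` function, the `forms` dictionary, the three locales handled) is a simplified assumption for illustration; real declension rules are far richer, and the case forms are stored rather than computed.

```python
def greeting(forms: dict, locale: str) -> str:
    """Build a greeting using the name form that the locale's grammar expects.

    `forms` maps a grammatical case to the stored form of the name;
    "nominative" is assumed to always be present.
    """
    base = forms["nominative"]
    if locale == "cs-CZ":
        # Czech addresses a person in the vocative case when one is stored.
        return f"Dobrý den, {forms.get('vocative', base)}!"
    if locale == "ja-JP":
        # Japanese appends an honorific to the name instead of declining it.
        return f"{base}さん、こんにちは"
    return f"Hello, {base}!"


petr = {"nominative": "Petr", "vocative": "Petře"}
print(greeting(petr, "cs-CZ"))
print(greeting({"nominative": "鈴木"}, "ja-JP"))
print(greeting(petr, "en-US"))
```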

The best practice: legal vs preferred names

To handle names properly:

  • Use legal names for official documents (invoices, contracts, etc.)
  • Use preferred names for UI, emails, notifications, etc.

If a preferred name is not provided, fall back based on cultural norms (e.g., given name in English, family name in Japanese).

Also, make sure engineers consistently default to the preferred name across the product.
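The fallback rule can be sketched as a single helper that every UI surface calls, so the logic lives in one place. The function name `name_for_ui`, its parameters and the locale check are our own assumptions for illustration, covering only the two norms mentioned above.

```python
from typing import Optional


def name_for_ui(preferred: Optional[str], given: str, family: str, locale: str) -> str:
    """Pick the name to show in UI, emails and notifications.

    The preferred name wins whenever it is present; otherwise fall back
    to the part that the locale's norm suggests (assumed mapping).
    """
    if preferred and preferred.strip():
        return preferred.strip()
    language = locale.split("-")[0].lower()
    if language == "ja":
        return family  # family name is the usual form of address in Japanese
    return given       # given name is the usual form of address in English


print(name_for_ui("Kat", "Katherine", "Jones", "en-US"))
print(name_for_ui(None, "Katherine", "Jones", "en-US"))
print(name_for_ui(None, "Ichiro", "Suzuki", "ja-JP"))
```

Legal names would still come from a separate field when generating invoices or contracts; this helper is only for everyday communication.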

A flexible schema[1] for real-world use

Based on these considerations, here’s a more robust, future-ready schema that supports:

  • Legal + preferred names
  • Locale-aware formatting
  • Multilingual scripts
  • Transliteration
  • Previous names (with validity windows)
  • Sorting, phonetics, and honorifics
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "PersonNameRecord",
  "type": "object",
  "properties": {
    "default_name_usage": {
      "type": "string",
      "enum": ["preferred", "legal"],
      "default": "preferred",
      "description": "Use preferred name by default across UI/comms; fall back to legal if missing."
    },
    "preferred_name": { "$ref": "#/definitions/NameRepresentation" },
    "legal_name": { "$ref": "#/definitions/NameRepresentation" },
    "previous_names": {
      "type": "array",
      "items": { "$ref": "#/definitions/HistoricalName" },
      "description": "Prior legal/preferred names with validity windows and reasons."
    },
    "formatting_defaults": {
      "type": "object",
      "properties": {
        "locale": {
          "type": "string",
          "description": "BCP-47 locale hint for formatting/sorting, e.g., 'es-ES', 'ja-JP'."
        },
        "script": {
          "type": "string",
          "description": "ISO-15924 script code, e.g., 'Latn', 'Jpan', 'Cyrl'."
        }
      },
      "additionalProperties": false
    }
  },
  "required": ["preferred_name", "legal_name"],
  "additionalProperties": false,

  "definitions": {
    "NameRepresentation": {
      "type": "object",
      "properties": {
        "full": {
          "type": "string",
          "maxLength": 1024,
          "description": "Exact string as entered by the user; keep for faithful display."
        },
        "order": {
          "type": "string",
          "enum": ["given_family", "family_given", "mononym", "unknown"],
          "description": "How parts should be displayed by default."
        },
        "parts": {
          "type": "array",
          "minItems": 1,
          "items": { "$ref": "#/definitions/NamePart" },
          "description": "Ordered list of parts; populate it only from parts the user provided, never by parsing 'full'."
        },
        "honorific_prefixes": {
          "type": "array",
          "items": { "type": "string" },
          "description": "e.g., 'Dr', 'Sir', 'Dra.', 'Srta.'."
        },
        "honorific_suffixes": {
          "type": "array",
          "items": { "type": "string" },
          "description": "e.g., 'Jr', 'III', 'PhD', Japanese '様' if explicitly preferred."
        },
        "particles": {
          "type": "array",
          "items": { "type": "string" },
          "description": "e.g., 'de', 'del', 'da', 'van', 'bin', 'al'. Keep separate for sorting."
        },
        "locale": { "type": "string", "description": "BCP-47 locale for this specific representation." },
        "script": { "type": "string", "description": "ISO-15924 script for this representation." },
        "transliterations": {
          "type": "array",
          "items": { "$ref": "#/definitions/Transliteration" }
        },
        "phonetics": {
          "type": "array",
          "items": { "$ref": "#/definitions/Phonetic" }
        },
        "sort_as": {
          "type": "string",
          "description": "Precomputed sort key (locale-aware), e.g., \"Ruiz y Picasso, Pablo…\""
        },
        "initials": {
          "type": "string",
          "description": "Optional precomputed initials; locale rules vary."
        },
        "normalized_form": {
          "type": "string",
          "description": "NFC-normalized version of 'full' (store if you normalize)."
        }
      },
      "required": ["full", "order"],
      "additionalProperties": false
    },

    "NamePart": {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "enum": [
            "given", "middle", "family",
            "additional_family", "patronymic", "matronymic",
            "mononym", "particle", "prefix", "suffix"
          ]
        },
        "value": { "type": "string", "maxLength": 256 },
        "language": { "type": "string", "description": "BCP-47 language tag if relevant." },
        "script": { "type": "string", "description": "ISO-15924 script code if relevant." }
      },
      "required": ["type", "value"],
      "additionalProperties": false
    },

    "Transliteration": {
      "type": "object",
      "properties": {
        "system": { "type": "string", "description": "e.g., 'Hepburn', 'ALA-LC', 'ISO 9'." },
        "value": { "type": "string" },
        "language": { "type": "string" },
        "script": { "type": "string" }
      },
      "required": ["system", "value"],
      "additionalProperties": false
    },

    "Phonetic": {
      "type": "object",
      "properties": {
        "system": { "type": "string", "enum": ["ipa", "pinyin", "katakana", "romaji", "other"] },
        "value": { "type": "string" }
      },
      "required": ["system", "value"],
      "additionalProperties": false
    },

    "HistoricalName": {
      "type": "object",
      "properties": {
        "representation": { "$ref": "#/definitions/NameRepresentation" },
        "valid_from": { "type": "string", "format": "date" },
        "valid_to": { "type": "string", "format": "date" },
        "reason": {
          "type": "string",
          "enum": ["marriage", "divorce", "legal_change", "adoption", "other"]
        },
        "notes": { "type": "string" }
      },
      "required": ["representation"],
      "additionalProperties": false
    }
  }
}
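To show how the validity windows might be used, here is a small Python sketch that looks up the name that applied on a given date. The sample record is invented, the `name_on` helper is hypothetical, and we assume the entries in `previous_names` are earlier legal names with ISO dates; the field names follow the schema above.

```python
from datetime import date

# Invented sample record using the field names from the schema above.
record = {
    "preferred_name": {"full": "Jane Smith", "order": "given_family"},
    "legal_name": {"full": "Jane Smith", "order": "given_family"},
    "previous_names": [
        {
            "representation": {"full": "Jane Miller", "order": "given_family"},
            "valid_from": "1990-05-01",
            "valid_to": "2015-06-30",
            "reason": "marriage",
        }
    ],
}


def name_on(record: dict, day: date) -> str:
    """Return the full name valid on `day`, defaulting to the current legal name."""
    for old in record.get("previous_names", []):
        # A missing bound is treated as an open-ended window.
        start = date.fromisoformat(old["valid_from"]) if "valid_from" in old else date.min
        end = date.fromisoformat(old["valid_to"]) if "valid_to" in old else date.max
        if start <= day <= end:
            return old["representation"]["full"]
    return record["legal_name"]["full"]


print(name_on(record, date(2010, 1, 1)))  # falls inside the previous name's window
print(name_on(record, date(2020, 1, 1)))  # after the window: current legal name
```

A lookup like this is what lets you reissue an old invoice under the name that was valid at the time.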

Designing name schemas for international audiences is hard. There’s no one-size-fits-all solution, but the worst thing you can do is assume that what works in English will work globally.

For those reasons:

  • Think beyond first/last name.
  • Build with cultural and linguistic flexibility in mind.
  • Choose the right structure early. Migrations are very costly later.

  1. Please note that the schema was generated using OpenAI; it is only meant to give a view of how complex a proper name schema can get.
