Language

Metrici supports the translation of solutions into multiple languages.

In outline:

  • Text to be translated within the solutions is tagged with identifiers.
  • Translations of text are collected into language packs, which contain all available translations of all the tags within a solution.
  • Users are associated with a language code that indicates their language of choice.
  • As web pages are written, the translations for the user's language are extracted from the language pack and used to replace the tagged text.

HTML to be translated is identified using a data-l10n-id attribute (l10n stands for "localization"). For each language, translations of that data are held using a combination of:

  • identifier
  • attribute name (optional)
  • pattern (optional)
  • target, i.e. the translated text.

The scenarios below illustrate how these are used.

Translation scenarios

Element replacement

In the simplest scenario, an element is identified with a data-l10n-id attribute, and a translation is associated with that id that will replace the entire element contents.

<p data-l10n-id="greeting">Hello World!</p>

Using the translation table

id        attribute  pattern  target
greeting                      Salut le Monde!

gives

<p data-l10n-id="greeting">Salut le Monde!</p>

The content of the translated element is then itself translated. This is useful for pattern matching and placeholders.

Attribute replacement

As well as, or instead of, translating the element content, a translation can be applied to an attribute. The data-l10n-id attribute is applied in the same way. For example:

<a href="#" data-l10n-id="deleteButton" title="Delete">x</a>

Using the translation table

id            attribute  pattern  target
deleteButton  title               Effacer

gives

<a href="#" data-l10n-id="deleteButton" title="Effacer">x</a>

Pattern matching

The same data-l10n-id can be reused and applied to different text automatically, using pattern matching. For example:

<button type="button" data-l10n-id="buttonCaption">Save</button>
<button type="button" data-l10n-id="buttonCaption">Cancel</button>

Using the translation table

id             attribute  pattern  target
buttonCaption             Save     Enregistrer
buttonCaption             Cancel   Annuler

gives

<button type="button" data-l10n-id="buttonCaption">Enregistrer</button>
<button type="button" data-l10n-id="buttonCaption">Annuler</button>

Even though the same id is used, pattern matching can be used to pick a matching translation.

Pattern matching can be used on element text or attributes.

Pattern matching with placeholders

The pattern matching can include placeholders in the text.

<p data-l10n-id="alertMessage">You have 10 alerts</p>

Using the translation table

id            attribute  pattern                   target
alertMessage             You have ${count} alerts  Vous disposez de ${count} alertes

gives

<p data-l10n-id="alertMessage">Vous disposez de 10 alertes</p>

Importantly, the translation works whatever the number.

By using multiple patterns you can deal with variants of the sentence.

id            attribute  pattern                   target
alertMessage             You have no alerts        Vous n'avez aucune alerte
alertMessage             You have 1 alert          Vous avez 1 alerte
alertMessage             You have ${count} alerts  Vous avez ${count} alertes

This gives:

  • You have no alerts -> Vous n'avez aucune alerte
  • You have 1 alert -> Vous avez 1 alerte
  • You have 5 alerts -> Vous avez 5 alertes

Pattern matching with placeholders works with both elements and attributes.

The translation definitions allow a sequence number to control the order in which patterns are applied. In the example above, the "You have ${count} alerts" pattern must have the highest sequence so that it is examined last, since it would otherwise also match text intended for a more specific pattern, such as "You have no alerts".
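The matching and sequencing described above can be sketched in Python. This is an illustration only, not Metrici's implementation: the function names, the (sequence, pattern, target) tuple layout, and the choice to compile patterns into regular expressions are all assumptions made for the example.

```python
import re

# Matches a ${name} placeholder in a pattern or target.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")

def pattern_to_regex(pattern):
    """Compile a pattern such as 'You have ${count} alerts' into a regex.

    Literal text is escaped; each ${name} becomes a named capture group.
    """
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))
        parts.append("(?P<%s>.+?)" % m.group(1))
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))

def translate(text, translations):
    """Return the translation of text, or text unchanged if nothing matches.

    translations is a list of (sequence, pattern, target) tuples
    (a hypothetical layout); the lowest sequence is examined first.
    """
    for _seq, pattern, target in sorted(translations, key=lambda t: t[0]):
        if pattern:
            m = pattern_to_regex(pattern).fullmatch(text)
            if m is None:
                continue
            values = m.groupdict()
        else:
            values = {}  # no pattern: any text is translated
        # Copy the captured values into the target's placeholders.
        return PLACEHOLDER.sub(
            lambda p: values.get(p.group(1), p.group(0)), target)
    return text

ALERTS = [
    (30, "You have ${count} alerts", "Vous avez ${count} alertes"),
    (10, "You have no alerts", "Vous n'avez aucune alerte"),
    (20, "You have 1 alert", "Vous avez 1 alerte"),
]

print(translate("You have 5 alerts", ALERTS))
print(translate("You have no alerts", ALERTS))
```

If the general pattern were given the lowest sequence instead, "You have no alerts" would be captured with count set to "no", which is why the most general pattern needs the highest sequence.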

Explicit placeholders

For elements, but not attributes, you can use explicit placeholders instead of pattern matching. This can be useful in complex chunks of text.

<p data-l10n-id="shareMessage">
<span data-l10n-placeholder="userName">Fred Jones</span>
(<span data-l10n-placeholder="userEmail">fred.jones@somewhere.wr</span>)
has shared their work with
<span data-l10n-placeholder="toUserName">Sandra Lewis</span>
(<span data-l10n-placeholder="toUserEmail">sandra.lewis@somewhere.wr</span>)
</p>

Using the translation table

id            attribute  pattern  target
shareMessage                      ${userName} (${userEmail}) a partagé son travail avec ${toUserName} (${toUserEmail}).

gives

<p data-l10n-id="shareMessage">
<span data-l10n-placeholder="userName">Fred Jones</span>
(<span data-l10n-placeholder="userEmail">fred.jones@somewhere.wr</span>)
a partagé son travail avec
<span data-l10n-placeholder="toUserName">Sandra Lewis</span>
(<span data-l10n-placeholder="toUserEmail">sandra.lewis@somewhere.wr</span>)
</p>
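One way to picture explicit placeholders is as a two-step substitution: collect each marked child element by its placeholder name, then drop those elements into the target text. The Python sketch below assumes exactly that; the function names are hypothetical, and the regular expression only copes with simple, non-nested markup like the example above, so it is not a description of how Metrici parses HTML.

```python
import re

# Matches a ${name} placeholder in the target text.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")

# Matches a whole element carrying a data-l10n-placeholder attribute.
# Assumption: marked elements are not nested inside one another.
MARKED = re.compile(
    r'<(\w+)[^>]*\bdata-l10n-placeholder="(\w+)"[^>]*>.*?</\1>', re.S)

def collect_fragments(inner_html):
    """Map each placeholder name to the complete element that carries it."""
    return {m.group(2): m.group(0) for m in MARKED.finditer(inner_html)}

def fill_target(target, fragments):
    """Replace each ${name} in target with the matching element.

    Unknown placeholder names are left in place.
    """
    return PLACEHOLDER.sub(
        lambda m: fragments.get(m.group(1), m.group(0)), target)

SOURCE = """
<span data-l10n-placeholder="userName">Fred Jones</span>
(<span data-l10n-placeholder="userEmail">fred.jones@somewhere.wr</span>)
has shared their work with
<span data-l10n-placeholder="toUserName">Sandra Lewis</span>
(<span data-l10n-placeholder="toUserEmail">sandra.lewis@somewhere.wr</span>)
"""

TARGET = ("${userName} (${userEmail}) a partagé son travail avec "
          "${toUserName} (${toUserEmail}).")

translated = fill_target(TARGET, collect_fragments(SOURCE))
print(translated)
```

Because the elements are moved as whole units, the translator is free to reorder the placeholders in the target to suit the grammar of the target language.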

Language packs

Each user is associated with a language pack, which is defined using the Language pack reference on their user node, account node or theme. The language pack is a node whose Data member holds XML with the following structure:

<LanguagePack>
  <sourceLanguage/>
  <TranslationSet>
    <language/>
    <comment/>
    <TranslationList>
      <Translation>
        <identifier/>
        <attribute/>
        <comment/>
        <source/>
        <pattern/>
        <target/>
      </Translation>
    </TranslationList>
  </TranslationSet>
</LanguagePack>

Below the language pack is the translation set, which defines a set of related translations for the same language, e.g. all the translations for product X in language Y.

TranslationSet and Translation repeat. The same language code and identifiers can appear multiple times.

The data elements have the following meanings.

  • sourceLanguage: The language of the source (typically "en"). This allows a user to pass a set of language codes and avoids translating unless required. For example, a French Canadian might use the language codes "en-US, en, fr-CA, fr-FR", and would expect to get the untranslated English version rather than the French version.
  • language: Standard code for the language. This can be a comma-delimited list of codes, e.g. "fr, fr-FR" means "General French and French French".
  • comment: Used to guide human translators.
  • identifier: Identifies an element within the HTML document (i.e. the value of the data-l10n-id attribute).
  • attribute: If given, specifies that the translation is for the named attribute. If not given, the translation is for the inner HTML of the element.
  • source: Original text, used to guide human translators. Defaults to pattern.
  • pattern: A sequence of text with ${..} placeholders that indicates that only text that matches this pattern should be translated. Optional: if omitted, any text is translated.
  • target: The target translation text, with ${..} placeholders.
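To make the structure concrete, the Python sketch below reads a small language pack into a lookup table. The sample XML values, the function name and the shape of the lookup table are assumptions for illustration; the text does not say whether optional elements are omitted or left empty, so the sketch accepts both.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data that follows the structure shown above.
SAMPLE = """
<LanguagePack>
  <sourceLanguage>en</sourceLanguage>
  <TranslationSet>
    <language>fr, fr-FR</language>
    <comment>Sample translations</comment>
    <TranslationList>
      <Translation>
        <identifier>greeting</identifier>
        <target>Salut le Monde!</target>
      </Translation>
      <Translation>
        <identifier>deleteButton</identifier>
        <attribute>title</attribute>
        <target>Effacer</target>
      </Translation>
    </TranslationList>
  </TranslationSet>
</LanguagePack>
"""

def load_pack(xml_text):
    """Return (source_language, table).

    table maps (language, identifier, attribute) to a list of
    (pattern, target) pairs; attribute and pattern are None when absent.
    """
    root = ET.fromstring(xml_text)
    source_language = (root.findtext("sourceLanguage") or "").strip()
    table = {}
    for tset in root.findall("TranslationSet"):
        # language may be a comma-delimited list of codes.
        raw_codes = (tset.findtext("language") or "").split(",")
        codes = [c.strip() for c in raw_codes if c.strip()]
        for tr in tset.findall("TranslationList/Translation"):
            identifier = (tr.findtext("identifier") or "").strip()
            attribute = (tr.findtext("attribute") or "").strip() or None
            entry = (tr.findtext("pattern") or None,
                     tr.findtext("target") or "")
            for code in codes:
                key = (code, identifier, attribute)
                # Identifiers can repeat, so keep every entry.
                table.setdefault(key, []).append(entry)
    return source_language, table

source_language, table = load_pack(SAMPLE)
print(source_language, table[("fr", "greeting", None)])
```

Keeping a list per key reflects the note above that the same language code and identifier can appear multiple times.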

The language pack is loaded into memory and shared by all users who need it. To avoid permission problems, grant anonymous page administer permission on the language pack.

The language pack can be entered manually using the Data Node type. However, there are a number of types that simplify building language packs:

  • Language Pack type allows a language pack to be built from other language packs and from separate translation set nodes, each of which holds just the <TranslationSet> element in the Data field.
  • Translation Set Template holds definitions of a set of related phrases that have not been translated. Each individual phrase is held in a node of type Translation Template.
  • Translation Set type references a Translation Set Template and builds a translation set from Translation nodes within itself. It can reference a Translation Set in place of a Translation Set Template, in which case it can be used to extend an existing translation set. It also provides facilities to export the translation set to Excel, and import it back in again, to allow offline translation.

Specifying language pack and language

The language pack for a user is read from the Language pack reference property on their user node, account node or theme. This should contain the translation sets for all the products they are using.

The language code for a user is read from the Language code property on their user node, account node or theme (though the user node makes most sense), or from the languageCode cookie, or from the Accept-Language request header. Once set, the language code is retained in a cookie, so that translations can use it before the user signs on again. A language code of "*" means "accept any language", i.e. do not translate. A language code of "browser" means always use the Accept-Language header.

If a full language code (such as fr-FR) is not available, but a high-level language code (fr) is available, the high-level language code is used.
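The fallback rule, together with the sourceLanguage behaviour described earlier, might be sketched as follows. This is one reading of the text rather than Metrici's actual algorithm: the function name and arguments are hypothetical, and the sketch does not cover the "browser" value, the cookie, or quality values in the Accept-Language header.

```python
def choose_language(preferred, available, source_language="en"):
    """Pick the language to translate into, or None to leave text as is.

    preferred: comma-delimited language codes, most preferred first.
    available: set of language codes present in the language pack.
    source_language: the language the solution was written in.
    """
    for code in (c.strip() for c in preferred.split(",")):
        if not code:
            continue
        if code == "*":
            return None  # accept any language: do not translate
        high_level = code.split("-")[0]
        # Try the full code first, then the high-level code.
        for candidate in (code, high_level):
            if candidate == source_language:
                return None  # the source text already suits this user
            if candidate in available:
                return candidate
    return None

print(choose_language("fr-CA, fr-FR", {"fr"}))
print(choose_language("en-US, en, fr-CA, fr-FR", {"fr", "fr-FR"}))
```

In the second call the user lists English first, so the sketch returns None and the untranslated English text is kept, matching the French Canadian example given for sourceLanguage.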

Conventions

Full language codes should be held as two lower case letters, a hyphen, and two upper case letters, e.g. fr-CA. This is the convention used in the IETF BCP 47 standard, which generally comprises an ISO 639-1 two-letter language code, a hyphen, and an ISO 3166 country code.
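A simple check for this convention could look like the Python sketch below. The helper names are hypothetical, and the check covers only the two-letter forms described here, not the full BCP 47 grammar.

```python
import re

# Two lower case letters, a hyphen, two upper case letters, e.g. fr-CA.
FULL_CODE = re.compile(r"[a-z]{2}-[A-Z]{2}")

def is_conventional_full_code(code):
    """True if code follows the full language code convention."""
    return FULL_CODE.fullmatch(code) is not None

def normalize(code):
    """Rewrite a code such as 'FR_ca' into the conventional 'fr-CA'.

    High-level codes such as 'FR' are simply lower-cased.
    """
    language, _, region = code.strip().replace("_", "-").partition("-")
    if region:
        return language.lower() + "-" + region.upper()
    return language.lower()

print(normalize("FR_ca"), is_conventional_full_code(normalize("FR_ca")))
```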

Translation identifiers should be of the form namespace-identifier, where namespace is based on a node reference that relates to the owner of the appropriate code (it does not have to match a node reference, but basing the namespace on a node you own prevents clashes). For example, system-defined elements have a namespace prefix of system-, but somecompany's HR application might use a prefix of somecompany.products.hr.

A translation set does not have to be limited to a single namespace, although generally it would focus on a single namespace. For example, it might need to override system-defined messages to make more sense to its users.