Source definition JSON

For ease of retrieval and processing, information from the source definition is gathered into a JSON structure held for each source system.

The schema is structured around entities (tables) and fields (columns). The schema maps from source entities and fields to target entities and fields.

There is one definition for each system that can send data to the data hub. A source system can send one or more entities. Each source entity can be mapped to one or more entities in the target data structures, and each entity mapping can contain multiple field mappings.

Information about the source defaults to that for the target. A JSON structure without source information is generated for all target entities, using a source system reference of "*".

Format of the schema

{
"reference": "system_reference",
"name": "System Name",
"description": "System description",
"mappings": {
"source_entity_reference": [
{
... source entity mapping object – see below ...
}, ... more ...
]
}
}

The schema is an object with the following properties.

reference A reference for the system. This should be the same as the reference used to build the schema file name.
name An optional name for the system. May be used for documentation. Defaults to the system reference.
description A description of the system.
mappings An object keyed by source entity reference. Each key holds the array of entity mapping objects for that source entity.
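As an illustration of how the mappings object is used, the following Python sketch loads a definition and looks up the target entities for one source entity. The system "crm", the entity references and the helper function names are all hypothetical, chosen only for this example.

```python
import json

# Hypothetical definition for an illustrative system "crm". All references
# and names here are made up for the example.
DEFINITION = json.loads("""
{
  "reference": "crm",
  "name": "CRM",
  "description": "Customer relationship management system",
  "mappings": {
    "cust": [
      {"reference": "customer", "sourceReference": "cust", "fields": []},
      {"reference": "address", "sourceReference": "cust", "fields": []}
    ]
  }
}
""")


def target_entities(definition, source_entity_reference):
    """Return the target entity references mapped from one source entity."""
    entity_mappings = definition["mappings"].get(source_entity_reference, [])
    return [mapping["reference"] for mapping in entity_mappings]


def system_name(definition):
    """The name is optional and defaults to the system reference."""
    return definition.get("name", definition["reference"])
```

Here target_entities(DEFINITION, "cust") lists both target entities mapped from the single source entity, and an unknown source entity reference gives an empty list.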

Source entity mapping object

{
"reference": "target_entity_reference",
"name": "Target Entity Name",
"description": "Target entity description",
"sourceReference": "source_entity_reference",
"sourceName": "Source Entity Name",
"sourceDescription": "Source entity description",
"messageReader": "message.reader.name",
"messageReaderConfig": "config",
"unique": true|false,
"uniqueIdentifier": "target_identifier_field_reference",
"sourceUniqueIdentifier": "source_identifier_field_reference",
"uniqueSequence": "target_sequence_field_reference",
"sourceUniqueSequence": "source_sequence_field_reference",
"retriever": "data.retriever.name",
"priority": priority,
"processor": "processor.name",
"processorConfig": "config",
"fields": [
... field objects – see below ...
]
}

The source entity mapping object maps a source system entity to a target entity. It has the following properties.

reference Reference to target entity (table name).
name Name of target entity. Optional. Used for documentation.
description Optional description of target entity.
sourceReference Reference by which the entity is known in the source. Defaults to the reference.
sourceName Name of source entity. Optional. Defaults to name.
sourceDescription Optional description of source entity, defaults to description.
messageReader

Optional identifier of class used to read message.

If there is more than one target entity, the message reader and message reader config specified for the first target entity are used for all target entities.

messageReaderConfig Config used to instantiate the message reader.
unique If set to true, indicates that this represents a unique set of records, for example transactions. Default is false.
uniqueIdentifier If unique is true, reference of field on the target table that identifies this set of records. This will be populated with the message identifier. The field should be a character field of at least 36 characters in length. This and the uniqueSequence should be on the list of fields for the target table, and should be identified as the key.
sourceUniqueIdentifier Reference of field on the source message to which the uniqueIdentifier will be written. Defaults to uniqueIdentifier.
uniqueSequence If unique is true, reference to integer field on the target table that is used to hold the record sequence.
sourceUniqueSequence Reference of field on the source message to which the uniqueSequence will be written. Defaults to uniqueSequence.
retriever

Optional identifier of class used to read the target table using fields from the source. This is used to resolve keys, i.e. work out whether an incoming record matches one or more existing records on the target entity. Not valid if unique is true. If unique is false, defaults to a class that retrieves existing records based on the identified key fields.

On the default system mapping, the retriever represents the definitive method to access the table using its keys. This is known as the golden retriever.

priority Controls whether messages from this source system/entity can delete records created by other systems, or re-create records deleted by them. Higher priorities take precedence. Defaults to 0.
processor

Optional identifier of class to post-process message.

processorConfig Config used to instantiate the processor.
fields An array of fields to be read from the record and how they should be mapped to the target tables.
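The defaulting rules above can be sketched in Python as follows. This is a minimal illustration, not the hub's implementation. The function names are hypothetical, as are the use of a UUID string as the message identifier (suggested by the 36-character requirement), a sequence starting at 1, and raising an error when a retriever is combined with unique.

```python
import uuid


def with_entity_defaults(mapping):
    """Return a copy of an entity mapping with the documented defaults applied."""
    resolved = dict(mapping)
    resolved.setdefault("sourceReference", resolved["reference"])
    resolved.setdefault("unique", False)
    resolved.setdefault("priority", 0)

    # Source-side properties default to their target-side counterparts.
    pairs = [("name", "sourceName"), ("description", "sourceDescription")]
    if resolved["unique"]:
        if "retriever" in resolved:
            # Assumption: reject the combination the text calls "not valid".
            raise ValueError("retriever is not valid when unique is true")
        pairs += [
            ("uniqueIdentifier", "sourceUniqueIdentifier"),
            ("uniqueSequence", "sourceUniqueSequence"),
        ]
    for target_prop, source_prop in pairs:
        if target_prop in resolved:
            resolved.setdefault(source_prop, resolved[target_prop])
    return resolved


def stamp_unique_records(mapping, records, message_id=None):
    """Write the message identifier and a record sequence onto each record."""
    # Assumption: the message identifier is a UUID string (36 characters).
    message_id = message_id or str(uuid.uuid4())
    stamped = []
    for sequence, record in enumerate(records, start=1):
        row = dict(record)
        row[mapping["sourceUniqueIdentifier"]] = message_id
        row[mapping["sourceUniqueSequence"]] = sequence
        stamped.append(row)
    return stamped
```

For a mapping with unique set to true, each record in a message receives the same identifier and an increasing sequence number, which together form the key on the target table.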

Field object

{
"reference": "target_field_reference",
"name": "Target Field Name",
"description": "Target field description",
"sourceReference": "source_field_reference",
"sourceName": "Source Field Name",
"sourceDescription": "Source field description",
"key": true|false,
"type": "text|number|smallint|integer|bigint|double|decimal|date|timestamp|boolean|link|children",
"mandatory": true|false,
"length": length,
"scale": scale,
"precision": precision,
"linkEntity": "target_link_entity_reference",
"sourceLinkEntity": "source_link_entity_reference",
"linkKey": "target_parent_field_reference",
"sourceLinkKey": "source_parent_field_reference",
"childEntity": "target_child_entity_reference",
"sourceChildEntity": "source_child_entity_reference",
"parentIdentifier": "target_parent_identifier_reference",
"sourceParentIdentifier": "source_parent_identifier_reference",
"childSequence": "target_child_sequence_field_reference",
"sourceChildSequence": "source_child_sequence_field_reference",
"priority": priority
}

The field object identifies a field in the source entity that should be mapped to the target entity. It has the following properties.

reference Reference to target field (column name).
name Name for target field. Optional, used in documentation. Defaults to the reference.
description Description of the field.
sourceReference Reference of field in source entity. Defaults to reference.
sourceName Name for source field. Optional, used in documentation. Defaults to the name.
sourceDescription Description of the source field. Optional, defaults to description.
key Set to true to indicate this field is part of the key of the record.
type

The data type of the target field.

number A general number data type. Implemented as a double.
text

A text data type. If length is set and is 255 or less, will be held as a varchar. Otherwise it will be held as a long text object of indeterminate length.

Trailing spaces are not considered significant and are removed from input data. This allows for consistency between fixed-length strings and variable-length strings in source systems.

smallint A signed 2-byte integer number.
integer A signed 4-byte integer number.
bigint A signed 8-byte integer number.
double A signed double-precision (8-byte) floating point number.
decimal A signed number with a fixed number of decimal places. The total number of digits is specified by the scale property, and the number of digits after the decimal point by the precision property.
date A date. In the incoming data this should be a string in the format yyyy-mm-dd.
timestamp A date and time. In the incoming data this should be a string in format yyyy-mm-ddThh:mm:ss.ttt. Note that there is a T between the date and time portion, but the more common database convention of using a single space in place of the T is permitted. Seconds and fractional seconds are optional.
boolean A true/false value. This can be boolean true or false, the strings "true" or "false", a non-zero number (true) or zero (false), or a string containing a number.
link

A foreign key relationship. The parent entity is identified by the linkEntity. The field to be used to look up the key is identified by the linkKey. If more than one field is involved in the key, the linkKey will contain an array of field references.

children

A one-to-many parent to child link, where the parent is part of the identifying key of the children.

mandatory

Set to true to indicate the field must always be present in the source.

length

For type of text, maximum length of the text. If omitted or 0, no maximum is applied.

scale For type of decimal, the total number of digits (including those after the decimal point).
precision For type of decimal, the number of digits after the decimal point.
linkEntity For type of link, the target reference of the entity whose retriever should be used to map the key.
sourceLinkEntity For type of link, the source reference of the entity whose retriever should be used to map the key. The first source entity mapping for the entity is used for key resolution. Defaults to linkEntity.
linkKey

For type of link, references of the field or fields required to resolve the link entity. Can be a single string or an array of strings.

This would generally match the references of the parent entity's keys.

sourceLinkKey

For type of link, source references of the field or fields required to resolve the link entity. Can be a single string or an array of strings.

These keys should appear in the input record, and would generally match the source references of the parent entity's keys.

Defaults to the linkKey.

childEntity For type of children, the reference to the entity which should be used for the child rows.
sourceChildEntity For type of children, the reference to the source entity which should be used for child rows. Defaults to childEntity.
parentIdentifier For type of children, the reference of the field on the children entity which holds the link back to the parent.
sourceParentIdentifier For type of children, reference of the parent identifier field to be added to the source entity. Defaults to parentIdentifier.
childSequence For type of children, reference of the field on the children entity which holds a sequence number. If not given, then no sequence number is generated.
sourceChildSequence For type of children, reference of the sequence field to be added to the source entity. Defaults to childSequence.
priority Controls the priority of values from different source systems and entities. A value with the same or a higher priority overwrites the existing field value. Defaults to the priority defined on the entity mapping.
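The field defaults and the priority rule can be sketched in Python as below. The function names are hypothetical, and normalising linkKey to a list is a convenience assumed for this example rather than something the definition requires.

```python
# Source-side field properties that default to their target-side counterparts.
_FIELD_DEFAULT_PAIRS = [
    ("description", "sourceDescription"),
    ("linkEntity", "sourceLinkEntity"),
    ("linkKey", "sourceLinkKey"),
    ("childEntity", "sourceChildEntity"),
    ("parentIdentifier", "sourceParentIdentifier"),
    ("childSequence", "sourceChildSequence"),
]


def with_field_defaults(entity_mapping, field):
    """Return a copy of a field object with the documented defaults applied."""
    resolved = dict(field)
    resolved.setdefault("name", resolved["reference"])
    resolved.setdefault("sourceReference", resolved["reference"])
    resolved.setdefault("sourceName", resolved["name"])
    # The field priority defaults to the entity mapping priority (itself 0).
    resolved.setdefault("priority", entity_mapping.get("priority", 0))
    for target_prop, source_prop in _FIELD_DEFAULT_PAIRS:
        if target_prop in resolved:
            resolved.setdefault(source_prop, resolved[target_prop])
    # Link keys may be a single string or an array of strings. Always
    # expose them as a list (an assumption made for convenience).
    for prop in ("linkKey", "sourceLinkKey"):
        if prop in resolved:
            keys = resolved[prop]
            resolved[prop] = [keys] if isinstance(keys, str) else list(keys)
    return resolved


def should_overwrite(incoming_priority, existing_priority):
    """Values with the same or a higher priority overwrite existing values."""
    return incoming_priority >= existing_priority
```

With an entity mapping priority of 5 and no priority on the field, the field inherits 5, so its values overwrite existing values of priority 5 or lower.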

The source system, entity and field may also have a node property, which is used in the generation process and should not be relied on by systems that consume the definitions.
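The incoming value formats described for the text, boolean, date and timestamp types can be illustrated with the Python sketch below. It is not the hub's own conversion code. The function names are hypothetical, accepting one to three fractional-second digits is an assumption based on the "ttt" in the documented format, and other types are passed through unchanged.

```python
import re
from datetime import date, datetime

# "T" or a single space separates date and time. Seconds and fractional
# seconds are optional. One to three fractional digits is an assumption.
_TIMESTAMP = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[T ](\d{2}):(\d{2})(?::(\d{2})(?:\.(\d{1,3}))?)?"
)


def to_text(value):
    """Trailing spaces are not significant, so remove them."""
    return value.rstrip(" ")


def to_boolean(value):
    """Accept booleans, numbers, "true"/"false", or a string holding a number."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if value == "true":
        return True
    if value == "false":
        return False
    # A string containing a number: non-zero is true, zero is false.
    return float(value) != 0


def to_timestamp(value):
    """Parse yyyy-mm-ddThh:mm[:ss[.ttt]], with "T" or a space as separator."""
    match = _TIMESTAMP.fullmatch(value)
    if match is None:
        raise ValueError(f"not a timestamp: {value!r}")
    year, month, day, hour, minute, second, fraction = match.groups()
    # Pad the fraction to milliseconds, then convert to microseconds.
    microseconds = int((fraction or "").ljust(3, "0")) * 1000
    return datetime(int(year), int(month), int(day), int(hour), int(minute),
                    int(second or 0), microseconds)


_CONVERTERS = {
    "text": to_text,
    "boolean": to_boolean,
    "date": date.fromisoformat,  # expects yyyy-mm-dd
    "timestamp": to_timestamp,
}


def coerce(field, value):
    """Convert a value according to the field type. Other types pass through."""
    converter = _CONVERTERS.get(field["type"])
    return value if converter is None else converter(value)
```

For example, coerce({"reference": "code", "type": "text"}, "AB  ") drops the trailing spaces, and both "2024-01-31T09:15" and "2024-01-31 09:15" parse to the same timestamp.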