Guidance for FHIR IG Creation
0.1.0 - CI Build
Guidance for FHIR IG Creation, published by HL7 International - FHIR Management Group. This guide is not an authorized publication; it is the continuous build for version 0.1.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/FHIR/ig-guidance/ and changes regularly. See the Directory of published versions
The IG publisher can generate a set of resources in a test data directory from a spreadsheet. Each such test data factory is controlled by an ini file that sets up its parameters.
There are two kinds of factories: liquid template based and profile based.
Note that you can have multiple factories that reuse the same data files.
A liquid template is conceptually simple: it constructs an instance of a resource from a set of data. The author provides the template, and the data generation is predictable, based only on what the template and the data source provide. E.g. the data source has a set of columns and the liquid template refers to the columns, laying them out in the resource. One instance is created for each row in the data source.
Liquid templates use the FHIR variant of the basic liquid syntax, which uses FHIRPath for expressions in the liquid template.
The liquid template must fully populate the resource, though if it leaves the Resource.id out, an autogenerated id will be added. The liquid template can produce either XML or JSON. In the case of JSON, the output is treated as JSON5 and converted to normal JSON after the template is run. This means that you don't have to get the commas correct in the generated JSON.
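The one-instance-per-row behaviour described above can be sketched in Python. This is not Liquid and not the IG publisher's code: the function render_patient is a hypothetical stand-in for the author's template, the column names family and given are invented for illustration, and a random UUID is only an assumption about what an autogenerated id might look like.

```python
import csv
import io
import uuid

# Hypothetical data source: the first row holds the column names.
DATA = "family,given\nSmith,Anna\nJones,Ben\n"

def render_patient(row):
    """Stand-in for the author's template: lays the columns out in a resource."""
    return {
        "resourceType": "Patient",
        "name": [{"family": row["family"], "given": [row["given"]]}],
    }

def build_instances(csv_text):
    """Create one resource per data row, adding an id when the template omits it."""
    instances = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        resource = render_patient(row)
        # Assumption: a random UUID stands in for the autogenerated id.
        resource.setdefault("id", str(uuid.uuid4()))
        instances.append(resource)
    return instances

instances = build_instances(DATA)
print(len(instances), instances[0]["name"][0]["family"])
```

Keeping the template as a pure function of one row mirrors the point made above: the output depends only on the template and the data source.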
Profile based generation works differently - there is no script laying out the content. Instead, the instances are generated based on the defined profile, including fixed values, pattern values and bindings. The data used in these generated test instances comes from one of three sources (in order of preference):
The details of how the locally provided data works are described below.
[factory]
type=liquid|template
data={data-file}
liquid={template-file}
profile={url}
mappings={mapping-file}
filename={filename}
format=json | xml
bundle=true|false
log={log-name}
[table]
name=file
where:
type: whether to use a liquid template or the profile driven factory
data-file: A relative path to a CSV or Excel file containing the data, where the first row contains the names of the columns
liquid: a relative path to a liquid template that builds a resource
profile: the URL of a profile to use as the template for generating the instance
mapping: A json file describing how the data file maps into the generated instances (described below)
filename: A script that controls the name of the output file (see immediately below)
format: the format of the generated file (doesn't have to match the format that a liquid template produces)
bundle: if true, the generated resources will be wrapped into a bundle and only a single file created
log: the name by which the factory should be logged (see below)
Also, you can nominate other tables, where the table is a relative path to a CSV or Excel file containing a table of data.
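As an illustration of the settings above, the following Python sketch parses a sample factory definition with the standard configparser module. The file names, the log name and the chosen values are hypothetical, and this is only a way to inspect such a file, not how the IG publisher itself reads it.

```python
import configparser

# Hypothetical factory definition; the paths and names are illustrative only.
SAMPLE_INI = """
[factory]
type=liquid
data=data/patients.csv
liquid=templates/patient.liquid
filename=test/$type$-$id$.json
format=json
bundle=false
log=patients
"""

def load_factory(text):
    """Return the [factory] settings as a dict, with 'bundle' as a boolean."""
    # interpolation=None keeps '$' and '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    factory = dict(parser["factory"])
    # 'bundle' is documented as true|false, so read it as a boolean.
    factory["bundle"] = parser["factory"].getboolean("bundle", fallback=False)
    return factory

factory = load_factory(SAMPLE_INI)
print(factory["type"], factory["filename"], factory["bundle"])
```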
The output filename controls where the generated data goes. It is a relative path (relative to the repository root folder).
When bundle=true, it's a static filename for the single bundle produced by the generation. In the case where individual resources are produced, the filename is a script that looks like this: test/$type$-$id$.json
The following variables can be used in the filename:
$type$: the resource type
$id$: the id of the resource
$counter$: a factory scoped serially incrementing counter starting at 1
$format$: either json or xml depending on the format for the factory
A log of the process of running the test data factory will be generated in output/qa-factory-$log-name$.txt. One reason it's provided is to help users see the paths in the profile generation (used below).
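The filename script and its variables, described above, can be sketched as a simple token substitution. This is a minimal Python illustration under assumptions: the function name expand_filename is hypothetical, and leaving unknown $name$ tokens untouched is a choice made here, not documented behaviour of the IG publisher.

```python
import re

def expand_filename(script, resource_type, resource_id, counter, fmt):
    """Substitute the $type$, $id$, $counter$ and $format$ variables in a filename script."""
    values = {
        "type": resource_type,
        "id": resource_id,
        "counter": str(counter),
        "format": fmt,
    }
    # Replace each $name$ token; unknown names are left untouched (an assumption).
    return re.sub(
        r"\$(\w+)\$",
        lambda m: values.get(m.group(1), m.group(0)),
        script,
    )

# The counter is scoped to the factory and starts at 1.
names = [
    expand_filename("test/$type$-$id$.$format$", "Patient", pid, n, "json")
    for n, pid in enumerate(["p1", "p2"], start=1)
]
print(names)
```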
Column names in the spreadsheet should not contain spaces or '-', and the sheet cannot contain a column named 'counter'. Otherwise, a data mapping file must be used (see below).
For a liquid template, the template does not need to get the commas correct in JSON: the output is reprocessed once the liquid script is complete to fix up the commas (it must produce valid JSON5 output).
The [tables] section in the ini file contains a list of named files.
The data in the files will be available in the liquid template using [name].cell(row, col)
where:
[name] is the name in the ini file
cell(row,col) gives access to the data. Row is an integer (1 based), and column is either an integer (1 based) or a name
In profile based generation, the instances are generated based on the information in the profile. The tighter the profile, the more coherent the generated instance will be.
The intention of the spreadsheet approach is to support a user provided database. For this reason, the source has two parts: the source data, and a mappings script that describes how data in the spreadsheet is converted to FHIR data. This lets non-technical (e.g. clinical) users provide the sample data. One instance is created per table row.
Because the data providers aren't expected or required to be technical, here's a list of things the mapping script can do to massage the data into shape ready to go in a resource:
The mapping script looks like this:
{
"format-version" : 1,
"values" : [{
"path" : "{path}",
"source" : [{
"property" : "{prop-name}",
"column" : "{name}",
"regex" : "{regex}",
"constant" : "value"
}]
}]
}
Documentation:
path: the path in the generated instance where the data will go. The path must match the correct path from the generation log
values: one or more source columns in the spreadsheet that contribute source to this value
property: When provided, the property names must match the names of the FHIR properties e.g. code, or period.start
column: the name in the provided main data spreadsheet
regex: a regex that extracts the data from the column
constant: Sometimes a fixed value is needed - e.g. providing a code system URL. In this case, provide a constant rather than a column (and no regex)
Note that a single column can appear in the values list more than once, usually with different regexes.
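To make the column, regex and constant options concrete, here is a minimal Python sketch of how one entry in the values list could be resolved against one data row. It is an assumption-laden illustration, not the IG publisher's implementation: the function resolve_value, the column name test, the sample path, and the rule that the first capture group (or the whole match when there is none) is the extracted value are all hypothetical.

```python
import re

def resolve_value(entry, row):
    """Build {property: value} for one 'values' entry from one data row."""
    result = {}
    for src in entry["source"]:
        if "constant" in src:
            # A constant replaces the column (and no regex is applied).
            value = src["constant"]
        else:
            value = row[src["column"]]
            if "regex" in src:
                match = re.search(src["regex"], value)
                if match is None:
                    continue  # Assumption: skip sources whose regex does not match.
                # Assumption: first capture group if present, else the whole match.
                value = match.group(1) if match.groups() else match.group(0)
        result[src.get("property", "")] = value
    return result

# Hypothetical entry: the same column appears twice with different regexes.
entry = {
    "path": "Observation.code.coding",  # illustrative path, not from a real log
    "source": [
        {"property": "system", "constant": "http://loinc.org"},
        {"property": "code", "column": "test", "regex": r"^(\S+)"},
        {"property": "display", "column": "test", "regex": r"^\S+\s+(.*)$"},
    ],
}
coding = resolve_value(entry, {"test": "718-7 Hemoglobin"})
print(coding)
```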
Examples: