Guidance for FHIR IG Creation
0.1.0 - CI Build
Guidance for FHIR IG Creation, published by HL7 International - FHIR Management Group. This guide is not an authorized publication; it is the continuous build for version 0.1.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/FHIR/ig-guidance/ and changes regularly. See the Directory of published versions
The IG publisher can generate a set of resources in a test data directory from a table of data. Each factory is controlled by a JSON file that sets up the parameters for the factory.
There are two kinds of factories: those driven by a liquid template, and those driven by a profile.
Note that you can have multiple factories that reuse the same source data tables.
A liquid template factory is conceptually simple: a liquid template constructs an instance of a resource from a set of data. The author provides the liquid script, and the data generation is predictable - it is based only on what the template and the data source provide. E.g. the data source has a set of columns, and the liquid template refers to the columns, laying them out in the resource. One instance is created for each row in the data source.
Liquid templates use the FHIR variant of the basic liquid syntax, which uses FHIRPath for expressions in the liquid template.
The liquid template must fully populate the resource, though if it leaves the Resource.id out, an autogenerated id will be added. The liquid template can produce either XML or JSON. In the case of JSON, the resource is treated as JSON5 and converted to normal JSON after it is run - this means that you don't have to get the commas correct in the generated json.
Profile based generation works differently - there is no script laying out the content. Instead, the instances are generated based on the defined profile, including fixed values, pattern values and bindings. The data used in these generated test instances comes from one of three sources (in order of preference):
The details of how the locally provided data works are described below.
Test Data Factories are defined using the parameter test-data-factories:
<parameter>
  <code>
    <system value="http://hl7.org/fhir/tools/CodeSystem/ig-parameters"/>
    <code value="test-data-factories"/>
  </code>
  <value value="factories/factories.json"/>
</parameter>
Multiple test-data-factories parameters are allowed, but since each JSON file can define multiple factories, there's usually only one entry. By convention, factories are defined in the folder 'factories', but this is not required. The value points to a JSON file with this format:
{
  "factories-version" : 1,
  "factories" : [{
    // one entry for each factory
  }]
}
Each entry in the factory control file has the following format:
{
  "name" : "{factory-name}",
  "type" : "liquid|profile",
  "liquid" : "{template-file}",
  "profile" : "{url}",
  "data" : "{data-source}",
  "filename" : "{filename}",
  "format" : "json|xml",
  "bundle" : true|false,
  "tables" : {
    "name" : "{data-source}"
  },
  "filter" : "{fhirpath expression}",
  "mappings" : [{
    // mapping details - see below
  }]
}
where:
name (mandatory): the name of the factory
type (mandatory): whether to use a liquid template or the profile driven factory
data (mandatory): a path to a source data table containing the data to drive generation (see below)
liquid (if liquid): a relative path to a liquid template that builds a resource
profile (if profile): the URL of a profile to use as the template for generating the instance
filename (mandatory): a script that controls the name of the output file (see below)
format (optional): the format of the generated file (doesn't have to match the format that a liquid template produces)
bundle (optional): if true, the generated resources will be wrapped into a bundle and only a single file created
tables (optional): other named data tables that are made available to a liquid template (see below)
filter (optional): if present, a FHIRPath expression that must evaluate to true or the row is ignored when processing the source data
mappings (if profile): describes how the data table maps into the generated instances (described below)
Source Data can be provided in multiple different forms:
;{name} to the filename. In the absence of a sheet name, the first sheet will be used.
;{name} to the filename (required)
For both .csv and .xlsx, the first row contains the names of the columns.
For all data sources, an additional column named counter is created, which is the index of the current row, a serially incrementing number starting at 1. None of the data sources can provide a column named 'counter' of their own.
The output filename controls where the generated data goes. It is a relative path (relative to the repository root folder).
When bundle=true, it's a static filename for the single bundle produced by the generation. In the case where individual resources are produced, the filename is a script that looks like this: test/Patient-{$counter$}.json, where any $xxx$ will be interpreted as a reference to a named column in the primary data source.
A log of the process of running the test data factory will be generated in output/qa-factory-$log-name$.txt. One reason it's provided is to help users see the paths used in the profile generation (described below).
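As a rough illustration of how the fields described above fit together, a control file defining a single liquid factory might look like the following. This is only a sketch: the factory name, the template path and the CSV path are hypothetical placeholders rather than files from this guide, and only the field names and the filename pattern are taken from the description above.

```json
{
  "factories-version" : 1,
  "factories" : [{
    "name" : "HypotheticalPatients",
    "type" : "liquid",
    "liquid" : "factories/hypothetical-patient.liquid",
    "data" : "factories/hypothetical-patients.csv",
    "filename" : "test/Patient-{$counter$}.json",
    "format" : "json",
    "bundle" : false
  }]
}
```

With bundle set to false, one output file would be expected per row of the data table, with the counter column substituted into each filename.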
The liquid template must produce resources in the specified format. If the template produces JSON, the commas do not need to be correct - the json is reprocessed once the liquid script is complete to fix up the commas (it must produce valid json5 output).
Each row of the data table is passed to the liquid template as a 'row' object whose properties are the named columns in the data table. E.g. if the data table has a column called name, then the liquid statement {{ row.name }} inserts the value of the name column in the row. The data table should not contain any column names containing spaces or '-'.
The liquid template can produce a resource of any type (it doesn't have to produce the same type each time). If bundle=true, the liquid template should not produce a Bundle resource unless the desire is to have a Bundle of Bundles - the liquid script will run once for each row of data.
The tables section of the configuration contains a list of named files. The data in the files will be available in the liquid template using [name].cell(row, col) where:
[name] is the name given to the file in the tables section
cell(row, col) gives access to the data. row is an integer (1 based), and col is either an integer (1 based) or a column name
lookup(lookupCol, value, outputCol) looks up a value in lookupCol, and returns the value in outputCol (or null)
In addition, a global object is available as Globals, which has the following properties:
dateTime: the date and time, in FHIR format, of the instant that processing started
path: the path to the base FHIR specification (correct version path)
In profile mode, the instances are generated based on the information in the profile. The tighter the profile, the more coherent the generated instance will be.
If a primary data source is provided, an instance will be generated for each row in it. The intention of the primary data source is to support user provided information. For this reason, there is a mapping table that maps between the source data and proper FHIR data. The intention here is to let non-technical (e.g. clinical) users provide the sample data.
Because the data providers aren't expected or required to be technical, here's a list of things the mapping script can do to massage the data into shape ready to go in a resource:
Each mapping entry looks like this:
{
  "path" : "{path}",
  "fhirType" : "{type}",
  "if" : "{fhirpath expression}",
  "expression" : "{fhirpath expression}",
  "parts" : [{
    "name" : "{prop-name}",
    "expression" : "{fhirpath expression}"
  }]
}
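For illustration only, a filled-in mappings array inside a profile factory entry might look something like the fragment below. The path values are assumptions made for this sketch, and the column names (patientId, Date of Birth) are simply the ones used as examples elsewhere on this page; the exact path strings for a given profile are best taken from the factory log rather than from this example.

```json
"mappings" : [{
  "path" : "Patient.identifier.value",
  "expression" : "patientId"
}, {
  "path" : "Patient.birthDate",
  "expression" : "column('Date of Birth', 'M/d/yyyy')"
}]
```

The first entry refers to a column by its bare name, while the second uses column() with a date format because the column name contains spaces and the values are not in FHIR date format.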
Documentation:
path: the path in the generated instance where the data will go
fhirType: use when the type is polymorphic and not fixed in the profile. Can be either the name of a type, or a FHIRPath expression that returns the name of a type
if: if this is present, evaluate the expression, and only use the entry if the result is true
expression: an expression which evaluates to the value. See below for details
parts: a series of named expressions where the name of each part corresponds to a property name of a type
There are three ways to refer to a column from the source data in the expression:
"expression" : "patientId" where patientId is the name of the column in the source data
"expression" : "column('Patient ID')" where Patient ID is the name of the column in the source data
"expression" : "column('Date of Birth', 'M/d/yyyy')" where Date of Birth is the name of the column in the source data, and M/d/yyyy is the format of the column. For format advice, see https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html.
Notes:
This IG includes some examples. You can find the output from the examples in the package, or you can look in the package source to see how they work.
Name | Mode | Script | Flags | Description |
LiquidDemo | liquid | factories/patient.liquid | json Bundle | A simple liquid script showing how to look up a random value in a table |
PatientGenerator | profile | http://hl7.org/fhir/uv/howto/StructureDefinition/test-patient-profile | json | Generate instances based on a profile in the IG, and fill out values from an excel spreadsheet |
EncounterGenerator | liquid | factories/encounter.liquid | json | Another liquid script showing how to do conditional content |
BloodPressureGenerator | profile | http://hl7.org/fhir/StructureDefinition/bp | json | A more complex example. Since this is a wide open profile, a lot of what the mappings do is suppress columns |
WeightGenerator | profile | http://hl7.org/fhir/StructureDefinition/bodyweight | json | Shows how to do conditional content depending on the content of the spreadsheet |
WarfarinGenerator | profile | http://hl7.org/fhir/StructureDefinition/MedicationStatement | json | Shows how to filter the rows in the first place |