Bulk Data Access IG, published by HL7 International / FHIR Infrastructure. This guide is not an authorized publication; it is the continuous build for version 2.0.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/bulk-data/ and changes regularly. See the Directory of published versions
This implementation guide is intended to be used by developers of backend services (clients) and FHIR Resource Servers (e.g., EHR systems, data warehouses, and other clinical and administrative systems) that aim to interoperate by sharing large FHIR datasets. The guide defines the application programming interfaces (APIs) through which an authenticated and authorized client may request a Bulk Data Export from a server, receive status information regarding progress in the generation of the requested files, and retrieve these files. It also includes recommendations regarding the FHIR resources that might be exposed through the export interface.
The scope of this document does NOT include:
This profile inherits terminology from the standards referenced above. The key words "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this specification are to be interpreted as described in RFC2119.
All exchanges described herein between a client and a server SHALL be secured using Transport Layer Security (TLS) Protocol Version 1.2 (RFC5246) or a more recent version of TLS. Use of mutual TLS is OPTIONAL.
With each of the requests described herein, implementers SHOULD implement OAuth 2.0 access management in accordance with the SMART Backend Services Authorization Profile. When SMART Backend Services Authorization is used, Bulk Data Status Request and Bulk Data Output File Requests with requiresAccessToken=true SHALL be protected the same way as the Bulk Data Kick-off Request, including an access token with scopes that cover all resources being exported. A server MAY additionally restrict Bulk Data Status Request and Bulk Data Output File Requests by limiting them to the client that originated the export. Implementations MAY include endpoints that use authorization schemes other than OAuth 2.0, such as mutual-TLS or signed URLs.
This implementation guide does not address protection of a server from potential compromise. An adversary who successfully captures administrative rights to the server will have full control over that server and can use those rights to undermine the server's security protections. In the Bulk Data Export workflow, the file server will be a particularly attractive target, as it holds highly sensitive and valued PHI. An adversary who successfully takes control of a file server may choose to continue to deliver files in response to client requests, so that neither the client nor the FHIR server is aware of the take-over. Meanwhile, the adversary is able to put the PHI to use for its own malicious purposes.
Healthcare organizations have an imperative to protect PHI persisted in file servers in both cloud and data-center environments. A range of existing and emerging approaches can be used to accomplish this, not all of which would be visible at the API level. This specification does not dictate a particular approach at this time, though it does support the use of an Expires header to limit the time period a file will be available for client download (removal of the file from the server is left up to the server implementer). A server SHOULD NOT delete files from a Bulk Data response that a client is actively in the process of downloading regardless of the pre-specified expiration time.
Data access control obligations can be met with a combination of in-band restrictions (e.g., OAuth scopes), and out-of-band restrictions, where the server limits the data returned to a specific client in accordance with local considerations (e.g. policies or regulations). The FHIR server SHALL limit the data returned to only those FHIR resources for which the client is authorized. Implementers SHOULD incorporate technology that preserves and respects an individual's wishes to share their data with desired privacy protections. For example, some clients are authorized to access sensitive mental health information and some aren't; this authorization is defined out-of-band, but when a client requests a full data set, filtering is automatically applied by the server, restricting the data that the client receives.
Bulk Data Export can be a resource-intensive operation. Server developers SHOULD consider and mitigate the risk of intentional or inadvertent denial-of-service attacks though the details are beyond the scope of this specification. For example, transactional systems may wish to provide Bulk Data access to a read-only mirror of the database or may distribute processing over time to avoid loads that could impact clinical operations.
This implementation guide builds on the FHIR Asynchronous Request Pattern, and in some places may extend the pattern.
There are two primary roles involved in a Bulk Data transaction:
Bulk Data Provider - consists of:
a. FHIR Authorization Server - server that issues access tokens in response to valid token requests from client.
b. FHIR Resource Server - server that accepts kick-off request and provides job status and completion manifest.
c. Output File Server - server that returns FHIR Bulk Data files and attachments in response to URLs in the completion manifest. This may be built into the FHIR Server, or may be independently hosted.
Bulk Data Client - system that requests and receives access tokens and Bulk Data files
The Bulk Data Export Operation initiates the asynchronous generation of a requested export dataset - whether that be data for all patients, data for a subset (defined group) of patients, or all FHIR data in the server.
As discussed in Privacy and Security Considerations above, a server SHALL limit the data returned to only those FHIR resources for which the client is authorized.
The FHIR Resource Server SHALL support invocation of this operation using the FHIR Asynchronous Request Pattern. A server SHALL support GET requests and MAY support POST requests that supply parameters using the FHIR Parameters Resource.
A client MAY repeat kick-off parameters that accept comma delimited values multiple times in a kick-off request. The server SHALL treat the values provided as if they were comma delimited values within a single instance of the parameter. Note that we will be soliciting feedback on the use of comma delimited values within parameters, and depending on the response may consider deprecating this input approach in favor of repeating parameters in a future version of this IG.
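The equivalence between repeated and comma-delimited parameters can be illustrated with a minimal server-side sketch. The function name `merge_repeated_params` and the default list of comma-delimited parameter names are assumptions made for illustration; they are not defined by this guide.

```python
from urllib.parse import parse_qsl

# Assumption: these are the kick-off parameters treated as comma-delimited.
COMMA_DELIMITED = ("_type", "_elements", "includeAssociatedData")


def merge_repeated_params(query_string, names=COMMA_DELIMITED):
    """Collapse repeated comma-delimited parameters into one list per name."""
    merged = {}
    for name, value in parse_qsl(query_string):
        if name in names:
            # Split each instance on commas and append in order of appearance,
            # so repeated parameters behave like a single comma-delimited one.
            merged.setdefault(name, []).extend(
                part for part in value.split(",") if part
            )
    return merged


repeated = merge_repeated_params("_type=Patient,Observation&_type=Condition")
single = merge_repeated_params("_type=Patient,Observation,Condition")
```

In this sketch both calls produce the same list of resource types, which is the behavior the paragraph above requires of a server.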
For Patient-level requests and Group-level requests associated with groups of patients, the Patient Compartment SHOULD be used as a point of reference for recommended resources to be returned and, where applicable, Patient resources SHOULD be returned. Other resources outside of the patient compartment that are helpful in interpreting the patient data (such as Organization and Practitioner) MAY also be returned.
Binary Resources whose content is associated with an individual patient SHALL be serialized as DocumentReference Resources with the content.attachment element populated as described in the Attachments section below. Binary Resources not associated with an individual patient MAY be included in a System Level export.
References in the resources returned MAY be relative URLs with the format <resource type>/<id>, or MAY be absolute URLs with the same structure rooted in the base URL for the server from which the export was performed.
[fhir base]/Patient/$export
View table of parameters for Patient Export
FHIR Operation to obtain a detailed set of FHIR resources of diverse resource types pertaining to all patients.
[fhir base]/Group/[id]/$export
View table of parameters for Group Export
FHIR Operation to obtain a detailed set of FHIR resources of diverse resource types pertaining to all members of a specified Group.
If a FHIR server supports Group-level data export, it SHOULD support reading and searching for Group resources. This enables clients to discover available groups based on stable characteristics such as Group.identifier.
Note: How these Groups are defined is specific to each FHIR system's implementation. For example, a payer may send a healthcare institution a roster file that can be imported into their EHR to create or update a FHIR group. Group membership could be based upon explicit attributes of the patient, such as age, sex or a particular condition such as PTSD or Chronic Opioid use, or on more complex attributes, such as a recent inpatient discharge or membership in the population used to calculate a quality measure. FHIR-based group management is out of scope for the current version of this implementation guide.
[fhir base]/$export
View table of parameters for Export
Export data from a FHIR server, whether or not it is associated with a patient. This supports use cases like backing up a server, or exporting terminology data by restricting the resources returned using the _type parameter.
Accept (string)
Specifies the format of the optional FHIR OperationOutcome resource response to the kick-off request. Currently, only application/fhir+json is supported. A client SHOULD provide this header. If omitted, the server MAY return an error or MAY process the request as if application/fhir+json was supplied.
Prefer (string)
Specifies whether the response is immediate or asynchronous. Currently, only a value of respond-async is supported. A client SHOULD provide this header. If omitted, the server MAY return an error or MAY process the request as if respond-async was supplied.
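As an illustration of the two headers, the following sketch builds (but does not send) a kick-off request with Python's standard library. The function name `build_kickoff_request`, the example base URL, and the use of a bearer token in an Authorization header are assumptions; the bearer token applies only when OAuth 2.0 access management is in use.

```python
from urllib.request import Request


def build_kickoff_request(fhir_base, access_token, path="Patient/$export"):
    """Build a GET kick-off request carrying the Accept and Prefer headers.

    `path` is the operation path relative to the FHIR base, for example
    "Patient/$export", "Group/[id]/$export" with a real id, or "$export".
    """
    url = f"{fhir_base.rstrip('/')}/{path}"
    headers = {
        "Accept": "application/fhir+json",   # format of any OperationOutcome
        "Prefer": "respond-async",           # request asynchronous processing
        "Authorization": f"Bearer {access_token}",  # assumption: OAuth 2.0
    }
    return Request(url, headers=headers, method="GET")


# Hypothetical server and token, for illustration only.
request = build_kickoff_request("https://example.com/fhir/", "example-token")
```

Nothing is sent over the network here; a client would pass the resulting object to its HTTP layer and then read the Content-Location header of the response.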
Query Parameter | Optionality for Server | Optionality for Client | Cardinality | Type | Description |
---|---|---|---|---|---|
_outputFormat |
required | optional | 0..1 | String | The format for the requested Bulk Data files to be generated as per FHIR Asynchronous Request Pattern. Defaults to application/fhir+ndjson . The server SHALL support Newline Delimited JSON, but MAY choose to support additional output formats. The server SHALL accept the full content type of application/fhir+ndjson as well as the abbreviated representations application/ndjson and ndjson . |
_since |
required | optional | 0..1 | FHIR instant | Resources will be included in the response if their state has changed after the supplied time (e.g., if Resource.meta.lastUpdated is later than the supplied _since time). In the case of a Group level export, the server MAY return additional resources modified prior to the supplied time if the resources belong to the patient compartment of a patient added to the Group after the supplied time (this behavior SHOULD be clearly documented by the server). For Patient- and Group-level requests, the server MAY return resources that are referenced by the resources being returned regardless of when the referenced resources were last updated. For resources where the server does not maintain a last updated time, the server MAY include these resources in a response irrespective of the _since value supplied by a client. |
_type |
optional | optional | 0..* | string of comma-delimited FHIR resource types | The response SHALL be filtered to only include resources of the specified resource type(s). If this parameter is omitted, the server SHALL return all supported resources within the scope of the client authorization, though implementations MAY limit the resources returned to specific subsets of FHIR, such as those defined in the US Core Implementation Guide. For Patient- and Group-level requests, the Patient Compartment SHOULD be used as a point of reference for recommended resources to be returned. However, other resources outside of the Patient Compartment that are referenced by the resources being returned and would be helpful in interpreting the patient data MAY also be returned (such as Organization and Practitioner). When this behavior is supported, a server SHOULD document this support (for example, as narrative text, or by including a GraphDefinition Resource). A server that is unable to support _type SHOULD return an error and FHIR OperationOutcome resource so the client can re-submit a request omitting the _type parameter. If the client explicitly asks for export of resources that the Bulk Data server doesn't support, or asks for only resource types that are outside the Patient Compartment, the server SHOULD return details via a FHIR OperationOutcome resource in an error response to the request. When a Prefer: handling=lenient header is included in the request, the server MAY process the request instead of returning an error. For example, _type=Observation could be used to filter a given export response to return only FHIR Observation resources. |
_elements |
optional, experimental | optional | 0..* | string of comma-delimited FHIR Elements | When provided, the server SHOULD omit unlisted, non-mandatory elements from the resources returned. Elements SHOULD be of the form [resource type].[element name] (e.g., Patient.id ) or [element name] (e.g., id ) and only root elements in a resource are permitted. If the resource type is omitted, the element SHOULD be returned for all resources in the response where it is applicable. A server is not obliged to return just the requested elements. A server SHOULD always return mandatory elements whether they are requested or not. A server SHOULD mark the resources with the tag SUBSETTED to ensure that the incomplete resource is not actually used to overwrite a complete resource. A server that is unable to support _elements SHOULD return an error and FHIR OperationOutcome resource so the client can re-submit a request omitting the _elements parameter. When a Prefer: handling=lenient header is included in the request, the server MAY process the request instead of returning an error.
|
patient (POST requests only) |
optional | optional | 0..* | FHIR Reference | Not applicable to system level export requests. When provided, the server SHALL NOT return resources in the patient compartments belonging to patients outside of this list. If a client requests patients who are not present on the server (or in the case of a group level export, who are not members of the group), the server SHOULD return details via a FHIR OperationOutcome resource in an error response to the request. A server that is unable to support patient SHOULD return an error and FHIR OperationOutcome resource so the client can re-submit a request omitting the patient parameter. When a Prefer: handling=lenient header is included in the request, the server MAY process the request instead of returning an error.
|
includeAssociatedData |
optional, experimental | optional | 0..* | string of comma delimited values | When provided, a server with support for the parameter and requested values SHALL return or omit a pre-defined set of FHIR resources associated with the request. A server that is unable to support the requested includeAssociatedData values SHOULD return an error and FHIR OperationOutcome resource so the client can re-submit a request that omits those values (for example, if a server does not retain provenance data). When a Prefer: handling=lenient header is included in the request, the server MAY process the request instead of returning an error. A client MAY include one or more of the following values. If multiple conflicting values are included, the server SHALL apply the least restrictive value (value that will return the largest dataset).
|
_typeFilter |
optional | optional | 0..* | string of a FHIR REST API query | When provided, a server with support for the parameter and the requested search parameters SHALL filter the data in the response for resource types referenced in the _typeFilter expression to only include resources that meet the specified criteria. FHIR search response parameters such as _include and _sort SHALL NOT be used. See details below. A server unable to support the requested _typeFilter queries SHOULD return an error and FHIR OperationOutcome resource so the client can re-submit a request that omits those queries. When a Prefer: handling=lenient header is included in the request, the server MAY process the request instead of returning an error.
|
organizeOutputBy |
optional | optional | 0..1 | string of a FHIR resource type | When provided, a server with support for the parameter SHALL organize the resources in output files by instances of the specified resource type, including a header for each resource of the type specified in the parameter, followed by the resource and resources in the output that contain references to that resource. When omitted, servers SHALL organize each output file with resources of only single type. See details below. A server unable to structure output by the requested organizeOutputBy resource SHOULD return an error and FHIR OperationOutcome resource. When a Prefer: handling=lenient header is included in the request, the server MAY process the request instead of returning an error.
|
allowPartialManifests |
optional | optional | 0..1 | boolean | When provided, a server with support for the parameter MAY return a portion of bulk data output files to a client prior to all output files being available and/or MAY distribute bulk data output files among multiple manifests and provide links for clients to page through the manifests. See details below. |
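The _outputFormat row above requires a server to accept three spellings of the NDJSON format. A minimal sketch of that normalization follows; the function name `normalize_output_format`, the case-insensitive comparison, and the choice to raise `ValueError` for other formats are assumptions (a server MAY support additional formats and would extend the mapping accordingly).

```python
# The three representations the server SHALL accept, per the table above.
NDJSON_ALIASES = {"application/fhir+ndjson", "application/ndjson", "ndjson"}
DEFAULT_OUTPUT_FORMAT = "application/fhir+ndjson"


def normalize_output_format(value=None):
    """Map an _outputFormat value to the canonical NDJSON content type."""
    if value is None or value.strip() == "":
        # Parameter omitted: fall back to the documented default.
        return DEFAULT_OUTPUT_FORMAT
    # Assumption: compare case-insensitively and ignore surrounding spaces.
    if value.strip().lower() in NDJSON_ALIASES:
        return DEFAULT_OUTPUT_FORMAT
    # Assumption: this sketch supports NDJSON only and rejects anything else.
    raise ValueError(f"unsupported _outputFormat: {value!r}")
```

A server following this sketch would translate the `ValueError` into an error response with a FHIR OperationOutcome resource.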
Note: Implementations MAY limit the resources returned to specific subsets of FHIR, such as those defined in the US Core Implementation Guide. If the client explicitly asks for export of resources that the Bulk Data server doesn't support, the server SHOULD return details via a FHIR OperationOutcome resource in an error response to the request.
If an includeAssociatedData value relevant to provenance is not specified, or if this parameter is not supported by a server, the server SHALL include all available Provenance resources whose Provenance.target is a resource in the Patient compartment in a patient level export request, and all available Provenance resources in a system level export request, unless a specific resource set is specified using the _type parameter and this set does not include Provenance.
To obtain new and updated resources for patients in a group, as well as all data for patients who have joined the group since a prior query, a client can use the following pattern:
Initial Query (e.g., on January 1, 2020):
Client submits a group export request:
[baseurl]/Group/[id]/$export
Subsequent Queries (e.g., on February 1, 2020):
Client submits a group export request to obtain a patient list:
[baseurl]/Group/[id]/$export?_type=Patient&_elements=id
Client submits a group export request via POST for patients who are new members of the group:
POST [baseurl]/Group/[id]/$export
{
  "resourceType" : "Parameters",
  "parameter" : [{
    "name" : "patient",
    "valueReference" : {"reference" : "Patient/123"}
  },{
    "name" : "patient",
    "valueReference" : {"reference" : "Patient/456"}
  ...
  }]
}
Client submits a group export request for updated group data:
[baseurl]/Group/[id]/$export?_since=[initial transaction time]
Note that data returned from this request may overlap with that returned from the prior step.
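The subsequent-query steps above can be sketched from the client's side as follows. The function names, the idea of comparing patient ID sets between two patient-list exports to find new members, and the tuple representation of a request are all assumptions made for illustration; the sketch only constructs requests and does not send them.

```python
import json
from urllib.parse import urlencode


def new_member_ids(previous_ids, current_ids):
    """IDs present in the current patient list but not in the earlier one."""
    return sorted(set(current_ids) - set(previous_ids))


def subsequent_group_requests(base_url, group_id, new_ids, initial_transaction_time):
    """Return (method, url, body) tuples for the three follow-up requests."""
    export_url = f"{base_url}/Group/{group_id}/$export"
    # Step 1: patient list only.
    patient_list = ("GET", f"{export_url}?_type=Patient&_elements=id", None)
    # Step 2: full data for new members, via a POST with a Parameters resource.
    parameters = {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "patient", "valueReference": {"reference": f"Patient/{pid}"}}
            for pid in new_ids
        ],
    }
    new_members = ("POST", export_url, json.dumps(parameters))
    # Step 3: updates for the whole group since the initial transaction time.
    since = urlencode({"_since": initial_transaction_time})
    updates = ("GET", f"{export_url}?{since}", None)
    return [patient_list, new_members, updates]


added = new_member_ids(["123"], ["123", "456"])
requests_to_send = subsequent_group_requests(
    "https://example.com/fhir", "g1", added, "2020-01-01T00:00:00Z"
)
```

In practice the patient-list export in step 1 has to complete before the new member IDs are known, so the three requests are issued in sequence rather than together.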
_typeFilter Query Parameter
The _typeFilter parameter enables finer-grained filtering out of resources in the bulk data export response that would have otherwise been returned. For example, a client may want to retrieve only active prescriptions rather than all prescriptions and only laboratory observations rather than all observations. When using _typeFilter, each resource type is filtered independently. For example, filtering Patient resources to people born after the year 2000 will not filter Encounter resources for patients born before the year 2000 from the export.
The value of the _typeFilter parameter is a FHIR REST API query. Resources with a resource type specified in this query that do not meet the criteria in the search expression in the query SHALL NOT be returned, with the exception of related resources being included by a server to provide context about the resources being exported (see processing model). A client MAY repeat the _typeFilter parameter multiple times in a kick-off request. When more than one _typeFilter parameter is provided with a query for the same resource type, the server SHALL include resources of that resource type that meet the criteria in any of the parameters (a logical "or").
FHIR search result parameters (such as _sort, _include, and _elements) SHALL NOT be used as _typeFilter criteria. Clients should consult the server's capability statement to identify supported search parameters (see server capability documentation). Since support for _typeFilter is OPTIONAL for a FHIR server, clients SHOULD be robust to servers that ignore _typeFilter.
Example Request
The following is an export request for MedicationRequest resources, where the client would further like to restrict the MedicationRequests to those that are active, or else completed after July 1, 2018. This can be accomplished with two _typeFilter query parameters and an _type query parameter:
MedicationRequest?status=active
MedicationRequest?status=completed&date=gt2018-07-01T00:00:00Z
$export?
_type=
MedicationRequest
&_typeFilter=
MedicationRequest%3Fstatus%3Dactive
&_typeFilter=
MedicationRequest%3Fstatus%3Dcompleted%26date%3Dgt2018-07-01T00%3A00%3A00Z
Note that newlines and spaces have been added above for clarity, and would not be included in a real request.
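The percent-encoding in the request above can be reproduced with Python's standard library, as in this minimal sketch. The function name `build_export_query` is an assumption; `urlencode` and `quote` are standard `urllib.parse` functions.

```python
from urllib.parse import quote, urlencode


def build_export_query(resource_types, type_filters):
    """Build the query string for a kick-off request with _typeFilter values."""
    params = [("_type", ",".join(resource_types))]
    # Each filter is a complete FHIR search query, so its "?", "=", and "&"
    # characters must be percent-encoded to survive as a single value.
    params.extend(("_typeFilter", expression) for expression in type_filters)
    return urlencode(params, quote_via=quote)


query = build_export_query(
    ["MedicationRequest"],
    [
        "MedicationRequest?status=active",
        "MedicationRequest?status=completed&date=gt2018-07-01T00:00:00Z",
    ],
)
```

The resulting string is appended after `$export?` to form the request shown above, without the added newlines and spaces.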
The following steps outline a logical model of how a server should process a bulk export request. The actual operations a server performs and the order in which they're performed may differ. Additionally, as documented elsewhere in this implementation guide, depending on the values and headers provided, some requests may cause a server to return an error rather than continuing to process the request.
* In the case of a Group level export, the server may retain resources modified prior to _since timestamp if the resources belong to the patient compartment of a patient added to the Group after the supplied time and this behavior is documented by the server.
Success response:
202 Accepted
Content-Location header with the absolute URL of an endpoint for subsequent status requests (polling location)
OperationOutcome resource in the body in JSON format
Error response:
4XX or 5XX
OperationOutcome resource in JSON format
If a server wants to prevent a client from beginning a new export before an in-progress export is completed, it SHOULD respond with a 429 Too Many Requests status and a Retry-After header, following the rate-limiting advice for "Bulk Data Status Request" below.
After a Bulk Data request has been started, a client MAY send a DELETE request to the URL provided in the Content-Location header to cancel the request as described in the FHIR Asynchronous Request Pattern. If the request has been completed, a server MAY use the request as a signal that a client is done retrieving files and that it is safe for the server to remove those files from storage. Following the delete request, when subsequent requests are made to the polling location, the server SHALL return a 404 Not Found error and an associated FHIR OperationOutcome in JSON format.
DELETE [polling content location]
Success response:
202 Accepted
OperationOutcome resource in the body in JSON format
Error response:
4XX or 5XX
OperationOutcome resource in JSON format
After a Bulk Data request has been started, the client MAY poll the status URL provided in the Content-Location header as described in the FHIR Asynchronous Request Pattern.
Clients SHOULD follow an exponential backoff approach when polling for status. A server SHOULD supply a Retry-After header with a delay time in seconds (e.g., 120 to represent two minutes) or an http-date (e.g., Fri, 31 Dec 1999 23:59:59 GMT). When provided, clients SHOULD use this information to inform the timing of future polling requests. The server SHOULD keep an accounting of status queries received from a given client, and if a client is polling too frequently, the server SHOULD respond with a 429 Too Many Requests status code in addition to a Retry-After header, and optionally a FHIR OperationOutcome resource with further explanation. If excessively frequent status queries persist, the server MAY return a 429 Too Many Requests status code and terminate the session. Other standard HTTP 4XX as well as 5XX status codes may be used to identify errors as mentioned.
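One way a client might combine exponential backoff with the Retry-After header is sketched below. The function names, the one-second base delay, the 300-second cap, and the policy of preferring the server's hint over the computed backoff are assumptions; the guide only says clients SHOULD use the header to inform polling timing.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Parse a Retry-After value given as delay seconds or as an http-date."""
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # Unparseable header: let the caller fall back to backoff.
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # Assumption: treat as UTC.
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def next_poll_delay(attempt, retry_after=None, base=1.0, cap=300.0, now=None):
    """Seconds to wait before the next status request."""
    hinted = retry_after_seconds(retry_after, now)
    if hinted is not None:
        return hinted  # Assumption: the server's hint takes precedence.
    # Exponential backoff: base, 2*base, 4*base, ... limited by the cap.
    return min(cap, base * (2 ** attempt))
```

A polling loop would call `next_poll_delay` with an attempt counter that starts at zero, sleep for the returned duration, and then repeat the status request.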
When requesting status, the client SHOULD use an Accept header indicating a content type of application/json. In the case that errors prevent the export from completing, the server SHOULD respond with a FHIR OperationOutcome resource in JSON format.
GET [polling content location]
Responses
Response Type | Description | Example Response Headers + Body |
---|---|---|
In-Progress | Returned by the server while it is processing the $export request. | |
Error | Returned by the server if the export operation fails. | |
Complete | Returned by the server when the export operation has completed. | |
In-progress status:
202 Accepted
X-Progress header with a text description of the status of the request that is less than 100 characters. The format of this description is at the server's discretion and MAY be a percentage complete value, or MAY be a more general status such as "in progress". The client MAY parse the description, display it to the user, or log it.
When the allowPartialManifests kickoff parameter is true, the server MAY return a Content-Type header of application/json and a body containing an output manifest in the format described below, populated with a partial set of output files for the export. When provided, a manifest SHALL only contain files that are available for retrieval by the client. Once returned, the server SHALL NOT alter a manifest when it is returned in subsequent requests, with the exception of optionally adding a link field pointing to a manifest with additional output files or updating output file URLs that have expired. The output files referenced in the manifest SHALL NOT be altered once they have been included in a manifest that has been returned to a client.
Error status:
4XX or 5XX
Content-Type header of application/fhir+json when body is a FHIR OperationOutcome resource
OperationOutcome resource in JSON format. If this is not possible (for example, the infrastructure layer returning the error is not FHIR aware), the server MAY return an error message in another format and include a corresponding value for the Content-Type header.
In the case of a polling failure that does not indicate failure of the export job, a server SHOULD use a transient code from the IssueType valueset when populating the FHIR OperationOutcome resource's issue.code element to indicate to the client that it should retry the request at a later time.
Note: Even if some of the requested resources cannot successfully be exported, the overall export operation MAY still succeed. In this case, the Response.error array of the completion response body SHALL be populated with one or more files in ndjson format containing FHIR OperationOutcome resources to indicate what went wrong (see below). In the case of a partial success, the server SHALL use a 200 status code instead of 4XX or 5XX. The choice of when to determine that an export job has failed in its entirety (error status) vs. returning a partial success (complete status) is left up to the server implementer.
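The status handling described above can be summarized in a small client-side sketch. The function name, the returned labels, the decision to treat 429 as retryable, and the set of codes treated as transient (the `transient` code and its child codes, as this sketch reads the FHIR IssueType value set) are assumptions for illustration.

```python
# Assumption: "transient" and its child codes in the FHIR IssueType value set.
TRANSIENT_CODES = {
    "transient", "lock-error", "no-store", "exception",
    "timeout", "incomplete", "throttled",
}


def classify_status_response(status_code, body=None):
    """Classify a status response as in-progress, complete, retry, or failed."""
    if status_code == 202:
        return "in-progress"
    if status_code == 200:
        # May still be a partial success; check the manifest's error array.
        return "complete"
    if 400 <= status_code < 600:
        issues = body.get("issue", []) if isinstance(body, dict) else []
        # A transient issue code means the polling request failed, not the job.
        if status_code == 429 or any(
            issue.get("code") in TRANSIENT_CODES for issue in issues
        ):
            return "retry"
        return "failed"
    return "unexpected"
```

A client receiving "retry" would wait (honoring any Retry-After header) and poll again, while "failed" indicates that the export job needs to be re-run or investigated.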
Complete status:
200 OK
Content-Type header of application/json
Expires header indicating when the files listed will no longer be available for access.
The output manifest is a JSON object providing metadata and links to the generated Bulk Data files. The files SHALL be accessible to the client at the URLs advertised. These URLs MAY be served by file servers other than a FHIR-specific server.
Field | Optionality | Type | Description |
---|---|---|---|
transactionTime |
required | FHIR instant | Indicates the server's time when the query is run. The response SHOULD NOT include any resources modified after this instant, and SHALL include any matching resources modified up to and including this instant.
Note: To properly meet these constraints, a FHIR server might need to wait for any pending transactions to resolve in its database before starting the export process. |
request |
required | String | The full URL of the original Bulk Data kick-off request. In the case of a POST request, this URL will not include the request parameters. Note: this field may be removed in a future version of this IG. |
requiresAccessToken |
required | Boolean | Indicates whether downloading the generated files requires the same authorization mechanism as the $export operation itself.
Value SHALL be true if both the file server and the FHIR API server control access using OAuth 2.0 bearer tokens. Value MAY be false for file servers that use access-control schemes other than OAuth 2.0, such as downloads from Amazon S3 bucket URLs or verifiable file servers within an organization's firewall.
|
outputOrganizedBy |
required when organizeOutputBy was populated |
String | The organizeOutputBy value from the Bulk Data kick-off request when populated and supported. |
output |
required | JSON array | An array of file items with one entry for each generated file. If no resources are returned from the kick-off request, the server SHOULD return an empty array.
The url field SHALL be populated for each output item. When a resource type is not specified in the organizeOutputBy kick-off parameter, the type field SHALL also be populated for each item. When a resource type is specified in the organizeOutputBy kick-off parameter and resources related to a resource of this type continue into another output file, the continuesInFile field SHALL be populated with the URL of that output file.
|
deleted |
optional | JSON array | An array of deleted file items following the same structure as the output array.
The ability to convey deleted resources is important in cases when a server may have previously exported data and wishes to indicate that these data should be removed from downstream systems. When a _since timestamp is supplied in the export request, this array SHOULD be populated with output files containing FHIR Transaction Bundles that indicate which FHIR resources match the kick-off request criteria, but have been deleted subsequent to the _since date. If no resources have been deleted, or the _since parameter was not supplied, or the server has other reasons to avoid exposing these data, the server MAY omit this key or MAY return an empty array. Resources that appear in the 'deleted' section of an export manifest SHALL NOT appear in the 'output' section of the manifest.
Each line in the output file SHALL contain a FHIR Bundle with a type of transaction which SHALL contain one or more entry items that reflect a deleted resource. In each entry, the request.url and request.method elements SHALL be populated. The request.method element SHALL be set to DELETE .
Example deleted resource bundle (represents one line in output file):
|
error |
required | Array | Array of message file items following the same structure as the output array.
Error, warning, and information messages related to the export SHOULD be included here (not in output). If there are no relevant messages, the server SHOULD return an empty array. Only the FHIR OperationOutcome resource type is currently supported, so the server SHALL generate files in the same format as Bulk Data output files that contain FHIR OperationOutcome resources. If the request contained invalid or unsupported parameters along with a Prefer: handling=lenient header and the server processed the request, the server SHOULD include a FHIR OperationOutcome resource for each of these parameters.
Note: this field may be renamed in a future version of this IG to reflect the inclusion of FHIR OperationOutcome resources with severity levels other than error.
|
link |
optional | JSON array |
When the allowPartialManifests kickoff parameter is true, the manifest MAY include a link array with a single object containing a relation field with a value of next, and a url field pointing to the location of another manifest. All fields in the linked manifest SHALL be populated with the same values as the manifest with the link, apart from the output, deleted, and link arrays.
In response to a request to a next link, a server MAY return an error as described in the Error Status section above. For non-transient errors, a client MAY process resources that have already been retrieved prior to re-running the export job or MAY discard them.
|
extension |
optional | JSON object | To support extensions, this implementation guide reserves the name extension and will never define a field with that name, allowing server implementations to use it to provide custom behavior and information. For example, a server may choose to provide a custom extension that contains a decryption key for encrypted ndjson files. The value of an extension element SHALL be a pre-coordinated JSON object.
Note: In addition to extensions being supported on the root object level, extensions may also be included within the fields above (e.g., in the 'output' object). |
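The deleted-resource format described for the deleted array above can be illustrated with a short sketch. The Python below builds one NDJSON line holding a transaction Bundle whose entries each carry a DELETE request. The helper name deleted_bundle_line and the sample references are assumptions made for illustration and are not defined by this guide.

```python
import json


def deleted_bundle_line(references):
    """Serialize one NDJSON line: a transaction Bundle of DELETE entries.

    `references` is a non-empty list of relative references such as
    "Patient/123" (hypothetical values chosen for this sketch).
    """
    if not references:
        raise ValueError("a deleted-resource Bundle needs at least one entry")
    bundle = {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
            {"request": {"method": "DELETE", "url": ref}} for ref in references
        ],
    }
    # NDJSON keeps each resource on a single line, so no pretty-printing.
    return json.dumps(bundle, separators=(",", ":"))


line = deleted_bundle_line(["Patient/123", "Observation/456"])
```

A client reading such a file could parse each line and remove the resources named in request.url from its downstream store.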
Example manifest when the organizeOutputBy kickoff parameter is not populated:
{
"transactionTime": "2021-01-01T00:00:00Z",
"request" : "https://example.com/fhir/Patient/$export?_type=Patient,Observation",
"requiresAccessToken" : true,
"output" : [{
"type" : "Patient",
"url" : "https://example.com/output/patient_file_1.ndjson"
},{
"type" : "Observation",
"url" : "https://example.com/output/observation_file_1.ndjson"
},{
"type" : "Observation",
"url" : "https://example.com/output/observation_file_2.ndjson"
}],
"deleted": [{
"type" : "Bundle",
"url" : "https://example.com/output/del_file_1.ndjson"
}],
"error" : [{
"type" : "OperationOutcome",
"url" : "https://example.com/output/err_file_1.ndjson"
}],
"extension":{"https://example.com/extra-property": true}
}
Example manifest when the organizeOutputBy kickoff parameter is Patient and the allowPartialManifests kickoff parameter is true:
{
"transactionTime": "2021-01-01T00:00:00Z",
"request" : "https://example.com/fhir/Patient/$export?_type=Patient,Observation",
"requiresAccessToken" : true,
"outputOrganizedBy": "Patient",
"output" : [{
"url" : "https://example.com/output/file_1.ndjson"
},{
"url" : "https://example.com/output/file_2.ndjson",
"continuesInFile": "https://example.com/output/file_3.ndjson"
},{
"url" : "https://example.com/output/file_3.ndjson"
}],
"deleted": [{
"type" : "Bundle",
"url" : "https://example.com/output/del_file_1.ndjson"
}],
"error" : [{
"type" : "OperationOutcome",
"url" : "https://example.com/output/err_file_1.ndjson"
}],
"extension":{"https://example.com/extra-property": true},
"link": [{
"relation": "next",
"url": "https://example.com/output/manifest-2.json"
}]
}
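When partial manifests are in use, a client has to follow the next link to collect every output entry. The sketch below shows one way this could look in Python. The function iter_output_entries, its max_pages guard, and the caller-supplied fetch_manifest callable are all hypothetical names for this example; a real client would implement fetch_manifest with its own HTTP library and error handling.

```python
def iter_output_entries(first_manifest, fetch_manifest, max_pages=1000):
    """Yield every `output` item across a chain of linked manifests.

    `fetch_manifest` is a caller-supplied function (an assumption of this
    sketch) that takes a URL and returns the parsed manifest as a dict.
    """
    manifest = first_manifest
    for _ in range(max_pages):  # guard against accidental link cycles
        yield from manifest.get("output", [])
        next_urls = [
            link["url"]
            for link in manifest.get("link", [])
            if link.get("relation") == "next"
        ]
        if not next_urls:
            return
        manifest = fetch_manifest(next_urls[0])
    raise RuntimeError("too many linked manifests")


# Demonstration with an in-memory stand-in for the HTTP fetch.
pages = {
    "https://example.com/output/manifest-2.json": {
        "output": [{"url": "https://example.com/output/file_4.ndjson"}],
    }
}
first = {
    "output": [{"url": "https://example.com/output/file_1.ndjson"}],
    "link": [
        {"relation": "next", "url": "https://example.com/output/manifest-2.json"}
    ],
}
urls = [item["url"] for item in iter_output_entries(first, pages.__getitem__)]
```

The same traversal could be applied to the deleted array, since it is one of the fields that may differ between linked manifests.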
Output files may be organized by resource type, or by instances of a resource type specified in the organizeOutputBy kickoff parameter.
When the organizeOutputBy kickoff parameter is not populated, each output file SHALL contain resources of only one type, and a server MAY create more than one file for each resource type returned. The number of resources contained in a file MAY vary between servers and files.
When the organizeOutputBy kickoff parameter is populated with a resource type, the output files SHALL be populated with blocks consisting of a header Parameters resource containing a parameter named header with a reference to a resource of the type in the kickoff parameter, followed by the resource referenced in this header and resources that reference the resource referenced in the header (together a "resource block"). Each output file MAY contain multiple resource blocks and, when possible, a single resource's block SHOULD NOT be split across files. If a resource block does span more than one file, the header SHALL be repeated at the start of each file where the block continues, and the association between these files SHALL be documented in the manifest using the continuesInFile field in the relevant output array items.
Resources that would otherwise be included in the export, but do not have references to the resource type specified in the organizeOutputBy parameter, MAY be included in resource blocks that contain resources they reference, MAY be repeated in every resource block, or MAY be omitted from the export.
Example header for Patient resource:
{
"resourceType" : "Parameters",
"parameter" : [{
"name": "header",
"valueReference": {"reference": "Patient/123"}
}]
}
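A client consuming organized output needs to recognize header Parameters resources and group the lines that follow them. The following Python is a minimal sketch of that grouping, assuming the file content is already available as decoded text. The function names split_resource_blocks and header_reference are hypothetical and chosen only for this example.

```python
import json


def header_reference(resource):
    """Return the referenced id if `resource` is a block header, else None."""
    if resource.get("resourceType") != "Parameters":
        return None
    for param in resource.get("parameter", []):
        if param.get("name") == "header":
            return param.get("valueReference", {}).get("reference")
    return None


def split_resource_blocks(ndjson_text):
    """Group NDJSON lines into (header_reference, resources) tuples."""
    blocks = []
    # Split on "\n" only; str.splitlines() would also break on characters
    # that may legally appear inside JSON strings.
    for raw in ndjson_text.split("\n"):
        if not raw.strip():
            continue
        resource = json.loads(raw)
        ref = header_reference(resource)
        if ref is not None:
            blocks.append((ref, []))
        elif blocks:
            blocks[-1][1].append(resource)
        else:
            raise ValueError("resource found before any header")
    return blocks


sample = "\n".join([
    '{"resourceType":"Parameters","parameter":[{"name":"header",'
    '"valueReference":{"reference":"Patient/123"}}]}',
    '{"resourceType":"Patient","id":"123"}',
    '{"resourceType":"Observation","id":"o1",'
    '"subject":{"reference":"Patient/123"}}',
])
blocks = split_resource_blocks(sample)
```

Because a header is repeated at the start of each file where a block continues, a caller that processes files linked by continuesInFile could merge blocks that share the same header reference.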
Using the URLs supplied by the FHIR server in the manifest, a client MAY download the generated Bulk Data files (one or more per resource type) within the time period specified in the Expires header (if present). A client MAY re-fetch the output manifest if output links have expired, and a server MAY provide updated links and/or an updated timestamp in the Expires header in the response.
As long as a server is following relevant security guidance, it MAY generate output manifests where the requiresAccessToken field is true or false; this applies even for servers available on the public internet.
If the requiresAccessToken field in the manifest is set to true, the request SHALL include a valid access token. See Privacy and Security Considerations above.
If the requiresAccessToken field is set to false and no additional authorization-related extensions are present in the manifest's output entry, then the output URLs SHALL be dereferenceable directly (a "capability URL"), and SHALL follow the expiration timing requirements that have been documented for bearer tokens in SMART Backend Services. A client SHALL NOT provide a SMART Backend Services access token when dereferencing an output URL where requiresAccessToken is false.
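The access token rules above can be sketched as a small helper that decides which headers to send with an output file request. The function name output_request_headers is hypothetical, and the sketch assumes the access token is sent as a bearer token in the Authorization header, as in SMART Backend Services.

```python
def output_request_headers(manifest, access_token=None):
    """Build HTTP headers for an output file request from a manifest dict."""
    headers = {"Accept": "application/fhir+ndjson"}
    if manifest["requiresAccessToken"]:
        if not access_token:
            raise ValueError("manifest requires an access token")
        headers["Authorization"] = f"Bearer {access_token}"
    # When requiresAccessToken is false, the token is deliberately left out,
    # even if the caller supplied one.
    return headers


protected = output_request_headers({"requiresAccessToken": True}, "abc")
public = output_request_headers({"requiresAccessToken": False}, "abc")
```

This sketch does not inspect authorization-related extensions in the output entry; a client that supports such extensions would need additional logic.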
The exported data SHALL include only the most recent version of any exported resources unless the client explicitly requests different behavior in a fashion supported by the server (e.g., via a new query parameter yet to be defined). Inclusion of the Resource.meta information in the resources is at the discretion of the server (as it is for all FHIR interactions).
A client SHOULD provide an Accept-Encoding header when requesting output files and SHOULD include gzip compression as one of the encoding options in the header. A server SHALL provide output files as uncompressed, with gzip compression, or with another compression format from the Accept-Encoding header. When compression is used, a server SHALL communicate this to the client by including a Content-Encoding header in the response. A client SHALL accept files that are uncompressed or encoded with gzip compression, and MAY accept files encoded with other compression formats.
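A client that handles raw response bodies itself could select a decoder from the Content-Encoding header, as in the sketch below. The function name decode_body is hypothetical. Some HTTP client libraries decompress gzip responses transparently, in which case this step is unnecessary.

```python
import gzip


def decode_body(body, content_encoding=None):
    """Return the uncompressed bytes of an output file response body."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "identity":
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    # Other formats are optional for clients; this sketch rejects them.
    raise ValueError(f"unsupported Content-Encoding: {encoding}")


payload = b'{"resourceType":"Patient","id":"1"}\n'
round_trip = decode_body(gzip.compress(payload), "gzip")
```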
Example NDJSON output file:
{"id":"5c41cecf-cf81-434f-9da7-e24e5a99dbc2","name":[{"given":["Brenda"],"family":"Jackson"}],"gender":"female","birthDate":"1956-10-14","resourceType":"Patient"}
{"id":"3fabcb98-0995-447d-a03f-314d202b32f4","name":[{"given":["Bram"],"family":"Sandeep"}],"gender":"male","birthDate":"1994-11-01","resourceType":"Patient"}
{"id":"945e5c7f-504b-43bd-9562-a2ef82c244b2","name":[{"given":["Sandy"],"family":"Hamlin"}],"gender":"female","birthDate":"1988-01-24","resourceType":"Patient"}
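Reading a file in this format amounts to parsing one JSON document per line. The following Python is a minimal sketch under the assumption that the file has already been downloaded and decoded to text. The function name parse_ndjson and the sample resources are hypothetical.

```python
import json


def parse_ndjson(text):
    """Parse NDJSON text into a list of resources, skipping blank lines."""
    resources = []
    # Split on "\n" only, so characters that str.splitlines() treats as
    # line breaks but JSON allows inside strings do not break a record.
    for line_number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue
        try:
            resources.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"invalid JSON on line {line_number}") from exc
    return resources


sample = (
    '{"resourceType":"Patient","id":"a"}\n'
    '{"resourceType":"Patient","id":"b"}\n'
)
resources = parse_ndjson(sample)
```

For large exports, a streaming variant that yields one resource at a time would avoid holding the whole file in memory.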
GET [url from status request output field]
Request header: Accept (optional, defaults to application/fhir+ndjson). Specifies the format of the file being requested.
Success response: HTTP status 200 OK, with a Content-Type header that matches the file format being delivered. For files in ndjson format, SHALL be application/fhir+ndjson.
Error response: HTTP status 4XX or 5XX.
If resources in an output file contain elements of the type Attachment, the server SHOULD populate the Attachment.contentType code as well as either the data element or the url element. When populated, the url element SHALL be an absolute url that can be de-referenced to the attachment's content.
When the url element is populated with an absolute URL and the requiresAccessToken field in the Complete Status body is set to true, the url location must be accessible by a client with a valid access token, and SHALL NOT require the use of additional authentication credentials. When the url element is populated and the requiresAccessToken field in the Complete Status body is set to false, the url location must be accessible by a client without an access token.
Note that if a server copies files to the Bulk Data output endpoint or proxies requests to facilitate access from this endpoint, it may need to modify the Attachment.url element when generating the Bulk Data output files.
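The Attachment handling described above can be sketched from the client side: use the inline data element when it is present, and otherwise prepare a request for the url element with or without a token. The function name attachment_access and its return shape are assumptions made for this example, as is the use of a bearer Authorization header.

```python
import base64


def attachment_access(attachment, requires_access_token, access_token=None):
    """Describe how to obtain an Attachment's content.

    Returns ("inline", bytes) when `data` is present, otherwise
    ("remote", url, headers) describing a request against `url`.
    """
    if "data" in attachment:
        # Attachment.data is base64-encoded (FHIR base64Binary).
        return ("inline", base64.b64decode(attachment["data"]))
    if "url" in attachment:
        headers = {}
        if requires_access_token:
            if not access_token:
                raise ValueError("an access token is required for this url")
            headers["Authorization"] = f"Bearer {access_token}"
        return ("remote", attachment["url"], headers)
    raise ValueError("attachment has neither data nor url")


inline = attachment_access(
    {"contentType": "text/plain", "data": "aGVsbG8="}, True, "abc"
)
remote = attachment_access({"url": "https://example.com/doc/1"}, False, "abc")
```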
This implementation guide is structured to support a wide variety of Bulk Data Export use cases and server architectures. To provide clarity to developers on which capabilities are implemented in a particular server, server providers SHALL ensure that their Capability Statement accurately reflects the implemented Bulk Data Operations. Additionally, the server's Capability Statement SHOULD list the resource types available for export in the rest.resource element, and SHOULD list the search parameters that can be used in the _typeFilter parameter in the rest.resource.searchParam element.
Servers SHOULD indicate resource types and search parameters that are accessible on the server with the REST API, but not available using the Bulk Export operation, with one or more extensions that have a URL of http://hl7.org/fhir/uv/bulkdata/Extension/operation-not-supported and a valueCanonical with the canonical URL for the OperationDefinition of the bulk operation that is not supported. Alternatively, the extension may be populated with the canonical URL for the FHIR Bulk Data Access Implementation Guide CapabilityStatement when none of the bulk operations are supported.
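A client could use this extension to find out which resource types to leave out of an export request. The sketch below assumes the extension appears directly on rest.resource entries of a CapabilityStatement that has been parsed into a dict; the function name unsupported_bulk_types and the canonical URL in the sample are placeholders, not values defined by this guide.

```python
NOT_SUPPORTED_URL = (
    "http://hl7.org/fhir/uv/bulkdata/Extension/operation-not-supported"
)


def unsupported_bulk_types(capability_statement):
    """Map resource type -> canonical URLs flagged as not supported.

    Assumption of this sketch: the extension sits on rest.resource.
    Extensions on rest.resource.searchParam are not inspected here.
    """
    flagged = {}
    for rest in capability_statement.get("rest", []):
        for resource in rest.get("resource", []):
            canonicals = [
                ext["valueCanonical"]
                for ext in resource.get("extension", [])
                if ext.get("url") == NOT_SUPPORTED_URL
                and "valueCanonical" in ext
            ]
            if canonicals:
                flagged.setdefault(resource["type"], []).extend(canonicals)
    return flagged


sample_statement = {
    "resourceType": "CapabilityStatement",
    "rest": [{
        "resource": [
            {"type": "Patient"},
            {
                "type": "Binary",
                "extension": [{
                    "url": NOT_SUPPORTED_URL,
                    # Placeholder canonical URL for illustration only.
                    "valueCanonical":
                        "https://example.com/fhir/OperationDefinition/export",
                }],
            },
        ]
    }],
}
flagged = unsupported_bulk_types(sample_statement)
```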
Server providers SHOULD also ensure that their documentation addresses the topics below. Future versions of this IG may define a computable format for this information as well.
Are resources such as Practitioner or Organization included in the export and under what circumstances?
What outputFormat values does this server support?
Does the _since parameter return additional resources modified prior to the supplied time if the resources belong to the patient compartment of a patient added to the Group after the supplied time?
What includeAssociatedData values does this server support?