Bulk Data Access IG, published by HL7 International / FHIR Infrastructure. This guide is not an authorized publication; it is the continuous build for version 4.0.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/bulk-data/ and changes regularly. See the Directory of published versions
| Official URL: http://hl7.org/fhir/uv/bulkdata/ImplementationGuide/hl7.fhir.uv.bulkdata | Version: 4.0.0 |
| IG Standards status: Trial-use | Maturity Level: 5 | Computable Name: BulkDataAccessIG |
Organizations that manage populations often need to exchange large FHIR datasets. Examples include pulling a cohort from an EHR for analytics, sending a pre-arranged package of data to a payer or regulator, or publishing reference data such as provider directories and schedules. Standard FHIR REST APIs work well for interactive and transaction-scale use, but resource-by-resource exchange becomes expensive and operationally brittle when the job involves thousands or millions of resources.
This implementation guide defines a family of FHIR-based bulk operations that standardize how large datasets are requested, delivered, monitored, and reused. Instead of relying on custom CSV extracts and one-off file transfer workflows, these operations use consistent manifest structures, asynchronous processing, and security patterns that can be applied across many implementations.
The operations are applicable to any data that can be represented in FHIR, and may be implemented in "native" FHIR servers that store FHIR resources directly as well as systems that implement FHIR as an interoperability layer (as is often the case with EHR systems and data warehouse platforms).
The scope of this document does NOT include:
Bulk Data defines three operations. Each fits a different relationship between the system that holds the data (Data Provider) and the system that needs it (Data Consumer).
Bulk Export — The consumer pulls data from a provider on demand. The consumer controls what comes back by optionally choosing the cohort, resource types, filters, data elements, and time window. Use this operation when a system needs to retrieve data from a trusted source and shape the request to its own needs — for example, a research data warehouse exporting clinical data from an EHR or a payer exporting claims-relevant records from a clinical system.
Bulk Submit — The provider pushes a pre-coordinated dataset to a specific recipient. Both sides agree in advance on what the submission contains, and the recipient can acknowledge processing, report issues, or return derived artifacts through the in-band Bulk Submit Status operation. Use this operation when the sender already knows what needs to be delivered and the receiver needs to close the loop — for example, submitting required clinical data to a payer, regulator, or processing service.
Bulk Publish — The provider posts a dataset for any number of consumers to retrieve via ordinary HTTP. The provider decides what is published; consumers discover and cache it using standard HTTP semantics. Use this operation when the same relatively static dataset serves many downstream systems — for example, publishing a provider directory, formulary, or scheduling data.
Multiple operations can be used to address a single use case. For example, an intermediary might use Bulk Export to retrieve data from one system, transform it, and then use Bulk Submit to deliver the transformed version to another system.
Key distinctions between the operations:
| | Bulk Export | Bulk Submit | Bulk Publish |
|---|---|---|---|
| Cohort and data elements | Recipient specifies | Provider defines | Provider defines |
| Kick-off workflow | Recipient pull | Provider push | Recipient pull |
| Cardinality | One provider to one recipient | One provider to one recipient | One provider to many recipients |
| Feedback channel | Out of band | In band | Out of band |
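As an illustration only, the distinctions in the table above can be encoded as data and queried to see which operation fits a set of requirements. The class name, field names, and value strings below are hypothetical labels chosen for this sketch; they are not defined by this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationProfile:
    """One column of the comparison table (field names are illustrative)."""
    name: str
    cohort_defined_by: str  # who chooses the cohort and data elements
    kickoff: str            # who starts the exchange
    cardinality: str        # provider-to-recipient relationship
    feedback: str           # where feedback on the exchange travels


PROFILES = (
    OperationProfile("Bulk Export", "recipient", "recipient pull", "one-to-one", "out of band"),
    OperationProfile("Bulk Submit", "provider", "provider push", "one-to-one", "in band"),
    OperationProfile("Bulk Publish", "provider", "recipient pull", "one-to-many", "out of band"),
)


def matching_operations(**required):
    """Return the names of operations whose profile matches every given field."""
    return [
        profile.name
        for profile in PROFILES
        if all(getattr(profile, field) == value for field, value in required.items())
    ]
```

For example, `matching_operations(feedback="in band")` selects only Bulk Submit, the one operation in the table with an in-band feedback channel.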
Many Bulk Export workflows apply to a specific cohort of patients rather than to all patients in a system. These cohorts can be represented and managed as FHIR Group resources. For example, a payer roster, research cohort, care management panel, quality-measure population, set of recently discharged patients, or any other recurring population that needs to be exchanged over time can be modeled as a FHIR Group. As described on the Group page, implementations may expose read-only groups managed by the Data Provider, member-based groups managed by the Data Consumer, or criteria-based groups whose membership is computed from characteristics. Some Data Providers may also support the Bulk Cohort API described in this guide, which lets a Data Consumer create characteristic-based cohorts asynchronously.
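The following is a minimal sketch of how a Data Consumer might assemble a kick-off request for a Group-scoped export. It assumes the `[base]/Group/[id]/$export` endpoint shape and the `_type` and `_since` parameters from earlier published versions of this guide; consult the Export page of this build for the authoritative definitions. The function name and the example base URL are hypothetical.

```python
from urllib.parse import quote, urlencode

# Headers assumed for an asynchronous kick-off request.
KICKOFF_HEADERS = {
    "Accept": "application/fhir+json",
    "Prefer": "respond-async",
}


def build_group_export_url(base_url, group_id, resource_types=None, since=None):
    """Build a Group-level export kick-off URL (assumed URL shape)."""
    url = f"{base_url.rstrip('/')}/Group/{quote(group_id, safe='')}/$export"
    params = {}
    if resource_types:
        # Assumed: _type takes a comma-delimited list of resource types.
        params["_type"] = ",".join(resource_types)
    if since:
        # Assumed: _since takes a FHIR instant bounding the time window.
        params["_since"] = since
    if params:
        url += "?" + urlencode(params)
    return url
```

A consumer would send a GET to the resulting URL with `KICKOFF_HEADERS`; authorization is omitted here and would follow the security patterns defined elsewhere in this guide.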
To declare conformance with this IG, a server SHOULD include the following URL in its CapabilityStatement.instantiates: http://hl7.org/fhir/uv/bulkdata/CapabilityStatement/bulk-data.
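A client can check for this declaration once it has retrieved the server's CapabilityStatement (typically from `[base]/metadata`). The sketch below operates on the parsed JSON and assumes that a `|version` suffix on the canonical URL should be ignored when comparing; the function name is hypothetical.

```python
BULK_DATA_CAPABILITY_URL = (
    "http://hl7.org/fhir/uv/bulkdata/CapabilityStatement/bulk-data"
)


def declares_bulk_data_conformance(capability_statement):
    """Return True if CapabilityStatement.instantiates lists the Bulk Data URL."""
    instantiates = capability_statement.get("instantiates") or []
    # Canonical references may carry a "|version" suffix; compare the URL part only.
    return any(
        str(canonical).split("|", 1)[0] == BULK_DATA_CAPABILITY_URL
        for canonical in instantiates
    )
```

Because the declaration is a SHOULD rather than a SHALL, a `False` result does not prove that a server lacks bulk data support.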
The Bulk Data Access Implementation Guide Resource defines the technical details of this publication, including dependencies and publishing parameters.
The key words "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this specification are to be interpreted as described in RFC2119.
Common datasets to be exchanged through bulk operations include:
The Bulk Export Operation and the Bulk Submit Status Operation build on the FHIR Asynchronous Bulk Interaction Pattern, a FHIR request and response flow that servers can implement for any Operation or Defined Interaction that needs to return a large dataset. This pattern is described in the FHIR R4 and FHIR R5 versions of the FHIR specification, and has been moved into this Implementation Guide going forward.
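The request and response flow of this pattern can be sketched as follows, based on the pattern as described in FHIR R4: the kick-off returns `202 Accepted` with a `Content-Location` header pointing at a status endpoint, and the client polls that endpoint until it answers `200 OK` with the result body. The `http_get` callable is a hypothetical transport supplied by the caller, assumed to return a `(status_code, headers, json_body)` tuple with header names spelled exactly as shown; error handling and authorization are deliberately minimal.

```python
import time


def run_async_request(http_get, kickoff_url, poll_seconds=5.0, max_polls=100,
                      sleep=time.sleep):
    """Kick off an asynchronous request and poll until the result is ready.

    http_get(url, headers) must return (status_code, headers_dict, json_or_None).
    """
    status, headers, _body = http_get(
        kickoff_url,
        {"Accept": "application/fhir+json", "Prefer": "respond-async"},
    )
    if status != 202:
        raise RuntimeError(f"kick-off not accepted: HTTP {status}")
    status_url = headers["Content-Location"]

    for _attempt in range(max_polls):
        status, headers, body = http_get(status_url, {"Accept": "application/json"})
        if status == 200:
            return body  # completed: the body describes the output
        if status != 202:
            raise RuntimeError(f"request failed: HTTP {status}")
        # Honor a numeric Retry-After if present; otherwise use the default delay.
        try:
            delay = float(headers.get("Retry-After", poll_seconds))
        except ValueError:
            delay = poll_seconds
        sleep(delay)
    raise TimeoutError("status endpoint did not complete within max_polls")
```

Passing the transport and the `sleep` function as arguments keeps the sketch independent of any particular HTTP library and lets the flow be exercised without a network.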
Use cases that return small amounts of data but take a long time to process may be better served by the related Asynchronous Interaction Request Pattern.