Bulk Data Access IG
4.0.0 - STU 4

Bulk Data Access IG, published by HL7 International / FHIR Infrastructure. This guide is not an authorized publication; it is the continuous build for version 4.0.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/bulk-data/ and changes regularly. See the Directory of published versions.


Home

Official URL: http://hl7.org/fhir/uv/bulkdata/ImplementationGuide/hl7.fhir.uv.bulkdata
Version: 4.0.0
IG Standards status: Trial-use
Maturity Level: 5
Computable Name: BulkDataAccessIG

Organizations that manage populations often need to exchange large FHIR datasets. Examples include pulling a cohort from an EHR for analytics, sending a pre-arranged package of data to a payer or regulator, or publishing reference data such as provider directories and schedules. Standard FHIR REST APIs work well for interactive and transaction-scale use, but resource-by-resource exchange becomes expensive and operationally brittle when the job involves thousands or millions of resources.

This implementation guide defines a family of FHIR-based bulk operations that standardize how large datasets are requested, delivered, monitored, and reused. Instead of relying on custom CSV extracts and one-off file transfer workflows, these operations use consistent manifest structures, asynchronous processing, and security patterns that can be applied across many implementations.

The operations are applicable to any data that can be represented in FHIR, and may be implemented in "native" FHIR servers that store FHIR resources directly as well as systems that implement FHIR as an interoperability layer (as is often the case with EHR systems and data warehouse platforms).

The scope of this document does NOT include:

  • A legal framework for sharing data between partners, such as Business Associate Agreements, Service Level Agreements, and Data Use Agreements, though these may be required in many use cases.
  • Real-time data exchange
  • Data transformations, validation or processing that may be needed by the Data Consumer
  • Patient matching (although identifiers may be included in the FHIR resources being transmitted)

Example Use Cases

  • A healthcare organization submitting data to a regulatory agency to meet a reporting requirement
  • A healthcare organization sending clinical data to a payer organization to support a quality measurement calculation
  • A payer organization sharing data on claim status with a healthcare organization
  • A healthcare organization moving data from a clinical system onto a standalone FHIR server to consolidate data from multiple systems in order to run analytic queries
  • An organization providing FHIR data to an internal or external service to process the data for de-identification or other transformation
  • An organization sharing a pre-defined dataset from a clinical system with another application, such as a care management tool

Choosing a Bulk Operation

Bulk Data defines three operations. Each fits a different relationship between the system that holds the data (Data Provider) and the system that needs it (Data Consumer).

Bulk Export — The consumer pulls data from a provider on demand. The consumer controls what comes back by optionally choosing the cohort, resource types, filters, data elements, and time window. Use this operation when a system needs to retrieve data from a trusted source and shape the request to its own needs — for example, a research data warehouse exporting clinical data from an EHR or a payer exporting claims-relevant records from a clinical system.
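As a rough illustration of how a Data Consumer might shape a Bulk Export request, the sketch below assembles a kick-off URL. The parameter names (`_type`, `_since`, `_typeFilter`, `_elements`), the `Group/[id]/$export` and `Patient/$export` paths, and the `Prefer: respond-async` header are taken from the published Bulk Data Export operation and should be checked against the Export page of this guide. The function name, the base URL, and the fallback to a patient-level export are hypothetical choices made for this example.

```python
from urllib.parse import urlencode

# Headers commonly sent with an export kick-off request (assumed from the
# published $export operation; confirm against the Export page of this guide).
KICKOFF_HEADERS = {
    "Accept": "application/fhir+json",
    "Prefer": "respond-async",
}


def build_export_url(base_url, group_id=None, types=None, since=None,
                     type_filters=None, elements=None):
    """Build a Bulk Export kick-off URL (illustrative helper, not part of the IG)."""
    base = base_url.rstrip("/")
    # Cohort: a specific Group if given, otherwise all patients (an assumption
    # made for this sketch; a system-level export also exists).
    if group_id:
        path = f"{base}/Group/{group_id}/$export"
    else:
        path = f"{base}/Patient/$export"

    params = []
    if types:                       # resource types
        params.append(("_type", ",".join(types)))
    if since:                       # time window
        params.append(("_since", since))
    for type_filter in type_filters or []:   # filters, one parameter each
        params.append(("_typeFilter", type_filter))
    if elements:                    # data elements
        params.append(("_elements", ",".join(elements)))

    query = urlencode(params)
    return f"{path}?{query}" if query else path


if __name__ == "__main__":
    print(build_export_url(
        "https://example.org/fhir",          # hypothetical server
        group_id="cohort-1",                 # hypothetical Group id
        types=["Patient", "Observation"],
        since="2024-01-01T00:00:00Z",
    ))
```

Every argument other than the base URL is optional, which mirrors the point above that the consumer may narrow the request but is not required to.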

Bulk Submit — The provider pushes a pre-coordinated dataset to a specific recipient. Both sides agree in advance on what the submission contains, and the recipient can acknowledge processing, report issues, or return derived artifacts through the in-band Bulk Submit Status operation. Use this operation when the sender already knows what needs to be delivered and the receiver needs to close the loop — for example, submitting required clinical data to a payer, regulator, or processing service.

Bulk Publish — The provider posts a dataset for any number of consumers to retrieve via ordinary HTTP. The provider decides what is published; consumers discover and cache it using standard HTTP semantics. Use this operation when the same relatively static dataset serves many downstream systems — for example, publishing a provider directory, formulary, or scheduling data.

Multiple operations can be used to address a single use case. For example, an intermediary might use Bulk Export to retrieve data from one system, transform it, and then use Bulk Submit to deliver the transformed version to another system.

Key distinctions between the operations:

|                          | Bulk Export                   | Bulk Submit                   | Bulk Publish                    |
| Cohort and data elements | Recipient specifies           | Provider defines              | Provider defines                |
| Kick-off workflow        | Recipient pull                | Provider push                 | Recipient pull                  |
| Cardinality              | One provider to one recipient | One provider to one recipient | One provider to many recipients |
| Feedback channel         | Out of band                   | In band                       | Out of band                     |

Representing Cohorts

Many Bulk Export workflows apply to a specific cohort of patients rather than to all patients in a system. These cohorts can be represented and managed as FHIR Group resources. For example, a payer roster, research cohort, care management panel, quality-measure population, set of recently discharged patients, or any other recurring population that needs to be exchanged over time can be modeled as a FHIR Group. As described on the Group page, implementations may expose read-only groups managed by the Data Provider, member-based groups managed by the Data Consumer, or criteria-based groups whose membership is computed from characteristics. Some Data Providers may also support the Bulk Cohort API described in this guide, which lets a Data Consumer create characteristic-based cohorts asynchronously.
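To make the member-based case concrete, the following sketch builds a minimal Group resource as a Python dictionary. It assumes FHIR R4 element names (`type`, `actual`, `member.entity`); the Group page of this guide may require additional profile elements that are not shown here, and the helper name and patient ids are hypothetical.

```python
import json


def make_member_group(name, patient_ids):
    """Return a minimal member-based Group (R4 element names assumed)."""
    return {
        "resourceType": "Group",
        "type": "person",    # the cohort consists of people
        "actual": True,      # enumerates real individuals rather than a definition
        "name": name,
        "member": [
            {"entity": {"reference": f"Patient/{pid}"}}
            for pid in patient_ids
        ],
    }


if __name__ == "__main__":
    # Hypothetical roster with two patients.
    group = make_member_group("Example payer roster", ["p1", "p2"])
    print(json.dumps(group, indent=2))
```

A criteria-based group would describe its membership through characteristics rather than an explicit `member` list, so it is not covered by this helper.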

Conformance and Publication

To declare conformance with this IG, a server SHOULD include the following URL in its CapabilityStatement.instantiates: http://hl7.org/fhir/uv/bulkdata/CapabilityStatement/bulk-data.
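A client could check for this declaration along the following lines. This is a minimal sketch that assumes the CapabilityStatement has already been retrieved and parsed into a dictionary; the function name is hypothetical. Because FHIR canonical references may carry a `|version` suffix, the check compares only the part before the `|`.

```python
BULK_DATA_CAPABILITY_URL = (
    "http://hl7.org/fhir/uv/bulkdata/CapabilityStatement/bulk-data"
)


def declares_bulk_data(capability_statement):
    """Return True if `instantiates` lists the Bulk Data CapabilityStatement URL."""
    for canonical in capability_statement.get("instantiates", []):
        # Strip an optional "|version" suffix from the canonical reference.
        url = canonical.split("|", 1)[0]
        if url == BULK_DATA_CAPABILITY_URL:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical, heavily abbreviated CapabilityStatement.
    example = {
        "resourceType": "CapabilityStatement",
        "instantiates": [BULK_DATA_CAPABILITY_URL],
    }
    print(declares_bulk_data(example))
```

Since the declaration is a SHOULD rather than a SHALL, a negative result does not prove that a server lacks bulk data support.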

The Bulk Data Access Implementation Guide Resource defines the technical details of this publication, including dependencies and publishing parameters.

Underlying Standards

Terminology

The key words "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this specification are to be interpreted as described in RFC 2119.

Datasets

Common datasets to be exchanged through bulk operations include:

FHIR Asynchronous Bulk Interaction Pattern

The Bulk Export Operation and the Bulk Submit Status Operation build on the FHIR Asynchronous Bulk Interaction Pattern, a FHIR request and response flow that servers can implement for any Operation or Defined Interaction that needs to return a large dataset. This pattern is described in the FHIR R4 and FHIR R5 versions of the FHIR specification; going forward, it is maintained in this Implementation Guide.
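The client side of this flow can be sketched as a polling loop. The sketch assumes the behavior described in the published pattern: after a kick-off request, the client polls a status URL, receives `202 Accepted` while work is in progress (optionally with a `Retry-After` header), and receives `200 OK` with a manifest once the job completes. The function names are hypothetical, and the HTTP call is injected as `get_status`, so any HTTP client can be plugged in. Header names are assumed to be normalized by the caller.

```python
import time


def poll_until_complete(get_status, status_url, max_polls=20,
                        default_delay=1.0, sleep=time.sleep):
    """Poll a status URL until the job completes and return the final body.

    `get_status(url)` is a caller-supplied function returning a tuple of
    (status_code, headers_dict, parsed_body_or_None).
    """
    for _ in range(max_polls):
        code, headers, body = get_status(status_url)
        if code == 200:
            return body  # complete: the body is the manifest
        if code == 202:
            # In progress: honor a numeric Retry-After if the server sent one.
            try:
                wait = float(headers.get("Retry-After"))
            except (TypeError, ValueError):
                # Header missing or given as an HTTP date; use the default.
                wait = default_delay
            sleep(wait)
            continue
        raise RuntimeError(f"status request failed with HTTP {code}")
    raise TimeoutError(f"job not complete after {max_polls} polls")


if __name__ == "__main__":
    # Simulated server: two in-progress responses, then completion.
    replies = iter([
        (202, {"Retry-After": "0"}, None),
        (202, {}, None),
        (200, {}, {"output": []}),
    ])
    manifest = poll_until_complete(lambda url: next(replies),
                                   "https://example.org/status/1",  # hypothetical
                                   default_delay=0.1)
    print(manifest)
```

Injecting `get_status` and `sleep` keeps the loop testable without a network, and capping the number of polls prevents a stalled job from blocking the client indefinitely.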

Use cases that return small amounts of data but take a long time to process may be better served by the related Asynchronous Interaction Request Pattern.

Additional Documentation