AI Transparency on FHIR, published by HL7 International / Electronic Health Records. This guide is not an authorized publication; it is the continuous build for version 1.0.0-ballot built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/aitransparency-ig/ and changes regularly. See the Directory of published versions
| Page standards status: Informative |
The goal of this implementation guide is to provide observability of the use of AI in the production or manipulation of health data. To the end user, this means that in some way they can determine first that AI was involved and then discover more information about the AI and its usage. From this, we can understand that there are two levels of observability and multiple factors that can be observed within the second level.
Note that both Security Labels and Provenance can be applied at the whole-Resource level or at the Element level within a Resource.
Beyond 1st level observability, there are a number of factors that the end user or client system may be interested in knowing about. These factors can be broken down into 3 categories:
The use of tagging enables distinguishing data that has not been influenced by AI from data that has. The level of influence and the details about how the AI was used are not provided by simple tagging. However, tagging is very lightweight and does not add significant bloat to the payload or require additional lookups. Tagging can be used as an indicator that AI was used in the creation or updating of the given resource, and that a client system may wish to investigate further by fetching the Resource's Provenance.
💡 Tip
Use when one needs to quickly and easily identify Resources or elements inside a Resource that have been influenced by AI.
Tagging (also called Security Labels) uses the FHIR Resource definition .meta.security element, which is present at the top of all Resources and as such can be found without Resource-type-specific processing. This use of security tagging follows the intended purpose of security tagging, as the domain of security covers protections against risks to Confidentiality, Availability, and Integrity (see the Healthcare Privacy and Security Classification System (HCS) vocabulary). Here we focus on Integrity, which is defined as completeness, veracity, reliability, trustworthiness, and provenance. In the case of AI Transparency, we want to mark the AI participation to convey reliability, trustworthiness, and provenance.
Within the Integrity Security Tags Vocabulary is AIAST - Artificial Intelligence Asserted, a broad concept covering any influence by any kind of artificial intelligence. There is also DICTAST - Dictation asserted, for when dictation, which might be AI-driven, has been involved in translating spoken dictation into data.
classDiagram
class Resource {
<<FHIR Resource>>
id
meta.security = AIAST
...
}
Resource tag
A Resource tag indicates that the whole Resource is influenced by the code assigned.
Use when an example is completely authored by an AI.
The key portion of that Resource is the following meta.security element holding the AIAST code. AIAST is an HL7 v3 ObservationValue code used as metadata to indicate that AI was involved in producing the data or information.
Discussion has indicated that a few more codes might be useful. For these we create a local codeSystem to allow us to experiment; eventually, useful codes would be proposed to HL7 Terminology (THO). For example, AIAST does not indicate whether a clinician was involved in the use of the AI, or reviewed the output of the AI.
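As a sketch of what such a local codeSystem might look like, the fragment below defines codes distinguishing reviewed from unreviewed AI output. The system URL, codes, and definitions here are hypothetical placeholders for experimentation, not published artifacts of this guide:

```json
{
  "resourceType" : "CodeSystem",
  "id" : "ai-involvement",
  "url" : "http://example.org/CodeSystem/ai-involvement",
  "status" : "draft",
  "content" : "complete",
  "concept" : [ {
    "code" : "AIAST-human-reviewed",
    "display" : "AI asserted, human reviewed",
    "definition" : "AI produced the data and a clinician reviewed the output."
  }, {
    "code" : "AIAST-unreviewed",
    "display" : "AI asserted, not human reviewed",
    "definition" : "AI produced the data with no human review of the output."
  } ]
}
```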
{
"resourceType" : "Observation",
"id" : "glasgow",
"meta" : {
"security" : [
{
"system" : "http://terminology.hl7.org/CodeSystem/v3-ObservationValue",
"code" : "AIAST",
"display" : "Artificial Intelligence asserted"
}
]
},
"text" : {
...
Element tag within a Resource
An Element tag indicates that one or a few elements within a Resource were influenced by AI, but not the whole Resource. Use when components of an example were authored by AI, but not the whole Resource.
meta.security holds a code defined in DS4P Inline Security Labels - PROCESSINLINE, and the inline-sec-label extension is placed on each element that was influenced by AI to indicate it is an AI-asserted value.
One of the key portions of that Resource is
"conclusionCode" : [
{
"extension" : [
{
"url" : "http://hl7.org/fhir/uv/security-label-ds4p/StructureDefinition/extension-inline-sec-label",
"valueCoding" : {
"system" : "http://terminology.hl7.org/CodeSystem/v3-ObservationValue",
"code" : "AIAST",
"display" : "Artificial Intelligence asserted"
}
}
],
"coding" : [
{
"system" : "http://snomed.info/sct",
"code" : "428763004",
"display" : "Staphylococcus aureus bacteraemia"
}
]
}
]
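The resource-level signal that pairs with this fragment would look something like the sketch below. We assume here the v3 ActCode PROCESSINLINELBL (process inline security label), which the DS4P IG uses to tell clients that element-level inline labels are present and must be processed; confirm the exact code and system against that IG. The resource id is illustrative:

```json
{
  "resourceType" : "DiagnosticReport",
  "id" : "example-element-tagged",
  "meta" : {
    "security" : [ {
      "system" : "http://terminology.hl7.org/CodeSystem/v3-ActCode",
      "code" : "PROCESSINLINELBL",
      "display" : "process inline security label"
    } ]
  }
}
```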
There are a number of observability factors beyond simple tagging that are of interest to end users and downstream systems. Chief among these is the nature of the AI itself. The user would like to understand what algorithm / model was used, who developed it, how it was trained, any certifications it has, and so on. To do this, the guide outlines the use of the Provenance resource, which can then be linked to Device and DocumentReference to point to a Model-Card.
💡 Tip
Use when the AI model is important to the use case, such as when it may be important to understand which AI model was used.
The industry is converging around standards for providing this information, generally called Model-Cards. Several different standards are emerging, including Hugging Face and CHAI Model Cards. This guide does not enforce any particular Model-Card, but does show how to encode any Model-Card in an AI Model-Card profiled DocumentReference; these would be referenced in an AI-profiled Device or within the Provenance describing the AI involvement. This looks like:
classDiagram
direction LR
class Resource {
<<FHIR Resource>>
id
meta.security = AIAST
...
}
class Provenance {
<<FHIR Resource>>
target : Reference resource created/updated
occurred : When
reason : `AIAST`
agent : Reference to AI Device
agent : References to other agents involved
entity : References to Model-Card DocumentReference
entity : References to other data used
}
class Device {
<<FHIR Resource>>
id
identifier
type = "AI"
extension : Specific kind of AI
modelNumber
manufacturer
manufactureDate
deviceName
version
owner
contact
url
note
safety
extension : model-card
}
class DocumentReference {
<<FHIR Resource>>
id
type = AImodelCard
category = AImodelCardMarkdownFormat | AImodelCardCHAIformat
description
version
data / url = codeable model-card details
data / url = pdf rendering
}
Resource "1..*" <-- Provenance : "Provenance.target"
Provenance --> Device : "Provenance.agent.who"
Provenance --> DocumentReference : "Provenance.entity.what"
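Rendered as an instance, the relationships in the diagram might be sketched as follows. This assumes the R4 Provenance shape (reason, agent.who, entity.what); the references, timestamps, and entity role are illustrative only:

```json
{
  "resourceType" : "Provenance",
  "target" : [ { "reference" : "Observation/glasgow" } ],
  "recorded" : "2024-05-01T12:00:00Z",
  "reason" : [ {
    "coding" : [ {
      "system" : "http://terminology.hl7.org/CodeSystem/v3-ObservationValue",
      "code" : "AIAST",
      "display" : "Artificial Intelligence asserted"
    } ]
  } ],
  "agent" : [ {
    "who" : { "reference" : "Device/example-ai-model" }
  } ],
  "entity" : [ {
    "role" : "source",
    "what" : { "reference" : "DocumentReference/example-model-card" }
  } ]
}
```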
Examples:
The Hugging Face Model-Card is a combination of YAML, which defines the details in codeable terms, and Markdown, which describes them in narrative. Given that Markdown can carry YAML front matter, the overall object is Markdown.
An example Model-Card from https://github.com/huggingface/huggingface_hub/tree/main/tests/fixtures/cards:
---
language:
- en
license:
- bsd-3-clause
annotations_creators:
- crowdsourced
- expert-generated
language_creators:
- found
multilinguality:
- monolingual
size_categories:
- n<1K
task_categories:
- image-segmentation
task_ids:
- semantic-segmentation
pretty_name: Sample Segmentation
---
The example above is encoded in a DocumentReference with Model-Card encoded inside
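A minimal sketch of such a DocumentReference is shown below. The type code system URL is a hypothetical placeholder for this guide's local codeSystem, and the attachment url is illustrative; a real instance might instead carry the Markdown base64-encoded in attachment.data:

```json
{
  "resourceType" : "DocumentReference",
  "status" : "current",
  "type" : {
    "coding" : [ {
      "system" : "http://example.org/CodeSystem/ai-doc-types",
      "code" : "AImodelCard"
    } ]
  },
  "description" : "Hugging Face Model-Card for Sample Segmentation",
  "content" : [ {
    "attachment" : {
      "contentType" : "text/markdown",
      "url" : "https://example.org/model-cards/sample-segmentation.md"
    }
  } ]
}
```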
The Coalition for Health AI (CHAI) Applied Model Card utilizes XML encoding and PDF rendering.
An example from the CHAI Github Examples is included here in multiple DocumentReference formats:
Note that these are all the same example Model-Card, just encoded in different ways depending on the needs. These three encoding methods are available for the Hugging Face format as well. Note that in the case of the CHAI format, these examples include both the XML and the PDF rendering of the same card as different .content entries.
In R5/R6 of FHIR core, the Device resource has a .property element with a .property.type we can use to indicate the model-card, placing the model-card Markdown into .property.valueAttachment. (It could go into .valueString if we know it will be Markdown, but that is not strongly clear.)
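A sketch of that R5 Device.property approach follows. The property.type code system is a hypothetical placeholder, and the attachment url is illustrative:

```json
{
  "resourceType" : "Device",
  "id" : "example-ai-model",
  "property" : [ {
    "type" : {
      "coding" : [ {
        "system" : "http://example.org/CodeSystem/ai-device-properties",
        "code" : "model-card"
      } ]
    },
    "valueAttachment" : {
      "contentType" : "text/markdown",
      "url" : "https://example.org/model-cards/sample-segmentation.md"
    }
  } ]
}
```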
One choice is simply to put the Markdown Model-Card into the Device.note.text element. This is not wrong by the definition of that element, but it may not be obvious to someone looking at the Device resource that the Markdown given carries specific meaning.
One could instead encode the Model-Card in a resource designed for carrying any mime-type: the DocumentReference. To make this clearer and searchable, we define a codeSystem with codes used to identify that the DocumentReference is specifically an AI Model-Card or an AI Input Prompt.
When using an AI it is necessary to supply it with certain inputs. These inputs vary based on the AI involved, but the industry generally refers to them as the "prompt" (especially in the case of Generative AI).
💡 Tip
Use when the record needs to show the data inputs, such as to understand what data the AI had available to inference on versus what data was not provided.
There are different kinds of prompts supplied, including but not limited to:
In general, inputs should be captured using a DocumentReference linked through the Provenance, but when specific clinical data is involved a FHIR Bundle or other resource may be linked.
Note
There is significant variation in what inputs are supplied to AI systems and how; however, capturing those inputs should remain relatively consistent.
The context documents all inputs involved in AI processing.
One useful thing to record is the prompt(s) given to the AI. The prompt(s) can be very important to the output, and to the interpretation of the output. The prompt(s) is recorded as an attachment, using a DocumentReference, with a code as defined above.
The first example just shows the encapsulating mechanism. The second example is a prompt that might be used to have the AI create a given Patient resource that meets the input requirements.
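A sketch of that encapsulating mechanism is given below. The type code system URL is a hypothetical placeholder for this guide's local codeSystem, and the data element carries the base64-encoded prompt text "Extract the lab results as FHIR.":

```json
{
  "resourceType" : "DocumentReference",
  "status" : "current",
  "type" : {
    "coding" : [ {
      "system" : "http://example.org/CodeSystem/ai-doc-types",
      "code" : "AIinputPrompt"
    } ]
  },
  "description" : "Prompt supplied to the AI for lab-report extraction",
  "content" : [ {
    "attachment" : {
      "contentType" : "text/plain",
      "data" : "RXh0cmFjdCB0aGUgbGFiIHJlc3VsdHMgYXMgRkhJUi4="
    }
  } ]
}
```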
AI Models do not exist in a vacuum: in addition to the context / inputs, there needs to be a system that calls the AI, supplies the inputs, and gets the result. This result may then be used as-is, supplied to another AI, verified by an automated system, verified by a human, or any number of other activities. Understanding this process may be very important to end users and downstream systems. For example, if the results of the AI were verified by a human (human-in-the-loop), then an end user may be able to rely on the results with less scrutiny.
💡 Tip
Use when all possible factors are important to record. This level of Observability Factor is very comprehensive, and as such is very verbose. Capturing at this level may not be justified beyond initial model use, while the use is being shaken out.
Some of the process elements that may be captured are:
As with tagging, a Provenance can point at a whole Resource. In this way one can carry details in the Provenance, such as what AI was used and how.
Provenance can also cover just some elements within a Resource. This is a normal part of Provenance, but it is especially important for AI use-cases.
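Element-scoped Provenance might be sketched as below, assuming the FHIR targetElement extension on Provenance.target to point at the specific element influenced by AI. The extension URL and the way the valueUri identifies the element are assumptions to confirm against the FHIR extensions registry; references and timestamps are illustrative:

```json
{
  "resourceType" : "Provenance",
  "target" : [ {
    "extension" : [ {
      "url" : "http://hl7.org/fhir/StructureDefinition/targetElement",
      "valueUri" : "conclusionCode"
    } ],
    "reference" : "DiagnosticReport/example"
  } ],
  "recorded" : "2024-05-01T12:00:00Z",
  "agent" : [ {
    "who" : { "reference" : "Device/example-ai-model" }
  } ]
}
```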
This is a full example of how to capture the AI process in FHIR.
This is an additional example provided that shows how this IG can be applied.
Use Case: A provider receives a PDF of lab result(s) for a patient. This PDF is examined by an AI which generates a Bundle with a Patient resource and Observation resource(s).
In the attached example the patient's name is Alton Walsh and the lab test is an HbA1C. All the FHIR resources in the bundle have been created by the AI, so they should be tagged accordingly.
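A condensed sketch of such an AI-generated Bundle is shown below, with each entry tagged AIAST. The structure and LOINC code (4548-4, Hemoglobin A1c/Hemoglobin.total in Blood) are illustrative; the attached example in the IG is the authoritative version:

```json
{
  "resourceType" : "Bundle",
  "type" : "collection",
  "entry" : [ {
    "resource" : {
      "resourceType" : "Patient",
      "meta" : {
        "security" : [ {
          "system" : "http://terminology.hl7.org/CodeSystem/v3-ObservationValue",
          "code" : "AIAST"
        } ]
      },
      "name" : [ { "family" : "Walsh", "given" : [ "Alton" ] } ]
    }
  }, {
    "resource" : {
      "resourceType" : "Observation",
      "meta" : {
        "security" : [ {
          "system" : "http://terminology.hl7.org/CodeSystem/v3-ObservationValue",
          "code" : "AIAST"
        } ]
      },
      "status" : "final",
      "code" : {
        "coding" : [ {
          "system" : "http://loinc.org",
          "code" : "4548-4",
          "display" : "Hemoglobin A1c/Hemoglobin.total in Blood"
        } ]
      }
    }
  } ]
}
```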