Structured Data Capture - CI Build

Structured Data Capture, HL7 International - FHIR Infrastructure Work Group (CI build for version null). This version is based on the current content of https://github.com/HL7/sdc/ and changes regularly. See the Directory of published versions.

Form population

One of the objectives of the SDC implementation guide is to reduce data entry into forms when the relevant data already exists in discrete form within the clinical or administrative system completing the form. This saves user time and increases accuracy. If a Questionnaire is created with knowledge of where the data for different questions within the form is expected to come from, the Form Filler or Form Manager system can query for the relevant information and automatically fill in some of the answers. FHIR's REST interface provides a standard way to query this data, regardless of the type of client.

There are four different modes of "population" possible:

  • Full population: The questionnaire can identify the specific data element (or set of data elements) that should be used to populate the form. No user intervention is required other than to review the content. E.g. capturing the patient's gender, birth date, most recent weight measurement, etc. In some cases, this might involve performing a calculation based on data retrieved. For example, determining the patient's BMI by querying the most recent weight and height observations.
  • Choice selection: The questionnaire can identify the candidate answers, but the user must choose which of those should be included in the form response. For example, a question might say "List any relevant concomitant medical conditions". The notion of "relevant" will typically require human decision-making. However, the system rendering the questionnaire can easily bring up a list of concomitant medical conditions and allow the user to select which ones are "relevant" and include those in the answer.
  • Answer context: The specific answer to the question will not exist as data in the client system, but the user might be able to determine it based on data that is available in the clinical system. For example, the question "Has the patient ever exhibited a similar reaction in the past?" with a yes/no/unknown answer choice is not a data element that will exist in the clinical system. However, it is possible to provide the clinician with a list of AdverseEvent, Observation and Condition instances that might be relevant, which they can scan when choosing how to answer the question.
  • No population: There is no way to query information relevant to the answer - the user will have to fill out that portion of the form unassisted.
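The BMI case mentioned under Full population can be sketched as follows. This is only an illustration: the function names are hypothetical, the observations are assumed to be plain dictionaries shaped like FHIR Observation JSON, weight is assumed to be in kg and height in cm, and effectiveDateTime values are assumed to be ISO strings in a single format so that they sort correctly as text.

```python
from typing import Optional


def most_recent(observations: list[dict]) -> Optional[dict]:
    """Return the observation with the latest effectiveDateTime, or None."""
    dated = [o for o in observations if "effectiveDateTime" in o]
    return max(dated, key=lambda o: o["effectiveDateTime"], default=None)


def bmi_from_observations(weights: list[dict], heights: list[dict]) -> Optional[float]:
    """Compute BMI (kg/m^2) from the most recent weight (kg) and height (cm)."""
    weight, height = most_recent(weights), most_recent(heights)
    if weight is None or height is None:
        return None  # nothing to populate
    kg = weight["valueQuantity"]["value"]
    metres = height["valueQuantity"]["value"] / 100.0
    if metres <= 0:
        return None  # guard against bad data
    return round(kg / (metres * metres), 1)
```

A real implementation would also check the coded units on each Quantity before calculating rather than assuming them.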

Caveats and considerations with form population

  • For form population to work, the designer of the Questionnaire must do a fair bit of work defining the queries and the filters needed to select what elements to use in populating specific questions. While there are standard expectations for querying demographics like gender, birth date or even most recent weight (due to the HL7-mandated vital signs profiles), there is little international guidance on what codes or conventions to use when representing other data such as medications, procedures, lab results, etc. For form population to work with such data, the Questionnaire will have to be designed presuming the use of particular profiles and conventions - for example, national implementation guides such as US Core or Australian Base. The more sophisticated a Questionnaire's population rules, the more dependent it will be on systems exposing their data in accordance with specified profiles.
  • Even if data exists and is exposed in accordance with agreed profiles, that doesn't necessarily mean that the source system will have the necessary query capabilities to find and appropriately filter the data. While some degree of filtering is possible using FHIRPath, loading thousands or tens of thousands of records for local filtering is likely to pose a challenge from the perspective of performance, memory usage, access control logging, etc. National and other implementation guides that set minimum expectations for search capabilities on clinical and administrative systems can be helpful here as well.
  • Any queries performed to populate the form will need to be subject to appropriate access controls. Users should not be able to see information populated into a form that they would not be able to see when navigating the source system directly. In some cases, this means that an answer in a form might not be populated even though the data exists - because the user doesn't have authority to see that data.
  • Defining the rules for populating a questionnaire is essentially a mapping process. The author is mapping between the data element defined by a particular question or group item and a corresponding element in a FHIR resource or profile.

    Mapping can be dangerous because it can lead to the sharing of incorrect data or, occasionally, the failure to share correct data. For this reason, it is essential that human users be given the opportunity to review questionnaire responses that have been populated by automated means.

    Even with this precaution, care should be taken when performing data mappings, including:

    • Be aware that the risk of incorrect mapping occurring can be significantly higher when mapping complex questionnaires and/or complex data structures and especially high when mapping both.
    • The degree of risk is implementation dependent (type of data, type of questionnaire, type of source, mapping mechanism, type of user, etc.)
    • Both mappings and the implementation of those mappings should be carefully verified and, in some situations, should be subject to certification or external verification
    • Mapping from profiles not specifically defined for use in the context of a particular questionnaire or set of questionnaires (i.e. with a defined context) magnifies the risk of erroneous population.
  • Of the four population modes, 'Full Population' can be done in one of two ways. It can be done as a single step for all items in the questionnaire prior to presenting the form to the user ("pre-population"), or it can be done continuously as the user encounters each item when completing the form ("continuous population"). "Choice selection" and "answer context" can only be done as the user encounters each question because they require a user decision.
  • The 'enableWhen' elements in Questionnaire mean that some elements may not be relevant in all situations. When pre-populating a questionnaire, it makes sense to "fill in" as much data as possible, even if it may not always be needed. However, such data should not be shown to the user until the given item is "enabled". And once the questionnaire is completed, data for any non-enabled elements should be removed.
  • Sometimes the author interested in making a Questionnaire "populatable" doesn't have authority to make changes to the "official" Questionnaire. In other cases, there might be one official Questionnaire, but a need to populate it from resources that comply with different sets of profiles (e.g. that comply with requirements from different countries) - and thus a need for different metadata in the Questionnaire to support the population process. In this case, rather than invoking the population on the original Questionnaire, it can be done on a derived Questionnaire - one that has a Questionnaire.derivedFrom relationship to the canonical URL of the desired Questionnaire. The derived Questionnaire would contain the same content as the base Questionnaire, but would have additional extensions inserted to support population. Once the QuestionnaireResponse was generated/completed, it would be updated to assert a QuestionnaireResponse.questionnaire of the original Questionnaire's canonical URL rather than that of the derived Questionnaire that drove the population process.
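The final step of the derived-Questionnaire approach, re-pointing the completed response at the official Questionnaire, might look like the sketch below. The function name is hypothetical, the resources are assumed to be plain dictionaries in FHIR JSON shape, and the first derivedFrom entry is assumed to be the official Questionnaire's canonical URL.

```python
def repoint_to_base(response: dict, derived_questionnaire: dict) -> dict:
    """Return a copy of the response that cites the base Questionnaire.

    Assumption: derivedFrom[0] is the canonical URL of the official form.
    """
    derived_from = derived_questionnaire.get("derivedFrom", [])
    updated = dict(response)  # shallow copy; the caller's response is untouched
    if derived_from:
        updated["questionnaire"] = derived_from[0]
    return updated
```

If a derived Questionnaire lists several derivedFrom values, the system would need some other way to decide which one is the "official" form.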

Pre-population service

Populating a QuestionnaireResponse is a complex task. The system must be able to query resources, use FHIRPath to extract and potentially calculate relevant data elements, manage conditionality rules around enableWhen as part of the population process, etc. As a result, some client systems might prefer to offload the responsibility for handling pre-population of a form to a separate system. The FHIR specification defines a set of services that can be used to provide a variable degree of offloading - from just handling the population aspect through to handling the rendering of the form and interactive data capture completely through a separate site.

There are three operations:

  • populate: This operation supports generating a QuestionnaireResponse instance based on a specified Questionnaire. If matching data is available for any of the questions and the server supports the pre-population capability, the answers for those questions will be populated in the returned QuestionnaireResponse instance.
  • populatehtml: This operation produces an HTML web page as a Binary instance. The HTML page provides an interactive rendering of the form, using HTML-based controls to capture user inputs and scripting languages to support form validation and submission to the server that generated the form and/or the recipient(s) designated by the Questionnaire. If matching data is available for any of the questions and the server supports the pre-population capability, the form will initially render with the answers for those questions filled in. NOTE: because the Binary will contain active content, client systems must ensure they trust the server performing the populatehtml operation.
  • populatelink: This operation returns a URL leading to a web page with an interactive rendering of the form that allows a user navigating to the link with a browser to complete and submit a response to the questionnaire. The response will be transmitted to the server generating the link, hosting the form and/or as designated as part of the Questionnaire itself. If matching data is available for any of the questions and the server supports the pre-population capability, the form will initially render with the answers for those questions filled in.

In addition to these operations, another alternative is to use the Adaptive Form mechanism. It hides the design of the questionnaire entirely and, in principle, allows the service determining the next question to automatically fill in (or perhaps even suppress asking questions) based on data the service can access about the patient.

For SDC purposes, server systems claiming to support roles that require support for the populate, populatehtml or populatelink operations (SDC Form Manager) SHALL, at minimum:

  • Handle the input parameters identifier, questionnaire, questionnaireRef, subject and content
  • Support passing at least* the Patient resource using the content parameter
  • Populate the returned QuestionnaireResponse instance or rendered form for all questions referencing data element logical model StructureDefinition that are mapped to C-CDA content

Similarly, client systems claiming to support the populate, populatehtml and/or populatelink operations (SDC Form Filler) SHALL, at a minimum:

  • Be capable of invoking the operation(s) on a selected questionnaire both directly (Questionnaire/[identifier]/$populate) as well as indirectly either by identifier or questionnaire
  • Support passing at least* the Patient resource using the content parameter
  • Be able to accept an incoming partially-populated QuestionnaireResponse and render it as if they had retrieved a saved partially-completed QuestionnaireResponse
  • It is the responsibility of the SDC Form Filler to ensure the form is valid after a human has reviewed and edited the form.

* Supporting additional resources relevant to the context of the form is also encouraged. Past versions of this specification mandated support for passing C-CDA resources as FHIR Binary instances and this is still permitted. However, this specification no longer provides recommendations for extracting data elements from C-CDA and we have not yet heard of any systems that do this successfully in a production environment.
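As a rough illustration of a Form Filler invoking populate, the sketch below assembles a Parameters body using the parameter names listed above. The function name is hypothetical, and the value types chosen for each parameter (valueReference for questionnaireRef and subject, an inline resource for content) are assumptions that should be checked against the operation definition.

```python
import json


def build_populate_request(base: str, questionnaire_ref: str, patient: dict) -> tuple[str, dict]:
    """Build the URL and Parameters body for a type-level $populate call."""
    subject_ref = f"Patient/{patient['id']}"
    body = {
        "resourceType": "Parameters",
        "parameter": [
            # Which form to populate (assumed to be passed by reference)
            {"name": "questionnaireRef", "valueReference": {"reference": questionnaire_ref}},
            # Who the response is about
            {"name": "subject", "valueReference": {"reference": subject_ref}},
            # Data the server may draw on; here just the Patient resource
            {"name": "content", "resource": patient},
        ],
    }
    return f"{base.rstrip('/')}/Questionnaire/$populate", body


if __name__ == "__main__":
    url, body = build_populate_request(
        "https://example.org/fhir", "Questionnaire/demo", {"resourceType": "Patient", "id": "123"}
    )
    print(url)
    print(json.dumps(body, indent=2))
```

The body would be sent with an HTTP POST; the transport is left out so the sketch has no dependencies beyond the standard library.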

Designing Questionnaires to support 'populate'

This specification defines three different mechanisms to embed information in Questionnaires to support population:

  • Observation-based Population
  • FHIRPath-based Population
  • StructureMap-based Population

Systems are free to experiment with other population mechanisms but cannot expect support for those from other SDC-conformant systems.

The sdc-questionnaire-populate profile includes the data elements and extensions relevant for supporting all these mechanisms. None of the additional resource elements or extensions are marked as "mustSupport" because there is no expectation that systems will support all (or any of) the different population mechanisms. Instead, each system should choose which approach(es) it wishes to use and support the elements described in that section of the implementation guide.

Some of these mechanisms make use of FHIR-based queries, FHIRPath and/or CQL as well as extensions that include expressions in one of these languages. Implementers should read the Using Expressions page for background and guidance on these technologies and extensions.

Observation-based Population

This is the simplest of the population mechanisms. It takes advantage of the fact that most questions in the healthcare space typically correspond to the value element of an Observation. It also takes advantage of the Questionnaire.item.code element, which identifies what concept each question or group corresponds to.

To use this method:

  1. Include the item.code element on each question to be populated. Typically, this will be a LOINC code, but in some jurisdictions/environments, SNOMED CT or other codes may be relevant
  2. Groups can also have an item.code present - this might represent the code of a panel or the Observation.code of an Observation with no value but with multiple Observation.component elements. Child question items can then assert the item.code of the "member-of" Observations or the Observation.component.code values
  3. To signal that the item.code is actually intended for use in population (as opposed to just providing metadata about the Questionnaire item), the questionnaire-observationLinkPeriod extension must also be included. This extension indicates the period of time over which to search for matching observations. If there are no observations within that window, no population will occur. For observations where how recent they are doesn't matter (e.g. blood type), simply set the duration to a long period of time - e.g. 200 years.
  4. Multiple item.code elements might be present. If so, each is considered an acceptable Observation.code for the desired population value.

For example:

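    [XML example: a Questionnaire item carrying several item.code values and the questionnaire-observationLinkPeriod extension; the markup was not preserved in this copy]
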

When performing the population operation, the system would look for questions that have the questionnaire-observationLinkPeriod extension and would then perform a query on the Observation using the context of the QuestionnaireResponse's subject - typically a Patient. The query would retrieve the most recent observation for that subject whose code matched one of the Questionnaire.item's codes, that had a value and whose value was of the correct data type.

The 'logical' query performed for the above Questionnaire item being populated on August 31 might look like this:

[base]/Observation?subject=[questionnaire response subject id]&code=http%3A//loinc.org|29463-7,http%3A//loinc.org|3141-9,http%3A//loinc.org|8341-0&status=completed&date=ge2018-06-02&_sort=-date&_count=1

In practice, the server might not support the _sort or _count parameters, so the filtering for "most recent" might need to be done locally - especially if there's a need to check for matching data type.
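A minimal sketch of assembling such a query from an item's codes and its link period is shown below. The function name is hypothetical. The 90-day window is an assumption inferred from the dates in the example query (August 31 back to June 2), and the status default simply mirrors the value used in the text.

```python
from datetime import date, timedelta
from urllib.parse import quote


def build_observation_query(subject_id: str, codings: list[tuple[str, str]],
                            today: date, period_days: int,
                            status: str = "completed") -> str:
    """Build the relative search URL for the most recent matching Observation."""
    earliest = today - timedelta(days=period_days)
    # Encode ':' in the code system URL but leave '/' readable, as in the text
    codes = ",".join(f"{quote(system, safe='/')}|{code}" for system, code in codings)
    return (f"Observation?subject={subject_id}&code={codes}&status={status}"
            f"&date=ge{earliest.isoformat()}&_sort=-date&_count=1")


if __name__ == "__main__":
    loinc = "http://loinc.org"
    print(build_observation_query(
        "Patient/123",
        [(loinc, "29463-7"), (loinc, "3141-9"), (loinc, "8341-0")],
        today=date(2018, 8, 31),
        period_days=90,
    ))
```

If the server ignores _sort or _count, the caller would drop those parameters and pick the most recent result locally.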

Considerations when using this approach:

  • If the units of measure are constrained for the Questionnaire item, the system can choose to convert the data (if the units are coded and the conversion factor is known), grab the most recent Observation within the window where conversion isn't necessary, or choose not to populate.
  • Question items being populated from Observations of type Quantity SHOULD also have a type of Quantity. However, they MAY have a type of integer or decimal provided that the questionnaire-unit extension is present to allow determination of what unit to match on/convert to.
  • Systems need to allow for the possibility that an Observation might not have a value (e.g. if dataAbsentReason is present)
  • Systems SHALL filter to only look at Observations with a status of 'completed'
  • Systems SHALL check for the presence of the Observation.focus element and exclude Observations where the focus is not the patient (e.g. we don't want the heart rate of a fetus if trying to populate the heart rate of the mother).
  • If a parent item (group or question) already has established the context of a specific Observation, when populating child items, matching observations SHALL be limited to components of the Observation bound to the parent item and other Observations reachable by a hasMember link from that observation.
  • Where an Observation is known to directly correlate to another resource element value (e.g. LOINC 21112-8 corresponds to Patient.birthDate), systems MAY take advantage of this knowledge to populate the answer of a question by retrieving data from resource elements other than Observation.value and Observation.component.value.
  • The same data elements used to populate a QuestionnaireResponse from Observations can also be used to generate/update Observations from a completed Questionnaire. See the Extraction page of this implementation guide for more information.
  • This mechanism can be used in parallel with the FHIRPath mechanism described below - i.e. some elements might be populated from Observation.value elements based on the item.code while other questions in the same form might be populated based on queries and FHIRPaths specified in the form.
  • This mechanism only supports Full Population. It does not work for other modes.
  • This mechanism can only be used for non-repeating items.
  • This mechanism can be used either for a pre-population step - before the user fills in any data, or as a "continuous" population process, where data is only retrieved when a given element becomes "enabled"
  • Obviously, this mechanism only works for questionnaire items that correspond to Observation values.
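When the server cannot sort or limit results, several of the checks above have to be applied locally. The sketch below is one way to do that under stated assumptions: the function names are hypothetical, observations are plain dictionaries in FHIR JSON shape, the expected value key (e.g. valueQuantity) stands in for the data type check, and effectiveDateTime values are ISO strings in one consistent format.

```python
from typing import Optional


def focus_is_subject(obs: dict) -> bool:
    """True when the Observation has no focus, or every focus is the subject."""
    subject = obs.get("subject", {}).get("reference")
    return all(f.get("reference") == subject for f in obs.get("focus", []))


def pick_observation(observations: list[dict], value_key: str = "valueQuantity",
                     status: str = "completed") -> Optional[dict]:
    """Return the most recent usable Observation, or None if there is none."""
    candidates = [
        o for o in observations
        if o.get("status") == status   # status filter, mirroring the text
        and value_key in o             # has a value of the expected type
        and focus_is_subject(o)        # e.g. skip a fetal heart rate
    ]
    return max(candidates, key=lambda o: o.get("effectiveDateTime", ""), default=None)
```

Unit conversion and the parent/child (component and hasMember) rules are left out to keep the sketch short.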

FHIRPath-based Population

This approach to population is more generic. It supports retrieving data from any queryable FHIR resources available on the source system. Those queries can be based on the context in which the QuestionnaireResponse is being generated and/or on the results of other queries. Furthermore, it permits FHIRPath operations to be done on the resulting data such as calculations, conditional determinations, etc. As such, it is significantly more powerful than the Observation-based method. However, it also requires skill in using both FHIR queries and FHIRPath, so it requires more technical expertise than the Observation-based approach.

To use this method:

  1. Include the questionnaire-launchContext extension to identify any contextual information that needs to be passed into the population process. Typical contexts would be the Patient or Encounter resources in whose context the questionnaire is being completed, but other elements are also possible (e.g. an AdverseEvent if performing an adverse event report). These 'context' elements will be available as FHIRPath variables for use in subsequent steps.
  2. Use the variable extension to query for any additional data required, possibly based on context variables set in the previous step, and/or other variable extensions. (NOTE: questionnaire-launchContext elements are evaluated first, then all Questionnaire-level variables, in order of appearance in the Questionnaire.)
  3. If appropriate, use the questionnaire-itemContext on group items to establish the context for a group. When populating the questionnaire, this will do two things: it will create a group repetition for each row returned from the query; and it will set the specified variable name to that resource repetition for use in processing items within the group.
  4. Use the variable extension on items as well to perform intermediary calculations or additional queries that are based on itemContext values.
  5. For Full Population, use the questionnaire-initialExpression to cause the initial answer for the question to be set to the specified expression. This should always be a FHIRPath that resolves to an item of the appropriate type unless the element has a type of Reference, in which case the expression can be a FHIR query - and the item will be populated with a reference to the resource. Note, this extension SHALL NOT appear on groups.
  6. For Choice Selection, use the sdc-questionnaire-candidateExpression to make available a list of possible answers for the user to choose from. As with initialExpression, this should always be a FHIRPath that resolves to an item of the appropriate type unless the element has a type of Reference, in which case the expression can be a FHIR query - and any selected item will be populated with a reference to that resource.
  7. For Answer Context, use the sdc-questionnaire-contextExpression to indicate the resources to make available for display to the user to aid in answering the question. The information SHOULD be made available through user action (clicking a button or link) rather than being presented by default.

Considerations when using this approach:

  • FHIR queries found in any of the variables may contain embedded FHIRPath expressions (surrounded by double curly-braces). Systems SHALL evaluate such expressions and substitute their results into the query before executing it.
  • If the result of evaluating the FHIRPath expressions is an invalid query, that is an error. Systems SHOULD log it and continue with population as if the query had returned no data.
  • If an item has both an initialExpression and an initialValue, the initialExpression SHALL take precedence over the initialValue. If the initialExpression resolves to an empty set, then the question SHALL be populated with the initialValue instead.
  • If the candidateExpression or contextExpression resolves to an empty set, do not display it.
  • Unlike initialExpressions, candidateExpressions can appear on groups as well as questions. If they appear on a group, then a separate group instance is created for each repetition the user selects from the candidates. The variable named in the candidateExpression extension is set to the value of the candidate for use in continuous population of descendant items. For example, a candidateExpression might resolve to a list of Conditions. After the user picks the relevant conditions, a separate group would be created for each and child items for the condition code, severity, onset date, etc. could automatically be populated using an initialExpression based on the variable name from the candidateExpression.
  • When an initialExpression, candidateExpression or contextExpression will return a resource, backbone element or complex data type, the questionnaire-choiceColumn extension can be used to control which data elements should be exposed to the user (and what labels and widths should be allocated to each element so exposed).
  • When an item has an associated candidateExpression or contextExpression, consider whether there should be a preceding 'display' item that provides instructions on how to filter from the available candidates or how to make use of the available context in answering the question.
  • Multiple modes can be present at once. If so, precedence is as follows: initialExpression is used if it resolves; if not, then candidateExpression is used. A link to contextExpression can be present with either of the modes as it may help the user in verifying the content.
  • It is an error if the type of an initialExpression or candidateExpression disagrees with the type of the item, with the following exceptions: a Resource can map to a Reference and a Quantity can map to an integer or decimal provided that the questionnaire-unit extension is present to allow determination of what unit to match on/convert to. Systems that encounter an expression whose result does not agree with the item type SHOULD log it but still populate other elements as best they can.
  • When crafting queries, be sure to filter all relevant elements. For example, ensuring status excludes entered-in-error elements, practitioners are active, etc.
  • Take care to ensure that variable names as set by questionnaire-launchContext, variable and questionnaire-itemContext remain unique within the hierarchy of the questionnaire. It's technically ok to have sibling items define the same variable name - they won't collide. However, if doing so, make sure it's being used for the same purpose to avoid confusion.
  • If the initialExpression evaluates to a collection size that exceeds the maximum cardinality requirements of the item, the system may either choose not to populate the item at all, or instead may treat the initialExpression as if it were a candidateExpression and give the user a chance to choose from amongst the results being returned.
  • If the initialExpression returns fewer occurrences than the questionnaire-minOccurs value (if specified), it SHALL populate what occurrences it can. The user can then add or fill in additional repetitions as needed.
  • If an answer is dependent on other answers within the form, the calculation of the answer should be performed using a variable extension on the nearest common ancestor in which all relevant questions or variables are descendants. The question will then have an initialExpression extension that references that calculated value. The reason for this is that FHIRPath only has access to the current context node and its descendants, not siblings or ancestors.
  • It's possible that not all queries will resolve successfully - or that some context variables might not be populated. In this situation, the system performing the population SHOULD make best efforts and populate whatever elements it can that are not dependent on the failed queries or unpopulated variables.
  • It's possible that the same data might be queried multiple times within different contexts when filling out the form. Systems are free to notice this and use the data already in memory rather than re-querying the data.
  • The expressions used to populate some elements might be dependent on the answers to other questions, not just on context elements or queried data. In this case, pre-populating systems will only be able to populate the answer if the questions depended on are able to be pre-populated. For continuous-populating systems, they can either populate the dependent answer when the depended-on answer is filled in or when the user gets to the dependent answer.
  • In some cases, a user might choose to change the answer to a question that was the source for pre-populating another question. In this case, the system MAY leave the dependent answer as-is. Alternatively, it MAY re-evaluate the population logic and, if the new answer would differ, prompt the user about whether they would like to change to the new calculated answer. (Note that this may trigger still further updates. Users should have a mechanism to cancel further prompts for updates if they feel bombarded.)
  • If a questionnaire is stored in partially completed state and then is re-opened sometime later for editing, it's possible that originally populated values could be out-of-date. Systems MAY choose to leave already populated data elements as they are. Alternatively, systems MAY re-execute queries to determine whether the new populated value would differ from the current value and, if so, prompt the user as to whether the question should be updated to the 'new' auto-populated value. Systems SHALL NOT replace already filled in answers without user approval.
  • If the units of measure are constrained for the Questionnaire item, the system can choose to convert the data (if the units are coded and the conversion factor is known) or choose not to populate.
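The precedence rules in this list can be summarised in a small decision function. This is a sketch, not a definitive reading of the rules: the function name and return labels are hypothetical, expression results are assumed to have been evaluated already into Python lists, and checking initialValue before candidateExpression is an assumption, since the text does not order those two relative to each other.

```python
from typing import Any, Optional


def resolve_population(initial_results: list[Any],
                       initial_value: Optional[Any] = None,
                       candidates: Optional[list[Any]] = None,
                       max_occurs: int = 1) -> tuple[str, list[Any]]:
    """Return (mode, values), where mode is 'full', 'choice' or 'none'."""
    if initial_results:
        if len(initial_results) > max_occurs:
            # Too many results: one permitted option is to let the user choose
            return ("choice", list(initial_results))
        return ("full", list(initial_results))
    if initial_value is not None:
        # initialExpression resolved to an empty set, so fall back to initialValue
        return ("full", [initial_value])
    if candidates:
        return ("choice", list(candidates))
    return ("none", [])
```

A system could equally decide not to populate at all when the initialExpression exceeds the item's maximum cardinality; the text allows either behaviour.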

StructureMap-based Population

The StructureMap approach is the most sophisticated approach of the three - and the most powerful. It allows iteration of groups based on repeating elements within resources, supports concept translation using structure maps and provides access to transformation capabilities not available with the FHIRPath approach. It also allows the conversion process between data and Questionnaire to be maintained independently and to draw on shared sources across Questionnaires. This can be an advantage in certain environments where the content of the questionnaire may need tight control, but the data environment can be more dynamic. This comes at the cost of requiring expertise in the FHIR mapping language, which is not (yet?) a common skill.

To use this method:

  1. As done for the FHIRPath approach, include the questionnaire-launchContext extension to identify any contextual information that needs to be passed into the population process. Typical contexts would be the Patient or Encounter resources in whose context the questionnaire is being completed, but other elements are also possible (e.g. an AdverseEvent if performing an adverse event report). These 'context' elements will be available as FHIRPath variables for use in subsequent steps.
  2. Include the sdc-questionnaire-sourceQueries extension which SHALL point to a Batch consisting of one or more queries to execute. Prior to executing, FHIRPaths embedded in the queries (referring to elements from the questionnaire-launchContext variables) SHALL be resolved.
  3. Include the questionnaire-sourceStructureMap extension. This SHALL define a transform between the Bundle of Bundles that will result from executing the sourceQueries Batch and a QuestionnaireResponse that complies with the Questionnaire.

For example:

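    [XML example: a Questionnaire carrying the questionnaire-launchContext, sdc-questionnaire-sourceQueries and questionnaire-sourceStructureMap extensions described in the steps above; the markup was not preserved in this copy]
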

To populate the QuestionnaireResponse, evaluate the FHIRPaths in the sourceQueries batch, execute the sourceQueries batch and then execute the StructureMap on the resulting Bundle. The result of that will be the pre-populated QuestionnaireResponse.
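The first of these steps, resolving FHIRPath embedded in the batch's queries, might be sketched as below. It assumes the double curly-brace convention described for FHIRPath-based population also applies here and that each query lives in entry.request.url of the batch Bundle. The `evaluate` callback is a hypothetical stand-in for a real FHIRPath engine, and `%patient.id` in the usage is only an illustrative variable name.

```python
import re
from typing import Callable

# Matches {{ expression }} and captures the expression text
EMBEDDED = re.compile(r"\{\{(.*?)\}\}")


def resolve_batch(batch: dict, evaluate: Callable[[str], str]) -> dict:
    """Return a copy of the batch with embedded expressions substituted."""
    entries = []
    for entry in batch.get("entry", []):
        request = dict(entry.get("request", {}))
        request["url"] = EMBEDDED.sub(
            lambda m: evaluate(m.group(1).strip()), request.get("url", "")
        )
        entries.append({**entry, "request": request})
    return {**batch, "entry": entries}


if __name__ == "__main__":
    batch = {
        "resourceType": "Bundle",
        "type": "batch",
        "entry": [{"request": {"method": "GET", "url": "Condition?patient={{%patient.id}}"}}],
    }
    context = {"%patient.id": "123"}  # pretend launch-context lookup
    print(resolve_batch(batch, lambda expr: context[expr]))
```

The resolved batch would then be posted to the server and the resulting Bundle handed to the StructureMap transform.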

Considerations when using this approach:

  • This approach will only work for pre-population using the Full Population mode. It cannot support continuous population or "Choice Selection" or "Answer Context" modes.
  • This mode has the drawback that if the StructureMap execution fails, there will generally not be any QuestionnaireResponse as output. As a result, the StructureMap must be designed to be very robust in the face of missing or potentially 'bad' data.
  • The ability of StructureMaps to reference other StructureMaps allows for the possibility of re-use if certain sections of multiple questionnaires are consistent.