Terminology Change Set Exchange
1.0.0 - STU1 Ballot

Terminology Change Set Exchange, published by HL7 International / Terminology Infrastructure. This guide is not an authorized publication; it is the continuous build for version 1.0.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/termchangeset-ig/ and changes regularly. See the Directory of published versions

Tinkar Business Requirements

Page standards status: Informative

This section details specific business requirements for a Tinkar logical model.

Clinical Requirements

The ultimate goal of this effort is to support the coordination of safe, effective medicine (Requirement 1). This goal requires quality information in the patient record (Requirement 2), wherever it comes from, and the increasingly distributed nature of care requires commonly understood data standards (Requirement 3) to ensure mutual comprehension across the care team and over time. There are four outlined clinical use cases:

  • Record Patient Data

A care provider, already authenticated and authorized to the system and using the appropriate context to ensure the system records the data for the correct patient, adds or modifies information in the patient record. This may include signs, symptoms, impressions, diagnoses, orders, notes, or other assets.

This operation may initiate workflow processes or automated processes such as clinical decision support suggestions.

For structured data using standard terminologies, the terms available are appropriate (Requirement 4) for the clinical context, for the role (i.e., terms may differ for different kinds of users), and for the data context (e.g., data entry fields may not support inactive or deprecated terms that would be allowed in search or analytical contexts).

If the available terminology does not support the provider’s needs, the provider may assert a need for a new term.

  • Propose Terminology Change

If a provider attempts to enter a term that is not supported by the enterprise terminology, the effort will be captured as a proposed term (Requirement 5).

Systems may capture this information unobtrusively as text, or prompt further information from the clinician to assist the authoring process. The system will convey at least the text and the identity of the clinician to the terminologist.

  • Review Patient Data

A provider, already authenticated and authorized to the system and using the appropriate context to ensure the system retrieves the data for the correct patient, finds and reviews information in the patient record.

For structured data using standard terminologies, the terms available are appropriate for the clinical context, for the role (i.e., terms may differ for different kinds of users), and for the data context (e.g., data entry fields may not support inactive or deprecated terms that would be allowed in search or analytical contexts).

Changes to the terminology that could affect record interpretation will be indicated (Requirement 6), along with a way to identify the change and its effect.

  • Review Knowledge Base Changes Relevant to Record

If the system identifies a relevant change, the provider may request further information.

This will include the ability to see available values and CDS results for specific dates and contexts (Requirement 7), including those under which the data was recorded or specific decisions were made.

Clinical Use Cases

The key capability for a clinician should be to record and review data quickly and accurately (Requirement 8), taking advantage of up-to-date classifications and decision support rules. This should be accomplished by knowing when a change in the knowledge base might affect a record. The change management capability that supports these operations should be as unobtrusive as possible to patients and care providers, and always readily available.

These operations depend on the availability not only of currently accurate terminology assets, but also assets from prior points in time (Requirement 7). These may include assets as defined or refined by different stakeholders with different sets of assumptions. For instance, whether a disorder meets a criterion may depend on whether that criterion is defined by a standard terminology, a payor, a professional society, or a locally chartered board of specialists.

To support these needs, the Enterprise Terminology that supports the clinical systems must manage change systematically (Requirement 9), and it must do so for both internally-managed and externally-sourced assets.

Asset Curation Requirements

Curation of these assets requires detailed change data. The evolution and maturation of knowledge happens at different times and places. Keeping standards and relationships to standards current is a complex undertaking. A health system may subscribe to dozens of standard and commercial terminologies, each of which may publish scheduled updates several times a year, and any of which may push out an emergency update at any time. All these assets have different designs, so ensuring continued cohesion is expensive and time-consuming, and the necessary transformations introduce risk. Systematic management of change requires granular representation of the assets and associated asset changes.

There are best-practice capabilities in knowledge asset maintenance. The following is proposed for clinical data standards:

  • Unique object identification (Requirement 10): Every object under version control must have a unique identifier, and the identifier must remain unchanged as the object is modified and different versions of it are created and saved.

  • Version history retention (Requirement 11): Each version of an object must be persisted as the object changes over time, along with metadata indicating its version identifier, time of creation, creating author, and branch of the version control system on which it was created. Further, every version of each object must remain available for retrieval and inspection.

  • Version comparison (Requirement 12): It must be easy to compare two versions of the same object and identify all differences between them. Among other things, this capability is important to determine whether updates to a sub-artifact have changed its semantics in a way that may affect the behavior of one or more of its parent artifacts. Ready comparison is also important when merging two or more concurrent development efforts involving the same knowledge artifacts.

  • Branching capabilities (Requirement 13): It must be possible to create a virtual copy of the entire version-control configuration, or a defined subset, in a new “path,” such that changes made to objects in this branch do not appear in the original configuration. This capability allows individual knowledge engineers to make and test changes to knowledge artifacts without affecting the work of other knowledge engineers or the integrity of knowledge artifacts currently in production. This facility is critical to the orderly and safe management of a clinical decision support system.

  • Merging capabilities (Requirement 14): It must be possible to incorporate all the changes made on one branch of the version-control repository into another branch, such that any conflicts between different versions of the same objects are detected and resolved. This capability is important to enable work done by multiple knowledge engineers concurrently to be combined and incorporated into the main branch of the repository. The merging capability is also important to allow knowledge engineers to update their local branches of the repository with changes that may have been made by others to the main branch, thus ensuring that changes will remain compatible with the latest version of the system.
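
The first three capabilities above (stable identification, version retention, and version comparison) can be illustrated with a minimal sketch. The class and field names here (`VersionedObject`, `Version`, `compare`) are hypothetical and are not defined by this guide; branching and merging are left out to keep the example short.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    """One immutable version of an object (hypothetical structure)."""
    version_id: int
    created: datetime
    author: str
    branch: str
    fields: dict


@dataclass
class VersionedObject:
    """An object whose identifier stays fixed while versions accumulate."""
    object_id: uuid.UUID = field(default_factory=uuid.uuid4)
    versions: list = field(default_factory=list)

    def save(self, fields, author, branch="main"):
        # Append only: earlier versions are never edited or removed.
        version = Version(
            version_id=len(self.versions) + 1,
            created=datetime.now(timezone.utc),
            author=author,
            branch=branch,
            fields=dict(fields),
        )
        self.versions.append(version)
        return version


def compare(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    keys = set(old.fields) | set(new.fields)
    return {
        key: (old.fields.get(key), new.fields.get(key))
        for key in sorted(keys)
        if old.fields.get(key) != new.fields.get(key)
    }


obj = VersionedObject()
v1 = obj.save({"term": "Heart attack", "status": "active"}, author="alice")
v2 = obj.save({"term": "Myocardial infarction", "status": "active"}, author="bob")
differences = compare(v1, v2)
```

Here `differences` contains only the `term` field, and both versions remain retrievable from `obj.versions` under the same `object_id`.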

These core properties support authoring and maintenance operations: at a high level, this means modifying the enterprise terminologies (Requirement 15), importing standard terminologies (Requirement 16), and publishing the enterprise terminologies (Requirement 17) to the client clinical systems. The standard terminology publisher has the same needs around modification and publishing as the enterprise, and some standards import other standards as well (e.g., Medication Reference Terminology [MedRT], which publishes relationships among other standards).

We distinguish between the Enterprise Terminologist and the SDO Terminologist. The Enterprise Terminologist is responsible for ensuring that the terminology resources provided to clinical systems are current and accurate. This involves managing the consumption of external terminologies as well as maintenance of assets defined within the enterprise. The SDO Terminologist is responsible for ensuring that the terminology resources provided to other terminology systems are current and accurate. This may involve managing the consumption of external terminologies as well as maintenance of assets defined within the SDO.

  • Modify Enterprise Terminology

A user adds, modifies, or deactivates content in the terminology assets of the enterprise, including assets provided to clinical systems as well as management data used only within the knowledge base.

  • Publish Enterprise Terminology

A user manages the publication process that supports the automated provision of terminology content to clinical systems.

  • Import Standard Terminology

A user incorporates a new standard terminology or new version of a standard terminology into the enterprise terminology. During this process, functionality supports the assessment and management of impacts on existing enterprise assets.

  • Publish Standard Terminology

A user manages the publication process that supports the automated provision of terminology content to client terminology servers.

  • Modify Standard Terminology

A user adds, modifies, or deactivates content in the terminology assets of the standard, including assets provided to client terminology systems as well as management data used only within the knowledge base (e.g., changing SNOMED CT ® relationships, inserting a concept in between two existing concepts). Note that a standard can only be modified by the standard owner. A client enterprise may add to or modify the content in an “overlay,” but those changes are part of the local enterprise assets. The client enterprise cannot actually modify the standard.

Asset Curation Use Cases

Today, clinical systems consume terminologies, but the interfaces are point-to-point. To assert or assess new information, the tools must already understand all relevant interface models. Since an external organization may modify that model at any time, the ability to consume external assets involves ongoing manual efforts to understand or confirm the model and the design of transformations to support consumption. This is potentially expensive and risky.

We propose a “data-driven” architecture to support self-describing terminology assets. All changes can be programmatically managed with a globally consistent design. Management may involve human review, but it can leverage pattern-based recognition of specific change types for automated handling, leaving a smaller number of cases that require human judgment. This information design will support a common representation of all terminologies. There are two key requirements for this design:

The context information of the first requirement includes the following:

  1. The Status of the asset: whether it should be considered active or inactive in the context of these other attributes (Requirement 19). For systems that do not support status, the default will be “active.”

  2. The Time of the change, specified with a time zone and at an appropriate precision (Requirement 20). For systems that do not provide a time, the default will be the release time.

  3. The Author of the creation or change, unambiguously identified (Requirement 21). For systems that do not provide an author, a default author will be created for the system.

  4. The domain or organizational name of the larger asset within which the component is meaningful, such as code system or edition (a.k.a., Module) (Requirement 22). For systems that do not provide a module, a default module will be created for the system.

  5. The production branch of that organization, e.g., for distributed development, testing, staging, or production (a.k.a., Path) (Requirement 23). For systems that do not provide a path, a default path will be created for the system.

These elements together are referred to by the acronym “STAMP.” Every new assertion, whether a new asset or a change to an existing asset, must have a STAMP to determine when it is to be used. The STAMP properties support the ability to apply terminology assets for specific purposes. For example:

  • “Path” can be used to test provisional content without physically swapping out systems.

  • “Modules” are used to organize content for maintenance and publication purposes. Modules are the domain or organizational name of the larger asset within which the component is meaningful, such as code system or edition. Modularity for terminologies should follow a similar design to modularity in software engineering. Deciding what belongs in certain modules or extensions within certain terminologies is a difficult subject that is out of scope for this document, but the ability to create modules, recognize redundancy, and merge or retire concepts is an important requirement that must be supported. [12-14]

  • “Time” supports the ability to apply CDS rules as they would have looked in the past.

A further requirement is that not only must the architecture support these properties, but that it must require the properties for all assets under curation. Without consistent application of this rule, the foundational capability of detailed version management is more difficult.

Additionally, for an asset to support a record of changes, each asset must itself be identifiable (Requirement 10).
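
As a rough illustration of the STAMP elements and their defaults described above, the following sketch builds a STAMP for a source system that supplies only a release time, and then resolves which version of an asset was current at a given time on a given path. All names (`Stamp`, `make_stamp`, `AssetVersion`, `version_as_of`, and the `"ExampleSystem"` labels) are hypothetical and chosen for illustration; the wording of the default author, module, and path values is an assumption.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Stamp:
    status: str      # "active" or "inactive"
    time: datetime   # must carry a time zone
    author: str
    module: str
    path: str


def make_stamp(system, release_time, status=None, time=None,
               author=None, module=None, path=None):
    """Fill any STAMP element the source system does not provide."""
    when = time or release_time          # default time: the release time
    if when.tzinfo is None:
        raise ValueError("STAMP time must carry a time zone")
    return Stamp(
        status=status or "active",       # default status: active
        time=when,
        author=author or f"{system} default author",
        module=module or f"{system} default module",
        path=path or f"{system} default path",
    )


@dataclass(frozen=True)
class AssetVersion:
    asset_id: uuid.UUID   # stable identifier shared by all versions
    value: str
    stamp: Stamp


def version_as_of(versions, asset_id, path, as_of):
    """Return the latest version of an asset on a path at a point in time."""
    candidates = [
        v for v in versions
        if v.asset_id == asset_id
        and v.stamp.path == path
        and v.stamp.time <= as_of
    ]
    return max(candidates, key=lambda v: v.stamp.time, default=None)


release_2020 = datetime(2020, 1, 1, tzinfo=timezone.utc)
release_2022 = datetime(2022, 1, 1, tzinfo=timezone.utc)
concept = uuid.uuid4()
history = [
    AssetVersion(concept, "Old definition", make_stamp("ExampleSystem", release_2020)),
    AssetVersion(concept, "New definition", make_stamp("ExampleSystem", release_2022)),
]
mid_2021 = datetime(2021, 6, 1, tzinfo=timezone.utc)
seen_in_2021 = version_as_of(history, concept, "ExampleSystem default path", mid_2021)
print(seen_in_2021.value)  # Old definition
```

Because every version carries a complete STAMP, the same lookup can answer both "what is current now" and "what was in effect when this record was written" without a separate archive.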

The “single syntax” requirement is harder to satisfy. One approach would be to define a syntax that addresses the data elements of all known terminologies. This would be a heavy specification that would be difficult to maintain and could fail to capture new elements as terminologies are added in the future.

The other option is to use a “self-describing” or “meta-modeling” approach, where the syntax defines not only the content but also what the content means. “Rigid” or “brittle” specifications determine in advance where information belongs: a database may use column names to suggest what belongs in a column, but there is no way to determine whether the name is a good one, or whether an instance value meets the criterion implied by the name. Flexible specifications, by contrast, support data definition. Extensible Markup Language (XML), a subset of Standard Generalized Markup Language (SGML), provides a way to specify types of data and structural (not semantic) relationships. Resource Description Framework (RDF) goes one step further by making the relationship between an element and its containing class an explicit part of every triple. If this relationship is specified in a controlled terminology, then assertions can be tested for validity. For example, if an RDF Schema (RDFS) specification asserts that the finding site of a lesion must be an anatomical feature, then assertions about actual lesions can be tested for valid finding sites. Furthermore, this logic specifies a “range” in the same syntactic structure as the instance assertion: changes to the knowledge base do not affect the syntactical representation of the knowledge. Systems that adopt this approach will still require effort to take advantage of new features of terminologies, but they will not have to rebuild their infrastructure when changes are made.
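
The finding-site example can be sketched with plain tuples standing in for RDF triples. This is an illustration only: the `ex:` identifiers are invented, no RDF library is used, and the check treats the declared range as a closed-world constraint, which is a simplification (standard RDFS semantics would infer the object's type from a range declaration instead of rejecting the assertion).

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"

# Schema statements and instance statements share one syntactic structure.
triples = {
    ("ex:findingSite", RDFS_RANGE, "ex:AnatomicalFeature"),   # schema
    ("ex:leftForearm", RDF_TYPE, "ex:AnatomicalFeature"),     # typing
    ("ex:aspirin", RDF_TYPE, "ex:Substance"),                 # typing
    ("ex:lesion1", "ex:findingSite", "ex:leftForearm"),       # valid assertion
    ("ex:lesion2", "ex:findingSite", "ex:aspirin"),           # invalid assertion
}


def range_violations(triples):
    """Return assertions whose object is not typed with the predicate's range."""
    ranges = {s: o for s, p, o in triples if p == RDFS_RANGE}
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    return sorted(
        (s, p, o)
        for s, p, o in triples
        if p in ranges and ranges[p] not in types.get(o, set())
    )


print(range_violations(triples))
# [('ex:lesion2', 'ex:findingSite', 'ex:aspirin')]
```

Adding a new range declaration is just another triple, so the validation code does not change when the knowledge base does.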

Having change data in discrete tagged change sets will allow the software to hide most of the complexity of version management from the human managers, allowing them to focus on significant decisions.

Configuration Requirements

A granular self-describing model will support any statement that can be made using concepts in a subject-predicate-object structure, and its compositional aspect permits compound predicates. It is difficult to imagine a proposition that cannot be supported. However, this also means that there are multiple ways to support any specific kind of statement that a terminology knowledge base must support. This section addresses best practices for these cases.

Operations

Import: A user may identify content from another system and write it into the Terminology Repository. When this happens, the new content will be recorded in the common, self-describing format. When a set of content is imported, rules asserted by the source steward or the Terminology Repository steward may be used to assert structural equivalence in the repository (i.e., different source concepts may be represented as alternate representations of the same root concept). During importation of subsequent versions of a system, changes to assets on which other enterprise assets depend must be identified and managed as directed by documented policies. The import operation will usually identify sets of such changes which require prioritization to prevent redundant processing.

Search (Requirement 24): A user may use lexical or concept-based parameters to search for a set of matching assets.

View (Requirement 25): A user may view an asset, the view consisting of related information associated in visually appropriate ways. This view may omit information not appropriate to the user’s context.

Compare (Requirement 12): A user may view related assets, including versions of a component, in a form designed to support analytical comparison (e.g., side-by-side display).

Authoring/Maintenance: A user may modify existing content or add new content. To preserve prior states, all modifications are recorded as new versions of content: prior versions will remain unchanged. Any time a change is made, the system will identify dependent assets and rules for handling these changes.

  • An addition (Requirement 26) is a new version with a new asset UUID (universally unique identifier). Patterns may assert constraints for additions, which may be specific to context (Modules, Paths, Languages, etc.).

  • An inactivation (Requirement 27) is a new version of an existing asset with status set to “inactive.” Patterns may assert rules for deletions, which may be specific to context.

  • A change (Requirement 28) is a new version of an existing asset with the new value(s), distinguishable by STAMP value. A change may involve only a STAMP value, for example, deactivation or import of a concept into a new module or path.
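
The three authoring operations above can be sketched as appends to a version log, so that prior versions are never altered. The `Repository` class, its method names, and the module and path labels are hypothetical; the STAMP elements are flattened into the version record for brevity.

```python
import uuid
from dataclasses import dataclass, replace
from datetime import datetime, timezone


@dataclass(frozen=True)
class AssetVersion:
    asset_id: uuid.UUID
    value: str
    status: str
    time: datetime
    author: str
    module: str
    path: str


class Repository:
    """Append-only store: every operation records a new version."""

    def __init__(self):
        self.versions = []

    def latest(self, asset_id):
        matches = [v for v in self.versions if v.asset_id == asset_id]
        if not matches:
            raise KeyError(asset_id)
        return matches[-1]

    def _record(self, version):
        self.versions.append(version)
        return version

    def add(self, value, author, module, path):
        # Addition: a new version with a new asset UUID.
        return self._record(AssetVersion(
            uuid.uuid4(), value, "active",
            datetime.now(timezone.utc), author, module, path))

    def inactivate(self, asset_id, author):
        # Inactivation: a new version of an existing asset, status "inactive".
        return self._record(replace(
            self.latest(asset_id), status="inactive",
            time=datetime.now(timezone.utc), author=author))

    def change(self, asset_id, author, **updates):
        # Change: a new version carrying new values and/or STAMP elements.
        return self._record(replace(
            self.latest(asset_id),
            time=datetime.now(timezone.utc), author=author, **updates))


repo = Repository()
created = repo.add("Fever", "alice", "local-module", "development")
moved = repo.change(created.asset_id, "bob", path="publication")
retired = repo.inactivate(created.asset_id, "carol")
```

After these three calls the log holds three versions sharing one `asset_id`, and the original `created` version is still intact on the `development` path.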

Classify (Requirement 29): A user may select a logical profile and classifier and use classification logic to test equivalence and subsumption of identified assets, or to generate a set of inferred relationships from a set of stated relationships. An inferred set may be persisted.

Publish (Requirement 30): A user may promote content into a “publication” path and produce a transmissible payload of content that can be consumed by other repositories. This promotion is a change and may require resolution of constraints on membership in that path.

Patterns for Representing Various Assets

The data architecture must support patterns for the representation of many kinds of assets. A minimal list includes the following:

  1. A term must have:

    1. A string representation

    2. A language, possibly including refinements

    3. An indicator of case sensitivity

    4. A type used to represent whether a term is a synonym, fully qualified name (for example: SNOMED CT ® Fully Specified Name or LOINC ® Long Common Name), definition, etc.

  2. A concept must have:

    1. At least one term

    2. At least one parent, except for root concepts or terminologies that are not hierarchical

  3. A logical definition must have:

    1. A definitional status

  4. STAMP values must include:

    1. “Active” and “inactive” status concepts

    2. At least one “default” author

    3. At least one “root” module

    4. Paths supporting “development” and “publication”

  5. An inferred classification must indicate:

    1. The classifier used for its generation

    2. The logic profile used for its generation

    3. The stated asset(s)

  6. The module dependency graph:

    1. Identifies the root module

    2. Lists all other modules, indicating dependency

    3. Must be acyclical
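
A check that a module dependency graph is acyclic can be sketched with the Python standard library (`graphlib`, available from Python 3.9). The function name, the module names, and the rule that the root module has no dependencies of its own are assumptions made for this example.

```python
from graphlib import CycleError, TopologicalSorter


def check_module_graph(dependencies, root):
    """Validate a module dependency graph and return a safe load order.

    `dependencies` maps each module to the set of modules it depends on.
    """
    # Assumption: the root module is listed and depends on nothing.
    if root not in dependencies or dependencies[root]:
        raise ValueError("root module must be listed and have no dependencies")
    try:
        # static_order() yields each module after the modules it depends on.
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # The second element of args holds the detected cycle.
        raise ValueError(f"module dependency cycle: {err.args[1]}") from err


modules = {
    "root": set(),
    "core": {"root"},
    "extension": {"core"},
}
print(check_module_graph(modules, "root"))  # ['root', 'core', 'extension']
```

A graph in which, for example, `core` also depended on `extension` would raise a `ValueError` naming the cycle.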

Many other patterns may be present. Implementations are expected to support:

  1. Any assembly of relationships associating one concept with another which must have:

    1. At least one default rule (constraint) for handling changes (e.g., whether assets dependent on changed assets can be automatically handled or require intervention)

  2. Any assembly of relationships may include components that are themselves semantics

  3. Value sets may include:

    1. Rule-based member inclusions

    2. Enumerated members

  4. System-specific import rules:

    1. System equivalences for Tinkar attribute and other infrastructure concepts

    2. Specified exclusions of logical assertions to support equivalence-on-import inferences irrespective of administrative metadata

  5. Maps:

    1. Relationships for equivalence assertions

    2. Relationships for subsumption assertions

    3. Relationships for other functions (e.g., U.S. Centers for Disease Control and Prevention (CDC) Reportable Condition Mapping Table)

  6. Constraints on asset patterns, including:

    1. Logical composition constraints on concepts (e.g., the SNOMED CT ® concept model)

    2. Syntactic compositional constraints on strings (e.g., Multipurpose Internet Mail Extensions [MIME] types, International Organization for Standardization [ISO] languages, or UCUM units)

    3. Pattern constraints, e.g., presence of exactly one name classified as “fully specified,” or names in specified languages

    4. Rules that may govern modifications to other assets (e.g., incremental addition of effort estimates based on known problematic terms).

One other feature is the set of concepts that the application will use to determine how to present the data to the user. A key dimension is the STAMP information defined above. In addition, three other “coordinates” are required for managing the presentation:

  1. Language: A user may assert a required or preferred language, or a set of ranked language priorities.

  2. Logic: A user may select the parameters for logical classification.

  3. Navigation: A user may select the parameters for presentation of the logical classification.

Like other concepts, these can be represented by the core data architecture. The application implementing the Tinkar specification must be able to identify those concepts appropriate for these uses.
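
As an illustration of the Language coordinate, the sketch below picks the term to display from a ranked list of language priorities. The function name and the use of plain language tags are assumptions for the example; a real implementation would identify languages by concepts in the repository.

```python
def pick_term(terms, language_priorities):
    """Return the term in the highest-ranked available language, or None.

    `terms` is a list of (language_tag, text) pairs.
    """
    by_language = {}
    for language, text in terms:
        by_language.setdefault(language, text)  # keep the first term per language
    for language in language_priorities:
        if language in by_language:
            return by_language[language]
    return None


terms = [("en-US", "Myocardial infarction"), ("es", "Infarto de miocardio")]
print(pick_term(terms, ["fr", "es", "en-US"]))  # Infarto de miocardio
```

Returning `None` when no listed language is available lets the caller distinguish a required language (treat `None` as an error) from a preferred one (fall back to any available term).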

Constraints

Constraints are required to:

  1. Ensure that the appropriate level of detail for standard terminologies is represented within Tinkar

  2. Create terminology extensions that conform to the requirements of the standard(s) the extension is based on

  3. Perform general quality assurance

For example, constraints would be used to represent standard terminology artifacts, like the SNOMED CT ® Machine Readable Concept Model. Additionally, constraints could be used to ensure that the terminologies represented within a Tinkar implementation are completely and consistently queried and displayed.

These same constraints can be used to create new content within a Tinkar implementation to specify the minimally viable data that would be required. For example:

  1. All concepts must have at least one Fully Qualified Name within at least one Language or Dialect

  2. All concepts must have at least one Name specified as Preferred within at least one Language or Dialect

  3. All concepts must have at least one parent, unless it is a root concept
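
The three example constraints above can be sketched as a simple validation function. The `Concept` and `Name` structures and their flags are hypothetical simplifications; in a Tinkar implementation these constraints would themselves be represented as versioned content.

```python
from dataclasses import dataclass, field


@dataclass
class Name:
    text: str
    language: str
    fully_qualified: bool = False
    preferred: bool = False


@dataclass
class Concept:
    concept_id: str
    names: list = field(default_factory=list)
    parents: list = field(default_factory=list)
    is_root: bool = False


def constraint_violations(concept):
    """Return a list of human-readable problems for one concept."""
    problems = []
    if not any(name.fully_qualified for name in concept.names):
        problems.append("missing Fully Qualified Name")
    if not any(name.preferred for name in concept.names):
        problems.append("missing Preferred name")
    if not concept.parents and not concept.is_root:
        problems.append("missing parent")
    return problems


complete = Concept(
    "c1",
    names=[
        Name("Fever (finding)", "en", fully_qualified=True),
        Name("Fever", "en", preferred=True),
    ],
    parents=["c0"],
)
incomplete = Concept("c2", names=[Name("Chills", "en", preferred=True)])

print(constraint_violations(complete))    # []
print(constraint_violations(incomplete))  # ['missing Fully Qualified Name', 'missing parent']
```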

Constraints can be applied (or not applied) based on various criteria to perform Quality Assurance on content that is represented within a Tinkar implementation. For example:

  1. SNOMED CT ® Fully Specified Name hierarchy tags are applied based on where a concept exists in a hierarchy

  2. Relationships between concepts have domain (based on hierarchy) and range (the hierarchy(s) of values that a relationship takes)

  3. Modeling templates can be specified to ensure that new content that is created under a certain node in a hierarchy uses similar wording and relationships

Since some Quality Assurance Constraints do not always indicate an error, an Allow List could also be represented as a Semantic to record concepts that are allowed to not conform to a constraint. Constraints would be represented using semantics, as they are self-describing and can support multiple different representations for constraints (SNOMED CT ® Expression Constraint Language, Drools, etc.). Representing Constraints as a Semantic also ensures that each constraint carries a STAMP, so that constraints are versioned over time, capture author information, and can be tested and progressed across different modules and paths.

Implementing Constraints would depend upon how the Constraints are written and formatted. For example, implementers could utilize a Rete algorithm through something like Drools to implement Constraints.

Minimally Required Content

A Tinkar implementation must be furnished with the following content:

  1. One root concept

  2. One module dependency graph

  3. Infrastructure concepts (a.k.a. Tinkar Model Concepts) to support the core patterns listed above

  4. Import rules to support import of standard terminologies, including:

    1. Equivalences to support semantic integration of terminologies (e.g., that a LOINC ® “system” instantiates the same relationship concept as the SNOMED CT ® “inheres in” attribute)

    2. Exclusions to support removal of non-semantic properties from classification (e.g., RxNorm Translated CDs)

List of Requirements

| ID | Requirement | Level | Clinical System | Terminology Management | Information Design |
|----|-------------|-------|-----------------|------------------------|--------------------|
| 1 | Support the practice and coordination of safe, effective medicine | Need | x | | |
| 2 | Provide quality information in the patient record | Need | x | | |
| 3 | Represent information in commonly understood data standards | Feature | x | x | x |
| 4 | Provide terms appropriate to the context | Feature | x | x | x |
| 5 | Capture terminology suggestions from point of care | Function | x | | |
| 6 | Indicate data for which changes to the terminology could affect record interpretation | Function | x | x | x |
| 7 | View available terms and decision support recommendations for specified dates and contexts | Function | x | x | x |
| 8 | Support rapid and accurate recording and review of record data | Need | x | | |
| 9 | Manage change systematically | Feature | | x | x |
| 10 | Identify assets uniquely | Function | | x | x |
| 11 | Retain all prior versions | Function | | x | x |
| 12 | Support comparison of versions | Function | | x | x |
| 13 | Support branching of sets of assets for independent development | Function | | x | x |
| 14 | Support controlled merging of branches by identifying and addressing conflicts with defined rules | Function | | x | x |
| 15 | Modify enterprise terminology by creating, modifying, or deactivating assets and relationships | Function | | x | x |
| 16 | Import standard terminologies, including merging capability for assets referring to prior versions of the standard | Function | | x | x |
| 17 | Publish enterprise terminologies, including application and resolution of constraints specific to the publication path | Function | | x | x |
| 18 | A self-describing method for representing terminology assets from diverse and mutable models | Feature | | x | |
| 19 | Represent the status of the asset in a context | Function | | x | x |
| 20 | Represent the time at which a change is recorded | Function | | x | x |
| 21 | Represent the author of a change | Function | | x | x |
| 22 | Represent the system or sub-system of an asset | Function | | x | x |
| 23 | Represent the path or branch of an asset version | Function | | x | x |
| 24 | Support search using lexical or logical criteria | Function | | x | |
| 25 | Support detailed view of assets and diverse properties, filtering content not relevant to the chosen context | Function | | x | |
| 26 | Add terminology assets, including concepts, terms, relationships, definitions, value sets, maps, and others | Function | | x | x |
| 27 | Deactivate assets, preserving their original form | Function | | x | x |
| 28 | Modify assets, preserving their original form | Function | | x | x |
| 29 | Classify assets using identified tools and logical profiles in chosen contexts, with the option to persist the inferred assets | Function | | x | |
| 30 | Process a set of content for publication, including identification and resolution of unresolved constraints | Function | | x | |