Terminology Fundamentals
0.1.0 - CI Build International flag

Terminology Fundamentals, published by HL7 International / Terminology Infrastructure. This guide is not an authorized publication; it is the continuous build for version 0.1.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/terminology-fundamentals-ig/ and changes regularly. See the Directory of published versions

Code Systems

Page standards status: Informative

Code Systems

A Code System is defined as a managed collection of uniquely identifiable concepts with associated representations, designations, associations, and meanings. Examples of Code Systems include ICD-9 CM, SNOMED CT, LOINC, and CPT. To meet the requirements of a Code System as defined by HL7, a given concept identifier must resolve to one and only one meaning within the Code System; i.e. Code Systems used in HL7 must be represented by unique identifiers.

Although Code Systems may be differentially referred to as terminologies, vocabularies, or coding schemes, HL7 considers all such collections ‘Code Systems' . At a minimum, Code Systems have the following attributes:

  • An identifier that uniquely identifies the Code System. This SHALL be in the form of a Universal ID (UID). The UID is most commonly represented by the following forms: an ISO OID, or a URI. The HL7 OID Registry may be found at http://www.hl7.org/oid/index.cfm
  • Language that describes the Code System, and may include the Code System uses, maintenance strategy, intent and other information of interest.
  • Administrative information about the Code System, independent of any specific version of the Code System, such as ownership, source URL, and copyright information.

The Concept Representations in a Code System may also have additional text, annotations, references and other information that serve to further identify and clarify what a concept is. A Code System also may contain assertions about relationships that exist between concepts in the Code System.

A Code System is typically created for a particular purpose and may include finite collections, such as concepts that represent individual countries, colors, or states. Code Systems may also represent broad and complex collections of concepts, e.g., SNOMED-CT, ICD, LOINC, and CPT. Where possible, HL7 modelers faced with a requirement for a coded concept will reference an existing Code System. Some of these Code Systems are replicated within official HL7 publications for stability or convenience, while others are documented as references. HL7 will only create a new Code System when no appropriate Code System exists, or when an existing Code System is not available for use in HL7.

Post-Coordination in Code Systems

Code Systems may provide a capability for the composition of complex concepts consisting of two or more existing concepts and formal relationships between them, using a defined syntax. A complex concept which has been assigned a unique concept identifier or code and is published in the Code System is referred to as "pre-coordinated". A representation of a complex concept that is created externally to the Code System (i.e., by an outside entity or user) is called a post-coordinated expression. Note also that it is possible (although discouraged) for a post-coordinated expression to be constructed that represents a concept which already exists and is published in the Code System. Some commonly used code systems in health care IT, such as SNOMED-CT and UCUM, provide a formal mechanism for post-coordination. The mechanism (syntax) that permits this construction of new concept representations is referred to as a grammar for post-coordination.

A simple example to illustrate post-coordination is a case where there exists a concept code for the anatomical object 'arm', a concept code for the laterality identifier 'left', and a code used in post coordinated expressions for 'has laterality'. Although the code system may not have any code for 'left arm', it can be represented in the code system with the three codes, e.g. 'arm' - 'has laterality' - 'left'. HL7 permits the use of such expressions in the coded data types for representation in model instances. Another example is the concept "excision of pituitary gland" which may be pre-coordinated in the code system with a unique identifier and a designation of "hypophysectomy". An alternative definition for this concept can be obtained by post-coordinating codes for "brain excision" and "pituitary gland, " without adding a reusable designation to the source system. Post-coordination allows us to derive complex composite terms by combining the concepts they are derived from, which reduces the number of concepts in the source system. Post-coordination requires the explicit resolution of complex questions of context and semantic precedence (e.g., the preposition "of" in this example).

Code System Versions

Code Systems evolve over time. Changes occur because of corrections and clarifications, because the understanding of the entities being modeled evolves (e.g., new genes and proteins are discovered), because the entities being modeled change (e.g., new countries emerge; old countries are absorbed), or because the assessment of the relevance of particular entities within the knowledge resource change (e.g., the addition of new parent-child relationships to existing codes in SNOMED-CT). Depending upon how well the Code System adheres to good vocabulary practices, the changes in meaning could be significant. Therefore it can be important to know which version of a given Code System was used in the creation of HL7 model instances, the recording of coded data, or in some cases the creation of the HL7 models.

The HL7 model depends on the meaning of a specific concept identifier remaining constant over time, independent of the particular version of the knowledge resource. In the general case where codes do not change meaning over time (or where the meaning change is sufficiently subtle that any such meaning changes are deemed by the user as immaterial for the purpose for which the code is being carried), the version of the Code System is not required to be noted with the Code and the Code System. In cases where the knowledge resource itself doesn't enforce this (e.g. older versions of ICD-9-CM, where codes were retired and subsequently re-used to represent something different), then the version of the Code System in which the code is associated with the desired meaning must be included in the HL7 data type carrying the instance of the Code. Some code systems have changes that are significant over time, and for these a new code system has been designated rather than a new version of the same code system. This is the case, for example, with the German diagnostic classification codes: ICD10GM2009 is NOT a new version of ICD10GM2008, but a completely new code system, with its own independent versioning.

Code Systems employ a variety of mechanisms to signify a particular version; these are defined by the author and/or distributor of the Code System. Most use a number, or set of numbers, others may use alphanumeric strings or dates. There exists no standard representation of version identifiers that has been adopted by authors of Code Systems. Any version identifier that can be represented in an HL7 string data type can be used in HL7. When a code system is registered in THO or the HL7 OID Registry so that conformance may be tested and/or measured, the registration should include precise and detailed information on how code system versions are defined and identified. The registration information should be explicit enough to ensure that independent applications recording the code system version used to capture a particular data element will be recognized by other applications with whom they exchange information. Considerations include factors like use of upper-case vs. lower-case characters, the presence and types of separators, use of leading characters, order of components, and potentially even character set. For example "02.5" would not match with "2-5"; "2001/03/05" would not match with "05-03-2001", etc. The registration should indicate enough information about versioning and the syntax of the version string so as to make the machine processing of the version string a reasonable expectation.