Clinical Genomics Resource Incubator, published by HL7 International / Clinical Genomics. This guide is not an authorized publication; it is the continuous build for version 0.1.0-ci-build built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/cg-incubator/ and changes regularly. See the Directory of published versions
| Field | Value |
|---|---|
| Official URL | http://hl7.org/fhir/StructureDefinition/MolecularDefinition |
| Version | 0.1.0-ci-build |
| Standards status | Draft (as of 2025-01-30) |
| Maturity Level | 0 |
| Computable Name | MolecularDefinition |
| Other Identifiers | OID:2.16.840.1.113883.4.642.5.1301 |
Definitional content for a molecular entity, such as a nucleotide or protein sequence.
The MolecularDefinition resource represents molecular entities (e.g., nucleotide or protein sequences) for both clinical and non-clinical use cases, including translational research. The resource is definitional, in that it focuses on discrete, computable, and semantically expressive data structures that reflect the genomic domain. Because the resource focuses on the molecular entities rather than specimen source or annotated knowledge, it supports both patient/participant-specific use cases and population-based data, and both human and non-human data.
The MolecularDefinition resource itself is abstract, but it supports profiles for core molecular concepts, including Sequence (nucleotide and protein), Allele, Variation, Haplotype, and Genotype. Support for additional molecular types, such as structural variation, fusions, and biomarkers, will be considered in the future.
Use cases supported by this resource include but are not limited to:
Use cases often require the same genomic concept to be expressed in different ways. Because the concept is the same and only its serialization differs, the MolecularDefinition resource supports multiple approaches to representing molecular sequences. This allows senders and receivers of messages to choose the sequence representation that is most intuitive for their particular use case.
All representations of a given sequence MUST resolve to exactly the same primary sequence. For example, if a single instance of MolecularDefinition contains one literal, two resolvable files, and a code, all four representations must describe the same sequence. This equivalence does not extend to metadata or annotations outside the scope of the MolecularDefinition resource, because those data are not definitional to the molecule.
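The equivalence rule can be illustrated with a minimal Python sketch. It assumes that each representation has already been resolved to a plain string by some other means (file retrieval and code lookup are out of scope here), and it assumes that whitespace and letter case are formatting rather than part of the primary sequence. The function names `normalize` and `representations_agree` are hypothetical and not part of the specification.

```python
from typing import Iterable


def normalize(seq: str) -> str:
    """Drop whitespace and upper-case the residues.

    Assumption: line breaks, spaces, and letter case are formatting,
    not part of the primary sequence.
    """
    return "".join(seq.split()).upper()


def representations_agree(resolved: Iterable[str]) -> bool:
    """Return True when every resolved representation yields the same sequence."""
    distinct = {normalize(s) for s in resolved}
    return len(distinct) <= 1


# Hypothetical resolved values: one literal, two files, and one code lookup.
resolved = ["ACGTAC", "acgtac", "ACG TAC\n", "ACGTAC"]
print(representations_agree(resolved))
```

A receiver could run a check of this kind before trusting any single representation in an instance that carries several.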
The MolecularDefinition resource should be profiled and used to capture representations of molecular concepts such as sequence, allele, haplotype, and genotype.
This resource does not capture workflow (e.g., test ordering/resulting process), the method of obtaining or specifying the molecular content (e.g., the test or assay), or the interpretation of the results (e.g., clinical impact). Those concepts will be captured by profiles of Observation and by the Genomic Study resource. In particular, the Genomics Reporting Implementation Guide contains extensive support for the observation and reporting of clinical genomic results.
| Name | Flags | Card. | Type | Description & Constraints |
|---|---|---|---|---|
| MolecularDefinition | | 0..* | DomainResource | Definitional content for a molecular entity. Elements defined in Ancestors: id, meta, implicitRules, language, text, contained, extension, modifierExtension |
| identifier | Σ | 0..* | Identifier | Unique ID of an instance |
| description | | 0..1 | markdown | Description of the Molecular Definition instance |
| moleculeType | Σ | 0..1 | CodeableConcept | The type of molecule (e.g., DNA, RNA, polypeptide). Binding: Molecular Definition Molecule Type (required): The broad physical class of molecule: DNA, RNA, or polypeptide. Codes are drawn from the Sequence Ontology (SO). |
| type | Σ | 0..* | CodeableConcept | Domain-semantic subtype classification of the molecule. Binding: Molecular Definition Type (extensible): Domain-semantic subtype of the molecule (e.g., mRNA, genomic_DNA, lncRNA). Codes are drawn from a curated subset of Sequence Ontology (SO). Implementers may use additional SO codes or codes from other systems when no suitable code exists in this set. |
| topology | Σ | 0..* | CodeableConcept | The structural topology of the molecular entity (e.g., linear, circular). Binding: Molecular Definition Topology (required): The structural topology of the molecule (e.g., linear, circular, linear-discontiguous, branched). Codes are drawn from the locally defined Molecular Definition Topology CodeSystem. |
| member | Σ | 0..* | Reference(Molecular Definition) | Constituents of an aggregate molecular concept (e.g., haplotype, genotype) |
| location | Σ | 0..* | BackboneElement | A defined location on a molecular entity |
| location.sequenceLocation | Σ | 0..1 | BackboneElement | A coordinate-based location on a sequence |
| location.sequenceLocation.sequenceContext | Σ | 1..1 | Reference(Molecular Definition) | The sequence on which the location is defined |
| location.sequenceLocation.coordinateInterval | Σ | 0..1 | BackboneElement | An interval on a sequence |
| location.sequenceLocation.coordinateInterval.coordinateSystem | Σ | 0..1 | BackboneElement | The coordinate system used to define the location |
| location.sequenceLocation.coordinateInterval.coordinateSystem.system | Σ | 0..1 | CodeableConcept | The type of coordinate system used. Binding: LOINC Answer List LL5323-2 (extensible): Coordinate system type governing position counting. Codes from LOINC answer list LL5323-2 correspond to widely-used systems: LA30100-4 (0-based interval counting, used by UCSC BED, GA4GH VRS, and SPDI), LA30101-2 (0-based character counting), LA30102-0 (1-based character counting, used by HGVS c./g./n./p. and RefSeq), and LA30103-8 (1-based interval counting). |
| location.sequenceLocation.coordinateInterval.coordinateSystem.origin | Σ | 0..1 | CodeableConcept | The location of the origin of the coordinate system. Binding: Coordinate System Origin ValueSet (required): The reference landmark from which coordinates are measured. Unambiguous origin specification is essential for correct variant interpretation and cross-system interoperability. |
| location.sequenceLocation.coordinateInterval.coordinateSystem.normalizationMethod | Σ | 0..1 | CodeableConcept | The normalization method used for determining a location within the coordinate system. Binding: Coordinate System Normalization Method ValueSet (required): The normalization convention applied when positioning a variant in a repetitive sequence region (e.g., left-shift for VCF, right-shift for HGVS 3' rule, fully-justified for VOCA/GA4GH). |
| location.sequenceLocation.coordinateInterval.start[x] | Σ | 0..1 | Quantity, Range | The start location of the interval |
| location.sequenceLocation.coordinateInterval.end[x] | Σ | 0..1 | Quantity, Range | The end location of the interval |
| location.sequenceLocation.strand | | 0..1 | CodeableConcept | The strand orientation of the sequenceLocation. Binding: Molecular Definition Strand (required): Strand orientation of a sequenceLocation. 'forward' (SO:0001030) = plus/sense/Watson strand; 'reverse' (SO:0001031) = minus/antisense/Crick strand. |
| location.cytobandLocation | Σ | 0..1 | BackboneElement | A cytoband-based location on a sequence |
| location.cytobandLocation.genomeAssembly | Σ | 1..1 | BackboneElement | Reference Genome |
| location.cytobandLocation.genomeAssembly.organism | | 0..1 | CodeableConcept | Species of the organism. Binding: Molecular Definition Organism (extensible): The organism whose genome the assembly describes. Codes are drawn from NCBI Taxonomy (http://www.ncbi.nlm.nih.gov/taxonomy). Common values: 9606 (Homo sapiens), 10090 (Mus musculus). |
| location.cytobandLocation.genomeAssembly.build | | 0..1 | CodeableConcept | Genome assembly build. Binding: LOINC Answer List LL1040-6 (extensible): Named genome assembly build. Codes from LOINC LL1040-6 cover established NCBI/Genome Reference Consortium (GRC) assemblies (e.g., LA14029-5 GRCh37, LA14032-9 GRCh38). The extensible binding accommodates newer assemblies such as T2T-CHM13 not yet represented in LL1040-6. |
| location.cytobandLocation.genomeAssembly.accession | | 0..1 | CodeableConcept | NCBI Assembly accession. Binding Description: (example): NCBI Assembly accession numbers identifying genome assembly versions. Use system http://www.ncbi.nlm.nih.gov/assembly with the accession (e.g., GCF_000001405.40) as the code. |
| location.cytobandLocation.genomeAssembly.description[x] | | 0..1 | markdown, string | Genome assembly description |
| location.cytobandLocation.cytobandInterval | Σ | 1..1 | BackboneElement | Cytoband Interval |
| location.cytobandLocation.cytobandInterval.chromosome | Σ | 1..1 | CodeableConcept | Human chromosome identifier. Binding: LOINC Answer List LL2938-0 (preferred): LOINC answer list LL2938-0: human chromosome identifiers covering autosomes 1-22, sex chromosomes X and Y, and mitochondrial chromosome M. |
| location.cytobandLocation.cytobandInterval.startCytoband[x] | Σ | 0..1 | CodeableConcept, string | Start |
| location.cytobandLocation.cytobandInterval.endCytoband[x] | Σ | 0..1 | CodeableConcept, string | End |
| representation | Σ | 0..* | BackboneElement | A representation of a molecular entity |
| representation.focus | Σ | 0..1 | code | Representation focus concept. Binding: Molecular Definition Representation Focus VS (required): Classifies the role of a representation within a MolecularDefinition: allele-state, reference-state, alternative-state, or context-state. |
| representation.code | Σ | 0..* | CodeableConcept | Molecular sequence identifier (e.g., RefSeq accession). Binding: Molecular Definition Representation Code (example): Example molecular sequence identifier systems registered in HL7 terminology: NCBI RefSeq (http://www.ncbi.nlm.nih.gov/refseq) and LRG (http://www.lrg-sequence.org). |
| representation.literal | Σ | 0..1 | BackboneElement | A molecular entity defined as a string literal |
| representation.literal.encoding | Σ | 0..1 | CodeableConcept | The encoding used in the value. Binding: Molecular Definition Literal Encoding VS (extensible) |
| representation.literal.value | Σ | 1..1 | string | A string literal representation of the molecular entity, using the encoding specified in encoding |
| representation.resolvable | Σ | 0..1 | Reference(DocumentReference) | A resolvable representation of a molecular entity (e.g., URI, attached and formatted file) |
| representation.extracted | Σ | 0..1 | BackboneElement | A molecular entity that is represented as a portion of a different entity |
| representation.extracted.startingMolecule | Σ | 1..1 | Reference(Molecular Definition) | The molecular entity that serves as the conceptual 'parent' from which the intended entity is derived |
| representation.extracted.coordinateInterval | Σ | 0..1 | BackboneElement | The interval on startingMolecule that defines the portion to be extracted to produce the intended entity |
| representation.extracted.coordinateInterval.coordinateSystem | Σ | 0..1 | BackboneElement | The coordinate system used to define the location |
| representation.extracted.coordinateInterval.coordinateSystem.system | Σ | 0..1 | CodeableConcept | The type of coordinate system used. Binding: LOINC Answer List LL5323-2 (extensible): Coordinate system type governing position counting. Codes from LOINC answer list LL5323-2 correspond to widely-used systems: LA30100-4 (0-based interval counting, used by UCSC BED, GA4GH VRS, and SPDI), LA30101-2 (0-based character counting), LA30102-0 (1-based character counting, used by HGVS c./g./n./p. and RefSeq), and LA30103-8 (1-based interval counting). |
| representation.extracted.coordinateInterval.coordinateSystem.origin | Σ | 0..1 | CodeableConcept | The location of the origin of the coordinate system. Binding: Coordinate System Origin ValueSet (required): The reference landmark from which coordinates are measured. Unambiguous origin specification is essential for correct variant interpretation and cross-system interoperability. |
| representation.extracted.coordinateInterval.coordinateSystem.normalizationMethod | Σ | 0..1 | CodeableConcept | The normalization method used for determining a location within the coordinate system. Binding: Coordinate System Normalization Method ValueSet (required): The normalization convention applied when positioning a variant in a repetitive sequence region (e.g., left-shift for VCF, right-shift for HGVS 3' rule, fully-justified for VOCA/GA4GH). |
| representation.extracted.coordinateInterval.start[x] | Σ | 0..1 | Quantity, Range | The start location of the interval |
| representation.extracted.coordinateInterval.end[x] | Σ | 0..1 | Quantity, Range | The end location of the interval |
| representation.extracted.reverseComplement | Σ | 0..1 | boolean | A flag that indicates whether the extracted sequence should be reverse complemented |
| representation.repeated | Σ | 0..1 | BackboneElement | A representation as a repeated motif |
| representation.repeated.sequenceMotif | Σ | 1..1 | Reference(Molecular Definition) | The motif that is repeated |
| representation.repeated.copyCount | Σ | 1..1 | integer | The number of copies of the motif |
| representation.concatenated | Σ | 0..1 | BackboneElement | An ordered concatenation of molecular entities |
| representation.concatenated.sequenceElement | Σ | 1..* | BackboneElement | One of the concatenated entities |
| representation.concatenated.sequenceElement.sequence | Σ | 1..1 | Reference(Molecular Definition) | A reference to the sequence that defines this specific concatenated element |
| representation.concatenated.sequenceElement.ordinalIndex | Σ | 1..1 | integer | The ordinal index of the element within the concatenated representation |
| representation.relative | Σ | 0..1 | BackboneElement | A molecular entity represented as an ordered series of edits on a specified starting entity |
| representation.relative.startingMolecule | Σ | 1..1 | Reference(Molecular Definition) | The molecular entity on which edits will be applied |
| representation.relative.edit | Σ | 0..* | BackboneElement | A defined edit (change) to be applied |
| representation.relative.edit.editOrder | | 0..1 | integer | Defines the order of edits when multiple edits are to be applied to the startingMolecule |
| representation.relative.edit.coordinateInterval | Σ | 0..1 | BackboneElement | The interval on startingMolecule that defines the portion to be extracted to produce the intended entity |
| representation.relative.edit.coordinateInterval.coordinateSystem | Σ | 0..1 | BackboneElement | The coordinate system used to define the location |
| representation.relative.edit.coordinateInterval.coordinateSystem.system | Σ | 0..1 | CodeableConcept | The type of coordinate system used. Binding: LOINC Answer List LL5323-2 (extensible): Coordinate system type governing position counting. Codes from LOINC answer list LL5323-2 correspond to widely-used systems: LA30100-4 (0-based interval counting, used by UCSC BED, GA4GH VRS, and SPDI), LA30101-2 (0-based character counting), LA30102-0 (1-based character counting, used by HGVS c./g./n./p. and RefSeq), and LA30103-8 (1-based interval counting). |
| representation.relative.edit.coordinateInterval.coordinateSystem.origin | Σ | 0..1 | CodeableConcept | The location of the origin of the coordinate system. Binding: Coordinate System Origin ValueSet (required): The reference landmark from which coordinates are measured. Unambiguous origin specification is essential for correct variant interpretation and cross-system interoperability. |
| representation.relative.edit.coordinateInterval.coordinateSystem.normalizationMethod | Σ | 0..1 | CodeableConcept | The normalization method used for determining a location within the coordinate system. Binding: Coordinate System Normalization Method ValueSet (required): The normalization convention applied when positioning a variant in a repetitive sequence region (e.g., left-shift for VCF, right-shift for HGVS 3' rule, fully-justified for VOCA/GA4GH). |
| representation.relative.edit.coordinateInterval.start[x] | Σ | 0..1 | Quantity, Range | The start location of the interval |
| representation.relative.edit.coordinateInterval.end[x] | Σ | 0..1 | Quantity, Range | The end location of the interval |
| representation.relative.edit.replacementMolecule | Σ | 1..1 | Reference(Molecular Definition) | The molecular entity that serves as the replacement in the edit operation |
| representation.relative.edit.replacedMolecule | Σ | 0..1 | Reference(Molecular Definition) | The portion of the molecular entity that is replaced by the replacementMolecule |
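The derived representation types described above (a portion extracted from a parent, a repeated motif, an ordered concatenation, and edits applied to a starting entity) can each be resolved to a primary sequence. The following Python sketch illustrates one way to do this on in-memory strings. It is not a definitive implementation and makes several assumptions that the specification text here does not confirm: referenced molecules are already resolved to DNA strings, coordinates use 0-based interval counting (LA30100-4), and edits are applied one after another in `editOrder`, with each interval interpreted against the intermediate result. All function names and tuple layouts are hypothetical.

```python
from typing import List, Tuple

# DNA complement table; assumption: sequences use only A, C, G, T.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract(parent: str, start: int, end: int,
            reverse_complement_flag: bool = False) -> str:
    """Resolve an 'extracted' representation (0-based, half-open interval)."""
    piece = parent[start:end]
    return reverse_complement(piece) if reverse_complement_flag else piece


def repeat(motif: str, copy_count: int) -> str:
    """Resolve a 'repeated' representation: copy_count copies of the motif."""
    return motif * copy_count


def concatenate(elements: List[Tuple[int, str]]) -> str:
    """Resolve a 'concatenated' representation from (ordinalIndex, sequence) pairs."""
    return "".join(seq for _, seq in sorted(elements, key=lambda e: e[0]))


def apply_edits(starting: str, edits: List[Tuple[int, int, int, str]]) -> str:
    """Resolve a 'relative' representation.

    Each edit is (editOrder, start, end, replacement). Assumption: edits run
    in editOrder and each interval refers to the sequence as it stands after
    the previous edit.
    """
    result = starting
    for _, start, end, replacement in sorted(edits, key=lambda e: e[0]):
        result = result[:start] + replacement + result[end:]
    return result


print(extract("AACCGGTT", 1, 4, reverse_complement_flag=True))
print(apply_edits("ACGTACGT", [(1, 2, 3, "T"), (2, 4, 4, "GG")]))
```

An empty interval (start equal to end) models an insertion point, which is one reason interval counting is convenient for edit operations.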
| Path | Status | Usage | ValueSet | Version | Source |
|---|---|---|---|---|---|
| MolecularDefinition.moleculeType | Base | required | Molecular Definition Molecule Type | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.type | Base | extensible | Molecular Definition Type | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.topology | Base | required | Molecular Definition Topology | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.system | Base | extensible | LOINC Answer Codes for LL5323-2 | ∅ | unknown? |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.origin | Base | required | Coordinate System Origin ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.normalizationMethod | Base | required | Coordinate System Normalization Method ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.sequenceLocation.strand | Base | required | Molecular Definition Strand | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.organism | Base | extensible | Molecular Definition Organism | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.build | Base | extensible | LOINC Answer Codes for LL1040-6 | ∅ | unknown? |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.accession | Base | example | Not Stated | | Unknown |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.chromosome | Base | preferred | LOINC Answer Codes for LL2938-0 | ∅ | unknown? |
| MolecularDefinition.representation.focus | Base | required | Molecular Definition Representation Focus VS | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.code | Base | example | Molecular Definition Representation Code | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.literal.encoding | Base | extensible | Molecular Definition Literal Encoding VS | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.system | Base | extensible | LOINC Answer Codes for LL5323-2 | ∅ | unknown? |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.origin | Base | required | Coordinate System Origin ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.normalizationMethod | Base | required | Coordinate System Normalization Method ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.system | Base | extensible | LOINC Answer Codes for LL5323-2 | ∅ | unknown? |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.origin | Base | required | Coordinate System Origin ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.normalizationMethod | Base | required | Coordinate System Normalization Method ValueSet | 📦0.1.0-ci-build | This IG |
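Because the coordinate system is bound to LOINC answer list LL5323-2, a receiver typically has to convert positions between counting conventions before comparing locations. The sketch below converts between two of the listed codes, LA30100-4 (0-based interval counting) and LA30102-0 (1-based character counting). It assumes the common reading that interval coordinates are half-open and character coordinates are inclusive at both ends; the function names are hypothetical, and the other two codes are deliberately left unhandled.

```python
from typing import Tuple

ZERO_BASED_INTERVAL = "LA30100-4"   # e.g., UCSC BED, GA4GH VRS, SPDI
ONE_BASED_CHARACTER = "LA30102-0"   # e.g., HGVS, RefSeq


def to_zero_based_interval(start: int, end: int, system: str) -> Tuple[int, int]:
    """Convert (start, end) in the given system to 0-based, half-open form."""
    if system == ZERO_BASED_INTERVAL:
        return start, end
    if system == ONE_BASED_CHARACTER:
        # Inclusive 1-based [start, end] covers the same bases as [start-1, end).
        return start - 1, end
    raise ValueError(f"unsupported coordinate system code: {system}")


def from_zero_based_interval(start: int, end: int, system: str) -> Tuple[int, int]:
    """Convert a 0-based, half-open interval to the given system."""
    if system == ZERO_BASED_INTERVAL:
        return start, end
    if system == ONE_BASED_CHARACTER:
        if start == end:
            # An empty interval names a point between bases, which inclusive
            # character counting cannot express as a single range.
            raise ValueError("empty interval has no 1-based character form")
        return start + 1, end
    raise ValueError(f"unsupported coordinate system code: {system}")


seq = "ACGTACGT"
s, e = to_zero_based_interval(3, 5, ONE_BASED_CHARACTER)
print(seq[s:e])  # bases 3 through 5 in 1-based character counting
```

Normalizing everything to one internal convention and converting only at the boundaries keeps off-by-one errors in a single place.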
<MolecularDefinition xmlns="http://hl7.org/fhir">
  <id value="[id]"/><!-- 0..1 * Logical id of this artifact -->
  <meta><!-- I 0..1 * Metadata about the resource --></meta>
  <implicitRules value="[uri]"/><!-- I 0..1 * A set of rules under which this content was created -->
  <language value="[code]"/><!-- I 0..1 * Language of the resource content -->
  <text><!-- I 0..1 * Text summary of the resource, for human interpretation --></text>
  <contained><!-- 0..* * Contained, inline Resources --></contained>
  <extension><!-- See Extensions Additional content defined by implementations --></extension>
  <modifierExtension><!-- I 0..* * Extensions that cannot be ignored --></modifierExtension>
  <identifier><!-- 0..* * Unique ID of an instance --></identifier>
  <description value="[markdown]"/><!-- 0..1 * Description of the Molecular Definition instance -->
  <moleculeType><!-- 0..1 * The type of molecule (e.g., DNA, RNA, polypeptide) --></moleculeType>
  <type><!-- 0..* * Domain-semantic subtype classification of the molecule --></type>
  <topology><!-- 0..* * The structural topology of the molecular entity (e.g., linear, circular) --></topology>
  <member><!-- 0..* * Constituents of an aggregate molecular concept (e.g., haplotype, genotype) --></member>
  <location><!-- I 0..* A defined location on a molecular entity -->
    <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
    <extension><!-- See Extensions Additional content defined by implementations --></extension>
    <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
    <sequenceLocation><!-- I 0..1 A coordinate-based location on a sequence -->
      <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
      <extension><!-- See Extensions Additional content defined by implementations --></extension>
      <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
      <sequenceContext><!-- 1..1 * The sequence on which the location is defined --></sequenceContext>
      <coordinateInterval><!-- I 0..1 An interval on a sequence -->
        <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
        <extension><!-- See Extensions Additional content defined by implementations --></extension>
        <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
        <coordinateSystem><!-- I 0..1 The coordinate system used to define the location -->
          <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
          <extension><!-- See Extensions Additional content defined by implementations --></extension>
          <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
          <system><!-- 0..1 * The type of coordinate system used --></system>
          <origin><!-- 0..1 * The location of the origin of the coordinate system --></origin>
          <normalizationMethod><!-- 0..1 * The normalization method used for determining a location within the coordinate system --></normalizationMethod>
        </coordinateSystem>
        <start[x]><!-- 0..1 Quantity|Range The start location of the interval --></start[x]>
        <end[x]><!-- 0..1 Quantity|Range The end location of the interval --></end[x]>
      </coordinateInterval>
      <strand><!-- 0..1 * The strand orientation of the sequenceLocation --></strand>
    </sequenceLocation>
    <cytobandLocation><!-- I 0..1 A cytoband-based location on a sequence -->
      <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
      <extension><!-- See Extensions Additional content defined by implementations --></extension>
      <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
      <genomeAssembly><!-- I 1..1 Reference Genome -->
        <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
        <extension><!-- See Extensions Additional content defined by implementations --></extension>
        <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
        <organism><!-- 0..1 * Species of the organism --></organism>
        <build><!-- 0..1 * Genome assembly build --></build>
        <accession><!-- 0..1 * NCBI Assembly accession --></accession>
        <description[x]><!-- 0..1 markdown|string Genome assembly description --></description[x]>
      </genomeAssembly>
      <cytobandInterval><!-- I 1..1 Cytoband Interval -->
        <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
        <extension><!-- See Extensions Additional content defined by implementations --></extension>
        <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
        <chromosome><!-- 1..1 * Human chromosome identifier --></chromosome>
        <startCytoband[x]><!-- 0..1 CodeableConcept|string Start --></startCytoband[x]>
        <endCytoband[x]><!-- 0..1 CodeableConcept|string End --></endCytoband[x]>
      </cytobandInterval>
    </cytobandLocation>
  </location>
  <representation><!-- I 0..* A representation of a molecular entity -->
    <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
    <extension><!-- See Extensions Additional content defined by implementations --></extension>
    <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
    <focus value="[code]"/><!-- 0..1 * Representation focus concept -->
    <code><!-- 0..* * Molecular sequence identifier (e.g., RefSeq accession) --></code>
    <literal><!-- I 0..1 A molecular entity defined as a string literal -->
      <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
      <extension><!-- See Extensions Additional content defined by implementations --></extension>
      <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
      <encoding><!-- 0..1 * The encoding used in the value --></encoding>
      <value value="[string]"/><!-- 1..1 * A string literal representation of the molecular entity, using the encoding specified in encoding -->
    </literal>
    <resolvable><!-- 0..1 * A resolvable representation of a molecular entity (e.g., URI, attached and formatted file) --></resolvable>
    <extracted><!-- I 0..1 A molecular entity that is represented as a portion of a different entity -->
      <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
      <extension><!-- See Extensions Additional content defined by implementations --></extension>
      <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
      <startingMolecule><!-- 1..1 * The molecular entity that serves as the conceptual 'parent' from which the intended entity is derived --></startingMolecule>
      <coordinateInterval><!-- I 0..1 The interval on startingMolecule that defines the portion to be extracted to produce the intended entity -->
        <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
        <extension><!-- See Extensions Additional content defined by implementations --></extension>
        <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
        <coordinateSystem><!-- I 0..1 The coordinate system used to define the location -->
          <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
          <extension><!-- See Extensions Additional content defined by implementations --></extension>
          <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
          <system><!-- 0..1 * The type of coordinate system used --></system>
          <origin><!-- 0..1 * The location of the origin of the coordinate system --></origin>
          <normalizationMethod><!-- 0..1 * The normalization method used for determining a location within the coordinate system --></normalizationMethod>
        </coordinateSystem>
        <start[x]><!-- 0..1 Quantity|Range The start location of the interval --></start[x]>
        <end[x]><!-- 0..1 Quantity|Range The end location of the interval --></end[x]>
      </coordinateInterval>
      <reverseComplement value="[boolean]"/><!-- 0..1 * A flag that indicates whether the extracted sequence should be reverse complemented -->
    </extracted>
    <repeated><!-- I 0..1 A representation as a repeated motif -->
      <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
      <extension><!-- See Extensions Additional content defined by implementations --></extension>
      <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
      <sequenceMotif><!-- 1..1 * The motif that is repeated --></sequenceMotif>
      <copyCount value="[integer]"/><!-- 1..1 * The number of copies of the motif -->
    </repeated>
    <concatenated><!-- I 0..1 An ordered concatenation of molecular entities -->
      <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
      <extension><!-- See Extensions Additional content defined by implementations --></extension>
      <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
      <sequenceElement><!-- I 1..* One of the concatenated entities -->
        <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
        <extension><!-- See Extensions Additional content defined by implementations --></extension>
        <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
        <sequence><!-- 1..1 * A reference to the sequence that defines this specific concatenated element --></sequence>
        <ordinalIndex value="[integer]"/><!-- 1..1 * The ordinal index of the element within the concatenated representation -->
      </sequenceElement>
    </concatenated>
    <relative><!-- I 0..1 A molecular entity represented as an ordered series of edits on a specified starting entity -->
      <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
      <extension><!-- See Extensions Additional content defined by implementations --></extension>
      <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
      <startingMolecule><!-- 1..1 * The molecular entity on which edits will be applied --></startingMolecule>
      <edit><!-- I 0..* A defined edit (change) to be applied -->
        <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
        <extension><!-- See Extensions Additional content defined by implementations --></extension>
        <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
        <editOrder value="[integer]"/><!-- 0..1 * Defines the order of edits when multiple edits are to be applied to the startingMolecule -->
        <coordinateInterval><!-- I 0..1 The interval on startingMolecule that defines the portion to be extracted to produce the intended entity -->
          <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
          <extension><!-- See Extensions Additional content defined by implementations --></extension>
          <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
          <coordinateSystem><!-- I 0..1 The coordinate system used to define the location -->
            <id value="[id]"/><!-- 0..1 * Unique id for inter-element referencing -->
            <extension><!-- See Extensions Additional content defined by implementations --></extension>
            <modifierExtension><!-- I 0..* * Extensions that cannot be ignored even if unrecognized --></modifierExtension>
            <system><!-- 0..1 * The type of coordinate system used --></system>
            <origin><!-- 0..1 * The location of the origin of the coordinate system --></origin>
            <normalizationMethod><!-- 0..1 * The normalization method used for determining a location within the coordinate system --></normalizationMethod>
          </coordinateSystem>
          <start[x]><!-- 0..1 Quantity|Range The start location of the interval --></start[x]>
          <end[x]><!-- 0..1 Quantity|Range The end location of the interval --></end[x]>
        </coordinateInterval>
        <replacementMolecule><!-- 1..1 * The molecular entity that serves as the replacement in the edit operation --></replacementMolecule>
        <replacedMolecule><!-- 0..1 * The portion of the molecular entity that is replaced by the replacementMolecule --></replacedMolecule>
      </edit>
    </relative>
  </representation>
</MolecularDefinition>
{
"resourceType" : "MolecularDefinition",
"id" : "<id>", // 0..1 Logical id of this artifact
"meta" : { Meta }, // I 0..1 Metadata about the resource
"implicitRules" : "<uri>", // I 0..1 A set of rules under which this content was created
"language" : "<code>", // I 0..1 Language of the resource content
"text" : { Narrative }, // I 0..1 Text summary of the resource, for human interpretation
"contained" : [{ Resource }], // 0..* Contained, inline Resources
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"identifier" : [{ Identifier }], // 0..* Unique ID of an instance
"description" : "<markdown>", // 0..1 Description of the Molecular Definition instance
"moleculeType" : { CodeableConcept }, // 0..1 The type of molecule (e.g., DNA, RNA, polypeptide)
"type" : [{ CodeableConcept }], // 0..* Domain-semantic subtype classification of the molecule
"topology" : [{ CodeableConcept }], // 0..* The structural topology of the molecular entity (e.g., linear, circular)
"member" : [{ Reference(MolecularDefinition) }], // 0..* Constituents of an aggregate molecular concept (e.g., haplotype, genotype)
"location" : [{ BackboneElement }], // I 0..* A defined location on a molecular entity
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"sequenceLocation" : { BackboneElement }, // I 0..1 A coordinate-based location on a sequence
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"sequenceContext" : { Reference(MolecularDefinition) }, // 1..1 The sequence on which the location is defined
"coordinateInterval" : { BackboneElement }, // I 0..1 An interval on a sequence
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"coordinateSystem" : { BackboneElement }, // I 0..1 The coordinate system used to define the location
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"system" : { CodeableConcept }, // 0..1 The type of coordinate system used
"origin" : { CodeableConcept }, // 0..1 The location of the origin of the coordinate system
"normalizationMethod" : { CodeableConcept } // 0..1 The normalization method used for determining a location within the coordinate system
}
// start[x]: The start location of the interval. One of these 2:
"startQuantity" : { Quantity },
"startRange" : { Range },
// end[x]: The end location of the interval. One of these 2:
"endQuantity" : { Quantity },
"endRange" : { Range }
}
"strand" : { CodeableConcept } // 0..1 The strand orientation of the sequenceLocation
}
"cytobandLocation" : { BackboneElement } // I 0..1 A cytoband-based location on a sequence
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"genomeAssembly" : { BackboneElement }, // I 1..1 Reference Genome
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"organism" : { CodeableConcept }, // 0..1 Species of the organism
"build" : { CodeableConcept }, // 0..1 Genome assembly build
"accession" : { CodeableConcept }, // 0..1 NCBI Assembly accession
// description[x]: Genome assembly description. One of these 2:
"descriptionMarkdown" : "<markdown>",
"descriptionString" : "<string>"
}
"cytobandInterval" : { BackboneElement } // I 1..1 Cytoband Interval
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"chromosome" : { CodeableConcept }, // 1..1 Human chromosome identifier
// startCytoband[x]: Start. One of these 2:
"startCytobandCodeableConcept" : { CodeableConcept },
"startCytobandString" : "<string>",
// endCytoband[x]: End. One of these 2:
"endCytobandCodeableConcept" : { CodeableConcept },
"endCytobandString" : "<string>"
}
}
}
"representation" : [{ BackboneElement }] // I 0..* A representation of a molecular entity
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"focus" : "<code>", // 0..1 Representation focus concept
"code" : [{ CodeableConcept }], // 0..* Molecular sequence identifier (e.g., RefSeq accession)
"literal" : { BackboneElement }, // I 0..1 A molecular entity defined as a string literal
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"encoding" : { CodeableConcept }, // 0..1 The encoding used in the value
"value" : "<string>" // 1..1 A string literal representation of the molecular entity, using the encoding specified in encoding
}
"resolvable" : { Reference(DocumentReference) }, // 0..1 A resolvable representation of a molecular entity (e.g., URI, attached and formatted file)
"extracted" : { BackboneElement }, // I 0..1 A molecular entity that is represented as a portion of a different entity
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"startingMolecule" : { Reference(MolecularDefinition) }, // 1..1 The molecular entity that serves as the conceptual 'parent' from which the intended entity is derived
"coordinateInterval" : { BackboneElement }, // I 0..1 The interval on startingMolecule that defines the portion to be extracted to produce the intended entity
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"coordinateSystem" : { BackboneElement }, // I 0..1 The coordinate system used to define the location
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"system" : { CodeableConcept }, // 0..1 The type of coordinate system used
"origin" : { CodeableConcept }, // 0..1 The location of the origin of the coordinate system
"normalizationMethod" : { CodeableConcept } // 0..1 The normalization method used for determining a location within the coordinate system
}
// start[x]: The start location of the interval. One of these 2:
"startQuantity" : { Quantity },
"startRange" : { Range },
// end[x]: The end location of the interval. One of these 2:
"endQuantity" : { Quantity },
"endRange" : { Range }
}
"reverseComplement" : <boolean> // 0..1 A flag that indicates whether the extracted sequence should be reverse complemented
}
"repeated" : { BackboneElement }, // I 0..1 A representation as a repeated motif
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"sequenceMotif" : { Reference(MolecularDefinition) }, // 1..1 The motif that is repeated
"copyCount" : <integer> // 1..1 The number of copies of the motif
}
"concatenated" : { BackboneElement }, // I 0..1 An ordered concatenation of molecular entities
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"sequenceElement" : [{ BackboneElement }] // I 1..* One of the concatenated entities
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"sequence" : { Reference(MolecularDefinition) }, // 1..1 A reference to the sequence that defines this specific concatenated element
"ordinalIndex" : <integer> // 1..1 The ordinal index of the element within the concatenated representation
}
}
"relative" : { BackboneElement } // I 0..1 A molecular entity represented as an ordered series of edits on a specified starting entity
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"startingMolecule" : { Reference(MolecularDefinition) }, // 1..1 The molecular entity on which edits will be applied
"edit" : [{ BackboneElement }] // I 0..* A defined edit (change) to be applied
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"editOrder" : <integer>, // 0..1 Defines the order of edits when multiple edits are to be applied to the startingMolecule
"coordinateInterval" : { BackboneElement }, // I 0..1 The interval on startingMolecule that defines the portion to be extracted to produce the intended entity
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"coordinateSystem" : { BackboneElement }, // I 0..1 The coordinate system used to define the location
"id" : "<id>", // 0..1 Unique id for inter-element referencing
(Extensions - see JSON page)
(Modifier Extensions - see JSON page)
"system" : { CodeableConcept }, // 0..1 The type of coordinate system used
"origin" : { CodeableConcept }, // 0..1 The location of the origin of the coordinate system
"normalizationMethod" : { CodeableConcept } // 0..1 The normalization method used for determining a location within the coordinate system
}
// start[x]: The start location of the interval. One of these 2:
"startQuantity" : { Quantity },
"startRange" : { Range },
// end[x]: The end location of the interval. One of these 2:
"endQuantity" : { Quantity },
"endRange" : { Range }
}
"replacementMolecule" : { Reference(MolecularDefinition) }, // 1..1 The molecular entity that serves as the replacement in the edit operation
"replacedMolecule" : { Reference(MolecularDefinition) } // 0..1 The portion of the molecular entity that is replaced by the replacementMolecule
}
}
}
}
@prefix fhir: <http://hl7.org/fhir/> .

[ a fhir:MolecularDefinition;
  fhir:nodeRole fhir:treeRoot; # if this is the parser root
  fhir:id [ id ] ; # 0..1 Logical id of this artifact
  fhir:meta [ Meta ] ; # 0..1 I Metadata about the resource
  fhir:implicitRules [ uri ] ; # 0..1 I A set of rules under which this content was created
  fhir:language [ code ] ; # 0..1 I Language of the resource content
  fhir:text [ Narrative ] ; # 0..1 I Text summary of the resource, for human interpretation
  fhir:contained ( [ Resource ] ... ) ; # 0..* Contained, inline Resources
  fhir:extension ( [ Extension ] ... ) ; # 0..* I Additional content defined by implementations
  fhir:modifierExtension ( [ Extension ] ... ) ; # 0..* I Extensions that cannot be ignored
  fhir:identifier ( [ Identifier ] ... ) ; # 0..* Unique ID of an instance
  fhir:description [ markdown ] ; # 0..1 Description of the Molecular Definition instance
  fhir:moleculeType [ CodeableConcept ] ; # 0..1 The type of molecule (e.g., DNA, RNA, polypeptide)
  fhir:type ( [ CodeableConcept ] ... ) ; # 0..* Domain-semantic subtype classification of the molecule
  fhir:topology ( [ CodeableConcept ] ... ) ; # 0..* The structural topology of the molecular entity (e.g., linear, circular)
  fhir:member ( [ Reference(MolecularDefinition) ] ... ) ; # 0..* Constituents of an aggregate molecular concept (e.g., haplotype, genotype)
  fhir:location ( [ BackboneElement ] ... ) ; # 0..* I A defined location on a molecular entity
  fhir:representation ( [ BackboneElement ] ... ) ; # 0..* I A representation of a molecular entity
]
Differential View
| Name | Flags | Card. | Type | Description & Constraints |
|---|---|---|---|---|
| MolecularDefinition | | 0..* | DomainResource | Definitional content for a molecular entity. Elements defined in ancestors: id, meta, implicitRules, language, text, contained, extension, modifierExtension |
| MolecularDefinition.identifier | Σ | 0..* | Identifier | Unique ID of an instance |
| MolecularDefinition.description | | 0..1 | markdown | Description of the Molecular Definition instance |
| MolecularDefinition.moleculeType | Σ | 0..1 | CodeableConcept | The type of molecule (e.g., DNA, RNA, polypeptide). Binding: Molecular Definition Molecule Type (required): The broad physical class of molecule: DNA, RNA, or polypeptide. Codes are drawn from the Sequence Ontology (SO). |
| MolecularDefinition.type | Σ | 0..* | CodeableConcept | Domain-semantic subtype classification of the molecule. Binding: Molecular Definition Type (extensible): Domain-semantic subtype of the molecule (e.g., mRNA, genomic_DNA, lncRNA). Codes are drawn from a curated subset of Sequence Ontology (SO). Implementers may use additional SO codes or codes from other systems when no suitable code exists in this set. |
| MolecularDefinition.topology | Σ | 0..* | CodeableConcept | The structural topology of the molecular entity (e.g., linear, circular). Binding: Molecular Definition Topology (required): The structural topology of the molecule (e.g., linear, circular, linear-discontiguous, branched). Codes are drawn from the locally defined Molecular Definition Topology CodeSystem. |
| MolecularDefinition.member | Σ | 0..* | Reference(Molecular Definition) | Constituents of an aggregate molecular concept (e.g., haplotype, genotype) |
| MolecularDefinition.location | Σ | 0..* | BackboneElement | A defined location on a molecular entity |
| MolecularDefinition.location.sequenceLocation | Σ | 0..1 | BackboneElement | A coordinate-based location on a sequence |
| MolecularDefinition.location.sequenceLocation.sequenceContext | Σ | 1..1 | Reference(Molecular Definition) | The sequence on which the location is defined |
| MolecularDefinition.location.sequenceLocation.coordinateInterval | Σ | 0..1 | BackboneElement | An interval on a sequence |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem | Σ | 0..1 | BackboneElement | The coordinate system used to define the location |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.system | Σ | 0..1 | CodeableConcept | The type of coordinate system used. Binding: LOINC Answer List LL5323-2 (extensible): Coordinate system type governing position counting. Codes from LOINC answer list LL5323-2 correspond to widely-used systems: LA30100-4 (0-based interval counting, used by UCSC BED, GA4GH VRS, and SPDI), LA30101-2 (0-based character counting), LA30102-0 (1-based character counting, used by HGVS c./g./n./p. and RefSeq), and LA30103-8 (1-based interval counting). |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.origin | Σ | 0..1 | CodeableConcept | The location of the origin of the coordinate system. Binding: Coordinate System Origin ValueSet (required): The reference landmark from which coordinates are measured. Unambiguous origin specification is essential for correct variant interpretation and cross-system interoperability. |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.normalizationMethod | Σ | 0..1 | CodeableConcept | The normalization method used for determining a location within the coordinate system. Binding: Coordinate System Normalization Method ValueSet (required): The normalization convention applied when positioning a variant in a repetitive sequence region (e.g., left-shift for VCF, right-shift for HGVS 3' rule, fully-justified for VOCA/GA4GH). |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.start[x] | Σ | 0..1 | | The start location of the interval |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.startQuantity | | | Quantity | |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.startRange | | | Range | |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.end[x] | Σ | 0..1 | | The end location of the interval |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.endQuantity | | | Quantity | |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.endRange | | | Range | |
| MolecularDefinition.location.sequenceLocation.strand | | 0..1 | CodeableConcept | The strand orientation of the sequenceLocation. Binding: Molecular Definition Strand (required): Strand orientation of a sequenceLocation. 'forward' (SO:0001030) = plus/sense/Watson strand; 'reverse' (SO:0001031) = minus/antisense/Crick strand. |
| MolecularDefinition.location.cytobandLocation | Σ | 0..1 | BackboneElement | A cytoband-based location on a sequence |
| MolecularDefinition.location.cytobandLocation.genomeAssembly | Σ | 1..1 | BackboneElement | Reference Genome |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.organism | | 0..1 | CodeableConcept | Species of the organism. Binding: Molecular Definition Organism (extensible): The organism whose genome the assembly describes. Codes are drawn from NCBI Taxonomy (http://www.ncbi.nlm.nih.gov/taxonomy). Common values: 9606 (Homo sapiens), 10090 (Mus musculus). |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.build | | 0..1 | CodeableConcept | Genome assembly build. Binding: LOINC Answer List LL1040-6 (extensible): Named genome assembly build. Codes from LOINC LL1040-6 cover established NCBI/Genome Reference Consortium (GRC) assemblies (e.g., LA14029-5 GRCh37, LA14032-9 GRCh38). The extensible binding accommodates newer assemblies such as T2T-CHM13 not yet represented in LL1040-6. |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.accession | | 0..1 | CodeableConcept | NCBI Assembly accession. Binding description (example): NCBI Assembly accession numbers identifying genome assembly versions. Use system http://www.ncbi.nlm.nih.gov/assembly with the accession (e.g., GCF_000001405.40) as the code. |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.description[x] | | 0..1 | | Genome assembly description |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.descriptionMarkdown | | | markdown | |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.descriptionString | | | string | |
| MolecularDefinition.location.cytobandLocation.cytobandInterval | Σ | 1..1 | BackboneElement | Cytoband Interval |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.chromosome | Σ | 1..1 | CodeableConcept | Human chromosome identifier. Binding: LOINC Answer List LL2938-0 (preferred): LOINC answer list LL2938-0: human chromosome identifiers covering autosomes 1-22, sex chromosomes X and Y, and mitochondrial chromosome M. |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.startCytoband[x] | Σ | 0..1 | | Start |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.startCytobandCodeableConcept | | | CodeableConcept | |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.startCytobandString | | | string | |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.endCytoband[x] | Σ | 0..1 | | End |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.endCytobandCodeableConcept | | | CodeableConcept | |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.endCytobandString | | | string | |
| MolecularDefinition.representation | Σ | 0..* | BackboneElement | A representation of a molecular entity |
| MolecularDefinition.representation.focus | Σ | 0..1 | code | Representation focus concept. Binding: Molecular Definition Representation Focus VS (required): Classifies the role of a representation within a MolecularDefinition: allele-state, reference-state, alternative-state, or context-state. |
| MolecularDefinition.representation.code | Σ | 0..* | CodeableConcept | Molecular sequence identifier (e.g., RefSeq accession). Binding: Molecular Definition Representation Code (example): Example molecular sequence identifier systems registered in HL7 terminology: NCBI RefSeq (http://www.ncbi.nlm.nih.gov/refseq) and LRG (http://www.lrg-sequence.org). |
| MolecularDefinition.representation.literal | Σ | 0..1 | BackboneElement | A molecular entity defined as a string literal |
| MolecularDefinition.representation.literal.encoding | Σ | 0..1 | CodeableConcept | The encoding used in the value. Binding: Molecular Definition Literal Encoding VS (extensible) |
| MolecularDefinition.representation.literal.value | Σ | 1..1 | string | A string literal representation of the molecular entity, using the encoding specified in encoding |
| MolecularDefinition.representation.resolvable | Σ | 0..1 | Reference(DocumentReference) | A resolvable representation of a molecular entity (e.g., URI, attached and formatted file) |
| MolecularDefinition.representation.extracted | Σ | 0..1 | BackboneElement | A molecular entity that is represented as a portion of a different entity |
| MolecularDefinition.representation.extracted.startingMolecule | Σ | 1..1 | Reference(Molecular Definition) | The molecular entity that serves as the conceptual 'parent' from which the intended entity is derived |
| MolecularDefinition.representation.extracted.coordinateInterval | Σ | 0..1 | BackboneElement | The interval on startingMolecule that defines the portion to be extracted to produce the intended entity |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem | Σ | 0..1 | BackboneElement | The coordinate system used to define the location |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.system | Σ | 0..1 | CodeableConcept | The type of coordinate system used. Binding: LOINC Answer List LL5323-2 (extensible): Coordinate system type governing position counting. Codes from LOINC answer list LL5323-2 correspond to widely-used systems: LA30100-4 (0-based interval counting, used by UCSC BED, GA4GH VRS, and SPDI), LA30101-2 (0-based character counting), LA30102-0 (1-based character counting, used by HGVS c./g./n./p. and RefSeq), and LA30103-8 (1-based interval counting). |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.origin | Σ | 0..1 | CodeableConcept | The location of the origin of the coordinate system. Binding: Coordinate System Origin ValueSet (required): The reference landmark from which coordinates are measured. Unambiguous origin specification is essential for correct variant interpretation and cross-system interoperability. |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.normalizationMethod | Σ | 0..1 | CodeableConcept | The normalization method used for determining a location within the coordinate system. Binding: Coordinate System Normalization Method ValueSet (required): The normalization convention applied when positioning a variant in a repetitive sequence region (e.g., left-shift for VCF, right-shift for HGVS 3' rule, fully-justified for VOCA/GA4GH). |
| MolecularDefinition.representation.extracted.coordinateInterval.start[x] | Σ | 0..1 | | The start location of the interval |
| MolecularDefinition.representation.extracted.coordinateInterval.startQuantity | | | Quantity | |
| MolecularDefinition.representation.extracted.coordinateInterval.startRange | | | Range | |
| MolecularDefinition.representation.extracted.coordinateInterval.end[x] | Σ | 0..1 | | The end location of the interval |
| MolecularDefinition.representation.extracted.coordinateInterval.endQuantity | | | Quantity | |
| MolecularDefinition.representation.extracted.coordinateInterval.endRange | | | Range | |
| MolecularDefinition.representation.extracted.reverseComplement | Σ | 0..1 | boolean | A flag that indicates whether the extracted sequence should be reverse complemented |
| MolecularDefinition.representation.repeated | Σ | 0..1 | BackboneElement | A representation as a repeated motif |
| MolecularDefinition.representation.repeated.sequenceMotif | Σ | 1..1 | Reference(Molecular Definition) | The motif that is repeated |
| MolecularDefinition.representation.repeated.copyCount | Σ | 1..1 | integer | The number of copies of the motif |
| MolecularDefinition.representation.concatenated | Σ | 0..1 | BackboneElement | An ordered concatenation of molecular entities |
| MolecularDefinition.representation.concatenated.sequenceElement | Σ | 1..* | BackboneElement | One of the concatenated entities |
| MolecularDefinition.representation.concatenated.sequenceElement.sequence | Σ | 1..1 | Reference(Molecular Definition) | A reference to the sequence that defines this specific concatenated element |
| MolecularDefinition.representation.concatenated.sequenceElement.ordinalIndex | Σ | 1..1 | integer | The ordinal index of the element within the concatenated representation |
| MolecularDefinition.representation.relative | Σ | 0..1 | BackboneElement | A molecular entity represented as an ordered series of edits on a specified starting entity |
| MolecularDefinition.representation.relative.startingMolecule | Σ | 1..1 | Reference(Molecular Definition) | The molecular entity on which edits will be applied |
| MolecularDefinition.representation.relative.edit | Σ | 0..* | BackboneElement | A defined edit (change) to be applied |
| MolecularDefinition.representation.relative.edit.editOrder | | 0..1 | integer | Defines the order of edits when multiple edits are to be applied to the startingMolecule |
| MolecularDefinition.representation.relative.edit.coordinateInterval | Σ | 0..1 | BackboneElement | The interval on startingMolecule that defines the portion to be extracted to produce the intended entity |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem | Σ | 0..1 | BackboneElement | The coordinate system used to define the location |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.system | Σ | 0..1 | CodeableConcept | The type of coordinate system used. Binding: LOINC Answer List LL5323-2 (extensible): Coordinate system type governing position counting. Codes from LOINC answer list LL5323-2 correspond to widely-used systems: LA30100-4 (0-based interval counting, used by UCSC BED, GA4GH VRS, and SPDI), LA30101-2 (0-based character counting), LA30102-0 (1-based character counting, used by HGVS c./g./n./p. and RefSeq), and LA30103-8 (1-based interval counting). |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.origin | Σ | 0..1 | CodeableConcept | The location of the origin of the coordinate system. Binding: Coordinate System Origin ValueSet (required): The reference landmark from which coordinates are measured. Unambiguous origin specification is essential for correct variant interpretation and cross-system interoperability. |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.normalizationMethod | Σ | 0..1 | CodeableConcept | The normalization method used for determining a location within the coordinate system. Binding: Coordinate System Normalization Method ValueSet (required): The normalization convention applied when positioning a variant in a repetitive sequence region (e.g., left-shift for VCF, right-shift for HGVS 3' rule, fully-justified for VOCA/GA4GH). |
| MolecularDefinition.representation.relative.edit.coordinateInterval.start[x] | Σ | 0..1 | | The start location of the interval |
| MolecularDefinition.representation.relative.edit.coordinateInterval.startQuantity | | | Quantity | |
| MolecularDefinition.representation.relative.edit.coordinateInterval.startRange | | | Range | |
| MolecularDefinition.representation.relative.edit.coordinateInterval.end[x] | Σ | 0..1 | | The end location of the interval |
| MolecularDefinition.representation.relative.edit.coordinateInterval.endQuantity | | | Quantity | |
| MolecularDefinition.representation.relative.edit.coordinateInterval.endRange | | | Range | |
| MolecularDefinition.representation.relative.edit.replacementMolecule | Σ | 1..1 | Reference(Molecular Definition) | The molecular entity that serves as the replacement in the edit operation |
| MolecularDefinition.representation.relative.edit.replacedMolecule | Σ | 0..1 | Reference(Molecular Definition) | The portion of the molecular entity that is replaced by the replacementMolecule |
| Path | Status | Usage | ValueSet | Version | Source |
|---|---|---|---|---|---|
| MolecularDefinition.moleculeType | Base | required | Molecular Definition Molecule Type | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.type | Base | extensible | Molecular Definition Type | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.topology | Base | required | Molecular Definition Topology | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.system | Base | extensible | LOINC Answer Codes for LL5323-2 | ∅ | unknown? |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.origin | Base | required | Coordinate System Origin ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.sequenceLocation.coordinateInterval.coordinateSystem.normalizationMethod | Base | required | Coordinate System Normalization Method ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.sequenceLocation.strand | Base | required | Molecular Definition Strand | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.organism | Base | extensible | Molecular Definition Organism | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.build | Base | extensible | LOINC Answer Codes for LL1040-6 | ∅ | unknown? |
| MolecularDefinition.location.cytobandLocation.genomeAssembly.accession | Base | example | Not stated | Unknown | |
| MolecularDefinition.location.cytobandLocation.cytobandInterval.chromosome | Base | preferred | LOINC Answer Codes for LL2938-0 | ∅ | unknown? |
| MolecularDefinition.representation.focus | Base | required | Molecular Definition Representation Focus VS | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.code | Base | example | Molecular Definition Representation Code | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.literal.encoding | Base | extensible | Molecular Definition Literal Encoding VS | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.system | Base | extensible | LOINC Answer Codes for LL5323-2 | ∅ | unknown? |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.origin | Base | required | Coordinate System Origin ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.extracted.coordinateInterval.coordinateSystem.normalizationMethod | Base | required | Coordinate System Normalization Method ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.system | Base | extensible | LOINC Answer Codes for LL5323-2 | ∅ | unknown? |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.origin | Base | required | Coordinate System Origin ValueSet | 📦0.1.0-ci-build | This IG |
| MolecularDefinition.representation.relative.edit.coordinateInterval.coordinateSystem.normalizationMethod | Base | required | Coordinate System Normalization Method ValueSet | 📦0.1.0-ci-build | This IG |
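The coordinateSystem.system bindings above distinguish 0-based interval counting from 1-based character counting. As a minimal sketch of why the distinction matters, the hypothetical helpers below (the function names are illustrative and not part of this IG) convert between the two conventions, assuming that 1-based character positions are inclusive at both ends and that 0-based intervals are half-open:

```python
from typing import Tuple


def one_based_to_zero_based_interval(start: int, end: int) -> Tuple[int, int]:
    """Convert inclusive 1-based character positions to a 0-based half-open interval."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


def zero_based_interval_to_one_based(start: int, end: int) -> Tuple[int, int]:
    """Convert a non-empty 0-based half-open interval to inclusive 1-based positions."""
    if start < 0 or end <= start:
        # A zero-length interval (e.g., an insertion point) has no inclusive equivalent.
        raise ValueError("expected 0 <= start < end")
    return start + 1, end


sequence = "ACGTACGT"
start0, end0 = one_based_to_zero_based_interval(3, 5)
# Python slicing is itself 0-based and half-open, so the interval applies directly.
print(sequence[start0:end0])  # GTA
```

The same three residues are addressed as (3, 5) in one system and (2, 5) in the other, which is why a receiver needs the coordinate system stated explicitly rather than inferred.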
Molecular sequences are represented using numerous encodings, which are not always explicitly specified. The representation.literal.encoding attribute captures this information directly, so that implementers can validate the content of messages and computationally determine how a particular sequence should be interpreted.
The examples below illustrate different encodings, which could be used to create terms for this attribute. They are based on the IUPAC symbols for nucleotide and amino acid sequences.
| Symbol | Meaning | Origin of designation |
|---|---|---|
| G | Guanine | G |
| A | Adenine | A |
| T | Thymine | T |
| C | Cytosine | C |

| Symbol | Meaning | Origin of designation |
|---|---|---|
| G | Guanine | G |
| A | Adenine | A |
| U | Uracil | U |
| C | Cytosine | C |

| Symbol | Meaning | Origin of designation |
|---|---|---|
| G | Guanine | G |
| A | Adenine | A |
| T | Thymine | T |
| C | Cytosine | C |
| N | G or A or T or C | aNy |

| Symbol | Meaning | Origin of designation |
|---|---|---|
| G | Guanine | G |
| A | Adenine | A |
| T | Thymine | T |
| C | Cytosine | C |
| R | G or A | puRine |
| Y | T or C | pYrimidine |
| M | A or C | aMino |
| K | G or T | Keto |
| S | G or C | Strong interaction (3 H bonds) |
| W | A or T | Weak interaction (2 H bonds) |
| H | A or C or T | not-G, H follows G in the alphabet |
| B | G or T or C | not-A, B follows A |
| V | G or C or A | not-T (not-U), V follows U |
| D | G or A or T | not-C, D follows C |
| N | G or A or T or C | aNy |

| Symbol | Amino acid |
|---|---|
| A | alanine |
| C | cysteine |
| D | aspartic acid |
| E | glutamic acid |
| F | phenylalanine |
| G | glycine |
| H | histidine |
| I | isoleucine |
| K | lysine |
| L | leucine |
| M | methionine |
| N | asparagine |
| P | proline |
| Q | glutamine |
| R | arginine |
| S | serine |
| T | threonine |
| V | valine |
| W | tryptophan |
| Y | tyrosine |

| Symbol | Amino acid |
|---|---|
| Ala | alanine |
| Cys | cysteine |
| Asp | aspartic acid |
| Glu | glutamic acid |
| Phe | phenylalanine |
| Gly | glycine |
| His | histidine |
| Ile | isoleucine |
| Lys | lysine |
| Leu | leucine |
| Met | methionine |
| Asn | asparagine |
| Pro | proline |
| Gln | glutamine |
| Arg | arginine |
| Ser | serine |
| Thr | threonine |
| Val | valine |
| Trp | tryptophan |
| Tyr | tyrosine |

| Symbol | Amino acid |
|---|---|
| A | alanine |
| B | aspartic acid or asparagine |
| C | cysteine |
| D | aspartic acid |
| E | glutamic acid |
| F | phenylalanine |
| G | glycine |
| H | histidine |
| I | isoleucine |
| K | lysine |
| L | leucine |
| M | methionine |
| N | asparagine |
| P | proline |
| Q | glutamine |
| R | arginine |
| S | serine |
| T | threonine |
| U | selenocysteine |
| V | valine |
| W | tryptophan |
| X | unknown or 'other' amino acid |
| Y | tyrosine |
| Z | glutamic acid or glutamine |

| Symbol | Amino acid |
|---|---|
| Ala | alanine |
| Asx | aspartic acid or asparagine |
| Cys | cysteine |
| Asp | aspartic acid |
| Glu | glutamic acid |
| Phe | phenylalanine |
| Gly | glycine |
| His | histidine |
| Ile | isoleucine |
| Lys | lysine |
| Leu | leucine |
| Met | methionine |
| Asn | asparagine |
| Pro | proline |
| Gln | glutamine |
| Arg | arginine |
| Ser | serine |
| Thr | threonine |
| Sec | selenocysteine |
| Val | valine |
| Trp | tryptophan |
| Xaa | unknown or 'other' amino acid |
| Tyr | tyrosine |
| Glx | glutamic acid or glutamine |
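The symbol tables above can be used to check a literal value against its declared encoding. The following is a minimal sketch only: the dictionary keys are hypothetical labels chosen for illustration, not codes from the Molecular Definition Literal Encoding value set, and only uppercase single-letter alphabets are covered.

```python
from typing import Dict, List, Set

# Hypothetical encoding labels mapped to the single-letter alphabets tabulated above.
ALPHABETS: Dict[str, Set[str]] = {
    "dna-strict": set("GATC"),
    "rna-strict": set("GAUC"),
    "dna-with-n": set("GATCN"),
    "dna-iupac": set("GATCRYMKSWHBVDN"),
    "protein-1-letter": set("ACDEFGHIKLMNPQRSTVWY"),
    "protein-1-letter-extended": set("ABCDEFGHIKLMNPQRSTUVWXYZ"),
}


def invalid_symbols(value: str, encoding: str) -> List[str]:
    """Return the sorted symbols in value that the named encoding does not allow."""
    allowed = ALPHABETS[encoding]
    return sorted(set(value) - allowed)


print(invalid_symbols("ATGN", "dna-strict"))       # ['N']
print(invalid_symbols("ATGN", "dna-with-n"))       # []
print(invalid_symbols("ATG", "protein-1-letter"))  # []
```

The last call shows that the same string is valid under more than one alphabet, so validation alone cannot recover the intended encoding; it has to be stated by the sender.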
The Molecular Definition resource supports several different methods for representing a molecule. Some of the elements described below may apply only to sequences, and different elements may be added to support other types of molecular concepts.
Native representations: The literal, code, and resolvable elements are native representations, meaning they represent a sequence "as-is" without any additional computation.
Derived representations: The extracted, concatenated, repeated, and relative representations are derived representations, meaning they require one or more computational operations to be performed to create the sequence that is being represented.
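As a rough illustration of the computation each derived representation implies, the sketch below operates on plain DNA strings rather than on resolved MolecularDefinition references. All function names are hypothetical. It assumes 0-based half-open intervals, and for relative edits it assumes that every interval refers to the unedited starting molecule and that edits do not overlap; the resource itself does not confirm either assumption in this section.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def extract(parent: str, start: int, end: int, reverse_complement: bool = False) -> str:
    """extracted: take parent[start:end], optionally reverse complemented."""
    piece = parent[start:end]
    if reverse_complement:
        piece = piece.translate(_COMPLEMENT)[::-1]
    return piece


def repeat(motif: str, copy_count: int) -> str:
    """repeated: the motif written copy_count times."""
    return motif * copy_count


def concatenate(elements: List[Tuple[int, str]]) -> str:
    """concatenated: join (ordinalIndex, sequence) pairs in ordinal order."""
    return "".join(seq for _, seq in sorted(elements, key=lambda e: e[0]))


def apply_edits(starting: str, edits: List[Tuple[int, int, str]]) -> str:
    """relative: apply (start, end, replacement) edits from right to left,
    so that earlier coordinates remain valid while later ones are applied."""
    result = starting
    for start, end, replacement in sorted(edits, key=lambda e: e[0], reverse=True):
        result = result[:start] + replacement + result[end:]
    return result


print(extract("ACGTACGT", 2, 5))                                # GTA
print(extract("ACGTACGT", 2, 5, reverse_complement=True))       # TAC
print(repeat("CAG", 3))                                         # CAGCAGCAG
print(concatenate([(2, "TT"), (1, "AA")]))                      # AATT
print(apply_edits("ACGTACGT", [(1, 2, "T"), (4, 4, "GG")]))     # ATGTGGACGT
```

In the last call, the edit (4, 4, "GG") is a zero-length interval and therefore acts as an insertion, while (1, 2, "T") substitutes a single base.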
The literal element can be used to represent a sequence as a string of characters. By convention, nucleotide sequences are expressed 5' to 3' and protein sequences are expressed N to C terminus. The encoding element can optionally be used to specify the encoding used for the sequence literal. The encoding can be important in disambiguating sequences that share alphabets (for example, ATG might represent a translation start codon in DNA, but it could also represent a peptide containing 3 amino acids).
The representation.code element (0..*) can be used to identify a molecular entity by a coded accession or identifier from a known sequence database. The 0..* cardinality allows the same molecule to be cross-referenced using identifiers from multiple databases within a single representation.
The system, code, and display elements of the Coding datatype should be used to fully identify the sequence: system carries the database URI, code carries the accession identifier, and display carries a human-readable description.
The most common use case in clinical genomics is a versioned NCBI RefSeq accession (system: http://www.ncbi.nlm.nih.gov/refseq). Versioned accessions (those including a dot-version suffix, e.g., NC_000010.11) are strongly preferred over unversioned ones to ensure stable, unambiguous identification of a specific sequence version. Commonly used accession types:
| Prefix | Type | Example |
|---|---|---|
| NC_ | Chromosomal genomic | NC_000010.11 |
| NG_ | RefSeqGene | NG_008384.3 |
| NM_ | mRNA transcript | NM_000769.4 |
| NR_ | Non-coding RNA | NR_024540.1 |
| NP_ | Protein | NP_000760.1 |
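A receiver might want to check whether an accession is versioned and which type its prefix indicates. The following is a minimal sketch under stated assumptions: the regular expression covers only the prefix pattern shown in the table above, and `describe_accession` and the keys of its return value (other than `system` and `code`) are hypothetical names chosen for illustration.

```python
import re

# Prefix-to-type mapping copied from the table above.
REFSEQ_TYPES = {
    "NC_": "Chromosomal genomic",
    "NG_": "RefSeqGene",
    "NM_": "mRNA transcript",
    "NR_": "Non-coding RNA",
    "NP_": "Protein",
}

# Assumed shape: two letters, underscore, digits, optional ".version".
_ACCESSION = re.compile(
    r"^(?P<prefix>[A-Z]{2}_)(?P<number>\d+)(?:\.(?P<version>\d+))?$"
)


def describe_accession(code: str) -> dict:
    """Classify a RefSeq-style accession and report whether it is versioned."""
    match = _ACCESSION.match(code)
    if match is None:
        raise ValueError(f"not a RefSeq-style accession: {code!r}")
    return {
        "system": "http://www.ncbi.nlm.nih.gov/refseq",
        "code": code,
        "type": REFSEQ_TYPES.get(match.group("prefix"), "unknown"),
        "versioned": match.group("version") is not None,
    }


info = describe_accession("NC_000010.11")
```

An implementation could use the `versioned` flag to warn senders who supply an unversioned accession such as `NM_000769`.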
LRG (http://www.lrg-sequence.org) is another recognized system registered in HL7 terminology, providing stable reference sequences for clinical variant reporting. An example binding documenting these systems is defined in the Terminology Considerations page.
Note that representation.code does not guarantee that the repository is publicly accessible or that the referenced sequence can be retrieved; it only identifies the sequence using a code that can be exchanged. A private biobank accession that follows a known system scheme is a valid use of this element. For publicly accessible files, prefer representation.resolvable (which implies the content can be retrieved).
The resolvable element can be used to represent a sequence by reference, but it also implies that the sequence is accessible and SHOULD be resolvable (although a security layer may be present). This element makes use of the Document Reference resource, which contains the content.attachment element. The Attachment datatype can be used to represent sequences that are captured as a formatted file (using .contentType and .data) or as a URL (using .contentType and .url).
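As an illustration of the inline case, the sketch below reads a sequence from an Attachment whose .data carries a base64-encoded FASTA file (the FHIR Attachment.data element is base64Binary). The dictionary shape is a simplified stand-in for the real resource, `text/plain` is an assumed content type, and `sequence_from_attachment` is a hypothetical helper; the .url case is deliberately left out because it would require a network fetch.

```python
import base64


def sequence_from_attachment(attachment: dict) -> str:
    """Decode an inline FASTA attachment and return its sequence residues."""
    if "data" not in attachment:
        raise ValueError("this sketch handles inline .data only, not .url")
    text = base64.b64decode(attachment["data"]).decode("ascii")
    lines = [line.strip() for line in text.splitlines()]
    # Skip blank lines and FASTA header lines, which start with '>'.
    return "".join(line for line in lines if line and not line.startswith(">"))


fasta = ">example\nACGT\nTTGA\n"
attachment = {
    "contentType": "text/plain",  # assumed content type for this sketch
    "data": base64.b64encode(fasta.encode("ascii")).decode("ascii"),
}
sequence = sequence_from_attachment(attachment)
```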
The extracted element can be used to represent a sequence that is derived from another, longer sequence. The startingMolecule element refers to the "parent" sequence, and is itself an instance of Molecular Definition (with its own representation). The coordinateInterval element specifies a precise interval on the "parent" sequence, which is to be extracted (conceptually or literally) and optionally reverse-complemented. This element provides a way to conveniently reference regions of very long molecules (e.g., chromosomes) without requiring either the "parent" or the extracted sequence to be serialized. Conceptually, this representation is the inverse operation of the concatenated representation.
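The extraction operation can be sketched as follows. The coordinate convention is an assumption made for this example (0-based, half-open intervals, which map directly onto Python slicing); the actual convention is whatever the coordinateInterval declares. The function names are hypothetical, and only unambiguous DNA bases are complemented.

```python
# Complement table for unambiguous DNA bases (assumption: DNA input only).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Complement each base, then reverse the result."""
    return sequence.translate(_COMPLEMENT)[::-1]


def extract(parent: str, start: int, end: int, reverse: bool = False) -> str:
    """Extract parent[start:end], optionally reverse-complemented."""
    if not 0 <= start <= end <= len(parent):
        raise ValueError("interval lies outside the parent sequence")
    region = parent[start:end]
    return reverse_complement(region) if reverse else region


forward = extract("AACCGGTT", 0, 3)
reverse = extract("AACCGGTT", 0, 3, reverse=True)
```

In practice the parent would rarely be held as a string; the point of the representation is that the interval can be exchanged without serializing either sequence.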
The concatenated element can be used to represent a sequence that is composed of other sequences joined end to end to form the intended sequence. Each sequenceElement is specified as an instance of Molecular Definition (and each has its own representation). The order of concatenation is explicitly defined using the ordinalIndex element. Conceptually, this representation is the inverse operation of the extracted representation.
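A minimal sketch of the concatenation step, assuming each sequenceElement has already been resolved to a string and is supplied as an (ordinalIndex, sequence) pair. Rejecting duplicate indexes is a design choice of this example, not a rule stated by the resource, and `concatenate` is a hypothetical name.

```python
def concatenate(elements: list[tuple[int, str]]) -> str:
    """Join resolved sequences in ascending ordinalIndex order."""
    indexes = [index for index, _ in elements]
    if len(set(indexes)) != len(indexes):
        raise ValueError("ordinalIndex values must be unique in this sketch")
    ordered = sorted(elements, key=lambda element: element[0])
    return "".join(sequence for _, sequence in ordered)


# The elements may arrive in any order; ordinalIndex alone fixes the result.
joined = concatenate([(2, "GGG"), (1, "AAA"), (3, "TTT")])
```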
The repeated element can be used to represent a sequence that consists of a sequence motif repeated a specified number of times. The sequenceMotif is an instance of Molecular Definition (and has its own representation), and copyCount specifies the number of times the motif is copied in tandem. Conceptually, this representation is a special case of the concatenated representation, where each element is an identical copy of a given motif.
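The relationship to concatenation can be shown in a few lines. This sketch assumes the motif has already been resolved to a string; `repeat` is a hypothetical helper.

```python
def repeat(motif: str, copy_count: int) -> str:
    """Return the motif copied in tandem copy_count times."""
    if copy_count < 0:
        raise ValueError("copyCount cannot be negative")
    return motif * copy_count


expanded = repeat("CAG", 3)
# The same sequence written as a concatenation of three identical elements.
as_concatenation = "".join(["CAG", "CAG", "CAG"])
```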
The relative element can be used to represent a sequence in relation to another sequence, where the difference between the two sequences can be expressed as an ordered series of edit operations. This representation can be used to conveniently represent minor but meaningful differences between long or complex sequences (e.g., HLA alleles). Algorithmically, the relative representation defines a sequence by beginning with a startingMolecule (an instance of Molecular Definition) and performing at least one edit operation on it. Each edit operation is performed in order and includes replacing the sequence (the replacedMolecule) at a defined coordinateInterval with the sequence specified by the replacementMolecule. The resulting sequence after all edits have been performed is the sequence referenced by this representation element.
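The algorithm described above can be sketched as a loop over edits. Two assumptions are made here and are not confirmed by the text: coordinates are 0-based and half-open, and each edit's interval refers to the sequence as it stands after the preceding edits. The `Edit` class and `apply_edits` function are hypothetical; their fields loosely mirror coordinateInterval, replacedMolecule, and replacementMolecule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    start: int        # interval start (assumed 0-based, inclusive)
    end: int          # interval end (assumed exclusive)
    replaced: str     # sequence expected at the interval
    replacement: str  # sequence that takes its place


def apply_edits(starting: str, edits: list[Edit]) -> str:
    """Apply each edit in order and return the resulting sequence."""
    current = starting
    for edit in edits:
        if not 0 <= edit.start <= edit.end <= len(current):
            raise ValueError("edit interval lies outside the current sequence")
        if current[edit.start:edit.end] != edit.replaced:
            raise ValueError("replaced sequence does not match the interval")
        current = current[:edit.start] + edit.replacement + current[edit.end:]
    return current


# A substitution followed by an insertion (an empty interval is replaced).
result = apply_edits(
    "ACGTACGT",
    [Edit(2, 3, "G", "T"), Edit(4, 4, "", "AA")],
)
```

Checking the replaced sequence against the interval before editing gives a cheap consistency test that the edit was written against the intended starting molecule.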
Note that the edits specified in this representation are operations and NOT variations. Variations are defined as a specific comparison between two states (a reference and an alternative). Although variations are sometimes called "changes" and might therefore be confused with edit operations, the two are semantically distinct concepts.
Since the derived representations (extracted, concatenated, repeated, and relative) each reference Molecular Definition, representations can be combined to support complex use cases. For example:
It is possible to create arbitrarily deep structures using derived representations. Although there might be a rationale for doing so, implementations should avoid overly complex representation structures.
Every representation, regardless of its complexity, can be resolved to a literal. Two instances of MolecularDefinition are considered equivalent if they define the same entity. For molecular sequences, this means that for two instances of MolecularDefinition to be equivalent they must resolve to the same literal sequence. Two instances are identical if their serializations are identical: they must contain the same elements, and each corresponding element must have the same value.
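The distinction between equivalence and identity can be illustrated with a small recursive resolver. The nested dictionaries below are a simplified stand-in for the resource, not its actual JSON serialization, and only the literal, repeated, and concatenated forms are handled; `resolve`, `equivalent`, and `identical` are hypothetical names.

```python
def resolve(representation: dict) -> str:
    """Resolve a simplified representation to its literal sequence."""
    if "literal" in representation:
        return representation["literal"]
    if "repeated" in representation:
        repeated = representation["repeated"]
        return resolve(repeated["sequenceMotif"]) * repeated["copyCount"]
    if "concatenated" in representation:
        parts = sorted(
            representation["concatenated"], key=lambda part: part["ordinalIndex"]
        )
        return "".join(resolve(part["sequence"]) for part in parts)
    raise ValueError("representation type not handled in this sketch")


def equivalent(a: dict, b: dict) -> bool:
    """Equivalent: both resolve to the same literal sequence."""
    return resolve(a) == resolve(b)


def identical(a: dict, b: dict) -> bool:
    """Identical: same elements with the same values."""
    return a == b


as_literal = {"literal": "CAGCAGCAG"}
as_repeat = {"repeated": {"sequenceMotif": {"literal": "CAG"}, "copyCount": 3}}
```

Here `as_literal` and `as_repeat` are equivalent because both resolve to the same literal, but they are not identical because their structures differ.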
The Molecular Definition resource supports several profiles that represent molecular concepts:
In addition, profiles have been drafted to represent the concepts of Haplotype and Genotype, although they have not been exercised as deeply as the profiles listed above. Finally, preliminary work has demonstrated that the Molecular Definition resource could be used to represent concepts related to structural variation, including Adjacency and Fusion. It is anticipated that profiles to support these concepts will be developed over time.
The MolecularDefinition resource is an abstract resource that provides building blocks for creating semantically robust, computable structures that define molecular entities. The two most complex backbone elements, location and representation, support the concept of molecular sequences but they might not be relevant to other types of entities. Conversely, other entities may require different backbone elements. As such, it is expected that these high-level backbone elements will serve as modular schemas that can be profiled as needed for a given molecular entity. Profiling could include constraints on cardinality (e.g., the Sequence profile has 0..0 location, while Allele has 1..1 location) and slicing.
The representation backbone element provides a series of methods for specifying the value of a sequence. As a result, the entire structure can be used any time a sequence is referenced, and this is accomplished by slicing on representation.focus. The focus element uses a required binding to the MolecularDefinitionRepresentationFocus CodeSystem; its four codes classify the role of each representation slice.
The current sequence-based profiles of MolecularDefinition define representation slices as follows:
| Profile | Cardinality | representation.focus code | Semantic meaning |
|---|---|---|---|
| Sequence | 1..1 | (none; focus omitted) | The primary sequence of the molecule |
| Allele | 1..1 | allele-state | The sequence of the allele at the specified location |
| Allele | 0..1 | context-state | The surrounding genomic context at the specified location |
| Variation | 1..1 | reference-state | The sequence defined as the reference allele |
| Variation | 1..1 | alternative-state | The sequence defined as the alternate allele |
| Variation | 0..1 | context-state | The surrounding genomic context at the specified location |
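The slice cardinalities in the table above lend themselves to a simple check. The sketch below is an assumption-laden illustration rather than a validator: it takes the list of representation.focus codes found in an instance, covers only the Allele and Variation rows, and does not decide whether codes outside the table are permitted. `SLICE_RULES` and `check_slices` are hypothetical names.

```python
from collections import Counter

# (minimum, maximum) occurrences per focus code, copied from the table above.
SLICE_RULES = {
    "Allele": {
        "allele-state": (1, 1),
        "context-state": (0, 1),
    },
    "Variation": {
        "reference-state": (1, 1),
        "alternative-state": (1, 1),
        "context-state": (0, 1),
    },
}


def check_slices(profile: str, focus_codes: list[str]) -> list[str]:
    """Return one message per slice whose count violates its cardinality."""
    counts = Counter(focus_codes)
    problems = []
    for code, (low, high) in SLICE_RULES[profile].items():
        found = counts.get(code, 0)
        if not low <= found <= high:
            problems.append(f"{code}: found {found}, expected {low}..{high}")
    return problems


valid = check_slices("Variation", ["reference-state", "alternative-state"])
invalid = check_slices("Allele", ["context-state"])
```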
| Name | Type | Description | Expression |
|---|---|---|---|
| identifier | token | The unique identity for a particular sequence | MolecularDefinition.identifier |
| member | reference | Reference to the state of the molecular member | MolecularDefinition.member |
| moleculetype | token | Amino Acid Sequence / DNA Sequence / RNA Sequence | MolecularDefinition.moleculeType |
| topology | token | The structural topology of the molecular entity (e.g., linear, circular) | MolecularDefinition.topology |
| type | token | Classification of the molecule into types other than those defined by moleculeType | MolecularDefinition.type |