Genomics Reporting Implementation Guide (STU1)

Genomics Reporting Implementation Guide, published by HL7 International Clinical Genomics Work Group. This is not an authorized publication; it is the continuous build for version 1.0.0. This version is based on the current content of https://github.com/HL7/genomics-reporting/ and changes regularly. See the Directory of published versions.

General Genomic Reporting

This section defines the "core" profiles and concepts that would be expected to be present in most genomic reports, regardless of type, and describes how those profiles relate to each other. Concepts covered include the genomics report itself and the high-level categories of observations and other elements that make up the report, such as patient, specimen, variants, haplotypes, genotypes, etc.

Genomic Observations

This guide defines a number of Observation profiles, with common underlying components and constraints being inherited from abstract profiles as shown in the following diagram. The profiles and their specific usage will be defined in more detail below and on the other pages of this guide.

Class diagram showing the inheritance structure for genomic observations.

Figure 1: Defined Genomic Observations

(Profile links: Genomics Base, Overall Interpretation, Region Studied, Grouper, Genomic Finding (abstract), Variant, Haplotype, Genotype, Sequence Phase Relationship, Genomic Implication (abstract), Inherited Disease Pathogenicity, Medication Implication (abstract), Medication Efficacy Implication, Medication Transport Implication, Medication Metabolism Implication, High Risk Allele, Somatic Implication (abstract), Somatic Diagnostic Implication, Somatic Prognostic Implication, Somatic Predictive Implication)

Diagnostic Report

The diagnostic report is the focus of all genomic reporting. It conveys metadata about the overall report (what kind of report it was, when it was written, who wrote it, final vs. draft, etc.) and typically includes a rendered version for review by a clinician. It also groups together all "relevant" information found as part of the genomic analysis. (Rules for relevancy will depend on the type of testing ordered, the reason for testing and the policies of the lab.) The information found is expressed as FHIR Observations. These observations fall into one of six categories:

Class diagram showing the high-level categories of the component parts in a genomic diagnostic report

Figure 2: Genomic Report Overview

(Profile links: Genomics Report, Grouper, Overall Interpretation, Genomic Implications (see Figure 7), Genomic Findings (see Figure 6), Region Studied)

  • Genomic Interpretations: These are high-level observations of the result of the genomic testing.
  • Genomic Implications: These represent observations about the patient based on the genomic test results. For example, "Patient may have increased susceptibility to heart attacks."
  • Genomic Findings: These are observations about the specimen's genomic characteristics. For example, a chromosomal abnormality, genotype, haplotype or variant that was detected.
  • Grouper: The genomic observations can be organized and grouped together in a wide variety of ways.
  • Region Studied: These are observations describing the region or regions that were studied as part of this Genomics Report.
  • Other Observations: The results of tests other than sequenced genomic variants may also be included in the report.

In addition to the observations included in the report, some reports might also recommend specific actions be taken, such as genomic counseling, re-testing, adjusting drug dosages, etc. - driven by the results found. These are covered by the Recommended Action category and are expressed using FHIR's Task resource.

Various observations, including vital signs, lab information, assessments and genomic information, can lead to different risk assessments. The RiskAssessment resource captures predicted outcomes for a patient or population on the basis of source information. The Genomics Report can reference such an assessment through the DiagnosticReport-risk extension, which links directly to an assessment of prognosis or risk as informed by the diagnostic results (for example, genomic results and possibly the patient's genomic family history information). This extension is used when a RiskAssessment is needed as an alternative to Observation.hasMember or DiagnosticReport.result.

As shown in the diagram above, all of the observations may hang directly off of the diagnostic report. However, they can also be part of a grouper. In this version of the specification, no guidance is provided on whether or where groupers should be used. This is left up to the discretion of the reporting lab. Observations might be organized on the basis of subject, specimen, chromosome, gene, condition/disease, medication or other appropriate measure. The recursive "hasMember" relationship on grouper supports a nested tree-structure of groupers if appropriate, though more than two levels of groupers is likely excessive. See an example using grouper to separately reference variants and other Observations on the report.

Any organization of observations into groups or sub-groups is purely for navigation and presentation purposes. It carries no additional "meaning." Each observation can be interpreted on its own without knowing the associated group or sub-group and must be able to stand alone as a resource and be independently valuable. The organization of observations in groups does not assert any relationship between observations.
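
The grouping rules above can be illustrated with a minimal Python sketch. It models resources as plain dictionaries in FHIR JSON shape and walks DiagnosticReport.result and the recursive Observation.hasMember references to produce a flat list of non-grouper observations. The resource ids, the in-memory `store` lookup and the rule "any Observation with hasMember is a grouper" are assumptions made for illustration only, and required elements such as `code` are omitted for brevity.

```python
from typing import Dict, List, Optional, Set

# Hypothetical in-memory store keyed by "ResourceType/id"; a real system
# would resolve references against a FHIR server or a Bundle.
store: Dict[str, dict] = {
    "Observation/overall": {"resourceType": "Observation", "id": "overall"},
    "Observation/variant-1": {"resourceType": "Observation", "id": "variant-1"},
    "Observation/variant-2": {"resourceType": "Observation", "id": "variant-2"},
    "Observation/grouper-variants": {
        "resourceType": "Observation",
        "id": "grouper-variants",
        "hasMember": [
            {"reference": "Observation/variant-1"},
            {"reference": "Observation/variant-2"},
        ],
    },
}

report = {
    "resourceType": "DiagnosticReport",
    "id": "report-1",
    "status": "final",
    "result": [
        {"reference": "Observation/overall"},
        {"reference": "Observation/grouper-variants"},
    ],
}


def flatten(references: List[dict], seen: Optional[Set[str]] = None) -> List[str]:
    """Return references to non-grouper observations in the order encountered."""
    if seen is None:
        seen = set()
    flat: List[str] = []
    for ref in references:
        key = ref["reference"]
        if key in seen:  # guard against cycles and duplicate references
            continue
        seen.add(key)
        members = store[key].get("hasMember")
        if members:  # assumption: hasMember marks a grouper
            flat.extend(flatten(members, seen))
        else:
            flat.append(key)
    return flat


leaf_observations = flatten(report["result"])
```

Because grouping carries no meaning of its own, a consumer can work from the flattened list and treat each observation independently.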

If needed, large or complex genomic reports may be broken down into sub-reports using core DiagnosticReport extensions like extends or summaryOf. This is particularly useful when different labs or services are performing later steps in the analysis, for example.

Note that it is possible for relationships to exist between the different observation components of a genomics report. Such relationships are asserted directly on one of the affected observations. Some of these relationship types are defined on the basis of the high-level observation category the observation belongs to. Others will be defined for narrower categories or explicit observation types. The high-level category relationships are shown in the following diagram:

Class diagram showing the high-level categories of the component parts in a genomic diagnostic report

Figure 3: Genomic Report Category Relationships

(Profile links: Genomic Implications (see Figure 7), Genomic Findings (see Figure 6), Overall Interpretation, Recommended Followup)

The relationships between categories are as follows:

  • Genomic Interpretations should be "derived from" Genomic Findings. For example, an interpretation that "deletions or duplications were found" might be supported by observations of variants that contain deletions and/or duplications.
  • Genomic Implications should be "derived from" Genomic Findings. For example, in a genomic report, it's not acceptable to imply "patient is an increased metabolizer of drug X" without also indicating the variant, haplotype or genotype found that supports that implication.
  • Every Recommended Action will have a reason relationship to either a Genomic Interpretation or Genomic Finding. For example: a recommendation to increase the dosage of a medication might be tied to a genomic interpretation indicating that the patient is an increased metabolizer of that medication; another possible recommendation is that re-testing should be performed on a variant that was detected but had low quality metrics for certainty.

This diagram also shows a specific example of a Recommended Action - the Recommended Followup which includes suggestions for confirmatory testing, additional testing and/or genomic counselling.
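
As a rough illustration of the category relationships listed above, the following Python sketch checks that each interpretation or implication is "derived from" at least one finding and that a Task's reason points at an interpretation or a finding. It uses Observation.derivedFrom and Task.reasonReference from the base FHIR resources. The ids, the `category_of` lookup and both helper functions are hypothetical; the guide distinguishes categories by profile, which this sketch does not model.

```python
from typing import Dict, List

# Hypothetical category labels used only by this sketch.
FINDING, INTERPRETATION, IMPLICATION = "finding", "interpretation", "implication"

observations: Dict[str, dict] = {
    "Observation/variant-1": {
        "resourceType": "Observation", "id": "variant-1", "status": "final",
    },
    "Observation/implication-1": {
        "resourceType": "Observation", "id": "implication-1", "status": "final",
        "derivedFrom": [{"reference": "Observation/variant-1"}],
    },
    # Deliberately missing derivedFrom, to show what the check reports.
    "Observation/implication-2": {
        "resourceType": "Observation", "id": "implication-2", "status": "final",
    },
}

category_of: Dict[str, str] = {
    "Observation/variant-1": FINDING,
    "Observation/implication-1": IMPLICATION,
    "Observation/implication-2": IMPLICATION,
}

task = {
    "resourceType": "Task",
    "id": "followup-1",
    "status": "requested",
    "intent": "proposal",
    "reasonReference": {"reference": "Observation/variant-1"},
}


def unsupported(obs_by_key: Dict[str, dict], categories: Dict[str, str]) -> List[str]:
    """List interpretations/implications not derived from any finding."""
    problems: List[str] = []
    for key, obs in obs_by_key.items():
        if categories[key] == FINDING:
            continue
        sources = [ref["reference"] for ref in obs.get("derivedFrom", [])]
        if not any(categories.get(src) == FINDING for src in sources):
            problems.append(key)
    return problems


def task_reason_ok(task_resource: dict, categories: Dict[str, str]) -> bool:
    """True if the Task's reason is an interpretation or a finding."""
    reason = task_resource.get("reasonReference", {}).get("reference")
    return categories.get(reason) in (FINDING, INTERPRETATION)
```

Here `unsupported(observations, category_of)` flags only the implication that lacks a supporting finding.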

General Relationships

To allow searching and appropriate navigation, the diagnostic report, observations and tasks must be able to stand on their own. They need to be related to the associated patient and/or specimen, the order that initiated the testing, the lab that performed the testing, etc. FHIR design principles dictate that these associations be present on every resource instance. That's because each resource could be accessed on its own as part of a query response, embedded in a document or message, passed to a decision support engine, etc. However, this is still relatively lightweight because the information is included by reference only.

The following diagram shows the relationships between the diagnostic report, observations and other elements used in the profile. Note that there is no expectation that all relationships will point to the same instances. In special cases, a genomic report may involve multiple patients or multiple specimens.

Class diagram showing the interrelationships between DiagnosticReport, Observation, Task, Patient, Specimen, etc.

Figure 4: Genomic Report Other Relationships

(Profile links: Recommended Action (see Figure 3), Genomics Report, Request for Genomic Test, Genomics Base, Specimen, Genomic Finding (see Figure 6), Overall Genomic Interpretation, Grouper, Genomic Implications)

A few key points to take from this diagram:

  • Requests for Genomic Tests and the reports resulting from them can be associated with patients, specimens or both.
  • Specimens may be linked to a specific subject, but they can also be stand-alone, for example when genomic testing is performed on a sample swabbed from a counter-top.
  • Family member history and tasks are always associated with a patient, not a specimen.
  • All genomic observations are derived from a common abstract profile that asserts they should have a category, effective date, issued date and status.
  • The effective date is the date the genomic specimen was collected and the issued date is when the observation was performed.
  • Of the different types of observations, Genomic Findings all have exactly one specimen. The remainder might be associated with a specimen, but might not. Observations may also be associated with a particular BodyStructure, such as a fetus, tumor or lesion.
  • Genomic reports and observations can be tied to multiple "orders" - this is because each test requested is handled as a separate request. All tests ordered as part of a single requisition are linked by the requisition identifier.
  • The Request for Genomic Test typically represents a clinician order. However, it can also represent a lab-side filler order, a reflex order or even a plan or recommendation. These uses are distinguished via the intent element.
  • The primary test to perform is captured in ServiceRequest.code. However, qualifications on which variants, medications, diseases and other aspects to search on can be conveyed using the orderDetail element.
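
The ordering points above can be sketched in Python with FHIR-JSON-shaped dictionaries. The example shows several ServiceRequest instances sharing a requisition identifier, a report tied to more than one request through DiagnosticReport.basedOn, and a small helper that groups requests by requisition. All ids, the identifier system URL, the free-text `code` and `orderDetail` values and the `group_by_requisition` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

REQ_SYSTEM = "http://example.org/requisitions"  # hypothetical identifier system

requests: List[dict] = [
    {
        "resourceType": "ServiceRequest",
        "id": "req-a",
        "status": "active",
        "intent": "order",  # a clinician order
        "requisition": {"system": REQ_SYSTEM, "value": "REQ-0001"},
        "code": {"text": "Pharmacogenomic panel"},  # primary test
        "orderDetail": [{"text": "Limit analysis to CYP2C9"}],  # qualification
        "subject": {"reference": "Patient/example"},
        "specimen": [{"reference": "Specimen/example"}],
    },
    {
        "resourceType": "ServiceRequest",
        "id": "req-b",
        "status": "active",
        "intent": "order",
        "requisition": {"system": REQ_SYSTEM, "value": "REQ-0001"},
        "code": {"text": "HLA typing"},
        "subject": {"reference": "Patient/example"},
    },
    {
        "resourceType": "ServiceRequest",
        "id": "req-c",
        "status": "active",
        "intent": "reflex-order",  # distinguished from a clinician order by intent
        "requisition": {"system": REQ_SYSTEM, "value": "REQ-0002"},
        "code": {"text": "Confirmatory sequencing"},
        "subject": {"reference": "Patient/example"},
    },
]

# A single report may be based on more than one request.
report = {
    "resourceType": "DiagnosticReport",
    "id": "report-1",
    "status": "final",
    "basedOn": [
        {"reference": "ServiceRequest/req-a"},
        {"reference": "ServiceRequest/req-b"},
    ],
}


def group_by_requisition(service_requests: List[dict]) -> Dict[str, List[str]]:
    """Map each requisition identifier value to the ids of its requests."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for req in service_requests:
        groups[req["requisition"]["value"]].append(req["id"])
    return dict(groups)
```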

Orders for genomic tests can point to other sources of information used to support the analysis performed as part of genomic testing. Genomic reports can refer to the information that was considered as part of the report, whether provided as part of the order, made available subsequently by the patient or clinicians, or otherwise retrieved. Figure 5 (below) shows these relationships, which can be to various Observations, FamilyMemberHistory records (including records that comply with Family member history for genomics analysis) and RiskAssessments. In some cases, the lab or other reporting organization may generate risk assessments as part of their reports.

Class diagram showing supportingInfo links from ServiceRequest and DiagnosticReport.

Figure 5: Genomics Report Supporting Information

(Profile links: Genomics Report, Request for Genomic Test)

Genomic Findings

The primary focus of genomic testing is making Genomic Findings. These are the fine and/or coarse-grained descriptions of a specimen's genomic characteristics. It is this information that leads to the actionable Genomic Implications and the Overall Interpretations for the report.

Class diagram showing relationship of Computable genomic findings as well as genotypes, haplotypes, variations and sequences.

Figure 6: Genomic Findings

(Profile links: Recommended Action (see Figure 3), Computable Genomic Finding, Genotype, Haplotype, Variation (see Variant Reporting), Sequence Phase Relationship)

Computable findings can be subdivided into three types of observations:

  • Genotypes describe combinations of genomic variations that together are associated with a particular phenotype - i.e. a specific physical, behavioral or risk-associated difference associated with the organism whose specimen was tested.
  • Haplotypes describe a set of genomic variations that appear on a single strand of DNA and are therefore typically inherited together.
  • Variants are specific differences or combinations of differences between parts of one or more specimen sequences and the equivalent portions of the reference sequence(s) for that organism.

These categories of observations have relationships. Haplotypes can be identified based on the presence of variants. Genotypes can be identified based on the presence of haplotypes and/or variants. All three can be expressed as a combination of one or more sequences.

Genotype is used to convey corresponding haplotypes or variations at a particular locus. Many genotypes are expressed as simple strings, and can be conveyed in genotype.valueCodeableConcept.text. In some cases, genotypes are sufficiently standardized to be conveyed as codes in genotype.valueCodeableConcept.coding. For example:

  • TPMT *1/*3A represents the TPMT *1 haplotype (or 'star allele') on one chromosome and the TPMT *3A haplotype on the homologous chromosome.
  • A/C represents a heterozygous genotype at SNP rs1142345, with an "A" allele on one chromosome and a "C" allele on the homologous chromosome.

For HLA, KIR, and other genes in the immunogenomics domain, the National Marrow Donor Program (NMDP) led a community effort to define the Genotype List String (GL String) grammar, described here. Notably, the GL String uses '+' as a delimiter between alleles in a genotype. It also has delimiters for ambiguous genotypes, allele lists, and haplotypes.
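
To make the role of the '+' delimiter concrete, here is a minimal Python sketch that splits a single-locus GL String into one allele list per gene copy. It is not a full GL String parser. The `split_genotype` function is hypothetical, and the meaning of the other delimiters noted in the comments ('/' for ambiguous alleles, '~' for haplotypes, '|' for ambiguous genotypes, '^' between loci) comes from the GL String grammar itself rather than from this page. The sketch takes the bare GL String and does not handle any namespace or version prefix.

```python
from typing import List


def split_genotype(gl_string: str) -> List[List[str]]:
    """Split a single-locus GL String into per-copy lists of candidate alleles.

    '+' separates the gene copies in a genotype; '/' separates ambiguous
    alleles for one copy. Strings using '^', '|' or '~' are rejected because
    this sketch does not model loci, ambiguous genotypes or haplotypes.
    """
    for delimiter in "^|~":
        if delimiter in gl_string:
            raise ValueError(f"delimiter {delimiter!r} is not handled by this sketch")
    return [gene_copy.split("/") for gene_copy in gl_string.split("+")]


alleles = split_genotype("HLA-A*01:01:01:01/HLA-A*01:02+HLA-A*24:02:01:01")
# alleles[0] holds the two candidate alleles for the first copy;
# alleles[1] holds the single allele for the second copy.
```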

For Pharmacogenomics, we define here a simple grammar of [HGNC gene symbol followed by white space followed by a slash ('/') delimited list of haplotype codes], where the codeSystem is set to the codeSystem of the haplotypes (e.g. PharmVar).
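
The pharmacogenomics grammar just described is simple enough to parse in a few lines. The following Python sketch splits a code into the gene symbol and the list of haplotype codes; the `parse_pgx_genotype` name and its error handling are assumptions for illustration, and no check is made that the gene symbol or haplotypes actually exist in HGNC or in the haplotype code system.

```python
from typing import List, Tuple


def parse_pgx_genotype(code: str) -> Tuple[str, List[str]]:
    """Parse '<gene symbol><whitespace><haplotype>/<haplotype>...'."""
    parts = code.split(None, 1)  # split once on the first run of whitespace
    if len(parts) != 2:
        raise ValueError(f"expected gene symbol and haplotype list: {code!r}")
    gene, haplotype_list = parts
    haplotypes = [h.strip() for h in haplotype_list.split("/")]
    if any(not h for h in haplotypes):
        raise ValueError(f"empty haplotype code in: {code!r}")
    return gene, haplotypes


gene, haplotypes = parse_pgx_genotype("CYP2C9 *1A/*1A")
```

Applied to the TPMT example given earlier, `parse_pgx_genotype("TPMT *1/*3A")` yields the gene symbol `TPMT` and the haplotype codes `*1` and `*3A`.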

Here are some examples that are standardized. Note that other kinds of genotype still lack a standardized representation today.

  • Text only
    <valueCodeableConcept>
      <text value="A/C at SNP rs1142345"/>
    </valueCodeableConcept>
  • Coded HLA
    <valueCodeableConcept>
      <coding>
        <system value="http://glstring.org"/>
        <version value="1.0"/>
        <code value="#hla#3.23#HLA-A*01:01:01:01/HLA-A*01:02+HLA-A*24:02:01:01"/>
      </coding>
    </valueCodeableConcept>
  • Coded PGx
    <valueCodeableConcept>
      <coding>
        <system value="https://www.pharmvar.org/"/>
        <code value="CYP2C9 *1A/*1A"/>
      </coding>
    </valueCodeableConcept>

Genomic Interpretations

Overall interpretations are high-level summary observations that apply to the whole report. Their purpose is to answer the question "Did you find anything when you did the test I asked you to do?":

Genomic Analysis Overall Coded Interpretation is what the laboratory declares as the summary result of the test (e.g. Positive, Negative, Unknown) and is typically used when the genomic test was looking for a particular genomically-based disease. It allows indication of whether genomic results known to be associated with the disease were found or not.

Genomic Implications

At present, implications are noted as explicit observations about the patient/subject. However, it's not clear this is the correct approach. The work group is evaluating introducing a new resource that allows conveying "knowledge" about a variant in a patient-independent way. This would allow saying "this variant is associated with an increased risk of cardiovascular disease" rather than "based on this variant, the patient is at an increased risk of cardiovascular disease", which isn't necessarily a determination the reporting organization may wish to assert. Feedback is welcome.

Class diagram showing the abstract Genomic Implications class.

Figure 7: Genomic Implications

(Profile links: Genomic Implications, Inherited Disease Pathogenicity )

Genomic Implications are assertions of the likely effects of genomic results on the patient, tumor or other subject. All implications inherit a common set of elements:

  • RelatedArtifact supports conveying references to citations, supporting documentation and other information relevant to the assertion of the implication
  • comment contains additional detail and possibly qualification of the asserted Implication
  • levelOfEvidence indicates the strength of the evidence behind the assertion

Only one Implication is defined as a "common" Implication. However, implications are relevant for other areas of genomic testing including pharmacogenomics and somatics where more implication types will be defined.

The "Inherited Disease" Implication indicates the likelihood of inheritance (valueCodeableConcept) of a particular disease (the associated-phenotype) as well as how inheritance is likely to occur (mode-of-inheritance).

Here are some examples of how the levelOfEvidence component can be used (NOTE - they would be used in the appropriate, use-case-specific profile):

  • Coded ACMG (see ACMG)
    <valueCodeableConcept>
      <coding>
        <system value="https://www.acmg.org/"/> <!-- TODO - Need ACMG as a Code System -->
        <code value="PS1"/>
      </coding>
      <coding>
        <system value="https://www.acmg.org/"/>
        <code value="PM2"/>
      </coding>
    </valueCodeableConcept>
  • Coded PGx (see PharmGKB)
    <valueCodeableConcept>
      <coding>
        <system value="https://www.pharmvar.org/"/>
        <code value="Level 1A"/>
      </coding>
    </valueCodeableConcept>
  • Coded Somatic (see MVLD)
    <valueCodeableConcept>
      <coding>
        <system value="https://www.clinicalgenome.org/mvld/"/> <!-- TODO - Need ClinGen/MVLD as a Code System -->
        <code value="Tier1 LevelA"/>
      </coding>
    </valueCodeableConcept>
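
For implementers producing JSON rather than XML, a fragment like the coded ACMG example could be assembled as an Observation component, roughly as in the Python sketch below. The `level_of_evidence_component` helper, the free-text component and observation codes, and the resource id are all hypothetical placeholders. The system URL is copied from the example above, where it is still marked as a TODO.

```python
from typing import Iterable


def level_of_evidence_component(system: str, codes: Iterable[str]) -> dict:
    """Build an Observation.component carrying one coding per evidence code."""
    return {
        # Placeholder text; a real profile would fix a specific code here.
        "code": {"text": "Level of evidence"},
        "valueCodeableConcept": {
            "coding": [{"system": system, "code": code} for code in codes]
        },
    }


implication = {
    "resourceType": "Observation",
    "id": "implication-1",  # hypothetical id
    "status": "final",
    "code": {"text": "Genomic implication"},  # placeholder text only
    "component": [
        level_of_evidence_component("https://www.acmg.org/", ["PS1", "PM2"]),
    ],
}
```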

Other content

The profiles describing the detailed observations within a genomics report are found in the other sections of this implementation guide based on what type of testing and reporting is being done:

  • The Variant Reporting section deals with all types of variants detected by formal sequencing, including simple/discrete variants, structural variants and complex variants detected by direct sequencing, shotgun-based sequencing and array-based testing for specific variants.
  • The Pharmacogenomic Reporting section deals with genomic testing related to medication results. It primarily focuses on medication-related implications of variation, haplotype and genotype observations.
  • The Somatic Reporting section deals with non-germline variations, particularly those related to tumors and genomic-based implications on outcomes and the effectiveness of medications and other interventions.
  • The Histocompatibility Reporting section deals with information related to variations relevant to histocompatibility and immunogenomics, including HLA typing.

Many genomics reports will draw on more than one of these areas. For example, a somatic report will typically include sequencing information as well as information on likely tumor susceptibility to particular medications. Reports should draw on whatever sections are relevant.