Clinical Genomics Resource Incubator
0.1.0-ci-build - CI Build

Clinical Genomics Resource Incubator, published by HL7 International / Clinical Genomics. This guide is not an authorized publication; it is the continuous build for version 0.1.0-ci-build built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/HL7/cg-incubator/ and changes regularly. See the Directory of published versions.

CodeSystem: Genomic Study Data Format (Experimental)

Official URL: http://hl7.org/fhir/uv/cg-incubator/CodeSystem/genomicstudy-dataformat
Version: 0.1.0-ci-build
Standards status: Draft
Draft as of 2022-08-17
Maturity Level: 1
Computable Name: GenomicStudyDataFormat
Other Identifiers: OID:2.16.840.1.113883.4.642.4.1978

Data formats relevant to genomics. These formats and their codes were taken from the Integrative Genomics Viewer (IGV) documentation by the Broad Institute.

This Code system is referenced in the definition of the following value sets:

Last updated: 2021-01-05 10:01:24+1100

Profile: Shareable CodeSystem

This case-sensitive code system http://hl7.org/fhir/uv/cg-incubator/CodeSystem/genomicstudy-dataformat defines the following codes:

Code | Display | Definition
bam | BAM | Binary Alignment/Map format for storing read alignments against reference sequences.
bed | BED | Browser Extensible Data format for representing genomic regions and associated annotations.
bedpe | BEDPE | Paired-End BED format for representing pairwise genomic interactions.
bedgraph | BedGraph | BED Graph format for representing genomic signals as continuous-valued data.
bigbed | bigBed | Binary indexed BED format for efficiently storing large amounts of genomic region data.
bigWig | bigWig | Binary indexed Wig format for efficiently storing large amounts of continuous-valued genomic data.
birdsuite-files | Birdsuite-Files | File format used by the Birdsuite suite of software for SNP genotyping and copy number analysis.
broadpeak | broadPeak | BED format variant for representing broad peaks in ChIP-Seq data.
cbs | CBS | Copy number data format output by Circular Binary Segmentation analysis.
chemical-reactivity-probing-profiles | Chemical-Reactivity-Probing-Profiles | Profiles of chemical reactivity for RNA structure analysis.
chrom-sizes | chrom-sizes | File listing chromosome names and their sizes.
cn | CN | Copy number data format.
custom-file-formats | Custom-File-Formats | User-defined or proprietary file formats for genomic data.
cytoband | Cytoband | Chromosome cytogenetic band locations and characteristics.
fasta | FASTA | Format for representing sequences of nucleic acids or proteins using single letter codes.
gct | GCT | Gene Cluster Text format for storing gene expression data.
cram | CRAM | Compressed Reference-Aligned Map format for storing read alignments more compactly than BAM.
genepred | genePred | Format for storing gene predictions with exon and CDS information.
gff-gtf | GFF/GTF | General Feature Format / Gene Transfer Format for storing genomic features and annotations.
gistic | GISTIC | Genomic Identification of Significant Targets in Cancer output format for copy number analysis.
goby | Goby | Compact file format for storing read alignments, variations, and base quality information.
gwas | GWAS | Genome-Wide Association Study format for storing association results.
igv | IGV | Integrative Genomics Viewer session or display format.
loh | LOH | Loss of Heterozygosity data format.
maf-multiple-alignment-format | MAF-Multiple Alignment Format | Multiple Alignment Format for storing aligned sequences.
maf-mutation-annotation-format | MAF-Mutation-Annotation-Format | Mutation Annotation Format for storing somatic mutation data.
merged-bam-file | Merged BAM File | BAM file containing read alignments from multiple samples or lanes merged together.
mut | MUT | Mutation data format.
narrowpeak | narrowPeak | BED format variant for representing narrow peaks in ChIP-Seq data.
psl | PSL | Pattern Space Layout format for storing sequence alignments.
res | RES | Resolution data format.
rna-secondary-structure-formats | RNA-Secondary-Structure-Formats | Formats for representing RNA secondary structure information.
sam | SAM | Sequence Alignment/Map format for storing read alignments, the uncompressed version of BAM.
sample-info-attributes-file | Sample-Info-Attributes-file | File containing sample information and attributes.
seg | SEG | Segmented data format for storing copy number or other segmented genomic data.
tdf | TDF | Tiled Data Format for efficient storage and display of large genomic datasets.
track-line | Track Line | UCSC Genome Browser track line header defining display properties for genomic data.
type-line | Type Line | Type line header for defining genomic data track properties.
vcf | VCF | Variant Call Format for storing variant information including SNPs, indels, and structural variations.
wig | WIG | Wiggle Track format for storing continuous-valued genomic data.
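
Because the code system is case-sensitive, a code such as bigWig must be matched exactly; bigwig or BAM would not be valid codes. The following is a minimal Python sketch of how a client might build a FHIR Coding from one of these codes. The helper names make_coding and is_valid are hypothetical, the DISPLAYS dictionary holds only a subset of the table above, and a real implementation would more likely load the full CodeSystem resource or call a terminology server.

```python
# Canonical URL of the code system, as stated on this page.
SYSTEM = "http://hl7.org/fhir/uv/cg-incubator/CodeSystem/genomicstudy-dataformat"

# Illustrative subset of code -> display pairs copied from the table above.
# This is NOT the full list; it only serves the example.
DISPLAYS = {
    "bam": "BAM",
    "cram": "CRAM",
    "sam": "SAM",
    "vcf": "VCF",
    "bigbed": "bigBed",
    "bigWig": "bigWig",
    "gff-gtf": "GFF/GTF",
}


def is_valid(code: str) -> bool:
    """Return True only on an exact, case-sensitive match (hypothetical helper)."""
    return code in DISPLAYS


def make_coding(code: str) -> dict:
    """Build a FHIR Coding as a plain dict (hypothetical helper).

    Raises ValueError when the code is not in the illustrative subset.
    """
    if not is_valid(code):
        raise ValueError(f"unknown code for {SYSTEM}: {code!r}")
    return {"system": SYSTEM, "code": code, "display": DISPLAYS[code]}
```

For example, make_coding("bigWig") returns a Coding whose code is bigWig, while make_coding("bigwig") raises ValueError because the lookup does not normalize case.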
