Test Plan

CMS FHIR Prototype Measure Calculation Tool IG
0.1.0 - CI Build

CMS FHIR Prototype Measure Calculation Tool IG, published by HL7 International - [Some] Work Group. This guide is not an authorized publication; it is the continuous build for version 0.1.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/cqframework/mct-ig/ and changes regularly. See the Directory of published versions

Test Plan

This page documents the test plan for the Measure Calculation Tool (MCT) prototype. The test plan is intended to demonstrate:

Functionality of the Measure Calculation Tool and validation, certification, and testing content
Correctness of a provider implementation of the Measure Calculation Tool

Test data was prepared by constructing random datasets based on the data requirements for the specific measure under test. Although this is a reasonable approach to functional testing, it is not necessarily representative of real-world test data. Additional testing should be performed using:

Larger data sets
More sophisticated data generation techniques, such as the Synthea tool
Real-world data using deidentified data sets

Whenever possible, automated testing approaches should be used to streamline testing of the Measure Calculation Tool. For tests that are intended to demonstrate functionality of the prototype, this automation can be accomplished using continuous integration and delivery pipelines. For tests that are intended to demonstrate validity and capability of an integration, this automation can be accomplished through integration testing tools such as Postman.

Content Tests

These tests are performed as part of prototype development and testing to ensure that the measure content for the Validation Measure and for CMS104 is correctly evaluating given known input data.

NOTE: These tests cover proportion measure calculation only. Other calculation features, including ratio, continuous-variable, and composite calculation, as well as stratifiers, would need to be tested separately.

Test Validation Measure
1. Test data is present for each data element
  1. Ineligible - data is missing and the validation result indicates that it is missing
  2. Invalid - data is present but invalid for each data element and the validation result provides validation messages
  3. Valid - data is present for each data element
2. Test measure score is successful for
  1. Ineligible
  2. Initial population
  3. Denominator
  4. Denominator Exception
  5. Denominator Exclusion
  6. Numerator
Test CMS104
1. Test data is present for each data element
  1. Ineligible - data is missing and the validation result indicates that it is missing
  2. Invalid - data is present but invalid for each data element and the validation result provides validation messages
  3. Valid - data is present for each data element
2. Test measure score is successful for
  1. Ineligible
  2. Initial population
  3. Denominator
  4. Denominator Exception
  5. Denominator Exclusion
  6. Numerator

NOTE: The content unit tests are all patient-specific rather than population-level. Population-level testing is performed as part of integration tests.
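
A patient-level content test can be reduced to comparing the populations a patient lands in against the populations expected for that test case. The following is a minimal sketch in Python, assuming the result is a FHIR R4 MeasureReport parsed into a dict. The helper names, the sample report, and the expected set are hypothetical and are not part of the prototype.

```python
def population_membership(measure_report):
    """Return the set of population codes whose count is greater than zero."""
    members = set()
    for group in measure_report.get("group", []):
        for population in group.get("population", []):
            codings = population.get("code", {}).get("coding", [])
            if codings and population.get("count", 0) > 0:
                members.add(codings[0]["code"])
    return members


def check_membership(measure_report, expected):
    """Compare actual membership with the expected set for one test case."""
    actual = population_membership(measure_report)
    return {
        "missing": sorted(expected - actual),
        "unexpected": sorted(actual - expected),
    }


def _population(code, count):
    # Hypothetical helper that builds one MeasureReport.group.population entry.
    return {"code": {"coding": [{"code": code}]}, "count": count}


# Hypothetical individual report for a patient expected to reach the numerator.
sample_report = {
    "resourceType": "MeasureReport",
    "type": "individual",
    "group": [{
        "population": [
            _population("initial-population", 1),
            _population("denominator", 1),
            _population("denominator-exclusion", 0),
            _population("denominator-exception", 0),
            _population("numerator", 1),
        ]
    }],
}

result = check_membership(
    sample_report, {"initial-population", "denominator", "numerator"}
)
```

An empty "missing" and "unexpected" list means the patient landed in exactly the expected populations.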

Content Data Elements

The Validation/Certification measure contains expressions to support validation of all QI-Core profiles. However, this prototype focuses on the data elements involved in the CMS104 measure:

Encounter: Non-Elective Inpatient Encounter
Condition: Diagnosis per Encounter
ServiceRequest: Comfort Measures
Procedure: Comfort Measures
MedicationRequest: Antithrombotic Therapy
MedicationRequest: Pharmacological Contraindications For Antithrombotic Therapy
MedicationNotRequested: Antithrombotic Therapy

Integration Tests

These tests are performed as part of prototype development and testing to ensure that the Measure Calculation Tool is performing as expected in the prototype environment with known configuration and input data served through expected server behavior.

Test CCN Configuration
1. Validate the MeasureReport is produced with the configured CCN identifier
Test Organization/Facility Configuration
1. Validate the MeasureReport is produced with the configured reporter Organization, and location extensions for each configured facility
Test Validation Measure
1. Test data is present for each data element
2. Test missing data produces expected validation messages
3. Test invalid data produces expected validation messages
4. Test measure score is successful for each test case (1..7)
Test CMS104
1. Test data is present for each data element
2. Test missing data produces expected validation messages
3. Test invalid data produces expected validation messages
4. Test measure score is successful for each test case (1..7)

Validation Tests

These tests are performed at an implementing site to ensure that the prototype is installed and configured correctly and that it performs as expected within the site environment.

Test Validation Measure Data
1. MeasureReport has the correct CCN
2. MeasureReport has the correct reporter Organization
3. MeasureReport has the correct reported Location(s)
4. MeasureReport has data for each element
5. MeasureReport has expected validation messages for missing data
6. MeasureReport has expected validation messages for invalid data
Test Validation Measure Calculation
1. MeasureReport has expected population count and score for each population test (1..7)
2. MeasureReport has expected supplemental data
Test Validation Measure Submission
1. Validate submitted MeasureReport has correct:
  1. CCN
  2. Organization
  3. Reported location(s)
2. Validate submitted MeasureReport has expected population count and score for each population (1..7)
3. Validate submitted MeasureReport has expected data references
4. Validate all expected data is submitted
5. Validate no unexpected data is submitted
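
The last three checks above amount to a set comparison between the resources the MeasureReport references and the resources actually submitted. The following Python sketch illustrates one way to automate that comparison. It assumes the submission is a FHIR Bundle containing the MeasureReport and its data, and that MeasureReport.evaluatedResource uses relative references such as "Encounter/enc-1"; the function name and sample data are hypothetical.

```python
def submission_discrepancies(measure_report, bundle):
    """Report referenced-but-missing and submitted-but-unreferenced resources."""
    expected = {
        ref["reference"]
        for ref in measure_report.get("evaluatedResource", [])
        if "reference" in ref
    }
    submitted = set()
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        # The MeasureReport itself is not patient data, so it is skipped.
        if resource.get("resourceType") == "MeasureReport":
            continue
        submitted.add(f"{resource['resourceType']}/{resource['id']}")
    return {
        "missing": sorted(expected - submitted),
        "unexpected": sorted(submitted - expected),
    }


# Hypothetical sample data for illustration only.
report = {
    "resourceType": "MeasureReport",
    "id": "mr-1",
    "evaluatedResource": [
        {"reference": "Encounter/enc-1"},
        {"reference": "Condition/cond-1"},
    ],
}
submission = {
    "resourceType": "Bundle",
    "entry": [
        {"resource": report},
        {"resource": {"resourceType": "Encounter", "id": "enc-1"}},
        {"resource": {"resourceType": "Observation", "id": "obs-9"}},
    ],
}

discrepancies = submission_discrepancies(report, submission)
```

In this sample, the Condition is referenced but not submitted, and the Observation is submitted but not referenced, so both lists are non-empty.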

Submission Tests

These tests are performed at an implementing site to demonstrate calculation and submission of the CMS104 measure.

Test CMS104 Measure Data
1. MeasureReport has data for each element
2. MeasureReport has expected validation messages for missing data
3. MeasureReport has expected validation messages for invalid data
Test CMS104 Measure Calculation
1. MeasureReport has expected population count and score for each population (1..7)
2. MeasureReport has expected supplemental data
Test CMS104 Measure Submission
1. Validate submitted MeasureReport has expected population count and score for each population (1..7)
2. Validate submitted MeasureReport has expected data references
3. Validate all expected data is submitted
4. Validate no unexpected data is submitted

Performance Tests

These tests are performed as part of prototype development and testing and provide baseline performance characteristics in a known solution environment.

Test Validation Measure Evaluation Performance
1. Unit Test - 1, 10, 50, 100, and 200 Patients
2. Integration Test - 1, 10, 50, 100, and 200 Patients
Test CMS104 Measure Evaluation Performance
1. Unit Test - 1, 10, 50, 100, and 200 Patients
2. Integration Test - 1, 10, 50, 100, and 200 Patients

CMS104 Measure Evaluation Performance

The following is an analysis of the measure evaluation performance of the prototype using the CMS104 measure as the subject. For this analysis, the following three processes will be profiled:

1. Gathering the patient data
2. Validating the patient data gathered in step 1
3. Evaluating the measure referencing the data gathered in step 1

Gathering Patient Data

The first step of gathering the patient data includes an analysis of the data requirements for the measure. The data requirements identify the resources and data elements used to evaluate the measure logic. The prototype uses the data requirements to generate FHIR REST queries, which are then executed across the specified facilities registered with an organization.
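
To illustrate how data requirements can be turned into FHIR REST queries, the following Python sketch builds one search URL per requirement. It is not the prototype's implementation. It assumes each requirement is a FHIR DataRequirement parsed into a dict, that every resource type can be searched with a "patient" parameter, and that each codeFilter path happens to match a search parameter name (a simplification that does not hold for every element). The function name and the example.org value set URL are hypothetical.

```python
from urllib.parse import urlencode


def build_queries(data_requirements, patient_id):
    """Build one relative FHIR search URL per data requirement."""
    queries = []
    for requirement in data_requirements:
        params = [("patient", patient_id)]
        for code_filter in requirement.get("codeFilter", []):
            # Assumption: the element path doubles as the search parameter name.
            if "path" in code_filter and "valueSet" in code_filter:
                params.append((f"{code_filter['path']}:in", code_filter["valueSet"]))
        # Keep ":" literal so the ":in" modifier stays readable in the URL.
        queries.append(f"{requirement['type']}?{urlencode(params, safe=':')}")
    return queries


# Hypothetical requirements for illustration only.
requirements = [
    {
        "type": "Encounter",
        "codeFilter": [
            {"path": "type", "valueSet": "http://example.org/ValueSet/inpatient"}
        ],
    },
    {"type": "Condition"},
]

queries = build_queries(requirements, "p1")
```

Each relative URL would then be resolved against the base URL of every facility registered with the organization.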

Validating Patient Data

The data validation step operates on the gathered patient data to ensure that the data adheres to a specified set of profiles (in this case QI-Core version 4.1.1). Inconsistencies between the gathered patient data and the specified profiles are documented within the patient data as contained resources. Any missing data requirements are also documented within the returned patient data bundle (see the $gather operation specification for more information).
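
A test harness can collect the documented inconsistencies by walking the contained resources of each bundle entry. The sketch below assumes, without confirmation from the $gather specification, that validation results are carried as contained OperationOutcome resources; the function name and sample bundle are hypothetical.

```python
def validation_messages(bundle):
    """Collect (resource reference, severity, diagnostics) for each issue found."""
    messages = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        reference = f"{resource.get('resourceType')}/{resource.get('id')}"
        for contained in resource.get("contained", []):
            # Assumption: validation results are contained OperationOutcomes.
            if contained.get("resourceType") != "OperationOutcome":
                continue
            for issue in contained.get("issue", []):
                messages.append(
                    (reference, issue.get("severity"), issue.get("diagnostics"))
                )
    return messages


# Hypothetical bundle for illustration only.
gathered = {
    "resourceType": "Bundle",
    "entry": [
        {"resource": {
            "resourceType": "Encounter",
            "id": "enc-1",
            "contained": [{
                "resourceType": "OperationOutcome",
                "issue": [{"severity": "error", "diagnostics": "status is required"}],
            }],
        }},
        {"resource": {"resourceType": "Condition", "id": "cond-1"}},
    ],
}

messages = validation_messages(gathered)
```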

Evaluating the Measure

The measure evaluation occurs at both the patient level and the population level. The prototype is testing a proportion measure. The evaluation returns individual and population reports detailing population group membership, a measure score, and the resources that were used during evaluation.
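
For reference, a proportion measure score is conventionally the numerator count divided by the denominator count after exclusions and exceptions are removed. The following Python sketch shows that arithmetic on population counts keyed by the standard population codes. It is an illustration under that convention, not the prototype's scoring code, and the function name and sample counts are hypothetical.

```python
def proportion_score(counts):
    """Return numerator / (denominator - exclusions - exceptions), or None."""
    effective_denominator = (
        counts.get("denominator", 0)
        - counts.get("denominator-exclusion", 0)
        - counts.get("denominator-exception", 0)
    )
    if effective_denominator <= 0:
        # No eligible cases remain, so a score cannot be calculated.
        return None
    return counts.get("numerator", 0) / effective_denominator


# Hypothetical population counts for illustration only.
sample_counts = {
    "initial-population": 10,
    "denominator": 10,
    "denominator-exclusion": 1,
    "denominator-exception": 1,
    "numerator": 6,
}

score = proportion_score(sample_counts)  # 6 / (10 - 1 - 1) = 0.75
```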

Methodology

The prototype scales linearly: each of the processes outlined above is performed sequentially for each patient. Therefore, as the number of patients or the number of resources per patient increases, the time to evaluate also increases.

The prototype was profiled using population sizes of 1, 10, 50, 100, and 200 patients (test cases) in order to provide a reasonable representation of the linear scaling and to represent several measure population groupings (i.e. simulate a real-world population). The patient data is randomly generated with adherence to certain requirements. The requirements include:

Each measure population group (Ineligible, Initial population, Denominator, Denominator Exception, Denominator Exclusion, and Numerator) must be represented whenever possible.
- For the single patient population, a Numerator population group was profiled.
The population should have ~60% success rate for the Numerator measure population group.
The population should have ~80% success rate for the Initial population measure group.
The population must use valid patient data for the measure.
- Some profile validation errors should appear for full coverage profiling, but those errors must not coincide with the data elements required to evaluate the measure.
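
The requirements above can be approximated by first assigning each test case a target population group and then generating patient data for that group. The Python sketch below covers only the assignment step. The weights are one possible reading of the requirements (about 20% Ineligible so that about 80% reach the Initial population, about 60% Numerator, and the remainder split evenly); they are an assumption, as are the function name and the seed handling.

```python
import random

GROUPS = [
    "Ineligible",
    "Initial population",
    "Denominator",
    "Denominator Exception",
    "Denominator Exclusion",
    "Numerator",
]
# Assumed weights, in the same order as GROUPS; they sum to 1.0.
WEIGHTS = [0.20, 0.05, 0.05, 0.05, 0.05, 0.60]


def plan_population(size, seed=0):
    """Assign a target population group to each of `size` test cases."""
    if size < 1:
        raise ValueError("size must be at least 1")
    if size == 1:
        # The single-patient population is profiled as a Numerator case.
        return ["Numerator"]
    rng = random.Random(seed)
    # Represent every group once whenever the population is large enough.
    plan = list(GROUPS) if size >= len(GROUPS) else []
    plan += rng.choices(GROUPS, weights=WEIGHTS, k=size - len(plan))
    rng.shuffle(plan)
    return plan


plan = plan_population(200)
```

Because the remaining cases are drawn at random, the observed rates only approximate the targets, which matches the "~" in the requirements.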

Metrics

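Each population set was randomly generated 100 times and profiled. The following table records the average runtime for each process. Times are given in seconds with millisecond precision (ss.SSS), or in minutes and seconds (mm:ss.SSS) where the runtime exceeds one minute.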

Number of Test Cases	Combined	Measure Evaluation	Patient Data Queries	Validation
1	01.113	00.657	00.401	00.056
10	08.623	05.088	03.104	00.431
50	43.477	25.651	15.652	02.174
100	01:24.834	50.052	30.540	04.242
200	02:44.587	01:37.106	59.251	08.229
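
The table mixes two time formats, so a small amount of parsing is needed before the figures can be compared. The following Python sketch converts each value to seconds, checks that the Combined column matches the sum of the three processes, and derives the combined time per patient. The row values are copied from the table; the function and variable names are hypothetical.

```python
def to_seconds(text):
    """Convert 'ss.SSS' or 'mm:ss.SSS' into seconds as a float."""
    minutes, _, seconds = text.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


# (test cases, combined, measure evaluation, patient data queries, validation)
ROWS = [
    (1, "01.113", "00.657", "00.401", "00.056"),
    (10, "08.623", "05.088", "03.104", "00.431"),
    (50, "43.477", "25.651", "15.652", "02.174"),
    (100, "01:24.834", "50.052", "30.540", "04.242"),
    (200, "02:44.587", "01:37.106", "59.251", "08.229"),
]

per_patient = {}
consistent = True
for cases, combined, *parts in ROWS:
    total = to_seconds(combined)
    # Allow a few milliseconds of rounding between the parts and the total.
    if abs(total - sum(to_seconds(part) for part in parts)) > 0.005:
        consistent = False
    per_patient[cases] = total / cases
```

For these figures, the combined time per patient stays between roughly 0.82 and 1.11 seconds across all population sizes, which is consistent with the linear scaling described in the methodology.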

CMS104 Performance Graph

The following chart displays the runtime distribution for each of the profiled processes:

Performance Enhancements

Although the prototype could be implemented as-is and perform reasonably well for smaller populations, it is not currently recommended as an enterprise-level solution. In order to scale the prototype for enterprise use, there are several enhancements that could be implemented to improve the overall performance and user experience including, but not limited to:

Using parallel programming to carry out various processes simultaneously.
- Could vastly improve performance when gathering patient data across multiple facilities.
- Could enable evaluating multiple measures across multiple populations.
Using asynchronous programming to reduce/eliminate the limitations of sequential processing.
- Asynchronous programming is non-blocking, meaning the program does not have to wait for the process to finish before performing other tasks.
- Would be very impactful when processing large populations.
- Would allow the user to perform other tasks while the measure is being evaluated.
Using the FHIR Bulk Data API to gather the patient data.
- Patient data retrieval would be vastly improved, especially for facilities with large datasets.
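
As an illustration of the first enhancement, the following Python sketch gathers patient data from several facilities concurrently using a thread pool. It is not part of the prototype. The function names, the facility URLs, and the stand-in fetch function are hypothetical; a real fetch function would execute the FHIR queries against each facility.

```python
from concurrent.futures import ThreadPoolExecutor


def gather_from_facilities(facility_urls, fetch_patient_data, max_workers=4):
    """Call fetch_patient_data for each facility concurrently.

    Returns a dict mapping each facility URL to the data fetched from it.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with facility_urls.
        results = list(pool.map(fetch_patient_data, facility_urls))
    return dict(zip(facility_urls, results))


def fake_fetch(url):
    # Stand-in for a network call; returns a placeholder bundle.
    return {"resourceType": "Bundle", "source": url}


gathered = gather_from_facilities(
    ["http://example.org/facility-a/fhir", "http://example.org/facility-b/fhir"],
    fake_fetch,
)
```

Threads suit this step because gathering is dominated by waiting on network responses. Any exception raised by the fetch function is re-raised when the results are collected, so a caller that needs partial results would have to handle errors per facility.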

CMS104 Test Cases

The following table outlines example test cases for each measure population group and the expected result the prototype should produce.

Population Group	Test Case	Expected Result
Ineligible	Ineligible Test Bundle	Ineligible Result
Initial Population	Initial Population Test Bundle	Initial Population Result
Denominator	Denominator Test Bundle	Denominator Result
Denominator Exception	Denominator Exception Test Bundle	Denominator Exception Result
Denominator Exclusion	Denominator Exclusion Test Bundle	Denominator Exclusion Result
Numerator	Numerator Test Bundle	Numerator Result

The following table provides larger test data sets for facility-level testing. Two facilities are provided to support both single-facility report testing and aggregate report testing.

Facility	Test Bundle	Expected Result
Facility A	Facility A Bundle	Facility A Result
Facility B	Facility B Bundle	Facility B Result
Facility A & Facility B		Aggregate Result
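
When checking the aggregate result, one useful sanity check is that each aggregate population count equals the sum of the corresponding facility counts. The Python sketch below shows that check for FHIR R4 MeasureReports parsed into dicts. It assumes a patient is not counted at both facilities; the helper names and the sample reports are hypothetical and are not the published test data.

```python
from collections import Counter


def population_counts(measure_report):
    """Return population counts keyed by population code."""
    counts = Counter()
    for group in measure_report.get("group", []):
        for population in group.get("population", []):
            code = population["code"]["coding"][0]["code"]
            counts[code] += population.get("count", 0)
    return counts


def aggregate_counts(measure_reports):
    """Sum population counts across several facility-level reports."""
    total = Counter()
    for report in measure_reports:
        # update() adds counts and keeps populations whose count is zero.
        total.update(population_counts(report))
    return total


def _report(initial, denominator, numerator):
    # Hypothetical helper that builds a minimal summary MeasureReport.
    pairs = [
        ("initial-population", initial),
        ("denominator", denominator),
        ("numerator", numerator),
    ]
    return {
        "resourceType": "MeasureReport",
        "group": [{
            "population": [
                {"code": {"coding": [{"code": code}]}, "count": count}
                for code, count in pairs
            ]
        }],
    }


facility_a = _report(8, 8, 5)
facility_b = _report(4, 4, 0)
combined = aggregate_counts([facility_a, facility_b])
```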

IG © 2022+ HL7 International - [Some] Work Group. Package cms.fhir.mct#0.1.0 based on FHIR 4.0.1. Generated 2024-06-26