Requirements Federated Learning and mUlti-party computation Techniques for prostatE cancer
0.1.0 - ci-build
Requirements Federated Learning and mUlti-party computation Techniques for prostatE cancer, published by HL7 Europe. This guide is not an authorized publication; it is the continuous build for version 0.1.0 built by the FHIR (HL7® FHIR® Standard) CI Build. This version is based on the current content of https://github.com/hl7-eu/flute-requirements/ and changes regularly. See the Directory of published versions
This page provides a list of the FHIR artifacts defined as part of this implementation guide.
These define logic, asset collections and other libraries as part of content in this implementation guide.
| Inclusion Criteria FLUTE |
Retrieves patients matching the inclusion criteria for the FLUTE study. |
| Research Variables FLUTE |
Retrieves key research variables for the FLUTE study. |
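As an illustration of what an inclusion check could look like, the following is a minimal Python sketch based only on requirements F-PIL-1 and F-PIL-2 listed further down this page (PSA > 3.0 ng/ml and/or abnormal DRE; MRI within 1 year prior to the biopsy). It is not the logic of the Inclusion Criteria FLUTE library itself. The `Candidate` class, its field names, and the approximation of "1 year" as 365 days are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

PSA_THRESHOLD_NG_ML = 3.0                 # threshold stated in F-PIL-2
MAX_MRI_TO_BIOPSY = timedelta(days=365)   # assumption: "1 year" (F-PIL-1) taken as 365 days


@dataclass
class Candidate:
    """Hypothetical flat record for one patient; not a FHIR structure."""
    psa_ng_ml: float
    dre_suspicious: bool
    mri_date: date
    biopsy_date: date


def meets_inclusion_criteria(c: Candidate) -> bool:
    # F-PIL-2: clinical suspicion of PCa from PSA and/or abnormal DRE.
    clinical_suspicion = c.psa_ng_ml > PSA_THRESHOLD_NG_ML or c.dre_suspicious
    # F-PIL-1: MRI performed before the biopsy, at most one year earlier.
    gap = c.biopsy_date - c.mri_date
    mri_in_window = timedelta(0) <= gap <= MAX_MRI_TO_BIOPSY
    return clinical_suspicion and mri_in_window
```

A candidate with a PSA of 4.2 ng/ml and an MRI taken a few weeks before the biopsy would pass this check, while one whose MRI predates the biopsy by more than a year would not.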
These define constraints on FHIR resources for systems conforming to this implementation guide.
| Datamart Parameters List |
FHIR profile of a list to contain parameters following the evaluation of search variables. |
| FamilyMemberHistory: prostate cancer family history |
A record detailing a patient's family history of prostate cancer. |
| Observation: Outcome BCN-RC 2 |
This profile represents the outcome for BCN-RC 2 ISUP-GG. |
| Observation: PI-RADS |
A risk assessment based on the PI-RADS (Prostate Imaging-Reporting and Data System) scoring system. |
| Observation: prostate specific antigen |
Observation for the measurement of prostate specific antigen (PSA) levels in the blood. |
| Observation: prostate volume |
Observation for measuring the volume of the prostate. |
| PR-EvidenceVariable |
A profile for the use of the EvidenceVariable resource for inclusion/exclusion criteria and research variables for cohort and datamart management. |
| PR-ResearchStudy |
A profile for the use of the ResearchStudy resource for managing research studies. |
| Procedure: DRE |
This profile represents the digital rectal examination (DRE) procedure performed on a patient. |
| Procedure: biopsy |
This profile represents the biopsy procedure performed on a patient. |
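To give a rough idea of the kind of instance the PSA profile constrains, here is a hedged Python sketch that builds a plain FHIR Observation as a dictionary. The LOINC code 2857-1, the `ng/mL` UCUM unit, the `final` status and the patient id are assumptions chosen for illustration; the profile on this page may bind different codes or require additional elements.

```python
import json


def psa_observation(patient_id: str, value_ng_ml: float, effective: str) -> dict:
    """Build a minimal Observation-shaped dict for a PSA measurement."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            # Assumed code: LOINC 2857-1 (PSA mass/volume in serum or plasma).
            "coding": [{"system": "http://loinc.org", "code": "2857-1"}]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": effective,
        "valueQuantity": {
            "value": value_ng_ml,
            "unit": "ng/mL",
            "system": "http://unitsofmeasure.org",  # UCUM
            "code": "ng/mL",
        },
    }


if __name__ == "__main__":
    # "example-patient" is a hypothetical id, not one of the examples in this guide.
    print(json.dumps(psa_observation("example-patient", 4.2, "2023-03-01"), indent=2))
```

A dictionary like this can be serialized to JSON and checked against the profile with a FHIR validator before it is pushed to a local node.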
These define constraints on FHIR data types for systems conforming to this implementation guide.
| EXT-Datamart |
Extension for research variables in a study. |
These define sets of codes used by systems conforming to this implementation guide.
| DRE Observation Interpretation Codes |
This ValueSet includes codes for interpreting the results of a digital rectal examination (DRE) in the context of prostate assessment. |
| Gleason Grade |
Gleason Grade for Prostate Cancer |
| Malignant tumor of prostate Value Set |
Malignant tumor of prostate. |
| Research Study Phase Code |
ValueSet for the codes of phase of a research study. |
| Type of Biopsy |
ValueSet for categorizing types of biopsy as initial or repeated |
These define new code systems used by systems conforming to this implementation guide.
| Custom Research Study Phases (Cohort/Datamart) |
Codes for tracking study phases related to cohort and datamart generation. |
These are example instances that show what data produced and consumed by systems conforming with this implementation guide might look like.
| Device |
Device Actor Definition |
| EXP-S1-IncludedPatient |
This bundle includes all the resources for a patient who has been included in the study. |
| EXP-S1-Patient |
The Patient included in the study S1. |
| EXP-S2-ExcludedPatient |
This bundle includes all the resources for a patient who has been excluded from the study. |
| EXP-S2-Patient |
The Patient excluded in the study S2. |
| F-HUF-1 |
Researcher node has a JupyterLab interface. |
| F-HUF-10 |
Support for federated Grid Optimization. |
| F-HUF-11 |
Support for federated Generative Adversarial Networks (GAN). |
| F-HUF-12 |
Support for federated Variational auto-encoders (VAE). |
| F-HUF-13 |
Support for federated Diffusion models. |
| F-HUF-14 |
Support for at least one effective federated synthetic data generator learner (GAN, VAE, or DiffMod). |
| F-HUF-15 |
Support for multi-modal synthetic health data (both tabular & image). |
| F-HUF-16 |
Synthetic data generation module should allow for specifying what data (images, tabular …) should be generated. |
| F-HUF-17 |
Synthetic data generation module should allow for specifying population subsets, e.g., only with cancer. |
| F-HUF-18 |
Generation of synthetic 3D MRI images. |
| F-HUF-19 |
Data owner node has functional interface with local data owner database. |
| F-HUF-2 |
Researcher node offers all features provided by TRUMPET researcher node. |
| F-HUF-20 |
Data owner node has user interface for data owner users. |
| F-HUF-21 |
Data owner node has server interfacing with other nodes. |
| F-HUF-3 |
Support for federated Logistic Regression (LR). |
| F-HUF-4 |
Support for federated Decision Trees (DT). |
| F-HUF-5 |
Support for federated Random Forests (RF). |
| F-HUF-6 |
Support for federated Support Vector Machines (SVM). |
| F-HUF-7 |
Support for federated Deep Neural Networks (DNN). |
| F-HUF-8 |
Support for federated Convolutional Neural Networks (CNN). |
| F-HUF-9 |
Support for federated Bayesian Optimization. |
| F-IMSD-1 |
SD algorithm shall offer a CSV file with the required number of instances of tabular data and each column should be in the expected format (i.e., categorical, numerical etc.). |
| F-IMSD-10 |
An option for users to save hyperparameters as a draft and apply them at a later time. |
| F-IMSD-11 |
Images input and outputs will be in DICOM format. |
| F-IMSD-12 |
SD algorithm shall have the ability to save a database (DB) with the current CSV file and the previously proposed CSV files. |
| F-IMSD-13 |
SD shall incorporate more than one SD algorithm to perform calculations based on customer choice. |
| F-IMSD-14 |
Data imputation should be considered when historical data is not available, and there is uncertainty or bad quality in the data. |
| F-IMSD-15 |
SD module will have a trained machine to generate synthetic data from new repositories shared by users. |
| F-IMSD-16 |
SD algorithm shall take into account that training for SD generation can involve a long waiting time. |
| F-IMSD-17 |
SD shall be implemented so that future modular extensions can be added. |
| F-IMSD-2 |
SD algorithm shall offer the possibility to modify some hyperparameters, and the GUI shall offer a reset option to set hyperparameters to their default values. |
| F-IMSD-3 |
Synthetic data should be evaluated using various methods and tools. |
| F-IMSD-4 |
Synthetic Images should be evaluated using various methods and tools including human expert validation. |
| F-IMSD-5 |
Ability to create an error message when an error occurs. |
| F-IMSD-6 |
A range of conditions can be forced for some features when synthetic tabular data is generated. |
| F-IMSD-7 |
Ability to add structured data by the user. |
| F-IMSD-8 |
SD algorithm shall offer a modular structure where each parameter is a module that can be enabled or disabled. |
| F-IMSD-9 |
SD should take into account that new users will probably need to change units or convert initial data according to specified standards. |
| F-PIL-1 |
mpMRI/bpMRI shall be performed within 1 year prior to the prostate biopsy. |
| F-PIL-10 |
Reduced sample datasets with the 7 clinical variables and associated images shall be shared to generate synthetic images and algorithms. |
| F-PIL-11 |
No sensitive information shall be revealed from exchanged messages (aggregates, models statistics, etc.) between users. |
| F-PIL-12 |
Access to local FLUTE nodes shall be controlled/restricted. |
| F-PIL-13 |
Platform shall allow AI developers to train their models in accordance with their legal requirements and document such training. |
| F-PIL-14 |
The training requests sent to the FLUTE nodes shall specify the minimum/maximum resources needed to be executed. |
| F-PIL-15 |
The performance of the model shall be higher than the BCN1 and BCN2 models. |
| F-PIL-16 |
AI researchers shall be able to discover and select the datasets registered at each FLUTE node and obtain descriptive statistics about the datasets. |
| F-PIL-17 |
Jupyter notebooks shall be integrated with the FLUTE functionalities to ensure discoverability of the datasets. |
| F-PIL-18 |
Platform shall allow the AI researchers to search for the relevant dataset. |
| F-PIL-19 |
Platform shall provide space to add guidance documents and instructions on how to use the Platform and the datasets. |
| F-PIL-2 |
Cohorts shall consist of men with clinical suspicion of PCa based on a PSA > 3.0 ng/ml and/or abnormal DRE. |
| F-PIL-20 |
Platform shall allow authentication of authorized individuals from Data Owners and Data Users, with varied levels of access based on their defined roles. |
| F-PIL-21 |
Platform shall keep a record of Data Users and Data Owners and log details of their activity in the Platform. |
| F-PIL-22 |
Platform shall ensure that the training data remains on the federated node and any processing, analysis and AI training is performed there. Data User shall not see, directly access or download the data, i.e. the AI model shall only be trained in the local node. |
| F-PIL-23 |
There shall be a security check of the uploaded AI model prior to its deployment in the FLUTE data. |
| F-PIL-24 |
AI models BCN1/BCN2 trained through the platform shall be packaged into software components and deployed at the clinical sites involved in validation activities. |
| F-PIL-25 |
The platform SHALL be able to generate synthetic data for mpMRI, bpMRI and tabular data for BCN1 and BCN2 case series. |
| F-PIL-26 |
The platform SHALL be able to train BCN1 and BCN2 models from datasets augmented/balanced with synthetic data. |
| F-PIL-27 |
Age shall be provided at time of biopsy. |
| F-PIL-28 |
Type of biopsy shall be provided at time of biopsy with class 0 for initial or 2 for repeated. |
| F-PIL-29 |
PSA shall be provided for each cohort. |
| F-PIL-3 |
Lesions detected in mpMRI/bpMRI shall be reported using the Prostate Imaging-Reporting and Data System (PI-RADS), version 2.0 or higher. |
| F-PIL-30 |
DRE shall be provided for each cohort with class 0 for normal or 1 for suspicious. |
| F-PIL-31 |
Prostate volume (VP) shall be provided for each cohort in the MRI report. |
| F-PIL-32 |
PI-RADS shall be provided for each cohort in the MRI report with class from 1 to 5. |
| F-PIL-4 |
Prostate biopsies shall be systematic and targeted in cases of PI-RADS ≥3 lesions. |
| F-PIL-5 |
The platform shall define the methodology to extract/load/transform data from clinical databases and data warehouse into the FLUTE data node. |
| F-PIL-6 |
Input data shall be anonymized or pseudonymized. |
| F-PIL-7 |
Clinical data and MRI (both raw and processed) shall be linked. |
| F-PIL-8 |
MRI imaging study shall comply with specific requirements of QP-Prostate tool provided for FLUTE project. |
| F-PIL-9 |
Data shall be labelled with class csPCa 0 or 1. |
| F-SRS-1 |
Platform should provide secure methods to access the system like multi-factor authentication. |
| F-SRS-10 |
FLUTE platform should allow the user to select whether the central aggregator has clear access to the local models. |
| F-SRS-2 |
Access to different platform features should be role-based. |
| F-SRS-3 |
User sessions should time out after a period of inactivity. |
| F-SRS-4 |
FLUTE platform should allow the user to select which protection techniques are used in a training. |
| F-SRS-5 |
Local training algorithms should be run in the data owner infrastructure. |
| F-SRS-6 |
Locally trained models should be sent to the aggregator using TLS. |
| F-SRS-7 |
Data owners should be able to select which fields of their data sets can be used for model training. |
| F-SRS-8 |
FLUTE platform should log every use of the data. |
| F-SRS-9 |
FLUTE platform should initiate a local training when the data owner provides consent to use the data for that study. |
| F-STD-1 |
The FLUTE project SHOULD use the HL7 FHIR standard whenever possible. |
| F-STD-10 |
The FLUTE project SHOULD explore the possibility to model AI models using the HL7 standards FHIR and/or CQL. |
| F-STD-2 |
The FLUTE project SHOULD use SNOMED CT, LOINC and UCUM terminologies whenever possible. |
| F-STD-3 |
The FLUTE project SHOULD use DICOMweb (DICOM) for imaging evidence. |
| F-STD-4 |
A conceptual/logical model of the data that has to be exchanged SHALL be specified. |
| F-STD-5 |
Privacy Policies SHOULD be modelled and exchanged using the HL7 FHIR standard. |
| F-STD-6 |
Permission to access healthcare Data SHOULD be modelled and exchanged using the HL7 FHIR Permission resource. |
| F-STD-7 |
The prediction of whether a biopsy is needed SHOULD be modelled and exchanged using the HL7 FHIR standard. |
| F-STD-8 |
The FHIR exchange capabilities of each system SHALL be modelled and exchanged using the HL7 FHIR CapabilityStatement resource. |
| F-STD-9 |
Each Hospital SHALL expose its non-imaging data using the HL7 FHIR standard. |
| FLUTE Administrator |
FLUTE Administrator Actor Definition |
| FLUTE Platform |
FLUTE Platform Actor Definition |
| NF-HUF-1 |
All output preserves privacy. |
| NF-HUF-2 |
All algorithm implementations should follow the platform guidelines (adopted & revised from TRUMPET), e.g., on privacy/security parameters. |
| NF-IMSD-1 |
SD algorithm shall take into account that training for SD generation can involve a long waiting time. |
| NF-IMSD-2 |
SD shall be implemented so that future modular extensions can be added. |
| NF-IMSD-3 |
Synthetic data maintains data privacy and cannot be correlated with patient data. |
| NF-IMSD-4 |
Synthetic data used in combination with real data (data augmentation) improves the prediction performance of the algorithms trained using only real data. |
| NF-IMSD-5 |
SD GUI shall be able to run several queries simultaneously to reduce total time. |
| NF-IMSD-6 |
A user manual and help description must be provided. |
| NF-IMSD-7 |
SD GUI shall incorporate an internal counter that records the customer's usage, to allow a possible pay-per-use subscription method. |
| NF-PIL-1 |
Units shall be harmonized. |
| NF-PIL-10 |
Platform shall monitor the use of the data in the Platform, to detect potential misuse. It shall implement measures for detection of data breaches and potential privacy threats/leaks. |
| NF-PIL-11 |
Platform shall ensure that the pseudonymized data can be amended or withdrawn after its sharing, if the data subject (patient) requests the modification. |
| NF-PIL-2 |
A common (FHIR) data model shall be defined to represent the clinical data used in the study. |
| NF-PIL-3 |
The platform shall provide validators that check whether the clinical data pushed to the local node complies with the common data model. |
| NF-PIL-4 |
The data shall be standardized to a common (FHIR) data model before ingestion into the FLUTE local node. |
| NF-PIL-5 |
Cryptographic methods like homomorphic encryption and differential privacy shall be used to aggregate statistics about the cohort without disclosing (leaking) sensitive data outside the local node. |
| NF-PIL-6 |
Platform shall keep and display FLUTE data catalogue with defined basic metadata that characterizes the datasets available through the FLUTE Platform. |
| NF-PIL-7 |
Platform shall display terms and conditions of use (T&C) and Privacy policy. |
| NF-PIL-8 |
Platform shall display the conditions of the use of each of the datasets, as specified by its owner or the data hub. |
| NF-PIL-9 |
Datasets that are not defined as open to all users of the platform shall only be available to users who request access to the dataset and are permitted to use it by the data owner or data hub. |
| NF-SRS-1 |
Platform should have password policies. |
| NF-SRS-2 |
FLUTE Platform should implement several PETs to protect data privacy. |
| NF-SRS-3 |
Administrators of FLUTE platform should keep the systems up-to-date and patched. |
| NF-SRS-4 |
There should be security policies to avoid the use of potentially vulnerable software. |
| NF-SRS-5 |
FLUTE Platform should guarantee data is not tampered with in training processes. |
| NF-STD-1 |
The definition of the different actors of the platform SHOULD be modelled and exchanged using the HL7 FHIR ActorDefinition resource. |
| NF-STD-2 |
The definition of the different requirements of the platform SHOULD be modelled and exchanged using the HL7 FHIR Requirements resource. |
| NF-STD-3 |
The example scenarios of the platform usage SHOULD be modelled and exchanged using the HL7 FHIR ExampleScenario resource. |
| NF-STD-4 |
The testing of the different requirements of the platform SHOULD be modelled and exchanged using the HL7 FHIR TestScript, TestPlan and TestReport resources. |
| URS-1 |
Data should never leave data owner infrastructure. |
| URS-10 |
All the federated learning processes should be logged to be able to conduct an audit in case of a security incident. |
| URS-11 |
The system should provide consent management mechanisms. |
| URS-12 |
The exchange of data between data owner nodes and the central aggregator should follow the principle of data minimization, sharing only the data necessary to train models effectively. |
| URS-13 |
FLUTE platform should be compliant with regulations. |
| URS-14 |
FLUTE platform should provide privacy in a semi-honest threat model (honest but curious parties). |
| URS-15 |
FLUTE platform should provide privacy in a threat model with malicious parties. |
| URS-2 |
Central aggregation of models should not leak any information of the data used to train local models. |
| URS-3 |
Access to the platform should be protected by a secure login with multi-factor authentication. |
| URS-4 |
Communication between system nodes should be encrypted. |
| URS-5 |
Personal and sensitive data should not be used in the model training. If it is required, it should be properly protected, for example, with anonymization. |
| URS-6 |
Users whose data is part of the training of a model should be protected against data reconstruction attacks. |
| URS-7 |
Users whose data is part of the training of a model should be protected against membership inference attacks. |
| URS-8 |
Users whose data is part of the training of a model should be protected against property inference attacks. |
| URS-9 |
Devices used in the Federated Learning process must be secure, regularly patched and protected against malware and other vulnerabilities. |
| User |
User Actor Definition |
These are resources that are used within this implementation guide that do not fit into one of the other categories.
| FLUTE Research Study |
FLUTE Research Study |
| Group of EvidenceVariables for the FLUTE ResearchStudy |
Group of EvidenceVariables for the FLUTE ResearchStudy |
| Inclusion Variable for FLUTE Study |
Inclusion criteria for FLUTE study |
| Study variable: Age at Biopsy |
Age at the time of biopsy. |
| Study variable: Digital rectal examination results |
Digital rectal examination results: Indicates the results of DRE (normal or suspicious) |
| Study variable: PI-RADS |
PI-RADS score: The Prostate Imaging Reporting and Data System score used to assess prostate cancer risk on MRI. |
| Study variable: PSA |
Prostate-specific antigen (PSA) in ng/ml. |
| Study variable: Prostate Cancer Family History |
Family history of prostate cancer: Indicates if the patient has known family history of prostate cancer. |
| Study variable: Prostate Volume |
Prostate volume: Volume of the prostate. |
| Study variable: Type of Biopsy |
Type of biopsy: Specifies whether the biopsy was initial or repeat. |
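The value encodings stated in requirements F-PIL-9, F-PIL-28, F-PIL-30 and F-PIL-32 above can be checked before a record is loaded into a local node. The following Python sketch shows one way to do that for a flat record holding the seven study variables plus the csPCa label. The field names, the dictionary layout and the rule that numeric variables must be positive are assumptions for this example and are not defined by this guide.

```python
# Allowed class codes, copied from the requirements listed on this page.
ALLOWED_CODES = {
    "biopsy_type": {0, 2},        # F-PIL-28: 0 = initial, 2 = repeated
    "dre": {0, 1},                # F-PIL-30: 0 = normal, 1 = suspicious
    "pirads": {1, 2, 3, 4, 5},    # F-PIL-32: class from 1 to 5
    "cspca": {0, 1},              # F-PIL-9: label 0 or 1
}

# Hypothetical field names for the numeric study variables.
NUMERIC_FIELDS = ("age_at_biopsy", "psa_ng_ml", "prostate_volume")


def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors = []
    for field, allowed in ALLOWED_CODES.items():
        if field not in record:
            errors.append(f"{field}: missing")
        elif record[field] not in allowed:
            errors.append(f"{field}: {record[field]!r} not in {sorted(allowed)}")
    for field in NUMERIC_FIELDS:
        value = record.get(field)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value <= 0:
            errors.append(f"{field}: expected a positive number")
    if not isinstance(record.get("family_history"), bool):
        errors.append("family_history: expected True or False")
    return errors
```

Returning a list of problems rather than raising on the first one lets a data owner see every issue in a record at once.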