2.2 Threat model and attacks

The FLUTE platform builds on the TRUMPET platform, which supports several threat models including:

  • Honest-but-curious: all parties are honest, i.e., they follow the protocol and do not collude
  • Honest-fraction: at least 80% of parties (data owners and aggregator) are honest

Like the TRUMPET platform, FLUTE supports federated learning tasks where the data owners are hospitals, but there are two extensions not considered in TRUMPET:

  • In addition to single hospitals, we also consider data hubs
  • Wherever the privacy-enhancing technology (PET) in use relies on a trusted execution environment (TEE), there is an additional party, the manufacturer of the enclave, that may be malicious.

A data hub can be seen as a data owner that collects, in a large database, the data of other data owners. This may introduce additional risks (due to the data transfers and the additional parties involved), and it should be investigated in the future whether to recommend avoiding the centralization of data in data hubs, or collecting only encrypted data in them. In this section, however, we only consider the requirements that arise when a data hub participates. As a party, a data hub can be honest or malicious, but compared to the TRUMPET scenario there is also the possibility that one of the data owners providing data to the data hub colludes with other parties in an attempt to reveal sensitive information or bias results. For consistency with the terminology of the TRUMPET platform on which the FLUTE platform will build, we may sometimes refer to data hubs participating in the platform simply as data owners.
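
To make the structure concrete, the following minimal Python sketch (the class and field names are ours, not part of the FLUTE platform) models a data hub as a data owner that aggregates the records of several underlying data owners, any of whom may independently collude:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class DataOwner:
        """A single hospital contributing data; may be honest or colluding."""
        name: str
        num_records: int
        colluding: bool = False  # True if this owner tries to leak or bias data

    @dataclass
    class DataHub(DataOwner):
        """A data owner that centralizes the data of other data owners.

        Toward the platform it acts as a single data owner, but internally
        it introduces extra parties (its members), each of which can
        independently collude.
        """
        members: List[DataOwner] = field(default_factory=list)

        def __post_init__(self):
            # The hub's record count is the total over its members.
            self.num_records = sum(m.num_records for m in self.members)

        def has_colluding_member(self) -> bool:
            return self.colluding or any(m.colluding for m in self.members)

    # Example: a hub of three hospitals, one of which colludes.
    hub = DataHub(
        name="regional-hub",
        num_records=0,  # overwritten in __post_init__
        members=[
            DataOwner("hospital-A", 500),
            DataOwner("hospital-B", 300, colluding=True),
            DataOwner("hospital-C", 200),
        ],
    )
    print(hub.num_records)             # 1000
    print(hub.has_colluding_member())  # True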

When a TEE is used, a critical requirement is that the manufacturer of the processor be trusted. This is not always a valid assumption, given that most chip manufacturers are not based in Europe.

One strategy that could be considered is to combine multiple TEEs, with processors from different manufacturers, in a setup that only breaks if all manufacturers are malicious.
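
One way to realize such a combination, sketched below under our own assumptions (this is an illustration, not a committed FLUTE design), is to additively secret-share a sensitive value across enclaves from different manufacturers, so that reconstruction requires the cooperation of every enclave:

    import secrets

    PRIME = 2**61 - 1  # field modulus for additive sharing (illustrative choice)

    def share(value: int, num_enclaves: int) -> list:
        """Split `value` into additive shares, one per enclave.

        Any proper subset of shares is uniformly random, so a setup with
        enclaves from k manufacturers only breaks if all k manufacturers
        are malicious and pool their shares.
        """
        shares = [secrets.randbelow(PRIME) for _ in range(num_enclaves - 1)]
        last = (value - sum(shares)) % PRIME
        return shares + [last]

    def reconstruct(shares: list) -> int:
        """Recover the value; requires *all* shares."""
        return sum(shares) % PRIME

    secret = 42
    shares = share(secret, num_enclaves=3)  # e.g. TEEs from three manufacturers
    assert reconstruct(shares) == secret
    # Any single manufacturer's share reveals nothing about `secret`.
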
Taking all of this into account, we will consider the following participant threat models:

  • Honest-but-curious: the classic model assumed in machine learning, where all parties are honest, i.e., they follow the protocol and do not collude.
  • Honest-fraction-of-parties-t: at least a fraction t of the parties (data owners, aggregators, innovators) is honest, but the rest may collude.
  • Honest-fraction-of-data-t: counting the data instances held by all honest data owners, they amount to at least a fraction t of all instances (see the sketch after this list).
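
The difference between the two fraction-based models can be made concrete with a small check (a hypothetical sketch; the function names are ours): honest-fraction-of-parties-t counts parties, while honest-fraction-of-data-t weights each data owner by the number of instances it holds:

    def honest_fraction_of_parties(honest_flags: list) -> float:
        """Fraction of parties (data owners, aggregators, innovators) that are honest."""
        return sum(honest_flags) / len(honest_flags)

    def honest_fraction_of_data(instances: list, honest_flags: list) -> float:
        """Fraction of all data instances held by honest data owners."""
        honest = sum(n for n, ok in zip(instances, honest_flags) if ok)
        return honest / sum(instances)

    # Three data owners: two small honest hospitals and one large dishonest one.
    instances = [100, 100, 800]
    honest    = [True, True, False]

    print(honest_fraction_of_parties(honest))          # 0.67: meets t = 0.5 on parties
    print(honest_fraction_of_data(instances, honest))  # 0.2:  violates t = 0.5 on data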

Each of these participant threat models can be combined with any of the three secure enclave threat models:

  • No-TEE-enclaves-trusted: no enclave is trusted
  • At-least-t-enclaves-honest: at least t of the enclaves are honest
  • All-enclaves-honest: all enclaves are honest
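
Since each participant threat model can be combined with any enclave threat model, the configuration space is simply their Cartesian product, as the following sketch enumerates (the string labels are just the names listed above):

    from itertools import product

    participant_models = [
        "honest-but-curious",
        "honest-fraction-of-parties-t",
        "honest-fraction-of-data-t",
    ]
    enclave_models = [
        "no-TEE-enclaves-trusted",
        "at-least-t-enclaves-honest",
        "all-enclaves-honest",
    ]

    # Nine combined threat models in total.
    for p, e in product(participant_models, enclave_models):
        print(f"{p} + {e}")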

Taking into account the defined threat models, we can consider that attacks on the FLUTE platform may come from both insiders and outsiders. Two main types of attack can be identified: poisoning (with the objective of tampering with the results of the Federated Learning process) and inference (with the objective of learning information about the data involved in the FL process). The PETs considered in FLUTE aim to mitigate these attacks.
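
As one example of the kind of mitigation a PET layer can provide against poisoning, the sketch below applies a coordinate-wise trimmed mean to client model updates; this is a standard robust-aggregation technique shown for illustration, not necessarily the one adopted by FLUTE:

    import numpy as np

    def trimmed_mean(updates: np.ndarray, trim: int) -> np.ndarray:
        """Coordinate-wise trimmed mean over client updates.

        updates: array of shape (num_clients, num_params).
        trim:    number of extreme values dropped at each end per coordinate;
                 tolerates up to `trim` poisoned clients.
        """
        sorted_updates = np.sort(updates, axis=0)
        kept = sorted_updates[trim : updates.shape[0] - trim]
        return kept.mean(axis=0)

    rng = np.random.default_rng(0)
    honest = rng.normal(0.0, 0.1, size=(8, 4))  # eight honest updates near 0
    poisoned = np.full((2, 4), 100.0)           # two poisoned updates
    updates = np.vstack([honest, poisoned])

    print(updates.mean(axis=0))      # plain averaging: dragged toward 100
    print(trimmed_mean(updates, 2))  # trimmed mean: stays near 0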