Data Federations Explainer

What is a Data Federation?

The term “Data Federation” describes a model for collaboration between a group of organisations, in which the partners commit their data assets to a shared goal whilst retaining control over the extent to which their data is shared or exposed. It’s based on the possibilities afforded by distributed, privacy-preserving data technologies such as Federated Learning (FL) and Distributed Data Mining (DDM), which allow insights or predictive models to be generated from separate datasets without requiring that the data be collected in a central location. We call this digital collaboration without data sharing.
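To make the idea of generating a model from separate datasets more concrete, here is a minimal sketch of federated averaging, one common approach to Federated Learning. It is an illustration under stated assumptions rather than a description of any particular platform: the function names, the toy datasets and the simple one-parameter model y = w * x are all hypothetical. Each partner trains on its own data and sends back only a model weight, never the records themselves.

```python
from typing import List, Tuple

# Hypothetical toy setup: each partner privately holds (x, y) pairs.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, steps: int = 20) -> float:
    """Run a few gradient-descent steps on one partner's private data."""
    for _ in range(steps):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(partners: List[Dataset], rounds: int = 50) -> float:
    """Train a shared weight; only weights leave each partner's site."""
    w = 0.0
    total = sum(len(d) for d in partners)
    for _ in range(rounds):
        # Each partner starts from the current shared weight and trains locally.
        local_weights = [local_update(w, d) for d in partners]
        # The coordinator averages the weights, weighted by dataset size.
        w = sum(lw * len(d) for lw, d in zip(local_weights, partners)) / total
    return w


# Two partners with separate datasets that both roughly follow y = 2x.
partners = [
    [(1.0, 2.0), (2.0, 4.1)],
    [(3.0, 5.9), (4.0, 8.2), (5.0, 9.8)],
]
w = federated_average(partners)
print(f"shared weight: {w:.3f}")
```

In this sketch the coordinator only ever sees the weights returned by `local_update`. Real deployments typically add further protections, such as secure aggregation, on top of this basic pattern.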

What are the technologies?

A range of existing technologies allows us to create value from data without sharing or exposing it. Techniques such as FL and DDM allow us to work with data held in varying formats across different locations. Differential Privacy works by systematically injecting calibrated noise into datasets, or into the results computed from them, which enables us to generate insights without exposing sensitive details. Synthetic Data is a method for systematically producing dummy datasets which replicate the underlying structure of the parent data whilst obscuring sensitive or personally identifiable information. Each of these mechanisms offers a particular set of possibilities for would-be collaborators, which might be more or less useful in a given set of circumstances. One of the main functions of a Data Federation is to introduce collaborators to these options and help them decide how they might be used to address their particular goals.
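As an illustration of how noise injection can protect individuals while keeping aggregate insights useful, the sketch below applies the Laplace mechanism to a simple counting query. It is a minimal example under stated assumptions: the records, the function names and the choice of epsilon are hypothetical, and the noise is added to the query result rather than to the raw records, which is one common variant of Differential Privacy.

```python
import random
from typing import Callable, Dict, List

Record = Dict[str, int]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(
    records: List[Record],
    predicate: Callable[[Record], bool],
    epsilon: float,
    rng: random.Random,
) -> float:
    """Return a noisy count of the records matching the predicate.

    Adding or removing one record changes a count by at most 1, so the
    noise scale is 1 / epsilon. Smaller epsilon means more noise.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records held by one partner.
records = [{"age": 34}, {"age": 51}, {"age": 47}, {"age": 29}, {"age": 62}]
rng = random.Random(0)
noisy = private_count(records, lambda r: r["age"] > 40, epsilon=1.0, rng=rng)
print(f"noisy count of people over 40: {noisy:.2f}")
```

Any single answer is deliberately imprecise, but the noise averages out across many queries or larger datasets, which is why aggregate insights remain usable.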

What does a Data Federation look like?

A Data Federation is what happens when a group of organisations get together to attempt a shared project on the basis of the possibilities afforded by the technologies described above. Typically, this means that they are more concerned with creating a digital tool which will help them address a specific set of goals, rather than assembling a standardised, centralised dataset. We think this creates a distinct set of options for how they might organise their collaboration.

Data Federations are inherently flexible structures, but every federation needs to fulfil a set of key functions:

  • Establish shared values and common purpose
  • Surface barriers and ethical concerns
  • Develop a shared understanding of how technical possibilities map onto specific goals
  • Create a governance structure to manage ownership of the tool and distribution of value

What are the possibilities of Federated Collaboration?

Compared with collaborations built around the construction of a large centralised dataset, Data Federations have a number of distinctive characteristics. A Data Federation:

  • Offers a secure, straightforward mechanism for mitigating concerns around the privacy of sensitive or proprietary information
  • Focuses on creating shared purpose and value, rather than assets
  • Has a relatively low barrier to entry and a low initial governance burden
  • Accommodates different ways of understanding and working with data
  • Provides a space for the development of shared understanding and collaborative relationships

Who Can Build a Data Federation?

The Data Federations model opens the possibility of digital collaboration to a range of actors who would otherwise be unable to consider it. These include organisations who:

  • Are seeking to engage in complex multi-stakeholder collaborations across different sectors
  • Lack sufficient data to engage with ML or analytics on their own, but can find partners to work with
  • Are concerned about exposing private or proprietary data
  • Lack the time or resources to engage in complex data standardisation work
  • Are struggling to reconcile differences in perspective between project partners

If you can see yourself in what you’ve just read and are excited by the possibilities we’ve described, get in touch with us today to explore how to start building your Data Federation.

Be sure to download and read our Data Federations Prospectus below.
