A Data Federation is a new model for digital collaboration, in which organisations commit their data to a common project without being obliged to share it with their partners. This possibility depends on the use of data technologies which allow us to perform analytics or predictive modelling on distributed datasets whilst preserving their privacy and integrity. At Etic Lab, we have access to a range of these technologies, and each has its own set of potential uses. In this blog, we’re going to take a closer look at one of these tools in particular: Federated Learning.
What Is Federated Learning?
Federated Learning (FL) is a new Machine Learning (ML) technique which allows predictive models to be developed from a range of separate datasets, without requiring that the data be shared, standardised or collected in a central location.
We’re all familiar with common ML applications – in search engines, recommendation tools, social media feeds. Large companies employ ML to manage supply chains, inventory and logistics, whilst scientists have used it to train machines to play games like Chess and Go, write original texts and perform dance routines. All of this is possible because ML is capable of rapidly working through vast sets of data, recognising and processing the patterns which allow it to progressively improve at its given task until it can perform to the required level of efficiency.
Dancing robots are very impressive, but at Etic Lab we’re more interested in what this technology might do in the hands of a different set of actors – ordinary businesses, civil society organisations, local government, charities and activists. So far, the opportunities for people like this to get in on the ML game have been limited for a host of reasons, some technological and some social.
To work properly, ML requires large quantities of training data. Typically, this data must be collected in one big centralised, standardised dataset. This basic requirement is a substantial barrier to many. Small and medium-sized organisations typically do not have access to data on this scale. They would therefore need to work together with like-minded partners, but collaboration creates its own set of problems. The relevant data may be private or proprietary; it is also likely to be scattered across different locations and held in different formats. This presents a set of legal, ethical and technical challenges which many organisations are simply not equipped to face.
Federated Learning offers networks of smaller actors a way to work together without compromising the security of their data or requiring them to tackle technical challenges which are beyond their resources.
How Does It Work?
With FL, all the learning takes place on the machines of the various organisations which have joined the federation. No data is ever exposed or shared beyond its original location; instead, local FL clients analyse the data and communicate what they have learned (typically updated model parameters rather than raw records) back to a central server. The server aggregates these different insights to produce an overall model, which is then relayed back to the local clients to inform their ongoing analysis. The process repeats until the central model can perform its task to the required level.
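For readers who like to see the loop written down, here is a minimal sketch of the round-trip described above. It assumes one common aggregation scheme, federated averaging, in which the server takes a weighted average of each client's model parameters; this post does not say which scheme any particular deployment uses. The function names, the toy linear model and the synthetic partner data are all hypothetical and chosen only for illustration.

```python
import random

def local_update(weights, data, lr=0.1, epochs=20):
    """Runs on a partner's own machine: refine the shared model on private data.
    Only the updated parameters and a record count leave this function."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b), n

def aggregate(updates):
    """Runs on the central server: average the parameters, weighted by
    how many records each partner trained on (federated averaging)."""
    total = sum(n for _, n in updates)
    w = sum(params[0] * n for params, n in updates) / total
    b = sum(params[1] * n for params, n in updates) / total
    return (w, b)

def run_federation(partner_datasets, rounds=50):
    """Repeat: send the model out, train locally, aggregate, send it out again."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in partner_datasets]
        global_weights = aggregate(updates)
    return global_weights

# Synthetic stand-in for three partners' private datasets (assumption:
# each follows roughly y = 2x + 1 with a little noise).
rng = random.Random(0)

def make_partner_data(n):
    xs = [rng.uniform(0, 2) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.1)) for x in xs]

partners = [make_partner_data(n) for n in (30, 50, 80)]
model = run_federation(partners)
print("shared model (slope, intercept):", model)
```

Note that the server only ever sees the pairs returned by local_update, never the (x, y) records themselves. A production system would typically add further protections on top of this, such as encrypting or masking the updates in transit.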
What Might This Look Like In Practice?
Imagine that you work for a charity which provides legal advice and support to people going through divorce. Part of your job involves planning what capacity you will require to meet the demands of the year ahead – how many trained advisors, volunteers, admin hours, what resources will be required, and so on. Experience has taught you to look out for certain trends. For instance, you know that your services will be in higher demand just after Christmas. By applying ML to client data, you could discover more insights like this, which would help you improve your planning. It may be, perhaps, that there is a measurable connection between divorce rates and rising unemployment, or changes in benefits provision.
Your own organisation does not possess sufficient data to attempt this project alone, but you have connections at a range of other charities who are prepared to help – not only family law specialists, but also organisations involved with employment, benefits and housing. However, the data involved is extremely sensitive, and you do not have the resources to anonymise and standardise each partner’s contribution. In this case, FL will allow you and your partners to engage in a project which would otherwise be completely impracticable.
Working together with your partners, you determine a set of questions relating to patterns of demand for legal support in relation to divorce, employment, benefits and housing. You also review what each organisation can contribute to the project, and agree how to manage their various constraints and dependencies. Through this process you are guided and motivated by your common goal – to provide more co-ordinated and effective support for people with pressing legal needs.
When it comes to deploying the FL tech, each member of the partnership receives a secure link to download the client to their individual machine, where it interfaces with the dataset which that partner has agreed to contribute to the project. Over a period of a few hours, the client studies the partner’s data to uncover patterns of demand, before communicating these patterns back to the central server. When aggregated, they provide a model which is able to predict how changes in demand in one area might affect another.
At the conclusion of the project, you have produced a tool which allows you and your partners to predict changes in demand and inform planning decisions. More importantly, however, you have developed a set of collaborative relationships based on a set of common values and a shared understanding of the problem you are trying to address. Several of your collaborators were particularly impressed by what you were able to achieve, and together you agree to enter into a long-term strategic partnership based on developing and utilising shared insights into the relationships between your fields of practice. This becomes a forum for joint advocacy, fundraising and the creation of policies to address the overlapping problems faced by the people you are supporting.
FL is an extremely powerful technology. By lowering the barriers to entry, it opens up the possibility of digital collaboration to a range of actors who would otherwise be unable to access it. However, in order to ensure that this technology is effectively and ethically used, these collaborations must take place within a framework which allows partners to make informed collective decisions based on shared purpose and common values. The goal of Etic Lab’s Data Federations project is to guide our partners and clients through the process of building such a framework. If you’re interested in discovering how your organisation might benefit from the creation of a Data Federation based around the use of FL, get in touch with us today to find out more.
Be sure to download and read our Data Federations Prospectus below.