Promising Applications in Health Care
Several recent and ongoing projects use federated learning in health care. Recent initiatives include the HealthChain project, which has developed a federated learning framework across four hospitals in France to predict treatment response for breast cancer and melanoma patients.2 Ongoing projects include the Federated Tumour Segmentation (FeTS) initiative, an international federation of 30 health care institutions using an open-source federated learning framework.3 There are also industrial applications, because federated learning allows competing companies to collaborate on research without revealing their proprietary data.
“Federated learning has tremendous potential across numerous domains, particularly within healthcare, as shown by our research with Penn Medicine. Its ability to protect sensitive information and data opens the door for future studies and collaboration, especially in cases where datasets would otherwise be inaccessible.”
–Jason Martin, Principal Engineer, Intel Labs4
Overcoming Challenges
While federated learning techniques are gaining traction, certain limitations hinder their widespread adoption. One challenge is uneven data distribution: some devices hold significantly more data than others, and some classes are far better represented than others, a problem referred to as “class imbalance.” Another is that the data on different devices might not resemble one another. Both issues can make it harder for the machine learning model to train effectively. Adding to the complexity, federated learning by design keeps the training data hidden, or “encrypted,” so that neither the devices nor the central server can access the complete data set. As a result, methods used in the past to address class imbalance in other settings might not work well for federated learning.
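The two kinds of skew described above can be measured separately. The following is a minimal sketch, assuming three hypothetical devices with binary labels; the device names, the counts, and the helper functions size_share and imbalance_ratio are illustrative and do not come from any of the cited projects. In a real federation each device would compute its own figures locally, since the server cannot see the labels.

```python
from collections import Counter

# Hypothetical label lists held by three devices (0 = negative, 1 = positive).
client_labels = {
    "hospital_a": [0] * 900 + [1] * 100,
    "hospital_b": [0] * 40 + [1] * 10,
    "hospital_c": [0] * 5 + [1] * 45,
}

def size_share(labels_by_client):
    """Fraction of all samples held by each client (quantity skew)."""
    total = sum(len(v) for v in labels_by_client.values())
    return {name: len(v) / total for name, v in labels_by_client.items()}

def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (class imbalance)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

shares = size_share(client_labels)
ratios = {name: imbalance_ratio(v) for name, v in client_labels.items()}
```

In this made-up example, hospital_a and hospital_c have the same imbalance ratio, yet their majority classes are opposite and hospital_a holds most of the data, which shows why quantity skew and class imbalance need to be tracked separately.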
Recently, a new method has been proposed for improving the efficiency of federated learning in handling imbalanced data.5 This method involves a hybrid data sampling strategy that addresses data imbalance at both the global and local levels. The approach enhances the performance of federated learning without requiring direct access to, or control over, the device-level model. It also retains the benefits of federated learning, such as privacy preservation and reduced communication costs, while improving the model’s accuracy and robustness.
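To give a rough sense of what hybrid sampling can mean at the local level, the sketch below combines undersampling of over-represented classes with oversampling of under-represented ones on a single device. It is an assumption-laden illustration, not the cited method: the function hybrid_resample, the target_per_class parameter, and the toy data are hypothetical, and the global-level part of the strategy is not shown.

```python
import random
from collections import defaultdict

def hybrid_resample(samples, labels, target_per_class, seed=0):
    """Undersample classes above the target and oversample classes below it."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    new_samples, new_labels = [], []
    for y, items in sorted(by_class.items()):
        if len(items) >= target_per_class:
            # Over-represented class: keep a random subset, no repeats.
            chosen = rng.sample(items, target_per_class)
        else:
            # Under-represented class: keep every item, then add random duplicates.
            extra = rng.choices(items, k=target_per_class - len(items))
            chosen = items + extra
        new_samples.extend(chosen)
        new_labels.extend([y] * target_per_class)
    return new_samples, new_labels

# Hypothetical local data set: 90 negatives (ids 0-89), 10 positives (ids 90-99).
samples = list(range(100))
labels = [0] * 90 + [1] * 10
balanced_x, balanced_y = hybrid_resample(samples, labels, target_per_class=30)
```

Because the resampling runs entirely on the device, no raw records need to leave it, which is consistent with the privacy benefit the paragraph above describes.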
Success Stories
One significant success story of federated learning in health care predictive analytics comes from a research project known as the EXAM study, led by Nvidia’s Global Head of Medical AI Mona G. Flores and other researchers.6 The aim of the study was to build a model using local data, as well as data across a federated network, to predict outcomes for patients who arrived at the emergency department with respiratory complaints.
The study demonstrated the feasibility and benefits of federated learning in the health care domain, including its potential to enable hospitals to collaborate and provide federated access to data without compromising patient privacy and security. Importantly, the study showed that the federated learning approach improved the performance of the predictive model, creating a global federated model that outperformed any local model. The model also showed a high degree of generalizability to unseen data in a subsequent validation study. This work points to federated learning’s potential to transform the way hospitals collaborate to improve patient outcomes.
Another case is the joint research study by Intel Labs and the Perelman School of Medicine at the University of Pennsylvania.7 This study used federated learning to help international health care and research institutions identify malignant brain tumors. The study drew on an unprecedented global data set from 71 institutions across six continents and demonstrated the ability to improve brain tumor detection by 33 percent.
This system addressed numerous data privacy concerns by keeping raw data inside the data holders’ computer infrastructure and allowing only model updates computed from that data, not the data itself, to be sent to a central server or aggregator. The study showed the potential of federated learning as a paradigm shift in securing multi-institutional collaboration by enabling access to the largest and most diverse data set of glioblastoma patients ever considered in the literature, while all data were retained within each institution at all times.
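The pattern described here, in which raw data stay on site and only model updates reach the aggregator, can be sketched with a sample-weighted average of locally trained weights, a scheme commonly called federated averaging. This is a toy illustration under stated assumptions: the source does not say which aggregation rule the study used, and the linear model, the two sites, their data, and the functions local_update and federated_average are all hypothetical.

```python
def local_update(weights, features, targets, lr=0.1, epochs=20):
    """Run gradient descent for a linear model on one site's private data."""
    w = list(weights)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n  # gradient of mean squared error
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w, n  # only the weights and the sample count leave the site

def federated_average(updates):
    """Average the site weights, weighting each site by its sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]

# Hypothetical private data at two sites; each row is [1, x] and y = 2 * x + 1.
site_data = [
    ([[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]], [-1.0, 1.0, 3.0]),
    ([[1.0, -2.0], [1.0, 2.0]], [-3.0, 5.0]),
]

global_w = [0.0, 0.0]
for _ in range(10):  # communication rounds
    updates = [local_update(global_w, X, y) for X, y in site_data]
    global_w = federated_average(updates)
# global_w ends up close to [1.0, 2.0], the intercept and slope shared by both sites
```

The aggregator in this sketch only ever receives weight vectors and sample counts, never the feature rows or targets themselves.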
The research demonstrated the effectiveness of federated learning at scale and the potential benefits the health care industry can realize when multi-site data silos are unlocked. Benefits include early detection of disease, which could improve quality of life or increase a patient’s life span.
Endnotes