

Introduction To Federated Learning

Last Updated on February 24, 2025 by Editorial Team

Author(s): William Lindskog

Originally published on Towards AI.

Brought to you by Flower Labs.

Breaking Down AI: What Makes It Work?

Artificial Intelligence (AI) is transforming everything from self-driving cars to personalized medicine to financial fraud detection. But what exactly makes AI work?

In practice, AI consists of three fundamental components:

  1. AI Models → The brains of AI. Models process data and make decisions, like recognizing faces in photos or predicting stock market trends. Most modern AI models, such as neural networks, are inspired by the way the human brain works: information "flows" through layers of interconnected nodes, leading to a decision.
  2. Hardware → The muscle behind AI. If you've heard of GPUs (Graphics Processing Units), you already know why they matter. GPUs are highly efficient at handling AI computations, enabling faster training and inference. This specialized hardware is why AI has accelerated so rapidly in the last decade.
  3. Data → The fuel of AI. The more quality data an AI model has, the better it performs. Everything from search engines to chatbots to healthcare AI relies on vast amounts of high-quality data to learn and improve.

But here's the problem: Data has become AI's biggest bottleneck.

Traditional recipe for breakthroughs in AI. We are running out of publicly available data.

The AI Data Problem: Why Traditional Approaches Don't Scale

AI models don't improve on their own. They need continuous exposure to diverse, high-quality data to become more accurate, reliable, and useful. The challenge? Access to public data is shrinking, not growing [1]. Many industries, e.g., healthcare, finance, and automotive, have the data that could supercharge AI, but it is trapped behind barriers that make large-scale AI training incredibly difficult [2].

The Best Data Is Private

The most valuable datasets, e.g., medical records, financial transactions, personal conversations, and real-world driving data, are highly sensitive. This data can't simply be shared or copied without violating privacy laws and ethical standards.

  • Hospitals can't share patient records with external AI labs due to HIPAA and GDPR [3,4]. These "silos" hold the potential to train AIs with capabilities beyond what is currently possible. While there are efforts to make data publicly available, the amount is insignificant compared to what is kept within organizations.
  • Banks can't expose customer transactions to train better fraud detection models [5]. Collaborative efforts would enable better detection of fraudulent transactions and money laundering. Yet today, detecting these can be challenging even with data from entities within a single larger enterprise.
  • Users don't want their private messages collected for improving chatbots. This is quite obvious, yet we still want personalized AIs that help us catch grammatical errors and typos.

As a result, many AI models end up being trained on public or outdated datasets that don't reflect real-world scenarios. The AI remains detached from the actual environment where it's supposed to operate.

Data is global but in silos that are difficult to access.

Data Is Scattered Across the World

Even when data can be used, it's rarely in one place. Instead, it's fragmented across multiple devices, companies, or institutions, and creating a centralized dataset is cumbersome. Larger platforms and organizations can benefit from big consumer groups and in-house datasets, but even data transfer between in-house entities or units is not always straightforward. Smaller research groups often need to resort to publicly available data or construct their own; while this is possible, it is not only time-consuming but also costly.

  • Medical AI needs patient data from multiple hospitals, but each hospital has its own isolated dataset. Furthermore, making this sort of data public often means anonymizing the data, which complicates matters when attempting to develop personalized AI models.
  • Self-driving cars collect sensor data, but who actually owns the data that originates in a vehicle? For example, a company like Volvo might have its own fleet of vehicles but run third-party software to power self-driving capabilities. Yet this software runs on hardware produced by another company, which actually stores the data. To top it off, the operating system might be something like Android, so with which company does the data actually reside? This is where collaboration is needed.
  • Large language models (LLMs) improve with real user interactions, but personal conversations are stored on individual phones. Thus, to develop their capabilities, LLMs need more data from users' phones. Nevertheless, centralizing this data is out of the question: people are used to end-to-end encryption and do not want their messages stored anywhere other than their phones.

Centralizing this data into one place is often impossible due to logistics, security, and trust issues.

AI Needs More Data, but It's Getting Less

There is a clear correlation: the more quality data an AI model gets, the better it performs [6]. Therefore, anyone interested in seeing AI get better needs to ensure that it can at least access the knowledge contained in more data. Nevertheless, the amount of publicly available data is decreasing rapidly [1]. This leads to a paradox:

  • AI models require massive, real-world datasets to improve. You might have heard that the latest LLMs are trained on almost all text on the internet. While this is up for discussion, there is no doubt that they are trained and fine-tuned on enormous amounts of data.
  • The best datasets are locked in silos. While there are several reasons for this, e.g., regulations or corporate policies, an argument can be made that collaboration is beneficial. Take healthcare as an example. Suppose there are healthcare clinics fragmented across regions. Privacy laws hinder them from sharing patient data, but perhaps only one clinic is large enough to have the data to train a somewhat accurate cancer predictor. The smaller clinics don't have enough quality data and would benefit hugely from the larger clinic's knowledge. Conversely, the larger clinic would also benefit from more data from all the different clinics, but privacy laws put an end to this. While the reason for the law might at one point have been reasonable, the current outcome is not.

So, what's the solution? How can AI train on more data while respecting privacy and security?

This is where Federated Learning (FL) [7] changes the game.

FL addresses a fundamental limitation in AI development: access to high-quality, real-world data is diminishing, yet AI models require more data than ever to improve. Current methods rely on centralizing datasets for training, but as we have established, this approach is increasingly infeasible due to logistical barriers. The question, then, is how do we enable AI to improve without direct access to more data?

A Paradigm Shift in AI Training

FL presents a novel approach: instead of bringing the data to the model, we bring the model to the data. This decentralized method allows multiple entities, whether they be hospitals, financial institutions, or individual devices, to collaboratively train an AI model without sharing raw data.

FL is built upon a simple yet powerful mechanism:

  1. A global AI model is initialized and distributed across decentralized participants (clients). Let's take next-word prediction for texting on smartphones: the task is arbitrary, but it is one many people are familiar with.
  2. Each participant trains the model locally using its own private dataset. This means that the AI used to predict the next word on your phone trains on the messages that you type. Importantly, the messages themselves are kept on your phone and not sent anywhere: the AI learns from the messages but does not itself store any message you have sent.
  3. Instead of sending raw data to a central server, only model updates (weight adjustments, gradients, or encrypted parameters) are shared. This means that your texts/messages are never shared with other devices; what is shared is the updated state of the AI on your phone, not the data.
  4. A central coordinator aggregates these updates to improve the global model while ensuring that no single participant's data is exposed. Thus, we still learn from millions, maybe even billions, of devices, but the data never leaves, in this case, the phone.

This cyclical process repeats over multiple training rounds, leading to an AI model that benefits from a diverse, distributed dataset without compromising privacy or security. The sketch below walks through a single round.
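To make these four steps concrete, here is a minimal simulation of FedAvg-style training rounds in plain NumPy. It is a toy sketch under assumed conditions: the linear model, the synthetic client datasets, and every name in it (local_train, clients, and so on) are invented for illustration and are not part of any real FL framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: initialize a global model (here, the weights of a linear regressor).
global_weights = np.zeros(5)

# Each client holds a private synthetic dataset that never leaves the client.
true_weights = rng.normal(size=5)
clients = []
for _ in range(10):
    X = rng.normal(size=(100, 5))
    y = X @ true_weights + 0.1 * rng.normal(size=100)
    clients.append((X, y))

def local_train(weights, X, y, lr=0.05, epochs=5):
    """Step 2: a client copies the global model and trains it on local data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE loss
        w -= lr * grad
    return w

for round_num in range(20):
    # Step 3: each client shares only its updated weights, never (X, y).
    updates = [local_train(global_weights, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients])
    # Step 4: the server aggregates updates, weighted by local dataset size.
    global_weights = np.average(updates, axis=0, weights=sizes / sizes.sum())

print("distance to true weights:", np.linalg.norm(global_weights - true_weights))
```

Weighting each update by the client's dataset size mirrors the classic federated averaging rule; in a real deployment, the updates would also travel over a secure channel.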

An initial model is hosted on the cloud (1); "copies" of it are sent out to, in this case, different smartphones (2); each copy is trained on the data of its smartphone, resulting in slightly different models, as each has been exposed to different texts/messages (3); the model updates, not the texts themselves, are sent back to the server (4), where they are aggregated to form a better AI (5).

Theoretical Foundation of Federated Learning

FL builds upon several well-established principles in distributed optimization and privacy-preserving machine learning:

  • Decentralized Stochastic Gradient Descent (SGD): Instead of computing gradients from a centralized dataset, gradients are computed locally and aggregated at the server [7].
  • Secure Aggregation: Techniques such as differential privacy [8] and homomorphic encryption [9] ensure that individual updates remain confidential while still contributing to model improvement (a toy sketch follows this list).
  • Statistical Heterogeneity Management: Unlike centralized training, where datasets are curated for uniformity, FL must account for non-IID data, i.e., data that is not independent and identically distributed: different clients may have vastly different distributions. Take driving-assistance capabilities in automotive. Suppose you have two drivers: an aggressive one (1) who likes to hit the gas and see the car take off quickly, and a more environmentally friendly one (2) who cares more about reaching the destination having saved fuel. If an AI trains on data from these two drivers in a federated fashion, the end model might be okay, but just okay: since their behaviors vary greatly, it may have "landed" somewhere in between that serves neither driver perfectly. Thus, FL needs to handle this, and current state-of-the-art frameworks can [10-12].
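As a toy illustration of the privacy-preserving idea above, the sketch below clips each client update and adds Gaussian noise before it leaves the client, in the spirit of differential privacy [8]. The clipping bound and noise scale are arbitrary assumptions for this example; a real system would calibrate them to a formal privacy budget and typically combine them with cryptographic secure aggregation.

```python
import numpy as np

rng = np.random.default_rng(1)

def privatize_update(update, clip_norm=1.0, noise_std=0.1):
    """Clip the update's L2 norm, then add Gaussian noise before sharing."""
    scale = min(1.0, clip_norm / (np.linalg.norm(update) + 1e-12))
    return update * scale + rng.normal(scale=noise_std, size=update.shape)

# The server only ever sees noised updates; averaged over many clients,
# the noise largely cancels out while any single client's update stays hidden.
client_updates = [rng.normal(size=5) for _ in range(1000)]
aggregate = np.mean([privatize_update(u) for u in client_updates], axis=0)
print(aggregate)
```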

This approach allows AI to scale beyond traditional limitations, accessing knowledge from previously inaccessible, private, or fragmented datasets. We are already seeing FL deployed at scale in various industries. Frameworks like Flower [10] let you experiment with FL solutions and deploy them in real-world applications; a minimal client/server skeleton follows.
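For orientation, here is what such a setup looks like with Flower's classic NumPyClient interface (as documented around Flower 1.x; newer releases favor the ClientApp/ServerApp entry points, so check the current documentation). The toy linear model and its data are invented placeholders for this sketch; the commented-out start calls show how the server and client processes would be launched.

```python
import flwr as fl
import numpy as np

class ToyClient(fl.client.NumPyClient):
    """A Flower client holding a private toy dataset and a linear model."""

    def __init__(self):
        rng = np.random.default_rng()
        self.X = rng.normal(size=(100, 5))  # private local data
        self.y = self.X @ np.ones(5)
        self.w = np.zeros(5)

    def get_parameters(self, config):
        return [self.w]

    def fit(self, parameters, config):
        # Receive global weights, run a few local gradient steps on
        # private data, and return only the updated weights.
        self.w = parameters[0]
        for _ in range(5):
            grad = 2 * self.X.T @ (self.X @ self.w - self.y) / len(self.y)
            self.w = self.w - 0.05 * grad
        return [self.w], len(self.y), {}

    def evaluate(self, parameters, config):
        loss = float(np.mean((self.X @ parameters[0] - self.y) ** 2))
        return loss, len(self.y), {}

# In one process, start the server with FedAvg aggregation:
# fl.server.start_server(
#     server_address="0.0.0.0:8080",
#     config=fl.server.ServerConfig(num_rounds=3),
#     strategy=fl.server.strategy.FedAvg(),
# )

# In each participant's process, start a client:
# fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=ToyClient())
```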

Applications of Federated Learning Across Industries

Healthcare: Preserving Patient Privacy While Advancing AI

The potential for AI in healthcare is undeniable: AI models can assist in diagnosing diseases, predicting patient deterioration, and optimizing treatment plans. However, strict regulations such as HIPAA (U.S.), GDPR (EU), and local data protection laws prevent hospitals from sharing sensitive patient records. This is understandable: patients do not want, for example, images of their bodies or blood values distributed across the globe. Moreover, such sensitive data can be used in harmful ways, so it should stay protected.

Traditional AI development requires large, diverse datasets, but hospitals cannot freely exchange patient data across institutions or regions. Federated Learning enables collaborative AI training across hospitals without exposing patient records. Each hospital locally trains the AI model using its own dataset, and only model updates are shared for aggregation. A cancer detection model could be trained across multiple hospitals worldwide, improving diagnostic accuracy for rare conditions without violating patient privacy laws [13-15].

Other industries are also adopting FL: automotive [16], finance [17], telecommunications [18], and more. The reason is simple:

More data = Better AI

The Future of AI is Federated

Federated Learning represents a fundamental shift in how AI models are trained. By enabling collaborative, privacy-preserving AI, FL bridges the gap between the need for more data and the constraints of modern privacy and security regulations.

Key Takeaways

  • AI requires continuous exposure to diverse, high-quality data, but traditional data collection is becoming infeasible.
  • Federated Learning enables AI models to learn from decentralized data sources while preserving privacy.
  • Industries such as healthcare, automotive, and finance are already adopting FL to unlock new AI capabilities.
  • The future of AI will depend on distributed, privacy-conscious training methodologies, and FL is leading the way.

🚀 Want to explore Federated Learning further? Start building with Flower, the open-source framework for privacy-preserving AI.

References

[1] Villalobos, Pablo, et al. "Will we run out of data? Limits of LLM scaling based on human-generated data." arXiv preprint arXiv:2211.04325 (2022).

[2] Karimireddy, Sai Praneeth, et al. "Breaking the centralized barrier for cross-device federated learning." Advances in Neural Information Processing Systems 34 (2021): 28663–28676.

[3] Brauneck, Alissa, et al. "Federated machine learning, privacy-enhancing technologies, and data protection laws in medical research: scoping review." Journal of Medical Internet Research 25 (2023): e41588.

[4] Loftus, Tyler J., et al. "Federated learning for preserving data privacy in collaborative healthcare research." Digital Health 8 (2022): 20552076221134455.

[5] Long, Guodong, et al. "Federated learning for open banking." Federated Learning: Privacy and Incentive. Cham: Springer International Publishing, 2020. 240–254.

[6] Budach, Lukas, et al. "The effects of data quality on machine learning performance." arXiv preprint arXiv:2207.14529 (2022).

[7] Konečný, Jakub, et al. "Federated learning: Strategies for improving communication efficiency." arXiv preprint arXiv:1610.05492 (2016).

[8] Dwork, Cynthia. "Differential privacy." International Colloquium on Automata, Languages, and Programming. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006.

[9] Gentry, Craig. "Fully homomorphic encryption using ideal lattices." Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing. 2009.

[10] Beutel, Daniel J., et al. "Flower: A friendly federated learning research framework." arXiv preprint arXiv:2007.14390 (2020).

[11] Zhu, Hangyu, et al. "Federated learning on non-IID data: A survey." Neurocomputing 465 (2021): 371–390.

[12] Li, Qinbin, et al. "Federated learning on non-IID data silos: An experimental study." 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022.

[13] Qayyum, Adnan, et al. "Collaborative federated learning for healthcare: Multi-modal COVID-19 diagnosis at the edge." IEEE Open Journal of the Computer Society 3 (2022): 172–184.

[14] Chaddad, Ahmad, Yihang Wu, and Christian Desrosiers. "Federated learning for healthcare applications." IEEE Internet of Things Journal 11.5 (2023): 7339–7358.

[15] Antunes, Rodolfo Stoffel, et al. "Federated learning for healthcare: Systematic review and architecture proposal." ACM Transactions on Intelligent Systems and Technology (TIST) 13.4 (2022): 1–23.

[16] Lindskog, William, Valentin Spannagl, and Christian Prehofer. "Federated learning for drowsiness detection in connected vehicles." International Conference on Intelligent Transport Systems. Cham: Springer Nature Switzerland, 2023.

[17] Wen, Jie, et al. "A survey on federated learning: challenges and applications." International Journal of Machine Learning and Cybernetics 14.2 (2023): 513–535.

[18] Lee, Joohyung, et al. "Federated learning-empowered mobile network management for 5G and beyond networks: From access to core." IEEE Communications Surveys & Tutorials 26.3 (2024): 2176–2212.
