
1 Strategy in Federated Learning
Author(s): William Lindskog
Originally published on Towards AI.
AI took a new direction with this strategy.

Federated learning (FL) emerged from the realization that modern mobile devices and edge nodes generate vast amounts of data but cannot easily share it due to privacy, bandwidth and regulatory constraints.
In 2016, researchers proposed federated learning to allow decentralized devices to collaboratively train a model while keeping data on‑device [1]. At the heart of this paradigm is Federated Averaging (FedAvg). FedAvg is an algorithm that aggregates locally computed model updates from participants to form a global model without ever seeing the raw data.
This article gives a scientific yet accessible overview of FedAvg, discusses its benefits and challenges, looks at recent advances and applications, and highlights open research directions.
1. What is Federated Averaging?
Federated learning (FL) distributes the training of a machine‑learning model across a set of clients (smartphones, IoT devices, hospitals, banks and so on).
Each client trains a local model on its own data and sends updates (model weights or gradients) to a central server, which aggregates them to form an improved global model.

FedAvg is the most common aggregation algorithm: the server simply averages the client updates (optionally with weights proportional to dataset sizes) and uses this as the new global model.
By keeping data localized and exchanging only model updates, which are averaged on the server to improve the shared model, the strategy enables collaborative training across decentralized devices while addressing privacy concerns, network bandwidth constraints and heterogeneous data distributions. [2]
The original FedAvg paper by McMahan et al. combined iterative model averaging with deep learning to produce models that are robust to unbalanced and non‑IID data distributions.
By letting clients perform multiple local training steps before communicating, FedAvg dramatically reduces communication rounds (by 10–100× compared with synchronized stochastic gradient descent).
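Formally, with K clients where client k holds n_k of the n total training examples, the objective FedAvg targets can be written as a weighted sum of local empirical risks (standard notation, following [1]):

```latex
\min_{w}\; f(w) = \sum_{k=1}^{K} \frac{n_k}{n}\, F_k(w),
\qquad
F_k(w) = \frac{1}{n_k} \sum_{i \in \mathcal{P}_k} \ell_i(w)
```

where \mathcal{P}_k indexes the examples stored on client k and \ell_i(w) is the loss of model w on example i.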
2. How does the FedAvg algorithm work?
The standard FedAvg protocol can be described in five steps:
1. Initialization: The server initializes a global model.
2. Send copies: The server sends copies of the global model to the participating clients.
3. Local training: Each client receives the global model and performs E epochs of local training on its own data. After training, each client sends its updated model weights back to the server.
4. Aggregation: The server collects all client models and computes the new global model as a weighted average, w_{t+1} = \sum_{k=1}^{K} (n_k / n) \, w_{t+1}^{k}, where the weight n_k/n reflects the proportion of data client k contributes.
5. Iteration: Steps 2–4 are repeated until the global model converges.
This averaging rule gives more influence to clients with larger datasets while still incorporating contributions from smaller clients.
FedAvg is communication‑efficient: performing multiple local epochs reduces the number of server–client interactions. To handle uneven data or device availability, servers may employ weighted averaging or partial participant sampling.
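To make these steps concrete, below is a minimal, self-contained sketch of the protocol in Python/NumPy. The quadratic local objective, the toy client datasets and the hyper-parameters are illustrative assumptions rather than anything prescribed by FedAvg itself, which works with any model trained by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train(weights, data, epochs=5, lr=0.1):
    """Illustrative local update: a few gradient steps on a least-squares objective."""
    X, y = data
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5 * ||Xw - y||^2 / n_k
        w -= lr * grad
    return w

# Toy federation: four clients with differently sized local datasets.
true_w = np.array([2.0, -1.0, 0.5])
clients = []
for n_k in [20, 50, 100, 30]:
    X = rng.normal(size=(n_k, 3))
    y = X @ true_w + 0.1 * rng.normal(size=n_k)
    clients.append((X, y))

global_w = np.zeros(3)                        # step 1: initialize the global model
for round_idx in range(10):                   # step 5: repeat for several rounds
    updates, sizes = [], []
    for data in clients:                      # steps 2-3: broadcast and train locally
        updates.append(local_train(global_w, data))
        sizes.append(len(data[1]))
    n = sum(sizes)
    # step 4: aggregate with weights n_k / n
    global_w = sum((n_k / n) * w_k for n_k, w_k in zip(sizes, updates))
    print(f"round {round_idx}: global weights ~ {np.round(global_w, 3)}")
```

Each client runs several local epochs before a single aggregation step, which is where the 10–100× reduction in communication rounds over per-step synchronization comes from.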

3. Limitations and challenges
Despite its elegance, FedAvg faces several challenges:
- Statistical heterogeneity (non‑IID data): When clients’ data distributions differ substantially, the averaged model may not represent any client well. Notable sources of heterogeneity include bias in which clients are available and differences in how reliably devices complete and report updates. Techniques such as client clustering, transfer learning and personalization are being explored. [3], [4]
- Communication constraints: Although FedAvg reduces communication, sending model updates can still be costly in low‑bandwidth or high‑latency environments. Methods like model/gradient compression, asynchronous communication and hierarchical aggregation aim to reduce the amount of data transmitted (see the sketch after this list). [5]
- Device heterogeneity: Differences in computational power, network stability and participation rates lead to staleness and convergence issues. Adaptive techniques, participant sampling and server‑side momentum help mitigate these issues. [6]
- Privacy and security threats: FL reduces data exposure but is still vulnerable to data poisoning, membership inference and model inversion attacks. Differential privacy, secure multi‑party computation and homomorphic encryption can protect against these threats. [7]
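As a small illustration of the compression idea mentioned above, the sketch below keeps only the largest-magnitude entries of a model update before transmission. The 1% keep-ratio and the dense toy update are arbitrary assumptions, and real systems typically add error feedback or quantization on top of this.

```python
import numpy as np

def top_k_sparsify(update, keep_frac=0.01):
    """Keep only the largest-magnitude entries; transmit (indices, values, shape)."""
    flat = update.ravel()
    k = max(1, int(keep_frac * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the k largest magnitudes
    return idx, flat[idx], update.shape

def densify(idx, values, shape):
    """Server-side reconstruction of the sparse update."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = values
    return flat.reshape(shape)

# Example: a 1,000 x 1,000 toy update compressed to ~1% of its entries.
update = np.random.default_rng(1).normal(size=(1000, 1000))
payload = top_k_sparsify(update, keep_frac=0.01)
recovered = densify(*payload)
print(payload[0].size, "of", update.size, "entries transmitted")
```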
FedAvg continues to evolve. In 2025, researchers have constructed strategies for a wide range of applications and domains. The research field keeps moving, but there is now a more mature consensus on when to apply which strategy. Frameworks like Flower now include the largest repository of federated strategies that researchers can utilize. [8]
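For orientation, selecting the built-in FedAvg strategy in Flower looks roughly like the following. This is based on Flower's documented strategy API; the parameter values and server address are placeholders, and the exact entry points may differ between Flower versions.

```python
import flwr as fl

# Plug the built-in FedAvg strategy into a Flower server.
# fraction_fit controls the share of connected clients sampled each round.
strategy = fl.server.strategy.FedAvg(
    fraction_fit=0.5,
    min_available_clients=2,
)

fl.server.start_server(
    server_address="0.0.0.0:8080",               # placeholder address
    config=fl.server.ServerConfig(num_rounds=3),  # placeholder round count
    strategy=strategy,
)
```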
Future work aims to:
- Personalize global models: Use meta‑learning and transfer‑learning strategies to tailor models to individual clients while leveraging shared knowledge.
- Optimize communication: Further compress model updates or design hierarchical aggregation schemes to operate in bandwidth‑constrained environments.
- Enhance robustness: Develop algorithms resilient to adversarial attacks and client drop‑outs, and integrate anomaly detection to filter malicious updates. [9]
- Ensure fairness: Incorporate fairness metrics and data augmentation to mitigate bias across demographic groups.
- Establish incentive mechanisms: Use monetary rewards, reputation systems, or game‑theoretic approaches like the Shapley value to encourage participation.

4. Conclusion
Federated Averaging is the cornerstone of federated learning — simple yet powerful. By averaging local model updates, it harnesses diverse data sources without exposing sensitive information, reduces communication, and yields robust global models.
However, its success depends on addressing challenges like heterogeneity, communication constraints and security.
The algorithm continues to inspire new variants and theoretical analyses, as evidenced by recent continuous‑time studies. As researchers refine FedAvg and its derivatives, federated learning will play a vital role in building privacy‑preserving, scalable AI systems across mobile devices, healthcare, finance and beyond.
References
[1] McMahan, Brendan, et al. “Communication-efficient learning of deep networks from decentralized data.” Artificial Intelligence and Statistics. PMLR, 2017.
[2] Konečný, Jakub, et al. “Federated optimization: Distributed machine learning for on-device intelligence.” arXiv preprint arXiv:1610.02527 (2016).
[3] Zhao, Yue, et al. “Federated learning with non-IID data.” arXiv preprint arXiv:1806.00582 (2018).
[4] Tan, Alysa Ziying, et al. “Towards personalized federated learning.” IEEE Transactions on Neural Networks and Learning Systems 34.12 (2022): 9587–9603.
[5] Chen, Mingzhe, et al. “Communication-efficient federated learning.” Proceedings of the National Academy of Sciences 118.17 (2021): e2024789118.
[6] Pfeiffer, Kilian, et al. “Federated learning for computationally constrained heterogeneous devices: A survey.” ACM Computing Surveys 55.14s (2023): 1–27.
[7] Wen, Jie, et al. “A survey on federated learning: Challenges and applications.” International Journal of Machine Learning and Cybernetics 14.2 (2023): 513–535.
[8] Beutel, Daniel J., et al. “Flower: A friendly federated learning research framework.” arXiv preprint arXiv:2007.14390 (2020).
[9] Yurdem, Betul, et al. “Federated learning: Overview, strategies, applications, tools and future directions.” Heliyon 10.19 (2024).