
1 Strategy in Federated Learning
Author(s): William Lindskog
Originally published on Towards AI.
AI took a new direction with this strategy.

Federated learning (FL) emerged from the realization that modern mobile devices and edge nodes generate vast amounts of data but cannot easily share it due to privacy, bandwidth and regulatory constraints.
In 2016, researchers proposed federated learning to allow decentralized devices to collaboratively train a model while keeping data on‑device [1]. At the heart of this paradigm is Federated Averaging (FedAvg). FedAvg is an algorithm that aggregates locally computed model updates from participants to form a global model without ever seeing the raw data.
This article gives a scientific yet accessible overview of FedAvg, discusses its benefits and challenges, looks at recent advances and applications, and highlights open research directions.
1. What is Federated Averaging?
Federated learning (FL) distributes the training of a machine‑learning model across a set of clients (smartphones, IoT devices, hospitals, banks and so on).
Each client trains a local model on its own data and sends updates (model weights or gradients) to a central server, which aggregates them to form an improved global model.

FedAvg is the most common aggregation algorithm: the server simply averages the client updates (optionally with weights proportional to dataset sizes) and uses this as the new global model.
By keeping data localized and exchanging only model updates, which are averaged on the server to improve the shared model, the strategy enables collaborative training across decentralized devices while addressing privacy concerns, network bandwidth constraints and heterogeneous data distributions. [2]
The original FedAvg paper by McMahan et al. combined iterative model averaging with deep learning to produce models that are robust to unbalanced and non‑IID data distributions.
By letting clients perform multiple local training steps before communicating, FedAvg dramatically reduces communication rounds (by 10–100× compared with synchronized stochastic gradient descent).
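Formally, with K clients where client k holds n_k of the n total training examples, the objective FedAvg targets can be written as a weighted sum of local empirical risks (standard notation, following [1]):

```latex
\min_{w}\; f(w) = \sum_{k=1}^{K} \frac{n_k}{n}\, F_k(w),
\qquad
F_k(w) = \frac{1}{n_k} \sum_{i \in \mathcal{P}_k} \ell_i(w)
```

where \mathcal{P}_k indexes the examples stored on client k and \ell_i(w) is the loss of model w on example i.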
2. How does the FedAvg algorithm work?
The standard FedAvg protocol can be described in five steps:
1. Initialization: The server initializes a global model.
2. Send copies: The server sends copies of the global model to the participating clients.
3. Local training: Each client receives the global model and performs E epochs of local training on its own data. After training, each client sends its updated model weights back to the server.
4. Aggregation: The server collects all client models and computes the new global model as a weighted average, w_{t+1} = \sum_{k=1}^{K} (n_k / n) \, w_{t+1}^{k}, where the weight n_k/n reflects the proportion of data client k contributes.
5. Iteration: Steps 2–4 are repeated until the global model converges.
This averaging rule gives more influence to clients with larger datasets while still incorporating contributions from smaller clients.
FedAvg is communication‑efficient: performing multiple local epochs reduces the number of server–client interactions. To handle uneven data or device availability, servers may employ weighted averaging or partial participant sampling.
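To make these steps concrete, below is a minimal, self-contained sketch of the protocol in Python/NumPy. The quadratic local objective, the toy client datasets and the hyper-parameters are illustrative assumptions rather than anything prescribed by FedAvg itself, which works with any model trained by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train(weights, data, epochs=5, lr=0.1):
    """Illustrative local update: a few gradient steps on a least-squares objective."""
    X, y = data
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5 * ||Xw - y||^2 / n_k
        w -= lr * grad
    return w

# Toy federation: four clients with differently sized local datasets.
true_w = np.array([2.0, -1.0, 0.5])
clients = []
for n_k in [20, 50, 100, 30]:
    X = rng.normal(size=(n_k, 3))
    y = X @ true_w + 0.1 * rng.normal(size=n_k)
    clients.append((X, y))

global_w = np.zeros(3)                        # step 1: initialize the global model
for round_idx in range(10):                   # step 5: repeat for several rounds
    updates, sizes = [], []
    for data in clients:                      # steps 2-3: broadcast and train locally
        updates.append(local_train(global_w, data))
        sizes.append(len(data[1]))
    n = sum(sizes)
    # step 4: aggregate with weights n_k / n
    global_w = sum((n_k / n) * w_k for n_k, w_k in zip(sizes, updates))
    print(f"round {round_idx}: global weights ~ {np.round(global_w, 3)}")
```

Each client runs several local epochs before a single aggregation step, which is where the 10–100× reduction in communication rounds over per-step synchronization comes from.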

3. Limitations and challenges
Despite its elegance, FedAvg faces several challenges:
- Statistical heterogeneity (non‑IID data): When clients’ data distributions differ substantially, the averaged model may not represent any client well. Notable sources of heterogeneity include bias in which clients are available and differences in how reliably devices complete and report updates. Techniques such as client clustering, transfer learning and personalization are being explored. [3], [4]
- Communication constraints: Although FedAvg reduces communication, sending model updates can still be costly in low‑bandwidth or high‑latency environments. Methods like model/gradient compression, asynchronous communication and hierarchical aggregation aim to reduce the amount of data transmitted (see the sketch after this list). [5]
- Device heterogeneity: Differences in computational power, network stability and participation rates lead to staleness and convergence issues. Adaptive techniques, participant sampling and server‑side momentum help mitigate these issues. [6]
- Privacy and security threats: FL reduces data exposure but is still vulnerable to data poisoning, membership inference and model inversion attacks. Differential privacy, secure multi‑party computation and homomorphic encryption can protect against these threats. [7]
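As a small illustration of the compression idea mentioned above, the sketch below keeps only the largest-magnitude entries of a model update before transmission. The 1% keep-ratio and the dense toy update are arbitrary assumptions, and real systems typically add error feedback or quantization on top of this.

```python
import numpy as np

def top_k_sparsify(update, keep_frac=0.01):
    """Keep only the largest-magnitude entries; transmit (indices, values, shape)."""
    flat = update.ravel()
    k = max(1, int(keep_frac * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the k largest magnitudes
    return idx, flat[idx], update.shape

def densify(idx, values, shape):
    """Server-side reconstruction of the sparse update."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = values
    return flat.reshape(shape)

# Example: a 1,000 x 1,000 toy update compressed to ~1% of its entries.
update = np.random.default_rng(1).normal(size=(1000, 1000))
payload = top_k_sparsify(update, keep_frac=0.01)
recovered = densify(*payload)
print(payload[0].size, "of", update.size, "entries transmitted")
```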
FedAvg continues to evolve. In 2025, researchers have constructed strategies for a wide range of applications and domains. The research field keeps moving, but there is now a more mature consensus on when to apply which strategy. Frameworks like Flower now include the largest repository of federated strategies that researchers can utilize. [8]
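For orientation, selecting the built-in FedAvg strategy in Flower looks roughly like the following. This is based on Flower's documented strategy API; the parameter values and server address are placeholders, and the exact entry points may differ between Flower versions.

```python
import flwr as fl

# Plug the built-in FedAvg strategy into a Flower server.
# fraction_fit controls the share of connected clients sampled each round.
strategy = fl.server.strategy.FedAvg(
    fraction_fit=0.5,
    min_available_clients=2,
)

fl.server.start_server(
    server_address="0.0.0.0:8080",               # placeholder address
    config=fl.server.ServerConfig(num_rounds=3),  # placeholder round count
    strategy=strategy,
)
```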
Future work aims to:
- Personalize global models: Use meta‑learning and transfer‑learning strategies to tailor models to individual clients while leveraging shared knowledge.
- Optimize communication: Further compress model updates or design hierarchical aggregation schemes to operate in bandwidth‑constrained environments.
- Enhance robustness: Develop algorithms resilient to adversarial attacks and client drop‑outs, and integrate anomaly detection to filter malicious updates. [9]
- Ensure fairness: Incorporate fairness metrics and data augmentation to mitigate bias across demographic groups.
- Establish incentive mechanisms: Use monetary rewards, reputation systems, or game‑theoretic approaches like the Shapley value to encourage participation.

4. Conclusion
Federated Averaging is the cornerstone of federated learning — simple yet powerful. By averaging local model updates, it harnesses diverse data sources without exposing sensitive information, reduces communication, and yields robust global models.
However, its success depends on addressing challenges like heterogeneity, communication constraints and security.
The algorithm continues to inspire new variants and theoretical analyses, as evidenced by recent continuous‑time studies. As researchers refine FedAvg and its derivatives, federated learning will play a vital role in building privacy‑preserving, scalable AI systems across mobile devices, healthcare, finance and beyond.
References
[1] McMahan, Brendan, et al. “Communication-efficient learning of deep networks from decentralized data.” Artificial Intelligence and Statistics. PMLR, 2017.
[2] Konečný, Jakub, et al. “Federated optimization: Distributed machine learning for on-device intelligence.” arXiv preprint arXiv:1610.02527 (2016).
[3] Zhao, Yue, et al. “Federated learning with non-IID data.” arXiv preprint arXiv:1806.00582 (2018).
[4] Tan, Alysa Ziying, et al. “Towards personalized federated learning.” IEEE Transactions on Neural Networks and Learning Systems 34.12 (2022): 9587–9603.
[5] Chen, Mingzhe, et al. “Communication-efficient federated learning.” Proceedings of the National Academy of Sciences 118.17 (2021): e2024789118.
[6] Pfeiffer, Kilian, et al. “Federated learning for computationally constrained heterogeneous devices: A survey.” ACM Computing Surveys 55.14s (2023): 1–27.
[7] Wen, Jie, et al. “A survey on federated learning: Challenges and applications.” International Journal of Machine Learning and Cybernetics 14.2 (2023): 513–535.
[8] Beutel, Daniel J., et al. “Flower: A friendly federated learning research framework.” arXiv preprint arXiv:2007.14390 (2020).
[9] Yurdem, Betul, et al. “Federated learning: Overview, strategies, applications, tools and future directions.” Heliyon 10.19 (2024).