Introduction to Federated Learning

Last Updated on July 24, 2023 by Editorial Team

Author(s): Manish Nayak

Originally published on Towards AI.

Introduction

Any deep learning model learns from data, and that data usually has to be collected on or uploaded to a server (one machine or a data center). The most realistic and meaningful deep learning models learn from personal data. Personal data is extremely private and sensitive, and few people are willing to upload it to a server. Federated learning is a collaborative machine learning approach in which a model is trained without centralizing the data on a server, and that is what makes it such a significant shift.

What if we bring the model to the data, where it is generated, instead of bringing the data to one location and training the model there?

The main use case is repeatedly improving a pre-trained model using data from many mobile devices, or from embedded devices such as Internet of Things devices connected to the Internet, without uploading that data to a central server or the cloud.

This is really interesting because the actual solution to this problem is quite simple. First, the clients (the mobile devices) receive a pre-trained model and improve it using their local data. The model is trained locally on each device, and the device then sends the updated model back to the server.

The server combines all the models it receives from the clients. This combined model becomes the next initial model sent to the clients, and the process repeats. Every device gets the benefit of every other device's data.
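The round-trip described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's or any library's implementation: it assumes a tiny linear model y = w * x + b, assumes the server combines client models with an average weighted by each client's number of examples, and uses hypothetical helper names (local_update, federated_average, run_rounds, make_client) and made-up data.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Improve a copy of the global model on one client's local data.

    The model is y = w * x + b, trained with plain gradient descent
    on mean squared error. The data never leaves this function.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Combine client models, weighting each by its number of examples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def run_rounds(global_weights, clients, rounds=50):
    """Repeat: send the model out, train locally, average the results."""
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_client(lo, hi, n):
    """Simulated private data following y = 2x + 1 on the range [lo, hi]."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return [(x, 2 * x + 1) for x in xs]


# Three simulated devices, each seeing a different slice of the input range.
clients = [make_client(0.0, 1.0, 5), make_client(-1.0, 0.0, 10), make_client(-1.0, 1.0, 20)]
final_w, final_b = run_rounds((0.0, 0.0), clients)
```

In this sketch only the (w, b) pairs travel between clients and server, and the server still ends up with a model close to w = 2, b = 1, even though no single client covers the whole input range.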

Challenges with a federated learning approach

Performance: Even a client with only a few training examples can still learn a little from its data. But if we have 50,000 clients, each with a small amount of data, and the model is really big, they spend most of their time sending the model back and forth and not much time training.
Privacy: By looking at how the weights changed, someone may be able to infer personal data. So federated learning offers no protection if the training data can be recovered from the weight updates.

To deal with these issues, Google developed the Secure Aggregation protocol. The main idea is that the server generates a public and private key pair and shares the public key with each client.

The clients then talk directly to each other and share their updated weights, encrypted with the server's public key. Every client holds only the public key shared by the server, so no client can read another client's weight update.

The clients accumulate all of their models' weights into a single, final update that is sent back to the server. The server then decrypts it using the private key and updates the server's model weights. Because the server receives only the accumulated weights, it cannot see any particular client's weight update either. In the Secure Aggregation protocol, no individual phone's update can be inspected by the server before averaging. The server can ask a client to share the update, and the client responds only if it has synced with other clients and accumulated their weights beyond some threshold, that is, |Number of Clients| > threshold. After getting a response from the client, the server reconstructs the accumulated weights with the private key and computes the aggregate value.
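A complementary way to see how a server can learn only a sum is pairwise additive masking, a technique used in secure aggregation schemes. The sketch below is a simplified illustration under that assumption, not the Secure Aggregation protocol itself. The function names are hypothetical, a single seeded generator stands in for the secrets each pair of clients would agree on privately, and client dropouts are not handled.

```python
import random


def masked_updates(updates, modulus=2**32, seed=0):
    """Add pairwise masks that cancel when all masked updates are summed.

    For each pair (i, j) with i < j, client i adds a random mask and
    client j subtracts the same mask, so the masks vanish in the total.
    """
    rng = random.Random(seed)  # stand-in for per-pair shared secrets
    n = len(updates)
    masked = [u % modulus for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.randrange(modulus)
            masked[i] = (masked[i] + mask) % modulus
            masked[j] = (masked[j] - mask) % modulus
    return masked


def aggregate(masked, threshold, modulus=2**32):
    """Return the sum only if more than `threshold` clients contributed."""
    if len(masked) <= threshold:
        raise ValueError("not enough clients to aggregate safely")
    return sum(masked) % modulus


# Integer-encoded updates from four simulated clients.
updates = [12, 7, 30, 5]
masked = masked_updates(updates)
total = aggregate(masked, threshold=2)  # equals 12 + 7 + 30 + 5 = 54
```

Each masked value on its own looks like a random number to the server, while the sum of all of them equals the sum of the original updates. The threshold check mirrors the |Number of Clients| > threshold condition described above.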

One question you may ask: how do clients accumulate weights that are encrypted? The answer is homomorphic encryption. Homomorphic encryption lets you perform computation on encrypted values without decrypting them. You can try it yourself by installing the python-paillier library (pip install phe).

Homomorphic Encryption in Python
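To show the additive property without any external dependency, here is a toy Paillier scheme written with only the Python standard library. It is a sketch for illustration, not python-paillier's API and not secure: the primes are far too small, the function names are hypothetical, and the weights are assumed to be non-negative integers (for example, real weights scaled by 1000). It requires Python 3.8 or later for modular inverses via pow.

```python
import math
import secrets

# Toy parameters: two well-known Mersenne primes. Far too small for real use.
P = 2**31 - 1
Q = 2**61 - 1


def generate_keypair(p=P, q=Q):
    """Return (public key n, private key (lam, mu)), using generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam; valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, c):
    """Recover the plaintext using the private key."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


# The server creates the key pair and shares only the public key.
public_key, private_key = generate_keypair()

# Each client encrypts its (integer-encoded) weight update.
client_weights = [1250, 3075, 410]
ciphertexts = [encrypt(public_key, w) for w in client_weights]

# Clients accumulate the encrypted updates without decrypting anything.
encrypted_sum = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_sum = add_encrypted(public_key, encrypted_sum, c)

# Only the server, holding the private key, can read the total.
total = decrypt(public_key, private_key, encrypted_sum)  # 4735
```

Holders of the public key can combine ciphertexts but cannot read them, which matches the flow described earlier: clients accumulate encrypted updates and the server decrypts only the aggregate. For anything beyond experimentation, use a maintained library such as python-paillier with properly sized keys.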

Conclusion

Using federated learning, we can now develop very useful and precise models that learn from personal data in areas such as healthcare and personal management, where data sets are often tightly locked away, which makes research difficult.

I hope this article helped you understand how federated learning uses users' personal and private data. I also tried to explain how that data stays secure: no client's deep learning model can sniff around another user's sensitive data, and not even the server's deep learning model can see it.

Published via Towards AI

References

Federated Learning: Collaborative Machine Learning without Centralized Training Data

Standard machine learning approaches require centralizing the training data on one machine or in a datacenter. And…

Practical Secure Aggregation for Federated Learning on User-Held Data – Google AI

Secure Aggregation is a class of Secure Multi-Party Computation algorithms wherein a group of mutually distrustful…
