Core Principles
Federated learning (FL) represents a transformative paradigm in AI model training (Chauhan, 2022).
- How?
- Enables multiple decentralized clients, such as mobile devices, edge nodes, or organizational servers, to participate in training.
 
 - For:
- Collaboratively developing a shared global model without centralizing or directly exchanging their raw, sensitive data.
 
 - Operating:
- Through a client-server architecture.
 
- Steps (a code sketch follows this list):
- The process commences with a central server broadcasting the global model to a selected subset of clients.
 - Each client trains this model locally on its private dataset.
 - Clients upload only their updated model parameters or gradients.
 - The central server then aggregates these local updates to form an improved global model.
 - The improved global model is redistributed to clients for the subsequent training round.
 - This cycle repeats until the model converges or a predefined number of rounds is completed.
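A minimal sketch of one such round, assuming a FedAvg-style unweighted average over flat NumPy parameter vectors; `local_train`, the client list, and all numeric values are illustrative placeholders rather than a prescribed implementation:

```python
import numpy as np

def local_train(global_params: np.ndarray, client_data) -> np.ndarray:
    # Hypothetical stand-in for a client's local optimization: a real
    # client would run several epochs of SGD on its private dataset.
    simulated_update = np.random.randn(*global_params.shape) * 0.01
    return global_params - simulated_update

def federated_round(global_params: np.ndarray, clients: list) -> np.ndarray:
    # 1. Server broadcasts the global model to a selected subset of clients.
    selected = clients[: max(1, len(clients) // 2)]
    # 2-3. Each selected client trains locally and uploads only its
    #      updated parameters, never its raw data.
    local_params = [local_train(global_params, c) for c in selected]
    # 4-5. Server aggregates (here, an unweighted average) into an
    #      improved global model, redistributed in the next round.
    return np.mean(local_params, axis=0)

params = np.zeros(10)
for _ in range(5):  # 6. Repeat for a fixed round budget or until convergence.
    params = federated_round(params, clients=[None] * 4)
```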
 
- Benefits: (Duan et al., 2022)
- Shifts the computational burden from a central server to the clients.
 - Keeps data processing at the data's origin.
 - Enhances data locality and privacy.
 
 
Importance for Data Privacy and Distributed AI
Federated learning holds particular significance for applications in highly regulated and privacy-sensitive sectors (Chauhan, 2022), such as:
- Healthcare analytics
 - Financial fraud detection
 - Anti-money laundering
 
Data protection regulations, such as:
- The European Union’s General Data Protection Regulation (GDPR).
 - More pertinently here, India’s Digital Personal Data Protection (DPDP) Act, 2023.
 
These laws often prohibit or severely restrict the direct sharing of raw personal data across entities or national borders (Yu et al., 2023).
Federated learning offers a robust solution by allowing organizations to leverage vast, distributed datasets for AI model development while maintaining data locality and confidentiality, thereby facilitating compliance with data protection mandates (Duan et al., 2022).
Overview of Privacy-Preserving Techniques in FL
While FL preserves privacy by keeping raw data decentralized, it is not entirely immune to privacy attacks. Information might still be inferred from shared model updates or gradients. (Koutsoubis et al., 2025)
Therefore, integrating additional Privacy-Enhancing Technologies (PETs) has become necessary to achieve truly privacy-preserving FL.
This understanding is critical for the framework design, ensuring it addresses the full spectrum of privacy risks.
Key privacy-preserving technologies include:
- Differential Privacy (DP)
 - Secure Multi-Party Computation (SMPC)
 - Homomorphic Encryption (HE)
 - Other techniques
 
Differential Privacy (DP)
This technique involves adding a carefully calibrated amount of random noise to the model updates or gradients before they are transmitted to the central server (Chauhan, 2022).
- Provides:
- Strong, mathematically quantifiable privacy guarantee. (Milvus Team, 2024)
 
 - Ensures:
- Presence or absence of any single individual’s data point in the training set does not significantly affect the final model. (Milvus Team, 2024)
 
 - Trade-off:
- The strength of privacy is controlled by a parameter, epsilon (ε): a lower ε provides stronger privacy but introduces a trade-off with model accuracy. (Milvus Team, 2024)
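As a concrete illustration, below is a minimal sketch of the Gaussian mechanism applied to a client update before upload; the clipping norm and noise multiplier are illustrative values chosen here, not parameters prescribed by the cited sources:

```python
import numpy as np

def privatize_update(update: np.ndarray,
                     clip_norm: float = 1.0,
                     noise_multiplier: float = 1.1) -> np.ndarray:
    # Bound each client's influence by clipping the update to a fixed
    # L2 norm, then add Gaussian noise calibrated to that bound.
    # A larger noise_multiplier corresponds to a smaller epsilon:
    # stronger privacy, but typically lower model accuracy.
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = np.random.normal(0.0, noise_multiplier * clip_norm,
                             size=update.shape)
    return clipped + noise

noisy_update = privatize_update(np.random.randn(10))
```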
 
 
Secure Multi-Party Computation (SMPC)
SMPC employs cryptographic protocols that enable multiple parties to jointly compute a function over their private inputs without revealing those inputs to each other or to a central aggregator (Chauhan, 2022).
- In short: (Milvus Team, 2024)
- Enables the central server to aggregate model updates from the clients,
 - while each individual contribution remains masked and inaccessible.
 - This approach is considered efficient for large-scale FL deployments (see the sketch below).
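A toy sketch of the masking idea behind secure aggregation: each pair of clients agrees on a random mask that one adds and the other subtracts, so the masks cancel in the server's sum. Real protocols derive masks from shared keys and handle client dropouts; this simplified version samples masks directly for illustration:

```python
import numpy as np

def masked_updates(updates, rng):
    # Each pair (i, j) shares a random mask; client i adds it and
    # client j subtracts it, so all masks cancel in the aggregate.
    n, shape = len(updates), updates[0].shape
    masked = [u.astype(float).copy() for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.normal(size=shape)
            masked[i] += mask
            masked[j] -= mask
    return masked

rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]
# The server sees only masked vectors, yet recovers the exact sum.
server_sum = sum(masked_updates(updates, rng))
assert np.allclose(server_sum, sum(updates))
```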
 
 
Homomorphic Encryption (HE)
It is a powerful cryptographic technique that permits computations to be performed directly on encrypted data without prior decryption (Chauhan, 2022).
- Steps: (Milvus Team, 2024)
- Clients encrypt their model updates.
 - They send the ciphertexts to the server.
 - The server aggregates these ciphertexts directly.
 - The result remains encrypted and can only be decrypted by authorized parties (see the sketch below).
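A minimal sketch using the additively homomorphic Paillier scheme via the open-source `phe` (python-paillier) package, with a single scalar parameter per client for readability; real deployments encrypt whole parameter vectors and keep the private key away from the aggregation server:

```python
from phe import paillier  # pip install phe (python-paillier)

# In an FL setting the key pair would be held by the clients or a
# trusted authority; the aggregation server sees only ciphertexts.
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

client_updates = [0.12, -0.05, 0.33]  # one model parameter, three clients
ciphertexts = [public_key.encrypt(u) for u in client_updates]

# Server-side: addition is performed directly on encrypted values.
encrypted_sum = ciphertexts[0] + ciphertexts[1] + ciphertexts[2]

# Only an authorized key holder can decrypt the aggregated result.
average = private_key.decrypt(encrypted_sum) / len(client_updates)
```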
 
 - Trade-off: (Milvus Team, 2024)
- Computationally intensive,
 - which can limit its practicality for real-time applications
 - and hinders performance on resource-constrained edge devices.
 
 
Other Techniques
Beyond these core methods, further techniques can reduce potential gradient leakage and improve communication efficiency (Timofte et al., 2025):
- Gradient clipping (limiting the magnitude of gradients, as in the DP sketch above).
 - Fisher information-based parameter selection (sharing only the most informative parameters; see the sketch below).
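As a rough illustration of the second idea, the sketch below approximates per-parameter Fisher information by the squared gradient (a common diagonal approximation) and shares only the top fraction of parameters; this is a generic heuristic for illustration, not necessarily the exact method of Timofte et al. (2025):

```python
import numpy as np

def select_by_fisher(gradients: np.ndarray, keep_fraction: float = 0.1):
    # Approximate diagonal Fisher information by squared gradients,
    # then keep only the most informative entries. Sharing a sparse
    # update reduces both the leakage surface and communication cost.
    fisher_scores = gradients ** 2
    k = max(1, int(keep_fraction * gradients.size))
    top_idx = np.argsort(fisher_scores)[-k:]
    sparse_update = np.zeros_like(gradients)
    sparse_update[top_idx] = gradients[top_idx]
    return sparse_update

sparse = select_by_fisher(np.random.randn(100))
```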
 
Conclusion
When evaluating these PETs, a recurring theme is the inherent trade-off between privacy, performance, and efficiency.
- Differential Privacy involves an “accuracy trade-off”.
 - Homomorphic Encryption is described as “computationally intensive”.
 
This highlights a fundamental trilemma, in which maximizing privacy often:
- Compromises model accuracy
 - Increases computational overhead
 - Reduces communication efficiency
 
The proposed framework cannot simply aim for the strongest privacy; it must meticulously balance these three dimensions.
Challenge
The specific constraints of Indian network infrastructure and the performance requirements of the healthcare and finance applications will dictate the optimal balance, making this a central challenge and a key area for the research’s contribution and benchmarking.