Abstract
In this article, we investigate the federated clustering (FedC) problem, which aims to accurately partition unlabeled data samples distributed over a massive number of clients into a finite number of clusters under the orchestration of a parameter server (PS), while preserving data privacy. Although FedC is an NP-hard optimization problem involving real variables denoting the cluster centroids and binary variables denoting the cluster membership of each data sample, we judiciously reformulate it as a nonconvex optimization problem with only one convex constraint, which yields a soft clustering solution. We then propose a novel FedC algorithm based on the differential privacy (DP) technique, referred to as DP-FedC, which also accounts for partial client participation (PCP) and multiple local model updating steps. Furthermore, various attributes of the proposed DP-FedC are obtained through theoretical analyses of its privacy protection and convergence rate, especially for the case of non-independent and identically distributed (non-i.i.d.) data, and these attributes serve as guidelines for the design of the proposed DP-FedC. Finally, experimental results on two real datasets are provided to demonstrate the efficacy of the proposed DP-FedC, its superior performance over some state-of-the-art FedC algorithms, and its consistency with all the presented analytical results.
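The ingredients named in the abstract (soft cluster memberships, multiple local updating steps, partial client participation, and DP noise on what clients upload) can be illustrated with a minimal sketch. This is not the paper's DP-FedC algorithm or its reformulation: it assumes a generic softmax-based soft k-means objective, norm clipping plus Gaussian noise on each client's update, and plain server-side averaging. All function names and parameters (`soft_assign`, `local_update`, `dp_fed_soft_kmeans`, `clip_norm`, `sigma`, `frac`) are hypothetical, and no privacy accounting is performed.

```python
import numpy as np

def soft_assign(X, C, temp=1.0):
    """Soft memberships: softmax over negative squared distances to centroids."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    logits = -d2 / temp
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

def local_update(X, C, steps=5, lr=0.5):
    """Several local steps on the client's data, memberships held fixed per step."""
    C = C.copy()
    n = X.shape[0]
    for _ in range(steps):
        P = soft_assign(X, C)
        # Descent direction of (1/2n) * sum_i sum_k P_ik * ||x_i - c_k||^2 w.r.t. c_k
        C += lr * (P.T @ X - P.sum(axis=0)[:, None] * C) / n
    return C

def clip(delta, max_norm):
    """Scale the update so its Frobenius norm is at most max_norm."""
    norm = np.linalg.norm(delta)
    return delta * min(1.0, max_norm / (norm + 1e-12))

def dp_fed_soft_kmeans(clients, C0, rounds=20, frac=0.5,
                       clip_norm=1.0, sigma=0.1, seed=0):
    """Assumed training loop: sample clients, clip and noise updates, average."""
    rng = np.random.default_rng(seed)
    C = C0.copy()
    m = max(1, int(round(frac * len(clients))))  # clients sampled per round
    for _ in range(rounds):
        chosen = rng.choice(len(clients), size=m, replace=False)
        updates = []
        for i in chosen:
            delta = clip(local_update(clients[i], C) - C, clip_norm)
            # Gaussian noise scaled to the clipping bound (assumed mechanism)
            noise = rng.normal(0.0, sigma * clip_norm, size=delta.shape)
            updates.append(delta + noise)
        C = C + np.mean(updates, axis=0)  # server aggregates noisy updates
    return C

# Usage: four clients, each holding data from a single cluster (non-i.i.d.).
data_rng = np.random.default_rng(1)
clients = [data_rng.normal(loc=c, scale=0.1, size=(30, 2))
           for c in ([0, 0], [3, 3], [0, 0], [3, 3])]
C0 = np.array([[0.5, 0.5], [2.5, 2.5]])
centroids = dp_fed_soft_kmeans(clients, C0, sigma=0.05)
```

In this sketch the clipping bound limits how much any single client can move the centroids, which is what allows the Gaussian noise to be calibrated; how the noise level, the number of local steps, and the participation rate should be chosen for a given privacy budget is the subject of the paper's analysis and is not reproduced here.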
Original language | English |
---|---|
Pages (from-to) | 6705-6721 |
Number of pages | 17 |
Journal | IEEE Internet of Things Journal |
Volume | 11 |
Issue number | 4 |
Publication status | Published - 2024 Feb 15 |
All Science Journal Classification (ASJC) codes
- Signal Processing
- Information Systems
- Hardware and Architecture
- Computer Science Applications
- Computer Networks and Communications