A comparative study of federated learning and synthetic data for privacy-aware machine learning
- Akhtar Hussain
- Atiquer Rahman Sarkar
- Eunjin Kim
- Noman Mohammed
- University of North Dakota
- University of Manitoba
Sustainable Development Goals
- SDG 3 Good Health and Well-being
Abstract
Healthcare institutions face a critical challenge in training and deploying machine learning applications: data are scarce, and stringent privacy regulations restrict how they can be shared. In this case study on breast cancer identification, we evaluated four experimental scenarios under conditions of limited data availability and strict privacy requirements. Specifically, we compared: (i) federated learning with distributed real data, (ii) federated learning with synthetic data, (iii) centralized learning on aggregated synthetic datasets generated locally, and (iv) multi-step synthetic data generation, in which synthetic data are generated centrally from local synthetic samples. Our results indicate that when local datasets are too small to be useful independently, federated learning with real data achieves the highest performance among the four scenarios, outperforming federated learning with synthetic data. In contrast, models developed on aggregated synthetic datasets or via multi-step synthetic data generation yielded suboptimal results. Although federated learning with real data was the best-performing of these strategies, it still fell behind centralized learning with pooled real data. These results indicate that federated learning is preferable to synthetic data approaches in low-data scenarios. Additionally, the modest performance gap relative to the centralized real-data benchmark underscores the importance of further research into improved federated methods.
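As an illustration of scenario (i), the following is a minimal sketch of federated training in which each site fits a model on its own small dataset and only model weights are shared and averaged. The abstract does not name the aggregation algorithm or the model used, so the sample-size-weighted averaging (in the style of federated averaging), the logistic-regression model, the toy data, and all function names here are assumptions for illustration only.

```python
import math
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of logistic-regression SGD on one client's private data."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            # Gradient of the log loss for a single example is (p - y) * x.
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    return w


def fed_avg(client_weights, client_sizes):
    """Average client models, weighting each by its local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def make_client(rng, n):
    """Hypothetical toy data: the label is 1 when the first feature is positive."""
    data = []
    for _ in range(n):
        x0 = rng.uniform(-1, 1)
        data.append(([x0, 1.0], 1 if x0 > 0 else 0))  # second entry is a bias term
    return data


rng = random.Random(0)
clients = [make_client(rng, n) for n in (8, 12, 20)]  # small local datasets (assumed sizes)

global_w = [0.0, 0.0]
for _ in range(20):  # communication rounds
    # Each client trains locally; raw records never leave the client.
    updates = [local_update(global_w, data) for data in clients]
    # The server only sees model weights and combines them.
    global_w = fed_avg(updates, [len(d) for d in clients])
```

Scenario (ii) would, under the same assumptions, replace each client's `data` with locally generated synthetic records before calling `local_update`; the aggregation step would stay unchanged.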
Publication Information
Output type
Original language
English
Pages from-to (Number of pages)
Pages 185-190 (6 pages)
Publication milestones
- Published - 16/02/2026
Publication status
Publisher
Institute of Electrical and Electronics Engineers Inc., United States
Publication series
- Publication series name: 2025 IEEE 16th Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON 2025
ISBN (Electronic)
9798331565053
External Publication IDs
- Scopus: 105034710769
Host publication title
2025 IEEE 16th Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON 2025
Host publication editors
- Rajashree Paul
