A comparative study of federated learning and synthetic data for privacy-aware machine learning

Research Output: Chapter in Book/Report/Conference proceeding Conference contribution Peer-review

Sustainable Development Goals

  • SDG 3 - Good Health and Well-being

Abstract

Healthcare institutions face a critical challenge in training and deploying machine learning applications: data are scarce, and stringent privacy regulations restrict how they can be shared. In this case study on breast cancer identification, we evaluated four experimental scenarios under limited data availability and strict privacy requirements. Specifically, we compared: (i) federated learning with distributed real data, (ii) federated learning with synthetic data, (iii) centralized learning on aggregated synthetic datasets generated locally, and (iv) multi-step synthetic data generation. Our results indicate that when local datasets are too small to be useful independently, federated learning with real data achieves the highest performance of the four scenarios, outperforming federated learning with synthetic data. In contrast, models developed on aggregated synthetic datasets, or via centralized generation of synthetic data based on local synthetic samples, yielded suboptimal results. Although federated learning with real data was the best-performing of these strategies, it still fell behind centralized learning with pooled real data. These results demonstrate that federated learning is preferable to synthetic data approaches in low-data scenarios. Additionally, the modest performance gap relative to the centralized real-data benchmark underscores the importance of further research into improved federated methods.
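
The abstract does not state which aggregation rule the federated scenarios used. Purely as an illustration, and assuming a FedAvg-style scheme in which the server averages client parameters weighted by local sample count, one aggregation step might look like the sketch below. The function name `fedavg` and the flat-list parameter representation are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Illustrative FedAvg-style step (an assumption, not the paper's method).

    Each client contributes a flat parameter vector; the server returns the
    average of those vectors, weighted by each client's local sample count.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must send vectors of the same length")

    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        share = n / total  # this client's fraction of all training samples
        for j, value in enumerate(weights):
            averaged[j] += share * value
    return averaged
```

Under this scheme only parameter vectors leave each institution, never patient records, which is the property that makes scenario (i) privacy-aware while still training on real data.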

Publication Information

Original language

English

Pages from-to (Number of pages)

Pages 185-190 (6 pages)

Publication milestones

  • Published - 16/02/2026

Publisher

Institute of Electrical and Electronics Engineers Inc., United States

Publication series

  • Publication series name: 2025 IEEE 16th Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON 2025

ISBN (Electronic)

9798331565053

External Publication IDs

  • Scopus: 105034710769

Host publication title

2025 IEEE 16th Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON 2025

Host publication editors

  • Rajashree Paul