Pneumonia is one of the fatal diseases, causing the death of around 4 million people yearly. Several previous studies have sought to detect pneumonia using state-of-the-art machine learning methods. However, medical image detection is challenging because the images are high in spatial resolution, heterogeneous in appearance, and complex in pattern. To overcome this challenge, a large volume of data is needed, which could be obtained through collaborative data sharing among hospitals and medical institutes. But the General Data Protection Regulation (GDPR) and the Data Protection Act (DPA) 2018 do not allow institutions to share customer data with third-party companies. With the restrictions imposed by UK (EU) rules and regulations, the major challenge for researchers is the accessibility of data. As a result, a method is needed that gives machine learning models access to enough data to make accurate predictions while maintaining privacy. In this research, a hybrid approach combining machine learning models with a federated learning framework is proposed to use distributed data in a privacy-preserving manner. In the experiment, chest radiographs are used to detect pneumonia by distributing the data across varying numbers of simulated clients and training the model on each client individually. Training takes place locally on each client in the distributed federated learning framework, and the trained model is shared with the central federated learning server. The best-performing models have been benchmarked on malaria and brain tumor datasets. The research also reports significance tests comparing the models' performance in the federated learning framework. The research contribution includes the hybrid framework of federated learning and CNN-based pre-trained models, which allows access to distributed data in a privacy-preserving manner.
The analysis has been performed on the pneumonia dataset using machine learning algorithms that include the convolutional neural network (CNN) based pre-trained models AlexNet, DenseNet, Residual Network (ResNet50), Inception, and Visual Geometry Group 19 (VGG19). This research will allow hospitals and medical institutes to collaborate and make mutual use of data. This thesis gives a clear pathway of the effective approaches that can be adopted to enhance diagnosis and improve healthcare. It also presents state-of-the-art methods for different medical image detection tasks, along with their limitations and future potential. The benchmark analysis gives a clear reflection of the potential effectiveness of the findings and their future scope. I have selected the algorithms through experimental analysis for effective classification, as they are state-of-the-art methods. Due to the complexity of medical images (especially X-ray images), I need a vast number of images to train the model correctly and precisely. The novel aspect of the research is the development of a hybrid framework that combines individual algorithms with federated learning while ensuring data privacy through a secure aggregation encryption method. The preliminary results showed that ResNet50 and DenseNet perform well in contrast to the others in the federated learning framework. The research answers the questions of how data can be shared collaboratively while keeping privacy intact and which machine learning models can be used in medical image detection. I have also demonstrated the future scope of the research, which will allow hospitals and medical institutes (including National Health Service (NHS) bodies) to share live stream data for effective machine learning modelling in a privacy-preserving manner. This thesis reflects the hybrid approach of using CNN-based pre-trained models in a federated learning framework for medical image detection and is, to the best of my knowledge, a novel contribution to scientific knowledge.
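The client-side training and server-side aggregation described in the abstract can be illustrated with a minimal sketch. It assumes weighted federated averaging (FedAvg) as the aggregation rule, which the abstract does not name, and it substitutes a tiny linear model for the CNN-based pre-trained models. The secure aggregation encryption step is not shown. All function names (`local_update`, `federated_average`, `run_rounds`) are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Weights = List[float]
Sample = Tuple[List[float], float]  # (features, target)


def local_update(weights: Weights, data: List[Sample],
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Train a linear model on one client's private data with plain SGD.

    Only the resulting weights leave the client; the raw data never does.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Combine client models, weighting each by its number of samples
    (assumed FedAvg rule, not confirmed by the abstract)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[List[Sample]], dim: int,
               rounds: int = 20) -> Weights:
    """Simulate the server loop: broadcast, train locally, aggregate."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_w = federated_average(updates, sizes)
    return global_w


if __name__ == "__main__":
    # Two simulated clients holding disjoint samples of y = 2 * x.
    client_a = [([1.0], 2.0), ([2.0], 4.0)]
    client_b = [([3.0], 6.0)]
    print(run_rounds([client_a, client_b], dim=1))
```

In this toy setting each simulated client sees only its own samples, and the server only ever receives model weights, which mirrors the privacy-preserving data flow the abstract describes.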
| Date of Award | 2024 |
|---|---|
| Original language | English |
| Awarding Institution | University of Bedfordshire |
| Supervisor | Haiming Liu (Supervisor) & Vladan Velisavljevic (Second supervisor) |
- Privacy Preserving
- Image Detection
- Medical Disease Detection
- Machine Learning
- Federated Learning
- Subject Categories::G760 Machine Learning
A privacy-preserving approach to effectively utilise distributed data for medical disease detection
Kareem, A. (Author). 2024
Student thesis: Doctoral thesis