Extremely Randomized Trees With Privacy Preservation for Distributed Structured Health Data

Amin Aminifar¹, Matin Shokri², Fazle Rabbi¹,³, Violet Ka I. Pun¹,⁴, Yngve Lamo¹
¹ Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Norway
² Faculty of Computer Engineering, K. N. Toosi University of Technology, Iran
³ Department of Information Science and Media Studies, University of Bergen, Norway
⁴ Department of Informatics, University of Oslo, Norway

IEEE Access, 2022
DOI: https://doi.org/10.1109/ACCESS.2022.3141709


Graphical abstract

Learning from decentralized data with privacy-preserving distributed extremely randomized trees.

Abstract

Artificial intelligence and machine learning have recently attracted considerable attention in the healthcare domain. The data used by machine learning algorithms in healthcare applications is often distributed over multiple sources, for instance, hospitals or patients’ personal devices. One main difficulty lies in analyzing such data without compromising patients’ privacy and personal data, which is a primary concern in healthcare applications. Therefore, in these applications, we are interested in running machine learning algorithms over distributed data without disclosing sensitive information about the data subjects. In this paper, we propose a distributed extremely randomized trees algorithm for learning from distributed data with privacy preservation. We present the implementation of our technique (which we refer to as k-PPD-ERT) on a cloud platform and demonstrate its performance based on medical data, including Heart Disease, Breast Cancer, and mental health datasets (Depresjon and Psykose datasets) associated with the Norwegian INTROducing Mental health through Adaptive Technology (INTROMAT) project.

BibTeX

@article{aminifar2022extremely,
  title={Extremely randomized trees with privacy preservation for distributed structured health data},
  author={Aminifar, Amin and Shokri, Matin and Rabbi, Fazle and Pun, Violet Ka I and Lamo, Yngve},
  journal={IEEE Access},
  volume={10},
  pages={6010--6027},
  year={2022},
  publisher={IEEE}
}