
Introduction

EXpectation Propagation LOgistic REgRession (EXPLORER): distributed privacy-preserving online model learning. Shuang Wang,1 Xiaoqian Jiang,1 Yuan Wu,1 Lijuan Cui,2 Samuel Cheng,2 and Lucila Ohno-Machado1.


Presentation Transcript


  1. EXpectation Propagation LOgistic REgRession (EXPLORER): distributed privacy-preserving online model learning

     Shuang Wang,1 Xiaoqian Jiang,1 Yuan Wu,1 Lijuan Cui,2 Samuel Cheng,2 and Lucila Ohno-Machado1
     1 Division of Biomedical Informatics, University of California–San Diego, La Jolla, California, USA
     2 School of Electrical and Computer Engineering, University of Oklahoma, Tulsa, Oklahoma, USA

     Introduction

     It has been shown over the last decade that data privacy cannot be maintained simply by removing patient identities. Thus, training data at one institution cannot be exchanged or shared directly with other institutions for the purpose of learning a global logistic regression model. To address this challenge, numerous privacy-preserving distributed frequentist regression models for horizontally partitioned data have been studied, among which the Grid LOgistic REgression (GLORE) model [1] and the Secure Pooled Analysis acRoss K-site (SPARK) protocol [2] are the closest to the method presented here. Despite its simplicity and interpretability, the distributed frequentist logistic regression approach has limitations, as shown in Table 1.

     Table 1: Comparing EXPLORER with GLORE and SPARK

     EXPLORER framework

     We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning [3]. The proposed framework provides a high-level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients.

     Experimental results show that EXPLORER achieves the same performance as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain on the entire data set when new observations are recorded. EXPLORER also supports asynchronous communication, which relieves the participants from coordinating with one another and prevents service breakdown caused by absent participants or interrupted communications.

     Methodology

     Secured Intermediate iNformation Exchange (SINE) protocol

     Experimental Results

     Table 2: Summary of datasets used in our experiments
     Table 3: Distributed forward feature selection on data set 1 over 30 trials
     Table 4: Comparisons of H-L tests and AUCs for simulated dataset 2 with/without interaction using Ordinary LR and 4-site EXPLORER
     Table 5: Learned model parameter β of dataset 3 using Ordinary LR and 2-site EXPLORER
     Figure: The convergence speed of all 10 coefficients of data set 4 for an asynchronous 8-site EXPLORER setup

     Summary of Conclusions

     In summary, EXPLORER offers an alternative tool for privacy-preserving distributed statistical learning. We showed empirically on multiple data sets that the results are very similar to those of ordinary logistic regression. These promising results warrant further validation in larger data sets and further refinement of the methodology. The inability to openly share (i.e., transmit) patient data without onerous processes involving pair-wise agreements between institutions may significantly slow down analyses that could produce important results for healthcare improvement and biomedical research advances. EXPLORER provides a means to mitigate this problem by relying on multiparty computation, without the need for extensive retraining of models or reliance on synchronous communication among sites.

     References

     [1] Wu, Y., Jiang, X., Kim, J., & Ohno-Machado, L. (2012). Grid Binary LOgistic REgression (GLORE): building shared models without sharing data. JAMIA, 19(5), 758-764.
     [2] El Emam, K., Samet, S., Arbuckle, L., Tamblyn, R., Earle, C., & Kantarcioglu, M. (2013). A secure distributed logistic regression protocol for the detection of rare adverse drug events. JAMIA, 20(3), 453-461.
     [3] Wang, S., Jiang, X., Wu, Y., Cui, L., Cheng, S., & Ohno-Machado, L. (2013). EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed privacy-preserving online model learning. JBI, 46(3), 480-496.
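The framework description above says that sites exchange posterior distributions of the coefficients rather than patient records, and that sites may report asynchronously. The sketch below illustrates only that message-passing idea and is not the authors' algorithm: it swaps in a Laplace-style Gaussian approximation of each site's likelihood in place of expectation propagation, and it omits encryption (the SINE protocol) entirely. The names `site_message` and `Server`, the unit prior precision, and the simulated data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def site_message(X, y, mean):
    """Gaussian approximation of one site's likelihood around `mean`,
    in natural parameters (precision, shift). Only these aggregates
    leave the site, never the rows of X or y.
    Assumption: Laplace-style linearization, not true EP."""
    p = sigmoid(X @ mean)
    w = p * (1.0 - p)
    precision = X.T @ (X * w[:, None])        # local negative Hessian
    shift = precision @ mean + X.T @ (y - p)  # precision @ mean + gradient
    return precision, shift

class Server:
    """Hypothetical coordinator: keeps the latest message from each site
    and multiplies the Gaussians by summing natural parameters."""
    def __init__(self, dim, prior_precision=1.0):
        self.dim = dim
        self.prior_precision = prior_precision * np.eye(dim)
        self.messages = {}  # site id -> latest (precision, shift)

    def receive(self, site_id, message):
        # Overwriting the old message lets a site report at any time.
        self.messages[site_id] = message

    def posterior(self):
        precision = self.prior_precision.copy()
        shift = np.zeros(self.dim)  # zero-mean prior
        for lam, eta in self.messages.values():
            precision += lam
            shift += eta
        cov = np.linalg.inv(precision)
        return cov @ shift, cov

# Simulated horizontally partitioned data: 4 sites, same 3 coefficients.
rng = np.random.default_rng(0)
true_beta = np.array([-0.5, 1.0, -2.0])
sites = {}
for k in range(4):
    X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
    y = (rng.random(200) < sigmoid(X @ true_beta)).astype(float)
    sites[k] = (X, y)

server = Server(dim=3)
for sweep in range(25):
    for k, (X, y) in sites.items():  # sites report one at a time
        mean, _ = server.posterior()
        server.receive(k, site_message(X, y, mean))

mean, cov = server.posterior()
print("approximate posterior mean:", mean)
```

Because the server stores one message per site and simply replaces it when a new one arrives, no site has to wait for the others. At convergence, the posterior mean in this sketch coincides with the penalized (MAP) estimate that would be obtained by pooling all rows in one place.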
