Privacy-Aware Image Search Using Machine Learning

Exploring the Possibility of Privacy-Aware Search on Images with Machine Learning
Jae Duk Seo
Jae.duk.seo@Ryerson.ca

Motivation
• Wilson’s PhD thesis, “Privacy-Aware Search and Computation Over Encrypted Data Stores”
• Is there a way to perform privacy-aware search on image data as well? (Not computation!)
• Note: the author of this presentation has limited knowledge of computer security in general.

NIPS 2017 Deep Steganography
• 1. Three CNNs
• 2. Secret Image: the message we wish to hide.
• 3. Cover Image: the image that conceals the secret message.

Recap / Summary
• 1. Have a Secret Image
• 2. Have a Cover Image
• 3. Make a Container Image (Secret Image + Cover Image)
• 4. Reveal the Secret Image from the Container Image
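The slides do not state the training objective for these four steps. As a reference point, the loss given in the cited paper (Baluja, 2017) can be written as below, where c is the cover image, c' the container image, s the secret image, s' the revealed image, and β a weight on the secret-reconstruction term. Whether the implementation discussed in this talk uses exactly this form is an assumption.

```latex
\mathcal{L}(c, c', s, s') = \lVert c - c' \rVert + \beta \, \lVert s - s' \rVert
```

The first term pushes the container to look like the cover; the second pushes the revealed image to match the secret.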

Implementation
• Every CNN has 50 channels, but the kernel size differs from network to network (Prep: 3, Hide: 4, Reveal: 5).
• Able to perform Dilated Back Propagation as well! (Next talk)
• For the full blog post, please visit this link.
• To run the code online, please visit this link.

Results from the Paper
• Red – Cover Image
• Blue – Secret Image
• Green – Container Image
• Purple – Revealed Image

Results from My Implementation
• Performs well on certain images
• However, it loses some data from the original image (open problem)
• Uses the data set “A BENCHMARK FOR SEMANTIC IMAGE SEGMENTATION”
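One way to put a number on the data loss mentioned above is to compare the secret image with the revealed image pixel by pixel. The sketch below is not from the original implementation; it assumes both images are available as flat sequences of 8-bit pixel values, and the function names are hypothetical.

```python
import math


def mean_absolute_error(original, revealed):
    """Average per-pixel absolute difference between two equal-length pixel sequences."""
    if len(original) != len(revealed):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(original, revealed)) / len(original)


def psnr(original, revealed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    if len(original) != len(revealed):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, revealed)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy data standing in for a secret image and its lossy reconstruction.
secret = [0, 64, 128, 255]
revealed = [2, 60, 130, 251]
print(mean_absolute_error(secret, revealed))  # 3.0
print(round(psnr(secret, revealed), 2))
```

A lower mean absolute error and a higher PSNR both indicate that less of the secret image was lost.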

Side Track: Transferring Image
• Bob wants to transfer an image safely to Alice
• White Box – Prep + Hiding Network
• Blue Box – Reveal Network

Side Track: Transferring Image Problems
• 1. We assumed Alice already had the Revealing Network.
• 2. If we retrain the Prep/Hiding Network, we need to retrain the Revealing Network as well.
• 3. If a ‘hacker’ obtains the cover image, they might be able to retrieve the original image using statistical analysis.

Side Track: Transferring Medical Images
• Overall Poor Results
• Information Loss
• Strange Artifacts on Carrier Image
• A) Small Images
• B) Non-Optimal Hyper-parameters

Naïve Method
• 1. Convert the (Secret / Carrier) Image into Base64
• 2. Use SHA-256 to hash the Base64 string into a fixed-length string (collisions are practically infeasible)
• 3. Dictionary Mapping
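The three steps above can be sketched with the Python standard library. This is a minimal illustration, not the author's code: the function name `image_key` is hypothetical, the byte strings are placeholders for real image files, and the direction of the dictionary (secret-image key to carrier-image key) is an assumption, since the slide only says "Dictionary Mapping".

```python
import base64
import hashlib


def image_key(image_bytes):
    """Base64-encode the raw image bytes, then hash them into a 64-character hex string."""
    encoded = base64.b64encode(image_bytes)
    return hashlib.sha256(encoded).hexdigest()


# Placeholder bytes standing in for the contents of two image files.
secret_bytes = b"secret image bytes (placeholder)"
carrier_bytes = b"carrier image bytes (placeholder)"

# Assumed mapping direction: key of the secret image -> key of its carrier image.
mapping = {image_key(secret_bytes): image_key(carrier_bytes)}

print(len(image_key(secret_bytes)))  # 64
```

Because the digest has a fixed length regardless of image size, it works as a compact dictionary key.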

Naïve Method to Make the Image Searchable
• 3. Upload the Carrier Image to a cloud
• 4. Delete the original image
• 5. When we need a specific image, look it up in the dictionary
• 6. Download and reveal
• Information loss (HUGE PROBLEM)
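The upload, lookup, and reveal steps above might look roughly like the following. Everything here is a stand-in: `CarrierStore` is a hypothetical in-memory substitute for the cloud, the keys are treated as opaque strings, and `hide` / `reveal` are passed in as callables because the real Prep/Hiding and Reveal networks are not reproduced here. The placeholder functions are lossless, which the real networks are not.

```python
class CarrierStore:
    """Hypothetical in-memory stand-in for cloud storage that holds only carrier images."""

    def __init__(self):
        self._blobs = {}

    def upload(self, name, carrier):
        self._blobs[name] = carrier

    def download(self, name):
        return self._blobs[name]


def archive(store, index, secret_key, carrier_name, secret, hide):
    """Hide the secret, upload the carrier, and record where it went."""
    store.upload(carrier_name, hide(secret))
    index[secret_key] = carrier_name
    # The caller would delete its local copy of the original image at this point.


def retrieve(store, index, secret_key, reveal):
    """Look up the carrier name in the dictionary, download it, and reveal the secret."""
    return reveal(store.download(index[secret_key]))


# Placeholder hide/reveal pair (byte reversal), NOT the neural networks from the talk.
def hide(data):
    return data[::-1]


def reveal(data):
    return data[::-1]


store, index = CarrierStore(), {}
archive(store, index, "key-of-scan", "carrier-001", b"scan", hide)
print(retrieve(store, index, "key-of-scan", reveal))  # b'scan'
```

Note that only the dictionary (`index`) links a secret to its carrier, which is why losing or leaking it is the weak point of this approach.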

Huge Flaw in Naïve Approach
• If we want other people to access the images, we need to transfer the dictionary.
• What happens if that dictionary gets stolen during the process?
• Possible ‘solution’? What about NLP + NN?

GAN + Natural Language Processing
• AttnGAN: generates images from a given sentence.
• A GAN can also generate numerical values, which could possibly act as a key.

Given a Sentence, Generate a Key
• Black – GAN that generates a unique key for a given sentence describing the image
• Red – Revealing Network
• The Carrier Image’s name would be the key.

Challenges / Comparison
• 1. We have to guarantee that the GAN does not generate the same key for two similar images. (Non-collision)
• 2. Access to the Carrier Image database can be authorized using the traditional method.
• 3. Rather than transferring the Dictionary Mapping, transfer the pretrained GAN and Reveal Network.
• ** If a man in the middle gets hold of the GAN/Reveal Network, how can we guarantee that they won’t be able to extract useful information?
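The non-collision requirement in point 1 could at least be checked empirically on a set of generated keys. The sketch below assumes, hypothetically, that each key is a short vector of floats produced by the GAN; it treats two keys as colliding when they are equal after rounding. The function name and the sample values are illustrative only.

```python
from collections import defaultdict


def find_collisions(keys, decimals=3):
    """Group image names whose keys are identical after rounding to `decimals` places.

    keys: dict mapping an image name to a sequence of floats (the generated key).
    Returns a list of name groups; an empty list means no collisions were found.
    """
    buckets = defaultdict(list)
    for name, vector in keys.items():
        rounded = tuple(round(value, decimals) for value in vector)
        buckets[rounded].append(name)
    return [sorted(names) for names in buckets.values() if len(names) > 1]


# Made-up keys for two similar images and one dissimilar image.
generated = {
    "cat_1": [0.1234, 0.9],
    "cat_2": [0.1231, 0.9],
    "dog": [0.5, 0.2],
}
print(find_collisions(generated, decimals=3))  # [['cat_1', 'cat_2']]
print(find_collisions(generated, decimals=4))  # []
```

An empty result only shows that no collision occurred in the sample tested; it is not a guarantee. Rounding into buckets can also miss near-identical keys that fall on opposite sides of a rounding boundary.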

Possible Challenge? Homomorphic Encryption
• 1. Hide Medical Image under Natural Images
• 2. Perform Segmentation
• 3. Convert Back
• 4. Lattice cryptography?

Reference
• Baluja, S. (2017). Hiding Images in Plain Sight: Deep Steganography. In Advances in Neural Information Processing Systems (pp. 2066-2076).
• [NIPS 2017/Google] - Hiding Images in Plain Sight: Deep Steganography with Interactive Code […. (2018). Towards Data Science. Retrieved 29 April 2018, from https://towardsdatascience.com/nips-2017-google-hiding-images-in-plain-sight-deep-steganography-with-interactive-code-e5efecae11ed
• Hui, L. (2018). A BENCHMARK FOR SEMANTIC IMAGE SEGMENTATION. Ntu.edu.sg. Retrieved 29 April 2018, from http://www.ntu.edu.sg/home/asjfcai/Benchmark_Website/benchmark_index.html
• Encrypting Different Medical Images using Deep Neural Network with Interactive Code. (2018). Towards Data Science. Retrieved 29 April 2018, from https://towardsdatascience.com/encrypting-different-medical-images-using-deep-neural-network-with-interactive-code-b47656dcd1e
• Goetz, M., Heim, E., Maerz, K., Norajitra, T., Hafezi, M., & Fard, N. et al. (2016). A learning-based, fully automatic liver tumor segmentation pipeline based on sparsely annotated training data. Medical Imaging 2016: Image Processing. doi:10.1117/12.2217655
• Santos, L. (2018). Image Segmentation · Artificial Inteligence. Leonardoaraujosantos.gitbooks.io. Retrieved 29 April 2018, from https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/image_segmentation.html
• Lattice-based cryptography - IBM Research - US. (2018). Research.ibm.com. Retrieved 29 April 2018, from https://www.research.ibm.com/5-in-5/lattice-cryptography/
• Outperforming Tensorflow’s Default Auto Differentiation Optimizers, with Interactive Code [Manual…. (2018). Towards Data Science. Retrieved 29 April 2018, from https://towardsdatascience.com/outperforming-tensorflows-default-auto-differentiation-optimizers-with-interactive-code-manual-e587a82d340e
• (2018). Esprockets.com. Retrieved 29 April 2018, from http://www.esprockets.com/papers/nips2017.pdf
• Near Field Communication (NFC) Technology, Vulnerabilities and Principal Attack Schema. (2013). InfoSec Resources. Retrieved 30 April 2018, from http://resources.infosecinstitute.com/near-field-communication-nfc-technology-vulnerabilities-and-principal-attack-schema/#gref
• Xu, T., Zhang, P., Huang, Q., Zhang, H., Gan, Z., Huang, X., & He, X. (2017). AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks. arXiv preprint arXiv:1711.10485.
