Secure Data Outsourcing

Outline
- Motivation
- Background
- Research issues
- Summary

Motivation
- Cost of maintaining/mining large data: 4-5 times the cost of data acquisition
- DBAs are paid well
- More and more data service providers
- Low cost: cloud computing
Original Xi distribution is known
Transformed Xi’ distribution
1. randomly change the original data
2. the attacker cannot effectively recover the original data
3. the desired properties are preserved
k-dimensional numeric data, n records,
represented as a k x n matrix, x: a record
(1) Extend x to k+2 dimensions
- A is a (k+2) x (k+2) invertible real-valued matrix with at least two non-zero values in each row; the last column of A has all non-zero values
- A is shared by all records
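The perturbation step above can be sketched as follows. The slide only says that x is extended to k+2 dimensions; the specific extension used here (append a constant 1 and a fresh random positive value) is an assumption, as are the function names and the noise range. The sketch checks the two structural conditions stated for A but does not verify invertibility.

```python
import random

def satisfies_slide_conditions(A):
    """Check the two structural conditions stated for A (invertibility is NOT checked)."""
    rows_ok = all(sum(1 for v in row if v != 0) >= 2 for row in A)
    last_col_ok = all(row[-1] != 0 for row in A)
    return rows_ok and last_col_ok

def extend(x, noise):
    """Assumed extension to k+2 dimensions: append a constant 1 and a random value."""
    return list(x) + [1.0, noise]

def perturb(A, x, rng):
    """Return y = A * extend(x, v), drawing a fresh random v for every record."""
    v = rng.uniform(0.1, 1.0)  # hypothetical noise range, strictly positive
    z = extend(x, v)
    return [sum(a * b for a, b in zip(row, z)) for row in A]

rng = random.Random(0)
k = 2
# A is shared by all records; random non-zero entries are used for illustration only.
A = [[rng.uniform(1.0, 2.0) for _ in range(k + 2)] for _ in range(k + 2)]

y1 = perturb(A, [3.0, 5.0], rng)
y2 = perturb(A, [3.0, 5.0], rng)  # same record, different noise
```

Because the last column of A is all non-zero, the random component reaches every output dimension, so perturbing the same record twice gives two different vectors.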
hyperplane-based query
range query in R^{k+2}
half space: w^T x <= a
The intersection of convex sets is also convex.
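To make the half-space view concrete, the sketch below writes an axis-aligned range query as the intersection of 2k half spaces of the form w^T x <= a and tests points against it. It works in the original k-dimensional space for readability; the helper names are illustrative, not from the slides.

```python
def in_half_space(w, a, x):
    """True if x satisfies w^T x <= a."""
    return sum(wi * xi for wi, xi in zip(w, x)) <= a

def range_to_half_spaces(lows, highs):
    """Express low_i <= x_i <= high_i (for all i) as 2k half spaces."""
    k = len(lows)
    spaces = []
    for i in range(k):
        upper = [0.0] * k
        upper[i] = 1.0
        spaces.append((upper, highs[i]))   # x_i <= high_i
        lower = [0.0] * k
        lower[i] = -1.0
        spaces.append((lower, -lows[i]))   # -x_i <= -low_i
    return spaces

def in_range(spaces, x):
    """A point is in the range iff it lies in every half space (their intersection)."""
    return all(in_half_space(w, a, x) for w, a in spaces)

spaces = range_to_half_spaces([0, 10], [5, 20])
inside = in_range(spaces, [3, 15])
outside = in_range(spaces, [6, 15])
```

Each half space is convex, so the query region, being their intersection, is convex as well.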
Problems: (1) A^{-1} can be probed
(2) [...] If a is known, the whole dimension i is breached.
(1) A^{-1} cannot be derived from
(2) (Xi - a) X_{k+2} <= 0 contains the random component X_{k+2}, which protects the condition (Xi - a) <= 0
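A small numeric check of why multiplying by the random component does not change the query answer. It assumes the lost comparison operator is <= and that the random component (called v here) is strictly positive; both are assumptions, and the value ranges are hypothetical.

```python
import random

def protected_condition(x_i, a, v):
    """The condition with the random component mixed in: (x_i - a) * v <= 0."""
    return (x_i - a) * v <= 0

rng = random.Random(1)
a = 10.0
checks = []
for _ in range(1000):
    x_i = rng.uniform(0.0, 20.0)
    v = rng.uniform(0.5, 2.0)  # assumed strictly positive random component
    # For v > 0 the product has the same sign as (x_i - a).
    checks.append(protected_condition(x_i, a, v) == (x_i <= a))

all_agree = all(checks)
```

Under these assumptions the protected condition selects exactly the same records as the plain condition, while the value being compared is scaled by a different random factor for every record.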
Filter out the junk records
Querying this bounding
A multidimensional tree index is built on the encrypted data (in the
transformed space) at the server.
The client calculates the large bounding box;
The server uses the index to find the results.
Filter the initial results with the conditions y^T Θi y <= 0 for i = 1…2k
Note: the two-stage strategy works if the output of stage 1 is significantly smaller than the original database and fits in memory.
Otherwise, use a linear scan with stage 2 filtering.
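The two-stage strategy and its fallback can be sketched as below. This is a simplified illustration on plain 2-D points: a linear pass stands in for the server-side tree index, and `exact_filter` stands in for the stage 2 conditions. The function names, the `memory_limit` parameter, and the sample data are hypothetical.

```python
def in_box(point, lows, highs):
    """True if the point lies inside the axis-aligned bounding box."""
    return all(lo <= p <= hi for p, lo, hi in zip(point, lows, highs))

def two_stage_query(records, box, exact_filter, memory_limit):
    lows, highs = box
    # Stage 1: bounding-box lookup (a real server would use its index here).
    candidates = [r for r in records if in_box(r, lows, highs)]
    if len(candidates) > memory_limit:
        # Fallback: stage 1 output is too large, so scan everything with the stage 2 filter.
        return [r for r in records if exact_filter(r)]
    # Stage 2: keep only the candidates that satisfy the exact conditions.
    return [r for r in candidates if exact_filter(r)]

records = [(1, 1), (2, 3), (5, 5), (4, 4), (0, 2), (9, 9), (3, 3)]
box = ((1, 1), (5, 5))  # bounding box of the exact query region
exact = lambda r: r[0] >= 1 and r[1] >= 1 and r[0] + r[1] <= 6

result = two_stage_query(records, box, exact, memory_limit=10)
fallback = two_stage_query(records, box, exact, memory_limit=2)
```

Both paths return the same answer; the difference is only in how many records reach the exact filter.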
H(H(x1)+H(x2)), where + is string concatenation
Can be stored with a tree-like structure: index, XML
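The hash-of-hashes pattern above is how a Merkle hash tree is built. The sketch below computes the root over a list of items, using SHA-256 as H and concatenating hex digest strings; the choice of hash function and the handling of an odd node (promoted unchanged) are assumptions, not taken from the slides.

```python
import hashlib

def H(s: str) -> str:
    """Hash a string and return the hex digest (SHA-256 chosen for illustration)."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

def merkle_root(items):
    """Combine leaf hashes pairwise, level by level, until one root hash remains."""
    if not items:
        raise ValueError("need at least one item")
    level = [H(x) for x in items]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(H(level[i] + level[i + 1]))  # + is string concatenation
            else:
                nxt.append(level[i])  # odd node promoted unchanged (one possible convention)
        level = nxt
    return level[0]

root = merkle_root(["x1", "x2", "x3", "x4"])
tampered = merkle_root(["x1", "x2", "x3", "x4-modified"])
```

Changing any single item changes the root, which is what lets a client holding only the root verify data returned by the server.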
LUB(q) = 4
GLB(q) = 11
whether <Li, Fk(Li)> == Ci ⊕ W
It reveals nothing if Ci is not the ciphertext for W.
Li is random for each Wi, so the server cannot learn anything from Li.
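A minimal sketch of the matching check, assuming the ciphertext has the form Ci = Wi XOR <Li, Fk(Li)> and that the server is given W and k for the search. HMAC-SHA256 stands in for the keyed function Fk, and each word is hashed to a fixed 32-byte block, so this sketch supports matching but not decryption. All names and sizes are illustrative assumptions.

```python
import hashlib
import hmac
import os

HALF = 16  # bytes for Li and for Fk(Li); assumed sizes

def F(k: bytes, data: bytes) -> bytes:
    """Keyed function Fk, here HMAC-SHA256 truncated to HALF bytes."""
    return hmac.new(k, data, hashlib.sha256).digest()[:HALF]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode_word(word: str) -> bytes:
    """Map a word to a fixed-length 32-byte block (simplification)."""
    return hashlib.sha256(word.encode("utf-8")).digest()

def encrypt_word(k: bytes, word: str) -> bytes:
    L = os.urandom(HALF)                      # fresh random Li for every word
    return xor(encode_word(word), L + F(k, L))

def server_match(k: bytes, c: bytes, word: str) -> bool:
    """Check whether c XOR W has the form <L, Fk(L)>."""
    t = xor(c, encode_word(word))
    L, tag = t[:HALF], t[HALF:]
    return hmac.compare_digest(F(k, L), tag)

k = os.urandom(32)
c = encrypt_word(k, "cloud")
hit = server_match(k, c, "cloud")
miss = server_match(k, c, "outsourcing")
```

When the word does not match, the XOR yields bytes that fail the Fk check except with negligible probability, so the server learns only that this position is not a hit.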