Authentic Publication: The TRUTHSAYER Project

Chip Martel

Premkumar Devanbu

Michael Gertz

April Kwong

Glen Nuckolls

Stuart Stubblebine

Department of Computer Science,

University of California, Davis

http://truthsayer.cs.ucdavis.edu

Databases Play a Vital Role
  • Commerce: credit card data, find goods
  • Financial: Investment sites
  • Health: treatments, doctors/credentials, drugs
  • Many more
Goals
  • Correct and complete answers (with assurance)
  • Efficient Protocols
Example Queries
  • Is Credit card number 5543… Valid?
  • List all Hong Kong to San Francisco flights.
  • Find Digital cameras with 3-5 Mega-pixels, and cost < $200
  • List all bars within one mile of HKU
What is a Correct Answer?
  • We assume a trusted Data Owner with the official copy of the Database: Defines the “correct answer”
  • Problems with a single Data Owner:

1) May not want/be able to answer queries

2) Hard to keep online DB secure

3) Scalability

Solution: Third-Party Servers
  • Third party sites (Publishers) get information from the Data Owner and answer queries
  • Example: Travel sites (Expedia, Travelocity, Orbitz) answer using government airline Data (FAA)
Server Replication

[Figure: the FAA's data is replicated to Travelocity, Expedia, and Orbitz; the user asks, "Can I trust this server?"]

Trust Issues
  • Sites have left out cheaper flights from non-preferred airlines (deliberate)
  • Sites may be corrupted: outside hacker or insider
  • Errors
Authentic Publication: The TRUTHSAYER Project
  • Initially: for relational databases (DBSec 2000, Journal of Computer Security)
  • General model for a variety of data (Algorithmica, 2004)

[Figure: the Owner sends Data + Digest of Data to the Publisher; the Publisher receives a Query and returns Answer + Verification Object.]

Talk Outline
  • Introduction
  • Background: Merkle Trees
  • Range Queries (Multi-attribute Queries)
  • A General Model for Authenticated Data Structures
  • Conclusion
Authentic Publication
  • A trusted Owner digests the Data Set, and signs it.
  • Untrusted Publishers receive the data & signature.
  • Clients submit queries to untrusted Publishers.
  • Publishers return Answers (A) together with Verification Objects (VO), i.e. A + VO
  • Clients use A + VO to Prove the answer is correct/complete.

Protocol is correct, and secure.

Verifying answers

Protocol provides:

  • Correctness: Returns exact elements matching the query.
  • Completeness: Returns all elements matching the query.
  • Security: Cheating is infeasible.
  • Efficiency: Overhead is low.

Recall: no signatures!

Merkle hashing a data set.
  • Leaves: data in some lexical order.
  • One-way hash function h; h1 = h(d1)
  • Bottom-up hashing, starting with data
  • Root hash value = the digest of the data set.
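The bottom-up hashing above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for the one-way hash h, each input is length-prefixed before hashing, and an odd node at any level is promoted unchanged; none of these choices comes from the slides, and the names `h` and `merkle_root` are hypothetical.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """One-way hash h (assumption: SHA-256 over length-prefixed parts)."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

def merkle_root(items):
    """Digest a data set: sort it, hash the leaves, then hash pairs bottom-up."""
    level = [h(str(d).encode()) for d in sorted(items)]
    if not level:
        raise ValueError("empty data set")
    while len(level) > 1:
        nxt = [h(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd node is promoted unchanged (assumption)
        level = nxt
    return level[0]

data = [1, 3, 5, 6, 8, 10, 11, 15]
digest = merkle_root(data)
print(digest.hex())
```

Because the leaves are sorted first, the digest depends only on the contents of the data set, not on the order in which the Owner lists them.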
Merkle Trees
  • Classic use: prove that data value d is in the data set
  • Solves: Is Credit card number 5543… Valid?
  • But also can verify all items in a range: e.g. camcorders from $400 to $900
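A minimal sketch of the classic membership proof follows, assuming SHA-256 as h and a tree in which an odd node is promoted unchanged. The helper names (`build_levels`, `prove`, `verify`) and the four-digit identifiers are hypothetical placeholders, not part of the original talk.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """One-way hash h (assumption: SHA-256 over length-prefixed parts)."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

def leaf(d) -> bytes:
    return h(str(d).encode())

def build_levels(items):
    """All tree levels, leaves first; an odd node is promoted unchanged."""
    levels = [[leaf(d) for d in sorted(items)]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = [h(cur[i], cur[i + 1]) for i in range(0, len(cur) - 1, 2)]
        if len(cur) % 2:
            nxt.append(cur[-1])
        levels.append(nxt)
    return levels

def prove(items, d):
    """Publisher side: sibling hashes on the path from leaf d to the root."""
    levels = build_levels(items)
    idx = sorted(items).index(d)
    path = []
    for cur in levels[:-1]:
        sib = idx ^ 1
        if sib < len(cur):  # a promoted node has no sibling at this level
            path.append((cur[sib], sib < idx))  # (hash, sibling_is_left)
        idx //= 2
    return path

def verify(root, d, path):
    """Client side: recompute the root from d and the sibling hashes."""
    acc = leaf(d)
    for sib, sib_is_left in path:
        acc = h(sib, acc) if sib_is_left else h(acc, sib)
    return acc == root

valid_ids = ["1111", "2222", "3333", "4444", "5555"]  # made-up identifiers
root = build_levels(valid_ids)[-1][0]                  # the Owner's digest
proof = prove(valid_ids, "3333")
print(verify(root, "3333", proof))
```

The proof holds one hash per tree level, so its size grows with the tree height rather than with the size of the data set.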
Verifying a Range

[Figure: Merkle tree over the leaf values 1, 3, 5, 6, 8, 10, 11, 15, with the answer q marked.]

To show that q = (5, 6, 8) is the answer to 4 < d < 10:

Use the lower bound 3, the upper bound 10, and the starred hash values to compute and verify the root hash.


Query: 4 < d < 10

Answer: 5,6,8 (in practice, key + data)

Verification Object: [ ((h(1), 3), (5, 6)), ((8, 10), *) ]
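The range check can be sketched as follows. The sketch assumes a complete binary tree over a power-of-two number of integer keys, SHA-256 as h, and a VO encoded as nested Python tuples in which `bytes` entries are opaque subtree hashes (the starred values) and `int` entries are revealed leaves. The helper names and this encoding are assumptions made for illustration only.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """One-way hash h (assumption: SHA-256 over length-prefixed parts)."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

def build(values):
    """Sorted values as a tree of nested pairs (power-of-two size assumed)."""
    nodes = sorted(values)
    if not nodes or len(nodes) & (len(nodes) - 1):
        raise ValueError("this sketch needs a power-of-two number of values")
    while len(nodes) > 1:
        nodes = [(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def digest(t) -> bytes:
    if isinstance(t, bytes):
        return t                                 # opaque subtree hash from the VO
    if isinstance(t, tuple):
        return h(digest(t[0]), digest(t[1]))     # internal node
    return h(str(t).encode())                    # revealed leaf value

def frontier(t):
    """Left-to-right list of leaves (ints) and opaque hashes (bytes)."""
    return frontier(t[0]) + frontier(t[1]) if isinstance(t, tuple) else [t]

def make_vo(t, keep_lo, keep_hi):
    """Publisher: replace subtrees entirely outside [keep_lo, keep_hi] by hashes."""
    vals = frontier(t)
    if vals[-1] < keep_lo or vals[0] > keep_hi:
        return digest(t)
    if isinstance(t, tuple):
        return (make_vo(t[0], keep_lo, keep_hi), make_vo(t[1], keep_lo, keep_hi))
    return t

def verify_range(vo, root, lo, hi):
    """Client: return the values with lo < d < hi, or raise if the VO is bad."""
    if digest(vo) != root:
        raise ValueError("VO does not hash to the Owner's digest")
    seq = frontier(vo)
    idx = [i for i, x in enumerate(seq) if isinstance(x, int)]
    if not idx or idx != list(range(idx[0], idx[-1] + 1)):
        raise ValueError("revealed values must form one contiguous run")
    if seq[idx[0]] > lo and idx[0] != 0:
        raise ValueError("missing lower bound")
    if seq[idx[-1]] < hi and idx[-1] != len(seq) - 1:
        raise ValueError("missing upper bound")
    return [v for v in seq[idx[0]:idx[-1] + 1] if lo < v < hi]

tree = build([1, 3, 5, 6, 8, 10, 11, 15])
root = digest(tree)
vo = make_vo(tree, 3, 10)        # keep the bounds 3 and 10 plus everything between
answer = verify_range(vo, root, 4, 10)
print(answer)
```

Completeness comes from two checks: the revealed leaves form one unbroken run, and the run extends to a value at or beyond each end of the query range (or to the edge of the tree). A Publisher that hides a matching value behind a hash breaks one of these checks.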

Authentic Publication

[Figure: a Merkle tree over the data, with its root labeled as the hash digest.]

Security Property
  • If the Answer and VO are correct, user accepts
  • User accepts an invalid answer only if a specific collision in h is found (provable):

A correct VO has h(x, y) = z (x, y, z are the hash values of tree nodes), while the invalid VO uses different x', y' with h(x', y') = z.

Good Features
  • Proofs are short (size proportional to tree height and answer size).
  • Use hashes, a fast cryptographic operation
  • Proofs as easy to compute as finding the answer
  • No secret keys: hash function and digests all are public (no insider attack once data set is digested).
Extensions
  • Want to handle more complex queries
  • Find Digital cameras with 3-5 Mega pixels, and cost < $200
  • List all bars within one mile of HKU
Multi-Attribute Queries
  • Model as a 2-D Range query
  • Find points (x,y) with a < x < b and c < y < d

[Figure: query rectangle with corners (a,c), (b,c), (a,d), (b,d); axes labeled Cost and Pixels.]

Searching a 2D-range Tree
  • Find (x,y) with 4 < x < 50 AND 4 < y < 10
  • All points in the associated Y-trees match the x-range
  • In X-tree: subtrees rooted at 5 and 13
  • Search in Associated Y-trees
  • Answer: (12,5) and (23,8) AND values in 5’s Y-tree
Digesting a 2D-range Tree
  • Digest each Y-tree as Merkle tree
  • Each internal node in the X-tree gets the hash of three values: two children and associated Y-tree value
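The three-value hash can be illustrated with a short sketch. It assumes SHA-256 as h, an X-tree built by splitting the x-sorted points in half, and a Y-tree at each internal node that is a Merkle tree over that node's points sorted by y. The function names and most of the sample points are hypothetical; only (12,5) and (23,8) appear in the slides.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """One-way hash h (assumption: SHA-256 over length-prefixed parts)."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

def point_hash(p) -> bytes:
    return h(f"{p[0]},{p[1]}".encode())

def merkle_root(hashes):
    """Plain Merkle root; an odd node is promoted unchanged (assumption)."""
    level = list(hashes)
    while len(level) > 1:
        nxt = [h(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]

def y_tree_digest(points) -> bytes:
    """Digest of the associated Y-tree: the same points, ordered by y."""
    by_y = sorted(points, key=lambda p: (p[1], p[0]))
    return merkle_root([point_hash(p) for p in by_y])

def x_tree_digest(points) -> bytes:
    """Digest of the X-tree node covering `points` (must be sorted by x)."""
    if len(points) == 1:
        return point_hash(points[0])
    mid = len(points) // 2
    # Internal node: hash of two children plus the associated Y-tree digest.
    return h(x_tree_digest(points[:mid]),
             x_tree_digest(points[mid:]),
             y_tree_digest(points))

pts = sorted([(5, 20), (12, 5), (23, 8), (40, 2)])
root = x_tree_digest(pts)
print(root.hex())
```

Binding each Y-tree digest into its X-tree node means that a single root hash still covers both the x-ordering and every per-node y-ordering.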
Range Trees
  • Let k be the number of answers (out of n)
  • Search: O(k + log^2 n) time, O(n log n) space
  • Improve to O(k + log n) time with extra pointers (can still get a hash digest)
  • VO (proof) size also O(k + log n)
  • Extend to d dimensions (d-attribute query). Search time: O(k + log^(d-1) n), VO size: same.
Authenticated Data Structures
  • Problem: May want to use a variety of efficient data-structures:
    • B-trees (reduce disk access)
    • Suffix arrays (string queries)
    • Geometric data structures (items within one mile)
    • Many more
  • Solution: General method to digest a data structure (produce a single summary hash value).
  • Efficient: Proof size and construction time = search time.
  • Secure: Similar security property: break only with a specific collision in h
Search DAGs
  • Our general setting is any data structure modeled by:
    • A labeled Directed Acyclic Graph (DAG)
    • A search process that visits DAG nodes and determines which neighboring nodes to visit next (based on labels of visited nodes)

This models a wide range of structures.

A Search DAG
  • Search starts at the unique source node s of in-degree zero
  • Digesting starts from the sinks (here u, v ): hash the associated values

[Figure: search DAG with source s, nodes a, b, c, and sinks u, v.]

  • D(u): Digest of u
  • Node u data : du
  • D(u)= h(du)
  • D(v)= h(dv)

  • Other Digests use data and successors
  • D(c) = h(dc, D(v) )
  • D(b)=h(db,D(v),D(c))
  • D(s) is DAG Digest
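The digest rules above translate directly into a memoized recursion. In this sketch SHA-256 stands in for h. The edges s→a, s→b, s→c, b→v, b→c, and c→v follow the formulas on these slides; the edge a→u and the node data are assumptions, since the figure itself did not survive.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """One-way hash h (assumption: SHA-256 over length-prefixed parts)."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

# Successor lists, in a fixed order that Owner, Publisher, and User all share.
SUCC = {
    "s": ["a", "b", "c"],
    "a": ["u"],          # assumption: the slides do not state a's successors
    "b": ["v", "c"],
    "c": ["v"],
    "u": [],
    "v": [],
}
DATA = {n: f"data-{n}".encode() for n in SUCC}  # placeholder node data

def digests(succ, data):
    """D(n) = h(d_n, D(successor_1), D(successor_2), ...), computed once per node."""
    memo = {}

    def D(n):
        if n not in memo:
            memo[n] = h(data[n], *(D(m) for m in succ[n]))
        return memo[n]

    for n in succ:
        D(n)
    return memo

dig = digests(SUCC, DATA)
print(dig["s"].hex())  # the DAG digest D(s)
```

Memoization matters here because v is reachable through both b and c: each node is digested once, so the work is linear in the size of the DAG.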


Verification for Search DAG
  • Traditional Merkle Tree verification is Bottom up (hash path values to root)
  • We use top down verification to simulate a correct search
  • Owner provides search procedure P and root digest D(s)
Verification Object for DAG
  • VO: information so User can reproduce the search (and thus verify answers)
  • “Lines” of VO match steps of P:
  • Data of a node and successor hashes
    • ds, D(v1), D(v2) … (successors of s)
    • dv1 , D(u1), D(u2), … (successors of v1)
An Example Search
  • Starts at s, then visits b then v
  • VO:
    • ds, D(a), D(b), D(c) (line 1)

Check: D(s) = h(ds, D(a), D(b), D(c)), so we know the data ds is correct.

  • P processes ds and decides b is next
  • VO:
    • db, D(v), D(c) (line 2)

If D(b) = h(db, D(v), D(c)) (using D(b) from line 1), then the data db is correct.


Verified Search
  • The verified computation proceeds until all nodes in the actual search are visited (the VO has one line for each node visited).
  • The correct answer is now returned by search procedure P.
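A sketch of the top-down replay follows. SHA-256 stands in for h, the edge a→u is assumed, and the search procedure `P` is a toy invented for this example: node data of the form `go=<i>` means "follow successor i", and any other data ends the search and is returned as the answer. The data is chosen so that the search visits s, then b, then v, as in the example search.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """One-way hash h (assumption: SHA-256 over length-prefixed parts)."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

SUCC = {"s": ["a", "b", "c"], "a": ["u"], "b": ["v", "c"],
        "c": ["v"], "u": [], "v": []}
DATA = {"s": b"go=1", "a": b"go=0", "b": b"go=0", "c": b"go=0",
        "u": b"answer-u", "v": b"answer-v"}  # toy data: s -> b -> v

def digests(succ, data):
    memo = {}
    def D(n):
        if n not in memo:
            memo[n] = h(data[n], *(D(m) for m in succ[n]))
        return memo[n]
    for n in succ:
        D(n)
    return memo

def P(d: bytes):
    """Toy search procedure: index of the next successor, or None to stop."""
    text = d.decode()
    return int(text[3:]) if text.startswith("go=") else None

def make_vo(succ, data, dig, source="s"):
    """Publisher: run P on the real DAG, emitting one VO line per visited node."""
    vo, node = [], source
    while node is not None:
        vo.append((data[node], [dig[m] for m in succ[node]]))
        step = P(data[node])
        node = None if step is None else succ[node][step]
    return vo

def verified_search(vo, root_digest):
    """User: replay P top-down, checking each line against the expected digest."""
    expected = root_digest
    for i, (d, succ_digests) in enumerate(vo):
        if h(d, *succ_digests) != expected:
            raise ValueError(f"line {i + 1} does not match its digest")
        step = P(d)
        if step is None:
            if i != len(vo) - 1:
                raise ValueError("VO has lines beyond the end of the search")
            return d
        if step >= len(succ_digests):
            raise ValueError("search procedure chose a missing successor")
        expected = succ_digests[step]  # digest the next line must hash to
    raise ValueError("VO ended before the search finished")

dig = digests(SUCC, DATA)
vo = make_vo(SUCC, DATA, dig)
print(verified_search(vo, dig["s"]))
```

The User never sees the DAG itself: each verified line supplies the digest that the next line must match, so the trusted root digest D(s) is extended one step at a time along the search path.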
  • The verified computation takes time proportional to the original search (visits the same nodes).
  • Security proof: shows that a User accepts a wrong answer only if a specific collision in the hash function h is used (e.g. D(b) = h(d'b, D'(v), D'(c))).
Updates
  • Typically Digests are updated with work similar to the data structure’s update time (e.g. length of the search paths to updated items)
  • If updates are frequent, overall scheme doesn’t work well (can use time-stamped digests)
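For a plain Merkle tree, the first point can be made concrete: changing one leaf only requires rehashing the path from that leaf to the root. The sketch assumes SHA-256 as h, a power-of-two number of leaves, and an update that keeps the sorted order intact; `build_levels` and `update` are hypothetical helper names.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """One-way hash h (assumption: SHA-256 over length-prefixed parts)."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

def leaf(d) -> bytes:
    return h(str(d).encode())

def build_levels(sorted_values):
    """All levels of the tree, leaves first (power-of-two size assumed)."""
    levels = [[leaf(d) for d in sorted_values]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def update(levels, i, new_value):
    """Replace leaf i and rehash only its ancestors; returns the new digest."""
    levels[0][i] = leaf(new_value)
    for depth in range(1, len(levels)):
        i //= 2
        below = levels[depth - 1]
        levels[depth][i] = h(below[2 * i], below[2 * i + 1])
    return levels[-1][0]

levels = build_levels([1, 3, 5, 6, 8, 10, 11, 15])
old_root = levels[-1][0]
new_root = update(levels, 3, 7)   # 6 becomes 7; the order 5 < 7 < 8 is preserved
print(new_root.hex())
```

With 8 leaves the update recomputes one leaf hash and three internal hashes instead of all fifteen. The Owner must still sign and distribute the new digest, which is one reason frequent updates are costly.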
Generalizations
  • Allowing multiple Owners: often want to query data collected from several owners. Can be done, but now need to trust owners and data collector.
  • Privacy: VOs may reveal information about the data set. There are methods to conceal the extra data.
  • I/O efficient digests/VO’s: can use a multi-way tree to store multiple values in one disk block (still logically a binary tree for VO purposes, but stored more efficiently).
  • Top-down search DAG approach may be improved for specific data-structures (e.g. 2D range trees)
  • Collections of structured data: XML documents (can answer path queries)
  • Relational operations (Joins, Selection, Projection)
  • Fancier Crypto operations (to reduce VO size)
References

P. Devanbu, M. Gertz, C. Martel, and S. G. Stubblebine. Authentic Third Party Data Publication, 14th IFIP 11.3 Working Conf. in DB Security (DBSec 2000). (Original Authentic Publication paper.)

A General Model for Authenticated Data Structures, Algorithmica, 2004. (Many data structures and Search DAGs; above group and G. Nuckolls.)


Certifying Data from Multiple Sources, Proceedings of the 17th Database Security Conference, 2003

Shows how to use multiple Owners

Flexible authentication of XML documents, Journal of Computer Security, 2004

Survey Chapters

Li, Hadjieleftheriou, Kollios, Reyzin: Authenticated Index Structures for Outsourced Databases (overview of area and efficiency issues)

R. Sion: Towards Secure Data Outsourcing

Both in: Michael Gertz and Sushil Jajodia (eds.): "Handbook of Database Security: Applications and Trends", Springer, 2007, to appear.


Anagnostopoulos, M. Goodrich, R. Tamassia,

Persistent Authenticated Dictionaries and Their Applications (allows queries of prior DB versions)

Authenticated Data Structures for Graph and Geometric Searching (fancy geometric data structures)

Pointer for more information

http://truthsayer.cs.ucdavis.edu

Conclusion
  • A single signed Digest can authenticate answers to many queries
  • Secure against hackers and insiders
  • Can handle a wide range of data structures
  • Efficient protocols: fast query processing and small VO’s
Future Work
  • Better Update Mechanisms
  • Integration of Database optimization methods
  • Actual implementation (partly done by others), and evaluation