Secure XML Querying with Security Views

Secure XML Querying with Security Views. Wenfei Fan University of Edinburgh & Bell Laboratories Chee-Yong Chan National University of Singapore Minos Garofalakis Bell Laboratories. The need for XML security. Data in XML format: Business information: confidential

Presentation Transcript


  1. Secure XML Querying with Security Views
     Wenfei Fan (University of Edinburgh & Bell Laboratories)
     Chee-Yong Chan (National University of Singapore)
     Minos Garofalakis (Bell Laboratories)

  2. The need for XML security
     Data in XML format:
     • Business information: confidential
     • Health-care data: Patient Privacy Act, …
     Access control:
     • multiple groups simultaneously query the same XML document
     • each user group has a different access-control policy
     Enforcement of access-control policies:
     [Figure: user groups 1 … n query one XML document through an XML query engine; each group sees an accessible part and an inaccessible part.]

  3. Secure XML querying
     [Figure: a user group sends a query Q to the XML query engine over document T and receives Q(T), drawn from the accessible part only.]
     For each user group of an XML document T,
     • specify an access-control policy S,
     • enforce S: for any query Q posed by the group over the document T, Q(T) consists only of data accessible with respect to S
     Access control for XML:
     • How to specify access policies at various levels of granularity?
     • How to efficiently enforce those access policies?

  4. Example: an XML document of patients
     Document DTD D:
       hospital → patient*
       patient → SSN, name, record*
       record → date, diagnosis, treatment
       treatment → (trial + regular)
       trial → trName, treatment*
       regular → tname, bill
     [Figure: DTD graph of D]
     Access-control policies over documents of D:
     • Doctors in the hospital are granted access to all the data in the documents
     • The insurance company is allowed to access billing information only

  5. Access-control policy for syndrome surveillance
     [Figure: DTD graph of D with trial, regular, trName and bill crossed out]
     • patients: only those diagnosed with a certain disease "DIS" (a constant) are accessible
     • records:
       • only those with diagnosis = "DIS"
       • only part of each "DIS" record: date, diagnosis, treatment, tname
     • the group must not learn whether a patient is in a clinical trial or not (trial, regular, trName)
     • the group is denied access to billing information

  6. Challenge: access-control specification
     [Figure: DTD graph of D; patient and record are marked conditionally accessible]
     • various levels of granularity: restricting access to entire subtrees or specific elements
     • conditional access: e.g., a patient is accessible if and only if it has a descendant diagnosis = "DIS"
     • overriding: e.g., tname overrides the accessibility of its parent regular
     • inheritance: e.g., SSN and name inherit the accessibility of patient

  7. Challenge: access-control enforcement
     Enforcement should not cause any drastic degradation in performance.
     [Figure: DTD graph of D; patient and record are marked conditionally accessible]
     Example: an XPath query Q posed by a syndrome surveillance group over a document T:
       //patient[name="Joe"]//tname
     • access-control requirement: Q(T) ⊆ {accessible tname}
     • enforcement: ensure that the query reaches
       • all and only those patients named Joe that have a descendant diagnosis = "DIS",
       • all and only those records with diagnosis = "DIS"

  8. Challenge: schema availability
     [Figure: DTD graph of D; patient and record are marked conditionally accessible]
     • One needs schema information to facilitate query formulation and optimization
     • How to define a schema (DTD) characterizing all and only the accessible information, without a security breach?
     • How to automatically derive such a DTD from the document DTD and an access-control specification?
     An XML DTD is far more complicated than its relational counterpart: it can be recursive and nondeterministic.

  9. Previous proposals/standards for XML security
     Dozens of models have been proposed for XML: XACML, XACL, …
     • Specifying and enforcing access control at a physical level:
       • annotate data nodes in an XML document with accessibility, and check accessibility at runtime (with optimizations for tree-pattern queries and tree/DAG DTDs), or
       • materialize a view consisting of accessible data
       Problems:
       • costly (time, space): multiple accessibility annotations/views
       • error-prone: integrity maintenance becomes a problem when the underlying data or access policy is updated
     • No support for schema availability: either deny access to any schema information, or expose the entire document DTD, which is a security breach

  10. A seemingly plausible model
     [Figure: DTD graph of D]
     • annotate data nodes with accessibility,
     • check accessibility at runtime, and
     • expose the document DTD D
     Example: permissible XPath queries:
     • Q1: //patient[name="Joe"]/record/treatment/*/tname
     • Q2: //patient[name="Joe"]/record/treatment//tname
     Security breach: from the document DTD it follows that if Q2(T) – Q1(T) is nonempty, then Joe is involved in a clinical trial.
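
The breach on slide 10 can be reproduced with a small experiment. The sketch below is illustrative only: the sample document, its values, and the helper `tnames` are assumptions, not part of the talk, and Python's `xml.etree.ElementTree` stands in for the XML query engine (its paths must start with `.`).

```python
import xml.etree.ElementTree as ET

# Hypothetical document conforming to the hospital DTD. Joe's treatment is a
# clinical trial whose nested treatment is a regular one; Ann's is regular.
DOC = """
<hospital>
  <patient><SSN>1</SSN><name>Joe</name>
    <record><date>d1</date><diagnosis>DIS</diagnosis>
      <treatment><trial><trName>T1</trName>
        <treatment><regular><tname>drugA</tname><bill>10</bill></regular></treatment>
      </trial></treatment>
    </record>
  </patient>
  <patient><SSN>2</SSN><name>Ann</name>
    <record><date>d2</date><diagnosis>DIS</diagnosis>
      <treatment><regular><tname>drugB</tname><bill>20</bill></regular></treatment>
    </record>
  </patient>
</hospital>
"""
root = ET.fromstring(DOC)


def tnames(path):
    """Evaluate a path from the root and return the text of the matched nodes."""
    return {t.text for t in root.findall(path)}


# Q1 reaches tname exactly two levels below treatment (treatment/regular/tname).
q1 = tnames(".//patient[name='Joe']/record/treatment/*/tname")
# Q2 reaches tname at any depth below treatment.
q2 = tnames(".//patient[name='Joe']/record/treatment//tname")

# Both queries return only accessible tname nodes, yet their difference is
# nonempty exactly when a tname sits deeper than two levels, which the DTD
# allows only inside a trial.
leak = q2 - q1
print(leak)
```

Both queries touch only accessible data, so a runtime accessibility check would not stop them; the leak comes from combining their answers with knowledge of the exposed document DTD.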

  11. Our security model for XML
     [Figure: access specifications 1 … n feed the derivation module, which produces security views 1 … n, each a pair (view DTD, xpath( )); queries over a view pass through the query translation module (rewriter, optimizer) to the XML document.]
     • Security administrator: specifies an access-control policy for each group by extending the document DTD with XPath qualifiers
     • Derivation module: automatically derives a security-view definition from each policy: a view DTD and a mapping via XPath
     • Query translation module: rewrites and optimizes queries over views into equivalent queries over the underlying document

  12. Overcome the limitations of previous proposals
     [Figure: same architecture as on the previous slide]
     • Specification and enforcement: at the conceptual (schema) level
       • no need to update the underlying XML data
       • no need to materialize views or perform runtime checks
     • Schema availability: the view schema is automatically derived
       • characterizing accessible data
       • exposing necessary schema information only

  13. Access-control specification
     Access policy = document DTD + XPath qualifiers
     • DTD D: element type definitions A → α, where
         α ::= PCDATA | ε | A1, …, Ak | A1 + … + Ak | A*
     • Specification S = (D, access( )): a mapping access( ) from the edges in the document DTD to {Y, N, [q]}.
       For each A → α, for each B in α, define access(A, B) as
       • Y: accessible (true)
       • N: inaccessible (false)
       • [q]: XPath qualifier, conditional: accessible iff [q] holds
     XPath fragment:
         p ::= ε | A | * | // | p/p | p ∪ p | p[q]
         q ::= p | p = "c" | q1 ∧ q2 | q1 ∨ q2 | ¬q

  14. Example: access policy S for syndrome surveillance
     [Figure: DTD graph of D with edge hospital → patient annotated [q1] and edge patient → record annotated [q2]; both are conditionally accessible]
       access(hospital, patient) = [//diagnosis = "DIS"]   (qualifier [q1])
       access(patient, record) = [diagnosis = "DIS"]   (qualifier [q2])
       access(treatment, trial) = N
       access(treatment, regular) = N
       access(regular, tname) = Y
     • overriding: if access(A, B) = Y (or N), then the B children of A override the accessibility of A
     • inheritance: if access(A, B) is not explicitly defined, then the B children of A inherit the accessibility of A
     • content-based: conditional accessibility via XPath qualifiers
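
To make overriding, inheritance and qualifiers concrete, here is a minimal sketch that labels the nodes of a document according to the policy S. It is an illustration, not the authors' implementation: the qualifiers are written as plain Python predicates rather than parsed XPath, the root is assumed accessible, and the sample document and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def has_dis_descendant(node):      # [q1]: //diagnosis = "DIS"
    return any(d.text == "DIS" for d in node.iter("diagnosis"))


def has_dis_child(node):           # [q2]: diagnosis = "DIS"
    return any(d.text == "DIS" for d in node.findall("diagnosis"))


# access(A, B): "Y", "N", or a qualifier; missing edges mean "inherit".
ACCESS = {
    ("hospital", "patient"): has_dis_descendant,
    ("patient", "record"): has_dis_child,
    ("treatment", "trial"): "N",
    ("treatment", "regular"): "N",
    ("regular", "tname"): "Y",
}


def accessible_nodes(root):
    """Return the set of accessible elements (root assumed accessible)."""
    result = {root}
    # (node, accessibility of the node, all qualifiers on the path hold)
    stack = [(root, True, True)]
    while stack:
        parent, parent_acc, quals_ok = stack.pop()
        for child in parent:
            ann = ACCESS.get((parent.tag, child.tag))
            acc, ok = parent_acc, quals_ok          # inheritance
            if ann == "Y":
                acc = True                          # overriding
            elif ann == "N":
                acc = False                         # overriding
            elif callable(ann):
                acc = ann(child)                    # conditional access
                ok = ok and acc                     # constrains the subtree
            if acc and ok:
                result.add(child)
            stack.append((child, acc, ok))
    return result


DOC = """<hospital>
<patient><SSN>1</SSN><name>Joe</name>
 <record><date>d1</date><diagnosis>DIS</diagnosis>
  <treatment><regular><tname>drugA</tname><bill>10</bill></regular></treatment></record>
 <record><date>d2</date><diagnosis>flu</diagnosis>
  <treatment><regular><tname>drugB</tname><bill>5</bill></regular></treatment></record>
</patient>
<patient><SSN>2</SSN><name>Ann</name>
 <record><date>d3</date><diagnosis>flu</diagnosis>
  <treatment><regular><tname>drugC</tname><bill>7</bill></regular></treatment></record>
</patient>
</hospital>"""
visible = accessible_nodes(ET.fromstring(DOC))
print(sorted(n.tag for n in visible))
```

In this sample, drugB carries an explicit Y but stays hidden because [q2] fails on its record, which is the "qualifier constrains the entire subtree" behaviour; regular and bill are hidden while the drugA tname below them is visible.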

  15. Properties of the specification language
     [Figure: DTD graph of D with qualifiers [q1] and [q2] on the conditionally accessible edges]
     • XML tree of the document DTD: the accessibility of each data node is uniquely defined by an access specification
       • relative to the path from the root
       • a qualifier at a node a constrains the entire subtree rooted at a, e.g., [q2] constrains tname
     • various levels of granularity: entire subtrees or specific elements
     • schema level: the underlying XML data is not touched; efficient, easy to specify and maintain

  16. Enforce access control: security views
     XML security view: σ = (Dv, xpath( )) with respect to an access policy S = (D, access( )),
     • Dv: view DTD, exposed to the user and characterizing the accessible information (of document DTD D) with respect to S
       Schema availability: to facilitate query formulation
     • xpath( ): mapping from instances of D to instances of Dv, defined in terms of XPath queries and the view DTD Dv
       • for each A → α in Dv, for each B in α, xpath(A, B) = p
       • p: generates the B children of an A element in a view
         p ::= ε | A | * | // | p/p | p ∪ p | p[q]
         q ::= p | p = "c" | q1 ∧ q2 | q1 ∨ q2 | ¬q

  17. Example: view DTD for syndrome surveillance
     σ = (Dv, xpath( )) with respect to access policy S = (D, access( ))
     [Figure: document DTD D (with qualifiers [q1], [q2]) next to the view DTD Dv, whose types are hospital, patient, SSN, name, record, date, diagnosis, treatment and tname]
     View DTD Dv:
     • Hide trial, trName, regular, bill
     • Expose accessible information only

  18. Example: view definition for syndrome surveillance
     xpath( ): maps edges in the view DTD Dv to paths in the document DTD D
     [Figure: view tree with a hospital root, several patient children, and SSN, name, record below a patient]
     • hospital → patient*
       xpath(hospital, patient) = hospital/patient[q1]
       [q1]: [//diagnosis = "DIS"]
     • patient → SSN, name, record*
       xpath(patient, SSN) = SSN   (name is handled the same way)
       xpath(patient, record) = record[q2]
       [q2]: [diagnosis = "DIS"]
     Semantics:
     • top-down construction
     • preserving the qualifiers in a specification

  19. DTD-directed construction of security views
     [Figure: view tree with a hospital root, patient children, and date, diagnosis, treatment, tname below a record]
     • record → date, diagnosis, treatment
       xpath(record, date) = date   (diagnosis and treatment are handled the same way)
     • treatment → tname*
       xpath(treatment, tname) = //tname
     • DTD-directed construction ensures view DTD conformance
     • The view is never materialized: the construction strategy only gives the semantics
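
The top-down, DTD-directed semantics of slides 18 and 19 can be sketched as a recursive construction. This materializes the view purely to illustrate the semantics (the talk stresses that views are never materialized in practice). Assumptions of the sketch: each xpath(A, B) is written as a Python function evaluated relative to the A element, and the sample document and the name `build_view` are hypothetical.

```python
import xml.etree.ElementTree as ET

# View DTD Dv: element type -> child types, in production order.
VIEW_DTD = {
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["tname"],
}

# xpath(A, B) as functions from an A element to document nodes.
# Edges not listed default to "the B children of A".
XPATH = {
    # patient[q1], q1 = //diagnosis = "DIS"
    ("hospital", "patient"): lambda h: [
        p for p in h.findall("patient")
        if any(d.text == "DIS" for d in p.iter("diagnosis"))],
    # record[q2], q2 = diagnosis = "DIS"
    ("patient", "record"): lambda p: [
        r for r in p.findall("record") if r.findtext("diagnosis") == "DIS"],
    # //tname: all tname descendants, skipping trial/regular
    ("treatment", "tname"): lambda t: list(t.iter("tname")),
}


def build_view(doc_node):
    """Build the view subtree for doc_node, following the view DTD top-down."""
    view_node = ET.Element(doc_node.tag)
    if len(doc_node) == 0:                 # leaf: copy the text value
        view_node.text = doc_node.text
    for child_type in VIEW_DTD.get(doc_node.tag, []):
        select = XPATH.get((doc_node.tag, child_type),
                           lambda n, t=child_type: n.findall(t))
        for doc_child in select(doc_node):
            view_node.append(build_view(doc_child))
    return view_node


DOC = """<hospital>
<patient><SSN>1</SSN><name>Joe</name>
 <record><date>d1</date><diagnosis>DIS</diagnosis>
  <treatment><trial><trName>T1</trName>
   <treatment><regular><tname>drugA</tname><bill>10</bill></regular></treatment>
  </trial></treatment></record>
</patient>
<patient><SSN>2</SSN><name>Ann</name>
 <record><date>d2</date><diagnosis>flu</diagnosis>
  <treatment><regular><tname>drugB</tname><bill>5</bill></regular></treatment></record>
</patient>
</hospital>"""
view = build_view(ET.fromstring(DOC))
print(ET.tostring(view, encoding="unicode"))
```

Because children are generated only for the types listed in the view DTD, the result conforms to Dv by construction: in the view, tname appears directly under treatment, and nothing reveals that Joe's treatment was a trial.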

  20. Derivation of security-view definition
     XML security views are far more intriguing than relational views:
     • multiple XPath queries vs. a single SQL query
     • DTD vs. relational schema
     One needs an algorithm to compute a security-view definition:
     • Input: an access policy S = (D, access( ))
     • Output: a security-view definition σ = (Dv, xpath( ))
     Requirements:
     • sound: accessible information only
     • complete: all the accessible data (structure preserving)
     • DTD-conformant: conforming to the view DTD
     • efficient: O(|S|²) time
     • generic: handles recursive/nondeterministic document DTDs

  21. Algorithm: deriving a security-view definition
     [Figure: document DTD D (with [q1], [q2]) next to the derived view DTD, with edges labelled by their xpath( ) definitions]
       xpath(hospital, patient) = hospital/patient[q1]
       xpath(patient, record) = record[q2]
       xpath(record, treatment) = treatment
     • Top-down traversal of the document DTD D
     • short-cutting/renaming (via dummy) inaccessible element types
     • normalizing the view DTD Dv and reducing dummy types

  22. Deriving a security-view definition (continued)
     [Figure: treatment with children trial and regular in D; the same fragment with trial and regular renamed to dummy1 and dummy2; the reduced fragment treatment → tname*]
     • recursive and nondeterministic productions:
       xpath(treatment, dummy1) = trial
       xpath(treatment, dummy2) = regular
     • reducing dummy element types: the paths (dummy1/treatment)* / dummy2 / tname from treatment to tname reduce to the production treatment → tname*, with
       xpath(treatment, tname) = //tname
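
The short-cutting step of the derivation can be sketched at the level of element types. This is a deliberately simplified illustration, not the algorithm from the talk: it ignores content models (sequence, choice, star), treats a qualifier as "accessible", introduces no dummy types, and does not compute the xpath( ) expressions; it only finds which accessible types end up as children of each accessible type. The function names are hypothetical.

```python
# Document DTD D as a graph: element type -> child types.
DTD = {
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["trial", "regular"],
    "trial": ["trName", "treatment"],
    "regular": ["tname", "bill"],
}

# Edge annotations of policy S; missing edges inherit from the parent.
ACCESS = {
    ("hospital", "patient"): "[q1]",
    ("patient", "record"): "[q2]",
    ("treatment", "trial"): "N",
    ("treatment", "regular"): "N",
    ("regular", "tname"): "Y",
}


def shortcut(a):
    """Accessible types reachable from accessible type `a` by passing
    through inaccessible types only (these become a's view children)."""
    found, seen = set(), set()
    stack = [(a, child, True) for child in DTD.get(a, [])]
    while stack:
        parent, child, parent_acc = stack.pop()
        ann = ACCESS.get((parent, child))
        acc = parent_acc if ann is None else ann != "N"
        if acc:
            found.add(child)
        elif child not in seen:            # guard against recursive DTDs
            seen.add(child)
            stack.extend((child, g, False) for g in DTD.get(child, []))
    return found


def view_edges(root):
    """Top-down traversal from the root, collecting view DTD edges."""
    edges, todo, done = {}, [root], set()
    while todo:
        a = todo.pop()
        if a in done:
            continue
        done.add(a)
        kids = shortcut(a)
        if kids:
            edges[a] = sorted(kids)
        todo.extend(kids)
    return edges


view = view_edges("hospital")
print(view)
```

On the example policy, treatment keeps tname as its only view child even though the path to it passes through the recursive trial/treatment cycle, which matches the view DTD on slide 17.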

  23. Enforce access control via query rewriting
     [Figure: a query over security view k (view DTD, xpath( )) passes through the query translation module (rewriter, optimizer) to the XML document]
     Security views are virtual: not materialized.
     • Efficiency: no extra cost to support multiple security views over the same large document simultaneously
     • Consistency/integrity: updating the underlying data introduces no difficulties/overhead
     Query translation: one needs an efficient algorithm to rewrite queries over a security view into equivalent and efficient queries over the underlying document.

  24. Algorithm rewrite
     • Input:
       • σ = (Dv, xpath( )), a security view with respect to S = (D, access( )), and
       • an XPath query Qv over the view (Dv)
     • Output: an equivalent XPath query Qt over the document
       • for any XML document T of D, Qt(T) = Qv(σ(T))
     Dynamic programming:
     • for any subquery Qv' of Qv and any node A in the view-DTD graph Dv, rewrite Qv' at A by incorporating xpath(A, _), yielding Qt'(A)
     • efficient: O(|Qv| |σ|²) time
     • handles a practical class of XPath (with union, descendant, qualifiers) vs. the tree-pattern queries studied in previous security models

  25. Example: query rewriting for syndrome surveillance
     [Figure: document DTD D (with [q1], [q2]) next to the view DTD Dv]
     Query over the view:
       Qv = //patient[name = "Joe"]//tname
     Composition along the view DTD:
       xpath(hospital, patient)[name = "Joe"] / xpath(patient, record) / xpath(record, treatment) / xpath(treatment, tname)
     Equivalent query over the document:
       Qt = /hospital/patient[name = "Joe" and //diagnosis = "DIS"]/record[diagnosis = "DIS"]/treatment//tname
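
The composition on slide 25 can be sketched as string rewriting driven by the view DTD. This is a simplified illustration, not the dynamic-programming algorithm of slide 24. Its assumptions: the view DTD is acyclic, view queries are a list of child or descendant steps, qualifiers written by the user mention only elements whose view mapping is the identity (so they are copied verbatim), each xpath(A, B) is stored as a step relative to A plus an optional qualifier, and `#doc` is a hypothetical type standing for the document node above the root.

```python
VIEW_DTD = {
    "#doc": ["hospital"],
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["tname"],
}

# xpath(A, B) as (step relative to A, qualifier); default is ("/B", None).
XPATH = {
    ("hospital", "patient"): ("/patient", '//diagnosis = "DIS"'),
    ("patient", "record"): ("/record", 'diagnosis = "DIS"'),
    ("treatment", "tname"): ("//tname", None),
}


def edge(a, b, user_q=None):
    """Document-level step for view edge (a, b), merging qualifiers."""
    path, view_q = XPATH.get((a, b), ("/" + b, None))
    quals = [q for q in (user_q, view_q) if q]
    return path + ("[" + " and ".join(quals) + "]" if quals else "")


def type_paths(src, target, descend):
    """Type paths src -> target in the view DTD: a single edge for the
    child axis, any length for the descendant axis."""
    paths = []
    for child in VIEW_DTD.get(src, []):
        if child == target:
            paths.append([src, child])
        if descend:
            paths += [[src] + p for p in type_paths(child, target, True)]
    return paths


def rewrite(steps):
    """steps: list of (axis, tag, qualifier) with axis '/' or '//'."""
    alts = [("", "#doc")]                  # (query so far, current view type)
    for axis, tag, user_q in steps:
        nxt = []
        for prefix, ctx in alts:
            for p in type_paths(ctx, tag, axis == "//"):
                hops = list(zip(p, p[1:]))
                text = "".join(edge(a, b) for a, b in hops[:-1])
                last_a, last_b = hops[-1]
                text += edge(last_a, last_b, user_q)
                nxt.append((prefix + text, tag))
        alts = nxt
    return " | ".join(prefix for prefix, _ in alts)


# Qv = //patient[name = "Joe"]//tname
qt = rewrite([("//", "patient", 'name = "Joe"'), ("//", "tname", None)])
print(qt)
```

Each `//` step in the view query is resolved against the view DTD graph rather than the document, so the rewritten query cannot wander into trial, regular or bill except through the `//tname` mapping that the view definition itself supplies.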

  26. Query optimization with structural constraints
     Optimize Qt = rewrite(σ, Qv) by leveraging the document DTD D.
     [Figure: DTD graph over A, B, C, E, F, G, H; disjunction at A over B and C; conjunction elsewhere]
       Q = A[B]//E[F]//H ∪ A[B and C]//H ∪ //F[G]/H   →   Q' = A/B/E/F/H
     • A[B and C] → empty set
       exclusive constraint: an A element cannot have both B and C children at the same time (B and C do not coexist under an A element)
     • //F[G]/H → empty set
       non-existence constraint: an F element does not have a G child
     • A[B]//E[F]//H → A/B/E/F/H
     Constraints derived from the DTD:
     • disjunction: exclusive constraints
     • conjunction: existence (non-existence) constraints
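
A rough sketch of how such constraints prune a rewritten query follows. The DTD below is an assumption chosen to be consistent with the slide's outcome (the slide's DTD graph did not survive in the transcript), and the branch encoding, `unsatisfiable`, `prune` and `paths` are hypothetical names for illustration only.

```python
# Assumed DTD: A -> B + C (disjunction); all other productions are
# sequences (conjunction) of mandatory children. F has no G child.
DTD = {
    "A": ("choice", ["B", "C"]),
    "B": ("seq", ["E"]),
    "C": ("seq", []),
    "E": ("seq", ["F"]),
    "F": ("seq", ["H"]),
    "H": ("seq", []),
}


def unsatisfiable(tag, required):
    """True if an element of type `tag` can never have all `required` children."""
    kind, children = DTD[tag]
    if any(r not in children for r in required):
        return True                        # non-existence constraint
    # exclusive constraint: a choice allows only one of its alternatives
    return kind == "choice" and len(set(required)) > 1


def prune(union):
    """Drop union branches containing a qualifier that can never hold.
    Each branch is (query text, [(type, required child types), ...])."""
    return [text for text, quals in union
            if not any(unsatisfiable(t, req) for t, req in quals)]


def paths(src, dst):
    """All type paths from src to dst in the (acyclic) DTD graph."""
    if src == dst:
        return [[dst]]
    return [[src] + p for c in DTD[src][1] for p in paths(c, dst)]


Q = [
    ("A[B]//E[F]//H", [("A", ["B"]), ("E", ["F"])]),
    ("A[B and C]//H", [("A", ["B", "C"])]),
    ("//F[G]/H", [("F", ["G"])]),
]
kept = prune(Q)

# When exactly one path connects A to H, '//' can be expanded to child steps.
# The qualifiers [B] and [F] are dropped because that path already passes
# through B and F, so the surviving branch becomes A/B/E/F/H.
only = paths("A", "H")
expanded = "/".join(only[0]) if len(only) == 1 else None
print(kept, expanded)
```

Under these assumptions the three-branch union shrinks to one branch with no descendant steps and no qualifiers.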

  27. Example: heuristic for XPath containment
     A heuristic for XPath containment (the problem is NP-hard for small fragments in the presence of DTDs):
     • image graph: evaluation of sub-queries over the DTD graph
     • containment test: extension of simulation
       • Q1 ⊆ Q2 if image(Q1) is simulated by image(Q2)
       • qualifiers: inverse simulation
     • effective: preliminary experimental study (speedup up to a factor of 2)
     [Figure: a DTD graph over A, B, C, E; the image graph for //*[C]//E; the image graph for //E]
     Example:
       Q = //*[C]//E ∪ //E   →   Q' = A/B/E
     • Q1 ∪ Q2 → Q2 if Q1 ⊆ Q2
     • //*[C]//E ∪ //E → //E → A/B/E

  28. Summary
     A practical solution for securing XML querying:
     • security views: the first model for specifying/enforcing XML security at a schema level and providing schema availability
       • a fine-grained access-control specification language
       • an effective enforcement framework via security views
         • view DTD: characterizing accessible information
         • algorithm for deriving security-view definitions
         • algorithms for query rewriting/optimization: no need to materialize views or to perform runtime security checks
     • future work:
       • reasoning about security views (soundness, completeness, DTD conformance; these subsume XPath satisfiability with DTDs)
       • inference control in the presence of external knowledge