
Advanced CQL and Profiling

Presentation Transcript


  1. Advanced CQL and Profiling
     Mike Taylor <mike@indexdata.com>
     1. Esoteric CQL features:
        – Word anchoring
        – Proximity
        – Relation modifiers
        – Boolean modifiers
     2. Profiling
     3. Prefix mapping
     4. Defining relations

  2. CQL features: esoterica
     “You are not expected to understand this.” – comment in the Sixth Edition Unix source code.
     The point is that new users are not required to understand these features, and may happily use CQL for many years – perhaps forever – without needing to.

  3. CQL esoterica: word anchoring
     A word beginning with “^” must occur at the start of its field. A word ending with “^” must occur at the end of its field.
     • dinosaur – matches “the complete dinosaur”
     • dinosaur^ – also matches
     • ^dinosaur – does not match
     • the – matches “the complete dinosaur”
     • ^the – also matches
     • the^ – does not match
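
The anchoring rules above can be sketched in a few lines of Python. This is only an illustration: the function name `anchored_match` is made up for this example, and it assumes a field is a plain string whose words are separated by whitespace and compared case-insensitively.

```python
def anchored_match(term: str, field: str) -> bool:
    """Return True if a single-word CQL term matches the field text.

    A leading "^" pins the word to the start of the field and a
    trailing "^" pins it to the end (illustrative semantics only).
    """
    words = field.lower().split()
    at_start = term.startswith("^")
    at_end = len(term) > 1 and term.endswith("^")
    word = term.strip("^").lower()
    if not word or not words:
        return False
    for position, candidate in enumerate(words):
        if candidate != word:
            continue
        if at_start and position != 0:
            continue
        if at_end and position != len(words) - 1:
            continue
        return True
    return False


FIELD = "the complete dinosaur"
RESULTS = {term: anchored_match(term, FIELD)
           for term in ["dinosaur", "dinosaur^", "^dinosaur", "the", "^the", "the^"]}
```

`RESULTS` reproduces the six cases on the slide: the two anchored terms that point at the wrong end of the field are the only ones that fail.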

  4. CQL esoterica: proximity
     The “prox” boolean, by default, requires its operands to be next to each other, in either order:
     • cervical prox vertebra – equivalent to "cervical vertebra" or "vertebra cervical"
     • (cervical or dorsal) prox vertebra – equivalent to "cervical vertebra" or "dorsal vertebra" or "vertebra cervical" or "vertebra dorsal"
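
One way to see the default behaviour of “prox” is to expand it mechanically into the phrase queries it is equivalent to. The sketch below does that for operands that are single words or or-lists of single words; the helper name `expand_default_prox` is hypothetical and not part of any CQL toolkit.

```python
from itertools import product


def expand_default_prox(left: list[str], right: list[str]) -> str:
    """Rewrite `(l1 or l2 ...) prox (r1 or r2 ...)` as an or-list of phrases.

    Default proximity means "adjacent, in either order", so every
    left/right pairing is emitted once in each order.
    """
    forward = [f"{a} {b}" for a, b in product(left, right)]
    backward = [f"{b} {a}" for a, b in product(left, right)]
    return " or ".join(f'"{phrase}"' for phrase in forward + backward)


SIMPLE = expand_default_prox(["cervical"], ["vertebra"])
GROUPED = expand_default_prox(["cervical", "dorsal"], ["vertebra"])
```

`SIMPLE` and `GROUPED` come out as the two equivalent queries listed on the slide.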

  5. CQL esoterica: proximity II
     Modifiers can generalise the semantics of proximity:
     • cervical prox/distance<=5 vertebrae – within five words of each other
     • cervical prox/distance=0/unit=sentence vertebrae – within the same sentence
     • cervical prox/distance>0/unit=paragraph vertebrae – in different paragraphs
     • cervical prox/ordered vertebrae – in the specified order: exactly equivalent to "cervical vertebrae"
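
A rough sketch of how a server might evaluate word-unit proximity with the distance and ordered modifiers follows. It assumes that adjacent words are at distance 1 and that the text is tokenised on whitespace; `prox_match` and its parameters are names invented for this example, and sentence or paragraph units are left out.

```python
import operator

# Comparison symbols that may follow "distance" in a prox modifier.
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def prox_match(text: str, left: str, right: str,
               relation: str = "<=", distance: int = 1,
               ordered: bool = False) -> bool:
    """True if some occurrence of `left` and `right` satisfies the distance test."""
    words = text.lower().split()
    compare = COMPARATORS[relation]
    for i, first in enumerate(words):
        if first != left:
            continue
        for j, second in enumerate(words):
            if second != right or i == j:
                continue
            gap = j - i  # positive when `right` follows `left`
            if ordered and gap < 0:
                continue
            if compare(abs(gap), distance):
                return True
    return False


SAMPLE = "the third cervical and first dorsal vertebrae"
```

With `SAMPLE`, the words “cervical” and “vertebrae” are four positions apart, so `distance<=5` matches while the default of adjacency does not.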

  6. CQL esoterica: relation modifiers
     Modifiers can refine the semantics of relations:
     • title =/stem dig – finds “dig”, “digging”, “dug”, etc.
     • title any/relevant "dinosaur bird reptile" – finds “sauropods”, “avian”, “crocodile”, “snake”, etc.
     • author =/fuzzy tailor – finds “Mike Taylor”
     • phoneNumber exact/fuzzy "020 8348 6768" – finds “020 8348 6769”

  7. CQL esoterica: relation modifiers II
     Relation modifiers can be overloaded to specify extra information about the term that the relation joins to the index:
     • createdDate >/isoDate "2004-03-12 09:45:00" – the term is in ISO 8601 format.
     • location within/geom.polygon "(12,46) (15,52)" – the term indicates a polygon of two points (i.e. a straight line) rather than the corners of a rectangle.

  8. CQL esoterica: boolean modifiers
     Modifiers can refine the semantics of boolean operators. We've already seen some examples of this in proximity.
     • cervical prox/distance<=5 vertebrae – within five words of each other
     • cervical or/exclusive vertebrae – one or the other, but not both
     • denenberg or/rel.mean "information retrieval"
     • denenberg or/rel.sum "information retrieval"
     • denenberg or/rel.max "information retrieval"
       – respectively the average, total or maximum relevance of the operands
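
Relation modifiers and boolean modifiers share one shape: a base relation or boolean followed by “/”-separated modifiers, each of which is a name with an optional comparison and value. A minimal sketch of splitting such a token apart is shown below; `parse_modified` is a hypothetical helper, it skips empty segments, and it does not handle quoted values that themselves contain “/”.

```python
import re

# name, optionally followed by a comparison symbol and a value
MODIFIER = re.compile(r"([A-Za-z][\w.]*)\s*(?:(<=|>=|<>|=|<|>)\s*(.+))?")


def parse_modified(token: str):
    """Split e.g. 'prox/distance<=5/unit=sentence' into base and modifiers."""
    base, *segments = token.split("/")
    modifiers = []
    for segment in segments:
        segment = segment.strip()
        if not segment:
            continue  # ignore empty segments
        match = MODIFIER.fullmatch(segment)
        if match is None:
            raise ValueError(f"cannot parse modifier: {segment!r}")
        modifiers.append(match.groups())
    return base, modifiers


PROX = parse_modified("prox/distance<=5/unit=sentence")
STEM = parse_modified("=/stem")
MEAN = parse_modified("or/rel.mean")
```

Each parsed modifier is a `(name, comparison, value)` tuple, with `None` in the last two slots for bare modifiers such as `stem` or `rel.mean`.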

  9. Profiling CQL
     For simple searching, it suffices to use common indexes. Semantic interoperability requires more precise behaviour.
     This lesson was learned in the Z39.50 world and resulted in the invention of “profiles”: specifications of the subset of the full standard that is needed to support an application.
     The classic example in Z39.50 is the Bath Profile for bibliographic searching. Similarly, we define a Bath Profile for CQL searching.

  10. Profiles and context sets
      A profile is not the same thing as a context set!
      • A context set is merely a bag of indexes (and relation modifiers and boolean modifiers) that may be used in any application.
      • A profile provides a palette of indexes drawn from several context sets.
      The distinction is similar to that between XML namespaces and XML Schemas.
      • Schemas depend on namespaces, and may use several.
      • CQL profiles depend on context sets, and may use several.
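
The distinction can be made concrete with two small data structures. The classes below are only a sketch: `ContextSet` and `Profile` are names invented here, the index lists are abbreviated, and the two URIs are the ones quoted elsewhere in this talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContextSet:
    """A bag of index names identified by a URI."""
    uri: str
    indexes: frozenset


@dataclass
class Profile:
    """A palette of indexes drawn from one or more context sets."""
    name: str
    palette: dict = field(default_factory=dict)  # context-set URI -> index names

    def allow(self, context_set: ContextSet, *names: str) -> None:
        unknown = set(names) - context_set.indexes
        if unknown:
            raise ValueError(f"not in {context_set.uri}: {sorted(unknown)}")
        self.palette.setdefault(context_set.uri, set()).update(names)

    def permits(self, context_set: ContextSet, name: str) -> bool:
        return name in self.palette.get(context_set.uri, set())


DC = ContextSet("info:srw/cql-context-set/1/dc-v1.1",
                frozenset({"title", "creator", "subject", "date"}))
BATH = ContextSet("http://zing.z3950.org/cql/bath/2.0/",
                  frozenset({"personalName", "conferenceName"}))

# A toy profile that uses only part of each context set.
TOY = Profile("toy bibliographic profile")
TOY.allow(DC, "title", "creator")
TOY.allow(BATH, "conferenceName")
```

The context sets know nothing about the profile, while the profile refers to both of them and picks only the indexes it needs, much as a schema draws on several namespaces.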

  11. Example: the Bath Profile
      See http://zing.z3950.org/srw/bath/2.0/
      Bath searches may use any of the following indexes:
      • dc.creator, dc.title, dc.subject, dc.identifier, dc.date, dc.format, dc.language
      • cql.anywhere
      • rec.id
      • bath.personalName, bath.corporateName, bath.conferenceName, bath.uniformTitle, bath.issn, bath.keyTitle, bath.geographicName, bath.notes, bath.topicalSubject, bath.possessingInstitution, bath.genreForm, bath.name

  12. Existing and possible profiles
      Explicit CQL profiles have been created for some applications:
      • Bath Profile for bibliographic data
      • Zthes profile for hierarchical thesaurus navigation
      Profiles are in development (or “unwritten”) for others:
      • Google-like structureless searching
      • Simple metadata searching with the Dublin Core
      • CCG for collectable card games
      • Music – musicalKey, arranger, duration, etc.
      • GILS (Global Information Locator Service)
      • ... your application goes here!

  13. CQL esoterica: prefix mapping
      So far, we have been free and easy with index prefixes such as “dc”. But how do we know what they mean?
      Why should “dc” mean Dublin Core rather than Deep Custard?
      • dc.custardDepth <= 20
      Why should “bath” mean the Bath Profile for bibliographic searching instead of plumbing supplies?
      • bath.capacityInGallons > 45

  14. CQL esoterica: prefix mapping II
      Prefixes are just convenient, easy-to-type abbreviations. The real identifier of a context set is its URI.
      For example, the Dublin Core context set is info:srw/cql-context-set/1/dc-v1.1 but we map that URI to a prefix for convenience.
      This is exactly like XML namespaces: they are identified by URIs, but the URIs do not appear in the names of elements or attributes: short prefixes are used instead.

  15. CQL esoterica: prefix mapping III
      In XML, a prefix is associated with a namespace using:
      • <element xmlns:prefix="http://example.org/xyz/">
      In CQL, a prefix is associated with a context set using:
      • >prefix=http://example.org/xyz/
      and the rest of the query follows.
      The following queries are exactly equivalent:
      • >dc=info:srw/cql-context-set/1/dc-v1.1 dc.title=fish
      • >yx=info:srw/cql-context-set/1/dc-v1.1 yx.title=fish
      Most applications will have established default mappings.

  16. CQL esoterica: prefix mapping IV
      It is possible to establish the context set from which indexes with no explicit prefix are taken by omitting the “prefix=” part from the mapping:
      • >http://example.org/heraldry/ title=baron and side=sinister
      So the following queries are exactly equivalent:
      • >info:srw/cql-context-set/1/dc-v1.1 title=fish
      • >yx=info:srw/cql-context-set/1/dc-v1.1 yx.title=fish

  17. CQL esoterica: prefix mapping V
      Finally ... Finally! :-)
      Prefix mappings can be stacked up:
      • >dc = info:srw/cql-context-set/1/dc-v1.1
        >bath=http://zing.z3950.org/cql/bath/2.0/
        >rec=info:srw/cql-context-set/2/rec-1.0
        rec.created < 2004-10-09 and dc.title=ecology and bath.conferenceName=dinosaur
      (Yes, this is all one query.)

  18. CQL esoterica: prefix mapping VI
      Don't try this at home.
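
The prefix-mapping slides can be summarised in a short sketch that peels the leading mappings off a query and resolves an index name to its context-set URI. The function names are hypothetical, the parsing is a regular-expression approximation rather than a full CQL parser, and it accepts URIs written bare (as on the slides) or in double quotes.

```python
import re

# ">" then an optional "prefix =" then a URI (quoted or up to the next space)
MAPPING = re.compile(r'\s*>\s*(?:([A-Za-z]\w*)\s*=\s*)?("[^"]*"|\S+)')


def split_prefix_mappings(query: str):
    """Return (mappings, rest_of_query); the key None is the default context set."""
    mappings = {}
    position = 0
    while (match := MAPPING.match(query, position)) is not None:
        prefix, uri = match.group(1), match.group(2).strip('"')
        mappings[prefix] = uri
        position = match.end()
    return mappings, query[position:].strip()


def resolve(index: str, mappings: dict):
    """Turn 'dc.title' into (context-set URI, 'title') using the mappings."""
    prefix, dot, name = index.partition(".")
    if not dot:  # no explicit prefix: use the default context set
        return mappings.get(None), index
    return mappings.get(prefix), name


DC_URI = "info:srw/cql-context-set/1/dc-v1.1"
WITH_DC, REST_DC = split_prefix_mappings(">dc=" + DC_URI + " dc.title=fish")
WITH_YX, REST_YX = split_prefix_mappings(">yx=" + DC_URI + " yx.title=fish")
DEFAULT, REST_DEFAULT = split_prefix_mappings(">" + DC_URI + " title=fish")
STACKED, REST_STACKED = split_prefix_mappings(
    ">dc = info:srw/cql-context-set/1/dc-v1.1 "
    ">bath=http://zing.z3950.org/cql/bath/2.0/ "
    ">rec=info:srw/cql-context-set/2/rec-1.0 "
    "rec.created < 2004-10-09 and dc.title=ecology and bath.conferenceName=dinosaur"
)
```

Resolving `dc.title`, `yx.title` and the unprefixed `title` against their respective mappings gives the same (URI, index) pair each time, which is the sense in which those queries are equivalent.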

  19. Defining relations
      CQL has a “feature” where any word can act as a relation. For example, the query:
      • foo bar baz
      is interpreted as index-name “foo”, relation “bar”, term “baz” – even though there is no relation “bar”.
      This is a misfeature: it prevents the obvious interpretation of this query as a phrase search or AND search.
      If your profile needs a new relation, consider defining it as a relation modifier on one of the existing relations instead.
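
The effect of this rule can be shown with a toy search-clause reader. Everything here is illustrative: `read_clause` is an invented name, the clause is split naively on whitespace, and `KNOWN_RELATIONS` lists only the relations that appear in the examples in this talk, not the full set defined by CQL.

```python
# Relations used in this talk's examples (not an exhaustive list).
KNOWN_RELATIONS = {"=", "<", ">", "<=", "any", "exact", "within"}


def read_clause(clause: str) -> dict:
    """Read a three-word clause the way the slide describes: index, relation, term."""
    tokens = clause.split()
    if len(tokens) != 3:
        raise ValueError("this sketch only handles 'index relation term'")
    index, relation, term = tokens
    base = relation.split("/")[0]  # strip any relation modifiers
    return {
        "index": index,
        "relation": relation,
        "term": term,
        "relation_is_known": base in KNOWN_RELATIONS,
    }


SURPRISE = read_clause("foo bar baz")
INTENDED = read_clause("title =/stem dig")
```

`SURPRISE` shows “bar” being taken as a relation that nothing defines, whereas `INTENDED` keeps a well-known relation and carries the extra behaviour in a modifier, which is the approach the slide recommends.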

  20. Thanks for listening!