CQL (“Common Query Language”), Ray Denenberg, March 2005. CQL’s goals: combine the simplicity and intuitiveness of Google searching with the expressive power of XQuery; support very simple queries, and arbitrarily complex expressions as necessary. Example: search on “cat”.

CQL

“Common Query Language”

Ray Denenberg

March 2005


CQL’s Goals

  • Combine the simplicity and intuitiveness of Google searching with the expressive power of XQuery.

    • Support very simple queries;

    • and arbitrarily complex expressions as necessary.

Example: search on “cat”

cat

(That’s it. The whole query.)


Simple CQL Queries

  • cat (simplest)

  • cat and dog (simple boolean)

  • title = cat (index)

  • dc.title = cat (qualified index)


Boolean

  • cat and dog

  • cat or dog

  • cat not dog

  • cat not dog and fish or frog

  • evaluates to: (((cat not dog) and fish) or frog)

  • and not to: (cat not dog) and (fish or frog)
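
The left-to-right evaluation above can be illustrated with a minimal Python sketch. It assumes a flat chain of single-word terms with no parentheses or quoted phrases; the function name `group_left_to_right` is made up for this illustration and is not part of CQL.

```python
BOOLEANS = {"and", "or", "not", "prox"}

def group_left_to_right(query):
    """Fully parenthesize a flat boolean chain, grouping from the left."""
    tokens = query.split()
    if len(tokens) % 2 == 0:
        raise ValueError("expected: term (operator term)*")
    # Start with the first term, then fold in each (operator, term) pair.
    grouped = tokens[0]
    for i in range(1, len(tokens) - 1, 2):
        op, term = tokens[i], tokens[i + 1]
        if op.lower() not in BOOLEANS:
            raise ValueError(f"expected a boolean operator, got {op!r}")
        grouped = f"({grouped} {op} {term})"
    return grouped

print(group_left_to_right("cat not dog and fish or frog"))
# (((cat not dog) and fish) or frog)
```

No operator takes precedence over another here, so explicit parentheses are the way to get a grouping such as (cat not dog) and (fish or frog).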


Index Search

  • title = cat


Qualified index

  • title = cat

  • dc.title = cat

  • bib.title = cat

  • Bath.keyTitle


Fielded/index Search

  • dc.title = cat

  • bib.title = cat


  • dc.title: A name given to the resource.

  • bib.title (fictitious): A word, phrase, character, or group of characters, normally appearing in an item, that names the item or the work contained in it.


Zthes Indexes

zthes.nt = sauropod and zthes.bt = macronaria

This finds terms narrower than sauropod but broader than macronaria.



Relations

The triple:

<index> <relation> <search term>

(e.g. title = cat)

is called a search clause.


Simple Relations

  • title = "the complete dinosaur"

  • title all "complete dinosaur"

  • title any "dinosaur bird reptile"

  • title exact "the complete dinosaur"


The = relation

  • title = "the complete dinosaur"

    (find these three words, adjacent and in this order)

  • matches “a day in the life of the complete dinosaur”

  • and “the complete dinosaur goes to Paris”

  • but not “the complete and unabridged dinosaur”


All

  • title all "complete dinosaur"

  • matches “the complete and unabridged dinosaur”

  • does not match “the unabridged dinosaur”

  • title all "dinosaur bird reptile"

  • does not match “the complete dinosaur”


Any

  • title any "dinosaur bird reptile"

  • matches “the complete dinosaur” and “the unabridged dinosaur”


Exact

  • title exact "the complete dinosaur"

    matches

    "the complete dinosaur"

    Does not match: “a day in the life of the complete dinosaur”

    or “the complete dinosaur goes to Paris” or “the complete and unabridged dinosaur”
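
The four relations can be compared side by side with a small Python sketch. It is only an approximation under stated assumptions: a "word" is whatever whitespace splitting yields, matching is case-insensitive, and `matches` is a hypothetical helper, not a function of any CQL library. A real server decides these details itself.

```python
def words(text):
    """Lower-case and split on whitespace (an assumed definition of 'word')."""
    return text.lower().split()

def matches(relation, term, field):
    """Return True if `field` satisfies `relation` for the search `term`."""
    t, f = words(term), words(field)
    if relation == "=":
        # Adjacency: the term's words appear consecutively and in order.
        return any(f[i:i + len(t)] == t for i in range(len(f) - len(t) + 1))
    if relation == "all":
        return all(w in f for w in t)
    if relation == "any":
        return any(w in f for w in t)
    if relation == "exact":
        # The whole field must equal the whole term.
        return f == t
    raise ValueError(f"unsupported relation: {relation!r}")

title = "the complete and unabridged dinosaur"
for rel, term in [("=", "the complete dinosaur"),
                  ("all", "complete dinosaur"),
                  ("any", "dinosaur bird reptile"),
                  ("exact", "the complete dinosaur")]:
    print(rel, repr(term), matches(rel, term, title))
```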



Relations …. observations

  • Observation 1: Shorthand


  • Observation 2: Anchoring

    ^ is the anchor character.


Recall …

  • title = "the complete dinosaur"

  • matches “a day in the life of the complete dinosaur”


Anchoring

  • title="^the complete dinosaur" would not match

    “a day in the life of the complete dinosaur”

  • title="the complete dinosaur^" would not match

    “the complete dinosaur goes to Paris”


  • Observation 3: Index and Relation go together

Index and Relation go together

  • cat (valid: search term alone)

  • title = cat (valid: index, relation, and search term)

  • title cat (not valid: index without a relation)

  • = cat (not valid: relation without an index)


BNF

searchClause ::= '(' cqlQuery ')'
               | index relation searchTerm
               | searchTerm
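
As a rough illustration of the last two BNF alternatives, the Python sketch below classifies an already tokenized, unparenthesized clause. The tokenization and the name `classify_search_clause` are assumptions made for this example; the parenthesized alternative is left out.

```python
def classify_search_clause(tokens):
    """Name the BNF alternative a token list fits, or raise ValueError."""
    if len(tokens) == 1:
        return "searchTerm"
    if len(tokens) == 3:
        return "index relation searchTerm"
    # Two tokens means an index without a relation, or a relation without an index.
    raise ValueError(f"not a search clause: {' '.join(tokens)}")

print(classify_search_clause(["cat"]))
print(classify_search_clause(["title", "=", "cat"]))
```

Both "title cat" and "= cat" split into two tokens, so the sketch rejects them: an index and a relation only make sense together.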


Basic Relations … summary

  • title = "the complete dinosaur"

  • title all "complete dinosaur"

  • title any "dinosaur bird reptile"

  • title exact "the complete dinosaur"


A few more relations …

  • < less

  • > greater

  • <= less or equal

  • >= greater or equal

  • = (see next)

  • <> not equal


The = relation

= means:

  • word adjacency, when the term is a list of words;

  • equality, otherwise.


Relation Modifiers

  • stem

  • relevant

  • fuzzy

  • phonetic


Stemming

  • title =/stem "these completed dinosaurs" matches “The Complete Dinosaur”.


Relevance

subject any/relevant "fish frog" would find records whose subject field included words like shark, tuna, coelacanth, toad, amphibian, etc.




Fuzzy

  • Fuzzy means:

    • “be liberal in what you count as a match … details left to the server. Might include permutations of character order, off-by-one for numerical terms.”

  • title =/fuzzy "sharlot simmins"

    might match “I am Charlotte Simmons”

  • telephoneNumber exact/fuzzy "303 441 1319"




Phonetic

  • Match words that sound the same,

    e.g. “hostel” might match “hostile”


Booleans

  • And: cat and dog

  • Or: cat or dog

  • Not: cat not dog

  • Proximity: cat prox dog (roughly: “find cat near dog”)


Proximity

(chestnut prox "Cryphonectaria parasitica")

prox

("dutch elm" prox Ceratocystisulmi)


Proximity parameters

  • relation

  • distance

  • unit

  • ordering

e.g. “Find cat in the same sentence as dog”:

Relation: less or equal

Distance: 0

Unit: sentence

Ordering: unordered


  • relation ("<", ">", "<=", ">=", "=", "<>"; default "<=")

  • distance (integer; default 1 for word, 0 otherwise)

  • unit ("word", "sentence", "paragraph", or "element"; default "word")

  • ordering ("ordered" or "unordered"; default "unordered")


“Find cat in the same sentence as dog”

cat prox//sentence dog

same as:

cat prox/<=/0/sentence/unordered dog


(chestnut prox//sentence "Cryphonectaria parasitica")

prox//paragraph

("dutch elm" prox//sentence Ceratocystisulmi)

(Find chestnut in the same sentence as “Cryphonectaria parasitica”, and “dutch elm” in the same sentence as Ceratocystisulmi, and both sentences in the same paragraph.)


(chestnut prox//paragraph "Cryphonectaria parasitica")

and

("dutch elm" prox//paragraph Ceratocystisulmi)

(Find chestnut in the same paragraph as “Cryphonectaria parasitica”, and “dutch elm” in the same paragraph as Ceratocystisulmi.)


cat prox/>/2//ordered hat

retrieves “cat in the hat” but not “cat in hat”

nor “hat on the cat”
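
A word-level proximity test consistent with these three examples might look like the Python sketch below. It assumes that distance is the difference between word positions and that words are whitespace-separated; `prox_words` is a hypothetical name, and units other than "word" are not handled.

```python
import operator

COMPARE = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
           ">=": operator.ge, "=": operator.eq, "<>": operator.ne}

def prox_words(text, left, right, relation="<=", distance=1, ordered=False):
    """True if some occurrence of `left` and `right` satisfies the constraint."""
    tokens = text.lower().split()
    lefts = [i for i, w in enumerate(tokens) if w == left]
    rights = [i for i, w in enumerate(tokens) if w == right]
    for i in lefts:
        for j in rights:
            if ordered and j < i:
                continue  # right-hand term must not precede the left-hand term
            if COMPARE[relation](abs(j - i), distance):
                return True
    return False

for title in ("cat in the hat", "cat in hat", "hat on the cat"):
    print(title, prox_words(title, "cat", "hat", ">", 2, ordered=True))
```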


Pattern Matching

  • ? matches any single character

    • c?t matches cat, cot, cut, but not coat or ct. c??t matches cart, but not cat or crypt.

  • * matches any sequence of zero or more characters

    • c*t matches cat, coat, crypt and counterargument.

  • ^ word-anchoring
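
One way to experiment with the ? and * masks is to translate them into regular expressions, as in this Python sketch. The helper names are made up for the example, and backslash escapes and the ^ anchor are deliberately ignored.

```python
import re

def mask_to_regex(mask):
    """Translate a mask with ? and * into a compiled regular expression."""
    parts = []
    for ch in mask:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def mask_matches(mask, word):
    """True if the whole word fits the mask."""
    return mask_to_regex(mask).fullmatch(word) is not None

for word in ("cat", "coat", "ct"):
    print(word, mask_matches("c?t", word))
```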


Word Anchoring

  • title="^the complete dinosaur"

    • Matches “the complete dinosaur meets godzilla”

    • But not “a day in the life of the complete dinosaur”

  • title="the complete dinosaur^"

    • Matches “a day in the life of the complete dinosaur”

    • But not “the complete dinosaur meets godzilla”


Word Anchoring - any

  • title any "^cat ^dog rat"

    • Means title with cat at the beginning, or with dog at the beginning, or with rat anywhere.

  • matches

    • 'cat eats dog',

    • 'dog eats hat'

    • 'hat eats rat'

  • but not

    • 'hat eats dog'
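
The interaction between "any" and the leading ^ anchor can be checked with a short Python sketch. It assumes whitespace-separated, case-insensitive words and only handles an anchor at the start of a term word; `any_with_anchors` is an invented helper.

```python
def any_with_anchors(term, field):
    """'any' relation where a leading ^ pins a word to the start of the field."""
    field_words = field.lower().split()
    for word in term.lower().split():
        if word.startswith("^"):
            # Anchored word: must be the first word of the field.
            if field_words and field_words[0] == word[1:]:
                return True
        elif word in field_words:
            # Unanchored word: may appear anywhere.
            return True
    return False

for title in ("cat eats dog", "dog eats hat", "hat eats rat", "hat eats dog"):
    print(title, any_with_anchors("^cat ^dog rat", title))
```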


CQL Syntax

  • Reserved words:

    • and, or, not, prox

  • Special characters:

    • space ( ) = < > " /


Tokens

  • A string that has no special characters; or

  • Any string at all enclosed by double quotes. (Except the string cannot include a double quote, unless escaped.)


Escape Character \

  • Backslash (\) escapes *, ?, " and ^, as well as itself.

"\"why not\?\" she said"

results in the following token:

"why not?" she said
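
The escape rule can be tried out with a minimal Python sketch that strips the enclosing quotes and resolves each backslash pair. The name `unquote_token` is an assumption for this example. It also simplifies in the same way the slide does: once the backslash is dropped, an escaped ? can no longer be told apart from the wildcard, which a real implementation would need to track.

```python
def unquote_token(quoted):
    """Strip the enclosing double quotes and resolve backslash escapes."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("expected a double-quoted string")
    body = quoted[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])   # keep the escaped character literally
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

print(unquote_token(r'"\"why not\?\" she said"'))
# "why not?" she said
```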



Context sets

  • Indexes

  • Relations

  • Relation modifiers

  • Boolean modifiers



subject any/relevant "fish frog"

  • subject: index

  • any: relation

  • relevant: relation modifier

  • "fish frog": search term

The index, relation, and relation modifier are subject to context qualification.


dc.subject any/relevant "fish frog"

(dc: context set for the index)

dc.subject any/rel.lr "fish frog"

(rel: context set; lr: a specific relevance algorithm)

dc.subject cql.any/rel.lr "fish frog"

(cql: context set for the relation)


Example: fictitious relation “only”

  • depicts only "cat"

    (index: depicts; relation: only)

    Matching images would depict only a cat and nothing else. The same cat with a person would not match.

  • image.depicts image.only "cat"

    (the first “image” is the context set for the index; the second is the context set for the relation)


Go back to:

subject any/relevant "fish frog"

Two such clauses joined by a boolean:

subject any/relevant "fish frog"

or

title any/relevant "cat dog"

The same, with a boolean modifier:

subject any/relevant "fish frog"

or/rel.mean

title any/relevant "cat dog"

(rel: context set; mean: boolean modifier)


Defaults

  • Consider the query:

    cat

  • The server needs to turn that into a search clause, i.e. an index, relation, and search term.

  • As it is, there’s only a search term.


<index> <relation> cat

  • <index> defaults to cql.serverChoice (default index)

  • <relation> defaults to cql.scr (default context set and relation)

scr: “server choice relation”



  • Next, consider the query:

    title = cat

  • The server needs to assign a context set to the index (title) and a context set to the relation (=).

  • Or, to make it even more complicated …


  • Add a relation modifier:

    title =/relevant cat

  • The server needs to assign a context set to the index (title), a context set to the relation (=), and a context set to the relation modifier (relevant).


Default Context Sets

<>.title cql.=/cql.relevant cat

  • Default context set for the index (shown as <>) is selected by the server.

  • Default context set for the relation is ‘cql’.

  • Default context set for the relation modifier is ‘cql’.
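
The defaulting rules can be summarized in a short Python sketch. It is an illustration only: `apply_defaults` and `qualify` are hypothetical helpers, the clause is assumed to be tokenized already, the relation modifier is written on the relation (as in subject any/relevant), and the `index_set` argument stands in for whatever context set the server would choose for an unqualified index.

```python
def qualify(name, default_set):
    """Prefix a name with a context set unless it already has one."""
    return name if "." in name else f"{default_set}.{name}"

def apply_defaults(tokens, index_set):
    """Return a fully qualified [index, relation, term] clause."""
    if len(tokens) == 1:
        # Bare term: default index and "server choice relation".
        return ["cql.serverChoice", "cql.scr", tokens[0]]
    index, relation, term = tokens
    base, *modifiers = relation.split("/")
    # Relations and relation modifiers default to the 'cql' context set.
    relation = "/".join([qualify(base, "cql")] +
                        [qualify(m, "cql") for m in modifiers])
    return [qualify(index, index_set), relation, term]

# "dc" is an assumed server choice, used purely as an example.
print(" ".join(apply_defaults(["title", "=/relevant", "cat"], index_set="dc")))
# dc.title cql.=/cql.relevant cat
```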


Additional relation modifiers

  • word: The term should be broken into words (according to the server's definition of a “word”).

  • string: The term is a single item, and should not be broken up.

  • isoDate: Each item within the term conforms to ISO 8601.

  • number: Each item within the term is a number.

  • uri: Each item within the term is a URI.

  • masked: (default modifier)


  • title any "cat dog" same as

    title any/word "cat dog"

  • title exact "cat in the hat" same as

    title exact/string "cat in the hat"

  • title = "cat * hat" same as

    title =/masked "cat * hat"
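
These three equivalences amount to a lookup from relation to implied modifier, sketched below in Python. The table covers only the relations shown above; the function name is invented, and leaving an explicitly modified relation untouched is an assumption rather than something the slides state.

```python
# Implied modifiers, taken from the three equivalences above.
DEFAULT_MODIFIER = {"any": "word", "exact": "string", "=": "masked"}

def with_default_modifier(relation):
    """Make the implied modifier explicit when none is given."""
    if "/" in relation:
        return relation  # assumption: an explicit modifier is kept as written
    default = DEFAULT_MODIFIER.get(relation.lower())
    return f"{relation}/{default}" if default else relation

for rel in ("any", "exact", "="):
    print(rel, "->", with_default_modifier(rel))
```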

