Combining kr and search crossword puzzles
1 / 29

Combining KR and search: Crossword puzzles - PowerPoint PPT Presentation

  • Uploaded on

Combining KR and search: Crossword puzzles. Next: Logic representations Reading: C. 7.4-7.8. Changes in Homework. Mar 4 th : Hand in written design, planned code for all modules Mar 9 th : midterm Mar 25 th : Fully running system due Mar 30 th : Tournament begins. Changes in Homework.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about ' Combining KR and search: Crossword puzzles' - nika

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Combining kr and search crossword puzzles

Combining KR and search: Crossword puzzles

Next: Logic representations

Reading: C. 7.4-7.8

Changes in homework
Changes in Homework

  • Mar 4th: Hand in written design, planned code for all modules

  • Mar 9th: midterm

  • Mar 25th: Fully running system due

  • Mar 30th: Tournament begins

Changes in homework1
Changes in Homework

  • Dictionary

    • Use dictionary provided; do not use your own

    • Start with 300 words only

    • Switch to larger set by time of tournament

  • Representation of dictionary is important to reducing search time

  • Using knowledge to generate word candidates could also help

  • Midterm survey
    Midterm Survey

    • Start after 9AM Friday and finish by Thursday, Mar. 4th

    • Your answers are important: they will affect remaining class structure

    Crossword puzzle solver
    Crossword Puzzle Solver

    • Proverb: Michael Litman, Duke Univ

      • Developed by his AI class

  • Combines knowledge from multiple sources to solve clues (clue/target)

  • Uses constraint propogation in combination with probabilities to select best target

  • Algorithm overview
    Algorithm Overview

    • Independent programs specialize in different types of clues – knowledge experts

      • Information retrieval, database search, machine learning

  • Each expert module generates a candidate list (with probabilities)

  • Centralized solver

    • Merges the candidates lists for each clue

    • Places candidates on the puzzle grid

  • Performance

    • Averages 95.3% words correct and 98.1% letters correct

    • Under 15 minutes/puzzle

    • Tested on a sample of 370 NYT puzzles

    • Misses roughly 3 words or 4 letters on a daily 15X15 puzzle


    • Is this approach any more intelligent than the chess playing programs?

    • Does the use of knowledge correspond to intelligence?

    • Do any of the techniques for generating words apply to Scrabble?

    To begin research style
    To begin: research style

    • Study of existing puzzles

      • How hard?

      • What are the clues like?

      • What sources of knowledge might be helpful?

  • Crossword Puzzle database (CWDB)

    • 350,000 clue-target pairs

    • >250,000 unique pairs

    • = # of puzzles seen over 14 years at rate of one puzzle/day

  • How novel are crossword puzzles
    How novel are crossword puzzles?

    • Given complete database and a new puzzle, expect to have seen

      • 91% of targets

      • 50% of clues

      • 34% of clue target pairs

      • 96% of individual words in clues

    Categories of clues
    Categories of clues

    • Fill in the blank:

      • 28D: Nothing ____: less

  • Trailing question mark

    • 4D: The end of Plato?:

  • Abbreviations

    • 55D: Key abbr: maj

  • Expert categories
    Expert Categories

    • Synonyms

      • 40D Meadowsweet: spiraea

  • Kind-of

    • 27D Kind of coal or coat: pea

    • “pea coal” and “pea coat” standard phrases

  • Movies

    • 50D Princess in Woolf’s “Orlando”: sasha

  • Geography

    • 59A North Sea port: aberdeen

  • Music

    • 2D “Hold Me” country Grammay winner, 1988: oslin

  • Literature

    • 53A Playwright/novelist Capek: karel

  • Information retrieval

    • 6D Mountain known locally as Chomolungma: everest

  • Candidate generator
    Candidate generator

    • Farrow of “Peyton Place”: mia

    • Movie module returns:

      • 0.909091 mia

      • 0.010101 tom

      • 0.010101 kip

      • 0.010101 ben

      • 0.010101 peg

      • 0.010101 ray

    Ablation tests
    Ablation tests

    • Removed each module one at a time, rerunning all training puzzles

    • No single module changed overall percent correct by more than 1%

    • Removing all modules that relied on CWDB

      • 94.8% to 27.1% correct

  • Using only the modules that relied exclusively on CWDB

    • 87.6% correct

  • Word list modules
    Word list modules

    • WordList, WordListBig

      • Ignore their clues and return all words of correct length

      • WordList

        • 655,000 terms

    • WordListBig

      • WordList plus constructed terms

      • First and last names, adjacent words from clues

      • 2.1 million terms, all weighted equally

  • 5D 10,000 words, perhaps: novelette

  • Wordlist-CWDB

    • 58,000 unique targets

    • Returns all targets of appropriate length

    • Weights with estimates of their “prior” probabilities as targets of arbitrary clues

      • Examine frequency in crossword puzzles and normalize to account for bias caused by letters intersecting across and down terms

  • Cwdb specific modules
    CWDB-specific modules

    • Exact Match

      • Returns all targets of the correct length associated with the clue

      • Example error: it returns eeyore for 19A Pal of Pooh: tigger

  • Transformations

    • Learns transformations to clue-target pairs

    • Single-word substitution, remove one phrase from beginning or end and add another, depluralizing a word in clue, pluralize word in target

    • Nice X <-> X in France

    • X for short <-> X abbr.

    • X start <-> Prefix with X

    • X city <-> X capital

    • 51D: Bugs chaser: elmer, solved by Bugs pursuer: elmer and the transformation rule X pursuer <-> X chaser


  • Information retrieval modules
    Information retrieval modules

    • Encyclopedia

      • For each query term, compute distribution of terms “close” to query

        • Counted 10-k times every times it apears at a distance of k<10 from query term

        • Extremely common terms (as, and) are ignored

  • Partial match

    • For a clue c, find all clues in CWDB that share words

    • For each such clue, give its target a weight

  • LSI-Ency, LSI-CWDB

    • Latent semantic indexing (LSI) identifies correlations between words: synonyms

    • Return closest word for each word in the clue

  • Database modules
    Database Modules

    • Movie


      • Looks for patterns in the clue and formulates query to database

        • Quoted titles: 56D “The Thief of Baghdad” role: abu

        • Boolean operations: Cary or Lee: grant

  • Music, literary, geography

    • Simple pattern matching of clue (keywords “city”, “author”, “band”, etc) to formulate query

    • 15A “Foundation of Trilogy” author: asimov

    • Geography database: Getty Information Institute

  • Synonyms

    • WordNet

      • Look for root forms of words in the clue

      • Then find variety of related words

        • 49D Chop-chop: apace

    • Synonyms of synonyms

    • Forms of related words converted to forms of clue word (number, tense)

      • 18A Stymied: thwarted

      • Is this relevant to Scrabble?

    Syntactic modules
    Syntactic Modules

    • Fill-in-the-blanks

      • >5% clues

      • Search databases (music, geography, literary and quotes) to find clue patterns

      • 36A Yerby’s “A Rose for _ _ _ Maria”: ana

        • Pattern: for _ _ _ Maria

        • Allow any 3 characters to fill the blanks

  • Kindof

    • Pattern matching over short phrases

    • 50 clues of this type

      • “type of” (A type of jacket: nehru)

      • “starter for” (Starter for saxon: anglo)

      • “suffix with” (Suffix with switch or sock: eroo

  • Implicit distribution modules
    Implicit Distribution Modules

    • Some targets not included in any database, but more probable than random

      • Schaeffer vs. srhffeeca

      • Bigram module

        • Generates all possible letter sequences of the given length by returning a letter bigram distribution over all possible strings, learned from CWDB

      • Lowest probability clue-target, but higher probability than random sequence of letters

        • Honolulu wear: hawaiianmuumuu

  • How could this be used for Scrabble?

  • Questions1

    • Is this approach any more intelligent than the chess playing programs?

    • Does the use of knowledge correspond to intelligence?

    • Do any of the techniques for generating words apply to Scrabble?