Dissociated Web

Dissociated Web. Craig S. Kaplan @ PoCSci ‘02.

Presentation Transcript


  1. Dissociated Web
     Craig S. Kaplan @ PoCSci ‘02

  2. Dissociated Press n. … Here is a short example of word-based Dissociated Press applied to an earlier version of this Jargon File:
     wart: n. A small, crocky feature that sticks out of an array (C has no checks for this). This is relatively benign and easy to spot if the phrase is bent so as to be not worth paying attention to the medium in question.

  3. Dissociated Press n. … Here is a short example of word-based Dissociated Press applied to an earlier version of this Jargon File:
     wart: n. A small, crocky feature that sticks out of an array (C has no checks for this). This is relatively benign and easy to spot if the phrase is bent so as to be not worth paying attention to the medium in question.
     Do this for web pages!
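The word-based Dissociated Press quoted on these slides can be sketched in a few lines of Python. This is only a minimal sketch, not the Jargon File's or the talk's implementation: the names `build_word_model` and `dissociate`, the `order` parameter (number of words of context, assumed to be at least 1), and the sample text are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_word_model(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        model[context].append(words[i + order])
    return model


def dissociate(text, order=2, length=30, seed=None):
    """Emit up to `length` words by walking the word-level Markov chain."""
    rng = random.Random(seed)
    words = text.split()
    if len(words) <= order:
        return " ".join(words)
    model = build_word_model(text, order)
    context = tuple(words[:order])      # start from the opening words
    out = list(context)
    while len(out) < length:
        followers = model.get(context)
        if not followers:               # context only occurs at the very end
            break
        nxt = rng.choice(followers)
        out.append(nxt)
        context = context[1:] + (nxt,)  # slide the context window forward
    return " ".join(out)


# Hypothetical sample text, not taken from the talk.
sample = "the cat sat on the mat and the cat ran off the mat"
print(dissociate(sample, order=1, length=15, seed=0))
```

Every adjacent group of `order + 1` words in the output also occurs somewhere in the input, which is why the result reads as locally plausible and globally nonsensical.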

  4. Applications
     • HTML/XML rsch, esp. LDAP, XSL/T, XX-TRC, ISO/GWM, etc.
     • Something to work on in the middle of the night before you give a keynote that you haven’t prepared for, when you really should be sleeping or, god forbid, working on your dissertation
     • There aren’t enough web pages yet
     • And now, a lamp:

  5. Implementation
     Start with a random web page

  6. Implementation
     • First idea: binary Markov chain…
     6 bits of context
     2 bits of context

  7. Implementation
     • First idea: binary Markov chain… …sucks.
     6 bits of context
     2 bits of context

  8. Implementation
     • Second idea: symbolic Markov chain…
     1 symbol of context = Old English

  9. Implementation
     • Second idea: symbolic Markov chain…
     5 symbols of context = web wacko

  10. Implementation
      • Second idea: symbolic Markov chain…
      8 symbols of context = already above average!
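The "symbols of context" setting on these slides is the order of the Markov chain. The sketch below treats each character of a page as one symbol, which is an assumption (the slides do not say what a symbol is), and the names `train` and `generate` and the sample markup are hypothetical. It assumes a context length `k` of at least 1.

```python
import random
from collections import defaultdict


def train(symbols, k):
    """Map each length-k context to the symbols observed right after it."""
    model = defaultdict(list)
    for i in range(len(symbols) - k):
        model[symbols[i:i + k]].append(symbols[i + k])
    return model


def generate(symbols, k, n, seed=None):
    """Generate up to n symbols from an order-k chain trained on `symbols`."""
    rng = random.Random(seed)
    model = train(symbols, k)
    out = symbols[:k]                    # seed with the opening context
    while len(out) < n:
        followers = model.get(out[-k:])
        if not followers:                # dead end: context never continued
            break
        out += rng.choice(followers)
    return out


# Hypothetical input page, not one used in the talk.
page = "<p>Dissociated Web</p><p>Do this for web pages!</p><p>Web wacko</p>"
for k in (1, 5, 8):
    print(k, generate(page, k, 60, seed=1))
```

With longer contexts, fewer distinct continuations exist for each context, so the output copies longer runs of the source verbatim. That is one plausible reading of why 8 symbols of context looks so much better than 1.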

  11. Implementation
      • A better idea: tree-structured Markov chains (Markov trees)
      • HTML tags form a tree
      • Each tag contains a list of children
      • Markov model generates lists of children
      • Use different model for every “vertical context” (suffix of path in the tag tree)
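One way to read the Markov tree idea on this slide is sketched below. It is not the talk's implementation. Several details are assumptions: a tree node is a `(tag, children)` tuple, the list of children is generated by a first-order chain over child tag names with `START`/`END` markers, and the vertical context is the last `v` tags on the path from the root. The names `train`, `generate`, `render`, `max_depth`, and `max_children` are hypothetical.

```python
import random
from collections import defaultdict

START, END = "<start>", "<end>"


def train(node, model, path=(), v=2):
    """Record child-to-child transitions, keyed by the vertical context."""
    tag, children = node
    path = path + (tag,)
    ctx = path[-v:]                       # suffix of the path in the tag tree
    seq = [START] + [child[0] for child in children] + [END]
    for prev, nxt in zip(seq, seq[1:]):
        model[ctx][prev].append(nxt)
    for child in children:
        train(child, model, path, v)


def generate(tag, model, rng, path=(), v=2, max_depth=6, max_children=20):
    """Grow a random tree; each child list comes from that context's chain."""
    path = path + (tag,)
    chain = model.get(path[-v:])
    children = []
    if chain and len(path) < max_depth:
        prev = START
        while len(children) < max_children:
            nxt = rng.choice(chain[prev])
            if nxt == END:
                break
            children.append(
                generate(nxt, model, rng, path, v, max_depth, max_children))
            prev = nxt
    return (tag, children)


def render(node):
    """Serialize a (tag, children) tree as bare HTML-like markup."""
    tag, children = node
    return f"<{tag}>" + "".join(render(c) for c in children) + f"</{tag}>"


# Hypothetical training page, reduced to its tag tree.
page = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("ul", [("li", []), ("li", [])]), ("p", [])]),
])
model = defaultdict(lambda: defaultdict(list))
train(page, model)
print(render(generate("html", model, random.Random(0))))
```

Keying the model on the path suffix means an `li` list inside a `ul` is generated from different statistics than the children of `body`, so every parent and child pairing in the output is one that occurred in the training tree.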

  12. Results

  13. Analysis

  14. For more “information”: http://www.cs.washington.edu/homes/csk/disweb/
