
    Presentation Transcript
    1. Distributed, Real-Time Computation of Community Preferences. Thomas Lutkenhouse, Michael L. Nelson, Johan Bollen. Old Dominion University, Computer Science Department, Norfolk, VA 23529 USA. {lutken,mln,jbollen}@cs.odu.edu. HT 2005 - Sixteenth ACM Conference on Hypertext and Hypermedia, 6-9 September 2005, Salzburg, Austria

    2. Distributed, Real-Time Computation of Community Preferences • not CS if you don’t compute • changes are immediate • no central state • not personalization

    3. Outline • Review of technologies • buckets • Hebbian learning • previous results • Experiment design • Results • Lessons learned • Conclusions

    4. SRW? RSS? Non-evolution of DL Objects . . .

    5. Buckets • Premise: repositories come and go, but the objects should endure • Began as part of NASA DL research • focus on digital preservation • implementation of the “Smart Objects, Dumb Archives” (SODA) model for digital libraries • CACM 2001, doi.acm.org/10.1145/374308.374342 • D-Lib, dx.doi.org/10.1045/february2001-nelson

    6. Smart Objects • Responsibilities generally associated with the repository are “pushed down” into the stored object • T&C, maintenance, logging, pagination & display, etc… • Aggregate: • metadata • data • methods to operate on the metadata/data • API examples • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=getMetadata&type=all • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listMethods • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listPreference • (cheat) http://www.cs.odu.edu/~mln/teaching/cs595-f03/bucket/bucket.xml
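The API examples above are plain HTTP GETs with a `method` query parameter. A minimal sketch of building such a request URL; the `bucket_call` helper is hypothetical, while the method names (`getMetadata`, `listMethods`) come from the slide:

```python
# Sketch: a bucket is invoked by appending ?method=... to its base URL.
# bucket_call is a hypothetical helper, not part of the bucket software.
from urllib.parse import urlencode

def bucket_call(bucket_url: str, method: str, **params) -> str:
    """Build the request URL for one bucket method invocation."""
    query = urlencode({"method": method, **params})
    return f"{bucket_url}?{query}"

url = bucket_call("http://www.cs.odu.edu/~mln/teaching/cs595-f03/",
                  "getMetadata", type="all")
# url == "http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=getMetadata&type=all"
```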

    7. Examples • 1.6.X bucket • http://ntrs.nasa.gov/ • http://www.cs.odu.edu/~mln/phd/ • 2.0 buckets • http://www.cs.odu.edu/~mln/teaching/cs595-f03/ • http://www.cs.odu.edu/~lutken/bucket/ • 3.0 buckets (under development) • http://beaufort.cs.odu.edu:8080/ • uses MPEG-21 DIDLs • cf. http://www.dlib.org/dlib/november03/bekaert/11bekaert.html

    8. Hebbian Learning • Implementation issues: • gather log files (problematic when spread across servers/domains) • determine a timeout T for session reconstruction (typically 5 min) • compute links & weights • update the network periodically (typically monthly)
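The log-based pipeline above can be sketched as follows. The log format, the unit weight increment, and the function names are assumptions; T is set to the slide's typical 5 minutes:

```python
# Sketch of log-based Hebbian learning: split one user's accesses into
# sessions at gaps longer than T, then strengthen the link between each
# pair of consecutively viewed objects. Log format is an assumption.
from collections import defaultdict

T = 5 * 60  # session timeout in seconds ("typically 5 min")

def sessions(accesses, timeout=T):
    """accesses: list of (timestamp, object_id), sorted by timestamp."""
    out, current, last = [], [], None
    for ts, obj in accesses:
        if last is not None and ts - last > timeout:
            out.append(current)
            current = []
        current.append(obj)
        last = ts
    if current:
        out.append(current)
    return out

def hebbian_weights(accesses):
    w = defaultdict(float)
    for sess in sessions(accesses):
        for a, b in zip(sess, sess[1:]):
            w[(a, b)] += 1.0  # co-retrieval strengthens the directed link
    return w

log = [(0, "a"), (60, "b"), (120, "c"), (1000, "a"), (1030, "c")]
w = hebbian_weights(log)   # two sessions: [a, b, c] and [a, c]
```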

    9. Previous, Log-Based Recommendation Implementations • LANL Journal Recommendations • collection analysis based on journal readership patterns • D-Lib Magazine, dx.doi.org/10.1045/june2002-bollen • NASA Technical Report Server • compared recommendations with those generated by VSM • WIDM 2004, doi.acm.org/10.1145/1031453.1031480 • Open Video Project • generated recommendations for videos (little descriptive metadata) • JCDL 2005, doi.acm.org/10.1145/1065385.1065472

    10. Hebbian Learning with Bucket Methods
    http://b?method=display&referer=http://b&redirect=http://a?method=display%26redirect=http://c?method=display%26referer=http://b
    http://a?method=display&referer=http://a&redirect=http://b?method=display%26referer=http://a
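The point of the %26 escapes in these requests is that the nested request's own query string must be percent-encoded so it survives one round of parsing. A sketch of unpacking such a request with the standard library; the example URL mirrors the slide's pattern:

```python
# Unpack a bucket request whose redirect parameter carries a nested,
# percent-encoded query string (%26 decodes back to '&').
from urllib.parse import urlsplit, parse_qs

request = ("http://a?method=display&referer=http://a"
           "&redirect=http://b?method=display%26referer=http://a")

outer = parse_qs(urlsplit(request).query)
redirect = outer["redirect"][0]      # nested query is now decoded
inner = parse_qs(urlsplit(redirect).query)
# inner == {"method": ["display"], "referer": ["http://a"]}
```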

    11. Experiment • Spin Magazine’s “Top 50 Rock Bands of All Time” • something other than reports, journals, etc. • harvest allmusic.com for metadata for all LPs by the 50 bands (total = 800 LPs) • Maintain hierarchical arrangement • 1 artist → N albums • Initialize the network of 800 LPs with each LP randomly linked to 5 other LPs • Send out email invitations to browse the network • have them explore, and then examine the resulting network • users not informed about the workings of the network

    12. Display of LPs

    13. Hierarchical, Weighted Links
    weights: initial 0.5, frequency 1.0, symmetry 0.5, transitivity 0.3

    <structural>
      <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/121/">
        <metadata>
          <descriptive>
            <title>Terrapin Station, Capital Centre, Landover, MD, 3/15/90</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/11/">
        <metadata>
          <descriptive>
            <title>Jealousy/Progress</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      <element wt="3" id="~http://www.cs.odu.edu/~lutken/bucket/434/">
        <metadata>
          <descriptive>
            <title>Nevermind</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/130/">
        <metadata>
          <descriptive>
            <title>Technical Ecstasy</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      . . .
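One plausible reading of the four constants on that slide, sketched below: every link implicitly starts at the initial weight, a traversal adds the frequency increment to the followed link, a smaller symmetry increment to the reverse link, and a transitivity increment from the source toward the destination's existing neighbors. The update rule itself is an interpretation for illustration, not taken from the paper:

```python
# Hypothetical application of the slide's weights: initial 0.5,
# frequency 1.0, symmetry 0.5, transitivity 0.3.
from collections import defaultdict

INITIAL, FREQUENCY, SYMMETRY, TRANSITIVITY = 0.5, 1.0, 0.5, 0.3

class LinkNetwork:
    def __init__(self):
        # unseen links implicitly carry the initial weight
        self.w = defaultdict(lambda: INITIAL)

    def traverse(self, a, b):
        """A user followed the link a -> b."""
        # collect b's current out-neighbors before mutating the dict
        successors = [y for (x, y) in list(self.w) if x == b and y != a]
        self.w[(a, b)] += FREQUENCY     # reinforce the followed link
        self.w[(b, a)] += SYMMETRY      # weaker reverse reinforcement
        for c in successors:            # transitive reinforcement a -> c
            self.w[(a, c)] += TRANSITIVITY

net = LinkNetwork()
net.traverse("a", "b")   # w[(a,b)] == 1.5, w[(b,a)] == 1.0
```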

    14. Respondents • August 2004 - October 2004 • 160 respondents • self-identify at the beginning; exit survey at the end • 1200 bucket-to-bucket traversals (7.5 average traversals per session)

    15. How to Evaluate the Resulting Network? • Compute network analysis metrics: • PageRank • Degree Centrality • Weighted Degree Centrality • Compare the results to: • Other “expert” lists (VH1, DigitalDreamDoor, original Spin Magazine list) • Artist / LP best seller according to RIAA • Artist / LP Amazon sales rank
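The two simpler metrics from the list above can be illustrated on a toy weighted, directed network; the link dictionary and function names are made up for the example, and PageRank is omitted for brevity:

```python
# Degree centrality counts a node's out-links; the weighted variant sums
# their weights. Toy data, not from the experiment.
from collections import defaultdict

links = {("a", "b"): 1.5, ("a", "c"): 0.8, ("b", "a"): 1.0}

def degree_centrality(links):
    deg = defaultdict(int)
    for (src, _dst) in links:
        deg[src] += 1
    return dict(deg)

def weighted_degree_centrality(links):
    deg = defaultdict(float)
    for (src, _dst), wt in links.items():
        deg[src] += wt
    return dict(deg)

print(degree_centrality(links))   # {'a': 2, 'b': 1}
```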

    16. Expert Rankings • No correlation with: • VH1 artist list • DigitalDreamDoor list • original Spin Magazine list (!) (critics don’t agree with each other, or with the record-buying public)

    17. RIAA Results • RIAA best-seller data covered • only 51/800 LPs • only 14/50 artists (critics don’t buy records!) *RIAA sales caveat • Figure 6. Probability of albums being best-sellers. Figure 7. Probability of artists being best-sellers.

    18. Amazon Sales Rank • No correlation with individual LP sales rank… • …but correlated with mean artist sales rank • similar to RIAA data • interpretation: popular artists often have obscure LPs

    19. Relatedness(?)

    20. Relatedness(?)

    21. Lessons Learned • While the subject matter was interesting, it was oriented toward music geeks • i.e., no actual music was delivered to the users (intellectual property considerations) • more traversals needed • Random initial starting points were difficult to overcome • “cold start problem” - pre-seed the links according to some criteria? • weights did not decay over time/traversals • Choosing only artists from Spin Magazine may have pre-filtered the respondents • choose artists from Down Beat (Jazz), Vibe (Urban), Music City News (Country), etc.

    22. Conclusions • Can build a network of smart objects featuring adaptive, hierarchical links constructed in real-time without central state • network is created without latency, with computations amortized over individual accesses • Experimental testbed with popular-music LP metadata shown to approximate the sales ranks of artists, but not of individual LPs