
Distributed, Real-Time Computation of Community Preferences




Presentation Transcript


  1. Distributed, Real-Time Computation of Community Preferences
  Thomas Lutkenhouse, Michael L. Nelson, Johan Bollen
  Old Dominion University, Computer Science Department, Norfolk, VA 23529 USA
  {lutken,mln,jbollen}@cs.odu.edu
  HT 2005 - Sixteenth ACM Conference on Hypertext and Hypermedia, 6-9 September 2005, Salzburg, Austria

  2. Distributed, Real-Time Computation of Community Preferences
  • not CS if you don’t compute
  • changes are immediate
  • no central state
  • not personalization

  3. Outline
  • Review of technologies
    • buckets
    • Hebbian learning
    • previous results
  • Experiment design
  • Results
  • Lessons learned
  • Conclusions

  4. Non-evolution of DL Objects . . .
  [figure; visible labels: SRW, RSS, !?]

  5. Buckets
  • Premise: repositories come and go, but the objects should endure
  • Began as part of NASA DL research
    • focus on digital preservation
  • Implementation of the “Smart Objects, Dumb Archives” (SODA) model for digital libraries
    • CACM 2001, doi.acm.org/10.1145/374308.374342
    • D-Lib, dx.doi.org/10.1045/february2001-nelson

  6. Smart Objects
  • Responsibilities generally associated with the repository are “pushed down” into the stored object
    • T&C, maintenance, logging, pagination & display, etc.
  • Aggregate:
    • metadata
    • data
    • methods to operate on the metadata/data
  • API examples (see the sketch below):
    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=getMetadata&type=all
    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listMethods
    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listPreference
    • (cheat) http://www.cs.odu.edu/~mln/teaching/cs595-f03/bucket/bucket.xml
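The bucket API above is plain HTTP GET with a method parameter. A minimal Python sketch of invoking it, assuming only the URL pattern shown on the slide (the call_bucket helper is illustrative, not part of the bucket API):

    # Sketch: invoking a bucket's methods over plain HTTP GET, assuming only
    # the ?method=...&param=... URL pattern shown above. The call_bucket
    # helper is illustrative, not part of the bucket API.
    from urllib.parse import urlencode
    from urllib.request import urlopen

    def call_bucket(bucket_url, method, **params):
        """Invoke a named method on a bucket and return the raw response body."""
        query = urlencode({"method": method, **params})
        with urlopen(f"{bucket_url}?{query}") as resp:
            return resp.read().decode("utf-8", errors="replace")

    bucket = "http://www.cs.odu.edu/~mln/teaching/cs595-f03/"
    print(call_bucket(bucket, "listMethods"))             # enumerate available methods
    print(call_bucket(bucket, "getMetadata", type="all")) # fetch all metadata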

  7. Examples
  • 1.6.X bucket
    • http://ntrs.nasa.gov/
    • http://www.cs.odu.edu/~mln/phd/
  • 2.0 buckets
    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/
    • http://www.cs.odu.edu/~lutken/bucket/
  • 3.0 buckets (under development)
    • http://beaufort.cs.odu.edu:8080/
    • uses MPEG-21 DIDLs
    • cf. http://www.dlib.org/dlib/november03/bekaert/11bekaert.html

  8. Hebbian Learning
  Implementation issues:
  - gather log files
    - problematic when spread across servers/domains
  - determine a T for session reconstruction
    - typically 5 min
  - compute links & weights
  - update the network periodically
    - typically monthly
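A sketch of the batch, log-based pipeline this slide describes: reconstruct sessions from access logs using the timeout T, then strengthen a link between consecutively requested documents. The log-record layout and the unit weight increment are assumptions for illustration:

    # Sketch: batch Hebbian link computation from access logs, as described
    # above. Log-record layout and the unit increment are assumptions.
    from collections import defaultdict

    SESSION_TIMEOUT = 5 * 60  # T = 5 minutes, per the slide

    def hebbian_links(records):
        """records: iterable of (user_id, timestamp_seconds, doc_id), time-ordered.
        Returns {(doc_a, doc_b): weight} built from within-session transitions."""
        last_seen = {}                      # user -> (timestamp, doc)
        weights = defaultdict(float)
        for user, ts, doc in records:
            if user in last_seen:
                prev_ts, prev_doc = last_seen[user]
                if ts - prev_ts <= SESSION_TIMEOUT and prev_doc != doc:
                    weights[(prev_doc, doc)] += 1.0   # co-access strengthens link
            last_seen[user] = (ts, doc)
        return dict(weights)

    log = [("u1", 0, "a"), ("u1", 60, "b"), ("u1", 7200, "c")]  # gap > T breaks session
    print(hebbian_links(log))   # {('a', 'b'): 1.0}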

  9. Previous, Log-Based Recommendation Implementations
  • LANL Journal Recommendations
    • collection analysis based on journal readership patterns
    • D-Lib Magazine, dx.doi.org/10.1045/june2002-bollen
  • NASA Technical Report Server
    • compared recommendations with those generated by VSM (vector space model)
    • WIDM 2004, doi.acm.org/1031453.1031480
  • Open Video Project
    • generated recommendations for videos (little descriptive metadata)
    • JCDL 2005, doi.acm.org/1065385.1065472

  10. Hebbian Learning with Bucket Methods

  http://b?method=display&referer=http://b&redirect=http://a?method=display%26redirect=http://c?method=display%26referer=http://b

  http://a?method=display&referer=http://a&redirect=http://b?method=display%26referer=http://a
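One reading of the URL chains above: when a user at bucket A follows a link to bucket B, A redirects to B's display method with referer=A (nested '&'s escaped as %26), and B strengthens the incoming A→B link on arrival. A Python sketch under that interpretation, with weight values borrowed from slide 13; this is not the authors' exact code:

    # One reading of the chained display URLs above: when the user at bucket A
    # follows a link to bucket B, A redirects to B's display method with
    # referer=A so B can strengthen the incoming A->B link. Weight values
    # (initial 0.5, frequency 1.0) are borrowed from slide 13; the storage and
    # handler below are assumptions.
    from urllib.parse import quote

    def traversal_url(next_bucket, referer):
        """Display `next_bucket` while crediting the link from `referer`.
        Returns the plain URL and an escaped form (& -> %26) safe to nest
        inside another URL's redirect parameter."""
        plain = f"{next_bucket}?method=display&referer={referer}"
        return plain, quote(plain, safe=":/?=")

    weights = {}

    def on_display(bucket, referer):
        """What a bucket might do on arrival: bump the incoming link's weight."""
        if referer and referer != bucket:
            weights[(referer, bucket)] = weights.get((referer, bucket), 0.5) + 1.0

    plain, nested = traversal_url("http://b", referer="http://a")
    print(plain)   # http://b?method=display&referer=http://a
    print(nested)  # http://b?method=display%26referer=http://a
    on_display("http://b", "http://a")
    print(weights) # {('http://a', 'http://b'): 1.5}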

  11. Experiment
  • Spin Magazine’s “Top 50 Rock Bands of All Time”
    • something other than reports, journals, etc.
  • Harvest allmusic.com for metadata for all LPs by the 50 bands (total = 800 LPs)
  • Maintain hierarchical arrangement
    • 1 artist → N albums
  • Initialize the network of 800 LPs with each LP randomly linked to 5 other LPs
  • Send out email invitations to browse the network
    • have them explore, and then examine the resulting network
    • users not informed about the workings of the network
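A sketch of the network initialization described above: each of the 800 LPs is randomly linked to 5 others; the 0.5 starting weight is taken from slide 13:

    # Sketch: initializing the LP network as the slide describes -- each of
    # the 800 LPs starts with links to 5 other LPs chosen at random, at the
    # initial weight of 0.5 (the initial weight is taken from slide 13).
    import random

    def init_network(lp_ids, out_degree=5, initial_weight=0.5):
        """Return {lp: {linked_lp: weight}} with `out_degree` random links per LP."""
        links = {}
        for lp in lp_ids:
            others = [x for x in lp_ids if x != lp]
            links[lp] = {t: initial_weight for t in random.sample(others, out_degree)}
        return links

    network = init_network([f"lp{i}" for i in range(800)])
    print(len(network["lp0"]))  # 5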

  12. Display of LPs

  13. Hierarchical, Weighted Links

    <structural>
      <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/121/">
        <metadata>
          <descriptive>
            <title>Terrapin Station, Capital Centre, Landover, MD, 3/15/90</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/11/">
        <metadata>
          <descriptive>
            <title>Jealousy/Progress</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      <element wt="3" id="~http://www.cs.odu.edu/~lutken/bucket/434/">
        <metadata>
          <descriptive>
            <title>Nevermind</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/130/">
        <metadata>
          <descriptive>
            <title>Technical Ecstasy</title>
          </descriptive>
          <administrative/>
        </metadata>
      </element>
      ...

  weights:
  - initial: 0.5
  - frequency: 1.0
  - symmetry: 0.5
  - transitivity: 0.3
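A sketch of how the four weight rules might compose on a single traversal from LP a to LP b: a→b gains the frequency increment, the reverse link b→a gains the symmetry increment, and b's outgoing links propagate back as a→c with the transitivity increment. The composition is inferred from the slide, so treat it as an interpretation:

    # Sketch: one traversal a -> b under the weight rules on this slide
    # (initial 0.5, frequency 1.0, symmetry 0.5, transitivity 0.3). How the
    # rules compose is inferred from the slide, not taken from the paper.
    INITIAL, FREQUENCY, SYMMETRY, TRANSITIVITY = 0.5, 1.0, 0.5, 0.3

    def bump(links, src, dst, amount):
        """Strengthen src->dst, creating it at the initial weight if absent."""
        links.setdefault(src, {})
        links[src][dst] = links[src].get(dst, INITIAL) + amount

    def traverse(links, a, b):
        bump(links, a, b, FREQUENCY)              # frequency: reinforce a->b
        bump(links, b, a, SYMMETRY)               # symmetry: reinforce b->a
        for c in list(links.get(b, {})):          # transitivity: a->c for b's links
            if c not in (a, b):
                bump(links, a, c, TRANSITIVITY)

    links = {"b": {"c": 0.5}}
    traverse(links, "a", "b")
    print(links)  # {'b': {'c': 0.5, 'a': 1.0}, 'a': {'b': 1.5, 'c': 0.8}}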

  14. Respondents
  • August 2004 - October 2004
  • 160 respondents
    • self-identify at the beginning; exit survey at the end
  • 1200 bucket-to-bucket traversals (7.5 average traversals per session)

  15. How to Evaluate the Resulting Network?
  • Compute network analysis metrics:
    • PageRank
    • Degree Centrality
    • Weighted Degree Centrality
  • Compare the results to:
    • other “expert” lists (VH1, DigitalDreamDoor, original Spin Magazine list)
    • artist / LP best sellers according to RIAA
    • artist / LP Amazon sales rank
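A sketch of the three network metrics named above, computed with networkx (a tool choice of this writeup, not something the slides specify):

    # Sketch: the three network metrics named above, computed with networkx
    # over a toy weighted, directed link graph.
    import networkx as nx

    G = nx.DiGraph()
    G.add_weighted_edges_from([("a", "b", 1.5), ("b", "a", 1.0),
                               ("a", "c", 0.8), ("b", "c", 0.5)])

    pagerank = nx.pagerank(G, weight="weight")          # PageRank over link weights
    degree = nx.degree_centrality(G)                    # unweighted degree centrality
    weighted_degree = dict(G.degree(weight="weight"))   # weighted degree (sum of weights)

    for lp in G:
        print(lp, round(pagerank[lp], 3), round(degree[lp], 3), weighted_degree[lp])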

  16. Expert Rankings
  • No correlation with:
    • VH1 artist list
    • DigitalDreamDoor list
    • original Spin Magazine list (!)
  (critics don’t agree with each other, or with the record-buying public)

  17. RIAA Results
  • RIAA best-seller data covered only:
    • 51/800 LPs
    • 14/50 artists
  (critics don’t buy records!)
  *RIAA sales caveat
  [Figure 6: Probability of albums being best-sellers. Figure 7: Probability of artists being best-sellers.]

  18. Amazon Sales Rank
  • No correlation with individual LP sales rank…
  • …but correlated with mean artist sales rank
    • similar to RIAA data
    • interpretation: popular artists often have obscure LPs
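A sketch of the per-LP vs. per-artist comparison described above, using Spearman rank correlation; the statistic is an assumption (the slide does not name one) and the data rows are made-up placeholders:

    # Sketch: rank correlation between a network metric and Amazon sales rank,
    # per-LP vs. averaged per artist. Spearman's rho is an assumption (the
    # slide names no statistic) and the rows below are made-up placeholders.
    from statistics import mean
    from scipy.stats import spearmanr

    # hypothetical (lp, artist, network_score, amazon_sales_rank) rows
    rows = [("lp1", "A", 0.9, 120), ("lp2", "A", 0.2, 90000),
            ("lp3", "B", 0.6, 800), ("lp4", "B", 0.1, 50000),
            ("lp5", "C", 0.3, 15000), ("lp6", "C", 0.4, 30000)]

    # per-LP: score vs. rank directly
    rho_lp, _ = spearmanr([r[2] for r in rows], [r[3] for r in rows])

    # per-artist: mean score vs. mean rank
    artists = sorted({r[1] for r in rows})
    rho_artist, _ = spearmanr(
        [mean(r[2] for r in rows if r[1] == a) for a in artists],
        [mean(r[3] for r in rows if r[1] == a) for a in artists])

    print(rho_lp, rho_artist)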

  19. Relatedness(?)

  20. Relatedness(?)

  21. Lessons Learned
  • While the subject matter was interesting, it was oriented toward music geeks
    • i.e., no actual music was delivered to the users (intellectual property considerations)
    • more traversals needed
  • Random initial starting points were difficult to overcome
    • “cold start problem” - pre-seed the links according to some criteria?
    • weights did not decay over time/traversals
  • Choosing only artists from Spin Magazine may have pre-filtered the respondents
    • choose artists from Down Beat (Jazz), Vibe (Urban), Music City News (Country), etc.

  22. Conclusions
  • Can build a network of smart objects featuring adaptive, hierarchical links constructed in real time, without central state
    • the network is created without latency, with computation amortized over individual accesses
  • Experimental testbed with popular-music LP metadata shown to correlate with the sales rank of artists, not of individual LPs
