Distributed, Real-Time Computation of Community Preferences

Distributed, Real-Time Computation of Community Preferences

Thomas Lutkenhouse, Michael L. Nelson, Johan Bollen

Old Dominion University
Computer Science Department
Norfolk, VA 23529 USA
{lutken,mln,jbollen}@cs.odu.edu


HT 2005 - Sixteenth ACM Conference on Hypertext and Hypermedia

6-9 September 2005, Salzburg, Austria

Distributed, Real-Time Computation of Community Preferences

  • not CS if you don’t compute

  • changes are immediate

  • no central state

  • not personalization

Outline

  • Review of technologies

    • buckets

    • Hebbian learning

    • previous results

  • Experiment design

  • Results

  • Lessons learned

  • Conclusions

Non-evolution of DL Objects

Buckets

  • Premise: repositories come and go, but the objects should endure

  • Began as part of NASA DL research

    • focus on digital preservation

    • implementation of the “Smart Objects, Dumb Archives” (SODA) model for digital libraries

      • CACM 2001, doi.acm.org/10.1145/374308.374342

      • D-Lib, dx.doi.org/10.1045/february2001-nelson

Smart Objects

  • Responsibilities generally associated with the repository are “pushed down” into the stored object

    • T&C, maintenance, logging, pagination & display, etc…

  • Aggregate:

    • metadata

    • data

    • methods to operate on the metadata/data

  • API examples

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=getMetadata&type=all

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listMethods

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listPreference

    • (cheat) http://www.cs.odu.edu/~mln/teaching/cs595-f03/bucket/bucket.xml
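A bucket method call is just an HTTP GET with a `method` query parameter, so a client needs nothing beyond URL construction. A minimal sketch of building such request URLs (the `bucket_request` helper is hypothetical, not part of the bucket API):

```python
from urllib.parse import urlencode

def bucket_request(bucket_url, method, **params):
    """Build the GET URL that invokes a method on a smart-object bucket.

    The bucket itself dispatches on the `method` query parameter; any
    further parameters (e.g. type=all) are passed through.
    """
    query = {"method": method, **params}
    return f"{bucket_url}?{urlencode(query)}"

# reproduces the first API example above
url = bucket_request("http://www.cs.odu.edu/~mln/teaching/cs595-f03/",
                     "getMetadata", type="all")
```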

Examples

  • 1.6.X bucket

    • http://ntrs.nasa.gov/

    • http://www.cs.odu.edu/~mln/phd/

  • 2.0 buckets

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/

    • http://www.cs.odu.edu/~lutken/bucket/

  • 3.0 buckets (under development)

    • http://beaufort.cs.odu.edu:8080/

    • uses MPEG-21 DIDLs

      • cf. http://www.dlib.org/dlib/november03/bekaert/11bekaert.html

Hebbian Learning

Implementation issues:

  • gather log files

    • problematic when spread across servers/domains

  • determine a T for session reconstruction

    • typically 5 min

  • compute links & weights

  • update the network periodically

    • typically monthly
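The batch pipeline above can be illustrated in a few lines. This is a sketch under an assumed log format (per-user, time-ordered `(timestamp, doc_id)` pairs), not the authors' implementation:

```python
from collections import defaultdict

SESSION_TIMEOUT = 5 * 60  # seconds; the "typical T" of 5 min

def reconstruct_sessions(log_entries):
    """Split one user's time-ordered (timestamp, doc_id) entries into
    sessions: a gap longer than SESSION_TIMEOUT starts a new session."""
    sessions, current, last_ts = [], [], None
    for ts, doc in log_entries:
        if last_ts is not None and ts - last_ts > SESSION_TIMEOUT:
            sessions.append(current)
            current = []
        current.append(doc)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions

def hebbian_weights(sessions, increment=1.0):
    """Strengthen the link a -> b each time b is retrieved right after a."""
    weights = defaultdict(float)
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:
                weights[(a, b)] += increment
    return weights

entries = [(0, "d1"), (60, "d2"), (120, "d3"), (1000, "d2"), (1030, "d4")]
sessions = reconstruct_sessions(entries)  # the 880 s gap splits the log
w = hebbian_weights(sessions)
```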

Previous, Log-Based Recommendation Implementations

  • LANL Journal Recommendations

    • collection analysis based on journal readership patterns

      • D-Lib Magazine, dx.doi.org/10.1045/june2002-bollen

  • NASA Technical Report Server

    • compared recommendations with those generated by VSM

      • WIDM 2004, doi.acm.org/10.1145/1031453.1031480

  • Open Video Project

    • generated recommendations for videos (little descriptive metadata)

      • JCDL 2005, doi.acm.org/10.1145/1065385.1065472

Hebbian Learning with Bucket Methods
Experiment

  • Spin Magazine’s “Top 50 Rock Bands of All Time”

    • something other than reports, journals, etc.

    • harvest allmusic.com metadata for all LPs by the 50 bands (total = 800 LPs)

  • Maintain hierarchical arrangement

    • 1 artist → N albums

  • Initialize the network of 800 LPs with each LP randomly linked to 5 other LPs

  • Send out email invitations to browse the network

    • have them explore, and then examine the resulting network

    • users not informed about the workings of the network
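The random initialization described above might look like this (a sketch; the integer item identifiers and dict-of-dicts structure are assumptions):

```python
import random

def init_network(n_items=800, links_per_item=5, initial_weight=0.5):
    """Link each LP to `links_per_item` randomly chosen other LPs,
    all starting at the same initial weight."""
    network = {}
    for item in range(n_items):
        others = [i for i in range(n_items) if i != item]
        targets = random.sample(others, links_per_item)
        network[item] = {t: initial_weight for t in targets}
    return network

net = init_network()
```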

Hierarchical, Weighted Links

<element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/121/">
  <title>Terrapin Station, Capital Centre, Landover, MD, 3/15/90</title>
</element>

<element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/11/">
  . . .
</element>

<element wt="3" id="~http://www.cs.odu.edu/~lutken/bucket/434/">
  . . .
</element>

<element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/130/">
  <title>Technical Ecstasy</title>
</element>

Link-weight reinforcement values:

- initial: 0.5

- frequency: 1.0

- symmetry: 0.5

- transitivity: 0.3
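One plausible reading of these four values as a per-traversal update rule; this is an interpretation of the slide, not the paper's exact algorithm:

```python
INITIAL, FREQUENCY, SYMMETRY, TRANSITIVITY = 0.5, 1.0, 0.5, 0.3

def record_traversal(weights, prev, cur):
    """Apply the reinforcement values when a user follows prev -> cur.

    `weights` maps (source, target) -> link weight; absent links are
    created at INITIAL before being reinforced.
    """
    # frequency: the traversed link itself is strengthened the most
    weights[(prev, cur)] = weights.get((prev, cur), INITIAL) + FREQUENCY
    # symmetry: the reverse link gets a smaller boost
    weights[(cur, prev)] = weights.get((cur, prev), INITIAL) + SYMMETRY
    # transitivity: nodes that link to prev gain a weak link to cur
    for (src, dst), w in list(weights.items()):
        if dst == prev and src not in (prev, cur):
            weights[(src, cur)] = weights.get((src, cur), INITIAL) + TRANSITIVITY
    return weights
```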



  • August 2004 - October 2004

  • 160 respondents

    • self-identify at the beginning; exit survey at the end

    • 1200 bucket-to-bucket traversals (7.5 average traversals per session)

How to Evaluate the Resulting Network?

  • Compute network analysis metrics:

    • PageRank

    • Degree Centrality

    • Weighted Degree Centrality

  • Compare the results to:

    • Other “expert” lists (VH1, DigitalDreamDoor, original Spin Magazine list)

    • Artist / LP best seller according to RIAA

    • Artist / LP Amazon sales rank
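Weighted degree centrality, the simplest of the three metrics, can be sketched directly over the link-weight map (the `(source, target)` tuple representation is an assumption carried over from the traversal sketches):

```python
def weighted_degree_centrality(weights):
    """Sum incoming link weights per node; return nodes ranked high-to-low.

    `weights` maps (source, target) -> weight, as accumulated during
    user traversals of the network.
    """
    score = {}
    for (src, dst), w in weights.items():
        score[dst] = score.get(dst, 0.0) + w
    return sorted(score, key=score.get, reverse=True)

# "b" accumulates in-weight 3.0, "c" only 0.5
ranking = weighted_degree_centrality({("a", "b"): 2.0, ("c", "b"): 1.0,
                                      ("b", "c"): 0.5})
```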

Expert Rankings

  • No correlation with:

    • VH1 artist list

    • DigitalDreamDoor list

    • original Spin Magazine list (!)

      (critics don’t agree with each other, or with the record-buying public)

RIAA Results

  • RIAA best-seller data covered only:

    • 51/800 LPs

    • 14/50 artists

      (critics don’t buy records!)

*RIAA sales caveat

Figure 6. Probability of albums being best-sellers.
Figure 7. Probability of artists being best-sellers.

Amazon Sales Rank

  • No correlation with individual LP sales rank…

  • …but correlated with mean artist sales rank

    • similar to RIAA data

    • interpretation: popular artists often have obscure LPs
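The artist-level comparison amounts to averaging each artist's LP sales ranks, then rank-correlating the result with the network metric. A sketch using Spearman's rho without tie handling (a simplification; the helper names and data shape are hypothetical):

```python
def mean_artist_rank(lp_sales_ranks):
    """Average the sales ranks of each artist's LPs.

    `lp_sales_ranks` maps artist -> list of per-LP sales ranks.
    """
    return {artist: sum(ranks) / len(ranks)
            for artist, ranks in lp_sales_ranks.items()}

def spearman_rho(xs, ys):
    """Spearman rank correlation between two equal-length lists (no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, 1):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```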

Lessons Learned

  • While the subject matter was interesting, it was oriented toward music geeks

    • i.e., no actual music was delivered to the users (intellectual property considerations)

    • more traversals needed

  • Random initial starting points were difficult to overcome

    • “cold start problem” - pre-seed the links according to some criteria?

    • weights did not decay over time/traversals

  • Choosing only artists from Spin Magazine may have pre-filtered the response

    • choose artists from Down Beat (Jazz), Vibe (Urban), Music City News (Country), etc.

Conclusions

  • Can build a network of smart objects featuring adaptive, hierarchical links constructed in real time without central state

    • network is created without latency, with computation amortized over individual accesses

  • Experimental testbed with popular-music LP metadata shown to correlate with the sales rank of artists, not individual LPs