Distributed, Real-Time Computation of Community Preferences
Distributed, Real-Time Computation of Community Preferences. Thomas Lutkenhouse, Michael L. Nelson, Johan Bollen. Old Dominion University, Computer Science Department, Norfolk, VA 23529 USA. {lutken,mln,[email protected] HT 2005 - Sixteenth ACM Conference on Hypertext and Hypermedia.


Distributed, Real-Time Computation of Community Preferences

Thomas Lutkenhouse, Michael L. Nelson, Johan Bollen

Old Dominion University, Computer Science Department, Norfolk, VA 23529 USA

{lutken,mln,[email protected]

HT 2005 - Sixteenth ACM Conference on Hypertext and Hypermedia

6-9 Sept. 2005, Salzburg, Austria


Distributed, Real-Time Computation of Community Preferences

  • no central state

  • changes are immediate

  • not CS if you don’t compute

  • not personalization


Outline

  • Review of technologies

    • buckets

    • Hebbian learning

    • previous results

  • Experiment design

  • Results

  • Lessons learned

  • Conclusions


Non-evolution of DL Objects

[Figure; recoverable labels: SRW, RSS, "!?"]


Buckets

  • Premise: repositories come and go, but the objects should endure

  • Began as part of NASA DL research

    • focus on digital preservation

    • implementation of the “Smart Objects, Dumb Archives” (SODA) model for digital libraries

      • CACM 2001, doi.acm.org/10.1145/374308.374342

      • D-Lib, dx.doi.org/10.1045/february2001-nelson


Smart Objects

  • Responsibilities generally associated with the repository are “pushed down” into the stored object

    • terms and conditions (T&C), maintenance, logging, pagination & display, etc.

  • Aggregate:

    • metadata

    • data

    • methods to operate on the metadata/data

  • API examples

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=getMetadata&type=all

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listMethods

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/?method=listPreference

    • (cheat) http://www.cs.odu.edu/~mln/teaching/cs595-f03/bucket/bucket.xml
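The API examples above all follow one pattern: a bucket URL plus a `method` query argument and optional extra arguments. A minimal sketch of building and parsing such calls follows; the helper names `bucket_method_url` and `parse_method_call` are hypothetical and not part of the bucket software.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def bucket_method_url(bucket_url, method, **args):
    """Build a bucket method call as an ordinary HTTP GET URL."""
    query = urlencode({"method": method, **args})
    return f"{bucket_url}?{query}"

def parse_method_call(url):
    """Split a bucket URL back into (method name, remaining arguments)."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    method = params.pop("method", None)
    return method, params

# Bucket URL taken from the slide; the helpers are illustrative only.
BUCKET = "http://www.cs.odu.edu/~mln/teaching/cs595-f03/"
url = bucket_method_url(BUCKET, "getMetadata", type="all")
print(url)
print(parse_method_call(url))
```

Because every method is a plain GET request, any HTTP client or browser can invoke a bucket without a repository-specific protocol.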


Examples

  • 1.6.X buckets

    • http://ntrs.nasa.gov/

    • http://www.cs.odu.edu/~mln/phd/

  • 2.0 buckets

    • http://www.cs.odu.edu/~mln/teaching/cs595-f03/

    • http://www.cs.odu.edu/~lutken/bucket/

  • 3.0 buckets (under development)

    • http://beaufort.cs.odu.edu:8080/

    • uses MPEG-21 DIDLs

      • cf. http://www.dlib.org/dlib/november03/bekaert/11bekaert.html


Hebbian Learning

Implementation issues:

  • gather log files

    • problematic when spread across servers/domains

  • determine a time threshold T for session reconstruction

    • typically 5 min

  • compute links & weights

  • update the network periodically

    • typically monthly
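The log-based steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the log is a list of `(user, timestamp_in_seconds, document)` tuples, a session ends after more than T seconds of inactivity, and a link weight is simply the count of consecutive document pairs. The function names are hypothetical.

```python
from collections import defaultdict

SESSION_GAP = 5 * 60  # T in seconds; the slide gives 5 minutes as typical

def reconstruct_sessions(log, gap=SESSION_GAP):
    """Group (user, timestamp, document) entries into per-user sessions."""
    sessions = []
    last_user, last_ts = None, None
    for user, ts, doc in sorted(log):
        # A new user, or an idle period longer than the gap, starts a session.
        if user != last_user or ts - last_ts > gap:
            sessions.append([])
        sessions[-1].append(doc)
        last_user, last_ts = user, ts
    return sessions

def link_weights(sessions):
    """Count consecutive document pairs across all sessions."""
    weights = defaultdict(float)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:
                weights[(src, dst)] += 1.0
    return dict(weights)

# Made-up log entries for illustration.
log = [
    ("u1", 0, "a"), ("u1", 60, "b"), ("u1", 1000, "c"),
    ("u2", 10, "a"), ("u2", 100, "b"),
]
sessions = reconstruct_sessions(log)
weights = link_weights(sessions)
print(sessions)
print(weights)
```

The sketch makes the drawbacks visible: all logs must be collected in one place first, and the network only changes when the batch job is rerun.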


Previous, Log-Based Recommendation Implementations

  • LANL Journal Recommendations

    • collection analysis based on journal readership patterns

      • D-Lib Magazine, dx.doi.org/10.1045/june2002-bollen

  • NASA Technical Report Server

    • compared recommendations with those generated by VSM

      • WIDM 2004, doi.acm.org/10.1145/1031453.1031480

  • Open Video Project

    • generated recommendations for videos (little descriptive metadata)

      • JCDL 2005, doi.acm.org/10.1145/1065385.1065472


Hebbian Learning with Bucket Methods

http://b?method=display&referer=http://b&redirect=http://a?method=display%26redirect=http://c?method=display%26referer=http://b

http://a?method=display&referer=http://a&redirect=http://b?method=display%26referer=http://a
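One reading of the two URLs above is that a link shown on a bucket's display page points back at that same bucket, which then redirects onward; when a previous bucket is known, the redirect passes through it before reaching the target. The sketch below reproduces both URLs under that reading. The function names are hypothetical, and encoding only `&` as `%26` mirrors what the slide shows rather than a documented rule.

```python
from urllib.parse import quote

def hop(bucket, **params):
    """Plain display URL for a bucket with extra query parameters."""
    query = "".join(f"&{k}={v}" for k, v in params.items())
    return f"{bucket}?method=display{query}"

def traversal_link(current, target, previous=None):
    """Link shown on `current` that leads the user to `target`.

    Assumption: the request goes to `current` first, then is redirected
    (via `previous`, if any) so each bucket involved sees the traversal.
    """
    if previous is None:
        onward = hop(target, referer=current)
    else:
        onward = hop(previous, redirect=hop(target), referer=current)
    # Keep ':', '/', '?', '=' readable; '&' becomes %26 as on the slide.
    encoded = quote(onward, safe=":/?=")
    return f"{current}?method=display&referer={current}&redirect={encoded}"

print(traversal_link("http://a", "http://b"))
print(traversal_link("http://b", "http://c", previous="http://a"))
```

Under this interpretation no log files are needed: each bucket observes the traversals that concern it as they happen.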


Experiment

  • Spin Magazine’s “Top 50 Rock Bands of All Time”

    • something other than reports, journals, etc.

    • harvest allmusic.com for metadata for all LPs by the 50 bands (total = 800 LPs)

  • Maintain hierarchical arrangement

    • 1 artist → N albums

  • Initialize the network of 800 LPs with each LP randomly linked to 5 other LPs

  • Send out email invitations to browse the network

    • have them explore, and then examine the resulting network

    • users not informed about the workings of the network
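The random initialization described above might look like the following sketch. It assumes integer LP identifiers and uses the initial link weight of 0.5 given on the weights slide; `init_network` and its parameters are hypothetical names.

```python
import random

def init_network(lp_ids, links_per_lp=5, initial_weight=0.5, seed=None):
    """Link every LP to `links_per_lp` distinct, randomly chosen other LPs."""
    rng = random.Random(seed)
    network = {}
    for lp in lp_ids:
        others = [x for x in lp_ids if x != lp]  # no self-links
        targets = rng.sample(others, links_per_lp)
        network[lp] = {t: initial_weight for t in targets}
    return network

# 800 LPs as in the experiment; the seed only makes the sketch repeatable.
network = init_network(list(range(800)), seed=42)
print(len(network), "LPs,", sum(len(v) for v in network.values()), "links")
```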



Hierarchical, Weighted Links

<structural>
  <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/121/">
    <metadata>
      <descriptive>
        <title>Terrapin Station, Capital Centre, Landover, MD, 3/15/90</title>
      </descriptive>
      <administrative/>
    </metadata>
  </element>
  <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/11/">
    <metadata>
      <descriptive>
        <title>Jealousy/Progress</title>
      </descriptive>
      <administrative/>
    </metadata>
  </element>
  <element wt="3" id="~http://www.cs.odu.edu/~lutken/bucket/434/">
    <metadata>
      <descriptive>
        <title>Nevermind</title>
      </descriptive>
      <administrative/>
    </metadata>
  </element>
  <element wt="0.5" id="~http://www.cs.odu.edu/~lutken/bucket/130/">
    <metadata>
      <descriptive>
        <title>Technical Ecstasy</title>
      </descriptive>
      <administrative/>
    </metadata>
  </element>
  …

weights

  • initial: 0.5

  • frequency: 1.0

  • symmetry: 0.5

  • transitivity: 0.3
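A minimal sketch of how the four weights above could be applied on each traversal. The slide gives only the values; the update rule here is an assumption based on the usual frequency/symmetry/transitivity form of Hebbian link learning: a traversal from the current bucket to the target reinforces that link, the reverse link gets the symmetry increment, and the link from the previously visited bucket to the target gets the transitivity increment. A single dictionary stands in for link tables that would be stored inside the individual buckets.

```python
from collections import defaultdict

# Values from the slide; how they are combined below is an assumption.
INITIAL, FREQUENCY, SYMMETRY, TRANSITIVITY = 0.5, 1.0, 0.5, 0.3

def reinforce(weights, current, target, previous=None):
    """Apply one traversal current -> target, optionally reached via previous."""
    weights[current][target] += FREQUENCY      # the link that was followed
    weights[target][current] += SYMMETRY       # the reverse link
    if previous is not None and previous != target:
        weights[previous][target] += TRANSITIVITY  # previous -> current -> target

weights = defaultdict(lambda: defaultdict(float))
weights["a"]["b"] = INITIAL          # a randomly seeded link
reinforce(weights, "a", "b")                 # user goes a -> b
reinforce(weights, "b", "c", previous="a")   # then b -> c
print({src: dict(links) for src, links in weights.items()})
```

Each update touches at most three buckets, which is what lets the computation be spread over individual accesses instead of a periodic batch run.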



Respondents

  • August 2004 - October 2004

  • 160 respondents

    • self-identify at the beginning; exit survey at the end

    • 1200 bucket-to-bucket traversals (average of 7.5 traversals per session)


How to Evaluate the Resulting Network?

  • Compute network analysis metrics:

    • PageRank

    • Degree Centrality

    • Weighted Degree Centrality

  • Compare the results to:

    • Other “expert” lists (VH1, DigitalDreamDoor, original Spin Magazine list)

    • Artist / LP best-seller status according to RIAA

    • Artist / LP Amazon sales rank
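The three network metrics named above can be sketched in plain Python on a weighted, directed graph stored as `{source: {target: weight}}`. Several choices here are assumptions, since the slide does not specify them: degree is taken as in-degree, PageRank follows links in proportion to their weight, and the damping factor is the commonly used 0.85.

```python
def nodes_of(graph):
    """All nodes that appear as a source or a target."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    return sorted(nodes)

def degree_centrality(graph):
    """Number of incoming links per node (in-degree is an assumption)."""
    degree = {v: 0 for v in nodes_of(graph)}
    for targets in graph.values():
        for dst in targets:
            degree[dst] += 1
    return degree

def weighted_degree_centrality(graph):
    """Sum of incoming link weights per node."""
    degree = {v: 0.0 for v in nodes_of(graph)}
    for targets in graph.values():
        for dst, w in targets.items():
            degree[dst] += w
    return degree

def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank with weight-proportional link following."""
    nodes = nodes_of(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for src in nodes:
            targets = graph.get(src, {})
            total = sum(targets.values())
            if total == 0:  # dangling node: spread its rank evenly
                for v in nodes:
                    nxt[v] += damping * rank[src] / n
            else:
                for dst, w in targets.items():
                    nxt[dst] += damping * rank[src] * w / total
        rank = nxt
    return rank

# Small made-up graph for illustration.
graph = {"a": {"b": 1.5, "c": 0.3}, "b": {"a": 0.5, "c": 1.0}, "c": {"b": 0.5}}
print(degree_centrality(graph))
print(weighted_degree_centrality(graph))
print(pagerank(graph))
```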


Expert Rankings

  • No correlation with:

    • VH1 artist list

    • DigitalDreamDoor list

    • original Spin Magazine list (!)

      (critics don’t agree with each other, or with the record-buying public)


RIAA Results

  • RIAA listed only

    • 51/800 LPs

    • 14/50 artists

      (critics don’t buy records!)

*RIAA sales caveat

Figure 6. Probability of albums being best-sellers.

Figure 7. Probability of artists being best-sellers.


Amazon Sales Rank

  • No correlation with individual LP sales rank…

  • …but correlated with mean artist sales rank

    • similar to RIAA data

    • interpretation: popular artists often have obscure LPs
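The comparison above involves two steps: averaging LP sales ranks per artist, and correlating one ranking against another. The slides do not say which correlation statistic was used; the sketch below assumes Spearman rank correlation, and both function names and inputs are hypothetical.

```python
from statistics import mean

def mean_artist_rank(lp_rank, lp_artist):
    """Average the sales ranks of each artist's LPs."""
    by_artist = {}
    for lp, rank in lp_rank.items():
        by_artist.setdefault(lp_artist[lp], []).append(rank)
    return {artist: mean(ranks) for artist, ranks in by_artist.items()}

def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Averaging per artist smooths out the obscure LPs in a popular artist's catalogue, which is consistent with the interpretation given on the slide.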




Lessons Learned

  • While the subject matter was interesting, it was oriented toward music geeks

    • i.e., no actual music was delivered to the users (intellectual property considerations)

    • more traversals needed

  • Random initial starting points were difficult to overcome

    • “cold start problem” - pre-seed the links according to some criteria?

    • weights did not decay over time/traversals

  • Choosing only artists from Spin Magazine may have pre-filtered the response

    • choose artists from Down Beat (Jazz), Vibe (Urban), Music City News (Country), etc.


Conclusions

  • Can build a network of smart objects featuring adaptive, hierarchical links constructed in real time without central state

    • network is created without latency and with computations amortized over individual accesses

  • Experimental testbed with popular music LP metadata was shown to approach the sales rank of artists, but not of individual LPs