Methodologies and approaches for repository aggregation

Pat Lockley

University of Nottingham

19th April 2010

I’ve got a brand new combined harvester and I’ll give you the key

The theory
  • Out in the world are lots of repositories using RSS feeds
The theory continued…
  • So one site could bring all those feeds together, without the need to upload
The theory continues…

At Nottingham we have Xerte Online Toolkits

This allows content to be created online and, as one of its features, allows the simple, automated creation of DCMI-rich RSS feeds

Xerte Online Toolkits is free and open source

The idea

Many Toolkits installations could each easily publish a standard RSS feed, to be harvested in different ways by different people

The process… Robot #1

So we built a harvester, a bit like a basic web robot

This would go off and get the RSS feeds, download them to a server, and look in the data for OER materials.

Given that RSS is a standard, this would be an easy task…
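A first robot of this kind can be sketched in a few lines. This is only an illustration, assuming plain RSS 2.0 input; the names `fetch_feed`, `harvest_items` and `SAMPLE_FEED`, and the example.org URLs, are hypothetical and do not come from the actual harvester.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Hypothetical sample feed, standing in for a downloaded RSS file.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example repository</title>
    <item>
      <title>Intro to statistics</title>
      <link>http://example.org/oer/stats</link>
    </item>
    <item>
      <title>Basic chemistry</title>
      <link>http://example.org/oer/chem</link>
    </item>
  </channel>
</rss>"""


def fetch_feed(url, timeout=10):
    """Download a feed and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def harvest_items(feed_xml):
    """Return one {title, link} dict per <item> element in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items


if __name__ == "__main__":
    for entry in harvest_items(SAMPLE_FEED):
        print(entry["title"], entry["link"])
```

In practice `fetch_feed` would be called once per known feed URL and its result passed to `harvest_items`.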

The 2nd robot…

Sadly, there are differences even between DCMI-rich RSS and normal RSS

The link node, which contains the URL of the OER piece, is sometimes empty

Sometimes other nodes are used

So gradually the robot got smarter…
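One way to cope with an empty link node is to try a list of fallbacks. The sketch below assumes the URL might instead sit in `guid`, in Dublin Core `identifier`, or in an `enclosure` element's `url` attribute; which nodes the real feeds used is not stated in the slides, so this ordering and the name `resolve_link` are assumptions.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace, in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"


def resolve_link(item):
    """Return the first URL-like value found on an RSS <item>, or None."""
    candidates = [
        item.findtext("link"),
        item.findtext("guid"),
        item.findtext(DC + "identifier"),
    ]
    enclosure = item.find("enclosure")
    if enclosure is not None:
        candidates.append(enclosure.get("url"))
    for value in candidates:
        # Skip missing or empty nodes and anything that is not an http(s) URL.
        if value and value.strip().startswith(("http://", "https://")):
            return value.strip()
    return None
```

Returning `None` rather than guessing lets the caller decide whether to drop the item or keep it without a link.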

The 3rd robot…

Now we had to tell which feed type was which…

Establishing a fingerprint

Knowing what you’re “fetching”

Getting as much metadata as possible

But…
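A fingerprint can be as simple as looking at the root element and the namespaces in use. This is a minimal sketch, assuming only RSS 2.0, RSS 1.0 (RDF) and Atom need to be told apart; the function name `fingerprint` and the returned dictionary are illustrative, not the harvester's real logic.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"
RDF_ROOT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF"
ATOM_ROOT = "{http://www.w3.org/2005/Atom}feed"


def fingerprint(feed_xml):
    """Guess the feed format and whether Dublin Core elements are present."""
    root = ET.fromstring(feed_xml)
    if root.tag == "rss":
        feed_format = "rss-" + root.get("version", "unknown")
    elif root.tag == RDF_ROOT:
        feed_format = "rss-1.0"
    elif root.tag == ATOM_ROOT:
        feed_format = "atom"
    else:
        feed_format = "unknown"
    # Any element in the Dublin Core namespace marks the feed as "DCMI rich".
    uses_dc = any(
        isinstance(el.tag, str) and el.tag.startswith(DC) for el in root.iter()
    )
    return {"format": feed_format, "dublin_core": uses_dc}
```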

The 4th robot…

Metadata comes in many forms, so the robot needs to be aware

Subject

Category

Author

Creator

Description

Related content
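The overlapping names above (Subject and Category, Author and Creator) suggest mapping every variant onto one internal field. The sketch below assumes a small alias table; the table contents, the internal field names and `normalise_item` are hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Assumed mapping from feed element names to internal field names.
FIELD_ALIASES = {
    "category": "subject",
    DC + "subject": "subject",
    "author": "creator",
    DC + "creator": "creator",
    "description": "description",
    DC + "description": "description",
    DC + "relation": "related",
}


def normalise_item(item):
    """Collect an item's metadata under one internal name per concept."""
    record = {}
    for child in item:
        field = FIELD_ALIASES.get(child.tag)
        text = (child.text or "").strip()
        if field and text:
            # Keep every value; choosing between them is a separate step.
            record.setdefault(field, []).append(text)
    return record
```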

The 5th robot…

Taking a preference

Dealing with conflict

Dealing with spam

Dealing with bad metadata
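Taking a preference and skipping bad values might look like the sketch below. The preference order, the list of placeholder strings and the names `pick_preferred` and `is_usable` are all assumptions; the slides do not say which rules were actually applied, and spam detection is not attempted here.

```python
# Assumed list of values treated as "bad metadata".
PLACEHOLDERS = {"", "n/a", "none", "unknown", "untitled", "test"}

# Hypothetical preference order for the creator field, most trusted first.
PREFERENCE = ["dc:creator", "author", "managingEditor"]


def is_usable(value):
    """Reject missing values and obvious placeholders."""
    return value is not None and value.strip().lower() not in PLACEHOLDERS


def pick_preferred(candidates, preference=PREFERENCE):
    """Return the usable value from the most preferred source, or None."""
    for source in preference:
        value = candidates.get(source)
        if is_usable(value):
            return value.strip()
    return None
```

Here `candidates` maps a source node name to the value found in it, so conflicting values are resolved by the order of `preference` alone.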

The 6th robot…

Don’t forget we have users: all this metadata needs to be searchable

Does the user care?

Results driven approach?

How to search best?

Search evaluation?

Does it need an explanation?

The 7th robot…

What to do when the RSS isn’t even RSS

80 RSS feeds

20 aren’t valid

5 aren’t XML
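A coarse triage of incoming "feeds" can separate these cases before any harvesting is attempted. The sketch below only checks whether the payload parses as XML and whether the root element looks like a feed; it is not a full RSS validator, and the category names and `classify` are illustrative.

```python
import xml.etree.ElementTree as ET

# Root elements accepted as feeds: RSS 2.0, RSS 1.0 (RDF) and Atom.
FEED_ROOTS = {
    "rss",
    "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF",
    "{http://www.w3.org/2005/Atom}feed",
}


def classify(payload):
    """Sort a downloaded payload into 'not-xml', 'xml-but-not-feed' or 'feed'."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        # Not well-formed XML at all, for example an HTML error page.
        return "not-xml"
    if root.tag not in FEED_ROOTS:
        return "xml-but-not-feed"
    return "feed"
```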

The 8th robot…

RSS for humans or machines

All of the content?

Some of the content?

How do we want to talk?

The 9th robot…

Is there more content in other forms?

OPML?

RSS?

OAI?

SRU?

Thinking beyond the field?
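As one example of the other forms listed, an OPML file is commonly used to publish a list of feeds, with each feed's address in the `xmlUrl` attribute of an `outline` element. A minimal sketch of reading one, with the function name `feeds_from_opml` chosen for illustration:

```python
import xml.etree.ElementTree as ET


def feeds_from_opml(opml_xml):
    """Return the feed URLs listed in an OPML document, in document order."""
    root = ET.fromstring(opml_xml)
    return [
        outline.get("xmlUrl")
        for outline in root.iter("outline")
        if outline.get("xmlUrl")
    ]
```

Outlines without an `xmlUrl` attribute, such as folders, are skipped.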

The 10th robot…

Making it all make sense

Effort to make an aggregator

Harmonisation

Handling new challenges

Scope

The 11th robot…

Making it smart

Harvests every day (approximately 25 new items a day)

Knows which items have been deleted

Knows which items have moved

Knows what people are looking for
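Knowing what is new, deleted or moved can be done by comparing each harvest with the previous one. The sketch below assumes every item has a stable identifier (for example its `guid`) mapped to its current URL; how the real harvester tracks items is not described in the slides, and `diff_harvest` is a hypothetical name.

```python
def diff_harvest(previous, current):
    """Compare two harvests, each a dict of item identifier -> URL."""
    previous_ids = set(previous)
    current_ids = set(current)
    return {
        # Seen today but not in the last harvest.
        "new": sorted(current_ids - previous_ids),
        # Seen in the last harvest but gone today.
        "deleted": sorted(previous_ids - current_ids),
        # Same identifier, different URL.
        "moved": sorted(
            item_id
            for item_id in previous_ids & current_ids
            if previous[item_id] != current[item_id]
        ),
    }


if __name__ == "__main__":
    yesterday = {"a": "http://example.org/a", "b": "http://example.org/b"}
    today = {"b": "http://example.org/moved/b", "c": "http://example.org/c"}
    print(diff_harvest(yesterday, today))
```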

Contacts

Pat Lockley - Xpert

Julian Tenney - Xerte

Steven Stapleton - Berlin OER