The use of an intelligent forum crawler for data retrieval from e-learning portals

6th International Conference on Education and New Learning Technologies, Barcelona, 7th - 9th of July 2014. The use of an intelligent forum crawler for data retrieval from e-learning portals.

Miloš Pavković and Jelica Protić, University of Belgrade, School of Electrical Engineering, Belgrade, Serbia


Introduction

  • A large number of forums with different topics

  • Forums are often used by students during their studies

  • A large amount of relevant information is scattered across different forums within one university domain

  • Forums are based on different technologies


Issues

  • The same topic can appear across different forums inside one university domain

  • Official school forums vs. independent department forums

  • The same documents can be uploaded as post attachments to several different web forums

  • Similar courses at different schools


Solution – Specialized crawler

  • Specialized forum crawler

  • Aggregation of crawled data from multiple forums of a single university domain

  • Storing the data in a database

  • Forum modules that use this database to help students


Forum structure

  • Always defined by implicit paths, as presented in the examples

  • Examples of a) a forum, b) a thread, c) attachments inside a post


Crawler algorithm

  • FCbRE – Forum Crawler based on Regular Expressions

  • Automated system

  • Identifying the DOM structure and basic forum elements with regular expressions

  • Identifying forum implicit paths using regex. Example: >>index\.php\?showforum\==\digit+!>+>\P=!<+

  • Extracting post content and storing it in the database
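A minimal sketch of how regular expressions could pick out forum and thread links from a page, assuming links of the form index.php?showforum=<digits> as in the example above. The showtopic pattern for thread links, the function name extract_links and the sample page are illustrative assumptions, not part of FCbRE.

```python
import re

# Assumed link patterns. "index.php?showforum=<digits>" follows the slide's
# example; "showtopic" is a hypothetical counterpart for thread links.
FORUM_LINK = re.compile(
    r'<a[^>]+href="([^"]*index\.php\?showforum=(\d+))"[^>]*>([^<]+)</a>')
THREAD_LINK = re.compile(
    r'<a[^>]+href="([^"]*index\.php\?showtopic=(\d+))"[^>]*>([^<]+)</a>')


def extract_links(html, pattern):
    """Return (id, title, url) tuples for every anchor matching the pattern."""
    return [(int(m.group(2)), m.group(3).strip(), m.group(1))
            for m in pattern.finditer(html)]


# Hypothetical page fragment with one forum link, one thread link
# and one unrelated link that neither pattern should match.
page = '''
<a href="index.php?showforum=12">Operating Systems</a>
<a href="index.php?showtopic=345">Exam questions 2014</a>
<a href="index.php?act=Login">Log in</a>
'''

forums = extract_links(page, FORUM_LINK)
threads = extract_links(page, THREAD_LINK)
print(forums)
print(threads)
```

Keeping one pattern per forum element means that supporting another forum technology would mostly be a matter of supplying a different set of patterns.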


Crawler database

  • Web Forum: - site id, - site name, - site link

  • Forums: + site id, - forum id, - forum name, - forum link

  • Threads: + forum id, - thread id, - thread name, - thread link

  • Posts: + thread id, - post id, - post info

  • Attach: + post id, - attach id, - attach name, - attach link

  • T – Simil.: + thread id (1), + thread id (2)

  • A – Simil.: + attach id (1), + attach id (2)

  • F – Simil.: + forum id (1), + forum id (2)

  • F/T – Simil.: + forum id, + thread id

  • Essential to the FCbRE model

  • Forum threads and posts are stored separately

  • Similarity tables contain unique pairs of identifiers of forums, threads and attachments
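The database described above could be sketched in SQLite as follows. Table and column names follow the slide, while the column types, the use of SQLite and the helper add_similar_threads are assumptions made for illustration. Only the thread similarity table is shown; the other similarity tables would have the same shape.

```python
import sqlite3

# Assumed schema: names follow the slide, types and constraints are guesses.
SCHEMA = """
CREATE TABLE web_forum (site_id INTEGER PRIMARY KEY, site_name TEXT, site_link TEXT);
CREATE TABLE forums (forum_id INTEGER PRIMARY KEY,
                     site_id INTEGER REFERENCES web_forum(site_id),
                     forum_name TEXT, forum_link TEXT);
CREATE TABLE threads (thread_id INTEGER PRIMARY KEY,
                      forum_id INTEGER REFERENCES forums(forum_id),
                      thread_name TEXT, thread_link TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY,
                    thread_id INTEGER REFERENCES threads(thread_id),
                    post_info TEXT);
CREATE TABLE attach (attach_id INTEGER PRIMARY KEY,
                     post_id INTEGER REFERENCES posts(post_id),
                     attach_name TEXT, attach_link TEXT);
-- Each unordered pair is stored once, smaller identifier first.
CREATE TABLE t_simil (thread_id_1 INTEGER REFERENCES threads(thread_id),
                      thread_id_2 INTEGER REFERENCES threads(thread_id),
                      PRIMARY KEY (thread_id_1, thread_id_2),
                      CHECK (thread_id_1 < thread_id_2));
"""


def add_similar_threads(conn, a, b):
    """Record that threads a and b are similar; repeated pairs are ignored."""
    first, second = sorted((a, b))
    conn.execute("INSERT OR IGNORE INTO t_simil VALUES (?, ?)", (first, second))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
add_similar_threads(conn, 7, 3)
add_similar_threads(conn, 3, 7)  # same pair in the other order
pair_count = conn.execute("SELECT COUNT(*) FROM t_simil").fetchone()[0]
print(pair_count)
```

Ordering the two identifiers before insertion is one simple way to keep the pairs unique regardless of the order in which a similarity is discovered.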


Finding similarities

  • Determining similarities of forums, threads or document names

  • Comparing the words alone is not enough because of:

    • grammatical errors

    • singular/plural forms

    • different forms with the same semantic meaning

  • Using existing search engines to distinguish semantics

  • FCbRE uses a low-level semantic difference
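To illustrate why plain word comparison falls short, here is a small sketch that tolerates misspellings and singular/plural variants. It uses Python's difflib.SequenceMatcher with a naive plural rule as a stand-in; this is not the similarity measure used by FCbRE, it does not address names that differ in form but share a meaning, and the function names and the 0.85 threshold are assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, keep alphanumeric words, strip a trailing 's' (naive plural rule)."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return [w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words]


def name_similarity(a, b):
    """Character-level similarity ratio in [0, 1] between the normalized names."""
    return SequenceMatcher(None, " ".join(normalize(a)),
                           " ".join(normalize(b))).ratio()


def is_similar(a, b, threshold=0.85):
    """Assumed threshold; a real deployment would need to tune it on forum data."""
    return name_similarity(a, b) >= threshold


print(is_similar("Operating Systems", "operating system"))   # plural and case
print(is_similar("Operating Sistems", "Operating systems"))  # misspelling
print(is_similar("Operating Systems", "Linear Algebra"))     # unrelated names
```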


Module plugins

  • Two module plugins

    • FCbRE-S (FCbRE Search plugin)

    • FCbRE-DP (FCbRE Duplicate Prevention plugin)

  • Both used for experimental purposes

  • Written for vBulletin technology

  • Can be adapted for any other forum technology


FCbRE-S (FCbRE Search plugin)

  • Designed for standard forum searches

  • Forwards the requested query to the FCbRE database for similarity comparison

  • All similarities are shown in addition to the standard search results
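The flow above could look roughly like the sketch below: the query is compared against thread names stored in the crawler database, and close matches are returned next to the forum's own results. The threads table layout, the function search_with_similar, the example URLs and the use of difflib.get_close_matches as the comparison are all assumptions; the actual plugin is written for vBulletin.

```python
import sqlite3
from difflib import get_close_matches


def search_with_similar(conn, query, local_results, limit=5):
    """Return the forum's own results plus similar thread names from the crawl."""
    rows = conn.execute("SELECT thread_name, thread_link FROM threads").fetchall()
    # Index the rows by lowercased name so that matching ignores case.
    by_key = {name.lower(): (name, link) for name, link in rows}
    close = get_close_matches(query.lower(), list(by_key), n=limit, cutoff=0.6)
    return {"standard": local_results, "similar": [by_key[k] for k in close]}


# Hypothetical crawler data from two different forums.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE threads (thread_name TEXT, thread_link TEXT)")
conn.executemany("INSERT INTO threads VALUES (?, ?)", [
    ("Exam questions 2014", "http://forum-a.example/t/1"),
    ("Lab schedule", "http://forum-b.example/t/2"),
])

result = search_with_similar(conn, "exam question 2014", local_results=[])
print(result["similar"])
```

Returning the two result sets separately lets the forum page keep its standard listing unchanged and render the similar items as an extra section.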


FCbRE-DP (Duplicate Prevention plugin)

  • Implemented in the section where users can create a topic or forum

  • Monitors the name field of the new thread or forum

  • Notifies the user when a similar thread or forum already exists
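A minimal sketch of the check such a plugin might run each time the name field changes. The function duplicate_warning, the message text and the 0.8 threshold are hypothetical, and difflib.SequenceMatcher again stands in for the FCbRE similarity comparison.

```python
from difflib import SequenceMatcher


def duplicate_warning(new_name, existing_names, threshold=0.8):
    """Return a warning listing close existing names, or None if there are none."""
    scored = [(SequenceMatcher(None, new_name.lower(), name.lower()).ratio(), name)
              for name in existing_names]
    # Keep only close matches, best match first.
    hits = sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)
    if not hits:
        return None
    return "Similar items already exist: " + ", ".join(name for _, name in hits)


existing = ["Exam questions 2014", "Lab schedule"]
print(duplicate_warning("Exam Questions 2014", existing))
print(duplicate_warning("Course timetable", existing))
```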


Results

  • Nine web forums from the University of Belgrade, gathered manually

  • The group is a mixture of forums from different sources

  • The percentage of similar items is lowest for forums and highest for documents

  • The true percentage of "useful" duplicates should be interpreted with caution


Conclusion

  • The proposed solution performs information aggregation of related forums

  • It has the potential to reduce duplication of forums, topics and posts

  • The use of plugins would result in higher forum content quality


Thank you!

Feel free to contact us with any questions you may have

[email protected]

[email protected]

