The use of an intelligent forum crawler for data retrieval from e-learning portals

6th International Conference on Education and New Learning Technologies, Barcelona, 7th - 9th of July 2014. The use of an intelligent forum crawler for data retrieval from e-learning portals.

Miloš Pavković and Jelica Protić, University of Belgrade, School of Electrical Engineering, Belgrade, Serbia


Introduction

  • A large number of forums with different topics

  • Forums are often used by students during their studies

  • A large amount of relevant information is scattered across different forums inside one university domain

  • Forums are based on different technologies


Issues

  • The same topic can appear across different forums inside one university domain

  • Official school forums vs. independent department forums

  • The same documents can be uploaded as post attachments to several different web forums

  • Similar courses at different schools


Solution – Specialized crawler

  • Specialized forum crawler

  • Aggregation of crawled data from multiple forums of a single university domain

  • Storing the data in a database

  • Forum modules that use this database to help students


Forum structure

  • Always defined by implicit paths

  • Examples (figure not reproduced): a) forum, b) thread, c) attachments inside a post


Crawler algorithm

  • FCbRE – Forum Crawler based on Regular Expressions

  • Automated system

  • Identifying DOM structure and basic forum elements with regular expressions.

  • Identifying forum implicit paths using regex. Example: >>index\.php\?showforum\==\digit+!>+>\P=!<+

  • Extracting post content and storing it in the database
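
The idea of recognising implicit paths with regular expressions can be sketched as follows. This is a minimal illustration, not the FCbRE implementation: only the `index.php?showforum=` form appears on the slide, while the `showtopic` and `act=attach` patterns, the `classify_links` helper and the sample page are assumptions made up for the example.

```python
import re

# Hypothetical implicit-path patterns. Only "index.php?showforum=" comes from
# the slide; the thread and attachment patterns are assumed for illustration.
IMPLICIT_PATHS = {
    "forum": re.compile(r"index\.php\?showforum=(\d+)"),
    "thread": re.compile(r"index\.php\?showtopic=(\d+)"),
    "attachment": re.compile(r"index\.php\?act=attach&(?:amp;)?id=(\d+)"),
}

# Pulls the href value out of every anchor tag in the page source.
HREF = re.compile(r'<a\s[^>]*href="([^"]+)"', re.IGNORECASE)


def classify_links(html):
    """Return (kind, numeric id, url) for every link matching an implicit path."""
    found = []
    for url in HREF.findall(html):
        for kind, pattern in IMPLICIT_PATHS.items():
            match = pattern.search(url)
            if match:
                found.append((kind, int(match.group(1)), url))
                break  # a link belongs to at most one kind
    return found


page = """
<a href="index.php?showforum=12">Algorithms</a>
<a href="index.php?showtopic=345">Exam questions</a>
<a class="x" href="index.php?act=attach&amp;id=7">notes.pdf</a>
<a href="index.php?act=help">Help</a>
"""
links = classify_links(page)
for item in links:
    print(item)
```

Links that match none of the patterns, such as the help page above, are simply skipped, so a crawler built this way would follow only forum, thread and attachment links.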


Crawler database

  • Tables and fields from the slide diagram (+ and - markers as on the slide):

    • Web Forum: - site id, - site name, - site link

    • Forums: + site id, - forum id, - forum name, - forum link

    • Threads: + forum id, - thread id, - thread name, - thread link

    • Posts: + thread id, - post id, - post info

    • Attach: + post id, - attach id, - attach name, - attach link

    • T – Simil.: + thread id (1), + thread id (2)

    • A – Simil.: + attach id (1), + attach id (2)

    • F – Simil.: + forum id (1), + forum id (2)

    • F/T – Simil.: + forum id, + thread id

  • Essential in the FCbRE model

  • Forum threads and posts are stored separately

  • Similarity tables that contain unique pairs of identifiers of forums, threads and attachments
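
A possible shape for such a database, sketched with SQLite. The table and field names follow the slide, but the column types, the key constraints and the `add_thread_similarity` helper are assumptions; only the thread similarity table is shown, on the assumption that the forum and attachment pair tables would look the same.

```python
import sqlite3

# Assumed column types and keys; the slide only lists table and field names.
SCHEMA = """
CREATE TABLE web_forum (site_id INTEGER PRIMARY KEY, site_name TEXT, site_link TEXT);
CREATE TABLE forums (forum_id INTEGER PRIMARY KEY,
                     site_id INTEGER REFERENCES web_forum(site_id),
                     forum_name TEXT, forum_link TEXT);
CREATE TABLE threads (thread_id INTEGER PRIMARY KEY,
                      forum_id INTEGER REFERENCES forums(forum_id),
                      thread_name TEXT, thread_link TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY,
                    thread_id INTEGER REFERENCES threads(thread_id),
                    post_info TEXT);
CREATE TABLE attach (attach_id INTEGER PRIMARY KEY,
                     post_id INTEGER REFERENCES posts(post_id),
                     attach_name TEXT, attach_link TEXT);
-- One similarity table shown; forum and attachment pairs would look the same.
CREATE TABLE thread_simil (thread_id_1 INTEGER REFERENCES threads(thread_id),
                           thread_id_2 INTEGER REFERENCES threads(thread_id),
                           PRIMARY KEY (thread_id_1, thread_id_2));
"""


def add_thread_similarity(conn, a, b):
    """Store the pair once, regardless of the order in which it was found."""
    low, high = sorted((a, b))
    conn.execute("INSERT OR IGNORE INTO thread_simil VALUES (?, ?)", (low, high))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# SQLite does not enforce REFERENCES unless PRAGMA foreign_keys is switched on,
# so this demo can record a pair without inserting the threads first.
add_thread_similarity(conn, 42, 7)
add_thread_similarity(conn, 7, 42)  # same pair, reversed: ignored
pair_count = conn.execute("SELECT COUNT(*) FROM thread_simil").fetchone()[0]
print(pair_count)
```

Sorting the two identifiers before inserting is one simple way to keep each pair unique; the composite primary key then rejects the repeat.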


Finding similarities

Finding similarities

  • Determining similarities of forums, threads or document names

  • It is not enough to just compare the words

    • grammatical errors

    • singular/plural form

    • different form but the same semantic meaning

  • Using existing search engines to distinguish semantics

  • FCbRE uses low-level semantic difference
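
To illustrate why plain word comparison falls short, here is a small lexical sketch that tolerates spelling errors and singular/plural differences. It is not the similarity measure used by FCbRE, and it does not consult a search engine: the `normalize` and `similarity` helpers, the crude plural rule and the sample titles are all assumptions for the example, built on Python's standard `difflib`.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase, drop punctuation and crudely strip a trailing plural 's'."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words)


def similarity(a, b):
    """Return a score in [0, 1]; 1.0 means identical after normalisation."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


print(similarity("Operating Systems", "operating system"))  # plural vs singular
print(similarity("Algoritms", "Algorithms"))                # spelling error
print(similarity("Databases", "Computer Graphics"))         # unrelated titles
```

A purely lexical score like this cannot recognise titles that use different words for the same meaning, which is the case the slide addresses with search engines.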


Module plugins

  • Two module plugins

    • FCbRE-S (FCbRE Search plugin)

    • FCbRE-DP (FCbRE Duplicate Prevention plugin)

  • Both used for experimental purposes

  • Written for vBulletin technology

  • Can be adapted for any other forum technology


FCbRE-S (FCbRE Search plugin)

  • Designed for standard forum searches

  • Forwards the requested query to the FCbRE database for similarity comparison

  • All similarities are shown in addition to the standard search results


FCbRE-DP (Duplicate Prevention plugin)

  • Implemented in the section where users create a new topic or forum

  • Monitors the name field of the new thread or forum

  • Notifies the user when a similar thread or forum already exists
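
The check behind such a notification might look like the sketch below. It is written in Python for illustration only and is not the vBulletin plugin itself: the `similar_titles` helper, the cutoff of 0.8 and the sample titles are assumptions, and the matching uses `difflib.get_close_matches` from the standard library rather than the FCbRE similarity tables.

```python
from difflib import get_close_matches


def similar_titles(new_title, existing_titles, cutoff=0.8, limit=3):
    """Return existing titles that closely resemble the title being typed."""
    # Compare case-insensitively, but report the titles as originally written.
    by_key = {title.lower().strip(): title for title in existing_titles}
    hits = get_close_matches(new_title.lower().strip(), list(by_key),
                             n=limit, cutoff=cutoff)
    return [by_key[hit] for hit in hits]


existing = [
    "Operating Systems exam June",
    "Database Systems lab 2",
    "Linear algebra notes",
]
warnings = similar_titles("Operating systems exam june", existing)
if warnings:
    print("Similar threads already exist:", warnings)
```

In a real plugin the list of existing titles would come from the crawler database, so the warning could also cover threads on other forums of the same university domain.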


Results

  • 9 web forums from the University of Belgrade, manually gathered

  • This group is a mixture of forums from different sources

  • The percentage of similar items is smallest for forums and highest for documents

  • The true percentage of "useful" duplicates should be treated with caution


Conclusion

  • The proposed solution performs information aggregation of related forums

  • It has the potential to reduce the duplication of forums, topics and posts

  • The use of plugins would result in higher forum content quality


Thank you!

Feel free to contact us with any questions

milos_pavkovic@yahoo.com

jeca@etf.bg.ac.rs

