The use of an intelligent forum crawler for data retrieval from e-learning portals

6th International Conference on Education and New Learning Technologies, Barcelona, 7th - 9th of July 2014. The use of an intelligent forum crawler for data retrieval from e-learning portals.

Presentation Transcript

6th International Conference on Education and New Learning Technologies Barcelona, 7th - 9th of July 2014

The use of an intelligent forum crawler for data retrieval from e-learning portals

Miloš Pavković and Jelica Protić, University of Belgrade, School of Electrical Engineering, Belgrade, Serbia

Introduction
  • A large number of forums with different topics
  • Forums are often used by students during their studies
  • A large amount of relevant information is scattered across different forums within one university domain
  • Forums are based on different technologies
Issues
  • The same topic can appear across different forums inside one university domain
  • Official school forums vs. independent department forums
  • The same documents can be uploaded as post attachments to several different web forums
  • Similar courses at different schools
Solution – Specialized crawler
  • Specialized forum crawler
  • Aggregation of crawled data from multiple forums of a single university domain
  • Storing the data in a database
  • Forum modules that use this database to help students
Forum structure
  • Always defined by presented implicit paths
  • Example of a) forum b) thread c) attachments inside post.
Crawler algorithm
  • FCbRE – Forum Crawler based on Regular Expressions
  • Automated system
  • Identifying DOM structure and basic forum elements with regular expressions.
  • Identifying forum implicit paths using regex
    • Example: >>index\.php\?showforum\==\digit+!>+>\P=!<+
  • Extraction of post content and storing into the database
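The regex-based identification of implicit paths can be illustrated with a minimal sketch. It is not the FCbRE implementation: the two patterns, the `showtopic` parameter, the `extract_links` helper, and the sample HTML are all assumptions made for illustration, and a real forum engine may use different URL schemes.

```python
import re

# Assumed URL schemes: "index.php?showforum=<id>" for forums and
# "index.php?showtopic=<id>" for threads. Other forum software differs.
FORUM_LINK = re.compile(
    r'<a\s[^>]*href="([^"]*index\.php\?showforum=(\d+))"[^>]*>(.*?)</a>',
    re.I | re.S,
)
THREAD_LINK = re.compile(
    r'<a\s[^>]*href="([^"]*index\.php\?showtopic=(\d+))"[^>]*>(.*?)</a>',
    re.I | re.S,
)


def extract_links(html, pattern):
    """Return (id, name, link) tuples for every anchor matching pattern."""
    results = []
    for link, ident, label in pattern.findall(html):
        name = re.sub(r"<[^>]+>", "", label).strip()  # drop nested tags
        results.append((int(ident), name, link))
    return results


# Hypothetical page fragment used only to exercise the patterns.
sample = """
<a href="index.php?showforum=12">Algorithms</a>
<a class="t" href="index.php?showtopic=345"><b>Exam 2014</b></a>
<a href="index.php?act=help">Help</a>
"""

forums = extract_links(sample, FORUM_LINK)
threads = extract_links(sample, THREAD_LINK)
```

Each extracted tuple corresponds to the id, name, and link fields that the crawler stores for forums and threads; the help link is ignored because it matches neither pattern.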
Crawler database
  • Database schema, as shown in the slide diagram ("-" and "+" markers reproduced from the slide):
    • Web Forum: - site id, - site name, - site link
    • Forums: + site id, - forum id, - forum name, - forum link
    • Threads: + forum id, - thread id, - thread name, - thread link
    • Posts: + thread id, - post id, - post info
    • Attach: + post id, - attach id, - attach name, - attach link
    • T – Simil.: + thread id (1), + thread id (2)
    • A – Simil.: + attach id (1), + attach id (2)
    • F – Simil.: + forum id (1), + forum id (2)
    • F/T – Simil.: + forum id, + thread id
  • Essential in FCbRE model
  • Forum threads and posts are stored separately
  • Similarity tables that contain unique pairs of identifiers of forums, threads and attachments
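One possible way to realize such a schema is sketched below with SQLite. The table and column names follow the fields listed on the slide; the column types, the reading of "+" fields as foreign keys, the `thread_simil` table name, and the `add_similar_threads` helper are assumptions, not details given by the authors. Only the thread similarity table is shown; the attachment and forum ones would follow the same shape.

```python
import sqlite3

SCHEMA = """
CREATE TABLE web_forum (site_id INTEGER PRIMARY KEY,
                        site_name TEXT, site_link TEXT);
CREATE TABLE forums (forum_id INTEGER PRIMARY KEY,
                     site_id INTEGER REFERENCES web_forum(site_id),
                     forum_name TEXT, forum_link TEXT);
CREATE TABLE threads (thread_id INTEGER PRIMARY KEY,
                      forum_id INTEGER REFERENCES forums(forum_id),
                      thread_name TEXT, thread_link TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY,
                    thread_id INTEGER REFERENCES threads(thread_id),
                    post_info TEXT);
CREATE TABLE attach (attach_id INTEGER PRIMARY KEY,
                     post_id INTEGER REFERENCES posts(post_id),
                     attach_name TEXT, attach_link TEXT);
-- Similarity table: each unordered pair is stored once, smaller id first.
CREATE TABLE thread_simil (thread_id_1 INTEGER REFERENCES threads(thread_id),
                           thread_id_2 INTEGER REFERENCES threads(thread_id),
                           PRIMARY KEY (thread_id_1, thread_id_2),
                           CHECK (thread_id_1 < thread_id_2));
"""


def add_similar_threads(conn, a, b):
    """Record that threads a and b are similar; repeated or reversed pairs are ignored."""
    low, high = sorted((a, b))
    conn.execute("INSERT OR IGNORE INTO thread_simil VALUES (?, ?)", (low, high))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
add_similar_threads(conn, 5, 3)
add_similar_threads(conn, 3, 5)  # same pair in reverse order, not stored again
pairs = conn.execute(
    "SELECT thread_id_1, thread_id_2 FROM thread_simil"
).fetchall()
```

Ordering each pair before insertion is one simple way to keep the pairs unique, as the similarity tables require.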
Finding similarities
  • Determining similarities of forums, threads or document names
  • It is not enough to just compare the words
    • Grammatical errors
    • Singular/plural forms
    • Different forms with the same semantic meaning
  • Using existing search engines to distinguish semantics
  • FCbRE uses low-level semantic difference
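The slides do not specify how the low-level comparison is computed. As a stand-in, the sketch below uses plain string similarity from Python's `difflib` after a crude normalization step; the `normalize`, `similarity`, and `is_similar` functions and the 0.85 threshold are assumptions chosen for illustration. It tolerates small spelling errors and singular/plural variation, but it does not capture names that differ in form while sharing a meaning.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase, drop punctuation, and fold simple plurals."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Crude singular/plural folding: drop a trailing "s" from longer words.
    return " ".join(w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words)


def similarity(a, b):
    """Return a ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_similar(a, b, threshold=0.85):
    """Treat two names as similar when the ratio reaches the assumed threshold."""
    return similarity(a, b) >= threshold


plural_match = is_similar("Exam results", "exam result")
typo_match = is_similar("Algoritms exam", "Algorithms exam")
unrelated = is_similar("Exam results", "Dormitory housing")
```

Here the first two pairs are reported as similar and the third is not; a pair that passes the check would be a candidate for one of the similarity tables.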
Module plugins
  • Two module plugins
    • FCbRE-S (FCbRE Search plugin)
    • FCbRE-DP (FCbRE Duplicate Prevention plugin)
  • Both used for experimental purposes
  • Written for vBulletin technology
  • Can be adapted for any other forum technology
FCbRE-S (FCbRE Search plugin)
  • Designed for standard forum searches
  • Forwards the requested query to the FCbRE database for similarity comparison
  • All similarities are shown as an addition to the standard search results
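How similar items might be appended to a standard result list can be sketched as follows. This is an illustration only, not the vBulletin plugin code: the `extend_with_similar` function, the use of thread ids as results, and the in-memory list of similarity pairs are all assumptions.

```python
def extend_with_similar(result_ids, similar_pairs):
    """Return the standard hits plus ids paired with any hit, without repeats."""
    seen = set(result_ids)
    extra = []
    for hit in result_ids:
        for a, b in similar_pairs:
            # Pick the other member of the pair when the hit is one of its members.
            other = b if a == hit else a if b == hit else None
            if other is not None and other not in seen:
                seen.add(other)
                extra.append(other)
    return list(result_ids), extra


# Hypothetical data: threads 3 and 8 matched the query directly.
standard, additional = extend_with_similar([3, 8], [(3, 5), (2, 8), (5, 9)])
```

In this example threads 5 and 2 are returned as additional results because each is paired with a direct hit, while thread 9 is not, since it is only paired with thread 5.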
FCbRE-DP (Duplicate Prevention plugin)
  • Implemented in the section where users can create a topic or forum
  • Monitors the name field of a new thread or forum
  • Notifies the user that a similarity exists
Results
  • 9 web forums from the University of Belgrade, manually gathered
  • This group is a mixture from different sources
  • The percentage of similar forums is the smallest, while that of similar documents is the highest
  • The true percentage of "useful" duplicates should be interpreted with caution
Conclusion
  • The proposed solution performs information aggregation of related forums
  • It has the potential to reduce duplication of forums, topics and posts
  • The use of plugins would result in higher forum content quality
Thank you!

Feel free to contact us with any questions.

[email protected]

[email protected]
