
Implementation of One Stop Search by XSLT

Implementation of One Stop Search by XSLT. By Dave Low, University of Hong Kong, 9-Dec-2003. Agenda: Flow of One Stop Search; Reasons to use Extensible Stylesheet Language Transformation (XSLT); Difficulties in implementing One Stop Search by XSLT; Our solution; Our implementation; Summary.


Presentation Transcript


  1. Implementation of One Stop Search by XSLT • By Dave Low, University of Hong Kong • 9-Dec-2003

  2. Agenda • Flow of One Stop Search • Reasons to use Extensible Stylesheet Language Transformation (XSLT) • Difficulties in implementing One Stop Search by XSLT • Our solution • Our implementation • Summary

  3. Flow of One Stop Search • Capture the search keyword • Issue the search to the different search engines • Collect the results, clicking the Next button until we have all the records • Compile the search results from the different search engines • Present the results to the user
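
The flow above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the SearchEngine interface, the OneStopSearch class and the canned engines are hypothetical names that do not appear in the slides, and "no more pages" is modelled as an empty result list rather than a missing Next button.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical abstraction of one target search engine (not from the slides).
interface SearchEngine {
    // Returns one page of results; an empty list means there is no further page.
    List<String> search(String keyword, int page);
}

class OneStopSearch {
    // Issue the keyword to every engine, keep asking for the next page until
    // a page comes back empty, and compile everything into a single list.
    static List<String> searchAll(String keyword, List<SearchEngine> engines) {
        List<String> compiled = new ArrayList<>();
        for (SearchEngine engine : engines) {
            int page = 0;
            while (true) {
                List<String> results = engine.search(keyword, page);
                if (results.isEmpty()) {
                    break; // no "next" page left for this engine
                }
                compiled.addAll(results);
                page++;
            }
        }
        return compiled;
    }

    // Stand-in engine that serves canned records, two per page.
    static SearchEngine cannedEngine(String name, int totalRecords) {
        return (keyword, page) -> {
            List<String> out = new ArrayList<>();
            int end = Math.min(totalRecords, page * 2 + 2);
            for (int i = page * 2; i < end; i++) {
                out.add(name + ": " + keyword + " #" + (i + 1));
            }
            return out;
        };
    }

    public static void main(String[] args) {
        List<SearchEngine> engines = Arrays.asList(
                cannedEngine("ProQuest", 3), cannedEngine("Science Direct", 2));
        // Present the compiled result to the user.
        for (String record : searchAll("xslt", engines)) {
            System.out.println(record);
        }
    }
}
```

In the real system each engine would be a web site reached over HTTP, which is where the two problems described in the following slides come in.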

  4. Flow of One Stop Search (diagram) • One Stop Search: Capture Keyword • "Search and next" against each of ProQuest, Science Direct and Kluwer Online • Compile Result • Present Result

  5. Reasons to use XSL • Simple: XSL is plain text • Multiplatform: it can run on any machine with an XSLT engine • Easy to maintain: when the output layout of a target search engine changes, just change the content of the XSL file; no recompilation is needed

  6. Two main problems when using XSL • An XSLT engine requires well-formed XML as input • Web-based search engines output HTML only • HTML is not well-formed XML • HTML allows some tags to be left unclosed, e.g. <br>
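
To make the problem concrete, here is a small check using the standard JAXP parser. It is a sketch for illustration only; the class name WellFormedCheck and the sample markup are made up for this example. An unclosed <br> is enough to make an XML parser reject the page.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true when the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the input
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Typical HTML: <br> is never closed.
        System.out.println(isWellFormed(
                "<html><body>line one<br>line two</body></html>"));  // false
        // The same content with the tag closed.
        System.out.println(isWellFormed(
                "<html><body>line one<br/>line two</body></html>")); // true
    }
}
```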

  7. Solution • Use HTML Tidy (http://tidy.sourceforge.net/) to convert HTML to well-formed XML • “A HTML syntax checker and pretty printer. It can be used as a tool for cleaning up malformed and faulty HTML. In addition, it provides a DOM interface to the document that is being processed, which effectively makes you able to use it as a DOM parser for real-world HTML” • It is open source • It has implementations in several languages, such as Java, Perl and Python

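  8. Solution • Sample code in Java (using the Java port of Tidy) • HTML => XML
      import java.io.StringReader;
      import org.w3c.tidy.Tidy;
      // html holds the page source as a String
      StringReader strReader = new StringReader(html);
      Tidy tidy = new Tidy();
      return tidy.parseDOM(strReader, null);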

  9. Two main problems when using XSL • There is no browse function in XSL • In a one-stop search, we need to click the Next button several times to collect all the results • The program must find the Next button and then issue a browse request to the URL behind it
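
Finding the Next button in a tidied page can be sketched with the standard javax.xml.xpath API. The class name NextLinkFinder, the link text "Next" and the sample page are assumptions made for this example; each real search engine marks up its Next button differently, and the expression assumes the elements are in no XML namespace.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class NextLinkFinder {
    // Returns the href of the first link whose text is "Next",
    // or an empty string when the page has no such link.
    static String findNextUrl(String xhtml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate("//a[normalize-space(.)='Next']/@href",
                    new InputSource(new StringReader(xhtml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Page is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<a href='results?page=1'>Previous</a> "
                + "<a href='results?page=3'> Next </a>"
                + "</body></html>";
        // The URL to request for the following page of results.
        System.out.println(findNextUrl(page));
    }
}
```

An empty return value is the signal to stop paging and move on to compiling the results.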

  10. Solution • Add a browse function to XSL through an XSL extension • XSLT allows two kinds of extension: extension elements and extension functions • The extension mechanism depends on the XSLT implementation • Details can be found at http://www.w3.org/TR/xslt#extension

  11. Solution • Our implementation • Select a Java-based XSLT engine • Write the function in Java • Compile it into classes and package them in a jar • Add the jar file to the classpath of the XSLT engine • Run it

  12. Sample code on XSL extension
      <?xml version="1.0" encoding="UTF-8"?>
      <!-- Define the class to be used: the HKUL namespace names the Java class -->
      <xsl:stylesheet version="1.1"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          xmlns:HKUL="http://www.lib.hku.hk/java/hkul.apps.web.Browser"
          exclude-result-prefixes="HKUL">
        <xsl:template match="/">
          <xsl:variable name="url">http://www.lib.hku.hk/</xsl:variable>
          <!-- Create it -->
          <xsl:variable name="browser" select="HKUL:new($url)" />
          <!-- Call the browse function -->
          <xsl:variable name="content" select="HKUL:browse($browser,$url)" />
          <xsl:apply-templates select="$content/html/*" />
        </xsl:template>
      </xsl:stylesheet>
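
The slides do not show the Java class behind the HKUL prefix. The following is a hypothetical sketch of what a class with a matching shape might look like: a constructor taking one String, as in HKUL:new($url), and an instance method browse taking one String, as in HKUL:browse($browser,$url). Treating the constructor argument as a base URL, the injectable fetcher and the use of the plain JAXP parser are all assumptions made for this example; the real class would presumably run the fetched HTML through Tidy before returning a DOM.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical stand-in for hkul.apps.web.Browser (the real class is not shown).
class Browser {
    private final URI base;
    private final Function<URI, String> fetcher;

    // Shape of HKUL:new($url). Assumption: the argument is a base URL
    // against which later (possibly relative) URLs are resolved.
    public Browser(String baseUrl) {
        this(baseUrl, Browser::httpGet);
    }

    // Extra constructor so the fetch step can be replaced, e.g. in tests.
    Browser(String baseUrl, Function<URI, String> fetcher) {
        this.base = URI.create(baseUrl);
        this.fetcher = fetcher;
    }

    // Shape of HKUL:browse($browser, $url): fetch the page and return a DOM.
    // Assumption: the page is already well-formed; real HTML needs Tidy first.
    public Document browse(String url) {
        String page = fetcher.apply(base.resolve(url));
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(page)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse " + url, e);
        }
    }

    // Default fetcher: read the whole response body as UTF-8 text.
    private static String httpGet(URI uri) {
        try (InputStream in = uri.toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException("Cannot fetch " + uri, e);
        }
    }
}
```

How the XSLT engine maps a namespace URI to a Java class, and how it converts the returned Document into a node-set for $content/html/*, depends on the engine, so the binding details should be checked against the documentation of the engine in use.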

  13. Our Implementation (diagram; labels: Browse, Next, Tidy, Parse, Result)

  14. Our Implementation • Both the client and server programs are written in Java • The client and server programs communicate over HTTP • The connection uses a wireless network

  15. Our Implementation (Client side) • Palm OS • Sun’s Java 2 Platform, Micro Edition (J2ME) http://java.sun.com/j2me/ • Mobile Information Device Profile (MIDP) http://java.sun.com/products/midp

  16. Our Implementation (Server side) • Application Server (running on Sun Solaris with JDK 1.4) • Jakarta Tomcat (http://jakarta.apache.org/tomcat) • Jakarta Struts Framework (http://jakarta.apache.org/struts) • Xerces XML parser (http://xml.apache.org/#xerces) • MySQL database (http://www.mysql.com)

  17. Summary • Implement the one stop search by XSLT: simple, multiplatform, easy to maintain • Two problems: HTML is not well-formed XML; there is no browse function in XSL

  18. Summary • Solutions: HTML Tidy; XSL extension • Implementation: J2ME; Jakarta Tomcat + Struts; Xerces; MySQL

  19. Questions? • Thank you
