Federated search engine and the (PAPIzed) TF-emc2Wiki

This article explains the architecture of Searchy, a federated search engine that incorporates agents for various data sources like LDAP, SQL, the Google API, and Searchy itself. It provides information on how to install and configure a Searchy agent and offers a guide on using Searchy for federated data access. The article also includes details on the TF-emc2Wiki, a protected resource with full and read-only access.



Presentation Transcript


  1. Federated search engine and the (PAPIzed) TF-emc2Wiki

  2. The Searchy Architecture
     • Each source incorporates an agent, available through a SOAP interface
     • Uses RDF as internal representation
     • Agents for LDAP, SQL, the Google API, and Searchy itself
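
The fan-out idea behind this architecture can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Searchy's actual code: each agent is modelled as a plain callable instead of a SOAP endpoint, plain dictionaries stand in for RDF descriptions, and the names `federated_search`, `ldap_agent`, `sql_agent` and `broken_agent` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical agent type: a callable that takes a query string and returns
# a list of metadata records (dicts stand in for RDF descriptions here).
Agent = Callable[[str], List[Dict[str, str]]]


def federated_search(agents: Dict[str, Agent], query: str) -> List[Dict[str, str]]:
    """Send the same query to every agent and merge the answers."""
    results: List[Dict[str, str]] = []
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {name: pool.submit(agent, query) for name, agent in agents.items()}
        for name, future in futures.items():
            try:
                records = future.result(timeout=10)
            except Exception:
                # A failing or slow source should not break the whole search.
                continue
            for record in records:
                # Tag each record with the agent it came from.
                results.append({**record, "source": name})
    return results


# Stand-in agents used only for the demonstration.
def ldap_agent(query: str) -> List[Dict[str, str]]:
    return [{"dc:title": f"LDAP entry for {query}"}]


def sql_agent(query: str) -> List[Dict[str, str]]:
    return [{"dc:title": f"SQL row for {query}"}]


def broken_agent(query: str) -> List[Dict[str, str]]:
    raise RuntimeError("backend unavailable")


if __name__ == "__main__":
    agents = {"ldap": ldap_agent, "sql": sql_agent, "down": broken_agent}
    for record in federated_search(agents, "middleware"):
        print(record)
```

Skipping a failed agent rather than aborting is a design choice of this sketch; the slides do not say how Searchy treats unreachable sources.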

  3. Searchy test installation
     • To evaluate federated data access using Searchy
     • Build a directory of middleware resources
       • Using each organization's data sources
       • Installing a Searchy agent in your systems
     • Initially, RedIRIS runs the main search interface
       • http://www.rediris.es/busquedas/searchy/middleware/index.en.phtml
     • Prepare a report with your feedback as a deliverable

  4. Installing your Searchy agent
     • Download and unpack the latest Searchy distribution
       • http://jsearchy.sourceforge.net/
       • You only need J2SE >= 1.4
     • Select your data sources (backends)
       • SQL
       • LDAP
       • Web servers (Google API for a restricted search)
     • Configure your agent
       • Use the sample agent configuration file in the conf directory
       • Or the simplified configuration to be distributed in the list
       • Support at http://lists.sourceforge.net/lists/listinfo/jsearchy-users
     • Register your agent
       • Host and port
       • searchy-emc2@rediris.es

  5. Configuring your Searchy agent
     • Searchy configuration is contained in an XML file
       • conf/agent.xml
     • Three main elements
       • <transport>
         • General parameters of the agent
       • <provider>
         • Access parameters to the different data sources
         • More than one provider can be used for an agent
       • <map>
         • Takes care of the data transformations
         • Queries received by the agent into queries to the provider(s)
         • Responses from the providers into metadata to be sent by the agent
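
To make the three-element layout concrete, here is a small Python sketch that reads a configuration of this shape and checks that every provider points at a map that exists. Only the element names `<transport>`, `<provider>` and `<map>` come from the slides. The `<agent>` root element, every attribute name (`id`, `port`, `type`, `map`, `name`) and the sample values are assumptions made for the illustration; the real conf/agent.xml schema may look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: the root element and all attribute names are
# invented for this sketch and are not taken from the Searchy distribution.
SAMPLE = """
<agent>
  <transport id="example-agent" port="8080"/>
  <provider id="db1" type="sql" map="db-map"/>
  <provider id="dir1" type="ldap" map="dir-map"/>
  <map name="db-map"/>
  <map name="dir-map"/>
</agent>
"""


def summarize(xml_text: str) -> dict:
    """Return the agent id, its providers, and providers lacking a map."""
    root = ET.fromstring(xml_text.strip())
    transport = root.find("transport")
    providers = root.findall("provider")
    map_names = {m.get("name") for m in root.findall("map")}
    missing = [p.get("id") for p in providers if p.get("map") not in map_names]
    return {
        "agent_id": transport.get("id") if transport is not None else None,
        "providers": [(p.get("id"), p.get("type")) for p in providers],
        "providers_without_map": missing,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```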

  6. The <transport> element
     • Basic configuration parameters
       • Identifier for the agent
       • Providers to be used
       • Port to listen at and maximum number of connections
       • Log configuration (using log4j)
     • Vocabulary to be used by the metadata
       • A subset of Dublin Core is going to be used:
         • dc:title, dc:subject and dc:description for queries
         • dc:title, dc:subject, dc:description, dc:creator (and URL!) for responses
     • ACLs to be applied when receiving
       • Simple rules based on hostname or IP addresses
       • Pilot config only accepts connections from certain RedIRIS hosts
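
A rule set "based on hostname or IP addresses" can be pictured with the Python sketch below. It is an assumption about how such a check might work, not a description of Searchy's ACL syntax or behaviour: the function name `is_allowed`, the rule format (CIDR networks plus exact hostnames) and the example addresses are all hypothetical.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, client_host: str,
               allowed_networks: Iterable[str],
               allowed_hosts: Iterable[str]) -> bool:
    """Return True if the client matches an allowed network or hostname."""
    address = ipaddress.ip_address(client_ip)
    for network in allowed_networks:
        # Membership is False when the IP versions differ (IPv4 vs IPv6).
        if address in ipaddress.ip_network(network):
            return True
    # Hostnames are compared case-insensitively and must match exactly.
    return client_host.lower() in {h.lower() for h in allowed_hosts}


if __name__ == "__main__":
    networks = ["192.0.2.0/24"]        # documentation range, not a real rule
    hosts = ["search.example.org"]     # placeholder hostname
    print(is_allowed("192.0.2.10", "unknown.example.net", networks, hosts))
    print(is_allowed("198.51.100.7", "other.example.net", networks, hosts))
```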

  7. The <provider> element
     • Identifier, type and applicable map
     • The rest of parameters depend on the type
     • Three types included in the pilot config
       • Google
         • The account key to be used when connecting to the WS interface
       • SQL
         • A valid JDBC driver class name
         • Connection data: URL using the jdbc method, hostname, port, database, username, password
       • LDAP
         • URL for the LDAP server
         • Root and search scope
         • Other LDAP parameters: follow referrals, timeout, ...
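
Since the parameters depend on the provider type, a configuration checker might keep one list of required parameters per type. The Python sketch below shows that idea only. The parameter names (`key`, `driver`, `url`, `username`, `password`, `root`, `scope`) are hypothetical labels loosely derived from the bullets above, not the names Searchy expects, and `missing_parameters` is an invented helper.

```python
from typing import Dict, List

# Hypothetical required parameters per provider type; the names are
# illustrative and do not come from the Searchy configuration schema.
REQUIRED = {
    "google": {"key"},
    "sql": {"driver", "url", "username", "password"},
    "ldap": {"url", "root", "scope"},
}


def missing_parameters(provider_type: str, params: Dict[str, str]) -> List[str]:
    """List the required parameters that a provider definition lacks."""
    try:
        required = REQUIRED[provider_type]
    except KeyError:
        raise ValueError(f"unknown provider type: {provider_type!r}") from None
    return sorted(required - set(params))


if __name__ == "__main__":
    ldap = {"url": "ldap://ldap.example.org", "root": "dc=example,dc=org"}
    print(missing_parameters("ldap", ldap))
```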

  8. The <map> element
     • Map name and applicable vocabulary
     • Elements describing input/output transformations
       • <URL>: Do not fiddle with it unless you know what you're doing!
     • One element per input term (type="query")
       • How the query term is translated into the backend query language
         <dc:title filter="query">SELECT titleDB, subjectDB, creatorDB, descriptionDB FROM table WHERE (titleDB="%query%")</dc:title>
     • One element per output term (type="response")
       • How result fields (enclosed between % signs) are transformed to build the term contents in the response
         <dc:description type="response">%snippet%</dc:description>
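
The two transformations can be illustrated with a short Python sketch using an in-memory SQLite database. It is an assumption-laden illustration, not Searchy's implementation: the helper names, the `resources` table and its rows are made up, and the quoted `"%query%"` placeholder is turned into a bound parameter instead of being pasted into the SQL text, a substitution chosen here to avoid injection problems and not something the slides describe.

```python
import re
import sqlite3
from typing import Dict, List, Tuple

PLACEHOLDER = re.compile(r"%(\w+)%")
QUOTED_PLACEHOLDER = re.compile(r'"%(\w+)%"')

# Hypothetical maps, modelled on the <dc:title> and <dc:description> examples.
QUERY_MAP = 'SELECT titleDB, descriptionDB FROM resources WHERE (titleDB="%query%")'
RESPONSE_MAP = {"dc:title": "%titleDB%", "dc:description": "%descriptionDB%"}


def fill_template(template: str, fields: Dict[str, str]) -> str:
    """Replace every %name% placeholder with the matching result field."""
    return PLACEHOLDER.sub(lambda m: str(fields[m.group(1)]), template)


def to_bound_query(template: str, terms: Dict[str, str]) -> Tuple[str, List[str]]:
    """Turn each quoted "%name%" into a '?' marker plus a bound parameter."""
    params: List[str] = []

    def bind(match: "re.Match[str]") -> str:
        params.append(terms[match.group(1)])
        return "?"

    return QUOTED_PLACEHOLDER.sub(bind, template), params


def search(connection: sqlite3.Connection, term: str) -> List[Dict[str, str]]:
    """Run the query map, then build one response record per result row."""
    sql, params = to_bound_query(QUERY_MAP, {"query": term})
    cursor = connection.execute(sql, params)
    columns = [column[0] for column in cursor.description]
    records = []
    for row in cursor:
        fields = dict(zip(columns, row))
        records.append({term_name: fill_template(template, fields)
                        for term_name, template in RESPONSE_MAP.items()})
    return records


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with made-up rows for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE resources (titleDB TEXT, descriptionDB TEXT)")
    conn.executemany("INSERT INTO resources VALUES (?, ?)",
                     [("First title", "First description"),
                      ("Second title", "Second description")])
    return conn


if __name__ == "__main__":
    print(search(demo_connection(), "First title"))
```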

  9. The (PAPIzed) TF-emc2Wiki
     • Available at http://www.rediris.es/wiki/tf-emc2/
     • Protected by PAPI
       • Possibility of full and read-only access
       • We'll be happy to run interoperability tests with other AAIs
     • We'll include all the users in the mailing list
       • Username: your e-mail address
       • Password: you'll receive one that you can (should) change
       • Those already with access to the JRA5Wiki will be automatically enabled
