Introduction to Grid Computing

Deployment of a Language Detector Grid Service • Introduction to Grid Computing • Felix Hageloh, Roberto Valenti • University of Amsterdam, 02-11-2005

Overview • Introduction • Required Steps • Our Service • Introduction • The basic idea • Use Case • Interface • Implementation • Problems Encountered • Future Work • Conclusions • Questions

Introduction • Our chosen task: Grid Services • Task Goals: • Build a grid service. • Aggregate the service with another to provide additional, higher-level services

Steps • Get access to the systems • Authentication • Security issues • Obtain User Certificate • Obtain Host Certificate • Implement the service • Create required files • WSDL • QNames • WSDD • JNDI • Compile and create GAR file • As Globus user: • Deploy service • Start container

But you all know this… So… we jump to our service.

Our Service

Our Service: Introduction • We were asked to implement a useful service that could be integrated with other services • We are AI students, so… let’s merge AI and Grid Computing!

Our Service: The Basic Idea • Idea: language detection is a necessary first step in many applications • Useful web service examples: • Email filtering • Information retrieval • Spell checkers • Can also be a component of an aggregated grid service

Our Service: The Basic Idea • What about creating a Language Detector on the Grid? • Training and testing can be extremely time-consuming on a single machine • Data is difficult to obtain -> it can be shared on the Grid • Data can be duplicated for parallel computing

Our Service: Use Case Simple Interface: • Receives a piece of text • Returns a string indicating the language

Our Service: Adding State • Grid services can have state (as opposed to web services) • Not necessary for our service, but added for the learning experience • Added “dummy” state to our service: • Last Operation • Times Used
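The stateful behaviour described above can be illustrated with a minimal Python sketch. It is not the deployed Globus service: the class name `LanguageDetectorSketch`, the placeholder `toy_detect` rule, the initial values, and the choice that only `detect` updates the state are all assumptions made for illustration. The method names mirror the operations of the service interface (detect, getLanguageRP, getLastOpRP, getTimesUsedRP).

```python
class LanguageDetectorSketch:
    """Illustrative stand-in for the stateful service (not the Globus implementation)."""

    def __init__(self, detect_fn):
        self._detect_fn = detect_fn   # any callable: text -> language name
        self.language = ""            # result of the most recent detection
        self.last_op = "none"         # "dummy" state: Last Operation (initial value assumed)
        self.times_used = 0           # "dummy" state: Times Used

    def detect(self, text):
        """Detect the language of `text` and update the state."""
        self.language = self._detect_fn(text)
        self.last_op = "detect"
        self.times_used += 1
        return self.language

    # Read-only accessors; in this sketch they do not modify the state.
    def get_language_rp(self):
        return self.language

    def get_last_op_rp(self):
        return self.last_op

    def get_times_used_rp(self):
        return self.times_used


def toy_detect(text):
    """Hypothetical placeholder rule: 'ij' suggests Dutch, anything else English."""
    return "Dutch" if "ij" in text.lower() else "English"


if __name__ == "__main__":
    service = LanguageDetectorSketch(toy_detect)
    print(service.detect("Dit is blijkbaar een zin"))
    print(service.get_last_op_rp(), service.get_times_used_rp())
```

Passing the detector in as a callable keeps the state handling separate from the detection logic, so the placeholder rule could be swapped for a trained model without touching the state.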

Our Service: Stateful Use Case

Our Service: Interface • Requests and Responses
<xsd:element name="detect" type="xsd:string" />
<xsd:element name="detectResponse" type="xsd:string" />
<xsd:element name="getLanguageRP">
  <xsd:complexType />
</xsd:element>
<xsd:element name="getLanguageRPResponse" type="xsd:string" />
<xsd:element name="getLastOpRP">
  <xsd:complexType />
</xsd:element>
<xsd:element name="getLastOpRPResponse" type="xsd:string" />
<xsd:element name="getTimesUsedRP">
  <xsd:complexType />
</xsd:element>
<xsd:element name="getTimesUsedRPResponse" type="xsd:int" />

Our Service: Interface • Port Types
<portType name="LanguageDetectorPortType" … >
  <operation name="detect">
    <input message="tns:DetectInputMessage" />
    <output message="tns:DetectOutputMessage" />
  </operation>
  <operation name="getLanguageRP">
    <input message="tns:GetLanguageRPInputMessage" />
    <output message="tns:GetLanguageRPOutputMessage" />
  </operation>
  <operation name="getLastOpRP">
    <input message="tns:GetLastOpRPInputMessage" />
    <output message="tns:GetLastOpRPOutputMessage" />
  </operation>
  <operation name="getTimesUsedRP">
    <input message="tns:GetTimesUsedRPInputMessage" />
    <output message="tns:GetTimesUsedRPOutputMessage" />
  </operation>
</portType>

Our Service: Implementation

Language Detection: Basic Idea • Essentially based on probabilities of character combinations • Every language has typical character combinations that are very frequent in that language • “th” in English • “ij” in Dutch • Humans can often detect a language even when they don’t know that specific language

Language Learning: Standard Process • Standard machine learning process

Language Learning: Markov Models • Basic Markov Model • kth order Markov Model
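The formulas for this slide did not survive in the text. As a hedged reconstruction using the standard definitions (the symbols $c_i$ for the $i$-th character and $n$ for the text length are assumptions, not the original notation), a basic (first-order) Markov model factorises the probability of a character sequence as

$$
P(c_1 \dots c_n) = \prod_{i=1}^{n} P(c_i \mid c_{i-1}),
$$

while a $k$th order Markov model conditions each character on the $k$ preceding characters:

$$
P(c_1 \dots c_n) = \prod_{i=1}^{n} P(c_i \mid c_{i-k} \dots c_{i-1}).
$$

Positions before the first character are filled with a start-of-text padding symbol.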

Language Detection: Classification • Transitional probabilities estimated as • Classification
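The formulas for this slide are also missing from the text. A common way to write them, given here as an assumption rather than the original notation, estimates each transition probability for a language $L$ by relative frequency, where $N_L(\cdot)$ counts occurrences of a character sequence in the training text for $L$:

$$
\hat{P}(c_i \mid c_{i-k} \dots c_{i-1}, L) = \frac{N_L(c_{i-k} \dots c_{i-1} c_i)}{N_L(c_{i-k} \dots c_{i-1})}
$$

A text $c_1 \dots c_n$ is then assigned to the language whose model gives it the highest probability:

$$
L^{*} = \arg\max_{L} \prod_{i=1}^{n} \hat{P}(c_i \mid c_{i-k} \dots c_{i-1}, L)
$$

In practice the product is usually computed as a sum of logarithms to avoid numerical underflow on long texts.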

Language Detection: Example • The training text for a language consists of the string “test text” • Learned model, as (context, next character, probability), with ^ marking the start of a word and _ the end of a word:
( ^^, t, 1.0 ) ( ^t, e, 1.0 ) ( te, s, 0.5 ) ( es, t, 1.0 ) ( st, _, 1.0 ) ( te, x, 0.5 ) ( ex, t, 1.0 ) ( xt, _, 1.0 )
• The probability of the string “test” would be:
P(test|L) = P(t|^^) * P(e|^t) * P(s|te) * P(t|es) * P(_|st) = 1 * 1 * 0.5 * 1 * 1 = 0.5
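The worked example can be reproduced with a short Python sketch. It assumes the training string is “test text”, the scored string is “test”, each word is padded with two `^` start symbols and one `_` end symbol, and the model is second order. The function names are illustrative and no smoothing is applied, so any unseen transition yields probability 0.

```python
from collections import Counter, defaultdict

ORDER = 2    # second-order model: two characters of context
START = "^"  # start-of-word padding symbol
END = "_"    # end-of-word symbol


def transitions(word, order=ORDER):
    """Yield (context, next_char) pairs for one padded word."""
    padded = START * order + word + END
    for i in range(order, len(padded)):
        yield padded[i - order:i], padded[i]


def train(text, order=ORDER):
    """Estimate transition probabilities by relative frequency."""
    counts = defaultdict(Counter)
    for word in text.split():
        for context, char in transitions(word, order):
            counts[context][char] += 1
    return {
        context: {char: n / sum(chars.values()) for char, n in chars.items()}
        for context, chars in counts.items()
    }


def probability(text, model, order=ORDER):
    """Probability of `text` under `model`; unseen transitions count as 0."""
    p = 1.0
    for word in text.split():
        for context, char in transitions(word, order):
            p *= model.get(context, {}).get(char, 0.0)
    return p


if __name__ == "__main__":
    model = train("test text")
    print(model["te"])                 # {'s': 0.5, 'x': 0.5}
    print(probability("test", model))  # 0.5
```

A real detector would train one such model per language on a large corpus, add smoothing for unseen transitions, and pick the language with the highest score.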

Language Detection: Performance

Problems Encountered • Necessary tools had to be installed (Ant) • Problems on our machine (GRAM) • Conflicts with other team • Buggy shell script to build GAR file • Sensitive to path lengths/names

Future Work • Connect with other services • Make training and evaluation a grid service • Make it part of a multilingual retrieval engine • Web interface (interactive)

Conclusions • Successfully managed to create and deploy our own web service • Broke loose from the tutorial web service structure • Merged Grid Computing with AI • Got hands-on experience with Grid applications and structure • Many possibilities to integrate and/or extend the implemented service

Questions?
