GATE development hints

  • Reporting bugs

  • Submitting a patch

  • The user guide

  • Continuous integration


Bugs, feature requests

  • Use the tracker on SourceForge

    • http://sourceforge.net/projects/gate/support

  • Give as much detail as possible

    • GATE version, build number, platform, Java version (1.5.0_15, 1.6.0_03, etc.)

    • Steps to reproduce

    • Full stack trace of any exceptions, including "Caused by…"

  • Check whether the bug is already fixed in the latest nightly build
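The platform and Java details asked for above can be read from standard system properties. The following is a minimal sketch, not part of GATE: the class name `EnvironmentReport` is hypothetical, and the GATE version and build number still have to be added by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a GATE class): collects JVM and platform details
// that are worth pasting into a bug report.
class EnvironmentReport {
  /** Reads the standard system properties describing the JVM and OS. */
  static Map<String, String> collect() {
    Map<String, String> info = new LinkedHashMap<String, String>();
    String[] keys = {"java.version", "java.vendor", "os.name", "os.version", "os.arch"};
    for (String key : keys) {
      info.put(key, System.getProperty(key, "unknown"));
    }
    return info;
  }

  /** Formats the details as one "key: value" line per property. */
  static String format(Map<String, String> info) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : info.entrySet()) {
      sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.print(format(collect()));
  }
}
```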


Patches

  • Use the patches tracker on SourceForge

  • Best format is an svn diff against the latest revision in Subversion

    • Save the diff as a file and attach it; don't paste the diff into the bug report.

  • We generally don't accept patches against earlier versions


Patches (2)

  • GATE must compile and run on Java 5

    • Setting source="1.5" and target="1.5" while compiling on Java 6 is not sufficient

    • These settings don't prevent you from calling classes/methods that don't exist in Java 5

  • Test your patch on Java 5 before submitting
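To illustrate why the source and target settings alone are not enough, here is a small sketch using `String.isEmpty()`, which was added in Java 6. The class and method names are hypothetical. Assuming the compiler still resolves calls against the Java 6 class library, the first method compiles with source and target set to 1.5, yet fails at run time on a Java 5 JVM.

```java
// Hypothetical illustration, not GATE code.
class Java5Compat {
  // Compiles on a Java 6 JDK even with source="1.5" and target="1.5",
  // but String.isEmpty() does not exist in the Java 5 class library,
  // so this call fails with NoSuchMethodError when run on Java 5.
  static boolean isEmptyJava6Only(String s) {
    return s.isEmpty();
  }

  // Equivalent check that uses only methods available in Java 5.
  static boolean isEmptyJava5Safe(String s) {
    return s.length() == 0;
  }
}
```

Only a compile and test run on a real Java 5 installation catches the first form reliably.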


The User Guide

  • Everything in GATE is (theoretically) documented in the GATE User Guide

    • http://gate.ac.uk/userguide

  • Every change to the core should be mentioned in the change log

    • http://gate.ac.uk/userguide/chap:changes

  • User guide is written in LaTeX


Updating the user guide

  • Lives in subversion

    • https://gate.svn.sourceforge.net/svnroot/gate/userguide/trunk

  • Build requires pdflatex, htlatex (tex4ht package), sed, make, etc.

    • On Windows, use Cygwin

  • Download http://gate.ac.uk/sale/big.bib and put it in the directory above the .tex files


Updating the user guide (2)

  • Edit the .tex files

  • Graphics, screenshots, etc. should be .png

  • Check in changes to .tex files, the PDF and HTML are regenerated automatically by…


Hudson

  • Continuous integration platform

  • Automatically rebuilds GATE and user guide (among others) whenever they change

  • Also does a clean build of GATE every night

    • Nightly builds published at http://gate.ac.uk/download/snapshots


Hudson

  • JUnit test results available for each build

  • http://gate.ac.uk/hudson


Running GATE Embedded in Tomcat (or any multithreaded system)

Issues and tricks


Introduction

  • Scenario:

    • Implementing a web service (or other web application) that uses GATE Embedded to process requests.

    • Want to support multiple concurrent requests

    • Long-running process - need to be careful to avoid memory leaks, etc.

  • Example used is a plain HttpServlet

    • Principles apply to other frameworks (Struts, Spring MVC, Metro/CXF, Grails…)


Setting up

  • GATE libraries in WEB-INF/lib

    • gate.jar + JARs from lib

  • Usual GATE Embedded requirements:

    • A directory to be "gate.home"

    • Site and user config files

    • Plugins directory

    • Call Gate.init() once (and only once) before using any other GATE APIs
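The "once and only once" requirement could be guarded in code along the following lines. This is a sketch under assumptions: `OnceOnlyInit` is a hypothetical class, and the `Runnable` stands in for whatever performs the real initialisation (for example a call to Gate.init()).

```java
// Hypothetical guard, not a GATE class: runs an initialisation action at most once.
class OnceOnlyInit {
  private final Runnable initAction; // stand-in for the real initialisation
  private boolean done = false;

  OnceOnlyInit(Runnable initAction) {
    this.initAction = initAction;
  }

  /**
   * Runs the action on the first successful call only. The method is
   * synchronized, so a concurrent caller waits until initialisation has
   * finished. Returns true if this call performed the initialisation.
   */
  synchronized boolean initIfNeeded() {
    if (done) {
      return false;
    }
    initAction.run(); // if this throws, done stays false and a retry is possible
    done = true;
    return true;
  }
}
```

In a web application the simpler option is usually to initialise from a single startup hook, so that no guard is needed at all.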


Initialisation using a ServletContextListener

  • ServletContextListener is registered in web.xml

  • Called when the application starts up

<listener>
  <listener-class>gate.web..example.GateInitListener</listener-class>
</listener>

public void contextInitialized(ServletContextEvent e) {
  ServletContext ctx = e.getServletContext();
  File gateHome = new File(ctx.getRealPath("/WEB-INF"));
  Gate.setGateHome(gateHome);
  File userConfig = new File(ctx.getRealPath("/WEB-INF/user.xml"));
  Gate.setUserConfigFile(userConfig);
  // site config is gateHome/gate.xml
  // plugins dir is gateHome/plugins
  try {
    Gate.init();
  } catch (GateException ex) {
    // contextInitialized cannot throw checked exceptions
    throw new RuntimeException("GATE initialisation failed", ex);
  }
}


GATE in a multithreaded environment

  • GATE PRs are not thread-safe

    • Due to design of parameter-passing as JavaBean properties

  • Must ensure that a given PR/Controller instance is only used by one thread at a time
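The hazard with setter-style parameters can be shown without GATE at all. In this sketch `Processor` is a hypothetical stand-in for a PR, and the interleaving that two request threads could produce is replayed in a single thread so that the outcome is deterministic.

```java
// Hypothetical stand-in for a processing resource: its input is passed
// through a setter, in the style of a JavaBean property.
class Processor {
  private String text; // the "parameter", shared by every caller of this instance

  void setText(String text) {
    this.text = text;
  }

  String execute() {
    return "processed " + text;
  }
}

class SharedInstanceHazard {
  /** Replays one possible interleaving of two requests on a shared instance. */
  static String interleaved(Processor shared) {
    shared.setText("request A"); // thread A sets its parameter
    shared.setText("request B"); // thread B overwrites it before A executes
    return shared.execute();     // thread A now processes B's input
  }
}
```

Here `SharedInstanceHazard.interleaved(new Processor())` returns "processed request B", even though the final call belongs to request A.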


First attempt: one instance per request

  • Naïve approach - create new PRs for each request

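public void doPost(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {
  try {
    ProcessingResource pr = (ProcessingResource) Factory.createResource(...);
    try {
      Document doc = Factory.newDocument(getTextFromRequest(request));
      try {
        // do some stuff
      } finally {
        Factory.deleteResource(doc);
      }
    } finally {
      Factory.deleteResource(pr);
    }
  } catch (ResourceInstantiationException e) {
    // creating the PR or the document failed
    throw new ServletException(e);
  }
}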

Many levels of nested try/finally: ugly but necessary to make sure we clean up even when errors occur. You will get very used to these…
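The cleanup order that nested try/finally gives can be traced with a small stand-alone sketch. No GATE classes are involved; the log entries merely stand in for creating and deleting a PR and a document, and `NestedCleanup` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: records each step so the cleanup order is visible.
class NestedCleanup {
  static List<String> run(boolean failDuringProcessing) {
    List<String> log = new ArrayList<String>();
    try {
      log.add("create pr");
      try {
        log.add("create doc");
        try {
          if (failDuringProcessing) {
            throw new IllegalStateException("processing failed");
          }
          log.add("process");
        } finally {
          log.add("delete doc"); // inner resource is released first
        }
      } finally {
        log.add("delete pr");    // outer resource is released last
      }
    } catch (IllegalStateException e) {
      log.add("error: " + e.getMessage());
    }
    return log;
  }
}
```

With `run(true)` the log reads create pr, create doc, delete doc, delete pr, error: processing failed, so both resources are released before the error is handled.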


Problems with this approach

  • Guarantees no interference between threads

  • But inefficient, particularly with complex PRs (large gazetteers, etc.)

  • Hidden problem with JAPE:

    • Parsing a JAPE grammar creates and compiles Java classes

    • Once created, classes are never unloaded

    • Even with simple grammars, eventually OutOfMemoryError (PermGen space)


Second attempt: using ThreadLocals

  • Store the PR/Controller in a thread local variable

private ThreadLocal<CorpusController> controller = new ThreadLocal<CorpusController>() {
  @Override
  protected CorpusController initialValue() {
    // called once per thread, on that thread's first get()
    return loadController();
  }
};

private CorpusController loadController() {
  //...
}

public void doPost(HttpServletRequest request, HttpServletResponse response) {
  CorpusController c = controller.get();
  // do stuff with the controller
}


Better than attempt 1…

  • Only initialise resources once per thread

  • Interacts nicely with typical web server thread pooling

  • But if a thread dies, no way to clean up its controller

    • Possibility of memory leaks
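The "once per thread" behaviour can be checked with a stand-alone sketch. `FakeController` is a hypothetical stand-in for an expensive controller, and a fixed thread pool plays the role of the web server's worker threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: counts how many per-thread instances get built.
class PerThreadInstances {
  static final AtomicInteger created = new AtomicInteger();

  // Stand-in for a controller that is expensive to load.
  static class FakeController {
    FakeController() {
      created.incrementAndGet();
    }
  }

  static final ThreadLocal<FakeController> controller = new ThreadLocal<FakeController>() {
    @Override
    protected FakeController initialValue() {
      return new FakeController();
    }
  };

  /** Runs the "requests" on a fixed pool and returns how many controllers were built. */
  static int simulate(int threads, int requests) {
    created.set(0);
    ExecutorService workers = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < requests; i++) {
      workers.execute(new Runnable() {
        public void run() {
          controller.get(); // builds a controller only on this thread's first request
        }
      });
    }
    workers.shutdown();
    try {
      workers.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return created.get();
  }
}
```

The count never exceeds the number of worker threads, however many requests are made. Note that nothing in the sketch releases a controller when its thread ends, which is the leak risk described above.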


A solution: object pooling

  • Manage your own pool of Controller instances

  • Take a controller from the pool at the start of a request, return it (in a finally!) at the end

  • Number of instances in the pool determines maximum concurrency level


Simple example

private BlockingQueue<CorpusController> pool;

public void init() {
  pool = new LinkedBlockingQueue<CorpusController>();
  for(int i = 0; i < POOL_SIZE; i++) {
    pool.add(loadController());
  }
}

public void doPost(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {
  CorpusController c;
  try {
    // take() blocks if the pool is empty: use poll() if you want to
    // handle an empty pool yourself
    c = pool.take();
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new ServletException("Interrupted while waiting for a controller", e);
  }
  try {
    // do stuff
  } finally {
    pool.add(c);
  }
}

public void destroy() {
  for(CorpusController c : pool) Factory.deleteResource(c);
}
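The poll() alternative mentioned for this example might look like the following sketch. `TimedPool` is a hypothetical generic class, with `T` standing in for CorpusController; it uses the timed form of `BlockingQueue.poll`, so a caller waits for a bounded time and then decides what to do with an empty pool.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical pool, not a GATE class; T stands in for CorpusController.
class TimedPool<T> {
  private final BlockingQueue<T> idle = new LinkedBlockingQueue<T>();

  TimedPool(Iterable<T> instances) {
    for (T instance : instances) {
      idle.add(instance);
    }
  }

  /** Returns an idle instance, or null if none becomes free within the timeout. */
  T borrow(long timeoutMillis) {
    try {
      return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return null;
    }
  }

  /** Puts an instance back; call this from a finally block. */
  void giveBack(T instance) {
    idle.add(instance);
  }
}
```

A servlet that receives null from `borrow` could, for instance, answer with a "server busy" response instead of keeping the request thread blocked indefinitely.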


Exporting the grunt work - the Spring Framework

  • Spring Framework

    • http://www.springsource.org/

    • Handles application startup and shutdown

    • Configure your business objects and connections between them using XML

    • GATE provides helpers to initialise GATE, load saved applications, etc.

    • Built-in support for object pooling

    • Web application framework (Spring MVC)

    • Used by other frameworks (Grails, CXF, …)


Initialising GATE with Spring

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:gate="http://gate.ac.uk/ns/spring">
  <gate:init gate-home="/WEB-INF"
             plugins-home="/WEB-INF/plugins"
             site-config-file="/WEB-INF/gate.xml"
             user-config-file="/WEB-INF/user-gate.xml">
    <gate:preload-plugins>
      <value>/WEB-INF/plugins/ANNIE</value>
    </gate:preload-plugins>
  </gate:init>
</beans>


Loading a saved application

  • scope="prototype" means create a new instance each time we ask for it

    • Default is singleton - one and only one instance

<gate:saved-application id="myApp" location="/WEB-INF/application.xgapp"
                        scope="prototype" />


Spring servlet example

  • Spring provides HttpRequestHandler interface to manage servlet-type objects with Spring

  • Declare an HttpRequestHandlerServlet in web.xml with the same name as the Spring bean


Spring servlet example

  • Write the handler assuming single-threaded access

    • Will use Spring to handle pooling for us

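public class MyHandler implements HttpRequestHandler {
  public void setApplication(CorpusController app) { ... }

  public void handleRequest(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
    try {
      Document doc = Factory.newDocument(getTextFromRequest(request));
      try {
        // do some stuff with the app
      } finally {
        Factory.deleteResource(doc);
      }
    } catch (ResourceInstantiationException e) {
      // creating the document failed
      throw new ServletException(e);
    }
  }
}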


Tying it together

  • web.xml

<!-- set up Spring -->
<listener>
  <listener-class>
    org.springframework.web.context.ContextLoaderListener
  </listener-class>
</listener>

<!-- servlet -->
<servlet>
  <servlet-name>mainHandler</servlet-name>
  <servlet-class>
    org.springframework.web.context.support.HttpRequestHandlerServlet
  </servlet-class>
</servlet>


Tying it together (2)

  • applicationContext.xml

<gate:init ... />

<gate:saved-application id="myApp" location="/WEB-INF/application.xgapp"
                        scope="prototype" />

<bean id="myHandlerTarget" class="my.pkg.MyHandler" scope="prototype">
  <property name="application" ref="myApp" />
</bean>

<bean id="handlerTargetSource"
      class="org.springframework.aop.target.CommonsPoolTargetSource">
  <property name="targetBeanName" value="myHandlerTarget" />
  <property name="minIdle" value="3" />
  <property name="maxIdle" value="3" />
  <property name="whenExhaustedActionName" value="WHEN_EXHAUSTED_BLOCK" />
</bean>

<bean id="mainHandler"
      class="org.springframework.aop.framework.ProxyFactoryBean">
  <property name="targetSource" ref="handlerTargetSource" />
</bean>

