Using Xaira to explore corpora

Tony McEnery

What is Xaira?
  • An acronym for XML Aware Indexing and Retrieval Architecture
  • The XML-aware version of SARA for the BNC corpus
  • Including the Index Toolkit and the Client
  • Working reliably with all writing systems supported by Unicode
Getting your corpus ready for use with the Xaira client
  • Mark up the corpus in XML
    • Markup can be very complex or very simple
      • If your corpus has no XML markup, use the Index tool (Tools – Preprocess in the Index Toolkit) to add simple XML markup
  • For a corpus in a language with a non-alphabetic writing system, convert it to a Unicode encoding (e.g. UTF-8, UTF-16)
  • Use the Index tool (Tools – Index Wizard in the Index Toolkit) to index your corpus
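The two preparation steps (Unicode conversion and simple XML markup) can be illustrated outside Xaira with a short Python sketch. This is not what Tools – Preprocess actually produces: the element names `text` and `p`, the `id` attribute, and the helper `to_simple_xml` are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def to_simple_xml(raw_bytes, source_encoding, text_id):
    """Decode legacy-encoded bytes and wrap each non-empty line in a <p> element.

    The <text>/<p> layout is a hypothetical minimal markup, not Xaira's own.
    """
    text = raw_bytes.decode(source_encoding)
    paragraphs = [line.strip() for line in text.splitlines() if line.strip()]
    body = "\n".join(f"  <p>{escape(p)}</p>" for p in paragraphs)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f"<text id={quoteattr(text_id)}>\n{body}\n</text>\n"
    )


# A GB2312-encoded snippet is decoded, marked up, and re-encoded as UTF-8.
legacy = "你好 & 再见\n\n第二段".encode("gb2312")
xml_doc = to_simple_xml(legacy, "gb2312", "sample01")
utf8_bytes = xml_doc.encode("utf-8")

# Parsing the result confirms that the output is well-formed XML.
root = ET.fromstring(utf8_bytes)
```

The resulting UTF-8 file is the kind of input the indexing step expects; a real corpus would normally carry richer markup.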
Word query
  • Click on the first icon
  • Type in a search word, click on Query
Quick Query (Phrase Query)
  • Use Quick Query to search for a phrase
Addkey (POS/Lemma) Query
  • Search for a POS class
  • Search for a word of a particular POS class
XML Query
  • Search for an XML element
Query Builder
  • A powerful combination of all query types
Reference
  • Corpus name: Freiburg
  • Subcorpus: Null (not defined)
  • Total number of hits: 380 (found in 28 files)
  • The cursor is on hit No. 379
  • Location of this line: sentence No. 1529 in Sample No. 16 in the file FLOB_C in the FLOB folder
Page mode vs. line mode
  • Page mode: one concordance per page (use Page Up/Down to turn pages)
  • Line mode: KWIC (key word in context), one concordance per line
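For readers unfamiliar with the KWIC layout, here is a minimal Python sketch of how such a display can be produced from plain text. It only illustrates the format; the function `kwic` and its `width` parameter are hypothetical and have nothing to do with Xaira's internals.

```python
import re


def kwic(text, keyword, width=30):
    """Return one formatted line per whole-word, case-insensitive match."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so that all keywords line up in one column.
        lines.append(f"{left:>{width}} [{m.group()}] {right:<{width}}")
    return lines


sample = "The cat sat on the mat. Another cat slept near the door."
result = kwic(sample, "cat", width=20)
for line in result:
    print(line)
```

Each hit occupies one line with the search word in a fixed column, which is what makes scanning and sorting by context practical.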
Sorting
  • Up to three sort keys (1st/2nd/3rd sort)
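Multi-level sorting of concordance lines can be sketched in Python as follows. The choice of keys (first word to the right, then second word to the right, then first word to the left) is an assumption for illustration; in the Client you choose the keys yourself.

```python
def sort_concordances(lines):
    """Sort (left, node, right) triples by three keys, case-insensitively.

    Assumed keys: 1st = first word right of the node, 2nd = second word
    right of the node, 3rd = first word left of the node.
    """
    def key(line):
        left, _node, right = line
        right_words = right.lower().split()
        left_words = left.lower().split()
        r1 = right_words[0] if right_words else ""
        r2 = right_words[1] if len(right_words) > 1 else ""
        l1 = left_words[-1] if left_words else ""
        return (r1, r2, l1)

    return sorted(lines, key=key)


hits = [
    ("the black", "cat", "sat on"),
    ("a small", "cat", "ran away"),
    ("my old", "cat", "sat down"),
    ("the white", "cat", "sat on"),
]
ordered = sort_concordances(hits)
```

Lines that tie on the first key are ordered by the second, and remaining ties by the third, so recurring patterns around the search word end up next to each other.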
Copy, select concordances
  • Right click on a concordance line to copy, (block) select concordance(s), etc.
Thinning and editing
  • Removing unwanted concordances
    • Selection: Keep the selected concordances
    • Reverse selection: Keep the unselected concordances
    • Random: Keep a random sample of the concordances
    • One per text: Keep one concordance from each text
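The four thinning options can be sketched in Python as simple list operations. All function names are hypothetical, hits are modelled as `(text_id, line)` pairs, and keeping the first hit of each text for "one per text" is an assumption; Xaira may choose differently.

```python
import random


def keep_selected(hits, selected):
    """Selection: keep only the hits whose positions are in `selected`."""
    return [h for i, h in enumerate(hits) if i in selected]


def keep_unselected(hits, selected):
    """Reverse selection: keep only the hits that are not selected."""
    return [h for i, h in enumerate(hits) if i not in selected]


def random_thin(hits, n, seed=0):
    """Random: keep n hits chosen at random, preserving the original order."""
    rng = random.Random(seed)
    chosen = sorted(rng.sample(range(len(hits)), min(n, len(hits))))
    return [hits[i] for i in chosen]


def one_per_text(hits):
    """One per text: keep the first hit from each text (an assumption)."""
    seen = set()
    kept = []
    for text_id, line in hits:
        if text_id not in seen:
            seen.add(text_id)
            kept.append((text_id, line))
    return kept


hits = [
    ("A01", "the cat sat"),
    ("A01", "a cat ran"),
    ("B02", "my cat slept"),
    ("C03", "that cat ate"),
]
sample = random_thin(hits, 2)
```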
Collocation
  • Compute collocations of the search term
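As a rough illustration of what computing collocations involves, the Python sketch below counts the words that occur within a fixed window around the search term. It reports raw co-occurrence counts only; the function `collocates` and the `window` parameter are hypothetical, and this is not a description of how Xaira scores collocates.

```python
from collections import Counter


def collocates(tokens, node, window=2):
    """Count words occurring within `window` tokens of each occurrence of node."""
    lowered = [t.lower() for t in tokens]
    node = node.lower()
    counts = Counter()
    for i, tok in enumerate(lowered):
        if tok != node:
            continue
        start = max(0, i - window)
        end = min(len(lowered), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the node itself
                counts[lowered[j]] += 1
    return counts


tokens = "strong tea and strong tea but weak coffee".split()
counts = collocates(tokens, "tea", window=1)
print(counts.most_common())
```

In practice raw counts favour very frequent words, so collocation tools usually rank candidates with an association measure rather than by frequency alone.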
Defining subcorpora
  • “Texts – Column control” in the Client
  • “Texts – Define partition” in the Client
Distribution
  • “Texts – Open Partition” in the Client
  • Distribution can be shown per corpus or per text class
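A distribution by text class amounts to counting hits per class and, for comparability, normalising by the size of each class. The Python sketch below assumes each hit carries a class label and that class sizes in words are known; the function name, the labels, and all figures are invented for illustration.

```python
from collections import Counter


def distribution(hit_classes, class_sizes):
    """Return {class: (raw hits, hits per million words)} for each text class."""
    raw = Counter(hit_classes)
    return {
        cls: (raw[cls], raw[cls] * 1_000_000 / size)
        for cls, size in class_sizes.items()
    }


# Hypothetical data: the class of each hit, and the word count of each class.
hit_classes = ["press", "press", "fiction", "press"]
class_sizes = {"press": 200_000, "fiction": 100_000}
table = distribution(hit_classes, class_sizes)
```

Normalised frequencies make classes of different sizes comparable: here "press" has three times as many hits as "fiction" but only 1.5 times the relative frequency.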
Save and export a query
  • Save: “File – Save (as)” for later use
  • Export (XML): “Query – Listing”
  • Edit a saved query so that you don’t need to retype everything when making a related new query
Xaira FAQs
  • Is Xaira free and where can I get it?
    • Yes, it is free. You can get a copy (a binary for Windows, and source code for compilation on Unix/Linux/Mac systems) from the SourceForge website. The latest release is 115. http://sourceforge.net/project/showfiles.php?group_id=130289
  • Where can I get more documentation?
    • You can get more documentation at the Xaira site: http://www.oucs.ox.ac.uk/rts/xaira/
  • Where can I get technical help?
    • You can sign up for the Xaira Preview List to get help: http://www.tei-c.org.uk/tei-bin/betatest