TIGER API 1.8 - A Java interface to the TIGER corpus

A Java interface to TIGER

TIGER API is a library that gives Java programmers easy access to the structure of any corpus stored as a TIGER-XML file, including the TIGER corpus itself. The API defines a Java object model for corpora encoded in TIGER-XML and provides methods for traversing syntax trees and accessing elements such as sentences, syntax graph nodes, and their attributes.

XML parser and object model provided

The provided parser reads a given TIGER-XML file, builds the corresponding data structures for all structural aspects of the corpus, and returns a Java Corpus object. This Corpus object represents the corpus and its structures and serves as the entry point for accessing its syntax trees, its nodes, and their attributes.
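To make the idea of "parse the XML once, then work with objects" concrete, here is a minimal sketch that uses only the JDK's DOM parser. It does not use TIGER API itself: the class names TigerXmlSketch, Sentence and Node are hypothetical stand-ins for illustration, and the sample is a hand-written fragment that assumes the usual TIGER-XML element layout (s, graph, terminals/t, nonterminals/nt, edge).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TigerXmlSketch {

    /** Hypothetical node type: a terminal (word) or a nonterminal (phrase). */
    static final class Node {
        final String id;
        final String label;      // word for terminals, category for nonterminals
        final boolean terminal;
        final List<String> childIds = new ArrayList<>();

        Node(String id, String label, boolean terminal) {
            this.id = id;
            this.label = label;
            this.terminal = terminal;
        }
    }

    /** Hypothetical sentence type: a node table plus the id of the root node. */
    static final class Sentence {
        final String id;
        final String rootId;
        final Map<String, Node> nodes = new LinkedHashMap<>();

        Sentence(String id, String rootId) {
            this.id = id;
            this.rootId = rootId;
        }
    }

    /** Hand-written sample fragment; not taken from any real corpus file. */
    static final String SAMPLE =
          "<corpus id='demo'><body><s id='s1'><graph root='s1_501'>"
        + "<terminals>"
        + "<t id='s1_1' word='Der' pos='ART'/>"
        + "<t id='s1_2' word='Hund' pos='NN'/>"
        + "<t id='s1_3' word='bellt' pos='VVFIN'/>"
        + "</terminals><nonterminals>"
        + "<nt id='s1_500' cat='NP'><edge label='NK' idref='s1_1'/><edge label='NK' idref='s1_2'/></nt>"
        + "<nt id='s1_501' cat='S'><edge label='SB' idref='s1_500'/><edge label='HD' idref='s1_3'/></nt>"
        + "</nonterminals></graph></s></body></corpus>";

    /** Parses the XML text and builds one Sentence object per s element. */
    static List<Sentence> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<Sentence> sentences = new ArrayList<>();
        NodeList sList = doc.getElementsByTagName("s");
        for (int i = 0; i < sList.getLength(); i++) {
            Element s = (Element) sList.item(i);
            Element graph = (Element) s.getElementsByTagName("graph").item(0);
            Sentence sentence = new Sentence(s.getAttribute("id"), graph.getAttribute("root"));

            // Terminals carry the word form.
            NodeList ts = graph.getElementsByTagName("t");
            for (int j = 0; j < ts.getLength(); j++) {
                Element t = (Element) ts.item(j);
                Node n = new Node(t.getAttribute("id"), t.getAttribute("word"), true);
                sentence.nodes.put(n.id, n);
            }

            // Nonterminals carry a category and edges pointing to their children.
            NodeList nts = graph.getElementsByTagName("nt");
            for (int j = 0; j < nts.getLength(); j++) {
                Element nt = (Element) nts.item(j);
                Node n = new Node(nt.getAttribute("id"), nt.getAttribute("cat"), false);
                NodeList edges = nt.getElementsByTagName("edge");
                for (int k = 0; k < edges.getLength(); k++) {
                    n.childIds.add(((Element) edges.item(k)).getAttribute("idref"));
                }
                sentence.nodes.put(n.id, n);
            }
            sentences.add(sentence);
        }
        return sentences;
    }

    public static void main(String[] args) throws Exception {
        for (Sentence s : parse(SAMPLE)) {
            System.out.println(s.id + " root=" + s.nodes.get(s.rootId).label
                    + " nodes=" + s.nodes.size());
        }
    }
}
```

Once the XML has been read, client code only touches Sentence and Node objects. TIGER API's Corpus object provides the same kind of separation, with a far more complete model.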

Batteries included

The library also provides a set of tools for common corpus-processing tasks, such as extracting text strings, retrieving index features of phrases and terminals, generating graphical ASCII representations of trees, determining tree-structural relations between nodes, and other kinds of syntactic processing.
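Two of the tasks listed above, extracting the text covered by a node and testing a structural relation between two nodes, can be sketched in a few lines. This is an illustration under stated assumptions and not TIGER API code: TreeToolsSketch, Node, text and dominates are hypothetical names, and the tree is built by hand rather than read from a corpus.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TreeToolsSketch {

    /** Hypothetical graph node; position is the surface index of a terminal, -1 for phrases. */
    static final class Node {
        final String label;
        final int position;
        final List<Node> children = new ArrayList<>();

        Node(String label, int position) {
            this.label = label;
            this.position = position;
        }

        static Node terminal(String word, int position) {
            return new Node(word, position);
        }

        static Node phrase(String category, Node... kids) {
            Node n = new Node(category, -1);
            n.children.addAll(Arrays.asList(kids));
            return n;
        }

        boolean isTerminal() {
            return position >= 0;
        }
    }

    /** Collects all terminals at or below the given node. */
    static void collectTerminals(Node node, List<Node> out) {
        if (node.isTerminal()) {
            out.add(node);
            return;
        }
        for (Node child : node.children) {
            collectTerminals(child, out);
        }
    }

    /** Text covered by a node: its terminals in surface order, joined by spaces. */
    static String text(Node node) {
        List<Node> terminals = new ArrayList<>();
        collectTerminals(node, terminals);
        // Sorting by position keeps the surface order even for discontinuous phrases.
        terminals.sort(Comparator.comparingInt(t -> t.position));
        return terminals.stream().map(t -> t.label).collect(Collectors.joining(" "));
    }

    /** True if ancestor properly dominates descendant (a path of one or more edges). */
    static boolean dominates(Node ancestor, Node descendant) {
        for (Node child : ancestor.children) {
            if (child == descendant || dominates(child, descendant)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node der = Node.terminal("Der", 0);
        Node hund = Node.terminal("Hund", 1);
        Node bellt = Node.terminal("bellt", 2);
        Node np = Node.phrase("NP", der, hund);
        Node s = Node.phrase("S", np, bellt);

        System.out.println(text(np));             // text covered by the NP
        System.out.println(text(s));              // text of the whole sentence
        System.out.println(dominates(s, hund));   // S dominates "Hund"
        System.out.println(dominates(np, bellt)); // NP does not dominate "bellt"
    }
}
```

Sorting terminals by surface position, instead of relying on traversal order, matters for TIGER-style graphs because phrases may be discontinuous.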

Converting corpus formats

The API can also be used for converting TIGER-XML encoded corpora to other formats; a sample converter is included.
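As an illustration of what such a conversion involves, the following sketch writes a small tree in a Penn-Treebank-style bracketing. It is not the sample converter shipped with the library: BracketConverterSketch, Node and toBrackets are hypothetical names, the tree is built by hand, and the sketch assumes a tree without crossing branches or secondary edges, which a plain bracketing cannot express directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class BracketConverterSketch {

    /** Hypothetical node: a terminal has a word, a phrase has children. */
    static final class Node {
        final String label;   // POS tag for terminals, category for phrases
        final String word;    // null for phrases
        final List<Node> children = new ArrayList<>();

        Node(String label, String word) {
            this.label = label;
            this.word = word;
        }

        static Node terminal(String pos, String word) {
            return new Node(pos, word);
        }

        static Node phrase(String category, Node... kids) {
            Node n = new Node(category, null);
            n.children.addAll(Arrays.asList(kids));
            return n;
        }
    }

    /** Recursively renders the subtree as "(LABEL child child ...)". */
    static String toBrackets(Node node) {
        if (node.word != null) {
            return "(" + node.label + " " + node.word + ")";
        }
        StringJoiner joiner = new StringJoiner(" ", "(" + node.label + " ", ")");
        for (Node child : node.children) {
            joiner.add(toBrackets(child));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Node np = Node.phrase("NP", Node.terminal("ART", "Der"), Node.terminal("NN", "Hund"));
        Node s = Node.phrase("S", np, Node.terminal("VVFIN", "bellt"));
        System.out.println(toBrackets(s));
        // prints: (S (NP (ART Der) (NN Hund)) (VVFIN bellt))
    }
}
```

A converter for real TIGER-XML data would additionally need a policy for edge labels, discontinuous phrases and secondary edges, none of which this sketch handles.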

Accessing other corpus formats

To access corpus formats other than TIGER-XML, you can use the TIGERRegistry (included in TIGERSearch) to convert them into the TIGER-XML format.

For example, you can process corpora encoded in a general bracketing format, the Penn Treebank format, the Susanne and Christine corpus formats, the NEGRA corpus format, and various other corpus formats.

Detailed instructions: How to access other corpus formats

Easy usage

The API's usage is easy and intuitive: no manual processing of XML is required. TIGER API completely abstracts the corpus object model from its XML representation.

Open Source

TIGER API is an open source project released under the GNU GPL. This means that you are free to use this work, free to redistribute it, and you are also free to change it in order to adapt it to your needs.