A Java interface to TIGER
-
TIGER API is a library that gives Java programmers
easy access to the structure of any corpus stored as a TIGER-XML file.
It can process the
TIGER corpus as well as any other corpus encoded in TIGER-XML.
The underlying API specifies a Java object model for corpora encoded in
TIGER-XML and provides methods for traversing syntax trees and accessing
elements such as sentences, syntax graph nodes, and their attributes.
XML parser and object model provided-
The provided parser reads a given TIGER-XML file,
builds the corresponding data structures for all structural aspects of the
corpus, and returns a Java Corpus object. This Corpus object
represents the corpus and its structures and serves as the entry point
for accessing its syntax trees, their nodes, and the nodes' attributes.
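
To make concrete what the parser spares you from, here is a minimal sketch of reading a TIGER-XML fragment by hand with the JDK's DOM parser. It does not use TIGER API itself: the class name TigerXmlSketch, the method sentences, and the sample corpus are assumptions made up for illustration, and only the element and attribute names (s, graph, t, nt, edge, word, pos, cat, idref) follow the TIGER-XML format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch; these names are NOT classes of TIGER API. */
public class TigerXmlSketch {

    // Invented one-sentence corpus: two terminals under one nonterminal.
    public static final String SAMPLE =
          "<corpus id=\"demo\"><body>"
        + "<s id=\"s1\"><graph root=\"s1_500\">"
        + "<terminals>"
        + "<t id=\"s1_1\" word=\"Peter\" pos=\"NE\"/>"
        + "<t id=\"s1_2\" word=\"sleeps\" pos=\"VVFIN\"/>"
        + "</terminals>"
        + "<nonterminals>"
        + "<nt id=\"s1_500\" cat=\"S\">"
        + "<edge label=\"SB\" idref=\"s1_1\"/>"
        + "<edge label=\"HD\" idref=\"s1_2\"/>"
        + "</nt>"
        + "</nonterminals>"
        + "</graph></s>"
        + "</body></corpus>";

    /** Returns, for every s element, the word attributes of its terminals in document order. */
    public static List<List<String>> sentences(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<List<String>> result = new ArrayList<>();
            NodeList sents = doc.getElementsByTagName("s");
            for (int i = 0; i < sents.getLength(); i++) {
                Element s = (Element) sents.item(i);
                NodeList terminals = s.getElementsByTagName("t");
                List<String> words = new ArrayList<>();
                for (int j = 0; j < terminals.getLength(); j++) {
                    words.add(((Element) terminals.item(j)).getAttribute("word"));
                }
                result.add(words);
            }
            return result;
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause.
            throw new IllegalStateException("could not parse TIGER-XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sentences(SAMPLE));
    }
}
```

Even this small task needs explicit DOM traversal and casts; the Corpus object described above is meant to replace that bookkeeping with direct access to sentences and nodes.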
Batteries included-
The library also provides a set of tools for processing the
given corpus. They cover common tasks such as extracting text
strings, retrieving index features of phrases and terminals, generating
ASCII renderings of trees, determining structural relations
between the nodes of a tree, and other syntactic processing tasks.
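
The kinds of tasks listed above can be illustrated on a plain in-memory tree. The sketch below is not the library's tool set: the class TreeToolsSketch, its Node type, and the methods dominates, text, and ascii are hypothetical names chosen for this example. It also assumes a simple ordered tree, whereas TIGER syntax graphs may contain crossing branches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; Node is an illustrative stand-in, not a TIGER API class. */
public class TreeToolsSketch {

    public static final class Node {
        final String label;                       // category or word
        final List<Node> children = new ArrayList<>();

        public Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    /** True if ancestor properly dominates descendant (a path of length >= 1). */
    public static boolean dominates(Node ancestor, Node descendant) {
        for (Node child : ancestor.children) {
            if (child == descendant || dominates(child, descendant)) {
                return true;
            }
        }
        return false;
    }

    /** Joins the leaf labels from left to right: the text covered by the subtree. */
    public static String text(Node n) {
        if (n.children.isEmpty()) {
            return n.label;
        }
        StringBuilder sb = new StringBuilder();
        for (Node c : n.children) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(text(c));
        }
        return sb.toString();
    }

    /** Renders one node per line, indented two spaces per depth level. */
    public static String ascii(Node n) {
        StringBuilder sb = new StringBuilder();
        render(n, 0, sb);
        return sb.toString();
    }

    private static void render(Node n, int depth, StringBuilder sb) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(n.label).append('\n');
        for (Node c : n.children) {
            render(c, depth + 1, sb);
        }
    }

    public static void main(String[] args) {
        Node peter = new Node("Peter");
        Node root = new Node("S", new Node("NP", peter), new Node("sleeps"));
        System.out.print(ascii(root));
        System.out.println(text(root));
        System.out.println(dominates(root, peter));
    }
}
```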
Converting corpus formats-
The API can also be used for converting TIGER-XML encoded corpora
to other formats; a sample converter is included.
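
As a rough idea of what such a conversion involves, the following sketch turns the first sentence graph of a TIGER-XML string into a bracketed tree. It is not the sample converter shipped with the library and does not call TIGER API; it works on the XML directly with the JDK's DOM parser. The class BracketConverterSketch, its methods, and the sample data are assumptions for illustration. Children are emitted in edge order and secondary edges are ignored, which is a simplification.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch; NOT the converter included with TIGER API. */
public class BracketConverterSketch {

    // Invented one-sentence corpus used for the demonstration.
    public static final String SAMPLE =
          "<corpus id=\"demo\"><body><s id=\"s1\"><graph root=\"s1_500\">"
        + "<terminals>"
        + "<t id=\"s1_1\" word=\"Peter\" pos=\"NE\"/>"
        + "<t id=\"s1_2\" word=\"sleeps\" pos=\"VVFIN\"/>"
        + "</terminals>"
        + "<nonterminals><nt id=\"s1_500\" cat=\"S\">"
        + "<edge label=\"SB\" idref=\"s1_1\"/>"
        + "<edge label=\"HD\" idref=\"s1_2\"/>"
        + "</nt></nonterminals>"
        + "</graph></s></body></corpus>";

    /** Converts the first graph element of the given TIGER-XML to a bracketed string. */
    public static String toBrackets(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element graph = (Element) doc.getElementsByTagName("graph").item(0);
            // Index terminals and nonterminals by id so edges can be resolved.
            Map<String, Element> byId = new HashMap<>();
            index(graph.getElementsByTagName("t"), byId);
            index(graph.getElementsByTagName("nt"), byId);
            return render(byId.get(graph.getAttribute("root")), byId);
        } catch (Exception e) {
            throw new IllegalStateException("could not convert TIGER-XML", e);
        }
    }

    private static void index(NodeList nodes, Map<String, Element> byId) {
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            byId.put(e.getAttribute("id"), e);
        }
    }

    private static String render(Element node, Map<String, Element> byId) {
        if (node.getTagName().equals("t")) {
            // Terminal: part-of-speech tag followed by the word.
            return "(" + node.getAttribute("pos") + " " + node.getAttribute("word") + ")";
        }
        StringBuilder sb = new StringBuilder("(").append(node.getAttribute("cat"));
        NodeList edges = node.getElementsByTagName("edge");
        for (int i = 0; i < edges.getLength(); i++) {
            String target = ((Element) edges.item(i)).getAttribute("idref");
            sb.append(' ').append(render(byId.get(target), byId));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(toBrackets(SAMPLE)); // prints (S (NE Peter) (VVFIN sleeps))
    }
}
```

A converter built on the library would presumably walk the Corpus object instead of raw XML, so the id lookup and element casts above would not be needed.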
Accessing other corpus formats-
To access corpus formats other than TIGER-XML, use the TIGERRegistry
(included in
TIGERSearch) to convert them to TIGER-XML first.
For example, you can process corpora encoded in a general bracketing format, the Penn Treebank format,
the Susanne and Christine corpus formats, the NEGRA corpus format, and various other corpus formats.
Detailed instructions: How to access other corpus formats
Easy usage-
The API is easy and intuitive to use: no manual processing of XML is required. TIGER API completely abstracts the corpus object
model from its XML representation.
Open Source-
TIGER API is an open source project released under
the GNU GPL. This means that you are free to use this work, free to
redistribute it, and you are also free to change it in order to adapt it to
your needs.