We implement and present our theory of German inflection in the
lexical knowledge representation language DATR (see Evans & Gazdar
1996; Keller 1995, 1996).
DATR is a rather spartan nonmonotonic language for defining
inheritance networks with path-value equations. The development of
DATR was guided by a number of concerns which we summarise here.
The objective was to design a language which (i) has an explicit
theory of inference, (ii) has an explicit declarative semantics,
(iii) can be readily and efficiently implemented, (iv) has the necessary
expressive power to encode the lexical information presupposed by
work in the unification grammar tradition, and (v) can express all
the evident generalisations and subgeneralisations about such
entries. In keeping with its intendedly minimalist character,
it lacks many of the constructs embodied either in general purpose
AI knowledge representation languages or in contemporary
grammar formalisms. The language is nonetheless sufficiently
expressive to represent concisely the structure of lexical
information across a variety of domains of language description.
It should be stressed that DATR itself is no more than a very general language for lexical description and therefore does not commit or restrict the linguist using it to any particular linguistic framework, theory or formalism, nor is it restricted in the class of natural languages that it can be used to describe. Clearly, it is well suited to lexical frameworks that embrace or are consistent with inheritance and non-monotonicity through networks of nodes, but these are not requirements. DATR can be (and has been) used to implement widely differing theoretical approaches (including ILEX, HPSG, LTAG, Finite State Morphology, Network Morphology, Paradigm Function Morphology), and is perhaps best thought of as a programming language which can be used to implement and test linguistic theories. Indeed, it would not be entirely misleading to think of DATR as a kind of assembly language for constructing (or reconstructing) higher level theories of lexical representation.
Unlike most other formal languages proposed for lexical knowledge representation, DATR is also not restricted in the domains of linguistic description to which it can sensibly be applied. It is designed to be equally applicable at phonological, orthographic, morphological, syntactic and semantic domains of description. But it is not intended to replace existing approaches to those domains. DATR cannot be (sensibly) used without a prior decision as to the theoretical frameworks in which the description is to be conducted; there is thus no `default' framework for describing, say, morphological facts in DATR.
In DATR, information is organised as a network of nodes, where a node is essentially just a collection of related information. In the context of lexical description, a node might correspond to a phoneme, a syllable, a morpheme, a word, a lexeme, etc., or a class of such items. For example, we might have a node describing an abstract Word in German, another for the subclass of German adjectives, another for the particular adjective lexeme Alt (`old') and still more for the individual words that are instances of this lexeme (alte, altem, alten, alter, altes). Each node has associated with it a set of equations that define partial functions from paths to values, where paths and values are both sequences of atoms (which are primitive objects). Atoms in paths are sometimes referred to as attributes. The syntax and terminology of DATR, like its name and its minimalist philosophy, owe more than a little to that of the unification grammar language PATR (Shieber 1986).
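As a rough illustration of the data model just described, the following Python sketch treats a node as a partial function from paths to values, with both represented as tuples of atoms. It is not DATR syntax and does not reproduce DATR's inference rules: the class name Node, the single parent link, and the attribute names (syn, cat, mor, stem, gloss) are all assumptions made for this example, and the parent lookup is only a crude stand-in for inheritance between nodes.

```python
from typing import Dict, Optional, Tuple

# Paths and values are both sequences of atoms; atoms are modelled as strings.
Path = Tuple[str, ...]
Value = Tuple[str, ...]


class Node:
    """A named collection of path-value equations (hypothetical model)."""

    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent  # simplified stand-in for an inheritance link
        self.equations: Dict[Path, Value] = {}

    def define(self, path: Path, value: Value) -> None:
        """Record one equation: at this node, `path` has `value`."""
        self.equations[path] = value

    def query(self, path: Path) -> Optional[Value]:
        """Return the value for `path`, or None where the function is undefined.

        A local equation wins over anything inherited; otherwise the
        parent chain is searched in order.
        """
        node: Optional[Node] = self
        while node is not None:
            if path in node.equations:
                return node.equations[path]
            node = node.parent
        return None


# The three more general nodes from the example in the text.
word = Node("Word")
adjective = Node("Adjective", parent=word)
alt = Node("Alt", parent=adjective)

# Illustrative equations; the attribute names are invented for this sketch.
adjective.define(("syn", "cat"), ("adjective",))
alt.define(("mor", "stem"), ("alt",))
alt.define(("gloss",), ("old",))
```

Under these assumptions, querying Alt for the path (syn, cat) yields the value stated at the adjective node, while querying Word for (mor, stem) yields nothing, since the function is partial and no equation applies there.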
