A Bibliography of Papers on DATR
Achterholt, Karin, "Phonological Underspecification in
Phonology," Seminar paper: D. Gibbon: Methoden der
Phonetik/Phonologie: Phonetische Beschreibungstechniken,
University of Bielefeld, Bielefeld, 1989.
Andry, Francois, Norman Fraser, Scott McGlashan, Simon Thornton,
and Nick Youd, "Making DATR work for speech: lexicon
compilation in SUNDIAL," Computational Linguistics, vol. 18,
no. 3, pp. 245-267, 1992.
This paper presents a modular inheritance-based tool which
facilitates the rapid construction of linguistic knowledge
bases. Simple lexical entries are added to an application-
specific DATR lexicon which inherits morphosyntactic,
syntactic, and lexico-semantic constraints from an
application-independent set of structured base definitions.
A lexicon generator expands the DATR lexicon out into a
disjunctive normal form lexicon. This is then encoded
either as an acceptance lexicon (in which the constraining
features are bit-encoded for use in pruning word lattices),
or as a full lexicon (which is used for assigning
interpretations or for generating messages). Inheritance
plays a vital role at each level in the compilation
architecture.
Barg, Petra, "Automatic acquisition of DATR theories from
observations," Theorie des Lexikons: Arbeiten des
Sonderforschungsbereichs 282, Heinrich-Heine University of
Duesseldorf, Duesseldorf, 1994.
The automatic acquisition of linguistic knowledge from
examples or observations is a topic of increasing interest.
An approach to this task is presented where the acquired
knowledge is represented in the lexical knowledge
representation language DATR. The basic components of the
learning approach are a set of transformation rules that
define possible transformations of a given DATR theory and a
default-inference algorithm that reduces a monotonic DATR
theory to a default theory. Since the overall approach is
not restricted to any special kind of knowledge, the
heuristic inference strategy requires criteria to evaluate
the quality of a DATR theory with respect to a given set of
observations. Different domains may select different
criteria or give different priority to a set of criteria.
Billigheimer, Diana, "A natural language interface to a speech
therapy database," MSc thesis, University of Sussex,
Brighton, 1990.
This thesis describes a DCG-based natural language interface
to a semantic network that encodes information about
patients, therapists, and communication defects. The
network is implemented in DATR. A DCG parser translates an
English question into a "logical form" and an evaluation
module then uses this as the basis for one or more queries
to the semantic network. A further module then formats the
theorems that result from such queries into something that
can be recognised as an answer to the original English
question.
Bleiching, Doris, "Das Wortfeld 'family' als semantisches Netz,"
Thesis for Staatsexamen (L.A., Sekundarstufe II), University
of Bielefeld, Bielefeld, 1990.
Bleiching, Doris, "Default-Hierarchien in der deutschen
Wortprosodie," ASL-TR-19-91, University of Bielefeld,
Bielefeld, 1991.
Bleiching, Doris, "Prosodisches Wissen im Lexikon," in KONVENS-92,
ed. G. Goerz, pp. 59-68, Springer-Verlag, Berlin, 1992.
Bleiching, Doris, "Integration von Morphophonologie und Prosodie
in ein hierarchisches Lexikon," in Proceedings of KONVENS-94,
ed. Harald Trost, pp. 32-41, Oesterreichische
Gesellschaft fuer Artificial Intelligence, Vienna, 1994.
Brown, Dunstan, "Getting your priorities right: a network
morphology approach to morphological stress," Unpublished
paper presented to the Spring Meeting of the Linguistics
Association of Great Britain, Salford, University of Surrey,
Guildford, 1994.
Brown, Dunstan, "Network Morphology and morphophonological
selection," Unpublished paper presented to the Autumn
Meeting of the Linguistics Association of Great Britain,
Middlesex, University of Surrey, Guildford, 1994.
Brown, Dunstan, "Network Morphology and the Russian verb
[abstract]," in Linguistics at the end of the 20th century:
Achievements and perspectives, ed. A.E. Kibrik, I.M.
Kobozeva, A.I. Kuznecova & T.B. Nazarova, vol. 1, pp.
74-76, Filologiceskij fakultet MGU imeni M.V. Lomonosova,
Moscow, 1995.
Brown, Dunstan, "Setevaja morfologija i russkaja glagol'naja
sistema," Vestnik MGU, vol. 0, pp. 00-00, 1996.
Brown, Dunstan and Andrew Hippisley, "Conflict in Russian
genitive plural assignment: A solution represented in DATR,"
Journal of Slavic Linguistics, vol. 2, no. 1, pp. 48-76,
1994.
Inflectional endings are assigned in languages by general
principles, but these can come into conflict. The paper
addresses the question of how such conflict is resolved. A
particularly complex example is the Russian genitive plural,
where there is a conflict between exponent assignment
according to declension class and a default exponent
assignment for soft-stem nouns. What is especially
interesting is that the conflict here can be resolved by
reference to subsystems over and above the paradigm, such as
stress. An explicit account of the conflict and its
mediation is presented, based on default inheritance. For
this purpose the lexical knowledge representation language
DATR is used. This allows one to demonstrate in the output
provided that the correct forms are indeed predicted by the
theory.
Brown, Dunstan, Greville Corbett, Norman Fraser, Andrew
Hippisley, and Alan Timberlake, "Russian noun stress and
network morphology," Linguistics, vol. 34, no. 1, pp. 53-
107, 1996.
This paper presents a network morphology analysis of Russian
noun stress. Nouns have a default fixed stem stress, but
some nouns have nondefault stress that may deviate in a way
that is determined by the form's position within the
paradigm; different declensions prefer particular patterns
as their nondefault choices. Membership of a particular
declension, it is argued, constrains the range of possible
stress patterns. Stress is represented as a hierarchy with
limited deviation in terms of number and, less often, case.
Indices in the declension hierarchy are addressed to nodes
in the stress hierarchy. These indices correspond to rank
orderings that declensions have for stress patterns. Lexical
items inherit a default value for index rank but may
override this. It is not possible for any override value to
be given at the lexical entry as this has to be evaluated in
the declension hierarchy. The use of cyclicity in metrical
approaches is considered, and it is concluded that lexical
marking is still required. In addition, it is predicted
that accusative forms that are syncretic with the nominative
or genitive on the basis of animacy must have the same
stress as the form with which they are syncretic.
Bouillon, Pierrette, "La morphologie automatique du Francais
avec DATR," Unpublished manuscript, ISSCO, Geneva, 1990.
This paper documents a rather comprehensive DATR fragment
for the morphology of French adjectives, nouns and verbs.
Cahill, Lynne, "Syllable-based morphology for NLP," DPhil thesis,
University of Sussex, Brighton, 1990.
Chapter 5 and Appendices A-D of this thesis show how
expressions of the syllable sequence mapping language MOLUSC
can be embedded in DATR theories so as to provide full
accounts of the morphology and morphophonology of the
Arabic, English and Sanskrit verbal systems. In this
approach, DATR takes care of the distribution of morphemes
whilst MOLUSC is responsible for their phonological
realization.
Cahill, Lynne, "Morphonology in the lexicon," Sixth Conference of
the European Chapter of the Association for Computational
Linguistics, pp. 87-96, 1993.
This paper presents a means of defining morphonological
phenomena in an inheritance-based lexicon, making use of the
theory behind the formal language MOLUSC, in which
morphological alternations were defined as mappings between
sequences of tree-structured syllables. The paper shows how
such alternations can be defined in the inheritance-based
lexical representation language DATR, and how the
phonological aspects can be built upon to create an
integrated lexicon with representations that can be used by
both the morphology and the phonology of a language.
Cahill, Lynne, "Some reflections on the conversion of the TIC
lexicon into DATR," in Inheritance, defaults, and the
lexicon, ed. Ted Briscoe, Valeria de Paiva and Ann
Copestake, pp. 47-57, Cambridge University Press,
Cambridge, 1993.
The Traffic Information Collator (TIC) is a prototype system
which takes verbatim police reports of traffic incidents,
interprets them, builds a picture of what is happening on
the roads and broadcasts appropriate messages to motorists
where necessary. Cahill & Evans (1990) describes the
process of converting the main TIC lexicon (around 1000
words specific to the domain of traffic reports) into DATR.
This paper reviews the strategy adopted in the conversion
discussed in that paper, and discusses the results of
converting the whole lexicon, together with statistics
comparing efficiency and performance between the original
lexicon and the DATR version.
Cahill, Lynne, "An inheritance-based lexicon for message
understanding systems," Fourth ACL Conference on Applied
Natural Language Processing, pp. 211-212, 1994.
Cahill, Lynne and Roger Evans, "An application of DATR: the TIC
lexicon," ECAI-90, pp. 120-125, 1990.
Also in Evans & Gazdar (1990) The DATR Papers, Vol. 1, pp.
31-39. The Traffic Information Collator (TIC) is a natural
language understanding system operating in the domain of
road traffic incident reports. This paper describes the
application of DATR to a fragment of the TIC's lexicon, and
discusses a range of techniques which can be used to
overcome the problems of practical lexical representation.
Cahill, Lynne and Gerald Gazdar, "A lexical analysis of numeral
expressions in three related languages," Unpublished paper,
University of Sussex, Brighton, 1996.
Most work on multilingual lexicons has, in effect, assumed
monolingual lexicons linked only at the level of semantics.
This traditional multilingual lexicon architecture, while
arguably adequate for unrelated languages, makes it
impossible to capture useful generalisations about related
languages. Such generalisations, if captured, can help to
produce more robust, more readily maintainable and more
readily extensible multilingual natural language processing
systems for related languages. These generalisations are to
be found at all levels of linguistic description, not just
at the semantic level. The present paper illustrates this
point by reference to a multilingual lexical analysis of
numeral expressions in Dutch, English and German. We show
that the large bulk of the description can be stated without
reference to the language involved. Thus, while English and
German require language-specific definitions of small
aspects of their syntax and morphology, the three languages
differ significantly only in their phonology.
Cahill, Lynne and Gerald Gazdar, "The inflectional phonology of
German adjectives, determiners and pronouns," Unpublished
paper, University of Sussex, Brighton, 1996.
This is the first of a series of papers that, taken
together, will give an essentially complete account of
inflection in standard German. In this paper we present
that part of the account that covers adjectives, determiners
and third person pronouns, one that captures all the
regularities, subregularities and irregularities that are
involved. The forms are defined in terms of their syllable
structure, as proposed in Cahill (1990, 1993). The
morphological treatment is based on ideas originally set out
by Zwicky in the mid-1980s.
Corbett, Greville and Norman Fraser, "Network morphology: a DATR
account of Russian nominal inflection," Journal of
Linguistics, vol. 29, pp. 113-142, 1993.
The paper presents an analysis of the inflectional
morphology of Russian nominals which encodes information in
terms of a network of nodes and facts. This approach,
called network morphology, makes extensive use of default
inheritance and is formalized in DATR. The analysis given
has been tested and shown to generate the correct forms
for each of the regular declensional classes, and for a
range of irregular items.
Corbett, Greville and Norman Fraser, "Computational linguistics
meets typology [abstract]," in Linguistics at the end of the
20th century: Achievements and perspectives, ed. A.E.
Kibrik, I.M. Kobozeva, A.I. Kuznecova & T.B. Nazarova,
vol. 1, pp. 256-258, Filologiceskij fakultet MGU imeni M.V.
Lomonosova, Moscow, 1995.
Drexel, Guido, "Repraesentation hierarchischer Lexika: DATR in
einer objekt-orientierten Umgebung," MA Thesis, University
of Bielefeld, Bielefeld, 1993.
Duda, Markus and Gunter Gebhardi, "DUTR -- A DATR-PATR interface
formalism," in Proceedings of KONVENS-94, ed. Harald Trost,
pp. 411-414, Oesterreichische Gesellschaft fuer Artificial
Intelligence, Vienna, 1994.
This paper presents a *dynamic* interface between DATR and
PATR.
Evans, Roger, "An introduction to the Sussex Prolog DATR system,"
in The DATR Papers, ed. Roger Evans & Gerald Gazdar, pp.
63-71, University of Sussex, Brighton, 1990.
This paper documents installation and implementation-
specific aspects of the Sussex Prolog DATR system. It
explains how to use the compiler and the various ways in
which a compiled DATR theory can be queried.
Evans, Roger, "Derivational morphology in DATR," in Sussex Papers
in General and Computational Linguistics, ed. Lynne Cahill
and Richard Coates, pp. 55-69, University of Sussex,
Brighton, 1992.
This paper presents a DATR analysis of some aspects of
English derivational morphology, and demonstrates how the
facilities of the language allow succinct description of
derivational concepts. The aim is not to present a new
theory of derivational morphology, but rather to show how
existing ideas in the field can be expressed in terms of
DATR's default and inheritance mechanisms. To this end, the
analysis is based on a single, coherent, but informal
account of the data, namely Bauer's "English Word-formation"
(Cambridge University Press, 1983). The account presented
is a description rather than a representation of
derivational morphology. This entails that representational
issues such as productivity and lexicalisation lie outside
its scope. The implications of this are discussed, and it
is suggested that such a DATR description offers a well-
defined basis for a theory of representation which does
encompass such issues.
Evans, Roger and Gerald Gazdar, "Inference in DATR," Fourth
Conference of the European Chapter of the Association for
Computational Linguistics, pp. 66-71, 1989.
Also in Evans & Gazdar (1990) The DATR Papers, Vol. 1, pp.
15-20. This paper provides a formal definition of the
syntax of the DATR language and the theory of inference.
Evans, Roger and Gerald Gazdar, "The semantics of DATR," in
Proceedings of the Seventh Conference of the Society for the
Study of Artificial Intelligence and Simulation of
Behaviour, ed. Anthony G. Cohn, pp. 79-87, Pitman/Morgan
Kaufmann, London, 1989.
Also in Evans & Gazdar (1990) The DATR Papers, Vol. 1, pp.
21-30. This paper provides a formal definition of a
semantics for the core of the DATR language (value sequences
and evaluable paths are not covered) and shows how this
semantics can be modelled using finite state automata.
Evans, Roger and Gerald Gazdar, "The DATR Papers," Cognitive
Science Research Paper CSRP 139, University of Sussex,
Brighton, 1990.
This volume brings together all the early Sussex-sourced
papers relating to DATR (each of which is listed separately in
this bibliography). Three of these papers have been
published elsewhere, but, for the other four, this technical
report is likely to remain the only source. In addition to
these seven papers, the volume contains nine natural
language DATR lexicon fragments (on Arabic, Baule, English,
German, Japanese, Latin and Tem); eighteen formal DATR
examples that illustrate a wide variety of representational
techniques; and the complete Prolog source code for the
Sussex DATR system.
Evans, Roger and Gerald Gazdar, "DATR: A language for lexical
knowledge representation," Computational Linguistics, vol.
22, no. 2, pp. 167-216, 1996.
This paper argues that DATR, though minimalist in
conception, is sufficiently expressive to represent
concisely the structure of lexical information at a variety
of levels of linguistic analysis. The paper provides an
informal example-based introduction to DATR and to
techniques for its use, including finite state transduction,
the encoding of DAGs and lexical rules, and the
representation of ambiguity and alternation. Sample
analyses of phenomena such as inflectional syncretism and
verbal subcategorisation are given which show how the
language can be used to squeeze out redundancy from lexical
descriptions.
Evans, Roger, Gerald Gazdar, and Lionel Moser, "Prioritised
multiple inheritance in DATR," in Inheritance, defaults, and
the lexicon, ed. Ted Briscoe, Valeria de Paiva and Ann
Copestake, pp. 38-46, Cambridge University Press,
Cambridge, 1993.
Also "Proceedings of the Acquilex Workshop on Default
Inheritance in the Lexicon", Technical Report No. 238,
University of Cambridge Computer Laboratory, October 1991.
The authors characterise a notion of prioritised multiple
inheritance (PMI) and contrast it with the more familiar
orthogonal multiple inheritance (OMI). DATR was designed to
facilitate OMI analyses of natural language lexicons: it
contains no special purpose facility for PMI and this has
led some researchers to conclude that PMI analyses are
beyond the expressive capacity of DATR. Here, the authors
present three different techniques for implementing PMI
entirely within DATR's existing syntactic and semantic
resources. In presenting them, they draw attention to their
respective advantages and disadvantages.
Evans, Roger, Gerald Gazdar, and David Weir, "Using default
inheritance to describe LTAG," Colloque International sur
les grammaires d'Arbres Adjoints (TAG+3), TALANA-RT-94-01,
TALANA, Universite' Paris VII, Jussieu, Paris, 1994.
The authors investigate how the set of elementary trees of a
Lexicalized Tree Adjoining Grammar (LTAG) can be represented
in DATR. DATR's default mechanism is used to eliminate the
need for a non-immediate dominance relation in the
descriptions of surface LTAG entries. This allows tree
structures to be embedded in the feature theory in a manner
reminiscent of HPSG subcategorization frames, and hence also
allows lexical rules to be expressed as relations over
feature structures.
Evans, Roger, Gerald Gazdar, and David Weir, "Encoding
lexicalized tree adjoining grammars with a nonmonotonic
inheritance hierarchy," Proceedings of the 33rd Annual
Meeting of the Association for Computational Linguistics,
pp. 77-84, 1995.
This paper shows how DATR can be used to define an LTAG
lexicon as an inheritance hierarchy with internal lexical
rules. A bottom-up featural encoding is used for LTAG trees
and this allows lexical rules to be implemented as
covariation constraints within feature structures. Such an
approach eliminates the considerable redundancy otherwise
associated with an LTAG lexicon.
Fabre, Cecile and Anne Le Draoulec, "Organisation d'un lexique
bilingue pour les verbes Anglais et Francais en langage
DATR," MA Project Report, Universite' Paris VII, Jussieu,
Paris, 1992.
Fischer, Kerstin, "Kompositionelle Semantik am Beispiel der
englischen denominalen Nominalkomposita," MA Thesis,
University of Bielefeld, Bielefeld, 1993.
Fraser, Norman, "Derivational morphology in DATR: a new
proposal," Unpublished manuscript, University of Surrey,
Guildford, 1994.
This paper draws attention to a novel way of structuring the
derivational morphology problem in DATR by mapping sequences
of semantic attributes (interpreted as nested modifiers)
into derived forms.
Fraser, Norman and Greville Corbett, "Gender, animacy, and
declensional class assignment: a unified account for
Russian," in Yearbook of Morphology 1994, ed. Geert Booij &
Jaap van Marle, pp. 123-150, Kluwer, Dordrecht, 1995.
This paper extends the DATR analysis presented in Corbett
and Fraser (1993) to allow for the complex interactions of
meaning, gender, declensional class and phonology in the
assignment of gender in Russian.
Fraser, Norman and Greville Corbett, "Gender assignment in
Arapesh: a Network Morphology analysis," Lingua, vol. (to
appear), pp. 00-00, 1995.
This paper explores the various notions of default that are
relevant to morphology, in the context of an analysis of the
noun classes and genders of Arapesh.
Gazdar, Gerald, "An introduction to DATR," in The DATR Papers,
ed. Roger Evans & Gerald Gazdar, pp. 1-14, University of
Sussex, Brighton, 1990.
DATR is a declarative language for representing a restricted
class of inheritance networks, permitting both multiple and
default inheritance. The principal intended area of
application is the representation of lexical entries for
natural language processing. The goal of the DATR
enterprise is the design of a simple language that (i) has
the necessary expressive power to encode the lexical entries
presupposed by contemporary work in the unification grammar
tradition, (ii) can express all the evident generalizations
about such entries, (iii) has an explicit theory of
inference, (iv) is computationally tractable, and (v) has an
explicit declarative semantics. The present paper sketches
the brief history of default inheritance approaches to the
lexicon; provides an informal guided tour to DATR via an
extended example that deals with English verb morphology and
subcategorisation; and ends by providing answers to a set of
questions about DATR.
Gazdar, Gerald, "Ceteris paribus," Unpublished paper, University
of Sussex, Brighton, 1990.
This paper uses the morphology of Latin nouns as an example
on which to base an extended informal introduction to the
DATR language, concentrating on default inheritance and the
rules of inference. An appendix provides a full DATR
treatment of Latin noun morphology involving 5 declensions
and 18 subdeclensions.
Gazdar, Gerald, "Paradigm function morphology in DATR," in Sussex
Papers in General and Computational Linguistics, ed. Lynne
Cahill and Richard Coates, pp. 43-53, University of Sussex,
Brighton, 1992.
This paper shows how Stump's "paradigm function" (PFM)
approach to inflectional morphology can be implemented in
DATR. PFM analyses can be encoded in DATR without any loss
in concision over Stump's own notation, but with a great
gain in generality, since Stump's notation is ad hoc to PFM
analyses of inflectional morphology. DATR is thus to be
preferred to Stump's own notation on general methodological
grounds. An appendix suggests that there may also be
analytical grounds for preferring DATR in view of the
difficulties that the Swahili object agreement facts cause
for Stump's notation.
Gibbon, Dafydd, "PCS-DATR: A DATR implementation in PC Scheme,"
English/Linguistics Interim Report No. 3, University of
Bielefeld, Bielefeld, 1989.
This paper documents the Bielefeld PC-Scheme DATR
implementation. The latter is a menu-directed, window-
oriented DATR development environment based on an
interpreter. The paper includes an informal review of DATR,
a guide to the installation and use of PCS-DATR, a
description of implementation-specific aspects of the
interpreter, a high-level explanation of how it works, and a
set of example files.
Gibbon, Dafydd, "Prosodic association by template inheritance,"
in Proceedings of the Workshop on Inheritance in Natural
Language Processing, ed. Walter Daelemans & Gerald Gazdar,
pp. 65-81, ITK (Institute for Language Technology & AI),
Tilburg, 1990.
The domain of morphophonological structures in natural
language lexica is notoriously difficult to describe with
standard formal approaches. The morphoprosodic subdomain,
i.e. lexical suprasegmental structure (stress, tone, vowel
harmony, vowel and consonant mutation) is one of the hardest
parts to model explicitly and in a linguistically adequate
fashion. In this paper, two examples -- the standard
"benchmark" examples of subsets of Kikuyu tone and Arabic
binyan systems -- are selected, and a new approach to
lexical prosody description (morphoprosody) using prosodic
inheritance with defaults (PI) is described and implemented
in DATR.
Gibbon, Dafydd, "Lexical signs and lexicon structure: phonology
and prosody in the ASL-Lexicon," Verbundprojekt ASL-MEMO-
20-91/UBI, University of Bielefeld, Bielefeld, 1991.
Gibbon, Dafydd, "ILEX: a linguistic approach to computational
lexica," in Computatio Linguae: Aufsaetze zur algorithmischen
und quantitativen Analyse der Sprache (Zeitschrift fuer
Dialektologie und Linguistik, Beiheft 73), ed. Ursula Klenk,
pp. 32-53, Franz Steiner Verlag, Stuttgart, 1992.
The present paper is an attempt to identify some of the
linguistic criteria for lexicon development, and to present
an integrated approach which addresses not only the question
of the structure of individual lexical entries, but also the
issue of the structure of the lexicon as a whole. A
particularly neglected area is the integrated representation
of the morphological and morphophonological generalisations
in the lexicon. The ILEX approach (Inheritance Lexicon with
EXceptions) was developed with the aim of ameliorating this
situation on the basis of explicit linguistic and
computational criteria of adequacy. ILEX models are
currently implemented in DATR.
Gibbon, Dafydd, "The lexical representation of prosody," ELSNET
Summer School on Prosody course booklet, University of
Bielefeld, Bielefeld, 1993.
This 92-page course booklet provides an introduction to
prosody and its role in the lexicon, and covers criteria for
lexical representation, structural stress in English
compounds, tone, and multi-linear morphology. The use of
DATR for representing lexical prosody is discussed and
extensive examples are given, drawn from Arabic, Yacouba,
Kikuyu, Baule and Tem.
Gibbon, Dafydd, "Generalised DATR for flexible access: Prolog
specification," Deliverable VM-TP5.3-D1, University of
Bielefeld, Bielefeld, 1993.
A representation language with quantification over DATR
theorem constituents, EDQL (Extended DATR Query Language) is
introduced, with variables which also permit EDQL to be
interfaced with Prolog and other formalisms by structure-
sharing. The prototype implementation and applications are
briefly described.
Gibbon, Dafydd, "Generalised DATR inference for lexicon
development and interfacing," Unpublished manuscript,
University of Bielefeld, Bielefeld, 1994.
Gibbon, Dafydd and Firmin Ahoua, "DDATR: un logiciel de
traitement d'he'ritage par de'faut pour la mode'lisation
lexicale," Cahiers Ivoiriens de Recherche Linguistique
(CIRL), vol. 27, pp. 5-59, 1991.
The aim of this paper is to present the properties of DATR
and directions for the use of the DDATR software for
developing and testing DATR descriptions. The DATR language
is capable of integrating recent developments in the lexical
domain in linguistics and computational linguistics. It is
presented as a means of formalising linguistic theories in
the lexical domain in a homogeneous and explicit manner. It
offers not only a means of expressing linguistically
significant generalisations with respect to the criterion of
descriptive adequacy, but also a means of testing the
validity, the coherence and the exhaustivity of complex
generalised lexical descriptions.
Gibbon, Dafydd and Doris Bleiching, "An ILEX model for German
compound stress in DATR," Paper presented and distributed at
the FORWISS-ASL Workshop on Prosody in Man-Machine
Communication, 1991.
This paper notes a number of conditions on German compound
stress and suggests a description in terms of the ILEX
(Integrated Lexicon with EXceptions) model.
Hippisley, Andrew, "Default inheritance and Russian word
formation: An account of Russian denominal adjectives
represented in DATR," Manuscript of paper presented to the
Spring Meeting of the Linguistics Association of Great
Britain, Salford, University of Surrey, Guildford, 1994.
Hippisley, Andrew, "Expressive derivation in Russian represented
in DATR [abstract]," in Linguistics at the end of the 20th
century: Achievements and perspectives, ed. A.E. Kibrik,
I.M. Kobozeva, A.I. Kuznecova & T.B. Nazarova, vol. 1,
pp. 525-526, Filologiceskij fakultet MGU imeni M.V.
Lomonosova, Moscow, 1995.
Hippisley, Andrew, "Russian expressive derivation: a Network
Morphology account," The Slavonic and East European Review,
vol. 74, no. 2, pp. 201-222, 1996.
Jacob, Sabine, "Entwicklung eines DATR-Lexikons zur UCG-basierten
Analyse natuerlichsprachlicher deutscher Saetze," MSc
thesis, Friedrich Alexander University of Erlangen
Nuremberg, Nuremberg, 1993.
Jenkins, Elizabeth, "Enhancements to the Sussex Prolog DATR
implementation," in The DATR Papers, ed. Roger Evans &
Gerald Gazdar, pp. 41-61, University of Sussex, Brighton,
1990.
This paper describes a range of enhancements to the original
(1989) Sussex Prolog DATR implementation. These include
DATR declarations (for atoms, for nodes, and for theorem
dumps); DATR variables (an abbreviatory notation); a
procedural interface; and an interface that allows DATR
queries to be expressed in DATR syntax.
Jenkins, Elizabeth, "Japanese verbs in DATR," in The DATR Papers,
ed. Roger Evans & Gerald Gazdar, pp. 73-78, University of
Sussex, Brighton, 1990.
This short paper presents a DATR analysis of the morphology
of the Japanese verbal system which covers the inflection of
the 11 regular verb types and the 3 irregular verbs.
Keller, William, "DATR theories and DATR models," Proceedings of
the 33rd Annual Meeting of the Association for Computational
Linguistics, pp. 55-62, 1995.
This paper presents a formal semantics for DATR which treats
DATR theories as collections of function definitions.
Keller, William, "An evaluation semantics for DATR theories,"
COLING-96, pp. 646-651, 1996.
This paper describes an operational semantics for DATR
theories that axiomatises the relationship between DATR
expressions and their values. The inference rules provide a
clearer picture of the way in which DATR works, and should
lead to a better understanding of the mathematical and
computational properties of the language.
Kilbury, James, "Strict inheritance and the taxonomy of lexical
types in DATR," Unpublished manuscript (revised version to
appear in 1994), University of Duesseldorf, Duesseldorf,
1992.
This paper describes a technique that allows one to assign
lexical types represented by DATR nodes to individual DAGs
associated with lexemes. The result is obtained by
extending a highly restricted subclass of DATR theories to
reflect the distinction between strict and defeasible
information.
Kilbury, James, "Paradigm-based derivational morphology,"
Unpublished manuscript, University of Duesseldorf,
Duesseldorf, 1993.
This paper sketches an approach to derivational morphology
that is based on the notion of the paradigm and provides new
possibilities for an integrated treatment of inflection and
derivation. The principal innovation lies in the use of
cross-subcategorization to describe derivational
combinations. The notion of a derivational closure is also
introduced. Advantages of the approach for computational
morphology involve both the representation and the
processing of derivational information. Primary attention
is directed at derivational morphotactics.
Kilbury, James, Petra Naerger, and Ingrid Renz, "DATR as a
lexical component for PATR," Fifth Conference of the
European Chapter of the Association for Computational
Linguistics, pp. 137-142, 1991.
The representation of lexical entries requires special means
which basic PATR systems do not include. The language DATR,
however, can be used to define an inheritance network
serving as the lexical component. The integration of such a
module into an existing PATR system leads to various
problems which are discussed together with possible
solutions.
Kilbury, James, Petra Naerger, and Ingrid Renz, "New lexical
entries for unknown words," Unpublished manuscript,
University of Duesseldorf, Duesseldorf, 1992.
This paper presents an approach for simulating the
acquisition of new lexical entries for unknown words, an
issue that is central to NLP since no lexicon can ever be
complete. Acquisition involves two main tasks. First, the
appropriate information about an unknown word in a given
linguistic context (i.e. sentence) is identified. It is
shown that this task requires new general considerations
about shared information in unification-based
representations. Second, the collected information is
formulated in a new lexical entry according to a
comprehensive theory of the lexicon which defines the form
of lexical entries and the relations between them. This
task is solved by a general algorithm that depends only on
the form of the collected information and is independent of
the content, i.e. treats all unknown words the same way.
Kilbury, James, Petra Barg, and Ingrid Renz, "Simulation
lexicalischen Erwerbs," in Kognitive Linguistik:
Repraesentation und Prozesse, ed. Sascha W. Felix,
Christopher Habel & Gert Rickheit, pp. 251-271,
Westdeutscher Verlag, Opladen, 1994.
This paper is a (German) descendant of Kilbury, Naerger &
Renz (1992). It presents a model for the processing of
unknown words and the acquisition of corresponding lexical
entries. The linguistic model was formulated in the
unification-based paradigm as a computer simulation with the
system QPATR. The central assumption is that the processing
of unknown words is subject to the same principles as that
of natural language in general. It is shown how information
about unknown words is accumulated during parsing: an
independent component using a DATR-based model of the
lexicon builds new lexical entries for the unknown words and
integrates these entries in the existing lexicon.
Kilgarriff, Adam, "Inheriting verb alternations," Sixth
Conference of the European Chapter of the Association for
Computational Linguistics, pp. 213-221, 1993.
This paper shows how the verbal lexicon can be formalised in
a way that captures and exploits generalisations about the
alternation behaviour of verb classes. An alternation is a
pattern in which a number of words share the same
relationship between a pair of senses. The alternations
captured are ones where the different senses specify
different relationships between syntactic complements and
syntactic arguments, as between "bake" in "John is baking
the cake" and "The cake is baking". The formal language
used is DATR. The lexical entries built are those of
HPSG. The complex alternation behaviour shared between
families of verbs is elegantly represented in a way that
makes generalisations explicit, and offers practical
benefits to computational lexicographers.
Kilgarriff, Adam, "Inheriting polysemy," in Computational Lexical
Semantics, ed. Patrick Saint-Dizier and Evelyne Viegas, pp.
319-335, CUP, Cambridge, 1995.
There are many patterns of variation in word sense, or
`sense alternations', which apply to classes of words in
English. A description of the lexical resources of a
language would ideally make the alternations explicit,
exploit the generalisations about them to give a concise
representation, present them in a consistent and uniform
manner, and indicate how they interact with each other and
with other varieties of information to be stored in the
lexicon. The paper presents dictionary data illustrating
some facts and generalisations about sense alternations and
shows how they can be expressed in DATR.
Kilgarriff, Adam and Gerald Gazdar, "Polysemous relations," in
Grammar and meaning: essays in honour of Sir John Lyons, ed.
Frank Palmer, pp. 1-25, CUP, Cambridge, 1995.
This paper uses DATR to represent polysemous relations such
as those that hold between the fibre, yarn, cloth and
garment senses of a lexeme like 'silk'. Such polysemous
relations are pervasive in the lexicon, and yet their
subregular character has only rarely been recognized.
Langer, Hagen, "DELASOUL: Eine constraintbasierte
Beschreibungssprache fuer lexicalische Repraesentationen,"
Verbundprojekt ASL-TR-26-92/UBI, University of Bielefeld,
Bielefeld, 1992.
Langer, Hagen, "Reverse queries in DATR," COLING-94, vol. 2, pp.
1089-1095, 1994.
DATR is a declarative representation language for lexical
information and as such, in principle, neutral with respect
to particular processing strategies. Previous DATR
compiler/interpreter systems support only one access
strategy that closely resembles the set of inference rules
of the procedural semantics of DATR. In this paper, we
present an alternative access strategy (reverse query
strategy) for a non-trivial subset of DATR.
Langer, Hagen, "DATR without nodes and global inheritance,"
Unpublished manuscript, Universitaet Osnabrueck, Osnabrueck,
1994.
This paper investigates which elements of the DATR language
essentially contribute to its expressive capabilities and
which are dispensable for the purposes DATR has been
developed for. A subset of DATR is considered, called local
path DATR (LDATR), that eliminates the concepts of node and
global inheritance by redefining them in a pseudo-
bootstrapping manner in terms of local path inheritance
alone. For an arbitrary standard DATR theory D, there is an
LDATR theory L such that each theorem of D corresponds to an
equivalent theorem of L. This is shown by giving general
translation rules which map an arbitrary standard DATR
theory onto its LDATR counterpart. The main result of the
paper is that restricting DATR to the rules of inference I
and IV yields a DATR-equivalent formalism (and thus also a
Turing-equivalent one). Furthermore, a version of LDATR
with variables is strongly equivalent to a substantial
subset of standard DATR. Finally, some consequences of
using the LDATR approach as a modelling convention for
lexicon development in DATR are discussed.
Langer, Hagen and Dafydd Gibbon, "DATR as a graph representation
language for ILEX speech oriented lexica," Verbundprojekt
ASL-TR-43-92/UBI, University of Bielefeld, Bielefeld, 1992.
An approach to computational morphology and morphophonology
based on DATR, a task-oriented implementation (DDATR), and a
task-oriented modelling convention (ILEX: Integrated Lexicon
with EXceptions) are described and discussed in terms of
their adequacy for linguistic modelling in the context of
constraint-based, incremental, and maximally deterministic
speech recognition. It is shown that the approach meets
these specifications, while in the case of other approaches
proposed for the same purpose, in particular typed feature
structure formalisms with distributed disjunction, either
the specifications are not met, or their properties in
respect of the specifications have not been described and
are unknown.
Light, Marc, Sabine Reinhard, and Marie Boyle-Hinrichs, "INSYST:
an automatic inserter system for hierarchical lexica," Sixth
Conference of the European Chapter of the Association for
Computational Linguistics, p. 471, 1993.
When using hierarchical formalisms for lexical information,
the need arises to insert (i.e., classify) lexical items
into these hierarchies. This includes at least the
following two situations: (1) testing generalizations when
designing a lexical hierarchy; (2) transferring large
numbers of lexical items from raw data files to a finished
lexical hierarchy when using it to build a large lexicon.
Up until now, no automated system for these insertion tasks
existed. INSYST (INserter SYSTem) can efficiently insert
lexical items under the appropriate nodes in hierarchies.
It currently handles hierarchies specified in the DATR
formalism. The system uses a classification algorithm that
maximizes the number of inherited features for each entry.
Light, Marc, "Classification in feature-based default inheritance
hierarchies," in Proceedings of KONVENS-94, ed. Harald
Trost, pp. 220-229, Oesterreichische Gesellschaft fuer
Artificial Intelligence, Vienna, 1994.
[Also appeared as Technical Report 473, Computer Science
Department, University of Rochester, 1993.] When one works
with a system that utilizes inheritance hierarchies, the
following problem often arises. A new object is introduced
and it must be integrated into a hierarchy: under which
classes in the hierarchy should the new object be
positioned? In this paper, the problem is formalized for
feature-based default inheritance hierarchies. Since it
turns out to be NP-complete, an approximation for it is
presented. This algorithm is shown to be efficient and some
of the possible problematic situations for the algorithm are
examined. Although more analysis and experimentation are
needed, these preliminary results show that the algorithm
warrants such efforts.
McFetridge, Paul and Aline Villavicencio, "A hierarchical
description of the Portuguese verb," in Proceedings of the
XIIth Brazilian Symposium on Artificial Intelligence, pp.
302-311, Campinas, 1995.
Mertins, Inge, "Lexical Semantics: an ILEX-DATR account of
English verbs of cooking," MA Thesis, University of
Bielefeld, Bielefeld, 1993.
Moser, Lionel, "Multiple inheritance in DATR: a quick tour," in
The Fourth White House Papers: Graduate Research in the
Cognitive and Computing Sciences at Sussex, ed. Richard
Dallaway, Teresa Del Soldato, and Lionel Moser, pp. 100-104,
University of Sussex, Brighton, 1991.
Inheritance hierarchies with multiple inheritance have long
been studied in AI as structures which have the potential to
permit default reasoning. When a class or instance inherits
from multiple parents, conflicting theorems may be provable.
DATR is a knowledge representation language which supports
path-based multiple inheritance, but is restricted to
deterministic inference. In general, path-based inheritance
requires that the inheritance for a given path be uniquely
specified. In this paper I outline some recent research on
representing default multiple inheritance within the
constraints of deterministic inference such as is used in
recent NLP lexical inheritance representations.
Moser, Lionel, "DATR paths as arguments," Cognitive Science
Research Paper CSRP 215, University of Sussex, Brighton,
1992.
DATR is a lexical knowledge representation language which is
designed to support the lexicon in an NLP system. Its syntax
and semantics are designed to support the types of inference
required in computational lexicography. It was not a design
intention of the language to support general logic
programming, yet in this paper we show that the types of
inference permitted in the language do support a general
type of logical inference. DATR and Prolog are both
declarative languages, and each has its own inference engine
or theorem prover, although the two engines are quite
different. DATR allows at least a subset of Prolog-
definable logic programs to be encoded.
Moser, Lionel, "Lexical constraints in DATR," Cognitive Science
Research Paper CSRP 216, University of Sussex, Brighton,
1992.
DATR contains no special features to support testing of
equality, negation, disjunction, or multiple inheritance.
Nevertheless, given an appropriate interpretation it is
possible, within DATR's existing syntax and semantics, to
represent these operations. In this paper we review the
technique known as `negative path extension', and show how
it can be used to reconstruct negation, disjunction, and
equality testing. We then show how these operations can be
used to define what are essentially meta-level constraints
on DATR lexical derivation.
Moser, Lionel, "More multiple inheritance in DATR," Manuscript,
University of Sussex, Brighton, 1992.
In this paper we discuss the representation in DATR of two
multiple inheritance paradigms: (a) prioritized multiple
inheritance, and (b) skeptical multiple inheritance. The
former has been presented in earlier work; in this paper we
extend that work and show that another multiple inheritance
paradigm, skeptical multiple inheritance, is also
reconstructible in DATR.
Moser, Lionel, "Evaluation in DATR is co-NP-hard," Cognitive
Science Research Paper CSRP 240, University of Sussex,
Brighton, 1992.
A lower bound of co-NP for the time complexity of DATR query
evaluation is established by showing that an NP-complete
language can be recognized in DATR, and that its complement
can be as well. An upper bound of co-NP is established as
well, thus showing that the complexity of DATR query
evaluation is co-NP.
Moser, Lionel, "Simulating Turing machines in DATR," Cognitive
Science Research Paper CSRP 241, University of Sussex,
Brighton, 1992.
This paper shows (i) how an arbitrary Turing machine can be
simulated in DATR, (ii) that the computational complexity of
DATR is Turing equivalent, and hence (iii) that the
termination of DATR query evaluation is undecidable.
Pampel, Martina, "Die Repraesentation lexicalischen
phonologischen Wissens am Beispiel der Wortbetonung," MA
thesis, University of Bielefeld, Bielefeld, 1992.
Poch, Anna, "Representacion del conocimiento lexico: un analisis
con DATR," PhD thesis, University of Barcelona, Barcelona,
1992.
This thesis shows how DATR may be used to encode a lexicon
for Hudson's (1990) Word Grammar analysis of English.
Reinhard, Sabine, "Adaequatheitsprobleme automatenbasierter
Morphologiemodelle am Beispiel der deutschen Umlautung," MA
thesis, University of Trier, Trier, 1990.
Computational linguistic morphological models must not only
be able to describe concatenation operations correctly but
also more complex association operations (e.g. umlaut and
word stress) as well as the conditions which hold for
occurrence of these operations. Finite state models are
criticised on the grounds of their linguistic inadequacy or
fragmentary character. The thesis exploits Gibbon's DATR-
based 'prosodic inheritance' (PI) approach to morphology and
morphophonology, and applies it to inflectional and
derivational umlauting in German nouns. The approach has
the properties of compact lexical representation, integrated
treatment of concatenation and association operations, and
elegant description of complex dependencies between
morphological operations and morphological and syntactic
conditions. The PI approach differs radically from
computational morphological systems with hybrid formalisms
such as Koskenniemi's 2-level model with continuation lexica
and two-level rules, its derivatives with feature-based
lexicons, and Cahill's DATR-driven morphology with
phonological descriptions in MOLUSC.
Reinhard, Sabine, "Verarbeitungsprobleme nichtlinearer
Morphologien: Umlautbeschreibung in einem hierarchischen
Lexikon," in Lexikon und Lexikographie, ed. Burghard Rieger
& Burkhard Schaeder, pp. 45-61, Olms Verlag, Hildesheim,
1990.
This article is a shortened version of the author's MA
thesis on the adequacy problems of automaton-based
morphological models.
Reinhard, Sabine and Dafydd Gibbon, "Prosodic inheritance and
morphological generalisations," Fifth Conference of the
European Chapter of the Association for Computational
Linguistics, pp. 131-136, 1991.
Prosodic inheritance (PI) morphology provides uniform
treatment of both concatenative and non-concatenative
morphological and phonological generalisations using
default inheritance. Models of an extensive range of German
Umlaut and Arabic intercalation facts, implemented in DATR,
show that the PI approach also covers "hard cases" more
homogeneously and more extensively than previous
computational treatments.

Copyright © Roger Evans, Gerald Gazdar & Bill Keller
Wed Feb 26 12:00:02 GMT 1997