%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% File: catalann.dtr
% Purpose: derived fe/male forms of nouns for persons in Catalan
% Author: Gerald Gazdar & Max Wheeler, April 1995
% Email: geraldg@cogs.sussex.ac.uk, maxw@cogs.sussex.ac.uk
% Address: COGS, Sussex University, Brighton BN1 9QH, UK
% Documentation: Max Wheeler draft chapter of "Catalan Reference Grammar"
% Version: 2.00 (June 1996)
%
% Copyright (c) University of Sussex 1995. All rights reserved.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% What follows is a second attempt to encode in DATR the facts concerning
% Catalan derived nouns for females (occasionally males) as set out in the
% beginning of Max Wheeler's March 1995 draft chapter on gender for his
% "Catalan Reference Grammar" (with A. Yates and N. Dols).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

# nc seq node:.

Word:
    <> ==
    % by default, the citation form is spelt the same as the name of the
    % node:
    == Idem:<"">
    % to get the orthographic form, apply the Orthography function to the
    % relevant word value:
    == Orthography:<"" !>.

# vars $sex: male female.

Noun:
    <> == Word
    == noun
    % by default, gender is determined by sex, in the obvious way:
    == masc
    == femn
    % by default, the gloss is just the sex concatenated with the
    % 'meaning':
    == $sex ""
    % to arrive at the stem for the sex-alternant, apply MakeStem to the
    % citation form (and the diacritic, if any):
    == MakeStem:<"" "" !>
    % by default, the word for the sex-alternant is created by
    % adding the relevant suffix to the stem:
    == "" "".

% Nouns with 'common' gender, i.e., unspecified for either sex or gender:
NounC:
    <> == Noun
    ==
    == ""
    == "".

% Nouns with fixed masculine gender, but unspecified for sex:
NounCM:
    <> == NounC
    == masc.

% Nouns with fixed feminine gender, but unspecified for sex:
NounCF:
    <> == NounC
    == femn.
% Nouns whose basic form refers to a female:
NounF:
    <> == Noun
    == o t
    == "".

% Nouns whose basic form refers to a male:
NounM:
    <> == Noun
    == a
    == "".

NounEssa:
    <> == NounM
    == e s s a.

NounIna:
    <> == NounM
    == i n a.

NounW:
    <> == NounM
    == "".

NounS:
    <> == NounM
    == "" s.

NounL:
    <> == NounM
    == "" [dot] l.

Noun-:
    <> == NounM
    == -.

NounRiu:
    <> == Noun-
    == r i u.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% To prepare the stem, reverse it, apply the EndStem function to the reversed
% string, and then reverse the result so as to restore the original segment
% order:
MakeStem:
    <> == Reverse:>>.

% variable over (stressed) vowels used by EndStem:
# vars $v: a e i o u.

EndStem:
    <> == Voiced:<>
    % delete an (unstressed) final vowel:
    == Idem:<>
    % needed for poeta (1.1.2), abella (1.1.3), etc. ??
    % or could use to nuke specific vowels
    == Idem:<>
    == Idem:<>
    % map a final 'u' to a final 'v':
    == v Idem:<>
    % unless it is marked for deletion by the diacritic:
    <- u> == Idem:<>
    % which is also used to delete final 'or' when specified:
    <- r o> == Idem:<>
    % append 'n' to an acute vowel:
    <$v [acute]> == n $v [acute] Idem:<>
    % and map final 'oleg' to 'olog':
    == g o l o Idem:<>.

Voiced:
    <> == Idem
    % Next rule commented out to prevent 'fotograva' (which is wrong):
    % == v Idem:<$v>
    == d Idem:<$v>
    == b Idem:<$v>
    == g Idem:<$v>.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

# vars $s1: a b c d e f g h i j k l m n o p q r s t u v w x y z.
# vars $s2: a b c d e f g h i j k l m n o p q r s t u v w x y z.
# vars $ns: n s.
# vars $w: a e i o u.
# vars $ae: a e.
# vars $iu: i u.
# vars $a: [acute] [grave].

Orthography:
    ==
    <$i> == $i <>
    <$i !> == $i
    <$v i i> == $v [dots] i <>
    <$v $a $iu $s1 $s2> == $v [dots] $iu $s1 $s2 <>
    % Following rule much too general -- deleted accents all over the place
    % <$a $v $s1 $s2> == $v $s1 $s2 <>
    % Replaced by the following three rules:
    <$a $v $ae> == $v $ae <>
    <$a $v $ns $w> == $v $ns $w <>
    <$a $v s s $w> == $v s s $w <>.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

# atom Aramaean Aztec Belgian Chaldean Galician Galilean Gaul Greek Hebrew
       Indo-European Jew Manichean Norwegian Pharisee Philistine Pigmy
       Russian Slav Swiss.

# vars $word: orth gloss gender word stem.

Abat:
    <> == NounEssa
    == abbot.
Abella:
    <> == NounF
    == bee.
Actor:
    <> == NounRiu
    == actor.
Alcalde:
    <> == NounEssa
    == mayor.
% Are the following two entries dialectal or stylistic alternants?
Ambaixador1:
    <> == NounM
    == ambassador.
Ambaixador2:
    <> == NounRiu
    == Ambaixador1.
Amic:
    <> == NounM
    == friend.
Amo:
    <> == NounM
    == master
    <$word female> == "Mestressa:<$word female>".
Andreu:
    <> == NounW
    == given name.
Aprenent:
    <> == NounM
    == apprentice.
Arameu:
    <> == Noun-
    == Aramaean.
Asteca:
    <> == NounC
    == Aztec.
Ateu:
    <> == Noun-
    == atheist.
Avi:
    <> == NounM
    == [grave]
    == grandparent.
Baro:
    <> == NounEssa
    == b a r [acute] o
    == baron.
Bedui:
    <> == NounM
    == b e d u [acute] i
    == bedouin.
Belga:
    <> == NounC
    == Belgian.
Bordegas:
    <> == NounS
    == b o r d e g [grave] a s
    == youth.
Borrec:
    <> == NounM
    == yearling.
Bruixa:
    <> == NounF
    == witch
    == wizard.
Bus:
    <> == NounS
    == diver.
Caldeu:
    <> == Noun-
    == Chaldean.
Capatas:
    <> == NounS
    == c a p a t [grave] a s
    == overseer.
Camil:
    <> == NounL
    == given name.
% The analysis is not currently capturing the fact that nouns that end
% in the -aire agent suffix belong to NounC:
Captaire:
    <> == NounC
    == beggar.
Carca:
    <> == NounC
    == reactionary.
Cardioleg:
    <> == NounM
    == c a r d i [grave] o l e g
    == cardiologist.
Cec:
    <> == NounM
    == blind person.
Ciril:
    <> == NounL
    == given name.
Collega:
    <> == NounC
    == c o l [dot] l e g a
    == colleague.
Company:
    <> == NounM
    == companion.
Comte:
    <> == NounEssa
    == count.
Criatura:
    <> == NounCF
    == baby.
Democrata:
    <> == NounC
    == d e m [grave] o c r a t a
    == democrat.
% Are the following two entries dialect alternants?
Deu1:
    <> == Noun-
    == d [acute] e u
    == god.
Deu2:
    <> == Deu1
    == NounEssa.
Detectiu:
    <> == NounM
    == detective.
Diable:
    <> == NounEssa
    == devil.
Diaca:
    <> == NounEssa
    == d i a c o n
    == deacon.
Dida:
    <> == NounF
    == wet nurse
    == husband of wet nurse.
Dona:
    <> == NounF
    == person/spouse
    <$word male> == "Home:<$word male>".
Dormilec:
    <> == NounM
    == sleepyhead.
Dormilega:
    <> == NounC
    == Dormilec.
Druida:
    <> == NounEssa
    == d r u [dots] i d
    == druid.
Duc:
    <> == NounEssa
    == d u q u
    == duke.
Electricista:
    <> == NounC
    == electrician.
Emperador:
    <> == NounRiu
    == emperor.
Enemic:
    <> == NounM
    == enemy.
Esclau:
    <> == NounM
    == slave.
Eslau:
    <> == NounM
    == Slav.
Extra:
    <> == NounC
    == film extra.
Fariseu:
    <> == Noun-
    == Pharisee.
Filisteu:
    <> == Noun-
    == Philistine.
Fill:
    <> == NounM
    == offspring.
Fotograf:
    <> == NounM
    == f o t [grave] o g r a f
    == photographer.
Fura:
    <> == NounF
    == ferret.
Gal:
    <> == NounL
    == Gaul.
Galileu:
    <> == Noun-
    == Galilean.
Gall:
    <> == NounIna
    == chicken.
Gallec:
    <> == NounM
    == Galician.
Ganapia:
    <> == NounC
    == g a n [grave] a p i a
    == childish person.
% The analysis is not currently capturing the fact that nouns that end
% in -ant, -ent, or -int, belong to the NounC class, by default:
Gerent:
    <> == NounC
    == manager.
Germa:
    <> == NounM
    == g e r m [acute] a
    == sibling.
Gimnasta:
    <> == NounC
    == gymnast.
Gos:
    <> == NounS
    == dog.
Grec:
    <> == NounM
    == Greek.
Guatlla:
    <> == NounF
    == quail.
Hebreu:
    <> == Noun-
    == Hebrew.
% The following pair of entries are dialectal alternants
Hereu1:
    <> == NounM
    == heir.
Hereu2:
    <> == NounW
    == Hereu1.
Heroi:
    <> == NounIna
    % there's an issue about the final vowel here:
    == hero.
Hipocrita:
    <> == NounC
    == h i p [grave] o c r i t a
    == hypocrite.
Home:
    <> == NounM
    == person/spouse
    <$word female> == "Dona:<$word female>".
% Are the following two entries dialect alternants?
Hoste1:
    <> == NounEssa
    == host.
Hoste2:
    <> == NounEssa
    == lodger.
Indigena:
    <> == NounC
    == i n d [acute] i g e n a
    == native.
Individu:
    <> == NounCM
    == individual.
Indoeuropeu:
    <> == Noun-
    == Indo-European.
Institutor:
    <> == NounRiu
    == tutor.
Judoka:
    <> == NounC
    == practitioner of judo.
Jueu:
    <> == NounM
    == Jew.
Jutge:
    <> == NounEssa
    == judge.
Llec:
    <> == NounM
    == lay person.
Llop:
    <> == NounM
    == wolf.
Lluis:
    <> == NounM
    == l l u [grave] i s
    == given name.
Maniqueu:
    <> == Noun-
    == Manichean.
Manyac:
    <> == NounM
    == sweetheart.
Marcel:
    <> == NounL
    == given name.
Marit:
    <> == NounM
    == husband
    <$word female> == "Muller:<$word female>".
Merla:
    <> == NounF
    == blackbird.
Mestressa:
    <> == NounF
    == mistress
    <$word male> == "Amo:<$word male>".
Metge:
    <> == NounEssa
    == doctor.
Monjo:
    <> == NounM
    == monk
    == nun.
Muller:
    <> == NounF
    == wife
    <$word male> == "Marit:<$word male>".
Nebot:
    <> == NounM
    == nephew
    == niece.
Noi:
    <> == NounM
    == child.
Noruec:
    <> == NounM
    == Norwegian.
Ogre:
    <> == NounEssa
    == ogre.
Oncle:
    <> == NounM
    == uncle
    <$word female> == "Tia:<$word female>".
Orfe:
    <> == NounM
    == [grave] o r f en
    == orphan.
Os:
    <> == NounS
    == [acute] o s
    == bear.
Pages:
    <> == NounM
    == p a g [grave] e s
    == farmer.
Pastifa:
    <> == NounC
    == botcher.
Pau:
    <> == NounM
    == p a u l
    == given name.
Pediatre:
    <> == NounM
    == paediatrician.
Perdiu:
    <> == NounF
    == p e r d i g
    == partridge.
Persona:
    <> == NounCF
    == person.
Pigmeu:
    <> == Noun-
    == Pigmy.
Poeta:
    <> == NounEssa
    == poet.
Pompeu:
    <> == NounM
    == p o m p e i
    == given name.
Posses:
    <> == NounS
    == p o s s [acute] e s
    == person possessed.
Princep:
    <> == NounM
    == p r [acute] i n c e p
    == p r i n c e s
    == prince.
Profes:
    <> == NounS
    == p r o f [acute] e s
    == person who has taken vows.
Profeta:
    <> == NounEssa
    == prophet.
Pubilla:
    <> == NounF
    == heir
    <$word male> == "Hereu1:<$word male>".
Pupil:
    <> == NounL
    == ward.
Rei:
    <> == NounM
    == r e i n
    == monarch.
Reu:
    <> == Noun-
    == culprit.
% The following pair of entries are dialectal alternants
Romeu1:
    <> == NounM
    == pilgrim.
Romeu2:
    <> == NounW
    == Romeu1.
Rus:
    <> == NounS
    == Russian.
Santedat:
    <> == NounCF
    == holiness.
Serf:
    <> == NounM
    == s e r v
    == serf.
Socioleg:
    <> == NounM
    == s o c i [grave] o l e g
    == sociologist.
Suis:
    <> == NounS
    == s u [acute] i s
    == Swiss.
Talos:
    <> == NounS
    == t a l [grave] o s
    == dolt.
Televident:
    <> == NounC
    == television viewer.
Terapeuta:
    <> == NounC
    == therapist.
Terrissaire:
    <> == NounC
    == potter.
Tia:
    <> == NounF
    == aunt
    <$word male> == "Oncle:<$word male>".
Tigre:
    <> == NounEssa
    == tiger.
Tsar:
    <> == NounIna
    == czar.
Vianant:
    <> == NounC
    == pedestrian.
Victima:
    <> == NounCF
    == v [acute] i c t i m a
    == victim.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Two utility functions

# vars $n: 1 2 3.

Idem:
    <> ==
    <$i> == $i <>
    <$i !> == $i !
    <$i $sex> == $i <>
    % Next line is to remove codes from the name of the node:
    <$i $n> == $i.

Reverse:
    <> ==
    <$i> == <> $i
    <$i !> == $i.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

# show .

# hide Idem Reverse EndStem MakeStem Voiced Word Noun NounC NounCM NounCF
       NounM NounF NounEssa NounIna NounRiu NounW NounS NounL Noun-
       Orthography.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% The next line is the Revision Control System Archive Id: do not delete it.
% $Id: archive.dtr,v 1.1 1997/04/09 20:40:33 root Exp $