A modern Computational Linguistics course using Dutch

Gosse Bouma,
Alfa-informatica,
Rijksuniversiteit Groningen
gosse@let.rug.nl

We report on a project in which we develop a modern course in
computational linguistics. The emphasis is on using realistic 
data and building (parts of) realistic applications for Dutch.
 

We discuss a number of exercises in computational morphology and phonology (hyphenation, grapheme-to-phoneme conversion, inflectional morphology), which require the implementation of relatively complex finite-state automata. Such programs can be developed easily using the FSA package, which has built-in support for all standard operations on finite-state automata as well as for visualization, and they can be tested rigorously against datasets extracted from CELEX.
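
To give an impression of the develop-and-test cycle described above, the following is a minimal sketch in Python. It does not use the FSA package or CELEX: the hyphenation rule is a deliberately naive regular expression (and hence finite-state), the names `hyphenate`, `evaluate` and `GOLD` are hypothetical, and the gold list is a handful of hand-picked words standing in for an extracted dataset.

```python
import re

# Toy rule (an assumption for illustration, not the course's actual grammar):
# in a consonant cluster between two vowels, the last consonant opens the
# next syllable, e.g. "kinderen" -> "kin-de-ren".
CLUSTER = re.compile(r"(?<=[aeiouy])([^aeiouy]+)(?=[aeiouy])")

def hyphenate(word):
    """Insert a hyphen before the last consonant of each medial cluster."""
    def split_cluster(match):
        cluster = match.group(1)
        return cluster[:-1] + "-" + cluster[-1]
    return CLUSTER.sub(split_cluster, word)

def evaluate(gold):
    """Compare the rule's output with gold hyphenations.

    Returns the accuracy and a list of (word, expected, predicted) errors.
    """
    errors = [(word, expected, hyphenate(word))
              for word, expected in gold.items()
              if hyphenate(word) != expected]
    accuracy = 1 - len(errors) / len(gold)
    return accuracy, errors

# Hand-made stand-in for a dataset extracted from a lexical database.
GOLD = {
    "water": "wa-ter",
    "lopen": "lo-pen",
    "kinderen": "kin-de-ren",
    "appel": "ap-pel",
    "boeken": "boe-ken",
    "school": "school",
    "lachen": "la-chen",  # "ch" is one unit, so the toy rule fails here
}

if __name__ == "__main__":
    accuracy, errors = evaluate(GOLD)
    print(f"accuracy: {accuracy:.2f}")
    for word, expected, predicted in errors:
        print(f"{word}: expected {expected}, got {predicted}")
```

The error list is the useful part of such a test: a case like "lachen" shows that the rule needs to treat digraphs such as "ch" as single units, which is the kind of refinement the exercises ask for.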

Furthermore, we discuss exercises in computational syntax and semantics, in which natural language interface and generation applications are constructed, and which require the development of computational grammars. Students extend a basic Dutch grammar fragment implemented in Hdrug. Realistic applications are obtained by using the Web as a back-end. For instance, a dialogue system for public transport information can be connected to the NS web-site, while a weather forecast generator may use data provided by the KNMI.