Morphology and Finite State Machines
This part is about:
- Morphology: a description of a linguistic phenomenon
- Knowledge representation: representing linguistic information (in particular, morphological information)
- Processing: writing programs that use linguistic descriptions, expressed in the knowledge representation, to accomplish tasks such as:
  - word recognition (deciding whether a word is a member of a language)
  - morphological analysis
  - spelling correction
The order in which you explore sections is, to some extent, a matter of personal taste. If you are computationally minded you may want to work with the sections that concentrate on knowledge representation and processes first, before looking at Morphology. On the other hand, if you feel more comfortable with descriptions, you may want to read the section on Morphology first - or dip into sections to find your own preferred order.
The sections are:
- A Finite State Automaton for word recognition
  - This presents a simple Finite State Network for recognising a handful of English words. The representation of knowledge about the spelling of English words is introduced, followed by an algorithm for comparing this knowledge with input words.
- Introduction to Morphology
  - Morphology is the name given to the study of the "minimal unit of grammatical analysis". This section introduces some of the technical terms, concentrating on those that are relevant to natural language engineering, such as derivational and inflectional morphology. The topic of cliticization is reviewed, as it seems to be a significant problem in designing NLP systems that have the conventional, modular architecture.
- Morphological analysis
  - This section shows how a Finite State Network can be used to implement a simple inflectional morphological analyser.
- Finite State Transducer
  - An extension of the Finite State Network is introduced, using the task of correcting the spelling of English words as an example.
- Issues in implementation and adequacy for Finite State Automata
  - Finite State techniques are easy to implement and are efficient, particularly if they are deterministic. However, their knowledge representation qualities are limited, and this makes them unsuitable for representing some kinds of linguistic descriptions.
- Other approaches to Morphological analysis
  - Finite State Automata are widely used, but they are not the only technique. This section briefly looks at other techniques.
The programs in this part are written in Prolog and should work with any Edinburgh-flavour implementation, for instance SICStus Prolog, Quintus Prolog, or Open Prolog.
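To give a flavour of the kind of program involved, here is a minimal sketch of a Finite State Network recogniser in Prolog. It is not the network developed in the sections themselves: the state names (q0 to q6), the predicate names (initial/1, final/1, arc/3, recognise/1, traverse/2) and the four-word vocabulary (cat, cats, dog, dogs) are all assumptions made for illustration. Words are given as lists of single-letter atoms so that the sketch does not depend on how a particular Prolog treats quoted strings.

```prolog
% Hypothetical network recognising: cat, cats, dog, dogs.
% Knowledge representation: one start state, a set of final
% states, and arcs labelled with single letters.
initial(q0).

final(q3).
final(q4).

% arc(FromState, Letter, ToState)
arc(q0, c, q1).
arc(q1, a, q2).
arc(q2, t, q3).
arc(q0, d, q5).
arc(q5, o, q6).
arc(q6, g, q3).
arc(q3, s, q4).

% recognise(+Letters)
% Succeeds if Letters spell a word accepted by the network.
recognise(Letters) :-
    initial(State),
    traverse(State, Letters).

% traverse(+State, +Letters)
% With no input left, the current state must be final;
% otherwise follow an arc labelled with the next letter.
traverse(State, []) :-
    final(State).
traverse(State, [Letter|Rest]) :-
    arc(State, Letter, Next),
    traverse(Next, Rest).
```

With these facts loaded, the query `recognise([c,a,t,s]).` succeeds and `recognise([c,a]).` fails, because q2 is not a final state. Keeping the arcs as plain facts separates the description of the words from the procedure that consults it, which is the split between knowledge representation and processing described above.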
© P.J.Hancox@bham.ac.uk