This is part of the
Free Poplog Portal
Examples of teaching and demonstration materials
concerned with parsing and grammars for natural language
This is part of the web site on learning to do 'Thinky' Programming
The files with the '.p' suffix include Pop-11 code and comments. They
should be treated as plain text files. If your browser cannot read them
that may be because it associates '.p' with another type of file. You
may have to use the 'Open with' option in your web browser (usually via
right-click, depending on the browser).
All of these examples can be run in the Pop-11 system that is part of Poplog,
available for use in a virtual Linux that runs on Windows, Macs, etc.,
or downloadable for installation on a Linux system.
This list may be extended.
An introduction (for absolute beginners) to ideas of grammar, grammatical structures,
lexicon, lexical types, grammar-based random generation of sentences, parsing of
Examples of things that can be done by learners after being introduced to the
grammar formalism and the library designed for using the grammars to generate or
NOT FOR NOVICES TO READ
Lightly commented Pop-11 source code for the LIB GRAMMAR package that makes all the
above work. It depends on use of the Pop-11 matcher for list structures. [ADD LINK]
NOT FOR NOVICES TO READ
A modified version of Pop-11 LIB GRAMMAR to support use of an ill-formed
substring table to speed up parsing (with spectacular results).
This, like the original library, finds only one parse, even if the input string
is ambiguous according to the grammar.
To find where this differs from the original library look for occurrences of
the global variable no_parse_found and calls of the procedure
A grammar for an artificial subset of English relevant to a
This shows how, using Pop-11's formalism for recursive context-free grammars, it
is possible to define a fairly complex grammar for assertions, questions and
commands, for the domain of a robot with the ability to perceive and
manipulate objects on a table.
(As in the CoSy and CogX robot projects at Birmingham.)
Using the standard Pop-11 grammar library, parsing sentences of the sorts generated
by this grammar can be very slow.
Using newgrammar.p, with the ill-formed substring memory, speeds
that up enormously. (The memory is cleared and rebuilt for each
sentence. The program could be changed to create an enduring memory of
things not to try.)
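The idea of an ill-formed substring table can be sketched in a few lines. The sketch below is in Python, not Pop-11, and it is not the code in newgrammar.p: the names TOY_GRAMMAR, TOY_LEXICON and parse_sentence, and the toy grammar itself, are invented for illustration. It assumes a parser that tries every way of splitting a span of words between the symbols of a rule, stops at the first parse found, and records each (category, start, end) combination that failed so that it is never explored again. The table is rebuilt for each sentence, as described above.

```python
# Toy grammar: each category maps to a list of alternative right-hand sides.
TOY_GRAMMAR = {
    "s": [["np", "vp"]],
    "np": [["det", "noun"], ["det", "adj", "noun"],
           ["det", "noun", "pp"], ["det", "adj", "noun", "pp"]],
    "vp": [["verb", "np"], ["verb", "np", "pp"]],
    "pp": [["prep", "np"]],
}

# Toy lexicon: each lexical category maps to the words it accepts.
TOY_LEXICON = {
    "det": {"the"},
    "adj": {"red", "big"},
    "noun": {"robot", "box", "tray"},
    "verb": {"put", "move"},
    "prep": {"on", "onto"},
}


def parse_sentence(tokens, grammar, lexicon, start="s", use_table=True):
    """Return (tree, work): the first parse found (or None) and the
    number of parse attempts that were actually explored."""
    failed = set()  # ill-formed substring table, rebuilt for each sentence
    work = 0

    def parse(cat, i, j):
        """Try to parse exactly tokens[i:j] as category cat."""
        nonlocal work
        if j <= i:
            return None
        if use_table and (cat, i, j) in failed:
            return None  # known dead end: do not explore it again
        work += 1
        if j - i == 1 and tokens[i] in lexicon.get(cat, ()):
            return [cat, tokens[i]]
        for rhs in grammar.get(cat, ()):
            children = parse_seq(rhs, i, j)
            if children is not None:
                return [cat] + children  # first parse only
        if use_table:
            failed.add((cat, i, j))
        return None

    def parse_seq(symbols, i, j):
        """Try to split tokens[i:j] between the given symbols, in order."""
        if len(symbols) == 1:
            tree = parse(symbols[0], i, j)
            return None if tree is None else [tree]
        # Leave at least one token for each remaining symbol.
        for k in range(i + 1, j - len(symbols) + 2):
            first = parse(symbols[0], i, k)
            if first is None:
                continue
            rest = parse_seq(symbols[1:], k, j)
            if rest is not None:
                return [first] + rest
        return None

    return parse(start, 0, len(tokens)), work


sentence = "the robot put the red box on the tray".split()
tree, work_with_table = parse_sentence(sentence, TOY_GRAMMAR, TOY_LEXICON)
_, work_without_table = parse_sentence(sentence, TOY_GRAMMAR, TOY_LEXICON,
                                       use_table=False)
print(tree)
print(work_with_table, work_without_table)
```

The table never changes which parse is found, because it only skips attempts that are already known to fail. Making the memory endure across sentences, as suggested above, would need a table keyed on the words themselves rather than on positions in one sentence.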
This grammar was hand-coded, using an idea suggested long ago by Gerald Gazdar,
namely compiling semantic categories into syntactic categories. In a more ambitious
project it might be possible to make a system that can learn such a grammar by being
exposed to a very large corpus of examples.
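As a rough illustration of compiling semantic categories into syntactic categories, the Python sketch below (not Pop-11, and not the grammar described here) uses category names such as movable_noun, container_noun and surface_noun, all invented for this example. Because the semantic distinctions are built into the category names, random generation from the grammar can produce "put the ball in the cup" but never "put the ball in the table".

```python
import random

# Rules whose category names carry semantic information.
ROBOT_GRAMMAR = {
    "command": [
        ["put_verb", "movable_np", "prep_in", "container_np"],
        ["put_verb", "movable_np", "prep_on", "surface_np"],
    ],
    "movable_np": [["det", "movable_noun"], ["det", "colour", "movable_noun"]],
    "container_np": [["det", "container_noun"]],
    "surface_np": [["det", "surface_noun"]],
}

# Words listed under each lexical category (all hypothetical).
ROBOT_LEXICON = {
    "put_verb": ["put", "move"],
    "det": ["the"],
    "colour": ["red", "blue"],
    "movable_noun": ["block", "ball"],
    "container_noun": ["box", "cup"],
    "surface_noun": ["table", "tray"],
    "prep_in": ["in", "into"],
    "prep_on": ["on", "onto"],
}


def generate(category, rng):
    """Return a random list of words derived from the given category."""
    if category in ROBOT_LEXICON:
        return [rng.choice(ROBOT_LEXICON[category])]
    words = []
    for symbol in rng.choice(ROBOT_GRAMMAR[category]):
        words.extend(generate(symbol, rng))
    return words


rng = random.Random(0)
for _ in range(3):
    print(" ".join(generate("command", rng)))
```

The cost of this approach is that the number of categories and rules grows with every semantic distinction, which is one reason a hand-coded grammar of this kind can become fairly complex.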
A lexicon for use with the above grammar, i.e. words to be used in statements,
commands, and questions referring to the robot and its environment and actions.
This package has no understanding of inflections: different forms for the same word,
depending on context of use. All words are treated as unanalysed wholes, except for
the use of apostrophes.
Example code for compiling and testing the grammar and lexicon,
using the tools in newgrammar.p
This includes some examples that invoke the Linux 'espeak' program to say the
generated sentences out loud.
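For readers who want to try the same idea outside Pop-11, a minimal Python sketch of handing a generated sentence to 'espeak' might look like the following. The function name speak and its dry_run parameter are invented for this example. It assumes only that espeak accepts the text to be spoken as a command-line argument, and it does nothing audible if espeak is not installed.

```python
import shutil
import subprocess


def speak(words, dry_run=False):
    """Build the espeak command for a list of words, run it if possible,
    and return the command that was built."""
    command = ["espeak", " ".join(words)]
    # Only run the program if it is installed and this is not a dry run.
    if not dry_run and shutil.which("espeak"):
        subprocess.run(command, check=False)
    return command


# A dry run shows the command without making any sound.
print(speak(["move", "the", "red", "box"], dry_run=True))
```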
One test sentence parsed, to show the sort of parse tree produced. The sentence is
[move the big red box on the tray onto the top of the block on the left side of Jeremy"s triangular horse]
The parse tree is too big to be displayed easily, though it can be displayed in an
xterm-compatible window, or using the Poplog XVed editor.
A screen-shot of the tree displayed in an xterm window with a small font is below,
automatically generated by the Pop-11 'showtree' library package.
Suggestions for improvement are welcome.
This file maintained by:
School of Computer Science, University of Birmingham
Installed: 13 Jun 2011
Last Updated: 14 Jun 2011