HELP SUMMARISE                                    Aaron Sloman June 1999
                                          Based on work by Riccardo Poli

(DRAFT HELP FILE. May be improved)

LIB SUMMARISE

This program, designed and implemented primarily by Riccardo Poli in
October 1996 and slightly modified by Aaron Sloman in June 1999, can
read in a text file, digest it, form an impression of the style, and
spew out a "crazy" summary, written in a parody of the style of the
original.

         CONTENTS

 -- Introduction
 -- summarise_file(file, max_word_count);
 -- Examples
 -- ENTER summarise
 -- ENTER summarise <number>
 -- Invoking the program from the Unix command line
 -- Global variable max_input_words (default 20000 );

-- Introduction --------------------------------------------------------

To make it available load the library:

    uses summarise

The program can be used in three different modes, all described below,
namely:

    o by invoking the procedure summarise_file
    o by giving a Ved command
    o by invoking Pop-11 as a Unix command

The library makes available a number of procedures, including one for
use directly, e.g. in a procedure of your own which selects a file to
summarise, and one which defines a VED command, ENTER summarise,
described later.

The following is the main procedure.

-- summarise_file(file, max_word_count); -------------------------------

The argument file can be either a string naming a file, or a character
repeater (e.g. vedrepeater, or the output of discin).

This reads in the file (or the text input stream defined by the
character repeater) and analyses some statistical properties of the
text, storing for each word the frequencies with which other words
follow it. It then prints out (using cucharout) a new text containing
max_word_count words, in the same "style" as the original, making use
of the frequency information.

The result can be an amusing parody of the original, though in general
it will not make much sense and may include ungrammatical sentences.

-- Examples ------------------------------------------------------------

Try these commands, using ESC d to compile them.
The summary will be printed into the file output.p, so you will
probably want to clear that file before you start each command. (Edit
output.p, then do ENTER clear):

    summarise_file('$usepop/pop/teach/respond', 300);

    summarise_file('$usepop/pop/teach/aithemes', 300);

    summarise_file('$usepop/pop/doc/continuation', 300);

You can also run this with some of your own files.

The output of summarise_file is a bit like an essay written by a
student who has memorised many short phrases and technical terms and
can generate them fluently without really understanding what they
mean.

-- ENTER summarise -----------------------------------------------------
-- ENTER summarise <number> --------------------------------------------

The ENTER summarise command can be used to produce a summary of the
file you are currently reading in VED. The <number> is used as the
second argument to summarise_file. If you don't give the number it
defaults to 1000 words.

The output is printed into a Ved file called SUMMARY, in non-writable
mode. So if you already have a file called SUMMARY in the editor, you
should save that file and "quit" it first.

For instance, try it now, giving this command to produce a summary of
the file you are now reading:

    ENTER summarise 400

Alternatively look at a teach file, e.g.

    ENTER teach primer

Then

    ENTER summarise 500

-- Invoking the program from the Unix command line ---------------------

If your Pop-11 system is set up properly you can also start the program
as a Unix command, in this form:

    pop11 summarise <file> <number>

E.g. try the following Unix command a few times:

    pop11 summarise $usepop/pop/teach/eliza 400

You can give that command to the shell in an xterm window.

-- Global variable max_input_words (default 20000 ); -------------------

If the file you wish to summarise is very long, you may not wish to
wait for the program to read it all in. There is a global variable
max_input_words whose value should be an integer, and which defaults to
20000. After reading this number of words from the input file the
program stops reading.
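
For instance, the limit could be lowered before summarising a long
file. This is only a minimal sketch: the value 5000 is an arbitrary
illustration, not a recommended setting, and the file is simply one of
those used in the examples earlier in this file.

```pop11
uses summarise

;;; Read at most 5000 words of input (illustrative value only;
;;; the default is 20000).
5000 -> max_input_words;

;;; Print a 300-word summary built from the words read.
summarise_file('$usepop/pop/teach/aithemes', 300);
```

Assigning 20000 to max_input_words afterwards restores the default.
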
You can try increasing or decreasing the value of max_input_words to
see whether it improves the summaries when dealing with a long input
file.

--- $poplocal/local/help/summarise
--- Copyright University of Birmingham 1999. All rights reserved. ------