Supervisor: Alan Sexton
Keywords: Final Year Undergraduate Students Only, Document Image Analysis
Brief Description:
Paper forms, such as the school's mid-semester and end-of-semester
questionnaires, require a great deal of work to prepare and expensive
software to analyse. This project is to produce a free, public domain
forms processing system that could be the basis of a cheap but
effective solution for anyone who needs to work with forms. A
scanner-based forms processing system has many different components.
First, the forms have to be created.
This can be done in a special-purpose editor that allows the user to
create the various components: choice fields, image fields, text boxes
etc. Alternatively, a form can be produced outside the system and then
scanned in, allowing a user to mark up and configure the various
components.
Various image processing facilities are necessary (noise reduction,
skew and rotation correction etc.). Automated form detection is an
advantage, locating components correctly on the form is essential, and
segmentation issues must be addressed. Good optical character
recognition on top of the rest would be too ambitious for this project,
but the system should provide an API by which an OCR engine could be
plugged in to the system, so that a free open source OCR engine such as
Tesseract or OCRopus could be used.
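
To make the idea of a pluggable OCR API concrete, the following is a
minimal sketch in Python and not a prescribed design. All names in it
(Field, FieldKind, OcrEngine, NullOcr, crop, is_marked, read_form) are
hypothetical, and it assumes that a page is a greyscale image held as
rows of pixel intensities and that field positions on the form are
already known. A real engine such as Tesseract would be wrapped in a
class offering the same recognise method.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Protocol, Sequence

# Assumption: a greyscale image is rows of intensities (0 = black, 255 = white).
Image = Sequence[Sequence[int]]


class FieldKind(Enum):
    CHOICE = "choice"  # tick box or similar mark
    IMAGE = "image"    # region kept as a picture, e.g. a signature
    TEXT = "text"      # region handed to the OCR engine


@dataclass(frozen=True)
class Field:
    """Hypothetical description of one component on a form template."""
    name: str
    kind: FieldKind
    left: int
    top: int
    width: int
    height: int


class OcrEngine(Protocol):
    """Hypothetical plug-in interface: anything with this method can be used."""
    def recognise(self, region: Image) -> str: ...


class NullOcr:
    """Placeholder engine that recognises nothing; stands in for a real one."""
    def recognise(self, region: Image) -> str:
        return ""


def crop(page: Image, field: Field) -> list[list[int]]:
    """Cut the rectangle described by the field out of the page."""
    rows = page[field.top:field.top + field.height]
    return [list(row[field.left:field.left + field.width]) for row in rows]


def is_marked(region: Image, threshold: int = 128,
              min_fraction: float = 0.2) -> bool:
    """Treat a choice field as marked if enough of its pixels are dark."""
    pixels = [p for row in region for p in row]
    if not pixels:
        return False
    dark = sum(1 for p in pixels if p < threshold)
    return dark / len(pixels) >= min_fraction


def read_form(page: Image, fields: Sequence[Field], ocr: OcrEngine) -> dict:
    """Extract one value per field, delegating text fields to the OCR engine."""
    results = {}
    for field in fields:
        region = crop(page, field)
        if field.kind is FieldKind.CHOICE:
            results[field.name] = is_marked(region)
        elif field.kind is FieldKind.TEXT:
            results[field.name] = ocr.recognise(region)
        else:
            results[field.name] = region  # image fields are kept as pixels
    return results


if __name__ == "__main__":
    # Tiny synthetic page: white, with a dark 2x2 mark in the top-left corner.
    page = [[255] * 8 for _ in range(4)]
    for y in range(2):
        for x in range(2):
            page[y][x] = 0
    fields = [
        Field("q1_yes", FieldKind.CHOICE, 0, 0, 2, 2),
        Field("q1_no", FieldKind.CHOICE, 4, 0, 2, 2),
        Field("comment", FieldKind.TEXT, 0, 2, 8, 2),
    ]
    print(read_form(page, fields, NullOcr()))
```

Keeping the engine behind a single method means the rest of the system
does not need to know which OCR engine, if any, is installed.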
Special Equipment: The school's high-speed scanner will be made available, although the project should be based on handling greyscale or monochrome, low-resolution (200-400 dpi) TIFF images of forms.
Special Software: No special software requirements
Maintained by A.P.Sexton@cs.bham.ac.uk
Home Page: http://www.cs.bham.ac.uk/~aps