School of Computer Science, The University of Birmingham

The functions of biological vision systems
(DRAFT: Liable to change)

Aaron Sloman
School of Computer Science, University of Birmingham.

Installed: 9 Oct 2014
Last updated: 11 Oct 2014
This paper is
A PDF version may be added later.

A partial index of discussion notes is in

What are the functions of vision? (Again)

I think there is still a huge amount of research to be done on what humans and other animals use vision for. For example, Marr's claim about getting 3-D information from retinal input is part of the story, but leaves out a great deal. Gibson's emphasis on vision providing information about what the perceiver can and cannot do in order to satisfy goals, preferences, etc. extends that, but isn't nearly enough. The use of vision in SLAM (Simultaneous Localisation and Mapping) is different again: it combines some of Marr's and Gibson's uses, and also indirectly supports many other things, including reasoning about what could be happening in remote locations, planning, etc. The ideas of Max Clowes in the 1960s, which I have partly summarised here, are different again:
(and the following sections). He was influenced by related views of perception held by earlier researchers such as von Helmholtz and Gombrich.

Another whole family of uses (which is what got me into AI originally) is the role of vision in various kinds of mathematical discovery and reasoning, leading up to Euclid's Elements. I think those uses of vision occur unconsciously in most humans and in many animals -- closely connected with information about affordances and their limitations. Some examples are here:

I don't think it is possible to understand, model or replicate human visual capabilities without exploring a variety of different questions about the uses of vision: in humans, in other animals, in their evolutionary precursors, at different stages of individual development, and in the different possible sorts of artefacts with visual capabilities that we may wish to build in future. What follows is a partial high-level subdivision of topics to be investigated regarding the functions of vision, in order to understand what sorts of mechanism might be required, or might suffice, in different contexts. This is both incomplete and very shallow: it is work in progress, at an early stage.

1. Examination of uses of vision in other animals.
For example, some visual competences of non-humans (e.g. squirrels, orangutans, and nest-building birds such as weaver birds) refute certain hypotheses about the central role of human language in human intelligence. There are also different functions of vision related to different habitats and modes of locomotion, e.g. swimming, crawling, sliding, walking, jumping, flying in open space, flying through branches to a nest, etc.

2. Attempts to understand the evolution of *uses* of vision, as we know it,
from much earlier forms of perception, e.g. because such a survey may identify some intermediate biological uses of optical information that still play an essential, though unobvious, role in human vision. (This emphasis on evolution of various kinds of biological information processing is the core of the Meta-Morphogenesis (M-M) project.) That's a huge task with intolerably sparse evidence, but I think there are ways of reducing the most obvious difficulties by careful research planning.

3. Investigating how uses of vision develop in individuals from infancy onwards.
I think there are lots of very important things going on in pre-verbal infants that most people don't notice (though Piaget noticed some of them) and which lay foundations for later developments. For example, I think metrical perception (absolute values for size, length, speed, distance, volume, curvature etc.) barely exists at first, and instead many partial orderings (e.g. X is closer to, or is getting closer to, Y) are perceived and used, e.g. in visual servoing.
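
The contrast between metrical values and partial orderings can be made concrete with a toy sketch. This is not a model of infant vision or of any real servoing system; it is a minimal illustration, under simplifying assumptions, of a controller that never receives a measured distance. The only percepts it uses are an ordering ("closer", "farther" or "same") between two successive views, plus a yes/no contact signal. The function names, the 1-D world, the step-halving rule and the contact threshold are all hypothetical choices made for this example.

```python
def compare_gap(previous_gap, current_gap):
    """Simulated percept: report only how two successive gaps are ordered."""
    if current_gap < previous_gap:
        return "closer"
    if current_gap > previous_gap:
        return "farther"
    return "same"


def servo_by_ordering(start, target, step=1.0, max_moves=200, contact=0.01):
    """Move a point along a line towards a target using ordinal feedback only.

    The simulated world computes the gaps, but the control decisions below
    depend only on the ordering percept and on a boolean contact test.
    """
    position = start
    direction = 1.0                     # initial guess; may be wrong
    gap = abs(target - position)        # world state, not read by the controller
    trace = []                          # sequence of ordering percepts
    for _ in range(max_moves):
        if gap <= contact:              # qualitative percept: "touching"
            break
        position += direction * step    # motor command
        new_gap = abs(target - position)
        percept = compare_gap(gap, new_gap)
        trace.append(percept)
        if percept != "closer":         # not improving: turn round, be gentler
            direction = -direction
            step /= 2
        gap = new_gap
    return position, trace


if __name__ == "__main__":
    final_position, percepts = servo_by_ordering(0.0, 3.3)
    print(final_position, percepts)
```

The controller reaches the target even when its first move is in the wrong direction, because a "farther" percept is enough to trigger a reversal with a smaller step. Nothing in the control decisions requires an absolute value for distance or speed.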

Another class of cases concerns use of vision in distinguishing different materials and their properties (including distinguishing them by their behaviour rather than static appearance) and using knowledge about materials to explain differences in observed behaviour of manipulated objects and materials -- sand, water, treacle, oil, syrup, butter, mud, plasticine, clingfilm, cooking foil, paper, cardboard, various kinds of cloth, etc.

I think this includes many unnoticed pre-mathematical functions, on which later mathematical competences build (unless destroyed at school).

Jackie Chappell (a biologist) and I have published some papers on how the genome can play different roles at different stages of development, with later roles being partly determined by what was learnt earlier. This can lead to questions about different uses of vision in different cultures or in different individuals with different developmental trajectories. Extreme cases are musical sight-readers, outstanding painters, architects, mathematical experts in geometry and topology, and of course programmers who are good at understanding textual programs -- including noticing bugs others don't see. I think that's partly related to abilities to see flaws in mechanical construction ("That building won't last in a heavy snow storm"). Working out the developmental trajectories of various visual competences could provide important clues as to their nature.

4. Investigating uses of vision in dealing with other intelligent species.
The work on emotion recognition could be an example, but it tends to be very shallow, and to be based on equally shallow theories of human affective states and how they relate to visible behaviour. I think there are lots of much more subtle ways in which vision is used to gain information about intelligent individuals (not just humans), e.g. what they are looking at, how they feel about what they are looking at (bored, interested, surprised, dismayed...) and what they are likely to do as a result, whether they understand something, whether they are trying to deceive, whether they are confident about what they are doing, whether they are doing it carefully or not, and many more.

Often perception of information processing in another individual is part of a sophisticated interaction, e.g. a teacher trying to understand how to help a pupil who hasn't understood something, flirting, dancing, collaborating on a complex task, and many more. The visual cues in most of these cases are extended over time and can include not just eyes and face, but body parts and their relations to other things, e.g. picking up an unfamiliar object nervously, etc.

5. Investigating cultural evolution of visual functions.
This includes a whole lot of different things -- including the uses of vision in human sign language, which can involve perception of very complex parallel movements of many body parts. Sign language is much richer than either speech or text. Cultural evolution also includes changes in the visual functions of domesticated animals.

6. Investigating uses of vision in mathematical discovery, mathematical reasoning, and related processes of designing and understanding complex artefacts,
e.g. a bi-stable spring-driven car boot (trunk) lid. This is what got me from mathematics into philosophy, and then from philosophy into AI. The research problems are very difficult, and progress is slow.

7. Uses of vision in aesthetic contexts:
enjoying a view, admiring a face, a dance, a building, a tree, a painting, etc.

8. Investigating uses of vision in connection with sexual functions,
including finding a potential mate attractive, being sexually stimulated, etc.

There is much more to be added.
In particular, I have been collecting a variety of different examples of human visual abilities related to mathematical reasoning in geometry and topology. This connects with abilities to play with and understand certain kinds of toys, especially construction kits (e.g. Meccano, Tinkertoy, Fischertechnik, etc.). Even understanding how clothes work, and how you can and cannot put them on, is related to this. An example is here:

Part of the argument is that there is no sharp divide between what counts as vision and what counts as something 'more central'. Often more central processes go on in registration with some of the optic array details. A surprising case is this illusion:

Look at each face for a couple of seconds at a time. You may notice that the eyes look different. That may be connected with the way some things look fragile, look unstable, etc. (The eyes are geometrically identical.)

It would be useful to develop an outline (possibly evolution-inspired) framework for collecting different functions of vision into a web site that will include information from many disciplines, and can go on growing or being modified.

I think that might be particularly useful both for young researchers trying to find interesting new research problems, and for people trying to plan research or development projects, find collaborators, etc. (This is a result of observing how narrowly most vision researchers seem to think about the role of vision, and how disparate and disconnected the research community is.)

A lot of help will be needed to grow it and develop the structure. I don't know how many researchers would be interested enough to contribute, instead of simply continuing with their existing focus.

In parallel with developing a map of functions of vision, it would be useful to develop a map of components of possible explanatory models, for example a map of

This would build on and feed back into the map of uses/functions of vision.

But it's important to recognise that producing maps of functions and producing maps of mechanisms are two different tasks, since in general different mechanisms can serve any specific function. Most researchers seem to focus only on mechanisms, making unacknowledged assumptions about the functions, often different assumptions in different research teams.

To be added


Maintained by Aaron Sloman
School of Computer Science
The University of Birmingham