This is one of many (illustrated) discussion notes related to evolution, life,
perception, reasoning and mathematics. Others are listed here
How much of what you see can you describe?
If you were walking around among the plants, you would see not only static structures but also many changing relationships.
You certainly cannot put into words exactly what you see in the above pictures, since no human language has a vocabulary rich enough to express all the details. (J.L. Austin summarised this as: "Fact is richer than diction", in 'A plea for excuses'. He could have added "Experience is richer than diction".)
The vast majority of people can't draw or paint what they see. It requires a great deal of training, or at least practice, to be able to draw such things in such a way that others can tell what you have drawn. Even when a painting has been produced by one of the finest artists, there will be physically different scenes of comparable complexity, with visible differences of detail, that you will not be able to distinguish from the scene in the painting.
One of the reasons for that is that such scenes have more visible detail than can be humanly depicted using drawing or painting materials. Another is that slight differences of length, curvature, shade of colour, texture, speed, direction of motion, acceleration, or rotational speed can occur in many places in two real live scenes, and even if some of them are noticed, others will not be.
This is to be expected in any visual system, natural or artificial, that is not functioning as a mere camera recording images, but as part of an intelligent agent acquiring information that could potentially be put to some use, e.g. in processes of control of actions relating to the things perceived, or in classifying or describing complex structures and processes in the environment, or in trying to explain something that has been noticed (e.g. where a peculiar shadow comes from), or in trying to decide whether something has changed or not, or in comparing two diagrams or formulae while trying to prove a mathematical conjecture.
These are potentially very complicated information-processing tasks, and mechanisms for performing them will always have some limitations, whether they are limitations in precision of measurement or classification, limitations in complexity of visible structural relationships to be compared, limitations in storage across time when changes are to be detected, or limitations in ability to work out detailed implications concerning unseen parts of a scene on the basis of information about seen parts.
Despite all those limitations in precision, amount of detail, and inference capabilities, most humans beyond a certain stage of development will be able to see not only that these are pictures of different scenes but also that most of the scenes have various forms of regularity or repetition within the scene, and that many have recognizable structure when seen from different viewpoints. (Precisely what is seen in each picture may depend on the viewer's interests, culture, normal or previous environment, education, skills acquired in the workplace, etc.)
This demonstrates that biological evolution produced two different classes of mathematical phenomena, both illustrated by these pictures.
Of course, natural selection produces information processing capabilities related to many features of the environment that are not products of evolution, e.g. rocks, sand, water, clouds, changing illumination, changing seasons, etc. But when objects perceived and used evolve and the systems that perceive and use them also evolve, the positive feedback loops between the evolutionary processes can produce remarkably complex physical structures and controlled behaviours that are intimately linked by information processing mechanisms.
As far as I know, there is no existing AI or robot vision system that can acquire the kinds of information that (adult) humans can acquire from the above pictures and the scenes they depict, much of which seems also to be part of the visual capabilities of other species, though different species need different subsets to meet their foraging, feeding, mating, hiding, fighting, nest-building or other needs.
(Attaching labels to regions of images, or to parts of image sequences, has little to do with this: perception provides information about what is in the environment, not what's in sensory input. The latter is far more variable, full of mostly irrelevant detail, and far less useful for an organism that needs to act in the environment, part of which may be out of sight, e.g. a hunting animal cautiously moving while avoiding being seen by previously observed prey.)
Perception of human artefacts may use similar products of evolution
A very different sort of capability is illustrated in this picture.
You probably see a pencil and a mug and flat horizontal surface below the mug. Moreover, you can almost certainly think about a wide range of ways in which the relationships between pencil, mug and surface can change. (I call those sets of possibilities and their constraints "proto affordances" because they have the potential to become affordances, in James Gibson's sense, for organisms with appropriate capabilities and needs. But organisms without those capabilities or those needs can often understand the same possibilities for change.)
Now consider possible ways in which you can grasp and move things in the scene depicted. Among those possibilities is there a sequence of actions that will allow you to use your hand to lift the mug vertically off the surface, without any part of your skin touching the mug anywhere?
What difference does it make if you have access to a glove or a piece of cloth?
Is any object not already visible in the scene required for the task of lifting the mug?
A very young child might not be able to work out a sequence of movements that can meet the requirement. But people looking at this web page probably can, even if they have never lifted a mug in that way previously.
Moreover, if the mug is almost full, you can probably also reason about whether it is possible to lift it without grasping any part of it and without spilling any of the liquid in it.
As far as I know, there is no existing AI system with a combination of visual, reasoning, planning and motor control capabilities that is capable of thinking about the actions referred to above, though there are many that could be trained to perform sequences of actions to achieve such goals. Those that can be trained are able to attach labels to specific percepts, and to respond to a variety of particular configurations to achieve a type of goal -- e.g. grasping a mug, catching a ball, walking through a door, and many more.
A robot that can play table-tennis is able, as a result of training (possibly with abilities to extrapolate from or between previously experienced situations), rapidly to select appropriate behaviours whenever it sees the ball approaching. But that does not require the ability to think about effects of variations of the actions, or variations of the shapes or relative sizes of the objects, or to reason about what might have happened if it had reacted differently, without actually going through the details of the different scenario.
For any particular configuration different from a state that actually occurred, and a particular sequence of hand movements in exactly that different situation, some robots may be able to work out the precise consequences of those movements in that situation, by simulating the process. But simulating such a specific process is very different from being able to reason abstractly about a class of configurations, processes and constraints. Compare our ability not only to check that the angles of a particular planar triangle add up to half of a complete rotation, but to reason that the same must be true for any planar triangle, no matter what its angles and what its size, orientation or location. (A task that is discussed in more detail here.)
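As a reminder of the kind of abstract reasoning being contrasted with simulation here, the classical argument can be sketched in a few lines (a minimal sketch, not part of the original note; the labels A, B, C and the auxiliary line L are illustrative):

```latex
% Sketch: the angles of any planar triangle sum to a straight
% angle (half a complete rotation).
% Let ABC be any planar triangle. Draw the line L through A
% parallel to BC. By the alternate-angle property of parallels,
% the angles that AB and AC make with L at A equal the angles
% at B and C respectively:
%   \angle(L, AB) = \angle B, \qquad \angle(L, AC) = \angle C.
% The three angles at A on one side of L together form a
% straight angle, so
\[
  \angle A + \angle B + \angle C = \pi .
\]
% Nothing in the argument depends on the triangle's particular
% angles, size, orientation or location, which is why the
% conclusion holds for the whole class of planar triangles.
```

The point of the contrast is that no amount of simulating particular triangles delivers the final comment in the sketch: the recognition that the argument is invariant across the whole class.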
For reasons that will be explained elsewhere, I think the points discussed in this note are very relevant to answering the question: how did human mathematical capabilities of the sorts used in producing Euclid's Elements originally develop, and what were the evolutionary requirements for this to happen? Part of the answer is that evolution blindly used various kinds of mathematics to produce some of the structures that now exist as a result of growth and development of organisms, and many other organisms need mathematical or proto-mathematical abilities to perceive and use those structures, their possibilities for change and the constraints on those possibilities.
This is related to, but deeper than, abilities to perceive and use analogies. Why? Because knowledge about an analogy (or metaphor) involves knowledge about the things between which the analogy holds, including their parts and relationships, whereas an organism with knowledge about an abstraction can discard the details of the instances from which the abstraction is derived and add details as needed when dealing with instances. Moreover, different abstractions can themselves be found to share features that allow higher order abstractions to be derived. Being required instead to build layers of metaphors, and metaphors of metaphors, etc., would be far more cumbersome than this process of building layers of abstraction, which characterises the history of mathematics, and far less powerful than the mathematics we know.
How is all this done? What mechanisms make it possible? How did they evolve? How do they develop in individuals? How are they used? These are all questions whose answers are very far from obvious. This page, and others, collect examples that may help to constrain our search for answers.
Installed: 25 Aug 2014
After a family visit to Wisley Gardens.
Last updated: 11 Sep 2014; 21 Jun 2015 (Subtitle)
This discussion paper is
A PDF version (which may become out of date) is here.
(Firefox on Linux can print to file as PDF.)
A partial index of discussion notes is in
School of Computer Science
The University of Birmingham