School of Computer Science

(DRAFT: Liable to change)

Aaron Sloman
School of Computer Science, University of Birmingham

Installed: 5 Jan 2016
Last updated: XXX
This paper is
A PDF version may be added later.

A partial index of discussion notes is in

Request from AAAI

This was a response to a request for pointers to research on "robust AI",
including robustness to unmodeled phenomena---the "unknown unknowns".

My answer is long because this has been the focus of my research for many
years, most recently in the framework of the (Turing-inspired)
Meta-Morphogenesis project, which aims to identify important transitions in
biological information processing since the very earliest organisms and
pre-biota. It was triggered by wondering what Turing might have done if he
had died three or more decades after his 1952 morphogenesis paper, rather
than two years later.

Some of those previously un-noticed transitions may give us clues as to
what we are currently failing to identify in brain functions and
(therefore) in brain mechanisms: e.g. capabilities that will be needed in
more complete future AI systems.

This has given a new shape to work I've been doing for half a century,
starting before I encountered AI (thanks to Max Clowes).

Since presenting a critique of McCarthy and Hayes at IJCAI 1971, my aim has
been not to prove that AI must fail (e.g. like Dreyfus) but to identify
gaps that need to be filled so that it can succeed in its long term
(scientific) aims.

There's still a long way to go, mainly because of gaps that go unnoticed by
most AI researchers. (Like the people who once thought Newton had all the
answers and the rest was just a matter of filling in details).

This is AI not as engineering but as science, i.e. attempting to explain,
model and if possible test theories by replicating natural forms of
intelligence that at present are not understood -- like the intelligence
that led up to Euclid, the intelligence of squirrels defeating
bird-feeders, the intelligence of weaver birds making nests using several
thousand knitted/knotted leaves, the intelligence of composers who produce
great music, the intelligence of listeners who respond to it without
needing to have it explained, even centuries later, the intelligence of
human toddlers exploring 3-D topology, and the intelligence of deaf children in
Nicaragua who could not have learnt the sign-language they used from data
because they (mostly) created the language themselves.

    This short video documents the episode:

These scientific AI goals seem to have been sidelined recently, though
understanding and modelling natural intelligence was an important goal for
founders of AI, including Turing (in a letter to Ashby), McCarthy, Minsky,
Simon and others.

However, I think the more ambitious engineering goals will not be achieved
while so many gaps in scientific understanding remain. It may take longer
than the rest of this century -- even if applied AI continues to make
spectacular progress in many sub-fields.

Things I have been trying to analyse from the designer stance include the
kinds of perception, learning, and reasoning processes that might have led
to the production of Euclid's Elements -- long before the discovery of
modern logic-based, formal, mathematics. (Arguably Euclid's book is the
single most important publication ever produced on this planet. Its results
are still used every day by scientists and engineers all over the world.)

Modern AI theorem provers that start with axioms and rules expressed in a
logical notation, and attempt to find proofs derived from the axioms in
accordance with the rules, do not model processes of the sort in question
(in part that was Frege's criticism of Hilbert's attempt to logicise
Euclidean geometry, though I would make a similar criticism of Frege's
great work attempting to logicise arithmetic -- as a result of which he
produced some of the powerful constructs now commonplace in AI
programming languages, and some others, e.g. higher order functions).

I think Immanuel Kant, in his discussion of the nature of mathematical
knowledge in The Critique of Pure Reason (1781) started moving in the right
direction, and would probably have used AI with glee if it had been
available then.

Anyhow, for several decades, in addition to working on AI projects
(including building tools and formalisms used by students and colleagues) I
have been collecting examples of capabilities that don't seem to fit
current AI techniques (e.g. trying in the 1980s to specify the functions of
vision ignored by Marr and others, and the architectural requirements for a
wide variety of emotional and motivational phenomena, including some
ignored by most researchers, like long term grief) and I have recently been
trying to assemble and organise these long term requirements and relate
them to products of biological evolution, in a messy and growing web site:
the Turing-inspired Meta-Morphogenesis (M-M) project

Here are a few example challenges.

A key feature of biological intelligence (almost, but not quite, recognized
by James Gibson, though he made some moves in the right direction) is the
ability to grasp sets of possibilities that have nothing to do with
probabilities, but do have absolute limitations, i.e. things that are
impossible.

E.g. a child playing with similar blocks on a table top could discover ways
of arranging groups of blocks, in regular arrays, as illustrated in these
three examples:




Every group can be arranged in a line, like the first example. Sometimes
one group can be arranged in more than one way, e.g. as a line or a
rectangular or square array, or in several different ways, e.g. 64 blocks.

But sometimes if you add or remove a block the possibilities change
dramatically. E.g. 68 blocks can be arranged in several different
configurations, but if you remove one block only one possible configuration
remains. Why?
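The arithmetic behind the block example can be checked mechanically (though of course a table of divisors is not the same thing as a child's discovery that other arrangements are impossible). A minimal sketch; the function name is mine:

```python
def rectangular_arrangements(n):
    """All rectangular r x c grids (with r <= c) using exactly n blocks."""
    return [(r, n // r) for r in range(1, int(n ** 0.5) + 1) if n % r == 0]

# 68 blocks allow several rectangles; remove one block and only the
# single-line arrangement remains, because 67 is prime.
print(rectangular_arrangements(68))  # [(1, 68), (2, 34), (4, 17)]
print(rectangular_arrangements(67))  # [(1, 67)]
print(rectangular_arrangements(64))  # includes the 8 x 8 square
```

The "dramatic change" when one block is added or removed is just the jump between a number with several divisor pairs and a prime.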

(Gibson, apparently, did not notice "negative" affordances with mathematical
content: things that are impossible.)

How could a young robot playing with such blocks come to realise that
some of the rearrangements are impossible? (I don't know how many humans
can, unaided, but some can. I suspect more would be able to if primary
schools were run differently.) Different sorts of impossibilities involving
blocks are mentioned below.

Three or more straight lines drawn on a plane can enclose a finite region
of the plane. Why can't that be done with two lines? (One of Kant's
examples.)

Is there a similar limitation on plane surfaces arranged in a 3-D space
to enclose a finite volume of the space? How could a machine discover, and
understand the limitation?

(Does anyone have a geometric theorem prover that can find the answer?
Would it have to use geometry arithmetised, following Descartes? You
probably have another way of thinking about it. Can that be programmed
into a robot now?)
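For comparison, the arithmetised, Cartesian route to the two-line case is short, though it seems far removed from how a child or squirrel grasps the impossibility. A sketch:

```latex
\ell_1:\; a_1 x + b_1 y = c_1, \qquad
\ell_2:\; a_2 x + b_2 y = c_2, \qquad \ell_1 \neq \ell_2
% Linear algebra: two distinct lines meet in at most one point.
|\ell_1 \cap \ell_2| \le 1
% A finite region bounded by straight edges is a polygon with at least
% three vertices, each an intersection of two boundary lines, so two
% lines cannot supply enough corners to enclose anything.
```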

Similar discoveries about impossible spatial structures might be useful for
future robot architects -- saving a lot of time trying to build impossible
structures.

There are many ways flexible strings can be moved around. In particular,
a string can be threaded through one or more holes in a piece of leather
(as in a shoe). Suppose it goes through only two holes: how many different
ways are there of removing the string from the holes? How can you be sure
that you have counted them all? [Assume two removal processes are the same
if the ends of the string go through the same holes in the same direction.]

You can remove the string by pulling one end, or by pulling the other end.

Why can't you remove it even faster by pulling both ends?


If you want to put a tight fitting shirt on a child, or a doll, why is it a
mistake to start by pulling a sleeve up one of the arms?

What would enable a young robot to have the intelligence to be amazed at a
stage performer who seems to be able to make two disconnected solid rings
become linked together?

Young humans are amazed: they don't need lectures in topology to understand
that what they appear to be seeing cannot happen. It's not just
unfamiliarity. I can do many totally unfamiliar things that will not be
seen as impossible.

What sort of robot would be equally amazed?

Why must the three internal angles of a triangle sum to half a rotation?
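For reference, the standard Euclidean answer leans on the parallel postulate: draw the line through the apex parallel to the base, and compare angles.

```latex
\beta' = \beta, \qquad \gamma' = \gamma
    \qquad \text{(alternate angles across the parallel)}
\beta' + \alpha + \gamma' = \pi
    \qquad \text{(angles on a straight line at the apex)}
\therefore\; \alpha + \beta + \gamma = \pi
    \qquad \text{(half a rotation)}
```

The interesting question remains how a machine could find or understand such a construction, rather than merely verify it.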

What would have to go into a future AI system to enable it to discover
or understand the proofs discussed here:

It's well known that there is no construction that will trisect an
arbitrary angle in Euclidean geometry.

But Archimedes was aware of a fairly *simple* extension to Euclid that makes
it possible to trisect any angle, discussed here:

(It can also be done using origami geometry.)

What sort of AI system could discover that sort of extension to Euclid,
and discover that it could be used to trisect angles?

You and I can think about closed, non-self-crossing curves on the surface
of a torus. We can also discover that there are classes of curves that are
equivalent in that each curve in a class can be continuously deformed into
any other in that class (e.g. circles on the sidewall of a tyre surrounding
the hole).

How do you know that a curve round the sidewall cannot be continuously
deformed (in the surface of the torus) into a curve round the tube?

How can such discoveries be made for the first time?

If C1 can be continuously deformed into C2, then C2 can be continuously
deformed into C1. Why? How do you know? How could a robot know, without
being told?
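One formal answer, for reference: represent the deformation as a homotopy and run it backwards -- though this hardly explains how a robot could *see* the symmetry.

```latex
H : S^1 \times [0,1] \to T, \qquad
H(\cdot, 0) = C_1, \quad H(\cdot, 1) = C_2
\bar{H}(s, t) := H(s,\, 1 - t)
\;\Longrightarrow\;
\bar{H}(\cdot, 0) = C_2, \quad \bar{H}(\cdot, 1) = C_1
```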

If two curves C1 and C2 can't be continuously deformed into each other on
the surface, they are in distinct equivalence classes, otherwise the same
equivalence class. How many distinct classes of simple continuous, closed,
non-self-crossing curves on a torus are there? How do you convince
yourself that your answer is correct?

Could you have two equivalence classes of curves EC1 and EC2, such that EC1
contains EC2, but not vice versa? How will your robot know, without being
told?

How do all these mathematical capabilities grow out of products of natural
selection: what were the biological requirements that were being met by our
ancestors' ancestors ... that later made mathematicians possible?

(I suspect there were several stages, some shared with other species,
followed by three layers of meta-cognition apparently unique to humans:
but not all available at birth -- for good reasons their epigenetic
development has to be delayed: why?)

I am not claiming that these mechanisms are infallible: the work of Imre
Lakatos (in Proofs and Refutations (1976)) on the ups and downs of Euler's
theorem about polyhedra (E = V + F - 2, where E is the number of edges, V the
number of vertices, and F the number of faces) demonstrates the fallibility
of human mathematical
abilities -- and some of the debugging and recovery processes that are
possible in intelligent systems.
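The formula and one of Lakatos's "monsters" can be tabulated directly (the counts used for the picture-frame polyhedron below are the standard ones for that example):

```python
def euler_characteristic(V, E, F):
    """V - E + F for a polyhedral surface."""
    return V - E + F

# The formula holds for the convex, sphere-like solids...
assert euler_characteristic(4, 6, 4) == 2    # tetrahedron
assert euler_characteristic(8, 12, 6) == 2   # cube
assert euler_characteristic(6, 12, 8) == 2   # octahedron

# ...but fails for Lakatos's "picture frame", a polyhedral torus:
# the Euler characteristic of a torus is 0, not 2.
print(euler_characteristic(16, 32, 16))  # 0
```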


The above examples concern abstract shapes and the possibilities and
impossibilities of various transformations of those shapes. Humans and many
other animals also learn about different kinds of space-filling stuff, e.g.
some rigid, some with various kinds of non-rigidity (e.g. elastic or
inelastic deformability). Many kinds of animal intelligence depend on abilities
to perceive, understand and use the kinds of deformability that various kinds
of stuff are capable of: e.g. the orangutans that use different sorts of compliance
in their motions through trees.

How can robots be given similar capabilities? Will all their knowledge have
to come from training, or could they have some deeper capabilities that
enable them to make discoveries analogous to discoveries in Euclidean
geometry, but concerned with the possible shape deformations of different
kinds of matter?


Groups of similar cubes can be arranged in space to form various shapes.
How can a robot think about, and draw pictures of, different spatial
configurations of nine cubes in 3-D space, without ever once making use of
cartesian coordinates?

How could it discover that there are some configurations that can be
drawn, but could not possibly exist -- e.g. the configuration discovered
by Swedish artist Oscar Reutersvard in 1934 discussed in this file (still
under construction/revision):
    Skip to section: Pictures of possible and impossible object configurations

There are many other aspects of human/animal visual perception that I think
AI systems are not yet close to replicating, partly because the
requirements tend to go unnoticed by most researchers.

Here's an example that needs a lot more discussion than could fit into an
email message. I have some videos taken with a (cheap) camera moving around
a fairly rich and varied garden with occasional gusts of wind making
petals, leaves, etc. move. What do our visual systems achieve when looking
at those videos or moving round the garden looking at the bushes, shrubs,
trees, flowers, etc. (without being familiar with the species there)?

I've posed a task for AI: *not* to design a system that can do what we
do, but to design a set of *requirements* for such a system!

I think most vision research is tested against very limited sets of
requirements, and often the wrong ones.

E.g. 3-D stereo vision systems (or visual slam systems) are often tested by
their ability to generate different views of a scene, when perceived from
different locations, or even the ability to produce fly-through videos. But
my brain can't do that, except for very simple views. Expert artists are
much better, but that's a specialised application of some powerful
mechanisms shared with non-artists -- and birds, squirrels, hunting
mammals, and others perhaps??

So what do normal human visual systems do during walks around a botanical
garden full of previously unknown (to the viewer) plant forms?

I don't think anyone at present (and that includes me) can specify the
requirements to be met by an AI vision system that can do what we do when
looking at complex, varied, changing, scenes.

But I've been collecting many fragments of the competences, e.g. telling
whether two flowers never seen before are likely to be members of the same,
previously unknown, species; or whether an unfamiliar object seen from one
viewpoint is also one of the objects visible from another viewpoint, where
its 2-D projection is quite different.

[Unfamiliarity rules out use of previous training on that shape.]

What does a nest-building crow need to see in order to select a location
for the next twig it brings to the unfinished nest?

What does it need to see in the part-built nest in order to control its
search for the next twig? Or does it just fetch any available twig and then
see how it can be used?

[Does anyone still remember Betty the hook-making crow from New Caledonia,
in Oxford 2002? (Alex Kacelnik and Jackie Chappell, etc.)]

Not all humans have the same perceptual, learning, and problem-solving
capabilities.

Some young autistic-spectrum people can spontaneously draw complex pictures
of a 3-D scene that most humans cannot, though they may improve with
training. So, a general theory of human-like intelligence must enable us to
be able to specify *generic* designs that accommodate various kinds of
exceptional *more specific* designs.

And perhaps explain why such sophisticated capabilities are abnormal?

(Perhaps related to how resources are deployed during normal and abnormal
development?)

At present AI theories partly specify mechanisms that some neuroscientists
seek in brains. And vice versa. But I think most of the research in visual
neuroscience is based on false, or at least seriously incomplete,
specifications of what needs to be explained.

[This is also true of the widely admired Perceptual Control Theory of
William T Powers, developed in parallel with a lot of AI work, but with
mutual ignorance, mostly.]


I suspect that when we have adequate specifications of what needs to be
explained we may realise that the computational powers of brains vastly
exceed the powers assumed by current theories and models of how neurons
function.

If that's right, e.g. if there's a huge amount of complex computation going
on within each neuron (using chemistry, or special properties of
microtubules?) then current estimates of when AI systems will match the
computational power of brains may be *grossly* underestimating how far we
still have to go in order to produce adequate hardware.

John von Neumann anticipated this possibility in the book he wrote in 1956
while dying of cancer, first published in 1958:

  The Computer and the Brain (Silliman Memorial Lectures)
(3rd Edition, with Foreword by Ray Kurzweil, 2012)

His calculations are summarised in this short book by Tuck Newport:
Brains and Computers: Amino Acids versus Transistors


Most of my examples are concerned with perception of possibilities for
configurations of objects and processes in which configurations change,
along with discoveries of some *impossibilities*.

(NB: none of what I have mentioned has anything to do with statistics or
probabilities. As far as I can tell 'deep learning' mechanisms cannot
produce the required capabilities -- e.g. the ability to discover and
understand proofs in Euclidean geometry or topology. I've argued with Geoff
Hinton, Juergen Schmidhuber and others about this intermittently over many
years.)

If pre-verbal children are studied carefully it becomes clear that they can
see possibilities which they then bring into existence. (Piaget was well
aware of this, though he was sadly hampered by the lack of video-recording
mechanisms. But even with video recorders, normal psychological research
techniques cannot cope with the facts of human cognitive development
because of the huge amount of individual variation that defeats
conventional research methodologies -- all that spurious statistical
analysis that conceals the important phenomena.)

E.g. here's a video of a pre-verbal toddler holding a pencil, picking up a
sheet of card with two holes, and going through carefully controlled
movements: pushing the pencil into one of the holes, pulling it out,
rotating the sheet of card to bring the other side of the hole into view,
pushing the pencil into the hole in the reverse direction, pulling it out,
pushing it again through the hole from the original direction.

She seems to have very definite intentions and very expert abilities to
bring them about. But at her age (about 17.5 months) she could not say what
she was doing, even though she was linguistically advanced for her age.

So that implies that *long* before she could say things like "I am going to
push this pencil through that hole", "The hole can also be entered from the
other side", "I am going to move the pencil through space, rotating it,
until it can be pushed through the hole in the opposite direction" she
clearly had intentions with contents related to those verbal expressions
and she was able to derive the appropriate movements of her hands and head,
including eye-gaze when the pencil was being moved towards the hole.

(She did this with no prompting, no social interaction, no imitation of
anyone else: she apparently just happened to see the opportunity, and I
just happened to have a cheap camera available, and saw my opportunity.)

Her expert, untrained, apparently unlearned ability, presupposes a rich
internal language (information-bearing system) capable of representing both
perceived and future possible structured configurations and possible
configuration-changing processes.

How many currently used formalisms for representing visual contents in
robots are capable of expressing the percepts, intentions, plans, etc.
apparently involved in motivating and controlling her actions: and the
knowledge of 3-D topology that she seems to have deployed?

What is that language? Where did it come from? What role does it play in
the child's learning to talk, later on? How is it implemented? What kind
of neuroscientific research could suggest answers?

I discuss some of these problems and related issues concerning evolution of
language in this slide presentation:
    What are the functions of vision?
    How did human language evolve?
    (Languages are needed for internal information processing,
        including visual processing)

I could go on at length about forms of representation, ontologies, gaps in
current AI, etc. But I'll end by criticising a common, but false,
assumption.

It is widely believed that humans *learn* languages from existing language
users, and attempts to design learning mechanisms that explain this
learning have driven a lot of AI research in the last few decades, after
early attempts (1960s) hit a brick wall (partly due to dreadfully
inadequate computing hardware) and interest shifted.

But I think the belief that what children achieve is learnt is *false*:
humans *create* languages but they create them cooperatively, and most
young learners have to cooperate with a large collection of more advanced
language users, so the learners are in a minority and they have to direct
their creativity so as to achieve conformity. This uses mechanisms that can
go far beyond acquiring an existing language, as demonstrated clearly by
the deaf children in Nicaragua.

Not only do I know of no evidence that any existing AI language learning
system is capable of replicating that process: I don't think AI researchers
or anyone else currently have a deep specification of the *requirements*
for such systems. Perhaps I simply haven't been reading the right books and
papers.

(Luc Steels once started working on such systems, but I don't know how deep
his goals were, and I don't know how much progress he has made, since I
last looked a few years ago.)

A learning system that I have not studied closely that seems to have some
of the goals I've described was presented in the PhD thesis of Emre Ugur
in 2010:

I heard a summary presentation at a conference a couple of years ago.

An important feature is use of what I call 'architecture-based
motivation'(ABM) in contrast with 'reward-based motivation'(RBM) assumed by
most researchers.

RBM assumes motives are chosen on the basis of their expected utility --
using available information about rewards and their probabilities.

ABM assumes that one variety of perceptually triggered *reflex* is goal
formation (like blinking, or saccades, but internal!). The goal need not be
associated with any form of utility. It's enough that the individual's
ancestors had those reflexes and as an indirect result tended to produce
more offspring, like the blinking reflexes. (ABMs can produce useful
unintended learning.)
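The contrast can be sketched as a toy program, with all names and structures invented for illustration (this is not Sloman's formal machinery): RBM ranks candidate goals by expected utility, whereas ABM lets percepts fire goal-generating reflexes that carry no utility estimate at all.

```python
def rbm_choose(candidate_goals, expected_utility):
    """Reward-based motivation: adopt the goal with highest expected utility."""
    return max(candidate_goals, key=expected_utility)

def abm_reflex_goals(percepts, reflexes):
    """Architecture-based motivation: percepts trigger goal-generating
    reflexes directly; the resulting goals have no attached utility."""
    goals = []
    for p in percepts:
        for trigger, make_goal in reflexes:
            if trigger(p):
                goals.append(make_goal(p))
    return goals

# A 'graspable object' percept reflexively spawns an exploration goal,
# whether or not any reward is predicted -- like a blink, but internal.
reflexes = [(lambda p: p["graspable"], lambda p: ("explore", p["name"]))]
percepts = [{"name": "pencil", "graspable": True},
            {"name": "wall", "graspable": False}]
print(abm_reflex_goals(percepts, reflexes))  # [('explore', 'pencil')]
```

The point of the sketch is only that ABM goal-generation consults no utility function; any usefulness shows up indirectly, at the level of ancestors who reproduced more.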

(I have not yet studied Ugur's work as closely as I should have. His goals
were a lot less general than those presented here.)

A final confession: some very clever people have been drawing attention to
interesting types of quantum computation in organisms, e.g. Seth Lloyd at
MIT.

I think the arguments about the need for quantum computation to account for
human mathematical consciousness proposed by Penrose and Hameroff are very
unconvincing.

But I wonder if it's possible that some form of quantum computation
combining aspects of non-locality and superposition might be used for
solving *huge* constraint-satisfaction problems in low-level vision,
including problems that nobody knows how to formulate yet because we don't
yet have adequate characterisations of the functions of vision: e.g.
walking round botanical gardens, or in terrain with dense and varied
vegetation, or walking around a car park where most of the surfaces are curved
in complex ways with many distorted reflections and changing highlights.

I don't yet claim that all that extra computational power is also a
requirement for the visual discovery of spatial impossibilities leading up
to Euclid -- but perhaps it will turn out to be a requirement, since that
requires abilities generalising Gibsonian affordance perception that are
still not easy to formulate.

There's lots more, including a quarter-baked theory of evolved
construction-kits (concrete, abstract and hybrid).

I've proposed a tutorial on this at IJCAI 2016, which has been accepted
though I don't think I'll have enough time to do more than scratch the
surface.

If you know of work that shows that I am completely out of date and my
claimed 'open' problems, including requirements-specification problems, are
already being solved, I would be grateful for pointers. Maybe I can then
catch up.

Best wishes, and apologies for length. [I've left out a lot!]


Aaron Sloman,
Honorary Professor of Artificial Intelligence and Cognitive Science
School of Computer Science,
The University of Birmingham, UK

