This file is part of the 'Grand Challenge' directory: discussing proposals for a long term project entitled 'Architecture of Brain and Mind'.

Thoughts on Ontologies, Architectures, Cognitive Systems and Tools
Aaron Sloman

These notes arose from correspondence regarding possible plans for a project on cognition. I was sent a draft presentation to comment on; the notes expand on points made in my reply, originally written 13 Jul 2003 and subsequently updated in the light of comments and afterthoughts.
Last updated: 22 Aug 2003


Most research proposals start from things we know and can do (i.e. existing theories, models, tools, architectures) and ask what their strengths and limitations are, and how they can be combined, modified, extended, etc.

We can call that research management style "forward chaining" and contrast it with another, "backward chaining", mode of research that attempts to identify things we cannot yet explain or model but which are required for our long-term goals -- and then works backwards to what is needed for the task, which may, in many cases, be very different from anything considered in the first approach.

Of course, identifying what we don't know is quite difficult. It is not something people are taught to do in degree courses. It requires a special way of looking at the world, constantly asking, even of familiar things: How did that happen? Or: Why didn't X happen?
(Why does a detached apple move downwards?)

Below I give some examples.


We should try to identify important things we don't understand and work backwards from them (where feasible).


Humans build ontologies for perceptual and other processes as they develop. We don't know much about this, but it is crucial for replicating many aspects of human intelligence, and influences the architecture.

In order to achieve the high level goals of the project we need a much better understanding of the processes of acquiring, extending, storing, using, and debugging an ontology.

This relates to research on varieties of forms of representation.

It also relates to the role of affordances in perception.

Tools Required

Flexible, extendable tools are needed that facilitate exploration and development of these ideas. I don't think existing toolkits (mostly geared to particular sorts of architectures and particular forms of representation) are adequate.

So a major requirement for progress is probably collaborative development of powerful new tools with features that existing tool-builders have not yet thought of.

All of this may sound defeatist: it makes the task look too hard. But I think that careful analysis can lead to a host of identifiable sub-goals that can drive the whole adventure.

The rest of this note gives more detailed examples and discussion.


We can see the need for understanding ontologies and how they grow by examining closely many developmental transitions in young children.

Toy Train Example

An example is the observation that a child aged about 19 months who can pick things up, kick or throw a ball, stack blocks, open doors, put things in slots, and many other things, can be stymied by a wooden train set made of trucks each of which has a hook at one end and a ring at the other, like this:
Toy train truck with hook and ring

A bright and competent child may be able to take the trucks apart but not able to join them.

This can be seen on a video of Josh (made available with permission of his father, Jonathan): (4395012 bytes.)

Josh tries to join the two rings together like this, and fails, apparently without understanding why:

Trying to join two rings together.

A bigger, higher resolution version, with extra stuff is in this file: (11501572 bytes)

He picks up two trucks joined together, holding one of the trucks. He shakes it so that the two come apart, then while still holding the first truck picks up another and tries to join them together. He is aware that the two ends are different, and rotates the second one before bringing them together.

But he holds the second one so that its ring is against the ring of the first one, instead of turning the other one round so that ring and hook come together.

He tries to join them, looking intently at the two rings as they come together, fails, gets frustrated and does something else, apparently not noticing the significance of the fact that the two ends of the trucks are different, even though he has often seen two trucks joined together and has separated them.

(By the way: at first I mis-perceived what was happening. I thought he had not noticed that the two ends of the truck were different. Then I looked again at the video.)

I have not been in London recently, so could not try the obvious experiment of showing Josh the right way to join the trucks to see if he can grasp it.

Sometimes if a child is not ready, you can show till you are blue in the face and the crucial structural relationships will still not be grasped.

Why? And what changes when he later finds it all obvious and trivial?

I phoned Jonathan yesterday to suggest an experiment to see if Josh could be taught to do it the right way. I was too late. Josh had already worked it out (by about two weeks after the date of the video) and was now immediately matching hook to ring to join the trucks.

Jonathan had not noticed when or how the transition came.

I've asked Jonathan to try to get a movie of him doing this, to bring out the difference.

Ambiguities of interpretation:

However, even though Josh can now do the task, we cannot tell whether he merely has a kind of 'rote learning', without any understanding of *why* it works.

I.e. he may have learnt *that* pushing the end of the hook through the ring works, whilst moving two rings together does not, and he may have learnt *how* to make the movements that produce this change, but without understanding *why* it works.

What more is required for understanding why?

It would at least require understanding the possible ways in which the context could vary, the possible ways in which the actions could vary and the effects of such variations on the requirements for and possible ways of doing the task:

If the hook were made of very flexible wire what would the consequences be?
If the wire used for the hook were much thicker, what would the consequences be?
If the wire were moved in at a different angle, what would the consequences be?
If the trucks were joined with one held upside down what would the consequences be?

What if you want one truck to *lift* the other, not pull it horizontally?

What would happen if the hook were rotated 90 degrees to lie parallel with the top of the truck?


How much of that does a child grasp, and how does the understanding grow?

How is the knowledge, and the ability to use it, represented in the child?
(Saying that some of the information is in the environment, which is currently fashionable, though Herbert Simon pointed it out many years ago, is true, but does not provide an explanation. The same information was there before and after the child's understanding grew. More precisely: it was available in the environment for an agent to make use of; but what resources does such an agent require? Eyes are not enough.)

I'll ask my developmental psychology friends if this ontology growth and deployment is something for which they already have explanations. I'll be surprised, since in general psychologists don't ask the same sorts of questions as a designer does.

I am fairly sure there's little or nothing in AI on this kind of grasp of affordances and the associated meta-competence. (It's the sort of change in competence Piaget might have noticed, though he lacked any appropriate explanatory concepts.)

Here are some (still too vague) conjectures (and questions arising):

Learning to see and use what you see in performing actions requires, among other things, developing an *ontology*. In the example above the ontology might include

and various other sub-categories.

E.g. closed circular, closed with corners, closed convex, closed concave, and many many more.

Likewise open straight, open bent, open curved, etc.

Different individuals will have different sub-categories. Not all categories will have names in the individual's spoken language.

Chimps, hunting animals, grazing animals, berry picking animals, nest-building birds, generals directing battles, all need different ontologies, though there may be important overlaps between what they need.

The ontology will also include many relationships, including metrical, topological and other relationships (Tony Cohn at Leeds and his collaborators have done a lot of work on topological relationships: but I suspect that so far only a *tiny* subset of what is needed has been done).

These are just the *structural* components of an ontology.

Causal/functional components of an ontology

The ontology also needs to include *causal/functional* components, which are far more subtle.

I.e. the perceiver needs concepts for the causal powers associated with various structures and their relationships including:

If the child (or a crow, or a general) has not grasped the ontology required for perceiving certain functional relationships, then she will not be able to perceive those relationships, and will not be able to perform actions that make use of them.

Stages in the development of an ontology:

There may be intermediate stages where special cases of the general competence are learnt without using the full ontology.

But then the person will normally not *understand* why those special cases work (e.g. in the way that termites probably do not understand why their actions when building a tower have the effects they do).

Trained circus animals are probably like that. Most AI programs that do anything interesting nowadays are also like that at present: they lack *understanding* of the domain in which they operate, and they lack self-understanding about their understanding and lack of it.

Early expert systems were supposed to be able to show understanding by answering Why? questions. But all they did was spew out a back-trace of their rule activations.

This proved inadequate. Why?

(A partial answer:

What about birds building nests: do they know why they do what they do as they weave in the next twig?

Does Betty the crow understand why she needs to bend that piece of wire, and does she understand how her actions cause it to be bent?

Does the variety of ways she can do the bending show that she understands what bending is?

Then why does she not see that the straight wire needs to be bent *before* she tries and fails to use it? Is she lacking in a crucial part of the ontology we have that enables us to see immediately that it will not work? Do some (all?) human children go through the kind of partial understanding that Betty has?

Are transitions through some kinds of partial understanding *necessary* for a system to develop a fuller understanding?

Can someone who has never got things wrong understand what it is to get things right?

(What are the implications of this for automated military, or medical, or other applied cognitive systems?)

Can we find out what Betty does and does not understand?

(Perhaps, but only by difficult and roundabout investigations, with all the uncertainty that attaches to all deep explanatory science.)

Unanswered questions

So we have many questions arising out of this sort of example:


This issue is discussed in several of the slide presentations here
(e.g. in talks 14, 20, 21, 24, 25), and in several of the papers in the Cognition and Affect project directory.


The situation gets more complex when two individuals have to communicate about these things.

Effective communication requires not only a good ontology for the topic being discussed but also an ontology relating to communication: its purposes, means, ways of going wrong (producing misunderstanding, giving away too much, triggering unintended side-effects), ways of detecting and repairing mis-communication, etc.

How does a child grow such an ontology for communicative states and processes?

Does that require different processes from those involved in the development of the ontologies used in perception, planning and action?

(I suspect the meta-management layer in the architecture is crucial here: some aspects of its ontology are applicable both to oneself and to others.)

Note, however, that having a *general* ontology for types of communication and their features, plus having a usable ontology for perceiving and acting on certain sorts of objects, do not necessarily combine to provide the ability to communicate *particular* things about those objects.

(How can you describe the taste of cinnamon to someone who has never encountered it?
Can a child who knows how turning a screw to fasten something works, explain to others how it works? Maybe.
How many people can communicate what someone they can recognize looks like, to people who don't know the person: why are identikits needed?
Here's a toy one:
Why can't some pictures be translated into words?)

Cyc/OpenCyc is an attempt to address some of these issues, but it is not clear to me that it is deep enough, or general enough, or based on a good meta-ontology for most of the objects of perception in everyday life.

KQML was an attempt to do something about these issues, but as far as I know it presupposed that something like a logical formalism could be used to express all ontologies.

That's not obvious to me.

As far as I can tell the work on ontologies in AI is mainly concerned with requirements for explicit reasoning and communication, not with ontologies required for perception and action, in particular perception of affordances and actions that make use of affordances. (Both positive and negative affordances.)


A particularly interesting type of ontology is that required for self-understanding of various types. The requirements for other-understanding are similar but not identical.

They probably co-evolved once a meta-management layer developed in the architecture of a subset of species.

The interplay between self-understanding and other-understanding in development and learning is very important: many scientists seem to assume that other-understanding, based on observation of others, comes first. I think that's wrong: other-understanding would not be possible without an ontology appropriate to self-understanding. (One day when I write a proper paper on meta-management I'll try to make this clear, though I can't say I understand it fully.)

Many philosophers have argued the opposite: that other-understanding comes from self-understanding by inductive extrapolation. I think that's wrong: there were evolutionary pressures to provide innate mechanisms (including partial ontologies) supporting both. What are they and how do they develop?

[There's probably more of this than I remember in Minsky's Emotion Machine and his other writings.]

To investigate all this we have to study many tasks in great depth to find out the ontologies involved (which is not easy, and we can make mistakes) and then analyse the task requirements to see whether they constrain useful forms of representation for those ontologies.

(Minsky's CausalDiversity paper on his web site analyses some of the trade-offs, but only in a few dimensions. How many dimensions are there?).


Of course, in posing all these questions this way I am using a meta-ontology, or collection of meta-ontologies, which (partially) defines a space of problems and possible solutions, but that meta-ontology could be quite wrong.

(E.g. some dynamical systems theorists would claim I am badly mistaken.)

In order to make progress with the science (as opposed to producing systems that just work on particular tasks) we need a good meta-ontology also.

I suspect that can only emerge from a lot of work in trying to devise specific ontologies and then attempting to generalise from the special cases. Unless someone can derive a meta-ontology from functional requirements....

None of this implies that what's in your draft is wrong or unnecessary. I feel it is incomplete, however.


It would be useful to have an architecture design language.

Doing that well requires a good ontology not only for tasks and target systems but also for stages in the design and development process. I don't think we have that collection of ontologies yet. So all proposed solutions must be treated as highly provisional.

Some people would want to distinguish the language for specifying designs and the implementation language. However, if you have a high level implementation language (which requires some sort of powerful interpreter or compiler) that can also function as a design language.

That was essentially how our SimAgent toolkit was developed, though it needs an even higher level design language separate from the implementation language, for some purposes. (I think that can be done via a library mechanism providing larger building blocks.)

The most substantial applications were toy military demos produced by Jeremy Baxter and colleagues at DERA, but that was not its main target. There's more information here: and even more in online files pointed to from that and in the documentation subdirectories of the "sim" and "prb" sub-directories of the toolkit:

Rulesets as communication channels.

I'll just mention one feature: in most rule-based systems condition-action rules are thought of as primarily a mechanism for either making inferences or triggering actions.

In SimAgent they can also be communication channels between sub-systems in the total architecture, because both conditions and actions can run arbitrary code.

E.g. a condition can get some information from a neural net, or some intermediate level perceptual buffer, and then an action in the same rule can transform the information and feed it to a planner, either by directly invoking the planner or by putting the information where the planner and other things will find it.

I think such rules are a powerful integration mechanism. I wonder if we'll one day find that brains implement something close to that, in ways not yet imagined?

Summary: rulesets can be a generic kind of compositional glue.

My understanding is that common ways of implementing rulesets, e.g. using the Rete algorithm for efficiency, restrict the diversity of rule conditions and actions, so that they cannot have this function. This restriction is found in many AI systems.
(Maybe I've missed something.)

This limitation can arise out of the view that a 'cognitive architecture' is something special -- a self-contained module, separate from perception and action for instance. I think several contributors to the recent Stanford symposium on Cognitive Architectures held that view. I think that view can restrict the vision of researchers. My slides attempting to point this out never made it to the symposium website:

There are many requirements for such design and implementation tools that are not, as far as I know, met by most AI agent toolkits, e.g. the requirement to vary relative speeds of components in order to explore the effects of resource limits and mechanisms for overcoming them.

Enough for now.

Maintained by Aaron Sloman