URL: http://www.cs.bham.ac.uk/research/projects/cosy/deliverables/matrix/architectures/kitty/kitty-arch-draft.html
Last changed: 23 Jan 2006

Earliest draft by Nick Hawes and Aaron Sloman
Arising from discussions in the Birmingham CoSy team


10 Jan 2006: Draft 1 upgraded to Draft 1A

Specification still under construction

Kitty is notionally a 30-month deliverable. However, given the way the EC operates, we are expected to have a working version by about 24 months, i.e. September/October 2006. So there is very little time. Since none of the first-year implementation effort was devoted to manipulation (although a great deal of effort was put into analysing requirements for manipulation), we are starting from a very backward position. This includes waiting for changes to the firmware of the Katana arm that will allow us to develop the arm-control software required for a robot that can react to changes as they occur.

Because of this 'backward starting point' for the work on manipulation, the 'core' specification of Kitty will be very simple at first, so as to allow us to be reasonably confident of having a working system for a 'minimal' demonstration by September, as specified in Jeremy's email of 30th Nov 05.

Additional functionality may be specified but without any commitment to its working in time for the next major demonstration, though with the aim of delivering by month 30.

In any case the initial core architecture must be designed so that it is extendable later on to provide the functionality for month 30 and beyond.

We expect to use a rapid-prototyping methodology, which implies going rapidly for a first draft that may be entirely discarded and a new draft specified on the basis of what we have learnt. For that initial implementation we shall therefore use tools that speed up development and testing, disregarding long term efficiency. This is a well tested software engineering strategy.

Kitty's Architecture: Draft 1A

The architecture in Kitty will need to include the following components, many of them acting concurrently, with implications that need to be specified. A partial diagram is provided.

[Diagram: Kitty Architecture]

1. 'Central' Databases
1.a The current (episodic, situational) knowledge base:
(Sometimes called 'instance memory'. Things in here are treated as facts about the current and past environment.)

1.b Temporary store for hypotheticals
(e.g. used for possible explanations, or thinking about 'what if' questions, or tentative predictions, or hypotheses under test)
[Could be implemented in the instance memory with special tags, or in a separate memory.]

1.c. Temporary store for unanswered questions

1.d. General knowledge about what is possible in the world
(sometimes called 'semantic knowledge')
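
The four stores in section 1 could be organised along the following lines. This is a minimal Python sketch only; all class, field, and method names are illustrative assumptions, not part of the specification.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the 'central' databases of section 1.
# Names and structure are assumptions, not the specified design.

@dataclass
class CentralDatabases:
    # 1.a episodic/situational 'instance memory':
    # facts about the current and past environment
    instance_memory: list = field(default_factory=list)
    # 1.b temporary store for hypotheticals
    # (possible explanations, 'what if' questions, tentative predictions)
    hypotheticals: list = field(default_factory=list)
    # 1.c temporary store for unanswered questions
    open_questions: list = field(default_factory=list)
    # 1.d general ('semantic') knowledge about what is possible in the world
    semantic_knowledge: dict = field(default_factory=dict)

    def assert_fact(self, fact):
        """Record a fact about the current or past environment."""
        self.instance_memory.append(fact)

    def hypothesise(self, hypothesis):
        """Store a tentative prediction or candidate explanation.
        (As noted above, hypotheticals could instead live in the
        instance memory with special tags.)"""
        self.hypotheticals.append(hypothesis)

db = CentralDatabases()
db.assert_fact(("on", "block1", "table"))
db.hypothesise(("would-topple", "block2"))
```

Whether 1.b is a separate store, as here, or a tagged region of the instance memory is left open by the specification; the sketch simply picks one option.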

2. The current set of motivations
This may include things like:

3. Arm and arm controller

4. Visual system (simplified first draft)
A more detailed specification of the functions and architecture of the visual sub-system will go here.

5. Alarm mechanism (possibly deferred?)
Something that detects a problem (an impending unwanted collision, or a limitation of arm movement preventing an intended effect) and then generates alarm signal(s).

Initially this could just do two things:

(a) cause all motion to freeze
(b) insert information about the alarm and action taken into the database.

Later it may be able to distinguish immediately urgent problems and potential problems. The latter would not generate action immediately (e.g. freezing) but might add some factual information to the episodic memory and generate a goal to deal with the situation.

NOTE: inputs from everywhere, outputs to everywhere (potentially)

Alarm mechanisms need to be fast, using rapid pattern matching, and should be trainable.
The requirement for speed means there will sometimes be mistakes.
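
The two initial responses, (a) and (b), could be sketched as follows. The arm interface and database representation are hypothetical stand-ins, assumed only for illustration.

```python
# Sketch of the two initial alarm responses described above:
# (a) cause all motion to freeze, (b) record the alarm and the
# action taken in the database. All names are assumptions.

def handle_alarm(alarm, arm, database):
    """React to a detected problem (e.g. an impending collision)."""
    arm.freeze()                          # (a) halt all motion
    database.append({"event": "alarm",    # (b) log alarm + action taken
                     "cause": alarm,
                     "action": "freeze"})

class Arm:
    """Minimal stand-in for the real arm controller."""
    def __init__(self):
        self.moving = True
    def freeze(self):
        self.moving = False

arm = Arm()
log = []
handle_alarm("impending collision", arm, log)
```

A later version could first classify the alarm as immediately urgent or merely potential, freezing only in the urgent case and otherwise adding a fact and a goal, as the text suggests.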

6. Goal/motive generator (apart from alarm mechanism)

7. Simple deliberative system
A deliberative system is one that can do reasoning or planning, making use of structured representations of hypothetical states of affairs or actions, which may be compared, combined, and selected as the solution to a problem.

These representations may be of many forms, e.g. propositional, action sequence descriptions, or analogical (e.g. using maps, diagrams) or neural nets.

Including one global working memory
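
The idea of comparing and selecting among hypothetical action sequences can be illustrated with a very small search-based planner. The toy domain (a gripper moving along four positions) and all names are assumptions made for the example, not part of the design.

```python
from collections import deque

# Minimal sketch of deliberation as search over hypothetical states:
# action sequences are constructed, compared against the goal, and the
# first successful one is selected. The domain is an illustrative toy.

def plan(start, goal, successors):
    """Breadth-first search over hypothetical states, returning the
    shortest action sequence that reaches the goal (or None)."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action, next_state in successors(state):
            if next_state not in seen:
                seen.add(next_state)
                frontier.append((next_state, actions + [action]))
    return None

# Toy domain: a gripper at one of positions 0..3.
def successors(pos):
    moves = []
    if pos > 0:
        moves.append(("left", pos - 1))
    if pos < 3:
        moves.append(("right", pos + 1))
    return moves
```

Any of the representation forms mentioned above (propositional, action-sequence, analogical) could sit behind the same compare-and-select loop; this sketch uses the simplest, explicit action sequences.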

8. Simple meta-management system (perhaps better called management??)
(Nick suggests separating 'normal' goal management from more general high-level control processes and restrict meta-management to the latter.)

This includes

9. Plan execution system.
(Including 1-step plans!)
This takes the current top-level goal and associated plans, sends appropriate action commands and state queries to the arm controller, and also constantly checks the state as reported by the visual system.

(Later it may also send commands to camera motors.)

More fast and fluent action will require more direct coupling between vision system and motor control system. That could be a product of learning in a later stage of the project.
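
The execute-then-check cycle of section 9 might look roughly like this. The arm controller and visual system are represented by hypothetical stand-in functions; the real interfaces are not yet specified.

```python
# Sketch of the plan execution loop: send each command to the arm
# controller, then check the state reported by the visual system.
# All interfaces here are illustrative assumptions.

def execute_plan(plan, send_command, observe, log):
    """Execute a plan given as (command, expected_state) pairs;
    1-step plans are just a single pair."""
    for command, expected in plan:
        send_command(command)        # action command to the arm controller
        observed = observe()         # state as reported by the visual system
        log.append(("observed", observed))
        if observed != expected:
            return False             # mismatch: leave replanning/alarm to caller
    return True

# Toy stand-ins for the arm controller and visual system:
position = {"x": 0}
def send_command(cmd):
    position["x"] += 1 if cmd == "right" else -1
def observe():
    return position["x"]

log = []
execute_plan([("right", 1), ("right", 2)], send_command, observe, log)
```

The tighter vision-to-motor coupling mentioned above would bypass this explicit check-per-step loop; the sketch shows only the initial, deliberate version.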

10. As in ACT-R, add a spreading activation substratum
Details are not yet specified, but something like the ACT-R spreading-activation mechanism will be implemented, using something like Hebbian learning to produce a 'context' mechanism, with a decay mechanism to implement some aspects of short-term memory.

This could be a form of attention control and a mechanism for simple serendipitous learning.

Note that for planning purposes it will be necessary also to have explicit generalisations about preconditions and consequences of actions and events.

There may need to be some high level control mechanisms that can alter parameters in the network - e.g. sensitivity, thresholds, decay rates.
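
A rough sketch of such a substratum follows. The spread, decay, and learning-rate parameters are exactly the kind of values a high-level control mechanism might later tune; all details here are assumptions, since the mechanism is not yet specified.

```python
# Rough sketch of an ACT-R-style spreading-activation layer with
# Hebbian learning and decay, as suggested in section 10.
# All names and parameter values are illustrative assumptions.

class ActivationNetwork:
    def __init__(self, spread=0.5, decay=0.9, learn_rate=0.1):
        self.activation = {}    # node -> current activation level
        self.weights = {}       # (source, target) -> association strength
        self.spread = spread          # fraction of activation passed along links
        self.decay = decay            # per-tick retention (short-term memory)
        self.learn_rate = learn_rate  # Hebbian increment for co-active nodes

    def stimulate(self, node, amount=1.0):
        """Boost a node and spread activation to its associates,
        giving a simple 'context' effect."""
        self.activation[node] = self.activation.get(node, 0.0) + amount
        for (src, tgt), w in self.weights.items():
            if src == node:
                boost = self.spread * w * amount
                self.activation[tgt] = self.activation.get(tgt, 0.0) + boost

    def hebbian_update(self, a, b):
        """Strengthen the link between two co-active nodes."""
        self.weights[(a, b)] = self.weights.get((a, b), 0.0) + self.learn_rate

    def tick(self):
        """Decay all activations: one aspect of short-term memory."""
        for node in self.activation:
            self.activation[node] *= self.decay
```

A control mechanism of the kind mentioned above would simply adjust `spread`, `decay`, or `learn_rate` at run time, e.g. raising sensitivity when attention is needed.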

11. Additional learning tasks and mechanisms.
There is a vast collection of possible extensions to provide different kinds of learning. But we shall not include them in our initial set of goals for the October demonstration. If the core goals are achieved, then we can work on learning, if necessary re-designing key parts of the system to integrate them fully with learning mechanisms.

Examples of relevant learning tasks would include

Learning mechanisms will use current and past information, plus task information, to draw conclusions about what causes what and under what conditions, perhaps learning about both causation as a network of conditional probabilities (Humean causation) and causation as necessary consequences of changing structural relations when complex structures interact, subject to constraints like rigidity and impenetrability (Kantian causation). The distinction is explored in COSY-PR-0506, and some of the ideas are elaborated in COSY-DP-0601.

12. Testing and debugging interface.
During development and testing we'll need a process running at a terminal (or more than one terminal) that allows all the above components to be interrogated and, in some cases, sent signals to change their behaviour.

This can start with a mixture of low-level programming commands and graphical tools, possibly extended to allow a simplified natural-language interface to be used, e.g. to insert goals, interrogate data-structures, or send low-level commands to subsystems.

13. Later: 'Real' Natural language interface
This should be able to take information from the current databases and translate it into sentences, to be typed on a terminal or fed to a speech synthesiser.

It should also have a language input controller which constantly waits for either typed or spoken input, parses and interprets it, and adds the results to the database.

Initially there may be only simple commands, questions and assertions, but later (for the 'Philosopher' scenario) we want to have warnings, interrupts, advice (during reasoning, acting, or planning), discussion of the robot's thought processes and perceptual experiences, and discussions about hypothetical situations, to test the robot's self-understanding.

The NLP system will start with some innate information about language including syntax and semantics of terms relating to the robot's world. At a later stage we'll consider ways of learning a language from scratch. (A huge amount of literature exists regarding how this might be done and there are major controversies. We may have to set up a seminar to decide how to take sides in the controversies in order best to serve CoSy.)

The NLP system will need a quite complex sub-architecture not represented here.
(See GJ's architectures).

linking declarative memories to other things

Later -- investigate ways of 'managing' the spreading activation.


Once the arm control system and the visual system are working it would be fairly simple to put together the whole thing using SimAgent to test out the ideas in a rapid prototyping environment.
Even before they are working it may be useful to simulate them in a simplified form to test out aspects of the architecture.

Once we have a proof of concept, it can be re-implemented in a preferred framework, after we have re-evaluated the various options (CORBA, MARIE, CARMEN, etc.).

[to be continued, modified, corrected, implemented]