WORK TO BE DONE AT BIRMINGHAM in the FIRST YEAR (AND BEYOND)
WARNING: THIS IS A PROVISIONAL, INCOMPLETE DRAFT AND LIABLE TO CHANGE.
The Birmingham part of the CoSy project includes collaboration with the other partners on most of the work-packages, but the main work at Birmingham will focus on attempting to integrate within a single robot a variety of capabilities normally studied separately in different branches of AI (and also studied separately by different researchers in psychology and neuroscience). The robot to be built at Birmingham will mainly be concerned with the 'PlayMate' scenario, involving manipulation of 3-D objects on a table top. In parallel with this, other CoSy partners will work on the 'Explorer' scenario, involving a mobile robot.
Work at Birmingham will focus on the PlayMate scenario. In summary, our 12-month objective is as follows:
PlayMate will be presented with various objects lying on a table and asked to perform various tasks. While doing this it may also answer questions, ask questions and react to changes in the scene. Initially the linguistic interactions will use a screen and keyboard, but when the speech-processing experts in the CoSy project can interface their software with the rest of the robot software, spoken language will be used.
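As a highly simplified illustration of the kind of keyboard-driven interaction intended for the early stages, the following Python sketch answers two toy queries about a scene. All names and the scene representation are invented for illustration; they are not part of the actual PlayMate software.

```python
# Minimal sketch of typed interaction with a table-top scene.
# The scene is a list of object descriptions (hypothetical format).

def respond(command, scene):
    """Return a textual reply to a typed command about the scene."""
    words = command.lower().split()
    if words[:2] == ["list", "objects"]:
        return ", ".join(obj["name"] for obj in scene)
    if words[:2] == ["count", "red"]:
        n = sum(1 for obj in scene if obj["colour"] == "red")
        return str(n)
    return "Sorry, I do not understand."

scene = [
    {"name": "block1", "colour": "red"},
    {"name": "cup", "colour": "blue"},
    {"name": "block2", "colour": "red"},
]

print(respond("list objects", scene))  # block1, cup, block2
print(respond("count red", scene))     # 2
```

A real system would of course need parsing, reference resolution and dialogue management far beyond this table lookup; the sketch only fixes the shape of the interaction loop.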
For more information on the whole project see the project summaries available here.
Hardware to be used
The robot consists of two parts:
- a B21 mobile robot with stereo cameras, standing next to the table top and able to move a little as needed
- a Katana robot arm with six degrees of freedom, including a gripper with two fingers, mounted on the table and able to reach and manipulate objects on the table, including things like blocks, small soft toys, wooden or metal cups and saucers, etc., as shown in these images.
Both the mobile robot and the arm will be connected to powerful computers running a version of the Linux operating system (which is used for all our software development), including a Sun W2100z dual-CPU 2.6GHz AMD Opteron (64 bit).
There are also a microphone and speakers, for verbal communication between the robot and humans.
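Controlling an arm like the Katana requires relating joint angles to the position of the hand. As a much-simplified illustration (a planar two-link arm rather than the real six degrees of freedom; link lengths are invented), forward kinematics can be sketched as:

```python
import math

def forward_kinematics(theta1, theta2, l1=0.2, l2=0.15):
    """End-effector (x, y) of a planar two-link arm.

    theta1, theta2: joint angles in radians; l1, l2: link lengths in metres
    (illustrative values, not the Katana's real dimensions).
    """
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

x, y = forward_kinematics(0.0, 0.0)
# both joints at zero: arm fully extended along the x axis, (0.35, 0.0)
```

The inverse problem (finding joint angles that place the gripper at a desired location) is what the control software must solve, typically with the manufacturer's kinematics library rather than code like this.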
The PlayMate robot (PM) will acquire and use information about a collection of objects on the table which it can manipulate, as can a human sitting nearby. In later stages both PM and the human will manipulate the objects, whereas in the earliest experiments, the only things moving will be PM's arm and perhaps its head (as it shifts position to change its viewpoint). PM will need to be able to perceive and identify the objects on the table, and also be aware that there is a person sitting nearby looking at and talking about the same collection of objects. PM will have to know something about its own viewpoint, for instance in deciding when to change the viewpoint in order to see something better. For some of the interactions later in the project, the robot will also need to understand the person's viewpoint, e.g. knowing what is and is not within reach of the person and what the person can and cannot see (e.g. because some objects are behind others).
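The last point, reasoning about what is within the person's reach, can be illustrated by a trivial geometric predicate. The reach radius and the 2-D position format are invented for illustration; real reachability would depend on the person's posture and the arm's workspace.

```python
import math

def within_reach(obj_pos, agent_pos, reach=0.6):
    """True if an object is within the agent's reach radius.

    Positions are (x, y) pairs in metres on the table plane;
    reach=0.6 m is an arbitrary illustrative value.
    """
    return math.hypot(obj_pos[0] - agent_pos[0],
                      obj_pos[1] - agent_pos[1]) <= reach

person = (0.0, 0.0)
print(within_reach((0.3, 0.2), person))  # True: about 0.36 m away
print(within_reach((0.9, 0.0), person))  # False: 0.9 m away
```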
The initial tasks will include giving the robot:
- the ability to see where its arm is and to perceive its motion
- the ability to control the movement so as to achieve desirable locations, or relations to other objects (e.g. so as to be able to touch them, grasp them, push them, etc.)
- the ability to understand 3-D structure and relationships on the table, and how those are related to possible movements of those objects and the hand
- the ability to predict the consequences of simple movements and to check whether the predictions are correct:
  - when placing an object on the table, moving it, or placing it on another object in a certain way, will the object be stable or not?
  - if it is unstable, in which way will it move?
- the ability to find out the sources of errors and use that to extend its capabilities
- the ability to explain what it is doing
- the ability to generate its own goals (as in play or exploration)
- the ability to assemble simple constructions, e.g. a bridge made of blocks.
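The stability prediction mentioned above can be sketched, in a drastically simplified one-dimensional form, as checking whether an object's centre of mass lies over the supporting surface. The object representation and dimensions are invented for illustration; real prediction would involve 3-D geometry, mass distribution and friction.

```python
def is_stable(obj, support):
    """True if obj's centre of mass lies over support's top face.

    1-D simplification: each object is a uniform block described by its
    left edge 'x' and its 'width' (metres, hypothetical format).
    """
    com_x = obj["x"] + obj["width"] / 2.0
    return support["x"] <= com_x <= support["x"] + support["width"]

base = {"x": 0.0, "width": 0.1}
block_ok = {"x": 0.02, "width": 0.05}   # CoM at 0.045 m: over the base
block_bad = {"x": 0.09, "width": 0.05}  # CoM at 0.115 m: past the edge

print(is_stable(block_ok, base))   # True
print(is_stable(block_bad, base))  # False: it would topple
```

Even this toy predicate already supports the kind of prediction-then-verification loop described above: predict stability before the placement, then observe whether the block in fact stays put.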
There are many projects that assume that because it is so difficult (perhaps even impossible) to design human-like robots, the only sensible strategy is to design something that can learn, perhaps as a new-born human does, and then rely on training instead of programming. There are many reasons why we have deliberately not followed that route, including the difficulty of finding out what sorts of learning abilities new-born humans have, or what sorts of innate learning abilities will suffice for the tasks: much of developmental psychology at that stage is guesswork. We conjecture that most of what a newborn infant is actually doing is not observable in experimental situations, e.g. building, or extending, an information-processing architecture.
One way to investigate that is to design something that has at least some of the capabilities of an older child and then, having found out what sorts of architectures, mechanisms, forms of representation, etc. actually work, try to work backwards to investigate what sort of learning system could achieve that. This may not succeed, but we prefer to start from (simplified versions of) things we know some, or most, children can do and see how much of it we can put together in a working system, even if many aspects of the implementation are biologically implausible.
We can also investigate what can be learnt on top of those initial abilities.
Another factor influencing our thinking is that although there are animals that learn a huge amount in their lifetime, developing from an initial state of near helplessness (i.e. members of altricial species), it is unlikely that the only thing evolution has provided for them, apart from their physical mechanisms, is a general-purpose learning system. What's more likely is that the innate learning capabilities are closely tailored to many features of the environment, and which features those are and how the innate mechanisms relate to them will differ from one species to another, even if there are some commonalities. For instance an animal that manipulates objects only or mostly with its beak (like nest-building birds) and an animal that can manipulate things with two independently moving hands while its eyes remain relatively still, will need to learn different things about space, time and movement. So maybe crows and primates start with significantly different learning mechanisms. What the latter have to learn may be one of the things we find out from the exploration described below.
There is more on this topic after the draft scenario descriptions.
The initial practical tasks, and some of the later tasks in the project will include the following (making use of as much pre-existing code as possible from elsewhere):
For an existing 'toy' demo illustrating some of the points, based on work originally done in the early 1970s, see http://www.cs.bham.ac.uk/research/poplog/figs/simagent/#gblocks
This may be easy in some cases (e.g. objects arranged in a row) and harder in others, e.g. if the robot generated pointing actions without remembering the sequence of objects pointed at.
(This raises issues about episodic memory).
(The robot should notice the ambiguity about what to do when an object is both red and round.)
and so on.
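Noticing referential ambiguity, as in the red-and-round example above, can be sketched as collecting every object that satisfies all the properties in a description and flagging the case where more than one matches. The scene format and names are invented for illustration.

```python
def resolve(description, scene):
    """Return all scene objects matching every property in description.

    More than one match signals an ambiguous reference that the robot
    should query rather than resolve arbitrarily.
    """
    return [obj for obj in scene
            if all(obj.get(k) == v for k, v in description.items())]

scene = [
    {"name": "ball",  "colour": "red",  "shape": "round"},
    {"name": "apple", "colour": "red",  "shape": "round"},
    {"name": "cube",  "colour": "blue", "shape": "square"},
]

matches = resolve({"colour": "red", "shape": "round"}, scene)
if len(matches) > 1:
    print("Which one do you mean:", ", ".join(m["name"] for m in matches))
```

Zero matches is equally informative: it tells the robot that its perception and the speaker's description disagree, another occasion for asking a question.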
There are many unanswered questions about trade-offs between innate capabilities and learning. Part of our task will be to find out the pros and cons of providing innate knowledge vs allowing the robot to learn.
It is clear that in animals evolution provides the result of millions of years of exploration of possible initial designs. But what is innate and what is learnt, especially in humans, remains a highly controversial topic. It may be that we need to find new ways of posing the problem, e.g. by allowing for innate bootstrapping capabilities that are powerful but not totally domain-neutral, in addition to general learning capabilities and highly specific innate capabilities (such as sucking in humans, or finding and pecking at food in chicks).
We hope to liaise with developmental psychologists in order to find out how much is known about what children of different ages can and cannot do in relation to tasks such as these. There may also be interesting evidence from brain-damaged patients about how such capabilities are typically decomposed in humans.
The aim of the CoSy project is to develop the two scenarios using as much commonality in the designs and tools as possible, so as to be able to merge them at a later date in a robot which is mobile and also able to understand small-scale 3-D spatial structures and affordances well enough to perform and communicate about 3-D manipulations as above.
Some comments on the use of multiple scenarios, and on the choice of scenarios, can be found in these notes, prepared for the UK Grand Challenge project on 'Architecture of Brain and Mind': 'Metrics and Targets for a Grand Challenge Project Aiming to produce a child-like robot'
Additional papers and presentations, reporting ideas developed during the first year of the project can be found in the Birmingham CoSy papers directory and on the main CoSy web site in the 'Results' section.
The papers directory includes technical reports (some published), discussion papers (e.g. some web sites) and presentations, e.g. this Members' poster presentation on 'Putting the Pieces of AI Together Again' at AAAI'06.
We have also been developing a web-based tool to help with the difficult task of specifying requirements for work to be done on different time scales, including the very long term, the end of the project and the immediate future. A draft version of the tool can be found here (a matrix of requirements).
Specification for an 'ironing robot'
By Maria Petrou (Imperial College)
The background story (Editorial of the IAPR Newsletter, Volume 19, Number 4, 1997, with cartoons).