Artificial Companions in Society: Perspectives on the Present and Future
Oxford 25th-26th October, 2007
Organised by The Companions Project
Position Paper: Requirements and Their Implications
Aaron Sloman
University of Birmingham

Slide presentation
My slides presented at the workshop are available here (PDF).

This paper is also available in pdf format here.

The workshop presupposes that development of Digital Companions (DCs) will happen, and asks about ethical, psychological, social and legal consequences. DCs "will not be robots ...[but]... software agents whose function would be to get to know their owners in order to support them. Their owners could be elderly or lonely. Companions could provide them assistance via the Internet (help with contacts, travel, doctors and more) that many still find hard, but also in providing company and companionship."
My claim: The detailed requirements for DCs to meet that specification are not at all obvious, and will be found to have implications that make the design task very difficult in ways that have not been noticed, though perhaps not impossible if we analyse the problems properly.
Kitchen mishaps
Many of the things that crop up will concern physical objects and physical problems. Someone I know knocked over a nearly full coffee filter close to the base of a cordless kettle. This caused the residual current device in the fuse box under the stairs to trip, removing power from many devices in the house. Fortunately she knew what to do, unplugged the kettle and quickly restored the power. However, she was not sure whether it was safe to use the kettle after draining the base, and when she tried it later the RCD tripped again, leaving her wondering whether it would ever be safe to try again, or whether she should buy a new kettle. In fact it proved possible to open the base, dry it thoroughly, then use it as before. Should a DC be able to give helpful advice in such a situation? Would linguistic interaction suffice? How? Will cameras and visual capabilities be provided? People who work on language understanding often wrongly assume that providing 3-D visual capabilities will be easier, whereas very little progress has been made in understanding and simulating human-like 3-D vision. (E.g. many confuse seeing with recognising.)
Alternatives to canned responses
That was just one example among a vast array of possibilities. Of course, if the designer anticipates such accidents, the DC will be able to ask a few questions and spew out relevant canned advice, and even diagrams showing how to open and dry out the flooded base. But suppose designers had not had that foresight: What would enable the DC to give sensible advice? If the DC knew about electricity and was able to visualise the consequences of liquid pouring over the kettle base, it might be able to use a mixture of geometric and logical reasoning creatively to reach the right conclusions. It would need to know about and be able to reason about spatial structures and the behaviour of liquids. Although Pat Hayes described the `Naive physics' project decades ago, it has proved extremely difficult to give machines the kind of intuitive understanding required for creative problem-solving in novel physical situations. In part that is because we do not yet understand the forms of representation humans (and other animals) use for that sort of reasoning.
Identifying affordances and searching for things that provide them
Suppose an elderly user finds it difficult to keep his balance in the shower when soaping his feet. He prefers taking showers to taking baths, partly because showers are cheaper. How should the DC react on hearing the problem? Should it argue for the benefits of baths? Should it send out a query to its central knowledge base asking how people should keep their balance when washing their feet? (It might get a pointer to a school for trapeze artists.) The DC could start an investigation into local suppliers of shower seats. But what if the DC designer had not anticipated the problem? What are the requirements for the DC to be able to invent the idea of a folding seat attached to the wall of the shower, that can be temporarily lowered to enable feet to be washed safely in a sitting position? Alternatively, what are the requirements for it to be able to pose a suitable query to a search engine? How will it know that safety harnesses and handrails are not good solutions?
Giving machines an understanding of physical and geometrical shapes, processes and causal interactions of kinds that occur in an ordinary house is currently far beyond the state of the art. (Compare the `Robocup@Home' challenge, still in its infancy.) Major breakthroughs of unforeseen kinds will be required for progress to be made, especially breakthroughs in vision and understanding of 3-D spatial structures and processes.
More abstract problems
Sometimes the DC will need a creative and flexible understanding of human relationships and concerns, in addition to physical matters. Suppose the user U is an atheist, and while trawling for information about U's siblings the DC finds that U's brother has written a blog entry supporting creative design theory, or discovers that one of U's old friends has been converted to Islam and is training to be a Mullah. How should the DC react? Compare discovering that the sibling has written a blog entry recommending a new detective novel he has read, or discovering that the old friend is taking classes in cookery. Should the DC care about emotional responses that news items may produce? How will it work out when to be careful? Where will its goals come from?
Is the solution statistical?
The current dominant approach to developing language understanders and advice givers involves mining large corpora using sophisticated statistical pattern extraction and matching. This is much easier than trying to develop a structure-based understander and reasoner, and can give superficially successful results, depending on the size and variety of the corpus and the variety of tests. But the method is inherently broken because as sentences get longer, or semantic structures get more complex, or physical situations get more complex, the probability of encountering recorded examples close to them falls very quickly. Then a helper must use deep general knowledge to solve a novel problem creatively, often using non-linguistic context to interpret many of the linguistic constructs.
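The claim that coverage falls very quickly with length can be illustrated by a crude counting argument. The sketch below is only an illustration under stated assumptions: the function name `max_coverage`, the corpus size and the vocabulary size are hypothetical choices, not figures from this paper, and the count treats every word sequence as possible, which overstates the space of meaningful sentences (though that space also grows combinatorially).

```python
def max_coverage(corpus_tokens: int, vocab_size: int, n: int) -> float:
    """Upper bound on the fraction of distinct length-n word sequences
    that a corpus of `corpus_tokens` words could possibly contain.

    Assumption: every sequence over the vocabulary counts as possible.
    """
    possible = vocab_size ** n                     # all length-n sequences
    observed_at_most = max(corpus_tokens - n + 1, 0)  # n-gram positions in the corpus
    return min(1.0, observed_at_most / possible)


if __name__ == "__main__":
    # Hypothetical figures: a billion-word corpus, a 10,000-word vocabulary.
    for n in range(1, 8):
        print(n, max_coverage(10**9, 10**4, n))
```

Even with these generous assumptions the bound drops by a factor equal to the vocabulary size for each extra word, so beyond very short sequences almost nothing a user says can have been recorded before.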
Some machines can already do creative reasoning in restricted domains, e.g. planning.
Why do statistics-based approaches work at all?
In humans (and some other animals), there are skills that make use of deep generative competences whose application requires relatively slow, creative, problem solving, e.g. planning routes. But practice in using such a competence can train powerful associative learning mechanisms that compile and store many partial solutions matched to specific contexts (environment and goals). As that store of partial solutions (traces of past structure-creation) grows, it covers more everyday applications of the competence, and allows fast and fluent responses. However, if the deeper, more general, slower, competence is not available, wrong extrapolations can be made, inappropriate matches will not be recognised, new situations cannot be dealt with properly and further learning will be very limited, or at least very slow. In humans the two systems work together to provide a combination of fluency and generality. (Not just in linguistic competence, but in many other domains.) A statistical AI system can infer those partial solutions from large amounts of data. But because the result is just a collection of partial solutions it will always have severely bounded applicability compared with humans, and will not be extendable in the way human competences are. If trained only on text it will have no comprehension of non-linguistic context. Occasionally I meet students who manage to impress some of their tutors because they have learnt masses of shallow, brittle, superficially correct patterns that they can string together - without understanding what they are saying. They function like corpus-based AI systems: Not much good as (academic) companions.
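The interplay between a slow general competence and a fast store of partial solutions can be sketched in a few lines, using route planning as the example. This is a toy illustration, not a proposal for how humans or DCs work: the names `plan_route`, `Navigator`, `can_plan` and the `HOUSE` map are all hypothetical, and breadth-first search merely stands in for whatever the deep generative competence really is.

```python
from collections import deque


def plan_route(graph, start, goal):
    """Slow, general competence: breadth-first search for a shortest route."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None  # no route exists


class Navigator:
    """Fast associative store of past solutions, backed (optionally) by planning."""

    def __init__(self, graph, can_plan=True):
        self.graph = graph
        self.can_plan = can_plan
        self.store = {}  # (start, goal) -> previously found route

    def route(self, start, goal):
        key = (start, goal)
        if key in self.store:          # fast, fluent response
            return self.store[key]
        if not self.can_plan:          # store alone cannot handle novelty
            return None
        path = plan_route(self.graph, start, goal)
        if path is not None:
            # Compile partial solutions: every suffix of the route is
            # itself a route to the same goal.
            for i in range(len(path) - 1):
                self.store[(path[i], goal)] = path[i:]
        return path


# Hypothetical map of a small house (adjacency lists).
HOUSE = {
    "kitchen": ["hall"],
    "hall": ["kitchen", "stairs", "bathroom"],
    "stairs": ["hall", "bedroom"],
    "bedroom": ["stairs"],
    "bathroom": ["hall"],
}

if __name__ == "__main__":
    full = Navigator(HOUSE)
    print(full.route("kitchen", "bedroom"))

    # A system given only the stored traces, with no planner behind them.
    shallow = Navigator(HOUSE, can_plan=False)
    shallow.store = dict(full.store)
    print(shallow.route("hall", "bedroom"))      # covered by a stored trace
    print(shallow.route("kitchen", "bathroom"))  # novel request
```

The `shallow` navigator answers fluently wherever a stored trace happens to match and has nothing to offer otherwise, which is the bounded applicability described above.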
What's needed
Before human toddlers learn to talk they have already acquired deep, reusable structural information about their environment and about how people work. They cannot talk but they can see, plan, be puzzled, want things, and act purposefully. They have something to communicate about. That pre-linguistic competence grows faster with the aid of language, but must be based on a prior, internal, formal 'linguistic' competence using forms of representation with structural variability and (context-sensitive) compositional semantics. This enables them to learn any human language and to develop in many cultures. DCs without a similar pre-communicative basis for their communicative competences are likely to remain shallow, brittle and dependent on pre-learnt patterns or rules for every task.
Perhaps, like humans (and some other altricial species), they can escape these limitations if they start with a partly `genetically' determined collection of meta-competences that continually drive the acquisition of new competences building on previous knowledge and previous competences: a process that continues throughout life. The biologically general mechanisms that enable humans to grow up in a very wide variety of environments are part of what enables us to learn about, think about, and deal with novel situations throughout life. Very little is understood about these processes, whether by neuroscientists, developmental psychologists or AI researchers, and major new advances are needed in our understanding of information-processing mechanisms. Some pointers towards future solutions are in these online presentations: (Mostly about 3-D vision) (On understanding causation) (On seeing a child's toys.)
A DC lacking similar mechanisms and a similar deep understanding of our environment may cope over a wide range of circumstances that it has been trained or programmed to cope with and then fail catastrophically in some novel situation. Can we take the risk? Would you trust your child with one?
Can it be done?
Producing a DC of the desired type may not be impossible, but is much harder than most people realise and cannot be achieved by currently available learning mechanisms. (Unless there is something available that I don't know about.) Solving the problems will include:
(a) Learning more about the forms of representation and the knowledge, competences and meta-competences present in prelinguistic children who can interact in rich and productive ways with many aspects of their physical and social environment, thereby continually learning more about the environment, including substantively extending their ontologies. Since some of the competences are shared with other animals they cannot depend on human language, though human language depends on them. However we know very little about those mechanisms and are still far from being able to implement them.
(b) When we know what component competences and forms of representation are required, and what sorts of biological and artificial mechanisms can support them, we shall also have to devise a self-extending architecture which combines them all and allows them to interact with each other, and with the environment in many different ways, including ways that produce growth and development of the whole system, and also including sources of motivation that are appropriate for a system that can take initiatives in social interactions. No suggestions I have seen for architectures for intelligent agents come close to the requirements for this. (Minsky's Emotion Machine takes some important steps.)
Rights of intelligent machines
If providing effective companionship requires intelligent machines to be able to develop their own goals, values, preferences, attachments etc., including really wanting to help and please their owners, then if some of them develop in ways we don't intend, will they not have the right to have their desires considered, in the same way our children do if they develop in ways their parents don't intend?
Risks of premature advertising
I worry that most of the people likely to be interested in this kind of workshop will want to start designing intelligent and supportive interfaces without waiting for the above problems to be solved, and I think that will achieve little of lasting value because they will be too shallow and brittle, and potentially even dangerous - though they may handle large numbers of special cases impressively. If naive users start testing them and stumble across catastrophic failures, that could give the whole field a very bad name.
Some related online papers and presentations
Computational Cognitive Epigenetics
Diversity of Developmental Trajectories in Natural and Artificial Intelligence
Do machines, natural or artificial, really need emotions?
[Aaron Sloman, 25 Sep 2007]