Artificial Companions in Society:
Perspectives on the
Present and Future
Oxford 25th-26th October, 2007
Organised by The Companions Project
Position Paper:
Requirements and Their Implications
Aaron Sloman
University of Birmingham
http://www.cs.bham.ac.uk/~axs/
(THE HTML TEXT BELOW IS NOW OUT OF DATE: see
PDF version)
Slide presentation
My slides presented at the workshop are available here (PDF).
This paper is also available
in pdf format here.
Introduction
The workshop presupposes that development of Digital Companions (DCs)
will happen, and asks about ethical, psychological, social and legal
consequences. DCs "will not be robots ...[but]...
software agents whose function would be to get to know
their owners in order to support them. Their owners could be elderly
or lonely. Companions could provide them assistance via the Internet
(help with contacts, travel, doctors and more) that many still find
hard, but also in providing company and companionship."
My claim: the detailed requirements for DCs to meet that
specification are not at all obvious, and they will be found to have
implications that make the design task very difficult in ways that have
not yet been noticed, though perhaps not impossible if we analyse the
problems properly.
Kitchen mishaps
Many of the things that crop up will concern physical objects and
physical problems. Someone I know knocked over a nearly full coffee
filter close to the base of a cordless kettle. This caused the residual
current device (RCD) in the fuse box under the stairs to trip, removing
power from many devices
in the house. Fortunately she knew what to do, unplugged the kettle and
quickly restored the power. However, she was not sure whether it was
safe to use the kettle after draining the base, and when she tried it
later the RCD tripped again, leaving her wondering whether it would ever
be safe to try again, or whether she should buy a new kettle. In fact it
proved possible to open the base, dry it thoroughly, then use it as
before. Should a DC be able to give helpful advice in such a situation?
Would linguistic interaction suffice? How? Will cameras and visual
capabilities be provided? People who work on language understanding
often wrongly assume that providing 3-D visual capabilities will be
easier than the linguistic ones, whereas very little progress has been
made in understanding and simulating human-like 3-D vision. (E.g. many
confuse seeing with recognising.)
Alternatives to canned responses
That was just one example among a vast array of possibilities. Of
course, if the designer anticipates such accidents, the DC will be able
to ask a few questions and spew out relevant canned advice, and even
diagrams showing how to open and dry out the flooded base. But
suppose designers had not had that foresight: What would enable the DC
to give sensible advice? If the DC knew about electricity and was able
to visualise the consequences of liquid pouring over the kettle base, it
might be able to use a mixture of geometric and logical reasoning
creatively to reach the right conclusions. It would need to know about
and be able to reason about spatial structures and the behaviour of
liquids. Although Pat Hayes described the `Naive physics' project
decades ago, it has proved extremely difficult to give machines the kind
of intuitive understanding required for creative problem-solving in
novel physical situations. In part that is because we do not yet
understand the forms of representation humans (and other animals) use
for that sort of reasoning.
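To make the gap concrete, here is a minimal, hypothetical sketch (in Python; the predicates, such as 'liquid_spilled_on', and the rules are invented) of the kind of hand-coded rule chaining a designer could supply for exactly this accident. It produces sensible advice only because the rules describe this particular mishap in advance; nothing in it captures the intuitive spatial and causal reasoning a person uses when facing a situation nobody anticipated.

    # Toy forward-chaining over qualitative facts about the kettle incident.
    # The predicates and rules are invented for illustration: they only
    # cover situations the designer thought of in advance.

    FACTS = {
        ("liquid_spilled_on", "kettle_base"),
        ("is_electrical", "kettle_base"),
        ("rcd_tripped", "house_circuit"),
    }

    RULES = [  # each rule: (set of preconditions, conclusion)
        ({("liquid_spilled_on", "kettle_base"), ("is_electrical", "kettle_base")},
         ("possible_short_circuit", "kettle_base")),
        ({("possible_short_circuit", "kettle_base")},
         ("advise", "unplug the kettle and reset the RCD")),
        ({("possible_short_circuit", "kettle_base")},
         ("advise", "open and dry the base thoroughly before trying again")),
    ]

    def forward_chain(facts, rules):
        """Repeatedly apply any rule whose preconditions are all satisfied."""
        changed = True
        while changed:
            changed = False
            for pre, conclusion in rules:
                if pre <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    for fact in sorted(forward_chain(set(FACTS), RULES)):
        if fact[0] == "advise":
            print("Advice:", fact[1])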
Identifying affordances and searching for things that provide them
Suppose an elderly user finds it difficult to keep his balance in
the shower when soaping his feet. He prefers taking showers to taking
baths, partly because showers are cheaper. How should the DC react on
hearing the problem? Should it argue for the benefits of baths? Should
it send out a query to its central knowledge base asking how people
should keep their balance when washing their feet? (It might get a
pointer to a school for trapeze artists.) The DC could start an
investigation into local suppliers of shower seats. But what if the DC
designer had not anticipated the problem? What are the requirements for
the DC to be able to invent the idea of a folding seat attached to the
wall of the shower, that can be temporarily lowered to enable feet to be
washed safely in a sitting position? Alternatively what are the
requirements for it to be able to pose a suitable query to a search
engine? How will it know that safety harnesses and handrails are not
good solutions?
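By way of contrast, here is a hypothetical sketch of the only strategy available to a DC whose designer did enumerate the options: match a fixed catalogue of aids against fixed requirements (the catalogue, the property names and the requirements below are all invented for illustration). It "finds" the folding shower seat only because the seat and the relevant properties were put into the catalogue by hand; it cannot invent the idea, and it gives no grounds for judging why a safety harness is a poor solution.

    # Toy requirements matching over a hand-built catalogue of aids.
    # Everything here was anticipated by the designer; nothing is invented
    # by the system itself.

    CANDIDATES = {
        "wall-mounted folding shower seat": {"allows_sitting", "fits_in_shower",
                                             "folds_away", "low_cost"},
        "safety harness":                   {"prevents_falls"},
        "handrail":                         {"prevents_falls", "fits_in_shower"},
        "bath board":                       {"allows_sitting", "low_cost"},
    }

    # The user's problem restated as required properties (chosen by the designer).
    REQUIRED = {"allows_sitting", "fits_in_shower", "folds_away"}

    def suitable(candidates, required):
        """Return candidates whose listed properties cover all requirements."""
        return [name for name, props in candidates.items() if required <= props]

    print(suitable(CANDIDATES, REQUIRED))
    # ['wall-mounted folding shower seat']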
Giving machines an understanding of physical and geometrical shapes,
processes and causal interactions of kinds that occur in an ordinary
house is currently far beyond the state of the art. (Compare the
`RoboCup@Home' challenge, still in its infancy:
http://www.ai.rug.nl/robocupathome/) Major breakthroughs of unforeseen
kinds will be required for progress to be made, especially breakthroughs
in vision and understanding of 3-D spatial structures and
processes.
More abstract problems
Sometimes the DC will need a creative and flexible understanding of
human relationships and concerns, in addition to physical matters.
Suppose the user U is an atheist, and while trawling for
information about U's siblings the DC finds that U's brother has written
a blog entry supporting creative design theory, or discovers that one of
U's old friends has been converted to Islam and is training to be a
Mullah. How should the DC react? Compare discovering that the sibling
has written a blog entry recommending a new detective novel he has
read, or discovering that the old friend is taking classes in cookery.
Should the DC care about emotional responses that news items may
produce? How will it work out when to be careful? Where will its goals
come from?
Is the solution statistical?
The current dominant approach to developing language understanders and
advice givers involves mining large corpora using sophisticated
statistical pattern extraction and matching. This is much easier than
trying to develop a structure-based understander and reasoner, and can
give superficially successful results, depending on the size and variety
of the corpus and the variety of tests. But the method is inherently
broken because as sentences get longer, or semantic structures get more
complex, or physical situations get more complex, the probability of
encountering recorded examples close to them falls very quickly. Then a
helper must use deep general knowledge to solve a novel problem
creatively, often using non-linguistic context to interpret many of the
linguistic constructs. See
http://www.cs.bham.ac.uk/research/projects/cosy/papers/#dp0605
(Some machines can already do creative reasoning in restricted domains,
e.g. planning.)
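A toy calculation makes the sparsity point vivid, under the crude assumptions of a fixed vocabulary and a corpus of fixed size (both numbers invented): the space of possible word sequences grows exponentially with length, so even an enormous corpus can only ever contain a vanishing fraction of the longer ones, and the chance that a new complex sentence or situation closely matches a stored example collapses.

    # Toy illustration of data sparsity. Real corpora are far from uniform,
    # but the exponential gap between what is possible and what can be
    # stored is the point.

    VOCAB_SIZE = 10_000          # assumed vocabulary size
    CORPUS_SENTENCES = 10**9     # assumed number of stored example sequences

    for length in (2, 4, 8, 16):
        possible = VOCAB_SIZE ** length
        coverage = min(1.0, CORPUS_SENTENCES / possible)
        print(f"length {length:2d}: at most {coverage:.0e} of sequences covered")

For length 2 the corpus can in principle cover everything; by length 8 at most about 1e-23 of the possibilities can have been seen, and by length 16 effectively none.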
Why do statistics-based approaches work at all?
In humans (and some other animals), there are skills that make use of
deep generative competences whose application requires relatively slow,
creative, problem solving, e.g. planning routes. But practice in using
such a competence can train powerful associative learning mechanisms
that compile and store many partial solutions matched to specific
contexts (environment and goals). As that store of partial solutions
(traces of past structure-creation) grows, it covers more everyday
applications of the competence, and allows fast and fluent responses.
However, if the deeper, more general, slower, competence is not
available, wrong extrapolations can be made, inappropriate matches will
not be recognised, new situations cannot be dealt with properly and
further learning will be very limited, or at least very slow. In humans
the two systems work together to provide a combination of fluency and
generality. (Not just in linguistic competence, but in many other
domains.) A statistical AI system can infer those partial solutions from
large amounts of data. But because the result is just a collection of
partial solutions it will always have severely bounded applicability
compared with humans, and will not be extendable in the way human
competences are. If trained only on text it will have no comprehension
of non-linguistic context. Occasionally I meet students who manage to
impress some of their tutors because they have learnt masses of shallow,
brittle, superficially correct patterns that they can string together -
without understanding what they are saying. They function like
corpus-based AI systems: Not much good as (academic) companions.
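The two-system picture can be sketched in a few lines of toy code; the names ('slow_plan', 'fluent_or_slow') and the little grid-world planning domain are invented. A slow, general planner solves problems from first principles; practice stores its solutions for instant reuse; and a purely cache-based system is fluent on familiar cases but simply has no answer for a novel one.

    # Toy sketch: a deep, slow, general competence (search-based planning)
    # plus a cache of compiled partial solutions, and what is left if the
    # general competence is removed.

    from collections import deque

    def slow_plan(start, goal, neighbours):
        """General but slow: breadth-first search over a state graph."""
        frontier, seen = deque([[start]]), {start}
        while frontier:
            path = frontier.popleft()
            if path[-1] == goal:
                return path
            for nxt in neighbours(path[-1]):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(path + [nxt])
        return None

    def grid_neighbours(pos, size=5):
        x, y = pos
        steps = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        return [(a, b) for a, b in steps if 0 <= a < size and 0 <= b < size]

    cache = {}  # compiled partial solutions: (start, goal) -> path

    def fluent_or_slow(start, goal):
        """Fast when a stored solution matches; falls back to slow planning."""
        if (start, goal) not in cache:
            cache[(start, goal)] = slow_plan(start, goal, grid_neighbours)
        return cache[(start, goal)]

    def cache_only(start, goal):
        """What remains if the deeper, general competence is not available."""
        return cache.get((start, goal))  # None for any novel problem

    print(fluent_or_slow((0, 0), (3, 4)))  # solved slowly, then cached
    print(cache_only((0, 0), (3, 4)))      # fluent: reuses the stored path
    print(cache_only((4, 4), (0, 0)))      # novel case: no answer at all

The last line is the point: without the general mechanism, the stored fragments cannot be extended to anything they do not already contain.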
What's needed
Before human toddlers learn to talk they have already acquired deep,
reusable structural information about their environment and about how
people work. They cannot talk but they can see, plan, be puzzled, want
things, and act purposefully. They have something to communicate about.
That pre-linguistic competence grows faster with the aid of language,
but must be based on a prior, internal, formal 'linguistic'
competence using
forms of representation with structural variability and
(context-sensitive) compositional semantics. This enables them to learn
any human language and to develop in many cultures. DCs without a
similar pre-communicative basis for their communicative competences are
likely to remain shallow, brittle and dependent on pre-learnt patterns
or rules for every task.
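What 'structural variability with (context-sensitive) compositional semantics' amounts to can be illustrated with a toy internal representation; the nested structures and the miniature 'world' below are invented purely for illustration. The meaning of a structure is computed from the meanings of its parts, so combinations never encountered before are still interpretable.

    # Toy internal representation with compositional semantics: meanings of
    # novel combinations are derived from meanings of parts, not looked up.

    WORLD = {
        "cup":   {"colour": "red",  "on": "table"},
        "spoon": {"colour": "blue", "on": "table"},
        "table": {"colour": "brown"},
    }

    def evaluate(expr):
        """Evaluate a nested structure against WORLD, part by part."""
        if isinstance(expr, str):              # atomic name: denotes itself
            return expr
        op, *args = expr
        args = [evaluate(a) for a in args]     # meanings of the parts first
        if op == "colour-of":
            return WORLD[args[0]]["colour"]
        if op == "on":
            return WORLD[args[0]]["on"]
        if op == "same":
            return args[0] == args[1]
        raise ValueError(f"unknown operator: {op}")

    print(evaluate(("colour-of", "cup")))                      # red
    print(evaluate(("colour-of", ("on", "cup"))))              # brown
    print(evaluate(("same", ("on", "cup"), ("on", "spoon"))))  # True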
Perhaps, like humans (and some other altricial species), they can escape
these limitations if they start with a partly `genetically' determined
collection of meta-competences that continually drive the acquisition of
new competences building on previous knowledge and previous competences:
a process that continues throughout life. The biologically general
mechanisms that enable humans to grow up in a very wide variety of
environments are part of what enables us to learn about, think about,
and deal with novel situations throughout life. Very little is
understood about these processes, whether by neuroscientists,
developmental psychologists or AI researchers, and major new advances
are needed in our understanding of information-processing mechanisms.
Some pointers towards future solutions are in these online
presentations:
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#compmod07
(Mostly about 3-D vision)
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#wonac07
(On understanding causation)
http://www.cs.bham.ac.uk/research/projects/cosy/photos/crane/
(On seeing a child's toys.)
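As a purely structural illustration of 'meta-competences that drive the acquisition of new competences', here is a hedged toy sketch (the skills and the composition rule are invented, and nothing this simple approaches what is actually needed): the system's repertoire grows by building new abilities out of the ones it already has.

    # Toy sketch: competences plus a meta-competence that composes existing
    # competences into new ones, so the repertoire extends itself.

    competences = {
        "grasp": lambda obj: f"grasped {obj}",
        "lift":  lambda obj: f"lifted {obj}",
        "carry": lambda obj: f"carried {obj} across the room",
    }

    def compose(name_a, name_b):
        """Meta-competence: build a new competence by chaining two old ones."""
        a, b = competences[name_a], competences[name_b]
        return lambda obj: a(obj) + ", then " + b(obj)

    competences["pick-up"] = compose("grasp", "lift")     # built from old parts
    competences["fetch"]   = compose("pick-up", "carry")  # built from a new part

    print(competences["fetch"]("toy crane"))
    # grasped toy crane, then lifted toy crane, then carried toy crane across the room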
A DC lacking similar mechanisms and a similar deep
understanding of our environment may cope over a wide
range of circumstances that it has been trained or programmed to cope
with and then fail catastrophically in some novel situation. Can we take
the risk? Would you trust your child with one?
Can it be done?
Producing a DC of the desired type may not be impossible, but is
much harder than most people realise and cannot be achieved by
currently available learning mechanisms.
(Unless there is something available that I don't know about).
Solving the problems will involve at least the following:
(a) Learning more about the forms of representation and the knowledge,
competences and meta-competences present in prelinguistic children who
can interact in rich and productive ways with many aspects of their
physical and social environment, thereby continually learning more about
the environment, including substantively extending their ontologies.
Since some of the competences are shared with other animals they cannot
depend on human language, though human language depends on them.
However we know very little about those mechanisms and are still far
from being able to implement them.
(b) When we know what component competences and forms of representation
are required, and what sorts of biological and artificial mechanisms can
support them, we shall also have to devise a self-extending
architecture which combines them all and allows them to interact with
each other, and with the environment in many different ways, including
ways that produce growth and development of the whole system, and also
including sources of motivation that are appropriate for a system that
can take initiatives in social interactions. No suggestions I have seen
for architectures for intelligent agents come close to the requirements for
this. (Minsky's The Emotion Machine takes some important steps.)
Rights of intelligent machines
If providing effective companionship requires intelligent machines to be
able to develop their own goals, values, preferences, attachments etc.,
including really wanting to help and please their owners, then if
some of them develop in ways we don't intend, will they not have the
right to have their desires considered, in the same way our children do
if they develop in ways their parents don't intend?
http://www.cs.bham.ac.uk/research/projects/cogaff/crp/epilogue.html
Risks of premature advertising
I worry that most of the people likely to be interested in this kind of
workshop will want to start designing intelligent and supportive
interfaces without waiting for the above problems to be solved, and I
think such efforts will achieve little of lasting value because the
resulting systems will be too shallow and brittle, and potentially even
dangerous - though they may handle large numbers of special cases
impressively. If naive users start testing them and stumble across
catastrophic failures, that could give the whole field a very bad name.
Some related online papers and presentations
http://www.cs.bham.ac.uk/research/projects/cosy/papers/#tr0703
Computational Cognitive Epigenetics
http://www.cs.bham.ac.uk/research/projects/cogaff/sloman-aaai-representation.pdf
Diversity of Developmental Trajectories in Natural and Artificial Intelligence
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#cafe04
Do machines, natural or artificial, really need emotions?
[Aaron Sloman, 25 Sep 2007]