Architecture-based motivation vs Reward-based motivation
School of Computer Science
The University of Birmingham
Installed: 25 May 2009 (Liable to change.)
14 Jun 2015 Note on precursors to states with belief-like and desire-like roles
2 Aug 2009; 31 Mar 2013; 24 Jan 2014
Also published here in 2009:
Newsletter on Philosophy and Computers, American Philosophical Association, 09, 1, pp. 10--13,
Newsletter on Philosophy and Computers, American Philosophical Association, (Index of Newsletter issues)
(PDF) Whole newsletter 09, 1, Fall 2009 (Containing 2009 version of this paper)
In 2009 this paper had a note below referring to White's paper as introducing the
idea of architecture-based-motivation (A-B-M) using different terminology.
R.W. White, 1959, Motivation reconsidered: The concept of competence, Psychological Review, 66, 5, pp. 297-333
I have now re-read the paper more carefully and find that his concept of "effectance"
"Effectance motivation similarly aims for the feeling of efficacy, not for the vitally important learnings that come as its consequence."
There is no such aim for feelings of any kind necessarily associated with
Earlier in the paper he writes:
"In infants and young children it seems to me sensible to conceive of effectance motivation as undifferentiated. Later in life it becomes profitable to distinguish various motives such as cognizance, construction, mastery, and achievement. It is my view that all such motives have a root in effectance motivation. They are differentiated from it through life experiences which emphasize one or another aspect of the cycle of transaction with environment. Of course, the motives of later childhood and of adult life are no longer simple and can almost never be referred to a single root."
All of that is fine. But from my point of view he hasn't noticed the need to go back
Of course, biological evolution may have selected the mechanisms producing A-B-M,
because having those mechanisms tends, in some situations, to produce more competent
adults. But the young animal knows nothing about that and probably in the earliest
stages does not even have any concept of competence or feeling of competence or
incompetence. It merely acts and information gets absorbed in the process, which at
that stage mostly achieves nothing. Later, patterns in the stored information can be
used to create useful strategies for achieving explicit goals.
He also writes:
"Effectance motivation must be conceived to involve satisfaction -- a feeling of efficacy -- in transactions in which behavior has an exploratory, varying, experimental character and produces changes in the stimulus field. Having this character, the behavior leads the organism to find out how the environment can be changed and what consequences flow from these changes."
That seems to express a commitment to the notion of motive-selection having to be
This is a very primitive form of meta-management in the sense defined in Luc
Beaudoin's 1994 PhD thesis mentioned below.
In short: control mechanisms are required in addition to factual information
and reasoning mechanisms if A is to do anything. This paper is about what
forms of control are required. I assume that in at least some cases there are
motives, and the control arises out of selection of a motive for action. That raises
the question where motives come from. My answer is that they can be generated and
selected in different ways, but one way is not itself motivated: it merely involves
the operation of mechanisms in the architecture of A that generate motives and select
some of them for action. The view I wish to oppose is that all motives must somehow
serve the interests of A, or be rewarding for A. This view is widely held and is
based on a lack of imagination about possible designs for working systems. I summarise
it as the assumption that all motivation must be reward-based. In contrast I claim
that at least some motivation may be architecture-based, in the sense explained
Instead of talking about "passions" I shall use the less emotive terms, "motivation"
and "motive". A motive, in this context, is a specification of something to be done or
achieved (which could include preventing or avoiding some state of affairs, or
maintaining a state or process). The words "motivation" and "motivational" can be
used to describe the states, processes, and mechanisms concerned with production of
motives, their control and management and the effects of motives in initiating and
controlling internal and external behaviours. So Hume's claim, as interpreted here, is
that no collection of beliefs and reasoning capabilities can generate behaviour on
its own: motivation is also required.
This view of Hume's claim is expressed well in the Stanford Encyclopedia of
Philosophy entry on motivation, though without explicit reference to Hume:
If Hume had known about reflexes, he might have treated them as an alternative mode
of initiation of behaviour to motivation (or passions). There may be some who regard
a knee-jerk reflex as involving a kind of motivation produced by tapping a sensitive
part of the knee. That would not be a common usage. I think it is more helpful to
regard such physical reflexes as different from motives, and therefore as exceptions
to Hume's claim. I shall try to show that something like "internal reflexes" in an
information-processing system can be part of the explanation of creation and adoption
of motives. In particular, adopting "the design-based approach to the study of mind"
yields a wider variety of possible explanations of how minds work than are typically
considered in philosophy or psychology, and paradoxically even in AI/Robotics, where
such an approach ought to be more influential.
If there is no energy dissipation (e.g. no friction, no viscosity, etc.) the oscillation could continue indefinitely (something like a stupid animal chasing its own tail?). However, normally the system will more or less rapidly dissipate its kinetic energy and get ever closer to attaining the desire-like state, then stop.
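As a rough illustration of the point about dissipation, the following Python sketch simulates a mass on a spring. The choice of physical system, the function names (`simulate`, `energy`) and all parameter values are assumptions made for this example and are not taken from the text; the rest position x = 0 merely stands in for the "desire-like" state.

```python
def simulate(x0, damping, stiffness=1.0, dt=0.01, steps=5000):
    """Integrate x'' = -stiffness * x - damping * x' from rest at x0.

    Uses semi-implicit Euler steps (velocity first, then position),
    which keeps the undamped energy close to its starting value.
    Returns the final position and velocity.
    """
    x, v = x0, 0.0
    for _ in range(steps):
        a = -stiffness * x - damping * v  # restoring force plus friction
        v += a * dt
        x += v * dt
    return x, v


def energy(x, v, stiffness=1.0):
    """Kinetic plus potential energy for unit mass."""
    return 0.5 * v * v + 0.5 * stiffness * x * x


if __name__ == "__main__":
    for damping in (0.0, 0.5):
        x, v = simulate(1.0, damping)
        print(f"damping={damping}: x={x:.6f}, energy={energy(x, v):.6f}")
```

With `damping=0.0` the energy stays close to its initial value, so the oscillation persists; with a positive `damping` the system settles ever closer to x = 0 without anything in it representing a goal or a reward.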
It may turn out that such apparently superficial and misleading comparisons between 'dumb' physical processes and the more sophisticated goal-directed processes found in living organisms are part of the story of evolution of the more sophisticated systems, such as homeostatic control systems used to control many biological processes as explained in http://www.bbc.co.uk/education/guides/z4khvcw/revision
Such dissipative oscillatory mechanisms may be
among the components of the construction kits used by
evolution, discussed in more detail in:
Evolution (like human designers more recently, such as the inventor of the Watt rotary governor) seems to have found many ways in which systems without goals can be used as goal-directed mechanisms in organisms. This use could have evolved before the use of reward-based mechanisms for selecting goals.
Contrast the common belief that physical properties of matter cannot account for features of life that include goal-direction, e.g. discussed in
In the history of philosophy and psychology there have been many theories of
motivation, and distinctions between different sorts of motivation, for example
motivations related to biological needs, motivations somehow acquired through
cultural influences, motivations related to achieving or maximising some reward (e.g.
food, admiration in others, going to heaven), or avoiding or minimising some
punishment (often labelled positive and negative reward or reinforcement),
motivations that are means to some other end, and motivations that are desired for
their own sake, motivations related to intellectual or other achievements, and so on.
Many theorists assume that motivation must be linked to rewards or utility. One
version of this (a form of hedonism) is the assumption that all actions are done for
ultimately selfish reasons.
I shall try to explain why there is an alternative kind of motivation,
architecture-based motivation, which is not included even in this rather broad
characterisation of types of motivation on Wikipedia:
Motivation is also a topic of great concern in management theory and management
practice, where motivation of workers comes from outside them e.g. in the form of
reward mechanisms (providing money, status, recognition, etc.) sometimes in other
forms, e.g. inspiration, exhortation, social pressures, ... I shall not discuss any
of those ideas.
In psychology and even in AI, all these concerns can arise, though I am here only
discussing questions about the mechanisms that underlie processes within an organism
or machine that select things to aim for and which initiate and control the
behaviours that result. This includes mechanisms that produce goals and desires,
mechanisms that identify and resolve conflicts between different goals or desires,
and mechanisms that select means to achieving goals or desires.
Achieving a desired goal G could be done in different ways, e.g.
- select and use an available plan for doing things of type G
- use a planning mechanism to create a plan to achieve G and follow it.
- detect and follow a gradient that appears to lead to achieving G
(e.g. if G is being on high ground to avoid a rising tide, walk uphill while you can)
There is much more to be said about the forms different motives can have, and the
For a characterisation of some of the largely unnoticed complexity of motives see
L.P. Beaudoin, A. Sloman, A study of motive processing and attention,
Prospects for Artificial Intelligence, IOS Press, 1993
(further developed in Luc Beaudoin's PhD thesis).
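The three ways of achieving a goal G listed above can be contrasted in a small Python sketch. Everything named here is hypothetical and chosen only for illustration: the one-dimensional terrain `HEIGHTS`, the helper functions, and the use of breadth-first search as the "planning mechanism". The terrain is deliberately given a local peak, to show why a gradient only "appears" to lead to G.

```python
from collections import deque

# Hypothetical terrain: position i has height HEIGHTS[i].
HEIGHTS = [0, 1, 2, 5, 3, 4, 9]


def height(pos):
    return HEIGHTS[pos]


def neighbours(pos):
    """Positions reachable in one step along the strip."""
    return [q for q in (pos - 1, pos + 1) if 0 <= q < len(HEIGHTS)]


def use_stored_plan(goal_type, plan_library):
    """Route 1: look up a ready-made plan for goals of this type, if any."""
    return plan_library.get(goal_type)


def make_plan(start, is_goal, neighbours):
    """Route 2: search (breadth-first) for a path to a goal state."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if is_goal(path[-1]):
            return path
        for nxt in neighbours(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None  # no reachable state satisfies the goal


def follow_gradient(start, height, neighbours):
    """Route 3: keep stepping to the highest neighbour while that goes up."""
    path = [start]
    while True:
        here = path[-1]
        best = max(neighbours(here), key=height, default=here)
        if height(best) <= height(here):
            return path  # nowhere higher nearby: stop, possibly on a local peak
        path.append(best)


if __name__ == "__main__":
    print(use_stored_plan("reach high ground", {"reach high ground": ["walk uphill"]}))
    print(make_plan(0, lambda p: height(p) >= 9, neighbours))
    print(follow_gradient(0, height, neighbours))
```

On this terrain the gradient follower halts on the first peak it meets, whereas the planner, which searches ahead, finds a route to the highest point. Which route is appropriate depends on what the agent already knows and how much time it has.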
Extreme versions of this assumption are found in philosophical theories that all
agents are ultimately selfish, since they can only be motivated to do things that
reward themselves, even if that is a case of feeling good about helping someone else.
More generally, the assumption is that selection of a motive among possible motives
must be based on some kind of prediction about the consequences of achieving or
preventing whatever state of affairs is specified in that motive. This document
challenges that claim by demonstrating that it is possible for an organism or machine
to have, and to act on motives for which there is no such prediction.
In other words, it is possible for there to be reflex mechanisms whose effect
is to produce new motives, and in simple cases to initiate behaviours controlled by
such motives. I shall present a very simple architecture illustrating this
possibility below, though for any actual organism, or intelligent robot, a more
complex architecture will be required, for reasons given later.
Where the reflex mechanisms come from is a separate question: they may be produced by
a robot designer or by biological evolution, or by a learning process, or even by
some pathology (e.g. mechanisms producing addictions) but what the origin of such a
mechanism is, is a separate question from what it does, how it does it, and what the
I am not denying that some motives are concerned with producing benefits for the
agent. It may even be the case (which I doubt) that most motives generated in humans
and other animals are selected because of their benefit for the individual. For now,
I am merely claiming that something different can occur and does occur, as follows:
Not all the mechanisms for generating motives in a particular organism O,
and not all the motives produced in O have to be related to any reward or
positive or negative reinforcement for O.
What makes them motives is how they work: what effects they have, or, in
more complex cases, what effects they tend to have even though they
are suppressed (e.g. since competing, incompatible, motives can exist in O).
I think that is false: some forms of learning occur simply because the opportunity to
learn arises. The information-processing architecture produced by biological
evolution simply reacts to many opportunities to learn, or to do things that could
produce learning, because the mechanisms that achieve that have proved their worth in
previous generations, without the animals concerned knowing that they are using those
mechanisms or why they are using them.
At regular intervals another mechanism selects one of the beliefs about processes
occurring recently and copies its content (perhaps with some minor modification or
removal of some detail, such as direction of motion) to form the content of a new
motive in a database of "desires". The desires may be removed after a time.
At regular intervals an intention-forming mechanism selects one of the desires to act
as a goal for a planning mechanism that works out which actions could make the desire
come true, selects a plan, then initiates plan execution.
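The mechanisms just described can be sketched in a few lines of Python. This is only a toy under stated assumptions: the class and method names (`Agent`, `generate_motive`, `form_intention_and_act`), the random selection rule, the desire lifetime, running every mechanism on every tick, and the trivial two-step "planner" are all invented for illustration. What the sketch is meant to show is that nothing in the loop computes, predicts or compares rewards.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Belief:
    """A record of a recently perceived process, e.g. 'ball rolled left'."""
    obj: str
    change: str
    direction: Optional[str] = None


@dataclass(frozen=True)
class Desire:
    """A motive: a specification of something to be done or achieved."""
    obj: str
    change: str
    created_at: int


class Agent:
    def __init__(self, seed=0, desire_lifetime=5):
        self.rng = random.Random(seed)
        self.beliefs = deque(maxlen=10)   # only recent percepts are kept
        self.desires = []                 # the database of "desires"
        self.desire_lifetime = desire_lifetime
        self.executed = []                # log of actions carried out
        self.clock = 0

    def perceive(self, obj, change, direction=None):
        self.beliefs.append(Belief(obj, change, direction))

    def generate_motive(self):
        """Reflex: copy a recent belief's content into a new desire,
        dropping one detail (direction). No consequences are evaluated."""
        if self.beliefs:
            b = self.rng.choice(list(self.beliefs))
            self.desires.append(Desire(b.obj, b.change, self.clock))

    def expire_desires(self):
        """Desires are removed after a time."""
        self.desires = [d for d in self.desires
                        if self.clock - d.created_at < self.desire_lifetime]

    def plan(self, desire):
        """Placeholder planner: a fixed two-step plan for any desire."""
        return [f"approach {desire.obj}", f"make {desire.obj} {desire.change}"]

    def form_intention_and_act(self):
        """Select one desire as the goal, plan for it, execute the plan."""
        if self.desires:
            goal = self.rng.choice(self.desires)
            self.executed.extend(self.plan(goal))

    def tick(self):
        self.clock += 1
        self.expire_desires()
        self.generate_motive()
        self.form_intention_and_act()


if __name__ == "__main__":
    agent = Agent()
    agent.perceive("ball", "roll", "left")
    agent.tick()
    print(agent.desires, agent.executed)
```

A real organism or robot would need far richer perception, planning and conflict resolution; the sketch only fixes the order of the three steps and the absence of any reward signal.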
This system will automatically generate motives to produce actions that repeat or
continue changes that it has recently perceived, possibly with slight modifications,
and it will adjust its behaviours so as to execute a plan for fulfilling the latest
Why is a planning mechanism required instead of a much simpler reflex action
mechanism that does not require motives to be formulated and planning to occur?
A reflex mechanism would be fine if evolution had detected all the situations that
can arise and if it had produced a mechanism that is able to trigger the fine details
of the actions in all such situations. In general that is impossible, so instead of a
process automatically triggering behaviour it can trigger the formation of some goal
to be achieved, and then a secondary process can work out how to achieve it in the
light of the then current situation.
For such a system to work there is NO need for the motives selected or the actions
performed to produce any reward. We have goals generated and acted on without any
reward being required for the system to work. Moreover, a side effect of such
processes might be that the system observes what happens when these actions are
performed in varying circumstances, and thereby learns things about how the
environment works. That can be a side effect without being an explicit goal.
A designer could put such a mechanism into a robot as a way of producing such learning
without that being the robot's goal. Likewise biological evolution could have
selected changes that lead to such mechanisms existing in some organisms because they
produce useful learning, without any of the individual animals knowing that it has
such mechanisms nor how they were selected or how they operate.
The motives generated will certainly need to change with the age and sophistication
of the learner.
Some of the motive-generating mechanisms could be less directly triggered by
particular perceived episodes and more influenced by the previous history of the
individual, taking account not only of physical events but also social phenomena,
e.g. discovering what peers seem to approve of, or choose to do. The motives
generated by inferring motives of others could vary according to stage of
development. E.g. early motives might mainly be copies of inferred motives of others,
then as the child develops the ability to distinguish safe from unsafe experiments,
the motives triggered by discovering motives of others could include various
generalisations or modifications, e.g. generalising some motive to a wider class of
situations, or restricting it to a narrower class, or even generating motives to
oppose the perceived motives of others (e.g. parents!).
Moreover some of the processes triggered instead of producing external actions could
produce internal changes to the architecture or its mechanisms. Those changes could
include production of new motive generators, or motive comparators, or motive
generator generators, etc.
For more on this idea see Chapter
6 and Chapter
The Computer Revolution in Philosophy (1978).
Such learning would depend on other mechanisms monitoring the results of behaviour
generated by architecture-based motivational mechanisms and looking for both new
generalisations, new conjectured explanations of those generalisations and new
evidence that old theories or old conceptual systems are flawed -- and require
Such learning processes would require additional complex mechanisms, including
mechanisms concerned with construction and use of powerful forms of representation
and mechanisms for producing substantive (i.e. non-definitional) ontology extension.
For more on additional mechanisms required see
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#glang
The mechanisms constructing architecture-based motivational sub-systems could
Evolution of minds and languages. What evolved first and develops first in children:
Languages for communicating, or languages for thinking (Generalised Languages: GLs)
Ontologies for baby animals and robots From "baby stuff" to the world of adult science:
Developmental AI from a Kantian viewpoint.
A New Approach to Philosophy of Mathematics: Design a young explorer, able to
discover "toddler theorems" (Or: "The Naive Mathematics Manifesto").
Some motivational mechanisms "reward" the genomes that specify them, not the individuals that have them.
Similarly, some forms of learning may occur because animals that have certain
It follows that any AI and cognitive science research based on the assumption that
learning is produced ONLY by mechanisms that maximise expected utility for the
individual organism or robot is likely to miss important forms of learning,
perhaps the most important forms.
One reason for this is that typically individuals that have opportunities to learn do
not know enough to be able even to begin to assess the long term utility of what they
are doing. So they have to rely on what evolution has learnt (or a designer in the
case of robots) and, at a later stage, on what the culture has learnt. What evolution
or a culture has learnt may, of course, not be appropriate in new circumstances!
This discussion note does not prove that evolution produced organisms that
make use of architecture-based motivation in which at least some motives are produced
and acted on without any reward mechanism being required. But it illustrates the
possibility, thereby challenging the assumption that ALL motivation must arise out of
Similar arguments about how suitably designed reflex mechanisms may react to
perceived processes and states of affairs by modifying internal information stores
could show that at least some forms of learning use mechanisms that are not concerned
with rewards, with positive or negative reinforcement, or with utility maximisation
(or maximisation of expected utility). My conjecture is that the most important forms
of learning in advanced intelligent systems (e.g. some aspects of language learning
in human children) are architecture-based, not reward based. But that requires
The ideas presented here are very relevant to robotics projects like CogX,
which aim to investigate designs for robots that 'self-understand' and 'self-extend',
since they demonstrate at least the possibility that some forms of
self-extension may not be reward-driven, but architecture-driven.
Various forms of architecture-based motivation seem to be required for the
development of precursors of mathematical competences described here:
Some of what is called 'curiosity-driven' behaviour probably needs to be re-described
as 'architecture-based' or 'architecture-driven'.
To be added:
How the contrast discussed here relates to the distinction made by
psychologists between intrinsic and extrinsic rewards, e.g. in this report:
[This document is still under construction. Suggestions for improvement welcome. It
is likely to change.]
This is one of a series of notes explaining how learning about underlying mechanisms
can alter our views about the 'logical topography' of a range of phenomena,
suggesting that our current conceptual schemes (Gilbert Ryle's 'logical geography')
can be revised and improved, at least for the purposes of science, technology,
education, and maybe even for everyday conversation, as explained in
Marvin Minsky wrote about goals and how they are formed in The Emotion Machine.
It seems to me that the above is consistent with what he wrote, though I may have
Something like the ideas presented here were taken for granted when I wrote
The Computer Revolution in Philosophy in 1978. However, at that time
I underestimated the importance of spelling out assumptions and conjectures in much
Two closely related pieces of work came to my notice after the above had been written:
R.W. White, 1959, Motivation reconsidered: The concept of competence,
Psychological Review, 66, 5, pp. 297-333
S. Singh, R. L. Lewis, and A. G. Barto, 2009, Where Do Rewards Come From?
Proceedings of the 31st Annual Conference of the Cognitive Science Society,
Eds. N.A. Taatgen and H. van Rijn, Cognitive Science Society, pp. 2601-2606,