Many of Shannon's admirers seem to have forgotten that there is a much older, widely used, theoretically important notion of "information", which was familiar to Jane Austen and used in her novels, and also occurs in non-technical, conversational, uses of the word "information". This concept of information is essential for our understanding of biological evolution and its products (including humans) and for attempts to understand what natural intelligence is and how it works, including attempts to model and replicate natural intelligence in machines.
Shannon himself did not make this mistake of conflating the old concept of semantic information with what he called "information". Margaret Boden comments on this in her two volume survey of cognitive science and its history (2006):
This term was drawn from Shannon's information theory, developed at Bell Labs to measure the reliability or degradation of messages passing down telephone lines (Shannon 1948; Shannon and Weaver 1949). But the "messages" were thought of not as meaningful contents, conveying intelligible information such as that Mary is coming home tomorrow. Rather, they were the more or less predictable physical properties of the sound signal. In Shannon's words: "Frequently the messages have meaning; that is, they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages."
In short, "information" wasn't a semantic notion, having to do with meaning or truth. It was a technical term, denoting a statistical measure of predictability. What was being predicted was the statistics of a physical signal, not the meaning--if any--being carried by the signal. As a technical term for something measurable, "information" needed a quantitative unit. This new unit was the bit (an abbreviation of "binary digit").
"English novelist known primarily for her six major novels, which interpret, critique and comment upon the British landed gentry at the end of the 18th century."
Born 1775. Died 1817
I'll summarise Shannon's notion and contrast it with Jane Austen's notion (illustrated using extracts from her novel Pride and Prejudice below). She was primarily concerned with useful information contents of various kinds, whereas Shannon, as illustrated above, was primarily concerned with mathematical properties of information vehicles.
I'll try to explain the differences between their approaches, and contrast both of them with the views of Auletta et al. below who, like Shannon, regard information as something transmitted and received, though they focus more on the receiver than the sender.
The uses are many and varied, including recognising a need, a threat, or a source of something useful, distinguishing known and unknown individuals, selecting a goal (e.g. to meet a need or deflect a threat), finding a means to a goal (e.g. selecting an available action, or an available sequence of actions, or a route through space, to achieve the goal), making predictions, and many more.
There are also many actions that can be performed on information, e.g. deriving new information from old, detecting an inconsistency, detecting an ambiguity, refining information by adding new details, using theoretical information to explain some other information gained from observation or reasoning, checking whether one information item is relevant to another (e.g. whether it answers, or helps to answer, a question), and many more.
The information content of a question is a request for some other information that will answer the question. The information content of a command or instruction or suggestion includes specification of some action or type of action that could be performed.
I suspect none of those statements would have surprised Jane Austen or many other thinkers before and after her, who had never encountered Shannon information.
Evolutionary changes produce new physical structures, capabilities, and behaviours, but they can also extend information-processing abilities, in many different ways, including extending uses of information during growth and development. (E.g. as explained below in connection with Meta-Configured genomes.)
Being of use is a more fundamental feature of information than being physically manipulable, since there would not be any point in storing, manipulating or transmitting information, or even creating information items to be stored or transmitted, if the information could not be used.
Having the potential to be used applies to both true and false information and information contents that are neither true nor false, e.g. questions or imperatives (commandments?). It is possible to use false information inadvertently or deliberately, e.g. in political speeches or commercial advertising.
However, not all control information has the potential to be true or false: e.g. a road sign or traffic light telling you to stop is typically part of a complex traffic control system rather than a piece of factual information that can be true or false.
The most basic use of information, in all forms of life, including the simplest forms of life, is for control -- initiating or modifying an action or process, or selecting between things to do, selecting when to start or stop processes, or modulate them, e.g. speeding up, slowing down or changing direction, and many more.
Moreover it is often sensible to store things that are never used, e.g. plumbing tools, because situations could arise in which they would need to be used, and that is also true of information. Information that has the potential to be used for control (e.g. in deciding what actions to perform) need not actually be used for control -- but that does not prevent it being potentially useful control information.
Information, in all these cases, is something abstract, potentially but not intrinsically concerned with relationships between some actual needs or goals, situations, and decisions or selections.
It involves structures -- parts, and wholes, and relationships -- but that in itself does not imply that there is any numerical measure. A single number, or point on a scale does not capture the required useful properties of information.
There may be measures associated with things that are referred to in items of information, but those are not measures of information: e.g. one piece of information could be about a collision between a child and a doorpost, and another about a collision between an asteroid and a planet. The two events have many measures, e.g. amounts of physical matter involved, amounts of kinetic energy transformed into heat energy, but those are not measures of amounts of information.
Information can be about inaccessible or remote entities. There are old philosophical problems about how it is possible to refer to, think about, ask questions about, things that have never been encountered as sensed, or more generally experienced, objects. My own view, initially proposed in Sloman(1985) (and later extended by Ian Wright) is that semantic content beyond what's immediately accessible can be based on the need to fill gaps in causal loops: e.g. if some of the contents of a machine's current sensory experience change spontaneously, then something outside the machine can be postulated as a cause of the change: "loop-closing" semantic content.
If motions of limbs or rotation of a head seem to cause changes in visual contents and there is no direct connection between the relevant muscles and retinal cells, then an animal or machine may use mechanisms for postulating external causal intermediaries, and for inventing theories about what they are and how they work, including, for example, differences between visually perceived changes caused by moving your hand in front of your eyes, and changes caused by rotating your head so that new parts of the environment come into view. Of course, that sketch has to be filled in with a great deal of mechanism, but it is clear that biological evolution combined with features of the environment has produced extremely sophisticated and varied visual mechanisms dealing with such "loop-closing" semantics.
Biological evolution evidently discovered the importance of loop closing semantics in biological mechanisms used for control on the basis of information outside the organism, or in the organism but outside the control mechanism.
These biological mechanisms can be thought of as precursors of the more sophisticated cases proposed by 20th Century philosophers of science (e.g. Hempel, Pap, Tarski, and many others) who developed anti-empiricist explanations of how scientific theories can meaningfully refer to entities that scientists cannot experience, with properties that cannot be directly measured, e.g. the mass and charge of an electron, or the temperature at the surface of a star light-years away from us at a long-past time.
In Shannon's sense of the word "information", there is a numerical quantity of information associated with a signal, derived from the size of the class of alternative signals possible in that context. In that sense "The caterpillar chewed the leaf" and "The asteroid destroyed the island" might have comparable "amounts" of information as English sentences of the form "The [noun] [verb] the [noun]", since both select from the same word classes, ignoring the structure at the level of the alphabet used.
If each signal in a set of possible signals is constructed by concatenating symbols selected from a fixed set of symbols, then the Shannon information content depends on the size of the set of symbols and the number of symbols in the signal. For example if only two signal elements are used, a dot (".") and a dash ("-"), as in Morse code, then any signal made of four components, e.g. "....", "----", "-.-.", etc. has an amount of information expressible in terms of the number of possible four component signals using only two types of components, namely: 16. Each such four component signal eliminates 15 of the 16 possibilities.
Likewise a five component signal using a two character alphabet eliminates 31 of the 32 possibilities. So it has more Shannon information than a four component signal. (There are different mathematically equivalent ways of defining a measure of information based on this idea, some more generally useful than others.)
If instead of only a choice between two items for each signal component, the code used allows four choices for each component, e.g. one of these: "-", ".", "=", "+", then instead of the information measure being based on 2x2x2x2 = 16, it will be based on 4x4x4x4 = 256. So the measure of how much is excluded increases as the number of options for each item in the string increases and also as the length of the string increases.
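The counting in the last three paragraphs can be checked with a short Python sketch. It is only an illustration: the function name `count_signals` and the way the symbol sets are written down are choices made for this example, not anything taken from Shannon's paper.

```python
from itertools import product


def count_signals(symbols, length):
    """Return the number of distinct signals of the given length
    that can be built by concatenating items from `symbols`."""
    # Enumerate every possible signal explicitly, so the count is
    # obtained by brute force rather than from the formula k ** n.
    return sum(1 for _ in product(symbols, repeat=length))


two_symbols = [".", "-"]             # dot and dash
four_symbols = ["-", ".", "=", "+"]  # the four-choice code from the text

for symbols, length in [(two_symbols, 4), (two_symbols, 5), (four_symbols, 4)]:
    total = count_signals(symbols, length)
    # Any one signal rules out all the other possible signals.
    print(len(symbols), "symbols, length", length, ":",
          total, "possible,", total - 1, "excluded")
```

The brute-force enumeration agrees with the products given in the text: 16, 32 and 256 possible signals, with 15, 31 and 255 excluded by any one signal.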
For technical reasons, Shannon's measure did not directly use these numbers, 16 and 256, or the numbers of items excluded by each signal, e.g. 15 or 255, but numbers derived from them. The main point is that a signal that excludes 255 of 256 possibilities can be said to have more information, in Shannon's sense, than a signal that excludes 15 of 16 possibilities, a smaller ratio. So two equally nonsensical words for an English user, e.g. "zzxxjalp" and "azbycxxyrk", which convey no information if sent unexplained as a message, will have different amounts of Shannon information. Assuming the same alphabet is in use, the second is longer and excludes a higher proportion of alternatives than the shorter word, and therefore has more Shannon information.
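The "numbers derived from them" are logarithms: Shannon's measure takes the logarithm (conventionally to base 2, giving bits) of the number of possibilities. A minimal sketch follows, assuming that every symbol is equally likely and chosen independently of the others. That assumption is a simplification made here, since Shannon's general definition weights symbols by their probabilities. The helper name `bits_for_signal` is hypothetical.

```python
import math


def bits_for_signal(alphabet_size, length):
    """Shannon information in bits, ASSUMING every symbol is equally
    likely and chosen independently of the others."""
    # log2(alphabet_size ** length) == length * log2(alphabet_size)
    return length * math.log2(alphabet_size)


print(bits_for_signal(2, 4))   # 4.0  (16 possibilities)
print(bits_for_signal(4, 4))   # 8.0  (256 possibilities)

# The two nonsense words from the text, assuming a 26-letter alphabet.
for word in ["zzxxjalp", "azbycxxyrk"]:
    print(word, round(bits_for_signal(26, len(word)), 1))
```

Under these assumptions the eight-letter string comes out at roughly 37.6 bits and the ten-letter string at roughly 47.0 bits, although neither means anything to an English reader.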
This is analogous to the way in which saying that an animal in the distance is a bird gives less information than saying it is a crow, because "crow" excludes more possibilities, and therefore supports more inferences, than "bird" does: you can make more inferences from "Tweety is a crow" than from "Tweety is a bird". Intuitively the former has more information. That shows a loose connection between our ordinary concept of information and Shannon information.
Each of the two words "bird" and "crow" contains four letters from the same set of 26 possible letters and therefore, considered purely as signals, they have the same amount of Shannon information. Considered as words of English, however, they each have a smaller information measure than that, because not all combinations of four letters of the alphabet are words of English, e.g. "iiii", "zyww" are not, which most English speakers will (somehow!) know without being told, so the words "crow" and "bird" exclude a smaller number of alternatives than they would exclude if all four letter sequences were words of English.
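The effect of restricting the alternatives to actual words can be sketched as follows. The eight-word vocabulary is a hypothetical stand-in for "four-letter words of English", and the words are assumed to be equally likely. A real word list would be far larger, though still much smaller than the full set of four-letter strings.

```python
import math

# HYPOTHETICAL toy vocabulary standing in for the four-letter words of English.
toy_vocabulary = ["bird", "crow", "pick", "nest", "leaf", "tree", "wing", "seed"]

all_strings = 26 ** 4                          # every four-letter sequence
bits_as_letters = math.log2(all_strings)       # measure when any string is allowed
bits_as_words = math.log2(len(toy_vocabulary)) # measure when only listed words are allowed

print(all_strings)                 # 456976
print(round(bits_as_letters, 2))   # 18.8
print(bits_as_words)               # 3.0
```

In both calculations "bird" and "crow" receive exactly the same value, which is the point of the paragraph above: the measure depends on the size of the set of alternatives, not on what either word means.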
A further complication is that there are some four letter words, e.g. "pick", which have (at least) three meanings, all of which are excluded by use of the word "crow", so that increases the number of words excluded. But this has nothing to do with what the word "crow" means, i.e. what information it can be used to convey.
There are many technical details omitted by this summary. The main point to note is that this concept of information measure, expressible as a number, which turned out to have profoundly important applications in science and engineering, refers only to the structure of the signal itself and the size of the set of alternative possibilities with that structure. Shannon's measure of "information" in this sense has nothing to do with what we would normally refer to as "meaning", "content" or what is "denoted", or "referred to".
It is a syntactic measure that is not directly connected with semantic content, though it may be indirectly connected when applied to signals in a known language. Shannon understood all this, as is shown clearly by a video presentation in which he discusses maze-learning by a mechanical mouse he had built, clearly indicating that the mouse acquires information that can later be used for getting from anywhere in the maze to the goal point. But his choice of the label "information" in his publications seems to have confused many highly intelligent people. (He apparently later regretted using the label "information" for his concept.)
I have found Shannon's video online in two places:
Youtube video (highly distorted):
This video summary presents some of Shannon's ideas (without going into
technical detail) and explains their importance:
Claude Shannon - Father of the Information Age
There are many online documents explaining Shannon's ideas in more technical detail and contrasting them with alternative ideas. For a philosopher's overview see Floridi's Stanford Encyclopedia of Philosophy entry.
Sometimes the subject matter, or information, identifies an entity of some sort, e.g. London, or the tallest building in Paris, or William Shakespeare. Sometimes it is a fact, or possible fact, e.g. "Humans will be born in spaceships by the year 2250", or even something false, e.g. "The Eiffel Tower is in London", or a question, or an instruction or command (the answer to "what shall I do?", which might be "sit on the mat next to the door and twiddle your thumbs").
These are all examples of semantic content, expressed here in printed English, though in principle the same semantic contents could be expressed in spoken English, hand-written English, or many other languages, using different words, and different textual forms for those words, or in sign languages whose physical instantiations are evanescent body movements.
Pictures and diagrams can also have semantic content though the mechanisms (in brains or computers) required for producing and interpreting them are different from those used for producing and interpreting words, phrases and sentences. Perception and understanding of a picture is related to but different from visual acquisition of information, e.g. about what exists and what's happening in some part of the environment. Visual information acquired in ordinary life typically does not have a sender, and in many environments will have a large number of independent sources, e.g. different plants, paths, walls, boundaries, and insects seen at a moment in a garden. There will be no well defined measure of amount of information in the whole scene though there will typically be many different kinds of information.
However, if a picture, or video, stored in a computer is represented by a computer memory structure composed of bits (symbols chosen from a set of two elements, e.g. '0' and '1') then the number of bits will indicate the information content as measured by Shannon.
There are ways of compressing the signal size required for transmitting or storing such picture elements because of the amount of repetition they often include: e.g. for large regions of an image that are all one colour, or because of repeated pairings or groupings of information items. So the amount of Shannon information required for storage may be different from the amount required for the physical display mechanism that has to show all parts of the image, not a mathematically derived summary. Again, the semantic information content that a human acquires by looking at the image, e.g. information about a crow next to its nest, is very different from the Shannon information measure.
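One simple way of exploiting repetition is run-length encoding, sketched below. It is offered only as an illustration of the idea of storing a derived summary instead of every picture element. The one-row "image" and the function names are hypothetical, and real image formats use more elaborate schemes.

```python
from itertools import groupby


def run_length_encode(pixels):
    """Collapse consecutive equal pixels into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def run_length_decode(pairs):
    """Rebuild the full pixel sequence that a display mechanism needs."""
    return [value for value, count in pairs for _ in range(count)]


# HYPOTHETICAL one-row "image": mostly one colour, with a small dark patch.
row = ["blue"] * 20 + ["black"] * 3 + ["blue"] * 17

encoded = run_length_encode(row)
print(encoded)   # [('blue', 20), ('black', 3), ('blue', 17)]
print(len(row), "pixels displayed,", len(encoded), "runs stored")

# Decoding recovers every pixel, so nothing needed for display is lost.
assert run_length_decode(encoded) == row
```

Forty picture elements are displayed but only three runs are stored, and neither number says anything about whether the dark patch depicts a crow.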
That semantic sense is the sense in which Jane Austen used the word "Information" in her novel Pride and Prejudice, published in 1813, about 135 years before Shannon published his paper, though she was mainly referring to verbally expressed information.
The claim that she often used such a concept of information is substantiated by a collection of examples of her use of the word "information" in the novel, presented in the next section. However, I would not be surprised to learn that she was perfectly well aware that information can be acquired through sensing or perceiving other things than written or spoken words, or even by reasoning, and also aware that information can be used in many forms of action, including, for example, catching a ball, or locating a lost key.
She was a woman of mean understanding, little information, and uncertain temper.
Catherine and Lydia had information for them of a different sort.
When this information was given, and they had all taken their seats, Mr. Collins was at leisure to look around him and admire,...
You could not have met with a person more capable of giving you certain information on that head than myself, for I have been connected with his family in a particular manner from my infancy.
This information made Elizabeth smile, as she thought of poor Miss Bingley.
This information, however, startled Mrs. Bennet ...
She then read the first sentence aloud, which comprised the information of their having just resolved to follow their brother to town directly,...
She resolved to give her the information herself, and therefore charged Mr. Collins, when he returned to Longbourn to dinner, to drop no hint of what had passed before any of the family.
...and though he begged leave to be positive as to the truth of his information, he listened to all their impertinence with the most forbearing courtesy.
Mrs. Gardiner about this time reminded Elizabeth of her promise concerning that gentleman, and required information; and Elizabeth had such to send as might rather give contentment to her aunt than to herself.
Elizabeth loved absurdities, but she had known Sir William's too long. He could tell her nothing new of the wonders of his presentation and knighthood; and his civilities were worn out, like his information.
I was first made acquainted, by Sir William Lucas's accidental information, that Bingley's attentions to your sister had given rise to a general expectation of their marriage.
As to his real character, had information been in her power, she had never felt a wish of inquiring.
... and at last she was referred for the truth of every particular to Colonel Fitzwilliam himself--from whom she had previously received the information of his near concern in all his cousin's affairs,
When he was gone, they were certain at least of receiving constant information of what was going on,
Mr. Bennet had been to Epsom and Clapham, before his arrival, but without gaining any satisfactory information....
Elizabeth was at no loss to understand from whence this deference to her authority proceeded; but it was not in her power to give any information of so satisfactory a nature as the compliment deserved.
Upon this information, they instantly passed through the hall once more...
She began now to comprehend that he was exactly the man who, in disposition and talents, would most suit her. His understanding and temper, though unlike her own, would have answered all her wishes. It was an union that must have been to the advantage of both; by her ease and liveliness, his mind might have been softened, his manners improved; and from his judgement, information, and knowledge of the world, she must have received benefit of greater importance.
And will you give yourself the trouble of carrying similar assurances to his creditors in Meryton, of whom I shall subjoin a list according to his information?
But to live in ignorance on such a point was impossible; or at least it was impossible not to try for information.
but to her own more extensive information, he was the person to whom the whole family were indebted
Darcy was delighted with their engagement; his friend had given him the earliest information of it.
"Did you speak from your own observation," said she, "when you told him that my sister loved him, or merely from my information last spring?"
Bingley looked at her so expressively, and shook hands with such warmth, as left no doubt of his good information.
The joy which Miss Darcy expressed on receiving similar information, was as sincere as her brother's in sending it.
What sorts of information-processing machinery can account for the phenomena she was interested in?
Does information have to have a sender and a receiver in order to exist? Can information be received, or acquired, without being sent intentionally? (Which of Jane Austen's examples might be of that sort? What if she had written detective stories?)
Do the examples show that she understood the importance of both control information and factual information? What is the difference?
How can information make something happen?
Exercise: which varieties of control information can you distinguish in organisms, at various stages of development, learning, behaviour, competition, cooperation, reproduction?
"One of the biggest misunderstandings in information theory is to have taken Shannon's (1948) theory of communication (in the context of controlled transmission) as a general theory of information. In such a theory, centred on signal/noise discrimination, the message is already selected and well defined from the start, ...(selected by the sender)..., and the problem here is only to faithfully transmit or further process, ... ... the sequence of bits that has been selected (Auletta 2008a). On the contrary, a true information theory (as was Wiener's (1948) original aim) starts with an input as a source of variety and has the selection only at the end of the information processing or exchanging. In other words, a message here is only the message selected by the receiver."
Note that this makes the assumption (of which I was once guilty) that information is only something transmitted and received. That assumption ignores the fact that all that encoding, transmitting, decoding, etc., would be pointless if information could not be used. So a deep theory of information should start with users of information and its uses, which may differ for different kinds of information and different users.
For example there are many important uses of information (understood by novelists) that have nothing to do with senders and receivers, since the information is the content of an intention, a percept, a plan for action, or an internal self-directed question (e.g. "What made that noise?" "Where did I previously find fruit?" "Why did my action that previously succeeded fail this time?"). Although my examples are expressed in English, I suspect that pre-verbal human toddlers and other animals are able to use much older internal languages that evolved not for purposes of communication but for intelligent (self-)control, including perception, deliberation, control of action, reflection on successes and failures, and many more. So in addition to discussing information in relation to senders and receivers, we also need to discuss its relevance to information users.
Then sources, senders, receivers, encoders, decoders, etc. could be discussed as secondary topics, though for Shannon's job the secondary topics were the only, or the main, matters of concern: because his employer was the Bell Telephone Company.
In the context of the Meta-Morphogenesis project a major way in which information of various kinds, with various sources, plays central, highly context-sensitive roles, is in individual development, as discussed briefly below (Meta-Configured Genomes).
Is that what Auletta et al. intended to say?
I have a fantasy that one day an Austen scholar, with much deeper acquaintance
with her writings, will compose an imaginary dialogue between Jane Austen and a
biologist discussing capabilities of a wide range of organisms, perhaps starting
with these presentations by Maddie Moate and colleagues:
Amazing Animal Architects
Monkeys react to magic
and perhaps going on to much simpler, smaller organisms, or even individual cells in the bodies of living things.
Investigating the varieties of information processing between the very simplest
organisms, or proto-organisms, and the most complex, and how and why the
relevant types of use of information emerged, is the main goal of the
Turing-inspired Meta-Morphogenesis project:
That project includes the hypothesis that the most basic and most pervasive use of information in living things is for control: receiving, transmitting, storing, retrieving, encoding and decoding are all of secondary importance, insofar as they all contribute directly or indirectly, immediately or with some delay, to the use of information, although some processes whose function is to make information available for use may turn out to have been redundant because the information is never used. But that doesn't stop it being potentially usable information.
Most philosophers who write about information tend to focus on intentional uses or transmissions of information by humans, whose use of information is typically complex and varied, with many subtleties. For much simpler forms of life the uses of information and the types of information are much more restricted though the physical mechanisms may be quite complex, as suggested by the "chemoton" theory of Ganti(2003).
Our ideas constitute steps toward a theory of "A meta-configured genome",
according to which genetic information interacts during its expression both with
current aspects of the environment and with products of earlier stages of
gene-expression, as illustrated most obviously in connection with the
multi-stage processes in language development. This idea (still under
development) is outlined in: "The Meta-Configured Genomes" (work in progress).
I believe this is closely related to Annette Karmiloff-Smith's theory of "representational redescription" during individual development:
As far as I know this distinction was not discussed by Shannon, although his
1948 paper implicitly makes use of the distinction insofar as he uses the word
"sense" several times, e.g. in contexts like
"...if P is sufficiently large, in the sense of having an entropy power approaching P + N"
"...the evaluation is "reasonable" in the sense that...".
However, problems arise when attempts are made to apply the sense/reference distinction to every possible word or phrase or sign or process that in some sense can be said to convey information or have a meaning.
Examples that cause problems (some of them discussed by Frege) include demonstrative/indexical expressions, e.g. "here", "now", "you", "I", "we", words that combine sentence fragments to form new larger fragments or whole sentences, or qualify assertions, such as "but" "although" "perhaps", "of course", proper names, and many others.
A good novelist with a rich and deep command of her language will use all these hard to analyse words and phrases without worrying about philosophers' questions. However, a complete theory of information, covering all the varieties of information contents of portions of human languages, will have to make explicit the roles of the more complex and subtle words and phrases, as philosophers and linguists have attempted to do. (A survey is beyond the scope of this paper. Is there a good tutorial reference?)
I have tried to do this with a little word that causes big problems,
namely "self", whose linguistic function is sometimes misconstrued as referring
to a special mysterious entity ("the self") by philosophers and others, here:
"THE SELF" -- A BOGUS CONCEPT
Without a good theory covering all the obvious and unobvious cases we are
unlikely to be able to design robots that have minds like ours.
[A tutorial on conceptual analysis is available in chapter 4 of my 1978 book:
(Discussed in the text, above).
Bateson on "difference": discussion note by A.S.
(Added here 29 Aug 2018)
What did Gregory Bateson mean when he wrote: "information" is "a difference that makes a difference"?
Bateson is frequently quoted approvingly by unthinking admirers who seem to ignore the fact that no matter how memorable the slogan sounds it is not at all clear what it could possibly mean. Guided by some of Bateson's writings, the "difference" discussion note explains that Bateson was referring to some of the patterns of causal influence produced in brains by information. (With thanks to Olivier Marteaux for correcting my initial interpretation.)
For a sample of her work closely related to this topic, see:
Jackie Chappell, 2014, Acting on the world: understanding how agents use information to guide their action, in From Animals to Robots and Back: Reflections on Hard Problems in the Study of Cognition, Eds., Wyatt, J.L. and Petters, D.D. and Hogg, D.C., Springer, pp 51--64, 978-3-319-06614-1,
Chappell's paper uses the word "information" 27 times in 10 pages -- in the sense of Jane Austen, but mainly in the context of non-human intelligence.
Most animals navigate a dynamic and shifting sea of information provided by their environment, their food or prey and other animals. How do they work out which pieces of information are the most important or of most interest to them, and gather information on those parts to guide their action later? In this essay, I briefly outline what we already know about how animals use information flexibly and efficiently. I then discuss a few of the unsolved problems relating to how animals collect information by directing their attention or exploration selectively, before suggesting some approaches which might be useful in unravelling these problems.
George B. Dyson,
Darwin Among The Machines: The Evolution Of Global Intelligence,
Addison-Wesley, Reading, MA, 1997,
Here's a sample extract:
"....I've been trying to articulate, with the help of Harvard evolutionary biologist David Haig, just what meaning is, what content is, and ultimately, in terms of biological information and physical information, the information presented in A Mathematical Theory of Communication by Shannon and Weaver. There's a chapter in my latest book called "What is Information?" I stand by it, but it's under revision. I'm already moving beyond it and realizing there's a better way of tackling some of these issues.
The key insight, which I've known for years, is that we have to get away from
the idea of there being the pure ultimate fixed proposition that captures the
information in any informational state. This goal of capturing the proposition,
this attempt at idealization that philosophers have poured their hearts and
souls into for a hundred years is just wrong. Don't even try. I'm now coming
around to wonder why it had such a hold on us. It's quite obvious once you start
thinking this way."
I was surprised by the comment about "why it had such a hold on us". To whom does "us" refer? I've never thought for a moment that Shannon's theory was about what we normally understand by "meaning" or "information content". I remember arguing against this with fellow students, including at least one music student, around 1959. My 1962 DPhil thesis was about relationships between meaning (information content) and truth, and never mentions Shannon, or anything remotely like Shannon information, although I had encountered his ideas and decided they were irrelevant to my attempts to understand relationships between meaning and truth, especially necessary truth. I am surprised that Dennett did not notice that Shannon had simply misused the normal concept of "information". Although many scientists and engineers found Shannon's ideas very useful, I would expect philosophers, linguists, and many others to notice their irrelevance to their own work.
Aaron Sloman
This was an invited contribution to Information and Computation, Eds. Gordana Dodig-Crnkovic and Mark Burgin, World Scientific Publishers, New Jersey, pp.393--438, 2011
What's information, for an organism or intelligent machine?
How can a machine or organism mean?,
A partial index of discussion notes on this and many other topics is in