A partial index of discussion notes on this and many other topics is in
It is, of course, true that information that has the potential to be used for control (e.g. in deciding what actions to perform) need not actually be used for control -- but that does not prevent it being control information.
In Shannon's sense, the quantity of information associated with a signal (a numerical quantity) is derived from the size of the class of alternative signals possible in that context. So if each signal is constructed by concatenating symbols from a fixed set of symbols, then the Shannon information content depends on the size of the set of symbols and the number of symbols in the signal. For example, if only two signal elements are used, a dot (".") and a dash ("-"), as in Morse code, then any signal made of four components, e.g. "....", "----", "-.-.", etc., has an amount of information expressible in terms of the number of possible four-component signals using only two types of components, namely 16. Each such four-component signal eliminates 15 of the 16 possibilities. Likewise, a five-component signal using a two-character alphabet eliminates 31 of the 32 possibilities, so it has more Shannon information than a four-component signal. (There are different mathematically equivalent ways of defining a measure of information based on this idea, some more generally useful than others.)
If, instead of a choice between only two items for each signal component, the code allows four choices for each component, e.g. one of these: "-", ".", "=", "+", then instead of the information measure being related to 2x2x2x2 = 16, it will make use of the fact that 4x4x4x4 = 256. For technical reasons, Shannon's measure did not directly use these numbers, 16 and 256, or the numbers of items excluded by each signal, e.g. 15 or 255, but numbers derived from them (their logarithms). The main point is that a signal that excludes 255 of 256 possibilities can be said to have more information, in Shannon's sense, than a signal that excludes 15 of 16 possibilities. So two equally nonsensical words for an English user, e.g. "zzxxjalp" and "azbycxxyrk", which convey no information if sent unexplained as a message, will have different amounts of Shannon information. The second is longer, so it excludes more possibilities than the shorter word, and therefore has more Shannon information.
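The counting argument in the last two paragraphs can be checked with a few lines of Python. This is only a minimal sketch, assuming that every signal of a given length is equally likely (Shannon's general measure also takes unequal probabilities into account, which is ignored here). The function names count_signals and information_bits are invented for this illustration.

```python
import math


def count_signals(alphabet_size: int, length: int) -> int:
    """Number of distinct signals of the given length over the alphabet."""
    return alphabet_size ** length


def information_bits(alphabet_size: int, length: int) -> float:
    """Information in bits, assuming all signals are equally likely.

    This equals log2 of the number of possible signals, written as
    length * log2(alphabet_size) to keep the arithmetic exact for
    alphabet sizes that are powers of two.
    """
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # (alphabet size, signal length) for the cases discussed in the text.
    for alphabet_size, length in [(2, 4), (2, 5), (4, 4)]:
        total = count_signals(alphabet_size, length)
        bits = information_bits(alphabet_size, length)
        print(
            f"{alphabet_size} symbols, length {length}: "
            f"{total} signals, {total - 1} excluded, {bits:.1f} bits"
        )
```

On this measure the four-component dot/dash signal carries 4 bits and the four-component signal over four symbols carries 8 bits, which is one way of using "numbers derived from" 16 and 256 rather than those numbers themselves.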
This is analogous to the way in which saying that an animal in the distance is a bird gives less information than saying it is a crow, because "bird" excludes fewer possibilities than "crow" does. You can therefore make more inferences from "Tweety is a crow" than from "Tweety is a bird". The two words "bird" and "crow" each contain four letters from the same set of 26 possible letters and therefore, considered purely as signals, they have the same amount of Shannon information. Considered as words of English, however, they each have a smaller information measure than that, because not all combinations of four letters of the alphabet are words of English, e.g. "iiii", "zyww" are not, which most English speakers will (somehow!) know without being told, so the words "crow" and "bird" exclude a smaller number of alternatives than they would exclude if all four letter sequences were words of English. A further complication is that there are some four letter words, e.g. "pick", which have (at least) two meanings, both of which are excluded by use of the word "crow", so that increases the number of words excluded. But this has nothing to do with what the word "crow" means, i.e. what information it can be used to convey.
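The contrast between a four-letter string considered purely as a signal and the same string considered as a word of a restricted vocabulary can be sketched as follows. The vocabulary below is a deliberately tiny toy list invented for this illustration, not a count of real English four-letter words, and the sketch again assumes that all alternatives are equally likely.

```python
import math

ALPHABET_SIZE = 26
WORD_LENGTH = 4

# Every possible four-letter sequence, whether or not it is a word.
all_sequences = ALPHABET_SIZE ** WORD_LENGTH
bits_as_raw_signal = math.log2(all_sequences)

# Hypothetical toy vocabulary (an assumption for illustration only).
TOY_VOCABULARY = ["bird", "crow", "pick", "nest"]
bits_as_vocabulary_word = math.log2(len(TOY_VOCABULARY))

if __name__ == "__main__":
    print(f"Four-letter sequences: {all_sequences}")
    print(f"Bits as a raw signal: {bits_as_raw_signal:.1f}")
    print(f"Bits as one of {len(TOY_VOCABULARY)} toy words: "
          f"{bits_as_vocabulary_word:.1f}")
```

Neither number says anything about what "crow" means: both depend only on how many alternatives are being excluded.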
There are many technical details omitted by this summary. The main point to note
is that this concept of information measure, expressible as a number,
which turned out to have profoundly important applications in science and
engineering, refers only to the structure of the signal itself and the size of
the set of alternative possibilities with that structure. Shannon's measure of
"information" in this sense has nothing to do with what we would normally refer
to as "meaning", "content" or what is "denoted", or "referred to". It is a
syntactic measure that is not directly connected with semantic
content, though it may be indirectly connected when applied to signals in a
known language. Shannon understood all this, but his choice of the label
"information" seems to have confused many highly intelligent people. This video
summary presents some of Shannon's ideas (without going into technical detail)
and explains their importance:
Claude Shannon - Father of the Information Age
(There are many online documents explaining Shannon's ideas in more technical detail.)
Pictures and diagrams can also have semantic content though the mechanisms (in brains or computers) required for interpreting them are different from those used for interpreting words, phrases and sentences. However, if a picture in a computer is represented by a computer memory structure composed of bits (symbols chosen from a set of two elements, e.g. '0' and '1') then the number of bits will indicate the information content as measured by Shannon.
There are ways of compressing the signal size required for transmitting or storing such picture elements because of the amount of repetition they often include: e.g. for large regions of an image that are all one colour. So the amount of Shannon information required for storage may be different from the amount required for the physical display mechanism that has to show all parts of the image, not a mathematically derived summary. Again, the semantic information content that a human looking at the image acquires, e.g. information about a crow next to its nest, is very different from the Shannon information measure.
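One simple way of exploiting repetition of this kind is run-length encoding, sketched below. It is offered only as an illustration of the general idea, not as the method used by any particular image format. The function names and the example row of pixels are invented for this sketch.

```python
from typing import List, Tuple


def run_length_encode(pixels: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive equal pixel values into (value, count) pairs."""
    runs: List[List[int]] = []
    for pixel in pixels:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([pixel, 1])  # start a new run
    return [(value, count) for value, count in runs]


def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, count) pairs back into the full list of pixels."""
    pixels: List[int] = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


# A hypothetical row of a two-colour image: mostly background (0)
# with a short stretch of foreground (1).
row = [0] * 12 + [1] * 3 + [0] * 9
encoded = run_length_encode(row)

if __name__ == "__main__":
    print(f"{len(row)} pixels stored as {len(encoded)} runs: {encoded}")
```

The display still needs all 24 pixel values, recovered by run_length_decode, whereas storage needs only three runs. Neither representation records that the picture shows a crow.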
That semantic sense is the sense in which Jane Austen used the word "Information" in her novel Pride and Prejudice about 135 years before Shannon wrote his paper, though she was mainly referring to verbally expressed information. The claim that she often used such a concept of information is substantiated by a collection of examples of her use of the word "information" in that novel, presented in the next section.
Catherine and Lydia had information for them of a different sort.
When this information was given, and they had all taken their seats, Mr. Collins was at leisure to look around him and admire,...
You could not have met with a person more capable of giving you certain information on that head than myself, for I have been connected with his family in a particular manner from my infancy.
This information made Elizabeth smile, as she thought of poor Miss Bingley.
This information, however, startled Mrs. Bennet ...
She then read the first sentence aloud, which comprised the information of their having just resolved to follow their brother to town directly,...
She resolved to give her the information herself, and therefore charged Mr. Collins, when he returned to Longbourn to dinner, to drop no hint of what had passed before any of the family.
...and though he begged leave to be positive as to the truth of his information, he listened to all their impertinence with the most forbearing courtesy.
Mrs. Gardiner about this time reminded Elizabeth of her promise concerning that gentleman, and required information; and Elizabeth had such to send as might rather give contentment to her aunt than to herself.
Elizabeth loved absurdities, but she had known Sir William's too long. He could tell her nothing new of the wonders of his presentation and knighthood; and his civilities were worn out, like his information.
I was first made acquainted, by Sir William Lucas's accidental information, that Bingley's attentions to your sister had given rise to a general expectation of their marriage.
As to his real character, had information been in her power, she had never felt a wish of inquiring.
... and at last she was referred for the truth of every particular to Colonel Fitzwilliam himself--from whom she had previously received the information of his near concern in all his cousin's affairs,
When he was gone, they were certain at least of receiving constant information of what was going on,
Mr. Bennet had been to Epsom and Clapham, before his arrival, but without gaining any satisfactory information....
Elizabeth was at no loss to understand from whence this deference to her authority proceeded; but it was not in her power to give any information of so satisfactory a nature as the compliment deserved.
Upon this information, they instantly passed through the hall once more...
She began now to comprehend that he was exactly the man who, in disposition and talents, would most suit her. His understanding and temper, though unlike her own, would have answered all her wishes. It was an union that must have been to the advantage of both; by her ease and liveliness, his mind might have been softened, his manners improved; and from his judgement, information, and knowledge of the world, she must have received benefit of greater importance.
And will you give yourself the trouble of carrying similar assurances to his creditors in Meryton, of whom I shall subjoin a list according to his information?
But to live in ignorance on such a point was impossible; or at least it was impossible not to try for information.
but to her own more extensive information, he was the person to whom the whole family were indebted
Darcy was delighted with their engagement; his friend had given him the earliest information of it.
"Did you speak from your own observation," said she, "when you told him that my sister loved him, or merely from my information last spring?"
Bingley looked at her so expressively, and shook hands with such warmth, as left no doubt of his good information.
The joy which Miss Darcy expressed on receiving similar information, was as sincere as her brother's in sending it.
What sorts of information-processing machinery can account for the phenomena she was interested in?
Does information have to have a sender and a receiver in order to exist? Can information be received, or acquired, without being sent intentionally? (Which of Jane Austen's examples might be of that sort? What if she had written detective stories?)
Do the examples show that she understood the importance of both control information and factual information? What is the difference?
How can information make something happen?
Exercise: how many varieties of control information can you distinguish in organisms, at various stages of development, learning, behaviour, competition, cooperation, reproduction?
Added: 27 Dec 2013
Samuel Johnson (1709--1784) used this concept of information even earlier: "We know a subject ourselves, or we know where we can find information on it" quoted in Boswell's Life of Johnson, 1791.
Added: 6 Aug 2014 (Frege on Sense & Reference [Sinn/Bedeutung])
Frege introduced a distinction usually translated using the words "sense" and "reference", echoing what earlier philosophers had referred to by distinguishing "connotation" and "denotation", or "intension" and "extension". The distinction is so pervasive that it has probably been re-invented or re-discovered many times, though using different terminology.
However, problems arise when attempts are made to apply it to every possible word or phrase or sign or process that in some sense can be said to convey information or have a meaning.
Examples that cause problems (some of them discussed by Frege) include demonstrative/indexical expressions, e.g. "here", "now", "you", "I", "we"; words that combine sentence fragments to form new larger fragments or whole sentences, or qualify assertions, such as "but", "although", "perhaps", "of course"; proper names; and many others.
A good novelist with a rich and deep command of her language will use all these
hard-to-analyse words and phrases without worrying about philosophers' questions.
However, a complete theory of information, covering all the varieties of
information contents of portions of human languages, will have to make explicit
the roles of the more complex and subtle words and phrases. I have tried to do
this with a little word that causes big problems, "self", here:
"THE SELF" -- A BOGUS CONCEPT
Without a good theory covering all the obvious and unobvious cases we are unlikely to be able to design robots that have minds like ours.
Added: 10 Mar 2015
For an excellent historical overview of varieties of information processing (mainly by humans) since ancient times see:
Shannon's notion of information
Claude Shannon, (1948), A mathematical theory of communication, in Bell System Technical Journal, July and October, vol 27, pp. 379--423 and 623--656, https://archive.org/download/pdfy-nl-WZBa8gJFI8QNh/shannon1948.pdf
Aaron Sloman, What's information, for an organism or intelligent machine? How can a machine or organism mean?, http://www.cs.bham.ac.uk/research/projects/cogaff/09.html#905