Evolution's Uses of DNA
(Part of a discussion of genome replication/expression)
Steps toward a specification of a new super-Turing class (or classes)
of computation, potentially relevant to explaining forms of
natural intelligence so far not replicated in any version of AI
or incorporated into theories in psychology or neuroscience.
Peter Tino provoked this strand in
the Meta-Configured Genome documentation.
Alison Sloman corrected some of my errors and confusions in an earlier draft.
Neither has read this document. All errors are my responsibility.
This is http://www.cs.bham.ac.uk/research/projects/cogaff/misc/dna-uses.html
This is part of a collection of documents on the Meta-Configured Genome (MCG) theory, including an introductory video presentation with discussion here: http://www.cs.bham.ac.uk/research/projects/cogaff/movies/meta-config/
One of the main claims is that ancient geometrical discoveries depended on complex interactions between evolutionary transitions and individual learning that are not explained either by uses of logical reasoning or by any current theories of learning, including "deep learning", which cannot lead to discoveries of spatial impossibilities or necessary connections: e.g. no amount of statistical evidence or probabilistic reasoning will prove that certain compass and ruler operations will necessarily bisect any angle on a planar surface, or prove that one-to-one correspondence is necessarily transitive and symmetric (without which number names would not have their practical uses).
The long term goal is to explain how the meta-configured genome uses chemical mechanisms to influence gene expression to produce information processing mechanisms that do have the required explanatory powers, as conjectured in the MCG theory. One likely consequence is development of a new deep and rich theory of consciousness covering the multitude of forms of consciousness in products of biological evolution, replacing the over-simplified, shallow discussions of consciousness characterised by excessive use of simple labels including "what it's like", or "what it feels like": an infectious meme that has impoverished philosophical and non-philosophical discussions/theories of consciousness in the last half century.
Developing these ideas leads to surprising conjectures about some of the functions of chemistry-based mechanisms involved in reproduction and development of cognitive mechanisms, as opposed to neural nets, which are restricted to learning statistical correlations and deriving probabilities. For instance, they cannot learn that something is impossible. Statistical formalisms cannot even express the notions of impossibility and necessity.
The ideas about evolution of new forms of information processing (as well as new construction kits and their products) seem to me to illustrate Alastair Wilson's ideas relating metaphysical grounding and causation (Wilson 2017).
The MCG presentation explains some of the background to my 20 minute presentation at the 2019 conference on Models of Consciousness in Oxford (https://www.youtube.com/watch?v=0DTYh37U8uE) which included a partial defence of Immanuel Kant's claims about mathematical consciousness (especially spatial consciousness leading to the ancient discoveries in geometry and topology). It also claimed that neither current AI (including logic-based, rule-based, deep-learning-based AI, etc.) nor current neuroscience can explain forms of mathematical consciousness involved in spatial intelligence (e.g. squirrel intelligence, and human toddler intelligence) and ancient mathematical discoveries leading up to Euclid's Elements.
The Meta-Configured Genome theory proposes that later stages of development make essential use of information gained through interactions with the environment during earlier stages of gene expression -- before the information was required. Later the information is used to influence expression of more recently evolved genes, concerned with "higher level" competences. My conjecture is that in some cases chemical processes involved in later stages of gene expression are influenced by chemical structures produced during earlier stages of gene expression, under the influence of the environment. (Contrast this with neural nets whose later stages of training/learning make use of probabilities derived from results of earlier stages.)
This requires innate motive-generation mechanisms to produce the earlier actions without knowing what the information gained will later be used for. The motive generators cannot be reward-based, since many "child motives" (or play/exploration motives) initiate actions whose most important side effects are acquisition of information whose relevance and use cannot be judged at the time. Children playing with toys need not know about, or be motivated by, what they will learn through playing that is useful later in life. They are simply motivated to play, driven by "architecture-based motivation" (ABM), not "reward-based motivation" (RBM), as explained in http://www.cs.bham.ac.uk/research/projects/cogaff/misc/architecture-based-motivation.html (or PDF).
Early proto-linguistic activity (e.g. attending to adult speech sounds and shaping babbling skills in response) is a special case: it has important uses only during later stages of language development, of which the young learner has no conception, unlike an adult learner trying to learn a new language in a new country. I suggest this is an example of motivation triggered by mechanisms produced by biological evolution, not by any expected utility or reward in the child's mind. I.e. it is not reward-based motivation, but architecture-based motivation. (Follow the above link for more details.)
More examples: a child extending its vocabulary and grasp of syntax will later use the acquired competences to achieve goals it could not have conceived of, or been motivated by, during the early learning processes. Such later achievements could include making mathematical discoveries or reading instructions on how to assemble a machine, or winning political debates.
Genome-generated mechanisms and competences
The genetic mechanisms that trigger early behaviours, the mechanisms that store some of the heard, seen, felt, tasted (etc.) results, and the mechanisms that initiate later behaviours making use of previously stored results are all created by mechanisms derived, at different stages, from the genome. The main point of the MCG hypothesis is that those mechanisms are not fully specified by the genome. Instead the genome specifies generic features that are instantiated and made usable by being combined with information acquired during earlier phases of gene-expression, and some acquired later.
In other words the genetic information is parametrised: it cannot be used to produce new entities unless first combined with additional information -- i.e. given parameters, just as numerical addition cannot be done without being provided with at least two numbers to be added. However a theory of addition can be formulated in a general way that can be instantiated when particular numbers are provided. So we can distinguish
In particular, if some of the later processes of gene expression, before brains are developed, are to make use of information gained from earlier processes, the information acquired earlier cannot be stored in (not yet functioning) brains, so it must be stored chemically, ready to be combined with results of later chemical gene expression.
More mature learning at a later stage may use different kinds of information store, some in the environment (e.g. chemical trails in some organisms) and some internal, e.g. newly constructed neural connections, or modifications of previously constructed neural connections, or in sub-neural chemistry if brains are sufficiently advanced.
These very vague remarks are intended to provide motivation for some of the discussion of chemical gene expression below, where the conjectured construction processes combine information derived from the genome (i.e. from DNA) with information derived from earlier interaction with the environment -- using genetically based mechanisms for producing chemical records related to what is learnt during such interactions.
So, whereas gene expression is normally thought of in terms of chemical processes constructing mechanisms whose later behaviours will use fixed chemical structures, e.g. neurones transmitting electrical signals, we also need to allow forms of learning that end with chemical changes instead of only starting from them.
An important benefit of this complexity is allowing chemically stored information acquired (e.g. from the environment) to be used to tailor chemical processes used later in gene expression.
Some versatile plant species develop structural features while growing that are appropriate to features of the environment, e.g. temperature, amount of solar radiation, atmospheric pressure, amounts of water and other chemicals in the soil, etc. Some of the changes can be passed on to offspring, in a reversible or non-reversible way (Heslop-Harrison 1953). Further examples are provided in https://www.nature.com/scitable/topicpage/environment-controls-gene-expression-sex-determination-and-982/
I suggest that also in animals some of the mechanisms of genome expression may acquire information through interaction with the environment that is stored chemically, and can be used later to modify effects of later processes of gene expression. This can also be useful in intelligent animals with complex developmental trajectories that vary according to the threats and opportunities of their early environments.
In young humans, recently developed toys may help to prepare them for activities none of their distant ancestors ever encountered. Could some of that preparation be mediated by brain chemistry during gene expression, rather than neural learning? Compare the ability of very young digestive systems to "learn" to thrive on local diets. Below I'll offer conjectures about mechanisms that could achieve this, used in species with "meta-configured" genomes.
After reading (an early version of) the document on Meta-configured genome expression (to which this file is appended), Peter Tino pointed out that some of the chemical complexities in DNA transcription may be ideally suited to types of parameter substitution, postulated by the meta-configured genome (MCG) theory.
There are differences in chemical processes and mechanisms between gene expression, which occurs in every cell during development, growth and normal functioning of an organism, based on the cell's DNA, and genome replication. Replication of DNA occurs in various contexts, including
(A) reproductive processes in which single strands of DNA molecules from two individuals are combined to form a new merged DNA molecule in a seed or egg, initiating production of a separate new member of the species.
(B) production of new cells in a functioning or developing organism, where each new cell requires a complete new copy of the organism's DNA.
Both of those are different from
(C) gene-expression processes within a fully formed functioning cell in which parts of DNA molecules are used to drive a variety of different molecular processes required in a living organism, e.g. production of proteins.
Online videos explaining some of the details of (A), (B) and (C) are referenced below.
There are many variants of these processes in different types of organism, e.g. single celled organisms and multicellular organisms, both divided into several branching sub-categories, with differing forms of reproduction, as well as different sizes, shapes, environments, behaviours, forms of information processing, etc.
Both genome replication and gene expression occur repeatedly within each living DNA-based organism, though both are different from the sexual reproductive process ((A) above) in which DNA strands from two distinct organisms are combined (in a fertilised egg) to produce a new DNA molecule in the creation of an entirely new organism.
All these processes depend on remarkable features of DNA -- which can be seen as a multi-use interpreted programming language, because a DNA molecule can help to produce/control (at least) three very different processes, described below.
Both replication (within an individual or in production of a new individual) and gene expression during development and maintenance of an organism, involve separating the two strands of a single (very long) DNA molecule, but the processes are very different.
The processes described here are explained more fully and more clearly using graphical displays, in the videos referenced below. As indicated above, there are two uses of genome replication:
(A) making a new organism, starting with an egg cell containing a single new DNA molecule (with one strand from each parent organism in cases of sexual reproduction)
(B) production of new cells while an organism and its body-parts grow or repair themselves, continually making and using new cells containing new copies of the original DNA molecule in the organism (genome replication within an organism).
In genome replication during processes of growth of an organism, (case B), new cells are created with new DNA molecules copied from older DNA molecules in older cells, using mechanisms that separate the two strands of a pre-existing DNA molecule and use each strand as a template to build a matching strand, related roughly as a photograph and its negative (demonstrated in videos).
Each newly created matching strand is then combined with the original strand from which it was derived, so as to produce a new DNA molecule identical with the original molecule. This is part of a process in which two identical new cells are formed, each containing one of the newly created DNA molecules, each of which is (normally) a perfect copy of the original DNA molecule.
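The copying step just described can be sketched in a few lines of Python. This is only a toy model under strong assumptions: a DNA molecule is represented as a pair of strings, the function names `matching_strand` and `replicate` are invented for illustration, and strand direction, enzymes and copying errors are all ignored.

```python
from typing import List, Tuple

# Base-pairing rule: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Toy representation (an assumption): a molecule is a pair of strands.
Molecule = Tuple[str, str]


def matching_strand(template: str) -> str:
    """Build the strand that pairs base-by-base with the template strand."""
    return "".join(COMPLEMENT[base] for base in template)


def replicate(molecule: Molecule) -> List[Molecule]:
    """Separate the two strands and pair each old strand with a new partner."""
    strand_a, strand_b = molecule
    new_partner_for_a = matching_strand(strand_a)
    new_partner_for_b = matching_strand(strand_b)
    # Each resulting molecule keeps one old strand and gains one new strand.
    return [(strand_a, new_partner_for_a), (new_partner_for_b, strand_b)]


original = ("ATGCCGTA", matching_strand("ATGCCGTA"))
copy_1, copy_2 = replicate(original)
print(copy_1 == original and copy_2 == original)  # True
```

Both results are identical to the original molecule, although each contains only one newly built strand.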
If the two cells inhabit slightly different environments (e.g. adjacent but different locations in an organism) their subsequent development can diverge, producing body parts with different functions.
Cell development and maintenance (within a cell)
When a cell is not replicating itself by making new complete two-strand copies of its DNA molecule, it can instead use parts of the DNA molecule, in a different copying process, gene expression, in which many temporarily unzipped portions of the DNA molecule are used as templates for production of a large variety of new molecules (proteins) required for other functions than reproduction of cells. Those functions will vary in different parts of the organism, including providing bones or stalks for size and strength, providing skin or other exterior protective covering materials, responding to infection or damage, and many more products, whose complexity and diversity increase as an organism grows, and sometimes also increase across generations, as organisms become more physically and chemically complex and able to produce more complex behaviours, including interacting with more complex environments. This is also illustrated in a video below.
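The contrast with whole-molecule copying can be made concrete with a small Python sketch of expressing one region of a DNA sequence. It rests on several simplifying assumptions: the codon table below is only a five-entry excerpt of the standard genetic code, the example sequence and the names `transcribe` and `translate` are invented for illustration, and transcription is modelled by reading the coding strand directly (replacing T with U) rather than by pairing against the template strand.

```python
from typing import List

# Excerpt of the standard genetic code: only the codons used below.
CODON_TABLE = {
    "AUG": "Met",
    "GCU": "Ala",
    "UGG": "Trp",
    "AAA": "Lys",
    "UAA": "STOP",
}


def transcribe(coding_strand: str, start: int, end: int) -> str:
    """Copy one region of the DNA into messenger RNA (T becomes U)."""
    return coding_strand[start:end].replace("T", "U")


def translate(mrna: str) -> List[str]:
    """Read the mRNA three bases at a time, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# Hypothetical sequence: a 15-base "gene" flanked by two bases on each side.
dna = "GGATGGCTTGGAAATAACC"
mrna = transcribe(dna, 2, 17)
print(mrna)             # AUGGCUUGGAAAUAA
print(translate(mrna))  # ['Met', 'Ala', 'Trp', 'Lys']
```

Only part of the sequence is used, and the product is a different kind of molecule, whereas replication copies both strands completely.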
These processes may be partly parameterised, i.e. incompletely specified in the DNA and its direct derivatives, leaving "information gaps" to be filled by chemically encoded information derived from the organism's environment, or stage of development, or in some cases previous infections, or possibly also side effects of physical damage affecting growth and behaviour.
In other words, if infections, physical damage, or chemicals in an organism's diet can modify processes of gene expression, then perhaps additional, benign, information about the environment can also influence chemical processes of gene expression. Evolutionary processes may have taken advantage of this possibility to allow quite complex aspects of relatively late development to be strongly influenced by information gained earlier during development and stored chemically for later use. This form of learning, in which generic evolved competences or features are made more specific during individual development, may turn out to be at least as important as alterations of synaptic weights produced by frequency of stimulation. (This suggestion was inspired by a comment made by Peter Tino after he had read earlier research papers postulating important roles for chemistry during development. The ideas need to be made far more precise on the basis of empirical data about developmental processes.)
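The idea of a generic specification with "information gaps" filled from an earlier stage can be illustrated, very loosely, as parameter substitution in Python. Everything in this sketch is hypothetical: the dictionary standing in for a chemical store, the template string standing in for a late-expressed generic specification, and the names `early_expression` and `late_expression` are illustrative assumptions, not a model of any known biochemical mechanism.

```python
from typing import Dict

# Hypothetical generic specification with one gap to be filled later.
GENERIC_SCHEMA = "syntax-builder tuned to {speech_sounds}"


def early_expression(environment: Dict[str, str]) -> Dict[str, str]:
    """Early stage: interaction with the environment leaves a stored record."""
    return {"speech_sounds": environment["ambient_language"]}


def late_expression(schema: str, store: Dict[str, str]) -> str:
    """Late stage: combine the generic schema with the stored record.

    Raises KeyError if the required parameter was never acquired.
    """
    return schema.format(**store)


# The same schema, instantiated in two different environments.
store_a = early_expression({"ambient_language": "Xhosa"})
store_b = early_expression({"ambient_language": "Welsh"})
print(late_expression(GENERIC_SCHEMA, store_a))  # syntax-builder tuned to Xhosa
print(late_expression(GENERIC_SCHEMA, store_b))  # syntax-builder tuned to Welsh
```

The point of the sketch is only that the schema alone produces nothing: with an empty store the late stage fails, and with different stored records the same schema yields different products.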
Sexual reproduction using two genomes
In collaborative genome replication (A), DNA molecules from two organisms of the same species (usually, but not necessarily, with sexual dimorphism) are used to make a single new DNA molecule derived from one strand of the DNA from each parent. The newly created organism then has a single new DNA molecule that combines parts from each parent. If there are twins, triplets, etc., then each of the offspring normally has DNA based on one strand from each parent. Once the new cell is formed, processes of type (B) can grow a new individual, with genetic materials from both parents, possibly leading to new (A) processes in adulthood if mating occurs.
Example 3 corresponds to Peter Tino's remark that the complexities in DNA transcription are ideally suited to types of parameter substitution during gene expression. How such parameters might be derived from information acquired previously by a learner at an earlier stage of development, as required to conform to the Meta-configured Genome, is an interesting question!
Video tutorials explaining the above processes
There are many online video tutorials helping to explain these processes for non-experts. I have found a small (somewhat arbitrary) selection illustrating the points made above using dynamic graphics, providing much more detail than I have. (Please let me know if you find flaws in any of these, especially if you can suggest better -- but still fairly short -- alternatives. If I get many suggestions for alternatives, I may collect them in a separate location (with acknowledgements) accessible from a pointer added here.)
Within organism genome (DNA) replication video
This video shows (schematically) how DNA is copied, within a cell (part of the process of producing a new duplicate cell whose features may diverge later as its local environment in an organism changes). The video shows (in a simplified form) how both strands of the original DNA helix are unzipped and copied, then re-combined, to produce two new identical DNA molecules (each is actually only half new). This is Case 2 in the above diagram.
Within organism genome (DNA) transcription video
Transcription and Translation: From DNA to Protein
This gives an impression of some of the complexities involved in temporarily unzipping parts of a DNA molecule to enable a sub-sequence to be used as a template for synthesising a new molecule (a protein), partly by including various pre-existing non-DNA molecules. Many different proteins can be synthesised within a cell, based on different regions of the DNA molecule, and used to construct new materials and body parts during processes of growth, normal functioning, and repair. Notice how that process is related to, but in its outcomes very different from, the process of producing a new copy of the DNA molecule by copying each strand completely. One process is more like photocopying a text, the other more like comprehending and following instructions in the text. A more detailed (but still simplified) presentation of the processes is shown in the video. This is Case 3 in the above diagram.
Introduction: What is life?
This shows some of the key sub-cellular structures and processes in living things, and discusses conditions for the origin of life on this planet, emphasising multiple levels of structure/complexity.
Illustrates early (proto-)life forms, and discusses possible processes by which reproducible life forms may have arisen on this planet (including the Miller-Urey experiment).
Illustrates several of the steps in copying a cell to make two cells (mitosis), including duplicating DNA.
(Longer -- 15:50 minutes)
Transcription from temporarily unzipped DNA to RNA: part of the process of using DNA to specify non-DNA molecules to be synthesised for use in a cell. Also shows transfer of new molecules out of the nucleus of the cell into the cytoplasm, where there are various jobs to be done.
More details on transcription and translation in production of proteins, including production of temporary structures required for the process and later discarded.
Some of the differences between gene reproduction and gene expression are also spelled out in this video (15:53 minutes): https://www.khanacademy.org/science/high-school-biology/hs-molecular-genetics/hs-rna-and-protein-synthesis/v/rna-transcription-and-translation
See also Ogryzko(2008).
Complexities of gene expression not mentioned here, but very relevant to the general Meta-Morphogenesis project are shown in some online videos from BBC and other sources, e.g.:
How Did Insect Metamorphosis Evolve?
Added 26 Dec 2019
See also these compelling clips from a BBC TV documentary on metamorphosis by David Malone (which I have not seen unfortunately):
(I wish producers of serious science documentaries would not add insulting spurious sound effects that make them such a struggle to listen to for growing numbers of people with age-related hearing loss (presbycusis). See also: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/bbc-learning.html)
[If there are better tutorial web sites explaining these different processes and their reproductive, developmental and cognitive functions for non-experts please let me know. I am not an expert in this area. I have tried merely to provide a useful very basic introduction to the mechanisms and functions of genome replication relying on previously available online videos. Please also inform me of errors/flaws or gaps in my attempts to summarise the different processes derived from DNA.]
Diversity of products of evolution
The products of gene expression will be different in different organisms (with some overlaps in similar types of organism), and will also differ between parts of the same organism, e.g. when making skin, hair, muscle, bone, nerve cells, eye-lenses, blood vessels, bark, leaves, petals, roots, digestive juices, and many, many more. These products will be enormously varied within a single complex organism, and even more varied across different sorts of organisms. The Meta-Configured Genome theory suggests that in some species (especially those with extended infancy/childhood) the products of later transcription from a particular portion of DNA may be strongly influenced by results of earlier processes of gene expression, in which interactions with the environment produce information somehow stored chemically (i.e. not in synaptic weights) for use during later gene expression.
Well known examples include acquired resistance to certain types of infection. Some forms of resistance are inherited through the genome (from either or both parents); some are acquired from the mother during development of a foetus. Others are acquired after birth when inherited mechanisms react to infections by developing tailored defences. These are all examples of compositionality, insofar as some inherited abstract structure later combines with acquired parameters to produce something new. In language development these processes occur in the reverse order: slot fillers are acquired before the structures develop containing slots that need to be filled.
For example, a child learning to use new complex syntactic constructs (e.g. hypothetical conditionals, "although" constructs, and other complex sentence forms, like the one in which this comment is embedded) will need the new linguistic construction mechanisms to assemble components acquired during earlier interactions with the environment to create new complex syntactic and semantic structures. The products of those mechanisms can vary between linguistic communities using very different vocabularies and grammars. Such genome-driven construction processes combining new and old molecular structures (new ones derived from the genome and old ones from previous gene-expression processes influenced by the environment) must be chemistry based, not neural-net based.
The MCG hypothesis implies that all of this is relevant not only to language development but to many other forms of cognitive development -- including development of increasingly complex forms of spatial reasoning, and later on studying or making discoveries in mathematics, biology, philosophy, or learning new (secondary) languages, social customs, values, .....
In all cases, instead of one learning mechanism using more and more data there are different mechanisms, evolved at different times, and produced in schematic form from the genome at different stages of development, making use of information structures acquired or produced at earlier and later stages.
The theory is potentially relevant not only to cognitive (including linguistic) development in humans but also complex structured behaviours in other species, e.g. bird song, building of complex nests by various species of birds, collaborative hunting in carnivores, types of mating display or competition for mates, and many more.
Discussion: some implications
-- Genome replication processes
are crucial both for reproduction, i.e. formation of an egg or seed that can grow into a new individual organism, and also for creation of new cells in a developing organism, which must contain a copy of the original DNA to continue the process of growth through cell division, in a biologically consistent manner.
-- Gene (or genome) expression
What the (identical) copies of an original DNA molecule happen to contribute in different cells in different parts of the body can vary enormously depending on the location of the cell in the larger organism and the function or functions of that part, which can vary enormously between organisms. It will also vary between different parts of a complex organism, e.g. muscle tissues of various sorts, skin, hair, nerve cells of different sorts, bone, parts of digestive organs, and the very different parts of single-celled organisms, plants, fungi, and many more.
My main conjecture is that mechanisms of the sort discussed here will eventually be shown to explain the ancient processes of mathematical discovery leading up to Euclidean geometry, ancient number theory, and beyond, and will build on results of earlier processes of development of spatial intelligence shared with other intelligent species (e.g. squirrels, orangutans, crows, and also pre-verbal human crawlers and toddlers with spatial manipulation skills, including some of the examples here: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/toddler-theorems.html).
That's because, over time, repeated uses of abstract specifications (from late-evolved DNA) instantiated using environmental parameters (or internally derived parameters influenced by environmental information) allow more powerful developmental trajectories than can be achieved simply through use from birth of a general purpose learning mechanism. This is explained and illustrated in the latest introduction to the Meta-Configured Genome idea. All of these processes depend on the various uses of components of DNA in different parts of a developing organism at different stages of development.
Is there an outstandingly good online introduction to mechanisms of gene expression, i.e. showing non-experts how genes contribute different materials and functions to different parts of organisms, in different processes of reproduction, development, and living, including information processing functions, subsuming all the above fragments but not requiring more than a few hours of study?
A serious gap in the theory: Explain ancient mathematical discovery
An important gap in the explanations so far is the lack of an account (e.g. a single detailed example) of how the above processes of gene expression can produce mechanisms able to support discoveries made by Archimedes, Zeno, etc. I suspect this will turn out to depend crucially on sub-cellular (sub-neural) processing mechanisms rather than the statistics-based neural nets that are now increasingly popular. Statistics-based mechanisms (including deep learning mechanisms), however, cannot explain mathematical discoveries, since they cannot express impossibility or necessity.
I hope the above videos and comments help to explain some of the diversity of biological processes and mechanisms presupposed by the Meta-Configured Genome hypothesis, explained in this tutorial file:
and related online documents and presentations, including the video of my talk at the Oxford Conference on Models of consciousness https://www.youtube.com/watch?v=0DTYh37U8uE (Why current AI and neuroscience fail to explain ancient forms of spatial reasoning)
The full set of videos of talks at the conference is available here: https://www.youtube.com/channel/UCWgIDgfzRDp-PmQvMsYiNlg/videos
and an older presentation of the Meta-Configured Genome theory that doesn't go into the varying functions/roles of DNA in the processes:
(For now that lacks important details that are now presented here using a video: http://www.cs.bham.ac.uk/research/projects/cogaff/movies/meta-config/)
Implications for theories of consciousness
It is (or should be) obvious to scientists and philosophers that the complexity of the processes outlined in this document is so great that there is much scope for the processes to go wrong in different ways, at different stages, e.g. during reproduction, during development, during application of developed competences, etc. In principle this could lead to some new classifications, and associated explanations, especially of genetic developmental disorders including autism, Williams' syndrome, teenage psychosis, dyslexia, and perhaps many others.
Implications for genetic brain disorders
It should be obvious that the above processes are so complex that there are many ways in which they may "go wrong", or be abnormal, at all stages of development. If the processes of expression of some of the genes expressed relatively late go wrong (e.g. in teenage psychosis?), either because of defects in the part of the genome specifying those developments, or because something has gone wrong at an earlier stage that influences the later stage of gene expression, or because of some broad spectrum disorder caused by malfunctions in chemical processes, then it may, in some cases, be possible to detect advance signs that allow medical (chemical?) intervention to correct the process before it starts. (I suspect that's unlikely in most cases.)
If the ideas presented here lead to new, more detailed, descriptions of many of the processes that occur in normal development, they may also help with classification and identification of various kinds of developmental disorders, caused either by genetic abnormality, early trauma, or poor environments (e.g. abuse or neglect of infants or children).
It is well known that there are many genetic defects that arise out of flaws in gene-expression processes. Some of these are physical disorders, e.g. conjoined twins and other physical/physiological abnormalities. I don't know whether researchers working on those problems are already linking them to details of the normal developmental processes sketched in outline in this document.
I expect that an extended version of these ideas will show how different forms of information processing are produced in organisms that vary in their evolutionary history, and in organisms that vary in their environment-influenced individual development, and this will produce a deeper, richer theory of biological consciousness than any produced so far, including explanations of previously unexplained types of mathematical consciousness that have been driving my own research.
It's time to give up "what it's like (or feels like)" theories of consciousness and instead switch to "what it achieves and how it achieves it" theories, where what's achieved is a large variety of naturally occurring forms of consciousness, possibly extended in the distant future to new artificial versions in robots, including robot mathematicians.
V.V. Ogryzko, 2008, Erwin Schrödinger, Francis Crick and epigenetic stability, Biology Direct, 3, 15, http://doi.org/10.1186/1745-6150-3-15
"A shift from the signal transduction paradigm to the epigenetic one might be useful for the study of many other protein modifications and even of interactions between macromolecules."
Alastair Wilson, 2017,
Nous (online version)
See also the references in
(To be extended.)