Series Editor

Marie-Christine Maurel

Trajectories of Genetics

Bernard Dujon

Georges Pelletier

Introduction

The applications of genetics are invading our daily lives. Whether it is for prenatal diagnosis, agronomy or forensic science, we appreciate the accuracy of genetic methods at the same time as we fear their power. What we are able to accomplish today was unimaginable not long ago. We are entering the era of personalized medicine and genetic therapeutics at the same time as we are leaving the empiricism that has prevailed until now for the description and exploitation of the biosphere. And progress is accelerating. But what is it all about? How can we understand what is happening if we do not have clear notions about the fundamental principles of the living world? Principles that have only been revealed slowly to scientists, whose investigations have often followed complex trajectories, are not very explicit to non-specialists.

When we talk about heredity, common language gives the appearance of simplicity. We hear, for example, that this child with light eyes inherited the genes from his/her grandmother, who also has light eyes. That’s implied, everyone in the family knows it. This simple sentence hides the complexity of the gene. On the one hand, the gene is correctly perceived as this somewhat mysterious element transmitted from generation to generation, even skipping generations, because it’s the grandmother we’re talking about. On the other hand, the gene is here confused with the observable trait referred to as a phenotype*. Here, the eyes are light, but in the population there is a variety of shades, not just light eyes as opposed to dark eyes. Is there a gene for every shade? This seems very difficult to imagine. And then genetics also tells us that this child with blue eyes inherited exactly as many genes from his grandmother with light eyes as from each of his other grandparents who all had dark eyes. So, what is a gene? How does it work?

It was long believed that blood was the carrier of hereditary traits, hence the term “consanguineous”, which is still used to designate kinship. We now know that the carrier is DNA* (deoxyribonucleic acid), a macromolecule present in all our cells. An adult human being contains tens of thousands of billions of copies of it, all derived from the unique molecules present in the egg from which he or she developed. With the progress of genetics, the acronym DNA has become commonplace. It has become a common term for the essence of a personality or even a thing. We hear about the DNA of a company or the DNA of a sports club, which (literally) makes no sense. DNA only exists in living organisms. Moreover, even for personalities, invoking the DNA of an artist to signify his/her talent or the DNA of a child passionate about horseback riding to justify his/her taste for horses adds nothing more to the understanding of the causes than talking about blood, except the appearance of being educated, albeit poorly educated. For the transition from blood to DNA has been accompanied by a considerable conceptual revolution that is totally ignored here: the particulate nature of the determinants of hereditary traits*. Genetics was born when the old vision of mixing “parental fluids”, which was invoked to explain the heredity of the characters of the descendants, was replaced by the notion of “particles” of heredity. While these imprecise fluids gave the illusion of continuity, each particle, which would later be called a gene, became a separate entity independent of the other particles with which it mixes in discrete proportions, quantifiable by the experimenter. This was clearly understood by Gregor Mendel as early as 1865. But then what is the relationship between the gene and DNA?

I.1. The multiple facets of the gene

A simple answer is not to be expected. The gene is a concept, but a concept that can be manipulated in a test tube! The following chapters describe the evolution of ideas as the nature, organization and functioning of the genetic material became clearer. Where do we stand today? The gene remains counterintuitive because of three main obstacles to its understanding: one operational, another temporal and the third essential.

The first obstacle is that the gene is both a DNA segment and a functional entity. In other words, its very nature changes depending on how it is examined, a bit like elementary particles in quantum physics, which can behave both as particles of matter and as waves. However, while DNA is a molecule that is perfectly defined at the chemical level (see Chapter 2) and can be manipulated in vitro, the notion of function remains an abstraction specific to biology that applies at different scales from the molecule – an enzyme* for example – to the entire body – eye color for example. And while DNA can be broken down into its elementary components, functions cannot. This makes the gene a primordial unit, a “particle”, but composed of elements that can be separated from one another.

The second obstacle is that the gene is both the element that crosses generations and the element that acts on each of them. At each generation, the functional aspect prevails. Between generations, it is the quality of the transmission of the informational content that matters. This duality is based on two different types of molecules and two different processes. The transmission of the informational content is based on DNA and its duplication (referred to as replication*) before each cell division. The functional aspect is based on the copying of DNA (known as transcription*) into molecules of RNA* (ribonucleic acid), the other category of nucleic acids (see Chapter 2).

This leads us to the third obstacle, because the intergenerational permanence is non-physical, in the form of information – but it is carried by molecules and contained in the organization of their elements. And it is important to understand that at the material level it is not the gene itself that is transmitted from generation to generation, but copies of it, with inevitable potential for error, that is, mutation*. The gene, an element of permanence, thus automatically becomes an element of variation. Consequently, over the successive generations of a lineage, and therefore within populations of a species, the same¹ gene will acquire multiple forms, known as alleles*. This very important notion makes genetics “the science that uses variation to study permanence”, as Philippe L’Héritier, one of the pioneers of this discipline in France, put it very well.

In summary, one could say that the gene is the information necessary to perform a biological function, written in a nucleic acid* and transmitted from generation to generation with a certain degree of imperfection. But where do the phenotypic traits come from then, if the light-eyed child has inherited as many genes from each of his grandparents, regardless of their eye color?

I.2. Genotype and phenotype: the reality of genetic determinism

We are entering into phenomena that were only slowly understood through a century and a half of research. Let us immediately eliminate the simplistic vocabulary used on the Internet or by some media that talks about genes for anything as soon as humans are involved. There is a jumble of genes for violence, depression, wanderlust, gluttony, generous buttocks, cowardice, resilience, rebellion, slimness, obesity, crime, schizophrenia, homosexuality, mathematical intelligence, etc. Curiously, there is no gene for the stupidity that nevertheless characterizes this list. The reason for setting this vocabulary aside is that genetic determinism is anything but as simple and direct as it suggests.

First, because each gene can take many forms, it is the alleles that must be considered. Second, because many living organisms – including humans – are diploid*, it is the relationship between two alleles that counts: when the effect of one dominates that of the other, we speak of dominance* of the first and recessivity* of the second, but it is also possible that the effects of one and the other are added in variable proportions. Finally, because biological functions are highly intertwined, it is the interaction between alleles of different and sometimes many genes that comes into play. For example, more than 400 different genes are involved in determining the height of human adults, each gene being represented in the population by many alleles that are paired in each individual. It is therefore not surprising that the height of the individuals in a population forms an apparent continuum. Is it always so complex? No. Cystic fibrosis is caused by the dysfunction of a chloride ion transporter of the pulmonary epithelium which is the product of a precisely defined gene, located on our chromosome* 7. We now know about 2,000 alleles of this gene that may be linked to the severity of the syndrome. In this case, the alleles of this single gene explain almost all observed phenotypes. The same is true for a complex developmental phenotype, such as the transformation of antennae into legs in the Drosophila fly, following the mutation of a single gene.
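The contrast between a single gene with dominance and a trait to which many genes contribute additively can be sketched in a few lines of Python. This is only a toy illustration and not a model taken from this book: the allele symbols ("A", "a", "+", "-"), the equal effect of every "+" allele and the choice of 400 genes with two equally frequent alleles each are all simplifying assumptions.

```python
import random

def phenotype_dominant(allele_pair, dominant="A"):
    """Single gene, complete dominance: one copy of the dominant allele suffices."""
    return "dominant trait" if dominant in allele_pair else "recessive trait"

def phenotype_additive(genotype, effect=1.0):
    """Additive polygenic model (assumption): every '+' allele adds the same effect.

    genotype is a list of allele pairs, one pair per gene of a diploid individual.
    """
    return effect * sum(pair.count("+") for pair in genotype)

def random_genotype(n_genes, rng):
    """Draw two alleles ('+' or '-') at random for each of n_genes genes."""
    return [(rng.choice("+-"), rng.choice("+-")) for _ in range(n_genes)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
values = [phenotype_additive(random_genotype(400, rng)) for _ in range(1000)]

print(phenotype_dominant(("A", "a")), "/", phenotype_dominant(("a", "a")))
print("distinct values among 1000 individuals:", len(set(values)))
```

With one gene and dominance, only two phenotypic classes appear. With several hundred additive genes, the simulated individuals spread over many closely spaced values, which is how an apparent continuum can arise from discrete alleles.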

In fact, the vast majority of phenotypic traits depend simultaneously on genes with major effects – the easiest to identify – and genes with minor effects, whose lists are generally not complete. The power of genetic analysis is highly dependent on the organisms to which it applies, their degree of fertility and the genetic polymorphism existing within natural populations. In many cases in humans, the mutations in all the genes already identified do not account for all individuals with a particular phenotypic trait. This gap is referred to as the missing heritability* of the trait. Filling it is an important objective of current human genetics.

Before going any further, it is necessary to reconsider what phenotypic traits are. A plant placed on poor, arid soil will generally grow less well than its genetic twin (its cutting, for example) placed on rich, suitably irrigated soil. Similarly, an undernourished animal will suffer growth difficulties. These banal findings are important in practice, but of no interest to us here. The comparison of phenotypes between individuals has a genetic meaning only when these individuals are placed under comparable external conditions. But even under these conditions, the genetic determinism of the phenotype is not direct. In reality, genetically identical individuals (members of an inbred line or a clone* or simply identical twins) placed under identical conditions will never be strictly identical phenotypically. There is always, for each phenotypic trait considered, an individual variation around the average value. Within a genetically homogeneous population, this variation generates a statistical distribution specific to this trait. Depending on the trait considered, the variance of the distribution will be more or less large. It is therefore essential to understand that what is transmissible to the offspring is the form and parametric values (mean, variance) of these distributions, not the precise phenotype of each individual. It was this fundamental observation that led Wilhelm Johannsen to define genotype* and phenotype* as early as 1903 (see Chapter 1). When we talk about genetic determinism, we must therefore keep in mind that the genotype, an element of intergenerational permanence, does not directly define the phenotype of each individual, but defines the statistical distribution of the phenotypes of all individuals who carry this same genotype, placed under the same external conditions.
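The idea that a genotype fixes a distribution rather than an individual value can be made concrete with a small simulation. The sketch below rests on assumptions that are not in the text: the distribution is taken to be Gaussian, and the mean (100) and standard deviation (5) are arbitrary numbers chosen for the example.

```python
import random
import statistics

def sample_phenotypes(genotype_mean, genotype_sd, n, seed):
    """Phenotypes of n genetically identical individuals in the same environment.

    The genotype is represented only by the parameters of a distribution
    (assumed Gaussian here); each individual is one random draw from it.
    """
    rng = random.Random(seed)
    return [rng.gauss(genotype_mean, genotype_sd) for _ in range(n)]

# Two successive generations of the same clone: same genotype, same parameters.
generation_1 = sample_phenotypes(100.0, 5.0, n=5000, seed=1)
generation_2 = sample_phenotypes(100.0, 5.0, n=5000, seed=2)

for name, gen in (("generation 1", generation_1), ("generation 2", generation_2)):
    print(name, round(statistics.mean(gen), 1), round(statistics.stdev(gen), 1))
```

No individual of the second generation reproduces the exact value of an individual of the first, yet the mean and the spread of the two generations agree closely. In this sketch, it is these parameters that stand for what is inherited.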

Today, we know how to fully define an individual’s genotype by fully sequencing his or her genome*. For a diploid, the two alleles of each gene must obviously be individually sequenced, which has only recently become possible. However, except in specific cases of prenatal diagnosis, for example, it is still difficult to predict the resulting phenotype, as we are far from knowing all the interactions that can exist between the alleles of all genes. Genetic counselling therefore remains probabilistic.

There is also a time dimension, because the functional state of a gene depends not only on its allelic form (and possibly other genes of the same genome), but also on its functional state during the previous generation. These are called epigenetic* effects. The molecular mechanisms responsible for them will be discussed in the following chapters. At this stage, it is sufficient to recall that epigenetic phenomena are themselves genetically determined and, for some of them, not only by the genotype of the individual concerned, but also by those of previous generations.

I.3. The products of genes

One would naively imagine that genes have a wide variety of products, given the number and diversity of observable phenotypic traits. It is exactly the opposite. There is only one type of direct gene product: RNA. Native RNA molecules copy the information from the DNA of the genes but, depending on their nucleotide sequences*, they then engage in very different functional pathways. It is therefore actually RNAs and not the genes themselves that are responsible for all the functions of living cells. But these RNA molecules have a limited lifetime and, with few exceptions, are not passed on to the next generation because they do not replicate. They therefore do not ensure intergenerational permanence. This is not the case with the RNA molecules of some viruses* that replicate, invade cells and can be transmitted to offspring.

Among the functional pathways in which RNA molecules engage, one leads to the synthesis of another category of macromolecules made up of amino acids* linked to one another in specific orders: the proteins*. Protein synthesis is a complex chemical process that involves several types of RNA molecules acting together in a coordinated manner. During this process, the information made of the succession of nucleotides* in an RNA molecule, known as the messenger RNA* sequence, is translated into information made of the succession of amino acids in a protein, that is, another sequence.

Since nucleotide sequences offer a practically unlimited number of combinations, the theoretical diversity of proteins on which phenotypic traits ultimately depend is unlimited. This explains how, through a single mechanism, genes can be at the origin of all phenotypic traits. But how is this passage of information from RNA to proteins, known as the translation* process, achieved?

This is where the genetic code* comes in. This term is often used inappropriately by the public, leading to a total misunderstanding of genetics. The genetic code is not the informational content of a living organism (its genome) that is unique to it and differs between species and even between members of the same species. It is a universal deciphering code, common to all living organisms currently known. The genetic code, like all decryption codes, is a set of simple rules that establish the correspondence between messages made up of elements of different natures. It is somewhat like a message made up of a series of dots and dashes that can be deciphered in Latin script using Morse code. Here, the genetic code establishes the correspondence between a succession of nucleotides in a nucleic acid, a messenger RNA molecule, and a succession of amino acids in a protein that is its translation. The genetic code does not determine the nature of the synthesized protein. This depends only on the information carried by the messenger RNA, which in turn comes, through a sometimes complex path, from the information carried by the gene. To say, as we too often hear, that rewriting a genome modifies the genetic code of a cell is therefore a complete misinterpretation, which amounts to saying that the Morse code has been changed. Fortunately, this is not the case: just as any telegraphic communication would become impossible if the Morse code were changed, the cell would immediately become unable to decipher its own genome and would disappear!

For reasons that will be explained in this book, the genetic code establishes the correspondence between sequences of three nucleotides (there are 4³ = 64 different ones) and the 20 main amino acids that make up proteins. It should be noted that the universality² of the genetic code in the known living world demonstrates the common ancestral origin of the protein synthesis mechanism used by modern living organisms. It is a mechanism that is practically fixed in terms of evolution, which is extremely rare in the living world. This universality allows the transfer of genes between different contemporary organisms, a phenomenon called transgenesis*, which, long before its artificial use, played a critical role in natural biological evolution. In its absence we would not be here to talk about it.
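The distinction between the code (a fixed table of rules) and the message (the sequence being read) may be easier to see in a short program. The sketch below builds the standard genetic code as a lookup table and uses it to translate a messenger RNA. It is deliberately simplified: reading starts at the first nucleotide rather than at a start codon, the example sequences are invented, and the names CODON_TABLE and translate are chosen here for illustration only.

```python
from itertools import product

BASES = "UCAG"  # the four RNA nucleotides
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# The code itself: a fixed correspondence between 64 codons and amino acids.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Read an mRNA three nucleotides at a time until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(len(CODON_TABLE))                # 64 codons, i.e. 4**3
print(len(set(AMINO_ACIDS) - {"*"}))   # 20 amino acids
print(translate("AUGGCCUUUUAA"))       # MAF
print(translate("AUGGCCUUAUAA"))       # MAL: one nucleotide changed in the message
```

Changing one nucleotide of the message changes the protein, while CODON_TABLE is never touched. This mirrors the point made above: modifying a genome rewrites a message, not the code used to read it.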

But then, if all living organisms use the same genetic code and can sometimes even exchange genes, can we talk about human, cat, grasshopper, wheat, parrot or fish genes? And what is a species?

I.4. Unity and diversity of the living world

Historically, the notion of living species preceded that of biological evolution. Originally, species were defined as sets of individuals with common characteristics that were sufficiently distinctive from those of other sets that, therefore, constituted other species. If these characteristics changed over time, then new species could appear, hence the idea of transformism and evolution. For organisms that reproduce sexually, it is the viability and fertility of hybrids that indicates if both parents are members of the same species or of two different species, as if there were reproductive barriers between species. How then to reconcile these observations? Genetics was born in this historical context and very quickly sought to rationalize the bases of speciation by the idea of incompatibility between alleles of different genes, assuming that the same genes should be present in different allelic forms in different species. As a result, there are no longer any human, cat, grasshopper or wheat genes, etc., but only different forms of the same genes after divergence of their sequences due to mutations that have accumulated over successive generations from their common ancestors. Obviously, as the number of generations increases, the accumulated differences make it more and more difficult to recognize their common ancestry, up to a limit beyond which it can no longer be detected. But below this limit, it will be possible to reconstruct their probable relationships by comparing their sequences. The idea of trying to reconstruct genealogical links between species is not new. Antoine Nicolas Duchesne gave a first concrete illustration of this idea in his Histoire naturelle des fraisiers in 1766, concerning the strawberry species that he had studied and cross-bred. This idea of species genealogy, taken up mainly by Jean-Baptiste de Lamarck and Charles Darwin, was developed by their immediate successors, including Ernst Haeckel, who proposed the first tree of life as it was then known. With the sequencing of genomes, the search for evolutionary filiations has recently taken on a considerable scale, to the point that our ideas on the different branches of the evolutionary tree of the living world have become much clearer, sometimes modifying older trees (see Box I.1).
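The principle of inferring relationships from sequence comparison can be sketched with the simplest possible measure: the number of positions at which two aligned sequences differ. The three sequences below are invented for the example, the species names are hypothetical, and real analyses rely on alignment and on far more elaborate evolutionary models than this count.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two aligned sequences of equal length differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))

# Invented toy sequences of the "same" gene in three hypothetical species.
sequences = {
    "species_A": "ATGGCCTTTAAGGAC",
    "species_B": "ATGGCCTTCAAGGAC",
    "species_C": "ATGACCTTCAGGGAT",
}

# Pairwise distances between all species.
distances = {
    (s1, s2): hamming(sequences[s1], sequences[s2])
    for s1, s2 in combinations(sequences, 2)
}
closest = min(distances, key=distances.get)

print(distances)
print("closest pair:", closest)
```

Here species_A and species_B differ at a single position, whereas each differs from species_C at three or four. Under the naive assumption that differences accumulate steadily, the first two would be grouped together as the most recently diverged.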

As early as 1937, Édouard Chatton differentiated between organisms whose cells have a nucleus* containing the chromosomes, which are thus isolated from the cytoplasm, and others whose cells are not compartmentalized, leaving the chromosomes directly in the cytoplasm. The first group, called eukaryotes*, includes animals, plants, fungi, algae, but also a very wide variety of unicellular microorganisms that are generally not well known to the public. Eukaryotic cells also contain other organelles* that carry their own genes: mitochondria* and chloroplasts*. The second group, historically called prokaryotes*³ because their greater simplicity could suggest that they were more primitive, consists almost exclusively of unicellular microorganisms initially grouped together under the name of bacteria*. The first sequence comparisons confirmed the uniqueness of eukaryotes, but distinguished two groups among the prokaryotes, one of which, now called archaea*⁴, had been much less studied than the other, which retained the name bacteria. The living world would therefore be made up of three main subdivisions, whose order of separation remains a subject of discussion. Archaea, bacteria and eukaryotes are themselves subdivided into many subgroups. Eukaryotes contain at least five, perhaps eight, subgroups of which animals and plants represent only two particular lines among a much larger number of single-celled microorganism lines. This classification ignores the world of viruses, because they are not living cells, but molecular particles that must infect living cells to multiply. We know of viruses that infect archaea, others that infect bacteria and others that infect eukaryotes, each with more or less broad host specificities. We are far from knowing all the existing viruses and it is likely that new ones are constantly being formed. The latest results of ocean exploration suggest that viruses alone represent the largest mass of the living world.

The contributions of the three groups of cellular organisms and viruses to the development of genetics were very different. As we will see, genetics was born from the study of heredity in plants and initially developed with the study of insects, fungi and a few other organisms, all eukaryotes. It then became molecular, with the study of bacteria and their viruses, called bacteriophages*, which paved the way for fine structure analysis of the gene and provided the first tools that would later be used for genome analysis. Genomic tools now make it possible to address all organisms and their viruses, but there remains a large bias in our present knowledge between the different branches of the living world, hence the possibility for important new discoveries through the study of the lines that have remained practically unexplored. We will see some examples of this. This bias has never called into question the universality of the established principles, but it questions their completeness.

I.5. Permanent changes: lessons from genomes

The spectacular advances in genomics illustrate this question. They teach us that species differ from each other not only in the allelic forms of the genes they share, but also in the presence, absence or multiplication of certain genes. The same applies to members of the same species, albeit on a smaller scale. Two individuals randomly selected from the human species differ not only in the allelic forms of their genes, but also in the exact number of their genes. Here, the difference remains small in numerical terms – a few units at most – but not necessarily in functional terms. For other organisms, in particular microorganisms, the number of genes common to all members of the species may be significantly less than the number of genes of each individual, itself much less than the estimated total number of genes of all members of the species.
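The three gene counts compared in the last sentence correspond to simple set operations. In the sketch below, the strains and gene names are invented, and the labels "core genome" (genes shared by all members) and "pan-genome" (genes found in at least one member) are the terms commonly used for these two sets rather than vocabulary introduced in this section.

```python
# Invented gene content for three hypothetical strains of one microbial species.
strains = {
    "strain_1": {"geneA", "geneB", "geneC", "geneD"},
    "strain_2": {"geneA", "geneB", "geneE"},
    "strain_3": {"geneA", "geneB", "geneC", "geneF", "geneG"},
}

core_genome = set.intersection(*strains.values())  # genes present in every strain
pan_genome = set.union(*strains.values())          # genes present in at least one strain

print("core genome:", len(core_genome))
print("per strain: ", [len(genes) for genes in strains.values()])
print("pan-genome: ", len(pan_genome))
```

In this toy case the core genome holds 2 genes, each strain carries 3 to 5, and the pan-genome holds 7, reproducing in miniature the ordering described above.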

Gene gains or losses that explain these variations are based on mechanisms that are only now beginning to be better understood. The loss of genes from generation to generation seems more or less constant, limited only by the possible deleterious effects that may result. These losses are numerically compensated by gains that can have two sources: acquisition by horizontal transfer from other contemporary organisms or de novo gene formation by mutations from existing sequences. These events remain rare at the individual level, but they play a considerable role in the evolution of populations. As a result, every genome is inherently imperfect, an instant image of entangled dynamic processes whose effects are mitigated by the resilience of their products. This is far from the simple and naive genetic determinism too often imagined where any modification of a genome would be prohibited.

I.6. The future of genetics: hopes and fears

It is rare to keep in mind the progress of past centuries. Their benefits are taken for granted. Their misdeeds are known or forgotten. The remarkable advances of the 20th Century in terms of agricultural production, animal husbandry, human health and domestication of micro-organisms had a broad genetic basis. Who remembers? And since they have been completed, what is left to be done? Current advances in genetics seem to raise as many questions among the public as they raise hope among specialists. When a genetic anomaly is responsible for a disease, what could be more natural than to look for ways to repair it, especially since it is disabling or greatly reduces life expectancy? This idea is not a new one; it emerged about 60 years ago, with the discoveries of the structure of DNA and the genetic code. Does the knowledge of genomes allow us to anticipate the appearance of illnesses? Does transgenesis make it possible to treat them? We have just shown the possibility of transforming lymphocytes into drugs against malignant cells. Can living organisms be created from genomes obtained by chemical synthesis of DNA? Today with microorganisms, tomorrow with animals or plants? Life is built on the chemistry of nucleic acids and proteins. Were there any other possible choices? Is a xenobiology* possible, that is, a new biology separate from our living world because it uses other atoms and molecular structures?

Doesn’t genetics give us the means to act ethically? This is not a new question. William Bateson, one of the founding fathers of genetics, expressed himself in this way in his speech at the closing banquet of the 4th Genetics Conference that was organized in Paris:

What will become of genetics? We are at the beginning, and when you consider the depths and heights it can reach, we are truly dizzy. Genetics gives the human race a power that could never be predicted and that is extremely dangerous […] perhaps – and I don’t think I’m being ridiculous in saying this – will we have the power in a century to regulate the fate of the human race, and the types we don’t want will not be born. I am not sure that a government with this power will not abuse it.

That was 1911. A century has now passed. Eugenicist abuses led to nothing but tragedies, as their foundations were so wrong. They were inspired more by Darwin than by Mendel. But the prophecy was fulfilled. We have the power to influence the fate of the human race, to deliberately modify ecosystems, to artificially manufacture micro-organisms – including pathogens – and, why not, soon to bring extinct species back to life. How can we apply this in a useful way? This is the first question that makes sense, the other being how to preserve the curiosity of the unknown.

I.7. Origin of the book and content of the chapters

This book will illustrate some of the current trajectories of genetics, with an emphasis on the concepts on which they are based. It is derived from a symposium (Comptes Rendus Biologies 2016) held at the Académie des sciences in Paris in September 2016 on the occasion of the 150th anniversary of the publication of Mendel’s work, Versuche über Pflanzen-Hybriden. This conference brought together researchers from France – a country that has long been reluctant to recognize this discipline at its true value – and from abroad. This book will address the fundamental concepts and issues without which it is impossible to understand the development of current research. It will attempt to show that beyond the immediate benefits, it is intellectual curiosity that gives genetics its full impetus for the future. The novelty and power of the dogmas established by successive discoveries of genetics during the 1950s and 1960s could suggest that it had completed its work, even though genes had not yet been chemically isolated. Subsequent developments in recombinant DNA and genomics, which some considered either dangerous or superfluous, would instead open up unsuspected areas of research that now shed light on all branches of biology, from the most fundamental to the most applied. The book will illustrate the transition between these two periods, the first three chapters dealing with the fundamentals, the next three with genomics and the last three with current applications and expected developments.

We would like to thank Jean-Yves Chapron and Éric Postaire for their wise advice and Jean Weissenbach for his careful review of the manuscript.

I.8. References

Comptes Rendus Biologies (2016). Trajectories of genetics, 150 years after Mendel. Comptes Rendus Biologies, 339 (7/8), 223–336.

Woese, C.R., Kandler, O., Wheelis, M.L. (1990). Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria, and Eucarya. Proceedings of the National Academy of Sciences, 87, 4576–4579.

  1. We will not enter here into the insoluble philosophical problem of knowing to what extent a whole remains the same when its parts change.
  2. In reality, this is almost universal, as minor changes are observed in some species.
  3. The exact term would have been “acaryotes”.
  4. For historical reasons, the archaea were initially referred to as “archaebacteria”, while the other prokaryotic group was referred to as “eubacteria”. Despite this designation, there is no evidence that archaea are more primitive than bacteria.