Our 26-letter alphabet often seems like a model of efficiency. Look at how much information can be encoded and passed on with it. Look at what Shakespeare accomplished with it. I’m particularly fond of it because my ability to manipulate it is what pays for my food and lodging and other necessities like new CDs.
But there is another alphabet that makes the English one of 26 letters look grossly overstocked. With just four letters, it manages to encode and transmit all the information necessary to create a living creature, from the lowliest bacteria to the mighty blue whale to us.
That’s the alphabet of the genetic code, and its discovery is one of the great sagas of modern science.
Genetics, as a separate branch of science, was born in Austria, in the pea patch of Gregor Mendel, a monk. In 1866 he published the results of his experiments, which set out the rules by which characteristics are passed from parent to offspring, based on the concept of dominant and recessive “factors,” later called genes.
Two years later a Swiss biochemist, Friedrich Miescher, identified a complex substance in pus cells on discarded surgical bandages. The new discovery, deoxyribonucleic acid (DNA), soon turned up in all kinds of cells from all kinds of organisms. No one thought it had anything to do with heredity.
Toward the end of the century the chromosome was discovered. These strands in the nucleus of the cell occur in pairs. Whenever a cell divides, both new cells end up with a complete set of chromosomes. In 1903 American scientist Walter S. Sutton suggested that these chromosomes were the physical basis of heredity.
Chromosomes contain both DNA and protein. Chemists analyzed both and found DNA to be of relatively simple composition, while proteins are highly complex. The natural assumption was that proteins were the genetic material, while DNA just helped out the proteins by holding them in place.
It wasn’t until the early 1940s that Oswald Avery, an American chemist, showed pretty conclusively that DNA was indeed the genetic material. He demonstrated that pure DNA could transform bacteria, and that the transformation was halted when the DNA was destroyed but not when the proteins were destroyed.
Meanwhile, chemists had already been trying for years to work out the chemical structure of the large molecules found in cell nuclei, the “nucleic acids.” These nucleic acids always contain phosphate (an acidic compound of phosphorus and oxygen), sugar molecules (in DNA, the sugar deoxyribose, hence the compound’s name), and several organic compounds called bases.
DNA contains four bases, each made up of carbon and nitrogen atoms arranged in rings: adenine (A), guanine (G), cytosine (C) and thymine (T). Every organism has different proportions of these four bases, the four “letters” of the genetic alphabet.
In 1953 James Watson and Francis Crick finally came up with a model of DNA structure, the famous “double helix.” Two strands of DNA run parallel to each other, but in opposite directions. In the centre of the molecule, weak chemical interactions between the bases hold the strands together. Adenine (A) always forms bonds with thymine (T), while cytosine (C) always binds with guanine (G). The sugar and phosphates that are also part of DNA form the backbones of the two strands, on the outside of the helix.
This complex arrangement is what makes it possible for DNA to replicate itself, and it ensures that each new cell will contain the same genetic information as its parent. The double helix unzips, and because A can only bond with T and C can only bond with G, the new strand that forms on each half of the unzipped helix exactly recreates the missing partner. The result is a perfect copy, 999,999,999 times out of a billion.
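For readers who think in code, the unzip-and-rebuild idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names and the example sequence are invented here, the two strands are written position by position (ignoring the fact that they run in opposite directions), and copying errors are left out entirely.

```python
# Pairing rules from the text: A bonds only with T, C bonds only with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the partner strand that would form opposite `strand`, base by base."""
    return "".join(PAIRS[base] for base in strand)


def replicate(strand_one, strand_two):
    """Unzip a two-strand helix and build a fresh partner on each half."""
    daughter_a = (strand_one, complement(strand_one))  # old strand one + new partner
    daughter_b = (complement(strand_two), strand_two)  # new partner + old strand two
    return daughter_a, daughter_b


# Hypothetical eight-base stretch of DNA and its partner strand.
original = ("ATGCCGTA", complement("ATGCCGTA"))
copy_a, copy_b = replicate(*original)

print(original)
print(copy_a == original and copy_b == original)
```

Because each base has exactly one possible partner, either half of the helix carries enough information to rebuild the whole thing, which is why both daughter helices match the original.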
DNA also does something even more impressive: using only these four “letters,” it directs the formation of proteins, the real stuff of life, long chains containing 20 different building blocks called amino acids.
The segment of DNA that directs the formation of a particular protein unwinds and acts as a pattern for the formation of a similar chemical called messenger RNA (mRNA) in a process called transcription. The bases that form the mRNA fall into three-part units called codons. Each codon represents one of the amino acids that make up proteins.
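Transcription and the grouping into codons can be sketched the same way. Two things in this Python snippet are not in the description above and should be read as added assumptions: RNA uses the base uracil (U) wherever DNA would use thymine (T), and the template sequence is made up for illustration. The last line simply counts how many three-letter words a four-letter alphabet allows.

```python
# Pairing used when mRNA is built against a DNA template strand.
# Assumption added here: RNA carries uracil (U) in place of thymine (T).
DNA_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template):
    """Build the mRNA that pairs, base by base, with a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template)


def codons(mrna):
    """Split an mRNA string into three-base codons, dropping any leftover bases."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


template = "TACGGCATAATT"  # hypothetical stretch of DNA
mrna = transcribe(template)

print(mrna)
print(codons(mrna))
print(4 ** 3)  # number of distinct three-letter codons from four bases
```

Four bases taken three at a time give 64 possible codons, comfortably more than the 20 amino acids that need to be spelled out, so several codons can stand for the same amino acid.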
The mRNA travels out of the cell nucleus into the cytoplasm, the gelatinous mass surrounding the nucleus. Here there are assembly units called ribosomes which hold the mRNA and attract small molecules called transfer RNA, or tRNA. One end of each tRNA molecule matches a specific codon on the mRNA; in tow on the other end of the tRNA is the corresponding amino acid. A ribosome moves along the mRNA and translates the message one codon at a time, each tRNA molecule adding another amino acid to the growing protein. Certain codons act as punctuation for the genetic sentence, one always appearing at the beginning of the sequence and others serving as periods at the end. When the ribosome reaches one of these “periods” it stops, and the protein chain is released.
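The ribosome's job, reading one codon at a time between a start signal and a stop signal, looks roughly like this in Python. It is a sketch, not a model of a real ribosome: the table holds only a handful of the 64 codons from the standard genetic code, the mRNA string is invented, and the function simply begins at the first AUG it finds.

```python
# A small excerpt of the standard genetic code (codon -> amino acid).
# Deliberately incomplete: a codon missing from this table raises KeyError.
CODON_TABLE = {
    "AUG": "Met",  # also serves as the start signal
    "CCG": "Pro",
    "UAU": "Tyr",
    "GGC": "Gly",
    "UUU": "Phe",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}  # the "periods" of the genetic sentence


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return the amino acids."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start signal, so nothing gets built
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # the chain is released here
        protein.append(CODON_TABLE[codon])
    return protein


# Hypothetical message with a couple of unread bases on either side.
print(translate("GGAUGCCGUAUGGCUUUUAAGG"))
```

The dictionary lookup stands in for the tRNA molecules: each one matches a single codon and brings along the amino acid that codon calls for.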
Of course, all of this happens much more quickly than a description makes it sound. In the last second, your body created about 500 trillion faultless copies of hemoglobin, a protein containing more than 570 amino acids.
The understanding scientists have gained of how DNA directs heredity and protein synthesis is what has made genetic engineering possible. Now they’re engaged in one of the biggest scientific projects of all time: the Human Genome Project, the effort to list, from start to finish, the sequence of the three billion bases in human DNA.
In other words, they’re trying to write a complete description of every physical characteristic of a human, with only four letters.
So who needs 26?