Research and Discovery
In 1949, much of the physics and chemistry of life was still utterly mysterious. Ill-defined, large molecules called proteins apparently catalyzed most essential chemical reactions and provided structural and contractile components. The origin—the manner of synthesis—of these proteins was completely obscure.
Several lines of evidence implicated the nucleic acids in genetic processes and suggested that they might indeed convey genetic information. If so, they had to play a primary role in protein synthesis. But knowledge of the structures of the nucleic acids was even more tenuous than that of proteins. We knew that there were two broad classes: deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Both were chains of monomers, nucleotides. They differed chemically in the sugar portion of each component nucleotide, deoxyribose in DNA, ribose in RNA. In addition to common nucleotides, each contained a distinctive nucleotide, thymidylic acid in DNA, uridylic acid in RNA. All cells contained both DNA and RNA. Some viruses contained DNA, some RNA.
In contrast to the proteins, some of which had even been crystallized, no one had ever obtained a pure, single DNA or RNA species or indeed, as far as was known, a complete "molecule." Only complex mixtures, mostly fragments, were available. And, if a pure, single species had been available, there were no means to analyze its composition at the nucleotide level or to characterize its functional role. Nothing was known about the synthesis of nucleic acids or of their metabolic turnover, if any. Yet these now appeared to be the carriers of inheritance.
In science, the choice of a specific field of research and a means of approach is critical. When much is unknown, it is very difficult to forecast which approaches or paths will be fruitful and which will be dead-ends. My choice—to explore the domain of the nucleic acids and to seek to elucidate as much as I could of their structures and, in time, their functions—was most fortunate. The nucleic acids, which had occupied a minor and obscure niche in biochemistry, were about to move to the center of the biological stage for the next several decades. So little was known about the nucleic acids that almost any new insight would be valuable. And, in 1949, there were few competitors in this biochemical backwater.
With my background at the Radiation Laboratory and in spectrophotometry, and located in a physics department with access to (then) modern spectrophotometric equipment (for infrared and Raman spectroscopy), and provided with graduate students who had backgrounds in physical science and were familiar with physical instrumentation, it was natural for me to seek to explore the structure of nucleic acids by means of their interaction, in various modes, with electromagnetic radiation—to study the absorption, re-emission, and scattering of infrared, visible, and ultraviolet radiation by nucleic acids and to look for photochemical changes induced by such interaction. Studies of this nature had been very powerful means of elucidating the structures of simpler organic molecules and it seemed reasonable to believe they would provide a valuable approach to the study of nucleic acid structures.
We knew that DNA molecules were long, polymeric chains of deoxyribonucleotides. Four different subunits were known—adenylic, guanylic, thymidylic, and cytidylic. Each was composed of a complex ring structure, either a purine (adenine or guanine) or a pyrimidine (thymine or cytosine), joined to a sugar (always deoxyribose), joined in turn to a phosphate group. The subunits were linked together through the phosphate groups between the sugars. The purines and pyrimidines were strong and distinctive absorbers of ultraviolet light.
For the research I wanted to undertake at first, it seemed preferable to study the true nucleic acid subunits, the deoxynucleotides, rather than the isolated purines and pyrimidines. However, in order to do this, I needed a source of deoxyribonucleotides. They were not commercially available then and for good reason—the best available methods for their isolation provided yields of less than 1 percent from a DNA source. Nor was an undegraded DNA of good quality then commercially available.
One of the major heuristic differences between the biology of the 1950s and the biology of the 1990s is the presence today of numerous small companies that supply, for a price, a wide variety of the chemicals, enzymes, even viruses and cells employed in biological research. This development, largely concentrated in the United States, has greatly accelerated the pace of biological investigation. In the 1950s, one had to prepare one's own DNA, as well as the various enzymes needed to digest or modify it, starting usually from fresh animal tissue obtained from a slaughterhouse. Each preparation was not only time-consuming but often involved an interim research project to define the optimum conditions for each step or to apply new techniques that hopefully would provide more acceptable yields.
By preparing our own DNA, purifying our own enzymes, and introducing new fractionation methods, we were able for the first time to digest DNA completely to its component mononucleotides and to prepare these quantitatively. In addition to the four principal mononucleotides we expected (adenylic, thymidylic, guanylic, and cytidylic), we found a minor fifth mononucleotide in our digest that proved to contain the then recently discovered pyrimidine, 5-methylcytosine (a modified form of cytosine).
The mononucleotides were then used for the planned ultraviolet irradiation studies, as well as for other research such as pioneering Raman (the study of the spectra of re-emitted radiation) and infrared absorption spectroscopy. This information could establish the tautomeric state (i.e., which of several possible alternative atomic groupings was predominant) of the various substituent groups on the purines and pyrimidines. The ultraviolet irradiation studies demonstrated that, on exposure to ultraviolet, the cytidylic acid underwent a reversible loss of its ultraviolet absorption similar to that we had previously described for uracil and uridylic acid but more rapidly revertant. The other deoxynucleotides were more resistant to ultraviolet irradiation.
The infrared absorption studies were performed with solutions of the nucleotides. These experiments were complicated by the fact that water itself has strong absorption bands in regions of the infrared spectrum. To circumvent this difficulty, we also studied the infrared absorption of the nucleotides dissolved in heavy water (D2O), as its absorption bands are shifted relative to those of H2O. These studies permitted the assignment of many absorption bands to specific atomic groups in the deoxynucleotide molecules and established that, under biological conditions, they are in what are known as the keto and amino configurations, as opposed to the alternative enol and imino configurations.
These chemical distinctions are very important in determining the kinds of secondary bonds that can then be formed between the nucleotides in higher level structures. The determination of these structures was confirmed by the results of the Raman spectroscopy. In those days, before lasers, the latter experiments were very difficult to perform because of the low light levels available.
It was known that some viruses, such as the tobacco mosaic virus, did not contain DNA but had the other form of nucleic acid, ribonucleic acid, RNA, as their (probable) genetic material. I therefore wanted to study RNA. RNA is abundant in cells, but at that time it could only be isolated as a complex mixture. To obtain a pure, individual RNA, I therefore undertook to grow and isolate the tobacco mosaic virus. After all, Iowa State was an "Ag school" and as such had well-equipped greenhouses. The authorities were at first unenthused about my growing stocks of infected tobacco plants in their greenhouses (the virus can also infect such plants as tomato), but reluctantly they consented.
The tobacco mosaic virus had been isolated and purified to a quasi-crystalline state by Wendell Stanley in the 1930s. The dimensions of the virus particle had been determined in the electron microscope. It had a particle mass of approximately forty million daltons. By chemical analysis, it was known to contain about 5 percent RNA. However, it was quite unknown whether there was one RNA molecule (of mass two million daltons) in each particle or a set of several smaller molecules, which might or might not be identical. We sought to resolve this question by performing light-scattering studies on the isolated viral RNA. We built and calibrated our own light-scattering apparatus. By measurement of the light scattered at various angles from a solution of macromolecules, one can determine the number of scattering particles and therefore, knowing the concentration of the solution, their molecular mass. One can also, assuming a structural model, determine their spatial dimensions.
We introduced a new concept. By measuring the scattering at different wavelengths in the visible and ultraviolet, we were able to differentiate between various possible structural models to determine, unambiguously, the spatial dimensions of the RNA. Our results demonstrated that, when carefully prepared, the RNA was a single molecule of mass two million daltons.
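The two-million-dalton figure follows from the numbers quoted above: a particle mass of about forty million daltons, of which about 5 percent is RNA. As an illustrative sketch only (the function names and the candidate copy numbers are assumptions made for this example, not part of the original work), the arithmetic behind the competing hypotheses can be written in Python:

```python
# Illustrative arithmetic only; names and candidate copy numbers are hypothetical.

PARTICLE_MASS_DA = 40_000_000  # approximate TMV particle mass, in daltons
RNA_PERCENT = 5                # approximate RNA content by mass, in percent


def rna_mass_per_particle(particle_mass_da: int, rna_percent: int) -> int:
    """Total RNA mass carried by one virus particle, in daltons."""
    return particle_mass_da * rna_percent // 100


def mass_per_molecule(total_rna_da: int, copies: int) -> float:
    """Mass of each RNA molecule if the particle held `copies` equal molecules."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return total_rna_da / copies


total_rna = rna_mass_per_particle(PARTICLE_MASS_DA, RNA_PERCENT)
for copies in (1, 2, 4, 10):
    mass = mass_per_molecule(total_rna, copies)
    print(f"{copies} molecule(s) per particle: {mass:,.0f} Da each")
```

A measured molecular mass near two million daltons corresponds to the one-molecule case; a set of several equal, smaller molecules would have given a proportionally lower value.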
We were, step by step, establishing solid facts about the nucleic acids and their components: absolute absorption coefficients, confirmed structures, molecular masses and dimensions. We needed, however, a more direct link to function.
In the 1940s, microbiological assay and chromatographic methods are developed that permit the quantitative determination of the amino acid composition of proteins. Archer Martin, Richard Synge, William Stein, and Stanford Moore play leading roles in this development. Building upon these methods, Fred Sanger develops techniques for the determination of amino acid sequence within peptides and over the period 1951 to 1955 determines the first complete amino acid sequence of a protein, insulin. With this advance, proteins, with all their critical biological functions, can be regarded as complex but precisely definable organic molecules.
In 1950, Linus Pauling and Robert Corey propose the alpha-helix and the beta-pleated sheet models for the spatial arrangement of amino acids in proteins. These structures, since verified as components of numerous proteins, provided the first valid concepts of the spatial organization of proteins.
In 1952, Alfred Hershey and Martha Chase demonstrate that DNA is (most likely) the genetic component of the bacteriophage T2, i.e., that DNA (without protein) can be the exclusive carrier of genetic information.