PART C
DIRAC'S QUANTUM MECHANICS
Introduction
In Heisenberg's best-informed opinion, the new quantum mechanics contained a quantitative form of the correspondence principle in its very foundation. As a result, further developments could proceed on the basis of this foundation without the need to call on the classical analogy, at least as long as the thorny problem of radiation coupling was deferred. Paul Dirac did not concur with this generally held view. It appeared to him in the fall of 1925 that the analogy between classical and quantum mechanics was not limited to Heisenberg's formal transposition of the Newtonian equations of motion. For him the analogy involved deeper-lying structural properties, those classically expressed in the algebra of Poisson brackets. This remark offered a more direct access to the fundamental equations of quantum mechanics; it also suggested a fruitful adaptation of the canonical methods of resolution from classical dynamics, in which Dirac was already an expert. Dirac thus performed an ultimate transfiguration of the classical analogy into a powerful mathematical heuristics. His impressive success in the winter of 1925-26 was intimately connected to this unorthodox view, and other unusual ideas about theory making.
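In modern notation, the structural analogy Dirac had in mind can be summarized as follows. This is only a sketch: the symbols q_r, p_s, H, and ħ are the conventional ones and are assumptions here, not notation taken from the text above. On the quantum side, {u, v} stands for the q-number that takes over the role of the classical bracket.

```latex
\begin{align}
  \{u, v\} &= \sum_r \left( \frac{\partial u}{\partial q_r}\frac{\partial v}{\partial p_r}
            - \frac{\partial u}{\partial p_r}\frac{\partial v}{\partial q_r} \right)
  && \text{(classical Poisson bracket)} \\
  uv - vu &= i\hbar\,\{u, v\}
  && \text{(quantum condition)} \\
  q_r p_s - p_s q_r &= i\hbar\,\delta_{rs}
  && \text{(canonical commutation relations)} \\
  i\hbar\,\frac{du}{dt} &= uH - Hu
  && \text{(equation of motion, from } \dot{u} = \{u, H\}\text{)}
\end{align}
```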
Dirac's style of quantization was not just a cleverer way of solving other people's equations. It was part of a broader strategy of theory making, one which considered that theories should be articulated in three stages. In the first stage, the fundamental equations of the new theory had to be formulated in the most abstract way, independent of any interpretation. In the second stage, the resulting mathematics had to be developed in a way that would exhibit groups of transformations and conservation properties. In the last stage, the latter properties would be used to inspire a physical interpretation of some of the mathematical quantities employed in the theory.
More specifically, in the first stage of Dirac's quantum mechanics, quantum variables had to be abstracted from their matrix representation and turned into the purely symbolic notion of "q-numbers." While Göttingen's theorists required explicit constructions of the mathematical objects they were using, Dirac was satisfied with defining his symbols simply by the equations which they obeyed. As we shall observe, he had been prepared to adopt such a bold attitude by his early exposure to symbolic methods in geometry. He knew that there were "non-Pascalian" geometries in which the "coordinates" of a point did not commute. Conversely, many of the relations between q-numbers could be given a geometric interpretation, so that noncommutativity did not have to be feared. This explains why Dirac, in spite of his apparent reveling in algebraic manipulations, could later declare his mind to be an essentially geometric one.^{[1]}
The second stage, the exploration of the mathematical consequences of the fundamental equations, depended much on the previous abstracting stage, which eased the analogy with classical dynamics. Many classical relations remained true in quantum symbols, at least for a specific choice of the order of the terms of quantum products. Dirac worked out the consequences of these equations in an abstract manner, leaving the door open to several possible representations of the quantum symbols in terms of ordinary (measurable) numbers. In his first papers on quantum mechanics, however, the only representation that he deduced and applied was Heisenberg's original matrix representation (for which the energy is diagonal). As Dirac noticed in late 1925, this representation resulted from a transposition of the classical method of action-angle variables, with which he was most familiar.
After his exposure to Schrödinger's wave equation in the spring of 1926, Dirac exploited the freedom of representation inherent in his approach. He studied all other matrix representations of the q-numbers (including the ones today called q- and p-representations, in which q or p is diagonal) and related them to Heisenberg's representation through multilinear transformations. Most important, he could show that Schrödinger's wave function was just one of these transformations. To summarize, Dirac's symbolic formulation of the fundamental equations of quantum mechanics eventually yielded the complete mathematical apparatus of modern quantum mechanics.
The third stage of Dirac's strategy, the identification of the physical content of the theory, was perhaps the most subtle. Properties of conservation or transformation, suggestive as they might be, could not by themselves imply the physical interpretation. As we shall observe, they needed to be completed by a touch of correspondence argument. In Dirac's transformation theory of December 1926, the high quantum-number limit gave a first germ of interpretation in the quasi-classical domain, then the transformation properties generated the whole interpretation of the theory from this germ.
What were the origins of Dirac's immensely powerful methodology? One source appears to have been some philosophical remarks by C. D. Broad and A. Eddington. In their critical presentations of Einstein's general relativity, both thinkers replaced events in space and time with networks of abstract relations; they then recommended that the mathematics of these relations be developed at an a priori level, in a way exhibiting covariance (Broad) and permanence of substance (Eddington). In Broad's opinion the first abstractive step was of an inductive nature, so that the physical interpretation of the theory, at least the metrical meaning of the ds^{2} , was already given. According to Eddington, the physical content of the notions of space and time was lost in the original abstraction; and it was recaptured only in a third stage of "identification" informed by the mind's search for permanence. Although Dirac is not likely to have followed Broad and Eddington very far in their philosophical inquiries, he certainly appreciated their methodological lessons. Not only his approach to quantum mechanics but also much of his later research seem to have proceeded from Eddington's ideal of a physical theory, as instantiated in his (Eddington's) reconstruction of general relativity.
Quite naturally, the great inspirer was pleased with the inspired. In lectures given in early 1927, Eddington judged Dirac's thought to be "highly transcendental, almost mystical," and he saw his prophecy of an ever more abstract condition of the world realized:
I venture to think that there is an idea implied in Dirac's treatment which may be of great significance . . . . The idea is that in digging deeper and deeper into that which lies at the base of physical phenomena we must be prepared to come to entities which, like many things in our conscious experience, are not measurable by numbers in any way.^{[2]}
Indeed, where others had already abandoned space-time pictures, Dirac forsook ordinary numbers, a key to his success in exploiting the classical analogy.
Chapter XI
Classical Beauty
Noncommutative Geometry
Originally trained to be an engineer, Dirac was unable to find a job in this field because of the postwar economic depression. He therefore accepted from the mathematics department of Bristol University an opportunity to develop his exceptional scientific skills. During this period (1921-1923) he was influenced by a very good teacher of mathematics, Peter Fraser, who imparted to him a love of projective geometry. Dirac long admired the magical power of projective methods to justify at a glance theorems otherwise very difficult to prove. Late in his life he remembered having used these methods in much of his work, though only behind the scenes of his research.^{[3]}
Whitehead's Principle
On the subjects he found most attractive Dirac read much on his own. He was presumably exposed to a presentation of projective geometry similar to Whitehead's Axioms of projective geometry , published in 1906. As we shall later see, he was at least familiar with some important characteristics of Whitehead's conception of geometry. In line with the general contemporary enthusiasm for the axiomatic approach, Whitehead emphasized the abstract character of the fundamental objects of geometry.
Points, lines, planes, and so on had to be defined not from an intuition of their inner structure but by their mutual relations, which were raised to the status of axioms; for instance: "Through two distinct points one can draw one and only one line." Moreover, within the limits of mutual compatibility there was freedom in the choice of these axioms. This freedom led to geometries different from Euclid's.^{[4]}
Should one venture so far as to deny any relation between "mathematical" and "physical" points? Whitehead did not believe so. In his earlier Treatise on universal algebra (1898), he acknowledged the need for an "existential import" in mathematical definitions. The mutual relations playing the role of definitions were the result of an "act of pure abstraction." In his later philosophical writings, starting with the Principles of natural knowledge (1919), he gave central importance to the bridge between roughly perceived objects and mathematically defined concepts which he called the "principle of extensive abstraction." In his opinion, a proper definition of geometric points had to imitate the construction of rational numbers as classes of pairs of integers, or, better, Dedekind's construction of real numbers as classes of interlocked rational intervals. In order to avoid the necessarily finite extension of perceived points, mathematical points had to be conceived as classes of interlocked finite volumes, there being no minimal element in each class.^{[5]}
This limited form of inductivism could perhaps constrain the choice of axioms defining basic geometric objects, but it left other axioms to the taste or interest of the mathematician. In fact, Whitehead's treatise on projective geometry culminated in the proof that the "fundamental theorem" of this geometry could be taken as an independent axiom. This theorem concerns the reduction of chains of perspective, and it is equivalent to Pappus's theorem, which is simpler to enunciate (see fig. 25): If P_{1}, P_{2}, P_{3} are any three points on a line, and Q_{1}, Q_{2}, Q_{3} are any three points on another line, intersecting the former, then the three points of intersection R_{1}, R_{2}, R_{3} of the cross joins (P_{1}Q_{2}, P_{2}Q_{1}), (P_{1}Q_{3}, P_{3}Q_{1}), (P_{2}Q_{3}, P_{3}Q_{2}) fall on a line.^{[6]}
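For ordinary (commuting) coordinates, Pappus's theorem can be checked numerically. The sketch below is only an illustration under stated assumptions: it works in the real projective plane with integer homogeneous coordinates, and the helper names (`cross`, `det3`, `pappus_points`) and the sample points are hypothetical choices, not anything taken from Whitehead. The cross product of two homogeneous vectors gives both the line through two points and the meeting point of two lines, and three points are collinear exactly when their determinant vanishes.

```python
def cross(u, v):
    """Cross product of two integer 3-vectors.

    In homogeneous coordinates this yields the line through two points,
    or dually the intersection point of two lines.
    """
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def det3(u, v, w):
    """Determinant of the matrix with rows u, v, w (zero iff collinear)."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]


def pappus_points(P, Q):
    """Intersections of the three pairs of cross joins of P and Q."""
    P1, P2, P3 = P
    Q1, Q2, Q3 = Q
    R1 = cross(cross(P1, Q2), cross(P2, Q1))
    R2 = cross(cross(P1, Q3), cross(P3, Q1))
    R3 = cross(cross(P2, Q3), cross(P3, Q2))
    return R1, R2, R3


# Sample data (arbitrary): three points on the line y = 0 and three on x = 0,
# written as homogeneous triples (x, y, 1).
P = [(1, 0, 1), (2, 0, 1), (5, 0, 1)]
Q = [(0, 1, 1), (0, 3, 1), (0, 4, 1)]

R1, R2, R3 = pappus_points(P, Q)
pappus_det = det3(R1, R2, R3)
print(R1, R2, R3, pappus_det)
```

Exact integer arithmetic is used so that the vanishing determinant is not blurred by rounding. The check relies on the commutativity of ordinary numbers, which is precisely the property that fails in the non-Pappusian geometries discussed next.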
Whitehead had found the basic idea of the proof of independence in Hilbert's Grundlagen der Geometrie (1899). More exactly, Hilbert proved something similar, the independence of "Pascal's theorem" in a geometry employing a notion of parallelism (thereby different from projective geometry, which assumes every two lines to intersect). To this end he first showed that the other axioms of geometry could be represented in terms of a "system of complex numbers," what we today call a division ring. A point could then be defined as a triplet (x, y, z) of "coordinates" belonging to the division ring, a plane as a set of triplets satisfying an equation of the type
ax + by + cz + d = 0, (1)
and a line as an intersection of two planes (in the case of a three-dimensional geometry). In spite of the superficial resemblance to ordinary analytic geometry, the "complex numbers" x, a, and so on did not have to be real numbers; they did not even have to commute. Owing to the latter circumstance, the coefficients in a plane equation (1) had to be kept on the left of the coordinates.^{[7]}
Pascal's theorem, in the degenerate form used by Hilbert, is a variant of Pappus's theorem, for which the points R_{1} , R_{2} , R_{3} lie at infinity (see fig. 26): "If P_{1} Q_{2} is parallel to P_{2} Q_{1} and P_{2} Q_{3} is parallel to P_{3} Q_{2} , then P_{3} Q_{1} is necessarily parallel to P_{1} Q_{3} ." In the plane of the figure all points can be represented by two coordinates on the natural axes OP_{1} and OQ_{1} . Then,
The equation of the line P_{i} Q_{j} can be written:
That P_{1} Q_{2} is parallel to P_{2} Q_{1} is determined by the "proportionality" of the coefficients in the corresponding equations, which gives
In the same way, for P_{2} Q_{3} and P_{3} Q_{2} to be parallel, one must have
Combining the two conditions gives
while the condition for P_{3} Q_{1} and P_{1} Q_{3} to be parallel would read
Consequently, Pascal's theorem shall hold true if and only if any two a, b commute (which makes the division ring a field). Since there are noncommutative division rings (skew fields, like the quaternions), Hilbert continued, there exist "non-Pascalian" geometries for which Pascal's theorem is not valid. Whitehead used a similar technique, based on the possibility of noncommutative projective coordinates, to show the independence of Pappus's theorem in projective geometry.^{[8]}
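One way to write out the argument is sketched below. The notation is assumed for the purpose of illustration and is not necessarily Hilbert's or the author's: the points are taken as P_i = (a_i, 0) and Q_j = (0, b_j), with a_i and b_j nonzero elements of the division ring, coefficients are kept on the left, and two lines count as parallel when their coefficient pairs differ by a common left factor.

```latex
\begin{align}
  & P_i = (a_i, 0), \qquad Q_j = (0, b_j)
  && \text{(assumed coordinates)} \\
  & a_i^{-1} x + b_j^{-1} y = 1
  && \text{(line } P_i Q_j\text{)} \\
  & a_2^{-1} a_1 = b_1^{-1} b_2
  && (P_1 Q_2 \parallel P_2 Q_1) \\
  & a_3^{-1} a_2 = b_2^{-1} b_3
  && (P_2 Q_3 \parallel P_3 Q_2) \\
  & a_3^{-1} a_1 = b_2^{-1} b_3\, b_1^{-1} b_2
  && \text{(product of the two conditions)} \\
  & a_3^{-1} a_1 = b_1^{-1} b_3
  && (P_3 Q_1 \parallel P_1 Q_3 \text{, to be proved})
\end{align}
```

The last two lines agree for every choice of points only if b_2^{-1} b_3 b_1^{-1} b_2 = b_1^{-1} b_3 for all nonzero b_1, b_2, b_3. Taking b_1 = 1 reduces this to b_3 b_2 = b_2 b_3, which is commutativity; conversely, commutativity makes the two expressions identical.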
Baker's Tea Parties
Even if he did not read Hilbert or Whitehead, Dirac certainly became aware of the existence of non-Pappusian or non-Pascalian geometries, and of their relation to noncommutative algebra. When he arrived in Cambridge in 1923, he was invited to participate in the mathematical tea parties organized by Henry Frederick Baker, a friend of his previous mathematics teacher. Baker's main interest at that time was in geometry. In an axiomatic framework similar to Whitehead's or Hilbert's he developed projective methods, studied extensions in higher dimensions of space, and frequently relied on what he called "symbolic methods." A look at his Principles of geometry of 1922 gives a precise idea of the nature of the symbols in question. Like Hilbert's "complex number systems" or Whitehead's projective coordinates, they belonged to a division ring and were used to represent the objects of geometry. More specifically, they provided an extension of Moebius's old barycentric calculus (1827), extracting the algebraic properties of barycentric coefficients from their original identification to real numbers. Every manipulation of the symbols, Baker showed, had a precise geometric meaning, which could be exploited to substitute algebraic for geometric proofs. Moreover, the symbols did not have to commute, again opening the door to non-Pappusian geometries.^{[9]}
Regarding the status of symbolic methods, Baker's attitude was ambiguous. He commended them "to fix ideas and for the purpose of verification" but defended the "purity" of geometry:
While the view is taken that all the geometrical deduction should finally be synthetic, it is also held that to exclude algebraic symbolism would be analogous to preventing a physicist from testing his theories by experiments; and it becomes part of the task to justify the use of this symbolism.
So that it did not degenerate into a mere algebra of symbols, geometry had to remain clearly connected to its observational roots. Hence follows Baker's epistemological credo :
A science grows up from the desire to bring the results of observation, or the relations of a class of facts that appear to be connected, under as few general propositions as possible. Into these general propositions it is generally found necessary, or convenient, when the science has reached a sufficient development, to introduce abstract entities, transcending actual observation, whose existence is only asserted by the postulation of their mutual relations. If the science is to be arranged as a body of thought developed deductively, it is necessary to begin by formulating fundamental relations connecting all the entities which are to be discussed, from which other properties are to follow as a logical consequence. If this is done we may in the first instance regard all the entities involved in these fundamental propositions as being abstract, even those which we regard as subject to actual observation. The usefulness of the science, for the purpose for which it was undertaken, will depend on the agreement of the relations obtained for the latter entities with those which we can observe.^{[10]}
This is of course reminiscent of Whitehead's principle of extensive abstraction. However, Baker did not mention any source, which shows the pervasiveness of Whitehead's philosophy among contemporary British scientists.
Quaternions
Both in Whitehead's and in Baker's presentation of non-Pappusian geometries, the canonical example of a skew field adduced was Hamilton's quaternions. By the time Dirac studied mathematics and physics, quaternions were not as popular as they had been in nineteenth-century England. While in Maxwell's and Tait's hands they were omnipotent precursors of vector algebra, they disappeared from twentieth-century physics textbooks. Nevertheless, they remained an interesting mathematical curiosity. For this reason the young Dirac read a treatise on quaternions, probably the one by Kelland and Tait, which was the most commonly available. The general emphasis of the book was on the geometric interpretation of quaternions, as given by decomposition into scalar and vector part. Dirac's conception of algebra might have been influenced by this reading, if it was, as he later suggested, his only exposure to algebra.^{[11]}
The main characteristic of quaternions according to Kelland was their noncommutativity:
About the year 1843 [Hamilton] perceived clearly the obstruction to his progress in the shape of an old law which, prior to that time, had appeared like a law of common sense. The law in question is known as the commutative law of multiplication . . .. When it came distinctly into the mind of Hamilton that this law is not a necessity, with the extended signification of multiplication, he saw his way clear, and gave up the law. The barrier being removed, he entered on the new science as a warrior enters a besieged city through a practical breach. The reader will find it easy to enter after him.^{[12]}
This was indeed the way Dirac would enter quantum mechanics. To him the noncommutativity of kinematic variables was not an obstacle but instead the sign of a fundamental advance. In his eyes noncommutativity was not counterintuitive, since it could be understood in geometric terms, and even integrated in a geometric structure, for instance that of non-Pascalian geometry. Neither did it threaten the solidity of the foundation of the new theory, since fundamental entities were sufficiently defined by their mutual relations. According to Kelland, a conqueror of noncommutative extensions would not even have to fear rigor: "It is only by standing loose for a time to logical accuracy that extensions in the abstract sciences—extensions at any rate which stretch from one science to another—are effected." For an engineering student like Dirac, who had learned Heaviside's juggling with derivatives of discontinuous functions and other symbolic methods, the remark hardly needed to be made.^{[13]}
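The noncommutativity of quaternions, and their decomposition into a scalar and a vector part, can be made concrete with a few lines of code. This is a minimal sketch, not a reproduction of anything in Kelland and Tait: the tuple layout (w, x, y, z) and the function name `qmul` are assumptions chosen for illustration.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,  # scalar part
        pw * qx + px * qw + py * qz - pz * qy,  # i component
        pw * qy - px * qz + py * qw + pz * qx,  # j component
        pw * qz + px * qy - py * qx + pz * qw,  # k component
    )


# The three imaginary units.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

ij = qmul(I, J)  # equals k
ji = qmul(J, I)  # equals -k: the commutative law fails

# For two pure-vector quaternions u and v, the product has scalar part
# minus the dot product of the vectors and vector part their cross product.
u = (0, 1, 2, 3)
v = (0, 4, 5, 6)
uv = qmul(u, v)

print(ij, ji, uv)
```

The last product illustrates the geometric reading of the algebra: the scalar part of `uv` is minus the dot product of the two vectors, and the vector part is their cross product.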
The Lesson of Relativity
Broad's Lectures
Eddington's eclipse expedition of 1919 verified one of the predictions of Einstein's theory of gravitation and started a wave of enthusiasm for relativity, in England more than anywhere else. Accordingly, Dirac's first love was for the ds^{2} = g_{μν} dx^{μ} dx^{ν} of general relativity, though it was not included in the physics curriculum of Bristol University. He actually first learned this theory from a philosopher, Charlie Dunbar Broad, who happened to be teaching a course in philosophy for scientists there in the years 1920-1921. An inspired and methodic thinker, Broad was learned in physics and mathematics and could competently comment on the newest theories. Late in his life he remembered:
I may compare myself with John the Baptist in at least one respect (though I do not share his taste for an unbalanced diet of locusts and wild honey), viz., that there came to these lectures one whose shoe-latches I was not worthy to unloose. This was Dirac, then a very young student, whose budding genius had been recognised by the department of engineering and was in process of being fostered by the department of mathematics.
Indeed, Broad hardly succeeded in arousing in Dirac an interest in philosophy. But his comments on the nature of relativity thinking may have been heard.^{[14]}
An amplified form of Broad's lectures was published in 1923 as Scientific thought . The main focus was on critical philosophy: "The most fundamental task of philosophy is to take the concepts that we daily use in common life and science, to analyse them, and thus to determine their precise meaning and their mutual relations." Half of the lectures were dedicated to a criticism of space and time, culminating in general relativity. Broad founded his analysis on Whitehead's principle of extensive abstraction, which he declared to provide "the essential connection between what we perceive but cannot treat mathematically, and what we cannot perceive but can treat mathematically." When applied to extension and duration, this principle eliminated classical prejudices about space and time and cleared the way to relativity theory. In this process, Broad said, "physicists had been their own philosophers."^{[15]}
For the sake of historical accuracy, let it be mentioned that Whitehead limited his own relativistic enthusiasm to special relativity. He defended the necessity of a uniform space-time and reinterpreted the equations of general relativity in a flat space. "The structure of continuum events," he argued, "is uniform because of the necessity for knowledge that there be a system of uniform relatedness, in terms of which the contingent relations of natural factors can be expressed. Otherwise we can know nothing until we know everything." Fortunately for Dirac, Broad did not share this view, and gave in his course a sympathetic review of general relativity. This is where Dirac remembered having seen ds^{2} = g_{μν} dx^{μ} dx^{ν} for the first time. Moreover, Broad regarded the tensor calculus of Einstein's theory as providing the general form of any future theory: "The aim of science should be to find general formulae for the laws of Nature, which would ultimately give the special expression of the law in terms of any particular frame, as soon as the defining characteristics are known."^{[16]}
Eddington's Idealism
Once his relativistic appetite had been whetted by Broad's lectures, Dirac devoured two books by the "fountainhead of relativity in England," Arthur Eddington. The first one, Space, time and gravitation (1920), gave a more popular and literary account,^{[17]} while the second one, The mathematical theory of relativity (1923), expounded and even extended the mathematical apparatus of general relativity.^{[18]} Like Sommerfeld, who "knew of no book as well written" as Space, time and gravitation, Dirac must have had difficulty deciding what to admire most: clarity of exposition or thoroughness of thought, mathematical elegance, or wit.^{[19]}
In his general conception of relativity, Eddington was close to Broad and Whitehead in some respects. He highlighted the abstract character of fundamental physical notions: "The ultimate elements in a theory of the world must be of a nature impossible to define in terms recognisable to the mind." He defined the aim of physics as the quest for the "condition of the world," that is, mathematical symbols comprehending their influence on any possible measurement. The energy-momentum tensor of general relativity provided the paradigm of such a symbol. Quantum theory, Eddington speculated (already in 1920!), would require an even higher degree of abstraction, since, for instance, the symbol connected with light phenomena would have to encompass antagonistic aspects, wavelike and corpuscular.^{[20]}
Following Einstein's lead, Eddington regarded general relativity as a geometrization of physics. In a historico-logical analysis he argued that geometry had first become analytical (in the Kantian sense), so that it now dealt with variables of an unknown nature, and could be extended in various ways. The very diversity of extensions made the geometrization of physics possible:
As the geometry became more complex, the physics became simpler; until finally it almost appears that the physics has been absorbed into the geometry. We did not consciously set out to construct a geometrical theory of the world; we were seeking physical reality by approved methods, and this is what happened.^{[21]}
Eddington's position was even more radical than suggested by the above extract. He believed the geometrization to be total, and not only in the context of gravitation phenomena. Unlike Einstein, he reacted enthusiastically to Weyl's unified theory of electricity and gravitation, for it gave the electromagnetic field a geometric interpretation as a connection between local gauges of length. Already in the limited context of gravitation Eddington's views departed from Einstein's. The fundamental equation,
relating the energy-momentum tensor T_{μν} to the curvature tensor R_{μνρσ}, equated, according to Einstein, two quantities of a different essence: the one on the right side was explicitly given as a function of the metric tensor, while the one on the left side was an empty frame, waiting to be completed by an expression of T_{μν} in terms of well-defined matter fields. Instead Eddington regarded this equation as a definition of energy momentum, hoping that a proper modification of the rest of the theory would provide enough equations to determine the evolution of curvature. In his words the usual conception treated matter as a cause of curvature, whereas he proposed to regard it as a symptom of curvature; matter was the geometric disturbance itself, not a disturbing factor.^{[22]}
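For orientation, the field equation in question can be written in a standard modern form. This is an assumption about the equation meant here, not a quotation: the coupling constant κ, the sign conventions, and the index placement vary between authors, and the matter term is put on the left only to match the left/right description given above.

```latex
\begin{align}
  \kappa\, T_{\mu\nu} &= R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R , \\
  R_{\mu\nu} &= g^{\rho\sigma} R_{\rho\mu\sigma\nu} , \qquad
  R = g^{\mu\nu} R_{\mu\nu} .
\end{align}
```

Read in Einstein's way, the right side is fixed once the metric is known and the left side awaits a matter model. Read in Eddington's way, the same relation defines T_{μν} from the geometry.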
For the sake of pure geometry, Eddington also condemned Einstein's a priori identification of the form ds^{2} as the metric element, for it presupposed the existence of material rulers and clocks and thereby treated matter as an implicit cause. As a proper alternative, he recommended that one should first conceive a completely abstract geometry, then develop its mathematics in order to construct a "conserved" tensor T_{μν}, and finally "identify" T_{μν} as representing the flow of energy and momentum. The metrical meaning of ds^{2} would, he hoped, appear at a later stage. From the perspective of Eddington's influence on Dirac, the most important aspect of this program was the "principle of identification," according to which the mathematics of a physical theory had to be developed at an a priori level before the identification of physically accessible quantities took place.^{[23]}
There were, in addition to the principle of identification, more specific elements in Eddington's methodology, the first of which was the "principle of the permanence of substance." A healthy mind, Eddington thought, could not pretend to any understanding of the world without believing in some kind of permanence; while a predilection for change led one to the asylum, the search for permanence led to the energy-momentum tensor. Nevertheless, a legitimate concept of substance had nothing to do with the naive idea of substance drawn from common experience. Substance had to be related to the "condition of the world," in which process it was boiled down to an abstract "substratum" of relations: "The relativity theory of physics reduces everything to relations, that is to say it is structure, not material, which counts."^{[24]}
More strikingly, Eddington believed that there should be only one way to integrate permanence of substance in a geometry of abstract events: "Our whole theory has really been a discussion of the most general way in which permanent substance can be built up out of relations, and it is the mind which, by insisting on regarding only the things that are permanent, has actually imposed the laws on an indifferent world." This "despotism of the mind" remained a lasting feature of Eddington's philosophy. It suggested the inspired and often-cited conclusion of Space, time and gravitation : "We have found a strange footprint on the shores of the unknown. We have devised profound theories one after another, to account for its origin. At last, we have succeeded in reconstructing the creature that made the footprint. And Lo! it is our own."^{[25]}
There were interesting by-products of Eddington's viewpoint, for instance the idea of an affine connection without a metric. Nevertheless, most theoreticians did not see much more in it than a graceful intellectual exercise. Einstein's judgment displayed a typical mixture of admiration and suspicion: "[Eddington] has always seemed to me an uncommonly ingenious but uncritical man. . .. With his philosophy he reminds me of a prima ballerina , who does not herself believe in the justification for her elegant leaps."^{[26]}
Was Dirac Eddingtonian?
Several comments made while Dirac was a young physicist suggest a positive answer to the question, Was Dirac Eddingtonian? In the manuscript for a talk which he gave at one of Baker's tea parties one finds:
The modern physicist does not regard the equations he has to deal with as being arbitrarily chosen by nature. . . . In the case of gravitational theory, for instance, the inverse square law of force is of no more interest—(beauty)?—to the pure mathematician than any other inverse power of distance. But the new law of gravitation has a special property, namely its invariance under any coordinate transformation, and being the only simple law with this property it can claim attention from the pure mathematician.
Here Dirac shows that to some extent he shared Eddington's belief in the necessity of physical laws, although for a slightly different reason: it is not the search of the mind for permanence but its predilection for mathematical beauty which enforces the necessity of the laws.^{[27]}
Dirac frequently returned to this theme in his later writings without giving a precise definition of his aesthetics. Mathematical beauty was no more subject to definition than beauty in art, but it was obvious to the connoisseur. However, in its first shy appearance under Dirac's pen, the word "beauty" (added in parentheses and with a question mark above "interest") had a more specific meaning: it pointed to rich invariance properties. In other contexts it could also refer to the "magic" of some extensions of ordinary geometry and real analysis, for instance projective geometry and the theory of functions of a complex variable, which cut the Gordian knot of theorems difficult to prove in their original context.^{[28]}
Since the mathematical theories suited to the physical world were also the most beautiful ones, Dirac said to Baker's audience, they were ultimately the ones to be favored by mathematicians themselves: "As more and more of the reasons why nature is as it is are discovered the questions that are of most importance to the applied mathematician will become the ones of most interest to the pure mathematician." Conversely, beautiful parts of mathematics which had not yet received much application would end up being absorbed by a physical theory. For example, in his study of permutational symmetry in higher atoms (1929), Dirac treated group theory as a part of quantum mechanics, for, he said, quantum mechanics was the general science of quantities that do not commute.^{[29]}
In the foreword of the Principles of quantum mechanics (1930), instead of the above-described revelation of a physical quality of mathematics, Dirac preferred the more common idea of an increasingly mathematical nature of physics. Whereas the old physics was based on mental pictures in space and time, the new physics referred to a "nonpicturable substratum" accessible only through a mathematical description. While this idea was likely to please Bohr just as much as Eddington, the term "substratum" was specifically borrowed from Eddington's Space, time and gravitation. Also like Eddington and Broad, Dirac attached great importance to the existence of groups of transformations relating different "points of view" in the new theory: "The growth of the use of the transformation theory, as applied first to relativity and later to the quantum theory, is the essence of the new method in theoretical physics." And he continued in a very Eddingtonian tone: "This state of affairs is very satisfactory from a philosophical point of view, as implying an increasing recognition of the part played by the observer in himself introducing the regularities that appear in his observations, and a lack of arbitrariness in the ways of nature."^{[30]}
As already remarked, Dirac's concern with philosophy was generally so limited that one may doubt the sincerity of the above-mentioned reflections. He was generally suspicious of any statement that could not be expressed in a mathematical form: what could be said clearly had to be said mathematically. His Eddingtonian utterances might well have been intended as a decorative device, the content of which was soon counterbalanced by crude positivistic statements such as: "The only object of theoretical physics is to calculate results that can be compared with experiments." Or perhaps they pertained to an inaccessible ideal; perhaps they were the mathematical Grail drawing his intellectual energies. In his Scott lecture of 1939 Dirac gave the most extreme expression of this ideal: "We must suppose that a person with a complete knowledge of mathematics would deduce not only astronomical data, but also all the historical events that take place in the world, even the most trivial ones." Such a person, he admitted, did not exist. When under positivistic attack, he would rather profess a balanced mixture of inductive and deductive methods.^{[31]}
Even if Dirac's great admiration for Eddington did not entail a complete adoption of his philosophical stand, he certainly drew important methodological lessons from the Eddingtonian reconstruction of relativity. To imitate the presentation offered in The mathematical theory of relativity, a new physical theory had to start with the most abstract possible development of relations between mathematical symbols. Then transformations leading to invariant or covariant properties had to be sought. Finally, the structures revealed in this process had to suggest the identification of physically observable quantities.
Dirac explicitly enunciated this methodology in 1931, in his paper on quantized singularities (giving birth to the magnetic monopoles):
The most powerful method of advance that can be suggested at present is to employ all the resources of pure mathematics in attempts to perfect and generalize the mathematical formalism that forms the existing basis of theoretical physics, and after each success in this direction, to try to interpret the new mathematical features in terms of physical entities (by a process like Eddington's Principle of Identification).^{[32]}
As explained in the Scott lecture, the notion of mathematical beauty was an integral part of this strategy. One first had to select the most beautiful mathematics—not necessarily connected to the existing basis of theoretical physics—and then interpret them in physical terms. Here also the paragon of beauty was the tensor calculus of general relativity, with its generous transformation properties:
A powerful new method . . . is to begin by choosing that branch of mathematics which one thinks will form the basis of the new theory. One should be influenced very much in this choice by considerations of mathematical beauty. It would probably be a good thing also to give a preference to those branches of mathematics that have an interesting group of transformations underlying them, since transformations play an important role in modern physical theory, both relativity and quantum theory seeming to show that transformations are of more fundamental importance than equations. Having decided on the branch of mathematics, one should proceed to develop it along suitable lines, at the same time looking for that way in which it appears to lend itself naturally to physical interpretation.^{[33]}
One does not find similar declarations in Dirac's early papers on quantum mechanics. Nevertheless, we shall observe that his spectacular success resulted in good part from an imitation of the model of general relativity as portrayed by Eddington.
The Art of Action-Angle Variables
Dirac arrived in Cambridge in 1923, at the peak of his love for general relativity. For his supervisor he was hoping to get E. Cunningham, a specialist in this theory. To his disappointment, but also to his advantage, he got instead a specialist in statistical mechanics and quantum theory, Ralph Fowler. Being a friend of Niels Bohr and an occasional visitor at
the Copenhagen Institute, Fowler was well informed on the developments of atomic theory and taught the main course on this topic at Cambridge. From Dirac's and Thomas's notes on this course one can appreciate how faithful it was to Bohr's ideas and how concerned it was with the latest advances in this field.^{[34]}
As should be the case with any account of the Bohr-Sommerfeld theory, Fowler gave a thorough treatment of the Hamilton-Jacobi method of classical mechanics, with a special emphasis on the "transformation theory of dynamics," which was Whittaker's expression for the theory of canonical transformations. These tools led to an easy quantization of multiperiodic systems and to the Bohr-Kramers theory of perturbations. Fowler also presented the adiabatic principle and the correspondence principle as having great importance—something rather exceptional outside Copenhagen. He treated the BKS theory, the Kramers-Heisenberg dispersion theory, and the sharpened applications of the correspondence principle in detail. Immediately after their publication, and sometimes even before, he reported about Pauli's ambiguous electron, Heisenberg's multimodel theory of multiplets, and the spin hypothesis.^{[35]}
Fowler was also prompt to detect the exceptional qualities of his new student and to encourage his originality. Only six months after his arrival in Cambridge, Dirac started to publish substantial research papers. Whenever the subject had not been imposed on him, he tried to clarify or to generalize in a relativistic way points which he had found obscure in his readings: for instance, the definition of a particle's velocity in Eddington's relativity, or the covariance of the Bohr frequency condition. The main characteristics of his style already showed through: directness, economy in mathematical notation, and little reference to anterior work.^{[36]}
At the end of 1924, after reading Bohr's "Fundamental postulates" (1923) and following suggestions by Fowler and C.G. Darwin, Dirac focused on the more fundamental problem of consolidating and generalizing the adiabatic principle. Burgers's original proof of the adiabatic invariance of action integrals remained incomplete. It strictly required that the fundamental frequencies of the deformed system never become commensurable. But this was clearly impossible, since commensurable frequencies are "dense" among incommensurable ones in the same sense as rational numbers are dense among real ones, and therefore necessarily occur infinitely frequently in a continuous deformation.^{[37]}
Through subtle ε-splitting, Dirac found a rigorous condition of adiabatic invariance, which, fortunately, held in all practical cases. He also touched on other problems in the adiabatic register, like the case of varying magnetic fields or, at Darwin's suggestion, the problem of the invariance of the weights (degrees of degeneracy) of stationary states in the degenerate case. Within a few months he had become an expert in the most sophisticated methods of classical dynamics, especially in the art of action-angle variables.^{[38]}
According to a widespread belief, Dirac lacked interest in the other "principle" of Bohr's theory, the correspondence principle. While generally correct, this is not completely true. In one of his unpublished manuscripts, he did investigate an application of this principle to nonperiodic integrable systems. His intention was to provide a more systematic foundation for previous calculations (for instance by Kramers) of the radiation emitted during collision processes. According to a usual procedure of his, he looked for an invariance property, here the independence of "corresponding" radiation intensities with respect to the choice of first integrals of the system. This property ended up holding only in the limit where there would be agreement between quantum-theoretical and classical intensities. The result was too trivial to warrant publication.^{[39]}
In general Dirac (or his adviser) did not believe that the correspondence principle furnished him with a good opportunity to deploy his mathematical skills. The systematic side of this principle—that is, the set of rules used to derive selection rules and approximate intensities—had already been thoroughly studied; the heuristic side, the deep-lying formal analogy between old and new mechanics, he felt to be too vague to be helpful.
Summary
An important figure of the intellectual milieu in which Dirac grew up was Alfred North Whitehead. Through his studies of the foundations of geometry this philosopher-mathematician was led to the "principle of extensive abstraction," according to which the mathematical concepts of points, lines, planes, and so on are defined by mutual relations suggested by experience, while any intuition of their inner essence is meaningless. In spite of this belief in an inductive origin of the objects of geometry, but in conformity with the axiomatic trend at the turn of the century, Whitehead emphasized the freedom left in the choice of axioms used to supplement the definitions. For example, like Hilbert he recognized the possibility of "non-Pascalian" geometries in which the coordinates of a point do not commute (for instance, these coordinates could be quaternions).
Drawn by his love for projective geometry, in his early Cantabrigian years Dirac attended the scholarly tea parties of Henry Frederick Baker, a mathematician who approved of Whitehead's principle of extensive abstraction, and also of his considerations on noncommutative geometries. Moreover, in his multivolume Principles of geometry Baker frequently called forth "symbolic methods" in which geometric objects were represented by systems of algebraic relations, and geometric proofs reduced to algebraic manipulations. Since adolescence Dirac had been familiar with another example of noncommutative algebra, Hamilton's quaternions. The standard text on this topic, Kelland and Tait's, praised Hamilton's relaxation of commutativity and the consequent conquest of new mathematical territories. In the spirit of Hamilton and Baker, Dirac would quickly welcome noncommutativity when exposed to Heisenberg's quantum mechanics. We shall also observe that the type of relation which he perceived between classical and quantum mechanics was reminiscent of Whitehead's principle of extensive abstraction. Roughly, Dirac's quantum mechanics could be said to be to ordinary mechanics what noncommutative geometry is to intuitive geometry.
In physics Dirac's first passion was for Einstein's relativity. He was initiated into this theory by his philosophy professor at Bristol in 1920-21, the highly respected Charlie Dunbar Broad. Relativity according to Broad was the result of a systematic criticism of the intuitive notions of space and time, a specific anticipation of Whitehead's principle of extensive abstraction. Impressed by the philosophical foundation of general relativity, Broad presented this theory, with its transformation properties and tensor algebra, as the paradigm of any physical theory to come. The other source of Dirac's knowledge of relativity, Eddington's brilliant essays, gave a similar gloss to the subject and emphasized even more than Broad the abstract character of the fundamental "symbols" of relativity. Unlike Broad, however, Eddington believed that the geometry of abstract events could be reached by a priori means and that its physical content could be
"identified" in the end by employing the principle of the permanence of substance (which directed attention to divergence-free tensors).
Like Eddington, Dirac frequently expressed a belief in the a priori necessity of physical laws. But his interest in philosophy was generally so limited that one can only speak of his sympathy with the methodological implications of Eddington's views. In his approach to quantum mechanics, as well as in most of his later work, he tended to first work out the mathematics and then "identify" the physical content. In the first abstract stage, the ultimate guide was his "principle of mathematical beauty," which meant, essentially, that he emphasized and searched for rich transformation properties (as found in Riemannian geometry and in Hamiltonian mechanics). These properties also helped in the second stage, the identification of the physical content of the theory.
Under the influence of his supervisor in Cambridge, Ralph Fowler, Dirac shifted his interest toward quantum theory. Fowler's lectures in this field were exceptionally clear and thorough. They exposed the most sophisticated analytic tools, including action-angle variables. They discussed, in a Bohrian manner, the applications of the adiabatic and correspondence principles and reported the latest advances in the field. In the best of his early work, Dirac deployed his exceptional mathematical skills in extending the most formal aspects of the quantum theory and thus became an expert in the handling of action-angle variables. But he paid little attention to the correspondence principle and did not appreciate its constructive value.
Chapter XII
Queer Numbers
In July 1925 there came to Cambridge a visitor who thought differently about the power of the correspondence principle and had just drawn from it the first elements of a new mechanics. In fact Heisenberg lectured on "Term zoology and Zeeman botanics" at an informal club of young physicists created by Kapitza. The title of this talk referred to the multimodel approach of multiplet theory (see part B, on pp. 205-207). We do not know for sure whether Heisenberg mentioned his more recent inspiration or whether Dirac was present. However, Fowler almost certainly heard of the new kinematics in private conversations, and asked to be kept informed.^{[40]}
Poisson Resurrected
In late August Dirac received from Fowler the proofs of Heisenberg's seminal paper. Even before he was able to judge the relevance of the new scheme, he tried his favorite game, finding a relativistic extension. While this premature attempt fell short, it revealed what Dirac considered to be the essence of Heisenberg's new ideas. First there was a substitution of "Heisenberg's product" for the ordinary product, then an endeavor to maintain as much as possible of the structure of classical dynamics: "The
main point in the present dynamics is that when we have to choose a quantum coefficient, we do so in such a way as to make as many classical relations as possible still true between the quantum quantities."^{[41]}
Another characteristic of Heisenberg's paper, the organic relation between the new kinematics and the structure of the emitted radiation, initially diverted Dirac's attention from more essential features of the theory. In his tentative relativistic extension, he invoked the unidirectional character of the emitted radiation to justify the introduction of the atomic momentum in the labeling of stationary states. In another manuscript he tried to explain the absence of radiation in the fundamental state by introducing a new distinction between two types of "virtual oscillators." The "i-type" with an amplitude q = ae^{iωt} was unable to radiate by itself, if only the general expression of radiated energy was assumed to be A^{2} + B^{2}, where A and B are defined by
In order not to radiate, the fundamental state had to be a pure i-oscillator; the possibility of emission in the other states was then attributed to a corruption of the i-oscillators by "j-oscillators," with an amplitude be^{jωt}, wherein j = −i.^{[42]}
As Dirac quickly realized, this strange idea had every chance to be irrelevant, since it connected subsets of virtual oscillators to definite levels, in the naive fashion
which is not compatible with Heisenberg's product. The manuscript ends with the words: "We cannot, however, put xy (n ) = x (n )y (n ), so that coordinates associated with a stationary state can have only a very restricted meaning."^{[43]}
The title and introduction of the above-mentioned manuscript, "Virtual oscillators," clearly indicate that Dirac originally interpreted Heisenberg's new scheme as a modification of the BKS theory. In this modification the distinction between positive and negative oscillators was erased, but an alternative distinction, that between i- and j-oscillators, was needed to give some insight into the mechanism of radiation. Heisenberg's own
emphasis on radiation properties—the only observable things—probably suggested this misinterpretation. Nevertheless, his careful elimination of the term "virtual oscillator" indicated a fundamental departure from the BKS approach: radiation properties could no longer be connected with a given stationary state, as reflected by the interlocked character of quantum products. After his failed distinction between i - and j -oscillators Dirac also emphasized this impossibility: "The components of a varying quantum quantity are so interlocked . . . that it is impossible to associate the sum of certain of them with a given state."^{[44]}
The Brackets
Having given up on trying to gain insights into the mechanism of radiation, Dirac turned to the more formal side of Heisenberg's scheme, first to the new quantum rule. Since Heisenberg presented this rule as deducible from the high-frequency limit of Kramers's dispersion formula, Dirac naturally went back to the Kramers-Heisenberg paper for a full derivation. On the one hand, he found that Heisenberg's new product already appeared in the dispersion formulae for the incoherent case (see (220) of part B).^{[45]}
On the other hand, he knew well that in Hamiltonian dynamics the first-order perturbation P_{1} of a quantity P_{0} (like the electric moment that was responsible for classical dispersion) could be expressed in the form
wherein εf is the generating function of the first-order canonical transformation connecting old and new action-angle variables. He probably had learned this from Whittaker's Analytical dynamics, or from Fowler's lectures, which used this type of expression in the perturbative treatment of the Stark effect, and in the classical dispersion formula leading to the Kramers-Heisenberg formulae (see (202) of part B). Poisson brackets also occurred in several of Dirac's early manuscripts, even though he might not have remembered that they were named so. Now, according to the Kramers-Heisenberg procedure for translating from the classical dispersion formula, the Poisson bracket had to be translated into a commutator.^{[46]}
This explanation of Dirac's first important discovery in the new quantum mechanics is not an unfounded reconstruction; it may be surmised from a rough calculation found on a back page of a recycled manuscript. The following transcription is the closest possible.^{[47]}
The diagram was obviously taken from the Kramers-Heisenberg paper. In fact, the whole calculation is very similar to that of Kramers and Heisenberg (discussed in equations (214)-(220) of part B). The second line results from the prescription^{[48]}
The factor 2π/ih in the expression of a(n, m) enables us to reestablish its meaning as the quantum amplitude "corresponding" to the harmonic n − m of the classical bracket
Indeed, if
then
where the quantity in parentheses is the exact starting point of Dirac's note. Finally, the h in 2π/ih comes from the translation rule (12).
Most important, Dirac's discovery of the relation between commutators and Poisson brackets appears to have been based on Kramers's procedure of symbolic translation. Therefore, it was directly connected with the previous sharpening of the correspondence principle. Here lies the secret of Dirac's revelation of a structural analogy between old and new mechanics—one more significant than Heisenberg's formal transposition of classical dynamic equations.
In his final paper, however, Dirac adopted a different presentation of the relation between classical and quantum brackets. There he used the correspondence principle backward, from the commutator to the Poisson bracket, and in its narrower but safer acceptation as an asymptotic convergence of quantum relations toward classical ones. The resulting calculation looks artificial, since it is nothing but the original one, read from bottom to top:^{[49]}
which is asymptotically equal to
or
The latter expression is, as we saw, ih/2π times a Fourier coefficient of the Poisson bracket^{[50]}
As immediately noticed by Dirac, the first attractive feature of the Poisson brackets is their canonical invariance: for any choice q, p of the canonical coordinates, they can be expressed as
Moreover, they have the same simple algebraic properties as commutators: antisymmetry, bilinearity, distributivity, and Jacobi's identity, which respectively read:
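For reference, the canonical expression and the four properties just named can be written out in modern notation. This is a sketch rather than a transcription of Dirac's own displays: the curly-bracket symbol {x, y}, the index r labeling the degrees of freedom, and the factor ordering chosen in the product (distributive) rule are assumptions made here.

```latex
\begin{align*}
\{x, y\} &= \sum_r \left( \frac{\partial x}{\partial q_r}\frac{\partial y}{\partial p_r}
            - \frac{\partial x}{\partial p_r}\frac{\partial y}{\partial q_r} \right)
            && \text{(canonical expression)} \\
\{x, y\} &= -\{y, x\} && \text{(antisymmetry)} \\
\{x_1 + x_2,\, y\} &= \{x_1, y\} + \{x_2, y\} && \text{(bilinearity)} \\
\{x_1 x_2,\, y\} &= x_1 \{x_2, y\} + \{x_1, y\}\, x_2 && \text{(distributivity)} \\
\{\{x, y\}, z\} + \{\{y, z\}, x\} + \{\{z, x\}, y\} &= 0 && \text{(Jacobi's identity)}
\end{align*}
```

With the factor ordering kept as written, the same four identities hold when {x, y} is replaced by the commutator xy − yx of noncommuting quantities.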
All of this suggested to Dirac the following assumption:^{[51]} "The difference between the Heisenberg products of two quantities is equal to ih/2π times their Poisson bracket expression. In symbols, xy − yx = (ih/2π){x, y}."
In the case of a canonical pair q, p, for which {q, p} = 1, this rule gave qp − pq = ih/2π.
In this way Dirac reached the canonical form of the new quantum rule independently of Born and Jordan, and in a more profound way, one showing the intimate structural analogy between classical and quantum mechanics. He concluded: "The correspondence between the quantum and classical theories lies not so much in the limiting agreement when h → 0 as in the fact that the mathematical operations on the two theories obey in many cases the same laws." What Heisenberg had judged to be an "essential difficulty" of his new scheme, the noncommutativity of the quantum product, Dirac viewed as having a natural classical counterpart in the Poisson bracket algebra. As Dirac could not have failed to notice, it also had antecedents, even geometrically meaningful ones, in the algebra of quaternions or in Baker's symbols. This prompted him to develop a "quantum algebra," abandoning commutativity but saving associativity and distributivity.^{[52]}
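The algebraic laws that Poisson brackets share with commutators can be checked mechanically on examples. The following minimal sketch, which is not drawn from Dirac's paper, stores polynomials in one canonical pair (q, p) as dictionaries and verifies antisymmetry, the product rule, and Jacobi's identity with exact integer arithmetic. The dictionary representation, the helper names, and the three sample polynomials x, y, z are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# A polynomial in (q, p) is stored as {(i, j): c}, meaning the sum of c * q**i * p**j.

def clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: c for k, c in poly.items() if c != 0}

def add(a, b, sign=1):
    """Return a + sign * b."""
    out = defaultdict(int)
    for k, c in a.items():
        out[k] += c
    for k, c in b.items():
        out[k] += sign * c
    return clean(out)

def mul(a, b):
    """Return the (commutative) product of two classical polynomials."""
    out = defaultdict(int)
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def d_dq(a):
    """Partial derivative with respect to q."""
    return clean({(i - 1, j): i * c for (i, j), c in a.items() if i > 0})

def d_dp(a):
    """Partial derivative with respect to p."""
    return clean({(i, j - 1): j * c for (i, j), c in a.items() if j > 0})

def poisson(a, b):
    """Poisson bracket {a, b} = da/dq * db/dp - da/dp * db/dq."""
    return add(mul(d_dq(a), d_dp(b)), mul(d_dp(a), d_dq(b)), sign=-1)

Q = {(1, 0): 1}
P = {(0, 1): 1}
x = {(2, 1): 1, (0, 1): 3}    # q^2 p + 3p   (hypothetical sample)
y = {(1, 2): 1, (1, 0): -1}   # q p^2 - q    (hypothetical sample)
z = {(3, 0): 1, (0, 2): 1}    # q^3 + p^2    (hypothetical sample)

canonical = poisson(Q, P)                          # the constant polynomial 1
antisymmetry = add(poisson(x, y), poisson(y, x))   # {x,y} + {y,x}
leibniz = add(poisson(mul(x, y), z),
              add(mul(x, poisson(y, z)), mul(poisson(x, z), y)), sign=-1)
jacobi = add(add(poisson(poisson(x, y), z), poisson(poisson(y, z), x)),
             poisson(poisson(z, x), y))
print(canonical, antisymmetry, leibniz, jacobi)
```

An empty dictionary stands for the zero polynomial, so the last three quantities are the residuals of the three identities. The check is only a spot test on sample polynomials, not a proof.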
For the sake of homogeneity of quantum operations, Dirac required every classical operation to have a counterpart in the quantum algebra. Consequently, he introduced a "quantum differentiation" d/dv , with the characteristic property that
Linear realizations of this property, he showed, could always be expressed under the form
For example, the partial derivatives of Hamilton's equations could be represented as commutators in the equations
resulting from the corresponding classical equations
In this elegant manner Dirac dispensed with the awkward mixture of differential and algebraic operations that was being developed with great pain in Göttingen. As Fowler wrote to Bohr: "I think it is a very strong point of Dirac's that the only differential coefficients you need in mechanics are really all Poisson brackets, and that the direct redefinition of the Poisson brackets is better than the invention of formal differential coefficients."^{[53]}
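The argument of this passage can be sketched in symbols. The block below is a reconstruction under stated assumptions, not a copy of the original displays: a denotes a fixed quantum quantity, the dot denotes the time derivative, {x, y} denotes the Poisson bracket, and the sign convention is the one fixed by xy − yx = (ih/2π){x, y}.

```latex
\begin{align*}
% Characteristic (Leibniz) property required of a quantum differentiation
\frac{d}{dv}(xy) &= \frac{dx}{dv}\, y + x\, \frac{dy}{dv} \\
% A linear realization: commutation with a fixed quantity a
\frac{dx}{dv} &= xa - ax \\
% Check that the commutator satisfies the Leibniz property
(xy)a - a(xy) &= x\,(ya - ay) + (xa - ax)\,y \\
% Classical Hamilton equations written with Poisson brackets
\dot q = \{q, H\} &= \frac{\partial H}{\partial p}, \qquad
\dot p = \{p, H\} = -\frac{\partial H}{\partial q} \\
% Their quantum counterparts, obtained by replacing brackets with commutators
\frac{ih}{2\pi}\,\dot q &= qH - Hq, \qquad
\frac{ih}{2\pi}\,\dot p = pH - Hp
\end{align*}
```

The third line expands to xya − axy on both sides, which is why no separate notion of differential coefficient is needed once the bracket has been redefined.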
Action-Angle Variables
On the basis of the extended analogy between classical and quantum mechanics, Dirac hoped to be able to transpose classical methods of resolution of dynamic problems. One method, the introduction of new canonical variables, received an immediate counterpart through the canonical criterion: The variables Q, P shall be canonical if and only if
For systems that were multiperiodic at the classical level, there would presumably be something like quantum action-angle variables (which Dirac preferred to call uniformizing variables).^{[54]}
In a first exploration of this notion, Dirac found it convenient to introduce the canonical variables (similar to the modern creators and annihilators)
In the light of the correspondence principle he requested that the corresponding quantum variables have vanishing matrix elements, except for the elements and with . This condition implies the identities
In order to be canonical the variables have to verify another identity:
Hence,
and
wherein the constant must be taken to be zero in order that all amplitudes may vanish when .
Granted that the classical relation
still holds at the quantum level, the (diagonal) values of the action variables are restricted to J_{r} = n_{ r} h. Implicitly assuming the classical expression of the energy in terms of the J 's, Dirac commented: "This is just the ordinary rule for quantising the stationary states, so that in this case [when relation (35) is true] the frequencies of the system are the same as those given by Bohr's theory." This was too simple to be true, as we shall presently see. Nevertheless, the general tendency to adapt classical methods in the new mechanics proved to be very productive in Dirac's subsequent work.^{[55]}
"The fundamental equation of quantum mechanics" was received in early November 1925 by the editors of the Proceedings of the Royal Society
and was hurried to publication by Fowler. The introduction expressed Dirac's personal view of quantum mechanics:
Heisenberg puts forward a new theory, which suggests that it is not the equations of classical mechanics that are in any way at fault, but that the mathematical operations by which physical results are deduced from them require modification. All the information supplied by the classical theory can thus be made use of in the new theory.
Dirac contrasted this outlook with the one associated with the correspondence principle, which confined the validity of classical equations to the asymptotic case of high quantum numbers and to "certain other special cases." In reality the discovery of the connection between commutators and Poisson brackets was inspired by the conception of correspondence as formal translation earlier developed by Kramers, Born, and Heisenberg under Bohr's guidance. The concomitant formal analogy between classical and quantum mechanics was, though Dirac did not know it, the most perfect expression of the "general tendency" expressed in the latest form of the correspondence principle.^{[56]}
The Canonical Method
How did Dirac's conception differ from that developed in Göttingen? Before seeing Dirac's fundamental paper, Born, Heisenberg, and Jordan had already been aware of the connection between Poisson brackets and commutators. As Kramers communicated to the Dutch Academy in November 1925, Pauli had encountered this relation in the same way as Dirac had, through the classical dispersion formula. However, once the "three men" knew the fundamental equations of quantum mechanics, they stopped referring to their classical origin. They felt that they had in hand an essentially new and self-contained theory, which should be developed from its own axioms with the suitable tools of matrix theory. Instead, Dirac tried his best to transpose the classical methods of solution and apply them to quantum problems. The correspondence between the two theories, he believed, was not limited to the form of the fundamental equations; it concerned mathematical structures, in the modern sense of the word.^{[57]}
From this perspective, the transformation theory of Hamiltonian dynamics and the action-angle variables, suitably adapted, were likely to be
useful in the new theory. As Dirac explained at the Solvay congress of 1927, the operator of the Hamilton-Jacobi theory even anticipated the interlocked character of stationary states in matrix theory, since it connected two infinitely close orbits, in the same way as matrices connected two stationary states.^{[58]}
The analogy, as profound as it might be, was full of traps. As Heisenberg wrote to Dirac in December 1925, quantum mechanics did not simply result from a reinterpretation of the equations of classical mechanics. The very concept of motion had to be changed. Moreover, the formal correspondence between the two mechanics was not as close as Dirac imagined. One could not simply identify all Poisson brackets with commutators without getting into trouble. For instance, the commutator [q^{2}, p^{2}] could be evaluated in two contradictory ways, through an algebraic reduction in terms of the commutator [q, p]:
or directly through the corresponding Poisson brackets:
Fortunately, the correspondence still held for canonical pairs, which was all that Dirac needed for his developments.^{[59]}
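The ordering trouble with [q^{2}, p^{2}] can be made concrete. Algebraic reduction with qp − pq = iħ (ħ = h/2π) gives [q^{2}, p^{2}] = 2iħ(qp + pq), whereas the classical bracket {q^{2}, p^{2}} = 4qp, taken literally as a quantum product, would give 4iħ·qp; the two differ by 2ħ^{2}. The sketch below checks this on a sample function. It assumes the Schrödinger-type realization q = x, p = −iħ d/dx acting on polynomials, which postdates the exchange described here and serves only as a convenient test bed; the helper names and the sample polynomial are hypothetical.

```python
HBAR = 1.0  # units in which hbar = h / (2*pi) = 1 (assumption)

# A polynomial f(x) = sum of f[k] * x**k is stored as the coefficient list f.

def q(f):
    """Position operator: multiply the polynomial by x."""
    return [0j] + [complex(c) for c in f]

def p(f):
    """Momentum operator: -i * hbar * d/dx."""
    return [-1j * HBAR * k * f[k] for k in range(1, len(f))] or [0j]

def apply(ops, f):
    """Apply a product of operators, rightmost first: apply([q, p], f) = q(p(f))."""
    for op in reversed(ops):
        f = op(f)
    return f

def combine(a, f, b, g):
    """Return a*f + b*g for coefficient lists of possibly different lengths."""
    n = max(len(f), len(g))
    f = list(f) + [0j] * (n - len(f))
    g = list(g) + [0j] * (n - len(g))
    return [a * u + b * v for u, v in zip(f, g)]

def is_zero(f, tol=1e-12):
    """True when every coefficient vanishes within the tolerance."""
    return all(abs(c) < tol for c in f)

f = [1, 2, 0, 3]  # sample test function 1 + 2x + 3x^3 (hypothetical)

# Exact commutator (q^2 p^2 - p^2 q^2) f
commutator = combine(1, apply([q, q, p, p], f), -1, apply([p, p, q, q], f))
# Algebraic reduction: 2 i hbar (qp + pq) f
reduced = combine(2j * HBAR, apply([q, p], f), 2j * HBAR, apply([p, q], f))
# Literal transcription of the classical bracket: 4 i hbar (qp) f
naive = combine(4j * HBAR, apply([q, p], f), 0, [0j])

reduction_residual = combine(1, commutator, -1, reduced)
ordering_gap = combine(1, commutator, -1, naive)
print(is_zero(reduction_residual), is_zero(combine(1, ordering_gap, -2 * HBAR**2, f)))
```

The first residual vanishes because the algebraic reduction is exact; the second check confirms that the literal classical transcription misses the commutator by 2ħ^{2} times the test function, a term that disappears only in the limit ħ → 0.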
Heisenberg also reproached Dirac with a more important overestimation of the classical analogy. Contrary to the assumption made at the end of "The fundamental equations," the expression H (J ) of the Hamiltonian in terms of action variables could not be simply adapted from the classical theory; the quantum-mechanical spectrum of multiperiodic systems had to differ, in general, from that given by the Bohr-Sommerfeld theory. Indeed, as a consequence of the noncommutativity of quantum variables, the quantum-mechanical expression of the original configuration variables (q, p ) in terms of the action-angle variables (w, J ) generally differs from the classical one; so does the expression of H as an implicit function of (w, J ) through (q, p ). For instance, for a rotator the action variable is the
angular momentum J around the axis; as a function of this variable, the classical Hamiltonian is ½aJ^{2} while the quantum-mechanical one is ½aaJ ( + 1).^{[60]}
These limitations did not discourage Dirac. In his subsequent papers he adapted more correctly the technique of canonical transformations and action-angle variables, taking properly into account the modifications required by noncommutativity. His general strategy was to find the explicit form of a classical transformation, say (q, p) → (Q, P), and then modify this expression in such a way that (Q, P) would remain canonical in the quantum-mechanical sense, [Q, P] = ih/2π, and so on. Take for instance the transformation from plane Cartesian coordinates and momenta to polar coordinates and momenta. Classically,
Quantum-mechanically, the latter relation must be modified to
With this type of consideration, Dirac solved the hydrogen atom, discussed the composition of angular momenta in atoms with several electrons—imitating the classical procedure of "the elimination of the nodes" in celestial mechanics—and even derived the Compton scattering probability (his result being that later obtained from the Klein-Gordon equation). With the exception of the latter result, which was almost the only one not available from the old quantum theory, similar progress had been made in the Göttingen group, but by three or four men instead of one, and only with a great amount of heavy mathematics and some hints from Hilbert and Courant.^{[61]}
Quantum Algebra
Dirac's successful adaptation of the canonical methods of classical dynamics depended much on his conception of "quantum algebra." Several of the symbolic operations which he performed on quantum variables were, indeed, meaningless from the point of view of Göttingen's authorities. For instance, as Jordan explained in a letter to Dirac, there could be
no matrix (even a continuous one) representing an angle variable, since there is no conjugate operator to operators like the action variables, which have a discrete spectrum and no accumulation point.^{[62]}
q-Numbers
In "The fundamental equations" Dirac had adopted Heisenberg's original definition of quantum variables as arrays of ordinary numbers, and also the interpretation of the polarization matrix as giving radiation intensities. In his next article he adopted a more abstract stance. The quantum variables were "magnitudes of a kind that one cannot specify explicitly." They had to be defined only by the fundamental equations which they obeyed, while their representation in terms of infinite matrices, if any existed, had to be deduced from these equations. To capture the essence of his position in one word, Dirac introduced the "q-numbers," which were defined by their algebraic properties alone: they could be added and multiplied as in a ring; only some of them commuted with all other q-numbers, in which case they were called "c-numbers." Apparently, c stood for "classical," while q stood for "quantum"; but later Dirac suggested that they respectively stood for "commutative" and "queer."^{[63]}
In a spectacular illustration of his strategy, Dirac subsequently derived the existence of a matrix representation for most q-numbers in the case of multiperiodic systems.^{[64]} He first modified his definition of quantum action-angle variables in such a way that they no longer presupposed matrices. Just as in the classical theory, (w, J) would be action-angle variables if and only if the Hamiltonian was a function of J only, and any q-number (save the multiple-valued ones) could be expressed in the form^{[65]}
Consider now two q-numbers x and y and their product xy. We have
and
or
In order to transform the latter expression we first prove the identity
which is valid for any function f expressible as a power series of J. The relation of commutation
implies
or
Equating the n^{th} powers of the two members of the latter equation produces
Then, linearly superposing powers of J and composing the results justifies the identity (45) for power series.
The expression (44) for the product xy now becomes
or
For the sake of transparency change the notation C_{τ}(J) into C(J, J − τh). Then
This symbolic relation, noticed Dirac, becomes a matrix product as soon as the J's are given c-number values nh (i.e., J_{r} = n_{r}h). Therefore, any q-number may be represented by a matrix q_{mn}, wherein n and m refer to two possible values of the action variables J. The action variables J themselves and the energy H(J) are represented by diagonal matrices with diagonal elements corresponding to J = nh. Naturally, the different values of J are assumed to characterize stationary states in Bohr's sense.
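The chain of steps leading from the quantum Fourier form to the matrix product can be sketched as follows. This is a reconstruction rather than a transcription: it assumes one degree of freedom (τ and w become vectors in the multiperiodic case), the commutation convention wJ − Jw = ih/2π, and the labels x_{τ}, y_{τ} for the coefficients of x and y. Only the tag (45) is taken from the text.

```latex
% Assumed convention: w J - J w = ih/2\pi, with J = nh in the stationary states.
\begin{align}
  x &= \sum_{\tau} x_{\tau}(J)\, e^{2\pi i \tau w},
  \qquad
  y = \sum_{\tau} y_{\tau}(J)\, e^{2\pi i \tau w}
  \\
  % consequence of the commutation relation
  e^{2\pi i \tau w}\, J\, e^{-2\pi i \tau w} &= J - \tau h
  \\
  % n-th powers, then linear superposition over powers of J
  e^{2\pi i \tau w}\, f(J) &= f(J - \tau h)\, e^{2\pi i \tau w}
  \tag{45}
  \\
  xy &= \sum_{\tau',\tau''} x_{\tau'}(J)\, e^{2\pi i \tau' w}\,
        y_{\tau''}(J)\, e^{2\pi i \tau'' w}
      = \sum_{\tau',\tau''} x_{\tau'}(J)\, y_{\tau''}(J - \tau' h)\,
        e^{2\pi i (\tau' + \tau'') w}
  \\
  % with C_\tau(J) rewritten as C(J, J - \tau h)
  (xy)(J, J - \tau h) &= \sum_{\tau'} x(J, J - \tau' h)\; y(J - \tau' h, J - \tau h)
\end{align}
```

Under these assumptions the last line has the form of a matrix product whose row and column labels are J and J − τh, which is the point of the change of notation.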
Thus defined, the matrices do not yet exhibit the time dependence implied by the fundamental relation
Dirac remedied this by studying the time derivative of the quantum Fourier exponentials:
Through the identity (45) this transforms into
Taking the derivative of a q-number with respect to time therefore amounts to multiplying its C(J, J − τh) by 2πi times the Bohr frequency Δ_{τ}H/h. In this magic way Dirac recovered Heisenberg's matrix form and Bohr's frequency condition.^{[66]}
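The shift identity and the emergence of the Bohr frequency can be checked on small truncated matrices. The sketch below is only an illustration under stated assumptions: one degree of freedom, units with h = 1, a hypothetical Hamiltonian `energy(J)`, and a shift matrix `E` standing in for the Fourier exponential with τ = 1. None of these names comes from Dirac's paper.

```python
# Truncated-matrix sketch. Assumptions: one degree of freedom, units with h = 1,
# and the sign convention e^{2 pi i w} f(J) = f(J - h) e^{2 pi i w}.
N = 6        # number of stationary states kept (J = 0, h, ..., (N-1)h)
h = 1.0

def matmul(a, b):
    """Product of two N x N matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def diag(values):
    """Diagonal matrix with the given entries."""
    return [[values[i] if i == j else 0.0 for j in range(N)] for i in range(N)]

def energy(J):
    """Hypothetical Hamiltonian H(J); any function of J alone would serve."""
    return 0.5 * J * J + J

J_values = [n * h for n in range(N)]

# E links the state J (row) to the state J - h (column): E[m][n] = 1 when m = n + 1.
E = [[1.0 if m == n + 1 else 0.0 for n in range(N)] for m in range(N)]

H = diag([energy(J) for J in J_values])
H_shifted = diag([energy(J - h) for J in J_values])   # H(J - h)

# Identity (45) for f = H:  E H(J) = H(J - h) E
EH = matmul(E, H)
shifted_E = matmul(H_shifted, E)
identity_holds = all(abs(EH[i][j] - shifted_E[i][j]) < 1e-12
                     for i in range(N) for j in range(N))

# The commutator H E - E H multiplies each nonzero element of E by
# H(J) - H(J - h); dividing by h gives the Bohr frequency of the transition.
HE = matmul(H, E)
bohr_frequencies = [(HE[n + 1][n] - EH[n + 1][n]) / h for n in range(N - 1)]
```

Each entry of `bohr_frequencies` is an energy difference between neighbouring stationary states divided by h, so the time dependence comes out as a property of pairs of states, as in Heisenberg's scheme.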
Dirac still had to show that the polarization matrix in this scheme provided transition probabilities, as originally asserted by Heisenberg. He did this in the following manner. The harmonic development of the quantum electric polarization P is essentially ambiguous, for it can be written in two equally justified forms:
According to identity (45), however, the coefficients C_{τ} of the two forms are related by
This shows that C_{τ}(J) is naturally connected to two stationary states, J = nh and J = (n − τ)h, whereas it was connected only to one stationary state in the classical Fourier development. This suggests, in conformity with Bohr's postulates, that radiation is related to a transition between two stationary states and that the matrix C(J, J − τh) represents the amplitude of the oscillations connected with this transition.^{[67]}
This reasoning of Dirac's reflected a strategy reminiscent of Eddington's principle of identification. It first introduced abstract entities defined only by their mutual relations, the q-numbers, then developed the formal consequences of these relations in such a way as to suggest an identification of their physical meaning. There were, however, some differences. According to Eddington, the primitive relations were dictated by the mind, whereas Dirac obtained them through the classical analogy or, better, through some kind of "extensive abstraction" of the structure of Hamiltonian dynamics. Moreover, the identification of observable quantities was not completely dictated by the mind; it relied on Bohr's postulates and also on the privileging of action-angle variables, which was a remnant of the old form of the correspondence principle.
A Mathematical Digression
The essence of Dirac's approach was to leave the properties of q-numbers open to the needs of future developments that might occur in quantum mechanics. Nevertheless, his interest in the purely mathematical side of his theory led him to introduce supplementary axioms that would enrich the algebra of q-numbers and make it closer to the algebras which he already knew, namely, quaternions and Baker's symbols. For instance, he occasionally admitted that all q-numbers had inverses, and he excluded divisors of zero (i.e., numbers such that qq' = 0 with q ≠ 0 and q' ≠ 0). In a mathematical paper of 1926 he added another axiom that was supposed to be necessary for a proper definition of q-number functions: for any two q-numbers x and y there had to exist a q-number b such that y = bxb^{-1}.^{[68]}
As Léon Brillouin noted in a letter to Dirac, none of these axioms was suited to quantum variables. An operator introduced by Pauli in 1926, the spin-raising operator S_{+} = S_{x} + iS_{y}, furnishes a simple counterexample to the first two axioms. It divides zero since S_{+}^{2} = 0, and it cannot be inverted since a relation S_{+}q = 1 leads to an absurdity once multiplied by S_{+} on the left. Finally, if the last axiom were true, any two quantum variables would have the same spectrum, which is patently untrue. The algebraic properties of q-numbers, Brillouin concluded, could not differ from those of arbitrary matrices.^{[69]}
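The counterexample is easy to verify in the standard two-by-two representation of spin 1/2. The sketch below assumes units in which h/2π = 1, so that the spin components are half the Pauli matrices; the variable names are illustrative only.

```python
# Spin-1/2 matrices in units of h/2pi (S = sigma / 2), as nested lists.
Sx = [[0, 0.5], [0.5, 0]]
Sy = [[0, -0.5j], [0.5j, 0]]

def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Raising operator S+ = Sx + i Sy
S_plus = [[Sx[i][j] + 1j * Sy[i][j] for j in range(2)] for i in range(2)]

square = matmul2(S_plus, S_plus)
determinant = S_plus[0][0] * S_plus[1][1] - S_plus[0][1] * S_plus[1][0]

# S+ is not zero, yet its square vanishes: it is a divisor of zero.
is_nonzero = any(abs(entry) > 0 for row in S_plus for entry in row)
squares_to_zero = all(abs(entry) < 1e-12 for row in square for entry in row)

# A vanishing determinant means no q with S+ q = 1 can exist. Equivalently,
# multiplying S+ q = 1 by S+ on the left would give 0 = S+.
is_singular = abs(determinant) < 1e-12
```

The same matrix thus violates both the exclusion of divisors of zero and the existence of inverses.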
Fortunately, Dirac's attempts to axiomatize the q-numbers did not interfere with their practical use. Despite Brillouin's claim, the q-numbers proved to be more general than Heisenberg's original matrices, since they could cover both discrete and continuous spectra and allowed quantum angle variables that had no matrix representation, and since their applicability was not limited to stationary systems, as exemplified in the calculation of the Compton effect. Above all, Dirac wanted flexibility:
One can safely assume that a q-number exists that satisfies certain conditions whenever these conditions do not lead to an inconsistency, since by a q-number one means only a dummy symbol appearing in the analysis satisfying these conditions. . . . One is thus led to consider that the domain of all q-numbers is elastic, and is liable at any time to be extended by fresh assumptions of the existence of q-numbers satisfying certain conditions, and that when one says that all quantum numbers satisfy certain conditions, one means it to apply only to the existing domain of q-numbers, and not to exclude the possibility of a later extension of the domain to q-numbers that do not satisfy the condition.^{[70]}
Dirac thus set forth a general program by which arbitrary physical situations might be analyzed with q-numbers, the properties of the q-numbers being tailored to fit the physical situations as well as the fundamental equations.
Stagnation
In May 1926 Dirac put together in his dissertation the first fruits of his conception of quantum mechanics. By then he had found nearly all that could be learned from the q-number adaptation of the method of uniformizing variables. There were obvious signs that the magic of this method was being exhausted. Even a problem that was simply treated on the basis of the old quantum theory, the H-atom, received a fairly complicated treatment within the q-number theory, regardless of the high mathematical skills deployed. The very problem that motivated Heisenberg's discovery of matrix mechanics, the calculation of the intensities of hydrogen lines, was no more accessible to Dirac than it was to the Göttingen group.
In general (with the questionable exception of the Compton effect), Dirac's methods could not be used to treat more problems than the old quantum theory, precisely because they were nothing but a noncommutative reformulation—should we say complication?—of the methods of this theory.
There was a more fundamental obstacle which Dirac disclosed in the late spring of 1926, either before or right after his first use of the Schrödinger equation: if, in the spirit of Heisenberg's theory, matrices refer only to observable processes, there cannot be any action-angle representation of them in the case of atoms with more than one electron. Let indeed m and n be two similar quantum numbers referring to two electrons in a given atom. According to a natural extension of Heisenberg's observability principle, stationary states differing only by a permutation of m and n should be identified, since there is no observable difference between the transitions mn → m'n' and mn → n'm'. Consider now the Fourier exponential e^{2πi(τ · w)} corresponding to the transition mn → m'n'. Then the Fourier exponential e^{2πi(2τ · w)} corresponds to a transition mn → m''n'', with
If m'n' is to be identified with n'm', one might as well have written
Here comes the absurdity: the values of m''n'' deduced from each system cannot refer to the same stationary state since they are neither identical nor related through a permutation. With this ingenious argument Dirac closed a first chapter of his involvement in the history of quantum mechanics.^{[71]}
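The arithmetic behind the absurdity can be illustrated with a few integers. The sketch below rests on one reading of the argument, namely that doubling the Fourier exponential repeats the same jump in quantum numbers, so that m'' − m' = m' − m and n'' − n' = n' − n. The numerical labels and function names are hypothetical.

```python
def double_step(start, end):
    """State reached from `start` when the jump start -> end is made twice over."""
    return tuple(2 * e - s for s, e in zip(start, end))

def same_state(a, b):
    """States differing only by a permutation of the two labels are identified."""
    return sorted(a) == sorted(b)

start = (1, 2)          # (m, n)
end = (3, 5)            # (m', n')
end_swapped = (5, 3)    # (n', m'): the same stationary state under the identification

target_a = double_step(start, end)           # uses the labelling m'n'
target_b = double_step(start, end_swapped)   # uses the labelling n'm'

# The two labellings of one and the same state lead to different final states.
consistent = same_state(target_a, target_b)
```

Here `target_a` is (5, 8) and `target_b` is (9, 4), which are neither identical nor permutations of each other, so `consistent` is `False`.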
Summary
In the fall of 1925, Dirac scrutinized Heisenberg's fundamental papers and perceived three essential elements: the new quantum product, the endeavor to maintain as many classical relations as possible, and the direct connection between the quantum amplitudes and the properties of the emitted radiation. Misled by the latter point, Dirac originally interpreted Heisenberg's theory as a modification of the BKS theory and tried to draw
from it a new conception of virtual oscillators. But he quickly abandoned this line of thought and addressed a more fruitful question: Where did Heisenberg's new quantum rule come from? In his paper, Heisenberg pointed to the possibility of deducing his quantum rule from the high-frequency limit of Kramers's dispersion formula. Consequently, Dirac went back to the Kramers-Heisenberg paper (or to Fowler's account of it) and observed that the dispersion formula was the symbolic translation of a Poisson bracket (Poisson brackets are differential expressions involving two dynamic quantities: they appear when considering infinitesimal transformations in Hamiltonian mechanics, and they enjoy remarkably simple algebraic properties). Together with Heisenberg's remark, this led him to postulate that quantum mechanics was obtained by expressing the fundamental equations of mechanics in terms of Poisson brackets, and by replacing the brackets by purely algebraic expressions, the commutators (divided by ih/2π).
This conception implied a deep structural analogy between classical and quantum mechanics, from which Dirac drew maximum profit. First of all, his "fundamental equations" were expressed in a very homogeneous form, one involving only algebraic operations (except for time differentiation), whereas the mechanics developed in Göttingen awkwardly mixed algebraic and differential operations (with respect to matrix coordinates!). On the practical side, Dirac imagined a quantum-mechanical analogue of the canonical methods for solving mechanical problems, particularly an analogue of the powerful technique of action-angle variables. This led him, within a semester, to results comparable, and sometimes superior, to those obtained in Göttingen. At the end of 1925 (a little after Pauli) he solved the hydrogen atom, and, soon after, he derived the algebra of angular momenta and made a relativistic calculation of Compton scattering probabilities.
The superiority of Dirac's method lay in his personal appraisal of the classical analogy in the new mechanics. While Göttingen's physicists judged that this analogy had been integrated, once and for all, into the foundation of the theory, Dirac believed that it could still be used profitably in the development of the theory. Accordingly, Dirac exaggerated the analogy between classical and quantum mechanics. He initially underplayed the revolutionary character of quantum mechanics and asserted that only the physical interpretation, but not the equations of classical mechanics, was at fault. Heisenberg corrected him: the revolution affected the very concept of motion (kinematics); furthermore, the formal analogy between the two mechanics was not quite as close as Dirac first thought. One could not,
without contradiction, replace all Poisson brackets of the classical theory with commutators, and the energy expression in terms of action variables was not the same in the classical and the quantum cases, contrary to what Dirac suggested in his first paper on quantum mechanics. But to Dirac these were only points of rigor, which did not affect his general view or strategy.
The success of Dirac's adaptation of classical methods depended on another unique aspect of his approach, namely his notion of quantum algebra. In an Eddingtonian manner, Dirac formulated the fundamental equations of quantum mechanics in a purely abstract way, without having previously interpreted the symbols entering these equations. The symbols, or "q-numbers," were defined only by their mutual relations, which for him constituted a "quantum algebra." The physical interpretation of these symbols occurred in two steps. The introduction of a quantum analogue of action-angle variables first suggested a matrix representation of the symbols; then the matrices were identified with collections of transition amplitudes, as suggested by some formal properties and a touch of "correspondence." This strategy was reminiscent of Whitehead's extensive abstraction, insofar as the relations defining the symbols were abstracted from ordinary mechanics; and it was akin to Eddington's principle of identification insofar as it purported to deduce the interpretation of the symbols from their formal properties. Dirac's symbols, however, in contrast with Whitehead's geometric objects, were not interpreted on the basis of their empirical origin; and contrary to Eddington's symbols of the world, they could not be interpreted without comparing the theory with an already interpreted theory of the same phenomena, Bohr's old quantum theory.
Dirac did some purely mathematical work on the quantum algebra, in the course of which he ventured to subject q-numbers to axioms similar to those found in Baker's Principles and Kelland and Tait's Quaternions. Some of these axioms did not suit quantum mechanics, as quickly noticed by Jordan and Brillouin. But Dirac already knew that the guilty axioms were not necessary to his practical calculations. In general, he wished to maintain a certain flexibility in his notion of q-number: the algebra had to be adapted to the needs of the developing theory. Also he did not require rigorous mathematical definitions of the objects he was manipulating; it was sufficient for him that the symbolic operations performed on these objects would not lead to contradiction.
By the spring of 1926 Dirac's progress had amazed all his colleagues; yet it seemed to have reached a peak. The method of transposing classical methods indeed had a defect: essentially, it could only solve problems that
had a solvable counterpart in the old quantum theory based on classical orbits. To make it worse, through a very ingenious argument Dirac proved that action-angle variables could not exist for quantum-mechanical systems containing two or more indistinguishable particles (this case includes all atoms beyond hydrogen!). A new method had to be found to solve the fundamental equations of quantum mechanics.
Chapter XIII
Quantum Beauty
Schrödinger's Equation
The News
From March 1926 on, Erwin Schrödinger published a series of memoirs on a new theory of quantization based on his famous equation. In a conception derived from de Broglie's, stationary states were identified with stationary modes of electron waves in atoms. The corresponding calculations required none of Göttingen's transcendent algebra; they rested on a mathematical technique well known to anyone versed in the classical theory of waves. Within this new framework Schrödinger could very quickly and simply solve many of the standard problems of quantum theory. Surprisingly, the results appeared identical with those given by matrix mechanics, whenever comparison could be made.^{[72]}
Dirac's first reaction to this spectacular invention was essentially negative. A wave theory of matter, he thought, had to be just as inconsistent as the wave theory of light. Moreover, there was no need for a new quantum mechanics, since there already was one, the foundation of which he did not question. A letter from Heisenberg of May 1926 changed his opinion. It explained how the Schrödinger equation could be used as a tool to derive matrices satisfying the fundamental equations of quantum mechanics. One just had to solve this equation (the time-independent version,
which was the only one available at that time) for eigenfunctions ψ_{n} with the energies E_{n},
and form the matrix
associated with the quantum variable g (q, p ) according to the rule
While such a development is possible only if the functions ψ_{n} span the space of ψ-functions (that is, if the condition of "completion" is met for the original wave equation), Heisenberg, unencumbered by this type of consideration, immediately proceeded to prove his assertion.^{[73]}
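In modern notation the recipe of Heisenberg's letter can plausibly be written as follows. The tags follow the equation numbers cited in the text, but the explicit forms are a reconstruction from the surrounding description, not a transcription. The symbol \bar g_{mn} for the time-independent part of the matrix and the sign of the exponent are assumptions, the latter chosen to agree with the convention (ih/2π)ġ = gH − Hg.

```latex
\begin{align}
  H\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\psi_n(q)
    &= E_n\,\psi_n(q)
    \tag{60}
  \\
  g_{mn}(t) &= \bar g_{mn}\; e^{2\pi i (E_m - E_n)\, t / h}
    \tag{61}
  \\
  \bar g_{mn} &= \int \psi_m^{*}(q)\;
    g\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)
    \psi_n(q)\, dq
    \tag{62}
\end{align}
```

With p represented by (h/2πi)∂/∂q, the combination qp − pq acting on any ψ returns (ih/2π)ψ, which is how the commutation relation follows in this scheme.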
First of all, the matrix associated with the product xy of any two quantum variables is the product of the matrices respectively associated with x and y. This is a trivial result for the modern reader since the equation (62) just says that the matrix so formed represents an operator in the basis of the ψ_{n}'s. In this scheme the relation
immediately follows from (62) and
Finally, the time dependence of matrices introduced ad hoc in equation (61) warrants, as usual, the result that
The above consideration is so simple that one may wonder why the Schrödinger equation was not derived from the original quantum mechanics before it was inferred from de Broglie's notion of matter waves. At
Göttingen, theoreticians had been for some time inhibited by Heisenberg's doctrine of observability, which confined quantum mechanics to the methods of matrix algebra. In early 1926 Born, Wiener, and Lanczos attempted to remove this restriction, and would probably have reached the Schrödinger equation (they came very close to it) if only there had been enough time before Schrödinger's publication. On the other side of the Channel, Dirac would have been in the best position to discover the new wave equation since his conception of q -numbers was not bound to any specific representation. For instance, he could have noticed that a differential operator made a good representation of the momentum quantum variable. However, he did not, because his intellectual adventure remained restricted by the analogy with the method of uniformizing variables, as we just saw.
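Heisenberg's recipe for building matrices out of eigenfunctions can be tried numerically on a simple case. The sketch below uses the harmonic oscillator, a choice made here for convenience and not taken from the letter, in units where h/2π, the mass, and the frequency are all 1. It forms the matrices of x and x² by integrating over the eigenfunctions, then checks that the matrix of the product equals the product of the matrices.

```python
import math

def hermite_functions(x, count):
    """Normalized oscillator eigenfunctions psi_0 .. psi_{count-1} evaluated at x."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if count > 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for n in range(1, count - 1):
        # Three-term recurrence for the orthonormal Hermite functions
        psi.append(math.sqrt(2.0 / (n + 1)) * x * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi

def matrix_of(g, size, span=10.0, steps=4000):
    """g_mn = integral of psi_m(x) g(x) psi_n(x) dx (trapezoidal rule, real psi)."""
    dx = 2.0 * span / steps
    mat = [[0.0] * size for _ in range(size)]
    for k in range(steps + 1):
        x = -span + k * dx
        weight = dx * (0.5 if k in (0, steps) else 1.0)
        psi = hermite_functions(x, size)
        gx = g(x)
        for m in range(size):
            for n in range(size):
                mat[m][n] += weight * psi[m] * gx * psi[n]
    return mat

X = matrix_of(lambda x: x, 5)        # matrix of x on the first five states
X2 = matrix_of(lambda x: x * x, 3)   # matrix of x^2 on the first three states

# Matrix product restricted to the 3 x 3 corner; x only links neighbouring
# states, so five intermediate states are enough for these elements.
product = [[sum(X[m][k] * X[k][n] for k in range(5)) for n in range(3)]
           for m in range(3)]
max_deviation = max(abs(product[m][n] - X2[m][n])
                    for m in range(3) for n in range(3))
```

The deviation is at the level of rounding error, and the element `X[0][1]` agrees with the familiar oscillator value 1/√2.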
The Crop
In compensation for not having found it, both Heisenberg and Dirac made faster progress in exploiting the Schrödinger equation than did Schrödinger himself. In his letter to Dirac, Heisenberg announced that he already knew how to solve the helium atom. Within the three following months Dirac reached no less important results. As usual, he put a touch of relativity in the new equation, that is to say, he extended the substitution
to
This gives instead of equation (60) the so-called "time-dependent Schrödinger equation"
which Schrödinger obtained at the same time through less straightforward means.^{[74]}
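The extension of the substitution rule and its consequences can be sketched as follows. This is a reconstruction under the assumption that (q, t) and (p, −E) are treated as relativistic companions; the explicit signs follow from that assumption and are not transcribed from the source.

```latex
\begin{align}
  p \;\to\; \frac{h}{2\pi i}\frac{\partial}{\partial q}
  \quad&\text{extended to}\quad
  -E \;\to\; \frac{h}{2\pi i}\frac{\partial}{\partial t}
  \\
  H\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\psi(q,t)
    &= \frac{ih}{2\pi}\,\frac{\partial \psi}{\partial t}(q,t)
  \\
  % stationary solutions, for which the right-hand side reduces to E_n \psi_n
  \psi_n(q,t) &= \psi_n(q)\; e^{-2\pi i E_n t / h}
\end{align}
```

With these signs, the product ψ_m^{*}(q,t) ψ_n(q,t) carries the factor e^{2πi(E_m − E_n)t/h}, so the time dependence of Heisenberg's matrices no longer has to be put in by hand.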
The stationary solutions leading to the energy spectrum are the ones for which
Consequently, they evolve in time according to
The functions ψ_{n} now directly engender Heisenberg's time-dependent matrix according to a rule similar to (62):
Dirac published this most adequate presentation of the relation of Schrödinger's equation to his quantum mechanics in the first part of a paper entitled "On the theory of quantum mechanics." In the second part he gave his argument about the impossibility of action-angle variables for several indistinguishable particles, together with a reference to Heisenberg's observability principle:
In Heisenberg's matrix mechanics it is assumed that the elements of the matrices that represent the dynamical variables determine the frequencies and intensities of the components of the radiation emitted. The theory thus enables one to calculate just those quantities that are of physical importance, and gives no information about quantities such as orbital frequencies that one can never hope to measure experimentally. We should expect this very satisfactory characteristic to persist in all future development of the theory.^{[75]}
Along the same line, Dirac argued that for a pair of (noninteracting) electrons a single wave function had to be associated with a given pair of individual stationary states m and n, in order that no distinction could be made between the transitions mn → m'n' and mn → n'm'. Accordingly, the simplest type of wave function for the compound system had to be
In general, Dirac took the wave function of a system of identical particles to be a totally symmetrical or antisymmetrical function of the positions of the particles. The symmetrical case led to the Bose-Einstein statistics, while the antisymmetrical one led to Pauli's exclusion principle and to the Fermi gas, according to procedures that have now become standard.
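The two symmetry types and the origin of the exclusion principle can be illustrated with a pair of one-particle states. In the sketch below the states are modes of a particle in a box of unit length, a choice made here purely for illustration, and the pair functions are left unnormalized.

```python
import math

def orbital(n):
    """Hypothetical one-particle state: box mode sqrt(2) sin(n pi x) on 0 < x < 1."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)

def pair_state(m, n, sign):
    """psi_m(1) psi_n(2) + sign * psi_n(1) psi_m(2), with sign = +1 or -1."""
    pm, pn = orbital(m), orbital(n)
    return lambda x1, x2: pm(x1) * pn(x2) + sign * pn(x1) * pm(x2)

symmetric = pair_state(1, 2, +1)
antisymmetric = pair_state(1, 2, -1)

x1, x2 = 0.3, 0.7

# Exchanging the two particles leaves the symmetric function unchanged
# and reverses the sign of the antisymmetric one.
symmetric_change = symmetric(x2, x1) - symmetric(x1, x2)
antisymmetric_change = antisymmetric(x2, x1) + antisymmetric(x1, x2)

# Two particles in the same one-particle state: the antisymmetric
# combination vanishes identically (Pauli's exclusion principle).
same_state_value = pair_state(2, 2, -1)(x1, x2)
```

In either case the states mn and nm are represented by a single function, up to a sign, so no transition can tell them apart.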
Up to that point, Dirac's recourse to the Schrödinger equation was simply a new way to achieve a specific matrix representation of his q-numbers. In a subsequent paper on the Compton effect he noted: "The wave equation is used merely as a mathematical help for the calculation of the matrix elements, which are then interpreted in accordance with the assumptions of matrix mechanics." The last part of "On the theory of quantum mechanics" contained, however, a first departure from this limited viewpoint. There Dirac considered the case of a time-dependent Hamiltonian, more specifically the one corresponding to an atom subjected to an electromagnetic field. He calculated the function ψ through a now standard perturbative method, the first step being the development of the wave function in terms of the stationary solutions of the unperturbed Schrödinger equation:
ψ = Σ_{n} c_{n} ψ_{n}
Then he interpreted the squared modulus |c_{n}|^{2} as the number of atoms to be found in the stationary state n, when a large assembly of atoms is subjected to the perturbation. "We take |c_{n}|^{2} instead of any function of c_{n}," he commented, "because . . . this makes the total number of atoms remain constant." Indeed, the hermiticity of the Hamiltonian operator implies the constancy of Σ_{n} |c_{n}|^{2}.^{[76]}
With this rule Dirac derived Einstein's B coefficients for induced atomic transitions. Quantum mechanics, once equipped with Schrödinger's equation, was now able to say something about radiation processes (although not yet anything about spontaneous emission). The procedure leading to this progress once again had some resemblance to Eddington's principle of identification. A new equation, the time-dependent Schrödinger equation, was first introduced by formal considerations; then a permanence property, the conservation of the norm Σ_{n} |c_{n}|^{2} of the wave function, oriented the discussion of the physical meaning of the solutions.
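The permanence property can be seen at work in a two-level model. The sketch below is not Dirac's perturbative calculation: it integrates the equations for the coefficients c_{n} directly, with a hypothetical oscillating coupling and units in which h/2π = 1, using a Cayley (Crank-Nicolson) step that is unitary whenever the Hamiltonian matrix is Hermitian.

```python
import math

def cayley_step(c, t, dt, e1=0.0, e2=1.0, v=0.2, omega=1.0):
    """Advance (c1, c2) by dt under i dc/dt = H(t) c, with H taken at the midpoint.

    H(t) = [[e1, v cos(omega t)], [v cos(omega t), e2]] is a hypothetical
    Hermitian Hamiltonian for an atom in an oscillating field.
    """
    coupling = v * math.cos(omega * (t + 0.5 * dt))
    H = [[e1, coupling], [coupling, e2]]
    a = 0.5j * dt
    # Right-hand side (1 - a H) c
    r0 = c[0] - a * (H[0][0] * c[0] + H[0][1] * c[1])
    r1 = c[1] - a * (H[1][0] * c[0] + H[1][1] * c[1])
    # Solve (1 + a H) c_new = r by Cramer's rule
    m00, m01 = 1 + a * H[0][0], a * H[0][1]
    m10, m11 = a * H[1][0], 1 + a * H[1][1]
    det = m00 * m11 - m01 * m10
    return [(r0 * m11 - m01 * r1) / det, (m00 * r1 - m10 * r0) / det]

c = [1.0 + 0j, 0.0 + 0j]     # all atoms start in state 1
t, dt = 0.0, 0.01
for _ in range(2000):
    c = cayley_step(c, t, dt)
    t += dt

populations = [abs(amplitude) ** 2 for amplitude in c]
total = sum(populations)
```

The field transfers population from one state to the other, while `total` stays equal to 1 up to rounding error. That is the property which singles out |c_{n}|^{2} among the functions of c_{n}.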
Transformations
After this quick and rich crop of fundamental results, Dirac pondered about the general interpretation of quantum mechanics. There were basically two ways to draw observable information from the fundamental
equations of quantum mechanics. One could either construct matrices that would then be interpreted à la Heisenberg, or one could try to guess a direct interpretation of Schrödinger's ψ. As we just saw, Dirac initially favored the first approach, although with his interpretation of the c's he already had started using the second.
Partial Interpretations
In the last semester of 1926, Dirac became aware of several new points of contact between theory and observation, derived either from the matrix camp or from the wave camp. Originally, Schrödinger regarded the ψ-function as describing some substantial oscillation within the atom. He quickly realized, however, that this "heuristic viewpoint" could not be taken too literally, since for a quantum system involving more than one particle, the oscillations no longer occurred in the ordinary physical space but in the 3n-dimensional configuration space associated with the n particles. In his fourth installment of wave mechanics, completed in June 1926, he therefore reinterpreted the wave function in the following way:
|ψ|^{2} is a kind of weight-function in the configuration space of the system. The wave-mechanical configuration of the system is a superposition of many, strictly speaking of all, configurations allowed in the mechanics of a point. . . . For the ones who like paradoxes, the system can be said to occupy all kinematically conceivable positions at the same time, but not "in the same degree."^{[77]}
Nevertheless, Schrödinger still regarded the fluctuation of the electric density (in ordinary space) calculated through ψ as real, certainly more real than the one attached to the corpuscular picture. There was in his opinion more truth in the continuous evolution of the ψ than in the quantum leaps of the matrix theory.^{[78]}
In the same month, Max Born compromised between particles and waves. His purpose was to give a quantum-mechanical treatment of collisions between atoms and particles. As we saw in part B (p. 253), after the Bothe and Geiger experiment disproving the BKS theory he had tried with Jordan to come back to Einstein's and Slater's original idea of a wave guiding light quanta. Born had given up this attempt for about a year when he decided to use it as a heuristic analogy for the wave-mechanical collision problem. Just as free light quanta were guided by a plane monochromatic "ghost" wave, the free asymptotic motion of colliding particles had to be represented by a plane monochromatic Schrödinger wave; the distribution of scattering angles was obtained by developing the scattered wave in terms of such plane waves:
After some hesitation Born decided that |a(k)|^{2} (not |a(k)|) would give the scattering probability in the direction of k. In a slight generalization, if ψ is developed in a basis of stationary states ψ_{n} according to
then |c_{n}|^{2} determined the relative probability of the state n (Born used the word Häufigkeit, which refers to the statistical conception of probability), as in Dirac's perturbation theory. Accordingly, quantum mechanics gave no deterministic prediction of an observable quantity, the scattering angle. Born believed this feature to be fixed and central, in conformity with his earlier intuition that the world was essentially a kind of lattice (see part B, p. 196): "I myself am inclined to give up determinism in the atomic world." For the consolation of deterministic thinkers he then introduced the notion of statistical determinism: "The motion of particles is ruled by probability laws, but the probability itself propagates in accordance with the causal laws."^{[79]}
In a letter to Heisenberg of 19 October 1926, Pauli combined Born's idea of a probability wave with Schrödinger's "weight functions" in configuration space: the expression
he wrote, had to give the probability for the system to be found in the configuration q_{1}, q_{2}, . . ., q_{n} (actually within a little volume dq_{1}dq_{2} . . . dq_{n}). Closer to Born's collision case, one could also build a probability density in the momentum space by taking the multiple Fourier transform of ψ. The latter step was naturally suggested by the existence of a dual form of the Schrödinger equation enjoying the same conservation property as the original equation:^{[80]}
for which
Pauli further remarked that Born's scattering probabilities were intimately connected with a special matrix already introduced in the Born-Heisenberg-Jordan paper. Up to a proportionality coefficient they were the matrix elements of the interaction potential with respect to the stationary states of the unperturbed Hamiltonian. This suggested a deep-lying connection between Heisenberg's original interpretation of matrices and the new probabilistic interpretation of Schrödinger's waves.
During the following month Heisenberg made some progress in clarifying the connection. His motivation was to show that Schrödinger's continuum theory was unsuited for giving a correct intuitive understanding of the internal energy fluctuations taking place when two atoms are interacting. Suppose two identical atoms originally in the stationary states m and n to be weakly coupled. A resonant interaction takes place, which according to Heisenberg corresponds to discontinuous exchanges of the energy values E_{m} and E_{n} between the two atoms. In this view the energy of one of the atoms can take only the values E_{m} and E_{n}, whereas in Schrödinger's view it can take all intermediate values given by the continuous ψ-evolution.^{[81]}
There was a way, Heisenberg argued, to decide between the two conceptions. Matrix mechanics could certainly not give the energy H_{1} of one of the atoms as a function of time, but it was nonetheless able to determine the probability for this energy to take a given value. One just had to assume that, for any dynamic variable, the diagonal elements of the corresponding matrix gave its average value in the various stationary states of the global system; according to this rule, one could derive the average value of any function of H_{1}, for example the moments and therefore the probability distribution of H_{1}. The result read
in conformity with Heisenberg's intuition of discontinuous switches.
Moreover, the probability distribution of H_{1} was explicitly given by the squared moduli of the elements of a matrix S that had been introduced in the three-men paper and related the matrices g of the dynamic variables of the unperturbed system (no coupling between the two atoms) to the
matrices G of the corresponding variables of the perturbed system according to
There is no need here to give the details of Heisenberg's argument, since they will result from Dirac's much more general investigation of relations of the above type.
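A two-state caricature shows how the diagonal elements yield a two-valued distribution. The sketch assumes that, for two identical atoms in resonance, the stationary states of the coupled system are the symmetric and antisymmetric combinations of "atom 1 in m, atom 2 in n" and its exchange, so that the transformation matrix S has all elements of modulus 1/√2. The energy values are hypothetical, and the sketch makes no claim to reproduce the details of Heisenberg's paper.

```python
import math

E_m, E_n = 1.0, 3.0          # hypothetical energies of the two atomic states
s = 1.0 / math.sqrt(2.0)

# Columns of S: stationary states of the coupled system, expressed in the
# unperturbed basis (atom 1 in m, atom 1 in n). S is real and orthogonal here.
S = [[s, s],
     [s, -s]]

def average(f, state):
    """Diagonal element of S^T f(H1) S for one stationary state of the whole system."""
    return sum(S[k][state] * f(E) * S[k][state]
               for k, E in enumerate((E_m, E_n)))

# Moments of the energy H1 of the first atom in the symmetric state
moments = [average(lambda E, p=p: E ** p, 0) for p in (1, 2, 3)]

# Moments of a variable taking only the values E_m and E_n, each with the
# probability given by the squared elements of S
probabilities = [S[k][0] ** 2 for k in range(2)]
expected = [probabilities[0] * E_m ** p + probabilities[1] * E_n ** p
            for p in (1, 2, 3)]
```

All moments coincide with those of a variable that jumps between E_{m} and E_{n} with probability 1/2 each, and none of them requires intermediate energy values.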
The Formal Apparatus
When he set out to elaborate his own interpretation of quantum mechanics, Dirac was aware of Schrödinger's fourth memoir, had heard Born speak on collisions at the Kapitza club (on 29 July 1926), and had seen the manuscript of Heisenberg's fluctuation paper. He was dissatisfied with the multiplicity of partial, disconnected, and sometimes contradictory interpretations. But he saw an important advance in Heisenberg's considerations, for they indicated how to derive probability distributions from the original matrix formulation of quantum mechanics.^{[82]}
However, Heisenberg had limited his interpretative inquiry to simple examples, a result of his search for an intuitive understanding of quantum mechanics. Dirac, following his Eddingtonian bent, returned to his fundamental equations of quantum mechanics and formulated the preliminary interpretative problem very generally: as the search for c-numbers connected with the q-numbers satisfying these equations. Since the action-angle variables could no longer help in this problem, one needed to consider all matrix systems capable of representing the q-numbers, without including the restriction that action variables should be diagonal. Even Heisenberg's limitation to a diagonal (total) energy matrix had to be avoided, since it introduced a premature interpretative element.
The fundamental equations read
(in order to present the equations more compactly, they will be written for the case of one degree of freedom, even though the discussion will refer to the general case). Dirac first noticed that one goes from one matrix
representation of these equations to another through a transformation of the type
Interestingly, Dirac had learned much earlier from Göttingen's theoreticians that a relation of this type existed between any two sets of canonical coordinates (in Heisenberg's representation). But at that time he believed the remark to be of no practical use, for it had no obvious classical counterpart.^{[83]} In late 1926 he was completely freed from the inhibitory effects of his desire to maintain a close classical analogy, and regarded instead the form (80) of transformation as most essential.
Representations are likely to be of physical interest, Dirac went on, only if they make a given set ξ of dynamic variables diagonal. The set is said to be complete if there is only one representation for which it is diagonal. Dirac found it convenient to represent the transformations from one representation to another by symbols (ξ'/α') with c-number values, wherein ξ' and α' represent the diagonal elements of two complete sets ξ and α. In this notation the equation (80) becomes
(to the mathematicians' horror, g_{ξ'ξ''} was not to be regarded as the same function of ξ', ξ'' as g_{α'α''} is of α', α'').
Although this type of formula seems to be limited to the case of continuous spectra, Dirac took it to also cover discrete and mixed spectra, assuming the integral to represent a sum in the discrete case. The continuous case itself called for a few mathematical tricks. First of all, in order to be able to write the transformation (ξ'/ξ'') corresponding to the identity (b = 1), Dirac introduced the now famous "δ-function," some kind of limit of sharply peaked functions such that
for any (regular) function f . Then the choice
makes (81) an identity, as required. As usual, Dirac did not worry about a rigorous mathematical construction (Laurent Schwartz's distributions later provided such a construction). To him, as to his precursor Heaviside, it was sufficient that the symbolic manipulations of the δ-function did not lead to contradictions.^{[84]}
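The role of the δ-function here can be illustrated with a short sketch. The displayed forms are assumptions written in modern notation: the sifting property is the standard one, and the transformation law is reconstructed from the notation (ξ'/α') introduced above rather than copied from the text.

```latex
% Assumed defining property of the delta-function, for any regular f:
\[
  \int f(x)\,\delta(x - a)\,dx = f(a) .
\]
% Assumed form of the transformation law (81) between the xi- and alpha-schemes:
\[
  g_{\xi'\xi''} = \iint (\xi'/\alpha')\,d\alpha'\;g_{\alpha'\alpha''}\;d\alpha''\,(\alpha''/\xi'') .
\]
% Identity transformation (alpha = xi, b = 1):
\[
  (\xi'/\xi'') = \delta(\xi' - \xi'') .
\]
% Check: with this choice the double integral collapses onto itself.
\[
  \iint \delta(\xi' - \alpha')\,g_{\alpha'\alpha''}\,\delta(\alpha'' - \xi'')\,d\alpha'\,d\alpha''
  = g_{\xi'\xi''} .
\]
```

In the discrete case the integrals become sums and the δ-function reduces to Kronecker's symbol, so the same check is simply multiplication by the unit matrix on both sides.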
The next simplest matrix after the identity is the one representing ξ in its own scheme. It must be a diagonal matrix with elements ξ', which leads to the expression
assuming the δ-function to generalize Kronecker's symbol δ_{ij} in the continuous case. Dirac further showed that the matrix η_{ξ'ξ''}, when η is canonically conjugate to ξ, is given by
The proof required a few δ-gymnastics performed on the commutator [ξ, η]:^{[85]}
Performing an integration by parts on the last integral gives
The second term of this integral cancels the first integral in (86), while the first provides iħ δ(ξ' - ξ''), that is, iħ times the identity matrix, which completes the proof.
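The commutator check can be sketched compactly as follows. This is a variant of the argument, not the two-integral layout described above: it assumes the matrix elements ξ_{ξ'ξ''} = ξ' δ(ξ' - ξ'') and η_{ξ'ξ''} = -iħ δ'(ξ' - ξ''), with the sign chosen so that ξη - ηξ = iħ, and it uses the identity x δ'(x) = -δ(x), which is itself an integration by parts.

```latex
% Assumed matrix elements in the xi-scheme:
%   xi_{xi' xi''}  = xi' delta(xi' - xi''),
%   eta_{xi' xi''} = -i hbar delta'(xi' - xi'').
\[
  (\xi\eta - \eta\xi)_{\xi'\xi''}
  = \int \bigl[\xi_{\xi'\xi'''}\,\eta_{\xi'''\xi''} - \eta_{\xi'\xi'''}\,\xi_{\xi'''\xi''}\bigr]\,d\xi'''
\]
% Each delta-function removes one integration:
\[
  = -i\hbar\,\xi'\,\delta'(\xi' - \xi'') + i\hbar\,\xi''\,\delta'(\xi' - \xi'')
  = -i\hbar\,(\xi' - \xi'')\,\delta'(\xi' - \xi'') .
\]
% Identity x delta'(x) = -delta(x), from
%   \int x\,\delta'(x)\,f(x)\,dx = -\int \delta(x)\,[f(x) + x f'(x)]\,dx = -f(0):
\[
  (\xi\eta - \eta\xi)_{\xi'\xi''} = i\hbar\,\delta(\xi' - \xi'') .
\]
```

The result is iħ times the identity matrix, in agreement with the canonical commutation rule.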
On the basis of the general transformation formula Dirac could calculate the matrices ξ and η in a representation where the rows refer to ξ and the columns to α:
More generally, any function g(ξ, η) expressible as a sum of products of ξ and η allowed the mixed representation:
In order to prove this identity, it is sufficient to show that when it holds for f and g , it also holds for f + g and fg . The first part of the latter assertion being trivial, we are left with the second one:
The identity (88) played a central role in Dirac's transformation theory. As a first outstanding application of it, let us choose g to be the Hamiltonian H, ξ the position coordinates, and α a complete set of dynamic variables commuting with H:
Using a relation similar to (87), we also have
and, therefore,
In this way Dirac could have discovered the Schrödinger equation before Schrödinger, if only he had earlier exploited the freedom of representation of his fundamental equations. At least he was able to realize a posteriori that Schrödinger's ψ was nothing but the transformation from a scheme in which the position variables are diagonal to one in which the energy is diagonal. It further appears that there are as many such equations as there are choices of ξ, since the above deduction is not limited to the case of position coordinates.
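The chain of reasoning from the mixed matrix elements to the Schrödinger equation can be sketched in modern notation. The forms of (87) and (88) below are assumptions reconstructed from the surrounding description (ħ is Planck's constant over 2π, and H(α') denotes the value of H when α = α'); they are offered as a plausible reading, not as the text's own displays.

```latex
% Assumed mixed matrix elements of xi and eta (rows: xi, columns: alpha), cf. (87):
\[
  \xi_{\xi'\alpha'} = \xi'\,(\xi'/\alpha'), \qquad
  \eta_{\xi'\alpha'} = -i\hbar\,\frac{\partial}{\partial\xi'}\,(\xi'/\alpha') .
\]
% Assumed form of the identity (88) for any sum of products of xi and eta:
\[
  g(\xi,\eta)_{\xi'\alpha'}
  = g\!\left(\xi',\,-i\hbar\,\frac{\partial}{\partial\xi'}\right)(\xi'/\alpha') .
\]
% Product step of the induction: if (88) holds for f and g, then, using
% f_{xi' xi''} = f(xi', -i hbar d/dxi') delta(xi' - xi''),
\[
  (fg)_{\xi'\alpha'}
  = \int f_{\xi'\xi''}\,g_{\xi''\alpha'}\,d\xi''
  = f\!\left(\xi',\,-i\hbar\,\frac{\partial}{\partial\xi'}\right)
    g\!\left(\xi',\,-i\hbar\,\frac{\partial}{\partial\xi'}\right)(\xi'/\alpha') .
\]
% Application: g = H, with H diagonal in the alpha-scheme.
\[
  H_{\xi'\alpha'}
  = \int (\xi'/\alpha'')\,H(\alpha'')\,\delta(\alpha'' - \alpha')\,d\alpha''
  = H(\alpha')\,(\xi'/\alpha') .
\]
% Equating this with (88) for g = H gives the time-independent Schroedinger equation:
\[
  H\!\left(\xi',\,-i\hbar\,\frac{\partial}{\partial\xi'}\right)(\xi'/\alpha')
  = H(\alpha')\,(\xi'/\alpha') .
\]
```

Read this way, the transformation (ξ'/α') plays the part of the wave function and H(α') that of the energy eigenvalue.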
Finally, the Dirac-Schrödinger evolution for a transformation (ξ'/α'), when α is a constant of the motion (dα/dt = 0),
follows from the fundamental equation
This is easily proved by calculating the matrix elements of the two members of this equation in the α-scheme:^{[86]}
while
The function f and the values of α' and α'' being arbitrary, the expressions are identical if and only if
These two equations are equivalent in any Hermitian scheme for which
Combined with the identity (88), they lead to the time-dependent Schrödinger equation (94), as stated.
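The steps of this proof can be sketched as follows. The sketch assumes that the fundamental equation has the form iħ df/dt = fH - Hf, that the matrix f_{ξ'ξ''} of a given function f(ξ, η) is a fixed function of ξ' and ξ'', and that all time dependence of the α-scheme elements therefore sits in the transformations; these are reconstructions, not quotations.

```latex
% Left member in the alpha-scheme (time dependence carried by the transformations):
\[
  i\hbar\,\frac{d}{dt}\,f_{\alpha'\alpha''}
  = \iint \Bigl[\, i\hbar\,\frac{\partial(\alpha'/\xi')}{\partial t}\,f_{\xi'\xi''}\,(\xi''/\alpha'')
  + (\alpha'/\xi')\,f_{\xi'\xi''}\,i\hbar\,\frac{\partial(\xi''/\alpha'')}{\partial t} \Bigr]\,d\xi'\,d\xi'' .
\]
% Right member in the alpha-scheme:
\[
  (fH - Hf)_{\alpha'\alpha''}
  = \iint \Bigl[\, (\alpha'/\xi')\,f_{\xi'\xi''}\,H_{\xi''\alpha''}
  - H_{\alpha'\xi'}\,f_{\xi'\xi''}\,(\xi''/\alpha'') \Bigr]\,d\xi'\,d\xi'' .
\]
% Term-by-term identification:
\[
  i\hbar\,\frac{\partial(\xi'/\alpha')}{\partial t} = H_{\xi'\alpha'}, \qquad
  i\hbar\,\frac{\partial(\alpha'/\xi')}{\partial t} = -H_{\alpha'\xi'} .
\]
% Hermitian scheme: (alpha'/xi') = (xi'/alpha')^*, so the two conditions are complex conjugates.
% With the assumed identity (88), H_{xi' alpha'} = H(xi', -i hbar d/dxi') (xi'/alpha'), hence
\[
  i\hbar\,\frac{\partial}{\partial t}\,(\xi'/\alpha')
  = H\!\left(\xi',\,-i\hbar\,\frac{\partial}{\partial\xi'}\right)(\xi'/\alpha') .
\]
```

The last line has the form of the time-dependent Schrödinger equation, with the transformation (ξ'/α') in the place of the wave function.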
The formal apparatus of quantum mechanics was now thoroughly unified, Schrödinger's equation being harmoniously blended with the fundamental equations ruling q-numbers. According to Dirac's own criteria, the whole theory was impressively beautiful, for it displayed a transformation apparatus as elegant and powerful as those of relativity or Hamiltonian dynamics.
Dirac was now ready to attack the interpretation problem proper. He started with the following words:
To obtain physical results from the matrix theory, the only assumption one needs make is that the diagonal elements of a matrix, whose rows and columns refer to the ξ's say, representing a constant of integration, g say, of the dynamical system, determine the average values of the function g(ξ, η) over the whole of η-space for each particular set of numerical values for the ξ's in the same way in which they certainly would in the limiting case of large quantum numbers.^{[87]}
There is an unfortunate obscurity in what Dirac meant by the "limiting case" here. The following is a very plausible interpretation.
As a first remark, the term "constant of integration" in the above extract is misleading. Dirac just means that he is considering a dynamic variable at a given instant of time, as made clear by an earlier footnote in the paper. If so, the matrix element g_{ξ'ξ'} is independent of the choice of the Hamiltonian H. Its interpretation is obtained by exploiting this independence and provisionally considering the Hamiltonian to commute with all the ξ's. Then g_{ξ'ξ'} represents the time average of g, according to Heisenberg's previous interpretation of matrices, or from a comparison with the high quantum-number limit in the Bohr-Sommerfeld theory; besides, the conjugate variables η are, in the Bohr-Sommerfeld theory, phases (or "angles" in the action-angle formulation) varying linearly in time, so that the time-averaging is here identical with an η-averaging. In the general case for which H and ξ do not necessarily commute, it is therefore natural to assume that g_{ξ'ξ'} represents the average of g when ξ = ξ' and η is uniformly spread.
Very cleverly, Dirac deduced a complete interpretation of the quantum formalism from this seemingly limited assumption. His magic wand was the identity
which simply results, for any complete set g of dynamic variables, from
The function δ(g - g') is nonzero only for g ≈ g'. Therefore its "η-average" for ξ = ξ' is nothing but the fraction of the η-space for which g = g' when ξ = ξ'. In other words, the squared modulus of the transformation (ξ'/g') is the relative probability that g = g' knowing that ξ = ξ'. According to Dirac, this answered all the questions "to which the quantum theory [could] give a definitive answer." These questions, he added, were "probably the only ones to which the physicist could give an answer."^{[88]}
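The identity behind this argument can be sketched in a few lines. The sketch assumes the transformation law in the form used for matrices generally, a Hermitian scheme with (g'/ξ') = (ξ'/g')^*, and the diagonal form of δ(g - g') in the scheme where g itself is diagonal; all three are reconstructions in modern notation.

```latex
% In the g-scheme, any function of g is diagonal; in particular
\[
  \delta(g - g')_{g''g'''} = \delta(g'' - g')\,\delta(g'' - g''') .
\]
% Transforming the diagonal element to the xi-scheme:
\[
  \delta(g - g')_{\xi'\xi'}
  = \iint (\xi'/g'')\,\delta(g'' - g')\,\delta(g'' - g''')\,(g'''/\xi')\,dg''\,dg'''
  = (\xi'/g')\,(g'/\xi') .
\]
% Hermitian scheme, (g'/xi') = (xi'/g')^*:
\[
  \delta(g - g')_{\xi'\xi'} = \bigl|(\xi'/g')\bigr|^{2} .
\]
```

By Dirac's assumption the left-hand side is the η-average of δ(g - g') at ξ = ξ', which is why the squared modulus on the right can be read as a relative probability (per unit range of g').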
Welcome To Copenhagen
Dirac completed his transformation theory in December 1926 in Copenhagen, where he was treated like a hero. There he learned that Pascual Jordan had formulated a similar theory, though from a different point of view. Instead of studying transformations from one matrix representation of the fundamental equations to another, Jordan examined transformations from one canonical pair, say (ξ, η), to another, say (α, β), in a given representation. His theory was similar to Dirac's insofar as it led to unitary operators b generating the canonical transformations according to
Superficially, this point of view might have seemed closer to the transformation theory of classical dynamics, which also related canonical pairs. In reality Jordan departed much more from the classical model than did Dirac: he defined canonical conjugation not by commutation rules transposed from the Poisson algebra but by broader axioms at the quantum level. For instance, anticommuting variables like the spin operators S_{x} , S_{y} were conjugate in Jordan's sense.^{[89]}
Dirac could not be sympathetic to such a wild deviation from classical canons. Later he even rejected some interesting products of Jordan's conception like the notion of anticommuting quantized fields.^{[90]} One might wonder, however, why he did not view transformations as relating canonical pairs, since this was, after all, the conception that dominated his own work before the advent of Schrödinger's equation. The reason might well have been psychological: to somebody who had just discovered that his dear uniformizing variables failed to solve the problem of atoms with more than one electron, other canonical transformations had little chance to stand at the foreground of a fundamental exposition of quantum mechanics. That Jordan somehow succeeded in adopting this conception did not persuade Dirac to change his position. In his first lectures on quantum
mechanics he did try to lay out his competitor's theory, but he quickly returned to his own, which he found simpler and more elegant.^{[91]}
There is one feature of Dirac's original transformation theory that is likely to surprise the modern quantum physicist: the notion of state vector is completely absent. It was in fact introduced later by Weyl and von Neumann, and subsequently adopted by Dirac himself. In 1939 Dirac even split his original transformation symbol (ξ'/α') into two pieces, ⟨ξ'| and |α'⟩, the "bra" and the "ket" vectors.^{[92]} The mathematical superiority of the introduction of state vectors is obvious, since it allows (albeit not without difficulty) an explicit construction of mathematical entities (rigged Hilbert spaces) that justify Dirac's symbolic manipulations. There was also a more physical advantage to the notion of state vector: it placed the superposition principle in the foreground, which pleased Bohr, who set wave-particle duality at the core of complementarity. Perhaps modern-day interpreters of quantum mechanics should nevertheless remember that there exists a formulation of quantum mechanics without state vectors, and with transition amplitudes (transformations) only.
In this original conception Dirac had nothing to call the state of a system, except the old (q, p ) configuration. This state of affairs conditioned his conclusion to "The physical interpretation of the quantum dynamics":
It may be mentioned that the present theory suggests a point of view for regarding quantum phenomena rather different from the usual ones. One can suppose that the initial state of a system determines definitively the state of the system at any subsequent time. If, however, one describes the state of the system at an arbitrary time by giving numerical values to the coordinates and momenta, then one cannot actually set up a one-one correspondence between the values of these coordinates and momenta initially and their values at a subsequent time. All the same one can obtain a good deal of information (of the nature of averages) about the values at the subsequent time considered as functions of the initial values. The notion of probabilities does not enter into the ultimate description of mechanical processes; only when one is given some information that involves a probability (e.g. that all points in η-space are equally probable for representing the system) can one deduce results that involve probabilities.^{[93]}
Such a view was still too conservative to please Copenhagen authorities. Once again Dirac was charged with having overplayed the classical analogy. Nevertheless, the success of the transformation theory was immediate, with respect to both interpretation and application of quantum mechanics. From the transformation connecting two conjugate variables,
Heisenberg deduced the uncertainty relations; and he showed that the resulting limitations in the definition of conjugate quantities exactly corresponded to the concrete limitations of double measurement processes. On the more practical side, the transformation theory gave a general method of quantizing everything, since, contrary to Heisenberg's original matrix mechanics, it was completely independent of the nature of the dynamic system under consideration. Dirac's radiation theory, published in early 1927, was the first of a long list of spectacular successes resulting from this method.^{[94]}
As Dirac would have said, Nature was being seduced by mathematical beauty. The transformation theory equaled the aesthetic qualities that he had earlier contemplated in classical theories. In the fall of 1927 Dirac explained this to his first students in quantum mechanics:
The quantum theory has now reached a form . . . in which it is as beautiful, and in certain respects more beautiful than the classical theory. This has been brought about by the fact that the new quantum theory requires very few changes from the classical theory, these changes being of a fundamental nature, so that many of the features of the classical theory to which it owes its attractiveness can be taken over unchanged into the quantum theory.^{[95]}
Summary and Conclusions
In March 1926 Schrödinger published the first of a series of memoirs in which he tried to reduce atomic theory to a mechanics of matter waves à la de Broglie. Dirac's first reaction was negative, for he already had placed his hopes in another quantum mechanics. Yet, taking a suggestion of Heisenberg, he promptly exploited the Schrödinger equation, if only as a "mathematical help in calculating matrix elements." In the summer of 1926 he thus reached some of the basic notions of modern quantum mechanics. Most important, in the name of Heisenberg's principle of observability he introduced symmetrical and antisymmetrical wave functions of the configuration of a set of identical particles, and proceeded to connect respective symmetry classes with Bose's statistics on the one hand, and Pauli's exclusion principle on the other. This provided the general basis both for quantum statistics and for the calculation of properties of atoms with several electrons (but Heisenberg was the one who first solved the helium atom within the new mechanics).
In another important innovation, Dirac presented a method for treating time-dependent perturbations; with it he could derive general expressions for Einstein's transition probabilities. In this case he gave the Schrödinger waves a more direct interpretation (not via Heisenberg's matrices), as a means to calculate the (statistical) probability of the system to be in a given stationary state. Characteristically, he presented this interpretation as being suggested by an invariance property (the invariance of the norm of the wave function).
In the summer of 1926 there were several other contributions to the interpretation of the two or three distinct quantum formalisms that had arisen, proposed both by Schrödinger and by the Göttingen group. Schrödinger, retreating somewhat from his original mechanistic conception of matter waves, now regarded the wave function as a (nonstatistical) sort of "weight function" in the configuration space of the system. While treating the problem of particle scattering, Born related the (outgoing) wave function to the scattering probability. Pauli, blending Schrödinger's notion of a weight function with Born's statistical conception, interpreted the squared modulus of the wave function as giving the probability for the system to be found in a specified configuration, and he vaguely suggested a connection between these probabilities and the transition probabilities of Heisenberg's theory. Finally, Heisenberg made the latter connection entirely explicit within the context of a suggestive example, the energy fluctuation of an atom when coupled with an identical atom (quantum-mechanical resonance): the relevant probability was obtained by taking the squared modulus of the elements of the unitary matrix connecting the stationary states of the coupled system to those of the uncoupled system.
In the fall of 1926 Dirac decided to bring some order to this proliferation of partial interpretations. In accordance with Eddingtonian methodology, and in contrast with Heisenberg, he did not start from specific physical examples but explored instead the transformation properties of the fundamental equations of his quantum mechanics (still the ones he had devised in the fall of 1925). He was now in a position to fully exploit the freedom of representation of q-numbers inherent in his conception of quantum algebra. Dropping Heisenberg's restriction to a matrix scheme in which the energy matrix is diagonal, he studied the general set of bilinear transformations mutually connecting all possible matrix schemes and proved by symbolic means that Schrödinger's wave function was just a particular case of transformation. This showed that both matrix mechanics and wave mechanics were implicitly contained in his fundamental equations.
For the interpretation of his general formalism, the only assumption Dirac made was that there existed a limited "correspondence" with classical theory. Through an extremely ingenious argument based on transformation properties, he could show that this assumption was sufficient to derive a general interpretation of matrices and transformations. The standard question Dirac's theory could answer, in the simple case of one degree of freedom, was: What is, for a given value of a dynamic coordinate q , the relative probability of the values which a given dynamic quantity can take if the conjugate coordinate p is uniformly distributed? Dirac believed this was the only type of question physicists could answer. In this sense his interpretation was a statistical one; influenced, however, by the classical analogy, he still imagined the state of the system to be represented (at a given time) by definite coordinates q and p . In this view, the theory of transformations just implied that it was fundamentally impossible to predict unambiguously the state of the system at a subsequent time.
Bohr and Heisenberg soon persuaded Dirac to give up the "fiction" of a definite p and q. Accordingly, in his later presentations of quantum mechanics Dirac abandoned the notion of a (q, p) state, and adopted the formal notion of "state vector" proposed by the Göttingen mathematicians (Weyl and von Neumann).
A transformation theory partly and formally similar to Dirac's was simultaneously invented by Jordan in Göttingen. In general, Jordan tended to build quantum mechanics on autonomous axioms, without reference to classical theory. As a result, his notion of canonical conjugation, in contrast with Dirac's, did not necessarily correspond to the classical one. However, his idea of a transformation was in some respects closer to the classical idea of canonical transformation than Dirac's was. In this case, Dirac distanced himself from the classical analogy, presumably because his early attempts to adapt canonical transformations in quantum mechanics had become stagnant in the spring of 1926. At any rate, Dirac's transformation theory was more elegant than Jordan's; it was easier to apply (at least in its creator's hands), and as a result of the classical analogy it involved a more restrictive concept of canonical conjugation. These virtues
explain to a large extent the "miracles" Dirac subsequently performed in the contexts of radiation theory and relativistic quantum mechanics.
That Dirac's success owed much to the classical analogy was obvious and explicit. It remains to be seen to what extent his use of analogies was related to the correspondence principle. Dirac discovered the connection between Poisson brackets and commutators by examining Kramers and Heisenberg's procedure of symbolic translation, which itself derived from a sharpening of the correspondence principle. And his use of classical mechanics as a template for the construction of the new theory can be seen as a mathematical version of Bohr's continual appeal to formal analogies between classical and quantum theory. In this sense Dirac fulfilled Bohr's old prophecy of a "rational generalization" of the classical theory.
However, some essential aspects of Dirac's method were foreign to Bohr's strategy of correspondence. For instance, the way Dirac connected the algebra of Poisson brackets with the algebra of commutators was more similar to the mathematicians' notion of isomorphism than to the formal or symbolic analogies cultivated in Copenhagen. Further, in his approach to the transformation theory Dirac was inspired by another type of classical analogy, one in which he tended to imitate the relativistic strategy of theory building. As we saw, he first developed, at an abstract level, the transformation properties of his fundamental equations and then used these properties in identifying the physical content of the formalism. Since to him this was the royal road to fundamental theories, his greatest satisfaction as a theoretical physicist was obtained in the creation of the transformation theory. More generally, he relished the thought that quantum mechanics, in its genesis and expression, could compete in beauty with the greatest classical monuments, Hamiltonian mechanics and general relativity.