PART A
PLANCK'S RADIATION THEORY

― 3 ―

Introduction

Most of Planck's early work was carried out with the principal goal of proving that the second law of thermodynamics was strictly valid and that the entropy of a closed system always increases. Accordingly, he first rejected kinetic gas theory, for it considered the second law to be only statistically valid—until he came to develop his radiation theory on the basis of an analogy with Boltzmann's gas theory. One may wonder how any reasonable theoretician could draw his inspiration from the theory which he is trying to disprove. A plausible answer could be that Planck was converted to Boltzmann's ideas. But he believed too much in the absolute validity of the entropy law to do so. The key that gave him access to the formal apparatus of Boltzmann's theory was in fact a reinterpretation of this theory in nonstatistical terms.^[1]

According to Planck, the central concept of both radiation and kinetic gas theory had to be that of "elementary disorder." In his opinion the main difficulty encountered in these theories was that in the derivation of equations for the evolution of directly observable quantities from fundamental electrodynamic or mechanical processes, there were terms depending on uncontrollable details of the state of the system, for instance the position of individual molecules or the electromagnetic field at a precise point of space. According to Boltzmann, these terms really existed, but they could be neglected when considering the statistical behavior of a large

[1] Allan Needell was the first to point out the central importance of the idea of absolute irreversibility in Planck's work before 1914 (Needell 1980, 1988).

― 4 ―

number of exemplars of the system. The resulting evolution of the directly observable quantities (the one given by the Boltzmann equation) was irreversible, but only in a statistical manner. Instead, according to Planck's notion of elementary disorder the unknown structural details of the system, for instance the structure of the walls of the container or the internal structure of electric resonators, had to be adjusted in such a way that the unwanted terms completely disappeared. This warranted a strictly deterministic (and irreversible) evolution of "directly observable quantities."^[2]

The relation between Planck's and Boltzmann's work in thermodynamics, then, is a subtle and intricate one, and an elucidation of it will be one of the principal goals of this chapter. Planck's appropriation of some of Boltzmann's computational methods has often misled his modern readers, who generally understand these problems from the point of view of statistical thermodynamics, which is essentially Boltzmann's. To a reader aware of the pitfall of incommensurability, Planck's approach will appear far more coherent and conservative than usually assumed.

The analogies used by Planck in his radiation theory were drawn from a reinterpreted version of Boltzmann's theory. Yet, in any analogy there is a risk of overestimating the similarities between the systems compared. Planck certainly did. In Boltzmann's irreversibility theorem, not only was irreversible behavior derived but the final equilibrium state of the system was shown to be unique. Planck initially believed that such uniqueness also held for the electrodynamic system which he considered in his radiation theory. More specifically, he thought he could show that Wien's law was the only possible distribution for thermal radiation. Under the pressure of new empirical data, however, he came to realize that any thermal radiation law was compatible with his irreversibility theorem.

At this stage, Planck thought of adapting the analogy between his and Boltzmann's theory to another method of determining the equilibrium state of a system, through Boltzmann's quantitative relation between entropy and "probability." Naturally, he did this within the context of his reinterpretation of Boltzmann's theory: he freed Boltzmann's "probability" from its original ties with the statistical conception of the entropy law

[2] "Elementary disorder" was the generic expression used by Planck to characterize molecular chaos and natural radiation in his lectures on radiation theory: see Planck 1906, e.g., on 134: "The assertion that in nature every state and process involving a great number of uncontrollable elements is elementarily disordered provides the condition and also the strict guarantee for the unequivocal determination of measurable processes in both mechanics and electrodynamics and for the validity of the second principle of thermodynamics."

― 5 ―

and interpreted it instead as a quantitative measure of the elementary disorder that warranted strict irreversible behavior.

Here the reinterpretation had considerable effects. Most important, we shall see that it permitted finite energy-elements to appear in the final expression of entropy (whereas they disappeared in Boltzmann's original method); at the same time it allowed maintaining the continuous equations for the evolution of the electrodynamic system, without apparent inconsistency. More generally, Planck's quantum hypothesis was meant to complete the existing electrodynamic theories, not to contradict them. Closely connected to elementary disorder, this hypothesis found its logical place in the uncontrollable details of electrodynamic systems and left untouched the laws ruling directly observable quantities.^[3]

Altogether, tight connections existed between the central concepts of Planck's radiation theory, namely: absolute irreversibility, disorder, entropy, and energy quanta. By 1905, however, Einstein perceived an inconsistency in the corresponding reinterpretation of Boltzmann's theory. In his opinion the separation between directly observable quantities and internal structure necessary to Planck's idea of disorder could not be maintained. One then had to return to orthodox statistical thermodynamics, and this led, in the case of Planck's electrodynamic system, to absurd results. The observed properties of thermal radiation, Einstein concluded, could not be explained without a sharp break from ordinary electrodynamics. Nevertheless, the formal skeleton of Planck's derivation of his blackbody law remained valid. This should be seen as a virtue of the symbolic part of Planck's analogies, the resulting equations being "more clever than their inventor," as Born once put it.^[4]

[3] That Planck did not introduce a "quantum discontinuity" in 1900 was first asserted in Kuhn 1978.

[4] Born to Bohr, undated (early 1926), AHQP: "For the time being our mathematical formulae tend to be more clever than we are. The formulae come to us quite naturally, but the interpretation is often difficult."

― 7 ―

Chapter I
Concepts of Gas Theory

When Planck worked out his radiation theory, he relied on an analogy with Boltzmann's gas theory. The key conceptual issues in the latter theory are best understood in light of their historical source, Maxwell's kinetic theory of gases. The following is a critical discussion of some of Maxwell's and Boltzmann's results.

Maxwell's Collision Formula

In the mid-nineteenth century James Clerk Maxwell was prominent in developing the kinetic theory of gases, a subject just then beginning to flourish. Like his precursor in the field, Rudolf Clausius, he conceived of a gas as a set of very small "molecules" animated with a continual motion. A molecule in a sufficiently dilute gas was supposed to travel along a straight line, except when it was redirected by short collisions with other molecules or with the walls of a container. Any quantitative theory of the observable effects of such collisions, for instance of pressure or of viscosity, required an evaluation of the number of collisions of a given kind.^[5]

In 1866, through a seemingly obvious reasoning, Maxwell gave a precise mathematical expression for this number, later known to German-speaking theorists as the Stosszahlansatz. The corresponding formula turned out to provide the starting point for most subsequent kinetic

[5] See, e.g., Brush 1976.

― 8 ―

theories, and these confirmed its validity in many concrete cases. However, the precise formulation of the conditions of its applicability soon became an outstanding conceptual problem of physics. Not only the empirical predictions of the kinetic theory but also, as we shall see, the nature of thermodynamic irreversibility crucially depended on the solution of this problem.^[6]

In his "dynamical theory of gases" Maxwell first examined the case of a chemically and spatially homogeneous gas and introduced the following hypothesis of dilution:

We shall suppose that the time during which a molecule is beyond the action of other molecules is so great compared with the time during which it is deflected by that action, that we may neglect both the time and the distance described by the molecules during the encounter, as compared with the time and the distance described while the molecules are free from the disturbing force. We may also neglect cases in which three or more molecules are within each other's spheres of action at the same instant.^[7]

Lacking detailed information on intermolecular forces, Maxwell treated the molecules either as hard spheres or as centers of force. In the latter case a collision of two molecules, denoted "1" and "2," may be represented as a simple deflection of "2" in a reference frame fixed to "1" (fig. 1).^[8] In the case of central forces the collision "kind" is characterized by the azimuth φ of the plane of the trajectory of "2" in this reference frame, and by the angle θ between the initial and final relative velocities, which is a definite function of the impact parameter b (and of the initial relative velocity v₂ − v₁).

Let it be agreed that a collision starts when "2" crosses a conventional plane Z perpendicular to the relative velocity v₂ − v₁. Then, for "2" to collide with "1" within the time dt, and with a kind (θ, φ), defined with the uncertainty (dθ, dφ), it must be located within the "efficient volume" shaded in figure 2. In order to obtain the number of such collisions occurring in a unit volume of a homogeneous gas of identical molecules, Maxwell simply multiplied the measure |v₂ − v₁| dt dφ b db of this volume by the expression f(v₁) d³v₁ f(v₂) d³v₂, giving the number of pairs of molecules per square unit of volume with velocities v₁ and v₂, up to d³v₁ and d³v₂.

[6] See ibid., 583-640.

[7] Maxwell 1867, Scientific Papers , 37.

[8] In his original reasoning Maxwell used the reference frame of the center of gravity of the two molecules instead of the reference frame of the molecule "1." The notations used in the sequel are of course not Maxwell's.

― 9 ―

Figure 1.
Deflection of a molecule "2" by a molecule "1," as seen from an observer fixed to "1."

Figure 2.
The "efficient volume" for colliding molecules.

This gives

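$$ dn = f(\mathbf{v}_1)\, f(\mathbf{v}_2)\, |\mathbf{v}_2 - \mathbf{v}_1|\, b\, db\, d\varphi\; d^3v_1\, d^3v_2\, dt \tag{1} $$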

In the case of more complex interactions Maxwell used a slightly more general form:

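$$ dn = f(\mathbf{v}_1)\, f(\mathbf{v}_2)\, |\mathbf{v}_2 - \mathbf{v}_1|\, \sigma(\zeta)\, d\Omega\; d^3v_1\, d^3v_2\, dt \tag{2} $$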

― 10 ―

where dΩ is an element of solid angle in the direction ζ defined by θ and φ, and σ has the dimension of a surface.^[9]
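The passage from the first form of the collision number to the more general one can be checked on the hard-sphere case. The following is only a sketch under stated assumptions: the spheres are identical, their diameter is written d (a symbol not used in the text), θ and φ denote the deflection angle and the azimuth, and σ and dΩ denote the surface factor and the solid-angle element of the general form.

```latex
% Hard spheres of diameter d (assumed symbol, not in the source).
% Impact parameter as a function of the deflection angle \theta:
b = d \cos\frac{\theta}{2}, \qquad 0 \le \theta \le \pi .
% Differentiating, |db| = \tfrac{d}{2}\sin\tfrac{\theta}{2}\, d\theta, so that
b\,|db|\,d\varphi
  = \frac{d^{2}}{2}\cos\frac{\theta}{2}\sin\frac{\theta}{2}\, d\theta\, d\varphi
  = \frac{d^{2}}{4}\,\sin\theta\, d\theta\, d\varphi
  = \sigma\, d\Omega ,
\qquad
\sigma = \frac{d^{2}}{4}, \quad d\Omega = \sin\theta\, d\theta\, d\varphi .
% Consistency check: \int \sigma\, d\Omega = 4\pi \cdot d^{2}/4 = \pi d^{2},
% the geometric cross-section of two spheres of diameter d.
```

In this special case σ is a constant with the dimension of a surface, and it does not depend on the sign of θ.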

The seeming naturalness of Maxwell's reasoning obscures an important difficulty. For a given target-molecule "1," the number of molecules in its "efficient volume" must almost always be zero, for it has been implicitly assumed that the time dt is so small that no third molecule perturbs a molecule "2" traveling within this volume. Consequently, the relative spatial distribution of the molecules "2" cannot be considered to be uniform at the scale of the efficient volume. But this distribution depends on the choice of the molecule "1," and an average must be formed with respect to this choice (keeping however the velocity v₁ within d³v₁). Maxwell's Ansatz implicitly assumes that the distribution resulting from this averaging is uniform, so that the number of pairs of molecules for which the second molecule belongs to the efficient volume of the first is proportional to the value of the efficient volume.

As Boltzmann and Maxwell's British successors would later find, the latter assumption is not always allowed. One can imagine microscopic configurations of the gas for which the number of collisions is not given by Maxwell's formula. For example, let us assume that at a given instant the velocities of nearest neighbor pairs of molecules point toward one another. Then, the number of collisions in a subsequent time interval will greatly exceed the value given by Maxwell, even if the spatial distribution of molecules is as uniform as possible.^[10]

This example clearly shows the gap in Maxwell's reasoning: For every choice of the molecule "1" with a velocity within d³v₁ and for a not too small value of dt, there is one molecule "2" in the total (integrated over dΩ) efficient volume of "1." Therefore, the average spatial distribution of the molecules "2" (in the sense earlier defined) is far from being uniform; it is more concentrated in the efficient volume than elsewhere.

One would search Maxwell's writings in vain for such critical considerations. His Ansatz sounded obvious, and it had the essential advantage of making the number of collisions dependent only on a coarse description of the dynamic state of a gas, namely, a description by a continuous distribution, f(v) or f(r, v), in the configuration space of a molecule. Finer, physically inaccessible details of the molecular description were rendered irrelevant.

This state of affairs allowed Maxwell to draw important consequences from his collision formula. Every transport phenomenon (e.g., transport

[9] Maxwell 1867, Scientific Papers, 44 .

[10] Brush 1976, 616-625; Boltzmann 1896b, trans., 40-41.

― 11 ―

of heat, momentum, etc.) in a gas subjected to external constraints (temperature gradient, pressure gradient, etc.) could be calculated by simply multiplying the elementary transport produced by one collision of a given kind by the number of collisions of this kind, and summing over kind. Most fundamentally, the equilibrium distribution of molecular velocities could be derived from the collision formula. The resulting expression, the so-called Maxwell distribution, had already been obtained by Maxwell in 1860. The new proof of 1866 proceeded in the following way.

Consider two generic elements d³v₁ and d³v₂ in the abstract space of molecule velocities, respectively around the velocities v₁ and v₂. A sufficient condition of (kinetic) equilibrium is that the number of collisions dn for which the initial velocities belong to these elements, of the kind ζ, be equal to the number dn′ of collisions for which the final velocities belong to these elements, of the inverse kind ζ′ (ζ′ is obtained from ζ by changing the sign of the angle θ between initial and final relative velocities). For these numbers to be finite there must of course be a latitude dΩ in the definition of ζ, but we will take it to be negligible in comparison with the latitude in the definition of the direction of v₂ − v₁ resulting from the finite extension of d³v₁ and d³v₂.^[11]

In order to appreciate the consequences of Maxwell's equilibrium condition, one must first note that a precise choice of v₁, v₂, and ζ implies a definite value of the final velocities v₁′ and v₂′, if energy and momentum are conserved during the collision. Indeed, momentum conservation gives v₁′ + v₂′ = v₁ + v₂, the collision kind gives the orientation of v₂′ − v₁′ with respect to v₂ − v₁, and energy and momentum conservation give together |v₂′ − v₁′| = |v₂ − v₁|; the two last pieces of information give v₂′ − v₁′, which, combined with the first piece, gives v₁′ and v₂′. Consequently, direct collisions (contributing to dn) and reverse collisions (contributing to dn′) are simply related by permuting the roles of the initial and final velocities.
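The way the final velocities follow from the initial ones and the collision kind can be sketched numerically. This is a minimal illustration, not taken from the source: it assumes two molecules of equal mass, and it represents the collision kind by a unit vector `n` giving the direction of the final relative velocity. The function name `final_velocities` and the sample numbers are hypothetical.

```python
import math


def final_velocities(v1, v2, n):
    """Return (v1', v2') for an elastic collision of two equal-mass molecules.

    n is a unit vector along the final relative velocity v2' - v1';
    it stands in for the collision "kind" (an assumption of this sketch).
    """
    # Center-of-gravity velocity: unchanged by the collision.
    V = [(a + b) / 2 for a, b in zip(v1, v2)]
    # Magnitude of the relative velocity: unchanged by the collision.
    u = math.sqrt(sum((b - a) ** 2 for a, b in zip(v1, v2)))
    v1p = [c - 0.5 * u * k for c, k in zip(V, n)]
    v2p = [c + 0.5 * u * k for c, k in zip(V, n)]
    return v1p, v2p


# Hypothetical sample values.
v1 = [1.0, 0.0, 0.0]
v2 = [-0.5, 2.0, 0.3]
n = [0.0, 0.6, 0.8]  # arbitrary unit vector (0.36 + 0.64 = 1)
v1p, v2p = final_velocities(v1, v2, n)

momentum_before = [a + b for a, b in zip(v1, v2)]
momentum_after = [a + b for a, b in zip(v1p, v2p)]
energy_before = sum(a * a for a in v1) + sum(b * b for b in v2)
energy_after = sum(a * a for a in v1p) + sum(b * b for b in v2p)
```

Momentum and kinetic energy (in units where the mass is 1 and the factor 1/2 is dropped) agree before and after up to rounding, whatever unit vector is chosen for `n`.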

Let us denote by d³v₁′ and d³v₂′ the elements in the space of velocities respectively corresponding to the elements d³v₁ and d³v₂ for a sharply defined kind of collision ζ. Then the number dn′ of inverse collisions is given by

$$ dn' = f(\mathbf{v}_1')\, f(\mathbf{v}_2')\, |\mathbf{v}_2' - \mathbf{v}_1'|\, \sigma(\zeta')\, d\Omega\; d^3v_1'\, d^3v_2'\, dt \tag{3} $$

a result of Maxwell's Ansatz (2), when v₁′ and v₂′ are taken as initial velocities, and the selected kind of collision is ζ′.

[11] Maxwell 1867, Scientific Papers , 44-46.

― 12 ―

As shown above, |v₂′ − v₁′| = |v₂ − v₁|, which implies that during a collision the relative velocity u = v₂ − v₁ merely rotates. Moreover, the velocity V = (v₁ + v₂)/2 of the center of gravity is conserved. Therefore the differential element d³u d³V is conserved. Since d³u d³V = d³v₁ d³v₂, the differential element d³v₁ d³v₂ is also conserved. Finally, the sign of the angle θ is clearly irrelevant to the definition of σ, which implies σ(ζ′) = σ(ζ). According to these remarks, the number of inverse collisions can be rewritten as

$$ dn' = f(\mathbf{v}_1')\, f(\mathbf{v}_2')\, |\mathbf{v}_2 - \mathbf{v}_1|\, \sigma(\zeta)\, d\Omega\; d^3v_1\, d^3v_2\, dt \tag{4} $$

Consequently, the equality of dn and dn′ occurs if and only if

$$ f(\mathbf{v}_1)\, f(\mathbf{v}_2) = f(\mathbf{v}_1')\, f(\mathbf{v}_2') \tag{5} $$

Admitting with Maxwell that f cannot depend on the direction of particle velocity, there must exist a function φ of v² such that f(v) = φ(v²), and, for any positive numbers x and y, φ(x)φ(y) = φ(x′)φ(y′) if x + y = x′ + y′. The latter property is characteristic of exponential functions; hence f must have the form

$$ f(\mathbf{v}) = \alpha\, e^{-\beta v^{2}}, \qquad \alpha,\ \beta > 0 \ \text{constant}, \tag{6} $$

which is "Maxwell's distribution" of molecular velocities.^[12]
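The step from the product condition to the exponential form can be spelled out. The following is a sketch rather than Maxwell's own argument: it writes φ for the function of v² introduced above, assumes that φ is positive and continuous (regularity conditions not stated in the text), and introduces the auxiliary functions g and h and the constants α and β for convenience.

```latex
% Put g = \ln\varphi. The condition \varphi(x)\varphi(y) = \varphi(x')\varphi(y')
% whenever x + y = x' + y' becomes
g(x) + g(y) = g(x') + g(y') \quad \text{whenever} \quad x + y = x' + y' .
% Choose x' = x + y and y' = 0 (reached by continuity):
g(x) + g(y) = g(x + y) + g(0) .
% Hence h(x) = g(x) - g(0) satisfies Cauchy's equation
h(x) + h(y) = h(x + y) ,
% whose continuous solutions are linear, h(x) = -\beta x. Therefore
\varphi(x) = e^{g(0)}\, e^{-\beta x},
\qquad
f(\mathbf{v}) = \varphi(v^{2}) = \alpha\, e^{-\beta v^{2}}, \quad \alpha = e^{g(0)} ,
% with \beta > 0 so that f can be normalized to a finite number of molecules.
```

Here x + y = x′ + y′ is simply the conservation of kinetic energy, with x and y standing for the squared velocities of the two colliding molecules.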

Clearly, this distribution is not modified by collisions inside the gas, as long as these collisions occur at a pace ruled by Maxwell's Ansatz . But can there be other stationary distributions? Maxwell's answer to this question is so brief that it deserves full quotation:

If there were any other [final distribution of velocities] the exchange of velocities represented by OA and OA' [v and v' in the above notation] would not be equal. Suppose that the number of molecules having velocity OA' increases at the expense of OA. Then since the total number of molecules having velocity OA' remains constant, OA' must communicate as many to OA", and so on till they return to OA.

Hence if OA, OA', OA", &c. be a series of velocities, there will be a tendency of each molecule to assume the velocities OA, OA', OA", &c. in order, returning to OA. Now it is impossible to assign a reason why the successive velocities of a molecule should be arranged in this cycle, rather than in the reverse order. If, therefore, the direct exchange between OA and OA' is not equal, the equality cannot be preserved by exchange in a cycle. Hence the direct exchange between OA and OA' is equal, and the distribution we have determined is the only one possible.^[13]

[13] Maxwell 1867, Scientific Papers , 45.

― 13 ―

Be it wrong, incomplete, or overly condensed, this argument is certainly one of the most impenetrable in all of Maxwell's writings. In particular, it is difficult to understand why the balancing of the "direct exchange" between two values of the velocity should be equivalent to the condition dn = dn′, which involves four values of the velocity and pairs of molecules. Even though Maxwell's gas theory rested on sound intuition and skillful mathematics, it failed to provide a convincing proof of the uniqueness of the velocity distribution; and, as we saw earlier, it left another essential question open, the degree of validity of the collision formula.

Boltzmann's Irreversible Equations

Boltzmann's first important works were dedicated to an extensive criticism of Maxwell's kinetic theory. In spite of some early doubts, Boltzmann soon adopted the Stosszahlansatz and quickly generalized it to more complex systems. In every application he could verify the validity of results obtained from this hypothesis by an independent method, one based on the general theory of Hamiltonian systems and what was later called the ergodic hypothesis.^[14]

When in 1872 Boltzmann questioned Maxwell's laconic reasoning for the uniqueness of the velocity distribution in a dilute gas, he based his alternative proof on the collision formula. The basic idea was simple: knowing the number of collisions of each kind, one could calculate the evolution in time of an arbitrary distribution of velocities and could check whether or not this distribution tended toward Maxwell's. If it did, Maxwell's would then be the unique equilibrium distribution.^[15]

The Boltzmann Equation

With the notation of the preceding section, the number f (v₁ ) d³v₁ of molecules in the element d³v₁ of the space of velocities is decreased by all collisions for which one of the initial velocities belongs to d³v₁ ; the other initial velocity and the kind of the collision may be arbitrary. Conversely, this number is increased by all collisions for which one of the final velocities belongs to d³v₁ ; here the other final velocity and the kind of the collision may be arbitrary. In mathematical terms this gives

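$$ \frac{\partial f(\mathbf{v}_1)}{\partial t}\, d^3v_1\, dt = -\int_{\mathbf{v}_2,\,\zeta} dn + \int_{\mathbf{v}_2,\,\zeta} dn' \tag{7} $$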

[14] See Klein 1973; Brush 1976; Dugas 1959; Darrigol 1988a.

[15] Boltzmann 1872.

― 14 ―

or, substituting the expressions (2) and (4) for dn and dn′:

$$ \frac{\partial f(\mathbf{v}_1)}{\partial t} = \int \left[ f(\mathbf{v}_1')\, f(\mathbf{v}_2') - f(\mathbf{v}_1)\, f(\mathbf{v}_2) \right] |\mathbf{v}_2 - \mathbf{v}_1|\, \sigma(\zeta)\, d\Omega\; d^3v_2 \tag{8} $$

Here we need to remember that v₁′ and v₂′ are defined by v₁, v₂, and ζ. This equation is the simplest case of a "Boltzmann equation." The practical importance of this type of equation was considerable, for it permitted Boltzmann to give a systematic derivation of the observable evolution of a thermodynamic system slightly out of equilibrium, including the transport properties already derived by Maxwell by means of a less general method.

The H-Theorem

However, the most fundamental consequence that Boltzmann drew from his equation was the following theorem on irreversibility: The function of time H defined in terms of the distribution function, f , according to

$$ H = \int f(\mathbf{v})\, \ln f(\mathbf{v})\; d^3v \tag{9} $$

is a strictly decreasing function of time for any choice of f —except Maxwell's distribution, for which H is stationary.^[16] This theorem immediately accomplished Boltzmann's initial purpose, a proof of the uniqueness of the equilibrium distribution of velocities. The proof was as follows.

The time derivative of H is given by

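$$ \frac{dH}{dt} = \int \frac{\partial f}{\partial t}\, \ln f\; d^3v + \int \frac{\partial f}{\partial t}\; d^3v \tag{10} $$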

The second term of this expression vanishes, since the derivative can be permuted with the integral sign. The substitution of (8) in the first term gives

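$$ \frac{dH}{dt} = \int \ln f_1 \left( f_1' f_2' - f_1 f_2 \right) d\mu \tag{11} $$

where dμ is a positive measure in the (v₁, v₂, ζ)-space, defined as

$$ d\mu = |\mathbf{v}_2 - \mathbf{v}_1|\, \sigma(\zeta)\, d\Omega\; d^3v_1\, d^3v_2 \tag{12} $$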

― 15 ―

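and where f₁ stands for f(v₁), and similarly for f₂, f₁′, and f₂′. Boltzmann's original proof is considerably simplified by noticing that dμ is invariant under a permutation of 1 and 2, and under a permutation of primed and unprimed velocities.

The first of these symmetries gives

$$ \frac{dH}{dt} = \int \ln f_2 \left( f_1' f_2' - f_1 f_2 \right) d\mu \tag{13} $$

the second gives

$$ \frac{dH}{dt} = -\int \ln f_1' \left( f_1' f_2' - f_1 f_2 \right) d\mu \tag{14} $$

and both together give

$$ \frac{dH}{dt} = -\int \ln f_2' \left( f_1' f_2' - f_1 f_2 \right) d\mu \tag{15} $$

Adding (11), (13), (14), and (15) and dividing by four gives

$$ \frac{dH}{dt} = -\frac{1}{4} \int \left( \ln f_1' f_2' - \ln f_1 f_2 \right) \left( f_1' f_2' - f_1 f_2 \right) d\mu \tag{16} $$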

The integrand is never negative, since the two factors in parentheses always have identical signs (the logarithm being an everywhere increasing function). Therefore dH/dt is never positive, and H never increases. A stationary condition occurs only if the integrand vanishes for all values of the variables, that is, f(v₁′) f(v₂′) = f(v₁) f(v₂) for every possible collision. The latter condition is precisely the one leading to Maxwell's distribution. This ends the proof of Boltzmann's H-theorem.

Boltzmann did not fail to notice that -H , calculated for Maxwell's distribution, gave the entropy of a perfect gas (temperatures being measured in energy units). His theorem therefore reproduced the law of entropy increase, insofar as the function -H represented an extension of entropy out of equilibrium. In this way the second principle of thermodynamics could be deduced from kinetic theory, at least for the case of a dilute gas.^[17]

The Nature of Irreversibility

However, Boltzmann soon had to answer an "extremely pertinent" objection raised by his friend and colleague Joseph Loschmidt. Assuming, as

[17] Boltzmann 1872, BWA 1:345-346.

― 16 ―

Maxwell and Boltzmann did, the reversibility in time of the dynamics ruling the microscopic evolution of a closed system, one could always imagine microscopic initial conditions leading to a violation of the entropy law. Indeed, to every evolution with increasing entropy there corresponded an evolution with decreasing entropy, obtained by inverting the velocities of all molecules in the final microscopic configuration. It therefore seemed hopeless to try to derive the entropy law from kinetic theory without an ad hoc selection of microscopic initial conditions.^[18]

Entropy and Probability

In the latter conclusion Boltzmann saw nothing but an "interesting sophism." To illustrate his point of view, he considered a gas of hard spheres uniformly spread within the volume of a container (which provides maximum entropy) at the initial time, and he went on to note:

One cannot prove that for every initial choice of the positions and the velocities of the spheres the distribution will be always uniform after a very long time; one can only prove that the number of [microscopic] initial states leading to a homogeneous distribution after a given long time is infinitely greater than the number of initial states leading to a heterogeneous distribution; even in the latter case the distribution would return to homogeneity after a longer time.^[19]

In this manner Boltzmann reconciled the second principle with the reversibility of molecular dynamics by arguing that extremely improbable initial conditions could be neglected. He underlined the role of probability considerations in this context: "Loschmidt's argument," he wrote, "shows how intimately the second principle is bound to probability calculus." He further suggested a possible extension: "One might even calculate the probability of various distributions [Zustandverteilung] on the basis of their relative numbers, which might perhaps give rise to an interesting method for the calculation of thermal equilibrium."^[20]

In the same year 1877 Boltzmann gave a precise expression to this idea by introducing the quantitative relation between entropy and probability for which he is perhaps most famous.^[21] The detailed account of this relation will be postponed to a later section. For the time being it is sufficient

[18] Boltzmann 1877a, BWA 2: 117; Loschmidt 1876 and subsequent papers. See Brush 1976, 238-240.

[19] Boltzmann 1877a, BWA 2: 119.

[20] Ibid., 119; on 121, one reads: "[Loschmidt's] argument shows how intimately the second principle is bound to probability calculus."

[21] Boltzmann 1877b.

― 17 ―

to mention a persistent ambiguity in Boltzmann's wording for the exclusion of antithermodynamic processes. In some places these are judged to be "infinitely" rare, and the entropy law is still formulated as being strict. In other places they are said to be "extremely improbable" or "impossible in practice."^[22]

Molecular Chaos

In any case, during these years Boltzmann never explained the exact nature of the probability considerations necessarily entering the proof of his H-theorem. In 1894 he heard the dissatisfaction of British kinetic theorists concerning this lack of clarity. The discussion led to the concept of "molecular chaos," which had only a transitory importance in Boltzmann's work but a central one in the evolution of Planck's ideas.^[23]

For a given partition of the space of molecular velocities into cells, the distribution f (v) can be given a more definite meaning as the list of the numbers N_i representing the number of molecules in the cell i . The corresponding value of H is then defined as the sum

$$ H = \sum_i N_i \ln N_i \tag{17} $$

In principle, the molecular dynamics allows one to calculate the evolution in time of these numbers N_i and therefore the exact value of H at every instant, independent of the Boltzmann equation. The corresponding curve H (t ) is extremely chaotic, with a very rapid alternation of increasing and decreasing phases. However, Boltzmann (and, later, more rigorously, Ehrenfest) could show that in a certain sense, which does not need to be specified here, the H -function was more probably decreasing than increasing.^[24]
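The cell-based definition of H lends itself to a small numerical experiment. The sketch below is an illustration under strong simplifying assumptions, none of which come from the source: velocities are two-dimensional, colliding pairs are drawn uniformly at random (so the collision rate ignores the relative speed, and the absence of correlations between partners is imposed by construction), and each collision rotates the relative velocity by a random angle. The names `h_value` and `collide`, the cell size, and the number of molecules are hypothetical choices.

```python
import math
import random


def h_value(velocities, cell=0.25):
    """H = sum_i N_i ln N_i over the occupied cells of a square grid."""
    counts = {}
    for vx, vy in velocities:
        key = (math.floor(vx / cell), math.floor(vy / cell))
        counts[key] = counts.get(key, 0) + 1
    return sum(n * math.log(n) for n in counts.values())


def collide(velocities, rng):
    """Pick a random pair and rotate its relative velocity by a random angle."""
    i, j = rng.sample(range(len(velocities)), 2)
    (ax, ay), (bx, by) = velocities[i], velocities[j]
    cx, cy = (ax + bx) / 2, (ay + by) / 2   # center-of-gravity velocity
    u = math.hypot(bx - ax, by - ay)        # relative speed, conserved
    phi = rng.uniform(0.0, 2.0 * math.pi)
    dx, dy = 0.5 * u * math.cos(phi), 0.5 * u * math.sin(phi)
    velocities[i] = (cx - dx, cy - dy)
    velocities[j] = (cx + dx, cy + dy)


rng = random.Random(0)
n_molecules = 2000

# Start far from equilibrium: every molecule has speed 1, random direction.
velocities = []
for _ in range(n_molecules):
    a = rng.uniform(0.0, 2.0 * math.pi)
    velocities.append((math.cos(a), math.sin(a)))

h_history = [h_value(velocities)]
for sweep in range(20):
    for _ in range(n_molecules):            # one sweep = n_molecules pair collisions
        collide(velocities, rng)
    h_history.append(h_value(velocities))

total_energy = sum(vx * vx + vy * vy for vx, vy in velocities)
```

In such a run the recorded values of H fall from their initial level and then fluctuate around a lower plateau, while the total kinetic energy stays fixed. Because the pairs are chosen at random, the sketch illustrates the secular trend of H rather than the exact H-curve of a deterministic molecular dynamics.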

Boltzmann's equation gives the secular evolution of H , that is to say, the average evolution, smearing out local irregularities; it does so thanks to the introduction of a special hypothesis hidden in Maxwell's Ansatz . According to this assumption, called "molecular chaos" by Burbury, certain microscopic "ordered" configurations of the gas must be excluded—for instance, the one already described in the previous section, for which the velocities of closest neighboring molecules point toward each other. Beyond this intuitive but vague characterization, neither Boltzmann nor

[22] Boltzmann 1877a, BWA 2:120; ibid., 121.

[23] Burbury 1894 and other British contributions discussed by Brush; Boltzmann 1896b, trans., 40-41. See Brush 1976, 616-625.

[24] On the H -curve, see Klein 1970c, 114-119.

― 18 ―

his British friends could provide a general a priori definition of molecular chaos. In the end, they had to content themselves with defining disordered configurations ad hoc , as those for which Maxwell's Ansatz is valid.^[25]

Recurrence

In 1896 Planck's assistant Ernst Zermelo published a new objection to the H -theorem, which left Boltzmann no time to dwell on molecular chaos. Poincaré had proved a few years earlier that any mechanical system confined to a finite region of space would, after a sufficiently long time, return to a configuration arbitrarily near the original one. According to Zermelo, this contradicted not only the H -theorem but also any kinetic theory of heat, since recurrence was never observed in thermodynamics. Boltzmann was particularly irritated by this objection: his previous description of the H -curve did not exclude recurrence; it just made it very improbable. In his opinion, all the objections to the H -theorem could be turned into harmless comments on the statistical validity of the entropy law.^[26]
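The coexistence of exact recurrence with an apparent approach to equilibrium can be illustrated on a toy system. The sketch below uses the Kac ring, a later pedagogical model that is not discussed in the text and is not a gas: balls of two colors sit on the sites of a ring, every ball moves one site clockwise at each step, and a ball changes color whenever it crosses one of a fixed set of marked edges. The dynamics is deterministic and reversible. The names `step`, `markers`, and `imbalance`, the ring size, and the marker density are hypothetical choices.

```python
import random


def step(colors, markers):
    """Move every ball one site clockwise; flip its color on a marked edge."""
    n = len(colors)
    new = [0] * n
    for i in range(n):
        # The ball at site i crosses edge i and lands on site i + 1.
        new[(i + 1) % n] = colors[i] ^ markers[i]
    return new


rng = random.Random(1)
n_sites = 500
markers = [1 if rng.random() < 0.1 else 0 for _ in range(n_sites)]

initial = [0] * n_sites          # all balls white: a highly ordered start
state = list(initial)
imbalance = []                   # |white - black| / n_sites after each step
for t in range(2 * n_sites):
    state = step(state, markers)
    blacks = sum(state)
    imbalance.append(abs(n_sites - 2 * blacks) / n_sites)

recurred = (state == initial)
```

After 2 × `n_sites` steps every ball has crossed every marked edge exactly twice, so the initial configuration returns exactly, even though the color imbalance decays toward zero during the early steps. This is recurrence in Poincaré's sense for a finite system, with a recurrence time that grows with the size of the ring.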

As for Zermelo's contention that recurrence in kinetic theory would contradict the entropy law of thermodynamics, Boltzmann remarked that a typical recurrence time as estimated from kinetic theory was so enormous that "in this length of time, according to the laws of probability, there [would] have been many years in which every inhabitant of a large country committed suicide."^[27]

Answering Zermelo's more detailed objections to the shape of the H -curve, Boltzmann improved the description of it by distinguishing eras of entropy increase from eras of entropy decrease in terms of the secular trend of H . For a large number of particles the eras were so long that human observers were confined to a single one, that is, to one direction of time (this direction being defined as that of entropy change).^[28]

Coming back to a more realistic time scale, at the end of the century these episodes left no doubt about Boltzmann's understanding of thermodynamic irreversibility: irreversible behavior could be deduced from the kinetic molecular model, but only as a statistical property of the system under consideration. In this way the second principle of thermodynamics lost its absolute character. This had been foreseen by Maxwell in the famous "demon argument" of 1867, and assented to by Gibbs in 1875 in

[25] Boltzmann 1896b, trans., 40-41.

[26] Zermelo 1895, 1896; Boltzmann 1896a, 1897a. See Brush 1976, 627-640.

[27] Boltzmann 1898a, trans., 444.

[28] Boltzmann 1897a. See Klein 1970c.

― 19 ―

the following terms: "The impossibility of an uncompensated decrease of entropy seems to be reduced to an improbability." In 1898 Boltzmann quoted this thought at the start of the second part of his Gastheorie .^[29]

In the foreword of the same book Boltzmann lamented the general resistance his ideas met: "I am conscious of being an individual struggling weakly against the stream of the times. But it still remains in my power to contribute in such a way that, when the theory of gases is again revived, not too much will have to be rediscovered."^[30] Rather than converting physicists to the kinetic theory, the notion of statistical irreversibility gave new weapons to those, energetists and positivists, who believed that the concepts of energy and entropy were irreducible.

Summary

Maxwell based his kinetic theory of 1866 on a seemingly obvious derivation of a central quantity: the number of encounters between the molecules of a homogeneous dilute gas. An important property of the resulting expression was that it did not depend on the detailed configuration of the molecules but only on the main quantity of physical interest, the distribution of molecular velocities. In thus eliminating the uncontrollable features of the microscopic description, Maxwell reaped a rich crop: he derived the equilibrium distribution of molecular velocities ("Maxwell's distribution law") and calculated how, through collisions, momentum, kinetic energy, and so on are transferred between contiguous layers of a gas (which accounts for viscosity, thermal conductivity, and other transport phenomena). However, the derivation of the collision formula entailed a hidden assumption of disordered motion, which Maxwell's followers later tried to explain.

Impressed by Maxwell's considerations, Boltzmann greatly extended the generality of the collision formula and the breadth of its applications. Whereas Maxwell had contented himself with the derivation of the final equilibrium distribution (and had given no satisfactory proof of the uniqueness of this distribution), in 1872 Boltzmann managed to derive the time evolution of the distribution of molecular velocities. The corresponding differential equation, which results from Maxwell's collision formula and conservation laws, involved only the distribution of velocities

[29] Gibbs 1878, Collected works 1:167; Boltzmann 1896b, trans. 215. For Maxwell's demon, see pp. 22-23 below.

[30] Boltzmann 1896b, trans., 216.

― 20 ―

and its time derivative: every uncontrollable feature of the microscopic model was, again, eliminated.

The Boltzmann equation not only simplified access to transport phenomena but also—more fundamentally—implied that the velocity distribution evolves irreversibly toward Maxwell's equilibrium form. Specifically, from the velocity distribution Boltzmann built a certain quantity (later named H by Burbury) which, as a result of the Boltzmann equation, steadily decreases in time until it reaches a minimum value, one that corresponds to Maxwell's distribution. Moreover, the negative of this quantity provided a natural extension of the concept of entropy for a system out of equilibrium; for it increased during irreversible evolution, and, in the equilibrium state, it was identical with the entropy of a perfect gas.

The above result, which constitutes the H -theorem, was repeatedly criticized for conflicting with the general principles of mechanics. In 1876 Loschmidt enunciated his famous paradox: The Boltzmann equation produced irreversible changes, while the equations controlling the underlying molecular dynamics were reversible (symmetric with respect to time reversal). To resolve the conflict, Boltzmann explicitly limited the validity of his equation. He admitted the existence of special molecular configurations for which his equation did not hold; but, he added, such configurations were highly improbable, for they represented only an extremely small fraction of those compatible with given initial macroscopic conditions. In 1877, as a confirmation of this probabilistic view of irreversibility, Boltzmann proposed a direct quantitative relation between entropy and probability, to which I will return later.

In 1894, stimulated by his British colleagues' questions, Boltzmann discussed more precisely the nature of the statistical assumption necessary for the derivation of the Boltzmann equation. This led him to the concept of "molecular chaos" (so named by Burbury). For the Boltzmann equation to hold, excessively "ordered" configurations of the molecules had to be excluded (for instance, those in which every molecule flies toward its nearest neighbor). Aside from such intuitive remarks, Boltzmann contented himself with defining disordered states as states to which Maxwell's collision formula applies. In such a nominalistic guise the concept of molecular chaos could play only a minor role in Boltzmann's writings.

A late attack on the H-theorem came from Planck's assistant Zermelo in 1896. As follows from Poincaré's "recurrence theorem," any finite molecular system has to return to its original macroscopic configuration after a sufficiently long time; therefore, Zermelo concluded, the H-theorem could not be true, even if special "ordered" configurations of the molecules

― 21 ―

were initially excluded. This objection irritated Boltzmann but did not embarrass him: his idea of disorder was meant to be statistical, and average recurrence times were far beyond human observation. But to those who believed in a complete generality of the principles of thermodynamics, Zermelo's argument showed that kinetic theory had to be abandoned, for it was incompatible with strictly irreversible behavior.

― 22 ―

Chapter II
Planck's Absolute Irreversibility

Against Atoms

Among the outspoken adversaries of Boltzmann's notion of irreversibility stood one of the most respected German thermodynamicists: Max Planck. In his dissertation work of 1879 this son of a law professor had raised the entropy law to the status of an absolute principle. Two years later he explicitly rejected the molecular hypothesis, precisely because it contradicted his conception of the second principle of thermodynamics:

When correctly used, the second principle of the mechanical theory of heat is incompatible with the hypothesis of finite atoms [here a footnote refers to Maxwell's demon argument]. One should therefore expect that in the course of the further development of the theory, there will be a fight between these two hypotheses that will cost the life of one of them. It would be premature to predict the outcome of this fight now; but for the moment it seems to me that, in spite of the great successes of the atomic theory in the past, we will finally have to give it up and to decide in favor of the assumption of continuous matter.^[31]

Planck's argument derived from Maxwell's earlier demon argument. The superhuman demon described in Maxwell's letter to Tait of December 1867 was able to discriminate between slow and fast molecules and, without expense of energy, to direct these two sorts of molecules toward different containers. The temperature gradient created in this way of course

[31] Planck 1882, 475; for Planck's background see, e.g., Heilbron 1986; for his antiatomistic positions see Kuhn 1978.

― 23 ―

contradicted the second principle of thermodynamics. The conflict between the molecular hypothesis and the entropy law shown in this thought experiment convinced Maxwell that entropy was a subjective notion, namely, one depending on the physicist's incapability of controlling the motion of individual molecules. The same conflict led Planck to exclude the molecular hypothesis.^[32]

During the years following this first public condemnation of kinetic molecular theory, Planck successfully applied macroscopic thermodynamics to various systems (chemical equilibrium, solutions, thermocouples, etc.), which strengthened his opinion that atoms were superfluous. In 1891 he found in the meeting of German scientists at Halle a public occasion to denigrate Boltzmann's and Maxwell's kinetic theory. The results of this theory, he declared, were not in suitable proportion to the mathematical effort expended.^[33]

Boltzmann did not answer Planck's attack directly, but in 1894 he found a proper opportunity to confound his adversary. Planck had accepted the editorship of the posthumous fourth volume of Kirchhoff's lectures on physics. The last part was dedicated to a detailed exposition of Maxwell's kinetic theory, and it contained, among other things, a presentation of the proof of Maxwell's distribution based on collision numbers. In a paper published in Annalen der Physik, Boltzmann criticized the negligence of Kirchhoff's editor. Whereas Maxwell had equated the number dn of collisions with initial velocities belonging to two given elementary domains of velocity space to the number dn′ of collisions with final velocities in these domains, the proof by Kirchhoff-Planck equated two different expressions of the same dn: the one given by Maxwell, and another, dn*, obtained by multiplying "the probability of collision of two molecules" σ dΩ |v₁ − v₂| by the population of the final state. Boltzmann protested that the probability of collision of the two molecules equaled σ dΩ |v₁ − v₂| only if the past history of the molecules was unknown; a quite different expression had to be used when the molecules were known to have interacted in a given way.^[34]

Planck acknowledged this point but attributed responsibility for the mistake to Kirchhoff. He further claimed that the objection equally invalidated Maxwell's original reasoning (which reveals his ignorance of it). As a token of his good will, however, he offered an alternative deduction

[32] Maxwell to Tait, 11 Dec. 1867, Cambridge University Archive; Maxwell 1871, 308-309. See Klein 1970b.

[33] Planck 1891.

[34] Kirchhoff 1894, 134; Boltzmann 1894.

― 24 ―

based on time reversal. When changing the direction of time, a stationary distribution of velocities is left unchanged and stationary; therefore, the number dn of collisions of a given kind must be equal to the number of collisions of the reverse kind, which is exactly Maxwell's dn '.^[35]

Boltzmann appreciated this argument, for it made the equality of dn and dn ' necessary and thus gave a new proof of the uniqueness of Maxwell's distribution without recourse to the H -theorem. In his published comments, Boltzmann emphasized the compatibility of the proof with the notion of molecular chaos. Indeed this concept, a fresh outcome of his discussions with British kinetic physicists, was precisely the one supposed to control the applicability of Maxwell's collision probabilities. On this occasion, in spite of his persisting dislike of atomism, Planck could gather that in kinetic theory irreversibility was intimately connected with some assumption of molecular chaos, whatever it meant. Besides, this incident may have convinced him that a new explanation of irreversibility was urgently needed to secure the foundations of thermodynamics.^[36]

Blackbody Radiation

Planck's belief in an absolute entropy principle excluded atomism but not Boltzmann's (and Clausius's) endeavor to give a mechanical explanation of entropy. With respect to mechanical reduction he believed entropy to be comparable to energy. Both quantities had to be determined not only by the thermodynamic state of the system but also by the underlying mechanical state. The problem was to find the proper mechanical basis. While in his opinion molecules would not do, the possibility of a continuum with well-chosen properties, the prime example being the electromagnetic ether, merited further exploration.

There was another reason to examine the relation between electrodynamics and thermodynamics: In the preceding years, several thermodynamic arguments had been successfully applied to the light emitted by heated bodies, and, since Maxwell's "dynamical theory of the electromagnetic field" (1864), the nature of light was often believed to be electromagnetic. Perhaps most important for someone in search of universal properties of nature, Kirchhoff had proved in 1859 an important theorem, a consequence of which was the universality of the spectrum of thermal radiation emitted by what he called a (perfect) blackbody.^[37]

[35] Planck 1895.

[36] Boltzmann 1895.

[37] Kirchhoff 1859. See Jungnickel and McCormmach 1986, 1:299-301; Jammer 1966, 2-6; and Kangro 1970.

― 25 ―

Figure 3.
Absorption of a radiation beam entering a pierced cavity.

By definition a blackbody absorbs any radiation falling upon it, which implies that it looks black in the common sense of the word as long as it is cold enough not to emit much thermal radiation. An excellent concrete realization of a blackbody is obtained by piercing a hole in a container whose walls are at least partially absorbing at all frequencies. Indeed, a light ray penetrating through the hole is both reflected and attenuated a great number of times by the inside walls, until it practically vanishes (see fig. 3). Let us now maintain the walls of this cavity at a constant high temperature. The radiation emitted by a portion of the walls interacts a great number of times with other portions of the walls, which leads to thermal equilibrium for the energy per unit volume u_ν dν corresponding to a frequency interval dν.^[38] A simple proof of the universality of the spectral density u_ν (not Kirchhoff's) goes as follows.

Consider, ab absurdo, two cavities at the same temperature but with different spectral densities u_ν⁽¹⁾ and u_ν⁽²⁾. Connect them through a small tube in which is placed a filter F at the frequency ν (fig. 4). If u_ν⁽¹⁾ is greater than u_ν⁽²⁾, there must be a flow of radiation from 1 to 2; and this flow must be permanent since the excess of radiation energy in 2 must be re-absorbed in order to maintain the thermal equilibrium in 2. But a permanent energy flow between sources at equal temperature is incompatible with the second principle of thermodynamics. Therefore, it must be that u_ν⁽¹⁾ = u_ν⁽²⁾.

In 1884 Boltzmann assumed the electromagnetic nature of thermal radiation to prove Stefan's empirical law (1879): The total energy density u (given by ∫₀^∞ u_ν dν) is proportional to the fourth power of the absolute

[38] The first definition of a blackbody is in Kirchhoff 1860. As Jammer explains, Kirchhoff's considerations originated in studies of solar radiation (Jammer 1966, 2-6). The use of a pierced cavity for the empirical study of blackbody radiation was first suggested by Christiansen 1884 and first realized by Lummer and Wien 1895.

― 26 ―

Figure 4.
Thought experiment leading to the universality of the blackbody spectrum.
T is the temperature of a thermal bath, and F a monochromatic filter.

temperature T . A slightly modernized version (in Planck's manner) follows.^[39]

According to Maxwell, an electromagnetic plane wave falling normally upon a perfectly reflecting surface exerts a mechanical pressure, the intensity of which is given by twice the energy density of the wave. Consider now the case of an isotropic radiation falling upon a reflecting plane Π. The pressure exerted by the fraction of this radiation oriented in the direction z, within the solid angle dΩ, is given by p = 2u cos θ dΩ/4π, where u is the total energy density, and θ is the angle between the direction z and the normal to Π. Since this pressure is directed along z, its component normal to Π is p cos θ, or 2u cos²θ dΩ/4π. The average pressure, P, exerted on Π is obtained by integrating the latter expression over all solid angles pointing toward Π:

$$P = \int 2u\cos^2\theta\,\frac{d\Omega}{4\pi} = \frac{u}{2\pi}\int_0^{\pi/2}\cos^2\theta\; 2\pi\sin\theta\,d\theta = \frac{u}{3} \qquad (18)$$

It should be further noted that in the case of blackbody radiation this pressure is independent of the reflecting quality of Π. Indeed, if it were dependent, one could build a perpetual motion of the second kind by inserting in a blackbody cavity a plate silvered on one side and blackened on the other. One may therefore legitimately speak of a "radiation gas" with a definite pressure P, as Boltzmann did in his paper.
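The angular integration behind the radiation pressure can be checked numerically. The following is only a minimal sketch, not part of the historical argument: the function name `radiation_pressure` and the midpoint rule are illustrative choices, and the incoming directions are assumed to be parametrized by the angle θ to the normal, with dΩ = 2π sin θ dθ over the hemisphere facing the plane.

```python
import math

def radiation_pressure(u, n=20_000):
    """Integrate 2*u*cos^2(theta) dOmega/(4*pi) over the hemisphere of
    directions pointing toward the plane (midpoint rule in theta)."""
    d_theta = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        # Solid angle of the ring of directions between theta and theta + d_theta.
        d_omega = 2 * math.pi * math.sin(theta) * d_theta
        total += 2 * u * math.cos(theta) ** 2 * d_omega / (4 * math.pi)
    return total

if __name__ == "__main__":
    u = 1.0  # energy density in arbitrary units (hypothetical value)
    print(radiation_pressure(u), u / 3)
```

Within the accuracy of the quadrature the result agrees with u/3, the pressure of the "radiation gas" used in the thermodynamic step that follows.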

In this circumstance, the entropy variation

$$dS = \frac{d(uV) + P\,dV}{T} \qquad (19)$$

[39] Boltzmann 1884a, 1884b. For the context of this work see Carazza and Kragh 1989.

― 27 ―

Figure 5.
Wien's perfectly reflecting piston, as used
in the proof of the displacement law.

where V is the volume of the cavity, must be an exact differential. This implies (after subtraction of d(uV/T) from dS)

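$$\frac{\partial}{\partial T}\left(\frac{P}{T}\right) = \frac{\partial}{\partial V}\left(\frac{uV}{T^2}\right) \qquad (20)$$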

or, using (18) and Kirchhoff's law (which makes u a function of T only),

$$\frac{du}{dT} = \frac{4u}{T} \qquad (21)$$

Integrating the latter equation gives

$$u \propto T^4 \qquad (22)$$

which is Stefan's law.
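The integration step can also be illustrated numerically. The sketch below assumes that the condition obtained from the radiation pressure P = u/3 and the exactness of dS takes the form du/dT = 4u/T; the function name and the Runge-Kutta scheme are illustrative choices, and the units are arbitrary.

```python
def integrate_energy_density(u_start, t_start, t_end, steps=2_000):
    """Integrate du/dT = 4u/T with the classical fourth-order Runge-Kutta scheme."""
    def rhs(temp, u):
        return 4.0 * u / temp

    h = (t_end - t_start) / steps
    temp, u = t_start, u_start
    for _ in range(steps):
        k1 = rhs(temp, u)
        k2 = rhs(temp + h / 2, u + h * k1 / 2)
        k3 = rhs(temp + h / 2, u + h * k2 / 2)
        k4 = rhs(temp + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        temp += h
    return u

if __name__ == "__main__":
    # Doubling the temperature should multiply the energy density by 2**4.
    print(integrate_energy_density(1.0, 1.0, 2.0))
```

Doubling the temperature multiplies the energy density by sixteen, as the fourth-power law requires.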

Boltzmann's original reasoning involved, for historical reasons not worth mentioning here, an adiabatic displacement of a reflecting piston inserted in a cylindrical blackbody cavity. During this displacement the spectrum of the reflected radiation differs slightly from that of the incident radiation, as a result of the Doppler effect. In 1894 Wilhelm Wien cleverly exploited this thought experiment to restrict the form of the function u_ν(T).^[40]

Wien's argument, or more exactly Planck's version of it,^[41] starts with the consideration of a cavity with perfectly reflecting walls, one of which is part of a mobile piston (fig. 5). At the initial time this cavity contains isotropic electromagnetic radiation with the spectral density ρ_ν. Under a slow displacement of the piston at a constant speed v, the light of frequency

[40] Boltzmann 1884b; Wien 1894.

[41] Planck 1906, 68-82.

― 28 ―

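ν in an incident beam is Doppler-shifted by

$$\delta\nu = -\frac{2v}{c}\,\nu\cos\theta \qquad (23)$$

where θ is the angle of incidence, and c the velocity of light (v being counted positive when the piston recedes and the cavity expands). In a time interval δt and for the same direction of incidence, within the solid angle dΩ, the energy flux (per unit frequency interval) impinging upon the surface area S of the moving mirror is given by

$$c\,\rho_\nu\cos\theta\; S\,\delta t\,\frac{d\Omega}{4\pi} \qquad (24)$$

As a result of the Doppler shift of the reflected radiation, there is a variation δ(ρ_ν V) of the spectral energy at frequency ν; its value may be obtained by subtracting the flux at the frequency ν, which the reflection shifts to ν + δν, and adding the flux at the frequency ν − δν, which the reflection shifts just to ν:

$$\delta(\rho_\nu V) = \left(\rho_{\nu-\delta\nu} - \rho_\nu\right) c\cos\theta\; S\,\delta t\,\frac{d\Omega}{4\pi} = -\frac{\partial\rho_\nu}{\partial\nu}\,\delta\nu\; c\cos\theta\; S\,\delta t\,\frac{d\Omega}{4\pi} \qquad (25)$$

or, using (23) for δν,

$$\delta(\rho_\nu V) = 2\nu\,\frac{\partial\rho_\nu}{\partial\nu}\,\cos^2\theta\; S v\,\delta t\,\frac{d\Omega}{4\pi} \qquad (26)$$

Noting that Svδt is equal to the variation in volume δV of the cavity, and integrating over θ yields

$$\delta(\rho_\nu V) = \frac{\nu}{3}\,\frac{\partial\rho_\nu}{\partial\nu}\,\delta V \qquad (27)$$

and

$$\delta\rho_\nu = \left(\frac{\nu}{3}\,\frac{\partial\rho_\nu}{\partial\nu} - \rho_\nu\right)\frac{\delta V}{V} \qquad (28)$$

for the variation δρ_ν of the spectral density.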

If the radiation initially contained in the cavity is blackbody radiation at a given temperature T (ρ_ν = u_ν(T)), then the radiation obtained after the adiabatic displacement of the piston must also be in thermal equilibrium (but at a different temperature). Indeed, if the opposite were true, one would have a state out of equilibrium connected to a state in equilibrium through an adiabatic reversible transformation, which is absurd.^[42] In this

[42] Planck (1906, 69-70) demonstrates the absurdity by considering the following cycle. The piston, initially containing blackbody radiation, is moved adiabatically to a different position (step 1); a little piece of a "black" substance is introduced inside the piston and removed (step 2); the piston is brought back adiabatically to its initial position (step 3); a little piece of "black" substance is introduced inside the piston (step 4). This is a cycle because both the total heat and the total work exchanged are zero; energy densities and temperatures are therefore equal in the initial and final states. If the radiation was not at equilibrium after step 1 or after step 3, entropy would be created during step 2 and step 4, without compensation during the adiabatic processes (1) and (3).

― 29 ―

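case the variation of the spectral density can be obtained in a second independent way, as the variation of the universal function u_ν(T) for the variation δT of temperature corresponding to the adiabatic transformation:

$$\delta u_\nu = \frac{\partial u_\nu}{\partial T}\,\delta T \qquad (29)$$

δT itself is readily obtained by equating the variation of the total radiation energy as given by the Stefan-Boltzmann law (equation 22) to the work performed by the piston against the radiation pressure (given by equation 18):

$$\delta(uV) = -\frac{u}{3}\,\delta V, \qquad u \propto T^4 \qquad (30)$$

which implies

$$\frac{\delta T}{T} = -\frac{1}{3}\,\frac{\delta V}{V} \qquad (31)$$

Equating the two expressions (28) and (29) for δu_ν now gives

$$\nu\,\frac{\partial u_\nu}{\partial\nu} + T\,\frac{\partial u_\nu}{\partial T} = 3u_\nu \qquad (32)$$

with the general integral:

$$u_\nu = \nu^3 f\!\left(\frac{\nu}{T}\right) \qquad (33)$$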

where f is an arbitrary function. This is the so-called "displacement law" which allows the derivation of the blackbody spectrum at any temperature, once it is known at a given temperature.
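How the displacement law transports a spectrum from one temperature to another can be made concrete with a short numerical sketch. Everything specific here is an assumption made for illustration: the law is taken in the form u_ν = ν³ f(ν/T), the exponential chosen for f is purely hypothetical (the law itself leaves f arbitrary), and the scaling condition ν ∂u_ν/∂ν + T ∂u_ν/∂T = 3u_ν is the one obtained by combining the adiabatic variation of the spectrum with δT/T = −δV/3V.

```python
import math

def wien_like_shape(x):
    """Hypothetical choice for the arbitrary function f; any smooth f would do."""
    return math.exp(-x)

def spectral_density(nu, temp, shape=wien_like_shape):
    """Displacement-law form: u_nu(T) = nu**3 * f(nu / T)."""
    return nu ** 3 * shape(nu / temp)

def scaling_residual(nu, temp, rel_step=1e-5):
    """nu*du/dnu + T*du/dT - 3*u, estimated with central finite differences."""
    h_nu, h_t = nu * rel_step, temp * rel_step
    du_dnu = (spectral_density(nu + h_nu, temp)
              - spectral_density(nu - h_nu, temp)) / (2 * h_nu)
    du_dt = (spectral_density(nu, temp + h_t)
             - spectral_density(nu, temp - h_t)) / (2 * h_t)
    return nu * du_dnu + temp * du_dt - 3 * spectral_density(nu, temp)

def rescale_spectrum(nu, t_known, t_new):
    """Spectrum at t_new computed only from values of the spectrum at t_known."""
    ratio = t_new / t_known
    return ratio ** 3 * spectral_density(nu / ratio, t_known)

if __name__ == "__main__":
    print(scaling_residual(2.0, 1.5))
    print(rescale_spectrum(2.0, 1.5, 3.0), spectral_density(2.0, 3.0))
```

The function `rescale_spectrum` uses only the spectrum at the known temperature: the value at frequency ν and temperature T₂ is (T₂/T₁)³ times the value at frequency νT₁/T₂ and temperature T₁, which is the practical content of the law.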

Planck's Resonators

The empirical validity of Kirchhoff's law, Stefan's law, and Wien's displacement law left no doubt about the legitimacy of combining electrodynamic and thermodynamic laws; and Planck was well aware of these developments. In 1895 he decided to examine the electrodynamic mechanisms responsible for the thermalization of radiation, hoping to find in them the ultimate source of irreversibility and to perhaps determine the arbitrary function in Wien's displacement law. This grand program focused on a very simple system, a Hertz resonator: that is, a small, nonresistive, oscillating electric circuit interacting with electromagnetic waves, the characteristic wavelength of the oscillator being much larger than the oscillator. The simplest choice was considered the most adequate by Planck,

― 30 ―

Figure 6.
Planck's intuition of the source of thermodynamic
irreversibility: the diffusion of a plane electromagnetic
wave by a resonator (at the frequency of the wave).

for in light of Kirchhoff's theorem, the properties of thermal radiation could not depend on the specific properties of the thermalizing system.^[43]

What first attracted Planck's attention was the apparent irreversibility of the interaction between radiation and resonator: a plane monochromatic wave falling upon a resonator forces vibrations of the resonator when the condition of resonance is approximately met; in turn these vibrations emit secondary waves over a wide angle (fig. 6). Also, an excited resonator left to itself emits radiation at its characteristic frequency and thereby gradually loses its energy. Such processes, resonant scattering or radiation damping, looked essentially irreversible, even though the total energy (that of the resonator plus that of radiation) was strictly conserved (in the absence of the Joule effect in the circuit). Planck concluded:

The study of conservative damping seems to me highly important, because it opens a new perspective on the possibility of a general explanation of irreversible processes through conservative interactions, a more and more pressing problem in contemporary theoretical physics.^[44]

These results of classical electrodynamics are now very well known; they are usually obtained through a specific model of the resonator, for instance an elastically bound electron. In 1895, however, the existence of the electron had not been proved; and Lorentz's formulation of electrodynamics, with its detailed analysis of microscopic sources, was not yet widely known (the famous Versuch was published in the same year).

[43] For earlier accounts of Planck's program, see Brush 1976, 628-640; Klein 1962, 1963a; Kuhn 1978; Needell 1980. A general survey of Planck's thermodynamic views is in Hiebert 1968.

[44] Planck 1896, 1897a, PAV 1:470.

― 31 ―

Planck was therefore confined to a different method, first used by Hertz in 1889 for a calculation of the energy radiated by an oscillating dipole. In this method the detailed structure of the resonator was irrelevant.^[45]

As we shall observe in the following, Planck maintained this generality throughout his program and even gave it an essential role. It is therefore useful at this point to explain how the equation of a resonator in an electromagnetic field can be established with this method. Another reason for analyzing Planck's reasoning is that it is typical of a style of theoretical physics, namely, concentrating on the features of physical systems which can be determined on the basis of general principles only, without recourse to detailed microscopic assumptions. Nevertheless, hurried readers may be content with these general comments and may jump to the equation (62), p. 36, for the evolution of the dipolar moment f of a resonator.

The Resonator Equation

A variable distribution of charge and current is located around the origin O of coordinates, and its extension does not exceed the length d . According to Hertz, the simplest possible form of the electromagnetic field at a large distance r from O (r >> d ) derives from a vector potential directed along the z -axis

inline image

(here f ' denotes the derivative of the function f ) and from a scalar potential

inline image

Indeed, A_z is isotropic, and the fields

inline image

satisfy Maxwell's equations in an empty portion of space, as results from^[46]

inline image

[45] Hertz 1889; Planck 1897a.

[46] Planck used Gaussian units (see Conventions and Notations at the front of this book). Hertz introduced the function f (which he denoted P ) but did without the potentials e and A.

― 32 ―

A physical meaning is given to this solution by considering the particular case of a monochromatic source, for which inline image . If the wavelength λ exceeds the dimensions of the source d by a large amount, the scalar potential in the region defined by d ≪ r ≪ λ is given by

inline image

This is precisely the form of the electric potential that would be created by a static dipole with an amplitude f (in Gaussian units). By a natural extrapolation, Hertz's solution should be regarded as the radiation created by an electric dipole with the varying amplitude f .

We now return to the general case, but with a limitation on the variations of the dipole f : these can be neither too fast nor too slow. More specifically, it will be necessary to assume that there exist two characteristic lengths inline image and such that and, at any time,

inline image

The energy dY/dt emitted in a time unit by the variable dipole can be derived by calculating the flux of the Poynting vector (c/4π) E × B across a sphere centered in O with a radius inline image . On this sphere the potentials (34) and (35) in spherical coordinates r, θ, α (see fig. 7) are approximately given by:

inline image

The resulting field strengths are (using (39))

inline image

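so that the Poynting vector is a radial vector with the amplitude (c/4π)(f″/rc²)² sin²θ. Its flux through the sphere is

$$\int \frac{c}{4\pi}\left(\frac{f''}{rc^2}\right)^2 \sin^2\theta\; r^2\, d\Omega = \frac{2}{3c^3}\,f''^2 \qquad (43)$$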
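The flux of this radial Poynting vector can be checked by direct quadrature over the sphere. This is a minimal sketch under stated assumptions: Gaussian units as in the text, a Poynting amplitude (c/4π)(f″/rc²)² sin²θ, and illustrative function names and numerical values that do not come from the source. The closed form it is compared with, 2f″²/3c³, is the standard dipole radiation result.

```python
import math

def dipole_flux(f_ddot, c, r, n=20_000):
    """Flux of the radial Poynting vector (c/4pi)*(f''/(r*c**2))**2*sin(theta)**2
    through a sphere of radius r (midpoint rule in theta)."""
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        poynting = (c / (4 * math.pi)) * (f_ddot / (r * c ** 2)) ** 2 * math.sin(theta) ** 2
        # Area of the ring on the sphere between theta and theta + d_theta.
        area = 2 * math.pi * r ** 2 * math.sin(theta) * d_theta
        total += poynting * area
    return total

def closed_form_flux(f_ddot, c):
    """Standard dipole radiation result: 2*f''**2 / (3*c**3)."""
    return 2 * f_ddot ** 2 / (3 * c ** 3)

if __name__ == "__main__":
    # Hypothetical values in arbitrary Gaussian-like units.
    print(dipole_flux(1.5, 2.0, 10.0), closed_form_flux(1.5, 2.0))
```

The result does not depend on the radius of the sphere, which is what allows the emitted energy to be attributed to the dipole itself.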

In a second step of his considerations, Planck superposed on Hertz's solution an external field E_e , B_e . This field must be a solution of Maxwell's

― 33 ―

Figure 7.
The dipolar radiation field according to Hertz and Planck.

equations in empty space; and at any time the characteristic distance λ_e over which it varies is assumed to greatly exceed the size d of the charge distribution. Then the energetic coupling between the variable charge distribution around O and the external field can be determined without knowing the internal structure of this distribution. To this end Planck cleverly considered the energy flux, F, through a sphere S centered on O, much larger than the charge distribution but small enough to make the variations of the enclosed fields negligible.

Assuming that the charge distribution exchanges energy only with the electromagnetic field (there is no Joule effect and no internal electromotive force), this flux must be equal to the diminution rate −dY/dt of the energy of the charge distribution plus the diminution rate −dW_e/dt of the energy of the external field enclosed by S in the absence of a charge distribution:

$$F = -\frac{dY}{dt} - \frac{dW_e}{dt} \qquad (44)$$

Another expression of this flux results from Poynting's formula:

$$F = \frac{c}{4\pi}\oint_S (\mathbf{E}+\mathbf{E}_e)\times(\mathbf{B}+\mathbf{B}_e)\cdot d\mathbf{S} \qquad (45)$$

This integral may be split into four terms according to the development of the vector product. The one corresponding to E × B is the flux f₀ across S resulting from the field created by the charge distribution, and

― 34 ―

it is simply obtained by noting that the flux f (r ) calculated above (formula 43) must be equal to its retarded value (the propagation time being r/c ):

inline image

The term corresponding to E_e × B_e exactly compensates —dW_e/dt . The coupling terms remain:

inline image

Since the radius r of S is much smaller than the length inline image , according to (34), (35), (36), and (39) the fields E and B on S are approximately given by

inline image

This leads to

inline image

since the value of E_e at any point of the sphere may be replaced by its value in O.

One might easily believe that F₂ vanishes, because for a uniform B_e the corresponding integral vanishes: one has

inline image

and the latter integral is a zero vector, since d S × E is always tangent to the parallels of S and has a constant modulus on a given parallel. Yet, F₂ is not zero, because the gradient of B_e contributes a term of the same order (in inline image ) as F₁ . This term is

inline image

Because of the rotational invariance of d S × E, only the part inline image of contributes to this integral. Taking into account Maxwell's equation , an elementary calculation then yields

inline image

Equating the two expressions (44) and (45) of F now gives

inline image

― 35 ―

inline image

In the latter equation one can consider the time to be the only variable since, according to (39), f(t − r/c) can be replaced with f(t) whenever inline image . This gives

inline image

A further simplification of this equation is obtained by noticing that the terms accompanying Y in the parentheses are negligible, as results from the following remarks.

From the expressions (48) for E and B on S result the orders of magnitude f²/r⁶ for the electric and f′²/c²r⁴ for the magnetic energy density on S. Since r is much larger than the dimensions of the charge distribution, these densities must be very small compared with the contribution Y/((4/3)πr³) of the charge distribution to the average energy density inside S. This gives the inequalities

inline image

Besides, inline image must have the same order of magnitude as f"'/c³ in order that, in equation (55), the variations of f may be coupled to those of . Finally, the inequalities (39) give, for ,

inline image

Taken together, these remarks allow us to write

inline image

as previously asserted. Thanks to this property, one may, without any loss of information, redefine the energy of the charge distribution as the expression in the parentheses in equation (55), which gives the simpler equation

inline image

At this point Planck specified the form of Y in order to represent the case of a resonator, for which the variable distribution of charge and current is comparable to an oscillating circuit. From an analogy with the

― 36 ―

theory of such circuits he set

inline image

wherein the two terms correspond to electric and magnetic energies that can be periodically transformed into each other. This hypothesis implies an equation for the evolution of f :

inline image

In the absence of an external excitation the term proportional to f‴ produces a spontaneous (conservative) damping of the resonator oscillations as a result of the emission of radiation. In the presence of an external field, the electric field at the resonator along the direction of its axis acts on its electric moment f if the Fourier spectrum of this field contains frequencies that are sufficiently close to the eigenfrequency of the resonator. If so, then the resulting field, which is the sum of the external field and the Hertz field produced by f, spreads out in every direction of space even if the external field propagates in a definite direction. This is the feature perceived as irreversible by Planck.

Since radiation damping is always very small (it takes a great number of periods), the previous equation is approximately equivalent to a simpler one:^[47]

where E is an abbreviation for the exciting field inline image , and

inline image

Planck's subsequent studies of radiation processes were based on these equations and on the energy formula (60).
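The conservative damping of a free resonator can be illustrated with a short simulation. This is a sketch under explicit assumptions, since the equations themselves are not reproduced above: the energy is taken as ½Kf² + ½Lf′², with K and L constants in the notation usually attributed to Planck's 1897 papers, and the weakly damped equation is taken as Lf″ + (2K/3c³L)f′ + Kf = 0, obtained by replacing f‴ with −(K/L)f′ in the radiation-reaction term. The function names and numerical values are hypothetical.

```python
import math

def simulate_free_resonator(K, L, c, t_end, dt=0.01):
    """Integrate L*f'' + (2K/(3 c^3 L))*f' + K*f = 0 (assumed weak-damping form)
    with RK4, starting from f = 1, f' = 0. Returns (f, f_dot) at t_end."""
    gamma = 2 * K / (3 * c ** 3 * L ** 2)  # assumed decay rate of the energy
    omega_sq = K / L                       # squared angular eigenfrequency

    def accel(f, f_dot):
        return -gamma * f_dot - omega_sq * f

    f, f_dot = 1.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1f, k1v = f_dot, accel(f, f_dot)
        k2f, k2v = f_dot + dt * k1v / 2, accel(f + dt * k1f / 2, f_dot + dt * k1v / 2)
        k3f, k3v = f_dot + dt * k2v / 2, accel(f + dt * k2f / 2, f_dot + dt * k2v / 2)
        k4f, k4v = f_dot + dt * k3v, accel(f + dt * k3f, f_dot + dt * k3v)
        f += dt * (k1f + 2 * k2f + 2 * k3f + k4f) / 6
        f_dot += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return f, f_dot

def resonator_energy(K, L, f, f_dot):
    """Assumed energy formula: electric part K*f^2/2 plus magnetic part L*f'^2/2."""
    return 0.5 * K * f ** 2 + 0.5 * L * f_dot ** 2

if __name__ == "__main__":
    K, L, c, t_end = 1.0, 1.0, 4.0, 100.0  # hypothetical units and values
    f, f_dot = simulate_free_resonator(K, L, c, t_end)
    gamma = 2 * K / (3 * c ** 3 * L ** 2)
    print(resonator_energy(K, L, f, f_dot) / resonator_energy(K, L, 1.0, 0.0),
          math.exp(-gamma * t_end))
```

With these values the resonator loses energy at the rate 2K/3c³L² over many periods, without any dissipative element in the circuit: the energy is carried away by the emitted radiation.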

Summary

Planck believed that the two principles of thermodynamics were absolutely valid. Just as certainly as the energy of a closed system remained constant,

― 37 ―

so too for Planck its entropy could only increase in time. Every new theory had to be compatible with both principles—or it had to be rejected. As Planck repeatedly asserted in the years 1880-1895, kinetic molecular theories did not pass the test, for they implied the possibility of entropy-decreasing processes in closed systems. Such violations of the second principle had been shown to occur by Maxwell in his demon argument (1867), and again by Zermelo (1896) with the help of Poincaré's recurrence theorem.

Planck nevertheless studied Maxwell's kinetic theory, if only as a part of his duties as editor of Kirchhoff's lectures on thermodynamics, which included this topic. The resulting book, published in 1894, contained an unfortunate mistake in the proof of Maxwell's distribution law, which stirred a short but instructive polemic with Boltzmann. Planck proposed a better proof and in the process became acquainted with, though not convinced by, Boltzmann's idea of molecular chaos. On this occasion he might also have felt the need of an alternative microscopic foundation of thermodynamics. The following year he engaged upon a new program, the principal aim of which was to provide an electromagnetic explanation of the principles of thermodynamics.

Relations between electrodynamics and thermodynamics had already been found in the study of the so-called blackbody radiation, that is, radiation in thermal equilibrium with the walls of a cavity. That thermodynamics applied to this radiation was clear from the experimental verification of one of this theory's consequences: the universality of the blackbody spectrum (proved by Kirchhoff in 1859). That electromagnetic theory also applied was suggested by Boltzmann's proof (1884) of Stefan's law (1879), which rested upon the use of Maxwell's expression for radiation pressure. Finally, in 1894 Wien had shown that Boltzmann's latter argument could be extended to derive the so-called displacement law, which expresses the blackbody spectrum at an arbitrary temperature in terms of that for any given temperature. By then Hertzian waves had been discovered (1888), and most physicists believed in the electromagnetic nature of blackbody radiation.

Well aware of these developments, in 1895 Planck proposed that the electromagnetic interaction between matter and radiation could explain both thermodynamic irreversibility and the observed value of the blackbody spectrum. As the archetype of an energy-conserving, irreversible process, he imagined the scattering of a plane electromagnetic wave by a miniature version of a perfect Hertz resonator (small oscillating circuit with neither dissipation nor internal electromotive force). Planck then

― 38 ―

proceeded to a quantitative evaluation of this effect. Unlike Lorentz, who had calculated electromagnetic scattering with the help of a specific model of elastically bound ions, Planck favored Hertz's phenomenological approach to electrodynamics and left the internal structure of the resonators undetermined. Just by balancing the energy flux through cleverly chosen surfaces enclosing the resonator, Planck deduced the mathematical form of the interaction between the electric moment of a resonator and the surrounding radiation. The resulting differential equation involved a damping term, which Planck interpreted as the sought-after source of irreversibility.

― 39 ―

Chapter III
On Irreversible Radiation Processes

A Polemic with Boltzmann

In early 1897 Planck published the first of a series of five memoirs entitled "On irreversible radiation processes." His aim was to exploit the preceding equations in a systematic investigation of the uniformizing action of resonators on the radiation enclosed in a cavity with ideal (non-thermalizing) reflecting walls.^[48]

In the first memoir he contented himself with some general considerations announcing the three main points to be proved:

1. the asymmetry of the system under time reversal

2. the absence of Poincaré recurrences

3. the evolution of the system toward a unique stationary final state, for which radiation would be homogeneous and isotropic and would have a definite spectrum.^[49]

Boltzmann, as a specialist in both thermodynamics and electrodynamics, protested immediately and vigorously. Planck's objectives, he argued, could never be reached. Maxwell's equations—even in the presence of a resonator (without the Joule effect)—were just as reversible as

[48] Planck 1897b, 1897c, 1897d, 1898, 1899; also in PAV 1:493-600.

[49] The last point (the definite final spectrum) was never established by Planck. In reality, ideal resonators cannot change the spectrum of cavity radiation at all, as proved in Ehrenfest 1906 and remarked by Einstein in a letter to Marić of 10 Apr. 1901 (Einstein 1987). See also Planck 1906, 220-221.

― 40 ―

were the equations of mechanics. More generally, the analogy between electrodynamics and mechanics was sufficient to exclude Planck's objectives. For instance, Boltzmann explained, the scattering of a wave by a resonator had a mechanical counterpart in the scattering of a parallel beam of particles by a fixed target. Nobody would have questioned the reversibility of the latter process; therefore the former also had to be reversible. In both cases the seeming irreversibility came from an arbitrary selection of a certain class from among the possible initial conditions, one excluding beams or waves converging toward the target.^[50]

In his second memoir Planck replied to Boltzmann's criticism. From the beginning of his considerations, he stated, the external electromagnetic field was assumed to vary by a negligible amount over the dimensions of the resonator. This excluded singular fields converging toward the resonator. More generally, Planck denied physical character to solutions of this type, for they could not be realized except by extreme contortions.^[51]

This answer failed to satisfy Boltzmann. The "anti-Planck" solutions (obtained from Planck's solutions by time reversal) were singular only in the physically inaccessible limit of an infinitely small resonator, so that the exclusion from nature of certain types of singularities could not affect the question of reversibility. In order to understand the thermodynamic irreversibility of radiation processes, Boltzmann suggested, one had to imitate the example of kinetic theory and introduce adequate concepts of probability and disorder:

"Just as in gas theory, in radiation theory one could define a state of maximal probability, more exactly a general formula that would include all states for which the waves are not ordered, but cross each other in the most diverse way."^[52]

Planck could not easily adopt suggestions that reduced to naught his original project of deducing from radiation theory a strict, nonprobabilistic entropy law. In his third memoir, communicated in December 1897, he analyzed in some detail the case of a resonator placed at the center of a spherical (perfectly reflecting) cavity. On the basis of the exclusion of singular external fields he believed that he had proved an asymmetry of this system under time reversal. In order to reach his second objective, the exclusion of Poincaré's recurrences, he made a slight concession to

[50] Boltzmann 1897b.

[51] Planck 1897c.

[52] Boltzmann 1897c, BWA 3:621.

― 41 ―

Boltzmann's viewpoint by introducing a notion of disordered radiation. But this notion was not statistical; it just meant the exclusion of some states of radiation "synchronized" with the resonator, on account of their being nonphysical.^[53]

Boltzmann protested for a last time: Planck's expression for the time-reversed version of an equation used in his proof of irreversibility was simply wrong, for it "reversed" the external field without reversing the field of the resonator!^[54]

In his fourth memoir (1898) Planck humbly acknowledged his mistake and turned to a more systematic exploitation of the analogy between gas theory and radiation theory, as Boltzmann had earlier advised. Within the context of this program the central concept became that of "natural radiation"—the counterpart of molecular chaos; and the main theorem became a theorem of irreversibility, which was the counterpart of the H -theorem. In the fifth memoir Planck generalized his results to an arbitrarily shaped cavity, in the manner that will now be described in detail.^[55]

Natural Radiation

The exciting field E (t ) and the electric moment f (t ) of the resonator are conveniently described by their Fourier transforms inline image and such that^[56]

inline image

In these terms the resonator equation (62) becomes

inline image

where

inline image

As a preparatory step, Planck characterized what he called the "directly measurable" quantities connected with the resonator and with its exciting field. There was first the secular average U (t ) of the energy Y (t ) of the resonator, which can be obtained by retaining only the slow frequencies

[53] Planck 1897d.

[54] Boltzmann 1898b.

[55] Planck 1898, 1900a.

[56] Until 1899, instead of the more convenient Fourier integrals Planck used Fourier series over a time interval largely exceeding all characteristic times of the system. He never used the complex notation, although it greatly simplifies the calculations.

― 42 ―

in the Fourier spectrum of the function Y(t). More specifically, the Fourier transform inline image will be defined as the part of the Fourier transform for which the frequency ω is below a cutoff , with .^[57] Then Planck defined the spectral (electric) intensity J_ω₀(t) of the exciting field as a quantity proportional to the secular average of the energy of a test-resonator with central frequency ω₀ and a frequency width ρω₀ larger than inline image but much smaller than ω₀. Physically, this means that the test-resonator must be damped in such a way that it can "follow" the secular variations of the field without, however, losing the quality of resonating at the frequency ω₀. This can be achieved only by introducing a resistance ρ much larger than the one implied by pure radiation damping.

The secular energy U (t ) of the test-resonator is simply given by the secular average of Kf² , for the two terms in the energy formula (60) have equal secular averages. Then, the form (66) of the resonator equation gives, for a frequency µ less than the cutoff inline image :

inline image

From the condition inline image it can be inferred that Z⁻¹(ω + µ) ≈ Z⁻¹(ω) (which means that the response of the resonator is not affected by a frequency shift of the excitation by an amount small compared with the width of the resonator). Consequently, the Fourier transform of the spectral intensity has the form

inline image

where χ_ω₀(ω) is proportional to |Z⁻¹(ω)|². The normalization of J_ω₀(µ) is obtained by requiring the integrated spectral intensity to be identical with the (secular) intensity inline image :

inline image

with the result

inline image

Consequently, equation (69) just means that inline image (µ) is an average of for frequencies ω close to ω₀, with the weight function χ_ω₀.^[58]
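The statement that the two terms of the energy formula have equal secular averages can be illustrated with a short numerical sketch. It assumes that formula (60) has the familiar circuit form ½Kf² + ½L(df/dt)² (the formula itself is not reproduced in this text), takes an undamped oscillation f(t) = A cos(ω₀t) with ω₀² = K/L, and uses a plain time average over whole periods in place of the low-pass definition of the secular average. The parameter values are arbitrary.

```python
import math

def secular_averages(K=2.0, L=0.5, amplitude=1.0, periods=50, steps_per_period=2000):
    """Time-average the assumed electric and magnetic energy terms of f(t) = A cos(w0 t)."""
    w0 = math.sqrt(K / L)                      # eigenfrequency of the assumed circuit
    n = periods * steps_per_period
    dt = (2.0 * math.pi / w0) / steps_per_period
    electric = 0.0
    magnetic = 0.0
    for i in range(n):
        t = (i + 0.5) * dt                     # midpoint of each time step
        f = amplitude * math.cos(w0 * t)
        fdot = -amplitude * w0 * math.sin(w0 * t)
        electric += 0.5 * K * f * f            # assumed electric term, (1/2) K f^2
        magnetic += 0.5 * L * fdot * fdot      # assumed magnetic term, (1/2) L (df/dt)^2
    return electric / n, magnetic / n

electric_avg, magnetic_avg = secular_averages()
# The two averages agree, so their sum equals the average of K f^2.
print(electric_avg, magnetic_avg, electric_avg + magnetic_avg)
```

Under these assumptions the total averaged energy equals the average of Kf², which is the simplification used in the text.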

[57] This definition of a secular average is the one best suited to the following calculations. However, Planck preferred a more direct definition, as a time average over an interval [t, t + τ].

[58] Planck 1898, 1899.

― 43 ―

The Fundamental Equation

We now return to a genuine Planck resonator, that is, a resonator with purely radiative damping, and immerse it in cavity radiation, the walls of the cavity being perfectly reflecting (so that they have no thermalizing effect). Planck supposed the secular energy U(t) of this resonator and the spectral electric intensity J_ω₀(t) of the radiation to be the only directly measurable quantities at the place of the resonator. Accordingly, he looked for an equation directly relating these quantities, one that did not involve a more detailed mathematical description based on the functions f(t) and E(t). In this respect Planck's approach had an antecedent in Boltzmann's work: the Boltzmann equation directly rules the evolution of the velocity distribution of a gas, without requiring a detailed description of the configuration of individual molecules.^[59]

The exact value of U(t) as a function of E(t) results from the low-frequency part ( inline image ) of equation (68):

inline image

Unfortunately, this expression depends on the detailed form of the function inline image of the variable w . In order to counter this inconvenience, Planck introduced the hypothesis of "natural radiation," according to which the "perceived" by the resonator can be replaced by its measurable average .

We remember that, in the absence of a better definition, Boltzmann had characterized molecular chaos as what legitimates Maxwell's collision formula; in a similar manner, Planck defined natural radiation as what legitimates the above mathematical prescription:

In the sequel I shall assume the validity of the following hypothesis, which is most natural and presumably unavoidable: In the calculation of U from equation [68], the quickly varying factor in the integral can be replaced, without appreciable error, with its slowly varying average . The problem of calculating U from thus receives a perfectly determinate solution, to be verified by measurement.

On the qualitative side, Planck described natural radiation in a manner reminiscent of Boltzmann's reference to the irregularities of molecular motion: "We may grasp the concept of natural radiation in a less direct . . . but more intuitive manner: the deviations of the nonmeasurable, quickly

[59] Planck 1898, PAV 1:551-552; Planck 1899, PAV 1:571-573.

― 44 ―

varying quantity inline image from its slowly varying average are small and irregular."^[60]
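A toy numerical sketch may help to see why small, irregular deviations from a slowly varying average matter little once they are integrated against a smooth resonance weight. Everything in it is an assumption made for illustration: the Lorentzian weight, the linear "slow" profile, the uniform random deviations, and the function name. It is not Planck's calculation, and it only shows the averaging effect for one seeded sample.

```python
import math
import random

def weighted_integrals(n=20000, half_range=50.0, noise=0.3, seed=0):
    """Integrate a noisy quantity and its smooth average against a Lorentzian weight."""
    rng = random.Random(seed)                      # seeded for reproducibility
    dx = 2.0 * half_range / n
    with_deviations = 0.0
    with_average = 0.0
    for i in range(n):
        x = -half_range + (i + 0.5) * dx           # detuning in units of the line width
        weight = 1.0 / (math.pi * (1.0 + x * x))   # assumed narrow-resonance weight
        slow = 1.0 + 0.002 * x                     # slowly varying average (hypothetical)
        fast = slow * (1.0 + noise * rng.uniform(-1.0, 1.0))  # irregular deviations
        with_deviations += weight * fast * dx
        with_average += weight * slow * dx
    return with_deviations, with_average

noisy, smooth = weighted_integrals()
# Relative difference between the two integrals
print(abs(noisy - smooth) / smooth)
```

In this sketch the relative difference between the two integrals is much smaller than the 30 percent pointwise deviations, because the irregular deviations largely cancel under the smooth weight.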

The corresponding formal substitution transforms equation (68) into

inline image

The resonance being very narrow, one can use an approximate expression for Z(ω):

inline image

The integral in (72) becomes^[61]

inline image

Using the relations (64), the following equation results:

inline image

Planck's "fundamental equation" was just the Fourier transform of this equation:^[62]

inline image

Accordingly, the energy of a resonator is increased at a rate given by the spectral electric intensity of the surrounding field at the frequency of the resonator, but also damped with a time constant (ρω₀)⁻¹ identical with that controlling the damping of the dipolar moment of the resonator.
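The behavior just described can be sketched numerically. The example below assumes that the fundamental equation has the schematic form dU/dt = gain · J − rate · U, where rate stands for ρω₀ and gain is a hypothetical coupling coefficient introduced only for this illustration (the exact coefficient of equation (76) is not reproduced here). For a constant intensity J the energy then relaxes exponentially toward gain · J / rate.

```python
import math

def integrate_energy(u0, intensity, gain, rate, t_end, steps=2000):
    """Integrate dU/dt = gain*intensity - rate*U with a classical Runge-Kutta scheme."""
    def dudt(u):
        return gain * intensity - rate * u
    dt = t_end / steps
    u = u0
    for _ in range(steps):
        k1 = dudt(u)
        k2 = dudt(u + 0.5 * dt * k1)
        k3 = dudt(u + 0.5 * dt * k2)
        k4 = dudt(u + dt * k3)
        u += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

def closed_form(u0, intensity, gain, rate, t):
    """Exact solution for constant intensity: exponential relaxation with time constant 1/rate."""
    u_inf = gain * intensity / rate
    return u_inf + (u0 - u_inf) * math.exp(-rate * t)

# Arbitrary illustrative values: start from rest, constant exciting intensity.
print(integrate_energy(0.0, 1.0, 2.0, 0.5, 10.0))
print(closed_form(0.0, 1.0, 2.0, 0.5, 10.0))
```

The numerical and closed-form values agree, and the stationary energy is proportional to the exciting intensity, which is the relation used later for the stationary case.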

The Electromagnetic H-Theorem

From this equation Planck could derive the measurable effects of a resonator on cavity radiation.^[63] For this purpose he borrowed from optics the notion of a radiation beam, which he considered to provide the finest information about the radiation field that was accessible to measurement. An elementary conical beam is characterized by its direction z, by the solid angle dΩ in which it is confined, by its frequency, by its spectral

[60] Planck 1899, PAV 1:573.

[61] The integration can be easily performed by the method of residues.

[62] Planck 1898, PAV 1:552; Planck 1899, PAV 1:575.

[63] Planck 1898, PAV 1:543-556; Planck 1899, PAV 1:575-592.

― 45 ―

Figure 8.
Geometric parameters for a beam converging toward a resonator
(the electric field E must be orthogonal to the beam).

intensity (of energy flow) I_ν, and by its state of polarization. The latter property is given by the two principal directions (along which the intensity across a polarizer is extremal) and the corresponding principal intensities inline image and .

In this representation the radiation falling on a resonator at a time t from the direction z is characterized by two functions inline image and giving the principal intensities at the frequency ν (with ν = ω/2π), and by the angle α(t) between one of the principal directions and the plane P defined by z and the electric moment of the resonator (see fig. 8). Since the part of the radiation with a frequency close to ν₀ (= ω₀/2π) is the only one affected by the resonator, we need only consider the intensities inline image and , and we can simplify the notation by dropping the index ν₀ in the sequel. Then the intensity in the plane P is

inline image

and the intensity in the direction normal to P is

inline image

(one must have I_∥ + I_⊥ = I′ + I″).

The contribution of this beam to the electric intensity J_ν₀ (taking J_ν₀ = 2πJ_ω₀) is to the flux I_∥ dΩ what the squared component inline image of the electric field of a plane wave traveling in the direction z and polarized in P is to the modulus of the Poynting vector, (c/4π)|E × B|. Therefore,

inline image

where θ denotes the angle between z and the electric moment of the resonator. The corresponding energy absorbed by the resonator in a time unit,

― 46 ―

inline image , is given by the right-hand side of the "fundamental equation" (76):

inline image

The total energy radiated by the resonator in a time unit is given by the damping term ρω₀U in the fundamental equation. As follows from the expression (41) of dipolar radiation, the radiation emitted in the direction z is completely polarized in the plane defined by this direction and the electric moment, and its intensity varies with the angle θ like sin²θ. Consequently, the energy radiated in the solid angle dΩ around z amounts to

inline image

where the factor 3/8π gives the proper normalization, as follows from

inline image
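The normalization factor can be checked numerically. The sketch below is only a verification aid, not part of Planck's argument: it integrates sin²θ over the full solid angle with a midpoint rule (the function name and step count are arbitrary) and compares the result with 8π/3, the value that makes 3/8π the proper normalization.

```python
import math

def sin2_over_sphere(n_theta=2000):
    """Midpoint-rule estimate of the integral of sin^2(theta) over the full solid angle."""
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        # dOmega = sin(theta) dtheta dphi; the phi integral contributes a factor 2*pi
        total += math.sin(theta) ** 2 * math.sin(theta) * d_theta * 2.0 * math.pi
    return total

integral = sin2_over_sphere()
print(integral, 8.0 * math.pi / 3.0)        # the two numbers should agree closely
print(3.0 / (8.0 * math.pi) * integral)     # normalized angular distribution integrates to about 1
```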

For the intensity of this emitted radiation to be defined, the finite breadth (Breite) of the source, that is, its effective cross section times its spectral width, must be taken into account. Planck identified this breadth, σ, with the ratio of the absorbed energy to the active part (the one contributing to J_ν₀) of the incoming flux:^[64]

inline image

The intensity emitted in the direction z is therefore, using (81),

inline image

However, this radiation is only one part of the outgoing radiation (at a frequency ν ≈ ν₀) in the direction z. Another contribution comes from the part of the incoming radiation in the same direction z which is not absorbed by the resonator. The latter radiation is a mixture of a radiation polarized in the ⊥ direction with the intensity I_⊥, and of a radiation polarized in the ∥ direction with the intensity I_∥ cos²θ. Consequently, the principal directions of the total emerging radiation are these two directions, and its two principal intensities are

inline image

Calling inline image the intensity balance corresponding to a given direction z of absorption or emission, and using the identities (79),

[64] This identification might seem somewhat arbitrary, but it is the only one leading to the natural form of energy conservation expressed in equation (86).

― 47 ―

(84), and (85), the fundamental equation (76) can now be rewritten as

inline image

This equation means that the increase of the (secular) energy of the resonator is equal to the balance of the ingoing and outgoing fluxes of radiation energy.

Having expressed energy conservation in this form, Planck looked for a similar form for the total entropy variation of the system:

inline image

where S would represent the entropy of the resonator, and L an "intensity of entropy" corresponding to the intensity I of energy. Presumably guided by a formal analogy with Boltzmann's H-function, Planck posited (taking ν = ν₀, to simplify the notation)^[65]

inline image

That S depended only on U/ν was a consequence of Wien's displacement law, as will be seen later; and the factor c²/ν² next to I was suggested by the relation I_R(c²/ν²) = U sin²θ. For this choice Planck's "electromagnetic H-theorem" holds: the total entropy variation dS_T/dt is never negative, and it vanishes if and only if the distribution of intensities is isotropic and unpolarized. Planck's derivation of this result will now be given.

Using equation (86), one has

inline image

and, inserting this and (88') into (87),

inline image

We are now left with the study of the sign of an expression of the form inline image , where y = x(ln x − 1), and x = Ic²/ν²U. Writing the relations (77), (78), (84), and (85) in terms of the x variables gives

inline image

[65] Planck 1899, PAV 1:585.

― 48 ―

Figure 9.
Diagram for the proof of the electromagnetic H-theorem. The point A has the ordinate (y′ + y″)/2. The convexity of the curve implies that A is always "higher" than the point B.

As a first consequence, inline image " lies between x_∥ and 1. Since the function y = x(ln x − 1) is a convex (concave upward) function reaching its minimum for x = 1, this implies that y_∥ is larger than " and that

inline image

The positivity of the right-hand side results (fig. 9) from the convexity of the function x(ln x − 1) and from the fact that inline image " and x_∥ always lie between x′ and x″, as can be seen from (91). Therefore, the total entropy variation is always positive.^[66]

The total entropy variation vanishes only if Δy = 0 for any z, since Δy is always negative. In the above chain of reasoning, the inequalities can be replaced with equalities only if inline image " and also x′ = x_⊥, x″ = x_∥, up to a permutation of x′ and x″, which play symmetric roles. Once having eliminated the singular cases α = π/2, θ = 0, these three equalities can be satisfied only if x′ = x″ = 1, which implies that incoming and outgoing radiations are isotropic and unpolarized. This concludes the proof of what might be regarded as Planck's greatest theorem.
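The two properties of y = x(ln x − 1) on which the proof relies can be checked on a sample grid. The sketch below makes one assumption beyond the text: it takes the outgoing parallel value to be x_∥ cos²θ + sin²θ, which is how the unabsorbed part I_∥ cos²θ and the emitted part (with I_R c²/ν² = U sin²θ) combine when expressed in the x variables. The function names and sample values are hypothetical.

```python
import math

def y(x):
    """Auxiliary function y = x (ln x - 1), defined for x > 0, with its minimum at x = 1."""
    return x * (math.log(x) - 1.0)

def outgoing_parallel(x_par, theta):
    """Assumed outgoing value: unabsorbed part x_par*cos^2(theta) plus emitted part sin^2(theta)."""
    return x_par * math.cos(theta) ** 2 + math.sin(theta) ** 2

def check_inequalities(values, thetas, weights, tol=1e-12):
    """Return True if both inequalities used in the proof hold on the sample grid."""
    for a in values:
        # Moving x toward 1 can only lower y, because y has its minimum at x = 1.
        for theta in thetas:
            if y(outgoing_parallel(a, theta)) > y(a) + tol:
                return False
        # Convexity: the curve lies below the chord joining any two of its points.
        for b in values:
            for lam in weights:
                mix = lam * a + (1.0 - lam) * b
                if y(mix) > lam * y(a) + (1.0 - lam) * y(b) + tol:
                    return False
    return True

values = [0.05, 0.3, 0.9, 1.0, 1.7, 4.0, 25.0]
thetas = [k * math.pi / 12 for k in range(13)]
weights = [k / 10 for k in range(11)]
print(check_inequalities(values, thetas, weights))
```

A grid check of this kind is no substitute for the proof; it only makes the role of the minimum at x = 1 and of the chord-above-curve property concrete.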

Once uniformity has been reached, the exciting intensity appearing in the fundamental equation (76) can be expressed in terms of the spectral density ρ_ν of the radiation field. Following Planck, ρ_ν is defined by applying to the energy density (E² + B²)/8π the same operations as those that

[66] Planck's original proof was heavier here, since it did not exploit the convexity of x(ln x − 1).

― 49 ―

lead from inline image to J_ν. Writing symbolically , the density may be written

inline image

Isotropy implies

inline image

and

inline image

The fundamental equation (76) in the stationary case therefore reads:^[67]

inline image

The Blackbody Law

None of the equations reached by Planck was yet able to determine the equilibrium spectrum of radiation; they just gave a relation between the average energy of a resonator and the density of the surrounding radiation. As noted in Planck's third memoir, a resonator acts only on radiation with frequencies close to its own resonance frequency and therefore cannot change the spectrum of cavity radiation, as long as its more detailed structure is not taken into account. Planck therefore gave up his original hope (see p. 39, point (3)) to describe with the help of his resonators the evolution of the radiation spectrum toward its equilibrium value. Nevertheless, he found another access to the equilibrium spectrum, by identifying the entropy function used in the electromagnetic H-theorem with the real thermodynamic entropy. This would be a legitimate procedure only if the form

inline image

of the resonator entropy was the sole form compatible with the irreversibility theorem. In his fifth memoir (1899) Planck asserted without proof this uniqueness of form and derived the equilibrium spectrum in the following manner.^[68]

[67] Planck 1899, PAV 1:581.

[68] Ibid., 596: "I have made repeated efforts to modify the expression for the electromagnetic entropy of a resonator in such a manner . . . that it remains compatible with all well-founded electromagnetic and thermodynamic theoretical laws; but I have not succeeded."

― 50 ―

He first invoked Wien's displacement law to limit the choice of the arbitrary functions f and g in the above entropy formula. Thanks to the fundamental relation (96), this law can be expressed directly in terms of the resonator energy as

inline image

This implies dS = (ν/T)φ′(ν/T) d(ν/T), which means that S must be a function of ν/T, or another function of U/ν, since U/ν = φ(ν/T). Consequently, the resonator entropy must have the form

inline image

The absolute temperature of a resonator in thermodynamic equilibrium is then given by the relation dS/dU = 1/T , with the result

inline image

Inverting this equation provides^[69]

inline image
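The step from the entropy to the energy of the resonator can be checked numerically. The sketch assumes that the resonator entropy has the form S = −(U/aν)[ln(U/bν) − 1], which is the form usually quoted from Planck's 1899 memoir and is compatible with the dependence on U/ν stated above; the equations themselves are not reproduced in this text, so this form is an assumption. The relation dS/dU = 1/T then gives U = bν exp(−aν/T). The numerical values of a, b, ν, and T below are arbitrary.

```python
import math

def entropy(U, nu, a, b):
    """Assumed resonator entropy S = -(U / (a nu)) * (ln(U / (b nu)) - 1)."""
    return -(U / (a * nu)) * (math.log(U / (b * nu)) - 1.0)

def wien_energy(nu, T, a, b):
    """Energy obtained by inverting dS/dU = 1/T for the assumed entropy."""
    return b * nu * math.exp(-a * nu / T)

def inverse_temperature(U, nu, a, b, rel_step=1e-6):
    """Central-difference estimate of dS/dU at fixed frequency."""
    h = U * rel_step
    return (entropy(U + h, nu, a, b) - entropy(U - h, nu, a, b)) / (2.0 * h)

a, b = 0.7, 1.3          # arbitrary illustrative constants
nu, T = 2.0, 1.5         # arbitrary frequency and temperature
U = wien_energy(nu, T, a, b)
# The product T * dS/dU should be close to 1 if the inversion is consistent.
print(T * inverse_temperature(U, nu, a, b))
```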

Finally, the "fundamental equation" (96) gives for the spectral density u_ν of the blackbody spectrum

inline image

This law was not new. It had been proposed two years earlier by Wien on the basis of a fragile analogy with the exponential form of Maxwell's law. At the date of Planck's fifth memoir (early 1899) it was well confirmed by experimental measurements. At the price of incompletely justified assumptions about natural radiation and entropy, Planck therefore believed that he had reached the main objectives of his program: a deduction of irreversibility from electrodynamic processes, and a derivation of the universal law of blackbody radiation. As a clear manifestation of his trust in the fundamental character of his theory, he suggested that the constants a and b appearing in the resonator entropy be considered new fundamental constants and recommended natural units of length, time, mass, and temperature built from a, b, c , and the universal gravitation constant.^[70]
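The natural units mentioned above follow from dimensional analysis alone. The sketch below assumes that b has the dimension of energy times time and a that of time times temperature, as they must if aν/T is dimensionless and bν is an energy. The numerical CGS values are the ones commonly quoted from Planck's 1899 memoir; they are not given in this text and should be checked against the original before being relied upon.

```python
import math

# Assumed CGS values (commonly quoted from Planck 1899; not taken from the text above)
a = 0.4818e-10   # s * K
b = 6.885e-27    # erg * s
c = 3.00e10      # cm / s
G = 6.685e-8     # cm^3 g^-1 s^-2

def natural_units(a, b, c, G):
    """Combine a, b, c, G into units of length, mass, time, and temperature."""
    return {
        "length": math.sqrt(b * G / c ** 3),             # cm
        "mass": math.sqrt(b * c / G),                    # g
        "time": math.sqrt(b * G / c ** 5),               # s
        "temperature": a * math.sqrt(c ** 5 / (b * G)),  # K
    }

units = natural_units(a, b, c, G)
for name, value in units.items():
    print(f"{name}: {value:.3e}")
```

With these assumed inputs the length comes out on the order of 10⁻³³ cm and the temperature on the order of 10³² K.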

[69] Planck 1899, PAV 1:591.

[70] Wien 1896; Planck 1899, PAV 1:599-600.

― 51 ―

Planck Versus Boltzmann

From a systematic comparison of Planck's reasonings with those which led Boltzmann to the H-theorem we may surmise that Planck benefited at various steps from suggestive analogies. Table 1 summarizes the correspondence between Planck's and Boltzmann's arguments. The first horizontal section refers to the most detailed, microscopic level of description, while the second refers to the "directly observable quantities." The third and fifth sections respectively give the equations ruling the microscopic evolution of the system and those ruling the directly observable quantities. The transition between these two types of equations is expressed in the fourth section, with the idea of disorder, and the corresponding simplification of the interaction within the system. The sixth section gives the functions of the directly observable quantities which always increase or decrease. The last section describes the final state of the system in terms of the directly observable quantities.

As appears from this table, the relevant analogies concerned general concepts or categories rather than specific mathematical expressions, except for that connecting the formulae for H and S . This should not

TABLE 1. ANALOGY BETWEEN PLANCK'S AND BOLTZMANN'S IRREVERSIBILITY THEOREMS
Boltzmann	Planck
Molecular velocities: v₁, v₂, . . ., v_N	External fields: E_e(r, t), B_e(r, t); electric dipole: f(t)
Velocity distribution: f(v)	Secular resonator energy: U(t); beam intensities: I(z, t); electric intensity: J_ω(t)
Equations of molecular dynamics	Resonator equation (62); Maxwell's equations
Maxwell's Ansatz (molecular chaos)	Planck's fundamental equation (natural radiation)
Boltzmann's equation for f	Energy balancing (86) for U and beam intensities
H-function	Entropy S (for the resonator)
Maxwell's distribution	Isotropic unpolarized radiation

― 52 ―

surprise us, because the original dynamic systems, the molecular gas on the one hand, the resonator in cavity radiation on the other, exhibited strong qualitative differences. As vague as it was, the idea of a selection of disordered states for which the evolution of all measurable quantities would not depend on finer uncontrollable details was all Planck needed to establish his "fundamental equation." The rest, as we saw, proceeded from the autonomous development of Planck's program, with the exception of the entropy formula.

Did such procedural parallelism imply that Planck would also accept Boltzmann's conception of irreversibility? Somewhat surprisingly, the answer is no. For about fifteen years Planck maintained his idea of an absolute entropy law. As we have seen, Boltzmann's notion of molecular chaos was intrinsically probabilistic: only with a certain probability did it grant a deterministic evolution of the velocity distribution. Moreover, an initially disordered state had to evolve toward ordered states after a sufficiently long time, as implied by Poincaré's recurrence theorem (since perpetual disorder would exclude recurrence). In contrast, Planck's natural radiation had to remain natural forever, and ordered states were absolutely (nonstatistically) excluded from his theory as being nonphysical. In his fourth memoir Planck made this point completely explicit:

If one wishes to apply to nature the preceding theory of irreversible radiation processes, one must admit that these processes, especially those which provide temperature equilibrium, have the properties of natural radiation in any circumstance and for an unlimited amount of time.^[71]

Planck nevertheless observed that his differential equation of the dipole f (t ) combined with Maxwell's equations led to the Poincaré recurrence (at least in the case of a resonator placed at the center of a spherical mirror-cavity). This difficulty did not stop him: at the time scale at which Poincaré recurrences would occur, the deviations in the equation resulting from the detailed structure of the resonator had to play a role, and the dipolar differential equation would no longer be adequate for describing the evolution of the system. Planck commented:

This indetermination lies in the nature of things. Indeed, the physical problem has no definite solution as long as nothing is known about the resonator but its eigenfrequency ω₀ and its damping constant ρ; that our theory is able to determine the approximate course of phenomena while only these two constants

[71] Planck 1898, PAV 1:556. Needell 1980 first exhibited Planck's persistent belief that the entropy law was strictly valid and the central role of this belief in his early conception of quantum theory. I have independently reached the same conclusion in Darrigol 1988a.

― 53 ―

are given, must be regarded as a great advantage. For the same reason this theory cannot tell us more about the resonator than the determination of ω₀ and ρ. Precisely in this gap [Lücke] does the hypothesis of natural radiation find its place; otherwise, this hypothesis would be either superfluous or impossible, for the processes would be completely determined without it.^[72]

Thus Planck believed that the indetermination of the internal structure of his resonators made room for an everlasting disorder, which would make the evolution of the accessible properties of the system strictly irreversible. Not only did he not admit a probabilistic interpretation of natural radiation, but he reinterpreted Boltzmann's concept of molecular chaos in a way that would make kinetic theory acceptable to him. On this occasion he emphasized the analogy between gas theory and radiation theory:

Our electrodynamic interpretation of the second principle of thermodynamics suggests a brief comparison with the mechanical interpretation of the same principle, that is, with the corresponding questions in the kinetic gas theory. As is well known, here also we encounter the same, often-noted conflict between the fundamental equations of mechanics, which are perfectly reversible, and the second principle [of thermodynamics], which demands irreversibility for all real processes. But here also the conflict can be resolved in a very similar way by the introduction of a special hypothesis, which, as long as it remains valid, implies all consequences of the second principle. Boltzmann calls this "molecular disorder." This hypothesis is a necessary and sufficient condition for the existence of a definite function of the instantaneous state that perpetually increases in time and therefore shares the essential properties of entropy. However, the hypothesis of molecular disorder, once applied not only to the initial state but also to any subsequent time, has been shown to be incompatible with the assumption of a finite number of simple atoms confined within rigid walls; this circumstance impedes the introduction of the second principle, as a general principle, in [kinetic] gas theory. Some perceive here an objection to the legitimacy of [kinetic] gas theory, while others question the general validity of the second principle of thermodynamics. In reality one is not at all confined to these two alternatives. For whoever is willing to give up only one of the above assumptions, the existence of rigid walls—which, strictly taken, seems to be very improbable—there seems to be no obstacle whatsoever to a general application of the hypothesis of molecular disorder, and one is free to extend the second principle to arbitrarily long times even from the viewpoint of kinetic gas theory.^[73]

As was the indefinite structure of resonators, the complexity of the walls of a container was supposed to maintain a perpetual chaos warranting

[72] Planck 1898, PAV 1:557.

[73] Planck 1900a, PAV 1:619-620.

― 54 ―

an absolute irreversibility of thermodynamic phenomena. In this way the assumption of disordered states, originally meant by Boltzmann as a marginal commentary on a certain collision formula, became central to Planck's conception of thermodynamics, which sought to maintain strict irreversibility.

Pursuing the comparison between natural radiation and molecular chaos, Planck gained some intuition about the nature of the disorder intervening in his radiation theory. He emphasized that his equations led to irreversible behavior for a single resonator, while Boltzmann's considerations required a very large number of molecules, and explained: "The principle of disorder on which every notion of irreversibility seems to rest steps in at very different moments [of the reasonings] in gas theory and thermal radiation theory" (emphasis added). Disorder in a gas consisted of the irregular multitude of molecular velocities and positions, while in the resonator case it had to do with the irregular multitude of Fourier components of the electric moment. There was spatial disorder in one case, temporal disorder in the other.^[74]

Summary

In 1897 Planck defined the aim of his grand program in the first of a series of five memoirs "on irreversible radiation processes": to show that a system of ideal resonators acted irreversibly on radiation enclosed in a cavity with perfectly reflecting walls, leading eventually to a uniform, isotropic distribution of radiant energy. Moreover, he hoped that the spectrum of this radiation would evolve toward a well-defined final state, which, according to Kirchhoff's law, could only be the universal blackbody spectrum.

Planck's announcement of this project triggered a public polemic with Boltzmann. The latter argued that the laws of electrodynamics were just as reversible as the laws of molecular dynamics, so they could not be used to derive irreversible thermodynamic behavior. The only possible escape from this conclusion, Boltzmann suggested, would be—as he had done in gas theory—to shift to a statistical conception of irreversibility and to introduce a notion of "disordered" radiation that would be the counterpart of molecular chaos.

Eventually, after a few misconceived attempts to counter Boltzmann's objection, Planck did adopt a notion of disordered radiation, which he

[74] Planck 1900b, PAV 1:673-674.

― 55 ―

called "natural radiation." As a first step toward defining this notion, he introduced the relevant "directly measurable quantities," that is, certain quadratic time-averages, one for the electric moment of the resonator and one for the electric field at the place of the resonator. These were indeed accessible to measurement, for instance, in the case of the field, through the resonance of a damped test-resonator. But the differential equation relating the resonator moment and the electric field did not imply a definite relation between these quadratic expressions, unless certain "cross-terms" were set to zero. The vanishing of cross-terms is precisely what defines Planck's natural radiation; it leads to an equation involving only the directly observable quantities, which Planck called the "fundamental equation." The analogy with molecular chaos was transparent enough. In both cases there were two levels of description: the detailed micro-level, which includes uncontrollable features of the model (dynamics of molecular collisions/electrodynamic interaction between resonator and radiation), and the "physical" level of description, which involves only physically meaningful quantities (Maxwell's collision formula/Planck's fundamental equation). In order to deduce the second level of description from the first a special assumption must be made, molecular chaos in one case, natural radiation in the other.

Planck then proceeded to derive a counterpart to Boltzmann's H-theorem. To this end he discussed the effect of a resonator on radiation beams and introduced expressions for the entropy of resonators and of beams in terms of the corresponding "measurable quantities." These expressions mirrored very closely Boltzmann's H-function, and, as in Boltzmann's case, they led to a perpetual increase of the total entropy. By this means Planck could prove that resonators made the surrounding radiation increasingly (spatially) uniform and unpolarized. However, the analogy with Boltzmann's H-theorem was imperfect in at least one respect: while the Boltzmann equation led to a definite final distribution of velocities, Planck's equations said nothing about the time evolution of the spectrum of radiation. In fact, contrary to his original hope, Planck gradually realized that his resonators were unable to redistribute radiation from one frequency to another.

Nevertheless, in his concluding memoir of 1899 (the fifth!) Planck managed to derive the blackbody spectrum. He just had to assume that his expression for the resonator entropy was the only one compatible with a global entropy increase and that it was identical with the thermodynamic entropy. Then standard thermodynamic reasoning, Wien's displacement law, and the "fundamental equation" led to a definite spectrum. To

― 56 ―

Planck's satisfaction the resulting law was already well known (previously having been proposed by Wien on a frail theoretical basis) and fitted available observations excellently. Not doubting the fundamental character of this derivation, Planck extracted universal constants from his entropy formula (including h, under a different letter) and even combined them with the universal constant of gravitation to produce absolute units of length, time, and mass.

Apparently Planck had achieved his original aims: the demonstration that resonators could produce the irreversible change required by the second law, and a derivation of the blackbody spectrum. His introduction of "natural radiation" and the overall analogy of his irreversibility theorem with Boltzmann's H-theorem might suggest that he had in the process become a convert to the statistical conception of irreversibility and thus given up the motivating force behind his program, a nonstatistical foundation of the entropy law. This, however, was not the case. Planck maintained his absolute conception of irreversibility for fifteen more years. In his mind the "naturalness" of radiation was not a statistical property; instead, it applied to individual states of radiation. The internal structure of resonators, which Planck deliberately left undetermined in all his reasonings, was in charge of perpetually maintaining this naturalness. Similarly, in gas theory Planck imagined an indeterminate structure of the walls of containers that would maintain molecular chaos forever, and so make kinetic theory acceptable to him. Planck's "elementary disorder" (a generic name for molecular chaos and natural radiation) was essential for strict irreversibility and thus became the central concept of his thermodynamics.

― 57 ―

Chapter IV
The Infrared Challenge

The claim of Planck's radiation theory to determine the blackbody law was soon contradicted by Berlin's best spectroscopists. In 1899 Paschen, Lummer, and Pringsheim observed violations of Wien's blackbody law in the infrared part of the spectrum. Even though he did not immediately take this result as irreproachable, Planck came to recognize that the proof of the electromagnetic H-theorem was compatible with an infinite number of choices for S, the resonator entropy.^[75] As will presently be seen, the only necessary restriction on this choice was that the second derivative of S be negative: d²S/dU² < 0. Subsequently, Planck found a physical meaning for this derivative and used it to justify the choice of S(U) that led to Wien's law.^[76]

The Second Derivative of the Resonator Entropy

The meaning of d²S/dU² derives from the following consideration of Planck's. Having quickly increased (or decreased) the energy of a resonator initially in equilibrium with thermal radiation by an amount δU, one allows this energy to relax by dU (with dU << δU) toward its equilibrium

[75] This has to do with the fact that ideal resonators cannot change the spectrum of cavity radiation at all (see n. 49). An infinite number of radiation spectra are compatible with Planck's system, as implied by the infinite number of possible choices for the resonator entropy.

[76] Planck 1900b. On infrared blackbody measurements see Jammer 1966, 16-17; Mehra and Rechenberg 1982a, 39-43.

― 58 ―

value. The derivative d²S/dU² is then proportional to the total entropy change, dS_T, occurring during the relaxation process. More exactly, Planck proved that

inline image

in the following manner.^[77]

During the increment of U by δU the beam intensity balance ΔI (as defined on p. 46) goes from zero (equilibrium) to

inline image

since, according to (85), inline image is the only intensity that depends on the energy of the resonator. Accordingly, at the second order of approximation the entropy-intensity balance goes from zero to

inline image

In this development the index zero refers to the original state of equilibrium, for which all I's are equal to ν²U/c². Now let a time dt elapse after the excitation of the resonator. The corresponding relaxation of the resonator energy, dU, is obtained by substituting the above ΔI into the equation (86) of energy conservation:

inline image

In the same time interval the total entropy variation (from (90)) is

inline image

Inserting the development (105) in the latter equation and using again the equation (86) of energy conservation yields

inline image

The first term can be rewritten as

inline image

[77] Planck 1900b, PAV 1:679.

― 59 ―

and the integral in the last term can be evaluated using (104), (106), and

inline image

The resulting expression for dS_T reads

inline image

So that the system may evolve toward equilibrium, the product dU δU must be negative; but the individual signs of dU and δU are not fixed. Consequently, the sign of the above expression of dS_T is definite if and only if

inline image

This identity must hold for all values of I and U related through I = Uν²/c², which implies

inline image

and

inline image

Taking into account the latter remarks, the expression (111) collapses into

inline image

which was to be proved.

This calculation gave two necessary conditions for entropy increase: the form (113) for the function L(I) corresponding to the function S(U), and the convexity condition for the latter function, d²S/dU² < 0, which provides dS_T > 0 (since dU δU < 0). As Planck now noticed, these conditions were also sufficient for entropy increase. Indeed, the proof of the electromagnetic H-theorem earlier given may be adapted in the following way.

Starting from the equations (86) and (87) for energy and entropy balance, one can use the first condition (113) to derive

inline image

Then, the two conditions together imply that L(I) - I dS/dU (at a constant U) is a convex function of I, with an absolute maximum for I = ν²U/c². These properties are the only ones necessary to the rest of the

― 60 ―

proof of entropy increase. That Planck originally believed the form S = -(U/f) ln (U/g) to be the only one possible may be explained by the intricacy of his own proof of the electromagnetic H-theorem, which did not exploit the convexity of S(U).

Once aware of the physical meaning of d²S/dU², Planck considered a set of n resonators (separated from one another by large distances) immersed in the same thermal radiation and submitted to the same perturbation δU. The total entropy variation during a common relaxation dU of the energy of these resonators is n dS_T. Planck then equated this variation with the one obtained in the case of a single resonator with the initial energy nU submitted to the perturbation n δU and relaxing by n dU, which gives

inline image

This property implies the form d²S/dU² = -α/U, which integrates into Planck's original formula S = -(U/f) ln (U/g) and therefore implies Wien's law.^[78]

A New Radiation Law

Planck's colleagues quickly perceived the mistake in the above reasoning: a single resonator with the energy nU is not in equilibrium with the same thermal radiation as the n resonators with the energy U , so that there is no reason to equate the relaxation rates in the two cases. Moreover, by the summer of 1900 the violations of Wien's law in the infrared range had become incontestable. On 19 October, Planck acknowledged his mistake at the Berlin Academy. His methods were really unable to provide a precise form of the resonator entropy and were in fact compatible with an infinity of different equilibrium distributions. He nevertheless "guessed" a new blackbody law, starting from an expression for the second derivative of the resonator entropy S (U ).^[79]

[78] Of course Planck's presentation of this argument does not necessarily reflect the chronological order of his considerations. Perhaps he first noticed the extreme simplicity of the second derivative of the entropy corresponding to Wien's law and then sought a physical interpretation of this derivative that would justify the property of homogeneity.

[79] Planck 1900c. See Kuhn 1978, 96-97.

― 61 ―

This expression, he argued, had to be negative in order not to contradict the entropy law; it had to give back the form -α/U leading to Wien's law for small values of U (that is to say for large frequencies); and, for the sake of simplicity, it had to be easily integrable. Planck therefore conjectured the form

inline image

whose integration gives

inline image

This formula, once combined with dS/dU = 1/T, Wien's displacement law, and the fundamental equation u_ν = (8πν²/c³)U, leads to "Planck's law":

inline image

Within a few days blackbody spectroscopists could verify how excellently this law fitted their experimental data.
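The chain of reasoning from the second derivative of the entropy to the radiation law can be followed numerically. The sketch below is only an illustration under a stated assumption: it takes the conjectured form to be d²S/dU² = -alpha/[U(beta + U)], the expression usually attributed to Planck's October 1900 communication, which is not legible in the text above. The function names and the unit values alpha = beta = 1 are hypothetical. Integrating once and imposing dS/dU = 1/T yields the mean resonator energy; for small U the result approaches the exponential (Wien) form, while for large U it grows linearly with T.

```python
import math

# Assumption (not reproduced in the text above): the conjectured second
# derivative is d2S/dU2 = -alpha / (U * (beta + U)), with alpha, beta > 0.

def entropy_slope(U, alpha, beta):
    """dS/dU from one integration of the assumed form.

    The integration constant is fixed by requiring dS/dU -> 0 as U -> infinity,
    i.e. infinite temperature at infinite energy.
    """
    return (alpha / beta) * math.log((beta + U) / U)

def planck_energy(T, alpha, beta):
    """Mean resonator energy U(T), obtained by solving entropy_slope(U) = 1/T."""
    return beta / math.expm1(beta / (alpha * T))

def wien_energy(T, alpha, beta):
    """Small-U limit, where the assumed form reduces to -(alpha/beta)/U."""
    return beta * math.exp(-beta / (alpha * T))

if __name__ == "__main__":
    alpha, beta = 1.0, 1.0  # arbitrary illustrative units
    for T in (0.05, 2.0, 100.0):
        print(T, planck_energy(T, alpha, beta), wien_energy(T, alpha, beta))
```

In this sketch the two energies agree closely at low temperature and separate as T grows, which is the qualitative behavior the infrared measurements demanded. Turning U(T) into a spectral law additionally requires the fundamental equation and Wien's displacement law, which fixes how beta scales with frequency; that step is not coded here.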

In the same communication Planck noted that the form of S(U) was logarithmic, "as suggested by probability calculus." Presumably he was already hoping for a deeper justification of this law based on Boltzmann's relation between entropy and combinatorial probability. He was here confronted with a situation different from that occurring in Boltzmann's gas theory. In the gas case the Boltzmann equation provided by itself the equilibrium distribution, and the method of complexions gave hardly more than "an illustration of the mathematical meaning of the quantity H" of the H-theorem, as Boltzmann wrote in his Gastheorie. Instead, in Planck's radiation theory the electromagnetic H-theorem had nothing to say about the equilibrium spectrum, while the method of complexions seemed to offer a new hope of a derivation of this spectrum.^[80]

This is why Planck decided to turn to the combinatorial method. However, he consistently rejected the probabilistic context of Boltzmann's original considerations. One could well call a certain mathematical function of the state of a system a "probability" without having to consider the increase of this function in time as a matter of probability. Such was Planck's

[80] Planck 1900c, PAV 1:689; Boltzmann 1896, trans., 55. In Boltzmann's H-theorem the temperature is unambiguously defined through the average kinetic energy corresponding to the distribution f(v). In the electromagnetic H-theorem the only available definition of temperature is through the relation dS/dU = 1/T.

― 62 ―

opinion, as already expressed in a letter to Graetz of 1897:

Probability calculus can serve, if nothing is known in advance, to determine a state [of equilibrium] as the most probable one. But it cannot serve, if an improbable state is given, to compute the following state. That is determined not by probability but by mechanics.^[81]

The relation between entropy and probability had been introduced by Boltzmann in 1877 in the context earlier described (see pp. 16-17). It is now time to specify the mathematical content of these considerations, on which Planck's success very much depended.^[82]

Boltzmann's Combinatorics

The basic object of Boltzmann's combinatorics of 1877 is a perfect gas of point molecules, a microstate of which is characterized by the set of molecule velocities and positions. In a first simplifying step, Boltzmann considers only the kinetic energies of the molecules and tries to define the probability of an energy partition (Energieverteilung ), that is, of a distribution of a given total energy E over the molecules. Since energy is a continuous variable, there is no obvious definition of such a probability.

As in his combinatorial considerations of 1868, Boltzmann starts with a "fiction" wherein molecules can take only discrete energy values 0, ε, 2ε, . . ., iε, . . . . Then, if molecules are labeled by the index a (a = 1, 2, . . ., N), a microstate of the system, or "complexion," is defined by attributing to each molecule a given energy:

inline image

where i_a ε is an integral multiple of ε. An "energy partition" is given by a sequence of integers N₀, N₁, . . ., N_i, . . ., where N_i is the number of molecules carrying the energy iε. To a given partition correspond inline image different Komplexionen. Boltzmann calls this number the "permutability," since it is equal to the number of permutations of the N molecules that transfer at least one molecule from one discrete energy value to another:

inline image

The probability W of a partition (N₀ , N₁ , . . ., N_i , . . .) is obtained through division of inline image by the normalization factor; Boltzmann gave an explicit formula for this divisor:

inline image

[81] Planck to Graetz, 23 Mar. 1897 (Deutsches Museum), quoted in Kuhn 1978, 265-266.

[82] Boltzmann 1877b.

― 63 ―

where the sum is taken over all distributions (N₀ , N₁ , . . . N_i , . . .) such that

Σ_i N_i = N,   Σ_i i N_i = P   (124)

and P is obtained by dividing the total energy E by ε. It might be worth mentioning that this combinatorial formula was the one later used by Planck.

If the N_i 's are large enough to allow the Stirling approximation,

inline image

W reaches its maximum value under the constraints (124) on the total number of particles and energy if, for any i and for suitable Lagrange multipliers α and β,

inline image

This equation implies that N_i must be proportional to e^(-iβε). Maxwell's distribution, or a discrete imitation of it, appears to be the "most probable" one, in the sense that it has the greatest number of complexions.
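The "fiction" is small enough to enumerate directly. The following sketch is an illustration only; the function names and the choice of 7 molecules sharing 7 energy elements are assumptions made here for convenience. It lists every energy partition, computes its permutability as N!/(N₀! N₁! ...), and identifies the partition with the greatest number of complexions. It also checks that the permutabilities add up to the binomial count (N + P - 1)!/[P! (N - 1)!].

```python
from collections import Counter
from math import comb, factorial

def partitions(total, max_parts, largest=None):
    """Yield non-increasing tuples of positive integers summing to `total`,
    using at most `max_parts` parts (the nonzero molecular energies, in units
    of the energy element)."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, max_parts - 1, first):
            yield (first,) + rest

def permutability(nonzero_energies, n_molecules):
    """Number of complexions of one partition: N! / (N_0! N_1! N_2! ...)."""
    n_zero = n_molecules - len(nonzero_energies)  # molecules with zero energy
    denominator = factorial(n_zero)
    for count in Counter(nonzero_energies).values():
        denominator *= factorial(count)
    return factorial(n_molecules) // denominator

def survey(n_molecules, n_elements):
    """Map each partition (tuple of nonzero energies) to its permutability."""
    return {p: permutability(p, n_molecules)
            for p in partitions(n_elements, n_molecules)}

if __name__ == "__main__":
    table = survey(7, 7)
    best = max(table, key=table.get)
    print(len(table), sum(table.values()), comb(13, 7), best, table[best])
```

With so few molecules the most probable partition is only a coarse, decreasing staircase; the exponential law emerges in the limit of large occupation numbers, where the Stirling approximation applies.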

So much for the fiction. Boltzmann then turned to the more realistic continuous case. This was readily achieved by supposing that the energy unit ε was small enough to consider that molecules whose energies lie between iε and (i + 1)ε have the same energy. The numbers N_i now count the molecules in the various energy intervals. Provided that the sums over i can be approximated by integrals, the most probable number of molecules whose energy K lies within an infinitesimal energy interval dK is proportional to e^(-βK) dK.

This is not yet Maxwell's law. To get it Boltzmann had to cut up the velocity space, instead of the energy axis, into uniform cells. Then Maxwell's expression

inline image

was found to represent the most probable distribution of molecular velocities. Boltzmann also considered the positions r together with the velocities of the molecules. In this case the (r, v)-space has to be cut up into uniform cells, and N_i gives the number of molecules in the cell i . In the continuous limit the most probable distribution f (r, v) is uniform in the r-space and results in Maxwell's distribution in the velocity space. Furthermore, the logarithm of permutability may be calculated in this case to give

inline image

― 64 ―

or, in the continuous approximation,

inline image

In the case of maximal probability this expression of ln inline image, Boltzmann noted, is identical with the function -H and therefore gives the entropy of a perfect gas up to an additive constant (function of N).
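The step from the permutability to a logarithmic sum rests on the Stirling approximation, and its quality depends on the size of the occupation numbers. The sketch below is a hedged illustration: it assumes the leading Stirling estimate ln(N!/∏N_i!) ≈ N ln N - Σ N_i ln N_i, which is the standard form and may differ by constants from the displayed formula, and it uses made-up occupation numbers and hypothetical function names.

```python
import math

def ln_permutability(occupations):
    """Exact ln( N! / (N_0! N_1! ...) ), computed with the log-gamma function."""
    n_total = sum(occupations)
    return math.lgamma(n_total + 1) - sum(math.lgamma(k + 1) for k in occupations)

def stirling_estimate(occupations):
    """Leading Stirling estimate N ln N - sum_i N_i ln N_i (empty cells contribute 0)."""
    n_total = sum(occupations)
    return n_total * math.log(n_total) - sum(k * math.log(k) for k in occupations if k > 0)

def relative_error(occupations):
    """Relative gap between the estimate and the exact logarithm."""
    exact = ln_permutability(occupations)
    return abs(stirling_estimate(occupations) - exact) / exact

if __name__ == "__main__":
    print(relative_error([4000, 3000, 2000, 1000]))  # large cells: small error
    print(relative_error([2, 1, 1]))                 # sparsely filled cells: poor estimate
```

The contrast between the two cases illustrates why the cells must each hold many molecules before the combinatorial expression can be identified with an entropy.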

These calculations were simple enough, but their point of departure, the expression of the probability of a state distribution, needed further justification, as Boltzmann himself recognized: "I do not think that one is allowed to set this forth [that the equilibrium state is the most probable one] as something obvious, at least not without having first defined very precisely what is meant by the most probable state-distribution." He tried to justify the two main assumptions leading up to his expression for permutability, namely, the possibility of cutting up, and the uniformity of, (r, v)-space.^[83]

This uniformity, he said, resulted from the invariance of the differential element d³r d³v during a Hamiltonian evolution of a molecule. But the accompanying proof was either incomplete or wrong. In any case it could not fill the conceptual gap later emphasized by Einstein: a proper connection between the evolution in time of a system and the probability of its state was needed to justify not only uniformity in (r, v)-space but also, more generally, the relevance of combinatorial probabilities to thermodynamics.^[84]

Boltzmann's combinatorics, if not fully justifiable, had, at least, to be consistent. In this respect the recourse to finite cells could seem problematic. The number of molecules in a given cell had to be large (more precisely, there had to be a negligible number of cells for which N_i is neither zero nor very large) so that the Stirling approximation could be applied. At the same time the size of the cells had to be small enough that the sums over i could be approximated by integrals. Instead of directly investigating the consistency of these assumptions, Boltzmann preferred an analogy with familiar problems of kinetic theory:

Nevertheless, after closer inspection, this assumption must be regarded as obvious. Indeed, any application of differential calculus to gas theory rests on the same assumption. If for instance one wishes to calculate diffusion, viscosity, conductivity, etc., one has to admit in the same way that in every infinitesimal element of volume dx dy dz there is still an infinite number of gas molecules

[83] Ibid., 193.

[84] See Kuhn 1978, 55.

― 65 ―

with velocity components lying between the limits u and u + du, v and v + dv, w and w + dw . This assumption means only that one can choose the limits for u, v, w , so that they include a very large number of molecules and that one may nonetheless regard all these molecules as having the same velocity components.^[85]

In the context of the 1877 memoir, one can easily check the legitimacy of this hypothesis for the most probable distribution of molecules. For instance, in the simplest case in which the energy axis is cut up into intervals of equal size ε, the numbers N_i are given by

inline image

It can easily be seen that the condition for the Stirling approximation to be valid is a large value of the number N₀ of molecules in the zero-energy interval:

inline image

The other condition, that the sums can be replaced with integrals, reads:

inline image

The two conditions are both met if

inline image

Consequently, for any value of the available energy (E = N/β) the size of the cells can be chosen consistently, and it then disappears from the final result for the most probable distribution and the corresponding entropy.
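The compatibility of the two requirements can be illustrated with numbers. The sketch below is not taken from the memoir: it assumes the geometric occupation numbers N_i = N(1 - e^(-x)) e^(-ix), with x the product of the energy multiplier and the cell size, as implied by the exponential proportionality and the normalization to N molecules. The function name and the sample values are hypothetical. It reports the occupation of the zero-energy cell, which must be large for the Stirling approximation, and the relative gap between the discrete energy sum and its continuous value N divided by the multiplier, which must be small for sums to pass into integrals.

```python
import math

def cell_size_check(n_molecules, beta, eps):
    """Return (N_0, relative gap between discrete and continuous total energy).

    Assumes N_i = N (1 - e^{-x}) e^{-i x} with x = beta * eps.
    """
    x = beta * eps
    n_zero = n_molecules * (-math.expm1(-x))          # N_0 = N (1 - e^{-x})
    energy_sum = n_molecules * eps / math.expm1(x)    # closed form of sum_i i*eps*N_i
    energy_integral = n_molecules / beta              # continuous limit, E = N / beta
    gap = abs(energy_sum - energy_integral) / energy_integral
    return n_zero, gap

if __name__ == "__main__":
    for eps in (1e-9, 1e-3, 1.0):
        print(eps, cell_size_check(10**6, 1.0, eps))
```

For a million molecules the middle value satisfies both requirements, the smallest cell leaves the zero-energy cell almost empty, and the largest makes the discrete sum depart markedly from the integral.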

Since in Planck's later combinatorics ε is not always a negligible fraction of 1/β (= kT) and appears in the final entropy formula, it is important to understand what makes it disappear in Boltzmann's case. The reason is not that ε is infinitesimal in the mathematical sense; indeed, it must be larger than 1/(Nβ). At a purely formal level, the elimination occurs when sums like the one giving the entropy are replaced with integrals. Boltzmann had to take this formal step because the main physical quantity of interest, the distribution of molecules over cells (N₀, N₁, . . ., N_i, . . .), was expected to be well approximated by a continuous distribution, the most probable of which is Maxwell's law. We will find that neither this circumstance nor its formal corollary occurs in Planck's combinatorics.

To summarize, Boltzmann's memoir of 1877 on entropy and probability was not very explicit about the physical meaning of its main procedural elements, the method of dividing up the space of configurations and the

[85] Boltzmann 1877b, 197-198.

― 66 ―

uniformity of this space. It was clear to him, however, that combinatorial methods were relevant insofar as they were able to reproduce the "continuous" entropy formula

inline image

which had already been founded on what he considered to be more fundamental bases, that is, on the ergodic hypothesis or on the methods of the H-theorem. On the contrary, Planck, still unfamiliar with the foundations of Boltzmann's theory, would venture to confer physical meaning on the artifact of energy elements.

Quantified Chaos

When in late 1900 Planck tried to apply Boltzmann's combinatorial method to his resonator, he naturally drew his inspiration from the marked analogy between the H-theorem for gases and that for radiation. The pivot of this analogy had to be the principle of disorder, since it was at the center of his conception of irreversibility. In this regard a very typical statement of Planck reads, "One can speak of a disorder, therefore of an entropy of a resonator" (March 1900).^[86] As we saw, Planck identified the kind of disorder affecting a resonator as the unobservable temporal fluctuation of the energy Y(t) of this resonator around its secular average U. More generally, disorder was what made the microscopic details of the description of a system irrelevant to the evolution of the really accessible parameters of the system, namely spatial (in the gas case) or secular (in the radiation case) averages.

In harmony with his idea of the centrality of disorder, Planck perceived Boltzmann's permutability as a quantitative measure of molecular chaos, whereas Boltzmann never tried to make such a direct connection between entropy and disorder. For a gas, disorder is what makes the detailed microscopic configuration irrelevant to the evolution of the distribution f(v) or f(r, v) entering the Boltzmann equation. Permutability, being a measure of the number of microscopic configurations compatible with the distribution f, appeared to Planck as a natural quantitative measure of this disorder. In the resonator case, the counterpart of the permutability had to be something like the number of functions Y(t) compatible with a given average energy U, since the disorder lay in the uncontrollable irregularity of the instantaneous energy.

[86] Planck 1900b, PAV 1:674.

― 67 ―

There is no doubt that this relation between entropy and disorder was the key point of Planck's published reasoning. His famous communication of 14 December 1900 to the Berlin Academy introduced the new determination of the entropy of a resonator with the words "Entropie bedingt Unordnung," that is, "Entropy presumes disorder." It continued with a description of the nature of the disorder of a resonator drawn from his previous theory that justified his subsequent computation of the number of complexions, or "probability," leading to the entropy of a resonator. Planck's use of the word "probability" in this context should not confuse the reader: he just meant it in the mathematical sense of an abstract lottery game leading to combinatorial expressions.^[87]

Admittedly, Planck's presentation did not necessarily reflect the way he really reached his derivation of the blackbody law. It would seem plausible that he worked backward from the radiation law and guessed the proper combinatorics from the form of the resonator entropy (as in Rosenfeld's reconstruction, for instance). According to a letter from Planck to Lummer of 26 October 1900, the truth seems to have lain somewhere in between; that is to say, Planck simultaneously used inductive (from the blackbody law to entropy) and deductive (from entropy to the blackbody law) considerations:

If the prospect should exist at all of a theoretical derivation of the radiation law, which I naturally assume, then, in my opinion, this can be the case only if it is possible to derive the expression for the probability of a radiation state, and this, you see, is given by the entropy. Probability presumes disorder, and in the theory I have developed, this disorder occurs in the irregularity with which the phase of the oscillation changes even in the most homogeneous light. A resonator, which corresponds to a monochromatic radiation, in resonant oscillation will likewise show irregular changes of its phase [and also of its instantaneous energy, which was more important to Planck's subsequent derivation], and on this the concept and the magnitude of its entropy are based. According to my formula [the blackbody law communicated on 19 October to the German Academy], the entropy of the resonator should come to:

[formula (119)], and this form very much recalls expressions occurring in the probability calculus. After all, in the thermodynamics of gases, too, the entropy S is the log of a probability magnitude, and Boltzmann has already stressed the close relationship of the function X^x , which enters the theory of combinatorics, with the thermodynamic entropy. I believe, therefore, that the prospect would certainly exist of arriving at my formula by a theoretical route, which

[87] Planck 1900d.

― 68 ―

would then also give us the physical significance of the constants C and c [of Planck's law].^[88]

In any case, the final justification of Planck's calculation in terms of quantified chaos certainly determined his opinion about the status of the finite energy elements.

Having described the nature of the disorder to be found in a resonator, Planck continued his communication as follows. In order to give a definite meaning to the number W of evolutions Y(t) of the energy compatible with a given temporal average U , he replaced Y(t) with its value at N different instants of time, or, more exactly, with the energy values of N independent (far removed from one another) identical resonators at one given instant. Then W is represented by the number of distributions of a total energy E = NU over these N resonators.^[89]

For the rest, Planck proceeded in exact analogy with Boltzmann's "fiction." That is to say, he divided the energy E into finite elements e :

If E is taken to be an indefinitely divisible quantity, the distribution is possible in an infinite number of ways. But I regard E —and this is the essential point of the whole calculation—as made up of a completely determinate number of finite equal parts, and for this purpose I use the constant of nature h = 6.55 × 10^-27 (erg · sec). This constant, once multiplied by the common frequency of the resonators, gives the energy element e in ergs, and by division of E by e we get the number P of energy elements to be distributed over the N resonators. When this quotient is not an integer, P is taken to be a neighboring integer.^[90]

According to this hypothesis, W becomes the total number of complexions compatible with the total energy E , wherein the word "complexion" is defined strictly in Boltzmann's sense, by attributing to each resonator a given discrete energy (as specified in (121)). One could calculate this number by adding the permutabilities of all the distributions (N₀ , N₁ , . . .,N_i , . . .) (in Boltzmann's notation) such that ^[91]

inline image

For the sake of simplicity, however, Planck preferred to compute W directly as the "number of distributions of P energy elements over N resonators," it being understood that only the number (not the identity) of the energy elements attributed to a given resonator is considered. The latter stipulation surprised some of Planck's readers (Ehrenfest and Natanson), but it was in fact implied by the analogy with Boltzmann's fiction.^[92]

[88] Planck to Lummer, 26 Oct. 1900, quoted in Jungnickel and McCormmach 1986, 2: 261-262.

[89] Planck's W stands for Wahrscheinlichkeit (probability).

[90] Planck 1900d, PAV 1:700-701.

[91] N can always be chosen so great that the constraint does not play any role. In Planck 1906, 151, the characterization of W as a sum of permutabilities comes first; then Planck switches to the "faster and easier" method of the energy elements.

For the W formula Planck referred his reader to the calculus of combinations. Here follows an elegant proof, due to Ehrenfest and Kamerlingh Onnes (1914). A complexion may be represented as a symbol

inline image

containing P times the symbol e and N - 1 times the symbol /. The number of complexions is therefore equal to the number (N + P - 1)! of all these symbols regarded as different, divided by the number, P!, of permutations of the e symbols and by the number, (N - 1)!, of permutations of the / symbols:^[93]

$$W = \frac{(N + P - 1)!}{(N - 1)!\,P!}.$$
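The counting argument can be checked by brute force for small N and P. The following is a minimal sketch, not taken from Planck or from Ehrenfest and Kamerlingh Onnes; the function names are illustrative assumptions. It enumerates every assignment of whole energy elements to the resonators, compares the count with (N + P - 1)!/[(N - 1)! P!], and also confirms that adding Boltzmann's permutabilities N!/(N₀! N₁! . . .) over all admissible distributions gives the same number.

```python
from collections import Counter
from itertools import product
from math import comb, factorial


def count_complexions_brute(n_resonators, p_elements):
    """Count assignments (i_1, ..., i_N) of whole elements with i_1 + ... + i_N = P."""
    return sum(
        1
        for occupation in product(range(p_elements + 1), repeat=n_resonators)
        if sum(occupation) == p_elements
    )


def count_complexions_formula(n_resonators, p_elements):
    """(N + P - 1)! / ((N - 1)! P!), written as a binomial coefficient."""
    return comb(n_resonators + p_elements - 1, p_elements)


def sum_of_permutabilities(n_resonators, p_elements):
    """Add N!/(N_0! N_1! ...) over all distributions with sum N_i = N and sum i*N_i = P."""
    distributions = set()
    for occupation in product(range(p_elements + 1), repeat=n_resonators):
        if sum(occupation) == p_elements:
            counts = Counter(occupation)
            # N_i = number of resonators holding exactly i elements
            distributions.add(tuple(counts.get(i, 0) for i in range(p_elements + 1)))
    total = 0
    for distribution in distributions:
        permutability = factorial(n_resonators)
        for n_i in distribution:
            permutability //= factorial(n_i)
        total += permutability
    return total


if __name__ == "__main__":
    for n, p in [(3, 4), (4, 3), (5, 5)]:
        print(n, p,
              count_complexions_brute(n, p),
              count_complexions_formula(n, p),
              sum_of_permutabilities(n, p))
```

The brute-force enumeration grows as (P + 1)^N, so it is only meant for very small cases; the binomial form is the practical one.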

Adapting Boltzmann's relation between entropy and probability to this problem, Planck wrote

$$S = \frac{k}{N} \ln W$$

for the entropy S of a single resonator. The constant k, Planck emphasized, had to be the same in gas theory and in radiation theory (Boltzmann did not need such a constant, since he measured temperatures in energy units). But contrary to Boltzmann's case, no extremum procedure was necessary here: as a consequence of the different type of disorder, the average energy U, not the more detailed distribution (N₀, N₁, . . ., N_i, . . .), characterizes the "macroscopic" state of the resonators.^[94]

As N, the number of exemplars of the resonator, can be taken as great as one wishes, the Stirling approximation applies:

$$S = k\left[\left(1 + \frac{U}{e}\right)\ln\left(1 + \frac{U}{e}\right) - \frac{U}{e}\,\ln\frac{U}{e}\right].$$
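How quickly the large-N limit is reached can be illustrated numerically. The sketch below assumes that the entropy per resonator is (k/N) ln W with P = NU/e, and that the Stirling limit has the form (1 + U/e) ln(1 + U/e) - (U/e) ln(U/e) in units of k; the function names are hypothetical.

```python
from math import lgamma, log


def entropy_exact(n_resonators, u_over_e):
    """(1/N) ln W with W = (N + P - 1)! / ((N - 1)! P!) and P = N * U/e, in units of k."""
    p = round(n_resonators * u_over_e)
    # lgamma(m + 1) equals ln(m!), so the three terms are ln (N+P-1)!, ln (N-1)!, ln P!
    ln_w = lgamma(n_resonators + p) - lgamma(n_resonators) - lgamma(p + 1)
    return ln_w / n_resonators


def entropy_stirling(u_over_e):
    """Large-N limit (1 + U/e) ln(1 + U/e) - (U/e) ln(U/e), in units of k."""
    x = u_over_e
    return (1 + x) * log(1 + x) - x * log(x)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, entropy_exact(n, 2.0), entropy_stirling(2.0))
```

The difference between the two expressions shrinks roughly like (ln N)/N, which is why N can be "taken as great as one wishes" without affecting the result.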

[92] On Ehrenfest's and Natanson's reactions and Planck's reply, see Darrigol 1988b, 52.

[93] Ehrenfest and Kamerlingh Onnes 1915.


From the relation between entropy and temperature, 1/T = dS/dU, results

$$U = \frac{e}{\exp(e/kT) - 1}.$$

The fundamental relation (96) and e = hν finally give:

$$u_\nu = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{\exp(h\nu/kT) - 1},$$

which is the canonical form of Planck's law.
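The step from the entropy to the mean resonator energy can be verified numerically. This is a sketch under two assumptions: that the Stirling form of the entropy is S = k[(1 + U/e) ln(1 + U/e) - (U/e) ln(U/e)], and that the resulting mean energy is U = e/(exp(e/kT) - 1). The function names are illustrative only.

```python
from math import exp, log


def entropy_per_k(u, e):
    """S/k for one resonator in the Stirling limit."""
    x = u / e
    return (1 + x) * log(1 + x) - x * log(x)


def mean_energy(e, kT):
    """Mean resonator energy U = e / (exp(e/kT) - 1)."""
    return e / (exp(e / kT) - 1)


def inverse_kT_numeric(u, e, du=1e-6):
    """Central-difference estimate of d(S/k)/dU, which should equal 1/(kT)."""
    return (entropy_per_k(u + du, e) - entropy_per_k(u - du, e)) / (2 * du)


if __name__ == "__main__":
    e, kT = 1.0, 0.7
    u = mean_energy(e, kT)
    print(inverse_kT_numeric(u, e), 1 / kT)
```

Analytically the derivative is (1/e) ln(1 + e/U), and inserting U = e/(exp(e/kT) - 1) gives 1/(kT), which the finite-difference estimate reproduces.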

In a subsequent publication Planck explained that the proportionality of the energy element e to the frequency was implied by Wien's displacement law (expressed in the form (98)). Later, in 1906, he showed that this property and also the uniformity of the cutting up of the energy axis resulted from Boltzmann's general assumption of uniformity in configuration space (here the (f, Lḟ)-plane) and from the quadratic form of the energy of a resonator. These remarks made the analogy with Boltzmann's combinatorics even closer.^[95]

Quantum Continuity

Table 2 summarizes the formal correspondence between Boltzmann's "fiction" and Planck's combinatorics of N exemplars of a resonator.

Despite the exact transposition of the definition of a complexion, this correspondence is certainly not the most direct that one could imagine. Had he not been guided by his interpretation of W as a measure of disorder, Planck would no doubt have characterized the macrostate by a distribution (N₀, N₁, . . ., N_i, . . .), in the resonator case as in the gas case. This would have led to

inline image

and

inline image

where the distribution (N₀, N₁, . . ., N_i, . . .) is the one for which W reaches its maximum, with the constraints (135) inline image

Table 2. Analogy between Planck's and Boltzmann's combinatorics
	Gas	Resonators
Microstate	Complexion:	Complexion:
Macrostate	Energy partition: (N₀, N₁, . . ., N_i, . . .)	Total energy: E = NU
Number of complexions	Permutability:	"Probability" W = (N + P - 1)!/[(N - 1)! P!]
"Boltzmann's principle"		S = k ln W
Uniformity	in (r, v)-space	in (f, Lḟ)-plane

An elementary calculation by the method of Lagrange multipliers (see equation 126) gives

$$U = \frac{e}{\exp(e/kT) - 1},$$

which is identical with Planck's expression (138) (this identity results from the fact that most complexions belong to distributions that are very close to the most probable one).^[96]
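The "elementary calculation" can be sketched as follows. This is a reconstruction under stated assumptions rather than the author's own working: the permutability is taken to be W = N!/(N₀! N₁! . . .), the entropy per resonator is S = (k/N) ln W at the maximum, and the multipliers α and β and the normalization A are symbols introduced here for illustration.

```latex
\begin{align*}
\ln W &\approx N\ln N - \sum_i N_i \ln N_i,
  \qquad \sum_i N_i = N, \quad \sum_i i\,N_i = P, \\
0 &= \delta\Bigl[\ln W - \alpha \sum_i N_i - \beta \sum_i i\,N_i\Bigr]
  \;\Longrightarrow\; N_i = A \exp(-\beta i), \\
\frac{U}{e} = \frac{P}{N}
  &= \frac{\sum_i i \exp(-\beta i)}{\sum_i \exp(-\beta i)}
   = \frac{1}{\exp(\beta) - 1}, \\
\frac{S}{k} &= \frac{1}{N}\ln W_{\max}
   = -\ln\bigl(1 - \exp(-\beta)\bigr) + \beta\,\frac{U}{e}, \\
\frac{1}{T} = \frac{dS}{dU} &= \frac{k\beta}{e}
  \;\Longrightarrow\; U = \frac{e}{\exp(e/kT) - 1}.
\end{align*}
```

The most probable distribution is thus geometric in i, and its mean energy coincides with the value obtained by counting all complexions at once.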

However, if the distributions (N₀, N₁, . . ., N_i, . . .) really played similar roles in the gas case and in the resonator case, they would have to be replaced in both cases by their continuous limits, formally obtained by setting e → 0 in the above expressions. The expression (139) for the resonator energy would therefore become U = kT, and instead of Planck's law one would have

$$u_\nu = \frac{8\pi\nu^2}{c^3}\,kT,$$

a result incompatible with experiments, and even absurd since it leads to an infinite energy for the entire spectrum.^[97]

[96] The above calculation provided the formal basis for the derivation of Planck's law first given in Lorentz 1910 (see Darrigol 1988b, 62-63) and interpreted in terms of an intrinsic discontinuity of the resonator energy.
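Both claims, the limit U → kT and the divergence of the total energy, can be illustrated with a short numerical sketch. It assumes the mean energy U = e/(exp(e/kT) - 1) and works with the dimensionless frequency x = hν/kT, so that the spectral integrands are proportional to x³/(exp(x) - 1) for Planck's law and to x² for the continuous limit. The function names and the midpoint-rule integrator are illustrative choices.

```python
from math import expm1


def mean_energy(e, kT):
    """U = e / (exp(e/kT) - 1); expm1 keeps precision when e/kT is small."""
    return e / expm1(e / kT)


def planck_integrand(x):
    """Dimensionless Planck spectrum, x = h*nu/(k*T)."""
    return x**3 / expm1(x)


def equipartition_integrand(x):
    """Dimensionless spectrum obtained when U = kT at every frequency."""
    return x**2


def integrate(f, x_max, steps=100_000):
    """Midpoint rule on (0, x_max); avoids evaluating f at x = 0."""
    dx = x_max / steps
    return sum(f((j + 0.5) * dx) for j in range(steps)) * dx


if __name__ == "__main__":
    for e in (1.0, 0.1, 0.001):
        print("e =", e, " U/kT =", mean_energy(e, 1.0))
    for x_max in (25.0, 50.0, 100.0):
        print("cutoff", x_max,
              integrate(planck_integrand, x_max),
              integrate(equipartition_integrand, x_max))
```

The Planck integral settles at a finite value once the cutoff exceeds a few tens, whereas the equipartition integral keeps growing as the cube of the cutoff.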

Because his conception of disorder implied a combinatorics different from Boltzmann's, Planck avoided this catastrophic conclusion. However, a deeper understanding of the foundations of Boltzmann's theory would have left him no choice. As Einstein correctly pointed out in 1905, Planck's resonators could be brought to interact with the molecules of a gas, without their interaction with radiation being substantially modified. In such a system, the energy distribution of the resonators and the velocity distribution of the molecules play parallel roles. Consequently, the reasoning just given should apply, and the formula U = kT should hold, as a particular case of energy equipartition.^[98]

In conformity with the original object of his program, Planck reasoned purely in terms of radiation theory and did not consider such an admixture of molecular and electrodynamic systems. Furthermore, like most of his colleagues, he did not believe in the generality of the equipartition law, from which the relation U = kT trivially resulted. This law generally led to values of the specific heats of materials that were much too high, and the best specialists, including Lord Kelvin and Boltzmann (with the exceptions of Gibbs and Maxwell, who died prematurely), attributed this failure to some unknown intricacies of molecular dynamics.^[99]

Planck's only guide was his characterization of disorder in radiation theory, which deprived the energy distribution (N₀, N₁, . . ., N_i, . . .) of a direct physical meaning and made the secular energy U the only observable property of resonators (except for their frequency, of course). The energy elements had therefore no reason to disappear from the end results. Being the gauge of elementary disorder, they fitted harmoniously, like the hypothesis of natural radiation, in the logical "gap" left open by the indetermination of the detailed structure of resonators; they started to play a role where ordinary electrodynamics ceased to provide definite information. In other words, electrodynamic laws and Planck's energy elements did not contradict each other; they complemented each other.

In this context Planck could not possibly have understood the introduction of energy elements as a discrete selection of the admissible energy values of a resonator. Such a discontinuity would have contradicted the rest of his theory, especially the proof of the fundamental equation (96), which entered the derivation of the blackbody law. Moreover, Planck's own wording of the "essential point" of his communication of 14 December leaves no room for doubt. Immediately after introducing the energy elements, he wrote: "When this quotient [E/e] is not an integer, P is taken to be a neighboring integer." This by itself shows that the energy of N independent resonators, and a fortiori the energy of a single resonator, was not thought to be restricted to multiples of e.^[100]

[97] This result is called the "Rayleigh-Jeans" law. It was first obtained by Rayleigh in 1900, up to a numerical mistake corrected by Jeans in 1905. The original derivation rested on an application of the equipartition theorem to the stationary modes of a cavity. However, Rayleigh and Jeans did not believe in the validity of this theorem for large frequencies. See Kuhn 1978.

[98] Einstein 1905.

[99] See Brush 1976, 356-363, and Kuhn 1978.

To be the counterpart of Boltzmann's "fiction," Planck's discrete complexions also had to be fictitious. This point was made entirely explicit in the lectures on the theory of thermal radiation of 1905-6: a complexion, Planck said, signified the attribution to each resonator of an "energy domain" delimited by the energy values ie and (i + 1)e , not of a discrete energy value ie . In conformity with this viewpoint, the fundamental notion of Planck's subsequent theory of quantization was that of "elementary domains of probability," a generalization of the energy domains in configuration space. This conception stood against the notion of discrete quantum state introduced by Einstein, raised to a fundamental postulate by Niels Bohr and adopted by most early quantum theorists.^[101]

Summary and Conclusions

The Berlin spectroscopists did not let Planck rejoice for long about his fundamental derivation of Wien's law. In the very year Planck completed his program, 1899, they began to observe systematic deviations from Wien's law in the infrared part of the blackbody spectrum. This helped Planck realize that, contrary to his earlier conviction, there were an infinite number of expressions for the resonator entropy compatible with his electromagnetic H -theorem, and thus an infinity of corresponding blackbody laws. In fact, in order for the total entropy to increase, the only constraint on the expression for the resonator entropy was that its second derivative (with respect to energy) should be negative. Then, on the basis of a new independent argument, Planck imposed an additional constraint on this derivative and recovered Wien's law, to the experimenters' great disbelief.

[100] See n. 90. The absence of a quantum discontinuity in Planck's early derivations of his blackbody law was first observed by Kuhn 1978 against a long historical tradition that asserted the contrary.

[101] Planck 1906, 135: ". . . resonators that carry a given amount of energy (better: that fall into a given 'energy domain') . . ." The notion of Elementargebiete der Wahrscheinlichkeit was systematically developed in the second edition (1913) of Planck 1906.


The new argument was wrong, and Planck publicly withdrew it in October 1900, after the experimental violations of Wien's law had become more obvious. In the same communication he proposed an alternative blackbody law, a happy guess based on a simple modification of the expression for the second derivative of the resonator entropy corresponding to Wien's law. The new blackbody law immediately proved to fit empirical data quite well, and Planck started to think about a more fundamental derivation. This led him to consider the relation between entropy and probability which Boltzmann had introduced in 1877.

According to the relevant memoir of Boltzmann, in a dilute gas the equilibrium distribution of velocities—that is, Maxwell's distribution—was also the most "probable"; and the entropy (or the function -H ) was given by the logarithm of the (unnormalized) "probability." Calling (according to modern terminology) the exact microscopic configuration of the molecular model a microstate, and the distribution of velocities a macrostate, Boltzmann's (unnormalized) "probability" was defined as the number of microstates compatible with a given macrostate. Of course, this definition has problems since there is a continuous infinity of microstates corresponding to every macrostate. To solve this difficulty, Boltzmann divided up the configuration space of a molecule into cells and regarded all configurations belonging to a given cell as one single configuration. For instance, in a simple model for which the configuration of a molecule is completely determined by its energy, the energy axis is cut up into equal intervals or energy elements, and a microstate is obtained by assigning to each molecule one of these intervals. Boltzmann's subsequent calculations required the energy elements to be finite (so that the number of molecules in an energy interval could be very large) but small enough not to blur the definition of macrostates, to which the quantities of physical importance pertained. On this condition the energy elements disappeared from the end results; and Maxwell's distribution and the corresponding entropy were recovered. In other words, Boltzmann employed the energy elements as a mathematical artifice, for the purpose of giving a definite meaning to the "probability" of a macrostate. They did not belong to the microscopic model, nor could they enter macroscopic laws; for these could be reached independently of the relation between entropy and "probability," through the H -theorem or the ergodic hypothesis.

The relation between entropy and probability played only a minor role in Boltzmann's subsequent work. For instance, in his Gastheorie it appeared only as a "mathematical illustration" of the expression for the H-function. Boltzmann (rightly) believed that derivations of thermodynamic quantities and laws through the H-theorem or through the ensemble technique were more fundamental. In 1900 Planck faced a different situation: his electromagnetic H-theorem had proved useless in determining the entropy of a resonator, so that the relation between entropy and probability, far from being superfluous, seemed to be the only available access to the blackbody law. Planck accepted the relation but not its original context, which was a probabilistic interpretation of the irreversibility theorem. Instead he reinterpreted Boltzmann's "probability" as a quantitative measure of elementary disorder, a notion that was at the core of his (Planck's) non-probabilistic conception of irreversibility. Such reinterpretation also had a practical advantage: it provided some guidance about how to extend the analogy between gas theory and radiation theory.

Planck first discussed the type of disorder to be found in a resonator, knowledge gleaned from the requirements of derivation of the electromagnetic H-theorem. In this way he determined what played the roles of microstates and macrostates, as the states of the system respectively in the detailed and the physical levels of description. Next, following Boltzmann, he introduced finite energy elements in order to obtain a definite value for the "probability," that is, the number of microstates in a given macrostate. The logarithm of this "probability" gave him the entropy of a resonator, which leads to Planck's new blackbody law, provided that the energy elements can be taken to be proportional to the frequency of the resonator.

Contrary to Boltzmann's case the energy elements now appeared in the final thermodynamic expressions. Planck attributed this peculiarity to a difference in the type of disorder. Indeed, his understanding of the disorder in a resonator led to a notion of macrostate (characterized by the total energy of an ensemble of resonators) that was insensitive to the introduction of energy elements; therefore, Boltzmann's condition that the energy elements should be small enough not to blur the definition of macrostates had no counterpart in Planck's case, and nothing seemed to forbid the appearance of the energy element in the final entropy formula.

In this situation Planck had no reason to question the continuity of the resonator energy. Moreover, such a step would have contradicted, among other things, his derivation of the "fundamental equation," which was necessary for his proof of the blackbody law. In his mind the energy elements were something like the gauge of elementary disorder; they therefore pertained to the indeterminate internal structure of resonators, and they did not contradict his electrodynamic reasonings, which were independent of this structure. In short, Planck relaxed Boltzmann's connection between microworld and macroworld by leaving part of the micromodel indeterminate. This allowed him to maintain strict irreversibility in the macroworld, by adjusting the indeterminate part of the micromodel (introduction of elementary disorder). In turn, this adjustment permitted his derivation of the blackbody law, without contradicting the determinate part of the micromodel.

As is well known, a few years ago Thomas Kuhn published an in-depth study of blackbody theory at the turn of the century. I will briefly indicate how my account may differ from his. Kuhn concludes, as I do, that Planck did not restrict the energy of his resonators to discontinuous values. His reasoning may be summarized as follows: Boltzmann introduced finite energy elements with no intention of jettisoning the continuity of molecular dynamics; Planck reached his expression for the resonator entropy working in close analogy with Boltzmann's method; therefore, despite some delusive formal manipulations, he did not quantize the energy of the resonators. As convincing as it might be, this argument does not say why Planck did not feel compelled, within the framework of his own thermodynamics, to imitate Boltzmann's procedure even more closely, which would have led to an absurd blackbody law (the so-called Rayleigh-Jeans law). My explanation for this rests on the idiosyncratic nature of Planck's conception of the microscopic foundations of thermodynamics. Kuhn describes Planck's conversion to Boltzmann's views and methods as quasi-complete (as starting with the introduction of "natural radiation"). In fact, as Allan Needell first demonstrated, Planck did not renounce his nonstatistical conception of irreversibility until much later (around 1914). This in turn explains the role elementary disorder played in orienting Planck's use of analogies in his derivation of the blackbody law in 1900. It also explains why Planck's early readers (and a good number of later ones) found his derivation either obscure or implicitly based on an intrinsic quantization of resonators: they were wearing Boltzmann's spectacles.^[102]

During the first ten years of this century, Planck's theory of radiation, and more generally the problem of thermal radiation, became the object of critical investigations by unusually penetrating minds, among whom were two young physicists, Ehrenfest and Einstein, and the venerated H. A. Lorentz. Some of Planck's results survived: the electromagnetic H -theorem (so named by Ehrenfest) proving the spatial uniformizing effect of resonators, and the blackbody law with its characteristic energy elements and the new fundamental constant h .^[103]

[102] Kuhn 1978; Needell 1980, 1988.

[103] See Kuhn 1978; Klein 1963b, 1967; Darrigol 1988b.


However, the central concept of Planck's theory, namely his notion of elementary chaos, appeared to be untenable. According to Einstein, no coherent conception of microscopic dynamics was able to provide a strict and indefinite increase of entropy. On the contrary, microscopic disorder implied observable effects like the perpetual agitation of Brownian particles and mirrors. Within Boltzmannian orthodoxy, Planck's assumption of finite energy elements proved to be incompatible with the foundation of electrodynamics. No interpretation of the blackbody law could be given without emancipating the resonators from their classical (even secular) behavior.

In 1906 Einstein reinterpreted the formal skeleton of Planck's derivation of the blackbody law on the basis of a discrete quantization of resonators. In other words, he turned Boltzmann's "fiction" into a reality, interpreting the energy unit as the minimal amount of energy that resonators could exchange with radiation. This idea of a radical quantum discontinuity was certainly paradoxical, for no one (not even Einstein) could imagine a satisfactory mechanism of the quantum jumps. Nevertheless, it quickly led Einstein to a successful theory of specific heat. By the Solvay congress of 1911 an increasing number of specialists (but not Planck) were convinced that the energy of atomic entities could only take discrete values. The ground was ready for even sharper departures from classical theory, which Bohr soon brought with his atomic theory.^[104]

To conclude, the retrospective successes and defects of Planck's program can be largely understood as deriving from certain powerful analogies with Boltzmann's theory, these analogies being constrained by a belief in the absolute validity of the entropy principle (which was not Boltzmann's). One of these successes, the electromagnetic H -theorem, depended upon a formal analogy between the notions of natural radiation and of molecular chaos. Further, the conception of disorder bound to this analogy guided Planck in his exploitation of another analogy, that between Boltzmann's combinatorics and resonator combinatorics. The resulting derivation of the blackbody law happened to be formally meaningful, even though its conservative interpretation would not survive the quantum revolution initiated by Einstein.

[104] Einstein 1906. See Kuhn 1978, and Klein 1965.


PART A PLANCK'S RADIATION THEORY

Introduction

Chapter I Concepts of Gas Theory

Maxwell's Collision Formula

Boltzmann's Irreversible Equations

The Boltzmann Equation

The H-Theorem

The Nature of Irreversibility

Entropy and Probability

Molecular Chaos

Recurrence

Summary

Chapter II Planck's Absolute Irreversibility

Against Atoms

Blackbody Radiation

Planck's Resonators

The Resonator Equation

Summary

Chapter III On Irreversible Radiation Processes

A Polemic with Boltzmann

Natural Radiation

The Fundamental Equation

The Electromagnetic H-Theorem

The Blackbody Law

Planck Versus Boltzmann

Summary

Chapter IV The Infrared Challenge

The Second Derivative of the Resonator Entropy

A New Radiation Law

Boltzmann's Combinatorics

Quantified Chaos

Quantum Continuity

Summary and Conclusions
