3—
Multiplicative Systems in Several Variables I, II, III:
With C.J. Everett (LA-683, June 7, 1948) (LA-690, June 11, 1948) (LA-707, October 28, 1948)
These three reports with C.J. Everett form the foundation of a very large amount of work concerning cascades of particles, epidemics, and a great variety of other processes of this kind. As can be seen in the book on branching processes by Harris, quoted in the note for Chapter 1, it is the basis of several theories in probability theory. Much of the material in these reports is still unexploited and will give rise to further applications in problems in astronomy, chemistry, biology, and other fields where chain reactions are studied. (Author's note).
I
Abstract
This report is the first of a series developing the probability theory of systems of particles of many types, differing in nature, position, and velocity, which undergo transformations of type. The principal technique is that of the generating transformation G whose iterates yield the probability distributions for higher generations. The properties of G with respect to fixed points which determine criticality, convergence under iteration, and relation to first moment matrix M are obtained. Necessary and sufficient conditions for supercriticality of the system in terms of M are given. A ratio theorem is proved for supercritical systems stating that, with overwhelming probability, the distribution of progeny among types in high generations will be essentially in the ratios of the unique characteristic vector of the first moment matrix M of G.
Introduction
The following report, first of a series, deals with some of the purely mathematical theory of multiplicative systems in several variables. The scheme admits various physical interpretations. The t variables may be regarded as different kinds of elementary particles, for example the electrons, photons, mesons, etc., occurring in cosmic ray showers, or as particles of the same nature but belonging to different velocity groups.
The methods and results generalize those of a report by D. Hawkins and S. Ulam on multiplicative systems involving one type of particle. The principal results are: (1) probability distributions for higher generations are given by iteration of the generating transformation G, (2) relation of first and second moments of the k-th generation to those of the first via the Jacobian M = J(G) and Hessians of G, (3) existence of a unique fixed point x° = G(x ° ) of the unit cube in the supercritical case, (4) convergence of this cube to x° under iteration of G, (5) necessary and sufficient conditions for supercriticality in terms of the first moment matrix M, (6) existence and uniqueness of a positive characteristic vector v of M, (7) convergence of the positive sector to the v-ray under iteration of M, (8) flow of probability toward the v-ray with succeeding generations.
The next report will deal with problems arising in subcritical systems.
I—
The Generating Transformation
1. Suppose that a system of particles consists of t distinct types, and that a particle of type i upon transformation has probability pl(i;jl,...,it) of producing jl +... + jt new particles, ji of type i, i = 1,...,t. For every set of non-negative integers ji,...,jt, we have then pi(i;jl,...,jt) > 0, and for each i, j pl(i;j,...,jt) = 1.
With each i, we associate a generating function gi(xi,...,xt) =Ejpli(i; il,.. , jt)xj ... xt, which defines the probabilities of progeny at the end of one generation from one particle of type i. Hence x'= G(x), explicitly, X 1= gl(xl,,Xt) Zt= gt(xi,...,Xt)
with 0 <xi< 1, defines a generating transformation of the unit cube It of euclidean t-space into itself.* Moreover, abbreviating (1,...,1) as 1, we see that G(1) = 1, so that the point 1 is a fixed point of transformation G.
2. If, in a given generation, the generating function is f(xi,..., xt) = k q(kl,, k. t)xl ... x.kt, i.e., the coefficient q(k) is the probability in this generation of the state: ki particles of type i, i = 1,..., t, then the generating function of the next generation is f(gi,... , gt) = q(k) [-pl(l;j)xl ,.xt]. . [Zpl(t;j)xJl ..Xti]tk j j
3. If we begin with one particle of type i, then the generating functions are: for 1st generation, gi(x1,...,xt), for 2nd generation, i(g,..., gt), for 3rd generation, gi(gi(gi,...,gt),...,gt(g,..., gt)), and so on. Hence, adopting the notation Gk(x) for the k-th iterate of the transformation G(x): x' =Gk(x) : = (x1,... ,Xt) = .pk(i;j)xi ...x i = l,...,t we have that gk) (x) is the generating function for the k-th generation of progeny from one particle of type i.
II—
First moments. Jacobian
1. Let f(xl,...,xt) = jq(j)x ....x xt be the generating function for a particular generation. Then Of/Oxl = Ej q(j)jxl' x- 22 . . xt and 1O)f(=q(j0 )j Ox, ] =1 q(jj = E E q=jE,jjt) j = P(jl)jl i ' = J2=o...jtj-=o0 where P(j1) is the probability of ji particles of type 1 in this generation.
* Euclidean t-space is the set of all real t-tuples, with the distance function d(x,x') =/E(xci -x')2 . The unit cube It consists of all points x for which 0 x zi _ , i=1,...,t.
Hence we define Of/Oxl]x=l as the first moment for particles of type 1, similarly, af/dxj]x=1 as the first moment of particles of type j.
2. We adopt notation Og k)/xj = m () as the first moment of particles of type j in the k-th generation of progeny from one particle of type i.
3. Recall that, for a transformation G(x) : gi(xl,...,xt), the Jacobian matrix is J(G) = [agi/Oxj] (i-row, j-column) and if H(x) is a second transformation: hi(xl,...,Xt), then the Jacobian of the composite transformation H(g(x)) : hi(gl,...,gt) is J(H(G))] = J(H)]G(x) J(G)]x, since &hi(gl,.. .,gt)/Dxk-=hjdhi//Xj]G· •gj/g Xk]x-
4. It follows, since Gk = G(Gk- l) that J(Gk)_ = J(G)Gk-IJ(G)Gk-2 ... J(G)G J(G)x, and for a fixed point,*x = G(x),J(Gk)t = (J(G)0)k, i.e., the Jacobian of the k-th iterate of G at a fixed point is the k-th power of the Jacobian matrix of G. In particular, since 1 = G(1), we have J(Gk) 1=(J(G))k.
5. Now J(Gk)1[ag\ k)/xj][m()], so that the relation [m()] = [m\l)] exists between the first moments of the k-th generation and those of the first.
6. Since (S-1AS)k = S-1AkS for matrices, we have [mi)] = S(S-lMkS)S-1 = S(S-IMS)kS-1 where S-1MS is the canonical form of M = [m(l). This permits more rapid computation of mri.
III—
Second moments. Hessians
1. Let f(xl,...,xt) again be the generating function of a particular generation. Then 2 fl/x2 = q(j,. . ,j,)j(j - 1)X-2Xj2 . . . t jl>2 so that
*A fixed point x = (xi,...,Xt) of G is one such that xi = gi(xl,...,Xt), i = 1,...,t. Clearly, x = G(x) implies x = G(x) = G2(x) = ... so that a fixed point of G is also a fixed point of all iterates Gk of G.
02f/OX211 =Z q(j)jl(ji - 1) + qj)jil(i - 1) = j >2 jl>1yq (j)j2 =q(j)jl , JI>i i ql>l ji?1 jiŽ1 where, just as in II, the former sum is the second moment of particles of type 1 in this generation.
2. Letting t(k) represent the second moment of particles of type j in the k-th generation of progeny from one particle of type i, we have Q2(k)/xj2]=-m
3. Recall that, for a single function g(xl,..., xt), the Hessian is defined as the matrix H(g) = [02g/QxiOxj], (i-row, j-column). If S(x) is a transformation: si(xi,.., Xt), then for the function g(S) = g(si,.., st), we have 9g(S)/9xi =Ej 9g/xj]S· - sj/0xi and hence 02g(S)/OxiOxk = y(92 g/0qXj,Xr) s sr/0k ' QSj/OXi + Z (09g/Xj)S02 Sj/0XiOXk . j,r j In matrix notation, this reads * H(g(S)) = J' (S) H(g)s J(S) + (0Og/0xj)sH(sj) J
4. In particular, if we set g = gn(xi, . ,xt)andS = G- l, we have H(g,,(Gk-)) = H(g(k)) = [02g(k)/Oxixk] = 'r (G1) H(gn)Gk-1 J(G -) +- 9Ogn/0Xj, H(gjl ) . Ji If we write Hk) =H (gn)) and J =J(G)i=[mij], we have
1 H2) = (Jk+1)7H(1)jk-'m, ..... (Jk-l)TH(l)Jk·-l + m (J-2)TH(i)Jk2 + Zmnjlmj,j(J3 ) ji jlJ2 H()Jk-3 + + . mnlm332 ... mjk-2jk j-H "ji2 ,k-k-Ijl ,•••,jk-I *If A = [aij], AT denotes the transposed matrix [aji].
If we define mij as the element in the i,j position of M we obtain H ( k) = (Jk-1)H Jk-1 + Jnj(Jk-2 )rHjJk-2 + Z(J2 )j (jk-3)T 3 J HjJk-3+ + Z(Jk-l)jH * 5. Since Hn)= [09g()/oxiOXk] r(kc) _(k) t(n) - mm1* (k) (k) L * t't - m)nt the preceding result gives the second moments t(k) for the k-th generation in terms of the first and second partial derivatives associated with the first generation.
IV—
Fixed Points of the Transformation x[prime] = G(x)
1. We study properties of the transformation x' = G(x): x = gi(xi, . . . t) = Zjpi(i; ji ... jt)xlx22 ... xt, on the unit cube It. We write 0 = (0,...,0), and as before, 1 = (1,...,1), x = (x1 ,...,xt).
2. We write x = (xl,...,xt) < x' = (x1 ,...,x') in case xi <xi,i 1,... ,t. Then x -< x' implies G(x)< G(x'), since, for each i, gi(x1, . ..,t) < gi(x1,...,xt)<gi(x1,x2,...,xt) < gi(x1,...,xt). The individual inequalities hold because gi has all coefficients non-negative, and hence is monotone non-decreasing in each variable separately.
3. We recall that a fixed point x= G(x) of G is a fixed point of all iterates Gk(:), and hence that Gk(1) = 1: 1 = gik)(1,1,...,1) all k>O, i= ,...,t.
4. Since gi(0,...,0) = pi(i; 0,..., 0), we have 0 < G(O), and so, from 0 -< x < 1 follows 0 < G(0) - G(x) < G(1) = 1. Hence G(It) cIt, G and all its iterates being transformations of the unit cube into itself.
*Note that in case t = 1, this reads gk = (g')2k-29g" + .. + (g,)k-19g" at x = 1, a relation obtained in Chapter I.3
5. If x° = limGk(x) then x° is a fixed point of G. For x°= limGk(x) = limG(Gk-l(x)) = G(limGk-l(x)) = G(x°). Thus all limit points under iteration of G are fixed points.
6. The set of all fixed points of G(x) is closed, i.e., a limit x° of a sequence of fixed points x"= G(xv) is itself a fixed point. For we have x° = limx = limG(xV) = G(lim x") = G(x°).
7. Since 0 < 1, we have 0 - G(O) < G2 (0) < ... - G(1) = 1, and for each i, g(k)(0) = pk(i;0) is monotone non-decreasing and bounded above by 1. Hence exists limg(k)(O) = limpk(i;0)= xi ° < 1. It follows that limGk(0) = x° = (x°...,x°) exists and x -< 1. Moreover, from 6, x°= G(x° ) is a fixed point.
But g(k)(0) = pk(i;O) is the probability of death in the k-th generation of progeny from one i-particle, and x° is the limit approached from below by this probability. For this reason we speak of x° as the death fixed point of G, and say i-progeny form a supercritical system in case xi< 1, subcritical if xi= 1.
8. From 0 -< x - x° , follows Gk(0) < Gk(x) -<Gk(x° ) = x° and since limGk(O) = x° , also limGk(x) = x° for all x in the indicated range.
9. If the death fixed point x° is 1, then limGk(x) = 1 for all x in It. This follows from 8.
10. If 0 -x -< I and limGk(x) exists, then x° - limGk(x) < 1. For Gk(0) - Gk(x)- Gk(1) = 1 and x° = limGk(0) < limGk(x) -< 1.
11. If x is a fixed point of G in It, then x°-< - 1. For x = Gk(x) for all k, so lim Gk(x) exists, and we use 10.
12. If O -<x = (Xl,...,Xt)x' = (X,...,X) < 1, i.e., xi < x, i = 1,... , t and Xj x x for at least one j, then gi(x) < gi(x') for all i = 1,..., t. For, in the inequalities of 2, we have gi (xl, , xjl, , xj+1, ,Xt) < gi(x', . . .,z 1x,zXj+ ..,zt), since by the law of the mean their difference is Ogi (x',.. x'j_l, j,j+, ..,xt)/xj. (xj- Xj) where 0 < Xj < Ej < Xj. *
* In this paragraph and in the remainder of the paper we assume 3gi/axj> 0 for all i,j and all x : 0 of It.
13. If x = (xl,...,xt) I x = (xi,...,Xt) =G(x), the latter being a fixed point, then gi(x) < gi(x) = xi, i = 1,. ., t, as a consequence of 12. In particular, if the death fixed point x° is not 1 = (1,...,1), that is, if a least one xj < 1, we have xi= gi(x ° ) < gi(1) = 1 for all i. This means that x°=limpk(i; 0) is 1 if and only if it is for every i, so that we may speak unambiguously of a system as subcritical (all x i= 1) or as supercritical (all x9 < 1), regardless of which type of particle is considered as ancestor of the process.
14. We have seen that x° -< 1 and that any other fixed point x of It must satisfy x° -< x -< 1. Hence, for G subcritical, x°= 1 and there is only one fixed point. Moreover, if G is supercritical (x°< 1) and x is an additional fixed point, x° ± x -1, we have xz < xi < 1, for all i, from 13. We shall see that no such third fixed point can exist.
15. If 0 ¢ x° then 0 < gi(0) < gi(x° ) = xi. Thus, if the death fixed point is not 0, all of its components are positive.
16. Suppose x, x are chosen in It with xi 7 xi,i = 1,..., t. Define Fi(t) = gi(x + (x - x)t) on 0 <t< 1. Taylor's expansion gives Fi(1)Fi(O) + F'(O) + 1/2F"(Oi), 0 < 0i < 1. Now Fi(1) = gi(x), Fi(0) = gi (), F'(t) = Ej (Ogi/Oxj) + (x - X)t(xj-) , Fi(o) =E (g/ilaxj)(xj- j) Fi(t) = E(02 gi/OxjOxk)±+(x-_)t(Xj - x)(Xk - Xk) , and F"'(9) = y(0 2 g9i/OXjxk)x+(x-x)i (xj - Xj)(Xk - Xk) Hence gi(x) = gi(x) +Ej(Ogi/xj) (zx - xj) + 1/2 Z (02gi/Oxk0xk)V() (xj - Xj)(Xk - Xk) where !) = Xj + (Xj - Xj)t, 0 < t < 1 In case Xj < Xj for all j, we have Xj < i < Xj, and if xi < Xj for all j, then xj < < < .
17. We show now* that, in the supercritical case (x0 1) the transformation G has no fixed point in It other than x° and 1. We have
* In this section and in the remainder of the paper we assume that for each gi(x) at least one of the 02 gi/0qxjxk does not vanish on the range x° -< xi -< 1. This is the last restriction we place on G.
already seen that if such a point x exists we have for all of its components the relation x° < xi < 1. In 16, let x be the hypothetical third fixed point, and set x = 1. Then 1 = i(1) = xi + E (&gi/9xj),(1 - xj) + 1/2 E (02gi/OxjOXk)(i) (1- xj)(l - Xk) where < < < <1. Hence 1 - xi > E ( /gi/xj),(1 - xj) .........(A)
Now, in 16, let x again be the third fixed point, but set x = x° . Then we have x° = gi(x° ) = xi +E(0gi/0xj)_(x° - Xj) + 0-(!i) 1/2 Z (02i/xjxk) .(i)(xo - tj)(X -Xk) where xj<< j < 1. Hence xo - Xi > E (0gi/0x j)i(x ° - j), or xi - X < (gi/0xj).(j -xo) ........ (B).
Now set ui = 1- xi > 0, vi = xi - xi > 0, Pij = (Ogi/0xj)x > 0 We have then, for every i, (A') ui > EpijUj > 0 (B') 0 < vi <Epijvj (B") 1/vi > 1/ Epijvj > 0 (C) ui/vi> Zpijuj/Zpijvj > 0.
Now define um/Vm = min {ul/vl,..., ut/vt} so that for every j, jl/vj>u,/vm and hence Uj>(u,/vm)vij. Hence in (C) with i = m, Um/Vm > (EPmjUj)/(EPmjr j) > (Z-Pmj j(urm/Iv,)) / (Epm vj) = Um/Vm, a contradiction.
V—
On the lim[sub(k)]p[sub(k)] (i;j)
1. We recall that Gk(x) is denoted by X'i = gi ) , Xt) = pk(i; 0) + Epk(i;jl, ,jt)x ..' Xt, jfo where limpk(i; 0) = limg)(0) x i.
2. In the subcritical case, x° = 1, we have 9)(1) -pk(i; O) = 1- pk(i; 0) = Zpk(i;jl,...,jt), jfo and hence limk ZjAo Pk(i; j) = 0.
3. In case 0 x° 1, we have seen that 0 < x i < 1 for all i. Hence, let m = min {x, . . ., x} where 0 < C < 1. Then gi()()) - pkki;0)=) -= pk(i;j) (XI)L (Xt) j3o and hence limk 1Ej_oPk(i;) (x°)i1 .. (x °)j = 0. For every integer Uo> 1, we have* -Pk(i;j)(Xl°)j l ... (X)j > E Pk(i; j)(X)j . ..(XO)J > jo 1<aU(j)<a Z Pk(i;j)f(j i)> 3 Pk(i;j)k since 0 < < 1 and r(j)< r l<a(j)<Oa l<a(j)<<a on the range of summation. It follows that linm E pk(i; j)W = 0 and hence 1<o(j)<a likm pk(i;j) = since e0. 1<a(j) a
4. In the only remaining case x°= 0, we have Gk(0) = 0 and so gi()(0) = pk(i; O) = 0 and thus gk)(xi,...,xt) = EjoPk(i;j)xl ..j t. We will show first the existence of a point x such that G(x) -< x 1. Since G(x) has for its component functions the gi indicated above, we have Ogi/dxi = Epi(i;j)jixl-'xlz2 . ..xand so(gi/Oxi)o = pi(i; j0,. ..,0). Thus generally,(Ogi/0xj)o=pi(i; O,. . . , ).
Letting x = 1 in G(x), 1 = Ejo Pi (i;)>Ej=Pi (i; 0, . ., 1 , 0) = (9gi /lxj,)o Moreover, if 1 = (Ogi/Oxj)o, then all other pl(i;j) = 0 whence gi(x) = E t=Pi(i; O,..., ,..., 0)xj. But then (02 gi/&xjaxk) 0 for all x of It and all j, k, thus violating our assumption on second partials. Hence Ej(Qgi/Oxj)o < I for each i.
Define Si(x1 ,...,Xt) = E(9gi/Oxj)x on It. Then Si(O) < 1, hence for each i there exists a 0i, 0 < ~i < 1, such that, for all x satisfying 0 <xj< (i we have Si(x) < 1 (continuity of S). * We use the notation a(j) = jl +... + jt, for j = (jl,...,jt).
Let U = min(f1,...,t}j , 0 < C < 1. Then for all x such that 0 <xj<< 1, we have Si(x) < 1 for all i. Now the Taylor form for gi(x) about x = 0 may be written gi(x) = 9i(O)+j(0gi/axj)txj = 0+E-(0gi/'xj),(i)xj, with 0 < x < x. Let x = (i,...,) where 5 is that above. Then gi(<,...,0) = ( /gi/)xj) (i) < since 0 < xj <~. Hence for x= (C,..., ) we have G(x) -< x ~ 1. It follows that 0 < ... < Gk(x)-Gk-1(x) - ...-G(x)-<x 1 so limGk(x) ~ 1 exists, and is a fixed point of G. Since there is only one fixed point (x°= 0) other than 1, we have lim G(x) = 0.
For x= (,...,) then, lim g(k) () = limjoPk(i; j)(i) = Again fix a as any positive integer, and we have -Epk(i;j)e()>pk(i;ij)>(j)> pkP(i;j)". jO 1l<a(j)<al<o(j)<a So limk Zpk(i; j)l = 0 and thus limppk(i; j) = 0, where 1 <a(j)< a.
5. In summary, we have shown in this section that in all cases, for every a and every i lirm pk(i;j)= 0 k 1<a(j)<i and hence trivially, limkpk(i; jl... ,jt) = 0 for all i and all j 5 0.
VI—
On lim G[superscript(k)](x) = x[superscript(0)]
1. In case x° = 1, we have already seen that, for all x in It, lim Gk(x) = x°. We prove in this section that, in the remaining case, x° ¢ 1, lim Gk (x) = x° for all x / 1 of It. Indeed, we already know this when 0 - x -< x°.
2. We shall use the following Lemma. Let 0 <xi < 1 for all i, and p = max{x1 ,...,xt}, 0 < p < 1. Then for every e > 0 there exists a a > 0 such thatxil . . xt < e. 0-(j)>47
Analogies between Analogies
Proof. For arbitrary a > 0 we have (Margenau, Murphy p. 417) 00oo oo00 Xjl ... tjt _< E t a(j) -'o(j)= E tCm-lm o(j)>a a(j)>r m=a+l a(j)=m m=ar+ The series :=o Ct+m-l m converges by the ratio test: Ctmnm+l l/C't+m- Im = (t + m/m + 1) /-/< 1
3. Let x°0 1 and fix x in It, with 0 < xi< I for all i. Let l = max {ix,...,xt } and write g((x l...,Xt)-pk(i;0) = Ppk(i;j)x' ...xjt. Fix e > Oj30 Then by 2, there exists a a such that i x... xit < e/2. a(j)>a For this a, there exists a K such that, for all k>K, pk (i;j) < e/2. l1<a_(j)<a Hence, for all k > K, k) (Xl, . .,Xt) -pki)=Pk(i;j)x. ..x+pk(i;j)x .... t l<a(j)<ra a(j)>a < S pk(i;j) + E x' ....xtt < e/2 +e/2 e .1<oa(j)a a(o(j)>a Thus for every x such that 0 < xi < 1, all i, we have 0 = lim (gk)(x) - pk(i; 0)) since limpk(i; 0) = x i
4. Now suppose 0 - x 1. Then gi(x) < gi(1) = 1, for all i, so G(x) is a point of the type considered in 3. Hence x° = limGk(G(x)) = lim Gk+l(x) = lim Gk(x). This completes the proof that, in all cases, every point x7 1 of It has the lim Gk(x)=
VII—
Supercriticality Conditions
We give now necessary and sufficient conditions for supercriticality of the transformation G(x), namely that limGk(O) = x° I1 . By definition, mij = mi') = (Ogi/axj)i. Since we have assumed all (&gi/Oxj) non-vanishing for x :$ 0 on I, clearly all mij > 0.
Theorem. The following conditions on G(x) are equivalent: (a) limGk(O) = x ° 1. (b) there exists an x such that 0 -< G(x) -< x 1, where gi(x) < ii, 0 < -i < 1 for all i. (c) there exists an x such that 0 -< G(x) -<x1, where 0 <Xi < I for all i. (d) there exists an i such that 0 -< G() -< x 1. (e) there exists an x such that _ mij(1 - Xj) > 1 - ji, where 0 < xi < 1, for all i. (f) there exists a 5 = (1i,..., St) such that E mijbj > 6i, where bi > 0 for all i. (g) the matrix [mij - ij]t = J(G(1)) -I contains at least one upper principal minor A, = [mij - bij] (1 < v <t) such that (-l1)v+ I[AI> 0 where (-l)"+l · IA, > 0 in case v = t.* (h) there exists a v = (vl,...,vt) such that Evimij > vj, where vj > 0 for all j. (i) there exists a characteristic root r > 1 of M = [mij] and a corresponding characteristic vector v = (v, .. ., vt) with all vi > 0, that is, Zvimij = rvj for such r and v.
In matrix notation m ii .l * mit (vi,..., vt) · =(rv, ..., rVt) mtl mtt or briefly, vM = rv.
* [aij]~ indicates a matrix whose indices i,j range from m to n. 6j is the Kronecker delta, 1 or 0 as i is or is not equal to j. I is the identity matrix [6ij]. IAI indicates determinant of A.
1. The proof consists of the following implications: a e /\\/\d (c f-*g(- h-(i. \ / b The theorem has been amplified to permit proof in easy stages, but the essential content is the reduction to matrix theory in J(G)1 of the question of criticality of the (non-linear) transformation G(x).
2. (a-c) If x° 1, then 0 -G(x° ) = x° 1, 0 <x° < 1, all i, so that x° serves as x in (c).
3.(c> d) is of course trivial.
4. (d-a) If 0 -< G(x) - x 1, then Gk(0)-Gk(2) -< Gk-l(x) < ... -< G(.) < .l1, hence x° = limGk(O) __< 1.
5.(c —e) If 0 - G(x) -< a 1, 0 <xi < 1, all i, we have xi > gi(2) = gi(1) + - mij(xj - 1) + 1/2 E(igi/Ixj&xk)((i)(xj - 1)(xk - 1) where Xj <((i) < 1 for all j. Hence xi >1 -+ E mij(Xj - 1) and 1 - xi < Z^mij(1 - ;j) with 0 <Xj < 1, all j.
6. (e -- f) is trivial, since we may define (i = 1 - xi > 0 in terms of the x given in e.
7. (f — b) Suppose t=' mij6j > 6i > 0, i = . .., t Since mij = (Ogi/9xj)l and Ogi/Oxj is continuous, there exists a 5, 0 < 1 < 1, such that (O9gi/Oxj)S6j > 6i, all i, whenever <Xj < 1, all j. Define p = min {(I1- )/Si;i j = 1,..., t} > 0. Then _(Ogi/9xj) p6 j > p6i all i, wherever I<Xj< 1, all j. Now define i = 1- Sip; i = 1,...,t. Then E(Ogi/Oxj)=(1 - Xj) > (1 - i) all i, whenever I<j< 1, all j. Since 0 < 1 - xi = 6iP< 1 (because of definition of p), we have 0 < < xi < 1, all i. Now define x = (1t,...,t) where, as we have seen, 0 < xi < 1.Then 1 = gi(1) = gi(x) + E(Ogi/0xj)(i)(1- xj)/x(i) where j < xj < 1, all i,j. x i < all i, j. K Since j< Xy < < , we have (O9gi/Oxj)=(i)(1 -j) > 1 - Zi and 1 > gi(x) + (1 - Xi). Hence gi(x) < xi, all i, and 0 -< G(x) -< x1 for an x of the type required by (b).
8. (b , c) is trivial.
9. For the equivalence of (f) and (g) we need the following:
Lemma. Let A = [aij]l be a matrix in which aij > 0 for all i $ j. Then there exists a point 6 = (6 1,...,6 t), with all 6i > 0, such that Eaij6j > 0, all i, if and only if, for some v, (1 <v < t) the upper principal minor A, = [aij]} satisfies the relation (-1)"+1A,Il> O, with (-1)V+1. IAVI > 0 in case v = t.
Proof. We first prove E aijSj > 0 implies the condition on principal minors. Proof is by induction on the order t. If t = 1, we are given that a116 1 > 0 for 61 > 0 so that all > 0, and (-1)2 1A1 1 > 0 as required. Assume the statement of the theorem for order t - 1, and suppose given j= 2aijSj > 0, i = 1,..., t, t > 2, all 6i > 0, aij > 0 for i 4 j. If all > 0, in the given system, we have at once (-1)2 All > 0 as required (since t > 2). Now suppose all < 0. Then taljSi > -a1161 and aii61 + -t 2aijSj > 0, i = 2,..., t. Since -all > 0 and ail > 0 (i = 2,...,t) we multiply to obtain t ailalj6j > ail(-aii)S1 and (-all)ai6lS + 2 aij (-all)6j > O, i = 2, ,t. .,tHence J (-a11) aij6j+ -ailaib6=Et(-allaij+ ailai)6 > 0; i = 2, ...,t.
Since, for i $ j,aij > 0, and for i, j =2,...,t, ail > 0 and aij > 0 we have also (-allaij+ ailaij) > 0 for i 5 j. By the induction assumption on systems of order t - I there exists a minor of order v: AU = [-allaij + ailalj]2+l such that (-1)"+lA,Il > 0 with (-1)"+1 IAIl > 0 if v = t- 1. But for the original matrix we have lA+l = laijl+l = I...alj... = l/(-all)v ...-alj...a . . . . allai... l/(_ 1)'. i'alja( - alaij + ailaij ...) 1/(-ail) =ai/-a alla12 a) 1.alt
Hence (-l1)+2. IA,+ =(-1)v+2ali/(- all)"IAv = (-1)"+l/( - all)-11A,l> 0 and > 0 in case v + = t (i.e., v = t - 1).
We now prove the converse, namely, the condition on minors implies the existence of a 6 such that E aijSj > 0. Proof is again by induction on order t.
In case t = 1, we are given that (-1)2 1All = all > 0, so for (say) =1 > 0 we have allb6 > 0.
Assume the theorem true for order t - 1, and suppose given A = [aij] with t> 2, having for a given v(1 < v < t) (-1)v+1 IA. > 0 (> 0 if v + t).
If all> 0, let 62 = ...= t = I and choose 61 > 0 so that 61 > max { - (1/al) t2aij; i = 2,..., t}. Then aijj = all6l+ E2 ali> 0 + 2 aij > 0 since alj > 0 for j $ 1. Also, for i = 2,...,t, aij = al61= Et aij> - t aij aij= 0, by definition of 61. Thus the conclusion follows in case all = (_1)2. IA}l > 0.
Suppose then that all < 0. Then the given A, must have 2 < v< t. Now 0<(-1)v+l-A, = (-1)-+1aijv = (-1)V+1 . aij ... .. aij . ... (-l)v+ /(-aii)-. aj ..·.• (-_1)+1/ (-a11)v . (-allaij) ... ... a 13=(-1)v+l/(-al1)V-lallal2 . ait .. (-allaij + ailalj) l0 _l-1 (-1)"/(--all)-2·IAI-. Hence (-1)>"A,_1 > 0 for 1 <v-1<t -1 (> O if v-1=t-1) for the system of order t - 1:
A = [-allaij + ailaij]2 in which -allaij + ailalj > 0 for i 4 j just as before. By the induction hypothesis, 6i > 0, i = 2,... , t exist such that E ( - allaij+ aaiaj)6 j> 0, i 2,...,t. Hence ail ( aj6j/ - all) + t aij6j > 0, i = 2,..., t. Let 61 represent the parenthesis of the last inequality. Then el > 0, since aij > 0 for j $ 1, and substituting, ailel + aij6j> 0,i = 2,..., t.
Choose T so that 0 < T < el, and T < (1/ail)(ailel + 12 aijsj) for all i = 2,..., t. Finally, define 61 = e1 - T > 0. Then Etaij 6 j a11 61+ '2 alj6j = allel + 2- alj6+ (-all)T =(-ail)T> 0, and for i = 2,..., t,t aij0j = ai161+ aij6j = ailel + aijj - ailT > 0 because of the second inequality defining T. Hence the lemma is proved.
10.(f -9 g) If JEmijj > 5j > 0, all i, then Z(mij - 6ij)6j > 0, all i, and the matrix [mij- 6ij]t satisfied the condition of (g) because of the lemma.
11.(g -9 f) If the matrix [mij - bij]t satisfies the condition of (g), there exists by the lemma numbers 6i > 0 such that (mij - bij)6j > 0 and hence E nmijSj > 6i, all i.
12. (g --h) If [mij - 6j] has the property (-1)'+1 \IAI > 0 (> 0 if v = t) for some v, then the matrix [mij - 6ij]T = [mij - 6ij] has* (-1)+l' . IAI = (-1)>+ 1 . IA i > 0 (> 0 if v = t). Write mij = mji; then [mij - 6 j] has (-1)^+ · IlAj > 0 (> 0 if v = t). But we have seen that (g -+f) so that there exist vi > 0 such that Emijvj > vi. Hence _mjivj > vi, or changing notation, Zvimij > vj, all j.
13.(h -* g) If E vimij > vj > 0, all j, then E mjivj > v > 0, and, defining mij = mji, Emijvj> vi > 0. But we know that (f -+ g) so the matrix [imj - 6ij] has the property (-1)^+l · nij - —ijl _> 0 (> 0 if = t). But then (-1)v+l(mij - iji = (-1)V+l [mi - 6jil = (-1)V+l -·mij- 6ij\l> 0 (> 0 if v=t).
14. (i - h) If vimij = rvj for vj > 0, r > 1, we have vimi > v3, hence (h).
15.(ih i) Assume Ei=l vimij > vj > O. Since (Evimij)/lv > 1,
j = 1,...,t, we define ro so that ( vimij)/vj > ro> 1, j = 1,...,t. Then vimij> rovj, all j. Write _ Vimij = rovj + Ej.
* Recall that IAl = IAI .
Define llvll = v2 > 0. Then E(vi/2 llvll)mi j = ro(vj/2. lvIl) + (ej/211vll ). Define vO= (vj/2 11vll) and ej = (e/211vll). We have now v?°mi = rov + ej, where ej > 0, vO > 0, all j, and Ilv°ll = 1/2. Define 1 = min l/2v°;mij/ /mij; all i,j= 1,...,t >0. Let A = {v; vj>t, llvll < 1, E vimij>rovj, all j}.
(A) The set A is non-void for v° E A and is a subset of euclidean t-space Et. For v.O> 2i > p by definition of u,Iv°0 l = 1/2 < 1, and v°mij > rov.
(B) The set A is bounded, since \lvll< I for all v e A.
(C) The set A is closed. For, let v = limv" where vaeA. We have to show v e A. Since vja > , all a, j, we have vj = lima vj > /p. Also, Ilvi = IIlimaaill = ([limvall < 1, since 11 I is continuous and all iva 11 < 1. Finally, vimij = E (limava) ij = lima Vmij> limara rv= ro lim a v=rovj Hence v E A.
(D) The set A is convex. That is, for every v,v' C A, and every a, a'> 0 with a+a' = 1, we have av+a'v' E A. For avj + a'v' > au+a' = (a + a')p = . Also llav + a'v'll<a lvl +'- lIv'il <a + a' = 1. And (avi + a'vD)mij = a Zvimij +a'_vimij>arovj + a'rovj = ro(avj + a'v).
(E) The set A has an inner point, e.g., v° . For v° E A there exists a neighborhood of v° contained in A. Let e = min vj- ; ej (ro + mij); j= l,...,t. .
Since vj > 2, > -, vO - > 0 and e > 0. Now let v be any point of the e- neighborhood of v° , that is, lIv - v°ll < e. We show that v e A. For Ivj -v°\< Z/ (vj - = v) 11 - lv- < e, so vj > vj, -e > /, by choice of e, and thus v satisfies the first requirement for membership in A.
Next, Illvll-IIvOlll<liv-v°0 1 < e, so llvll < e+llvll- = 1/2+e < 1. The latter inequality obtains since e < v -/ < vj< Z(v ) < v = ilv°ll = 1/2. Hence v satisfies the second requirement.
Finally, E vimj>E (v - e) mij = rov + ej- e Ei mij> rov. The latter inequality is true since it requires ro (v - vj)>e mij - ej, i.e.,
o- j>(e Ei mij - j)/r . But we know v-j > -e and it suffices to prove -e>(e E mnij-ej) /ro or ej >e( i mij + ro). But e satisfies this requirement by definition.
The properties A-E show that A is a convex body in Et. It is well known, therefore, that any continuous transformation T of A into A has a fixed point T(v) = v, v E A. (Alexandroff-Hopf).
For every v E A, define T(v)= vM/llvMll, that is, T(v) = v' where V = Eivi, mij/ Ej(vimij)2 . Since for v c A, vi > > 0, and mij > 0, clearly IIvMll > 0 on A.
First we show that if v e A, then T(v) E A. For (vj)2 = (Evimij)2 /Ej(Zivimij)2 EviZ2mj/Ej(Ev2) (im) > 1/VII2 . Ei {mTjvi /Eimij }> (1/IIVI12 )2'EiVi2 = SO Vj > / as required for T(v) = v' E A.
Moreover, IIT(v)II = (IvM/IIvMlIII) = 1. Finally, EiVimjk = (1/I1vMll) Ej mjk Eivimij>(/ll|vMll) E mjkrovj = rov;, so T(v) e A, and T(A) c A. Clearly, T(v) is continuous on A. Hence, for some v e A, T(v) = v = vM/lIvM\l, and thus Vj · IIvMI = vimij>rovj, so I\vMII>ro > 1. Set r = llvMll and we have Evimij = rvj with vj > 0 and r > 1. Thus the theorem is complete.
16. Some general remarks on positive matrices are now appropriate. If [mij] is a matrix with all mij > 0, it can have at most one characteristic root r > 0 with a corresponding positive characteristic vector v, all vi > 0, that is, the equations vimij = rvj and Evjmij = r'vj; v j, vj, r, r' > 0 imply r = r'.
For, choose A, > > 0 so that A <(v'/vi)<j,i = 1,...,t. Then Avi<v i< p/i, and since vM = rv implies vMk = rkv and v'M = r'v' implies v'M = r'v', we have EAviMik<EV/m(k) < ,Virnm,k) and hence Arkvj<r'kvj < prkvj, where Mk = [m()] and clearly m( > 0. Then Avj <(r'/r)kv' < LiVj, an impossibility for high k, if the ratio r'/r is either less than or greater than one.
17. We shall make use of the following
Lemma. If B = [bij] is a matrix with bij> 0 for all i $j, and if Evibij = 0 v= Evibij for vi, v' > 0, then there exists an a > 0 such that v = avi, i = 1,...,t.
Proof. Induct on order t. Suppose t = 1. Then we have vlbll = 0 = vlbll, vl, v1> 0, so v' = avl where a = vl/vl. Assume the lemma for systems of order t - 1, and suppose given t vbij = 0 = tv'j with t > 2. Since vibil= -bll vi and bil > 0 (for i 4 1) and vi > 0, all i, we must have -b1l > 0. Note also vibij +2vib = 0, i = 2,...,t.
Hence T2vibibij = -blbiivl and vlblj(-bl) + Evi(-bli)bj = 0, j =2,...,t. Substituting, ' vi - bbj + bill) = 0, j = 2,...,t. For i 7 j, bij > 0 and - bl > 0, bil > 0, bi > 0 for i, j > 2. Thus the matrix [- bIbij+bilblj]t has the property of the lemma. In identical fashion we have also 2 v i( -b 11bij+ bilblj) = 0. Thus, by the induction hypothesis, I i v' = avj, j = 2,...,t, for some a > 0. But -bllv' i = vb il = a vibil = a(-bllvl) and thence also vi = avl.
18. It follows that, if mij > 0, all i, j, and E vimij = rvj, E v'mij= rv' for vj,v3 > 0, then v' = av. That is, a positive matrix can have at most one positive characteristic vector (up to the scalar multiples) corresponding to any one of its characteristic roots.
For, the above equations may be written vi (mij - ijr) = 0 = i v'(mij - bijr), where the non-diagonal elements of [mij - bijr] are the mij > 0.
19. If M is the moment matrix of any G, there is one and only one solution for r > 0, vi > 0 of the relation vM = rv. G is supercritical if and only if the solution r is greater than one. (See 20 for existence.)
20. The proof of property (i) for supercritical matrices was complicated by the fact that we wanted to prove r > 1. It is possible to show more easily the weaker result that a solution of the relation in 19 exists for arbitrary matrix M = [mij] with all mij > 0.
Define = min{l/(4vt); mij/ ijm} > 0, A = {v; Vi ,>L, lvll< 1} and T(v) = vM/\\vMll for v in A. Then just as in the proof of (i) one verifies A is a convex body in Et, and T(v) is a continuous transformation of A into itself. As such, there exists a v in A such that T(v) = vM/IIvMIl = v, and for r = IlvMll > 0, vM = rv. Since v E A, vi> t > 0.
21. Note that, if a(v) = vi, o is a linear functional: ou(av + a'v') =Z(avi+ a'v') = a,vi + a' E via(Vv) + a'(v') . If we define T(v) = vM/a(vM) on the sector vi > 0 (v $ 0), where M has all nij > 0, we see that a(T(v))u(vM)/a(vM) = 1 and T(v) may be regarded as the central projection of vM onto the hyperplane v 1+. . .+t = 1. Our operator T has some interest beyond that of the classical Markoff operators (see G. Birkhoff) since it need not be a contraction in the positive sector. For example, if M =[1 1 one finds that it is supercritical, has positive characteristic root r = 5, corresponding vector = (4/5,1/5) = v where o(v) = 1, and vM = rv. Hence T(t) = v. Let 6 = (1,0). Then IT(b) - T(v) I = /377/20 > 116 - vIl = /2/5. Hence T(6) is not closer to v than 6 was. Nevertheless we prove limTk(v) =v for all vectors v in the sector vi> 0 (v# 0). Our proof of this fact is lengthy but of a general topological character.
22. Let {yi} denote the simplex with vertices yl,...,y" in E, that is, the set of all points y = E aiyi , ai > 0, E ai = 1. If A is an arbitrary matrix, we have (Caiyi)A ai(yiA) so that the map under A of the simplex {yi} is the simplex with vertices yiA: {y'A= {yA)}
23. For arbitrary x with a(x) 0 define the transformation xS = x/ao(x). If the yi are all $ 0 and have all components non-negative, clearly the same properties are possessed by all points of {yi}. Hence we may operate on {yi} with S and {yi}S = {yiS}. This says that the projection onto the hyperplane vi = 1 of the simplex {yi} is the simplex with vertices yi/ai where ci - ao(yi).
For, {yi}S is the set of all aiyi/ aioi and {yiS} is that of all ,bi(yi/ai) where ai,bi> 0, and Eai = 1 = bi. We have to show that, given either the ai or the bi, the other set may be determined so that i aiyi/ E ai ri(bi/i)yi (1) Given the ai, definition of bi = aiai/ r aii implies bi> 0, E bi = 1 and validity of (1) so that {yi}S C {yiS}.
Given the bi, it suffices to find ai so that aiai = biajcj or E (biaj- bijoj)aj = O,i = 1,..., n. (2) J J
The determinant of the latter system is zero since addition of each row but the first successively to the first yields in the (1,j) position aj b bi - j =rj- aj-j = 0.
Hence a solution (al,...,an) $ (0,...,0) exists, and we see from ai/i aii =bi/ai > 0 that all ai have the same sign. Hence we can choose a solution ai> 0 for all i, with Eai > 0, and then ai/Zai fulfills all requirements.
24. If T(v) = vM/oa(vM) then Tk(v), the k-th iterate of T, is given by Tk(v) = vMkl/a(vMk). Proof is trivial by induction on k. Hence Tk(v) = (vMk)S and Tk{yi} = ({yi}Mk)S = {yiMk}S = {yiMk/a(yi'M)} ={T (yi)}
25. Define j6 = (0,..., 1,. ., 0), the j-th unit vector of t-space, and A = {6i}, Ak = Tk (A) = {Tk(5i)}. Since mij > 0, clearly T(A) c A and hence A D T(A) D T2 (A) D ... is a "nest" of simplices. Define D = QAk, the intersection of all Ak. We prove that T(D) = D. Let de e D, i.e., de = T(dl) = T2 (d2) .... Then T(do) = T2 (dl) = T3 (d2) = ... is in D and hence T(D) c D.
To obtain the other inclusion, we make use of the lemma. If {yi}1 is a simplex contained in A such that T(y) = T(y') for y 4y' in {yi}, then the linear dimension of the subspace spanned by the T(yi), i = 1,...,t is less than that of the space spanned by the yi, i = 1,..., t.
Proof. Let (say) yl,...,yd be linearly independent vertices of the simplex {yi} and the remaining yd+i dependent upon them.
Write y = Edciyi+ y'= d c'yi in {yi} C A, hence a(y) = a(y') = 1. Suppose that T(y) = ciyiM/l E ci&i = T(y') = E cyi M/ E cii where i = a(yiM). If we let C = ciai, C' = E ci then (ci/C c'/C')yiM = 0. Now not all coefficients here are zero, for if so, we should have c' = (C'/C)ci =Rci and hence y' = c'yi = R cyi = Ry. But then cr(y') = Ra(y), R = 1, and thus c i= ci, = = y', a contradiction. Hence the yiM are dependent (i = 1,...,d) and since the yd+i are dependent on the yi, i = 1,..., d so are the yd+iM dependent upon the yiM, i = 1,...,d. It follows that the space spanned by the yiM, i = 1,...,t is of lower linear dimension than d. But this space is identical with that spanned by the T(yi) = yiM/&i.
Since the dimension t of the space spanned by the 6i can suffer only a finite number of reductions, it follows that, for all k>K, for some K, T is one to one on Ak to Ak+l.
Now we prove D c T(D). Let d e D so that we have for all k, d =Tk(dk) for some dk E A. Now for all i > 0, T++l(dk+i+l) = Tk+i+2 (dk+i+2) or T(Tk+i(dk+i+l)) - T(Tk+i+(dki+2))
where the operands of T are in Ak+l C Ak. Since T is one to one on Ak, we have ek= Tk+i(dk+i+l) = Tk+i+l(dk+i+ 2) for all i. Hence d = T(ek) where ek e K +1 Akc C Q Ak = D, and thus D c T(D).
26. We now show D is itself a simplex. To this end, consider the sequences Tk(Si),i + 1,... ,t, of vertices of the Ak. There exists a common subsequence k, such that limTk(bi) = Ai all exist i = 1,...,t. We claim that the simplex {iV}t is indeed the set D.
First, it is clear that D = I Tk(A) = T Tk"(A) since the Ak are nested. Hence it suffices to prove that {Ai } = TTkv(A).
Since the Ai = limTk"(6i), Ai is a limit of a sequence of points which are in Tk"(A) = Ak, for all v > N. Since Tk(A) is closed, Ai E TkN(A). Hence Ai c ; Tkv(A). Now all Tk"(A) are convex, hence so is their intersection D. Since all Ai are in the convex set n Tk"(A) = D, and {Ai} is the convex hull of the Ai , we have {Ai } c 0 Tkv(A) = D (Alexandroff-Hopf).
We prove now nTku(A) c {Ai}. Suppose d in the intersection but not in {Ai}. Then, since {Ai} is closed, there exists an e such that the e-neighborhood of d excludes {Ai}. However, since limTkv(6i) = Ai , we can find N so that, for i-+1,..., t, I TkN(6i) - Ai < e/2. Now d C TkN(A) hence write d = EaiTkN(Si). Then |ld - Eaiill = E ai(TkN(i) Ai)ll < Eaie/2 = e/2 and thus EaiAi of {Ai} is in the e-neighborhood of d, a contradiction.
27. Since {Ai} = n Tkv(A) = n Tk(A) =D, we have that T is one to one on the simplex {Ai} to all of itself. Now {Ai} is the convex hull of the finite set of points i,. . ., At. As such it is a convex polyhedron, and the set of all its geometric vertices* is a subset of the original, say Al,...,A, and {Ail = {Ai}. We know that {Ai}l T = {Ai} {T(Ai)}I. It follows that T(A1 ), T(A2 ),..., T(An ) is a permutation of the geometric vertices A1 ,..., A, and hence for some N, TN(Ai) = Ai = AMN /u(AiMN), i = l,...,n or AiMN = T(AiMN) . A. But MN is a matrix with all elements positive. By the uniqueness results obtained previously, it follows that n = 1, and D = n Tk(A) = {A'}, where AlMN =((AMN) ·-1 . But for M itself we have vM = rv so VMN = rNv and again by uniqueness, Al = v.
Hence limTk(v) = limvMk/a(vMk) = v for all v with vi> 0, (v y 0).
* A geometric vertex is the point of the polyhedron not an inner point of any segment {x, y} of the polyhedron, P, y, x in P.
VIII—
A Theorem On Ratios
1. Let Gl,...,Gt be arbitrary generating functions, each in the variables xl,..., xt, and define G = G... Gt. Define Ma = -(aG/aXa) 1, Ta = (a2G/O 2 XC)l + Ma Mia = (&Gi/xa)l, Tia = (2 Gi/Oax)l+ Mia Then aG/Oxa = (G1/Oxa)G2...Gt+G 1(aG 2/xa)G 3 ... Gt + GiG 2. . Gt-l (aGt/OXa) and 2 G/xa =(d2 Gi/x2)G 2 .. .Gt + (aCGl/xa)(QG2/Xa)...Gt +... +(aGi/Xa) ...(aGt/, a) +... + G1(aG2/xa) ... (oGta/9xa) +. + G... (oGt/xa).
Hence Ma = Ma + M2a + ... + Mta and Ta- Ma = (Tla - Ml) + MlaM2 a +... + MiaMta +MlaM 2a +(T 2a-M 2a) + + M 2aMta + . + MlaMta +... M2aMta + . + (Tta - Mta).
Define Da = Ta - M2 and Dia = Tia - M2 as the various dispersions for particles of type a. Then Da = EZiMia + Z= 1(Tia Mia) + Ei=j MiaMia = (Z i Mia)2= Ei Tia - i Ma = Zi Dia.
2. Write G(x)=P(ji,. ..,jt)x ...x . Then (T-M M2) +.+(Tt-Mt2) = EDa=Dia > P(j E S)K2 *i,a * P(j e S) means the probability that j e S, i.e., , P(jl,...,jt) over all j = (j, ...,jt)S.
where S denotes the set of all j = (jl, . , Jt) such that /(jil -Mi1)2+... + (jt -Mt)2>K . Hence P(j c S)<Eia Dia/K2 .
3. We remark that, if H(x) = hn(x), where h(x) and hence H(x) is a generating function of xl,...,xt, and as usual Ma = (OH/OXa)1, Ta = (O 2 H/dxa) + Ma, ma = (ah/dxa)1, ta = (02 h/9xa)1 + ma, then OH/9 x2 = nh'-l (h/Oxa) and a2H/ = n [(n-l)hn-2(h/9xa)2 + hn-l(a2h/IX2)] Thus Ma = nma and Ta = Ma + n(m - 1)m2 + n(ta - ma). Hence Da = Ta- M2 = n(ta - m2 ) = nda.
4. Now let G = (gk), i= , . ..,tfor a fixed k, and G = (gk)) .. (gtk))t for a fixed n =(nl,...,nt) 4 O. Then with reference to this G, P(j C S)< E Dia/K2 where S = {j; v/(i-M 1)'+c ...+- (-Mj) >K} . But now Ma =- nlm(k) + n2( + . .. +nt(k) and Dia = nldk) where nltla + '2n2a - 'Jtttta- --telUiad(k)=t(k)-([mk)) . Thus P (j G S)<nid )/K 2 , where iaiaiai'daia S {j; j-nMkll>k}
We summarize: for every n =(nl,...,nt), k, and K > 0, in the probability distribution having generating function (k)ni (nk)n- ( i zr 9g1g9t= Pnlk (ji.· ,jt)xI...Xt it is true that Pn,k(Ilj - nMk \>K) <d(k) a(n)/K2 where d(k) =max { ad(k). -diaia But then Pn,k(ij/a(nMk) - Tk(n)fl>e)< d(k)a(n)/e2(a(nMk))2
Now o(nMk) = Enim(k)> E nim(k) = m(k)a(n) where iJ m(k) minim { mij}. Hence Pm,k( jll/((nMk) - Tk(n)l> e) < d(k)/e2(m(k))2 a(n) = e(k) /e2 U(n), where ek) = d(k)/(m())2 .
Then for every n / 0, k, e > 0, we have for 9.k)nl .9) = Pn,k (jl ., jt)Xj' ...X, that Pn,k(lIj/T(nMk) - Tk(n)l> 1/2c) <ek/1/4e26(n).
Now for arbitrary e > 0, rn > 0, there exists a k (hereafter fixed) such that lITk(n) - v\\ < 1/2e for all n 4 0. For this k, and the original e > 0, r1 > 0, we can now fix a so that for all n with a(n) > a, we have e(k)/1/4C2 r a(n) < 1/277, and hence Pn,,k(lIJ/o(nMk) - Tk(n)| > 1/2e) < 1/21. But we have the set inclusion, for each n with a(n) > a: {j; Ilj/a (nMk) - Tk(n)> 1/2e} D {i; ll/ (nMk) - v>}.
Hence u(n) > n implies Pn,k (II ji/ (nMk) - vll > e)< 1/271. For the fixed a, and the original rV, there exists an R so that for all r > R, i= 1,...,t, EO<(n)<a Pr (i;nl,... ,nt) < 1/2. Now for r > R, we have G(k+r) = Gr (Gk) so that g+r) =(r)(gk)) / >.•\g (k)l(k)t(k)n (tpr(i; O)+l<,(r P<pr,(i;n)g )n' g)n +a(n)>aPr(i; n).(k)'•(k)n, and hence 1 < pr(i; 0) + 1/271 + 1/271 + > p,(i;n)Pn,k (Iljl/ (nMki) - vl < e) a(n))>a
It only remains to relate the last term to the probabilities pk+r(i; j) of the generation k + r. Consider the set J(v,, ) of all j= (jl,...,jt) such that llj/r-vil < e for some r > 0. Since llj/r - vil < e if and only if lj -rvll < re, this means the set of all j within an angle about v of opening arc-cos 1 - e2/11ll l2 . Since every term Pr (i;nl, , nt) Pn,k (jl, . ,jt) involved in the above sum represents part of the total probability Pk+r (i; jl,..,jt) for a j = (i,...,jt) c J(v, e), we have
pk+r (i; jl...,j t) > 1 - . Hence we have the jEJ(,e )Theorem. For every e > 0, r > 0, and i = 1,..., t, there exists a K such that for all k>K, :pk (i; jl -, jt) > 1 -, where the summation is over all vectors j such that \IJ - rvll < re for some r > 0.
This means that, with overwhelming probability the distribution of progeny in high generations among the t types will be essentially in the ratios of the unique characteristic vector of the first moment matrix M of the original generating function G.
II
Abstract
We continue in this report the generalization of the methods and results of Hawkins and Ulam2 which we began in I, being concerned principally with systems which are below critical. After deriving necessary and sufficient conditions for this state in terms of first moments, we study the direction of flow induced in the "unit cube" by the corresponding generating transformation. The latter results are used to show that the distribution of the generation in which death first occurs possesses moments of all orders.
Limits of expectations are obtained for the problem of subcritical system with source, and for that of total progeny (corpses) in subcritical systems.
Finally, it is shown how fictitious particles of new types may be introduced in such a way that certain more complicated problems may be reduced to the case of simple iteration of generating transformations. In particular we have shown how this may be accomplished for the system with source, and for the problem of total progeny.
The next section, III of this series, will deal with measure theorems on the space of all genealogies possible in multiplicative systems.
I—
Some Properties of the Jacobian
1. We consider a multiplicative system involving t types of particles, in which a particle of type i has a fixed probability pi(i;jl, .. jt) of producing a total of jl +... +jt particles, j, of type v, upon transformation. The corresponding generating transformation G(x) of the unit cube It:gi(x)- =-Pi (i; il,..*,jt) xi '*6 xtit5
has the property that, upon iteration k times, the resulting transformation Gk(x): gi)() = Pk (i;jl,.. .,jt)x...xt has as its coefficient pk(i;j) the probability that the state (jl,...,jt) shall exist in the k-th generation of progeny from one particle of type i.
2. The transformation G(x) has for its Jacobian at x = 1 the first moment matrix J(G(1)) = [Ogi/axj = [mij][m,i j and, more generally, J(Gk(l)) = [gk)/axj] =[mij ] where mk) is the expectation of particles of type j in the k-th generation of progeny from one particle of type i. Under our assumptions on G (see I), all these moments are positive. Moreover, we have seen that [mij] = [mij]
3. The importance of matrices with positive elements required study of their properties. We found that for such a matrix M, there is one and only one solution, r, v of the relations vM = rv, r > O, v > 0 (i.e., all vi > 0) In similar fashion one shows the existence and uniqueness of r', v' such that Mv' =r'v', r' > 0, v' > . That r = r' in these equations is evident from the following
Lemma. If vM = rv, r > 0, v > 0, and R is an arbitrary positive characteristic root of M, then r > R. Similarly, Mv' = r'v', r' > 0, v' >0, IM - RII=, R > 0 implies r'>R.
Proof. First statement: Let wM = Rw, w 5 0, and define b= min{wi/vi}, B = max{wi/vi}. Since all vi > O, bvi<wi <Bvi and hence, bVm(jk) <Em < BVim(k) and brkvj <RkW<BrkVj, all j. Since r > O, also bvj<(R/r)kwj<Bvj. Suppose R/r > 1. If at least one wj > 0 the right member yields a contradiction for large k. If at least one wj < 0 then the left member does. Hence all wj = 0, contradicting choice of w. Thus R/r< 1.
The second statement of the lemma is proved similarly.
We include for later use the trivial remark: If M is a matrix with positive elements, and Mv' = rv', r > O, v' > O, we have Mnv' = r"v' for all positive integers n, and thus rnvj = m > min(vl ). Hence . m7(n) < r~v'/ min (v) -= rVi < rr max(vi) Vr", where V is a positive constant.
Similarly, if vM = ri, r > 0, v > 0, we have Ei m () < Wrn, where W is a positive constant.
4. We shall also need the following
Theorem. If M is a matrix of positive elements with Mv' = rv', r > 0, v' > 0, and Tk(v) is the transformation Mkv/s (Mkv), then limTk(v) = v' uniformly for all v $ 0, vi> 0.
Here s(w) indicates the sum E wi of the components of the vector w. The proof is entirely analogous to that used in I to prove the same result for row vectors.
5. We proved in I that lim Gk(O) = x° exists: lin gk)(0) = limpk(i; 0)=Xi, and defined G as supercritical in case all x°< 1. Under our conditions on G, the alternative case is that all x = 1, hence x°= 1, and this case we called subcritical. Most of the present report will be devoted to systems of the latter type, in which the probability of death in generation k rises to limit one.
6. We have seen in I that a system is subcritical if and only if the maximal root r of the first moment matrix M = J(G(1)) is less than or equal to one. Equivalently G is subcritical if and only if the determinants IAnl of the upper principal minors An = [mij - Oij]1 of the matrix A = [mij - ijl = M - I satisfy the relations: (-1)n+1lA1l < 0, n = 1,... ,t-1, (-1)t+lIM - I < .
7. We say a system is just-critical in case it is subcritical and the maximum positive characteristic root r of M is equal to one. A subcritical system with r < 1 is said to be below-critical. The justcritical case, while of theoretical interest is refractory, and we have limited ourselves for the most part to systems which are below-critical.
Theorem. A subcritical system is just-critical if and only if IM-II = 0. If r = 1, then of course 0 = \M-rI = IM-II, r being a characteristic root of M. Conversely, if IM - II = O, r > 1 by maximality of r, and r< 1 by assumption of subcriticality; hence r = 1.
8. We include for future reference the trivial remark: If M is a matrix of positive elements with Mv' = rv', 1 > r > O, v' > O, then there exists an e > 0 such that all vectors w in the e-neighborhood of v' are positive and satisfy the inequalities mi Wj < -(1 + r)Wi < Wi .
It suffices to note that the functions Ri(w) =_ mijwj/wi are continuous at w = v' and there have value Ri(v') = r < 1. Hence there is an e > 0 so that whenever llw - v'll< e, w will have positive components and Ri(w)< 1/2(1+ r).
II—
Direction of Flow of G[superscript(k)](x) in Subcritical Systems
1. In this section we study properties of the vector 1 - GK(x) with i-th component I-g?(k) (x), for x 1 in It, for a subcritical system, and of the vector Gk+l(x) = Gk(x) in a system below-critical. These results are of preliminary character, and are exploited in III.
Note that, for x $ 1 in It, we have x; I and thence gk)(x) < (k) (1) = 1 for all k so that the vector 1 - Gk(x) is not zero; in fact all its components are positive.
2. Recall that the e-cone of the vector v' consists of all vectors w such that (lw/a - v'll < e for some real positive a. We prove the
Theorem. If G is subcritical and x $ 1 is in It, then for every e > 0 there is a K so that for all k>K the vector I - Gk(x) is in the e-cone of v'.
Here v' denotes the characteristic vector of the relation Mv' rv', r > 0, v' > 0, where M is the first moment matrix of G. The theorem asserts, in geometric terms, that the direction from Gk(x) to 1 approaches the direction of v' with increasing k.
Proof. Fix x $ 1 in It, and e > 0. There exists a k (hereafter fixed) such that ITk(v) - v'\ < e/2Vt for all v $ 0, vi> O. (The transformation Tk is that defined in I 4.) Since limGn"() = I and the first partials of Gk are continuous at 1, there exists an N such that for, all n> N, mre) = (rga/nXj) < eMk/2 (() where Mk min { E mk;ij j}, and the partial is evaluated at the point Gn (x)e. Fix n >N and define an = E (1-g)(i)) =s(Mk (1-G(x))). Then vi-( - (1-g x)/a[_va-y'(O-gk)/9xj)p (l-g)( n))/aI from Taylor's form of gk)(x) expanded about 1, and evaluated at x = Gx(z). Thus G(x)-<P-< 1.
The latter absolute value does not exceed v -Em (k)-g(n(- ))/a + ' -T(k) (1 - G(n)()) I +E(m()-_ (ag(k)/D X)) (1-gn)(i))/a~ < e/2vt + eMk/2Vt s (1 - C(n)()) /a. But ak> Mks (1 - C(n)(x)) so finally the original absolute value is seen to be less than e/vt. But then Ilv' - (1 - Gk+n(X)) /akll< e.
3. In the below-critical case in one variable (t = 1) the graph of the generating function g(x) is monotone increasing and concave up on the interval (0,1) with 1 = g(1). Moreover, the Jacobian J (g(1)) = g'(1) and hence the characteristic root r is g'(1), which is therefore less than one when g is below-critical. It is obvious geometrically therefore that for every x satisfying 0 <x < 1, the sequence of iterates gk(x) is monotone increasing: x< g(x) < g2 (x) < .... This simple situation need not obtain in case t > 2. For purposes of illustration we regard the following example.
Consider the transformation G(x) of the unit square I2 defined by I I 1 g1(x) = - + - 1X+ x1x21 1192(x)= - + -X2 + xx2 Computation of the four first partials at 1 shows that the first moment matrix is M= 4 I and thus M-I=1_41 The upper minor determinants of the latter satisfy (-1)2 AI1 = -- < 0, (-1)3 A21 =-16 < 0. Hence G is below-critical. Indeed the characteristic equation of M is x2 - x + 3/16 = 0 with roots 1/4 and 3/4. Thus r = 3/4 < 1.
(The right hand characteristic vector v' is found to be [ upon setting r = 3/4 in the matrix-vector equation (M - rI)v' = 0 and solving the resulting two homogeneous equations in v{,vs.)
Note that (see I), M being subcritical, we must have mij vj<vi for at least one i = 1, 2, whenever v # O, vi> 0, but not necessarily for both indices. In our case, for example, [14 [] [] where 9 < 8 but 3 > 2. 42 2 2
It is essentially this fact that causes the simplicity of the one variable case to break down. Specifically, let x = (4/5, 1/5). Then our G(x) at this point is (74/100, 59/100) and x -< G(x) is false for this x.
We can however prove the following
Lemma. If G is below-critical, and x $ 1 is in It, the k-sequences 9i(k)(x) are eventually monotone increasing.
Proof. By I 8, there is an e > 0 such that ilw - v'll < e implies w positive and mijwj< wi, all i.
By the preceding theorem, this e determines a K so that, for all k>K, II (1- Gk(x)) /ak - v'll < e. For all such k therefore, Zmij(1 -g(k)())/ak< (1 -g)())/ak and the positive ak may be deleted from this inequality. Under our assumptions on G, gi(x) > I + Emij(xj- 1) for every x with all components less than one, and thus I - gi(x) < Emij (- xj). In this inequality we may set x =Gk(x), since we have already shown that the latter enjoys this property of components (cf. II 1).
Thus 1 -g (k+l) ) < Em (1 - (k)(x)), and combining with the previous inequality, 1 -gx )) < 1-g(k)x), whence the result desired.
4. We have seen that the direction of the vector 1 - Gk(x) approaches that of v'. We intend to prove the same result for the "vector of flow" Gk+l(x) - Gk(), with x 1 in It.
It is trivial that the latter vector is never zero for any k, and hence defines a direction. For otherwise, G would have a fixed point Gk () 1. Moreover, we know from the preceding result that for all k >K, all components of this vector are positive.
Theorem. If G is below-critical and x : 1 is in It, the direction from Gk(x) to Gk+l(x) approaches that of v'.
Proof is entirely analogous to that for 1 - Gk(x). Fix e > 0 and 2 5 1 in It. Then we may fix k so IITk(v) - v'le/2v/t for all v O 0, vl > 0, and next determine N so n >N implies the inequality (*) of II 2, and i(n)(2X) > g9i(). Now define An = k ij m=(k) (gk+n+l)(,) - g(k+)()) > Mk (Gk+n+l()- Gk+n(r)). Then Iv - (gi++x)( -gki ))/AVil(c (k)/Xj) p(g n+)( -jn)(2 ))/Ak < - ' /g)MJ)-J))/ by Taylor's theorem where Gn(x) -P < G"+L().
The remainder of the proof now proceeds just as in II 2. The essential point is that the Tk operates now on the vector Gk+n+l(p) Gk+n(z) which we know by II 3 to be positive and hence subject to the inequality v' - Tk (Gn+() - Gn(x)) l < e/2V. Thus the final result is Iv'- (Gk+'+n() - Gk+n()) /A\ < e for all n > N.
5. We now have immediately the main result which we want in III.
Theorem. If G is below-critical, and its first moment matrix satisfies the relation Mv' = rv', 1 > r > 0, v' > 0, and further, if x$ 1 is in It, then there is a K such that, for all k > K, the ratio of successive terms of the K-sequence gi +)() - gi()() is less than 1/2(1 + r).
From I 8, there is an e > 0 such that lw - v'll < e implies w positive and mnijwj < 1/2(1 + r)wi. This e determines K by the preceding theorem so that k>K implies
Gl(x)G() v <e Ak Hence EJ((k+(l)k(1)-9(k) ()) < i( + r) (k+)()- g(k)( )) since the positive Ak may be deleted. But (k+2) - _g(k) )=gi (G+ )) - gi(Gk(x)) = 9(gi/ x,)p((kgl)x) (k)(-g ))< Emij(gk)x) g(k)()) Hence, combining, we have the desired result.
III—
On the Distribution of Death in Subcritical Systems
1. Let G be subcritical, and define qk(i) as the probability that complete death of the system of progeny from one particle of type i should first occur in the k-th generation.
Clearly pk(i; 0), the probability of death in the k-th generation, may be expressed as k pk(i;) =qj(i) j=l Hence we have the relations: qk(i) = Pk(i; 0) - Pk-l(i; 0) =- gk)( 0) - (k1(0), k > 2, ql (i = p(i; 0) = gi(0)
2. We recall that x z x' implies gi(x) < gi(x') under our assumptions on G. Now 0 ~ G(0), otherwise G(0) = 0 and G would have a fixed point in It besides 1 and would be supercritical. Hence, inductively gi(O) < 92)(0 ) < g ( ()(0) < ..
It follows that the qk(i) are positive for k> 2, and ql(i) = gi(O) is positive for at least one i. Also, Z'qj(i) = limk Ek qj(i) = limk pk(i; 0) = 1 for every i, and thus the sequence ql(i), q 2(i), q 3(i),... is a probability density function on the positive integers.
3.Theorem. If G is below-critical, its first moment matrix M having maximal positive root r < 1, there is a K so that for all k >K, qk+l(i)/qk(i) < 1/2(1 + r). Consequently the density sequence qk(i) is eventually monotone decreasing, and all its moments ms(i)- ksqk (i), S0 k exist.
The first statement is an immediate consequence of II 5, with x = 0. The finiteness of the moments follows from the ratio test: (k + 1)sqk+l/kSqk < (1 + l/k)s (1 + r) - (1 +r) < .
4. In the case of a system below-critical in one variable (t = 1), it is geometrically obvious that qk+l/qk = gk+ (0) gk (0)/gk (0) gk-() = g(x') - g(x)/x' - x < g'(1) = r < 1 for all k, so qk < qlrk- l and mi = Ekqk < q (1 + 2r + 3r2+ ...) =ql(l +x+x+...) X = q1 ((1- x)-)xr = ql/(l + r)2. Hence in this case m1< p(0)/(1 + r)2 = g(0)/(1 + g(1))2
5. Examples show that, even in the one variable case if the system is just-critical, even ml may be infinite. We hope to study the one variable case more completely in a separate report.
IV—
Subcritical System with Source
1. Consider t types of particles whose probabilities of transmutation are given by the generating transformation G(x): gi(x) =p(i;jil,...,jt)xj l ...xj as before. Suppose further that we have a source which emits independently into the system n+...+nt particles, ni of type i, with probability s(nl,...,nt)> 0. We associate with the source the generating function S(x)= s (nl,...,nt)xL ... xt, S(1) = 1
Consider a process consisting of the following steps:
1. The source produces an initial set of n, + ... + nt particles, ni of type i, with probability s(nl,...,nt). These particles transmute according to the G law to form a system which we regard as the first population.
2. The source again contributes new particles, and these together with the first population transform according to the G law to form the second population, and so on. At the k-th step, the population (ml,...,mt),mi of type i, will occur with some probability hk (ml,...,mt), and we define the corresponding generating function Hk(x) = , hk (ml,. ., m) x ...xt
Now, from the elementary laws of probability, as we have pointed out in I, transmutation of any population with generating function N(x) according to the G law gives a population with generating function N (G(x)) = N (g 1(x),...,gt(x)). Hence for the problem considered above, we see that Hl(x) = S(G(x)), H2(x) = S (G(x))S (G2 ()),..., and, generally Hk(x) = S (G(x)) Hk- 1(G(x)) = S (G(x)) S (G2(x)) ... S (Gk (x)).
For, if Hk-i(x) is the generating function for the (k-1)st population, then the generating function for the intermediate population resulting from the contribution of the source to the (k - )st is Tk(x) = S(x) . Hk-1(x), and, upon transformation of this result by the G law, we must have for the generating function for the k-th population: Hk(x) = Tk (G(x)) = S (G(x)) - Hk-(G(x)) .
k 2. From Hk(x) = S (Gi(x)) follows Hk () = Hk- (x) . S (Gk ())
Since, for x in It, the latter factor is less than or equal to one, the k sequence Hk(x) is monotone non-increasing at x and H(x) - limHk(x) exists in It. Clearly 0 <H(x)< I and H(1) = 1.
If S satisfies the conditions
(S*) at least one OS/Oxj > 0 for all x : 0 on It, then for every x' with all components less than one, we have S(x') < 1. For S(x')1+ E (OS/9xj)p (x' - 1), where 0 ¢ P.
But for every x $ 1 in It, we have seen x' = g\)(x) < 1 for all i,k. Hence, for such x, S(x') = S (Gk(x)) < 1 and 1 > H1(x) > H2 (x) > ... so H(x) < 1. Thus we have
Theorem. The function H(x) - limHk(x) exists for all x in It, and 0 <H(x)<1,H(1) = 1. Moreover, H(x) satisfies the functional equation H(x) = S (G(x)) H (G(x)). If the source function S satisfies condition (S*), then H(x) is not identically one on It, indeed, for every x $ 1 in It, H(x) < 1.
3. If G is supercritical: x° ¢ 1, then Hk (O) = H S (Gi(x°))= (S(x°))k 0, SO H(xO) = 0. Moreover, it is easy to see that H(x) _ 0 for all x 5 1 in It. For limGk(±) = x°(rx z 1) and hence S (Gk()) is bounded from 1. Thus only the subcritical case is of interest.
Theorem. If G is below-critical, and S is a polynomial of degree s, then H(x) is continuous.
Proof. We have H_-(x) - H,(x) = S (G(x))...S (G'-l(x)) (1 -S (G .(x))) 1-S (Gh (x)) < 1-S(G"(0))
Now, by Taylor's form, g2(0) > 1 + Ej m(n (0 - 1) = 1- yj m(n) > 1 - Vr'. Since r < 1, the latter is positive for all n sufficiently large. Now 1-S (Gn(0)) < 1-Es (i,...,jt) (1-Vrn(j) < 1-S(jl, ...,jt) (1 - Vrn) = 1-(1 - Vrn)s < 1- (1 - sVrn)<sVrn. (See Appendix B). It follows that the sequence Hn(x) is uniformly convergent to H(x) on It, and hence the latter is continuous on this range (see Appendix A).
4. Since H,(x) = S(G)...S(G") = Hn_S (Gn), we have OHn/Oxj = OHn-_i/xj •S (Gn ) + H_n-l (S (G)) /Oxj and \OHn_ /Oxj- OHn/G3xj < OHn-1/Oxj . (1 - S(Gn)) + H,O_l (S (Gn)) /Oxj. Now aHn_l/0xj <OS(G)/0x 3+. . .+S(Gn-1) /O xj and OS (Gn) /Oxj = E (S/x)0 (Og/x) (OSS/)Oi) (n)/ < E (S/xi) m) < TE TWrn.
Thus the above absolute value does not exceed TW (r + ...+ rn-~ ) sVrn +TWr' <Krn(1+r +... + r-1 ) < Krn/(1 - r). Hence by Appendix A, the n-sequences OHn/Oxj are uniformly convergent on It, the partials OH/Oxj exist, and lim OHn/0xj= OH/Oxj in It.
But (OHn/Oxj)1 is the expectation of particles of type j in generation n, and the limit approached by this expectation is (OH/Oxj)1.
Since we know H satisfies the functional equation H = S(G)H(G), we have OH/Oxk = EOS/Oxj dgj/OxkH(G) + S(G) ZOH/Oxj Ogj/Oxk. Setting x = 1, we obtain s (mjk -S jk) (OH/O9xj), = - E (0S/0xj)I mjk, k= I,...,t.
Since G is below-critical, the determinant of the system is not zero and the expectation limits are uniquely determined. Thus follows the
Theorem. If G is below critical and the source function S is a polynomial, the limit function H possesses first partials on It, and the limit of the expectation of particles of type j in population n is the value at x = 1 of OH/Oxj. Moreover the latter limits are uniquely determined by the linear system E(mi k - bjk) (OH/0xj)l= -mjk (OS/OXj)l with non-vanishing determinant IM - II.
V—
Total Progeny for Systems Without Source
1. Returning to the simple problem without source, let Pk (i; jl, , t) be the probability that in the total progeny in all generations 1 through k produced by one particle of type i (generation 0), there should be ji particles of type 1,..., jt particles of type t. Define ci)(x) = Z Pk (i; jl, .jt) xj '..X-j and C(k)(x) as the corresponding transformation of It. Here the upper k does not indicate iteration. Clearly P(i; j) = pi(i; ), hence cl1)(x) = gi(x) and C(1 ) = G(x).
Now let k be greater than 1. The production of the total state J,...,Jt at the end of the k-th generation from one particle of type i arises from the mutually exclusive states jl, - jt; 0 < jh < Jh in the first generation. If this state is 0, .. ., 0, then and only then will the total state J1,. .., Jt be 0,... ,0, so Pk(i; O) = p(i; 0).
Suppose then that state J is not 0, and hence state j $ 0. Each of the jh particles of type h in the first generations acts independently of the others, and of those of other types to produce in the k - 1 next generations a total state of some al,...,at particles with probability Pk-1 (h; ai,...,at). We want the total state from the ji,... ,jt particles of the first generation to be J1- ii,..., Jt - jt after the next k - 1 generations. It follows from the elementary laws of probability that, for J 0,
O<jk<Jkji jt Pk(i; J) = Zpi(i;j) ZfPk-l (1; al... at)...f JPk- (t; al,..., at) jiO E ai=Ji-ji But this is the coefficient of xiJ ... xtJt in gi (xic1) (x),.. . ,XtCt- )(X)) = pl (i;j)xjl . xj t [ Pk-1 (1; a)xa ] . [ Pk-l(t; a)xa] and Pk_l(i;0) = pl(i;0), which is the constant term of the above function. Hence the
Theorem. The generating transformation C(k)(x) for the total progeny in generations 1 through k satisfies the recursive relations cM (x) = gi(x) ci (x) = gi (xc 1)(X),. ..,tCt (x)
2. If y,z are arbitrary points of It, we have from Taylor's form, gi(Y) = gi(z) + E (gi/dxj)p (yj - zj) and thus gi(y) - gi(z) I<mij lyj- zj \. It can be said therefore that the number dk _ ci (x) - c(x)I < Emij Ixjck _xjc -2 1 < mijd k-1 since is in It Iteration of this inequality yields dk < mrk-2 dk-2, and eventually dk < Zm (k-2 d(2) = -En -2g9j (xG(x) - gj()) I<ZE,k-2 Emjn <_ ijJ = Z ij _ ij Xn9n(X) -Xn= Z mi'n Xn 9n(X) - 1 <m(k1 ) < Vrk1. n Thus we have the
Theorem. If G is below critical, the generating functions ck(x) are uniformly convergent to continuous limit functions ci(x) on It. The latter satisfy the functional equations ci(x) = gi (xicl(x), . . . ,xtt(x)).
3. We seek now a dominating sequence for Di-=Ocik/xj ck-1 aXI First, note that Pj -ac/Qxj = Z (OgiO/Xn)p [5nj Cn x]+XnC 1 /j] <mij+ P\j , where P = xCk- l. Iteration leads to P1<mij + m m) pkj P2 and eventually pk <+m(2)+. . .+m (k-)+Em(k-)(gn/j)<mni+m( + (+mk) a~ <.~ij +.~ij ij in _j +j
For brevity, let Ak and Bn denote temporarily the round and square brackets involved in the Pj sum above. Then D= pk pk-P= |E aBk _ An1 <AkIBk -B \k-1 + EB 1A- -A - < min,lBk -Bk-11' + E Bk-'llAk - Ak-117
We obtain upper bounds for the three A, B expressions: (1) Bk -B-'- =(cJ- I6ck-2)+ X (cC-1 _1 /X- -c+2 /iXj) < (2) Bn1 < 6 nj + c a2 /oxj<6nj + (mnj +..+ mj ) (3) IAk - A k-1(Ogi/X.l)XCk-1 - (-g9il/Xn)Xk-2<a&gi/azXnpzlxpc 1 _pC-k-21 Bdk- 1 < BtVr k-2
Hence, combining, Dk < E min (6njVrk-2 + Dj ) + BtVrk-2 (6nj + mj +. +m( 2) < mijVr + E mi,nDn1 + BtVrk- + BtVr-2n(mj+. + m 2))j m ijVrk2 tV+ BVkBtVr2 (I+ Bt + Wrk2) + E mn D - < mTijVr- 2 + BtVr-2 + BtVr-2Wr/(l - r) + E minDn Thus we have Dk < Krk + E minD4nj1
Iteration leads to Dk<Krk-2 + Krk-3min + Krk-4Emin) + ... + Kr Emk 3) + F-mik 2 )D We obtain an upper bound for D2 j = dc2 /Oqxj - dc/9xjj = 9 (gn/ x,)G g,i+ E (agn/xp).x•gX/ -X ogn/0x= (09g /aXj ) ,g-1 + I(Ogn/axj),) - (09gn/ j) + | E (a9gn /xp).X pOgp/O/xj <m,j + B > \xpgp-xp + mnpmpj<m,j + +Bt
Therefore, substituting gives Dk < Krk2 (k - 2) + E (-) + E (k-2)m + tm(2) < Krk-2(k - 2) + Vrk- l + Vrk + Btrk-2 . Since each of these terms defines a convergent series so does their sum and, we have
Theorem. If G is below-critical, the sequences ack/x3j are uniformly convergent on It. Hence the partials Oci,/xj exist on It, and are the limit functions of the corresponding sequences. Since lim (Mcr/nxj) = (ci/xj), the latter is the limit approached by the expectation of particles of type j in the total progeny at the end of k generations from one particle of type i. From the functional equation satisfied by ci(x) follows En (mi n- - in) (C,n/OXj)I = -min, i = 1,...,t where IM -IIf 0, and the expectation limits are uniquely determined.
VI—
Total Progeny in Subcritical System with Source
1. Consider again the process described in IV. We found that the probability distribution in the n-th generation had generating function Hn(x) = S(G)S (G2 ) . . . S (Gn) .
Here S (G') gives the distribution for the isolated system produced by the initial action of the source, S (Gn-1 ) is that for the component due to the second action of the source, and so on. Regard the isolated component produced by the (n - k + l)st action of the source, with generating function at the n-th level (of the whole system) S (Gk). By an argument exactly analogous to that of V 1, one sees that the generating function for total progeny at the n-th level for this isolated component of the system is given by S (xic (x),... ,xtCz(x)).
It follows from the elementary laws of probability that the generating function Un(x)- un (il,j,jt) x .. .Xis Un(z) = S (xic(z)..., xtCt(x)). Here n (jl,...,jt) denotes the probability that, at the n-th level of the entire system, there should be a total of ji particles of type 1,..., jt particles of type t produced altogether, counting particles contributed by the source as well as all progeny of such particles.
2. Since Un(x) = Un- 1(x)S (xici(x)), and for arbitrary x in It with all components less than one, the latter factor is less than or equal to S(x) which in turn is less than one (supposing condition (S*)), it is evident that lim Un(x) = 0 for all such x. Since however Un(1) = 1 for all n, Un(x) converges to a discontinuous limit function on It. Moreover, it is manifest that the expectations approach infinity so we cannot expect a simple theory.
Appendix A. We collect here some standard results from classical analysis.
A sequence of functions Fn(x) on It is said to be uniformly convergent on It in case (1) limFn(x) = F(x) exists for each x of It, and (2) for every e > 0 exists N so that n > N implies IF,(x) - F(x)l < e, all x in It.
Theorem. If all Fn(x) are continuous on It, and the sequence is uniformly convergent there, then the limit function F(x) is continuous on It.
Theorem. If Kn(z) is a uniformly convergent sequence of continuous functions on It, with limit function K(x), then the sequence of functions In(X) =- 0 Kn (Zl,X2, ...,xt)dzl is uniformly convergent to jXK(zl,x 2,...,xt) dzl, and similarly for the other variables.
Theorem. If (1) Fn(x) converges pointwise on It to the limit function F(x), (2) the partials aF,n/xl exist and are continuous on It, (3) the sequence QFn/Oxl converges uniformly on It, then OF/Oxi exists on It and is equal to the limit of the sequence aFn/9Xl. Similar statements hold for the other variables.
Theorem.Fn(x) is uniformly convergent on It if and only if e > 0 implies existence of N such that for all n>N and all positive integers p, Fn+p (x) - Fn(x) < e on It.
Theorem. If F (x) is a sequence of functions defined on It, E Mn is a convergent series of non-negative numbers, and for all sufficiently large n, Fn(x) - Fn+l(x)\<Mn, n It then Fn(x) is uniformly convergent on It.
Appendix B Theorem. If 1 + h> 0 and n is a positive integer, then (1 + h)n> 1 + nh. Proof trivial by induction on n.
VII—
The "Time" Particle
1. It is of interest to note that, in the system with source, it is possible to regard the n-th population as essentially the n-th generation of progeny of a simple system of t + 1 types of particles produced from one particle of new type t + 1. Suppose that we associate x1 ,. .., xt as before with the t original types of actual particles, and introduce a new type of particle with variable xt+l. Consider then the transformation V(x) of It+l, defined by the component functions vi(X) = gi(x) vt(x) = gt(x) X: _ (X1, .. ,Xt,Xt+i) Vt+l(x.) = Xt+lS(G(x)), x = (Xl,...,Xt,).
One verifies easily that the (t + 1)st component of the k-th iterate Vk(x) of V(x) is = xt+lS(G(x))...S (G(x)). Hence vk+1, the result of a simple iteration satisfies the relation t+l(Xl ,... ,Xt,Xt+l) = Xt+Hk(X1, . ..,Xt).
The transformation V(x) fails to satisfy the restrictions we have imposed throughout on our generating transformations, for example vli/Oxt+l vanishes identically on It+l, and we have treated the problem independently for this reason. Nevertheless we shall be able to make use of the indicated simplification to the iterative case in a future report on the space of histories of a multiplicative system.
VIII—
Total Progeny as an Iterative Problem
1. In a similar way, the transformation Ck(x) of V may be produced by an iterative process. Let x1 ,...,xt, be the usual variables associated with the t types of actual particles, and z,... ,Zt, be variables for a set of t types of dummy particles. Suppose probabilities of transmutation among the 2t types are defined by the following generating transformation L(x,z) of I2t, L1(x, z) = zig,(X,..,Xt) Lt(x, z) = Ztgt (X,...,t) Lt+l(x, z) = zi L2t(x,z) =- t . = (l,...,t) =(Z,...,Zt)
where the gi(x) are the components of the usual G(x). One verifies easily that ththe -th component (for i = 1,.. , t) L(x, z) of the k-th iterate of the transformation L(x, z) satisfies the relation L (x, x) = XiCk (Z)
If one examines the nature of the process induced by L(x, z) one finds that each time an actual particle of type xi is produced in a generation k, it is forced to produce in generation k+l a dummy particle of type Zi, as well as its actual progeny. Moreover, every dummy particle of type zi is forced to just reproduce itself in one for one fashion. Thus the total actual progeny is tallied by means of the one for one reproduction of the dummies through all generations. Thus if one sets zi = xi in generation k one totals the entire progeny including the actual particles produced from actual particles in the (k- 1)st generation and the dummies which total the whole previous progeny of actual particles. The extra xi factor is due to counting the zero-generation particle in the L process.
III
Abstract
The set FJ of all possible genealogies or graphs z of a multiplicative system produced from one particle of type i is here introduced as a fundamental concept in the theory of such systems (see I, II). This set possesses a natural intrinsic distance function d(z, z') under which it is a complete zero-dimensional metric space satisfying the second axiom of countability.
Simple axioms on (A) intervals and (B) measure of intervals are given for an abstract set from which the classical theory of completely additive measure is derived.
Intervals in the set J1 are defined intrinsically and shown to satisfy the axioms A. If now a particular multiplicative system with given generating transformation G(x) is given, the transition probabilities Pl (i;ji,.. ,jt) serve to define a measure for the intervals of 1; satisfying the axioms B. Proof of the latter is non-trivial due to non-local-compactness of the space 17.
With this mathematical structure at hand it becomes possible to state in a simple way some of the striking properties of multiplicative systems.
If x° = G(x° ) is the death-fix-point of G(x), then the set of graphs of Fi which terminate in death has measure xi .
If v = (vl,...,v 1) is the characteristic vector corresponding to the maximal positive characteristic root r > 1 of supercritical system, then the set of all graphs of 1 whose k-th generation population approaches the ratios v:v2:... :vi has measure I - A°. Thus, almost all graphs (genealogies) either terminate in death or approach the mode v as limit. These results are trivial for subcritical systems, in which by definition, 3z = 1.
I—
A Remark On Measure Theory
1. It is convenient for our purposes to have a simple set of axioms on which measure theory may be shown to rest. To this end, let r = {z} be a set of points z, and I = {i} a class of special subsets i of F called intervals. We demand that the entire set F and the empty set X be intervals. Denote by J the class of all subsets S = Ei of F which are set sums of a finite or countable number of intervals. (All set sums hereafter are understood to operate on finitely or countably many summands. The notation La is used to indicate that the summands are mutually disjoint.)
We suppose that intervals satisfy the following axioms: I 1. Every set S = Zi of J can be represented as a sum Ej of disjoint intervals. I 2. The complement i' = -i of an interval i in J. I 3. The set product ij of two intervals i and j is an interval k.
Moreover, we assume m 1. To every interval i is assigned a non-negative real number m(i)>0 called its measure. m 2. M(F) = 1, m(0) = 0. m 3. If i = Ej, where the i and j are intervals, then m(i) = mr(j).
An additive class C of subsets of F is one such that C 1. All intervals belong to C. C 2. A is in C whenever all sets A are in C. C 3. If A is in C, so is A'.
The Borel sets are those common to all additive classes, hence, themselves form an additive class.
We shall define a property of subsets U of F called measurability, and prove that the class of measurable sets is additive. Simultaneously we define a measure m(M) for every measurable set M and prove
M 1. 0 <m(M)< 1 for M measurable. M 2. m ( M) =Em(M) for M measurable. M 3. If M is an interval, the measure assigned to M as a measurable set coincides with its intitially given measure.
2. In stating the above axioms, we have attempted to focus attention on the fundamental assumptions. It remains to show that the essential theorems follow from our axioms. This we do in the remainder of this chapter, for the sake of completeness, following the classical arguments for the real line1 . We note first the following properties of intervals: I 4. E S is in J whenever all the summands S are in J. For E S is an at most countable sum of at most countable sums of intervals. N
I 5. I1 i, then S' is in J whenever all N factors are in J. It suffices to see this for N = 2. Hence let S1 = Zim, S2 = j. Then S1S2= i,mjn = E (imjn). By I 3, imjn is an interval, so S1S2 is in J, the range (m, n) being at most countable. 6. If S = i, then S' is in J. N For S' = i', each i' is in J by I 2, and thus S' is in J by I 5. 1
3. We now assign a measure to every set S of J. Note first that if i D Si in, then i = Zin±+i(in) . By I 6 and I 5, the latter set is in J, so by I 1, we may write i= in i-n+ Ejm, and by m 3, and m 1, m(i = E (in) + E T(j) > E m (in).
Hence, if in is any sum of disjoint intervals, we have for every N, F D i in and hence 1 > Z m (in). Thus E m (in) exists, less than or equal to 1, for every such disjoint sum.
Now suppose -i,n = Zjn. Then im = im(Zjn)= E (imjn) ,nkmn, kmn being the intervalim, jn. (See 1 3.) Hence by m 3, m (im) = En m (k,,) and m(im) = m En m(kmn). Similarly Em(j,) = En Em m(kmrl). Now the double sum mn, m (kmn) converges to a limit < 1 by the preceding paragraph since the kmn are disjoint. Hence the iterated sums are both equal to this limit and hence to each other. Thus E m(im)= - m (jn).
It is therefore clear that, if S is any set of J we can write S = Zi and define m(S) = rm(i) unambiguously. Clearly 0 <m(S), and if S is an interval i, m(S) coincides with the initially given measure m(i).
4. We establish in this section some properties of the measure m(S) for sets S of J. First, m 4. m(S) =m(S). For, write Sm =i,,mn. Then m(ZS) m( n imn)En,m (im,) = Zm n m (in,) = m (S,) . Next m 5. If S and T are in J and S c T, then m(S)<m(T).
For, let S = E im and define Sn =m. Then T = S +TS', the summands being disjoint and in J. Hence, by m 4, m(T) = m (S) + m(TS')> rn(Sn) =m(i,,). Hence m(S) = m(i) < m(T). Thus follows m 6. For all S in J, 0 <m(S)< 1.
Next we prove m 7. For arbitrary sets S of J, disjoint or not, m(Z£S)<Zm(S), where the latter sum may of course be infinite.
Write S = E imn. Then ZSm = m,i,,, = ip = il4i1i 2+ili2" 3i . , the latter summands being disjoint and in J. Hence, by m 4, m 5, m(Sm,) = m(il) +(i'i2) + ... <(il)+m(i 2) + ...= r m(ip) = m,, 7n(imr,) = nErE m (imn) = Emm(S), the latter possibly being infinite.
Finally we have m 8. If S and T are in J, then m(S + T) + m(ST) = m(S) + m(T). First suppose S and T are each finite sums of intervals. Write S + T = S- (T- ST) ST- (T - ST) = T
where S and ST are in J, and, ST being a finite sum of intervals, also T - ST = T(ST)' is in J. Thus by m 4, m(S + T) = m(S) + m(T - ST) m(ST) + m(T - ST) = m(T), and addition yields the equation of the theorem.
Now let S = Zim, T = ij,, where we make either sum terminate in 0 in case it happens to be finite. Define Sn and Tn as the corresponding partial sums through the first n terms. Then by the first part of the proof we have for every n, n (Sn + Tn)+m(SnTn) = m(Sn)+mr(Tn). But limm(Sn) = limTli, = mnS), and limm(T,) = m(T). (Recall that m(0) = O).) Also, S + T = im + j, = E (im +jm) S, = S1-+SS2i Thus m(S + T) = m (S1) + m (S2) +...= lim, (m (SI)m(S9S2)+...+ m(S S S)) =limnm(Si+SS2 +-... + S ... n) = limn (1 + ...+Sn) =limn m (Sn+ T) . Finally ST = n im , = ,jimj n =mmnn = kj = mn = k +k k2 +.. where the intervals kmn = kp are listed in sequential order by upper squares: (1,1), (1,2), (2,1), (2,2); .... Then m(ST)= rn(kl) + m (klk2) + .. . limn2 (m (kl)+... + mn (ki ... k2_lkn2)) = lim,n mk+... ... (kl + +k k _)lk 2) = lim m (k + m (SnTn).
Hence we obtain m 8 in general by taking limits of both sides of the finite relation.
5. Let U be an arbitrary subset of F and define the outer measure O(U) = glb(m(S); U c S c J), and the inner measure I(U) = I - O(U'). From m 6 we have for all subsets U, m 9. O <O(U)< 1, and O <I(U)< 1.
Moreover, if U1 c U2 , every S of J which contains U2 also contains U1, so the numbers m(S) defining O(U 2) form a subset of those defining O(U1), and hence m 10. If Ui c U2 , then O(U1) O0(U2 ) and I(UI) < I (U2).
For every Si D U, S 2 D U', SI, S 2, in J, we have S1+S 2 D U+U' = F and M (F) = 1 = m (S + S2) <m (S)+m(S2 ). Fix S1 U. Then 1 - rn (S) < m (S2 ) for all S2 D U', so 1 - m(S)<O()1 U') U')< m(SI) for all S, DU, and 1 - O(U')<O(U). Thus follows m 11. For every subset U, I(U)<O(U).
Now let {Un } be a sequence of arbitrary subsets Un of F. Fix e > 0. For every 7n there exists an Sn in J such that Sn D Un and n (S,) < O(U) +e/2n. Hence Sn D EU,, Sn is in J, and m(r Sn)<Em(Sn) <ZO(Un)+e. Thus O(EUn )<m(Sn,) < EO(Un) +e for every e > 0 and we have
m 12. For arbitrary subsets U, O(0 U)<0(U). Suppose U 1, U 2 are disjoint. For every S1DUl, S2 D U2, S1,S2 in J, we have Sl + S2 D U{+ U2 = (U1U2)' = 0 = so m (S + S2 )= 1. Also, (Ul +U2)' = UlU C S1 S2 is in J, hence O (U +U 2)'< gib (m(S1S2)). Then I (Ul + U2) - I (Ul)- I (U2) = 1 - O (U1 + U2)'- I - (U)- 1O(U2) = O(U{) + O(U2) -0(Ul + U2)' - 1 > glb m (S1) + glb m (S2)gib m (S1S2) -1 gib (m(S1) + m(S2) -m (SiS2 )) - = gib m (S1 + S2) - 1= 1 - 1 = 0. We therefore obtain m 13. UIU2 =0 implies I (Ul) + I(U2) < I (Ul + U2 ).
Generalizing this, we have I(ZU)> I(U1)+ I(E2 U) >I(U1)+ I (U) + (I Z 3U)> ... >I() +...+(UN) +I( EN+) > 1 I(U) for all N, and m 14. (I(U)<I(L U), for disjoint summands.
6. We say a set M is measurable in case I(M) = O(M) and denote by K the class of all measurable sets M. For a measurable set M we define a measure m(M) = I(M) = O(M). From m 9 follows m 15. For a measurable set M, 0 <m(M)< 1.
Let i be an interval. Since i is in J and contains itself, we have O(i)<m(i) since O(i) is a lower bound. Now suppose i c S E J. Then m(i)<m(S) and m(i)<O(i) since O(i) is a greatest lower bound. Hence rn(i) = O(i). Since i' is in J, we may write F = i+ jn, so m(F) = 1 = m(i) + m(i'). Now 1(i) = 1-O(i') = 1 - O(Ejn) > 1-E (O(j,)) 1 - Em(jn) 1 - m(i') = m(i) = O(i)>I(i), so we have established m 16. Every interval is measurable with O(i) = I(i) = m(i), its initially given measure.
Also, we see m 17. If Mn are disjoint measurable sets, then E M, is measurable and m(ZMn) = Emr(M,). For, Em(M.) = I (M,)<I( Mn) <O(M,) < 0O(M,)= Em (M,), and I(EMn,) = O(EM) = E (Mn).
As a corollary we get m 18. Every set S = i of J is measurable and its measure as such coincides with that previously defined. If M is measurable, I(M') = 1-O(M) = 1-I(M) = 1-(1-O(M')) O(M'), and so m 19. The complement of a measurable set is measurable and m(M) + m(M') = 1.
Finally , we have to prove m 20. A sum of measurable sets, disjoint or not, is measurable. Let M1, M 2 be measurable, and hence also M,, M2. Fix e > 0. Then there exist sets S1,S2, T1 ,T2 in J such that S1 DMl,S2 D M2 , T1D MI, T 2 D M2 and m(Si) < O(MI)+e; m(Ti) < 0(Ml)+e m (S2) < O(M2) + e; m (T2) < 0 (M2) + e.
Now S1 + TI D M1+ M' and S2+T 2 D M 2+ M2 so S1 +T = F = S2 + T2 . But 1 + m(SlTl) = m(Si + T1)+ mT m(iTi) = i(S 1) + m(Ti) < O(M1 ) + e + 0(Ml) + e =m(Mi) + m(M{) + 2e = 1 + 2e. Thus m(SiTl) < 2e and similarly m(S2T2) < 2e.
But M1 + M 2C S1 + S2 and (M1 + M2)' = Ml,M c TiT2, so F (M1 + M2) + (Mi + M2 )' c S1+ S2 + TiT2 . Moreover, from the first two inclusions, 0(M1+M 2) < m(S1 +S2 ) and I(M1+M 2) = 1-O(M +M2)' > 1-m(TiT2). So, 0(M1+ M2)-I(Mi + M 2)<m(Sl + S2)+m(TiT2)-1 = m(S1+S2+TlT2)+m( (S1 + S2)TiT2)-1 = m (F)+m( (S + S2) TT2)-1 = m(S1TiT2+ S2TIT2)<mi(STl + S2T2)<m(SiTi) + m(S2T2) <4e. Thus O(M1+M2) <I(M1+M2) and M1+M2 is measurable. Accordingly, so is every sum of a finite number of measurable sets.
But we see now that a sum of a countable number of measurable sets E M, = M1 +MM2-+ (Ml+ M2)'M3- ... is measurable by m 17 and the fact that each of the latter summands (M1+ . + Mn_l)'Mn (Ml+ ... +M,-i + M,)' is measurable by m 19 and the preliminary results for finite sums.
II—
The Set of Graphs
1. Consider t types of particles, such that a particle of type i may produce, upon transformation, an arbitrary number jl + .... + t > 0 of such particles, of which js are of type s, s = 1,...,t. We suppose that transformation times are the same for each type, and hence that generations may be counted unambiguously. We agree to consider zerogeneration as consisting of one particle of a fixed type i, and then consider the set Fi of all conceivable genealogies or histories proceeding from it, that is, the infinite record of the transformations of this particle and of all its progeny through all generations k = 0, 1, 2,...
2. We may represent such a genealogy in the plane by a graph or lattice if we agree on the following conventions: (a) A particle of type i in the k-th generation is represented by a number i in the k-th row.
(b) If a particle of type i in generation k is transformed into no particle, that is, if it dies in generation k + 1, this is so indicated by a sequence of zeros proceeding from it to the (k + 1)st and thence successively to all lower rows, thus: (i) row k I (0) row k + 1 (0) row k + 2 I * • * • * • (c) If a particle of type i in generation k is transformed into jl + .. jt > 0 particles, j, of type s, in the (k + 1)st generation, this is indicated by a branching from the corresponding number i in the k-th row into a group of jl + ... + jt numbers in row k + 1,js being the quantity of numbers s, identical numbers being grouped consecutively, and different numbers ranging from left to right in increasing order.
Thus (1) /\ (1) (1) (2) indicates that a particle of type I was transformed into two particles of type 1 and one particle of type 2. Note that the two 1's represent different particles, so that the events (1) and (1) (1) (1) (2) (1) (1) (2) I I I I I (0) (2) (0) (2) (0) (0) are counted as different.
Consideration shows that the set Fi of all genealogies is uniquely represented by the set of all such graphs z in the plane, and we speak hereafter of the set Fr of all graphs z. We will not change the type of the zero-generation ancestor during the subsequent discussion.
3. The set ri has at least the power of the continuum. For to every sequence of O's and l's {an}, we may make correspond a particular graph (there are many) which contains a total of one particle in generation n if a, is 0 and a total of two particles in generation n if an is 1. This correspondence is one-one on the set of all sequences {a(,} which has the power of the continuum to a subset of the set Fi. That the set ri has power less than or equal to that of the continuum may be seen directly or from a topological result which we wish to establish anyway in the next section. It will thus appear that Fi has power of the continuum.
4. If z is a graph, Zn denotes its upper segment from generation O through n, both inclusive. Thus, if z = z', then Zn = zn for all n, whereas, if z 5 z', there exists a first integer k = k(z,z') such that Zk / Zk. Since z0 = (i) for all graphs z of ri, the integer k(z, z') > 1.
5. Suppose z,z',z" are all different, where m = k(z,z')>n= k(z', z"). Then Z,n- =z'_l = Z_, and k(z, z") >n. Hence we have the
Lemma. If z, z', z" are three different graphs, then k(z, z")> min (k(z, z'), k(z, z")).
6. A graph z is said to terminate in case it contains no particles (only O's) in some generation. We can write the set T of all terminating graphs in the form of a disjoint sum T = ToT 1T 2 +...
where Tn is the set of all graphs yn which contain at least one particle in all generations through the n-th and die completely in generation n + 1. The set of To consists of the single graph (i) I (0) I (0)
We prove the Lemma. The sets T, T2 ,... are all countable and thus so is T.
Proof. Induct on n of T,. First, T, is countable, since it is in oneone correspondence with the countable set of all integer-component vectors (jl,...-,t), js> 0, Ejs > 0. Suppose Tn is countable. Every graph of Tn+ 1 is the continuation to one more "live" generation of the section yn of some yn of Tn. This serves to split Tn+1 into a countable (since Tn is) number of subsets S, all y"+l in a fixed S being continuations of the same yn of Tn to one more live generation n +1, followed by death. Now the n-th generation of yn contains only a finite number of particles each of which can branch into the (n + 1)st generation in only a countable number of ways. Hence each subset S is countable and so is Tn+l.
7. We developed in I an abstract theory of measure, which we are eventually to apply to the set Fi of graphs z. To this end we define intervals in Fi as follows. By an interval of order n(n = 0,1,...) is meant the set i(yn ) of all graphs z such that z, = yn, where yn is some terminating graph of Tn. Note that the interval of order 0 is Fi itself. By an interval we mean an interval of order n, or a single graph z, terminating or not, or the null set 0. Define J as the class of all sets S = i which are the sums of an at most countable number of intervals, just as in I. We now have to verify the axioms I 1, 2, 3.
8. We have in the present case the simple and unusual situation that, if i and j are intervals then they are either disjoint or one is contained in the other, hence ij = 0 or i or j and I 3 follows. For suppose ij 5 0 so that some z is in both. If i or j is a single graph it must be z itself and hence lies in the other interval.
Suppose then i = i(ym), j = i(qn), with m < n. Then y =- Zm = y. Let z' be in i(pn). Then z = Y m =Ym and z' is in i(ym). Thus i(ym) D i(yn). Moreover, we see that if m =n,i(ym) = i(yn).
9. Suppose S= E i(yn) + E z + E 0 is any sum of intervals where the i(yn ) are summed according to increasing n. Without altering the sum S we may successively (a) delete the 0 summands, (b) delete duplicate z's, (c) delete duplicate i(yy)'s, (d) delete z's contained in the remaining i(yn)'s, (e) delete every i(yn) which intersects a preceding i(ym) with m < n, since then the former is contained in the latter. The resulting summands are now disjoint with sum S, and we have I 1. Moreover, and unusually, the final summands are a subclass of those originally given.
10. We have to show now (I 2) that the complement of an interval is in J. (a) If i = 0, i' = ri = i(y). (b) If i = i(y ° ) = Fi, i' = 0, an interval by definition.
(c) If i = i(yn), where ny is in Tn and n > 1, we may write the disjoint countable sum n-1 ri= E yk + S i(yn) k=OykeTkyneTn
since every graph z either terminates at some generation less than or equal to n, or lives through generation n and is thus in some i(yn ) with y, in Tn. The complement of i(yn) is manifestly in J.
(d) If i = y-l, where n - 1 > 0, the above decomposition of ri shows that i' is in J.
(e) If i is a non-terminating graph z, define y" in Tn so yn = Zn, n = 0, 1,.... We claim that z = H1i("n). Clearly z is in every i(y"). Moreover if z' is in i(yn) for all n, then zn= yn = Zn and z' = z. It follows that z' = - (i(yn ))', the summands being in J by b,c. Hence z' is in J.
11. We saw in I that if S = ii + . + in then S' is in J. This may be false in our case for countable sums. For example, if T = E yn is the set of all terminating graphs, then T' is the set of all non-terminating graphs. The latter cannot be a sum of a countable or finite number of intervals, for if T' = i i(yn) + z, we have T' with power of the continuum, the z summands are at most countable, hence at least one i(yn) must occur in the T' sum. But then yn is in T', a contracdiction.
12. For use in IV we include the
Lemma. An interval of finite order cannot be expressed as a finite sum of two or more disjoint non-null intervals.
For suppose i("n) =i(ym) 4- z is such a finite sum. Let T be the set of all terminating yn+1 such that n+l = ". Of these there are countably many, all in i(fn). Since the z's above are finite in number, an infinite subset T of T must be in the i(ym); if such a yn+1 is in i(ym) we must have n + I > m since yn+l = ym and ym is alive in generation m. But since y"m is in i($n), we have also m >n. Moreover if m = n, i(yn) = i(ym) and we should have only one summand above. Hence m = n + 1 and yn+l = ym. So to each yn+l of the infinite set T
corresponds a summand i(y"). The correspondence being one-one a contradiction arises.
Corollary. If El in D j, where i,j are intervals then j is in some in.
If j is 0 or a single graph, this is trivial. Let j be an interval of finite order. Then j = j(Zin) = EN(jin), where jin is an interval. By the preceding result, all jin = 0 except one, so (say) j = jil c il.
13. Let i(yn ) be written as the disjoint sum i(gn) = yn+ .i(yn+l) where yn+l ranges over the set T of the preceding paragraph. Then if j is an interval properly contained in i(ny), it must be contained in one of the summands. We need only consider the case j = i(ym). Since ym is in i(yn),ym must be a continuation of yn, and since ym : yn,ym must be a continuation of some yn+1 of T. Thus ym is in some i(-n+1) and so is j.
14. If we are given an arbitrary finite class N of disjoint intervals i (ym) and yP, there exists a disjoint decomposition of Fi into such intervals among which appear all those of the given class N. First write ri = yo+i(y') (1) y'ET1
We leave unaltered all i(y') which occur in the class N. All others we decompose further thus: i(y') = y' Ei(y2 ) where y2 ranges over all graphs of T2 which are continuations of y', and substitute the results into (1). Again we retain all i(y2 ) occurring in N and decompose all other i(y 2 ) one more step as indicated. This procedure eventually obtains a disjoint sum of the desired sort including all i (ym) of N as summands. The yP of N, being disjoint from the given i (ym) must occur in one of the other intervals of the sum before us. Each such interval may be decomposed until the contained yP's are expressed as summands.
15. If Z i(yn ) D i(ym), then the latter is contained in one of the summands. For ym is some i(yn ) and hence, so is i(ym).
III—
The Space of Graphs
1. On ri we define a metric or distance function as follows: If z = z', d(z,z') = 0, whereas, if z - z', d(z,z') = l/k(z,z') where k(z,z') is the first integer k(> 1) for which zk # zk. It is clear that d satisfies the axioms for a metric:
(A) d(z,z')> 0, for if z = z', d = 0, while if z 5 z', d > 0. Hence also d(z, z') = 0 if and only if z = z'.
(B) d(z, z') =d(z', z) since k(z, z') is symmetrically defined.
(C) d(z, z")< max (d(z, z'), d(z', z"). If any two of the three graphs involved are equal, this inequality is trivial. If all three graphs are different, C follows from the lemma of II 5. This is indeed stronger than the customary triangle inequality: d(z, z") <d(z, z') + d(z', z").
2. If z and z' are graphs, the following are equivalent (a) d(z, z') < e. (b) z = z' or z - z' and k(z, z') > 1/e. (c) z = z' or z fz' and k(z, z')> [1/e] + 1,
where the notation [1/e] indicates the greatest integer in 1/e. (d) Zn = z' where n = [l/e].
To each graph z and real positive number e > 0 we assign the e-neighborhood Ne(z) of z, namely the set of all z' such that d(z, z') < 0. Thus Ne(z) consists of all graphs z' such that z4 = Zn where n = [l/e]. It is clear that if z' is in Ne(z) then Ne(z') = Ne(z).
3. A sequence of graphs {z() } converges to limit z in case d(z(n), z) 0, that is, for every e > 0 there exists an N such that for all n >N, z() coincides with z through generation [l/e].
4. The space Fi is complete in the sense that every sequence (z(n) such that d(z(n), z()) O 0 has a sequential limit z. For the given condition implies that for every n = 1, 2,... there is an N(n) increasing with n such that Zn = z,, m >N(n). From this one sees that the subsequence zN(n), = 1,2,... has the property zn() = z(n+P) and hence defines a graph z such that Zn,- zn which is the limit of the subsequence and hence of the original sequence.
5. Every Ne(z) is a closed set. For let z' = limz() where z() Ne(z). Then for n sufficiently large, z() coincides with z' through generation [1/e]. But z() is in Ne(z), hence coincides also with z through generation [l/e]. Thus z' coincides with z through [1/e] and is in Ne(z). Hence the space Fi is zero-dimensional, every neighborhood being both open and closed.4
6. The space ri is not compact. For example, the sequence consisting of an arbitary sequential ordering of the countable set of graphs y' or T 1 can have no convergent subsequence, since no two y' have the same section y'. For the same reason, no neighborhood Ne(z) of a graph z which is alive in generation [l/e] is compact, and therefore the space Fi is not even locally compact.
7. A space is said to satisfy the second axiom of countability in case there exists an at most countable set C of neighborhoods of the original system such that for every z and N,(z) there is a neighborhood N of the system C such that z e N C Ne(z). Our space Fi satisfies this axiom in the trivial sense that the whole set of original neighborhoods is itself countable. Consider an arbitrary Ne(z). If [1/e] = 0, then Ne(z) = Fi, and there is only one such neighborhood. If [1/e] = n> 1, clearly Ne(z) = N (z). In this case define a terminating graph x so that xn = Zn and x has only 0's in generation n + 1. Then N,(z) =N,(x), where x is in T. Since T is countable, so is the entire neighborhood system.
Note that it is not implied in the above that x is in Tn. If z is dead in generation n., x = z will be in some Tm with m <n, and Ne(z) will consist of z alone.
8. It is well known5 that a space satisfying the second axiom of countability has at most the power of the continuum. The argument
rests on assigning to every z the class of all neighborhoods in C which contain z. This is one-one (by the separation axiom) on Fi to some subsets of a countable set. The class of all subsets of a countable set has power of the continuum so the power of ri cannot exceed this. Combining with the opposite inequality obtained earlier, we see that ri has the power of the continuum.
9. A word may be said about the relation of intervals to neighborhoods. Every neighborhood is either an interval of finite order or a terminating graph, hence an interval. Every interval of order n is either ri = N (y°) (when n = 0) or a NL (yn) (when n> 1) hence a neighborhood. Every terminating graph is a neighborhood, e.g., yn = N i (y), but a non-terminating graph cannot be realized as a neighborhood, and of course, neither can the null-set.
IV—
Measure in the Space of Graphs
1. Whereas the notions of the set Fi, its intervals, and its topology are intrinsic in character, depending only on the number t of types of particles considered, we may establish a measure for intervals, and hence for the class of measurable sets in various ways. We are concerned at present only with the following procedure. As in I 3, we consider a fixed generating transformation G(x) of the unit cube It of t-space: gi(2x1. rXt) = Pl(i;jl,. i -.,jt)xjl ...xJ i= 1,...,t,
defining the probabilities of transmutation pl(i;j) of a particle of type i into jl+ ... + jt> 0 particles, js of type s,s = 1,..., ,t. Then every finite upper section Zn of a graph z of ri has an associated probability P(zn) that the event defined by Zn should occur.
For example, if G(x) is g 1(xI,X 2) =I + Ix1+1XlX2 4 4 1 100 92(Xl,X2) = 2 + 4X2 + iXlX2,
and zo = (1); zl =(l); 2 =(1) I / \ (2) (1) (2) l l (0) (2) we have p(zo) = 1,p(z1) = 0, and p(z2) = 1/4(1/2 1/4) = 1/32.
2. We assign a G-measure to intervals of Fi thus: (a) m(0)= 0, (b) m(z)= limp(zn), (c) m(i(y(n)) =p(yn).
Note that m(z) may be non-zero, for example for a terminating yn~, m(yr) = p(y±+i), and m(i(yn)) may be zero. Clearly ml, m2 of I are satisfied and it remains to prove m 3. This is non-classical because of the non-local compactness of the space ri and we divide the proof into a number of steps. Since m(0) = O, we have only to consider the case i(yn) = Zi(yn) + Z of a countable number of summands (see II 12).
3. First suppose i D jl + ...-j, the j's being disjoint and nonnull. Then m(i)> Em(j). Proof is by induction on s. If s = 1, we have the case i D j. If j has finite order, so does i and we write i(ym) D i(yn) where necessarily n >m. Then m(i(yn)) = p(yn) <p(yn) = p(ym) = m(i(ym)). If j is a graph z, the only case to consider is i = i(yT.) D z. Then m(z) = limp(zn)<p(Zm) = p(ym) = m(i(y"7)). Now suppose the theorem true for <s- 1 summands and consider i(yn) :DJi + ...+ is with s> 2. Write i(yn) = yn + Ei(yn+l), where m(i(y")) = m(yn) + m(i(yn+l1 )). Then since s > 2, each j is properly contained in i(yn) and is therefore in one of its summands. If not all j are in the same summand, we invoke the induction assumption and obtain the result. If all j are in the same summand i(yn+l ) we decompose it and repeat the argument. Eventually the j's must be split between two summands, since the intersection of a nest i(yn) D i(yn+l) D i(yn+2 ) D ... is a single point. Hence eventually we obtain
the result from the induction hypothesis and the fact that m(i(yn))>m(i(ynP)) = m (yn+P) + E m(i(yn+P+l)).
4. Hence we see that i = = j implies m(i) >m(j). The opposite inequality remains to be proved. Because of non-local compactness we need the lemma of the next section.
5. Lemma. For every e > 0 there exists a set Y of disjoint intervals i(y") such that (a) Em(i(yn)) < e, (b) Ce (i(y")) is closed and compact.
Denote by s, an arbitrary finite graph from generation 0 through n. Since 1 = Ep(sl) where si ranges over all finite graphs of order 1, we have Ep(si) < e/2 for all but a finite number N 1 of si. The intervals or order 1 corresponding to the former sl we throw into y.
Consider for each of the N 1 remaining sl all of its continuations s12. Then p(Sl) = -P(S12) and we have Ep(s12) < e/2 2 Ni for all but some finite number of s12. Collecting the former s12 for all s1 of the finite set, we have E p(S12) < e/22 . The corresponding intervals of order 2 we throw into y, and retain the remaining finite number of s12. We proceed in this way to obtain the set y with E m(i(y")) < e/2 + e/22 + ... = e, at each stage retaining a finite set of finite graphs s12...,. We see now the Ce is compact, since an infinite set of graphs z in Ce may be thrown into a finite number of disjoint classes according to the agreement of their segments z1 with the retained sl. At least one class must be infinite. The latter class is further subdivided into a finite number of subclasses according to z2, and so on. The process defines a graph z with segments sl, 12,S123, which is clearly a limit point of the given infinite set of z's. Thus Ce is compact. It is manifestly closed since it is a complement of an open set. (Recall that every interval of finite order is a neighborhood, hence an open set, and the sum of open sets is open).
6. We saw in 4 that if = Ej then rn(i)>]m(j). We have now to prove the opposite inequality. We do this first for the case where i = ri, that is, we suppose Fi = j, explicitly
ri = Ei(YT) + P Z(V)_j
the z",v = 1,2,... being non-terminating. Fix e > 0. Since m(z()) limn p(z,() ) and the latter sequence is monotone non-increasing in n, we can fix N(v) such that p(z(,)^)) < m(z(v)) +e/2". Define yN(v) so that it is in 7Tv() and N(v) = Z() Then m(i(yN(v))) = (N(V)) =p(z ) < YN(u) N (v)' rn(z(V)) + e/2" and z() is in i(yN(v)). For the given e > 0 we define the set Ce as in 5, and we have ce c ri c c Zi(ym) + -yP+ E i(yN(v)).
Now we saw in III 9, that every interval i(ym) of order m is a neighborhood of our space namely N1/mr(y"'), and every terminating graph yP is a neighborhood, namely N1/(p+l)(yP). Hence we have the closed compact set Ce covered by a countable class of open sets. A well known topological theorem tells us that Ce is covered by a finite sub-class of the original open sets, i.e., ce CEi(yi) + EYP + Ei(YN(v))
the primed sigmas indicating summation over a sub-class of the original summands. Moreover we know that every sum of intervals can be expressed as a disjoint suml of a subclass of the original intervals so that we have C, C cli(y) + Z'_yp+ti(yN(')). The latter sum we call S.
Since there are only a finite number of intervals in S, consisting exclusively of either intervals of finite order or terminating graphs, we may decompose Fr into a disjoint sum of such intervals, among which will be those of S (see II 14.): ri = S+E y-P+i(y-) and such that 1 = E) y m(i(+Ym))+ E m(yP)+Z r(i(yN(' )))+ Cl(yP)+m (i( yP)) .
Taking complements in the preceding inclusion we obtain Ce D S' that is, E i(y) D Pi(m) y
Now each summand on the right is contained in one on the left (II 15) and thus using the results of (5) and (4) e> m( i(y"))> E m(y) + m(i(m)) y = 1-(EZm(i(ym)) + m(yP) + E m(i(yN()))> 1-(Em((i(y )) + ETm(y P) + Em(i(yN('))))> 1-(,m(i(ym)) + E± m(yP)+ ± m(zV) + e) = 1-m ~ m(j)-e.
So, rm(j)> 1 - 2e for every e > 0 and m(j)> 1, as was to be proved. From (4) we therefore see that ri = 7 j implies 1 = m(j).
7. Finally, suppose i(y n ) = Ej. Then we may decompose Fi into a sum of disjoint intervals (see II 14) in two ways: Fi = i(yn) + k =>j+k. By (6) we know that 1 = m(i(yn)) + Zm(k) and also 1 = Em(j) + rm(k), so that m(i(yn)) = m(j). This completes the proof of property m 3 of I.
8. Hence we see that every generating transformation G defines a G-measure for intervals of Fi, and thus a measure for the additive class of measurable sets as indicated in I.
V—
m(T) = x[superscript(0)subscript(i)]
1. Let T be the set of all terminating graphs y. It is trivial that T is measurable since it may be regarded as the sum of its countably many points each of which is an interval. Now pk(i; 0) is the probability of death in the k-th generation and as such represents the sum of the measures of all graphs in the set To +... + Tk_l. It follows that m(T) = Yk=0yCETkrm(y) = limk Fy-Tk m(y) = limkpk(i;0) = I°. Thus we have
Theorem. The G-measure of the set T of all terminating graphs of Fi is x°, the i-th component of the death-fixed-point x° of G. Hence, in a subcritical system (x° = 1), almost all graphs terminate.
VI—
A Strong Ratio Theorem for Supercritical Systems
1. Let G be the generating transformation: gi(x) = Epi(i;j)xj for a supercritical system with first moment matrix M = [mij]. We recall that mij = (Ogi/Oxj)1 and the maximal positive characteristic root r of M is greater than unity, possessing a unique left characteristic vector v with positive components vi, and having norm llvll = 1.
By the e-cone Ce of the system (e > 0) is meant the vector j = 0 together with all vectors jsuch thatlij/a-vll < efor some positive a.
We proved in I 3, a weak ratio theorem to the effect that, for every e > 0, f > 0, there exists a K such that, for every k>K, E Pk(i;j) < f
2. We prove in this section a much stronger result. Consider the set of all graphs z of li. Let T be the set of all terminating graphs y of Fi. For an arbitrary graph z denote by j(zk) the vector j = (i,..,jt) whose component js equals the number of particles of type s in the k-th generation of z. Let L be the set of all non-terminating graphs z such that limk ray j(Zk) = ray v. Precisely, L consists of all nonterminating z such that, for every e > 0, there exists K such that, for all k>K, j(zk) is in Ce. Define N = Fr - (T + L). We thus have Fi=T+L+N
where N consists of all non-terminating z for which e > 0 exists so that for all K, there exists a k>K for which j(zk) is outside Ce. We may now state the
Strong Ratio Theorem (SRT). For a super-critical system, the set N of non-terminating graphs of Fr which do not approach the ray v is measurable and m(N) = 0. Hence L is measurable and m(L) = 1- x°.
The proof is difficult and we may proceed by means of lemmas 1-20.
Lemma 1. The set N may be decomposed into subsets: N = N1 + N2 + ... where Nn is the set of all non-terminating graphs such that for every K, there is a k>K such that j(Zk) is outside C 1/n.
Clearly every Nn is a subset of N. Moreover if z is in N, where, for every K there exists a k > K for which j(Zk) is outside Ce, then, defining n so that 1/n < ez, and noting that C/,, is in Cz, we see that z is in N.n
Lemma 2. To prove the SRT, it suffices to show that if Ne is the set of all non-terminating graphs such that for every K, a k>K exists for which i(zk) is outside Ce, then outer measure O(Ne) = 0.
For then O(N,) = 0 for every Nn of the decomposition of Lemma 1, and so O(N)< O(N,) = 0, whence N is measurable and m(N) = 0.
We must now adopt some notations: Let gf(x) = EpK(i;j)x' ... xj. Then g ) = K(g,..,g = P(i; j)g gt _ pK(; j)EPj (Jl) x' .. xt and so on. Generally, (X)EPK(i; j)Epj(j) ... Pj-1 (jS)xj ...Xt jjljs
Lemma 3. Let Ne be given as in Lemma 2. To prove O(Ne) = 0, it suffices to show that, for every f > 0 there exists a K such that E PK(i;j) + Pk(i; j) pj(jl) + iJCe iOo j i iC a C, pK(i;j) pj(jl))Pj (j 2 ) + ...< f j3oj30oj 2 Ce JECe jlec,
For, let Ne be fixed, and fix f > 0. Then, assuming such a K exists, we have for this K that every z in Ne has z(jk) not in Ce for some first k >K. Define yk in Tk so yk = Zk. Then z e i(yk). But the above sum is the sum of all m(i(yk)) for all yk so defined. Hence O(N0) <f for every f > 0, and so O(Ne) = 0.
It is the condition of Lemma 3 which makes clear the relation of the SRT with the weaker form.
Lemma 4. To prove m(Ne) = 0, it suffices to show that, for every f there is a K such that for all s> 1, SK,, pK(i;j) +YpK(i;j) E Pj(jl)+ ...+ jffCejoj1 jCe PK(i; j) E pj(jl) * ~ pjs_~ (js)<fj3ojl ?OjsCc iECe jlECe
Now, setting x = 1 in the form obtained for gK+s(), we see that 1 =pK(i; O) + E P(i;j) + E PK(i;j)>= gi (C), we s e e that pK(i; O)+ pK(i;j)+ E K(i;j) E p(jl)+ jce .-(J j C) E K(i;j) E P(jl) >... >pK(i;O)+ SK,s+ j3ojl1o j7Ce jl Ce pK(i;j) E Pj(jl)... E pj-l(jS) jio j I o js o jECe j eC-e jsECe
The latter expression we denote by IK,S.
Lemma 5. To prove O(Ne) = 0, it suffices to show that, for every f > 0 there exists a K such that, for all s> 1, IK,s > 1 -pK(i; 0) - f
For then Sk,s<1-pK(i;O)-IK,S< 1-pK(i;0)-(1-p K(i; )--f) = f.
We must now consider more closely than we have done hitherto the transformations T(n) = nM/s(nM) and S(n) = nM/((nMII, each of
which we regard as operating on the set of non-null vectors with nonnegative components. One sees easily that the k-th iterates are given by Tk(n) = nMk/s(nMk) and Sk(n) = nMk/lInMkll. We have already denoted by v the characteristic vector of norm 1, for which vM = rv, and for our present purposes we let v = v/s(v) so that s(v) = 1, and v)M = rv. Clearly Tk(v) = v and Sk(v) = v. The transformation T(n) is a mapping onto the plane s(x) = 1,S(n) is onto the sphere llxll = 1. Clearly Sk(n)/s(Sk(n)) = Tk(n) while Tk(n)/IITk(n)lI = Sk(n). We first note some trivial relations.
Lemma 6. If j has non-negative components, then llll <s(j) For ljl2 =jEj (EJ)2S 2 j)
Lemma 7. If j is an arbitrary vector, then lljll >ls(j)l/t>s(j)t. For lljll = ( j2)2 > l l, m = 1,...t, and tlljll > E IJm I > I I Is(j)l > s(j).
Lemma 8. If x and y are arbitrary vectors, ( xiyi)2 < 11xll2 I11y2 . This is the well-known Schwarz inequality.2
Lemma 9. If x is an arbitrary vector, then llxMjl<IxlIl(Eij mi .)2 --X IIXi M. 2 2 For, [ [xMII2 = Ej(Eiximij)2 < EjEiXi iT Ij- II22 We have used the Schwarz inequality here.
Lemma 10. If n is a vector with non-negative components, then \lnMll> llnll( min Ej m'2.) 2 -= \nlla.l For IInM2 = Ej (Einimij)2 >Er_jEii = Ein E,m >Ei min Ej m = lin 112. .
We have now the means of establishing the existence of a Lipschitz constant for the transformation S(n).
Lemma 11. There exists a constant L > 1 such that IlS(n) - S(n')ll<Llln-n'll for all vectors n, n' which are non-null and have non-negative components, lln'll being unity.
For IIS(n) - S(n')ll = \lnM/lInMil - n'M/Iln'MIIll<nM/InMI\nM - nM/II II llnMIIn'Mlll + IMnMIIIn'M -n'M/ln'M = 1l/l1nMIl - 1/lln'Mlll · IlnMII + lInM - n'Mil/lln'Mi = \llnMIJ - lln'MI\I/ln'MII + IlnM - n'Mll/lln'MII<2))(n - n')MIll/ln'MIJ< 2arlln - n'll/lln'llai, by Lemmas 9, 10. Now Iln'll = 1. Hence L - 2a/a1 serves.
Lemma 12. If n and n' are non-null, non-negative component vectors and lin'll = 1, then llSk(n)-Sk(n')l<LIln - n'll, k = 1,2,.... For JlS2 (n)-S2 (n')ll < LIIS(n)-S(n')l<L2l\n-n'll, and so on. Note that IISk(n')ll = 1.
The transformation T(n) has topological and algebraic advantages since its denominator is a linear functional (see I 3). However S(n) has simplicity from the point of view of norm inequalities. We work with whichever seems more adaptable to the current purpose. The following two lemmas give inequalities connecting the two operators.
Lemma 13. Let j and k be vectors with non-negative components and having s(j) = 1= s(k). Define j' = j/lljll and k' = k/llkll as their projections on the unit sphere. Then lil' - k'Il <2tllj - kJl.
For l--k'll - llJ/lIJi - lk/\\jll\\<J/llJIl -J/llkllll+lJ/llkll - k/llki\i = il/llJll - I/llkli) llJ ll + llJ - kl\/lIkll = llJlll- llkll/llkll + 11i - k\ll/kl < 21]j - kll/llkll <2tllj - kll/s(k) = 2tllj - kll. Here we have used Lemma 7.
Lemma 14. Let j' and k' be vectors with non-negative components and 1li'll = 1 = 1lk'1l. Define j = j'/s(j'), k = k'/s(k'), their projections on the unit plane s(x) = 1. Then llj - kll<(t + 1)flj' - k'll.
For llj-kll = lj'/s(j') - k'/s(k')|<\\j'/s(j') - j'/s(k')ll + llj'/s(k')- k'/s(k')\\ = Il/s(j') - 1/s(k')l I '11/1 + lli'11 + Ii'-k'll/s(k') = Is(j') - s(k') /s(j')s(k') + iJ' - k'll/s(k') = s(j'-k') /s(j')s(k') + llj' - k'll/s(k') But s(k')> Ilk'll = 1, s(j')> Ill'll = 1 by Lemma 6, and Is(j' - k')<tllj' - k'll, by Lemma 7.
Lemma 15. For every e > 0 there exists a K such that for all k>K and all non-null, non-negative component vectors n, lSk(n)-v- < .
We saw in 1 3, that for e/2t > 0 there exists a K such that for all k > K and all non-null non-negative component vectors n, IITk(n)-—vI < e/2t. Referring to Lemma 13, we see that Tk(n) and v are on the plane s(x) = 1, and are vectors with non-negative components. Moreover Sk(n) = Tk(n)/llTk(n)ll and v = v/ivill. Hence I{Sk(n)- vil<2tllTk(n)vl < e.
Thus the sequence of the transformations Sk(n) is uniformly convergent to v.
We saw in Lemma 11 that IIS(n) - S(n')\\ <Llin - n'll where L > 1. Examples show that S(n) need not be a contraction in the sense that the above relation holds for some L < 1. We can however show that there exists an iterate Sq(n) which is a contraction toward v. It seems simpler first to establish the analogous result for T(n).
We recall that the simplex {, . .., t} means the set of all vectors n = nij6 i , ni > 0, nrii = 1, that is, all vectors n = (nl,...,nt) with non-negative components and s(n) = 1. By the boundary of the simplex we mean all of its points having at least one component zero. Since vM = rv, s(v) = 1, and M has all elements positive, v can have no component zero and is therefore not a boundary point. One sees therefore that there is a minimum distance D > 0 from v to the boundary of the simplex {1, ..., t }.
Lemma 16. For every f > 0 there exists a Q such that for all q >Q, \\Tq(n) -v\< flln-vll for all vectors n with non-negative components and having s(n) = 1.
Proof. Let V = maxvi/minvii > 1. Let D = min {Ilb-vl} > 0 where b ranges over the boundary of the simplex {6, .. ., f}.Iff be given, we know that for the positive constant fD/Vt there exists a Q such that for all q > Q, TTq(n) - vI < fD/Vt for all non-zero n with non-negative components.
Now let n v be an arbitrary such vector with s(n) = 1. The halfray v + a(n - ), a > 0, cuts the boundary of the simplex {, . .., t} in a unique point b, and n may thus be written n= hv+kb,h>O, k>0, h+k=l1.
Thus for q>Q, \\Tq(n)- vIl - \\nMq/s(nMq ) - v1 = IlhivM + kbMq/hs(vMq) + ks(bMq) - v = hrqv+ kb hr + ks(bMq) - I = k IbMq - s(bMq)v I/hrq + ks(bMq ) = ks(bMq) I Tq(b) - v\l/hrq + ks(bMq ) <ks(bMq)fj[b - v\l/Vt(hrq + ks(bMq)) = fllkb - kOvl/Vt(hrq/s(bMq ) + k).
Now (lkb-kIvI = Ilkb-(l-h)vl( = (hv +kb-vll = Iln-vll. Also, note that min i )i miq < Ei viimq = rqvj< rq maxv, and hence Ei (mq)2 < (Ei mi)2 < r2qV2 and (E (r.j)2 ) < rqV. Thus s(bM) = j i bimt < E Ilbll(i (mEij)1) ) Ej (Ei (m )2) 2 <trqV.
Hence hrq/s(bMq)+k>h/tV+k = 1-k+ktV/tV = l+k(tV-l)/tV>l/tV. Thus we see that [[Tq(n) - v\I<flln - vIl.
Lemma 17. For every e > 0, there exists a Q such that for all q>Q, \ISq(n')-v\l<elln' -vll, for all n having non-negative components and lin'll = 1.
Fix e > 0. Define f = e/2t(t + 1). Then by Lemma 16, Q exists so that for all q>Q, ITq(n) - vll<elln - vl\/2t(t + 1) for all n with non-negative components and s(n) = 1.
For arbitrary n' with Iln'll = 1 and non-negative components, define n = n'/s(n'). Then we have \Sq(n')-v\\<2tllTq(n')-vl<2t\lTq(n)-vl<2telln-v\ll/2t(t+1) <e(t+l)lln'-vll/t+1 = elln'-vll. We have used Lemmas 13, 14, and the fact that Tq(n') = n'Mq/s(n'Mq) = nMq/s(nMq) = Tq(n).
Just as the possible failure of S(n) to contract all vectors toward v forced us to prove Lemma 17, so does the possible failure of the
transformation nM to increase the norm of n cause difficulties for which we must provide in the next two lemmas.
Lemma 18. There exists a constant e > 0 such that for all vectors n satisfying Iln-vll < e, one has lInMII > lln11(l+r)/2 = mllnll, 1 < m < r.
For the function R(x) _ IlxMll/llxl( is continuous at x = v and R(v) = r > 1. Thus for m = (1 + r)/2 < r there exists an e > 0 such that ln - vll < e implies R(x)>m.
Lemma 19. There exists a K such that for all k > K, UnMkI> lnll for all n, with non-negative components.
Note that if w is the right characteristic root of M, with s(w) = 1, Mkw = rkw, we have max w >j mk > _jmk.wj = rkwi > rk minw so Ej mrj > rkW where W =minw/max w. Now |nMk l > s(nMk)/t= Eij nimk/t= Eini Ejm /t > rkWs(n)/t > rkWlInIlt. Take K so rKW/t > 1.
By virtue of Lemma 5, it now remains to prove the final lemma for which we have prepared all the essential tools.
Lemma 20. Let e > 0, f > 0 be fixed. Then there is a K such that, for all s > 1, E pK(i;j), P,(jl) E Pj- (jS)> 1 -pK(i;O) - f. jio jlio js»oji-Cejl Ce jS Ce
Our proof is based upon a complicated construction of K which was, of course, obtained by looking from the other end. In spite of its glaring artificiality we give the direct route:
1. By Lemma 18, we fix e so that 0 < e < e and lln - vll < e implies }lnMl >mrllnll where m = (1 + r)/2 > 1.
2. By Lemma 17, there exists a q such that $qSq(n')-vll < ln'-v11/4 for all non-negative n' with Iln'll = 1. We determine an e satisfying the following conditions: a. 0 < 5 < e < e, and e < 1. b. 5 Lq- l < e/2, that is e < e/2Lq- 1 where L is the Lipschitz constant of Lemma 11, and q is the constant just defined.
3. Fix E > 0 so that a. E(1+LL2 +...+ Lq- 1)</2, b. m(1 - E) > 1, that is E < (m - 1)/m < 1. Note that we now have automatically, for all s<q, E(1 + L + L2 + ...+ Ls-2) + e Ls-'<E(1 + L + L2 +... + Lq-2) + 5 Lq-1 < e/2 + e/2 < e/2 + e/2 = .
4. Fix k so that a. $Sk((n)- vI < e /2 for all non-negative n / 0 (see Lemma 15). b. [InMkfl> Ilnll, for all non-negative n. (see Lemma 19).
5. We define dk = Ej( Ei (dik) ) , dI being the dispersion of gk with respect to xj, and define rnk = (minEj (mi)2)½ .
We now fix A > 0 so a. 4dk/ 2emjA < f/4. b. dl/E2 m2(1- e )A < 1. c. dim(l - E)/E2 m2(1 - e )A(m(1 - E) - 1) < f/2.
6. Fix r so that pr(i; n) < f/4, see I 3. O<llnll<A
7. Fix K = r + k. We contend that this K satisfies the condition of the lemma. First note that gK(x) = EpK(i;j)x1 ... = gr (Gk(x)) = pr(i; n) Pk,n(j)xl .. .xt, where (gk) 1 ... (gtk)l = E Pk,n(j)xl ...xt by definition. Setting x = 1, we may write 1 = pr(i; 0) + pr(i;n) Pk,n(j) + O<llnll<A j pr(i; n) E Pk,n(j) + pr(i; n) E Pk,n(j) Ilnl>A llj/llnMklJJ-vIl<
Now just as in I 3, we see that, for arbitrary R > 0, Z Pk, (j)R2 < Pk,n(j)j -nMk11 = Znidj|j-NMk|>R ij lln ( (d4) ) 2 = -nlldk. J 2 Thus, Pk,(j)< llnlldk/R2 . Ilj-nMkl|>R Setting R = (InMkIe /2, we have 5 Pk,n(j)<411nldk/ e2 (lnMkl12 < 4(1nldk/ ( 11nl2m2k = Uj/ll|nMkIl-JSk(n)l> e /2 4dk/ 2m -lllnl, where mk = min ( n (m) )) (see Lemma 10).
However, if 11j/llnMkll - v||> then j is on the preceding range, for if not, \\j/llnMkl| - Sk(n) | < /2 and \Sk (n) -v | < 5 /2 by choice of k, whence 1ij/lJnMkJ1 - v|l< e, a contraction. Hence 5 Pk,n(j)< 4dk/2 mllInll. |ljlln/fMllk-vll> e Thus E pr(i;n)Pk,n(j) < pr(i; n) 4dk/ e 2m2A < Iln\l>A Ij/1\nMkll-v\I>e nll>A (f/4) E pr(i;n)<f/4. IlnII>A But by choice of r E pr(i; n)Pk,n(j)= p(i;n)<f/4. O<llnHJ<A j O<llnll<A
And, as always, pr(i; 0)< pr+k(i; 0). Thus we obtain from the original equation that Pr,(i; n).Pk,n(j) >1-pr+k(i;0) - f. nl>Aj/nMk-v< e
Every vector j involved on the range above has the property that ji/a - vii < 5 for some a > A, for lInMklI> llnll > A by choice of k. Thus Z pK(i;j)>1-PK(i; 0)-f lIj/a-v\l< e a>A Since 1lv l = 1 > e by choice of e, j = 0 is not on this range.
Note the two trivial remarks:
A. If llaj' - vii < e, a > 0, i'll = 1, then lij' - vll < 2e. For e > llaj' - vii > laj'll - ilvll l = a - 11, and lij' - vll = lij' - Jai' + Ilaj' - vii = I1 - al llj'll + e < 2e.
B. Corollary. If lij/a - vll < e and j' = j/lljll then (lj' - vll < 2e. Consider now the product pK(i; j) Pi(jl) Pjl(j) { \\j/a-v\\< e \\j1\\jMj-Sll(j)\<E\\jj/\\jjMjl-S(j,)\\<E a>A E p_j (jq-l) pE _Pq- (j) *- (*) \3jq-1lj~jq-2M\\-S(jq-2) ||<E || j"/\j"-1M\\-S(j"-) ||<E
Since e < 1, E < 1, and Ilvli =1 = Sll(j)ll[ , we see that no j or ji is ever zero on these ranges. Applying Lemma 12 we find that lljq/lljMq-IM_-S(jq-l) I< E, S(jq)-S2 (j -) I< LE, 11(jq-)S3 (jq-3) < L E ... Ilsq-2 (j2)_- sq-l (j)l) 11 < Lq-2E \Sq-l(jl) Sq(j)\\ < Lq-'E, and hence ljq/lljq-lMI - _Sq(j)ll <E(1 + L + ... + Lq- l)
But since IISq(j/(ljl)- vl < /jlJljl - vIl/4 < 2 /4 <-/2 (see remark B and definition of q) we have|jq/lljq-MII- v|l< E(1 + L + .. + Lq- l) + /2 < /2 + /2 = 5. (see definition of E). Similarly we shall find, working from 2q to q, that llj2q/llj2q-lM -vl\ < 5 and so on for all multiples of q. Note also, for s<q, Ijs-/lljS-2Mi - S(jS-2) II < E S(js- -) - S2(j-3 ) l < EL... SS-2(jl) - Ss-l(j)I < ELS-2 \S`-l(j) -v\ <eL-1 , so lljs-1 lIj~-2MI - v\\ < E(1 + L + ... + L-2 ) + t LS-l <
(see definition of E). Similarly all jnq+r on the non-q-multiple postions are in Ce. It is clear now that the ranges for j, j,.. .are all in Ce c Ce and exclude zero. Hence the product of Lemma 20 is greater than or equal to that of (*) carried to the corresponding place.
Now note the elementary result that if l j/a-k 1 < eo where IlkI = 1, we have a - lljll = Ilakll -lljll<1ljlll - Ilak]ll<llj - akll < aeo and hence ll > a(1 - eo).
We have seen that, on the ranges involved in the (*) product, all j and js are in Ce, indeed \lj/a - vll < 5 < e, and jI/lljs-1 Mll - vll < e, all s. By choice of e, ll(j/a)MIl> mllj/all and II(js/lljS-Mll)MII>ml (js/lljs-lMll) 11. But then lljMll >mlljll and \ljsM\\> m\jsll. Thus we see that
lij/a - vll < 5 implies 11jll > a(1 -e ) > A)(1 - ). llj/(jMll - S(j)| < E implies jllj> lljMll(l - E)>m(1 - E)jllj > m(l - E)A(1 - 5), 1j2 /llj1 Mll - S(j)11 < E implies fij2 >jMll (1 - E)>m(l - E)[ jl | > m2 (1 - E)2 A(1 - ),..., js/ij-l'MIJl-S(j'-)1l < E implies ljs >m"(1 - E)8 A( - ).
Letting C = A(1 - ) and R =m(1 - E) > 1 by choice of E, we have on the (*) ranges, llill > C, ljjll > CR, lij21 > CR2 ... ljs-l > CRs-.
Now regard the distribution defined by g .g.. t=Pj-l, (j")xxl . .. . We see as before, for arbitrary n > 0, E P,_l(js) N2 < P(jjs)I-2 -J- E j-ldl < JJjs-js-1MJ(>VNIv Jj-1. dl, and hence 1 Pj-_(js) 1- IIs-1lld1/ .N2 lljl-j»-lMl)<N
Setting N = Ijj -1MII E for all s>1 EPP- (is) > 1 - lj-ll dl/E2lljs-lMj2 > 1- Ij1-dl 1/E2 Iij-11 2 . m l = 1 - di/E2 m llj~l I. Beginning at the last (s)-th term, we have S Pjs- (js)> 1 - di/E2 m2CR-1 > 0 Pj.s-2(jS- 1) > 1 - di/E2 m2CR-2 > 0 1IjS-1 /1jS- 2MII- (js-2) II<E 5 P (j')> 1-d/E2 mC > . 11|/11j11jM-S(j)11<E
Note that d1/E2 m\2CRi < d1/E2 m2C =di/E2 m2(1- 5)A< 1 by choice of A. Hence the (*) product through s factors is greater than or equal to (1-pK(i; 0)-f/2) (1-di/E2 m2C) (1-di/E2 m2CR) ... (1-di/E2 m2CRS-1).
It is trivially proved by induction that if all pi and 1 -pi are positive, then (1-pl)(l-p2) ... (1-pn)>1 -p-p2 -.-Pn,. Hence the preceding product is greater than or equal to
- pK(i; 0) - f/2 - (di/E2 m2C) (1 + 1/R + /R2 + . .. + 1/R-1 ) > 1 - pK(i; 0) - f/2 - di/E2 mC(1 - 1/R) = 1 - pK(i; O) -f/2 - diR/E2m2C(R - 1) -pK(i; 0) - f/2- m( - )/ - - E/ )A(m(1 - E)- 1) > 1 - pK(i; 0) - f/2 - f/2 = 1 - pk(i; 0)- f, by choice of A.Q.E.D.
VII—
Remarks On Systems below Critical
1. For a subcritical system (x° = 1), we have seen in V that m(T) = x = 1 so that m(T') = 0, and since N c T', O(N)<O(T') = m(T') = 0. Thus, in such a system, it is trivial that m(N) = 0. Similarly L C T' and m(L) = 0 = 1- x°. Therefore the SRT is self-evident for subcritical systems.
References
I
1. P. Alexandroff, H. Hopf, Topologie, Berlin, 1935 2. G. Birkhoff, Lattice Theory, Am. Math. Soc. Colloq. Publ. XXV, New York, 1940. 3. D. Hawkins, S. Ulam, Theory of Multiplicative Processes, see Chapter 1. 4. H. Margenau, G. M. Murphy, Mathematics of Physics and Chemistry, Van Nostrand, 1943.
II
1. C. J. Everett, S. Ulam, Multiplicative Systems in Several Variables, I, (LA 693) see Part I. 2. D. Hawkins, S. Ulam, Theory of Multiplicative Processes, see Chapter 1.
III
1. H. Cramer, Mathematical Methods of Statistics, Princeton University Press, 1946.
2. C. J. Everett and J. R. Ryser, The Gram Matrix and Hadamard Theorem, Am. Math. Monthly, LIII, 1946.
3. C. J. Everett, S. Ulam, Multiplicative Systems in Several Variables, I and II.
4. W. Hurewicz and H. Wallman, Dimension Theory, Princeton University Press, 1941.
5. W. Sierpinski, Introduction to General Topology, University of Toronto Press, 1934.