16— The Notion of Complexity: With W. A. Beyer and M. L. Stein (LA-4822, December 1971)

## 16— The Notion of Complexity: With W. A. Beyer and M. L. Stein (LA-4822, December 1971)

This report is a study of complexity per se in certain algebraical systems. Much subsequent work seems to have been stimulated by these results. (Author's note.)

### Abstract

The notion of the arithmetic complexityInlof an integernis defined in terms of the minimum number of additions, multiplications, and exponentiations required to combine l's to formn.The value of Inis calculated forn<2 10 . nis called complicated if In > nlI for everyn1<n.Of the first 19 complicated numbers, 14 are prime. A conjecture about a relation between complexity and entropy is proposed. Some computations are presented to support this conjecture.

### I— Introduction

In this report we discuss notions of complexity in some algebraic structures. These notions are also applicable to more general combinatorial situations that perhaps lack any algebraic pattern in the classical sense. We concentrate on a few special cases for which we studied and calculated a special notion of complexity. Essentially, we examined a special notion of complexity for ordinary integers with a little excursion on such a notion for integers modulo a prime.

446

The notion of complexity, in our view, is separate from, though associated with the idea of the amount of information or entropy of a system. We mention briefly a possible axiomatic approach to defining a real number called complexity for elements of a set or of a class on which certain operations are performed. These could be binary operations; our set could be a set of integers, and the operations could be addition, multiplication, and exponentiation, for example. It is this case that was examined on a computing machine and to which most of this report is devoted.

Another case would be a class of subsets of a given set, with allowed operations being the Boolean operations of union and intersection or union and complementation. One could add other operations, for example, the direct product of sets and also projection. This would correspond to allowing quantifiers in our theory. One can study a notion of complexity for vectors in a countable space or even in the continuum. An important study would be that of a relative complexity; that is to say, complexity of elements or "expressions" when the complexity of certain symbols is normalized to 1. In what has been sometimes called "speculation" on constants in physical theories, for example, the whole art seems to depend on the success of attempts to define some known important numbers, e.g., the dimensionless ratios Mprotonp = 1836.11... Melectron and e2 = 137.1... hc

by use of only a few artificially introduced constants which should be as "simple" as possible. (cf. the attempts by Eddington1 and some very recent ones by Good2 and Wyler.3 )

Considered "genetically," a mathematical theory resembles a tree in that one obtains, from a given number of symbols corresponding to "variables" and from a number of allowed operations, expressions that elongate by branching. The simplifications and abbreviations may then reduce the length of the expressions.

One could try to define complexity in a mathematical structure by postulating certain of its properties, somewhat like postulating properties of a measure.

Let the structure, S, consist of elements x, y, . It may be finite or infinite. We have in the set S a number of, say, binary operations R 1, R 2,... Rn. We want to assign a number c(x)> 0 to each element

447

x of S and to each Ri (i = 1... n) so that the following properties should hold.

a. If z = Ri(x, y), then c(z) = c(Ri(x, y)) <c(x) + c(y) + c(Ri)i = 1...n. b. For each element z, if z = Rj (x, y), we should have for one case at least c(z) = c(x) + c(y) + c(Rj). c. c(xo) = c(xi) = ...c(xn) = 1 for some preassigned elements o..., Xn in S.

Needless to say, one can define analogous desiderata for the case in which the operations are more general than binary ones.

Obviously, in the case to which our exercise is devoted, these postulates are satisfied. Moreover, they define the complexity uniquely if, as must be the case in general, the complexity was normalized for some elements. (In our case, we assume the complexity of the integer 1 to be equal to 0.) We hope to study this notion more thoroughly for the more general case and also to perform experiments to determine complexity functions for the case in which S is a class of sets. Ultimately, one would wish to discuss the complexity of genetic codes and biological organisms quantitatively.

("Integer" always means a positive integer.)

### II— Arithmetic Complexity of Integers

The arithmetic complexity Inl of an integer n is defined as the fewest number of operators: +, x, x x (addition, multiplication, and exponentiation) which combine 1's to form n. Thus, 1)1 = 0; 121 = 1 since 2 = 1 + 1; and 151 = 4 since 5 = (1 +1) x x (1 + 1) + 1 and not fewer than four operators with 1's will form five. Obviously, for a and b integers, la + bl, labl, and labl are each not more than lal + lbl+ 1. For an infinity of integers n, the relation In + 11 = Inl + 1 holds.

For the purpose of calculating the complexity of some integers, all correct formulas (up to some number of operators) involving +, x, x x, and the number 1 were enumerated using parenthesis-free notation on a computer. It required one hour of computer time to enumerate the integers with complexity <6. Ralph Cooper made the following observation. Each correct formula involving n(>0) operators is the composition of two formulas, one formula with nl operators and one formula with n2 operators such that n = nl + n2 + 1. One generates the integers of complexity n by first generating tables of integers of

448

complexity <n. One partitions n - 1 into nl + n2 in all ways and combines the integers of complexity n1 with the integers of complexity n2 to produce integers of complexity not larger than n. This method is considerably more efficient than the previous method. Table I lists the complexity of all integers < 210.

From the above construction, one sees that an upper bound el(k) to £(k), the number of integers of complexity k, is given by the solution of k fl(k + 1) = l(j)kl(k - j) , j=o with f1(0) = 1. The solution to this equation is given by el(k)= k -1()2k which implies that 2ke(k)< + 0(2kk-5 /2 )

Two additional forms of complexity have been considered and calculated.

a. Complement Complexity. To make complexity symmetric in 0's and l's, we introduce a slightly different complexity, the complement complexity K(yln). Define the complement operation C by C(xln) =2n - 1 - x. K(yin) is defined as the fewest operations of addition, multiplication, exponentiation, and complementation that combine 1's to form y. In the count of operations, the first three are given the value 1 and the last is given the value 0. Thus K(yn) = K(2n - 1 - yln). Table II gives the values of K(yln) for y < 21 and n = 10.

b. Modulo a prime p Complexity. In addition to the operations of +, x, and x x, the operation of modp is allowed and is defined by modp(x) = x - p[x/p] where p is a fixed prime and [ ] denotes the greatest integer. Table III gives the modulo prime p = 137 complexity for integers < 137. Table IV gives the modulo prime p = 1009 complexity for integers < 1009.

449

TABLE I. Complexity of Integers < Complexity Integer

450

blank

451

blank

452

blank

453

TABLEII.ComplementComplexity ofIntegers < 210 Complement Complexity Integer

454

blank

455

blank

456

TABLE III. Modulo Prime p = 137 Complexity of Integers < 137 ('omplexity lIltegcr

457

TABLE IV. Modulo Prime p= 1009 Complexity of Integers < 1009 Comlplexity Integer

458

blank

459

blank

460

### III— Complicated Numbers

One defines n to be a complicated number if Inl>lnjl for every nl<n. The complicated numbers < 210 are 1, 2, 3, 4, 5, 7, 11,13, 21, 23, 41,43,71, 94, 139, 211, 215, 431, and 863. (Those in bold are also prime.) Obviously, there is an infinity of complicated numbers. We propose the following conjectures.

a. There exists K such that all complicated numbers K1> K are prime. b. Every sufficiently large integer n is the sum of k < log n complicated integers. c. There exists c such that every sufficiently large n satisfies Inl<c + ±/logn.

### IV— Complexity and Entropy

Kolmogorov4 '5 has introduced the notion of complexity of a finite string over a given alphabet. For simplicity, suppose the alphabet to be {0, 1}. Let A be an algorithm that transforms finite binary sequences into binary sequences. By an algorithm is meant any of the various equivalent concepts used in logic. For a binary string x, one defines the complexity by Mill e(p) KA(X) = A(p)= oo if no p exists such that A(p) = x,

where £(p) denotes the length of the binary string p. Analogously, one defines conditional complexity. Let A(p,x) be an algorithm defined from pairs of binary strings to binary strings. Put Min 4(p) KA (X) = A(p)=x oo if no p exists such that A(p) = x,

KA(ylx) is called the conditional complexity of y with respect to x. Kolmogorov regards complexity as analogous to entropy. We make the following conjecture.

461

Conjecture. Let a discrete binary information source S in the sense of Shannon6 be given with entropy H = -p log p-(1-p) log(1p) where probability (0) - p and probability (1) = l-p; 0 < p < 1. Let {x1,x2,... ,X2n} be the set of all binary strings of length n arranged in order of decreasing probability. Let k(n) be the least integer so that k(n) prob (xi) > r where 1/2 < r < 1. Then asymptotically for large n, k(n) H k(n) ZKA(Xi In). (1)

(In Eq. (1), KA should be normalized so that when p = 1/2, k(n) ( KA(xi In) = 1.)ik-1

In other words, the most likely sequences from A have complexity approximately equal to the entropy of S.

In order to test the conjecture expressed in Eq. (1), we replaced KA(xiln) by A = K(yln), where A is selected so that when p = 1/2, k(n) k(n)EZA= K(xi I n) = 1 i=1

Graphs of H1 = -plogp- (1 -p)log(1 -p) and H2=k( ) E AK(xi n) when n = 10 and r = .75 are shown in Fig. 1.

### V— Complexity of N -Tuples of Integers

Matijasevic7 has proved the following theorem. There exists a fifth-degree polynomial Q(yl,..., Yk; z) with integer coefficients such that any enumerable set m of natural numbers (for example, the set of prime numbers) coincides with the set of natural values of the polynomial Q(yi,..., Yk; am) where am is a certain number effectively constructed for the set m. From the result it follows that if one could

462

Fig.1. Comparison of entropy H1 = - —pilogpi and complement complexity H2 as defined and discussed in text.

discuss complexity of n-tuples of integers, then one could discuss the complexity of enumerable sets of natural numbers by equating such complexity to the complexity of the associated polynomial Q.

463

### References

1. A. S. Eddington, Fundamental Theory (Cambridge University Press, 1946).

2. I. J. Good, "The Proton and Neutron Masses and a Conjecture for the Gravitational Constant," Phys. Let. 33A, 383-384 (1970).

3. A. Wyler, "Les Groupes des Potentiels de Coulomb et de Yukawa," Compt. Rend. Acad. Sci. Paris, 271, 186-188 (1971).

4. A. Kolmogorov, "Three Approaches for Defining the Concept of Informationb Quantity," Problems of Information Transmission 1, 3-11 (1965).

5. A. Kolmogorov, "Logical Basis for Information Theory and Probability Theory," IEEE Trans. Information Theory IT-14, 662 664 (1968).

6. C. E. Shannon and Warren Weaver, The Mathematical Theory of Communication (The University of Illinois Press, Urbana, 1949).

7. Ju V. Matijasevic, "Enumerable Sets are Diophantine," Soviet Math. Dokl. 11, 354-358 (1970).

### Additional References Not Used in Text

1. E. L. Lawler, "The Complexity of Combinatorial Computations: A Survey," Proceedings of 1971 Polytechnic Institute of Brooklyn Symposium on Computers and Automata.

2. D. W. Loveland, "On Minimal Program Complexity Measures," ACM Symp. Theory of Computing, Marina del Rey, California, May 5-7, 1969.

3. P. Young, "A Note on Dense and Nondense Families of Complexity Classes," Math. Systems Theory 5, 66-70 (1971).

465

16— The Notion of Complexity: With W. A. Beyer and M. L. Stein (LA-4822, December 1971)