Speculations about the Mechanism of Recognition and Discrimination:
This report is a preprint of a talk I gave at Los Alamos in 1981, speculating about some of the methods which may be used in some processes occurring in the nervous system. (Author's note.)
Let me first say that this title is not quite exact. I may want to speculate, but it won't be about the physiological or anatomical nature of memory, about which I know nothing! At the end I may venture my own private little questions. I don't think anybody knows really what the true physiological elements of recognition are.
The talk will be about more abstract mathematical schemata, some of which may perhaps have a physical basis. I'll try to talk first about various ways in which a visual picture is recognized. Towards the end, I'll talk about how, with suitable changes, this may apply to other sets of objects--for example linear arrays like DNA codes, or to auditory experiences which are more or less linear too, as for example a sequence of musical notes.
I don't dare speculate too seriously about three-dimensional objects; that indeed is the domain of immunology, for example, or about olfactory recognition which refers to the recognition of molecules by something in our nose. But could all this be coded up linearly perhaps? Or could the shape of molecules have something to do with infrared radiation? I have looked at a book about olfactory problems, but it is ten years old and I suppose people know more now.
Let me now talk about some purely mathematical attempts to give the combinatorial schemata for what we call recognition.
Recognition is already more ambitious than what I would call discernment or discrimination, namely the finding of a difference between two signals. I have not seen that discussed per se in the literature, but it seems to me that it is more elementary to distinguish two different letters, for example, than to recognize an object stored in the memory. Discernment or discrimination I don't know what to call it is something we have experimented on long ago on computers in this lab.
The main discussion will be about distinguishing between twodimensional objects or pictures. The mathematical tool, or at least notion-for I don't think it deserves yet the name of tool-the mechanism I will talk about is the idea of a distance between objects such as, for instance, pictures on a screen.
First let me explain the properties of a distance:
Suppose we have a set E of objects we will call a, b, c,.... A distance p(a, b) is a real valued function of pairs of elements of E. It should be > 0. When it is 0, this means that a = b. It is symmetric, p(a, b) = p(b, a). It also has a very important property for all applications called the triangle inequality, p(a, c) >p(a, b) + p(a, c) for all a,b, c. If you have such a function on a set E, the set is called a metric space.
Mathematicians and physicists are familiar, of course, with metric spaces such as Euclidean space, Hilbert space and all kinds of function spaces, manifolds with "curved" geometry, with distance measured on geodesic lines, and so on.
I would like to give examples of differently constructed metric spaces. These may have something to do with at least a language for certain biological phenomena different from metric spaces in physics.
Here are various fifteen-year-old attempts to define a distance between sequences of symbols of a set of DNA codes, for example. The sequences consist of long arrays of four letters, A, C, T, G or U, looking for instance like ACTTGGA ....
For simplicity's sake, instead of using four letters, I'll take a sequence of just two letters and will call them 0 and 1. One such sequence may be A = 011110101001... 01... the other B may be B = 101101.... The sequences are quite long--they may have a thousand or two thousand letters, and perhaps they form the code for some definite molecule.
How can we compare them? The idea is to define a quantitative measure of this comparison, or a distance between them. Walter Goad*
* Los Alamos physicists who have become interested in biology. (Eds.)
and George Bell have occasionally defined various distances. I myself have played with this ten or twelve years ago and considered distances differently from the line of pure mathematics.
One of the simplest ways to define a metric for sequences ao ... aN, of symbols 0's and l's is, for example: N x=-Ol...OZN, y = .1 . N, p(X,y) = --3i | , i=l sometimes called the Hamming distance. Clearly this distance is not what one would want in biology for a distance between codes. In "pure" mathematics, as in most physical situations, the description of objects, i.e., their positions are fixed. They are so to say rigid, and their beginning and end are fixed, whereas in biological situations the objects are pliable. The above distance between the two sequences x = 0101...0101, y = 1010.. .1010 would have a large distance N because they differ in every place. But if we erase just one letter from each we see that the sequences are the same. If they were written on a circle they would be exactly the same.
The distance between two linear biological arrays could be defined as the mininum number of steps which will change one sequence into the other, or more generally, will work on both to bring them into the same form.
What are these steps? Given two sequences of 0's and l's, x = li ... a2... aN, y =1i 1 ... 2 ... /M (M could be different from N), we defihne the distance between x and y as the minrimlum number of steps which operating on one or the other or both of the sequences will bring x and y into the same form. One can prove that this minimum number satisfies the properties of a distance.
The steps are of the following types: 1. A change of a 0 into a 1 or vice versa. 2. An erasure of a symbol and a contraction of the remaining ones or an insertion of a symbol anywhere in the sequence. It turns out that in order to find out about this distance, a rather efficient algorithm devised by Peter Sellers determines it in less than N 3 steps.
A definition of the distance given above was presented in an article I wrote in the Annual Review of Biophysics.1
Distances of this sort can be used to compare sequences of DNA defining proteins, for instance. The biologist Margoliash had the idea of trying to reconstruct the evolutionary tree for organisms and animals based on the codes for cytochrome C present in essentially every living organism and seemingly constant within a species but (lifferent
from species to species. His idea was to find species whose codes for cytochrome C are less different from each other, or related more closely in the evolutionary tree, and species whose codes for this protein differ by more. To reconstruct a possible evolutionary tree led mathematically to the problem of a binary graph for the species now existing and perhaps some non-extant ones which have disappeared, starting with some very primitive organisms and in such a way that the sum of the distances in this binary graph of "descent" should have the sum of all the distances as small as possible. This postulate of Margoliash translates mathematically into an assumption that the collection of all mutations which occurred giving rise to the now existing variety of species was the least improbable among the possible ones.
In the space of the DNA codes defining the various cytochrome C sequences as a finite metric space we have a generalization of a problem by the nineteenth century German geometer Steiner who considered a finite system of points in the plane. His problem was to draw a graph through all the points, perhaps adding new auxiliary ones, so that the sum of all the edges would be minimal. Here, we have this kind of problem in a much more general combinatorial setting for a finite metric space.
One can think of many other types of distances for the sets of sequences of 0's and l's, or more generally for sequences with k symbols, k being a fixed number. One such definition, again involving the smallest number of steps which allow passage from one sequence to another, with the nature of steps defined ahead of time, is, for example, the following:
Suppose that we compare two sequences of O's and l's with the same number of l's and 0's in each and allow a step which consists of moving a 0 or a 1 from one position to another, the cost of this step being the length through which we move it. The "minimum" work to effect this is a possible distance between two such sequences.
Another definition can be obtained by comparing the number of l's in the two sequences, noting the difference between these numbers, then considering the 0's and l's in the two sequences, again noting the difference in these configurations and then the difference between symbols I followed by 1, etc. After this survey of pairs of succeeding symbols, counting triplets and so on, the sum total of such differences suitably normalized can serve as a distance.
We will now present a larger variety of possible distances between objects as sequences of two-dimensional "pictures." We shall concentrate on visual or two-dimensional impressions and on ways to quantify to a degree the similarity or lack of it between two-dimensional objects or "pictures." A variety of possible distances between two sets in a
plane, or more generally between two classes of sets in a plane, will be discussed. Just as in the case of one dimension, the possibility of "recognition" of a sequence of symbols involves the smallness of a distance between impressions (auditory or tactile, for instance), and the strings of symbols coding them which reside in the memory from previous impressions, a two-dimensional visual impression is compared with the picture or pictures stored in the memory of an organism.
Without at the moment going into possible physiological or anatomical ways to evaluate such distances, we shall discuss abstractly various ways of considering a metric space whose elements are sets in the plane. For our purpose it is sufficient to consider them finite, or if infinite, closed and bounded.
One can consider a topological distance due to Hausdorff. It is defined for closed subsets of a metric space E as follows: Let E be a space with metric PE(X, y). For any two closed sets A and B one can write as a distance PH(A, B) where pH(A,B) = Max Min pE(x,y) + Max Min pE(x,y) xeA yeB yeB xeA
But again this is not quite what one wants for a distance between two impressions or two sets given separately since the distance above depends on the position or mutual relation of two subsets A, B, in the plane (screen).
One can obtain a more satisfactory distance by iterating the Hausdorff formula as follows: Instead of a fixed set A consider a whole class of sets "like" A by slightly deforming, shifting, turning the given set A, and more generally by applying to A a number of transformations forming a neighborhood of the identity of a whole group of transformations. In this way we obtain a whole class A of sets. Proceeding analogously with the set B we obtain a class B. Assuming that these classes are finite, or compact in the case of an infinity of transformations, we may now consider a distance between A and B as: p(A,B) = Max Min pH(A,B) + Max Min pH(A,B) BeB AeA AeA BeB3
One of our contentions is that in problems of reaction to impressions (visual, auditory, tactile, or chemical), the organism produces a number of small variations of the impressions stored in the memory. Perhaps one is allowed to speculate that the memory could reside not only in the central nervous system, but could exist in the immunological or other autonomous parts of the organism.
There are other distances, in addition to the Hausdorff distance, which might actually be more suitable for the arrangements in the visual systems. The distance as defined above has the drawback that its value for a pair of sets which are almost identical except for a few points added to one of the sets may be considerable.
One can generalize the Hausdorff idea still further by considering in the notion of the class A variations in the sets A by looking at them ' modulo" a small number of points, or in the infinite case, "modulo" sets of small linear measure.
Another distance between two sets, each of which has its own metric between points in themi (e.g., if they are both subsets of a plane with Euclidean metric), can be obtained by trying to map the set A into B and vice versa with the smallest number of errors. If both sets are finite, we consider all mappings of one into the other trying to achieve an isometry as much as possible, that is to say, a transformation such that a pair of points x, y in A should go into a pair of points x', y' whose distance is equal to the distance between x and y.
Given a mapping, we can calculate the sum of the errors under an optimal mapping. The distance between two sets A and B can be defined as the minimumr of the sum of errors under all possible mappings. In practice, of course, the number of all possible trial mappings is enormous, even for sets A and B consisting of a small number of points, and trying all mappings is impractical. Instead one can take recourse to a Monte Carlo type assay by looking at very small subsets of A and mapping themn into subsets of B aind vice versa.
Even if the nurmber of such subsets is large, say hundreds or thousands, the total computation will be vastly shorter than the exponentially increasing (factorial n) number of all mappings.
The above definition bears a resemblance to the one involving a problem considered by Appell in his study of "Deblais and Remblais"2 : the minimum work necessary for transforming a given pile of sand ("points" of a set) into a given different configuration.
What such a definition suggests is that, given a set on a screen providing a new impression, there may be a mechanism of attempting to map the points which form it onto a set of points of a set residing in the memory this with a small number of errors in the distances between pairs of corresponding points. If this is possible, we consider the new impression as recognition of a previous one. If this is not possible, we might put it into the memory as a new object.
Here are a few more possible distances between sets: Given a set of points in the unit square, we may imagine it transformed homothetically so that it would touch the boundary of the square. We now consider a successive division of the square, first into
four squares of size 1/2, then a division into sixteen squares of size 1/4, then sixty-four, etc... We examine the characteristic function of the given set in the subdividing squares. We allot weights each normalized so that the squares of the first subdivision have weight 1, in the second subdivision 1/4, then 1/16, etc... The set will then be coded by a sequence of these numbers.
Given two sets A and B, we may define their distance as the norm of the absolute value of the difference between the two coding sequences. It is distances of this sort that were used by Schrandt and myself in experiments on the computer which we performed in Los Alamos around 1965 to attempt recognition via computer of hand written letters of the alphabet.
We proceeded as follows: We wanted to discrininate between two handwritten letters A and B. We stored in the memory 256 examples of each in the following way: Obviously it would have been laborious, time consuming, and slow to do it by changing the styles of these letters by handwriting. Instead we produced varied examples of a handwritten letter on the computer rather quickly. It is well known that there exists on an interval and similarly on the unit square two continuous transformations whose composition will produce a set dense in the space of all such continuous transformations.
Remembering this fact, we chose two transformations of the unit square S and T, each letter different from identity (small deformations). If we consider transformations of the form STTST.. .SSTSTS... etc., of say 7 letters, we have 27 = 128 different transformations, each still not too violently distorting the geometry of the square. Thus we obtain 128 examples of each of the letters A and B initially written once by hand into the machine.
Given a new letter also handwritten, the problem was for the machine to decide whether it was an A or a B.
We computed the distance between the problem letter and the 128 examples of each which were now stored in the memory. Whichever sum of distances was smaller determined the answer. As it turned out, the deformed examples of each letter produced by the computer looked like lifferent handwritten styles, some written by an old man with a shaking hand, some more rounded or pointed, etc., as if they had really been produced by people. The first computer trials gave more than 80% correct answers!
Is it possible that in the actual process of recognition---or discernment or distinction--between two visual impressions one does not need to have recourse to very many examples stored in the memory? That instead, by taking one of the examples in the memory, we might use internally some deformnations and compare them with the given impression?
This would be a great saving of memory storage and it could be applied not only to single pictures, but to pairs or triplets or short "films" of pictures, thus enabling recognition or distinction between different new impressions.
A more sophisticated schema of recognition, in fact the beginning of more abstract reasoning, would involve a distance more general yet than the ones mentioned above. This could be based on the comparison of two given sets by decomposing them into pieces and considering a distance calculated from the sum of the comparisons of the parts.
A still more general distance leading to a beginning of "logic" would involve a comparison of classes of pictures. The post hoc ergo propter hoc (after that hence because of that) conclusion serving as an example.
About problems of recognition of shapes by comparison with a large storage of examples:
One question which occurs in auditory, tactile, or visual impressions concerns the taking of a decision that the new impression is "novel" and not to be considered as a variant of one of the examples stored in the memory. Thus, for example, antibodies are able to recognize a foreign or strange object.
We may postulate that there is a list of objects in the memory considered "familiar." This list might be, for example, arranged lexicographically or alphabetically or coded by numbers. As we will see, the way of arranging it in the memory is important.
For example, suppose we have listed 106 numbers, each of ten digits, say, so we have a rather sparse collection. A new number is presented having ten digits. The first question is: Is it equal to some number in the list? On a computer the answer can be obtained immediately. Similarly in a putative mechanism in the brain, since it suffices to go through the digits in succession, which is a fast and efficient process.
Suppose however that the question is: Is the new number if not equal to any number in the list, at least close to some such number, e.g., differs from it by I or 2 in some position? The search for such a close number would be time consuming if we tried to compare the given number with each of the 106 numbers in the memory. Obviously there is a better way.
From the given number we fabricate 20 numbers which differ from it by 1 in some one of the ten digits. Then we look for each of the 20 whether one of them might be in the list, as above, very quickly. Again we see that should there exist in the brain a mechanism to produce some small changes or deformations, then compare those with the contents of the memory, the recognition or discrimination would be much more efficient.
The above example refers to one particular distance which depends on the absolute value of the difference in the digits in the same position.
For other types of distances described above, analogous but combinatorially more complicated procedures are possible. For each of these distances the question of what is the most economical or practical clustering presents an interesting exercise.
The principal contention or conjecture is then of the existence of a mechanism in the nervous system capable of producing a number of e-modifications of the impressions that affect distinction or recognition.
Astronomers trying to find a new star on a photograph of a portion of the sky quickly flip a number of pictures of this region and the new object jumps out visually from the collection of the others which are constantly present. A parade of coded pictures in the memory compared with the new impression might serve a similar purpose. Our supposition is that there exist ways of sensing quantitatively a number of different distances between the impression and the memory data, distances that are perhaps not unlike the ones we have enumerated above--an "averted" memory may, like averted vision, aid in the search.
Going beyond, we may have the beginning of a "logical" or "reasoning" process by considering sequences of impressions of pictures and measuring their analogy by metric distances.
1. S. Ulam, "Some Ideas and Prospects in Biomathematics," Ann. Rev. Biophys. and Bioeng. 1, 277-292, (1972).
2. Paul Appell, "Le Probleme Geom6trique des Deblais et Remblais," Memorial des Sciences Mathematiques de l'Academie des Sciences, fasc. 27 (1928), Gauthier-Villars, Paris.