4.3—
Markers
The first and most basic of the three sortal terms is 'marker'. Thus far the development of the term 'marker' has consisted of the citation of a few paradigm examples (letters, numerals, characters employed in musical notation) and a negative claim to the effect that being a marker has nothing to do with syntax or semantics. To come to a better understanding of markers, it will be useful to employ a thought experiment.
4.3.1—
The "Text from Tanganyika" Experiment
Suppose that the noted Victorian-age explorer and linguist Sir Richard Francis Burton, while traversing central Africa in search of the source of the Nile, comes upon a lost city. There he finds a clay tablet on which there are inscriptions of unknown origin and meaning. One line of the script reads as follows:
What assumptions can Burton reasonably make about the inscription? First, he can probably proceed upon the assumption that what he has come upon is an inscription in a written language, which he dubs "Tanganyikan." He can assume that, like other written languages, Tanganyikan will employ symbols, that it will have a syntactic structure, and that at least some of the symbols will be used meaningfully. At this point, however, he most emphatically does not know what any of the symbols mean. Nor does he even know what symbolic units are meaningful. What he encounters may be a phonetically based script like that used in written English, in which case few if any of the individual characters will be meaningful. On the other hand, it might be an ideographic notation like that employed in written Chinese, in which case individual ideogram types are correlated with specific interpretations. Or it might be like Egyptian or Coptic script, in which characters can function as ideograms in some contexts and function as indications of phonemes in others. (If English were to be represented in a similar way, for example, we might have a single character to represent the word 'heart', and then represent the word 'hearty' by that character followed by -y.) And of course it could be the case that what he sees is not writing at all, but mere ornamentation or doodlings.
Now there is a great deal that Burton can do without an interpretation scheme for this writing. Notably, he can begin by making a list of the atomic characters employed, and on the basis of this he can do such things as compare them with characters used in other African languages to see if Tanganyikan may be related to any of these. For example, if the writing found at Timbuktu contains a character closely resembling one of the symbols found in Tanganyikan script, then Burton might postulate that the Tanganyikan symbol is a variant of the Timbuktu character, and that Tanganyikan is related to Timbuktuni. And he can do all of this without knowing anything about the syntax or semantics of Tanganyikan. Indeed, even if it turns out that what he has found are a child's handwriting exercises or an ancient eyechart (in which case what he sees does not have either syntactic structure or semantic interpretation), his conclusions about the character type need not be imperiled. And the reason for this is that the characters themselves can be understood as falling into types quite independently of the linguistic uses to which they are put.
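Burton's first step, listing the atomic characters and comparing inventories across scripts, can be sketched in a few lines. This is only an illustration under stated assumptions: the two inscriptions below are hypothetical stand-ins (arbitrary glyphs, not the characters on the tablet), and the function names `inventory` and `shared_types` are invented for the example. Note also that the sketch leans on Unicode code points to decide when two tokens belong to the same type, which is itself a convention of the kind the text goes on to discuss.

```python
from collections import Counter

# Hypothetical stand-ins for two uninterpreted inscriptions. The glyphs are
# arbitrary; nothing below depends on what (if anything) they mean.
TANGANYIKAN = "ΔΘΔ ΞΘ ΔΞΞ"
TIMBUKTUNI = "ΘΛΛ ΔΛ"


def inventory(inscription: str) -> Counter:
    """Count the tokens of each character type, skipping blank space."""
    return Counter(ch for ch in inscription if not ch.isspace())


def shared_types(a: str, b: str) -> set:
    """Return the character types that occur in both inscriptions."""
    return set(inventory(a)) & set(inventory(b))


if __name__ == "__main__":
    print(inventory(TANGANYIKAN))
    print(shared_types(TANGANYIKAN, TIMBUKTUNI))
```

Neither function consults a grammar or a lexicon: the tallies would come out the same if the inscription were an eyechart.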
Once Burton has made this observation, he begins to realize that it is not only atomic graphemic character types that can be studied apart from their syntactic and semantic properties. On the one hand, strings of characters that function together can be treated as a single unit, and hence Burton can make some guesses about what sequences of characters make up words.[1] On the other hand, graphemic characters are not the only tokens whose membership in a type can be understood apart from syntax and semantics. The very same kind of analysis can be applied to nonvisual units, such as phonemes, Morse code units, or ASCII units in computer storage locations. If, for example, Burton had a tape recording of someone speaking Tanganyikan, he might undertake a very similar analysis of the phonemes employed in the language, even without knowing where the breaks between words fall or what anything in the language means. Or, if he were in a position to intercept an electronically transmitted message such as a transmission in Morse code, he might be able to figure out the basic units (e.g., dots and dashes) and how they were instantiated in a telegraph wire or through modulations of radio waves. In light of these realizations, of course, he would come to realize that he could no longer employ the term 'character' to cover all of the relevant cases, and would be in search of a suitably neutral term: for example, the term 'marker.'
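The Morse code case can be given a similarly rough sketch. The assumption here (not made in the text) is that an intercepted signal has already been reduced to a list of pulse durations, and that dots and dashes can be separated by a simple midpoint threshold; the function name `classify_pulses` and the sample data are hypothetical. The point illustrated is only that the same sequence of units can be recovered from differently instantiated signals without knowing what the message says.

```python
from typing import Sequence


def classify_pulses(durations: Sequence[float]) -> str:
    """Map pulse durations to dots and dashes using a midpoint threshold.

    Assumes the input contains at least one short and one long pulse;
    if all pulses are equally long, every pulse is reported as a dash.
    """
    threshold = (min(durations) + max(durations)) / 2
    return "".join("." if d < threshold else "-" for d in durations)


# Hypothetical measurements of one and the same transmission in two media:
# pulse lengths on a telegraph wire (milliseconds) and in a radio signal
# (seconds). The units and exact values differ; the pattern does not.
WIRE_MS = [60, 180, 60, 60]
RADIO_S = [0.10, 0.31, 0.09, 0.11]

if __name__ == "__main__":
    print(classify_pulses(WIRE_MS))
    print(classify_pulses(RADIO_S))
```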
4.3.2—
What Is Essential to the Notion of a Marker?
If 'marker' is to serve as a generic term for phonemes, graphemes, units of Morse code, and other such entities, it is worth asking just what is involved in being an entity of one of these kinds. And the best way of answering is by making a series of observations.
(1) Markers are tokens of types. The type-token distinction is applicable to all markers: to letters, numerals, Morse code units, ASCII code units, phonemes, and so on.
(2) Marker types are conventional. To say that a graphite squiggle on a sheet of paper is a letter P is to say that it is a token of a particular type that is employed by particular linguistic communities. To say that it is a rho is to say that it is a token of a different particular type employed by a different community. And to claim that a particular squiggle is a P is not the same thing as to claim that it is a rho, even if it is the case that an object has the right shape to count as a P if and only if it has the right shape to count as a rho. This is because the claim that the squiggle is a P (or a rho) makes reference to more than the shape of the object: it makes reference to specific conventions of a specific linguistic community as well. Likewise, the claim that the squiggle is a P (or a rho) is not equivalent to a claim about its shape (for example, that it is composed of a vertical line on the left and a half-oval attached to the right side of the upper half of the line).
When I say that marker types are conventional, what I mean is merely that marker types are established by the beliefs and practices of language users. In particular, I wish to emphasize that marker types are not natural kinds. To be sure, sounds and squiggles may also fall into natural kinds on the basis of physical patterns present in them, such as their waveforms or their shapes: a sound wave is a sine wave at 440 Hz just because of its physical characteristics, and an inscribed rectangle is a rectangle just because of the distribution of graphite on paper. But when we say that an object is a marker (for example, an inscription of the letter P or an utterance of the word 'woodchuck') we are not picking it out just by its sound or its shape, but by the way it fits into established linguistic practices in some community of language users. To determine what marker types an object falls into, we need to know more than what patterns are present in the object: we need to know what marker types there are as well, and what kinds of objects can count as tokens of those types. And to answer those questions, we need to know what linguistic communities there are and what shared understandings and practices members of those communities have about using sounds and inscriptions communicatively. An object can only be a P-token if there is a letter type P, and there can only be a letter type P if there is some community of language users who have a set of shared beliefs and practices to the effect that there is a marker type whose tokens are shaped in certain ways and may be employed in certain activities. So when I say that marker types are conventional, I mean that the existence of the type is determined by the beliefs and practices of language users.
(3) The conventions that establish marker types involve criteria governing what can count as tokens of those types. So while the assertion that a squiggle is a rho involves more than claims about its shape, it does entail things about the shape of the squiggle as well. The letter type rho is established by the conventions employed by writers of Greek, but part of what is involved in those conventions is a set of criteria governing what a squiggle has to look like in order to count as a rho.
(4) The criteria governing what can count as a token of a marker type pick out a set of (physically instantiable) patterns such that objects having those patterns are suitable to count as tokens of that type. In the case of letters, numerals, and other graphemes, the patterns are two-dimensional visible spatial patterns. In the case of phonemes, they are acoustic patterns distinguishable by the human auditory system. In the case of Morse code and computer data storage they are abstract patterns made up, respectively, of dots and dashes or binary units which can be instantiated in various ways in different media. One can also have complex marker types that are formed from arrangements of simple marker types: written words, for example, are complex markers composed of sequences of atomic markers (letters).
(5) The criteria for a marker type may be flexible and open-ended, and need not be subject to formulation in terms of a rule. This is clearest in the case of graphemic symbols. As Douglas Hofstadter (1985) has argued, letter types seem to permit an indefinite number of stylistic variations. A reader who has not foreseen these can nonetheless quickly recognize them as such when presented with them. It is by no means clear that one could provide a rule (e.g., in the form of a computer program) that could, for example, distinguish all of those patterns that a person could recognize as stylistic variants of the letter P from those patterns which a person would not recognize as such.
(6) Marker types are often found in groups or clusters that are employed in the same symbol games. Thus we speak of different sets of graphemic characters such as "the letters," "the numbers," "the punctuation symbols," and so on.
(7) Criteria for marker types may overlap, both within groups and across groups. Thus the same squiggles that count as letter o's can count as zeroes and omicrons as well. And indeed, as anyone who has had trouble reading another person's handwriting knows, handwritten letters are often interpretable in a number of different ways.
(8) Language users possess a repertoire of marker types, which can be used in various ways. Mathematicians, for example, are in the business of developing new symbol games. In doing so, they commonly employ existing marker types such as letters and numerals whose origins may be traced to various linguistic communities. Mathematicians use existing marker types, but put them to new uses in new symbol games. Similarly, one can use one's knowledge of phonemes and the rules for combining them into words in one's language in order to coin a new word if one is needed.
(9) Marker types can be added to or deleted from an individual's repertoire. That is, a person can learn marker types and also forget them.
(10) Marker types can be added to or deleted from the repertoire of a linguistic group. New words (complex markers) are coined, new atomic markers are invented (as in the case of the integration sign used in the calculus or the missionary St. Cyril's invention of the Cyrillic alphabet) and imported (as in the case of Europe's adoption of the Arabic numerals). Markers also disappear from usage. Many of the complex markers (Middle English words) one finds in Chaucer's writings, for example, are no longer in use; and the Old English letter thorn has survived only in the guise of a y on the signs of anglophilic innkeepers.[2]
(11) The boundaries of a "linguistic group" and the extent to which conventions are shared within a group are highly flexible. In the case of natural languages, for example, there are often significant differences in dialect and idiolect which involve differences in the conventions for pronunciation, inscription, and so on. It is not always fully clear when one should say that one is faced with separate linguistic groups and when one is faced with a variety of practices within a single group. Moreover, there may be groups within groups: all topologists may observe certain notational practices, but topologists who work in a particular topological specialty (e.g., surgery theory) may all observe an additional set of practices not shared by other topologists, and an individual mathematician who has developed his own techniques for a particular problem may be the only person employing his new conventions. Similarly, an individual may find the need for a new word in a natural language and may therefore choose a phonetic sequence (a complex marker type) that is not currently used in his language and then employ it as a marker type. The new marker type is conventional in the sense that it is established by a human convention and not simply by a natural pattern, even though the convention that establishes it is not (yet) a convention of English, but merely a convention within some individual's idiolect. (Of course, it can become a convention of English; new words are introduced into languages, and they all start out as someone's idiosyncrasies.)
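Several of these observations, in particular (1) through (4) and (7), can be put together in a toy model. What follows is a sketch under strong simplifying assumptions, not a proposal about how marker types actually work: a "shape" is reduced to a set of feature labels, each marker type's criterion is a simple predicate over such sets, and the names `MarkerType`, `has`, `REPERTOIRE`, and `candidate_types` are all invented for the illustration. Observation (5) gives reason to doubt that real criteria can be captured by rules of this kind at all.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List

Shape = FrozenSet[str]  # a physical pattern, crudely described by features


@dataclass(frozen=True)
class MarkerType:
    name: str
    community: str                    # whose conventions establish the type
    admits: Callable[[Shape], bool]   # criterion for counting as a token


def has(*features: str) -> Callable[[Shape], bool]:
    """Build a criterion that requires all of the given shape features."""
    required = frozenset(features)
    return lambda shape: required <= shape


# A hypothetical repertoire. Types from different communities may share
# a criterion without thereby being the same type.
REPERTOIRE = [
    MarkerType("P", "users of the Latin alphabet",
               has("vertical-line", "upper-right-half-oval")),
    MarkerType("rho", "writers of Greek",
               has("vertical-line", "upper-right-half-oval")),
    MarkerType("o", "users of the Latin alphabet", has("closed-oval")),
    MarkerType("zero", "users of the Arabic numerals", has("closed-oval")),
    MarkerType("omicron", "writers of Greek", has("closed-oval")),
]


def candidate_types(shape: Iterable[str],
                    repertoire: List[MarkerType]) -> List[str]:
    """Names of the marker types whose criteria the shape satisfies."""
    pattern = frozenset(shape)
    return [t.name for t in repertoire if t.admits(pattern)]


if __name__ == "__main__":
    squiggle = {"vertical-line", "upper-right-half-oval"}
    print(candidate_types(squiggle, REPERTOIRE))   # P and rho both apply
    print(candidate_types(squiggle, []))           # no conventions, no types
```

The design choice worth noticing is that `candidate_types` takes the repertoire as an argument. The shape alone settles nothing: with an empty repertoire the same squiggle is a token of no marker type, and with a different repertoire it may fall under different ones.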