
Guest Author: Dimitrios Thanasoulas


Speech Errors--A window into linguistic processes

by Dimitrios Thanasoulas
BA (English Literature and Linguistics, Athens University)

1. Introduction

Human speech tends to be fast, effective and complex—the product of millions of years of evolution. Yet, this remarkable feat is rife with hesitations, false starts, repetitions, and errors. As Baars (1992: 20) put it, all these errors, dysfluencies and, at times, non-sequiturs, are ‘the price that is paid for the great flexibility of the human control system, that is, its ability to select among numerous degrees of freedom’. Moreover, if everyone aspires to the ‘ideal delivery’ (Clark & Clark, 1977), i.e., the correct way of uttering a sentence, why do we come up with such "deviant" speech? If the output is different from what the speaker intended to say, then there must be something wrong with the planning of speech, that is, the processes involved in its production—given, of course, that the person does not suffer from aphasia or any other problems in muscle coordination, and so on. Thus, in the present study an attempt will be made to identify and examine speech errors, with a view to throwing some light on what is going on in the speaker’s mind prior to, and in the course of, the articulation of speech. To this end, we will subsequently examine some models of speech production, which posit radically different ways of describing and explaining its nature. As Fromkin (1973: 43-44, cited in Whitney, 1998: 273) noted, speech errors ‘provide us with a window into linguistic processes’—a claim shared by many linguists and psychologists (e.g. Aitchison, 1989; Baars, 1992; Harley, 1998, et al.).

2. Hesitations

Before contenting ourselves with the description and categorisation of hesitations, we should note in passing that, according to Levelt (1989, cited in Harley, 1998: 243), the processes of speech production fall into three main categories. At the highest level are the processes of conceptualisation (or Laver’s ‘ideation’; see Laver, 1970, cited in Lyons, 1987: 336) which appertain to the intentions of what to say or the generation of ideas at a pre-verbal stage. At a lower level are the processes of formulation, or the translation of concepts into linguistic forms. Finally, the processes of articulation involve the planning of speech at the articulatory and phonetic levels. Here, our focus will be on the formulation process and, more specifically, lexical retrieval and syntactic planning.

Having said this, we can now proceed to a discussion of hesitations or pauses in speech. It may seem odd to investigate speech by studying non-speech, but if we consider that 40 to 50 per cent of our speech consists of silence (see Aitchison, 1989: 242), then we realise that "non-speech" speaks volumes, as it were, as to the ways in which we plan and produce speech. There are two main types of pauses: breathing pauses and hesitation pauses. Breathing pauses are also called unfilled pauses (UFPs) and their function is simply that of filling one’s lungs with air. Hesitation pauses, or filled pauses (FPs), on the other hand, are of the er, uh, or um type, as well as repetitions, false starts, or other parenthetical remarks, such as well, you know, I mean, and so forth, ‘which appear to add nothing to the utterance but may gain time for the speaker to plan her speech’ (Lyons, 1987: 339).

Lounsbury (1954, cited in Lyons, 1987: 339-340) suggested that hesitations, or FPs, do not occur at random during speech production but rather between the major units in speech, reflecting that the speaker is planning the next unit. In this light, he made the distinction between ‘juncture pauses’, whose function is to mark syntactic boundaries for the listener, and ‘hesitation pauses’, which bear witness to the speaker’s planning of what is to come next. This distinction, though, has been jettisoned in favour of the view that any pause might reflect planning.

There have been some studies on the function of hesitations. Goldman-Eisler, for example, investigated whether UFPs reflect the speaker’s planning time for word selection. To test this contention, she examined whether UFPs occur before unpredictable words, and found that such words were more likely to be preceded by a UFP. This finding suggested to her that unfilled pauses occurred at points where the speaker chose between possible words to utter. Nevertheless, it has been shown that not all unpredictable words were preceded by UFPs, nor did all UFPs precede unpredictable words.

Another study by Goldman-Eisler was based on the assumption that the more difficult a speaking task, the more planning time is required, which should be reflected in the duration of hesitation. To investigate this assumption, she asked some people to perform two different speaking tasks of varying difficulty, using the same materials, which comprised nine cartoons. The tasks involved describing what was in the cartoons and then interpreting and summarising the moral of each cartoon. The end-result was that the pause rate was higher during the "interpreting stage" than during the description. Yet, this finding ‘is not enlightening in decomposing the processes of speech production’ (Lyons, 1987: 347), since it fails to show where hesitations actually occur.

What we could glean from this discussion is that the study of pauses does not offer significant insights into the processes of speech production, mainly because they have more than one function. It is very likely that speakers, either consciously or unconsciously, make pauses in speech to help their listeners grasp what they say; to aid themselves to segment speech; or to take time to plan their output. Good and Butterworth (1980, cited in Harley, 1998) came to the conclusion that hesitations have a two-fold function: that of achieving interaction between speaker and listener, as well as helping the speaker to cope with the cognitive load of planning speech. The study of the "tip-of-the-tongue" (TOT) state seems far more promising, though, as the evidence it provides ties in with a two-stage model of lexical access in speech production: we first formulate a semantic representation of what we intend to say and then retrieve any phonological information relevant to it (we will discuss the two-stage model later in this study).

2.1. The "tip-of-the-tongue" state

The tip-of-the-tongue (TOT) state afflicts us, so to speak, when, despite the fact that we know fairly well what the word is (in fact we can provide the first and the last syllables correctly), we fail to utter the word. This state is characterised by strong ‘feelings of knowing’ what the specific word is (Harley, 1998: 246), with ‘[a] teasing and seemingly uncatchable wraith of it remain[ing]’ (Aitchison, 1989: 241). Brown and McNeill (1966, cited in Harley, 1998) examined the TOT state by inducing such TOTs in several subjects who were read some definitions of words, as in the example below:

"a navigational instrument used in measuring angular distances, especially the altitude of the sun, moon, and stars at sea"

They found that a proportion of the subjects were placed in a tip-of-the-tongue state, coming up with phonological neighbours such as "secant," "sextet," and "sexton," in groping for the correct word (sextant). This evidence certainly lends support to their claim that ‘lexical retrieval is not an all-or-none affair’ (ibid.: 246). The speaker in a TOT state may be unable to utter the word, yet he can retrieve some information with regard to the number of syllables, the stress pattern, the initial or final letter, etc.

Two theories attempting to account for the causes of TOTs have recently emerged: the partial activation and blocking or interference hypotheses (see Harley, 1998: 246-247 for further details). According to the first hypothesis, proposed by Brown (1970), the intended words are inaccessible in virtue of the fact that they are weakly represented in the system, which results in the ‘retrieval deficit’ (ibid.: 247). The blocking hypothesis, on the other hand, put forward by Woodworth (1938), holds that a stronger "competitor," phonological or semantic, suppresses the intended word.

So far, we can see that the study of the tip-of-the-tongue state has proved more promising, as it has been instrumental in offering some hints as to what may be going on in speech planning. However, since the evidence provided is far from conclusive, we should now turn to speech errors—an area of study which has yielded some very interesting and insightful results in respect of ‘the finer levels of planning and execution that deal with the formation of words, syllables, and sounds’ (Clark & Clark, 1977: 273).

3. Identification and classification of speech errors

As was mentioned above, our speech is full of dysfluencies and errors. We usually get our message across quite fluently, yet we tend to pause, hesitate, repeat words or segments of words, make errors and self-correct; in short, we seem to put some effort into speech, trying to find the ideas or concepts we want to impart, the words that will represent these ideas, and the order in which we will put these words together to form clauses and sentences. But why should we study speech errors in order to gain insight into these procedures? Aitchison (1989) and Whitney (1998) believe that we can understand a complex system like language when it breaks down.

It is possible that speech is like an ordinary household electrical system, which is composed of several relatively independent circuits. We cannot discover very much about these circuits when all the lamps and sockets are working perfectly. But if a mouse gnaws through a cable in the kitchen, and fuses one circuit, then we can immediately discover which lamps and sockets are linked together under normal working conditions (Aitchison, 1989: 244).

It is these "fused circuits," speech errors, that we will be concerned with, in the hope of "unravelling" the mystery of speech production.

Freud was perhaps the first to examine speech errors or slips of the tongue, which he called parapraxes, and proposed that they were the product of our repressed thoughts and pent-up feelings, mostly of sexual origin. In an oft-quoted example he gave, a professor said, "In the case of female genitals, in spite of many Versuchungen (temptations)—I beg your pardon, Versuche (experiments)…" In another example, a woman said that her cottage was on the hill-thigh (berglende) instead of "on the hillside" (berglehne). Despite its popularity, Freud’s theory has gradually lost ground to the psycholinguistic theories that have attempted to classify and explain speech errors, mainly because it makes sweeping overgeneralisations as to the nature of slips of the tongue. Even Freud himself would have difficulty explaining the following speech errors in terms of "sexually motivated" feelings:

I’d like a Vienel Schnitzer (I’d like a Viener Schnitzel) or
A cop of cuffee (A cup of coffee)
(found in Aitchison, 1989: 251)

There are many types of errors and many researchers have preferred to categorise them in terms of the linguistic units which are involved in the error (for instance, phonological features, phonemes, syllables, morphemes, etc.). We will base our own classification mainly on Harley (1998), thus dividing speech errors into eleven types (see table below).

Type                    Utterance                        Target
Feature perseveration   Turn the mop                     knob
Phoneme anticipation    The mirst of May                 first
Phoneme perseveration   God rest re merry gentlemen      ye
Phoneme exchange        Do you reel feally bad?          feel really bad
Affix deletion          The chimney catch fire           catches fire
Phoneme deletion        Backgound lighting               background
Word blend              The chung of today               children + young
Word exchange           Guess whose mind came to name?   whose name came to mind
Morpheme exchange       I randomed some samply           I sampled some randomly
Word substitution       Get me a fork                    spoon
Phrase blend            Miss you a very much             very much + a great deal

Of course, there are some linguists and psychologists (Baars, Clark & Clark, Whitney, et al.) who have either identified other types of speech errors along with the ones listed above, or used different names for some of these types. We will consider some of these typologies in our discussion.

3.1. Errors involving phonological features: Phoneme perseverations, phoneme anticipations, malapropisms, blends, word substitutions, phoneme exchange (transpositions), and phoneme deletion

3.1.1. Phoneme perseverations

In phoneme perseverations, we usually find repeated sounds as in the examples taken from Aitchison (1989) and Harley (1998):

(1) The book by Chomsky and Challe (Chomsky and Halle)
(2) God rest re merry gentlemen (God rest ye merry gentlemen)

Here, the sound in Chomsky and rest is transferred, so to speak, to the nearest words, i.e., Challe and re, respectively. One could say that the phonetic form of the first words clutters up the speaker’s mind and, as a result, induces him to repeat them. Such repetitions are relatively unusual because most people have the ability to ‘wipe the slate clean’ (Aitchison, 1989: 252) in that, as soon as they have uttered a word, their memory dispenses with its sound.

3.1.2. Phoneme anticipations

In phoneme or sound anticipations, the speaker brings in a sound earlier, thus anticipating the syllable or word that is to come next in the sentence. For example:

(3) The mirst of May (The first of May) (Harley, 1998)
(4) The Worst German Chancellor (The West German Chancellor) (Aitchison, 1989)
(5) Bake my bike (Take my bike) (Clark & Clark, 1977)

It is manifest that the sounds in May, German, and bike were anticipated, thus producing the erroneous "mirst," "worst," and "bake." In the first case, "mirst" is a nonsense word, while in the second and third, the speaker has come out with two English words, "worst" being an offensive one, at that!

3.1.3. Malapropisms

Malapropisms occur when the speaker confuses a word with a similar sounding one. This name originally came from a character called Mrs. Malaprop in Sheridan’s play The Rivals, who tended to confuse words. She would say reprehend for "apprehend," or

(6) She’s as headstrong as an allegory on the banks of the Nile (She’s as headstrong as an alligator on the banks of the Nile)
(7) A nice derangement of epitaphs (A nice arrangement of epithets)

Apart from Mrs. Malaprop, children seem to make such mistakes, albeit intentionally, to create a comic effect, as in

(8) Mussolini pudding (semolina pudding)
(9) Naughty story car park (multi-storey car park)

But sometimes there seems to be a semantic as well as phonetic link between the words intended and those output. For example, in

(10) You keep newborn chicks warm in an incinerator (You keep newborn chicks warm in an incubator) (Aitchison, 1989)

in addition to their phonetic similarity, both "incinerator" and "incubator" convey the idea of heat.

Pertaining to malapropisms, Clark and Clark (1977: 288) note that these errors take place ‘because the speakers ha[ve] incomplete phonetic representations of the words they were thinking of and so they [select] the first word that sound[s] right’.

3.1.4. Blends

In the case of blends, two words, sometimes phonetically similar ones, combine to form a new one, as shown below:

(11) slick + slippery = slickery (Clark & Clark, 1977)
(12) children + young = chung (Harley, 1998)
(13) explain + expand = expland (Aitchison, 1989)
(14) lithe + slimy = slithy (ibid.)

Here, the speaker has in mind two ways of saying the same thing and comes up with a ‘synthesized utterance’ (Cutler, 1982).

3.1.5. Word substitutions

This kind of speech error ‘sometimes fall[s] completely under the spell of phonetic similarity’ (Clark & Clark, 1977). Consider the examples taken from Fromkin (1973, cited in Clark & Clark, 1977: 282).

(15) like wild fire → like wild flower
(16) sesame seed crackers → Sesame Street crackers

In both (15) and (16) the words uttered have no bearing on the meaning of the intended words, yet they are phonetically related. Presumably, the speaker wanted to say "wild fire" but switched to "wild flower," which eventually took precedence. Yet, it is not necessary that the two words will be phonetically similar, as illustrated in:

(17) Get me a fork (Get me a spoon) (Harley, 1998)

In this case, the word "fork" may have insinuated itself into the utterance because of an item being available in the speaker’s environment.

3.1.6. Phoneme transpositions and spoonerisms

Transpositions can affect words as well as syllables. Here, we will examine sound transpositions. Consider examples (18), (19), (20) and (21).

(18) I’d like a Vienel Schnitzer (I’d like a Viener Schnitzel) (Aitchison, 1989)
(19) Do you reel feally bad? (Do you feel really bad?) (Harley, 1998)
(20) fats and kodor (Katz and Fodor) (Fromkin, 1973, cited in Clark & Clark, 1977)
(21) heft lemisphere (left hemisphere) (ibid.)

As you may have noticed, in sound transpositions, phonemes switch places, but this tends to occur in phonetically similar words. For example, in (19) "feel" and "really" sound almost alike, and this holds true for "reel," which is an English word. In (20) /k/ and /f/ have switched places but the new word "fats" that has been produced has the plural morpheme –s to accommodate to the voiceless sound /t/.

Spoonerisms are perhaps the best known sound transpositions, named after the Reverend William A. Spooner, Dean and Warden of New College, Oxford, who reputedly transposed the initial sounds of words, coming out with such utterances as:

(22) You have hissed all my mystery lectures (You have missed all my history lectures)
(23) You have tasted the whole worm (You have wasted the whole term)
(24) The lord is a shoving leopard to his flock (The lord is a loving shepherd to his flock) (Clark & Clark, 1977)

However, since the utterances of the Reverend Spooner always made sense, as the sound transpositions generated English words, it is reasonable to argue that his students concocted them to make fun of him. In real life, such transpositions hardly make any sense, as in

(25) tilver siller (Silver tiller) (Aitchison, 1989)
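The mechanics of such transpositions can be sketched in a few lines of Python. This is only an illustration, not a model proposed by any of the authors cited: it works on spelling rather than on phonemes, treats the letters a, e, i, o, u as the only vowels, and the function names are our own.

```python
VOWELS = set("aeiou")  # assumption: spelling stands in for sound

def split_onset(word):
    """Split a word into (initial consonant letters, remainder)."""
    for i, ch in enumerate(word):
        if ch.lower() in VOWELS:
            return word[:i], word[i:]
    return word, ""  # no vowel letter found

def exchange_onsets(first, second):
    """Swap the initial consonant groups of two words."""
    onset1, rest1 = split_onset(first)
    onset2, rest2 = split_onset(second)
    return onset2 + rest1, onset1 + rest2

for pair in [("feel", "really"), ("left", "hemisphere"), ("silver", "tiller")]:
    print(pair, "->", exchange_onsets(*pair))
```

Applied to the targets of (19), (21) and (25), the sketch reproduces "reel feally," "heft lemisphere," and "tilver siller." Because it operates on letters, it would not handle a case such as (20), where spelling and sound diverge.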

3.1.7. Phoneme deletion

Phoneme deletion is very rare. Consider the following example:

(26) backgound lighting (background lighting) (Harley, 1998)

As we will argue later on, speech errors, in general, and those involving sound features or phonemes, in particular, lend credence to the view that, in speech production, we go through several levels of planning, only briefly touched upon earlier: We first build a syntactic structure; then we plan each constituent to fit that structure; and then we fill in any other relevant information—morphemes, content and function words, and so forth. So, sound transpositions are claimed to show that the sounds of words represent separate units in the planning of an utterance, thus seemingly corroborating the abovementioned view. Nevertheless, this hypothesis has been disputed (see models of speech production in this study).

3.2. Speech errors involving morphological units: Affix deletion, morpheme exchange, and misderivations

3.2.1. Affix deletion

Examples of affix deletion are very rare and, when they occur, they are more likely to be dismissed as ungrammatical sentences than as speech errors. For instance:

(27) The chimney catch fire (The chimney catches fire) (Harley, 1998)

In keeping with the view outlined above, it is clear that the speaker failed to fill in the grammatical morpheme –es after she had finished building the overall skeleton for the sentence.

3.2.2. Morpheme exchange

In this case, free morphemes may switch places, while leaving bound morphemes intact. Consider:

(28) I randomed some samply (I sampled some randomly) (Harley, 1998)

Here, the free morphemes "sample" and "random" have swapped places, whereas their suffixes –ed and –ly have remained in their initial slots in the sentence.

3.2.3. Misderivations

We have borrowed this term from Fromkin (1973, cited in Clark & Clark, 1977). Clear examples of misderivations are:

(29) an intervening node → an intervenient node
(30) peculiarity → peculiaracy
(31) swam → swimmed

The speaker has a feature in mind, a noun (in the case of (30)), but programs the wrong realisation for it. So, if we regard, say, "peculiarity" as consisting of the word "peculiar" + the feature [NOMINAL], it is reasonable to assert that the speaker had to choose out of several morphemes, -ity, -acy, -ness, -ation, etc., to form the target noun "peculiarity." However, he mistakenly chooses the suffix –acy, thus producing "peculiaracy." According to Clark & Clark (1977: 285), ‘[f]or this to have happened, peculiarity must have been retrieved from memory, not as an unanalysed word, but rather as peculiar + [NOMINAL]’.

3.3. Speech errors involving words: word blends, word exchange, and word substitution

3.3.1. Word blends

According to Cutler (1982), there are several types of blends: substitution blends, splice blends, indeterminate blends, and complex blends. We will only focus on the first two types. Examples of substitution blends are:

(32) If I ever get my hold on them…(If I ever get my hands on them…+ If I ever get hold of them…)
(33) It’s spent me a year (It’s taken me a year + I spent a year)

In both (32) and (33) the speaker has two target words in mind, but one of them seems to dominate. Substitution blends differ from more common word blends in that they always produce words or even sentences that make sense, as we saw above.

Splice blends, on the other hand, involve the linking of either part or the whole of one target with part of the other. Let us look at (34) and (35).

(34) Who is it that? (Who is it? + Who is that?)
(35) You’re going to be another Oppenheimer when you get up (…when you get older + …when you grow up) (Cutler, 1982)

Yet, word blends do not always make sense. For example, one cannot always figure out the meaning of the following blends:

(36) The chung of today (children + young) (Harley, 1998)
(37) lithe + slimy → slithy (Aitchison, 1989)

There are some blends, though, that are easy to figure out and are of everyday use:

(38) breakfast + lunch → brunch
(39) smoke + fog → smog

3.3.2. Word exchange

Another type of speech error involving words is illustrated below:

(40) Guess whose mind came to name? (Guess whose name came to mind?) (Harley, 1998)
(41) dinner is being served at wine (wine is being served at dinner) (Fromkin, 1973, cited in Clark & Clark, 1977)
(42) a gas of tank (a tank of gas) (ibid.)

As Clark & Clark (1977: 278) argue, ‘for gas and tank to have been reversed…, they must both have been present [in the speaker’s mind]’, and add that ‘target words are almost invariably in the same constituent, usually both stressed, and within six or seven words of each other’.

3.3.3. Word substitution

Here the target word is replaced by another one, which can be said to bear a semantic relationship to it. For example:

(43) It’s six o’clock. Won’t that be too early to buy bread? (Won’t that be too late to buy bread?)
(44) Get me a fork (Get me a spoon)

In both cases, it seems that the two words ("early"-"late" and "fork"-"spoon") compete with each other, but the wrong word eventually dominates. These errors are also called semantic or similar meaning errors. According to one of the models of speech production that we will discuss later on, these semantically similar words activate each other, just as happens when one is trying to say "Monday" and all the days of the week spring to mind, i.e., are activated.

3.4. Speech errors involving phrases and sentences: phrase blends, transpositions, and anticipations

In addition to the sound transpositions we have hitherto dilated upon, there are also transpositions involving words and clauses switching places, as in

(46) a purse for every lighter (a lighter for every purse) (Clark & Clark, 1977)
(47) I can’t help the cat if it’s deluded (I can’t help it if the cat’s deluded) (Aitchison, 1989)

The same applies to anticipations such as examples (3)-(5) or the one below:

(48) I want you to tell Millicent…(I want you to tell Mary what Millicent said) (Aitchison, 1989)

Let us now have a look at phrase blends.

3.4.1. Phrase blends

Phrase blends occur when two clauses combine to form another clause, which is sometimes ungrammatical or ill-formed. Cutler (1982) refers to them as complex blends. For example:

(49) Miss you a very much (Miss you very much + Miss you a great deal) (Harley, 1998)
(50) One thing that fascinated by me… (One thing that fascinated me + One thing I was fascinated by…) (Cutler, 1982)

4. The implications of speech errors for speech planning and production

Now that we have identified and classified the most basic speech errors, it is worth turning our attention to their implications for the planning and production of speech. In other words, a question germane to our discussion is, "What do we learn from an examination of speech errors?" First of all, we will show that speech errors can suggest what the units of planning are, namely, the segments into which our speech is divided (e.g., phonemes, syllables, etc.). Secondly, slips of the tongue can throw some light on the processes of word selection and formation. Thirdly, they are instrumental in showing how words and structures are planned and assembled.

4.1. The units of planning

According to Clark & Clark (1977: 275-276), in comprehension the hierarchy of units looks like this:

Distinctive features, like Voicing
Phonetic segments, like [b]
Syllables, like [bro]
Words, like broken
Larger constituents, like the broken promise, and so on.

In production, speakers seem to store in memory a complete pattern for every phrase, so in uttering "in the manor house," they ‘initiate the appropriate stored motor pattern and let it play to the finish like a tape recorder’ (ibid.). But if this were true, our speech would be bereft of hesitations or pauses within the smaller units, as in "the // manor house" (the symbol // standing for a long pause), or "in the, uh, manor house." For Clark & Clark (1977: 276), the phrase cannot be the one and only unit in the articulatory program because the evidence from hesitations, pauses, and slips of the tongue points to the fact that planning must be dealing in phonemes, morphemes, words, clauses, and sentences, as well as their structure. Let us adduce some of our examples of speech errors to illustrate this (much of our discussion is based on Clark & Clark, 1977).

A brief glance at examples (3)-(5) or (18)-(21) repeated here for convenience, as well as all speech errors involving phonemes, attests to the fact that the most basic unit in the planning of speech is the phonetic segment.

(3) The mirst of May (The first of May)
(4) The Worst German Chancellor (The West German Chancellor)
(5) Bake my bike (Take my bike)
(18) I’d like a vienel schnitzer (I’d like a Viener Schnitzel)
(19) Do you reel feally bad? (Do you feel really bad?)
(20) fats and kodor (Katz and Fodor)
(21) heft lemisphere (left hemisphere)

According to Clark & Clark (1977: 276), the argument boils down to this: If words were the only indissoluble unit in planning, we would not expect there to be any sound transpositions or anticipations. In (18)-(21) two phonemes switch places and it is not only the initial consonants that evince this tendency. The same applies to final consonants, as in the example taken from Fromkin (1972, cited in Clark & Clark, 1977: 276): (51) pass out → pat ous; to consonants between vowels; and to vowels, as in (52) David, feed the pooch → David, food the peach (ibid.). Moreover, Garrett (1988, cited in Newmeyer, 1988: 75) shares this view in asserting that ‘[s]uch evidence dictates a separation of the processing level(s) that fix(es) detailed phonetic representation from that representing abstract segmental and lexical structure’. In fact, rather than considering words to be indissoluble units, Clark & Clark (1977: 276) believe that ‘[w]hile speech is divided "horizontally" into phonetic segments, each phonetic segment is divided "vertically" into distinctive features (like Voicing, Nasality, and Stridency)’. Example (53) below, taken from Fromkin (1973, cited in Clark & Clark, 1977: 276), succinctly illustrates this hypothesis:

(53) Terry and Julia → Derry and Chulia

What has been transposed here is not the phonemes /t/ and /d/ themselves but rather their voicing ([−voice] and [+voice] for /t/ and /d/, respectively).
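To see what it means for a single feature to move while the rest of the segment stays put, consider the following Python sketch. It is an illustration only: the pairs are written as spellings rather than phonemes, the list of pairs is partial, and all names are our own assumptions.

```python
# Voiceless/voiced pairs, written as spellings rather than phonemes (an assumption).
VOICING_PAIRS = [("t", "d"), ("ch", "j"), ("p", "b"), ("k", "g"), ("f", "v"), ("s", "z")]
TOGGLE = {}
for voiceless, voiced in VOICING_PAIRS:
    TOGGLE[voiceless] = voiced
    TOGGLE[voiced] = voiceless

def toggle_initial_voicing(word):
    """Flip the voicing of the word-initial consonant, keeping everything else."""
    lower = word.lower()
    # Try two-letter spellings such as "ch" before single letters.
    for size in (2, 1):
        head = lower[:size]
        if head in TOGGLE:
            flipped = TOGGLE[head] + lower[size:]
            return flipped.capitalize() if word[0].isupper() else flipped
    return word  # initial spelling not covered by the table

print(toggle_initial_voicing("Terry"), toggle_initial_voicing("Julia"))
```

Flipping only the voicing value of each initial consonant turns "Terry" and "Julia" into "Derry" and "Chulia," the forms given in (53), without exchanging the consonants themselves.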

Going down the hierarchy of units, we see that the syllable is the next unit. In transpositions, perseverations, and anticipations, the features that are often affected are syllables, as in the following examples:

(54) animal → aminal
(55) harpsichord → carpsihord (Fromkin, 1971)

As Hockett (1967) and MacKay (1972) hold, the syllable is comprised of an initial consonant group and a final vowel group. In English, for example, the basic pattern is of an initial consonant group followed by a vowel group, as in bl + imps, resulting in "blimps." What is more, such clusters have the tendency to be transposed, anticipated or perseverated as a whole, as shown in

(56) coat thrutting (throat cutting) or
(57) clamage dame (damage claim)

Besides, the same tendency is noted in blends, where the initial consonant group of the first word combines with the final vowel group of the second. Consider:

(58) shout + yell → shell
(59) grizzly + ghastly → grastly
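The blending pattern just described, the initial consonant group of the first word joined to the vowel group of the second, can be written out as a short Python sketch. It is a rough approximation that works on letters rather than sounds and counts only a, e, i, o, u as vowels; the function names are hypothetical.

```python
VOWELS = set("aeiou")  # assumption: spelling stands in for sound

def onset_length(word):
    """Number of letters before the first vowel letter."""
    for i, ch in enumerate(word):
        if ch.lower() in VOWELS:
            return i
    return len(word)

def blend(first, second):
    """Join the initial consonant group of `first` to the remainder of `second`."""
    return first[:onset_length(first)] + second[onset_length(second):]

print(blend("shout", "yell"))       # shell
print(blend("grizzly", "ghastly"))  # grastly
```

The sketch reproduces (58) and (59), but it should not be taken to cover every blend: a form such as "chung" (children + young) does not fall out of this letter-level rule.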

In view of this evidence, Clark & Clark surmise that ‘[t]he syllable must…be part of the articulatory program, for it specifies which segments can be anticipated, perseverated, or reversed and which segments cannot’.

If we consider examples (46)-(50) we will see that speech errors also affect larger units of speech. For instance, in (46) what have been reversed are the constituents "a lighter for every purse" and "a purse for every lighter," not merely words, morphemes or phonemes. If correct, this evidence is significant, ‘for it dovetails with the earlier conclusion that constituents of this size are the main units of planning and execution’.

4.2. Models of speech production

As was hinted at above, a great many researchers have tried to build models of speech production based on the evidence that speech errors offer. Here, we will content ourselves with Fromkin’s "five-stage" model and Garrett’s "two-stage" model, as well as with Dell and Reich’s "spreading-activation" or "interactive" model. The difference between the first two models and the third one lies, not in the actual levels of speech production each of them posits, but rather in the ways in which these levels "communicate" with one another. Indeed, there is tacit agreement among researchers that four levels of speech production mechanisms are at work in proceeding from the conceptualisation of the message to the articulation of the utterance. Whitney (1998: 275), for example, fleshes out the main skeleton of the speech production system in these terms: the message level, where the propositions to be conveyed are formed; the syntactic level, where the selection and assemblage of lexical features takes place; the morphemic level, where stems, affixes, and inflections are added; and the phonemic level, where the sound structure of every single word is built. Let us now consider whether we work through each level in a modular fashion (from syntactic to phonological), as proposed by Fromkin and Garrett, or whether each level interacts with the others, as Dell and Reich’s model suggests. In doing so, we will be furnished with more information about word selection and formation, as well as how words and structures are built.

4.2.1. Fromkin and Garrett

The premise underlying both models is that speech planning occurs serially, that is, only one thing is happening at a time, although it is plausible that more than one thing is happening at different levels. These levels, though, are independent, as they do not interact with one another. If, for example, someone utters, "The competition is a little strougher" (Fromkin, 1973, cited in Whitney, 1998: 281), it is because she had in mind two different frames, namely, "The competition is a little stiffer" and "The competition is a little tougher."

4.2.1.1. Fromkin’s "five-stage" model

Based on the analysis of more than four thousand speech errors, Fromkin proposed a five-stage model of speech production. Stage 1 is meaning selection. Stage 2 involves the selection of a syntactic outline and semantic features. Stage 3 is the choice of intonation contour and primary stress. Pertaining to stage 3, she argued that this choice should take place before word selection, because most slips of the tongue leave the intonation contour intact, as in "in the phonology of theory" vs. "in the theory of phonology." Stage 4 consists in the selection of words fitting in with the semantic features chosen at stage 2. Finally, stage 5 is the application of morphophonemic rules, which encompasses the phenomenon of accommodation, as in the case of a vis-à-vis an (see Lyons, 1987: 353-354).

4.2.1.2. Garrett’s "two-stage" model

Garrett’s model is similar to Fromkin’s, but instead of positing five stages, it distinguishes two stages of syntactic planning, i.e., the functional level, where word order has not yet been explicitly represented, and the positional level, where words have been explicitly ordered (Harley, 1998: 259-260). Moreover, he proposed another distinction between content words, such as nouns, adjectives, verbs, and adverbs, and function words, including determiners (a, an, the, etc.), prepositions (at, by, on, in), wh-words (what, who, when), and so forth, asserting that these two categories are processed differently. In addition, he claimed that some speech errors affect content words, some affect function words, but none affect both. For instance, we never find:

the pot of gold → the of pot gold
a tank of gas → a of tank gas (Clark & Clark, 1977)

An example of how we produce an utterance is provided in Garrett (1975, 1976, cited in Harley, 1998: 262):

  1. Message level: intention to convey particular meaning activates appropriate propositions
  2. SUBJECT = "mother concept," VERB = "wipe concept," OBJECT = "plate concept"; TIME = past; NUMBER OF OBJECTS = many
  3. (DETERMINER) N1 V [+PAST] (DETERMINER) N2 [+PLURAL]
  4. /mother/ /wipe/ /plate/
  5. (DETERMINER) /mother/ /wipe/+[PAST] (DETERMINER) /plate/+[PLURAL]
  6. /the/ /mother/ /wiped/ /the/ /plates/
  7. Low level phonological processing and articulations
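The derivation above can be loosely sketched in code. The following is only an illustrative toy, not Garrett’s own formalism: the function name `plan_utterance`, the dictionary keys, and the tiny lookup tables for inflected forms are all assumptions introduced here, and the determiner is simply hard-coded as "the".

```python
# Hypothetical toy lexicon of inflected forms (an assumption for illustration).
PAST_FORMS = {"wipe": "wiped"}
PLURAL_FORMS = {"plate": "plates"}

# Message level: the pre-verbal content to be conveyed.
MESSAGE = {
    "subject": "mother",
    "verb": "wipe",
    "object": "plate",
    "time": "past",
    "number_of_objects": "many",
}


def plan_utterance(message):
    # Functional level: content words are tied to roles, with no word order yet.
    roles = {
        "N1": message["subject"],
        "V": message["verb"],
        "N2": message["object"],
    }

    # Positional level: an ordered frame of (slot, grammatical feature) pairs.
    frame = [
        ("DET", None),
        ("N1", None),
        ("V", "PAST" if message["time"] == "past" else None),
        ("DET", None),
        ("N2", "PLURAL" if message["number_of_objects"] == "many" else None),
    ]

    words = []
    for slot, feature in frame:
        if slot == "DET":
            # Function words are spelled out only once the frame exists.
            words.append("the")
            continue
        stem = roles[slot]  # content word inserted into its slot
        if feature == "PAST":
            stem = PAST_FORMS[stem]
        elif feature == "PLURAL":
            stem = PLURAL_FORMS[stem]
        words.append(stem)
    return " ".join(words)


print(plan_utterance(MESSAGE))
```

The point of the sketch is the ordering of the steps: roles are fixed before word order, and function words and inflections are attached to slots in the frame rather than to the content words themselves.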

Drawing upon a wide assortment of speech errors, Garrett, like Fromkin, argued that semantic substitution errors, as in examples (43) and (44), show problems in lexical selection. What is more, errors involving sounds or morphological features have different characteristics. In word errors, for instance, the elements involved belong to the same syntactic class (in (43) "early" and "late" are both adjectives and complements in the sentence, while in (44) "fork" and "spoon" are both nouns and objects); they are not phonologically similar; they can move across words or syntactic boundaries; and accommodation takes place, as in the example below (taken from Garrett, 1976: 238, cited in Lyons, 1987: 355):

I’d hear one if I knew it (I’d know one if I heard it).

According to Garrett, word errors occur during the functional stage, when lexical items are inserted into the wrong slots, and, because there is often no phonological similarity between them, this must take place before any phonological information is filled in. Nevertheless, given that accommodation occurs, this stage must have access to the lexicon, which provides information about word forms.

On the other hand, errors involving sounds and grammatical morphemes have these characteristics: the elements come from different categories; they are phonologically similar (see (6)-(10)); they can move within a phrase (see (3)-(5) and (18)-(21)); and accommodation does not usually take place. In Garrett’s view, these errors occur at the positional stage, when phonological and morphological information is being specified. He also suggested that smaller chunks of the utterance are planned at the positional stage, given that such errors affect phonemes and morphemes. Furthermore, inasmuch as there is phonological similarity between the elements, the speaker must have partial access to phonological information, which often leads to nonsense words, as in (25).

As evidence for these models, and particularly Fromkin’s model, Clark & Clark (1977: 279-281) analyse the following transposition:

a weekend for MANIACS → a maniac for WEEKENDS (Capital letters indicate primary stress and italics secondary stress).

At Stage 1 (see Fromkin’s model above), the speaker builds a constituent referring to time ("weekend"). At Stage 2, he builds a syntactic outline, such as:

indefinite-article + noun + [2 STRESS] + preposition + noun + [PLURAL] + [1 STRESS]

Besides this syntactic outline, the speaker is also equipped with semantic knowledge, i.e., he knows that the indefinite-article + noun + [2 STRESS] denotes an indefinite period. At Stage 3, he selects the words weekend and maniac to go into the appropriate noun slots. However, he makes a speech error, inserting them into the wrong slots:

indefinite-article + maniac + [2 STRESS] + preposition + weekend + [PLURAL] + [1 STRESS]

At Stage 4, the speaker fills in the phonological information:

a + maniac + for + WEEKEND + [z]

Finally, at Stage 5 he applies any morphophonemic rules relevant to the utterance.
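The slot-filling account of this transposition can be illustrated with a short sketch. It is a simplification under stated assumptions: the function `fill_frame` and its slot dictionaries are hypothetical, the article is hard-coded as "a" (both nouns begin with a consonant), the plural is written orthographically as "s" rather than as the phonological [z], and primary stress is shown with capital letters.

```python
def fill_frame(first_noun, second_noun):
    # The syntactic outline: the plural and stress features belong to the
    # slots, not to whichever noun happens to be inserted into them.
    slots = [
        {"kind": "article"},
        {"kind": "noun", "stem": first_noun, "plural": False, "stress": 2},
        {"kind": "preposition", "form": "for"},
        {"kind": "noun", "stem": second_noun, "plural": True, "stress": 1},
    ]

    out = []
    for slot in slots:
        if slot["kind"] == "article":
            out.append("a")  # assumption: both nouns take "a"
        elif slot["kind"] == "preposition":
            out.append(slot["form"])
        else:
            form = slot["stem"] + ("s" if slot["plural"] else "")
            # Primary stress is rendered in capitals.
            out.append(form.upper() if slot["stress"] == 1 else form)
    return " ".join(out)


intended = fill_frame("weekend", "maniac")  # nouns in their proper slots
slip = fill_frame("maniac", "weekend")      # nouns exchanged
print(intended)
print(slip)
```

Because the plural marker and the stress pattern stay with the slots, exchanging the two nouns yields "a maniac for WEEKENDS" rather than "a maniacs for WEEKEND", which is the pattern the analysis is meant to explain.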

Much as these models—and especially Garrett’s model—account for a great number of errors, there are some shortcomings that should be considered. As Harley (1998: 263) notes, it is hard to pin down whether speech production is a serial process, as many speech errors, like blends (see (11)-(14) and (32)-(35)), can only be accounted for in terms of parallel processing, whereby two or more words are being retrieved. Since many blends are phonologically similar to their targets, as in "It’s difficult to valify" (validate + verify) (Harley, 1998), we can surmise that there have to be ‘two alternative messages [being] processed in parallel from the message to the phonological levels’ (ibid.). So, is there a model that can better explain these errors? Dell and Reich’s "spreading-activation" model is more promising.

4.2.1.3. Dell and Reich’s "spreading-activation" model

Dell and Reich’s "spreading-activation" model, which belongs to the connectionist tradition, can account for a great many speech errors; connectionist models have been applied more widely in psychology as well. Within this approach, processing takes place through the spread of activation throughout a network of nodes. For example, there are nodes for phonemes, morphemes, syllables, and so on, each node being connected to other nodes above and below it. The phoneme /s/, for instance, is connected to the syllables set, sap, and sip above it, and to such features as "fricative" or "voiceless" below it. When a word is selected for articulation, the morphemes comprising it are activated, and this activation spreads up and down the hierarchy, ‘eventually reaching the feature level, where the most highly activated features determine the output’ (Baars, 1992: 273).
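A minimal sketch may make the idea of spreading activation more concrete. It is not Dell and Reich’s actual model: the network below contains only the three syllable nodes mentioned above and their segments (with vowels given simplified orthographic labels), the feature level is omitted, and the function `spread` together with its `steps` and `decay` parameters is an assumption made purely for illustration.

```python
# Hypothetical two-level network: syllable nodes linked to their segments.
LINKS = {
    "set": ["s", "e", "t"],
    "sap": ["s", "a", "p"],
    "sip": ["s", "i", "p"],
}


def spread(source, steps=2, decay=0.5):
    # Links are bidirectional, so activation can flow both up and down.
    neighbours = {}
    for syllable, segments in LINKS.items():
        for segment in segments:
            neighbours.setdefault(syllable, set()).add(segment)
            neighbours.setdefault(segment, set()).add(syllable)

    activation = {node: 0.0 for node in neighbours}
    activation[source] = 1.0  # the selected item starts fully active

    for _ in range(steps):
        incoming = {node: 0.0 for node in neighbours}
        for node, level in activation.items():
            for other in neighbours[node]:
                # Each node passes a fraction of its activation to neighbours.
                incoming[other] += decay * level
        for node in activation:
            activation[node] += incoming[node]
    return activation


act = spread("sip")
for node in sorted(act, key=act.get, reverse=True):
    print(node, act[node])
```

In this toy network, selecting "sip" leaves "sap" more active than "set" after two steps, since "sap" shares two segments with the target and "set" only one. Activation flowing back up from shared segments is the kind of interaction between levels that a strictly serial model does not allow.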

Dell and Reich argued against Garrett’s two-stage model, claiming that the words involved in many speech errors, and in blends in particular, are more phonologically similar than one would expect by chance. Furthermore, they found that in sound errors, the elements involved tend to be real English words. For example, in (19) "reel" is a real English word, which shows that there is an interaction between lexical and phonological processes. In the case of exchange errors, the model holds that these occur when other sources activate the wrong phoneme from the next word.

5. Conclusion

To sum up, it should be reiterated that speech is a remarkable feat, and it is in its dysfluencies that its merit resides. More specifically, the evidence from hesitations and pauses, but fundamentally from slips of the tongue, opens up new vistas of study, as speech errors can tell us a great deal about the processes of speech planning and production. Speech errors are no longer regarded as emanating from the "subconscious," but rather as "concrete" misapplications at the level of lexical selection, word formation, and structural assemblage. There are many models attempting to "delve" into, and explain, speech errors (see Shattuck-Hufnagel’s scan-copier model, 1979; Laver’s Neurolinguistic Control model, 1980; Levelt’s Perceptual Loop Approach to Speech Repair, 1983; Stemberger, 1985, et al.), but the three models presented here are representative of two different approaches to speech production. Even though there are no definitive conclusions as to the nature of speech production, speech errors are nevertheless ‘a window into linguistic processes’, as Fromkin put it.

 

REFERENCES
