Topic 9 – The phonological system of the English language III: stress, rhythm and intonation. Comparison with the language of your community



1.1. Aims of the unit.

1.2. Notes on bibliography.



3.1. On the nature of communication and language: origins and general features.

3.2. The sound system: segmental and suprasegmental levels.

3.3. The suprasegmental level within a communicative competence theory.

3.4. The relevant role of the suprasegmental level within the oral discourse.

3.5. The suprasegmental level: at the core of conversational studies.



4.1.1. On defining stress.

4.1.2. Stress at word level.

4.1.3. The origins of stress placement.

4.1.4. Word accentual patterns: simple and compound words.

4.1.5. The influence of affixation on stress placement: simple words. Prefixes. Suffixes.

4.1.6. The influence of a word’s grammatical function on stress in compound words.

4.1.7. Fixed stress patterns in other categories: numbers, reflexives, and phrasal verbs.

4.1.8. Comparing English vs Spanish word stress patterns.


4.2.1. On defining sentence stress and rhythm: the stress-timed nature of English.

4.2.2. Content vs function words.

4.2.3. Strong vs weak forms.

4.2.4. Adjustments in connected speech. Linking. Assimilation. Dissimilation. Deletion. Epenthesis.

4.2.5. Comparing English vs Spanish sentence stress and rhythm.


4.3.1. On defining intonation: the notion of pitch.

4.3.2. Intonation units.

4.3.3. The main functions of intonation. Emphatic function. Discourse function. Attitudinal function. Grammatical function.

4.3.4. Intonation contours. Falling tone. Low falling. High falling. Rising tone. Low rising. High rising. Falling-rising tone. Rising-falling tone.

4.3.5. Comparing English vs Spanish intonation.




1.1. Aims of the unit.

This study aims to serve as the core of a survey on pronunciation, and in particular on the suprasegmental level, that is, stress, rhythm, and intonation. The sections reviewed in this unit are therefore intended to provide the reader with the following: (1) a historical overview of the issues involved in teaching pronunciation, such as how stress, rhythm, and intonation have been viewed from various methodological perspectives and what we know about the main methods in second language phonology; (2) a thorough theoretical grounding in the suprasegmental level; (3) insight into the ways in which this suprasegmental level intersects with other skills and areas of language, such as listening, inflectional morphology, and orthography; (4) a comparison of stress, rhythm, and intonation between the English and Spanish phonological systems, offered at the end of each chapter; (5) a conclusion on the issue; and finally, (6) a list of the bibliography used in this study.

1.2. Notes on bibliography.

Several valuable sources have been taken into account in preparing this unit. Thus, in Part 2, for a historical overview of the development of the phonological system, see Celce-Murcia; Algeo and Pyles, The origins and development of the English language (1982); and Gimson, An introduction to the pronunciation of English (1980). In Part 3, for a theoretical background to the phonological system, classic works are Crystal, Linguistics (1985); Gimson, An introduction to the pronunciation of English (1980); Brown, G. and G. Yule, Discourse Analysis (1983); and Canale, From Communicative Competence to Communicative Language Pedagogy (1983).

In Part 4, influential descriptions of the suprasegmental level are offered mainly, again, by Gimson (1980); Alcaraz and Moody, Fonética inglesa para españoles (1982); O’Connor, Better English Pronunciation (1988); Celce-Murcia, Teaching Pronunciation (2001); O’Connor and Arnold, The intonation of Colloquial English (1973); and van Ek and Trim, Vantage (2001).

In Part 5, among the many general works that incorporate recent phonological advances and present-day directions in teaching pronunciation, see especially Celce-Murcia (2001); and classic works by Gimson (1980) and O’Connor (1988). See also B.O.E. RD Nº 112/2002, by which Secondary Education and Bachillerato curricula are established in Murcia Autonomous Community, and also some information about Sócrates projects on Education and Culture in


This section, in briefly reviewing the history of the suprasegmental elements, provides a historical background for the theoretical part examined in the next section, and together they will prepare the reader for the descriptive account of stress, rhythm, and intonation in sections 3 and 4. From this historical perspective we are able to see that current issues in pronunciation, and especially in the prosodic elements of an act of communication, are not particularly new.

In fact, the earliest records of prosodic elements are bound up with the appearance of language forty or fifty thousand years ago, as part of an oral patrimony of humanity which provides us with a cultural identity in society (Goytisolo 2001). Thus, whenever we speak, we make known our identity to the outside world by means of our voice quality, as this is a personal and non-transferable feature. Moreover, our accent, as a more general phenomenon, may inform others about our regional and social origins. So, voice quality tells us who someone is, and accent tells us where they are from.

Since ancient times, tribal chiefs, shamans, bards and story-tellers have been in charge of preserving and memorising the narratives of the past for the future and, unconsciously, they have transmitted pronunciation patterns which are still being used today. According to Crystal (1985), there is a considerable body of religion and myth in many cultures concerning these oral traditions, where the language of worship is the product of particular care and attention on the part of a community. This motivation sometimes produced detailed studies of language which were great achievements.

For instance, in ancient India, the Hindu priests realized around the fifth century B.C. that the language of their oldest hymns, Vedic Sanskrit, was no longer the same as the language in everyday use, and therefore they needed to reproduce accurately the original pronunciation of their hymns in order to successfully preserve their oral ceremonies. The solution was to write a set of rules, known as sutras, in order to describe the grammar and pronunciation of the old language. This work contained a great number of phonetic and grammatical minutiae with methodological and theoretical principles which are still used in modern linguistics. Regarding suprasegmental elements, it is worth noting that the term for the “placing together” of sounds within and between words, that is, adjustments in connected speech, derives from Sanskrit: sandhi variation.

Later on we also find several references to the suprasegmental level. For instance, in the sixteenth century, the English grammarian John Palsgrave wrote about the pronunciation of French in his work L’esclarcissement de la Langue Francoyse (1530). In it, he explained the values of the French sounds, comparing them with the English, in a kind of phonetic transcription. Moreover, in the seventeenth century, the philosopher Thomas Hobbes devoted chapter IV, “Of Speech”, of his work Leviathan (1651) to oral discourse. There he makes reference to the suprasegmental level when he states that the most noble and profitable invention of all other was that of speech, consisting of names or appellations, and their connexion; whereby men register their thoughts, and also declare them one to another for mutual utility and conversation.

However, the most relevant contribution to the study of prosodic elements is also to be found in the seventeenth century, when a group of writers showed a considerable interest in speech, and therefore a great concern for the detailed analysis of speech activity and the establishment of systematic relationships between the English sounds. Among these writers, we shall mention John Wallis and Christopher Cooper, as they are considered to be the true precursors of modern scientific phoneticians. Their work is entirely phonetic in character and most of their observations on speech and pronunciation are still current today.

The linguist John Wallis examined, in his work Grammatica Linguae Anglicanae (1653), the sounds of English as constituting a system in their own right. According to him, his methods enabled him to teach not only foreigners to pronounce English correctly but also the deaf to speak. Christopher Cooper, for his part, attempted to describe and give rules for the pronunciation of English rather than to devise a logical system into which the sounds of English might be fitted. In his work The Discovery of the Art of Teaching and Learning the English Tongue (1687), he sets out ‘The Principles of Speech’ and gives rules for the relation of spelling and pronunciation in different contexts.

In the eighteenth century, modern languages began to enter the curriculum of European schools, and language teaching progressively developed from grammatical to more communicative approaches focusing on oral skills. As a result, special attention was paid to productive skills, such as speaking, and therefore to prosodic elements. Yet the main achievement of the century lies in its successful attempt to fix the spelling and pronunciation of the language by means of dictionaries, which provide us with information concerning the contemporary forms of pronunciation. In fact, the dictionaries of Samuel Johnson (1755), Thomas Sheridan (1780), and John Walker (1791) led to a standardization of pronunciation.

In the nineteenth century, phoneticians such as Henry Sweet, Wilhelm Viëtor, and Paul Passy promoted a great interest in speaking skills, which was to be developed by the Direct Method in the late 1800s and early 1900s. In fact, these phoneticians formed the International Phonetic Association in 1886 and developed the International Phonetic Alphabet (IPA). This alphabet made it possible to accurately represent the sounds of any language because, for the first time, there was a consistent one-to-one relationship between a written symbol and the sound it represented.

But it was in the twentieth century, during the 1940s, that the prosodic elements were to be studied in detail for the first time within a phonemic approach. The Reform Movement played an important role in the development, in the 1940s and 1950s, of Audiolingualism in the United States and the Oral Approach in Britain, for which pronunciation was very important and was taught explicitly from the start. Their main feature is that students imitate or repeat a sound, a word, or an utterance from a model given by the teacher or a recording. During the 1970s, the Silent Way (Gattegno 1976) was characterized by the attention focused on how words combine in phrases, and on how blending, stress, and intonation all shape the production of an utterance, by means of sound-color charts and word charts. In the 1980s, the Communicative Approach, currently dominant in language teaching, held that the primary purpose of language is communication, which brought a renewed urgency to pronunciation, since intelligible pronunciation is one of the necessary components of oral communication.

Up to this point, we can see that the emphasis in pronunciation instruction has been largely on the segmental level, that is, getting the sounds right at the word level, dealing with words in isolation or with words in very controlled and contrived sentence-level environments. In the mid- to late 1970s other approaches directed most of their energy to teaching suprasegmental features of language (i.e. rhythm, stress, and intonation) in a discourse context as the optimal way to organize a short-term pronunciation course for nonnative speakers. Today, however, we see signs that pronunciation teaching is moving towards a more balanced view. As a result, today’s pronunciation curriculum seeks to identify the most important aspects of both the segmental and suprasegmental levels and integrate them depending on the needs of any group of learners.


In this section we shall provide a linguistic background for the English phonological system so as to give the reader a relevant framework for the descriptive and pedagogical survey of stress, rhythm, and intonation presented in subsequent sections. Therefore, we shall review the notion of oral communication in relation to human communication systems and its main features, in order to establish a link between the concept of language within a communicative competence theory and the relevant role of stress, rhythm, and intonation patterns in social human behavior, and therefore in speech acts.

Then, once the link between language and communicative competence is established, we will offer a brief account of how the oral component has been approached through history, and in particular, the suprasegmental level (i.e. prosodic elements), within the main types of teaching approaches and techniques. Upon this basis, we will move on towards a description of each suprasegmental level, which will be approached from current pronunciation instruction and the most relevant figures in this field.

3.1. On the nature of communication and language: origins and general features.

Research in cultural anthropology has shown quite clearly that the origins of communication are to be found in the very early stages of life, when there was a need for animals and humans to communicate so as to carry out the basic activities of everyday life. However, even the most primitive cultures had a constant need to express their feelings and ideas by means other than the guttural sounds and body movements that animals used. Human beings’ constant preoccupation was how to turn thoughts into words. For our purposes in this study, it is worth, then, establishing a distinction between human and animal systems of communication, whose main difference lies in the way they produce and express their intentions. The most important feature of human language is the auditory-vocal channel which, in ancient times, allowed human beings to produce messages and, therefore, helped language to develop.

From the standpoint of a theory of language, we mainly distinguish two types of communication, namely verbal and non-verbal codes. Firstly, verbal communication is related to those acts in which the code is the language, both oral and written. Secondly, when dealing with non-verbal devices, we refer to communicative uses involving visual, sound, and tactile modes, such as kinesics, body movements, and also paralinguistic devices drawn from sounds (whistling), sight (traffic signs) or touch (Braille).

With respect to elements in the communication process, we shall follow the Russian linguist Roman Jakobson, whose productive model on language theory explains how all acts of communication, be they written or oral, are based on six constituent elements (1960). Thus, the addresser (speaker) sends a message (oral utterance) in a given context (socially determined) to the addressee (listener). Both the addresser and addressee need to share a code (language if verbal, and symbols if non-verbal) through a physical channel (phonological system) and establish contact to enter and stay in communication. For our purposes in this study, during an oral exchange the sound system shows relevant nuances between the message and its context by means of the suprasegmental level, that is, stress, rhythm, and intonation, which shall highlight important differences in the speaker and listener’s attitudes and meaning.

3.2. The sound system: segmental and suprasegmental levels.

Following Celce-Murcia (2001), one of the main features of the sound system of any language is its inventory of sounds, which consists of a combination of acoustic signals into a sequence of speech sounds, that is, consonants and vowels. In fact, all languages are somewhat distinctive in their vowel and consonant inventories, and in the way that these components combine to form words and utterances. Linguists refer to this inventory of vowels and consonants as the segmental aspect of language.

In addition to having their own inventory of vowels and consonants, languages also have suprasegmental features which transcend the segmental level, and involve those phenomena that extend over more than one sound segment. We may distinguish two main types of suprasegmental features. First of all, predictable features such as word stress, sentence stress, and rhythm, along with adjustments in connected speech (e.g. assimilation and linking, that is, the adjustments or modifications that occur within and between words in the stream of speech); and secondly, features that are sensitive to the discourse context and the speaker’s intent, such as prominence and intonation.

It has been claimed that a learner’s command of segmental features is less critical to communicative competence than a command of suprasegmental features, since the suprasegmentals carry more of the overall meaning load than do the segmentals. Celce-Murcia (2001) affirms that misunderstandings involving the mispronunciation of a segmental sound usually lead to more minor, repairable incidents than those involving suprasegmentals. For instance, an adult learner is discussing with a native speaker an incident in which her child had choked on something and could not breathe. “He swallowed a pill”, the learner says. “What kind of peel?” asks the native speaker. “An aspirin,” says the learner. “Oh, a pill! I thought you said peel,” responds the native speaker.

However, when dealing with suprasegmentals in connected speech, the misunderstanding is likely to be of a more serious nature. For instance, if the stress and rhythm patterns sound too nonnative-like, the speakers who produce them may not be understood at all. Moreover, learners who use incorrect rhythm patterns or who do not connect words together are at best frustrating to the native-speaking listener. And even more seriously, if these learners use improper intonation contours, they can be perceived as abrupt, impolite, or even rude.

In the section that follows, it is relevant to examine the relationship between the elements of the suprasegmental level within a communicative competence theory in order to make the reader aware of the essential role of prosodic elements in oral communication.

3.3. The suprasegmental level within a communicative competence theory.

Language has proved to be the principal vehicle for the transmission of cultural knowledge, and the primary means by which we gain access to the contents of others’ minds by means of verbal and non-verbal codes. Moreover, language is involved in most of the phenomena that lie at the core of social psychology, such as attitude changes, social perception, personal identity, social interaction, and stereotyping among others.

The way languages are used is constrained by the way they are constructed, particularly the linguistic rules that govern the permissible usage forms. Language has been defined as an abstract set of principles that specify the relations between a sequence of sounds and a sequence of meanings. How participants define the social situation, and their perceptions of what others know, think and believe, will affect the form and content of their acts of speaking. As a result, any communicative exchange is to be analysed at two interrelated levels: a pragmatic level, regarding its social context, and a linguistic level, regarding the linguistic forms participants use.

Therefore, it is at this point that the notion of communicative competence, coined by Dell Hymes in the 1970s and developed by Canale and Swain in the 1980s, comes into play in our study in order to highlight the relevance of suprasegmental devices in a speech act. Since the notion of communicative competence is concerned not only with purely grammatical competence but also with the area of pragmatics, that is, what is appropriate in a given social situation, we may define an ‘act of speaking’ as a set of complex and organized systems that operate in concert with the use of language in everyday communicative situations.

Linguistically speaking, although the notion of communicative competence is divided up into four subcomponents (i.e. grammatical, discourse, sociolinguistic, and strategic competence), we must note that these four competences are interrelated, and essential to each other, in order to achieve a successful communicative act. Similarly, although the suprasegmental level is to be found within the grammatical level alongside three other subcomponents (i.e. morphological, syntactic, and semantic), the phonological system is also interrelated with the way speakers and listeners make use of the other three linguistic levels for a communicative exchange to be successful.

3.4. The relevant role of the suprasegmental level within the oral discourse.

At the level of discourse analysis, acts of speaking can be regarded as actions intended to accomplish a specific purpose by verbal means, and in particular, by means of utterances. Looked at this way, utterances can be identified both in terms of their form, as assertions, questions, exclamations, or commands, and in terms of their intentions, such as statements, requests, and expressions of surprise, doubt, or anger, among others. Therefore, what matters is not only what we say but also how we say it. An example would be the word “yes” said with a firm tone of voice as opposed to a doubtful one.

It is at this point that the prosodic elements, that is, stress, rhythm, and intonation, emerge as an essential part of the oral production of these intended purposes. We must bear in mind that it is not the grammatical form that determines the speech act an utterance represents, but the way it is uttered. For instance, a sentence like “They had already eaten at 7 o’clock” may constitute quite different speech acts with different purposes, depending on the word we stress, the rhythm with which we utter this sentence, and the intonation we apply at the end of the sentence.

Considerations of this sort require a distinction between the literal meaning of an utterance and its intended meaning, since an act of speaking is embedded in a discourse made up of four main subcompetences where the use of prosodic features will convey different meanings to different sentences. We believe that efficient communication depends on the speaker’s ability to integrate grammatical knowledge of the English sound system with knowledge of the other subcompetences, that is, sociolinguistic, discourse, and strategic.

Thus, during an act of speaking, grammatical competence implies knowledge of lexical items, syntax, semantics, and in particular, of phonology, so that students can match sound and meaning by means of word formation, construct sentences using vocabulary, handle linguistic semantics, and especially, use language through spelling and pronunciation with regard to word and sentence stress, rhythm, and intonation patterns. The suprasegmental features are also present within sociolinguistic competence, as far as sociocultural rules of use and rules of discourse are concerned, to convey different meanings depending on the purposes of the interaction, for instance, asking for information, commanding, complaining or inviting.

Moreover, prosodic features come into play within discourse competence when the unity of a text is addressed by means of coherence and cohesion in meaning. Whereas cohesion facilitates the interpretation of a text, coherence relates different meanings depending on the different attitudes expressed by prosodic features, for instance, word stress on pronouns and synonyms, and sentence stress.

Finally, strategic competence highlights the fact that rhythm and intonation are essential in the negotiation of meaning to sustain communication with someone. This is the case, for instance, in a telephone conversation, when asking for slower and clearer repetition, or seeking clarification and paraphrase in order to understand key points.

Once we have stated the relevance of the suprasegmental level within oral discourse, we may go further by noting that these prosodic features lie at the core of speech act theory and conversational studies.

3.5. The suprasegmental level: at the core of conversational studies.

The introduction of cultural studies to language teaching methods in the 1980s highlighted the need for students to know not only the linguistic patterns of the foreign language under study but also the pragmatic use of verbal and non-verbal behaviour, that is, according to Hymes (1972), to know when to speak, when not, what to talk about with whom, when, where and in what manner. Language was considered as social behaviour, and therefore, inability to handle or insensitivity to foreign language discourse may impede communication more than grammatical inaccuracy.

This approach is related to sociolinguistic competence, as grammatical competence alone may mislead learners into thinking that certain rules of use of their native language may be applied in the foreign language with no change of meaning. This applies to both the segmental and the suprasegmental level. In order to produce effective discourse, learners need to approach their speech from a conscious sociolinguistic perspective, so as to gain considerable cultural information about communicative settings and roles.

For instance, Spanish learners of English should take into account that applying their native pitch when speaking English or using their native intonation contours may be perceived as nonnative-like, rude, or abrupt. It is important, then, to teach foreign language standards of pronunciation so that our students can express themselves in exactly the ways they choose to do so (rudely, tactfully, or in an elaborately polite manner), and to prevent them from being unintentionally rude or subservient by using certain intonation contours or inappropriate word or sentence stress.

Learners are expected to select the language forms that are appropriate in different settings, and with people in different roles and with different status, in order to achieve successful communication. Sometimes, unconsciously, we follow a large number of social rules which govern the way we speak, and affect the way in which we select sounds (e.g. when talking to older people, people of special rank, and so on). It is at this point that prosodic features are considered to be essential elements within language production, since they enable us to recognize pragmatic distinctions of formality, politeness and intimacy, among others.

In connected speech, the ability to link units of speech together with the appropriate stress, rhythm, and intonation, that is, with facility, and without inappropriate slowness or undue hesitation, is normally related to speech act theory. However, once we start to look at actual interaction, the suprasegmental level is particularly prominent in units of analysis wider than a speech act, namely conversational mechanisms such as turn-taking, the cooperative principle, and the notion of adjacency pairs.

The English philosopher of language H. Paul Grice (1969) was not the first to recognize that non-literal meanings posed a problem for theories of language use, but he was among the first to explain the processes that allow speakers to convey, and addressees to identify, communicative intentions that are expressed non-literally. For him, meaning is seen as a kind of intending, where the hearer and speaker recognize that there is something more than the literal meaning in a speech act. He proposed four general maxims: be truthful (quality), be informative (quantity), be relevant (relation), and be clear (manner). It is in the last maxim, that is, regarding manner, that prosodic elements are implicitly present (e.g. a Spanish learner of English applying their native intonation patterns to a sentence like “Shut the door, please” may sound abrupt instead of making a request).

Turn-taking is defined as a main feature of conversations, where one person waits for the other to finish his or her utterance before contributing their own. Note, however, that a person rarely states explicitly that they have finished their utterance and are now awaiting yours; rather, this is signalled by prosodic cues, such as intonation patterns, pause or hesitation.

Prosodic elements are also present in the notion of adjacency pairs discussed by Goffman (1976). This fundamental feature of conversation analysis is to be found in a question-answer session, and therefore, stress at word and sentence level, rhythm, and rising and falling intonation play an essential role in questions and replies. In some cases, speakers may make inferences about the reasons for an absent or unexpected response: the interlocutor may not have responded because he or she did not understand the question, or because he or she does not agree. As Goffman notes, a silence often reveals an unwillingness to answer. Dispreferred responses tend to be preceded by a pause, and feature a declination component, which is the non-acceptance of the first part of the adjacency pair.


We have so far dealt almost entirely with the historical and theoretical framework for the suprasegmental level. As we have seen, an earlier understanding of the term phonology was totally concerned with the ‘segmental’ aspects of the sound system of a language, and it was not until the early forties that the prosodic features of pronunciation first came to be studied in detail. It was then shown that the ways in which vowel and consonant combinations could be varied produced alterations in melody, loudness, speed of speaking and the like, and, at a pragmatic level, changes in meaning.

Regarding prosodic features, Gimson (1980) states that a sound, whose phonetic nature can be described and function in the language determined, has not only quality but also length, pitch, and a degree of stress as essential elements of prominence in speech. All three features may be measured physiologically or acoustically: length, as duration; pitch, as the frequency of the fundamental; and stress, as a measure of intensity, muscular activity, or air-pressure.

In general, these four factors, stress (muscular activity), pitch change (fundamental frequency), sound quality (weak and strong forms), and quantity (length/duration), may play an essential part in rendering a sound or syllable prominent. In speech, length variation is an important factor regarding the association of vowel quantity with accentuation. Sound qualities also contribute to an impression of prominence, mainly by means of the contrast between unaccented and accented syllables. Yet stress, strictly defined in terms of energy and loudness, is the least effective means of conveying prominence. It is pitch variation (high/low), more commonly known as intonation and thought of as a tone system, that is the most commonly used and efficient cue of prominence for the listener.

Ultimately, therefore, these factors will be described in terms of concrete labels such as stress, rhythm, and intonation. Although these labels may imply that they are distinct from each other, it is worth noting again that these three functional categories are embedded and interrelated in the stream of speech by means of relative prominence. The three subsequent sections will be devoted to a descriptive account of each suprasegmental level. Firstly, we shall focus on stress at word and sentence level, then on rhythm as a borderline element between word and sentence level, and finally on intonation in connected speech.


4.1.1. On defining stress.

Following Celce-Murcia (2001), stress is defined by means of stressed and unstressed syllables, since certain syllables of a word are more prominent than the others because of length, loudness, or pitch change. Thus, stressed syllables (or rather the vowels of stressed syllables) are often longer, louder, and higher in pitch than unstressed syllables, whose vowels are more centralized or neutralized. Therefore, we shall describe this phenomenon first in articulatory terms, and then in relation to the accentual patterns into which it is divided.

In articulatory terms, stress involves a greater outlay of energy as the speaker expels air from the lungs and articulates syllables. This increase in muscular energy and respiratory activity is undoubtedly what allows the native speaker to tap out the rhythm of syllables within a word or words within an utterance. Longer vowel duration in the stressed syllable and higher pitch are probably the most salient features of stress from the listener’s point of view.

Since the difference between stressed and unstressed syllables is greater in English than in most other languages, we must capture this differentiation in stress levels. English language-teaching texts generally speak of three levels of stress, defined as the pattern of stressed and unstressed syllables within a word. We refer to primary, secondary, and tertiary stress. This basic pattern, which is as much a part of a word’s identity as its sound sequence, may, however, be somewhat modified by the general accentual pattern of the longer utterance in which it occurs.

4.1.2. The origins of stress placement.

According to Celce-Murcia (2001), far from being random, stress placement in English words derives from the rather colorful history of the language. Today, roughly thirty percent of the vocabulary of English stems from its Old English origins and retains the native Germanic stress patterns; this includes kinship terms, body parts, numbers, prepositions, and phrasal and irregular verbs. In fact, of the 1,000 most frequently used words in English, approximately 83% are of Germanic origin.

Many of the remaining words have been acquired through historical events, such as the Norman Conquest, which brought much French vocabulary into English, or through the influences of Christian religion and academia, which have done much to secure the position of words of Greek and Latin origin in the English language. Nowadays, new loan words continue to be assimilated into English and undergo similar changes in spelling and pronunciation as have words that entered the language in earlier eras – until they are no longer perceived as foreign and their origins are all but forgotten to users who do not study etymology.

Although loan words in English may sometimes retain the stress patterns of the language from which they derive, they are more often incorporated into the stress patterns of English, which imposes on them a more indigenous or Germanic stress pattern by moving the stress to an earlier syllable, often the first. We can see this in borrowings such as GRAMmar (from French gramMAIRE) and CHOColate (from Spanish chocoLAte). In fact, the longer a borrowed word has been in the English language, the more likely it is that this type of stress shift will occur.

4.1.3. Stress at word level.

According to some phoneticians (Celce-Murcia 2001), there are as many as six levels of word stress, not all of which are readily discernible. However, for pedagogical purposes, we will adhere to the conventional designation of three levels, which are often referred to as strong, medial, and weak, as they best represent what occurs on the syllable level, or alternatively as primary, secondary, and tertiary stress. The designation primary refers to those syllables taking the tonic or nuclear accent, which therefore sound with more force than the rest; secondary refers to stressed syllables with pretonic accent, which are not as strong as the primary stress; and finally, the designation tertiary refers to unstressed syllables.

There are, however, some general orthographic considerations to be taken into account when placing stress at word level. For instance, regarding primary stress, vocalic groups will only remain together if they form a diphthong or triphthong in English (i.e. ‘so-cial), which is not the case for those divided by an accent in between (i.e. ,so-ci-‘ol-ogy). Moreover, according to Gimson (1980), initial consonant clusters (i.e. p, t, k, b, d, g, m, n, l, f, v, s, h + l, r, j, w; or sp, st, sk + l, r, j, w) are considered to be part of the next syllable and cannot be separated (i.e. geo’graphic; slightly; in’spec-tor).

Regarding secondary stress, we shall mention several rules to be applied: (1) firstly, there must be at least two syllables of distance between primary and secondary stress in the same word for rhythmic reasons (i.e. ,meteoro’logical); (2) secondly, when two accents meet in the same word, the first one is to be considered the secondary stress, and the next, the primary stress (i.e. ‘four’teen becomes ,four’teen); and (3) thirdly, when the primary stress is preceded by several secondary accents, the secondary stress nearest to it is weaker than the rest (i.e. ,in-ter-de-,no-mi-‘na-tio-nal; ,in-ter-,dis-ci-‘plin-ary).

To indicate strongly stressed syllables or primary stress in phonetic transcription, the established convention is a superscript accent mark (‘) placed on the upper left-hand side before the syllable, which may be substituted by an apostrophe if it is not available in the software being used; to indicate lightly stressed syllables or secondary stress we use a subscript accent mark (,) placed on the lower left-hand side of the syllable; finally, unstressed syllables are not specially marked. This system of vertical subscript and superscript accents is likely to be quite intuitive, but not as visually commanding as other systems, such as capital letters or bubbles.

In fact, there are other systems of notation for marking stress in a written word that can help make the concept visual for students, for instance capital letters, boldface, bubbles, accents, and underlining. Although capital letters stand out well in print and are easy to create with a typewriter, usually only two levels of stress can be indicated. The addition of boldface type and bubbles opens up the possibilities for indicating additional levels. Also, in some dictionary pronunciation guides, accents are used, with an accent aigu (´) signaling primary stress, an accent grave (`) signaling secondary stress, and no symbol at all for unstressed syllables. Whatever system for marking stress teachers ultimately choose, they can add paralinguistic cues for reinforcement by humming, clapping, or tapping the stress pattern.
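As an illustration of how the two notations relate, the following Python sketch converts a word written with the accent-mark convention into the capital-letter convention. It is only a minimal sketch: the function name to_capitals is hypothetical, the input is assumed to be already divided into syllables with hyphens, and secondary stress is simply dropped, since capital letters usually show only two levels.

```python
# Marks assumed from the convention described above: a raised mark (curly
# opening quote or plain apostrophe) before a primary-stressed syllable and
# a comma before a secondary-stressed syllable.
PRIMARY_MARKS = ("\u2018", "'")
SECONDARY_MARK = ","


def to_capitals(marked_word):
    """Convert hyphen-separated, stress-marked syllables to capital-letter notation.

    Only two levels survive: primary stress becomes capitals, everything else
    (secondary stress and unstressed syllables) is written in lower case.
    """
    converted = []
    for syllable in marked_word.split("-"):
        syllable = syllable.strip()
        if syllable.startswith(PRIMARY_MARKS):
            converted.append(syllable[1:].upper())
        elif syllable.startswith(SECONDARY_MARK):
            converted.append(syllable[1:].lower())
        else:
            converted.append(syllable.lower())
    return "-".join(converted)


print(to_capitals(",in-ter-,dis-ci-\u2018plin-ary"))  # in-ter-dis-ci-PLIN-ary
```

The loss of the secondary marks in the output shows concretely why capital letters are less informative than the accent-mark system, even though they are more eye-catching.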

4.1.4. Word accentual patterns: simple and compound words.

It may be said that a word has a characteristic accentual or rhythmic pattern for speaker and listener alike, which is as much a part of a word’s identity as its sound sequence. This pattern may be modified by the general accentual pattern of the longer utterance in which it occurs, and it may lead to a reduction of unstressed vowels to schwa. Yet, a main feature of word stress in English is that it can occur on virtually any syllable, depending in part on the origin of the word. This apparent lack of predictability as to where the stress falls is confusing to learners from language groups in which stress placement is more transparent (i.e. Spanish learners).

In fact, there are different word patterns for the placement of stress within a word depending on the number of syllables it consists of. Thus, the first group takes a two-syllable pattern in which the primary stress usually falls on the first syllable whenever the unstressed syllable contains schwa, /i/, or the diphthongs /ou/, /ai/, and /ei/ (i.e. mother, language, yellow, fertile, always). However, sometimes, the primary stress falls on the second syllable when the unstressed syllable contains /i/ or schwa (i.e. believe, collect). The second group takes a three-syllable pattern in which words may be accentuated on the first, second, or third syllable (i.e. ‘wonderful, ‘excellent, ex’ample, en’gagement, under’stand, after’noon). The third group takes a four-syllable pattern in which primary stress may also fall on the first, second, or third syllable (i.e. ‘dictionary, ‘nationally, for’getfulness, es’tablishment, expec’tation, tele’vision). The last group is formed by words with five or more syllables, in which the primary stress may fall on any of several syllables (i.e. ‘dedicatory, un’comfortably, archae’ologist, nationa’listic, experimen’tation).

Factors that influence stress placement include (1) the historical origin of a word as we have already seen, (2) affixation, and (3) the word’s grammatical function in an utterance. One important difference between words of Germanic origin and those of non-Germanic origin is the way in which stress is assigned. For words of Germanic origin, the first syllable of the base form of a word is typically stressed (i.e. FAther, YELlow, TWENty, HAMmer, WAter). Today, even many two-syllable words that have entered English through French and other languages have been assimilated phonologically and follow the Germanic word stress pattern (i.e. MUsic, DOCtor, FLOWer, FOReign, MANage).

According to Gimson (1980), we may distinguish between simple and compound words. Simple words are called polysyllabic whereas compounds are called multisyllabic. They both undergo different stress patterns, and it is worth bearing in mind that the syllabic division in English is not made according to orthography (as in Spanish) but to pronunciation. It is possible to give rules governing the relationship of accentuation and the spelling of English simple and compound words.

Words that have not been assimilated to the Germanic pattern have less predictable word stress in their base forms, but stress is often predictable if certain affixes or spellings are involved. Therefore in our next section we shall examine within this predictable group (4.1.5.) how affixation may affect stress on simple words, and then, how the word’s grammatical function in an utterance may affect stress on compound nouns (4.1.6.), as well as the effect of stress on (4.1.7.) other means of accentual patterns, such as numbers, reflexives, and phrasal verbs. The way a word’s grammatical function affects stress on words in an utterance will be examined again within the framework of rhythm and intonation patterns (sections 5 and 6 respectively).

4.1.5. The influence of affixation on stress placement: simple words.

In general, there are certain relatively simple rules involving the influence of word affixes on accentuation, which have sufficient general applicability for foreign language learners. In the following two subsections, we shall examine the influence of prefixes and suffixes on stress placement in simple words.

Prefixes.

With respect to prefixes, those words, such as nouns, adjectives, and verbs, containing prefixes tend to be strongly stressed on the first syllable of the base or root element, with the prefix either unstressed or lightly stressed (i.e. nouns: surPRISE, proPOSal, aWARD; adjectives: unHEALTHy, aSLEEP, inCREDible; verbs: deCLARE, exPLAIN, forGET).

In English, prefixes tend to fall into one of two categories: (1) firstly, prefixes of Germanic origin and (2) secondly, prefixes of Latinate origin. Among (1) the Germanic prefixes we may mention: a-, be-, for-, fore-, mis-, out-, over-, un-, under-, up-, and with- (i.e. awake, belief, forgive, forewarn, mistake, outrun, overdo, untie, understand, uphold, and withdrawn) and, as we may note, these words follow a general pattern by which there is no stress on the prefix and strong stress on the base.

It is worth noting that some of these prefixes (a-, be-, for-, and with-) are always unstressed in the words in which they occur whereas others receive light stress in prefix + verb combinations (i.e. un-: ,un’do, ,un’hook; out-: ,out’run, ,out’last; over-: ,over’look, ,over’take; under-: ,under’stand, ,under’pay). However, an exception to this general rule occurs when the prefixed word functions as a noun and has the same pattern as a compound noun. As a result, the prefix tends to be strongly stressed (i.e. ‘forecast, ‘outlook, ‘overcoat, ‘underwear, ‘upkeep).

The second category is (2) prefixes of Latinate origin, in words which usually receive strong stress on the word base and not on the prefix. These include a(d)-, com-, de-, dis-, ex-, en-, in-, ob-, per-, pre-, pro-, re-, sub-, and sur- (i.e. com’plain, dis’play, in’habit, per’suade, sub’divide, and so on). We must note that, unlike Germanic prefixes, most Latinate prefixes are unstressed when part of a verb. Among the most frequent we may mention com- (also co-, col-, con-, cor-; i.e. com’mand), dis- (i.e. dis’turb), pro- (i.e. pro’test), and ex- (i.e. ex’tend).

However, when these prefixes are part of a word that functions as a noun, the prefix often receives strong stress (i.e. a difficult PROject compared to they proJECT…). We note that the influence of a word’s part of speech on its stress pattern is dealt with more thoroughly in sections 4.1.6, 5 and 6.

Suffixes.

With respect to suffixes, they affect word stress in one of three ways: (1) firstly, they may have no effect on the stress pattern of the root word; (2) secondly, they may receive strong stress themselves; (3) and thirdly, they may cause the stress pattern in the stem to shift from one syllable to another.

Within the first group, we find (1) neutral suffixes, which have no effect on the stress pattern of the root word and are Germanic in origin. These suffixes include, for instance, –hood (i.e. brotherhood), –less (i.e. careless), –ship (i.e. kinship), and –ful (i.e. forgetful). Other neutral suffixes which are not all of Germanic origin, but which function in the same way include: –able (i.e. unable), –al (i.e. noun suffix, chemical), –dom (i.e. stardom), –ess (i.e. princess), –ling (i.e. yearling), –ness (i.e. darkness), –some (i.e. troublesome), –wise (i.e. clockwise), and –y (i.e. silky). In fact, as a general rule, words with Germanic or neutral suffixes (whether the stem is of Germanic origin or not) still tend to maintain the stress pattern of the base form (i.e. BROTHer, unBROTHerly; HAPpy, HAPpiness, unHAPpiness; EAsy, unEAsily).

Within the second group, we find (2) suffixes that, unlike the Germanic ones, have come into the English language via French: –eer (i.e. volun’teer, engi’neer), –esque (i.e. gro’tesque, ara’besque), –eur/–euse (i.e. chaf’feur, chan’teuse), –ette (i.e. cas’sette, bassi’nette), –ese (i.e. Suda’nese, Vietna’mese), –ique (i.e. tech’nique, an’tique), –oon (i.e. bal’loon, sa’loon), and –et /ey/ (i.e. bal’let, bou’quet). These suffixes often cause the final syllable of a word to receive strong stress, with other syllables receiving secondary or no stress. As a general tendency, the longer a word remains as part of the English vocabulary system, the greater is the tendency for stress to shift toward the beginning of a word. Hence, note the coexistence today, for instance, of the pronunciations cigarETTE and millionAIRE (where the stress is on the final element) and CIGarette and MILLionaire (where the stress is on the first element).

Finally, within the third group, we include (3) suffixes that can also cause a shift of stress in the root word, that is, when added to a word, they can cause the stress to shift to the syllable immediately preceding the suffix. Note the stress shift caused by the addition of the following suffixes to the root word: –eous (i.e. from root word ad’vantage to root with suffix advan’tageous); –graphy (i.e. ‘photo, pho’tography); –ial (i.e. ‘proverb, pro’verbial); –ian (i.e. ‘Paris, Pa’risian); –ic (i.e. ‘climate, cli’matic); –ical (i.e. e’cology, eco’logical); –ious (i.e. ‘injure, in’jurious); –ity (i.e. ‘tranquil, tran’quility); and –ion (i.e. ‘educate, edu’cation).

Besides, adding these suffixes to a word not only brings about a shift in stress but also a change in the syllable structure or syllabification, causing vowel reduction or neutralization in the unstressed syllables to schwa (i.e. a’cademy, aca’demic, and acade’mician; and ‘photograph, pho’tography, and photo’graphic, where the syllables preceding the stress are reduced to schwa). In certain cases, suffixation may also cause a complete change in vowel quality (i.e. page /ei/ vs. paginate /ae/, and mime /ai/ vs mimic /i/).

Finally, it is important to note that in cases where the base and the suffix have different historical origins, it is the suffix that determines the English stress pattern. For example, Germanic suffixes such as –ly and –ness cause no shift in stress (i.e. ‘passive, ‘passively, ‘passiveness) whereas the addition of the Latinate suffix –ity to the same word does (i.e. compare ‘passive to pas’sivity). This stress shift would extend even to a base word of Germanic origin if it were to take a Latinate suffix (i.e. ‘foldable vs folda’bility).
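The three-way classification of suffixes described above can be summarised as a simple lookup. The following Python sketch is only illustrative: the function name classify_suffix and the category labels are hypothetical, the suffix lists are a subset of those given in this section, and matching is done on spelling alone, so it will misclassify words that merely happen to end in the same letters (for instance, lion ends in the letters of –ion).

```python
# Suffix lists taken from this section; labels are hypothetical shorthand:
#   "neutral"  = no effect on the stress pattern of the root
#   "stressed" = the suffix itself receives strong stress
#   "shifting" = stress moves to the syllable immediately before the suffix
SUFFIX_EFFECTS = {
    "neutral": ["hood", "less", "ship", "ful", "able", "al", "dom", "ess",
                "ling", "ness", "some", "wise", "ly", "y"],
    "stressed": ["eer", "esque", "eur", "euse", "ette", "ese", "ique", "oon"],
    "shifting": ["eous", "graphy", "ial", "ian", "ic", "ical", "ious",
                 "ity", "ion"],
}


def classify_suffix(word):
    """Return (suffix, effect) for the longest listed suffix, or None."""
    word = word.lower()
    matches = [
        (suffix, effect)
        for effect, suffixes in SUFFIX_EFFECTS.items()
        for suffix in suffixes
        if word.endswith(suffix) and len(word) > len(suffix)
    ]
    if not matches:
        return None
    # Prefer the longest match so that -ity wins over -y and -ical over -al.
    return max(matches, key=lambda match: len(match[0]))


for example in ["brotherhood", "volunteer", "photography", "passivity", "passively"]:
    print(example, classify_suffix(example))
```

Choosing the longest matching suffix reflects the point made above that it is the suffix, not the base, that determines the stress pattern: passively is treated as neutral (–ly) while passivity is treated as stress-shifting (–ity).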

4.1.6. The influence of a word’s grammatical function on stress in compound words.

In general, a compound noun is made up of two elements, which may be written as one word, hyphenated, or as separate words (as in armchair or tea-cup), and as a general rule, the first element of the compound is strongly stressed, whether the compound is simple or complex (i.e. ‘airplane (simple compound) vs ‘airplane wing (complex compound)). We may distinguish three major compound patterns: (1) noun + noun compounds (i.e. sunglasses, cowboy), (2) adjective + noun compounds (i.e. blackboard, hot dog), and (3) noun + verb patterns (i.e. typewrite, babysit).

It is worth noting that, although noun compounds are more frequent in English than adjective compounds and verb compounds, the three of them follow the same stress patterns, that is, primary stress falls on the first element of the compound and secondary stress on the second. Moreover, since both elements of these three patterns receive stress, they do not exhibit any vowel reduction to schwa, except for compounds with –man, which often have the reduced vowel schwa in the –man syllable (i.e. postman, fireman).

Regarding (1) noun + noun compounds, stress will vary between such “true” noun compounds and words that look like noun compounds but are actually functioning as adjective + noun sequences. Stress and context are essential, then, to establish which type of word sequence we are dealing with. For instance, the noun compound in: I always use ‘cold ,cream functions as a noun + noun sequence because the primary stress is placed on the first element of the compound, and it means “I always use face cream”.

However, in a sentence with (2) an adjective + noun sequence, like I always use ,cold ‘cream, the first element is carrying a secondary stress, and functions simply as an adjective modifying the noun ‘cream, which carries the primary stress, and it means “I always use well-chilled cream”. Hence, we may find word sequences that can function as either noun compounds or adjective + noun phrases depending on stress and context, such as greenhouse, darkroom, blackboard, and hot plate.

Adjective compounds, which are often hyphenated when written, actually take two stress patterns. The first pattern, where the first element carries the primary stress and the second element carries the secondary stress, tends to be used when the adjective compound modifies a noun (i.e. a ‘well,trained dog and a ‘second,hand jacket). The second pattern takes the secondary stress on the first element and the primary stress on the second element when the adjective compound occurs in utterance-final position (i.e. This salesman is ,middle-‘aged or He is really ,good-‘looking).

Finally, (3) verb compounds usually take only one stress pattern, where the primary stress falls on the first element and the secondary stress falls on the second element in the compound (i.e. ‘baby,sit). Note that stress will also vary between such “true” verb compounds, which consist of a noun and a verb, where the noun element receives primary stress and the verb element secondary stress (i.e. “Did you ‘type,write that report for me?”), and words that only look like verb compounds.

In those cases where words look like verb compounds but are actually functioning as prefix + verb sequences, it is the verb that receives primary stress and the prefix secondary stress or no stress (i.e. “Can you re’heat those leftovers for me?”).

4.1.7. Fixed stress patterns in other categories: numbers, reflexives, and phrasal verbs.

Fixed stress patterns in other categories include cardinal and ordinal numbers, reflexive pronouns, and phrasal verbs. First of all, regarding (1) numbers, we must note that both cardinal and ordinal numbers have predictable stress on the first syllable when representing multiples of ten, that is, 20, 30, 40, 50, and so on (i.e. ‘twenty, ‘twentieth, ‘thirty, ‘thirtieth, etc). However, two different stress patterns are possible with the –teen numbers and their ordinal counterparts (i.e. ‘thirteenth and thir’teenth).

In general, according to Celce-Murcia (2001), native speakers tend to stress the first syllable when the number comes before a noun in attributive position (i.e. the ‘twentieth century) and when counting, whereas placing the stress on the second syllable is more common in phrase-final or utterance-final position, and when speakers are trying to make a deliberate distinction between the ten and teen digits. In these cases, the second pattern is to be chosen in order to differentiate confusing pairs of words such as thirteen and thirty.

We must not forget that the –teen numbers are compounds, that is, combinations of two or more base elements (i.e. cardinal and ordinal numbers + teen/ty + (th)). Consequently, all hyphenated numbers (i.e. thirty-seven, ninety-four) will follow compound patterns, where the placement of stress has two possible settings depending on the context.

The first pattern will place primary stress on the first element, firstly, if a number is used without another number as a contrast (i.e. He lent me ‘fifty-five dollars); and secondly, if the multiple of ten is in contrast or is given special emphasis (i.e. I said ‘forty-one, not ‘fifty-one). On the contrary, the second pattern will place primary stress on the second element, firstly, if the number is in utterance-final position (i.e. In March, she will be thirty-‘two); and secondly, if it is the second number in the compound that is contrasted (i.e. I said twenty-‘two, not twenty-‘three).
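The choice between the two patterns for hyphenated numbers can be expressed as a small decision procedure. The Python sketch below is a simplification under stated assumptions: the function and parameter names are hypothetical, and the decision to let contrast take priority over utterance-final position is an assumption made here for illustration rather than a rule stated in the sources.

```python
def hyphenated_number_stress(utterance_final=False, contrast_on=None):
    """Return which element of a hyphenated number takes primary stress.

    contrast_on is None (no contrast), "tens" (the multiple of ten is
    contrasted or emphasised) or "units" (the second number is contrasted).
    Assumption: contrast is checked before position in the utterance.
    """
    if contrast_on not in (None, "tens", "units"):
        raise ValueError("contrast_on must be None, 'tens' or 'units'")
    if contrast_on == "tens":
        return "first"
    if contrast_on == "units":
        return "second"
    # No contrast: position in the utterance decides.
    return "second" if utterance_final else "first"


print(hyphenated_number_stress())                      # first
print(hyphenated_number_stress(utterance_final=True))  # second
print(hyphenated_number_stress(contrast_on="units"))   # second
```

The three calls correspond to a number used without contrast before a noun, a number in utterance-final position, and a sentence such as I said twenty-‘two, not twenty-‘three.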

Regarding (2) reflexive pronouns, we must note that this is a grammatical category that exhibits complete predictability of stress since the second element (pronoun + self/selves) receives primary stress in virtually any environment (i.e. my’self, your’self, them’selves). On the other hand, (3) phrasal verbs, which consist of two or three words and are composed of verbs followed by adverbial particles and/or prepositions, are actually informal colloquial verbs of Germanic origin that can often be paraphrased with a more formal single verb of Latinate origin (i.e. Germanic “look at”, Latinate “regard”; and similarly: look over and peruse, talk about and discuss, talk up and promote).

Prepositions are the second element of some two-word phrasal verbs or the third element of three-word phrasal verbs. Among the most common, we include: about, at, for, from, of, to, and with. Among the most common adverbial particles in two-word verbs, we may mention: across, ahead, along, away, back, behind, down, in(to), off, on, over, under, and up. Prepositions and adverbial particles follow different stress patterns since they fall into different grammatical categories. Nouns, verbs, adjectives, and adverbs tend to receive stress in a sentence, whereas articles, auxiliary verbs, and prepositions do not. This helps explain why prepositions in phrasal verb units are unstressed and why adverbs receive stress.

In fact, we can classify two-word and three-word phrasal verbs into three main patterns: (1) verb head + unstressed particle (i.e. ‘talk about, ‘look at); (2) verb head + stressed particle (i.e. ‘figure ‘out, ‘take ‘over); and (3) verb head + stressed particle + unstressed particle (i.e. ‘run a’way with, ‘talk ‘down to). In all three patterns, the verb head has at least one stressed syllable and the following elements are either unstressed (if functioning as prepositions) or stressed (if functioning as adverbial particles). These stress patterns appear when phrasal verbs are spoken in isolation or when the phrasal verb represents the last piece of new information in the predicate (i.e. “She’s ‘looking at it”, “They were ‘standing a’round”, and “He ‘ran a’way with it”).
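The three patterns follow mechanically from whether each element after the verb head is a preposition or an adverbial particle. The following Python sketch illustrates this with the word lists given in this section; the function name phrasal_verb_stress is hypothetical, the lists are not exhaustive, and any word outside them is marked None rather than guessed.

```python
# Word lists as given in this section (not exhaustive).
PREPOSITIONS = {"about", "at", "for", "from", "of", "to", "with"}
ADVERBIAL_PARTICLES = {"across", "ahead", "along", "away", "back", "behind",
                       "down", "in", "into", "off", "on", "over", "under", "up"}


def phrasal_verb_stress(phrasal_verb):
    """Return (word, stressed) pairs for a two- or three-word phrasal verb.

    stressed is True, False, or None when the word is in neither list.
    """
    head, *rest = phrasal_verb.lower().split()
    pattern = [(head, True)]  # the verb head always carries stress
    for word in rest:
        if word in ADVERBIAL_PARTICLES:
            pattern.append((word, True))
        elif word in PREPOSITIONS:
            pattern.append((word, False))
        else:
            pattern.append((word, None))
    return pattern


print(phrasal_verb_stress("look at"))        # [('look', True), ('at', False)]
print(phrasal_verb_stress("take over"))      # [('take', True), ('over', True)]
print(phrasal_verb_stress("run away with"))  # [('run', True), ('away', True), ('with', False)]
```

Returning None for unlisted words keeps the sketch from asserting a stress pattern that the lists above do not support.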

4.1.8. Comparing English vs Spanish word stress patterns.

Stress placement in English words is for the most part a rule-governed phenomenon, and a primary dilemma our Spanish students must face. It should be a part of the English as a Second Language pronunciation curriculum for two main reasons. Firstly, foreign language learners need to understand that English is a stress-timed language in which suprasegmental features shape connected speech. Secondly, they also need to understand that even if all the individual sounds are pronounced correctly, incorrect placement of stress can cause misunderstanding.

Yet, we must take into account that the main problem for Spanish students of English is hearing and predicting where stress falls in words. As mentioned earlier, word stress in English is not nearly as predictable as it is in languages such as French or Polish; nor does English indicate regularly placed stress patterns through stress or accent marks in the spelling, as is the case in Spanish.

Initially, learners need to understand that a basic characteristic of every English word containing more than one syllable is its stress pattern. Thus, our first step as teachers is to clarify the systematicity of stress placement in words. Firstly, by showing how native speakers highlight a stressed syllable by means of length, volume, and pitch; secondly, by showing how they produce unstressed syllables often with vowel reduction from strong forms to weak forms with schwa; and thirdly, by showing what the three main levels of stress in English are, namely primary, secondary, and tertiary.

Stress in noun compounds is often misplaced by Spanish learners of English, who tend to place primary stress on the second noun of the compound rather than the first, as in “I’d like to have a hot ‘dog, please”. Therefore, because of the complexity of word stress rules in general, classroom explanations must be reinforced with both in-class and out-of-class opportunities for students to make predictions about stress placement and apply any new rules they have been exposed to in class.

In the following section we shall deal first with rhythm and sentence stress in connected speech, whose theoretical framework will lead us to examine intonation as the third and last element of the three suprasegmental levels.


In this section, we shall examine the stress-timed nature and rhythm of English and its connection to word stress, since this involves knowing the stress patterns for the individual multisyllabic words in an utterance. In addition, we shall provide the reader with clear guidelines concerning which words in a sentence tend to receive stress, that is, content and function words, as part of a selection process on stressing key words in an utterance by means of strong and weak forms.

4.2.1. On defining sentence stress and rhythm: the stress-timed nature of English.

The previous section on word stress provides a useful basis for understanding how stress functions beyond the word level, that is, sentence stress, and therefore, rhythm in connected speech. An utterance consisting of more than one word exhibits features of accentuation that are in many ways similar to those in polysyllabic words, that is, depending on stress and context. However, in sentence stress, the syllabic prominence is determined mainly by the meaning which the utterance is intended to convey.

But the meaning of an utterance is largely conditioned by the situation and context in which it occurs. Thus, it must be expected that the freedom of accentual patterning of the utterance and, in particular, of the situation of the primary (tonic) accent will be considerably curtailed by the constraints imposed by the contextual environment. In the case of new information, or an opening remark, there is a greater scope for variations in meaning pointed by accentuation.

Hence, successive quality and quantity changes shall determine the relationship of the words in the utterance by means of accent and prominence. In fact, the combination of unstressed, secondary, and primary stressed elements in multisyllabic words is a relevant characteristic of English utterances. Therefore, we shall define sentence stress as the various stressed elements of each sentence that exist in both multisyllabic words and simple sentences.

Word and sentence stress combine to create the rhythm of an English utterance, that is, the regular, patterned beat of stressed and unstressed syllables and pauses. This rhythmic pattern is similar to the rhythm of a musical phrase, where the English language moves in regular, rhythmic beats from stress to stress, no matter how many unstressed syllables fall in between.

This stress-timed nature of English means that the length of an utterance depends not on the number of syllables (as it would in a syllable-timed language like Spanish) but rather on the number of stresses. In English rhythm, then, pauses are of great importance since they mark intervals. For instance, stress-timed rhythm is the basis for the metrical foot in English poetry and is also strongly present in chants, nursery rhymes, and limericks.

Besides, we must note that there is a basic hierarchy in correctly determining stress placement within an utterance when deciding which words would normally be stressed. In our next section, we shall examine this kind of words under the heading of content words versus function words.

4.2.2. Content vs function words.

In connected speech, accentual patterns are freer than those of the word and are largely determined by the meaning to be conveyed. In fact, we may distinguish two main types of words depending on the categories they represent: content and function words. Content words include main verbs, nouns, adjectives, adverbs, adverbial particles, possessive and demonstrative pronouns, interrogatives, and not/negative contractions. On the contrary, function words include auxiliary verbs, articles, personal pronouns, possessive and demonstrative adjectives, conjunctions, and prepositions.

Content words carry the most information and are, therefore, usually stressed, in particular nouns, main verbs, and adjectives. We also stress adverbs (i.e. always, quite, very, almost, etc), and adverbial particles in phrasal verbs (i.e. get away with, take off), as well as possessive pronouns (i.e. mine, yours, his, hers, etc) and demonstrative pronouns, which are words that point or emphasize (i.e. this, that, these, those). Moreover, we stress interrogatives, that is, words that begin information questions (i.e. who, what, when, and where). Negative contractions (i.e. can’t, mustn’t), and even the negative particle not when uncontracted, usually receive stress because of their semantic as well as syntactic prominence.

Concerning function words, they are more likely to be unaccented since words that modify the lexically important nouns and verbs (such as articles and auxiliary verbs) tend not to be stressed. Likewise, words that signal information previously mentioned (i.e. personal pronouns, relative pronouns, possessive and demonstrative adjectives) are usually unstressed. In these unstressed sentence elements, the vowels also tend to be reduced to schwa.
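The content/function distinction can be turned into a rough first-pass predictor of sentence stress. The Python sketch below is only a classroom-style illustration: the function name mark_sentence_stress and the small word list are hypothetical and incomplete, every word not on the list is treated as a content word, and no account is taken of words that change category (for example, have as a main verb or that as a demonstrative pronoun) or of contrastive emphasis.

```python
import string

# A small, hand-picked sample of function words (articles, auxiliaries,
# personal pronouns, possessive adjectives, conjunctions, prepositions).
# This list is an assumption for illustration, not an exhaustive inventory.
FUNCTION_WORDS = {
    "a", "an", "the",
    "am", "is", "are", "was", "were", "has", "have", "had",
    "do", "does", "did", "can", "will", "would", "must",
    "i", "you", "he", "she", "it", "we", "they",
    "me", "him", "her", "us", "them",
    "my", "your", "its", "our", "their",
    "and", "but", "or", "as", "than",
    "at", "for", "from", "of", "to", "with",
}


def mark_sentence_stress(sentence):
    """Write predicted stressed (content) words in capitals, others in lower case."""
    marked = []
    for token in sentence.split():
        # Ignore surrounding punctuation when looking the word up.
        core = token.strip(string.punctuation).lower()
        marked.append(token.lower() if core in FUNCTION_WORDS else token.upper())
    return " ".join(marked)


print(mark_sentence_stress("She is waiting for the bus."))  # she is WAITING for the BUS.
print(mark_sentence_stress("I can't find my keys"))         # i CAN'T FIND my KEYS
```

Note that the negative contraction can't is capitalised while can would not be, which mirrors the observation above that negative contractions are stressed even though the corresponding auxiliaries usually are not.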

4.2.3. Strong vs weak forms.

The English speaker is aware of a certain number of strong stresses or beats corresponding to those parts of the utterance to which he wishes to attach particular meaning and on which he expends great articulatory energy. The remaining words or syllables are weakly and rapidly articulated. Therefore, the syllables uttered with the greatest stress will be defined as the strong forms of a word, and those syllables which are weakly and rapidly pronounced, will be defined as weak forms.

In English, unlike Spanish, there are thirty-five common words which have both strong and weak forms, ranging from modal and auxiliary verbs to personal pronouns, prepositions, or conjunctions (i.e. and, as, but, than, that, he, him, does, am, are, was, has, can, must, some, at, for, from, etc.). Yet, we shall pronounce a word in its strong form, mainly for reasons of meaning, in the following cases: (1) firstly, whenever the word is meaningfully relevant in the utterance (i.e. Can I phone?, Have you finished?); (2) secondly, whenever the word is final in the group (i.e. No, I don’t; What’s that for?), although the personal pronouns are exceptions (i.e. he, him, his, her, them, us); and (3) thirdly, the negative particle not keeps its strong form whenever it is not attached to can, have, is, etc. (i.e. I hope not), its weak form occurring only in such contractions.

Weak forms are not pronounced alone or separate in the sentence, and therefore, they are not stressed. Their main characteristic is that they contain the vowel schwa. English people often think that when they use weak forms of a word, they are being rather careless in their speech, and believe that it would be more correct always to use the strong forms. However, English spoken with only strong forms sounds wrong. The use of weak forms is an essential part of English speech, and foreign language learners must learn to use them if they want to sound English.

4.2.4. Adjustments in connected speech.

So far we have dealt with processes such as word stress, sentence stress, and rhythm, and now we focus our attention on adjustments in connected speech, which are changes in pronunciation that occur within and between words due to their juxtaposition with neighbouring sounds. The main function of most of these adjustments is to promote the regularity of English rhythm, that is, to squeeze syllables between stressed elements and facilitate their articulation so that regular timing can be maintained (Celce-Murcia 2001).

In the sections that follow, we shall examine the elaborate language system whereby sounds are influenced by other sounds in their immediate environment, taking on different characteristics as a result. We must note that these processes are common to all languages, but here we shall discuss mainly the differences between English and Spanish. The processes to be discussed are those of linking, assimilation, dissimilation, deletion, and epenthesis, as they occur in connected speech.

Linking.

According to Alcaraz and Moody (1976), whenever the message is visually transmitted (i.e. written), the receiver may easily determine the word limits, as they are marked with blank spaces in between. However, they say, when the message is orally transmitted, the receiver is offered a chain of connected phonemes that will be chopped up according to his/her linguistic habits.

Yet, prepositions with articles, nouns, and adjectives are easily recognizable in speech, as are auxiliary and modal verbs. However, the speech chain may often be ambiguous and carry a double meaning in all languages. For instance, note the Spanish sequence “mujeres odiosas” and “mujeres o diosas”, and the English one “A Greek spy” and “A Greek’s pie”, where context is the key element in resolving this kind of ambiguity.

Therefore, even to the linguistically naive, a salient characteristic of much nonnative English speech is its choppy quality. The ability to speak English “smoothly”, to utter words or syllables that are appropriately connected, entails the use of linking (or liaison), which is the connecting of the final sound of one word or syllable to the initial sound of the next.

According to Celce-Murcia (2001), the amount of linking that occurs in native-speaker speech will depend on a number of factors, such as the informality of the situation, the rate of speaking, and of course the individual speech profile of the speaker. Thus, the amount of linking that occurs is not entirely predictable. However, linking occurs with regularity in the following five environments.

First of all, (1) linking with a glide towards the semiconsonants /j/ or /w/ (i.e. ei, ai, oi; au, ou). Such glides are common when one word or syllable ends in a vowel or diphthong and the next word or syllable begins with a vowel (i.e. say it, my own, toy airplane; blue ink, no art, how is it?). In this environment or after schwa, some speakers tend to add a linking or intrusive /r/ (i.e. I saw Ann, vanilla ice-cream).

Secondly, (2) when a word or syllable ending in a single consonant is followed by a word or syllable beginning with a vowel, the consonant is often produced intervocalically as if it belonged to both syllables (i.e. black and white, Macintosh apple). Thirdly, (3) when a word or syllable ending in a consonant cluster is followed by a word or syllable beginning with a vowel, the final consonant of the cluster is often pronounced as part of the following syllable (i.e. lef/t_arm /lef’ta:m/, fin/d_out).


Fourth, (4) when two identical consonants come together as a result of the juxtaposition of two words, they are articulated as one single consonant and we do not produce the consonant sound twice (i.e. stop pushing, rob Bill, short time, bad dog, quick cure, big gap, classroom monitor, less serious). And finally, (5) when a stop consonant is followed by another stop or by an affricate, the first stop is not released, which facilitates the linking (i.e. pet cat, blackboard, next train, big church).

Assimilation.

During this process, a given sound (the assimilating sound) takes on the characteristics of a neighbouring sound (the conditioning sound) in connected speech. Although the organs of speech involved appear to be taking the path of least resistance, such a characterization ignores the fact that assimilation is a universal feature of spoken language. It occurs frequently, both within words and between words, and there are three main types of assimilation in English: (1) progressive, (2) regressive, and (3) coalescent.

In (1) progressive assimilation, the conditioning sound precedes and affects the following sound. We distinguish two main examples in English: the regular plural /s/ vs. /z/ alternation, and the regular past tense /t/ vs. /d/ alternation, in which the final sound of the stem conditions the voiced or voiceless form of the suffix (i.e. bags /z/ vs backs /s/; moved /d/ vs liked /t/). This process also occurs in some contractions (i.e. it is /z/ – it’s /s/), and in some reductions to schwa (i.e. had to).

In (2) regressive assimilation, the assimilated sound precedes and is affected by the conditioning sound (i.e. good boy /gu:boi/). This type of phenomenon occurs commonly, and most cases involve a change in place of articulation or in voicing. A first example is found in the periphrastic modals has/have to, when expressing obligation, and used to, when expressing former habitual action: the final consonant of the verb is devoiced before the voiceless /t/ of to, whose vowel is reduced to schwa (i.e. have to /hafta/, has to /hasta/, used to /usta/). However, there are also some cases of a change in manner of articulation in informal speech (i.e. “Give me some money”, “Let me go”).

Secondly, another clear example of this phenomenon is reflected in the English spelling system, mainly in the four allomorphic variants of the negative prefix in- (i.e. in-, im-, il-, ir-, as in the words indifferent, impossible, illogical, and irrelevant). The third type occurs in rapid native-speaker speech, where a sibilant (i.e. /s/ or /z/) is followed by certain consonants, as in the examples Swiss chalet or his shirt, where the sibilant is assimilated to the next sound.

With stop consonants, a final /t/ or /d/ may assimilate to a following initial /p, k/ or /b, g/ respectively (i.e. Saint Patrick, pet kitten, good bye, good girl). With respect to final nasal consonants, especially /n/, the same phenomenon occurs by adjusting their place of articulation to that of a following conditioning consonant (i.e. “He’s in pain”, “it rains in May”, “They’re in Korea”, “Be on guard!”).
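The place adjustments just described can be sketched as a simple lookup in Python. The table below is an illustrative assumption of ours: it covers only the pairs of sounds named in this section, and the function name is hypothetical. It should also be remembered that this assimilation is optional and depends on the rate and style of speech.

```python
# (word-final sound, following initial sound) -> assimilated final sound.
# Assumption: only the pairs mentioned in the text are listed.
PLACE_ASSIMILATION = {
    ("t", "p"): "p",   # Saint Patrick
    ("t", "k"): "k",   # pet kitten
    ("d", "b"): "b",   # good bye
    ("d", "g"): "g",   # good girl
    ("n", "p"): "m",   # in pain
    ("n", "m"): "m",   # in May
    ("n", "k"): "ŋ",   # in Korea
    ("n", "g"): "ŋ",   # on guard
}


def assimilate_final(final_sound: str, next_initial: str) -> str:
    """Return the final sound as it may be realised before next_initial."""
    return PLACE_ASSIMILATION.get((final_sound, next_initial), final_sound)


print(assimilate_final("n", "p"))  # in pain
print(assimilate_final("n", "k"))  # in Korea
```

Any pair that is not in the table is returned unchanged, which reflects the fact that the final sound keeps its own place of articulation elsewhere.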

Finally, in (3) coalescent assimilation, we find a reciprocal assimilation by which the first sound and second sound in a sequence come together and mutually condition the creation of a third sound with features from both original sounds. This process occurs most frequently in English when final alveolar consonants such as /s, z/ and /t, d/ or final alveolar consonant sequences such as /ts, dz/ are followed by initial palatal /j/.
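The outcome of each combination can be laid out in a short Python sketch. The mapping is our own summary of the standard results of this process (alveolar fricatives yield palatal fricatives, alveolar stops and stop sequences yield affricates), and the function name is hypothetical.

```python
# Final alveolar sound(s) + initial /j/ -> resulting palatalized sound.
COALESCENCE = {
    "s": "ʃ",    # this year
    "z": "ʒ",    # does your
    "t": "tʃ",   # that yours
    "d": "dʒ",   # would you
    "ts": "tʃ",  # lets your
    "dz": "dʒ",  # needs your
}


def coalesce(final_sounds: str, next_initial: str) -> str:
    """Merge the final sound(s) with a following /j/ where coalescence applies."""
    if next_initial == "j" and final_sounds in COALESCENCE:
        return COALESCENCE[final_sounds]
    # No coalescence: both sounds are kept as they are.
    return final_sounds + next_initial


print(coalesce("d", "j"))  # would you
```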

This type of assimilation is often referred to as palatalization, where the alveolar consonants become palatalized fricatives and affricates, respectively (i.e. “I’ll pass this year”, “Does your sister come?”, “Is that yours?”, “She lets your dog in”, “Would you mind moving?”, “He needs your help”). As with linking, the amount of assimilation depends on variables such as the formality of the situation, the rate of speech, and the style of the speaker.

Dissimilation.

Unlike assimilation, dissimilation occurs when adjacent sounds become more different from each other rather than more similar. For instance, a clear example of dissimilation would be to break up a sequence of three fricatives by replacing the second with a stop (i.e. fifths /fts/). This phenomenon is considered not to be an active process, and it is rare in English. Therefore, we shall not examine it thoroughly.

Deletion.

The deletion process is also known as omission, a process whereby sounds disappear or are not clearly articulated in certain contexts. It has two main representations: written and oral. Firstly, regarding written representation, deletion appears in contracted forms of auxiliary verbs plus the negative particle not (i.e. isn’t). Secondly, regarding the oral component, deletion phenomena appear in the following environments.

First, (1) the loss of /t/ when /nt/ is between two vowels or before a lateral /l/ (i.e. winter, mantle, enter); secondly, (2) the loss of /t/ or /d/ when they occur second in a sequence or cluster of three consonants (i.e. castle, whistle, exactly; windmill, kindness, hands); thirdly, (3) the deletion of word-final /t/ or /d/ in clusters of two at a word boundary when the following word begins with a consonant (i.e. blind man, East side). It is worth noting that there is no exception to this rule.

However, when the following word begins with a vowel, there is no deletion but resyllabification (i.e. blin/d eye, wil/d eagle). Then, through the loss of an unstressed medial vowel (also referred to as syncope), the unstressed vowel, schwa or /i/, drops out in some multisyllabic words following the strongly stressed syllable (i.e. chocolate, every, mystery, vegetable, different, reasonable). Note that if the last syllable is stressed, syncope does not occur (i.e. compare the verb ‘sepa,rate with the adjective ‘separate).

Also, we find another process known as aphesis, which is the loss of an unstressed initial vowel or syllable in highly informal speech (i.e. ‘cause, ‘round, ‘bout). Three further deletion processes are worth noting. Firstly, (1) the loss of the first non-initial /r/ in a word that has another /r/ in a following syllable (i.e. governor, surprise, temperature); secondly, (2) the loss of final /v/ in of, with a reduction to schwa, before words with initial consonants (i.e. lots of money, waste of time); and thirdly, (3) the loss of initial /h/ and voiced /ð/ in pronominal forms in connected speech (i.e. ask her, help him).

Epenthesis.

Epenthesis makes reference to the insertion of a vowel or consonant segment within an existing string of segments. The most important type of epenthesis in English occurs in certain morphophonological sequences such as the regular plural and past tense endings. Regarding regular plurals, an epenthetic schwa is added to break up clusters of sibilants or alveolar stops (i.e. places, buzzes), since progressive assimilation alone will not make the morphological endings sufficiently salient.

Regarding regular past tenses, for which we posit the –ed suffix, we have examples such as planted and handed. Finally, there are other cases of consonant epenthesis in words like prince and tense, which end in /ns/ and are pronounced with an inserted /t/ so that they sound just like prints and tents. In such cases, the insertion of the voiceless stop /t/ makes it easier for speakers to produce the voiced nasal plus voiceless fricative sequence. Besides, the same process adds a /p/ between the /m/ and /f/ in comfort.
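The interplay between progressive assimilation and vowel epenthesis in the regular endings can be sketched in Python as follows. The function names and sound sets are our own illustrative choices, and the stem-final sound is assumed to be supplied as a phonemic symbol rather than as spelling.

```python
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
ALVEOLAR_STOPS = {"t", "d"}


def plural_ending(stem_final: str) -> str:
    """Regular plural ending chosen by the final sound of the stem."""
    if stem_final in SIBILANTS:
        return "əz"  # epenthetic schwa breaks up the sibilant cluster
    # Progressive assimilation: the ending agrees in voicing with the stem.
    return "s" if stem_final in VOICELESS else "z"


def past_ending(stem_final: str) -> str:
    """Regular past tense ending chosen by the final sound of the stem."""
    if stem_final in ALVEOLAR_STOPS:
        return "əd"  # epenthetic schwa breaks up the alveolar stop cluster
    return "t" if stem_final in VOICELESS else "d"


print(plural_ending("g"), plural_ending("k"), plural_ending("s"))  # bags, backs, places
print(past_ending("v"), past_ending("k"), past_ending("t"))        # moved, liked, planted
```

The epenthesis check comes first in each function because, as noted above, voicing agreement alone would leave the ending insufficiently salient after a sibilant or an alveolar stop.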

4.2.5. Comparing English vs Spanish sentence stress and rhythm.

Regarding sentence stress and rhythm, the main difference between the English and Spanish phonological systems is that English is said to have a stress-timed nature whereas Spanish has a syllable-timed one. This means that, for Spanish students of English, maintaining a regular beat from stressed element to stressed element and reducing the intervening unstressed syllables can be very difficult, since their native tongue has syllable-timed patterns.

In Spanish, as well as in other syllable-timed languages (such as Italian, Japanese, French, and many African languages), rhythm is a function of the number of syllables in a given phrase, not the number of stressed elements. Thus, in Spanish, the rhythm unit is the syllable, which means that each syllable has the same length as every other syllable and there are not the constant changes of syllable length found in English word groups. Then, phrases with an equal number of syllables take roughly the same time to produce, and the stress received by each syllable is much more than in English (i.e. Spanish: “Los niños están en la calle”; French: “Les garçons sont dans la rue”; English: “’The children are in the ‘street”).
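The contrast between the two rhythm types can be made concrete with an idealized Python sketch. The duration values are arbitrary assumptions chosen only for illustration, and real speech merely approximates this regularity in either language.

```python
def syllable_timed_ms(n_syllables: int, syllable_ms: int = 200) -> int:
    """Idealized duration when every syllable takes the same time."""
    return n_syllables * syllable_ms


def stress_timed_ms(n_stresses: int, beat_ms: int = 500) -> int:
    """Idealized duration when only the stressed beats set the timing."""
    return n_stresses * beat_ms


# Two hypothetical phrases with two stresses each but different lengths.
short_phrase = {"syllables": 3, "stresses": 2}
long_phrase = {"syllables": 7, "stresses": 2}

for phrase in (short_phrase, long_phrase):
    print(
        syllable_timed_ms(phrase["syllables"]),
        stress_timed_ms(phrase["stresses"]),
    )
```

Under the syllable-timed model the longer phrase takes more time, whereas under the stress-timed model both phrases last the same, so the extra unstressed syllables must be compressed between the beats.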

As a result of these differences in stress level and syllable length, Spanish students tend to stress syllables in English more equally, without giving sufficient stress to the main words and without sufficiently reducing unstressed syllables. Overcoming this tendency involves knowing the English stress patterns for the individual multisyllabic words in an utterance and deciding which words in an utterance would be stressed. This can be practised by clapping or tapping out the rhythmic pattern of a poem which is read aloud.

In the pronunciation classroom it is highly relevant to explain and illustrate for students the stress-timed nature and rhythm of English since, when Spanish learners obscure the distinction between stressed and unstressed syllables in English, native speakers may fail to comprehend. In fact, Spanish students usually give all English syllables equal stress, and this actually hinders native speakers’ comprehension.

As we have seen previously, all five types of adjustments in connected speech reflect speakers’ attempts to connect words and syllables smoothly in the normal stream of speech. Sometimes underlying sounds are lost or modified (i.e. deletion and assimilation) whereas sometimes other sounds are added (i.e. epenthesis and linking).

In general, all these modifications seem to achieve firstly, ease of articulation for the speaker; secondly, preservation of the preferred English syllable structure; and thirdly, preservation of grammatical form. These phenomena are, in fact, working together to preserve stress-timed rhythm. In our next section, we shall deal with the third and last suprasegmental element, intonation in discourse.


In the previous sections we have discussed the phenomena of word stress, sentence stress, rhythm, and adjustments in connected speech, which are largely rule-governed but not particularly sensitive to discourse and the speaker’s intent. In the present section, we shall focus on those features of pronunciation that are quite sensitive to the discourse context and the speaker’s intent, namely, prominence and intonation, which serve to highlight important information and to segment speech.

4.3.1. On defining intonation: the notion of pitch.

Following Alcaraz (1976), intonation is the most difficult suprasegmental level to be systematically defined since it conveys not only general meaning (i.e. questions, statements, doubts, and so on) but also connotative features, such as personal and regional melodic characteristics, expressive signals of affection, happiness, and so on, and the speakers’ mental attitude.

In order to define intonation, it is first necessary to define pitch as the relative highness or lowness of the voice. This relative notion refers to the differentiated pitch levels of a given speaker, much like pitch variations in music. Following O’Connor (1988), every language has melody in it, and therefore, no language is spoken on the same musical note all the time. The voice goes up and down and the different notes of the voice combine to make tunes. For instance, ascending do, re, and mi represent progressively higher tones, or musical pitch.

There are four main levels of phonetic pitch in English: extra high, high, middle, and low. The function of pitch does not change the fundamental meaning of the word itself. Rather, it reflects the discourse context within which a word occurs. For instance, the one-word utterance “now”, produced with a rising pitch contour from middle to high, could signify a question: “Do you want me to do it now?”. Produced with a falling pitch contour from high to low, however, this same word could signify a command: “Do it now!” (Celce-Murcia 2001).

Normal conversation moves between middle and high pitch, with low pitch typically signalling the end of an utterance. The extra high level is generally used to express a strong emotion such as surprise, great enthusiasm, or disbelief, and it is the pitch level often used in contrastive or emphatic stress. English makes use of pitch variation over the length of an entire utterance rather than within one word, and this is the reason why it is known as an intonation language rather than a tone language.

If pitch represents the individual tones of speech, then intonation can be thought of as the entire melodic line which involves the rising and falling of the voice to various pitch levels during the articulation of an utterance. It is said to perform several unique functions, such as to emphasize a word or utterance, to mark grammatical types of sentences, to express the speaker’s attitude, and to highlight new information in a sentence.

Following two of the most relevant figures in this field, O’Connor and Arnold (1973), intonation would be defined, first, as meaningful, since it conveys denotative and connotative meanings; secondly, as systematic, since we are aware of the existence of common intonation units; and finally, as a characteristic feature of individuals, groups, and regional types.

4.3.2. Intonation units.

As we have seen earlier, just as individual utterances can be divided into words and these words into syllables, we can also divide the stream of speech into discrete stretches that form a semantically and grammatically coherent segment of discourse. These smaller units in the stream of speech are called thought groups, word groups, or tone groups, and they are essential in English intonation.

Within the tone group, stressed syllables are spoken in a regular rhythm, and unstressed syllables are made to fit in between the beats. The stressed syllables of words which convey lexical information (mainly nouns, adjectives, principal verbs and adverbs) are given prominence in the intonation pattern, unless the information has already been mentioned or is obvious in context. In that case, whilst continuing to mark the rhythmic beat, they are not given pitch prominence.

According to van Ek and Trim (2001), every tone group contains a nucleus which is usually marked by a left to right (also right to left) diagonal falling or rising mark. Many short utterances will comprise a single tone group, containing only one prominent syllable, which is then the nucleus of the tone group. In those cases where there is more than one prominent syllable, the last of these is the nucleus and the first is the head, which is usually marked with a rising mark above the line of writing. Following Gimson (1980), the terms ‘nucleus’ and ‘head’ correspond to ‘primary accent’ and ‘secondary accent’, respectively.

The head is usually marked by a jump up in pitch to a high-mid level. The actual pitch varies from mid to high, depending on the attitude of the speaker towards what he or she is saying and towards the listener. The higher the level, the more cheerful and friendly the speaker sounds. The high head is marked in the texts by an upright line before the syllable concerned, above the line of writing [‘]. Some common markers for these divisions or pauses are commas, semicolons, periods, and dashes. However, in spoken discourse a speaker may pause at points where such punctuation does not always occur in a written transcription of the utterance.

Non-prominent syllables, stressed or unstressed, which precede the head, are spoken on a low mid pitch (Gimson’s ‘secondary accent without pitch prominence’). They are often manifested by qualitative, quantitative, or rhythmic prominence, that is, by weak and strong forms, schwa reductions, linking, and so on, and are usually marked by a rising mark below the line of writing. Those following a high head are kept on the same level, or form a descending sequence. Those following the nucleus conform to the configuration of the nucleus, as elaborated above. Often, rhythmic beats are marked in the utterance, but have no effect on the pitch pattern. Non-prominent unstressed syllables are left unmarked (Gimson’s ‘unaccented syllables’).

As we shall see later, the pattern of intonation used will be closely related to the language function of the sentence and its grammatical category. The term intonation unit describes a segment of speech but refers also to the fact that this unit of speech has its own intonation contour or pitch pattern, and typically contains one prominent element. We must note that a single utterance or sentence may include several intonation units, each with its own prominent element and contour.

Many, perhaps most, short exchanges in conversation consist of single tone groups. Longer utterances may simply juxtapose tone groups. However, compound (i.e. and, but, either, or, etc.) and complex (i.e. if, because, when, etc.) sentences may have two or more closely linked tone groups. This sequence is then termed a major tone group, and its completion is shown in a text with two vertical marks whereas the constituent minor tone groups are marked with a single vertical mark.

To sum up, each typical intonation unit (1) is set off by pauses before and after; (2) contains one prominent element; (3) has an intonation contour of its own; and (4) has a grammatically coherent internal structure. The way to divide an utterance into intonation units is not foolproof, since it depends on several factors. Thus, in rapid speech, these units may be fairly long, whereas in slower speech they may be shorter, and breaks between units will then be more frequent. Also, some speakers produce fewer breaks than others. Finally, the division also depends on the performance context, since public speakers may pause frequently to make their message more emphatic (i.e. political meetings).

Yet, two additional points are to be made regarding intonation units. First, too many pauses can slow speech down and create too many prominent elements, causing the listener difficulty in processing and comprehending the overall message. Second, account must be taken of the blending and linking that occur within intonation units, together with the reduction of unstressed vowels to schwa.

4.3.3. The main functions of intonation.

In this section we shall deal with the different functions of intonation that will lead us to establish and examine the different pitch patterns. Intonation is said to function in order to express whether a speaker is ready, to signal that a response is desired, unnecessary, or unwanted, and to differentiate normal information from contrastive or expressive intentions. In other words, intonation is said to perform an important conversation management function, with the speaker being able to subtly signal to the interlocutor to quit talking, to respond in a particular fashion, or to pay particular attention to a piece of highlighted information (Celce-Murcia 2001).

In fact, the meaning of an English utterance, that is, the information it conveys to a listener, derives not only from its changing sound pattern and the contrastive accentual prominences, but also from associated variations of pitch as we have already referred to. In fact, the discourse context generally influences which stressed word in a given utterance receives prominence, and therefore, the word the speaker wishes to highlight.

Following Celce-Murcia (2001) there are several circumstances governing the placement of prominence which are closely related to the main functions of intonation. Had we classified these circumstances following Gimson’s distribution (1980), we would have distinguished between accentual and non-accentual functions, and therefore, we would have included (1) to place emphatic stress within the accentual function, and (2) to highlight new information, (3) to express emotions and attitudes, and (4) grammatical patterns within the non-accentual functions of intonation.

Still, although it is not considered to be a function, we must not forget about the relationship between intonation and meaning. Indeed, individual speakers make very specific use of prosody (i.e. intonation, volume, tempo, and rhythm) to convey their meanings in extended spoken discourse. It is a fact that nonnative speakers are frequently misinterpreted as rude, abrupt, or disinterested solely because of the prosody of their speech, as they may sound unnatural, or fail to sound funny when they intend to be.

Emphatic function.

The emphatic function is also called the accentual function since it is related to the placing of tonic stress on a particular syllable. In doing so, since the speaker wishes to place special emphasis on a particular element, he or she makes the listeners concentrate their attention on the word or words carrying the primary accent.

In fact, the element receiving emphatic stress usually communicates new information within the sentence, and in contrast with normal prominence, it is characterized by the greater degree of emphasis placed on it by the speaker by means of pitch level. For instance, in the sentence “You are ‘always doing the same”, the speaker might place emphatic stress on always to signal a particularly bad reaction to a repetitive situation.

Discourse function.

Similarly, the discourse function places prominence on new information in order to indicate a contrast or link with previously given information. We shall point out that within an intonation unit, words expressing old or given information are unstressed and spoken with lower pitch, whereas words expressing new information are spoken with strong stress and higher pitch.

In unmarked utterances, it is the stressed syllable in the last content word that tends to exhibit prominence (i.e. “I have bought a ‘camera” – “A ‘digital camera?” – “Yes. A digital camera with ‘amazing functions in it. It is the ‘last ‘Canon model”). In this example, camera functions as new information in the first utterance. However, in the second utterance, digital receives prominence because it is the new information. In the reply, both camera and digital are old information, whereas amazing, last, and Canon are new information, thus receiving prominence.

Similarly, two parallel elements, whether explicitly or implicitly contrasted, can receive prominence within a given utterance at the same time. For instance, in “Is it a ‘cheap or ‘expensive car?”, both ‘cheap and ‘expensive signal an important contrast in the sentence.

Attitudinal function.

The attitudinal function indicates the emotional attitude of the speaker by means of a single word or more words. In these cases, it is not the situation of the nucleus which is of importance, but rather the type of nucleus employed, that is, the intonation contours. The choice of pitch patterns can vary a great deal with the discourse context within which a word occurs.

For instance, the one-word utterance “No”, produced with a rising pitch contour from middle to high, could mean surprise: “Are you sure you don’t want to come?” whereas, if produced with a falling pitch contour from high to low, this same word could express anger: “I said no!”.

It is worth noting again that the attitudinal meaning of an utterance must always be interpreted within a context, both of the situation and also of the speaker’s personality. It may well happen that an intonation which is neutral in one set of circumstances is, for instance, offensive when used by another person or in other circumstances.

Grammatical function.

The grammatical function distinguishes different types of sentence by means of different pitch patterns. In fact, the same sequence of words may, with a falling intonation, be interpreted as a statement or, with a rising intonation, as a question (i.e. a statement like “Sally’s moving” may be made into a question if a rising intonation is used instead of a falling intonation type).

Moreover, if an utterance is pronounced with a rising-falling intonation, then it signals speaker certainty, which often corresponds to a declarative statement. However, pronounced with rising intonation, the same sequence of phonemes signals uncertainty and corresponds to a special type of yes/no question with statement word order, showing that intonation can override syntax in spoken English.

Yet, the main types of utterances which can describe different attitudes by means of pitch patterns are (1) assertions, (2) wh- questions, (3) yes/no questions, (4) question tags, (5) commands, requests, and orders, and finally (6) exclamations, greetings, and similar ones. These types of utterances will be examined later in the section on intonation contours.

4.3.4. Intonation contours.

We have seen how rises and falls in the pitch of the voice in connected speech produce what is called intonation. The intonation of English RP is used by native speakers on the one hand to indicate the informational structure of sentences and on the other to express nuances of meaning, to indicate unspoken implications or reservations and to convey attitudes and emotional states. As such it plays a very important part in communication and is a frequent source of intercultural misunderstandings.

The intonation contour (or pitch pattern) of a word group (or tone group) is crucial since the intonation of the sentence will show the attitude of the speaker. These contours are highly dependent on discourse meaning and prominence, with rises in intonation co-occurring with the highlighted or more important words that receive prominence within the sentence. Thus pitch and prominence can be said to have a symbiotic relationship with each other in English, and the interrelationship of these phenomena determines the intonation contour of a given utterance.

The movement of pitch within an intonation unit is referred to as the intonation contour, which ranges from extra high pitch to low pitch (i.e. extra high, high, mid, low). Pitch patterns are to be represented by two parallel horizontal lines where, according to Alcaraz (1976), we may find two types of movements: static and dynamic. On the one hand, the static type includes high, mid, and low pitch which are to be represented by the imaginary upper line, mid position, and lower line, respectively. On the other hand, the dynamic type includes rising, falling, or a combination of both pitches, depending on the direction they take within the two lines.

Hence, we may distinguish five nuclear tones (van Ek and Trim 2001), namely low falling, high falling, low rising, high rising, and falling-rising. Besides, another category (i.e. rising-falling) is added by O’Connor (1973). For our present purposes, we shall examine first the main dynamic pitch patterns, that is, falling and rising, and then their combinations in relation to the static pitch patterns and semantic values.

Falling tone.

According to Gimson (1980), a falling nucleus, marked by a diagonal falling mark, is considered to be the most neutral tone among all the pitch patterns to be examined. It is, in fact, separative and assertive: the higher the fall, the more vigorous the degree of finality implied. Note that the fall is on the stressed syllable or from the stressed syllable to a following one.

No explicit appeal is made to the listener, yet the tone is not necessarily impolite. This kind of tone is characteristic of conversations between people who know each other, where there is no need for social courtesies in speech. We may distinguish two types of falling intonation depending on the tone and the discourse context where they occur: low falling and high falling.

Low falling.

This is marked by a left to right diagonal falling mark, below the line of writing, placed before the nuclear syllable [,]. This mark is to be interpreted as indicating that the next syllable is stressed. Its vowel starts on a clear, low-mid tone, and then, the voice drops to a low creaky note and remains on this low pitch until the end of the tone group.

Low falling is used (1) in declarative sentences. First, for factual statements (identifying, describing, defining, and narrating) as well as in answers to wh- questions (e.g. ‘This is a ,door; They ‘drove to ,London). Second, for expressing definite agreement or disagreement, firm denials, firm acceptance or rejection of an offer, intention, obligation, and granting or asking for permission. In general, it indicates unambiguous certainty (e.g. You ‘must eat your ,dinner).

(2) In interrogative sentences expected to be answered by yes or no: demands (e.g. ‘Have you seen this film be,fore?), requests (e.g. May I come in, please?), yes/no questions (e.g. ‘Can you ,eat it?), tag questions (e.g. ‘Is it ,red?), and choice questions, to indicate that the list of options is closed (e.g. ‘Would you prefer ,tea / or ,coffee?). (3) In wh- questions, as a definite request for a piece of information (e.g. ‘Where is Mary?), and (4) in imperative sentences, as a direct order or prohibition (e.g. ‘Sit ,down!), as an instruction (e.g. ‘Push the door!), and as a strong form of offer (e.g. ‘Have one of ,my cakes!).

High falling.

High falling tone is similar to the low falling one, except that the nuclear vowel starts on a pitch above the mid point. It is marked by placing the falling mark above the line of writing. High falling is used (1) in declarative sentences, first, to indicate surprise, protest, enthusiasm, emphasis or insistence (e.g. That’s ‘great!, Look at ‘that!), and second, to indicate contrast with an element previously mentioned or believed to be in the listener’s mind (e.g. No, it was in ‘1970 he was born).

(2) In interrogative sentences, both those answerable by yes or no and wh- questions, first, to insist on an answer being given (e.g. Did you ‘mend my bicycle?). Second, to indicate surprise or irritation (e.g. Are you ‘still thinking about going out?). Third, in rhetorical questions of an exclamatory type, to which no answer is sought (e.g. Isn’t it ‘lovely?). Finally, in tag questions, to insist on the listener’s agreement to a proposition (e.g. You ‘knew it, ‘didn’t you?).

(3) In imperative sentences, first, to insist on an order or prohibition (e.g. Don’t ‘listen to her, I say). Second, to indicate the urgency of an instruction (e.g. ‘Stop. ‘Don’t ‘move). Third, to insist on the acceptance of an offer (e.g. ‘Do let me ‘invite you).

Rising tone.

A rising nucleus, marked by a diagonal rising mark, may start from a fairly low, mid, or high pitch and it may end at a low or high pitch. This tone implies that something more is still to be said, and it thereby catches the listener’s attention.

Low rising.

This is marked by a rising mark placed before the nuclear syllable and below the line of writing [,]. It indicates that the next syllable is stressed. Its vowel starts on a clear, low level pitch to be followed by a continuous glide upward, but not rising above mid, until the end of the tone group.

The glide occurs within the nuclear syllable if it is the last in the group. If it is followed by one or more non-prominent syllables (also called the “tail”), stressed or unstressed, the nuclear syllable is spoken on a low level pitch and the rise spans the tail.

Low rising is used (1) in declarative sentences, first, to indicate indifference, resentment, guardedness or suspicion (e.g. It doesn’t ,matter; You shouldn’t complain about ,me). Second, to reassure (e.g. You ,needn’t be worried).

(2) It is also used in interrogative sentences answerable by yes or no, first, to ask politely for confirmation or disconfirmation (e.g. She’s ,Italian, ,isn’t she?). Second, to make polite requests and offers (e.g. ‘Would you please close the ,door?). Third, to indicate that the list is open in choice questions (e.g. ‘Would you like ,tea or ,coffee or something ,stronger?).

(3) In wh- questions, first, to indicate polite interest (e.g. ‘Where are you going on ,holidays?), and second, to avoid the appearance of interrogation (e.g. ‘What are you ,doing there?). (4) It is finally used in imperative sentences for gentle commands, especially to children and hospital patients (e.g. ‘Just drink this ,medicine slowly).

High rising.

High rising is shown by placing the rising mark above the line of writing [‘]. It indicates that the nuclear vowel starts somewhere between low and mid level, and that the upward glide extends well above mid.

High rising is used (1) in declarative sentences, first, to convert a statement into a question (e.g. You went to Ireland last year?), and second, to query what someone has said (e.g. You said he is unemployed?). (2) It is also used in interrogative sentences answerable by yes or no, first, to indicate a casual enquiry (e.g. Would you care for a ‘coffee?), and second, to repeat a question (e.g. A ‘coffee? Would I care for a ‘coffee?).

(3) Moreover, in wh- questions, first, to repeat a question, with a change of first and second person, before answering (e.g. ‘Where do you ,live? – Where do I ‘live?); and second, with the wh- word as nucleus, to ask for repetition (e.g. He lives in (not understood) – He lives ‘where?). (4) Finally, in imperative sentences, to repeat an order, instruction or offer while deciding whether or how to comply (e.g. ‘Sit down, please – ‘Sit ‘down? ‘Why ,not?).

Falling-rising tone.

This may be seen as a sequence of high falling and low rising, in which the nuclear vowel sound starts at a high-mid pitch and drops to a low creak. An upward glide follows, which does not go above mid. This tone is indicated by a v-shaped mark placed before the nuclear syllable above the line of writing [`´] and, like the fall and rise of the other tones, is connected with the stressed syllable of the last important word. However, it is only completed on one syllable if that syllable is final in the group. If one or more syllables follow, the fall and the rise are separated.

This fall-rise tone combines the effect of the fall, which is contradictory and contrastive, with the emotional or meaningful attitudes, not expressed verbally, associated with the rise. Both of them may occur within a single word.

Thus, the falling-rising tone is used (1) in declarative sentences to convey various implications, such as: first, warnings (e.g. The traffic lights are `´red!); second, corrections (e.g. Her brother ‘isn’t a teacher, / he’s an `´architect!); third, limited agreement implying disagreement (e.g. I ‘don’t know if I agree with `´that); fourth, mental thought of promises (e.g. `´Yes, / I `´will be good this year); fifth, uncertainty and hesitation (e.g. I can’t be `´certain); sixth, softening the effect of bad news (e.g. You’re `´wrong, I’m afraid); seventh, anxious queries with tag questions (e.g. You ‘do `´love me, don’t you?); eighth, discouragement (e.g. You can ‘go to the cinema if you `´like); ninth, tentative advice (e.g. If ‘I were `´you, / I’d do it); tenth, implying that something has been left unsaid which contrasts with what has been stated (e.g. Your opinion is `´interesting, implying: but I ‘don’t agree); and eleventh, querying what has been said, implying that it is mistaken or untrue (e.g. ‘Seven eights are fifty `´four?).

(2) It is also used in interrogative sentences answerable by yes or no, first, to add a note of warning or doubt (e.g. Are you `´sure you paid the bill?); second, when the expected answer to the question may be unwelcome (e.g. ‘Have you thought what might happen if you `´did?). (3) In wh- questions, first, to repeat a question, focusing on the key issue in contrast with other possibilities (e.g. ‘What did I do on `´Saturday of last week?), and second, to query a statement with the wh- word as nucleus (e.g. `´Where did he buy that motorbike?). (4) And finally, in imperative sentences, first, for issuing warnings rather than commands or instructions (e.g. ‘Watch where you’re `´going), and second, for pleading, with the imperative as nucleus (e.g. `´Do / try to be a little more careful).

Rising-falling tone.

The rising-falling intonation contour is one of the most common patterns. In it, the intonation typically begins at a neutral middle level and then rises to a high level on the main stressed element of the utterance. The intonation then falls either to the low level (a terminal fall, signalling certainty and generally corresponding to the end of the utterance) or to the middle level (a non-terminal fall, signalling a weaker degree of certainty and usually corresponding to an unfinished statement, an incomplete thought, or a mood of suspense).

Rising-falling tone is indicated by an inverted v-shaped mark placed before the nuclear syllable above the line of writing [^]. Intonation patterns of the “certainty” type are typically used to convey stronger feelings of approval, surprise, or disapproval. Thus, the tone is used (1) in utterances of the “certainty” type: first, in declarative statements to reaffirm a fact (e.g. John is ^sick. He’s taken an ^aspirin); second, in wh- questions about an action that causes surprise (e.g. Who will ^help?); and third, in commands to show disapproval (e.g. Fix me some ^soup).

(2) In unfinished statements, first, where a non-terminal fall with a slight rise at the end indicates that the speaker has left something unsaid or implied (e.g. John’s ^sick… [… but I think he’s going to work anyway]). Second, where the slight rise at the end creates suspense (e.g. I opened the old ^suit/case… [… and found a million dollars!]). And third, in tag questions eliciting agreement, in which the speaker is requesting confirmation from the interlocutor. Such tag questions function almost like statements and typically signal certainty (e.g. We really ought to ^vi/sit him, ^shouldn’t we?).

Once we have discussed intonation regarding its functions and its patterns, we shall move on to establish a comparison between the English and Spanish phonological systems concerning this issue.

4.3.5. Comparing English vs Spanish intonation.

We must bear in mind that English and Spanish intonation patterns are quite different, and this is a phenomenon Spanish learners must face. Firstly, they need to understand that English is a stress-timed language whereas Spanish is a syllable-timed language. They also need to understand that, even if all the individual sounds are pronounced correctly, incorrect placement of stress can cause misunderstanding.

Regarding the placement of word stress, a further problem for Spanish students of English is that word stress in English is not nearly as predictable as it is in Spanish, where stress patterns are regularly indicated through accent marks in the spelling. Secondly, in English we find vowel reduction to schwa in unstressed syllables, as strong forms become weak forms, whereas in Spanish schwa does not even exist.

Regarding sentence stress and rhythm, the main difference between the English and Spanish phonological systems is that English is said to have a stress-timed nature whereas Spanish has a syllable-timed one. This means that, for Spanish students of English, maintaining a regular beat from stressed element to stressed element and reducing the intervening unstressed syllables can be very difficult, since their native tongue has syllable-timed patterns. As a result of these differences in stress level and syllable length, Spanish students tend to stress syllables in English more equally, without giving sufficient stress to the main words and without sufficiently reducing unstressed syllables.

Regarding intonation, certain intonation patterns present difficulties for Spanish learners, since they frequently associate questions exclusively with rising intonation and, as a result, have difficulty producing wh- questions, which typically have falling intonation in English. Tag questions are also difficult for non-native learners, in terms of both grammar and intonation.

The main difference between English and Spanish intonation lies in the way Spanish speakers produce the melodic tone, that is, with a narrower pitch range, which makes the English intonation of Spanish learners sound somewhat flat, bored, and disinterested. In fact, much research has shown that non-native speakers are frequently misinterpreted as rude, abrupt, or disinterested mainly because their speech sounds choppy and has an unnatural rhythm, sometimes with flat intonation or an inappropriate application of intonation patterns. Moreover, Spanish learners often cannot hear important keys to meaning because of their limited command of prosodic clues.

This is especially true when humor, sarcasm, anger, irony, and the like are conveyed through prosodic elements. Thus, though the message may be understood, the speaker’s intent may be misinterpreted, resulting in the entire meaning being misconstrued. Therefore, top priority should be given to providing learners with adequate opportunities to listen for the shades of meaning in authentic conversational exchanges and to check their interpretation against that of a native speaker listening to the same conversational exchange.


As we have seen, for foreign learners of English, and in particular, Spanish learners, it is imperative even at the most elementary stage of language instruction to pay attention not only to the vocabulary, grammar, and functions of the foreign language but also to the prominence and intonation, due to the critical role these features play and the meaning they carry. From both the receptive and productive points of view, learners need extensive practice in distinguishing the subtle shades of meaning that are conveyed through prosodic clues.

Therefore, once students have understood the concepts of word stress, sentence stress, and rhythm, these can be integrated into the presentation of prominence and intonation in English. In reality these features cannot be separated naturally. However, we believe that the various intonation patterns and accompanying pitch movements make more sense if word stress, sentence stress, rhythm, and prominence have already been understood.

In this general overview of suprasegmental elements, the main point of this study has been to emphasize the functions of stress, rhythm, and intonation within authentic conversational situations. In the present study we have touched on only some of the more straightforward features with respect to how these prosodic elements are treated in second language learning.

These features allow the learner to turn the basic building blocks of the sound system (i.e., the vowel and consonant phonemes) into words, meaningful utterances, and extended discourse. A good command of these features is therefore as critical as command of the segmental features in order to achieve successful communication for second language learners.

In fact, this unit has aimed to make learners aware of the relevance of these major patterns in ongoing discourse. Besides, there is a need to alert students to differences between the punctuation and intonation systems of English and Spanish and, overall, to teach students to think in terms of the speaker’s intention in any given speech situation.


– Alcaraz, E., and B. Moody. 1976. Fonética inglesa para españoles. Teoría y práctica (2nd ed.). Gráficas Díaz. Alicante.

– Algeo, J. and T. Pyles. 1982. The origins and development of the English language. Harcourt Brace Jovanovich, Inc.

– Brown, G. and G. Yule. 1983. Teaching the Spoken Language. Cambridge: Cambridge University Press.

– B.O.E. RD Nº 112/2002, de 13 de septiembre por el que se establece el currículo de la Educación Secundaria Obligatoria/Bachillerato en la Comunidad Autónoma de la Región de Murcia.

– Brown, G. and G. Yule. 1983. Discourse Analysis. CUP.

– Canale, M. 1983. From Communicative Competence to Communicative Language Pedagogy, in J. Richards and R. Schmidt (eds.). Language and Communication. London, Longman.

– Celce-Murcia, M., Brinton, D., and M. Goodwin. 2001. Teaching Pronunciation, A Reference for Teachers of English to Speakers of Other Languages. Cambridge University Press.

– Crystal, D. 1985. Linguistics. Harmondsworth, England. Penguin Books.

– Gimson, A. C. 1980. An introduction to the pronunciation of English. Edward Arnold.

– Goytisolo, Juan. 2001. Proclamation of Masterpieces of the Oral and Intangible heritage of Humanity 18 May 2001. Speech delivered at the opening of the meeting of the Jury (15 May 2001)

– Hymes, D. 1972. On communicative competence. In J. B. Pride and J. Holmes (eds.), Sociolinguistics, pp. 269-93. Harmondsworth: Penguin.

– O’Connor, J.D. 1988. Better English Pronunciation. Cambridge University Press.

– O’Connor, J.D. and G.F. Arnold. 1973. The Intonation of Colloquial English. Longman.

– van Ek, J.A., and J.L.M. Trim. 2001. Vantage. Council of Europe. Cambridge University Press.