The Intonation of Givenness: Evidence from German [Reprint 2012 ed.] 9783110921205, 9783484305083

This book addresses students and researchers of phonetics/phonology, and the semantics and pragmatics of discourse. It e


German Pages 193 [197] Year 2006



Table of contents :
1 Introduction
1.1 Motivation and Aims
1.2 Structure of the Study
2 Theoretical Background
2.1 Intonation
2.2 Givenness
2.3 Intermediate Summary: The Relation between Intonation and Givenness
3 Corpus Analysis
3.1 The MULI Database
3.2 Analysis
3.3 Discussion
4 Experiments
4.1 Perception Experiment I: Accent Type and Modes of Givenness
4.2 Perception Experiment II: Accent Type and Types of Accessibility
5 A Model of Intonation and Givenness
6 Summary and Outlook
Bibliography


Linguistische Arbeiten

508

Edited by Peter Blumenthal, Gereon Müller, Ingo Plag, Beatrice Primus, Klaus von Heusinger and Richard Wiese

Stefan Baumann

The Intonation of Givenness Evidence from German

Max Niemeyer Verlag Tübingen 2006


Bibliographic information of Die Deutsche Bibliothek: Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.ddb.de. ISBN-13 978-3-484-30508-3 ISBN-10 3-484-30508-8

ISSN 0344-6727

© Max Niemeyer Verlag, Tübingen 2006. A company of K.G. Saur Verlag GmbH, München. http://www.niemeyer.de This work, including all of its parts, is protected by copyright. Any use outside the narrow limits of copyright law without the consent of the publisher is prohibited and punishable by law. This applies in particular to reproductions, translations, microfilming, and storage and processing in electronic systems. Printed in Germany. Printed on ageing-resistant paper. Printing: Laupp & Göbel GmbH, Nehren. Binding: Nädele Verlags- und Industriebuchbinderei, Nehren

The Intonation of Givenness Evidence from German

Dissertation for the award of the academic degree of Doctor of Philosophy from the Faculties of Philosophy of the Universität des Saarlandes

submitted by

Stefan Baumann from Kiel

Dean:

Prof. Dr. Wolfgang Haubrichs

Reviewers:

Prof. Dr. Martine Grice
Prof. Dr. William J. Barry
Prof. D. Robert Ladd (Univ. of Edinburgh)

Date of final examination:

15 December 2005


For my parents, Elisabeth and Ernst-Otto Baumann


Summary

This thesis is concerned with the intonational marking of information structure, in particular at the level of the cognitive activation degrees of discourse referents. An overarching aim is to connect intonation with the ‘Givenness’ of referring expressions. Each of these two areas is often regarded, from the perspective of the other, as either too vague or too complex to be integrated into one’s own research. The results of the present empirical investigation relate to German but are largely transferable to other West Germanic languages such as English and Dutch.

After an introduction to the aims and structure of the thesis (chapter 1), chapter 2 sets out the theoretical framework of the investigation. First, the phonetic aspects of intonation (with an emphasis on accentuation) are presented, together with the phonological theory adopted here, Autosegmental-Metrical Phonology. In this context, GToBI, the model used in this thesis for annotating German intonation patterns, is also introduced. Subsequently, the concept of Givenness is introduced in a largely theory-neutral way and delimited from other dimensions of information structure, such as the focus-background and theme-rheme partitions. Furthermore, it is discussed which domains and constituents Givenness (and Newness) applies to, what gives rise to it, and what role speaker and listener play in determining it. Three (cognitive) levels are distinguished which have been related to the concept of Givenness in the literature, namely the levels of ‘knowledge’, ‘consciousness’ and ‘importance for the speaker’.

Following Lambrecht (1994), I regard the first two levels as central to my definition of Givenness, because they concentrate on the cognitive status of discourse referents or propositions. They correspond to the levels of ‘identifiability’ and ‘activation’. The former describes the listener’s ability to pick out, from the set of all possible referents encoded by a particular linguistic expression, the one the speaker has in mind. The latter denotes the degree to which a referent or proposition is retrievable in the listener’s consciousness at the moment of utterance. The third level denotes the pragmatic role of a discourse referent in a proposition, expressed by the distinction between focus (what is important to the speaker) and background (what is not important to the speaker).

On the basis of the differentiation of these levels, questions of their linguistic realisation are discussed in a next step. Particular attention is paid to the linguistic marking of Givenness in the narrow sense, i.e. the levels of identifiability and activation. It turns out that (non-)identifiability is marked primarily by (in)definiteness, whereas activation, a parameter that presupposes the identifiability of a referent, is encoded at two different levels of linguistic description: lexical form and intonation. Already activated referents are often realised as pronouns, while less active concepts generally appear in their full lexical form. Moreover, activated referents are often unaccented, whereas semi-activated or non-activated referents as a rule carry an accent. However, this study shows that correlating the dichotomies ‘accented versus unaccented’ and ‘new versus given’ is an inadmissible simplification.

Many recent works on the cognitive status of discourse referents no longer regard the distinction between new and given information as binary but as gradual. Such a continuum of possible activation degrees cannot, however, be adequately mirrored linguistically, because the number of available linguistic categories is limited. I therefore restrict myself to investigating three different cognitive states which, following Chafe (1994), I call ‘Given’, ‘Accessible’ and ‘New’, and which I assume to be marked by distinctive formal, in particular prosodic, categories. The so far scarcely investigated area of ‘accessible’ or ‘semi-activated’ information and its linguistic realisation is the main subject of my study. While the morphosyntactic marking of accessible referents is relatively uncontroversial (they mostly occur in the form of definite noun phrases), there is no consensus on their prosodic marking. This is due, among other things, to the fact that the position of the semi-activated referents within the utterance varies in the examples discussed in the relevant literature. As a consequence, the difference between prenuclear and nuclear accented expressions is neglected. The status of prenuclear accents is, however, far less clear than that of nuclear accents, not least because the distribution and strength of prenuclear accents depend to a greater extent on the rhythmic structure and length of the utterance. I have therefore concentrated on the prosodic marking of the last (and thus potentially nuclear) argument in declarative, fully focused utterances.

The empirical investigations rest on this theoretical foundation. They comprise the analysis of a read corpus of German news texts (chapter 3) and two perception experiments (chapter 4). The corpus analysis, which serves to collect production data, is to be regarded merely as a preliminary stage of the investigation, because the data were read by a single female speaker and the representativeness of the intonation patterns therefore cannot be fully guaranteed. Nevertheless, certain correlations between accent type and the cognitive activation degree of referents can be found, which are then tested in controlled perception experiments. In these experiments, subjects have the task of judging the appropriateness of certain accent types (H*, H+L*) as well as deaccentuation as markers of a referent in various contexts. The selection of the accent types tested is based, in addition to the corpus data, on proposals from earlier studies on English and German, above all by Pierrehumbert & Hirschberg (1990) and Kohler (1991a).

The first experiment investigates accent type preferences on referents in different priming conditions, i.e. after preceding auditory, visual or no presentation of the same referent. The underlying assumption is that the different modes of presentation activate the referents to different degrees, so that the three conditions correspond to given, accessible and new information. The second experiment concentrates largely on the prosody of accessible referents, in particular various kinds of textually and inferentially accessible information, in contrast to the first experiment, in which situationally (i.e. visually) accessible referents were presented. The same accent types (including deaccentuation) are used as in the first experiment.

The results of the empirical investigations form the basis of the model of intonation and Givenness in German proposed here (chapter 5). Specifically, it turns out that Newness correlates with accent type H*. This holds both for ‘brand-new’ referents, i.e. those not identifiable for the listener, and for ‘unused’ referents, i.e. those known to the listener but not derivable from the preceding discourse (and thus not activated). The formal difference between these two categories lies in their morphosyntactic marking: brand-new information is marked as indefinite, unused information as definite. At the other end of the scale, given, i.e. fully activated, information is generally not accented. The intonational marking of referents in the intermediate area of ‘accessible’ information is less clear-cut, although a preference for accent type H+L* can be demonstrated for certain kinds of semi-activated information, such as a concept evoked along with a ‘scenario’ (e.g. restaurant – waiter) or the anaphor in a whole-part relation (e.g. book – pages). Other semi-activated referents, such as synonyms (e.g. Apfelsine – Orange, two German words for ‘orange’) or the anaphor in a part-whole relation (e.g. pages – book) or in an equivalence relation (e.g. A = B’s sister ≈ B = A’s brother), are preferably deaccented.

In general, information between the poles ‘given’ and ‘new’ cannot be clearly delimited with regard to its intonational marking, and different kinds of more or less activated information, which encode e.g. different lexical-semantic relations, require different accent types for their marking. A (pseudo-gradual) ranking of accent types emerges that goes hand in hand with changes in the activation degree of the referent, with the pitch height on the lexically stressed syllable of the referring expression as the decisive factor. This relationship suggests a kind of ‘iconic’ function of pitch height, which is also reflected in Gussenhoven’s (2002) Effort Code: the higher the tone on the lexically strong syllable, the ‘newer’ (or more important) the discourse referent. This rule of thumb also explains the intermediate status of accent type H+L*, which is ranked as the second-best option in all cases in which one of the two other tested markings (H* or no accent) is preferred.

This ranking of different accent types also involves a gradation in perceived accent strength. Several studies on German and English postulate more or less directly the relevance of a ‘secondary accent’ as a marker of semi-activated information. However, the kind, position and exact function of these secondary prominences can vary, for instance between ‘secondary accents’ in Büring’s sense (Büring 2003), ‘duration accents’ (Kohler 2005) and ‘phrase accents’ (Grice, Ladd & Arvaniti 2000).

The results of this thesis are also relevant for speech technology applications, above all in the field of speech synthesis. Integrating a more finely differentiated intonation module can considerably increase the naturalness of a synthesis system, especially if access is ensured to the linguistic context and to the semantic information that is crucial for the correct assignment of the different accent types (including deaccentuation). This can be achieved by linking the synthesis system to a semantic database such as GermaNet. Such a system would no longer be exclusively text-based, and its accent assignment rules would no longer depend on part-of-speech information alone; it would be a precursor of concept-to-speech synthesis with context-sensitive intonation.
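The correspondences reported in this summary can be sketched as a small lookup, of the kind an accent-assignment rule in a synthesis system might use. This is only an illustrative sketch: the function name, the relation labels and the string used for deaccentuation are assumptions made for the example, not part of the study, and the preferences are tendencies found in the perception experiments rather than categorical rules.

```python
from enum import Enum
from typing import Optional


class Activation(Enum):
    """The three cognitive states distinguished in the summary."""
    GIVEN = "given"
    ACCESSIBLE = "accessible"
    NEW = "new"


NO_ACCENT = "no accent"  # hypothetical label for deaccentuation

# Hypothetical labels for the relation types named in the summary.
H_PLUS_L_STAR_RELATIONS = {"scenario", "whole-part"}
DEACCENTED_RELATIONS = {"synonym", "part-whole", "equivalence"}


def preferred_accent(activation: Activation, relation: Optional[str] = None) -> str:
    """Return the preferred marking reported for a referent.

    New referents (brand-new or unused) -> H*
    Given referents                     -> no accent
    Accessible referents                -> depends on the antecedent-anaphor relation
    """
    if activation is Activation.NEW:
        return "H*"
    if activation is Activation.GIVEN:
        return NO_ACCENT
    if relation in H_PLUS_L_STAR_RELATIONS:
        return "H+L*"
    if relation in DEACCENTED_RELATIONS:
        return NO_ACCENT
    # The summary reports no clear preference for other accessible referents.
    raise ValueError(f"no preference recorded for relation {relation!r}")


# Example: 'Kellner' after 'Restaurant' (scenario) versus 'Buch' after 'Seiten'.
print(preferred_accent(Activation.ACCESSIBLE, "scenario"))    # H+L*
print(preferred_accent(Activation.ACCESSIBLE, "part-whole"))  # no accent
```

Raising an error for unlisted relations, instead of guessing a default, mirrors the summary's point that the marking of accessible information is not clear-cut.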


Acknowledgements

Many people contributed to the completion of this thesis. Firstly, I would like to thank my supervisor Martine Grice for her invaluable support and assistance throughout all stages of this study, for her enthusiasm and countless inspiring discussions. I am also deeply grateful to my second supervisor and mentor, Bill Barry, who was always there when I needed advice, and to Bob Ladd, who kindly agreed to write an external review. This meant a lot to me since it was one of Bob’s courses in Edinburgh several years ago that aroused my interest in the subject. Furthermore, I feel indebted to Richard Wiese, editor of the series ‘Linguistische Arbeiten’, for his helpful comments on the manuscript.

Thanks also to the many friends and colleagues who helped me in various ways, with both intellectual and practical contributions, especially in setting up and running the experiments. In Saarbrücken, these people are (among others) Kerstin Hadelich, Caren Brinckmann, Gudrun Schuchmann, Bettina Braun, Jürgen Trouvain, Marc Schröder, Bistra Andreeva, Cordula Klein, Silke Jarmut, Andrea Weber, Jacques Koreman, Thomas Blug, Anja Moos, Kerstin Kunz, Stephanie Becker and Silvia Hansen-Schirra. In Cologne, I would like to cordially thank the whole team, namely Doris Mücke, Maja Warnking, Philipp von Böselager, Barbara Hugo-Dilworth, Johannes Becker, Anne Hermes, Jacqueline Anthes, Kyung-Hee Kim, Anna Diagne, Christine Riek and Theo Klinker. And finally, special thanks to my Schatje, Stella Neumann, who supported me in ALL matters.


Contents
1 Introduction
1.1 Motivation and Aims
1.2 Structure of the Study
2 Theoretical Background
2.1 Intonation
2.1.1 Phonetics of Intonation
2.1.2 Phonology of Intonation
2.1.2.1 Principles of Autosegmental and Metrical Phonology
2.1.2.2 Intonation in Autosegmental-Metrical Phonology
2.1.2.3 GToBI
2.2 Givenness
2.2.1 Givenness and Information Structure
2.2.2 Domains and Modes of Givenness
2.2.3 Perspectives of Givenness
2.2.4 Levels of Givenness
2.2.5 Degrees of Givenness and their Linguistic Marking
2.2.5.1 The Marking of (Non-)Identifiability
2.2.5.2 The Marking of (In-)Activation
2.2.5.3 The Marking of Focus-Background Structure
2.3 Intermediate Summary: The Relation between Intonation and Givenness
3 Corpus Analysis
3.1 The MULI Database
3.2 Analysis
3.3 Discussion
4 Experiments
4.1 Perception Experiment I: Accent Type and Modes of Givenness
4.1.1 Motivation
4.1.2 Hypotheses
4.1.3 Design of Test Material
4.1.3.1 Modes and Degrees of Givenness Investigated
4.1.3.2 Visual Test Material
4.1.3.3 Auditory Test Material
4.1.4 Experimental Setup
4.1.5 Results
4.1.6 Discussion
4.2 Perception Experiment II: Accent Type and Types of Accessibility
4.2.1 Motivation
4.2.2 Hypotheses
4.2.3 Design of Test Material
4.2.3.1 Types of Accessibility Investigated
4.2.3.2 Textual Test Material
4.2.3.3 Auditory Test Material
4.2.4 Experimental Setup
4.2.5 Results
4.2.6 Discussion
4.2.7 Summary and Conclusion
4.2.8 Digression: Intonation, Biological Codes and their Linguistic Manifestations
5 A Model of Intonation and Givenness
6 Summary and Outlook
Bibliography


1 Introduction

1.1 Motivation and Aims

This study is intended to address two kinds of audience: intonologists who are interested in information structure, and experts in the semantics and pragmatics of discourse who have an interest in prosodic aspects of information structure. By addressing both, we aim to forge links between the two fields, taking into account the growing demand articulated in recent years. In order to establish a ‘common ground’, an overview of the state of the art in both intonation research and information structure will be given, the latter predominantly with respect to the cognitive states of discourse referents, subsumed under the notion of ‘Givenness’. Since the notion of Givenness has been used in the literature in diverging ways, applying to different levels of description, we will distinguish it from other levels of information structure and propose a model of ‘Givenness proper’ and its linguistic marking. Multiple functions can be attributed to intonation, ranging from the clearly paralinguistic encoding of emotions, over ‘more linguistic’ pragmatic functions such as indicating speech act distinctions, to expressing strictly linguistic contrasts at word level in tone languages. We will concentrate on the (linguistic) functions of intonation which serve to assign a structure to utterances in terms of phrasing and prominence relations and which are relevant for information packaging. This structure, which is often realised with a combination of phonetic properties (predominantly perceived pitch, but also entailing loudness, vowel quality, and relative length of syllables and words) is determined phonologically in terms of abstract tonal values at the edges of phrases and on prominent syllables. Of particular interest is the distribution of prominences or (pitch) accents, which fulfil the highlighting function of intonation. 
In studies on the realisation of information structure in West Germanic languages (notably English, German and Dutch), it is commonly assumed that important or New information is marked by a pitch accent, while unimportant or Given information is either unaccented or deaccented (i.e. there is no pitch accent where one would otherwise be expected). A central aim of this study is to show that this view is an inadequate simplification. This applies to both fields of research. First, there is evidence from psycholinguistically oriented studies that the difference in the cognitive activation of discourse referents is gradual in nature and, consequently, that there are ‘degrees of Givenness’ between the extreme poles of Given and New. Most linguistic studies which acknowledge the existence of Givenness degrees are predominantly concerned with the morphosyntactic form of referring expressions, since they usually investigate written language. Only a few approaches combine morphosyntax and intonation. Second, the intonational means used for the encoding of information go far beyond a dichotomy of accent versus lack of accent. As a number of investigations of the prosodic marking of information structure have shown, different types of accent have to be taken into account, as well as different degrees in the strength of accents. This study therefore investigates how far not only accentuation and the lack thereof, but also the type of accentuation, can be used to indicate different degrees of Givenness, and in particular how they are used in German.

We will gain our evidence from empirical data, elicited in both production and perception experiments. Almost all of the few empirical studies on the relation between activation degrees and intonation that have been conducted so far are based on English data. Since German and English are said to be closely related, at least in terms of their intonational systems, both being West Germanic (see Ladd 1996), the claims made in the literature on English are taken as a point of departure for our own investigation. Having said this, however, we should be aware that structures or relations that hold for one language cannot be automatically transferred to another language, however closely related. We therefore stress that the model of intonation and Givenness proposed here applies to German; although it may be relevant for research into English, it is not claimed that the results hold for English.

1.2 Structure of the Study

Since we are bringing together two only partially intersecting fields of research, we shall first give separate accounts of their theoretical background (chapter 2).

Section 2.1 deals with intonation. First, phonetic aspects of intonation (understood as being equivalent to the more general term ‘prosody’) will be discussed, in particular the phonetic parameters which are relevant for the description of speech melody, phrasing and, most importantly, accentuation (section 2.1.1). We shall then turn to the phonological description of intonation (section 2.1.2), starting with a brief overview of the principles of Autosegmental and Metrical Phonology (section 2.1.2.1). The combined autosegmental-metrical theory is currently the most widespread framework for representing pitch accents and prosodic phrasing and is the one used in this study (presented in section 2.1.2.2). Finally, the annotation system GToBI (German Tones and Break Indices) will be introduced, which has been developed for the description of German intonation within the autosegmental-metrical framework (section 2.1.2.3). This system will be used for the annotation of examples throughout the study.

Section 2.2 deals with aspects centring on the notion of Givenness. We will first develop a definition of the concept of Givenness and other basic information structural concepts. Then the role of Givenness within the vast field of information structure will be determined, delimiting it in particular from the dimensions of background versus focus and theme versus rheme (section 2.2.1). This means that Givenness proper, in our understanding, applies exclusively to the status of referents or propositions in discourse, not to the partitioning of a sentence or utterance. The following subsection (2.2.2) deals with the domains and modes of constituents that can be called Given or New, i.e. with questions of their nature (e.g. referents or propositions), size (e.g. one item or a whole phrase) and origin (e.g. recoverable from the preceding discourse or the physical context). Furthermore, the role of speaker and listener in determining what is Given and what is New will be discussed (section 2.2.3). Finally, in sections 2.2.4 and 2.2.5, three levels are differentiated to which the notion of Givenness has been applied in the literature. These are the levels of knowledge, consciousness and newsworthiness, which will be explained in detail. The second level in particular, that of consciousness or activation, which represents the core of our understanding of Givenness, allows for a differentiation of various degrees of Givenness. We will propose to add at least one intermediate state between the extreme poles of Given and New, namely Accessibility, and give a detailed account of how these different degrees or states of cognitive activation are marked linguistically, i.e. by morphosyntactic and, in particular, prosodic means.

As a first step towards a comprehensive model of intonation and Givenness in German, a read corpus of German newspaper texts, the MULI corpus, was analysed (chapter 3). Since the corpus was read by a single speaker, the attested relationships between activation states of discourse referents and their prosodic marking can only be regarded as tendencies, which should be verified in further experiments involving a number of speakers or in perception experiments involving a (large) number of listeners. We carried out the latter, described in chapter 4. In two perception experiments, listeners judge the appropriateness of the presence or absence of accentuation, as well as accent type, as a marker of discourse referents in various contexts. The first experiment investigates preferences for accent type and placement across different conditions: where, prior to the target utterance, the referent has been presented in the form of a picture (visual priming) or in spoken form (auditory priming), and where it has not been presented at all (no priming). The different modes of presentation are assumed to activate the referents to different degrees, thus eliciting a distinction between Given, Accessible and New information (section 4.1). The second experiment concentrates on the intonation of the largely unstudied area of information between the poles Given and New. Using the same accent types as in the first experiment, various kinds of textual and inferential Accessibility are examined, again assuming differences in activation, this time brought about by differences in the semantic relation between an antecedent and an anaphor (section 4.2). The discussion of the experimental results leads to a brief discussion of the iconicity of intonation and the question of to what extent intonational meaning is determined by biological codes (section 4.2.8).

Based on our empirical data, and taking into account several aspects of previous approaches, we propose in chapter 5 a fine-grained model of intonation and Givenness for German, which overcomes a simple binary distinction between New and Given on the one hand and accented versus unaccented on the other. Finally, in chapter 6, the main results are summarised, and suggestions are put forward for future research and for how parts of the proposed model can be used in technological applications, in particular in speech synthesis.


2 Theoretical Background

2.1 Intonation

In spoken language, intonation serves diverse linguistic and paralinguistic functions, ranging from the marking of sentence modality to the expression of emotional and attitudinal nuances. However, the term ‘intonation’ has been defined in at least two different ways in the literature. A narrow definition equates intonation with ‘speech melody’, restricting it to the “ensemble of pitch variations in the course of an utterance” (’t Hart et al. 1990: 10). A broader account of intonation, which will be adopted for the present study, is equivalent to what is often called ‘prosody’, subsuming such different phenomena as pitch movements and range (speech melody), the division of speech into chunks (phrasing), highlighting at word level (lexical stress) and utterance level (accentuation), the marking of prominence relations (rhythm) and variations in speech rate (tempo). All these phenomena, most of which serve to assign a phonological structure to utterances, are realised with a combination of phonetic properties such as perceived pitch, loudness, vowel quality, and relative length of syllables, words and pauses. These properties, especially the phonetic correlates of accentuation, which are of particular interest for this study, will be dealt with in section 2.1.1. Basics of the phonological theory adopted here will be presented in section 2.1.2, including a brief description of the principles of Autosegmental and Metrical Phonology (section 2.1.2.1), the representation of intonation within this framework (section 2.1.2.2) and, finally, a description of the annotation model for German intonation used in the present study (section 2.1.2.3).

2.1.1 Phonetics of Intonation

The phonetic parameter that is most centrally associated with intonation – especially in the narrow sense of ‘speech melody’ – is pitch. However, the term ‘pitch’, or, more precisely, ‘variations of pitch’, only denotes the perceptual impression of speech melody, which has physiological and acoustic correlates. The physiological or articulatory source of pitch variations are changes in the rate at which the vocal folds vibrate. The vibration is a result of aerodynamic forces acting upon the folds after they have been sufficiently approximated by some of the laryngeal muscles. Acoustically, the frequency at which the vocal folds vibrate correlates with the fundamental frequency or F0. It denotes the repetition frequency of a complex, quasi-periodic sound wave, which is equivalent to the highest common factor of the sound wave’s component frequencies or ‘harmonics’ (see Ladefoged 1962: 111). The fundamental frequency is measured in ‘Hertz’ (Hz), replacing the older term ‘cycles


per second’ (cps), which directly mirrors the cyclic opening and closing of the glottis. The higher the frequency of vocal fold vibrations and, in turn, the higher the fundamental frequency of a sound, the higher is its perceived pitch. Normally, we are able to hear sounds between 40 and 4000 Hz (see ‘t Hart et al. 1990: 26), with normal speaking voices ranging between 150 and 400 Hz for women and between 80 and 200 Hz for men. However, pitch values can be perceived even if the F0 is missing, as e.g. in hissed (i.e. voiceless) sounds or in conversations on the telephone, which does not transmit sound waves below 300 Hz. In these cases, the human ear can detect the fundamental frequency by measuring the difference between the harmonics (which are integer multiples of F0). The rate at which the vocal folds vibrate is determined by their elasticity, length and mass (with vocal folds in females being shorter and lighter than in males, thus producing higher vibration frequencies) but also, more indirectly, by muscular tension and the amount of air pressure below the glottis. Increased tension stretches the vocal folds which, in turn, thins their effective vibrating portion and leads to higher F0 values. Enhanced subglottal pressure, on the other hand, primarily increases the amplitude of vocal folds vibration, not its frequency. Nevertheless, a higher amplitude induces a greater deformation of the folds, which, in turn, leads to greater mechanical stiffness and – as a consequence – faster vibration of the vocal folds (see ‘t Hart et al. 1990: 13f.). While the two influencing factors on F0 just mentioned, i.e. muscular tension and sub-glottal air pressure, are to a large extent under the speaker’s control (see Borden & Harris 1984: 74ff.), other physiological factors, like certain supralaryngeal articulatory gestures, are not. For example, high vowels have higher intrinsic pitch than low vowels (especially in prominent syllables; see e.g. 
Lehiste & Peterson 1961, Ladd & Silverman 1984, Di Cristo & Hirst 1986), and the F0 of vowels in general is affected by adjacent consonants. A voiceless obstruent, for instance, leads to a higher F0 at the beginning of the following vowel than a voiced obstruent due to increased vocal fold tension after a glottalic opening gesture (see Kingston 1991, Gussenhoven 2004). Such minor perturbations in the F0 curve do not influence listeners’ interpretation of the intonation contour (see Silverman 1987).1 They are known as instances of ‘microprosody’ or ‘microintonation’.

More complicated than determining the phonetic correlate of speech melody is the answer to the question of what constitutes accent. Among others, Kohler (1977) and Beckman (1986) claim for German and (American) English, respectively, that the acoustic correlate of accentuation is a complex mixture of F0 variation, increased intensity (primarily equivalent to higher subglottal pressure) and increased duration

1 Compare, however, Kingston & Diehl (1994), who argue that the phenomena just mentioned may be controlled articulations which are intended to enhance height or voicing contrasts. Nevertheless, they are not controlled for intonation purposes.


of syllables and words. However, especially early approaches did not treat accents as holistic phenomena, but considered single phonetic parameters responsible for their realisation. American phonologists like Bloomfield (1935), Pike (1945), Trager & Smith (1951) and Chomsky & Halle (1968), e.g., equated ‘stress’ (in the sense of ‘postlexical prominence’, i.e. at utterance level) with loudness (or intensity) and ‘intonation’ with variations in pitch (or fundamental frequency). However, British linguists such as Jones (1950) and Kingdon (1958) suspected that stress could not be separated from intonation, an assumption which was confirmed by Fry (1955, 1958) in perception experiments in which subjects had to judge whether the first or second syllable in synthesised minimal pairs like contract (noun) versus contract (verb) was accented. The test words, presented in carrier sentences such as “Where is the accent in __”, displayed combinations of different vowel intensities, durations and various stylised F0 patterns. Results suggest that a change in F0 is the most effective cue to the perception of accent, followed by duration and intensity.2 Similar results for German were found in perception tests by Isačenko & Schädlich (1966; see also Uhmann 1991: 112). Based on Fry’s findings about the hierarchical importance of the different phonetic parameters for the perception of postlexical prominences (and also in line with Isačenko & Schädlich’s findings for German), Dwight Bolinger (1958) developed a theory of pitch accents in which he equates stress (or postlexical prominence) with fundamental frequency inflections. Thus, as Beckman (1986: 54) points out, Bolinger’s pitch accents are not only associated with stresses, as, e.g., in Kingdon’s (1958) account, in which a syllable’s prominence stems from greater intensity and the tonal movement is associated with this prominent syllable, they are stresses. 
That is, the prominence is directly attributed to pitch shape. More recent experiments showed, however, that postlexical prominence must not be equated with a single phonetic feature and, furthermore, that the separation of the different parameters constituting accents is highly artificial. This artificiality becomes even more striking if we consider that eliminating certain parameters does not imply their perceptual irrelevance. Instead, a competent listener is able to add speech signals or parameters which are not there (see Kohler 1977: 83, Ladd 1980: 42). A study that investigates several phonetic correlates of prominence at the same time is Nakatani & Aston (1978). The authors used delexicalised stimulus pairs with different stress patterns (e.g. MAma versus maMA, resulting from changes in their phonetic parameters) as substitutes for real words in naturally spoken sentences and placed them at varying positions in the test utterance. Subjects had to give stress ratings to each stimulus. Generally, the relatively low effect of intensity on stress

2 See also Lehiste (1970) who states that most production studies on intonation languages come to a slightly different hierarchy of parameters: While F0 changes are still most important, intensity is reported to be more crucial to accentuation than duration.


perception could be confirmed. The effect of the other parameters, however, varied depending on the stimulus’ position in the utterance: in prenuclear position, duration and vowel quality turned out to be as important as F0 variations; in nuclear position, F0 was the most effective cue (again in line with Fry’s findings)3; and in postnuclear position, duration was the factor with the greatest influence on prominence perception. Mary Beckman’s (1986) investigation of the correlates of stress in English (and Japanese) does not (primarily) rely on subjects’ perception of (re)synthesised stimuli but is based on naturally produced data. Beckman had her American English subjects read pairs of context and target sentences, the latter having the form “I said __ this time”, i.e. the target words, which were taken from the same minimal pairs as used by Fry (e.g. CONtract versus conTRACT), always occurred in nuclear position. Beckman used a technique of automatic stress recognition from the naturally spoken utterances. She found for the English data that fundamental frequency and intensity are equally good stress markers (around 80% recognition rate), and that the duration values are just a little lower (around 70%). The attribute with the highest recognition rate, however, is ‘total amplitude’, a factor converging duration and intensity into a single acoustic category (recognition rate of 94%). This attribute also turned out to be the most dominant perceptual cue in a follow-up experiment with native speakers (and listeners) of American English, who judged the stress patterns of synthetic stimuli based on the utterance pairs from the production experiment.4 Beckman claims that duration and intensity do not act independently as correlates of accentual prominence (in production and perception). 
She further claims that the dependency between these two cues to stress is not a simple ‘trading relationship’, but stems from a more general auditory connection (Beckman 1986: 197): Since the basic psychological dimension of loudness is dependent on the total sum of intensity in the signal at the short signal durations typical of speech sounds, the association between stress and duration may have its origin in the correlation between loudness and duration, rather than in any direct correlation between stress and subjective duration as a consequence of the articulatory gestures of stress. Thus the total amplitude may be a better correlate of stress than is either duration or intensity alone and it may be a more consistent perceptual cue simply because it is a better measure of loudness, and not because loudness and subjective duration are in a trading relationship as psychological cues.
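Beckman's ‘total amplitude’ can be pictured as intensity accumulated over the duration of a syllable. The following is a minimal sketch under that assumption only: the function name `total_amplitude`, the frame length and the intensity values are hypothetical, and the code does not reproduce Beckman's actual measurement procedure.

```python
def total_amplitude(intensities, frame_ms):
    """Approximate intensity summed over time (a crude time integral).

    intensities: one intensity value per analysis frame (arbitrary linear units)
    frame_ms:    length of one frame in milliseconds
    """
    return sum(intensities) * frame_ms


# Two hypothetical syllables, measured in 10 ms frames.
short_loud = [4.0] * 10   # 100 ms at intensity 4.0
long_soft = [2.5] * 20    # 200 ms at intensity 2.5

# The longer, softer syllable accumulates more total amplitude than the
# shorter, louder one, so neither duration nor intensity alone decides.
print(total_amplitude(short_loud, 10))
print(total_amplitude(long_soft, 10))
```

On this reading, a single measure can rank two syllables differently from either intensity or duration taken alone, which is the kind of dependency between the two cues that Beckman describes.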

Support for Beckman’s findings comes from a recent study by Batliner et al. (2001) on spontaneous (American) English and German speech (the latter taken from the VERBMOBIL corpus, see Wahlster 2000). Using automatic feature extraction, the

3 Similarly, Goldbeck & Sendlmeier (1988: 312) found in a perception experiment on German declarative sentences that relative pitch height is the most relevant cue to differentiating word pairs like ÜBERsetzen versus überSETzen if they occur in sentence final position.

4 However, taking each parameter separately, the most effective cue is again fundamental frequency variation.


authors found that for modelling accents (as well as boundaries) a combination of duration and energy (or intensity) was the most relevant prosodic feature in both languages. These last two studies indicate that in languages like English and German accentual prominence is not exclusively cued by pitch variations (a claim made by Bolinger), but also by increased loudness, increased length and unreduced vowel quality. Beckman (1986) calls these languages ‘stress accent languages’ and defines them in contrast to ‘non-stress accent languages’ like Japanese by saying that the former language type simply makes use of material other than pitch to a greater extent. The more commonly used classification of languages into ‘intonation languages’, ‘pitch accent languages’ and ‘(lexical) tone languages’ is predominantly based on differences as to the linguistic level at which the parameter ‘pitch’ applies: while in intonation languages (which are generally also stress accent languages) like English and German, pitch is a postlexical feature, i.e. the tonal movement is superimposed on the words at utterance level, pitch accent languages like Swedish (a stress accent language) and Japanese (a non-stress accent language) as well as tone languages (generally non-stress accent languages) like Standard Chinese employ tonal contrasts at word level to make lexical and morphological distinctions. The difference between pitch accent languages and tone languages is that the former restrict their tonal contrasts to specific syllables, while the latter have contrastive tone on almost all syllables. However, it is difficult to draw a dividing line between these two language categories (see Gussenhoven 2004: 47). The notion of ‘stress’ applies to both word and utterance level. We have to differentiate between ‘lexical stress’, denoting abstract prominences at word level, and ‘postlexical stress’, i.e. 
concrete prominences at utterance level.5 To some extent following Beckman (1986), we will define ‘postlexical stress’ as prominence brought about by increased length and loudness plus unreduced vowel quality, and ‘accent’ (which is always postlexical) as prominence due to pitch variation superimposed on (postlexically) stressed syllables. Thus, ‘accent’ is equivalent to ‘pitch accent’, both of which will be used as synonyms in the present study (unless otherwise noted). The following list summarises the different levels of description. It is important to point out, however, that this list applies to languages which have ‘stress accent’ (in Beckman’s terms), such as English and German.

5 For Bolinger (1964), ‘stress’ is a strictly lexical feature, whereas ‘accent’ exclusively applies to the postlexical level.


(1)
Lexical stress        word level, abstract, potential for concrete prominence
Postlexical stress    utterance level, concrete, increased intensity and duration, unreduced vowels
Accent                utterance level, concrete, postlexical stress plus pitch variation

Shattuck-Hufnagel et al. (1994) showed that in American English, (lexical as well as postlexical) stresses are not always a prerequisite for pitch accents, however. In rhythmic clash contexts (as e.g. in MassaCHUsetts MIracle), which trigger a perceived shift of prominence from a primarily stressed syllable to a secondarily stressed one (e.g. MassaCHUsetts → MAssachusetts), the authors found that the acoustic reason for the perceived early prominence is a substantial F0 movement without an increase in duration on the syllable in question.6 Thus, only the pitch accent is shifted to an earlier syllable while the positions of the stresses remain constant7 (see also the discussion of example (11) in section 2.1.2.1). Support for this relative independence of the (lexical) stress pattern from the superordinate level at which accents are assigned comes from a study by Harrington et al. (1998). The authors could show that in contexts in which a potential accent contrast is neutralised (here: in postnuclear deaccented position), the contrast between primary and secondary lexical stress is also marked postlexically, since primarily stressed syllables were longer and produced with greater lip-aperture or jaw height than secondarily stressed syllables. According to Kohler (2005), not only the realisation of postlexical stresses (which are largely equivalent to what he calls duration accents; see below) is largely independent of pitch, there might even be (strong) accents which are not marked by pitch variations. Kohler calls this type of non-pitch accent force accent. Force accents are based on increased physiological and articulatory effort, the primary correlates of increased intensity. Thus, the force accent resembles the accent concept of the early days of intonation research mentioned above (going back to Sievers’ (1876) concept of expiratorischer Accent). 
Kohler (2005: 100) gives an example of a force accent in a spontaneous German utterance (capitalisations added by SB):

6 Changes in intensity have not been investigated.

7 However, the duration of the main-stress syllable of the target word decreases and does so irrespective of the accent shift (Shattuck-Hufnagel et al. 1994: 377).


(2)
[...] wie Boris  ValeRIE       die  TREPpe        runterKICKT
                 pitch accent       pitch accent  force accent

(‘when Boris kicks Valerie down the stairs’)

The items Valerie and Treppe (‘stairs’) are marked by F0 peaks, while runterkickt (‘kicks down’) has low level pitch. Note that the lexical stress pattern of the compound verb is RUNterkickt. Nevertheless, the last syllable of the verb – Kohler claims – is perceived as stronger due to its forceful articulation, primarily signalled by a long and strongly aspirated initial plosive. Despite its status as the strongest accent in the phrase, however, Kohler neither claims that the force accent represents the nucleus of the intonation unit, nor that it marks the verb as the focus of information.8 It rather “adds an expressive component of disapproval, which emotionally intensifies the meaning of the verb” (2005: 101).

The difference between stresses and accents entails a difference in the strength or degree of (postlexical) prominences. We can think of at least four different degrees of prominence at utterance level (assuming the primacy of pitch variation for the perception of prominences):

(3)
No stress/accent
Stress (syllable is louder, longer and more strongly articulated than an unaccented syllable)
Pitch accent (additional tonal movement on or in direct vicinity of a stressed syllable)
Nuclear pitch accent (last pitch accent in an intonation unit; see next section)
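The four degrees in (3) form an ordered scale, which can be made explicit in a small sketch. The class `Prominence`, the function `classify` and the example annotation are hypothetical illustrations, not part of any annotation scheme discussed here; the only rule taken from (3) is that the nuclear pitch accent is the last pitch accent in the intonation unit.

```python
from enum import IntEnum


class Prominence(IntEnum):
    """Degrees of postlexical prominence, ordered from weakest to strongest."""
    NONE = 0
    STRESS = 1
    PITCH_ACCENT = 2
    NUCLEAR_PITCH_ACCENT = 3


def classify(syllables):
    """Assign a degree to each syllable of one intonation unit.

    syllables: list of (text, stressed, pitch_accented) tuples.
    """
    accented = [i for i, (_, _, acc) in enumerate(syllables) if acc]
    last_accent = accented[-1] if accented else None
    result = []
    for i, (text, stressed, acc) in enumerate(syllables):
        if acc:
            level = (Prominence.NUCLEAR_PITCH_ACCENT if i == last_accent
                     else Prominence.PITCH_ACCENT)
        elif stressed:
            level = Prominence.STRESS
        else:
            level = Prominence.NONE
        result.append((text, level))
    return result


# Hypothetical annotation of a single intonation unit.
unit = [("a", False, False), ("CHI", True, True),
        ("nese", True, False), ("BOOK", True, True)]
for text, level in classify(unit):
    print(text, level.name)
```

Because `IntEnum` members compare as integers, statements such as "a pitch accent is more prominent than a stress" can be checked directly with `>`.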

The idea of strength degrees implies the possibility of ‘secondary accents’, a concept that has always been a problem for intonation models due to its intermediate status (see e.g. Ladd (1996) for an overview). Kohler’s ‘duration accents’, classified in the Kiel Intonation Model (Kohler 1991b) as ‘partially deaccented’, represent one instantiation of this category. This level is defined as having “its acoustic exponents primarily in the duration domain although it may be accompanied by an F0 peak inflection of a magnitude that is well below the F0 peak declination, and, of course,

8 Both concepts, ‘nucleus’ and ‘focus’, will be discussed in detail at later points in this study (especially in sections 2.1.2.2 (nucleus) and 2.2.1 (focus)).


also by higher energy” (Kohler, in press). In contrast to comparable concepts, however, duration accents are not specified with respect to their position in an intonation unit, i.e. they may occur in prenuclear or postnuclear position (not in nuclear position, though, since the nucleus is by definition the strongest primary (pitch) accent in the phrase). A secondary status has often been attributed to pitch accents as well. Büring’s (2003a) ‘ornamental accents’, e.g., are secondary pitch accents which are only allowed in prenuclear position.9 On the other hand, Grice, Ladd & Arvaniti’s (2000) ‘phrase accents’ only occur after the nucleus. Like Kohler’s duration accents, they do not represent fully-fledged pitch accents. All these concepts will be discussed in more detail later (in particular in sections 2.1.2.2, 2.1.2.3, 2.2.1, 2.2.5.2 and 5).

The following table (largely adopted from Uhmann 1991: 109) sums up the phonetic parameters that constitute accents in ‘stress accent languages’ like German and English and gives their correlates at the respective levels of description:

(4)
Auditory Phonetics                                     Articulatory Phonetics                      Acoustic Phonetics
pitch (scale of perception: high – low)                quasi-periodic vibrations of vocal folds    fundamental frequency (F0); measure: Hertz (Hz)
loudness (scale of perception: loud – soft)            articulatory effort, air pressure           intensity; measure: decibel (dB)
length (scale of perception: long – short)             articulation process                        duration; measure: millisecond (ms)
vowel quality (scale of perception: full – reduced)    vocal tract configuration                   spectral characteristics
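As a small illustration of the acoustic column, the relation between F0 and its harmonics described earlier in this section (F0 as the highest common factor of the component frequencies) can be sketched as follows. The function name and the frequency values are hypothetical, and integer Hertz values are assumed purely to keep the arithmetic exact.

```python
from functools import reduce
from math import gcd


def fundamental_from_harmonics(harmonics_hz):
    """Highest common factor of a set of harmonic frequencies (integer Hz)."""
    return reduce(gcd, harmonics_hz)


# Hypothetical voice with F0 = 120 Hz, heard over a channel that drops
# everything below 300 Hz: the fundamental and the second harmonic are lost.
all_harmonics = list(range(120, 1201, 120))
transmitted = [h for h in all_harmonics if h >= 300]

print(transmitted[:3])
print(fundamental_from_harmonics(transmitted))
```

The sketch only works when enough neighbouring harmonics survive; with every second harmonic missing, the highest common factor would be a multiple of the true F0.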

2.1.2 Phonology of Intonation

2.1.2.1 Principles of Autosegmental and Metrical Phonology

The theories of Autosegmental and Metrical Phonology were developed in the mid 1970s with the aim of overcoming the inadequacies of the standard theory of phonological representation in Generative Grammar at that time, as e.g. articulated in The Sound

9 In some studies, prenuclearity already implies secondariness, since the nuclear accent has the status of primary accent. In this study, secondary accents are one level below accent, which in turn is one level below nuclear accent, i.e. nuclear accent > accent > secondary accent.


Pattern of English (SPE) by Chomsky & Halle (1968). In this standard model, words are split up into linear sequences of sound segments represented in the form of unordered bundles of binary distinctive features. These bundles do not only contain ‘segmental’ but also what was considered in earlier work to be ‘suprasegmental’ information such as features for tone and stress. A model like this faces several problems: first, binary features are inappropriate for expressing a relative and gradual concept like stress or prominence in general. Second, prominence is a property of syllables (at least in languages like German and English), not of single (sound) segments, as is suggested by the SPE model, in which suprasegmental features are attributed to vowels only. Third, the strictly linear arrangement of the strictly binary features prevents the model from representing a tonal movement on a single segment, e.g. a fall in pitch from high to low on a short vowel (e.g. [Ι]): (5)

A:  * Ι              B:  * Ι
      + High               [+ High  - High]
      - High

In (5) A, two opposite features are contained in the same bundle, which is neither allowed structurally nor does it necessarily indicate a falling contour, since the features’ arrangement is unordered. The notation in (5) B is just as impossible, since a succession of two features is not allowed within the same segment.10 The solution to this problem was found in the separation of segmental and suprasegmental features and their organisation on different independent levels or ‘tiers’. On each of these tiers phonological features or feature bundles are arranged as independent segments or ‘autosegments’ (hence ‘Autosegmental Phonology’, see Goldsmith 1976) in linear fashion. The autosegmental tiers contain textual, melodic and rhythmic information, e.g. syllables (the tone bearing units (TBUs) in German and English), tones and stresses, and are arranged in a three-dimensional space parallel to an axis called the ‘skeletal tier’ (see Clements & Keyser 1983, Goldsmith 1990: 48ff.). The elements on this tier are time slots encoding segment length – sometimes marked by V(owel)- and C(onsonant)-slots, sometimes as undifferentiated X-slots – which serve as the anchor points for elements on the other tiers. These different kinds of ‘autosegments’ are connected to one or more anchor points by association lines (see Halle & Vergnaud 1979, McCarthy 1979), without obligatory one-to-one relations between the elements on the various planes. Since autosegmental association conventions allow for many-to-one relations between tones and text, a falling contour on a short vowel, which we found

10 Note that a long vowel or diphthong could be split up into two segments, making a succession of two tones possible.


impossible to represent in a strictly linear framework (see (5)), can now be described as a succession of a high and a low tone which are both associated with a single tone bearing unit: (6)

            Ι
          /   \
   [+ High]   [- High]
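The many-to-one association in (6) can be written down as a list of links between positions on two tiers, together with a check of the well-formedness requirement that association lines do not cross. This is only a sketch: the pair-of-indices representation and the function name `lines_cross` are assumptions made for illustration.

```python
def lines_cross(associations):
    """Return True if any two association lines cross.

    associations: list of (tone_index, tbu_index) pairs, where each index
    gives the left-to-right position of an element on its own tier.
    """
    for t1, u1 in associations:
        for t2, u2 in associations:
            # An earlier tone linked to a later TBU than a later tone is a crossing.
            if t1 < t2 and u1 > u2:
                return True
    return False


tones = ["+High", "-High"]   # tonal tier
tbus = ["Ι"]                 # tone bearing units

# (6): both tones are linked to the same TBU (many-to-one), no crossing.
fall_on_short_vowel = [(0, 0), (1, 0)]
print(lines_cross(fall_on_short_vowel))

# A hypothetical ill-formed linking over two TBUs: first tone to the second
# TBU, second tone to the first TBU.
print(lines_cross([(0, 1), (1, 0)]))
```

Many-to-one and one-to-many links pass the check as long as the left-to-right order on both tiers is respected.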

However, different languages may have specific constraints on associations of particular tones, as e.g. pointed out by Pulleyblank (1986) and Halle & Vergnaud (1982) (see also Gussenhoven & Jacobs 1998: 139f.). The only uncontroversially universal aspect to autosegmental representations is the ‘No Crossing Constraint’ (“Association lines do not cross”), a well-formedness condition that guarantees the same order of the autosegments on different tiers.

The other branch of the new ‘non-linear’ phonology, called ‘Metrical Phonology’, is concerned with the hierarchical organisation of the units on each tier (see e.g. Liberman 1975, Liberman & Prince 1977, Selkirk 1984, Hayes 1982). In particular, Metrical Phonology describes prominence relations within and between prosodic domains of different sizes (as e.g. intonation phrases, phonological phrases, prosodic words, feet and syllables) and rhythmic structures of utterances. Two different formal means are used for the representation of prominence patterns, namely ‘metrical trees’ and ‘metrical grids’, both proposed by Liberman & Prince (1977). Metrical trees are particularly suitable to express the relational character of prosodic patterns. In the original form of metrical tree notation, the nodes of the trees are strictly binary branching, with one node being structurally stronger (labelled ‘s’) and the other weaker (labelled ‘w’), as in the following examples:

(7)
  /\                      /\
 w   s                   s   w
con  tract (verb)       con  tract (noun)

The strongest element in a given domain, i.e. the one that is exclusively dominated by s-nodes, is the ‘Designated Terminal Element’ (DTE, see Liberman & Prince 1977). In (8), it is the third syllable of the word Massachusetts:

(8)
     w               s
   /   \           /   \
  s     w         s     w
  Ma    ssa       chu   setts
                  DTE
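Finding the DTE amounts to following the s-branch from the root down to a terminal. A minimal sketch, assuming a hypothetical encoding in which a node is either a syllable string or a list of (label, child) pairs in linear order:

```python
def dte(node):
    """Follow the strong ('s') branch from the root down to a syllable."""
    while not isinstance(node, str):
        node = next(child for label, child in node if label == "s")
    return node


# Trees (7) and (8) in the assumed encoding.
contract_verb = [("w", "con"), ("s", "tract")]
contract_noun = [("s", "con"), ("w", "tract")]
massachusetts = [("w", [("s", "Ma"), ("w", "ssa")]),
                 ("s", [("s", "chu"), ("w", "setts")])]

print(dte(contract_verb))
print(dte(contract_noun))
print(dte(massachusetts))
```

The syllable Ma- is the head of its own foot, but since that foot hangs from a w-node it is never reached, which matches the definition of the DTE as the terminal dominated exclusively by s-nodes.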


Metrical trees are also used to mirror the different layers of prosodic constituents. Thus, a structure like (8) could be transferred into a more closely specified tree with labels not only for syllables (‘σ’) but also for higher-level constituents such as feet (‘F’) and prosodic words (‘ω’):

(9)
              ω
         /         \
       Fw           Fs
     /    \       /    \
    σs    σw     σs    σw
    Ma    ssa    chu   setts

The foot is the basic structural unit for metrical trees. In German and English, feet are left-headed, i.e. they are composed of one strong syllable followed by one or more weaker syllables. The head syllable of the strongest foot carries the main stress of the word (since it coincides with the DTE of this domain; in (8) and (9) the syllable -chu-), while the heads of the other feet are said to have secondary stress (Ma- in (8) and (9)).

Degrees of prominence are best represented in the form of metrical grids. A metrical grid is composed of a set of layers (usually not more than five) that runs parallel to a string of syllables. Every syllable is associated with a basic beat (indicated by ‘x’) and may receive further beats on higher levels (depending on its ‘weight’ and position in the domain). The more beats attributed to a syllable, the more prominent is this syllable. The result is a structure expressing the rhythm of a word or phrase with potentially very subtle prominence differences. If we ‘translate’ the metrical tree in (8) into a metrical grid, we obtain the following:

(10)
            x
x           x
x     x     x     x
Ma    ssa   chu   setts
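A metrical grid of this kind is fully determined by the number of beats per syllable. The sketch below prints such a grid from a list of beat counts; the function name `render_grid` and the column layout are illustrative choices, and the beat counts are simply read off (10).

```python
def render_grid(syllables, beats):
    """Return a metrical grid as text, highest level first.

    syllables: list of syllable strings
    beats:     number of beats ('x') per syllable, same length as syllables
    """
    width = max(len(s) for s in syllables) + 1
    rows = []
    for level in range(max(beats), 0, -1):
        cells = ("x" if b >= level else "" for b in beats)
        rows.append("".join(c.ljust(width) for c in cells).rstrip())
    rows.append("".join(s.ljust(width) for s in syllables).rstrip())
    return "\n".join(rows)


print(render_grid(["Ma", "ssa", "chu", "setts"], [2, 1, 3, 1]))
```

Storing only the column heights keeps the representation compact: the syllable with the highest column is the most prominent one, here -chu-.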

In her influential study on the rhythmic well-formedness of English utterances, Elizabeth Selkirk (1984) develops two sets of rules, the ‘text-to-grid alignment rules’ and the ‘grid euphony rules’. The first type of rule governs the assignment of beats on different levels. In a first step, each syllable receives a (demi)beat. On the second level, the ‘Basic Beat Level’, heavy syllables (i.e. syllables of the type CVV or CVC) and root-initial syllables are aligned with a beat. In our example (10), this applies to the syllable Ma-, since it is the first syllable of the root, and to the syllable -chu-, which is heavy due to its long vowel. Note that the final syllable -setts is heavy as well. It does not receive a basic beat, however, because it counts as ‘extrametrical’ (see Hayes 1980, 1982, Selkirk 1984). On the third level, the ‘Main Stress Rule’ assigns a beat to the last syllable that carries a second-level beat, in our example the syllable -chu-. With this cycle the lexical stress pattern is completed.

Above word level, three kinds of text-to-grid alignment rules apply. The ‘Nuclear Stress Rule’ attributes the greatest prominence to the rightmost lexically stressed constituent in the phrase, while the ‘Compound Stress Rule’ makes the lefthand constituent in compound words most prominent. Finally, the ‘Pitch Accent Prominence Rule’ (Selkirk 1984: 152, 276), which is central to Selkirk’s model, ensures that syllables carrying a pitch accent are more prominent than unaccented syllables.

The second type of rules, the ‘grid euphony rules’, serve to adjust the outcome of the text-to-grid alignment rules to an ‘ideal grid’. The basic principle of this ideal grid is the ‘Principle of Rhythmic Alternation’, saying that every strong position should be followed by a weak position and any weak position should not be preceded by more than one weak position on the same level (Selkirk 1984: 52). Three kinds of purely rhythmic euphony rules are set up in order to establish this alternation, namely the rules of ‘Beat Addition’ (filling rhythmic gaps), ‘Beat Movement’ and ‘Beat Deletion’ (both removing stress clashes). The most prominent position in a given domain (the DTE) is always preserved, however, i.e. it is not affected by stress changes brought about by euphony rules.11

For illustration purposes, let us have a look at the stress shift example discussed in the previous section. In (11), the metrical grids of the isolated words Massachusetts and miracle are presented. The Main Stress Rule assigns primary lexical stresses to the syllables -chu- and mi-:

(11)
            x                     x
x           x                     x
x     x     x     x         +     x     x     x
Ma    ssa   chu   setts           mi    ra    cle

The figure in (12) shows a succession of the two words in a phrase. After applying the postlexical Nuclear Stress Rule, the first syllable in miracle becomes most prominent, since it carries the last lexical stress in the phrase. The box indicates the clash between the two adjacent primary-stressed syllables (the extrametrical syllable setts does not prevent the clash), which is resolved by a euphony rule moving the beat on -chu- leftward to the next stronger syllable Ma- (see Shattuck-Hufnagel et al. 1994: 359). Rhythmic alternation is established.

11 Based on Selkirk’s model for English, Susanne Uhmann (1991: 176ff.) proposed very similar rules for the rhythmic structure of utterances in German.


(12)
                                  x
           [x]                   [x]
x           x                     x
x     x     x     x               x     x     x
Ma    ssa   chu   setts           mi    ra    cle

(square brackets stand in for the box around the clashing beats)
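The clash in (12) and its resolution can be sketched with the grid stored as beat counts per syllable. The definition of a clash used here (two beats on one level with no beat of the next lower level between them) and the function names are simplifying assumptions, not Selkirk's own formulation of Beat Movement.

```python
def find_clash(beats, level):
    """Return (left, right) indices of two clashing beats on `level`, or None."""
    marked = [i for i, b in enumerate(beats) if b >= level]
    for left, right in zip(marked, marked[1:]):
        # A clash: nothing between the two beats reaches the next lower level.
        if not any(beats[k] >= level - 1 for k in range(left + 1, right)):
            return (left, right)
    return None


def beat_movement(beats, level):
    """Move the left clashing beat leftward to the nearest syllable that
    already has a beat on the next lower level; the right one is kept."""
    clash = find_clash(beats, level)
    if clash is None:
        return list(beats)
    left, _ = clash
    target = next((k for k in range(left - 1, -1, -1)
                   if beats[k] >= level - 1), None)
    if target is None:
        return list(beats)
    moved = list(beats)
    moved[left] -= 1
    moved[target] += 1
    return moved


syllables = ["Ma", "ssa", "chu", "setts", "mi", "ra", "cle"]
phrase = [2, 1, 3, 1, 4, 1, 1]        # grid (12): mi- carries nuclear stress

print(find_clash(phrase, 3))
shifted = beat_movement(phrase, 3)
print(list(zip(syllables, shifted)))
print(find_clash(shifted, 3))
```

In this sketch the beat lands on Ma- because it is the closest syllable to the left with a second-level beat, and the column for mi- is never touched, in keeping with the requirement that the DTE is preserved.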

In Metrical Phonology, linguistic stress is usually not regarded as a phonological feature that is given some content by phonetic implementation rules, but as an abstract structural position (see Gussenhoven & Jacobs 1998: 206ff.). This also holds for Selkirk’s approach, whose rules – including the postlexical Nuclear Stress and Compound Stress Rules – do not make reference to local acoustic or auditory properties. However, Selkirk’s Pitch Accent Prominence Rule emphasises the importance of tonal features, leaving behind the level of pure abstraction to some extent. In fact, Selkirk proposes a so-called ‘pitch-accent first’ approach, which claims that the assignment of pitch accents takes place independently of (and primary to) the metrical grid. This may result in pitch accents being placed on syllables that do not have maximum level prominence in the metrical grid, so that the whole grid – including the position of the DTE – may have to be rearranged to make sure that the greatest prominence is assigned to the pitch accented syllable. This relative independence of pitch from stress is supported by the experimental findings of Shattuck-Hufnagel et al. (1994) reported on in section 2.1.1. Results showed that a pitch accent (in American English) may occur early in the phrase irrespective of the basic stress pattern. In fact, the findings suggest that in English clash contexts like (12), it is the position of the pitch accent that is shifted or rearranged and not the stress (as already proposed by Bolinger 1965, 1986). However, as Gussenhoven (2004: 142) points out, a pitch accent cannot be relocated on an unstressed syllable (as the first syllable of obese in (13) B):

(13)
A:      ChiNESE  →  a CHInese BOOK¹²
but B:  oBESE    →  an oBESE PERson

The notion of ‘stress’ as used by Shattuck-Hufnagel et al. and other linguists who did empirical studies is more concrete than Selkirk’s, similar to Ladd’s understanding

12 Actually, Gussenhoven postulates that a word like Chinese which consists of two feet also carries (at an abstract level) two pitch accents. Thus, the example in (13) A is strictly speaking a case of accent deletion rather than accent shift.


of the concept. Ladd (1996) is not only concerned with the phonological but also with the phonetic nature of stress. He regards increased duration and intensity as well as shallower spectral tilt as the acoustic factors that cause the salience of metrically strong or stressed syllables, which are sometimes accompanied by F0 excursions that indicate pitch accents. However, according to Ladd (1996: 59), pitch accents are elements of the intonation contour and do not in themselves represent the acoustic realisation of stress. They serve as an indirect cue to syllable prominence, because they must be associated with strong or prominent syllables, but [...] they do not in and of themselves constitute the prominent syllable’s prominence.

The claim that (in English and German) pitch accents “must be associated with strong or prominent syllables” makes Ladd’s account a ‘stress-first’ account, since postlexical prominence is primarily determined by metrical strength relations in the phonological structure of utterances, not by the distribution of pitch accents. Pitch accents cannot override the metrical pattern (as in Selkirk’s approach) but only add another level of prominence to the grid. This conception is in line with our definition of stress and accent in (1). In sum, prominence relations are first of all phonological abstractions, but they may be manifested in two distinct phonetic aspects of utterances: acoustic salience of individual syllables (i.e. stress), and the location of pitch accents (see Ladd 1996: 53f.).

2.1.2.2 Intonation in Autosegmental-Metrical Phonology

Based on the principles of Autosegmental and Metrical Phonology introduced in the previous section, Janet Pierrehumbert (1980) developed a system for the description and analysis of (American) English intonation, which became a widely accepted standard within the framework of Autosegmental-Metrical (henceforth AM) theory13 and served as a point of reference for work on many other intonation languages. As we have seen above, some of the central claims of this theory, especially of the association conventions in Autosegmental Phonology, which govern the synchronisation of the autosegments on different tiers, were originally developed for tone languages, i.e. for tonal specifications on the lexical level (see Goldsmith 1976, Williams 1976). However, they can be transferred to intonation languages like English and German as well, in which tone applies only postlexically. Studies on English intonation in this framework include, among many others, Beckman & Pierrehumbert (1986), Pierrehumbert & Hirschberg (1990), Ladd (1983a, 1996) and Gussenhoven (1984), while work on German has been carried out e.g. by Wunderlich (1988), Uhmann (1991), Féry (1993), Grabe (1998) and Grice et al. (2004).14

13 The term was introduced by Ladd (1996: 42).

A central feature of intonation models within AM Phonology is the decomposition of intonation contours into sequences of tonal events. These events consist of high (H) or low (L) targets15 which are anchored to prominent elements or to the edges of phrases. This anchoring corresponds to the two major functions which tones can have: prominence-lending and delimiting. The tone or combination of tones anchored to, or associated with, prominence is referred to as a pitch accent, and the tones anchored to, or associated with, the ends, or right edges, of phrases are referred to as edge tones or boundary tones. The association with prominence is characterised by the association of tones to the lexically (and generally also postlexically) stressed syllable of a metrically strong word, as the H tone on the syllable mor- in (14). Similarly, the end of the falling pitch contour in (14) is characterised by the association of an L tone to the boundary of the intonation unit:16 (14)

[Figure: schematic pitch contour over ‘good MORning’, with an H target on the stressed syllable MOR- and an L target at the right edge of the phrase; labels: pitch, stress.] (adapted from Grice 2004)

Pitch accents are marked with a star ‘*’ following the tone (according to a convention proposed in Goldsmith’s (1981) ‘Accentuation Principle’), e.g. H*. Where a pitch accent consists of more than one tone, the two tones are joined with a ‘+’ sign, and the tone which has the main association with the lexically stressed syllable of the accented word is marked with a star just after it (e.g. H+L* = low tone on stressed syllable preceded by a high tone). Boundary tones (to be more exact, the boundary tones of intonation phrases) are symbolised with a percent ‘%’ sign following the tone, e.g. L%. The association of the autosegmental tone and text tiers of the above example is given in (15): (15)

Tones          H*      L%
Text    [ good MORning ]

14 For an overview of older work on German intonation, mostly within the British tradition (e.g. von Essen 1956, Kohler 1977, or Pheby 1980), compare Grice & Baumann (2002) or Grice et al. (2004).
15 Some other studies make use of an additional mid (M) level (e.g. Goldsmith 1981).
16 Intonation units can be of various sizes and mark different levels of phrasing, depending on the language and model at hand. Possible intonation units are ‘Accentual Phrase’, ‘Intermediate Phrase’, ‘Phonological Phrase’ or ‘Intonation Phrase’.

Thus, three tiers are crucial for the description of intonation contours in AM Phonology: a tone tier, a metrical tier (determining the prominent syllables) and a text tier. The greatest advantage compared to the other widely applied model in intonation research, known as the ‘British School’ (e.g. Palmer (1922), Kingdon (1958), Halliday (1967a), Crystal (1969), O’Connor & Arnold (1973)), is that tonal information can be precisely localised on single syllables and/or at the edges of phrases. In British School studies, the only direct connection between tones and text occurs on the nucleus (tonic in Halliday’s terminology), defined as the most prominent stressed syllable in an utterance. The nucleus is the only obligatory part of a tone group (optionally preceded by a prehead and a head, and followed by a tail) carrying the most relevant tonal information. In most AM models, the nucleus does not have a special status. It is simply defined as the last fully-fledged pitch accent in a phrase (associated with the DTE at phrase level, and in the unmarked case determined by the Nuclear Stress Rule17), which means that there is no theoretical distinction between ‘prenuclear’ and ‘nuclear’ accents. However, the nuclear accent tends to be pragmatically the most important accent in the phrase, often signalling the main focus of the sentence. The utterances in (16) A and B contain two pitch accents each, one prenuclear and one nuclear. Although both phrases differ in length and segmental material, they display the same intonation contour, which is mirrored by the same succession of tonal events: (16) A:

      H*   H*   L%
    [ GO   NOW  ]

B:      H*                   H*          L%
    [ I ASK you to leave the DIning room ]

This abstract phonological notation has the advantage of showing that the intonation contours of the two utterances are functionally equivalent. In terms of phonetic realisation, there might be slight differences, though. There is, e.g., the physiological effect that the overall frequency decreases in the course of an utterance. This phenomenon is called ‘declination’ and leads to gradually lowered accent peaks, an effect that is enhanced with utterance length. Furthermore, the transitions between adjacent tones, also called ‘interpolations’, will not always be as direct as the notation suggests. While the interpolation between a high and a low target may be thought of as relatively linear, a transition between two high tones is usually realised as a sag, whose markedness again depends on the distance between the targets. However, these differences are claimed to have no phonological relevance but are regarded as surface phenomena.

17 This has been proposed by Pierrehumbert (1980).

There are syntagmatic relations between peaks that are regarded as phonologically relevant, though. If a high tone is considerably lowered in relation to a previous high tone (without being really low), we talk about downstep, and if the high tone is considerably higher than the preceding H, we talk about upstep. Downstep and upstep apply to both pitch accents and edge tones. The notation of these two phenomena will be discussed in section 2.1.2.3.

The tonal inventory of Pierrehumbert’s original analysis does not only comprise pitch accents and boundary tones but also a third type of tone, the phrase accent or phrase tone (as it has occasionally been called). I will use the more neutral term ‘phrase tone’ in this section and reserve the term ‘phrase accent’ for a specific concept as defined by Grice, Ladd & Arvaniti (2000), which will be discussed below. The phrase tone determines the pitch value between the nuclear accent and the boundary tone. Pierrehumbert’s original motivation for the phrase tone, inspired by work on Swedish by Bruce (1977), was the description of the nuclear rise-fall-rise pattern in English. Such a contour that consists of a sequence of four tones (LHLH) is impossible to account for with one maximally bitonal pitch accent and an obligatorily monotonal boundary tone. The solution was to insert another kind of edge tone between the final accent and the boundary. Pierrehumbert could prove the existence of such a tone in utterances with longer postnuclear stretches. Here, it was the phrase tone (usually an L) that spread over several syllables, while the shapes of the pitch accent and boundary tone were held constant.
The phrase tone, which is always monotonal, is marked with a minus ‘-’ sign after the tone. Thus, the rise-fall-rise contour mentioned above is transcribed as L*+H L- H%. Later, in Beckman & Pierrehumbert (1986), the phrase tone received another motivation, when it was defined as the boundary tone of an intermediate phrase. Intermediate phrases (ip) constitute minor intonation units, whereas intonation phrases (IP) constitute major ones. The hierarchical relationship between the two levels of phrasing is illustrated in (17):


(17)
                           Intonation Phrase (IP)
   intermediate phrase (ip)   intermediate phrase (ip)   intermediate phrase (ip)
((                         )ip (                      )ip (                      )ip )IP
                                                                                     %
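The labelling conventions introduced so far (a star for the tone associated with the stressed syllable, ‘+’ joining the tones of a bitonal accent, ‘-’ for phrase tones and ‘%’ for boundary tones) can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: the names classify_label and nuclear_accent are hypothetical, downstep and upstep diacritics are ignored, and the nucleus is simply taken to be the last pitch accent in the phrase, as in most AM models.

```python
import re
from typing import List, Optional

# Tone labels as described in the text: H or L targets, optionally
# bitonal (joined by '+'), with '*' marking the tone associated with
# the stressed syllable. All names here are illustrative only.
PITCH_ACCENT = re.compile(r"^(?:[HL]\*(?:\+[HL])?|[HL]\+[HL]\*)$")
PHRASE_TONE = re.compile(r"^[HL]-$")
BOUNDARY_TONE = re.compile(r"^[HL]%$")


def classify_label(label: str) -> str:
    """Return the category of a single tone label."""
    if PITCH_ACCENT.match(label):
        return "pitch accent"
    if PHRASE_TONE.match(label):
        return "phrase tone"
    if BOUNDARY_TONE.match(label):
        return "boundary tone"
    raise ValueError(f"unrecognised tone label: {label!r}")


def nuclear_accent(labels: List[str]) -> Optional[str]:
    """Return the last pitch accent in a phrase, or None if there is none."""
    accents = [l for l in labels if classify_label(l) == "pitch accent"]
    return accents[-1] if accents else None


# The rise-fall-rise contour from the text, and the tone string of (16).
rise_fall_rise = "L*+H L- H%".split()
print([classify_label(l) for l in rise_fall_rise])
print(nuclear_accent(["H*", "H*", "L%"]))
```

The classifier only inspects the diacritic of each label, which mirrors the point made above that the three tone types are distinguished by their association (stressed syllable, intermediate phrase edge, intonation phrase edge) rather than by their tonal value.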

The concept of phrase tones has recently been modified in a study by Grice, Ladd & Arvaniti (2000). The authors could show that in a variety of languages, phrase tones function as edge tones but are at the same time secondarily associated with postnuclear stressed syllables (if present). Thus, the category in question – which Grice et al. call ‘phrase accent’ – has a hybrid nature, combining the two functions of delimiting (expressed by the first part of the term, i.e. ‘phrase’) and highlighting (expressed by the second part of the term, i.e. ‘accent’). An example of a phrase accent in German and a suggestion of a new way of transcribing this tonal event are given in the following section, 2.1.2.3. In that section, we will also present the complete inventory of German tones as part of the intonation model GToBI. This system is based on the (American) English ToBI system, which is in turn based on Pierrehumbert’s analysis of English intonation. Our motivation for restricting the description to German is that the tonal systems of the standard varieties of German and English are very similar, and that it is German intonation we will examine in the perception experiments presented in chapter 4.

Let us conclude this section with a remark on the status of pitch accents and stresses in AM approaches to intonation. Unfortunately, postlexical prominences are exclusively linked to pitch variation in these studies, turning the whole theory into a ‘pitch-accent-first’ theory, similar to Selkirk’s. On the other hand, pitch accents are generally just ‘added’ to the metrically strongest, i.e. primarily stressed, syllables and cannot be placed independently of the utterance’s stress pattern, which is a feature of ‘stress-first’ accounts. However, the metrical component has always been neglected in AM models of intonation – even in Ladd (1996), despite his proposal of a ‘metrical theory of sentence stress’ (1996: 221ff.). A result of the general primacy of pitch accents can be seen in AM-based annotation models, which usually leave no room for postlexical stresses or duration accents (see Kohler 2005), which certainly constitute prominences (although secondary) but lack pitch variation. All kinds of secondary prominence are usually neglected (with the exception of Grice et al.’s phrase accents), although they can be used as markers of relevant shades or degrees of information. We will come back to the issue of secondary prominences several times during this study.

2.1.2.3 GToBI

GToBI (‘German Tones and Break Indices’) is a description system for ‘Standard German’ intonation that is based on and closely related to the original ToBI model for ‘Mainstream American English’ intonation. This original model has been extended as a general framework for developing intonation systems for other varieties and languages since the early 1990s.18 GToBI was developed between 1995 and 1996 by researchers from Saarbrücken, Stuttgart, Munich and Braunschweig with the aim of facilitating the exchange of prosodically annotated data. The original GToBI system including the (by now updated) training materials19 is based on speech data mainly from Northern German speakers. It has been slightly modified in the last few years (see Grice & Baumann 2002, Grice, Baumann & Benzmüller 2005 for an overview).

A (G)ToBI record consists of at least three different levels of description, which can be thought of as corresponding to autosegmental tiers. These tiers contain labels for text, tones, and break indices. The text tier provides an orthographic transcription of the words spoken, the tones tier mirrors the perceived pitch contour in terms of tonal events such as pitch accents and boundary tones, and the break index tier marks the perceived strength of phrase boundaries. The information displayed by the three levels is related to each other by means of association principles described in the previous section. Thus, pitch accents are associated with lexically stressed syllables, indicated by a starred tone placed within the limits of the accented word generally at local F0 minima and maxima.20 Edge tones are assigned to phrase-final syllables, marked by ‘-’ or ‘%’ after the tone, signalling the edge of an intermediate phrase or an intonation phrase, respectively. Since the break indices ‘3’ and ‘4’, which are used in the American English ToBI system, coincide with the ‘-’ and ‘%’ symbols and are thus redundant, they are not used in GToBI.
Rather, the German ToBI system only indicates mismatches in tonal and rhythmic structure as well as insecurity as to the level of phrasing on the break index tier. In the following, we will only be concerned with the inventory of pitch accents and boundary tones for German, not with transcription details, which are described in Grice & Baumann (2002), Grice et al. (2004) and on the GToBI webpage.
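The redundancy of the break indices ‘3’ and ‘4’ noted above can be illustrated with a short sketch. The Python fragment below is not part of any ToBI or GToBI software; the class ToBIRecord, the function implied_break_indices and the toy transcription are assumptions made purely for illustration. It shows that, once edge tones are labelled with ‘-’ and ‘%’, the corresponding break index values can be derived mechanically from the tone tier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToBIRecord:
    """Hypothetical container for a text tier and a tone tier."""
    words: List[str]
    # Each entry: (index of the word the label is attached to, tone label)
    tones: List[Tuple[int, str]] = field(default_factory=list)


def implied_break_indices(record: ToBIRecord) -> Dict[int, int]:
    """Derive break index 3 or 4 from '-' and '%' edge tone labels."""
    breaks: Dict[int, int] = {}
    for word_index, label in record.tones:
        if label.endswith("%"):
            level = 4          # edge of an intonation phrase
        elif label.endswith("-"):
            level = 3          # edge of an intermediate phrase
        else:
            continue           # pitch accents do not imply a boundary
        # Keep the strongest boundary if several edge tones share a word.
        breaks[word_index] = max(level, breaks.get(word_index, 0))
    return breaks


# Toy transcription (assumed for illustration, not taken from a corpus).
record = ToBIRecord(
    words=["good", "MORning"],
    tones=[(1, "H*"), (1, "L-"), (1, "L%")],
)
print(implied_break_indices(record))
```

Because the values 3 and 4 follow from the tone tier in this way, a break index tier only needs to carry the information that cannot be derived, which is the design choice described for GToBI.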

18 The ToBI framework homepage can be found at http://www.ling.ohio-state.edu/~tobi/.
19 The training materials are available via the GToBI home page: http://www.coli.unisb.de/phonetik/projects/Tobi/gtobi.php3.
20 Often, the actual turning point or tonal target (in most cases a pitch maximum) is reached in the syllable following the accented syllable. In a careful transcription, this late alignment of tonal peaks (as well as valleys) can be indicated by a ‘<’ [...]

[...] The ‘>’ symbol is to be interpreted as ‘significantly preferred over’.

(188) Significant relations between priming conditions and nuclear pitch accent type

Mode of Prime                 Accent Type Preferences
auditory prime (Given)        Ø > H+L* > H*
visual prime (Accessible)     ---
no prime (New)                H* > Ø, H+L* > Ø

Priming Mode Preferences                        Accent Type
auditory prime > visual prime > no prime        Ø
---                                             H+L*
no prime > visual prime > auditory prime        H*
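The relations summarised in (188) can also be encoded as ordered chains and queried directly. The following Python sketch is offered only as a reading aid: the names ACCENT_PREFERENCES and significantly_preferred are hypothetical, and the chains are read transitively (so that Ø > H+L* > H* is taken to imply Ø > H*), which is an assumption about the notation rather than a separately reported test.

```python
from typing import Dict, List

# Preference chains from the first table in (188), ordered from most to
# least acceptable. An empty list means no significant differences.
ACCENT_PREFERENCES: Dict[str, List[List[str]]] = {
    "auditory prime (Given)": [["Ø", "H+L*", "H*"]],
    "visual prime (Accessible)": [],
    "no prime (New)": [["H*", "Ø"], ["H+L*", "Ø"]],
}


def significantly_preferred(condition: str, better: str, worse: str) -> bool:
    """True if `better` precedes `worse` in some chain for `condition`."""
    for chain in ACCENT_PREFERENCES[condition]:
        if better in chain and worse in chain:
            if chain.index(better) < chain.index(worse):
                return True
    return False


print(significantly_preferred("auditory prime (Given)", "H+L*", "H*"))
print(significantly_preferred("no prime (New)", "H*", "H+L*"))
```

Under this encoding, the absence of a chain linking H* and H+L* in the no-prime condition corresponds to the non-significant distinction between the two accent types for New referents.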

Overall results show a highly significant interaction between accent type (including deaccentuation) and priming condition (F(4, 728) = 27.82; p < .001). However, the effect of accent pattern varies considerably within the different priming conditions. In the no-prime (Newness) condition, both accent types are significantly preferred over deaccentuation (F(2, 240) = 30.92; p < .001), whereas the distinction between H* and H+L* is not significant. The visual-prime condition, which we assumed to trigger Accessible referents, does not provide significant differences between the three accent patterns. In the auditory-prime (Givenness) condition, however, the phonological differences in the target sentences are highly significant (F(2, 239) = 26.68; p < .001): Deaccentuation is the preferred marker of an already activated referent, as hypothesised. Also, H+L* turns out to be significantly more acceptable than H* for marking Given information.

To clarify further the appropriateness of the individual accent patterns, we also conducted an analysis across priming conditions. For the H* pitch accent as well as for deaccentuation, the influence of all three priming conditions has significant effects (F(2, 231) = 19.49; p < .001 and F(2, 240) = 30.19; p < .001, respectively). H* is significantly more acceptable after no prime than after a visual prime, and more acceptable after a visual prime than after an auditory one. Similarly, deaccentuation is preferred after auditory priming in comparison to visual priming, and after visual priming as opposed to no priming. However, the priming condition does not have a significant influence on the acceptability of the H+L* pitch accent. Nevertheless, the figures support our hypotheses that deaccentuation is most appropriate to mark Given information, while H* is most appropriate to mark New information.

4.1.6 Discussion

Our experiment clearly confirms the general assumption that New information is preferably marked by a pitch accent. However, there is no significant preference for the type of accent marking Newness. There is, nevertheless, indirect evidence in favour of H*, since this accent type is significantly more acceptable in the no-prime (New) condition in comparison to the other conditions. There is no such effect with the other pitch accent type tested, H+L*.


The data further suggest that deaccentuation is most appropriate to mark Given information, provided that an auditorily presented element can be called Given. Indeed, there are good reasons to think so: first, as mentioned in 4.1.4 and according to Terken & Hirschberg (1994), the setup guarantees persistence of the surface position between prime and target referent, which should in itself be sufficient to regard the referent as activated. Second, there might be an activation support for spoken language due to ‘echoic memory’ (see section 4.1.3.1). Third, auditorily presented material is textually or linguistically Given, which in other studies (e.g. Cruttenden, in press) has been taken to be activated information. Finally, the nature of the task might have a crucial influence: since subjects were asked to evaluate the appropriateness of an auditorily presented target sentence, they may have been more sensitive to the auditory channel, which could have increased the referent’s activation degree. The auditory priming condition provides another significant difference: pitch accent type H+L* was preferred over H* for marking the activated referents. This can serve as (at least indirect) evidence for the role of H+L* as an ‘Accessibility-accent’ or ‘activation-accent’ rather than a ‘Newness-accent’, which is in line with our hypothesis. The visual priming condition did not trigger a significant preference of pitch accent type in the target sentences, which indicates that the activation status of referents established by this (non-linguistic) mode of presentation is not as clear-cut as in the auditory mode. One reason could be that different subjects reacted differently to the visual primes, e.g. in that they paid more or less attention to the pictures, causing the variability that is probably behind the lack of significance in the results.
However, evidence that the two priming conditions are different is provided by the fact that H* is more acceptable and deaccentuation is less acceptable after visual than after auditory priming. We interpret this to mean that visually presented referents are “less Given” than auditorily presented referents. Nevertheless, a simple equation of visually presented material with Accessible information appears to be at best an oversimplification. The degree of Givenness of a visually available referent remains vague, since no significant difference in its intonational marking could be found. Research in two directions is necessary: first, we need more sophisticated experiments on visual Givenness in a less artificial setup, including ‘real-world’ objects instead of pictures. Second, we need experiments on other types of Accessibility, i.e. on textual and inferential Accessibility, in the sense of semi-active information due to previous mention that is either displaced or evoked as part of a given scenario or schema, or which is inferable from an antecedent via a lexical relation. The following section (4.2) reports on a perception experiment of this second type.


4.2 Perception Experiment II: Accent Type and Types of Accessibility

4.2.1 Motivation

In the perception experiment discussed in the previous subchapter (also described in Baumann & Hadelich, 2003a and 2003b) the hypothesis was tested whether pitch accent type plays a role in the marking of different degrees of Givenness (or levels of activation) in German. In that experiment, target referents were either auditorily or visually primed, or not primed at all, corresponding to Given, Accessible and New information (see sections 2.2.4 and 2.2.5.2). In general, accent type H* was found to be the most appropriate marker for New information, while for Given referents pitch accent type H+L* was preferred over H*, although deaccentuation (i.e. no accent at all) was most acceptable. Since there was only indirect evidence for a preferred marking of the category ‘Accessible information’, and since only one type of Accessibility – namely situational Accessibility due to visual priming – had been tested, there was an obvious need for further experiments.

4.2.2 Hypotheses

The experiment investigates the intonational marking of textually and inferentially Accessible referents in sentence-final position (in German). The basic hypothesis is that the type of Accessibility of a referent correlates with the type of pitch accent (including deaccentuation) used for marking it. We further hypothesise that within the category of ‘Accessibility’ there are differences in degree of activation, which are reflected in the preferred choice of intonational marking. In particular, the more active a referent, the more likely deaccentuation is to be the preferred prosodic marker. The less active a referent, the more likely an H* pitch accent is to be preferred. H+L* should take an intermediate position, marking information between the extreme poles of the continuum. In order to keep the length of the test within reasonable limits, we did not repeat the investigation of the first perception experiment into the intonational marking of New information. Neither did we include totally Given referents. This means that we concentrated on the intermediate stages between fully Given and fully New.

4.2.3 Design of Test Material

4.2.3.1 Types of Accessibility Investigated

Eight different relations between a textually given antecedent and an anaphor108 (the target referent) were tested with regard to listeners’ preferred pitch accent type on the target referents. The types of Accessibility included the same expression recurring after three intervening clauses, which will be referred to as textually displaced. They also included inferentially Accessible relations comprising a scenario condition (trial - judge), symmetrical lexical relations such as synonymy (lift - elevator) and converseness (sister - brother), and asymmetrical lexical relations like hypernymy-hyponymy (flower - lily) and meronymy (whole-part; hand - finger) in both orders. We restricted our analysis to the above relations largely on the basis of claims and observations by Chafe (e.g. 1987) and Allerton (1978). They will be dealt with in some detail below.

108 We adopt the notion of anaphor from van Deemter (1992), where the term is not restricted – as in the traditional sense – to proforms, and, as a consequence, to simple identity relations to the antecedents. Rather, an anaphor may be any kind of expression that refers back (directly or via inference) to an already established concept.

Textually Displaced

As pointed out in section 2.2.5.2, one reason for a referent to be ‘merely’ Accessible is its “deactivation from an earlier state, typically through having been active at an earlier point in the discourse” (Chafe 1987: 29). However, it is a matter of debate how far away the ‘earlier point’ of a referent’s mention or activation may be to still count as at least semi-active. Clark and Sengul (1979) claim that recency effects are not linear. While they found a significantly higher availability of a referent mentioned in the previous clause compared to a referent mentioned two clauses back, there was no significant effect between referents from two clauses back and referents from three clauses back. We investigated this question in our textually displaced condition. For example:

(189) The hikers passed by an old house. They were exhausted by the long way they had walked and longed for a short rest. The hikers approached the house. One of them knocked on the door.109

Note that the recurrence of the same word or lexeme is crucial for the notion of textual Givenness or Accessibility – it does not necessarily have to denote the same referent (see examples (38) and (39) in section 2.2.2). We will come back to the question of coreference in the discussion.

Scenario

A second reason for a referent to be semi-active is its being part of a scenario (see Sanford & Garrod 1981) – or of an equivalent concept like a schema (see e.g. Tannen 1979) or semantic frame (see Fillmore 1982), already introduced in section 2.2.5.2. The establishment of a scenario (e.g. ‘in court’) simultaneously coestablishes a set of referents or concepts that are representative of or constitutive for the given scenario (e.g. ‘judge’, ‘lawyer’, ‘juror’). They cannot be considered fully Given, since they have not been activated in an explicit and thus direct way, but rather constitute Accessible information. Prince’s (1981: 233) ‘bus-and-driver’ example (mentioned above in (73) and (103)) is such a case.

109 In the examples throughout this chapter, antecedents and anaphors are underlined.

Synonymy and Converseness

Allerton (1978) claims that synonyms and anaphors which are in a converseness relation to an antecedent are likely to be deaccented (or, in his terms, ‘denuclearised’). Converseness is an antonymic relation (next to antonymy proper and complementarity) expressing the equivalence of two lexical items. It has been chosen from the group of opposites since converseness relations are more often realised as referring expressions, i.e. noun phrases, and because antonyms are more likely to be treated as contrastive or New(sworthy) items, and would thus always be accented. Allerton (1978: 141) mentions the following examples:

(190) Synonymy
Why don’t you sit on our settee? By the way, where did you BUY the sofa?

(191) Converseness
I’ve just heard about Derby County’s victory over Liverpool. That’s Liverpool’s SEcond defeat. (adapted from Allerton’s original example)

Hypernymy-Hyponymy and Meronymy

A similar tendency to deaccent holds for superordinates (like hypernyms or holonyms) if they follow their subordinate terms (like hyponyms or meronyms)110. Examples are given in (192) and (193) (Allerton 1978: 141; see also Rochemont 1986):

(192) Hyponym-Hypernym
D’you drink whisky? I’m afraid I don’t TOUCH spirits.

(193) Part-Whole
John’s got trouble with his handbrake. What SORT of car has he GOT?

However, deaccentuation is not appropriate if the order is reversed:

(194) Hypernym-Hyponym
D’you drink whisky? I’m afraid I don’t like BOURbon.
but * I’m afraid I don’t LIKE bourbon.

(195) Whole-Part
John’s got trouble with his car. What sort of HANDbrake has he got?
but * What SORT of handbrake has he got?

Allerton explains this pattern by the fact that while the hyponym frequently implies the superordinate (Lyons 1968: 455), and the part frequently implies the whole, the reverse applies only rarely in either case. In other words, the use of the hyponym or ‘part’ word involves adding extra information in a way the reverse sequence does not, and this extra ‘new’ information requires some degree of stress (1978: 142).

110 Of the possible meronymic relations, we selected whole-part, which can be regarded as the prototypical meronymic relation (Cruse 1986: 160).

A similar point is made by van Deemter (1994: 21), who accounts for this difference in accentuation in asymmetrical relations by the notion of ‘concept-Givenness’:

Concept-Givenness
An occurrence w of a word is concept-Given if the same or the previous sentence contains, to the left of w, another occurrence w’, of an expression, whose reference is known to be subsumed by that of w.

He gives the following examples (van Deemter 1999: 7): the anaphoric hypernym string instruments (the subsuming term) can be deaccented, as in (196):

(196) Bach wrote many pieces for viola. He must have LOVED string instruments.

In contrast, the anaphoric hyponym viola (the subsumed term) cannot be deaccented, resulting in (197):

(197) Bach wrote many pieces for string instruments. He must have loved the viOla.
but * He must have LOVED the viola.

Subsectional anaphors like viola in (197), would not be concept-Given, but New, and thus have to be accented. Van Deemter does not claim that different types of accent mark different degrees of Givenness. He simply regards accents of various phonetic realisations as markers of New information, and lack of accent as a marker of Given information.
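Van Deemter’s definition of concept-Givenness is procedural enough to be sketched in code. The Python fragment below is a minimal illustration under several assumptions that are not part of the original definition: the lexicon SUBSUMES is a hypothetical toy table covering only the examples cited here, multi-word expressions such as ‘string instruments’ are treated as single tokens, and identity of expression is counted as a limiting case of subsumption.

```python
from typing import Dict, List, Set

# Toy lexicon: each expression maps to the expressions it subsumes.
# Hypothetical data, limited to the examples discussed in the text.
SUBSUMES: Dict[str, Set[str]] = {
    "string instruments": {"viola"},
    "spirits": {"whisky"},
}


def subsumes(general: str, specific: str) -> bool:
    """True if `general` equals or is a listed hypernym of `specific`."""
    return general == specific or specific in SUBSUMES.get(general, set())


def is_concept_given(sentences: List[List[str]], sent_idx: int, pos: int) -> bool:
    """Apply the definition to the expression at sentences[sent_idx][pos]."""
    w = sentences[sent_idx][pos]
    # Candidate antecedents: the whole previous sentence plus everything
    # to the left of w in the same sentence.
    previous = sentences[sent_idx - 1] if sent_idx > 0 else []
    candidates = previous + sentences[sent_idx][:pos]
    return any(subsumes(w, other) for other in candidates)


ex196 = [
    ["Bach", "wrote", "many", "pieces", "for", "viola"],
    ["He", "must", "have", "loved", "string instruments"],
]
ex197 = [
    ["Bach", "wrote", "many", "pieces", "for", "string instruments"],
    ["He", "must", "have", "loved", "the", "viola"],
]
print(is_concept_given(ex196, 1, 4))  # anaphoric hypernym
print(is_concept_given(ex197, 1, 5))  # anaphoric hyponym
```

With this toy lexicon, the hypernym in (196) comes out as concept-Given and the hyponym in (197) does not, which reproduces the asymmetry in accentuation described above.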

4.2.3.2 Textual Test Material

All texts for the experiment were composed of up to four context sentences, a target sentence, and a concluding sentence. The context always included a referring expression that served as an antecedent for the target referent. However, the context varied according to the different lexical relations to be investigated. In the textually displaced condition, the same word occurred as antecedent and anaphor, separated by three clauses.

(198) Textually displaced:
Django ging an die Bar und bestellte einen Whisky. Er war bekannt dafür, dass er den Revolver schneller zog als sein Schatten. Man hatte Respekt vor ihm. Django trank den Whisky. Er brauchte nur einen Zug.
(Django went to the bar and ordered a whisky. He was known for drawing the gun faster than his shadow. People respected him. Django drank the whisky. He finished it in one draught.)

This separation serves the purpose of decreasing the antecedent’s degree of Givenness by having other concepts or referents occupy the listener’s immediate consciousness. However, such a decrease of a referent’s Givenness is only provided if the referent in question does not function as the topic of the intervening discourse. This is in line with a major claim of Centering Theory (e.g. Grosz et al. 1995), saying that center (or topic) continuation considerably decreases the inference load111 placed upon the listener. Thus, we made sure that the target word did not surface as (backward looking) center in the stretch of discourse. On the other hand, we arranged for the antecedent to surface in the same grammatical function (direct object) and in the same position (sentence final argument)112 as the anaphor.
This was done to guarantee a certain degree of availability of the target referent in a controlled way, since – according to Terken & Hirschberg – “the properties of grammatical function and surface position may be used by the listener as cues to access potential antecedents in the discourse model, by leading him or her to look for a candidate antecedent with the same properties” (1994: 143; see also sections 2.2.5.2, 4.1.4 and 4.1.6).

‘Inference load’ can be considered equivalent to Chafe’s ‘activation cost’, introduced in section 2.2.4. 112 The antecedent is sometimes followed by a verb. However, and most importantly, the noun phrase always receives the nuclear accent. 111


In the scenario condition, the antecedent occurs as the topic or theme (i.e. as subject in initial position) of the first sentence in order to immediately establish the ‘semantic frame’ for the target referent. The following three clauses elaborate on the scenario before the target referent is mentioned. (199) Scenario: Das Restaurant war vom Feinsten. Schon das Lesen der Karte war ein Genuß. Allerdings hätten wir uns nicht alles bestellen können, was wir gerne gegessen hätten. Unsere Tischnachbarn riefen den Kellner. Sie hatten schon zwei Flaschen Champagner getrunken. (The restaurant was excellent. It was already a pleasure to read the menu. Nonetheless, we couldn't have ordered everything we would have liked. The people at the next table called the waiter. They had already drunk two bottles of champagne.) Here, as in all other relations that require an inferential bridge between antecedent and anaphor, center (or topic) continuation was generally avoided, but not with the same strictness as in the textually displaced condition. This was based on the assumption that the cognitive effort which has to be invested to activate an inferentially available referent should be higher than is the case with a textually available referent. All other conditions displayed only one clause between the antecedent and the anaphor, so as not to further decrease the degree of accessibility.113 However, as in the textually displaced condition, persistence of grammatical function (direct object) and surface position (final/nuclear) between the two referring expressions was ensured, which again increases the degree of Givenness. For the symmetrical lexical relations (synonymy and converseness), only one context was constructed, assuming that the order of occurrence does not affect the prosodic realisation of the anaphor. (200) Synonymy: Sie hatte gestern für ihr Kind auf dem Markt eine Apfelsine gekauft. Die Vitamine würden ihm gut tun. Die junge Mutter schälte die Orange. 
Dann legte sie die Scheiben auf einen Teller. (Yesterday she had bought an orange for her child on the market. The vitamins would be good for him/her. The young mother peeled the orange [synonym] 114. After that, she put the slices on a plate.) (201) Converseness: Markus hatte in der fünften Klasse einen ganz besonderen Lehrer. Er hieß Müller und unterrichtete Deutsch und Geschichte. Herr Müller unterstützte seinen Schüler. Markus hat viel von ihm gelernt.

113 The two referring expressions were separated at all mainly in order to increase the fluency and naturalness of the texts.
114 There is no English equivalent to the German synonymy relation Apfelsine – Orange.


(Markus had a very special teacher in his fifth school year. His name was Müller and he taught German and History. Mr. Müller supported his pupil. Markus had learned a lot from him.) With the asymmetrical lexical relations (hyponym-hypernym, part-whole and vice versa), we tried to construct identical contexts for both directions, in which only the two referring expressions in question were exchanged (e.g. the hyponym Tennisspieler (‘tennis player’) and its hypernym Sportler (‘sportsman’); see (202) and (203)). This was intended to minimise unforeseen contextual influences. (202) Hyponym-Hypernym: Ole war ein begabter Tennisspieler. Er war in seiner Region praktisch unbezwingbar. Die Lokalpresse lobte den Sportler. Vor allem sein Aufschlag war berühmt. (Ole was a talented tennis player. He was virtually unbeatable in the region. The local press praised the sportsman. It was above all his service which was renowned.) (203) Hypernym-Hyponym: Ole war ein begabter Sportler. Er war in seiner Region sehr bekannt. Die Lokalpresse lobte den Tennisspieler. Vor allem sein Aufschlag war berühmt. (Ole was a talented sportsman. He was well-known in the region. The local press praised the tennis player. It was above all his service which was renowned.) (204) Part-Whole: Der kleine Martin studierte jede einzelne Seite. Er kannte schon fast alles auswendig. Der Junge liebte das Buch. Es handelte von Dinosauriern. (Little Martin studied every single page. He already knew almost everything by heart. The boy loved the book. It was about dinosaurs.) (205) Whole-Part: Martin war begeistert von seinem neuen Buch. Er wollte alles auf einmal wissen. Der Junge durchstöberte die Seiten. Das Buch handelte von Dinosauriern. (Martin was enthusiastic about his new book. He wanted to know everything at once. The boy flicked through the pages. The book was about dinosaurs.) 
The target sentences displayed the same morphosyntactic structure over all conditions: a definite full noun phrase as subject (sometimes modified by an


adjective), followed by a semantically unmarked verb as predicate115, and a definite full noun phrase functioning as direct object, e.g. (206) Der Junge betrachtete die Trompete. (The boy inspected the trumpet.) The target word always coincided with the direct object, i.e. its grammatical function as well as its surface position (final) were kept constant.
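The design of the textual material described in this section can be restated as a small data structure. The following is only a sketch: the class name, the field names and the labels in the `kind` field are hypothetical conveniences, not terminology from the study. The clause distances and the symmetrical/asymmetrical grouping are taken from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    name: str                 # type of Accessibility as named in the text
    kind: str                 # hypothetical label for the kind of relation
    intervening_clauses: int  # clauses between antecedent and anaphor


# Eight conditions; distances follow the description of the test material.
CONDITIONS = [
    Condition("textually displaced", "repetition", 3),
    Condition("scenario", "frame", 3),
    Condition("synonymy", "symmetrical", 1),
    Condition("converseness", "symmetrical", 1),
    Condition("hyponym-hypernym", "asymmetrical", 1),
    Condition("hypernym-hyponym", "asymmetrical", 1),
    Condition("part-whole", "asymmetrical", 1),
    Condition("whole-part", "asymmetrical", 1),
]


def group_by_distance(conditions):
    """Map each antecedent-anaphor distance to the conditions that use it."""
    groups = defaultdict(list)
    for condition in conditions:
        groups[condition.intervening_clauses].append(condition.name)
    return dict(groups)


if __name__ == "__main__":
    for distance, names in sorted(group_by_distance(CONDITIONS).items()):
        print(distance, names)
```

Grouping by distance makes it easy to see that only the textually displaced and scenario conditions use the longer, three-clause separation.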

4.2.3.3 Auditory Test Material

In terms of prosodic structure, we created three different versions of each target sentence. In order to do so, the sentences – naturally spoken by a 30-year-old female speaker of Standard German and digitally recorded in a noise-reduced environment – were resynthesised by using the PSOLA (Pitch-Synchronous Overlap and Add; see e.g. Charpentier & Stella 1986, Moulines & Charpentier 1990) manipulation technique in the speech analysis tool Praat (see Boersma & Weenink 1996 and http://www.fon.hum.uva.nl/praat/). PSOLA is a modulation operation which changes the fundamental frequency (and duration) of a speech signal without changing the voice quality.116 We resynthesised not only the nuclear contour of each sentence but the whole utterance. Although this method slightly reduced the sound quality of the utterances, it ensured comparable quality across all target sentences. We ensured that there were two pitch accents in the sentence, i.e. on the subject and object nouns. If there is only one accentable element in a phrase, its information load or newsworthiness increases. This effect is purely structural (or rhythmical) in origin, but it nevertheless influences the form of the accent: a single newsworthy element in an utterance is more likely to receive a peak accent (H*) than any other type of accent. We avoided such a structural bias by providing another accented item (here: the subject noun) in prenuclear position. The first part of the target sentences was not changed. The subject noun always received a high prenuclear accent (H*). The target referent, on the other hand, either

115 In an unmarked predicate-argument structure in German, only the argument receives an accent (if it is not pronominalised) (see Uhmann 1991: 215). Some verbs, however, always demand an accent, especially if semantically ‘heavy’, as e.g. the verb beneiden (‘envy’) in the following example (Uhmann 1991: 225): (i) Warum ist Maria sauer? (‘Why is Maria angry?’) Weil sie ihren FREUND beNEIdet. (‘Because she ENvies her BOYfriend’) Verbs like these were excluded from the experiment, due to expected accent clashes or other influences on the target referent’s prosodic realisation. Such verbs could, e.g., increase the acceptability of deaccenting the target referent.
116 The basic algorithm for the PSOLA technique consists of three steps. First, the speech signal is divided into separate but overlapping smaller signals. This is accomplished by windowing segments around each ‘pitch mark’ in the original signal. Second, the smaller signals are modified by either repeating or leaving out speech segments, depending on whether the pitch of the modulation signal is higher or lower than the pitch of the carrier signal. Last, the remaining smaller segments are recombined through overlapping and adding. The result is a signal with the same spectrum as the original but with a different fundamental frequency (see Upperman 2004).


carried a nuclear H* or H+L* pitch accent, or was deaccented (with a nuclear H* pitch accent assigned to the preceding verb). In order to guarantee equivalent pitch accent types, we adjusted the Hertz values of the pitch accents produced by the speaker, taking her normal values as a point of reference. All nuclear H* pitch accents (on the object noun or verb, respectively) had a peak of 240 Hz in the middle of the accented syllable, preceded by a syllable of 200 Hz and followed by a syllable of 155 Hz. The H+L* pitch accent was characterised by a peak of 240 Hz immediately preceding the accented syllable, followed by a fall to 170 Hz in the middle of the nuclear syllable. The boundary tone of each of the three contours had a value of 150 Hz. Three different prosodic realisations of the target sentence (here: the scenario example mentioned in (199)) are given in (207). The intonation contours are schematised by using a line notation, and the pitch accents are annotated according to GToBI.117 (207) Unsere TISCHnachbarn riefen den KELLner. H* H*

Unsere TISCHnachbarn riefen den KELLner. H* H+L*

Unsere TISCHnachbarn RIEfen den Kellner. H* H* Ø (‘The people at the next table called the waiter’) Without the last sentence the target sentence would have occurred in paragraph final position. In German, however, paragraph finality is often marked by an H+L* L-% nuclear contour, which coincides with one of the intonation patterns tested in the experiment. Furthermore, as we have seen in the corpus analysis in chapter 3, this high falling nuclear contour is a very common feature of read (German) speech in general. Thus, a concluding sentence had to be added in order to avoid an indistinguishable cooccurrence of two different meanings (the degree of Givenness

117 As in the schematic contours in section 4.1.3.3, capital letters indicate accented syllables, bold face letters indicate syllables bearing nuclear accents. The symbol Ø, which is not part of the GToBI annotation scheme, indicates lack of accent. The notation of boundary tones is left out.


of the target referent and the strength of a discourse unit boundary) resulting in a single intonational form (H+L* L-%) in the target sentence.

We constructed five texts for each relation, and three versions of each target sentence. With eight relations, this resulted in 120 different stretches of discourse (8 relations × 5 texts × 3 versions). We also created five practice texts of the same structure. In order to provide some variation, we additionally constructed ten filler texts which included e.g. pronouns and adverbials (which were missing in the actual target sentences).

4.2.4 Experimental Setup

As in the previous experiment (discussed in section 4.1), we asked 30 native speakers of German to take part as subjects. This time, they were exclusively undergraduate students, mostly residents of Cologne and raised in the West of Germany. Nearly all subjects were in their first or second semester, and generally naive with respect to the task. The short texts were visually presented on a computer screen (using MS PowerPoint slides), with the target sentence marked in red. Subjects listened to the texts over headphones by clicking on a loudspeaker symbol. Their task was to judge the contextual appropriateness of the target sentence’s intonation patterns on a seven-point scale, to be marked on a test sheet. After training in five practice trials, each subject was presented with one of six different, pseudo-randomised blocks, consisting of 40 test texts (five per relation) and ten fillers. Each subject was played only one of the three versions of each target sentence. The task was self-paced, and subjects were allowed to listen to the texts more than once. They were advised to use the full range of the scale.

4.2.5 Results

The appropriateness judgements were z-transformed so that each subject had a mean score of 0 and a standard deviation of 1 (see section 4.1.5). As a general result of an ANOVA, we found a highly significant interaction between accent type and type of Accessibility (F(14) = 19.067; p < 0.001). The table in (208) shows a summary of the post-hoc tests (Scheffé) that were conducted. The types of Accessibility are ordered according to the absolute preference values (mean z-scores) for deaccentuation of the respective target referents. The symbol ‘≫’ indicates ‘highly significant preference’ (p < 0.005), the symbol ‘>’ indicates ‘significant preference’ (p < 0.05), and the symbol ‘=’ indicates ‘no significant difference’.
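The per-subject z-transformation and the mean z-scores per cell can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function names and the toy ratings are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, pstdev


def z_transform_by_subject(ratings):
    """Standardise ratings within each subject (mean 0, SD 1).

    ratings: list of (subject, relation, accent, score) tuples.
    """
    scores = defaultdict(list)
    for subject, _relation, _accent, score in ratings:
        scores[subject].append(score)

    z_ratings = []
    for subject, relation, accent, score in ratings:
        m = mean(scores[subject])
        sd = pstdev(scores[subject])  # assumption: population SD
        if sd == 0:
            raise ValueError(f"subject {subject} used a single scale value")
        z_ratings.append((subject, relation, accent, (score - m) / sd))
    return z_ratings


def cell_means(z_ratings):
    """Mean z-score for every (relation, accent) combination."""
    cells = defaultdict(list)
    for _subject, relation, accent, z in z_ratings:
        cells[(relation, accent)].append(z)
    return {cell: mean(values) for cell, values in cells.items()}


# Hypothetical ratings on the seven-point scale, for illustration only.
RATINGS = [
    ("s1", "synonymy", "no accent", 7), ("s1", "synonymy", "H*", 2),
    ("s1", "scenario", "H+L*", 6), ("s1", "scenario", "no accent", 3),
    ("s2", "synonymy", "no accent", 5), ("s2", "synonymy", "H*", 1),
    ("s2", "scenario", "H+L*", 4), ("s2", "scenario", "no accent", 2),
]

Z_RATINGS = z_transform_by_subject(RATINGS)
MEANS = cell_means(Z_RATINGS)
```

Standardising within subjects removes individual differences in how the scale is used, so that the cell means reflect relative preferences rather than a subject's general tendency to rate high or low.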

(208) Summary of the results

Type of Accessibility    Accent Type Preferences     Preference Value for Deaccentuation of Target Referent
converseness             no accent ≫ H+L* > H*       -1.18  (higher preference)
part-whole               no accent ≫ H+L* ≫ H*       -0.84
synonymy                 no accent ≫ H+L* > H*       -0.68
hyponym-hypernym         no accent ≫ H+L* ≫ H*       -0.67
hypernym-hyponym         no accent ≫ H+L* > H*       -0.55
textually displaced      H+L* = no accent ≫ H*       -0.18
whole-part               H+L* ≫ H* = no accent        0.01
scenario                 H+L* > H* = no accent        0.09  (lower preference)

4.2.6 Discussion

The results clearly confirm the basic hypothesis that the factors ‘type of Accessibility’ and ‘type of pitch accent’ are highly correlated. However, the order of accent type preferences varies across different semantic relations. The findings indicate that Accessible information cannot be treated as a uniform category – at least not in terms of a consistent prosodic marker – which is in line with claims e.g. by Lambrecht (1994), who argues that there is no direct phonological correlate of Accessible information (see section 2.2.5.2). However, this should not be interpreted as tantamount to saying that the intonational marking of an Accessible referring expression is arbitrary. The choice of pitch accent type (including deaccentuation) rather depends on the relation between the antecedent and the anaphor, and – in the case of (some) asymmetrical lexical relations – on the order of occurrence. Furthermore, there are gradient differences as to the acceptability of (de)accentuation for the different types of Accessibility (expressed by the z-score values in (208)), although the (categorical) ranking of the three accent types is the same across some types of Accessibility. This is the case for five different semantic relations all of which are preferably deaccented (converseness, part-whole, synonymy, hypernymy and hyponymy) but to different degrees. Thus, e.g., deaccentuation is more acceptable with an anaphor expressing a converseness relation to an antecedent than with an anaphoric hypernym. Let us now have a closer look at each lexical relation. Not surprisingly, the preferred marking of anaphoric synonyms is deaccentuation (example (200) repeated as (209)): (209) Sie hatte gestern für ihr Kind auf dem Markt eine Apfelsine gekauft. [...] Die junge Mutter SCHÄLte die Orange. (Ø) (Yesterday she had bought an orange for her child on the market. [...] The young mother peeled the orange [synonym].)


The most readily available interpretation is that Apfelsine (‘orange’) and Orange are coreferential, i.e. they stand in an identity relation to each other (see van Deemter 1992). Thus, the anaphor represents almost fully active information and consequently does not require an accent. Nevertheless, H+L* is significantly preferred over H*, indicating that, as a second choice, H+L* is more appropriate than H* when marking activated information. The other symmetrical lexical relation, converseness, shows the same preference pattern as synonymy: deaccentuation is preferred over H+L*, which is in turn preferred over H* (example (201) repeated as (210)). (210) Markus hatte in der fünften Klasse einen ganz besonderen Lehrer. [...] Herr Müller unterSTÜTZte seinen Schüler. (Ø) (Markus had a very special teacher in his fifth school year. [...] Mr. Müller supported his pupil.) With the help of the bridging antecedent Lehrer (‘teacher’), the expression seinen Schüler (‘his pupil’) can be unambiguously identified with the referent Markus. Again, if an anaphor is interpreted as coreferential with an antecedent, it can be marked as Given information by lack of accent. The same accent preference distribution (no accent > H+L* > H*) is found in the asymmetrical relations part-whole and hyponym-hypernym. This can be explained by van Deemter’s ‘concept-Givenness’ (see section 4.2.3.1), saying that a superordinate expression following a subordinate one can be deaccented, since the subordinate expression has already established the superordinate concept. (211) shows a part-whole example (Seite-Buch, ‘page-book’), with the nuclear accent shifted to the verb. (211) Der kleine Martin studierte jede einzelne Seite. [...] Der Junge LIEBte das Buch. (Ø) (Little Martin studied every single page. [...] The boy loved the book.) 
The whole-part relation showed a highly significant preference of the early peak accent H+L* over both H* and deaccentuation, as in (212) (example (205) repeated): (212) Martin war begeistert von seinem neuen Buch. [...] Der Junge durchstöberte die SEIten. (H+L*) (Martin was enthusiastic about his new book. [...] The boy looked through the pages.) That is, the early peak accent is most appropriate for marking this type of Accessibility, and the H* accent and deaccentuation are equally unacceptable.


Example (202) (repeated as (213)) shows a hyponym-hypernym (Tennisspieler – Sportler, ‘tennis player’-‘sportsman’) relation. (213) Ole war ein begabter Tennisspieler. [...] Die Lokalpresse LOBte den Sportler. (Ø) (Ole was a talented tennis player. [...] The local press praised the sportsman.) For hyponym-hypernym, the preference order was “no accent > H+L* > H*”, as for part-whole. Unlike part-whole, however, reversing the order of the two expressions did not reverse the accent type preferences: the hypernym-hyponym relation produced the same distribution as the hyponym-hypernym relation (compare (213) with (214)). (214) Ole war ein begabter Sportler. [...] Die Lokalpresse LOBte den Tennisspieler. (Ø) (Ole was a talented sportsman. [...] The local press praised the tennis player.) How can this result be explained? Let us take another look at the hypernym-hyponym relations mentioned above, e.g. in (197) (repeated as (215)): (215) Bach wrote many pieces for string instruments. He must have loved the viOla. Another example that requires an accent on the anaphor would be: (216) As long as she could remember she was in touch with pets. Her parents owned a DOG. Here, the anaphor denotes either a generic (the viola in (215)) or an indefinite term (a dog in (216)). Neither of them is a uniquely identifiable individual referent, i.e. the listener cannot identify the speaker’s intended referent (see Gundel et al. 1993). The anaphors used in the experiment, however, were definite descriptions denoting unique individual referents, which presumably led to an interpretation of coreference with the antecedent. Thus, Tennisspieler (‘tennis player’) in (214) is understood as a synonym of Sportler (‘sportsman’) rather than as a ‘proper’ hyponym, resulting in deaccentuation of the anaphor.118

118 Such an interpretation was not intended. It can be explained by our desire to keep the texts of the two relations (hyponym-hypernym and vice versa) as similar as possible in order to minimise unpredictable influences of diverging contexts. Since the structure of all five example texts of the hypernym-hyponym relation was the same – including the same possibility of a coreference reading –, almost all of them (four out of five) showed a significant preference for deaccenting the anaphor. Note that an identity-anaphoric reading would be impossible in a whole-part relation, due to the intrinsic semantic difference of the two elements – hence the preference of H+L* for marking the anaphor.


In order to test the plausibility of this explanation, we can replace the generic anaphor in example (215) with a definite, non-generic one (and slightly modify the content of the sentence): (217) Bach owned only one string instrument. He must have LOVED the viola. In (217), deaccentuation of viola seems to be much more appropriate than in (215), since an identity-anaphoric interpretation becomes most plausible.119 What remains remarkable about sequences like (214) is the fact that the anaphor Tennisspieler (‘tennis player’) is presented as already Given, although it introduces a New (or at least Accessible) piece of information to the discourse (see the citation by Allerton in 4.2.3.1). This can be regarded as an economical stylistic device used by the speaker as an instruction to the listener to link the novel information to the most suitable referent in the context. Such ‘accommodation’ is achieved not only by deaccentuation but also by definiteness and (syntactic) topicalisation. It is commonly used in (spoken and written) news texts, as in the following excerpt from a newspaper article reporting on a trial (Umbach 2001: 265): (218) [...] This morning the court heard the defendant. The 34-year-old father of two teenage daughters claimed to be innocent. There is another important aspect which concerns the asymmetry of hyponymy relations. Provided that an identity-anaphoric reading is blocked, as in (219) Bach wrote many pieces for viola. He must have LOVED string instruments. and (220) Bach wrote many pieces for string instruments. He must have loved the viOla. the deaccentuation of string instruments in (219) and the accentuation of viola in (220) have to be attributed to different levels. The deaccentuation of string instruments is simply due to cognitive Givenness or activation, since the concept string instruments is established by the subsumed antecedent viola. The accent on viola in (220), however, is an expression of focus rather than Newness.
Whenever a hypernym is mentioned, a set of alternative hyponyms is established, even if their degree of activation is lower

119 The ‘identifiability-test’ also works in the other direction. Consider two generic versions of the tennis player-sportsman relation: (a) Er war begeistert von Tennisspielern. Vielleicht, weil er SELber Sportler war. (He was enthusiastic about tennis players. Maybe because he was a sportsman himSELF.) vs. (b) Er war begeistert von Sportlern. Vielleicht, weil er selber TENnisspieler war. (He was enthusiastic about sportsmen. Maybe because he (himself) was a TENnis player.) An identity-anaphoric reading in (b) is blocked due to the generic use of tennis player. Thus, tennis player receives an accent.


than in the reverse case. Mentioning a hyponym is like zooming in (or focussing in a literal sense) on one element of the alternative set. This process is quasi-contrastive (along the lines of Jacobs (1988) and Rooth (1992)) and requires an accent on the hyponym-anaphor. The fact that the anaphor is cognitively semi-activated influences the form of the accent, i.e. H+L* should be the accent’s favoured surface realisation. If the anaphoric hyponym is coreferential with its hypernym, however, as in our sportsman/tennis player example, the hyponym is likely to occur as part of the background – not as part of the focus –, which decreases the probability of its being accented.120 Thus, the level of the binary pragmatic partitioning of an utterance into a more informative part (focus) and a less informative part (background) blurs the influence of the lexical relation on the target referent’s prosodic marking. The interpretation of the target referent as background material becomes possible because the same referent recurs (as ‘topic’) in the intervening sentence between context and target sentence. The example text discussed above is repeated as (221), with the assumed focus-background structure of the target sentence indicated. All expressions that refer to the same referent are underlined: (221) Ole war ein begabter Sportler. Er war in seiner Region sehr bekannt. [ Die Lokalpresse LOBte ]Focus [ den Tennisspieler. ]Background (Ole was a talented sportsman. He was well-known in the region. [ The local press praised ]Focus [ the tennis player. ]Background) Such ‘topic continuity’ (in the sense of Fretheim, e.g. 1994, 1996) or ‘center retention’ (following Centering Theory; see Grosz et al. 1995: 210) from the intervening second sentence (er (‘he’)) to the target sentence (Tennisspieler (‘tennis player’)) was generally avoided in the other relations.121 However, if coreference is avoided, there is no background material in the target sentence at all.
Compare the two hypernym-hyponym relations below: the target referent and hyponym Hund (‘dog’) in (222) is coreferential with the two previous instantiations of the discourse topic (ein Haustier (‘a pet’) and es (‘it’)), making Hund (‘dog’) part of the (deaccented) background, but the target referent in (223) does not denote the same referent as the (generic) hypernym Haustiere (‘pets’) and the following pronoun sie (‘they’). The hyponym Hund (‘dog’) is thus part of the focus and more likely to receive an accent.

120 However, background elements can receive an accent, as shown in example (153) in section 2.3: A: What about your sister? B: [ My stupid SISter ]Background [ got MARried last week. ]Focus Still, background material is more likely to be deaccented, especially in sentence-final, i.e. postnuclear, position (however, see examples (128) and (129) by Halliday and the discussion in section 2.2.5.2).
121 Nevertheless, there are some instances of ‘topic continuity’ in the synonymy and converseness relations, which might be a factor supporting the strong preference for deaccenting the target referents in these contexts.


(222) Unsere Nachbarn hatten ein Haustier. Es machte viel Freude, aber auch eine Menge Ärger. [ Die Kinder verSORGten ]Focus [ den Hund. ]Background (Our neighbours had a pet. It was great fun but also caused a lot of trouble. [ The children took care of ]Focus [ the dog. ]Background ) (223) Unsere Nachbarn hatten Haustiere. Sie machten viel Freude, aber auch eine Menge Ärger. [ Die Kinder versorgten den HUND. ]Focus (Our neighbours had pets. They were great fun but also made a lot of trouble. [ The children took care of the dog. ]Focus) The scenario condition showed a significant preference for H+L* over both other types of contour, as in (224): (224) Das Restaurant war vom Feinsten. Schon das Lesen der Karte war ein Genuß. Allerdings hätten wir uns nicht alles bestellen können, was wir gerne gegessen hätten. Unsere Tischnachbarn riefen den KELLner. (H+L*) (The restaurant was excellent. It was already a pleasure to read the menu. Nonetheless, we couldn’t have ordered everything we would have liked. The people at the next table called the waiter.) This result strongly suggests that this particular accent type can serve, under certain circumstances, as an ‘Accessibility accent’, since it most convincingly encodes the semi-active cognitive state of the referent. For the scenario condition, we can at least claim that relatively prototypical co-established concepts, such as the waiter in a restaurant scenario or the judge in a courtroom scenario, have the appropriate degree of Givenness for being marked by this type of accent. Judgements on the prosodic marking of textually displaced items revealed that H+L* and deaccentuation are equally preferred over H*. The fact that H+L* was not dispreferred suggests that a textually given item recurring after three clauses has a slightly lower degree of Givenness than an antecedent’s synonym or hypernym mentioned in the (almost) immediate context. 
This is possibly because displaced items require a search in working memory, which incurs a slightly higher activation cost. Interestingly, the distance of three clauses between antecedent and anaphor may cause a referent to be just on the border between Accessible and Given information, which again suggests a continuum of activation degrees.

4.2.7 Summary and Conclusion

We have shown for the purposes of prosodic marking that Accessible information cannot be treated as a uniform category. We have further shown that one particular


type of pitch accent, H+L*, is appropriate and significantly preferred over another accent type (H*) and over deaccentuation in a number of cases where the referent in question denotes Accessible information. These cases comprised whole-part relations where the referent constituted a part of an already mentioned whole, and the scenario condition where the referent was predictable from the contextually given schema or frame.

We have also shown that certain types of Accessible information are preferentially deaccented. These cases comprised the following relations: converseness, part-whole (in that order only), synonymy, and the relation between hypernym and hyponym in either order. The surprising preference for deaccentuation of the anaphor in the hypernym-hyponym relation might be explained by unintended coreference interpretations which also had an impact on the focus-background structure of the target sentences. In all cases where deaccentuation was preferred, the second choice was H+L* which in turn was significantly preferred over the other pitch accent, H*. This provides indirect evidence for the intermediate status of H+L*.

Baumann & Hadelich (2003) already found that H* is appropriate for signalling New (inactive) information, and deaccentuation for truly Given (active) information. Since the experiment reported here has shown that H+L* is appropriate for certain cases of semi-active information, we can place this accent on a scale of intonational marking, along which differing degrees of activation are expressed:

(225)  active                          inactive
       no accent        H+L*           H*
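The summary above and the scale in (225) can be restated as a small data structure and checked against table (208). This is only a sketch: the variable and function names are hypothetical, and the rankings are transcribed from (208) with tied accent types (marked ‘=’ in the table) placed in the same tier.

```python
# Accent-type rankings per type of Accessibility, transcribed from (208).
# Each ranking is a list of tiers, most preferred first; accent types that
# did not differ significantly share a tier.
RANKINGS = {
    "converseness": [["no accent"], ["H+L*"], ["H*"]],
    "part-whole": [["no accent"], ["H+L*"], ["H*"]],
    "synonymy": [["no accent"], ["H+L*"], ["H*"]],
    "hyponym-hypernym": [["no accent"], ["H+L*"], ["H*"]],
    "hypernym-hyponym": [["no accent"], ["H+L*"], ["H*"]],
    "textually displaced": [["H+L*", "no accent"], ["H*"]],
    "whole-part": [["H+L*"], ["H*", "no accent"]],
    "scenario": [["H+L*"], ["H*", "no accent"]],
}

# Scale (225): degree of activation paired with its intonational marking.
SCALE = [("active", "no accent"), ("semi-active", "H+L*"), ("inactive", "H*")]


def solely_preferred(accent):
    """Relations in which `accent` is the only member of the top tier."""
    return sorted(r for r, tiers in RANKINGS.items() if tiers[0] == [accent])


DEACCENTED = solely_preferred("no accent")
EARLY_PEAK = solely_preferred("H+L*")

# Wherever deaccentuation wins, H+L* is the second choice ...
SECOND_CHOICE_IS_EARLY_PEAK = all(
    RANKINGS[r][1] == ["H+L*"] for r in DEACCENTED
)
# ... and H* is never in the most preferred tier.
H_STAR_NEVER_PREFERRED = all("H*" not in tiers[0] for tiers in RANKINGS.values())

if __name__ == "__main__":
    print("deaccentuation preferred:", DEACCENTED)
    print("H+L* preferred:", EARLY_PEAK)
```

Representing ties as shared tiers keeps the textually displaced condition, where H+L* and deaccentuation did not differ, out of both lists.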

The scale in (225) not only attributes specific types of pitch accent to a continuum of activation degrees, it also suggests a somewhat iconic use of pitch height in the intonational marking of a referent’s information status. This observation is compatible with Gussenhoven’s (2002, 2004) Effort Code: the higher the pitch on a lexically stressed syllable, the Newer (or more newsworthy, in the case of contrastive but active items) the discourse referent. Gussenhoven also claims that the same effect is produced by alignment differences that substitute for differences in pitch height: later peaks are perceived as higher and more prominent than earlier ones. This use of substitute variables can be seen to receive support from the results reported on here. Since the peak in H* is later than that in H+L*, it is interpreted as higher, and therefore taken to mark newer information, whereas H+L* has a very early peak, contributing to the impression of increased Givenness. The role of intonation, including the marking of Givenness, as determined by three biological


codes and their linguistic interpretations will be dealt with in the next subchapter (4.2.8), before we present a comprehensive model of discourse referents’ activation degrees and their intonational marking in chapter 5.

Although we take the intonational marking of a discourse referent as the main cue for its degree of activation, we have to be aware that depending solely on intonation in the determination of different activation degrees leads to circular argumentation. It would thus be desirable to have a measure for degrees of Givenness independent of the referring expression’s prosodic marking. Psycholinguistic experiments should be conducted along the lines of Haviland & Clark (1974) and Clark & Haviland (1977), who measured the time it took subjects to comprehend sentences in a reading task with pairs of antecedents and anaphors in different contexts. They showed that Accessible referents, i.e. those referents whose comprehension requires an inferential bridge, take longer to process than Given ones. This is in line with Chafe’s model of activation cost, which he claims to be higher the less accessible the anaphor is (see section 2.2.4 and Chafe 1994: 172).

4.2.8 Digression: Intonation, Biological Codes and their Linguistic Manifestations

Intonation is often said to serve primarily an emotive function, implying an inherently iconic usage of pitch variations. Such fundamental iconicity further implies the claim of universal validity of (paralinguistic) meaning differences in spoken language brought about by changes in pitch height. This is, in principle, Bolinger’s view when he claims that intonation is part of a gestural complex, a relatively autonomous system with attitudinal effects that depend on the metaphorical associations of up and down – an elaborate scheme of iconism. It assists grammar – in some instances may be indispensable to it – but is not ultimately grammatical. (1985: 106)

However, following Bolinger (1985: 97f.) further, the iconicity of intonation is only ‘symptomatic’ in nature, since pitch variations do not directly mirror the meaning they help to convey, as is the case – at least to a larger extent – with onomatopoetic expressions. Nevertheless, intonation is generally less arbitrary than other aspects of language (see Liberman 1979: 138), since it is more immediately determined by physiological or biological properties and conditions. Carlos Gussenhoven (2002, 2004) claims that the (universal) form-function relations with respect to pitch are based on three biological codes: the Frequency Code, the Production Code and the Effort Code. Each code has affective and/or informational interpretations and may have different linguistic manifestations in different languages. It is these grammaticalisations of intonational meanings in German and English we are particularly concerned with here.

The Frequency Code, which was introduced by Ohala (1983, 1984), suggests size differences by differences in pitch height: since a bigger larynx (including longer vocal folds) and a longer vocal tract produce lower frequencies, low pitch is associated with larger creatures and high pitch with smaller ones. This universal principle may have affective interpretations along dimensions like dominant~submissive or impolite~polite and more informational interpretations along dimensions such as certain~uncertain or – closely related – assertive~questioning, with low pitch attributed to the first and high pitch to the second pole of these dimensions (see Gussenhoven 2004: 80ff.). The most obvious linguistic manifestation of the Frequency Code is the distinction between statements and questions, which is marked in a great number of languages (including, to some degree, German and English) by falling or low versus rising or high pitch. For many interpretations of the Frequency Code as well as for its grammaticalisation in the form of statements versus questions, it is the contour endings which are particularly important (see Ohala 1983, 1984, Gussenhoven 2004: 82). This is also true for the Production Code, which derives its interpretations from a gradual decrease in subglottal air pressure in the course of a breath group (see Gussenhoven 2004: 89f.). A consequence of the drop in energy is a gradual lowering of pitch (along with intensity), so-called ‘declination’ (Cohen & ’t Hart 1967). The central linguistic interpretation of this code is, then, finality versus continuation, marked by low versus high endings. Translated into autosegmental-metrical categories (here: GToBI categories), we can attribute different degrees of finality to different types of boundary tone:

(226)
H-^H%    non-finality
H-%
L-%      finality

At the beginning of a phrase, the relation is reversed: an initial H tone often signals a new topic, whereas a low beginning marks topic continuation (in German and English; see Wichmann et al. 2000). The third biologically determined code is the Effort Code, which is based on the physiological phenomenon that an increased effort in producing speech leads to greater articulatory precision. This is reflected by more pronounced and wider pitch movements (see Gussenhoven 2004: 85f.). The primary informational application of this code in many languages is to express increased emphasis or importance. Its most common linguistic manifestation is the marking of focus. However, the iconic nature of the Effort Code is also revealed at the level of Givenness,

in the sense that there exists a direct correlation between different mental states and differences in phonetic intensity or word length (pronouns tend to be shorter than lexical NPs). Creating and interpreting a new discourse representation of a referent requires a greater mental effort on the part of the speaker and the hearer than keeping an already established referent in a state of activeness. As a result, it involves higher acoustic intensity and typically more phonological material (Lambrecht 1994: 96f).

Lambrecht’s citation shows that the prerequisite for an increased articulatory effort is an increase in mental effort, underlining the intentional usage of the Effort Code in marking varying degrees of activation. Already in 1985, Dwight Bolinger recognised reflections of the three biological codes when mentioning the following examples within a single paragraph:

In the course of an action we are up and moving; at the end, we sit or lie down to rest. In a discourse this translates to higher pitches while utterance is in progress and a fall at the end. In this we have the almost universal downdrift observed as an utterance draws to its close, as well as the opposite tendency when something further is expected, such as a continuation clause or the answer to a question [= Production Code; SB]. The same polar opposition gives us our accentual contrasts. When we come to elements in an utterance that interest and excite us, we mark the spot with a rise in pitch – the more interesting and exciting they are, the greater the rise [= Effort Code; SB]. And if it is objected that this is not really true because accents are often marked by downward obtrusions of pitch rather than upward obtrusions, we must ask what is the nature of the exception; and we discover that the exception is explained on the same basis as the rule, because accents that are obtruded downward are meant to downplay the accented item at the same time that they recognize its importance. It is more courteous to say the first of the following than the second [= Frequency Code; SB]:

[Two pitch-layout renderings of ‘Give me a push.’; the typographic arrangement of the syllables is not recoverable here.] (Bolinger 1985: 99f.)

An important observation Bolinger makes is that the form or type of accent determines its meaning. This aspect particularly applies to the linguistic interpretations of the Effort Code. Gussenhoven (2004: 86) gives an example of two different kinds of focus in European Portuguese, which are marked by different types of pitch accent (adopted from Frota 1998; see also Ladd 1996: 127):

(227) A: E o Roberto e a Maria? (‘What about Roberto and Maria?’)
      B: CaSAram. (‘They got married’)
         H+L* L%

(228) A: Eles separaram-se? (‘Have they split up?’)
      B: CaSAram. (‘(No) they got married’)
         H*+L L%

Example (227) shows a question-answer pair with neutral narrow focus in the answer, while the answer in (228) represents a correction, i.e. a special type of contrast. This difference in focus type is mirrored by pitch accents differing in peak timing and, as a consequence, in pitch excursion: the early peak accent in (227) displays a smaller fall on the accented syllable than the medial peak accent in (228). The greater fall on the accented syllable in (228) leads to the impression of greater markedness or emphasis of the contrastive focus. We have seen in the previous chapters (and particularly in the perception experiment in the present chapter) that different types of pitch accent are not only used for marking different focus structures but also different degrees of Givenness. In particular, New information is often marked by H* pitch accents, while somewhat more active information is characterised by lower pitch on the accented syllable (translated in autosegmental-metrical terms as a downstepped or low starred tone, i.e. !H* or L*), and fully active information is deaccented. This relation between pitch height and degree of activation has repeatedly been pointed out (see section 2.2.5.2 for a detailed discussion), e.g. by Bolinger (1985: 105) when he claims that predictable items are marked by a fall within the accented syllable, or by Pierrehumbert & Hirschberg (1990), who broadly assign high accents to New, downstepped accents to Accessible and low accents to Given discourse referents. We repeat the scale presented in (122) in a reduced form:

(229) Pierrehumbert & Hirschberg’s (1990) assignment of accent types to Givenness degrees
H*           New
!H*
L*
no accent    Given

Thus, the most obvious conclusion would be: the higher the pitch on an accented syllable, the more prominent it is, with prominence being interpreted either as Newness or focus, both as a function of the Effort Code. Perception experiments (Rietveld & Gussenhoven 1985, Ladd & Morton 1997) have indeed shown that higher pitch peaks are perceived as more prominent (in identical contexts). However, as Gussenhoven (2004: 85) points out, perceived prominence is not a correlate of pitch height but of relative pitch excursion. Furthermore, the pitch does not necessarily

have to be high in order to be perceived as high – the same effect may be created by peak delay. The phenomenon that late peaks signal higher prominence than early peaks is not only attested in perception studies on English (see Ladd & Morton 1997) but also on German (Kohler 1991a). Moreover, Peters (2002) found evidence for this difference in Hamburg German production data, in which narrow focus expressions are marked by late peaks in contrast to broad focus expressions. The peak delay enhances the slope of the nuclear fall (in low-ending utterances), which leads to an increase in perceived prominence. The alleged primacy of pitch excursion as a marker of prominence (and Newness) is compatible with the scale in (229) – under the assumption that the accents represent nuclear accents in declarative sentences with a low boundary tone – since a fall from H* is steeper than from !H* and hardly existent from L*. It is only partly in line, however, with the results of perception experiment II, presented in (225) and rearranged in (230):

(230) Results of perception experiment II
H*           New
H+L*
no accent    Given

Both pitch accent types tested are characterised by a fall whose excursion is almost the same: the peak in H* is 240 Hz and is followed by an L-% boundary tone of 155 Hz, while the pitch in H+L* falls from 240 Hz to 170 Hz in the middle of the accented vowel, and further declines to 155 Hz at the end of the phrase. Despite the similar slope of the fall, pitch accent type H* was clearly judged to be more appropriate as a marker of Newness than H+L*, which – we assume – implies that H* is also perceived as more prominent. A possible explanation lies in the alignment of the pitch peak, somewhat extending Gussenhoven’s claims about peak delay to differences in accent type: in H*, the peak occurs later than in H+L*, which makes accent type H* sound more prominent. The status of !H* pitch accents remains unclear, since it has not been tested under controlled conditions (for German). Informal observations suggest, however, that !H* is perceived as more prominent (i.e. marking less active referring expressions) than H+L*, although the nuclear fall is more gradual. Thus, pitch height on the accented syllable seems to be a more relevant cue than pitch excursion here – inconsistent with Gussenhoven’s claim that relative pitch excursion is the primary cue for perceived prominence. Support for our view comes from Kohler (2004), who argues that the contrast between low and high pitch in the accented vowel is essential for the perception of early versus medial peak accents, encoding a change from Givenness/Accessibility to

Newness (and from ‘finality’ to ‘openness’). However, Kohler does not regard ‘high’ and ‘low’ pitch points or levels by themselves as the relevant perceptual categories but the high-low (early peak) or low-high (medial peak) F0 trajectories into the accented vowel (following Diehl’s (1991) ‘auditory enhancement hypothesis’). Already in 1991, Kohler investigated the question of peak alignment differences in German single-accent sentences and the influence of these differences on the sentences’ linguistic and paralinguistic meanings. As just mentioned and discussed in some detail in section 2.2.5.2, the change from an early to a medial peak accent causes a perceptual change from Givenness/Accessibility to Newness, i.e. a linguistically relevant change, while the change from a medial to a late peak ‘only’ adds greater involvement or surprise, a basically paralinguistic value. Comparable to the function of late-peak pitch accents is the function of Kohler’s (2005) force accents (see section 2.1.1). They represent prominences which lack pitch movement but are produced with increased articulatory effort, thus being another reflection of the Effort Code. Kohler assigns only paralinguistic functions to this kind of accent, which “adds an expressive component of disapproval” (2005: 101) to the word concerned. In keeping with this restriction to paralinguistic meanings, Kohler claims force accents to be a universal feature of language, while the use of pitch accents is language-specific. Alignment differences in the nuclear movement do not only affect instantiations of the Effort Code, mainly expressed by accentual contrasts, but also manifestations of the Frequency and Production Codes, which have been claimed to be predominantly expressed by boundary tone contrasts (as to the Frequency Code especially Ohala 1983, 1984). Swerts et al. (1994), e.g., found that the perceived degree of finality is higher in early falls than in late falls (see also Kohler 2004, Wichmann et al. 2000) – which in turn suggests that a greater duration of a low stretch before a boundary substitutes perceptually for an even lower pitch level. Presumably, the impression of lowness/finality will also be enhanced if the low pitch is reached on the sonorous part of the stressed syllable. Furthermore, according to Wichmann (1991), the height of a nuclear fall affects the degree of perceived finality: the lower the starting point of the fall, the greater the perceived finality. The figures in (231) summarise the effects of timing differences on the perception of degree of prominence and finality:[122]

[122] The accented syllable is indicated by the black bar.

(231) [Figure: two schematic nuclear contours, A and B]
A: accent sounds less prominent; boundary sounds lower and ‘more final’
B: accent sounds more prominent; boundary sounds higher and ‘less final’
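
The stimulus values quoted above for perception experiment II (a 240 Hz peak, 170 Hz in the middle of the accented vowel for H+L*, and a 155 Hz boundary) can be put on a perceptually more meaningful scale by converting them to semitones. The following is only a minimal sketch: the conversion formula 12·log2(f1/f2) is standard, but the variable names are illustrative and the choice of semitones is an assumption of this example, not a measure used in the study.

```python
import math


def semitones(f_from_hz: float, f_to_hz: float) -> float:
    """Size of a pitch movement in semitones (positive value = fall)."""
    return 12.0 * math.log2(f_from_hz / f_to_hz)


# Stimulus values quoted in the text (perception experiment II).
PEAK_HZ = 240.0        # peak of both accent types
MID_VOWEL_HZ = 170.0   # H+L*: level reached in the middle of the accented vowel
BOUNDARY_HZ = 155.0    # low boundary tone / end of the phrase

# H*: the whole fall lies between the peak and the boundary.
h_star_fall = semitones(PEAK_HZ, BOUNDARY_HZ)

# H+L*: most of the fall is completed by the middle of the accented vowel.
early_part = semitones(PEAK_HZ, MID_VOWEL_HZ)
late_part = semitones(MID_VOWEL_HZ, BOUNDARY_HZ)

print(f"H*  : {h_star_fall:.2f} st from peak to boundary")
print(f"H+L*: {early_part:.2f} st up to mid-vowel, {late_part:.2f} st afterwards")
```

On these figures the total excursion is the same for both accent types; what differs is how early in the accented vowel the low level is reached, which is the alignment difference the discussion turns on.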

As we have seen several times now, the different linguistic and paralinguistic functions of the three biological codes can largely be attributed to either of the two basic functions of intonation: the phrasing function marked by boundary tones (Frequency Code and Production Code) and the highlighting function marked by pitch accents (Effort Code – to some extent also Frequency Code).[123] It could further be claimed that pitch contrasts in accents and boundary tones apply to different domains in the linguistic structure: whereas the type of pitch accent marks the pragmatic or cognitive status of (single) discourse referents, the type of boundary tone marks the status of a whole proposition. In particular, the degree of Givenness of an accented referring expression depends (among other factors) on the pitch height that is reached within the metrically strong part of the accent (i.e. the starred part). Thus, (nuclear) pitch accents containing an L* (L*, H+L*, L*+H) generally serve to mark the Accessibility of a referent, while accents containing an H* (H*, L+H*) mark a referent as New (see Pierrehumbert & Hirschberg (1990) for English, Baumann & Hadelich (2003) – as well as the corpus analysis and the two perception experiments presented earlier – for German). The general plausibility of the different functionality of pitch height in the two types of tonal events can be shown in common combinations of (nuclear) accents and boundary tones. A neutral utterance of the form H* L-% introduces a New referent by a high pitch accent (H*) in an assertive way (L-%). Similarly, a typical echo question has the form L* H-^H% in many languages. Here, a Given item (L*) is called into question, or the item’s status is unclear and needs further explanation or elaboration, which is expressed by the use of a very high boundary tone (H-^H%). Nevertheless, there are a number of languages and dialects – even German dialects – in which this mapping does not hold. Some dialects of the Palatinate, e.g., mark yes/no-questions by (rise-)falls and declarative utterances by (fall-)rises (see Peters 2001, 2004).

[123] The functional relations ‘phrasing – boundary tones’ versus ‘highlighting – accents’ hold at least for German and English. Cross-linguistically, they are not valid. In Korean, e.g., boundary tones are used to express focus (see section 2.2.5.3), and in Bari and Palermo Italian, e.g., specific pitch accent types take on an interrogative function (see e.g. Grice 1995, Grice et al. 2005).
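
The rule of thumb stated above, that the starred tone of a nuclear accent indicates the preferred Givenness reading, can be sketched as a small lookup. This is a hypothetical illustration only: the function names and the return label "untested" are assumptions of this example, downstepped !H* is deliberately kept apart because its status has not been tested for German, and the mapping holds at best for final arguments in low-ending phrases.

```python
import re
from typing import Optional

# The starred tone is the H or L (optionally downstepped, '!') directly before '*'.
_STARRED = re.compile(r"(!?[HL])\*")


def starred_tone(accent: str) -> str:
    """Return the starred tone of a GToBI-style accent label, e.g. 'H+L*' -> 'L'."""
    match = _STARRED.search(accent)
    if match is None:
        raise ValueError(f"no starred tone in {accent!r}")
    return match.group(1)


def preferred_status(accent: Optional[str]) -> str:
    """Map a nuclear accent (None = deaccented) to its preferred Givenness reading."""
    if accent is None:
        return "Given"        # fully active referents tend to be deaccented
    tone = starred_tone(accent)
    if tone == "L":
        return "Accessible"   # L*, H+L*, L*+H
    if tone == "H":
        return "New"          # H*, L+H*
    return "untested"         # !H*: status left open in the text


for label in ("H*", "L+H*", "H+L*", "L*+H", "L*", "!H*", None):
    print(label, "->", preferred_status(label))
```

The lookup encodes a preference, not a prediction: focus-background structure and boundary tone type can override it.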

5 A Model of Intonation and Givenness

In this study, we have identified three distinct levels addressing the Givenness status of referents and propositions (and the referring expressions denoting them). They were defined in the intermediate summary in section 2.3 and are repeated here:

Identifiability of entities, states or events on the basis of the speaker’s assumption that the listener has knowledge – in the sense of having a mental representation – of these referents or propositions

Degree of Activation of an entity or proposition assumed by the speaker to be in the listener’s consciousness at the time of utterance

Focus-Background Structure, i.e. the pragmatic partitioning of an utterance according to which there are elements the speaker chooses to present as newsworthy or not newsworthy, irrespective of their cognitive state

The first two levels are non-relational and describe the assumed cognitive state of a referent or proposition in the listener’s discourse model (identifiability) and in the ongoing discourse (activation). The level of focus and background is relational in nature and applies to the domain of the sentence or utterance. It cannot be interpreted without reference to the preceding discourse, though.

While the levels of (non-)identifiability and focus-background structure are concerned with binary distinctions,[124] the activation level should be thought of as a (potential) continuum. Accordingly, degrees of activation are predominantly marked by intonational variation, which is more gradual than other means of linguistic encoding and to some extent iconic in nature – as discussed in section 4.2.8. However, since the number of linguistically relevant intonational contrasts is limited, there is also a limited number of formal categories indicating different activation states.

[124] There are, however, different kinds of focus, expressing different degrees of markedness and expressed by different degrees of (phonological) prominence. Contrastive focus, e.g., is generally perceived as particularly prominent. A similar gradience of prominence does not hold for backgrounded elements.

We distinguish (following Chafe’s model) between three activation states of discourse referents, namely ‘inactive’, ‘semi-active’ and ‘active’ – corresponding to New, Accessible and Given information –, marked by differences in strength and type of pitch accent (including deaccentuation). A prerequisite for a referent’s activation is its identifiability, generally marked by morphosyntactic means. In particular, fully active referring expressions are often marked by pronouns, while semi-active and inactive items are encoded in their full lexical form (see section 2.2.5.1). Furthermore, non-identifiable (and thus necessarily inactive) items are often encoded as indefinite noun phrases, while identifiable items surface as definite noun phrases. However, there is no one-to-one correspondence between grammatical (in)definiteness and cognitive (non-)identifiability. Decisive factors are, e.g., whether an expression is specific or non-specific, or whether it is generic or not (see examples (79) to (84) in section 2.2.5.1).

Strictly speaking, however, we cannot predict from a referent’s identifiability and activation state alone whether and how a referring expression will be accented. The actual prosodic form depends on the referent’s pragmatic role in the given proposition (see Lambrecht (1994: 323) and section 2.2.5.3), i.e. whether the referent is part of the focus or the background in the utterance. Since the level of focus and background is determined by the intentions of the speaker – and largely independent of the referent’s activation degree – we strive to minimise the influence of this level by assuming broad or all-focus structures for our proposed model. However, there are cases of overlap between the focus-background level and the other two levels which are difficult to avoid: for example, a textually Given item, i.e. an item that has been mentioned in the immediate context, is very likely to be part of the background in the subsequent utterance. The perception experiment in section 4.2 has shown that even Accessible information may occur in the background – unless deliberately avoided in the experimental setting – which increases the probability of deaccentuation of the respective referring expression. Moreover, our claims concerning the prosodic marking of discourse referents are restricted to their occurrence as the final argument[125] in an assertive, i.e. low-ending, intonation phrase.
This restriction is necessary because the type of boundary tone strongly influences the type of nuclear pitch accent. Often, at least in German and English, they have opposite values, resulting in a clearly audible falling or rising movement. Thus, a low boundary tone is very often preceded by a high(er) nuclear accent (disregarding alignment differences), as e.g. in unmarked declarative utterances, while a high or rising boundary tone is often preceded by a low(er) pitch accent, as e.g. in echo questions. Finally, our claims are restricted to the language we tested in the corpus analysis and the perception experiments, i.e. (Standard) German.

[125] Terken & Hirschberg (1994) found that grammatical function and surface position are relevant cues for a speaker’s decision to accent or deaccent a textually Given item (see section 2.2.5.2).

The diagram in (232) represents an extended version of the diagram in (105) in section 2.2.5.2. It attempts to give a comprehensive summary of the relevant Givenness states of discourse referents (New, Accessible, Given), along a potential continuum of Givenness degrees (ranging from inactive to active), and their (preferred) linguistic marking in German – leaving aside the types of accent being used for marking the different Givenness states (this issue will be discussed in the rest of the chapter). Furthermore, we disregard unpredictable variation due to speaker intentions here, which is tantamount to disregarding the level of focus and background.

(232) Givenness Degrees and States of Discourse Referents and their Linguistic Marking in German (without Accent Types)

Discourse Referent
  Non-Identifiable
    Brand-New: indefinite full NP
  Identifiable
    Inactive
      Unused (New): definite full NP
    Semi-Active (definite full NP)
      Situationally Accessible
      Textually Accessible (displaced)
      Inferentially Accessible (scenario, whole-part)
      Inferentially Accessible (synonym, part-whole)
    Active (definite full NP or pronoun)
      Situationally Given
      Textually Given (currently evoked)

Prosodic marking ranges from Accent at the inactive end of the scale to No Accent at the active end.

The model is based on Chafe’s (1987, 1994) approach but also incorporates aspects of the models proposed by Allerton (1978), Prince (1981), and Lambrecht (1994), and of our empirical data. Allerton’s model (see (111), section 2.2.5.2) is similar to the one proposed here in many respects. For example, he postulates – as we do – four different formal categories, which are defined in morphosyntactic and prosodic terms. He calls these four categories ‘New’, ‘Semi-New’ (both subsumed under ‘New’ information in our model), ‘Semi-Given’ (equivalent to ‘Accessible’) and ‘Given’. They are derived from three binary distinctions, which we also claim to be relevant. First, ‘unknown’ versus ‘known’ applies to the level of (non-)identifiability or knowledge and is considered to

be marked by (in)definiteness. Second, ‘offstage’ versus ‘onstage’ applies to the level of activation or consciousness and can be thought of as a differentiation of ‘New in the discourse’ and ‘not New’, while – third – Allerton’s dichotomy of ‘nonimmediate’ versus ‘immediate’ further differentiates the activation parameter into what we called ‘Accessible’ and ‘Given’ information. As far as the prosodic marking of the proposed Givenness degrees is concerned, Allerton’s scale is not directly compatible with ours, since it only “applies to the relative givenness of noun/adverbial phrases that occur as appendages to a sentence” (1978: 148) and not to the final argument (or NP) in an assertive sentence. Nevertheless, Allerton argues that the type (e.g. fall for New and Semi-New information) and strength (secondary rise on Semi-Given or Accessible information) of the nuclear contour – including the nuclear accent – has an influence on an item’s perceived degree of Givenness. Fully Given items are claimed to be non-nuclear, i.e. they do not carry a nuclear accent at all, which is compatible with our claim. Prince suggests a ternary model with ‘New’, ‘Inferrable’ and ‘Evoked’ information, being equivalent to Chafe’s ‘New’, ‘Accessible’ and ‘Given’. She does not explicitly differentiate between non-identifiable and identifiable referents, although this distinction is implicitly present in the division of New information into ‘Brand-New’ and ‘Unused’. We adopt this distinction of the two types of New information (including Prince’s terminology) for our model. Brand-new referents are new for the hearer and new in the current discourse, while Unused referents are known to the hearer (i.e. present in his/her discourse-model) but not yet established in the ongoing discourse. Brand-New items are generally encoded as indefinite expressions, Unused items as definite ones. 
Both types of expression usually receive an accent.[126] From Lambrecht’s extended version of Chafe’s model we adopt the subdivision of the Accessibility category into different types by their source or origin and extend it to some degree to the category of Given information (following Prince’s distinction between ‘Situationally Evoked’ and ‘Textually Evoked’). An Accessible or Given referent may either be derivable from the physical context (‘situational’) or directly from the preceding text (‘textual’). In addition, an Accessible referent may be available via a bridging inference from a previously mentioned referent or proposition (‘inferential’). The following table gives examples of each of the seven categories. We provide two examples of inferentially Accessible information, since different types of this category cause different prosodic realisations. The referents in question are underlined. Where there are relevant antecedents for the target referents, these are underlined as well. Nuclear accents are marked by capital letters.

[126] The question of pitch accent type will be addressed later. Prince does not differentiate between different accent types, since she is not concerned with intonation at all.

(233) Example sentences of the different Givenness states of discourse referents

Brand-New
    Ich habe mir gestern ein BUCH gekauft.
    (I bought a book yesterday.)

Unused (New)
    Das Buch beschreibt den MOND.
    (The book describes the moon.)

Situationally Accessible
    Ich habe noch nie so hässliche BILder (or: HÄSSliche Bilder) gesehen.[127]
    (I’ve never seen such ugly pictures.)

Textually Accessible (displaced)
    Django ging an die Bar und bestellte einen Whisky. Er war bekannt dafür, dass er den Revolver schneller zog als sein Schatten. Man hatte Respekt vor ihm. Django trank den WHISky (or: TRANK den Whisky). Er brauchte nur einen Zug.[128]
    (Django went to the bar and ordered a whisky. He was known for drawing the gun faster than his shadow. People respected him. Django drank the whisky. He finished it in one draught.)

Inferentially Accessible (whole-part)
    Martin war begeistert von seinem neuen Buch. [...] Der Junge durchstöberte die SEIten.[129]
    (Martin was enthusiastic about his new book. [...] The boy flicked through the pages.)

Inferentially Accessible (part-whole)
    Der kleine Martin studierte jede einzelne Seite. [...] Der Junge LIEBte das Buch.[130]
    (Little Martin studied every single page. [...] The boy loved the book.)

Situationally Given
    Ich habe hier ein paar BILder für dich.
    (I have got some pictures for you here.)

Textually Given (currently evoked)
    Django ging an die Bar und bestellte einen Whisky. Er TRANK den Whisky/ihn. Django brauchte nur einen Zug.
    (Django went to the bar and ordered a whisky. He drank the whisky/it. Django finished it in one draught.)

[127] The sentence is adapted from Lambrecht’s English example (102) mentioned in section 2.2.5.2. We made sure that the referring expression Bilder (‘pictures’) occurs as the final argument in an assertive utterance in order to have the same surface structure in all examples.
[128] This example is taken from perception experiment II (see (198)).
[129] This example is taken from perception experiment II (see (205)).
[130] This example is taken from perception experiment II (see (204)).

Our determination of different Givenness degrees of discourse referents is argued for on the basis of preferences as to their intonational marking, attested in a corpus analysis and two perception experiments. We aimed at examining claims made in the literature, going beyond the simple binary distinction between accentuation as a marker of New information and deaccentuation as a marker of Given information (see Cruttenden, in press). We were particularly interested in degrees of activation between the extreme poles of Given and New, and – above all – in the accent types used for marking them. The most influential studies in this area of research are the ones by Pierrehumbert & Hirschberg (1990) for American English and Kohler (1991a) for German (see section 2.2.5.2 for a detailed discussion, including further approaches). Both studies served as points of departure for our own investigation. Pierrehumbert & Hirschberg, working within the framework of Autosegmental-Metrical Phonology, propose a model of intonational meaning in which the meaning of a whole contour can be derived from the composite meanings of pitch accents, phrase accents and boundary tones. Pitch accents are claimed to mark the status of individual discourse referents. A summary of the meanings attributed to different accent types is given in (234) (repeated from (122) in section 2.2.5.2):

(234) Pierrehumbert & Hirschberg’s (1990) assignment of accent types to Givenness degrees (degree of Givenness increases from top to bottom)
H*               New
L+H*             Addition of a New value
!H*, H+!H*       Accessible
L*+H             Modification of Given
L*, no accent    Given

In a series of perception experiments, Kohler (1991a) investigates three accent contours – early, medial and late peak –, which are found to differ in meaning. However, only the distinction between early and medial peaks turns out to be categorical, while the difference between medial and late peaks is gradual in nature. The table in (235) summarises Kohler’s findings as to the relation between accent type and degree or state of Givenness,[131] translating the contours tested into GToBI categories (see also Kohler 2004):

[131] Note, however, that Kohler does not concentrate on the information state of individual discourse referents (as Pierrehumbert & Hirschberg do) but investigates the marking of higher-level semantic-pragmatic relations.

(235) Relation between accent type and state of Givenness in Kohler (1991a)
L+H*/L*+H (Late Peak)      Emphasis (on sth. New)
H* (Medial Peak)           New
H+L*/H+!H* (Early Peak)    Accessible or Given

Our corpus study (see chapter 3) provided us with first insights as to how the final argument in assertive sentences is marked prosodically in German, i.e. which types of pitch accent can be found in actual production data. The types of pitch accent that turned out to be used for marking different cognitive states of discourse referents were only partly in line with our expectations derived from claims made in the (scarce) literature on German and English intonation and information structure, in particular Pierrehumbert & Hirschberg (1990) and Kohler (1991a). Results show that pitch accent type H* indeed correlates with Newness and deaccentuation with Givenness, but that a surprisingly large number of items is marked by H+L*, irrespective of their activation state. This type of accent had been expected as a marker of Accessible referring expressions but neither of fully Given nor of fully New ones. However, the considerable number of H+L* accents may be explained by the text genre: the typical reading style of newspaper texts in German is characterised by a falling nuclear intonation contour with an H+L* pitch accent. Although this stylistic device might dilute the results, the use of H+L* reveals an interesting tendency, namely that the type of Accessibility relates to a specific type of intonational marking. In particular, while synonyms are often unaccented, an inferable item in a given scenario or an anaphoric meronym as well as situationally Accessible items turn out to be marked by an accent, preferably H+L*. On the whole, however, we cannot guarantee the representativeness of the intonation patterns produced, since the corpus was read by a single speaker. Nevertheless, we were able to use the pitch accent types (including deaccentuation) observed as the basis for our closer investigation into the appropriate intonational marking of discourse referents.
This closer investigation was carried out in two perception experiments with 30 subjects each, in which listeners’ preferences among the three accent types H*, H+L* and deaccentuation/no accent were tested in relation to (assumed) differences in the Givenness degrees of referring expressions. Our motivation for selecting H*, H+L* and ‘no accent’ was that (a) we considered them to be perceptually distinct, (b) they are claimed to mark different activation states in the literature, and (c) they frequently occurred in our production data. In the first experiment, in which we made use of auditory and visual priming, there was some evidence for H* as a Newness accent and rather indirect evidence for H+L* as an Accessibility accent, while deaccentuation proved to be most appropriate for marking already Given information. However, no significant difference as to the intonational marking of Accessibility as opposed to Newness and Givenness could be found. Furthermore, the experiment only investigated one type of Accessibility, i.e. situational Accessibility due to visual priming. Thus, it was obvious that the prosodic marking of Accessible information needed closer investigation in a further experiment.

The results of the first experiment may have been affected by the fact that the stimuli were produced using diphone synthesis, which necessarily had a relatively poor segmental quality. To reduce these problems we used PSOLA resynthesis of natural recordings for the second experiment. Moreover, we examined eight different Accessibility relations between a textually given antecedent and an anaphor (the target referent) with regard to listeners’ preferred pitch accent type on the target referents. The types of Accessibility included textually displaced items (i.e. the same expression recurring after two or three intervening clauses) and inferentially derivable items such as synonyms, expressions standing in a converseness relation to an antecedent, hypernyms, hyponyms, holonyms, meronyms and expressions inferable from a given scenario. Results show that H+L* is the significantly preferred marker of certain types of Accessible information, namely anaphoric expressions in a whole-part relation and as a part of an established scenario. Other types of Accessible information, such as items in a converseness relation, holonyms (i.e. the anaphor in a part-whole relation), synonyms and hypernyms, are preferentially deaccented. The intermediate status of H+L*, and in turn its appropriateness for marking semi-active or Accessible information, is confirmed by the fact that this type of pitch accent was preferred over H* in all cases where deaccentuation was judged best. In other words: H+L* was at least the second choice for all kinds of supposedly Accessible information.
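The preferences reported for the second experiment can be read as a small lookup table. The following is a minimal sketch in Python, not part of the original study: the relation labels, the `Preference` type and the function name are hypothetical, and only those preferences are encoded that the paragraph above states explicitly. Relations for which no preference is stated there (e.g. textually displaced items and hyponyms) are deliberately left out and yield `None`.

```python
from typing import Dict, NamedTuple, Optional


class Preference(NamedTuple):
    first: str              # accent type judged most appropriate
    second: Optional[str]   # runner-up, where the passage states one


# Relation of the anaphor to its antecedent -> listener preference.
# The relation labels are hypothetical names chosen for this sketch.
PREFERENCES: Dict[str, Preference] = {
    "meronym":  Preference("H+L*", None),        # anaphor in a whole-part relation
    "scenario": Preference("H+L*", None),        # part of an established scenario
    "converse": Preference("no accent", "H+L*"),
    "holonym":  Preference("no accent", "H+L*"),  # anaphor in a part-whole relation
    "synonym":  Preference("no accent", "H+L*"),
    "hypernym": Preference("no accent", "H+L*"),
}


def preferred_accent(relation: str) -> Optional[Preference]:
    """Return the stated preference, or None if the passage states none."""
    return PREFERENCES.get(relation)


for relation in ("meronym", "synonym", "hyponym"):
    print(relation, preferred_accent(relation))
```

Keeping the second choice as a separate field reflects the observation that H+L* was preferred over H* wherever deaccentuation was judged best; the table records preferences, not categorical rules.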
The broadly ternary distinction between high accents for New information, low accents for Accessible information[132] and no accents for Given information mirrors a somewhat iconic use of pitch height in the marking of a referent’s information status and is in line with the function of intonation attributed to the Effort Code (Gussenhoven 2002, 2004; see also sections 4.2.7 and 4.2.8): the higher the pitch, the Newer (and more newsworthy) the discourse referent. Such a gradient scale not only implies differences in accent type but also in accent strength, especially when thinking in terms of effort. This leads to another ternary distinction between primary, secondary and no accents, parallel to the other two scales mentioned above, presented in (236). It has to be stated clearly, however, that the categories on these scales do not stand in a one-to-one relation to each other.

[132] H+L* counts as a low accent here, since the starred tone is low (see section 4.2.8).

(236) Proposed relation between activation state or degree, accent type and accent strength in German (and English)

Activation state:   New              Accessible         Given
Accent type:        H*               L*                 no accent
Accent strength:    primary accent   secondary accent   no accent

In fact, several studies on German and English propose different kinds of secondary accents which are (more or less directly) claimed to serve as markers of semi-active information. However, a secondary status is usually not attributed to nuclear accents. Secondary accents may instead surface as prenuclear (see Chafe 1994, Büring 2003a, 2003b) or postnuclear prominences, such as Halliday’s (1967b) ‘secondary information focus’, which closely resembles Allerton’s (1978) ‘Semi-Given’ information, marked by a secondary rise on a postnuclear item that is recoverable from the preceding discourse. Further instances of (potential) postnuclear prominences are Kohler’s (2005) ‘duration accents’ as well as ‘force accents’, characterised by increased articulatory effort and lack of pitch movement, and Grice et al.’s (2000) ‘phrase accents’. Phrase accents are basically edge tones which may nevertheless be secondarily associated with stressed syllables (see the discussion in section 2.2.5.2). Duration Accents, force accents and phrase accents, as well as Büring’s secondary accents, are claimed to apply to German. The final version of our model of Givenness degrees and states of discourse referents and their linguistic encoding in German in (237) is extended with a detailed list of possible variants in the intonational marking of the referents. The first row of pitch accents, printed in bold face, presents the variants we have evidence for as appropriate markers of the respective Givenness degrees and states. They were attested in our perception experiments. The categories in the second row are the alternatives that were attested in our corpus analysis. Finally, the third row shows those variants which have been considered to be appropriate markers of the respective Givenness states in the literature (see the discussion in 2.2.5.2). Although some of the alternatives were proposed for English, they may be relevant for German as well. 
Note that the non-uniform character of Accessibility is mirrored in the diagram: no significant preferences in the intonational realisation of situationally or textually Accessible referents were found. This is indicated by their position between H+L* and lack of accent. Both H+L* and deaccentuation are possible markers for these types of Accessibility (see the examples given in (233) above). On the other hand, the types of Accessibility for which we obtained significant results can be placed just to the left (H+L*) and to the right (no accent) of this dividing line. Since these are only preferences, this does not mean that another type of intonational marker would necessarily be inappropriate.

(237) Givenness Degrees and States of Discourse Referents and their Linguistic Marking in German (including Accent Type Preferences)

[Diagram: a Discourse Referent is either Non-Identifiable (Brand-New; indefinite full NP) or Identifiable. Identifiable referents are Inactive (Unused (New); definite full NP), Semi-Active (Situationally Accessible; Textually Accessible (displaced); Inferentially Accessible (scenario, whole-part) or (synonym, part-whole); definite full NP) or Active (Situationally Given; Textually Given (currently evoked); definite full NP or pronoun). Below each state, three rows list accent types: 1. the preferences attested in the perception experiments, namely H* (New), H+L* (Inferentially Accessible: scenario, whole-part) and no accent (Inferentially Accessible: synonym, part-whole; Given), with Situationally and Textually Accessible referents placed between H+L* and no accent; 2. the alternatives attested in the corpus analysis; 3. the variants proposed in the literature.]

To sum up, we have shown that a binary distinction between accent and lack of accent is far too simplistic for an adequate description of the various cognitive states a discourse referent may have in a listener’s mind. We have to be aware that we are dealing with a continuum of activation degrees, and that the activation degree of referents is constantly changing as the discourse proceeds. Thus, the number of activation degrees is potentially infinite and cannot be captured by the limited number of distinct linguistic categories available. Our data show, for example, that a referent’s degree of Givenness depends on factors such as mode of presentation, distance from the referent’s last mention, type of lexical relation to an antecedent, and even order of occurrence (e.g. in a whole-part relation). It could also be shown, however, that there are at least three distinct intonational categories (H*, H+L*, no accent) which are roughly appropriate for marking three different activation states (New, Accessible, Given), although there is some overlap of H+L* and no accent as the preferred marker of a number of types of Accessibility. There is generally considerable variation in the prosodic marking of discourse referents, since preferences may vary between speakers and listeners. For example, an H+L* pitch accent may be acceptable for marking a synonymous expression (which proved to be preferably deaccented). Again, other choices may be unacceptable, such as deaccentuation as a marker of Newness. That is, although it is surely too strong to claim that each of the three Givenness states proposed here is marked by a single prosodic category, their intonational encoding is by no means arbitrary.

6 Summary and Outlook

This study has given an overview of the intonational marking of various domains in the field of information structure, in particular at the level of discourse referents’ cognitive states. We have combined two areas of research, intonation and Givenness, both of which can be deemed either too vague or too complex to be integrated into the respective other field. The empirical evidence provided here is based on German, but it has implications for other West Germanic languages such as English and Dutch. We started by giving an introduction to phonetic aspects of intonation, especially accentuation, and to the currently most widespread phonological framework, Autosegmental-Metrical Phonology. We also introduced GToBI, a model for the annotation of German intonation. We then provided a rather theory-neutral introduction to the notion of Givenness and its relation to other dimensions of information structure (such as background-focus and theme-rheme), and discussed the nature and size of the domains of Givenness as well as its sources and perspectives. We distinguished three levels to which the notion of Givenness has been attributed in the literature (e.g. by Prince, Chafe, and Allerton), namely the levels of knowledge, consciousness and newsworthiness. Based on Lambrecht (1994), we consider the first two levels (knowledge and consciousness) to make up the core notion of Givenness, or Givenness proper, since they apply to the cognitive states of (the mental representations of) discourse referents. We relate them to the levels of identifiability, which denotes the listener’s ability to pick out a particular referent (or ‘file’) from among all those which can be designated with a particular linguistic expression and to identify it as the one the speaker intends, and activation, which denotes the listener’s awareness of an entity or proposition that a speaker can assume at a particular moment.
The third level (newsworthiness) correlates with the pragmatic role of a discourse referent in a proposition, expressed by the distinction between focus (important information for the speaker) and background (unimportant information for the speaker). We were particularly interested in the linguistic marking of Givenness proper, i.e. of the levels of identifiability and activation. Identifiability is, as Lambrecht (1994: 87) concludes, “imperfectly and non-universally matched by the grammatical category of definiteness” (and, consequently, non-identifiability by indefiniteness). The level of consciousness or activation, which applies to identifiable referents only, is marked by two different linguistic means: lexical form and intonation. Discourse-active referring expressions often surface as pronouns, while less active referents are encoded in their full lexical form. Furthermore, fully active referents are often unaccented, while less active items generally receive an accent.

However, our study has shown that assuming the two categories ‘accent’ and ‘no accent’ and their correspondence to ‘New’ versus ‘Given’ information is a crude simplification. Some recent studies on Givenness no longer regard the distinction between Given and New information as a dichotomy but rather as a continuum. Nevertheless, such a continuum cannot be adequately expressed in terms of linguistic marking, since the set of linguistic categories available is limited. Taking this mismatch into account, we postulate (following Chafe) three different cognitive states (Given, Accessible and New) that we claim to have formal, predominantly prosodic, correlates. These correlates are particularly difficult to detect between the extreme poles of the continuum, i.e. in the realm of Accessible or semi-active information. While the morphosyntactic marking of Accessible items (as full definite noun phrases) is relatively uncontroversial, there is no similar agreement as to their prosodic marking, possibly because Accessible referring expressions in the examples given in the literature are often early in the sentence, or intonational phrase, and would not bear the main (nuclear) accent in the phrase even if they were accented. Since they are in prosodically less salient positions, which makes the identification of an accent more difficult, and since the distribution and strength of prenuclear accents is even more dependent on the rhythmic structure and length of the utterance, accounts differ considerably as to the accentuation of these items. Thus, we concentrated on the prosodic marking of the last (and potentially nuclear) argument in assertive and completely focussed utterances. The empirical part of this study comprised the analysis of a read corpus of German newspaper texts (chapter 3) and two perception experiments (chapter 4). The corpus analysis served as a preliminary study.
Whereas the information states of the discourse referents examined could be considered to be reliable since they had been checked by experts (in the MULI project), there was no guarantee as to the intonational well-formedness of the sentences or as to their representativeness, since they were produced by just one speaker. Nevertheless, certain relations between accent type and cognitive activation state could be found which were further tested in the two perception experiments. The experiments paid particular attention to the analysis of specific types of Accessibility (situational, inferential, textual) and their preferred type of accentuation (H*, H+L*, deaccentuation). The choice of accent types tested was based on claims made in the literature, in particular by Pierrehumbert & Hirschberg (1990) for English and Kohler (1991a) for German, and on our corpus study. Overall results, which entered the model of intonation and Givenness for German presented in chapter 5, revealed a relation between pitch accent type H* and Newness. This holds for both types of New elements, namely ‘Brand-New’ expressions, which are unknown (i.e. non-identifiable) to the listener, and ‘Unused’ expressions, which are known (i.e. identifiable) to the listener but which are not derivable from the preceding discourse (i.e. inactive). The formal difference between these two categories lies in their morphosyntactic marking (indefinite versus definite). Looking at the other end of the scale, Given (i.e. active) referring expressions usually do not carry an accent. The intonational marking of the intermediate state of Accessibility is not as clear-cut, but a preference for pitch accent type H+L* with specific kinds of Accessible information, e.g. inferable items within a given scenario or the anaphor in a whole-part relation, could be found. Other presumably semi-active referents such as synonyms or the anaphors in part-whole or converseness relations turned out to be preferably deaccented. The unexpected preference for deaccentuation of the anaphor in the hypernym-hyponym relation (representing a superset-subset relation just like whole-part, in which the anaphor received an accent) might be a result of coreference interpretations which also had an impact on the focus-background structure of the target sentences. In sum, we have shown for the purposes of prosodic marking that information between the poles Given and New cannot be treated as a uniform category and that different types of more or less activated information, e.g. denoting different semantic relations, demand different accent types as linguistic markers. In fact, there is evidence that a range of accent types (including deaccentuation) can be mapped onto the gradient scale of activation degrees, with the pitch height on the accented syllable, i.e. the lexically stressed syllable of the referring expression, being the determining factor. Such a mapping suggests a somewhat iconic use of pitch height, which is compatible with Gussenhoven’s (2002, 2004) Effort Code: the higher the pitch on a lexically stressed syllable (or: the ‘starred’ tone), the Newer (or more newsworthy) the discourse referent (see section 4.2.8).
This is also in line with the intermediate status of H+L* that we found, since in all cases where one of the other two types of marking was preferred, the second choice was H+L*. However, we have to bear in mind that the meaning conveyed by a particular pitch accent type in German is not restricted to the marking of activation states. To name but a few functions, an H* or L+H* can also be used to mark contrast, an H+L* often marks the end of a paragraph (see chapter 3), and an L* is often placed on the last referent if the following boundary tone is high. Moreover, specific accent types may be used to express pragmatic meanings like irony, or as a stylistic device, e.g. when deliberately deaccenting a piece of information to indicate to the listener that this information is assumed to be already known. Furthermore, it is not only the accent type that conveys meaning differences in terms of Givenness, but also accent strength. In several studies on German and English, different kinds of secondary prominence are (more or less directly) proposed to serve as markers of semi-active information. These may be called ‘secondary rise’ (see Allerton 1978), ‘secondary accent’ (Chafe 1994, Büring 2003), ‘secondary information focus’ (Halliday 1967b), ‘duration accent’ (Kohler 2005) or ‘phrase accent’ (Grice et al. 2000).

Looking beyond the scope of the present study, our findings are directly relevant to applications in language technology. In particular, they could be used for improving the intonation of speech synthesis systems. In fact, some of our results have already been integrated in the German Text-to-Speech synthesis system MARY (see Schröder & Trouvain (2001) and http://mary.dfki.de). Based on the work by Amoia (2003) and Romanelli (2003), an information structure module was implemented, which has access to the output of the ‘tagger and chunker’, i.e. the module that divides the text input into part-of-speech tokens and syntactic phrases. The information structure module enriches the XML document with the attribute ‘Given’ for each adjective and noun, which can have the value ‘+’ (Given), assigned to all instances of a recurring word (as the same string or in a different inflected form) or ‘–’ (New). These tags are interpreted in the prosody module, deleting accents on ‘+ Given’, i.e. textually Given, items. The accent distribution in MARY is exclusively based on part-of-speech information. Each adjective and noun receives an accent. The other parts-of-speech are ranked hierarchically in terms of their accentability (roughly: full verbs > modal verbs > adverbs) and are only accented if the obligatory assignment rules do not assign any accent within an intermediate phrase. As to accent type, L+H* is the default prenuclear accent, H* the default nuclear accent, and H+L* the nuclear accent in a paragraph-final sentence. The deaccentuation of textually Given items already leads to an improvement of the intonation patterns produced. A further considerable improvement could be achieved by adding semantic information, enabling the system to generate appropriate prosodic realisations of inferentially Given or Accessible items. Hiyakumoto et al.
(1997) propose a framework that integrates semantic and discourse information into (English) text-to-speech systems (see also Prevost 1996 and Prevost & Steedman 1994). Information about the basic semantic relations that hold among lexical items can be obtained from a lexical database like the Princeton WordNet for English (see Miller et al. 1993) or GermaNet for German (see Hamp & Feldweg 1997). By linking MARY to GermaNet, for example, we could obtain for each noun a list of its synonyms, hypernyms, meronyms, and other lexical-semantic concepts the word is related to. If such a related concept occurred as an anaphoric expression, the information structure module could assign a specific tag (e.g. ‘synonym’ or ‘meronym’) to the token, which would be ‘translated’ by the prosody module into a specific type of pitch accent, e.g. (in accordance with our experimental findings) H+L* for meronyms and hyponyms but no accent for synonyms, holonyms and hypernyms. This distribution is sensitive to the order of occurrence of words in asymmetrical lexical relations: when a hyponym or meronym (i.e. a subordinate term) is mentioned, the meaning of its hypernym or holonym (i.e. a superordinate term) is implied, but not vice versa, which strongly influences the anaphor’s prosodic realisation (as discussed in section 4.2). Furthermore, all textually displaced items (i.e. the recurrence of the same word or stem after at least three sentences) should be assigned an H+L*, although the experiment in 4.2 revealed that deaccentuation is also appropriate. Nevertheless, we suggest choosing the accent, since ‘too many’ accents appear to sound better in speech synthesis systems than ‘too few’ accents. Finally, all New referents should receive an H* pitch accent. Adding diverse types of context-sensitive information, as suggested here, will considerably improve the performance of speech synthesis systems, especially if the semantic differences are mapped onto a wider variety of intonational choices allowing for more appropriate and natural prosodic patterns. Moreover, the integration of semantic information accompanied by more sophisticated intonation proposed here represents a decisive step away from strictly text-based synthesis towards a ‘concept-to-speech’ system capable of adjusting its prosody to the context.

The findings of our study open up new questions which have to be investigated in order to gain a fuller understanding of the interplay between intonation and Givenness. First, we need to find a way to determine different degrees or states of Givenness independent of the referring expression’s prosodic marking. This is necessary in order to avoid the danger of a circular argumentation and could be achieved by psycholinguistic experiments. One experiment could consist in a reading task in which subjects are confronted with sentences containing pairs of antecedents and anaphors in different contexts. Measuring reaction times as to the comprehension of the utterances presented could provide insights into the cognitive cost that is needed to activate specific semantically more or less related concepts.
Second, we need more experiments on the nature of all three sources of activation in more natural contexts, especially on visual or situational Givenness or Accessibility. Third, the interaction between focus-background structure and Givenness states or degrees needs a closer investigation. And finally, the role of postnuclear prominences such as force accents or phrase accents needs to be investigated, both in production and perception experiments.
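
The rule-based treatment of textual Givenness described above for the speech synthesis application can be illustrated with a minimal Python sketch. It is not MARY's actual code, and several points are assumptions made only for illustration: the part-of-speech labels, the `Token` type and the function names are hypothetical; lower-casing stands in for the matching of inflected forms; only second and later occurrences of a word are tagged as Given; each sentence is treated as a single intermediate phrase; the last remaining accented item is taken to be nuclear; and the fallback ranking for verbs and adverbs is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

ACCENTABLE = {"NOUN", "ADJ"}  # hypothetical part-of-speech labels


@dataclass
class Token:
    text: str
    pos: str
    given: Optional[str] = None   # "+" (Given) or "-" (New); None for other POS
    accent: Optional[str] = None  # pitch accent type, None if unaccented


def normalise(word: str) -> str:
    # Assumption: lower-casing replaces real lemmatisation of inflected forms.
    return word.lower()


def tag_given(sentences: List[List[Token]]) -> None:
    """Mark recurring nouns and adjectives as textually Given."""
    seen: Set[str] = set()
    for sentence in sentences:
        for tok in sentence:
            if tok.pos in ACCENTABLE:
                key = normalise(tok.text)
                tok.given = "+" if key in seen else "-"
                seen.add(key)


def assign_accents(sentence: List[Token], paragraph_final: bool = False) -> None:
    """Accent nouns and adjectives, skipping textually Given items."""
    accented = [t for t in sentence if t.pos in ACCENTABLE and t.given != "+"]
    for tok in accented[:-1]:
        tok.accent = "L+H*"          # default prenuclear accent
    if accented:
        # nuclear accent: H+L* in a paragraph-final sentence, otherwise H*
        accented[-1].accent = "H+L*" if paragraph_final else "H*"


text = [
    [Token("Mann", "NOUN"), Token("kaufte", "VERB"), Token("Auto", "NOUN")],
    [Token("Auto", "NOUN"), Token("war", "VERB"), Token("teuer", "ADJ")],
]
tag_given(text)
assign_accents(text[0])
assign_accents(text[1], paragraph_final=True)
for sentence in text:
    print([(t.text, t.given, t.accent) for t in sentence])
```

An extension along the lines proposed above would replace the binary `given` attribute with relation tags obtained from a lexical database and map each tag to an accent type in `assign_accents`.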

Bibliography

Allerton, D.J., 1978. The Notion of ‘Givenness’ and its Relation to Presupposition and Theme. Lingua 44, 133-168.
Allerton, D.J. & Alan Cruttenden, 1979. Three Reasons for Accenting a Definite Subject. Journal of Linguistics 15, 49-54.
Altmann, Hans, 1988. Intonationsforschungen. Tübingen: Niemeyer.
Ammann, Hermann, [1928] 1962. Die menschliche Rede. Sprachphilosophische Untersuchungen. 2. Teil: Der Satz. Darmstadt: Wissenschaftliche Buchgesellschaft. Reprint Lahr im Schwarzwald: Moritz Schauenburg.
Amoia, Marilisa, 2003. Modelling Information Status and Contrastiveness in TTS Systems. Ms., Saarland University.
Anderson, Anne H., Miles Bader, Ellen Gurman Bard, Elizabeth Boyle, Gwyneth Doherty, Simon Garrod, Stephen Isard, Jaqueline Kowtko, Jan McAllister, Jim Miller, Catherine Sotillo, Henry S. Thompson & Regina Weinert, 1991. The HCRC Map Task Corpus. Language and Speech 34, 351-366.
Ariel, Mira, 1988. Referring and Accessibility. Journal of Linguistics 24, 65-87.
Ariel, Mira, 1990. Accessing Noun-Phrase Antecedents. London/New York: Routledge.
Ariel, Mira, 1991. The Function of Accessibility in a Theory of Grammar. Journal of Pragmatics 16, 443-463.
Arnold, Jennifer, 1998. Reference Form and Discourse Patterns. PhD thesis, Stanford University.
Bansal, R.K., 1990. The Pronunciation of English in India. In: Susan Ramsaran (ed.), Studies in the Pronunciation of English. London: Routledge.
Batliner, Anton, Jan Buckow, Richard Huber, Volker Warnke, Elmar Nöth & Heinrich Niemann, 2001. Boiling down Prosody for the Classification of Boundaries and Accents in German and English. Proceedings Eurospeech, Aalborg, 2781-2784.
Baumann, Stefan, 1997. Intonatorische Markierung der Informationsstrukturen freier Erzählungen. Unpublished Staatsexamen thesis, University of Cologne.
Baumann, Stefan, 1999. Zum Verhältnis von Akzentform und kognitivem Status von Diskurseinheiten. In: Jürgen Joachimsthaler, Ulrich Engel & Stefan H. Kaszyński (eds.), Convivium. Germanistisches Jahrbuch Polen. Bonn: DAAD. 201-224.
Baumann, Stefan, Caren Brinckmann, Silvia Hansen-Schirra, Geert-Jan Kruijff, Ivana Kruijff-Korbayová, Stella Neumann & Elke Teich, 2004a. Multi-Dimensional Annotation of Linguistic Corpora for Investigating Information Structure. Proceedings Frontiers in Corpus Annotation 2004, NAACL/HLT Conference Workshop, Boston, USA. 39-46.
Baumann, Stefan, Caren Brinckmann, Silvia Hansen-Schirra, Geert-Jan Kruijff, Ivana Kruijff-Korbayová, Stella Neumann, Erich Steiner, Elke Teich & Hans Uszkoreit, 2004b. The MULI Project: Annotation and Analysis of Information Structure in German and English. Proceedings 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal.
Baumann, Stefan & Martine Grice, 2004. Accenting Accessible Information. Proceedings SpeechProsody, Nara, 21-24.
Baumann, Stefan & Martine Grice, to appear. The Intonation of Accessibility. Invited contribution Journal of Pragmatics (Special Issue on Prosody and Pragmatics).
Baumann, Stefan & Kerstin Hadelich, 2003a. Accent Type and Givenness: An Experiment with Auditory and Visual Priming. Proceedings 15th ICPhS, Barcelona, 1811-1814.
Baumann, Stefan & Kerstin Hadelich, 2003b. On the Perception of Intonationally Marked Givenness after Auditory and Visual Priming. Proceedings AAI workshop „Prosodic Interfaces“, Nantes, 21-26.
Baumann, Stefan & Jürgen Trouvain, 2001. On the Prosody of German Telephone Numbers. Proceedings 7th Eurospeech, Aalborg, 557-560.
Beckman, Mary E., 1986. Stress and Non-Stress Accent. Dordrecht: Foris.
Beckman, Mary E. & G. Ayers-Elam, 1997. Guide to ToBI Labelling. Text and accompanying audio examples available at http://ling.ohio-state.edu/Phonetics/E_ToBI/etobi_homepage.html.
Beckman, Mary E. & Julia Hirschberg, 1994. The ToBI Annotation Conventions. Manuscript and accompanying speech material. Ohio State University.
Beckman, Mary E., Julia Hirschberg & Stephanie Shattuck-Hufnagel, 2005. The Original ToBI System and the Evolution of the ToBI Framework. In: Sun-Ah Jun (ed.), Prosodic Typology - The Phonology of Intonation and Phrasing. Oxford University Press.
Beckman, Mary E. & Janet Pierrehumbert, 1986. Intonational Structure in Japanese and English. Phonology Yearbook 3, 255-309.
Benzmüller, Ralf & Martine Grice, 1997. Trainingsmaterialien zur Etikettierung deutscher Intonation mit GToBI. Saarbrücken: Phonus 3, 9-34.
Benzmüller, Ralf & Martine Grice, 1998. The Nuclear Accentual Fall in the Intonation of Standard German. In: ZAS Papers in Linguistics: Papers on the conference „The word as a phonetic unit“. Berlin. 79-89.
Bloomfield, Leonard, 1935. Language. London: Allen and Unwin.
Boersma, Paul & David Weenink, 1996. PRAAT, a System for Doing Phonetics by Computer (version 3.4.). Report 132. Institute of Phonetic Sciences of the University of Amsterdam, 182 pages. Web Page: http://www.fon.hum.uva.nl/praat/
Bois, John W. Du, 1987. The Discourse Basis of Ergativity. Language 63, 805-855.
Bolinger, Dwight, 1958. A Theory of Pitch Accent in English. Word 14, 109-149. Reprinted in Bolinger 1965, 101-117.
Bolinger, Dwight, 1961. Contrastive Accent and Contrastive Stress. Language 37, 83-96.
Bolinger, Dwight, 1964. Intonation: Around the Edge of Language. Harvard Educational Review 34, 282-296.
Bolinger, Dwight, 1965. Forms of English: Accent, Morpheme, Order. Cambridge, MA: Harvard University Press.
Bolinger, Dwight, 1972. Accent is Predictable (if you’re a Mind-Reader). Language 48, 633-644.
Bolinger, Dwight, 1985. The Inherent Iconism of Intonation. In: John Haiman (ed.), Iconicity in Syntax. Amsterdam and Philadelphia: John Benjamins. 97-108.
Bolinger, Dwight, 1986. Intonation and Its Parts. Palo Alto: Stanford University Press.
Bolinger, Dwight, 1989. Intonation and Its Uses. Palo Alto: Stanford University Press.
Borden, Gloria J. & Katherine S. Harris, 1984. Speech Science Primer: Physiology, Acoustics and Perception of Speech (2nd edition). Baltimore: Williams & Wilkins.
Brants, Sabine, Stefanie Dipper, Peter Eisenberg, Silvia Hansen, Esther König, Wolfgang Lezius, Christian Rohrer, George Smith & Hans Uszkoreit, to appear. TIGER: Linguistic Interpretation of a German Corpus. Special Issue JLAC.
Braun, Bettina, to appear. Production and Perception of Contrastive and Non-Contrastive Themes in German. PhD thesis, Saarland University.
Brazil, David, Malcolm Coulthard & Catherine Johns, 1980. Discourse Intonation and Language Teaching. London: Longman.
Brown, Gillian, 1983. Prosodic Structure and the Given/New Distinction. In: Anne Cutler & D. Robert Ladd (eds.), Prosody: Models and Measurements. Berlin: Springer. 67-77.
Brown, Gillian, Karen L. Currie & Joanne Kenworthy, 1980. Questions of Intonation. London: Croom Helm.
Bruce, Gösta, 1977. Swedish Word Accents in Sentence Perspective. Lund: Gleerup.
Büring, Daniel, 1996. On (De)Accenting. Talk presented at the SFB 340 conference in Tübingen, October.
Büring, Daniel, 1997. The Meaning of Topic and Focus – The 59th Street Bridge Accent. London: Routledge.
Büring, Daniel, 2002. Information Structure. Talk presented at FFL, Düsseldorf.
Büring, Daniel, 2003a. Intonation, Semantics and Information Structure. To appear in: Gillian Ramchand & Charles Reiss (eds.), Interfaces. Oxford: Oxford University Press.
Büring, Daniel, 2003b. Focus Projection and Default Prominence. Proceedings Symposion Informationsstruktur - Kontrastivt, Lund.
Cassidy, Steve & Jonathan Harrington, 2001. Multi-Level Annotation in the EMU Speech Database Management System. Speech Communication 33 (1-2), 61-78.
Chafe, Wallace, 1974. Language and Consciousness. Language 50, 111-133.

Chafe, Wallace, 1976. Givenness, Contrastiveness, Definiteness, Subjects, Topics and Point of View. In: Charles Li (ed.), Subject and Topic. New York: Academic Press. 25-56. Chafe, Wallace, 1987. Cognitive Constraints on Information Flow. In: Russell Tomlin (ed.), Coherence and Grounding in Discourse. Amsterdam: John Benjamins. 21-52. Chafe, Wallace, 1994. Discourse, Consciousness, and Time. Chicago/London: University of Chicago Press. Chafe, Wallace, 1996. Inferring Identifiability and Accessibility. In: Thorstein Fretheim & Jeanette Gundel (eds.), Reference and Referent Accessibility. Amsterdam: John Benjamins. 37-46. Charpentier, Francis & M. Stella, 1986. Diphone Synthesis Using an Overlap-Add Technique for Speech Waveforms Concatenation. Proceedings ICASSP 86 (3), 2015-2018. Chomsky, Noam, 1972. Deep Structure, Surface Structure, and Semantic Interpretation. In: Noam Chomsky (ed.), Studies on Semantics in Generative Grammar. The Hague: Mouton. 62-119. Chomsky, Noam & Morris Halle, 1968. The Sound Pattern of English. New York: Harper and Row. Clark, Herbert H., 1977. Bridging. In: P.N. Johnson-Laird & P.C. Wason (eds.), Thinking: Readings in Cognitive Science. Cambridge: Cambridge University Press. 411-420. Clark, Herbert H. & Susan E. Haviland, 1977. Comprehension and the Given-New Contract. In: Roy Freedle (ed.), Discourse Production and Comprehension. New Jersey: Ablex. 1-40. Clark, Herbert H. & Catherine R. Marshall, 1981. Definite Reference and Mutual Knowledge. In: Aravind Joshi, Bonnie Webber & Ivan Sag (eds.), Elements of Discourse Understanding. Cambridge: Cambridge University Press. 10-63. Clark, Herbert H. & C.J. Sengul, 1979. In Search of Referents for Nouns and Pronouns. Memory and Cognition 7, 35-41. Clements, G.N. & S.J. Keyser, 1983. CV-Phonology: A Generative Theory of the Syllable. Cambridge: Cambridge University Press. Cohen, Antonie & Johan ’t Hart, 1967. On the Anatomy of Intonation. Lingua 19, 177-192. Couper-Kuhlen, Elizabeth, 1984. 
A New Look at Contrastive Intonation. In: Richard J. Watts & Urs Weidmann (eds.), Modes of Interpretation. Essays Presented to Ernst Leisi on the Occasion of his 65th Birthday. Tübingen: Narr, 137-158. Couper-Kuhlen, Elizabeth, 1986. An Introduction to English Prosody. London: Arnold. Cristo, Albert Di & Daniel Hirst, 1986. Modelling French Micromelody: Analysis and Synthesis. Phonetica, 43 (1/3), 11-30. Cruse, D. Alan, 1986. Lexical Semantics. Cambridge: Cambridge University Press. Cruttenden, Alan, 1986. Intonation. Cambridge: Cambridge University Press.


Cruttenden, Alan, in press. The De-accenting of Given Information: a Cognitive Universal? In: G. Bernini & M.L. Schwartz (eds.), Pragmatic Organization of Discourse in the Languages of Europe (Empirical Approaches to Language Typology, EUROTYP Vol.20-8). The Hague: Mouton de Gruyter. Crystal, David, 1969. Prosodic Systems and Intonation in English. Cambridge: Cambridge University Press. Culicover, Peter & Michael Rochemont, 1983. Stress and Focus in English. Language 59 (1), 123-165. Dahl, Östen, 1976. What is New Information? In: Nils Erik Enkvist & Viljo Kohonen (eds.), Approaches to Word Order. Reports in Text Linguistics No. 72. Meddelanden från Stiftelsens för Åbo Akademi Forskningsinstitut, 8. Åbo/Turku. Daneš, František, 1974. Functional Sentence Perspective and the Organization of the Text. In: František Daneš (ed.), Papers on Functional Sentence Perspective. Prague: Academia. Deemter, Kees van, 1992. Towards a generalization of anaphora. Journal of Semantics 9 (1), 27-51. Deemter, Kees van, 1994. What’s New? A Semantic Perspective on Sentence Accent. Journal of Semantics 11, 1-31. Deemter, Kees van, 1999. Contrastive Stress, Contrariety, and Focus. In: Peter Bosch & Rob van der Sandt (eds.), Focus - Linguistic, Cognitive, and Computational Perspectives (Studies in Natural Language Processing). Cambridge: Cambridge University Press. 3-17. Diehl, Randy, 1991. The Role of Phonetics within the Study of Language. Phonetica 48, 120-134. Dombrowski, Ernst, 2003. Semantic Features of Accent Contours: Effects of F0 Peak Position and F0 Time Shape. Proceedings 15th ICPhS, Barcelona, 1217-1220. E-Prime, 2000. Version 1.0. Pittsburgh, PA: Psychology Software Tools. Essen, Otto von, [1956] 1964. Grundzüge der hochdeutschen Satzintonation. Ratingen: Henn. Esser, Jürgen, 1983. Tone Units in Functional Sentence Perspective. Journal of Semantics 2, 121-139. Féry, Caroline, 1986. Metrische Phonologie und Wortakzent im Deutschen. Studium Linguistik 20, 16-43. 
Féry, Caroline, 1993. German Intonational Patterns. Tübingen: Niemeyer. Fillmore, Charles J., 1982. Frame Semantics. In: Linguistics Society of Korea (ed.), Linguistics in the Morning Calm. Hanshin. 111-138. Firbas, Jan, 1964. On Defining the Theme in Functional Sentence Analysis. Travaux Linguistiques de Prague, Vol. 1, 267-280. Firbas, Jan, 1966. Non-Thematic Subjects in Contemporary English. Travaux Linguistiques de Prague, Vol. 2, 239-256.


Fretheim, Thorstein, 1994. The Accent Parameter and the Cognitive Status of Discourse Referents. In: J. Allwood, Bo Ralph, Paula Andersson, Dora Kós-Dienes & Åsa Wengelin (eds.), Proceedings XIVth Scandinavian Conference of Linguistics, 95-107. Fretheim, Thorstein, 1996. Accessing Contexts with Intonation. In: Thorstein Fretheim & Jeanette Gundel (eds.), Reference and Referent Accessibility. Amsterdam: John Benjamins. 89-112. Frota, Sónia, 1998. Prosody and Focus in European Portuguese. PhD thesis, University of Lisbon. Published by Garland, New York (2000). Fry, D.B., 1955. Duration and Intensity as Physical Correlates of Linguistic Stress. Journal of the Acoustical Society of America 27, 765-768. Fry, D.B., 1958. Experiments in the Perception of Stress. Language and Speech 1, 126-152. Fuchs, Anna, 1976. ‚Normaler‘ und ‚kontrastiver‘ Akzent. Lingua 38, 293-312.

Garrod, Simon C. & Anthony J. Sanford, 1982. The Mental Representation of Discourse in a Focussed Memory System: Implications for the Interpretation of Anaphoric Noun-Phrases. Journal of Semantics 1, 21-41. Geluykens, Ronald, 1991. Information Flow in English Conversation: A New Approach to the Given-New Distinction. In: Eija Ventola (ed.), Functional and Systemic Linguistics. Approaches and Uses. Berlin/New York: Mouton de Gruyter. 141-167. Geluykens, Ronald, 1993. Topic Introduction in English Conversation. Transactions of the Philological Society 91 (2), 181-214. Giegerich, Heinz J., 1985. Metrical Phonology and Phonological Structure. German and English. Cambridge: Cambridge University Press. Givón, Talmy, 1983. Introduction. In: Talmy Givón (ed.), Topic Continuity in Discourse. Amsterdam and Philadelphia: John Benjamins. 5-41. Givón, Talmy, 1984. Syntax: A Functional-Typological Introduction, Vol. I. Amsterdam and Philadelphia: John Benjamins. Givón, Talmy, 1990. Syntax: A Functional-Typological Introduction, Vol. II. Amsterdam and Philadelphia: John Benjamins. Givón, Talmy, 1992. The Grammar of Referential Coherence as Mental Processing Instructions. Linguistics 30, 5-55. Goldbeck, Thomas P. & Walter F. Sendlmeier, 1988. Wechselbeziehung zwischen Satzmodalität und Akzentuierung in satzfinaler Position bei der Realisierung von Intonationskonturen. In: Hans Altmann (ed.), Intonationsforschungen. Tübingen: Niemeyer. 305-321. Goldsmith, John A., 1976. An Overview of Autosegmental Phonology. Linguistic Analysis 2, 23-68. Goldsmith, John A., 1981. English as a Tone Language. In: D. Goyvaerts (ed.), Phonology in the 1980’s. Gent. 287-308.


Goldsmith, John A., 1990. Autosegmental and Metrical Phonology. Oxford: Blackwell. Grabe, Esther, 1998. Comparative Intonational Phonology: English and German. (MPI Series in Psycholinguistics 7). Wageningen: Ponsen and Looijen. Grice, H. Paul, 1975. Logic and Conversation. In: Peter Cole & Jerry L. Morgan (eds.), Syntax and Semantics, Vol. III: Speech Acts. New York: Academic Press. 41-58. Grice, Martine, 1995. The Intonation of Interrogation in Palermo Italian: Implications for Intonation Theory. Tübingen: Niemeyer. Grice,

Martine, 2004. Prosody and Intonation. Presentation at DoBeS (Documentation of Endangered Languages) Summer School, Frankfurt/Main.

Grice, Martine & Stefan Baumann, 2002. Deutsche Intonation und GToBI. Linguistische Berichte 191, 267-298. Grice, Martine, Stefan Baumann & Ralf Benzmüller, 2005. German Intonation in Autosegmental-Metrical Phonology. In: Sun-Ah Jun (ed.), Prosodic Typology. The Phonology of Intonation and Phrasing. Oxford: Oxford University Press. 55-83. Grice, Martine & Ralf Benzmüller, 1995. Transcription of German Intonation using ToBI-Tones – The Saarbrücken System. Phonus (Saarbrücken University) 1, 33-51. Grice, Martine & Ralf Benzmüller, 1998. Tonal Affiliation in German Falls and Fall-Rises. Poster presented at the 5th Conference on Laboratory Phonology, York. Grice, Martine, Mariapaola D’Imperio, Michelina Savino & Cinzia Avesani, 2005. Strategies for Intonation Labelling across Varieties of Italian. In: Sun-Ah Jun (ed.), Prosodic Typology. The Phonology of Intonation and Phrasing. Oxford: Oxford University Press. 362-389. Grice, Martine, D. Robert Ladd & Amalia Arvaniti, 2000. On the Place of Phrase Accents in Intonational Phonology. Phonology 17 (2), 143-185. Grosz, Barbara, Aravind Joshi & Scott Weinstein, 1995. Centering: A Framework for Modeling the Local Coherence of Discourse. Computational Linguistics 21 (2), 203-225. Gumperz, John J., 1982. Discourse Strategies. Cambridge: Cambridge University Press. Gundel, Jeanette, 1985. ‘Shared Knowledge’ and Topicality. Journal of Pragmatics 9, 83-107. Gundel, Jeanette, 1996. Relevance Theory Meets the Givenness Hierarchy. An Account of Inferrables. In: Thorstein Fretheim & Jeanette Gundel (eds.), Reference and Referent Accessibility. Amsterdam: John Benjamins. 141-153. Gundel, Jeanette, Nancy Hedberg & Ron Zacharski, 1993. Cognitive Status and the Form of Referring Expressions in Discourse. Language 69, 274-307. Gunter, Richard, 1966. On the Placement of Accent in Dialog: A Feature of Context Grammar. Journal of Linguistics 2, 159-179.


Gussenhoven, Carlos, 1983. Focus, Mode, and the Nucleus. Journal of Linguistics 19, 377-417. Gussenhoven, Carlos, 1984. On the Grammar and Semantics of Sentence Accents. Dordrecht: Foris. Gussenhoven, Carlos, 1985. Two Views of Accent: a Reply. Journal of Linguistics 21, 125-138. Gussenhoven, Carlos, 1991. The English Rhythm Rule as an Accent Deletion Rule. Phonology 8, 1-35. Gussenhoven, Carlos, 2002. Intonation and Interpretation: Phonetics and Phonology. Proceedings 1st Int. Conference on Speech Prosody, Aix-en-Provence, 47-57. Gussenhoven, Carlos, 2004. The Phonology of Tone and Intonation. Cambridge: Cambridge University Press. Gussenhoven, Carlos & Haike Jacobs, 1998. Understanding Phonology. London: Arnold. Hadelich, Kerstin, Matt Crocker & Christoph Scheepers, 2002. Powerful Pictures: Priming Planning, Production or Both. Poster presented at 8th AMLaP, Tenerife. Hajičova, Eva, 1993. Issues of Sentence Structure and Discourse Patterns, Theoretical and Computational Linguistics, Vol. 2. Prague: Charles University. Halle, Morris & Jean-Roger Vergnaud, 1979. Metrical Structures in Phonology. Ms., MIT. Halle, Morris & Jean-Roger Vergnaud, 1982. On the Framework of Autosegmental Phonology. In: Harry van der Hulst & N. Smith (eds.), The Structure of Phonological Representation. Dordrecht: Foris. Halliday, M.A.K., 1967a. Intonation and Grammar in British English. The Hague: Mouton. Halliday, M.A.K., 1967b. Notes on Transitivity and Theme in English, Part 2. Journal of Linguistics 3, 199-244. Halliday, M.A.K., 1970. A Course in Spoken English: Intonation. London: Oxford University Press. Halliday, M.A.K., 1985. An Introduction to Functional Grammar. London: Arnold. Halliday, M.A.K. & Ruqaiya Hasan, 1976. Cohesion in English. London: Longman. Halliday, M.A.K. & Christian Matthiessen, 2004. An Introduction to Functional Grammar. London: Arnold. Hamp, B. & H. Feldweg, 1997. GermaNet – a Lexical-Semantic Net for German. 
Proceedings ACL/EACL-97 Workshop on Automatic Information Extraction and Building of Lexical Semantic Resources for NLP Applications. Madrid. Harrington, Jonathan, Mary E. Beckman, Janet Fletcher & Sallyanne Palethorpe, 1998. An Electropalatographic, Kinematic, and Acoustic Analysis of Supralaryngeal Correlates of Word-Level Prominence Contrasts in English. Proceedings of the 5th International Conference on Spoken Language Processing, 1851-1854.

Hart, Johan ’t, René Collier & Antonie Cohen, 1990. A Perceptual Study of Intonation: An Experimental-Phonetic Approach. Cambridge: Cambridge University Press. Haviland, Susan E. & Herbert H. Clark, 1974. What’s New? Acquiring New Information as a Process in Comprehension. Journal of Verbal Learning and Verbal Behavior 13, 512-521. Hayes, Bruce, 1980. A Metrical Theory of Stress Rules. PhD thesis, MIT. Hayes, Bruce, 1982. Extrametricality and English Stress. Linguistic Inquiry 13, 227-276. Heim, Irene, 1982. The Semantics of Definite and Indefinite Noun Phrases. PhD thesis, University of Massachusetts, Amherst. Hendriks, Herman, 1996. Information Packaging: From Cards to Boxes. In: T. Galloway & J. Spence (eds.), Proceedings of Semantics and Linguistic Theory VI. Cornell University. Heusinger, Klaus von, 1999. Intonation and Information Structure. State doctorate thesis (Habilitationsschrift), University of Konstanz. Hiyakumoto, L., Scott Prevost & J. Cassell, 1997. Semantic and Discourse Information for Text-to-Speech Intonation. ACL Workshop on Concept-to-Speech Technology. Höhle, Tilman, 1991. On Reconstruction and Coordination. In: H. Haider & K. Netter (eds.), Representation and Derivation in the Theory of Grammar. Dordrecht: Reidel. Horne, Merle, 1990. Accentual Patterning in ‚New‘ vs ‚Given‘ Subjects in English. Working Papers, Department of Linguistics, Lund University 36, 81-97. Isačenko, A.V. & H.-J. Schädlich, 1966. Untersuchungen über die deutsche Satzintonation. Studia Grammatica VII, 7-64. Jackendoff, Ray, 1972. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press. Jacobs, Joachim, 1982. Syntax und Semantik der Negation im Deutschen. München: Fink. Jacobs, Joachim, 1988. Probleme der freien Wortstellung im Deutschen. Sprache und Pragmatik. Arbeitsberichte 5, 8-37. Jacobs, Joachim, 1991/92. Informationsstruktur und Grammatik. Sonderheft 4. Linguistische Berichte. Jones, Daniel, 1950. The Phoneme: Its Nature and Use. 
Cambridge: Heffer. Joshi, Aravind, 1982. The Role of Mutual Beliefs in Question-Answer Systems. In: N. Smith (ed.), Mutual Knowledge. New York: Academic Press. Jun, Sun-Ah, 1993. The Phonetics and Phonology of Korean Prosody. PhD thesis, Ohio State University. Kameyama, M., 1986. A Property-Sharing Constraint in Centering. Proceedings of the 24th Annual meeting of the American Association of Computational Linguistics, Cambridge, MA, 200-206.


Kamp, Hans & Uwe Reyle, 1993. From Discourse to Logic. Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Dordrecht: Kluwer. Karttunen, Lauri, 1976. Discourse Referents. In: J. McCawley (ed.), Notes from the Linguistic Underground. Syntax & Semantics vol. VII. New York: Academic Press. 363-385. Kingdon, Roger, 1958. The Groundwork of English Intonation. London: Longman. Kingston, John, 1991. Integrating Articulations in the Perception of Vowel Height. Phonetica 48, 149-179. Kingston, John & Randy L. Diehl, 1994. Phonetic Knowledge. Language 70, 419-454. Kohler, Klaus, 1991a. Terminal Intonation Patterns in Single-Accent Utterances of German: Phonetics, Phonology and Semantics. AIPUK 25, 115-185. Kohler, Klaus, 1991b. A Model of German Intonation. AIPUK 25, 295-360. Kohler, Klaus, [1977] 1995. Einführung in die Phonetik des Deutschen. (Grundlagen der Germanistik 20). Berlin: Schmidt. Kohler, Klaus, 2004. Prosody Revisited: FUNCTION, TIME, and the LISTENER in Intonational Phonology. Proceedings SpeechProsody, Nara, 171-174. Kohler, Klaus, 2005. Form and Function of Non-Pitch Accents. AIPUK 35a, 97-123. Kohler, Klaus, in press. What is Emphasis and How is it Coded? Proceedings SpeechProsody, Dresden. Kruijff, Geert-Jan, 2001. A Categorial-Modal Logical Architecture of Informativity: Dependency Grammar Logic & Information Structure. PhD thesis, Charles University, Prague. Kruijff-Korbayová, Ivana & Geert-Jan Kruijff, 2004. Discourse-level Annotation for Investigating Information Structure. Proceedings of the ACL Workshop on Discourse Annotation, Barcelona. Kuno, Susumu, 1972. Functional Sentence Perspective: A Case Study from Japanese and English. Linguistic Inquiry 3, 269-320. Kuno, Susumu, 1978. Generative Discourse Analysis in America. In: Wolfgang Dressler (ed.), Current Trends in Textlinguistics. Berlin/New York: de Gruyter. 275-294. Ladd, D. Robert, 1980. 
The Structure of Intonational Meaning: Evidence from English. Bloomington: Indiana University Press. Ladd, D. Robert, 1983a. Phonological Features of Intonational Peaks. Language 59, 721-759. Ladd, D. Robert, 1983b. Even, Focus, and Normal Stress. Journal of Semantics 2, 157-170. Ladd, D. Robert, 1996. Intonational Phonology. Cambridge: Cambridge University Press. Ladd, D. Robert & Rachel Morton, 1997. The Perception of Intonational Emphasis: Continuous or Categorical? Journal of Phonetics 25, 313-342.


Ladd, D. Robert & Kim Silverman, 1984. Vowel Intrinsic Pitch in Connected Speech. Phonetica 41, 31-40. Ladefoged, Peter, 1962. Elements of Acoustic Phonetics. Chicago: University of Chicago Press. Lakoff, George, 1971a. The Role of Deduction in Grammar. In: Charles J. Fillmore & D.T. Langendoen (eds.), Studies in Linguistic Semantics. New York: Holt, Rinehart & Winston. Lakoff, George, 1971b. Presupposition and Relative Well-formedness. In: Danny Steinberg & Leon A. Jakobovits (eds.), Semantics. An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology. Cambridge: Cambridge University Press. Lambrecht, Knud, 1994. Information Structure and Sentence Form. Cambridge: Cambridge University Press. Lehiste, Ilse, 1970. Suprasegmentals. Cambridge, MA: MIT Press. Lehiste, Ilse & G.E. Peterson, 1961. Some Basic Considerations in the Analysis of Intonation. Journal of the Acoustical Society of America 33, 419-425. Lehman, Christina, 1977. A Re-analysis of Givenness: Stress in Discourse. Papers from the 13th Regional Meeting, Chicago Linguistic Society, 316-324. Liberman, Mark, 1975 [1979]. The Intonational System of English. New York: Garland. Liberman, Mark & Alan Prince, 1977. On Stress and Linguistic Rhythm. Linguistic Inquiry 8, 249-336. Lyons, John, 1968. Introduction to Theoretical Linguistics. London: Cambridge University Press. Marcus, Mitchell, Grace Kim, Mary Ann Marcinkiewicz, Robert MacIntyre, Ann Bies, Mark Ferguson, Karen Katz & Britta Schasberger, 1994. The Penn Treebank: Annotating Predicate Argument Structure. Proceedings of the Human Language Technology Workshop, San Francisco. Morgan Kaufmann. Mathesius, Vilém, 1929 [1983]. Functional Linguistics. In: J. Vachek (ed.), Praguiana. Amsterdam: John Benjamins. 121-142. Matthiessen, Christian & M.A.K. Halliday, 1997. Systemic Functional Grammar: A First Step into the Theory. Macquarie University. McCarthy, John, 1979. On Stress and Syllabification. Linguistic Inquiry 10, 443-466. 
Miller, George A., Richard Beckwith, Christiane Fellbaum, Derek Gross & Katherine Miller, 1993. Introduction to WordNet: An On-line Lexical Database. Moulines, Eric & Francis Charpentier, 1990. Pitch Synchronous Waveform Processing Techniques for Text-to-Speech Synthesis Using Diphones. Speech Communication 9, 453-467. Müller, Christoph & Michael Strube, 2003. Multi-Level Annotation in MMAX. Proceedings 4th SIGdial Workshop on Discourse and Dialogue, Sapporo, Japan. Nakatani, Lloyd H. & Carletta H. Aston, 1978. Acoustic and Linguistic Factors in Stress perception. Unpublished manuscript, Bell Laboratories.


Niebuhr, Oliver, 2003. Perceptual Study of Timing Variables in F0 Peaks. Proceedings 15th ICPhS, Barcelona, 1225-1228. Nooteboom, Sieb G. & J.G. Kruyt, 1987. Accents, Focus Distribution, and the Perceived Distribution of Given and New Information: An Experiment. Journal of the Acoustical Society of America 82 (5), 1512-1524. O’Connor, J.D. & G.F. Arnold, 1973. Intonation of Colloquial English. London: Longman. Ohala, John J., 1983. Cross-Language Use of Pitch: An Ethological View. Phonetica 40, 1-18. Ohala, John J., 1984. An Ethological Perspective on Common Cross-Language Utilization of F0 of Voice. Phonetica 41, 1-16. Palmer, Harold, 1922. English Intonation, with Systematic Exercises. Cambridge: Heffer. Passonneau, R., 1996. Instructions for Applying Discourse Reference Annotation for Multiple Applications (DRAMA). Draft. Peters, Jörg, 2001. Frageintonation in der Pfalz. Eine Reanalyse des Güntherodt-Korpus. Ms., University of Potsdam. Peters, Jörg, 2002. Intonation und Fokus im Hamburgischen. Linguistische Berichte 189, 27-57. Peters, Jörg, 2004. Regionale Variation der Intonation des Deutschen. Studien zu ausgewählten Regionalsprachen. State doctorate thesis (Habilitationsschrift), University of Potsdam/University of Nijmegen. Pheby, John, 1980. Phonologie: Intonation (chapter 6). In: Heidolph et al. (eds.), Grundzüge einer deutschen Grammatik. Berlin: Akademie-Verlag. 839-897. Pierrehumbert, Janet B., 1980. The Phonetics and Phonology of English Intonation. PhD thesis, MIT. Bloomington: Indiana University Linguistics Club. Pierrehumbert, Janet B. & Mary E. Beckman, 1988. Japanese Tone Structure. Cambridge, Mass.: MIT Press. Pierrehumbert, Janet B. & Julia Hirschberg, 1990. The Meaning of Intonational Contours in the Interpretation of Discourse. In: P.R. Cohen, J. Morgan & M.E. Pollack (eds.), Intentions in Communication. Cambridge: MIT Press. 271-311. Pike, Kenneth L., 1945. The Intonation of American English. Ann Arbor: University of Michigan Press. 
Prevost, Scott, 1996. An Information Structural Approach to Spoken Language Generation. Proceedings 34th Annual ACL Meeting, Santa Cruz, 294-301. Prevost, Scott & Mark Steedman, 1994. Specifying Intonation from Context for Speech Synthesis. Speech Communication 15, 139-153. Prince, Ellen F., 1981. Toward a Taxonomy of Given-New Information. In: Peter Cole (ed.), Radical Pragmatics. New York: Academic Press. 223-256. Prince, Ellen F., 1992. The ZPG Letter: Subjects, Definiteness, and Information-status. In: Sandra A. Thompson & William C. Mann (eds.), Discourse Description: Diverse Analyses of a Fund Raising Text. Amsterdam: John Benjamins. 295-325.

Pulleyblank, D., 1986. Tone in Lexical Phonology. Dordrecht: Foris. Reinhart, Tanya, 1982. Pragmatics and Linguistics. An Analysis of Sentence Topics. Philosophica 27 (1), 53-94. Rietveld, Toni & Carlos Gussenhoven, 1985. On the Relation between Pitch Excursion Size and Prominence. Journal of Phonetics 13, 299-308. Rochemont, Michael, 1986. Focus in Generative Grammar. Amsterdam/Philadelphia: John Benjamins. Romanelli, Massimo, 2003. Modelling Givenness and Contrast in MARY. Ms., Saarland University. Rooth, Mats, 1992. A Theory of Focus Interpretation. Natural Language Semantics 1, 75-116. Sanford, Anthony J. & Simon C. Garrod, 1981. Understanding Written Language: Explorations of Comprehension beyond the Sentence. Chichester: John Wiley. Schröder, Marc, 2004. Speech and Emotion Research. An Overview of Research Frameworks and a Dimensional Approach to Emotional Speech Synthesis. PhD thesis, Phonus 7, Research Report of the Institute of Phonetics, Saarland University. Schröder, Marc & Jürgen Trouvain, 2001. The German Text-to-Speech Synthesis System MARY: A Tool for Research, Development and Teaching. Proceedings 4th Speech Synthesis Workshop, Pitlochry (Scotland), 131-136. Schwarzschild, Roger, 1997. Givenness and Optimal Focus. Ms., Rutgers University. Schwarzschild, Roger, 1999. GIVENness, Avoid F and other Constraints on the Placement of Focus. Natural Language Semantics 7 (2), 141-177. Selkirk, Elisabeth, 1984. Phonology and Syntax. The Relation between Sound and Structure. Cambridge, MA: MIT Press. Selkirk, Elisabeth, 1995. Sentence Prosody: Intonation, Stress, and Phrasing. In: John A. Goldsmith (ed.), The Handbook of Phonological Theory. Cambridge, MA/ Oxford, UK: Blackwell. 550-69. Selting, Margret, 1995. Prosodie im Gespräch. Aspekte einer interaktionalen Phonologie der Konversation. Tübingen: Niemeyer. Sgall, Petr, Eva Hajičova & Eva Benešova, 1973. Topic, Focus and Generative Semantics. Kronberg/Taunus: Scriptor. 
Shattuck-Hufnagel, Stefanie, Mari Ostendorf & K. Ross, 1994. Stress Shift and Early Pitch Accent Placement in Lexical Items in American English. Journal of Phonetics 22, 357-388. Sievers, Eduard, 1876. Grundzüge der Lautphysiologie. Zur Einführung in das Studium der Lautlehre der indogermanischen Sprachen. Leipzig: Breitkopf und Härtel. Silverman, Kim, 1987. The Structure and Processing of Fundamental Frequency Contours. PhD thesis, University of Cambridge. SPSS Inc., 1998. SPSS Base 9.0 for Windows User’s Guide. SPSS Inc., Chicago IL.


Stechow, Arnim von & Susanne Uhmann, 1986. Some Remarks on Focus Projection. In: Werner Abraham & S. de Meij (eds.), Topic, Focus, and Configurationality. Amsterdam: John Benjamins. Steedman, Mark, 1991. Structure and Intonation. Language 67, 260-296. Steedman, Mark, 2000. Information Structure and the Syntax-Phonology Interface. Linguistic Inquiry 31, 649-689. Swerts, Mark, René Collier & Jacques Terken, 1994. On the Prosodic Prediction of Discourse Finality in Spontaneous Monologues. Speech Communication 15, 79-90. Tannen, Deborah, 1979. What’s in a Frame? Surface Evidence for Underlying Expectations. In: Roy Freedle (ed.), Discourse processing: new directions. Norwood, NJ: Ablex. Teich, Elke, 2003. Cross-Linguistic Variation in System and Text. A Methodology for the Investigation of Translations and Comparable Texts. Berlin/New York: Mouton de Gruyter. Tench, Paul, 1990. The Roles of Intonation in English Discourse. New York: Peter Lang. Terken, Jacques & Julia Hirschberg, 1994. Deaccentuation of Words Representing ‘Given’ Information: Effects of Persistence of Grammatical Role and Surface Position. Language and Speech 37, 125-145. Trager, George L. & H.L. Smith, 1951. An Outline of English Structure. Norman, OK: Battenburg Press. Truckenbrodt, Hubert, 1995. Phonological Phrases: Their Relation to Syntax, Focus, and Prominence. PhD thesis, MIT. Published 1999 by MITWPL. Truckenbrodt, Hubert, 1999. On the Relation between Syntactic Phrases and Phonological Phrases. Linguistic Inquiry 30, 219-255. Uhmann, Susanne, 1991. Fokusphonologie. Eine Analyse deutscher Intonationskonturen im Rahmen der nicht-linearen Phonologie. Tübingen: Niemeyer. Umbach, Carla, 2001. (De)accenting Definite Descriptions. Theoretical Linguistics 27 (2-3), 251-280. Upperman, Gina, 2004. Changing Pitch with PSOLA for Voice Conversion. http://cnx.rice.edu/content/m12474/latest. Vallduví, Enric, 1992. The Informational Component. New York: Garland. Vallduví, Enric & Elisabet Engdahl, 1996. 
The Linguistic Realisation of Information Packaging. Linguistics 34, 459-519. Vanderslice, Ralph & Laura S. Pierson, 1967. Prosodic Features of Hawaiian English. Quarterly Journal of Speech 53, 156-166. Wahlster, Wolfgang, 2000. Verbmobil: Foundations of Speech-to-Speech Translation. New York/Berlin: Springer. Wells, William H.G., 1988. Focus in Spoken English. PhD thesis, University of York. Wichmann, Anne, 1991. Falls: Variability and Perceptual Effects. Proceedings 12th ICPhS, Aix-en-Provence, vol. 5, 194-197.


Wichmann, Anne, Jill House & Toni Rietveld, 2000. Discourse Constraints on F0 Peak Timing in English. In: Antonis Botinis (ed.), Intonation. Analysis, Modelling and Technology. Dordrecht: Kluwer Academic Publishers. 163-182. Williams, Edwin, 1976. Underlying Tone in Margi and Igbo. Linguistic Inquiry 7, 463-484. Wunderlich, Dieter, 1988. Der Ton macht die Melodie - Zur Phonologie der Intonation des Deutschen. In: Hans Altmann (ed.), Intonationsforschungen. Tübingen: Niemeyer. 1-40. Yule, George, 1980. Intonation and Givenness in Spoken Discourse. Studies in Language 4, 271-286. Yule, George, 1981. New, Current and Displaced Entity Reference. Lingua 55, 41-52.


Curriculum Vitae

Name: Stefan Baumann

Address: Niehler Strasse 62, 50733 Köln

Date and place of birth: 27 February 1969 in Kiel

Marital status: single

School education

08/1975-06/1979: Grund- und Hauptschule Suchsdorf, Kiel (primary and lower secondary school)
08/1979-06/1988: Ernst-Barlach-Gymnasium, Kiel; Abitur (grade: 2.0)

Military service

10/1988-12/1989: Stabskompanie Panzerbrigade 18 (headquarters company), rank of Obergefreiter, Neumünster

University studies

04/1990-09/1992: Initially German and English studies (teaching degree, secondary levels I+II), after two semesters additionally General Linguistics (Magister) at the Christian-Albrechts-Universität Kiel; intermediate examination (Zwischenprüfung) in German and English studies
10/1992-07/1993: DAAD scholarship for studies in linguistics at the University of Edinburgh, Scotland
10/1993-05/1998: German studies, English studies, General Linguistics (Magister/teaching degree, secondary levels I+II) at the University of Cologne
05/1998: State examination (Staatsexamen) in German and English (teaching degree, secondary levels I+II) at the University of Cologne (grade: 2.0)
02/2000-12/2005: Doctoral studies in phonetics and phonology at Saarland University, Saarbrücken (grade: summa cum laude)

Participation in research projects

10/2001-05/2004: NECA: A Net Environment for Embodied Emotional Conversational Agents (EU project IST-2000-28580)
05/2002-09/2003: MULI: Multilinguale Informationsstruktur (Multilingual Information Structure; funded by Saarland University, TG84)
07/2002-06/2003: GToBI-2: Überprüfung und Erweiterung eines Modells für die Intonation des Deutschen (Evaluation and Extension of a Model of German Intonation; funded by Saarland University, TG84)
since 03/2005: STRETTS: The Structure of Tonal Representations: Evidence from Tune-Text Synchronisation (DFG project GR 1610/2-1)

Employment during studies

04/1991-09/1992: Student assistant at the English Department of the Christian-Albrechts-Universität zu Kiel; independent work on a research project on child language acquisition
08/1994-09/1994: School teaching placement at the Humboldt-Gymnasium, Cologne
01/1995-05/1998: Student assistant in the university publications and exchange office of the University and City Library of Cologne
10/1998-01/2000: Freelancer in the monitoring service of Deutsche Welle, Cologne
02/2000-11/2000: Graduate research assistant (Wissenschaftliche Hilfskraft) at the Institute of Phonetics, Saarland University, Saarbrücken
12/2000-09/2004: Research associate (Wissenschaftlicher Mitarbeiter) at the Institute of Phonetics, Saarland University, Saarbrücken
10/2004-12/2004: Research associate at the Institut für Linguistik - Phonetik, University of Cologne
01/2005-02/2005: Graduate research assistant at the Institut für Linguistik - Phonetik, University of Cologne
since 03/2005: Research associate at the Institut für Linguistik - Phonetik, University of Cologne

Foreign languages

English: very good
French: basic knowledge
Finnish: basic knowledge
Latin: Latinum (Latin proficiency certificate)
