Computer Assisted Research on the Bible in the 21st Century 9781463216528

The application of computer technology to the edition and linguistic analysis of biblical texts has provided, in the las


English Pages 191 [185] Year 2010

Computer Assisted Research on the Bible in the 21st Century

Bible in Technology

Series Editor: Keith H. Reeves

The Bible in Technology (BIT) is a series that explores the intersection between biblical studies and computer technology.

Computer Assisted Research on the Bible in the 21st Century

Edited by Luis Vegas Montaner, Guadalupe Seijas de los Ríos-Zarzosa and Javier del Barco

Gorgias Press 2010

Gorgias Press LLC, River Road, Piscataway, NJ, USA
www.gorgiaspress.com
Copyright © by Gorgias Press LLC

All rights reserved under International and Pan-American Copyright Conventions. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, without the prior written permission of Gorgias Press LLC.

ISBN

ISSN

Library of Congress Cataloging-in-Publication Data
Computer assisted research on the Bible in the 21st century / [edited] by Luis Vegas Montaner, Guadalupe Seijas de los Ríos-Zarzosa, Javier del Barco.
p. cm. (Bible in technology, ISSN)
Proceedings of a conference held in June in San Lorenzo del Escorial, Spain.
Includes index.
Bible. Criticism, interpretation, etc. Data processing. I. Vegas Montaner, Luis. II. Seijas de los Ríos-Zarzosa, Guadalupe. III. Barco, Javier del.

Printed in the United States of America

TABLE OF CONTENTS

Foreword

Section I: General and Theoretical Approaches
  From Louvain-la-Neuve (1985) to El Escorial in Madrid (2008): 25 years of AIBI
    R.-Ferdinand Poswick
  Major Theoretical Issues from Two Decades of Bible and Computer Conferences
    Luis Vegas Montaner
  The Greek Pentateuch and the Library of Alexandria
    Natalio Fernández Marcos

Section II: Technological Developments of Biblical Tools
  Electronic Tools for Biblical Study at Home, at the University, and in the Classroom
    Emanuel Tov
  Displaying Hebrew and Aramaic on Handheld Devices that Lack Proper Complex Script Support
    Drayton C. Benner
  The Hexapla Project: Traditional Scholarship meets Modern Technology
    Elizabeth Robar

Section III: Computers and Linguistic Analysis of Biblical Texts
  The Challenge of Consistency
    A. Dean Forbes
  The Computer and Complex Phrase Structure: A Unified Approach to Embedding, Gapping and Recursion
    Janet W. Dyk
  Biblical Hebrew Linguistics as Corpus Linguistics
    C. H. J. van der Merwe
  The Series of Woes in Pre-Exilic Prophecy: A Computer-Assisted Study on Syntax and Semantics
    Javier del Barco, Guadalupe Seijas de los Ríos-Zarzosa

Index of Authors

FOREWORD

The last thirty years have seen significant technical achievements in the application of computer technology to the production of computerised editions and to the analysis of biblical texts. This development, however, still needs critical evaluation from a historical perspective. There is a need to evaluate the tools currently available for the computer-assisted study of the Bible, and to consider whether current trends in Bible research can satisfy the increasing demand for software for the linguistic analysis of the Bible. Such an evaluation requires consideration not only of technical achievements, but also of failed objectives and future challenges.

Establishing a discussion forum in which this critical evaluation could take place was precisely the aim with which we organised the international conference “Bible and Computers: The Present and Future of a Discipline”, which took place in San Lorenzo del Escorial, Madrid, on June 16–19, 2008. Following the example of the seven conferences organised until 2004 by the Association Internationale Bible et Informatique (AIBI), the Universidad Complutense was elected at Louvain in 2004 as the designated site for the following conference. This international conference was organised by the members of the research group AUTHOR (Análisis Unificado de Textos Hebreos por Ordenador) at the Universidad Complutense de Madrid, with the aid of an international scientific committee whose members were highly qualified specialists on the subject.

Computer Assisted Research on the Bible in the 21st Century is, therefore, the result of the papers presented and the discussions held at this conference. The selection of peer-reviewed papers included here shows the plurality of research lines currently being pursued in the application of computer technology to the analysis of the Bible. To begin with, rapid technical development makes it necessary to face technical problems which are constantly arising, not only in the presentation and use of databases and texts in different languages and alphabets, but also in the adaptation of these tools to different platforms, from the traditional PC to the most recent PDA in a permanently on-line environment. In addition, the development of these tools allows for a more effective approach to what has been a priority since computer technology was first applied to the study of the Bible: the linguistic analysis of the biblical text, always with the aim of better understanding the grammatical structure of the biblical languages and, consequently, the Bible itself. The development of both approaches has been uneven in the last few years; while in some cases the progress has been noteworthy and promising, in others expectations have not been completely fulfilled or have not been adequately adapted to the demands of specialised tools in the analysis of biblical texts.

On account of this, the first section of the present volume presents a historical approach to biblical research from both theoretical and practical points of view; the two following sections then consider in greater detail the developments in research technology and the application of computer tools to the linguistic analysis of the text of the Bible.

The first section consists of contributions of both general and theoretical character in which, from a comprehensive perspective, some particularly relevant aspects are discussed. F. Poswick (“From Louvain-la-Neuve (1985) to El Escorial in Madrid (2008): 25 Years of AIBI”) offers, from his privileged vantage point as Secretary of AIBI for many years, a brief summary of the most significant contributions of the AIBI conferences by focusing on the proceedings of each of them. L.
Vegas Montaner (“Major Theoretical Issues from Two Decades of Bible and Computer Conferences”) focuses on the history of research carried out within the framework of the AIBI from a theoretical and linguistic point of view, underlining the aims sought in the development of computer tools. N. Fernández Marcos (“The Greek Pentateuch and the Library of Alexandria”) presents a historical account of the study of the Greek text of the Bible. He draws a comparison between the present aspiration to a Universal Library, accessible to the general public by means of the Internet, and the Library of Alexandria, envisaged in its time as the universal temple of knowledge. Admittedly, Greco-Roman culture, as well as the Judeo-Christian influences upon it, both in Western Europe and in the Byzantine world and Eastern Europe, constitutes a significant part of what is known as Western culture.

The second section is dedicated to the technological development aimed at solving the technical problems already mentioned. The articles included in this section focus on the creation, development and application of computer programs and resources that facilitate the study of the text of the Bible from different perspectives. Indeed, from the very first AIBI conferences some fundamental questions related to technological developments were put forward: How should IT be applied to the Bible? To what extent are these technological developments used, and are they useful? Are programs really adapted to research needs or, on the contrary, does research advance by adapting itself to IT achievements? E. Tov (“Electronic Tools for Biblical Study at Home, at the University, and in the Classroom”) explains how the astonishing development of hardware in the last two decades has led to software development and its use by an increasing number of people. In contrast to the use of mainframes in the past, researchers and students today make use of databases and PC programs at home or at the university, with terminals connected to servers both on-line and in the classroom. Tov describes the various electronic resources developed during this time for the linguistic and textual study of Hebrew script, paying special attention to the CATSS database and the Accordance and SESB programs. D. Benner (“Displaying Hebrew and Aramaic on Handheld Devices That Lack Proper Complex Script Support”) analyses in turn how the technology for displaying complex writing systems has developed significantly in the last few years, thanks to the expansion of the Unicode standard and the development of smart fonts. In tandem, the quickly expanding business of mobile phones has increased the demand for Bible software.
These handheld devices, however, usually lack adequate support for complex scripts, and some do not even recognise Unicode. In his article, Benner presents a way of overcoming such limitations and achieving an aesthetically pleasing display of the Hebrew and Aramaic alphabets (thanks to the use of the SBL Hebrew and Ezra SIL fonts), with all the consonants, vowels, cantillation accents, Masoretic symbols and modern editorial marks required for displaying Hebrew and Aramaic documents, mainly the Bible. E. Robar (“The Hexapla Project: Traditional Scholarship Meets Modern Technology”) focuses on the historical and technological development of the Hexapla Project. This project started in 2001 with the aim of producing a critical, fully searchable on-line edition of the Hexaplaric fragments, in addition to the standard print edition. The author describes the development of the project, indicating the challenges and achievements of each step of the database construction. For instance, initial problems of compatibility in the use of ASCII fonts for Greek, Hebrew and Syriac were solved by the adoption of Unicode fonts. The initial design of the database in XML format gave way to a relational database, which offers greater flexibility in the complex task of managing critical apparatuses and, by means of a new interface, offers a user-friendly environment for the textual critic.

The articles in the third section focus on problems of linguistic analysis of the biblical texts, and on the technologies most suitable for solving them. From a methodological point of view, they all share the use of technology and IT (software, databases, platforms, etc.) as basic tools for the development of their research, both in the grammatical analysis of the texts and in the treatment of data. A. Dean Forbes (“The Challenge of Consistency”) analyses the sources of inconsistency in the representation of the linguistic structure of biblical Hebrew developed by himself and F. Andersen, given that human intervention in determining such a representation must necessarily be taken into account. The need to correct such sources of inconsistency leads the author to present a sample case based on the creation of what he has called text surrogates.
Such a strategy enables him to focus on those areas where inconsistency is higher, in the firm conviction that such mistakes will only be resolved by applying a higher level of automation in the representational process. J. Dyk (“The Computer and Complex Phrase Structure: A Unified Approach to Embedding, Gapping and Recursion”) questions the validity of syntactic analysis based on a bottom-up or top-down structural approach, since it is very difficult to integrate the expression of syntactic embedding or gapping into either. New, more adequate formulas that allow the encoding of such linguistic elements therefore need to be sought, since these cannot simply be integrated into a hierarchical structural design. Dyk accordingly proposes an analysis focused on basic units of grammatical expression, which, in her opinion, should be treated equally regardless of whether they are isolated or clustered in more complex structures. C. van der Merwe (“Biblical Hebrew Linguistics as Corpus Linguistics”), starting from a firm belief in the application of electronic tools to the linguistic analysis of biblical Hebrew, focuses on articulating a model better suited to describing particles in biblical Hebrew. He seeks to develop a well-articulated language theory that would allow an appropriate interpretation of quantitative data. He further develops the notion of corpus linguistics, as well as a user model of electronic tools that allows him to distinguish between prototypical and less prototypical uses of the particle אף, which establishes certain differences with the use of other particles. J. del Barco and G. Seijas de los Ríos-Zarzosa (“The Series of Woes in Pre-Exilic Prophecy: A Computer-Assisted Study on Syntax and Semantics”) present, for the volume’s conclusion, a linguistic study of woe oracles in which, using a database in which morphosyntactic information has been previously encoded, they combine the syntactic analysis of the texts with the use of stylistic and semantic elements typical of this sort of oracle. They thus try to discern a relationship between grammatical form and linguistic and literary function within a specific context.

Many other queries and problems related to the application of IT to the study of the Bible are yet to be explored.
With regard to the future, some challenges seem particularly relevant to the role to be assumed by technology in the development of linguistic research, such as the use of Wi-Fi technologies, the composition and accessibility of virtual libraries, the scope of the e-book format as a tool for study and analysis of biblical texts, the compatibility of different platforms used in different research centres, the optimization of Internet potential and on-line research, among other innovations. The future organization of discussion forums on the application of technology to the study of the Bible seems to us, therefore, even more necessary in this new century than it was in the previous one, and, for this reason, we hope that the present volume will encourage a reflection upon where we stand today and, above all, in which aspects the support of technology is most needed in order to achieve a better understanding of biblical texts as our collective research progresses. ***

The publication of the present volume has been made possible thanks to the support of various institutions which contributed financially to the organisation of the conference where the papers which constitute the present volume were presented. The Universidad Complutense de Madrid collaborated enthusiastically from the beginning in this venture and facilitated the organisation of the scientific meeting in El Escorial. We would especially like to thank the Departamento de Estudios Hebreos y Arameos for their support and assistance in various aspects of its organization. The Spanish Ministerio de Ciencia e Innovación has made possible both the organisation of the conference as well as the publication of the present volume by financing the research project AUTHOR (HUM2005-05747 and FFI2008-01120) and by the awarding of a special grant (HUM2007-30067-E). We would like to extend our gratitude to the Consejo Superior de Investigaciones Científicas (CSIC), an institution that also granted special aid for the organization of the conference, as well as to the Centro de Ciencias Humanas y Sociales (CCHS) of the CSIC, which also contributed financially. We would also like to thank the Asociación Española de Estudios Hebreos y Judíos for its assistance. Finally, we would like to express our deepest gratitude to each of the members of the scientific committee for their enthusiasm for this project from its outset and for their generous collaboration throughout the process. Madrid, November 2009

FROM LOUVAIN-LA-NEUVE (1985) TO EL ESCORIAL IN MADRID (2008): 25 YEARS OF AIBI

R.-FERDINAND POSWICK
INFORMATIQUE & BIBLE, ABBAYE DE MAREDSOUS, B-5537 DENÉE, BELGIQUE

The AIBI (Association Internationale Bible et Informatique, or International Bible & Computing Association) was created in 1982 in Belgium. Even though the first international conference organized by AIBI would not take place in Louvain-la-Neuve until September 1985, it was in 1982 that the 25-year history of the AIBI started. Article 2 of the incorporating document of this association, which is institutionally dormant at present, stated that it wished to promote the study of the use of “the automated processing and communication tools” as applied to the biblical field and that it hoped to “specify the best computing practices” in this field.

The following brief survey of AIBI conferences focuses on some significant technical developments that can be traced by reading the contributions published in the six books of Acts (see Bibliography). However, this survey is not intended to be a history of the main trends in research, nor a full account of all technical (hardware or software) developments described in these contributions. My purpose here is simply to remind readers where we as an institution have come from, as well as how circumstances used to be for us at various points over the last 25 years.

LOUVAIN-LA-NEUVE (1985)

Our scholarly international exchanges started at the dawn of ‘micro-computers’, today known as personal computers, which had the effect of compelling all the pioneering projects (often institutionally sanctioned and carried out on mainframe computers) to find ways of reshaping their whole work, with the simultaneous result that these projects entered into competition with individual projects, which quickly turned into commercial ones.

Jean Bajard registered 22 centers where computer science had been in use for 5, 10 or 15 years for application to the text of the Bible or to the interpretation of its contents (Acts-86, 83). Among others, the names of these centers are: Andersen-Forbes; CATAB (Lyon); CATSS; CNRS; CIB-Maredsous; The Computer Bible; GRAMCORD Project; The Michigan Project; MIKRAH; Responsa Project; University of Stellenbosch; The Way International; University of Tübingen; Werkgroep Informatica (Amsterdam); The WORD Processor… The 2008 conference at El Escorial in Madrid would bring together, among other participants, eight representatives of groups active since 1985.1

This quarter century of stability has seen the cultivation of expertise in writing and electronic processing tools, which were only in their fledgling stages in 1985. At that time, everyone had their own proposals for managing non-Latin characters, in spite of the limited resources offered by the ASCII or EBCDIC codes.

On that occasion, P. Chiaramella (Grenoble, Acts-86, 119) set out the key elements of a technological forecast. He talked about text editors changing from line editors to word processors, like the WORDSTAR software; he spoke highly of the mouse and of the floppy disk, while announcing the coming of Digital Optical Disks, non-erasable and burned by laser, which could store up to 500,000 A4 pages or, astonishingly, 500 Mb. He also hinted at Artificial Intelligence (AI), the processing of natural language, expert systems, as well as information retrieval, at a time when databases with fields, network databases and relational databases (INGRES, SQL) had started developing. Scanners were represented by the KURZWEIL system at that time (Acts-86, 326).

Nonetheless, enthusiasm was keen, and everybody was already trying to use an Apple II, a Macintosh, or an IBM PC-XT or AT. The first database on the Massoretic Hebrew text with multicriteria interrogation functions was called COMPUCORD of Mikrah (Maredsous and R. Bennum, New York, Acts-86, 409). It was run on an IBM PC-XT or AT. P. Miller’s database on the Greek of the New Testament was also available on an IBM PC (Acts-86, 409).

1 Notably: Emanuel Tov and the CATSS project (Hebrew University of Jerusalem and University of Pennsylvania); Dean Forbes and his projects on biblical Hebrew with Frank Andersen (thanks to Xerox Palo Alto tools); R.-Ferdinand Poswick (Informatique et Bible, Maredsous); Stellenbosch University (represented by W. Claassen and J. Cook at that time, and by Chr. Van der Merwe in Madrid); Bar-Ilan University of Tel Aviv (then represented by T. Radday, and by Yosef Ofer in Madrid); P. Miller’s GRAMCORD project (represented by the ACCORDANCE users in Madrid, among others); the Vrije Universiteit of Amsterdam (represented by Eep Talstra and Janet Dyk in Madrid); the Madrid Complutense University (with Luis Vegas Montaner).

JERUSALEM (1988)

John J. Hughes’ recent publication Bits, Bytes and Biblical Studies (Zondervan, 1987) was considered at that time to be a good assessment of all the work carried out so far.2 It was the time when the TUSTEP software suite of Tübingen University was being presented (W. Ott; W. Bader, Acts-89, 73 and 419). Notions of CD-ROM and hypertext also appeared then, and Microsoft Windows as well as UNIX operating systems were beginning to be heard of. The dBASE III software was used for managing databases; at the same time, apart from COBOL and FORTRAN, PASCAL and Turbo-PASCAL were the programming languages in use.

2 John J. Hughes presented his book at the Jerusalem AIBI Conference. It would be superseded by the work of Ian Lancashire (who attended the ALLC-AIBI Conference in Jerusalem), The Humanities Computing Yearbook 1989–90, Clarendon Press, Oxford, 1991.

The IBM PC-AT hard disks had a 30 to 40 MB storage capacity for data while the Random Access Memory (RAM) was about 640 Kb. Yet the making of a CD-ROM still represented a considerable investment (US$ 4,000 at least), without any guarantee that one could find the appropriate access software (R. Kraft, Acts-89, 322). Graphic cards and hypercards were also being talked about for images and sound. But the problem which one was always facing was the difficulty of managing and transferring non-Latin characters between various platforms and in ASCII codes. Among various innovations, worthy of mention are the electronic/telephone telecommunication link established between the French videotext (TELETEL, Minitel) all the way to Jerusalem (R.-F. Poswick, Y. Juste, Acts-89, 490) and the Smart Book (‘The World’s first portable electronic Book’) presented by Tom Tresseder (Australia, Acts-89, 581).

TÜBINGEN (1991)

I&B (J. Bajard, R.-F. Poswick, Acts-92, 263) described its way of managing large databases (up to 285 Mb) while maintaining compatibility between mainframes (IBM 360 and subsequent models), minicomputers (IBM 1 Series) and PCs by storing data on hard disks with a capacity of between 20 and 200 Mb and by managing storage on 6250 BPI magnetic tapes, giving access through the mainframes to the IBM 3800 laser printing facilities (notably the 18,603 pages of the TOB concordance listings) or to the Siemens Hell Digiset phototypesetting facilities of the Paris National Printing Office.

The Text Encoding Initiative (TEI) and T. Brunner’s Thesaurus Linguae Graecae (TLG), the large database of the University of California, Irvine, were first being spoken of at this time. The Vrije Universiteit of Amsterdam developed the first version of the QUEST software (Harmsen, Acts-92, 319) by programming on UNIX platforms in C, PASCAL and PROLOG, and with other tools. The first biblical libraries stored on CD-ROMs with search programs, such as Robin Cover’s CD-Word (Acts-92, 592), were also being examined. SGML (cf. TEI) and UNICODE “which could replace various ASCII codifications within three to four years” (Pecoud, Acts-92, 351) were explicitly being talked about for the first time.

Philips’ CD-I was in competition with IBM and INTEL’s DVI (Acts-95, 590) for storing mass electronic data. ‘Object directed’ programming had become popular and the limits of research work on Artificial Intelligence (AI) (Berleur, Acts-92, 377) were better assessed. After a survey carried out by AIBI among its users, J. Berleur’s contribution produced interesting data on electronic tools used at that time by people who worked in the field of the biblical texts: 33% PC’s; 25% PC’s in relation with main frames; 15% MacIntosh; 13% PC and MacIntosh used in tandem. On the use of word processing systems: 18% for Word, 14% for WordPerfect, 4% for NotaBene, 4% for Nisus and at least about ten other software programs which have been forgotten long since. … or also the more widespread use of dBASE among about twenty other database management systems!

AMSTERDAM (1994)

In his inaugural speech, Eep Talstra (Acts-95, 25) alluded mainly to the following advances in technology: the possibility of submitting work over a network to a colleague by FTP (File Transfer Protocol), the significant increase in RAM, the speed of the CPU (Central Processing Unit), the use of PostScript for finalizing files with non-Latin fonts and for having them printed on fast laser printers, and the possibility of working in Windows.

The increase in the number of various achievements, such as the encoding of tasks and databases on New Testament Greek, allowed for a first critical assessment (H. Hahne, Acts-95, 223) of GRAMCORD (4.04); Bible Windows (2.6); Bible Works (2.2.2.); The Word (3.05); Logos (1.68); and also ACCORDANCE (GRAMCORD adapted for the Macintosh environment).

HTML (Acts-95, 107, 437, 471) and the management of digital images in GIF, JPEG or TIFF were first being mentioned; electronic addresses of the http://www… type were starting to be talked about: the Web itself was coming into being.

AIX-EN-PROVENCE (1997)

E. Nida announced that the next challenge for the use of electronics in the biblical field would be the use of multimedia (Acts-98, 27).

Many people already used Microsoft-Access–2 or were programming in Visual Basic. On top of some Web sites, participants were also mentioning their electronic address (e-mail address).

STELLENBOSCH (2000)

An electronic newspaper, produced and managed ‘on line’ (A. Adair, Acts-02, 135), was presented and the maintenance of ‘Web sites’ was alluded to. Moreover, the use of dBASE III and IV was more widespread for different projects in the field of biblical studies. But it was thought (J. Lust, Acts-02, 365) that ACCORDANCE was probably the best software for studying the Septuagint.

David Trobisch (Acts-02, 427) proposed to create a central database on the Internet for circulating all the New Testament manuscripts, while several Web sites were already offering papyri images, accessible on line and in color (K. De Troyer, Acts-02, 573). It thus seemed that we were heading for the Network Society (Swart, Acts-02, 593) and towards online assisted teaching, notably for the biblical languages (Chr. Van der Merwe, Acts-02, 615), or teaching with the help of PowerPoint presentations (P. Flint, Acts-02, 323).

MADRID (2008)

Because the contributions made in 2004 in Louvain had not been published, we had to go straight to those presented in 2008, where we were obliged to note that the CATSS project was still there and represented by E. Tov, who praised its implementation in ACCORDANCE. The other flagship product was the commercial set proposed by the Biblical Society of Stuttgart (SESB); it was a comprehensive set of databases, including texts (original or translations), lexica, dictionaries, search functions, sophisticated display and various other functions, all of which were operational on the basis of the LOGOS Libronix Digital Library System. UNICODE and XML were also being talked about, as well as access to non-Latin characters on Windows Mobile or on PALM.

The Hexapla Project (Louisville, Kentucky, USA) proposed a methodology for creating this new critical edition of Origen’s Hexapla, based entirely on cooperative effort through the Internet, an enterprise carefully vetted under a strict hierarchy.

CONCLUSION

The problems initially encountered for encoding non-Latin characters seem to have been almost entirely solved thanks to the launch of UNICODE. One should nevertheless be aware of the pitfalls of implementing UNICODE: transfer from ASCII to UNICODE, different types of UNICODE implementation or representation! Initial problems with random access memory on microcomputers or mass storage spaces also seem to have been overcome to a large extent.

Little has been said here, however, about the prospects of medium or long term conservation of all work already done. What will be left in 10 or 20 years? The transition to ‘Flash’ memories and USB keys has been little discussed, even though this does point the way forward by eradicating the mechanical problems still linked to disk rotation. The huge volumes demanded by electronic imagery, necessary for providing access to original manuscripts to the great majority of people, require memories which will be expressed in TERABYTES (Tb) and no longer in gigabytes (Gb); a further prospect is the role of large computers in hosting the world networks of data exchanges. Who actually will hold the data on the Internet, and in what kind of a cooperative project?

The electronic book (e-book) is beginning to proliferate in various forms and will be in everybody’s hands before long; its prototype was called Smart-Book and debuted in Jerusalem in 1988, some 20 years ago. Nevertheless, the more advanced versions (UMTS) of the mobile (cellular) phone will increasingly become the unique writing and reading tool for a culture based on electronic writing.

Data structuring (SGML, TEI, HTML, XML), intended to guarantee the independence of data from software while keeping them usable in different environments, is only starting. Does the prior ‘non-tagged’ work run the risk of being written off if it is not transposed into these formats?
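The pitfall of “different types of UNICODE implementation or representation” noted in this conclusion can be made concrete with a small sketch. It is an illustration only, not taken from any of the projects surveyed: the sample letter and the helper name `canonical` are assumptions chosen for the example. It shows one pointed Hebrew letter stored with its combining marks in two different orders, which compare as unequal until both are brought to a Unicode normalization form, and the same text occupying a different number of bytes under two encodings.

```python
import unicodedata

# One pointed Hebrew letter, stored two ways: the combining marks
# (shin dot and qamats) are simply entered in a different order.
typed_a = "\u05E9\u05C1\u05B8"  # shin, shin dot, qamats
typed_b = "\u05E9\u05B8\u05C1"  # shin, qamats, shin dot


def canonical(text: str) -> str:
    """Return the NFC form, which puts combining marks in canonical order.

    `canonical` is a hypothetical helper name used only for this sketch.
    """
    return unicodedata.normalize("NFC", text)


# A raw comparison treats the two spellings as different strings.
print(typed_a == typed_b)

# After normalization the two spellings compare as equal.
print(canonical(typed_a) == canonical(typed_b))

# The same text also has different byte lengths in different encodings.
print(len(typed_a.encode("utf-8")), len(typed_a.encode("utf-32-le")))
```

A search tool that compares stored text without such a normalization step could miss matches between sources that were keyed or converted differently, which is presumably one of the risks the author has in mind when moving legacy data to UNICODE.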

Despite its limited circulation among graphic professions, one can wonder about the persistence of the biblical applications on MacIntosh thanks to a better management of graphic aspects (notably the non-Latin characters) and to a higher usage which leaves searchers little reason to invest too much in learning about information technology. It is amazing how widespread the acceptance is of Windows and of Microsoft products, without any notice or attention paid to free market products of the LINUX type. The extensive and critical use of statistics applied to texts has not yet found its place in the realm of the automated search in the field of biblical texts. But promising projects do exist, of which some are already quite advanced. Their objective is to make available to searchers all the sources they need for their critical work: manuscripts on any kind of medium (clay, stone, bronze, papyrus, parchment, paper); dictionaries or encyclopedias with hypertext links; various auxiliary tools to be used in geography or chronology; and, soon, the most important monographs and commentaries. The collectivization and sharing of research in the field of biblical studies is on the inexorable move. But the question now is whether the content of these texts, which are the object of the research, still has an impact on the social fabric of which the searchers are part. This is a hermeneutic and epistemological question, which had in fact been asked before at the 1991 AIBI Conference in Tübingen, but which the hubbub caused by technological evolution could possibly keep in obscurity.

BIBLIOGRAPHY
(Acts-86): Actes ... Proceedings of the First International Colloquium Bible and Computer: the Text, Louvain-la-Neuve (Belgique), 2–3–4 Septembre 1985. Paris/Genève: Champion/Slatkine, 1986.
(Acts-89): Actes ... Proceedings of the Second International Colloquium Bible and Computer: Methods, Tools, Results, Jérusalem, 9–13 Juin 1988. Paris/Genève: Champion/Slatkine, 1989.
(Acts-92): Actes ... Proceedings of the Third International Colloquium Bible and Computer: Interpretation, Hermeneutics, Expertise, Tübingen, 26–30 August 1991. Paris/Genève: Champion/Slatkine, 1992.

(Acts-95): Actes ... Proceedings of the Fourth International Colloquium Bible and Computer: The impact of computers on Bible studies, Amsterdam, 15–18 August 1994. Paris: Champion, 1995.
(Acts-98): Actes ... Proceedings of the Fifth International Colloquium Bible and Computer: Translation and Transmission, Aix-en-Provence, 1–4 Septembre 1997. Paris: Champion, 1998.
(Acts-02): Bible and Computer. The Stellenbosch AIBI-6 Conference. Proceedings of the Association Internationale Bible et Informatique “From Alpha to Byte”. University of Stellenbosch, 17–21 July 2000, ed. Johann Cook. Leiden/Boston: Brill, 2002.

MAJOR THEORETICAL ISSUES FROM TWO DECADES OF BIBLE AND COMPUTER CONFERENCES
LUIS VEGAS MONTANER
UNIVERSIDAD COMPLUTENSE, MADRID, SPAIN

After years of development of computer assisted studies on the Bible in an expanding number of research centers, it was inevitable that an international academic forum be created for the free exchange of methods, results and ideas among scholars. This was done in 1985 with the first Bible and Computer conference. As a complement to F. Poswick’s technical survey of this series of conferences, we shall present here a panoramic view of some major theoretical approaches in the AIBI conferences.

TEXTS AND VERSIONS
The main interest from the earliest Bible and Computer conferences was the discussion and exchange of experiences with regard to the use of computers in the study of biblical texts, as well as solving, among other problems, the visualization of non-Latin alphabets and their standard codification. The first conference in Louvain-la-Neuve, under the general title of “The Text”, concentrated mainly on the discussion of several methodological and technical questions on the different texts of the Bible, with a description of various automatized databases on biblical texts. Along the same lines, at the Jerusalem conference, under the theme “Methods, Tools, Results”, a great number of tools were discussed and shown in operation. As would be presented at subsequent conferences, increasingly powerful technologies were demonstrated for their possibilities of application in the field of Bible studies. In addition to reports on

mainframe databases, various applications on micro-computers were displayed, with software for multi-character printing and processing of texts. The Hebrew text of the Old Testament has always received primary attention, and the study of Biblical Hebrew has remained an essential topic for all the conferences and has been the object of morphological, syntactic and lexical analyses. The manuscript tradition and the orthographical features of the Masoretic text, as well as its accentual system, are particular points which have been stressed in several conferences. The text of the Septuagint has also been the object of particular interest from several points of view. The relationship between the Hebrew text of the Bible and its Greek translations has received due attention, showing how scholars can benefit from complete automatized data and programs increasingly suited to textual criticism and linguistic, lexical, and literary analysis. Different kinds of research on other ancient versions of the Bible, such as the Aramaic Targum and the Syriac Peshitta, have also been presented over the years. In terms of methodology, it has become clear that work based on quantitative analysis has remained one of the main approaches to computerized text processing. The statistical method was used in lexical and stylistic approaches to the Hebrew Bible, the Septuagint, the Greek version of the New Testament, patristic writings, and also to much younger biblical corpora in modern languages. In contrast to the high number of contributions to the study of the Old Testament, the number of papers presented on the New Testament is significantly lower, and they are scattered throughout the history of the conferences. In two of them (Amsterdam and Stellenbosch), there was one complete session dedicated to computer research in the area of the New Testament texts. The majority of computer projects on the New Testament concentrate on philology (e.g. questions of manuscripts and textual transmission, grammar, vocabulary and authorship), where important tools for scholarly research have emerged as a result. Computer aided research on the language of the New Testament does not appear to have followed the same lines as Old Testament studies. For example, computer aided New Testament projects with a syntactic focus are lacking, despite the fact that a morphologically tagged text has been available for years. Some

discussions did centre on the reasons for these differences between Old Testament and New Testament studies and posed various methodological questions related to problems specific to the New Testament. The main concern of the first AIBI conferences was the two Old Testament texts then available in computerized editions accepted by the majority of researchers, the Hebrew Masoretic text and the Greek Septuagint. Subsequently, other ancient versions and new translations of the Bible became available in electronic form, enabling research in these areas to develop. A focus on “Translation and Transmission” guided the organization of the Aix-en-Provence conference. There E. Nida, drawing on his long expertise in linguistics and the use of computers to explore the theory of dynamic translation, suggested in his keynote speech that the next challenge was to master the various elements constitutive of transmission in order to be able to translate the Bible tradition into electronic multimedia. It was remarked that the analysis of existing translations, both ancient and new, allows us to gain a better grasp of the languages which convey, both in the past and today, biblical texts. This analysis also enables us to observe how editors of these translations have understood their source texts and whether or not they have incorporated outside elements which provide additional valuable testimony. The techniques used in translating the Bible permit an understanding of how the transmission from the source text to the target text is undertaken and to what extent we have a new text or simply a copy of a text in another language. The application of different textual syntax approaches to Hebrew was also considered. Proper textual division, as well as the value assigned to the elements which make up the text and its sentences, will necessarily affect the correct translation of the Hebrew Bible. The term “Transmission” includes everything surrounding the biblical text. 
These outside sources can differ greatly in type: graphic layout, dictionaries, comparisons between languages, external literature (such as Ugarit and Qumran), and traditional or modern commentaries of various origins, etc.

HERMENEUTICS
After having dealt mainly with technical questions and methods and tools for the analysis of the biblical texts in the first two

conferences, the Tübingen congress focused on “Interpretation, Hermeneutics, Expertise”, bringing “meaning” to our attention: What is the functionality of the “text”, and what in a text brings meaning? Several contributions centred on hermeneutic questions, asking whether theoretical insights and grammatical conceptions can be transformed into a practical methodology. It was stressed that a clear definition of the actual object of our research on the text was key to knowing how to read it (textual criticism), to delimit it (literary/redactional criticism), or to understand it (dictionaries, grammatical solutions). All these preliminary stages led to a reflection on the interpretation of specific texts, with the analysis of their expressions (Syntax), their content (Semantics) and their use (Pragmatics). It became clear that the discussion about the contribution of computers to biblical studies is almost identical to a debate about the different methods used in textual research. As a matter of fact, biblical scholars want their computers to do more than sort and concord data. They want to use the computer to investigate the possibilities of formalizing an exegetical methodology and to assist them in experimenting with different methods of biblical exegesis. The Amsterdam conference further developed these methodological approaches to biblical exegesis under the general rubric “Desk and Discipline”. What help can computers give in the process of analyzing and interpreting biblical texts? Does the computer help us at our desk by replacing our books (dictionaries, text editions) and files (notes on cards), or does the computer also reorganize our discipline by influencing its methodology, by changing or making explicit the order and the type of our linguistic and literary argumentation? 
As Talstra remarked, “modelling” is the key concept, and the computer can be seen to make this kind of contribution to biblical studies at three different levels:
—at the level of viewing scientific texts, computers are used to imitate the raw data and documents from which we begin our research: text processors used to handle different character sets on screen or on paper, scanners employed to digitize documents of ancient texts, etc.
—at the level of preparing scientific results, computers can imitate the more classical concordance type reference volumes for

searching and sorting linguistic material. The researcher is therefore provided with prepared materials, not with machine-made analytical proposals.
—at the level of scientific argumentation, software can mimic our reasoning process in reading texts. The user can receive machine-made analyses for such diverse tasks as parsing and labeling clauses, analyzing text-syntactic or text-semantic structures, or identifying the actants in the text.
Although the impact of computing on the whole process of interpreting biblical texts cannot be ignored, it was stressed that not all the problems had received clear answers yet. A number of well-known tensions continued to exist: top-down versus bottom-up, formal linguistics versus semantics, literary theory versus linguistic theory. Does “form” come before “function” and “grammatical function” before “lexical information”? How does one perform both paradigmatic research (looking for one linguistic feature in the entire corpus) and syntagmatic research (analyzing all the different linguistic features and their interactions within one textual composition)? Along the same lines, the subsequent conferences considered the effect of computer directed analysis on the methodology of research on the Bible and its exegesis, with special attention to hypertext and online biblical studies.
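
The contrast between paradigmatic and syntagmatic research raised above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the toy corpus, the text identifiers, the lemma placeholders and the function names `paradigmatic` and `syntagmatic` are all invented here and do not reproduce any database presented at the conferences.

```python
from collections import Counter

# Toy annotated corpus; every value is invented for illustration.
# Each token is (text_id, position, lemma, part_of_speech, verb_form).
CORPUS = [
    ("text_A", 0, "lemma1", "verb", "wayyiqtol"),
    ("text_A", 1, "lemma2", "noun", None),
    ("text_A", 2, "lemma3", "verb", "qatal"),
    ("text_B", 0, "lemma4", "verb", "wayyiqtol"),
    ("text_B", 1, "lemma5", "prep", None),
]


def paradigmatic(corpus, verb_form):
    """Look for ONE feature in the WHOLE corpus: where does this verb form occur?"""
    return [(text, pos) for (text, pos, _lemma, _pos_tag, form) in corpus
            if form == verb_form]


def syntagmatic(corpus, text_id):
    """Look at ALL tokens of ONE text in order, and count adjacent tag pairs."""
    tokens = sorted((tok for tok in corpus if tok[0] == text_id),
                    key=lambda tok: tok[1])
    pos_sequence = [tok[3] for tok in tokens]
    # Adjacent pairs give a very crude picture of how features interact.
    transitions = Counter(zip(pos_sequence, pos_sequence[1:]))
    return pos_sequence, transitions
```

The first function cuts across all texts for a single feature, which is what a concordance does; the second stays inside one composition and keeps the order of its elements, which is the minimum a text-level analysis needs.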

HEBREW SYNTAX
Since a text is never simply a combination of isolated linguistic items, the linguistic information generated by the structure of the text itself is of a different type from the paradigmatic linguistic information dealt with by existing programs for searching and concording. Although existing databases of biblical texts are important tools for textual criticism and for the comparison of manuscripts and translations, and although relevant contributions on morphological and lexical research were made in all the conferences, biblical scholars also require computer-assisted research to contribute to the study of syntagmatic features of higher linguistic levels. They want the computer to define clause hierarchies as well as clause types and, beyond a mere search for the frequency of a verbal form, they want to detect the linguistic markers which indicate, for example, the shift between narrative and discursive discourse or between foreground and background textual sections. This need for greater

syntagmatic definition and analysis will lead to the production of programs able to explore the syntactic structure of biblical texts. Of great importance from one congress to the next are the syntactical studies of Biblical Hebrew, with different points of focus. To have at our disposal a Hebrew Bible which has been syntactically codified for computer analysis is of capital importance, since the use of existing information on, for example, verbal morphology permits only partial syntactic studies. The creation of databases of syntactic content and of computer programmes which are able to separate the different textual segments (textual “analysis” in the strict sense of the term) is therefore indispensable. It is also advisable to pay attention not only to the values and functions of the components of phrases and sentences, but also to their relationship with the cantillation handed down by the Masoretic tradition. Furthermore, the application to Hebrew of the varied theories of textual syntax has to be taken into consideration. Classical Hebrew grammar pays much more attention to the description and interpretation of individual forms than to the study of syntagmatic relations. A methodological change has occurred through the use of the computer in Hebrew linguistic studies and through the development of a computer-oriented linguistic methodology, especially in the field of computer-assisted analysis of Biblical Hebrew syntax. Two important methodological concepts are being used: a “rule-based” approach and a “data-based” approach. In the first, scholars let the computer select linguistic data on the basis of conventional grammatical rules, while in the second approach the computer compiles new linguistic data by comparing “raw” linguistic materials with sets of previously analyzed data in order to propose new grammatical analyses (on the basis of the statistics of the previously analyzed data). 
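
A minimal sketch may help to fix the idea of the “data-based” approach just described. It assumes, purely for illustration, that previously analyzed data take the form of part-of-speech patterns paired with a phrase label; the sample data, the label names and the functions `train` and `propose` are hypothetical and are not taken from any of the projects discussed here.

```python
from collections import Counter, defaultdict

# Hypothetical, previously analysed data: a pattern of part-of-speech tags
# paired with the phrase label that a human analyst assigned to it.
ANALYSED = [
    (("prep", "noun"), "prepositional phrase"),
    (("prep", "noun"), "prepositional phrase"),
    (("noun", "noun"), "construct chain"),
    (("noun", "noun"), "apposition"),
    (("noun", "noun"), "construct chain"),
]


def train(analysed):
    """Count how often each tag pattern received each label."""
    counts = defaultdict(Counter)
    for pattern, label in analysed:
        counts[pattern][label] += 1
    return counts


def propose(counts, pattern):
    """Propose the most frequent earlier label, with its relative frequency.

    Returns None for a pattern never seen before, so that the decision
    stays with the analyst instead of being guessed by the program.
    """
    if pattern not in counts:
        return None
    label, hits = counts[pattern].most_common(1)[0]
    return label, hits / sum(counts[pattern].values())
```

A rule-based program would instead encode a grammarian's statement directly (for example, "a preposition followed by a noun forms a prepositional phrase"); the sketch above has no such rule and derives its proposal only from the statistics of earlier decisions, which the analyst can then accept or correct.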
The data-oriented approach may be most fruitful in the research of phrase-level and clause-level data, where the formulation of grammatical rules is much more complicated than on the level of morphological analysis. Until now the computer has been used primarily to retrieve word-level information, such as grammatical and lexical features. In order to analyze texts we must use the computer to organize and retrieve syntagmatic features (i.e. linguistic information on all levels within

the context of a textual composition such as word connections, clause types, and textual hierarchy). The research carried out by the Werkgroep Informatica, Faculty of Theology, VUA, under the direction of E. Talstra, belongs in this context; it aims at the development of a syntactically analyzed machine-readable database of the Hebrew Bible. According to Talstra, developing text-analyzing programs forces one to rethink how grammar should be approached. It would be better to use computer programming first to make an inventory of the text-level linguistic features we need to understand. This means that text-level computer-assisted analysis would be most effectively accomplished by starting with the data-oriented approach rather than the rule-based approach. Computer-assisted research at the text-level would therefore be a tool for grammatical research, not merely the result of grammatical theorizing. Some results taken from the book of Genesis were generated by programs which identify clauses and construct clause hierarchies and textual structures. The more complex task of identifying grammatical clause connections and markers of textual hierarchy (e.g. the beginning and end of direct speech) requires a similar combination of rule-based and data-based methods. Over the course of the AIBI conferences Talstra and his team have presented results of their ongoing research on the relationship of grammar and encoding towards the building of a hierarchically structured database and the production of a syntactically orientated concordance of biblical Hebrew texts; text grammar and complex linguistic constructions: the hierarchical nature of embeddings; the Hebrew participle; how a discourse analysis and a computer-assisted linguistic analysis could add to traditional exegetical approaches. The group formed by A. D. Forbes (Palo Alto, USA) and F. I. Andersen (Melbourne, Australia) has also dealt with the topic of biblical syntax throughout various AIBI conferences, tailoring their presentations to the running theme of each of them. Their project, which began in 1970 with the transcription of the vocalized text of the Hebrew Bible into machine-readable forms, had a main focus on analysing the text into units (“segments” of text) suited for grammatical study. When addressing the methods and tools for the study of Old Testament syntax, they gave priority to the

probabilistic approach over the rule-based approach. This latter approach places heavy emphasis on dominance relations and lends itself to the study of sentence deep structure, whereas the former focuses exclusively on precedence relations and seeks only to describe sentence surface structure. They developed two probabilistic models of sentence surface structure for the grammatical categories incidence and ordering with the aid of the S statistical analysis package. Another important issue is the marking of clause boundaries, for which the theoretical problems involved have been described and some practical solutions proposed. Problems of recursion and embedded subordinate clauses, as well as missing clause components, contribute to ambiguities in the computer analysis of the texts. Indeterminacy is very evident in Hebrew clause boundaries, and scribal practice does not have any clear signals of clause termination. By contrast, Hebrew does have some very clear signals of clause beginning, which makes it possible to propose several clause-onset rules as an aid for automatic text division. Syntactic ambiguity in the Hebrew Bible has been one of the main issues dealt with by these researchers. Computer parsing is one approach that might be able to detect ambiguity in texts in an unbiased manner. Several kinds of ambiguity and methods for discovering and representing it, as well as a standard means for resolving it, have been discussed. Prepositional-phrase attachment ambiguity received particular attention. As a corollary to the use of electronic syntactic databases already built, Forbes and Andersen directed their efforts towards clause-type concordances of the Bible. A concordance of clauses organized with reference to their syntactic structures and which reports the phrase markers of the text should serve the theoretical interests of text linguistics and the practical tasks of translation and interpretation in the broadest possible ways. 
The systematic and disciplined application of agreed principles and procedures to the syntactic analysis and description of the grammatical structure(s) of all the clauses of the Hebrew Bible can produce parse graphics, the comparison of which will permit the isolation for comparative study of all clauses with identical structure, while the accessibility of substructures within the graphs will enable the comparison of clauses that resemble one another only partially.
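
The idea of a clause-type concordance can be sketched in a few lines. This is a deliberate simplification and not the Andersen-Forbes implementation: clause structures are reduced here to flat sequences of constituent labels instead of parse graphs, and the references, labels and function names (`concordance`, `contains`, `partial_matches`) are invented for the illustration.

```python
from collections import defaultdict

# Hypothetical clause records: (reference, sequence of constituent labels).
# References and labels are invented; real data would come from parse graphs.
CLAUSES = [
    ("ref 1", ("conj", "verb", "subject", "object")),
    ("ref 2", ("conj", "verb", "subject")),
    ("ref 3", ("conj", "verb", "subject", "object")),
    ("ref 4", ("subject", "predicate")),
]


def concordance(clauses):
    """Group clause references under their complete constituent sequence."""
    index = defaultdict(list)
    for ref, structure in clauses:
        index[structure].append(ref)
    return dict(index)


def contains(structure, sub):
    """True if `sub` occurs as a contiguous run inside `structure`."""
    n = len(sub)
    return any(structure[i:i + n] == sub
               for i in range(len(structure) - n + 1))


def partial_matches(clauses, sub):
    """References of all clauses that share the substructure `sub`."""
    return [ref for ref, structure in clauses if contains(structure, sub)]
```

Looking up a complete sequence in `concordance(CLAUSES)` isolates the clauses with identical structure, while `partial_matches(CLAUSES, ("verb", "subject"))` retrieves clauses that resemble one another only in part. With real parse graphs the second step would need subtree matching in place of this contiguous-run test.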

Mention should also be made of Forbes’s presentation in Leuven on the topic “Hebrew Grammar Visualised: Syntax and Discourse”. Another, more recent group working on Biblical Hebrew syntax is based at the Universidad Complutense. Its director, L. Vegas Montaner, was present at the first AIBI conference, and other members of the research group, such as G. Seijas and J. del Barco, have taken an increasing part in successive conferences. In the framework of the project AUTHOR morpho-syntactical databases of biblical poetry (Isaiah, Ezekiel, Minor Prophets, Psalms) have been created in Madrid. Related software, which allows flexible and complex searches in order to detect the identity and intensity of the phenomena, has also been designed. According to the theoretical principles of text-linguistic approaches, poetic texts have been analyzed taking into consideration clause structure (word order and position of the verb within the clause), particles with a macro-syntactic function and the use of verbal forms within a context broader than the sentence. A description of the database features and a case study of the mutual relationship between imperfect and consecutive perfect tenses in Proto-Isaiah were presented in Amsterdam, paying due attention to biblical parallelism. The distinction between different kinds of discourse has guided the contributions to the AIBI conferences that followed. An analysis of the perfect tense in discursive texts was carried out on Isaiah and Psalms, taking into account some important aspects, such as the word order of the constituents in related clauses and the Masoretic notation of verse structures, this last aspect being important in the process of automating clause division. A similar methodology, applied to the discourse analysis of the preformative conjugation in Isaiah, Minor Prophets and Psalms, was presented in Leuven. The search for a computer-assisted classification of discourse types in poetic biblical texts led to the analysis of the consecutive imperfect and the interrogative clauses in Isaiah (ch. 1–35), which revealed different syntactic behaviour according to the type of text. A study on Amos described some specific features at two different levels of text-markers: those working as a boundary between two types of discourse, and the development of these text-markers into an independent type of discourse at the same level of

the two sections that they separate. In the analysis of the narrative and discursive texts of Psalms, the syntactic differences considered concern mainly verbs, word order, and grammatical person.

ANALYSIS OF SPECIFIC TEXTS
Under the premise that computer-assisted methods will only reach the majority of biblical scholars if they are able to become a tool for text-level analysis, it was deemed important to devote time to the analysis of specific chapters from the Bible. As a consequence, one section of the Tübingen conference was dedicated to the analysis of Genesis 22, Lamentations 3 and 5, and 2 Samuel 12 in some linguistic respects (e.g. formulas, actants, dialogues, sentence connections, modalities), with a description of their syntactic structures (discourse analysis) and a reflection on the contribution of computerized linguistic analysis to textual interpretation. A methodological section concentrated on the debate about different grammatical approaches on the basis of syntactic data, paying special attention to the definition of a clause and how to mark clausal boundaries with the help of a computer. Also, the Amsterdam congress had a session with papers representing different perspectives in the area of clause segmentation and clause analysis in a single chapter from the Hebrew Bible (Genesis 39 and Deuteronomy 8 were chosen for the occasion). In order to facilitate a comparison of the different approaches, the common structure of each paper was as follows:
—Overview of the approach: the textual basis for the application, with indication of which machine-readable text and which morphologically analyzed text were employed.
—Overview of the clause segmentation process: its kind (automatic, manual or a combination), its steps and criteria used for clause segmentation.
—Overview of the clause analysis: its kind (automatic, manual or a combination), the type of analysis (clause identification, hierarchy, etc.), and criteria and steps used in analysis.
In the contributions of this session it was possible to observe the struggle to find a proper balance of syntactic, semantic and non-linguistic information.

CONCLUDING REMARKS
Many other issues, apart from those mentioned above, have been discussed, to a greater or lesser degree, in the seven conferences, both in terms of computer techniques and tools and with regard to aspects of scientific methodology in various fields: the Hebrew text of the Old Testament (manuscripts, Masorah) and its versions, both ancient (Septuagint, Peshitta, et al.) and modern; New Testament; textual criticism; lexical, semantic, statistical, literary and cultural studies; electronic publishing, multimedia, education, etc. All of these fields of study attest to the variety of interests pursued by Bible and Computer scholars, and to the richness of the debate displayed in the proceedings of these conferences.

THE GREEK PENTATEUCH AND THE LIBRARY OF ALEXANDRIA
NATALIO FERNÁNDEZ MARCOS
CENTRO DE CIENCIAS HUMANAS Y SOCIALES. CSIC. MADRID

The Bible has been an active agent in all the great transformations of western civilisation. The most influential translation in antiquity was that of the Greek Pentateuch. It was through this that the wisdom of Israel passed from the medium of a Semitic to an Indo-European language, namely Greek, the universal language of the time. Thanks to its adoption as the official Bible by the new religion, Christianity, and thanks to the new translations from the Greek to the languages of the different nations within the Roman Empire, the Bible spread out to the eastern and western frontiers of the οἰκουμένη. This process, which was to have monumental consequences, started in the first part of the 3rd century BCE in Ptolemaic Alexandria.1 The use of the Bible was also a determining factor in the transition from scroll to codex, and the use of the codex coincided with the formation of the Christian corpus of Scriptures.2 The Latin Bible was the first book printed by Gutenberg (Mainz 1455),

1 See Fernández Marcos, N. “Las traducciones en la Antigüedad.” Sefarad 67 (2007) 263–82.
2 “The passage from roll to codex represents indeed the most dramatic transformation in the history of the book before Gutenberg”, see Stroumsa, Guy G. “The Scriptural Movement of Late Antiquity and Christian Monasticism.” Journal of Early Christian Studies 16 (2008) 61–77, 66.

the so-called forty-two-line Bible; by the end of the fifteenth century no fewer than 124 editions of the Latin Bible had been printed.3 Biblical texts were at the centre of Humanistic and Renaissance discussion, and the print edition contributed enormously to the diffusion and expansion of the Reformation. Will the new electronic revolution now taking place on-line supplant the traditional printed book? Will the Bible be present and active in this new transformation of the global village? The number of seminars and conferences on this subject is witness to the fact that the Bible is again at the forefront of a new transformation. After the decay of her famous library, the spirit of Alexandria continued to live in other centres of knowledge throughout history: Rome, Antioch, Edessa, Nisibis, Baghdad, Toledo, and El Escorial. The library of El Escorial would not have been possible without the personal support of Philip II, a spiritual descendant of the Ptolemies eighteen centuries later, and without the expertise of his librarian Benito Arias Montano, a humanist and orientalist skilled in ten languages.4 My aim in this paper is to highlight the ancient dream of the Alexandrian Library as a universal temple of knowledge, which could embrace, by means of book editions and/or translations,

3 See Copinger, W. A. The Bible and Its Transmission, 220. London: Henry Sotheran & Co, 1897, and de Hamel, C. The Book. A History of the Bible, 190ff. London: Phaidon Press, 2001.
4 “One of the milestones [in the restoration of the spirit of Alexandria after its decline] lies close at hand, in the library of El Escorial, whose existence and contents cannot be understood without reference to the education, personality and thought of Philip II, a spiritual descendant, eighteen centuries later, of the Ptolemies,” see Sánchez Ron, J. M. “Todos hemos estado en Alejandría: centros de saber en la historia del conocimiento.” In García Maza J., ed. Siempre estuvimos en Alejandría, 249–69, 258. Madrid/Valencia, 1997. “Particularly notable was the influence of the aforementioned Arias Montano, the learned librarian, the Eratosthenes of El Escorial, who had mastered Hebrew, Greek, Latin, Syriac, Arabic, German, French, Flemish, Tuscan and Portuguese, in addition, of course, to Castilian …,” ibid., 260–61.

all the scientific and literary achievements of antiquity.5 In fact, it could be said with some certainty that philology and text processing in the West started with the activities of the Royal Museum. It was there that the first official editions of the classics were produced and that the scientific transmission of the legacy of antiquity began. My paper has the added goal of relating the origins of the Septuagint to the project of the “Mythical Library”, where western academic life and scientific research started. I want to re-establish the connection between classical and biblical studies, a connection which existed at the beginning of historical criticism in the 18th century.6 The German lyricist and Jewish freethinker Heinrich Heine (1797–1856) pronounced what was, no doubt, an exaggerated statement: “All men are either Hebrews … or Hellenes.”7 I would like to adjust his sentiment, modestly, to, “We are all Europeans or people of Western civilisation,” since I hold that the Bible and the Greco-Roman classics permeate the greater part of Western cultural expression.

1. THE LIBRARY OF ALEXANDRIA
Ptolemy I Lagos, one of Alexander’s generals and one of his most faithful friends, founded the Μουσεῖον ca. 306 BCE, an institution both scientific and religious in character. The director was a priest appointed by the king. Its members, devoted to the service of the Muses, were lodged in the royal palace, as Strabo tells us: “The Museum is also a part of the royal palaces; it has a public walk, an Exedra with seats, and a large house, in which is the common mess-hall of the men of learning (φιλολόγων ἀνδρῶν) who share the Museum. This group of men not only hold property in

5 Erskine, A. “Culture and Power in Ptolemaic Egypt: The Museum and Library of Alexandria.” Greece & Rome 42 (1995) 38–48, and Jacob, C., and F. de Polignac. Alexandrie IIIe siècle av. J.C. Tous les savoirs du monde ou le rêve de l’universalité des Ptolémées. Paris, 1992.
6 See Van Seters, John. The Edited Bible. Winona Lake: Eisenbrauns, 2006.
7 See Jacobs, J. “Heine.” In Singer I., ed. The Jewish Encyclopaedia, vol. 6, 329–30. New York/London, 1916.

common, but also have a priest in charge of the Museum, who formerly was appointed by the kings, but is now appointed by Caesar.”8 Most of the residents were philologists, whose interest lay in the recovery and transmission of the classical legacy. They lived in conditions of privilege, under royal patronage and supervision, and with a substantial level of financial support. But they did not escape the criticism and envy of their colleagues. They were the target of satirical poems (σίλλοι, ‘squint-eyed’), burlesque imitations of the Homeric hexameters, such as those by Timon of Phlius (320–230 BCE), a disciple of Pyrrho of Elis, the Sceptical philosopher: “There are many who just fill out the papyri, well fattened in populous Egypt, who constantly peck at each other in the birdcage of the Muses.” The disciples of Aristotle, and specifically the peripatetic Demetrius Phalereus, might well have been very active in originating and organising this institution. This library, together with the sister library of the Serapeum, brought together the best of the ancient world at an early stage: Greece, Rome, Egypt, Mesopotamia and Persia, all under the cultural policy of the Ptolemies. It seems that already in the 3rd century BCE, with the help of successive librarians, all of them outstanding scholars, the library had brought together some hundred thousand papyrus scrolls, and maybe as many as five hundred thousand in Callimachus’ time. For a period of one thousand years, until the Arab conquest of Alexandria in 640 CE, the library was the main vehicle for preserving and transmitting the intellectual legacy of the past. Among its librarians are numbered five of the most famous philologists of antiquity: Zenodotus of Ephesus, Callimachus of Cyrene, Eratosthenes of Cyrene, Aristophanes of Byzantium, and

Strabo is writing at the end of the 1st century BCE; see Strabo, Geography, 17.1.8, Leonard Jones, Horace, ed. & trans. Loeb Classical Library. London/New York, 1932: tw'n de; basileivwn mevro" ejsti; kai; to; 8

Mousei'on, e[con perivpaton kai; ejxevdran kai; oi\kon sussivtion tw'n metecovntwn tou' Mouseivou filolovgwn th'/ sunovdw/ tauvth/ kai; crhvmata koina; kai; iJereu;" oJ tetagmevno" tovte me;n uJpo; tw'n basilevwn, nu'n d j uJpo;

mevgan, ejn w|/ to; a[ndrwn. e[sti de; ejpi; tw'/ Mouseivw/ Kaivsaro".

THE GREEK PENTATEUCH


Aristarchus of Samothrace.9 As we shall see in the next section, this intellectual atmosphere corresponds to the description of the Ptolemaic court and the activities of the library supplied by the Letter of Aristeas. Zenodotus produced an edition of Homer after comparing the copies which were circulating in the various cities, divided each of the epics into 24 books, and used the letters of the alphabet to designate each book. The poet Callimachus created a new model of cataloguing in the form of his writing-tablets or πίνακες in which he divided Greek literature into literary genres or different topics; scrolls were organised in alphabetical order. The list drawn up for each name or author included some biographical data, the incipits of the author’s works, and the number of lines of each work. These are clear indicators of a scholarly milieu that created a climate fitting for the translation of the Greek Pentateuch, the best known collection of a foreign corpus, one probably translated into Greek in conjunction with the project of the library. Eratosthenes was a philologist and poet, and also a true scientist. He carried out research into the surface area of the earth as well as the origins of humankind; he published the first complete catalogue of constellations or καταστερισμοί, in which mythical anecdotes and astronomical calculations are combined in order to explain the origins of the different groups of stars. Aristophanes of Byzantium was heir to a century of philological tradition and improved the editorial techniques by means of a greater number of diacritical signs. He excelled in lexicographical studies, his Λέξεις extending to all fields of literature, prose and poetry. He was the first to publish selected lists of authors, numbers and names of the epic, lyric and tragic poets, a kind of classical canon before the canon. Finally Aristarchus was the consummate philologist, the person principally responsible for the correction (διόρθωσις) and recension of texts. He indicates his opinion in the margin by means of diacritical signs, and retains the spurious passages in the text, preceded by an obelus as a sign of doubtful authenticity

9 It is disputed whether Callimachus was librarian or not. For a possible list of librarians see Oxyrhynchus Papyri, 1241, in Grenfell, B. P., and A. S. Hunt. Oxyrhynchus Papyri. London 1898–.


(ἀθέτησις). He remains the best editor and interpreter of antiquity. As important as his editions are his commentaries or ὑπομνήματα, where he justifies and explains his editorial decisions and interprets the more difficult passages. Aristarchus defended the position that each author is the best interpreter of himself. Although the quotation seems to come from Porphyry, it has also been attributed to Aristarchus and parallels his method of interpretation: Ὅμηρον ἐξ Ὁμήρου σαφηνίζειν, “to explain Homer by Homer”, an exegetical rule which will later be applied to the Scriptures, especially by the school of Antioch. While the scroll was in use, text and commentary were written on separate scrolls. When the codex was introduced, the margins of the codices offered space for the notes and commentaries.10 There is another feature in which the transmission of the Homeric texts may parallel that of the biblical texts. The first Homeric papyri from the end of the 4th century BCE attest a total lack of uniformity; they are πολύστιχοι, with many additions, and recall the stage of textual fluidity of the biblical texts attested by the Qumran documents. Homer was for the Greeks and Romans as authoritative as the Pentateuch was for the Jews. Although the Alexandrian editors had a great influence in determining who the first rank authors were, those worth imitating and commenting on by grammarians, and worthy of being studied at the school, they did not establish a canonical list of these authors.11 This Hellenistic process of ‘canonisation’ of the classics might illuminate the

10 See for this section Pfeiffer, R. Historia de la Filología Clásica. I Desde los comienzos hasta el final de la época helenística, 165–400. Madrid: Gredos, 1981 [English original, Pfeiffer, R. History of Classical Scholarship. From the Beginnings to the end of the Hellenistic Age. Oxford: Oxford University Press, 1968]; and Fraser, P. M. Ptolemaic Alexandria I–III, I, 323–30. Oxford: Clarendon Press, 1972. See also Ellens, J. Harold. “The Ancient Library of Alexandria: The West’s Most Important Repository of Learning.” Bible Review 13 (1997) 19–29.
11 A list of authors, not of writings. This might be one of the reasons why the biblical writings, anonymous in general, were attributed to specific authors like Moses, David, Solomon, and the different Prophets. See Van Seters, The Edited Bible, 40–41.


process of canonisation of Scripture, be it in Hebrew or in Greek. It is worth emphasizing that the Alexandrian editors neither deleted anything from the text nor inserted anything into it; they simply marked doubtful passages with the sign of inauthenticity. One can find some point of similarity in the procedure of the Masoretes, who respected the written text and indicated the qere and other Masoretic notes in the margin.

2. A RETURN TO THE LETTER OF ARISTEAS

After almost five centuries of scepticism concerning the historical value of the Letter, finally, in the middle of the 20th century and particularly in the last decade, more attention has been paid to the general information transmitted by the Letter of Aristeas in relation to the origins of the Greek translation, the intellectual climate of the Alexandrian library, and the Ptolemaic court.12 The Letter of Aristeas is a pseudepigraphic writing, a literary fiction from the last part of the 2nd century BCE. Among treatises such as the Symposium and many other expositions or ἐκφράσεις (on Jerusalem and Palestine, the High Priest and the Temple, the presents of Ptolemy and Eleazar, the allegorical explanation of the Jewish dietary prescriptions by the High Priest, etc.), the Letter describes the origin and circumstances of the first translation of the Torah into Greek. However, in spite of incorporating several legendary motifs, it is also indicative of the atmosphere of Alexandria in the first part of the 3rd century BCE, the promotion of culture by the first Ptolemies, and the intellectual milieu of the library. The author of the Letter alludes throughout to the ambitious project of Ptolemy II, described in paragraphs 9–11: the librarian Demetrius Phalereus “received an enormous fund to gather, if possible, in the library, all the books of the earth.”13 In paragraphs 29–30 the survey of the librarian is recorded: the books of the Jewish Law are lacking and with the king’s agreement the decision is taken to write to the High Priest of Jerusalem, Eleazar, in order to get the best experts from Jerusalem for the translation, six for each of the tribes. These men shall examine the “agreement of the majority in order to obtain the exact interpretation.”14 These phrases echo the philological work on the edition of Homer carried out in the library, applied here to the translation.15 They wish to put the Jewish Law “in a distinguished place fitting to the royal projects and implementations” (Letter of Aristeas § 32). Although the library is not mentioned in this context, the implicit reader understands that this distinguished place of honour cannot be other than the library. Moreover, in the letter of Ptolemy to Eleazar the library is stated explicitly as the destination of the translation: “we decided to translate your Law from what you call the Hebrew language into the Greek language, so that we may have it also near us in the library together with the rest of the royal books.”16 Throughout the Symposium, which occupies the most important section of the letter, the Jewish translators from Jerusalem, who were received with all kinds of honours by the king, are presented as cultivated philosophers able to respond wisely to any question related to life or the best way of government. The paragraphs devoted to the phenomenon of the translation are few, but they reproduce the kind of work and terminology used in the library for the restoration of the text of the Greek authors. The method behind the translation was that of agreement between themselves by comparing results, ἀντιβολή, a technical term used in Alexandria for the collation of the manuscripts.17 With this specific

12 See Fernández Marcos, N. “The Greek Pentateuch and the Scholarly Milieu of Alexandria.” Semitica et Classica 2 (2009).
13 Ἐχρηματίσθη πολλὰ διάφορα πρὸς τὸ συναγαγεῖν, εἰ δυνατόν, ἅπαντα τὰ κατὰ τὴν οἰκουμένην βιβλία, Letter of Aristeas § 9. The Greek text of the Letter is taken from the edition of Thackeray, H. St. J. “The Letter of Aristeas.” In Swete, H. B., ed. An Introduction to the Old Testament in Greek. Cambridge, 2nd ed. 1902, Appendix 499–574. See also Fernández Marcos, N. “Carta de Aristeas.” In Díez Macho, A., ed. Apócrifos del Antiguo Testamento II, 11–63. Madrid: Ediciones Cristiandad, 1983.
14 … ὅπως τὸ σύμφωνον ἐκ τῶν πλειόνων ἐξετάσαντες καὶ λαβόντες τὸ κατὰ τὴν ἑρμηνείαν ἀκριβές ... Letter of Aristeas § 32.
15 See Honigman, S. The Septuagint and Homeric Scholarship in Alexandria. London/New York: Routledge 2003.
16 ... ἵν’ ὑπάρχῃ καὶ ταῦτα παρ’ ἡμῖν ἐν βιβλιοθήκῃ σὺν τοῖς ἄλλοις βασιλικοῖς βιβλίοις, Letter of Aristeas § 38.
17 οἱ δὲ ἐπετέλουν ἕκαστα σύμφωνα ποιοῦντες πρὸς ἑαυτοὺς ταῖς ἀντιβολαῖς, Letter of Aristeas § 302.


language, the author of the Letter is claiming for the translators a role and a level of accuracy similar to those of the editors of Homer. Once the translation had been legitimated and approved by all the Jews it was presented to the king. When the complete text of the translation was read to the king, particular emphasis was placed on the wisdom of the lawmaker. Afterwards Ptolemy made a deep obeisance and ordered that these books be treated with the highest care and be preserved scrupulously (ἁγνῶς, Letter of Aristeas § 317).18

3. CHRISTIAN AUTHORS AND BYZANTINE CHRONICLERS

The tradition reflected in the Letter of Aristeas, which links the origins of the Greek Pentateuch with the library of Alexandria and the cultural policy of Ptolemy II, was already accepted in the Jewish community and consolidated only a century after the translation. It is supported and transmitted by diverse sources such as Aristobulus in the 2nd century BCE, Philo and Josephus in the 1st century CE, and the Rabbis in the centuries following. It is difficult to accept that the tradition itself, although embellished in different ways, might have been totally invented and was not based on recognized historical circumstances.19 Moreover, we have another view of this at our disposal, even in later authors who record features of the same tradition independently of the Letter of Aristeas.

18 προσκυνήσας ἐκέλευσε μεγάλην ἐπιμέλειαν ποιεῖσθαι τῶν βιβλίων καὶ συντηρεῖν ἁγνῶς. Fraser, Ptolemaic Alexandria, I, 330 states: “And certainly it is natural to suppose that translations of these books were deposited in the Library as they became available.” Perhaps on deposit in the library were also translations of the Persian texts of Zoroaster, since it is said that Hermippus, a student of Callimachus, wrote a commentary on Zoroaster’s writings; see Pliny, Naturalis historia, XXX.4.
19 See Kreuzer, S. “Entstehung und Publikation der Septuaginta im Horizont frühptolemäischer Bildungs- und Kulturpolitik.” In Kreuzer, S., and J. P. Lesch, eds. Im Brennpunkt: Die Septuaginta. Studien zur Entstehung und Bedeutung der Griechischen Bibel. Band 2, 61–75, 68–70. BWANT, 161. Stuttgart: Kohlhammer, 2004.


A series of statements by Christian authors reflects the close relationship between the first translation of the Septuagint and the library of Alexandria. From the context of the Letter (§ 317) one can plausibly deduce that the text of the translation was on deposit in the library for consultation by scholars, and that it might have been there over two hundred years, or at least until the time of Julius Caesar when, in 48 BCE, some boxes containing ancient volumes were accidentally burnt while being prepared to be moved to Rome. But it is quite possible that the copy of the Septuagint did not perish in this disaster. The Greek Pentateuch was known and quoted by such Jewish-Hellenistic writers as Demetrius, Aristobulus, Eupolemus, Ezechiel Tragicus, Aristeas the historian, Philo and Josephus, and by the authors of the New Testament (see especially Acts 13:15 and 15:21).20 In any event, Tertullian, Justin, Epiphanius and Chrysostom state that the Septuagint could be seen and consulted in the Serapaeum, the daughter library of the Museum. As Veltri states: “Christian sources call to mind the fact that everybody can verify the presence of the manuscript(s) of the (Greek) Torah in the library of Alexandria as a proof that the story is true.”21 In the 2nd century, Justin, in his first Apology (I.31.5) directed to Antoninus Pius, states that after the original translation the books were to be found among the Egyptians until his time, and were to be found

20 For quotations according to the Septuagint where the Septuagint differs notably from the Masoretic Text, see Fernández Marcos, N. Introducción a las versiones griegas de la Biblia. Segunda edición revisada y aumentada, 261–69. Madrid: CSIC, 1998, and Swete, H. B. An Introduction to the Old Testament in Greek, 369–72. Cambridge: University Press, 1902.
21 See Veltri, G. Libraries, Translations, and “Canonic” Texts: The Septuagint, Aquila and Ben Sira in the Jewish and Christian Traditions, 29. JSJS 109. Leiden/Boston: Brill, 2006. See also Hengel, M. The Septuagint as Christian Scripture: its Prehistory and the Problem of its Canon, 224. Edinburgh/New York: T&T Clark, 2002: “The translation stored in the famous library of Alexandria, is like the publication of a political constitution and/or of a cultural perception.”


everywhere among all the Jews.22 Tertullian (2nd/3rd century) in his Apologeticum 18.8–9 states: “Today these documents, translated into Greek, are exhibited with the same Hebrew texts in Ptolemy’s library in the Serapaeum. But the Jews also read them in public.”23 And John Chrysostom affirms:24

[So] that you may learn that books do not make a place holy and that the disposition of those who gather there defiles it, let me tell you a story from ancient history. When Ptolemy Philadelphus, who was collecting books from all over the world, learned that among the Jews there were books which philosophised about God and the best way of life, he sent for men from Judaea and commissioned them to translate these books. Ptolemy placed the books in the temple of Serapis. He was a Greek and this translation of the prophets is in use to this very day. Now then, are we to consider the temple of Serapis holy because of these books? Of course not!

22 Καὶ τοῦτο γενομένου [that is, the translation] ἔμειναν αἱ βίβλοι καὶ παρ’ Αἰγυπτίοις μέχρι τοῦ δεῦρο, καὶ πανταχοῦ παρὰ πᾶσίν εἰσιν Ἰουδαίοις ... In the 4th century Epiphanius (De mensuris et ponderibus, 11, PG 43, 255) says that the finished translation was placed ‘in the first Library,’ which was located in the Broucheion, see Fraser, Ptolemaic Alexandria, I, 323.
23 Ita in Graecum stilum exaperta monumenta hodie apud Serapeum Ptolemaei bibliothecae cum ipsis Hebraicis exhibentur. Sed et Judaei palam lectitant. Apologeticum, cura et studio E. Dekkers, in Tertulliani Opera. Pars I. Turnholti: Brepols, 1954.
24 In his Adversus Judaeos Orationes Octo (PG 48, 843–942), 851: Καὶ ἵνα μάθητε, ὅτι οὐχ ἁγιάζει τὸν τόπον τὰ βιβλία, ἀλλὰ βέβηλον ποιεῖ τῶν συνιόντων ἡ προαίρεσις, ἱστορίαν ὑμῖν διηγήσομαι παλαιάν. Πτολεμαῖος ὁ Φιλάδελφος τὰς πανταχόθεν βίβλους συναγαγών, καὶ μαθὼν ὅτι καὶ παρὰ Ἰουδαίοις εἰσὶ γραφαὶ περὶ Θεοῦ φιλοσοφοῦσαι, καὶ πολιτείας ἀρίστης, μεταπεμψάμενος ἄνδρας ἐκ τῆς Ἰουδαίας, ἡρμήνευσεν αὐτὰς δι’ ἐκείνων καὶ ἀπέθετο εἰς τὸ τοῦ Σεράπιδος ἱερόν· καὶ γὰρ ἦν Ἕλλην ὁ ἀνήρ· καὶ μέχρι νῦν ἐκεῖ τῶν προφητῶν αἱ ἑρμηνευθεῖσαι βίβλοι μένουσι. Τί οὖν, ἅγιος ἔσται τοῦ Σεράπιδος ὁ ναὸς διὰ τὰ βιβλία; Μὴ γένοιτο. In fact the homilies were pronounced against Jews and Judaizers of the Christian community. There is no modern edition of these homilies. Only homilies 1 and 8 have been translated into English by Meeks, Wayne A., and Robert L. Wilken. Jews and Christians in Antioch, in the First Four Centuries of the Common Era, 83–126. Missoula, MT: Scholars Press, 1978. The translation quoted here is taken from this last work, 96–97.

The Serapaeum was destroyed together with other Egyptian temples in 389 following an order from the emperor Theodosius. However, Chrysostom pronounced his homily against the Jews some years before, in 386. According to the ancient sources, various witnesses seem to coincide in attesting the presence of the translation in the library, although none is entirely verifiable. To this evidence should be added the information transmitted by some Byzantine Chroniclers: Johannes Tzetzes, the monk Syncellus and Georgius Cedrenus. All three tell of the intense translation activity in the court of Ptolemy II as they methodically collected ancient traditions. The Ptolemaic dynasty was open to other cultures. Under Macedonian influences its policy was based on the integration of the different ethnic peoples of Alexandria. The 12th-century scholar Tzetzes transmits the story of the translation differently. He uses different terminology and notes that not one but two libraries existed in Alexandria. And he continues: “Once all the books of the Greeks, those of every one of the nations, along with the books of the Hebrews had been collected together (συνηθροισμένων), that unsparing king … translated to the Greek script and language the foreign books by means of wise men who shared their language ….” The differences (συνάγειν in Aristeas, συναθροίζειν in Tzetzes) suggest that Tzetzes was not dependent on Aristeas or Josephus, but drew his information from another source.25 G. Cedrenus also links Demetrius to the translation:

25 See Collins, Nina L. The Library in Alexandria & the Bible in Greek, 91–92. Leiden/Boston/Köln: Brill, 2000, and Orth, W. “Ptolemaios II und die Septuaginta-Übersetzung.” In Fabry, H.-J., and U. Offerhaus, eds. Im Brennpunkt: Die Septuaginta. Studien zur Entstehung und Bedeutung der Griechischen Bibel. Band I, 97–114, 106–8. BWANT, 153. Stuttgart: Kohlhammer, 2001.


And he [Ptolemy Philadelphos] translated to the Greek language through the seventy most wise Hebrew men the divine Scriptures and the rest of the Greek, Chaldaic, Egyptian and Roman books, a hundred thousand all together, since all were in another language. He placed these translations in his libraries of Alexandria …. And the king, struck by the beauty of the divine Scriptures, asked the philosopher Menedemus and Demetrius Phalereus, in the presence of all the court, how it was that, the divine Scriptures being of such value … none of the historians or poets mentioned them.26

In view of the evidence transmitted by such different sources, one may conclude that the framework of the story of the translation as narrated by the Letter of Aristeas is quite plausible. One might object that, if the translation had been deposited in the library, it should have been mentioned or quoted by pagan authors. This objection, according to Aristeas, was already put in the mouth of the king who asked Demetrius (Letter of Aristeas § 312): “How is it possible that such an important work is not mentioned by the historians or the poets?”27 The author of the Letter makes mention in his answer of the legend of the sacred book that cannot be touched by pagan hands without being punished by the divinity (Letter of Aristeas §§ 313–16). Be that as it may, in this case the argument ex silentio is perilous, especially when taking into account the oblivion into which ancient works and those of Jewish-Hellenistic production had fallen. These works were preserved by

26 PG 121, 325AB: ὃς [Ptolemy Philadelphos] καὶ τὴν μεταβολὴν τῶν θείων Γραφῶν καὶ λοιπῶν Ἑλληνικῶν, Χαλδαϊκῶν, Αἰγυπτίων καὶ Ῥωμαϊκῶν εἰς δέκα μυριάδας ἀριθμουμένων βιβλίων, πασῶν τε ἀλλογλώσσων οὐσῶν, εἰς τὴν Ἑλλάδα γλῶσσαν μετεποίησε διὰ τῶν οʹ σοφωτάτων Ἑβραίων· ἃς καὶ ἐν ταῖς κατὰ τὴν Ἀλεξάνδρειαν βιβλιοθήκαις αὐτοῦ ἐναπέθετο ... Ἐπὶ τῷ κάλλει τοίνυν τῶν θείων Γραφῶν ἐκπλαγέντος τοῦ βασιλέως καὶ πάντων τῶν ἐν τέλει, παρόντος καὶ Μενεδήμου φιλοσόφου καὶ Δημητρίου Φαληρέως ἠρώτα ὁ βασιλεύς ὅτι πῶς τοιούτων ὄντων ... οὐδεὶς ἱστορικῶν ἢ ποιητῶν ἐμνήσθη αὐτῶν.
27 πῶς τηλικούτων συντετελεσμένων οὐδεὶς ἐπεβάλετο τῶν ἱστορικῶν ἢ ποιητικῶν ἐπιμνησθῆναι;


Christians and only insofar as they served Christian interests. Similar books such as those of Manetho, Berossus and even Josephus are not quoted by pagan authors until the 3rd century CE, when the Neoplatonic philosopher Porphyry cites them.28 The evidence may simply have been lost for a number of reasons. As Cook asserts: “Hecataeus, Ocellus Lucanus, Alexander Polyhistor, Diodorus Siculus, Nicolaus of Damascus and Ps.-Longinus are pagan authors who are aware of the LXX (or the Jews’ books of Laws) although extant quotations are sparse.”29

4. THE JEWISH-HELLENISTIC WRITERS AND THE LIBRARY

These Jewish authors, known only thanks to the quotations of the Christian authors Clement of Alexandria (Stromata) and Eusebius of Caesarea (Praeparatio Evangelica), write in the manner of the Greeks. They use practically all the literary genres cultivated by the Greeks and Romans: tragedy on a biblical theme (Ezechiel Tragicus), epic (Philo the Ancient on Jerusalem), philosophy (Philo of Alexandria), short novel (Joseph and Aseneth), short story (Judith), historiography (Artapanus, Demetrius, Eupolemus, and Flavius Josephus). They use Hellenistic topoi and rhetorical devices in order to construct a Jewish identity. They have discovered the power of story and narrative rhetoric over historical facts.30 For example, the epic poet Theodotus writes on Jacob and Shechem in a Homeric framework and reinterprets the biblical version of the rape of Dinah from Genesis 34. The poem contains echoes of the

28 In De abstinentia 4.11–14.
29 See Cook, J. G. The Interpretation of the Old Testament in Graeco-Roman Paganism, 52. Tübingen: Mohr Siebeck, 2004. On page 2 he insists: “Most of the (pagan) Greek literature concerning the Jews written between IV B.C.E. to II C. E. has been lost, as a glance at F. Jacoby’s FGrH will show.”
30 See Gruen, E. S. Heritage and Hellenism. The Reinvention of Jewish Tradition. Berkeley/Los Angeles/London: University of California Press 1998.


Septuagint, while most of its expressions and structure are Homeric. Other authors also appropriate the Hellenistic topoi of the πρῶτος σοφός (first sage) and πρῶτος εὑρετής (first inventor) and the foundation myths to demonstrate the cultural priority and, accordingly, superiority of the Jews over Babylonians and Egyptians. Rewriting the past of Israel to serve the needs of the moment is one of the characteristic devices used by these authors. History becomes, in a certain way, rhetorical propaganda. Pseudo-Eupolemus, in the middle of the 2nd century BCE, presents Abraham as the inventor of astrology and its teacher to the other peoples. Astrology was, according to Artapanus, also taught by Abraham to the Egyptians. Eupolemus presents Moses according to the literary topos of the cultural benefactor, the founder of civilisation, the first wise man and lawgiver. It was from him that the Phoenicians and the Greeks received the alphabet. Artapanus is even more perspicacious in combining pagan mythology with biblical traditions in a euhemeristic fashion. Taking advantage of the homophony, Artapanus identifies Moses with Mousaios, a mythical Greek poet and teacher of Orpheus. Artapanus transmits the only realistic portrait of Moses in the Hellenistic fashion: “he was tall, ruddy complexioned, with long flowing grey hair, and dignified.”31 The Letter of Aristeas not only quotes Hecataeus, Menedemus, Demetrius Phalereus, Theopompus and Theodectes; its description of Jerusalem (Letter of Aristeas §§ 83–106) also seems to be inspired by the description of the ideal city of Aristotle in his Politics VII, 11.3–4.32 The author of the Letter follows Aristotle’s advice concerning the needs of the πόλις, especially in the water installations (Letter of Aristeas §§ 88–91).

When one compares the Jewish literary legacy in the Greek language, the literary genres and compositions which arose and grew up in the shadow of the Greek Bible, with the legacy of Qumran, one is struck by how deep the differences are. The Jewish-Hellenistic writers had access to the most important writings of Greek antiquity. In other words, they had access to the great library of Alexandria. It is quite possible that Theodotus, Aristobulus (who knows the Greek Bible and uses the allegorical interpretation of Scripture before Aristeas), Ezechiel Tragicus, and other Jewish-Hellenistic authors were in contact with the library. It is possible that some distinguished members of the Alexandrian πολίτευμα were engaged in the cultural activities of the library, just as other Hellenistic Jews, whose names and qualities are well known, made their own cursus honorum in business and politics.33 It is true that no individual Jews, scribes or intellectuals, are known in the early 3rd century BCE, but we should bear in mind that in antiquity writers or authors were dependent on patrons and institutions, that is, on the centres of power concentrated mainly in the royal palaces and the temples. Therefore they functioned as the spokesmen for the perspectives of these institutions, not for their own, and they disappeared from the scene as individual actors.

5. CONCLUSIONS

The first translation of the Torah into Greek required an exceptional alignment of circumstances: the cultural climate of Alexandria and the competition between the different peoples to achieve a place of prestige over the dominant Greek culture. It is impossible to think of such a complex undertaking

31 See Fernández Marcos, N. “Interpretaciones helenísticas del pasado de Israel.” Cuadernos de Filología Clásica 8 (1975) 157–86; idem, “Rhetorical expansions of biblical traditions in the Hellenistic Period.” Old Testament Essays 15 (2002) 766–79, 771–76, and Holladay, C. R. Fragments from Hellenistic Jewish Authors. Vol I, Historians. Chico, CA: Scholars Press, 1983; Vol II, Poets. Atlanta, GA: Scholars Press, 1989; Vol III, Aristobulus. Chico, CA: University of California Press, 1995.
32 Especially the praise of Jerusalem water supplies (Letter of Aristeas §§ 88–91), see Honigman, Homeric Scholarship, 23–25, 23: “For example, the detailed description of the combined sources supplying water to the Temple (chs. 88–91) finds its raison d’être in Aristotle’s insistence on the need for the polis to enjoy a good water supply, both in quality and quantity (Politics, 7.11.3–4).”
33 See Paul, A. La Biblia y Occidente. De la Biblioteca de Alejandría a la cultura europea, 98–106. Estella (Navarra): Verbo Divino, 2008 [French original, Paul, A. La Bible et l’Occident. De la bibliothèque d’Alexandrie à la culture européenne. Paris: Bayard, 2007].


without the institutional support of the Ptolemaic monarchy and the infrastructure of the Museum itself. The team of translators were bilingual Jews, scribes versed in both the Hebrew and Greek languages and literatures. The scholarly milieu created around the famous Library is the most fitting Sitz im Leben for the origin of the Greek Pentateuch. The Letter of Aristeas does not affirm explicitly that a copy of the translation was deposited in the Museum or Serapaeum.34 It does, however, indicate the moment when the king manifests his interest in preserving the Greek Pentateuch carefully (Letter of Aristeas § 317). The Greek Pentateuch was known and consulted by the Jewish-Hellenistic authors, by Philo, Josephus, and the authors of the New Testament. It had to be openly accessible, and the best place to provide such access was the Library. Christian authors claim that people could consult the translation in “the first Library” or in the Serapaeum,35 and the Byzantine Chroniclers report, with apparent independence from the tradition of the Letter, on the translation activity in the court of the Ptolemies and precisely on the translation of the Hebrew Scriptures in this context. If not demonstrably true, the links between the translation of the Greek Pentateuch and the ambitious Ptolemaic cultural project and the intellectual climate of the Library, as narrated in the Letter of Aristeas, are more than plausible.

34 The library created later by Ptolemy III Euergetes.
35 See Epiphanius of Salamis (PG 43, 255–56): Καὶ οὕτως αἱ βίβλοι εἰς Ἑλληνίδα ἐκτεθεῖσαι ἀπετέθησαν ἐν τῇ πρώτῃ βιβλιοθήκῃ τῇ ἐν τῇ Βρουχίῳ οἰκοδομηθείσῃ. Ἔτι δὲ ὕστερον καὶ ἑτέρα ἐγένετο βιβλιοθήκη ἐν τῷ Σεραπίῳ μικροτέρα τῆς πρώτης, ἥτις καὶ θυγάτηρ ὠνομάσθη αὐτῆς· ἐν ᾗ ἀπετέθησαν αἱ τοῦ Ἀκύλα, καὶ Συμμάχου, καὶ Θεοδοτίωνος, καὶ τῶν λοιπῶν ἑρμηνεῖαι, μετὰ διακοσιοστὸν καὶ πεντηκοστὸν ἔτος. The first library is the Museum, close to the royal palace, in the Brucheion Greek quarter of Alexandria. The Serapaeum was located at Rhakotis in the southwestern part of the city; see Barnes, R. “Cloistered Bookworms in the Chicken-Coop of the Muses: The Ancient Library of Alexandria.” In MacLeod, R., ed. The Library of Alexandria. Centre of Learning in the Ancient World, 61–77, 62–63. London/New York: I. B. Tauris 2005.


Concerning the relationship of my paper with the main topic of the Conference, Bible & Computers, I would like to finish with a prudent reflection made by Prof. Robert Barnes, a Senior Lecturer in Classics at the Australian National University: “In the late twentieth century, the problems of large libraries have by no means been resolved. There has been much premature speculation that digital recording of texts will overcome all difficulties of collecting and storing them, and will make them available immediately to anyone, anywhere in the world. In fact the digitising of library holdings of printed texts has scarcely begun, and its costs, with present technology, would seem to limit it to a comparatively small selection of commonly used texts. Although most new books nowadays must be recorded somewhere in digital form, the economics of publishing discourage wide access to the books in that form. Above all, we simply do not know whether present technology will preserve texts even as efficiently as libraries of manuscript and printed books have done. The Alexandrian library may have preserved its books, without substantial loss, for up to 600 years.36 We should not be confident that we will preserve our own literature for anything like as long.”37

36 Probably for up to a millennium until the Arabic conquest.
37 See Barnes, “Cloistered Bookworms,” 75.

ELECTRONIC TOOLS FOR BIBLICAL STUDY AT HOME, AT THE UNIVERSITY, AND IN THE CLASSROOM

EMANUEL TOV
DEPARTMENT OF BIBLE, THE HEBREW UNIVERSITY, JERUSALEM 91905, ISRAEL

1. RETROSPECTIVE

Over the past two decades many developments have taken place in the area of the computerized study of Scripture. The impressive progress in hardware has enabled much advancement in software and thus has provided access to a greater number of users. Approximately a quarter century ago the main components for the computer-assisted study of Scripture were developed and used solely on mainframes, machines that were available only remotely, and hence extensive use by scholars and students alike was delayed. It was not feasible for users, especially students, to make use of terminals linked to mainframes. Another problem was that several projects, among them our own CATSS project, presented the Hebrew and Greek data in transliteration. Therefore, in my presidential address to the 1988 AIBI conference in Jerusalem, I appealed to our colleagues not only to develop tools, but also to enable greater access for colleagues and students to these tools.1 When developing tools, specialists used to speak of “possibilities”, but what was primarily needed were applications for non-specialists and tools accessible to the average person.

Today, we realize that an appeal such as that made in 1988 is no longer needed; our colleagues are actively using the tools we have developed, thanks to advancements made in what we then named microcomputers, now known as personal computers. The servers of today and the networks of personal computers integrated with servers take the place of the mainframes of those bygone days. There still are many mainframes, immensely more powerful than those of the past, but they are seldom used for projects in the humanities. Users now employ the databases and programs on personal computers in their homes and universities, on terminals connected to servers, and increasingly on the Internet and in the classroom.

1 “Achievements and Trends in Computer-Assisted Biblical Studies,” Proceedings of the Second International Colloquium Bible and Computer: Methods, Tools, Results, Jérusalem, 9–13 juin 1988 (Travaux de linguistique quantitative 43; ed. C. Muller; Paris/Genève: Champion–Slatkine, 1989) 33–60.

2. THE CATSS PROJECT

The development of the CATSS database and software went hand in hand with the availability of better and more powerful hardware. Our project aimed to develop tools for the comparative analysis of the MT and LXX. To that end, we created a detailed comparison of the Hebrew text of BHS and the Greek text of the Rahlfs edition of the LXX, both of which were broken down into very small units. The words of the Hebrew and Greek Bibles were then connected to a morphological analysis and lemmatization of all these words in both languages, enabling searches in these languages.

The initial comparison of the details in the Hebrew and Greek Bibles was created by a computer program, which was based on the structure of the Hebrew and Greek languages and on the assumption that most Greek translations closely followed the structure of the Hebrew language. In most books of the LXX this automatic comparison created a correct alignment, even in free translation units, as long as most of the Hebrew lexemes were represented by a one-to-one equivalent in Greek. At the next stage, the results of this automatic comparison were fine-tuned by a team of collaborators in Jerusalem who also added detailed notes on matters of translation technique, the reconstructed Vorlage of the LXX, the relation of the LXX to the Ketiv-Qere notes of MT and also to the Qumran biblical scrolls, etc. All these notes were included in the so-called “column b”, which was added to “column a”, the column containing a detailed alignment of MT and the LXX.
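The two-column layout and the automatic first pass described above can be pictured with a small sketch. It is not the CATSS program or its file format: the record fields, the positional pairing rule, and the sample words are all assumptions made for illustration, and the positional rule is a deliberately naive stand-in for the language-aware comparison the project actually used.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedUnit:
    ref: str        # location label (invented here, not a real verse reference)
    hebrew: str     # Hebrew unit, in rough transliteration
    greek: str      # Greek unit aligned with it ("column a" style)
    note: str = ""  # free-form remark, loosely in the spirit of "column b"


def naive_align(hebrew_words, greek_words):
    """Pair words by position when the counts match.

    Returns None when they do not, so the passage can be aligned by hand.
    """
    if len(hebrew_words) != len(greek_words):
        return None
    return list(zip(hebrew_words, greek_words))


def equivalents(units):
    """Map each Hebrew unit to a tally of the Greek units aligned with it."""
    table = defaultdict(Counter)
    for unit in units:
        table[unit.hebrew][unit.greek] += 1
    return table


# Invented sample records, for illustration only.
SAMPLE = [
    AlignedUnit("Example 1", "dabar", "logos"),
    AlignedUnit("Example 2", "dabar", "rhema"),
    AlignedUnit("Example 3", "dabar", "logos", note="free rendering"),
    AlignedUnit("Example 4", "melek", "basileus"),
]

if __name__ == "__main__":
    print(naive_align(["dabar", "melek"], ["logos", "basileus"]))
    print(equivalents(SAMPLE)["dabar"].most_common())
```

Returning None instead of a partial pairing mirrors the division of labor in the text: the automatic pass handles the one-to-one cases and leaves the rest to human collaborators.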


The program and database were developed simultaneously in the 1980s and into the 1990s by R. A. Kraft and myself using different systems: the IBYCUS system at the University of Pennsylvania (UPenn) and two different mainframes at the Hebrew University. Working on different machines in those days was more complex than it would be today. This situation necessitated the writing of separate programs in different computer languages for the computers at UPenn and at the Hebrew University. Even more challenging, when the Hebrew University changed its mainframes without consulting with us or informing us, we had to write completely new programs for inputting and correcting data and for searching the parallel Hebrew and Greek words. When we finished our database of parallel Hebrew–Greek words, we were able to perform limited searches ourselves at both sites of the project, but very few external scholars used our data. The program was used mainly by some fifteen scholars who were connected with the project, but it can be said that it was used, and valuable studies on translation technique as well as textual and linguistic aspects were based on the data gathered from the project.2

The data were also used by two daughter or sister projects that compiled major research tools for the Greek Bible. The improved morphological analysis of all the words in the LXX was recorded by Taylor in a printed edition that presented this analysis as a running text of the LXX.3 The LXX lexicon by Lust-Eynikel-Hauspie was also based on our data, developed our data much further, and added a completely novel lexicographical description of all the LXX words.4

However, the use of our data by outsiders was not possible because the data were stored in an ASCII transliteration in Roman letters at two different universities and were not easily accessible. The original data have been available for many years at the UPenn website (http://ccat.sas.upenn.edu/), but it is my understanding that they have not been used much in their ASCII form. The breakthrough in the use of the data of the CATSS project came in 1997 when the database was integrated into the Accordance program using Hebrew and Greek fonts instead of transliteration, for use on personal computers instead of on mainframes. The platform was that of Macintosh, with the PC world later included by way of screen emulation. The powerful Accordance program had been launched in 1994, and came to include an ever-growing number of text modules for the study of ancient Scriptures and related subjects.5 In short, it was the development of personal computers that saved the CATSS project from oblivion; similar developments took place in other projects as well.

Powerful tools for computer-assisted analysis of the Hebrew Bible, including grammatical analysis, vocalization, and Masorah, were now accessible on personal computers. Let me mention two such programs. Originating in the Amsterdam project of E. Talstra, the SESB project is now a powerful part of Logos.6 This program originated with the Werkgroep Informatica and includes syntax analysis and attention to clause hierarchy. Likewise, the Tokhnit “HaKeter”, initially developed alongside the Responsa Project,7 is now available on the PC.8

2 For a bibliography, see my “A Computerized Database for Biblical Research.” In The Greek and Hebrew Bible—Collected Essays on the Septuagint, 31–51 (41–3). VTSup, 72. Leiden/Boston/Cologne: E. J. Brill, 1999; “The Use of Computers in Biblical Research.” In Flint, P. W., et al., eds. Studies in the Hebrew Bible, Qumran and the Septuagint Presented to Eugene Ulrich, 337–59 (354). VTSup, 101. Leiden: E. J. Brill, 2006; Kraft, R. A., in http://ccat.sas.upenn.edu/rs/rak/catss.html. See also the description of J. Treat in http://ccat.sas.upenn.edu/~jtreat/rs/rscpuhx.html.
3 Taylor, B. A. The Analytical Lexicon to the Septuagint—A Complete Parsing Guide. Grand Rapids, MI: Zondervan, 1994. Subsequently, that analysis, now named CATSS-Taylor-Wheeler, has been made available in seven different software packages, listed in Tov, “Use of Computers,” 356–7.
4 Lust, J., E. Eynikel, and K. Hauspie. A Greek-English Lexicon of the Septuagint, I–II. Stuttgart: Deutsche Bibelgesellschaft, 1992, 1996.
5 For the data, see http://www.accordancebible.com/modules/index.php.
6 http://www.logos.com/products/details/1981.
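The step from ASCII transliteration to text displayed in Hebrew characters can be illustrated with a short conversion routine. This is only a sketch: the consonant table is an assumption modeled on the Michigan-Claremont style of transliteration, it ignores vowels, accents, and morpheme separators, and the exact conventions of the CATSS files should be checked against the project's own documentation before anything like this is used on real data.

```python
# Assumed consonant table (Michigan-Claremont style); not taken from CATSS docs.
CONSONANTS = {
    ")": "\u05D0",  # alef
    "B": "\u05D1",  # bet
    "G": "\u05D2",  # gimel
    "D": "\u05D3",  # dalet
    "H": "\u05D4",  # he
    "W": "\u05D5",  # vav
    "Z": "\u05D6",  # zayin
    "X": "\u05D7",  # het
    "+": "\u05D8",  # tet
    "Y": "\u05D9",  # yod
    "K": "\u05DB",  # kaf
    "L": "\u05DC",  # lamed
    "M": "\u05DE",  # mem
    "N": "\u05E0",  # nun
    "S": "\u05E1",  # samekh
    "(": "\u05E2",  # ayin
    "P": "\u05E4",  # pe
    "C": "\u05E6",  # tsadi
    "Q": "\u05E7",  # qof
    "R": "\u05E8",  # resh
    "&": "\u05E9\u05C2",  # sin (shin letter + sin dot)
    "$": "\u05E9\u05C1",  # shin (shin letter + shin dot)
    "#": "\u05E9",  # shin/sin without a dot
    "T": "\u05EA",  # tav
}

# Letters that take a distinct form at the end of a word.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # kaf -> final kaf
    "\u05DE": "\u05DD",  # mem -> final mem
    "\u05E0": "\u05DF",  # nun -> final nun
    "\u05E4": "\u05E3",  # pe -> final pe
    "\u05E6": "\u05E5",  # tsadi -> final tsadi
}


def to_hebrew(word):
    """Convert one transliterated word to Unicode Hebrew consonants.

    Characters outside the table (vowels, separators) are silently dropped.
    """
    letters = [CONSONANTS[ch] for ch in word if ch in CONSONANTS]
    if letters and letters[-1] in FINAL_FORMS:
        letters[-1] = FINAL_FORMS[letters[-1]]
    return "".join(letters)


if __name__ == "__main__":
    for sample in ("MLK", "$LWM"):  # illustrative inputs only
        print(sample, "->", to_hebrew(sample))
```

Because the output is Unicode rather than a font-specific code, it displays in any font that covers the Hebrew block, which is the practical gain over a transliteration tied to one system.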

3. ELECTRONIC RESOURCES

In the electronic journal TC (2003)9 as well as in a printed publication (2006),10 I published a list of all the electronic resources relevant to the textual criticism of Hebrew Scripture. This list is constantly being updated,11 although I realize that I may not be aware of all the available resources. The list is subdivided into the following types of information:

1. Source texts of the Hebrew Bible;
2. Source texts of the ancient versions;
3. Modern translations of Hebrew Scripture;
4. Critical apparatuses;
5. Morphological analyses;
6. Tools for the study of these texts.

In biblical scholarship, computer-assisted studies have focused on the following five areas: authorship studies, stylistics, linguistics, statistics, and textual criticism. Beyond these monographic studies, we can now consult many electronic resources, both freeware and commercial software packages. Usually the rule is that the more recent the program, the more sophisticated the software.

7 Global Jewish Database* (Responsa Project, CD-ROM 12, Bar-Ilan University) (with cross-references to Rabbinic Literature), see http://www.biu.ac.il/JH/Responsa/index.html.
8 Tokhnit “HaKeter”–Ma’agar HaTanakh, Bar-Ilan University, Ramat Gan; part of the Miqraot Gedolot “HaKeter” Project based on the Aleppo Codex (in books in which the Codex is missing, its text has been reconstructed according to sources close to the Aleppo Codex).
9 “Electronic Resources Relevant to the Textual Criticism of Hebrew Scripture.” TC: A Journal of Biblical Textual Criticism 8 (2003) [http://purl.org/TC].
10 Tov, “The Use of Computers.”
11 The latest update is included in the second volume of my collected writings, Hebrew Bible, Greek Bible, and Qumran—Collected Essays, 228–46. TSAJ, 121. Mohr Siebeck, 2008.


a. Hebrew Bible

The Hebrew Bible has been available for a long time in machine-readable form, with vowels and accents. The Greek Bible with accents has also been available for at least three decades. All these sources are designed exclusively for a PC environment except for Accordance (Macintosh) and the Jewish Classics Library (PC and Macintosh). However, with emulation programs, the PC programs can be used on the Macintosh, and the Accordance data can be used on a PC.

Before listing the available sources, I should mention that the great majority of the software packages are available only as commercial products. In my constantly updated list of available sources, I indicate such commercial products with a star,12 and when reviewing the list one will notice how little is available as freeware on the Internet. This is probably one of the concessions that have to be made to the conditions under which we work, since the preparation and maintenance of each software package costs several hundred thousand US dollars. In principle, it should not be difficult to accept the idea that these software packages have a price-tag attached, since books are likewise commercial products and no one rebels against purchasing them. However, somehow there seems to be a reluctance to pay money for abstract material included on a lightweight CD. Software packages are usually sold for US $200–300, but the cost can easily jump to $1,000 if one orders several modules as well as tools such as lexicons.

On the Internet, we can find the bare Hebrew and Greek biblical texts as freeware, though not the Dead Sea Scrolls. However, none of these texts is searchable in a meaningful way, since they are not linked to morphological analyses. In addition, various sources offer freeware lexicons (such as Liddell and Scott),13 editions (such as von Gall’s edition of the Samaritan Pentateuch,14 Tischendorf’s edition of codex Sinaiticus,15 and the LXX edition of Brooke, McLean, and Thackeray)16 and single manuscripts (such as the large Isaiah scroll17 and the Aleppo Codex18) as scanned images. The Israeli sites Mechon Mamre and Snunit offer free access to many biblical and rabbinic texts, but without good search facilities.19 Many modern English Scripture translations are freely available as well. In addition, the valuable Comprehensive Aramaic Lexicon (CAL)20 and the non-updated version of the CATSS comparison of the MT and LXX are freely available in ASCII format.21

The great majority of the commercial software packages of the Hebrew Bible represent the Leningrad Codex or BHS. In principle, these two sources should be identical, since BHS is based on codex L, but in practice they are not. The main text that is used, the so-called Michigan-Claremont-Westminster text, has been corrected according to codex L. On the other hand, the Bar-Ilan database, Tokhnit “HaKeter”, presents the Aleppo Codex, which differs in only a small number of details from codex L; for students of grammar, however, these differences are important. No fewer than fourteen software packages offer the Hebrew Bible text.22

12 Tov, “The Use of Computers.”
13 Liddell, H. G., R. Scott and H. S. Jones. A Greek-English Lexicon. Oxford, 9th ed., 1940; LSJ is used together with Barber, E. A. A Greek-English Lexicon, A Supplement. Oxford, 1968, and Glare, P. G. W. Revised Supplement. Oxford, 1996: http://www.perseus.tufts.edu/.
14 Von Gall, A. F. Der Hebräische Pentateuch der Samaritaner. Giessen: Töpelmann, 1914–1918: http://rosetta.reltech.org/Ebind/docs/TC/.
15 Von Tischendorf, C. Bibliorum Codex Sinaiticus Petropolitanus. Leipzig: Giesecke et Devrient, 1862: http://rosetta.reltech.org/Ebind/docs/TC/.
16 Brooke, A. E., N. McLean, and H. St. J. Thackeray. The Old Testament in Greek according to the Text of Codex Vaticanus. Cambridge: University Press, 1906–1940: http://rosetta.reltech.org/Ebind/docs/TC/.
17 http://www.ao.net/~fmoeller/qumdir.htm, http://www.imj.org.il.
18 http://aleppocodex.org/flashopen.html.
19 http://www.mechon-mamre.org/i/t/t0.htm, http://kodesh.snunit.k12.il.
20 http://cal1.cn.huc.edu/.
21 http://ccat.sas.upenn.edu.
The text of MT is thus available in several software packages bearing such names as Accordance, BibleWorks, Jewish Classics Library, Logos, WordSearch, Gramcord, Bible Windows, SESB, and Global Jewish Database. The Masorah Magna and Parva are searchable in the Bar-Ilan database.

The Qumran biblical texts are available in M. Abegg’s module included in Accordance and Logos with all the search facilities of these programs. This module presents the Qumran texts in two ways, according to manuscript and the sequence of the biblical text. The presentation according to manuscript allows the user to analyze the features of individual manuscripts. The presentation according to the sequence of the biblical texts allows the user to analyze all the Qumran biblical material at the same time. All these analyses enable the “compare” function that, with one click, displays all the differences between texts; for example, differences can be displayed between MT and the large Isaiah scroll from cave 1. A module of the biblical scrolls by S. Pfann was announced by Logos in 2007.

22 In all programs, slight adjustments are made to the appearance of the text since all use different fonts. In 2003, A. Groves wrote to me: “The text was first made available for word-processing using the Hebraica font created and distributed (font and text) by Linguist’s Software (Phil Payne in Seattle). Until Hebrew Bible search software became available, this was probably the most widely used version in word processing. There are now many word-processing versions of the text available using fonts produced by the various Bible software companies. Some of the companies have licensed Linguist’s Software. The most common of which I am aware are: Hebraica I & II and the New Jerusalem font (Linguist’s Software), the BibleWorks font, the BibleWindows font, SuperHebrew (I think the predecessor to Hebraica?) and the Gramcord font. Then there is SIL’s font (called SIL Ezra; public domain). I also think that SBL has produced a public domain font. Anyway, these are the fonts I see from my various students. Note that each of these fonts maps the Hebrew a little differently, so the vendor has revised our text for use with their fonts. Which means that a text in a particular font is not easily converted to another font.” In addition, Accordance uses its own font (Yehudit).
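A “compare” function of the kind mentioned above, which displays the differences between two witnesses of the same passage, can be approximated with a word-level diff. The sketch below relies on Python's standard difflib module and is not a description of how Accordance implements its feature; the sample words are invented placeholders rather than readings from any manuscript.

```python
import difflib


def compare(words_a, words_b):
    """Return the word-level differences between two texts.

    Each entry is (kind, words_from_a, words_from_b), where kind is
    'replace', 'delete', or 'insert'; identical stretches are omitted.
    """
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    differences = []
    for kind, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if kind != "equal":
            differences.append(
                (kind, words_a[a_start:a_end], words_b[b_start:b_end])
            )
    return differences


if __name__ == "__main__":
    # Invented placeholder words: one spelling variant and one added word.
    text_a = "KL H)RC W$M(".split()
    text_b = "KWL H)RC W$M( )T".split()
    for difference in compare(text_a, text_b):
        print(difference)
```

A diff of this kind only makes sense when both texts are in the same language and already tokenized in the same way, which is why such comparisons are limited to pairs like MT and a biblical scroll.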


b. Other Sources

A. Tal’s edition23 of the Samaritan Pentateuch is available within Accordance with full morphological analysis. In this text, too, the “compare” function displays all the differences between MT and the SP. In addition to Hebrew Scripture, ten software packages also contain the LXX as encoded by the Thesaurus Linguae Graecae (TLG) from the edition of A. Rahlfs (1935),24 without any variant readings. The variants of the LXX have been encoded by CCAT in Philadelphia, though not yet for all books. Other tools available are the text editions by Swete,25 Brooke, McLean, and Thackeray (see n. 16), and also Field (1875),26 the grammar of Conybeare and Stock (1905),27 the Introduction by Swete,28 and modern translations of the LXX,29 some of them as scanned images. The edition of the Vulgate is presently available in nine software sources.30

The Targumim as well as the Peshitta are available in several sources, foremost in conjunction with the Comprehensive Aramaic Lexicon (CAL), together with lexicographical information and search capacities,31 and also within the biblical software packages.

The non-biblical Dead Sea Scrolls are well represented within the software packages, both as text editions and photographic images. The module of the non-biblical texts prepared by M. G. Abegg features prominently among the modules of Accordance and Logos. A revised form of that module has been included in my own DSSEL,32 which includes text editions, modern translations, and images of all the Qumran texts. Within DSSEL, all these corpora of information are searchable with Wordcruncher (version 7.0) software. In addition, Wordcruncher provides valuable statistics programs that can be applied to the results of these searches. This program allows the user to see the images and the transliterated texts at the same time.

The greater the number of available modules, the better students and scholars are served in their study of the Bible. Accordance, for example, has modules of MT, SP, the biblical and non-biblical Dead Sea Scrolls, and all the versions. These modules are not integrated or hyperlinked, except for the module of CATSS. They lack the variants from the text editions in the apparatuses of all these texts. All these details could be added by way of hyperlinks, but such information would involve several additional projects. The apparatus of BHS, which is now available, provides a partial answer to this issue. Accordance also provides morphological information, many lexica, encyclopedias, several Church Fathers, Josephus, rabbinic literature (very partial), commentaries, the non-biblical Dead Sea Scrolls, and the BHS apparatus. While Accordance does not have hyperlinks, it offers two valuable services providing similar, if not better, research facilities. Accordance enables the presenting of any two parallel texts in parallel text panes, such as MT, any of the versions, and the biblical Dead Sea Scrolls. A further option (“compare”) allows for the indication of the differences between the texts as long as they are in the same language, such as MT and a biblical Dead Sea Scroll. Many of these modules are also offered by Logos. Logos seems to be better in the area of hyperlinks between the various texts but, on the other hand, it marks many links that are not meaningful since they have been determined automatically. DSSEL is the only program that combines searches of the non-biblical Dead Sea Scrolls and the Bible in a single file of results.

23 Tal, A. The Samaritan Pentateuch, Edited According to MS 6 (C) of the Shekhem Synagogue. Texts and Studies in the Hebrew Language and Related Subjects, 8. Tel Aviv: Tel Aviv University, 1994.
24 Rahlfs, A. Septuaginta, id est Vetus Testamentum graece iuxta LXX interpretes. Stuttgart: Württembergische Bibelanstalt, 1935.
25 Swete, H. B. The Old Testament in Greek According to the Septuagint. Cambridge: University Press, 1887–1894.
26 Field, F. Origenis Hexaplorum quae supersunt sive veterum interpretum graecorum in totum Vetus Testamentum fragmenta. Oxford: Oxford University Press, 1875.
27 Conybeare, F. C. and St. G. Stock. A Grammar of Septuagint Greek, included in Selections from the Septuagint. Boston: Ginn and Company, 1905: http://ccel.org/c/conybeare/greekgrammar/.
28 Swete, H. B. An Introduction to the Old Testament in Greek. Cambridge: University Press, 2nd ed., 1914.
29 The most important ones are Brenton, L. L. The Septuagint Version of the Old Testament with an English Translation. London: Samuel Bagster and Sons, 1879, available within Accordance and at http://www.bcbsr.com/topics/olb.html; Pietersma, A. and B. G. Wright, ed. NETS, A New English Translation of the Septuagint and the Other Greek Translations Traditionally Included Under That Title. Oxford: Oxford University Press, 2007: http://ccat.sas.upenn.edu/nets/; Septuaginta Deutsch: http://www.septuaginta-deutsch.de/. The NETS translation is also available within Accordance.
30 For the history of the encoding of the Vulgate text and for further sites, see the data provided by R. A. Kraft in http://www.le.ac.uk/elh/grj1/linksa.html.
31 Based mainly on Mikra’ot Gedolot “HaKeter”. Ramat-Gan: Bar-Ilan University Press, 1992–2000, as well as on various editions.
32 The Dead Sea Scrolls Electronic Library. Brigham Young University, Revised Edition, 2006; part of the Dead Sea Scrolls Electronic Reference Library of E. J. Brill Publishers (ed. E. Tov; Leiden: E. J. Brill, 2006).

c. Lemmatization and Morphological Analysis

The key to the effective use of any software program of Scripture texts is the availability in the background of lemmatization and morphological analysis (grammatical tagging) of all the Hebrew, Greek, Aramaic, and Syriac text words. This lemmatization allows for a search of all the text words as well as for producing concordances.33 Accordance and DSSEL also provide valuable statistical information. In the case of Accordance, the distribution pattern of one or more words is displayed graphically in different colors. While word searches are available in several programs, Accordance is probably the only one that enables the creation of complete concordances. Such concordances can be produced within Accordance for any text range defined for this purpose: any combination of biblical books or parts thereof, combinations of verses, pericopes, or selections, such as, in Scripture, the presumed Deuteronomistic segments, Wisdom literature, or late biblical prose or poetry. Defining such ranges is subjective, and the ranges can be changed at any given moment.

The search facilities of the various programs differ; some are more sophisticated than others. The most sophisticated programs for the Hebrew Bible, Accordance and SESB, also allow for the search of morphological features, such as the frequency of the individual binyanim of the verb and unusual nominal forms, and also the search for combinations of lexical and grammatical information. In addition, SESB also allows for syntactical searches. The morphological and lexical information in the background of these programs also allows for grammatical and orthographical studies on the Qumran texts.

Although these morphological analyses usually follow one central source, such as the HALOT lexicon in the case of Hebrew Scripture,34 they are also determined by subjective judgments, and a lexeme that one scholar considers a noun may be considered by another to be a verb. One scholar may ascribe a given verbal form to the niph‘al, while another one considers it to be a hitpa‘el. One source discerns in Scripture one core meaning of dever, while another differentiates between two homographs. Accordance probably places too many groups of words under the heading of “particle”. This subjectivity comes to light when one reviews the differences between the existing morphological analyses.

The various morphological analyses that are at the base of the software packages should be considered separate sources since they reflect independent research. Seven different morphological analyses of Hebrew Scriptures are based on codex L / BHS. These include the Westminster Hebrew Old Testament Morphology of Groves-Wheeler, available in at least five software packages, the Werkgroep Informatica (including syntax and clause hierarchy) from Amsterdam, the Bar-Ilan analysis, the analysis of the Academy of the Hebrew Language, as well as additional commercial and private morphological analyses.

33 See http://www.balboa-software.com/semcomp/scbible2.htm for H. Hahne’s descriptions of these aspects in “Using a Computer in Biblical and Theological Studies, Lesson 6: Computer-Assisted Bible Study, Part 2.”
For the LXX, I know of just one such analysis, that of CATSS-Taylor-Wheeler, which is available in seven different software packages. For the Targumim, there is the CAL analysis, while for the Vulgate there is no such morphological analysis, and searching in that translation is restricted. In addition, four different morphological analyses exist for the non-biblical Qumran texts.

The availability of different lemmatizations enlarges the search facilities, since different lemmatizations and grammatical decisions yield different results. Accordingly, different search results for the exact same search executed in different software packages may derive from (1) differences in the base text, among them errors; (2) differences in tagging words and determining of lemmas; (3) differences due to the capabilities and assumptions of the software. In an example given by H. Hahne (n. 33), the various programs provide differing numbers for the occurrences of the pair of Greek particles μέν and δέ.

The bilingual CATSS database, in Accordance, allows for examination of Greek-Hebrew equivalences, of features in translation technique, and searches of Hebrew and/or Greek grammatical features. The interface of the same program presented by different software packages may differ, as for example in the presentation of the CATSS database in Accordance and Logos.

34 Koehler, L., W. Baumgartner and J. J. Stamm, The Hebrew and Aramaic Lexicon of the Old Testament, trans. and ed. under the supervision of M. E. J. Richardson. Leiden: E. J. Brill, 1994–2000.
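Two points made in this section lend themselves to a small illustration: a concordance is an index keyed by lemma rather than by surface form, and the same search can return different counts when two analyses lemmatize the same words differently. The sketch below is a toy model under stated assumptions; the token fields, the book labels, and both sample taggings are invented, and the homograph split merely echoes the dever example in the text without reproducing any real analysis.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    ref: str      # "Book chapter:verse" label (invented)
    surface: str  # word as it stands in the text
    lemma: str    # dictionary form assigned by the analysis
    pos: str      # part of speech assigned by the analysis


def concordance(tokens, books=None):
    """Index references by lemma, optionally limited to a set of books."""
    index = defaultdict(list)
    for token in tokens:
        book = token.ref.split()[0]
        if books is None or book in books:
            index[token.lemma].append(token.ref)
    return dict(index)


def count(tokens, lemma, pos=None):
    """Count tokens with the given lemma and, if requested, part of speech."""
    return sum(
        1 for token in tokens
        if token.lemma == lemma and (pos is None or token.pos == pos)
    )


# Analysis A (invented): a single lemma "dever".
TAGGING_A = [
    Token("BookA 1:1", "dever", "dever", "noun"),
    Token("BookA 2:3", "ha-dever", "dever", "noun"),
    Token("BookB 4:2", "dever", "dever", "noun"),
]

# Analysis B (invented): the same words split between two homographs.
TAGGING_B = [
    Token("BookA 1:1", "dever", "dever_1", "noun"),
    Token("BookA 2:3", "ha-dever", "dever_1", "noun"),
    Token("BookB 4:2", "dever", "dever_2", "noun"),
]

if __name__ == "__main__":
    print(concordance(TAGGING_A, books={"BookA"}))
    print(count(TAGGING_A, "dever"), count(TAGGING_B, "dever_1"))
```

The optional set of books plays the role of a user-defined text range: changing it changes the concordance without touching the underlying tagging.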

4. PRESENT AND FUTURE

Earlier we mentioned the drawback that many, if not most, of these software programs are commercial products and therefore may not be available to every student. This is true but, by the same token, not every student can afford to buy all the relevant textbooks needed for thorough study. University libraries come to our aid here, not only with regard to free access to books, but also by enabling free access to the software packages at three levels of availability: 1. the library offers such facilities, often at dedicated terminals limited to a given program; 2. the library or university services offer the programs on an internal university network, allowing the use of these programs on all computers on campus; and 3. the freest access is for those programs that the university allows its members to use in their homes on the web. University libraries buy site licenses for these programs, so that the conditions differ from university to university. Universities also provide the users with an ever-increasing number of electronic journals. At my own university users can freely access the following programs, in the different ways described above:35 Accordance (limited number of modules); Anchor Bible Dictionary; Dead Sea Scrolls in DSSEL; HALOT; SESB; TLG; Maagarim, the historical dictionary of the Hebrew language; Responsa project including the rabbinic literature; and many more.

35 The available programs are listed on http://www.mslib.huji.ac.il/main/siteNew/?page=26.

While these programs are actively used for study and research in the library, in study rooms on campus, and in home environments, the software packages are also making their way into the classroom. An increasing number of professors and students take their computers to the classroom, enabling students to do independent research, make precise observations, and challenge the professors by examining data while the professor is speaking. In text-based classes, like my own Septuagint class, use of the computer replaces that of printed books almost entirely, excluding the variants of the LXX that are not included in the software programs. The bulk of my preparations for that course are now done on the computer, although my notes are still hand-written and I continue to print out certain searches rather than store them.

At conferences, too, an increasing number of scholars bring their laptops. The texts of the Bible and the Dead Sea Scrolls are consulted while the lecture is in progress. For example, one can easily research various aspects of the translation technique of the LXX.

As for the classroom, most American universities require the professors to post their syllabi on the web, and in this way the professors also announce their requirements for the classes and share papers with the students. This development is also taking place in Israel. For this purpose, many of my colleagues in Israel use the program called “HighLearn” that was officially adopted by the university, while I myself prefer to place the same material on my website. These procedures do not include software packages for the use of the ancient sources, since all of these are commercial programs, but they do contain links to freeware on the Internet. This specific program also allows the professor to send notices, to calculate grades, to send assignments, and to post texts.

In this context, I should also mention the ever-increasing use of web-based multi-user programs for projects, with the aid of which users can at any given moment compose and post papers, which can then be reviewed by editors and monitored continually to the printing stage. I refer to programs now known as Wiki environments, such as Confluence.36 In the course of this work, word-processing programs are used and the tools mentioned above can also be used.

No one knows what the future has in store. We can only give a prognosis of some features and developments, and express some wishes. I would hope that the programs themselves will be further developed, streamlined, and made cross-platform. I hope the number of modules will be greatly expanded and that the individual words in the biblical texts and translations will somehow be provided with hyperlinks, not only to lexicons and atlases, but also to commentaries. I hope that libraries will make more and more software packages available on their networks. I also hope that the new Wiki environments will include more software packages among their possibilities.

In the classroom of the future, students should definitely have their computers with them alongside printed books, or even in their stead. These computers should include the Hebrew Bible together with the main versions, perhaps together with modern translations. Secondary literature such as commentaries can be added subsequently. At the same time, we should never forget that computers are mere tools that help us to perform our research.
The use of computers is not a goal in itself. Our main goal is to research the Bible, its versions, and its language. Much research is executed on computers, but most research will nevertheless involve printed books. The integration of electronic data in the research of the future will improve that research as well as our teaching.

36 http://www.atlassian.com/software/confluence/.

DISPLAYING HEBREW AND ARAMAIC ON HANDHELD DEVICES THAT LACK PROPER COMPLEX SCRIPT SUPPORT

DRAYTON C. BENNER
NEAR EASTERN LANGUAGES AND CULTURES DEPARTMENT, UNIVERSITY OF CHICAGO, CHICAGO, IL, USA

INTRODUCTION

In the early days of computing, keyboards and systems of data representation presupposed the use of Latin characters and the conventions governing their use in modern Germanic and Romance languages. There was little in the way of flexibility provided for the many other systems of writing present in the modern world, to say nothing of the ancient world.

Technology for displaying languages using complex scripts has made impressive advances in recent years. While there are many advances still to be made, there is much worthy of praise, from the establishment and expansion of the UNICODE standard to the development of “smart fonts.” The establishment of UNICODE has allowed for a system of data representation that is no longer font-dependent.1 The development of “smart fonts” has allowed for context-sensitive placement of glyphs according to rules embedded within fonts themselves, not according to rules rewritten by every application developer wishing to display text using a “smart font.” These developments have paved the way for some excellent new fonts for Hebrew and Aramaic.2

1 For an introduction to the UNICODE standard, see The Unicode Consortium, Allen, J. D., et al., ed. The Unicode Standard 5.0. Upper Saddle River, NJ: Addison Wesley Professional, 5th ed., 2006.

2 Among the various scripts used in different times and places for Hebrew and Aramaic, this paper focuses specifically on Aramaic square script and includes the ways in which it was augmented by the Massoretes.

Traditional Hebrew and Aramaic fonts have faced two difficulties. First, before the creation of UNICODE, they had to use font-specific values to represent each desired glyph, requiring users who share Hebrew or Aramaic text to use the same font.3 Second, before the creation of “smart fonts,” the placement of vowels, cantillation marks, and symbols was difficult and generally left much to be desired aesthetically. For example, a font could only place a given vowel or cantillation mark in a limited number of discrete places in relation to the consonant around which it was placed. For each such placement, a different value had to be chosen to represent that position internally, and the user had to use a different keyboard input, forcing the user to shoulder an undesirable amount of work, leading to inconsistency among users. Moreover, matters became nearly unmanageable when there were multiple glyphs required around a particular consonant.

3 See, for example, de Moor, J. C. “Coding Proposal Submitted to ‘Bible et Informatique’.” In Bible and Computer: The Text: Proceedings of the First International Colloquium, Association Internationale Bible et Informatique, Louvain-la-Neuve (Belgique), 2–3–4 Septembre 1985, 179–92. Paris: Champion, 1986.

Recently, some strikingly beautiful Hebrew and Aramaic fonts have been created that overcome these traditional typographic pitfalls by representing the data according to the UNICODE standard, which now includes Hebrew, and making them “smart fonts,” so that the glyphs representing vowels, cantillation marks, and symbols are combined in intelligent and aesthetically pleasing ways around the consonants. They take advantage of some of the freedom ancient scribes had when fitting these vowels, cantillation marks, and other symbols together. I particularly commend SBL Hebrew, produced by John Hudson of Tiro Typeworks at the behest of the Society of Biblical Literature, and Ezra SIL, produced by the Summer Institute of Linguistics.4 Both of these fonts are powerful and aesthetically pleasing. Figures 1 and 2 show Gen 1:1–3 in SBL Hebrew and Ezra SIL, respectively. Figures 3 and 4 show Ps 27:13, one of the most troublesome verses in the Hebrew Bible from the perspective of displaying all of the Massoretic marks, in SBL Hebrew and Ezra SIL, respectively, in order to show that these two Hebrew fonts look good even when the Massoretes complicated matters the most.5

Figure 1. Gen 1:1–3 using SBL Hebrew.

4 “SBL Hebrew Font;” available from http://www.sbl-site.org/educational/BiblicalFonts_SBLHebrew.aspx; Internet; accessed 11 October 2008; “Ezra SIL Hebrew Unicode Font;” available from http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=EzraSIL_Home; Internet; accessed 11 October 2008.

5 For a description of the upper and lower puncta in Ps 27:13, see Israel Yeivin, Introduction to the Tiberian Masorah, trans. and ed. E. J. Revell (Missoula, Mont.: Scholars Press, 1980), §79.

Figure 2. Gen 1:1–3 using Ezra SIL.

Figure 3. Ps 27:13 using SBL Hebrew.

Figure 4. Ps 27:13 using Ezra SIL.

At the same time as these advances in the handling of complex scripts for desktop computers, the mobile market has been expanding rapidly, leading to a rise in demand for software on the go. Handheld devices, however, have sought to meet their tight constraints on speed and storage by excising anything in the operating system that might be extraneous. As such, they generally do not include complex script support, and some do not support UNICODE at all. Thus, those wishing to display Hebrew and Aramaic texts on such devices have not been able to take advantage of these developments in complex script handling, and vocalized texts could not be presented in an aesthetically pleasing fashion.

The remainder of this paper presents the way in which I have been able to overcome these limitations in complex script support on two handheld platforms, namely Windows Mobile and Palm, and to display Hebrew and Aramaic texts with all the beauty that recent UNICODE-based smart fonts have allowed. Thus far, this includes the Hebrew Bible with all of the vowels, cantillation marks, and symbols contained in the Michigan-Claremont-Westminster database, an abridged BDB dictionary, and the non-biblical Qumran texts with all of the editorial symbols contained in the database produced by Dr. Martin Abegg, Jr.

POSSIBLE APPROACHES

There are at least three possible approaches one could take to make use of these new Hebrew and Aramaic fonts despite the lack of complex script support on mobile platforms.

Inadequate Possible Approaches

First, one could write an algorithm that would take as inputs the dimensions of each glyph along with an ideal location for each zero-width glyph (namely vowels, cantillation marks, and some symbols) around its consonant.6 Then, for each non-zero-width glyph (namely consonants and some symbols), the algorithm could determine the placement of each zero-width glyph around it by placing each zero-width glyph as close as possible to its ideal spot in relation to the non-zero-width glyph while retaining some minimum amount of space between each pair of glyphs. This approach has the advantage of being highly flexible. The algorithm could handle any desired combination of glyphs with somewhat reasonable results.

However, there are two serious drawbacks to this approach. First, it would forfeit the tremendous amount of time and effort put forth by the developers of Hebrew and Aramaic “smart fonts” into the placement of particular combinations of glyphs. To be sure, many combinations of glyphs would look good under this approach. However, there are many combinations of glyphs that the human eyes of the creators of these fonts will have placed better than a straightforward algorithm could do. Second, this approach would lead to inconsistency. Users who view Hebrew text in, for example, SBL Hebrew on their laptop will recognize that the Hebrew text in SBL Hebrew on their handheld device looks different.
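To make the first approach concrete, the following is a minimal Python sketch of one possible greedy placement routine. It is only an illustration under stated assumptions, not the algorithm of any particular typesetting system: the names `Box`, `too_close`, and `place_marks` are hypothetical, coordinates are taken relative to the origin of the base consonant, and a colliding mark is simply slid sideways until it has the required clearance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in font units: left x, bottom y, width, height."""
    x: int
    y: int
    w: int
    h: int


def too_close(a: Box, b: Box, gap: int) -> bool:
    """True if a and b overlap or are separated by less than `gap` units."""
    return (a.x < b.x + b.w + gap and b.x < a.x + a.w + gap
            and a.y < b.y + b.h + gap and b.y < a.y + a.h + gap)


def place_marks(ideal: list[Box], gap: int = 1, max_shift: int = 50) -> list[Box]:
    """Greedily place each mark at the free position nearest its ideal spot.

    Marks are handled in input order. A mark that collides with an
    already placed mark is slid horizontally (0, +1, -1, +2, -2, ...)
    until it has at least `gap` units of clearance.
    """
    placed: list[Box] = []
    for mark in ideal:
        for step in range(max_shift + 1):
            candidates = [0] if step == 0 else [step, -step]
            chosen = None
            for dx in candidates:
                trial = Box(mark.x + dx, mark.y, mark.w, mark.h)
                if not any(too_close(trial, other, gap) for other in placed):
                    chosen = trial
                    break
            if chosen is not None:
                placed.append(chosen)
                break
        else:
            # No free spot within max_shift: fall back to the ideal position.
            placed.append(mark)
    return placed
```

A routine of this kind handles arbitrary combinations of marks, which is the flexibility noted above, but its output depends entirely on the sliding rule and cannot reproduce placements that a font designer tuned by eye.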
6 Yannis Haralambous describes one such algorithm in “Tiqwah: A Typesetting System for Biblical Hebrew, Based on TEX.” In Bible and Computer: Desk and Discipline: The Impact of Computers on Biblical Studies: Proceedings of the Fourth International Colloquium, Association Internationale Bible et Informatique, Amsterdam, 15–18 August 1994, 445–70. Paris: Champion, 1995.

Since the creators of these fonts have done such a good job with the placement of the glyphs when they are in combination with one another, it is worthwhile to try to match their placement of the glyphs. The second approach thus would seek to replicate all of the rules embedded in these “smart fonts” in one’s application. This would avoid both of the difficulties in the aforementioned approach since the glyphs would be combined in precisely the same manner as designed by the creators of the fonts. However, there are two serious drawbacks to this approach. First, this represents a tremendous amount of duplicated effort. The creators of the fonts have written a non-trivial number of rules governing the combining of glyphs. Replicating these rules already embedded in the fonts would require a great deal of coding effort. Second, this approach does not easily accommodate the addition of a new font. For each UNICODE Hebrew “smart font” one wished to support, one would have to code an entirely new set of rules regarding the combinations of the glyphs in that font.

A Successful Approach

Finally, there is a third approach, one that avoids the weaknesses of the former two. In outline, it is quite straightforward; the presentation of the Hebrew Bible in SBL Hebrew serves as an example. First, one false-renders the entire UNICODE Hebrew Bible under Windows using Microsoft’s complex script library Uniscribe, allowing Uniscribe to determine precisely which glyphs from SBL Hebrew should be used and the precise placement of each. One then makes additional calls to Uniscribe to gather this information concerning the glyphs it used and the placement of each, and one stores this information in a private database. This private database is then utilized on the handheld platforms to place each particular glyph in its appropriate place. Thus, this approach takes advantage of the work done by the creator of SBL Hebrew and takes advantage of Microsoft’s work in handling complex scripts.
The application developer does not have to redo the work already embedded in the “smart font” and does not have to write general complex script support for the mobile operating system to deal with “smart fonts,” yet the resulting Hebrew Bible on the mobile platform appears identical to its counterpart under Windows. The results, in my opinion, are outstanding. Figures 5 and 6 show screenshots from a Windows Mobile 6 emulator of the Hebrew Bible using SBL Hebrew, beginning at Gen 1:1 and Ps 27:13, respectively. For the sake of variety, Figures 7 and 8 show screenshots from a Windows Mobile 6 emulator using Ezra SIL, beginning at Exod 15:1 and Isa 40:1, respectively.
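The handheld side of the third approach can be sketched in a few lines of Python. This is a hedged illustration only: the paper does not give the layout of the private database, so the record fields (`glyph_id`, `dx`, `dy`), the sign convention for right-to-left text, and the function name `layout` are all assumptions. On the desktop side, the paper does not say which Uniscribe calls were used; `ScriptShape` and `ScriptPlace` are the Uniscribe functions that return glyph indices and placement data.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class GlyphRecord:
    """One precomputed entry of the (hypothetical) private database."""
    glyph_id: int  # index of the glyph inside the font
    dx: int        # horizontal offset from the previous glyph's origin
    dy: int        # vertical offset from the baseline of the line


def layout(records: Iterable[GlyphRecord], start_x: int,
           baseline_y: int) -> Iterator[tuple[int, int, int]]:
    """Yield (glyph_id, x, y) drawing positions by accumulating stored offsets.

    No shaping logic runs on the device: every decision about which
    glyph to use and where to put it was made during false-rendering.
    """
    x = start_x
    for rec in records:
        x += rec.dx
        yield rec.glyph_id, x, baseline_y + rec.dy
```

For right-to-left text the stored horizontal offsets would simply be negative, so a consonant followed by a zero-width vowel might be stored as `GlyphRecord(10, -12, 0)` and `GlyphRecord(55, 0, -3)`. The device then only needs to draw individual glyphs at given coordinates, which does not require complex script support.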

Figure 5. Screenshot from a Windows Mobile 6 emulator beginning with Gen 1:1 using SBL Hebrew.

Figure 6. Screenshot from a Windows Mobile 6 emulator beginning with Ps 27:13 using SBL Hebrew.

Figure 7. Screenshot from a Windows Mobile 6 emulator beginning with Exod 15:1 using Ezra SIL.

Figure 8. Screenshot from a Windows Mobile 6 emulator beginning with Isa 40:1 using Ezra SIL.

Issues with the Successful Approach

There are five types of issues with this approach that make it not quite as straightforward as the outline above. First, there are issues related to storage space. Second, there are issues related to data representation. Third, there are issues related to the scaling of fonts. Fourth, there are issues related to data input by the user. Fifth, there are issues related to searching. Each will be discussed in turn.

The first issue with this approach is one of storage space. This approach requires that information be stored on the mobile device regarding where each individual glyph should be placed in relation to the previous glyph for each supported font. Thus, the storage space scales linearly with the size of the text and also scales linearly with the number of fonts supported. One can cut back on this storage space by requiring SBL Hebrew and Ezra SIL to agree on a common sequence of UNICODE values. The two generally choose glyphs that are analogous to one another, and one can tweak them in the few cases when they do not. For example, when there are multiple zero-width glyphs surrounding a single non-zero-width glyph, one or both of the fonts often adds a thinspace or a hairspace. If they do not match, one can choose either representation and adjust the offsets for the glyphs used by the other font. By so doing, the size of the file still grows linearly with the number of fonts, but the constant is smaller, yielding a smaller overall size. While size is much more limited on handheld devices than on personal computers, this is not a major obstacle to displaying a Hebrew Bible using this approach. There are not many Hebrew Unicode “smart fonts” that have been produced to this point, and they will probably not proliferate faster than Moore’s Law will accommodate additional storage space.
At present, having an entire Hebrew Bible supporting both SBL Hebrew and Ezra SIL requires on the order of ten megabytes, a fairly trivial amount even by mobile standards. Moreover, if UNICODE Hebrew “smart fonts” do proliferate before mobile operating systems embrace complex script support, users will probably be satisfied with just a subset of these fonts supported, anyway.

The second set of issues with this approach relates to data representation. Both SBL Hebrew and Ezra SIL use glyphs for which there is no corresponding UNICODE value. In general, each of these fonts transforms UNICODE input into a set of glyphs from the font. In some cases, a one-to-one function maps these UNICODE input values to the particular glyphs, but that is not true in all cases. Not all of these other cases are problematic, but some are: the fonts contain glyphs for which there is no corresponding UNICODE value. Between SBL Hebrew and Ezra SIL, there are nine such glyphs when displaying the Hebrew Bible. For example, SBL Hebrew contains a glyph consisting of a hateph-patah with a medial meteg. The meteg can, of course, go to the left or right of the hateph-patah, but it can also go in the middle of it, and the creators of SBL Hebrew created a new glyph for this case.7 One learns via function calls to Uniscribe when this glyph is used just as with any other glyph, but how does one represent that information in the database stored for use on the mobile device? There is no perfect solution that allows one to remain tied tightly and neatly to the UNICODE standard. Thus, I represented these glyphs with values drawn from one of the private use areas in the UNICODE standard. This does, of course, hamper the prospects of exporting the texts to another program, but since no other program on the mobile device could display the text properly, anyway, this is not entirely problematic at this point. Nonetheless, future development would hopefully allow for some means of exporting to other programs, even if it were simply to copy some text in order to email it to someone not using a mobile device.
This would have to be accomplished either by writing additional code to map the stored text back entirely to standard UNICODE values or by including the original UNICODE text along with the modified version simply for this purpose. The latter option would be simpler but comes at the expense of more storage space. Matters are a bit more complicated in the case of the Palm platform, which does not even support UNICODE and does not support fonts with more than 256 glyphs. For the Palm, we had to break the Ezra SIL font into multiple fonts and create mappings between UNICODE values and the appropriate values in these newly created fonts.

7 The following nine glyphs in SBL Hebrew and/or Ezra SIL are used when displaying the Hebrew Bible as encoded in the Michigan-Claremont-Westminster database, but they have no corresponding UNICODE value: final kaph with shewa, final kaph with qamets, lamed with holem, lamed with dagesh and holem, he with raphe, taw with raphe, final mem with hireq placed for the perpetual qere in the word for “Jerusalem,” hateph-patah with medial meteg, and hateph-segol with medial meteg.

Third, there is the issue of the scaling of fonts. Ideally, the user should be able to set the font size to a value that is appropriate to that user’s particular device and preferences. However, when false-rendering under Windows, one has to choose a particular font size at which to determine the placement of each glyph. One solution would thus be to allow a certain number of pre-defined font sizes available to the user. One would then false-render under Windows at each of those font sizes and store all of the information about the offsets. However, this is quite undesirable. The size of the database scales linearly with the number of font sizes one allows. Moreover, having a small number of available font sizes unnecessarily limits the user. Instead, one can false-render under Windows at one font size and save the offset values for that font size. One can then scale each of the offsets based on the particular font size chosen by the user. This does introduce one possible source of error, the only way in which the text might be displayed differently on the mobile device in comparison with a personal computer. The information returned by Microsoft’s Uniscribe library about the placement of each glyph is discrete. Thus, when scaling it, round-off errors might occasionally cause a glyph to be placed one pixel over, whether left or right, from where it would have otherwise been placed.
This is mostly mitigated by false-rendering under Windows using a massive size for the font. Using a massive font size minimizes the discretization effect when scaling. Of course, if one makes the font size too massive, then that impacts the size required to store this information in the database. Thus, when false-rendering under Windows, I use the largest possible font that allows me to capture every offset from one glyph to the next in just one byte of storage. This is much larger than any user is likely ever to want, thus making it rare that glyphs are misplaced by a single pixel while keeping storage space to a minimum.

The fourth issue surrounding this approach relates to Hebrew and Aramaic input by the user. This approach I have presented relies on the fact that in this field, one deals primarily with closed corpora of texts. Thus, this approach works for presenting texts that can be processed in advance; it does not work for miscellaneous texts a user may wish to input.

The fifth and final issue with this approach deals with searching. One of the greatest benefits of the digitization of texts is the ability to perform searches on the text. With this approach to displaying Hebrew and Aramaic text, one can no longer effectively search the text using a brute-force algorithm. This problem may be more of a blessing than a curse, however, as a brute-force algorithm is slower than a variety of other searching algorithms, which might require some pre-processing of the text and additional storage space but are much speedier than a brute-force approach.8
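The scaling step described under the third issue can be illustrated with a short Python sketch. The function name `scale_offset`, the rounding rule, and all of the numbers in the usage note are illustrative assumptions rather than details taken from the implementation described in this paper.

```python
def scale_offset(stored: int, reference_size: int, user_size: int) -> int:
    """Scale an offset measured at reference_size to user_size.

    The result is rounded to the nearest pixel using integer arithmetic
    only; exact halves are rounded upward.
    """
    return (2 * stored * user_size + reference_size) // (2 * reference_size)
```

Suppose, hypothetically, that a mark ideally sits 0.37 of the font size away from the previous glyph, so that at a user size of 24 it belongs at 8.88, which rounds to pixel 9. False-rendering at size 10 stores the already rounded value 4, and `scale_offset(4, 10, 24)` gives 10, one pixel off. False-rendering at size 200 stores 74, which still fits in one byte, and `scale_offset(74, 200, 24)` gives 9. The larger reference size leaves less rounding error to be magnified by the scaling.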

8 For techniques for searching under tight time, memory, and disk space constraints, see Witten, I. H., A. Moffat, and T. C. Bell. Managing Gigabytes: Compressing and Indexing Documents and Images. San Francisco: Morgan Kaufmann, 2nd ed., 1999.

CONCLUSIONS

In general, the approach advocated here allows for an aesthetically appealing display of Hebrew and Aramaic texts with all of the marks one might desire, whether they be vowels, cantillation marks, Massoretic symbols, or modern editorial symbols. Indeed, it is the first time Hebrew and Aramaic texts with all of these marks have been presented in such an aesthetically pleasing fashion on handheld devices. At present, I have implemented this approach with the font Ezra SIL on Windows Mobile and Palm and also with SBL Hebrew on Windows Mobile as part of contract work I did in the summer of 2007 for Olive Tree Bible Software, a company that specializes in Bible software for mobile devices.9 However, the usefulness of this approach is not limited to these platforms, these texts, or even these languages. These same general techniques could be used to display texts in a wide variety of languages requiring complex scripts on an equally wide variety of platforms lacking complex script support. This approach can be used with other languages of interest to biblical scholars, such as Syriac, that have complex scripts. Whenever there are advances in the handling of complex scripts on one platform, these advances could be reflected across a whole host of other platforms.

9 I would like to thank Olive Tree Bible Software (www.olivetree.com) for its support of this work, and I would specifically like to thank Drew Haninger and Stephen Johnson of Olive Tree Bible Software for their support with this project.

BIBLIOGRAPHY

“Ezra SIL Hebrew Unicode Font.” Available from http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=EzraSIL_Home. Internet. Accessed 11 October 2008.

Haralambous, Y. “Tiqwah: A Typesetting System for Biblical Hebrew, Based on TEX.” In Bible and Computer: Desk and Discipline: The Impact of Computers on Biblical Studies: Proceedings of the Fourth International Colloquium, Association Internationale Bible et Informatique, Amsterdam, 15–18 August 1994, 445–70. Paris: Champion, 1995.

de Moor, J. C. “Coding Proposal Submitted to ‘Bible et Informatique’.” In Bible and Computer: The Text: Proceedings of the First International Colloquium, Association Internationale Bible et Informatique, Louvain-la-Neuve (Belgique), 2–3–4 Septembre 1985, 179–92. Paris: Champion, 1986.

“SBL Hebrew Font.” Available from http://www.sbl-site.org/educational/BiblicalFonts_SBLHebrew.aspx. Internet. Accessed 11 October 2008.

Unicode Consortium, The. The Unicode Standard 5.0. Edited by J. D. Allen et al. Fifth edition. Upper Saddle River, NJ: Addison Wesley Professional, 2006.

Witten, I. H., A. Moffat and T. C. Bell. Managing Gigabytes: Compressing and Indexing Documents and Images. Second edition. San Francisco: Morgan Kaufmann, 1999.

Yeivin, I. Introduction to the Tiberian Masorah. Translated and edited by E. J. Revell. Missoula, Mont.: Scholars Press, 1980.

THE HEXAPLA PROJECT: TRADITIONAL SCHOLARSHIP MEETS MODERN TECHNOLOGY

ELIZABETH ROBAR
CAMBRIDGE UNIVERSITY, ENGLAND, UK

Origen’s Hexapla is one of the great literary works of the ancient world that remains invaluable for scholarship of various kinds, foremost textual criticism and the study of the Septuagint. The available critical edition, however, is now over one hundred years old, and in the years since a wealth of material has been found with better (and even previously unknown) readings.

The Hexapla Institute was founded in 2001 to produce a new critical edition, but it set its sights beyond a traditional print edition. It aimed also at an online edition, one with all the advantages of electronically edited texts: complex search and analysis capabilities, multiple display formats (including the ability to view the continuous text of any known manuscript), and a storage and retrieval system to enable collaboration with future scholarly projects.

But accomplishing such a goal was far beyond the competence of traditional scholarship and its usual companion, the publisher of printed books. The online system imagined would involve work as complex and, at times, as laborious as the text-critical work itself. The Institute did not realize early on the magnitude of its task. What follows is the story of that gradual realization and the various successes and failures accompanying it. From a review of the history, we hope to offer suggestions to other projects with similar ambitions, namely, the marriage of traditional scholarship with modern technology. We cannot pretend to construct a comprehensive set of guidelines, but we hope to make a small step in identifying the more significant challenges that we faced, and that other projects seem likely to face in their turn.

Academic defense of the project

When, in the third century CE, Origen gathered the main contemporary versions and revisions of the Greek Old Testament text into a six-column document (thus, “Hexapla”), he drastically altered the future transmission of the Septuagintal text. His fifth column was a conglomerate of all the texts known: the Old Greek supplemented by the Hebrew, mostly through the intermediary of the three Jewish revisers (Aquila, Symmachus and Theodotion). In subsequent history, this amalgamation was reproduced as the new version of the Greek Scriptures, replacing the former standard, the Old Greek. The replacement was so comprehensive that it is no longer possible to reconstruct the oldest text with any confidence unless, in some measure, one first reconstructs the Jewish revisions, that is, essentially the entire Hexapla. Once each of the revisions is understood, the supplements to the fifth column can be identified and the Old Greek itself reconstructed. This Old Greek is one of the best witnesses to the oldest Hebrew (and thus vital for textual criticism of the Hebrew Bible), as well as to the world of early Jewish Hellenism. The texts of the three revisions provide a unique door into early Jewish-Christian apologetics and exegesis. For multiple fields of study, unraveling the Hexapla is highly to be desired.

Theoretical defense of the project

Such is the traditional rationale for reconstructing the Old Greek and the three revisions. But the advent of electronic text editing has fiercely challenged the supremacy and even value of the (eclectic) critical edition: might it not be better to provide the readers with all possible readings, without predisposing them according to the editor’s own bias, and thus enable them to choose that which is most appropriate for their own needs?1 The eclectic edition can indeed be a double-edged sword: it provides the illusion of a single, uninterrupted “original” text, and consequently it can mislead as to the true manuscript evidence. On the other hand, it provides a wealth of scholarship and skilled editorial insight in a compact and accessible format. Without a reconstruction provided by the editor, each reader is essentially compelled to create his own, but without the benefit of any expertise but his own. The naive reader may be misled by a critical edition, but the scholar will understand its true nature and purpose and will make full use of the apparatus, which provides all the data necessary to be able to agree or disagree with every editorial decision, and, if so desired, even to reconstruct a new critical text. And even the naive reader is likely better served by a naive understanding of the skilled editor’s critically reconstructed text than by a naive understanding of a single manuscript of unknown character.

Consequently, in spite of concerns raised in the field of electronic text editing about the propriety of eclectic critical editions, it still seemed the best presentation of hexaplaric scholarship for the print edition. (The online edition, unhampered by the printed text’s limitation to a single display format, would be able to display either eclectic or diplomatic texts.)

1 For the seminal reading on the debate, see Kathryn Abram’s helpful 2002 essay, “Electronic Textuality: A Bibliographic Essay,” available online at http://www.mantex.co.uk/ou/resource/elec_txt.htm.

Initial distribution of the work

The actual task of reconstructing a new critical edition of the Hexaplaric fragments was enormous. More manuscripts, patristic sources, and new editions of several Church Fathers and catenae had become available since Field’s 1875 edition. Collating and evaluating all the fragments required skills in Hebrew, Greek, Latin, Syriac, Coptic and Armenian, not to mention acquiring physical access to the various manuscripts. The Hexapla Institute distributed the burden, on a book-by-book basis, to text-critical scholars across the world who would individually prepare the critical text and apparatus for their assigned books. A graduate fellow was tasked with collecting their data, entering it into a central repository, and from that repository producing first the print edition, and then the online edition.
The assumption that a graduate fellow alone could accomplish everything required in the field of information technology nearly proved fatal. This paper will focus not on the nature of the text-critical scholarship,2 but rather on the interaction of traditional text-critical scholarship with the information technology necessary to meet the project’s goals.

INITIAL DATABASE PROPOSAL

To ever have an online edition, the members of the Hexapla project knew they would need a database. To that end, they developed an XML schema that all contributors would hand-code, as demonstrated below for Aquila’s reading in Gen 1:1.

1
Gen 1:1
αʹ (> 14-550-615; ἕτεροι Bas; alii Ambr)
Ἐν ἀρχῇ ἐποίησεν ὁ θεὸς τὸν οὐρανὸν καὶ τὴν γῆν.
all] 9122 Diod Ps 18 Philop 92
ἐν ἀρχῇ] 135 14commEusEm-78-413-550commEusEm-615commEusEm 343-344 Ambr Exam I 16 Bas Hex 12 GregNys Hex 69 Ish 12 EusEmArm 4 Hi Quaest 3
τόν—γῆν] Hi Ep LVII 11.3
ἐν κεφαλαίῳ ἔκτισεν θεὸς σὺν τὸν οὐρανὸν καὶ σὺν τὴν γῆν

Figure 1: Database Structure “A”

The proposed XML schema exemplified many of the obstacles that threatened to prevent the project from achieving its goals. The schema was in truth simply a critical text with apparatus layout that had enclosed its labels in XML-style angle brackets. As seen below, the XML and final format are identical, only with some of the labels removed, and the carriage returns replaced by vertical pipes. The XML was a marked-up copy of the final format, not a base of data for constructing the final format. Among other implications, this meant that whenever the final format changed (three times in the first four years), the XML for all previously entered data had to be revisited and updated accordingly.

2 Which has been well covered elsewhere, e.g. in Peter Gentry, “The Role of the ‘Three’ in the Text History of the Septuagint: Aspects of Interdependence of the Old Greek and the Three in Ecclesiastes,” Aramaic Studies 4.2 (2004) 153–92.

XML:
1
Gen 1:1
בראשית ברא אלהים את השמים ואת הארץ׃
ἐν ἀρχῇ ἐποίησεν ὁ θεὸς τὸν οὐρανὸν καὶ τὴν γῆν.
αʹ (> 14-550-615; ἕτεροι Bas; alii Ambr)
all] 9122 Diod Ps 18 Philop 92
ἐν ἀρχῇ] 135 14commEusEm-78-413-550commEusEm-615commEusEm 343-344 Ambr Exam I 16 Bas Hex 12 GregNys Hex 69 Ish 12 EusEmArm 4 Hi Quaest 3
τόν—γῆν] Hi Ep LVII 11.3
ἐν κεφαλαίῳ ἔκτισεν θεὸς σὺν τὸν οὐρανὸν καὶ σὺν τὴν γῆν

Final format:
Genesis 1:1
HT
בראשית ברא אלהים את השמים ואת הארץ׃
LXX
ἐν ἀρχῇ ἐποίησεν ὁ θεὸς τὸν οὐρανὸν καὶ τὴν γῆν.
αʹ (> 14-550-615; ἕτεροι Bas; alii Ambr)
all 9122 Diod Ps 18 Philop 92 | ἐν ἀρχῇ 135 14commEusEm-78-413-550commEusEm-615commEusEm 343-344 Ambr Exam I 16 Bas Hex 12 GregNys Hex 69 Ish 12 EusEmArm 4 Hi Quaest 3 | τόν—γῆν Hi Ep LVII 11.3
ἐν κεφαλαίῳ ἔκτισεν θεὸς σὺν τὸν οὐρανὸν καὶ σὺν τὴν γῆν.

Figure 2: XML compared to the final format

Insufficiently understanding databases

The first obstacle was understanding the nature of a database. A database is intended to enable greater manipulation of the data than would be possible from a more traditional, “user-friendly” format. The database stores data in discrete units, for example, separating out each reading (e.g., ἐν κεφαλαίῳ), attribution to a reviser (e.g., ἕτεροι), and individual attesting witness (e.g., Bas Hex 12). This contrasts markedly with the traditional apparatus exemplified here, in which one must decode the various apparatus sections for each bit of information. Basil is in the negative translator (later, “attributions”) apparatus: he does not attribute the text to Aquila, but rather to ἕτεροι. The witnesses apparatus reveals that he does not attest to the entire verse, but only to the first two words. The witnesses apparatus lemma indicates that these two words correspond to ἐν ἀρχῇ in the Old Greek, but the preferred reading reveals that Basil’s actual reading is ἐν κεφαλαίῳ (were Basil’s reading different from the preferred reading, there would be an additional “variants” apparatus).

The respective purposes of the database and apparatus are fundamentally different: the database stores data in a format that will permit later manipulation; the storage format is not directly related to the final display format. The apparatus, however, is as compact (and thus often cryptic) as possible to convey all the information in a minimum of space.

In a database, the readings by 14, 550 and 615 would each, individually, be associated with an omitted attribution (none made at all), Basil with the attribution “others” in Greek (ἕτεροι), and Ambrose with “others” in Latin (alii). It would then be straightforward to search on all instances in which 14 and 78 share an identical reading but differ on their attribution. To do such a search from the proposed XML (or print edition), however, would be far from straightforward: it would require carefully (manually) parsing the positive and negative translator/attributions apparatus. To train a computer to parse the apparatus would be difficult and never fully trustworthy, and so not acceptable for scholarship.
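The search described above can be sketched in a few lines of Python once each attestation is a discrete record. This is only an illustration: the `Attestation` record and its field names are hypothetical, and the attribution of witness 78 to Aquila is inferred from its appearing in the witness list but not in the negative section of the sample apparatus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attestation:
    """One discrete unit: a witness, what it reads, and to whom it attributes it."""
    reference: str
    witness: str
    reading: str
    attribution: Optional[str]  # None means the witness gives no attribution


# Records taken from the Gen 1:1 example (field layout is an assumption).
ATTESTATIONS = [
    Attestation("Gen 1:1", "14", "ἐν κεφαλαίῳ", None),
    Attestation("Gen 1:1", "78", "ἐν κεφαλαίῳ", "αʹ"),
    Attestation("Gen 1:1", "550", "ἐν κεφαλαίῳ", None),
    Attestation("Gen 1:1", "615", "ἐν κεφαλαίῳ", None),
    Attestation("Gen 1:1", "Bas Hex 12", "ἐν κεφαλαίῳ", "ἕτεροι"),
]


def shared_reading_different_attribution(records, witness_a, witness_b):
    """Readings both witnesses attest at the same place but attribute differently."""
    by_a = {(r.reference, r.reading): r for r in records if r.witness == witness_a}
    hits = []
    for r in records:
        if r.witness != witness_b:
            continue
        other = by_a.get((r.reference, r.reading))
        if other is not None and other.attribution != r.attribution:
            hits.append((r.reference, r.reading, other.attribution, r.attribution))
    return hits


print(shared_reading_different_attribution(ATTESTATIONS, "14", "78"))
```

The same function applied to 14 and 550 returns an empty list, since both omit the attribution.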
What should be a basic search, then (readings which 14 and 78 share but attribute differently), would in fact be impossible (or, at best, unreliable) from the proposed XML. Almost any search at all, such as a list of all readings for Gen 1:1 attributed to Aquila, would be basic with a database with discrete data units, but doing so from the proposed XML would require the same manual parsing of the attributions and witnesses apparatus: to first compile a list of all witnesses mentioned in the witnesses apparatus, then subtract those in the negative section of the attributions apparatus. The very purpose of the database, to enable extensive manipulation, could not be realized with the initial XML proposal.

Inadequate understanding of XML

A second problem was an inadequate understanding of the chosen database format, XML. Valid (strictly speaking, well-formed) XML has strict requirements: there must be a single root element that contains all other content elements, all elements must be nested hierarchically according to parentage, and all tags must be present, properly spelled, and identical in case between opening and closing tag. In the early work submitted by contributors, tags were regularly missing or misspelled. This rendered the entire XML invalid and eliminated any benefit of using the coding scheme, since no XML processor would accept it. It became simply an inconsistently formatted text file. The time-consuming and painstaking efforts of the scholars to learn the XML schema and hand-code it came to nothing with the occurrence of a single mistake in the markup. And the time required to detect and correct the mistakes was greater than that required to code new XML directly from the final format. An inadequate understanding of XML nullified its usefulness and largely wasted the scholars’ efforts.

Foreign languages and ASCII fonts

As if these difficulties were not enough, the project was also mired in the inevitable use (at that time) of ASCII fonts to encode non-ASCII texts. Most biblical scholars have experienced the frustration of receiving a document from a student or colleague without having the proper fonts. Any right-to-left language is written in reverse order and does not wrap properly, and if a font is missing, then in place of the foreign characters is a series of unintelligible WingDings. At first, the Hexapla project contributors submitted their data in multiple incompatible ASCII fonts, with each contributor having a different collection of favorite fonts for each language.
More problematic than the appearance of the text, there was no provision in the XML to tag text as being in a specific language or font. It was not realized that a word processor’s (e.g., Microsoft Word’s or Mellel’s) means of storing which font to use for which text would be inaccessible to an XML processor. To it, therefore, all text would be understood as Latin characters in standard left-to-right order. All language, font and line wrapping information was lost. As a result, the only means of searching for a Greek word, such as θεός, would be to search on the English qeo/j, to use one sample ASCII encoding.

Summary of challenges

The central issue in all of this was the need for a database that would make possible capabilities beyond those of a digitized version of the print edition, as in a searchable PDF. Requirements were not met for interpreting coded text as a database (e.g., valid XML), for distinguishing between types of information (e.g., different languages and fonts), and for storing the data so as to enable manipulation (e.g., to identify readings attributed to Aquila without being deceived by a negative apparatus), thus effectively eliminating any usefulness to be derived from the proposed database. The hope that a graduate fellow could compensate for any shortcomings proved ill-founded: he spent the first four years researching how to convert fonts programmatically, but with no success. He never had the opportunity to address the other difficulties. Without significant changes, the success of the project appeared very much in jeopardy.

Suggestions for other projects

Multiple questions for other projects arise even at this early stage. Is the purpose of the required database fully understood, along with the kind of database schema that can meet its requirements? (To be discussed in greater detail below.) Are there sufficient means for distinguishing between different types of data (e.g., languages or fonts), or does the database mirror the frustrations of everyday life with no solution? Do those entering or encoding the information for the database fully grasp the requirements of the database and understand what would invalidate its schema, so as to avoid nullifying their efforts through an unwitting mistake?
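One way to guard against the unwitting mistake raised in the last question is to run every hand-coded file through a standard parser before accepting it. The following is a minimal sketch using Python's `xml.etree.ElementTree`; the element name `Reading` and the sample strings are assumptions made for illustration, not the project's actual markup.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if the text parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# One acceptable sample and three typical hand-coding slips (all hypothetical).
good = "<Lemma><Reading>ἐν κεφαλαίῳ</Reading></Lemma>"
wrong_case = "<Lemma><Reading>ἐν κεφαλαίῳ</reading></Lemma>"  # case differs
missing_close = "<Lemma><Reading>ἐν κεφαλαίῳ</Lemma>"         # tag missing
two_roots = "<Lemma/><Lemma/>"                                # no single root

for label, text in [
    ("good", good),
    ("wrong case", wrong_case),
    ("missing close", missing_close),
    ("two roots", two_roots),
]:
    print(label, "->", well_formedness_error(text) or "ok")
```

Each of the three slips causes the parser to reject the whole document, which is the behavior the section describes: a single error anywhere makes the file unusable until it is found and fixed.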

AN ADEQUATE DATABASE

The Hexapla project stumbled on all of these points, but most important was the design, or architecture, of the database itself. For even if the XML had been valid and accurately encoded, its architecture could not meet the demands laid on it. Happily, the natural turnover of graduate fellows in 2005 brought one with prior experience in database architecture, and thus skills and experience in resolving precisely this kind of dilemma. The choice of XML had to be reviewed, both for its suitability for hand-coding by scholars and for the schema proposed for storing the data. If the online edition were to enable searching individually on any specific information (such as attributed reviser/translator, reading, witness), then the database schema would have to explicitly separate out all such data into discrete units: the requirement of granularity. A sample of such an XML schema appears in Figure 3.

Exodus
1
5
ἐξ
יֹצְאֵי יֶרֶךְ
θ′
αἱ ἐκ μηροῦ
αἱ ἐκ μηρ[
θ′
64

Exodus 1:5
HT
יֹצְאֵי יֶרֶךְ
LXX
ἐξ
θ′
μηρ[ 64
αἱ ἐκ μηροῦ

Figure 3: Sample XML schema that would enable extensive analysis.

This example consists of only two words in the Hebrew (יֹצְאֵי יֶרֶךְ) and one in the Greek biblical text (ἐξ), with the Greek attested by only one witness. The witness, 64, has the last two letters of its reading missing, which the editor has here reconstructed in the preferred reading as a genitive singular. The XML required to represent this most simple lemma, in a form enabling later analysis and manipulation, is significantly more detailed than the XML proposed above for a complex lemma. Although there is only one witness and its reading in this example, most lemmata will have multiple readings from different witnesses. Therefore, the XML contains a group, “Attestations,” to allow for each attestation (combination of attributed translator/reviser + reading + witnesses) to be encoded individually. In similar fashion, many readings will be attested by multiple witnesses, so each reading has an embedded group for all its “Witnesses.” Each attestation contains its own attributed Translator element, rather than including that information in a positive or negative apparatus in the translator element for the lemma as a whole. All witnesses are likewise listed individually rather than together, so that in searching, witness 64 will never be confused with 164, as would regularly happen when searching the initially proposed XML. And, finally, the entire lemma is enclosed within one root element, “Lemma,” to make it valid XML. With multiple lemmata, even this Lemma element would be enclosed within a larger element, for instance, one named “Lemmata.”

Using this sample schema with its granular depiction of the data, an XML processor could analyze and manipulate the data to answer scholars’ questions. It satisfies the technological needs for producing the desired capabilities. But no text critic would be interested in hand-coding this, especially for lemmata in which there are dozens of readings, sometimes each with dozens of witnesses. The schema would quickly become unwieldy and error-prone. And even though an XML processor could perform highly complex analyses, the sheer bulk of the XML data would make development and searching slow and cumbersome. For a database requiring such a high level of granularity, and for a project requiring data entry by scholars, XML did not seem a wise choice.
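The difference between searching granular elements and searching flat text can be sketched with Python's `xml.etree.ElementTree`. The element names `Lemmata`, `Lemma`, `Attestations`, `Translator` and `Witnesses` follow the description above; `Attestation`, `Reading`, `Witness` and the `reference` attribute are assumptions, and the second lemma with witness 164 is hypothetical, added only to show the 64/164 distinction.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Lemmata>
  <Lemma reference="Exod 1:5">
    <Attestations>
      <Attestation>
        <Translator>θ′</Translator>
        <Reading>αἱ ἐκ μηρ[</Reading>
        <Witnesses><Witness>64</Witness></Witnesses>
      </Attestation>
    </Attestations>
  </Lemma>
  <Lemma reference="hypothetical">
    <Attestations>
      <Attestation>
        <Translator>θ′</Translator>
        <Reading>hypothetical reading</Reading>
        <Witnesses><Witness>164</Witness></Witnesses>
      </Attestation>
    </Attestations>
  </Lemma>
</Lemmata>"""


def lemmata_attested_by(root, siglum):
    """References of lemmata having a Witness element equal to `siglum`."""
    found = []
    for lemma in root.iter("Lemma"):
        sigla = [w.text for w in lemma.iter("Witness")]
        if siglum in sigla:
            found.append(lemma.get("reference"))
    return found


def lemmata_containing_text(root, fragment):
    """Naive flat-text search, as one would do on an undivided apparatus."""
    return [
        lemma.get("reference")
        for lemma in root.iter("Lemma")
        if fragment in "".join(lemma.itertext())
    ]


root = ET.fromstring(SAMPLE)
print(lemmata_attested_by(root, "64"))      # exact match on individual witnesses
print(lemmata_containing_text(root, "64"))  # also catches the lemma with 164
```

The exact-match search finds only Exod 1:5, while the flat-text search also returns the hypothetical lemma whose sole witness is 164.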
If hand-coding were not realistic, however, a new dilemma presented itself: a database that is not hand-coded must be divided into a “back end” (the data itself, stored in highly granular format) and a “front end” (the interface to enter data in a more user-friendly way). Since the new graduate fellow also had skills in programming, she was able, after a week of intensive observation of and interviews with a contributor, to build a proof-of-concept database and data entry interface and present them to all the project’s contributors at the 2005 Society of Biblical Literature meeting in Philadelphia. The ASCII font issue was resolved by the passage of time and the development of Unicode (which also resolved right-to-left and left-to-right difficulties), and the XML database was replaced by a back-end/front-end combination, so the data would be stored in a relational database³ and hand-coding was replaced by an interface of separate text boxes for all the distinct kinds of information.

Figure 4: Hexapla project interface presented in 2005

An immediate benefit of the new database and interface was that the final form of the apparatus could now be programmatically generated, as one among innumerable forms of displaying the data. The apparatus was generated simultaneously with data entry, providing instant feedback in a form with which the scholar was intimately familiar, and providing a first editorial review, by the scholar himself or herself, that the data had been entered accurately and interpreted correctly. Changes to the final form of the apparatus no longer required revisiting prior work (just a one-time modification to the code that generated the apparatus), and multiple display formats of the same data became possible. Just as easily as displaying the final apparatus, the contributor could compare side by side the readings of any combination of witnesses and revisers, such as a comparison of the troublesome Fb with all the preferred readings for Aquila and Symmachus in Exodus. In this case, a particular witness’s affinities might become clear in a highly visual format. A synoptic “Hexaplaric format” was also available, in which all the texts were displayed in their original six columns as reconstructed so far (this brought a great sense of accomplishment and delight to the contributors).

For the Hexapla project, a vital answer to their challenges was employing someone with the relevant skills in database architecture, and the early obstacles were overcome. The requirements of the database for the Hexapla project were, in the end, simple: to be able to search on any combination of biblical reference, biblical text, reconstructed text of a reviser/translator, attested reading, attesting witness, attribution of that reading to a reviser/translator, and language and grammatical analysis of that reading; and to be able to display the same information in any user-determined format. This was made possible by a database that stored each data unit separately, along with its relationships to all the other data. While this could be done in almost any kind of database (including XML, as demonstrated above), the complexity of the manipulation required for apparatus generation led to the choice of a relational database, one that excels in rapid manipulation of large quantities of data.

3 A type of database that minimizes redundant information but handles well small units of data related to other units in many ways, e.g. each lemma related to one or more witnesses, and in turn each witness related to one or more lemmata and sublemmata. The structure of a relational database facilitates all manner of searching on such relationships in a fraction of the time required by an XML processor operating on large quantities of data.
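A relational layout of the kind described here can be sketched with Python's built-in `sqlite3` module. The chapter does not say which database engine or table design the project used, so SQLite, the table names and the column names below are all assumptions; the data are the single Exod 1:5 attestation from Figure 3.

```python
import sqlite3

THEODOTION = "θ′"

# Hypothetical schema: lemmata, attestations, witnesses, and a link table
# expressing the many-to-many relation between attestations and witnesses.
SCHEMA = """
CREATE TABLE lemma (
    id INTEGER PRIMARY KEY,
    reference TEXT NOT NULL,
    old_greek TEXT
);
CREATE TABLE attestation (
    id INTEGER PRIMARY KEY,
    lemma_id INTEGER NOT NULL REFERENCES lemma(id),
    attribution TEXT,
    reading TEXT NOT NULL
);
CREATE TABLE witness (
    id INTEGER PRIMARY KEY,
    siglum TEXT NOT NULL UNIQUE
);
CREATE TABLE attestation_witness (
    attestation_id INTEGER NOT NULL REFERENCES attestation(id),
    witness_id INTEGER NOT NULL REFERENCES witness(id),
    PRIMARY KEY (attestation_id, witness_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO lemma VALUES (1, 'Exod 1:5', 'ἐξ')")
conn.execute(
    "INSERT INTO attestation VALUES (1, 1, ?, ?)", (THEODOTION, "αἱ ἐκ μηρ[")
)
conn.execute("INSERT INTO witness VALUES (1, '64')")
conn.execute("INSERT INTO attestation_witness VALUES (1, 1)")


def readings_attributed_to(conn, reference, attribution):
    """All (reading, witness) pairs at a reference with the given attribution."""
    rows = conn.execute(
        """
        SELECT a.reading, w.siglum
        FROM lemma AS l
        JOIN attestation AS a ON a.lemma_id = l.id
        JOIN attestation_witness AS aw ON aw.attestation_id = a.id
        JOIN witness AS w ON w.id = aw.witness_id
        WHERE l.reference = ? AND a.attribution = ?
        ORDER BY w.siglum
        """,
        (reference, attribution),
    )
    return rows.fetchall()


print(readings_attributed_to(conn, "Exod 1:5", THEODOTION))
```

Because every witness is a row of its own, any combination of reference, reading, attribution and witness becomes a matter of adding conditions to the `WHERE` clause rather than re-parsing an apparatus.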
Suggestions for other projects

This narrows down the challenge of the database to three basic questions. What data (and in what format) needs to be stored in order to meet the requirements of a given project? This will determine the database architecture. Who will be entering the data? If a dedicated staff, then training for a specific format may be reasonable; otherwise, any kind of hand-coding or tagging may be unreasonable. And what kind of manipulation of the data will be required on a regular basis? This will determine the database engine (e.g., XML processor for XML files, or a relational database engine). Answering these questions requires careful collaboration between the scholar and the programmer/database architect. It is only when the programmer fully understands what the academic project requires that he or she can properly build a database. And it is to everyone’s advantage for the scholar to understand why the programmer needs to ask such questions.

IN SEARCH OF AN ADEQUATE INTERFACE

The flexibility initially sought by storing the hexaplaric data in digital form became possible in all its fullness, thanks to the architecture and processing abilities of the new database, and entering data was no longer so error-prone with the new, online interface. Being online meant that contributors could enter their data from anywhere, world-wide, as long as there was access to the internet. Data backups were automatically made in one central location, so contributors had no burden in that regard. But as soon as one set of problems was resolved, another set became evident. The task of entering data in the new online interface turned out to be as daunting to a text critic as hand-coding the original XML. For all the promise of the database, the proposed interface was frankly unintelligible to a text critic accustomed only to traditional forms of research. And even for those willing to learn it, it was unbearably tedious.

An interface for the database

Just as text critics are not automatically endowed with skills in database architecture, so too a programmer, or database architect, is not automatically endowed with skills in interface design. This problem is well-known in the world of software development, when a programmer designs an interface for non-programmers. The programmer thinks in terms of code simplicity and database architecture, which tend to be unrelated to traditional patterns of scholarship and users’ workflow. Although the Hexapla project now had a database that could be manipulated in unlimited ways, realistically, the interface was such that minimal data would ever be entered into it. And a database without data was no better than lacking a database altogether.


AN INTERFACE FOR THE SCHOLAR

The solution for the Hexapla project again required recruiting skills from outside the pale of traditional scholarship, and even outside the range of all involved in the project: in this case, an interface designer. After intensive collaboration with the graduate fellow, he designed the interface shown below in Figure 5.

Figure 5: Hexapla project interface presented in 2007

In this new interface, two principles guided the design: first, the scholar’s traditional workflow must determine the layout and the means of entering data. Second, only that work which absolutely requires text-critical expertise would be asked of the scholar in the data entry process. Any work which could be done automatically by the computer, or by a graduate student, would not be asked of the contributor. In short, the burden on contributors, both to learn how to use the interface and to actually enter the data, must be the smallest possible. Whatever effort could be spent early in the project by the interface designer and programmer would save the later effort required of contributors in entering their data.


The interface was thus organized around what were determined to be three central text-critical decisions: how to delimit each lemma (and any sublemmata) for the basic layout of the apparatus; how to assess the attested readings according to their status as preferred or unlikely for a given translator/reviser; and whatever scholarly notes might be of use in interpreting the data. From these three decisions and their associated data, the apparatus could be programmatically generated as shown in the right-hand sidebar in Figure 5. No longer required of the text critic was typing in the Greek and Hebrew of the lemma, for that could be highlighted with a mouse; nor manual tagging of a reading’s language, for that could be determined from the Unicode code page (which the scholars did have to learn to type); nor individual entry of witnesses and groups, for those could be entered as one list by the scholar and subsequently parsed by the computer and stored individually in the database. And all standard text-critical Latin abbreviations were properly interpreted and expanded by the database, enabling a typical apparatus display format without damaging search and analysis capability. Four kinds of notes or comments were possible: those to appear in the print edition; those to appear only in the online edition; notes “to self” for the contributor’s personal use, as demonstrated in the sample; and notes to the graduate fellow, questions or comments about anything regarding data entry. This interface was tailored to the needs of the scholar, without disregarding those of the database. At all times during data entry the generated apparatus for the entire lemma was in view, providing fully updated context and feedback for the data entered. If the generated apparatus did not appear exactly as it should in the print edition, the contributor knew to make the requisite corrections (either his own or alerting the graduate fellow to any programming changes needed). 
This was most important with the witnesses: if the computer properly parsed the witness list into individual witnesses, then re-generated the witness list for the apparatus, then the contributor knew his input was correctly interpreted. And he did not need to worry about formatting; the computer handled the proper ordering of witnesses and the visual connection with hyphens of those in the same group. The programmatically generated apparatus provided immediate satisfaction for the contributor and served as a built-in editorial review for quality control.
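Among the tasks handed to the computer, inferring the language of a reading from its characters is easy to sketch. The chapter speaks of the Unicode code page; the version below instead uses character names from Python's `unicodedata` module as a stand-in, which is an assumption about method, and the function name `script_of` is hypothetical.

```python
import unicodedata


def script_of(text):
    """Guess the dominant script of a reading from its Unicode character names."""
    counts = {}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces and combining marks
        name = unicodedata.name(ch, "")
        if not name:
            continue
        script = name.split(" ", 1)[0]  # e.g. "GREEK", "HEBREW", "LATIN"
        counts[script] = counts.get(script, 0) + 1
    if not counts:
        return "UNKNOWN"
    return max(counts, key=counts.get)


for sample in ["ἐν κεφαλαίῳ", "בראשית", "alii", "qeo/j"]:
    print(sample, "->", script_of(sample))
```

The last sample shows why the earlier ASCII-font data could not be handled this way: to the computer, `qeo/j` consists of Latin letters, whatever font a word processor used to display it.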


Without pretending to make text-critical work easy (that work remained as it ever was), the new interface made significant strides in making the data entry of that scholarly work easier and more intuitive for the project’s contributors. When presented at the 2007 Congress of the International Organization of Septuagint and Cognate Studies in Ljubljana, Slovenia, it was well received and is still actively collecting data from contributors.

Suggestions for other projects

The recurring theme for the Hexapla project was that its goals were attainable, but only with access to the necessary skills. In order to have contributors enter their data into the database, the right kind of interface was essential. With the first interface, the project stalled. With the interface designed by one with the requisite skills, contributors could (and did) readily enter their data. The application for others is to recognize that an adequate database needs to be paired with an adequate interface. Without an interface designed with as much care as the database, all the power of the database may be lost. For some projects, students may be available to be trained in any format and no more need be said. But for others, with limited resources, the scholars themselves must be enabled to enter their data. As scholars are not known for their abundance of spare time and delight in learning new methods of doing traditional work, a good interface should satisfy the needs of both scholar and database: familiar enough in terms of traditional scholarship to be readily used, and tailored enough to the final storage of the data so that the data can be programmatically converted into its discrete units (e.g., witness lists to individual witnesses). For projects in which scholars are entering the data, we found the following questions helpful. What will frustrate the scholars and discourage them from using the interface? Does the data entry process require anything unnecessary from them?
Could any of the work be delegated to someone or something else with more time (or less expensive time)? For the Hexapla project, this entailed minimizing the amount of typing in Unicode (which also saved on errors) and choosing carefully what data could be entered as a conglomerate, to be parsed by the computer. We required that each reading have an individually specified attribution (no negative attributions apparatus permitted), but we did not require witnesses to be entered individually. We compensated for the riskiness of having the computer parse the witness list by having immediate feedback to test its success: the automatically generated apparatus. This of course raises a vital question for any kind of data entry: what kind of feedback is there, and what means of correcting inaccuracies? The apparatus available during the data entry process meant that the first review was automatically incorporated into data entry, so once the apparatus was correct, the data did not need to be re-visited for errors. This meant far less editorial work in the end, and instant gratification for the contributors, as they saw their data transformed into a consistently formatted critical text and apparatus layout.
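The parse-then-regenerate feedback loop for witness lists might look roughly like the following. This is a sketch under stated assumptions: the group table is inferred solely from the hyphenation in the sample Gen 1:1 apparatus, the group labels `g1` and `g2` are placeholders, and the ordering rule (groups in order of first appearance, members sorted numerically) is a guess, not the project's documented behavior.

```python
import re

# Hypothetical group table, inferred from "14-78-413-550-615" and "343-344".
GROUP_OF = {
    "14": "g1", "78": "g1", "413": "g1", "550": "g1", "615": "g1",
    "343": "g2", "344": "g2",
}


def parse_witnesses(text):
    """Split a typed witness list on spaces, commas or hyphens."""
    return [tok for tok in re.split(r"[\s,\-]+", text.strip()) if tok]


def _sort_key(siglum):
    # Numeric sigla sort by value; anything else sorts after them, by name.
    return (0, int(siglum)) if siglum.isdigit() else (1, siglum)


def render_witnesses(sigla):
    """Rebuild the apparatus form: same-group witnesses joined by hyphens."""
    clusters = {}  # insertion-ordered: group key -> members
    for siglum in sigla:
        key = GROUP_OF.get(siglum, "solo:" + siglum)
        clusters.setdefault(key, []).append(siglum)
    parts = []
    for members in clusters.values():
        parts.append("-".join(sorted(set(members), key=_sort_key)))
    return " ".join(parts)


typed = "135 14 550 78 615 413 344 343"
stored = parse_witnesses(typed)
shown = render_witnesses(stored)
print(stored)
print(shown)
```

If the regenerated line matches what the scholar expects to see in print, the individual witnesses were stored correctly; if not, the discrepancy is visible immediately.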

CONCLUSION

In retrospect, the history of the Hexapla project falls into distinct phases: being in need of an adequate database, possessing such a database but with no data, and possessing both a database and a usable interface (and thereafter data). Progression from each phase to the next depended on the employment of skills from outside traditional text-critical scholarship. To meet its ambitious goals for the online edition, the project was forced to recruit workers with the necessary skills in information technology. Collaboration between those with the different skills was likewise critical. The initial database design was possible only after intense interaction with Alison Salvesen (a contributor and member of the Editorial Board), and her feedback in the following years was invaluable. Much of the success so far is directly owed to her patience, willingness to think in new ways, and diligence in trying out the new system. The design of the new interface was possible only after intense interaction between programmer and interface designer, both for the designer to understand the requirements and for the programmer to be able to build the new interface. Regular and detailed feedback from the contributors was instrumental in optimizing the final interface and will continue to be critical to the success of the project as a whole. And, finally, the unwavering support and encouragement from the entire Editorial Board (Bas Romeny and Peter Gentry, along with Alison Salvesen) made everything possible.

For the sake of other, future projects, we conclude with a plea to understand enough of the information technology involved to know what questions to ask and when to apply for the aid of outside skills. Enthusiastic amateurs may sometimes be able to provide skills, but not for the more ambitious projects. Particularly when many scholars are involved, who have little time to offer in the midst of their myriad other commitments, efficiency can be of the essence. Efficiency comes with expert knowledge. The Hexapla project abounded in text-critical expertise, but to meet its own goals, it also required expertise in information technology. With that acquired, the obstacles could be accurately evaluated and overcome, and traditional scholarship and modern technology could be happily united.

THE CHALLENGE OF CONSISTENCY

A. DEAN FORBES
HP LABORATORIES, PALO ALTO, CA, USA

The Present State of Affairs: Based on our linguistic stance, we have fashioned a representation of the linguistic structure of biblical texts. Adequate representation requires human intervention, which unleashes inconsistency. Standard approaches to inconsistency detection range from random to systematic. A New Direction: Our approach to inconsistency detection involves constructing text surrogates. Using these, we have detected multiple sources of inconsistency. A pilot study has produced estimates of error rates, allowing us to focus on the areas of our data that most need improvement. The pilot study also indicates that we should automate and enhance our approach.

1. THE PRESENT STATE OF AFFAIRS

1.1 Our Stance¹

In what follows, I will briefly describe a visual representation of the grammar of Biblical Hebrew clauses that Frank Andersen and I have developed over many years. Its details have evolved as we have struggled to create a rich representation of the clauses making up the corpus of Biblical Hebrew. We do not hold it up as “the one true grammatical representation.” We happily concede that other representations offer impressive coverage of various phenomena and may well eclipse our approach when certain phenomena are the focus of analysis. But we do believe that our approach provides excellent coverage across the broad range of linguistic phenomena attested in the Hebrew Bible.

We recognize that our representation is incomplete.² Indeed, our basic strategy has always relied on successive approximations. Our representation is continually being refined and extended.

Our approach to grammatical representation is informed by a handful of basic premises. Rather than using a dependency grammar, we work with a constituency grammar as specified in the early 1990s by context-free phrase-structure rules.³ Phrases were reliably assembled by the computer iteratively and bottom-up, but grammatical functions and semantic roles were/are assigned no-holds-barred by both computer and human over-reader. We reject “the autonomy of syntax.” It became clear early in our parsing work that semantic information could assist parsing mightily, and we, therefore, assigned simple semantic information to nouns in our parsing dictionary. Our resulting representation includes both phrasal head semantics and semantic roles. It also became abundantly clear that parsing often required world knowledge not easily encoded and better left to the human over-reader.

1 Author’s Note: As always, the crucial ongoing contributions of Frank Andersen are gratefully acknowledged. Many examples are from the forthcoming Andersen and Forbes book, Biblical Hebrew Grammar Visualized, and are used by permission of Eisenbrauns. All rights reserved. Phrase markers are as rendered by Logos Bible Software.

2 An exposition of the limitations known to us must await the publication of our book on Biblical Hebrew grammar.
3 Our work on the computer parsing of the Hebrew Bible is described in some detail in Andersen, F. I. and A. D. Forbes, “Opportune Parsing: Clause Analysis of Deuteronomy 8.” In Proceedings of the Fourth International Colloquium: Bible and the Computer: Desk & Discipline, 49–75. Paris: Éditions Honoré Champion, 1995.


1.2 Our Representation

1.2.1 The Phrase-Marker Tree

Our basic representational entity is the phrase-marker tree or graph.⁴ A phrase marker tree is “a tree diagram which shows the division of a form into successively smaller constituents and labels each as belonging to one or more categories.”⁵ This definition can best be made clear by examining one of our phrase markers. Consider this phrase-marker tree⁶ from Job 42:8:

All of the nodes, save those holding the actual segments making up the text down the far right of each phrase marker, identify constituents (“phrases”) and contain upper and lower labels. The nodes are joined by edges, directed lines that point from one node (“the mother”) to another node (“the daughter”). The mother node is said to dominate the daughter node(s). The daughter nodes define constituents that combine to create their mother constituent. For each node, the upper label indicates the constituent’s form or function, while the lower label gives the licensing principle that justifies the formation of the constituent. Consider, for example, the node having dir obj as its upper label and gram as its lower label. This node realizes a grammatical function, being the clause’s direct object. It dominates its single daughter node. The labels on the daughter node disclose the form of the constituent (n:anml / union/disj). The combining of the sub-constituents is licensed by the relation union/disj, which tells us that a coordination phrase is involved. Or, consider the node having to + pron as its upper label and prep as its lower label. This constituent is a prepositional phrase consisting of the preposition translated by “to” and the personal pronoun labeled “pron.”

4 For an introduction to phrase markers, see Andersen, F. I. and A. D. Forbes, “Hebrew Grammar Visualised: I. Syntax.” ANES XL (2003) 43–61.
5 Matthews, P. H., Oxford Concise Dictionary of Linguistics, 280. Oxford: OUP, 2005.
6 The reason for calling the structure a tree is easy to grasp. It is a tree fallen on its side. The node at the far left (cl / oblq) is the tree’s root. The text segments at the far right are its leaves. The structure in between the root and the leaves consists of the branches.
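The node-and-label structure just described can be modelled in a few lines of Python. This is a hypothetical sketch, not the authors' data format: the class `Node`, its field names and the helper `outline` are assumptions, only the labels quoted in the text are used, and attaching both constituents directly under the root is a simplification, since the full Job 42:8 tree is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    upper: str  # form or function of the constituent
    lower: str  # licensing principle that justifies the constituent
    daughters: List["Node"] = field(default_factory=list)

    def dominates(self, other: "Node") -> bool:
        """True if `other` lies anywhere below this node."""
        return any(d is other or d.dominates(other) for d in self.daughters)


def outline(node: Node, depth: int = 0) -> List[str]:
    """Indented 'upper / lower' lines, root first (the tree stood upright)."""
    lines = ["  " * depth + f"{node.upper} / {node.lower}"]
    for daughter in node.daughters:
        lines.extend(outline(daughter, depth + 1))
    return lines


# Only the nodes named in the text; their own daughters are left out.
coordination = Node("n:anml", "union/disj")
direct_object = Node("dir obj", "gram", [coordination])
prep_phrase = Node("to + pron", "prep")
clause = Node("cl", "oblq", [direct_object, prep_phrase])

print("\n".join(outline(clause)))
```

Here `clause.dominates(coordination)` holds, while the reverse does not, mirroring the mother and daughter relation defined above.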

1.2.2 The Phrase-Marker Graph

About five percent of the time, text structure representation is best handled, we think, by using a non-tree structure. Consider this portion of the phrase marker for Deut 31:4:

Note that the construct phrase “kings of the Amorite” (n:humn / cstr) has two mothers since it is Sihon and Og who together are the kings of the Amorites. This structure is not a tree, since trees do not involve branches that fuse together at their “upper” ends, nor do they involve—at least in the planar mathematical case—branches that cross over one another (“tangling”).
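One simple way to spot such non-tree structures is to look for nodes with more than one mother. The sketch below assumes that a phrase marker is available as a list of (mother, daughter) edges; the node names are hypothetical stand-ins for the Deut 31:4 fragment. An empty result is necessary for a tree but does not by itself rule out tangling.

```python
from collections import defaultdict


def nodes_with_several_mothers(edges):
    """Return {daughter: [mothers]} for daughters that have more than one mother.

    `edges` is an iterable of (mother, daughter) pairs.
    """
    mothers = defaultdict(list)
    for mother, daughter in edges:
        mothers[daughter].append(mother)
    return {d: m for d, m in mothers.items() if len(m) > 1}


# Hypothetical node names for the Deut 31:4 fragment.
edges = [
    ("coordination", "Sihon"),
    ("coordination", "Og"),
    ("Sihon", "kings of the Amorite"),
    ("Og", "kings of the Amorite"),
]
shared = nodes_with_several_mothers(edges)
```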


1.3 Defining and Finding Inconsistency

1.3.1 Sources of Inconsistency

We will understand the full range of sources of inconsistency when, in §3.1, we provide examples of each of the kinds of inconsistency that we detect. For now, we shall present a single kind of inconsistency, that arising from inconsistent resolution of homography. For the first release of our parsing of the biblical texts, we decided to suppress ambiguity by choosing the “most salient reading” in each case. Consider the possible choices attested for the meaning of אֵילִים in the phrase marker in §1.2.1. In our dictionary, we resolve homographs. As a result, the plene segment אֵילִים is found in three distinct lexemes: “pilasters” (building), “rams” (creature), “terebinths” (vegetation).7 Given our world knowledge, we here select the creaturely reading as salient.8 Now suppose that we had (mistakenly) opted for the “terebinths” sense. Thereby an error would have entered our phrase markers. Errors, by their nature, introduce inconsistencies.

1.3.2 Random Detection of Inconsistency

Inclusion of human over-readers in the analysis process introduces the possibility (certainty?) of inconsistencies creeping in. Hence, we are duty bound to seek out inconsistencies. Until recently, we dealt with inconsistencies as they happened to turn up. In advancing to the implementation of a new data dimension, errors in previously implemented dimensions were invariably found. When an instance of a particular error type was noted, tests were devised to search for that kind of error more generally. Here is an example. By chance, this dangling construct was noticed in Josh 15:12:

Not legal! When the complete corpus was searched for dangling constructs, sixty-three others (0.18% of the putative constructs) were found and corrected.9

7 Were the segment written defectivi, there would be a fourth lexeme: “gods” (deity).
8 It is insufficient to choose this sense on the ground that the first conjunct involves creatures since heterogeneous semantics are fairly common among conjuncts.
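A corpus-wide search of this kind can be sketched as follows. This is only an illustration under stated assumptions: phrases are taken to be lists of (segment, state) pairs, and the tag names "cstr" and "abs" as well as the English glosses are hypothetical, not the authors' encoding. A construct form is reported as dangling when nothing follows it within its phrase.

```python
def dangling_constructs(phrases):
    """Yield (phrase_index, segment) for construct forms with nothing after them.

    Each phrase is a list of (segment, state) pairs, where state is
    "cstr" for construct and "abs" for absolute (hypothetical tag names).
    """
    for index, phrase in enumerate(phrases):
        if phrase and phrase[-1][1] == "cstr":
            yield index, phrase[-1][0]


phrases = [
    [("tip-of", "cstr"), ("ear", "abs")],   # well formed: the construct governs "ear"
    [("house-of", "cstr")],                 # dangling: nothing is governed
]
found = list(dangling_constructs(phrases))
```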

1.3.3 Standard Systematic Detection of Inconsistency

We now want to move from random discovery to the systematic unearthing of instances of inconsistency. Systematic inconsistency location is typically achieved by comparing the annotations produced by different analysts (“inter-rater consistency”) or by a single analyst at different times (“intra-rater consistency”). Since these approaches require annotating the same data by multiple raters or at multiple times, they can be costly to carry out.

To estimate the extent of inconsistency, two approaches are commonly used:

1. If one has an infallible oracle (“correctly annotated data”), then one relies on error rates displayed in confusion matrices.10 This is not our circumstance.
2. If one has a panel of expert annotators or a repeating annotator, then one computes a measure of agreement, typically Cohen’s kappa.11 This approach is quite common in medical research and in the social sciences.

We hesitate to adopt either standard approach since:

1. We are continually learning Biblical Hebrew from the texts and so are hesitant to declare ourselves or anyone else to be oracles of the sort required to produce a full gold-standard annotation.
2. We can afford neither the monetary costs required to engage multiple raters nor the time costs to have one rater reanalyze data multiple times.

Is a third approach possible? Yes, a variant of the second approach.

9 One apparent dangling construct was retained: the curious (bizarre?) clause at Jer 50:5.
10 Jensen, F. V. and T. D. Nielsen, Bayesian Networks and Decision Graphs, 268–70. New York: Springer, 2nd ed., 2007. Most books on pattern recognition will discuss confusion matrices.
11 Agresti, A., Categorical Data Analysis, 431–36. New York: Wiley-Interscience, 2nd ed., 2002, or Altman, D. G., Practical Statistics for Medical Research, 403–409. London: Chapman & Hall, 1991.
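For readers unfamiliar with the agreement measure mentioned above, Cohen's kappa compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a plain implementation of that textbook formula for two label sequences. The function name and the sample labels are illustrative assumptions and are not drawn from the authors' data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label twice.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Two hypothetical annotation passes over the same four constituents.
first_pass = ["subj", "subj", "obj", "obj"]
second_pass = ["subj", "obj", "obj", "obj"]
kappa = cohens_kappa(first_pass, second_pass)
```

With these sample labels the observed agreement is 3/4 and the chance agreement is 1/2, so kappa is 0.5.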

2. A NEW DIRECTION: SURROGATE TEXTS

Our preference is to avoid multiple annotations of the same data, separated in space (multiple annotations by a group of individuals) or time (multiple annotations by a single individual). Our basic idea is to find identical runs of text in pairs of text portions. Their phrase markers will have been produced or corrected by our analyst, Frank Andersen, at different times. These surrogate texts will allow us to assess intra-rater agreement without the need to redo the analysis.

2.1 Preparation of “Surrogate Texts”

Our approach to detecting and gauging inconsistency systematically involves three data preparation steps. Their intent is to strip out from our Kethiv text of L distinctions introduced by us (specifically, homograph resolution) and even to increase the likelihood that a given text string will have an equivalent string elsewhere in the text. We proceed as follows:

1. Convert the text into an absolutely minimally-annotated version by deleting all homograph-resolving markings, by mapping all plene spellings into defectivi spellings, and by converting all spacers into simple spaces.12
2. To enable pair-wise comparisons, divide the text into standard portions, e.g., Torah, Former Prophets, Latter Prophets, etc.
3. Identify all identical strings that appear in pairs of portions. Thereby create pairs of matched surrogate texts. For example, the Torah and the Latter Prophets will share a set of identical strings defining the Torah/Latter Prophets surrogate pair. The annotation of all of the phrase markers having these strings as leaves will have been originated and/or revised by Frank Andersen at different times. Hence, these matched surrogates will allow assessment of Frank’s intra-rater consistency.

12 1) Our text does not include cantillations, so their removal is not an issue. 2) We reduce to defectivi the vowels that we introduced (from Gordis) to point the Kethiv consonants. These matters are discussed in Appendix 1, sections 3.1 and 3.2, of our forthcoming volume, Biblical Hebrew Grammar Visualized. 3) For the present experiment, we retain our segmentation of the text.
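The preparation steps can be sketched in a few lines. The code below is a rough illustration under several assumptions: homograph-resolving markings are imagined as a "#" plus digits suffix, the spacer set is limited to maqqef and sof pasuq, the plene-to-defectivi mapping is left out altogether, and identical strings are approximated by runs of n consecutive tokens. None of these choices is taken from the authors' software.

```python
import re

SPACERS = "\u05be\u05c3"   # maqqef and sof pasuq; the full spacer set is an assumption


def normalize(text):
    """Strip hypothetical homograph indices ("#2") and turn spacers into spaces."""
    text = re.sub(r"#\d+", "", text)
    text = re.sub("[" + SPACERS + "]", " ", text)
    return text.split()


def runs(tokens, n):
    """All runs of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_runs(portion_a, portion_b, n):
    """Runs of length n that occur in both portions."""
    return runs(portion_a, n) & runs(portion_b, n)


# Transliterated placeholder tokens, not real text.
torah = normalize("a b#1 c\u05bed e")
prophets = normalize("x b c d\u05c3 y")
common = shared_runs(torah, prophets, 3)
```

Each shared run would then point to two phrase markers, one per portion, that can be compared with each other.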

2.2 The Comparison of “Surrogate Texts”

The differences between the structures associated with identical strings can be detected straightforwardly by a human assessor or by a tree- or, if necessary, graph-comparison algorithm. There are a few issues that must be resolved before the work of comparing the surrogate texts can proceed.

Identification of Error(s): Given a pair of identical text strings and their associated phrase markers, one must identify which facets of the representations, if any, are in error. At present, this is a task assigned to the human assessor. Fortunately, it is much easier for the assessor to locate error(s) when confronted with differing phrase markers having the same leaves (text) than it is for the original over-reader to be consistent and error-free while working through a complete text seriatim. Given a pair of identical text runs in different portions, all four possibilities as regards error status can and do occur:

1. Both phrase markers may agree and both be correct,
2. One may be in error,
3. The other may be in error,
4. Both may be in error.

It is the assessor’s assignment to determine which state of affairs obtains. Completion of this assignment is time consuming but proves not to be unduly stressful. We judge this to be so because the decision space is much better constrained than is the case when an analyst works through a text.

Dealing with Differing Numbers of Identical Text Strings: The number of identical strings in one surrogate text may differ from the number in the other. When this happens, one must have a policy regarding which text string in the larger pool is to be retained in the pool. Our procedure is as follows: If the numbers of text strings are unequal, then randomly delete excess strings whose phrase markers are judged to be error-free from the larger pool until the numbers in the two pools are the same. (This maximizes the number of errors in the reduced surrogate texts.)

The Five Error Types: One of our goals in carrying out the comparisons of surrogate texts is to gain estimates of the relative accuracy of the various aspects of the phrase markers. To that end, we classify observed errors into these five categories:

Edge Error: Is an edge missing, unneeded, or misdirected?
Node Error: Is a node wrongly present or missing?
Part-of-Speech Error: Has a part of speech been wrongly characterized?13
Licensing Relation Label Error: Is the licensing relation on a node incorrect?14
Form/Function Label Error: Is a form or function label incorrect?

2.3 A Computer-Aided Pilot Study

We have preprocessed the text of the Hebrew Bible by carrying out the operations described in §2.1 above. We have then manually compared the surrogate Torah with the surrogate (Other) Writings, judging which portion is in error, if any, for each pair of corresponding phrase markers.
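The balancing policy described under “Dealing with Differing Numbers of Identical Text Strings” can be sketched as follows. The data layout (lists of (identifier, has_error) pairs) and the function name are assumptions. The text also does not say what happens when the larger pool runs out of error-free strings, so this sketch simply stops deleting at that point.

```python
import random


def balance_pools(pool_a, pool_b, rng=None):
    """Randomly drop error-free items from the larger pool until sizes match.

    Each pool is a list of (identifier, has_error) pairs. Items judged to be
    in error are never dropped, so the pools may stay unequal if the larger
    pool has too few error-free items (an assumption; the text is silent here).
    """
    rng = rng or random.Random()
    a, b = list(pool_a), list(pool_b)
    larger = a if len(a) >= len(b) else b
    excess = abs(len(a) - len(b))
    error_free = [i for i, (_, has_error) in enumerate(larger) if not has_error]
    drop = set(rng.sample(error_free, min(excess, len(error_free))))
    larger[:] = [item for i, item in enumerate(larger) if i not in drop]
    return a, b


pool_a = [("a1", False), ("a2", True), ("a3", False), ("a4", False)]
pool_b = [("b1", False), ("b2", False)]
balanced_a, balanced_b = balance_pools(pool_a, pool_b, random.Random(0))
```

Because only error-free items are candidates for deletion, the erroneous item `("a2", True)` always survives, which is the sense in which the procedure maximizes the number of errors retained.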

13 Any difference in part-of-speech (POS) feature (number, gender, etc.) is considered a POS difference and implies a POS error.
14 When we compute accuracy rates, we omit instances of the always-correct grammar relation (gram) since their inclusion would substantially inflate the accuracy estimates.


3. RESULTS FROM THE PILOT STUDY

3.1 A Gallery of Inconsistencies

We provide instances of the five kinds of inconsistency found previously or found in our pilot study, in which we manually compared the surrogate text of the Torah with that of the (Other) Writings.

3.1.1 Part-of-Speech Assignment Error

Consider this incorrect phrase marker for Deut 5:24b:

We have a homograph here, the אֶת segment having been assigned to the wrong part of speech. It is not the nota accusativi but rather is the preposition “with.” If the string appears elsewhere, correctly assigned, inconsistency will result. Of course, if the error occurs in both loci, then the two phrase markers will be consistent but both will be incorrect. Our methods should detect the error(s) in either case.

3.1.2 Formal Structural Ambiguity Differently Resolved

In this phrase marker from Exod 2:1, the parse shown takes the man to be a Levite. But, the clause might mean that a man literally “walked from the house of Levi.”

This latter interpretation would be represented by making the prepositional phrase a location constituent alongside the verb and subject. Differing resolutions of structural ambiguity yield differing phrase markers and result in inconsistency.

3.1.3 Uneven Use of World Knowledge

Just what is “adequate representation” depends on the linguistic coverage that one seeks for one’s phrase markers. Since we include semantic information, novel sources of inconsistency arise. Consider the first daughter of the clause root, # times / gram, in this phrase marker from Exod 23:14:

Here, the identity of the semantic role of the first major constituent has been determined on the basis of world knowledge. It is non-obvious how an enumerated body part can specify the number of times an event is to be executed. But, see BDB 919.2 and KB 1184, where all is made clear. If we elsewhere assign a different semantic role to an identical constituent (Num 22:28, 32, and 33), inconsistency will result.

3.1.4 Free Conventions

In some cases, more than one parse can properly represent a clause. As a result, the analyst has a “free choice” available when the clause is parsed. Consider this analysis in Jer 5:7:

We call the negator here a phrasal operator since its scope extends only over the noun that follows it. In this parse of Neh 5:9, the negator is shown operating at clause level, negating the entire clause. This is an admissible parse (in teen speak: “the word that you are making is good…not!”), but so would be one involving a phrasal operator (“not-good is the word that you are making”).

3.1.5 Inconsistent Assignment of Constituent Function

The final kind of inconsistency that we have thus far identified involves our occasionally stumbling over the distinction between grammatical function and semantic role. Many major clausal constituents exercise both a grammatical function and a semantic role.15 Our de facto rule up until now has been that if a major constituent has a non-trivial grammatical function,16 then we specify that in the node label. If the constituent has no non-trivial grammatical function, then we specify its semantic role. On occasion, we fail to observe our rule, as the following two phrase markers illustrate. In the first phrase marker (from 1 Chr 2:19), the indirect object is properly labeled as such:

15 This topic is taken up at some length in §9.2 of our book, Biblical Hebrew Grammar Visualized, forthcoming from Eisenbrauns.

But in this harmlessly-trimmed17 phrase marker from 2 Chr 11:19, we inappropriately (according to our rule) label the constituent with its semantic role, beneficiary (benf) when the indirect object grammatical function should be shown. Inconsistent!

3.1.6 Pseudo-Inconsistency

Differing contexts can yield pseudo-inconsistency, that is, false inconsistency. The strings enclosed in the black rectangles below are identical if one performs the preparatory steps specified in §2.1 above, specifically if all spacers (which include sof pasuq, ׃) are replaced by blanks (“ ”). Compare the structures to the left of the box-enclosed segments in phrase marker sets A (Exod 29:7) and B (Esth 9:25), below. The phrase markers properly are very different. This is because in set A, the “boxed text” crosses a clause boundary, while in set B it does not.

16 By “non-trivial grammatical function,” we mean a function other than the rather non-specific “adjunct.”
17 To save space, we have lopped off the appositional specification of the names of the sons.

3.2 Phrase Marker Accuracy Estimates

The table gives our findings for major phrase marker components. Accuracy decreases as we move down the table. The estimates suggest that we should focus on correcting the form/function labels in the Torah, where the error rate is highest (0.6%).

Component               Portion    #Decisions  #Errors  Accuracy
Edges                   Torah           3,817        3   99.921%
                        Writings        3,817        0  100.000%
                        Combined        7,634        3   99.961%
Nodes                   Torah           4,274        4   99.907%
                        Writings        4,274        0  100.000%
                        Combined        8,548        4   99.953%
Parts of Speech (POS)   Torah           2,391        2   99.916%
                        Writings        2,391        1   99.958%
                        Combined        4,782        3   99.937%
Licensing Relations     Torah           1,148        4   99.652%
                        Writings        1,148        4   99.652%
                        Combined        2,296        8   99.652%
Form/Function Labels    Torah           1,877       11   99.414%
                        Writings        1,877        4   99.787%
                        Combined        3,754       15   99.601%
Overall                 Torah          13,507       24   99.822%
                        Writings       13,507        9   99.933%
                        Combined       27,014       33   99.878%

It took about forty hours to gather the results summarized in the table, about seventy-three minutes of work per error discovered. Manual location and correction of additional inconsistencies/errors by exploiting the other nineteen surrogate pairs would take several months of soul-dulling labor. If we suppose that each pair of surrogates would turn up about the same number of errors as we found with the first pair, then we might be talking about a total of perhaps five or six hundred errors. If the process can be automated to a great extent, then the search could be worth it.
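The accuracy and effort figures follow from simple arithmetic, which the short sketch below reproduces for the “Overall” rows and for the time spent per error. The function name is an assumption. Accuracy is taken to be the share of decisions not found to be in error, which is consistent with the values in the table.

```python
def accuracy_percent(decisions, errors):
    """Percentage of decisions not found to be in error."""
    return 100.0 * (1.0 - errors / decisions)


overall_torah = round(accuracy_percent(13507, 24), 3)
overall_writings = round(accuracy_percent(13507, 9), 3)
overall_combined = round(accuracy_percent(27014, 33), 3)

# About forty hours of work for the 33 errors found in this surrogate pair.
minutes_per_error = round(40 * 60 / 33)
```

These evaluate to 99.822, 99.933 and 99.878 percent, and to about 73 minutes per error, matching the figures reported above.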

4. FUTURE DIRECTIONS

We see four ways of advancing the analysis described above:

1. Outlier detection methods should be applied to the consistency problem.
2. These methods work best for assessing the validity of node labels. Two approaches have promise:
   a. Data exploration via ranked lists, wherein rare items attract suspicion.
   b. Algorithmic prospecting.18
3. Check the consistency of the annotation of groups of “words, POS tags or syntactic function labels.”19 Clever reduction of texts to sequences of POS tags and the like should greatly enlarge the surrogate texts.
4. Use tree-climbing methods to find inappropriately diverse annotations.20
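Data exploration via ranked lists, as proposed above, needs little more than a frequency count. The sketch below ranks labels from rarest to most frequent so that rare items surface first. The sample labels, including the deliberately misspelled "perp", are invented for illustration.

```python
from collections import Counter


def ranked_by_rarity(labels):
    """Return (label, count) pairs, rarest first; ties are broken alphabetically."""
    counts = Counter(labels)
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


# Invented sample: "perp" stands for a mistyped "prep" label.
labels = ["gram"] * 5 + ["prep"] * 3 + ["union/disj"] * 2 + ["perp"]
ranking = ranked_by_rarity(labels)
```

A label that occurs only once, like "perp" here, heads the list and would be the first candidate for inspection.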

18 See, for example, Zengyou, He et al., “A Fast Greedy Algorithm for Outlier Mining.” In W. K. Ng et al. (eds.), Advances in Knowledge Discovery and Data Mining, (Berlin: Springer, 2006) 567–76.
19 Kaljurand, K., “Checking Treebank consistency to find annotation errors.” 2004. See: math.ut.ee/~kaarel/NLP/Programs/Treebank/ConsistencyChecking/tbp.pdf.
20 Chen, H.-H. and M.-S. Shaw, “A Treebank Development Tool.” First Int. Conf. on Lang. Resources and Evaluation, 1998.

BIBLIOGRAPHY

Agresti, A., Categorical Data Analysis. New York: Wiley-Interscience, 2nd ed., 2002.
Altman, D. G., Practical Statistics for Medical Research. London: Chapman & Hall, 1991.
Andersen, F. I. and A. D. Forbes, Biblical Hebrew Grammar Visualized. Winona Lake, Indiana, forthcoming.
_____, “Opportune Parsing: Clause Analysis of Deuteronomy 8.” In Proceedings of the Fourth International Colloquium: Bible and the Computer: Desk & Discipline, 49–75. Paris: Editions Honore Champion, 1995.
_____, “Hebrew Grammar Visualised: I. Syntax,” ANES XL, 2003, 43–61.
Chen, H.-H. and M.-S. Shaw, “A Treebank Development Tool.” First Int. Conf. on Lang. Resources and Evaluation, 1998.
Jensen, F. V. and T. D. Nielsen, Bayesian Networks and Decision Graphs. New York: Springer, 2nd ed., 2007.
Kaljurand, K., “Checking Treebank consistency to find annotation errors,” the web, 2004.
Matthews, P. H., Oxford Concise Dictionary of Linguistics, 280. Oxford: OUP, 2005.
Zengyou, He, et al., “A Fast Greedy Algorithm for Outlier Mining.” In Ng, W. K. et al., eds., Advances in Knowledge Discovery and Data Mining, 567–76. Berlin: Springer, 2006.

THE COMPUTER AND COMPLEX PHRASE STRUCTURE: A Unified Approach to Embedding, Gapping and Recursion

JANET W. DYK
WERKGROEP INFORMATICA, FACULTY OF THEOLOGY, VRIJE UNIVERSITEIT, AMSTERDAM, NETHERLANDS

1. COMPLEX STRUCTURES

To introduce the types of syntactic dependencies that need to be considered, we offer a few examples of the complexities present in Classical Hebrew phrases. Consider the following text:

Exod 29:20

ונתתה על־תנוך אזן אהרן ועל־תנוך אזן בניו הימנית ועל־בהן ידם הימנית ועל־בהן רגלם הימנית
‘(Then you shall kill the ram, and take of his blood) and put upon the tip of the ear of Aaron, and upon the tip of the right ear of his sons, and upon the thumb of their right hand, and upon the great toe of their right foot.’

Three of the phrases in this example contain an adjective at the end which refers to a construct state form in the middle of a string of construct state forms:


‫על תנוך אזן בניו הימנית‬ ‘upon tip-of (m.sg.c.st.) ear-of (f.sg.c.st.) sons-of (m.pl.c.st.) him, the right (one)’1

Because ‘right’ is feminine singular we know that it refers to אזן, ‘ear’, the only feminine singular form in the string. When preparing a database of Hebrew texts, how should one store this string so that the correct connections are made?

In the following example, the object of the verb is composed of a noun in construct state governing a verbal construction containing a perfect form of the verb preceded by a negative:

Ps 81:6

‫שפת לא ידעתי אשמע‬ ‘a language (c.st.) I did not know, did I hear’

In yet another example, a complex participial construction as fronted element is resumed in the following sentence by means of a pronominal suffix introduced by a preposition:

Lev 7:33

‫המקריב את־דם השלמים ואת־החלב מבני אהרן לו תהיה שוק‬ ‫הימין למנה‬ ‘and he among the sons of Aaron, that offers the blood of the peace offerings, and the fat, for him the right shoulder shall be for (as) a portion’

The phrase ‘among the sons of Aaron’ occurs in the Hebrew text after the phrase ‘and the fat’. How is the syntax structured so that ‘among the sons of Aaron’ refers not to the most immediate element ‘the fat’, which would mean that the fat of the sons of Aaron was offered, but rather to the more distant ‘the one offering’, reading thus ‘he among the sons of Aaron who offers…’? Taking these three examples, we endeavour to look for an approach which is consistent and repeatable in dealing with the various structures.

1 Abbreviations: m. = masculine; f. = feminine; sg. = singular; pl. = plural; c.st. = construct state; abs.st. = absolute state; Adj. = adjective; AdjP = adjective phrase; Art. = definite article; Cj = conjunction; N = noun; NP = noun phrase; P = preposition; PP = prepositional phrase; V = verb; VP = verb phrase; Sub.Cj. = subordinate conjunction.

2. A UNIFIED APPROACH TO COMPLEX STRUCTURES: BASIC UNITS AND EXPANSIONS

When computers are used to research language, formal characteristics of the data are recorded and patterns are recognized both at graphic and morphemic levels as well as at higher levels. From the patterns that emerge, analysis is possible.

The inherent semantic and syntactic lexical characteristics of an item determine with which other elements it may or must cooccur in order to create well-formed language utterances. However, no one has ever seen the inherent lexical properties of a form; rather, they are revealed in the combinations within which a form occurs in actual language use. The idiosyncratic characteristics of a form are projected onto the constructions in which it appears, that is, ‘Lexical information is syntactically represented’.2 This direct connection between a form and its role in the syntax means that the configurations in which a form appears provide the clues to its inherent lexical characteristics. Given sufficient occurrences of a form, it is possible to accumulate information as to its nature and behaviour, and as to what can be expected to occur in the environment of a particular item. As the patterns become clear, usages of a form which deviate from the customary pattern also come to light. It is essential that the inherent lexical characteristics of a form not be prescribed on the basis of what one assumes to be true of that form, but that these be deduced from how the form manifests itself in interaction within its context.

2 For the ‘Projection Principle’, see Haegeman, L. Introduction to Government & Binding Theory, 63. Oxford: Basil Blackwell, 1991.

2.1 Basic Units: Formal Indications. Parts of Speech: Each its own Dimensions

In Classical Hebrew there are two ways of ordering components within a structure: one is linear, in which there is also the possibility of nesting, and the other is core-oriented, where satellites arrange themselves around a central element. The lexical part of speech provides essential information as to how an element will behave within a structure. Most parts of speech follow the linear manner of construction, while the verbal phrase is core-oriented, arranging the satellites around the verb, whereby the word order is not necessarily strictly linear. The possibilities for syntactic connections are given in the formal characteristics of the part of speech itself. Constructions can be broken down into simple units and combinations of these simple units. We will consider first the simple units per part of speech separately.

2.1.1 Nominal Endings

In Classical Hebrew the nominal ending determines the type of connection an item has to that which follows. The simple unit has an absolute state ending which marks the boundary of the unit. This unit can be preceded by the definite article, which some linguists would prefer to call the Determiner Phrase.3 The simple unit for noun phrase is thus:

NP = [±definite article] N absolute state

Proper nouns act as determined nouns, that is, they act as though they are in absolute state and do not normally take a preceding definite article. The same is true of pronominal suffixes.

3 For an analysis of two identical surface forms with distinct analyses using the Determiner Phrase, see Dyk, J. W., “Who Shepherds Whom?” In Gosker, M. et al., ed., Een Boek heeft een Rug: studies voor Ference Postma op het grensgebied van theologie, bibliofilie en universiteitsgeschiedenis ter gelegenheid van zijn vijftigste verjaardag, 166–72. Zoetermeer: Boekencentrum, 1995.

2.1.2 Verbal Endings

When the part of speech of an item is a verb, we are dealing with the inherent ability of a verb to connect to items to form a constellation with a particular meaning. The head of the construction is the verbal form and around this are gathered the elements required for grammaticality (called: ‘complements’) and non-obligatory satellites providing extra information (called: ‘adjuncts’). These expansions will be dealt with below. In a manner somewhat similar to the possibility of having a definite article attached to a noun, the verb can be accompanied by an explicit subject. In independent verbal clauses the word order is not strictly linear but manifests a certain degree of flexibility, and is affected by various strategies of text composition, such as topic and focus, foregrounding and backgrounding, and others.

2.2 Expansions

The simple units can be expanded by other structures. Sometimes the expansion is necessary for grammaticality, and sometimes it is optional. We will look at each of these separately per part of speech.

2.2.1 Obligatory Expansions

2.2.1.1 Nominal Phrases

When a noun occurs in construct state it requires a following element to complete the phrase. The unit which follows is syntactically a complete phrase. Though the following phrase is often a NP, other types of phrases can also be governed by a N in construct state:

N construct state + XP

The nominal ending indicates the syntactic connection: a noun in construct state governs a following phrase (for a noun in absolute state, see below, section 2.2.2.1). The order is strictly linear.

2.2.1.2 Verbal Phrases

The verb is the core of a constellation which is created by the verb’s powers of government. The satellites required by a verb to make a grammatical sentence are obligatory expansions of the verb. These include noun phrases, prepositional phrases and other structures functioning as direct objects or as some other element required by the valence of the particular verb to create a grammatical sentence. The obligatory expansions of a verb can be simple or complex phrases. Although there is a preferred or more frequently occurring order of elements in a verbal phrase within a particular specimen of the language, this order is not obligatory, but can be affected by aspects related to text composition and text hierarchy. It is the certain degree of flexibility in word order within verbal phrases which has allowed the preference for a certain word order in main clauses to shift through time and to manifest variation between dialects and genres.

2.2.1.3 Connectors

2.2.1.3.1 Prepositions: between a Noun Phrase or a Verb Phrase and the larger Context

Prepositions behave syntactically like nouns in construct state, that is, to be complete they require that a full phrase follows; the preposition syntactically governs the following phrase. The preposition functions as a connection between the phrase it governs and the larger context in which it occurs.

2.2.1.3.2 Subordinating Conjunctions: between a Verb Phrase or Nominal Clause and the larger Context

Much like prepositions, subordinating conjunctions embed clauses within a larger context. Subordinating conjunctions require the following structure to be a full phrase, and they relate the phrase they govern to the larger context in which they appear.

2.2.1.3.3 Coordinating Conjunctions: between Units of an equal Level

A coordinating conjunction is syntactically incomplete without the elements which it connects at an equal level. A coordinating conjunction can occur between words within a phrase, between phrases within a clause, and between clauses and larger textual units. All coordinated elements within the string function at an equal level.

2.2.2 Optional Extensions

Besides obligatory expansions, structures can have optional extensions. These expansions are themselves whole phrases. The syntactic relationships are again determined by the part of speech of both the form being expanded and of the expansion itself.

2.2.2.1 Noun Phrases

Noun phrases can be expanded by phrases which are attributive or appositional, or which more precisely specify the noun they refer to. These types of relationships are determined generally by the part of speech of the expansion: adjectives will be attributive, noun phrases appositional, and prepositional phrases provide extra specification.

2.2.2.2 Verb Phrases

Verb phrases can be expanded by optional phrases indicating manner, time, justification, or location of the action of the verb. An exception to this is formed by a locative occurring with a verb of movement: the locative provides obligatory information required by the verb of movement and is therefore a complement of the verb.

2.2.2.3 Connectors

A phrase introduced by a connector can also be expanded by another phrase of the same type in apposition. Coordinating conjunctions can continue in a long series of coordinated expansions.

2.2.2.4 Other parts of speech

Without giving them the attention they rightly deserve, let it suffice here to say that parts of speech such as adverb, interjection, interrogative particle, as well as other particles, assume their place within the non-obligatory expansions of both nominal and verbal constructions. Like noun phrases and verb phrases, when composed of more than one element, their internal structure is determined by their part of speech, and can be obligatorily and optionally extended by units as described above.

2.3 Summary

Syntactic relationships of non-verbal phrases can be summarized in the following table:

Head             Obligatory expansions
N abs state      -----
N const state    XP
Preposition      XP
Sub. Conj.       XP

Optional expansions:
Apposition (same phrase type)
Attribution (AdjP, for NPs)
Specification (PP / VP)

The order of the components in these structures is strictly linear at a single level. Nesting can occur, creating multiple levels within each of which the strict order is maintained. For verbal phrases the following syntactic relationships are valid:

Head    Obligatory expansions    Optional expansions
Verb    Complements              Adjuncts

Verbal phrases allow diversity in word order.
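The obligatory-expansion column for non-verbal heads lends itself to a small recursive check. The following sketch is only an illustration of the rule "governing head + XP": the tag names ("P", "Nc" for construct, "Na" for absolute, "SubCj") and the function `nest` are assumptions, and optional expansions and verbal phrases are deliberately not modelled.

```python
GOVERNING = {"P", "Nc", "SubCj"}   # heads that require a following phrase (XP)


def nest(tags):
    """Turn a linear tag sequence into nested (head, governed_phrase) tuples.

    Raises ValueError when a governing head lacks its obligatory expansion.
    Optional expansions are not modelled in this sketch.
    """
    if not tags:
        raise ValueError("missing obligatory expansion")
    head, rest = tags[0], tags[1:]
    if head in GOVERNING:
        return (head, nest(rest))
    if rest:
        raise ValueError("material after a closed unit is not modelled here")
    return (head,)


# Preposition + construct + construct + closing absolute form.
structure = nest(["P", "Nc", "Nc", "Na"])
```

Each governing head opens a new level that is closed only by a complete phrase, so a chain that ends in a construct form or a preposition is rejected.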

3. COMPLEX STRUCTURES AS BASIC UNITS AND EXPANSIONS

We now return to the examples presented at the beginning. By means of a cyclic application of the structures described in section 2 above, the complex structures are reduced to units from the short list of basic units and expansions.

Exod 29:20

‫ונתתה על־תנוך אזן אהרן ועל־תנוך אזן בניו הימנית ועל־בהן‬ ‫ידם הימנית ועל־בהן רגלם הימנית‬ ‘(Then shalt thou kill the ram, and take of his blood) and put upon the tip of the ear of Aaron, and upon the tip of the right ear of his sons, and upon the thumb of their right hand, and upon the great toe of their right foot.’

THE COMPUTER AND COMPLEX PHRASE STRUCTURE

125

This clause has a complex phrase where the blood of the ram should be placed. The phrase is the obligatory expansion of the verb:4 ‫על־תנוך אזן אהרן ועל־תנוך אזן בניו הימנית ועל־בהן ידם הימנית‬ ‫נתתה‬ ‫בהן רגלם הימניתועל־‬ complement verb ‘upon the tip of the ear of Aaron, and upon the tip of the right ‘put’ ear of his sons, and upon the thumb of their right hand, and upon the great toe of their right foot’

This complement of the verb is built up of four coordinated phrases which are equal to one another, i.e. coordinated, not subordinated:

Coordinated Phrase = XP Cj XP Cj XP Cj XP = XP [Cj XP]x

PP: על־תנוך אזן אהרן ‘upon the tip of the ear of Aaron’
Cj: ו ‘and’
PP: על־תנוך אזן בניו הימנית ‘upon the tip of the right ear of his sons’
Cj: ו ‘and’
PP: על־בהן ידם הימנית ‘upon the thumb of their right hand’
Cj: ו ‘and’
PP: על־בהן רגלם הימנית ‘upon the great toe of their right foot’

Three of the phrases in this clause contain an adjective at the end referring to a construct state form in the middle of a string of construct state forms. These can be analyzed using the units and expansions presented above. We will take the second in the string as illustration. The prepositional phrase in focus is composed of a preposition and a noun phrase governed by the preposition:

PP = P NP

P: על ‘upon’
NP: תנוך אזן בניו הימנית ‘the tip of the right ear of his sons’

⁴ Within the charts containing Hebrew, the Hebrew convention of right-to-left writing has been followed.

The noun phrase is composed of a noun in construct state and a full noun phrase governed by the construct state form:

Nc.st.: תנוך ‘tip-of’
NP: אזן בניו הימנית ‘the right ear of his sons’

The following NP exhibits nesting. In order to allow the non-obligatory expansion ‘the right one’ to connect back to ‘the ear’ with which it is congruent, it is necessary to take into account the boundaries of the phrases in the composition. As already mentioned, pronominal suffixes function in the syntax as determined elements and as though having absolute state. This absolute state provides the boundary after which non-obligatory expansions can be added. Only by accepting that each construct state form is the head of a new phrase which in turn governs a complete phrase is it possible to make the connection between ‘ear’ and ‘right’.

Nc.st.: אזן ‘ear-of’
  NP ‘his sons’:
    Nc.st.: בני ‘sons-of’
    Suffix: ו ‘him’
Art. + Adj: הימנית ‘the right (one)’ (connects back to אזן ‘ear-of’)
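
The nesting in this chart can be made explicit as a tree in which the construct state form heads a phrase and governs a complete phrase of its own. The Python sketch below is only an illustration of that idea: the class `Phrase` and the functions `bracket` and `depth` are hypothetical names, and English glosses stand in for the Hebrew forms.

```python
from dataclasses import dataclass, field


@dataclass
class Phrase:
    label: str                                  # e.g. "NP", "Nc.st.", "Suffix"
    gloss: str = ""                             # gloss of a terminal unit
    parts: list = field(default_factory=list)   # nested units, in text order


def bracket(p: Phrase) -> str:
    """Render a phrase as a labelled bracketing with nested phrases inside."""
    if not p.parts:
        return f"[{p.label} {p.gloss}]"
    inner = " ".join(bracket(child) for child in p.parts)
    return f"[{p.label} {inner}]"


def depth(p: Phrase) -> int:
    """Number of levels in the phrase, counting the phrase itself."""
    return 1 + max((depth(child) for child in p.parts), default=0)


# The governed NP 'his sons' is closed off by the suffix; only after that
# boundary is the adjective added, at the level headed by 'ear-of'.
his_sons = Phrase("NP", parts=[Phrase("Nc.st.", "sons-of"), Phrase("Suffix", "him")])
right_ear = Phrase(
    "NP",
    parts=[Phrase("Nc.st.", "ear-of"), his_sons, Phrase("AdjP", "the-right-one")],
)

print(bracket(right_ear))
# [NP [Nc.st. ear-of] [NP [Nc.st. sons-of] [Suffix him]] [AdjP the-right-one]]
```

In the bracketing the adjective is a sister of ‘ear-of’ and lies outside the inner NP, which is the connection the arrow in the original chart expresses.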

We must keep in mind that this structure is but one of the four coordinated prepositional phrases similarly constructed which together form the complement of the verb ‘place’. Yet in spite of the complexity, each of the units within the structure is simple in form and the types of combinations come from a limited repertoire of possibilities. The different combinations of these basic forms are, however, limitless in patterning possibilities. The complexities of the phrases in Jeremiah 48:24 can be handled in the same way. In this verse, a list of those upon whom judgment had come continues from verse 21. The last phrase in verse 24 is:

Jer 48:24

‫ועל כל־ערי ארץ מואב הרחקות והקרבות‬ ‘and upon all the cities of the land of Moab, far or near’

Due to the feminine plural ending on the final adjectives, it is clear that these are syntactically connected to ‘cities’ in the middle of the string of construct state forms. Again a step-by-step application of the simple rules leads to a correct connection:

PP = P NP

P: על ‘upon’
NP: כל־ערי ארץ מואב הרחקות והקרבות ‘all the cities of the land of Moab, far or near’

The noun phrase is composed of a noun in construct state governing a full noun phrase:

Nc.st.: כל ‘all-of’
NP: ערי ארץ מואב הרחקות והקרבות ‘the cities of the land of Moab, far or near’

The following noun phrase exhibits nesting. In order to allow the non-obligatory expansion ‘far and near’ to connect back to ‘the cities’ with which it is congruent, it is necessary to take into account the boundaries of the phrases in the composition. As mentioned, proper nouns function in the syntax as determined elements and as though having absolute state. This absolute state provides the boundary after which non-obligatory expansions can be added. Only by accepting that each construct state form introduces a new head of a phrase which governs a complete phrase is it possible to make the connection between ‘cities’ and ‘far and near’.

Nc.st.: ערי ‘cities-of’
  NP ‘the land of Moab’:
    Nc.st.: ארץ ‘land-of’
    Proper noun: מואב ‘Moab’
Coordinated AdjP: הרחקות והקרבות ‘the far and the near’ (connects back to ערי ‘cities-of’)
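
How congruence picks out the head of a trailing adjective can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure of any actual parsing program: the names `Noun` and `attach_adjective` are hypothetical, the gender and number values are entered by hand, and searching backwards for the nearest congruent noun is an assumption made for the sketch.

```python
from typing import NamedTuple, Optional


class Noun(NamedTuple):
    gloss: str
    gender: Optional[str]   # "f", "m", or None when not specified here
    number: Optional[str]   # "sg", "pl", or None when not specified here


def attach_adjective(chain, adj_gender, adj_number):
    """Return the nearest noun in the chain that agrees with the adjective.

    The chain is searched from its end backwards, so the adjective attaches
    to the closest congruent head rather than simply to the adjacent word.
    """
    for noun in reversed(chain):
        if noun.gender == adj_gender and noun.number == adj_number:
            return noun
    return None


# Jer 48:24: 'cities-of land-of Moab' followed by feminine plural adjectives.
jer_chain = [
    Noun("cities-of", "f", "pl"),
    Noun("land-of", "f", "sg"),
    Noun("Moab", None, None),
]
print(attach_adjective(jer_chain, "f", "pl").gloss)   # cities-of

# Exod 29:20: 'tip-of ear-of sons-of' followed by a feminine singular adjective.
exod_chain = [
    Noun("tip-of", "m", "sg"),
    Noun("ear-of", "f", "sg"),
    Noun("sons-of", "m", "pl"),
]
print(attach_adjective(exod_chain, "f", "sg").gloss)  # ear-of
```

Agreement alone does not replace the phrase boundaries discussed above; it only identifies which of the available heads the expansion belongs to.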

The second case introduced in the introduction is:

Ps 81:6

שפת לא ידעתי אשמע
‘a language I did not know, shall I hear’

The first major division is between the verbal form and the phrase functioning as its object. As mentioned, within verbal clauses the word order is not strictly linear. In this case the object precedes the verb, but is nonetheless governed by the verb:

NP functioning as object of the verb: שפת לא ידעתי ‘a language I did not know’
Verb: אשמע ‘I shall hear’

To analyze the structure of the object correctly, we must accept again that a construct state form in Hebrew is the head of a new phrase and that it governs a full phrase, and that the phrase thus governed can be a verbal phrase. In translations, a relative pronoun is often added, ‘I heard a language that I understood not’ (KJV), but the Hebrew does without the extra element. Within the verbal phrase, the parsing is the same as in any verbal phrase:

Nc.st.: שפת ‘language-of’
  VP ‘I knew not’:
    Negative: לא ‘not’
    Verb (perfect): ידעתי ‘I knew’

In the third case given in the introduction, a complex participial construction functions as a fronted element resumed in the following sentence by a pronominal suffix:

Lev 7:33

המקריב את־דם השלמים ואת־החלב מבני אהרן לו תהיה שוק הימין למנה
‘and he among the sons of Aaron, that offers the blood of the peace offerings, and the fat, for him the right shoulder shall be for a portion’

The difficulty in the Hebrew is that ‘from among the sons of Aaron’ occurs immediately following ‘the fat’. To be able to connect this phrase back to the correct referent it is essential that the correct units and boundaries be recognized. Breaking it down one step at a time, we first have the largest divisions:

Fronted element (casus pendens): המקריב את־דם השלמים ואת־החלב מבני אהרן ‘and he among the sons of Aaron, that offers the blood of the peace offerings, and the fat’
Main clause with resumptive pronoun: לו תהיה שוק הימין למנה ‘for him the right shoulder shall be for a portion’

The fronted element can be further analyzed as a definite NP with a PP as specification. At this point, the specifying phrase ‘among the sons of Aaron’ makes the right connection. In the English translation, the phrase is moved to the beginning so that it modifies the one performing the act of offering, that is, the head of a NP. Hebrew does not need to use a lexicalized nominal form here, such as the English ‘giver’, ‘builder’, or in this case the hypothetical form ‘offerer’. Instead the NP itself is a participial clause with verbal satellites. This should not be allowed to cloud our judgment as to the correct boundaries and connections. The specifying phrase ‘among the sons of Aaron’ is an expansion of the head of this construction which is nominal at this level of syntax, though verbal in relation to the elements indicating what is offered. The whole fronted element fills a single syntactic slot as casus pendens in the clause (see above).

Art. + NP: המקריב ... ‘the offerer (the one offering) …’
PP as specification: מבני אהרן ‘among the sons of Aaron’ (connects back to המקריב)

The internal structure of this fronted element follows the rules described in section 2 above:

Art.: ה ‘the’
Participial clause functioning as a NP ‘offering the blood of the peace offerings and the fat’:
  V: מקריב ‘offer’
  Coordinated phrases introduced by OBJ MARK ‘the fat and the blood of the peace offering’:
    OBJ MARK phrase ‘the blood of the peace offering’:
      OBJ MARK: את
      NP ‘the blood of the peace offering’:
        Nc.st.: דם ‘blood-of’
        Def.art + N: השלמים ‘the peace offering’
    Cj: ו ‘and’
    OBJ MARK phrase ‘the fat’:
      OBJ MARK: את
      Art. + N: החלב ‘the fat’

The following example contains multiple embeddings; a cyclic application of the combinatory possibilities presented in section 2 allows each element to have its appropriate place within the structure:

2 Kgs 10:24

האיש אשר־ימלט מן־האנשים אשר אני מביא על־ידיכם נפשו תחת נפשו
‘the man who allows to escape (one) from the men whom I am bringing upon your hands, his soul shall be instead of his soul’

Fronted element (casus pendens): האיש אשר־ימלט מן־האנשים אשר אני מביא על־ידיכם ‘the man who allows to escape (one) from the men whom I am bringing upon your hands’
Nominal clause with resumptive pronoun: נפשו תחת נפשו ‘his soul instead of his soul’

Taking only the structure of the fronted element into consideration, we have a definite NP with a non-obligatory attributive clause introduced by a subordinating conjunction, the so-called relative pronoun:

Art. + N: האיש ‘the man’
Attributive clause introduced by אשר: אשר־ימלט מן־האנשים אשר אני מביא על־ידיכם ‘who allows to escape (one) from the men whom I am bringing upon your hands’

The attributive clause breaks down into:

Sub.Cj: אשר ‘who’
VP: ימלט מן־האנשים אשר אני מביא על־ידיכם ‘he shall allow to escape from the men whom I am bringing upon your hands’

This VP breaks down into:

Verb (imperfect): ימלט ‘he shall allow to escape’
PP as complement of the verb: מן־האנשים אשר אני מביא על־ידיכם ‘from the men whom I am bringing upon your hands’

The prepositional phrase breaks down into:

P: מן ‘from’
NP: האנשים אשר אני מביא על־ידיכם ‘the men whom I am bringing upon your hands’

The NP object of the preposition breaks down into a determined noun with a non-obligatory attributive clause introduced by אשר:

Art. + Nabs.st.: האנשים ‘the men’
Attributive אשר clause: אשר אני מביא על־ידיכם ‘whom I am bringing upon your hands’

This final attributive clause breaks down into:

Sub Cj: אשר ‘whom’
Participial verbal clause: אני מביא על־ידיכם ‘I am bringing upon your hands’

The participial verbal clause containing an explicit subject can be analyzed as:

Subject: אני ‘I’
Verb: מביא ‘bringing’
PP as complement: על־ידיכם ‘upon your hands’

Thus by recursive application of the same simple rules the most complex structures can be made transparent by using a limited list of units and their expansions guided by the part of speech, by the different structural possibilities of construct state and absolute state of nominal forms and by the governing valence of verbs. The number of basic units and the types of extensions of these units are limited, although the patterns of combinations are potentially limitless.
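
The recursion summarized here can be illustrated with a tiny recursive-descent sketch in Python. It is a hand-written simplification that covers only the fronted element of 2 Kgs 10:24, not a grammar of Hebrew and not the program used for the analyses above. The category tags, the two rules and the function names `parse_np`, `parse_clause` and `depth` are all assumptions introduced for the example.

```python
def parse_np(tokens, i):
    """NP -> N [SubCj Clause]: a noun with an optional attributive clause."""
    cat, gloss = tokens[i]
    if cat != "N":
        raise ValueError(f"expected N at position {i}, found {cat}")
    node = ["NP", gloss]
    i += 1
    if i < len(tokens) and tokens[i][0] == "SubCj":
        clause, i = parse_clause(tokens, i + 1)
        node.append(["AttrClause", clause])
    return node, i


def parse_clause(tokens, i):
    """Clause -> [Subj] V [P NP]: verb, optional subject, optional PP."""
    node = ["Clause"]
    if tokens[i][0] == "Subj":
        node.append(["Subj", tokens[i][1]])
        i += 1
    cat, gloss = tokens[i]
    if cat != "V":
        raise ValueError(f"expected V at position {i}, found {cat}")
    node.append(["V", gloss])
    i += 1
    if i < len(tokens) and tokens[i][0] == "P":
        prep = tokens[i][1]
        np, i = parse_np(tokens, i + 1)   # the PP governs a complete NP
        node.append(["PP", prep, np])
    return node, i


def depth(node):
    """Number of nested levels, counting the node itself."""
    children = [c for c in node[1:] if isinstance(c, list)]
    return 1 + max((depth(c) for c in children), default=0)


# Category-tagged glosses in text order (tags assigned by hand).
TOKENS = [
    ("N", "the man"), ("SubCj", "who"), ("V", "allows to escape"),
    ("P", "from"), ("N", "the men"), ("SubCj", "whom"),
    ("Subj", "I"), ("V", "bringing"), ("P", "upon"), ("N", "your hands"),
]

tree, end = parse_np(TOKENS, 0)
print(tree)
print(depth(tree))
```

The two functions call each other, so the same pair of rules accounts for the second relative clause embedded inside the first without any additional machinery.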

CONCLUSIONS

From this parsing exercise we can arrive at the following conclusions:

• All structures no matter how complex are composed of simple units and a limited number of types of extensions
• The part of speech determines what type of syntactic connections a form makes
• The state of a nominal form determines its relation to the following element
• Nouns in absolute state mark the boundary of a potentially independent phrase
• Proper nouns and suffixes function syntactically as determined nouns in absolute state
• Each nominal structure follows a strictly linear order in which components occur
• Nouns in construct state must be taken to be the head of a new phrase which governs a whole phrase which in turn has the possibility of obligatory and non-obligatory expansions
• Verbal valence determines the combinatory potential of a verb
• Verbal structures have a less strictly linear order in which components occur
• Computer programs dealing with Hebrew need to allow for the fact that a verb with its satellites can occur within a nominal environment and that at a higher syntactic level the constellation as a whole is still part of a single phrase
• Between word level and clause level constituents, computer programs must allow for the potential of multiple levels of structure so that complex phrases can be dealt with in a consistent manner which allows their internal structure to be transparent
• Though the number of basic units and the types of extensions of these units are limited, the potential combinations are unlimited.

This approach to the syntax of Semitic languages has a further advantage: through comparative research we have found that when making comparisons between languages, it is often at the boundaries of the units and their expansions that different syntactic strategies occur which reflect the system of the language itself. In this way differences between individual styles and genres within a single language can also be detected. Thus a consistent application of a simple set of rules provides insight into how complex phrases are built up within a language as well as insight into how languages differ in forming syntactically complex structures.

BIBLICAL HEBREW LINGUISTICS AS CORPUS LINGUISTICS¹

C. H. J. VAN DER MERWE
UNIVERSITY OF STELLENBOSCH

1. INTRODUCTION

The purpose of this paper is to illustrate how the synergy of available computer tools and current linguistic models can help a Biblical Hebrew (BH) grammarian to do his job. The presentation will take the form of a case study.²

In recent years I have found both BH grammars and lexica lacking in their representation of three types of important information.³ Firstly, users are rarely informed unambiguously what the most typical or frequent use(s) of a lexeme or a construction is. Semantic categories in most cases are listed as “flat” taxonomies. In other words, rare untypical categories could easily be treated by less advanced users as on a par with more prototypical ones.⁴ Secondly, the empirical status of the different categories in lexica is often not clear; e.g. are they semantic or syntactic distinctions? Furthermore, there is seldom an indication whether specific semantic categories and the associated suggested translation values could be linked with specific syntactic patterns. Thirdly, near-synonyms are listed, but the differences among them are seldom indicated.

To illustrate the above-mentioned, consider the following example from one of the most authoritative BH lexica, viz. Koehler and Baumgartner (1999):

אַף (about 120 ×): particle; MHb.; Ug. ap, Ph.; Arm. (→ BArm., MdD 30b, Altheim-S. Spr. 266) also אָף and אוֹף, → פ; DISO 21, → Garbini Semitico 167f: also, even; particularly often in Deut.-Is, Ps and Job; —1. additional (→ גַּם): אַף אֲנִי Gn 40:16, Lv 26:24 Sam גַּם; וְאַף לַאֲמָֽתְךָ and also to your maidservant Dt 15:17, אַף עָֽרְכָה she also furnishes Pr 9:2; —2. emphasising: אַף אֲנִי I for my part Ps 89:28, וְאַף־גַּם־זֹאת and yet for all that Lv 26:44; —3. enhancing: אַף קָדְקֹד even the crown of the head Dt 33:20, אַף נָעִים certainly pleasant Song 1:16; אַף לֹא־הֵבִיאָה moreover she has not allowed to come Est 5:12, אַף־תֵּיטִיבוּ you really should do good Is 41:23, אַף שֹׁכְנֵי how much more those who dwell Jb 4:19; —4. antithetic: אַף־זָנַחְתָּ but you have spurned Ps 44:10; —5. compounds: אַף־אָמְנָם really and truly Jb 34:12, הַאַף־אָמְנָם shall I in fact? Gn 18:13, הַאַף אֵין־זֹאת should this really not be? Am 2:11, אַף בַּל never yet Is 40:24, אַף־אֵין there is none at all Is 41:26, הַאַף will you really? Jb 40:8; —6. אַף כִּי a) frequently כִּי introduces a plain conditional clause: אַף כִּי … שִׁלַּחְתִּי how much more if I send Ezk 14:21, or אַף elliptically introduces a question, the contents of which are preceded by כִּי: אַף כִּי אָמַר did God really say? Gn 3:1, or אַף stresses a clause of time: אַף כִּי even when Neh 9:18; b) אַף כִּי has become a unit: how much more when 2S 4:11, וְאַף כִּי־עַתָּה and how much more now 2S 16:11; after a negative clause: how much less Jb 9:14, how much less this house 1K 8:27; ? Hab 2:5, → Comm.; — 2K 2:14 for אַף הוּא Sept. αφφω, Jerome aph pho = אֵפֽוֹ ?; Ps 96:12 for אָז cj. אַף; Dt 33:3 rd. אָֽפְףֻ for אַף ? (Cross-F. JBL 67:199); Jb 36:29 אַף אִם rd. אַף מִי ? (:: Tur-Sinai 504: = אַף אָמְנָם or EgArm. אפם truly); Pr 22:19 אַף אָֽתָּה rd. אָרְחֹתָיו (Gemser); ? 2S 20:14, Ps 108:2 and Jb 36:16.

¹ The financial assistance of the National Research Foundation (NRF) of South Africa towards this research is hereby acknowledged. Opinions expressed in this publication and conclusions arrived at are those of the author and are not necessarily to be attributed to the NRF.
² Van der Merwe (2009a, 266–83) presents a more exhaustive description of the findings of this case study.
³ Cf. Van der Merwe (2009a, 266–67). Lübbe (1993, 119–37), Van der Merwe (2004, 123 and 2006, 92–95) and Imbayarwo (2008, 108–40) also point out some of the inadequacies of current Biblical Hebrew lexica.
⁴ In the case of an ancient language of which only a closed corpus of texts is available, it is assumed that frequency of use and the even distribution throughout the corpus may be interpreted as evidence of the prototypicality of that category.

A generic translation value appears to be suggested as “also, even”. Six categories are then listed: 1. addition; 2. emphasising; 3. enhancing; 4. antithetic; 5. compounds; and 6. אַף כִּי. The first four distinctions are semantic categories, but the last two appear to be morpho-syntactic in nature. From the examples given, one can infer that the translation value “also” is to be associated with 1. and “even” with 3. It is also suggested that as far as 1. is concerned, אַף is a near-synonym of גַּם. No attempt is made to indicate if and/or how אַף differs from גַּם. Under 5. “compounds”, the translation value “really” is dominant. What the relationship between 5. and 2. (“emphasis”) should be, however, is not clear. While אַף in 5. and 6. is clearly tied to morpho-syntactic constructions, none of the other four categories could be associated with a specific syntactic pattern.

Even when one considers all the other available BH lexica and grammars, as well as other publications in the field,⁵ not much more is learned about אַף or its relationship with גַּם. Only two other studies are worth mentioning: Muraoka (1985, 143), who observes that “emphasis in the sense of asseveration is not a function of the particle”, and Waltke and O’Connor (1990, 663), who state that אַף is “simpler and closer to וְ. אַף can also serve as a correlative, lining up the situation of its clause with that of the previous clause…. Followed by כִּי, אַף can have an emphatic role.… גַּם generally has more distinctly logical force than אַף, though it can be used as an emphatic”.

The purpose of this case study is to make a contribution towards a model that could be used in addressing the above-mentioned inadequacies in the description of BH particles. It is assumed that this model could also be applied and/or modified to be used in the analyses of other particles, full lexical items as well as typical grammatical constructions. It is, firstly, hypothesised that an exhaustive corpus linguistic analysis of the BH particles holds the key to this endeavour. However, the qualitative experimentation and interpretation of the quantitative data should be informed by a thoroughly justified theory of language. Or, to put it differently, a well-justified theory of language should prompt the investigator to ask the appropriate questions, and then draw relevant qualitative conclusions from the quantitative data.

Since a “corpus linguistics” analysis does not represent a homogeneous methodology, and since I would also like to evaluate this study as an instance of corpus linguistics, I commence in section 2 with a brief description of this notion. In section 3, I provide a few reasons why I found basic insights from cognitive linguistics useful for the theoretical framework to underpin this investigation. The way that these insights come into play in the practical procedures of investigating all instances of אַף in the Hebrew Bible with the help of two electronic tools is described in section 4. In section 5 the outcomes of this corpus linguistic study of אַף are presented. In conclusion, I critically evaluate this study as an instance of corpus linguistics.

⁵ E.g. Brown-Driver-Briggs (1907); Rüterswörden et al. (1987); Clines (1993); Gesenius (1909); Joüon-Muraoka (1991); and Van der Merwe, Naudé and Kroeze (1999).

2. CORPUS LINGUISTICS

“Corpus Linguistics” is seen as the study of linguistic phenomena through large collections of texts or corpora, in contrast to relying on the intuitions of mother-tongue speakers. Its roots predate computer technology, but the upsurge of interest in it is associated with “the analysis of a corpus or corpora of naturally occurring languages which are machine-readable so that the retrieval of the search patterns is computerized” (Gries 2006, 4). The methodologies of these analyses, however, are by no means homogeneous. Gries (2006, 4), for example, identifies five criteria that are prototypically shared by corpus linguistic investigations, but also distinguishes three parameters in terms of which corpus linguistic studies may differ. Relevant for our purposes are two of the criteria that are shared:

1. “the analysis is, or at least attempts to be, systematic and exhaustive, meaning that the corpus does not simply serve as a database of examples from which some can be chosen ad libitum and others neglected, but that the whole (sample of the) corpus is taken into consideration so that even less frequent patterns must somehow be integrated or at least addressed;
2. the analysis aims at more than just accounting for a categorical either-or phenomenon, but uses statistical data (frequencies, percentages/probabilities, statistical methods) to cover the middle ground between what is possible/grammatical and what is not.”

The parameters in terms of which corpus linguistic studies differ are the following (Gries 2006, 6):

1. Firstly, they differ as far as the granularity of the analysis is concerned; for example, in one instance only one form of a lemma is investigated without considering any other inflectional forms, in another a lemma and all its inflected forms are investigated, and so on;
2. Secondly, they differ as far as the qualitative inferences that are made on the basis of the frequency data from the corpora are concerned. According to Gries (2006, 5), most of the studies “restrict themselves to reporting frequency data and (usually) attribute some importance to the different frequencies with which particular categories are attested in data and the consequences this has for the phenomenon under investigation”;
3. Thirdly, they differ on the level of sophistication as far as the multi-factorial techniques and statistical methods that are used are concerned.

With regard to BH, from the 18th century to the middle of the 20th century most BH scholars regarded historical and comparative-philological data⁶ as the key towards achieving a better understanding of BH, a language based on such a limited corpus of texts, which in turn have a history of their own, that some even questioned the status of the language itself.⁷ Since the second half of the previous century BH scholars have started to appreciate more and more the insights that a description of distributional patterns of the entire corpus could provide. Although this type of manual (or even semi-automated)⁸ “corpus linguistics” is theoretically possible, it is extremely time-consuming. Initiatives (e.g. those of Richter, Andersen-Forbes and Talstra)⁹ towards compiling linguistic databases so that computer technology could be used to speed up experimentation with various distribution patterns have therefore been welcomed. At present these databases are at various stages of completion and/or accessible to varying degrees.

Despite some of its limitations, I used the Stuttgart Electronic Study Bible (SESB)¹⁰ to search and compile syntactic constructions in this investigation with a finer level of granularity than was possible before. I used Vocabula,¹¹ an electronic template designed for the preparation of the Semantic Dictionary of Biblical Hebrew, for postulating, storing, reviewing and manipulating “possible productive” syntactic and semantic categories of use. The process of this experiment was informed by a number of crucially important insights from cognitive linguistics.

⁶ Cf. Van der Merwe (2006a, 95–96).
⁷ See, for example, Knauf (1990).
⁸ “Semi-automated” means that basic computer tools (e.g. word processors, electronic spreadsheets and databases) were used for the recoding and storing of manually searched data. Exemplary in this regard are Hoftijzer (1981 and 1985), Richter (1985 and 1986), Jenni (1992, 1994 and 2000), Gross (1996) and Follingstadt (2001).
⁹ For Richter, cf. Eckardt (1987) and Rechenmacher and Van der Merwe (2003, 226–42); for Andersen-Forbes, cf. http://www.andersenforbes.org; and for Talstra, cf. Talstra (2002, 3–22) and http://wivu.dans.knaw.nl/.
¹⁰ Cf. http://www.logos.com/products/details/3108.
¹¹ Cf. De Blois (2002, 275–95).
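
The two shared criteria quoted in this section, exhaustive coverage and the use of frequencies instead of hand-picked examples, can be illustrated with a small tally in Python. This is only a sketch: the function name `tally` is hypothetical, the records are invented placeholders rather than data from the corpus, and the interfaces of SESB and Vocabula are not modelled.

```python
from collections import Counter


def tally(occurrences):
    """Count categories over all occurrences and compute relative frequencies.

    Occurrences without a category are returned separately, so that rare or
    awkward cases cannot silently drop out of the analysis.
    """
    counts = Counter()
    unclassified = []
    for ref, category in occurrences:
        if category is None:
            unclassified.append(ref)
        else:
            counts[category] += 1
    total = sum(counts.values()) + len(unclassified)
    shares = {cat: n / total for cat, n in counts.items()}
    return counts, shares, unclassified


# Placeholder records (reference label, assigned category); purely illustrative.
sample = [
    ("occ-1", "A"),
    ("occ-2", "A"),
    ("occ-3", "B"),
    ("occ-4", None),   # not yet classified
]

counts, shares, todo = tally(sample)
print(counts.most_common())
print(shares)
print(todo)
```

Computing the shares against the full total, including the unclassified items, keeps the less frequent or unresolved patterns visible, which is the point of the first criterion.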


3. WHY IS THE COGNITIVE LINGUISTIC (CL) PARADIGM SO USEFUL?

Since it is impossible to do full justice in one paper to an entire paradigm of linguistic research,¹² I will focus on the most pertinent issues:

1. CL sees language “as an instrument for organizing, processing, and conveying information”, that is, as something primarily semantic (Geeraerts 2006, 3). Accordingly, CL puts the meaning of language at the centre of the linguistic enterprise. While it may sound obvious that theories of language should shed light on the description of the meaning of lexemes and grammatical constructions, many lexicographers and grammarians will tell you that, despite the heuristic value of some models of meaning, few provide comprehensive frames of reference that can be used by language practitioners.

2. Knowledge of language emerges from its use (Croft and Cruse 2004, 1). In the words of Gries (2006, 2), “rather than assuming that knowledge of language is best conceived of as categorial and determined by discrete either-or-parameter settings”, it is assumed that “linguistic knowledge is ultimately shaped by how language is actually put to use” and that the ways in which language is used “influence the representation and the processing of linguistic categories.” Grammar is conceptualisation; in other words, “categories and structures in semantics, syntax and phonology are built up from our cognition of specific utterances on specific occasions of use. This inductive process of abstraction and schematization does not lose the conventionalized subtleties and differences found among even highly specific grammatical constructions and word meaning” (Croft and Cruse 2004, 3–4). This implies that corpus linguistic analysis could be employed to identify those patterns of use which have become conventionalised. In the case of BH, it provides a further justification for the need to conduct corpus linguistic analyses of the language, and to probe the implications that could be drawn from quantitative data.

3. There is no categorial distinction between grammatical and lexical meaning. According to Langacker (2008, 22), “[t]he lexicon and grammar of a language consist of symbolic assemblies that in substantial measure have achieved the status of conventional units.” Furthermore, the “lexicon and grammar form a gradation consisting solely in assemblies of symbolic structure” (2008, 5) and the “lexicon can be characterized as residing in fairly specific symbolic assemblies, and grammar in more schematic ones” (2008, 22). This perspective helps one to appreciate better the value of insight into the syntactic configurations in which lexical items with a very generic or a minimal semantic content are used. Our description of אַף in §5 will illustrate this point.

4. Language is not an autonomous cognitive faculty, and meaning is not constructed through language as a separate and independent module of the mind, but it reflects the overall experience of human beings. This means CL can draw from a range of disciplines on, for example, how the brain and mind work according to the neurological sciences, how humans as biological entities experience the world, according to the behavioural sciences, and how these experiences are reflected as universals across languages, and also how languages, according to cultural anthropology, reflect the historical and cultural experiences of groups of speakers.

5. Of particular relevance for this presentation are the insights of psychological research into the way humans across languages categorise. For example, categories tend to have fuzzy borders, with prototypical examples and less prototypical examples. Consider the category “fruit”. An apple is a more prototypical example of the category “fruit” than a watermelon. Categories also display hierarchical levels of schematisation, i.e. moving from more specific levels to more general ones (Labrador > dog > domestic animal). However, some categories may be specific at one level, but unspecified at another; for example, at the level of domestic animals, dog refers to a specific type of domestic animal; but at the level of dogs, it is unspecified as far as being male or female. This implies that “[t]he semantics of lexical and constructional units is not a bag of meanings, but is a (prototypically and schematically) structured meaning potential that is sensitive to contextual effects” (Geeraerts 2006, 10).

¹² Cf. also Van der Merwe (2006a, 2006b, 2006c and 2007).

6. The “meaning potential” of an expression is not only to be construed in terms of its position in a hierarchically ordered schematic network. Room must be left for the way in which the meaning of expressions might have developed in terms of a radial network. This means less central uses are extended from a central one. The extended uses or mappings are not arbitrary, but take place in terms of identifiable principles; for example, the mappings have the character of metonymic and metaphoric extensions, and shifts tend to take place from space to time, from concrete to abstract, from specific to generic, etc. As an illustration in this regard I refer to some of the shifts I identified in my study of הִנֵּה,¹³ viz. a presentation of proximity of location (#1) became the presentation of proximity in time (#2) and then what can be regarded as a mental proximity (#3): “Keeping x in mind (mental proximity), do y.”

1   וַיַּגִּידוּ לַמֶּלֶךְ לֵאמֹר הִנֵּה נָתָן הַנָּבִיא
    And they told the king, “Here is Nathan the prophet” (1Kgs 1:23).

2   הִנְנִי מֵבִיא רָעָה עַל־הַמָּקוֹם הַזֶּה
    Look, I am about to bring a disaster over this place (Jer 19:3).

3   הִנֵּה שָׁמַעְתִּי כִּי יֶשׁ־שֶׁבֶר בְּמִצְרָיִם רְדוּ־שָׁמָּה וְשִׁבְרוּ־לָנוּ מִשָּׁם
    Look, I have heard that there is grain in Egypt, go down and buy grain for us there (Gen 42:2).

7. The most central use of an expression in an earlier stage of a language is not necessarily its most prototypical use at a later stage. Only a quantitative analysis of a specific synchronic layer could shed light on the most prototypical use(s) that became conventionalised at a particular stage of a language; for example, in the case of הִנֵּה, instances such as #1 represent only about 8% (83/1060) of the corpus, while those such as #2, 29% (308/1060) and #3, 15% (161/1060) of the corpus. A “formal” pattern in the form of וְהִנֵּה occupies 25% (265/1060) of the corpus. Cf. example #4.

4   וְהוּא יָצָא וַעֲבָדָיו בָּאוּ וַיִּרְאוּ וְהִנֵּה דַּלְתוֹת הָעֲלִיָּה נְעֻלוֹת
    After he had gone, his servants came and saw, to their surprise, the doors of the upper room were locked! (Judg 3:24).

¹³ Van der Merwe and Naudé (forthcoming).

Textbooks describing this intuitively appealing perspective on how language works as a rule abound with illustrations of how both the lexical and grammatical meanings of nouns and verbs work. Spatial prepositions also get their due share. Traditional Cinderellas of grammars and lexica, i.e. most of the “small words” or “particles” other than prepositions, have not yet received much attention. I therefore had fewer “ready-made handles” than I would have wished for. Nevertheless, the basic principles of categorisation in terms of prototypical categories and meaning extension in terms of radial networks, as well as the idea that constructions of various sizes may be conventionalised into specific categories of use, turn out to be extremely useful in the corpus analyses I undertook. Before discussing the results of these studies, I will briefly explain my modus operandi.

4. PROCEDURE

The following steps are not necessarily exhaustive nor are they followed in a strict order; they do, however, illustrate a “prototypical scenario”.

1. If a particle has near-synonyms, the most frequently occurring one is investigated first.
2. A generic (where possible) semantic value is assigned to the particle or constructions in which the particle is used. For this purpose all existing secondary literature is consulted.
3. A pilot study is conducted with the help of Vocabula; for example, a sample of narrative and poetic texts, as well as Classical and Late Hebrew texts, is used in order to verify the empirical status of the above-mentioned assigned semantic value(s). In a preliminary run through of all instances in the sample corpus, no categories are recorded.
4. Literature on translation equivalents of the BH particle in other languages is consulted (in my case English, Dutch, Afrikaans and German), as well as literature in the fields of linguistic typology and grammaticalisation in order to glean any insights that may be relevant. This is a fairly hit or miss venture, but up till now this has turned out to be a very useful exercise. In this way I could, for example, confirm the empirical status that the word class “focus particle” enjoys across languages. Lexical items such as “also, too, even, only” belong to this word class.14
5. The above-mentioned semantic value(s) are then reconsidered and major slots are opened in Vocabula for the most frequently occurring semantic values.
6. If possible, the semantic scope of the particle (which often correlates with the syntactic scope, e.g. at phrase, clause, sentence or text level, i.e. more than one sentence) of all instances in the sample text is determined. Sub-category slots are also opened for this purpose in Vocabula.
7. A distinction is made between nominal and verbal clauses, and sub-category slots are opened for these purposes.
8. Distinctions may be made on the basis of the predicate of the clause the particle is used in, e.g. in a verbal clause whether the predicate is a wayyiqtol, weqatal, yiqtol, weyiqtol, imperative, etc. For this purpose, ad hoc searches are run by means of the SESB.
9. The position of the particle in the phrase or clause is recorded, and its relevance reconsidered. Where relevant, sub-categories are opened in Vocabula.
10. Lexical items (typically other particles) that may govern constructions with the particle are recorded, and sub-categories opened. The same applies if a significant number of instances are identified where the lexeme under investigation governs other particles.
11. In each of the above-mentioned runs through the sample data, possible semantic values of the lexeme under investigation that either confirm or modify that of the one postulated initially are reconsidered. Any significant new nuance or sense is considered as a shift that could be motivated as a possible radial extension of meaning. Such shifts include: spatial > temporal; concrete > abstract; specific > generic. In some cases specialised fixed expressions may be identifiable, which may even be genre or speech-type specific.
12. Major semantic category slots are next confirmed for the sample corpus with sub-category slots for relevant constructions that may be associated with a particular category.
13. Next the entire corpus is investigated and the categories that have been distinguished are modified. Unproductive sub-categories could be collapsed at this stage.
14. The statistical profile of each major semantic category, as well as those of each syntactic category that may be associated with it, is established and compared with that of near-synonyms.
15. Qualitative conclusions are finally drawn from the quantitative data.

14 Cf. Van der Merwe (2009b, 313–32).
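The bookkeeping described in the steps above (opening category and sub-category slots, then establishing a statistical profile) can be pictured with a short sketch. It is only an illustration in Python and does not reproduce Vocabula or SESB: the record layout, the names `Instance` and `profile`, and the pattern labels are assumptions, and the five tagged verses are simply examples quoted in section 5.1, not the corpus.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One hand-tagged occurrence of the particle (hypothetical record layout)."""
    reference: str   # verse reference
    category: str    # major semantic category slot
    pattern: str     # syntactic sub-category slot


# Toy sample only: five examples quoted in section 5.1, not the full corpus.
SAMPLE = [
    Instance("Eccl 2:9", "noteworthy addition", "particle + sentence"),
    Instance("Ps 135:17", "noteworthy addition", "particle + sentence"),
    Instance("1 Sam 23:3", "noteworthy addition", "particle + ki + sentence"),
    Instance("Ps 18:49", "affirmation", "particle + sentence"),
    Instance("Lev 26:39", "addition", "particle + constituent"),
]


def profile(instances):
    """Return {category: (count, share of all instances, Counter of patterns)}."""
    total = len(instances)
    by_category = Counter(i.category for i in instances)
    result = {}
    for category, count in by_category.items():
        patterns = Counter(i.pattern for i in instances if i.category == category)
        result[category] = (count, count / total, patterns)
    return result


if __name__ == "__main__":
    for category, (count, share, patterns) in profile(SAMPLE).items():
        print(f"{category}: {count} ({share:.0%})")
        for pattern, n in patterns.most_common():
            print(f"    {pattern}: {n}")
```

Collapsing an unproductive sub-category would, in this picture, amount to relabelling the `pattern` field of the affected records before the profile is recomputed.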

To give some substance to these procedures, we will now consider the most significant findings of this pilot study.

5. EXAMPLE

At the outset it has to be pointed out that this study benefited from a number of earlier studies, e.g. an exhaustive study of the 1060 instances of הִנֵּה, 769 instances of גַּם, as well as that of 491 instances of עוֹד.15 אַף occurs 134 times in the Hebrew Bible. It occurs mainly in poetic material, but the combination אַף כִּי is evenly distributed in prose and poetry. This uneven distribution of אַף is in stark contrast to that of גַּם, which is relatively evenly distributed throughout the Hebrew Bible.

5.1 Categories of use

1. The most prototypical use of אַף is to display a noteworthy addition (more than 45% of the instances in the corpus). The following syntactic patterns are attested:

1.1 אַף + sentence(s) (31/134, about 23%)

In these instances אַף signals that the information referred to in a sentence (#6–7)16 or sentences (#8)17 y needs to be considered, in addition to information referred to in an immediately preceding sentence (or sentences) x, as far as an explicit or implicit notion z is concerned. The information to be added after אַף typically represents the most conclusive or compelling information related to what is asserted in the context. In other words, it must nearly always be translated as “moreover”, “what’s more” or “what’s worse”.

15 Van der Merwe (2007; 2009a; 2009b); Van der Merwe and Naudé (forthcoming). The rest of section 5.1 is a modified extract from Van der Merwe (2009b).
16 Lev 26:40, 44; Num 16:14; Deut 33:28; Isa 26:8, 9; 41:23; 45:21; Pss 16:9; 68:17; 135:17; Job 6:27; 15:4; 36:29; Prov 23:28. Also Ps 119:3; Job 36:16; Esth 5:12.
17 Ezek 23:40; 34:12. Also Job 19:4–5; Song 1:16b–17.


‫הוֹס ְפ ִתּי ִמ ֛כֹּל‬ ַ֔ ְ‫וְ גָ ַ ֣ד ְל ִתּי ו‬ ‫ירוּשׁ ָל֑םִ ַ ֥אף‬ ָ ‫ֶשׁ ָה ָי֥ה ְל ָפ ַנ֖י ִבּ‬ ‫ָח ְכ ָמ ִ ֖תי ָ ֥ע ְמ ָדה ִ ֽלּי׃‬

6

7

‫ָאזְ ַ ֣נ יִם ָ ֭ל ֶהם וְ ֣ל ֹא יַ ֲא ִז֑ינוּ‬ ‫יהם׃‬ ֽ ֶ ‫וּח ְבּ ִפ‬ ַ ‫שׁ־ר‬ ֥ ֶ‫ַ֝֗אף ֵאין־י‬

8

‫יְמין ָצ ָ ֑ריו ִ֝ה ְשׂ ַ֗מ ְח ָתּ‬ ֣ ִ ‫ימוֹת‬ ָ ‫֭ ֲה ִר‬ ‫ָכּל־אוֹיְ ָ ֽביו׃‬ ‫ף־תּ ִשׁיב ֣צוּר ַח ְר ֑בּוֹ וְ ֥ל ֹא‬ ֭ ָ ‫ַא‬ ‫ֲ֝ה ֵקימ ֹ֗תוֹ ַבּ ִמּ ְל ָח ָ ֽמה׃‬

So I became great and surpassed all who were before me in Jerusalem; what’s more, my wisdom remained with me (Eccl 2:9).18

They do have ears, but they do not hear, what’s worse, there is no breath in them (Ps 135:17).

You have raised the right hand of his foe, you have made all his enemies rejoice. What’s worse, you turned back the flint>edge of his sword and not supported him in the battle (Ps 89:43–44).19

The two cases where it may be hard to argue that אַף points unambiguously to the most conclusive part of a two-pronged (#9) or a multi-pronged (#10) assertion may therefore be regarded as atypical uses of אַף.

9

‫ינהּ‬ ֑ ָ ֵ‫ָט ְב ָ ֣חה ִ ֭ט ְב ָחהּ ָמ ְס ָכ֣ה י‬ ‫ַ֝֗אף ָ ֽע ְר ָ ֥כה ֻשׁ ְל ָח ָנֽהּ׃‬

She has slaughtered her animals, she has mixed her wine, she has also set her table (Prov 9:2).

10

‫ף־שׁ ַ ֣מיִ ם‬ ָ ‫ֶ ֤א ֶרץ ָר ֨ ָע ָשׁה׀ ַא‬ ‫פוּ ִמ ְפּ ֵנ֪י ֱא ֹ֫ל ִ ֥הים‬ ֮ ‫נָ ְט‬

The earth quaked and the heavens poured down rain at the presence of God (Ps 68:7).

18 Lev 26:40, 44; Num 16:14; Deut 33:28; Isa 26:8, 9; Ps 16:9; 68:17; Job 15:4; 36:29; Prov 23:28.
19 Isa 41:23; Job 19:4; 34:12; Isa 44:16, 19; Pss 93:1–2; 96:10; Isa 43:19; 1 Chr 16:30.

1.2 אַף כִּי + sentence (with an ellipsed predicate or a main clause) (23/134, about 17%)

Although אַף כִּי + sentence (with an ellipsed main clause or predicate) has developed the character of a fixed expression, it does share two features with the use of אַף in the instances above; specifically, it signals a connection between two pieces of information, and the second piece of information is the more compelling or conclusive of the two, hence the translation values “how much more” (#11) and “how much less” (#12). For this reason, the fixed expression can be semantically classified under the heading “noteworthy addition”.

11

‫אמ ֜רוּ ַאנְ ֵ ֤שׁי ָדוִ ֙ד ֵא ֔ ָליו‬ ְ ֹ ‫וַ ֨יּ‬ ‫יהוּדה‬ ֖ ָ ‫ִה ֨ ֵנּה ֲא ַנ ְ֥חנוּ ֛ ֹפה ִ ֽבּ‬ ‫יְ ֵר ִ ֑אים וְ ַא ֙ף ִ ֽכּי־נֵ ֵלְ֣ך ְק ִע ֔ ָלה‬ ‫ל־מ ַע ְר ֖כוֹת ְפּ ִל ְשׁ ִ ֽתּים׃‬ ַ ֽ ‫ֶא‬

But David’s men said to him: Behold, we are afraid here in Judah; how much more then if we go to Keilah against the armies of the Philistines? (1 Sam 23:3).20

12

‫וּשׁ ֵ ֤מי ַה ָשּׁ ַ֙מיִ ֙ם‬ ְ ‫֠ ִהנֵּ ה ַה ָשּׁ ַ֜מיִ ם‬ ‫י־ה ַ ֥בּיִ ת‬ ַ ‫֣ל ֹא יְ ַכ ְל ְכּ ֔לוָּך ַ֕אף ִ ֽכּ‬ ‫יתי׃‬ ִ ‫ַה ֶזּ֖ה ֲא ֶ ֥שׁר ָבּ ִ ֽנ‬

Look, the highest heavens cannot contain you, how much less this house that I have built? (1 Kgs 8:27).21

1.3 ‫ ַאף‬+ constituent (rare 5/134, about 4%)

‫שׁר יְ ָע ָ ֑צנִ י‬ ֣ ֶ ‫ת־י֭הוָ ה ֲא‬ ְ ‫ֲא ָב ֵ ֗רְך ֶא‬ ‫יוֹתי‬ ֽ ָ ‫ף־ל ֗ילוֹת יִ ְסּ ֥רוּנִ י ִכ ְל‬ ֵ֝ ‫ַא‬

13

I bless Yahweh who gives me counsel, even during the nights my kidneys instruct me (Ps 16:7).22

20 Deut 31:27; 1 Sam 14:30; 21:6; 2 Sam 4:11; 16:11; 2 Kgs 5:13; Ezek 14:21; Prov 11:31; 15:11; 19:7; 21:27. In Job 4:19 אַף expresses the same meaning as אַף כִּי.
21 Job 9:14; 15:16; 25:6; 35:14; Prov 17:7; 19:10; 2 Chr 6:18; 32:15.
22 Also Isa 46:7 and Neh 9:18. אַף precedes a temporal adjunct. Cf. also Ps 68:19; Job 14:3.

1.4 אַף + member of coordinated phrase (rare 1/134)

14

‫ֹלהי‬ ַ֗ ‫ת־י֣ד ֱא‬ ַ ‫וָ ַא ֨ ִגּיד ָל ֶ֜הם ֶא‬ ‫טוֹבה ָע ֔ ַלי‬ ֣ ָ ‫יא‬ ֙ ‫ר־ה‬ ִ ‫ֲא ֶשׁ‬ ‫שׁר‬ ֣ ֶ ‫ף־דּ ְב ֵ ֥רי ַה ֶ ֖מּ ֶלְך ֲא‬ ִ ‫וְ ַא‬ ‫ר־לי‬ ֑ ִ ‫ָ ֽא ַמ‬

I told them that the hand of my God was good on me and also the words that the king had spoken to me (Neh 2:18).23

A feature of the semantic category “noteworthy addition” is that אַף typically governs a sentence, and in a few cases more than one sentence. In the former case it could therefore be labelled as a conjunctive adverb and in the latter case a macro-syntactic connective.

2. Affirmation (27/134, about 20%)

A second semantic category could be labelled as “affirmation”. The following syntactic patterns are attested:

2.1 אַף + sentence(s) (20/134, about 15%)

אַף signals that the information referred to in a sentence (or sentences) y affirms the information referred to in an immediately preceding sentence (or sentences).24 In other words, what must be added to strengthen an argument is not a second new proposition, but the specification of one that is discourse-active.

15

‫ן־ק ַמי‬ ֭ ָ ‫יְבי ַ ֣אף ִמ‬ ָ֥ ‫א‬ ֹ ֫ ‫… ְמ ַפ ְלּ ִ֗טי ֵמ‬who delivered me from my ‫ילנִ י׃‬ ֽ ֵ ‫רוֹמ ֵ ֑מנִ י ֵמ ִ ֥אישׁ ָ֝ח ָ֗מס ַתּ ִצּ‬ ְ ‫ ְתּ‬enemies; indeed, you exalted me above my adversaries; you delivered me from the violent (Ps 18:49). 25

The border between “noteworthy addition” and “affirmation” is not always clear-cut. In #15 it is fairly clear that the sentences governed by אַף make more specific “delivers me from my enemies”, and in this way affirm what is said in the previous utterance.

23 It can be argued that the information signalled to be added by means of אַף has one or another type of relevance in the context. In other words, the permission of the king (Neh 2:8) was the most important part of what Nehemiah told his audience that inspires them to act (Neh 2:18b).
24 In two cases noun phrases and not sentences are involved, Isa 35:2 and Prov 22:19.
25 Deut 33:3; Isa 26:11; 42:13; 43:7, 19; 46:6; 48:13, 15; Pss 16:6; 77:17, 18; 93:1; 96:10; 1 Chr 16:30. Also Isa 33:2; Job 6:27.

‫רוֹעי‬ ֥ ִ ְ‫ ָי ִ֭די ִתּ ֣כּוֹן ִע ֑מּוֹ ַאף־ז‬My hand shall support>remain with ‫ ְת ַא ְמּ ֶ ֽצנּוּ׃‬him, yes, my arm will strengthen

16

him (Ps 89:22).

In the case of #16, the question may be asked: is Ps 89:22b additional information (“what’s more”), or does it merely make more specific and affirm what is said in 22a, in other words, “yes, indeed”? Consider, for example, how the following versions differ in their translation of אַף: NRSV “my hand shall always remain with him, my arm also shall strengthen him”; NIV “My hand will sustain him, surely my arm will strengthen him”; NJPS “My hand shall be constantly with him, and my arm shall strengthen him.” Despite the fuzzy border between the semantic categories “noteworthy addition” and “affirmation”, what is relevant as far as this investigation is concerned is the following: firstly, in comparison with גַּם, אַף occurs significantly more frequently (20/134 versus 13/769) in contexts where it can have an affirmative interpretation; and secondly, the shift in meaning from “noteworthy addition” to “affirmation” is not hard to “motivate”.26 Muraoka’s (1985, 143) claim that אַף does not have an affirmative nuance is therefore not borne out by our corpus linguistic analysis.

The above-mentioned shift in meaning may also explain another, secondary shift in meaning, viz. those instances where אַף does not connect two pieces of information, but is used in a question as a sentence adverbial.

2.2 Yes/no question word אַף + sentence (rare) (7/134, about 5%)

26 For the concept “motivation” in this context, cf. Lakoff (1987, 91).

In each instance the epistemic modality, i.e. the factuality (“is indeed”?) of an event (#17) or state of affairs (#18), is questioned.27 17

‫אמר‬ ֑ ַ ֹ ‫ וַ יִּ ַגּ֥שׁ ַא ְב ָר ָ ֖הם וַ יּ‬Then Abraham came closer and said, ‫“ ַה ַ ֣אף ִתּ ְס ֔ ֶפּה ַצ ִ ֖דּיק‬Will you even/indeed sweep away the ‫ם־ר ָ ֽשׁע׃‬ ָ ‫ ִע‬righteous with the wicked?” (Gen 18:23).28

18

‫יאים‬ ִ֔ ‫יכ ֙ם ִלנְ ִב‬ ֶ ֵ‫וָ ָא ִ ֤ קים ִמ ְבּנ‬ ‫חוּר ֶיכ֖ם ִלנְ זִ ִ ֑רים‬ ֵ ‫וּמ ַבּ‬ ִ ‫ין־ז ֹאת ְבּ ֵנ֥י יִ ְשׂ ָר ֵ ֖אל‬ ֛ ‫ַה ַ ֥אף ֵ ֽא‬ ‫הוֽה׃‬ ָ ְ‫נְ ֻאם־י‬

“I have raised some of your sons to be prophets and some of your young men to be Nazirites. Is it not indeed so, O people of Israel?” says the Lord (Amos 2:11).

Apart from operating as a sentence adverbial, the syntactic features of this semantic category are similar to those of “noteworthy addition”. אַף is predominantly a conjunctive adverb and sometimes a macro-syntactic connective.

3. Addition (about 14%)

Another less frequent, but not rare, use of אַף is one that corresponds with the most prototypical use of גַּם. The translation values in most cases (16/21) are “also, too” (#19). The following syntactic constructions are attested.

3.1 אַף + constituent (16/134, about 12%)

19

‫קּוּ‬ ֙ ‫יִמּ‬ ַ֙ ‫וְ ַהנִּ ְשׁ ָא ִ ֣רים ָבּ ֶ֗כם‬ ‫ֽ ַבּ ֲעו ָֹ֔נם ְבּ ַא ְר ֖צ ֹת אֹיְ ֵב ֶיכ֑ם‬ ‫וְ ַ ֛אף ַבּ ֲעו ֹ ֹ֥נת ֲאב ָ ֹ֖תם ִא ָ ֥תּם‬ ‫יִמּקּוּ׃‬ ָֽ

And those who remain among you will rot away in the land of your enemy because of your iniquity, and also because of the iniquities of their ancestors with them they will rot away (Lev 26:39). 29

27 28 Gen 3:1; 18:13, 24; Job 34:17; 40:8. In Gen 18:13 the question word is followed by a sentence adverbial. In Gen 3:1 the question word is lacking, and a content clause introduced by כִּי is governed by אַף.
29 Gen 40:16; Lev 26:42 (2x); Deut 15:17; 2:11, 20; Judg 5:29; 2 Sam 20:14; Job 32:10, 17 (2x); 37:1; Neh 13:15; 1 Chr 8:32; 9:38.

3.2 אַף + noun phrase in a coordinated phrase (2/134, about 2%)

20

‫וֹע‬ ַ ‫ ְכּ ָל ִ ֣ביא ָשׁ ֵ֔כן וְ ָט ַ ֥רף זְ ֖ר‬He lives like a lion and tears apart ‫ף־ק ְד ֽקֹד׃‬ ָ ‫ ַא‬arm and/as well as scalp (Deut 33:20).30

21

‫הו֑ה‬ ָ ְ‫ וְ י֘ ֤וֹדוּ ָשׁ ַ ֣מיִם ִפּ ְל ֲאָך֣ י‬Let the heavens praise your wonders, ‫ף־א ֽמוּנָ ְת ָ֗ך ִבּ ְק ַ ֥הל‬ ֱ֝ ‫ ַא‬O LORD, and/as well as your ‫ ְקד ִ ֹֽשׁים׃‬faithfulness in the assembly of the holy ones (Ps 89:5).

It is not entirely clear whether אַף functions here as a near-synonym of וְ31 or whether it does something more, a nuance that can be expressed by “as well as”.

4. A corresponding reaction of an actor x to an action by an actor y (6/134, i.e. about 5%)

All instances are of the pattern אַף־אֲנִי. אַף points out, just like the focus particle גַּם, the role of people or God in a corresponding reaction (#22). In other words, it cannot be said that “also x” does y, but what x does is a corresponding reaction to what z did. Translation values of “x also” or “x in turn” can be used to express this use of אַף in English.

22

‫ף־א ִנ֛י ִע ָמּ ֶכ֖ם‬ ֲ ‫וְ ָה ַל ְכ ִ ֧תּי ַא‬ ‫יתי ֶא ְת ֶכ ֙ם‬ ֤ ִ ‫ְבּ ֶ ֑ ק ִרי וְ ִה ֵכּ‬ ‫ם־אנִ י ֶ ֖שׁ ַבע‬ ָ֔ ַ‫גּ‬ ‫יכם‬ ֽ ֶ ‫את‬ ֵ ֹ ‫ל־חטּ‬ ַ ‫ַע‬

I will also go against you [lit. walk with you in hostility], and I will also strike you sevenfold for your sins (Lev 26:24).32

30 Also Ps 89:6.
31 Van der Merwe (forthcoming b) establishes that in a similar syntactic construction, i.e. גַּם without וְ in a coordinated phrase, גַּם is used as a near-synonym of וְ.
32 Lev 26:16, 28, 41; Ps 89:28; 2 Chr 12:5.

5. Noteworthy in-/exclusion (13/134, i.e. about 10%)

Another category that could be listed separately on mainly syntactic grounds is that of אַף + clause // אַף + clause (// אַף + clause) (#23).33

23

‫אשׁ וְ נֵ ָ ֔ד ָעה‬ ֙ ֹ ‫י־ה ִגּ֤יד ֵמר‬ ִ ‫ִ ֽמ‬ ‫אמר‬ ֣ ַ ֹ ‫וּמ ְלּ ָפ ִנ֖ים וְ נ‬ ִ ‫ין־מ ֗ ִגּיד‬ ַ ‫ַצ ִ ֑דּיק ַ ֣אף ֵא‬ ‫יע‬ ַ ‫ַ ֚אף ֵ ֣אין ַמ ְשׁ ִ֔מ‬ ‫יכם׃‬ ֽ ֶ ‫ַ ֥אף ֵאין־שׁ ֵ ֹ֖מ ַע ִא ְמ ֵר‬

Who declared it from the beginning, so that we might know, and beforehand, so that we might say, “He is right”? There was neither who declared it, nor one who proclaimed, nor one who heard your words (Isa 41:26).

This use of אַף as a correlative conjunction34 is regarded as “noteworthy inclusion” since it signals that the information referred to in each sentence preceded by אַף carries equal weight as far as the substantiation of an explicit or implicit assertion is concerned.

33 Also Isa 40:24 (3x); 41:10 (2x); 44:15 (2x); 46:11 (3x). Despite the syntactic similarities, the uses of אַף in Isa 40:24 and 46:11 are not prototypical examples of this category in the same way as Isa 41:26. For example, in Isa 46:11 four sentences (which form two pairs) are involved, but the third one lacks אַף.
34 For the notion “correlative conjunction”, cf. Huddleston and Pullum (2002).

5.2 Qualitative inferences from quantitative data

From the above-mentioned description of אַף it is evident that it has the semantic potential to mark: 1. “noteworthy addition” (45%); 2. “affirmation” (20%); 3. “addition” (14%); 4. “corresponding reaction” (5%); and 5. “noteworthy inclusion” (10%). The most typical use is by far that of “noteworthy addition”. In those instances either אַף or אַף כִּי governs one or more sentences. The second most frequent use of אַף is to mark “affirmation”. This syntactic profile of אַף, i.e. governing sentences in more than 65% of its occurrences, differs significantly from that of גַּם. The latter, in contrast, governs a sentence constituent in more than 55% of its occurrences. A sentence or sentences are governed by גַּם in only 14% of the cases. From a semantic point of view, the data in our corpus attest that גַּם is typically a marker of addition, while אַף typically marks “noteworthy addition”. The finding that אַף is the more “marked form of addition” is contra Waltke and O’Connor (1990, 663), who claim that אַף is “simpler and closer to וְ” than גַּם.
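The shares quoted above can be recomputed from the counts given per syntactic pattern in section 5.1. The following Python sketch does only that arithmetic; the dictionary layout and the function name `shares` are assumptions made for illustration, and the recomputed figures may differ by a point from the rounded percentages in the text.

```python
# Counts per syntactic pattern as reported in section 5.1 (out of 134 instances).
TOTAL = 134
REPORTED = {
    "noteworthy addition": [31, 23, 5, 1],   # patterns 1.1, 1.2, 1.3, 1.4
    "affirmation": [20, 7],                  # patterns 2.1, 2.2
    "addition": [16, 2],                     # patterns 3.1, 3.2
    "corresponding reaction": [6],           # category 4
    "noteworthy in-/exclusion": [13],        # category 5
}


def shares(reported, total):
    """Map each category to (summed count, percentage of the whole corpus)."""
    return {name: (sum(counts), 100 * sum(counts) / total)
            for name, counts in reported.items()}


if __name__ == "__main__":
    table = shares(REPORTED, TOTAL)
    for name, (count, pct) in table.items():
        print(f"{name:26s} {count:3d}/{TOTAL}  {pct:4.1f}%")
    covered = sum(count for count, _ in table.values())
    # The listed counts need not add up to the corpus total; the conclusions
    # mention instances that the model does not account for.
    print(f"listed: {covered}/{TOTAL}")
```

Keeping the pattern counts separate from the category totals makes it easy to see which sub-patterns carry a category, for instance that patterns 1.1 and 1.2 together supply most of the “noteworthy addition” instances.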

6. CONCLUSIONS

Except for six instances, many of which could be accounted for in terms of text-critical considerations, our model allowed us to account for all instances in the Hebrew Bible. Profiling the use of אַף in terms of prototypical and less prototypical uses that could also be associated with specific syntactic constructions not only helps one to understand the particle better, but also allows for sophisticated comparisons with its near-synonyms. Shifts in meaning could be motivated in terms of principles acknowledged in cognitive linguistics.

If we consider the perspective of corpus linguistics that Gries (2006) provides, as summarised in §3, this investigation can be classified as a fairly typical example. All instances of the lexeme are investigated and an attempt, informed by a cognitive linguistic theory of language, is made to do more “than just accounting for categorial either-or phenomena, but uses statistical data … to cover the middle ground between what is possible/grammatical and what is not.” Qualitative inferences are made on the basis of quantitative data (i.e. the frequency of constructions).

As a corpus linguistic study this investigation was unsophisticated in two respects: firstly, in the use of statistical methods; and, secondly, in the electronic processing of multifactorial considerations, such as determining the relevance of and/or interplay between such factors as clause type and word order patterns. In this regard, I believe that the existing instruments of the grammarians of the Hebrew Bible could benefit from developments in the corpus linguistic analyses of other languages.
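As an illustration of the kind of statistical method alluded to here, the frequency comparison reported in section 5.1 (20/134 affirmative contexts for one particle versus 13/769 for the other) could be checked with a standard pooled two-proportion z-test. The sketch below is not part of the study; the function name is an assumption, and because one of the expected counts is small, an exact test (e.g. Fisher's) would arguably be the safer choice in practice.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Affirmative readings: 20 of 134 instances versus 13 of 769 instances.
    z, p = two_proportion_z(20, 134, 13, 769)
    print(f"z = {z:.2f}, p = {p:.2g}")
```

A test of this kind only addresses whether the two rates differ; it says nothing about the interplay of factors such as clause type and word order, for which multifactorial methods would be needed.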

BIBLIOGRAPHY

Brown, F., S. R. Driver, and Ch. A. Briggs. A Hebrew and English Lexicon of the Old Testament. Oxford: Clarendon Press, 1907.
Clines, D. J. A., ed. The Dictionary of Classical Hebrew (DCH). Sheffield: Sheffield Academic Press, 1993.
Croft, W. and D. A. Cruse. Cognitive Linguistics. Cambridge: Cambridge University Press, 2004.
De Blois, R. “A Semantic Dictionary of Biblical Hebrew.” In Current Trends in Scripture Translation, edited by P. A. Noss, 275–95. Reading: United Bible Society, 2002.
Follingstad, C. M. Deictic Viewpoint in Biblical Hebrew Text: A Syntagmatic and Paradigmatic Analysis of the Particle כי (Kî). Dallas: SIL International, 2001.
Geeraerts, D., ed. Cognitive Linguistics: Basic Readings. Berlin: Mouton de Gruyter, 2006.
Gesenius, W. and E. Kautzsch. Gesenius’s Hebrew Grammar. Translated by A. E. Cowley. Oxford: Clarendon Press, 1910.
Gesenius, W., U. Rüterswörden, R. Meyer, and H. Donner. Hebräisches und Aramäisches Handwörterbuch über das Alte Testament. 18th ed. (HAH). Berlin: Springer-Verlag, 1987.
Gries, S. T., and A. Stefanowitsch, eds. Corpora in Cognitive Linguistics: Corpus-Based Approaches to Syntax and Lexis. Berlin: Mouton de Gruyter, 2006.
Gross, W. Die Satzteilfolge im Verbalsatz alttestamentlicher Prosa: Untersucht an den Büchern Dtn, Ri und 2Kön unter Mitarbeit von Andreas Diße und Andreas Michel. Forschungen zum Alten Testament 17. Tübingen: Mohr [Siebeck], 1996.
Hoftijzer, J. A Search for Method: A Study in the Syntactic Use of the H-Locale in Classical Hebrew. Studies in Semitic Languages and Linguistics 12. Leiden: Brill, 1981.
Hoftijzer, J. The Function and Use of the Imperfect Forms with Nun Paragogicum in Classical Hebrew. Studia Semitica Neerlandica 21. Assen: Van Gorcum, 1985.
Imbayarwo, T. “A Biblical Hebrew Lexicon for Translators Based on Recent Developments in Theoretical Lexicography.” D Litt diss., University of Stellenbosch, 2008.
Joüon, P. and T. Muraoka. A Grammar of Biblical Hebrew. Rome: Editrice Pontificio Istituto Biblico, 1991.
Koehler, L. and W. Baumgartner. The Hebrew and Aramaic Lexicon of the Old Testament: Subsequently Revised by Walter Baumgartner and Johann Jakob Stamm. Grand Rapids: Eerdmans, 1999.
Knauf, E. A. “War ‘Biblisch-Hebräisch’ eine Sprache? Empirische Gesichtspunkte zur linguistischen Annäherung an die Sprache der althebräischen Literatur.” ZAH 3, No. 1 (1990) 11–23.
Lakoff, G. Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press, 1987.
Langacker, R. W. Cognitive Grammar: A Basic Introduction. Oxford: Oxford University Press, 2008.
Lübbe, J. “The Use of Syntactic Data in Dictionaries of Classical Hebrew.” Journal for Semitics 5, No. 1 (1993) 89–96.
Muraoka, T. Emphatic Words and Structures in Biblical Hebrew. Jerusalem and Leiden: Magnes Press and E. J. Brill, 1985.
Rechenmacher, H. and Ch. H. J. van der Merwe. “The Contribution of Wolfgang Richter to Current Developments in the Study of Biblical Hebrew.” Journal of Semitic Studies 50, No. 1 (2003) 226–42.
Talstra, E. “Computer-Assisted Linguistic Analysis. The Hebrew Database Used in Quest.2.” In Cook, J., ed. Bible and Computer, 3–22. Leiden/Boston: Brill, 2002.
Van der Merwe, Ch. H. J. “Some Recent Trends in Biblical Hebrew Linguistics: A Few Pointers Towards a More Comprehensive Model of Language Use.” Hebrew Studies 44 (2003) 7–24.
____. “Towards a Principled Working Model for Biblical Hebrew Lexicology.” JNSL 30, No. 1 (2004) 119–37.
____. “Biblical Hebrew Lexicology: A Cognitive Linguistic Perspective.” KUSATU 6 (2006a) 87–112.
____. “Biblical Exegesis, Cognitive Linguistics and Hypertext.” In Lemaire, A., ed. 18th IOST. Congress Volume Leiden 2004, 255–80. Leiden: Brill, 2006b.
____. “Lexical Meaning in Biblical Hebrew and Cognitive Semantics: A Case Study.” Biblica 87, No. 1 (2006c) 85–95.
____. “A Cognitive Linguistic Perspective on הִנֵּה in the Pentateuch, Joshua, Judges and Ruth.” Hebrew Studies 47 (2007) 238–77.
____. “The Hebrew Particle אַף.” Vetus Testamentum (2009a) 266–83.
____. “Another Look at the Focus Particle גַּם.” Journal of Semitic Studies LIV, No. 2 (2009b) 313–32.
Van der Merwe, Ch. H. J., Jackie Naudé, and J. Kroeze. A Biblical Hebrew Reference Grammar. Sheffield: Sheffield Academic Press, 1999.
Van der Merwe, Ch. H. J., and J. A. Naudé. A Biblical Hebrew Reference Grammar. A revised and expanded version (forthcoming).
Waltke, B. K., and M. O’Connor. An Introduction to Biblical Hebrew Syntax. Winona Lake: Eisenbrauns, 1990.

THE SERIES OF WOES IN PRE-EXILIC PROPHECY: A COMPUTER-ASSISTED STUDY ON SYNTAX AND SEMANTICS1

JAVIER DEL BARCO
CENTRO DE CIENCIAS HUMANAS Y SOCIALES, CSIC, MADRID, SPAIN

GUADALUPE SEIJAS DE LOS RÍOS-ZARZOSA
UNIVERSIDAD COMPLUTENSE MADRID, SPAIN

1 This article is a result of the research project HUM2005-05747, entitled Análisis unificado de textos hebreos con ordenador (AUTHOR), and sponsored by the Spanish Ministry of Science and Innovation.

I. INTRODUCTION

Textual segments introduced by הוי are commonly referred to as woe oracles or “Alas” oracles. This kind of text has not been studied anew since the 1970’s, and a re-evaluation of these texts now seems necessary in terms of the analysis of their syntactic structure and their interaction with their wider narrative context. We have limited our study to the pre-exilic prophecy as a first step towards a more complete study of the woe oracles in the entire biblical prophetic poetry. Post-exilic prophecy will be treated, therefore, in a later stage of our research. Furthermore, we have restricted our research to more specific grounds. One of the most interesting features of these texts is that several הוי sentences are clustered in one textual segment. These series of הוי sentences are
presented in textual contiguity and are semantically related to one another, being part of the same woe oracle. Therefore, they have been considered as a textual unit in terms of their function within the context. As a result, we have decided to focus only on these series of woe oracles, and to put aside for a later stage of our research the cases in which the הוי particle appears in isolation. The texts that we have analysed (and on which we will base our conclusions) are: Amos 5:7–20, Amos 6:1–14, Isa 5:8–24 and 10:1–4, and Hab 2:6–20.

Before proceeding, we need to clarify our methodology. Ours is a linguistic analysis, focused mainly on the study of syntactic form and function. Such a linguistic analysis of biblical texts has gained much in the last twenty years from the creation of computer-assisted databases containing grammatical information of the biblical texts. Nowadays we have access to a wide variety of biblical software in which morphological information is consistently encoded. Attempts have been made to design computerized databases which include syntactic information at the sentence level, and suitable programs are now able to analyse such databases in order to define formally the syntactic relations between sentences within a broad segment of text.2 More modestly, we have also created in Madrid a linguistic database of prophetic poetry, primarily concerned with morphological and syntactic levels.3 Related software, which allows flexible and complex searches, has also been designed.4 We have been using and improving these databases for the last ten years, and some results of our computer-assisted analysis of prophetic poetry have been presented in previous AIBI meetings and elsewhere. However, we are facing two different problems which need our critical evaluation. On the one hand, despite the fact that we have built linguistic databases of Proto-Isaiah, Ezekiel and the pre-exilic Minor Prophets, further work is required on other books, especially the post-exilic prophets, so that our research can move into the domain of diachronic syntax. On the other hand, progress in computer technology should lead us to re-evaluate our programs and databases in order to benefit from the current contributions in the field.

2 For example, the linguistic databases and related software designed by Dean Forbes in California and Eep Talstra in Amsterdam.
3 For a detailed description of the computerized databases, see Seijas 1992, 52–68; Herranz 1993, 12–33; Vegas–Seijas 1995, 341–355; Del Barco 2001, 55–73; and Del Barco–Seijas 2006, 115.
4 All particles which initiate a clause are encoded, as well as the number of clauses under the influence of such a particle and the syntactic relationship between these clauses and the text which surrounds them. Therefore, using our software for the purpose of this contribution, we have been able to locate all the cases of the particle הוי and to delimit the segments of text which are under its influence.

As far as the linguistic theoretical frame is concerned, we are committed to a grammatical text-linguistic analysis of biblical prophecy. Such an analysis takes into consideration clause structure and the use of macro-syntactic particles, and pays particular attention to the use of verbal forms within a context broader than the sentence. In addition to our main focus on this syntactic approach, the semantic analysis does not escape our attention, since we seek to shed new light on the relationship between syntax and semantics in prophetic poetry. We find this linguistic analysis essential to the identification of literary units and sub-units within a particular text; that is to say, we believe that a form-criticism approach to these biblical texts must be supplemented with a linguistic analysis of the different literary units. Indeed, we support the idea that the linguistic functions of a text are part of its literary form and, therefore, any literary analysis should not neglect the linguistic approach. Thus, interaction between linguistic and literary analysis of the biblical texts should bring new results and perspectives unavailable to strict linguistic-only or literary-only approaches.

After these methodological remarks, we can now go back to the specific subject of this paper, the analysis of the woe oracles.
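The clause-level encoding described in the methodological remarks (the particle that opens a clause, plus the number of clauses under its influence) can be pictured with a small sketch. It is a hypothetical illustration in Python, not the Madrid database or its software: the field names, the functions `segments` and `series`, the transliterated particle labels and the toy clause list are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Clause:
    """One clause record (hypothetical layout, not the project's actual schema)."""
    ref: str                          # placeholder clause identifier
    initial_particle: Optional[str]   # particle opening the clause, if any
    scope: int = 0                    # clauses governed, counting this one


def segments(clauses: List[Clause], particle: str) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, governed by `particle`."""
    found = []
    for i, clause in enumerate(clauses):
        if clause.initial_particle == particle:
            end = min(i + max(clause.scope, 1), len(clauses))
            found.append((i, end))
    return found


def series(spans: List[Tuple[int, int]]) -> List[List[Tuple[int, int]]]:
    """Group segments that follow one another without a gap into series."""
    groups: List[List[Tuple[int, int]]] = []
    for span in spans:
        if groups and groups[-1][-1][1] == span[0]:
            groups[-1].append(span)
        else:
            groups.append([span])
    return groups


# Invented toy data: seven clauses, three of them opened by the particle.
TOY = [
    Clause("c1", "hoy", 2), Clause("c2", None),
    Clause("c3", "hoy", 3), Clause("c4", "ki", 1), Clause("c5", None),
    Clause("c6", None),
    Clause("c7", "hoy", 1),
]

if __name__ == "__main__":
    spans = segments(TOY, "hoy")
    print(spans)
    print(series(spans))
```

In this picture a series of woes corresponds to a group containing more than one segment, while a group with a single segment corresponds to an isolated occurrence of the particle.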
The study of the woe oracles as a literary genre within the Prophetic literature did not start until the 1930s. Gunkel’s form-criticism approach to biblical texts, based on the identification of specific literary units and sub-units, led some scholars to pay attention to the woe oracles as a literary genre or sub-unit. One of the first tasks was to identify and isolate the woe oracles in the entire Hebrew Bible.
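The distribution figures reported in this paper (37 pre-exilic and 10 post-exilic cases, 47 in all) can be reproduced from the verse lists given in its notes 5 and 6. The following is only a minimal sketch of such a tally: the function name `count_cases` is hypothetical, the lists are lightly normalised (a "+" and a comma in the notes are written as ";"), and it is an assumption of this sketch that bracketed verses such as "[7]" are emended readings left out of the count and that "(2)" marks a verse containing two occurrences.

```python
import re

# Verse lists transcribed from notes 5 and 6 of the paper (separators normalised).
PRE_EXILIC = (
    "Amos 5:[7].18; 6:1.[13]; Mic 2:1; Nah 3:1; Zeph 2:5; 3:1; "
    "Hab 2:6.9.12.15.19; Isa 1:4.24; 5:8.11.18.20.21.22; 10:1; 10:5; "
    "17:12; 18:1; 28:1; 29:1.15; 30:1; 31:1; 33:1; "
    "Jer 22:13.18 (2); 23:1; 30:7; 47:6; 48:1; 50:27"
)
POST_EXILIC = (
    "Isa 45:9.10; 55:1; Ezek 13:3; 13:18; 34:2; "
    "Zech 2:10 (2); 2:11; 11:17"
)


def count_cases(refs: str) -> int:
    """Count occurrences in a list written as 'Book ch:v.v (n); ch:v; ...'."""
    total = 0
    for item in refs.split(";"):
        item = item.strip()
        # A trailing "(n)" is read as: the last verse holds n occurrences.
        multiplier = re.search(r"\((\d+)\)\s*$", item)
        if multiplier:
            item = item[:multiplier.start()].strip()
        verses = item.split(":", 1)[1].split(".")
        # Assumption: bracketed verses are emendations and are not counted.
        total += sum(1 for verse in verses if not verse.startswith("["))
        if multiplier:
            total += int(multiplier.group(1)) - 1
    return total


pre = count_cases(PRE_EXILIC)
post = count_cases(POST_EXILIC)
print(pre, post, pre + post)
```

Under these assumptions the tally agrees with the totals stated in the text; a different treatment of the emended verses would change the pre-exilic figure.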


J. DEL BARCO, G. SEIJAS DE LOS RÍOS-ZARZOSA

From a preliminary look at the distribution of the cases, one conclusion is obvious: the woe oracles are restricted to prophetic texts, with the sole exception of 1 Kgs 13:30. Moreover, most of the cases are found in pre-exilic prophecy: 37 out of 47 of the הוי sentences appear in the books of Amos, Proto-Isaiah, Micah, Habakkuk, Zephaniah, Nahum and Jeremiah.5 In post-exilic prophecy, the presence of utterances introduced by הוי is reduced to ten cases, in Isaiah, Ezekiel and Zechariah.6

The “Woe” genre was a subject of discussion in literary criticism from the 1930s until the beginning of the 1970s. Basic questions of concern have been the origin and Sitz im Leben (or social and historical context) of these oracles, and the analysis of the use of the introductory particles הוי and אוי. Mourning texts in the literatures of the Ancient Near East and Eastern Mediterranean have also been studied as parallel texts according to anthropological and ethnographic approaches in general, and according to a comparative linguistic approach in the case of the Semitic languages in particular.

The first basic question of concern has been the origin and Sitz im Leben of the woe oracles. Three main theories were formulated on this issue.

The first derived the origin of the woe oracles from curse discourse. Mowinckel (1924) and later on Westermann (1967) underlined the connections existing between curse discourse and the woe oracles, and suggested that the prophetic woe cries were transformations of cultic curses. However, Gerstenberger (1962) challenged this idea by showing the differences between the behaviour of curses and that of the woe oracles. Current scholarship has subsequently accepted the independence of this literary genre from curse discourse.

The second theory derived the origin of the woe oracles from the popular ethos.
Gerstenberger detected a close affinity between

5 Amos 5:[7].18; 6:1.[13]; Mic 2:1; Nah 3:1; Zeph 2:5; 3:1; Hab 2:6.9.12.15.19; Isa 1:4.24; 5:8.11.18.20.21.22 + 10:1; 10:5; 17:12; 18:1; 28:1; 29:1.15; 30:1; 31:1; 33:1; Jer 22:13.18 (2); 23:1; 30:7; 47:6; 48:1; 50:27.
6 Isa 45:9.10; 55:1, Ezek 13:3; 13:18; 34:2; Zech 2:10 (2); 2:11; 11:17.


wisdom (especially popular wisdom) and the form and contents of the woe oracles. According to this theory, the prophets employed the popular ethos by applying it to new situations. As a result, the oracles introduced by הוי became sentences of a pedagogical nature, intended to transmit popular wisdom to young people and to preserve the existing conditions of society. The fact that many woe oracles deal with topics such as social injustice and oppression, both basic themes found also in wisdom literature, further supports this theory.

The third theory derived the origin of the woe oracles from the form of the funerary lament. This theory suggested that lamentation for the death of a beloved person was the context for the origin of the woe cries. Subsequently, the prophets borrowed the woe genre from the realm of mourning, where it was meant to express the shift from life to death (Wanke 1966, Clifford 1966). As has been argued, the use of הוי can reflect the presence of emotions and feelings that cannot find suitable expression in a rational way. In this sense, the prophets adapted the woe genre freely and amplified the oracles by adding other elements related to prophetic speech. In other words, and according to this theory, they adapted an old form of expression and gave it new meanings and connotations. This theory is, in our opinion, the most convincing, and has been acknowledged as such by many scholars.

The second basic question of concern has been the analysis of the use of the introductory particles הוי and אוי. Scholars have discussed whether they are different particles or whether, on the contrary, they should be considered mere orthographic or dialectal variants of the same word. The fact that they have a similar pronunciation seems to support the latter idea.
Wanke (1966) analysed the differences between utterances introduced by אוי and those starting with הוי, and noted three basic differences between them:

• A difference in the grammatical structure: Whereas אוי is followed by the preposition lamed + a pronominal suffix or equivalent in 20 instances of a total of 23 cases, הוי is usually followed by an active participle (or other nominal element) without preposition. Only in four cases is הוי


followed by a preposition: ’el (Jer 48:1); ‘al (Jer 50:27; Ezek 13:3) and lamed (Ezek 13:18),7 and all come from exilic or post-exilic texts.

• A difference in the context: While הוי oracles appear only in the prophetic literature (with the sole exception of 1 Kgs 13:30), texts introduced by אוי are much more widely scattered across the books of the Hebrew Bible.

• A difference in the meaning: Wanke finds a clear semantic distinction between אוי and הוי utterances. In his opinion, utterances with אוי express cries of anxiety, as in “Woe unto me” (Isa 6:5) or “Woe unto us” (Jer 4:13), whereas utterances with הוי are cries of funerary lament, as in “Alas my sister” (Jer 22:18).

Following Wanke’s conclusions, we have treated אוי and הוי utterances separately, analysing only woe oracles introduced by the particle הוי.
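Wanke's grammatical criterion lends itself to a simple check on tagged data. The sketch below is illustrative only: the `Utterance` record, the ASCII labels "oy" and "hoy" for the two particles, and the function name are assumptions of this example, and the sample holds only the verses cited in the list above, not a full database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Utterance:
    ref: str
    particle: str               # "oy" or "hoy" (ASCII labels, an assumption)
    preposition: Optional[str]  # preposition directly after the particle, if any


def follows_typical_pattern(u: Utterance) -> bool:
    """True when the utterance shows the construction Wanke describes as usual."""
    if u.particle == "oy":
        # Usually followed by lamed + pronominal suffix or equivalent.
        return u.preposition == "lamed"
    if u.particle == "hoy":
        # Usually followed by a participle or nominal element, no preposition.
        return u.preposition is None
    raise ValueError(f"unknown particle: {u.particle}")


# Only the verses cited in the discussion of Wanke's criteria.
SAMPLE = [
    Utterance("Isa 6:5", "oy", "lamed"),
    Utterance("Jer 4:13", "oy", "lamed"),
    Utterance("Jer 22:18", "hoy", None),
    Utterance("Jer 48:1", "hoy", "'el"),
    Utterance("Jer 50:27", "hoy", "'al"),
    Utterance("Ezek 13:3", "hoy", "'al"),
    Utterance("Ezek 13:18", "hoy", "lamed"),
]

exceptions = [u.ref for u in SAMPLE if not follows_typical_pattern(u)]
print(exceptions)
```

On this sample, the verses that fail the check are the four cases in which הוי is followed by a preposition.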

II. THE PROBLEM

Following a form-criticism analysis of the biblical texts, Gunkel8 and other scholars tried to determine the literary patterns of the “Alas” oracles starting with הוי. At first glance, all the woe oracles seem to share a similar basic structure. Apparently, an oracle starting with הוי is composed of two different parts, both of which stand as independent grammatical structures. The first part is composed of the particle הוי + participle, while the second part has no specific syntactic form. From a semantic point of view, this second part often expresses a threat, a lament, a proverb or a new accusation. The first part, with הוי almost always followed by a participle clause has, in the words of Gerstenberger, “one purpose: they seek to describe a person or a group of persons in regard to what they are doing, their deeds being the cause for the foreboding woe-cry; […The statement] ‘Woe

7 Williams 1967, n. 18.
8 On Gunkel’s research on prophetic texts, see Hayes 1973, 60–99, especially 60–70.


(comes upon) one who is doing such and such’ seems to be the basic formula.”9

As far as the second part is concerned, meaning is much more blurred. Grammatically, the second part is not characterised by any particular form. From the point of view of semantics, the continuation of the initial indictment may adopt various literary forms. Obviously, these different literary forms implying different meanings are built on a variety of different syntactical structures. In Amos 5:7–17, for instance, the second part is a text of lamentation (Amos 5:16–17):

16 לכן כה אמר יהוה אלהי צבאות אדני
בכל רחבות מספד
ובכל חוצות יאמרו הו הו
וקראו אכר אל אבל ומספד אל יודעי נהי
17 ובכל כרמים מספד
כי אעבר בקרבך אמר יהוה

In Amos 5:18–20, the second part is a proverb, a mašal:

18 הוי המתאוים את יום יהוה
למה זה לכם יום יהוה הוא חשך ולא אור
19 כאשר ינוס איש מפני הארי
ופגעו הדב
ובא הבית וסמך ידו על הקיר
ונשכו הנחש
20 הלא חשך יום יהוה ולא אור
ואפל ולא נגה לו

In Hab 2:6–8, the second part introduces a threat expressed in the second person (Hab 2:7–8):

7 הלוא פתע יקומו נשכיך ויקצו מזעזעיך
והיית למשסות למו
8 כי אתה שלות גוים רבים ישלוך כל יתר עמים
מדמי אדם וחמס ארץ קריה וכל ישבי בה

9 Gerstenberger 1962, 251.


Despite the fact that both parts of the oracle are independent grammatical structures, text-grammar strategies are used in the examples above in order to connect the second part with the first one. Those strategies include the use of macro-syntactic particles, such as ‫ לכן‬or ‫ הנה‬and other connective particles, and the use of certain syntactic structures, such as rhetorical questions: —In Amos 5:16, the second part is introduced by the particle ‫ לכן‬and the introductory formula for direct speech ‫כה אמר יהוה‬:

16 לכן כה אמר יהוה אלהי צבאות אדני
בכל רחבות מספד
ובכל חוצות יאמרו הו הו
וקראו אכר אל אבל ומספד אל יודעי נהי

—In Amos 5:19, the second part is introduced by the particle כאשר and is then developed through a syntactic sequence of an initial yiqtol plus a chain of weqatal:10

19 כאשר ינוס איש מפני הארי
ופגעו הדב
ובא הבית וסמך ידו על הקיר
ונשכו הנחש

—In Hab 2:7, the second part is a rhetorical question introduced by הלוא:

7 הלוא פתע יקומו נשכיך ויקצו מזעזעיך
והיית למשסות למו

Besides the problem posed by the lack of consistency in the syntactic form of the second part of a woe oracle, some scholars have noticed the existence of yet other syntactic problems in the woe oracles as a whole.11 Among these problems we can mention the following:

—A second part does not always exist in a woe oracle, the only permanent element of the utterance being the particle הוי + a

10 This is an example of the explicative function in the chain of weqatal; cf. Del Barco 2003, 83–87.
11 See especially Gerstenberger 1962, 253.


participle. Thus, in some examples there is neither a threat nor a lament for the person or group of persons described in the first part.

—One of the most stable elements in the first part, the participle, is absent in a few examples. When this occurs, it is replaced by a proper noun, or by a nominal clause. In fact, most scholars have argued that the only distinguishing element in a woe oracle is the use of the particle הוי + the mention of a person or a group of persons for whom the lament is uttered.12

—The beginning of the oracle is always marked by the presence of the particle הוי. However, there is no specific sign or marker to indicate the end of the oracle. This makes it difficult to establish where the oracle itself ends. As a consequence, different end points have been proposed for some oracles.

From the above it is obvious that the syntactic structure of these oracles is less stable than it seems at first glance. Not even the participle is present in all the cases, and the use of the only permanent element, the particle הוי, does not imply the existence of a second part of the oracle in a semantic structure of the type ‘indictment-threat’ or ‘indictment-lamentation’. Nevertheless, there is a certain semantic homogeneity in the type of contents to which the oracles refer: themes like social injustice, the Day of the Lord, and the transgression of a covenant are almost always present in the woe oracles across pre-exilic prophecy. We think, therefore, that a combined syntactic and semantic-literary analysis is strongly needed to understand how syntax works together with semantics in order to build biblical poetry. The detailed study of the syntactic form of these oracles will help us, on the one hand, to determine the relationship, if any, between syntax and

12 Gerstenberger 1962, 251: “The words following the introductory woe have, with few exceptions, one purpose: they seek to describe a person or a group of persons in regard to what they are doing, their deeds being the cause for the foreboding woe-cry.” Janzen 1972, 23, argues that the use of “a proper noun or a precise designation of a particular addressee […] suggests that the less specifically delimiting participle may also be understood as designating a specific addressee rather than a general category.”


semantics, and, on the other hand, to understand the woe oracles in the context of the different types of discourse in the prophetic literature.

III. THE CASES

After having presented the basic problems found by previous studies of the הוי oracles, we will now analyse a selection of these oracles in pre-exilic prophecy, in order to seek new perspectives within our theoretical framework. This selection includes Isa 5:8–24 together with 10:1–4a, Amos 5:7–17 together with 5:18–20, Amos 6:1–12 together with 6:12–13, and Hab 2:6–20.

The first case, Isa 5:8–24 together with 10:1–4a, is a section containing seven הוי oracles. The question of the correct placement of Isa 10:1–4a has been a matter of scholarly debate in textual criticism, but for our discussion we will consider it as being a continuation of Isa 5:24, as most scholars do.13 From a semantic perspective, the woes in this section are directed against those who practice injustice (the rich, judges, those who feast in vain) and those who fail to recognize the “signs of the times” and the arrival of “the day of the Lord”. Therefore, despite the wide variety of the syntactic forms used in this section, the oracle is clearly uniform in its semantic analysis.

Chapters 5 and 6 of Amos have only two cases of the particle הוי, with two other emended cases which are commonly accepted by textual critics. Amos 5:7–20 is a text containing two oracles. The emended particle הוי in verse 7 introduces the first oracle, and is followed by several participles. The second oracle in this section is also introduced by the particle הוי, and is followed by another set of participles as well. Amos 6:1–14 shows the same structure with two oracles, with the particularity of having an emended הוי introducing only the second oracle in verse 13. The expected הוי at the beginning of the first oracle in verse 1 is lacking. Both texts, including the four oracles in chapters 5 and 6, constitute a large textual section which is semantically coherent, and which encompasses a swathe of the upper classes in Judean society. It is

13 Cf. the discussion in Janzen 1972, 50 n. 23.


precisely against these upper classes that the oracles are uttered: the first, against the corrupt judges; the second against those who seek the “day of the Lord” in vain (probably the priestly class); the third against the rich Judean class of land owners; and the fourth against the upper military circles. The section in Habakkuk 2 has five cases of the particle ‫הוי‬. In this sense, the whole section in Habakkuk is a genuine sequence of short indictments. From a semantic approach, the five woe oracles in Hab 2:6–20 constitute a thematically coherent section. Some scholars think that those indictments are uttered against the corrupt Judean high classes in general, as in Amos, while others, including Vermeylen, support the idea that they are uttered against the king of Judah in particular.14 The syntactic analysis of these oracles has shown that they present common specific forms in, apparently, the same contexts. As expected, all the oracles start with the particle ‫הוי‬. In the texts that we have analysed, this particle is usually followed by a participle: —Amos 5:18:

18 הוי המתאוים את יום יהוה

—Isa 5:18:

18 הוי משכי העון בחבלי השוא
וכעבות העגלה חטאה

The participle generally appears in the absolute state and in the masculine plural form, but different variations of this form are also possible. Thus, the example from Isa 5:18 presented above shows the participle in the construct state in connection with the following word, referring to “those who are doing such and such”.15 In the woe oracles of Habakkuk 2, the participle is in the masculine singular form, and not in the plural as in the rest of the cases. As we have just mentioned, this particularity has led some

14 Vermeylen 1978, 645: “[…] il est possible qu’Habacuc ne s’adresse pas à un groupe, comme ses prédécesseurs, mais plus particulièrement à la personne du roi.” [“It is possible that Habakkuk is addressing not a group, as his predecessors did, but more specifically the person of the king.”]
15 See also Isa 5:8 and 5:11.


scholars to think that the indictments in Habakkuk 2 are uttered against a specific person and not against the upper classes in general:

הוי המרבה לא לו עד מתי (Hab 2:6)
הוי בצע בצע רע לביתו (Hab 2:9)
הוי בנה עיר בדמים וכונן קריה בעולה (Hab 2:12)
הוי משקה רעהו מספח חמתך ואף שכר (Hab 2:15)
הוי אמר לעץ הקיצה עורי לאבן דומם (Hab 2:19)

The particle ‫ הוי‬is followed by a nominal element, and not by a participle, in only two consecutive instances of the cases under analysis. These cases are Isa 5:21 and 5:22. The syntax in Isa 5:22 is especially interesting. The first part of the verse shows the woe particle ‫ הוי‬followed by a noun + the preposition lamed and a construct infinitive. The second part of the verse is a syntactic and a semantic parallel depending on the previous ‫הוי‬: it shows a nominal element (here a construct chain instead of a noun) + the preposition lamed and a construct infinitive. The striking use of the construct infinitive acting as a finite verb is syntactically remarkable:

21 הוי חכמים בעיניהם ונגד פניהם נבנים
22 הוי גבורים לשתות יין ואנשי חיל למסך שכר

As in this last case, the use of the particle הוי is usually restricted to the beginning of the oracle, even if it has several parallel clauses under its influence. Thus, the influence of the particle over several clauses, and the fact that these clauses are mostly parallel, both function as syntactic and textual strategies to build a coherent text. This is well exemplified in Isa 5:20, where the fact that the text is a semantic unit is stressed by the use of the syntactic strategies just mentioned: the influence of the particle הוי over three clauses, and the use of parallelism:

20 הוי האמרים לרע טוב ולטוב רע
[הוי] שמים חשך לאור ואור לחשך
[הוי] שמים מר למתוק ומתוק למר

In this example the parallel clauses use the same syntactic structure as the first clause: they are all participle clauses. However, parallel clauses do not always use the same structure as the one in


the initial clause. In fact, several syntactic structures are widely used in parallelism with the initial participle clause. Among these structures, we find w-x-qatal (non-initial perfect), w-x-yiqtol (non-initial imperfect), a nominal clause and even wayyiqtol (consecutive imperfect):16

—Amos 5:7 (participle clause / w-x-qatal):

7 ההפכים ללענה משפט וצדקה לארץ הניחו

—Isa 10:1 (participle clause / w-x-qatal):

1 הוי החקקים חקקי און ומכתבים עמל כתבו

—Amos 6:6 (participle clause / w-x-yiqtol):

6 [הוי] השתים במזרקי יין וראשית שמנים ימשחו
ולא נחלו על שבר יוסף

—Isa 5:8 (participle clause / w-x-yiqtol):

8 הוי מגיעי בית בבית שדה בשדה יקריבו

—Isa 5:18 (participle clause / nominal clause):

18 הוי משכי העון בחבלי השוא
וכעבות העגלה חטאה

—Amos 6:3 (participle clause / wayyiqtol):

3 [הוי] המנדים ליום רע ותגישון שבת חמס
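The six examples above can be summarised by clause type. The following minimal sketch simply tallies the labels attached to each example in this section; the names `PARALLELS` and `tally_parallel_types` are hypothetical, and the list covers only the verses quoted here, not the authors' database.

```python
from collections import Counter

# (verse, structure found in parallelism with the initial participle clause),
# exactly as labelled in the examples of this section.
PARALLELS = [
    ("Amos 5:7", "w-x-qatal"),
    ("Isa 10:1", "w-x-qatal"),
    ("Amos 6:6", "w-x-yiqtol"),
    ("Isa 5:8", "w-x-yiqtol"),
    ("Isa 5:18", "nominal clause"),
    ("Amos 6:3", "wayyiqtol"),
]


def tally_parallel_types(pairs):
    """Count how often each clause type parallels the initial participle clause."""
    return Counter(clause_type for _, clause_type in pairs)


counts = tally_parallel_types(PARALLELS)
for clause_type, n in counts.most_common():
    print(f"{clause_type}: {n}")
```

A tally of this kind says nothing about frequencies in the corpus as a whole; it only makes the spread of structures among the quoted examples explicit.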

As far as the second part of the woe oracle is concerned, this may be introduced by different clause types and formulae, all of them common in the language of the prophetic discourse. We will focus on two specific structures present in the oracles that we have analysed. These structures are the formulae introducing direct speech and the rhetorical questions. The following are two cases of formulae introducing direct speech.17 Amos 5:16–17:

16 לכן כה אמר יהוה אלהי צבאות אדני
בכל רחבות מספד
ובכל חוצות יאמרו הו הו

16 It is worth noting that all these syntactic structures are commonly used in prophetic types of discourse other than the woe oracles.
17 The form and function of direct speech in biblical Hebrew narrative are exhaustively analysed in Miller 2003.


וקראו אכר אל אבל ומספד אל יודעי נהי
17 ובכל כרמים מספד
כי אעבר בקרבך אמר יהוה

This is the second part of a woe oracle, which expresses a lamentation. The textual segment is headed by the macro-syntactic particle לכן followed by the clause כה אמר יהוה, which introduces direct speech. The use of both the macro-syntactic particle and the formula introducing direct speech clearly indicates a shift in the type of discourse. The lamentation itself is syntactically built with nominal clauses, and semantically dominated by words expressing lament and mourning, such as מספד, אבל, נהי and the particle הו. A less common structure for introducing direct speech is used in Isa 5:9–10:

9 באזני יהוה צבאות
אם לא בתים רבים לשמה יהיו
גדלים וטובים מאין יושב
10 כי עשרת צמדי כרם יעשו בת אחת
וזרע חמר יעשה איפה

Direct speech is here introduced by the oath formula באזני יהוה צבאות, which is followed by the expression of a punishment as a reversal of imagery. In Hab 2:6–7 and the examples that follow, the second part of a woe oracle is developed through the use of one or more rhetorical questions with different syntactic structures. The literary function of questions of this type in this context is to announce the punishment decreed by God. As in the cases with structures introducing direct speech, the use of rhetorical questions marks a syntactic shift in the type of discourse, the second part becoming a largely independent prophetic discourse.

Hab 2:6–7:

6 הלוא אלה כלם עליו משל ישאו
ויאמר הוי המרבה לא לו עד מתי
7 הלוא פתע יקומו נשכיך ויקצו מזעזעיך
והיית למשסות למו

Hab 2:12–13:

12 הוי בנה עיר בדמים וכונן קריה בעולה


13 הלוא הנה מאת יהוה צבאות
וייגעו עמים בדי אש
ולאמים בדי ריק יעפו

Isa 10:3:

1 הוי החקקים חקקי און ומכתבים עמל כתבו
[…]
3 ומה תעשו ליום פקדה ולשואה ממרחק תבוא
על מי תנוסו לעזרה ואנה תעזבו כבודכם

In the last case, the idea of the approaching punishment is expressed by a sequence of three rhetorical questions which are characterised syntactically by the use of the imperfect.

IV. CONCLUSIONS

From the cases analysed here, we can reach the following preliminary conclusions. From a semantic perspective, every textual section seems to be a well-determined unit. This unit tends to be structured as a coherent and complete discourse mentioning all the social groups or individuals for whom the oracle is uttered.

Most of the woe oracles consist of two parts. The first part expresses the indictment, and the second part the future consequence, in the form of a punishment, a lamentation or a threat. However, some of the woe oracles lack the second part. This is the case, for instance, when the initial indictment belongs to a sequence with other related woe oracles. In this particular case, a second part may appear once, at the end of the entire oracle, thus referring to the whole sequence and not only to one particular הוי clause.

As far as syntax is concerned, the first part of the oracle is basically composed of the particle הוי + a participle or a nominal element. This basic structure may be enlarged and developed in parallelism by using syntactic structures common in prophetic discourse, such as w-x-yiqtol, w-x-qatal, wayyiqtol or nominal clauses. When a woe oracle expresses more than one indictment, the particle הוי appears once, at the beginning of the oracle, and is gapped in the remaining indictments. Gapping the particle in the successive indictments of an oracle is one of the strategies used to build the discourse. The second part of a woe oracle is usually introduced by a macro-syntactic particle, such as לכן or הנה or other connective


particle, hence marking a textual shift and the introduction of new syntactic structures. We have focused our attention on two specific structures used in prophetic discourse to introduce the second part of a woe oracle: formulae introducing direct speech and rhetorical questions. The use of several structures to introduce the second part of a woe oracle, all of them common in different types of discourse used in prophetic literature, such as predictive or descriptive discourse, reveals that the second part of a woe oracle is in fact a prophetic text with the same syntactic structure as any other prophetic discourse not related to a woe oracle.

The only common syntactic element in the structure of the woe oracles is the particle הוי at the beginning of the oracle. Consequently, there is no specific mark to indicate the end of a woe oracle as such. The end of the literary structure called woe oracle is syntactically marked by a shift of structures, that is, by the introduction of a new type of discourse. In this sense, the woe oracles have no syntactic peculiarity of their own, and they can be analysed like any other type of prophetic discourse.
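The two-part structure summarised in these conclusions can be written down as a small data model. What follows is only a sketch under stated assumptions: the class and function names are hypothetical, the labels for contents and introducers are paraphrases of the descriptions given for the cases discussed above, and the treatment of Isa 5:21 as having no second part of its own is an assumption based on the remark about sequences of related woe oracles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecondPart:
    content: str     # e.g. "lamentation", "proverb", "threat", "punishment"
    introducer: str  # structure that opens the second part


@dataclass(frozen=True)
class WoeOracle:
    ref: str
    head: str  # element following the particle: "participle" or "nominal"
    second_part: Optional[SecondPart] = None


def discourse_shift(oracle: WoeOracle) -> str:
    """Describe how the end of the first part is marked, if it is marked at all."""
    if oracle.second_part is None:
        return f"{oracle.ref}: no second part of its own"
    part = oracle.second_part
    return f"{oracle.ref}: shift marked by {part.introducer} ({part.content})"


ORACLES = [
    WoeOracle("Amos 5:7-17", "participle",
              SecondPart("lamentation", "macro-syntactic particle + direct-speech formula")),
    WoeOracle("Amos 5:18-20", "participle",
              SecondPart("proverb", "connective particle")),
    WoeOracle("Hab 2:6-8", "participle",
              SecondPart("threat", "rhetorical question")),
    WoeOracle("Isa 5:8-10", "participle",
              SecondPart("punishment", "oath formula")),
    # Assumption of this sketch: no second part of its own within the series.
    WoeOracle("Isa 5:21", "nominal"),
]

for oracle in ORACLES:
    print(discourse_shift(oracle))
```

Keeping the second part optional mirrors the observation that only the particle is constant, while the end of an oracle is recognised by a change of structure rather than by a dedicated marker.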

REFERENCES:

Clifford, R. J. “The Use of Hôy in the Prophets.” CBQ 28 (1966) 458–64.
del Barco, J. Sintaxis verbal en los Profetas Menores preexílicos. D. Phil. dissertation. Universidad Complutense, Madrid 2001. http://www.ucm.es/BUCM/tesis/fll/ucm-t25272.pdf
del Barco, J. Profecía y sintaxis: el uso de las formas verbales en los Profetas Menores preexílicos. Madrid: CSIC, 2003.
del Barco, J., and G. Seijas de los Ríos-Zarzosa. “The Syntax of Parallelism in Isaiah and the Minor Prophets: A Comparative Study.” JNSL 32 (2006) 113–30.
Gerstenberger, E. “The Woe-Oracles of the Prophets.” JBL 81 (1962) 249–63.
Hayes, J. H. “The History of the Form-Critical Study of Prophecy.” SBL Seminar Papers 1 (1973) 60–99.
Herranz Pascual, C. Sintaxis verbal en los oráculos de Ezequiel. D. Phil. dissertation. Universidad Complutense, Madrid 1993.
Janzen, W. Mourning Cry and Woe Oracle. Berlin/New York: Walter de Gruyter, 1972.


Miller, C. The Representation of Speech in Biblical Hebrew Narrative. Winona Lake, Indiana: Eisenbrauns, 2003.
Mowinckel, S. Psalmenstudien V. Segen und Fluch in Israels Kult und Psalmendichtung. Kristiania, 1924.
Seijas de los Ríos-Zarzosa, G. Análisis sintáctico del libro de Isaías (caps. 1–39). D. Phil. dissertation. Universidad Complutense, Madrid, 1992. http://www.ucm.es/BUCM/tesis/19911996/H/3/AH3020401.pdf
Vegas Montaner, L., and G. Seijas de los Ríos-Zarzosa. “A Computer Assisted Syntactical Study of Poetic Biblical Texts.” In Bible et Informatique. Matériel et matière, 341–55. AIBI, 4. Paris, 1995.
Vermeylen, J. “Le discours prophétique en Hôy.” In Du Prophète Isaïe à l’Apocalyptique. Isaïe I–XXXV, miroir d’un demi-millénaire d’expérience religieuse en Israël, II, 605–652. Paris, 1978.
Wanke, G. “אוי und הוי.” ZAW 78 (1966) 215–18.
Westermann, C. Basic Forms of the Prophetic Speech. Philadelphia, 1967 (English translation of Grundformen prophetischer Rede, Munich 1960).
Williams, J. G. “The Alas-Oracles of the Eighth Century Prophets.” HUCA 38 (1967) 75–91.

INDEX OF AUTHORS Abegg, M. G. 54, 65 Abram, K. 80n Adair, A. 8 Agresti, A. 104n Allen, J. D. 61 Altman, D. G. 104n Andersen F. I. 4, 4n 19, 20, 99, 99n, 100n., 101n., 105, 106, 140

Cruse, D. A. 141 Dekkers, E. 35n De Blois, R. 140n De Moor, J. C. 62 De Polignac, F. 27n De Troyer, K. 8 Del Barco, J. 21, 160n (2), 166n Díez Macho, A. 32n Driver, S. R. 137n Dyk, J. W. 4n, 120n

Bader, W. 5 Bajard, J. 6 Barber, E. A. 50n Barnes, R. 41n, 42, 42n Baumgartner, W. 56, 136 Bell, T. C. 75n Bennum, R. 5 Berleur, J. 7 Bernard, J. 4 Brenton, L. L. 53 Briggs, Ch. A. 137n Brooke, A. E. 51 Brown, F. 137n Brunner, T. 6

Eckardt 140n Erskine, A. 27n Eynikel, E. 48 Fabry, H.-J. 36n Fernández Marcos, N. 25n, 31n, 32n, 34n, 39n Field, F. 53 Flint, P. 8, 47n Follingstadt, C. M. 140n Forbes, A. D. 4, 4n, 19, 20, 21, 99n, 100n., 101n., 140, 160n Fraser, P. M. 30n, 33n, 35n

Chen, H.-H. 114n Chiaramella, P. 4 Claassen, W. 4n Clifford, R. J. 163 Clines, D. J. A. 137n Cook, J. G. 4n, 11, 38, 38n Collins, Nina L. 36n Conybeare, F. C. 53 Copinger, W. A. 26n Croft, W. 141

García Maza, J. 26n Geeraerts, D. 141, 142 Gentry, P. 82n Gerstenberger, E. 162 (2), 164, 165n, 166n, 167n Gesenius, W. 137n Glare, P. G. 50n Gosker, M. 120n Grenfell, B. P. 29n


COMPUTER ASSISTED RESEARCH ON THE BIBLE

Gries, S. T. 138, 139, 141, 155 Gross, W. 140n Groves, A. 51n Gruen, E. S. 38n Gunkel, H. 161, 164, 164n Haegeman, L. 119n Hahne, H. 7, 55n, 57 Hame, C. 26n Haralambous, Y. 66 Harmsen 6 Harold Ellens, J. 30n Hauspie, K. 48 Hayes, J. H. 164n Heinreich, H. 27 Hengel, M. 34n Herranz, C. 160n Hoftijzer, J. A. 140n Holladay, C. R. 39n Honigman, S. 32n, 39n Huddleston, 154n Hudson, J. 62 Hugues, John J. 5 Hunt, A. S. 29n Imbayarwo, T. 135n Jacob, C. 27n Jacobs, J. 27n Janzen, W. 167n, 168n Jensen, F. V. 104n Jenni, E. 140n Jones, H. S. 50 Joüon, P. 137n Juste, Y. 6 Kaljurand, K. 114n Knauf, E. A. 140n Koehler, L. 56, 136 Kraft, R. 6, 47, 47n, 54n Kreuzer, S. 33n (2) Kroeze, J. 137n

Lakoff, G. 151n Lancashire, I. 5n Langacker, R. W. 142 Leonard Jones, H. 28n Lesch, J. P. 33n Liddell, H. G. 50 Lübbe, J. 135n Lust, J. 8, 48 MacLeod, R. 41n Matthews, P. H. 101n McLean, N. 51 Meeks, Wayne A. 35n Miller, C. 171n Miller, P. 4n, 5, 5n Moffat, A. 75n Mowinckel, S. 162 Muraoka, T. 137n, 151 Naudé, J. K. 137n, 143n, 147n Ng, W. H. 114n Nida, E. 7, 15 Nielsen, T. D. 104n O’Connor, M. 137, 155 Ofer, Y. 4n Offerhaus, U. 36n Orth, W. 36n Ott, W. 5 Paul, A. 40n Payne, Ph. 52n Pecoud 6 Pietersma, A. 53 Pfeiffer, R. 30n (2) Poswick, R.-F. 4n, 6 (2), 13 Pullum, 154n Radday, T. 4n Rahlfs, A. 46, 53 Rechenmacher, H. 140n Revell, E. J. 63n Richter, W. 140 Rüterswörden, U. 137n


Sánchez Ron, J. M. 26n Scott, R. 50 Seijas, G. 21, 160n (3) Shaw, M.-S. 114n Stamm, J. J. 56 Stock, St. G. 53 Stroumsa, Guy G. 25n Swart 8 Swete, H. B. 31n, 34n, 53 (2)

Van der Merwe, C. H. J. 4n, 8, 135n (3), 137n, 140n, 141n, 143n, 145n, 147n (3), 153n Van Seters, J. 27n, 30n Vegas Montaner, L. 4n, 21, 160n Vermeylen, J. 169, 169n Von Gall, A. F. 51 Von Tischendorf, C. 51

Tal, A. 53 Talstra, E. 4n, 7, 16, 19 (3), 140, 160n Taylor, B. A. 47n Thackeray, H. St. J. 31n, 51 Trobisch, D. 8 Tresseder, T. 6 Tov, E. 4n, 8, 45n, 47n, 49n, 50n, 54 Treat, J. 47n

Veltri, G. 34n Waltke, B. K. 137, 155 Wanke, G. 163 (2), 164 (2) Westermann, C. 162 Wilken, Robert L. 35n Williams, J. G. 164n Witten, I. H. 75n Wright, B. G. 53 Yeivin, I. 63n Zengyou, H. 114n