Information Technology and Egyptology in 2008: Proceedings of the meeting of the Computer Working Group of the International Association of Egyptologists (Informatique et Egyptologie), Vienna, 8–11 July 2008 9781463216269

The Computer Working Group of the International Association of Egyptologists has been in existence since 1983. The group

147 101 72MB

English Pages 228 [224] Year 2009

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Information Technology and Egyptology in 2008: Proceedings of the meeting of the Computer Working Group of the International Association of Egyptologists (Informatique et Egyptologie), Vienna, 8–11 July 2008
 9781463216269

Citation preview

Information Technology and Egyptology in 2008

Bible in Technology Volume 2

Series Editor Keith H. Reeves

Information Technology and Egyptology in 2008 Proceedings of the meeting of the Computer Working Group of the International Association of Egyptologists (Informatique et Egyptologie), Vienna, 8–11 July 2008

Edited by Nigel Strudwick

Gorgias Press 2008

First Gorgias Press Edition, 2008 Copyright © 2008 by Gorgias Press LLC All rights reserved under International and Pan-American Copyright Conventions. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise without the prior written permission of Gorgias Press LLC. Published in the United States of America by Gorgias Press LLC, New Jersey ISBN 978-1-60724-068-6 ISSN 1943-9369

Gorgias Press

180 Centennial Ave., Suite 3, Piscataway, NJ 08854 USA www.gorgiaspress.com Library of Congress Cataloging-in-Publication Data IAE Computer Working Group. Meeting (2008 : Vienna, Austria) Information technology and Egyptology in 2008 : proceedings of the Meeting of the Computer Working Group of the International Association of Egyptologists (Informatique et Egyptologie), Vienna, 8-11 July, 2008 / edited by Nigel Strudwick. -- 1st Gorgias Press ed. p. cm. -- (Bible in technology ; 2) 1. Egyptology--Information technology--Congresses. 2. Egyptology--Data processing--Congresses. I. Strudwick, Nigel. II. Title. DT60.I34 2008 025.06’932--dc22 2008051347 The paper used in this publication meets the minimum requirements of the American National Standards. Printed in the United States of America

CONTENTS

INTRODUCTION

Nigel Strudwick . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 SECTION I PAPERS FROM THE VIENNA MEETING THE ROSETTE PROJECT: COMPUTER ASSISTANCE FOR THE STUDENT, THE EPIGRAPHIST AND THE PHILOLOGIST

Vincent Euverte . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 TRISMEGISTOS. AN INTERDISCIPLINARY PORTAL OF PAPYROLOGICAL AND EPIGRAPHICAL RESOURCES

Svenja A. Gülden . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 1000–500 BC Claus Jurman, University of Vienna . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 THE MEMPHIS DATABASE PROJECT

EDUCATIONAL IMAGES ON THE WEB

Edward Loring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 DAS GEFLÜGELTE KROKODIL CODIERUNG VON TOTENBUCH-VIGNETTEN

Marcus Müller-Roth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 AUTOMATIC ALIGNMENT OF HIEROGLYPHS AND TRANSLITERATION

Mark-Jan Nederhof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN: TRAITEMENT XML DES TEXTES DES PROCESSIONS DE SOUBASSEMENT DES TEMPLES TARDIFS

Vincent Razanajao . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

V

VI

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

AN OFFERING TO AMUN-RA: BUILDING A VIRTUAL REALITY MODEL OF KARNAK

Elaine Sullivan, Willeke Wendrich . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 AGENT-BASED MODELS OF ANCIENT EGYPT Sarah Symons & Derek Raine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 L’USAGE DE LA 3D EN ARCHÉOLOGIE

Robert Vergnieux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 RAMSES. A NEW RESEARCH TOOL IN PHILOLOGY AND LINGUISTICS

S. Rosmorduc, St. Polis, J. Winand. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 SECTION II ADDITIONAL PAPERS AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

Serge Rosmorduc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 RELATIONAL DATABASE DESIGN: A TUTORIAL AND CASE STUDY FOR EGYPTOLOGISTS

Ernest W. Adams and Nigel Strudwick . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 USING RELATIONAL DATABASES AT THE BEGINNING OF THE CENTURY

21ST

Dag Bergman . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 Author Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221

INTRODUCTION

Nigel Strudwick

At the closing session of the 2006 Oxford meeting of the Computer Working Group of the International Association of Egyptologists (Informatique et Egyptologie, I&E) an invitation was extended to the group by Regina Hölzl to hold the next meeting at the Kunsthistorisches Museum in Vienna (KHM) in 2008. Those present accepted this offer with alacrity, and the conference of which the present volume is the proceedings is the result. I should, on behalf of the group, like to thank Dr Wilfried Seipel, Director of the KHM, for his generosity and hospitality in agreeing that the meeting could take place in his museum. The organisation of the meeting was undertaken by Regina Hölzl and her colleagues in the Egyptian Department of the KHM, and I and all the participants are deeply appreciative of the welcome and excellent facilities extended to us. BACKGROUND TO THE PRESENT PUBLICATION

Throughout the 1980s and into the early 1990s, it proved possible to publish the proceedings of the meetings in print through the good offices of founding members of the group Nicolas Grimal and Dirk van der Plas. The last proceedings to appear in this way were those of the 1994 meeting in Bordeaux. The publication of subsequent meetings has been erratic, with some papers appearing independently, and only the papers from the 2002 Pisa meeting saw the systematic light of day via a CD ROM. Publication via electronic media seems very appropriate for a group whose interests lie in the possibilities of Information Technology in Egyptology. Nonetheless, it cannot be denied that there is still considerable 1

2

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

prejudice against, or suspicion of, such publications within the subject. The manner in which such works are collected and catalogued within Egyptology is still a little unreliable, and the rather fragile aspects of the Internet, particularly the ease with which sites come and go, is still very much an issue. For whatever reason, a print publication still seems more likely to come to the attention to colleagues, and has potential advantages in raising the profile of I&E. Thus I was delighted to be approached by Katie Stott, Production Editor of Gorgias Press, with the offer of publishing these papers. I am very grateful to Gorgias for taking on this task. It is likely that, with the agreement of the publishers, some of the papers in this volume may also appear electronically on the web-sites of individual contributors. CONTENTS OF THIS VOLUME

A total of sixteen papers was presented at the meeting, arranged into four broad groups. The first, Modelling and Animation, produced an excellent introduction by Robert Vergnieux to the processes and issues of modelling in 3D. The world of 3D has much to offer Egyptologists in terms of reconstructing ancient worlds, and a particular example of it was presented by Elaine Sullivan with a very informative and elaborate model of the temples of Karnak. The paper of Raine and Symons introduced to I&E the subject of Complex Systems, and the modelling or simulation of social processes in ancient societies. Such modelling may seem doomed to failure in ancient Egypt due to the very skewed nature of surviving data, but careful application of the principles in conjunction with full understanding of the data can produce interesting results. Most importantly of all, it brings further techniques to the attention of Egyptologists. Session 2 was devoted to Text Corpora and Text Processing. Papers by Razanajao and Grützkau presented examples of the use of XML as an open standard for the encoding of texts. The implementation of an ambitious database of Late Egyptian texts was described by Rosmorduc and Winand, a database which has a real potential as a research tool for many users. Michael Everson’s paper gave a brief update on the status of the proposal for the encoding of hieroglyphs in Unicode, asked several questions of the group, and made a number of suggestions for the future. Nederhof ’s paper examined options for the alignment of the hieroglyphic and transliterated versions of texts, and illustrates how techniques of analysis from Computer Science can provide useful features and insights for Egyptologists.

INTRODUCTION

3

Databases have for many years been a central feature of presentations at I&E, and new and old ones featured in Session 3. As new databases, Navrátilová & Landgrafová looked at the IT dimensions of their database of First Intermediate Period Biographical Texts, as Jurman did with his database of Late Period material from Memphis. The discussions of these papers revolved around a number of points, but particularly how such datasets might be released onto the Internet and made available for all, an issue which needs to concern I&E very much in the future. Two projects with prominent presences on the World-Wide Web completed this session, with Gülden discussing the papyrological collections of Trismegistos, and illustrating how it works, while Horst Beinlich demonstrated the SERat database at the University of Würzburg, and showed a number of features of the project which thus far have not been implemented in the online version, but which may be consulted via the department in Würzburg. The last session was entitled Images, Bibliography and Tools. As for the first of these, Müller discussed how the complex vignettes of the Book of the Dead may be registered, while Loring presented his thoughts on the online availability of images and issues of museum copyright. The final paper was an update from Willem Hovestreydt on the Annual Egyptological Bibliography, to which I will return shortly. The opportunity has been taken to include two papers not presented at the Vienna meeting. That by Serge Rosmorduc was presented originally at the Würzburg I&E conference in 2000, but never published. Since that time his system has developed, and it is highly appropriate to present it here now as a up-to-date examination of how computers might be used to begin to attempt automatic analysis of hieroglyphic texts, particularly in the context of the paper by Nederhof which appears here. The second paper is a republication of a paper on database theory by Ernest Adams and myself presented in 1986 and published in 1990. A paper kindly contributed by Dag Bergman follows it here to help explain its inclusion and relevance. The technology may have changed but the underlying principles have not, and it would seem that although there are many Egyptologists using databases, a considerable number of them do not fully understand the underpinnings of the systems they use. As the original publication has long been unavailable, I decided to take advantage of the present volume to make it available again. For simplicity of setting, all references in papers have been left in the formats in which they were submitted by the authors.

4

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

CONCLUDING REMARKS AND THE FUTURE OF I&E

By way of closing remarks at the Vienna meeting, I reviewed the subjects discussed, indicated any actions which the meeting should take, and looked into the future. Two business matters should be recorded here. Michael Everson had asked in his paper for the group’s opinion on the best way of encoding the transliterated Egyptological yodh, which apparently does not exist yet within any Unicode font. Summarising, the options were to encode it as a discrete character, or to create it from existing elements in other Unicode fonts. The meeting felt that the former course was better, recognising that it would be a longer process than creating a composite; one important reason for not preferring the composite was that it would free us from any obligation to have another font on the computer which would remain otherwise unused. Everson also enquired about the default sort order for the signs in the Unicode encoding of hieroglyphs; as there is no consistent or overwhelming way to handle all of them phonetically, it was felt best that the sort order should be that of the Gardiner list. It was stressed that I&E does not feel it has the right on its own to make such decisions on the part of the whole community, but I passed on its recommendations to the President of the International Association of Egyptologists (IAE), James P. Allen. The response on the issue of yodh from the Unicode Technical Committee (UTC) which oversees Unicode proposals, has thus far not been encouraging. They seem set against the creation of new characters, preferring composites. I am in the process of making representations against this, although I am not optimistic that the views of those who actually use the characters are of particular significance to the UTC. The precarious situation of the Annual Egyptological Bibliography has long been a cause for concern in the Egyptology world. Since Vienna, major developments have taken place, and Willem Hovestreydt has authorised me to print this statement on the state of matters in January 2009: At the Vienna meeting I described the AEB’s financial and institutional situation, which had become very difficult, especially after the University of Leiden decided to cease funding by the end of 2008. I presented several options, one of which was that the AEB should seriously explore the possibility of moving its base to a foreign institution. I am happy to say now that such a solution has indeed been found. From 1 January 2009 the AEB will be located

INTRODUCTION

5

in the Griffith Institute, Faculty of Oriental Studies, University of Oxford. The name Annual Egyptological Bibliography will change to Online Egyptological Bibliography (OEB). The OEB will also include Christine Beinlich-Seeber’s Bibliographie Altägypten 1822–1946. Discussions begun in Vienna played an important role in generating the initiative for the move to Oxford. I am sure all Egyptologists, not just participants in I&E, will wish the AEB the very best of futures, and will continue to support the project. I was delighted to be able to announce that Jean Winand, one of only two members of the group to be present at the first meeting of I&E (the other is Robert Vergnieux), has offered the University of Liège as the venue for the 2010 meeting of the group. All members were delighted at his generosity, and we look forward to reconvening in Belgium. The issue of the future of I&E frequently crosses my mind. In the early days of the group, it produced work widely used by the rest of the Egyptological community, notably the Manuel de Codage system, and the Multilingual Egyptological Thesaurus, as well as watching over the production of the earliest systems for writing hieroglyphs with computers. The most important thing which I&E should do is to continue to make itself relevant to Egyptology, and encourage those who are using various innovative IT techniques to bring them to meetings so that they may become wider known and benefit from discussions with others. One such area which is becoming more frequent in Egyptology, but which has yet to make an appearance at I&E, is Geographical Information Systems (GIS). But there are many areas which the meeting has examined before and which are as important as ever. I have already noted that fuller understanding of database concepts needs reinforcement and reiteration. Another matter which I&E has discussed and which needs further examination is that of data persistence and preservation (see my comments earlier about electronic vs print publication!). It is my belief that in this latter area I&E can help drive forward systematic approaches, and to that end, it was agreed in the closing session of the meeting that those of us interested in this topic should meet in the course of 2009 and consider how discussion on this subject should be facilitated at the Liège meeting. The future for I&E remains exciting so long as we remain relevant and encourage our colleagues to take full advantage of the information and technological revolutions which we have observed and participated in over the past 25 years.

6

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

PAPERS PRESENTED AT THE VIENNA MEETING

Session 1 (Modelling and Animation) Vergnieux, L’usage de la 3D en Egyptologie Sullivan, An Offering to Amun-Ra: Building a Virtual Reality Model of KarnakTemple Raine & Symons, Complex Systems: Agent based models of ancient Egypt Session 2 (Text Corpora and Text Processing) Razanajao, The Electronic Text Corpora of Ancient Egyptian: XML Treatment of Processions of Nile-Gods Grützkau, Hieroglyphic Texts and XML* Rosmorduc & Winand, Ramsès Project Everson, Yod, Unicode, and future options for Egyptian encoding* Nederhof, Automatic alignment of hieroglyphic and transliteration Session 3 (Databases) Navrátilová & Landgrafová, The Database of First Intermediate Period Biographical Texts* Jurman, Prosopography and Monument-analysis with the Memphite Late Period Database Beinlich, Datenrecherche über die Internet-Version von SERat hinaus* Gülden, Trismegistos Session 4 (Images, Bibliography and Tools) Müller, Encoding Vignettes of the Book of the Dead Loring, Sharing Images in the Internet Euverte, Rosette. A computer-assistance for the student, the epigraphist, and the philologist Hovestreydt, The Annual Egyptological Bibliography: Recent and Latest Developments* Nigel Strudwick, Closing comments * indicates that this paper was not submitted for this volume.

THE ROSETTE PROJECT: COMPUTER ASSISTANCE FOR THE STUDENT, THE EPIGRAPHIST AND THE PHILOLOGIST

Vincent Euverte

ABSTRACT

Since 2004, the Rosette Project has promoted the culture of Ancient Egypt through its writing, with its main focus on the usage of modern technologies to assist in the reading and the translation of hieroglyphs. This paper describes the achievements over these four years, and highlights the perspectives offered by the rapid developments in computer science. INTRODUCTION

In 2004, the concept of computer assistance to read and translate the hieroglyphic texts was launched. Soon some Egyptology amateurs joined the project, and the next two years were dedicated to address multiple difficulties in programming as well as in the learning of the writing. In July 2006, the Informatique & Egyptologie Computer Group Meeting in Oxford gave the first opportunity to present our initial results to professional colleagues. The reception given to the project was encouraging, and many pertinent items of advice were offered so as to orientate future developments. In 2008, two major events have allowed us to reassess the project and to redefine our axes of progress: the International Congress of Egyptology in Rhodes, and the Informatique & Egyptologie Computer Group Meeting in Vienna. 9

10

INFORMATION TECHNOLOGY AND EGYPTOLOGY

Figure 1: The home page of the Rosette Project

Figure 2: Sample entry from the catalogue of hieroglyphs

2008

THE ROSETTE PROJECT

11

We wish here to thank all those who believed in and supported the Rosette project. Their suggestions and comments will be incorporated into the future development of the project. THE ROSETTE PROJECT

Thanks to a team of committed and talented volunteers, and supported by a network of hundreds of users, the Rosette application (http://projetrosette.info/, Figure 1) has been on-line and free of charge on the Internet for three years and offers the following features: • the multilingual environment includes so far French, English, and Arabic (Spanish and Portuguese are under development); a simple and dynamic interface with drop-down menus and an integrated help function; compatibility is assured for most web browsers which are in compliance with XHTML standards, without any influence on the user environment (there is no need for cookies nor downloads). • The catalogue of hieroglyphs (Figure 2) contains more than 3000 signs, documented and enriched with palaeographic images, and available in three different fonts in Unicode EGPZ format, in partnership with Saqqara Technology. A descriptive card is accessible by a single click on any hieroglyph, from everywhere in the Rosette application. • The dictionary already contains all the entries of the Concise Dictionary of Middle Egyptian by R.O. Faulkner, with the kind authorisation of the Griffith Institute, as well as its French counterpart from the Medjat Association. Multi-criteria search tools allow access to different writing forms, semantic variants, transliterations, translations, and bibliographic references. • The hieroglyphic editor complies with the Manuel de Codage 1988 (MdC) and offers several additional functions such as the fine positioning of the signs and the composition of groups and sign combinations. A lexical analysis allows to find automatically every word of the dictionary that is contained in the submitted sentence. • A corpus of texts links high resolution photographs of the original artefact with the hieroglyphic pattern, the MdC sequence, the transliteration, the translation in several languages, and the references to the source document. Each artefact is described according to the

12

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 3: sample text corpus entry, from the White Chapel of Senwosret I

Multilingual Egyptian Thesaurus (MET) to allow thematic searches. An ongoing project is an indicator of developments to come: a detailed visit to the White Chapel of Senwosret I in Karnak (Figure 3). 3D navigation will be available for every scene, and permit the analysis of the symmetrical arrangement of this wonderful monument. • Several thematic articles discuss epigraphy, philology, calligraphy, and so on, as well as the lists of kings, gods, toponyms, etc. in association with the dictionary. The tools we have developed enable a wide range of topics to be examined, and we are planning to expand the scope of this further. An interesting example under construction is the list of military titles at all dates, collected by Michel Sancho, a specialist in military matters. ASSESSMENT

The comments received from the professionals are encouraging to our approach. The following are the most appreciated strengths of the Rosette project: • A structured database: From its initial conception, Rosette has been seeking an information architecture which will lead to an

THE ROSETTE PROJECT











13

optimal usage. However, Egyptology has multiple needs which require reassessment and realignment of this structure. Dag Bergman and Nigel Strudwick demonstrated their interest in such a review. Therefore, we will document our approach and submit it to them for their review and suggestions. Integration: One of the key features of Rosette is its ability to integrate very varied functions in a single environment. Most of these functions already exist in other application, often more powerful overall, but rarely combined altogether. The interaction between the catalogue, the dictionary, the editor, and the corpus makes Rosette a global tool for the study of hieroglyphs. Referees: Every single element (MdC, transliteration, translation, bibliographic reference) has to be checked by a qualified person. Our approach is to grant a ‘referee level’ to the reviewer, and to indicate the review date, the name and the level (at this date) of the reviewer. Attestations: Rosette aims to explain each source as much as possible. A good example is the palaeographic collection, which justifies the choice we made in the design of our hieroglyphic fonts. Respect of copyright: The Rosette project is proud to be assisted by an international Law Advisory Agency (Hogan & Hartson). We took the firm resolution to protect the authors’ rights, and to only publish duly authorised works. We believe this to be an assurance of our seriousness and our intentions to maintain this project over the longer term. Compliance with the standards in force in Egyptology: The Manuel de Codage, the MET, the basics of transliteration, the bibliographic abbreviations, and so on, are all essential to the acceptance of the project by specialists as well as by amateurs.

PERSPECTIVES

The Rosette Project has already achieved number of its initial objectives. Nevertheless, we are far from having exhausted all the possibilities offered by the technology. So it is even more important to highlight the major axes of improvement: • Harmonisation and extension of the Rosette fonts: Whatever its quality, a computer font is reductive by nature, for it normalises

14

INFORMATION TECHNOLOGY AND EGYPTOLOGY









2008

the graphic representation of each hieroglyph. So it is worth debating whether it is necessary to build a collection of fonts covering each period and/or each writing mode. One could for instance envisage a first series of five fonts to handle separately Middle Egyptian, Late Egyptian, Late Period Egyptian, a ‘printing font’ (e.g. de Rougé or Theinardt), and a semi-cursive one for papyrus texts. Rosette already proposes the first three and is evaluating the others. Revision and extension of the dictionary: First, it is essential to validate our current database. This effort requires skills that are beyond our team’s resources as of today, and it stresses the need for the project to be supported by philologists. As it is currently structured, the Rosette dictionary is ready to gather multiple sources. The collections of royal names, divine names, military titles, toponyms, and so on, are underway; there are many thematic items which will enrich the dictionary and the lexical analysis. Enrichment of the corpus and cross-analyses: The Rosette corpus currently includes nearly one hundred texts. In order to achieve cross-analyses on different criteria (period, supporting material, type of text, and so on), this collection must grow. The MET classification is a first step towards rationalisation. We are now developing multi-criteria search tools and adapting the existing texts to these standards. Another axis of progress is the statistical analysis of the occurrences of hieroglyphs by period, supporting material, and so on. On another hand, we also have numerous photographs of original artefacts readily available for future addition to the corpus. As soon as volunteers are available, their translations can be immediately verified and posted on the web-site. Semantic and lexicographic analyses: Transliteration is unavoidable in the translation process. With the help of the catalogue, the dictionary, and the phonetic table, it should be possible to facilitate and to speed-up this exercise with the assistance of the computer. Then, a semi-automatic analysis should allow the identification of a few elementary grammatical rules to facilitate the work of the reader-translator. We are still aware that his/her acumen and experience will be required for the final interpretation. Thematic articles on epigraphy and philology: The Rosette Project’s purpose claims to ‘promote the culture of Ancient Egypt through its writing’. The corollary of this wish is to publish serious

THE ROSETTE PROJECT

15

and documented articles about any topic related to hieroglyphic writing. • These are not withstanding the many potential improvements of the web-site itself, to ease navigation for the visitor. CONCLUSIONS

Computer science offers almost boundless opportunities. It is up to us to exploit them for the sake of the Egyptological community. Should a worldwide standard emerge in the coming years, it could allow all texts to be encoded in a unique format, available in one single database, with a common dictionary encompassing every expert’s knowledge, assisted by a syntactic engine, and in multi-lingual translations. Even if these standards are not yet ready, nothing prevents starting to capture the data in a temporary format. Whatever Rosette can collect from now on, it will be easy to convert and transfer it to any future ‘official database’, blessed by the Egyptological community, as soon as this database is created. Is not it worth starting the data collection now, in parallel with the development of the new standards?

TRISMEGISTOS. AN INTERDISCIPLINARY PORTAL OF PAPYROLOGICAL AND EPIGRAPHICAL RESOURCES

Svenja A. Gülden

ABSTRACT

Trismegistos, named after the famous epithet of Hermes-Thoth, the Egyptian god of wisdom and writing who also played a major role in Greek religion and philosophy, is a platform aiming to surmount barriers of language and discipline in the study of late period Egypt and the Nile valley (roughly 800 BC–AD 800). It brings together a variety of projects dealing with metadata, mainly of published documents. Its core component is Trismegistos Texts, which includes papyrological and epigraphic texts, not only in Greek, Latin, and Egyptian in its various scripts (Demotic, Hieroglyphic, Hieratic and Coptic), but also in Meroitic, Aramaic, Arabic, Nabataean, Carian, and other languages (currently 106859 records).

Scholarly tools that help to find and explore written sources from Ancient Egypt with relevance for one’s research have become increasingly important in the last century. For Demotic studies such tools were first provided in the late 1960s and the early 1980s, mainly by articles of Pestman, Thissen and Lüddeckens, that made precisely dated demotic texts easily retrievable through detailed lists.1 For Greek papyrology such a tool was missing at that time. Since the early nineties of the previous century, however, the Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden Ägyptens (HGV) provided similar, but digital lists of dated documents, and from 1997 onwards their FileMaker 17

18

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

database with information about date, provenance and other metadata has become fully accessible online.2 For Greek literary texts a similar tool is available in the form of the Leuven Database of Ancient Books (LDAB).3 Digital tools like these of course made metadata searches more flexible for Greek texts than that for Demotic ones. Therefore the first goal of the project Multilingualism and Multiculturalism in Graeco Roman Egypt4 was to create a similar tool to find and explore Demotic and other Egyptian papyrological material in general. SETTING UP TRISMEGISTOS

In early 2005 we therefore started to create Egyptological metadatabases as counterparts to the already mentioned databases HGV – concentrating on Greek and Latin documentary papyrological texts – and LDAB – for literary texts – by programming the databases DAHT (Demotic and Abnormal Hieratic Texts) and HHP (Hieroglyphic and Hieratic Papyri). Fortunately, for neither of these projects (DAHT and HHP) did we have to start from nothing, because for Demotic papyrological material two unpublished digital tools were placed at our disposal. The first was a database of limited metadata of Demotic papyri of all genres, compiled by Heinz-Josef Thissen, and the second was the nascent database with more elaborate metadata of Demotic documentary texts (written on papyri, ostraca and occasional other writing surfaces), which was part of the relational databases of the Prosopographia Ptolemaica and the Leuven Homepage of Papyrus Collections.5 1

2 3 4

5

P.W. Pestman, Chronologie égyptienne d’après les textes démotiques, P.L.Bat. 15, Leiden 1967, H.-J. Thissen, Chronologie der frühdemotischen Papyri, in: Enchoria 10, 1980, 105–25, E. LÜDDECKENS, Papyri, Demotische, in: LÄ IV, Wiesbaden 1982, 750– 898. http://aquila.papy.uni-heidelberg.de/gvzFM.html http:// ldab.arts.kuleuven.be/ This project was made possible by a Sofja Kovalevskaja Preis awarded by the Alexander von Humboldt-Stiftung. The award which was obtained by Mark Depauw from the Katholieke Universiteit Leuven did allow him to set up this project at the Seminar für Ägyptologie der Universität zu Köln, under the auspices of Heinz-Josef Thissen, the former Director of the Cologne Egyptological Seminar. http://lhpc.arts.kuleuven.ac.be/

TRISMEGISTOS

19

After these two unpublished databases had been merged into the new system and double entries had been eliminated, new records were entered on the basis of the Berichtigungsliste of Demotic Documents6 – a non-digital tool listing all published Demotic texts for which corrections had been proposed in secondary literature. The HHP database also builds upon two unpublished digital tools which were put at the project’s disposal: first the database with metadata of Late Period Funerary Papyri created by Marc Coenen and second that of Late Book of the Dead Papyri compiled mainly by Ursula Verhoeven with later additions by the present author. The second database was adapted and enlarged by the Book of the Dead project in Cologne and Bonn and published as an Online Prosopographie in a downloadable version with limited data, which – at a later stage of my work on the HHP – also proved useful.7 Again eliminating double entries and assigning a unique number to the texts in the system was one of the main elements of the work, but in this case another, sometimes complicated task was identifying and joining papyrus fragments that belong together in one single entry. Each entry in our databases is considered to be a single document. To determine what constitutes a ‘single document’ we have adopted a definition based on material aspects: In principle all texts written on what was in antiquity a single writing surface belong together and form one document, unless there are good reasons to believe that the only (and unrelated) relation between the two texts is the writing surface itself.8 If two or more texts on the same writing surface are related, they are therefore considered a single document; if the texts do not belong together, i.e. in case of reuse, they form separate entries in the database but are connected by a link to each other. Thus papyrus fragments that are spread over several museums (e.g. funerary papyri such as Book of the Dead manuscripts) but which can be identified as being parts of a single papyrus in 6

7 8

A.A. Den Brinker, B.P. Muhs, S.P. Vleeming, A Berichtigungsliste of Demotic Documents, Studia Demotica 7, 2 vols, Leuven 2005. Although itself at the time unpublished, the compiler Sven Vleeming was so kind as to send a preliminary version at the start of the present project, which greatly facilitated data entry. http://www.uni-bonn.de/www/Totenbuch_Projekt/Online_Prosopographie.html See M. Depauw, Bilingual Greek – Demotic Documentary Papyri and Hellenization in Ptolemaic Egypt. The Trismegistos Database, in: P. Van Nuffelen, (ed.), Faces of Hellenism, Studia Hellenistica, Leuven, forthcoming.

20

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

antiquity, constitute a single record. This basic definition works quite well for the papyrological material, although there are, as always, some problematic cases. It soon became evident that the two language / script based databases DAHT and HHP would hardly be practicable without an accompanying bibliographical database. Therefore we started with the digital version of the Demotistische Literaturübersicht (DL). This bibliographical tool was initially restricted to the entries made in the published version in Enchoria since 1971. We again were fortunate to receive a basic electronic version from H.-J. Thissen. In addition to this we have created a general bibliographic database (TMBib), which is linked directly to our text databases. It provides bibliographical information for all DAHT and HHP entries, including the abbreviations used. Since this bibliography is fully implemented in the system it is also possible to start from a certain publication and see whether there are links to the texts themselves. Apart from setting up databases for material in Egyptian scripts as counterparts to the Greek papyrological ones, the project also wanted to investigate questions of multilingualism on the basis of these tools. To quantify we therefore needed to establish ways of exchanging information by bringing everything together in a common platform, which we have decided to call Trismegistos. For this purpose we first implemented a mapping procedure for all records with metadata of the HGV, LDAB and our several project databases. Each record of the partner projects was then assigned a unique (and random) identification number, the TM-ID, which is used to link between all relevant databases. TRISMEGISTOS NOW

– OFFLINE VERSION

Trismegistos is based upon several internal and external partner projects: Internal Trismegistos partner projects DAHT Demotic and Abnormal Hieratic Texts HHP/HHT Hieroglyphic and Hieratic Papyri / Texts ATE Aramaic Texts from Egypt TM Magic Magical, ritual, religious and divinatory Texts (all languages) LDAB Leuven Database of Ancient Books (all languages)

TRISMEGISTOS

21

External Trismegistos partner projects HGV Heidelberger Gesamtverzeichnis (Greek and Latin) APD Arabic Papyrology Database BCD Brussels Coptic Database

All Trismegistos partner projects9 collect information in a shared offline FileMaker server database system.10 It consists of a set of about 16 (main) relational databases with metadata. The core are over 105,000 text or document records, linked to this are about 157,000 publication entries and just over 109,000 dates. Together with the Leuven Homepage of Papyrus Collections, these documents are linked to more than 100,000 inventory numbers and about 1,400 collections; in collaboration with the Fayum project (Leuven) the more than 117,000 provenances of these documents are linked to about 7,600 places. Finally 375 archives – collected by the Leuven Archives project – could be connected in 12,600 cases to our entries. As a result of the close cooperation of all Trismegistos partner projects the database includes metadata of various kinds and scripts of written material (Table 1). TRISMEGISTOS NOW

– ONLINE VERSION

Another important aim was not only to set up this database system for our own research concerning multilingualism and language shifts in GraecoRoman Egypt, but also to share information and make our database available to all interested scholars by creating a freely accessible online platform. Based on PHP / MySQL11 Trismegistos went online in a first version in November 2006 with only limited search facilities such as publication numbers and inventory numbers. The latest version of Trismegistos went online in February 2008 at http://www.trismegistos.org/. The six main modules (Figure 1) of Trismegistos are: Texts, Collections, Archives, People, Places and Bibliography, whereas the core component is Trismegistos Texts (Figure 2).

9 10 11

HGV, APD and BCD provide us with regular updates so that we can include actual information in Trismegistos. The database structure was designed mainly by Bart Van Beek and the present author in an FileMaker Pro 7 / 8 / 9 environment. The online version was designed by Jeroen Clarysse and Bart Van Beek.

22

INFORMATION TECHNOLOGY AND EGYPTOLOGY

Greek & Latin Demotic & Abnormal Hieratic Coptic Hieroglyphic & Hieratic Meroitic Aramaic Arabic Other languages & scripts

2008

Papyrological Epigraphical Documentary Literary HGV LDAB Gr.-Lat. IGLE 52400 records 10436 records ~ 10000 records DAHT (incl. LDAB Dem.) 13975 records (incl. 654 LDAB Dem.) BCD LDAB Copt. no partner 7074 records 1590 records 1833 records HHP no partner 1509 records 3452 records no partner 940 records ATE 1150 records APD no partner 566 records no records no partner ca. 400 records

Table 1: Types of metadata in Trismegistos by types and partner sources

Figure 1: Trismegistos online portal

TRISMEGISTOS

23

Figure 2: Trismegistos Partner Projects Texts databases

In addition to the already mentioned partner projects, the CPP (Corpus of Paraliterary Papyri) is listed (at the very right), which does not provide any information but does link directly to Trismegistos Texts by using the TMnumber. More information about all of our partner projects, as well as links to their home pages, can be obtained by using the corresponding button. The new search form provides more elaborate search fields than we had in our first online version: Publication, Editor, Inventory number, Material, Language/Script, Provenance, Nome/Region and Date.12 Search criteria can be combined and every search field allows two requests by the use of the Boolean ‘OR’, ‘AND’ or ‘BUT NOT’. A short explanation accompanies every search field – for example to define the ‘strict’ button that filters the queried range of dates. The list of matching data gives the publication number(s) and the inventory number(s). To save a certain search result from Trismegistos, a 12

For dates BC, negative numbers are used. The dates in the document sheets, however, are shown in the common way using BC and AD.

24

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 3: DAHT – Demotic and Abnormal Hieratic Texts

button ‘export to .csv’ (still under development) at the top of all lists permits the download of the search results into a format which can be imported, for example, into an Excel spreadsheet. For more details about a particular record two possibilities are given: The TM ID – Link for limited metadata or the link to the partner project (e.g. the databases HHP, DAHT or HGV)13 with more elaborate information such as the type of text, details concerning the language, the use of recto and verso, the provenance, the date and so on (Figure 3).14

13 14

In those cases where a text is found in more than one partner projects all possible links are given (e.g. HGV and DAHT). The elaborate data sheets provide not only actual inventory numbers but also show older numbers that are out of use and even wrong inventory numbers used in literature. As well as starting with a general search in Trismegistos, it is possible to look up a special subset of data by using one of the partner databases that leads directly to the elaborate data sheets.

TRISMEGISTOS

25

Figure 4: Collections – LHPC data sheet

In some cases, additional literature – with key words concerning the topic of the relevant entry – is listed below the data sheet. This bibliography does not claim to be exhaustive – it is just an extract of recent or particularly relevant literature. By using the links (indicated by underscore) the user can not only obtain the full bibliographic data but also a list of related and linked texts in Trismegistos. At the left side more related data – collections info and archives info – is listed and of course linked to the respective database. The link to collections leads to a data sheet with limited data about the collection itself but it provides a list of other Trismegistos texts present there. Through cooperation with the Leuven Homepage of Papyrus Collections (LHPC) project, more detailed information concerning the collection is often provided on the LHPC data sheet (Figure 4). The archives-info link – provided in cases where a text is known to belong to a certain archive in antiquity – leads to a data sheet with detailed

26

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 5: Archives – data sheet

information about the relevant archive (Figure 5). In this case the infobox at the left shows the publication numbers of all Trismegistos texts linked to this archive. Whenever possible we try to give information concerning the provenance of a document, where it was found, written or, in the case of letters, what its probable destination was. The link provided leads the user to the places database that shows – because the example above chosen was Memphis – not only Memphis but also other nearby sites that are included under this entry. The most important variants are named as well as e.g. Greek, Demotic or Latin versions. Apart from links to Trismegistos texts that are linked with this area, a new feature, a link to Google Maps, is added. The red rectangles (grey in this figure) show places that are linked with Trismegistos (Figure 6). This is a very interesting feature, particularly for lesserknown places. Furthermore it is now possible to browse Google Maps for ancient place names in Egypt with the possibility to switch from Google

TRISMEGISTOS

27

Figure 6: Trismegistos and Google Maps

Maps directly to the Trismegistos places database. Of course place names in the Collections database are also linked to Google Maps. Starting from the core of Trismegistos, the Texts database, the main features of Trismegistos online with the databases Collections, Archives, Places and Bibliography have now been introduced here. But it is also possible to explore Trismegistos by using one of the other databases as a starting point. The people database, which was not described here, is still “under construction”. The project Multilingualism and Multiculturalism in Graeco Roman Egypt began work on the people database, but this work is now continued by the project Creating Identities in Graeco-Roman Egypt, led by Mark Depauw at the University of Leuven. Eventually all people mentioned in the Trismegistos texts should be included in this database, but in view of the very high number of individuals therein, cooperation with full-text databases, especially for Greek, is planned. Finally one more feature provided by Trismegistos should be mentioned: the Trismegistos Online Publications. This is a new series of freely downloadable PDF documents with scholarly tools based on the Trismegistos database. The first published volume is a Chronological Survey of Precisely

28

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Dated Demotic and Abnormal Hieratic Sources.15 Although this PDF of course can be printed, however, it should preferably used as a digital tool, because the entries provide a direct link to the Trismegistos online databases. The publication of a second volume listing placenames is imminent (October 2008). Trismegistos is still a ‘work in progress’. Despite the fact that there are without doubt missing and erroneous entries, we hope our current coverage is sufficient enough for Trismegistos to be a useful tool.

15

M. Depauw, C. Arlt, M. Elebaut, A. Georgila, S.A. Gülden, H. Knuf, J. Moje, F. Naether, H. Verreth, S. Bronischewski, B. Derichs, S. Eslah, M. Kromer, A Chronological Survey of Precisely Dated Demotic and Abnormal Hieratic Sources, Version 1.0 (February 2007), Köln / Leuven 2008, xiii, 232 pp: http://www.trismegistos.org/dl.php?id=4.

THE MEMPHIS DATABASE PROJECT

1000–500 BC

Claus Jurman, University of Vienna

ABSTRACT With the aim of creating a corpus of Third Intermediate Period and Late Period elite monuments from Memphis for prosopographical and sociological analysis, a flexible database capable of dealing with very different kinds of data becomes necessary. Since no commonly available ready-to-use Egyptological database software exists today, a custom-built database structure based on the common Microsoft Access application has been created for the purposes of an ongoing PhD project. Though being the product of a non-IT expert and tailored to the specific requirements of the study in question, the multi-purpose layout of the program, which covers among other things information on field archaeological contexts, dating, texts, palaeography, stylistic features, typology, personal names, titles, etc., could perhaps be successfully applied also by other Egyptologists conducting prosopographical or artefact-centred research.

INTRODUCTION TO THE PROJECT The Memphis Database Project is an outcome of my PhD dissertation entitled Memphis in der Dritten Zwischenzeit und der Spätzeit – (Selbst)Repräsentation auf Elitedenkmälern which is currently being prepared at the Institute of Egyptology, University of Vienna, under the supervision of Prof. Manfred Bietak. With the database still being a work in progress, this article cannot present definitive results. Instead it is meant to highlight certain aspects relevant to designing and working with an Egyptological database. The main goal of the aforementioned dissertation is to shed more light onto the history of Egypt’s traditional cultural capital and its citizens from c. 1000 to 500 BC by closely studying the elite monuments from votive and 29

30

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

funerary contexts preserved at the site. Since comparatively little research has so far been directed to the region of Memphis between the end of the New Kingdom and the Persian Empire,1 it was decided to create at first a corpus of Memphite monuments from the Third Intermediate Period and the Late Period. This corpus covers objects that can be attributed to individuals by means of inscriptions or other criteria, dating from Dynasties 21 to 27. When completed, one of its primary assets will be that it assembles for the first time comprehensive sets of data on the totality of the relevant Memphite material now dispersed over unnumbered collections on several continents, including those objects that have not yet been properly published or indeed published at all. The following considerations had a decisive influence on the conception and realisation of the corpus by means of a relational database: • The primary research goal of the project is to investigate the changing modes of (self)representation of the Memphite elites on different kinds of monuments during the periods in question. In order to achieve this goal it is necessary to create a corpus of artefacts which can be linked to Memphite individuals through external (> e.g. archaeological records) or internal (> e.g. inscriptions) criteria. Additionally, a tool designed for this kind of research should facilitate creating dossiers for each recorded individual in which his or her monuments/attestations are grouped together according to pre-defined criteria of classification. • The individuals linked to the objects of investigation were subject to changing political and social conditions that should in one way or another have also found their expression in the objects themselves, namely through particular features of design or ‘decorum’.2 One has to account for the fact that the production of an artefact is a complex decision process, often taking place on more than one level of agency and involving deliberate choices as well as unconscious 1

2

Apart from excavation reports on the North Saqqara Temple Precincts and on the Late Period shaft tombs at Abusir, the most noteworthy publication of the last two decades touching upon the topic of the Memphis Database Project is C.M. ZIVIE-COCHE, Giza au premier millénaire. Autour du Temple d’Isis Dame des Pyramides, Boston 1991. On the concept of ‘decorum’ within Egyptology, see J. BAINES, Fecundity Figures. Egyptian Personification and the Iconology of a Genre, Warminster 1985, 278– 279.

THE MEMPHIS DATABASE PROJECT

1000–500 BC

31

ones. In order to filter out significant choices, a tight net of qualitative and quantitative parameters has to be spread over the object of study, providing the basis for a culturally meaningful ‘thick description’ in the sense of Ryles and Geertz.3 • Each object of research is conceived as a complex entity consisting of several partly interdependent cultural features and properties for which the term ‘culturemes’ will be used.4 Such culturemes may include, for example, particular iconographic modes of depiction (e.g. gestures of adoration), stylistic details (e.g. plastic rendering of eyebrows) or the particular type of stone used for an object, its proportions, measurements, types of texts present and conventions of spelling followed, and so on. • To make efficient use of the collected data, it is imperative to have a tool at hand which is capable of identifying patterns, i.e. testing potential connections between certain culturemes or groups of features. It might be relevant, for example, to investigate the systematic presence or absence of certain features (such as the patrilineal filiation in inscriptions identifying the owner of a monument) on particular groups of objects (such as funerary monuments).5 • In the fullness of time the corpus will comprise a great variety of objects, ranging from temple statues and funerary stelae to votive bronzes, funerary papyri, administrative documents, inscribed amphorae, etc. While the focus of research unquestionably rests on the abundant source material provided by statues and stelae, the corpus data will in the end include less prominent categories such as shabtis and votive figurines as well. 3

4

5

G. RYLE, Thinking of Thoughts. What Is ‘le Penseur’ Doing?, University of Saskatchewan University Lectures 18, 1968; see also http://lucy.ukc.ac.uk/CSACSIA/Vol14/Papers/ryle_1.html (last accessed on 14-09-2008); C. GEERTZ, Thick Description: Toward an Interpretive Theory of Culture, in: idem, The Interpretation of Cultures. Selected Essays, New York 1973, 3–30; esp. 5–6; 9–10. For the term ‘cultureme’, see I. EVEN-ZOHAR, Factors and Dependencies in Culture: A Revised Outline for Polysystem Culture Research, Canadian Review of Comparative Literature 24, 1997, 22. For the tendency to conceal the name of a person’s father within Memphite funerary monuments of the Late Period, see L. GESTERMANN, Die Überlieferung ausgewählter Texte altägyptischer Totenliteratur (‘Sargtexte’) in spätzeitlichen Grabanlagen, ÄA 68, Wiesbaden 2005, 402.

32

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

• The corpus data should also be used for purposes of comparison with related material from other regions, for example, the group of statues from the first millennium BC found in the Karnak Cachette at Thebes.6 In Egyptological practice the two most common forms of data assemblage are the more or less detailed catalogue of objects with a fixed order following pre-defined categories, and the often alphabetically arranged catalogue of dossiers, which usually offers a basic selection of information on objects linked to particular individuals. In the first case, the range of objects considered is often rather limited. This makes it very difficult, if not impossible, to obtain an overview of a person’s activity at a site or in a region if this person is attested by more than one object category (e.g. by a funerary assemblage from the Saqqara necropolis, an Apis stela from the Serapeum at Saqqara and a votive statue from the temple of Ptah at Memphis). The second method of data processing has its shortcomings as well, since object-related information within a dossier entry is normally confined to the most basic level so as not to sacrifice clarity. Even if such an entry includes, for instance, the title sequence(s) and genealogical data provided by an inscribed monument, it will usually fall short of offering insight into the specific contexts in which they occur. However, these contexts are in many cases highly relevant for answering the type of questions sketched above. For example, it might be significant where a particular title occurs within a sequence or where a title sequence is positioned on an object. It could be important to know whether a divine name is part of an offering formula or appears within a biographical text as part of a priestly designation. Office-related titles may be found in separated groups or interspersed with epithets of a general nature. Certain types of genealogical data may only occur within specific contexts or in combination with particular iconographic attributes. However, it is indisputable that the registration and interpretation of these subtleties can hardly be addressed by the two methods delineated above. For the envisaged holistic approach of the Memphis Database Project, with its focus on the cultural and sociological significance of object-related properties, a different path had to be taken. As seen in Figure 1, the aim of an individual-centred monument analysis can only be achieved by 6

Cf. recently, J.-C. GOYON & C. CARDIN, Trèsors d’Égypte : la ‘cachette’ de Karnak, 1904–2004; exposition en hommage à George Legrain à l’occasion du IXe Congrès International des Égyptologues, Grenoble 2004.

THE MEMPHIS DATABASE PROJECT

1000–500 BC

33

Artefacts from a geographical entity (Memphis and its necropoleis in the early 1st millennium BC)

Corpus of monuments (catalogues)

Prosopography (dossiers)

Levels of agency (local tradition, commissioning person/ institution, workshop, artisan, etc.)

Individual-centred monument analysis

Figure 1: Scheme showing the major components of an individual-centred monument analysis

integrating the features of a detailed catalogue of objects and a collection of prosopographical dossiers. Such an integrated corpus as described in this paper should also help by enabling effective inner-corpus comparisons, and, in certain cases, by differentiating between several levels of agency contributing to the appearance of a monument. One might thus be able to filter out culturemes that are characteristic of a certain office or social position, of an owner/customer, a local workshop tradition or even an individual artisan. For all these reasons it appears as the logical choice to capitalise on the advantages of IT applications by creating a powerful and flexible database. DATABASE LAYOUT

– AIMS AND OBJECTIVES

Preceding the actual design of the database, a number of key-features were formulated without which the application would not meet the objectives. • The structure of the database should be flexible enough to ensure that it can be used, at least in principle, also within other areas of Egyptological research. By operating at a fundamental level with

34

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

basic categories of object-related data that are not specific to the Memphite corpus, but of general relevance to museum collections and archaeological excavations, a high degree of data comparability would be guaranteed. This aim could be achieved by implementing different layers of detail within each entry form for an individual data category, and by generally making sure that the database layout remains adaptable and extensible. If these principles are adhered to, it should be fairly simple to integrate new object categories requiring their own entry form by reference in part to already existing data categories (e.g. for classifying pottery one would rather choose to enlarge the already existing hierarchically organised data category ‘material’ by several thesaurus entries than create a completely new category of ‘pottery fabrics’). • The user interface has to provide mechanisms through which the large amount of information collected for each object or individual is filtered so as to bring into focus the most important features. Otherwise one will risk forsaking the advantages that are connected with a comprehensive description. The solution might lie in providing the user with summary tabs in which all the names, divinities, titles, etc. occurring on a monument are listed together, ordered by quantity or relative importance. • Designed from the start as a multi-purpose application, the database should be able to deal with philological as well as with archaeological data. However, it is not the aim to provide a tool for full-scale linguistic analysis. With the Manuel de Codage (MdC)7 being the primary method for encoding Egyptian texts, the integration of hieroglyphs into the datasets is confined to jpeg-images (photos and/or images exported from common hieroglyphic editors such as Winglyph or JSesh8 and only undertaken if deemed appropriate. • In accordance with the principle of extensibility the database should offer possibilities for integrating specific ‘add-ins’ by way of hyperlinks. For example, elements of a flowchart created with Microsoft 7

8

J. BUURMAN et al., Inventaire des signes hiéroglyphiques en vue de leur saisie informatique, 3rd ed., Informatique & Egyptologie 2, Paris 1988. There is an unofficial web page by H. van den Berg dedicated to the Manuel de Codage: http:// www.catchpenny.org/codage/ (last accessed on 16-09-2008). For the latter, see http://jsesh.qenherkhopeshef.org/ (last accessed on 16-092008).

THE MEMPHIS DATABASE PROJECT

1000–500 BC

35

Visio could be linked with individual datasets and thus form a useful tool for analysing complex genealogical information (see below page 42). After initial research, it became clear that there was currently no commonly available ready-to-use Egyptological database software meeting the requirements as laid down above. Therefore, despite my very limited expertise in the field of software engineering, I finally found myself in the position of having to customise a database environment of my own based on the common Microsoft Access application. EXPERIENCES

During the process of designing the database structure, and also at the later stage of entering the data itself, the advantages as well as the problems connected with a small-to-medium custom-built archaeological database became apparent. This will be illustrated below by a few examples. Maintaining an overview Unquestionably, a general advantage of using databases in Egyptological research lies in the fact that they permit the convenient management of large datasets. The possibility of constant updating is another feature in which a relational electronic database surpasses any kind of printed catalogue. But as has been stated above the benefit obtained through the compilation of detailed and highly structured data on an object or an individual is in danger of being neutralised by the concomitant decrease of transparency. In other words, the user might feel overwhelmed by the great amount of detail and lose the overview. To prevent this the user interface of the Memphis Database provides at the end of the entry form several tables listing key-information about the object (e.g. the divinities or people mentioned and/or depicted) which would otherwise be ‘hidden’ within the detailed accounts on texts and pictorial decoration in the preceding tables (see Figure 2). This information is generated automatically through a dataretrieval query, relieving the user not only of finding the correct data by himself, but also preventing unnecessary data duplication and thereby minimising the loss of data integrity. At the same time, one has to admit that the ‘general commentary’ section, which addresses concerns partly similar to the aforementioned summary tables (i.e. to give a basic idea of the object’s

36

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 2: Memphis Database screenshot showing the summary tables at the end of the entry form

most important features), lies outside the scope of database normalisation.9 This is because data entry here is fairly straightforward and potential updates should not pose any problem to an experienced user. As with the bibliography section, it was decided that a structured set of tables would not be required in this case. The option to conduct full-text searches should be sufficient for retrieving information from there. User considerations and potential online-accessibility As a general rule, the need for a high level of database normalisation and automated decision-making procedures is indirectly proportional to the expertise of the targeted user. Data entered into an amateur database can thus theoretically attain an equal, if not superior, degree of reliability and precision to a professionally-created one if the database is operated by a user well-accustomed both to the topic of research as well as to the software application. On the other hand, this particular fact poses certain problems 9

For the issue of normalisation and general principles of database design, see ADAMS, E.W. And STRUDWICK, N., Relational Database Design: A Tutorial and Case Study for Egyptologists, present volume, esp. pp. 188–9.

THE MEMPHIS DATABASE PROJECT

1000–500 BC

37

as soon as the decision is made to put the database online. Without doubt the only way to use a database to its full potential for dissemination and continuous updating of information is to make it accessible via the internet. However, the increase in the number of potential users (even if only passive ones) necessitates a more professional database management, which in turn depends not only on the developer’s expertise but also on financial backing and consequently cannot be guaranteed for the long term. Data-retrieval and analysis Setting up the database structure and entering the data provides the framework within which the subsequent analytical steps are performed. Fortunately it is relatively easy with Microsoft Access to customise multiparameter queries that can also be implemented into specifically designed search forms. If one were, for example, to investigate potential relations between a Memphite workshop and members of a particular family, one could devise a special query including parameters such as the significant presence or absence of certain logotypes (spelling variants), the different textual components of inscriptions, the material(s) of the object, the occurring personal names of the dedicant and the beneficiary, genealogies, officerelated titles, and so on. In comparison with more traditional methods, such complex queries also greatly enhance the chances of succeeding in disambiguating prosopographical entries of like-named individuals. But as with any other database, practical usage bears out the truism that the quality of the information gained through queries depends upon the quality of the data entered. In the case of the Memphis Database, its quality is largely defined by the quality of the thesauri devised for each data category. This is the reason why great efforts have been undertaken in order to create well-founded typologies and thesaurus hierarchies. Even though I tried to base as many of my thesauri as possible on the already existing Multilingual Egyptological Thesaurus (MET),10 the latter was often found to be too imprecise and unspecific for the purposes of the study. Nevertheless, it 10

The thesaurus was created by Dirk van der Plas and his team and is copyright protected by the Centre for Computer-aided Egyptological Research (CCER). See VAN DER PLAS, D. (ed.), Multilingual Egyptological Thesaurus, Publications interuniversitaires de recherches égyptologiques informatisées 11, Utrecht 1996; an online-version of the MET is available at http://www.ccer.nl/apps/thesaurus/index.html (last accessed on 16-09-2008).

38

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

remains an aim to make the newly created thesauri compatible with the MET or to import them into the latter. By doing so one would not only help establishing a generally agreed upon standard of Egyptological artefact classification and nomenclature, but also gain a multilingual interface that could be capitalised on when making the database publicly accessible via the internet. Standardisation vs. faithfulness No artefact, however small, can be represented within a database in its entire complexity. As a consequence, a level of detail has to be chosen that is just high enough to suit one’s requirements. Problems occur when the aim to reduce complexity by standardising features (e.g. for indexing purposes) runs counter to the wish to record a particular cultureme as faithfully as possible and thereby safeguarding against the loss of potentially significant idiosyncrasies. This conflict of interests can be illustrated by the inscriptions on a stela in the Robert Fullerton Art Museum in San Bernardino, CA, belonging to a certain Sheshonq. As Figures. 3a–c show the personal name ‘Sheshonq’ is written in a variety of ways which are not in accordance with the ‘standard’ spelling one would expect (Figure 3a). On the interpretative level of a ‘thick description’, it is perfectly clear that in Figure 3b the personal name ‘Sheshonq’ (S-S-n-q) has to be read, but with a palaeographic agenda in mind the objective would be a faithful recording of what has actually been carved, resulting in the transcription S-S-q-n. However, in doing so another problem occurs. In the common hieroglyphic editors such as Winglyph, the variety of Gardiner N35 in form of a straight line is not available and has to be represented through N17. A similar case is found in Figure 3c. There the last sign resembles a mouth (Gardiner D21) but should rather be taken to represent an egg (H8) for sA (son), indicating a filiation, as can be concluded from the following name of Sheshonq’s father. The question emerges whether a standardising interpretation should be given precedence over recording the phenomenological appearance of a textual element for clarity’s sake. If one were to trace, for example, a particular artisan and therefore were to search for all Memphite funerary stelae on which the filiation sA is written with an egg resembling a mouth, the information on the peculiar shape of H8 has unquestionably have to enter the dataset. The other side of the coin is that greater precision in recording, while enhancing the usefulness of the data for palaeographic analysis,

THE MEMPHIS DATABASE PROJECT

1000–500 BC

39

¥-S-n-q M8A-M8A-n-q-A50 ¥-S-q-n ? M8E-M8E-q:N17-A52A Fig. 3a-c Stela of Sheshonq in the Fullerton Art Museum, San Bernardino, no. 01.001.2002 a - “standard version” of PN Sheshonq

¥-S-(n)-q M8E-M8E-q:r/H8

b - PN in main text, line 4 c - PN in main text, line 6

Figure 3: Stela of Sheshonq in the Fullerton Art Museum, San Bernardino, CA, no. 01.001.2002

unfortunately greatly reduces the comparability of names or other textual elements on the prosopographical level. Though my solution to this problem may neither be very elegant nor meet the highest standards of data normalisation, it is in my opinion the most straightforward one and the easiest to handle. In cases that really matter, textual units should be represented in three different ways: firstly by a photo or an image created with a hieroglyphic editor in order to give an impression of the actual appearance of the word. As the qualitative properties of such an image are not yet susceptible to computerised data retrieval, it is necessary to introduce a second layer, the precise transliteration. This layer accounts for the transfer of the hieroglyphic image into the Manuel de Codage, with faithfully representing potential reversals of sign sequences or misspellings, as well as into the Winglyph code.11 In the third layer, the word is finally represented in a standardised transcription informed by Egyptological interpretation, which makes it usable for general search operations. Taking into consideration that in many cases the second and third layers will 11

Only the latter can be considered a true transcription as it enables to precisely reconstruct the original spelling.

40

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

be identical, the entering of data into the third layer should be regarded as compulsory whereas the other layers are merely facultative extensions.

STRUCTURE OF THE DATABASE The major structure of the Memphis Database can be seen in Figure 4. The main form of the user interface is organised into 11 tabs which comprise several sub- and sub-sub-forms. However, not all of them are relevant for each object category. The first two tabs provide information for every kind of object, listing general identifying information, categorisation, measurements, material, date, actual provenance and reconstructed origin, as well as state of preservation. The following tab gives detailed information specific to statues, indicating general type, iconography, attributes, technique of carving and the presence of two-dimensional scenes and texts. In a similar way, tab four is devoted to stelae and offers a detailed form to describe their typological properties. The basic classificatory system for shabtis devised by Schneider12 is implemented in tab five. As with several other data categories of the database, the correct entry of data is facilitated by the possibility to bring on the screen hyperlinked documents providing a graphic overview of the typological features and their respective alphanumeric codes (see Figure 5). The complete string (e.g. 5.3.1 Tc:Cl.XIA2/W38 H4 I8 B 25 Tp3b/V.VIIA) combining the typological sub-codes is generated automatically when entering the specific details. The following two tabs deal with the structure and the rendering of any existing two-dimensional scenes. While tab six offers a basic description of the scenic elements and records their sequence (making the basic layout of a scene susceptible to search operations), tab seven may be used for an in-depth description of any particular scenic element (e.g. a representation of Osiris). It goes without saying that the hierarchical structure of the database with its several sub-forms allows for the fact that a monument may contain many scenes, each of which may comprise numerous pictorial elements worthy of a separate detailed description/classification. Texts are recorded in tab eight, which is itself divided into four subtabs. Among the features recorded are the position of a text on the 12

Cf. H.D. SCHNEIDER, Shabtis. An Introduction to the History of Ancient Egyptian Funerary Statuettes with a Catalogue of the Collection of Shabtis in the National Museum of Antiquities at Leiden, Part I, Leiden 1977.

THE MEMPHIS DATABASE PROJECT

1000–500 BC

41

Structure of database

Identification, object category, measurements, material

Dating, provenance (contextualisation of information is imperative!)

{

Statue: typological criteria Stela: typological criteria

}

Shabti: typological criteria

Scene-description, sequence of pictorial elements, technique

(

Iconography (detail): typological and stylistic description of pictorial element

)

Texts: number, typology, position, sequence of text elements, characteristic logotypes, palaeography, technique

Typological description of pictorial element Text element – detail: epigraphic classification, spelling variants, typology, important information

Summary 1: mentioned/depicted individuals, deities, toponyms etc. (contextualisation)

Summary 2: social context and evaluation of craftsmanship, workshop, bibliographies, comments

Summary 3: text commentary, potential genealogical information

Links to supplementary materials (additional images, documents, etc.)

Figure 4: Basic structure of the Memphis Database

Figure 5: Memphis Database screenshot showing external window helping in the process of data entry

42

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

monument, its state of preservation/completeness, a continuous transcription (standardised), significant logotypes (spelling variants of common words and phrases that offer themselves for comparisons), palaeographic property of selected signs, a sequential list of textual elements in order to make the basic composition of a text searchable and comparable to similar texts, techniques of carving, and so on. The potential multiplicity of texts or of textual elements, logotypes, words or signs within a single text is taken care of in a fashion similar to the handling of pictorial elements. Finally, the last three tabs contain summary information automatically retrieved from the scene- and text-sections, a list of primary and secondary publications as well as a large text-box for any general comments on the monument and its features. These last sections are designed to accommodate unstructured free-text entries. The reasons for this decision are as follows: firstly, the bibliography of any given object is not the subject of comparative research and therefore does not require a structured table of its own, and secondly, the full-text searchability of these text-boxes is sufficient for finding a particular publication, an author or a key-word within a commentary.

‘ADD-INS’ In order better to understand the often very complex (sometimes even confusing and contradictory) genealogical data contained in texts an attempt has been made to visualise them by translating them into flowcharts created with Microsoft Visio (see Figure 6). As Visio allows linking any given shape within the chart with an Access dataset, it is possible not only to keep the charts permanently in accordance with the database entries but also to model certain properties of Visio shapes on the latter (e.g. retrieving name and principal titles of an individual from Access data). Whereas the shapes of the Visio flowchart refer to individuals in a manner similar to that of a common genealogical tree, the totality of objects outlined in a particular colour represent the information provided by a single monument (colours are represented in greyscale in Figure 6). Through adding additional layers in different colours the entire network of information associated with a certain family group can be depicted within the chart. By superimposing or masking certain layers, full transparency as to what information is provided by which object is ensured. A critical evaluation of the ancient accounts is thus facilitated. A timeline of royal reigns added to the right of the chart offers the possibility to fix the position of monuments containing regnal

THE MEMPHIS DATABASE PROJECT

1000–500 BC

43

Psammetichus I, year 12

Wnn-nfr People index

People index

8

Title

Hrj pDw.t

Status

mAa-xrw

Psammetichus I

Apr-al-Sdw

Hm.t=f

9

Title

nb.t pr Status

mAa.t-xrw

Psammetichus I sA

Pf-TAw.a.wj-m-BAst.t PA-Ra-ms-sw People index

¤[email protected]@rw §s-As.t-pr.t

Hm.t=f Hm.t=f

Title

nb-t-pr

Status Status

mAa-xrw

sA sA

54

2

Title

jmj-rA aXnwtj

Status

Regnal years

People index

1

mAa.t-xrw mAa.t-xrw

mw.t

Necho II

Nfr-Htp

Regnal years

¡rw-wDA People Status index Title

15 5 mAa-xrw jtj-nTr

Status

anx D.t

Regnal years 6

sA

Psammetichus II Regnal years 19

Pf-TAw.a.wj-m-BAst.t People index Title

Hm.t=f

People index

jmj-rA aXnwtj, jmj-rA pr wr n dwA.tnTr

Status

mAa.t-xrw

mw.t

¡rw-wDA

Amasis

Nfr-rnp.t 5 jtj-nTr

Status

nb-t-pr

Status

sA

Title

2

Title

mAa-xrw

People index

Apries

§s-As.t-pr.t 1

anx D.t

sn smsw

People index Title Status

4

Amasis, year 36

jmj-rA Snwtj anx D.t

Regnal years 2 generations minimum

44 Regnal years 0,5

Figure 6: Genealogical flowchart created with Microsoft Visio

Psammetichus III

44

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

years within the absolute (or relative) historical chronology. Thereby the generational intervals of the entire genealogical framework can be adjusted in accordance with the chronological anchors. CONCLUSION AND OUTLOOK

As could hopefully be demonstrated, the Memphis Database Project is not a tool designed to serve exclusively the needs of a PhD dissertation on Late Period Memphis, but has also certain potential to facilitate other Egyptological research in line with the principles of the individual-centred monument analysis. Before the database itself or its core structure can be utilised by other researchers, however, much remains to be done. One of the next steps will consist of implementing a fully integrated extension or counterpart to the existing database in which key information on individuals is grouped together in a dossier style (see also above page 32). It is also clear that it would require the hard work and dedication of people with considerably more IT expertise than I, as well as sufficient funding, to turn this product of an amateur into an efficient and user-friendly tool for data collection and analysis. From an Egyptological point of view, the key questions for the future revolve around the creation, management and extension of the thesauri that are at the very core of the database project and largely determine its quality and relevance. The relation of the already existing thesauri and those yet to come to the MET is an issue of great importance which needs to be clarified before more work is done on these levels. Should archaeological missions working in the Memphite region be willing to contribute their material to the database as well, it would be even more vital to agree upon common typological criteria and terminologies.

EDUCATIONAL IMAGES ON THE WEB

Edward Loring

ABSTRACT

This paper considers the need for making Egyptological photographs accessible for scholars and the general public on the web. Today, practically all relevant institutions have accumulated a considerable number of digital photos, more than they will ever be able to publish in print. It is a great waste of effort if this material is not shared with others through publication on the web. Specific cases of research projects demonstrating the vital importance of images are detailed here and information on web publishing is provided for those who do not have their own web-sites. Sharing images is a win-win situation for all concerned.

The members of I&E stand at the critical and ever more traveled junction between information technology and the humanities. All have, in one way of another, an educational mission. The Internet provides a unique opening for educators to present their material, be it textual or graphic, in a manner which makes it available to all interested individuals and institutions worldwide. It is said that one picture is worth a thousand words. Without a picture many things are difficult or impossible to communicate. Obviously this is especially true in the area of art and to some extent also in the areas of science and technology. A picture combined with text, pointing out a certain element is often the only way to present a convincing argument in professional discussion. Let us take a specific example: Subsequent to the re-excavation of the Royal Cache, TT320, the Centre for Egyptological Studies of the Russian Academy of Sciences (CESRAS) and the Russian Institute of Egyptology in Cairo (RIEC) have pursued a continuing study of the art and artifacts of the 21a Theban 45

46

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Dynasty.1 As the painted coffins of that period are a principal source of cultural information, emphasis has been placed on an analysis of these. In the literature they are generally referred to as ‘yellow coffins’. An in-depth study of digital images made in the Egyptian National Museum in Cairo and in museums of the former Soviet Union reveals that the decoration of many of these coffins, actually the better ones, was painted on a white background and that the present yellow colour was caused by degradation of the pistacia resin varnish applied to protect the surfaces. The generally prevailing dark green colour is a result of this yellowing over designs painted in a light to medium blue. There is considerable resistance among Egyptologists to the white background theory. A number of the later coffins from Bab el-Gasus distributed in European museums do have backgrounds of an ochre wash (not yellow) below yellowed varnish. Technically speaking, the outline draughtsmen would not have been able to draw with water-soluble ink on a varnished surface. The new Armenian Egyptology Centre of Yerevan State University is conducting a project with CESRAS participation to exactly reproduce the painting and varnishing techniques of the period. It would be impossible to prove the case for white backgrounds without detailed colour images. An individual attempting to present the case without such would only be able to refer to specific surface areas of coffins in various museums. The argument for white backgrounds would remain unproven. The necessary images could, of course, be published in printed media but the cost for doing so would be out of the question for any Egyptological journal. In any case, publication in any professional journal with a generally very limited number of copies printed would not make the material available to all Egyptologists, much less to art historians and to a broader culturally oriented public. CESRAS is continually posting evidence for the above argument on its web-site,2 thus making up-to-date research results available to anyone anywhere. Everyone everywhere is able to use our images for their own (noncommercial) purposes. To participate in the project, and is a position to comment on our arguments. We believe that it is the duty of all Egyptological and related institutions to make their visual material available to the public in this manner. 1 2

The term ‘21a Theban Dynasty, ‘D21a’ or ‘21a’ refers to the Theban line of rulers contemporary with those of the 21st Dynasty in Tanis. http://www.cesras.org, under ‘Technology’.

EDUCATIONAL IMAGES ON THE WEB

47

There are many institutions and private individuals with considerable numbers of images which would be of value to Egyptologists in general. Certainly not all of these images are of ‘publication’ quality, but in most cases would none the less be of value to researchers. CESRAS has posted many such images from the early days of digital photography. Admittedly many of them are of very poor quality but they may be the only photos of many objects which will ever be published. A number of the 21a Dynasty painted coffins in museums of the former USSR are in such a miserable state, after seventy years of neglect during the Soviet Period, improper storage, and abuse from transport, that they can neither be moved from their largely inaccessible places of storage, not could they be restored. The problem of moving coffins for complete investigation is another serious problem which we shall address on cesras.org As the volume of information on the web grows, there is a growing necessity for methods to locate specific information efficiently. There are two principle ways to publish images of the web: on web-sites and in photography archives such as Yahoo Flickr. CESRAS is continually posting images in both manners. The basic use of Flickr is gratis. One need only have a gratis Yahoo address to make a private archive. For a small yearly charge a professional membership allows unlimited space for archiving images. The CESRAS Flickr site is http://www.flickr.com/photos/ horemachet. Over three thousand images are to be found there, all in the public domain for free use and the number is growing. Further, CESRAS has a Flickr photo-group with over 4000 images named Egyptology: http:/ /www.flickr.com/groups/egyptology. Flickr members can become members of this group and can then post their images there in a secure archive. A number of excellent photographers participate in the group. All photos are indexed with keywords and can be easily searched. For those who maintain web-sites the best way to make your images searchable is through a considered choice of meta-tags. These are picked up by the various web-crawlers and can be used in combination. The terms used should be short and clear for everybody to understand. The combinations should be well thought out. As dating is vital for Egyptological material, a universal code should be implemented: dynasties tagged D1, D2, D3… An equivalent code should be devised for the reigns of the kings. This is a matter which could be discussed and worked out in an Egyptological forum. The EEF3 would certainly be best for this purpose as many I&E members are regular contributors.

48

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

A further possibility to index images or groups thereof is the new ‘open source’ database project at http://www.myowndb.com. This project allows anyone to create their own database at no cost. Each database is a complex matrix which can be searched from any point, much like our GALEXYS system which no one has yet been able to put on the web. We suggest that all those interested in relational databases look into this. It is being managed by dedicated information scientists, and, as it is open source, we can expect good future development. The present search is incremental but we are in communication with the developers and hope to be able to implement the more powerful multidimensional GALEXYS matrix there. The web can and should be used as a tool in research projects requiring photographic information from many sources which could not all be visited by a single researcher. CESRAS has just embarked on a test of this promising method. Here is the specific case. In the study of the coffins mentioned above it is essential to have details on as many objects as possible. The Bab el-Gasus coffins are the most important source of iconographic information. These coffins are scattered about in many locations, some, such as Voronezh, Kazan and Odesssa virtually inaccessible. CESRAS has published, or is publishing, these coffins and has been able to photograph in several important locations. Still, the research material available for our study is very limited. The present study is iconographic with the goal of determining which coffins were decorated by the same artist(s). The study is parametric and will form a structured database of iconographic elements. The first parameter selected is women’s faces and busts. These are being presented on cesras.org and in Flickr for colleagues with access to 21a Theban coffins to study and compare with equivalent images on their examples. Hopefully said colleagues will make photos available on the web to be integrated into the database. This material will then be available to all scholars, students of the graphic arts etc. As time allows, further parameters will be introduced. The next planned is ‘men’s heads’. There it will be interesting to note, for example, that a certain well represented draughtsman (the ‘Vierstrichzeichner’) used completely different lines for male and female faces. This short paper only scratches the surface of the vast possibilities for sharing graphic knowledge on the web. There is more than enough material for all of us and everyone will gain through increased sharing, a win-win situation. 3

http://showcase.netins.net/web/ankh/eefmain.html

DAS GEFLÜGELTE KROKODIL: CODIERUNG VON TOTENBUCH-VIGNETTEN

Marcus Müller-Roth

ABSTRACT

The Book of the Dead is one of the most important and most frequent attested religious texts of Ancient Egypt. Although approx. 50% of all handwritings also show vignettes, which illustrate the spells, the research concentrates nearly exclusively on the texts. Even new editions often confine themselves to a description of the vignettes or offer just an image. A comparative analysis of the iconography or the style is missing. Often parallels are taken into account only in the philological comment. Especially little attention is given to the vignettes of the Book of the Dead of the Saite Recension. In comparison with its precursors of the New Kingdom they are regarded as standardized. Therefore the researchers trust in P. Turin 1791 as a reference. This papyrus was published by Richard Lepsius in 1842 and is regarded as a parallel for the texts as well since that time. But the status of P. Turin 1791 is unclear. Only a small amount of the 1400 known handwritings with vignettes is published. Therefore the multiplicity of the variants is largely unknown. An overview of the material shows that one cannot orientate oneself to P. Turin 1791 without problems. The variants are much more numerous than previously assumed. It can be assumed that local styles are associated with this variants. The article shows how the mass of 1400 manuscripts can be processed to be able to investigate the local variants.

49

50

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

1. FORSCHUNGSGESCHICHTE Das Totenbuch zählt zu den bedeutendsten und am häufigsten belegten religiösen Texten des Alten Ägypten.1 Obwohl die Hälfte aller Handschriften neben den Sprüchen auch Vignetten besitzt, die einen erheblichen inhaltlichen und formalen Anteil der Handschriften ausmachen, konzentriert sich die Forschung fast ausschließlich auf die Texte. So werden die Vignetten selbst in aktuellen Editionen, wenn überhaupt, meist nur beschrieben.2 Dagegen vermisst man, dass Stil und Ikonografie wie die Texte anhand von Parallelen analysiert werden.3 Besonders wenig Aufmerksamkeit wird den Vignetten der spätzeitlichen Totenbücher geschenkt, weil sie im Vergleich zu ihren Vorläufern als standardisiert gelten. Von den spätzeitlichen Handschriften mit Vignetten ist jedoch nur ein Bruchteil publiziert, so dass das Spektrum der Varianten weitgehend unbekannt ist. Auch die vor circa 20 Jahren entstandenen Überblickswerke zu den Vignetten leisten nur einen eingeschränkten Dienst. Henk Mildes Dissertation zum Totenbuch des Neferrenpet bietet zwar einen guten Einblick in die Entwicklung vieler Vignetten. Er behandelt auf der Grundlage des ihm 1

2

3

Vgl. T.G. ALLEN, The Book of the Dead or Going Forth by Day. Ideas of the Ancient Egyptians Concerning the Hereafter as Expressed in their own Terms, SAOC 37, Chicago 1974; R.O. FAULKNER, The Ancient Egyptian Book of the Dead (hg. von C. ANDREWS), New York 1972 sowie E. HORNUNG, Das Totenbuch der Ägypter, Die Bibliothek der Alten Welt, Zürich/München 1979 (München 19932). Die vollständige Bibliografie bei S.A. GÜLDEN/I. MUNRO, Bibliographie zum Altägyptischen Totenbuch, SAT 1, Wiesbaden 1998. Eine zweite, erweiterte Auflage ist als SAT 13 in Vorbereitung. Vgl. G. LAPP, The Papyrus of Nu, Catalogue of the Books of the Dead in the British Museum I, London 1997, 58–60; ders., The Papyrus of Nebseni, Catalogue of the Books of the Dead in the British Museum III, London 2004, 53– 5 sowie I. MUNRO, Das Totenbuch des Pa-en-nesti-taui aus der Regierungszeit des Amenemope, HAT 7, Wiesbaden 2001, 60–8. Vgl. I. MUNRO, Der Totenbuch-Papyrus des Hor aus der frühen Ptolemäerzeit, HAT 9, Wiesbaden 2006, 58–73; M. MOSHER, The Papyrus of Hor, Catalogue of the Books of the Dead in the British Museum II, London 2001, 12–22 und 96– 108 sowie M. VON FALCK, Das Totenbuch der Qeqa aus der Ptolemäerzeit (pBerlin P. 3003), HAT 8, Wiesbaden 2006, 59–65. Gegen die Bewertung im letzten Beitrag vgl. M. MÜLLER-ROTH, Lokalkolorit in Schwarz-Weiß, in: B. BACKES/M. MÜLLER-ROTH/S. STÖHR (Hgg.), Ausgestattet mit den Schriften des Thot. Festschrift für Irmtraut Munro zu ihrem 65. Geburtstag, SAT 14, Wiesbaden 2009, 119–31.

DAS GEFLÜGELTE KROKODIL

51

vorliegenden Papyrus aber nur etwa ein Drittel aller Vignetten und konzentriert sich auf die so genannte Thebanische Rezension. Für die Saitische Rezension bietet er dagegen kaum Material.4 Die ausführlichere Arbeit zu den Vignetten der Saitischen Rezension von Malcolm Mosher schafft nur teilweise Abhilfe. Seine Auswertung ist zwar vollständiger und sein Ansatz durchaus lobenswert, aber auf einer Basis von circa 40 Handschriften nur bedingt referenzfähig. Inzwischen sind aus diesem Zeitraum immerhin 1400 Handschriften mit Vignetten registriert. Hinzu kommt, dass die Publikation dieser Arbeit noch immer aussteht und deshalb bisher nur schwer zugänglich ist.5 Bisher liegen auch nur wenige Studien zu einzelnen Vignetten vor. Immerhin widmen sich verschiedene Forscher den großen Vignetten, die die gesamte Höhe des Schriftspiegels einnehmen: Bereits vor 30 Jahren untersuchte Christine Beinlich-Seeber die Szenen des Totengerichts (Tb 125).6 Vor zehn Jahren folgte Judith Gesellensetter mit einer Studie zur Darstellung des Binsengefildes (Tb 110).7 Noch aktueller sind die Arbeiten von Jana Budek zu V 15 8 und von Tarek Tawfik zu V 1.9 Trotz aufkommenden Interesses an den Vignetten der Totenbücher und deren Wertschätzung für die Erforschung der Handschriften herrscht weitgehende Unkenntnis über deren Bandbreite. Deshalb wird aus Mangel an Überblicksstudien blind auf Referenzen wie P. Turin 1791 vertraut, den 4 5

6 7

8

9

H. MILDE, The Vignettes in the Book of the Dead of Neferrenpet, EU 7, Leiden 1991. M. MOSHER, The Ancient Egyptian Book of the Dead in the Late Period: a Study of Revisions Evident in Evolving Vignettes and Possible Chronological and Geographical Implications for Differing Versions of Vignettes, Unpubl. Dissertation Berkeley 1989. Das einzige mir bekannte Exemplar ist auf Mikrofilm in der Universitätsbibliothek Heidelberg einsehbar (Signatur: 2006 RA 3). CHR. SEEBER, Untersuchungen zur Darstellung des Totengerichts im Alten Ägypten, MÄS 35, München/Berlin 1976. J.S. GESELLENSETTER, Das Sechet-Iaru. Untersuchungen zur Vignette des Kapitels 110 im Ägyptischen Totenbuch, Würzburg 1997. URL: http://www.opus-bayern.de/uni-wuerzburg/volltexte/2002/375/. J. BUDEK, Die Sonnenlaufszene. Untersuchungen zur Vignette 15 des Altägyptischen Totenbuches während der Spät- und Ptolemäerzeit, in: SAK 37, 2008, i. Dr. T. TAWFIK, Die Vignette zu Totenbuch-Kapitel 1 und vergleichbare Darstellungen in Gräbern, Unpubl. Dissertation Bonn 2008.

52

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Abb. 1: V 137 in P. Turin 1791

Abb. 2: V 137 in P. Paris BN 129–136

Abb. 3: V 149a in P. Turin 1791

Abb. 4: V 149a in P. Wien ÄS 3862

Richard Lepsius vor über 160 Jahren edierte, und der auch als Referenzgrundlage für Textverweise dient.10 Der folgende Beitrag widmet sich mehreren Fragen: Zunächst wird überprüft, ob P. Turin 1791 seiner Bedeutung als Referenz gerecht wird. Darauf aufbauend wird gezeigt, wie man Vignetten unabhängig vom umgebenden Text identifiziert, wenn dieser fehlt, zerstört ist oder es offensichtlich ist, dass er nicht zur Vignette gehört. In diesem Zusammenhang wird aufgezeigt, dass sich in den unterschiedlichen künstlerischen Ausführungen lokale Stile spiegeln, anhand derer man die Herkunft einer Handschrift ermitteln kann. Abschließend soll vorgestellt werden, wie die Menge von 1400 Quellen aufgearbeitet und bewältigt werden kann, um 10

R. LEPSIUS, Das Todtenbuch der Ägypter, Leipzig 1842 (Neudruck Osnabrück 1969). Vgl. B. BACKES, Wortindex zum späten Totenbuch (pTurin 1791), SAT 9, Wiesbaden 2005.

DAS GEFLÜGELTE KROKODIL

53

Untersuchungen nach obigen Fragestellungen zu ermöglichen und um eine Klassifizierung vorzunehmen.

2. P. TURIN 1791 – EIN STANDARD? Wenn man sich auf einige wenige Handschriften verlässt und von diesen abhängig macht, müssen mehrere Kriterien sichergestellt sein: 1. Die Dokumentation bzw. Publikation der Quelle ist zuverlässig. 2. Zu allen Sprüchen, die grundsätzlich eine Vignette besitzen, existiert ein Beleg.11 3. Die Referenz zeigt den am häufigsten verwendeten Typ der Vignette oder seine Charakteristika. 2.1 Die Zuverlässigkeit der Dokumentation Im Gegensatz zu modernen Editionen, die das Material auf Fototafeln vollständig dokumentieren, zeichnete Lepsius bei P. Turin 1791 sowohl Vignetten als auch Text um. Dabei schlichen sich allerdings Fehler ein. In V 137 trägt der Verstorbene ein spitz zulaufendes Objekt, dass einem Messer ähnelt (Abb. 1). Es handelt sich aber um ein stumpf endendes Sechem-Zepter, wie wir es auch von anderen Quellen kennen (Abb. 2). Ein Blick in die jüngere Fotodokumentation von Boris de Rachewiltz zeigt, dass P. Turin 1791 das Zepter ebenfalls ausführt. Die Umzeichnung von Lepsius ist lediglich unsauber.12 Eine weitere Ungenauigkeit liegt in V 149a vor. Der Dämon des ersten Hügels besitzt in P. Turin 1791 nach Lepsius einen runden Kopf, von dem gerade, kurze Haarstoppel abstehen, und eine lange, spitze Nase, die ihm ein vogelartiges Aussehen verleiht (Abb. 3). Tatsächlich bildet eine Art Pflanzenstaude seinen Kopf, eine tulpenähnliche Mitte umrandet von zwei Uräen (Abb. 4).13

11 12 13

Zu Tb 2–14 und 96–97 existiert prinzipiell keine Vignette. Siehe B. DE RACHEWILTZ, Il Libro dei Morti degli Antichi Egizi. Papiro di Torino, Rom 1986, Tafeln (CXXXVII). Siehe DE RACHEWILTZ, Libro dei Morti, Tafeln (CXLIX). Das Motiv spielt wohl auf Tb 149a, 1 an. Vgl. MOSHER, Vignettes, 413.

54

INFORMATION TECHNOLOGY AND EGYPTOLOGY

Abb. 5: V 19 in P. Paris Louvre E. 7716

Abb. 6: V 42 in P. Leiden T 16

2008

DAS GEFLÜGELTE KROKODIL

55

2.2 Die Vollständigkeit des Papyrus P. Turin 1791 ist unvollständig und weist an zwei Stellen eindeutige Lücken auf. Dabei handelt es sich nicht um Sprüche, die keine spezifische Vignette besitzen14 oder denen in anderen Handschriften eine einfache Darstellung der Verstorbenen als Verlegenheitslösung zugewiesen wurde.15 Vielmehr sind es sehr charakteristische Motive, die wir aus anderen Quellen kennen. Die erste Lücke tut sich bei Tb 19 und 20 auf.16 Dagegen kennen wir aus etwa 60 Quellen zusammen rund 70 Belege für die Vignette der beiden Sprüche.17 Leitmotiv ist der Kranz der Rechtfertigung, der dem Verstorbenen verliehen wird. Am häufigsten steht er dabei dem Gott Atum gegenüber, der dem Verstorbenen laut Tb 19, 1 den Kranz flechtet und überreicht (Abb. 5). Die zweite Lücke besteht bei Tb 42.18 Etwa ein Dutzend anderer Quellen zeigt hier eine Götterreihe aus bis zu 22 Mitgliedern. Sie stehen im Kontext der Gliedervergottung, die Tb 42 behandelt (Abb. 6).19 2.3 Charakteristika in P. Turin 1791 Als Drittes stellt sich die Frage, ob P. Turin 1791 in den einzelnen Vignetten überhaupt die typische bzw. häufigste Form des Motivs überliefert. Eine Überprüfung ergibt, dass der Papyrus über zehn Mal nicht die Ausführung zeigt, die von der Mehrheit der Quellen bevorzugt wird. Manchmal zeigt

14 15 16 17

18 19

Siehe Anm. 11. Siehe Abb. 9 sowie LEPSIUS, Todtenbuch, Tf. XX (V 46), XXI (V 48, 49, 51), XXV (V 65, 66, 67), XXII (V 73) und XXVII (V 76). Siehe LEPSIUS, Todtenbuch, Tf. XIII-XIV. Vgl. E. HASLAUER, Eine Mumienmaske mit dem „Kranz der Rechtfertigung“, in: Jahrbuch des Kunsthistorischen Museums Wien 6/7, Mainz 2006, 233–39; M. MOSHER, Five Versions of Spell 19 from the Late Period Book of the Dead, in: ST.E. THOMPSON/P. DER MANUELIAN (Hgg.), Egypt and Beyond. Essays presented to Leonard H. Lesko, Providence 2008, 237–60 sowie M. MÜLLERROTH, Der Kranz der Rechtfertigung, in: A. MANISALI/B. ROTHÖHLER (Hgg.), Festschrift Assmann zum 70. Geburtstag, i. Dr. Siehe LEPSIUS, Todtenbuch, Tf. XIX. Vgl. S. STÖHR, Who Is Who? Die Repräsentanten der Gliedervergottung in der späten Vignette zu Tb 42, in: B. BACKES/M. MÜLLER-ROTH/S. STÖHR (Hgg.), Ausgestattet mit den Schriften des Thot. Festschrift für Irmtraut Munro zu ihrem 65. Geburtstag, SAT 14, Wiesbaden 2009, 175–200.

56

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Abb. 7: V 43 in P. Turin 1791

Abb. 8: V 43 in P. Paris Louvre N. 3248

Abb. 9: V 51 in P. Turin 1791

Abb. 10: V 51 in P. Paris Louvre N. 3248

P. Turin 1791 nur eine Variante, manchmal ist das gewählte Motiv jedoch stark verändert. Auch hierfür mögen zwei Beispiele genügen: Zu Tb 43 zeigt P. Turin 1791 den Verstorbenen vor drei Göttern, die Was-Zepter tragen (Abb. 7). Diesen Typ für V 43 besitzt immerhin etwa ein Dutzend Quellen. Etwa drei Mal häufiger ist dagegen das gleiche Motiv mit nur einem einzigen Gott. Außerdem fügen fast doppelt so viele Handschriften zwischen dem Verstorbenen und dem Gott drei Köpfe ein (Abb. 8). Sie gehen auf den Text zurück, einem „Spruch, zu verhindern, dass der Kopf des NN ihm abgeschnitten wird im Totenreich.“ Auch zu V 51 liefert P. Turin 1791 ein vereinfachtes Motiv. Er bildet nur die Grundform mit dem Verstorbenen ohne jegliche andere Gestalten oder Attribute ab (Abb. 9). Darin folgen ihm zwar etwa zehn Handschriften und einige weitere geben dem Verstorbenen zusätzlich lediglich einen Stab in die Hand. Acht Quellen erweitern das Motiv dagegen um drei auf dem Kopf stehende Verdammte vor dem Verstorbenen, entsprechend dem Inhalt des Spruchs, „nicht kopfüber zu gehen im Totenreich“ (Abb. 10).

DAS GEFLÜGELTE KROKODIL

Abb. 11: P. Kairo J.E. 97249 (Papyrus 5), Fragmente 1–3

57

58

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

3. VARIANTEN Es ist deutlich geworden, dass P. Turin 1791 bei der Identifikation von Vignetten alleine nur bedingt wertvoll ist und selbst ein Überblick über das bisher publizierte Material oft nicht ausreicht. Im Folgenden wird an einigen Beispielen gezeigt, wie nützlich aber auch problematisch es ist, eine möglichst breite Basis an Vergleichsmaterial zu besitzen. Dazu werden zwei Fälle besprochen, in denen zum Vignettenrest kein Text erhalten ist, so dass die Identifizierung allein anhand der Charakteristika der Abbildung erfolgen muss. Beide Fragmente stammen aus Theben-West, wo sie zwischen 1963 und 1969 aus dem Umfeld der Gräber TT 386, 389 und 410 geborgen wurden.20 3.1. Identifizierung der Vignette Das erste Fragment zeigt in der Mitte einen Vogelschwanz, dessen oberes Ende in einen aufrechten Nacken übergeht. Unterhalb des Schwanzes ist noch eine Linie vorhanden, die zum Körper und speziell wahrscheinlich zu den Beinen der Gestalt gehört (Abb. 11, Fragm. 3). Günther Burkard ergänzte die Reste zu einem Ba-Vogel aus V 85. Für diese Deutung sprechen der steile Rückenverlauf sowie die senkrechten Beine, die mit den Vogelgestalten in V 83, 84 und 86 nicht übereinstimmen (Abb. 12).21 Der aufrechte Nacken und der winzige Rest auf der gegenüber liegenden Seite lässt dagegen auf eine Ergänzung nach V 88 schließen. Dieser liegt eigentlich überhaupt keine Vogelgestalt zugrunde, sondern zeigt ein mumifiziertes aufrecht stehendes Krokodil. Die meisten Handschriften führen es ohne weitere Zusätze aus (Abb. 13a). Lediglich eine Minderheit gibt dem Krokodil ein Was-Zepter in die Hände (Abb. 13b). In vielen Handschriften besitzt das Krokodil an seiner Rückseite aber noch einen Fortsatz. Dieser 20

21

Die Gräber publiziert bei D. ARNOLD, Das Grab des Jnj-jtj.f, Grabung im Asasif 1963–1970 I, AV 4, Mainz 1971; J. ASSMANN, Das Grab des Basa (Nr. 389) in der thebanischen Nekropole, Grabung im Asasif 1963–1970 II, AV 6, Mainz 1973 sowie DERS., Das Grab der Mutirdis, Grabung im Asasif 1963–1970 VI, AV 13, Mainz 1977. Zu den folgenden Ausführungen vgl. M. MÜLLER-ROTH, Papyrusfunde aus dem Asasif: Nachträge, in: MDAIK 65, 2009, i. Dr. Vgl. G. BURKARD, Die Papyrusfunde, Grabung im Asasif 1963–1970 III, AV 22, Mainz 1986, 36f. mit Tf. 25c.

DAS GEFLÜGELTE KROKODIL

a) V 83

b) V 84

c) V 85

d) V 86

59

Abb. 12: Vignetten mit Vogeldarstellungen in P. Turin 1791

hat in der Regel die Form eines Krokodilkörpers (Abb. 13c). P. London BM 10315 und P. Lyon H 1579–1583 belegen aber auch eine Variante mit einem Vogelschwanz (Abb. 13d).22 Da es nur zwei weitere Belege für dieses außergewöhnliche Motiv gibt, ist offensichtlich, dass die Vignette auf P. Kairo J.E. 97249 (Papyrus 5) ohne die Kenntnis möglichst aller Belege nicht zu identifizieren ist. Schätzt man den Wert der Identifikation dieser Vignette allein noch gering ein, so liegt aber nahe, dass sich dahinter ein lokales Merkmal verbirgt. Da die vorliegenden Fragmente des P. Kairo J.E. 97249 (Papyrus 5) in situ gefunden wurden, kann der Befund dazu dienen, die beiden anderen Handschriften ebenfalls Theben zuzuweisen. Deren Provenienz ist nämlich unbekannt. Während das erste Beispiel zeigt, wie eindeutig eine Vignette bestimmt werden kann, deren Reste bisher fraglos auf ein anderes Motiv hindeuteten, so kann die Bandbreite des Vergleichsmaterials gleichwohl dazu führen, 22

Der Vollständigkeit halber muss erwähnt werden, dass P. London BM 10097 und P. London BM 10253 das Krokodil in V 88 ausnahmsweise als unmumifiziertes Tier in natürlicher Gestalt wiedergeben. Beide Szenen sind unpubliziert.

60

INFORMATION TECHNOLOGY AND EGYPTOLOGY

a) P. Turin 1791

b) P. London BM 10558

c) P. Berlin P. 3149

d) P. London BM 10315

2008

Abb. 13: Die Vignette zu Tb 88

Abb. 14: pKairo J.E. 97249 (Papyrus 17)

Abb. 15: V 89 in P. Turin 1791

DAS GEFLÜGELTE KROKODIL

61

anscheinend eindeutige Identifikationen zu relativieren, weil das vorliegende Motiv von mehreren Vignetten bekannt ist. Auf einem anderen Fragment ist ein Vogel mit ausgebreiteten Schwingen zu sehen, dessen Kopf jedoch zerstört ist. Darunter sind Reste einer anthropomorphen Mumie erhalten (Abb. 14). Es handelt sich um die prominente Szene, in der der Ba-Vogel über dem aufgebahrten Leichnam schwebt.23 Das Motiv ist zumindest in P. Turin 1791 sowohl aus V 17 als auch V 89 bekannt (Abb. 15). Obwohl von über 200 spätzeitlichen und ptolemäischen Quellen, auf denen Reste von V 17 erhalten sind, nur P. Paris Louvre N. 3081 einen Ba-Vogel über dem Leichnam zeigt, kann eine Entscheidung zugunsten V 89 nicht eindeutig ausfallen.24 Darüber hinaus kommt die Szene in Ausnahmefällen nämlich auch in zwei weiteren Vignetten vor. So zeigen die Achmimer Handschriften P. Hildesheim 5248 und P. MacGregor auch in V 151 einen Ba-Vogel über dem Leichnam, obwohl er dort in der Regel nicht erscheint.25 P. Genf 23464/1–6, P. Kairo J.E. 32887 (S.R. IV 930) und P. London BM 9902 führen den Ba zudem in V 154 aus.26 Obwohl die meisten Belege außerhalb von V 89 Ausnahmen sind, und die Szene einzig für V 89 der Standardtyp ist, so ist doch klar, dass das Motiv des Ba-Vogels über dem Leichnam prinzipiell über allen Szenen erscheinen kann, in denen der aufgebahrte Leichnam vorkommt. Natürlich kann die Provenienz, sofern sie bekannt ist, die Wahrscheinlichkeit einer Identifikation erhöhen. Trotzdem ist eine sichere Zuordnung grundsätzlich fraglich. Die Beispiele zeigen, dass eine Übersicht über die Bandbreite der Variationen nicht nur eine Neuidentifikation zur Folge haben kann, sondern scheinbar eindeutige Identifikationen relativieren kann. Um hierüber einen schnellen Überblick zu gewinnen und allgemein für die Arbeit mit Vignetten, regte mein Kollege Burkhard Backes an, analog zum Wortindex, mit dessen Hilfe man Totenbuchpassagen selbst winziger Fragmente identifizieren kann, einen entsprechenden Motivindex zu erstellen.27 Dort würde man unter dem Stichwort „Ba-Vogel“ alle Vignetten finden, in denen der Ba-Vogel in jedweder Variante Bestandteil der Vignette ist. 23 24 25

26 27

Vgl. BURKARD, Papyrusfunde, 68–71 mit Tf. 66,8. Siehe P. BARGUET, Le livre des Morts des anciens égyptiens, LAPO 1, Paris 1967, 59. Siehe B. LÜSCHER, Das Totenbuch pBerlin P. 10477 aus Achmim (mit Photographien des verwandten pHildesheim 5248), HAT 6, Wiesbaden 2000, Photo-Tafel 36 sowie MOSHER, Hor, Tf. 16.1. Alle drei Szenen unpubliziert. Vgl. BACKES, Wortindex.

62

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

3.2 Identifizierung der Provenienz Wie oben bereits angedeutet, ist es nahe liegend zu vermuten, dass die Merkmale der Varianten ihren Ursprung in lokalen Stilen haben. Neben chronologischen Unterschieden ist die Provenienz oder die Werkstatt sicherlich ursächlich für die Varianten. Prinzipiell steigt die Wahrscheinlichkeit, dass es sich bei einem Merkmal um eine lokale Eigenheit handelt, gewiss mit zunehmender Individualität. Je individueller das Merkmal und je deutlicher die Unterschiede, desto stärker ist außerdem das Unterscheidungskriterium und damit die Aussagekraft des Befundes. Die Vignette zu Tb 149 ist hierfür ein aussagekräftiges Beispiel: Der Dämon der ersten Höhle erscheint in sehr unterschiedlichen, sehr individuellen Ausführungen. Die Verteilung der Quellen zeigt deutlich, dass es sich um lokale Typen handelt: In Theben verwendet man die bereits oben kennen gelernte Ausführung mit einer Pflanzenstaude auf dem Kopf (Abb. 16a). Daneben existiert in einer Minderzahl ein anthropomorpher Kopf, aus dem zwei federähnliche Objekte ragen (Abb. 16b). In Memphis ist der Kopf des Dämons völlig anders gestaltet. In der Regel besitzt der Dämon dort einen Bes-Kopf (Abb. 16c).28 Eine einzige Ausnahme zeigt eine Darstellung, die wohl einen abgeschlagenen Kopf und Blutfontänen zeigen sollen (Abb. 16d). Die mittelägyptischen Handschriften heben sich von beiden geografischen Polen ab. In Herakleopolis nähert man sich dem thebanischen Typ mit rundem Kopf an, führt ihn jedoch schwarz aus (Abb. 16e). Die oberen Fortsätze sehen wie abstehende Haarstoppel oder Grashalme aus. Die Quellen aus Achmim reduzieren die Gestalt auf eine kahlköpfige anthropomorphe Gestalt (Abb. 16f). Ein dritter Typ ersetzt den Kopf durch Messer, die im Hals stecken (Abb. 16g). Das Beispiel zeigt, dass Handschriften der gleichen Herkunft einen gemeinsamen Vignettentyp besitzen. Während sich hier alle Handschriften auf sehr unterschiedliche Typen verteilen lassen, handelt es sich in der Regel um Varianten und Merkmale, die sich in kleiner Anzahl von der Menge absetzen. So zeigt Fragment 1 des oben bereits behandelten P. Kairo J.E. 97249 (Papyrus 5) einen Vogelschwanz, über dem ein Schattenwedel schwebt (Abb. 11, Fragm. 1). Bei diesem Schattenwedel handelt es sich um eine seltene aber charakteristische Variante von V 26. Dort sitzt ein Ba-Vogel auf 28

Bei MOSHER, Vignettes, 413 mit Tf. 210 gar nicht erwähnt, obwohl der abgebildete P. Paris Louvre N. 3091 die Darstellung zeigt.

DAS GEFLÜGELTE KROKODIL

a) P. Wien ÄS 3862

b) P. Ryerson

c) P. Portheim (A)

d) P. Louvre N. 3081

e) P. Colon. Aeg. 10207

f) P. Berlin P. 10478

g) P. Milbank Abb. 16: Übersicht über die Typen von V 149a aus Theben (a-b), Memphis (c-d) und Mittelägypten (e-g)

63

64

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Abb. 17: V 26 in P. Turin 1791

Abb. 18: Die Eigenschaften der Vignetten zu Tb 19 und 20 (P. Paris Louvre E. 7716)

einem Sockel, während ein hockender Adorant ihm gegenüber sein Herz in der Hand hält (Abb. 17). Den Fächer führen nur P. Berlin P. 3008, P. Boston MFA 92.2582, P. Dublin 1663, P. Kairo J.E. 97249 (Papyrus 5), P. London BM 9976, P. Lyon H 1579–1583 und P. Paris Louvre E. 7716 aus. Wertet man die Informationen dieser Handschriften bezüglich ihrer vermeintlichen Herkunft aus, erkennt man, dass es sich beim Schattenwedel um eine thebanische Variante handelt.

4. AUFARBEITUNG Sowohl die kritische Betrachtung des P. Turin 1791 als auch die Darlegung des Variantenreichtums haben gezeigt, dass das publizierte Primärmaterial und die bisher vorliegende Sekundärliteratur immer nur einen Teileinblick bietet. Gerade bei seltenen Varianten wie dem geflügelten Krokodil aus

DAS GEFLÜGELTE KROKODIL

65

V 88 (Abb. 13d) ist diese Grundlage unzureichend, um Identifikationen sicher vorzunehmen. Immerhin handelt es sich allein für den Zeitraum der Saitischen Rezension um etwa 8000 einzelne Vignetten. Zählt man bei Vignetten, die aus mehreren Bildern oder Szenen bestehen, diese Teile jeweils einzeln, summiert sich die Menge sogar auf ca. 15000 Abbildungen. Ein Überblick ist momentan allein im Totenbucharchiv in Bonn möglich.29 Wie kann das Material so zugänglich gemacht werden, dass einerseits ein Überblick über diese Masse möglich ist und zum anderen die Varianten hinsichtlich ihrer Provenienz, ihres Alters oder anderer Kriterien ausgewertet werden können? Dabei ist zu beachten, dass eine einfache Unterteilung anhand nur eines Kriteriums, wie Mosher dies betrieb, meist unzureichend ist.30 Viele Bestandteile der Vignetten können nämlich variieren und lokale Eigenheiten widerspiegeln. Die Varianten zu V 19 und 20 veranschaulichen das Problem (Abb. 18).31 a) Als Konstellation kann der Verstorbene dem Gott Atum gegenüber stehen. Er kann aber auch einer anderen Person gegenüber stehen oder alleine auftreten. b) Der Kranz der Rechtfertigung kann als Leinenstreifen ausgeführt sein oder als geflochtener Reif. Manche Vignetten ersetzen ihn sogar durch einen Halskragen, wie er auch in Varianten von V 158 bekannt ist.32 Einige Quellen verzichten auch völlig auf den Kranz. c) Das Podest, auf dem der Kranz liegt, ist zum einen als einfaches Rechteck ausgeführt, zum anderen als Torbau gestaltet. Manchmal handelt sich auch um einen Tisch. Außerdem kann das Podest fehlen, wenn der Kranz von einer Person dargereicht wird. 29

30 31 32

URL: http://www.totenbuch-projekt.uni-bonn.de. Zur Geschichte des Projekts H. KOCKELMANN, From One to Ten: The Book of the Dead Project after its First Decade, in: B. BACKES/I. MUNRO/S. STÖHR (Hgg.), TotenbuchForschungen. Gesammelte Beiträge des 2. Internationalen Totenbuch-Symposiums 2005, Bonn, 25. bis 29. September 2005, SAT 11, Wiesbaden 2006, 161–65. Siehe Anm. 5. Siehe Anm. 17 mit der relevanten Literatur. So P. Kairo J.E. 32887 (S.R. IV 930), P. Langres, P. Leiden T 18, P. London BM 10983, P. Paris Louvre N. 3084, P. Paris Louvre N. 5450, P. St. Gallen, P. Wien ÄS 3862 + 10159, P. Wien Vindob. Aeg. 10.110, M. Berlin o. Nr. (PsammetichMeri-Neith), M. Princeton, Pharaonic Rolls, No. 8, M. Uppsala o. Nr. (Nofretiu), M. Sydney R 397. Siehe URL: http://libweb2.princeton.edu/rbsc2/ papyri/Bookofthe DeadRoll8.html (M. Princeton, Pharaonic Rolls, No. 8).

66

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Abb. 19: Die Eingabemaske für die Eigenschaften der einzelnen Vignetten

DAS GEFLÜGELTE KROKODIL

67

Abb. 20: Mumienbinde Prag, Náprstek Museum K 249

d) Zwischen den Akteuren kann ein Opfertisch stehen oder fehlen. e) Die Personen können preisend dargestellt sein oder etwas darreichen. Sie können auch einen Stab in der Hand halten. Unabhängig von ihrer Gestik sitzt der Verstorbene in Ausnahmefällen auch auf einem Stuhl. Hinter jedem dieser Merkmale kann sich eine lokale Eigenheit verbergen. Nach welchem muss man nun die Typen einteilen, um relevante Befunde zu erhalten? Je nach Klassifizierung könnten dabei völlig unterschiedliche Ergebnisse entstehen. Muss man anhand aller Merkmale eine Einteilung vornehmen? Wie kann dies bei einer Menge von 1400 Quellen geschehen? Die Lösung besteht in der Festlegung der einzelnen Merkmale und ihrer statistischen Auswertung. Zunächst muss man die Eigenschaften festlegen: Welche Personenkonstellation liegt in der jeweiligen Quelle vor? Wie ist der Kranz dargestellt? Welche Form hat das Podest? Ist ein Opfertisch vorhanden? Welche Gesten besitzen die Gestalten? Diese Eigenschaften müssen in einer Datenbank registriert werden. Dazu wurde die im Totenbuch-Projekt bereits existierende FilemakerDatenbank erweitert. In einer neuen Tabelle werden nun die Eigenschaften aller Vignetten separat registriert. In einer weiteren Maske werden den einzelnen Vignetten diese Eigenschaften zugewiesen (Abb. 19). Sind die Merkmale der Vignetten derart verzeichnet, kann man die Vignetten und ihre zugehörigen Quellen nach allen verwendeten Kriterien sortieren. Die Quellen der vorliegenden Vignette zu Tb 19 und 20 kann man beispielsweise zunächst anhand der Personenkonstellation trennen. In einem zweiten Schritt kann man die Quellen zudem anhand der Darstellung des Kranzes unterteilen.33 Diese Verteilung ergibt für die Darstellung des Verstorbenen ohne Atum (a) und der Interpretation des Kranzes als Leinenstreifen (b) eine kleine Gruppe aus fünf Handschriften (Tab. 1). Vier

68

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

der fünf Quellen werden bereits Memphis zugeordnet. Quellen aus anderen Regionen fehlen dagegen. Deshalb ist die Wahrscheinlichkeit groß, dass M. Prag K 249 (Abb. 20), dessen Herkunft bisher unbekannt war, ebenfalls aus diesem Raum stammt. P. Kairo CG 40029 P. Paris Louvre N. 5450 P. Wien ÄS 3862 + 10159 P. Wien Vindob. Aeg. 10.110 M. Prag, Náprstek Museum K 249

Memphis Memphis Memphis Memphis unbekannt

V 20 V 20 V 19 V 19 V 19

Tabelle 1: Ergebnis einer Abfrage nach V 19 und 20, die den Verstorbenen ohne Atum zeigen und den Kranz als Leinenstreifen abbilden P. Kairo CG 40029 (J.E. 95837, S.R. IV 934) P. Kairo CG 40029 (J.E. 95837, S.R. IV 934)

Memphis Memphis

V 19 V 20

P. Wien ÄS 3862 + 10159 P. Wien ÄS 3862 + 10159 P. Wien Vindob. Aeg. 10.110 M. Florenz Inv. 3681 M. Prag, Náprstek Museum K 249

Memphis Memphis Memphis Memphis unbekannt

V 19 V 20 V 19 V 19 V 19

Tabelle 2: Ergebnis einer Abfrage nach V 19 und 20, die das Podest, auf dem der Kranz der Rechtfertigung liegt, als Tisch darstellen

Eine andere Verteilung kann anhand der Form des Podests (c) vorgenommen werden. Das Ergebnis zeigt, dass ein kistenartiges Podest nur in Theben verwendet wird. Auch die Form als Torbau ist fast nur aus Theben bekannt. Die Quellen, bei denen der Kranz auf einem Tisch liegt, stammen dagegen fast alle aus Memphis (Tab. 2). Hierzu gesellt sich auch wieder M. Prag K 249. Somit gibt es allein durch V 19/20 zwei Befunde, die für M. Prag K 249 eine memphitische Herkunft plausibel machen. Bestätigt wird das Ergebnis durch das Layout der Mumienbinde, die Holger Kockelmann als Formular 1b bezeichnet.34 Hierunter versteht er Mumienbinden mit im Text eingebetteten Vignetten, die eine Rahmung 33

Das Folgende ausführlicher bei M. MÜLLER-ROTH, Der Kranz der Rechtfertigung, in: A. MANISALI/B. ROTHÖHLER (Hgg.), Festschrift Assmann zum 70. Geburtstag, i. Dr.

DAS GEFLÜGELTE KROKODIL

69

besitzen. Die Bindenhöhe beträgt bei dieser Gruppe 5–8 cm, auf denen etwa 5–7 Zeilen untergebracht sind. Die trifft auch auf M. Prag K 249 zu, die 6 cm hoch ist und 5 Zeilen trägt (Abb. 20). Laut Kockelmanns Untersuchung, der das gesamte Corpus der Mumienbinden zugrunde liegt, stammt dieses Layout aus Memphis und Gurob.

5. FAZIT Die oben festgestellten Befunde verdeutlichen, dass die Vignetten wichtige Hinweise zur Herkunftsbestimmung der Totenbücher liefern können. So dienen die lokalen Vignettentraditionen und die damit verbundenen ikonografischen und stilistischen Varianten dazu, die Provenienz der bisher nicht zugewiesenen Handschriften zu erschließen. Andere Hinweise liefern unter anderem die Titeln der Totenbuch-Besitzer oder das Layout.35 Das oben gezeigte Beispiel zeigt, dass M. Prag K 249 aus Memphis stammt, da er das Podest als Tisch ausführt sowie den Verstorbenen alleine zeigt und mit dem Kranz der Rechtfertigung in Form eines Leinenstreifens kombiniert (Abb. 20). Natürlich ist der Befund einer einzigen Vignette nur bedingt aussagekräftig. Wenn aber nur eine einzige Vignette erhalten ist, wie es bei M. Prag K 249 der Fall ist,36 gewinnt man durch die Ikonografie der Vignette einen wertvollen Hinweis. Ist mehr Material erhalten, sollten natürlich mehrere Befunde ermittelt werden, um die Argumentation zu festigen.37 Je größer die Anzahl an Vignetten und Merkmalen, die aussagekräftige Ergebnisse liefern, desto eindeutiger ist natürlich der Befund. 34

35 36

37

Vgl. H. KOCKELMANN, Untersuchungen zu den späten Totenbuch-Handschriften auf Mumienbinden, Band II: Handbuch zu den Mumienbinden und Leinenamuletten, SAT 12, Wiesbaden 2008, 95f. Vgl. M. MOSHER, Theban and Memphitic Book of the Dead Traditions in the Late Period, in: JARCE 29, 1992, 145–51. Vgl. M. VERNER, Verejné sbírky staroegyptskych ´ památek v CSSR, Vol. II, Unpublizierte Magister-Arbeit Prag 1964. Laut KOCKELMANN, Untersuchungen, 275, Anm. 273 gehört M. Paris, Louvre E. 18861 zum gleichen Set. Ob darauf auch Vignetten erhalten sind, ist weder ihm noch mir bekannt. Vgl. M. MÜLLER-ROTH, Papyrusfunde aus dem Asasif: Nachträge, in: MDAIK 65, 2009, i. Dr. sowie DERS., Lokalkolorit in Schwarz-Weiß, in: B. BACKES/M. MÜLLER-ROTH/S. STÖHR (Hgg.), Ausgestattet mit den Schriften des Thot. Festschrift für Irmtraut Munro zu ihrem 65. Geburtstag, SAT 14, Wiesbaden 2009, 119–31.

70

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Da von den 1400 Quellen mit Vignetten aus der Saitischen Rezension nur von etwa 550 die Herkunft bekannt ist, ist das Potential dieser Analyse offensichtlich. Selbst die manuelle Verteilung anhand nur eines oder zweier Kriterien ermöglicht es mir schon heute etwa 100 Quellen neu zuweisen. Es ist damit zu rechnen, dass sich diese Zahl durch eine computergestützte Auswertung, wie sie oben vorgestellt wurde, verdoppeln lässt. Das gezeigte Vorgehen ermöglicht außerdem, auch sehr komplexe Vignetten wie die Darstellung des Totengerichts zu analysieren und zu klassifizieren. Vignetten mit mehreren Szenen sind manuell nämlich nur schwer zu untersuchen. Neben der Vielzahl an Szenen, die jeweils unterschiedliche Varianten besitzen können, kann zusätzlich auch die Anordnung und Abfolge der verschiedenen Szenen variieren.38 ABBILDUNGSNACHWEIS

Abb. 1: LEPSIUS, Todtenbuch, Tf. LVI; Abb. 2: Zeichnung M. Müller-Roth; Abb. 3: LEPSIUS, Todtenbuch, Tf. LXXI; Abb. 4: Zeichnung M. Müller-Roth; Abb. 5: Zeichnung M. Müller-Roth; Abb. 6: C. LEEMANS, Papyrus égyptien funéraire hiéroglyphique (T. 16) du Musée d’Antiquités des Pays-Bas à Leide, Monumens égyptiens du Musée d’Antiquités des Pays-Bas à Leide III, 4, Leiden 1876, Tf. 15; Abb. 7: LEPSIUS, Todtenbuch, Tf. XX; Abb. 8: BARGUET, Livre des Morts, 86; Abb. 9: LEPSIUS, Todtenbuch, Tf. XXI; Abb. 10: BARGUET, Livre des Morts, 89; Abb. 11: Zeichnung M. Müller-Roth; Abb. 12: LEPSIUS, Todtenbuch, Tf. XXXI–XXXII; Abb. 13a: LEPSIUS, Todtenbuch, Tf. XXXIII; Abb. 13b—d: Zeichnung M. Müller-Roth; Abb. 14: Zeichnung M. Müller-Roth; Abb. 15: LEPSIUS, Todtenbuch, Tf. XXXIII; Abb. 16: Zeichnung M. Müller-Roth; Abb. 17: LEPSIUS, Todtenbuch, Tf. XV; Abb. 18: Zeichnung M. Müller-Roth; Abb. 19: Screenshot; Abb. 20: Zeichnung M. MüllerRoth.

38

Vgl. Anm. 6–9.

AUTOMATIC ALIGNMENT OF HIEROGLYPHS AND TRANSLITERATION

Mark-Jan Nederhof

ABSTRACT Automatic alignment has important applications in philology, facilitating study of texts on the basis of electronic resources produced by different scholars. A simple technique is presented to realise such alignment for Ancient Egyptian hieroglyphic texts and transliteration. Preliminary experiments with the technique are reported, and plans for future work are discussed.

1. INTRODUCTION A convenient form to represent analysis of a manuscript is as interlinear text. In this form, the text is divided into fragments, each short enough to fit within the width of a page or of a computer screen. We will refer to such fragments as phrases, which may or may not concur with the linguistic meaning of the term. For each phrase, a number of rows present different aspects of the phrase, which may be the original text, some form of transcription, word-by-word gloss, translation, or a combination of these types of data. The data that occupies the i-th row of the interlinear text for each phrase is called a tier, or sometimes stream. In the case of Ancient Egyptian, interlinear text typically offers three tiers, consisting of hieroglyphs, transliteration and translation. Additional tiers may offer glosses and lexical or syntactic analyses. The hieroglyphic text may be a facsimile, but more often, we find a normalised transcription using an electronic font, especially when the original manuscript is in 71

72

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

hieratic. The text direction is often mirrored with respect to the original manuscript to be left-to-right, to match the directionality of the other tiers. By transliteration we mean the rendering of the text using the modern Egyptological alphabet, which is composed of some letters from the Latin alphabet in combination with diacritics, and two additional letters representing aleph and ayin. Interlinear text commonly offers only one translation in one modern language, but considering that many interpretations of some of the more difficult texts are still contentious, it can be very fruitful to compare several different translations, displayed as consecutive tiers. This may also be said about transliteration, especially where the segmentation of a hieroglyphic text into words is uncertain. In general, one particular interpretation of a hieroglyphic text is best represented by the combination of transliteration and translation. Many applications of interlinear text involve audio recordings. Such a recording in an appropriate visualisation can be one of the tiers, but it may also serve as the basis for annotations. For example, occurrences of words in a transcription as well as prosodic units can be mapped to time intervals within the recording. Alignment of such annotations can be done straightforwardly through the total ordering imposed by the time line of the recording. Several annotations can be compiled by different linguists, allowing automatic creation of interlinear text, typically restricted to a selection of the tiers, depending on the interests of the user. A survey of tools and techniques involving such applications was presented by Bird and Liberman (2001). They pointed out that annotations can also be mapped to offsets within a particular textual resource, in place of anchor points within an audio recording. This requires however that the textual resource is unchanging, and that different scholars agree on the choice of this textual resource. These constraints regrettably preclude use in many branches of philology. In the example of Ancient Egyptian texts, it would be impractical to demand that all scholars who translate or annotate a text should tag their resources with indices in some canonical representation of the text. Note that a hieroglyphic transcription as interpretation of an hieratic text cannot serve as such a canonical representation, because there may not be any such interpretation that has the approval of the entire community. Existence of lacunas would further exacerbate the problem.

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

73

If the creation of interlinear text cannot rely on anchor points in a common resource offering a total ordering, then an obvious alternative is to align different textual resources automatically, by analysing the contents of the tiers. Alignment of, for example, English, French and German translations of Egyptian texts can be done by relatively conventional techniques; see for example Gale and Church (1993). The present paper will focus on automatic alignment of hieroglyphs and transliteration, which is a form of monolingual alignment involving two very different writing systems. The implementation of this task is one significant component within a larger system to create interlinear text out of one or more hieroglyphic transcriptions, transliterations, translations, and lexical and syntactic annotations. In passing, we would like to point out that similar techniques can also be applied to automatic alignment of different manuscripts of the same text. Examples are the four manuscripts of the Eloquent Peasant, the dozens of manuscripts coverings parts of Sinuhe and the countless manuscripts offering different versions of the Book of the Dead. Alignment of different manuscripts of the same text entails specific problems. For example, a phrase in one manuscript may be absent in another, or entirely different phrases may occur in the respective manuscripts. Even more difficult to handle automatically are cases where the same phrases occur, but in a different order. These issues will not be addressed in any detail here. The task of automatic alignment of hieroglyphic text and transliteration is related to the automatic transliteration of hieroglyphs, which was investigated in a seminal paper by Rosmorduc (2001). He used finite-state transducers, achieving very high accuracy. Whereas automatic alignment seems an easier task in comparison, it is still far from trivial, especially as we have decided not to involve lexica or grammatical knowledge. The rationale is that incorporating such knowledge could bias certain genres or periods, and make the software less robust. Another related task is word segmentation, which means dividing a sequence of signs into words. It differs from our alignment task in that the words themselves are not known. Word segmentation is relevant in general for writing systems without explicit word boundaries. It has received much attention for Chinese. Most conventional algorithms for word segmentation rely on the availability of lexica; see e.g. Sproat et al. (1994). Again, this is incompatible with our objectives. The structure of this paper is as follows. Section 2 discusses the ongoing activities that form the context to the work reported here. The

74

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

orthographic model underlying the automatic alignment is discussed in Section 3 and initial experiments are discussed in Section 4. Section 5 outlines plans for further work.

2. CONTEXT 2.1 an XML format for alignment The XML format AELalign allows encoding of: • hieroglyphs, • transliteration, • translation, and • lexical annotation. For one manuscript, there may be several hieroglyphic transcriptions, several transliterations, etc., and these may be distributed over different files. Moreover, several manuscripts for the same text may be included. Constraints on alignment can be explicitly indicated by line numbers in the manuscripts, or by additional anchor points relating one tier to another. For more details, see Nederhof (2002a). A first trial of its use involved a joint effort over the World Wide Web to translate the Eloquent Peasant with a group of students. Participants submitted their interpretations of parts of the text by email, in a very simple plain-text format, containing transliterations, translations and comments. This format was automatically converted to AELalign. In a next phase, the given hieroglyphic text, which was also in the AELalign format, was aligned with the respective interpretations to form an interlinear text in HTML, which could be viewed as a web page. This served as a virtual blackboard, allowing joint discussions about different interpretations. After this successful trial, small adjustments were made to the format, and the viewing software was reimplemented to provide output in PDF and in a Java applet. The implementation in Java provides the most flexibility, allowing a selection of the tiers to be displayed. The amount of text that fits on each line depends on the width of the window, and as soon as the window size is changed, suitable line breaks are determined anew, leading to a new interlinear text. An excerpt from the PDF output is given in Figure 1. We see that the hieroglyphic text is conveniently divided into parts that are aligned with phrases consisting of transliteration and translation. Until recently, such precise alignment could only be achieved by manually inserting suitable

75

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

M ¯ ˆ z i( ÁŒVÊ /L z‹ B1 | L¯y 42

42 B1 | PNZMUQ+PR NV;WM 42 B1 | 'Look, I shall take away your donkey, peasant, R

M ¯ z L¯/L z‹

R PN+PR N R 'But look, your donkey

43

5 vk¸Ë

| 43 +U | ZQP I6PR M 43 because | it ate my barley. 9.6

V

| 9.6 | +U 9.6 | is

Ë

v5  ZQPMW M

eating my barley!

Figure 1: Part of interlinear text showing two versions of the Eloquent Peasant

anchor points. From Section 3 onward, it will be explained how precise alignment can be done automatically.

2.2 HIEROGLYPHIC ENCODING The hieroglyphic encoding we use is called the Revised Encoding Scheme (RES), and represents a significant departure from the Manuel de Codage (MdC) from Buurman et al. (1988). The main shortcomings of MdC encoding of hieroglyphs are: • There is no precisely defined standard independent from any software tool. • The syntax is chaotic and common interpretations of the official documents seem to entail ambiguities. • The operators are not nearly expressive enough to represent a fair portion of the relative positioning of signs one finds on good monumental inscriptions. • The Manual de Codage seems to be the product of feature creep by having it dictate not only the encoding of hieroglyphs themselves but also the layout of a document that contains hieroglyphs, as well as rudimentary grammatical annotations. A few key properties of RES are: • The syntax is very simple, and the meaning is rigorously defined. Given a string of characters, it can be decided with certainty whether it is or is not a valid fragment of hieroglyphic encoding, and if so, its visualisation is fully prescribed, with the font and a small number of other parameters as free variables.

76

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

• In place of absolute positioning, introduced by some dialects of MdC to make up for shortcomings in its expressivity, a number of operations are available in RES that allow the composition of signs to be described as one sees them, and a suitable appearance can be automatically computed on the basis of a given font. This means that the validity of an encoding can survive a change of font. • RES basically only involves hieroglyphs and not other types of text. The only exceptions are footnote markers next to hieroglyphs (whose exact positions are determined automatically), and brackets for philological purposes. An example of the enhanced expressive power is: insert[te](G39,N5) ~% The meaning of this use of the ‘insert’ operation is that N5 (‘sun’) is placed in the empty right-upper corner of G39 (‘pintail’). In particular, N5 is scaled down as much as necessary to leave a default distance between the two signs. (This distance can be adjusted if desired. With distance 0, the two signs are touching.) The reason the validity of this construction may survive a change of font is that the positioning and scaling depend on the sizes and shapes of the individual signs. For example, in a font where the right-upper corner of G39 leaves less empty space, the occurrence of N5 would be scaled down more. (It should be pointed out that a similar construction exists in PLOTTEXT, developed by Stief (1985).) This should be contrasted with the corresponding notation in most dialects of MdC, using an ampersand. The above example would be written G39&N5. This construction is called a ‘ligature’ or ‘special group’. Both terms are misleading, because the individual signs are not joined together as in traditional ligatures, and there is nothing special about such groups, considering they are quite common in any hieroglyphic text. The meaning of the so-called ligatures, in terms of the relative positioning and scaling of signs, is fixed in the font or in the software. Either way, no standardisation is achieved by the notation itself, and different tools could assign different meanings to ligatures. Attempts to exhaustively list all ligatures and prescribe standardised meanings are futile, as any newly found long text will very likely contain ligatures not included in any fixed list. A case in point is the EGPZ sign list, which contains no less than 400 ligatures.1 While investigating an MdC encoding of Papyrus Westcar, which is 1

Version 1.0, November 2007, at http://www.egpz.com/resources/egpz.htm.

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

77

one of the most popular Middle Egyptian texts, we found two ligatures that were absent from the EGPZ list. Our comments carry over to superimposition of signs. Instead of requiring a proliferation of combined signs as separate code points as in the case of most MdC dialects, RES offers the ‘stack’ operation, as for example in: stack[on](V28,I9) The introduction of RES by Nederhof (2002b) has not been well received by the Egyptological community. The main objections raised by members of the audience, and by others before and after the meeting, were: 1. The MdC is generally accepted as the standard, and too many existing encoded texts would become obsolete if RES were adopted. 2. The goal of preserving the validity of an encoding across different fonts, which is one of the strengths of RES, is irrelevant because Egyptologists typically throw away an encoding once they have published a text. In other words, the electronic encoding is no more than an intermediate form towards a final product on paper. 3. RES is too verbose. Instead of C2\ as in MdC, one must write C2[mirror]. 4. The uniform syntax of RES is irrelevant, as typical users only approach hieroglyphic encoding via a graphical interface. 5. RES is not an XML format. 6. Even the precise placement of signs relative to each other as allowed by RES would not suffice for palaeographic purposes. 7. The rendering of, for example, the ‘insert’ operation is too expensive and too complicated for some applications. The first objection is in conflict with the second, and at least one of them must be invalid. The same holds for the third objection versus the fourth and the fifth objections. Apart from this, each of the above allows a number of counter-arguments. The first objection can be rejected by pointing out that MdC is not a standard. Various tools exist today that each implement one possible interpretation of part of the features from Buurman et al. (1988), and these interpretations vary widely. All of these tools further extend MdC by new features, to make up for shortcomings in its expressivity. However, as different tools add different such features, encoded texts created with one tool become obsolete as soon as that tool becomes obsolete, and exchanging encodings across different tools is problematic.

78

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

To be able to benefit from existing texts encoded in MdC, we have implemented a tool to automatically convert various MdC dialects to RES. As very different principles underlie MdC and RES, respectively, manual post-processing is regrettably required in most cases. As to the second objection, the reason why encodings of hieroglyphs are considered to be ephemeral may be just because the grave inadequacies of available formats such as MdC have so far hindered the development of any large electronic corpora of hieroglyphic texts. Considering the corpora available in other areas of philology, including those involving non-alphabetic writing systems, such as Akkadian and Sumerian, it is unclear why the particular qualities of Ancient Egyptian would preclude the creation of similar corpora in Egyptology, to be freely shared among different scholars. The syntax of RES is more verbose than that of MdC, in the sense of requiring more characters to describe the same thing, but this helps to make the constructions more self-explanatory, and the main objective was to cast the enhanced expressive power into a uniform syntax. The simplicity of the syntax of RES may not be appreciated by end-users as much as by developers of hieroglyph-processing tools, which counters the fourth objection above.2 As to the fifth objection, an XML version of RES will be created as soon as an immediate need for it arises, which has not been the case since RES was introduced. The sixth objection is based on a misunderstanding of what RES wants to achieve. The purpose of an electronic encoding is to offer a visual appearance somewhere between a purely linear sequence of hieroglyphs on the one hand, which would be utterly unacceptable to any scholar, and a facsimile of the original manuscript on the other, which would be impractical in applications involving e.g. interlinear text. RES does not have the pretences to replace facsimiles, but it does move further away from an unacceptably rigid and unrealistic partition of the text surface into perfect squares as MdC would have it. Furthermore, it cannot be denied that developers and users of MdC software in the past have felt a strong need for more accurate scaling and positioning of signs. In fact, after the introduction of RES, some MdC tools have adopted some of its features and added them to their dialects of MdC. 2

One striking observation illustrating the relative complexity of MdC notation is the following. The specification of the tokeniser for MdC in Serge Rosmorduc’s JSesh is 188 lines long, against 34 lines for RES in our Java implementation.

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

79

Regrettably, this exacerbates some of the other problems of MdC, such as lack of standardisation and the chaotic syntax. In response to the seventh objection above, we have introduced RESlite, in which signs receive absolute values for their positioning and scaling. In applications where the font is fixed, RES and RESlite offer the exact same visual appearance, and both allow a fragment of hieroglyphic text to be divided into smaller fragments, e.g. to allow line breaks where hieroglyphs are part of running text. In practice, RES is the format most suitable for exchange between groups of scholars, whereas RESlite can be used internally in systems to allow quick rendering using very simple software, following one-off automatic conversion from RES to RESlite. As far as automatic alignment is concerned, the choice of RES as opposed to MdC for the hieroglyphic encoding is not essential, because the implementation as yet ignores relative positioning of signs beyond a purely linear order. Nevertheless, RES is preferable for this task, due to its emphasis on standardisation and avoidance of ad hoc signs and ligatures. Moreover, RES is ideal for interlinear text, allowing automatic line breaks and padding, and explicitly providing for applications to enforce a horizontal left-to-right text direction irrespective of the encoded directionality.

3. MODEL Experienced Egyptologists would have little difficulty in correctly aligning hieroglyphs with corresponding transliteration. As with any other problem in the realm of artificial intelligence however, it is not so easy to capture expert knowledge in a formal representation allowing the same task to be done reliably by mechanical means. Whereas alignment seems straightforward in the case of idealised input, many problems arise in practice. For example, some occurrences of signs may have a non-standard reading not listed in any grammar or dictionary. Further, there may be errors, made by the modern scholar in the hieroglyphic encoding or in the transliteration, or errors by the ancient scribe not reflected in the transliteration. An alignment algorithm should therefore be designed to avoid a complete failure of the task when confronted with input that is less than ideal. In particular, upon encountering problematic writings, local errors may be unavoidable, but these should not spread to cause incorrect alignments for larger parts of a text.

80

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

The ability of software to give reasonable output even when the input suffers from a limited number of inadequacies is known as robustness. In general, the more complicated an algorithm is, the more difficult it is to achieve robustness. We have therefore started our investigations by choosing a very simple ‘orthographic’ model of how hieroglyphic signs are combined to write words, in terms of their transliteration. This model assumes only two classes of signs, namely phonograms and determinatives. Ideograms will be treated as phonograms with the special property that they can only match against the start of a word. For example, sign D56 (‘leg’) will be treated as a phonogram rd that can only match against the first two consonants of a word (or more precisely, of a morpheme; see further below). We will refer to a mapping from signs to collections of possible readings as an ‘annotated sign list’. In some cases, the mapping is from a sequence of signs to one or more readings; for example, three consecutive occurrences of N35 (‘ripple of water’) may together have a reading as phonogram mw. In the experiments, reported in Section 4, we have extracted our annotated sign list from the ‘Zeichenliste’ of Hannig (1995). It is relatively straightforward to map this list to a data structure that is machine readable. As we wanted to make the experiments reproducible and eliminate subjective decisions as much as possible, Hannig’s list seemed preferable to the one from Gardiner (1957), which would have left much more room for interpretation. It should be noted that Hannig’s sign list is less complete than Gardiner’s. In particular, many uncommon readings of signs are absent. This does not hinder our experiments however, and in fact, the existence of gaps in the sign list helps us to measure the robustness of the algorithm, in the light of the awareness that no sign list will ever cover all readings of all occurrences of signs in unseen texts. With a fixed annotated sign list, the actual input to the alignment algorithm consists of a sequence of hieroglyphic signs and a sequence of words in transliteration. The order of the signs is roughly as they occur in the hieroglyphic encoding in RES. An exception is made however for the ‘insert’ operation, where the order depends on whether the inserted sign is placed before or after the main sign. The alignment algorithm to be described below reads the hieroglyphic text from beginning to end, maintaining positions, which represent the boundaries between pairs of consecutive hieroglyphs, plus the position

81

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

w

ntr 0

R8

1

wj

wj

ntr

ntr

R8

2

R8

3

Figure 2: Edges between positions indicate possible readings of signs or sequences of signs. For example, sign R8 (‘cloth wound on a pole’) can be read as phonogram nTr . The second occurrence in sequence can alternatively be read as the dual ending wj, and the second and third occurrences can together be read as the plural ending w. Hence a path from position 0 to position 3 exists with the edges nTr and w, respectively, which can be matched against a word nTrw . (To simplify the figure, other readings, such as the feminine dual and plural endings, were omitted.)

before the first hieroglyph and the position after the last hieroglyph. Positions are connected by edges labelled by the possible meanings of the hieroglyphs between those positions, as determined by the annotated sign list. A meaning is either a string of consonants for a reading as phonogram (or ideogram, as explained before), or it is simply the information that a sign can serve as determinative. Special treatment is needed for numbers and for dual and plural. For a sequence of numerals between two positions, an edge is added between those positions, labelled by the corresponding number in decimal notation, as it might occur in the transliteration. For two consecutive occurrences of the same sign, edges are added labelled by phonograms wj and tj with the extra constraint that they can only match the final two consonants of a word. Something similar holds for plural, in the case of three occurrences of the same sign, as exemplified in Figure 2. The words of the transliteration are simply defined as strings separated by white space, consisting of consonants and punctuation signs (i.e. ‘.’, ‘-’, or ‘=’). No attempt was made to do automatic morphological analysis beyond the explicit punctuation signs. For example, the feminine ending t is treated like any other consonant, as our transliteration conventions, which follow Hannig (1995), do not mark the boundaries between stems and feminine or plural endings.

82

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

The morphemes that are separated by punctuation signs are treated as individual entities however where it concerns our model of orthography. The basic assumption is that a morpheme is written as a sequence of phonograms, together covering all consonants in the transliteration, from left to right, followed by a sequence of zero or more determinatives. By ‘left to right’ we mean that the first consonant covered by a phonogram should not follow any consonants that have not yet been covered by previous phonograms. In order to achieve robustness, orthographic analyses are allowed that violate the above basic assumption, at the cost of a ‘penalty’, the height of which depends on the seriousness of the violation, based on our intuitions about hieroglyphic writing. For example, a phonogram which follows rather than precedes a determinative incurs a penalty of 8. If a semi-vowel (j or w) in the transliteration is not covered by any phonogram, this incurs a penalty of 2, while this penalty is 5 for other consonants. A hieroglyphic sign that is ignored altogether incurs a penalty of 20. The task is now to automatically determine how consecutive hieroglyphic signs corresponding to words in the transliteration. This is realised by going through the hieroglyphic signs from beginning to end, jumping from position to position following the edges, while at the same time going through the words from the transliteration from beginning to end. The labels of the edges are matched against words from the transliteration, which may incur penalties as outlined above. One difficulty is however that the correct alignment of hieroglyphs and transliteration is not known in advance, and at each moment, it may be decided to terminate the recognition of the current word of the transliteration and move to the next. Our approach is to pursue all possibilities in parallel, and in the end the solution is returned that minimises the sum of the incurred penalties. More precisely, we define a configuration as a triple consisting of the following three components: 1. A position in the sequence of hieroglyphic signs, as explained before. 2. Precisely one of the following: • A position in the sequence of words. Positions are defined much as in the case of hieroglyphs. Each represents the boundary between a pair of consecutive words, and there is one position before the first word and one position after the last word.

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

83

• An occurrence of a word in the transliteration, together with an indication which of the consonants have been covered by phonograms encountered earlier. 3. The sum of penalties so far. At the beginning of the alignment algorithm, we have one configuration, with penalty 0, pointing to the beginning of the hieroglyphic text and to the beginning of the transliteration. We call this the initial configuration. New configurations are derived from existing ones by different steps. The main steps are: • The recognition of one word is finalised, moving to the position between the current word and the next. • From a position between two words, the recognition of the next word is initiated. • We follow an edge between two hieroglyphs, moving to a next position. In the case of a phonogram, the corresponding consonants in the current word are marked as having been covered. • We ignore an hieroglyphic sign, moving to the next position. We say a configuration is final if it simultaneously points to the end of the hieroglyphic text and to the end of the transliteration. Of all final configurations, the one is taken that has the smallest penalty. By tracing back how the final configuration originated, one indirectly obtains a preferred matching of sequences of hieroglyphs against words in the transliteration. The algorithm applies two tricks that allow the task to be done within a few seconds, even for long texts. First, where two competing configurations are identical except for their penalties, the one with the highest penalty is discarded. This can be easily justified, as the configuration with the higher penalty will certainly not be part of the optimal solution when we reach a final configuration. This trick falls within a range of techniques that are known as ‘dynamic programming’. Secondly, for each position within the hieroglyphic text, we only consider the configurations with the N lowest penalties among all configurations associated with that position. Here N is a low number, for example 40. This technique is known as ‘beam search’. The rationale is that partial solutions that seems less promising than many competing partial solutions will likely not be part of the optimal solution in the end. Although beam search is very effective in truncating useless computations, there is a risk that the optimal solution itself is truncated. To reduce this risk, N should be chosen sufficiently high.

84

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 3 shows an example of some configurations that match three consecutive hieroglyphic signs against an occurrence of word xprt, starting from a configuration that points to position 41 just before the corresponding hieroglyphs and to position 82 just before the word, with an overall penalty of 10, which is the sum of the penalties incurred earlier. From this configuration, another is derived that points to the word xprt between positions 82 and 83, and position 41 as before. This configuration contains information about which consonants have already been covered by phonograms. In the figure this is indicated as a hyphen (‘not yet covered’) or an asterisk (‘covered’). At the beginning we have only hyphens. After L1 is interpreted as phonogram xpr, a new configuration is obtained, pointing to position 42 and to the word xprt between positions 82 and 83 as before, now with three asterisks for the three covered consonants. From here, one may process D21 as phonogram r and X1 as phonogram t, and then finish recognition of the word, leading to the configuration with overall penalty 10 as before, pointing to positions 44 and 83. Alternatively, the recognition of the word may be terminated just before D21 is processed, and then the total penalty increases by 5 for the t in xprt that is not accounted for. The resulting configuration has overall penalty 15, and points to positions 42 and 83. More penalties seem unavoidable after that, as D21 and X1 may need to be skipped in order to process following words in the transliteration, and each skipped hieroglyph carries a penalty of 20. Note that the higher the overall penalty becomes, the more likely it is that the configurations will eventually be discarded in favour of competing configurations with lower penalties. A feature was built in to deal with simple cases of honorific transposition, involving a single sign R8 (‘cloth wound on a pole’), N5 (‘sun’) or M23 (‘swt-plant’) to be moved across one or more words of the transliteration. This is realised by allowing such a sign to be skipped and stored in a ‘buffer’ in a configuration, to be retrieved from a later configuration derived from it. No additional mechanism was needed to deal with transposition for honorific or aesthetic purposes within single words, as the basic orthographic model is already fairly permissive with regard to the order of signs within words (although this by itself causes some errors, as we will see in the next section). Honorific transposition in general may involve a god’s name written with several signs. It is not clear how to deal with this without slowing down

85

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

t hpr ˘ 41

L1

hprt ˘- - - 10

10

81

ktt

jt

r 42

hprt ˘∗∗∗10

82

D21

43

15 hprt ˘∗∗∗10

hprt ˘

X1

44

35 hprt ˘∗∗∗∗ 10

83

m

10

84

Figure 3: The circles at the top represent positions within the hieroglyphic text, those at the bottom represent positions within the transliteration. The rectangles are configurations, each containing the sum of penalties so far and a pointer to a position in the hieroglyphs. Each of the small rectangles also points to a position between two words in the transliteration. The large rectangles each point to an actual word in the transliteration, while indicating which of the consonants have been covered by phonograms so far. The dotted arrows indicate how one configuration is derived from another. By following such arrows backwards, one can find out how the final configuration with the lowest penalty was obtained from the initial configuration, through a list of steps that identifies the preferred alignment between hieroglyphs and transliteration. (Only those configurations are depicted here that are relevant to the discussion in the running text.)

86

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

the alignment algorithm considerably, and therefore we have not attempted to solve the general case in the current implementation.

4. EXPERIMENTS The first text that was considered is the Shipwrecked Sailor. It was found to be very suitable for experimentation with different variants of the alignment algorithm, as it may be the least complicated of all the longer Middle Egyptian texts, having only minor lacunas and few problematic readings. The annotated sign list was not changed after we started our experiments, but changes were made to the software at this stage. This means that the error rates cannot be taken as typical for unseen texts of the same level of difficulty, let alone unseen texts of higher levels of difficulty due to, for example, unusual writings of words. We produced a hieroglyphic encoding of the text, and a transliteration that closely follows the conventions of Hannig (1995), the same dictionary from which the annotated sign list was extracted. By these conventions, the text is 1014 words long. A compound word consisting of two parts connected by a hyphen was counted as one word. Also suffix pronouns were not counted separately. In a first phase, we segmented the hieroglyphic encoding manually, marking the first sign of the writing of each word. A simple graphical user interface was developed to help this process, allowing signs to be marked by mouse clicks, while putting the corresponding word from the transliteration under the position of each marked sign, and showing the next few words from the transliteration. In a second phase, the automatic alignment was run to find the first sign corresponding to each word. This was compared to the manual alignment, and the graphical user interface then identified the differences by highlighting. Some auxiliary tools were added to provide explanations why certain mismatches between manual and automatic alignment arose. This includes a tracer, showing the steps of the alignment process for a selected part of the text. Among the 1014 words, only 12 errors were made by the automatic alignment. These can be divided into 8 errors that are due to gaps in the annotated sign list, and only 4 that are due to inadequacies of the orthographic model. Examples of gaps in the sign list are the absence of the reading of A50 (‘man of rank seated on chair’) as ideogram for Spsj, and the

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

87

absence of the reading of A12 (‘soldier with bow and quiver’) as ideogram for mSa. The crudeness of the orthographic model accounts for the failure to match F20 (‘tongue of ox’) with reading as phonogram ns against a subsequence of consonants in the word nj-sw. One error occurs in xprt.n rdjt, where the second occurrence of D21 (‘mouth’) is matched to the r from xprt.n rather than the r from rdjt; the problem here is that the model does not pose enough restrictions on the order of phonograms. In two occurrences of snTr (‘incense’) the honorific transposition of R8 (‘cloth wound on a pole’) misleads the model into taking the sign as determinative of the preceding word. Whereas each of the above errors could clearly be eliminated by an ad hoc patch of the model, it seems likely that every unseen text will raise new problems, and a 100% accuracy is beyond reach. Furthermore, a frequent observation in computational linguistics is that tweaking models to correctly handle specific cases may inadvertently lead to other cases being handled incorrectly. Moreover, increasing coverage, for example, by adding possible readings to the sign list, may well lead to a decrease in accuracy. On the positive side, for each of the cases discussed above, no trailing errors in subsequent words ensued. This means the algorithm is very robust, in the sense that local errors do not tend to spread to larger parts of the text. Moreover, for purposes of producing interlinear representations, it may not be a cause for great concern to have the start of a word misidentified by a distance of only one or two signs. In a second experiment we investigated Papyrus Westcar, repeating the above procedures. This was done after all parameters of the model had been fixed. This means that the results can be seen as typical for unseen texts of the same level of difficulty. However, due to the many lacunas, it was often problematic to identify the sign occurrence where we would want the automatic alignment to find the beginning of a word. Mismatches between manual and automatic alignment that arose as a direct result of lacunas have therefore been ignored, leaving 81 errors, among the 2683 words of the transliteration. Of these errors, 24 are due to gaps in the annotated sign list, and the remaining 57 must be blamed on inadequacies of the orthographic model. Among the latter, the most frequent problem is honorific transposition within a single word, accounting for 33 errors. Of these, 14 occur in the writing of nsw-bjtj and 6 in the writing of snTr.

88

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

The other inadequacies of the orthographic model were found at pairs of consecutive words sharing one or more consonants, accounting for 24 errors. Most notably, in 14 occurrences of Dd.jn Ddj the first occurrence of R11 (‘column imitating a bundle of stacks’), with reading as phonogram Dd, is incorrectly taken as part of the writing of the first word Dd.jn, rather than the second word Ddj, which has two occurrences of R11. (See Gardiner (1957, 502) for the reading of two consecutive occurrences of R11.)

5. FUTURE WORK The above reported preliminary results from work in progress. The orthographic model we have described allows a large spectrum of refinements, and the current project plans to pursue several of them. This includes evaluation on the basis of a wider range of texts. A first priority will be the creation of a sign list that contains more detailed and accurate annotations on possible readings of signs. Although the recent Unicode proposal (Everson and Richmond, 2007) greatly contributes to the standardisation of signs used in electronic encoding of hieroglyphs, it is regrettable that no accompanying document is currently being planned that summarises and updates the information about the signs collected by Gardiner (1957; as well as several other documents). It cannot be emphasised enough that electronic resources offering such information are of the highest importance to automatic processing of hieroglyphic texts. The creation of annotated sign lists in an electronic format also forces us to look closer at the different classes of signs and their functions in the writing of words. Whereas some Egyptian grammars distinguish between only three different classes of signs, viz. phonograms, ideograms (also called logograms) and determinatives, some publications use a finer distinction. Schenkel (1971) in addition offers a formal description of how words are composed of signs with various functions. This description cannot be readily employed for our purposes however, as no sign list exists that is annotated with corresponding functions. Furthermore, Schenkel’s work does not directly link hieroglyphic writing to transliteration. For example, it does not specify how to deal with phonetic complements. The annotated sign list that we used in the experiments was derived from the ‘Zeichenliste’ of Hannig (1995). The original list distinguishes between Phon, Log, Abk, Det, Phono-Det, and Log/Det. Whereas the informal meanings of these concepts may be clear, it is less obvious what

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

89

functions these classes of signs should have in a formal model of orthography. With refinements of the orthographic model, the mechanism of penalties described in Section 3 may become harder to maintain. The finer the constraints are that one imposes on the orthography, the more frequently will constraints be violated by valid orthographic analyses of actually occurring hieroglyphic text, and thereby penalties may be incurred in all competing analyses. The selection of the desired analysis will therefore often depend on a suitable choice of the relative heights of different kinds of penalties. Regrettably, human intuitions tend to be quite unreliable when it comes to estimating quantitative aspects of language or, in this case, writing systems. We therefore need to investigate stochastic approaches, to replace penalties by probabilities that are automatically estimated on the basis of annotated or unannotated hieroglyphic texts. Due to the nature of the writing system, which lacks unique standardised spellings, and due to the sparsity of the data, it would be infeasible to estimate the probability of each possible variant spelling of each word separately. A more promising approach is to compute parameters that abstract away from the actual consonants of a word in transliteration, looking at the order in which, for example, phonograms are used to represent the consonants in respective positions. As an example, consider the writing of nst (‘throne’) as: C∑ N35 : F20 - X1 : W11 4® The exact probability of this writing, given the word in transliteration, is: P(N35,F20,X1,W11 | nst) = P(N35 | nst)⋅ P(F20 | nst,N35)⋅ P(X1 | nst,N35,F20)⋅ P(W11 | nst,N35,F20,X1)⋅ P(end | nst,N35,F20,X1,W11). For example, the third factor in the right-hand side of this equation should be read as the probability that X1 is the third sign in the writing of nst, following the signs N35 and F20 in this order. The final factor represents the probability that the word ends after the given list of four signs. Whereas accurate estimation of each of the factors in the above is infeasible, we can approximate them by for example: P(N35,F20,X1,W11 | nst) ≈ P(*-- | ---)⋅⋅

90

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

P(**- | *--)⋅⋅ P(--* | **-)⋅⋅ P(det | ***)⋅⋅ P(end | ***), given the information that N35 can be a phonogram n, which concurs with the first consonant of nst, W11 can be a determinative, etc. Each minus sign or asterisk in the above represents a position in the word in transliteration. The asterisks in the left-hand sides of factors of the form P(⋅ | ⋅) are the consonants covered by the next phonogram. The asterisks in the right-hand sides indicate the consonants that have been covered by previous phonograms. For example, P(--* | **-) represents the probability that the next sign is a phonogram matching the third consonant of a three-consonant word, given that the first and second consonants have already been covered by previous phonograms. There are many variants of such a model. For example, probabilities can be conditioned on the previous one or two signs (cf. bigrams and trigrams), or appropriate abstractions from those signs, making use of ‘smoothing’ of probabilities in the case of sparse training data. Very similar techniques exist for other applications in computational linguistics, such as part-of-speech tagging (Manning and Schütze, 1999). So far we have assumed that hieroglyphic text is processed as a linear list of signs, without indication of the exact relative positioning. In particular, line breaks and the separations between quadrats are ignored. There are cases however where relative positioning is essential to the correct reading of hieroglyphs. One classical example is m-Xnw written with N35a (‘three ripples of water’) below W24 (‘bowl’); see Gardiner (1957, 134). In the investigated texts, no examples were found of incorrect alignment of hieroglyphs and transliteration that could be amended if relative positioning beyond a purely linear order were to be taken into account. It cannot be excluded however that relative positioning could help to increase the accuracy of alignment. Signs may generally be grouped together following aesthetic principles, irrespective of how a sequence of signs is to be segmented into words. For example, if the last sign of one word and the first sign of the following word are both roughly one quadrat in width and half a quadrat in height, they may be grouped together into a single quadrat, with one sign above the other. An interesting conjecture by Horst Beinlich (personal communication) is however that there was a certain tendency to let the boundaries

AUTOMATIC ALIGNMENT OF HIEROGLYPHS

91

between consecutive words concur with boundaries between consecutive quadrats. This merits further investigation. To the extent the conjecture may be confirmed by the data, it holds the potential to enhance the accuracy of automatic alignment. We have found that the penalties discussed in Section 3 sometimes signal errors in the hieroglyphic encoding, often due to a confusion between signs with similar appearances. Another suggestion for further research is therefore to develop tools that highlight potential errors in hieroglyphic transcriptions. Lastly, it should be pointed out that the most useful applications of automatic alignment are at this moment hindered by the fact that many hieroglyphic transcriptions and translations are available only in printed form. It is highly desirable, for this reason and for many others, that scholars in the future will make more of their textual resources available in suitable electronic formats, either free of copyright or at least explicitly allowing use within viewing software.

6. CONCLUSIONS Whereas the work reported here is in early stages, some conclusions can already be drawn. First, automatic alignment of hieroglyphs and transliteration is feasible with very simple techniques, without using lexica or grammatical knowledge. The accuracy may vary across texts, but experiments show that at least some texts allow a very high accuracy. In addition, there is ample room for refinements of the discussed techniques, with the potential to further reduce the error rate. Second, our work underlines the importance of standardisation of hieroglyphic encoding. In addition, the creation of electronic resources, such as annotated sign lists documenting the possible functions of signs in the writing of words, is essential for automatic processing of texts.

7. ACKNOWLEDGEMENTS Many of the presented ideas were inspired by unpublished work by Serge Rosmorduc on automatic transliteration, and I am greatly indebted to him for many fruitful discussions. Much gratitude goes to Nigel Strudwick for his technical assistance with the typesetting of this article. I am also very

92

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

grateful to Horst Beinlich for discussions on the orthography of Egyptian, and to Norbert Stief for correspondence about PLOTTEXT. This article was written with the generous assistance of a fellowship from the Leverhulme Trust.

REFERENCES J. Buurman, N. Grimal, M. Hainsworth, J. Hallof, and D. van der Plas. Inventaire des signes hiéroglyphiques en vue de leur saisie informatique. Institut de France, Paris, 1988. S. Bird and M. Liberman. A formal framework for linguistic annotation. Speech Communication, 33: 23–60, 2001. M. Everson and B. Richmond. Proposal to encode Egyptian hieroglyphs in the SMP of the UCS. Working Group Document ISO/IEC JTC1/ SC2/WG2 N3237, International Organization for Standardization, 2007. A.H. Gardiner. Egyptian Grammar. Griffith Institute, Ashmolean Museum, Oxford, 1957. W.A. Gale and K.W. Church. A program for aligning sentences in bilingual corpora. Computational Linguistics, 19(1): 75–102, 1993. R. Hannig. Grosses Handwörterbuch Ägyptisch-Deutsch: die Sprache der Pharaonen (2800–950 v.Chr.). Verlag Philipp von Zabern, Mainz, 1995. C.D. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 1999. M.-J. Nederhof. Alignment of resources on Egyptian texts based on XML. In Proceedings of the 14th Table Ronde Informatique et Egyptologie, 2002a. On CD-ROM. M.-J. Nederhof. A revised encoding scheme for hieroglyphic. In Proceedings of the 14th Table Ronde Informatique et Egyptologie, 2002b. On CD-ROM. S. Rosmorduc. Transducteurs pour la translittération des hiéroglyphes. Unpublished paper presented at TALN 2001, 2001. W. Schenkel. Zur Struktur der Hieroglyphenschrift. Mitteilungen des deutschen archäologischen Instituts, Abteilung Kairo, 27: 85–98, 1971. R. Sproat, C. Shih, W. Gale, and N. Chang. A stochastic finite-state wordsegmentation algorithm for Chinese. In 32nd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pages 66– 73, Las Cruces, New Mexico, USA, 1994. N. Stief. Hieroglyphen, Koptisch, Umschrift, u.a. – ein Textausgabesystem. Göttinger Miszellen, 86: 37–44, 1985.

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN: TRAITEMENT XML DES TEXTES DES PROCESSIONS DE SOUBASSEMENT DES TEMPLES TARDIFS

Vincent Razanajao

ABSTRACT

An important element of the grammar of the temple, the processions of Nile Gods inscribed on the lower parts of walls in temples of Ptolemaic and Roman date form an interesting corpus for the study of toponyms or for the history of religions. When the author came to consider the form the publication of this corpus should take, the idea of an electronic publication and treatment by XML seemed very promising. The aim of this paper is not to elaborate the Document Type Definition (DTD) dedicated to Egyptian texts but to put forward the author’s comments subsequent to having used the corpus. The subject of the project being textual corpora, the approach will follow that of the Text Encoding Initiative (TEI). After a brief history of processions and an analysis of their structure, a series of adjustments of the TEI tags will be presented with an explanation of the bias chosen. The outline of an XML file is appended at the end of the paper.

Les réflexions d’ordre informatique dont je souhaite faire part dans les quelques pages qui vont suivre sont nées des questionnements apparus lors de l’établissement d’un projet d’édition des textes des processions ornant les soubassements des temples tardifs, principalement d’époques grecque et romaine. Lorsqu’il s’est agi de réfléchir à la forme que devait prendre ce corpus, l’idée d’une publication électronique a tout de suite paru opportune et,

93

94

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

plus précisément, un traitement par XML de ces textes a semblé tout à fait intéressant. L’aspect électronique répond à la volonté d’aller au-delà de la publication des textes pour elle-même, l’objectif étant de développer un outil permettant une interrogation multiple et illimitée de la matière contenu dans ces textes. La possibilité serait ainsi offerte à tout chercheur d’obtenir aisément, par exemple, l’ensemble des textes relatifs à la province d’Égypte sur laquelle il travaille. LE XML, SES PRINCIPES ET SES APPLICATIONS DANS LES ETUDES DE CORPUS TEXTUELS

Principes généraux du XML Le XML, ou «eXtensible Markup Language», a pour principe d’augmenter toute chaîne de caractères – et donc tout texte – de «balises» invisibles permettant d’ajouter des informations sur les mots qui la composent. Par exemple, le balisage XML de la notice «égyptologie» du Petit Robert consisterait à indiquer que: -le premier mot, «égyptologie», est l’entrée de la notice; -«mil. XIXe» est la date de la première occurrence; -«de Égypte et –logie» constitue l’étymologie; -«étude scientifique de l’Égypte ancienne» en donne le sens. Les balises XML étant de petites séquences entre signes < et >, une traduction en XML de cet exemple donnerait: Égyptologie, mil. XIXe, de Égypte et –logie. Étude scientifique de l’Égypte ancienne. Les règles de saisie XML Le traitement d’un texte selon les normes XML passe par la définition préalable du canevas et des règles qui seront utilisés pour la saisie. L’ensemble de ces règles est répertorié dans ce qui s’appelle une DTD, ou «Document Type Definition», une sorte de document modèle qui énonce les règles propres à la mise en XML des textes du corpus correspondant.

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN

95

XML et corpus textuels: les préceptes de la TEI Traiter de manière informatique le corpus qui nous intéresse nécessite de s’inscrire dans les problématiques développées par la TEI, ou Text Encoding Initiative (http://www.tei-c.org/), «que l’on pourrait traduire, ainsi que cela a été très justement écrit1, par groupe d’initiative pour le balisage normalisé des textes» et qui constitue «une norme de balisage, de notation et d’échange de corpus des documents électroniques»2. La TEI, par la publication régulière de recommandations3, propose un large panel de balises XML qu’il s’agit d’adapter aux besoins de son propre objet d’étude. Loin d’être contraignante, cette nécessité de prendre comme assise les jeux de balises et autres composants XML définis dans la TEI permet de rendre intelligibles les corpus entre chercheurs. De surcroît, si le but premier de la TEI est de permettre et de faciliter l’échange des corpus eux-mêmes, un autre atout de suivre la TEI est que cela offre une possibilité d’analyser et de confronter les approches en elles-mêmes. S’inscrire dans la logique TEI permet ainsi de s’appuyer sur l’expérience acquise dans l’élaboration des corpus de textes électroniques. Pour notre propos, notre regard doit également se porter sur les groupes de travail qui ont plus particulièrement pour objet la publication en ligne des textes épigraphiques ou en langues anciennes. Au premier rang de ces groupes de travail, il faut citer EpiDoc, ou Epigraphic Documents in TEI XML (http://epidoc.sourceforge.net/), et le Perseus Project (http://www.perseus.tufts.edu/), deux émanations de la TEI particulièrement intéressantes pour notre réflexion. L’objectif du présent article étant moins de définir la DTD propre aux textes égyptiens que de soumettre les quelques conclusions qu’il m’a paru possible de tirer de mes interrogations sur le corpus des textes des processions des temples tardifs, il s’agit désormais de se pencher plus avant sur ces textes qui s’organisent selon une structure qui sied particulièrement à un traitement XML.

1

2 3

L. Romary, H. Hudrisier, «TEI – Text Text Encoding Initiative», article en ligne sur le site Internet du RIFAL, Réseau international francophone d’aménagement linguistique, http://www.culture.gouv.fr/culture/dglf/rifal/tei.htm (dernière consultation: octobre 2008). Ibid. Nous en sommes actuellement à la cinquième édition: http://www.teic.org/Guidelines/P5/ (dernière consultation: octobre 2008).

96

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

LES PROCESSIONS DE SOUBASSEMENT DES TEMPLES TARDIFS: HISTOIRE ET TYPOLOGIE

Élément important de la grammaire du temple, les processions de figures ornant les soubassements des temples d’époques ptolémaïque et romaine constituent un corpus dont la richesse s’avère particulièrement intéressante en ce qui concerne non seulement les études toponymiques mais également l’histoire des religions ou celle, plus générale, des mentalités. Les processions dans l’histoire Dès le début de la IVe dynastie et le règne de Snéfrou4, des personnages incarnant les provinces (spA.t) d’Égypte apparaissent dans l’ornementation des parois des temples funéraires royaux aux côtés des figurations des domaines du défunt venus lui apporter les richesses de ses terres. Les processions de personnages androgynes ou féminins seront dès lors le support privilégié pour représenter le territoire et ses richesses. Glissant du monde des défunts à celui des dieux, les défilés viendront enrichir progressivement l’iconographie des temples. Cette évolution est plus particulièrement intéressante à suivre en ce qui concerne les processions d’ordre géographique. Ce n’est cependant pas sous l’aspect d’une procession qu’apparaît la première énumération géographique connue produite dans le domaine religieux non-funéraire. C’est en effet sous la forme d’une liste disposée en tableau qu’apparaît le dénombrement de sépats que le roi Sésostris Ier a fait figurer sur le soubassement de sa Chapelle blanche à Karnak5. Plus tard, sur la Chapelle rouge érigée à Karnak par Hatchepsout – monument dont la fonction est identique à celle de la Chapelle blanche –, le tableau des sépats ornant le soubassement est remplacé par une procession de figures androgynes et féminines6. Si l’information fournie par ce collège de personnages est bien moins importante puisqu’elle ne se résume guère qu’au seul nom de la province, la charge symbolique peut être considérée comme bien plus importante puisque ce retour à la procession proprement figurée renoue

4 5 6

A. FAKHRY, The Monuments of Snefru at Dashur. II. The Valley Temple. Part I. The Temple Reliefs, Le Caire, 1961, p. 17–58. P. LACAU, H. CHEVRIER, Une chapelle blanche de Sésostris Ier, Le Caire, 1956, p. 207–251. Id., Une chapelle d’Hatchepsout à Karnak, [Le Caire], 1977, p. 69–92.

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN

97

avec une forme plastique finalement beaucoup plus expressive que le tableau. Comme souvent dans l’iconographie égyptienne, le texte vient soutenir l’image, mais quoique connaissant huit variantes7, les formules accompagnant les porteurs d’offrandes de la Chapelle rouge d’Hatchepsout restent très stéréotypées et ne présentent que de manière très générique les biens présentés: «Je t’apporte toutes les choses bonnes et douces issues de la Terre du Nord», «toutes les offrandes fraîches et pures», «toutes les provisions», etc. Ce n’est qu’à partir de l’époque ramesside que la nature de ces formules s’étoffe et que celles-ci abandonnent leur aspect stéréotypé, allant parfois même jusqu’à évoquer des spécificités régionales propres8. À cette même époque, les processions présentant des figures incarnant les produits et les richesses du pays connaissent également un regain d’intérêt9. Connues par des modèles remontant à l’époque Sheshonquide10, les processions de «Nils» gravées sur les soubassements des temples d’époque ptolémaïque puis romaine vont acquérir une réelle épaisseur quant au contenu de leurs textes. Les exemples les mieux conservés sont assurément ceux des temples d’Horus à Edfou et d’Hathor à Dendara, mais rares sont les temples qui ne comportent pas de ces défilés dans leur programme décoratif (Figure 1). Formes et types des processions en œuvre sur les parois des temples gréco-romains Dans cette époque mémorielle où les hiérogrammates puisent dans leurs archives pour compiler le compendium sacré que l’on peut lire aujourd’hui, entre autres, sur les parois des derniers temples, les motivations profondes dans le choix des toponymes et des entités représentées sont complexes11. Le sujet de ces processions est tant à la fois géographique – dans le cadre d’une une géographie sacrée –, économique ou mythologique, ces trois domaines se mêlant de manière étroite. 7 8 9

10

P. LACAU, H. CHEVRIER, Une chapelle d’Hatchepsout à Karnak, [Le Caire], 1977, p. 71. J. YOYOTTE, AEPHE Ve section 98, 1989–1990, p. 179. Voir par exemple la procession du temple de Séthy Ier à Gurna (H. BRUGSCH, Die Geographie des alten Ägyptens, Geographische Inschriften altägyptischer Denkmäler 1, Leipzig, 1857, pl. XII). Id., «Note sur le bloc de Sheshonq I découvert par la mission archéologique à Saqqara de l’Université de Pisa», Egitto e Vicino Oriente 12, 1989, p. 33–35.

98

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

En ce qui concerne les processions ornant les soubassements des temples gréco-romains, quatre grandes catégories ont été mises en évidence12: 1) les processions géographiques, lesquelles se décomposent en: a) Les défilés des seules quarante-deux figures représentant chacune une sépat d’Égypte; b) Les processions plus développées dites quadripartites, défilés ainsi dénommés parce que chaque province est représentée par quatre personnages, incarnations des quatre composantes d’une province: la spA.t elle-même, le canal-mr et les territoires w et pHw qui lui sont propres. 2) les processions hydrologiques; 3) une variante de ces dernières où sont mis en scène des couples inondation/campagne; 4) les processions économiques. Structure générale des textes des processions de soubassement Quoique des variations en nuancent l’apparente unicité, ainsi que le montre la typologie rappelée ci-dessus, les textes des processions forment un ensemble cohérent tant du point de vue de la forme que du contenu, ce qui rend particulièrement intéressante leur mise en corpus. Ainsi, en prenant en considération les seules processions géographiques par exemple, un rapide examen permet de voir que les textes de chaque province s’articulent autour de deux syntagmes13: un premier présentant l’entité de la procession et les produits qu’elle apporte avec elle; un second définissant un certain nombre d’assimilations divines destinées à caractériser la divinité récipiendaire en fonction de l’apport (la tradition égyptologique désigne cette seconde partie sous le nom de «glose d’assimilation»). À l’intérieur de ces deux syntagmes, plusieurs composantes grammaticales et sémantiques donnent corps à ce système. Si l’on prend par exemple le texte de la procession concernant la XIXe sépat de Basse-Égypte, il est possible de mettre en exergue différents éléments tel que cela est illustré dans la Figure 2. 11

12 13

Voir J. YOYOTTE, Orientalia 35, 1966, p. 46; id., AnnEPHE Ve section 75, 1967– 1968, p. 106; 91, 1982–1983, p. 217–221; Chr. ZIVIE-COCHE, Tanis. Statues et autobiographies de dignitaires. Tanis à l’époque ptolémaïque, TTR 3, Paris, 2004, p. 294; V. RAZANAJAO, « Le Delta à Basse époque: géographies d’un territoire», Égypte. Afrique et Orient 42, juin 2006, p. 3–9. J. YOYOTTE, Annuaire du Collège de France 94, 1993–1994, p. 685–686. Dans le sens large donné à ce mot par F. de Saussure, Cours de linguistique générale, Lausanne, Paris, 1916, p. 172.

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN

99

Syntagme d’assimilation

Syntagme de présentation

Figure 1: Edfou, soubassement du mur ouest de la cour du temple d’Horus. Début de la procession quadripartite des sépats de Basse-Égypte © Photo de l’auteur. Verbe de l’apport suivi du pronom désignant le roi

Prép. du datif + pronom désignant la divinité récip.

Entité apportée

Préposition introduisant l’offrande

Produits apportés par le personnage

Toponyme

Toponyme

in≠f n≠k Jm.t-pH Xr mXr.w≠s m jrp n(y) TA-nTr Il t’apporte Imet-peh chargée de ses produits qui sont le vin de Ta-netjer Ntk jm nTrj xnty wnm.t≠f sD.tj wr xnty jAb.t≠f Car tu es l’enfant-imti divin qui préside à son œil-droit, l’enfant-sdjti vénérable qui préside à son œil gauche

Pronom désignant la divinité récipiendaire

Assimilations dont la divinité récipiendaire est l’objet

Figure 2: Mise en exergue des éléments composant les textes accompagnant une figure de procession, en l’occurrence la sépat de la XIXe province de Basse-Égypte (Edfou IV, 37, sqq.)

100

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Un traitement selon les normes XML du texte de la Figure 2 consisterait à indiquer en premier lieu qu’il est question de la procession ornant le soubassement du pronaos du temple d’Edfou (= Edfou IV, 21–39 pour la Basse-Égypte; IV, 172–193 pour la Haute-Égypte). À un niveau inférieur, il faudrait renseigner le fait qu’il s’agit du texte de la XIXe sépat (Edfou IV, 37, 3–38, 1) puis que de «in=f» à «TA-nTr», nous avons affaire au syntagme de présentation, et de «ntk» à «jAb.t=f», au syntagme d’assimilation. Par la suite, il est possible de marquer les différents éléments constitutifs du texte, qu’il s’agisse d’objets grammaticaux (préposition introduisant l’apport, pronom introduisant la divinité récipiendaire) ou sémantique (assimilations divines dont la divinité récipiendaire est l’objet). Un balisage XML permettrait également de renseigner le type de personnage qui incarne l’entité apportée, celui de sa coiffe, sa position dans la procession. D’un caractère plus «flottant», un autre balisage aura son importance: celui des toponymes. Pouvant apparaître dans les deux syntagmes, les noms de lieux pourront ainsi être marqués, quelle que soit leur position dans le texte. La Figure 3 donne l’ébauche de la structure générique des processions qu’il paraît possible de mettre en avant. Il reste à s’interroger sur la manière que l’on pourrait user pour traduire en XML ce schéma générique, et quels éléments et balises de la TEI il s’agit de prendre en considération. POUR UNE DFINITION D’UNE DTD-TEI PROPRE AUX TEXTES ÉGYPTIENS

Un premier jet de DTD liée aux textes hiéroglyphiques a vu le jour en 200014. Comme son auteur S. Rosmorduc l’explique en introduction, cette DTD n’avait pas pour vocation de baliser les textes hiéroglyphiques d’un point de vue sémantique mais bien du seul point de vue formel. La question était centrée sur les problèmes que pose la saisie même des textes hiéroglyphiques et la manière de les encoder, indépendamment presque du XML, suivant les préceptes du Manuel de codage. Mais, et particulièrement dans le cadre qui nous occupe ici – celui des textes des processions de soubassement –, il s’avère que l’intérêt du XML et de l’application des 14

Consultable sur les pages Internet de S. Rosmorduc à l’adresse http://webperso.iut.univ-paris8.fr/~rosmord/HieroEncoding/DTD/ (dernière consultation: octobre 2008).

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN

101

Procession

Sépat

Sépat

Figure Sexe Numéro d’ordre Pavois

Sépat

Sépat

Sépat

Texte Syntagme de présentation Verbe introductif

Syntagme d’assimilation Vocabulaire

Prép. datif + pron. div.

Toponyme

Entité apportée

Toponyme

Produits apportés

Toponyme

Pronom introductif Assimilations Assimilation Assimilation

Figure 3: Arborescence de la structure générique applicable aux processions de soubassement

recommandations TEI réside dans les réponses que ceux-ci apportent à la façon dont on peut appréhender un corpus textuel sur le plan sémantique. Malgré les apparentes dissemblances qui séparent les textes égyptiens des manuscrits médiévaux ou modernes – principal sujet d’études des corpus TEI –, il semble possible d’affirmer que l’ensemble des entités et balises contenues dans la TEI répondent aux besoins, à l’instar de ce qui a été fait dans le cadre du projet EPIDOC. Aucune création de balise n’est nécessaire car l’adaptation de celles existantes par le biais des définitions d’attributs est très souple et extensible. Dès l’abord, il apparaît que les entités définies par la TEI afin de renseigner les informations générales sur un manuscrit et destinées à figurer dans l’en-tête fonctionnent parfaitement. Le «header» du fichier XML sur les processions pourra donc reprendre les éléments de la TEI suivant:

( file description) contient la description du fichier électronique d’un point de vue bibliographique.

102

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

(text-profile description) contient la description du texte contenu dans le fichier électronique du point de vue de son élaboration, de son contexte de production, de sa langue (balises , , etc.). (classification declarations) répertorie les références qui seront utilisées pour des renvois externes au document. Par exemple, il est possible de définir les abréviations bibliographiques qui seront utilisées tout au long du corpus (Dendara, Edfou, Wb, PM, etc.). Afin de baliser la structure générale de chaque texte relatif à une figure, des avec trois «types» différents pourraient être utilisées: , division concernant les informations sur la figure elle-même (genre, numéro d’ordre, pavois); , division concernant les textes du syntagme de présentation; , division concernant les textes du syntagme d’assimilation; À l’intérieur de chacune de ces divisions, il s’agit de pouvoir saisir des hiéroglyphes, de la translittération et de la traduction. Les suivantes pourraient être utilisées: pour les hiéroglyphes; pour la translittération («al» = «alphabetic»15); pour la traduction. Ces dernières étant d’un niveau inférieur à celui des , etc., l’utilisation de «div» hiérarchisées sera un avantage. Ainsi, pour les grandes parties de texte, pour les séquences hiéroglyphiques, de translittération et de traduction. L’adaptation de l’existant permet également de ne pas avoir à créer de balises particulières pour l’encodage des signes hiéroglyphiques. Il nous paraît ainsi envisageable d’utiliser les balises suivantes:

(character or glyph) pour le balisage d’un signe hiéroglyphique unique.

15

Les suffixes « hi », « al » et « tr » sont repris de M. J. NEDERHOF, « Alignment of Resources of Ancient Egyptian Texts Based on XML », dans Proceedings of the 14th Table Ronde Informatique et égyptologie, Pise, 2002, ressource en ligne: http://www.cs.st-andrews.ac.uk/~mjn/egyptian/align/index.html, p. 2 (dernière consultation: octobre 2008).

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN

103

(segment) permettrait de procéder aux regroupements de signes. Une définition adéquate de l’attribut type permettrait d’indiquer le type de regroupement: cadrat ou, à l’intérieur d’un cadrat, groupe horizontal, vertical, ou encore ligature. e.g.: W25A9 nk

A18pH O49

L’indication de hachure pourrait être portée sur un texte par l’utilisation de la balise dont le type serait défini comme «hatch». Les balises TEI , , , , le couple / (ou la balise ?) et permettront de porter toutes les indications utiles à l’édition du texte. En ce qui concerne la translittération et la traduction, diverses balises permettront d’affiner l’encodage sémantique des mots et séquences qui composent le texte:

(word ) ainsi que la définit la TEI, «represents a grammatical (not necessarily orthographic) word». Cette balise permettra par exemple de marquer un toponyme. L’attribut «lemma» sera utile pour préciser la forme exacte du mot. e.g.: le balisage de la translittération de pourrait être rendu par: Jm.t-pH

(referencing string),

permettrait de marquer, par exemple, que la partie du texte ainsi balisée concerne les produits apportés. e.g.: cf. texte du syntagme d’assimilation de la Figure 2. ntk jm nTrj xnty wnm.t≠f sD.tj wr xnty jAb.t≠f Comme on peut le voir, les préceptes établis par la TEI répondent aux besoins de l’égyptologue en matière de traitement d’un corpus textuel. Je ne saurais prétendre, par cet article, établir une DTD-TEI propre aux textes égyptiens, ceci nécessitant un évident travail en commun. Aussi, je n’approfondirai pas plus la description des adaptations qu’il m’a paru possible de mener sur les balises XML-TEI. J’ai toutefois annexé en fin d’article une ébauche de fichier XML qui permet, je l’espère, d’illustrer la manière dont il paraît possible d’aborder les textes égyptiens.

104

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Les réflexions dont je vous ai fait part ont été induites par la nature même du corpus que je souhaitais traiter, celui des textes des processions de soubassement des temples tardifs. Cet article se veut avant tout l’annonce de ce projet et une pierre dans l’établissement d’une DTD pour les textes en ancien égyptien. Il apparaît nécessaire, pour l’avenir, d’établir un groupe de travail à l’image d’EPIDOC, groupe de travail égyptologique qui aurait sa place toute trouvée dans le cadre des tables rondes Égyptologie & informatique. ANNEXE. ÉBAUCHE DE FICHIER XML-TEI POUR L’ENCODAGE DES TEXTES DES PROCESSIONS DE SOUBASSEMENT DES TEMPLES TARDIFS En-tête de fichier avec le titre du corpus dans son entier; Filedesc: description Corpus des textes des processions ornant les soubassements des générale du fichier XML, temples ptolémaïques et romains

avec indication de l’éditeur

(respStmt);





Corpus of procession-texts inscribed on the wall-basement of Egyptian Temples of Ptolemaic and Roman Periods



Edited by Édité par Vincent Razanajao





Édition des textes des processions ornant les soubassements des temples égyptiens d’époques grecque et romaine



Publication of the texts of processions inscribed on lower parts of walls in Egyptian temples of Greek and Roman date





Périodes ptolémaïque et romaine

Égypte

Égypte



Description générale du corpus, avec indications de dates et de localisation (creation);

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN



Le temple d'Edfou MIFAO 10-11, 20-31

Marquis de Rochemonteix Emile Chassinat Le Caire 1897-1934



Topographical Bibliography of Egyptian Hieroglyphic Texts, Reliefs and Paintings B. Porter R. Moss Oxford 1927-1972



105

Définition des abréviations (taxonomy) bibliographiques qui seront utilisées dans le corpus (ici, définition de “Edfou” et “PM”). - title level="s": titre de collection

Début du sous-corpus rassemblant les textes du temple d’Horus d’Edfou, avec les informations Corpus des textes des processions de soubassement du temple relatives aux dates et d’Horus d’Edfou

localisation (profilDesc).







Époque ptolémaïque

Edfou







Procession du corridor extérieur



Edfou IV, 21-42, 170-194



Début du sous-corpus des textes ornant le corridor extérieur du temple d’Edfou. - ref target permet d’utiliser l’abréviation définie plus haut;

106

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008



Ptolemy V Épiphane et Cléopâtre III

Temple d’Horus à Edfou

Corridor extérieur



_________________________

_________________________



19e province de Basse-Égypte



Textes relatifs à la première sépat de Basse-Égypte. Textes concernant les sépats 2 à 18 Début de la section concernant les textes relatifs à la XIXe sépat de BasseÉgypte. Header avec les indications bibliographiques

Edfou IV, 37, 3-38,1



PM VI, 157, (290)-(294)







1 - La Sepat 73 Homme

Pavois

A18 F22

Textes de la sépat elle-même; A. Section relative au personnage lui-même, indications (rs) sur: - le sexe; - numéro d’ordre dans la procession; et indication sur ce que porte le pavois (div type="pavois");

CORPUS ÉLECTRONIQUES DE L’ANCIEN ÉGYPTIEN

Jm.t-pH

Syntagme de présentation

W25A9

nk

A18

pHO49

F32

ma

W13r ii X4AZ2 zimir pi i X4AZ2 nw Z1 nTrtA tt xAst

jn.f n. k Jm.t-pH

Xr mXr.w.s m jrp n(y) TA-nTr



107

B. Section relative au syntagme de présentation: - Texte hiéroglyphique; - Translittération; - Traduction.

Il t'apporte Imet-peh chargée de ses produits qui sont le vin de Ta-netjer

Assimilation syntagme

Saisie du texte hiéroglyphique

C. Section relative au syntagme d’assimilation

108

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

ntk jm nTrj xnty wnm.t.f sD.tj wr xnty jAb.t.f

Car tu es l'enfant-imti divin qui préside à son oeil-droit, l'enfantsedjti vénérable qui préside à son oeil gauche



_________________________

_________________________

_________________________



_________________________

_________________________



Textes du canal

Textes du ou Textes du péhou

_________________________

Textes relatifs à la 20e sépat.



Clôture des sous-corpus et corpus : 1. Corridor extérieur d’Edfou; 2. Temple d’Horus à Edfou; 3. Textes des processions.

AN OFFERING TO AMUN-RA: BUILDING A VIRTUAL REALITY MODEL OF KARNAK

Elaine Sullivan, Willeke Wendrich

ABSTRACT

The use and creation of ‘Virtual Reality’ (VR) models in the field of Egyptology has increased in recent years, as lower-cost digital modeling software comes within reach of more archaeological projects. The use of VR models for the recording of archaeological data and the display of site reconstructions offers incredible potential for teaching, research and outreach. The creators of a model of the Amun temple at Karnak, in its second year of ‘construction’ at the University of California at Los Angeles, present the results of one such project. This paper discusses the advantages of using ‘virtual’ representations for illustrating complex, multi-period archaeological sites; the issues involved in the three-dimensional reconstruction and portrayal of architecture and features that are no longer extant; and the decision-making process involved in the design of the UCLA Karnak model. The Karnak temple model highlights potential future uses of such models for Egyptology: as effective teaching tools for communicating information to students and the general public, as a means to test research hypotheses, as a way to ‘virtually’ recontextualize statuary, stelae, and other objects removed from their original locations, and for the visualization of entire ancient cityscapes. INTRODUCTION

Over the last year and a half, a team at the Experiential Technology Center of the University of California at Los Angeles has been building a Virtual Reality model of the Amun temple precinct at Karnak for use as a teaching 109

110

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

tool in the college classroom.1 The Digital Karnak Project, directed by Dr. Diane Favro of the Department of Architecture and Urban Design and Dr. Willeke Wendrich of the Department of Near Eastern Languages and Cultures, was funded in part by the National Endowment for the Humanities.2 This paper will explain the advantages of presenting complex archaeological information in a three-dimensional form, the general concerns of the academic community regarding the use of modeling technology, and how the Digital Karnak Project has attempted to address these concerns to create a model useful for professional Egyptologists and university-level instructors.

VIRTUAL REALITY IN EGYPTOLOGY The use of ‘Virtual Reality’ (VR) models in archaeology has boomed in the past 15 years, as ancient sites in diverse areas of the world have been recreated virtually by historians and archaeologists. While Egyptian materials have not been included to the same extent as ancient cities or artifacts in Europe, Egyptologists are increasingly interested in the potential of modeling for our field. Many types of projects, including those focused on art, architecture and text, have begun to explore the possibility of displaying information in three dimensions. The projects utilizing this new technology have varying goals, and thus the sophistication of the models used, as well as the methods by which the models are displayed or presented, differ greatly. A few recent projects relating to Egyptology highlight the wide variety of platforms and objectives:

1

2

The Digital Karnak Project’s website is available at http://dlib.etc.ucla.edu/ projects/Karnak/. The Experiential Technology Center (ETC) is under the direction of UCLA professor Dr. Diane Favro. For information about the center or to view ongoing projects, see: http://www.etc.ucla.edu/. The NEH is a grant agency of the United States government. See: http:// www.neh.gov/.

AN OFFERING TO AMUN-RA

111

Models have been used to provide museum visitors with context for objects on exhibit. A digital reconstruction of the funerary Chapel of Ka(i)pura, modeled as part of a traveling exhibition sponsored by the University of Pennsylvania Museum in 1998–2000,3 was shown in tandem with a false door from the tomb.4 In a recent publication of his work on the 17th Dynasty at Thebes, Daniel Polz effectively used digital models (included on a DVD) to place the tomb of Nub-Kheper-Re Intef within the larger funerary landscape of Dra Abu el-Naga, as well show a possible reconstruction of the superstructure of this tomb from multiple angles.5 Scholars have used model recreations to present multiple theories of reconstruction at mostly destroyed sites. Two alternative reconstructions of the labyrinth at Hawara were produced by the University College London.6 The Egypt Exploration Society posted models of reconstructed buildings at Amarna to the internet along with explanatory texts and photographs of the site to the internet as a type of virtual ‘visitor’s center.’ The models and maps are based on architectural plans from the temple and consultation with Amarna expedition staff.7 Mark Lehner, of the Ancient Egypt Research Associates, in cooperation with the Oriental Institute’s Computer Laboratory, created a number of models of the Giza plateau for a NOVA television series about the pyramids. Additional models were created as an extension of the project, and screenshots of the overall plateau as well as individual buildings have now been made available on the Giza Plateau Mapping Project’s home page.8 The team from UCLA and the Rijksuniversiteit Groningen working at the Ptolemaic and Roman site of Karanis has integrated modeling into the 3

4 5 6

7 8

Silverman, D., and E. Brovarski. 1997. Searching for ancient Egypt: art, architecture, and artifacts from the University of Pennsylvania Museum of Archaeology and Anthropology. Dallas: Cornell University Press. A description of the project and screen shots from the model are posted at http://www.learningsites.com/Kapure/Kapure_home.htm Polz, D. 2007. Der Beginn des Neuen Reiches: zur Vorgeschichte einer Zeitenwende. SDAIK 31. Berlin: Walter de Gruyter. Shiode, N. and W. Grajetzki. 2000. A virtual exploration of the lost labyrinth. Encounters with ancient Egypt. http://www.casa.ucl.ac.uk/digital_egypt/ hawara/ http://www.thearchitecturestore.co.uk/. http://oi.uchicago.edu/research/projects/giz/

112

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

larger conservation plan for that site. A 3-D interactive model of the ancient town will be used to monitor wall decay, plan a visitor’s route through the site, and provide visual reconstructions that can be used on information panels for tourists.9

THE DIGITAL KARNAK PROJECT The goals of the UCLA Digital Karnak Project differ somewhat from the projects discussed above. The subject of the project, the Amun-Ra temple at Karnak, is one of the most visited, photographed and researched places in all of Egypt. Hundreds of publications debate each detail of the temple, and every year its layout and design is discussed in art history and architecture classes at universities across America. However, the standing architecture at Karnak conflates over fifteen hundred years of religious and architectural development into a single plane. When looking at a site plan or walking around the precinct today, the temple’s form in its final phase takes precedence, masking the earlier stages of the temple’s existence and obscuring its history and growth. This is especially confusing for students, for whom the unfamiliar list of kings and complex patterns of construction make it almost impossible to comprehend in any meaningful way. The Digital Karnak Project attempts to use the capabilities of digital technology to make Karnak a more understandable site. A free website hosted by UCLA has been designed expressly to provide teaching resources about Karnak for university instructors. This website includes: Quick Time video footage of the digital model integrated with maps and photos from the site; six accompanying thematic texts in PDF form to provide students and instructors with general information on the temple’s religious, political, and historical importance within Thebes and greater Egypt; a simplified Google Earth version of the model that gives students the chance to interactively explore the precinct; and an interactive Time Map where photographs, bibliography and descriptive information on each feature at

9

Wendrich, W., J.E.M.F. Bos, and K. Pansire. 2006. VR modeling in research, instruction, presentation and cultural heritage management: the case of Karanis (Egypt). In: The 7th international symposium on virtual reality, archaeology and cultural heritage VAST 2006, eds. M. Ioannides, D. Arnold, F. Niccolucci, and K. Mania. pp. 225–30. Budapest: Archeolingua.

AN OFFERING TO AMUN-RA

113

the temple is linked spatially to a geo-referenced Google Maps plan of the precinct.

THE KARNAK DIGITAL MODEL AS A TEACHING TOOL The Virtual Reality model includes over sixty structures from the precinct recreated from the published architectural plans of the Centre FrancoEgyptien d’Etude des Temples de Karnak (CFEETK) and a number of other scholars. The model’s main strength lies in its unique ability to present the Amun precinct in four dimensions – not only displaying many of the temple’s structures in 3D, but also showing the temple’s development through time. Moving reign by reign, the model depicts the temple from the earliest hypothesized form in the reign of Senwosret I through the changes of the Greco-Roman Period, integrating the spatial and temporal modifications into a single, easily understood platform. This allows students to view Karnak in a way not possible in two dimensional maps and plans. While many reconstructive models concentrate on adding in now destroyed buildings, the Karnak model also ‘peels away’ the later additions of the temple, emphasizing the precinct’s growth and change (Figures 6–9). Not only does the model allow students to visualize the different chronological stages of the temple’s development, it also situates the temple within the larger ancient landscape by showing how the movement of the Nile impacted the temple’s growth. The hypothesized westward shift of the Nile is integrated into the phasing of the model, making a visual connection between the built and natural environments. Although the temple has been significantly reconstructed in modern times, the loss of many of its features means that only the extant buildings influence the modern viewer’s understanding of the temple. For example, of the seventeen monumental obelisks that originally stood in central and north Karnak, only two remain upright today. The visual importance of these monoliths to the precinct is lost. On the model, however, fifteen of these obelisks have been recreated and replaced, many of them with their original inscriptions added. The obelisks again dominate the temple’s skyline, reminding the viewer of Karnak’s special position as the ‘Heliopolis of the south’ (Figure 4). The obelisks provide a good example of how the digital model can recreate the lost context for destroyed or replaced temple features. Two of these monuments, the ‘unique’ obelisk in east Karnak and one of the obelisks

114

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

before the seventh pylon, were removed from the temple by the Romans and raised in the capitals of the Imperial Empire. Photographs of the remaining portion of the ‘unique’ obelisk (now standing in Rome’s piazza San Giovanni) were added to this feature on the model, virtually ‘repatriating’ the monument to the precinct after more than 1600 years (Figure 5). As well, structures that were destroyed or modified by ancient renovation efforts at the temple can be recreated, allowing students to view each king’s building projects as individual, cohesive construction efforts, unaltered by the later (and often drastic) modifications of succeeding rulers. A hypothetical recreation of the original western side of the court of Sheshonq I (later destroyed for the construction of the huge first pylon by Nectanebo I) shows how the temple’s entrance may have appeared for over 500 years – a significant period of time in the temple’s active lifespan (Figure 10). Finally, the model offers something photographs and plans can never provide – an experiential interaction with the temple. By traveling through the temple at eye level, the student grasps the enormous scale of the hypostyle hall or the first pylon, the extensiveness of the overall precinct in its later phases, and the contrast between the light of the open courts and the darkness of the covered halls. Since most university students studying the temple have never been to Karnak, the model gives them the opportunity to stand inside the ‘red chapel’ of Hatshepsut, gaze up at the starry ceiling of the Akhmenu, or virtually walk down the temple’s stone entrance path, flanked by ram-headed sphinxes (Figure 12).

BUILDING AND DESIGNING THE DIGITAL MODEL The digital model of Karnak was produced in Multi-Gen Creator 3.4, a program designed for real-time 3D modeling.10 The model was viewed and presented in Virtual Reality Navigator (vrNav) 2, version 7.13. After its completion, the model was exported to two common modeling programs, 3-D Studio Max and Maya, where the video animations and still frame images were produced. To facilitate direct access via the internet, the model

10

Real-time models update information immediately as they receive data, instantly responding to the user as he or she navigates around the virtual space. This gives the user complete freedom and 360-degree control over movement within the model.

AN OFFERING TO AMUN-RA

115

was converted, simplified and exported to Google Earth. The entire process took over eighteen months. Before and during the design of the model, The UCLA team approached a number of important issues regarding how to display information, both visually and textually, to best fit the project goals. As the use of digital models spreads, concerns from the academic community have risen regarding how digital models can meet the academic standards needed for use in teaching and research. Common criticisms include: a lack of documentation of the basic data included in the model; a lack of clear explanations of decision process, sources and uncertainty in the recreation; no indication of how much of the original building or feature still exists and how much has been reconstructed; and choosing aesthetic concerns over accuracy.11 UCLA’s Experiential Technologies Center has been involved in the virtual reconstruction of a number of buildings and sites, including the Roman Forum, the Cathedral of Santiago de Compostela, and Qumran. These projects have all had the advantage of involving a team of digital modelers and archaeologists, architects, or historians who collaboratively decide on reconstructions and visualization design. Each project has responded to these issues differently, and the lab does not advocate a single method for every model. We would like to present the decisions made by the Digital 11

Miller, P., and J. Richards. 1995. The good, the bad, and the downright misleading: archaeological adoption of computer visualization. In: Computer applications and quantitative methods in archaeology, eds. J. Hugget, and N. Ryan. BAR International Series 600, pp. 19–26. Oxford: Tempvs Reparatvm; Forte, M. 2000. About virtual archaeology: disorders, cognitive interaction and virtuality. In: Virtual reality in archaeology: computer applications and quantitative methods in archaeology, eds. J. Barceló, M. Forte, and D. Sanders. BAR International series 843, pp. 9–36. Oxford: Archaeopress; Niccolucci, F., and F. Cantone. 2003. Legend and virtual reconstruction: Porsenna’s mausoleum in X3D. In: The digital heritage of archaeology: computer applications and quantitative methods in Archaeology: proceedings of the 30th conference, Heraklion Crete, April 2002, eds. M. Doerr, and A. Sarris, pp. 57–62. Greece: Archive of Monuments and Publications, Hellenic Ministry of Culture; Vatanen, I. 2003. Deconstrucing the (re)constructed: issues in conceptualising the annotation of archaeological virtual realities. In: The digital heritage of archaeology: computer applications and quantitative methods in archaeology: proceedings of the 30th conference, Heraklion Crete, April 2002, eds. M. Doerr, and A. Sarris, pp. 69–74. Greece: Archive of Monuments and Publications, Hellenic Ministry of Culture.

116

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Karnak Project, explaining how we chose to respond to these challenges, in hopes that this will help others both evaluating and creating VR models in the field of Egyptology. Perhaps the most important issue for academics using VR is the level of ‘accuracy’ attained. It is important to stress that the Digital Karnak model does not pretend to have created an accurate reproduction of the Karnak precinct as it existed in ancient times. Such a goal is unattainable.12 Instead, the project aspired to construct a model that accurately reflects the present state of knowledge about the temple’s past form. A type of ‘knowledge representation’ (a visual expression of an object with specific goals for accuracy based on the desired research or pedagogical purpose),13 the model attempts to faithfully recreate the chronology, architectural form and spatial relationships of the temple features as documented through modern publications. The model will therefore only be as accurate as our current understanding of the precinct, and as new discoveries are made, the model will need to change to reflect new interpretations of the temple. To this end, the VR model was produced using information (plans, axial drawings, maps and other 3D models) on Karnak published primarily by the CFEETK and the American and Canadian teams working at the greater site. In order that the model reflect the most recent research, as much information from the latest volume of the Cahiers de Karnak series (2007) was included as possible. The extensive list of publications about the temple and the high quality reconstructive plan and axial drawings made by the Egyptologists and architects of the CFEETK make Karnak an excellent site for this type of virtual recreation project. The UCLA team made very few reconstruction decisions, as all the buildings modeled are based on the previously published plans of scholars working at the temple (Figure 3). Creating an accurate knowledge representation in VR models does not just involve the physical geometry of the buildings reconstructed, but also the aesthetic choices made in ‘texturing mapping,’ the laying of colored or patterned panels onto the basic geometric forms on the model. This is what gives virtual reality models their resemblance to real buildings. Three project members spent ten days at Karnak in 2007 photographing different 12

13

Gillings, M. 1999. Engaging place: a framework for the integration and realisation of VR approaches in archaeology. In: Computer applications and quantitative methods in archaeology. BAR International series 750, pp. 247–54. Favro, D. 2006. In the eyes of the beholder: virtual reality re-creations and academia. Journal of Roman archaeology, supplementary series 61, pp. 321–34.

AN OFFERING TO AMUN-RA

117

parts of the temple for texture mapping, and the digital photos were used not only for reproducing the relief scenes and text panels, but also for recreating much of the stone and brick textures on the model. It was decided early on that we only wanted to add specific imagery, such as relief scenes or hieroglyphs, in ways that respected the actual appearance of the temple, either as it is now, or as it can reasonably be reconstructed from the evidence. A significant amount of time and energy was therefore spent trying to replicate things as exactly and precisely as possible in a few specific areas of the temple, including a number of panels of text and relief on large wall surfaces (the ‘daily ritual’ scenes inside the hypostyle hall, the Sety I battle scenes on the northern exterior wall of the hypostyle hall, the ‘Bubastite portal,’ and the façades of the seventh and eighth pylons), on the obelisks, and on bark chapels and shrines (the calcite chapel of Amenhotep I, the ‘white’ chapel and the ‘red chapel’—Figure 2). No relief scenes or hieroglyphs were amended when added to the model. Fragmentary scenes were blended into the surrounding areas of the model while maintaining enough difference in color or line that the viewer could easily differentiate the temple imagery from the reconstructed areas (Figure 13). Time constraints made it unfeasible for us to texture map each wall of the temple exactly, so we used simple, repeating stone patterns created from imagery of Karnak’s walls to give a general sense of the temple’s appearance in areas we chose not to reproduce exactly. However, we tried to reflect the general stone size and patterning of each area of the temple, using blocks of different dimensions and colors to reflect the fact that the walls, columns, pylons, and other features of the temple were made from different stones. For example, the contrast between the small stone blocks used to construct the columns of the kiosk of Taharqo and the large column drums used for the colonnade in the first court are reflected by the patterning included on the model (Figure 11). While much of the temple would have been accented with colorful paint in ancient times, little of this decoration remains visible at Karnak today. Because the use of color has been studied in only a few areas of the temple, it was decided that the addition of color to the model would, in most cases, be too conjectural. However, to give students a better sense of the overall appearance of the temple, one area of the precinct was chosen to demonstrate how different the temple looks today because of the loss of the polychrome paint that once adorned its walls and columns. The

118

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

extraordinary preservation of color in the Akhmenu festival hall allowed us to recreate both the blue and yellow star-covered ceiling of the central nave, as well its red tent-pole columns. Digital photos of the paint on both these areas were sampled and used to reproduce the vivid colors of the hall (Figure 1). The decision about which areas of the temple were texture mapped as accurately as possible was governed not only by the differing levels of preservation at the temple, but also by the pedagogical goals of the project. Since the Digital Karnak Project attempts to provide a resource that will aid university instructors presenting the temple in their art history, architecture, Egyptian language or Egyptian history courses, texture mapping was focused on areas of special historical, art historical and architectural interest.

ACCESSIBILITY AND EASE OF USE Although not frequently discussed in the publications dealing with VR modeling, the accessibility of the model was of utmost concern for the Digital Karnak Project. One of the major challenges in using real time virtual reality models posted on the World Wide Web is the need to download unfamiliar types of software, only some of which are compatible with any given computer. These problems limit the number of people who can take advantage of the models to those who have a specific kind of computer or operating system, and who are savvy and patient enough to download numerous programs that are not usually applicable to other computer functions. A second concern was the ability of students and teachers to control their own speed and path through the virtual space. The power to individually navigate can add substantially to the experience of touring a model, as it allows the user to focus on areas of special interest, or to replicate the feeling of personally exploring a building or site. But this capability comes with drawbacks as well, as programs often fail to explain navigation techniques, or navigation is difficult, slow, and clumsy, making use of the model frustrating. Teaching while trying to move around a model can also be difficult, making classroom use impractical. The Karnak model, designed in a program that has the problems mentioned above, has been presented in two alternative forms: through the use of 1. downloaded Quick Time videos, and 2. Google Earth, all available

AN OFFERING TO AMUN-RA

119

through the home website of the Digital Karnak Project. Both programs are free to download and commonly used by educators and the general public. Because of the technological problems with offering direct downloads of VR models on the internet, the high-resolution version of the model is available as a series of Quick Time videos. Six thematic videos display the model using animations and still frame images combined with animated maps and photographs of the site today. Because these were designed for use in the college classroom, no sound and only minimal text are included. Instead, researched and annotated essays (in PDF format), to be used as instructional guides or background information for teachers, accompany the videos. A number of additional videos, including brief animations of the model and chronologically phased still-frames of the model are offered without commentary. These offer instructors the flexibility to use the model videos for their own purposes, designing their lectures to highlight the aspects of the model important for their own class interests. Quick-Time videos can be played while the instructor lectures, alleviating the need to manually navigate through the model while teaching. As well, the videos allow the user to pause and restart at will, allowing instructors to stop at any point in the video to explain a point or add information. The Google Earth version of the model allows the user to virtually fly above and around the temple, or to move down to pedestrian level and walk through room by room. Google Earth works on both Macs and PCs and has a relatively easy to learn and intuitive means of navigation, based on Google’s popular map program. Most American university students are already familiar with the Google suite of programs. PRESENTATION OF THE MODEL AND SOURCE TRANSPARENCY

Documentation of the data and design decision-making are issues of main concern for those creating and using VR models. Projects have approached these matters in a variety of ways. Some embed textual information within the model, allowing users to click on ‘hotspots,’ which open windows or pop-up bubbles relaying textual information about that building or feature. In other models, information appears when a user ‘enters’ a room or space, with no need to click on a specific spot. Because the Digital Karnak Project model includes a large number of buildings that appear and disappear through time, a sophisticated platform

120

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

for source documentation was necessary. Since documentation within the videos was deemed distracting, project technicians created a ‘time map’ – a two-dimensional map with a date slider indicating which buildings were built, modified or destroyed over the course of the temple’s lifetime (mirroring each structure’s activity within the model itself) for inclusion within the project’s website. By clicking on a building within the time-map, the user is sent to that building’s individual web page, which lists its multiple phases of activity by king’s reign. Students, teachers, or researchers can here read basic information about each modeled building at Karnak, such as its measurements, material, and function or ritual use. A discussion of the issues involved in recreating the feature (including how it was texture mapped), a bibliography of sources directly used for the model reconstruction, a list of other related bibliographic sources, and a description of alternative theories about the building’s appearance provide clear and easily understandable documentation of our process. Renderings of each building are embedded within the web pages, accompanying the textual explanations with the associated still-frame shots of the model recreation (Figure 14). Some VR projects offer users the ability to ‘toggle’ on and off the sections of a modeled structure that no longer exist at an archaeological site. This helps the viewer differentiate between those remains virtually reconstructed with a high level of certainty (such as the existing foundations or bases of walls) from those of a more hypothetical nature (such as wall height or windows). This was not feasible for Karnak, due to the size and complexity of the site and the time allotted for this project. Instead, each structure’s web page includes a photo archive with digital images of Karnak taken in 2007.14 This similarly allows students to visually delineate the extant architecture at the site from the recreated areas of each building. As well, students and instructors can compare the details of the modeled structures with their real-life counterparts, observing the associated statuary, areas of text or relief decoration not included on the model. The interactive Google Earth version of the model can also be used to access this archive of source material. When touring the interactive model, students can click on information bubbles attached to each building. These open to display the building’s name and a still-frame image of that structure from the higher resolution model. By clicking again on the name of the 14

In a few cases, images could not be obtained by UCLA, as some areas of the site are inaccessible. In these few instances, students will need to consult the bibliographic sources listed for published photos of the buildings.

AN OFFERING TO AMUN-RA

121

building, the user is taken to that building’s descriptive web page, where they can view the metadata related to that structure. RESEARCH POTENTIAL OF MODELING FOR EGYPTOLOGY

Karnak, with many still extant buildings and a history of study and excavation that spans more than 100 years, provides an interesting test case because of the large number of buildings that can be virtually recreated. Visually, the model is appropriate for display to students and a public without much knowledge of ancient Egypt. However, it may be that sites with little standing architecture will in fact benefit the most from this new technology. The flexibility of modeling allows multiple reconstructions of buildings or areas to be made and ‘tested’ within the virtual site, providing researchers the opportunity to experiment with and present various theories of architectural or decorative reconstruction, but also landscape use, view sheds, and spatial relations.15 We envision using the Karnak model to test out the impact of light and shadow on temple ritual and decorative program and to re-contextualize statuary and stelae within the temple. Models of large or complex sites allow the researcher to not only consider the different physical possibilities of a structure, but also how each reconstruction would affect the buildings and landscape around it. For multi-period sites like Karnak, the ability to represent and record a site’s evolution through time offers even more advantages. While 3D modeling may today seem like a flashy way to interest the public or to present information to students, in the not so distant future, Egyptologists will use it (as some already do) as an integral part of their field data recording and analysis. Site modeling may not only change what types of questions we can answer, but also what type of questions we can ask.

15

Winterbottom, S.J. and D. Long. 2006. From abstract digital models to rich virtual environments: landscape contexts in Kilmartin Glen, Scotland. Journal of Archaeological Science, 33, 10 pp. 1356–67.

122

INFORMATION TECHNOLOGY AND EGYPTOLOGY

Figure 1: The painted interior of the Akhmenu festival hall

Figure 2: The ‘white chapel’ of Senusret I

2008

AN OFFERING TO AMUN-RA

123

Figure 3: The interior of the Wadjet hall, based on the plans and axial drawings of Carlotti and Gabolde (Carlotti, J., and L. Gabolde. 2003. Nouvelles données sur la Ouadjyt. Cahiers de Karnak XI, pp. 255–338)

Figure 4: The Thutmoside temple of Karnak with twelve standing obelisks

124

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 5: The eastern section of the precinct with the recreated obelisks at the contra-temple and the ‘unique’ obelisk.

Figure 6: Karnak in the early 18th Dynasty

AN OFFERING TO AMUN-RA

Figure 7: Karnak in the 19th Dynasty

Figure 8: Karnak in the 22nd Dynasty

125

126

INFORMATION TECHNOLOGY AND EGYPTOLOGY

Figure 9: Karnak in the Greco-Roman Period

Figure 10: A hypothetical recreation of the Sheshonq I court

2008

AN OFFERING TO AMUN-RA

Figure 11: The first court and the Taharqo kiosk

Figure 12: The interior of the ‘red chapel’

127

128

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 13: The ‘daily ritual’ relief scenes of Sety I on the interior of the hypostyle hall

Figure 14: Screenshot from the ‘time-map’

AGENT-BASED MODELS OF ANCIENT EGYPT

Sarah Symons & Derek Raine

ABSTRACT

Complex systems theory is the study of emergent collective behaviour in sets of agents that can be represented as interacting according to simple rules. It is a fast-growing field which brings together many disciplines. The area crosses between computational science and sociology and can provide insight into the development of cultural practices. Early civilisations provide interesting backgrounds for models: for example, Anasazi society has been studied using agent-based modelling (Dean et al. 2006). Conversely, a model of, for example, ancient Egypt can be used to illustrate concepts from complex systems theory and test the universality of these concepts. In the Fractal House of Pharaoh, Mark Lehner outlines a complex systems view of ancient Egyptian civilisation (Lehner 2000). We build on some of the aspects of this view in an agentbased computer model. The model presented here is designed to investigate the spread of information and population aggregation in an agrarian society. The model is based on an abstracted Egyptian landscape containing villages, flood plain, and river. The agents represent farming households which exchange information and migrate around the landscape motivated by the availability of surplus food (used as a proxy for quality of life). We shall use this to illustrate some of the key ideas of agent-based modelling and complex systems. Future work will attempt to construct more realistic models to explore the impact of special features of Egyptian geography and society on the development of the civilisation.

1. INTRODUCTION In this paper we construct an agent-based model in NetLogo (Wilensky 1999) inspired by the example of the pre-dynastic agrarian society of 129

130

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Ancient Egypt to illustrate aspects of complex systems. In section 2 we illustrate the process of abstracting from what is known of the real society to a simplified model. The aim in agent-based modelling is to achieve sufficient simplification to illuminate connections and general trends. We are not trying to reproduce a population in as complete detail as possible, even where data would be available to permit this, since it would be no more informative than the input data. We argue that even with the present model (which contains a number of arbitrary assumptions) the complex systems approach can illuminate the behaviour of ancient societies as it has done for aspects of our own. The complex systems approach has been applied to many situations in the social sciences (e.g. Epstein 2006) from racial segregation (Schelling 1971) to trench warfare (Axelrod 1985). The model we present is designed to investigate the relationship between the spread of information and population aggregation in an agrarian society. It is based on an abstracted Egyptian landscape containing villages, flood plain, and river. The agents represent households which move between villages based on the information they obtain on the availability of surplus food (used as a proxy for quality of life). In section 5 we shall use the model to illustrate some of the key ideas of agent-based modelling and emergent properties in complex systems. An agent-based model comprises a set of interacting entities called agents. These may represent individual people (or other social creatures such as ants or birds), but they can, for example, correspond to households, herds, firms, or countries. They can also be entities within the physical environment, for example farmers and the fields that they farm may both be treated as agents. Each agent will have a set of rules by which it interacts with the other agents and takes some form of action, or undergoes a change in its properties, as a result of that interaction. The collective behaviour of agents in a model may be qualitatively different from the properties of the individual agents. Such collective behaviour cannot always be predicted from the small set of rules which govern each agent. Systems for which this is the case are called complex systems. Looking at the collective behaviour in complex systems, interesting patterns may emerge. For example, an ant-hill may appear to ‘decide’ to forage at a different site even though no single ant is directing the overall behaviour or has enough knowledge to make executive decisions.

AGENT-BASED MODELS OF ANCIENT EGYPT

131

2. AN AGENT-BASED MODEL OF ANCIENT EGYPT In this section we describe the assumptions behind our model. It is important to emphasise that these assumptions are deliberately designed to simplify the many layers of detail in any real society to the point where we can focus on a few key aspects. Where possible we base these assumptions on what we know of the society. Otherwise, we adopt a common sense approach to fill in the blanks. The time frame The model follows the aggregation of hamlets into larger villages that occurred during the pre-dynastic period. The results presented here are intended to correspond to a one hundred to five hundred year time-span in the period roughly 3800 – 3300 BC. During this time there is definite evidence, for example from Hierakonpolis and Naqada (Midant-Reynes 2000), for the stratification of the population in the larger centres, with the emergence of a trading and artisan class not supported by their own agricultural labour. In the current version of our model the larger villages are formed from collections of identical households engaged in farming, so we do not include explicitly this differentiation of the population that would characterise the development of towns. The environment The flood plain of the Nile ranges from 2 km to 10 km in breadth on either bank. In pre-dynastic times this was bordered by a fertile plain. This fertile area had disappeared by the end of the Neolithic Subpluvial around 3300 BC as a result of a change in the climate to more arid conditions, restricting agricultural settlements to the flood plain and to dependency on the Nile inundation (Hassan 1988; Hendrickx and Vermeersch 2000, 35; Wilkinson 1999, 372). We shall model the movement of population on one bank of the Nile of about 5 km in length and depth. For convenience the top and bottom boundaries of the region are identified; that is, the model region is a cylinder with the Nile at the left edge. This means we avoid having to deal with edge effects: any household that leaves at the top is replaced by an identical one entering at the bottom.

132

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Population movement At the end of the Neolithic Subpluvial the increasing aridity of the desert regions forced the migration of the pastoral population to the Nile flood plain. During this period desert villages were abandoned. The deposits from the settlements in the Badari area on the East bank of the Nile dating from between 4800 BC and 4000 BC are small in spatial scale and in time, which suggests that these villages did not exist for long periods on the same site. The villages of Hawashim and Naga el-Mashayikh were abandoned at the end of the pre-dynastic period while Naga ed-Deir was temporarily abandoned (Wilkinson 1996). Both the sites at Hierakonpolis and Naqada show evidence of movement from low density settlements over regions of order 0.5 km2 to towns of higher population density towards the end of the pre-dynastic period (Kemp 2006, 81). At Hierakonpolis, the period c. 3800 – 3100 BC marks a movement of population towards the Nile (MidantReynes 2000, 201). Another specific example from the early dynastic period is provided by Armant where villages at the desert edge disappeared as a town grew up (Kemp 1977; Brass 2004). This is the type of movement represented in the model. For the purpose of the model we shall assume that households move to better their quality of life. For an indicator of quality of life we use the annual food surplus for each village. At the end of each year, each household calculates its food surplus and compares it with that of the other villages it knows about. If the expected surplus can be bettered by more than 20% the household is given a probability of moving that increases strongly with the increasing difference in quality. There is a cost to moving that can be varied in the model. First, the moving household takes with it a fraction of its estimated surplus. Some of this is consumed on the journey and the rest is added to the granary of its destination village. At each time-step villages choose which of their fields they will cultivate. For simplicity we also allow the households to make the decision whether or not to move to a new village at each time-step (although not all of them choose to do so). If the time-steps correspond to years this yields a major drift in population over a 50 to 100 year period, which is almost certainly too rapid. However, there is nothing in the model that equates a timestep to a calendar year. Thus the cycles in the model could represent average inundations and yields over a number of years. This hypothesis has been tested by running a version of the model in which the choice of fields is

AGENT-BASED MODELS OF ANCIENT EGYPT

133

made at each time-step, but the decision to move is made no more than once every five time-steps. This slows down the pace of population change but makes no material difference to the outcomes. Total population On average, the population of the Nile valley grew very slowly over almost all of its history, typically by around 0.1% per year after 4000 BC (Butzer 1976) with a doubling time of about 1000 years. In the pre-dynastic period large population centres such as Hierakonpolis had reached populations of perhaps only 1500. However, population is difficult to estimate, especially prior to 3000 BC. Also, population growth rates varied widely between periods of famine and plenty, so they may well have varied a great deal between villages with large and small surpluses. Under these circumstances ‘wealthy’ centres would need to find ways of sustaining a growing population. We know that bursts of population growth were associated with improvements in agriculture. However, for the purpose of the present model we shall fix the total population. This enables us to explore the movement of population subject to agricultural stress without the further complication of population pressures which can, if needed, be added later. Given the time-scale of the model (up to about 500 years) our households are clearly not made up of the same individuals throughout the period. We assume that the number of workers and dependents in each household is maintained by constant births and deaths. Population density Our initial set up includes 80 villages as a starting condition located at random. This is a sufficient number to ensure that the dynamics of population movement does not vary greatly on average simply as a result of random differences in the initial spatial distribution. The population of pre-dynastic Egypt has been estimated at one million (Kemp 2006, 50). Baines and Málek (1980, 16) give a cultivated area of up to 34,000 km2 as a stable figure over the last 5000 years. During the time period we are dealing with, which falls just outside that era, the area actively cultivated may have been lower. Thus we calculate a population density of 50 people per km2. We have chosen an area of about 5 km × 5 km with initially 80 settlements, so at this population density we have a total of 1250 people or about 10 per settlement. The settlements therefore begin as hamlets. Our initial settlement size is 3 households. This would give 5 people per household, a reasonable

134

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

average value. We imagine that each household supplies enough labour to farm one field per year using 1 or 2 workers. As the model develops the population of individual villages may reach 50 households or so. This is consistent with the estimate of no more than 50 to 200 inhabitants of some of the larger villages (Midant-Reynes 2000, 183). Surplus of land over labour Theories of state formation often give prominence to population pressures (Carneiro 1970). In the present model we want to emphasise the abundance of agricultural land in relation to the workforce in ancient Egypt (Kemp 2006, 73; Eyre 1999, 46). Therefore, we make the simple assumption that the population was constant. What we find are some interesting effects of population movement even in the absence of pressures caused by increasing numbers, but instead relating to economic stresses caused by the unreliability of the food supply alone. The flood One of the drivers of the model is the variability of the Nile flood, since this determines the changing fertility of the land. In a widely quoted paper Mandelbrot and Wallis (1969) examined the Nile flood records as a chaotic sequence. Some authors have looked for periodicities in the flood and have found evidence for cycles from a few years to a few hundred years (Kondrashov et al. 2005). The amplitude of the variation has been studied back to 3500 BC (Bell 1970). At the extremes, variations of ±1 m are found, but more typically for ancient Egypt variations are in the range ± 0.5 m, about 0.2 of the mean. The fact that we have records only of the overall height of the Nile and not the duration of the inundation means that we cannot use this data directly. The fertility of the land depends on the timing and the way in which the water spreads across the land: flood levels that were too high would have retarded the harvest (Bowman and Rogan 1999), presumably reducing yields locally. We have therefore implemented changes in the fertility of fields without correlations from year to year. It would be straightforward to include the effects of periodicities directly in the fertility model.

AGENT-BASED MODELS OF ANCIENT EGYPT

135

Fertility and yield The fertility of the flood plain varies widely between nearby locations, but is correlated overall with the Nile flood through irrigation and also alluvial deposits. The fertility is modelled as a function of perpendicular distance from the Nile. We use a Gaussian function of distance, with the position of its peak and its width both being random variables drawn from a uniform distribution. The peak lies between 15% and 30% of the depth of the landscape from the Nile boundary. The width of the distribution varies in the same range. This means that while there is a general increase in fertility closer to the Nile, the positions of the most fertile patches of land vary from year to year. Thus in the model, local information on fertility becomes important. The potential fertility is obtained by summing over the fields that the villages ‘own’ and therefore could choose to cultivate. With our chosen parameters the variation in fertility is about ± 10%, although for individual villages it can be several times larger. On the basis of the occurrence of widespread famines one might expect this variation to be larger, although there is some evidence that the famines of the First Intermediate Period were more the result the amplification of poor harvests through failures in food distribution (Coulon 2008). In any case, we have chosen the amplitude of variation to put sufficient stress on the population, through its agricultural surplus, to produce population movement without driving villages to starvation. Future developments of the model could investigate the effects of larger or longer term fluctuations. For fields that are farmed we convert fertility directly into the yield of grain, subject to a distance penalty which reduces the return to a community the further their farms are from their village. Land tenure We have no records of land tenure in the pre-dynastic period. Therefore the best we can do in the model is to extrapolate back from later practice to implement a simplified form of land tenure. A principal source for details of land tenure in ancient Egypt is the Wilbour Papyrus (Gardiner 1948). The papyrus records the ownership of land made over a period of two months in 1142 BC. The records are subject to statistical analysis in Katary (1989). This suggests that land tenure in ancient Egypt was reflected at least in general form by the cadastral surveys

136

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

of the early 20th century. These show a complex pattern of intermingled ownership of tracts of land. The four surviving sections of P. Wilbour cover less than 5% of cultivatable land (Katary 1989, 23). It is possible therefore that the data refers only to changes in land tenure since a previous survey (Katary 1989, 244). Extrapolating to the pre-dynastic period, the increasing aridity of the desert lands drove population movements, the point of which was to form new farming communities. The formation of settled communities suggests at least some stable form of land attribution within these dynamics. The two principal features we will take from later patterns of ownership are therefore (a) a patchwork of ownership of fields and (b) a changing ownership. For simplicity we assume that all fields are of equal areas. Thus, we have included in our model the ‘buying’ of fields whenever a village has more labour available than the fields it ‘owns’. Once bought these fields remain under the ownership of the village even if the population declines. For simplicity, ownership is assigned to a village as a whole, rather than to individual households. The algorithm to decide which fields to buy is based on three factors. The first is the fertility of the field. This would be an obvious consideration. For simplicity in the model, the fertility history for all fields is assumed to be known to every village. Recall that the average fertility varies with distance from the Nile border. It is therefore reasonable to suppose that experienced farmers would have this information in a qualitative way, even though they could not predict quantatively from year to year where the best fields would be. The second factor is the distance to the fields, which makes a field less attractive the further away it is from the village. This is clearly in accord with the decreasing ownership of more distant fields in P. Wilbour (Gardiner 1948 vol. 2, 36–7; Katary 1989, 245) but is also common sense. This variable does feature in the results, so the model allows us to adjust its importance relative to the other factors. The final factor is the undesirability of owning fields adjacent to those already owned, as an insurance policy against fluctuations in yield. As an unintended bonus this gives rise to the exchange of information on the prospects for the quality of life in a greater number of nearby villages, thereby promoting informed decisions to move.

AGENT-BASED MODELS OF ANCIENT EGYPT

137

Farming From the letters of Heqanakht, a minor official travelling on state business in time of the Twelfth Dynasty, we deduce that decisions on which land to farm were made on an annual basis, at least where households had access to ‘capital’ or to a surplus of land (Lehner 2000). In the model villages are only able to farm in accordance with the number of inhabitants available to provide the labour. Each season therefore they take a decision on which of their fields to farm, based on previous yields. Overall the model incorporates a simplified structure of land use, whereby the complex decisions of ownership, rental and use are reduced to the free acquisition of unowned land by a village in accordance with the size of its population, and the cultivation of the historically best yielding land. Food storage Evidence for food storage survives from Neolithic times in the form of sunken receptacles (Hendrickx and Vermeersch 2000). The presence of pottery implies some form of storage. Large houses and temples from dynastic times provide storage facilities on large scales, for in excess of a thousand people in some cases. We shall therefore interpolate a storage capacity for each village that is sufficient to accommodate the surplus for its current population. The storage capacity is therefore assumed to grow (at no cost) with the maximum occupation of the village. In reality shortages of food caused frequent famines and deaths. Average life expectancy for adults was around the late thirties (Sterling 1999). However, overall the population does not decline over longer periods. Thus we restrict ourselves to a simplified model in which we assume sufficient food for subsistence is available and we are looking at the surplus (tradable) production. The model does not include temporary reductions in population due to famine.

3. RUNNING THE MODEL: (I) TRANSMISSION OF INFORMATION As the first of two illustrations we consider the impact of knowledge about farming yields and surpluses on the movement of population. In the model a household knows the quality of life (surplus grain) in their home village. While farming, they exchange this information with a household farming a

138

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

neighbouring field. If a household predicts that the quality of life in their home village will be poor, they use their knowledge of other villages to decide whether they should relocate. Relocation changes the number of households in a village. If the number of households exceeds the number of fields the village owns, then more fields are acquired. Since the knowledge of conditions in as many villages as possible is useful to a household in deciding where it might better its quality of life, it should be advantageous for a village to acquire a variety of neighbours. The fractal nature of land tenure in ancient Egypt (Lehner 2000), which persisted into the twentieth century, is usually explained in terms of insurance against fluctuating yields. However, as a side-effect it also ensures rapid diffusion of knowledge. To demonstrate the benefit of owning fields with neighbours from other villages we can vary the weighting of two factors in the decision that a village makes as to which fields to buy. In the first run of the model villages place prime importance on the distances of the fields and give little weight to the variety of ownership of neighbouring fields. Subsequent runs alter these priorities. Figure 1 shows the final state of the population after 100 cycles. The population centres have moved closer to the Nile, where the fertility of the fields is greater, and increased in size. Figure 2 shows the evolution of the quality of life with time. Where a village chooses to farm fields largely on the basis of proximity, the knowledge of how to improve the quality of life (by moving to villages with more fertile areas) diffuses relatively slowly. Where there is more of an intermixing of holdings of land, the quality of life improves more rapidly.

4. RUNNING THE MODEL: (II) THE PARADOX OF PARTIAL KNOWLEDGE In our second illustration we compare the quality of life in the transmission model (‘local information model’) with that in a model where households have complete knowledge of the food surpluses in each village (‘global information model’). One might expect that the wider knowledge of surpluses in the latter case would enable households to obtain a greater quality of life. In the local model, as previously, farmers from different villages in neighbouring fields exchange information about their village and its quality

AGENT-BASED MODELS OF ANCIENT EGYPT

139

Figure 1: The final distribution of households after one hundred time-steps, starting from a random spatial distribution of 80 villages each of 3 households. The white background is unfarmed land. The light grey squares are the positions of villages that have been abandoned. The various shades of grey represent ownership of fields by different villages. The shades are chosen arbitrarily to give an impression of the degree of intermixing of ownership. Patches with symbols in white are inhabited villages. The symbols are superposed arrow heads (), one per household. The population of a village varies between one household up to about 50 households. In the case shown here villages make their decisions about which fields to own based mainly on proximity (corresponding to the upper graph in Figure 2).

140

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 2: The increasingly fractal nature of land tenure leads to more rapid spread of information about village surpluses, to greater population movement and hence to an earlier attainment of a higher quality of life. The figures show the overall fertility of the land owned by villages (light grey) and the quality of life (agricultural yields) from farmed fields (dark grey) as a function of time. In each case the vertical line shows the approximate time at which the quality of life reaches the long-term average value. The scale on the vertical axis is arbitrary. The horizontal axis is plotted in time-steps in the model. In the upper figure the purchase of fields is weighted 4 : 1 in favour of proximity over variety of neighbours. In the central figure the weightings are equal. In the lower figure the weightings are reversed to 1 : 4. The quality of life reaches its long term average in 30 time periods, 25, and 20 respectively.

AGENT-BASED MODELS OF ANCIENT EGYPT

141

of life and each household will decide to move with some probability. The global model is the same, except that each agent has complete knowledge of the quality of life in all the villages in the previous year. This is of course deliberately unrealistic, but it has an interesting outcome. Figure 3 shows the resulting distribution of population after 100 steps in the two models. In the case of global information the population is much more concentrated in fewer larger sites. The accompanying numerical results show, very surprisingly, that the quality of life is lower when households have more information. This presents us with what we have called the paradox of partial knowledge: that there are circumstances in which it can be better to know less. Usually in classical economics it is assumed that participants in a market have perfect knowledge, and that departures from this disadvantage the participants. It is now known that this assumption is both unrealistic and false. Our model gives a novel illustration of a counter-example to the classical assumption.

5. COMPLEX SYSTEMS In this section we explore some of the complex systems aspects of the model starting with the diffusion of information. In our first example the households relocated sooner to villages with a higher quality of life when the intermingling of the farms was greater. This is clearly a matter of the more rapid diffusion of knowledge under these circumstances, since there is nothing else in the model that could influence this. How do we explain the more rapid spread of information? A closer analysis shows that if the farmed fields are adjacent to the owning village most of the farmers will have neighbours from the same village who already know their own quality of life. These farmers do not contribute to the spread of information. With a more dispersed tenure of farmland more households are involved in passing on information to other villages and the diffusion is more rapid. This is rather obvious. There is however a further effect. The assignment of distant farms to villages results in a complex network of village neighbours in which a few villages become hubs with a much larger than average connectivity to other villages. These play a major role in spreading information which further acts to speed up the process. Such highly connected hubs are responsible for what is known as the ‘small world effect’ in which randomly chosen individuals will be linked in a surprisingly small number of steps. ‘Six degrees of separation’ is the most quoted expression of this concept (Milgram 1967).

142

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 3: Typical distributions of population after 100 time-steps starting from a random distribution of 80 villages each with 3 households, for local information (left hand figure) and for global information (right hand figure). As in Figure 1, patches containing white arrow heads or stacks of arrow heads are occupied villages. White patches are unowned fields. The different shades of grey indicate fields with different village ownership. The values for the quality of life in the two cases range between 0.77 – 0.81 for the local information model and 0.60 – 0.70 for the global model. The ranges in each case correspond with different runs of the model using the same parameters. The range of values in each case arises from the different (random) initial distribution of villages and from the variability of the flood.

AGENT-BASED MODELS OF ANCIENT EGYPT

143

The importance of subsets that depart significantly from the mean is a key feature of complex systems. To explain this, consider, for example, the set of heights of participants in a conference. They will range by a few percent around the mean, say mostly between 0.8 and 1.2 of the mean. We do not expect to find delegates with a height three times that of the mean. The height distribution follows a Gaussian (bell) curve. In contrast, the connectivity of the villages has a different distribution (a ‘power law’ distribution), which includes a significant probability of large departures from an average value. Such villages are a major influence on the collective behaviour of the complex system. Power law distributions are often an indicator of the presence of complexity. Our paradox of partial knowledge is a version of the minority game (Challet et al. 2005). This is most easily described in terms of the original problem set in the El Farol bar (Arthur 1994). The problem is that the bar cannot accommodate all the people who would like to go to it. If the bar is overcrowded no one has a good time and it would be better to have stayed at home. But no one knows how many other people will go on any given night. How then can one decide whether to go or to stay at home? The aim is to be in the minority of those that visit the bar. (The minority game is a technically simpler version of this that can be fully analysed.) The result is that there is no solution if all agents follow the same strategy: that is, there is no single strategy that will optimise the outcome. We can see the effect in our farming model. In the case of perfect information every household follows the same strategy. Successful villages are then overcrowded and the quality of life drops, so a large number of households decamp to another village. And the same thing happens. This all costs resources, so the average quality of life is adversely affected. With only partial knowledge households are forced to adopt what are in effect different strategies (they certainly have different outcomes). This means that at each step most households do not go to the ‘best’ village, but in so doing they raise the quality of life for everyone. In complex systems language, this is an emergent property of the system: it is not an outcome that can be ascribed to any one household. In fact each household is trying to maximise its quality of life: there is nothing in the properties of individual households that makes them adopt a less than optimal strategy. The overall outcome is a collective effect of the interactions between agents. A key feature of all of the examples is the aggregation of population. The movement of the population towards the Nile and the abandonment

144

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

of the more distant villages is to be expected on the grounds of the distribution of fertility. What is not programmed into the model is the nonrandom distribution of the population between the villages. Starting from a uniform distribution of 3 households per hamlet we end up with some villages of up to 50 households. This is a collective outcome, or emergent property, of the behaviour of the agents. Each household simply tries to optimise their quality of life. As a result it turns out that they aggregate into larger villages. The outcome is not even the most desirable from the point of view of overall average quality of life for an individual household over time in any of the models. The movement of the population consumes resources and therefore reduces the overall quality of life, and the agglomeration into larger villages does not exploit the full potential fertility of the landscape. The driving force that produces larger village units is the unpredictable variability of the flood that provides relative safety in numbers. (Or strictly speaking, avoids the dangers of fluctuations in small numbers.) On the other hand, even though the size distribution is not random, the final number of occupied villages is about what one might expect from the area into which the households move. Thus, the model does not yet explain the further movement of population into larger towns. For this we would expect to develop the model to include a variety of types of households representing the division of labour and a non-agricultural class. In the picture presented here, the movement to the larger villages provides the opportunity for the emergence of a class of non-producers of food and hence lays the grounds for the development of culture. What looks like an organised movement towards a civil society, turns out in this model to be an unintended consequence of the Ethiopian climate.

6. CONCLUSIONS We are not in a position to compare our present preliminary model with real data but we can nevertheless draw some conclusions. At present the setting in ancient Egypt provides an interesting educational background for the introduction of ideas from complex systems, which can be developed further in this context. Even in its present form we have seen that it raises issues about the formation of society that may be worth addressing. However, with the introduction of more realistic assumptions, we can envisage a model which can begin to consider some more specific questions. For example, with more realistic farming practices and a

AGENT-BASED MODELS OF ANCIENT EGYPT

145

division of labour to include professional and artisan classes we could begin to look at the formation of towns and the emergence of social hierarchies. In other contexts other modellers have addressed the stability of social structures to external stresses; for ancient Egypt one thinks immediately of drought and consequent famine. Egypt has a particular almost linear geography which may impact on the way it developed and which would provide an interesting comparison to the emergence of statehood in other civilisations. REFERENCES

Arthur, B. (1994) ‘Inductive Reasoning and Bounded Rationality’ (The El Farol Problem) Am. Econ. Assoc. Papers & Proc. 84, 406–11. Axelrod, R. (1985) The Evolution of Cooperation. Basic Books, New York. Baines, J. and Málek, J. (1980) Atlas of Ancient Egypt. Facts on File Publications, New York. Bell B. (1970) ‘The Oldest Records of the Nile Floods’ The Geographical Journal 136, no. 4, 569–73. Bowman, A. K. and Rogan, E. (1999) ‘Agriculture in Egypt from Pharaonic to Modern Times’ in Bowman, A. K. and Rogan, E. (eds.) Agriculture in Egypt from Pharaonic to Modern Times. Oxford University Press 1–32. Brass, M. (2004, unpublished) ‘The nature of urbanism in Ancient Egypt’ (available at http://www.antiquityofman.com/Egyptian_urbanism.pdf; accessed 3/08/08). Butzer, K. (1976) Early Hydraulic Civilisation in Egypt. A Study in Cultural Ecology. University of Chicago Press, Chicago. Carneiro, R. L. (1970) ‘A Theory of the Origin of the State’ Science 169, 733– 38. Challet, D., Marsili, M., and Zhang, Yi-C. (2005) Minority Games. Interacting Agents in Financial Markets. Oxford University Press. Coulon, L. (2008) ‘Famine’ in Frood, E. and Wendrich, W. (eds) UCLA Encyclopedia of Egyptology Los Angeles; http://repositories.cdlib.org/ nelc/uee/1016/ (accessed 3/08/08). Dean, J. S., Gumerman, G. J., Epstein, J. M., Axtell, R. L., Swedlund, A. C., Parker, M. T., and McCarroll, S. (2006) ‘Understanding Anasazi culture change through agent-based modeling.’ in Epstein, J. M. Generative Social Science: Studies in Agent-Based Computational Modeling (Princeton Studies in Complexity). Princeton University Press, 90–116.

146

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Epstein, J. M. (2006) Generative Social Science: Studies in Agent-Based Computational Modeling (Princeton Studies in Complexity). Princeton University Press. Eyre, C. J. (1999) ‘The Village Economy in Pharaonic Egypt’ in Bowman, A. K. and Rogan, E. (eds.) Agriculture in Egypt from Pharaonic to Modern Times. Oxford University Press, 33–60. Gardiner, A. H. (1948) The Wilbour Papyrus. Oxford University Press. Hassan, F. A. (1988) ‘The Predynastic of Egypt’ Journal of World Prehistory 2 no. 2, 135–85. Hendrickx, S. and Vermeersch, P. (2000) ‘Prehistory from the Palaeolithic to the Badarian culture’ in Shaw, I. (ed.), The Oxford History of Ancient Egypt. Oxford University Press, 17–43. Katary, S. L. D. (1989) Land Tenure in The Ramesside Period Kegan Paul. Kemp, B. J. (1977) ‘The Early Development of Towns in Egypt’ Antiquity 51, 185–200. — (2006) Ancient Egypt, Anatomy of a Civilisation. Routledge. Kondrashov, D., Feliks, Y., and Ghil, M. (2005) ‘Oscillatory modes of extended Nile River records (A.D. 622–1922)’ Geophysical Research Letters 32 (10), L10702.1–L10702.4. Lehner, M. (2000) ‘Fractal house of pharaoh: ancient Egypt as a complex adaptive system, a trial formulation.’ in Kohler, T. A. and Gumerman, G. J. (eds.) Dynamics in Human and Primate Societies: Agent-Based Modeling of Social and Spatial Processes. Oxford University Press, 275–353. Mandelbrot, B. B. and Wallis, J. R. (1969) ‘Some long term properties of Geophysical records’ Water Resources Research 5(2), 321–40. Midant-Reynes, B. (2000) The Prehistory of Egypt. Blackwell. Milgram, S. (1967) ‘The Small World Problem’ Psychology Today 2, 60–67. Schelling, T. C. (1971) ‘Dynamic Models of Segregation’ Journal of Mathematical Sociology 1, 143–86. Sterling, S. (1999) ‘Mortality Profiles as Indicators of Slowed Reproductive Rates: Evidence from Ancient Egypt’ Journal of Anthropological Archaeology 18, 319–43. Wilensky, U. (1999) ‘NetLogo’ http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL. Wilkinson, T. (1996) State Formation in Egypt: Chronology and Society. BAR International Series 651 (Cambridge Monographs in African Archaeology 40). Tempus Reparatum, Oxford. — (1999) Early Dynastic Egypt. Routledge.

L’USAGE DE LA 3D EN ARCHÉOLOGIE

Robert Vergnieux

ABSTRACT

The use of 3D modelling in archaeology has developed from the straightforward production of images for illustrative purposes into becoming an effective tool to aid scientific archaeological research. Virtual Reality allows not only the restoration of ancient structures which have now disappeared but also permits the testing of new hypotheses as to how these structures worked. A particular use within Egyptology has been the non-invasive studies of mummies with this new technology. However, in view of the importance of architectural remains along the Nile valley, there can be little doubt that the most significant potential lies in the virtual restoration of those structures. Although 3D images allow the public to understand better these processes, we should never lose sight of the fact that their production was driven by underlying scientific goals. A 3D technological platform specifically for work with our architectural heritage has been created in Bordeaux; it is especially adapted for 3D scanning, modelling, and maintaining the persistence of 3D digital data. UNE PLATE-FORME TECHNOLOGIQUE 3D

L’UMR 5607 du CNRS s’est doté d’une espace architectural complétant la Maison de l’Archéologie. Un nouveau bâtiment de près de mille mètres carrés, se compose de salles « chercheurs », d’un laboratoire de céramologie, d’un espace d’exposition et d’une Plate-forme Technologique 3D du CNRS. Le concept de ce bâtiment repose sur la possibilité donnée au public de venir voir la recherche en archéologie « en train de se faire ». L’archéopôle n’est pas un musée mais plutôt une interface entre les citoyens et les chercheurs.

147

148

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

L’imagerie 3D envahie aussi bien les programmes de recherches en archéologie que les médias. Dans ce contexte, l’un des enjeux majeurs de la Plate-forme Technologique 3D de l’Institut Ausonius est de promouvoir auprès des équipes de recherche l’utilisation des technologies de la 3D en archéologique en tant qu’outil de recherche et non pas seulement comme un moyen d’illustration. Les modèles 3D conçus dans le cadre de la PFT3D sont à la disposition de la communauté des chercheurs tout en restant sous le control des équipes qui les ont élaborés. La PFT3D dispose d’une plate de production de neuf postes de développement 3D, et d’un poste de scannographie laser 3D. L’ensemble des données est géré localement à l’aide de deux serveurs Raid 5 (DELL) et de deux disques de sauvegarde d’un téraoctet chacun. La salle réalité virtuelle (odéon) permet, lors des séminaires de validation des restitutions 3D d’afficher sur un écran de 8,5 x 3,5 des scènes 3D de 2496 pixels x 1050 pixels à l’aide d’un cluster de PC. L’odéon sert également pour la présentation au public des dossiers de recherche. Sa capacité est de 96 places. Toutes les données produites par la PFT3D sont archivées et pérennisées dans le Conservatoire National des Données Numériques 3D du patrimoine que nous avons crée. MÉTHODOLOGIE PARTICULIÈRE DE LA MISE EN ŒUVRE DES RESTITUTIONS 3D

Par restitution nous entendons l’acte scientifique de restituer, à une époque ou date précises, des structures antiques disparues ou dégradées, et non pas le fait de transposer en numérique des édifices encore existants du patrimoine. a) mise en place d’un modèle préparatoire – niveau V1 Le travail de restitution d’un site archéologique majeur est un travail d’équipe et ne peut plus être le fait d’un seul individu. L’élaboration d’un modèle numérique 3D requiert des compétences spécifiques de plusieurs natures. Des connaissances archéologiques et historiques sur le site étudié sont nécessaires tout autant que la connaissance du site : topographie des vestiges encore en place, caractérisation des matériaux, données archéologiques de toute nature (archéométrie). La difficulté de mise en œuvre de tels projets pluridisciplinaires réside dans la façon de faire dialoguer tous les spécialistes entre eux d’une part et

L’USAGE DE LA 3D EN ARCHÉOLOGIE

149

d’arriver à engranger les avancées significatives en termes de validation des restitutions des espaces antiques. C’est précisément ici que le modèle numérique 3D revêt toute son importance. Tant qu’un modèle 3D n’est pas visualisable, le dialogue entre spécialistes est délicat, chacun des scientifiques ayant sa propre vision des volumes. Dès qu’une première ébauche tridimensionnelle est visualisable alors il devient possible à deux chercheurs de champs disciplinaires distincts de dialoguer entre eux sur la restitution de « volumes » disparus. Une argumentation peut se mettre en place de façon précise chacun activant ses connaissances propres face aux détails de la restitution 3D visualisable par tous. La toute première étape des projets scientifiques de restitution que nous encadrons techniquement est l’élaboration incontournable de cette première ébauche 3D des édifices étudiés tenant compte des éventuelles hypothèses antérieures si elles existent. Ces modèles à cette étape ne sont certes pas encore validés scientifiquement mais ils sont indispensables pour la mise en place d’un dialogue entre tous les partenaires scientifiques. Nous appelons ce niveau de version : V1. b) Organisation de la documentation nécessaire au travail de restitution 3D C’est parallèlement à cette première réalisation que nous constituons la base documentaire de l’ensemble des sources qui seront nécessaires à l’opération de restitution. - Données épigraphiques Attestations des noms des édifices Attestations textuelles d’existence de bâtiments Attestations textuelles d’évènements liés à des bâtiments - Données iconographiques avec la représentation d’édifices - Vestiges archéologiques de terrains (épars et in situ) - Hypothèses antérieures de restitution (maquettes physiques, dessins, modèles numériques) - Documents complémentaires (parallèles documentaires possible et issus d’édifices d’autres périodes) Dans le cadre de la PFT3D nous avons mis au point une interface spécifique de manipulation de ces données, sorte de table lumineuse virtuelle (tabloïde) accessible à tous les participants quelque soit leur localisation géographique. La partie « factuelle » de ces données est donc mise en

150

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

commun. Chaque chercheur peut y adjoindre ses propres informations s’il le souhaite. Le tabloïde utilise également une structure particulière d’« unicos » et d’« unitexte ». Ce sont des fragments d’image ou de texte issus d’un document permettant la manipulation du contenu sémantique des images et des textes (Vergnieux 1999). c) Séances de validation des restitutions Nous organiserons des séminaires autour d’un ordre du jour précis lié à la restitution des édifices concernés. Les personnes présentes lors de ces séminaires disposent donc d’une « scène 3D » manipulable en temps réel permettant de visualiser collectivement n’importe quel secteur et détail 3D du site en cours d’étude. Les personnes présentes ont également accès pendant le séminaire à l’ensemble des sources liées au projet de recherche. Nous pouvons inviter ponctuellement, lors d’une séance, un chercheur ayant des compétences particulières liées à l’ordre du jour. Chaque séminaire a pour conséquence de faire évoluer la connaissance et les hypothèses de restitution. Il est donc nécessaire d’actualiser le modèle numérique 3D en fonction des séminaires. Les différentes versions sont qualifiées de second niveau (versions V2.x). Les séminaires se tiennent autant de fois que possible et nécessaire faisant évoluer les modèles d’une version V2.x en une version V2.(x+1). Lors de ces séminaires. il arrive aussi de découvrir de nouveaux documents qui sont alors ajoutés dans la base documentaire des différents corpus qui s’accroît régulièrement en quantité et en qualité. Enfin des liens entre les modèles numériques 3D et les sources documentaires reposent sur le concept de nomenclature 3D. C’est à dire que pour l’ensemble des édifices du règne, nous définissons un vocabulaire commun décrivant la hiérarchie « volumique » de tous les sites archéologiques étudiés dans ce projet. Cette nomenclature constitue l’arête dorsale du projet après sa validation par les membres de l’équipe. A partir d’un élément quelconque du modèle numérique il devient alors possible d’interroger les corpus documentaires associés. L’ensemble du travail fait selon la méthode indiquée amène progressivement à la construction de modèles numériques en version V3. Versions dont les membres du projet s’accordent à dire qu’elles sont conformes aux hypothèses scientifiques actuelles. Les modèles en version V3 sont destinés aussi à évoluer en fonction de nouvelles avancées de la recherche. Cependant ils sont d’un niveau suffisamment avancé pour servir de support à la communication scientifique ainsi qu’à la communication vers le public au titre de la valorisation des programmes de recherche.

L’USAGE DE LA 3D EN ARCHÉOLOGIE

151

L’ÉGYPTOLOGIE ET L’USAGE DE LA 3D POUR LA RESTITUTION DES ÉDIFICES AUJOURD’HUI DISPARUS

Pour visualiser les vestiges pharaoniques, l’usage de maquettes en « dur » est encore assez courant car elles rencontrent toujours un grand succès auprès du public. Comme pour les autres domaines de l’archéologie, c’est à l’occasion de la réalisation de films documentaires ou d’expositions qu’apparaissent des modèles 3D de restitutions. Rare sont les équipes égyptologiques utilisant les modèles 3D pour effectuer des recherches dans le sens des procédures décrites ci-dessus. Les modèles 3D se limitent le plus souvent à une fonction d’illustration. Il faut rappeler ici le projet pionnier dans lequel dès 1986 des équipes mixtes d’archéologues et Ingénieurs ont collaboré à la réalisation d’un modèle numérique des temples de Karnak (Albouy et al 1989). Ce projet pilote avait ouvert la voie à l’usage de l’image de synthèses en tant que support pour la recherche de restitution. Le potentiel des modèles numériques 3D est actuellement sous utilisé. Encore présenté comme une illustration construite en fin de projet, les modèles numériques 3D prennent cependant tout leur sens s’ils sont introduits en début de recherche car l’action de restituer les architectures disparues relève de véritables programmes pluridisciplinaires. C’est à travers de tels projets qu’il devient possible de mieux comprendre les bâtiments construits par les anciens, de mieux comprendre leur fonctionnement et ainsi de mieux comprendre les sociétés qui les ont érigés. REFERENCES

Albouy et al 1989. Marc Albouy, Le Temple d’Amon restitué par l’ordinateur , Solar, 1999. Vergnieux 1999. Robert Vergnieux, Recherches sur les monuments thébains d’Amenhotep IV à l’aide d’outils informatiques - Méthodes et résultats, Cahiers de la Société d’Egyptologie de Genève, vol. 4, 2 tomes, 243p. 105 planches (Genève 1999). Sur les différents projets 3D en archéologie consulter les actes des colloques suivants : Robert Vergnieux 2008, Editeur scientifique en collaboration avec C. Delevoie des Actes du Colloque Virtual Retrospect 2007, Collection Archéovision aux éditions Ausonius, (Bordeaux 2008).

152

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Robert Vergnieux 2006, Editeur scientifique en collaboration avec C. Delevoie des Actes du Colloque Virtual Retrospect 2005, Collection Archéovision aux éditions Ausonius, (Bordeaux 2006). Robert Vergnieux 2004, Editeur scientifique en collaboration avec C. Delevoie des Actes du Colloque Virtual Retrospect 2003, Collection Archéovision aux éditions Ausonius, (Bordeaux 2004).

Figure 1: Version numérique (V0) du buste d’Akhenaton conservé au Louvre (Archéovision - Archéotransfert)

L’USAGE DE LA 3D EN ARCHÉOLOGIE

153

Figure 2: Restaurations virtuelles sur le buste numérique d’Akhenaton (Archéovision - Archéotransfert)

Figure 3: Version V2: Maison de Djehoutymes, aspects extérieurs (Archéovision Archéotransfert -Musée de Genève).

154

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 4: Version V2: Pylône du sanctuaire d’Aton (Archéovision Archéotransfert-Musée de Genève)

RAMSES. A NEW RESEARCH TOOL IN PHILOLOGY AND LINGUISTICS

S. Rosmorduc, St. Polis, J. Winand

ABSTRACT

This paper introduces Ramses, a database of Late Egyptian texts, currently under development at the University of Liège (Belgium). Ramses sets out to be a new and powerful research tool. Its main applications are linguistically and philologically orientated. After a general overview of the structure of the database, the search engines are described with some detail.

0. INTRODUCTION Ramses was officially presented at the Xth International Congress of Egyptologists in Rhodes in May 2008.1 It is an interdisciplinary project whose purpose is the building of an annotated corpus of all Late Egyptian texts. From a technical point of view, Ramses is a relational database in SQL, where the texts themselves are represented and stored in XML. The editing and search software is written in Java, and usable both on Mac and PCs. By the means of export procedures (XML), the adopted format for the database is fully compatible with what is recommended by the Text Encoding Initiative (http://www.tei-c.org/index.xml). We hope that Ramses will pave the way for new and innovative approaches to texts and language. 1

See J. WINAND, St. POLIS, & S. ROSMORDUC, ‘Ramses. An Annotated Corpus of Late Egyptian’, in Proceedings of the Xth International Congress of Egyptologists, in course of publication.

155

156

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

The project obtained substantial support from the University of Liège in the form of a so-called Action de Recherche Concertée (ARC), starting in October 2008 for a five-year period. In this paper, we examine in some detail (1) the process of text encoding in Ramses, (2) the search engine and (3) some prospective developments.

1. ENCODING LATE EGYPTIAN TEXTS IN RAMSES Any new text must be identified and described. This is done in a dedicated module. The usual information is encoded: writing (hieroglyphic, hieratic), support (ostracon, papyrus, stela, etc.), date, provenance, genre.2 For the three last features, special lists with a hierarchical structure have been built working more or less like Russian dolls.3 When encoding a new text, a clear distinction must be made between documents and texts. As is well-known, there is no necessary overlap between the two: a single text can exist on more than one document, and a single document may host different texts. Information on the writing system, the support and the provenance are stored in the document encoding sheet, whereas information on genre and language are encoded in the text encoding sheet.4 Special attention is also paid to the so-called ecdotic information to describe the actual state of a text (lacuna, erasure, words above or under the line, etc.) and the editor’s interventions (suppression, addition, restitution, etc.). One can then proceed to the encoding of the text itself. The text is first segmented in propositions, and propositions are in turn segmented in words. Words are described with three kinds of information: a lemma, a 2

3

4

The classification of the texts raises many problems. The taxonomy of written Late Egyptian texts is being studied by Stéphanie Gohy. A complete view of the encoding process is given in A.-Cl. HONNAY & St. POLIS, Manuel d’encodage du projet Ramsès (http://www.egypto.ulg.ac.be/Manuel_Ramses.pdf). This allows, for instance, to select easily documents by searching the name of a place (e.g. the Karnak temple), of an area (e.g. the West Bank of Thebes), of a nome (e.g. Waset), or of a whole region (Upper Egypt). Illustrations of problems raised by the encoding are given on our web-site (http://www.egypto.ulg.ac.be/Ramses.htm). The dating raises specific problems: in the literary texts, for instance, one has to distinguish clearly between the date of the composition and the date of the copy.

Figure 1: The Text Editor showing an analysis in progress (the final result is displayed in the right box)

RAMSES

157

158

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

spelling, and a morphological analysis (for the syntactic analysis, see below). This information is stored in a related lexicon. By using filters, encoders can quickly select the correct lemma, spelling and grammatical analysis. The lexicon is of course constantly updated as new words, new spellings or new analysis appear.

Figure 2: The Lexicon Editor

Figure 1 shows an extract from the Tale of the Two Brothers. The main window, in the upper part of the screen, displays the text. As already noted, the text is segmented in propositions. Words are lemmatised and analysed, and the spelling is encoded. The lower window gives an idea of the encoding process by showing how the windows look like once the correct lemma has been chosen. This appears in the grey box on the right (see the enlargement). In this example, the cursor stands behind i.Dd. The second and third columns list respectively all the inflections and spellings already recorded for Dd. Using the lexicon, one can immediately see the inflections and spellings that are actually attested for any given word. In the left column of Figure 2 are listed the inflections already attested for this verb in the database; the right column displays the spellings attached to a specific inflections (in this case, the status pronominalis of the infinitive) of the verb rwi.

RAMSES

159

Finally, it was of the utmost importance to find a device to handle elegantly and efficiently ambiguities that can surface at different levels, whether lexical, morphological or syntactical. The grammar of Egyptian can sometimes be a puzzle, making hard to decide between concurrent analyses. Rather than making an arbitrary choice by picking between potentially acceptable analyses, the best solution in those cases is obviously to encode them all. The program has a dedicated routine to treat the ambiguities correctly. This is especially important when figures and statistics are produced.5

2. THE SEARCH ENGINE It is by its power that a database can be gauged, and it is by what it can and cannot do that its value can be properly assessed. If compared with what exists nowadays inside and outside Egyptology, it does not seem pretentious to state that Ramses allows (almost) any kind of research without limitation. The following points deserve some discussion: • corpus definition; • search parameters; • result display. 2.1. Corpus Definition The search engine offers the possibility to search either the whole database or the text being edited. Now it is also possible to build one’s own corpus of research. This can be done by selecting texts on a list, or by using various criteria as filters. For instance, it is possible to restrict the research to the letters written on ostraca during the twentieth dynasty. The results of a previous search can also be used as a corpus for another search, which is often the appropriate way of investigating a problem thoroughly. 2.2. Search Parameters Basically, any search can involve one or several words. In the latter case, there is the possibility to search adjacent words or to allow some intervening words between. This is achieved by using the skip operator (*). One can search the lemma, the inflection or the spelling, separately or in 5

For a concrete example, see our Website (n. 4).

160

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

combination. The search engine also allows the use of the boolean operators (AND, NOT, OR). Here are some examples: • If one is interested in finding the occurrences of the formula rdi Hr.f r/n X (‘to give one’s attention to something/somebody’), one must search for the collocations of three distinct words: the verb rdi (whatever its inflection), the noun Hr, and the preposition r or n. The subject of the verb does not matter, nor the suffix pronoun after Hr. One must here insert the skip operator to allow any kind of noun phrase. From a logical viewpoint, the request has the following pattern (where a box stands for a word): Spelling

Spelling

Spelling

Lemma: rdi * Lemma: Hr * Lemma: n OR r Inflection

Inflection

Inflection

In the database, the request will be built as in Figure 3 (using * as a skip operator). The results are displayed in a list with the text name. One can easily access to the text by double-clicking on the line; the results are highlighted (Figure 4). Figure 3: Request pattern for rdi Hr.f n/r

Figure 4: Showing the context of one example of rdi Hr.f n/r (here P. Bologna 1094)

161

RAMSES

• It is easy to find the occurrences of a inflection without linking it to a lemma. As example, the search pattern in Figure 5 must be used to find the circumstantial perfects (iw sDm.f )

Figure 5: The request pattern for the circumstantial perfect sDm.f

• Spellings are another possible field of research: for instance, one can now search for a string of signs inside a word. This can of course be useful to fill in lacunae, or to study determinatives. In the example below, one looks for the string ≠ ´ in verbs; the query must use the operator AND since two criteria are applied on the same word (Figure 6). Spelling:

≠´

Lemma Inflection: verb

]

AND

Figure 6: The request pattern for ≠ ´ in verbs

2.3. Display of results For the present, the program enables only limited facilities for viewing the results. They are displayed in a list that can be sorted out by the documents’ name or date. The context can be shown on the screen by clicking on the

162

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

corresponding line. These facilities will be greatly expanded in the months to come (see below).

3. FUTURE PLANS In the next two years, the database should be greatly enhanced in many ways. 3.1. Encoding the Texts The encoding of the texts started in early 2007. So far, 440 texts have been encoded, which amounts roughly to 100,000 words. There are about 6,000 lemmas in the lexicon. Ramses aims at encoding all sources written in Late Egyptian. Texts written in mixed Late Egyptian are also taken into account. The time span considered ranges from the 18th to the 25th dynasty.6 In the coming years, the encoding of the texts will remain a priority. The automata which will be written for handling the syntactic and the morphological issues should greatly help us to reach our goals quickly and efficiently (see below). 3.2. Bibliography In 2008–2009, a bibliographical module will be written. It will be interconnected with the documents/texts database, the Lexicon Editor, and the Text Editor. So it will be quite easy to add the relevant bibliographical notes on general matters (like texts) and on particular points (like words in the lexicon, special spellings, or difficult passages in the texts). 3.3. Syntactic Analysis The next two years will keep us busy with the writing of the syntactic parser. Basically, it will come down to write automata that will be fed with some rules of Late Egyptian syntax. By analysing the context (especially word order), by accessing the classes of words recorded in the lexicon, and by using a statistical approach, the automata should produce reliable hypotheses to group words in syntagms and to assign them a syntactic function.

6

J. WINAND, Études de néo-égyptien, I. La morphologie verbale, Liège, 1992 (= Ægyptiaca Leodiensia 2), p. 3–25.

RAMSES

163

The advantages of encoding texts with the help of automata7 cannot be overemphasized: first they really speed the things up, and second – which is undoubtedly as important –, it is probably the best way of ensuring the maximum possible coherence of the data. So, automata could also be written to help the basic encoding of lemmas, spellings and morphological analyses. Needless to say, the final responsibility of the analysis rests upon the human encoder. 3.4. The search engine The search engine will also be greatly improved. A special focus will be put on the following points: • extending the possibilities to build a search corpus by making accessible all the descriptors present in the documents/texts database; • extending the sorting facilities: for the time being, the results can only be sorted according to the date or to the alphabetical order of the document. This should be extended in at least three directions: first, the descriptors being used to build a search corpus should be available as sorting criteria; second, the lexical and morphological features present in the lexicon should also be used; and third, the syntactic analysis, when completed, should be taken into account. For instance, when looking for all possible occurrences of a circumstantial perfective (see above), the results should be sorted out according first to the verbal lemma, and second according to the spelling; • the results should be exported in different formats (word processing, spreadsheet...) according to one’s personal preferences: for instance, taking once again the example of the circumstantial perfective, one should be able to export the occurrences sorted out as suggested above plus, if needed, for each example, the corresponding line in hieroglyphs (with a definable context); 7

Multiple approaches are considered here: variations of the very classical context-free grammars (É. WEHRLI, L’analyse syntaxique des langues naturelles, Paris, Masson, 1997), which are very expressive, but not always convenient for a large corpus, but also grammars based on finite-state automata and transducers, an approach which became popular in Natural Language Processing in the late 1990s (E. ROCHE & Y. SCHABES [ed.], Finite-State Language Processing, MIT Press, 1997).

164

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

• finally, some statistics should be produced according to the user’s choice. This will be particularly welcome when the database will produce vast quantities of results. It will be thus possible to test hypotheses by applying different criteria to see whether they are statistically relevant. For example, when studying the spellings of the personal pronouns, it will be possible to test if the reasons of the variations and changes must be looked for in the diachrony, in the writing system (hieroglyphic vs. hieratic), in the geographical provenance, or in whatsoever still unknown reason.

4. CONCLUSION: WHAT COMES NEXT? The database will be put on line as soon as possible. Users will first have to register. Basic consultation will be allowed without restriction. The advanced search module will also be accessible for free, but under some conditions. The database will prove most useful for philological and grammatical studies. In this respect, very innovative topics of research could be addressed especially by young scholars engaged in a doctoral thesis.8 Researches in the graphic system(s) of Late Egyptian texts are another possibility. In the long run, we hope to bring to a successful end two main projects which are the most natural outputs of such a database: a dictionary of Late Egyptian, and a complete grammar of Late Egyptian.

8

There are already four PhDs directly related to the Ramses project (for more details, see J. WINAND, St. POLIS, & S. ROSMORDUC, in Proceedings of the Xth International Congress of Egyptologists, n. 15).

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

Serge Rosmorduc

ABSTRACT This article describes a system which is able to give a reasonable transliteration of Middle Egyptian Hieroglyphic texts, using a set of ‘rewriting rules’. I give a brief explanation of the inner working of the system, and then proceed to describe the detail of the said rules. INTRODUCTION

A number of years ago, while working on automated syntactic analysis of Middle Egyptian for my doctorate, I considered the possibility of automated transliteration.1 However, as this task was not central to our work back then, I did not explore the problem in depth. In 1997, I had a engineering student, F. Kerboul,2 work on the subject again, using the Prolog computer language; she did work on the analysis of isolated words, and, as the results were encouraging, I decided to study the subject in more detail.

1

2

S. Rosmorduc. (1996). Analyse morpho-syntaxique de textes non ponctués, Application aux textes hiéroglyphiques. PhD Thesis, p. 75; id, ‘Traitement automatique du langage naturel en moyen égyptien’, PIREI X, Bordeaux, 1994, p. 100–101. F. Kerboul (1997). Translittération automatique des hiéroglyphes. Rapport de stage de l’ENSTA.

167

168

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

The system described in this article takes as input a description of a hieroglyphic text as a list of ‘Manuel de codage’ codes, and outputs a transliteration of the text. This article is written with two audiences in mind, both Egyptologists and specialists of the Natural Language Processing. Thus, it contains a number of rather basic explanations, both on the problem itself and on the method used to solve it. ANALYSIS

Transliteration of Egyptian Hieroglyphs When an Egyptologist works on a hieroglyphic text, he transcribes it into latin characters which represent its consonantal skeleton. In my case, the input will consist in codes designed to describe the hieroglyphs, defined in the so-called ‘Manuel de codage’3 (MdC). Hieroglyphs

¢ ¢ ∆C

6 É!

= ¿ ∆á

MdC

I10:D46-M17-N35

T18-G43-A1

M17-N29:D21:Y1

Transliteration

Dd.in

Translation

said

Smsw servant

iqr excellent

The Manuel de Codage describes the hieroglyphic texts with signs codes such as I10 for ¢ , and positional codes in the form of ‘:’ for ‘stack above’. The sign values can fall into different categories: • phonetic signs, which represent one or more consonants, as the vowels are not written. For instance, the I10 snake is D (like English ‘j’), while T18 ( ) is Sms (sh+m+s). • ideograms, which refer directly to their meaning: for instance, µ E1 can be used to write the word iH, ‘ox’. • determinatives, which are a kind of semantic classifier, used at word endings. In our example, ! and ¿ (a papyrus roll) are determinatives, respectively for human beings and for abstract concepts.

3

J. Buurman, N. Grimal, M. Hainsworth, J. Hallof and D. v. der Plas Inventaire des signes hieroglyphiques en vue de leur saisie informatique. Mémoires de l’Académie des Inscriptions et Belles Lettres. Paris: Institut de France (1988)

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

169

Now, automated transliteration is not as simple as it might seem. Firstly, signs may have more than one value. For instance, the papyrus roll is also an ideogram in the word mDA.t, ‘document’. Secondly, signs combination is not straightforward. A multi-consonantal sign is often accompanied by ‘alphabetic’ signs, which correspond to part of its spelling. For instance, Y5 ƒ has the value mn, but is often accompanied by N35, C, which is n, and the group Y5:N35 ƒ C is mn, not mnn. BASIC APPROACH

To give a first idea of how automated transliteration can work, let us start with a simple example, considering the word _Ø" , Ab, ‘to desire’. In reality, this word is encoded in the ‘Manuel de codage’ as ‘U23-D58-A2.’ The signs can have the following value: 1. U23, _, can correspond either to the consonants Ab or mr. 2. D58, Ø, is the consonant b. 3.A2, ", is a determinative for actions linked to the mouth. Knowing the values of the signs is not enough. We also need to combine them. In reference to the phonetic signs here, two different rules may apply: a) we can consider that a biliteral XY sign followed by an uniliteral sign Z can be combined into a group of three consonants, XYZ, b) or we can consider that a biliteral sign XY, followed by an uniliteral Y, i.e. followed by its last consonant, can be combined as a group of two consonants, XY. In all cases, the sign A2 marks the end of the word, and has no phonetic value. Using all possible solutions, we end up with three interpretations: using the value mr for U23, we can only use rule a), which gives the transliteration mrb. using the value Ab for U23, we can use either rule a), producing Abb, or b), giving Ab (the correct solution). Then, we need a way to represent those ‘rules’, and a way to choose the ‘best’ solution.

170

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Representation of the rules (simple example) The rules will be represented as ‘rewriting rules’, stating that, for a given data, we will obtain a given conclusion. For instance, the two different values of U23 will be represented as: r1. U23 => P(A,b) / 100 r2. U23 => P(m,r) / 100 where P(A,b) means ‘phonetic sign with value A,b’, and 100 is the ‘cost’ of the rule, which is a measure of its correctness. The actual values given to the costs are rather ad-hoc. The other rules for signs values would be something like: r3. D58 => P(b) / 100 r4. A2 => DET(mouth_action) / 100 Now, this is a first set of rules, which, applied to the entry ‘U23 D58 A2’, can produce the results: P(A,b) P(b) DET(mouth_action), for a cost of 300 or P(m,r) P(b) DET(mouth_action), also for a cost of 300. A second set of rules, used to combine the signs, will then be applied. A first rule will state that a uniliteral sign can be read on its own: r5. P($X) => L($X)/100 here, $X is a variable which can be replaced by any consonant. We use two different operators, P() and L() to differentiate sign values (that is P) and word transliteration (that is L). The same rule is true for biliteral signs: r6. P($X,$Y) => L($X), L($Y) /100 But of course, there is the rule which allows to combine a biliteral and an uniliteral sign: r7. P($X,$Y), P($Y) => L($X), L($Y) /100 Last, we can envisage a rule stating that a determinative can mark the end of a word: r8. DET($X) => wordend/100.

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

171

Applying the rules The first set of rules, concerning the sign values, is applied to the input text. This creates a set of possible interpretations; then we apply the second set of rules to those interpretations. The principle is that the ‘cost’ of an hypothesis is the total of the costs of the rules used, and the ‘best’ hypothesis is the one with the least cost. For instance, to obtain the transliteration mrb, one has to use rules r2, r3, and r4, and then apply rules r5, r6, and r8, the result being ‘L(m), L(r), L(b), wordend’, with a total cost of 600. To obtain the transliteration Ab, the rules used will be r1, r3 and r4, and then r7 and r8, with a cost of 500. Hence, Ab is better than mrb. One problem with this system is that we need somehow to consider all the possible combinations of rules, which, at first sight, is a daunting task. For instance, if we consider the word åØN´ (V24-D58-F46-D54), wDb, ‘turn, fold’,

å,V24, can be P(w,D) Ø, D58, can be P(b) N can be either P(p,X,r), ID(w,D,b) or DET(folding_action) ´ can be ID(i,w), ID(n,m,t,t), or DET(move) where ID means ‘ideogram’. If we simply consider the possibilities here, we have 1 × 1 × 3 × 3, hence 9 interpretations for the sign values. To summarise, the number of hypotheses tends to grow exponentially with the length of the text.4 Now, rewriting rules is a very well-studied field of computer science, and we have an elegant representation for both the rules and their interpretation, called a finite state transducer, itself an extension of the notion of finite state automaton.5 The notion is rather simple to grasp graphically. Here is a transducer representing our sequence of signs:

4

5

To give another example, if we had four ambiguous signs, each with three interpretations, the total number of combinations would be 3 × 3 × 3 × 3, 81 possibilities! For a theoretical and practical coverage of the subject in relation with computing and Natural Language Processing, see E. Roche and Y. Schabes, ed. Finite-State Language Processing, MIT Press, 1997.

172

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

On each link in the transducer, there are two pieces of information: the input (on top of the link), which is what can be read by the transducer, and the output (below the link), which is what can be produced by the transducer. The input corresponds to the left part of our rules, and the output, to the right part. Any path from the leftmost node of the transducer to its rightmost node (the square one) represents a choice of rule applications. Hence, the transducer permits the representation of a large number of hypothesis in a very compact way. Another important feature of transducers is that they can represent both the entry, the rules and their results. Let us consider a few more rules to permit us to go further:6 ID($A,$B)=> L($A), L($B) / 100 ID($A,$B,$C,$D)=> L($A), L($B),L($C), L($D) / 100 ID($A,$B,$C)=> L($A), L($B),L($C) / 100 P($A,$B), P($B), ID($A,$B,$C) => L($A), L($B), L($C) / 100 P($A,$B,$C)=> L($A), L($B), L($C) / 100 Once applied, those rules will give us the following transducer:

6

Those rules are in reality rather over-simplistic to allow the whole analysis to be performed, but I wish here to underline the mechanisms of the system.

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

173

Each path in the transducer corresponds to a possible analysis (of course, only a few of which are reasonable). If we count them, we find that there are now twelve different possibilities, but the transducer represents them all with only nine links. An interesting feature is that choices which are independent from one another are clearly separated by the automaton. Problems to solve This being said, there are a number of problematic points to deal with if we want to perform a reasonable automated transliteration of a hieroglyphic text. We want to be able to separate the various words. To do this, we need somehow to model the form of an Egyptian word. Firstly, on the linguistic level: a biliteral or triliteral word is usual, but longer words tend to have specific forms, either because they contain grammatical endings, or because they are derived from simpler roots by the addition of prefixes or reduplication. Second, on the graphemic level: a word tends to contain a phonetic part, followed by an ending with a determinative and various marks, like plural indicators. The structure of a word ending, and in particular the determinatives, needs to be described thoroughly. This is a reasonable description for most words, but not all. Some are written ideographically, but they are not very usual in hieratic texts. More annoying is the problem of function words which often do not follow the general pattern. In this case, the reasonable thing to do is to introduce some lexical information in our system, and have specific rules for those words.

174

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

ARCHITECTURE OF THE SYSTEM

To design a system able actually to deal with texts, we needed to choose a corpus. The principles of the hieroglyphic system is the same for all texts, but the details vary a lot. For instance, the orthography changed considerably from the 12th to the 19th dynasty. Even if we keep a synchronic corpus, the orthography of hieratic texts, or cursive texts in general is much more explicit than that of monumental texts. The system we are about to describe was tailored for Middle Kingdom Hieratic texts. To achieve our goal, we need to make the system somewhat more complex. We will need, not two, but five layers of rules. Each layer will be applied to the result of the previous one. All layers are implemented as transducers, but some layers do not follow the model above. LAYERS OF THE SYSTEM

Sign Normalisation This first layers takes as input the MdC codes, and outputs a simplified representation thereof. The ‘phonetic codes’ of the MdC are replaced with Gardiner codes, and some sign variants are replaced by their base sign (for instance, F47 O and F48 P are variants of F46 N and will be replaced by ‘F46’). Sign values The next level of rules takes as input normalised codes, and replaces them with all their possible interpretations. We keep the information about biliteral signs and so on, in order to combine the signs values in the next layer. Let us give a few typical rules, about the sign D36 ò : D36 => P(a) / 100 D36 => ID(a) / 200 D36 => DET(action) / 500 D36 can be understood as a uniliteral sign, which is the preferred interpretation (with a cost of 100); but it does also behave as an ideogram, when writing the word ‘arm’ or the word ‘condition, state’.7 The cost for this interpretation is slightly higher. The idea is that, in later rules, the combination D36 and Z1 ( ò » ) will be interpreted as an ideographically written word

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

175

at a relatively low cost. The third rule deals with a confusion between D36 and D40 ú, as a determinative for actions. Note that this could have been dealt with in the first level instead. In the current version of the software, we distinguish between different kinds of values: Phonetic values, coded by ‘P’ Ideographic values, coded by ‘ID’ Phonetic-ideographic values, coded by ‘IP’.8 Some signs are also real ‘word-signs’, which differ from the usual ideogram in that they write a word on their own, whereas the typical Egyptian ideogram is usually combined with Z1 » to do so (for instance ò » ). In this case, we decide to generate the full interpretation of the word right away. For instance, O3, N, pr.t-xrw will be treated by the rule: O3 => L(p), L(r), L(t), wordend, L(x), L(r), L(w), wordend / 100 where ‘wordend’ is a marker for word endings. This data will be copied as is by the following layers. This is also a good place to deal with a kind of hiatus which occurs sometimes in the MdC. For instance, the group R22:R12 ® û is in fact to be understood as a whole, i.e. it should be considered as one sign.9 The same is true for the determinative T14-G41 2Å , which is not a ‘double’ determinative, but represents in fact a bird being hit by a throw-stick, as shown in the variant G84 : . Finally, as some signs are rather strong markers of some specific words, especially of function words, but may be combined with phonetic complements, we decided that, in the next layer, it might be interesting to keep the sign codes themselves, so we have a ‘copy rule’: $X => $X / 0

7 8

9

In this last case, it is not of course an ideogram stricto sensu, but it does behave like one. The relevance of the category of phonetic-ideographic value is somewhat dubious, as was pointed to me by P. Vernus. They could probably be replaced with ideographic values. W. Schenkel ‘Gesichtspunkte für die Neugestaltung der Hieroglyphenliste’, Göttinger Miszellen, vol. 14, 1974, p. 31–45.

176

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Simple Groups This layer takes its input from the previous layer, i.e. sign values, and combine those signs. It does not produce words yet, as words can be composed of multiple and rather complex groups of signs. Basically, this level of rules does two things: a) it combines phonetic complements with other signs, saying that ƒ mn + C n is mn and not mnn (dealing with exceptions like D4:D21 u which is irr and not ir in our corpus). á b) it tries to ascertain word endings Regarding case a), we obtain as input phonetic values like P(m,n) or P(n), ideographic values, and ideographic-phonetic values. We need to combine them to realise the values of the resulting groups. However, as we are interested in word formation, we need to keep the information that all those consonants belong to the same group, as a group belongs to only one word. Hence, our rules have the form: P($X,$Y), P($Y) => G($X,$Y), groupend / 10 which means that a phonetic sign with value $X$Y, followed by an uniliteral sign of value $Y should be read $X$Y. The fact that we grouped those signs is marked by the G() symbol, and also by the groupend, which is used in the following layers to deal with group combination. Regarding word endings, we model possible combinations of signs. The idea is to represent the way the various determinatives and grammatical markers may be combined in words. In particular, a number of phonetic signs, coding the plural, the feminine, or various verbal values, may be mixed with the determinatives. We have decided to re-order them at that stage, which is convenient for the next layer, where we will build the word. Let us look at two significant rules for this purpose: DET($x) => wordendstart, DET($x), wordend / 100 DET($x), P(t) => wordendstart, L(t), DET($x), wordend / 100 The first rule states that a determinative may end a word by itself. The second takes a group consisting of a determinative and a ‘t’, and states that this can be a word ending. However, the ‘t’ is output in front of the determinative for the next layer. I will take advantage of this rule to point out why it is interesting to distinguish ‘phonetic signs’ from ideograms in our system. If we consider X2, ∏, which for our period is an ideogram for t, ‘bread’, it cannot be used

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

177

as a phonetic marker for feminine words. And that is exactly what our system does, as there is no rule ‘X2 => P(t)’ in the ‘sign values’ level. The set of rules in this level is rather large, as we need to deal with all possible configurations of signs. Except for the most obvious ones, this was achieved through analysis of the corpus. Word formation The task of the next set of rules is to combine the various groups into words. It uses the markers created by the previous level. First, the wordendstart and the groupend markers allow us to express that a word should in general contain both a phonetic part and a determinative part (for those which do not, we have more specific rules). Thus, we have a rule groupend, wordendstart => epsilon / 0 which means that a group ending followed by a typical ‘end of word’ part is normal, erasing the sequence groupend, wordendstart, with a cost of 0 (‘epsilon’ stands for ‘nothing’). groupend => wordend / 10000 which means that a group ending can be a word ending too, but at a very high cost in general. Words for which this is normally the case are few, and can be dealt with using specific rules. Next, we have rules to combine the various groups. For instance, a word may be written with a biliteral group: G($x,$y) => L($x), L($y) / 100 Or it can be written with two uniliteral groups: G($x), groupend, G($y) => L($x), L($y) / 100 In this last rule, the group ending which corresponds to the first group is ‘consumed’ for a small cost (much less than the 10000 needed previously), which means that the system will prefer to group two uniliteral groups in a biliteral word than to make a word with the first group on its own (once again, function words like suffix pronoun have specific rules which shortcut this one). Long words are not very usual in Egyptian, and they tend to follow specific patterns. For instance, a word can be formed on a biliteral root by reduplicating it, with often a strengthening effect. For instance the root qn,

178

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

‘to be strong’ generates the verb qnqn, ‘to beat’. We have described this formation with the rule: G($x,$y), e2, G($x,$y) => L($x), L($y), L($x), L($y) / 100 which allows reduplicated roots of the form ABAB (both groups need to have the same consonants). Phonetic Determinatives and general cleanup The last set of rules deals mainly with phonetic determinatives. They are currently considered as part of the end of the word, but they bring with them a phonetic value, which should be matched by the beginning of the word. PROBLEMS SOLVED

Function and frequent words A number of words, mainly very common ones, and above all function words, tend not to take a determinative at all. Hence, they would break our system. Fortunately, although those words are commonly found in texts, they represent a small part of the lexicon. Hence, we can write specific rules for them. In the current system, those rules are kept in the ‘group formation’ layer, because in that layer, we have the phonetic values of the signs at our disposal, and we have also copied their codes specifically for that purpose. The rule for the preposition Hna is: P(H), P(n), P(a) => wordstart, L(H), L(n), L(a), wordend /1 here, wordstart and wordend will prevent any attempt to group this word with another one later on. However, there are a few problems, especially with suffix pronouns, and especially with ‘s’, which can be either a feminine suffix pronoun, or be used as a causative prefix. A point which must certainly be improved in our system is that it sometimes outputs a sequence of suffix, which is obviously impossible. It would be possible to create a transducer to prevent this behaviour, but it has not been done yet.

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

179

Multiple determinatives In Middle Egyptian orthography, words tend to have only one determinative, but they sometime take more. As we decided to extract our rules from the corpus, we needed to list those cases. In fact, they are not arbitrary. One of those determinatives tends to be a very general one, as Y1 ¿ (abstractions), A1 ! (man), or N35A D (liquid), the other being much more specific, and in some cases, one might be tempted to consider it as an ideogram. For instance, in the word î ∑∑|[ , ktt, small one, the first determinative (|)corresponds to small things, whereas the second one corresponds to women. We decided to add a mark, ‘GENDET’, to those signs. A rule of the form: DET($X), wordend, GENDET, DET($Y) => DET($x), DET($Y), wordend/1 allows then the combination of a generic determinative with another one, while preventing the combination of two run-of-the-mill determinatives. Robustness Software is said to be robust if it can cope with ‘bad’ or unexpected input. The very principle of our system makes it quite easy to do so. The idea is that each layer must accept any possible input, but always produce an output. This is done by ‘catch-all’ rules, which have a very high cost. CONCLUSION

Evaluation We have developed our set of rules on the tale of the story of the Shipwrecked Sailor (3500 signs, 1419 words), and we have about 9% of erroneous words. A word was considered to be erroneous if the interpretation given by the system was not possible; in some cases, it might be considered somewhat optimistic, as the grammatical context might for instance invalidate the software’s analysis. The software was then tried on another text, P. Westcar (9480 signs, 4000 words), for which the set of rules had not been made. The error rate raised to 18%, which is rather high. However, most errors were caused by unforeseen groups of signs, which is rather easy to fix within our formalism.

180

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

A good feature of the system is that one can add complex rules quite freely, without interfering with the already correct parts of the software, because a complex rule matches a complex and peculiar entry, and so will interfere with more general rules only in specific cases, where it will normally improve the result. Of course, rules which accept very simple entries are definitely more difficult. Now, if we take a Late Egyptian text, far removed from the training corpus, the result is rather bad, but quite interesting. Our system, when trying to transliterate the Instruction of Amenemope, makes the same kind of errors as a student trained in Middle Egyptian would do. In particular, he is baffled by syllabic orthography and multiple determinatives. In a way, this indicates that we have modelled the performance of a student trained in the classical language, but not in its more recent successors. On a more practical side, the software is quite fast on a modern computer. The speed is currently in the region of 30s for 3000 signs, which seems reasonable (and much faster than a human would achieve!). A number of rather simple optimisations would permit considerable improvements in the results. Possible improvements A number of technical points can certainly be improved in the system; in particular, the sign value layer could benefit from a considerable increase in speed. The rule system could probably be improved, both by adding a few layers (in particular layers to filter out multiple determinatives), and by cleaning up the existing ones. The cost system itself is of course very ad-hoc, and it would be interesting to see if statistics could be applied to this problem,10 but this would require a major revision of the system, with a much simpler structure. The down-side of statistics in this case is that we would no longer be able to model human expertise. The system as a whole is rather independent of the actual training corpus, and one could imagine using different sets of rules for different types of texts: hieroglyphic texts, Late Egyptian texts, and so on. Of course, the addition of a lexicon would increase the precision of the system, and is indeed rather easy, as is demonstrated by the work on function words. 10

F. Pereira, M. Riley. Speech recognition by composition of weighted finite automata. In Roche & Schabes, ed., p. 431–453.

AUTOMATED TRANSLITERATION OF EGYPTIAN HIEROGLYPHS

181

Possible uses Our system can of course be used to produce a ‘reasonably good’ transliteration of a text, which can speed up the work of a scholar (although we are not sure that correcting an existing transliteration is much faster than writing one from scratch). The result itself, a transducer, can be used as input for further processing, for instance as the entry of a syntactic analyser. An interesting property of the system is that it analyses the word structure while transliterating it. This could be of some use, for instance in text searches. The intermediary levels, with the sign group, can be thought of as ‘normalized forms’ of the words, and could be used for searching and indexing. And, last but not least, the creation of rules in itself is rather stimulating, and a quite interesting exercise in the study of Ancient Egyptian spellings.

RELATIONAL DATABASE DESIGN: A TUTORIAL AND CASE STUDY FOR EGYPTOLOGISTS

Ernest W. Adams and Nigel Strudwick

ABSTRACT

We present a tutorial and case study on the design of relational databases, with examples of interest to Egyptologists. The principles presented are applicable both to commercial database software and to user-written programs. Emphasis is placed on understanding the nature of the data, and on minimizing the space consumed by the database. INTRODUCTION1

With the invention of the microprocessor, computing power has become available to nearly every scientist who wishes it. However, many scholars in the humanities and social sciences are uncertain of how best to make use of a computer, even though they perceive a need for one. The new user is presented with a bewildering array of hardware and software options, new terminology, and vague promises on the part of computer manufacturers. Frequently the manufacturer’s emphasis is on financial applications and business productivity. Scientists, on the other hand, use computers in more

1

This paper was originally presented at the Leiden meeting of Informatique et Egyptologie in July 1986 and submitted for the proceedings, although it did not appear until Informatique et Egyptologie 7 (Paris 1990), 9–24. See the Postscript on page 206 below for some comments on changes since that time.

183

184

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

diverse ways, and the nature of their data is less well-defined than a business’. To the scholar, computers usually serve three purposes: data collection, data analysis, and data storage. The former two are beyond the scope of this paper. How data are stored, however, can make a great deal of difference to the amount of space required to store them and the time required to retrieve them. In this paper we hope to provide an introduction to the techniques of database design which will be useful to the novice user. Some of the terms we will use are not quite correct from a database-theory standpoint. Instead we have chosen to use the same terminology found in the manuals provided with commercial software. As you read the tutorial, we suggest you pay especial attention to the illustrations, noting the connections between the relations and the way data moves from one to the next. The illustrations provide a visual example of what is being described in the text, and are essential for understanding the concepts presented. WHAT IS A DATABASE?

A database is a collection of related data. Not all databases are in computers: the telephone book, dictionary, and encyclopaedia are all databases of sorts, organized in different ways and for different purposes. Computerized databases have the advantage of being searched quickly and changed easily. More importantly, they can be organized in several different ways at the same time, by creating indexes.2 A new index for a different purpose may be created at any time, and the indexes can be updated automatically when data are added or deleted. There are three basic kinds of computerized databases: network, hierarchical, and relational. The former two types are usually organized for maximum access speed, and they sacrifice some flexibility as a result. They are most often used for applications in which the permissible queries are restricted to a few well-defined operations.3 Network and hierarchical databases use computer storage space to store relationships among the data 2

The book Bartlett’s Familiar Quotations is an example of a database indexed more than one way at once: the book is actually sorted chronologically, but at the back it contains an alphabetical index of authors and an alphabetical index of words.

185

RELATIONAL DATABASE DESIGN

explicitly. That is, if two data are related in some way, there will be a special datum in the computer’s storage, called a ‘pointer’, which connects the two data items. This enables the computer to ‘follow the pointer’ and discover the relationship very quickly. If the database is ever reorganized, however, all the pointers must be recomputed. Relational databases, on the other hand, sacrifice speed for flexibility and ease of manipulation. They do not contain pointers. Relationships among the data are implicit, and the computer must search for the correct datum, instead of following a pointer. It is easy to add new relationships to a relational database because there are no pointers to be computed. This paper will devote itself exclusively to relational database design, for two reasons. The first is that relational databases are easier to understand intuitively. The second is that the great majority of commercial database software for microcomputers is designed to build relational databases, and this is the type most likely to be encountered by Egyptologists. ADDRESS_BOOK first

middle

last

street

city

state

post code

John Sue Linda

Jay Ellen Kay

Smith Miller Huang

224 Oak St. 112 Alma St 12 Park Rd.

Lexington Palo Alto Atherton

KY CA CA

40511 94306 94025

Figure 1: The ADDRESS_BOOK relation THE RELATIONAL DATABASE

Relational databases are constructed of one or more relations.4 A relation may be thought of as a sort of table. It has rows (called records) and columns (called fields). Each record is made up of several fields. The fields in the record contain data which are related in some way. The most familiar example is the name-and-address relation, consisting (in the United States) of 3

4

An airline flight- and passenger-scheduling database, for example, could well be a network database; events which might require a substantial restructuring of the database (new types of airplanes, new rules about booking procedures, etc.) are comparatively infrequent. In documentation for the dBase II database management system, relations are called ‘databases’. This is an unfortunate misuse of the term. dBase II’s databases are, in fact, relations. [This issue still persists in 2008, although sometimes the better term ‘tables’ is employed.]

186

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

fields for first, middle, and last name, house number and street, city, state, and postal code (Figure 1). We will refer to this relation as ADDRESS_BOOK. There is one record for each person in the relation. Note that every record has the same fields (although their contents vary), just as every row in a table crosses the same columns. In relational databases, the fields in a relation have a fixed width. This means that in any given field, only so much space is available to store information. The width of each field must be determined when the database is designed, and cannot be changed afterwards.5 Consequently it is very important to make sure the fields are wide enough to store the data you anticipate. Each field in the relation has a domain—that is, the range of possible values which may be stored there. Defining the domain of a field in advance will help to provide some idea about how wide it must be. For example, if a field called gods is to be filled with one of the names ‘Osiris’, ‘Isis’, ‘Ra’, or ‘Khnum’, then those four names make up its domain. The field need only be six characters wide because the longest datum that will ever be stored there is the name ‘Osiris’. If domain is widened to include the name ‘Apedemak’ after the relation has been created, it will be necessary to shorten the name in some way to make it fit. Often the domain of a field is a range of numbers, rather than a set of names. A field called account_balance usually contains a number with two decimal places, representing money.6 In this case also the designer must determine (or guess) how big the largest number ever to be stored in that field will be, and make the field wide enough to hold it. In most database management systems, it is not necessary to specify the domain of a field explicitly. Usually only the width need be specified.

5

6

It is usually possible to copy all the data from one relation to a new relation with wider fields. Depending on how much there is, however, this can be time-consuming. Before decimal currency was adopted in Great Britain, three fields were required: pounds, shillings and pence. The domain of pound was from 0 to some arbitrary limit; the domain of shillings was from 0 to 19; the domain of pence was from 0 to 11. When sums of money were added together, some additional calculation was required to make sure that the values in the fields remained within their domains. The whole system was further complicated by the fact that some sums of money were expressed in guineas.

187

RELATIONAL DATABASE DESIGN

The system will not make any checks to make sure that you have in fact entered one of the legal values.7 SOME RULES ABOUT RELATIONS

Relational database theory is predicated upon certain rules which all relations must follow: 1. A relation may have any number of records. 2. All records in the relation are made up of the same fields. 3. No two records in a relation may contain exactly the same data; i.e. each record must be unique. Rules #1 and #2 are self-explanatory. As new data are obtained, new records are added to the relation. Rule #3 exists to prevent certain kinds of problems with manipulating the relation. Consider the ADDRESS_BOOK relation again. John L. Smith and his son, John L. Smith Jr. both live at the same address. Since ‘Jr.’ is not part of the name ‘Smith’, it will probably be left out of the database—resulting in two records with identical data. What would happen if we gave the command ‘remove John L. Smith’? Should the system remove one record, or both? Which one should it remove? How should it decide? ADDRESS_BOOK bill #

first

middle

last

street

city

state

post code

1142 0369 7263

John Sue Linda

Jay Ellen Kay

Smith Miller Huang

224 Oak St. 112 Alma St 12 Park Rd.

Lexington Palo Alto Atherton

KY CA CA

40511 94306 94025

Figure 2: ADDRESS_BOOK with billing number

In order to avoid this problem, many systems assign a unique number to each record, storing it in a field of its own (Figure 2). In billing systems, this is usually a billing number, and is provided to the client in order to allow him to identify himself quickly and unambiguously. This is why computerized bills request that the customer write his account number on his check—and why payments are credited more slowly if it is not.

7

dBase II defines three kinds of fields: character, numeric, and logical (true/ false). Only numbers may be put in numeric fields; only the letters ‘T’ and ‘F’ may be put in logical fields. Character fields may contain any data.

188

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

To the greatest extent possible, all the values in a domain should be unique. Often this is impractical in the case of people’s names, and some arrangement for distinguishing individuals, such as the bill_# field may be used. This problem arises in the case study presented later, and another alternative is given. Every record exists to describe a specific object or relationship. Some of the fields name or define the object; they are referred to as the ruling part of the record. The other fields provide additional data about the object; they are referred to as the dependent part. This is because their data depends upon the data in the ruling part. It is the contents of the ruling part which makes the record unique. In the example from Figure 2, the ruling part consists only of the bill_# field. All other fields depend on that field, i.e. if the billing number were different, the other data would probably be different also. In giving the names of fields in a relation, we will place the ruling part first, and separate it from the dependent part by the symbol ‘:>’. Thus the field list for ADDRESS_BOOK with a billing number looks like this: bill_# :> name, address, city, state, post_code NORMALIZATION

It is often possible to reduce the amount of space required by a database by a process called normalization. This is not a command which may be given to a computer; rather, it is a process the computer user follows while setting up the database. The process of normalization results in relations which are said to be in normal form. Depending on which normalizations have been applied, the relations are said to be in first normal form, second normal form, etc. In this section we will examine the different normalizations and their results. In order to do this, we will be considering a hypothetical database for iconographic studies.8 Let us suppose that we have taken several thousand photographs of temple reliefs, in an effort to detect changes in the 8

Certain aspects of our example may appear to be implausible to professional Egyptologists. They should remember that it has been chosen primarily to illustrate the principles of relational database design. Some of the data shown have been derived from published sources; others are fictitious. They have been interpreted rather liberally in order to show the desired effect. A Pascal program to implement the database in a very simple form was demonstrated at the conference.

RELATIONAL DATABASE DESIGN

189

iconography over the centuries and throughout the country. We have decided to concentrate on a single type of scene: a person or persons making offerings to a god or gods. We wish to store the details of each photograph in a database, and we are interested in the following items: 1. The number of the photograph. 2. The collection in which the photograph is stored. 3. The publication (if any) in which the photograph appears. 4. The temple at which the photograph was taken. 5. The location within the temple of the relief. 6. The king during whose reign the relief was made. 7. The names of the gods in the relief. 8. The names of the people in the relief. 9. The dates of the reigns of the kings in the reliefs. 10. The headdress of each god or person. 11. The names of the offerings being made. 12. The types of furniture present. 13. The posture of each god or person. We will begin by assuming that all these data are stored in a single relation. Each item will be represented by a field (or fields) in a single record, and there will be one record per photograph. In the process of normalization, new relations will be created. The initial relation, however, will always exist; it is called the primary entity relation because it stores information about the primary entities. We will regard the photographs as the primary entities. We will name our first relation PHOTOS. The ruling part of this relation is the photograph number, since it is the thing which names what each record contains, and the other data are dependent upon it. The field list for PHOTOS is: photo_number :> collection, publication, temple, location, reign, god_name(s), people_names(s), god_headdress(es), people_headdress(es), god_posture(s), people_posture(s), furniture, offerings, reign_dates FIRST NORMAL FORM: REMOVING REPEATING FIELDS

Quite often, a situation arises in which some fields must repeat themselves within a single relation. For example, a relation about a family might include fields for the names of the children, but how many fields are needed? By rule #2, all records must be made up of the same fields, but not all families

190

INFORMATION TECHNOLOGY AND EGYPTOLOGY

PHOTOS photo # collection publication temple 0001 Chicago House Calverley v. I Abydos 0002 E.E.S. Bubastis 0003 Berlin Medinet Habu GODS photo # 0001 0002 0002 0003 0003 0003

name Osiris Bastet Horus Amun-Re Mut Khonsu

headdress Atef crown (none) Double crown Two-plumed crown Double crown (lost)

PEOPLE photo # name 0001 Sety I 0002 Osorkon I 0003 Ankhnes-neferibre 0003 Padineith

2008

location Second hypostyle First Hall Chapel of divine adoratrices

reign Sety I Osorkon Amasis?

posture Seated Standing with staff Standing with staff Seated Standing with staff Standing with staff

headdress Nemes Blue crown Plumed crown (none)

posture reign dates Offering 1307–1290 Censing and libating 924–888 Standing and offering Standing

FURNITURE photo # item 0001 Offering table 0001 Cube throne 0003 Throne OFFERINGS photo # item 0001 Pile of food 0003 Maat

Figure 3: Database with all relations in first normal form

have the same number of children. If we allot five fields, for example, families with more than five children will only have information on the first five. On the other hand, in families with fewer than five children, some fields will be empty and space will be wasted in the database. Therefore, the normalization to first normal form involves removing repeating fields and placing them in a new relation (Figure 3). Returning to the iconography database, it is clear that there are several repeating fields. The names of the gods, names of the people, offerings, and furniture are all repeating fields. Furthermore, the headdresses of the people and gods, their postures, and the reign dates of any people who were kings, must go along with their names. In order to normalize PHOTOS to

RELATIONAL DATABASE DESIGN

191

first normal form, we will have to create four new relations: PEOPLE, GODS, FURNITURE, and OFFERINGS. Their field lists are as follows: PEOPLE

photo_number, name :> headdress, posture, reign_dates GODS

photo_number, name :> headdress, posture FURNITURE

photo_number, item :> OFFERINGS

photo_number, item :> The field list for PHOTOS has been reduced to: photo_number :> collection, publication, temple, location, reign In normalizing to first normal form, the new relations are called nest relations. The relation from which their fields are moved is said to be their owner. The connection between them is called an ownership connection. (The ‘star’ symbol in Figure 3 indicates ownership.) Instead of having repeating fields, each record in the nest relation contains the data from one repeating field. In Figure 3, there are three gods in photo #0003. The PHOTOS relation contains only one record for the photograph itself, but the GODS relation contains three records for photo #0003, one for each god. The one record in PHOTOS owns the three records in GODS. This means that the GODS relation contains exactly as many gods as are really represented; there are no empty fields. Notice that no record appears in the FURNITURE relation for photo #0002, because no furniture appears in the photograph. The relationship between owner and nest relations is therefore one-to-many; one record in the owner relation may own many records in the nest relation. In order to connect the records in the nest relations with the records in the primary entity relation, it is necessary to copy the photograph number into the nest relation. The ruling part of the nest relations consists of both the photograph number and the repeating field, because only these two together make it unique. In the case of the FURNITURE and OFFERINGS nest relations, there are no additional fields, and consequently there is no dependent part. In the case of GODS and PEOPLE, the headdress and posture fields form the dependent part. If there were any additional data about each item of furniture, those fields would form the dependent part.

192

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

SECOND NORMAL FORM: DEPENDENCY REMOVAL

Normalization to second normal form consists of removing one kind of dependency from the relation. If a field in the dependent part of a relation is dependent on only a subset of the ruling part (that is, if it is only dependent on one of several fields making up the ruling part), then it should be removed to another relation. This does not apply to the primary entity relation PHOTOS. The ruling part consists of only one field, photo_number, and so the dependent fields are dependent on the entire ruling part. Consequently, PHOTOS is already in second normal form. PEOPLE, however, is not. It is in first normal form, since it has no repeating fields. The process of normalization creates new relations; it is always possible to normalize them as well as the primary entity relation. In the PEOPLE relation, the reign dates field is dependent only upon the original name, not upon the photograph number. Thus if there are several hundred photos of Sety I, the dates of his reign would appear in each record. Also, for all people who are not kings, the field will be empty. This redundant storage of information is clearly a waste of space. The reign dates should be removed to a different relation. Normalization to second normal form creates a new kind of relation, called a reference relation. This contains information which it may be desirable to ‘look up’ or refer to. The connection is called a reference connection, and is drawn with an arrow (Figure 4). The new relation we will call KING_REIGNS. Its field list looks like this: name :> reign_dates The field list for PEOPLE now looks like this: photo_number, name :> headdress, posture Reference relations can be referred to by any number of other relations. The relationship between the records in a referring relation and those in a reference relation is many-to-one; many records may refer to a single record. THIRD NORMAL FORM: MORE DEPENDENCY REMOVAL

The normalization to second normal form removed dependencies between fields in the dependent part and a subset of the ruling part. The normalization to third normal form removes dependencies among fields in the

RELATIONAL DATABASE DESIGN

PHOTOS photo # collection publication temple 0001 Chicago House Calverley v. I Abydos 0002 E.E.S. Bubastis 0003 Berlin Medinet Habu GODS photo # 0001 0002 0002 0003 0003 0003

name Osiris Bastet Horus Amun-Re Mut Khonsu

headdress Atef crown (none) Double crown Two-plumed crown Double crown (lost)

PEOPLE photo # name 0001 Sety I 0002 Osorkon I 0003 Ankhnes-neferibre 0003 Padineith FURNITURE photo # item 0001 Offering table 0001 Cube throne 0003 Throne

location Second hypostyle First Hall Chapel of divine adoratrices

193

reign Sety I Osorkon Amasis?

posture Seated Standing with staff Standing with staff Seated Standing with staff Standing with staff

headdress Nemes Blue crown Plumed crown (none)

posture Offering Censing and libating Standing and offering Standing

KING_REIGNS name reign dates Sety I 1307–1290 Osorkon I 924–888

OFFERINGS photo # item 0001 Pile of food 0003 Maat

Figure 4: Database with all relations in second normal form

dependent part alone. As with the normalization to second normal form, such dependencies indicate that data are being stored redundantly and space is being wasted. This normalization also creates reference relations. In the case of our database, such a dependency exists in the PHOTOS relation. More than one photograph of a single location may have been taken (if one wall has several reliefs, for example) but it is not necessary to store the reign of the wall’s construction with each of them. The reign field is dependent on the location, not on the photograph number, and should be removed to another reference relation. We will call the new relation LOCATION_DATA. The field list for PHOTOS is now:

194

INFORMATION TECHNOLOGY AND EGYPTOLOGY

PHOTOS photo # collection publication temple 0001 Chicago House Calverley v. I Abydos 0002 E.E.S. Bubastis 0003 Berlin Medinet Habu LOCATION_DATA temple location Abydos Second hypostyle Bubastis First Hall Medinet Habu Chapel of divine adoratrices GODS photo # 0001 0002 0002 0003 0003 0003

name Osiris Bastet Horus Amun-Re Mut Khonsu

headdress Atef crown (none) Double crown Two-plumed crown Double crown (lost)

PEOPLE photo # name 0001 Sety I 0002 Osorkon I 0003 Ankhnes-neferibre 0003 Padineith FURNITURE photo # item 0001 Offering table 0001 Cube throne 0003 Throne

2008

location Second hypostyle First Hall Chapel of divine adoratrices

reign Sety I Osorkon Amasis?

posture Seated Standing with staff Standing with staff Seated Standing with staff Standing with staff

headdress Nemes Blue crown Plumed crown (none)

posture Offering Censing and libating Standing and offering Standing

KING_REIGNS name reign dates Sety I 1307–1290 Osorkon I 924–888

OFFERINGS photo # item 0001 Pile of food 0003 Maat

Figure 5: Database with all relations in third normal form

photo_number :> collection, publication, temple, location The field list for LOCATION_DATA is: location :> reign The database is now represented by Figure 5. The normalizations presented here are not the only ones which exist. In recent years computer

RELATIONAL DATABASE DESIGN

195

scientists have identified several more. However, these are the most significant in terms of the storage space they save, and they are the only ones which will be presented. Two additional space-saving techniques will be given, however. DECOMPOSITION INTO SUBSET RELATIONS

It is frequently valuable to collect extra information on a certain group of primary entities. In an employee database, it may be useful to know how many people the supervisors have working for them. However, most employees are not supervisors, and if we create a field for each employee called supervisees, it will be empty most of the time. This situation exists in our iconography database. The PHOTOS relation contains a field, publication, which will only be filled if the photograph actually has been published. The field can correctly remain in PHOTOS, since its value is dependent on the photograph number. However, for all the photographs which have never been published it will be empty. As a result, it can be moved to a new relation of its own. This type is called a subset relation, and it exists simply to store data that are not needed in every record of the entity relation. A subset relation is connected to the entity relation by means of a subset connection, illustrated here as a rounded arrow. We will call the new relation PHOTO_PUBLICATION. The database is now represented by Figure 6. The field list for PHOTOS is now: photo_number :> collection, temple, location The field list for PHOTO_PUBLICATION is: photo_number :> publication The relationship between the entity relation and the subset relation is one-to-one; a record in the entity relation has at most one record in the subset relation. TWO ADDITIONAL RELATION TYPES

There are two additional types of relations which we have not discussed so far: the lexicon and the associative relation. While these are not created as the result of normalization or decomposition, they are useful tools in the construction of a relational database.

196

INFORMATION TECHNOLOGY AND EGYPTOLOGY

PHOTOS photo # collection 0001 Chicago House 0002 E.E.S. 0003 Berlin

temple Abydos Bubastis Medinet Habu

name Osiris Bastet Horus Amun-Re Mut Khonsu

location Second hypostyle First Hall Chapel of divine adoratrices PHOTO_PUBLICATION photo # publication 0001 Calverley v. I

OFFERINGS photo # item 0001 Pile of food 0003 Maat GODS photo # 0001 0002 0002 0003 0003 0003

2008

headdress Atef crown (none) Double crown Two-plumed crown Double crown (lost)

PEOPLE photo # name 0001 Sety I 0002 Osorkon I 0003 Ankhnes-neferibre 0003 Padineith FURNITURE photo # item 0001 Offering table 0001 Cube throne 0003 Throne

posture Seated Standing with staff Standing with staff Seated Standing with staff Standing with staff

headdress Nemes Blue crown Plumed crown (none)

posture Offering Censing and libating Standing and offering Standing

KING_REIGNS name reign dates Sety I 1307–1290 Osorkon I 924–888

LOCATION_DATA temple location Abydos Second hypostyle Bubastis First Hall Medinet Habu Chapel of divine adoratrices

reign Sety I Osorkon Amasis?

Figure 6: Database with subset decomposition

The lexicon There is a special class of reference relation called a lexicon. These relations have exactly two fields, each of which is a synonym for the other. In a lexicon, each field is dependent upon the other, and the concepts of ruling and

RELATIONAL DATABASE DESIGN

197

dependent part do not apply. Lexicons are usually used for storing the relationship between a code and its key—the relationship between a department number and a department name, for example. Such codes are useful for reducing the amount of space required in the database. If the domain of a field consists of some particularly long strings of characters, it may be inconvenient to allocate enough space to the field. For example, one value of the headdress field could in theory be the string ‘The double crown of Upper and Lower Egypt’. Of course no one would really want to define a field long enough to hold all that, and keep that field in several thousand records. But a two-letter code might suffice quite well: DC for double crown, UE for upper Egypt, SD for solar disk, etc. The key to the code could be stored in a lexicon. Since the lexicon would only contain as many records as there are types of headdresses, the keys could be quite long. When reading the code, it is only necessary to look up its meaning in the lexicon. Such a lexicon would have the field list: headdress_code headdress_description The lexicon should be employed at the user’s discretion, but it requires compromises. The use of a code means that the data cannot be as quickly displayed, since coded values must be looked up in a separate relation, and that takes time. On the other hand, for compression of data it is invaluable. The associative relation Our iconographic database has been organized with the assumption that the primary entity is the photograph. This arrangement means that searching the database will be very fast if we already know the photograph number. To find out what furniture exists in a given photograph, we need only to search the FURNITURE relation for the photograph number, and read the item field in the correct records. But suppose that we wished to ask, ‘In which photographs do Sety I and Osiris appear?’ In order to determine this, we would need to search both PEOPLE and the GODS relation for the correct names, then compare the photograph numbers to see if they match. If there are 500 photographs in which Osiris appears, and 50 in which Sety I appears, the database search will require 25,000 comparisons! This amount of searching is time-consuming even for a computer. If such queries are likely to be frequent, some way of reducing the search time must be found. It is always possible to construct new relations, even without normalization. If there is a new kind of data to be stored, or if existing data is to be

198

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

stored in a new way, a new relation can be created. In this case, we wish to create a new relation which will represent explictly the relationship between people, gods, and the photographs in which they appear. The data in this relation will duplicate the data already existing in other relations, but it will be much easier to find. We will sacrifice storage space on the disk or cassette in order to reduce the execution time required to answer the query. In order to do this, we will create a new kind of relation called an associative relation. An associative relation is like a nest relation which is owned by two (or more) different relations at once. Our relation is called GODS_PEOPLE_PHOTOS. Its field list is: god_name, person_name, photo_# :> Each record is owned by both one record in GODS and one record in PEOPLE. Whenever a new record is added to GODS or to PEOPLE, a new record must also be added to GODS_PEOPLE_PHOTOS. Otherwise the association will not be recorded and a program which relies upon data in that relation will give incorrect results. The associative relation represents meaningful associations of existing entities. It demonstrates a many-to-many relationship between the records in the owning relations. It need not have a dependent part, although often there are some additional data about the association which should be stored there. MORE RULES ABOUT RELATIONS

Here is a summary of the new types of relations and the rules which apply to them. Following these rules carefully is crucial to creating and maintaining a relational database correctly.9 Ownership rules The ruling part of the nest relation is the concatenation of the ruling part of the owner relation and a field to distinguish individuals in the owned sets. A new record can be inserted into the nest relation only if there is a matching owner record in the owning relation. Deletion of an owner record requires deletion of all the owned records in the nest relation as well. 9

The rules in this section are paraphrased directly from Wiederhold, Database Design, 368–71.

199

RELATIONAL DATABASE DESIGN

PHOTOS photo # collection 0001 Chicago House 0002 E.E.S. 0003 Berlin

temple Abydos Bubastis Medinet Habu

PHOTO_PUBLICATION photo # publication 0001 Calverley v. I

OFFERINGS photo # item 0001 Pile of food 0003 Maat GODS photo # 0001 0002 0002 0003 0003 0003

name Osiris Bastet Horus Amun-Re Mut Khonsu

location Second hypostyle First Hall Chapel of divine adoratrices

headdress Atef crown (none) Double crown Two-plumed crown Double crown (lost)

PEOPLE photo # name 0001 Sety I 0002 Osorkon I 0003 Ankhnes-neferibre 0003 Padineith FURNITURE photo # item 0001 Offering table 0001 Cube throne 0003 Throne

posture Seated Standing with staff Standing with staff Seated Standing with staff Standing with staff

headdress Nemes Blue crown Plumed crown (none)

KING_REIGNS name reign dates Sety I 1307–1290 Osorkon I 924–888

LOCATION_DATA temple location Abydos Second hypostyle Bubastis First Hall Medinet Habu Chapel of divine adoratrices

Figure 7: Database with associative relation

posture Offering Censing and libating Standing and offering Standing

reign Sety I Osorkon Amasis?

GODS_PEOPLE_PHOTOS god name person name

photo #

Osiris Bastet Horus Amun-Re Amun-Re Mut Mut Khonsu Khonsu

0001 0002 0002 0003 0003 0003 0003 0003 0003

Sety I Osorkon I Osorkon I Ankhnes-neferibre Padineith Ankhnes-neferibre Padineith Ankhnes-neferibre Padineith

200

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Notice that in the case of the associative relation shown above, the ruling part is the concatenation of the ruling parts of both owner relations. Reference rules The ruling part of a reference relation matches the referring field of the primary or referring relation. Records in reference relations may not be removed while any reference exists. The removal of referring records from the primary relation does not require removal of the corresponding referenced record. Subset rules The ruling part of a subset relation matches the ruling part of its connected general relation. Every subset record depends on one general record. A general record may have no or one record in any connected subset relation. SOME NOTES ABOUT NORMALIZATION

We cannot emphasize strongly enough that the process of defining relations and normalizing them should take place before any data are gathered. In this way you will understand exactly what sort of data you intend to gather and how it is related. If you wish to change a relation after the data have been gathered, great care must be taken to see that all the data are copied properly into their new relations, and they are all still valid. The process of normalizing a relation does not guarantee that the data in it will be correct. It is the responsibility of the database administrator to see to it that the rules for adding and deleting records are obeyed. As the number of relations increases, the complexity of their interrelationships increases also. Special care must be taken to ensure that when records are added or deleted, there are no unresolved connections (owned records in a nest relation which are left behind, for example). It is also important to realize that the connections between the relations are abstractions. The manuals which come with commercial database management software will probably not make any reference to them, and they have no actual representation in the computer. They are simply a means for expressing various relationships among the data. Normalization of relations is not always desirable. With small data sets, (under 100 primary entity records) the duplication of fields in the nest

RELATIONAL DATABASE DESIGN

201

reference, and subset relations may cost more than the reduction of redundancy saves.10 For example, in going to first normal form, it is necessary to copy the ruling part of the owner record into every record that it owns. If the ruling part is very large, and the actual data to be stored in the nest relation quite small, then it may be preferable to simply create a large number of repeating fields in the primary entity relation and run the risk of running out of fields or of leaving some empty. This leads to the question of what should be done with an empty field. In most systems, empty fields are filled with some default value, usually zero for numeric fields and the ‘space’ character for character fields. However, there are at least three reasons why a field might be empty, and it is useful to distinguish between them. 1. NOT APPLICABLE - The field simply does not apply to the data in the given record. This can occur when you have not decomposed the relation into subset relations, or when you have repeating fields and some of them remain unfilled; e.g. you have fields for five children and there are only four in the family. 2. NOT YET ENTERED - Data are available for the field but have not yet been entered into the computer. 3. UNKNOWN - The field is applicable, but there are no data for it. It may be helpful to devise some means which will allow you to distinguish between these three conditions. This way other people examining the data will know why the field is empty. In numeric fields, for example, you may wish to enter zero or negative numbers; in character fields you might develop a code which stands for one of the reasons. IDEAS FOR IMPLEMENTATION

By far the easiest way for a computer novice to create a relational database is to buy a commercial database management system. When this article was 10

In our own example, much more space was wasted during normalization than was saved. However, the example only used three primary entities. If there, had been several thousand primary entities, the amount of space saved would have been significant. If, for example, most of the photographs were unpublished, then subset decomposition would have removed all the empty publication fields and only a few PHOTO_PUBLICATION records would have been created.

202

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

originally written, one of the most popular programs was the dBase family; dBase II had the drawback of being able to manipulate only two relations at once, with the result that one was constantly switching them in and out of memory. Such limitations are now history, but dBase II was a landmark product in the development of database software. Most available systems provide a means of writing programs for manipulating the relations automatically. This capability is essential if more than four or five relations are in use. When reading the data in a relation, usually the ruling part is searched for the records desired. For example, in order to find all the photographs in which the goddess Isis appears, it is necessary to search the ruling part of the GODS relation for ‘Isis’ in the name field. This search can be speeded up considerably if the data is sorted or indexed. While techniques for sorting and indexing relations are beyond the scope of this paper, many commercial database software products have built-in commands for indexing a relation on any field desired, or on more than one field. An index gives the impression that the data are sorted without actually sorting it; the advantage of this is that relations may be indexed in several different ways at once. In the example above, GODS should probably have two indexes, one for the name field and one for the photo_number field. Although the index takes up space on the disk (just as it requires pages at the back of a book), the speed it affords is well worth it if the relation contains more than about 100 records. CASE STUDY

In the published proceedings of the Leyden Conference, Nigel Strudwick presented a database of stelae, and we will now demonstrate how it can be implemented using relational methodology.11 In its original form, it was designed for a significant part of the data to be available in memory, whereas relational files are usually stored on a disk. Using the terminology adopted above, the basic fields of the primary entity relation would be as follows, using the reference number of the stela as the ruling part of the relation; stela_no :> collection, data, provenance, material, shape, height, width, registers, technique, features, owner_name, owner sex, 11

Informatique et Egyptologie 4, 17–31.

RELATIONAL DATABASE DESIGN

203

owner_title(s), god(s)_in_scene, god(s)_in_formulae, other_person_name(s), other_person_sex, other_person_relationship, other_person_title(s), inscriptions, bibliography, comments There are a total of 23 fields, and there is much scope for normalization of repeating fields. Into this category come the titles of the stela owner, the gods in scenes and formulae, the other people shown on the stela, inscriptions, bibliography, and comments. The creation of first normal form by removal of repeating fields creates no less than eight further relations, as follows: OWNER_TITLES stela_no, title :> GOD_SCENE stela_no, god_name :> GOD_FORMULAE stela_no, god_name :> PERSONNEL

stela_no, name :> sex, relationship PERSONNEL_TITLES

stela_no, name, title :> INSCRIPTIONS

stela_no, inscription :> BIBLIOGRAPHY

stela_no, bibliography :> COMMENTS

stela_no, comments :> Notice that it is not necessary to copy the owner_name into the OWNER_TITLES relation, because there is exactly one owner per stela. It is understood that the titles in OWNER_TITLES are associated with the owner’s name. PERSONNEL_TITLES is in fact a nest relation of a nest relation, since there can be several people (besides the owner) in a single stela, and each of them can have several titles. Data with many nested structures like this are often stored in hierarchical databases. The pointer structure of a hierarchical database makes it unnecessary to copy the ruling parts of the owner relations into each nest relation. This raises again the problem of creating unique records. It is certainly possible that the ownership of a stela might be ambiguous, and so some

204

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

STELAE stela no owner name data provenance material shape height width registers technique

OWNER DATA owner name owner sex

OWNER TITLES stela no title

INSCRIPTIONS stela no inscriptions

GOD SCENE stela no god name

BIBLIOGRAPHY stela no bibliography

GOD FORMULAE stela no item

COMMENTS stela no comments

PERSONNEL stela no name relationship

PERSONNEL DATA name sex PERSONNEL TITLES stela no name title

Figure 8: Stela database in third normal form

decision must be made about which person to put in the owner_name field. Likewise, it is not impossible that two people on the same stela should have the same name. In both these cases, the solution is to adopt a convention which must be made clear to all persons working with the database. In the case of people with identical names, we might append a letter to each of their names indicating their position on the stela, for example. Amenhotep A could always mean the one on the left, and Amenhotep B the one on the right. With ambiguous owners, we could adopt the convention that the one which is carved the largest, nearest the top of the stela, etc. will be regarded as the owner. In addition, we could add a field to STELAE called ambiguous, whose domain is simply TRUE and FALSE. This would serve to warn

RELATIONAL DATABASE DESIGN

205

people working with the data that it is not necessarily accurate and the convention has been applied. Two additional normalizations are possible yet. In STELAE, the sex of the owner is dependent on another field in the dependent part, owner_name, and not on the stela number. Consequently, this relation can be normalized to second normal form by moving sex to its own relation. If the owner happens to own several different stelae, it is not necessary to store his or her sex in each stela record. The same is true in PERSONNEL. The sex of the person is dependent on a subset of the ruling part (the name), and so this relation can be normalized to third normal form in exactly the same way. The fully normalized database is now shown in Figure 8. We should point out, however, that this is a good example of a database which need not be completely normalized. The sex field is only one character wide, and unless there are several hundred stelae belonging to one person, normalization to second and third normal forms will probably use more space than it saves, in addition to complicating the database. It would be better just to leave it in the STELAE and PERSONNEL relations and ignore the redundancy. Another possibility for saving space in this database is to use a lexicon for the materials, shapes, and relationships, but the overheads in access time must be considered in view of the fact that the words themselves are not particularly long. A danger that must be borne in mind with relational databases of this type is system restrictions on the number of files to be manipulated at once; in the case of dBase II it was two; with the Pascal system on which the tutorial example was demonstrated at the conference, problems arise more from the management of disk space than from the number of files. The proliferation of relations definitely complicates the database administrator’s task, and may not be worth the space it saves. CONCLUSION

We believe that this tutorial and case study will be of some value to the Egyptologist who is just beginning to use a computer. The concepts presented here are not new in the field of computer science or database design, but they are probably unfamiliar to the scholar in the social sciences and humanities. While the microcomputer has placed computing power in the hands of many people, the knowledge required to use it well is still largely

206

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

in the hands of computer professionals. We hope that we have helped to close that gap. BIBLIOGRAPHY

Date, C.J. An Introduction to Database Systems (Menio Park, California: Addison-Wesley, Inc., 1982). Strudwick, Nigel, ‘A database for the study of Egyptian stelae’, Informatique et Egyptologie 4 (Paris 1988), 17–31. Ullman, Jeffrey D. Principles of Database Systems (Rockville. Maryland: Computer Science Press, 1982). Wiederhold, Gio. Database Design (New York: McGraw-HiIl, Inc., 1983). POSTSCRIPT (NOVEMBER

2008)

This paper is a product of the late 1980s. Since that time, the world of computer hardware and software has changed out of all recognition, but the basic principles of relational databases have not. While most Egyptologists now have access to relational database packages and appropriate hardware, databases are becoming increasingly sophisticated, and the authors considered it timely to republish the basic paper, as an understanding of the theoretical underpinnings is more important than ever. The theoretical core of the paper has been left as was in 1986, with just the correction of a few typographical errors. The sections relating to database implementation in the late 1980s have, however, been shortened, as their only benefit to the reader in 2008 is historical; nonetheless, the occasional reference to contemporary hardware and software has been left in as a reminder. Computers in the past twenty years have become powerful in a way which could hardly be envisaged at the original time of writing, and certain matters discussed and still in this paper have become non-issues for the most part. Thus the paper stressed the need to save every byte possible of storage space; in those days, databases were perhaps the major potential consumers of storage, using floppy discs and cassettes as one did, with hard discs in their relative infancy. In 2008, the space requirements of databases are insignificant when compared to audio, video and digital images, and a private database is only likely to have significant storage requirements when it stores these newer space-hungry media types.

RELATIONAL DATABASE DESIGN

207

Most Egyptologists now have access to extremely powerful desktop database software: for the majority, this means relatively inexpensive general-use commercial packages such as Filemaker Pro or Microsoft Access, although open-source products employing basic database technologies such as SQL are also available at next to no cost. The dBase products which were in mind when the original paper was written are no more, although the company still produces database software (currently dBase Plus). Other companies have come and gone or diversified into other areas of information management (e.g. Borland), and there are many niche database programs (such as 4D, dBase), as well as the larger systems favoured by international commercial concerns (such as Oracle). It would be inappropriate here to add more about the capabilities of current systems. Nonetheless, it is worth noting how software improvements now make two issues noted in this paper easier for all: • current products permit data relationships to be displayed to and understood by the user in a graphical fashion unimaginable in 1986; • fixed-length data fields are more or less unknown, and the effective lack of restrictions on field size make the movement of data between structures much simpler than it was. Anyone involved in the world of computers and IT knows it is rash to predict where any type of system will be in five years, never mind twenty. In 1986, Egyptologists were coming to terms with databases and desktop publishing, and we could not easily have foreseen the world as it is now. I would like to thank Dag Bergman for reading this paper, and offering many useful comments, as well as for writing the next paper in this volume giving a 21st century perspective on the original article.

USING RELATIONAL DATABASES AT THE BEGINNING OF THE 21ST CENTURY

Dag Bergman

ABSTRACT

A perspective with comments and illustrated examples on the re-publishing of the ground-breaking article “Relational Database Design: A tutorial and case study for Egyptologists”, by Ernest W. Adams and Nigel Strudwick (see pages 183–207).

When Nigel Strudwick, Diane Bergman, and I planned the 2006 conference for the Informatique et Egyptologie (I&E) computer working group, we wanted to change its focus away from mainly discussing the technical aspects of the various computer-based technologies and tools that can be used to support Egyptological work. Instead, we wanted to focus the discussion on higher-level aspects of using such technologies and tools for various projects. After I had listened to several presentations of interesting projects during the 2006 and 2008 conferences of I&E, I realized that, although many aspects of using computer tools have become general knowledge understood by most people, this is not always true about the use of relational databases. The proper usage of relational databases is far less intuitive than the usage of other computer tools, and requires a good understanding of the fundamental underlying principles in order to be done efficiently. For some of the projects presented, where relational databases were used, it was clear from the presentations and the following discussions that the people who were running these projects had not always received the necessary training to understand these fundamental principles. This implies that each new 209

210

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

generation of relational database application designers needs to obtain the necessary skills, requiring both theoretical studies and practical training. To provide these skills may lie beyond the scope of I&E, but future conference organizers may consider offering (as is often done in similar conferences) a one or two day workshop in this field, linked to the conference. While thinking of how best to remedy the lack of understanding of relational databases within Egyptology, I recalled how I once, a long time ago, first began to understand them. I had already worked with non-relational databases since 1978. In 1988, I joined a working group for relational databases within CIDOC, the ICOM International Committee for Documentation. I later attended a university level course on the subject. However, before that, the first thing that opened my eyes was an article by Ernest W. Adams and our own chairman Nigel Strudwick from 1986, which I first saw in 1990. It presented the basic principles in a clear way using practical examples relevant to Egyptology. Due to my experiences at the 2008 conference, I reminded Nigel Strudwick about this excellent 1986 article, and suggested it was still relevant. After re-reading the article, he agreed to re-publish it, adding a postscript to make it more up-to-date. COMMENTS

1. The article suggests early that “To the scholar, computers usually serve three purposes: data collection, data analysis, and data storage”. I would like to add to that list, in these days when we finally have the excellent tool for data-sharing called the “World Wide Web”, that the retrieval and presentation of data are equally important, and further motivates the use of relational databases and thus reading the article. 2. Strudwick mentions himself (in his postscript) the limited availability of storage as a major consideration at the time when the article was written (a 40 MB hard disk was considered large in 1986, and would have cost some 10 times more than a 400 GB disk would cost in 2008). I would like to add that the low speed of processors, limited availability of RAM memory, and limited capacity of available applications were equally important. Even in the early 1990s, it was often necessary to ‘de-normalize’ after just having normalized (see the article) a database structure, just in order to make the computer and the application cope with the task, and not be overwhelmed. It was first in the period from 1993 to 1996 that (personal/micro) computers became powerful enough to make use of (more or less) true relational

USING RELATIONAL DATABASES

211

database engines, and, of equal importance, powerful such applications appeared, termed Relational Database Management Systems (RDMS). I used myself mainly Paradox by Borland and Microsoft (MS) Access (which I still use) running on Windows computers, whereas I had previously used Oracle on Macintosh. The two former products provided efficient implementations of an easy to use tool both to enter and retrieve data from properly normalized relational structures. The data can even be viewed and manipulated live, rather than just be studied as “dead” extracted views or snapshots in reports. The tool is called “Query By Example” (QBE or qbe), and is an excellent complement to the since long standard, but difficult to use, SQL (Structured Query Language) to retrieve data from relational databases. 3. The importance of queries is not mentioned in the article. This is understandable, as the applications available at the time when the article was written did not provide such powerful tools. It is, however, the use of powerful queries that make the normalization process so important and meaningful, rather than just a dry exercise in order to fulfil some theoretical principles. I have learned to master QBE, which in MS Access is very fast, also with complicated nested structures thanks partly to the patented Rushmoor Technology. MS Access even helps the user to convert automatically any QBE query to SQL when required (although this is far more difficult to use for complex queries). For the few cases that QBE cannot handle (such as Unions), and for using queries in program code, SQL can still be used. I promote QBE and MS Access here for the reasons given. Whichever relational database product is being used, the importance of queries cannot be over-emphasized, as that is what motivates the use of relational databases and learning what the article by Adams and Strudwick discusses. 4. In the discussion about relations (page 187), I would add that the rows and columns could have any order (be unordered). The ordering desired (in addition to filtering) for presentation would always be provided by use of queries, forms, and reports. 5. Regarding Normalization (page 188), I would like to add that “The First Normal Form” also dictates that all column values must be atomic, i.e. only one value in each field. The article assumes that this requirement is so obvious that it does not need to be mentioned. This rule is, however, not obvious to everybody, but is nevertheless an equally important part of the requirements of the “First Normal Form”, as is the removal of repeating fields, discussed in the example. While atomizing the original data and removing repeating fields, the data from those fields will be moved to new, so called nest relations (as the article states), with “one-to-many”

212

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

relationships to the original (owner) relation. Not only can the original shape of the data then be recreated, but also its components (or atoms) can be put back in any desired order using queries. 6. One major principle of the design of relational databases is that no data should be stored in more than one location at the same time. Following this principle not only saves storage space, but more importantly, provides terminology control, and thus also prevents the same data from being spelled differently in different places. Any time that data is needed in another place, a reference should be made to the place where the data is stored so it can be re-used with help of queries, rather than repeating the data (see below how I would implement it). The article deals with this issue while discussing normalization to second and third form (see Figure 4 and 5 in the article). There, so called reference relations, or “look-up” relations are introduced. The relationship between the referring and the reference relations is called many-to-one, as many records on the referring side may refer to the same record on the reference side. Accordingly, in the example, the relations KING_REIGNS and LOCATION_DATA are created to prevent the need for repeating such data every time a king’s reign, or a specific location is being referred to. 7. If one-to-many and many-to-one relationships are rather intuitive to most users, they are still special cases of the less intuitive, but very useful, many-to-many relationship (not the one the article mentions in its discussion about associative relations, though (see point 9 below)). To understand many-to-many relationships better, my technique is to consider reference relations as primary entities in their own rights. The situation is then entirely dependent on the perspective of the primary entity at the time. It is in the user interface, through queries, forms, and reports, that the user can decide to see, or study, the data in the database from the perspective of any such primary entity, not only from the one first chosen. I will try to explain this with an example below, but will first discuss some other important issues. 8. I find the method of identifying each record by the rather long names in one or two text fields (as used in the article) as very impractical. If, for instance, a name used as identifier has to be spelled differently, it has to be changed everywhere it has been used (some RDMS, such as MS Access, can actually automate this process, but I still think it is impractical). Instead, I use a randomly created number (AutoNumber in e.g. MS Access) as identifier in a so called key field to identify uniquely the records of every primary entity (relation). When data in a record from one such entity is needed in a

USING RELATIONAL DATABASES

213

different relation, I only store the unique ID number of this record, rather than repeating the actual data. When the data is to be presented in a form or report, I use a query to replace the ID number with the content of the desired field of the record in question. This method also replaces what the article calls “lexicon” as it permits me to display alternative data, such as the full name of a concept rather than its abbreviation or vice versa, which in turn could be stored in different fields of this primary entity. The use of relations as look-up tables (in e.g. drop-down menus) in combination with queries, make this principle possible and very powerful. Note that in the discussion about lexicons, the article uses the word “key” as the expansion or interpretation of a code or abbreviation, which is different from the one used here, which I believe is the more commonly used presently. 9. The discussion about associative relations (page 197) was relevant when it was written, but appears to be obsolete today. These days, I would never repeat data (and add unnecessary data entry work) in an associative relation. I would simply set up a query linking the relations required to accomplish the task that the associative relation would have been intended to accomplish. Queries have become, as mentioned, very fast since the mid 1990s, eliminating the need for associative relations. 10. The article mentions (page 187) three groups of rules which must be followed to create and maintain a relational database correctly. They are called Ownership, Reference, and Subset rules, and they stipulate what is required and allowed to be done to records in relations with the relevant relationships. These rules are often summarized in a concept called Referential Integrity. Unlike what the article stated as a fact when it was written, that the connections between the relations are just abstractions, modern RDMS (such as MS Access) actually allow the user to set up relationships between relations where the rules for Referential Integrity are enforced (see Figure 1 below). This will efficiently prevent the user from breaking any of these rules while working with the data. EXAMPLE

The examples below uses an MS Access database implementation of some of the tables from the examples in the Adams-Strudwick article to demonstrate some of my main issues: many-to-many relationships, referential integrity, AutoNumber key fields, as well as queries and forms with subforms to present data.

214

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

In a real development situation, in addition to the primary entity PEOPLE as in this example, I would also normalize most of the other fields, such as headdress and posture, and thus, in a similar way, select the values for these fields from drop-down menus based on primary entities to guarantee terminology control. I should also mention that in MS Access, relations are normally called tables, hence the mixed terminology below. Let us return to figures 4 and 5 of the Adams-Strudwick article and the example with the reference relation KING_REIGNS. There, the relation PEOPLE acts as the referring relation towards the reference relation KING_REIGNS. The relation PEOPLE also performs the same function in respect of the relation PHOTOS. As storage space is of less concern these days, I would ignore that not every person has (known) reign dates, and thus replace the relation KING_REIGNS with a new and different relation called PEOPLE. This new relation will be a primary entity and be able to store names and other data (not only reign dates, as in this case) about all persons, not only kings. Each record will be uniquely identified by an ID number in a key field. The original relation named PEOPLE will of course have to be renamed to something that better explains its role. For this example, I have chosen to call it PHOTOS-PEOPLE, as it links the relations PHOTOS and PEOPLE in a so called many-to-many relationship (see Figures 1 and 2). In the discussed many-to-many relationship, one photo can be linked to many persons, and one person can be linked to many photos. The ruling part of the relation PHOTOS-PEOPLE (the relation in the middle) consists of fields storing the ID numbers of the referenced records from the key fields of the relations PHOTOS and PEOPLE, and are therefore called “Foreign Key” (FK) fields. The fields in the dependent part are there to provide information about every “meeting” between a person and a photo. It is having a table in the middle (which links the primary entities) that makes a many-to-many relationship possible, and distinguishes it from the simpler one-to-many, many-to-one, and one-to-one relationships, which all are represented by only a simple line connection. The data can be studied from the perspective of either of the two primary entities PHOTOS and PEOPLE. This could be done directly through the query in figures 3 and 4 displayed as a datasheet, where the sorting at any time decides the perspective. Alternatively, the data can be studied from the perspective of either of the two primary entities PHOTOS and PEOPLE through a single record

USING RELATIONAL DATABASES

215

form for the owner relation, with the nest relation displayed as a datasheet in a linked subform (based on the query just discussed). The subform is filtered to only show records owned by the currently selected record on the main form (photo # 0003 in Figure 5, and ID person 1 (Sety) in Figure 6). Figure 5 demonstrates this with the data seen from the perspective of the relation PHOTOS, while Figure 6 does the same from the perspective of the relation PEOPLE. The relationships may then appear as simpler oneto-many/many-to-one relationships. Again, it is in the user interface that one decides from which perspective the data is to be studied at any specific time, and this can be changed from time to time. CONCLUSION

The Adams-Strudwick article is still very relevant, and I recommend everyone using computers within Egyptology to read it to be enlightened on the topic of relational databases, even if they have no (immediate) plans to undertake any project using them. This may even make them see possible important projects, just waiting to be undertaken, where their new relational database understanding can be applied. I should like to thank Diane Bergman, Hans van den Berg, and Leif Näsholm for their comments on drafts of this article.

Figure 1: The discussed many-to-many relationship as implemented in MS Access. The links between the tables are set up to secure Referential Integrity (see point 10).

216

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Figure 2: The discussed example with data from the Adams-Strudwick article, but with one (made-up) extra photo from the Abydos temple depicting a relief including Sety I with different headdress and posture, in order better to demonstrate the many-to-many relationship. For database related pedagogical reasons (see Figure 5), Anknes-neferibre has been given faked reign dates (rather than no reign dates).

USING RELATIONAL DATABASES

217

Figure 3: This example shows one query (out of many possible queries) in design mode (split into two rows in order to fit) that is used to retrieve the data. The three tables have (in this case) been linked as in the relationships diagram, and all fields have been selected, and placed in a chosen order. When run, we can see that the query (in this case) will be sorted in ascending order; firstly on the field “photo #” from the table PHOTOS, and secondly on the field “name” from the table PEOPLE. The fields “temple” and “location” from the table PHOTOS have been merged into one virtual field, calculated at runtime, and separated by comma.

Figure 4: The image shows the same query as in Figure 3, but at runtime. Note that one photo # and one name each appears twice, but no combination “photo # - name” appears more than once. This is because they are based on the two fields making up the key, or ruling part of the table PHOTOS-PEOPLE, which is not allowed to contain duplicates. Also note how the data from the fields “temple” and “location” from the table PHOTOS have been merged.

218 INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

USING RELATIONAL DATABASES

219

Figure 5: The single record main form shows the data from the perspective of the relation PHOTOS, with the nest relation as a datasheet on a subform. New records on the datasheet (people on photo) will automatically be linked to the currently selected record on the main form, and are created by either selecting a name from the drop-down menu in the field “Lookup person”, or by simply entering the persons ID number into the field “FK PEOPLE”.

Figure 6: The single record main form shows the data from the perspective of the relation PEOPLE, with the nest relation as a datasheet on a subform.

AUTHOR ADDRESSES

Ernest ADAMS Consultant http://www.designersnotebook.com/

Edward LORING Russian Academy of Sciences [email protected]

Dag BERGMAN Oxford UK [email protected]

Marcus MÜLLER-ROTH Institut für Kunstgeschichte und Archäologie der Universität Bonn Abteilung Ägyptologie Arbeitsstelle Totenbuch-Projekt Oxfordstraße 15 53111 Bonn Germany [email protected]

Vincent EUVERTE 10, avenue Edouard Belin 92500 Rueil Malmaison France [email protected]

Mark-Jan NEDERHOF School of Computer Science University of St Andrews North Haugh, St Andrews, Fife KY16 9SX Scotland [email protected]

Svenja A. GÜLDEN Auf dem Sandberg 24 51105 Köln Germany [email protected] Claus JURMAN Commission for Egypt and the Levant Austrian Academy of Sciences Postgasse 7/1/10 1010 Vienna Austria [email protected]

Stéphane POLIS Département des Sciences de l’Antiquité, Égyptologie Université de Liège Bât. A1, Place du XX-Août, 7 4000 Liège Belgium [email protected]

221

222

INFORMATION TECHNOLOGY AND EGYPTOLOGY

2008

Derek RAINE Centre for Interdisciplinary Science Department of Physics and Astronomy University of Leicester Leicester LE1 7RH UK [email protected]

Sarah L. SYMONS Centre for Interdisciplinary Science Department of Physics and Astronomy University of Leicester Leicester LE1 7RH UK [email protected]

Vincent RAZANAJAO 14, rue de la Grange Batelière 75009 Paris France [email protected]

Robert VERGNIEUX Director of Archéovision Plate-forme Technologique 3D – ARCHEOVISION Institut Ausonius 8 Esplanade des Antilles 33600 Pessac Cedex France [email protected]

Serge ROSMORDUC 6, rue Augustine Variot, 92240 Malakoff France [email protected] Nigel STRUDWICK British Museum Great Russell Street London WC1N 3DG UK [email protected] Elaine SULLIVAN Department of Near Eastern Languages and Cultures UCLA, 378 Humanities Building, 415 Portola Plaza Los Angeles CA 90095 USA [email protected]

Willeke WENDRICH Department of Near Eastern Languages and Cultures UCLA, 378 Humanities Building, 415 Portola Plaza Los Angeles CA 90095 USA [email protected] Jean WINAND Département des Sciences de l’Antiquité, Égyptologie Université de Liège Bât. A1, Place du XX-Août, 7 4000 Liège Belgium [email protected]