Hypertext, Scholarly Annotation, and the Electronic Edition

George P. Landow
Department of English, Brown University, Providence, Rhode Island 02919

KEYWORDS: electronic-editing, linking, annotation

AFFILIATION: Brown University

E-MAIL:       gplandow@brownvm.brown.edu
PHONE NUMBER: 401-863-2393

The particular form of digital textuality called hypertext has the power to change both our experience and our understanding of text and author. It therefore has the potential to reconfigure the nature of the scholarly edition in ways that might appear radical. After providing a brief working definition of hypertext, I shall next observe some of the ways even seemingly minor variations in technology can affect forms of reading and writing. Finally, I shall look at some of the implications of various forms of hypertext for scholarly annotation, particularly that intended for scholarly editions.

As used here, the term hypertext, which includes hypermedia, signifies text composed of lexias (blocks of words, images, or sounds) linked electronically by multiple paths, chains, or trails in an open-ended web. Hypertext, in other words, is an information technology in which a new element - the link - plays a major part. All the chief practical, cultural, and educational characteristics of this medium derive from the fact that linking creates new kinds of connectivity and reader choice. Hypertext is therefore properly described as multisequential or multilinear rather than as nonlinear writing.

All hypertext systems thus far developed represent only partial, very limited embodiments of the Nelsonian vision of this information technology, one that ultimately requires all texts to be linked together in a universal web or docuverse. One can categorize present hypertext systems in terms of a series of oppositions, such as those between systems that store links in each lexia (HyperCard, Storyspace, HTML) and those that store them in a separate database (Intermedia, Microcosm). Similarly, one can distinguish between systems, like Voyager Expanded Book or early versions of DynaText, that conceive of the text as an electronified book with an essentially axial structure, and those that begin with a more fundamentally hypertextual network organization. As important as these differences are, two others have more fundamental implications for the scholarly edition, the first being that between read-only or presentation systems and those in which readers have the ability to add their own texts, comments, or both. A second, equally basic opposition divides discrete, stand-alone hypertexts that exist on single machines from that dispersed form of hypertext whose components reside on separate machines joined to form networks. The World Wide Web (WWW) represents an early, extremely limited, instantiation of this second form.
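The opposition between links stored in each lexia and links stored in a separate database can be made concrete with a minimal sketch. The class names, lexia names, and data below are hypothetical illustrations, not a description of how HyperCard, Intermedia, or any other system named above is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddedLexia:
    """A lexia that carries its own outgoing links (the embedded model)."""
    name: str
    text: str
    links: list = field(default_factory=list)  # names of target lexias


@dataclass(frozen=True)
class Link:
    """A link record kept apart from the lexias it connects."""
    source: str
    target: str


class LinkBase:
    """A separate store of links (the external-database model)."""

    def __init__(self):
        self._links = []

    def add(self, source, target):
        # Adding a link never touches the text of either lexia.
        self._links.append(Link(source, target))

    def links_from(self, name):
        return [link.target for link in self._links if link.source == name]

    def links_to(self, name):
        # Incoming links can be found because links live outside the lexias.
        return [link.source for link in self._links if link.target == name]


# Embedded model: the link is part of the lexia itself.
essay = EmbeddedLexia("hudsons-statue", "(text)", links=["note-vishnu"])

# External model: an editor and a later reader both add links
# without altering the annotated text.
linkbase = LinkBase()
linkbase.add("hudsons-statue", "note-vishnu")
linkbase.add("reader-comment", "hudsons-statue")
```

In this sketch the external store is what allows a reader to attach a comment to a text he or she cannot edit, which is one reason the storage question bears on the distinction between read-only systems and those that accept readers' additions.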

Anyone contemplating the creation of an electronic edition of a literary or other text immediately confronts questions directly related to these technical issues. For example, in designing an edition one has to decide if it should take the form of a stand-alone textual corpus or participate within a larger network. This distinction between stand-alone and networked hypertext relates directly to matters of scale, since any stand-alone edition, whether contained on a floppy disk, Zip cartridge, or CD-ROM, brings with it predetermined limits upon size. These factors in turn relate directly to questions about how the new scholarly edition will be purchased, delivered, stored, and read. Matters of scale also influence the manner in which one answers the question, "Of what does the scholarly edition consist - a text, text plus explanatory contextualization, both of these plus reference tools, such as the OED, or, finally, all of these plus Everything Else?"

The difference between stand-alone and networked hypertext forces one to recognize a principle that appears particularly important during early stages of the development of these new information technologies - namely, the rule that apparently minor differences in technology can have unexpectedly major consequences upon intellectual work. Historians of reading, information technology, and culture offer the example of the effects created by the relatively inexpensive writing surfaces that appeared around the year 1000. Cheaper writing materials encouraged scribes to leave spaces between words, and this new writing practice in turn made reading vastly easier, ultimately producing the practice of silent reading, or reading to oneself; and this widespread cultural practice seems to have played a role in the development of ideas of the self and private, interior imaginative space.

At this point it appears impossible to tell if any particular detail or difference in the new information technologies can, by itself, have the effect that interword spacing had upon the late Middle Ages. One can note, however, obvious effects created by minor technological differences upon our conception of the new reading, writing, and editing. For example, even something as apparently trivial as screen size or screen definition has clear effects: large monitors encourage hypertext and other document systems that employ multiple windows, whereas smaller viewing surfaces demand systems that rely on a more limited card or chunk metaphor. Smaller screens also tend to work best with comparatively disorienting systems that replace one window by another rather than those that add new windows to ones already open. From the point of view of someone designing an electronic edition, screen size has major effects because it has direct impact upon the coherence and usability of scholarly annotation.

Many other technological matters, including system speed, software agents, and simulation, have the power to create new forms of the scholarly edition. Simulation, which in the long run might appear directly opposed to alphanumeric text, has the potential in the short run to create new kinds of scholarly editions. Interactive graphics can produce the appearance of earlier editions either by simply reproducing color images or by using SGML to provide dynamic reconfiguration of texts that could allow readers to shift between modern typography and a more-or-less exact presentation of how text appeared in older versions.

Given the potentially major effects of even minor changes in system speed, economy of scale, and software environments, predicting the future of electronic editing seems an especially risky business. Nonetheless, certain trends and developments appear likely enough to be almost inevitable, and the most important of these involves the increasing importance of location-independent, dispersed, networked hypertext technology. Therefore, in making recommendations about the nature of annotation in the future scholarly edition, one has to confront the potential problems and difficulties presented by dispersed, virtual texts of potentially infinite size.

Before commenting specifically upon annotation, I wish to point out some of the effects of hypertext upon our conceptions of text and textuality. The dispersed textuality characteristic of this information technology calls into question some of the most basic assumptions about the nature of text, textual editing, and textual encounters by readers. The appearance of the digital word has the major cultural effect of permitting us, for the first time in centuries, to perceive easily the degree to which we have become so accustomed to the qualities and cultural effects of the book that we unconsciously transfer them to the productions of oral and manuscript cultures. We so tend to take print and print-based culture for granted that, as the jargon has it, we have "naturalized" the book by assuming that habits of mind and manners of working associated with it have naturally and inevitably always existed. Eisenstein, McLuhan, Kernan, and other students of the cultural implications of print technology have demonstrated the ways in which the printed book formed and informed our intellectual history. They point out, for example, that a great part of these cultural effects derives from book technology's creation of multiple copies of essentially the same text. Multiple copies of a fixed text in turn produce scholarship and education as we know it by permitting readers in different times and places to consult and refer to the "same" text. Historians of print technology also point out that economic factors associated with book production led to the development of both copyright and related notions of creativity and originality. My reason for once again going over this familiar ground lies in the fact that all these factors combine to make a single, singular unitary text an almost unspoken cultural ideal. They provide, in other words, the cultural model and justification for scholarly textual editing as we have known it.

Such a congeries of attitudes, economic factors, technology, and intellectual practice has produced enormous benefits. It is, after all, responsible for a good portion of what we understand by education and scholarship, but these benefits have come at a cost, though one that most of us would admit was well worth it. For example, as Peter Robinson and other scholars have recently argued, the modern print-based conception of a unitary text falsifies the experience of medieval readers, who would have read - or heard - a much less unified text marked by a wide range of variations.

As I have argued before, the dispersed text of hypertext has much in common with the way contemporary individual readers of, say, Chaucer or Dante, read texts that differed from one another in various ways. Hypertext does not, of course, come even close to reproducing either the medieval text or the medieval experience of encountering it - even assuming that one could precisely ascertain this last - but it does have two beneficial effects: First, it offers a variant upon our usual experience of reading texts, and, second, it encourages reconceptualizing the scholarly edition. This reconceptualization takes forms that at first glance seem particularly paradoxical. Book technology (or print culture), whose chief characteristic is the production of multiple copies of essentially the same text, places great importance on a unitary text as cultural and editorial model. For two centuries now scholars have sought to identify instances of single perfect texts, and although such editorial theory has profound benefits for education and scholarship, it also produces false assumptions that greatly distort our understanding of the productions of oral and manuscript cultures.

It is therefore either particularly ironic or simple poetic justice - take your pick - that digital technology so calls into question the assumptions of print-associated editorial theory that it forces us to reconceive editing texts originally produced for print as well as those created within earlier information regimes. Print technology's emphasis on the unitary text prompted the notion of a single perfect version of all texts at precisely the cultural moment that the presence of multiple print editions undercut that emphasis - something not much recognized, if at all, until the arrival of digitality. As the work of George Bornstein, Jerome J. McGann, and others has urged, any publication during an author's lifetime that in some manner received his or her approval - if only to the extent that the author later chose not to correct changes made by an editor or printer - is an authentic edition. Looking at the works of authors such as Ruskin and Yeats, who radically rewrote and rearranged their texts throughout their careers, one recognizes that the traditional scholarly edition generally makes it extremely difficult to reconstruct the version someone read at a particular date. Indeed, from one point of view it may radically distort our experience of an individual volume of poems by the very fact that it enforces an especially static, frozen model on what turns out to have been a continually shifting and changing entity.

This new conception of a more fluid, dispersed text, possibly truer than conventional editions, raises the question of whether one can have a scholarly edition at all, or whether we must settle for what Jerome J. McGann terms an archive - essentially a collection of textual fragments (or versions) from which we assemble, or have the computer assemble, any particular version that suits a certain reading strategy or scholarly question, such as "What version of Modern Painters, volume 1, did William Morris read at a particular date and how did the text he read differ from what American Ruskinians read?"
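To suggest what having the computer assemble a version might involve at its very simplest, here is a minimal sketch that picks, from an archive of dated versions, the most recent one available in a given place on a given date. The `Version` record, the function name, and above all the labels, places, and dates are hypothetical placeholders; they are not bibliographical facts about Modern Painters or any other work.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Version:
    label: str
    published: date
    place: str
    text: str


def version_available(archive, on, place):
    """Return the latest version published in `place` on or before `on`,
    or None if no version had yet appeared there."""
    candidates = [
        v for v in archive if v.place == place and v.published <= on
    ]
    return max(candidates, key=lambda v: v.published, default=None)


# Placeholder archive: invented labels and dates, for illustration only.
archive = [
    Version("A", date(1850, 1, 1), "London", "(text of A)"),
    Version("B", date(1860, 1, 1), "London", "(text of B)"),
    Version("C", date(1855, 1, 1), "New York", "(text of C)"),
]

london_reader = version_available(archive, date(1857, 6, 1), "London")
american_reader = version_available(archive, date(1857, 6, 1), "New York")
```

Publication date and place are of course crude proxies for what a particular person actually read, and deciding which records belong in the archive, and how they differ, remains editorial work that the lookup presupposes.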

One does not encounter many of these issues when producing print editions because matters of scale and economy decide or foreclose them in advance. In general, physical and economic limitations shape the nature of annotations one attaches to a print edition just as they shape the basic conception of that edition. So what can we expect to happen when these limitations disappear? Or, to phrase the question differently, what new problems and new advantages will we encounter with the digital word?

One answer lies in what hypertext does to the concept of annotation. As I have argued in Hypertext, this new information technology reconfigures not only our experience of textuality but also our conceptions of the author's relation to that text, for it inevitably produces several forms of asynchronous collaboration, the first, limited one appearing when readers choose their own ways through a branching text. A second form appears only in a fully networked hypertext environment that permits readers to add links to texts they encounter. In such environments, which are exemplified by the World Wide Web, the editor, like the author, necessarily loses a certain amount of power and control. Or, as one of my friends who created the web site for a major computer company pointed out, "If you want to play this game, you have to give up control of your own text." Although one could envision a situation in which any reader could comment upon an editor's text, a far more interesting one arises when successive editors or commentators add to what in the print environment would be an existing edition. In fact, one can envisage a situation in which readers might ultimately encounter a range of annotations.

An example taken from my recent experience with having students create an annotated version - read "edition" - of Carlyle's "Hudson's Statue" on the World Wide Web illuminates some of the issues here. I intended the assignment in part to introduce undergraduates to various electronic resources available at my university, including the on-line versions of the Oxford English Dictionary and Encyclopaedia Britannica. I wished to habituate them to using electronic reference tools accessible outside the physical precincts of the library in order both to acquaint them with these new tools and also to encourage students to move from them to those presently available only in print form. For this project students chose terms or phrases ranging from British political history ("Naseby Field," "Lord Ellenborough," and "People's League") to religion and myth ("Vishnu," "Vedas," "Loki"). They then defined or described the items chosen and briefly explained Carlyle's allusion and, where known, his uses of these items in other writings.

This simple undergraduate assignment immediately raised issues crucial to the electronic scholarly edition. First of all, the absence of limitations upon scale - or to be more accurate, the absence of the same limitations upon scale one encounters with physical editions - permits much longer, more substantial notes than might seem suitable in a print edition. To some extent a hypertext environment always reconfigures the relative status of main text and subsidiary annotation. Electronic linking makes information in a note easily available, and therefore these more substantial notes conveniently link to many more places both inside and outside the particular text under consideration than would be either possible or conveniently usable in a print edition. Taking our present example of "Hudson's Statue," for instance, we see that historical materials on, say, democratic movements like Chartism and the People's International League can shift positions in relation to the annotated text: unlike a print environment, an electronic one permits one to perceive the relation of such materials in opposite manners. The historical materials can appear as annotations to the Carlyle text, or conversely "Hudson's Statue" can appear as - be experienced as - an annotation to the historical materials. Both in other words exist in a networked textual field in which their relationship depends solely upon the reader's need and purpose.

Such recognitions of what happens to the scholarly text in wide-area-networked environments, such as those created by WWW and Hyper-G, only complicate matters by forcing us to confront the question, "What becomes of the concept and practice of scholarly annotation?" Clearly, linking by itself isn't enough, and neither is text retrieval. At first glance, it might seem that one could solve many issues of scholarly annotation in an electronic environment by using sophisticated text retrieval. In the case of my student-created annotated edition of "Hudson's Statue," one could just provide instructions to use the available search tools, though this do-it-yourself approach would probably only appeal to the already-experienced researcher. Our textual experiment quickly turned up another, more basic problem when several bright, hard-working neophytes wrote elegant notes containing accurate, clearly attributed information that nonetheless referred to the wrong person, in two cases providing material about figures from the Renaissance rather than about the lesser-known nineteenth-century people to whom Carlyle referred. What this simple-minded example suggests, of course, is nothing more radical than that for the foreseeable future scholarship will always be needed, or to phrase my point in terms relevant to the present inquiry, one cannot automate textual annotation. Text retrieval, however valuable, by itself can't do it all.

Fine, but what about hypertext? The problem, after all, with information retrieval lies in the fact that the active reader might obtain either nonsignificant information or information whose value he or she might not be able to determine. Hypertext, in contrast, can provide editorially approved connections in the form of links, which can move from a passage in the so-called main text - here "Hudson's Statue" - to other passages in the same text, explanatory materials relevant to it, and so on. Therefore, assuming that one had permission to create links to the various on-line resources, such as the OED, one could do so. If one did not have such permission, one could easily download copies of the materials from them, choose relevant sections, and put them back on-line within a web to which one had access; this second procedure is in essence the one many students chose to follow. Although providing slightly more convenience to the reader than the text-retrieval do-it-yourself model, this model still confronts the reader with problems in the form of passages (or notes) longer than he or she may wish to read.

One possible solution lies in creating multi-level or linked progressive annotation. Looking at the valuable, if overly long, essay one student had written on Carlyle and Hindu deities, I realized that a better way of proceeding lay in taking the brief concluding section on Carlyle's satiric use of these materials and making that the first text or lexia the reader encounters; the first mention of, say, Vedas or Vishnu, in that lexia was then linked to the longer essays, thereby providing conveniently accessible information on demand but not before it was required.
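A minimal sketch of such linked progressive annotation follows, assuming a hypothetical `Note` record in which each level of a note points to the next, fuller one; the titles and placeholder bodies are illustrative only and do not reproduce the students' actual notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Note:
    title: str
    body: str
    more: Optional["Note"] = None  # link to the next, fuller level, if any


def unfold(note, levels):
    """Return the notes a reader sees after asking for `levels` levels,
    starting from the brief note and following `more` links on demand."""
    shown = []
    current = note
    while current is not None and len(shown) < levels:
        shown.append(current)
        current = current.more
    return shown


# The long essay sits behind the brief note rather than in front of it.
essay = Note("Carlyle and Hindu deities", "(full student essay)")
brief = Note(
    "Carlyle's satiric use of Hindu deities",
    "(brief concluding section)",
    more=essay,
)

first_view = unfold(brief, 1)  # what every reader meets first
full_view = unfold(brief, 5)   # a reader who keeps asking for more
```

The design choice the sketch illustrates is simply one of ordering: the short note is the entry point, and the longer material is reached only when the reader asks for it.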

I have approached these questions about scholarly editions through the apparently unrelated matters of a student assignment and educational materials because they remind us that in anything like a fully linked electronic environment, all texts have variable applications and purposes. One consequence appears in the variable forms that annotation and editorial apparatus will almost certainly have to take: since everyone from the advanced scholar down to the beginning student or reader outside the setting of an educational institution might be able to read such texts, they will require various layers or levels of annotation, something particularly necessary when the ultimate linked text is not a scholarly note but another literary text.

Thus far I have written only as if the linked material in the hypertext scholarly edition consists of textual apparatus, explanatory comment, and contextualization, but by now it should have become obvious that many of those comments inevitably lead to other so-called primary texts. Thus, in our putative edition of "Hudson's Statue" one can not only link it to reference works, such as the OED, the Britannica, and (possibly in future) the Dictionary of National Biography, but also to entire linguistic corpora and to other texts by the same author, including working drafts, letters, and other publications. Why stop there? Even in the relatively flat, primitive version of hypertext offered by the present WWW the Carlylean text demands links to works upon which he draws, such as Jonathan Swift's "Tale of a Tub," and those that draw upon him, such as Ruskin's "Traffic," whose satiric image of the Goddess-of-Getting-on (or Britannia of the Market) derives rather obviously from Carlyle's ruminations on the never-completed statue of a stock swindler.

Once again, though, linking, which reconfigures our experience and expectations of the text, is not enough, for the scholarly editor must decide how to link the two texts. Once again, the need for some form of intermediary lexias seems obvious, the first, say, briefly pointing to a proposed connection between two texts, the next in sequence providing a summary of complex relations (the outline in fact of what might in the print environment have been a scholarly article or even book), the third an overview of relevant comparisons, and the last the actual full text of the other author. At each stage (or lexia), the reader should have the power not only to return to the so-called main text of "Hudson's Statue" but also to reach these linked materials out of sequence. Vannevar Bush, who invented the general notion of hypertext, thought that chains or trails of links might themselves constitute a new form of scholarly writing, and annotations in the form of such guided tours might conceivably become part of the future scholarly edition. We can be certain, however, that as constraints of scale lessen, increasing amounts of material will be summoned to illuminate individual texts and new forms of multiple annotation will develop as a way of turning availability into accessibility.
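The sequence of intermediary lexias described above, with its two requirements that the reader can always return to the main text and can reach any stage out of sequence, might be sketched as follows. The `Trail` class and its method names are hypothetical, and the stop descriptions merely paraphrase the four stages proposed in the preceding paragraph.

```python
class Trail:
    """An ordered chain of lexias leading away from a main text."""

    def __init__(self, main_text, stops):
        if not stops:
            raise ValueError("a trail needs at least one stop")
        self.main_text = main_text
        self.stops = list(stops)
        self.position = None  # None means the reader is at the main text

    def current(self):
        if self.position is None:
            return self.main_text
        return self.stops[self.position]

    def forward(self):
        """Move to the next stop in sequence, staying put at the last one."""
        if self.position is None:
            self.position = 0
        elif self.position < len(self.stops) - 1:
            self.position += 1
        return self.current()

    def jump(self, index):
        """Reach any stop directly, out of sequence."""
        if not 0 <= index < len(self.stops):
            raise IndexError("no such stop on this trail")
        self.position = index
        return self.current()

    def home(self):
        """Return to the main text from any stage."""
        self.position = None
        return self.current()


trail = Trail(
    "Hudson's Statue",
    [
        "pointer to a proposed connection",
        "summary of complex relations",
        "overview of relevant comparisons",
        "full text of the other author",
    ],
)
```

A reader might call `trail.forward()` to follow the guided sequence, `trail.jump(3)` to go straight to the other author's full text, and `trail.home()` to come back to Carlyle at any point.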