'The Hidden Dictionary' at DH 2018

Mark Algee-Hewitt will be presenting "The Hidden Dictionary" at DH 2018 in Mexico City. The introduction to the abstract is as follows:

Introduction

While written works are often encountered by readers as linear phenomena, one of the most important conceptual advances offered by the Digital Humanities is the way that computational text analysis has permitted researchers to find non-linear patterns that speak to organizational principles embedded in even a single text. The methods developed to parse thousands, or millions, of texts can, in the context of a single work, reveal connections and patterns that are unavailable to a human reader.

Even in reference books, whose alphabetic order discourages the same kind of linearity found in novels, digital methods have proven effective at revealing alternative ordering principles. This has been particularly important in eighteenth-century studies. For example, recent digital work on the French Encyclopédie has sought to assess the compatibility of the multiple ways in which the text was organized by its authors. In their 2002 article, Gilles Blanchard and Mark Olsen measure the knowledge domains described by Diderot in his introduction by counting the number of renvois, or “see alsos,” between articles in each domain. Similarly, Heuser, Algee-Hewitt and Bender have reconstructed the French Encyclopédie based on which articles are connected by renvois. In both cases, an alternative structure emerges: one that speaks to connections between domains of knowledge that are more meaningful than the alphabetic layout would suggest.

In this project, I employ a similar set of methodologies to explore the other foundational linguistic reference book of the eighteenth century, Samuel Johnson’s 1755 Dictionary of the English Language. While it lacks a system of renvois to counter-balance the prevailing alphabetic order it shares with Diderot’s work, it nevertheless contains a hidden system of connections between seemingly disparate articles, whose organization can only be revealed through quantitative analysis: the quotations used in the definitions of each word. These quotations are what separate Johnson’s dictionary from other, earlier dictionaries. In providing a contextual basis for assessing meaning, Johnson grounds definitions in historical usage and contingent situations. Yet, by Johnson’s own definition, the quotations have an educational and referential purpose that remains implicit within their use. And, by sheer volume, their presence is the most notable aspect of the dictionary. A given page of the 1775 second edition of the text defines 17 words using 52 quotations. The typographical imbalance between the definitions and the quotations, which overwhelm the page, is striking, even though it should come as no surprise to users of the OED, the spiritual successor to Johnson’s Dictionary.

This project, therefore, seeks to answer three questions. First, who is cited in what contexts in the Dictionary? Here, a quantitative methodology should allow for unprecedented access to the fine-grained details of the text. Second, if Johnson’s Dictionary were rearranged to group together articles connected by shared quotations, what patterns of relationship would emerge? And finally, how does Johnson use his quotations to reflect back on the works that he cites?