Typicality in the US Novel

Erik Fredner, Mark Algee-Hewitt; Jul 18, 2020

The talk below was originally scheduled to be given as a lightning talk at DH2020. We have adapted it here for the new virtual conference. You can follow along with our slides here.

For a discipline committed to rejecting reductionism, literary studies relies on typicality more than it would care to admit. For example, [SLIDE] Fredric Jameson describes an “unexpected” change in a character’s life in the novel Demos (1886) as something that would [SLIDE] “normally generate a properly Utopian narrative” (Jameson 184, emphasis added). Of course, the subversion of his expectation is what interests Jameson. Yet so much literary criticism seeks to explain subverted expectations that critics tend to ignore an equally fascinating question: What exactly is being subverted in such moments? Our project uses computational methods in an attempt to turn away from these transgressions and toward expectations.

We are not the first to attempt to study literary typicality using computational methods. Indeed, typicality has already been a question in computational literary studies for decades. In an essay on Josephine Miles’s computational reading experiments from the 1950s, Brad Pasanek quotes an objection made by a reviewer to Miles’s work: [SLIDE] “‘If Quarles is “most typical,” in any way at all, of this set of poets,’ which includes Donne, John Milton, and John Dryden, ‘I am immediately convinced that typicality is not a fruitful thing to investigate’” (Pasanek 369). Here, computational typicality and readerly typicality diverge. But we can reject Miles’s reviewer’s conclusion that typicality is not fruitful to investigate precisely because that reviewer had such strong priors about what typifies Donne, Milton, Dryden, and Quarles that they felt confident declaring Miles’s method fruitless “immediately.”

Our project consists of a series of experimental provocations that analyze literary typicality using many different features. For this brief talk, I will be presenting just one of the experiments we have conducted. We do not claim that any one of our methodologies operationalizes literary typicality as such. Rather, each provides new evidence about the works and features that might typify authors and historical periods. Our argument is, in essence, a rejection of Miles’s critic: typicality is a fruitful thing to investigate in part because literary criticism has rather conspicuously refused to do it.

That said, quantitative typicality is a different judgment than qualitative typicality. Heather Brink-Roby cites an analogy from nineteenth-century zoology that clarifies this point: In a letter to Charles Darwin, George Robert Waterhouse distinguishes between two ways in which zoologists identified the typical species of a group. [SLIDE] The first is Jameson’s method, whereby the critic identifies a trope as typical of Utopian fiction. A quantitative approach, by contrast, would look to a corpus of Utopian novels and ask which, if any, displays what Waterhouse calls [SLIDE] “the greatest number of [characteristics] most common to the species…in the best balanced condition.” The naivete of this latter approach is part of our intervention.

The experiment we’re sharing today is one of our simplest: Using Gale’s corpus of American fiction, which contains more than 18,000 American texts published between 1774 and 1920, we identified the 2,000 most frequent nouns across the whole corpus using Apache OpenNLP. Then, we used the scaled frequencies of each of those nouns in each text as our feature set. We compared the texts using three measures of similarity in this high-dimensional feature space. [SLIDE] Computationally, the goal is to identify which texts are least unlike the others in the corpus. Finally, we used t-distributed stochastic neighbor embedding (t-SNE) to project these high-dimensional relationships into a low-dimensional space for visualization. [SLIDE]
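For readers who want to see the shape of this pipeline, here is a minimal sketch in Python. It is not our actual code. It assumes that nouns have already been tagged and counted per text, that "scaled frequency" means a noun's share of all nouns in a text, and that the three measures are cosine, Euclidean, and Manhattan distance; the talk does not name the measures, so those are stand-ins. The toy corpus, its titles, and every function name are hypothetical, and the t-SNE step is left out.

```python
import math
from collections import Counter

TOP_N = 4  # the talk uses the 2,000 most frequent nouns; 4 suits this toy corpus

def scaled_frequencies(noun_counts, vocab):
    """Share of each vocabulary noun among all nouns in one text (an assumed scaling)."""
    total = sum(noun_counts.values()) or 1
    return [noun_counts.get(noun, 0) / total for noun in vocab]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def least_unlike(vectors, distance):
    """Return the title whose mean distance to every other text is smallest."""
    def mean_distance(title):
        others = [vec for other, vec in vectors.items() if other != title]
        return sum(distance(vectors[title], vec) for vec in others) / len(others)
    return min(vectors, key=mean_distance)

# Hypothetical noun counts standing in for tagged output from a real corpus.
toy_corpus = {
    "sea_tale": Counter({"ship": 8, "captain": 5, "man": 3, "house": 1}),
    "domestic": Counter({"house": 7, "mother": 6, "man": 3, "ship": 1}),
    "middling": Counter({"man": 5, "house": 4, "ship": 4, "mother": 2, "captain": 2}),
}

# Vocabulary: the most frequent nouns across the whole corpus.
corpus_totals = Counter()
for counts in toy_corpus.values():
    corpus_totals.update(counts)
vocab = [noun for noun, _ in corpus_totals.most_common(TOP_N)]

vectors = {title: scaled_frequencies(counts, vocab)
           for title, counts in toy_corpus.items()}

measures = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
}
results = {name: least_unlike(vectors, fn) for name, fn in measures.items()}
print(results)
```

In this toy corpus all three measures agree on the same text. On a real corpus they need not, which is one reason to compare several measures rather than trust any single one.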

Here, we find clusters that correspond with known subjects of nineteenth-century American novels. For instance, [SLIDE] we have here a clear grouping of seafaring tales and desert island adventures. Another cluster [SLIDE] contains works of historical fiction, including the Arthurian medievalism that Twain parodies in Connecticut Yankee.

However, one novel from 1912 is closest to the mean and the median noun distribution in our corpus: The Dragon’s Daughter by Clyde C. Westover. Generically, it combines tropes from the American Western with an Orientalized San Francisco Chinatown, opium smuggling, and the mob. Tellingly, The Dragon’s Daughter was also adapted into a 1919 film, The Tong Man. Its cinematic potential by the standards of early Hollywood is clear from the first page [SLIDE].
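One way to read "closest to the mean and the median noun distribution" is sketched below. It assumes that the mean and median are taken noun by noun across all texts and that closeness is Euclidean distance; both are assumptions made for illustration rather than a record of our exact procedure. The titles and feature values are hypothetical.

```python
import math
from statistics import mean, median

def center(vectors, reducer):
    """Collapse each noun dimension across all texts with `reducer` (mean or median)."""
    return [reducer(column) for column in zip(*vectors.values())]

def closest_to(vectors, target):
    """Return the title whose feature vector lies nearest to `target`."""
    return min(vectors, key=lambda title: math.dist(vectors[title], target))

# Hypothetical scaled noun frequencies for four texts over three nouns.
features = {
    "text_a": [0.40, 0.05, 0.15],
    "text_b": [0.05, 0.40, 0.20],
    "text_c": [0.20, 0.25, 0.25],
    "text_d": [0.30, 0.10, 0.30],
}

mean_vector = center(features, mean)
median_vector = center(features, median)

nearest_mean = closest_to(features, mean_vector)
nearest_median = closest_to(features, median_vector)
print(nearest_mean, nearest_median)
```

Note that neither center needs to correspond to any actual text. The "most typical" novel in this sense is simply the real text that happens to sit nearest to a point no novel occupies.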

The question this finding raises falls outside of the bounds of this brief presentation, but lies at the heart of our study: How typical is the average novel? Is our Westover like Miles’s reviewer’s Quarles? Or is The Dragon’s Daughter more characteristic of American fiction than we might care to admit?

Works Cited

Jameson, Fredric. The Political Unconscious: Narrative as a Socially Symbolic Act. Routledge, 2013.
Pasanek, Brad. “Extreme Reading: Josephine Miles and the Scale of the Pre-Digital Digital Humanities.” ELH, vol. 86, no. 2, June 2019, pp. 355–85. Project MUSE, doi:10.1353/elh.2019.0018.
Shklovsky, Víktor. The Novel: An Anthology of Criticism and Theory 1900-2000. Edited by Dorothy J. Hale, John Wiley & Sons, 2005.
Westover, Clyde C. The Dragon’s Daughter. Neale Publishing Company, 1912.