Popularity/Prestige: A New Canon

J.D. Porter; Oct 29, 2018

Shortly after the Lab released my recent pamphlet on the structure of the literary canon, New York magazine ran an article about the 21st century canon, in which a panel of judges picks an early version of the literary canon from the century so far.[1] The structure of their canon is a list of approximately 100 books published (mostly) since 2000, but I wondered how it would fare under the terms laid out in the pamphlet. How does the New York canon look when it comes to popularity and prestige?

The basic idea behind the pamphlet is that literature becomes canonical in a variety of ways, and a structure like a list can't always capture the complexity of the canon as we actually encounter it. The two metrics I picked to remedy this are designed to show us an arrangement of the canon based on things that academic scholars write about (prestige, at least one version of it) and things that many people know about (popularity). This helps us to understand how, say, Gertrude Stein (very prestigious, less popular), Stephen King (very popular, not as prestigious), and Jane Austen (both) relate to each other within our broader notions of canonicity. But what can it tell us about, say, Zadie Smith, W.G. Sebald, and Roberto Bolaño?

I think my version of popularity applies pretty smoothly to the New York list: I use the number of ratings each author has on Goodreads, basically an index of how many users have interacted with the author at all. Prestige is much trickier. In the pamphlet I use the number of academic articles that feature each writer as a primary subject author (i.e., they're tagged as being highly featured in the article), according to the Modern Language Association's database of literary scholarship. I imagine this is not the kind of prestige the New York panelists care about. They're writers, editors, and critics rather than academics, so they probably aren't deeply invested in the goings-on at scholarly journals. It's also not clear that MLA scores can tell us much about recent work; academia moves notoriously slowly, and it can take two years or more to get from a paper idea to a publication, so any book published after about 2016 will have had very few chances to appear as the subject of an academic journal article. In this case, though, I think these caveats make things more interesting, because they give us a chance to measure one kind of canon (made by people more or less in the contemporary writing business) in terms of another kind (a slower-moving and more academic one).

There's one other major difference between the New York canon and the one I use in the pamphlet. They chose books; I worked with authors. Since I was trying to cover a wide variety of genres over a very long time span, authors made life a lot easier; Shakespeare, Dickinson, and Hurston struck me as somewhat more comparable to each other than Hamlet, "Because I Could Not Stop for Death", and Their Eyes Were Watching God. During the research phase, though, I did try the method on individual books, which led to some interesting depictions of individual authors, each of whom has a characteristic canonical bookprint. For instance, figure 1 shows Joyce, Austen, and Dickens (note that the axes are logarithmically scaled).

Figure 1

Joyce is extremely prestigious, albeit with most of the criticism about him allocated to one book. Austen is substantially less prestigious, but much more popular; modern Goodreads users still love Pride and Prejudice (even in comparison with today's novels). Dickens as he appears here has achieved Austen-like canonicity, but only in the top half of his output; he also has an entire Austen-sized corpus with distinctly less-than-Austen results. As an author, he has greater total prestige than Austen, in part because he produced so many novels for scholars to analyze; but on the level of individual books, her works tend to surpass his.

After all of that preface, let's get to the results for the *New York* canon. We can start with Figure 2, which uses non-logarithmic axes in order to make the outliers clear.

Figure 2

Readers are very familiar with Gillian Flynn's Gone Girl, which is beating every other book depicted here by over one million ratings (none of the others even has one million ratings). Meanwhile, W.G. Sebald's Austerlitz is, by a pretty sizable margin, the most prestigious book, having netted 258 academic articles.[2] The clear difference between the two speaks to the usefulness of the graph. If you took a negative view of these choices, you might say Gone Girl can't make the canon because it's a flash in the pan (no one takes it seriously) or Sebald can't because he's obscure (you can't be canonical if most people have never heard of you). Yet they're clearly both quite successful, along their particular tracks. Cormac McCarthy's The Road splits the difference---a book well known enough to, say, become a successful movie, but also by a Famous Important Writer.

Turning to a logarithmic depiction, in Figure 3, gives us a clearer picture of the overall structure of this canon.

Figure 3

The gray lines here reflect the median values for both prestige and popularity. It's immediately clear that most of these books have received fairly little academic attention, even when we think of them as pretty prestigious. Leaving the Atocha Station, for instance, is highly regarded (it was widely praised and won some awards), yet I have personally met 25% of the people who have published MLA-recognized articles about it (his name is Alexander Manshel; he's a Lit Lab member).[3] In fact, 35 of the books listed, or a little over a third, have no MLA articles at all. That means (and for me this rings intuitively true) that getting any scholarly attention is a very strong positive signal for books published in the last 18 years.
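For readers who want to try the quadrant logic themselves, here is a minimal sketch of how the median lines could sort books into quadrants. It assumes prestige (MLA articles) runs north-south and popularity (Goodreads ratings) runs east-west, and that a book sitting exactly on a median counts as below it; the post does not specify how ties are handled. The function name `assign_quadrants` and the three "Placeholder" entries are invented for illustration, so the medians here are not the real ones from Figure 3.

```python
from statistics import median

def assign_quadrants(books):
    """Map each title to a quadrant, given (goodreads_ratings, mla_articles).

    Assumption: values strictly above the median count as north/east;
    values at or below it count as south/west.
    """
    pop_median = median(pop for pop, _ in books.values())
    pres_median = median(pres for _, pres in books.values())
    quadrants = {}
    for title, (pop, pres) in books.items():
        north_south = "north" if pres > pres_median else "south"
        east_west = "east" if pop > pop_median else "west"
        quadrants[title] = north_south + east_west
    return quadrants

toy_books = {
    "The Last Samurai": (4105, 1),     # figures quoted in the post
    "Wizard of the Crow": (2106, 46),  # figures quoted in the post
    "Placeholder A": (500000, 120),    # invented values, illustration only
    "Placeholder B": (900000, 0),      # invented values, illustration only
    "Placeholder C": (30000, 10),      # invented values, illustration only
}

for title, quadrant in assign_quadrants(toy_books).items():
    print(f"{title}: {quadrant}")
```

A book with zero MLA articles has no finite position on a logarithmic axis, so any plot built this way needs a separate convention for those cases; the sketch only classifies and leaves plotting aside.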

This is in part a function of time, for the reasons mentioned above. As Figure 4 shows us, the MLA number is highly contingent on the book's publication date.

Figure 4

In fact, this effect is even more pronounced than I expected: The big drop-off starts ten years ago. This points to another major (and, I suppose, obvious) difference between academic thinking about literature and that of an institution like *New York* magazine. In an English department, most people work on old literature; you'd virtually never see a department that had a majority of its faculty working on the contemporary. New York, by contrast, probably does not employ any medievalists. Over time, this means that academics are more likely to keep talking about the same things over and over, with a rich-get-richer effect. Whoever the Ian McEwan of 1895 was, New York is unlikely to mention him much these days. Meanwhile Hamlet has accrued 2,169 articles since 2000, which is 399 more than everything in Figure 4 combined.

There's a conspicuous absence in the graphs so far, however, and addressing it helps to close that gap quite a bit. The New York editors included several book series in their list, like Elena Ferrante's Neapolitan novels or N.K. Jemisin's Broken Earth trilogy. I didn't include those in the images above, because I wasn't quite sure how to show them. Here's the original, non-logarithmic image, with every novel from each series included.

Figure 5

In prestige terms, Margaret Atwood's Oryx and Crake (part of her MaddAddam trilogy) is the big addition, coming in third overall. But an even more striking story is clearly Harry Potter. I'll admit that when I first heard about the New York article, Harry Potter was my first thought---inspired, most likely, by J.K. Rowling's incredible popularity in the data for the pamphlet. The Potter series here has 18.2 million Goodreads ratings; the entire non-Potter corpus has 7.7 million. Some of that is recency; as I note in the pamphlet, the best-sellers of 1918 are mostly forgotten today, so popularity in 2018 might be equally ephemeral, if we're trying to project forward. But the first Potter novel is actually the oldest book on this list (it came out about a decade before Goodreads existed), and it has 5.6 million ratings by itself. The numbers just confirm what we all know---these books were the major popular literary event of the century so far.
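To put those ratings in proportion, here is a quick back-of-the-envelope calculation using only the rounded figures quoted above (in millions of Goodreads ratings). The variable names are mine, and because the inputs are rounded, the outputs are approximate.

```python
# Rounded figures from the post, in millions of Goodreads ratings.
potter_ratings = 18.2
non_potter_ratings = 7.7
first_potter_novel = 5.6

# Share of all ratings on the list that belong to the Potter series.
potter_share = potter_ratings / (potter_ratings + non_potter_ratings)

# How many times larger the Potter total is than everything else combined.
potter_to_rest = potter_ratings / non_potter_ratings

# The first Potter novel alone, relative to the entire non-Potter corpus.
first_novel_to_rest = first_potter_novel / non_potter_ratings

print(f"Potter share of all ratings: {potter_share:.0%}")
print(f"Potter vs. everything else: {potter_to_rest:.1f}x")
print(f"First Potter novel vs. everything else: {first_novel_to_rest:.0%}")
```

On these numbers the series accounts for roughly seven in ten ratings on the whole list, and the first novel by itself has nearly three quarters as many ratings as all the non-Potter books combined.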

Oddly enough, the New York panel almost didn't include them. They divided their list into four tiers: Best Book of the Century (So far); 12 New Classics; The High Canon (books chosen by at least two panelists); and The Rest. The first tier only contained one book, Helen DeWitt's The Last Samurai (no relation), which is actually in the southwest, least canonical quadrant here, with 4,105 Goodreads reviews and 1 MLA article. I'll admit that I found their choice of that novel pretty surprising; maybe this is an index of some pretty substantial institutional differences in literary consecration. Still, it's worth remembering that making a list like this is often more an exercise in canon creation than in canon reflection. I think in DH and other empirical literary critical fields, we often think of the canon as something out there in the world for us to study, which it is; but as in many other fields, observation changes the object. For the New York panel, this way of thinking about things was probably pretty paramount. You don't necessarily pick Ghachar Ghochar because you think it has had a significant impact on (English language) literary culture; you pick it because you want it to.

Perhaps as a result of motivations like these, the Harry Potter series made the last tier, meaning, I suppose, that only one person picked it. Could this be a prestige problem? Perhaps; I recall presenting my earlier Pop/Pres data for a French audience, and most of them scoffed at the prospect that any French academic would ever write about someone like Rowling (perhaps she should have tried professional wrestling). Yet Figure 5 shows that she's right up there with all sorts of prestigious novels; the first Potter novel is beating The Corrections and three books by J.M. Coetzee.

Of course, Harry Potter and the Philosopher's/Sorcerer's Stone came out in 1997, early for this canon; but there's another factor to consider, too, the one that kept me from including series in the original images.[4] The Potter titles sum to 137 MLA articles, which is respectable, but fewer than Austerlitz, The Road, or Oryx and Crake. If you just look up "Harry Potter", though (with Rowling in another field, so we're not getting just any Harrys or Potters), you get 775 MLA articles.[5] In other words, more people are writing about the Harry Potter series than about any one of the novels (or about any of the rest of these books) by a substantial margin. It's a bit like Sherlock Holmes; scholars might write about The Hound of the Baskervilles or The Speckled Band, but often they just write about Holmes himself. The literary contribution is not well captured by any particular publication, which makes it difficult to depict on a graph.

Nonetheless, in prestige as in popularity, Rowling is crushing it. Figure 6 is an attempt to depict each series as a single collective point; it receives the sum of the Goodreads ratings for each constituent book, plus whichever MLA score is higher between A) summing the individual volumes or B) looking up the series as a whole (using both would risk double counting articles). This isn't quite fair to the other books on the list; after all, it's surely easier to amass Goodreads ratings from your fans if you give them multiple books to rate. Nonetheless, the results are pretty stark. Seen this way, the MaddAddam trilogy is approximately as canonical as The Road, and the Harry Potter series is by far the most canonical thing on the graph---in the pamphlet, I came to think of that space in the top right corner as the Shakespeare Position. In this company, Shakespeare is clearly J.K. Rowling.

Figure 6
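The rule for collapsing a series into a single point can be sketched in a few lines. This is an illustration of the rule as described above, not the script used to make Figure 6; the function name `series_point` is hypothetical. Since the post gives only totals for Harry Potter (18.2 million ratings, 137 articles across the individual titles, 775 for the series-level lookup), the example passes each total as a one-item list rather than a per-volume breakdown.

```python
def series_point(volume_ratings, volume_mla_counts, series_mla_count):
    """Collapse a series into one (popularity, prestige) point.

    Popularity: sum of Goodreads ratings across the volumes.
    Prestige: the larger of (a) the summed per-volume MLA counts and
    (b) the series-level MLA count. Taking the maximum rather than the
    sum avoids double counting articles that match both lookups.
    """
    popularity = sum(volume_ratings)
    prestige = max(sum(volume_mla_counts), series_mla_count)
    return popularity, prestige

# Harry Potter, using the totals quoted in the post.
potter_popularity, potter_prestige = series_point(
    volume_ratings=[18_200_000],  # series total; per-volume counts not given
    volume_mla_counts=[137],      # sum across the individual titles
    series_mla_count=775,         # "Harry Potter" looked up as a whole
)
print(potter_popularity, potter_prestige)
```

For Harry Potter the series-level lookup wins, so the point lands at 775 on the prestige axis. For a series whose volumes are usually discussed one at a time, the summed per-volume count would win instead.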

If you're like me, the fun part of all this is speculating about how this canon will age. What will the literary scholars of 2118 make of this period, or this list? I have my non-quantitative opinions, of course, often about things that didn't make the New York list.[6] I also don't think this way of measuring canonicity can offer much of a negative signal, especially given the time constraints. Helen Oyeyemi and Valeria Luiselli seem like strong candidates for the canon to me, but their work is too recent to be sure (that's not just a problem with these graphs, of course---recent work is always more difficult to evaluate for the longue durée). Moreover, many metrics are just not captured here, or not capturable. I find myself thinking about Marlon James's A Brief History of Seven Killings pretty often, and it's not quite in the northeastern, most canonical quadrant yet; I understood the most recent U.S. Open entirely differently, in real time, because of Claudia Rankine's Citizen---how do you measure an effect like that?[7]

Still, I think we can take some positive signals from what we see here. Austerlitz is in great shape; this accords pretty well with my sense of Sebald's uptake among academics. It may also be noteworthy that he made this list in spite of writing in another language; there are only a few such cases here. That could be a good sign for 2666; add Bolaño's global reach to the information here, and his work looks formidable. McCarthy's Road is doing so well that I have to imagine it will persist for a while. For me, though, since I'm unfamiliar with the series, the *Oryx and Crake* results were the most surprising. In a way, Atwood has already begun to stand the test of time; The Handmaid's Tale came out more than thirty years ago, and it clearly still has an impact today. Atwood is well positioned to stay put in the northeast quadrant.

And of course there's Harry Potter; I think those novels are already in, for the same reason Sherlock Holmes made it one hundred years earlier. At a certain point, you're so popular that people can't avoid talking about you, even if only to try and understand your popularity. If you look back at Figure 3, the New York selections are concentrated in two places: in the northeast, canonical quadrant, and scattered along the "no MLA articles" axis at the bottom. Much more than in the broad literary canon depicted in the pamphlet, the recent canon unites popularity and prestige; to get one, you really need the other. In a few cases I think various kinds of prestige probably led the charge (e.g., with Ngũgĩ wa Thiong'o's Wizard of the Crow, which has only 2,106 Goodreads ratings, but has already amassed 46 MLA articles). Typically, though, it appears that readerly attention is something of a precursor for critical attention; when something is widely read (and it's clear that reading happens much faster than academically analyzing), scholars are more apt to take notice. Harry Potter is a kind of apotheosis of that principle. We can't know who else will make the canonical library, but when they arrive, Hermione will have gotten there first.

Notes

¹ Note that the link is to Vulture rather than to New York per se (the two are connected in some corporate structure or another). It's tempting to call this the "Vulture Canon", but I first encountered it in the print version of *New York* magazine, and I like the way "New York" suggests the world of professional writers/critics/editors in question, so with some regrets I'll use "New York Canon" in this post.

² It's a little more difficult to look up books than authors, since titles often consist of common words (e.g., Ali Smith's How to Be Both). Minor differences also matter more at this scale than they did in the pamphlet---being off by two articles is more significant for a book with 10, in comparison with an author who has 6,000. My method was to start with the title in the Primary Subject Work field, and the author's last name in a general field. I also tried titles in the original language where applicable, and in a few cases titles in the general field. As a rule I tried to give a book the maximum plausible number of articles I could find. I spot-checked the results; I think they will generally hold up, but it's always possible I missed something.

³ You can read Manshel's article here.

⁴ *The Corrections* came out in 2001, and one of the Coetzee novels (Boyhood) is also from 1997, so Rowling doesn't have *that* much of a head start over those two.

⁵ Readers of the pamphlet may notice that this is about 50% more articles than Rowling had as a whole in that data. That's because that data was a few years old; the information in this post was collected in October 2018.

⁶ Here's a self-indulgent footnote: I'd have included Harryette Mullen's Sleeping with the Dictionary, César Aira's An Episode in the Life of a Landscape Painter, Alison Bechdel's Fun Home, Toni Morrison's A Mercy, and Cixin Liu's The Three-Body Problem trilogy.

⁷ In general, non-novel genres suffered here, probably because they don't attain their peak audiences or critical attention through the book format. Frederick Seidel and Fred Moten are trapped in the least canonical quadrant, but that doesn't mean much about their poems. And for my money Anne Carson seems destined for enduring canonicity, but The Beauty of the Husband doesn't quite capture what she's up to.