To create this chart, I used both the Wildcard and Inflection functions on Google Ngram (instructions here). Wildcard gives us the top ten most popular words that follow or precede "sodomite(s)" in all of the books published between 1850 and 1930 that Google has digitized. Meanwhile, Inflection gives us all of the derivatives of "sodomite" (just "sodomites" in this case). Unfortunately, the tool would not let me pair the two functions together, so I ended up inputting "sodomite *, sodomites *, * sodomite, * sodomites." I chose the word "sodomite" because it has maintained a relatively uniform popularity for the past two centuries – with the exception of the past three decades (please see below). I chose the 1850-1930 time range because it is roughly the same time period covered in my thesis (the first queer public protest being in 1867 and the first queer organization in America founded in 1924), encapsulating what I often refer to as "first-wave queer activism," wherein – for the first time in the Western world – queer sexual identifiers were invented for the explicit purpose of asserting a political presence. Google Ngram logically assigns the color scheme – various shades of the same color for Wildcards and Inflections of the same words. All of the lines/data points may seem overwhelming at first; however, the overlay gives us a general picture of parallel trends, and one can select individual phrases to highlight and identify specific lines.
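Since the two operators could not be combined, the query had to be spelled out by hand. For anyone repeating this with a word that has more inflections, the same string can be assembled programmatically. The following is only a minimal sketch: the helper name `wildcard_queries` is hypothetical, and the word forms are listed manually rather than retrieved from Ngram.

```python
def wildcard_queries(forms):
    """Return Ngram phrases with a wildcard after, then before, each word form."""
    following = [f"{form} *" for form in forms]   # words that follow the term
    preceding = [f"* {form}" for form in forms]   # words that precede the term
    return following + preceding


# The inflections are typed in by hand, standing in for the Inflection function.
forms = ["sodomite", "sodomites"]

# Ngram accepts several phrases in one search when they are separated by commas.
query = ", ".join(wildcard_queries(forms))
print(query)
```

The printed string can be pasted into the Ngram search box as is; the year range still has to be set separately on the page.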
Below the Ngram chart is a list of hyperlinked years and words to "Search in Google Books." To investigate the mysterious "notorious sodomite" entry (about midway through the list above), I used one of these links, which brought me to a regular Google Books search. To narrow my search results to works published in the period I was looking at in my Ngram chart (and to filter out contemporary historiographies using or quoting the word), I went under Tools and customized the time range. This new search yielded a bunch of old pseudo-historical texts – an 1855 Church of England Magazine article, an 1871 History of Romanism, an 1889 History of Latin Christianity, and a 1927 Preface to The Life and Confessions of Oscar Wilde. Those accused of being "notorious sodomites" included Xenophon (an ancient Greek philosopher), Pope Julius III, Pope Boniface, and (unexpectedly) Robbie Ross – Wilde's good friend. Ironically, Xenophon seems to have been more of an opponent of same-sex sexual activity (writing somewhat admiringly about the lack of homoeroticism in Spartan culture and going so far as to debate Socrates about the shamefulness of "homosexuality"). There is seemingly no evidence that Boniface was any sort of a queer; Julius III, meanwhile, was rumored to have had a long-standing relationship with his adopted nephew. Last but not least, the story of Oscar Wilde is well known; however, based on the Snippet View I was able to access, this descriptor, among others ("an unspeakable skunk," "habitual debaucher and corrupter of young boys" and "blackmailer"), was ascribed to Ross – an openly "gay" journalist.
Random factoids aside, these primary sources – the historical research they might prompt having been unified under an (un)common theme – illustrate one of the best aspects of Ngram. This tool is not just "exploratory for the sake of exploratory"; it encourages critical engagement and follow-up questions. I was prompted to go down a rabbit hole of information just to learn about these men and their lives. What other information can we glean from these charts when different elements interest different people? Ngram should not be reserved for the masturbatory ramblings of academics who seek to engage the digital humanities in their work; it should be utilized by any layperson who is interested in understanding how language shapes and is shaped by our contexts – temporal, cultural or otherwise. In this case, "sodomy" is associated with immorality, notoriety, and a ruined reputation (especially, it seems, in a religious context).
Whereas the popularity of "the sodomites" in the previous chart might indicate an acknowledgement of collective identity or orientation (however disparaging) by the 1930s, "the homosexual" comes off as rather individuated – with pluralized derivatives only cropping up in the lower half of the popularity count. This speaks to a larger historical theme of the homosexual identifier (actually originated by a queer activist) being appropriated by judgemental sexologists who constructed regimented, pseudo-scientific typologies and forms of inquiry that pathologized queerness. "The homosexual" is thus discussed as an isolated individual case – a deviant or an outlier subject to scrutiny; meanwhile, "sodomites" have endured for much longer as subcultural entities en masse. Unexpectedly, however, "homosexual love" was also rather common by the thirties – suggesting either that a significant amount of romantic/poetic literature was being produced by queer people themselves or that there are issues with the sample itself. Indeed, the statistical significance/generalizability of these charts (that is, the feasibility of using Google Ngram charts as evidence or to make broad claims) is questionable. The data is limited to whatever has been digitized and OCR'd by Google. How does Google go about prioritizing what gets digitized? Perhaps copyright, accessibility, donations, and popularity of certain works all play a role.
The first of this set of charts is the one I used for my thesis – comparing/contrasting the prevalence of various sexual identifiers within the span of time I was researching. However, it is difficult to discern when any great shifts in popularity occurred – just that "homosexual" really took off towards the end of the nineteenth century (having been invented in 1868). So, I created a follow-up chart (the second one) to "zoom in" on the changes. Ngram shows us that Urning (the first queer sexual identifier invented by an activist) enjoyed some significance until the early 1890s – when a lot of sexological literature was being published, co-opting "homosexual" and inventing "sexual inversion." This analysis demonstrates that Ngram can serve to confirm and/or illustrate one's previously drawn conclusions (based on traditional historical research).
Finally, I just threw these two charts together to investigate/amend other questions and issues that might arise while using Ngram. The first of this set (using the Inflection function) demonstrates the strange jump in the use of "sodomite(s)" in the past thirty years – something worth investigating and perhaps attributable to a growth in LGBT history research and/or the availability of digital/online works. This particular Ngram also demonstrates a strange (and perhaps meaningless) tendency for "sodomites" to periodically surpass "sodomite" in popularity (with a "smoothing" of 3 – as in all of these charts). Does the plurality of "sodomites" bear some sort of meaning or am I just reading too much into it? Meanwhile, the last chart is my attempt to account for multiplicity of meaning/connotation. For example, "gay" and "queer" would be difficult to test with Ngram because, up until the mid-twentieth century (or thereabouts), they could mean happy or weird, respectively, or homosexual; as such, there is no real way of measuring their popularity in a strictly sexual sense (except maybe with the Wildcard function). So, using "sexual inversion" as a kind of control variable, I tried to see whether the spike in "invert" was more likely attributable to general usage in literature about geometry, physics, etc. than to sexology. I think my results were somewhat inconclusive. Overall, I would argue that Google Ngram is a wonderful and powerful tool best suited (at present) for distant reading – which ought to inform how we conduct our close readings of historical texts.
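On the "smoothing" of 3 mentioned above: as I understand it, each plotted point is the average of that year's value and the three years on either side of it. The following is a minimal sketch of that idea, together with a small check for the years in which one line overtakes another. The function names are hypothetical, the handling of the first and last years (a shorter window) is an assumption, and the numbers are invented for illustration rather than taken from Ngram.

```python
def smooth(values, smoothing=3):
    """Average each value with up to `smoothing` neighbours on either side."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - smoothing)                 # window is cut short at the start
        hi = min(len(values), i + smoothing + 1)   # and at the end (an assumption)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def crossovers(years, a, b):
    """Years in which series a rises above series b after being at or below it."""
    return [
        years[i]
        for i in range(1, len(years))
        if a[i] > b[i] and a[i - 1] <= b[i - 1]
    ]


# Invented frequencies, for illustration only.
years = [1900, 1901, 1902, 1903, 1904, 1905, 1906]
plural = [2.0, 2.1, 2.0, 9.0, 2.0, 2.1, 2.0]    # one sharp single-year spike
singular = [3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]

print(crossovers(years, plural, singular))           # crossings in the raw series
print(crossovers(years, smooth(plural), singular))   # crossings after smoothing
```

Comparing the two printed lists shows how smoothing shifts and reshapes a single-year spike, which is one reason the apparent alternation between "sodomite" and "sodomites" might not mean very much.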