Text Network and Corpus Analysis of the Unabomber Manifesto

Introduction

The Unabomber Manifesto, Industrial Society and Its Future, was sent to major newspapers in 1995, with an accompanying promise from its author, Ted Kaczynski, to stop exploding things if someone printed the 35,000-word text in full. The New York Times and the Washington Post obliged in September of that year. The manifesto became a major clue in the hunt for the Unabomber, but only a few forensic linguists concluded that Kaczynski, a suspect at the time, had written it. The majority failed to see a connection between the manifesto and other writings by Kaczynski (these are the same people, I can only guess, who remain skeptical about who wrote Romeo and Juliet). In the end, none of it mattered anyway. Evidence found in Kaczynski’s cabin was far more damning than forensic linguistic analyses of the manifesto.

The Manifesto

You expect the manifesto of a domestic terrorist to be insane. Kaczynski is not your average domestic terrorist. A former Berkeley professor of mathematics with a Michigan PhD, Kaczynski could have feasibly published the essay with a legitimate press or magazine and gained a wide academic audience had he not retreated into the woods and his own head. The manifesto is a real argument that, minus its calls for violence, could have been inserted into a legitimate discourse, albeit one that would have resulted in criticism coming Ted’s way.

Ostensibly, the manifesto is a strong critique of contemporary techno-capitalist society. However, if you took a knife to the text and divided it into little passages, you would discover that half of them bend far leftward and could be read aloud without protest in Harvard Yard, while the other half bend far rightward and could only be read aloud without protest at Hillsdale College.

So, there are passages such as this one, which would send heads nodding in every humanities department in America:

The Industrial Revolution and its consequences have been a disaster for the human race. They have greatly increased the life-expectancy of those of us who live in “advanced” countries, but they have destabilized society, have made life unfulfilling, have subjected human beings to indignities, have led to widespread psychological suffering (in the Third World to physical suffering as well) and have inflicted severe damage on the natural world.

Then comes this curveball:

One of the most widespread manifestations of the craziness of our world is leftism, so a discussion of the psychology of leftism can serve as an introduction to the discussion of the problems of modern society in general.

Like many on the left, Kaczynski blames technology and The System for the sad state of the earth and its inhabitants, yet he suggests that the contemporary left (the “oversocialized” left, as Ted puts it) is in fact The System’s most malformed, though logical, outgrowth.

At first, I couldn’t recognize the motive behind the manifesto. Its politics seemed too conflicted. Then I noticed a brief mention in Kaczynski’s Wikipedia article that ties him to the anarcho-primitivist tradition, and suddenly the text became more philosophically cohesive.

The Manifesto’s Motive

There are two types of anarcho-primitivists: the Rousseau types and the Hobbes types (my own ad hoc terms). The former are human-centric and collectivist. They believe that dismantling techno-capitalist society will usher in an era of equality and harmony between men and women of all races. The latter are earth-centric and individualistic. They believe that dismantling techno-capitalist society will put a halt to overpopulation and environmental degradation, and allow individuals to live more spiritually and physically fulfilled lives.

The goals aren’t mutually exclusive, but nor are they necessarily aligned. (When it comes to immigration, they are outright opposed.) The Hobbesian primitivists tend to believe that nature, for all its beauty and desirability, isn’t a progressive utopia. Who are these Hobbesians? They are the Monkey Wrench Gang radicals, the Edward Abbeys and Doug Peacocks of the environmental movement, the Garrett Hardins of ecology, the survivalists, the Timothy Treadwells, the (typically) men who love nature more than humanity but harbor no romanticism about either. Kaczynski would have gotten along well in the Monkey Wrench Gang, who held no love for humans or community or society in the aggregate because, to them, human communities are precisely the problem.

Let’s put these categorizations aside for now and look to the text of the manifesto itself. A text network analysis and an analysis with the Natural Language Toolkit (NLTK) can provide us with grounded data about Kaczynski’s motives as they appear in his manifesto. The motives of all authors—or at least their traces—are always left behind in the lexical choices of their texts. Deliberate, written language is like a rhetorical fingerprint.

Text Network Analysis

As I’ve discussed in other posts, a text network analysis proceeds in the following way: a text is copied into a .txt file; it is imported into some analytic tool (I use AutoMap) in order to remove stop words and to lightly stem the text; then, using the same tool, the text, which has now been expunged of all but significant content words, is run through an algorithm that treats the content words like a network and creates a co-occurrence list in .csv format. What words are connected to what other words, and how often? (In this analysis, I used a two-word gap and a five-word gap.) The .csv file is then opened in a network analysis tool (I use Gephi) in order to visualize these semantic connections. Each word is visualized as a node in the network, and words that appear next to each other (again, within a certain word gap) are joined by edges.
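The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the tool actually used for this post: the stop list, the function names, and the reading of "word gap" as "the next `gap` content words" are all assumptions, and stemming is left out. The Source/Target/Weight header follows the column names commonly used for edge lists imported into Gephi.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "its", "for", "have", "been"}

def content_words(text):
    """Lowercase the text, split it into words, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def cooccurrence_edges(words, gap=2):
    """Count unordered pairs of distinct words that fall within `gap` positions."""
    edges = Counter()
    for i, source in enumerate(words):
        # Assumption: a "gap" of n pairs each word with the next n content words.
        for target in words[i + 1 : i + 1 + gap]:
            if source != target:
                edges[tuple(sorted((source, target)))] += 1
    return edges

def edges_to_csv(edges):
    """Serialize the edge counts as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()

sample = "The Industrial Revolution and its consequences have been a disaster for the human race."
words = content_words(sample)
edges = cooccurrence_edges(words, gap=2)
print(edges_to_csv(edges))
```

Widening `gap` from 2 to 5 links each word to more of its neighbors, which is why the two settings produce networks of different density.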

The two most important network visualizations, in my opinion, show nodes with the highest levels of Betweenness Centrality and the highest levels of Degree Centrality. The latter measures how many total connections a node has to other individual nodes; so, a node with high degree centrality will simply be connected to many other nodes. The former measures how often a node lies on the shortest paths between other pairs of nodes; so, a node with high betweenness centrality will in essence be an important ‘passageway’ between communities within the network. (Here’s an excellent visual description of the concepts.)

In a textual network, a word with high degree centrality is a word used in connection with myriad other words. This simply tells you that a word is used frequently in a text and in a variety of contexts. A word with high betweenness centrality is a word that sits on the routes between other words, linking community clusters that would otherwise be only loosely connected. This tells you that a word is not only used in many contexts but also that it ties together words that do a lot of semantic work elsewhere in the text. A word with high betweenness centrality is a word through which many meanings in a text circulate.

For example, as you see below, psychological has a high degree centrality in the Unabomber Manifesto but not a high betweenness centrality. This lexical item was therefore used frequently and connected to many different words, such as:

psychological techniques

psychological methods

However, the words to which psychological is connected (techniques and methods) do not themselves perform a lot of semantic work elsewhere in the text. Words like psychological are essentially productive creators of bigrams but not pathways of meaning.

Society, on the other hand, not only has a high degree centrality but also a high betweenness centrality. So, the words that it connects to also have further connections and thus do perform semantic work elsewhere in the text.
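To make the difference between the two measures concrete, here is a small sketch in plain Python. The edges are invented for illustration and are not taken from the manifesto's actual network; the function names are hypothetical as well. Degree is counted as the number of distinct neighbors, and betweenness is computed in its standard graph-theoretic form (the share of shortest paths passing through a node, via Brandes' algorithm), without normalization, so the raw numbers will not match what a tool like Gephi reports after scaling.

```python
from collections import deque

def build_graph(edge_pairs):
    """Turn (word, word) pairs into an undirected adjacency map."""
    graph = {}
    for a, b in edge_pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degree_centrality(graph):
    """Degree centrality here is simply the number of distinct neighbors."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {node: [] for node in graph}
        paths = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        paths[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating each node's share of paths.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted from both endpoints, so halve the totals.
    return {node: value / 2 for node, value in score.items()}

# Invented toy edges: a tight cluster around "psychological", with "society" as the bridge.
toy_edges = [
    ("psychological", "techniques"), ("psychological", "methods"),
    ("techniques", "methods"), ("psychological", "society"),
    ("techniques", "society"), ("methods", "society"),
    ("society", "system"), ("society", "people"), ("people", "freedom"),
]
graph = build_graph(toy_edges)
degree = degree_centrality(graph)
betweenness = betweenness_centrality(graph)
for word in sorted(graph, key=lambda w: -betweenness[w]):
    print(f"{word:15} degree={degree[word]} betweenness={betweenness[word]}")
```

In this toy network "psychological" has three neighbors but no shortest path needs to pass through it, while every route from the cluster to "system", "people", or "freedom" runs through "society".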

Here are the text network visualizations:

Nodes with the highest Degree Centrality in the manifesto

Nodes with the highest Betweenness Centrality in the manifesto

The text is long, so its network is messy. In the 5-word gap network, the manifesto had over 200 separate meaning clusters. In the 2-word gap network (seen above), it still had over 150 clusters.

Social, society, people, and human are the words with the highest levels of degree centrality in Kaczynski’s manifesto. Also visible in this network are technology and its derivations, psychological, system, freedom, physical, power, leftist, and modern.

Social, society, and people are the words with the highest levels of betweenness centrality in Kaczynski’s manifesto. Also visible in the network are human, problems, system, change, and natural.

As I mentioned earlier, most commentary on the Unabomber manifesto focuses on a) its attack on technology, and b) its attack on leftism. However, as these text networks demonstrate, the words that do the most semantic work in the text—the words through which most meanings flow—suggest that Kaczynski’s sights were set on society as a whole—its people, its systems. Three other words with relatively many connections—psychological, power, and freedom—further suggest that the ostensible screed against leftism and technology masks a deeper motive that circulates in a diffuse, though nonetheless salient way throughout the text. And in the light of Kaczynski’s possible connection to an anarcho-primitivist tradition, these particularly noticeable nodes make much more sense than they would if we tried to paint him as a madman or, worse, a bitter, conservative academic. If he were only that, we might expect other terms to be more noticeable in the network (e.g., the various derivations of leftism).

One thing a text network does, beyond providing an interesting visualization, is to point the researcher in the direction of terms and n-grams that might be explored more granularly in a corpus analysis tool, such as the NLTK. It provides a map of a text’s semantic circulation, a map that can be followed when we return to the world of pure textuality.

Corpus Analysis

Here is a raw count of the most frequent words in the manifesto:

[Figure: line chart of the most frequent words in the manifesto]
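A raw count like this one takes only a few lines to produce. The sketch below uses Python's built-in `Counter` rather than the NLTK itself (NLTK's `FreqDist` exposes the same `most_common` method); the stop list and the function name are hypothetical, and whether stop words were removed before counting for the chart is an assumption. The sample is the passage on leftism quoted earlier in this post.

```python
import re
from collections import Counter

# Hypothetical, minimal stop list for the example sentence only.
STOP_WORDS = {"the", "of", "a", "an", "to", "in", "is", "so", "as", "can", "our", "one", "most"}

def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words with their raw counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return counts.most_common(n)

sample = (
    "One of the most widespread manifestations of the craziness of our world is leftism, "
    "so a discussion of the psychology of leftism can serve as an introduction to the "
    "discussion of the problems of modern society in general."
)
for word, count in top_words(sample, n=5):
    print(word, count)
```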

Certain words weren’t visually important nodes in the text network but were nonetheless used frequently (e.g., goal/s, individual/s, process, industrial, way, work, man, behavior, control); these words were deployed often but in conjunction with a limited number of other terms. Nevertheless, the 20 most frequent words signify a dual emphasis that makes sense if Kaczynski is a certain kind of primitivist: there is the left-wing emphasis on the ills of society, the system, technology, and control; but there is also the right-wing emphasis on individuals and freedom.

The NLTK can also generate a dispersion plot, which shows where in a text individual words fall. Here is a dispersion plot of the 10 most frequent words:

[Figure: dispersion plot of the 10 most frequent words in the manifesto]
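The data behind a dispersion plot is just the list of positions at which each word occurs. In the NLTK, the drawing itself is handled by the `dispersion_plot` method of a `Text` object; the sketch below only computes the underlying offsets in plain Python, plus a coarse count per slice of the text. The token stream is an invented toy, not the manifesto, and the function names are hypothetical.

```python
def word_offsets(tokens, targets):
    """Map each target word to the token positions where it occurs."""
    offsets = {word: [] for word in targets}
    for position, token in enumerate(tokens):
        if token in offsets:
            offsets[token].append(position)
    return offsets

def binned_counts(positions, total_tokens, bins=10):
    """Count occurrences in `bins` equal slices of the text, from start to end."""
    counts = [0] * bins
    for position in positions:
        # Clamp to the last bin so the final token never falls out of range.
        counts[min(position * bins // total_tokens, bins - 1)] += 1
    return counts

# Invented toy token stream: "leftist" appears only near the start and the end.
tokens = ["leftist", "society"] * 2 + ["society", "system"] * 6 + ["leftist", "society"] * 2

offsets = word_offsets(tokens, ["leftist", "society"])
for word, positions in offsets.items():
    print(word, binned_counts(positions, len(tokens), bins=5))
```

A word whose counts pile up in the first and last bins is concentrated at the edges of the text, while a word with roughly equal counts per bin is spread evenly throughout.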

A striking pattern emerges. Although much has been made of the manifesto’s condemnation of the left, the dispersion plot demonstrates that anti-leftism is not a continuous theme in the text but rather forms the bookends: the manifesto opens and closes with references to leftists, but the bulk of the text does not mention them at all. The focus is elsewhere.

The dispersion of technology and technological provides another striking pattern. More than a third of the text passes before Kaczynski begins to deploy these words in earnest, even though a surface reading of the text leaves the reader with the impression that technological anxieties anchor every aspect of the manifesto.

But compare the dispersion of these supposedly central terms—leftist/s, technology and technological—with the dispersion of other terms in the list. Society, system, people, power, human, and, to a lesser extent, modern all have much more uniform dispersions throughout the manifesto. In other words, these concepts appear more regularly and consistently in each of the manifesto’s 232 numbered paragraphs, and that is precisely what we should expect if Kaczynski is indeed a primitivist who loves nature more than humanity. His ire is most obviously directed at leftists, but more subtly, the motivated energy of his manifesto is pointed in all directions at all society in its malformed, destructive development.

9 thoughts on “Text Network and Corpus Analysis of the Unabomber Manifesto”

  5. I am very impressed with your post. The visualized data is especially well formed, so I would like to attempt this kind of research your way. I was wondering if you could teach me how to make the .csv file for network analysis. Sorry to disturb you. Thank you in advance.

  8. Excellent post!

    I’m curious, given the bookended treatment of the anti-leftist argument, is it possible this was a conscious approach simply to draw attention from a group most likely to embrace the manifesto? That is, possibly it’s less “genuine” or inherent to his main arguments (which you synthesize so wonderfully here), but more a complementary tactic?

    Also, do you know if any stylistic analysis was or has been done on the manifesto AND his other known writings? (Or do any other works relevant/related to the manifesto exist?) May also be interesting to expand the corpus to include the primitivist works you mention.

    • That’s an interesting possibility, that the bookended treatment of the anti-left rhetoric was designed to pull people in who would be pre-disposed to listen. Here, we’d have to ask who Kaczynski believed to be his intended audience, or his ideal audience anyway. Really, it boils down to the question: Who, if anyone, are manifestos designed to persuade; who, if anyone, was Teddy trying to reach? But, yeah, I think he had an idea of whom he would alienate and whom he would appeal to by foregrounding “leftism” at the beginning and end. (Really, it’s anti-political correctness more than anything; leftism is just Ted’s synonym for PC, so his screed would theoretically appeal to people as different as Richard Dawkins and Rush Limbaugh.)

      Re: other work on the manifesto. Most work done on it comes from forensic linguistics. This was, in many ways, the first big public case for the field. The questions asked there are stylistic, orthographic, statistical. I don’t know that much work has been done on the rhetorical end. Kaczynski’s writings are collected in “Technological Slavery,” but to my knowledge, very little work has been done on it. He’s one of those characters who doesn’t fit neatly into any single political narrative, so he more or less has been flushed down the memory hole.
