As applied to writing studies, text network analysis is a method by which a researcher can trace the circulation of meaning within a text.
Meaning is generated when different pieces of information are related to one another in some way. Information is relational, however, only according to some system or ontology. For instance, the string of phonetic sounds [k æ t] relies for its meaning on the system of English phonology (as well as the English lexicon). Within this system, when my vocal tract strings together the different sounds [k], [æ], and aspirated [t], I generate meaning because the different sounds relate to and work with one another to create the English word, cat. On their own, those sounds are not necessarily meaningful; they become meaningful in relation to one another within a specific system. Any text will draw upon many systems to create meaning: syntactic, graphemic, cultural . . .
Meaning circulates within individual texts, but individual texts circulate among other texts and within communities and cultures. So, a larger concern with meaning circulation is not satisfied with analyses of individual texts. However, any study of meaning circulation within larger networks must take the individual text as the starting point (or the ending point, depending on how you approach the question). The inter-textual network does not terminate at the individual text; it simply changes scale, exiting the exterior network and entering the network of the text.
Using AutoMap and Gephi, and following a methodology similar to the one described here, I created a network of all the lexical connections within the first 10 chapters of Vladimir Nabokov’s Lolita. (View the upcoming videos in full screen; otherwise, you can’t see the nodes I’m talking about.) There are different ways to visualize these connections as a text network, but the results here show which words possess the highest levels of betweenness centrality. The higher a word’s betweenness centrality, the larger its node. These are not the words with the most connections; they are the words that most often lie on the shortest paths between other words, and so connect the most different clusters. That tells us that these terms are used in many different contexts in the text and are therefore the most fluid in meaning.
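For readers who want to see the mechanics, here is a minimal sketch of the two steps just described: linking words that occur near one another, then scoring each word by betweenness centrality. It is not the AutoMap and Gephi workflow itself. The sliding-window rule, the `window` parameter, the function names, and the toy phrase are all assumptions made for illustration, and the scoring uses Brandes' algorithm on an unweighted, undirected graph.

```python
from collections import defaultdict, deque


def build_cooccurrence_graph(tokens, window=2):
    """Link each token to the tokens that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def betweenness_centrality(graph):
    """Unnormalized betweenness (Brandes' algorithm), unweighted and undirected."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        preds = {node: [] for node in graph}
        n_paths = {node: 0 for node in graph}
        dist = {node: -1 for node in graph}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every undirected pair was counted once from each end.
    return {node: value / 2 for node, value in score.items()}


# Toy input (hypothetical, not taken from the novel).
tokens = "the old house by the sea".split()
graph = build_cooccurrence_graph(tokens, window=1)
scores = betweenness_centrality(graph)
print(max(scores, key=scores.get))
```

In this toy phrase the bridging word is the one that both closes the loop of neighboring words and attaches the outlying word to it, which is the same "conduit" role the large nodes play in the visualization.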
The results also allow us to trace all the possible connections from one word to any other, both within individual meaning clusters and through terms with a high level of betweenness centrality. For example, the terms ‘girl’ and ‘night’ have a relatively high betweenness centrality, and they are both connected to one another through the word ‘touched’, which itself is not connected to very many clusters and thus has a low betweenness centrality.
night –> touched –> girl
(Lots of pervy pathways of meaning in Lolita.)
Visualizing all the connections in this textual network is messy. Nabokov was a master stylist, not one to use the same words too often, and certainly not in the same sentence or in the same connective pattern. The average path length in the text is 7.95. Average path length measures how many steps you need to take on average to connect two randomly selected nodes. The lower the average path length, the more connected the text. At 7.95, the first 10 chapters in Lolita are not very connected; there are 221 separate meaning clusters. Here’s the messy initial network . . .
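To make the two measurements above concrete, here is a small sketch that computes an average path length and counts separate clusters on a toy graph. Two assumptions are mine, not the post's: the average is taken only over pairs of nodes that can actually reach one another, and a "cluster" is treated as a connected component. The function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Number of steps from `source` to every node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs of nodes."""
    total = pairs = 0
    for source in graph:
        for target, steps in bfs_distances(graph, source).items():
            if target != source:
                total += steps
                pairs += 1
    return total / pairs if pairs else 0.0


def count_components(graph):
    """Number of groups of nodes with no path between the groups."""
    seen = set()
    count = 0
    for node in graph:
        if node not in seen:
            seen.update(bfs_distances(graph, node))
            count += 1
    return count


# Toy graph: a three-word chain plus a separate two-word pair.
toy = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b"},
    "x": {"y"},
    "y": {"x"},
}
print(average_path_length(toy), count_components(toy))
```

A text that reuses words across many contexts would pull the first number down and the second toward one.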
Using Gephi’s degree range tool, I hid the most disconnected nodes, thereby ‘cleaning’ the visualization of all but the most prominent clusters and connections.
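As a rough illustration of what that kind of filtering does, the sketch below keeps only the nodes with at least a given number of connections and drops every edge that pointed to a removed node. This is a stand-in written for this post, not Gephi's own implementation; `filter_by_degree`, the threshold, and the toy graph are hypothetical.

```python
def filter_by_degree(graph, min_degree):
    """Keep nodes whose degree in the original graph is at least `min_degree`."""
    keep = {node for node, neighbors in graph.items() if len(neighbors) >= min_degree}
    # Drop edges that pointed to nodes we removed.
    return {node: graph[node] & keep for node in keep}


# Toy graph: a hub, two words linked to it and to each other, and one stray word.
toy = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
}
cleaned = filter_by_degree(toy, min_degree=2)
print(sorted(cleaned))
```

Note that the filter is applied once, using each node's degree in the unfiltered network, so a node that survives may end up with fewer visible connections than the threshold.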
With this cleaner network, I could see a few distinct clusters, as well as those terms with high degrees of betweenness centrality, the words that act as conduits between different words and meaning clusters. They were what you’d expect: meaning in Lolita circulates through the favorite words of an enamored pederast. Nymphet, night, girl, age, eyes, hair . . .
More interesting than the overall network, however, were the various paths I found between different terms. In general, three or fewer degrees of separation in a social network indicate a possibility of cross-influence between two nodes; in our textual network, two nodes separated by three steps or fewer indicate a possible, latent relationship between the nodes, perhaps even a relationship that can be expressed in terms of influence.
For example, ‘nymphet’ led backward to ‘annabel’, which had a direct path to ‘lolita’ in one direction and to ‘death’ in the other direction.
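Tracing a pathway like this one amounts to a shortest-path search. The sketch below runs a breadth-first search over a tiny hand-made graph containing only the handful of connections named so far in this post; it is not the full network, and the function names and the three-step threshold check are assumptions for illustration.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path as a list of words, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        word = queue.popleft()
        if word == goal:
            path = []
            while word is not None:
                path.append(word)
                word = previous[word]
            return path[::-1]
        for neighbor in graph.get(word, ()):
            if neighbor not in previous:
                previous[neighbor] = word
                queue.append(neighbor)
    return None


def within_three_steps(graph, a, b):
    """True if a and b are joined by a path of three steps or fewer."""
    path = shortest_path(graph, a, b)
    return path is not None and len(path) - 1 <= 3


# Hand-made subset of edges mentioned in the post (not the real data set).
edges = [
    ("night", "touched"),
    ("touched", "girl"),
    ("nymphet", "annabel"),
    ("annabel", "lolita"),
    ("annabel", "death"),
]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

print(shortest_path(graph, "nymphet", "death"))
print(within_three_steps(graph, "lolita", "death"))
```

Because this subset leaves out almost all of the network, a missing path here says nothing about the novel; it only shows how the search reports words that cannot be reached from one another.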
Remember, this textual network only represents the first 10 chapters of the novel (I didn’t include the fake preface). And yet, already built into this network of lexemes from early in the novel is a clue to Humbert’s eventual demise, a great example of the intimate connection between form and function, style and plot.
Another interesting pathway was the path between ‘life’ and ‘death’. Actually, there were two pathways, one leading through ‘felt’ and another leading, oddly enough, through ‘love’ and then ‘father’.
The ‘father’, ‘love’, ‘death’ triangle is quite interesting . . . and, of course, ‘death’ leads back through ‘felt’ to ‘annabel’, the first nymphet in Humbert’s life.
Finally, two important terms in the network are quite disconnected: ‘nymphet’ and ‘girl’. Which is exactly what we should expect. Humbert goes to great lengths to separate the one from the other, and textually, it’s difficult to trace a lexemic path from one to the other. (Note: at the end of this video, I highlight the word ‘fruit’, which is only connected to ‘table’ and ‘set’. Nabokov apparently declines to use any sort of forbidden fruit metaphor during the first 10 chapters of the novel; ‘fruit’ never connects to the pervy words or meaning clusters.)
Even this short analysis has given me some interesting things I could discuss if I were actually writing a dissertation on Lolita. The meaning circulation of ‘lolita’, ‘annabel’, and ‘death’ through the conduit ‘nymphet’ would be worth analyzing in more detail, especially considering that this circulation occurs so early in the novel.