The basic question of authorship attribution, who wrote what, is one we’re increasingly able to answer with computers that perform stylistic analysis. But this process also raises new questions about the role of the author: not about whether it’s the author or the reader who makes meaning from a text, but about what it means to write something at all.
In some naive algorithmic approaches that know nothing of the author (or, indeed, of history, ideology, or critical traditions), authorship emerges powerfully from the sea of texts as a set of shared patterns. That is, an artificial intelligence, or AI, successfully “recognizes” an author not as a person, but as a recurring set of features that characterize a body of work. To find patterns across texts, the algorithmic “reader” uses a collection of textual traits, such as frequently used words or punctuation, to draw conclusions about who wrote what.
Here’s how it works. In a relatively coherent set of texts by a single author, a writer’s idiosyncratic linguistic choices leave a mark analogous to a fingerprint. In order to recognize that fingerprint, you need a reasonably large, reasonably similar set of texts to compare with the one you’re curious about, and the one you’re curious about should be long enough to exhibit the fingerprint patterns.1 Methods that use style to discover who authored an anonymous piece of writing—or to validate who really wrote it—rely on text alone to draw probable conclusions.2
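To make the fingerprint idea concrete, here is a minimal sketch in Python. It is not the method of any particular study: the feature list, the function names (`profile`, `cosine`, `attribute`), and the use of cosine similarity are all assumptions chosen for illustration. It counts how often a handful of common words and punctuation marks appear in each text, then asks which candidate author's profile sits closest to that of the unknown text.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a few common function words and punctuation marks.
# Real studies typically use far more features and far longer texts.
FEATURES = ["the", "of", "and", "to", "a", "in", "that", "it", "but", "not", ",", ";", "!"]


def profile(text):
    """Return the relative frequency of each feature among all tokens in text."""
    tokens = re.findall(r"[a-z']+|[,;!]", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return [counts[f] / total for f in FEATURES]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attribute(unknown, candidates):
    """Compare an unknown text with each candidate's known writing.

    candidates maps an author label to a list of texts by that author.
    Returns the best-matching label and the similarity score for every label.
    """
    target = profile(unknown)
    scores = {
        author: cosine(target, profile(" ".join(texts)))
        for author, texts in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Toy data, invented for this example and much too short for a real analysis.
candidates = {
    "A": ["the cat of the house and the dog of the yard"],
    "B": ["but it is not that, not that; but it is not!"],
}
unknown = "the bird of the tree and the fish of the sea"

best, scores = attribute(unknown, candidates)
print(best)  # prints A for this toy data
```

The sketch relies on text alone, as the methods described above do: it knows nothing about who the candidates are, only how often they reach for certain small words and marks. With texts this short the result means little, which is why real attribution needs a long questioned text and a sizable comparison set.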