Generating Interesting Stories

John Ohno:

The problem of generating interesting long-form text (whether fiction or non-fiction) is a problem of information density: people do not like to be told things they already know (or can guess), particularly at length, nor do they generally find the strain of interpreting content that’s too informationally dense interesting for long. There’s a relatively narrow window of novelty that a piece of text must stay inside for most people to put up with it (and when we go outside that window, there are often motivations other than interest: we may be daring ourselves to put up with a difficult text out of masochism or pride, or we may need to learn something that isn’t explained in a more accessible way elsewhere). This pattern repeats at multiple levels: not only must we be careful with the novelty of our content, but we must also hold interest with a particular ratio of familiar to unfamiliar words, variation in sentence length and structure, and even changes in tone. Few human writers can balance all these things successfully; those who can are considered geniuses. So, can a machine?
Historically, the best-performing text generators have depended heavily on framing: in some traditions of writing (for instance, modernist or postmodern prose, or symbolist poetry) there is an expectation that the work itself will remain vague and that the reader will put more effort into determining how to interpret it, even on an object level. Putting aside the fact that general audiences often do not want to do this much work (particularly for an unproven reward), these generators often have an underlying pattern to their output that becomes distractingly noticeable at the scale of tens of thousands of words. In other words, on different levels of structure, they are simultaneously too novel and not novel enough.
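
To make the "levels" of novelty a little more concrete, here is a rough sketch of how one might put numbers on three of them: the share of unfamiliar words, the variation in sentence length, and the amount of verbatim repetition that gives away an underlying pattern. None of these measures come from the essay itself; the function names, the crude tokenization, and the idea of passing in a set of "familiar" words are all assumptions made for illustration, and real readability or novelty metrics would need to be considerably more careful.

```python
import re
from collections import Counter
from statistics import pstdev


def words(text):
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z'’]+", text.lower())


def sentences(text):
    # Crude sentence splitter (an assumption): break on ., ! and ?.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def unfamiliar_ratio(text, familiar):
    # Fraction of words not in a caller-supplied set of "familiar" words.
    ws = words(text)
    if not ws:
        return 0.0
    return sum(1 for w in ws if w not in familiar) / len(ws)


def sentence_length_spread(text):
    # Population standard deviation of sentence lengths, counted in words.
    lengths = [len(words(s)) for s in sentences(text)]
    return pstdev(lengths) if lengths else 0.0


def repeated_ngram_fraction(text, n=3):
    # Fraction of word n-grams that occur more than once in the text.
    # A high value at large scale hints at a visible underlying pattern.
    ws = words(text)
    grams = [tuple(ws[i:i + n]) for i in range(len(ws) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    return sum(c for c in counts.values() if c > 1) / len(grams)
```

A generator of the kind described above might score high on `unfamiliar_ratio` (too novel at the level of words) while also scoring high on `repeated_ngram_fraction` over a long sample (not novel enough at the level of structure). Where the acceptable window sits for any of these numbers is not something this sketch can tell you.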