Dynamic word embeddings for evolving semantic discovery

Consider the trajectory of ‘apple’: in 1994 it’s most closely associated with fruits; by 2000 you can see dietary associations shifting, with ‘apple’ now linked to the less healthy ‘cake,’ ‘tart,’ and ‘cream.’ From 2005 through 2016, though, the word is strongly associated with Apple the company, and you can even see the associations with Apple itself changing over time, from ‘iTunes’ to Google, Microsoft, Samsung et al.

Likewise ‘amazon’ moves from a river to the company Amazon, and ‘Obama’ moves from his pre-presidential roles to the presidency, as does ‘Trump.’

These embeddings are learned from articles in The New York Times between 1990 and 2016. The results are really interesting (we’ll see more fun things you can do with them shortly), but you might be wondering why this is hard to do. Why not simply divide up the articles in the corpus (e.g., by year), learn word embeddings for each partition (which we know how to do), and then compare them?
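To make that naive baseline concrete, here’s a minimal sketch of it (assuming gensim 4.x; `articles_by_year`, a mapping from year to tokenized sentences, is a made-up placeholder for the partitioned corpus):

```python
# Naive baseline: train one independent word2vec model per time slice.
# `articles_by_year` is a hypothetical dict: year -> list of tokenized sentences.
from gensim.models import Word2Vec

def train_yearly_embeddings(articles_by_year, dim=100):
    models = {}
    for year, sentences in sorted(articles_by_year.items()):
        # Each model is trained in isolation: nothing ties the coordinate
        # systems of different years to one another.
        models[year] = Word2Vec(
            sentences, vector_size=dim, window=5, min_count=5, workers=4
        )
    return models

# e.g. compare models[1994].wv.most_similar('apple')
#      with models[2016].wv.most_similar('apple')
```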

What makes this complicated is that when you learn an embedding for a word in one time window (e.g., ‘bank’), there’s no guarantee it will line up with the embedding learned in another time window, even if the word’s meaning hasn’t changed across the two. The training objective only constrains the relative geometry of the vectors, not their absolute coordinates, so each training run is free to pick its own orientation of the space. The meaning of ‘bank’ in 1990 and 1995 could be substantially the same, and yet the learned embeddings might not be. This is known as the alignment problem.
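A quick way to see why: a word2vec-style objective depends only on inner products between word and context vectors, so rotating both sets of vectors by the same orthogonal matrix leaves it unchanged. Here’s a toy numpy demonstration (the embedding matrix and word index are made up for illustration): rotating an embedding preserves every pairwise similarity, yet the two vectors for the same word look nothing alike.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(1000, 50))   # stand-in embedding: 1000 words, 50 dims

# Build a random orthogonal matrix via QR and rotate the whole space.
Q, _ = np.linalg.qr(rng.normal(size=(50, 50)))
W2 = W1 @ Q                        # a second, equally 'good' embedding

bank = 42                          # hypothetical index of the word 'bank'
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Every pairwise inner product -- all the structure training cares about --
# is preserved exactly...
print(np.allclose(W1 @ W1.T, W2 @ W2.T))   # True
# ...but the two vectors for the 'same' word are nearly orthogonal.
print(cos(W1[bank], W2[bank]))             # typically close to 0
```

Approaches that train each time slice separately usually repair this after the fact, for example with an orthogonal Procrustes alignment between consecutive windows; as we’ll see, this paper instead learns the embeddings for all time slices jointly, so the slices come out aligned by construction.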