Learning with Big Data: The Future of Education

Viktor Mayer-Schönberger & Kenneth Cukier:

A decade ago, as a 22-year-old grad student, Luis von Ahn helped create something called CAPTCHAs: squiggly text that people have to type into websites in order to sign up for things like free email. Doing so proves that they are humans and not spambots. An upgraded version (called reCAPTCHA) that von Ahn sold to Google had people type distorted text that wasn’t just invented for the purpose, but came from Google’s book-scanning project: words that a computer couldn’t decipher. It was a beautiful way to serve two goals with a single piece of data: registering for things online and deciphering words at the same time. Since then, von Ahn, a professor at Carnegie Mellon University, has looked for other “twofers,” ways to get people to supply bits of data that can serve two purposes. He devised one in a startup that he launched in 2012 called Duolingo. The site and smartphone app help people learn foreign languages, something he can empathize with, having learned English as a young child in Guatemala. But the instruction happens in a very clever way.

The company has people translate texts a few phrases at a time, or evaluate and fix other people’s translations. Instead of presenting invented phrases, as is typical for translation software, Duolingo presents real sentences from documents that need translation, for which the company gets paid. After enough students have independently translated or verified a particular phrase, the system accepts it and compiles all the discrete sentences into a complete document. Among its customers are media companies such as CNN and BuzzFeed, which use it to translate their content for foreign markets. Like reCAPTCHA, Duolingo is a delightful “twin-win”: students get free foreign language instruction while producing something of economic value in return.
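
To make the acceptance step concrete, here is a minimal Python sketch of how "enough students independently agree" might work. It is an illustration only, not Duolingo's actual system: the function names, the agreement threshold of 3, and the simple case-and-whitespace normalization are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical threshold: the source does not say how many matching
# submissions count as "enough", so 3 is an assumption.
MIN_AGREEMENT = 3


def accept_translation(submissions, min_agreement=MIN_AGREEMENT):
    """Return the most common translation if enough students agree, else None."""
    if not submissions:
        return None
    # Simplification: lowercase and trim so trivial differences still count
    # as agreement. A real system would need a smarter comparison.
    counts = Counter(s.strip().lower() for s in submissions)
    best, votes = counts.most_common(1)[0]
    return best if votes >= min_agreement else None


def compile_document(sentence_submissions, min_agreement=MIN_AGREEMENT):
    """Join accepted sentences in order; return None while any is unresolved."""
    accepted = [accept_translation(subs, min_agreement)
                for subs in sentence_submissions]
    if any(a is None for a in accepted):
        return None
    return " ".join(accepted)
```

Under these assumptions, a document is only assembled once every sentence has cleared the threshold; a single sentence without enough agreement keeps the whole document pending.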