On Chomsky and the Two Cultures of Statistical Learning


I take Chomsky’s points to be the following:

Statistical language models have had engineering success, but that is irrelevant to science.

Accurately modeling linguistic facts is just butterfly collecting; what matters in science (and specifically linguistics) is the underlying principles.

Statistical models are incomprehensible; they provide no insight.

Statistical models may provide an accurate simulation of some phenomena, but the simulation is done completely the wrong way: people don’t decide what the third word of a sentence should be by consulting a probability table keyed on the previous two words; rather, they map from an internal semantic form to a syntactic tree-structure, which is then linearized into words. This is done without any probability or statistics.

Statistical models have been proven incapable of learning language; therefore language must be innate, so why are these statistical modelers wasting their time on the wrong enterprise?
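
To make the phrase "a probability table keyed on the previous two words" concrete, here is a minimal sketch of such a table in Python. The toy corpus and the function names `build_trigram_table` and `next_word_probs` are assumptions made up for illustration; they are not drawn from Chomsky's remarks or from any particular system.

```python
from collections import Counter, defaultdict


def build_trigram_table(words):
    """Count how often each word follows each pair of preceding words."""
    table = defaultdict(Counter)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        table[(w1, w2)][w3] += 1
    return table


def next_word_probs(table, w1, w2):
    """Return relative frequencies of the words seen after the pair (w1, w2)."""
    counts = table.get((w1, w2), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # this pair of words never occurred in the corpus
    return {word: count / total for word, count in counts.items()}


# Toy corpus, invented for illustration only.
corpus = "the dog chased the cat and the dog chased the ball".split()
table = build_trigram_table(corpus)
probs = next_word_probs(table, "chased", "the")
```

In this toy corpus, "chased the" is followed once by "cat" and once by "ball", so `probs` gives each of them a relative frequency of 0.5. A pair of words that never occurs yields an empty table entry.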