Supporting word learning with language-internal distributional statistics: A place for the recurrent neural network language model?