Beautiful Data Natural Language Corpus and Code

N-grams and code from Dr. Peter Norvig's chapter in Beautiful Data (2009), edited by Segaran and Hammerbacher. The data files are derived from the Google Web Trillion Word Corpus, which is distributed by the Linguistic Data Consortium.

Additional Info

Field         Value
Source        http://norvig.com/ngrams/
Last Updated  October 10, 2013, 19:54 (UTC)
Created       October 10, 2011, 02:58 (UTC)