Read the Web

This data includes facts extracted from 500 million web pages.

The goal is to build a never-ending machine learning system that acquires the ability to extract structured information from unstructured web pages. If successful, this will result in a knowledge base (i.e., a relational database) of structured information that mirrors the content of the Web. We call this system NELL (Never-Ending Language Learner).

Data and Resources

Twitter feed

All beliefs (TSV)

Extraction Patterns (TSV)
Tab-separated-value file with the textual extraction patterns learned by CPL...

Heatmap of learning activity (HTML)

RGB Heat map of learning activity
RGB Heat map of learning activity for each predicate, broken down by learning...
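
As a rough illustration of working with the tab-separated resources listed above, the following Python sketch reads a TSV file with a header row into a list of dictionaries. The column names and sample rows are hypothetical, made up for this example; the actual column layout of the NELL files is not described on this page and should be checked against the downloaded file.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated file with a header row into a list of dicts."""
    # QUOTE_NONE treats quote characters as ordinary text, which is safer
    # for text extracted from web pages that may contain stray quotes.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader]


# Hypothetical sample: these column names are assumptions, not the real schema.
SAMPLE = (
    "entity\trelation\tvalue\n"
    "pittsburgh\tcitylocatedinstate\tpennsylvania\n"
    "cmu\tuniversityincity\tpittsburgh\n"
)

rows = read_tsv(io.StringIO(SAMPLE))
for row in rows:
    print(row["entity"], row["relation"], row["value"])
```

To read a downloaded file instead of the in-memory sample, the same function can be given an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the file was saved. For a large file, iterating over the reader row by row rather than building a list avoids holding everything in memory.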

Additional Info

Field	Value
Source	http://rtw.ml.cmu.edu/rtw/resources
Author	Carnegie Mellon University
Last Updated	October 10, 2013, 23:23 (UTC)
Created	June 26, 2011, 10:39 (UTC)