- Indie Map
  The IndieWeb is a people-focused alternative to the "corporate" web. Participants use their own personal web sites to post, reply, share, organize events and RSVP, and interact...
- A Week in the Life of a Browser - Version 2
  Mirror of the datasets from the MozillaLabs TestPilot project: A Week in the Life of a Browser - Version 2. Test name: A Week in the Life of a Browser – Version 2. Test...
- Information Society Household Statistics
  Statistics of Information Society Household Statistics from www.statcentral.ie under the theme People and Society - Information Society from the Central Statistics Office (CSO)...
- Regional GDP
  Statistics of Regional GDP from www.statcentral.ie under the theme Economy - National Accounts from the Central Statistics Office (CSO). Classifications: Region (NUTS2, NUTS3)...
- Information Society Enterprise Statistics
  Statistics of Information Society Enterprise Statistics from www.statcentral.ie under the theme People and Society - Information Society from the Central Statistics Office...
- Communications Use by Business
  Statistics of Communications Use by Business from www.statcentral.ie under the theme People and Society - Information Society from the Commission for Communications Regulation...
- Electronic Communications Market
  Statistics of Electronic Communications Market from www.statcentral.ie under the theme People and Society - Information Society from the Commission for Communications...
- Residential Access to Technology
  Statistics of Residential Access to Technology from www.statcentral.ie under the theme People and Society - Information Society from the Commission for Communications...
- STAC
  A dataset to describe security mechanisms: security protocols, security tools, cryptographic concepts (encryption algorithms, hash functions, key management, etc.) in various...
- Identi.ca
  Identi.ca is a microblogging service. Users post short (140 character) notices which are broadcast to their friends and fans using the Web, RSS, or instant messages....
- dotnetdotcom
  We invite you to help us share the content of the internet by downloading the first part of our index. It has roughly 600,000 pages and is shared in an easy to parse text...
- A corpus of web crawl data composed of 5 billion web pages.
  A corpus of web crawl data composed of 5 billion web pages. This data set is freely available on Amazon S3 at s3://aws-publicdatasets/common-crawl/crawl-002/ and formatted in...
- The ClueWeb09 Dataset
  The ClueWeb09 dataset was created to support research on information retrieval and related human language technologies. It consists of about 1 billion web pages in ten languages...