47 datasets found

  • Neurocommons text mining pilot

    About: The complete dataset is composed of a set of smaller datasets. Each download is in one of two formats: (1) WARC or (2) tar.gz. You can read about the WARC format by...
  • NeuroCommons

    From the website: The NeuroCommons project seeks to make all scientific research materials - research articles, annotations, data, physical materials - as available and as...
  • The Mondial Database

    From home page: The MONDIAL database has been compiled from geographical Web data sources listed below: CIA World Factbook, a predecessor of Global Statistics which has been...
  • MeSH titles

    Data exposed: Extracted from the 2007 Medline baseline distribution. Size of dump and data set: 670 MB. Notes: contact Medline for terms of use.
  • MeSH pairs

    Data exposed: NLM 2007 MeSH descriptor/qualifier pairs. Size of dump and data set: 13 MB. Openness: OPEN. See http://www.nlm.nih.gov/mesh/termscon.html (basically attribution with...
  • MeSH headings

    About: Data exposed: List of all associations of MeSH headings to papers indexed by Medline, extracted from the 2007 Medline baseline distribution. Size of dump and data set: 758 MB...
  • Linked ISO 3166-2 Data

    About: Linked ISO 3166-2 Data. ISO 3166-2 gives codes for countries and their principal subdivisions. Openness: Published under CC0. (Where is this specified?)
  • Homologene

    Data exposed: what? Size of dump and data set: 626 KB. Notes: NCBI Copyright and Disclaimers.
  • Historical Events Markup Language

    Title: Historical Event Markup Language. Description: The Historical Event Markup and Linking Project (Heml) provides an XML schema for historical events and a Java Web app which...
  • GO annotations from National Center for Biotechnology Information (NCBI) and ...

    Data exposed: GO annotations from the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). Size of dump and data set: 73 MB. Openness...
  • Freebase RDF Store

    Duplicate of package:freebase. Data exposed: Freebase Views of Freebase Topics following the principles of Linked Data. The dataset extractions contain aggregated data from:...
  • Fly-TED

    Data exposed: derived from data published by www.fly-ted.org and provides metadata on images depicting in situ hybridisation in D. melanogaster testes. Size of dump and data...
  • FlyAtlas

    Data exposed: FlyAtlas and Affy D2 probe-to-gene. Size of dump and data set: size? Notes: also found in the list of SPARQL Endpoints.
  • EU European Statistical Information Service

    Description: "Eurostat’s mission is to provide the European Union with a high-quality statistical information service." Very large amount of data on a wide variety of European...
  • Entrez Gene Extract

    Data exposed: Entrez Gene Extract from [ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz]. Size of dump and data set: 5.6 MB. Notes: NCBI Copyright and Disclaimers.
  • Entrez Gene

    About: Data exposed: Select fields from Entrez Gene records. Size of dump and data set: 7.7 MB. Notes: NCBI Copyright and Disclaimers. Openness: Data appears to be in the public domain....
  • DOAP Store

    About: Data exposed: provides daily generated dumps with all its DOAP project descriptions. Size of dump and data set: size? Notes: 2009-05-24: Both files seem to be empty - hg...
  • DOAPspace

    Data exposed: All 55,000+ DOAP profiles available as RDF/XML DOAP. This includes all DOAP created by doapspace and all DOAP spidered. Size of dump and data set: size? Notes:...
  • DMOZ RDF Dump

    Data exposed: DMOZ. Size of dump and data set: size? Openness: OPEN (?). Uses the Open Directory License, which is, in essence, open (there may be some wrinkles about updates).
  • Open Directory Project (ODP)

    From about page: The Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community...