We just got a huge dataset: 150 GB of scanned documents from the Justice Ministry. As far as I can tell, the content is the logs of the process by which the laws were made, and the archive covers a long time span,
with material going back to the 1940s or even earlier. Ricardo Poppi, copied on this email, may be able to give us more information.
The problem with this dataset is that it includes the scanned documents and some metadata in spreadsheets, but the link between the scans and the metadata was lost, perhaps due to a crash of a proprietary IBM system around 2006.