Parse the Table of Contents of EuroStat and retrieve all dataset URLs:
- INPUT: URL of table_of_contents.xml
- OUTPUT: A list of dataset URLs
This is covered by ParseToC.bat.
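The step above can be sketched in Python. This is a minimal illustration, not the logic of ParseToC.bat: it assumes that each dataset entry in table_of_contents.xml lists its download URLs in `downloadLink` elements with a `format` attribute (`sdmx` for the zipped SDMX files). The function name `dataset_urls` and the sample XML fragment are hypothetical.

```python
import xml.etree.ElementTree as ET


def dataset_urls(toc_xml: str, fmt: str = "sdmx") -> list[str]:
    """Return the download URLs of the given format found in a ToC document.

    Assumption: URLs live in <downloadLink format="..."> elements. The XML
    namespace is ignored so the sketch does not depend on its exact URI.
    """
    root = ET.fromstring(toc_xml)
    urls = []
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]  # strip the "{namespace}" prefix
        if local_name == "downloadLink" and elem.get("format") == fmt:
            if elem.text and elem.text.strip():
                urls.append(elem.text.strip())
    return urls


# Hypothetical ToC fragment, for illustration only.
SAMPLE_TOC = """<tree xmlns:nt="urn:example:navtree">
  <nt:leaf>
    <nt:code>tsieb010</nt:code>
    <nt:downloadLink format="tsv">http://someURL?file=data/tsieb010.tsv.gz</nt:downloadLink>
    <nt:downloadLink format="sdmx">http://someURL?file=data/tsieb010.sdmx.zip</nt:downloadLink>
  </nt:leaf>
</tree>"""

print(dataset_urls(SAMPLE_TOC))
# prints ['http://someURL?file=data/tsieb010.sdmx.zip']
```

Matching on the local element name keeps the sketch usable even if the real namespace URI differs from the placeholder used here.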
Using a dataset URL, download and parse the contents of the compressed file:
- INPUT: dataset URL, for example http://someURL?file=data/tsieb010.sdmx.zip
- OUTPUT: tsieb010.dsd.xml and tsieb010.sdmx.xml
This is covered by DownloadZip.bat and UnCompressFile.bat.
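A rough Python equivalent of the download and uncompress step, using only the standard library. The function names `download_zip` and `uncompress` are hypothetical and merely mirror the two batch files; the demonstration builds an archive in memory so it runs without network access.

```python
import io
import urllib.request
import zipfile


def download_zip(url: str) -> bytes:
    """Fetch the compressed dataset file and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def uncompress(zip_bytes: bytes) -> dict[str, bytes]:
    """Return {member name: content} for the XML files inside the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.endswith(".xml")
        }


# Self-contained demonstration with an in-memory archive (no network access).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("tsieb010.dsd.xml", "<Structure/>")
    archive.writestr("tsieb010.sdmx.xml", "<CompactData/>")

files = uncompress(buffer.getvalue())
print(sorted(files))
# prints ['tsieb010.dsd.xml', 'tsieb010.sdmx.xml']
```

In a real run, `uncompress(download_zip(url))` would chain the two functions for a dataset URL obtained from the Table of Contents.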
Convert the DSD file to RDF:
- INPUT: tsieb010.dsd.xml
- OUTPUT: ~/dsd/tsieb010.rdf (represented in DataCube vocabulary)
This is covered by DSDParser.bat.
Convert the SDMX data file to RDF:
- INPUT: tsieb010.sdmx.xml
- OUTPUT: ~/data/tsieb010.rdf (represented in DataCube vocabulary)
This is covered by SDMXParser.bat.
Generate a catalog of all datasets from the Table of Contents:
- INPUT: URL of table_of_contents.xml
- OUTPUT: ~/catalog.rdf
For example:
@prefix data: <http://eurostat.linked-statistics.org/data/> .
@prefix dss: <http://eurostat.linked-statistics.org/dss/> .
@prefix dsd: <http://eurostat.linked-statistics.org/dsd#> .
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix void: <http://rdfs.org/ns/void#> .
dss:ds_1 a qb:DataSet, void:Dataset;
qb:structure dsd:dsd_1;
void:dataDump data:ds_1.ttl;
.
This is covered by Catalog.bat.
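How such catalog entries could be generated is sketched below in Python. This is an assumption about the shape of the output, not the code behind Catalog.bat: the helper names `catalog_entry` and `build_catalog` are hypothetical, and the dataset is linked to its DSD with `qb:structure`, the Data Cube property for that relation.

```python
PREFIXES = """\
@prefix data: <http://eurostat.linked-statistics.org/data/> .
@prefix dss: <http://eurostat.linked-statistics.org/dss/> .
@prefix dsd: <http://eurostat.linked-statistics.org/dsd#> .
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix void: <http://rdfs.org/ns/void#> .
"""


def catalog_entry(dataset_id: str, dsd_id: str) -> str:
    """Return the Turtle description of one dataset in the catalog."""
    return (
        f"dss:{dataset_id} a qb:DataSet, void:Dataset ;\n"
        f"    qb:structure dsd:{dsd_id} ;\n"
        f"    void:dataDump data:{dataset_id}.ttl .\n"
    )


def build_catalog(datasets: dict[str, str]) -> str:
    """Build the whole catalog from a {dataset id: DSD id} mapping."""
    entries = (catalog_entry(ds, dsd) for ds, dsd in datasets.items())
    return PREFIXES + "\n" + "\n".join(entries)


catalog = build_catalog({"ds_1": "dsd_1"})
print(catalog)
```

Writing the result to ~/catalog.rdf is left out so the sketch has no side effects.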
Generate an inventory (a VoID file) of all DSDs. It is used solely to populate the triple stores (see next step):
- INPUT: URL of table_of_contents.xml
- OUTPUT: ~/inventory.rdf, one file in the file system with all DSDs
Note: the inventory also contains the dataset from STEP5.
For example:
@prefix data: <http://eurostat.linked-statistics.org/data/> .
@prefix dss: <http://eurostat.linked-statistics.org/dss/> .
@prefix dsd: <http://eurostat.linked-statistics.org/dsd#> .
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix void: <http://rdfs.org/ns/void#> .
dsd:dsd_1 a qb:DataStructureDefinition, void:Dataset;
void:dataDump dsd:dsd_1.ttl
.
This is covered by Catalog.bat.
- CONTROL INPUT: ~/inventory.rdf from STEP6
- DATA INPUT: ~/dsd/tsieb010.rdf from STEP3 and ~/catalog.rdf from STEP5
With the VoID file described in STEP6 and the SMCS, populate the triple store.
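One possible way to drive the population step is to turn each `void:dataDump` URL into a SPARQL 1.1 Update `LOAD` request. The sketch below is an assumption: the source does not say which triple store or loading mechanism is used, the function name `load_requests` is hypothetical, and using the dataset IRI as the named graph is an illustrative choice.

```python
def load_requests(dumps: dict[str, str]) -> list[str]:
    """Build one SPARQL 1.1 Update LOAD request per (graph IRI, dump URL) pair."""
    return [f"LOAD <{url}> INTO GRAPH <{graph}>" for graph, url in dumps.items()]


# Graph IRI -> dump URL, as they would be read from the VoID description.
dumps = {
    "http://eurostat.linked-statistics.org/dss/ds_1":
        "http://eurostat.linked-statistics.org/data/ds_1.ttl",
}

for request in load_requests(dumps):
    print(request)
```

Each generated request could then be sent to the update endpoint of the store, if the store supports SPARQL 1.1 Update.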