Getting Started¶
This page gets you from “install” to a first successful fetch, and explains what changes on the second run (append-only, incremental refresh).
Installation¶
Python versions¶
sdmxflow supports Python 3.11 and 3.12.
From PyPI¶
If you use uv:
From source (contributors)¶
Minimal working example¶
The main entrypoint is SdmxDataset.
from pathlib import Path

from sdmxflow import SdmxDataset

ds = SdmxDataset(
    out_dir=Path("./out/lfsa_egai2d"),
    source_id="ESTAT",
    dataset_id="lfsa_egai2d",
)
result = ds.fetch()
print("appended:", result.appended)
Parameters (what they mean)¶
- out_dir (required): where artifacts are written and persisted between runs.
- source_id (required): the provider/source identifier (currently "ESTAT").
- dataset_id (required): dataset identifier within the provider.
Common optional parameters:
- agency_id: defaults to source_id for ESTAT.
- key: provider-specific SDMX key restriction.
  - For ESTAT, use None to request the full dataset; "" means “provider default” (currently also the full dataset).
- params: provider-specific passthrough parameters (e.g., time window).
- save_logs=True: writes a per-run debug log file under <out_dir>/logs/.
For the full parameter reference and defaults, see Configuration Reference.
What happens on first run vs second run¶
First successful run¶
- No metadata.json exists yet, so sdmxflow initializes metadata.
- It fetches upstream “last updated” metadata.
- It downloads the dataset slice and creates dataset.csv.
- It writes metadata.json and exports codelists/.
Second (and later) runs¶
sdmxflow compares the upstream “last updated” timestamp to the latest locally recorded one:
- If unchanged: it skips the dataset download and does not append to dataset.csv.
- If changed: it downloads and appends a new slice and updates metadata/codelists.
Important
dataset.csv is append-only across versions. It is normal for the same “logical row” to appear multiple times across different last_updated values.
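Because the file is append-only, downstream consumers that want only the newest version of each logical row have to deduplicate themselves. The sketch below shows one way to do that with the standard library. The columns geo, time, and value are hypothetical, and the ISO 8601 format of last_updated is an assumption; check the real header of your dataset.csv before adapting it.

```python
import csv
import io
from datetime import datetime

# Hypothetical append-only content: DE/2023 appears in two versions.
SAMPLE = """geo,time,value,last_updated
DE,2023,10.0,2024-01-10T00:00:00+00:00
FR,2023,7.5,2024-01-10T00:00:00+00:00
DE,2023,10.4,2024-02-01T00:00:00+00:00
"""


def latest_rows(csv_text, key_columns):
    """Keep, for each logical row, the version with the newest last_updated."""
    newest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The logical row is identified by the chosen key columns.
        key = tuple(row[c] for c in key_columns)
        stamp = datetime.fromisoformat(row["last_updated"])
        if key not in newest or stamp > newest[key][0]:
            newest[key] = (stamp, row)
    return [row for _, row in newest.values()]


rows = latest_rows(SAMPLE, key_columns=("geo", "time"))
for row in rows:
    print(row["geo"], row["time"], row["value"])
```

With the sample data, the older DE/2023 value (10.0) is dropped in favor of the newer one (10.4), while FR/2023 is kept unchanged.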
Expected output folder tree¶
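After a successful fetch, <out_dir> should contain dataset.csv, metadata.json, and the codelists/ directory.
If you enabled per-run log capture (save_logs=True), it also contains a logs/ directory.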
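As a quick sanity check after a run, the artifacts named on this page (dataset.csv, metadata.json, codelists/, and optionally logs/) can be verified with a few lines of pathlib. This is a hedged sketch: the helper missing_artifacts is hypothetical and not part of sdmxflow, and it checks only for the presence of the top-level entries, not their contents.

```python
import tempfile
from pathlib import Path


def missing_artifacts(out_dir: Path, expect_logs: bool = False) -> list[str]:
    """Return the names of expected artifacts that are absent from out_dir."""
    missing = []
    # Files written by a successful fetch.
    for name in ("dataset.csv", "metadata.json"):
        if not (out_dir / name).is_file():
            missing.append(name)
    # Directories; logs/ is expected only when per-run logging is enabled.
    dirs = ["codelists"] + (["logs"] if expect_logs else [])
    for name in dirs:
        if not (out_dir / name).is_dir():
            missing.append(name + "/")
    return missing


# Demo on a throwaway directory that mimics a successful first run.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    (out_dir / "dataset.csv").write_text("")
    (out_dir / "metadata.json").write_text("{}")
    (out_dir / "codelists").mkdir()
    print(missing_artifacts(out_dir))                    # []
    print(missing_artifacts(out_dir, expect_logs=True))  # ['logs/']
```

Against a real run, you would call missing_artifacts(Path("./out/lfsa_egai2d")) and expect an empty list.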
Next:
- Read Output Artifacts (Contract) for file semantics and examples.
- See Scheduling & Deployment for production patterns.
- See Integration Patterns for warehouse loading examples.