Metadata & Codelists
DBPort manages metadata automatically. You never write metadata.json or codelist CSV files manually.
Lifecycle fields
| Field | Source | When set |
|---|---|---|
| agency_id | Constructor / dbp init | Every init |
| dataset_id | Constructor / dbp init | Every init |
| created_at | Auto | First publish only |
| last_updated_at | Auto (publish time) | Every publish |
| last_fetched_at | Auto | Every initialization |
| params | Publish parameters | Every publish |
| inputs | Load calls | Accumulated across all loads |
| versions | Auto (append) | Every publish |
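The update rules in the table can be sketched roughly as follows. This is a minimal illustration, not DBPort's implementation: the functions on_init, on_load, and on_publish, and the plain-dict representation of the metadata, are assumptions made for the example.

```python
from datetime import datetime, timezone


def now_iso() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def on_init(meta: dict, agency_id: str, dataset_id: str) -> dict:
    """Hypothetical init hook: these fields are set on every init."""
    meta["agency_id"] = agency_id
    meta["dataset_id"] = dataset_id
    meta["last_fetched_at"] = now_iso()
    meta.setdefault("inputs", [])
    meta.setdefault("versions", [])
    return meta


def on_load(meta: dict, table: str) -> dict:
    """Hypothetical load hook: inputs accumulate across all loads."""
    meta["inputs"].append(table)
    return meta


def on_publish(meta: dict, version: str, params: dict) -> dict:
    """Hypothetical publish hook."""
    ts = now_iso()
    meta.setdefault("created_at", ts)  # written on the first publish only
    meta["last_updated_at"] = ts       # overwritten on every publish
    meta["params"] = params            # overwritten on every publish
    meta["versions"].append(version)   # appended on every publish
    return meta


meta = on_init({}, "example_agency", "example_dataset")
on_load(meta, "input_a")
on_load(meta, "input_b")
on_publish(meta, "v1", {"year": 2023})
created = meta["created_at"]
on_publish(meta, "v2", {"year": 2024})
```

After the second publish in this sketch, created_at still holds the timestamp of the first publish, while params and last_updated_at reflect the latest one.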
Codelists
On publish, a codelist is auto-generated for each output column from its distinct values. You can customize this behavior per column.
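As a rough illustration of what "auto-generated from distinct values" means, the sketch below builds a one-column CSV in memory using only the standard library. The function name build_codelist_csv and the header name code are assumptions for the example; the actual codelist layout DBPort produces may differ.

```python
import csv
import io


def build_codelist_csv(rows: list[dict], column: str) -> str:
    """Return a one-column CSV of the distinct non-null values in `column`."""
    distinct = sorted({row[column] for row in rows if row[column] is not None})
    buf = io.StringIO()  # in-memory buffer, nothing is written to disk
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["code"])  # hypothetical header name
    for value in distinct:
        writer.writerow([value])
    return buf.getvalue()


# Hypothetical output rows with a repeated value and a null.
rows = [{"geo": "DE"}, {"geo": "FR"}, {"geo": "DE"}, {"geo": None}]
codelist = build_codelist_csv(rows, "geo")
```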
Overriding codelist metadata
| Parameter | Default | Description |
|---|---|---|
| codelist_id | column name | Identifier for the codelist |
| codelist_kind | inferred from SQL type | "flat" or "hierarchical" |
| codelist_type | inferred from SQL type | Value type hint |
| codelist_labels | null | Human-readable labels per language |
Attaching an external codelist table
The referenced table should already be loaded via dbp model load or port.load(). On publish, the full table is exported as the codelist instead of auto-generating from distinct output values.
Chaining in Python
In the Python API, .meta() returns the ColumnConfig instance itself, so .attach() can be called immediately on the result.
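The chaining pattern can be sketched with a stand-in class. The ColumnConfig below is a hypothetical mock written for this example, not DBPort's class: the keyword arguments accepted by meta(), the table argument of attach(), and the return value of attach() are all assumptions.

```python
class ColumnConfig:
    """Stand-in for illustration only; not DBPort's actual class."""

    def __init__(self, name: str):
        self.name = name
        self.overrides = {}
        self.attached_table = None

    def meta(self, **overrides) -> "ColumnConfig":
        # Record the codelist overrides for this column.
        self.overrides.update(overrides)
        return self  # returning self is what makes chaining possible

    def attach(self, table: str) -> "ColumnConfig":
        # Remember which already-loaded table to export as the codelist.
        self.attached_table = table
        return self


# One expression configures the metadata and attaches a table.
col = (
    ColumnConfig("geo")
    .meta(codelist_id="GEO", codelist_kind="flat")
    .attach("geo_codes")
)
```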
How metadata is stored
No files on disk
On publish, the finalized metadata.json is built in memory and embedded directly in Iceberg table properties (gzip-compressed, then base64-encoded). Codelist CSVs are generated in memory from DuckDB and embedded in Iceberg column docs. No intermediate files are written to disk.
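The encoding step can be approximated with the standard library alone. The sketch below assumes the JSON is gzip-compressed first and then base64-encoded so that it fits in a string-valued property; the helper names pack and unpack and the property key example.metadata are hypothetical, and a plain dict stands in for the Iceberg table properties.

```python
import base64
import gzip
import json


def pack(metadata: dict) -> str:
    """Serialize to JSON, gzip-compress, then base64-encode to ASCII text."""
    raw = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def unpack(value: str) -> dict:
    """Reverse of pack(): base64-decode, gunzip, parse JSON."""
    raw = gzip.decompress(base64.b64decode(value))
    return json.loads(raw.decode("utf-8"))


metadata = {"agency_id": "example", "dataset_id": "demo", "versions": ["v1"]}

# A plain dict stands in for the table properties here.
properties = {"example.metadata": pack(metadata)}
restored = unpack(properties["example.metadata"])
```

Because the payload is plain ASCII after base64 encoding, it can be stored anywhere a string property is accepted and decoded again without touching the filesystem.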