DBPort¶
Build locally. Publish safely.
Governance and orchestration for recomputable warehouse datasets.
You build models that produce datasets — and those datasets depend on each other. When external sources update, you need to recompute downstream models in the right order, knowing exactly which input versions went into each output. As the number of models grows, keeping track of dependencies, provenance, and data quality becomes harder than the modeling itself.
DBPort is the orchestration layer on top of your warehouse that builds governance into recomputable workflows. It tracks dependencies between your models and on external inputs, so you can build with confidence that future updates will be picked up correctly and that other models can pick up your results.
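As a minimal sketch of what "recompute downstream models in the right order" means, a dependency graph can be sorted topologically. This uses Python's standard `graphlib`, not DBPort itself, and `wifor.emp__regional_forecast` is a hypothetical downstream dataset added for illustration.

```python
from graphlib import TopologicalSorter

# Map each dataset to the datasets it reads from.
# The first entry mirrors the example on this page; the second is hypothetical.
DEPENDS_ON = {
    "wifor.emp__regional_trends": {"estat.nama_10r_3empers"},
    "wifor.emp__regional_forecast": {"wifor.emp__regional_trends"},
}


def recompute_order(depends_on, changed):
    """Return every dataset downstream of `changed`, in a safe recompute order."""
    # static_order() yields each dataset only after all of its inputs.
    order = list(TopologicalSorter(depends_on).static_order())
    stale = {changed}
    result = []
    for name in order:
        # A dataset is stale if any of its inputs is stale.
        if depends_on.get(name, set()) & stale:
            stale.add(name)
            result.append(name)
    return result


print(recompute_order(DEPENDS_ON, "estat.nama_10r_3empers"))
```

When the external source changes, both models are listed, with the model that reads the source first.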
See it in action¶
pip install dbport
# Initialize a project and configure the model
dbp init regional_trends --agency wifor --dataset emp__regional_trends
cd regional_trends
dbp config model wifor.emp__regional_trends schema sql/create_output.sql
dbp config model wifor.emp__regional_trends input estat.nama_10r_3empers
# Run the full lifecycle: load → execute → publish
dbp model run --version 2026-03-09 --timing
That is a complete lifecycle: inputs loaded from an Iceberg warehouse, SQL transforms executed in DuckDB, and a versioned output published back — with schema validation, metadata, and codelists attached automatically.
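The three phases can be illustrated with a self-contained sketch. It is not DBPort's implementation: `sqlite3` stands in for DuckDB, a dictionary stands in for the Iceberg warehouse, and the table name, columns, and sample rows are assumptions made for the example.

```python
import sqlite3

PUBLISHED = {}  # stand-in for the warehouse: version -> published output


def run_model(version, input_rows):
    """Load inputs, execute the transform, and publish one versioned output."""
    if version in PUBLISHED:
        return "skipped"  # re-running a completed version changes nothing

    con = sqlite3.connect(":memory:")
    # Load: copy the input rows into the local engine.
    con.execute("CREATE TABLE employment (region TEXT, year INTEGER, persons REAL)")
    con.executemany("INSERT INTO employment VALUES (?, ?, ?)", input_rows)
    # Execute: run the SQL transform.
    rows = con.execute(
        "SELECT region, SUM(persons) AS persons "
        "FROM employment GROUP BY region ORDER BY region"
    ).fetchall()
    con.close()
    # Publish: store the output under its version, together with a row count.
    PUBLISHED[version] = {"rows": rows, "row_count": len(rows)}
    return "published"


sample = [("DE1", 2023, 10.0), ("DE1", 2024, 12.0), ("DE2", 2024, 5.0)]
print(run_model("2026-03-09", sample))
print(run_model("2026-03-09", sample))
```

The second call returns without touching the stored output, which is the behaviour a versioned publish is meant to guarantee.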
Why use DBPort¶
-
Model dependencies, tracked
Models produce datasets that feed other models. DBPort tracks these dependencies so you always know what depends on what — across your entire organisation.
-
Full input provenance
Every publish records exactly which input versions and snapshots were used. Months later, you can trace any output back to the data that produced it.
-
Recompute when sources update
Snapshot-cached inputs detect when external sources change. Unchanged tables are skipped automatically — only what's new gets reprocessed.
-
Schema drift, caught early
Declare the output shape upfront. Drift is caught before anything is written to the warehouse, so invalid data and silent corruption never reach it.
-
Versioned, resumable publishes
Every publish records version, parameters, and row count. Interrupted runs resume from checkpoint. Re-running a completed version is a safe no-op.
-
Committable state
dbport.lock is TOML, credential-free, and tracks schema, inputs, and versions, ready for code review and CI.
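Two of the checks listed above, skipping unchanged inputs and catching schema drift, can be sketched in a few lines. This is an illustration under assumptions, not DBPort's internals: the snapshot ids, the column types, the `estat.other_table` input, and both function names are hypothetical.

```python
def inputs_to_reload(cached_snapshots, current_snapshots):
    """Return the input tables whose snapshot id differs from the cached one."""
    return sorted(
        table
        for table, snapshot in current_snapshots.items()
        if cached_snapshots.get(table) != snapshot
    )


def schema_drift(declared, actual):
    """Compare declared and actual column types and list every mismatch."""
    problems = []
    for column, col_type in declared.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != col_type:
            problems.append(f"type changed: {column} {col_type} -> {actual[column]}")
    for column in actual:
        if column not in declared:
            problems.append(f"unexpected column: {column}")
    return problems


# Only the first input changed since the last run, so only it is reloaded.
cached = {"estat.nama_10r_3empers": "snap-1", "estat.other_table": "snap-7"}
current = {"estat.nama_10r_3empers": "snap-2", "estat.other_table": "snap-7"}
print(inputs_to_reload(cached, current))

# The produced output no longer matches the declared shape.
declared = {"region": "VARCHAR", "persons": "DOUBLE"}
actual = {"region": "VARCHAR", "persons": "BIGINT", "note": "VARCHAR"}
print(schema_drift(declared, actual))
```

A publish step would refuse to write whenever the drift list is non-empty.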
It fits with what you already use¶
DBPort doesn't deliver the models — it delivers the platform to keep track of dependencies between them. It is the governance layer that connects your tools.
| Tool | How DBPort fits |
| --- | --- |
| DuckDB | The execution engine. DBPort adds governed inputs, output contracts, and publish semantics around it. |
| dbt | Complementary. dbt handles transformations in the middle; DBPort manages dataset lifecycle at the edges. |
| Airflow, Dagster, … | DBPort defines what a safe run means. Orchestrators decide when to trigger it. |
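Assuming an orchestrator task simply shells out to the CLI, a trigger could look like the sketch below. Only the `dbp model run` command and its `--version` and `--timing` flags come from the example earlier on this page; the helper names are hypothetical, and the sketch assumes `dbp` exits with a non-zero status when a run fails.

```python
import subprocess


def build_run_command(version, timing=True):
    """Build the argument list for one `dbp model run` invocation."""
    cmd = ["dbp", "model", "run", "--version", version]
    if timing:
        cmd.append("--timing")
    return cmd


def trigger_run(version, runner=subprocess.run):
    """Run the model; check=True raises if the command exits non-zero."""
    return runner(build_run_command(version), check=True)
```

An Airflow or Dagster task would call `trigger_run("2026-03-09")` from inside the project directory; the `runner` parameter exists only so the call can be replaced in tests.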
-
Getting Started
Install DBPort, configure credentials, and run your first model.
-
Concepts
How inputs, schemas, metadata, versioning, and the lock file work together.
-
CLI Reference
Full command reference for dbp init, dbp model, dbp config, and dbp status.
-
Python API
Constructor options, methods, and lifecycle for the DBPort class.