An open lakehouse.
Any engine.
Your cloud or ours.
Apache Iceberg + Parquet under the hood. Trino, Spark, or DuckDB over the top. One governance layer across all of them. SaaS or self-install in your own Azure tenant. Databasin One writes schema-aware SQL with Claude + GPT — built in, not an add-on SKU.
$100 for TLDR readers · $50 public · no commit · no reservation · cancel anytime
Four things the hyperscalers won't give you.
Every pillar is a structural property of the platform, not a pricing tier. No add-on SKUs, no "contact sales" for the interesting bits.
Open tables. Open engines. No dialect lock-in.
Your data lives in Apache Iceberg + Parquet — readable by anything that speaks Iceberg. Three engines, one lakehouse, zero lock-in.
-
Apache Iceberg+Parquet— vendor-neutral open formats -
Apache Trino·Apache Spark·DuckDB— all read/write the same Iceberg tables - One governance layer — row/column policies, audit, lineage — enforced across every engine
- No proprietary dialect. No egress fees to get your data back.
SaaS in our cloud — or self-install in yours.
Run Databasin as SaaS for fastest time-to-value, or deploy the same platform into your own environment. Zero vendor logical access. Same APIs, either way.
- SaaS: start in minutes, no infra to stand up
- Self-install: Databasin deploys into your own cloud subscription
- Your data never leaves your tenant — we can't see it
- Switch modes later without re-architecting — same platform, same UI
Built inside a hospital. HIPAA is table stakes.
Databasin was co-created at Washington University School of Medicine. Security wasn't retrofitted — it's the design constraint.
- HIPAA-compliant by default · BAA available
- PHI never leaves your environment — by architecture
- Row- and column-level policies enforced at the governance layer
- Full audit trail of every query, export, and credential use
Databasin One — schema-aware AI, built in.
Not a chatbot stapled to a docs page. An agentic analyst that indexes your catalog, picks the right gold tables, writes the SQL, runs it, and ships the chart.
- Claude + GPT — multi-model routing for planning vs. generation
- Reads the semantic model, not just the schema — metric definitions respected
- Discover mode for exploratory analysis against your own files
- Zero add-on SKU. Zero "contact sales."
Point it at your lake. Pick your engine. Query.
The governance layer is the same regardless of which engine is executing. Switch from Trino to Spark for a heavy batch job, to DuckDB for a local dev loop — without moving or re-formatting a byte of data.
-
Storage:
Apache Iceberg+Parquet -
Compute:
Apache Trino·Apache Spark·DuckDB - Governance: row/column policies · audit · lineage · masking
- Medallion: bronze (raw) → silver (transformed) → gold (governed)
- AI layer: Databasin One · Claude + GPT · schema-aware SQL · chart generation
- Deploy: SaaS or self-install in your own cloud
Every architectural decision has a reason.
Why open formats. Why three engines instead of one. How the governance layer enforces policies across all of them. Why PHI never leaves your environment by architecture — not configuration.
Read the architecture → See the four modules →An analyst that knows your schema on day one.
Databasin One is not a copilot bolted onto a text editor. It reads your semantic model, routes between Claude and GPT based on the task, generates governed SQL, runs it against your gold layer, and returns a chart you can pin to a dashboard.
- Multi-model. Claude for reasoning over schema; GPT for fast refinement and code.
- Semantic-aware. Uses your defined metrics, not guessed aggregations.
- Governed. Only queries gold views. Row/column policies always applied.
- Discover mode. Point it at a spreadsheet or folder of files — it figures out the shape.
gold.facility
with
gold.fact_encounter_financials
on the semantic relationship you defined. Net revenue excludes adjustment codes
100–199.
SELECT f.name, SUM(r.net_revenue) AS revenue FROM gold.fact_encounter_financials r JOIN gold.facility f ON f.id = r.facility_id WHERE r.quarter = CURRENT_QUARTER() - 1 AND r.adjustment_code NOT BETWEEN 100 AND 199 GROUP BY f.name ORDER BY revenue DESC LIMIT 10;
Billed per minute. No reservation. No seats. No commit.
Compute rounded up to the minute. Stopped clusters don't bill. Every rate is on the pricing page — not behind "contact sales."
$100 in credit.
No card required.
Credit applied automatically when you sign up via this page.