How Admaxxer migrated to self-hosted ClickHouse — Phase 2D ships attribution_breakdown
Admaxxer is moving its analytics warehouse off Tinybird and onto a self-hosted ClickHouse cluster running on a dedicated Hetzner box in Hillsboro, OR. This page is for merchants who keep an eye on their dashboard numbers, for security reviewers, and for anyone curious how a phased warehouse migration with zero customer-visible downtime works in practice. The short version: Tinybird stays canonical until your workspace’s numbers match on both sides for at least 14 days; only then does our admin team flip your workspace onto the ClickHouse path. Either way, you see no change — this is the polished version of “we’re upgrading the engine without shaking the cup.”
Phase 2D update (2026-05-17) — attribution_breakdown migrated
Sources & Attribution now reads from ClickHouse for flagged workspaces. The attribution_breakdown pipe — the one powering the Channel Attribution Drilldown on /marketing-acquisition — is the second pipe to land a CH-native translation after summary_kpis. Five things landed today:
- `attribution_breakdown` CH module shipped. A new `server/lib/clickhouse/attributionBreakdownQuery.ts` module emits parameterized SQL covering all four breakdown levels: channel, campaign, adset, ad. Same UTM-matching rules, same provenance gates (`meta_v21` / `google_v18` / `tiktok_v13` / etc.), same currency-stamping behavior as the Tinybird pipe. Seven attribution models supported: `last_click`, `first_click`, `linear_all`, `linear_paid`, `time_decay` (7d half-life), `position_based` (40/20/40 U-shape), and `markov`.
- Per-workspace read flag wired. `/admin/clickhouse-flags` now exposes a second toggle alongside `summary_kpis`: a per-workspace `attribution_breakdown (CH)` switch. Default OFF — Tinybird stays canonical until each workspace clears its 14-day parity gate.
- Adset → Ad drilldown depth fix landed in the same wave. Drilling from Adset to Ad inside the Channel Attribution table now lazy-loads correctly. The fix corrects the `breakdown_level=ad` request envelope so the FE chevron resolves children instead of falling into the “preparing” empty state.
- Parity script covers all four breakdown levels. `scripts/clickhouse-verify-attribution-breakdown-parity.ts` iterates channel / campaign / adset / ad with per-row, per-column drift detection across ~25 numeric fields. Same ±1% gate, same `--strict` exit-2 contract as the Phase 2B `summary_kpis` script. Iterating every level means divergences at the ad level (which only surface when a user actually drills) are caught before cutover.
- Burn-in clock starts after `summary_kpis` clears. Phase 2D dual-write goes live for VITATREE USA next, then a 14-day clock at ±1% per breakdown level before broader Cohort 1 cutover. Rollback is one toggle flip — same per-workspace recovery posture as Phase 2C.
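To make the two less common models above concrete: `time_decay` with a 7-day half-life gives a touchpoint half the credit for every week of age, and `position_based` splits credit 40/20/40 between the first touch, the middle touches, and the last touch. Here is a minimal sketch of those two weighting rules — the function names are illustrative only; the shipped module expresses this logic in SQL, not application code:

```typescript
// Illustrative sketch of two attribution weighting rules (assumed names;
// attributionBreakdownQuery.ts computes credit in SQL, not app code).

/** time_decay: a touchpoint's weight halves for every 7 days of age. */
function timeDecayWeights(ageDays: number[], halfLifeDays = 7): number[] {
  const raw = ageDays.map((age) => Math.pow(0.5, age / halfLifeDays));
  const total = raw.reduce((a, b) => a + b, 0);
  return raw.map((w) => w / total); // normalize so credit sums to 1
}

/** position_based: 40% first touch, 40% last, 20% split across the middle. */
function positionBasedWeights(touchCount: number): number[] {
  if (touchCount === 1) return [1];
  if (touchCount === 2) return [0.5, 0.5];
  const middleShare = 0.2 / (touchCount - 2);
  return Array.from({ length: touchCount }, (_, i) =>
    i === 0 || i === touchCount - 1 ? 0.4 : middleShare,
  );
}
```

For example, a journey with touches 0 and 7 days old gets a 2:1 `time_decay` split, and a four-touch journey gets 40/10/10/40 under `position_based`.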
How to verify your numbers: after your workspace flips, open /marketing-acquisition and look at the data-source badge on the Channel Attribution table. A sky-blue ClickHouse chip means the row was sourced from CH; no chip means Tinybird. Per-row, per-column numbers should agree to within ±1% — if they don’t, file a support ticket and we’ll flip your workspace back to Tinybird in under a second. The operator-facing flag-flip procedure is documented at /documentation/admin/clickhouse-runbook.
Why Admaxxer’s Sources & Attribution beats Triple Whale + Datafast
Self-hosted ClickHouse is table stakes for serious DTC analytics — both Triple Whale and Datafast run ClickHouse under the hood and we ship on the same architecture for the same reasons. The differentiators are upstream of the warehouse: pricing, attribution-model surface, and transparency. Phase 2D’s attribution_breakdown cutover makes all three concrete on the Sources & Attribution screen.
| Axis | Admaxxer (this ship) | Triple Whale | Datafast |
|---|---|---|---|
| Warehouse hosting + pricing | Self-hosted ClickHouse on a dedicated $40/mo box. Flat marginal cost — no per-event upcharge as you scale. | Managed warehouse with event-volume billing — costs scale with traffic, not just usage. | Co-located ClickHouse, similar architecture; pricing tied to billable analytics events per their docs. |
| Attribution models exposed | Seven models on a single dropdown: last_click, first_click, linear_all, linear_paid, time_decay (7d half-life), position_based (40/20/40), and markov chains. Four data lenses: pixel, Shopify journeys, platform-reported, and a side-by-side all-lenses view. | Sonar supports “switch between attribution models on the fly” for a curated set per triplewhale.readme.io. | Single-touch attribution with optional view-through per datafa.st/docs; multi-model selector positioning is narrower. |
| Pipe + datasource transparency | Every pipe, every datasource, every materialized-view CREATE TABLE is visible inside merchant-readable docs and the GitHub repo. The exact SQL behind a tile is one click away. | Internals are proprietary; merchant-facing docs cover surfaces, not the warehouse layer. | Documentation focuses on dashboard surface and pixel install; underlying warehouse pipes not exposed. |
| Migration discipline | Per-workspace dual-write parity gate, customer-facing data-source badge during burn-in, three-tier rollback (flag flip <1s / pipe kill-switch <5s / container reset <3min), parity script in the repo for every migrated pipe. | Migrations happen server-side without merchant-visible signaling. | Migrations happen server-side without merchant-visible signaling. |
Citations (fair-use, ≤15 words per quote): triplewhale.readme.io describes Sonar as a unified attribution layer with on-the-fly model switching. datafa.st/docs covers analytics-event billing and single-touch attribution with optional view-through. Differentiator rows are fact-of-product, not editorial.
Phase 2C update (2026-05-17) — VITATREE USA is live
Four things landed today:
- GL#500 fix shipped. The meta-inflation parity drift that gated Phase 2B cutover is gone. The CH `summary_kpis` path now matches Tinybird within ±0.8% across every workspace-day we tested.
- Performance pass complete. Median CH `summary_kpis` query latency is now ~1.2 ms (in-host on the `admaxxer-ch` container), vs ~32 ms for the cross-region Tinybird path. p95 is ~2.1 ms vs ~58 ms. Tile-level dashboard renders are ~880 ms faster on the CH side.
- Dual-write live for canary workspace VITATREE USA. This is the first production traffic reading `summary_kpis` from ClickHouse. The dashboard tile-level data-source badge confirms each row’s backing warehouse in real time. 90 days of `pageviews` + `revenue_events` + `orders` were backfilled into ClickHouse so long-window LTV queries (7/30/90d) cross-validate against Tinybird with the same ±1% gate.
- Parity burn-in clock started. The 14-day continuous-parity clock is running. As long as no workspace-day drifts past ±1%, broader cohort cutover begins 2026-05-31. If any day fails, the clock resets and we investigate before resuming. Conservatism beats speed here.
The operations runbook for this cutover lives at /documentation/admin/clickhouse-runbook (admin-only access). It documents how to check parity status for a workspace, flip a workspace flag on, roll back a workspace, trigger a backfill for a specific date window, read the CH performance dashboard and system.query_log, and the escalation matrix.
Wave 3 update (2026-05-17) — source_medium_breakdown + p_summary_series
Two more pipes shipped today, in lockstep:
- `source_medium_breakdown` CH module shipped. The pipe that powers the Sources & Attribution top-level table at /marketing-acquisition — every `(source, medium)` row your team scans first thing in the morning to see which channel paid off yesterday — now has a CH-native query path. Paid rows join `ad_spend_daily` for CPC / CPM / ROAS; organic rows surface session + pageview volume; the same UTM canonicalization rules apply on both warehouses, so reconciliation between Tinybird and CH is one-to-one per row.
- `p_summary_series` CH module shipped. This is the pipe that powers the sparkline arrays on every summary KPI tile at /dashboard — revenue, spend, blended ROAS, MER, AOV, sessions, conversions, units, NCPA, and the rest of the ~16 numeric series. Per-day granularity is preserved exactly, so the chart animation reads identically. Float64 coercion is handled at the SQL boundary per GL#418 (the Tinybird Decimal vs Float64 cast key lesson).
- Two new per-workspace read flags wired. `/admin/clickhouse-flags` now exposes `source_medium_breakdown (CH)` and `p_summary_series (CH)` toggles alongside the existing `summary_kpis` and `attribution_breakdown` ones. Default OFF on every workspace — Tinybird stays canonical until each workspace clears its 14-day parity gate per pipe.
- Two parity scripts cover both pipes end-to-end. `scripts/clickhouse-verify-source-medium-breakdown-parity.ts` iterates per-`(source, medium)` row with per-column drift detection; `scripts/clickhouse-verify-summary-series-parity.ts` iterates each per-day series column with per-column drift detection across ~16 numeric fields. Same ±1% drift budget, same `--strict` exit-2 contract, same per-workspace + per-day decomposition as the Phase 2B `summary_kpis` and Phase 2D `attribution_breakdown` scripts.
- Sparkline-render visual contract preserved. `p_summary_series` powers a visually sensitive surface — the sparkline shape on each tile is the trend signal operators use to decide whether to dig into a number. A 14-day series of (10, 11, 10, 12, 11, …) on CH vs (11, 10, 11, 11, 12, …) on Tinybird can pass a naive per-day ±1% gate yet render visibly different sparklines. We added an extra visual-parity verification to the burn-in: the CH series and the Tinybird series must produce the same up/down/flat direction-of-change between adjacent days, plus the same min/max bucket across the 14-day window. Numeric drift inside ±1% is not sufficient on its own for this pipe.
- Burn-in clock starts after upstream `summary_kpis` clears. Wave 3 dual-write goes live for VITATREE USA next, then a 14-day clock at ±1% per `(source, medium)` row for `source_medium_breakdown` and per-day per-column for `p_summary_series` before broader Cohort 1 cutover. Rollback is one toggle flip per pipe — same per-workspace recovery posture as Phase 2C and 2D, and the three-layer rollback (per-workspace flag flip <1s / pipe-wide Redis kill-switch <5s / container-level CLICKHOUSE_URL unset <3min) applies independently to each toggle.
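The direction-of-change rule for sparkline parity can be sketched in a few lines. This is a hedged illustration, not the shipped burn-in tooling, and it takes "same min/max bucket" to mean the min and max fall on the same day index:

```typescript
// Sketch of the sparkline visual-parity rule (illustrative names).
// Two series "agree visually" when every adjacent-day step moves in the
// same up/down/flat direction AND the min and max land on the same day.

function directions(series: number[]): number[] {
  // Sign of each adjacent-day step: 1 = up, -1 = down, 0 = flat.
  return series.slice(1).map((v, i) => Math.sign(v - series[i]));
}

function sparklineParity(ch: number[], tinybird: number[]): boolean {
  if (ch.length !== tinybird.length) return false;
  const dCh = directions(ch);
  const dTb = directions(tinybird);
  return (
    dCh.every((d, i) => d === dTb[i]) &&
    ch.indexOf(Math.min(...ch)) === tinybird.indexOf(Math.min(...tinybird)) &&
    ch.indexOf(Math.max(...ch)) === tinybird.indexOf(Math.max(...tinybird))
  );
}
```

Under this rule the two example series from the text fail (their shapes differ), while a uniformly scaled copy of the same series passes — exactly the property the numeric ±1% gate cannot capture on its own.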
What each pipe powers, in customer-facing terms: source_medium_breakdown is the table you scan to answer “which channel paid off yesterday?”; p_summary_series is the inline chart you glance at to answer “is this metric trending up, flat, or down?”. Both are read-path-only changes — the API shapes returned by /api/v1/analytics/source-medium and /api/v1/analytics/summary-series are byte-identical before and after cutover; every field name, every type, every nullable shape stays the same. The migration is invisible at the consumer level until your workspace is explicitly opted in by an admin.
Admin runbook for these two new flags (admin-only access): /documentation/admin/clickhouse-runbook. The runbook documents how to read parity status for the new pipes, flip a workspace flag per pipe, roll back per pipe, and the sparkline-shape verification step added for p_summary_series.
Why we’re migrating
Three reasons, in order of weight:
- Per-query billing was capping us at our growth point. Tinybird’s Develop tier ($40/mo) was sufficient through the early product but our pipe-count × workspace-count is now bumping the vCPU ceiling daily. The next Tinybird tier is $249/mo+; a dedicated Hetzner CCX23 ($40/mo, 4 dedicated AMD cores, 16 GB RAM) gives us 8× the headroom for the same spend, no per-query upcharge, and no scaling surprises as we ship features.
- Cross-region latency was a real cost. Tinybird runs in AWS us-east-1. Admaxxer’s app servers run in Hetzner Hillsboro, OR. Every dashboard query paid ~30–50 ms of network round-trip before the query even started. Co-locating ClickHouse with the app server cuts that to 0.8–1.6 ms over a private LAN — on a dashboard with 12 tiles each firing its own query, that’s 400+ ms shaved per page load. You feel this on the dashboard render.
- We already self-host everything else. Admaxxer’s Postgres (OLTP), Redis (queues + cache), and pixel ingest all migrated off managed cloud onto the same Hetzner box earlier this month (May 11–12, 2026). ClickHouse is the last managed dependency. Cancelling the Tinybird subscription cleanly closes the per-query-billing chapter and makes our cost model predictable at every customer-count target.
Triple Whale and Datafast both self-host their warehouse layers at scale — the patterns are well-trodden, and the cost curve at our growth point clearly favours owning the box.
What changes for you — right now: nothing
This is the most important section. Every Admaxxer dashboard you load today still reads its numbers from Tinybird. The migration is happening behind a feature flag that defaults OFF for every workspace. We only flip your flag on after our admin team has verified that the ClickHouse path returns numbers that match Tinybird’s within ≤1% on a per-column basis for at least 14 days running.
When your workspace flips:
- Same API shape. The response envelope your dashboard reads (`/api/v1/analytics/summary`, `/api/v1/analytics/attribution`, etc.) is byte-identical across both backends. Every field name, every type, every nullable shape stays the same.
- Same numbers. The parity gate enforces ≤1% drift per column over the burn-in. Anything that drifts triggers an admin alert and your flag stays off.
- Faster page loads. 0.8–1.6 ms per query instead of 30–50 ms. The most visible effect: the dashboard hero loads about half a second faster.
- Tracking data badge stays consistent. The "data source" indicator on tiles still says where each number came from (pixel / Shopify / platform); the warehouse swap is invisible at the tile level.
If anything looks off the day your flag flips, mention it to support — we can flip it back to Tinybird instantly while we investigate. The rollback is one column update on our admin DB; no deploy needed.
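Conceptually, the read-path dispatch behind that instant rollback looks like the sketch below. The names are assumptions for illustration, not the actual Admaxxer code — the point is that the warehouse choice reduces to one admin-controlled boolean:

```typescript
// The warehouse choice per workspace is a single admin-controlled boolean
// column, so rollback is one UPDATE on the admin DB — no deploy needed.

interface WorkspaceFlags {
  summaryKpisCh: boolean; // defaults false: Tinybird stays canonical
}

type Warehouse = "clickhouse" | "tinybird";

function pickWarehouse(flags: WorkspaceFlags, pipeKillSwitch = false): Warehouse {
  // A pipe-wide kill switch (Redis-backed in the real system) overrides
  // every per-workspace flag for fast fleet-wide rollback.
  if (pipeKillSwitch) return "tinybird";
  return flags.summaryKpisCh ? "clickhouse" : "tinybird";
}
```

Because the flag is read per request, flipping the column back takes effect on the very next dashboard query.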
How parity verification works
Every numeric column emitted by summary_kpis (the pipe that powers your dashboard’s 80+ KPIs — MER, ROAS, AOV, NC-ROAS, NCPA, units sold, etc.) is computed on BOTH Tinybird and ClickHouse for your workspace over identical date windows. A nightly canary script (scripts/clickhouse-verify-summary-kpis-parity.ts) runs the comparison and emits a per-column drift report:
- Status `ok` — drift < 1% on a non-zero column. Counts toward your burn-in clock.
- Status `drift` — drift > 1%. Resets your burn-in clock to day 0; an admin investigates the source of the drift before resuming.
- Status `currency_pending` — your workspace reports in a non-USD currency (PKR, EUR, INR, etc.). The current ClickHouse path returns `upstream_unavailable` for non-USD workspaces until the FX-conversion pipe migrates in a follow-up phase. Tinybird stays canonical and your numbers are correct; the parity check just doesn’t produce a meaningful comparison until FX lands on the new side.
The burn-in requirement is 14 consecutive days of ok across every column before flag-on. This is intentionally conservative — the polled-fallback path that fills install-day workspaces' tiles (GL#313 + GL#319 + GL#327 in our key-lessons archive) takes several days to accumulate enough data on both sides to compare meaningfully, so we want a window that covers it.
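In pseudocode terms, the gate reduces to a per-column status plus a streak counter. This is a sketch under assumed shapes — the real logic lives in the parity script and admin tooling, and how zero-valued and `currency_pending` cases are handled there may differ:

```typescript
type ParityStatus = "ok" | "drift" | "currency_pending";

// Status for one numeric column on one workspace-day. Zero columns carry no
// drift signal, so here they count as ok only when both sides agree on zero.
function columnStatus(tinybird: number, clickhouse: number, isUsd: boolean): ParityStatus {
  if (!isUsd) return "currency_pending"; // FX pipe not migrated yet
  if (tinybird === 0) return clickhouse === 0 ? "ok" : "drift";
  const drift = Math.abs(clickhouse - tinybird) / Math.abs(tinybird);
  return drift < 0.01 ? "ok" : "drift";
}

// 14 consecutive all-ok days are required before flag-on; a drift day resets
// the clock, and currency_pending days neither count nor reset.
function burnInStreak(days: ParityStatus[][]): number {
  let streak = 0;
  for (const day of days) {
    if (day.every((s) => s === "ok")) streak += 1;
    else if (day.some((s) => s === "drift")) streak = 0;
  }
  return streak;
}
```

A workspace becomes a flag-on candidate only once `burnInStreak(...) >= 14`, which is why a single drifting day pushes cutover back a full two weeks.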
Per-workspace cutover schedule
Cutover happens workspace-by-workspace, not all at once. The order is:
- Cohort 1: USD DTC workspaces with full data history (≥ 90 days). The most stable parity profile — both sides have months of overlapping data to compare. Flag-on candidates start landing here ~14 days after the dual-write started for your workspace.
- Cohort 2: USD DTC workspaces with install-day or early-stage data. Parity hinges on the polled-fallback fold; we want extra runway to confirm the ClickHouse translation preserves it.
- Cohort 3: Non-USD workspaces. Blocked on the FX-conversion pipe migration. Once that ships, parity check unlocks and the same 14-day burn-in starts.
- Cohort 4: SaaS and Lead-Gen workspaces. These read different pipes (`p_saas_summary`, etc.) — their CH equivalents land in a separate phase.
You can’t opt your own workspace in or out — the flag is admin-controlled. We’re intentionally not exposing this as a customer-facing toggle because the parity gate is the contract; flipping manually would defeat the gate.
How to verify your numbers (any day, any backend)
Two pieces of advice that apply regardless of which warehouse backs your dashboard:
1. Compare to Shopify Admin’s native reports
Shopify Admin → Analytics → Reports has authoritative revenue and orders numbers straight from Shopify’s order edge. Pick a 30-day window in both Admaxxer and Shopify Admin — pixel-attributed revenue on Admaxxer should land within ~5% of Shopify’s gross sales (the gap is real attribution loss: ad-blockers, Safari ITP, CSP-filtered events). For polled metrics (orders, NC orders, units sold), the numbers should be within ~1% — we read those directly from Shopify Admin via the daily-poll fallback, so they’re effectively the same source.
2. Compare to Meta Ads Manager and Google Ads
Pull spend / impressions / clicks for the same date range from each platform’s native reporting. Admaxxer’s spend numbers should match to the cent — we read directly from the Marketing API daily and store the response unchanged. Purchase / conversion counts will differ slightly because each platform has its own attribution model (default 7-day click + 1-day view for Meta, etc.); the spend is the ground-truth comparison.
If you see discrepancies that don’t match these ranges, file a support ticket with the specific tile + date window. We’ll either explain the gap (often it’s currency or attribution-window mismatch, both fixable) or root-cause it as a real bug. Either way, the warehouse swap doesn’t change this calculus — the parity gate ensures our numbers stay correct across the cutover.
Security and data isolation in the new setup
Same model as before. The ClickHouse cluster:
- Runs in a Docker container (`admaxxer-ch`) on a dedicated Hetzner CCX23 box (Hillsboro, OR, US-West region).
- Listens only on the private LAN interface (`192.168.100.3:8123`); the public IPv4 returns connection-refused. Verified externally — the analytics warehouse is not reachable from the public internet.
- Speaks to the Admaxxer app server over a Hetzner Private Network. No internet traffic crosses the warehouse boundary.
- Authenticates with a 128-bit random password; app-to-warehouse traffic stays on the private LAN end to end (no internet hop).
- Enforces per-workspace data isolation the same way Tinybird did: every read query carries `WHERE workspace_id = {workspace_id:String}` bound from the authenticated session. No user-supplied `workspace_id` path exists.
- Takes a daily backup at 03:20 UTC; 14 daily snapshots + 12 monthly snapshots are retained on a separate disk path. Restore is a single `RESTORE DATABASE` command.
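From the app side, the workspace-scoped read looks roughly like the sketch below. The table and column names are illustrative, not the actual pipe SQL — the load-bearing detail is the ClickHouse `{name:Type}` server-side parameter, which is bound as a value rather than spliced into the SQL string:

```typescript
// ClickHouse server-side parameter binding: {name:Type} placeholders are
// bound with values, never string-concatenated, so a session-derived
// workspace_id cannot be altered through the SQL text.

interface ChQuery {
  query: string;
  query_params: Record<string, string>;
}

function workspaceScopedQuery(sessionWorkspaceId: string): ChQuery {
  return {
    query: `
      SELECT day, sum(revenue) AS revenue
      FROM summary_kpis_mv
      WHERE workspace_id = {workspace_id:String}
      GROUP BY day
      ORDER BY day
    `,
    query_params: { workspace_id: sessionWorkspaceId },
  };
}
```

Note the SQL text never contains the workspace id itself — only the placeholder — so every row a query can see is pinned to the authenticated session's workspace.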
For full architecture details see Tinybird auth model (the patterns generalize) and our public infrastructure reference in the codebase.
Where to learn more
Related: Admin operations runbook · Tinybird auth model · Marketing acquisition deep dive · Dashboard analytics card reference · Multi-currency plumbing · Documentation home
If you have questions about your workspace’s migration status or want to know when your flag is scheduled to flip, reach out to support@admaxxer.com.