Data Ontology — The 10 Canonical Entities

A shared vocabulary for the objects Admaxxer models. Read this before querying /api/v1/*, before reading dashboard SQL, or before asking the Claude agent a "give me X by Y" question.

TL;DR: Admaxxer's data model is built on 10 canonical entities: Workspace, Website, Visitor, Session, Pageview, Goal, Payment, AdAccount, the Campaign / AdSet / Ad hierarchy, and Subscription. Workspace is the multi-tenant root; everything else carries a workspace_id. Hot append-only entities (Pageview, Goal, Payment, ad insights) live in Tinybird; structural state (Workspace, User, AdAccount config) lives in Neon Postgres. Naming is snake_case in storage, camelCase at the JS/TS layer; external IDs from third-party systems carry an external_* prefix. The same 10 entities surface unchanged in the UI, in /api/v1/*, and in every Claude agent tool call.

What is a data ontology — and why does it matter?

A data ontology is the shared vocabulary your stack uses to refer to the entities and relationships in your business. When the dashboard says "Revenue by Ad", the API says GET /api/v1/ads/{id}/revenue, and the Claude agent says get_ad_revenue(ad_id, ...) — they're all referring to the same Ad entity, joined the same way to the same Payment entity. Drift in that vocabulary is one of the most common analytics-platform failure modes (Snowflake's "every team has its own definition of MRR" problem).

Admaxxer publishes its ontology as a documented contract. Every entity name, every field name, every join key is listed here. If you're building against the API, integrating with a downstream BI tool, or asking the Claude agent a complex question, this is the source of truth.

The 10 core entities

1. Workspace — the multi-tenant root

The top-level container. Every other entity in the system carries a workspace_id column for isolation. Every API request resolves to one workspace; cross-workspace queries are explicitly disallowed.

2. Website — a tracked property within a workspace

A single domain (or subdomain set) the merchant runs the pixel on. One workspace can have many websites — useful for brands running multiple Shopify stores, for agencies, or for staging-vs-production split.

3. Visitor — the anonymous identity

An anonymous identity tied to the visitor_id first-party cookie. Spans multiple sessions across cookie rotation when identify(external_user_id) has been called. Without an external_user_id, the visitor is bounded by the cookie's lifetime (24h by default; merchants can extend to 365 days on Pro+ plans).

4. Session — a discrete visit

A bounded sequence of pageviews and goal events from one visitor. Session boundary is a 30-minute idle gap (configurable per workspace).

5. Pageview — a single page render

One row per page render. The atomic unit of behavioral data.

6. Goal — a discrete user action

Any custom goal fired via admx.goal(), data-admx-goal="", or server-side POST /api/event. Plus the seven reserved __admx_* goals fired automatically by script.plus.js (outbound click, file download, form submit, video play, scroll-50, scroll-75, scroll-90).

7. Payment — a revenue event

The most heavily-deduplicated entity. Each row represents exactly one charge, refund, or subscription renewal — whether it landed via Custom Pixel, server-side webhook, or daily reconciliation poll.

8. AdAccount — a connected Meta/Google/TikTok account

One row per platform-credential set the merchant has connected.

9. Campaign / AdSet / Ad — the ad-platform hierarchy

The three-level hierarchy pulled from each ad platform's API. Stored as cached insights in Tinybird, keyed by (date, account_id, campaign_id, adset_id, ad_id).

10. Subscription — recurring revenue

A long-lived agreement between the merchant's customer and a recurring-billing provider. Stripe subscription, Recharge subscription, ReCharge-equivalent, Polar subscription, etc.

Entity relationships

The relationships between the 10 entities, written as cardinality + join key:

Workspace 1 ----- N Websites          (websites.workspace_id)
Workspace 1 ----- N AdAccounts        (ad_platform_connections.workspace_id)
Workspace 1 ----- N WorkspaceMembers  (workspace_members.workspace_id)
Workspace 1 ----- N Subscriptions     (subscriptions.workspace_id)

Website 1 ------- N Visitors          (pixel_events.workspace_id + website_id, then group by visitor_id)
Visitor 1 ------- N Sessions          (pixel_events grouped by 30-min idle window)
Session 1 ------- N Pageviews         (pixel_events with event_type = 'pageview')
Session 1 ------- N Goals             (pixel_events with event_type = 'goal')
Visitor 1 ------- N Payments          (visitor_payments.visitor_id)

AdAccount 1 ----- N Campaigns         (ad_insights_*.account_id)
Campaign 1 ------ N AdSets            (ad_insights_*.campaign_id)
AdSet 1 --------- N Ads               (ad_insights_*.adset_id)
Ad N ------------ M Goals             (attribution model: last-touch, linear, time-decay)
Ad N ------------ M Payments          (attribution model + cohort window)

Cardinality reads "1 ----- N" as "one [left] has many [right]". N ------ M denotes a many-to-many relationship resolved at read-time by the attribution model; there is no static join table.

Where the entities live (Tinybird vs Postgres)

Tinybird (OLAP — append-only event data)

Neon Postgres (OLTP — structural state)

The split rule: if it's append-only and you query it analytically, it goes to Tinybird. If it has lifecycle (created → updated → deleted) and you query it transactionally, it goes to Postgres.

Common joins

The five highest-frequency joins inside Admaxxer, with the SQL shape (Tinybird flavor):

Revenue by ad — last-touch attribution

SELECT
  meta.ad_id,
  meta.ad_name,
  SUM(pay.amount_cents) / 100 AS revenue
FROM visitor_payments pay
INNER JOIN visitor_events evt
  ON pay.workspace_id = evt.workspace_id
  AND pay.visitor_id = evt.visitor_id
  AND evt.event_type = 'pageview'
INNER JOIN ad_insights_meta meta
  ON evt.utm_source = 'meta'
  AND evt.utm_content = meta.ad_id
WHERE pay.workspace_id = {workspace_id}
  AND pay.occurred_at >= now() - INTERVAL 30 DAY
GROUP BY meta.ad_id, meta.ad_name

Cohort LTV — first-purchase 90-day window

WITH first_purchase AS (
  SELECT visitor_id, MIN(occurred_at) AS first_purchased_at
  FROM visitor_payments
  WHERE workspace_id = {workspace_id}
  GROUP BY visitor_id
)
SELECT
  toStartOfMonth(fp.first_purchased_at) AS cohort_month,
  SUM(pay.amount_cents) / 100 AS revenue,
  COUNT(DISTINCT pay.visitor_id) AS visitors,
  SUM(pay.amount_cents) / 100 / COUNT(DISTINCT pay.visitor_id) AS ltv_per_visitor
FROM visitor_payments pay
INNER JOIN first_purchase fp USING (visitor_id)
WHERE pay.workspace_id = {workspace_id}
  AND pay.occurred_at BETWEEN fp.first_purchased_at AND fp.first_purchased_at + INTERVAL 90 DAY
GROUP BY cohort_month

MER — blended marketing efficiency ratio

SELECT
  toDate(pay.occurred_at) AS day,
  SUM(pay.amount_cents) / 100 AS revenue,
  (SELECT SUM(spend_cents) / 100
     FROM ad_insights_meta
     WHERE workspace_id = pay.workspace_id
       AND occurred_on = toDate(pay.occurred_at)) AS meta_spend,
  (SELECT SUM(spend_cents) / 100
     FROM ad_insights_google
     WHERE workspace_id = pay.workspace_id
       AND occurred_on = toDate(pay.occurred_at)) AS google_spend,
  revenue / (meta_spend + google_spend) AS mer
FROM visitor_payments pay
WHERE pay.workspace_id = {workspace_id}
GROUP BY day

The summary_kpis pipe pre-aggregates this; the live SQL is a fallback for ad-hoc analysis. See /documentation/data/revenue-data-flow for the full ingestion model.

Naming conventions

Comparison — vs Triple Whale, vs Datafast

Different ontologies for different shapes of business:

Entity classTriple WhaleDatafastAdmaxxer
Multi-tenancy Account Workspace Workspace
Tracked property Shop (Shopify-only) Website Website (multi-domain)
Anonymous identity (none — Shopify customer ID required) Visitor Visitor (visitor_id + external_user_id + email_hash)
Visit semantics (implicit through orders) Session + Pageview Session + Pageview
Discrete action Customer Journey event Goal Goal (custom + 7 reserved __admx_*)
Revenue event Order (Shopify-native) Payment Payment (RMT-keyed across 7 providers)
Customer Customer (with LTV, cohort, RFM) (none as first-class entity) (derived from Visitor + Payment)
Product Product (Shopify-native catalog) (none) (none in v1)
Marketing entity Marketing Entity (collapsed across platforms) (none) AdAccount + Campaign + AdSet + Ad (platform-native, three levels)
Operations Operations (fulfillment, COGS) (none) (none in v1)
Subscription Subscription (Recharge / Skio) Subscription Subscription (Stripe / Paddle / LS / Polar / Recharge via webhook)

The shape difference: Triple Whale's ontology is commerce-rich (10 entities including Order, Customer, Product, Operations) because TW's roots are pure-Shopify. Datafast's ontology is analytics-only (5 entities: Visitor, Session, Pageview, Goal, Payment) because DF doesn't model the ad stack. Admaxxer sits between: 10 entities including DTC commerce + first-class ad-platform hierarchy + the analytics primitives, but without commerce-fulfillment objects (Product, Inventory, COGS) that aren't in the v1 scope.

API access — same entities, programmatic surface

Every entity in this document is exposed unchanged through /api/v1/*. The endpoints follow REST plus a few RPC-shaped reads for analytical queries:

Authentication is by API key issued from /dashboard/site-settings; rate-limited per-workspace. Quota tracking against the workspace's subscriptionPlan. The full endpoint reference is at the developer documentation hub at /documentation/developer.

Inside the dashboard, the same entities surface in the Claude agents' tool definitions (see /documentation/architecture/how-data-works): list_campaigns reads the AdAccount + Campaign + AdSet + Ad hierarchy; get_revenue_summary reads the Payment entity; query_metric exposes the same POST /api/v1/metrics/query shape.

See also

How data works (architecture) · Revenue data flow · Tinybird auth model · Multi-currency display · Metric glossary · Conversion funnels · Custom goals · GSC integration