Data Ontology — The 10 Canonical Entities

A shared vocabulary for the objects Admaxxer models. Read this before querying /api/v1/*, before reading dashboard SQL, or before asking the Claude agent a "give me X by Y" question.

TL;DR: Admaxxer's data model is built on 10 canonical entities: Workspace, Website, Visitor, Session, Pageview, Goal, Payment, AdAccount, the Campaign / AdSet / Ad hierarchy, and Subscription. Workspace is the multi-tenant root; every other entity belongs to exactly one workspace. High-volume behavioral entities (Pageview, Goal, Payment, ad insights) live in the append-only analytics warehouse for fast aggregation; structural state (Workspace, User, AdAccount config) lives in the primary transactional database. IDs that come from a third-party system are always clearly marked as external. The same 10 entities surface unchanged in the UI, in /api/v1/*, and in every Claude agent tool call.

What is a data ontology — and why does it matter?

A data ontology is the shared vocabulary your stack uses to refer to the entities and relationships in your business. When the dashboard says "Revenue by Ad", the API says GET /api/v1/ads/{id}/revenue, and the Claude agent says get_ad_revenue(ad_id, ...) — they're all referring to the same Ad entity, related the same way to the same Payment entity. Drift in that vocabulary is one of the most common analytics-platform failure modes ("every team has its own definition of MRR").

Admaxxer publishes its ontology as a documented contract. Every entity name, every field name, every join key is listed here. If you're building against the API, integrating with a downstream BI tool, or asking the Claude agent a complex question, this is the source of truth.

The 10 core entities

1. Workspace — the multi-tenant root

The top-level container. Every other entity in the system belongs to exactly one workspace, which is how data is kept isolated. Every API request resolves to one workspace; cross-workspace queries are explicitly disallowed.

Storage: primary database (transactional state).
Key fields: a unique ID, a display name, the owning user, your current plan, your reporting currency, your reporting timezone, and a created-at timestamp.
Common joins: every other entity references its owning workspace. Joined to your team members for seat queries.

2. Website — a tracked property within a workspace

A single domain (or subdomain set) the merchant runs the pixel on. One workspace can have many websites, which is useful for brands running multiple Shopify stores, for agencies, or for a staging-versus-production split.

Storage: primary database (transactional state).
Key fields: a website ID, the owning workspace, the production hostname, the pixel snippet ID, plus the site's currency and timezone.
Common joins: every pixel event is scoped to a single website within a single workspace, so behavioral data always rolls up cleanly per property.

3. Visitor — the anonymous identity

An anonymous identity tied to the visitor_id first-party cookie. A visitor spans multiple sessions, and keeps the same identity across cookie rotation once identify(external_user_id) has been called. Without an external_user_id, the visitor is bounded by the cookie's lifetime (24h by default; merchants can extend to 365 days on Pro+ plans).

Storage: derived in the analytics warehouse — the visitor is computed on read from behavioral and revenue events rather than stored as a standalone record.
Key fields: the anonymous visitor_id, an optional external_user_id (set after login), an optional hashed customer email (we only ever store a one-way hash, never the plaintext address), first-seen and last-seen timestamps, and the sticky first-touch UTM attribution.
Common joins: behavioral events and revenue events are tied together by visitor for cohort revenue. When the first-party cookie has rotated, the join falls back to the logged-in user ID, then to the hashed email.
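
The fallback order described above can be sketched in a few lines. This is a minimal illustration, not Admaxxer's implementation: the field names visitorId, externalUserId, and emailHash are hypothetical, and the real join runs inside the analytics warehouse rather than in application code.

```python
from typing import Optional

# Join keys in fallback order: cookie ID, then logged-in user ID, then hashed email.
# These field names are assumptions made for illustration only.
JOIN_KEYS = ("visitorId", "externalUserId", "emailHash")


def find_visitor(payment: dict, visitors: list[dict]) -> Optional[dict]:
    """Return the first visitor that matches the payment on the strongest available key."""
    for key in JOIN_KEYS:
        value = payment.get(key)
        if value is None:
            continue  # the payment does not carry this key, try the next one
        for visitor in visitors:
            if visitor.get(key) == value:
                return visitor
    return None
```

A payment whose cookie has rotated carries a visitorId that no longer matches, so the lookup moves on to externalUserId and then to emailHash before giving up.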

4. Session — a discrete visit

A bounded sequence of pageviews and goal events from one visitor. A session ends after a 30-minute idle gap (the gap is configurable per workspace).

Storage: derived in the analytics warehouse from the pageview stream; computed on read rather than stored as a standalone record.
Key fields: a session ID, start and end timestamps, duration, pageview count, the landing and exit page URLs, and the sticky UTM attribution captured from the landing pageview.
Common joins: funnel reports group pageviews by session for the "in-session conversion" view.
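
To make the idle-gap rule concrete, here is a small sketch that splits one visitor's event timestamps into sessions. It assumes that a gap strictly longer than the idle threshold starts a new session; how a gap of exactly 30 minutes is treated is not specified here, so that boundary is a guess.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, idle_gap=timedelta(minutes=30)):
    """Group one visitor's event timestamps into sessions separated by idle gaps."""
    sessions = []
    for ts in sorted(timestamps):
        # Continue the current session if the previous event is recent enough.
        if sessions and ts - sessions[-1][-1] <= idle_gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])  # first event, or the idle gap was exceeded
    return sessions


start = datetime(2024, 1, 1, 12, 0)
events = [start + timedelta(minutes=m) for m in (0, 10, 45, 50)]
sessions = split_sessions(events)  # two sessions: minutes 0-10 and 45-50
```

Each resulting list gives a session's start (first element), end (last element), and event count, which map onto the key fields listed above.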

5. Pageview — a single page render

One row per page render. The atomic unit of behavioral data.

Storage: the analytics warehouse, as a pageview-type behavioral event.
Key fields: the owning workspace and website, the visitor and session, a timestamp, the full page URL and a normalized page path (trailing slash, hash, and query string stripped), the referrer, UTM parameters, device type, and geo fields (country, region, city, language).
Common joins: tied to revenue events by visitor for attribution; grouped by page path for the Pages report.
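
The normalized page path can be approximated with the standard library. This sketch assumes the site root is kept as "/" after stripping; the exact rules Admaxxer applies (for example, case handling) are not stated here.

```python
from urllib.parse import urlsplit


def normalize_path(url: str) -> str:
    """Drop the query string, the hash fragment, and any trailing slash from a page URL."""
    path = urlsplit(url).path  # urlsplit separates the path from query and fragment
    stripped = path.rstrip("/")
    return stripped or "/"  # assumption: the site root stays "/"
```

With this rule, https://shop.example.com/products/shoes/?utm_source=meta#reviews and https://shop.example.com/products/shoes both group under /products/shoes in a Pages-style report.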

6. Goal — a discrete user action

Any custom goal fired via admx.goal(), data-admx-goal="", or server-side POST /api/event, plus the seven reserved __admx_* goals that script.plus.js fires automatically (outbound click, file download, form submit, video play, scroll-50, scroll-75, scroll-90).

Storage: the analytics warehouse, as a goal-type behavioral event.
Key fields: the owning workspace and website, the visitor and session, a timestamp, the goal name (up to 100 characters), and optional structured metadata (JSON, up to 16 keys, 256 characters per value, 8 KB total).
Common joins: tied to ad-platform performance data for "goal completions per ad". Drives custom funnel steps.
Naming rule: __admx_ prefix is reserved for system-fired goals. Merchant-defined goals MUST NOT start with __.
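
If you send goals server-side, a pre-flight check against the limits above can catch rejected events early. The sketch below is a hypothetical client-side helper, not part of any Admaxxer SDK. It assumes that "8 KB" means 8192 bytes of UTF-8 JSON and that value length is measured on the string form of each value.

```python
import json

MAX_NAME_CHARS = 100
MAX_KEYS = 16
MAX_VALUE_CHARS = 256
MAX_TOTAL_BYTES = 8 * 1024  # assumption: 8 KB is measured as UTF-8 JSON bytes


def goal_errors(name, metadata=None):
    """Return a list of limit violations for a merchant-defined goal (empty if valid)."""
    errors = []
    if not name:
        errors.append("goal name is empty")
    if len(name) > MAX_NAME_CHARS:
        errors.append("goal name exceeds 100 characters")
    if name.startswith("__"):
        errors.append("goal name must not start with __ (reserved for system goals)")
    metadata = metadata or {}
    if len(metadata) > MAX_KEYS:
        errors.append("metadata has more than 16 keys")
    for key, value in metadata.items():
        if len(str(value)) > MAX_VALUE_CHARS:
            errors.append(f"metadata value for {key!r} exceeds 256 characters")
    if len(json.dumps(metadata).encode("utf-8")) > MAX_TOTAL_BYTES:
        errors.append("metadata exceeds 8 KB in total")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller log every violation for a malformed event at once.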

7. Payment — a revenue event

The most heavily deduplicated entity. Each row represents exactly one charge, refund, or subscription renewal, whether it landed via Custom Pixel, server-side webhook, or daily reconciliation poll.

Storage: the analytics warehouse, as a revenue event stream (deduplicated so each payment is counted exactly once; when duplicates arrive, the last write wins).
Key fields: the owning workspace and website, the visitor, a timestamp, the amount (in minor units), the currency (ISO 4217), the source provider (one of Shopify, Stripe, Paddle, Lemon Squeezy, Polar, Dodo, WooCommerce), the provider-issued payment ID, an optional hashed customer email, and the kind of event (charge, refund, or subscription renewal).
Common joins: Cohort LTV (group by visitor first-purchase + N-day window). MER (revenue per workspace per day ÷ ad spend).
Idempotency: each payment is deduplicated by its source provider and the provider's own unique payment ID, scoped to your workspace and website. The same Shopify order arriving from the Custom Pixel, a webhook, and the daily reconciliation poll lands once.
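
The idempotency rule amounts to keeping one row per deduplication key, with later arrivals overwriting earlier ones. A minimal in-memory sketch follows. The field names are hypothetical, and the warehouse applies the rule at the storage layer rather than in application code.

```python
def dedupe_payments(events):
    """Keep one payment per (workspace, website, provider, provider payment ID); last write wins."""
    latest = {}
    for event in events:  # events are assumed to be in arrival order
        key = (
            event["workspaceId"],
            event["websiteId"],
            event["provider"],
            event["providerPaymentId"],
        )
        latest[key] = event  # a later arrival replaces the earlier copy
    return list(latest.values())
```

Three copies of the same Shopify order (pixel, webhook, reconciliation poll) share one key, so only the most recently written copy survives.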

8. AdAccount — a connected Meta/Google/TikTok account

One row per platform-credential set the merchant has connected.

Storage: primary database (transactional state).
Key fields: a unique ID, the owning workspace, the platform (Meta, Google, or TikTok), the platform's own account ID, the connection credentials (always stored encrypted at rest, never in plaintext), a status (connected, expired, error, or disconnected), and a token-expiry timestamp.
Common joins: tied to your sync history for the connection-health UI, and to your ad performance data by the platform account ID for spend reporting.

9. Campaign / AdSet / Ad — the ad-platform hierarchy

The three-level hierarchy pulled from each ad platform's API. Stored as cached daily insights in the analytics warehouse, keyed by date plus the account, campaign, ad set, and ad.

Storage: the analytics warehouse, one daily-insights stream per platform (Meta, Google, TikTok). The primary database also caches the most recent 15 minutes of API responses for the connection-health UI.
Key fields: the account, campaign, ad set, and ad identifiers, a name, a status (active, paused, archived, or deleted), spend, impressions, clicks, platform-reported conversions, and the date.
Common joins: the cross-platform creative grid combines all three platforms' daily insights by date for one unified Meta + Google + TikTok view. Ad-level LTV ties each ad's insights to revenue events through the attribution model.
Note: Platforms differ — Google's "campaign" maps to a Meta "ad set" in some advertisers' mental model. Admaxxer preserves each platform's native naming, and the cross-platform abstractions are computed in the ad-level reports.
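
As a rough illustration of combining per-platform daily insights by date while keeping each platform's native naming, the sketch below tags every row with its platform and groups rows by day. The row shape and field names are assumptions, and the real creative grid is computed in the warehouse.

```python
from collections import defaultdict


def unified_grid(insights_by_platform):
    """Merge per-platform daily insight rows into one date-ordered view."""
    grid = defaultdict(list)
    for platform, rows in insights_by_platform.items():
        for row in rows:
            # Preserve the platform's own identifiers and add a platform tag.
            grid[row["date"]].append({**row, "platform": platform})
    return dict(sorted(grid.items()))  # ISO dates sort chronologically as strings


grid = unified_grid({
    "meta": [{"date": "2024-05-02", "adId": "a1", "spend": 1200},
             {"date": "2024-05-01", "adId": "a1", "spend": 900}],
    "google": [{"date": "2024-05-01", "adId": "g7", "spend": 400}],
})
```

Nothing is renamed or collapsed across platforms in this step; any cross-platform abstraction would be applied afterwards, as the note above describes.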

10. Subscription — recurring revenue

A long-lived agreement between the merchant's customer and a recurring-billing provider: a Stripe subscription, a Recharge subscription or its equivalent, a Polar subscription, and so on.

Storage: the primary database mirrors Stripe for the merchant's own Admaxxer subscription. The subscriptions held by the merchant's customers (on Stripe Subscriptions, Recharge, and the like) flow through the revenue event stream as subscription-renewal events.
Key fields: a unique ID, the provider's own subscription ID, a status (active, past due, canceled, or trialing), and the started, renewed, canceled, and current-period-end timestamps.
Common joins: MRR / ARR rollups count active subscriptions by current-period-end. Churn analysis groups cancellations by month.

Entity relationships

The relationships between the 10 entities, written as cardinality + join key:

Workspace 1 ----- N Websites          (a workspace owns many tracked properties)
Workspace 1 ----- N AdAccounts        (a workspace owns many ad connections)
Workspace 1 ----- N TeamMembers       (a workspace has many members)
Workspace 1 ----- N Subscriptions     (a workspace has its own billing subscription)

Website 1 ------- N Visitors          (a property has many visitors)
Visitor 1 ------- N Sessions          (visits are split on a 30-min idle gap)
Session 1 ------- N Pageviews         (a visit has many page renders)
Session 1 ------- N Goals             (a visit has many goal completions)
Visitor 1 ------- N Payments          (a visitor has many revenue events)

AdAccount 1 ----- N Campaigns         (an account has many campaigns)
Campaign 1 ------ N AdSets            (a campaign has many ad sets)
AdSet 1 --------- N Ads               (an ad set has many ads)
Ad N ------------ M Goals             (attribution model: last-touch, linear, time-decay)
Ad N ------------ M Payments          (attribution model + cohort window)

Read "1 ----- N" as "one [left] has many [right]". "N ------ M" denotes a many-to-many relationship resolved at read time by the attribution model; there is no static join table.

Where the entities live (analytics warehouse vs primary database)

Analytics warehouse — append-only event data

The high-volume, append-only behavioral and revenue data lives here, where it can be aggregated fast:

Behavioral events — pageviews and goals (tens of millions of rows per month for a typical DTC merchant)
Revenue events — charges, refunds, and renewals, deduplicated so each is counted once
Ad performance — daily spend, impressions, clicks, and platform-reported conversions across Meta, Google, and TikTok
Email engagement — opens, clicks, and conversions from your email platform
Search performance — Google Search Console impressions, clicks, and position
Pre-aggregated summaries — the rolled-up series that power the dashboard summary tiles

Primary database — structural state

Everything with a lifecycle (created → updated → deleted) that you query transactionally lives here:

Accounts & multi-tenancy — workspaces, users, team members, invites, and sessions
Tracked properties — the registry of websites you run the pixel on
Connector state — your ad-platform connections and their sync health
Agent conversations — your Claude agent chat history
Billing — the Stripe customer and subscription mirror for your own plan
Developer access — your API keys and outbound webhook configuration
Marketing content — blog posts and other site content

The split rule: if it's append-only and you query it analytically, it goes to the analytics warehouse. If it has lifecycle (created → updated → deleted) and you query it transactionally, it goes to the primary database.

Common metrics — how the entities combine

The highest-frequency cross-entity calculations inside Admaxxer, described by the entities they relate and the rule they apply:

Revenue by ad — last-touch attribution

Each Payment is matched to the most recent Pageview from the same Visitor that precedes it, and that pageview's UTM attribution links it to the Ad that drove the click. Revenue is then summed per ad over the chosen window (e.g. the last 30 days). The result answers "which ad produced this revenue?" under a last-touch model.
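
A last-touch calculation can be sketched as follows. This is an illustration under stated assumptions, not the production query: field names are hypothetical, timestamps are plain integers, each pageview already carries the adId resolved from its UTM parameters, and only ad-tagged pageviews count as touches. The reporting-window filter is left out for brevity.

```python
from collections import defaultdict


def revenue_by_ad(payments, pageviews):
    """Credit each payment to the ad on the visitor's latest ad-tagged pageview before it."""
    totals = defaultdict(int)
    for payment in payments:
        touches = [
            pv for pv in pageviews
            if pv["visitorId"] == payment["visitorId"]
            and pv["timestamp"] <= payment["timestamp"]
            and pv.get("adId")
        ]
        if not touches:
            continue  # no ad touch precedes this payment, so it stays unattributed
        last_touch = max(touches, key=lambda pv: pv["timestamp"])
        totals[last_touch["adId"]] += payment["amount"]  # amounts in minor units
    return dict(totals)
```

Pageviews that arrive after the payment are ignored, which is what keeps the model last-touch rather than any-touch.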

Cohort LTV — first-purchase 90-day window

Visitors are grouped into cohorts by the month of their first purchase. For each cohort, Admaxxer sums every subsequent Payment from those same visitors within a fixed window (e.g. 90 days), then divides by the number of visitors in the cohort to get lifetime value per visitor. This shows how much a cohort is worth as it matures.
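
The cohort calculation can be sketched like this. Two assumptions are made that the description above leaves open: the first purchase itself counts toward the cohort's revenue, and the window is measured from each visitor's own first purchase. Field names are hypothetical, and refunds are not treated specially.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def cohort_ltv(payments, window=timedelta(days=90)):
    """Average in-window revenue per visitor, grouped by first-purchase month."""
    first_purchase = {}
    for p in payments:
        vid = p["visitorId"]
        if vid not in first_purchase or p["timestamp"] < first_purchase[vid]:
            first_purchase[vid] = p["timestamp"]

    revenue = defaultdict(int)
    for p in payments:
        start = first_purchase[p["visitorId"]]
        if p["timestamp"] - start <= window:  # only payments inside the window count
            revenue[start.strftime("%Y-%m")] += p["amount"]

    members = defaultdict(int)
    for start in first_purchase.values():
        members[start.strftime("%Y-%m")] += 1

    return {cohort: revenue[cohort] / members[cohort] for cohort in members}
```

For example, a January cohort of two visitors with 6000 minor units of in-window revenue has an LTV of 3000 per visitor; a purchase made five months later falls outside the 90-day window and is excluded.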

MER — blended marketing efficiency ratio

For each day, Admaxxer takes total Payment revenue and divides it by total ad spend across every connected AdAccount (Meta + Google + …) for that same day. The ratio is your blended return on ad spend, independent of any single platform's self-reported attribution.
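
The arithmetic behind MER is simple enough to show directly. The sketch below assumes revenue and spend are both in minor units of the same reporting currency and returns None for days with no spend. Field names are hypothetical.

```python
from collections import defaultdict


def daily_mer(revenue_by_day, spend_rows):
    """Divide each day's total revenue by that day's total ad spend across all accounts."""
    spend_by_day = defaultdict(int)
    for row in spend_rows:
        spend_by_day[row["date"]] += row["spend"]  # sum over every connected account

    mer = {}
    for day, revenue in revenue_by_day.items():
        spend = spend_by_day.get(day, 0)
        mer[day] = revenue / spend if spend else None  # undefined when nothing was spent
    return mer


mer = daily_mer(
    {"2024-05-01": 120000, "2024-05-02": 50000},
    [{"date": "2024-05-01", "platform": "meta", "spend": 30000},
     {"date": "2024-05-01", "platform": "google", "spend": 10000}],
)
```

On 2024-05-01 the ratio is 120000 / 40000 = 3.0; on 2024-05-02 there is revenue but no spend, so the ratio is left undefined rather than reported as infinite.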

Pre-aggregated summary series make these instant on the dashboard; the per-entity model above is what they roll up. See /documentation/data/revenue-data-flow for the full ingestion model.

Naming conventions

API responses: field names are camelCase in /api/v1/* JSON, in React component props, and in Claude agent tool parameters — one consistent style everywhere you read data.
External IDs: any value issued by a third-party system is clearly marked as external. Examples: a payment's provider ID (Stripe pi_*, a Shopify order GID, a Paddle order ID), an ad account's platform ID (Meta act_*, a Google customer ID), and a subscription's provider ID (Stripe sub_*).
Reserved goal prefix: __admx_ — reserved for goals Admaxxer fires automatically (the enhanced pixel's auto-events). Your own custom goals MUST NOT start with __.

Comparison — vs Triple Whale, vs Datafast

Different ontologies for different shapes of business:

Entity class	Triple Whale	Datafast	Admaxxer
Multi-tenancy	Account	Workspace	Workspace
Tracked property	Shop (Shopify-only)	Website	Website (multi-domain)
Anonymous identity	(none — Shopify customer ID required)	Visitor	Visitor (anonymous ID + logged-in user ID + hashed email)
Visit semantics	(implicit through orders)	Session + Pageview	Session + Pageview
Discrete action	Customer Journey event	Goal	Goal (custom + 7 reserved __admx_*)
Revenue event	Order (Shopify-native)	Payment	Payment (deduplicated across 7 providers)
Customer	Customer (with LTV, cohort, RFM)	(none as first-class entity)	(derived from Visitor + Payment)
Product	Product (Shopify-native catalog)	(none)	(none in v1)
Marketing entity	Marketing Entity (collapsed across platforms)	(none)	AdAccount + Campaign + AdSet + Ad (platform-native, three levels)
Operations	Operations (fulfillment, COGS)	(none)	(none in v1)
Subscription	Subscription (Recharge / Skio)	Subscription	Subscription (Stripe / Paddle / LS / Polar / Recharge via webhook)

The shape difference: Triple Whale's ontology is commerce-rich (10 entities including Order, Customer, Product, Operations) because TW's roots are pure-Shopify. Datafast's ontology is analytics-only (5 entities: Visitor, Session, Pageview, Goal, Payment) because DF doesn't model the ad stack. Admaxxer sits between: 10 entities including DTC commerce + first-class ad-platform hierarchy + the analytics primitives, but without commerce-fulfillment objects (Product, Inventory, COGS) that aren't in the v1 scope.

API access — same entities, programmatic surface

Every entity in this document is exposed unchanged through /api/v1/*. The endpoints follow REST plus a few RPC-shaped reads for analytical queries:

GET /api/v1/workspaces/me — the current user's workspace shape
GET /api/v1/websites — tracked properties
GET /api/v1/visitors/{visitor_id} — visitor profile + history
GET /api/v1/payments?since=...&limit=... — revenue events
GET /api/v1/ad-accounts — connected platforms
GET /api/v1/ads/{ad_id}/insights?date_range=... — spend + impressions + clicks for an ad
POST /api/v1/metrics/query — arbitrary metric query (PIPE_ALLOWLIST-gated)

Authentication is by API key, issued from /dashboard/site-settings. Requests are rate-limited per workspace, and usage is tracked against your plan's monthly event allowance. The full endpoint reference lives in the developer documentation hub at /documentation/developer.
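
As a rough sketch of what a client call might look like, the snippet below builds (but does not send) a request for the payments endpoint. Several details are assumptions and should be checked against the developer documentation: the host is a placeholder, the Bearer authorization scheme is a guess at how the API key is presented, and the since value is assumed to be an ISO date.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.example.com"  # placeholder host, not the real API origin


def payments_request(api_key, since, limit=100):
    """Build a GET request for /api/v1/payments with since and limit query parameters."""
    query = urlencode({"since": since, "limit": limit})
    return Request(
        f"{BASE_URL}/api/v1/payments?{query}",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Accept": "application/json",
        },
        method="GET",
    )
```

Passing the returned object to urllib.request.urlopen would perform the call; building the request separately keeps the URL and headers easy to inspect and test.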

Inside the dashboard, the same entities surface in the Claude agents' tool definitions (see /documentation/architecture/how-data-works): list_campaigns reads the AdAccount + Campaign + AdSet + Ad hierarchy; get_revenue_summary reads the Payment entity; query_metric exposes the same POST /api/v1/metrics/query shape.