📌 Series: GA4 × SFCC Reconciliation — Part 4 of 5

This article is part of a technical series on reconciling e-commerce order data between Salesforce Commerce Cloud (SFCC) and Google Analytics 4 (GA4). Parts 1–3 cover data extraction, normalisation, and BigQuery architecture. This part covers LookML metric governance. If you are arriving here directly, the section below gives you the context to make this article standalone.

Context: The Problem This Article Solves

Salesforce Commerce Cloud (SFCC) is an enterprise e-commerce platform. It records every order server-side: the order ID, value, channel, customer type, fulfilment status. It is the system of record for revenue.

Google Analytics 4 (GA4) tracks user behaviour client-side via a JavaScript tag in the browser. When a purchase completes, GA4 fires a purchase event. It is the system of record for digital attribution.

The reconciliation problem is that these two systems never agree on the same revenue number. SFCC always reports more orders than GA4, because GA4 misses orders structurally: Safari’s ITP cookie restrictions, consent refusals, ad blockers, in-app browsers, offline orders. The gap is typically 10–35% of web orders.

The Two-System RealitySFCCServer-side order system100% of orders recordedRevenue = ground truthSource: order DB / ERPGap: 10–35%of web ordersGA4Client-side analytics65–90% of orders visibleAttribution = channel sourceSource: browser JS tag
Figure 0 — SFCC and GA4 are two different systems measuring overlapping but not identical realities. SFCC is the revenue source of truth. GA4 is the attribution source of truth.

Parts 1–3 dealt with the data layer: extracting both datasets into BigQuery, normalising order IDs so they can be joined. The result is a reconciliation_base table with one row per SFCC order, enriched with its GA4 match (or absence).

This article addresses the semantic layer: how do you define your metrics in a way that is consistent, documented, version-controlled, and cannot accidentally diverge across dashboards? That is what LookML solves.


Monday morning. The CFO opens three revenue reports side by side. Three different numbers, all for Q3. She asks which one is correct. Nobody has a confident answer. This is not caused by bad data. It is caused by business logic living in individual dashboards rather than a shared, governed model.


The Root Cause: Logic in the Wrong Layer

In most BI tools, metrics are defined at the dashboard level. Every analyst redefines the metrics they need, copies a formula, makes a small adjustment, and forgets to note what changed. Over time: the same metric has fifteen definitions, business rule changes require hunting down every dashboard, new team members can’t find the canonical version, audits are painful.

LookML solves this by moving business logic from the dashboard layer to the model layer. A metric is defined once, documented inline, version-controlled in Git, and consumed identically by every dashboard and user.

Business Logic: Dashboard Layer vs Model LayerWITHOUT LookMLLogic distributed across dashboardsDashboard Arevenue = SUM(value)Dashboard Brevenue = SUM(value-refund)Dashboard Crevenue = SUM(net_value)Dashboard Drevenue = total_order_value⚠ 4 definitions. 1 metric name. Which is right?Change the rule → update 4+ dashboards manuallyNo version history. No documentation.WITH LookMLLogic defined once in the modelLookML Model: net_revenueSUM(order_value − refund_amount)documented · git-versioned · canonicalDashboard ADashboard BDashboard C✓ All use the same definition. Always.Change the rule → update 1 LookML fileGit commit. Propagates everywhere.
Figure 1 — Without LookML, business logic is distributed and diverges. With LookML, it is centralised, documented, and version-controlled.

Dimensions vs Measures: The Foundational Distinction

Every field in LookML is either a dimension or a measure. A dimension is a non-aggregated attribute (order ID, date, channel, status) used to group, filter, and segment. A measure is an aggregation (sum, count, average, ratio) that cannot be used to group. Looker’s type system makes double-aggregation structurally impossible.

Complete dimension examples

view: sfcc_orders {
  sql_table_name: `project.reconciliation.norm_sfcc_orders` ;;

  dimension: order_id {
    type: string
    sql: ${TABLE}.norm_order_id ;;
    primary_key: yes
    hidden: yes
    description: "Normalised SFCC order ID. Used as join key."
  }

  dimension_group: created {
    type: time
    timeframes: [raw, time, date, week, month, quarter, year]
    sql: ${TABLE}.created_at ;;
    datatype: timestamp
    convert_tz: yes
  }

  dimension: channel {
    label: "Order Channel"
    type: string
    sql: ${TABLE}.channel ;;
    description: "Values: web, store, phone, api, b2b."
  }

  dimension: order_size_tier {
    label: "Order Size Tier"
    type: string
    sql: CASE
      WHEN ${TABLE}.order_value < 50    THEN '1. Small (<€50)'
      WHEN ${TABLE}.order_value < 200   THEN '2. Medium (€50-200)'
      WHEN ${TABLE}.order_value < 1000  THEN '3. Large (€200-1000)'
      ELSE '4. Enterprise (>€1000)'
    END ;;
    description: "Prefix numbers ensure correct sort order in visualisations."
  }

  dimension: order_value {
    type: number
    sql: ${TABLE}.order_value ;;
    hidden: yes
  }
}

Complete measure examples with documentation

  measure: confirmed_order_count {
    label: "Confirmed Orders"
    type: count_distinct
    sql: CASE WHEN ${order_status} NOT IN ('cancelled', 'fraud', 'test')
              THEN ${order_id} END ;;
    description: "Excludes cancelled, fraud, and test orders."
  }

  measure: gross_revenue {
    label: "Gross Revenue (€)"
    type: sum
    sql: ${TABLE}.order_value ;;
    value_format_name: decimal_2
    description: "Total order value before refunds."
    filters: [order_status: "-cancelled,-fraud,-test"]
  }

  measure: net_revenue {
    label: "Net Revenue (€)"
    type: sum
    sql: ${TABLE}.order_value - COALESCE(${TABLE}.refund_amount, 0) ;;
    value_format_name: decimal_2
    description: "Gross revenue minus processed refunds. Does not include pending refunds."
    filters: [order_status: "-cancelled,-fraud,-test"]
  }

  measure: average_order_value {
    label: "Average Order Value (€)"
    type: number
    sql: ${net_revenue} / NULLIF(${confirmed_order_count}, 0) ;;
    value_format_name: decimal_2
  }

  measure: web_revenue {
    label: "Web Revenue (€)"
    type: sum
    sql: ${TABLE}.order_value - COALESCE(${TABLE}.refund_amount, 0) ;;
    filters: [channel: "web,storefront"]
    value_format_name: decimal_2
    description: "Net revenue from web storefront channel only."
  }

Cross-Source Metrics: GA4 + SFCC in a Single Governed Model

The real governance payoff comes with cross-source metrics — measures that combine data from GA4 and SFCC. These are the most frequently redefined metrics in organisations lacking a governance layer.

view: order_reconciliation {
  sql_table_name: `project.reconciliation.reconciliation_base` ;;

  dimension: order_origin {
    label: "Order Origin"
    type: string
    sql: ${TABLE}.order_origin ;;
    description: "tracked_web = matched GA4 session | offline_store = POS | b2b_wholesale = B2B | dark_traffic_web = web order no GA4 match | unknown = unclassified"
  }

  dimension: tracking_confidence {
    label: "Tracking Confidence"
    type: string
    sql: CASE
      WHEN ${TABLE}.order_origin = 'tracked_web'    THEN '1. High'
      WHEN ${TABLE}.order_origin IN ('offline_store','offline_callcenter','b2b_wholesale','subscription') THEN '2. Structural gap'
      WHEN ${TABLE}.order_origin = 'dark_traffic_web' THEN '3. Fixable gap'
      ELSE '4. Unknown'
    END ;;
  }

  measure: sfcc_net_revenue {
    label: "SFCC Net Revenue (€)"
    type: sum
    sql: ${TABLE}.sfcc_net_revenue ;;
    value_format_name: decimal_2
    description: "SFCC gross revenue after deducting processed refunds."
  }

  measure: ga4_reported_revenue {
    label: "GA4 Reported Revenue (€)"
    type: sum
    sql: ${TABLE}.ga4_revenue ;;
    value_format_name: decimal_2
    description: "Revenue as reported by the GA4 purchase event."
  }

  measure: revenue_delta {
    label: "Revenue Gap SFCC vs GA4 (€)"
    type: number
    sql: ${sfcc_net_revenue} - ${ga4_reported_revenue} ;;
    value_format_name: decimal_2
    description: "Positive = SFCC reports more than GA4 (normal, expected)."
  }

  measure: revenue_delta_pct {
    label: "Revenue Gap %"
    type: number
    sql: (${sfcc_net_revenue} - ${ga4_reported_revenue}) / NULLIF(${sfcc_net_revenue}, 0) ;;
    value_format_name: percent_1
    description: "Benchmark: below 15% for web-only orders."
  }

  measure: web_match_rate {
    label: "Web Order Match Rate"
    type: number
    sql: COUNT(DISTINCT CASE WHEN ${TABLE}.session_id IS NOT NULL AND ${TABLE}.channel IN ('web','storefront') THEN ${TABLE}.order_id END)
         / NULLIF(COUNT(DISTINCT CASE WHEN ${TABLE}.channel IN ('web','storefront') THEN ${TABLE}.order_id END), 0) ;;
    value_format_name: percent_1
    description: "Match rate for web storefront orders only. Target: above 85%."
  }

  measure: tracked_revenue {
    label: "Tracked Revenue (€)"
    type: sum
    sql: CASE WHEN ${TABLE}.order_origin = 'tracked_web' THEN ${TABLE}.sfcc_net_revenue END ;;
    value_format_name: decimal_2
  }

  measure: fixable_gap_revenue {
    label: "Fixable Gap Revenue (€)"
    type: sum
    sql: CASE WHEN ${TABLE}.order_origin = 'dark_traffic_web' THEN ${TABLE}.sfcc_net_revenue END ;;
    value_format_name: decimal_2
    description: "Revenue from web orders with no GA4 match. Financial cost of tracking gaps."
  }
}

Governance Best Practices

1. Document every measure with a description that answers two questions

The description field surfaces as a tooltip in the Explore UI. It must answer: what does this metric include? and what does it explicitly exclude? “Total revenue” is useless. “Net revenue from confirmed web orders, after deducting processed refunds. Excludes cancelled, fraud, and B2B orders.” is genuinely useful.

2. Version control every model change in Git

Every change to the LookML model is a commit. This creates a complete audit trail that makes answering “why did net revenue drop 8% on March 15th?” a solvable question.

# commit a3f9b12 — Update net_revenue to exclude pending refunds
# - sql: ${TABLE}.order_value - COALESCE(${TABLE}.refund_amount, 0) ;;
# + sql: ${TABLE}.order_value - COALESCE(
# +        CASE WHEN ${TABLE}.refund_status = 'processed'
# +        THEN ${TABLE}.refund_amount END, 0) ;;
# Every dashboard using net_revenue immediately reflected the change.

3. Use labels that business users can read

sfcc_orders.net_revenue is ambiguous. "Net Revenue (€) — SFCC, post-refund" is not. For reconciliation metrics, always include the source system in the label.

4. Hide technical fields aggressively

Join keys, intermediate IDs, raw numeric fields — hide with hidden: yes. They remain available for calculations and joins, but do not clutter the Explore field picker.

5. Use value format names consistently

measure: net_revenue  { value_format_name: decimal_2 }  # 12,345.67
measure: order_count  { value_format_name: decimal_0 }  # 12,346
measure: match_rate   { value_format_name: percent_1 }  # 87.4%
measure: revenue_delta { value_format: "+0.00;-0.00;0.00" }

The Complete Governance Workflow


What Comes Next


Sources and Further Reading