
AC-Level Derived Metrics — Design Document, v0.1

Status: v0.1 design (2026-05-25). Captures the Reading B architecture aligned with Huayin on 2026-05-24, before code lands. Lives alongside coframe_metric_engine_design_v0_1.md; folds into a v2.3 platform-design supplement once implementation ships.

Author: reeeneeee

Audience: implementers of Coframe Core v2.3+ (resolver / AC catalog / metric engine touches); reviewers checking the architectural commitments before code lands.

Scope. This document specifies how Coframe Core represents and executes AC-level derived metrics — metric families declared as formulas over other metric families, computed post-aggregation. It covers the AC schema extension, the engine's new dispatch branch, the planner's involvement (or deliberate non-involvement), the validation rules at COMMIT time, the trade-offs accepted, and the non-goals.

Out of scope. User-level (ad-hoc, in-query) derived metrics — those remain a Frame-QL expression concern and are not part of the AC's declared surface. Pro-tier features (SQL pushdown of formulas, multi-source schema picking for derived families, cross-process engine coordination on derived results) are noted as forward-compatible exits but not specified here.


0. Naming convention

Throughout this document and the implementation:

| Term | Definition |
|------|------------|
| profit | revenue − cost (cost may be all-in or specifically COGS depending on the AC; the name distinguishes from the gross/margin variants) |
| gross profit | revenue − COGS (only direct cost of goods sold) |
| margin | profit / revenue |
| gross margin | gross profit / revenue |

The canonical retail AC's cost column is the all-in per-transaction cost, so the canonical derived example in this document is profit = revenue - cost, not gross_profit = revenue - COGS.
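As a worked illustration of the four terms, here is a minimal Python sketch with made-up figures. The numbers and the `other_cost` split are assumptions for illustration only; they are not values from the retail AC.

```python
# Illustrative figures only; none of these numbers come from the retail AC.
revenue = 100.0
cogs = 40.0          # direct cost of goods sold
other_cost = 20.0    # hypothetical non-COGS cost
cost = cogs + other_cost           # all-in cost, as in the canonical retail AC

profit = revenue - cost            # all-in profit
gross_profit = revenue - cogs      # COGS only
margin = profit / revenue
gross_margin = gross_profit / revenue

print(profit, gross_profit, margin, gross_margin)
```

With an all-in cost column, the first and third lines of the table are the ones the AC can express; the gross variants would need a separate COGS column.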


1. Motivation

The Metric Engine (Manual ch. 11) ships compose() with a frame_expression parameter that lets callers post-process the merged Frame programmatically:

revenue_lf = engine.serve(METRIC, "revenue", ("region",), ...)
cost_lf    = engine.serve(METRIC, "cost",    ("region",), ...)
frame = engine.compose([revenue_lf, cost_lf], ("region",),
    frame_expression=lambda lf: lf.with_columns(
        (pl.col("revenue") - pl.col("cost")).alias("profit"),
    ),
)

This works, and the retail demo's D8 example exercises it end-to-end. But a Frame-QL author cannot write SELECT region, profit AT region directly — the planner doesn't recognize profit as a family, doesn't know it derives from revenue - cost, and would refuse the query with UnknownFamilyError.

The AC author's intent — "expose profit as a first-class family for consumers" — has no declaration surface today. This design adds one.


2. The Reading B principle (load-bearing)

The architectural commitment, agreed before implementation:

The user does not know profit is a derived metric. The backend doesn't know either (no SQL pushdown). The metric engine knows and treats it as a frame_expression internally. The planner stays unaware (treats it as a family with a schema-virtual marker).

The "secret agreement" is between the AC declaration and the metric engine:

  • The AC declares the formula.
  • The engine, when asked to serve the derived family, recursively serves the components and applies the formula via compose()'s existing frame_expression hook.
  • The planner sees the derived family as schema-virtual — it skips Rule 3 (schema selection) and Rule 4 (cross-schema coherence) for derived families and passes the request through.

The principle is formula, not data: a derived metric is a declaration of how to derive, not a new column to materialise. Derived results are never written to the engine's manifest. The substrate stays primitive.


3. Three-tier picture

SELECT region, profit AT region
┌──────────────────────────────────────────────────────────────────┐
│ USER / Frame-QL author                                            │
│   Sees: profit as a family. No knowledge that it's derived.       │
│   Schema-blind (always — Frame-QL doesn't mention schemas).       │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ PLANNER                                                            │
│   Rule 1: profit → AC catalog confirms it's a family               │
│           (does not act on "derived" marker)                       │
│   Rule 2: anchor (region,) reachable; inherited ip_reducers OK    │
│   Rule 3: SKIPPED — derived families are schema-virtual            │
│           (family_root.schema = <virtual>)                         │
│   Rule 4: SKIPPED — no schemas picked, no coherence question      │
│                                                                    │
│   Output: one ServingRequest for profit at (region,).             │
│   Planner does NOT decompose. It passes through.                  │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ METRIC ENGINE                                                      │
│   serve(METRIC, "profit", (region,)) entry point.                  │
│                                                                    │
│   Branch 0 (NEW): is this family declared `derived` in the AC?    │
│     Yes — look up the AC's derivation:                            │
│       { formula: revenue - cost, inputs: [revenue, cost] }         │
│     Recurse for each input (uses each input's family_root.schema):│
│       revenue_lf = self.serve(METRIC, "revenue", (region,), ...)   │
│       cost_lf    = self.serve(METRIC, "cost",    (region,), ...)   │
│     Internally compose:                                           │
│       result = compose([revenue_lf, cost_lf], (region,),          │
│           frame_expression = (rev - cost).alias("profit"))         │
│     Return result.                                                │
│                                                                    │
│   Branches 1/2/3 unchanged.  Recursive serves naturally hit them. │
│                                                                    │
│   The Branch-0 result is NEVER memoised. The substrate stays      │
│   primitive. Formula re-runs each call (trivial cost).            │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ BACKEND (DuckDB / Polars / SQLite)                                 │
│   Sees only aggregate() calls for primitive families:             │
│     - SUM(revenue) AT region FROM transactions                     │
│     - SUM(cost)    AT region FROM transactions                     │
│   No knowledge of profit. No schema for it. Nothing.              │
└──────────────────────────────────────────────────────────────────┘

Who knows what (single-source-of-truth)

| Knowledge | Owner |
|-----------|-------|
| profit exists as a family-name | AC declaration (visible in catalog) |
| profit is derived (vs primitive) | AC declaration (surfaced to engine via Branch-0 lookup) |
| Formula: revenue - cost | AC declaration (read by engine at Branch-0 time) |
| Inherited ip_reducers (computed from inputs) | AC declaration (validated + cached at COMMIT) |
| Schema-virtual marker | AC declaration (tells planner to skip Rules 3 / 4) |
| Component → canonical-source-schema | Each component family's own family_root.schema |
| Frame_expression construction | Engine (at Branch-0 time, from the AC's formula) |
| Derived result memoisation | Nowhere. Pure formula; re-runs every call |

4. AC schema for derived families

4.1 Shape

A new optional derived block on MetricFamily. Both formula syntaxes are accepted; the loader canonicalises to the operator-AST form internally.

Shape A (formula-as-string, author-friendly):

metric_families:
  - name: profit
    derived:
      formula: "revenue - cost"
      inputs: [revenue, cost]
    # ip_reducers omitted — inherited from inputs (see §4.2)
    # family_root omitted — schema-virtual derived families have no root column

Shape B (operator-AST, planner-friendly):

metric_families:
  - name: profit
    derived:
      operator: SUBTRACT
      args:
        - {family: revenue}
        - {family: cost}

The two shapes are equivalent. Shape A reads more naturally and is the recommended authoring form; Shape B is what the loader canonicalises to and what the engine consults at Branch-0 time. The expression parser supports the operators in Chapter 10's catalog (SUBTRACT, ADD, MULTIPLY, DIVIDE = MAP_DIV, plus parenthesisation).
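The canonicalisation from Shape A to Shape B can be sketched with Python's standard `ast` module, which already handles precedence and parenthesisation. This is a minimal sketch under stated assumptions: the function names `canonicalise` and `_convert` and the nested-dict output are hypothetical, and the real loader may well use its own parser.

```python
import ast

# Mapping from Python AST operator nodes to the operator names listed above.
_OPERATORS = {
    ast.Sub: "SUBTRACT",
    ast.Add: "ADD",
    ast.Mult: "MULTIPLY",
    ast.Div: "DIVIDE",
}


def _convert(node: ast.AST) -> dict:
    """Recursively convert one expression node into a Shape B style dict."""
    if isinstance(node, ast.Name):
        return {"family": node.id}
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return {
            "operator": _OPERATORS[type(node.op)],
            "args": [_convert(node.left), _convert(node.right)],
        }
    # Literals, function calls, unary operators, ** and so on are rejected.
    raise ValueError(f"unsupported syntax in formula: {ast.dump(node)}")


def canonicalise(formula: str) -> dict:
    """Turn a Shape A formula string into a nested operator/args dict."""
    return _convert(ast.parse(formula, mode="eval").body)


print(canonicalise("revenue - cost"))
# {'operator': 'SUBTRACT', 'args': [{'family': 'revenue'}, {'family': 'cost'}]}
```

Rejecting everything outside the four binary operators keeps the accepted grammar as small as the operator catalog it targets.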

4.2 Inherited ip_reducers

The framework computes derived families' ip_reducers from the components, at COMMIT time:

| Components' ip_reducers | Derived family's ip_reducers |
|-------------------------|------------------------------|
| All inputs SUM (or all the same partition-invariant linear operator) | Same operator, same block set (linearity preserves rollup) |
| Mixed operators (e.g., SUM and MAX) | Empty: derived family is anchor-locked |
| Any input is anchor-locked | Empty: derived family is anchor-locked |
| All inputs share a non-empty block set | Same block set |
| Inputs have conflicting block sets | Empty: derived family is anchor-locked |
| Operator is non-linear (DIVIDE, MULTIPLY) | Empty: division/multiplication never partition-invariant |

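If the AC author also declares ip_reducers explicitly on a derived family, the framework still runs the inference and compares the two: a match is accepted, and a mismatch is refused with DerivedFamilyReducerMismatchError (check D-104, §6). The inference is the canonical source.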
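The inference table can be read as a short decision procedure. The sketch below is one possible encoding, not the framework's implementation: the `Reducer` tuple, `infer_ip_reducers`, and the choice to treat SUM as the only linear reducer are all assumptions made for illustration.

```python
from typing import NamedTuple


class Reducer(NamedTuple):
    operator: str        # e.g. "SUM"
    a_block: tuple = ()  # block set; empty tuple mirrors a_block: []


# Formula operators that preserve rollup (DIVIDE / MULTIPLY never do).
LINEAR_FORMULA_OPS = {"ADD", "SUBTRACT"}
# Assumption: SUM is the only partition-invariant linear reducer modelled here.
LINEAR_REDUCERS = {"SUM"}


def infer_ip_reducers(formula_ops: set, inputs: list) -> list:
    """Return the inherited reducers, or [] when the family is anchor-locked.

    formula_ops: operator names used anywhere in the formula.
    inputs: one list of Reducer per input family.
    """
    if not formula_ops <= LINEAR_FORMULA_OPS:
        return []                                  # non-linear formula
    if not inputs or any(not reducers for reducers in inputs):
        return []                                  # some input is anchor-locked
    first = set(inputs[0])
    if any(set(reducers) != first for reducers in inputs[1:]):
        return []                                  # mixed operators or block sets
    if any(r.operator not in LINEAR_REDUCERS for r in first):
        return []                                  # shared reducer is not linear
    return sorted(first)


sum_all = [Reducer("SUM")]
print(infer_ip_reducers({"SUBTRACT"}, [sum_all, sum_all]))  # profit
print(infer_ip_reducers({"DIVIDE"}, [sum_all, sum_all]))    # a ratio
```

Every failing row of the table collapses to the same outcome, an empty list, which is why the function has several early returns and a single success path.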

4.3 Schema-virtual marker

The derived family's family_root.schema is set to the sentinel <virtual> (string literal). The planner recognises this sentinel and:

  • Skips Rule 3 (schema selection) — no candidate schemas to pick from.
  • Skips Rule 4 (cross-schema coherence) — no schemas to coordinate.
  • Passes the request through to the engine with source_schemas=[] (the engine ignores this for derived families).

Other framework components treating <virtual> schemas as a no-op:

  • The DQ / Validation Surface (Chapter 7) — no Phase 2 attestation possible (no backend table to verify against); the Phase 3 sibling-coherence check is automatically waived (the derived family has no in-AC sibling — it's a singleton-like family from the verification perspective).
  • The data-API protocol (Chapter 6) — never queried with a <virtual> schema name; backends never see it.
  • The Workbench Table Explorer — does not list <virtual> schemas (they don't exist as backend tables).
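The planner-side behaviour described at the start of this subsection amounts to a single early return. The following is a minimal sketch, assuming a hypothetical `ServingRequest` shape and a hypothetical `pick_schemas` callable standing in for Rules 3 and 4; only the `<virtual>` sentinel and the empty `source_schemas` come from the text.

```python
from dataclasses import dataclass, field

VIRTUAL_SCHEMA = "<virtual>"  # sentinel string literal from this section


@dataclass
class ServingRequest:  # hypothetical field layout, for illustration only
    family: str
    anchor: tuple
    source_schemas: list = field(default_factory=list)


def plan_family(family, anchor, root_schema, pick_schemas):
    """Build the request for one family; pick_schemas stands in for Rules 3/4."""
    if root_schema == VIRTUAL_SCHEMA:
        # Derived family: skip schema selection and coherence, pass through.
        return ServingRequest(family, anchor, [])
    return ServingRequest(family, anchor, pick_schemas(family, anchor))


def pick(family, anchor):
    # Toy schema picker used only for this example.
    return ["transactions"]


print(plan_family("profit", ("region",), VIRTUAL_SCHEMA, pick))
print(plan_family("revenue", ("region",), "transactions", pick))
```

The planner branches on the sentinel alone and never inspects the formula, which is what keeps it unaware of derivation.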

4.4 Cycles

The family-DAG (Manual §2.7.8) is required acyclic. Adding derived families extends this: a derived family's inputs cannot transitively include itself. AC validation at COMMIT performs a cycle check over the inputs graph and refuses with DerivedFamilyCycleError if one is found.
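A cycle check of this kind is commonly written as a depth-first search with three node colours. The sketch below assumes the inputs graph is available as a plain dict from derived family name to its input names; `check_acyclic` is a hypothetical helper, and the exception class is a local stand-in for the error type named above.

```python
class DerivedFamilyCycleError(ValueError):
    """Local stand-in for the error type named in this section."""


def check_acyclic(inputs_of: dict) -> None:
    """Raise DerivedFamilyCycleError if any derived family reaches itself.

    inputs_of maps each derived family to its declared inputs; names that
    are not keys are treated as primitive families (leaves).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully checked
    colour = {name: WHITE for name in inputs_of}

    def visit(name, path):
        if name not in inputs_of:      # primitive family: nothing to follow
            return
        if colour[name] == GREY:       # back-edge: name is already on the path
            cycle = path[path.index(name):] + [name]
            raise DerivedFamilyCycleError(" -> ".join(cycle))
        if colour[name] == BLACK:
            return
        colour[name] = GREY
        for child in inputs_of[name]:
            visit(child, path + [name])
        colour[name] = BLACK

    for name in inputs_of:
        visit(name, [])


# The retail AC's derived families pass the check.
check_acyclic({
    "profit": ["revenue", "cost"],
    "gross_margin_pct": ["profit", "revenue"],
})
```

Carrying the path along lets the error message name the offending chain, which is usually what an AC author needs in order to fix the declaration.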

4.5 Complete example (the retail AC)

metric_families:
  - name: revenue
    family_root: {schema: transactions, column: revenue}
    ip_reducers: [{operator: SUM, a_block: []}]
    cache_hint:
      materialize_at: [[region], [region, day]]

  - name: cost
    family_root: {schema: transactions, column: cost}
    ip_reducers: [{operator: SUM, a_block: []}]
    cache_hint:
      materialize_at: [[region]]

  - name: profit                          # NEW: derived family
    derived:
      formula: "revenue - cost"
      inputs: [revenue, cost]
    # ip_reducers inferred: [{operator: SUM, a_block: []}]
    # family_root inferred: {schema: <virtual>, column: profit}

  - name: gross_margin_pct                # NEW: ratio (anchor-locked)
    derived:
      formula: "profit / revenue"
      inputs: [profit, revenue]
    # ip_reducers inferred: [] (DIVIDE is never partition-invariant)
    # family_root inferred: {schema: <virtual>, column: gross_margin_pct}

Note gross_margin_pct derives transitively (uses profit, which itself derives from revenue and cost). The engine's recursive serve handles this naturally: the Branch-0 dispatch for gross_margin_pct calls serve("profit", ...), which is itself a derived family → its Branch-0 calls serve("revenue", ...) and serve("cost", ...). Two levels of recursion, both transparent to the planner.


5. Engine Branch-0 dispatch

5.1 The new branch

The engine's serve() gains a new branch ahead of the existing three:

def serve(self, domain, dataset_id, anchor, *, operator=None, ...):
    if domain == Domain.METRIC and self.ac.is_derived_family(dataset_id):
        # Branch 0 — NEW for v2.3.
        family = self.ac.derived_family(dataset_id)

        # Recursively serve each input. Each recursive call carries
        # the input's family_root.schema as source_schemas, so the
        # planner's earlier schema-virtual hand-off is unwound here.
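        input_lfs = []
        for input_name in family.input_families:
            if self.ac.is_derived_family(input_name):
                # A derived input re-enters Branch 0, which needs only the
                # name and anchor. It has no backend source, and an
                # anchor-locked one has empty ip_reducers, so indexing
                # ip_reducers[0] below would raise IndexError.
                input_lfs.append(self.serve(Domain.METRIC, input_name, anchor))
                continue
            input_family = self.ac.metric_family(input_name)
            input_lf = self.serve(
                Domain.METRIC,
                input_name,
                anchor,
                operator=input_family.ip_reducers[0].operator,
                partition_invariant=input_family.ip_reducers[0].partition_invariant,
                source_table=input_family.family_root.schema,
                source_column=input_family.family_root.column,
                source_schemas=[input_family.family_root.schema],
            )
            input_lfs.append(input_lf)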

        # Compose with the AC-declared formula, label as derived name.
        frame = self.compose(
            input_lfs,
            anchor,
            frame_expression=family.as_frame_expression(),
        )
        # compose() returns a DataFrame; convert to LazyFrame to match
        # the serve() contract (returns LazyFrame, not DataFrame).
        return frame.lazy()

    # Branches 1 / 2 / 3 unchanged …
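The step that turns the AC's formula into something compose() can apply is a recursive walk over the operator AST. To stay self-contained, the sketch below evaluates over plain dict rows rather than Polars frames; `as_row_expression` and `with_derived_column` are hypothetical names, and the real as_frame_expression() would presumably build a Polars expression with the same recursive structure.

```python
import operator

# Operator names from the AC's operator-AST form, mapped to Python callables.
_APPLY = {
    "SUBTRACT": operator.sub,
    "ADD": operator.add,
    "MULTIPLY": operator.mul,
    "DIVIDE": operator.truediv,
}


def as_row_expression(node: dict):
    """Compile an operator-AST dict into a callable over one row (a dict)."""
    if "family" in node:
        name = node["family"]
        return lambda row: row[name]          # leaf: read the input column
    fn = _APPLY[node["operator"]]
    left, right = (as_row_expression(arg) for arg in node["args"])
    return lambda row: fn(left(row), right(row))


def with_derived_column(rows: list, name: str, node: dict) -> list:
    """Return new rows with the derived column added (inputs are kept)."""
    expr = as_row_expression(node)
    return [{**row, name: expr(row)} for row in rows]


# Stand-in for the joined revenue/cost frame at anchor (region,).
merged = [
    {"region": "east", "revenue": 100.0, "cost": 60.0},
    {"region": "west", "revenue": 80.0, "cost": 50.0},
]
profit_ast = {"operator": "SUBTRACT",
              "args": [{"family": "revenue"}, {"family": "cost"}]}
print(with_derived_column(merged, "profit", profit_ast))
```

Because the expression is compiled once and applied per row, nothing about the formula needs to be stored alongside the data.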

5.2 Recursion termination

The recursion terminates because:

  • AC validation (§4.4) guarantees the inputs graph is acyclic.
  • Each recursive serve() call is at the same anchor; the recursion depth is bounded by the family-DAG's max derivation depth.
  • At each level, the recursion either hits another derived family (one more recursion) or hits a primitive family (terminates into Branch 1/2/3).

In practice, max derivation depth is 2–3 (gross_margin_pct = profit / revenue, where profit = revenue - cost, is depth 2). Cyclic declarations, the only case that would recurse without end, are rejected by validation (§4.4).
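The depth figure can be computed directly from the inputs graph. In this small sketch, `derivation_depth` is a hypothetical helper that assumes the graph has already passed the cycle check, and primitives are taken to have depth 0.

```python
def derivation_depth(name: str, inputs_of: dict) -> int:
    """Depth of a family in the inputs graph; primitive families are depth 0.

    Assumes inputs_of is acyclic (guaranteed by COMMIT-time validation).
    """
    if name not in inputs_of:
        return 0
    return 1 + max(derivation_depth(i, inputs_of) for i in inputs_of[name])


retail = {
    "profit": ["revenue", "cost"],
    "gross_margin_pct": ["profit", "revenue"],
}
print(derivation_depth("profit", retail))            # 1
print(derivation_depth("gross_margin_pct", retail))  # 2
```

This depth is also the maximum number of nested Branch-0 frames a single serve() call can open before it reaches primitives.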

5.3 Memoisation discipline (the load-bearing non-decision)

Branch 0 NEVER calls ingest(). Derived results are not written to the manifest. The Parquet substrate contains only primitive (family, anchor) entries. Consequences:

  • No invalidation cascade when components refresh. When revenue is re-materialised (via engine.refresh() or stability-cutoff advance), there is nothing downstream to invalidate: the next call to serve("profit", ...) re-runs the formula over the fresh revenue, and cost is still served from cache if it has not changed.
  • No double-storage. A stored profit@(region,) would only duplicate what revenue@(region,) - cost@(region,) already carries, so it is not stored.
  • The manifest's invariant — "every entry represents an aggregate of a primitive family from observed data" — is preserved. Derived families are never primitive; storing them would break the invariant.

This makes the minimalist instinct structural: the engine's substrate is primitive observations; everything compositional happens at frame-assembly time.
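The discipline can be demonstrated with a toy engine whose manifest only ever holds primitive entries. Everything here is a simplified stand-in: `ToyEngine`, the dict-based "frames", and the `backend_calls` counter are assumptions for illustration and do not reflect the real engine's API.

```python
class ToyEngine:
    """Toy model: primitives are memoised, derived families never are."""

    def __init__(self, backend: dict, derived: dict):
        self.backend = backend    # family -> {region: aggregate}
        self.derived = derived    # family -> (input names, formula callable)
        self.manifest = {}        # (family, anchor) -> memoised primitive result
        self.backend_calls = 0

    def serve(self, family: str, anchor: tuple) -> dict:
        if family in self.derived:                    # Branch 0: no ingest
            inputs, formula = self.derived[family]
            frames = [self.serve(name, anchor) for name in inputs]
            return {key: formula(*(f[key] for f in frames))
                    for key in frames[0]}
        entry = (family, anchor)
        if entry not in self.manifest:                # cold: read and memoise
            self.backend_calls += 1
            self.manifest[entry] = dict(self.backend[family])
        return self.manifest[entry]                   # warm: cache hit


engine = ToyEngine(
    backend={"revenue": {"east": 100.0, "west": 80.0},
             "cost": {"east": 60.0, "west": 50.0}},
    derived={"profit": (["revenue", "cost"], lambda r, c: r - c),
             "gross_margin_pct": (["profit", "revenue"], lambda p, r: p / r)},
)
anchor = ("region",)
print(engine.serve("gross_margin_pct", anchor))
print(sorted(engine.manifest))   # only revenue and cost entries
```

Dropping the revenue entry from the manifest and serving profit again re-reads revenue only; there is no derived entry to invalidate, which is the cascade-free property described above.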

5.4 The two-compose dance

When the planner-level pipeline composes a full Frame including a derived family:

planner.execute_query("SELECT region, profit, units_sold AT region")
  → engine.serve(METRIC, "profit",     ("region",))   ← Branch 0:
      → engine.serve(METRIC, "revenue", ("region",))  ← Branch 1 cache hit
      → engine.serve(METRIC, "cost",    ("region",))  ← Branch 1 cache hit
      → compose([revenue_lf, cost_lf], ("region",), frame_expression=...)
      → returns a LazyFrame [region, profit]
  → engine.serve(METRIC, "units_sold", ("region",))   ← Branch 3 + memoise
  → compose([profit_lf, units_sold_lf], ("region",),  ← planner-level compose
       order_by=..., limit=..., ...)
  → returns the final Frame

Two compose() invocations — one inner (engine Branch 0) producing the derived-metric LazyFrame, one outer (planner) assembling the full Frame and applying post-grain ops. The inner compose() does the join + frame_expression; the outer compose() does the join + post-grain ops. Cost is negligible (composition is metadata over LazyFrames until collect()).


6. AC validation at COMMIT

The framework's COMMIT-time checks gain five new rules for derived families:

| ID | Check | Detail |
|----|-------|--------|
| D-100 | All input families exist | Every name in derived.inputs resolves to a declared family in the same AC |
| D-101 | Inputs are FD-compatible at all reachable anchors | Each input is reachable to every other input's anchor via the FD-DAG (so a join at the target grain is well-defined) |
| D-102 | No derivation cycles | The inputs graph (derived → its inputs, recursive) is acyclic |
| D-103 | Formula references only inputs | The formula's identifiers all appear in derived.inputs; no implicit dependencies |
| D-104 | ip_reducers consistent with inferred | If the AC author explicitly declared ip_reducers, they match the framework's inference; if they don't, refuse with DerivedFamilyReducerMismatchError |

These join the integrity catalog (Manual §2.10) under a new D-1xx class. The Workbench Validation surface (Chapter 7) reports them at AC validation time.
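Two of these checks, D-100 and D-103, need only the declaration itself and can be sketched in a few lines. The function `validate_derived` and its list-of-strings return value are hypothetical; identifier extraction uses Python's `ast` module as a convenient stand-in for the framework's own formula parser.

```python
import ast


def validate_derived(name: str, formula: str, inputs: list,
                     declared_families: set) -> list:
    """Return a list of problem strings for checks D-100 and D-103."""
    problems = []

    # D-100: every declared input must be a family in the same AC.
    missing = [i for i in inputs if i not in declared_families]
    if missing:
        problems.append(f"D-100: {name} references undeclared families {missing}")

    # D-103: every identifier in the formula must appear in inputs.
    tree = ast.parse(formula, mode="eval")
    identifiers = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    stray = sorted(identifiers - set(inputs))
    if stray:
        problems.append(f"D-103: {name} formula uses {stray} not listed in inputs")

    return problems


families = {"revenue", "cost", "profit"}
print(validate_derived("profit", "revenue - cost", ["revenue", "cost"], families))
print(validate_derived("profit", "revenue - cogs", ["revenue", "cost"], families))
```

Collecting problems in a list rather than raising on the first one fits a validation surface that reports all findings for an AC at once.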


7. What's accepted (trade-offs)

The design accepts these trade-offs explicitly:

7.1 Derived families always read from canonical sources

Each input family's family_root.schema determines where the engine recursively reads from. If revenue lives both in transactions (detail) and region_daily_summary (pre-aggregate), the recursive serve("revenue", ...) always routes to transactions (the family-root's schema).

Why this is fine in practice: the cache layer absorbs the cost. Once revenue@(region,) is memoised — and it will be, since the canonical retail AC declares it in cache_hint — the recursive serve hits Branch 1 instantly, regardless of which source it was originally read from.

Forward-compat exit: Pro could surface per-input schema picking back to the planner if real workload shows the canonical-only restriction is costly. The AC schema doesn't preclude it (we can add derived.input_schemas in v2.4 if needed).

7.2 Two compose() invocations per query that includes a derived family

See §5.4. The cost is negligible (LazyFrame metadata composition). The alternative — engine emits structured "derived intent" tokens that the planner-level compose understands — adds API surface for no measurable benefit.

7.3 Branch-0 results never memoised

See §5.3. The cost: the formula re-runs on every call. The savings: invalidation logic disappears entirely. The trade is unambiguously good for v1.

7.4 No SQL pushdown of formulas

A derived metric is never sent to the backend as a single computed expression (e.g., SELECT SUM(revenue) - SUM(cost) AS profit FROM transactions GROUP BY region). The backend always sees the primitives. This costs us one extra round-trip per query (the components must arrive as separate columns), which the engine's component-level cache substantially absorbs.

Forward-compat exit: Pro could add a "fused" mode where the engine constructs and pushes down a single aggregate request when (a) all components live in the same schema, (b) the formula is expressible in the backend's SQL dialect, and (c) profiling indicates round-trip cost dominates. v0.1 leaves this unimplemented.

7.5 No multi-level reasoning at the planner

A SELECT gross_margin_pct AT region from the AC in §4.5 fires two levels of engine recursion (gross_margin_pct → profit → {revenue, cost}). The planner sees only the top-level family request; it doesn't reason about the chain. This is intentional — keeping the planner unaware preserves Reading B.


8. Non-goals

The following are explicitly out of scope for v0.1:

| Non-goal | Note |
|----------|------|
| User-level derived metrics in Frame-QL | A Frame-QL author can already write SELECT region, SUM(revenue) - SUM(cost) AS p AT region for ad-hoc derivation. This document covers only the AC-level (persistent, catalogable) case. |
| Re-ingestion of derived results as new schemas | A v3 affordance: emit a derived family's materialised result as a new physical table in the backend, then declare that table as a primitive schema in the same AC. Different design problem. |
| Per-row derived families (singletons) | The existing singleton family mechanism (Manual §2.7.7), e.g., gross_margin_pct as a stored per-transaction MAP_DIV column, is unchanged. This document covers post-aggregation derivation, not per-row. |
| NLQ surface for derived families | NLQ work is its own design pass. Derived families surface in the AC catalog and are reachable from NLQ once that surface lands; no NLQ-specific machinery is added here. |
| MCP exposure of derived families | Same: the MCP surface treats derived families as catalog entries like any other family; nothing MCP-specific in this design. |

9. Implementation plan

Three slices, in dependency order:

Slice 1 — AC schema + validation

  • Extend MetricFamily Pydantic model with optional derived: DerivedSpec field.
  • Add DerivedSpec model: formula: str, inputs: list[str], optional operator: str + args: list[DerivedArg] (canonical AST form).
  • Add a formula parser (operators from Chapter 10: +, -, *, /, parens). Output: DerivedSpec AST.
  • COMMIT-time inference: compute ip_reducers from inputs; refuse on conflict if author also declared (D-104).
  • COMMIT-time validation: D-100 through D-104.
  • New error types: UnknownInputFamilyError, DerivedFamilyCycleError, DerivedFamilyReducerMismatchError.
  • Tests: AC loader fixtures with derived families (linear / anchor-locked / multi-level / cyclic-rejected).

Slice 2 — Engine Branch 0

  • Add is_derived_family(name) → bool and derived_family(name) → DerivedSpec helpers on the AC catalog.
  • Implement Branch 0 in engine.serve() per §5.1.
  • Implement DerivedSpec.as_frame_expression() → Callable[[pl.LazyFrame], pl.LazyFrame] (translates the AST to a Polars expression).
  • Tests: derived family at primitive depth; transitively-derived (depth 2); cache-hit on components; cold-cache full-recursive serve; anchor-locked refusal on a partition-invariance-violating ratio.

Slice 3 — Planner schema-virtual handling + end-to-end

  • Teach the planner's family resolver to honor family_root.schema == "<virtual>": skip Rule 3 / Rule 4 for derived families, pass through.
  • Update the resolver's plan execution to call engine.serve(METRIC, derived_family_name, ...) for derived family ServingRequests (same as for primitives — no special case needed in the executor).
  • Update the retail demo AC (drafts/data/retail_demo/retail.coframe/ac.yaml) to declare profit as a derived family.
  • Update the walkthrough Step 7a + smoke test to use the AC-level SELECT region, profit AT region form instead of programmatic compose().
  • Update Workbench Query UI: derived families appear in the family browser; the served_from badge reflects the inner serves (typically engine_cache if components were warmed).
  • Tests: end-to-end SELECT region, profit AT region from Frame-QL through to Frame; assert served_from = engine_cache when components are warmed; cross-backend invariant test extended to cover derived families.

Estimated effort: ~300–500 LOC + tests across the three slices. Slice 1 is the largest (parser + validation); slice 2 is small (Branch 0 dispatch + AST → Polars translator); slice 3 is glue + integration tests.


10. Summary

AC-level derived metrics are added as formulas declared on the AC, executed by the engine, transparent to the planner and the backend. The key architectural commitments:

  1. Formula, not data — derived results are never memoised. The substrate stays primitive.
  2. Engine owns the contract — Branch 0 in serve() recursively serves components and applies the AC-declared formula via compose()'s frame_expression.
  3. Planner stays unaware — derived families are schema-virtual; Rules 3 / 4 short-circuit on a sentinel marker.
  4. Backend never sees them — the backend's surface is unchanged; only primitives are queried.
  5. ip_reducers are inferred — the AC author declares the formula; the framework computes whether the derived family rolls up, and how.

The design follows the Reading B principle agreed before implementation: maximum architectural restraint, single-source-of-truth ownership, no new vocabulary in components that don't need it. The engine substrate stays clean; the planner stays simple; the AC author gains a first-class declaration surface for derived metrics.

When implementation lands (per §9), the canonical retail AC will declare profit directly, and the Frame-QL author will write SELECT region, profit AT region and have it just work — with served_from: engine_cache on the second call.