
Coframe Metric Engine — Design Document, v0.2

Per-AC multi-domain query engine with persistent memoization, built on Polars. Hosts METRIC and QM (quasi-metadata) domains in v0.1; additional domains (LINEAGE, SKETCH, ...) are forward-compatible. Subsumes what an earlier iteration called the "cache layer" into a first-class execution component. Phase 8 deliverable per the v2.1 supplement roadmap.

0. Status

  • Draft v0.2, 2026-05.
  • v0.2 change: broadened scope from metric-only to multi-domain (METRIC + QM). The QM domain hosts quasi-metadata as engine-managed LazyFrames; Pydantic types become projections. Eliminates the duplicated per-backend kind-hint heuristic. See §2.7, §5.5, §13.4.
  • All 7 open questions from v0.1 resolved (see §15).
  • Not yet implemented. Slices 2-8 follow this doc.
  • The naming throughout uses "Metric Engine" rather than "cache" — see §1.5 for why. (Despite hosting QM as a co-resident domain, the public-facing name stays "Metric Engine" since metrics are the dominant use case.)

1. Purpose, scope, and the reframe from "cache"

1.1 What changed in the design

This document started as a cache-layer design. Through brainstorming, it became clear that the natural shape is not a cache wrapping a backend; it is a single-column query engine with persistent memoization as a byproduct of its operation. The collapse happened in three steps:

  1. The atomic logical unit of cached state is (metric_family, anchor) → values — a single column with its anchor. Wide-table caching introduces grain mismatches, duplications, and forced co-locations that single-column form avoids.
  2. Once the cache is asked "can you serve this metric at this anchor (possibly via FD-DAG rollup from a finer cached entry)?", the cache is no longer passive storage — it is making semantic decisions. If it can answer yes, it might as well serve.
  3. With (1) and (2), the distinction between "cache hit," "cache hit with rollup," and "fresh backend call" collapses into one per-metric serve() operation; what differs is only whether the answer is computed from materialized state or from the underlying backend. The materialization is a side effect of execution, not its precondition.

What this implies architecturally: the Metric Engine is the execution substrate for coframe queries. The DataAPIBackend becomes a thin single-metric-aggregate provider invoked only for misses. Frame composition, FD-DAG rollups, post-grain operations, and cross-source unification all move into the Metric Engine.

1.2 Why now

Three converging forces:

  • Phase 7 (coframe-polars) landed. We now have a production-grade Polars-based DataAPIBackend with cross-backend invariants verified against SQLite. Polars's columnar engine + lazy evaluation + Parquet IO are exactly the substrate a single-column query engine needs; we're not starting from scratch.
  • The W4 abstraction (L1/L2/L3 operator registry) is de-risked. The cross-backend invariant tests in Phase 7 establish that the planner's IR (AggregateRequest) is genuinely backend-agnostic. Inserting a Metric Engine layer between planner and backend doesn't fight any existing abstraction.
  • The FD-DAG is a first-class object. Most warehouse materialized-view advisors decide what to cache from workload statistics + cost models alone. Coframe's Metric Engine reasons from workload + the AC's declared FD-DAG — knowing revenue@(region, month) is FD-reachable from revenue@(region, day) isn't a statistical hypothesis but a declared invariant. This is a real differentiator and it deserves a first-class component to express it.

1.3 Scope (what the engine does)

The engine manages tabular AC-substrate data in two domains in v0.1:

  • Metric values: per-metric, per-anchor aggregates the planner routes through the engine when serving Frame-QL queries.
  • Quasi-metadata (QM): per-column profile rows + auxiliary distribution data (top-N values, etc.) the AC needs for integrity reasoning, kind-hint classification, and cost estimation.

Both domains share storage, manifest, and serve machinery; they differ in their semantics layer (FD-DAG traversal applies to metrics; QM is exact-match-or-compute). See §2.7.

For each AC, the engine:

  • Answers per-dataset serve requests: serve(domain, dataset_id, anchor) → LazyFrame returning a single-column-with-anchor result.
  • Persistently memoizes serve results in a per-AC store (Polars/Parquet on disk).
  • For the metric domain, walks the FD-DAG to find the cheapest serving path: exact match, rollup from a finer cached node, or backend fallback.
  • For the QM domain, returns exact-match or computes via the backend's table extraction + the engine's shared profiling logic.
  • Composes multi-metric query results by merging per-metric LazyFrames at a target grain (metric domain).
  • Evaluates frame-expressions (derived metrics, post-grain ops: HAVING, ORDER BY, LIMIT [PER]).
  • Evicts memoized entries under workload pressure (LRU + FD-DAG-aware for metrics; pure LRU for QM).
  • Honors stability-window-driven invalidation (v2.1 §6).
  • Surfaces materialization provenance so verification levels propagate correctly.

The Pydantic types (TableProfile, ColumnProfile, etc.) remain as the API contract surface — they're constructed as projections over engine-backed LazyFrames when callers want typed objects. Storage is Polars-native; the typed view is computed on read.

1.4 Non-scope (what the engine does not do)

  • No cross-AC sharing in v0.1. Each AC has its own Metric Engine instance with its own materialized store. Storage duplication for shared physical metrics is accepted; cross-AC sharing is a future optimization (see §15).
  • No backend-side computation of cross-schema joins. The Metric Engine assembles cross-schema results in its Polars substrate; the backend's job is single-metric scan + aggregate.
  • No automatic schema promotion. When a memoized entry has been consistently hot, the Metric Engine can surface a recommendation to promote it to a declared schema (and possibly a physical pre-aggregate), but the promotion itself is an AC-author action via the Workbench, not an engine action.
  • No real-time invalidation channels. The engine relies on the stability filter's hold-off window for validity guarantees. Sources of data freshness signals beyond that (e.g., warehouse CDC streams) are out of scope for v0.1.

1.5 Naming

The choice of "Metric Engine" over "Cache" is deliberate. "Cache" frames it as an optimization on top of something else; "Metric Engine" names it as a first-class execution component whose persistent materialization is a byproduct, not its purpose. The component sits at the same architectural level as DataAPIBackend — peer, not adapter.

  • Package name: coframe-metric-engine.
  • Module path: coframe.metric_engine.
  • Core type: MetricEngine (per-AC instance).

2. Design principles

2.1 Per-AC identity

Each AC owns its own Metric Engine instance. The engine knows the AC's FD-DAG, declared filter, and name_map; every memoized entry is pre-scoped to the AC's filter. Cross-AC sharing of physical-name caches is a future concern.

Rationale: cross-AC sharing introduces messy interactions around filter scopes, name remapping, and invalidation correlation. We pay a storage duplication cost to keep the design simple. The duplication is small relative to total compute saved; we revisit if profiling shows it bites.

2.2 Single-column-with-anchor as the atomic unit

The fundamental cache entry shape is (metric_family, anchor_tuple) → column_of_values. Each entry maps to exactly one node in the FD-DAG. Storage may physically group hot co-occurring entries into shared Parquet files for IO efficiency, but the logical contract stays atomic.

Implications:

  • Adding/removing metrics from materialization doesn't require schema-wide updates.
  • Incremental population (one new metric at a time) is trivial.
  • Cache reachability is FD-DAG graph reachability — clean algebra.
  • Sketch reducers (HLL, t-digest) fit naturally as single-column entries with sketch-state values.

2.3 Materialization is a byproduct, not a precondition

serve() always returns a LazyFrame. Whether it came from a materialized entry, a rollup over a finer materialized entry, or a fresh backend call is opaque to the caller. The decision of whether to materialize a serve result for future reuse is internal policy (§9), not part of the contract.

2.4 The Metric Engine is the execution substrate

Frame composition, FD-DAG rollups, post-grain operations, and cross-source unification all live in the engine's Polars substrate. Backends become thin single-metric providers; today's coframe-resolution execution layer (merge_blocks, post_grain_ops) collapses into the engine.

This is the largest architectural shift in the design. It is justified by:

  • Polars's LazyFrame composition lets the query optimizer plan across the whole chain of cache reads + fresh-result merges + post-grain ops, so projection pushdown (column pruning) and predicate pushdown apply end to end at no extra cost.
  • The cross-backend invariant becomes stronger: composition runs in one engine (Polars), not in Python lists with engine-dependent semantics.
  • Backends get simpler — including hypothetical join-less ones (the Metric Engine handles all multi-source assembly).

2.5 FD-DAG is an input to the engine, not just to the planner

Most warehouse cache advisors reason from workload statistics + cost models alone. The Metric Engine additionally uses the AC's declared FD-DAG to know:

  • Which rollup paths are sound (exact, not heuristic).
  • Which operators are partition-invariant (cached value can be further rolled up).
  • Which materialized entries serve which broader request sets.

This gives the engine principled, not heuristic, decisions about serving paths and memoization priorities.

2.6 Honor the stability filter for invariants

The v2.1 §6 stability filter defines a clean cache-validity window: data older than hold_off_days is provably stable and won't be modified. Cached aggregates over stable data don't expire from data change; they expire only when the stability window rolls forward.

This is a stronger guarantee than typical OLTP caches enjoy — we don't need invalidation channels or version vectors. A daily cron at the stability boundary is sufficient.

2.7 Multi-domain substrate

The engine's storage layer is domain-agnostic. Every entry is keyed by (domain, dataset_id, anchor_signature); the same Polars+Parquet storage + SQLite manifest serves all domains. The serve API takes a domain parameter; semantics layered above the storage layer specialize per domain.

v0.1 domains:

| Domain | dataset_id | Anchor shape | Semantics |
|--------|------------|--------------|-----------|
| METRIC | metric family name (e.g., revenue) | grain tuple (e.g., (region, day)) | FD-DAG rollup, partition-invariance, compose-to-Frame |
| QM | profile-kind id (e.g., column_profile, top_values) | (schema, column) or finer | exact-match-or-compute; no FD-DAG; Pydantic projections on read |

Why this matters: quasi-metadata is naturally tabular and benefits from the same columnar/lazy/Parquet substrate the metric domain needs. Today QM compute is duplicated across backends (SQLite via SQL aggregates + Python lists, Polars via Series ops); both implementations re-derive the same kind-hint heuristic. Hosting QM in the engine eliminates the duplication: backends extract tables to Polars LazyFrames; the engine's shared profiling logic runs once.

The Pydantic types stay (TableProfile, ColumnProfile, NumericStats, …) — but they shift from being the storage medium to being projections materialized from engine-backed LazyFrames when callers want typed objects. Best of both: typed contracts at the API surface, columnar storage underneath.

Future domains worth considering (not in v0.1):

  • LINEAGE: per-column lineage edges (today in Python dicts).
  • INTEGRITY_RESULTS: cached attestation results (when refreshed against stable data, valid until stability rolls forward).
  • SKETCH: per-anchor HLL / t-digest sketches as first-class cached objects.

These are explicitly future scope; v0.1 ships METRIC + QM only.

3. Architectural shape

3.1 Where the Metric Engine sits

   surface  ──→  coframe-resolution (parse, resolve, plan)
                  list[(metric_family, anchor, mvt, filter)]
                  ┌─────────────────────────────────────────────┐
                  │  Metric Engine (per-AC)                     │
                  │                                             │
                  │   serve(metric_family, anchor)              │
                  │     ├─ exact materialized? → return         │
                  │     ├─ finer cached → roll up → return      │
                  │     └─ neither → backend.aggregate (single) │
                  │                  → memoize (per §9) → return│
                  │                                             │
                  │   compose(entries, target_grain) → Frame    │
                  │     ├─ merge per-metric LazyFrames on anchor│
                  │     ├─ apply frame-expression (derived)     │
                  │     └─ post-grain ops (HAVING/ORDER/LIMIT)  │
                  └────────────────────┬────────────────────────┘
                                       ▼ (misses only)
                              ┌──────────────────────┐
                              │  DataAPIBackend      │
                              │ (single-metric calls)│
                              └──────────────────────┘

3.2 Lifecycle

The Metric Engine instance is created at AC COMMIT time (per v2.1 §5.2). The AC's structure is frozen at COMMIT, so the FD-DAG, filter, and name_map the engine relies on are stable for the engine's lifetime. Forking an AC creates a new engine instance with the forked AC's structure; the old engine continues to serve the unforked AC.

  • Lazy-load mode: engine is constructed in-process on first query for that AC. Materialized store is read from disk if it exists.
  • Pre-load mode: a runtime/workbench bootstrap walks the installation's ACs and instantiates engines eagerly. Useful for cold-start latency reduction in long-running runtimes.

3.3 Process model (v0.1)

v0.1 runs inside the single Python process that owns the AC. All serve() and compose() calls, whether issued from FastAPI request handlers, NLQ-generated queries, workbench ops, or direct programmatic use, share the same in-process engine instance. In-process thread safety is provided by a simple threading.Lock around manifest mutations; readers and writers do not need to coordinate via the filesystem.

Single-process assumption. v0.1 assumes exactly one Python process owns the engine for any given AC. Multi-worker deployments (e.g., uvicorn --workers N over the same installation) should route each AC to a single worker, or run with a single worker per installation. The engine does not attempt cross-process coordination.

Coframe Pro will address multi-process and distributed deployments — cross-worker engine coordination, optional dedicated engine processes (Redis-style), and possibly distributed cache fronts. These are out of scope for v0.1.

The simplification is real: no lock file, no manifest revalidation on read, no version vectors. SQLite's own concurrency handles incidental cross-process reads safely; we just don't depend on it.
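The in-process locking described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's implementation: the Manifest class, its method names, and the reduced three-column table are hypothetical. The sketch also takes the lock for reads, because it shares one sqlite3 connection across threads.

```python
import sqlite3
import threading


class Manifest:
    """Hypothetical in-process manifest wrapper (illustrative names only)."""

    def __init__(self, path: str = ":memory:") -> None:
        # check_same_thread=False allows one connection to be used from
        # several threads; the lock below serializes every use of it.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock:
            self._conn.execute(
                "CREATE TABLE entries ("
                " dataset_id TEXT NOT NULL,"
                " anchor_signature TEXT NOT NULL,"
                " access_count INTEGER NOT NULL DEFAULT 0,"
                " UNIQUE(dataset_id, anchor_signature))"
            )
            self._conn.commit()

    def add(self, dataset_id: str, anchor_signature: str) -> None:
        with self._lock:
            self._conn.execute(
                "INSERT OR IGNORE INTO entries (dataset_id, anchor_signature)"
                " VALUES (?, ?)",
                (dataset_id, anchor_signature),
            )
            self._conn.commit()

    def record_access(self, dataset_id: str, anchor_signature: str) -> None:
        with self._lock:
            self._conn.execute(
                "UPDATE entries SET access_count = access_count + 1"
                " WHERE dataset_id = ? AND anchor_signature = ?",
                (dataset_id, anchor_signature),
            )
            self._conn.commit()

    def access_count(self, dataset_id: str, anchor_signature: str) -> int:
        with self._lock:
            row = self._conn.execute(
                "SELECT access_count FROM entries"
                " WHERE dataset_id = ? AND anchor_signature = ?",
                (dataset_id, anchor_signature),
            ).fetchone()
        return row[0]


manifest = Manifest()
manifest.add("revenue", "region,day")


def worker() -> None:
    # Each worker thread records 50 accesses against the same entry.
    for _ in range(50):
        manifest.record_access("revenue", "region,day")


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = manifest.access_count("revenue", "region,day")
```

With eight threads each recording 50 accesses, no update is lost because every statement runs under the same lock.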

3.4 On-disk layout

   <installation>/.coframe/metric_engine/<ac_name>/
   ├── manifest.sqlite                  # entry catalog + metadata
   └── data/
       ├── <metric_family_a>/
       │   ├── anchor=region/
       │   │   └── part-000.parquet
       │   ├── anchor=region,day/
       │   │   └── part-000.parquet
       │   └── anchor=region,month/
       │       └── part-000.parquet
       └── <metric_family_b>/
           └── ...
  • One Parquet directory per (metric_family, anchor_signature), partitioned by anchor (so Polars can scan-by-grain selectively).
  • Manifest is SQLite (small, transactional, well-understood). One row per cache entry; see §6 for the schema.
  • The .coframe/metric_engine/ directory sits alongside the workbench's .coframe/session.json, scoped per installation.

4. The serve() API

4.1 Signature

class MetricEngine:
    def serve(
        self,
        domain: Domain,                   # METRIC | QM
        dataset_id: str,                  # metric_family | qm-kind id
        anchor: tuple[str, ...],
        *,
        mvt: MissingValueTreatment | None = None,  # metric domain only
        backend: DataAPIBackend,
    ) -> pl.LazyFrame:
        """Return a single-column LazyFrame: anchor cols + the data column.

        Behavior is domain-specific:

        METRIC domain:
          1. Materialized entry at exact (dataset_id, anchor): scan + return.
          2. Materialized entry at a finer FD-DAG node serving this
             family + reachable to this anchor via rollup: scan + roll
             up via the operator's partition-invariant rule + return.
          3. No usable materialized entry: invoke backend.aggregate for
             a single-metric request; memoize per §9; return.

        QM domain:
          1. Materialized entry at exact (dataset_id, anchor): scan + return.
          2. No usable materialized entry: backend extracts the relevant
             table to a LazyFrame; engine's shared profiling logic
             computes the QM entry; memoize per §9; return.
          (No FD-DAG rollup for QM — anchors are (schema, column) or
          finer, with no algebra above them.)
        """

4.2 Inputs

  • domain: METRIC or QM. Selects the per-domain semantics layer.
  • dataset_id: domain-specific identifier.
    • METRIC: the AC-logical metric family name (e.g., revenue).
    • QM: the profile-kind id (e.g., column_profile, top_values).
  • anchor: tuple of AC-dimension names (METRIC) or (schema_name, column_name[, value]) (QM) defining the grain.
  • mvt: optional missing-value treatment override (METRIC only).
  • backend: the AC's bound DataAPIBackend (needed only for misses).

4.3 Outputs

A Polars LazyFrame with columns [anchor_col_1, ..., anchor_col_n, <dataset_id_column>]. The frame is lazy; the caller (typically compose() for METRIC, or a Pydantic-projection helper for QM) determines when to collect.

4.4 Why backend is a per-call argument, not engine-state

The Metric Engine instance is per-AC. The AC's backend binding is known at engine-creation time and could be stored as engine state. But passing it per-call keeps the engine pure with respect to data source: it can be reused if the AC's backend binding changes (e.g., testing with a stub backend) without re-instantiating the engine. The materialized store is independent of which backend produced the contents; provenance metadata records the source.
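A small sketch may help show why a per-call backend keeps the store independent of the data source. Everything here is hypothetical: Backend is a one-method stand-in for DataAPIBackend, TinyEngine is a toy with a dict as its materialized store, and rows are plain tuples rather than LazyFrames.

```python
from typing import Protocol


class Backend(Protocol):
    """Hypothetical one-method stand-in for DataAPIBackend."""

    name: str

    def aggregate(self, metric_family: str, anchor: tuple) -> list: ...


class StubBackend:
    def __init__(self, name: str, rows: list) -> None:
        self.name = name
        self.rows = rows
        self.calls = 0

    def aggregate(self, metric_family: str, anchor: tuple) -> list:
        self.calls += 1
        return list(self.rows)


class TinyEngine:
    """Illustrative engine: memoized store keyed by (family, anchor)."""

    def __init__(self) -> None:
        self._store: dict = {}

    def serve(self, metric_family: str, anchor: tuple, *, backend: Backend) -> list:
        key = (metric_family, anchor)
        if key not in self._store:
            rows = backend.aggregate(metric_family, anchor)
            # Provenance: remember which backend produced the contents.
            self._store[key] = {"rows": rows, "source": backend.name}
        return self._store[key]["rows"]


engine = TinyEngine()
warehouse = StubBackend("warehouse", [("east", 100), ("west", 50)])
test_stub = StubBackend("test-stub", [])

first = engine.serve("revenue", ("region",), backend=warehouse)
# Same engine, different backend binding: the memoized entry is reused.
second = engine.serve("revenue", ("region",), backend=test_stub)
```

The second call never reaches test_stub, since the backend is consulted only on a miss.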

5. FD-DAG traversal and ServingPath

5.1 ServingPath

from __future__ import annotations  # MetricEntry / FDStep are engine types not shown here

from dataclasses import dataclass


@dataclass(frozen=True)
class ServingPath:
    """The plan for serving a metric request from materialized state.

    Captures the chain: which materialized entry to start from + how
    to roll up + which operator + which intermediate anchors.
    """
    source_entry: MetricEntry         # The materialized entry to scan
    rollup_steps: tuple[FDStep, ...]  # FD-DAG edges to traverse
    operator: str                     # Reducer used at each step
    target_anchor: tuple[str, ...]    # Final grain

5.2 Path-selection algorithm

Given (metric_family, target_anchor):

  1. Look up exact match. If (metric_family, target_anchor) is in the manifest, return a ServingPath with source_entry=that_entry and rollup_steps=().
  2. Look up finer cached nodes. Enumerate all materialized entries for the same metric_family at anchors finer than target_anchor per the FD-DAG. Filter to those whose operator is partition_invariant (only those can be rolled up further).
  3. Pick the cheapest path. For each candidate finer node, compute the FD-DAG path to target_anchor. Cost heuristic for v0.1: row count of the candidate node (cheaper to scan). Return the minimum-cost path.
  4. No materialized node serves. Return None; caller falls back to backend.

In v0.1 we use the row-count heuristic. A future revision can add column-cardinality estimates, IO cost models, etc.
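The four steps above can be sketched in plain Python. This is an illustration under assumptions, not the engine's code: Entry is a reduced stand-in for a manifest row, FD_EDGES is a hypothetical FD-DAG encoded as finer-to-coarser edges, and the return value is an (entry, rollup anchors) pair rather than a full ServingPath.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    metric_family: str
    anchor: tuple
    partition_invariant: bool
    row_count: int


# Hypothetical FD-DAG: each anchor maps to the coarser anchors it rolls up to.
FD_EDGES = {
    ("region", "day"): [("region", "month")],
    ("region", "month"): [("region",)],
    ("region",): [],
}


def rollup_path(start, target, edges):
    """Breadth-first search along finer-to-coarser edges.

    Returns the anchors visited after `start`, or None if unreachable.
    """
    queue = deque([(start, ())])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + (nxt,)))
    return None


def select_path(manifest, metric_family, target, edges):
    candidates = [e for e in manifest if e.metric_family == metric_family]
    # Step 1: exact match needs no rollup.
    for entry in candidates:
        if entry.anchor == target:
            return entry, ()
    # Steps 2-3: finer, partition-invariant entries; fewest rows wins.
    best = None
    for entry in candidates:
        if not entry.partition_invariant:
            continue
        path = rollup_path(entry.anchor, target, edges)
        if path is None:
            continue
        if best is None or entry.row_count < best[0].row_count:
            best = (entry, path)
    # Step 4: None tells the caller to fall back to the backend.
    return best


day_entry = Entry("revenue", ("region", "day"), True, 3650)
month_entry = Entry("revenue", ("region", "month"), True, 120)
avg_entry = Entry("avg_price", ("region", "day"), False, 3650)
manifest = [day_entry, month_entry, avg_entry]

rolled = select_path(manifest, "revenue", ("region",), FD_EDGES)
exact = select_path(manifest, "revenue", ("region", "month"), FD_EDGES)
blocked = select_path(manifest, "avg_price", ("region",), FD_EDGES)
```

Both revenue entries can reach (region,), and the month-grain entry wins because it has fewer rows to scan.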

5.3 Soundness

The FD-DAG declares which rollups are semantically valid. Combined with the operator's partition_invariant flag, the engine knows which rollups are also arithmetically sound:

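  • SUM, COUNT, MIN, MAX, BOOL_AND, BOOL_OR: partition-invariant; the cached value can be rolled up further. SUM, MIN, MAX, BOOL_AND, and BOOL_OR re-apply the same operator; COUNT rolls up by summing the cached counts, since re-applying COUNT would count groups rather than rows.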
  • AVG, MEDIAN, COUNT_DISTINCT: not partition-invariant; the cached entry is usable at its own anchor but not further-rollable.

The partition_invariant flag is part of the Operator declaration in coframe.operators (already exists, used by W4).
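A small worked example illustrates the distinction. It is a plain-Python sketch with made-up rows, not engine code: values are cached at (region, day) and then rolled up to (region), and the result is compared with aggregating the source rows directly.

```python
from collections import defaultdict
from statistics import mean

# Made-up source rows: (region, day, amount).
ROWS = [("east", "d1", 10.0), ("east", "d1", 20.0), ("east", "d2", 60.0)]


def aggregate(pairs, reducer):
    """Group (key, value) pairs by key and reduce each group's values."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}


def cached_by_day(reducer):
    """The entry as it would be cached at the finer anchor (region, day)."""
    return aggregate((((r, d), v) for r, d, v in ROWS), reducer)


def roll_up(fine, reducer):
    """Roll a (region, day) entry up to (region) with the given reducer."""
    return aggregate(((r, v) for (r, _d), v in fine.items()), reducer)


def direct(reducer):
    """Aggregate the source rows straight to (region)."""
    return aggregate(((r, v) for r, _d, v in ROWS), reducer)


sum_rolled = roll_up(cached_by_day(sum), sum)       # SUM of SUMs
count_rolled = roll_up(cached_by_day(len), sum)     # SUM of COUNTs
count_recounted = roll_up(cached_by_day(len), len)  # COUNT of COUNTs (wrong)
avg_rolled = roll_up(cached_by_day(mean), mean)     # AVG of AVGs (wrong)
```

SUM and COUNT agree with the direct aggregate (90.0 and 3), while the average of daily averages is 37.5 against a true average of 30.0.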

5.4 Missing-value treatment under rollup

Rolling up a propagate-treated metric: the cached entry already carries NULL for any-NULL groups, and that NULL must survive the rollup. SQL-style addition gives this for free (NULL + anything = NULL), but Polars's native sum() skips NULLs, so a plain SUM rollup would silently drop it. The rollup expression therefore keeps a null-propagating guard: a rolled-up group is NULL whenever any of its input cells is NULL.

Rolling up a skip-treated metric: native Polars SUM continues to skip NULL; the rollup behaves as expected.

Rolling up an impute-treated metric: the imputation has already substituted at the source-row level; the rollup is over already-imputed values. No re-imputation needed.

5.5 QM-domain serve semantics

QM serve is intentionally simpler than METRIC serve — no FD-DAG traversal, no rollup algebra:

  1. Exact-match-or-compute. The engine looks up (QM, dataset_id, anchor) in the manifest. If present, scan + return. If not, compute via the backend's extract-to-LazyFrame path, then memoize + return.
  2. No rollup. QM data doesn't compose along an FD-DAG. A column profile at (stores, region) doesn't roll up into anything; it's a leaf observation about the data.
  3. Anchor shapes per QM kind:
    • column_profile: anchor = (schema_name, column_name). The single row of per-column stats lives here.
    • top_values: anchor = (schema_name, column_name, value). Each anchor cell carries one (value, count) pair; query top-N by sorting + limiting at read time.
    • histogram: anchor = (schema_name, column_name, bin_index). Each cell carries one bin's range + count.
    • Future kinds add their own anchor shape.
  4. Shared compute logic. The profiling algorithm (cardinality classification, kind-hint heuristic, per-kind stats blocks) lives in the engine as Polars expressions. Backends provide table-extraction; the engine runs the heuristic.
  5. Pydantic projections. ColumnProfile, TableProfile, etc. become helper functions that read the relevant engine entries and assemble the typed view: engine.column_profile(schema, column) → ColumnProfile. The Pydantic object is constructed from a fresh LazyFrame collect; the storage is the LazyFrame.
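The exact-match-or-compute flow and the read-time top-N projection can be sketched in plain Python. All names are hypothetical: TABLES stands in for backend table extraction, QMStore for the engine's QM layer, and the TopValue dataclass for a Pydantic projection type. Entries are keyed by (kind, schema, column) for brevity.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical stand-in for backend table extraction: schema -> column -> values.
TABLES = {"stores": {"region": ["east", "west", "east", "north", "east", "west"]}}


@dataclass(frozen=True)
class TopValue:
    """Stand-in for a typed projection (the document uses Pydantic types)."""

    value: str
    count: int


class QMStore:
    def __init__(self) -> None:
        self._entries: dict = {}
        self.computes = 0

    def serve_top_values(self, schema: str, column: str) -> list:
        key = ("top_values", schema, column)
        if key not in self._entries:  # miss: compute once, then memoize
            self.computes += 1
            counts = Counter(TABLES[schema][column])
            # One row per anchor cell (schema, column, value) carrying a count.
            self._entries[key] = [
                (schema, column, value, count) for value, count in counts.items()
            ]
        return self._entries[key]


def top_n(store: QMStore, schema: str, column: str, n: int) -> list:
    """Typed view assembled on read: sort by count (desc), then limit."""
    rows = store.serve_top_values(schema, column)
    ranked = sorted(rows, key=lambda r: (-r[3], r[2]))
    return [TopValue(value=r[2], count=r[3]) for r in ranked[:n]]


store = QMStore()
top2 = top_n(store, "stores", "region", 2)
top1 = top_n(store, "stores", "region", 1)  # served from the memoized entry
```

The second read reuses the stored rows; only the sort and limit run again.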

6. Manifest schema

The SQLite manifest stores one row per materialized entry plus metadata for the engine itself.

CREATE TABLE entries (
    id                INTEGER PRIMARY KEY AUTOINCREMENT,
    domain            TEXT    NOT NULL,       -- 'METRIC' | 'QM'
    dataset_id        TEXT    NOT NULL,       -- metric_family | qm-kind id
    anchor_signature  TEXT    NOT NULL,       -- canonical sorted-tuple repr
    parquet_path      TEXT    NOT NULL,
    -- Metric-domain fields (NULL for QM):
    operator          TEXT,                   -- the reducer used
    partition_invariant BOOLEAN,
    mvt               TEXT,                   -- skip|propagate|impute
    imputation_value  TEXT,                   -- JSON literal (when mvt=impute)
    -- Shared:
    source_schemas    TEXT    NOT NULL,       -- JSON list of source schema names
    source_filter     TEXT,                   -- JSON: the filter applied
    row_count         INTEGER NOT NULL,
    byte_size         INTEGER NOT NULL,
    materialized_at   TIMESTAMP NOT NULL,
    stability_cutoff  TIMESTAMP,              -- the hold-off cutoff at materialize time
    last_access_at    TIMESTAMP NOT NULL,
    access_count      INTEGER NOT NULL DEFAULT 0,
    verification_level TEXT,                  -- A|AA|AAA, inherited from source
    UNIQUE(domain, dataset_id, anchor_signature)
);

CREATE INDEX entries_by_domain_dataset ON entries(domain, dataset_id);
CREATE INDEX entries_by_access ON entries(last_access_at);

CREATE TABLE engine_meta (
    key   TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
-- engine_meta rows: ac_name, ac_fingerprint, schema_version, etc.

The manifest is small (one row per entry) and SQLite gives us transactional updates, multi-reader concurrency, and a familiar operational story.

7. Backend interaction

7.1 Single-metric calls

When serve() falls through to the backend, it constructs an AggregateRequest with exactly one AggregateMetric. The existing backend protocol handles this without modification.

Cross-schema queries previously fanned out into multiple AggregateRequests in the planner's merge_blocks path. With the Metric Engine, this fan-out shifts to per-metric serve() calls; the backend never sees a multi-metric, cross-schema request — each call is narrow.

7.2 Backend batching (deferred)

A future optimization: when multiple cache misses in a single composition target the same source schema, the Metric Engine could batch them into one multi-metric AggregateRequest for IO efficiency. This preserves the per-metric serve() API; batching is an engine internal optimization.

In v0.1 we issue one aggregate() call per miss. The simplicity is worth more than the IO savings for the initial release.

7.3 Join elimination at the backend

Because the Metric Engine assembles cross-schema results in Polars, the planner no longer needs to emit AggregateJoin entries for dim-table navigation. Each backend call asks for metric@anchor where the metric and the anchor's dimensions live on the same physical table — or, when they don't, the engine first calls the backend for the metric at its native anchor, then joins to a dim-table entry the engine has separately memoized.

This realizes the join-less-backend hypothetical we discussed during Phase 7 — without changing the DataAPIBackend protocol, backends can now ignore joins entirely (the engine never sends them).

8. Composition

8.1 compose() API

def compose(
    self,
    entries: list[pl.LazyFrame],
    target_grain: tuple[str, ...],
    frame_expression: FrameExpression | None = None,
    post_grain_ops: PostGrainOps | None = None,
) -> Frame:
    """Combine multiple single-column LazyFrames into a Frame.

    Each input LazyFrame has the shape [anchor_cols..., metric].
    The output Frame has [grain_cols..., metrics...] after merge,
    frame_expression evaluation, and post-grain ops.
    """

8.2 Steps

  1. Grain reconciliation: each input LazyFrame is at some anchor; if any are finer than target_grain, roll them up via the engine's FD-DAG traversal (re-using the same ServingPath machinery).
  2. Join on target_grain: outer-join all LazyFrames on the grain columns. Polars's lazy join + the engine's columnar layout make this cheap.
  3. Evaluate frame_expression: derived metrics (e.g., profit = revenue - cost) are Polars expressions over the joined frame.
  4. Post-grain ops: HAVING (filter), ORDER BY (sort), LIMIT [PER] (top-N per group). Applied in declared order.
  5. Materialize the final result into a Frame (the coframe-resolution output type) and return.

8.3 What this replaces

  • Today's merge_blocks in coframe-resolution/execution.py ➝ becomes a thin shim that calls compose().
  • Today's post_grain_ops in coframe-resolution/execution.py ➝ moves into compose() step 4.
  • The post-aggregation Frame transformations stay conceptually the same; the execution shifts from Python list-of-lists to Polars LazyFrame.

8.4 Backwards compatibility

execute_query(ac, backend, src) continues to return the same Frame type. The engine is wired between the resolver and the backend (see §3.1); the public API surface is unchanged.

9. Memoization policy (the hybrid lazy + push-opt-in)

9.1 Lazy by default

Every backend call's result is memoized by default, subject to the exclusions in §9.4. Cold start has no materialized entries; the engine populates as queries arrive.

9.2 Push opt-in via cache_hint

The AC can declare hot grains to pre-materialize:

metric_families:
  - name: revenue
    family_root: {schema: transactions, column: revenue}
    ip_reducers:
      - operator: SUM
        a_block: [time]
    cache_hint:
      materialize_at:
        - [region]
        - [region, day]
        - [region, month]

A new lifecycle step at AC COMMIT walks every cache_hint and schedules pre-materialization (either eager at COMMIT, or deferred to a "warmup" cron). The pre-materialization invokes the same serve() machinery; the only difference is when it runs (not at first query but at COMMIT).
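The COMMIT-time walk can be sketched as a function that turns cache_hint declarations into (metric_family, anchor) warmup requests. The AC_SPEC dict mirrors the YAML above as already-parsed data, and warmup_requests is a hypothetical helper; how the requests are scheduled (eagerly or via a warmup cron) is left out.

```python
# Parsed form of the YAML above, plus a family without a cache_hint.
AC_SPEC = {
    "metric_families": [
        {
            "name": "revenue",
            "cache_hint": {
                "materialize_at": [["region"], ["region", "day"], ["region", "month"]]
            },
        },
        {"name": "cost"},
    ]
}


def warmup_requests(spec: dict) -> list:
    """Collect one (metric_family, anchor) request per declared hot grain."""
    requests = []
    for family in spec.get("metric_families", []):
        hint = family.get("cache_hint") or {}
        for anchor in hint.get("materialize_at", []):
            requests.append((family["name"], tuple(anchor)))
    return requests


requests = warmup_requests(AC_SPEC)
```

Each resulting pair would then go through the same serve() path a first query would take.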

9.3 Promotion recommendations

The Metric Engine tracks per-entry access frequency. When an entry has sustained hot access over a configurable window (e.g., 7 days, 100+ hits), it surfaces a recommendation: "consider promoting this entry to a declared schema." The promotion itself is an AC-author action through the Workbench (not an engine action) — the engine emits the SchemaSpec YAML stanza the author can paste.

This gives the progression:

  • Lazy memoization catches unknown workload patterns.
  • Push opt-in catches known-hot patterns the AC author can declare in advance.
  • Promotion catches steady-state patterns the engine identifies from observed workload, lifting them out of the engine into the AC's declared schema set (and optionally to a physical pre-aggregate table).

9.4 What never gets memoized

  • Results of singleton lookups (single anchor cell, single value): not worth the memoization overhead.
  • Results explicitly tagged no_cache by the query (a future Frame-QL hint; v0.1 just memoizes everything).
  • Results whose source schema is in SELECTING phase (not yet COMMITTED): the AC isn't stable, so memoizing would risk staleness.

10. Eviction

10.1 LRU baseline

Manifest tracks last_access_at per entry. When the engine's materialized store exceeds a configurable byte budget, evict least-recently-accessed entries.

10.2 FD-DAG-aware enhancement

Pure LRU can evict a node that's the only reachability path to many ancestor grains, defeating future serves at those grains. The engine's eviction scorer biases against evicting such nodes:

   eviction_score = (now - last_access_at) /
                    (downstream_fanout * access_count_log)

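where downstream_fanout is the number of FD-DAG ancestor nodes that would lose their cheapest serving path if this entry were evicted, and access_count_log is a log-scaled access_count. Entries with the highest score are evicted first, so high fanout and frequent access both lower an entry's eviction priority. Both denominator terms need a floor (for example, 1 + x) so that an entry with zero fanout or zero accesses does not divide by zero.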
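A possible reading of the scorer, sketched in plain Python. Several details are assumptions rather than part of the design: 1 is added to both denominator terms to avoid division by zero, the log scaling uses log1p, higher scores are evicted first, and CacheEntry and pick_victims are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    name: str
    last_access_at: float  # seconds on the same clock as `now`
    access_count: int
    downstream_fanout: int
    byte_size: int


def eviction_score(entry: CacheEntry, now: float) -> float:
    age = now - entry.last_access_at
    # Assumption: floor both terms at 1 so zero fanout or zero accesses
    # cannot produce a zero denominator.
    fanout = 1 + entry.downstream_fanout
    access_log = 1.0 + math.log1p(entry.access_count)
    return age / (fanout * access_log)


def pick_victims(entries: list, now: float, max_bytes: int) -> list:
    """Evict highest-scoring entries until the store fits the byte budget."""
    total = sum(e.byte_size for e in entries)
    victims = []
    for entry in sorted(entries, key=lambda e: eviction_score(e, now), reverse=True):
        if total <= max_bytes:
            break
        victims.append(entry.name)
        total -= entry.byte_size
    return victims


NOW = 1000.0
entries = [
    CacheEntry("A", last_access_at=900.0, access_count=0, downstream_fanout=0, byte_size=600),
    CacheEntry("B", last_access_at=800.0, access_count=10, downstream_fanout=4, byte_size=600),
    CacheEntry("C", last_access_at=990.0, access_count=1, downstream_fanout=0, byte_size=600),
]

victims = pick_victims(entries, NOW, max_bytes=1300)
lru_victim = min(entries, key=lambda e: e.last_access_at).name
```

Pure LRU would evict B, the least recently accessed entry. The scorer evicts A instead, because B's fanout and access history protect it.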

10.3 Bytes budget

  • Default: 1 GB per AC.
  • Configurable via installation.yaml:
    metric_engine:
      max_bytes_per_ac: 1073741824   # 1 GB
    

11. Stability-window invalidation

11.1 Per-entry stability cutoff

Each entry records the stability_cutoff it was materialized over — i.e., the date cutoff applied to the source data at materialization time. As long as the engine's current effective cutoff is ≤ this value, the entry is valid.

11.2 Rolling forward

When the engine's effective cutoff rolls forward (typically daily, at the stability boundary), entries materialized over older cutoffs become stale: they're missing the newly-stable rows that were previously in the hold-off window.

Two strategies:

  • Invalidate: drop stale entries from the manifest; next serve repopulates with the fresh cutoff.
  • Append-and-merge: if the operator is partition-invariant, compute the incremental aggregate over the newly-stable rows only and merge with the stale entry. Faster than full recompute when the increment is small.

v0.1 implements invalidate-only. Append-and-merge is a future optimization that requires source-table partition awareness on the backend side.
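The invalidate-only strategy reduces to a cutoff comparison per entry. The sketch below assumes the effective cutoff is today minus hold_off_days (the document does not spell out the exact formula) and represents the manifest as a hypothetical dict from entry name to stability_cutoff.

```python
from datetime import date, timedelta


def effective_cutoff(today: date, hold_off_days: int) -> date:
    # Assumption: rows dated on or before this cutoff are treated as stable.
    return today - timedelta(days=hold_off_days)


def split_stale(entries: dict, current_cutoff: date) -> tuple:
    """Partition entries by the validity rule: current cutoff <= entry cutoff."""
    valid = {n: c for n, c in entries.items() if current_cutoff <= c}
    stale = {n: c for n, c in entries.items() if current_cutoff > c}
    return valid, stale


entries = {
    "revenue@region,day": date(2026, 5, 7),    # materialized at today's cutoff
    "revenue@region,month": date(2026, 5, 6),  # materialized one day earlier
}

cutoff = effective_cutoff(date(2026, 5, 10), hold_off_days=3)
valid, stale = split_stale(entries, cutoff)
```

Stale entries are dropped from the manifest and repopulated by the next serve at the fresh cutoff.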

11.3 Cron / on-demand

The roll-forward check runs at engine serve() time (cheap: compare two timestamps) and is also exposed as an explicit engine.refresh() call for batch operations.

12. Verification level interaction

12.1 Inheritance

Each materialized entry records the verification level (A / AA / AAA) of its source schema(s). The entry inherits the minimum level of its sources — a cached entry can be no better-verified than its weakest input.
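The inheritance rule amounts to taking a minimum over an ordered scale. The sketch below assumes the ordering A < AA < AAA and a hypothetical `VerificationLevel` enum; neither name is taken from the codebase.

```python
from enum import IntEnum


class VerificationLevel(IntEnum):
    # Assumed ordering: A is the weakest level, AAA the strongest.
    A = 1
    AA = 2
    AAA = 3


def inherited_level(source_levels) -> VerificationLevel:
    """An entry is only as well verified as its weakest source."""
    levels = list(source_levels)
    if not levels:
        raise ValueError("an entry needs at least one source schema")
    return min(levels)
```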

12.2 Soundness of rollup

If the source schema is AAA-verified and the operator is partition-invariant, the cached rollup is trivially AAA-attestable: the materialized entry is the rollup of the source, so the sibling-coherence attestation between them is satisfied by construction (the cached node is itself the answer to the sibling-coherence check). No re-attestation needed.

Non-partition-invariant operators (AVG, MEDIAN, COUNT_DISTINCT) produce entries that are usable at their own anchor; their AAA status is inherited from the source but the entry is flagged not_further_rollable so the engine doesn't attempt to roll them up.
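A small worked example of why the not_further_rollable flag is needed, using made-up revenue values for two stores in one region. Rolling up per-store SUMs reproduces the direct regional SUM, while rolling up per-store AVGs does not reproduce the direct regional AVG.

```python
from statistics import mean

# Illustrative fine-grain data: (region, store) -> revenue values.
rows = {
    ("east", "s1"): [10.0, 20.0],
    ("east", "s2"): [30.0],
}

# SUM is partition-invariant: the sum of per-store sums equals the direct sum.
store_sums = {key: sum(values) for key, values in rows.items()}
rolled_sum = sum(store_sums.values())
direct_sum = sum(x for values in rows.values() for x in values)

# AVG is not: the average of per-store averages ignores store sizes.
store_avgs = {key: mean(values) for key, values in rows.items()}
avg_of_avgs = mean(store_avgs.values())
direct_avg = mean(x for values in rows.values() for x in values)

print(rolled_sum, direct_sum)   # 60.0 60.0
print(avg_of_avgs, direct_avg)  # 22.5 20.0
```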

12.3 Surfaced provenance

When the engine serves a request from a materialized entry, the returned Frame carries provenance metadata identifying the entry (and its verification level) for each cell. This lets the verification surface report "this row came from <metric_engine_entry> materialized at <timestamp> from <source_schema>@<level>."

13. Integration with coframe-resolution

13.1 What changes in coframe-resolution

  • execute_query(ac, backend, src) accepts an optional metric_engine: MetricEngine | None argument. When present, the engine handles execution; when absent, the legacy backend.aggregate path runs.
  • build_plan() emits a list of (metric_family, anchor, mvt, filter) requests instead of AggregateRequests when the engine is in use. (The legacy AggregateRequest path remains for engine-less execution.)
  • merge_blocks and post_grain_ops are removed in favor of engine.compose().
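The dispatch described in the list above might look roughly like this. It is a sketch only: the `compose(requests)` signature, the `build_plan` flag, and the request shapes are placeholders chosen for illustration, and the legacy merge step is elided.

```python
from typing import Any, Optional, Protocol


class Backend(Protocol):
    def aggregate(self, request: Any) -> Any: ...


class MetricEngine(Protocol):
    # Assumed signature; this document only names engine.compose().
    def compose(self, requests: list) -> Any: ...


def build_plan(ac: Any, src: Any, for_engine: bool) -> list:
    # Placeholder planner. The real one derives requests from the AC and source.
    if for_engine:
        return [("metric_family", "anchor", "mvt", "filter")]
    return [{"aggregate": src}]


def execute_query(ac: Any, backend: Backend, src: Any,
                  metric_engine: Optional[MetricEngine] = None) -> Any:
    if metric_engine is not None:
        # Engine path: per-metric requests, composed by the engine.
        return metric_engine.compose(build_plan(ac, src, for_engine=True))
    # Legacy path: direct backend aggregation (merging omitted in this sketch).
    return [backend.aggregate(r) for r in build_plan(ac, src, for_engine=False)]
```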

13.2 Default off in v0.1

The Metric Engine is opt-in via per-installation configuration. It must soak through real workloads before becoming default. The default path remains: direct backend invocation through today's execute_query.

# installation.coframe/installation.yaml
metric_engine:
  enabled: true
  max_bytes_per_ac: 1073741824

13.3 The cross-backend invariant is retained

The Phase 7 cross-backend invariant tests (same query → same Frame through SQLite vs Polars) become a forcing function for the engine: when the engine is enabled, the same query through the engine-on-SQLite path must produce identical Frames to the engine-on-Polars path. This is the engine's primary correctness gate.

13.4 QM backend migration

The QM domain landing as a co-resident substrate (per §2.7, §5.5) collapses the duplicated quasi-metadata compute paths across backends into one. After the migration:

  • SQLiteBackend.compute_table_profile → extract_to_lazyframe(table) → engine.profile_table(lazyframe) → engine.serve(QM, "column_profile", (schema, col)) for each column; the typed TableProfile is a projection assembled from the per-column entries.
  • PolarsBackend.compute_table_profile → same path: extract to LazyFrame (trivial — it already has one) → shared profiling.
  • The per-backend kind-hint heuristic in coframe.sqlite.backend and coframe.polars.backend is removed; the canonical implementation lives in the engine.
  • Backend protocol grows one method: extract_to_lazyframe(table) → pl.LazyFrame (the QM ingestion door). SQLite reads via sqlite3 → arrow → polars; Polars returns its native frame.
  • Existing callers of backend.compute_table_profile() continue to work: the call now routes through the engine when enabled, or falls back to the legacy per-backend impl when disabled.

The migration removes ~200 lines of duplicated logic and eliminates the cross-backend invariant tests for QM (they become structurally trivial — same code path runs for both backends).

14. Slice plan (after this design doc)

Slice | Deliverable
2 | Package skeleton (coframe-metric-engine). Domain-aware types: Domain enum (METRIC/QM), EngineEntry, Manifest, ServingPath, FDStep. Skeleton tests.
3 | Storage + manifest. Polars+Parquet writes, SQLite manifest CRUD with (domain, dataset_id, anchor_signature) unique key. Domain-agnostic.
4 | QM as engine-managed substrate. Port both backends' compute_table_profile to: extract → LazyFrame → shared engine.profile_table() → store as QM entries → return Pydantic projection. Eliminates the duplicated kind-hint heuristic. Proves the substrate before metric serving is load-bearing.
5 | METRIC-domain serve() with FD-DAG traversal (exact, rollable, fallback). Memoization on backend miss.
6 | METRIC-domain compose(): multi-metric merge + frame-expression + post-grain ops. Replaces merge_blocks + post_grain_ops.
7 | Eviction (LRU + FD-DAG-aware for METRIC; pure LRU for QM) + stability-window invalidation.
8 | Integration with coframe-resolution: rewire execute_query behind the metric_engine.enabled flag. End-to-end retail test through both paths (engine on + off) returning identical Frames.

Why slice 4 (QM) before slice 5 (metric serve): QM's serve semantics are simpler (no FD-DAG rollup, no partition-invariance reasoning), so it validates the storage substrate end-to-end before the metric domain's complexity lands on top. QM unification also removes a duplicated heuristic immediately — visible win in the first substantive slice. The metric serve() machinery in slice 5 lands on a proven substrate rather than co-developing with one.
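To make the slice 3 manifest key concrete, here is a sketch of a SQLite manifest table with the (domain, dataset_id, anchor_signature) unique key. The `parquet_path` column and the `upsert` helper are illustrative assumptions; the actual manifest columns are a slice 3 decision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE manifest (
        domain           TEXT NOT NULL,  -- 'METRIC' or 'QM'
        dataset_id       TEXT NOT NULL,
        anchor_signature TEXT NOT NULL,
        parquet_path     TEXT NOT NULL,  -- hypothetical payload column
        UNIQUE (domain, dataset_id, anchor_signature)
    )
    """
)


def upsert(domain: str, dataset_id: str, anchor_signature: str, parquet_path: str) -> None:
    # INSERT OR REPLACE relies on the UNIQUE key to overwrite an existing entry.
    conn.execute(
        "INSERT OR REPLACE INTO manifest VALUES (?, ?, ?, ?)",
        (domain, dataset_id, anchor_signature, parquet_path),
    )


upsert("METRIC", "revenue", "region", "/store/m1.parquet")
upsert("METRIC", "revenue", "region", "/store/m2.parquet")  # same key: replaces
upsert("QM", "revenue", "region", "/store/q1.parquet")      # other domain: new row
```

Because the domain is part of the key, METRIC and QM entries share one table without colliding, which is what keeps the storage layer domain-agnostic.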

15. Open questions

15.1 Per-process vs shared-process Metric Engine — RESOLVED

Resolved per author guidance. v0.1 ships shared-process within a single Python process per AC (see §3.3). Cross-process coordination, dedicated engine processes, and distributed-engine futures are explicit Coframe Pro tier concerns; v0.1 keeps the lightweight shape.

15.2 Cache hint placement in the AC — RESOLVED

Resolved: cache_hint lives on MetricFamily (per §9.2). In v0.1, almost all metric families ship a single ip_reducer; the granularity benefit of per-reducer hints is theoretical for most ACs, and MetricFamily-level matches how authors think about hot grains ("revenue is hot at these grains," not "revenue's SUM reducer specifically is hot at these grains").

Forward-compat: extending to a per-reducer override later is additive — an optional cache_hint on IpReducer would shadow the family-level hint for that reducer only. Lands if/when real workloads demand it.
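The shadowing rule can be sketched as below. The shape of cache_hint (here a list of grain names) and the dataclass fields are assumptions made for illustration; §9.2 defines the real structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IpReducer:
    name: str
    # Hypothetical future per-reducer override; None means "use the family hint".
    cache_hint: Optional[List[str]] = None


@dataclass
class MetricFamily:
    name: str
    cache_hint: List[str]  # assumed shape: names of hot grains
    reducers: List[IpReducer] = field(default_factory=list)


def effective_cache_hint(family: MetricFamily, reducer: IpReducer) -> List[str]:
    """A reducer-level hint, when present, shadows the family-level hint."""
    if reducer.cache_hint is not None:
        return reducer.cache_hint
    return family.cache_hint
```

Because the override defaults to None, existing ACs that only set the family-level hint would behave identically after the extension.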

15.3 Which call paths route through the engine — RESOLVED

Resolved: the engine intercepts only the analytical-query path: serve() and its descendants under compose(). Every other DataAPIBackend method goes direct to the backend, untouched by the engine.

Call path | Routes through engine?
execute_query(ac, backend, src) (Frame-QL → Frame) | Yes
Per-metric serve() from compose() | Yes (by definition)
backend.attest_fd_edge(...) | No (verification op)
backend.attest_sibling_coherence(...) | No (verification op)
backend.compute_table_profile(...) | No (L2 QM has its own cache)
backend.list_tables() / describe_table() | No (cheap introspection)
backend.check_update_timestamp_column() | No (stability-filter probe)
backend.apply_stability_filter() | No (stability-filter probe)

Rationale: attestation results are operational diagnostics, not analytical answers; memoizing them muddles provenance ("did I just verify this, or am I reading a verification from 3 days ago?"). Introspection + profiling are cheap and have their own L2 quasi-metadata caching story. Stability probes are O(1) lookups.

Contributor guidance: when adding a new method to DataAPIBackend, ask whether it's an analytical-query path. If yes, route through serve(). If no, call the backend directly.

15.4 NLQ and engine — RESOLVED

Resolved per author guidance. Not an engine concern. NLQ produces Frame-QL — highly structured input — that goes through the same execute_query path as any other source. The engine sees no distinction. Speculation about engine-state-driven NLQ suggestions ("you've cached X; want to ask about it?") would be NLQ-side work, not engine-side, and is out of scope here.

15.5 Cross-AC sharing — RESOLVED

Resolved: Coframe Pro tier territory. Same shape as §15.1: v0.1 keeps per-AC stores for simplicity (filter-scope isolation, naming isolation, no shared-write coordination). Physical-name-keyed cross-AC sharing with L3 name remapping is a Pro-tier optimization that lands if real workloads show meaningful storage duplication or hit-rate loss from non-sharing. Storage cost is the least concern in v0.1.

15.6 Backend batching for misses — RESOLVED

Resolved: Coframe Pro tier territory. Batching multiple single-metric misses into multi-metric backend calls is an IO-overhead optimization that matters for remote warehouses (Snowflake, BigQuery, where round-trip latency dominates) but not for the local engines Core targets (SQLite, Polars, where per-call overhead is negligible). Coframe Core keeps serve() clean and atomic; batching machinery lives in the Pro tier alongside the remote-warehouse backends that benefit from it.

Forward-compat: batching is purely an engine-internal optimization; adding it later doesn't change serve(), compose(), or DataAPIBackend signatures.

15.7 Results too large to memoize — RESOLVED

Resolved: serve normally; skip memoization when the result exceeds a per-entry size cap; surface a coarser-grain recommendation.

  • Per-entry size cap: 10% of the engine's max_bytes_per_ac budget (about 107 MB on the default 1,073,741,824-byte budget). Configurable via installation.yaml as metric_engine.max_entry_bytes.
  • Behavior at the cap: serve the result from the backend normally (no error, no degraded answer); skip writing to the materialized store; log a recommendation that "a coarser-grain version of this query would be cacheable" — feeds the promotion-recommendation machinery (§9.3).
  • No errors thrown: query correctness is preserved unconditionally. The engine only opts out of memoization, never out of serving.

The 10% default is a starting point. Real workloads will inform whether to raise or lower it.
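The cap check itself is small. A sketch, assuming the result size in bytes is known before the write and that an unset max_entry_bytes falls back to 10% of the per-AC budget:

```python
from typing import Optional

DEFAULT_MAX_BYTES_PER_AC = 1_073_741_824  # default budget from §10.3


def should_memoize(result_bytes: int,
                   max_bytes_per_ac: int = DEFAULT_MAX_BYTES_PER_AC,
                   max_entry_bytes: Optional[int] = None) -> bool:
    """Return False when the result exceeds the per-entry cap.

    The result is served either way; only the write to the store is skipped.
    """
    cap = max_entry_bytes if max_entry_bytes is not None else max_bytes_per_ac // 10
    return result_bytes <= cap
```

On the default budget the cap works out to 107,374,182 bytes.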

16. Out of scope for this doc

  • Concrete Polars expression generation for rollup (slice 5 detail).
  • Concrete Parquet partition scheme tuning (slice 3 detail).
  • Concurrency primitive choices in Python (slice 3 detail).
  • Telemetry / observability hooks (worth its own design pass).
  • Distributed-engine futures (clearly out of v1.0).

Next: review + iterate this doc; lock the design (or flag open questions for further brainstorm); then proceed to slice 2 (package skeleton + types) per §14.