Coframe Analytics Data Platform — Design Document, v2.1 Supplement
Status: v2.1 supplement (amended 2026-05-23) to coframe_platform_design_v2_0.md
Author: reeeneeee
Compiled: May 2026
Spec authority: Coframe Core Manual (the Manual). Where this document and the Manual disagree, the Manual wins.
Amendments (2026-05-23): Four refinements landed after the initial v2.1 ship: (a) AC primitive renamed Analytic Collection (was Analytics Collection) for parallelism with Analytic Layer; (b) AC Surfaces introduced as the umbrella term for the AC's access protocols (Frame-QL / NL Query / MCP / HTTP API / Workbench / Validation); (c) the frontend restructured into a three-UI / four-package architecture; (d) build phasing re-ordered to vertical-slice-first so the front-end + SQLite back-end ship together before the rest of the core / polars / duckdb work. See §10 for the amendment details. Inline mentions of the old Analytics Collection form throughout this document have been corrected to Analytic Collection.
Relationship to v2.0: This supplement extends v2.0 rather than replacing it. It captures a set of architectural refinements that emerged from a multi-hour design conversation and which together sharpen the platform's positioning, formalize multi-AC operation, and pin down a small number of dependency contracts the v2.0 design assumed but didn't make explicit. Sections in v2.0 affected by this supplement are noted inline; readers should consider v2.0 + v2.1 together as the current platform design.
What this supplement adds:
- Positioning: Coframe is the Analytic Layer for analytical data. Named as a peer/sibling to the Semantic Layer category; the name is descriptive, not aspirational; the underlying claim is that Coframe defines a new architectural layer in the data stack, characterized by the four properties below.
- Multi-AC at installation level: the platform was always designed to support multiple ACs over the same physical data (per the Manual's "contextual multiplicity" stance). v2.1 makes the installation concept explicit and formalizes the relationship between an installation, its backend binding, and the ACs hosted in it.
- L1/L2/L3 metadata layering: what gets cached at installation level (shared across all ACs) vs. what's per-AC. Refresh semantics and cache invalidation are pinned to a small contract.
- AC-level filter as the fourth orthogonal customization control: alongside dimension subset, measure subset, and name_map + operator customizations. Restricted in v1.0 to dimensional-value subsetting; richer filters are Pro.
- Frozen-scope phase at the start of every AC session — dimension subset + measure subset + AC filter committed together, then locked for the rest of the session. The lock is what makes per-AC attestation stable.
- L2 stability filter: an installation-level filter by data update timestamp, protecting all downstream ACs from late-arrival churn. Names the warehouse-hygiene contract Coframe depends on.
- Incremental update insulation: the architectural property that ACs with bounded scope are invariant under data changes outside their scope. Closed-period ACs, reproducible-snapshot ACs, and hold-off-buffer patterns fall out naturally.
- AC-as-architectural-pattern coda: the four properties (insulation, customization, verification, scope stability) that the AC bundles together — the "what is Coframe, architecturally" framing that complements the technical claims in the position article.
1. Positioning — Coframe as the Analytic Layer
1.1 The category claim
Coframe is the Analytic Layer for analytical data.
This is a category claim, parallel in form to the Semantic Layer (Looker, dbt MetricFlow, Cube, AtScale, Snowflake Semantic Views, Databricks Metric Views), but distinct in substance. The two layers are complementary peers in a modern data stack, not competing implementations of the same concept.
| Layer | What it does | What gets centralized |
|---|---|---|
| Semantic Layer | gives metrics their meaning — central definitions, consistent naming, exposed to consumers as a controlled vocabulary | named metric definitions + their operational logic |
| Analytic Layer (Coframe) | guarantees that derivations of metrics are structurally correct — through declared structural commitments, verified integrity conditions, and a graded verification regime (A/AA/AAA) | structural metadata: anchors, dimension families, metric families, lineage, operator catalog |
In a deployed stack, both layers coexist. A semantic layer can sit beside (or be expressed through) an Analytic Layer; the two answer different questions. Where the semantic layer asks "what does revenue MEAN?", the Analytic Layer asks "is this revenue computation STRUCTURALLY CORRECT?" Both are real questions; the right architecture answers both.
1.2 The name is descriptive, not aspirational
A small editorial note: the choice of "Analytic Layer" (rather than, say, "Trust Layer" or "Verified Layer") is deliberate and descriptive. The layer is named for the work it demonstrably does — analytical correctness reasoning, FD discovery, integrity attestation, constructive query resolution — not for a property we hope it will eventually have. This matters because, in technical communities, names that claim more than the underlying substance delivers tend to erode credibility over time. The Analytic Layer's name and its work are aligned; we keep them so by holding the underlying machinery to the name.
1.3 Sidebar: A/AA/AAA vs. Bronze/Silver/Gold
Coframe's verification levels — A, AA, AAA — are sometimes confused with Databricks' medallion architecture (Bronze, Silver, Gold). They are not the same thing and grade different properties:
| Label system | What it grades | Where it sits |
|---|---|---|
| Bronze / Silver / Gold (Databricks medallion architecture) | data processing maturity — where in the pipeline a dataset is (raw / cleaned / business-ready) | per-dataset, evolves through ETL |
| A / AA / AAA (Coframe) | AC verification rigor — how much has been verified about an analytical surface | per-AC, computed from the integrity catalog's status |
These are orthogonal axes. You could have a Bronze-tier table that an AC verifies at AAA, or a Gold-tier aggregate that an AC verifies at only A. The labels coexist; they do not compete.
The Coframe verification levels deliberately echo WCAG accessibility levels (which also use A/AA/AAA) — both are graded-conformance-against-an-objective-standard schemes that signal "how rigorously has this artifact been audited." Practitioners who know WCAG recognize the shape of the grading system immediately.
1.4 The AC as architectural pattern — the four properties
The Analytic Layer is realized by the Analytic Collection (AC) — the platform's primitive. Each AC is a single, scoped, verified analytical surface over a shared physical data substrate. An AC bundles four architectural properties that together constitute its value as a layer in the stack:
| Property | What it means | What in the platform implements it |
|---|---|---|
| Insulation | Consumers of the AC are decoupled from physical data layout changes. Schema migrations, ETL refactoring, even backend swaps don't break consumers. | The AC layer absorbs all references to physical reality; consumers see only the logical surface (name_map, dimension families, metric families, the AC's exposed columns). |
| Customization | Each consuming context gets its own scoped, named, filtered view — no "one canonical view for the whole company" forced consensus. | Multi-AC architecture (§2 below); per-AC name_map, operator customizations, dimension/measure subset, AC-level filter (§4). |
| Verification | Correctness is constructively guaranteed: every premise the AC's queries rely on is either declared and assertable, code-affirmed, or data-attested, with the warrant level made visible. | Integrity catalog + DQ machinery + per-condition warrant tracking + A/AA/AAA level rollup (Manual §7). |
| Scope stability | An AC's answers reproduce over time within its scope. Incremental data updates outside the scope do not perturb already-verified ACs. | Frozen scope phase (§5) + L2 stability filter (§6) + filter-aware staleness detection (§3.5 + §7). |
These four are the architectural value proposition. The technical claims in the position article (column-native, removal of join, two kinds of correctness, etc.) are the means; the four properties are the ends. Organizations adopt Coframe to install this combination of properties as a layer in their data stack.
2. Multi-AC at installation level
(Affects v2.0 §0.2 "Backend binding: one execution backend per AC" — superseded.)
2.1 The installation as a first-class concept
A Coframe installation is the unit of deployment: one connection to a physical data substrate, hosting many ACs over that substrate. This was always the design intent (per the Manual's stance on contextual multiplicity rather than canonical singularity), but v2.0 implicitly treated AC and installation as the same thing. v2.1 separates them.
The structural relationship:
CoframeInstallation
│
├── BackendBinding (one connection to physical data — sqlite, polars, duckdb)
│       ↓ reads
├── InstallationMetadata (L2 — see §3; shared across all ACs)
│
└── ACs[] (multiple analytical surfaces; each a "view" over L2)
      │
      └── for each AC:
            dimension subset
            measure subset
            name_map
            operator customizations
            AC-level filter (frozen at scope-set; see §4, §5)
            attestation config
            per-AC artifacts (L3 — see §3)
One installation, one backend binding, many ACs. Queries and authoring sessions both name an AC explicitly.
2.2 ACs as customization surfaces, not data containers
Under this model, an AC is not a copy of data nor a separate datastore. It is a customization surface over the installation's L2 metadata — a named, scoped, filtered, customized view that consumers query through. Two ACs over the same installation share the underlying physical data and the L2 metadata derived from it; what differs is their declared structure, names, and filter.
This recasts several v2.0 decisions:
| v2.0 decision | v2.1 update |
|---|---|
| "Backend binding: one execution backend per AC" | One execution backend per installation; multiple ACs share it. |
| .coframe/ workspace as the unit of AC artifact | .coframe/ workspaces still per-AC; installation-level artifacts live in a sibling installation.coframe/ workspace (see §3). |
| coframe-mcp wraps a bound AC | coframe-mcp wraps an installation; clients select an AC per session (or per query, depending on configuration). |
| coframe-author workbench session = one AC | Workbench session bound to installation first, then to an AC within it (or starting a new AC for that installation). |
2.3 InstallationConfig
A new top-level configuration object — InstallationConfig — specifies the installation:
# installation.coframe/installation.yaml
name: "retail-warehouse-prod"
description: "Production retail warehouse, all ACs"
backend:
  type: sqlite                        # | polars | duckdb
  source: retail_demo.db
stability_filter:                     # see §6
  default_hold_off_days: 7
  overrides:
    stores: { hold_off_days: 0 }      # master tables don't churn
    products: { hold_off_days: 0 }
    transactions: { hold_off_days: 7 }
acs:                                  # registry of ACs in this installation
  - path: ./acs/retail_east.coframe/
  - path: ./acs/retail_central.coframe/
  - path: ./acs/retail_west.coframe/
  - path: ./acs/finance_2024_closed.coframe/
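The per-table hold-off lookup implied by the stability_filter block above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's API: the class name StabilityFilter and the method hold_off_days_for are hypothetical, and only the field names and values come from the YAML above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StabilityFilter:
    """Mirrors the stability_filter block of installation.yaml (hypothetical class)."""
    default_hold_off_days: int = 7
    overrides: dict = field(default_factory=dict)  # table name -> hold-off in days

    def hold_off_days_for(self, table: str) -> int:
        # A per-table override wins; otherwise fall back to the installation default.
        return self.overrides.get(table, self.default_hold_off_days)


# Values copied from the example configuration above.
stability = StabilityFilter(
    default_hold_off_days=7,
    overrides={"stores": 0, "products": 0, "transactions": 7},
)

print(stability.hold_off_days_for("stores"))   # 0
print(stability.hold_off_days_for("returns"))  # 7 (no override, so the default applies)
```

Keeping the default and the overrides in one object means a table that is added to the warehouse later is protected by the default window without any configuration change.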
2.4 Cross-AC interactions in v1.0
v1.0 deliberately keeps ACs independent of each other. A query targets one AC; an authoring session edits one AC. Cross-AC operations (e.g., "compare the East and West ACs' revenue numbers for the overlap period") are not in v1.0. They are an obvious Pro feature (the "consistency-across-ACs" verification class), but require careful design — particularly around how shared physical data combined with divergent customizations can produce defensible "drift" reports.
3. The L1/L2/L3 metadata layering
(New section; closes a gap in v2.0 about where metadata physically lives.)
3.1 The three layers
Coframe operates over three layers of state:
| Layer | What it is | Where it lives | Per |
|---|---|---|---|
| L1 | physical data | the warehouse / database | one (installation's backend binding) |
| L2 | installation metadata derived from L1 | installation.coframe/ workspace | one (shared across all ACs in the installation) |
| L3 | per-AC artifacts derived from L2 + AC's declarations | each AC's .coframe/ workspace | many (one per AC) |
The layering is what makes shared computation efficient: L2 is computed once per L1 refresh and reused by every AC; L3 is computed per AC but most of its content is derived (cheap) from L2 rather than recomputed from L1.
3.2 What lives at L2 (installation-level, shared)
Anything that depends only on the physical reality, independent of any AC's interpretation:
- Table inventory and DDL. What tables exist, their schemas, their columns' physical types.
- Raw column profiles. Per column: nunique, null_rate, distribution shape, sample values, date range, mean string length, etc. (the quasi-metadata from v2.0's coframe-core).
- Distinct dimensional values. Per dimensional column: the set of distinct values observed. Crucial for the workbench's AC-filter picker (§5.3).
- FD candidates discovered from raw data. Pairs (X, Y) where X→Y holds in the L1 data, with support stats.
- Lineage extracted from processing code. SQL/Python jobs the installation knows about, their grain inferences, their measure-lineage classifications.
- Source-data timestamps. Per table: when the table was last touched (the updated_at-like signal used by §6's stability filter).
- Operator registry. A mapping from L1 catalog operator names (Manual ch. 10: SUM, MAX, HLL_MERGE, …) → the physical operator name the installation's bound backend uses, or an explicit unsupported marker. Per-backend because different substrates expose different SQL/dialect syntax: APPROX_DISTINCT is APPROX_COUNT_DISTINCT in BigQuery and Snowflake, approx_distinct in Trino, and unsupported (today) in vanilla SQLite. The registry inherits the catalog's per-backend defaults (Manual §10.2's backend_overrides) and is the runtime source of truth the resolver consults for execution-time translation. See §3.6.
L2 is expensive to compute the first time and cheap to refresh incrementally when L1 changes (only changed-table sub-stats need re-derivation). The operator registry is the cheap exception — it's installation-static, refreshed only when the bound backend changes or the catalog version is upgraded.
3.3 What lives at L3 (per-AC)
Anything that depends on the AC's specific interpretation of L2:
- Declarations. Dimension families, metric families, schema declarations, name_map, operator customizations, attestation config.
- AC-level filter. The frozen dimensional-value subset (§4).
- Per-AC FD-edge attestation results. An FD-edge declared in the AC may hold globally (cheap — copy the L2 result) or hold only within the AC's filter scope (re-derived on-demand; see §3.4).
- Per-AC integrity-catalog status. Which of the 27 integrity conditions have been verified, at what warrant level, with what evidence.
- AC's verification level. A / AA / AAA / none — computed deterministically from the per-condition status.
- Provenance log. Per AC, the trail of author decisions and verification operations and their outcomes.
- Operator override patch. An optional dict of catalog operator name → physical operator name (or unsupported marker) that overrides the installation's L2 operator registry for this AC only. Typical contents: empty (most ACs use the installation defaults) or one to three entries naming a custom UDF the AC author has installed in the warehouse, e.g., {"APPROX_DISTINCT": "approx_distinct_v2_udf"} to use a UDF instead of the backend's built-in approximate-distinct. The runtime computes the effective registry as L2 ⊕ L3-overrides, with L3 winning on overlap (§3.6). Lives on the AC catalog as operator_overrides.
L3 is small relative to L2 in terms of raw bytes (the AC's declarations are short) but rich in interpretive content (the warrant-tracking + provenance is what makes the AC trustworthy).
3.4 Derivation, not duplication
L3 statistics that look like duplicates of L2 statistics (e.g., "this AC's region cardinality") are derived projections of L2 through the AC's filter, not separately computed values. Example:
- L2: region has 3 distinct values: {West: 5 stores, Central: 4, East: 4} — across the whole installation.
- L3 (for an East-only AC): region has 1 distinct value: {East: 4 stores} — within scope.
The L3 figure is computed from the L2 figure + the AC filter, not by re-scanning the warehouse. This is the efficiency win the layering buys.
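The region example above can be sketched as a projection over cached L2 statistics. This is a minimal illustration under stated assumptions: the function name project_profile and the dict representation of a column profile are hypothetical, and the numbers are the ones from the example.

```python
# L2: installation-wide profile for the region column
# (store counts per region, as in the example above).
l2_region_profile = {"West": 5, "Central": 4, "East": 4}


def project_profile(l2_profile: dict, allowed_values: set) -> dict:
    """Derive an AC's L3 view of a dimension profile by applying its filter to L2."""
    return {value: count for value, count in l2_profile.items() if value in allowed_values}


# L3 for an East-only AC: no warehouse scan, just a projection of cached L2 stats.
l3_region_profile = project_profile(l2_region_profile, {"East"})
print(l3_region_profile)       # {'East': 4}
print(len(l3_region_profile))  # 1 distinct value within scope
```

The projection is this simple only because the filter is on the profiled column itself. A filter on a different dimension would need joint statistics at L2, which this sketch does not model.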
3.5 Refresh semantics and cache invalidation
A small contract governs when each layer invalidates:
| Trigger | What invalidates |
|---|---|
| L1 changes (ETL run, manual update) | L2 partial refresh (only affected tables' sub-stats) |
| L2 refresh | L3 selective invalidation: for each AC, check whether the refreshed L2 portion intersects the AC's scope; if yes, mark the AC's affected L3 entries as stale; if no, AC stays clean |
| AC declarations change | L3 invalidates only the affected entries (e.g., adding a new FD-edge declaration invalidates that edge's attestation result; doesn't touch unrelated results) |
| AC scope change | Not allowed mid-session (§5). Counts as a new AC; fork or replace. |
The cleanest framing: L3 is a cached projection of L2. The cache key for each L3 entry is (L2 version, AC scope, relevant AC declarations). Cache invalidates when any element of that key changes.
The workbench's "needs re-verification" UI signal maps directly to "L3 cache key has moved." The workbench's data_changes_in_scope operation (§7.4) is the user-facing surfacing of this signal.
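The cache-key framing above can be sketched as a small value object. The names L3CacheKey and is_stale are hypothetical, and the sketch assumes that l2_version identifies the portion of L2 that falls inside the AC's scope, so that an L2 refresh outside the scope leaves the key unchanged.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class L3CacheKey:
    l2_version: str       # assumed: version tag of the in-scope portion of L2
    ac_scope: tuple       # the AC's frozen scope predicates
    declarations: tuple   # only the declarations this L3 entry depends on

    def digest(self) -> str:
        # Sort the tuples so the key does not depend on declaration order.
        payload = json.dumps(
            [self.l2_version, sorted(self.ac_scope), sorted(self.declarations)]
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_stale(cached: L3CacheKey, current: L3CacheKey) -> bool:
    """An L3 entry needs re-verification when any element of its key has moved."""
    return cached.digest() != current.digest()


scope = ("region IN ('East')",)
decls = ("fd:store_id->region",)
cached = L3CacheKey("l2-v41", scope, decls)

print(is_stale(cached, L3CacheKey("l2-v41", scope, decls)))  # False
print(is_stale(cached, L3CacheKey("l2-v42", scope, decls)))  # True
```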
3.6 Operator registry: L1 → L2 → L3
The operator registry is the canonical example of the layering's override semantics. Three layers, three concerns:
| Layer | Owns | Contents | Mutability |
|---|---|---|---|
| L1 (catalog) | What operators exist and their structural properties | Manual ch. 10 — name, kind, partition_invariance, identity_preservation, type_signature, default naming, missing-value treatment. No physical syntax. | Per catalog version (coframe-core-catalog/<x.y.z>). Same across all installations targeting that version. |
| L2 (installation registry) | How operators bind to this backend | Mapping from L1 operator name → physical operator name for the installation's bound substrate, or unsupported. Seeded from the catalog's backend_overrides (Manual §10.2); per-installation overrides land on top. | Installation-static. Refreshed only on backend change or catalog upgrade. |
| L3 (per-AC override patch) | The AC's deviations from L2 | A dict of catalog operator name → physical operator name. Usually empty; a small handful of entries when the AC author has installed custom UDFs in the warehouse that this AC depends on. | Per-AC. Captured on the AC catalog as operator_overrides. |
Effective registry at runtime = L2 ⊕ L3-overrides, with L3 winning on overlap.
Concrete example:
- L2 (SQLite installation): {SUM: "SUM", MEDIAN: "MEDIAN", APPROX_DISTINCT: <unsupported>}
- L3 (an analytics AC that has a custom UDF): {APPROX_DISTINCT: "approx_distinct_v2_udf"}
- Effective for this AC: {SUM: "SUM", MEDIAN: "MEDIAN", APPROX_DISTINCT: "approx_distinct_v2_udf"}
The same SQLite installation hosting a different AC without that override would see APPROX_DISTINCT as unsupported for that AC's queries — clean isolation per AC.
Why the layering matters:
- The catalog stays backend-agnostic. Adding THETA_INTERSECTION to the catalog (L1) is a behavior-only change; per-backend bindings land in each backend's L2 registry shipped with its package.
- Adding a backend is L2-only work. A new backend ships with its own L2 registry; the catalog is untouched. (This is the W3 abstraction's payoff applied to operators.)
- Custom UDFs are declarative. An AC author who has installed a domain-specific UDF in their warehouse can use it via an operator_overrides entry; no backend code change is required.
- Refusals are structured. A query referencing an operator absent from L1 → UnknownOperatorError (refusal points at catalog version). Referencing an operator in L1 but with no L2/L3 binding → OperatorUnsupportedError (refusal suggests the L3 override path).
Implementation lands in coframe-resolution (planner consults the effective registry) + per-backend operator_registry modules (coframe-sqlite, coframe-polars, coframe-duckdb, …). See Manual §10.13.4 for the runtime-side cross-reference.
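The override semantics and the two structured refusals described in this section can be sketched as follows. The error class names come from the text above; everything else (the function names, using None as the unsupported marker, representing the catalog as a set of names) is an assumption made for illustration and is not the coframe-resolution implementation.

```python
UNSUPPORTED = None  # stand-in for the "unsupported" marker (representation is assumed)


class UnknownOperatorError(LookupError):
    """The operator is absent from the catalog."""


class OperatorUnsupportedError(LookupError):
    """The operator is in the catalog but has no physical binding for this AC."""


def effective_registry(l2_registry: dict, l3_overrides: dict) -> dict:
    # Later mappings win on key collisions, so L3 overrides L2.
    return {**l2_registry, **l3_overrides}


def resolve_operator(name: str, catalog: set, registry: dict) -> str:
    if name not in catalog:
        raise UnknownOperatorError(f"{name} is not in the operator catalog")
    physical = registry.get(name, UNSUPPORTED)
    if physical is UNSUPPORTED:
        raise OperatorUnsupportedError(
            f"{name} has no binding; consider an operator_overrides entry"
        )
    return physical


# The concrete example from this section.
catalog = {"SUM", "MEDIAN", "APPROX_DISTINCT"}
l2 = {"SUM": "SUM", "MEDIAN": "MEDIAN", "APPROX_DISTINCT": UNSUPPORTED}
l3 = {"APPROX_DISTINCT": "approx_distinct_v2_udf"}

print(resolve_operator("APPROX_DISTINCT", catalog, effective_registry(l2, l3)))
# approx_distinct_v2_udf
```

An AC with an empty override patch resolves against L2 alone, so the same lookup raises OperatorUnsupportedError for it, which is the per-AC isolation described above.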
4. AC-level filter — the fourth orthogonal customization control
(Extends v2.0 §0.2 list of per-AC customization controls.)
4.1 The fourth control
v2.0 names three orthogonal AC-level customization controls: dimension subset, measure subset, name_map + operator customizations. v2.1 adds a fourth: AC-level filter — a frozen scope restriction on which rows the AC includes from L1.
| Control | What it scopes |
|---|---|
| Dimension / measure subset | which columns the AC exposes |
| name_map + operator customizations | how the exposed columns are named + computed |
| attestation config | what verifications are required |
| AC-level filter | which rows the AC includes |
The four together specify the AC's full scope and interpretation surface.
4.2 Expressiveness: dimensional-value subsetting only in v1.0
The AC-level filter is restricted in v1.0 to Boolean predicates over the AC's dimensional surface. Examples:
- region IN ('East', 'West') — geographic scoping
- department = 'Electronics' — vertical scoping
- sale_date >= '2025-01-01' — temporal scoping (still dimensional value subsetting; time is a dimension)
- customer_segment IN ('Enterprise', 'SMB') — segment scoping
- (region IN ('East', 'West')) AND (sale_date BETWEEN '2024-10-01' AND '2024-12-31') — multiple dimensional predicates combined
What's NOT in v1.0 (deferred to Pro):
- Filters over measure columns (revenue > 0) — these create verification subtleties (rollup attestation becomes filter-dependent in non-obvious ways).
- Joins to other tables (store_id IN (SELECT store_id FROM trusted_stores)) — adds parser complexity and verification implications.
- Subqueries in general.
- User-defined-function predicates.
The dimensional-value-subsetting restriction captures the common case (well over 90% of real authoring use cases per industry experience) while keeping the verification semantics clean: a filter on region doesn't break store_id → city → region — those FDs still hold within any subset.
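The v1.0 restriction can be sketched as a validation step over a structured predicate. This is a minimal illustration: DimPredicate, validate_ac_filter, and the ALLOWED_OPS set are hypothetical names, and the workbench's real filter representation is not specified in this document.

```python
from dataclasses import dataclass

ALLOWED_OPS = {"=", "IN", ">=", "<=", "BETWEEN"}  # illustrative subset, not a spec


@dataclass(frozen=True)
class DimPredicate:
    column: str
    op: str
    values: tuple


def validate_ac_filter(predicates: list, dimensions: set) -> None:
    """Reject anything other than value predicates over the AC's dimensions."""
    for pred in predicates:
        if pred.column not in dimensions:
            raise ValueError(
                f"{pred.column!r} is not a dimension; v1.0 AC filters are dimensional only"
            )
        if pred.op not in ALLOWED_OPS:
            raise ValueError(f"operator {pred.op!r} is not allowed in an AC-level filter")


dimensions = {"region", "department", "sale_date", "customer_segment"}

# Accepted: two dimensional predicates, implicitly AND-combined.
validate_ac_filter(
    [
        DimPredicate("region", "IN", ("East", "West")),
        DimPredicate("sale_date", "BETWEEN", ("2024-10-01", "2024-12-31")),
    ],
    dimensions,
)

# Rejected: revenue is a measure, so this raises ValueError.
# validate_ac_filter([DimPredicate("revenue", ">=", (0,))], dimensions)
```

Because the predicate carries no subquery or function slot at all, joins, subqueries, and UDF predicates are unrepresentable rather than merely rejected.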
4.3 Composition with user queries at runtime
At query time, the AC-level filter is fused invisibly with the user's WHERE clause. In step 4 (resolution) of the query path (v2.0 §1.4), the resolver takes the user's resolved AST and AND-joins the AC's filter predicates with the user's predicates. The user does not see the AC filter in their query, in error messages, or in coherence_posture annotations; they see only their own scoping. The AC's filter is part of the "what the AC observes" boundary, not part of "what the user asked."
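The AND-join can be sketched on SQL text. This is a simplification for brevity: the resolver described above works on a resolved AST, and the function name fuse_filters is hypothetical.

```python
from typing import Optional


def fuse_filters(user_where: Optional[str], ac_filter: Optional[str]) -> Optional[str]:
    """AND-join the AC's frozen filter with the user's own predicate (text form)."""
    # Parenthesize each side so an OR in either predicate cannot leak across the AND.
    parts = [f"({p})" for p in (ac_filter, user_where) if p]
    return " AND ".join(parts) if parts else None


ac_filter = "region IN ('East', 'West')"  # frozen at scope commit
user_where = "department = 'Electronics' OR department = 'Toys'"

print(fuse_filters(user_where, ac_filter))
# (region IN ('East', 'West')) AND (department = 'Electronics' OR department = 'Toys')
```

The parentheses matter: without them, the user's OR would bind more loosely than the AND and let rows outside the AC's scope through.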
4.4 Verification implications under filter
Filter-bounded ACs have well-defined verification semantics:
- FD-edge attestation: an FD-edge X → Y declared in the AC must hold within the AC's filter scope, not necessarily globally. If store_id → region fails globally (a store moved regions historically) but holds within region IN ('East'), the East AC can validly attest the edge.
- Sibling-coherence (Phase 3): the rollup must equal the rolled-up detail within the AC's filter scope. If a pre-aggregate table covers the whole installation but the AC restricts to East, attestation compares only the East portions.
- Grain uniqueness: within the filter, the schema's declared grain identifies rows uniquely. Trivially preserved by any pure dimensional filter (filtering rows doesn't introduce duplicates).
These follow naturally from "the AC's correctness guarantees are about what the AC observes." No special verification machinery is needed; the existing DQ checks just apply the AC filter before running.
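The FD-edge case can be reproduced with an in-memory SQLite table: the same check runs twice, once unfiltered and once with the AC filter applied first. The table, its rows, and the function name fd_violations are made up for illustration; identifiers are interpolated into the SQL directly, which is acceptable only because they are trusted constants in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_history (store_id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO store_history VALUES (?, ?)",
    # Store 1 moved from West to East at some point in its history.
    [(1, "West"), (1, "East"), (2, "East"), (3, "Central")],
)


def fd_violations(conn, table, lhs, rhs, where="1=1"):
    """Count lhs values that map to more than one rhs value within `where`."""
    sql = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {lhs} FROM {table} WHERE {where} "
        f"GROUP BY {lhs} HAVING COUNT(DISTINCT {rhs}) > 1)"
    )
    return conn.execute(sql).fetchone()[0]


# Globally, store_id -> region fails for store 1.
print(fd_violations(conn, "store_history", "store_id", "region"))  # 1
# Within the East AC's scope the edge holds.
print(fd_violations(conn, "store_history", "store_id", "region", "region IN ('East')"))  # 0
```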
4.5 The UI affordance
In the workbench, AC-level filter is presented as a peer to dimension/measure selection — see §5 for the scope-setting phase. After commit, it appears as a chip in the AC overview with the predicate text and the row count it scopes (region IN ('East', 'West') · ~120K rows / 523K total). The chip is read-only; the user can fork the AC to change it.
5. Frozen-scope phase at session start
(Refines v2.0 §1.5 "What runs where, at AC-authoring time" — the previous "no fixed sequence" claim is partially walked back.)
5.1 The constraint
v2.0 described the workbench as "interactive at the core, not linear" — and that's right for the post-scope-set phase. But there is, in fact, a one-time scope-setting step at the start of every AC session that must precede the interactive phase:
At the start of every AC session, the user makes three peer scope decisions together: dimension subset, measure subset, and AC-level filter. These three are committed as the AC's scope. Once committed, the scope is frozen for the rest of the session.
After scope commit, everything is interactive as v2.0 described. Before scope commit, the workbench surfaces only scope-setting operations; the rest of the operation catalog is greyed out.
5.2 Why the scope is frozen
Freezing the scope at session start is what makes per-AC verification meaningful. Without it:
- Every filter change invalidates every per-AC attestation result downstream (an FD-edge that holds under filter A might fail under filter B; sibling-coherence under filter A says nothing about filter B).
- The user experience becomes confusing: "Did I verify this with the filter I'm using now, or an earlier filter?"
- The L3 cache thrashes constantly.
With the freeze, per-AC results are stable artifacts: they describe the AC at its committed scope, and they remain valid as long as the underlying L2 hasn't moved within that scope. Changing scope = a new AC (fork the session, replace it, or start fresh). This is honest: scope is a load-bearing structural commitment, not a flippable knob.
5.3 What the user does in the scope-setting phase
The workbench's scope-setting phase is short — typically a few minutes for a focused AC — and is the only place where ordering is enforced:
- Bind the session to an installation. The workbench reads installation.coframe/installation.yaml and the L2 metadata; the user sees what tables are available, what their columns are, the installation's stability filter setting (§6), etc.
- Pick tables of interest. A subset of the installation's tables. The user can iterate (look at samples, look at column lists) before committing — this part is exploratory.
- Pick columns of interest from those tables. The dimension and measure subsets. Again exploratory — the user can profile columns, see distributions, before committing.
- Set the AC-level filter. Using L2's dimensional-value enumerations, the user picks values from checkbox lists (for low-cardinality dims) or pickers (for date ranges). No SQL typed. The filter preview shows the scoped row count and any verification implications ("with this filter, your AC scope is 120K rows / 523K total; 3 FD-edges that don't hold globally will likely hold within scope").
- Commit scope. The session's scope_phase field flips from "selecting" to "committed". The rest of the workbench's operations unlock; the scope-setting controls become read-only.
5.4 Workbench state implication
The session-state model in v2.0 §7.2 gets a small but load-bearing addition:
WorkbenchSession {
    session_id: str
    workspace_path: Path
    installation_id: str                             # NEW in v2.1
    scope_phase: Literal["selecting", "committed"]   # NEW
    ...
}
The scope_phase is a single source-of-truth boolean (effectively) that gates which operations are available, which UI panels are interactive vs. read-only, and whether attestation can run. Forking an AC duplicates the AC's ac_state with scope_phase = "selecting" so the user can revise scope; the original AC is unaffected.
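The gating behaviour of scope_phase can be sketched as a small immutable state object. Only the field name scope_phase and its two values come from the session model above; the class name SessionScope and its methods are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Literal


@dataclass(frozen=True)
class SessionScope:
    scope_phase: Literal["selecting", "committed"] = "selecting"
    dimensions: frozenset = frozenset()
    measures: frozenset = frozenset()
    ac_filter: str = ""

    def commit(self) -> "SessionScope":
        if self.scope_phase == "committed":
            raise RuntimeError("scope is already committed; fork the AC to change it")
        return replace(self, scope_phase="committed")

    def fork(self) -> "SessionScope":
        # A fork copies the scope but reopens it for editing; the original is untouched.
        return replace(self, scope_phase="selecting")

    def require_committed(self, operation: str) -> None:
        if self.scope_phase != "committed":
            raise PermissionError(f"{operation} is unavailable until scope is committed")


draft = SessionScope(
    dimensions=frozenset({"region", "sale_date"}),
    measures=frozenset({"revenue"}),
    ac_filter="region IN ('East')",
)
committed = draft.commit()
committed.require_committed("attest_fd_edge")  # passes silently
forked = committed.fork()
print(committed.scope_phase, forked.scope_phase)  # committed selecting
```

Making the object frozen means there is no code path that mutates a committed scope in place; every change of scope produces a new object, which matches the fork-to-change rule.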
5.5 Reconciling with v2.0's "interactive at the core" framing
v2.0 was right that the bulk of the workbench experience is interactive — but it understated the role of the early scope-setting phase. The honest framing:
The workbench has two phases per session: a brief, ordered scope-setting phase (commit your scope), then an open-ended, interactive working phase (explore, declare, verify, iterate within that scope, in any order).
The first phase is constrained because scope is load-bearing; the second is open because the per-condition verifications within a fixed scope have no order dependencies.
6. L2 stability filter — the installation-level integrity-hygiene filter
(New section; pins a warehouse-hygiene contract Coframe depends on.)
6.1 The filter
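At the installation level, Coframe applies a stability filter to every data read from L1 before deriving L2. The filter is a per-table (or per-table-class) hold-off window on the table's data-update timestamp: a row is admitted to L2 only once updated_at < (today − N days), where N is the hold-off configured for that table (§2.3, §6.2).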
The stability filter is NOT a business decision (which would belong in an AC's filter, §4); it is an integrity-hygiene decision about which data the installation treats as authoritative. It says "we treat data as stable only after N days have passed since its updated_at." Once the installation admin sets it, every AC downstream — regardless of its own scope — is protected from late-arrival churn.
6.2 The cut between L1 / L2 / AC filters
This produces a clean three-level filter architecture:
| Layer | Filter type | Purpose | Who configures |
|---|---|---|---|
| L1 | none | physical reality (everything) | warehouse/DB admin |
| L2 | updated_at < (today − N days) | data-stability boundary — keeps late-arrivals + corrections out of installation view | installation admin (one-time, parametric) |
| AC | business predicates (region IN ..., etc.) | analytical scoping per AC | AC author (frozen at scope-set) |
Each layer's filter serves a distinct purpose; together they let an installation host multiple ACs that all share a stable, integrity-hygiened view of the data without each AC author having to remember to add their own hold-off buffer.
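The L2 predicate from the table above, updated_at < (today − N days), can be sketched directly. The function name is_stable is hypothetical, and timestamps are modeled as naive datetime values for simplicity.

```python
from datetime import datetime, timedelta


def is_stable(updated_at: datetime, now: datetime, hold_off_days: int) -> bool:
    """L2 admits a row only once its write timestamp is older than the hold-off."""
    return updated_at < now - timedelta(days=hold_off_days)


now = datetime(2026, 5, 23)

# 7-day hold-off (e.g. a fact table): the cutoff is 2026-05-16.
print(is_stable(datetime(2026, 5, 10), now, hold_off_days=7))  # True
print(is_stable(datetime(2026, 5, 20), now, hold_off_days=7))  # False

# 0-day hold-off (e.g. a master table): anything already written is visible.
print(is_stable(datetime(2026, 5, 20), now, hold_off_days=0))  # True
```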
6.3 Warehouse-hygiene contract
The stability filter has prerequisites Coframe expects warehouses to follow. These are best practice already in mature warehouse environments; v1.0 documents them as Coframe's explicit dependencies:
- Every table needs an update timestamp. updated_at, ingested_at, loaded_at, etl_run_at: names vary, but the column must exist. Without it, L2 can't apply the stability filter for that table, and the insulation breaks for any AC that touches it.
- The timestamp must reflect data-write time, not business time. updated_at ≠ sale_date. The stability filter cares about when the row landed in the warehouse, not what date the event represents.
- Different tables may need different hold-off windows. Fact tables churn (transactions land late for days); dimension tables update slowly (a new store, a renamed product). A per-table or per-table-class N is more honest than one global N. Stored as a map in installation.yaml (§2.3).
- Soft-delete-or-correction semantics matter. If a historical record gets corrected (UPDATE on an existing row), the updated_at advances and the L2 filter re-includes the row when the hold-off elapses. This is correct behavior (corrections should propagate after they've stabilized), but it means the L3 caches of ACs whose scope includes the corrected record will invalidate when the correction settles. The framework surfaces this as a verification-needs-refresh signal (§7.3) rather than silently absorbing it.
6.4 Bind-time diagnostic
When the workbench (or the runtime) binds to an installation, it runs a diagnostic that enumerates the L1 tables and checks which have an updated_at-like column. Tables without one are flagged:
[WARN] Table `transactions_legacy` has no update timestamp column.
       The installation's stability filter cannot apply to this table.
       Late-arrivals to this table will leak into any AC that includes it.
       Options:
         1. Add an updated_at column (recommended) — fix ETL to populate it.
         2. Declare an explicit override in installation.yaml — affirm that
            this table doesn't churn after load:
              stability_filter.overrides.transactions_legacy:
                no_update_timestamp_acceptable: true
                reason: "Static legacy table, never updated after import"
         3. Exclude this table from the installation (don't make it
            available to ACs).
The diagnostic is loud and constructive: it identifies the gap, names the available remediations, and provides an explicit override path for cases where the admin knows the table doesn't need protection. Coframe doesn't assume; it asks for an affirmation.
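One way the diagnostic could be approximated against SQLite is sketched below. It is an assumption-laden illustration: the function names, the fixed pattern list, and the use of `PRAGMA table_info` to read column names are choices made for this example, and overrides are passed as a plain set rather than read from `installation.yaml`.

```python
import sqlite3

# Assumed column-name patterns; the document describes these as configurable.
TIMESTAMP_COLUMNS = ("updated_at", "ingested_at", "loaded_at", "etl_run_at")


def find_update_timestamp_column(conn, table):
    """Return the first update-timestamp-like column of `table`, else None."""
    quoted = table.replace('"', '""')
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{quoted}")')]
    for candidate in TIMESTAMP_COLUMNS:
        if candidate in columns:
            return candidate
    return None


def bind_time_diagnostic(conn, overrides=frozenset()):
    """List tables that lack an update timestamp and have no explicit override."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    return [
        t for t in tables
        if find_update_timestamp_column(conn, t) is None and t not in overrides
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, sale_date TEXT, updated_at TEXT)")
conn.execute("CREATE TABLE transactions_legacy (id INTEGER, sale_date TEXT)")

for table in bind_time_diagnostic(conn):
    print(f"[WARN] Table `{table}` has no update timestamp column.")
```

Passing `overrides={"transactions_legacy"}` silences the warning, mirroring the explicit-affirmation path: the table is accepted only because the admin said so.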
6.5 Documentation deliverable¶
The warehouse-hygiene contract belongs in v1.0's documentation set as a first-page concern:
- New doc: `docs/warehouse_hygiene_contract.md` — covers the update-timestamp requirement, the per-table hold-off rationale, soft-delete/correction semantics, the bind-time diagnostic, and the override paths. Written for the installation admin / data engineer who provisions the warehouse for Coframe.
- Position-article callout: one sentence in the §"What Coframe is" section noting that Coframe assumes a warehouse-hygiene contract (with link to the dedicated doc).
- Tutorials: the getting-started tutorial includes a "preparing your warehouse for Coframe" section that walks through the contract.
This pinning is intentional: by making the dependency explicit, Coframe also serves as a forcing function for better warehouse hygiene — warehouses that already follow the contract get the integrity benefits immediately; warehouses that don't get a clear signal that they should.
7. Incremental update insulation¶
(New section; names the architectural property that falls out of §§4-6.)
7.1 The insulation principle¶
An AC with a frozen scope is invariant under data changes outside its scope.
Concretely: an AC scoped to sale_date BETWEEN '2024-10-01' AND '2024-12-31' is unaffected by 50,000 new transactions arriving tonight with sale_date >= today. The AC's L1 data — as observed through its scope — is unchanged; L3 stats stay valid; FD-edge attestations stay valid; sibling-coherence stays valid; the AC's verification level (A/AA/AAA) does not budge.
This is a real architectural property — one of the four named in §1.4 ("scope stability") — and it has substantial practical value.
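The invariance claim can be illustrated with a toy fingerprint over the in-scope rows. The row shape and the helper `scope_fingerprint` are invented for this sketch; the framework's actual attestation mechanism is not shown.

```python
import hashlib
from datetime import date

SCOPE_START, SCOPE_END = date(2024, 10, 1), date(2024, 12, 31)


def in_scope(row):
    """The frozen scope predicate: sale_date BETWEEN 2024-10-01 AND 2024-12-31."""
    return SCOPE_START <= row["sale_date"] <= SCOPE_END


def scope_fingerprint(rows):
    """Hash the in-scope rows in a canonical order (illustrative helper)."""
    visible = sorted(
        (r["id"], r["sale_date"].isoformat(), r["amount"])
        for r in rows
        if in_scope(r)
    )
    return hashlib.sha256(repr(visible).encode()).hexdigest()


rows = [
    {"id": 1, "sale_date": date(2024, 11, 5), "amount": 120},
    {"id": 2, "sale_date": date(2024, 12, 30), "amount": 75},
]
before = scope_fingerprint(rows)

# Tonight's load: a new transaction dated outside the frozen scope.
rows.append({"id": 3, "sale_date": date(2026, 5, 23), "amount": 40})
after_out_of_scope = scope_fingerprint(rows)

# A late arrival dated inside the scope: the soft edge discussed in §7.3.
rows.append({"id": 4, "sale_date": date(2024, 12, 15), "amount": 10})
after_in_scope = scope_fingerprint(rows)
```

The fingerprint is unchanged by the out-of-scope row and changes only when a row lands inside the scope, which is the one case where the AC's verification state is at risk.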
7.2 Patterns the insulation enables¶
| Pattern | What it does |
|---|---|
| Closed-period ACs | "Q4 2024 analytics" — once Q4 closes (no more backfills expected per the stability filter's hold-off), the AC is authored once, verified, and stays at AAA forever. The verification doesn't expire. Auditors love this. |
| Reproducible-snapshot ACs | "Annual report for 2024" — query against this AC today or in five years produces the same number. The provenance bundle + scope + DQ deliverable = a complete audit-grade snapshot. |
| Cheap incremental re-attestation | Only ACs whose scope intersects changed L1 data need re-verification. A current-quarter AC and a historical AC don't interfere with each other's freshness. |
| Parallel authoring of new periods | While the "2024 final" AC sits stable, a new "2025 H1" AC can be authored independently with no risk of touching the older one's verification state. |
| Per-region ACs at independent cadences | An East-region AC verified yesterday isn't invalidated by new West-region data landing today. Each region's AC has its own refresh cycle. |
7.3 Where insulation has a soft edge: late-arriving in-scope data¶
The insulation is conditional on data within the scope being immutable. The L2 stability filter (§6) handles the common case of late-arrival churn, but doesn't eliminate it entirely:
- Late-arriving facts beyond the hold-off window. A transaction with `sale_date = 2024-12-15` arrives today (more than 7 days late). It IS within a "Q4 2024" AC's scope even though the scope predicate hasn't changed. The hold-off filter excludes it from L2 today, but eventually it'll cross the boundary and become part of L2 — and then L3 caches for any AC whose scope includes it will invalidate.
- Historical corrections. Someone fixes a 2024 transaction record. Same mechanism — once the corrected `updated_at` advances past the hold-off, L2 picks up the change and downstream AC caches invalidate.
- Backfills. Large historical period re-loaded with corrections. The biggest version of the same problem.
These cases ARE precisely the demo's Stage 5b "stale by one cycle" beat, generalized: any AC whose scope includes a record that lands or changes after the AC's last attestation has stale L3 entries.
7.4 Architectural mechanisms for handling it¶
Three mechanisms together let the framework expose this honestly:
- L3 cache key includes L2 version. When L2 refreshes (because hold-off elapsed for some late data), any L3 entry whose key references the now-outdated L2 version is marked stale.
- Filter-aware staleness signal. When the workbench / runtime detects stale L3 entries, it surfaces only the ones whose AC scope intersects the changed L2 portion. ACs whose scope doesn't intersect stay green.
- `data_changes_in_scope` workbench operation. A user-facing operation: "show me what changed in my AC's scope since the last attestation." Surfaces late-arrivals + corrections explicitly so the AC author can decide whether to re-verify.
The combination: the framework is honest about when an AC's verification is stale (without panicking), and gives the AC author the tools to act on it.
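A rough sketch of how the first two mechanisms might interact: entries carry the L2 version they were derived from, and only ACs whose scope intersects the changed rows are surfaced. The types, field names, and the function `acs_needing_refresh` are hypothetical, and scopes are modeled as plain Python predicates for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class L3Entry:
    ac_name: str
    l2_version: int  # the L2 version the cached stats were derived from
    value: float


def acs_needing_refresh(
    cache: List[L3Entry],
    scopes: Dict[str, Callable[[dict], bool]],
    current_l2_version: int,
    changed_rows: List[dict],
) -> List[str]:
    """Return ACs whose entries reference an outdated L2 version AND whose
    scope intersects the rows that changed in the L2 refresh."""
    flagged = []
    for entry in cache:
        if entry.l2_version == current_l2_version:
            continue  # derived from the current L2; still valid
        in_scope = scopes[entry.ac_name]
        if any(in_scope(row) for row in changed_rows):
            flagged.append(entry.ac_name)
    return flagged


scopes = {
    "east_q4": lambda row: row["region"] == "East",
    "west_q4": lambda row: row["region"] == "West",
}
cache = [L3Entry("east_q4", 41, 1200.0), L3Entry("west_q4", 41, 950.0)]
changed = [{"region": "East", "sale_date": "2024-12-15"}]

flagged = acs_needing_refresh(cache, scopes, current_l2_version=42, changed_rows=changed)
```

Here both entries reference the outdated L2 version 41, but only `east_q4` is flagged, because the changed row falls outside `west_q4`'s scope.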
7.5 Hold-off buffer as the AC author's practical pattern¶
The hold-off concept appears at two scales:
- L2 stability filter (§6): installation-level hold-off, applies to all ACs uniformly.
- AC author's filter discipline: an AC author may additionally scope to `sale_date < (today − 30 days)` to give themselves extra buffer beyond what L2 provides, when authoring a historical-analytics AC where they want minimal disturbance.
These compose cleanly: with a 7-day hold-off, L2 ensures that rows written within the last 7 days are not visible to anyone; the AC author who wants 30 days of buffer adds it to their own filter. The "Q4 2024 analytics" AC, by virtue of scoping to a closed past period, automatically gets indefinite stability once that period is well past the L2 hold-off.
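The composition of the two scales can be sketched as two independent predicates, one keyed on write time and one on business time. The function names and the row shape are assumptions made for illustration; the 7-day and 30-day values are the examples used in the text.

```python
from datetime import date, datetime, timedelta

L2_HOLD_OFF = timedelta(days=7)       # installation-level example value
AC_EXTRA_BUFFER = timedelta(days=30)  # the AC author's own choice


def visible_in_l2(updated_at: datetime, now: datetime) -> bool:
    """Installation-level rule, keyed on write time."""
    return updated_at < now - L2_HOLD_OFF


def in_ac_scope(sale_date: date, today: date) -> bool:
    """AC author's rule, keyed on business time."""
    return sale_date < today - AC_EXTRA_BUFFER


def visible_to_ac(row: dict, now: datetime) -> bool:
    """A row reaches the AC only if both independent predicates hold."""
    return visible_in_l2(row["updated_at"], now) and in_ac_scope(row["sale_date"], now.date())


now = datetime(2026, 5, 23, 12, 0)
old_and_settled = {"sale_date": date(2026, 3, 1), "updated_at": datetime(2026, 3, 2)}
recent_sale = {"sale_date": date(2026, 5, 10), "updated_at": datetime(2026, 5, 11)}
recent_correction = {"sale_date": date(2026, 3, 1), "updated_at": datetime(2026, 5, 22)}
```

Only `old_and_settled` passes both checks: `recent_sale` is settled in L2 but inside the author's 30-day buffer, and `recent_correction` is in scope by date but still held back by L2.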
7.6 Why this is one of Coframe's underrated wins¶
The contextual-multiplicity stance the position article takes — many ACs over the same physical data, each scoped to a context — has an underappreciated consequence: it decouples the analytical surfaces from each other. Each AC's stability is independent. The semantic-layer "one model for the whole company" framing inherently couples every analytical surface to every data change; every refresh forces a re-think across the whole layer. Per-context ACs with frozen scopes break that coupling. The verification level becomes a stable, per-AC property — which is what makes A/AA/AAA computable and meaningful in the first place.
Worth surfacing as a feature in the position article's claims list (claim 8, perhaps), and in the v1.0 marketing materials. The benefit is concrete: an organization with 50 ACs over the same warehouse can have 50 independent verification cadences, instead of one big synchronized re-verification ceremony every time data lands.
8. Component implications — what changes in each package¶
(Per-component deltas vs. v2.0. Most are additive; a few sections in v2.0 are superseded.)
8.1 coframe-core (v2.0 §2 + §2A territory)¶
Additions:
- `InstallationConfig` type (v2.1 §2.3). Top-level Pydantic model loaded from `installation.yaml`.
- L2 metadata types (v2.1 §3.2). The catalog of installation-level metadata artifacts; what they look like in memory.
- L3-derives-from-L2 helpers. Cheap projection utilities the workbench / runtime call to compute per-AC stats from L2 + AC scope (rather than re-running against L1).
- `scope_phase` enum + state-machine on the AC declaration model (v2.1 §5.4). Locks scope mutability after commit.
- `AC.filter` field on the AC catalog model — the dimensional-value subset predicate (§4).
- Integrity catalog enrichment. Each catalog entry gains a `scope_sensitivity` field: whether the condition's verification result depends on the AC's filter scope. Most do; a few (catalog-level conditions like operator partition-invariance) don't.
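As an illustration of the `scope_phase` state machine, the sketch below locks the filter after commit. It uses standard-library dataclasses rather than Pydantic to stay self-contained; the enum member names, the class `ACDeclaration`, and the error type are assumptions, not the actual coframe-core model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ScopePhase(Enum):
    SCOPE_SETTING = "scope_setting"  # filter may still change
    WORKING = "working"              # filter is frozen


class ScopeFrozenError(RuntimeError):
    """Raised when the scope is edited or re-committed after commit."""


@dataclass
class ACDeclaration:
    name: str
    filter: Optional[str] = None  # dimensional-value subset predicate (§4)
    scope_phase: ScopePhase = ScopePhase.SCOPE_SETTING

    def set_filter(self, predicate: str) -> None:
        if self.scope_phase is not ScopePhase.SCOPE_SETTING:
            raise ScopeFrozenError(f"scope of {self.name!r} is frozen; fork a new AC instead")
        self.filter = predicate

    def commit_scope(self) -> None:
        """One-way transition from scope-setting to working."""
        if self.scope_phase is ScopePhase.WORKING:
            raise ScopeFrozenError(f"scope of {self.name!r} is already committed")
        self.scope_phase = ScopePhase.WORKING


ac = ACDeclaration("q4_2024")
ac.set_filter("region IN ('East')")
ac.commit_scope()

try:
    ac.set_filter("region IN ('West')")
    rejected = False
except ScopeFrozenError:
    rejected = True
```

Making the transition one-way keeps the frozen scope a property of the model itself rather than something each caller has to remember to enforce.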
8.2 coframe-connect (v2.0 §3)¶
Additions to the Backend protocol's data-API surface:
- `enumerate_acs()` → `list[ACMeta]` — for the multi-AC operations layer; lets a workbench / MCP server enumerate ACs hosted in the installation.
- `apply_stability_filter(table, hold_off_days)` — the backend implements its own stability filter logic against `updated_at` columns; receives the hold-off config from the installation.
- `check_update_timestamp_column(table)` → `Optional[str]` — bind-time diagnostic helper: backend returns the name of an `updated_at`-like column if one exists, else None.
- `data_changed_in_scope(ac_scope, since_timestamp)` → bool + diff — for the workbench's `data_changes_in_scope` operation (§7.4).
The execution-surface remains unchanged; the AC's filter is composed with the user's WHERE clause at resolution time (in coframe-core's resolution pass), not at backend-protocol level.
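The four data-API additions could be expressed as a `typing.Protocol`, sketched below. The method names and parameters follow the list above; everything else is an assumption: the `ACMeta` fields, the `Any` return type for `apply_stability_filter` (the text does not state one), the tuple encoding of "bool + diff", and the toy `InMemoryBackend`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, List, Optional, Protocol, Tuple, runtime_checkable


@dataclass
class ACMeta:
    # Field names are assumptions; the document only names the type.
    name: str
    verification_level: str
    scope_summary: str


@runtime_checkable
class BackendDataAPI(Protocol):
    def enumerate_acs(self) -> List[ACMeta]: ...

    def apply_stability_filter(self, table: str, hold_off_days: int) -> Any: ...

    def check_update_timestamp_column(self, table: str) -> Optional[str]: ...

    def data_changed_in_scope(
        self, ac_scope: str, since_timestamp: datetime
    ) -> Tuple[bool, List[dict]]: ...


class InMemoryBackend:
    """Toy backend used only to show the protocol being satisfied."""

    def __init__(self, timestamp_columns: Dict[str, str]):
        self._timestamp_columns = timestamp_columns

    def enumerate_acs(self) -> List[ACMeta]:
        return [ACMeta("q4_2024", "AAA", "sale_date in Q4 2024")]

    def apply_stability_filter(self, table: str, hold_off_days: int) -> Any:
        return f"{table}: rows written in the last {hold_off_days} days held back"

    def check_update_timestamp_column(self, table: str) -> Optional[str]:
        return self._timestamp_columns.get(table)

    def data_changed_in_scope(self, ac_scope, since_timestamp):
        return (False, [])


backend = InMemoryBackend({"transactions": "updated_at"})
```

`runtime_checkable` only verifies that the methods exist, not their signatures, so a structural check like `isinstance(backend, BackendDataAPI)` is a coarse conformance test at best.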
8.3 coframe-author (v2.0 §7)¶
Additions:
- Installation-binding step in the session bootstrap. The workbench session first binds to an installation, then to an AC within it (or starts a new AC).
- Scope-setting phase (§5.3) as a distinct UI mode. The workbench UI has a "Scope" mode (active until commit) and a "Working" mode (active after commit); modal transitions visible to the user.
- AC-level filter picker in the scope-setting UI. Uses L2's dimensional-value enumerations to render checkbox lists (low-card dims), range pickers (dates / numerics), search boxes (medium-card dims). No SQL typed.
- Filter-impact preview operation. Before commit, shows scoped row count, distribution shift on key dimensions, list of FDs / coherence properties that may now hold-or-fail differently within the scope.
- `data_changes_in_scope` operation in the operation catalog. Surfaces L1 changes within the AC's frozen scope since last attestation.
- Filter chip + scope summary in the AC overview pane, always visible. Read-only after commit; click "fork" to start a new AC at a different scope.
8.4 coframe-mcp (v2.0 §8)¶
Changes:
- MCP wraps an installation, not a bound AC (v2.1 §2.2). Clients select an AC per session or per query.
- New `list_acs` capability. Returns ACs in the installation with their verification levels and scope summaries.
- New `select_ac(name)` capability. Sets the session's active AC; subsequent `execute_query` / `nl_query` runs against this AC.
- `coherence_posture` propagation now includes scope info — clients see "this answer is scoped to `region IN ('East')`" in result metadata, so downstream consumers can reason about the answer's universe of applicability.
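The session flow described above (enumerate, select, then query with scope-aware metadata) might look roughly like this. It does not use any MCP SDK: `InstallationSession`, `HostedAC`, and the result dictionary layout are hypothetical stand-ins, and `execute_query` returns a placeholder instead of running anything.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HostedAC:
    name: str
    verification_level: str  # A / AA / AAA
    scope_summary: str


@dataclass
class InstallationSession:
    acs: Dict[str, HostedAC]
    active: Optional[str] = None

    def list_acs(self) -> List[dict]:
        return [
            {
                "name": ac.name,
                "verification_level": ac.verification_level,
                "scope": ac.scope_summary,
            }
            for ac in self.acs.values()
        ]

    def select_ac(self, name: str) -> None:
        if name not in self.acs:
            raise KeyError(f"unknown AC: {name}")
        self.active = name

    def execute_query(self, query: str) -> dict:
        if self.active is None:
            raise RuntimeError("select_ac must be called before execute_query")
        ac = self.acs[self.active]
        # Placeholder result: no query engine here, only the metadata shape.
        return {
            "rows": [],
            "coherence_posture": {"ac": ac.name, "scope": ac.scope_summary},
        }


session = InstallationSession(
    {
        "east": HostedAC("east", "AAA", "region IN ('East')"),
        "west": HostedAC("west", "AA", "region IN ('West')"),
    }
)
session.select_ac("east")
result = session.execute_query("total sales by month")
```

Attaching the scope summary to every result keeps the answer's universe of applicability visible to downstream consumers without a second lookup.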
8.5 coframe-sqlite / coframe-polars / coframe-duckdb (v2.0 §§4-6)¶
Each backend implements the additions to the data-API protocol (§8.2):
- `coframe-sqlite` uses `sqlite_master` for `enumerate_acs` indirectly (an installation's `.coframe/` directories enumerate the ACs; the backend just provides the data underneath). Stability filter applies `WHERE updated_at < datetime('now', '-N days')` to all reads. Update-timestamp detection inspects `sqlite_master` for columns named `updated_at`, `ingested_at`, etc. (configurable column-name patterns).
- `coframe-polars` applies the stability filter as a `LazyFrame.filter(pl.col('updated_at') < ...)` step injected before any data read. Update-timestamp detection inspects the Parquet schema.
- `coframe-duckdb` applies the stability filter via a `WHERE` clause in every query. Update-timestamp detection inspects `information_schema.columns`.
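For the SQLite case, the `WHERE updated_at < datetime('now', '-N days')` filter can be tried directly with the standard library. This is a sketch only: the helper `stable_rows`, the two-column table, and the assumption that `updated_at` is stored as SQLite's `YYYY-MM-DD HH:MM:SS` text are all choices made for the example.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    """Quote a table or column name for safe interpolation into SQL."""
    return '"' + name.replace('"', '""') + '"'


def stable_rows(conn, table, hold_off_days, timestamp_column="updated_at"):
    """Read only the ids of rows written before the hold-off window began."""
    # Identifiers cannot be bound as parameters, so they are quoted instead;
    # the day offset is bound as a SQLite date modifier such as '-7 days'.
    sql = "SELECT id FROM {t} WHERE {c} < datetime('now', ?) ORDER BY id".format(
        t=quote_identifier(table), c=quote_identifier(timestamp_column)
    )
    modifier = f"-{int(hold_off_days)} days"
    return [row[0] for row in conn.execute(sql, (modifier,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, updated_at TEXT)")
conn.execute("INSERT INTO transactions VALUES (1, datetime('now', '-10 days'))")
conn.execute("INSERT INTO transactions VALUES (2, datetime('now', '-1 days'))")

visible = stable_rows(conn, "transactions", hold_off_days=7)
```

With a 7-day hold-off only row 1 is visible; row 2, written a day ago, stays hidden until its own hold-off elapses.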
8.6 Build phasing impact (v2.0 §13)¶
The v2.1 additions slot into existing phases, mostly:
- Phase 1 (coframe-core foundations): adds `InstallationConfig`, L2 types, `scope_phase` enum, AC `filter` field, integrity-catalog `scope_sensitivity` field. ~1 week added.
- Phase 2 (coframe-connect + coframe-sqlite): adds the data-API extensions (`enumerate_acs`, `apply_stability_filter`, `check_update_timestamp_column`, `data_changed_in_scope`) and their sqlite implementations. ~1 week added.
- Phase 5 (coframe-author + UI): adds the scope-setting phase UI, filter picker, filter-impact preview, `data_changes_in_scope` operation. ~2 weeks added (the UI work is the biggest part).
- Phase 6 / Phase 7 (polars + duckdb): each adds their data-API extensions. ~1 week added each.
- Phase 9 (coframe-mcp): adds the `list_acs` / `select_ac` capabilities + scope-aware `coherence_posture`. ~3-4 days added.
- Phase 10 (hardening, docs): new `warehouse_hygiene_contract.md` doc, multi-AC tutorials, scope-setting walkthroughs. ~1 week added.
New total estimate: ~38-47 weeks (vs. v2.0's 32-40). The increment is real but justified by what v2.1 enables — without it, the platform can't honestly support multiple ACs over shared data, which was always the design intent.
9. Open questions for v2.2+¶
Settled in v2.1 (initial ship):
- Multi-AC at installation level: yes, formalized.
- L1/L2/L3 metadata layering: yes, with refresh/invalidation contract.
- AC-level filter: yes, fourth orthogonal customization control.
- AC-level filter expressiveness: dimensional-value subsetting only in v1.0.
- Frozen scope phase: yes, gated transition from scope-setting to working phase.
- L2 stability filter: yes, parametric per-table hold-off; warehouse-hygiene contract documented.
- Incremental update insulation: named as a first-class architectural property (scope stability — fourth of the four).
- "Analytic Layer" as the category name: yes, alongside Semantic Layer.
Newly settled in the 2026-05-23 amendments (§10):
- AC primitive name: Analytic Collection (AC) — renamed from Analytics Collection for parallelism with Analytic Layer. Same acronym; copy-edit pass at v1.0 publication time.
- AC Surfaces as the umbrella term for the AC's access protocols (Frame-QL / NL Query / MCP / HTTP API / Workbench / Validation). Each surface is a separately-documented conformance contract.
- Frontend architecture: three UIs (Workbench / AC Management / Query) hosted in one React app; four packages backing them (`coframe-author` / `coframe-management` / `coframe-runtime` / `coframe-frontend`). Three-tier UI distinction (coframe.tech / Platform front-end / Pro cloud) made explicit.
- Build phasing: vertical-slice-first. Phases re-ordered to ship front-end + SQLite back-end together as the alpha milestone (~16-18 weeks in), with Frame-QL / polars / duckdb / MCP / dialogue landing after. See §10.4 for the full phasing.
Deferred to v2.2 or later:
- Cross-AC consistency invariants — when two ACs over the same installation should agree on overlap measures, and how the framework surfaces drift. Likely a Pro feature; needs careful design.
- "As-of" snapshot pinning for ACs — pin an AC to a specific L1 version, serve from that version forever (with versioned storage at the warehouse). Reproducibility-grade for regulated environments. Pro.
- Multi-user concurrent workbench sessions on one installation — currently single-user per session; collaborative authoring is a Pro feature.
- AC-level filter beyond dimensional-value subsetting — filters over measures, derived expressions, subqueries. Pro.
- Workbench-driven Frame-QL preview during authoring — "given the AC you've authored so far, here are queries you can run." Out of scope for v1.0; the runtime is the place for queries.
- The split between `coframe-runtime` and `coframe-core`'s execution-helper modules — what's the clean boundary? Settle in Phase 5 / 6 of the new phasing.
- Per-surface RBAC — the three UIs and the access surfaces need a permissions model. Out of scope for v1.0 (single-user, single-installation assumption); Pro feature.
10. Amendments (2026-05-23)¶
Four refinements landed after the initial v2.1 ship. Each is a small, self-contained restructuring of one part of the v2.0 + v2.1 design.
10.1 AC primitive renamed: Analytic Collection¶
The platform's primitive — formerly Analytics Collection (AC) — is now Analytic Collection (AC). The acronym is unchanged; the modifier shifts from the attributive noun Analytics to the adjective Analytic.
Rationale: parallelism with the Analytic Layer category name (Amendment 0 / §1.1). Writing both terms in the same sentence — "Coframe is the Analytic Layer; the Analytic Collection is its primitive" — reads as coherent vocabulary. The earlier "Analytic Layer / Analytics Collection" pairing flipped registers between the two terms; the rename eliminates that.
Secondary upside: the Analytic adjective form is slightly more formal than Analytics (which carries faint marketing-tech vibes — "analytics platform," "analytics stack"). Coframe's positioning is rigor over rhetoric; the more formal modifier matches the verification-rigor stance.
Implications for other documents: the Manual, the position article, the demo script, the platform_design v2.0, the integrity catalog YAML, and any code docstrings should adopt the rename in a single copy-edit pass at v1.0 publication time. ~15 minutes of mechanical find/replace; no semantic changes. The acronym AC is unchanged everywhere.
10.2 AC Surfaces — the access-protocols umbrella¶
An AC is accessed through multiple protocols. v2.1's initial framing treated these implicitly; the amendment names them as a first-class concept: AC Surfaces — the umbrella term for the AC's access protocols.
An AC has multiple surfaces. Each is a separate conformance contract — what operations it supports, what its semantics guarantee, what authentication it requires. A given AC deployment can offer some surfaces and not others.
| Surface | What it exposes | Audience |
|---|---|---|
| Frame-QL Surface | the formal query language | engineers, BI tools, automated systems |
| NL Query Surface | natural-language → Frame-QL via `coframe.dialogue` | analysts, agents, conversational interfaces |
| MCP Surface | Model Context Protocol capabilities for AI agents | Claude Desktop, agent frameworks, IDE integrations |
| HTTP API Surface | REST-like endpoints | external services, dashboards, embedded SDKs |
| Workbench Surface | the `coframe-author` UI / programmatic API | AC authors (the human-or-agent who builds the AC) |
| Validation Surface | the `validate_ac` operation: integrity status, verification level | auditors, compliance pipelines, CI/CD |
The naming aligns with industry usage of surface: "API surface area," "the public surface of a library," "attack surface." The metaphor is interface-through-which-you-interact-with-X.
Why this naming is good:
- Frees the AC name for what it is — the thing (the bundled analytical commitments), not the interface (the access protocol).
- Composable. An AC has many surfaces; each is independently versioned, documented, and tested.
- Maps to real engineering concerns. Surface-level permissions, surface-level rate limiting, surface-level deprecation policies are all natural in this framing.
Implications for v2.1 documents:
- §2.2 ("ACs as customization surfaces") needs a clarifying note that customization surface (one role of the AC) is distinct from access surface (the protocols defined in §10.2). The amendment text resolves the ambiguity.
- §8 (Component implications) gains a mapping: each surface is realized by specific packages — Frame-QL Surface lives in `coframe-core` + `coframe-runtime`; MCP Surface lives in `coframe-mcp`; Workbench Surface lives in `coframe-author` + `coframe-frontend`; etc.
10.3 Three-UI / four-package frontend architecture¶
v2.1's initial §8.3 spoke of "coframe-author + web UI" as a unit. The amendment splits this into three UIs and four packages.
The three UI surfaces (all sharing one front-end app):
| UI surface | Audience | Mode |
|---|---|---|
| Workbench UI | AC authors (data engineers, analytics engineers) | Interactive AC authoring (operations catalog from v2.0 §7). Where ACs are built. |
| AC Management UI | Installation admin / data platform team | Governance and lifecycle. List ACs with verification levels + scopes + owners; create / fork / archive; trigger refresh; manage installation config; view installation health (L2 status, stability filter, backend connectivity). Where ACs are managed. |
| Query UI | AC consumers (analysts, business users, BI tool integrators) | Hosts Frame-QL Surface and/or NL Query Surface (configurable per-installation which are exposed). Pick AC → see available dimensions/measures → submit query → view results with coherence_posture annotations. Query history, saved queries, export. Where ACs are used. |
The three roles often overlap in small organizations (one person plays all three) but separate cleanly in larger ones. Per-surface permission gates make this work: an analyst gets Query UI access but doesn't see Workbench or Management.
Architectural implementation: one React app (per v2.0's UI tech decision), one web server, three top-level sections corresponding to the three UI surfaces. Shared design system, shared auth, shared header/nav chrome, common state primitives. Mode switching is a top-level navigation; identity, active installation, and (where relevant) active AC are persistent across modes.
The four-package restructure (replaces v2.1 §8.3's single coframe-author package):
| Package | What it owns | Replaces |
|---|---|---|
| `coframe-author` | Workbench operation catalog + session model. Python, no UI. | narrowed from v2.0/v2.1's "workbench + UI" |
| `coframe-management` (new) | Installation-level operations: list / create / fork / archive ACs; manage `installation.yaml`; trigger refresh; bind-time diagnostics. Python, no UI. | new |
| `coframe-runtime` (new) | Frame-QL + NL query execution path for the Query UI. HTTP/WS-exposed for the front-end. | new (was implicit in v2.0's `coframe-core` + `coframe-mcp` boundary) |
| `coframe-frontend` (new) | React app + Python web server hosting all three UI surfaces. Built static assets shipped with the Python package. Depends on the three logic packages above. | new (was bundled in `coframe-author`) |
The three logic packages stay backend-agnostic. The frontend package bundles them into a unified front-end. Backends (coframe-sqlite, coframe-polars, coframe-duckdb) remain execution + data-API implementations as in v2.0 §§4-6 + v2.1 §8.5.
Three-tier UI distinction (worth being explicit about):
| Layer | What it is | Where it runs | Maintained by |
|---|---|---|---|
| coframe.tech | marketing site, docs, downloads, community | Vercel (current) | Coframe team |
| Coframe Platform front-end | the three-surface UI shipped with the platform — Workbench / AC Management / Query | client's installation, on their infrastructure | Coframe team ships it; client deploys it |
| Coframe Pro cloud (future) | SaaS-hosted multi-tenant version with collaboration, etc. | Coframe's infra | Coframe team |
These are three distinct products serving different purposes. The v1.0 documentation should be explicit about which is which to prevent positioning confusion.
10.4 Build re-phasing: vertical-slice-first¶
The v2.0 §13 / v2.1 §8.6 phasing was bottom-up (core → parser → resolution → backends → UI). The amendment re-orders to vertical-slice-first: build the front-end and the SQLite back-end together, end-to-end, before committing heavily to the rest of coframe-core, coframe-polars, and coframe-duckdb.
Why vertical-slice-first is the right move for Coframe:
| What it buys | What it costs |
|---|---|
| UX feedback comes early. The workbench is the most novel thing in the platform; building it first means discovering its issues before everything downstream is locked in. | Some Phase-1 work needs to be sized smaller — "minimum coframe-core" rather than "complete coframe-core" up front. |
| Architectural pressure from below. Implementing the data-API protocol against SQLite first means it gets refined by real use rather than speculation. | Minor refactor risk when polars/duckdb land later — but they'll be designing against a real, used protocol, not a speculative one. |
| Visible progress. Stakeholders can see Coframe authoring in action months earlier. A working workbench is worth a thousand design docs. | No end-to-end "author → query → result" until Frame-QL + resolution land. Early demo is "author → verify → export the AC artifact." |
| Risk surfacing. If the workbench UX doesn't work, you find out at the end of the first major phase, not at the end of the whole project. | Requires more iterative work early; less time spent on the "purist" architecture for its own sake. |
The new phasing (replaces v2.0 §13 + v2.1 §8.6):
| Phase | What ships | Duration |
|---|---|---|
| 0 — Skeleton | Monorepo, package skeletons (`coframe-core`, `coframe-connect`, `coframe-sqlite`, `coframe-author`, `coframe-management`, `coframe-runtime`, `coframe-frontend`), CI, `CLAUDE.md` | 1 week |
| 1 — Minimum coframe-core | Types only (AC dataclass, ColumnSpec, FD-DAG, integrity catalog, quasi-metadata types, verification-level computation, attestation/config). No `ql/`, no `resolution/`, no `dialogue/`. Just enough to represent an AC in memory. | 2 weeks |
| 2 — coframe-connect (data-API only) + coframe-sqlite | `coframe-connect` Backend protocol (data-API surface only — defer execution surface). `coframe-sqlite`: data-API implementation + CSV/SQL loaders + stability filter. End of phase: can load retail demo into SQLite and call all data-API operations against it. | 2 weeks |
| 3 — Workbench + Management backends | `coframe-author` (operation catalog, session state, serialization) + `coframe-management` (installation-level operations). Python only, no UI. End of phase: can author the retail AC programmatically (CI test). | 3 weeks |
| 4 — Frontend (the ALPHA milestone) | `coframe-frontend`: React app, all three UI surfaces. Workbench fully functional; AC Management mostly functional; Query is shell-only (no execution path yet — Frame-QL not built). End of phase: alpha-able product. An installation admin can install, an AC author can author, the workbench is real, integrity verifications run end-to-end against SQLite. | 8-10 weeks |
| — Alpha release — | First external users can try the Workbench flow against their own SQLite data. ~16-18 weeks in. | — |
| 5 — Frame-QL parser + resolution | `coframe.ql` + `coframe.resolution`. End of phase: can resolve queries against an AC. | 5-6 weeks |
| 6 — coframe-runtime + Query UI integration | Execution path against bound backend + Query UI lights up. End of phase: full author → query → result workflow against SQLite. | 2 weeks |
| 7 — coframe-polars | Execution + data-API for Polars. | 3-4 weeks |
| 8 — coframe-duckdb | Execution + data-API for DuckDB. | 3-4 weeks |
| 9 — coframe.dialogue | NL → Frame-QL. | 1-2 weeks |
| 10 — coframe-mcp | MCP server. | 2-3 weeks |
| 11 — Hardening, docs, v1.0 release | Per v2.0 §13 Phase 10. | 3-4 weeks |
Net timing: roughly the same total (~38-44 weeks for v1.0), but the alpha milestone arrives at ~16-18 weeks instead of ~32+. That's the real win — half a year earlier in the calendar for external feedback to start flowing.
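As a quick arithmetic check, the Duration column of the phasing table can be summed directly. The sketch assumes the phases run strictly one after another with no overlap and no unlisted slack, which the table itself does not state.

```python
# (phase label, min weeks, max weeks), copied from the Duration column above.
PHASES = [
    ("0 Skeleton", 1, 1),
    ("1 Minimum coframe-core", 2, 2),
    ("2 coframe-connect + coframe-sqlite", 2, 2),
    ("3 Workbench + Management backends", 3, 3),
    ("4 Frontend (alpha)", 8, 10),
    ("5 Frame-QL parser + resolution", 5, 6),
    ("6 coframe-runtime + Query UI", 2, 2),
    ("7 coframe-polars", 3, 4),
    ("8 coframe-duckdb", 3, 4),
    ("9 coframe.dialogue", 1, 2),
    ("10 coframe-mcp", 2, 3),
    ("11 Hardening, docs, v1.0", 3, 4),
]
ALPHA_PHASE_COUNT = 5  # phases 0-4 precede the alpha release


def total_weeks(phases):
    """Sum the low and high ends of each phase's duration range."""
    return sum(p[1] for p in phases), sum(p[2] for p in phases)


alpha = total_weeks(PHASES[:ALPHA_PHASE_COUNT])
full = total_weeks(PHASES)
print(f"alpha: {alpha[0]}-{alpha[1]} weeks, v1.0: {full[0]}-{full[1]} weeks")
# prints "alpha: 16-18 weeks, v1.0: 35-43 weeks"
```

The alpha subtotal reproduces the ~16-18 weeks quoted. Summed sequentially, the twelve rows come to 35-43 weeks, somewhat under the ~38-44 quoted for v1.0; the table does not itemize where the difference comes from.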
One discipline this re-ordering surfaces:
Phase 1's "minimum coframe-core" needs to be scoped deliberately — not everything in v2.0 §2.1 is needed for Workbench-against-SQLite. The cut:
- Needed in Phase 1: AC dataclass + ColumnSpec, dimension/metric family declarations, FD-DAG data structure (in-memory), integrity catalog (YAML loader + 27 conditions), quasi-metadata types, verification-level computation, `attestation/config.py`.
- Deferred: `coframe.ql` (Phase 5), `coframe.resolution` (Phase 5), `coframe.dialogue` (Phase 9), per-DNA-edge value attestation runner (stub in Phase 4; full implementation in Phase 6), `attestation/plan.py` (minimum version sufficient for alpha).
Implications for §8.6 (Build phasing impact): the new phasing in this §10.4 supersedes the v2.1 §8.6 phasing. Where §8.6 estimated v2.1's deltas as "+6-7 weeks added to the v2.0 phasing," §10.4 is the actual phasing for v1.0 — not deltas, but a fresh plan that incorporates v2.0 + v2.1 + the four amendments.
What this supplement is not¶
This is not a re-specification of the framework. The Manual specifies what the framework does. This supplement specifies how a few architectural commitments — multi-AC, metadata layering, scope discipline, stability filter — get reflected in the engineering design.
This is not a comprehensive position revision. The position article (drafts/coframe_position_v2_0.md) should adopt the "Analytic Layer" framing and the four-properties coda as copy edits; that work is separate from this supplement.
This is not the final v2 of the platform design. v2.0 + v2.1 together are the current design state; a full integrated v3.0 (or v2.2 if smaller) can come later when v1.0 implementation work surfaces what still needs sharpening.