Struct · Surface · Access · Drift: A Practical Math for Low-Entropy Repos

Published

October 1, 2025

TL;DR. Minimize repository interface entropy so humans and coding agents see one small, explicit, deterministic surface.
Model: \(H = H_{\text{struct}} + H_{\text{surface}} + H_{\text{access}} + H_{\text{drift}}\).
Gates (pass/fail):
• Struct: cycles \(=0\), depth \(\le d_{\max}\), no public shadows.
• Surface: no wildcard exports; public registry matches runtime.
• Access: entrypoints (CLI/ABC) cover \(\ge \rho(\gamma)\) of public symbols.
• Drift: hermetic extractor sees all public modules; same-commit builds are bit-identical.
Policy: an Agent-First Max-Clarity bundle combining a src/ layout, layered DAG rules, explicit exports plus a registry, a dual facade, and hermetic, deterministic, offline builds; add a small pre-flight smoke test and a reproducibility hash.

Motivation

Most repos don’t fail from missing braces; they fail from ambiguous interfaces: sprawling import graphs, accidental exports, no obvious entrypoint, and nondeterministic outputs. That hurts humans and makes coding agents hallucinate APIs. We reduce ambiguity with four independent entropy cuts and simple gates.

The E0 model (one page)

We define a scalar repository interface entropy:

\[ H = H_{\text{struct}} + H_{\text{surface}} + H_{\text{access}} + H_{\text{drift}}. \]

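Acceptance (pass E0) requires all gates to hold:
- Struct: cycles \(=0\), \(d(m) \le d_{\max}\), shadows \(=0\).
- Surface: \(g_{\text{init}}=0\), \(g_{\text{star}} \le \tau_{\text{star}}(\gamma) \to 0\), unlisted public \(=0\).
- Access: \(\text{cov}_{\text{iface}} \ge \rho(\gamma)\).
- Drift: extraction coverage \(g_{\text{cov}}=0\), bit-drift \(f=0\).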

Here \(\gamma\) is maturity; thresholds tighten as \(\gamma\) increases.

Math corner (compact)

Struct:   cycles=0, max_depth<=d_max, shadows=0
Surface:  g_init=0, g_star<=tau_star(gamma)->0, unlisted=0
Access:   cov_iface >= rho(gamma) (ratchet to 1.0)
Drift:    g_cov=0, f=0 (hermetic extraction + bit-stability)
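The source does not give functional forms for \(\rho(\gamma)\) or \(\tau_{\text{star}}(\gamma)\), only their limits (coverage ratchets to 1.0, the wildcard allowance goes to 0). As a purely illustrative sketch, assuming \(\gamma \in [0, 1]\) and a linear schedule, the ratchet could look like this. The names `floor` and `start` and their defaults are hypothetical.

```python
import math


def _clamp(gamma: float) -> float:
    """Keep maturity inside the assumed [0, 1] range."""
    return min(max(gamma, 0.0), 1.0)


def rho(gamma: float, floor: float = 0.5) -> float:
    """Required facade coverage: rises linearly from `floor` to 1.0 (assumed shape)."""
    return floor + (1.0 - floor) * _clamp(gamma)


def tau_star(gamma: float, start: int = 5) -> int:
    """Allowed wildcard exports: falls from `start` to 0 as maturity grows (assumed shape)."""
    return math.ceil(start * (1.0 - _clamp(gamma)))
```

Any monotone schedule with the same endpoints would serve; the point is only that the gate thresholds are functions of maturity rather than constants.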

Extreme examples (cartoon repos)

Surface entropy — HIGH vs LOW

High (everything leaks):

# pkg/__init__.py
from .core import *
from .secrets import *      # accidental export
from .exp.lab import *      # experimental leaks as stable

Symptoms: users/agents import unstable internals; renames silently break them. Fix: explicit named exports + public registry; move internals under _internal/.

Low (curated surface):

# pkg/__init__.py
from .core import public_func, PublicClass
__all__ = ["public_func", "PublicClass"]  # small, explicit surface
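A minimal sketch of how the wildcard ban could be checked without importing the package, using only the standard-library `ast` module. The function names `wildcard_imports` and `declared_all` are illustrative, not part of any existing tool.

```python
import ast


def wildcard_imports(source: str) -> list:
    """Return the modules star-imported by `source`, found by parsing only."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and any(a.name == "*" for a in node.names):
            # Rebuild the relative module path, e.g. level=1, module="core" -> ".core".
            hits.append("." * node.level + (node.module or ""))
    return hits


def declared_all(source: str):
    """Return the literal `__all__` list from `source`, or None if it is absent."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__all__":
                    return list(ast.literal_eval(node.value))
    return None


HIGH = "from .core import *\nfrom .secrets import *\nfrom .exp.lab import *\n"
LOW = (
    "from .core import public_func, PublicClass\n"
    '__all__ = ["public_func", "PublicClass"]\n'
)
```

On the two cartoon `__init__.py` files above, `wildcard_imports(HIGH)` lists three offending modules while `wildcard_imports(LOW)` is empty, and `declared_all(LOW)` recovers the curated surface. A dynamically built `__all__` would make `ast.literal_eval` raise, which is arguably the desired failure for a static-exports policy.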

Drift entropy — HIGH vs LOW

High (intent ≠ tool view; builds differ):

  • Runtime adds exports dynamically (globals()[f"Func_{rand}"]=...) → extractor misses them.
  • Docs/theme inject timestamps; symbol lists unsorted → same commit hashes differ.

Low (intent = extraction; bit-stable):

  • Static exports; hermetic parse (no runtime imports, no network).
  • Determinism knobs pinned (seed, TZ/locale, font/backend/DPI); timestamps disabled; sorted lists.
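One way to make "bit-stable" checkable is to hash a canonical serialization of the extracted symbol list and compare two builds of the same commit. The sketch below assumes the index is a plain list of symbol names; `index_digest` and `bit_drift` are hypothetical helper names.

```python
import hashlib
import json


def index_digest(symbols) -> str:
    """Hash a symbol index in canonical form: deduplicated, sorted, fixed separators."""
    payload = json.dumps(sorted(set(symbols)), separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def bit_drift(build_a: bytes, build_b: bytes) -> int:
    """Return f: 0 if two same-commit build artifacts are byte-identical, else 1."""
    same = hashlib.sha256(build_a).digest() == hashlib.sha256(build_b).digest()
    return 0 if same else 1
```

Sorting before hashing removes enumeration order as a source of drift, so `index_digest(["b", "a"])` and `index_digest(["a", "b"])` agree. An embedded timestamp would still change the artifact bytes and flip `bit_drift` to 1, which is why timestamps have to be disabled at the source.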

Struct entropy — HIGH vs LOW

High (layer cycle + shadowed names):

ui -> service -> data -> ui  # cycle
pkg.math.ops.add and pkg.utils.math.add  # two public paths

Low (layered DAG + one canonical path):

adapter -> service -> domain   # acyclic flow
__all__ only exposes canonical names
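The two Struct checks illustrated above can be sketched with the standard library, assuming the import graph is already available as a dict from each layer to the layers it depends on, and that each public dotted path has been resolved to the definition it points at. Both input shapes, and the names `has_cycle` and `shadowed`, are assumptions made for illustration.

```python
from collections import defaultdict
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def has_cycle(deps: dict) -> bool:
    """True if the dependency graph cannot be topologically ordered."""
    try:
        tuple(TopologicalSorter(deps).static_order())
    except CycleError:
        return True
    return False


def shadowed(public_paths: dict) -> dict:
    """Map each definition to its public paths when it has more than one."""
    by_definition = defaultdict(list)
    for path, definition in public_paths.items():
        by_definition[definition].append(path)
    return {d: sorted(p) for d, p in by_definition.items() if len(p) > 1}


HIGH_GRAPH = {"ui": {"service"}, "service": {"data"}, "data": {"ui"}}
LOW_GRAPH = {"adapter": {"service"}, "service": {"domain"}, "domain": set()}
PATHS = {
    "pkg.math.ops.add": "pkg.math.ops:add",
    "pkg.utils.math.add": "pkg.math.ops:add",
}
```

Here `has_cycle(HIGH_GRAPH)` is true and `has_cycle(LOW_GRAPH)` is false, while `shadowed(PATHS)` reports the single definition that is reachable through two public paths.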

Agent-First Max-Clarity (policy you can apply today)

  • Struct: src/ + explicit packages; layered DAG rules; no public shadows.
  • Surface: machine-readable public registry + Charter (Developers 2018); explicit exports (ban wildcards).
  • Access: dual facade (CLI + thin ABC/Factory) covering \(\ge \rho(\gamma)\) of public symbols.
  • Drift: hermetic, parse-only extractor (Tweag 2022); network-off builds with mirrors (Community 2023); pinned determinism knobs; a pre-flight builder import smoke test; a reproducibility hash (Maven 2024).
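The Surface and Access items reduce to set arithmetic once three sets are known: the symbols in the registry, the symbols actually exported, and the symbols reachable through the facade. A minimal sketch, assuming those sets have already been extracted; the function names and sample symbols are hypothetical.

```python
def unlisted(registry: set, exported: set) -> set:
    """Symbols that are exported at runtime but missing from the public registry."""
    return exported - registry


def facade_coverage(public: set, facade: set) -> float:
    """Fraction of public symbols reachable through the CLI/ABC facade."""
    if not public:
        return 1.0  # an empty surface is trivially covered
    return len(public & facade) / len(public)


REGISTRY = {"load", "save", "validate", "export", "diff"}
EXPORTED = REGISTRY | {"_debug_dump"}
FACADE = {"load", "save", "validate", "export"}
```

With these sample sets, one symbol is unlisted (so the Surface gate would fail) and facade coverage is 4 of 5 public symbols, which is then compared against \(\rho(\gamma)\).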

A compact E0 diagram

flowchart TD
  subgraph E0[ E0 - Graph and Contract Normalization ]
    S[Struct gate: cycles=0, depth<=d_max, no shadows]:::gate
    U[Surface gate: registry==runtime; no wildcard; g_init=0]:::gate
    A[Access gate: CLI/ABC coverage >= rho]:::gate
    D[Drift gate: g_cov=0; f=0 under hermetic+deterministic build]:::gate
  end
  IN[Repo + Charter/Registry]:::file --> S --> U --> A --> D --> OUT[Low-entropy, agent-ready facade]:::rec

  classDef gate fill:#f1f3f5,stroke:#495057,color:#212529
  classDef file fill:#fff4e6,stroke:#d9480f,color:#7f0000
  classDef rec  fill:#eaffea,stroke:#2b8a3e,color:#0b4d1a

CI and a badge (turn the math into a dial)

Emit a JSON report on each PR:

{
  "struct": {"cycles":0,"max_depth":3,"shadows":0},
  "surface":{"g_init":0,"g_star":0,"unlisted":0},
  "access":{"coverage":0.86,"target":0.80},
  "drift":{"g_cov":0,"f":0}
}

Gate on failures; publish a Shields badge: Repo Entropy A (Struct A • Surface A • Access B • Drift A). Reproducibility levers: strip timestamps, sort enumerations, pin fonts/DPI and RNG, and enforce offline builds (Community 2023).
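A sketch of the "gate on failures" step, reading a report shaped like the JSON above and listing the gates that fail. The report carries its own Access target but no Struct or Surface thresholds, so `d_max` and `tau_star` are passed in here with assumed defaults. The letter grades on the badge are not defined in this post, so the sketch stops at pass/fail.

```python
import json

REPORT = """
{
  "struct": {"cycles":0,"max_depth":3,"shadows":0},
  "surface":{"g_init":0,"g_star":0,"unlisted":0},
  "access":{"coverage":0.86,"target":0.80},
  "drift":{"g_cov":0,"f":0}
}
"""


def failing_gates(report: dict, d_max: int = 4, tau_star: int = 0) -> list:
    """Return the names of the gates that fail, in Struct/Surface/Access/Drift order."""
    struct, surface = report["struct"], report["surface"]
    access, drift = report["access"], report["drift"]
    checks = {
        "struct": struct["cycles"] == 0
        and struct["max_depth"] <= d_max
        and struct["shadows"] == 0,
        "surface": surface["g_init"] == 0
        and surface["g_star"] <= tau_star
        and surface["unlisted"] == 0,
        "access": access["coverage"] >= access["target"],
        "drift": drift["g_cov"] == 0 and drift["f"] == 0,
    }
    return [name for name, ok in checks.items() if not ok]


def exit_code(report: dict) -> int:
    """Process exit status for CI: non-zero when any gate fails."""
    return 1 if failing_gates(report) else 0
```

In a CI job, the script would load the emitted file with `json.load`, print the failing gate names, and exit with `exit_code(report)` so the PR check turns red on any failure.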

Appendix — “smell → metric → fix”

| Smell | Metric trips | Gate | Typical fix |
|---|---|---|---|
| Users import secret.py | unlisted \(>0\) | Surface | Move to _internal/, stop re-export |
| Barrels everywhere | \(g_{\text{star}}>0\) | Surface | Explicit named exports only |
| UI↔︎Data cycle | cycles \(>0\) | Struct | Introduce interface; invert dependency |
| Two public paths to same symbol | shadows \(>0\) | Struct | Pick canonical path; keep alias private |
| Same commit, different index | \(f=1\) | Drift | Disable timestamps; sort lists; pin fonts/seed |
| Public module missing in index | \(g_{\text{cov}}>0\) | Drift | Static exports; add markers; align registry |

References

Authors, Pact. 2023. “Pact — Consumer Driven Contracts.” 2023. https://docs.pact.io/.
Build, Pants. 2022. “Why Dependency Inference.” 2022. https://www.pantsbuild.org/blog/2022/10/27/why-dependency-inference.
———. 2023. “Tweag Case Study: Simplifying CI Triggers with Pants.” 2023. https://www.pantsbuild.org/blog/2023/01/03/tweag-case-study.
Community, SLSA. 2023. “SLSA Specification V0.1 — Requirements.” 2023. https://slsa.dev/spec/v0.1/requirements.
Developers, NumPy. 2018. “NEP 23 — Backwards Compatibility and Deprecation Policy.” 2018. https://numpy.org/neps/nep-0023-backwards-compatibility.html.
Engineering, Shopify. 2023. “A Test Budget for Time-Constrained CI Feedback.” 2023. https://shopify.engineering/test-budget-time-constrained-ci-feedback.
Maven, Apache. 2024. “Guide to Reproducible Builds.” 2024. https://maven.apache.org/guides/mini/guide-reproducible-builds.html.
Preston-Werner, Tom. 2010. “Readme-Driven Development.” 2010. https://tom.preston-werner.com/2010/08/23/readme-driven-development.
Project, Reproducible Builds. 2024. “Reproducible Builds Definition.” 2024. https://reproducible-builds.org/docs/definition/.
Team, Bazel. 2024. “Hermeticity.” 2024. https://bazel.build/basics/hermeticity.
Tweag. 2022. “Keeping Bazel Builds Hermetic.” 2022. https://tweag.io/blog/2022-09-15-hermetic-bazel/.