flowchart TD
subgraph E0[ E0 - Graph and Contract Normalization ]
S[Struct gate: cycles=0, depth<=d_max, no shadows]:::gate
U[Surface gate: registry==runtime; no wildcard; g_init=0]:::gate
A[Access gate: CLI/ABC coverage >= rho]:::gate
D[Drift gate: g_cov=0; f=0 under hermetic+deterministic build]:::gate
end
IN[Repo + Charter/Registry]:::file --> S --> U --> A --> D --> OUT[Low-entropy, agent-ready facade]:::rec
classDef gate fill:#f1f3f5,stroke:#495057,color:#212529
classDef file fill:#fff4e6,stroke:#d9480f,color:#7f0000
classDef rec fill:#eaffea,stroke:#2b8a3e,color:#0b4d1a
Struct · Surface · Access · Drift: A Practical Math for Low-Entropy Repos
TL;DR. Minimize repository interface entropy so humans and coding agents see one small, explicit, deterministic surface.
Model: \(H_{\text{E0}} = H_{\text{struct}} + H_{\text{surface}} + H_{\text{access}} + H_{\text{drift}}\).
Gates (pass/fail):
• Struct: cycles \(= 0\), depth \(\le d_{\max}\), no public shadows.
• Surface: no wildcard exports; public registry matches runtime.
• Access: entrypoints (CLI/ABC) cover \(\ge \rho(\gamma)\) of public symbols.
• Drift: hermetic extractor sees all public modules; same-commit builds are bit-identical.
Policy: an Agent-First Max-Clarity bundle combining a `src/` layout, layered DAG rules, explicit exports + registry, a dual facade, and hermetic + deterministic + offline builds; add a small pre-flight smoke test and a reproducibility hash.
Motivation
Most repos don’t fail from missing braces; they fail from ambiguous interfaces: sprawling import graphs, accidental exports, no obvious entrypoint, and nondeterministic outputs. That hurts humans and makes coding agents hallucinate APIs. We reduce ambiguity with four independent entropy cuts and simple gates.
The E0 model (one page)
We define a scalar repository interface entropy:
\[ H_{\text{E0}} = H_{\text{struct}} + H_{\text{surface}} + H_{\text{access}} + H_{\text{drift}}. \]
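Acceptance (pass E0) requires all gates to hold:
- Struct: cycles \(= 0\), \(d(m) \le d_{\max}\), shadows \(= 0\).
- Surface: \(g_{\text{init}} = 0\), \(g_{\text{star}} \le \tau_{\text{star}}(\gamma) \to 0\), unlisted public \(= 0\).
- Access: \(\text{cov}_{\text{iface}} \ge \rho(\gamma)\).
- Drift: extraction coverage \(g_{\text{cov}} = 0\), bit-drift \(f = 0\).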
Where \(\gamma\) is maturity; thresholds tighten as \(\gamma\) increases.
Math corner (compact)
Struct: cycles=0, max_depth<=d_max, shadows=0
Surface: g_init=0, g_star<=tau_star(gamma)->0, unlisted=0
Access: cov_iface >= rho(gamma) (ratchet to 1.0)
Drift: g_cov=0, f=0 (hermetic extraction + bit-stability)
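The four gates translate almost line for line into code. The sketch below is a minimal illustration, not a reference implementation: the `Metrics` container, the default `d_max`, and the concrete shapes of `tau_star` and `rho` (a linear schedule in maturity `gamma`) are assumptions chosen only to show thresholds tightening as `gamma` approaches 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    cycles: int       # import cycles in the module graph
    max_depth: int    # deepest public module path
    shadows: int      # symbols reachable via more than one public path
    g_init: int       # surface metric that must be exactly 0
    g_star: int       # surface metric bounded by tau_star(gamma)
    unlisted: int     # public symbols missing from the registry
    cov_iface: float  # share of public symbols covered by CLI/ABC
    g_cov: int        # extraction coverage gap
    f: int            # bit-drift flag: 1 if same-commit builds differ


def tau_star(gamma: float) -> int:
    """Hypothetical budget: 5 at gamma=0, shrinking to 0 at gamma=1."""
    return round(5 * (1.0 - gamma))


def rho(gamma: float) -> float:
    """Hypothetical coverage target: 0.5 at gamma=0, ratcheting to 1.0."""
    return 0.5 + 0.5 * gamma


def e0_gates(m: Metrics, gamma: float, d_max: int = 4) -> dict[str, bool]:
    """Evaluate each gate; E0 passes only if every value is True."""
    return {
        "struct": m.cycles == 0 and m.max_depth <= d_max and m.shadows == 0,
        "surface": m.g_init == 0 and m.g_star <= tau_star(gamma) and m.unlisted == 0,
        "access": m.cov_iface >= rho(gamma),
        "drift": m.g_cov == 0 and m.f == 0,
    }
```

With these assumed schedules, a repo at 0.86 interface coverage passes Access at `gamma=0.6` (target 0.8) and fails it once `gamma` reaches 1.0, which is the ratchet in action.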
Extreme examples (cartoon repos)
Surface entropy — HIGH vs LOW
High (everything leaks):
# pkg/__init__.py
from .core import *
from .secrets import * # accidental export
from .exp.lab import * # experimental leaks as stable
Symptoms: users/agents import unstable internals; renames silently break them. Fix: explicit named exports + public registry; move internals under _internal/.
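A parse-only scan is one way to catch wildcard re-exports before they leak. The sketch below uses Python's standard `ast` module to list star-imports in a module's source without importing it; the function name `wildcard_imports` is illustrative and not taken from any existing tool.

```python
import ast


def wildcard_imports(source: str) -> list[str]:
    """Return the modules that `source` star-imports (parse only, no execution)."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and any(a.name == "*" for a in node.names):
            # node.level counts the leading dots of a relative import.
            found.append("." * node.level + (node.module or ""))
    return found
```

Run against the leaky `__init__.py` above, it reports `.core`, `.secrets`, and `.exp.lab`; an empty result is what the Surface gate wants to see.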
Low (curated surface):
# pkg/__init__.py
from .core import public_func, PublicClass
__all__ = ["public_func", "PublicClass"]  # small, explicit surface
Drift entropy — HIGH vs LOW
High (intent ≠ tool view; builds differ):
- Runtime adds exports dynamically (`globals()[f"Func_{rand}"] = ...`) → extractor misses them.
- Docs/theme inject timestamps; symbol lists unsorted → same commit, different hashes.
Low (intent = extraction; bit-stable):
- Static exports; hermetic parse (no runtime imports, no network).
- Determinism knobs pinned (seed, TZ/locale, font/backend/DPI); timestamps disabled; sorted lists.
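To make "hermetic parse" and "bit-stable" concrete, here is a minimal sketch under stated assumptions: exports are declared as a literal `__all__`, and the reproducibility hash is SHA-256 over a canonical JSON serialization. The helper names `static_exports` and `surface_hash` are hypothetical.

```python
import ast
import hashlib
import json


def static_exports(source: str) -> list[str]:
    """Read a literal `__all__` without importing or executing the module."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign) and any(
            isinstance(t, ast.Name) and t.id == "__all__" for t in node.targets
        ):
            # literal_eval raises ValueError on non-literal (dynamic) values,
            # which is exactly the case a hermetic extractor cannot see.
            return sorted(ast.literal_eval(node.value))
    return []


def surface_hash(exports_by_module: dict[str, list[str]]) -> str:
    """Hash a canonical form: sorted keys, sorted lists, no timestamps."""
    canonical = json.dumps(
        {mod: sorted(names) for mod, names in exports_by_module.items()},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because both the keys and the symbol lists are sorted before hashing, two runs on the same commit produce the same digest regardless of filesystem or dict ordering.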
Struct entropy — HIGH vs LOW
High (layer cycle + shadowed names):
ui -> service -> data -> ui # cycle
pkg.math.ops.add and pkg.utils.math.add # two public paths
Low (layered DAG + one canonical path):
adapter -> service -> domain # acyclic flow
__all__ only exposes canonical names
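Both Struct smells can be checked mechanically once the import graph and the public paths are known. The sketch below assumes the graph is already available as an adjacency dict and that each public path has been resolved to some identifier for its definition; how those inputs are produced is out of scope, and `find_cycle` and `shadowed` are illustrative names.

```python
from collections import defaultdict
from typing import Optional


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle (first node repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current DFS path, finished
    color: dict[str, int] = {}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:  # back edge: dep is already on the current path
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for start in sorted(graph):  # sorted for deterministic output
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


def shadowed(public_paths: dict[str, str]) -> dict[str, list[str]]:
    """Group public paths by definition; keep definitions with more than one path."""
    by_def = defaultdict(list)
    for path, definition in public_paths.items():
        by_def[definition].append(path)
    return {d: sorted(p) for d, p in by_def.items() if len(p) > 1}
```

On the cartoon repos above, the `ui -> service -> data -> ui` graph yields a cycle while `adapter -> service -> domain` yields `None`. The recursive walk is fine for a sketch; a very deep graph would call for an iterative version.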
Agent-First Max-Clarity (policy you can apply today)
- Struct: `src/` + explicit packages; layered DAG rules; no public shadows.
- Surface: machine-readable public registry + Charter (Developers 2018); explicit exports (ban wildcard).
- Access: dual facade (CLI + thin ABC/Factory) covering \(\ge \rho(\gamma)\) of public symbols.
- Drift: hermetic extractor (parse-only) (Tweag 2022); network-off with mirrors (Community 2023); determinism knobs; pre-flight builder import smoke; reproducibility hash (Maven 2024).
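The dual facade in the Access item can be quite small. The sketch below is one possible shape, using only the standard library: a thin abstract base class as the programmatic contract, a factory as the single way to get an implementation, and a CLI that goes through the same factory. Every name in it (`Normalizer`, `make_normalizer`, the `pkg` program name) is hypothetical.

```python
import argparse
from abc import ABC, abstractmethod


class Normalizer(ABC):
    """Thin public contract; concrete classes stay private."""

    @abstractmethod
    def normalize(self, text: str) -> str: ...


class _LowerNormalizer(Normalizer):
    def normalize(self, text: str) -> str:
        return text.strip().lower()


class _UpperNormalizer(Normalizer):
    def normalize(self, text: str) -> str:
        return text.strip().upper()


_REGISTRY = {"lower": _LowerNormalizer, "upper": _UpperNormalizer}


def make_normalizer(kind: str) -> Normalizer:
    """Factory: the only public way to obtain an implementation."""
    if kind not in _REGISTRY:
        raise ValueError(f"unknown kind {kind!r}; expected one of {sorted(_REGISTRY)}")
    return _REGISTRY[kind]()


def main(argv=None) -> int:
    """CLI facade: same behavior as the factory, reachable from a shell."""
    parser = argparse.ArgumentParser(prog="pkg")
    parser.add_argument("kind", choices=sorted(_REGISTRY))
    parser.add_argument("text")
    args = parser.parse_args(argv)
    print(make_normalizer(args.kind).normalize(args.text))
    return 0
```

Routing the CLI through the factory keeps the two entrypoints from drifting apart, and it gives an agent exactly two documented ways in instead of a guess at which private class to instantiate.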
A compact E0 diagram
CI and a badge (turn the math into a dial)
Emit a JSON on PRs:
{
"struct": {"cycles":0,"max_depth":3,"shadows":0},
"surface":{"g_init":0,"g_star":0,"unlisted":0},
"access":{"coverage":0.86,"target":0.80},
"drift":{"g_cov":0,"f":0}
}
Gate on failures; publish a Shields badge: Repo Entropy A (Struct A • Surface A • Access B • Drift A). Reproducibility levers: strip timestamps, sort enumerations, pin fonts/DPI and RNG, enforce offline builds (Community 2023).
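A CI step can turn that report into a gate and a badge in a few lines. The sketch below assumes the report shape shown above and emits a payload for a Shields endpoint badge (`schemaVersion`, `label`, `message`, `color`). Since the letter-grade rule is not specified here, it reports plain pass/fail instead; the `d_max` default and the strict `g_star == 0` check are also assumptions.

```python
import json


def failed_gates(report: dict, d_max: int = 4) -> list[str]:
    """Names of failing gates; d_max is an assumed depth limit."""
    s, u, a, d = (report[k] for k in ("struct", "surface", "access", "drift"))
    ok = {
        "struct": s["cycles"] == 0 and s["max_depth"] <= d_max and s["shadows"] == 0,
        "surface": u["g_init"] == 0 and u["g_star"] == 0 and u["unlisted"] == 0,
        "access": a["coverage"] >= a["target"],
        "drift": d["g_cov"] == 0 and d["f"] == 0,
    }
    return [name for name, passed in ok.items() if not passed]


def shields_endpoint(report: dict) -> str:
    """Serialize a Shields endpoint-badge payload with stable key order."""
    failed = failed_gates(report)
    payload = {
        "schemaVersion": 1,
        "label": "repo entropy",
        "message": "pass" if not failed else "fail: " + ", ".join(failed),
        "color": "brightgreen" if not failed else "red",
    }
    return json.dumps(payload, sort_keys=True)
```

In CI, a non-empty `failed_gates(...)` result would map to a non-zero exit code, and the serialized payload would be published wherever the badge URL points.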
Appendix — “smell → metric → fix”
| Smell | Metric trips | Gate | Typical fix |
|---|---|---|---|
| Users import `secret.py` | unlisted \(>0\) | Surface | Move to `_internal/`, stop re-export |
| Barrels everywhere | \(g_{\text{star}} > 0\) | Surface | Explicit named exports only |
| UI↔Data cycle | cycles \(>0\) | Struct | Introduce interface; invert dependency |
| Two public paths to same symbol | shadows \(>0\) | Struct | Pick canonical path; keep alias private |
| Same commit, different index | \(f = 1\) | Drift | Disable timestamps; sort lists; pin fonts/seed |
| Public module missing in index | \(g_{\text{cov}} > 0\) | Drift | Static exports; add markers; align registry |