Space Network — Billing Settlement Gas Analysis

Org↔Operator ZK settlement on Creditcoin L1 · L1 vs L2 sizing · Network Architect · 2026-07-02 · Draft

Basis: measured D5/D6 Groth16 benchmarks (Foundry --via-ir, 75M block, V3 circuit) + SpaceNetwork Billing Whitepaper (Nguyen/Lebée, Jun 2026).

Executive summary. On-chain settlement gas is independent of tree depth D; the only thing D changes on-chain is the maximum hops per flow (D5 = 32, D6 = 64). Cost is dominated by warm replay storage (70–78%), not proof verification. Because the whitepaper makes flow ids sequential and client-scoped, full always-on usage is a deterministic worst case that can be modeled without guesswork. Under that worst case, Creditcoin L1 saturates at roughly 4,000 to 25,000 flows per 5 min depending on hop count, far below the 206,250 flows per 5 min of the 50,000-UTN scenario, and a 300M/2s L2 is the robust target for the current per-hop settlement design.

Metric	Value
Verify gas (flat, any N/D)	236,939
Cold replay write	26,533
Warm replay write	4,637
Share of cost that is warm storage	70–78%

1 · Why full usage removes the guesswork

Earlier analyses carried one soft assumption: will the cheap "warm" storage path actually apply in production? The whitepaper settles it with two facts:

Flow id = org‖client‖seq, and seq is a client-local sequential counter ("the client increments the sequence for each new session"). Not random.
The replay word is keyed (org, client, seq/256) — one storage word holds a client's 256 consecutive flows.

Because seq is sequential, a busy client's flows land in the same word, contiguously, by construction. No scattering, no probability.

Takeaway. At full always-on usage every client's replay word is maximally packed — the cold/warm split is deterministic, not estimated. Full usage is both the steady state (no idle discount) and the worst case (all clients active). Nothing in normal operation costs more.
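The keying described above can be illustrated with a small simulation. This is a minimal sketch, assuming the replay store behaves like a map from (org, client, seq // 256) to a 256-bit word and that a write counts as "cold" when it sets the first bit in a word. The class and method names are hypothetical and are not taken from the settlement contract.

```python
WORD_SPAN = 256  # flows per replay word, per the whitepaper key (org, client, seq/256)


class ReplayBitmap:
    """Toy model of the replay store: one 256-bit word per (org, client, seq // 256)."""

    def __init__(self):
        self.words = {}
        self.cold_writes = 0
        self.warm_writes = 0

    def mark(self, org, client, seq):
        key = (org, client, seq // WORD_SPAN)
        bit = 1 << (seq % WORD_SPAN)
        word = self.words.get(key, 0)
        if word & bit:
            raise ValueError("flow id already settled (replay)")
        if word == 0:
            self.cold_writes += 1   # first bit in this client word
        else:
            self.warm_writes += 1   # every subsequent bit in the same word
        self.words[key] = word | bit


if __name__ == "__main__":
    bitmap = ReplayBitmap()
    for seq in range(300):          # one client, 300 consecutive sessions
        bitmap.mark(org=1, client=42, seq=seq)
    print(bitmap.cold_writes, bitmap.warm_writes)  # 2 298
```

With sequential seq values, 300 flows touch only two words, so two writes are cold and the remaining 298 are warm. A random seq would scatter the same flows across many words and make most writes cold.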

2 · Measured unit costs (from the benchmark files)

All values from committed result files in `zk/proof-artifacts/`. On-chain gas identical D5 vs D6.
Action	Gas	When charged	Source
Groth16 verify (flat)	236,939	once per `submitProof` tx	v3-verify-matrix.csv
Verify amortized / hop	1,185	per operator-hop (÷ N=200)	derived
Cold replay write	26,533	first bit in a client word	v3-coldwarm.csv N200 op0
Warm replay write	4,637	every subsequent bit	v3-coldwarm.csv N200 op1–9

Takeaway. D changes only the circuit (Merkle siblings, off-chain proving). On-chain, verify is flat at 236,939. The cold-vs-warm gap (~5.7×) drives everything.

3 · What D means: the hop ceiling

Whitepaper §Path Merkle tree. Tree depth is fixed per flow; gas scales with hops actually used.
Tree depth D	Leaf slots 2^D	Max hops	Merkle siblings / row
D = 5	32	32	5
D = 6	64	64	6

We evaluate four hop scenarios throughout: mean H=9.3, peak H=18.5 (telemetry), D5 max H=32, D6 max H=64 (design ceilings).
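The depth-to-ceiling relation in the table can be expressed directly. The sketch below assumes one leaf slot per hop, as the table implies, and rounds fractional telemetry means up to whole hops; the function names are illustrative.

```python
import math


def max_hops(depth: int) -> int:
    """Hop ceiling for a path Merkle tree of the given depth (2^D leaf slots)."""
    return 2 ** depth


def min_depth(hops: float) -> int:
    """Smallest depth D with 2^D >= hops; fractional hop counts are rounded up."""
    whole_hops = max(1, math.ceil(hops))
    return (whole_hops - 1).bit_length()


if __name__ == "__main__":
    for label, hops in [("mean", 9.3), ("peak", 18.5), ("D5 max", 32), ("D6 max", 64)]:
        depth = min_depth(hops)
        print(f"{label}: H={hops} fits in D={depth} ({max_hops(depth)} slots)")
```

Under these assumptions the mean fits in D=4, the peak needs D=5, and only the 64-hop ceiling needs D=6.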

4 · Scenario: 50,000 UTN, full usage

Workload inputs.
Input	Value	Source
Clients (UTN)	50,000	scenario
Flows per client / 5min	4.125	telemetry anchor (41,250 ÷ 10,000)
Total flows / 5min	206,250	50,000 × 4.125
Replay word span	256 flows/word	paper key (org,client,seq/256)

Total gas per 5-min, and % of each chain budget

Chains: L1 75M/15s = 1.5B/5min · L2-A 300M/5s = 18B/5min · L2-B 300M/2s = 45B/5min.
Scenario	Gas / 5min	warm %	L1	L2-A (5s)	L2-B (2s)
Mean (H=9.3)	12.30B	70%	820%	68%	27%
Peak (H=18.5)	23.34B	75%	1,556%	130%	52%
D5 max (H=32)	39.55B	77%	2,637%	220%	88%
D6 max (H=64)	77.98B	78%	5,198%	433%	173%

Takeaway. L1 is out at every hop count. L2-A holds only the mean. L2-B holds mean and peak, and barely survives the D5 (32-hop) ceiling — but not the D6 (64-hop) maximum.
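The totals in the table can be reproduced from the measured constants with a simple per-client model. This is a sketch under an assumption the report does not spell out: in each 5-minute window a client's FPC × H replay writes consist of one cold write with the rest warm, the N=200 amortized verify is charged per operator-hop, and FPC = 4 is read as "flows per client" from the constants line. Function names are illustrative.

```python
VERIFY_GAS = 236_939          # flat Groth16 verify per submitProof
BATCH_N = 200                 # batch size used for the amortized verify
COLD, WARM = 26_533, 4_637    # measured replay write costs
FPC = 4                       # assumed meaning: flows per client per window
FLOWS = 206_250               # 50,000 clients x 4.125 flows per 5 min
BUDGETS = {"L1": 1.5e9, "L2-A": 18e9, "L2-B": 45e9}
SCENARIOS = {"mean": 9.3, "peak": 18.5, "D5 max": 32, "D6 max": 64}


def gas_per_flow(hops: float) -> float:
    """Warm write + amortized verify per hop, plus one cold premium per FPC flows."""
    verify_per_hop = VERIFY_GAS / BATCH_N
    return hops * (WARM + verify_per_hop) + (COLD - WARM) / FPC


def warm_share(hops: float) -> float:
    """Fraction of per-flow gas spent on warm replay writes."""
    return (hops - 1 / FPC) * WARM / gas_per_flow(hops)


if __name__ == "__main__":
    for name, hops in SCENARIOS.items():
        total = FLOWS * gas_per_flow(hops)
        usage = ", ".join(f"{chain} {total / cap:.0%}" for chain, cap in BUDGETS.items())
        print(f"{name}: {total / 1e9:.2f}B, warm {warm_share(hops):.0%}, {usage}")
```

Under these assumptions the computed totals and warm shares agree with the table to within rounding, which suggests the table was generated from the same constants.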

5 · How gas scales with flow rate

Cost is linear in flow rate. The table steps in 5,000-flow increments. Each cell shows gas per 5 min in billions, followed by the % of each chain's budget in the order L1, L2-A, L2-B (budgets 1.5B / 18B / 45B). Bands: under 25% · 25–50% · 50–100% (over the EIP-1559 target) · ≥100% (over the block gas limit). Slope: mean 0.060B · peak 0.113B · D5 0.192B · D6 0.378B per 1,000 flows.

Cells: gas per 5 min in billions (B), then % of budget on L1 / L2-A / L2-B.
Flows / 5min	Mean (H=9.3)	Peak (H=18.5)	D5 max (H=32)	D6 max (H=64)
5,000	0.30B (20% / 2% / 1%)	0.57B (38% / 3% / 1%)	0.96B (64% / 5% / 2%)	1.89B (126% / 11% / 4%)
10,000	0.60B (40% / 3% / 1%)	1.13B (75% / 6% / 3%)	1.92B (128% / 11% / 4%)	3.78B (252% / 21% / 8%)
15,000	0.89B (60% / 5% / 2%)	1.70B (113% / 9% / 4%)	2.88B (192% / 16% / 6%)	5.67B (378% / 32% / 13%)
20,000	1.19B (79% / 7% / 3%)	2.26B (151% / 13% / 5%)	3.84B (256% / 21% / 9%)	7.56B (504% / 42% / 17%)
25,000	1.49B (99% / 8% / 3%)	2.83B (189% / 16% / 6%)	4.79B (320% / 27% / 11%)	9.45B (630% / 53% / 21%)
30,000	1.79B (119% / 10% / 4%)	3.40B (226% / 19% / 8%)	5.75B (384% / 32% / 13%)	11.34B (756% / 63% / 25%)
35,000	2.09B (139% / 12% / 5%)	3.96B (264% / 22% / 9%)	6.71B (447% / 37% / 15%)	13.23B (882% / 74% / 29%)
40,000	2.38B (159% / 13% / 5%)	4.53B (302% / 25% / 10%)	7.67B (511% / 43% / 17%)	15.12B (1008% / 84% / 34%)
45,000	2.68B (179% / 15% / 6%)	5.09B (340% / 28% / 11%)	8.63B (575% / 48% / 19%)	17.01B (1134% / 95% / 38%)
50,000	2.98B (199% / 17% / 7%)	5.66B (377% / 31% / 13%)	9.59B (639% / 53% / 21%)	18.90B (1260% / 105% / 42%)
55,000	3.28B (219% / 18% / 7%)	6.22B (415% / 35% / 14%)	10.55B (703% / 59% / 23%)	20.79B (1386% / 116% / 46%)

Figure: settlement gas per 5-min per chain, with each chain's EIP-1559 gas target (amber) and block gas limit (gray) drawn as reference lines.

Figure: % of each chain's own 5-min block budget, one panel per chain (y-scales differ: L1 3000%, L2-A 500%, L2-B 200%). The amber EIP-1559 gas target (50% of the block gas limit) is the real operating ceiling: while blocks run above it, the base fee rises by up to 12.5% per block, so gas price climbs well before the block gas limit is reached. Keep sustained load under the amber line to avoid fee escalation.
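The fee-escalation effect can be sketched with the standard EIP-1559 base fee update rule. This assumes each chain uses the default parameters (gas target at half the block gas limit, maximum change denominator of 8); whether Creditcoin L1 or the candidate L2s are configured this way is an assumption here, not something the benchmark files state.

```python
ELASTICITY = 2               # gas target = block gas limit / 2 (EIP-1559 default)
MAX_CHANGE_DENOMINATOR = 8   # caps the per-block base fee change at 12.5%


def next_base_fee(base_fee: int, gas_used: int, gas_limit: int) -> int:
    """Base fee of the next block given this block's usage (integer math as in EIP-1559)."""
    target = gas_limit // ELASTICITY
    if gas_used == target:
        return base_fee
    if gas_used > target:
        delta = base_fee * (gas_used - target) // target // MAX_CHANGE_DENOMINATOR
        return base_fee + max(delta, 1)
    delta = base_fee * (target - gas_used) // target // MAX_CHANGE_DENOMINATOR
    return base_fee - delta


if __name__ == "__main__":
    gas_limit = 300_000_000
    gas_used = gas_limit * 88 // 100   # sustained 88% load (D5 max on L2-B)
    fee = 1_000_000_000                # arbitrary starting base fee
    for _ in range(10):
        fee = next_base_fee(fee, gas_used, gas_limit)
    print(f"base fee multiplier after 10 blocks: {fee / 1_000_000_000:.2f}x")
```

At 88% of the block limit the base fee grows by about 9.5% per block, so ten consecutive blocks (20 seconds on a 2 s chain) raise it to roughly 2.5 times its starting value under these parameters.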

Crossover flow rates (max flows/5min each chain can absorb)

Where each scenario line hits each chain's 5-min budget.
Scenario	L1 (1.5B)	L2-A (18B)	L2-B (45B)
mean (9.3)	25,161	301,934	754,834
peak (18.5)	13,254	159,045	397,613
D5 = 32	7,822	93,863	234,658
D6 = 64	3,968	47,611	119,028

Takeaway. L1 saturates at ~4k flows/5min at the D6 ceiling — categorically too small. L2-B absorbs ~119k flows even at the 64-hop worst case.
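The crossover rates are the chain budget divided by the per-flow gas. The sketch below assumes per-flow gas is H × (warm write + verify amortized at N=200) plus one cold-write premium per 4 flows; that composition is inferred from the listed constants rather than quoted from the source, and the names are illustrative.

```python
VERIFY_PER_HOP = 236_939 / 200     # flat verify amortized over N=200
COLD, WARM, FPC = 26_533, 4_637, 4
BUDGETS = {"L1": 1.5e9, "L2-A": 18e9, "L2-B": 45e9}
SCENARIOS = {"mean": 9.3, "peak": 18.5, "D5": 32, "D6": 64}


def gas_per_flow(hops: float) -> float:
    """Assumed per-flow settlement gas at a given hop count."""
    return hops * (WARM + VERIFY_PER_HOP) + (COLD - WARM) / FPC


def crossover_flows(budget_gas: float, hops: float) -> float:
    """Flows per 5 min at which settlement gas equals the chain's 5-min budget."""
    return budget_gas / gas_per_flow(hops)


if __name__ == "__main__":
    for name, hops in SCENARIOS.items():
        slope = gas_per_flow(hops) * 1_000 / 1e9
        row = ", ".join(f"{c} {crossover_flows(b, hops):,.0f}" for c, b in BUDGETS.items())
        print(f"{name}: {slope:.3f}B per 1,000 flows; {row}")
```

Because cost is linear in flow rate, the same per-flow figure gives both the slope of each scenario line and the point where it meets each budget.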

6 · Why warm storage dominates

Warm is 70–78% because the current design settles per operator-hop: every operator on every flow submits its own transaction and writes its own replay bit. The multiplier is flows × hops — up to 13.2M operator-hops at the D6 ceiling. Nothing in the current design compresses that count.

7 · Benchmark matrix (N = 50 / 200 / 300 / 400)

Batch size N = flows settled per proof. Larger N amortizes the flat verify over more flows, lowering per-flow cost. Presented at both tree depths: 7.a D=5 (32-hop paths) and 7.b D=6 (64-hop paths).

On-chain gas is identical at D=5 and D=6. Tree depth only changes the circuit (Merkle siblings per row → constraint count and proving time). It does not touch submitProof gas, verify, or storage. So the gas rows in 7.a and 7.b are the same by construction — only constraints and proving time differ between them.

7.a · Depth D=5 (32-hop max paths)

N=50/200/300 **measured** (real Groth16, Foundry `--via-ir`, 75M block); N=400 **projected** — see note.
Metric	N = 50	N = 200	N = 300	N = 400
Status	measured	measured	measured	projected
Circuit constraints (D5)	2,167,422	8,613,660	12,891,566	17,169,472
`submitProof` gas (total)	1,547,070	5,299,053	7,801,543	10,304,033
Per-flow gas (÷N)	30,941	26,495	26,005	25,760
Verify gas / flow (÷N)	4,739	1,185	790	592
Marginal gas / row	n/a	25,013	25,025	25,025
`submitProof` % of 75M block (L1)	2.06%	7.07%	10.40%	13.74%
`submitProof` % of 300M block (L2)	0.52%	1.77%	2.60%	3.43%
Proving time, rapidsnark D5 (this host)	3,461 ms	14,429 ms	15,686 ms	n/a *

7.b · Depth D=6 (64-hop max paths)

N=50/200/300 **measured**; N=400 **projected** — see note. Gas rows identical to 7.a by construction.
Metric	N = 50	N = 200	N = 300	N = 400
Status	measured	measured	measured	projected
Circuit constraints (D6)	2,201,594	8,711,660	13,038,566	17,365,472
`submitProof` gas (total)	1,547,070	5,299,053	7,801,543	10,304,033
Per-flow gas (÷N)	30,941	26,495	26,005	25,760
Verify gas / flow (÷N)	4,739	1,185	790	592
Marginal gas / row	—	25,013	25,025	25,025
`submitProof` % of 75M block (L1)	2.06%	7.07%	10.40%	13.74%
`submitProof` % of 300M block (L2)	0.52%	1.77%	2.60%	3.43%
Proving time, rapidsnark D6 (this host)	3,507 ms	13,644 ms	18,110 ms	n/a *

D6 adds 98,000–147,000 constraints over D5 at N = 200–300 (one extra Merkle level, 490 constraints/row; at N=50 the delta is 34,172, about 683/row). That is a circuit cost of about 1.1% (1.6% at N=50), and zero on-chain gas difference.
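The derived rows in 7.a and 7.b follow from the measured totals by simple arithmetic. The sketch below recomputes them from the three measured batch sizes only; the dictionary layout and function names are illustrative.

```python
VERIFY_GAS = 236_939
SUBMIT_PROOF_GAS = {50: 1_547_070, 200: 5_299_053, 300: 7_801_543}   # measured
CONSTRAINTS = {
    "D5": {50: 2_167_422, 200: 8_613_660, 300: 12_891_566},
    "D6": {50: 2_201_594, 200: 8_711_660, 300: 13_038_566},
}


def per_flow_gas(n: int) -> float:
    """Total submitProof gas divided by batch size."""
    return SUBMIT_PROOF_GAS[n] / n


def marginal_gas_per_row(n_lo: int, n_hi: int) -> float:
    """Gas added by each extra row between two measured batch sizes."""
    return (SUBMIT_PROOF_GAS[n_hi] - SUBMIT_PROOF_GAS[n_lo]) / (n_hi - n_lo)


def level_delta_per_row(n: int) -> float:
    """Extra constraints per row that one more Merkle level (D5 to D6) costs."""
    return (CONSTRAINTS["D6"][n] - CONSTRAINTS["D5"][n]) / n


if __name__ == "__main__":
    for n in SUBMIT_PROOF_GAS:
        print(f"N={n}: per-flow {per_flow_gas(n):,.0f}, verify/flow {VERIFY_GAS / n:,.0f}, "
              f"D6-D5 {level_delta_per_row(n):,.0f} constraints/row")
    print(f"marginal 50->200: {marginal_gas_per_row(50, 200):,.0f}")
    print(f"marginal 200->300: {marginal_gas_per_row(200, 300):,.0f}")
```

The marginal row cost is nearly constant across the two measured intervals, which is what makes the N=400 gas projection a straight-line extension.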

7.c · Per-flow savings between batch steps

Applies to both depths (gas is depth-independent). How much each jump in N reduces per-flow gas:

Amortization of the flat 236,939 verify across the batch.
Step	Per-flow before → after	Gas saved / flow	% cheaper
N 50 → 200	30,941 → 26,495	−4,446	14.4%
N 200 → 300	26,495 → 26,005	−490	1.8%
N 300 → 400	26,005 → 25,760	−245	0.9%

Takeaway: diminishing returns. Per-flow gas has two parts: a fixed ~25,000 marginal row cost (irreducible; it is the storage + calldata per flow) and the verify amortized over N. The first jump (50→200) saves 14.4%, mostly because it cuts verify/flow from 4,739 to 1,185. By N=200 verify is already tiny, so 200→300 saves only 1.8% and 300→400 just 0.9%. The sweet spot is around N=200–300; beyond that you pay more proving time (D6: 13.6s at N=200, 18.1s at N=300, and pot26 required at N=400) for under 1% gas improvement. Batching helps a lot early, then flattens against the ~25k/flow floor.

7.d · Pushing N higher (toward 500): you save a whole batch, but never storage

The bigger batching win is not just amortizing the pairing check — it is avoiding the fixed cost of creating another batch entirely. Every batch is one submitProof transaction, and each transaction carries a fixed overhead independent of how many flows it contains:

Fixed per-batch (per-transaction) overhead — paid once per batch no matter how many flows are in it.
Fixed per-batch component	Gas
EVM base transaction cost	21,000
Groth16 verify (pairing check, flat)	236,939
Dispatch + calldata base overhead	~36,134
Total fixed cost of one batch	~294,073

So per-flow gas is really (fixed_batch_cost ÷ N) + fixed_row. Growing N spreads that ~294,073 "cost of a batch" over more flows — the more you batch, the more you avoid ever creating (and paying for) another batch. But the second term — per-flow storage + calldata — is untouchable.

Fixed-batch overhead spread over N, plus the fixed per-flow storage floor. N=400/500 projected (need pot26 to prove; gas projection reliable).
Metric	N = 200	N = 300	N = 400	N = 500
Fixed batch overhead / flow (÷N) — shrinks with N	1,470	980	735	588
of which: verify / flow	1,185	790	592	474
Storage + calldata floor / flow — never shrinks	~25,025	~25,025	~25,025	~25,025
Per-flow gas (total)	26,495	26,005	25,760	25,613
Marginal saving vs previous step	—	−1.8%	−0.9%	−0.6%

At scale the batch-avoidance is the headline number. To settle 1,000,000 flows: at N=200 that is 5,000 batches × 294,073 = 1.47B gas of pure batch overhead; at N=500 it is only 2,000 batches = 0.59B — ~882M gas saved just by not creating 3,000 extra transactions.

Takeaway — batching saves a whole transaction, not storage. Each extra batch you avoid saves the full ~294,073 fixed transaction cost (base tx + verify + dispatch), spread as ~1,470 → 588 gas/flow from N=200→500. That is the real batching win, and it keeps paying (just with diminishing marginal returns). But the ~25,025 storage + calldata per flow is fixed — every flow must still write its own replay bit and carry its own 24-byte flow id, no matter the batch size. So per-flow gas asymptotes to ~25,025 and never goes below it. Bigger batches eliminate transactions; only a design change (fewer writes per flow) eliminates storage.
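The per-flow formula and the 1,000,000-flow comparison above can be checked numerically. This is a minimal sketch using the fixed-batch components from the table and the approximate ~25,025 row floor; the function names are illustrative.

```python
import math

BASE_TX = 21_000
VERIFY = 236_939
DISPATCH = 36_134                                # approximate, per the table
FIXED_BATCH_COST = BASE_TX + VERIFY + DISPATCH   # ~294,073 per submitProof
ROW_FLOOR = 25_025                               # approximate storage + calldata per flow


def per_flow_gas(n: int) -> float:
    """(fixed_batch_cost / N) + fixed_row."""
    return FIXED_BATCH_COST / n + ROW_FLOOR


def batch_overhead(total_flows: int, n: int) -> int:
    """Gas spent purely on per-batch overhead to settle total_flows at batch size n."""
    return math.ceil(total_flows / n) * FIXED_BATCH_COST


if __name__ == "__main__":
    for n in (200, 300, 400, 500):
        print(f"N={n}: overhead/flow {FIXED_BATCH_COST / n:,.0f}, total/flow {per_flow_gas(n):,.0f}")
    saved = batch_overhead(1_000_000, 200) - batch_overhead(1_000_000, 500)
    print(f"overhead saved for 1M flows, N=200 -> 500: {saved:,}")
```

The overhead term shrinks as 1/N while the row floor stays put, so per-flow gas approaches the floor without ever dropping below it.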

N=500 not proved on this host. N=500 D6 (~21.7M constraints) needs a 2^26 FFT domain; local ptau reaches only 2^25. On-chain gas is projected from the measured flat-verify / constant-per-row model (reliable); proving time is left unquoted (2^25→2^26 jump breaks linear extrapolation).

* N=400 proving not measured on this host. N=400 D6 needs a 2^26 Groth16 FFT domain (~17.4M constraints × 2); local Powers-of-Tau only reaches 2^25 (pot25). The on-chain gas projection is reliable because verify is flat and per-row cost is constant and measured; only the proving time requires pot26 to measure and is left as n/a rather than guessed (the 2^25→2^26 domain jump makes linear proving-time extrapolation unreliable).
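The domain requirement in the two notes above can be checked with a few lines. The sketch assumes the FFT domain must be the smallest power of two at or above twice the constraint count, which is how the "× 2" in the note is read here; the actual requirement of the proving toolchain may differ.

```python
DOMAIN_FACTOR = 2      # taken from the note's "constraints x 2"; an assumption
LOCAL_PTAU_POWER = 25  # local Powers-of-Tau reaches 2^25


def domain_power(constraints: int) -> int:
    """Exponent k of the smallest 2^k that is >= DOMAIN_FACTOR * constraints."""
    return (DOMAIN_FACTOR * constraints - 1).bit_length()


def provable_locally(constraints: int) -> bool:
    """True if the circuit fits under the local Powers-of-Tau file."""
    return domain_power(constraints) <= LOCAL_PTAU_POWER


if __name__ == "__main__":
    d6 = {300: 13_038_566, 400: 17_365_472, 500: 21_692_378}  # N=500 from ~21.7M
    for n, c in d6.items():
        print(f"D6 N={n}: needs 2^{domain_power(c)}, local pot25 ok: {provable_locally(c)}")
```

Under this reading N=300 fits in 2^25 while N=400 and N=500 both fall into the 2^26 domain, matching the notes.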

7.e · Mean is 9.3 hops — D=4 (16 hops) covers the common case

The routing telemetry mean is 9.3 hops per flow. That comfortably fits inside a D=4 tree (2^4 = 16 leaf slots → 16-hop max). Since the whitepaper lets each flow pick the smallest depth that fits its path, the typical flow can settle at D=4, reserving D=5/D=6 only for the longer tail (peak 18.5 needs D≥5; 64-hop worst case needs D=6).

A shallower tree is a cheaper circuit — one fewer Merkle level. From the measured D5→D6 delta, one level costs 490 constraints per settled row (~1.1%), so D=4 saves that much versus D=5:

D=4 vs D=5 circuit size. D=4 = D=5 minus the measured per-level delta (490/row). On-chain gas is unchanged — depth never affects gas. Source: v3-gas-d4-estimate.csv.
N	D=5 constraints	D=4 constraints (est)	Saved
200	8,613,660	8,515,660	98,000 (1.1%)
300	12,891,566	12,744,566	147,000 (1.1%)
400	17,169,472	16,973,472	196,000 (1.1%)

Takeaway. Because fewer constraints per row means more rows fit under the same proving/FFT-domain budget, D=4 lets you pack a higher N into one circuit for the same proving cost — i.e. the mean-case flows can be batched deeper before hitting the domain ceiling. The per-row saving is modest (~1.1%), so the win is incremental, not dramatic; but it is free (zero on-chain gas change) and the right default depth: settle the ~9.3-hop mean traffic at D=4, and only escalate to D=5/D=6 for genuinely long paths. This keeps the average proving cost down and the average batch a little larger.
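To put a number on "a higher N for the same proving budget", the sketch below estimates the largest batch that fits under the local 2^25 domain at each depth. It assumes constraints grow linearly in N (fitted to the measured D5 points at N=200 and N=300), that each Merkle level adds 490 constraints per row, and that the domain must cover twice the constraint count. All three are extrapolations for illustration, not measured limits.

```python
POT25_DOMAIN = 2 ** 25
DOMAIN_FACTOR = 2                 # assumed: domain must cover 2 x constraints
D5_MEASURED = {200: 8_613_660, 300: 12_891_566}
PER_LEVEL_PER_ROW = 490           # measured D5 -> D6 delta per row

# Linear fit through the two measured D5 points.
ROW_D5 = (D5_MEASURED[300] - D5_MEASURED[200]) / 100
BASE = D5_MEASURED[200] - 200 * ROW_D5


def constraints(n: int, depth: int) -> float:
    """Estimated constraint count for batch size n at the given tree depth."""
    return BASE + n * (ROW_D5 + (depth - 5) * PER_LEVEL_PER_ROW)


def max_rows(depth: int) -> int:
    """Largest N whose estimated circuit still fits the 2^25 domain."""
    budget = POT25_DOMAIN / DOMAIN_FACTOR
    per_row = ROW_D5 + (depth - 5) * PER_LEVEL_PER_ROW
    return int((budget - BASE) // per_row)


if __name__ == "__main__":
    for depth in (4, 5, 6):
        print(f"D={depth}: about {max_rows(depth)} rows fit under pot25")
```

Under these assumptions the ceiling moves only from about 390 rows at D=5 to about 395 at D=4, which is consistent with calling the gain incremental.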

8 · Recommendations

Full-usage 50k-UTN is off-L1 by 8–52× — not viable at any tuning.
On the current per-hop design, use L2-B (300M/2s): mean 27%, peak 52%. Only chain that survives peak-hop windows. Note the D5 (32-hop) ceiling still hits ~88% with no margin, and D6 (64-hop) exceeds it.
L2-A (300M/5s) is mean-only (130% at peak) — acceptable only if peak-hop is rare/transient.
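D-selection is a gas decision only through the hop ceiling it permits, since depth itself adds no on-chain gas: short paths (D2–4, ≤16 hops) are the most comfortable; D5 at its 32-hop ceiling is the on-chain viability cliff on L2-B (~88%); D6 at its 64-hop ceiling exceeds every chain modeled (173% on L2-B).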
Quote full usage — the paper's sequential client-scoped seq makes the storage behavior deterministic, so these are a hard ceiling with no guesswork.

Constants (measured): verify 236,939 · verify/hop 1,185 (N=200) · cold 26,533 · warm 4,637 · FPC 4. Sources: v3-verify-matrix.csv, v3-breakdown.csv, v3-coldwarm.csv, v3-gas.csv, v3-rapidsnark-results.json (spacenetwork-billing-research/code-samples/b0001a/zk/proof-artifacts/). Whitepaper: SpaceNetwork Billing & Settlement, Nguyen/Lebée, Jun 2026 — §Flow identity, §Path Merkle tree.