Methodology & data.

A long-form companion to the formal definitions in the paper, including selection criteria, edge weighting, cycle enumeration, and known limitations.

§ 1 Scope

The methodology is restricted to publicly disclosed deals among firms whose primary revenue is derived from AI model development, hyperscale cloud infrastructure, or semiconductor manufacturing. Private equity stakes below 5% are excluded unless they accompany a service or supply commitment material to either party.

The corpus is constructed from SEC filings (10-K, 10-Q, 8-K), audited financial statements, official press releases, and contemporaneous reporting in the financial press. Where deal value is undisclosed, magnitude is estimated using the proxy method described in § 4.

§ 2 Sample selection

The 28-firm sample was constructed using a three-step procedure:

(i) Stratum definition. Firms were grouped into four strata: AI model labs, hyperscale cloud providers, semiconductor manufacturers, and capital providers (sovereign and private).

(ii) Inclusion threshold. Within each stratum, firms with at least one disclosed deal of $100M+ in the 2022–2025 window were included.

(iii) Adjacency closure. Firms that fell below threshold but appeared on at least three deal records with already-included firms were added to preserve cycle visibility.
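Steps (ii) and (iii) can be sketched in Python as follows. This is a minimal illustration, not the released code: the `Deal` record, the `select_sample` function, and the constant names are hypothetical. It assumes deals are already filtered to the 2022–2025 window, that both parties to a $100M+ deal count as having such a deal, and that the adjacency closure is applied in a single pass. Stratum assignment from step (i) is omitted.

```python
from collections import Counter
from typing import NamedTuple


class Deal(NamedTuple):
    """Hypothetical deal record; field names are illustrative."""
    source: str       # firm providing the flow
    target: str       # firm receiving the flow
    value_usd: float  # disclosed deal value in USD


THRESHOLD_USD = 100e6    # step (ii): at least one disclosed deal of $100M+
MIN_ADJACENT_DEALS = 3   # step (iii): at least three records with included firms


def select_sample(deals: list[Deal]) -> set[str]:
    # Step (ii): every firm that is party to a deal at or above the threshold.
    included = {
        firm
        for d in deals
        if d.value_usd >= THRESHOLD_USD
        for firm in (d.source, d.target)
    }
    # Step (iii): for each firm below threshold, count its deal records
    # with firms already included in step (ii).
    adjacency = Counter()
    for d in deals:
        if d.source in included and d.target not in included:
            adjacency[d.target] += 1
        elif d.target in included and d.source not in included:
            adjacency[d.source] += 1
    closure = {firm for firm, n in adjacency.items() if n >= MIN_ADJACENT_DEALS}
    return included | closure
```

Under these assumptions, a firm with only small deals is added once it appears on three or more records with above-threshold firms, which is what keeps cycles passing through it visible.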

§ 3 Deal taxonomy

Each edge is classified into one of four flow types:

money: Direct capital transfers, debt instruments, prepayments.

compute: Sale or supply of physical hardware (GPUs, accelerators, networking).

service: Multi-year cloud, API, or compute-as-a-service commitments.

equity: Common, preferred, or convertible equity stakes.
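One way to encode this taxonomy is as an enumeration attached to each edge record. The sketch below is illustrative only: `FlowType` and `DealEdge` are hypothetical names, and the field layout is an assumption rather than the schema of the released CSV.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FlowType(Enum):
    """The four flow types of the deal taxonomy."""
    MONEY = "money"      # direct capital transfers, debt, prepayments
    COMPUTE = "compute"  # sale or supply of physical hardware
    SERVICE = "service"  # multi-year cloud, API, or compute-as-a-service
    EQUITY = "equity"    # common, preferred, or convertible stakes


@dataclass(frozen=True)
class DealEdge:
    """Hypothetical edge record: one directed, typed flow between two firms."""
    source: str
    target: str
    flow: FlowType
    value_usd: Optional[float] = None  # None when the value is undisclosed


# Parsing a label validates it: an unknown flow type raises ValueError.
edge = DealEdge("FirmA", "FirmB", FlowType("service"), 2.5e9)
```

Using an enumeration means a mistyped label fails at load time instead of silently creating a fifth category.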

§ 4 Edge weighting

Each edge carries a normalized weight wₑ ∈ (0, 1]. The weighting function combines disclosed value with deal recency and confidence:

wₑ = α · normValue(e) + β · recency(e) + γ · confidence(e) (eq. 4)

with α + β + γ = 1, defaults α = 0.6, β = 0.25, γ = 0.15. Sensitivity analyses with alternate weightings are reported in Appendix B of the paper.
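As a worked illustration of eq. 4, the sketch below combines three components with the default coefficients. The function name `edge_weight` is hypothetical, and the sketch assumes that normValue, recency, and confidence have each already been scaled to [0, 1]; how those components are computed is not shown here.

```python
def edge_weight(norm_value: float, recency: float, confidence: float,
                alpha: float = 0.6, beta: float = 0.25, gamma: float = 0.15) -> float:
    """Convex combination of eq. 4; inputs are assumed to lie in [0, 1]."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    for name, x in (("norm_value", norm_value),
                    ("recency", recency),
                    ("confidence", confidence)):
        if not 0.0 <= x <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return alpha * norm_value + beta * recency + gamma * confidence


# Example: 0.6 * 0.5 + 0.25 * 0.8 + 0.15 * 1.0 = 0.30 + 0.20 + 0.15 = 0.65
w = edge_weight(norm_value=0.5, recency=0.8, confidence=1.0)
```

Because the coefficients sum to 1, components in [0, 1] give a weight in [0, 1]; the open lower bound of (0, 1] additionally requires at least one component to be strictly positive.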

§ 5 Cycle enumeration

Simple directed cycles are enumerated using Johnson's algorithm, which runs in O((|V| + |E|)(c + 1)) time, where c is the number of elementary cycles. Although c can grow exponentially with graph size in general, for the present sample the enumeration is comfortably tractable.

# pseudocode
MAX_LEN = 6                      # longest cycle retained
G = build_multigraph(deals)
for cycle in johnson(G):         # cycle: sequence of edges
    if 3 <= len(cycle) <= MAX_LEN:
        score = product(weight[e] for e in cycle)  # C(s), eq. 2
        record(cycle, score)

Cycles longer than length 6 are dropped to keep interpretation tractable; in the current sample no cycle of length 7+ would survive the inclusion threshold anyway.
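For readers who want something executable, the following self-contained sketch enumerates length-bounded simple cycles on a small multigraph. It deliberately swaps in a plain depth-first search for Johnson's algorithm, which is simpler to read but lacks Johnson's complexity guarantee. The function name `enumerate_cycles` and the `(source, target, weight)` edge format are assumptions, and parallel edges are treated as producing distinct cycles.

```python
from collections import defaultdict
from math import prod

MAX_LEN = 6  # cycles with more edges than this are dropped


def enumerate_cycles(edges, min_len=3, max_len=MAX_LEN):
    """Return [(edge_indices, score)] for simple directed cycles.

    edges: list of (source, target, weight) tuples; parallel edges allowed.
    Each cycle is found once, rooted at its smallest node label.
    """
    out = defaultdict(list)  # node -> [(edge index, target node)]
    for i, (u, v, _w) in enumerate(edges):
        out[u].append((i, v))
    nodes = sorted({n for u, v, _w in edges for n in (u, v)})
    results = []

    def dfs(start, node, path, visited):
        for i, nxt in out[node]:
            if nxt == start:
                # Closing edge: record the cycle if it is long enough.
                if len(path) + 1 >= min_len:
                    cycle = path + [i]
                    score = prod(edges[j][2] for j in cycle)
                    results.append((tuple(cycle), score))
            elif nxt > start and nxt not in visited and len(path) + 1 < max_len:
                # Only visit nodes larger than the root, so each cycle
                # is reported exactly once.
                visited.add(nxt)
                dfs(start, nxt, path + [i], visited)
                visited.remove(nxt)

    for s in nodes:
        dfs(s, s, [], {s})
    return results


edges = [("A", "B", 0.5), ("B", "C", 0.5), ("C", "A", 0.5),
         ("A", "B", 1.0), ("B", "A", 0.9)]
cycles = enumerate_cycles(edges)
```

In this toy input the two parallel A→B edges yield two distinct three-edge cycles, while the A→B→A pair is skipped because it has fewer than three edges; two-firm reciprocity is what the Loop metric covers.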

§ 6 Metric definitions

The three metrics — Loop, Cycle, Hub — are defined in § 2 of the paper. This section restates them with the implementation conventions used in the released code.

L(a, b) = (Σ wᵢ : a→b) · (Σ wⱼ : b→a) (eq. 1)
C(s) = ∏ wₑ : e ∈ s,  |s| ≥ 3 (eq. 2)
H(v) = Σₛ 1{v ∈ s} · score(s) (eq. 3)
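The three definitions translate fairly directly into code. The sketch below is one possible reading, not the released implementation: function names and the `(source, target, weight)` edge format are hypothetical, and it assumes that score(s) in eq. 3 is the cycle score C(s) of eq. 2.

```python
from collections import defaultdict
from math import prod


def loop_score(edges, a, b):
    """Eq. 1: (sum of weights a->b) times (sum of weights b->a)."""
    forward = sum(w for u, v, w in edges if u == a and v == b)
    backward = sum(w for u, v, w in edges if u == b and v == a)
    return forward * backward


def cycle_score(cycle_edges):
    """Eq. 2: product of edge weights over a cycle with at least 3 edges."""
    if len(cycle_edges) < 3:
        raise ValueError("cycle must contain at least 3 edges")
    return prod(w for _u, _v, w in cycle_edges)


def hub_scores(cycles):
    """Eq. 3: for each node, sum the scores of the cycles containing it."""
    hub = defaultdict(float)
    for cyc in cycles:
        score = cycle_score(cyc)  # assumption: score(s) = C(s)
        for v in {u for u, _v, _w in cyc}:  # each node counted once per cycle
            hub[v] += score
    return dict(hub)


c1 = [("A", "B", 0.5), ("B", "C", 0.5), ("C", "A", 0.5)]  # C = 0.125
c2 = [("A", "B", 1.0), ("B", "D", 0.5), ("D", "A", 0.5)]  # C = 0.25
hubs = hub_scores([c1, c2])
```

Here firms A and B sit on both cycles and accumulate 0.375 each, while C and D receive only the score of their own cycle.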

§ 7 Limitations

Disclosure bias. Private contracts and undisclosed terms may systematically understate cycle density.

Stratum boundaries. Firms that span strata (e.g., NVIDIA's growing software footprint) introduce ambiguity in role assignment.

Temporal alignment. Deals that close years apart are treated as edges in a single graph; the methodology does not currently model time-varying weights.

§ 8 Reproducibility

Source data (CSV) and analysis code (Python) are released alongside the paper. The released artifact reproduces every figure and table in the paper from raw inputs in a single command.
