Lab · pedagogical demos

The signal is collective

Reproducible demos behind the article: how a tiny shared edge, invisible at the strategy level, becomes a confident signal at the portfolio level.

The mathematics

Consider N strategies whose per-period excess returns share a common signal ε and a one-factor noise structure:

$$x_{i,t} = \varepsilon + \sqrt{\rho}\, F_t + \sqrt{1-\rho}\, Z_{i,t}, \qquad F_t,\ Z_{i,t} \overset{\text{iid}}{\sim} N(0,1).$$

Each x_i,t has mean ε, variance 1, and pairwise correlation ρ. The Sharpe ratio of any single strategy is mean / std = ε; over T periods, the t-statistic is

$$t_{\text{strategy}} = \sqrt{T}\,\varepsilon.$$

Now form the equal-weight portfolio

$$\bar{x}_t = \frac{1}{N}\sum_{i=1}^{N} x_{i,t} = \varepsilon + \sqrt{\rho}\, F_t + \sqrt{1-\rho}\, \bar{Z}_t,$$

where $\bar{Z}_t = \frac{1}{N}\sum_{i} Z_{i,t}$ has variance 1/N. The portfolio mean is still ε. Its variance, however, has shrunk:

$$\operatorname{Var}(\bar{x}_t) = \rho + \frac{1-\rho}{N} = \frac{1 + (N-1)\rho}{N}.$$
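The moments above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `simulate_returns` and the sample size are illustrative choices, not part of the reference implementation. It draws one long sample from the one-factor model and compares the sample variance, pairwise correlation, and portfolio variance against the closed-form values.

```python
import numpy as np

def simulate_returns(N, T, eps, rho, rng):
    """Draw a T x N matrix of returns from the one-factor toy model (hypothetical helper)."""
    F = rng.standard_normal((T, 1))   # common factor, shared by every strategy
    Z = rng.standard_normal((T, N))   # idiosyncratic noise, one column per strategy
    return eps + np.sqrt(rho) * F + np.sqrt(1.0 - rho) * Z

rng = np.random.default_rng(0)
N, T, eps, rho = 50, 100_000, 0.04, 0.05   # T is large only to tighten the estimates
x = simulate_returns(N, T, eps, rho, rng)

portfolio = x.mean(axis=1)                 # equal-weight portfolio return per period
var_theory = (1.0 + (N - 1) * rho) / N     # closed-form portfolio variance

print(f"single-strategy variance: {x[:, 0].var(ddof=1):.4f} (theory 1)")
print(f"pairwise correlation:     {np.corrcoef(x[:, 0], x[:, 1])[0, 1]:.4f} (theory {rho})")
print(f"portfolio variance:       {portfolio.var(ddof=1):.4f} (theory {var_theory:.4f})")
```

The common factor F_t is the part that averaging cannot remove, which is why the portfolio variance approaches ρ rather than zero as N grows.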

Hence the portfolio t-statistic over T periods is

$$t_{\text{portfolio}} = \frac{\sqrt{T}\,\varepsilon}{\sqrt{(1 + (N-1)\rho)/N}} = \sqrt{T}\,\varepsilon\,\sqrt{\frac{N}{1 + (N-1)\rho}}.$$

The diversification factor √(N / (1 + (N−1)ρ)) is what does the work. As ρ → 0 it grows like √N; as ρ → 1 it saturates at 1. For modest ρ = 0.05 and N = 50 the factor is √(50 / 3.45) ≈ 3.8, so the portfolio t-stat is nearly four times the per-strategy t-stat.
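To get a feel for how the factor behaves, it can be tabulated over a small grid. This is a sketch only; the function name `diversification_factor` and the grid values are chosen here for illustration.

```python
import math

def diversification_factor(N: int, rho: float) -> float:
    """sqrt(N / (1 + (N - 1) * rho)): ratio of portfolio to single-strategy t-stat in the toy model."""
    return math.sqrt(N / (1.0 + (N - 1) * rho))

rhos = (0.0, 0.05, 0.2, 1.0)
print("N".rjust(6), *(f"rho={r:<5}" for r in rhos))
for N in (1, 10, 50, 200, 1000):
    row = [f"{diversification_factor(N, r):9.2f}" for r in rhos]
    print(str(N).rjust(6), *row)
```

For any fixed ρ > 0 the factor is bounded: as N → ∞ it tends to 1/√ρ (about 4.47 for ρ = 0.05), so adding strategies beyond a certain point buys little unless they also lower the average correlation.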

Why this is the article’s central claim, mathematically

The strategy and the portfolio are computed from the same data. They sample the same noise. The signal ε is identical in both. What changes between them is purely the variance of the noise: the signal-to-noise ratio scales by the diversification factor. This is the toy demonstration of the empirical claim in Edge is in the Process [1]: at strategy level, ε is undetectable; at portfolio level, the same ε is a statistically reliable edge.

Worked example

Take N = 50, T = 252, ε = 0.04, ρ = 0.05. Plug into the formulas:

E[t_strategy] = √252 · 0.04 ≈ 0.635. The empirical histogram of strategy t-stats peaks near this value with standard deviation ≈ 1; the fraction of positive t-stats is ~74%, and the fraction above 2 is roughly 9%.
E[t_portfolio] = 0.635 · √(50 / (1 + 49·0.05)) = 0.635 · √(50 / 3.45) ≈ 2.42. The empirical fraction of positive t-stats is > 99%; the fraction above 2 is around 65%.
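These figures can be recomputed with the standard library alone. The sketch below assumes each t-statistic is approximately normal with unit standard deviation around its expected value, which is an approximation rather than an exact result; the variable and function names are illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

N, T, eps, rho = 50, 252, 0.04, 0.05

t_strategy = math.sqrt(T) * eps
t_portfolio = t_strategy * math.sqrt(N / (1.0 + (N - 1) * rho))

summary = {}
for label, mu in (("strategy", t_strategy), ("portfolio", t_portfolio)):
    # Assumption: t-hat ~ Normal(mu, 1), so P(t-hat > c) = 1 - Phi(c - mu).
    summary[label] = {
        "E[t]": mu,
        "P(t>0)": 1.0 - normal_cdf(0.0 - mu),
        "P(t>2)": 1.0 - normal_cdf(2.0 - mu),
    }

for label, row in summary.items():
    print(f"{label:9s} E[t]={row['E[t]']:.3f}  P(t>0)={row['P(t>0)']:.1%}  P(t>2)={row['P(t>2)']:.1%}")
```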

The interactive demo below confirms these numbers in real time. Sweep ε to zero and watch both distributions collapse onto the standard normal: the diversification advantage vanishes when there is no shared signal to extract.

Demo: The signal is collective

Per-period model: x_i = ε + √ρ · F + √(1−ρ) · Z_i. Same ε goes into every strategy; portfolio averages dilute idiosyncratic noise.

Parameters: N strategies = 50 · T periods = 252 · ε (signal) = 0.040 · ρ (cross-corr) = 0.05 · trials = 2000 · seed = 9

Readouts (strategy / portfolio):
E[t]: 0.63 / 2.42
mean t̂: 0.66 / 2.43
P(t > 0): 75% / 99%
P(t > 2): 8.9% / 66.6%

Gray: distribution of t̂ across 2000 draws of a single random strategy. Amber: distribution of t̂ across 2000 draws of an N=50 equal-weight portfolio of strategies, same data-generating process. In expectation, the mean of the amber distribution equals the mean of the gray multiplied by the diversification factor √(N / (1 + (N−1)ρ)); the sampled means match this up to Monte Carlo error.
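A Monte Carlo of this kind can be sketched in a few lines of NumPy. The code below is an illustrative stand-in, not the demo's or the repository's implementation: the names `t_stat` and `run_trials` are hypothetical, and because the random number generator differs, the output should land near the readouts above without matching them digit for digit.

```python
import numpy as np

def t_stat(x):
    """One-sample t-statistic of a 1-D return series against a zero mean."""
    return np.sqrt(len(x)) * x.mean() / x.std(ddof=1)

def run_trials(N=50, T=252, eps=0.04, rho=0.05, trials=2000, seed=9):
    """Return arrays of single-strategy and portfolio t-stats, one pair per trial."""
    rng = np.random.default_rng(seed)
    strat_t = np.empty(trials)
    port_t = np.empty(trials)
    for k in range(trials):
        F = rng.standard_normal((T, 1))          # common factor
        Z = rng.standard_normal((T, N))          # idiosyncratic noise
        x = eps + np.sqrt(rho) * F + np.sqrt(1.0 - rho) * Z
        strat_t[k] = t_stat(x[:, 0])             # strategies are exchangeable, so take the first
        port_t[k] = t_stat(x.mean(axis=1))       # equal-weight portfolio of all N
    return strat_t, port_t

strat_t, port_t = run_trials()
for label, t in (("strategy", strat_t), ("portfolio", port_t)):
    print(f"{label:9s} mean t = {t.mean():.2f}  "
          f"P(t>0) = {(t > 0).mean():.1%}  P(t>2) = {(t > 2).mean():.1%}")
```

Setting `eps=0.0` in this sketch is a quick way to see both distributions center on zero.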

Figures

Fig. 1: The toy model made empirical on real data, with a deliberate pre-selection step: the 39 BTC strategies shown are the top quartile of the pool by mean per-day P&L, scaled by per-strategy std. This is not a naïve average over the full pool; production work uses a much stronger proprietary filter. The shape is the point: individual strategies (light grey spaghetti) are noisy and many spend years underwater; the equal-weight population mean (amber) compounds the small shared positive drift into a clean upward trajectory.

Fig. 2: Empirical portfolio t-statistic as N grows from 1 to 39, averaged over 80 random sub-samples per N (band: 10-90% across draws). The dashed teal curve is the theoretical scaling t₁ · √(N / (1 + (N−1)ρ̄)) with the empirical mean pairwise correlation ρ̄ ≈ 0.08. Single-strategy t ≈ 0.47, a strategy you would dismiss as noise, lifts to t ≈ 1.4 at N = 39, closely tracking the diversification-factor prediction (≈ 1.46).
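The sub-sampling procedure behind a curve like Fig. 2 can be sketched as follows. The real strategy returns are not reproduced here, so the example runs on synthetic data from the toy model with parameters picked to resemble the figure (39 strategies, ρ = 0.08, and a per-period ε that gives t₁ near 0.47 in expectation); those parameters, the sample length, and all function names are assumptions made for illustration.

```python
import numpy as np

def t_stat(x):
    """One-sample t-statistic of a 1-D return series against a zero mean."""
    return np.sqrt(len(x)) * x.mean() / x.std(ddof=1)

def mean_pairwise_corr(returns):
    """Average off-diagonal entry of the strategy correlation matrix."""
    c = np.corrcoef(returns, rowvar=False)
    M = c.shape[0]
    return (c.sum() - np.trace(c)) / (M * (M - 1))

def subsample_curve(returns, draws=80, seed=0):
    """For each n, the mean and 10-90% band of the t-stat over random n-strategy portfolios."""
    rng = np.random.default_rng(seed)
    T, M = returns.shape
    curve = {}
    for n in range(1, M + 1):
        ts = np.array([
            t_stat(returns[:, rng.choice(M, size=n, replace=False)].mean(axis=1))
            for _ in range(draws)
        ])
        curve[n] = (ts.mean(), np.percentile(ts, 10), np.percentile(ts, 90))
    return curve

# Synthetic stand-in for the strategy pool (NOT the data behind Fig. 2).
rng = np.random.default_rng(1)
T, M, eps, rho = 1500, 39, 0.012, 0.08
returns = (eps + np.sqrt(rho) * rng.standard_normal((T, 1))
           + np.sqrt(1.0 - rho) * rng.standard_normal((T, M)))

curve = subsample_curve(returns)
rho_bar = mean_pairwise_corr(returns)
t1 = np.mean([t_stat(returns[:, i]) for i in range(M)])   # average single-strategy t

def theory(n):
    """Predicted portfolio t-stat: t1 scaled by the diversification factor at rho_bar."""
    return t1 * np.sqrt(n / (1.0 + (n - 1) * rho_bar))

for n in (1, 5, 10, 20, 39):
    mean_t, lo, hi = curve[n]
    print(f"N={n:2d}  empirical {mean_t:5.2f} [{lo:5.2f}, {hi:5.2f}]  theory {theory(n):5.2f}")
```

On a single realized data set the level of the curve depends on the sampled noise, so t₁ will generally differ from 0.47; the comparison of interest is between the empirical and theoretical columns.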

Why this matters for systematic strategies

The toy model is an oversimplification (real strategy returns are not jointly Gaussian with constant ρ), but the qualitative claim is robust. As long as a strategy population has any shared, mean-positive signal that is not perfectly correlated across strategies, equal-weighting (or any sensible weighting) amplifies it. The corollary is that strategy selection by individual t-statistics throws away the amplification: you keep only strategies that are individually significant, but the strategies that carry the most signal are typically not individually significant; they are individually shaped to extract a small piece of the shared edge.

This is why the firm’s production pipeline never selects strategies on individual statistical significance. It selects on filter-conditioned population statistics: Daru Finance’s proprietary regime-confidence diagnostic, the regime-conditional Sharpe, and the FDR-controlled subset under M/02. The signal lives in the collective.

Reproducibility

DaruFinance / signal-is-collective

Python · open source reference implementation

Minimal invocation

import numpy as np
from signal_is_collective import simulate

results = simulate(N=50, T=252, eps=0.04, rho=0.05, trials=2000, seed=9)
print(f"Mean strategy t = {results.strategy_t.mean():.2f}")
print(f"Mean portfolio t = {results.portfolio_t.mean():.2f}")
# Reproduces the article's headline numbers exactly.

References

[1] Gatto, D. V. (2026). Edge is in the Process. Daru Finance Research Notes.
[2] Sharpe, W. F. (1994). The Sharpe ratio. Journal of Portfolio Management 21(1), 49–58.
[3] Fama, E. F. (1976). Foundations of Finance. Basic Books, New York.
