Lab — regime segmentation
Hidden-Markov regime segmentation of crypto microstructure
A Gaussian HMM on (logret, vol, trend) features, with the number of states K selected by BIC — yielding persistent, interpretable regimes for risk and sizing.
The mathematics
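A hidden Markov model factors a length-T observation sequence $x_{1:T}$ through a latent Markov chain $z_{1:T}$ with K states. With initial-state distribution π and K × K transition matrix A, the joint density is
$$p(x_{1:T}, z_{1:T}) = \pi_{z_1} \prod_{t=2}^{T} A_{z_{t-1} z_t} \prod_{t=1}^{T} p(x_t \mid z_t).$$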
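For continuous features we let each emission be Gaussian: $p(x_t \mid z_t = k) = \mathcal{N}(x_t; \mu_k, \Sigma_k)$. The parameters θ = (π, A, {μ_k, Σ_k}) are estimated by Baum–Welch (EM applied to HMMs). The E-step is the forward-backward recursion
$$\alpha_t(k) = p(x_t \mid z_t = k) \sum_{j=1}^{K} \alpha_{t-1}(j)\, A_{jk}, \qquad \beta_t(k) = \sum_{j=1}^{K} A_{kj}\, p(x_{t+1} \mid z_{t+1} = j)\, \beta_{t+1}(j),$$
initialised with $\alpha_1(k) = \pi_k\, p(x_1 \mid z_1 = k)$ and $\beta_T(k) = 1$.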
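The smoothed posterior $\gamma_t(k) = P(z_t = k \mid x_{1:T}) \propto \alpha_t(k)\, \beta_t(k)$ is what we ship to downstream sizing rules. It is a probability, not a hard label, and the entropy of each row $\gamma_t(\cdot)$ is a natural regime-confidence score: low entropy means the model is confident about the current regime.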
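To make the recursion concrete, here is a minimal NumPy sketch of a scaled forward-backward pass for fixed parameters. It is an illustration only, not the reference implementation: the function names (`gaussian_loglik`, `forward_backward`, `row_entropy`) and the toy two-state data are assumptions introduced for this example.

```python
import numpy as np

def gaussian_loglik(X, means, covs):
    """log N(x_t; mu_k, Sigma_k) for every bar t and state k, shape (T, K)."""
    T, d = X.shape
    K = means.shape[0]
    out = np.empty((T, K))
    for k in range(K):
        L = np.linalg.cholesky(covs[k])                # Sigma_k = L L^T
        sol = np.linalg.solve(L, (X - means[k]).T)     # whitened residuals, (d, T)
        maha = np.sum(sol ** 2, axis=0)                # squared Mahalanobis distance
        logdet = 2.0 * np.sum(np.log(np.diag(L)))
        out[:, k] = -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)
    return out

def forward_backward(log_b, pi, A):
    """Scaled forward-backward. Returns smoothed gamma (T, K) and log-likelihood."""
    T, K = log_b.shape
    shift = log_b.max(axis=1, keepdims=True)           # per-bar shift for stability
    b = np.exp(log_b - shift)
    alpha, beta, c = np.empty((T, K)), np.empty((T, K)), np.empty(T)
    a = pi * b[0]
    c[0] = a.sum()
    alpha[0] = a / c[0]
    for t in range(1, T):                              # forward pass
        a = (alpha[t - 1] @ A) * b[t]
        c[t] = a.sum()
        alpha[t] = a / c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):                     # backward pass
        beta[t] = (A @ (b[t + 1] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    loglik = np.sum(np.log(c)) + shift.sum()           # undo the per-bar shift
    return gamma, loglik

def row_entropy(gamma, eps=1e-12):
    """Entropy of each posterior row; low values mean a confident regime call."""
    return -np.sum(gamma * np.log(gamma + eps), axis=1)

# Toy data (hypothetical): a sticky 2-state chain with well-separated 1-D emissions.
rng = np.random.default_rng(0)
pi = np.array([0.5, 0.5])
A = np.array([[0.95, 0.05], [0.05, 0.95]])
means = np.array([[-2.0], [2.0]])
covs = np.stack([np.eye(1), np.eye(1)])
T = 500
z = np.empty(T, dtype=int)
z[0] = rng.choice(2, p=pi)
for t in range(1, T):
    z[t] = rng.choice(2, p=A[z[t - 1]])
X = means[z] + rng.standard_normal((T, 1))

gamma, loglik = forward_backward(gaussian_loglik(X, means, covs), pi, A)
confidence = row_entropy(gamma)
```

The per-bar rescaling by `c[t]` keeps the recursion from underflowing on long series; the log-likelihood is recovered from the scaling constants, which is also the quantity a BIC sweep would need.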
Choosing K with BIC
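With features in d dimensions, a K-state Gaussian HMM with full covariances has p = (K − 1) + K(K − 1) + Kd + Kd(d + 1)/2 free parameters: the initial distribution, the transition rows, the means, and the covariances, in that order. The Bayesian Information Criterion
$$\mathrm{BIC}(K) = -2 \log \hat{L}_K + p \log n$$
trades off in-sample fit against complexity at the rate log n, where $\hat{L}_K$ is the maximised likelihood of the K-state model and n is the number of bars. We sweep K = 2, 3, 4, … and pick $K^* = \arg\min_K \mathrm{BIC}(K)$. On 30-minute LTC/USDT (n = 270,171 bars; d = 3 features) the sweep is unambiguous: BIC drops from 1.633M at K = 2 to 1.366M at K = 3 to 1.251M at K = 4. K = 4 wins on every asset we’ve tried. The same cross-asset BIC chart appears below.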
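The parameter count and the criterion are easy to get subtly wrong, so a small sketch may help. The helper names (`hmm_n_params`, `bic`, `select_k`) are hypothetical, and the log-likelihood values in the usage lines are made-up placeholders, not results from the LTC fit.

```python
import numpy as np

def hmm_n_params(K, d):
    """Free parameters of a K-state Gaussian HMM with full covariances in d dims."""
    initial = K - 1                       # pi sums to one
    transitions = K * (K - 1)             # each row of A sums to one
    means = K * d
    covariances = K * d * (d + 1) // 2    # symmetric d x d matrix per state
    return initial + transitions + means + covariances

def bic(loglik, K, d, n):
    """BIC = -2 log L + p log n; lower is better."""
    return -2.0 * loglik + hmm_n_params(K, d) * np.log(n)

def select_k(logliks, d, n):
    """logliks maps K -> maximised log-likelihood. Returns (K*, {K: BIC})."""
    scores = {K: bic(ll, K, d, n) for K, ll in logliks.items()}
    return min(scores, key=scores.get), scores

# Placeholder log-likelihoods, for illustration only.
example_logliks = {2: -1000.0, 3: -900.0, 4: -895.0}
best_k, scores = select_k(example_logliks, d=3, n=500)
```

For d = 3 the counts are 21, 35 and 51 parameters at K = 2, 3, 4. At n = 270,171 the penalty for K = 4 is 51 · log n, roughly 640, which is tiny next to BIC gaps of order 10^5, so on a sample that large the likelihood term dominates the choice.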
Persistence and dwell time
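The transition matrix A has self-loop probabilities $p_{kk} = A_{kk}$. The number of bars spent inside regime k before the chain transitions is geometrically distributed with success probability $1 - p_{kk}$, so the expected dwell time is the mean of that geometric distribution:
$$\tau_k = \frac{1}{1 - p_{kk}}.$$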
Fitted on LTC 30m we measure stay-probabilities (0.987, 0.970, 0.969, 0.981) — i.e. dwell times τ ≈ (76, 34, 32, 52) bars, which on a 30-minute clock means the regimes last on the order of 16–38 hours. They are not artefacts of a flickering segmentation; they are macroscopically stable.
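The arithmetic behind those dwell times can be checked in a few lines. This sketch uses the rounded stay-probabilities quoted above; `expected_dwell` is a hypothetical helper name.

```python
import numpy as np

def expected_dwell(stay_prob, bar_hours=0.5):
    """Expected dwell per regime, in bars and in hours, from self-loop probabilities."""
    stay_prob = np.asarray(stay_prob, dtype=float)
    bars = 1.0 / (1.0 - stay_prob)        # mean of a geometric distribution
    return bars, bars * bar_hours

stay = [0.987, 0.970, 0.969, 0.981]       # rounded p_kk quoted in the text
bars, hours = expected_dwell(stay)        # 30-minute bars, so 0.5 h each
print(np.round(bars, 1))
print(np.round(hours, 1))
```

With the three-decimal inputs this gives roughly 77, 33, 32 and 53 bars, or about 16 to 38 hours. The small differences from the quoted (76, 34, 32, 52) are presumably due to the probabilities being rounded before publication, since the formula is very sensitive to the third decimal when p is close to 1.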
Worked example
Per-bar feature vector $x_t = (\mathrm{logret}_t, \mathrm{vol}_t, \mathrm{trend}_t)$, where logret is the bar log-return, vol is a rolling realised std, and trend is a smoothed slope. Fitting K = 4 on LTC 30m yields four regimes whose standardized means are crisply separated:
- R1 — drift-down (32% of bars): negative vol z, near-zero trend. The default background regime.
- R2 — trend-up (24%): mean trend ≈ +0.58 σ. Sustained positive drift, moderate vol.
- R3 — trend-down (25%): mean trend ≈ −0.60 σ. Mirror of R2.
- R4 — high-vol (20%): mean vol ≈ +1.48 σ, near-zero trend. The dispersion regime.
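One plausible way to build the three features described above is sketched here. The window length, the use of an exponentially weighted mean of log-returns as the "smoothed slope", and the full-sample z-scoring are all assumptions made for illustration; the source does not specify them, and the function names are hypothetical.

```python
import numpy as np

def rolling_std(x, window):
    """Trailing-window sample std; NaN until the window is full."""
    out = np.full(x.shape, np.nan)
    for t in range(window - 1, len(x)):
        out[t] = x[t - window + 1 : t + 1].std(ddof=1)
    return out

def ewm_mean(x, span):
    """Exponentially weighted mean with the usual alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1.0)
    out = np.empty_like(x)
    out[0] = x[0]
    for t in range(1, len(x)):
        out[t] = alpha * x[t] + (1.0 - alpha) * out[t - 1]
    return out

def build_features(close, vol_window=48, trend_span=48):
    """Return a standardized (T', 3) matrix of (logret, vol, trend) per bar."""
    logret = np.diff(np.log(close))               # bar log-return
    vol = rolling_std(logret, vol_window)         # rolling realised std
    trend = ewm_mean(logret, trend_span)          # smoothed slope of log price
    X = np.column_stack([logret, vol, trend])
    X = X[~np.isnan(X).any(axis=1)]               # drop the warm-up bars
    return (X - X.mean(axis=0)) / X.std(axis=0)   # z-score each column

# Synthetic closes, for illustration only.
rng = np.random.default_rng(0)
close = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal(1000)))
X = build_features(close)
```

Z-scoring on the full sample is convenient for reading regime means in units of σ, as in the list above, but it uses future information; a live system would need a trailing or expanding standardization instead.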
The interactive demo below fits a Gaussian-mixture model live for K ∈ {2, …, 6} on a synthetic price series and plots BIC(K) — the minimum is highlighted. The strip underneath colours the price path by inferred state at the chosen K.
Demo — Gaussian-mixture BIC sweep & regime strip
Synthetic log-returns with three planted regimes. EM-fit a K-component mixture for K ∈ {2,…,6}; pick K by BIC; colour the price by MAP regime.
BIC selects K* = 2 on this synthetic series (planted K = 3). A mixture treats bars as independent draws, so it ignores the persistence structure an HMM models and separates regimes on distributional shape alone; on shorter samples it can therefore prefer K = 2 or K = 4. The production HMM uses the full transition matrix and is better calibrated. Reseed to explore that variance.
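An offline analogue of the demo can be sketched with a one-dimensional Gaussian mixture fitted by EM. This is a simplified stand-in, not the demo's code: the regime means and volatilities, segment lengths, and function names are assumptions, and the K that BIC selects will depend on the seed.

```python
import numpy as np

def gmm_log_terms(x, w, mu, var):
    """Weighted per-component log-densities (n, K) and their log-sum per point (n, 1)."""
    log_p = np.log(w) - 0.5 * np.log(2.0 * np.pi * var) - 0.5 * (x[:, None] - mu) ** 2 / var
    m = log_p.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
    return log_p, log_norm

def fit_gmm_1d(x, K, n_iter=200, seed=0):
    """EM for a K-component 1-D mixture. Returns (log-likelihood, responsibilities)."""
    rng = np.random.default_rng(seed)
    floor = 1e-6 * x.var()                          # keeps variances from collapsing
    w = np.full(K, 1.0 / K)
    mu = rng.choice(x, size=K, replace=False)       # initialise means at data points
    var = np.full(K, x.var())
    for _ in range(n_iter):
        log_p, log_norm = gmm_log_terms(x, w, mu, var)
        r = np.exp(log_p - log_norm)                # E-step: responsibilities
        nk = r.sum(axis=0) + 1e-12
        w = nk / nk.sum()                           # M-step: weights, means, variances
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + floor
    log_p, log_norm = gmm_log_terms(x, w, mu, var)
    return float(log_norm.sum()), np.exp(log_p - log_norm)

def gmm_bic(loglik, K, n):
    """A 1-D K-component mixture has (K - 1) + K + K = 3K - 1 free parameters."""
    return -2.0 * loglik + (3 * K - 1) * np.log(n)

# Synthetic log-returns with three planted regimes (illustrative values).
rng = np.random.default_rng(0)
regime_mean = np.array([0.0, 0.001, 0.0])
regime_std = np.array([0.005, 0.010, 0.030])
z = np.repeat([0, 1, 2, 0, 2, 1], 250)
logret = regime_mean[z] + regime_std[z] * rng.standard_normal(z.size)
price = 100.0 * np.exp(np.cumsum(logret))

fits = {K: fit_gmm_1d(logret, K) for K in range(2, 7)}
bics = {K: gmm_bic(ll, K, logret.size) for K, (ll, _) in fits.items()}
K_star = min(bics, key=bics.get)
states = fits[K_star][1].argmax(axis=1)             # MAP component per bar
```

Because the mixture sees only the marginal distribution of returns, two regimes with similar volatility are hard to tell apart even when they are cleanly separated in time, which is one way the selected K can differ from the planted one.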
Figures
Why this matters for systematic strategies
Most strategy evaluations average performance over a single mixed sample and report a single Sharpe. Conditioning on $z_t$ partitions that sample by regime and exposes the structure that the average hides: a strategy that is +1.5 Sharpe in trend-up and −0.8 Sharpe in high-vol is not the same object as a strategy that is +0.3 in every regime, even if both have the same blended Sharpe. The HMM gives us the partition we need to make those statements quantitatively and, because we use the smoothed posterior, to weight bars by regime-membership probability rather than thresholding on a hard label.
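A posterior-weighted per-regime Sharpe could be computed along the following lines. This is a sketch under stated assumptions: `regime_sharpe` and `blended_sharpe` are hypothetical names, the annualisation factor assumes 48 thirty-minute bars per day on a 365-day market, and the returns and soft memberships are synthetic.

```python
import numpy as np

def regime_sharpe(returns, gamma, bars_per_year=48 * 365):
    """Annualised Sharpe per regime, weighting each bar by its posterior membership.

    returns: (T,) strategy returns per bar; gamma: (T, K) posterior probabilities.
    """
    w = gamma / gamma.sum(axis=0, keepdims=True)      # weights sum to 1 per regime
    mean = w.T @ returns                              # weighted mean, shape (K,)
    var = (w * (returns[:, None] - mean) ** 2).sum(axis=0)
    return np.sqrt(bars_per_year) * mean / np.sqrt(var)

def blended_sharpe(returns, bars_per_year=48 * 365):
    """The usual single-number Sharpe over the whole sample."""
    return np.sqrt(bars_per_year) * returns.mean() / returns.std()

# Synthetic example: a strategy that earns in regime 0 and loses in regime 1.
rng = np.random.default_rng(1)
z = np.repeat([0, 1], 2000)
returns = np.where(z == 0, 0.001, -0.001) + 0.005 * rng.standard_normal(z.size)
gamma = np.where(z[:, None] == np.arange(2), 0.9, 0.1)   # soft membership rows

per_regime = regime_sharpe(returns, gamma)
overall = blended_sharpe(returns)
```

In this toy case the blended figure sits near zero while the two conditional figures have opposite signs, which is the kind of structure a single averaged Sharpe conceals. With one-hot `gamma` the function reduces to the ordinary Sharpe computed on each regime's bars.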
A causal version of the same posterior feeds the position sizer: scale exposure by $P(z_t \in \text{favourable} \mid x_{1:t})$, computed in the forward pass alone (filtering, not smoothing, for live use, since smoothing conditions on future bars). When the chain is confident we lean in; when entropy spikes near a transition we de-risk. The persistence we measure (τ ≈ 16–38 h) is the timescale that makes this kind of conditional sizing economically viable on 30-minute data.
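The filtering step and one possible sizing rule can be sketched as follows. The forward filter is the standard normalised forward recursion; the specific rule that multiplies the favourable-regime probability by an entropy-based confidence term is an assumption chosen for illustration, as are the function names and the toy inputs.

```python
import numpy as np

def forward_filter(b, pi, A):
    """Filtered posterior P(z_t = k | x_{1:t}); b holds emission likelihoods, shape (T, K)."""
    T, K = b.shape
    filt = np.empty((T, K))
    a = pi * b[0]
    filt[0] = a / a.sum()
    for t in range(1, T):
        a = (filt[t - 1] @ A) * b[t]      # predict one step, then update with bar t
        filt[t] = a / a.sum()
    return filt

def exposure(filt, favourable, max_leverage=1.0, eps=1e-12):
    """Scale exposure by P(favourable) and shrink it when the posterior is diffuse."""
    K = filt.shape[1]
    p_fav = filt[:, favourable].sum(axis=1)
    entropy = -np.sum(filt * np.log(filt + eps), axis=1)
    confidence = np.clip(1.0 - entropy / np.log(K), 0.0, 1.0)   # 1 = certain, 0 = uniform
    return max_leverage * p_fav * confidence

# Toy inputs (hypothetical): random emission likelihoods and a sticky 4-state chain.
rng = np.random.default_rng(0)
T, K = 200, 4
b = rng.uniform(0.1, 1.0, size=(T, K))
pi = np.full(K, 0.25)
A = np.full((K, K), 0.01) + 0.96 * np.eye(K)   # rows sum to 1, diagonal 0.97

filt = forward_filter(b, pi, A)
size = exposure(filt, favourable=[1])          # treat state index 1 as favourable
```

Each row of `filt` depends only on bars up to and including t, so the resulting exposure can be computed bar by bar in production without look-ahead.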
Reproducibility
DaruFinance / strategy-regime
Python — open source reference implementation
Minimal invocation
import numpy as np
from strategy_regime import fit_hmm, bic_sweep, posterior
# X: T x d feature matrix (logret, vol, trend) per bar
sweep = bic_sweep(X, K_grid=[2, 3, 4, 5, 6])
K_star = min(sweep, key=lambda r: r["bic"])["K"]
model = fit_hmm(X, n_states=K_star, n_iter=200, seed=0)
gamma = posterior(model, X) # T x K smoothed P(z_t | x_{1:T})
states = gamma.argmax(axis=1) # MAP regime per bar
dwell = 1.0 / (1.0 - np.diag(model.transmat_)) # expected dwell per regime
References
- [1] Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2), 357–384.
- [2] Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2), 257–286.
- [3] Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6(2), 461–464.