Testing the Random Walk Hypothesis on Bitcoin (BTC-USD)

A Statistical Research Report
Date: April 2026
Data: CoinGecko BTC-USD (2013–2026)
Observations: 4,735 daily log returns
Tools: Python · pandas · statsmodels · arch

Final Verdict: RWH REJECTED — 2 of 5 tests support the Random Walk Hypothesis

Bitcoin exhibits statistically significant momentum, serial autocorrelation at 6–30 day horizons, long-horizon variance ratio deviations, and highly persistent GARCH volatility — all inconsistent with weak-form market efficiency.

Abstract. This report investigates whether Bitcoin (BTC-USD) price movements follow the Random Walk Hypothesis (RWH), a cornerstone of the Efficient Market Hypothesis (EMH) in its weak form. Using 4,735 daily log returns spanning April 2013 to April 2026, we apply four core statistical tests: the Augmented Dickey-Fuller (ADF) test, the Ljung-Box autocorrelation test, the Wald-Wolfowitz Runs test, and the Variance Ratio test (Lo-MacKinlay, 1988). GARCH(1,1) volatility modelling and rolling window analysis serve as extensions. The four tests yield five scored outcomes, because the ADF test is applied separately to prices and to returns. Only the two ADF outcomes support the RWH, leading to a verdict of REJECTED. Bitcoin's return series exhibits significant momentum, serial dependence at medium horizons (6–30 days), and sustained volatility clustering, all inconsistent with a pure random walk.

1 Introduction

The Random Walk Hypothesis posits that successive asset price changes are independent and identically distributed, rendering future prices impossible to predict from historical data alone. First formalised by Bachelier (1900) and popularised by Fama (1970) as the weak form of the Efficient Market Hypothesis, the RWH has profound implications:

  • For investors: If RWH holds, systematic excess returns through technical analysis are impossible.
  • For quantitative traders: Rejection of the RWH identifies windows of exploitable inefficiency.
  • For regulators and economists: Market efficiency informs policy on transparency, liquidity, and price discovery.

Bitcoin, as a relatively young, decentralised, and retail-driven asset, presents an ideal candidate for testing market efficiency. Unlike mature equity markets, cryptocurrency markets lack continuous institutional arbitrage, operate 24/7, and were historically subject to speculative bubbles — all conditions that may precipitate inefficiency.

2 Theoretical Background

2.1 The Random Walk Model

Let $P_t$ denote the asset price at time $t$. A random walk without drift is defined as:

$$P_t = P_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{IID}(0, \sigma^2)$$

Applying the same model to log prices $\ln P_t$ instead, the increment becomes the log return $r_t$:

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right) = \ln P_t - \ln P_{t-1}$$

For the RWH to hold, $r_t$ must satisfy:

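  1. Independence: returns are serially independent, which implies $\text{Cov}(r_t, r_{t-k}) = 0$ for all $k \neq 0$
  2. Stationarity: $E[r_t] = \mu$ and $\text{Var}(r_t) = \sigma^2$ are constant
  3. Variance linearity: $\text{Var}(r_{t,k}) = k \cdot \sigma^2$, where $r_{t,k} = \sum_{i=0}^{k-1} r_{t-i}$ is the $k$-period log return (the Lo-MacKinlay condition)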

2.2 Efficient Market Hypothesis (Weak Form)

The weak-form EMH (Fama, 1970) states that all past price and volume information is fully reflected in current prices. This is commonly formalised by requiring the price process to follow a martingale:

$$E[P_{t+1} \mid P_t, P_{t-1}, \ldots] = P_t$$

2.3 Why Log Returns?

Log returns are preferred over simple arithmetic returns for two key reasons:

  • Time additivity: $r_{0 \to T} = \sum_{t=1}^{T} r_t$ (simple returns are multiplicative)
  • Normalisation: Log returns suppress the exponential scale effect of long price series, making distributional assumptions more tractable
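The time-additivity property can be checked with a few lines of arithmetic. The sketch below uses a short, hypothetical price path (the numbers are illustrative and not taken from the report's data) to show that log returns sum to the total log return, while simple returns have to be compounded.

```python
import math

# Hypothetical closing prices, for illustration only (not the report's data).
prices = [100.0, 110.0, 99.0, 120.0]

# Daily log returns r_t = ln(P_t / P_{t-1}).
log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]

# Daily simple returns R_t = P_t / P_{t-1} - 1.
simple_returns = [b / a - 1.0 for a, b in zip(prices, prices[1:])]

# Log returns add up to the total log return over the whole period ...
total_log = sum(log_returns)

# ... whereas simple returns must be compounded multiplicatively.
total_simple = math.prod(1.0 + r for r in simple_returns) - 1.0

print(total_log, math.log(prices[-1] / prices[0]))
print(total_simple, prices[-1] / prices[0] - 1.0)
```

Both printed pairs should agree up to floating-point error, which is the sense in which multi-day log returns are simply sums of daily ones.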

3 Data

3.1 Source and Structure

| Property | Value |
| --- | --- |
| Asset | Bitcoin (BTC-USD) |
| Source | CoinGecko (btc-usd-max.csv) |
| Frequency | Daily |
| Date Range | 2013-04-28 → 2026-04-17 |
| Total Rows | 4,736 |
| Return Observations | 4,735 |

3.2 Descriptive Statistics

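| Statistic | Value | Interpretation |
| --- | --- | --- |
| Mean | +0.001335 | ≈ +0.13%/day average return |
| Std Dev | 0.037455 | ≈ 3.75%/day daily volatility |
| Min | -0.4337 | Largest single-day loss: log return of −43.4% (≈ −35% as a simple return) |
| Max | +0.2871 | Largest single-day gain: log return of +28.7% (≈ +33% as a simple return) |
| Skewness | -0.4934 | Left-skewed: crashes sharper than rallies |
| Kurtosis (excess) | 9.3778 | Far heavier tails than a Normal distribution (excess kurtosis 0) |

The excess kurtosis of 9.38 is strongly leptokurtic. A normal distribution has an excess kurtosis of 0, so extreme daily moves occur far more often than normality would predict. This alone indicates that any test relying on normality assumptions should be interpreted cautiously.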
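For readers who want to reproduce statistics of this kind, the following is a minimal sketch using plain moment estimators in NumPy. It runs on simulated heavy-tailed data rather than the BTC series, and the function name `describe_returns` is hypothetical. The report's own figures were presumably produced with pandas, whose `skew` and `kurt` methods apply small-sample bias corrections, so values can differ slightly from the simple estimators used here.

```python
import numpy as np

def describe_returns(returns):
    """Mean, sample std, skewness and excess kurtosis (plain moment estimators)."""
    r = np.asarray(returns, dtype=float)
    mean = r.mean()
    z = (r - mean) / r.std(ddof=0)          # standardised returns
    return {
        "mean": mean,
        "std": r.std(ddof=1),
        "skewness": np.mean(z**3),
        "excess_kurtosis": np.mean(z**4) - 3.0,  # 0 for a normal distribution
    }

rng = np.random.default_rng(0)
# Simulated data for illustration only: Gaussian vs. heavy-tailed Student-t(5).
normal_stats = describe_returns(rng.normal(0.0, 0.03, size=100_000))
heavy_stats = describe_returns(0.03 * rng.standard_t(5, size=100_000))

print(normal_stats)
print(heavy_stats)
```

The Gaussian sample should show excess kurtosis close to zero, while the Student-t sample should show a clearly positive value, mirroring the qualitative gap between a normal benchmark and the Bitcoin figure in the table.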

4 Exploratory Data Analysis

The EDA chart below shows the full price history, log returns, return distribution, ACF, and rolling volatility. Key visual observations:

  • Price series: Five distinct regimes (early adoption → bubble 2017 → COVID crash → institutional adoption → current)
  • Log returns: Clear volatility clustering — calm periods interrupted by turbulent bursts (GARCH effect)
  • Distribution: Highly peaked centre with extreme fat tails — Normal curve (orange) systematically underestimates tail mass
  • ACF: Individual lags small but clusters appear at medium horizons (6–30 lags)
Figure: EDA Charts (price history, log returns, return distribution, ACF, rolling volatility)

5 Statistical Tests

5.1 Augmented Dickey-Fuller Test (ADF)

✔ Supports RWH
H₀: The series has a unit root (is non-stationary)
$$\Delta y_t = \alpha + \beta y_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta y_{t-i} + \varepsilon_t$$
| Series | ADF Statistic | p-value | Decision |
| --- | --- | --- | --- |
| Prices | -1.0225 | 0.7450 | Non-stationary ✔ |
| Log Returns | -18.4751 | 0.0000 | Stationary ✔ |

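Prices are non-stationary (unit root present), which is consistent with a random walk. Returns are strongly stationary, which is consistent with a weakly stationary noise process. Both results align with the RWH structural framework, although they are necessary rather than sufficient conditions: a unit root in prices does not rule out serially dependent returns.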

✔ Verdict: Supports RWH
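The regression behind the ADF statistic can be sketched directly with NumPy. The version below is a simplified illustration, not the report's implementation: it fixes the lag order at `lags=1` instead of selecting it by an information criterion, runs on simulated data, and compares the t-statistic on $\beta$ with −2.86, the approximate asymptotic 5% Dickey-Fuller critical value for a regression with a constant. The function name `adf_tstat` is hypothetical; in practice a library routine such as `statsmodels.tsa.stattools.adfuller` would normally be used.

```python
import numpy as np

def adf_tstat(y, lags=1):
    """t-statistic on beta in: dy_t = alpha + beta*y_{t-1} + sum_i gamma_i*dy_{t-i} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    target = dy[lags:]                                # dy_t
    cols = [np.ones(len(target)), y[lags:-1]]         # constant and y_{t-1}
    for i in range(1, lags + 1):
        cols.append(dy[lags - i:len(dy) - i])         # lagged differences dy_{t-i}
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(42)
sim_returns = rng.normal(0.0, 0.03, size=2000)   # simulated, not BTC data
sim_log_prices = np.cumsum(sim_returns)          # a pure random walk in log prices

CRIT_5PCT = -2.86  # approximate asymptotic 5% critical value (constant, no trend)
t_prices = adf_tstat(sim_log_prices)
t_returns = adf_tstat(sim_returns)
print("levels :", t_prices, "reject unit root:", t_prices < CRIT_5PCT)
print("returns:", t_returns, "reject unit root:", t_returns < CRIT_5PCT)
```

On a simulated random walk the returns series should reject the unit root decisively, which is the same qualitative pattern as the table above.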

5.2 Ljung-Box Autocorrelation (Q) Test

✘ Against RWH
H₀: No autocorrelation up to lag k — returns are white noise
$$Q(k) = n(n+2)\sum_{j=1}^{k} \frac{\hat{\rho}_j^2}{n-j} \sim \chi^2(k)$$
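| Lag | Q-statistic | p-value | Significant? |
| --- | --- | --- | --- |
| 1 | 0.64 | 0.423 | No ✔ |
| 5 | 5.21 | 0.391 | No ✔ |
| 6 | 16.03 | 0.0136 | Yes ✘ |
| 10 | 29.17 | 0.0012 | Yes ✘ |
| 20 | 47.31 | 0.0005 | Yes ✘ |
| 30 | 58.94 | 0.0024 | Yes ✘ |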

Short-run returns (lags 1–5) appear independent, but significant cumulative autocorrelation emerges from lag 6. Because Bitcoin trades every calendar day, this means that beyond roughly one week the history of returns carries statistically detectable information about future returns, a violation of the weak-form EMH.

✘ Verdict: Against RWH — Serial dependence at medium horizons
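The Q-statistic formula above translates almost line by line into code. The sketch below is a minimal NumPy version with a small standard-library helper for the chi-square tail probability; the function names `ljung_box` and `chi2_sf` are hypothetical, and the data are simulated white noise rather than the BTC series. A library routine such as `statsmodels.stats.diagnostic.acorr_ljungbox` is the more usual choice in practice.

```python
import math
import numpy as np

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (j - 0.5) / math.gamma(j + 0.5)
    return total

def ljung_box(returns, max_lag):
    """Q(k) = n(n+2) * sum_{j=1..k} rho_j^2 / (n-j), with its chi2(k) p-value."""
    r = np.asarray(returns, dtype=float)
    n = len(r)
    d = r - r.mean()
    denom = d @ d
    q = 0.0
    for j in range(1, max_lag + 1):
        rho_j = (d[j:] @ d[:-j]) / denom   # sample autocorrelation at lag j
        q += rho_j**2 / (n - j)
    q *= n * (n + 2)
    return q, chi2_sf(q, max_lag)

rng = np.random.default_rng(0)
sim_returns = rng.normal(0.0, 0.03, size=2000)   # simulated white noise
q10, p10 = ljung_box(sim_returns, 10)
print("Q(10) =", q10, "p =", p10)

# Cross-check two rows of the table: tail probabilities of the reported Q values.
print(chi2_sf(0.64, 1), chi2_sf(16.03, 6))
```

Feeding the reported Q values for lags 1 and 6 into `chi2_sf` should give tail probabilities close to the tabulated p-values, up to the rounding of Q.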

5.3 Wald-Wolfowitz Runs Test

✘ Against RWH
H₀: The binary sign sequence (+/−) of returns is purely random
$$Z = \frac{R - E[R]}{\sqrt{V[R]}} \xrightarrow{d} N(0,1)$$
| Metric | Value |
| --- | --- |
| Z-statistic | 3.0376 |
| p-value | 0.0024 |
| Interpretation | Fewer runs than expected → Momentum clustering |

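With |Z| = 3.04 and p = 0.0024, the test rejects randomness of the sign sequence at all conventional levels. The departure is read here as fewer runs than expected: same-sign returns cluster together, so up-days tend to follow up-days and down-days follow down-days, which is the statistical signature of momentum. Under the definition of $Z$ given above, a deficit of runs corresponds to $R < E[R]$, so the sign of the reported statistic depends on the convention of the implementation used.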

✘ Verdict: Against RWH — Non-random momentum clustering
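A minimal standard-library sketch of the runs test follows, using the usual Wald-Wolfowitz moments $E[R] = 2n_+n_-/n + 1$ and $V[R] = 2n_+n_-(2n_+n_- - n)/(n^2(n-1))$. The function name `runs_test` and the two toy sequences are hypothetical illustrations, and dropping zero returns is an assumption of this sketch rather than something the report states.

```python
import math

def runs_test(returns):
    """Wald-Wolfowitz runs test on the signs of a return series."""
    signs = [r > 0 for r in returns if r != 0]   # assumption: zero returns are dropped
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n**2 * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return runs, expected, z, p_value

# Toy sequences, for illustration only.
clustered = [0.01] * 10 + [-0.01] * 10     # long streaks: very few runs
alternating = [0.01, -0.01] * 10           # sign flips every day: many runs

print(runs_test(clustered))
print(runs_test(alternating))

# Two-sided p-value implied by the reported |Z| = 3.0376.
print(math.erfc(3.0376 / math.sqrt(2.0)))
```

With $Z = (R - E[R])/\sqrt{V[R]}$ as coded here, the clustered sequence gives a negative statistic and the alternating one a positive statistic, which is a useful check when comparing output across libraries with different sign conventions.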

5.4 Variance Ratio Test (Lo-MacKinlay, 1988)

✘ Against RWH
H₀: VR(k) = 1 for all k — variance scales linearly with time
$$VR(k) = \frac{\text{Var}(r_{t,k})}{k \cdot \text{Var}(r_t)}, \quad z(k) = \frac{VR(k)-1}{\sqrt{2(2k-1)(k-1)/(3kn)}}$$
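| k (days) | VR(k) | z-stat | Reject H₀? | Interpretation |
| --- | --- | --- | --- | --- |
| 2 | 0.9902 | -0.677 | No ✔ | Near-random |
| 5 | 0.9990 | -0.030 | No ✔ | Near-random |
| 10 | 1.0747 | 1.523 | No ✔ | Elevated, not significant |
| 20 | 1.2273 | 3.147 | Yes ✘ | Momentum at ~3 weeks |
| 30 | 1.3210 | 3.582 | Yes ✘ | Strong momentum at ~1 month |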

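VR rises monotonically from 0.99 at k=2 to ~1.44 at k=50 (the k=50 value is not tabulated above). A pure random walk would stay flat at 1.0. The pattern reveals positive autocorrelation that compounds over longer holding periods: Bitcoin trends. The z-statistics use the homoskedastic standard error given in the formula above, so their significance may be overstated when volatility clusters.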

Figure: Variance Ratio Profile
✘ Verdict: Against RWH — VR rises to 1.44 at k=50 (significant at k=20,30)
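The variance ratio and its homoskedastic z-statistic can be sketched as follows. This is a simplified illustration: it uses overlapping k-day sums without the small-sample bias corrections of Lo and MacKinlay (1988), the function names `variance_ratio` and `vr_zstat` are hypothetical, and the estimator is run on simulated IID returns. The last lines plug the reported VR values and n = 4,735 into the z formula as an arithmetic cross-check of the table.

```python
import math
import numpy as np

def variance_ratio(returns, k):
    """VR(k) = Var(k-period return) / (k * Var(1-period return)), overlapping sums."""
    r = np.asarray(returns, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(r)))
    r_k = csum[k:] - csum[:-k]              # overlapping k-period log returns
    return r_k.var(ddof=1) / (k * r.var(ddof=1))

def vr_zstat(vr, k, n):
    """Homoskedastic z-statistic: (VR - 1) / sqrt(2(2k-1)(k-1) / (3kn))."""
    se = math.sqrt(2.0 * (2 * k - 1) * (k - 1) / (3.0 * k * n))
    return (vr - 1.0) / se

rng = np.random.default_rng(1)
sim_returns = rng.normal(0.0, 0.03, size=5000)   # simulated IID returns
vr5 = variance_ratio(sim_returns, 5)
print("simulated VR(5):", vr5, "z:", vr_zstat(vr5, 5, len(sim_returns)))

# Cross-check of reported rows, using the VR values and n from the report.
print(vr_zstat(0.9902, 2, 4735), vr_zstat(1.2273, 20, 4735))
```

For IID data the estimated VR should sit close to 1, and the cross-check lines should land close to the tabulated z-statistics for k=2 and k=20.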

6 Extensions — GARCH(1,1) & Rolling Analysis

6.1 GARCH(1,1) Volatility Model

$$r_t = \mu + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$
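| Parameter | Estimate | Note |
| --- | --- | --- |
| μ (mu) | 0.1357 | p=0.0013 ★★ |
| ω (omega) | 0.5957 | p=0.0003 ★★★ |
| α (alpha[1]) | 0.1484 | p<0.0001 ★★★ |
| β (beta[1]) | 0.8207 | p<0.0001 ★★★ |
| Persistence α+β | 0.9691 | Very high persistence |
| Long-run Vol (Ann.) | 83.84% | |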

Persistence (α+β) = 0.969 means volatility shocks are extremely slow to decay. Even after GARCH filtering, the residuals remain autocorrelated (significant at all lags 1–20), indicating deeper non-linear structure beyond what GARCH can capture.
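The derived quantities in the table follow from the estimates by simple arithmetic, sketched below. Two assumptions are made here that the report does not state explicitly: the model was fitted to returns expressed in percent, and volatility is annualised with 365 days. Both are inferred from the fact that they approximately reproduce the reported 83.84%. The helper `garch_variance_path` is a hypothetical illustration of the variance recursion, not the fitting procedure.

```python
import math

# Reported GARCH(1,1) estimates (assumed to be on percentage returns).
omega, alpha, beta = 0.5957, 0.1484, 0.8207

persistence = alpha + beta
long_run_var = omega / (1.0 - persistence)         # unconditional variance
daily_vol_pct = math.sqrt(long_run_var)
annual_vol_pct = daily_vol_pct * math.sqrt(365)    # assumption: 365-day year
half_life_days = math.log(0.5) / math.log(persistence)

print("persistence      :", persistence)
print("long-run vol/day :", daily_vol_pct, "%")
print("long-run vol/year:", annual_vol_pct, "%")
print("shock half-life  :", half_life_days, "days")

def garch_variance_path(residuals, omega, alpha, beta):
    """sigma2_t = omega + alpha * e_{t-1}^2 + beta * sigma2_{t-1}, started at the long-run variance."""
    sigma2 = [omega / (1.0 - alpha - beta)]
    for e in residuals[:-1]:
        sigma2.append(omega + alpha * e**2 + beta * sigma2[-1])
    return sigma2

# Hypothetical residuals (in percent): one large shock followed by calm days.
path = garch_variance_path([10.0, 0.5, 0.5, 0.5, 0.5], omega, alpha, beta)
print(path)
```

The half-life figure gives a concrete reading of "extremely slow to decay": with persistence near 0.97, it takes roughly three weeks for half of a volatility shock to fade.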

6.2 Rolling Window Analysis

Rolling ADF (365-day window): 100% of windows stationary — stationarity is consistent throughout BTC history, not a statistical artefact of the full sample.

Rolling Runs Test (180-day window): 14.8% of windows show non-random behaviour — momentum is episodic, concentrated in major bull/bear market runs, not uniformly present.

Figure: GARCH and Rolling Analysis
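A rolling version of the runs test can be sketched as below. Several details are assumptions, since the report does not specify them: the window advances one day at a time, a window counts as non-random when |z| exceeds 1.96, and zero returns are treated as down-days. The function names are hypothetical and the input is simulated IID data, so the printed share is not comparable to the 14.8% reported for Bitcoin.

```python
import math
import numpy as np

def runs_z(signs):
    """Wald-Wolfowitz z-statistic for a boolean sign array (True = up-day)."""
    n = len(signs)
    n_pos = int(np.count_nonzero(signs))
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")                 # undefined when only one sign occurs
    runs = 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n**2 * (n - 1.0))
    return (runs - expected) / math.sqrt(variance)

def rolling_rejection_share(returns, window=180, z_crit=1.96):
    """Share of rolling windows in which the runs test rejects randomness."""
    signs = np.asarray(returns, dtype=float) > 0   # zero returns count as down-days
    n_windows = len(signs) - window + 1
    rejected = 0
    for start in range(n_windows):
        z = runs_z(signs[start:start + window])
        if abs(z) > z_crit:                 # NaN compares False, so it is not counted
            rejected += 1
    return rejected / n_windows, n_windows

rng = np.random.default_rng(7)
sim_returns = rng.normal(0.0, 0.03, size=1000)     # simulated, not BTC data
share, n_windows = rolling_rejection_share(sim_returns)
print("windows:", n_windows, "share rejecting randomness:", share)
```

Because consecutive windows overlap almost entirely, rejections arrive in clusters, so the share is best read as a description of how episodic the effect is rather than as a set of independent tests.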

7 Discussion

7.1 Summary of Evidence

| Test | Outcome |
| --- | --- |
| ADF: Prices Non-Stationary | ✔ Supports RWH |
| ADF: Returns Stationary | ✔ Supports RWH |
| Ljung-Box ACF (Lags 6–30) | ✘ Rejects RWH |
| Wald-Wolfowitz Runs Test | ✘ Rejects RWH |
| Variance Ratio Test (k=20,30) | ✘ Rejects RWH |
| Final Score | 2/5: REJECTED ❌ |

7.2 BTC vs. Mature Equity Markets (AAPL)

| Metric | BTC-USD | AAPL |
| --- | --- | --- |
| Tests supporting RWH | 2/5 | 3/5 |
| Kurtosis | 9.38 | 5.69 |
| Runs Test p-value | 0.0024 ✘ | 0.3537 ✔ |
| VR at k=20 | 1.23 ✘ | 0.85 ✘ |
| Verdict | Rejected | Partially Supported |

Bitcoin is measurably less efficient than Apple stock — consistent with lower institutional arbitrage capacity and greater retail speculative activity.

7.3 Limitations

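  1. Fat tails and heteroskedasticity: With an excess kurtosis of 9.38 and strong volatility clustering, the asymptotic normal approximation behind the VR z-test, which assumes homoskedastic returns, is a poor one. The Runs test uses only the signs of returns and is less directly affected by tail thickness.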
  2. Structural breaks: Bitcoin has passed through fundamentally different market regimes. Full-sample tests aggregate across these.
  3. Non-linear dependence: ACF and VR tests only detect linear autocorrelation. Machine learning may uncover deeper structure.
  4. Transaction costs: Statistical predictability does not imply economic profitability after costs and slippage.

8 Conclusion

Based on four statistical tests applied to 4,735 daily Bitcoin return observations (2013–2026), the Random Walk Hypothesis is rejected at conventional significance levels.

  • ADF: Price levels exhibit unit roots and returns are stationary — structurally consistent with a random walk model ✔
  • Runs Test (p=0.0024): Strong evidence that return sign sequences are non-random — momentum clustering ✘
  • Ljung-Box: Statistically significant autocorrelation from lag 6 through lag 30 ✘
  • Variance Ratio: VR rises from 0.99 (k=2) to 1.44 (k=50) — significant positive autocorrelation at longer horizons ✘
  • GARCH(1,1): Persistence α+β=0.969 confirms highly clustered volatility — constant-variance RWH assumption strongly violated ✘

Bitcoin's market is not weak-form efficient over the 2013–2026 window. Exploitable statistical structure exists primarily at the 1–4 week holding period horizon, where momentum-based strategies might generate meaningful signals before accounting for transaction costs.

9 Future Work

EGARCH / GJR-GARCH

Capture asymmetric volatility (leverage effect) — negative shocks may amplify volatility more than positive shocks.

Bootstrap VR Test

Heteroskedasticity-robust variance ratio inference — addresses the fat-tail limitation of asymptotic z-tests.

Hurst Exponent

Measure long-range dependence (H > 0.5 → trending). Provides a continuous measure of persistence rather than a binary test.

Machine Learning

LSTM/Transformer to detect non-linear return predictability beyond what linear autocorrelation tests can capture.

High-Frequency Data

Tick-level analysis would likely reveal even stronger RWH violations due to market microstructure effects.

Regime-Switching Model

Markov-switching to test market efficiency separately within bull and bear regimes.

References

  • Bachelier, L. (1900). Théorie de la spéculation. Annales Scientifiques de l'École Normale Supérieure.
  • Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. Journal of Finance, 25(2), 383–417.
  • Lo, A. W., & MacKinlay, A. C. (1988). Stock market prices do not follow random walks. Review of Financial Studies, 1(1), 41–66.
  • Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987–1007.
  • Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307–327.
  • Urquhart, A. (2016). The inefficiency of Bitcoin. Economics Letters, 148, 80–82.