Testing the Random Walk Hypothesis on Bitcoin (BTC-USD)

A Statistical Research Report

Date: April 2026

Data: CoinGecko BTC-USD (2013–2026)

Observations: 4,735 daily log returns

Tools: Python · pandas · statsmodels · arch


Final Verdict: RWH REJECTED. Only 2 of 5 test outcomes support the Random Walk Hypothesis

Bitcoin exhibits statistically significant momentum, serial autocorrelation at 6–30 day horizons, long-horizon variance ratio deviations, and highly persistent GARCH volatility — all inconsistent with weak-form market efficiency.

Abstract. This report investigates whether Bitcoin (BTC-USD) price movements follow the Random Walk Hypothesis (RWH), a cornerstone of the Efficient Market Hypothesis (EMH) in its weak form. Using 4,735 daily log returns spanning April 2013 to April 2026, we apply four core statistical tests: the Augmented Dickey-Fuller (ADF) test, the Ljung-Box autocorrelation test, the Wald-Wolfowitz Runs test, and the Variance Ratio test (Lo-MacKinlay, 1988). GARCH(1,1) volatility modelling and rolling window analysis serve as extensions. The four tests yield five scored outcomes, because the ADF test is applied separately to prices and to returns; 2 of the 5 support the RWH, leading to a verdict of REJECTED. Bitcoin's return series exhibits significant momentum, serial dependence at medium horizons (6–30 days), and sustained volatility clustering, all of which are inconsistent with a pure random walk.

1 Introduction

The Random Walk Hypothesis posits that successive asset price changes are independent and identically distributed, rendering future prices impossible to predict from historical data alone. First formalised by Bachelier (1900) and popularised by Fama (1970) as the weak form of the Efficient Market Hypothesis, the RWH has profound implications:

For investors: If RWH holds, systematic excess returns through technical analysis are impossible.
For quantitative traders: Rejection of the RWH identifies windows of exploitable inefficiency.
For regulators and economists: Market efficiency informs policy on transparency, liquidity, and price discovery.

Bitcoin, as a relatively young, decentralised, and retail-driven asset, presents an ideal candidate for testing market efficiency. Unlike mature equity markets, cryptocurrency markets lack continuous institutional arbitrage, operate 24/7, and were historically subject to speculative bubbles — all conditions that may precipitate inefficiency.

2 Theoretical Background

2.1 The Random Walk Model

Let $P_t$ denote the asset price at time $t$. A random walk without drift is defined as:

P_t = P_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{IID}(0, \sigma^2)

The empirical tests work with the log return $r_t$, the first difference of the log price:

r_t = \ln\left(\frac{P_t}{P_{t-1}}\right) = \ln P_t - \ln P_{t-1}

For the RWH to hold, $r_t$ must satisfy:

Independence: $\text{Cov}(r_t, r_{t-k}) = 0$ for all $k \neq 0$
Stationarity: $E[r_t] = \mu$ and $\text{Var}(r_t) = \sigma^2$ are constant
Variance linearity: $\text{Var}(r_{t,k}) = k \cdot \sigma^2$, where $r_{t,k} = r_t + r_{t-1} + \cdots + r_{t-k+1}$ is the $k$-day log return (the Lo-MacKinlay condition)

2.2 Efficient Market Hypothesis (Weak Form)

The weak-form EMH (Fama, 1970) states that all past price and volume information is fully reflected in current prices. Ignoring drift, this is commonly formalised by requiring the price process to follow a martingale:

E[P_{t+1} \mid P_t, P_{t-1}, \ldots] = P_t

2.3 Why Log Returns?

Log returns are preferred over simple arithmetic returns for two key reasons:

Time additivity: $r_{0 \to T} = \sum_{t=1}^{T} r_t$ (simple returns are multiplicative)
Normalisation: Log returns suppress the exponential scale effect of long price series, making distributional assumptions more tractable
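The time-additivity property can be checked on a handful of numbers. The sketch below uses made-up prices (not taken from the dataset) and only the standard library; it shows that log returns sum to the total log return, while simple returns have to be compounded.

```python
import math

# Hypothetical closing prices, used only to illustrate the identity.
prices = [100.0, 110.0, 99.0, 120.0]

# Daily log returns r_t = ln(P_t / P_{t-1}).
log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]

# Daily simple returns R_t = P_t / P_{t-1} - 1.
simple_returns = [b / a - 1.0 for a, b in zip(prices, prices[1:])]

# Log returns add up to the log return over the whole period.
total_log = sum(log_returns)
direct_log = math.log(prices[-1] / prices[0])

# Simple returns must be compounded (multiplied); summing them gives a different number.
total_simple = math.prod(1.0 + r for r in simple_returns) - 1.0
naive_sum = sum(simple_returns)

print(total_log, direct_log)
print(total_simple, naive_sum)
```

Here the compounded simple return equals the true 20% gain from 100 to 120, whereas the naive sum of simple returns does not.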

3 Data

3.1 Source and Structure

Property	Value
Asset	Bitcoin (BTC-USD)
Source	CoinGecko — btc-usd-max.csv
Frequency	Daily
Date Range	2013-04-28 → 2026-04-17
Total Rows	4,736
Return Observations	4,735

3.2 Descriptive Statistics

Statistic	Value	Interpretation
Mean	+0.001335	≈ +0.13%/day average return
Std Dev	0.037455	≈ 3.75%/day daily volatility
Min	-0.4337	Largest single-day log loss (≈ −35.2% as a simple return)
Max	+0.2871	Largest single-day log gain (≈ +33.3% as a simple return)
Skewness	-0.4934	Left-skewed: crashes sharper than rallies
Excess kurtosis	9.3778	Far heavier tails than Normal (excess kurtosis 0)

The excess kurtosis of 9.38 (against 0 for a normal distribution) marks the return distribution as strongly leptokurtic: extreme daily moves occur far more often than a normal model predicts. This alone indicates that any test relying on normality assumptions should be interpreted cautiously.
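For reference, skewness and excess kurtosis can be computed from central moments as sketched below. This is an illustration with plain population moments; pandas applies small-sample bias corrections, so its figures can differ slightly. The helper name `sample_moments` and the toy data are assumptions of this sketch. The last lines convert the extreme log returns from the table into simple returns.

```python
import math

def sample_moments(x):
    """Return (mean, std, skewness, excess kurtosis) from population moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0     # equals 0 for a Normal distribution
    return mean, math.sqrt(m2), skew, excess_kurt

# Toy data (hypothetical): a symmetric two-point sample has very thin tails.
mean, std, skew, excess_kurt = sample_moments([-1.0, -1.0, 1.0, 1.0])
print(skew, excess_kurt)

# A log return r corresponds to a simple return of exp(r) - 1.
worst_simple = math.exp(-0.4337) - 1.0   # about -35.2%
best_simple = math.exp(0.2871) - 1.0     # about +33.3%
print(round(worst_simple, 4), round(best_simple, 4))
```

The gap between the log and simple figures grows with the size of the move, which is why the two should not be quoted interchangeably for extreme days.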

4 Exploratory Data Analysis

The EDA chart below shows the full price history, log returns, return distribution, ACF, and rolling volatility. Key visual observations:

Price series: Five distinct regimes (early adoption → 2017 bubble → COVID crash → institutional adoption → current)
Log returns: Clear volatility clustering — calm periods interrupted by turbulent bursts (GARCH effect)
Distribution: Highly peaked centre with extreme fat tails — Normal curve (orange) systematically underestimates tail mass
ACF: Individual lags small but clusters appear at medium horizons (6–30 lags)

5 Statistical Tests

5.1 Augmented Dickey-Fuller Test (ADF)

✔ Supports RWH

H₀: The series has a unit root (is non-stationary)

\Delta y_t = \alpha + \beta y_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta y_{t-i} + \varepsilon_t

Series	ADF Statistic	p-value	Decision
Prices	-1.0225	0.7450	Non-stationary ✔
Log Returns	-18.4751	<0.0001	Stationary ✔

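Prices are non-stationary: the unit-root null cannot be rejected (p = 0.745), which is consistent with a random walk. For log returns the unit-root null is decisively rejected, so returns are stationary. Both results match the structure the RWH implies, but they are necessary rather than sufficient conditions, because stationary returns can still be serially dependent.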

✔ Verdict: Supports RWH
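The logic of the test can be illustrated with a simplified Dickey-Fuller regression, that is, the equation above with no lagged differences (p = 0). This is a sketch on simulated data, not the report's code; in practice `statsmodels.tsa.stattools.adfuller` selects the lag order and supplies p-values. The function name `dickey_fuller_t` is an assumption of this sketch. The t-statistic on β is compared with the Dickey-Fuller critical value of about −2.86 (5% level, regression with a constant), not with Normal tables.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic on beta in  dy_t = alpha + beta * y_{t-1} + e_t  (no lagged differences)."""
    x = y[:-1]                                    # y_{t-1}
    d = [b - a for a, b in zip(y, y[1:])]         # dy_t
    n = len(d)
    x_bar, d_bar = sum(x) / n, sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxd = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    beta = sxd / sxx                              # OLS slope
    alpha = d_bar - beta * x_bar                  # OLS intercept
    ssr = sum((di - alpha - beta * xi) ** 2 for xi, di in zip(x, d))
    se_beta = math.sqrt(ssr / (n - 2) / sxx)
    return beta / se_beta

random.seed(0)
shocks = [random.gauss(0.0, 0.03) for _ in range(2000)]   # simulated IID returns
log_price = [0.0]
for e in shocks:
    log_price.append(log_price[-1] + e)                    # simulated random walk in levels

t_levels = dickey_fuller_t(log_price)     # typically above -2.86: unit root not rejected
t_returns = dickey_fuller_t(shocks)       # strongly negative: unit root rejected
print(t_levels, t_returns)
```

A simulated random walk reproduces the pattern in the table: the level series gives a statistic near zero, while its first differences give a very large negative one.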

5.2 Ljung-Box Autocorrelation (Q) Test

✘ Against RWH

H₀: No autocorrelation up to lag k — returns are white noise

Q(k) = n(n+2)\sum_{j=1}^{k} \frac{\hat{\rho}_j^2}{n-j} \sim \chi^2(k)

Lag	Q-statistic	p-value	Significant?
1	0.64	0.423	No ✔
5	5.21	0.391	No ✔
6	16.03	0.0136	Yes ✘
10	29.17	0.0012	Yes ✘
20	47.31	0.0005	Yes ✘
30	58.94	0.0024	Yes ✘

Short-run returns (lags 1–5) appear independent, but the cumulative statistic becomes significant at lag 6 and remains significant through lag 30. Beyond roughly five days, the history of Bitcoin returns carries linear predictive information, which violates the weak-form EMH.

✘ Verdict: Against RWH — Serial dependence at medium horizons
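The Q statistic is simple enough to compute by hand. The sketch below implements the formula above with the standard library and applies it to simulated series; the report itself would more likely rely on `statsmodels`, and the function name `ljung_box_q` is an assumption. The final lines back out the size of the lag-6 autocorrelation implied by the jump from Q(5) = 5.21 to Q(6) = 16.03 in the table.

```python
import math
import random

def ljung_box_q(x, k):
    """Ljung-Box Q(k) = n(n+2) * sum_{j=1..k} rho_j^2 / (n - j)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for j in range(1, k + 1):
        cj = sum(dev[t] * dev[t - j] for t in range(j, n))
        rho_j = cj / c0                      # sample autocorrelation at lag j
        total += rho_j ** 2 / (n - j)
    return n * (n + 2) * total

CHI2_95_DF10 = 18.307   # 5% critical value, chi-square with 10 degrees of freedom

random.seed(1)
white = [random.gauss(0.0, 1.0) for _ in range(2000)]
ar1 = [0.0]
for e in white[1:]:
    ar1.append(0.3 * ar1[-1] + e)            # simulated AR(1), dependent by construction

q_white, q_ar1 = ljung_box_q(white, 10), ljung_box_q(ar1, 10)
print(q_white, q_ar1, CHI2_95_DF10)

# Q(6) - Q(5) = n(n+2) * rho_6^2 / (n - 6), so the table pins down |rho_6|.
n_obs = 4735
implied_abs_rho6 = math.sqrt((16.03 - 5.21) * (n_obs - 6) / (n_obs * (n_obs + 2)))
print(round(implied_abs_rho6, 4))            # roughly 0.048
```

An autocorrelation of about 0.05 in magnitude is small in economic terms; it registers as significant mainly because the sample is large.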

5.3 Wald-Wolfowitz Runs Test

✘ Against RWH

H₀: The binary sign sequence (+/−) of returns is purely random

Z = \frac{R - E[R]}{\sqrt{V[R]}} \xrightarrow{d} N(0,1)

Metric	Value
Z-statistic	3.0376
p-value	0.0024
Interpretation	Fewer runs than expected → Momentum clustering

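The test rejects randomness at all conventional levels (|Z| = 3.04, two-sided p = 0.0024). The departure is in the direction of fewer runs than expected: same-sign returns cluster together. This is the statistical signature of momentum, in which up-days tend to follow up-days and down-days follow down-days. With Z defined as above, fewer runs gives R < E[R] and hence a negative value, so the positive figure reported here presumably reflects the reversed sign convention, Z = (E[R] − R)/√V[R].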

✘ Verdict: Against RWH — Non-random momentum clustering
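A minimal standard-library version of the test is sketched below, using the definition Z = (R − E[R])/√V[R] given above. Sign conventions differ between write-ups, so it is worth checking which direction a given Z implies; the two toy sequences make the direction explicit. The function name `runs_test` and the toy data are assumptions of this sketch.

```python
import math

def runs_test(returns):
    """Wald-Wolfowitz runs test on the signs of a return series (zero returns dropped)."""
    signs = [r > 0 for r in returns if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided Normal p-value
    return runs, expected, z, p_value

clustered = [0.01] * 10 + [-0.01] * 10      # 2 runs: same-sign days bunch together
alternating = [0.01, -0.01] * 10            # 20 runs: the sign flips every day

runs_c, exp_c, z_c, p_c = runs_test(clustered)
runs_a, exp_a, z_a, p_a = runs_test(alternating)
print(runs_c, exp_c, round(z_c, 3), p_c)
print(runs_a, exp_a, round(z_a, 3), p_a)
```

Under this definition, clustering (momentum) produces a negative Z and rapid sign reversal produces a positive Z; both sequences have the same expected number of runs, 11.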

5.4 Variance Ratio Test (Lo-MacKinlay, 1988)

✘ Against RWH

H₀: VR(k) = 1 for all k — variance scales linearly with time

VR(k) = \frac{\text{Var}(r_{t,k})}{k \cdot \text{Var}(r_t)}, \quad z(k) = \frac{VR(k)-1}{\sqrt{2(2k-1)(k-1)/(3kn)}}

k (days)	VR(k)	z-stat	Reject H₀?	Interpretation
2	0.9902	-0.677	No ✔	Near-random
5	0.9990	-0.030	No ✔	Near-random
10	1.0747	1.523	No ✔	Borderline
20	1.2273	3.147	Yes ✘	Momentum at ~3 weeks
30	1.3210	3.582	Yes ✘	Strong momentum at ~1 month

VR rises monotonically from 0.99 at k=2 to ~1.44 at k=50. A pure random walk would stay flat at 1.0. This reveals positive autocorrelation that compounds over longer holding periods — Bitcoin trends.

✘ Verdict: Against RWH — VR rises to 1.44 at k=50 (significant at k=20,30)
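The z column can be recomputed from the reported VR(k) values and n = 4,735 using the homoskedastic formula above. The sketch below does that and also includes a basic VR estimator on overlapping k-day sums. The estimator omits the small-sample bias corrections of Lo and MacKinlay, and both function names are assumptions of this sketch.

```python
import math

def variance_ratio(returns, k):
    """VR(k): variance of overlapping k-period returns over k times the 1-period variance."""
    n = len(returns)
    mu = sum(returns) / n
    var_1 = sum((r - mu) ** 2 for r in returns) / n
    # Overlapping k-period returns r_{t,k} = r_t + ... + r_{t-k+1}.
    k_sums = [sum(returns[i:i + k]) for i in range(n - k + 1)]
    var_k = sum((s - k * mu) ** 2 for s in k_sums) / len(k_sums)
    return var_k / (k * var_1)

def vr_z_stat(vr, k, n):
    """Homoskedastic asymptotic z-statistic for VR(k) with n one-period returns."""
    return (vr - 1.0) / math.sqrt(2.0 * (2 * k - 1) * (k - 1) / (3.0 * k * n))

# VR(k) values as reported in the table; n is the number of daily log returns.
reported = {2: 0.9902, 5: 0.9990, 10: 1.0747, 20: 1.2273, 30: 1.3210}
z_stats = {k: vr_z_stat(vr, k, 4735) for k, vr in reported.items()}
for k, z in z_stats.items():
    print(k, round(z, 3))
```

The recomputed values agree with the table to within rounding, which indicates that the reported z-statistics are the homoskedastic version rather than the heteroskedasticity-robust one.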

6 Extensions — GARCH(1,1) & Rolling Analysis

6.1 GARCH(1,1) Volatility Model

r_t = \mu + \varepsilon_t, \quad \varepsilon_t = \sigma_t z_t, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2

Parameter	Estimate	p-value / Note
μ (mu)	0.1357	p=0.0013 ★★
ω (omega)	0.5957	p=0.0003 ★★★
α (alpha[1])	0.1484	p<0.0001 ★★★
β (beta[1])	0.8207	p<0.0001 ★★★
Persistence α+β	0.9691	Very high persistence
Long-run Vol (Ann.)	83.84%	

Persistence (α+β) = 0.969 means volatility shocks decay slowly: the implied half-life is ln(0.5)/ln(0.969) ≈ 22 days. Even after GARCH filtering, the residuals remain autocorrelated (significant at all lags 1–20), indicating deeper non-linear structure beyond what GARCH can capture.
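The derived quantities follow directly from the point estimates. The sketch below assumes that returns were scaled to percent before fitting and that annualisation uses 365 days; neither is stated in the report, but both are suggested by the magnitudes of μ and ω and by the fact that they reproduce the reported long-run volatility to within rounding. The helper `garch_variance_path` is a hypothetical illustration of the variance recursion, not the `arch` package's implementation.

```python
import math

# Point estimates from the table above; returns are assumed to be in percent.
omega, alpha, beta = 0.5957, 0.1484, 0.8207

persistence = alpha + beta
long_run_var = omega / (1.0 - persistence)          # unconditional daily variance (%^2)
daily_vol = math.sqrt(long_run_var)                 # percent per day
annual_vol = daily_vol * math.sqrt(365)             # assumes 365 trading days per year
half_life = math.log(0.5) / math.log(persistence)   # days for a variance shock to halve

def garch_variance_path(residuals, omega, alpha, beta):
    """Filter sigma_t^2 = omega + alpha * e_{t-1}^2 + beta * sigma_{t-1}^2."""
    sigma2 = [omega / (1.0 - alpha - beta)]         # start at the long-run variance
    for e in residuals[:-1]:
        sigma2.append(omega + alpha * e ** 2 + beta * sigma2[-1])
    return sigma2

print(round(persistence, 4), round(daily_vol, 2), round(annual_vol, 1), round(half_life, 1))
```

With these inputs the annualised figure comes out near 83.9%, close to the reported 83.84%; the small gap is consistent with the parameters being rounded to four decimals.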

6.2 Rolling Window Analysis

Rolling ADF (365-day window): 100% of windows stationary — stationarity is consistent throughout BTC history, not a statistical artefact of the full sample.

Rolling Runs Test (180-day window): 14.8% of windows show non-random behaviour — momentum is episodic, concentrated in major bull/bear market runs, not uniformly present.
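A rolling runs test of this kind can be sketched as follows: slide a fixed window over the sign sequence, compute the runs z-statistic in each window, and report the share of windows that reject at the 5% level. The function names, the treatment of zero returns as down days, and the 1.96 cutoff are assumptions of this sketch rather than details taken from the report.

```python
import math

def runs_z(signs):
    """Runs-test z-statistic for a list of booleans (True = up day)."""
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0:
        return 0.0                                   # one sign only: treat as no evidence
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    if variance <= 0.0:
        return 0.0                                   # degenerate window
    return (runs - expected) / math.sqrt(variance)

def share_nonrandom(returns, window=180, z_crit=1.96):
    """Fraction of rolling windows in which the runs test rejects at the 5% level."""
    signs = [r > 0 for r in returns]                 # zero returns count as down days
    zs = [runs_z(signs[i:i + window]) for i in range(len(signs) - window + 1)]
    return sum(abs(z) > z_crit for z in zs) / len(zs)

# Toy sequences (hypothetical): daily sign flips reject in every window,
# while a regular two-up, two-down pattern has close to the expected number of runs.
flip_share = share_nonrandom([0.01, -0.01] * 200)
pairs_share = share_nonrandom([0.01, 0.01, -0.01, -0.01] * 100)
print(flip_share, pairs_share)
```

Applied to the Bitcoin return series, the same statistic is what the 14.8% figure above summarises. Overlapping windows are strongly correlated, so that share describes how episodic the effect is rather than serving as an independent test count.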

7 Discussion

7.1 Summary of Evidence

Test	Verdict
ADF: prices non-stationary	✔ Supports RWH
ADF: returns stationary	✔ Supports RWH
Ljung-Box ACF (lags 6–30)	✘ Rejects RWH
Wald-Wolfowitz Runs Test	✘ Rejects RWH
Variance Ratio Test (k=20,30)	✘ Rejects RWH
Final Score	2/5, REJECTED ❌

7.2 BTC vs. Mature Equity Markets (AAPL)

Metric	BTC-USD	AAPL
Tests supporting RWH	2/5	3/5
Kurtosis	9.38	5.69
Runs Test p-value	0.0024 ✘	0.3537 ✔
VR at k=20	1.23 ✘	0.85 ✘
Verdict	Rejected	Partially Supported

On these measures Bitcoin appears less efficient than Apple stock, which is consistent with lower institutional arbitrage capacity and greater retail speculative activity.

7.3 Limitations

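Fat tails: With excess kurtosis of 9.38 and strong volatility clustering, the asymptotic normal approximation behind the homoskedastic VR z-test is unreliable. The Runs test uses only the signs of returns, so it is not affected by fat tails.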
Structural breaks: Bitcoin has passed through fundamentally different market regimes. Full-sample tests aggregate across these.
Non-linear dependence: ACF and VR tests only detect linear autocorrelation. Machine learning may uncover deeper structure.
Transaction costs: Statistical predictability does not imply economic profitability after costs and slippage.

8 Conclusion

Based on four statistical tests applied to 4,735 daily Bitcoin return observations (2013–2026), the Random Walk Hypothesis is rejected at conventional significance levels.

ADF: Price levels exhibit unit roots and returns are stationary — structurally consistent with a random walk model ✔
Runs Test (p=0.0024): Strong evidence that return sign sequences are non-random — momentum clustering ✘
Ljung-Box: Statistically significant autocorrelation from lag 6 through lag 30 ✘
Variance Ratio: VR rises from 0.99 (k=2) to 1.44 (k=50) — significant positive autocorrelation at longer horizons ✘
GARCH(1,1): Persistence α+β=0.969 confirms highly clustered volatility — constant-variance RWH assumption strongly violated ✘

Bitcoin's market is not weak-form efficient over the 2013–2026 window. Exploitable statistical structure exists primarily at the 1–4 week holding period horizon, where momentum-based strategies might generate meaningful signals before accounting for transaction costs.

9 Future Work

EGARCH / GJR-GARCH: Capture asymmetric volatility (leverage effect), where negative shocks may amplify volatility more than positive shocks.
Bootstrap VR Test: Heteroskedasticity-robust variance ratio inference, which addresses the fat-tail limitation of asymptotic z-tests.
Hurst Exponent: Measure long-range dependence (H > 0.5 → trending). Provides a continuous measure of persistence rather than a binary test.
Machine Learning: LSTM/Transformer models to detect non-linear return predictability beyond what linear autocorrelation tests can capture.
High-Frequency Data: Tick-level analysis would likely reveal even stronger RWH violations due to market microstructure effects.
Regime-Switching Model: Markov-switching to test market efficiency separately within bull and bear regimes.

References

Bachelier, L. (1900). Théorie de la spéculation. Annales Scientifiques de l'École Normale Supérieure, 17, 21–86.
Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. Journal of Finance, 25(2), 383–417.
Lo, A. W., & MacKinlay, A. C. (1988). Stock market prices do not follow random walks: Evidence from a simple specification test. Review of Financial Studies, 1(1), 41–66.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987–1007.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. Journal of Econometrics, 31(3), 307–327.
Urquhart, A. (2016). The inefficiency of Bitcoin. Economics Letters, 148, 80–82.