Predicting Market State Transitions Using Markov Chains

This is my course project for Matrix Computation; the original page is at
https://docs.google.com/document/d/19rqlwL9RqEjm3eHTEahrBmxk1uIzwdrTQbp7exbuRdg/edit?tab=t.0#heading=h.v6krt4tg3vk8

1. Introduction

This project models stock market state transitions using a discrete-time Markov chain with three defined market states: Bull, Bear, and Stagnant. By computing the state transition matrix from historical data and analyzing its properties, especially its eigenvalues and steady-state behavior, we explore whether Markov chains can provide predictive insights into financial market dynamics.

By processing the historical data, we study the properties of the chain's transition matrix and apply eigendecomposition and the power method to uncover insights into the market.

2. Objectives

This study aims to analyze market dynamics using a Markov chain framework. The primary objectives are as follows:

Classify daily market movements into Bull, Bear, or Stagnant states based on daily log returns.
Construct a Markov state transition matrix using these classified market states derived from historical data.
Perform eigenvalue decomposition on the transition matrix to evaluate market characteristics, with a focus on the left eigenvector corresponding to the eigenvalue of 1, which represents the steady-state distribution.
Apply the power method to approximate the steady-state distribution and gain insight into the long-term behavior of market states.
Assess the convergence and stability of the system to evaluate the reliability of the Markov model in representing market dynamics.

3. Data Acquisition and Market State Classification

3.1 Data Source

We use daily prices of the S&P 500 index from January 2000 through December 2024, retrieved via the Python yfinance library. A sample of the resulting table is shown below:

All price columns refer to the ^GSPC ticker.

Date        Close    High     Low      Open     Volume      LogReturn
2000-01-04  1399.42  1455.22  1397.43  1455.22  1009000000  -3.91%
2000-01-05  1402.11  1413.27  1377.68  1399.42  1085500000   0.19%
2000-01-06  1403.45  1411.90  1392.10  1402.11  1092300000   0.10%
2000-01-07  1441.47  1441.47  1400.73  1403.45  1225200000   2.67%
2000-01-10  1457.60  1464.36  1441.47  1441.47  1064800000   1.11%
2000-01-11  1438.56  1458.66  1434.42  1457.60  1014000000  -1.31%

The dataset contains 6,217 rows in total; each row holds the prices, volume, and log return for the trading day given in the first column.

The Python script used to retrieve the data above is shown below:

import yfinance as yf
import numpy as np
import pandas as pd

# Download historical market data and compute daily log returns
def download_data(ticker: str, start: str, end: str) -> pd.DataFrame:
    data = yf.download(ticker, start=start, end=end)
    data['LogReturn'] = np.log(data['Close'] / data['Close'].shift(1))
    return data.dropna()

if __name__ == "__main__":
    data = download_data("^GSPC", "2000-01-01", "2024-12-31")
    data.to_csv("sp500_20y.csv")

3.2 Market State Classification

Logarithmic returns are used in this project to classify the daily market state. The log return on day t is defined as

  r_t = ln(P_t / P_{t-1})

  • P_t represents the closing price on day t

Logarithmic returns (log returns) are generally more suitable for data analysis compared to simple returns for several key reasons.

First, log returns are time-additive, meaning that the return over several periods is simply the sum of the individual period returns, which simplifies statistical modeling:

  r_{t,t+k} = r_{t+1} + r_{t+2} + … + r_{t+k} = ln(P_{t+k} / P_t)

  • r_{t,t+k} represents the log return from day t to day t+k

Second, they are symmetric: a positive log return and its corresponding negative log return cancel each other out exactly, avoiding the asymmetry present in simple returns. For example, a +10% simple return followed by a -10% simple return does not bring the price back to its start, but a log return of +x followed by -x does.

Additionally, log returns tend to follow a normal distribution more closely, making them better suited for many financial models, such as the Black-Scholes option pricing model and risk management techniques. Finally, log returns help mitigate the impact of extreme values, providing more stable and interpretable results in time series analysis and econometrics.
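The time-additivity property above can be checked numerically. The sketch below uses a made-up four-day price series (the values are illustrative only, not project data) to confirm that summing daily log returns recovers the log return over the whole window:

```python
import numpy as np

# Hypothetical four-day price series, used only to illustrate the identity
prices = np.array([100.0, 102.0, 99.0, 101.0])

# Daily log returns: r_t = ln(P_t / P_{t-1})
daily = np.log(prices[1:] / prices[:-1])

# Time-additivity: the sum of daily log returns equals
# the log return over the whole window
total = daily.sum()
direct = np.log(prices[-1] / prices[0])
print(total, direct)  # the two values agree
```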

We define three market states based on the daily log return r_t, using the following thresholds:

  • Bull (1): r_t > 0.005 (more than a 0.5% daily gain)
  • Bear (0): r_t < -0.005 (more than a 0.5% daily loss)
  • Stagnant (2): otherwise

This thresholding provides a simple but effective way to capture distinct market regimes from each day's log return. Applying the thresholds to the series of daily log returns yields one state per day; this state sequence is then used to estimate the transition matrix.
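As a concrete sketch (the function name and the sample returns are our own, not from the project scripts), the thresholding can be vectorized in Python as:

```python
import numpy as np

def classify_states(log_returns, bull=0.005, bear=-0.005):
    """Map each daily log return to a state: 0 = Bear, 1 = Bull, 2 = Stagnant."""
    states = np.full(len(log_returns), 2)   # default: Stagnant
    states[log_returns > bull] = 1          # Bull: gain above 0.5%
    states[log_returns < bear] = 0          # Bear: loss beyond 0.5%
    return states

# Example: four days of log returns
states = classify_states(np.array([0.012, -0.008, 0.001, -0.002]))
print(states)  # [1 0 2 2]
```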

4. Markov Chain Modeling

4.1 Markov Chain
Markov chain modeling is a powerful stochastic framework for analyzing systems that transition between states over time, where the probability of moving to the next state depends only on the current state and not on the sequence of prior events. This is the Markov property:

  P(X_{t+1} = j | X_t = i, X_{t-1} = i_{t-1}, …, X_0 = i_0) = P(X_{t+1} = j | X_t = i)

In financial markets, this approach is particularly useful for modeling market regimes—such as Bear, Bull, or Stagnant states—as it captures the probabilistic nature of market movements. By discretizing market conditions into distinct states and estimating transition probabilities between them, we can gain insights into market behavior, predict future trends, and assess the likelihood of regime shifts.

The Markov Chain framework is especially valuable for its simplicity and interpretability, making it a popular tool for risk management, algorithmic trading, and economic forecasting.

A (discrete-time) Markov chain is defined by:

  • A state space S = {s_1, s_2, …, s_n}

  • A transition probability matrix P, where each entry is P_ij = P(X_{t+1} = j | X_t = i)

  • i represents the state on one day

  • j represents the state on the following day

  • P_ij represents the probability of moving from state i to state j

In this study project, we define the Markov chain as follows:

  • State space: S = {0 (Bear), 1 (Bull), 2 (Stagnant)}

  • Transition probability matrix P:

             0 (Bear)  1 (Bull)  2 (Stagnant)
0 (Bear)     p(0→0)    p(0→1)    p(0→2)
1 (Bull)     p(1→0)    p(1→1)    p(1→2)
2 (Stagnant) p(2→0)    p(2→1)    p(2→2)

The transition matrix is the core component of a Markov chain, representing the probabilities of moving from one state to another in a single time step. Each entry P_ij denotes the probability of transitioning from state i (e.g., Bear) to state j (e.g., Bull). The rows of the matrix correspond to the current state, while the columns represent the next possible state, and each row sums to 1, since the probabilities must account for all potential outcomes.

4.2 State Transition Matrix

The transition matrix is estimated by counting, for each day, the transition from the preceding day's state to that day's state, and then normalizing each row of counts. For our data this yields:

          Bear        Bull        Stagnant
Bear      0.28180039  0.36529680  0.35290280
Bull      0.25154062  0.25490196  0.49355742
Stagnant  0.21986532  0.25925926  0.52087542

Each row in the transition matrix sums to 1 since they represent probability distributions.

In this state transition matrix, a Bear market has a 28.18% chance of remaining Bearish, a 36.53% chance of transitioning to a Bull market, and a 35.29% chance of becoming Stagnant. This matrix encapsulates the dynamics of market regimes, allowing us to further compute long-term steady-state probabilities or simulate future market scenarios via the following eigenvalue/vector decomposition and power method.
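The row-normalized counting described above can be sketched in Python (this mirrors the MATLAB compute_transition_matrix in the appendix; the toy state sequence is our own):

```python
import numpy as np

def transition_matrix(states, n_states=3):
    """Count day-to-day transitions and normalize each row into probabilities."""
    counts = np.zeros((n_states, n_states))
    for prev, curr in zip(states[:-1], states[1:]):
        counts[prev, curr] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Avoid dividing rows with no observed transitions
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Toy state sequence (0 = Bear, 1 = Bull, 2 = Stagnant)
P = transition_matrix([0, 1, 2, 2, 1, 0, 2])
print(P)  # each observed row sums to 1
```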

5. Eigenvalue Decomposition

5.1 Eigenvalue Decomposition

Eigenvalue decomposition provides a powerful tool for analyzing Markov chains by examining the spectral properties of their transition matrices. Given a transition matrix P, we decompose it through its left eigenvectors and eigenvalues, solving

  π P = λ π

  • π represents a left eigenvector (a row vector)
  • λ is the corresponding eigenvalue.

The steady-state distribution corresponds to the eigenvector associated with the dominant eigenvalue λ = 1, while the spectral gap—the difference between the largest and second-largest eigenvalues—determines the convergence rate of the Markov chain.
This approach allows us to quantify long-term behavior and mixing properties, which is essential for applications in network analysis, stochastic modeling, and Monte Carlo simulations.
Applying eigenvalue decomposition to the transition matrix above, we obtain the eigenvalues λ of P:
λ₁ = 1.0
λ₂ ≈ 0.09555295
λ₃ ≈ -0.03797518
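These values can be reproduced with NumPy (the matrix entries are the ones estimated in Section 4.2). Since left eigenvectors of P are right eigenvectors of its transpose, we decompose P.T:

```python
import numpy as np

P = np.array([[0.28180039, 0.36529680, 0.35290280],
              [0.25154062, 0.25490196, 0.49355742],
              [0.21986532, 0.25925926, 0.52087542]])

# Left eigenvectors of P = right eigenvectors of P.T
eigvals, eigvecs = np.linalg.eig(P.T)

# The dominant eigenvalue of a row-stochastic matrix is 1
idx = np.argmax(eigvals.real)
print(eigvals[idx].real)  # ~1.0

# Normalizing the matching eigenvector gives the steady-state distribution
pi = eigvecs[:, idx].real
pi = pi / pi.sum()
print(pi)  # ~[0.244, 0.284, 0.472]
```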

5.2 Steady-State Distribution Analysis

In Markov chains, the largest eigenvalue is λ₁ = 1. The corresponding normalized left eigenvector represents the steady-state distribution, indicating the long-run proportion of time the system spends in each market state.

Normalizing the left eigenvector corresponding to the eigenvalue λ₁ = 1 so that its entries sum to 1, we obtain the steady-state distribution:

[Bear, Bull, Stagnant] ≈ [ 24.4%, 28.4%, 47.2%]

This implies that over the long term the market spends comparable fractions of time in the Bull and Bear states (Bull about 4 percentage points more likely than Bear), and is most often Stagnant, at 47.2%.

5.3 Spectral Gap Analysis

Let λ₂ be the second-largest eigenvalue in magnitude. The spectral gap γ quantifies convergence speed and is defined as:

  γ = 1 - |λ₂|

  • Large spectral gap γ: Fast convergence to steady state.
  • Small spectral gap γ: Slow mixing; more memory of initial conditions.

In this study, the spectral gap is γ ≈ 1 - 0.09555 ≈ 0.904447.
A large spectral gap (≈ 0.9) means the market reaches steady-state equilibrium rapidly, with regimes lasting shorter periods and transitioning quickly. This implies fast price adjustments to new information and reduced predictability of trend durations.
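Under the definition above, the gap follows directly from the eigenvalue magnitudes (matrix entries taken from Section 4.2):

```python
import numpy as np

P = np.array([[0.28180039, 0.36529680, 0.35290280],
              [0.25154062, 0.25490196, 0.49355742],
              [0.21986532, 0.25925926, 0.52087542]])

# Sort eigenvalue magnitudes in descending order; the largest is 1
mags = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
gamma = mags[0] - mags[1]   # spectral gap = 1 - |lambda_2|
print(gamma)  # ~0.904

# The power-method error decays roughly like |lambda_2|**k, so each
# iteration shrinks the residual by a factor of about 0.096 here
```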

5.4 Computation Expense

We evaluate the computational cost of determining the steady-state distribution and spectral properties of the transition matrix via eigenvalue decomposition. The dominant operation is the eigendecomposition of Pᵀ, which is estimated to require about 10n³ FLOPs, consistent with the asymptotic complexity of the QR algorithm used in standard eigenvalue solvers.

Additional steps include matrix transposition (n² FLOPs), extracting and normalizing the eigenvector corresponding to the unit eigenvalue (O(n) FLOPs), and sorting eigenvalue magnitudes to compute the spectral gap (O(n log n) FLOPs).

The overall FLOP count is dominated by the eigendecomposition, making it the primary computational bottleneck for large matrices. For our 3×3 matrix, the FLOP count recorded by the MATLAB script is about 300.

This analysis underscores the trade-off between accuracy and efficiency in Markov chain analysis, particularly when iterative methods may offer lower-cost alternatives for approximating the steady state in high-dimensional cases.

6. Power Method

6.1 Iteratively get Steady-State Distribution

To avoid an unstable or expensive eigendecomposition (especially with larger state spaces), we also use the power method, which repeatedly applies the transition matrix to a distribution:

  π^(k+1) = π^(k) P

Starting from the uniform distribution π^(0) = (1/3, 1/3, 1/3), we iterate until convergence; the result approximates the steady-state distribution. The figure below shows the convergence over iterations.


The above figure illustrates the convergence of the state probabilities (Bear, Bull, and Stagnant) over successive iterations of the power method applied to the transition matrix. Initially the probabilities exhibit variability, but they stabilize as the iteration count increases, demonstrating the method's convergence to the steady-state distribution.

By the 9th iteration, the probabilities approach equilibrium, with Bear settling near 0.24, Bull near 0.28, and Stagnant near 0.47, matching the steady-state distribution obtained from the eigendecomposition. This confirms the power method's efficacy in approximating steady-state behavior, with the convergence rate governed by the transition matrix's spectral gap. The plot aligns with theoretical expectations: repeated vector-matrix multiplication drives the distribution toward the dominant left eigenvector.

The figure depicts the convergence rate of the power method applied to a Markov chain transition matrix, plotted on a logarithmic scale to highlight the exponential decay in distribution changes. The vertical axis (Change in Distribution) shows the magnitude of difference between successive state vectors, while the horizontal axis (Iteration) tracks the progression of the algorithm.

The rapid decline in the curve demonstrates the power method’s fast convergence, particularly in early iterations, followed by stabilization as it approaches the steady-state solution. This behavior reflects the influence of the transition matrix’s spectral gap—the larger the gap, the faster the convergence. The log-scale visualization effectively captures the method’s geometric convergence rate, confirming its suitability for computing steady-state distributions in Markov chains.
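A minimal Python version of the iteration (mirroring the MATLAB power_method in the appendix, with the transition matrix from Section 4.2) is:

```python
import numpy as np

def power_method(P, tol=1e-8, max_iter=1000):
    """Iterate pi <- pi @ P from a uniform start until the update is below tol."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)          # uniform initial distribution
    for _ in range(max_iter):
        pi_new = pi @ P               # one step of the chain
        if np.linalg.norm(pi_new - pi) < tol:
            return pi_new
        pi = pi_new
    return pi

P = np.array([[0.28180039, 0.36529680, 0.35290280],
              [0.25154062, 0.25490196, 0.49355742],
              [0.21986532, 0.25925926, 0.52087542]])
pi = power_method(P)
print(pi)  # ~[0.244, 0.284, 0.472], matching the eigenvector result
```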

6.2 Computation Expense

The power method's computational cost is dominated by the vector-matrix multiplication step, which requires about 2n² FLOPs per iteration (each of the n output elements involves n multiplications and n-1 additions). Additional operations include tracking convergence, where calculating the L2-norm of the difference between successive vectors contributes O(n) FLOPs (subtraction, squaring, and summation).

For k iterations until convergence, the total FLOP cost scales as O(kn²), i.e., O(n²) per iteration. This quadratic dependence on n makes the method efficient for sparse or moderately sized matrices, particularly when the spectral gap is large (which reduces k).
The FLOP count recorded by the MATLAB script is about 219.

7. Comparison of both methods

The computational expense of finding a Markov chain’s steady state distribution varies significantly between eigendecomposition and power method approaches.

Eigendecomposition, with its O(n³) complexity requiring approximately 10n³ floating-point operations, represents a fixed computational cost regardless of the transition matrix’s properties.

In contrast, the power method scales as O(kn²), where k is the number of iterations and each iteration demands roughly 2n² operations for matrix-vector multiplication. This creates an interesting efficiency crossover: for small matrices (n < 50), eigendecomposition often outperforms due to its relatively low constant factor, while for larger systems, the power method becomes more attractive, particularly when the spectral gap is substantial.

The convergence rate of the power method is governed by the ratio of the second-largest to largest eigenvalue (|λ₂/λ₁|), making it exceptionally efficient for systems with well-separated eigenvalues but potentially prohibitively slow for nearly-degenerate systems where |λ₂| approaches |λ₁|. Memory requirements also differ substantially, with eigendecomposition demanding O(n²) storage for eigenvectors, while the power method requires only O(n) working memory plus any desired iteration history.

Beyond raw performance metrics, the methods differ in numerical stability and implementation complexity—eigendecomposition leverages highly optimized linear algebra libraries but may encounter difficulties with degenerate eigenvalues, whereas the power method remains conceptually straightforward and robust for most problems despite its iterative nature.
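Using the rough per-operation counts from this report (the constants 10n³ and 2kn² are estimates, not measured costs), the break-even point can be computed directly: the two methods cost the same when k = 5n iterations are needed.

```python
def eig_flops(n):
    """Rough eigendecomposition cost estimate used in the report: ~10*n^3."""
    return 10 * n**3

def power_flops(n, k):
    """Rough power-method cost: ~2*n^2 FLOPs per iteration, k iterations."""
    return 2 * k * n**2

# For the 3-state chain, the methods break even at k = 5 * 3 = 15 iterations;
# with fewer iterations than that, the power method is cheaper
n = 3
print(eig_flops(n), power_flops(n, 15))  # 270 270
```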

8. Limitations and Future Work

Simplified classification: Real markets involve volatility, volume, macroeconomic factors, and sentiment; these signals could also contribute to classifying states and predicting market state transitions.

Fixed transition probabilities: The assumption of stationarity is a simplification; in reality, transition probabilities evolve over time.

Better models: Hidden Markov Models (HMMs) or neural networks could learn latent regimes and non-Markovian behavior; exploring these is a promising direction for future work.

9. Conclusion

This project demonstrates how Markov chains, a fundamental topic in matrix computation, can be applied to model and analyze financial market transitions. Through eigenanalysis and iterative approximation methods, we gain insight into long-run behavior and system stability, supporting the practical application of matrix theory in finance.

10. Appendix

% Predicting Stock Market States Using Markov Chains (Bull, Bear, Stagnant)
% Step 2: Classify Market States (Return Threshold-Based with Stagnant State)
function states = classify_market_states(returns, bull_threshold, bear_threshold)
if nargin < 2
bull_threshold = 0.005;
end
if nargin < 3
bear_threshold = -0.005;
end

states = zeros(length(returns), 1);
for i = 1:length(returns)
r = returns(i);
if r > bull_threshold
states(i) = 1; % Bull
elseif r < bear_threshold
states(i) = 0; % Bear
else
states(i) = 2; % Stagnant
end
end
end
% Step 3: Compute State Transition Matrix
function P = compute_transition_matrix(states, n_states)
if nargin < 2
n_states = 3;
end

P = zeros(n_states, n_states);
for i = 2:length(states)
prev_state = states(i-1) + 1; % MATLAB is 1-indexed
curr_state = states(i) + 1; % MATLAB is 1-indexed
P(prev_state, curr_state) = P(prev_state, curr_state) + 1;
end

% Normalize rows to get probabilities
row_sums = sum(P, 2);
for i = 1:n_states
if row_sums(i) > 0
P(i, :) = P(i, :) / row_sums(i);
end
end
end
% Step 4: Analyze Transition Matrix with FLOPS counting
function [analysis, flops] = analyze_transition_matrix(P)
flops = 0;
n = size(P, 1);

% FLOPS for transpose operation
flops = flops + n^2;

% Eigendecomposition
% A common estimate for eig is ~10*n^3 FLOPS
[eigenvectors, eigenvalues] = eig(P');
flops = flops + 10*n^3;

eigenvalues = diag(eigenvalues);
flops = flops + n; % Extracting diagonal

% Finding max eigenvalue
[~, idx] = max(real(eigenvalues));
flops = flops + 2*n; % n comparisons and n real part calculations

% Extract steady state
steady_state = real(eigenvectors(:, idx));
flops = flops + n; % Real part calculation

% Normalize
sum_val = sum(steady_state);
flops = flops + (n - 1); % Sum operation
steady_state = steady_state / sum_val;
flops = flops + n; % Division

% Sorting eigenvalues
sorted_eigenvalues = sort(abs(eigenvalues), 'descend');
flops = flops + n; % abs calculation
flops = flops + n*log(n); % Sorting complexity

% Calculating spectral gap
spectral_gap = 1 - sorted_eigenvalues(2);
flops = flops + 1;

analysis = struct(...
    'transition_matrix', P, ...
    'eigenvalues', eigenvalues, ...
    'steady_state', steady_state, ...
    'spectral_gap', spectral_gap);
end
% Step 5: Power Method to Estimate Steady-State with Convergence History and FLOPS counting
function [v, history, flops] = power_method(P, iterations, tol)
flops = 0;
if nargin < 2
iterations = 1000;
end
if nargin < 3
tol = 1e-8;
end

n = size(P, 1);
v = ones(1, n) / n;
flops = flops + n; % Initialization division

% Initialize history matrix to store all iterations
% Each row will be a distribution vector at that iteration
history = zeros(iterations+1, n);
history(1, :) = v; % Store initial distribution

for i = 1:iterations
% Matrix-vector multiplication: v * P
% Each element requires n multiplications and n-1 additions
v_new = v * P;
flops = flops + n * (2*n - 1); % n*(n multiplications + (n-1) additions)

    history(i+1, :) = v_new; % Store the new distribution

    % Calculate norm(v_new - v)
    % Vector subtraction: n operations
    % L2 norm: n multiplications, n-1 additions, 1 square root
    diff = v_new - v;
    flops = flops + n; % Vector subtraction

    norm_diff = norm(diff);
    flops = flops + 2*n; % Squared values and sum

    if norm_diff < tol
        % Trim the history to the actual number of iterations used
        history = history(1:i+1, :);
        break;
    end

    v = v_new;

end
end
% Function to plot convergence trajectory
function plot_convergence_history(history)
iterations = size(history, 1) - 1;

% Create figure
figure;

% Plot each state’s probability over iterations
hold on;
plot(0:iterations, history(:, 1), 'r-', 'LineWidth', 2);
plot(0:iterations, history(:, 2), 'g-', 'LineWidth', 2);
plot(0:iterations, history(:, 3), 'b-', 'LineWidth', 2);
hold off;

% Add labels and title
xlabel('Iteration');
ylabel('Probability');
title('Convergence of State Probabilities');
legend({'Bear', 'Bull', 'Stagnant'}, 'Location', 'east');
grid on;

% Calculate convergence metric (norm of differences between consecutive distributions)
diffs = zeros(iterations, 1);
for i = 1:iterations
diffs(i) = norm(history(i+1, :) - history(i, :));
end

% Create second plot for convergence rate
figure;
semilogy(1:iterations, diffs, 'k-', 'LineWidth', 2);
xlabel('Iteration');
ylabel('Change in Distribution (log scale)');
title('Convergence Rate');
grid on;
end
% Step 6: Run the Full System
function run_market_model()
% Read CSV file
data = readtable('sp500_20y.csv');
log_returns = data.LogReturn;

% Classify states
states = classify_market_states(log_returns);

% Compute transition matrix
P = compute_transition_matrix(states);

% Analyze transition matrix
[analysis, flops_eigen_decomp] = analyze_transition_matrix(P);

% Compute steady state using power method
[power_steady_state, history, flops_power_method] = power_method(P);

% Plot the convergence history
plot_convergence_history(history);
% Display results
fprintf('\nTransition Matrix:\n');
disp(analysis.transition_matrix);

fprintf('\nEigenvalues:\n');
disp(analysis.eigenvalues);
disp(['FLOPS costed: ', num2str(flops_eigen_decomp)])

fprintf('\nSteady-State Distribution (Eigenvector):\n');
disp(analysis.steady_state);

fprintf('\nSpectral Gap:\n');
disp(analysis.spectral_gap);
fprintf('\nSteady-State Distribution (Power Method):\n');
disp(power_steady_state');
disp(['FLOPS costed: ', num2str(flops_power_method)])

end
% Script entry point
run_market_model();