Note
Download this Jupyter notebook and all data (unzip next to the ipynb file!). You will need a Gurobi license to run this notebook; please follow the license instructions.
Mean-Variance Data Preparation
Import time series data
[2]:
import pandas as pd
import numpy as np
import scipy.linalg as la
The primary data used for mean-variance portfolio analysis are estimates of the return and risk of the involved assets. Assuming that the excess returns of the assets are normally distributed, we need to estimate the parameters of the underlying distribution. The methods used for this vary greatly and depend on the nature of the available data. Commonly used return/risk models are:
Estimates from time series
Single-factor models such as CAPM
Commercial multi-factor models, such as those from Axioma, Barra, or others
Our primary goal here is to generate sensible, realistic data for demonstrating the modeling and solving features of Gurobi. To keep things simple, we will use a simple variant of option 1 above and reuse the resulting data throughout most of our examples.
The starting point for our estimators is the weekly closing prices of a set of 462 stocks from the S&P 500 index, collected over a time span of ten years:
[3]:
df_ts = pd.read_pickle("subset_weekly_closings_10yrs.pkl")
df_ts.head()
[3]:
| | APTV | DVN | HSY | CAG | HST | LUV | MMC | BA | VRSK | TSLA | ... | GL | CPB | STLD | BG | TDG | AEE | AAPL | AIZ | UNP | K |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2013-01-07 | 29.253906 | 37.565445 | 57.688988 | 16.892731 | 11.112709 | 9.571073 | 28.727692 | 63.796139 | 51.355377 | 2.289333 | ... | 31.923758 | 26.039112 | 11.071469 | 55.641228 | 76.824593 | 21.690004 | 16.001547 | 28.292595 | 50.840179 | 36.513779 |
2013-01-14 | 28.817282 | 38.202770 | 57.806736 | 16.965885 | 11.456966 | 10.055566 | 28.361572 | 62.120152 | 52.218086 | 2.194000 | ... | 32.215347 | 25.665024 | 11.382249 | 55.746841 | 74.700508 | 21.788601 | 15.325012 | 28.348913 | 51.520897 | 36.726971 |
2013-01-21 | 29.330509 | 38.431908 | 59.792229 | 17.483582 | 11.395001 | 10.128694 | 28.304636 | 62.296135 | 52.189018 | 2.260000 | ... | 32.464428 | 26.156462 | 11.584254 | 57.157295 | 74.183243 | 22.091415 | 14.841517 | 29.201626 | 51.987900 | 36.843266 |
2013-01-28 | 29.606276 | 38.875881 | 60.592682 | 17.911243 | 11.567130 | 10.384659 | 28.564873 | 62.011208 | 52.848152 | 2.346000 | ... | 33.065845 | 26.589228 | 11.654181 | 58.703533 | 75.316818 | 22.429447 | 13.435313 | 30.528965 | 52.585495 | 37.579758 |
2013-02-04 | 29.613934 | 40.787815 | 61.047863 | 18.316059 | 11.298610 | 10.220111 | 28.654951 | 61.717922 | 53.129269 | 2.500667 | ... | 33.642952 | 26.523214 | 11.747411 | 59.254131 | 74.529922 | 22.830845 | 13.509834 | 30.697912 | 51.912697 | 37.534512 |
5 rows × 462 columns
Next, we convert the absolute closing prices to relative changes (“excess returns”), in units of percent. We use the Pandas pct_change function to compute all the ratios in one pass; this results in a leading row containing NaN, which we clean up by simply dropping it. Depending on the context, it may be more meaningful to use log returns instead of the plain returns here. For the illustrative purpose of this data, however, this technical aspect doesn’t make a difference.
[4]:
df_r = df_ts.pct_change().iloc[1:] * 100
df_r.head()
[4]:
| | APTV | DVN | HSY | CAG | HST | LUV | MMC | BA | VRSK | TSLA | ... | GL | CPB | STLD | BG | TDG | AEE | AAPL | AIZ | UNP | K |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2013-01-14 | -1.492534 | 1.696573 | 0.204109 | 0.433053 | 3.097871 | 5.062058 | -1.274448 | -2.627098 | 1.679881 | -4.164230 | ... | 0.913394 | -1.436640 | 2.807031 | 0.189812 | -2.764850 | 0.454571 | -4.227933 | 0.199057 | 1.338936 | 0.583867 |
2013-01-21 | 1.780971 | 0.599793 | 3.434708 | 3.051396 | -0.540850 | 0.727237 | -0.200751 | 0.283295 | -0.055667 | 3.008203 | ... | 0.773174 | 1.914816 | 1.774741 | 2.530105 | -0.692452 | 1.389784 | -3.154939 | 3.007920 | 0.906434 | 0.316647 |
2013-01-28 | 0.940203 | 1.155221 | 1.338724 | 2.446077 | 1.510563 | 2.527130 | 0.919414 | -0.457376 | 1.262974 | 3.805308 | ... | 1.852543 | 1.654528 | 0.603640 | 2.705233 | 1.528075 | 1.530150 | -9.474801 | 4.545429 | 1.149489 | 1.998987 |
2013-02-04 | 0.025866 | 4.918046 | 0.751215 | 2.260120 | -2.321409 | -1.584529 | 0.315347 | -0.472955 | 0.531932 | 6.592802 | ... | 1.745325 | -0.248271 | 0.799964 | 0.937930 | -1.044781 | 1.789601 | 0.554666 | 0.553400 | -1.279437 | -0.120400 |
2013-02-11 | -0.905307 | 1.983842 | 2.866628 | 2.136201 | 0.670313 | 0.804988 | 3.286669 | 3.014293 | -0.948740 | 1.652877 | ... | -0.631966 | 1.963475 | 0.529126 | -8.197570 | 2.953356 | 0.586077 | 3.398002 | 0.602701 | 0.632769 | 0.688484 |
5 rows × 462 columns
For example, the weekly return rate for Aptiv PLC (“APTV”) closing on 2013-01-14 was roughly -1.49%, and the following week +1.78%.
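As an aside, if log returns were preferred over plain percentage changes, they could be computed in much the same way. The following sketch is purely illustrative and is not used in the remainder of this notebook:
# Illustrative only: weekly log returns in percent (not used further below)
df_r_log = np.log(df_ts).diff().iloc[1:] * 100
df_r_log.head()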
Compute mean and covariance matrix
With the time series data for the stocks at hand, we use the Pandas mean
and cov
functions to compute the average returns and their covariance matrix.
[5]:
mu = df_r.mean()
sigma = df_r.cov()
print(mu.head())
print(sigma.head())
APTV 0.308637
DVN 0.247540
HSY 0.246571
CAG 0.144669
HST 0.153303
dtype: float64
APTV DVN HSY CAG HST LUV \
APTV 26.597463 14.778097 3.064101 2.264344 11.601311 11.322953
DVN 14.778097 42.022493 2.305614 3.309097 11.857115 8.160965
HSY 3.064101 2.305614 6.338661 3.081948 2.445438 2.462402
CAG 2.264344 3.309097 3.081948 10.795454 2.251297 2.013963
HST 11.601311 11.857115 2.445438 2.251297 15.706045 9.121264
MMC BA VRSK TSLA ... GL CPB \
APTV 6.545988 15.774890 6.146144 17.156187 ... 8.749196 0.351038
DVN 5.611923 12.794991 4.319937 10.399214 ... 9.758414 0.809340
HSY 2.577923 3.890774 2.662696 4.427763 ... 2.459988 3.089025
CAG 2.219121 3.173212 2.048475 3.609194 ... 2.640858 4.392320
HST 4.629432 10.313413 3.824482 7.838277 ... 6.979635 0.363426
STLD BG TDG AEE AAPL AIZ \
APTV 12.383185 7.251853 13.716921 2.616640 7.353120 7.046057
DVN 15.489979 10.290904 10.162485 1.991456 5.870277 7.621633
HSY 2.458087 2.602136 3.051838 2.729324 2.244011 2.425771
CAG 1.946567 2.809631 2.010134 2.376624 2.817637 2.035238
HST 8.820236 5.097961 8.612474 2.802346 4.067069 6.014143
UNP K
APTV 8.246912 2.312723
DVN 10.036320 2.279012
HSY 2.248228 3.139629
CAG 2.040563 4.261462
HST 5.758568 1.797050
[5 rows x 462 columns]
The vector \(\mu\) and matrix \(\Sigma\) could be used directly in a typical Markowitz mean-variance optimization model. However, the uncertainty and sensitivity, especially of the return estimator \(\mu\), pose a problem: Optimization models exploit differences among the estimated returns, but they don’t have a holistic view of the data that could differentiate between significant differences and sampling or estimation noise. Next, we will present a standard technique that mitigates this effect to some degree.
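To get a feeling for how noisy the plain return estimates are, we can compare them to the standard error of the weekly mean return. This small check is an illustrative addition and not part of the original workflow:
# Illustrative addition: standard error of the estimated mean returns
std_err = df_r.std() / np.sqrt(len(df_r))
print(pd.DataFrame({"mean": mu, "std_err": std_err}).head())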
Applying shrinkage to the return estimator
The basic idea of shrinking the return estimator is to combine it with another uniformly biased but noiseless estimator. Mathematically, we will choose a scalar \(0 < w < 1\) (shrinkage weight) and define the shrinkage estimator as
\begin{equation*} \mu_s = (1 - w) \mu + w \mu_0 1_n \end{equation*}
where \(1_n\) is the all-ones vector of appropriate dimension. The vector \(\mu_0 1_n\) is called the shrinkage target. There are plenty of established ways to choose shrinkage weight and target; here, we follow the work of Jorion [1] (see also [2, Sec. 7.2]).
For the shrinkage target \(\mu_0 1_n\) we use the value
\begin{equation*} \mu_0 = \frac{\mu^\top \Sigma^{-1} 1_n}{ 1_n^\top \Sigma^{-1} 1_n} \tag{1} \end{equation*}
and compute the shrinkage weight as
\begin{equation*} w = \frac{n+2}{n + 2 + t (\mu - \mu_0 1_n)^\top \Sigma^{-1} (\mu - \mu_0 1_n)} \tag{2} \end{equation*}
A technical problem with this approach is that it assumes that \(\Sigma\) is the actual covariance matrix of the return. But generally, this is not known exactly and is estimated from the time series, too. In order to reduce the bias towards the estimation error in \(\Sigma\), a common remedy is to use the scaled matrix
\begin{equation*} \frac{t - 1}{t - n - 2} \Sigma \end{equation*}
instead of \(\Sigma\) itself (\(t\) is the number of time samples, \(n\) is the number of assets). As the ratio between data points and assets increases and the estimation becomes more reliable, this scaling factor approaches 1, so the scaled matrix approaches the estimate \(\Sigma\) itself. Hence we start the computation of \(\mu_0\) and \(w\) as follows:
[6]:
t, n = df_r.shape
sigma_shrink = (t - 1) / (t - n - 2) * sigma
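As a quick sanity check (an illustrative addition), we can confirm that \(t > n + 2\), so that the scaling factor is well defined and positive:
# Illustrative check: the scaling requires more time samples than assets + 2
assert t > n + 2, "need t > n + 2 for the scaling to be meaningful"
print(f"t = {t}, n = {n}, scaling factor = {(t - 1) / (t - n - 2):.3f}")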
Next, we compute \(\mu_0\). As a first step, we compute the intermediate quantity
\begin{equation*} z = \Sigma^{-1} 1_n \end{equation*}
Then, following formula (1), we can obtain \(\mu_0\) as
\begin{equation*} \mu_0 = \frac{\mu^\top z}{1_n^\top z} \end{equation*}
[7]:
z = la.solve(sigma_shrink, np.ones(n), assume_a="pos")
mu_0 = mu @ z / z.sum()
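Because \(z\) comes from a linear solve rather than an explicit matrix inverse, it can be reassuring to verify the residual; this check is again just an illustrative addition:
# Illustrative check: z should satisfy sigma_shrink @ z = 1_n
print(np.allclose(sigma_shrink @ z, np.ones(n)))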
We split apart the computation of \(w\) in (2) above by first computing the temporary vectors
\begin{align*} d & = \mu - \mu_0 1_n \\ y & = \Sigma^{-1} d \end{align*}
so that \(w\) can be computed as
\begin{equation*} w = \frac{n+2}{n + 2 + t\, d^\top y} \end{equation*}
[8]:
d = mu - mu_0 * np.ones(n)
y = la.solve(sigma_shrink, d, assume_a="pos")
w = (n + 2) / (n + 2 + t * d @ y)
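Before applying the weight, it can be helpful to inspect it; a value close to 1 pulls the estimator strongly towards the shrinkage target, while a value close to 0 leaves it almost unchanged (again, just an illustrative addition):
# Illustrative: inspect the shrinkage weight
print(f"Shrinkage weight w = {w:.4f}")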
Finally, we compute \(\mu_s\) as follows:
[9]:
mu_s = (1 - w) * mu + w * mu_0 * np.ones(n)
[10]:
print(f"Total change in mu: {la.norm(mu - mu_s, 2) / la.norm(mu, 2)}")
print(f"Max change in mu: {(mu - mu_s).abs().max()}")
Total change in mu: 0.3272560076737803
Max change in mu: 0.541640891206755
For this particular data set, this shrinkage results in a relative change of about 33% to the return estimator.
For future use, we save the data as Python pickle files.
[11]:
mu_s.to_pickle("mu.pkl")
sigma.to_pickle("sigma.pkl")
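In later notebooks, these files can simply be read back with pd.read_pickle; a minimal sketch:
# Sketch: reading the saved estimates back in a follow-up notebook
mu = pd.read_pickle("mu.pkl")
sigma = pd.read_pickle("sigma.pkl")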
Literature
[1] Jorion, Philippe, 1986. “Bayes-Stein Estimation for Portfolio Analysis,” Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 21(3), pages 279-292, September.
[2] Cornuéjols, G., Peña, J., Tütüncü, R. 2018. Optimization Methods in Finance. Second Edition. Cambridge University Press.
[3] Markowitz, H.M. 1952. “Portfolio Selection.” The Journal of Finance, 7 (1): 77–91.