Note

Download this Jupyter notebook and all data (unzip next to the ipynb file!). You will need a Gurobi license to run this notebook; please follow the license instructions.


Mean-Variance Data Preparation

Import time series data

[2]:
import pandas as pd
import numpy as np
import scipy.linalg as la

The primary data used for mean-variance portfolio analysis are estimates of the return and risk of the involved assets. Assuming that the excess returns of the assets are normally distributed, we need to estimate the parameters of the underlying distribution. The methods used for this vary greatly and depend on the nature of the available data. Commonly used return/risk models are:

  1. Estimates from time series

  2. Single-factor models such as CAPM

  3. Commercial multi-factor models, such as those from Axioma, Barra, or others

Our primary goal here is to generate sensible, realistic data for demonstrating the modeling and solving features of Gurobi. To keep things simple, we will discuss a simple variant of option 1 above and reuse this data throughout most of our examples.

The starting point for our estimators are the weekly closing values for a set of 462 stocks from the S&P500 index, collected over a time span of ten years:

[3]:
df_ts = pd.read_pickle("subset_weekly_closings_10yrs.pkl")
df_ts.head()
[3]:
APTV DVN HSY CAG HST LUV MMC BA VRSK TSLA ... GL CPB STLD BG TDG AEE AAPL AIZ UNP K
2013-01-07 29.253906 37.565445 57.688988 16.892731 11.112709 9.571073 28.727692 63.796139 51.355377 2.289333 ... 31.923758 26.039112 11.071469 55.641228 76.824593 21.690004 16.001547 28.292595 50.840179 36.513779
2013-01-14 28.817282 38.202770 57.806736 16.965885 11.456966 10.055566 28.361572 62.120152 52.218086 2.194000 ... 32.215347 25.665024 11.382249 55.746841 74.700508 21.788601 15.325012 28.348913 51.520897 36.726971
2013-01-21 29.330509 38.431908 59.792229 17.483582 11.395001 10.128694 28.304636 62.296135 52.189018 2.260000 ... 32.464428 26.156462 11.584254 57.157295 74.183243 22.091415 14.841517 29.201626 51.987900 36.843266
2013-01-28 29.606276 38.875881 60.592682 17.911243 11.567130 10.384659 28.564873 62.011208 52.848152 2.346000 ... 33.065845 26.589228 11.654181 58.703533 75.316818 22.429447 13.435313 30.528965 52.585495 37.579758
2013-02-04 29.613934 40.787815 61.047863 18.316059 11.298610 10.220111 28.654951 61.717922 53.129269 2.500667 ... 33.642952 26.523214 11.747411 59.254131 74.529922 22.830845 13.509834 30.697912 51.912697 37.534512

5 rows × 462 columns

Next, we convert the absolute closing prices to relative changes (“excess returns”), in units of percent. We use the Pandas pct_change function to compute all the ratios in one pass; this leaves a leading row of NaN values, which we clean up by simply dropping it. Depending on the context, it may be more meaningful to use log returns instead of the plain returns here (a short sketch follows the output below). For the illustrative purpose of this data, however, this technical aspect doesn’t make a difference.

[4]:
df_r = df_ts.pct_change().iloc[1:] * 100
df_r.head()
[4]:
APTV DVN HSY CAG HST LUV MMC BA VRSK TSLA ... GL CPB STLD BG TDG AEE AAPL AIZ UNP K
2013-01-14 -1.492534 1.696573 0.204109 0.433053 3.097871 5.062058 -1.274448 -2.627098 1.679881 -4.164230 ... 0.913394 -1.436640 2.807031 0.189812 -2.764850 0.454571 -4.227933 0.199057 1.338936 0.583867
2013-01-21 1.780971 0.599793 3.434708 3.051396 -0.540850 0.727237 -0.200751 0.283295 -0.055667 3.008203 ... 0.773174 1.914816 1.774741 2.530105 -0.692452 1.389784 -3.154939 3.007920 0.906434 0.316647
2013-01-28 0.940203 1.155221 1.338724 2.446077 1.510563 2.527130 0.919414 -0.457376 1.262974 3.805308 ... 1.852543 1.654528 0.603640 2.705233 1.528075 1.530150 -9.474801 4.545429 1.149489 1.998987
2013-02-04 0.025866 4.918046 0.751215 2.260120 -2.321409 -1.584529 0.315347 -0.472955 0.531932 6.592802 ... 1.745325 -0.248271 0.799964 0.937930 -1.044781 1.789601 0.554666 0.553400 -1.279437 -0.120400
2013-02-11 -0.905307 1.983842 2.866628 2.136201 0.670313 0.804988 3.286669 3.014293 -0.948740 1.652877 ... -0.631966 1.963475 0.529126 -8.197570 2.953356 0.586077 3.398002 0.602701 0.632769 0.688484

5 rows × 462 columns

For example, the weekly return rate for Aptiv PLC (“APTV”) closing on 2013-01-14 was roughly -1.49%, and the following week +1.78%.
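If log returns were preferred, they could be computed along the following lines. This is only an illustrative sketch (the name df_r_log is ours, not part of the original workflow); the remainder of this notebook keeps the plain percentage returns computed above.

# Weekly log returns in percent; equivalent to log(P_t / P_{t-1}) * 100.
df_r_log = np.log(df_ts / df_ts.shift(1)).iloc[1:] * 100
df_r_log.head()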

Compute mean and covariance matrix

With the time series data for the stocks at hand, we use the Pandas mean and cov functions to compute the average returns and their covariance matrix.

[5]:
mu = df_r.mean()
sigma = df_r.cov()
print(mu.head())
print(sigma.head())
APTV    0.308637
DVN     0.247540
HSY     0.246571
CAG     0.144669
HST     0.153303
dtype: float64
           APTV        DVN       HSY        CAG        HST        LUV  \
APTV  26.597463  14.778097  3.064101   2.264344  11.601311  11.322953
DVN   14.778097  42.022493  2.305614   3.309097  11.857115   8.160965
HSY    3.064101   2.305614  6.338661   3.081948   2.445438   2.462402
CAG    2.264344   3.309097  3.081948  10.795454   2.251297   2.013963
HST   11.601311  11.857115  2.445438   2.251297  15.706045   9.121264

           MMC         BA      VRSK       TSLA  ...        GL       CPB  \
APTV  6.545988  15.774890  6.146144  17.156187  ...  8.749196  0.351038
DVN   5.611923  12.794991  4.319937  10.399214  ...  9.758414  0.809340
HSY   2.577923   3.890774  2.662696   4.427763  ...  2.459988  3.089025
CAG   2.219121   3.173212  2.048475   3.609194  ...  2.640858  4.392320
HST   4.629432  10.313413  3.824482   7.838277  ...  6.979635  0.363426

           STLD         BG        TDG       AEE      AAPL       AIZ  \
APTV  12.383185   7.251853  13.716921  2.616640  7.353120  7.046057
DVN   15.489979  10.290904  10.162485  1.991456  5.870277  7.621633
HSY    2.458087   2.602136   3.051838  2.729324  2.244011  2.425771
CAG    1.946567   2.809631   2.010134  2.376624  2.817637  2.035238
HST    8.820236   5.097961   8.612474  2.802346  4.067069  6.014143

            UNP         K
APTV   8.246912  2.312723
DVN   10.036320  2.279012
HSY    2.248228  3.139629
CAG    2.040563  4.261462
HST    5.758568  1.797050

[5 rows x 462 columns]

The vector \(\mu\) and matrix \(\Sigma\) could be used directly in a typical Markowitz mean-variance optimization model. However, the estimation uncertainty, especially in the return estimator \(\mu\), poses a problem: optimization models exploit differences among the estimated returns, but they have no way to distinguish genuinely significant differences from sampling or estimation noise. Next, we present a standard technique that mitigates this effect to some degree.

Applying shrinkage to the return estimator

The basic idea of shrinking the return estimator is to combine it with another uniformly biased but noiseless estimator. Mathematically, we will choose a scalar \(0 < w < 1\) (shrinkage weight) and define the shrinkage estimator as

\[\mu_s = (1 - w) \mu + w \mu_0 1_n\]

where \(1_n\) is the all-ones vector of appropriate dimension. The vector \(\mu_0 1_n\) is called the shrinkage target. There are plenty of established ways to choose the shrinkage weight and target; here, we follow the work of Jorion [1] (see also [2, Sec. 7.2]).

For the shrinkage target \(\mu_0 1_n\) we use the value

\begin{equation*} \mu_0 = \frac{\mu^\top \Sigma^{-1} 1_n}{ 1_n^\top \Sigma^{-1} 1_n} \tag{1} \end{equation*}

and compute the shrinkage weight as

\begin{equation*} w = \frac{n+2}{n + 2 + t (\mu - \mu_0 1_n)^\top \Sigma^{-1} (\mu - \mu_0 1_n)} \tag{2} \end{equation*}

A technical problem with this approach is that it assumes that \(\Sigma\) is the actual covariance matrix of the returns. In general, however, this matrix is not known exactly and has to be estimated from the time series, too. To reduce the bias introduced by this estimation error in \(\Sigma\), a common remedy is to use the scaled matrix

\begin{equation*} \frac{t - 1}{t - n - 2} \Sigma \end{equation*}

instead of \(\Sigma\) itself (\(t\) is the number of time samples, \(n\) is the number of assets). As the ratio of data points to assets increases and the estimate becomes more reliable, this scaling factor approaches 1, so the scaled matrix approaches the estimate \(\Sigma\) itself. Hence we start the computation of \(\mu_0\) and \(w\) as follows:

[6]:
t, n = df_r.shape
sigma_shrink = (t - 1) / (t - n - 2) * sigma
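As an optional sanity check (not part of the original workflow), one can confirm that sigma_shrink is positive definite, which is what the assume_a="pos" solves below rely on; a Cholesky factorization raises an error otherwise.

# Optional: a Cholesky factorization succeeds only if the matrix is
# (numerically) positive definite; scipy.linalg raises LinAlgError otherwise.
_ = la.cholesky(sigma_shrink.to_numpy(), lower=True)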

Next, we compute \(\mu_0\). As a first step, we compute the intermediate quantity

\begin{equation*} z = \Sigma^{-1} 1_n \end{equation*}

Then, following formula (1), we can obtain \(\mu_0\) as

\begin{equation*} \mu_0 = \frac{\mu^\top z}{1_n^\top z} \end{equation*}

[7]:
z = la.solve(sigma_shrink, np.ones(n), assume_a="pos")
mu_0 = mu @ z / z.sum()

We split apart the computation of \(w\) in (2) above by first computing the temporary vectors

\begin{align*} d & = \mu - \mu_0 1_n \\ y & = \Sigma^{-1} d \end{align*}

so that \(w\) can be computed as

\[w = \frac{n + 2}{n + 2 + t d^\top y}\]
[8]:
d = mu - mu_0 * np.ones(n)
y = la.solve(sigma_shrink, d, assume_a="pos")
w = (n + 2) / (n + 2 + t * d @ y)

Finally, we compute \(\mu_s\) as follows:

[9]:
mu_s = (1 - w) * mu + w * mu_0 * np.ones(n)
[10]:
print(f"Total change in mu: {la.norm(mu - mu_s, 2) / la.norm(mu, 2)}")
print(f"Max   change in mu: {(mu - mu_s).abs().max()}")
Total change in mu: 0.3272560076737803
Max   change in mu: 0.541640891206755

For this particular data set, the shrinkage results in a relative change of about 33% to the return estimator.
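To see how the shrinkage affects individual assets, the two estimators can be compared side by side; each entry of \(\mu_s\) lies between the original estimate and the common value \(\mu_0\).

# Side-by-side comparison of the raw and shrunk return estimates.
pd.DataFrame({"mu": mu, "mu_s": mu_s}).head()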

For future use, we save the data as Python pickle files.

[11]:
mu_s.to_pickle("mu.pkl")
sigma.to_pickle("sigma.pkl")
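In later steps, the saved estimators can be read back with pd.read_pickle, for example (variable names are illustrative):

mu = pd.read_pickle("mu.pkl")        # the shrunk return estimator saved above
sigma = pd.read_pickle("sigma.pkl")  # the covariance matrix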

Literature

[1] Jorion, Philippe, 1986. “Bayes-Stein Estimation for Portfolio Analysis,” Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 21(3), pages 279-292, September.

[2] Cornuéjols, G., Peña, J., Tütüncü, R. 2018. Optimization Methods in Finance. Second Edition. Cambridge University Press.

[3] Markowitz, H.M. 1952. “Portfolio Selection.” The Journal of Finance, 7(1): 77–91.