{ "cells": [ { "cell_type": "markdown", "id": "409dea88", "metadata": {}, "source": [ "# Enforcing Diversification\n", "\n", "The *standard mean-variance (Markowitz) portfolio selection model* determines an optimal investment portfolio that balances risk and expected return. In this notebook, we minimize the variance (risk) of the portfolio, constraining the expected return to meet a prescribed minimum level. Please refer to the [annotated list of references](../literature.rst#portfolio-optimization) for more background information on portfolio optimization.\n", "\n", "To this basic model, we add two types of *diversification constraints*:\n", "\n", "* Holdings must be diversified across a specified minimum number of assets.\n", "* Positions must fall between specified lower and upper bounds. This ensures that no single asset makes up more than a pre-specified percentage of the portfolio's value." ] }, { "cell_type": "code", "execution_count": 1, "id": "2d8041e1", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:00.280904Z", "iopub.status.busy": "2025-01-31T10:04:00.280675Z", "iopub.status.idle": "2025-01-31T10:04:01.064750Z", "shell.execute_reply": "2025-01-31T10:04:01.063987Z" }, "nbsphinx": "hidden" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: numpy in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (2.2.2)\r\n", "Requirement already satisfied: scipy in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (1.15.1)\r\n", "Requirement already satisfied: gurobipy in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (11.0.3)\r\n", "Requirement already satisfied: pandas in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (2.2.3)\r\n", "Requirement already satisfied: matplotlib in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (3.10.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", 
"text": [ "Requirement already satisfied: python-dateutil>=2.8.2 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from pandas) (2.9.0.post0)\r\n", "Requirement already satisfied: pytz>=2020.1 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from pandas) (2025.1)\r\n", "Requirement already satisfied: tzdata>=2022.7 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from pandas) (2025.1)\r\n", "Requirement already satisfied: contourpy>=1.0.1 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from matplotlib) (1.3.1)\r\n", "Requirement already satisfied: cycler>=0.10 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from matplotlib) (0.12.1)\r\n", "Requirement already satisfied: fonttools>=4.22.0 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from matplotlib) (4.55.8)\r\n", "Requirement already satisfied: kiwisolver>=1.3.1 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from matplotlib) (1.4.8)\r\n", "Requirement already satisfied: packaging>=20.0 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from matplotlib) (24.2)\r\n", "Requirement already satisfied: pillow>=8 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from matplotlib) (11.1.0)\r\n", "Requirement already satisfied: pyparsing>=2.3.1 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from matplotlib) (3.2.1)\r\n", "Requirement already satisfied: six>=1.5 in /opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "# Install dependencies\n", "%pip install numpy scipy gurobipy pandas matplotlib" ] }, { "cell_type": "code", "execution_count": 2, "id": "2c6aafe8", 
"metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.066963Z", "iopub.status.busy": "2025-01-31T10:04:01.066752Z", "iopub.status.idle": "2025-01-31T10:04:01.388375Z", "shell.execute_reply": "2025-01-31T10:04:01.387680Z" } }, "outputs": [], "source": [ "import gurobipy as gp\n", "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 3, "id": "4a7a204a", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.390584Z", "iopub.status.busy": "2025-01-31T10:04:01.390332Z", "iopub.status.idle": "2025-01-31T10:04:01.398455Z", "shell.execute_reply": "2025-01-31T10:04:01.397877Z" }, "nbsphinx": "hidden" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Set parameter WLSAccessID\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Set parameter WLSSecret\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Set parameter LicenseID to value 2443533\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "WLS license 2443533 - registered to Gurobi GmbH\n" ] } ], "source": [ "# Hidden cell to avoid licensing messages\n", "# when docs are generated.\n", "with gp.Model():\n", " pass" ] }, { "cell_type": "markdown", "id": "1e5f7ea3", "metadata": {}, "source": [ "## Input Data\n", "\n", "The following input data is used within the model:\n", "\n", "- $S$: set of stocks\n", "- $\\mu$: vector of expected returns\n", "- $\\Sigma$: PSD variance-covariance matrix\n", " - $\\sigma_{ij}$ covariance between returns of assets $i$ and $j$\n", " - $\\sigma_{ii}$ variance of return of asset $i$" ] }, { "cell_type": "code", "execution_count": 4, "id": "2af1a786", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.400351Z", "iopub.status.busy": "2025-01-31T10:04:01.400167Z", "iopub.status.idle": "2025-01-31T10:04:01.405937Z", "shell.execute_reply": "2025-01-31T10:04:01.405376Z" } }, "outputs": [], "source": [ "# Import some example data set\n", "Sigma 
= pd.read_pickle(\"sigma.pkl\")\n", "mu = pd.read_pickle(\"mu.pkl\")" ] }, { "cell_type": "markdown", "id": "3f8b8975", "metadata": {}, "source": [ "## Formulation\n", "The model minimizes the variance of the portfolio subject to attaining the required minimum expected return and meeting the portfolio diversification requirements. Mathematically, this results in a convex quadratic mixed-integer optimization problem.\n", "\n", "### Model Parameters\n", "\n", "We use the following parameters:\n", "\n", "- $\\bar\\mu$: required expected portfolio return\n", "- $K$: minimum number of open positions in the portfolio\n", "- $\\ell > 0$: lower bound on the size of any open position\n", "- $u \\leq 1$: upper bound on position size" ] }, { "cell_type": "code", "execution_count": 5, "id": "a9501c4f", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.407873Z", "iopub.status.busy": "2025-01-31T10:04:01.407671Z", "iopub.status.idle": "2025-01-31T10:04:01.410898Z", "shell.execute_reply": "2025-01-31T10:04:01.410302Z" } }, "outputs": [], "source": [ "# Values for the model parameters:\n", "r = 0.25 # Required expected return (mu-bar in the formulation)\n", "K = 35 # Minimum number of stocks held\n", "u = 0.15 # Maximum position size\n", "l = 0.005 # Minimum size of an open position (ell in the formulation)" ] }, { "cell_type": "markdown", "id": "ebe012de", "metadata": {}, "source": [ "### Decision Variables\n", "We need two sets of decision variables:\n", "\n", "1. The proportion of capital invested in each stock. The corresponding vector of positions is denoted by $x$, with its component $x_i$ denoting the proportion of capital invested in stock $i$.\n", "\n", "2. Binary variables $b_i$ indicating whether or not asset $i$ is held. 
If $b_i$ is 0, the holding $x_i$ is also 0; otherwise if $b_i$ is 1, the investor holds asset $i$ (that is, $x_i \\geq \\ell$).\n", "\n", "### Variable Bounds\n", "\n", "Each position $x_i$ must be between 0 and $u$; this prevents large positions as well as leverage and short-selling:\n", "\n", "\\begin{equation*}\n", "0\\leq x_i\\leq u \\; , \\; i \\in S\\tag{1}\n", "\\end{equation*}\n", "\n", "The $b_i$ must be binary:\n", "\n", "\\begin{equation*}\n", "b_i \\in \\{0,1\\} \\; , \\; i \\in S\n", "\\end{equation*}" ] }, { "cell_type": "code", "execution_count": 6, "id": "3f897ab9", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.412671Z", "iopub.status.busy": "2025-01-31T10:04:01.412499Z", "iopub.status.idle": "2025-01-31T10:04:01.418181Z", "shell.execute_reply": "2025-01-31T10:04:01.417571Z" } }, "outputs": [], "source": [ "%%capture\n", "# Create an empty optimization model\n", "m = gp.Model()\n", "\n", "# Add variables: x[i] is the capital proportion invested in stock i; upper bound is u; see formula (1) above\n", "x = m.addMVar(len(mu), lb=0, ub=u, name=\"x\")\n", "\n", "# Add binary variables: b[i]=1 if stock i is held; b[i]=0 otherwise\n", "b = m.addMVar(len(mu), vtype=gp.GRB.BINARY, name=\"b\")" ] }, { "cell_type": "markdown", "id": "049301f6", "metadata": {}, "source": [ "### Constraints\n", "The budget constraint ensures that all capital is invested:\n", "\n", "\n", "\\begin{equation*}\n", "\\sum_{i \\in S} x_i =1 \\tag{2}\n", "\\end{equation*}\n", "\n", "The expected return of the portfolio must be at least $\\bar\\mu$:\n", "\n", "\\begin{equation*}\n", "\\mu^\\top x \\geq \\bar\\mu\\tag{3}\n", "\\end{equation*}\n", "\n", "\n", "The variable bounds only enforce that each $x_i$ is between $0$ and $u$. 
To enforce the minimal position size, we need the binary variables $b$ and the following constraints linking $x$ and $b$:\n", "\n", "Ensure that $x_i = 0$ if $b_i = 0$:\n", "\n", "\\begin{equation*}\n", "x_i \\leq b_i \\; , \\; i \\in S\\tag{4}\n", "\\end{equation*}\n", "\n", "Note that $x_i$ has an upper bound of $u\\leq 1$. Thus, if $b_i = 1$, the above constraint is non-restrictive.\n", "\n", "\n", "Ensure a minimal position size of $\\ell$ if asset $i$ is held:\n", "\n", "\\begin{equation*}\n", "x_i \\geq \\ell b_i \\; , \\; i \\in S\\tag{5}\n", "\\end{equation*}\n", "\n", "Hence $b_i = 1$ implies $x_i \\geq \\ell$. If $b_i = 0$, this constraint is non-restrictive since $x_i$ has a lower bound of 0.\n", "\n", "\n", "Finally, there must be at least $K$ positions in the portfolio:\n", "\n", "\\begin{equation*}\n", "\\sum_{i \\in S} b_i \\geq K\\tag{6}\n", "\\end{equation*}" ] }, { "cell_type": "code", "execution_count": 7, "id": "e59e10e0", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.419939Z", "iopub.status.busy": "2025-01-31T10:04:01.419765Z", "iopub.status.idle": "2025-01-31T10:04:01.555257Z", "shell.execute_reply": "2025-01-31T10:04:01.554700Z" } }, "outputs": [], "source": [ "%%capture\n", "# Budget constraint: all investments sum up to 1; see formula (2) above\n", "m.addConstr(x.sum() == 1, name=\"Budget_Constraint\")\n", "\n", "# Lower bound on expected return; see formula (3) above\n", "m.addConstr(mu.to_numpy() @ x >= r, name=\"Minimal_Return\")\n", "\n", "# Force x to 0 if the stock is not held; see formula (4) above\n", "m.addConstr(x <= b, name=\"Indicator\")\n", "\n", "# Minimal position size for held stocks; see formula (5) above\n", "m.addConstr(x >= l * b, name=\"Minimal_Position\")\n", "\n", "# Diversification constraint: at least K stocks must be held; see formula (6) above\n", "m.addConstr(b.sum() >= K, name=\"Diversification\")" ] }, { "cell_type": "markdown", "id": "bc4cf38c", "metadata": {}, "source": [ "### Objective Function\n", "The objective 
is to minimize the risk of the portfolio, which is measured by its variance:\n", "\n", "$$\\min_x x^\\top \\Sigma x$$" ] }, { "cell_type": "code", "execution_count": 8, "id": "43652616", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.557977Z", "iopub.status.busy": "2025-01-31T10:04:01.557713Z", "iopub.status.idle": "2025-01-31T10:04:01.577698Z", "shell.execute_reply": "2025-01-31T10:04:01.577115Z" } }, "outputs": [], "source": [ "# Define objective function: Minimize risk\n", "m.setObjective(x @ Sigma.to_numpy() @ x, gp.GRB.MINIMIZE)" ] }, { "cell_type": "markdown", "id": "6c956f5b", "metadata": {}, "source": [ "We now solve the optimization problem:" ] }, { "cell_type": "code", "execution_count": 9, "id": "9a5d0a53", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.580311Z", "iopub.status.busy": "2025-01-31T10:04:01.580099Z", "iopub.status.idle": "2025-01-31T10:04:01.805533Z", "shell.execute_reply": "2025-01-31T10:04:01.804903Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Gurobi Optimizer version 11.0.3 build v11.0.3rc0 (linux64 - \"Ubuntu 24.04.1 LTS\")\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "CPU model: AMD EPYC 7763 64-Core Processor, instruction set [SSE2|AVX|AVX2]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Thread count: 1 physical cores, 2 logical processors, using up to 2 threads\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "WLS license 2443533 - registered to Gurobi GmbH\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Optimize a model with 927 rows, 924 columns and 3234 nonzeros\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Model fingerprint: 0x1c1570fe\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Model has 106953 quadratic objective 
terms\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Variable types: 462 continuous, 462 integer (462 binary)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Coefficient statistics:\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Matrix range [5e-03, 1e+00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Objective range [0e+00, 0e+00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " QObjective range [6e-03, 2e+02]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Bounds range [1e-01, 1e+00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " RHS range [2e-01, 4e+01]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Found heuristic solution: objective 5.8417163\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Presolve time: 0.05s\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Presolved: 927 rows, 924 columns, 3233 nonzeros\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Presolved model has 106953 quadratic objective terms\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Variable types: 462 continuous, 462 integer (462 binary)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Root relaxation: objective 2.026834e+00, 1839 iterations, 0.02 seconds (0.05 work units)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Nodes | Current Node | Objective Bounds | Work\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " 0 0 2.02683 0 8 5.84172 2.02683 65.3% - 0s\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "H 0 
0 2.0423444 2.02683 0.76% - 0s\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " 0 0 2.02695 0 8 2.04234 2.02695 0.75% - 0s\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "H 0 0 2.0270503 2.02695 0.01% - 0s\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Explored 1 nodes (1839 simplex iterations) in 0.21 seconds (0.13 work units)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Thread count was 2 (of 2 available processors)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Solution count 3: 2.02705 2.04234 5.84172 \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Optimal solution found (tolerance 1.00e-04)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Best objective 2.027050298073e+00, best bound 2.026945015707e+00, gap 0.0052%\n" ] } ], "source": [ "m.optimize()" ] }, { "cell_type": "markdown", "id": "b772305a", "metadata": {}, "source": [ "Display basic solution data:" ] }, { "cell_type": "code", "execution_count": 10, "id": "5cb18a94", "metadata": { "execution": { "iopub.execute_input": "2025-01-31T10:04:01.807470Z", "iopub.status.busy": "2025-01-31T10:04:01.807290Z", "iopub.status.idle": "2025-01-31T10:04:01.813551Z", "shell.execute_reply": "2025-01-31T10:04:01.812916Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Minimum Risk: 2.027050\n", "Expected return: 0.250000\n", "Solution time: 0.22 seconds\n", "\n", "Number of stocks: 35.0\n", "\n", "KR 0.029874\n", "EG 0.005000\n", "CTRA 0.005000\n", "PGR 0.045229\n", "CHRW 0.008562\n", "CME 0.032737\n", "ODFL 0.026182\n", "BDX 0.024356\n", "LIN 0.013054\n", "MNST 0.005000\n", "KDP 0.075960\n", "GILD 0.016222\n", "META 0.005006\n", "CLX 0.055330\n", "SJM 0.030111\n", "PG 0.006777\n", "LLY 
0.102443\n", "DPZ 0.052576\n", "MKTX 0.019684\n", "CPRT 0.005000\n", "MRK 0.043846\n", "ED 0.085151\n", "WST 0.021548\n", "TMUS 0.037665\n", "NOC 0.016102\n", "EA 0.005000\n", "MSFT 0.005000\n", "WM 0.044581\n", "TTWO 0.036273\n", "WMT 0.065286\n", "TXN 0.005000\n", "HRL 0.035113\n", "XEL 0.005000\n", "AZO 0.009048\n", "CPB 0.021283\n", "Name: Position, dtype: float64\n" ] } ], "source": [ "print(f\"Minimum Risk: {m.ObjVal:.6f}\")\n", "print(f\"Expected return: {mu @ x.X:.6f}\")\n", "print(f\"Solution time: {m.Runtime:.2f} seconds\\n\")\n", "print(f\"Number of stocks: {sum(b.X)}\\n\")\n", "\n", "# Print investments (with non-negligible value, i.e. >1e-5)\n", "positions = pd.Series(name=\"Position\", data=x.X, index=mu.index)\n", "print(positions[positions > 1e-5])" ] }, { "cell_type": "markdown", "id": "08c7c92d", "metadata": {}, "source": [ "## Takeaways\n", "* Cardinality constraints can be modeled using binary decision variables.\n", "* For diversification, it is essential to have non-zero lower bounds on the position size; otherwise, the cardinality constraint could be satisfied by positions of negligible or even zero size." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 5 }