Formulation
Let’s consider a portfolio of \(n\) risky assets. Let
- \(\mathbf{w} = (w_1, w_2, \ldots, w_n)^T\) be the vector of portfolio weights, where \(w_i\) is the proportion of the total portfolio value invested in asset \(i\).
- \(\mathbf{\mu} = (\mu_1, \mu_2, \ldots, \mu_n)^T\) be the vector of expected returns for each asset.
- \(\mathbf{\Sigma} = (\sigma_{ij})_{n \times n}\) be the \(n \times n\) covariance matrix of asset returns.
- \(\mathbf{1}\) be a vector of ones of length \(n\).
The mean-variance optimization problem can be formulated as the following maximization problem:
\[\begin{aligned}
\max_{\mathbf{w}} \quad & \mathbf{w}^T \mathbf{\mu} - \frac{\gamma}{2} \mathbf{w}^T \mathbf{\Sigma} \mathbf{w} \\
\text{s.t} \quad & \mathbf{1}^T \mathbf{w} = 1, \\
\end{aligned}\]
where \(\gamma > 0\) is the risk aversion parameter.
The first term in the objective function represents the expected return of the portfolio, while the second term represents the risk (variance) of the portfolio, scaled by the risk aversion parameter. The constraint ensures that the portfolio weights sum to 1, so the portfolio is fully invested.
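To make the trade-off concrete, the objective can be evaluated for a fixed weight vector at several levels of risk aversion. This is a minimal sketch: the three-asset inputs `mu` and `Sigma` and the helper `mean_variance_objective` are hypothetical and chosen only for illustration.

```python
import numpy as np

def mean_variance_objective(w, mu, Sigma, gamma):
    """Return w'mu - (gamma / 2) * w'Sigma w for a weight vector w."""
    expected_return = w @ mu
    variance = w @ Sigma @ w
    return expected_return - 0.5 * gamma * variance

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])

w_equal = np.full(3, 1.0 / 3.0)  # equally weighted, so the weights sum to 1

for gamma in (1.0, 5.0, 20.0):
    value = mean_variance_objective(w_equal, mu, Sigma, gamma)
    print(f"gamma = {gamma:5.1f}  objective = {value:.4f}")
```

For a fixed portfolio, the objective falls as \(\gamma\) rises, because the same variance is penalized more heavily.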
There are other equivalent formulations of the mean-variance optimization problem, such as minimizing portfolio variance subject to a lower bound on expected return, or maximizing expected return subject to an upper bound on variance:
\[
\begin{aligned}
\min_{\mathbf{w}} \quad & \mathbf{w}^T \mathbf{\Sigma} \mathbf{w} \\
\text{s.t} \quad & \mathbf{w}^T \mathbf{\mu} \geq \mu_p, \mathbf{1}^T \mathbf{w} = 1\\
\end{aligned}
\]
\[
\begin{aligned}
\max_{\mathbf{w}} \quad & \mathbf{w}^T \mathbf{\mu} \\
\text{s.t} \quad & \mathbf{w}^T \mathbf{\Sigma} \mathbf{w} \leq \sigma_p^2, \mathbf{1}^T \mathbf{w} = 1\\
\end{aligned}
\]
See here for a discussion on three equivalent formulations of the mean-variance optimization problem.
This mean-variance objective function for an investor can be justified in a few ways. One common justification is that investors are assumed to be rational and risk-averse, meaning they prefer higher returns and lower risk. The mean-variance objective captures this trade-off between return and risk, allowing investors to make decisions based on their individual risk preferences.
You may recall that a rational investor maximizes their expected utility. Isn’t it possible to directly maximize the expected utility instead of using the mean-variance objective? Yes, it is possible. However, the mean-variance objective is often used as an approximation of the expected utility maximization problem. This is because the mean-variance objective is easier to compute and analyze than the expected utility function, especially when dealing with multiple assets.
(For more on mean-variance approximation of expected utility, see Levy and Markowitz (1979). There is a large literature on this topic. For example, see some recent discussions in Markowitz (2014) and Schuhmacher, Kohrs, and Auer (2021). See also the appendix for linking the mean-variance objective to expected utility maximization using a Taylor expansion.)
Solving the Optimization Problem
To solve the mean-variance optimization problem, we can use various optimization techniques, such as quadratic programming (as we have a quadratic objective function and a linear constraint) or the method of Lagrange multipliers.
For the current setup, we can find a closed-form solution to the mean-variance optimization problem using the method of Lagrange multipliers. The Lagrangian function for this problem is given by:
\[\mathcal{L}(\mathbf{w}, \lambda) = \mathbf{w}^T \mathbf{\mu} - \frac{\gamma}{2} \mathbf{w}^T \mathbf{\Sigma} \mathbf{w} + \lambda (1 - \mathbf{1}^T \mathbf{w}),\]
where \(\lambda\) is the Lagrange multiplier associated with the constraint.
To find the optimal portfolio weights, we take the partial derivatives of the Lagrangian with respect to \(\mathbf{w}\) and \(\lambda\), set them to zero, and solve the resulting system of equations.
Recall from matrix calculus that
\[\frac{\partial}{\partial \mathbf{w}} (\mathbf{w}^T \mathbf{\mu}) = \mathbf{\mu},\]
\[\frac{\partial}{\partial \mathbf{w}} (\mathbf{w}^T \mathbf{\Sigma} \mathbf{w}) = 2\mathbf{\Sigma}\mathbf{w}\]
(since \(\mathbf{\Sigma}\) is symmetric), and
\[\frac{\partial}{\partial \mathbf{w}} (\lambda \mathbf{1}^T \mathbf{w}) = \lambda \mathbf{1}.\]
Setting the derivatives of the Lagrangian to zero gives us the following equations:
\[\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = \mathbf{\mu} - \gamma \mathbf{\Sigma} \mathbf{w} - \lambda \mathbf{1} = 0 \tag{1}\]
\[\frac{\partial \mathcal{L}}{\partial \lambda} = 1 - \mathbf{1}^T \mathbf{w} = 0 \tag{2}\]
Assuming \(\mathbf{\Sigma}\) is positive definite, and hence invertible, we solve for \(\mathbf{w}\) using Equation 1:
\[\mathbf{w} = \frac{1}{\gamma} \mathbf{\Sigma}^{-1} (\mathbf{\mu} - \lambda \mathbf{1})\]
Plug the result into Equation 2 to solve for \(\lambda\).
\[
\lambda = \frac{\mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{\mu} - \gamma}{\mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{1}}
\]
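The substitution behind this expression can be spelled out in one line. Inserting the expression for \(\mathbf{w}\) into the constraint \(\mathbf{1}^T \mathbf{w} = 1\) gives
\[
\mathbf{1}^T \mathbf{w} = \frac{1}{\gamma} \left( \mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{\mu} - \lambda \, \mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{1} \right) = 1
\quad \Longrightarrow \quad
\lambda \, \mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{1} = \mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{\mu} - \gamma,
\]
and dividing by \(\mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{1}\), which is strictly positive because \(\mathbf{\Sigma}^{-1}\) is positive definite, yields the expression for \(\lambda\).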
To make the notation cleaner, let’s define two scalars:
- \(A = \mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{1}\)
- \(B = \mathbf{1}^T \mathbf{\Sigma}^{-1} \mathbf{\mu}\)
Thus, \[
\lambda = \frac{B - \gamma}{A}
\]
Substituting \(\lambda\) back into the expression for \(\mathbf{w}\), we obtain the optimal portfolio weights:
\[
\mathbf{w}^* = \frac{1}{\gamma} \mathbf{\Sigma}^{-1} \left( \mathbf{\mu} - \frac{B - \gamma}{A} \mathbf{1} \right)
\]
Or, grouping terms by \(\gamma\),
\[
\mathbf{w}^* = \frac{1}{A} \mathbf{\Sigma}^{-1} \mathbf{1} + \frac{1}{\gamma} \left( \mathbf{\Sigma}^{-1} \mathbf{\mu} - \frac{B}{A} \mathbf{\Sigma}^{-1} \mathbf{1} \right)
\]
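The closed-form weights are straightforward to check numerically. The following is a minimal sketch under assumed inputs: `mu`, `Sigma`, `gamma`, and the helper `optimal_weights` are hypothetical names and numbers introduced for illustration. It verifies that the weights sum to one and that Equation 1 holds, meaning that \(\mathbf{\mu} - \gamma \mathbf{\Sigma} \mathbf{w}^*\) has the same value (\(\lambda\)) in every entry.

```python
import numpy as np

def optimal_weights(mu, Sigma, gamma):
    """Closed-form weights: (1/A) S^-1 1 + (1/gamma) (S^-1 mu - (B/A) S^-1 1)."""
    ones = np.ones(len(mu))
    # Solve linear systems instead of forming the inverse explicitly.
    Sigma_inv_ones = np.linalg.solve(Sigma, ones)
    Sigma_inv_mu = np.linalg.solve(Sigma, mu)
    A = ones @ Sigma_inv_ones
    B = ones @ Sigma_inv_mu
    return Sigma_inv_ones / A + (Sigma_inv_mu - (B / A) * Sigma_inv_ones) / gamma

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
gamma = 5.0

w_star = optimal_weights(mu, Sigma, gamma)

# Equation 1 implies mu - gamma * Sigma w* equals lambda in every entry.
foc_residual = mu - gamma * Sigma @ w_star

print("optimal weights:", w_star)
print("sum of weights:", w_star.sum())
print("implied lambda per asset:", foc_residual)
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice for numerical stability; the result is the same formula.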
Note that the first-order conditions are both necessary and sufficient for optimality, since the objective function is concave (the negative of a convex quadratic function, because \(\mathbf{\Sigma}\) is positive definite) and the constraint is linear.
As \(\gamma \to \infty\), the investor becomes infinitely risk-averse and the optimal portfolio converges to the minimum variance portfolio:
\[\mathbf{w}_{MVP} = \frac{1}{A} \mathbf{\Sigma}^{-1} \mathbf{1}.\]
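The convergence can be seen numerically by computing the gap between the optimal weights and the minimum variance portfolio as \(\gamma\) grows. This is a sketch with hypothetical three-asset inputs; the function name `optimal_weights` is an assumption made for illustration.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])

ones = np.ones(len(mu))
Sigma_inv_ones = np.linalg.solve(Sigma, ones)
Sigma_inv_mu = np.linalg.solve(Sigma, mu)
A = ones @ Sigma_inv_ones
B = ones @ Sigma_inv_mu

# Minimum variance portfolio: (1/A) Sigma^-1 1.
w_mvp = Sigma_inv_ones / A

def optimal_weights(gamma):
    """Optimal weights written as the MVP plus a term that shrinks with 1/gamma."""
    return w_mvp + (Sigma_inv_mu - (B / A) * Sigma_inv_ones) / gamma

for gamma in (1.0, 10.0, 100.0, 1e4):
    gap = np.abs(optimal_weights(gamma) - w_mvp).max()
    print(f"gamma = {gamma:8.1f}  max |w* - w_mvp| = {gap:.2e}")
```

The gap shrinks in proportion to \(1/\gamma\), as the second term of the grouped expression for \(\mathbf{w}^*\) suggests.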
To trace out the efficient frontier, the set of optimal portfolios for different levels of risk aversion, we can vary \(\gamma\) from a small value (close to risk-neutral) to a large value (more risk-averse) and compute the corresponding optimal portfolio weights.
In the risk-return plane (portfolio standard deviation vs expected return), the efficient frontier is the upper portion of the hyperbola formed by these optimal portfolios. That is, let \(\mu_p = \mathbf{w}^{*T} \mathbf{\mu}\) be the expected return of the optimal portfolio, and \(\sigma_p = \sqrt{\mathbf{w}^{*T} \mathbf{\Sigma} \mathbf{w}^*}\) be the standard deviation (risk) of the optimal portfolio. By varying \(\gamma\), we can plot \(\mu_p\) against \(\sigma_p\) to visualize the efficient frontier.
In particular, let’s define another scalar \(C = \mathbf{\mu}^T \mathbf{\Sigma}^{-1} \mathbf{\mu}\). We then have:
\[
\mu_p = \frac{B}{A} + \frac{1}{\gamma} \left( C - \frac{B^2}{A} \right)
\]
\[
\sigma_p^2 = \frac{1}{A} + \frac{1}{\gamma^2} \left( C - \frac{B^2}{A} \right)
\]
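These two expressions can be checked against a direct computation from the weights. The sketch below assumes hypothetical three-asset inputs and one value of \(\gamma\); the shorthand `D` for \(C - B^2/A\) is introduced only in the code.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
gamma = 5.0

ones = np.ones(len(mu))
Sigma_inv_ones = np.linalg.solve(Sigma, ones)
Sigma_inv_mu = np.linalg.solve(Sigma, mu)
A = ones @ Sigma_inv_ones
B = ones @ Sigma_inv_mu
C = mu @ Sigma_inv_mu
D = C - B**2 / A  # shorthand used only in this sketch

# Optimal weights from the closed-form solution.
w_star = Sigma_inv_ones / A + (Sigma_inv_mu - (B / A) * Sigma_inv_ones) / gamma

# Direct computation from the weights.
mu_p_direct = w_star @ mu
var_p_direct = w_star @ Sigma @ w_star

# Closed-form expressions in terms of A, B, C and gamma.
mu_p_formula = B / A + D / gamma
var_p_formula = 1.0 / A + D / gamma**2

print("expected return:", mu_p_direct, mu_p_formula)
print("variance:       ", var_p_direct, var_p_formula)
```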
In the Python implementation below, we actually trace out the efficient frontier by varying the expected return \(\mu_p\), starting from the expected return of the minimum variance portfolio (so \(\mu_p \geq \frac{B}{A}\)). For any given \(\mu_p\), we can solve for \(\sigma_p\) as
\[
\sigma_p = \sqrt{\frac{A\mu_p^2 - 2B\mu_p + C}{AC-B^2}}.
\]
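As a quick numerical illustration of this formula (a minimal sketch with hypothetical inputs, separate from the full implementation mentioned above), the frontier can be evaluated on a small grid of target returns. The helper name `frontier_sigma` and the grid width of 0.05 are assumptions made for the example.

```python
import numpy as np

def frontier_sigma(mu_p, A, B, C):
    """Frontier standard deviation for a target expected return mu_p."""
    return np.sqrt((A * mu_p**2 - 2 * B * mu_p + C) / (A * C - B**2))

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])

ones = np.ones(len(mu))
Sigma_inv_ones = np.linalg.solve(Sigma, ones)
Sigma_inv_mu = np.linalg.solve(Sigma, mu)
A = ones @ Sigma_inv_ones
B = ones @ Sigma_inv_mu
C = mu @ Sigma_inv_mu

mu_mvp = B / A  # expected return of the minimum variance portfolio
mu_grid = np.linspace(mu_mvp, mu_mvp + 0.05, 6)
sigma_grid = frontier_sigma(mu_grid, A, B, C)

for m, s in zip(mu_grid, sigma_grid):
    print(f"mu_p = {m:.4f}  sigma_p = {s:.4f}")
```

The first grid point reproduces the minimum variance portfolio, whose standard deviation is \(\sqrt{1/A}\), and the risk increases as the target return moves above \(B/A\).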
Risk-free Asset Extension
We can extend the mean-variance optimization problem to include a risk-free asset with return \(r_f\). Let \(w_f\) be the weight of the risk-free asset in the portfolio, and \(\mathbf{w}\) be the weights of the risky assets. The new optimization problem becomes:
\[
\begin{aligned}
\max_{\mathbf{w}, w_f} \quad & \mathbf{w}^T \mathbf{\mu} + w_f r_f - \frac{\gamma}{2} \mathbf{w}^T \mathbf{\Sigma} \mathbf{w} \\
\text{s.t} \quad & \mathbf{1}^T \mathbf{w} + w_f = 1 \\
\end{aligned}
\]
Note that the risk-free asset does not contribute to the portfolio variance.
The solution to this problem can be derived similarly using the method of Lagrange multipliers, leading to adjusted optimal weights for both risky and risk-free assets:
\[\mathbf{w}^* = \frac{1}{\gamma} \mathbf{\Sigma}^{-1} \underbrace{(\mathbf{\mu} - r_f \mathbf{1})}_{\text{Excess Returns}} \tag{3}\]
\[w_f^* = 1 - \mathbf{1}^T \mathbf{w}^* \tag{4}\]
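Equations 3 and 4 translate directly into a few lines of code. This is a sketch under assumed inputs: the three-asset `mu` and `Sigma`, the risk-free rate `r_f = 0.03`, and the helper `weights_with_risk_free` are all hypothetical.

```python
import numpy as np

def weights_with_risk_free(mu, Sigma, r_f, gamma):
    """Risky weights from Equation 3 and risk-free weight from Equation 4."""
    excess = mu - r_f  # excess returns, mu - r_f * 1
    w = np.linalg.solve(Sigma, excess) / gamma
    w_f = 1.0 - w.sum()
    return w, w_f

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
r_f = 0.03

for gamma in (2.0, 5.0, 10.0):
    w, w_f = weights_with_risk_free(mu, Sigma, r_f, gamma)
    print(f"gamma = {gamma:4.1f}  risky weights = {w}  risk-free weight = {w_f:.4f}")
```

A negative risk-free weight corresponds to borrowing at \(r_f\) to take a leveraged position in the risky assets.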
The presence of a risk-free asset lets investors choose their preferred level of risk simply by adjusting the split between the risk-free asset and a single optimal risky portfolio: by Equation 3, \(\gamma\) only scales the risky weights and leaves their relative proportions unchanged. This leads to the concept of the Capital Market Line (CML), which represents the set of optimal portfolios that can be formed by combining the risk-free asset with the market portfolio of risky assets.
The CML is a straight line in the risk-return space, starting from the risk-free rate on the y-axis and tangent to the efficient frontier of risky assets. The tangency point represents the market portfolio, which is the optimal risky portfolio. The slope of the CML is given by the Sharpe ratio of the market portfolio, which measures the excess return per unit of risk.
The equation of the CML can be expressed as: \[
\mu_p = r_f + \frac{\mu_m - r_f}{\sigma_m} \sigma_p
\] where \(\mu_p\) and \(\sigma_p\) are the expected return and standard deviation of the portfolio, \(r_f\) is the risk-free rate, and \(\mu_m\) and \(\sigma_m\) are the expected return and standard deviation of the market portfolio (i.e., the tangent portfolio).
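The line can be verified numerically: any portfolio built from Equation 3 and Equation 4 should sit exactly on the CML. The sketch below uses hypothetical inputs and takes the market (tangent) portfolio to be the Equation 3 risky weights rescaled to sum to one; the helper `cml_return` is an illustrative name.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
r_f = 0.03

# Market (tangent) portfolio: risky weights rescaled to sum to one.
raw = np.linalg.solve(Sigma, mu - r_f)
w_m = raw / raw.sum()
mu_m = w_m @ mu
sigma_m = np.sqrt(w_m @ Sigma @ w_m)
slope = (mu_m - r_f) / sigma_m  # Sharpe ratio of the market portfolio

def cml_return(sigma_p):
    """Expected return on the CML at risk level sigma_p."""
    return r_f + slope * sigma_p

# An investor with gamma = 5 holds Equation 3 weights plus the risk-free asset.
gamma = 5.0
w = raw / gamma
w_f = 1.0 - w.sum()
mu_p = w @ mu + w_f * r_f
sigma_p = np.sqrt(w @ Sigma @ w)

print("portfolio expected return:", mu_p)
print("CML return at same risk:  ", cml_return(sigma_p))
```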
The tangent portfolio can be derived by setting \(w_f^*=0\), because it lies on the efficient frontier of risky assets and therefore holds no risk-free asset. This implies \(\mathbf{1}^T \mathbf{w}_{tangent} = 1\) (Equation 4), and hence the weights of the tangent portfolio are given by Equation 3 with \(\gamma\) eliminated through normalization:
\[
\mathbf{w}_{tangent} = \frac{\mathbf{\Sigma}^{-1}(\mathbf{\mu} - r_f \mathbf{1})}{\mathbf{1}^T \mathbf{\Sigma}^{-1} (\mathbf{\mu} - r_f \mathbf{1})}
\]
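The normalization argument can be illustrated in code: rescaling the Equation 3 weights to sum to one gives the same portfolio for every \(\gamma\). This is a sketch with hypothetical inputs; `tangent_weights` is an illustrative helper name.

```python
import numpy as np

def tangent_weights(mu, Sigma, r_f):
    """Tangent portfolio: Sigma^-1 (mu - r_f 1), rescaled to sum to one."""
    raw = np.linalg.solve(Sigma, mu - r_f)
    return raw / raw.sum()

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
r_f = 0.03

w_tan = tangent_weights(mu, Sigma, r_f)
print("tangent weights:", w_tan, "sum:", w_tan.sum())

# Equation 3 weights for several gammas, each rescaled to sum to one.
normalized = []
for gamma in (2.0, 5.0, 10.0):
    w = np.linalg.solve(Sigma, mu - r_f) / gamma
    normalized.append(w / w.sum())
    print(f"gamma = {gamma:4.1f}  normalized risky weights = {normalized[-1]}")
```

The normalization assumes \(\mathbf{1}^T \mathbf{\Sigma}^{-1} (\mathbf{\mu} - r_f \mathbf{1}) > 0\), which holds when \(r_f\) is below the expected return of the minimum variance portfolio.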
(Alternatively, the tangent portfolio can be found by maximizing the Sharpe ratio: \[
\text{Sharpe Ratio} = \frac{\mathbf{w}^T (\mathbf{\mu} - r_f \mathbf{1})}{\sqrt{\mathbf{w}^T \mathbf{\Sigma} \mathbf{w}}},
\]
which yields the same result.)
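A simple way to see the Sharpe ratio claim at work is to compare the tangent portfolio against many randomly drawn portfolios. This is an illustrative check, not a proof, and it assumes hypothetical inputs; the candidates are long-only, fully invested portfolios drawn from a Dirichlet distribution with a fixed seed.

```python
import numpy as np

def sharpe_ratio(w, mu, Sigma, r_f):
    """Excess return per unit of standard deviation for weights w."""
    return (w @ (mu - r_f)) / np.sqrt(w @ Sigma @ w)

# Hypothetical inputs for three assets (illustrative numbers, not from the text).
mu = np.array([0.08, 0.10, 0.12])
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
r_f = 0.03

# Tangent portfolio from the closed-form expression.
raw = np.linalg.solve(Sigma, mu - r_f)
w_tan = raw / raw.sum()
best = sharpe_ratio(w_tan, mu, Sigma, r_f)

# Random long-only, fully invested candidate portfolios (fixed seed).
rng = np.random.default_rng(0)
candidates = rng.dirichlet(np.ones(len(mu)), size=1000)
candidate_sharpes = np.array([sharpe_ratio(w, mu, Sigma, r_f) for w in candidates])

print("tangent Sharpe ratio:       ", best)
print("best random candidate ratio:", candidate_sharpes.max())
```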