Note: These explanations are in the process of being adapted from my textbook.
I'm trying to make them each a "standalone" treatment of a concept, but there may still
be references to the narrative flow of the book that I have yet to remove.
This work is under development and has not yet been professionally edited.
If you catch a typo or error, or just have a suggestion, please submit a note here. Thanks!

Unconstrained Optimization

Unconstrained optimization means finding the global maximum or minimum of a function over its entire domain.

Critical points

In the case of a continuous, smooth function (one which is both continuous and continuously differentiable), a critical point — a candidate for a local maximum or minimum — occurs at a point where the function is “flat”. For a univariate function $y = f(x)$, this occurs where the derivative $dy/dx$ is equal to zero:

Global maxima where the derivative is zero

It’s clear from the above graph that just setting $dy/dx = 0$ and solving for $x$ does not necessarily find a global maximum or minimum. However, there are special cases in which it does. For example, consider the function

\(y = f(x) = 16 + 8x - 2x^2\) The derivative of this is \({dy \over dx} = f^\prime(x) = 8 - 4x\) This is equal to 0 at $x = 2$:

Note that in this case, the derivative starts out positive, ends up negative, and is continuously and monotonically decreasing: that is, $f^{\prime \prime}(x) < 0$. In such a case, the function is increasing for low values of $x$ and eventually decreases; so at some point it must reach a maximum, and that maximum must be a global maximum.
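This logic can be checked with a quick numerical sketch. The function and its derivatives below come from the example above; the comparison points are just an illustration of concavity:

```python
# f(x) = 16 + 8x - 2x^2 and its first two derivatives, from the text.

def f(x):
    return 16 + 8 * x - 2 * x ** 2

def f_prime(x):
    return 8 - 4 * x            # dy/dx

def f_double_prime(x):
    return -4.0                 # d2y/dx2: constant and negative

x_star = 2.0                    # solves 8 - 4x = 0
print(f_prime(x_star))          # 0.0: the function is flat here
print(f_double_prime(x_star) < 0)  # True: concave, so this is a global maximum
print(f(x_star))                # 24.0: the maximum value
print(f(1.5) < f(x_star), f(2.5) < f(x_star))  # nearby points are lower
```

Because $f^{\prime\prime}(x) < 0$ everywhere, any nearby point gives a lower value of $f$, which is why the single critical point here must be the global maximum.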

Unconstrained maxima for multivariable functions

With a multivariable function, critical points occur when all partial derivatives are zero. As with a univariate function, this is a “flat” point on the function, only now it’s flat in both the $x_1$ and $x_2$ directions.

For example, we saw previously that the function \(y = f(x_1,x_2) = 8x_1 - 2x_1^2 + 8x_2 - x_2^2\) had a maximum at $(2,4)$. The gradient of this function, $\nabla f$, is the vector of its partial derivatives: \(\nabla f(x_1,x_2) = \left[\begin{matrix}{\partial f(x_1,x_2) \over \partial x_1} \\ \\ {\partial f(x_1,x_2) \over \partial x_2}\end{matrix}\right] = \left[\begin{matrix}8 - 4x_1 \\ \\ 8 - 2x_2\end{matrix}\right]\) This can be interpreted as the slope of a plane tangent to the function at the point $(x_1,x_2)$. Try changing $x_1$ and $x_2$ in the diagram below to see how this works, in particular at the maximum $(2,4)$:

At the maximum of the function at $(2,4)$, \(\nabla f(x_1,x_2) = \left[\begin{matrix}8 - 4 \times 2 \\ 8 - 2 \times 4 \end{matrix}\right] =\left[\begin{matrix}0 \\ 0 \end{matrix}\right]\) In other words, at the maximum of the function, both partial derivatives are zero, so the gradient is flat.
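The same check works in two variables. The function and its partial derivatives below are taken from the text; evaluating the gradient at $(2,4)$ confirms that both components vanish there:

```python
# f(x1, x2) = 8*x1 - 2*x1**2 + 8*x2 - x2**2, from the text.

def f(x1, x2):
    return 8 * x1 - 2 * x1 ** 2 + 8 * x2 - x2 ** 2

def gradient(x1, x2):
    # (df/dx1, df/dx2) = (8 - 4*x1, 8 - 2*x2)
    return (8 - 4 * x1, 8 - 2 * x2)

print(gradient(2, 4))   # (0, 0): both partials are zero at the maximum
print(f(2, 4))          # 24: the value of the function at its maximum
```

Setting each component of the gradient to zero and solving gives $x_1 = 2$ and $x_2 = 4$, the maximum identified above.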

Next: Constrained Optimization and the Lagrange Method
Copyright (c) Christopher Makler