Unconstrained Optimization

Unconstrained optimization means finding the global maximum or minimum of a function over its entire domain.

Critical points

For a smooth function (one which is both continuous and continuously differentiable), a local maximum or minimum in the interior of the domain occurs at a critical point: a point where the function is “flat”. Not every critical point is a maximum or minimum, but every interior maximum or minimum of a smooth function is a critical point. For a univariate function $y = f(x)$, a critical point occurs where the derivative $dy/dx$ is equal to zero:

Global maxima where the derivative is zero

It’s clear from the above graph that just setting $dy/dx = 0$ and solving for $x$ does not necessarily find you a global maximum or minimum. However, there are special cases in which it does. For example, consider the function

$y = f(x) = 16 + 8x - 2x^2$

The derivative of this is

${dy \over dx} = f^\prime(x) = 8 - 4x$

This is equal to 0 at $x = 2$:

Note that in this case, the derivative starts out positive, ends up negative, and is continuously and monotonically decreasing: that is, $f^{\prime \prime}(x) < 0$. In such a case, the function is increasing for low values of $x$ and eventually decreases; so at some point it must reach a maximum, and that maximum must be a global maximum.
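As a rough check of this reasoning, the short Python sketch below solves $f^\prime(x) = 0$ for the example function and confirms that the second derivative is negative there. The function names (`f`, `f_prime`, `f_double_prime`) and the grid used for the brute-force comparison are illustrative choices, not part of the original text.

```python
# Example function from the text and its first two derivatives.
def f(x):
    return 16 + 8 * x - 2 * x ** 2

def f_prime(x):
    return 8 - 4 * x

def f_double_prime(x):
    return -4  # constant and negative, so f' is always decreasing

# f'(x) = 8 - 4x is linear, so f'(x) = 0 has the single solution x = 8/4.
x_star = 8 / 4

# First-order condition: the function is flat at x_star.
assert f_prime(x_star) == 0
# Second derivative is negative, so the flat point is a maximum.
assert f_double_prime(x_star) < 0

# Brute-force sanity check on an arbitrary grid over [-10, 10]:
# no grid point gives a higher value than x_star.
grid = [i / 100 for i in range(-1000, 1001)]
best_x = max(grid, key=f)

print(x_star, f(x_star), best_x)
```

The grid search is only a sanity check over a finite interval; the conclusion that the maximum is global comes from the sign of $f^{\prime \prime}(x)$, which is negative everywhere.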

Unconstrained maxima for multivariable functions

With a multivariable function, critical points occur when all partial derivatives are zero. As with a univariate function, this is a “flat” point on the function, only now it’s flat in both the $x_1$ and $x_2$ directions.

For example, we saw previously that the function

$y = f(x_1,x_2) = 8x_1 - 2x_1^2 + 8x_2 - x_2^2$

has a maximum at $(2,4)$. The gradient of this function, $\nabla f$, is the vector of its partial derivatives:

$\nabla f(x_1,x_2) = \left[\begin{matrix}{\partial f(x_1,x_2) \over \partial x_1} \\ \\ {\partial f(x_1,x_2) \over \partial x_2}\end{matrix}\right] = \left[\begin{matrix}8 - 4x_1 \\ \\ 8 - 2x_2\end{matrix}\right]$

This can be interpreted as the slope of a plane tangent to the function at the point $(x_1,x_2)$. Try changing $x_1$ and $x_2$ in the diagram below to see how this works, in particular at the maximum $(2,4)$:

At the maximum of the function at $(2,4)$,

$\nabla f(x_1,x_2) = \left[\begin{matrix}8 - 4 \times 2 \\ 8 - 2 \times 4 \end{matrix}\right] =\left[\begin{matrix}0 \\ 0 \end{matrix}\right]$

In other words, at the maximum of the function, both partial derivatives are zero, so the gradient is the zero vector and the tangent plane is flat.
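The gradient calculation can be checked numerically with a small Python sketch. The helper names (`grad_f`, `numerical_grad`), the step size `h`, and the set of neighboring points are assumptions made for illustration; the function and the point $(2,4)$ come from the text.

```python
# Example function from the text.
def f(x1, x2):
    return 8 * x1 - 2 * x1 ** 2 + 8 * x2 - x2 ** 2

# Analytic gradient: the vector of partial derivatives.
def grad_f(x1, x2):
    return (8 - 4 * x1, 8 - 2 * x2)

# Central-difference approximation of the gradient, as an independent check.
def numerical_grad(func, x1, x2, h=1e-6):
    d1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2 * h)
    d2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2 * h)
    return (d1, d2)

critical_point = (2, 4)

analytic = grad_f(*critical_point)
approx = numerical_grad(f, *critical_point)

# Both partial derivatives vanish at (2, 4).
assert analytic == (0, 0)
assert all(abs(d) < 1e-6 for d in approx)

# Moving a small step in any of these directions lowers the function value.
steps = [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1), (0.1, 0.1), (-0.1, -0.1)]
neighbors_lower = all(
    f(critical_point[0] + dx, critical_point[1] + dy) < f(*critical_point)
    for dx, dy in steps
)

print(analytic, f(*critical_point), neighbors_lower)
```

Away from the critical point the gradient is nonzero: for instance, `grad_f(1, 1)` evaluates to `(4, 6)`, meaning the function is increasing in both directions there.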
