Boost Optimization Algorithm Performance for Faster Convergence

Analysis shows how to get the best results when approximating solutions to complex engineering problems.

Optimization algorithms—procedures designed to find minimum values of mathematical functions—play a central role across engineering and data science. They guide decisions about tradeoffs in design, tune control systems, and extract useful patterns from data. Many practical optimization tasks, from designing a fuel-efficient car to selecting compact, predictive feature sets in machine learning, reduce to finding the lowest value of a cost function defined over design or parameter variables.

One widely used practical strategy for tackling hard, nonconvex optimization problems is a continuation or coarse-to-fine approach: replace the original problem with a much simpler, smoothed version; solve that simpler problem; then gradually reintroduce complexity, using each solution as the starting point for the next, more detailed problem. This heuristic often works well in practice, but until recently it lacked a clear theoretical characterization that would tell practitioners how to choose the sequence of simplified problems to guarantee the best possible approximation.

This month at the International Conference on Energy Minimization Methods in Computer Vision and Pattern Recognition, researchers Hossein Mobahi, a postdoc at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and John Fisher, a senior research scientist at CSAIL, presented an analysis that identifies how to generate the sequence of simplified functions so that the continuation method yields the best approximation attainable under the approach.

“There are some fundamental questions about this method that we answer for the first time,” Mobahi says. “For example, I told you that you start from a simple problem, but I didn’t tell you how you choose that simple problem. There are infinitely many functions you can start with. Which one is good? Even if I tell you what function to start with, there are infinitely many ways to transform that to your actual problem. And that transformation affects what you get at the end.”

Bottoming out

To make the concept concrete, imagine a canned-food manufacturer aiming to minimize material cost by designing a can with the smallest possible surface-area-to-volume ratio. That ratio is a function of the can’s height and radius; the optimal can dimensions come from finding the minimum of that function. In other problems, such as balancing component costs, weight, and aerodynamics in a vehicle, the cost function is much more complex—but the mathematical objective is the same.

Most efficient optimization routines rely on local search. They start from an initial guess and iteratively move in directions that locally decrease the cost. This process converges to a local minimum—a point that has lower cost than nearby points—but a local minimum is not necessarily the global minimum, which is the absolute lowest value across the entire domain. Convex functions are an important special case: if a cost function is convex, every local minimum is also a global minimum, and local search will find the best solution. The parabola y = x2 is a simple convex function; by contrast, oscillatory functions such as y = sin x are nonconvex and have many local minima and maxima.

Smooth sailing

Mobahi and Fisher analyze a continuation strategy built around Gaussian smoothing. Gaussian smoothing transforms the original cost function into a family of smoothed versions by replacing each point’s value with a weighted average of values in its neighborhood. The weights come from a Gaussian (normal) distribution, the familiar bell curve: values closer to the point contribute more to the average than values farther away. The degree of smoothing is controlled by the Gaussian’s width (standard deviation).

Starting with a very wide Gaussian produces a heavily smoothed function that, under certain conditions, can be convex. The researchers then gradually narrow the Gaussian, producing a sequence of intermediary cost functions that progressively resemble the original, potentially nonconvex function. At each step, the solution obtained for the current smoothed function is used to initialize the search for the next function in the sequence. When the Gaussian width shrinks to zero, the smoothing disappears and the original cost function is recovered.

The key contribution of the work is a theoretical prescription for how to choose that smoothing sequence so that the continuation method yields the best approximation possible within this framework. By identifying which smoothed starting point and which transformation path are appropriate, the analysis turns a heuristic into a principled procedure. That matters in practice because there are many ways to perform coarse-to-fine optimization; understanding which choices are sound reduces wasted effort and makes the approach more reliable.

John Wright, an assistant professor of electrical engineering at Columbia University who was not involved in the research, emphasizes the importance of such analysis: “The continuation method for optimization is something that is really widely used in practice, widely used in computer vision, for solving alignment problems, for solving tracking problems, a bunch of different places, but it’s not very well understood. The interesting thing about Hossein’s work in general, and this paper in particular, is that he’s really digging into this continuation method and trying to see what we can say analytically about this.”

Knowing an analytically justified recipe for smoothing and continuation can help practitioners apply these methods more effectively across computer vision, machine learning, and other engineering disciplines where robust, reproducible optimization is crucial.

This image is a sequence of graphs illustrates the application of the researchers' technique to a real-world computer vision problem. — This sequence of graphs illustrates the application of the researchers’ technique to a real-world computer vision problem. The solution to each successive problem (red balls) is used to initialize (green arrows) the search for a solution to the next. Image credit: The researchers.

About this computer science research

Contact: Larry Hardesty – MIT
Source: MIT press release
Image Source: Image credited to the researchers and adapted from the MIT press release
Original Research: Hossein Mobahi and John W. Fisher III presented research titled “On the Link between Gaussian Homotopy Continuation and Convex Envelopes” at the International Conference on Energy Minimization Methods in Computer Vision and Pattern Recognition in January 2015.

Share this Computational Neuroscience News