An Introduction to Ipopt, a Solver for Nonlinear Optimization Problems

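Ipopt (Interior Point OPTimizer) is a software package for solving large-scale nonlinear optimization problems. It finds (local) solutions of problems of the following form:

$$
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & f(x) \\
\text{s.t.} \quad & g^L \le g(x) \le g^U, \\
& x^L \le x \le x^U,
\end{aligned}
$$

where $x \in \mathbb{R}^n$ are the optimization variables, $f: \mathbb{R}^n \to \mathbb{R}$ is the objective function, and $g: \mathbb{R}^n \to \mathbb{R}^m$ are the constraint functions. Both $f$ and $g$ may be nonlinear and nonconvex, but they must be twice continuously differentiable.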

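To solve an optimization problem, Ipopt needs the following additional information:

Problem dimensions
1. the number of optimization variables;
2. the number of constraint functions;
Problem bounds
1. the bounds on the optimization variables;
2. the bounds on the constraint functions;
Initial iterate
1. the initial values of the optimization variables;
2. the initial values of the Lagrange multipliers (needed only for a warm start);
Problem structure
1. the number of nonzeros in the Jacobian of the constraint functions;
2. the number of nonzeros in the Hessian of the Lagrangian;
3. the sparsity structure of the constraint Jacobian (row and column indices of each nonzero entry);
4. the sparsity structure of the Hessian of the Lagrangian (row and column indices of each nonzero entry);
Problem function values
1. the objective function;
2. the gradient of the objective function;
3. the constraint functions;
4. the Jacobian of the constraint functions;
5. the Hessian of the Lagrangian (not needed when the quasi-Newton options are used);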

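The problem dimensions and bounds come straight from the problem definition. The initial iterate affects whether the solver converges at all and which (local) solution it converges to; different starting points may lead to different local solutions. Providing the derivative matrices (the Jacobian and the Hessian) takes more care. Ipopt must be given the nonzero entries of the constraint Jacobian and of the Hessian of the Lagrangian, together with their row and column indices. Because the Hessian is symmetric, the standard interface takes only its lower triangle. Once the sparsity structure has been declared, it stays fixed for the whole solve. The structure must therefore cover not only the entries that are nonzero at the starting point, but every entry that can become nonzero at any point during the solve.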
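The triplet (row, column, value) representation described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the `Triplet` struct and the function `lower_triangle_triplets` are hypothetical names introduced here, and Ipopt itself exchanges plain index and value arrays rather than a struct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration only; Ipopt itself exchanges
// plain index and value arrays rather than a struct like this.
struct Triplet
{
   int    row;
   int    col;
   double value;
};

// List the lower triangle (row >= col) of a symmetric n-by-n matrix stored
// row-major in `dense`. Every lower-triangle position is emitted, including
// entries that happen to be zero, so the structure does not depend on the
// current values.
std::vector<Triplet> lower_triangle_triplets(const std::vector<double>& dense, int n)
{
   assert(n >= 0);
   assert(dense.size() == static_cast<std::size_t>(n) * static_cast<std::size_t>(n));

   std::vector<Triplet> triplets;
   for( int row = 0; row < n; row++ )
   {
      for( int col = 0; col <= row; col++ )
      {
         const std::size_t pos = static_cast<std::size_t>(row) * n + col;
         triplets.push_back(Triplet{row, col, dense[pos]});
      }
   }
   return triplets;
}
```

For a 4-by-4 matrix the function returns 10 triplets, the same count that the example later in this article reports for the lower triangle of its Hessian.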

1. Example

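The example used throughout is problem HS071 (implemented by the class HS071_NLP in the code below):

$$
\begin{aligned}
\min_{x \in \mathbb{R}^4} \quad & x_1 x_4 (x_1 + x_2 + x_3) + x_3 \\
\text{s.t.} \quad & x_1 x_2 x_3 x_4 \ge 25, \\
& x_1^2 + x_2^2 + x_3^2 + x_4^2 = 40, \\
& 1 \le x_1, x_2, x_3, x_4 \le 5,
\end{aligned}
$$

with the starting point $x_0 = (1, 5, 5, 1)$.

The gradient of the objective function is:

$$
\nabla f(x) =
\begin{bmatrix}
x_1 x_4 + x_4 (x_1 + x_2 + x_3) \\
x_1 x_4 \\
x_1 x_4 + 1 \\
x_1 (x_1 + x_2 + x_3)
\end{bmatrix}
$$

The Jacobian of the constraint functions is:

$$
\nabla g(x) =
\begin{bmatrix}
x_2 x_3 x_4 & x_1 x_3 x_4 & x_1 x_2 x_4 & x_1 x_2 x_3 \\
2 x_1 & 2 x_2 & 2 x_3 & 2 x_4
\end{bmatrix}
$$

The Hessian of the Lagrangian is also needed, unless a quasi-Newton method is used to approximate the second derivatives, in which case it does not have to be provided. Supplying the exact Hessian can, however, make the solve more robust and the convergence faster. The Lagrangian of the NLP is defined as $\mathcal{L}(x, \lambda) = f(x) + g(x)^T \lambda$, and its Hessian is $\nabla^2 f(x) + \sum_{i=1}^{m} \lambda_i \nabla^2 g_i(x)$. Ipopt additionally introduces a factor $\sigma_f$ in front of the objective term, so that it can request the Hessian of the objective and the Hessians of the constraints separately. The Hessian used by Ipopt is therefore $\sigma_f \nabla^2 f(x) + \sum_{i=1}^{m} \lambda_i \nabla^2 g_i(x)$. For the example this gives:

$$
\sigma_f
\begin{bmatrix}
2 x_4 & x_4 & x_4 & 2 x_1 + x_2 + x_3 \\
x_4 & 0 & 0 & x_1 \\
x_4 & 0 & 0 & x_1 \\
2 x_1 + x_2 + x_3 & x_1 & x_1 & 0
\end{bmatrix}
+ \lambda_1
\begin{bmatrix}
0 & x_3 x_4 & x_2 x_4 & x_2 x_3 \\
x_3 x_4 & 0 & x_1 x_4 & x_1 x_3 \\
x_2 x_4 & x_1 x_4 & 0 & x_1 x_2 \\
x_2 x_3 & x_1 x_3 & x_1 x_2 & 0
\end{bmatrix}
+ \lambda_2
\begin{bmatrix}
2 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 \\
0 & 0 & 2 & 0 \\
0 & 0 & 0 & 2
\end{bmatrix}
$$

The first term is the Hessian of the objective function, the second and third terms are the Hessians of the constraint functions, and $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers of the constraints.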
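As a quick sanity check of the example, the objective, its gradient, the constraints, and the Jacobian can be evaluated by hand at $x = (1, 5, 5, 1)$, the starting point that the C++ code in Section 2.3 passes to Ipopt. The formulas used are the ones implemented in eval_f, eval_grad_f, eval_g, and eval_jac_g below; this is only a worked evaluation, not output produced by Ipopt.

$$
\begin{aligned}
f(x) &= x_1 x_4 (x_1 + x_2 + x_3) + x_3 = 1 \cdot 1 \cdot 11 + 5 = 16, \\
\nabla f(x) &= (1 + 11,\; 1,\; 1 + 1,\; 1 \cdot 11)^T = (12,\; 1,\; 2,\; 11)^T, \\
g_1(x) &= x_1 x_2 x_3 x_4 = 25, \\
g_2(x) &= x_1^2 + x_2^2 + x_3^2 + x_4^2 = 1 + 25 + 25 + 1 = 52, \\
\nabla g(x) &=
\begin{bmatrix}
25 & 5 & 5 & 25 \\
2 & 10 & 10 & 2
\end{bmatrix}.
\end{aligned}
$$

At this point the inequality constraint holds with equality ($g_1 = 25$), while the equality constraint is violated ($g_2 = 52 \ne 40$), so the solver has to restore feasibility during the iterations.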

2. C++ Interface

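To use the C++ interface, derive your own problem class from the abstract base class Ipopt::TNLP and override its virtual functions, which are described in Sections 2.1 to 2.9. Ipopt then solves the optimization problem through the Ipopt::IpoptApplication class.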

2.1 Ipopt::TNLP::get_nlp_info

   virtual bool get_nlp_info(
      Index&          n,
      Index&          m,
      Index&          nnz_jac_g,
      Index&          nnz_h_lag,
      IndexStyleEnum& index_style
   ) = 0;

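Ipopt uses this function to decide how much memory to allocate for its arrays. Wrong values here lead to memory errors that are very hard to debug.

n: the number of optimization variables;
m: the number of constraint functions;
nnz_jac_g: the number of nonzeros in the Jacobian;
nnz_h_lag: the number of nonzeros in the Hessian;
index_style: whether the sparse-matrix indices use C style (starting from 0) or Fortran style (starting from 1);

The example above has 4 optimization variables and 2 constraint functions. Its Jacobian has 8 nonzeros. Its Hessian has 16 nonzeros, but because the Hessian is symmetric only the 10 nonzeros of the lower triangle are counted.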

// returns the size of the problem
bool HS071_NLP::get_nlp_info(
   Index&          n,
   Index&          m,
   Index&          nnz_jac_g,
   Index&          nnz_h_lag,
   IndexStyleEnum& index_style
)
{
   // The problem described in HS071_NLP.hpp has 4 variables, x[0] through x[3]
   n = 4;
 
   // one equality constraint and one inequality constraint
   m = 2;
 
   // in this example the jacobian is dense and contains 8 nonzeros
   nnz_jac_g = 8;
 
   // the Hessian is also dense and has 16 total nonzeros, but we
   // only need the lower left corner (since it is symmetric)
   nnz_h_lag = 10;
 
   // use the C style indexing (0-based)
   index_style = TNLP::C_STYLE;
 
   return true;
}

2.2 Ipopt::TNLP::get_bounds_info

   virtual bool get_bounds_info(
      Index   n,
      Number* x_l,
      Number* x_u,
      Index   m,
      Number* g_l,
      Number* g_u
   ) = 0;

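Ipopt uses this function to obtain the bounds on the optimization variables $x$ and on the constraint functions $g(x)$.

n: the number of optimization variables;
x_l: the lower bounds on the optimization variables (array of length n);
x_u: the upper bounds on the optimization variables (array of length n);
m: the number of constraint functions;
g_l: the lower bounds on the constraint functions (array of length m);
g_u: the upper bounds on the constraint functions (array of length m);

By default, Ipopt treats a bound as finite only if it lies inside $(-10^{19}, 10^{19})$. A bound outside this range is interpreted as negative or positive infinity.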

// returns the variable bounds
bool HS071_NLP::get_bounds_info(
   Index   n,
   Number* x_l,
   Number* x_u,
   Index   m,
   Number* g_l,
   Number* g_u
)
{
   // here, the n and m we gave IPOPT in get_nlp_info are passed back to us.
   // If desired, we could assert to make sure they are what we think they are.
   assert(n == 4);
   assert(m == 2);
 
   // the variables have lower bounds of 1
   for( Index i = 0; i < 4; i++ )
   {
      x_l[i] = 1.0;
   }
 
   // the variables have upper bounds of 5
   for( Index i = 0; i < 4; i++ )
   {
      x_u[i] = 5.0;
   }
 
   // the first constraint g1 has a lower bound of 25
   g_l[0] = 25;
   // the first constraint g1 has NO upper bound, here we set it to 2e19.
   // Ipopt interprets any number greater than nlp_upper_bound_inf as
   // infinity. The default value of nlp_upper_bound_inf and nlp_lower_bound_inf
   // is 1e19 and can be changed through ipopt options.
   g_u[0] = 2e19;
 
   // the second constraint g2 is an equality constraint, so we set the
   // upper and lower bound to the same value
   g_l[1] = g_u[1] = 40.0;
 
   return true;
}
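The infinity convention used for the bounds can be sketched as a small classification routine. The enum `BoundKind` and the function `classify_bounds` are hypothetical names introduced for this illustration and are not part of Ipopt; the thresholds are the default values mentioned in the comments of the code above.

```cpp
#include <cassert>

// Default thresholds quoted above. In Ipopt they are controlled by the
// options nlp_lower_bound_inf and nlp_upper_bound_inf.
const double kLowerBoundInf = -1e19;
const double kUpperBoundInf = 1e19;

// Hypothetical classification, used here only to illustrate the convention.
enum class BoundKind
{
   Free,
   LowerOnly,
   UpperOnly,
   Range,
   Equality
};

BoundKind classify_bounds(double lower, double upper)
{
   assert(lower <= upper);

   // A bound counts as finite only if it lies strictly inside the thresholds.
   const bool has_lower = lower > kLowerBoundInf;
   const bool has_upper = upper < kUpperBoundInf;

   if( has_lower && has_upper )
   {
      return (lower == upper) ? BoundKind::Equality : BoundKind::Range;
   }
   if( has_lower )
   {
      return BoundKind::LowerOnly;
   }
   if( has_upper )
   {
      return BoundKind::UpperOnly;
   }
   return BoundKind::Free;
}
```

With the bounds of the example, `classify_bounds(25.0, 2e19)` gives a lower-bounded inequality for the first constraint and `classify_bounds(40.0, 40.0)` gives an equality for the second.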

2.3 Ipopt::TNLP::get_starting_point

   virtual bool get_starting_point(
      Index   n,
      bool    init_x,
      Number* x,
      bool    init_z,
      Number* z_L,
      Number* z_U,
      Index   m,
      bool    init_lambda,
      Number* lambda
   ) = 0;

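Ipopt uses this function to obtain the starting point of the iteration.

n: the number of optimization variables;
init_x: if true, initial values for the optimization variables must be provided;
x: the initial values of the optimization variables;

The remaining arguments are the initial values of the dual variables, which usually do not need to be set. By default, Ipopt asks only for the initial values of $x$.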

// returns the initial point for the problem
bool HS071_NLP::get_starting_point(
   Index   n,
   bool    init_x,
   Number* x,
   bool    init_z,
   Number* z_L,
   Number* z_U,
   Index   m,
   bool    init_lambda,
   Number* lambda
)
{
   // Here, we assume we only have starting values for x, if you code
   // your own NLP, you can provide starting values for the dual variables
   // if you wish
   assert(init_x == true);
   assert(init_z == false);
   assert(init_lambda == false);
 
   // initialize to the given starting point
   x[0] = 1.0;
   x[1] = 5.0;
   x[2] = 5.0;
   x[3] = 1.0;
 
   return true;
}

2.4 Ipopt::TNLP::eval_f

   virtual bool eval_f(
      Index         n,
      const Number* x,
      bool          new_x,
      Number&       obj_value
   ) = 0;

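Ipopt uses this function to evaluate the objective function $f(x)$.

n: the number of optimization variables;
x: the values of the optimization variables at which $f(x)$ is evaluated;
new_x: false if any eval_* function was previously called with the same values of x, true otherwise; it can be ignored;
obj_value: the value of the objective function $f(x)$ (output);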

// returns the value of the objective function
bool HS071_NLP::eval_f(
   Index         n,
   const Number* x,
   bool          new_x,
   Number&       obj_value
)
{
   assert(n == 4);
 
   obj_value = x[0] * x[3] * (x[0] + x[1] + x[2]) + x[2];
 
   return true;
}
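The new_x flag is false when an eval_* function has already been called with the same values of x, so it can be used to avoid recomputing quantities shared between callbacks. The following is a minimal sketch of that idea, assuming a hypothetical `SharedSumCache` class and the free functions `cached_f` and `cached_df_dx3`; none of these names exist in Ipopt, and `Number` is declared locally as a stand-in for Ipopt's type so that the sketch compiles on its own.

```cpp
#include <cassert>

// Stand-in for Ipopt's Number type so the sketch compiles without Ipopt.
using Number = double;

// Hypothetical cache for the term x[0] + x[1] + x[2], which appears both in
// the HS071 objective and in its gradient.
class SharedSumCache
{
public:
   // Returns x[0] + x[1] + x[2], recomputing only when x has changed.
   Number get(const Number* x, bool new_x)
   {
      if( new_x || !valid_ )
      {
         sum_ = x[0] + x[1] + x[2];
         valid_ = true;
         recomputations_++;
      }
      return sum_;
   }

   int recomputations() const
   {
      return recomputations_;
   }

private:
   Number sum_ = 0.0;
   bool   valid_ = false;
   int    recomputations_ = 0;
};

// Objective value, written like eval_f but using the cached sum.
Number cached_f(SharedSumCache& cache, const Number* x, bool new_x)
{
   return x[0] * x[3] * cache.get(x, new_x) + x[2];
}

// Last gradient entry, written like grad_f[3] in eval_grad_f.
Number cached_df_dx3(SharedSumCache& cache, const Number* x, bool new_x)
{
   return x[0] * cache.get(x, new_x);
}
```

For a problem as small as HS071 the saving is negligible; the pattern pays off when the shared term is expensive, such as a simulation that feeds both the objective and the constraints.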

2.5 Ipopt::TNLP::eval_grad_f

   virtual bool eval_grad_f(
      Index         n,
      const Number* x,
      bool          new_x,
      Number*       grad_f
   ) = 0;

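Ipopt uses this function to evaluate the gradient of the objective function, $\nabla f(x)$.

n: the number of optimization variables;
x: the values of the optimization variables at which $\nabla f(x)$ is evaluated;
new_x: false if any eval_* function was previously called with the same values of x, true otherwise; it can be ignored;
grad_f: the gradient $\nabla f(x)$ (output); an array of the same length as x;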

// return the gradient of the objective function grad_{x} f(x)
bool HS071_NLP::eval_grad_f(
   Index         n,
   const Number* x,
   bool          new_x,
   Number*       grad_f
)
{
   assert(n == 4);
 
   grad_f[0] = x[0] * x[3] + x[3] * (x[0] + x[1] + x[2]);
   grad_f[1] = x[0] * x[3];
   grad_f[2] = x[0] * x[3] + 1;
   grad_f[3] = x[0] * (x[0] + x[1] + x[2]);
 
   return true;
}
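Hand-coded derivatives are a common source of errors, so it can be worth comparing them against finite differences before handing them to the solver. The sketch below checks the HS071 gradient with central differences; the function names and the step size `h` are choices made for this illustration. Ipopt also has a built-in derivative checker that is enabled through its `derivative_test` option.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Objective of HS071, as in eval_f.
double hs071_f(const double* x)
{
   return x[0] * x[3] * (x[0] + x[1] + x[2]) + x[2];
}

// Analytic gradient of HS071, as in eval_grad_f.
void hs071_grad_f(const double* x, double* grad_f)
{
   grad_f[0] = x[0] * x[3] + x[3] * (x[0] + x[1] + x[2]);
   grad_f[1] = x[0] * x[3];
   grad_f[2] = x[0] * x[3] + 1;
   grad_f[3] = x[0] * (x[0] + x[1] + x[2]);
}

// Largest absolute difference between the analytic gradient and a central
// finite-difference approximation with step h.
double max_gradient_error(const double* x, double h = 1e-6)
{
   assert(h > 0.0);

   double analytic[4];
   hs071_grad_f(x, analytic);

   double worst = 0.0;
   for( int i = 0; i < 4; i++ )
   {
      double x_plus[4]  = {x[0], x[1], x[2], x[3]};
      double x_minus[4] = {x[0], x[1], x[2], x[3]};
      x_plus[i] += h;
      x_minus[i] -= h;

      const double numeric = (hs071_f(x_plus) - hs071_f(x_minus)) / (2.0 * h);
      worst = std::max(worst, std::fabs(numeric - analytic[i]));
   }
   return worst;
}
```

A large value returned by `max_gradient_error` at a few test points would indicate a mistake in one of the gradient entries.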

2.6 Ipopt::TNLP::eval_g

   virtual bool eval_g(
      Index         n,
      const Number* x,
      bool          new_x,
      Index         m,
      Number*       g
   ) = 0;

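Ipopt uses this function to evaluate the constraint functions $g(x)$.

n: the number of optimization variables;
x: the values of the optimization variables at which $g(x)$ is evaluated;
new_x: false if any eval_* function was previously called with the same values of x, true otherwise; it can be ignored;
m: the number of constraint functions;
g: the values of the constraint functions $g(x)$ (output); an array of length m;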

// return the value of the constraints: g(x)
bool HS071_NLP::eval_g(
   Index         n,
   const Number* x,
   bool          new_x,
   Index         m,
   Number*       g
)
{
   assert(n == 4);
   assert(m == 2);
 
   g[0] = x[0] * x[1] * x[2] * x[3];
   g[1] = x[0] * x[0] + x[1] * x[1] + x[2] * x[2] + x[3] * x[3];
 
   return true;
}

2.7 Ipopt::TNLP::eval_jac_g

   virtual bool eval_jac_g(
      Index         n,
      const Number* x,
      bool          new_x,
      Index         m,
      Index         nele_jac,
      Index*        iRow,
      Index*        jCol,
      Number*       values
   ) = 0;

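Ipopt uses this function to obtain the values of the nonzero entries of the Jacobian of the constraint functions $g(x)$, as well as their row and column indices in the sparse matrix. The entry in row $i$ and column $j$ of the Jacobian is the derivative of $g_i(x)$ with respect to $x_j$.

n: the number of optimization variables;
x: the values of the optimization variables at which the Jacobian is evaluated;
new_x: false if any eval_* function was previously called with the same values of x, true otherwise; it can be ignored;
m: the number of constraint functions;
nele_jac: the number of nonzeros in the Jacobian;
iRow: the row indices of the Jacobian nonzeros; with C-style indexing the indices start from 0;
jCol: the column indices of the Jacobian nonzeros; with C-style indexing the indices start from 0;
values: the values of the Jacobian nonzeros;

Note that: ① the arrays iRow, jCol, and values have the same length, and entry k of each array describes the same nonzero (its row index, its column index, and its value); ② iRow and jCol need to be filled only once. On the first call, x and values are NULL and the function fills iRow and jCol. When Ipopt needs the values, it passes iRow and jCol as NULL and the function fills values.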

// return the structure or values of the Jacobian
bool HS071_NLP::eval_jac_g(
   Index         n,
   const Number* x,
   bool          new_x,
   Index         m,
   Index         nele_jac,
   Index*        iRow,
   Index*        jCol,
   Number*       values
)
{
   assert(n == 4);
   assert(m == 2);
 
   if( values == NULL )
   {
      // return the structure of the Jacobian
 
      // this particular Jacobian is dense
      iRow[0] = 0;
      jCol[0] = 0;
      iRow[1] = 0;
      jCol[1] = 1;
      iRow[2] = 0;
      jCol[2] = 2;
      iRow[3] = 0;
      jCol[3] = 3;
      iRow[4] = 1;
      jCol[4] = 0;
      iRow[5] = 1;
      jCol[5] = 1;
      iRow[6] = 1;
      jCol[6] = 2;
      iRow[7] = 1;
      jCol[7] = 3;
   }
   else
   {
      // return the values of the Jacobian of the constraints
 
      values[0] = x[1] * x[2] * x[3]; // 0,0
      values[1] = x[0] * x[2] * x[3]; // 0,1
      values[2] = x[0] * x[1] * x[3]; // 0,2
      values[3] = x[0] * x[1] * x[2]; // 0,3
 
      values[4] = 2 * x[0]; // 1,0
      values[5] = 2 * x[1]; // 1,1
      values[6] = 2 * x[2]; // 1,2
      values[7] = 2 * x[3]; // 1,3
   }
 
   return true;
}
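The two-phase calling convention (structure first, values later) can be exercised without Ipopt by a small driver. In the sketch below, `jac_g` follows the same convention as eval_jac_g, while `dense_jacobian` is a hypothetical caller written for this illustration that plays the role of the solver; `Index` and `Number` are declared locally as stand-ins for Ipopt's types.

```cpp
#include <array>
#include <cassert>

// Stand-ins for Ipopt's Index and Number types.
using Index  = int;
using Number = double;

constexpr Index kNumJacNonzeros = 8;

// Same convention as eval_jac_g: values == nullptr asks for the structure,
// otherwise the values are requested and iRow/jCol are not touched.
bool jac_g(const Number* x, Index* iRow, Index* jCol, Number* values)
{
   if( values == nullptr )
   {
      // The 2-by-4 Jacobian is dense: entry k sits in row k / 4, column k % 4.
      for( Index k = 0; k < kNumJacNonzeros; k++ )
      {
         iRow[k] = k / 4;
         jCol[k] = k % 4;
      }
   }
   else
   {
      values[0] = x[1] * x[2] * x[3];
      values[1] = x[0] * x[2] * x[3];
      values[2] = x[0] * x[1] * x[3];
      values[3] = x[0] * x[1] * x[2];
      for( Index j = 0; j < 4; j++ )
      {
         values[4 + j] = 2 * x[j];
      }
   }
   return true;
}

// Hypothetical caller that mimics how a solver could use the two calls.
std::array<std::array<Number, 4>, 2> dense_jacobian(const Number* x)
{
   Index  iRow[kNumJacNonzeros];
   Index  jCol[kNumJacNonzeros];
   Number values[kNumJacNonzeros];

   jac_g(nullptr, iRow, jCol, nullptr);  // first call: structure only
   jac_g(x, nullptr, nullptr, values);   // later call: values only

   std::array<std::array<Number, 4>, 2> jac{};  // zero-initialized
   for( Index k = 0; k < kNumJacNonzeros; k++ )
   {
      jac[iRow[k]][jCol[k]] = values[k];
   }
   return jac;
}
```

Because the structure is requested only once, the order of the entries written to `values` must match the order used when `iRow` and `jCol` were filled.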

2.8 Ipopt::TNLP::eval_h

   virtual bool eval_h(
      Index         n,
      const Number* x,
      bool          new_x,
      Number        obj_factor,
      Index         m,
      const Number* lambda,
      bool          new_lambda,
      Index         nele_hess,
      Index*        iRow,
      Index*        jCol,
      Number*       values
   )

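Ipopt uses this function to obtain the values of the nonzero entries of the Hessian of the Lagrangian, as well as their row and column indices in the sparse matrix.

n: the number of optimization variables;
x: the values of the optimization variables at which the Hessian is evaluated;
new_x: false if any eval_* function was previously called with the same values of x, true otherwise; it can be ignored;
obj_factor: the factor $\sigma_f$ in front of the objective term of the Hessian;
m: the number of constraint functions;
lambda: the Lagrange multipliers $\lambda$ of the constraints;
new_lambda: false if any eval_* function was previously called with the same values of lambda, true otherwise; it is usually ignored;
nele_hess: the number of nonzeros in the Hessian (lower triangle only);
iRow: the row indices of the Hessian nonzeros; with C-style indexing the indices start from 0;
jCol: the column indices of the Hessian nonzeros; with C-style indexing the indices start from 0;
values: the values of the Hessian nonzeros;

Note that: ① the arrays iRow, jCol, and values have the same length, and entry k of each array describes the same nonzero (its row index, its column index, and its value); ② iRow and jCol need to be filled only once. On the first call, x, lambda, and values are NULL and the function fills iRow and jCol. When Ipopt needs the values, it passes iRow and jCol as NULL and the function fills values; ③ because the Hessian is symmetric, Ipopt uses only its lower triangle; ④ by default Ipopt requires the Hessian, but it is not needed when a quasi-Newton method is used.

In this example the Hessian is dense, but it is still passed in the sparse format.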

//return the structure or values of the Hessian
bool HS071_NLP::eval_h(
   Index         n,
   const Number* x,
   bool          new_x,
   Number        obj_factor,
   Index         m,
   const Number* lambda,
   bool          new_lambda,
   Index         nele_hess,
   Index*        iRow,
   Index*        jCol,
   Number*       values
)
{
   assert(n == 4);
   assert(m == 2);
 
   if( values == NULL )
   {
      // return the structure. This is a symmetric matrix, fill the lower left
      // triangle only.
 
      // the hessian for this problem is actually dense
      Index idx = 0;
      for( Index row = 0; row < 4; row++ )
      {
         for( Index col = 0; col <= row; col++ )
         {
            iRow[idx] = row;
            jCol[idx] = col;
            idx++;
         }
      }
 
      assert(idx == nele_hess);
   }
   else
   {
      // return the values. This is a symmetric matrix, fill the lower left
      // triangle only
 
      // fill the objective portion
      values[0] = obj_factor * (2 * x[3]); // 0,0
 
      values[1] = obj_factor * (x[3]);     // 1,0
      values[2] = 0.;                      // 1,1
 
      values[3] = obj_factor * (x[3]);     // 2,0
      values[4] = 0.;                      // 2,1
      values[5] = 0.;                      // 2,2
 
      values[6] = obj_factor * (2 * x[0] + x[1] + x[2]); // 3,0
      values[7] = obj_factor * (x[0]);                   // 3,1
      values[8] = obj_factor * (x[0]);                   // 3,2
      values[9] = 0.;                                    // 3,3
 
      // add the portion for the first constraint
      values[1] += lambda[0] * (x[2] * x[3]); // 1,0
 
      values[3] += lambda[0] * (x[1] * x[3]); // 2,0
      values[4] += lambda[0] * (x[0] * x[3]); // 2,1
 
      values[6] += lambda[0] * (x[1] * x[2]); // 3,0
      values[7] += lambda[0] * (x[0] * x[2]); // 3,1
      values[8] += lambda[0] * (x[0] * x[1]); // 3,2
 
      // add the portion for the second constraint
      values[0] += lambda[1] * 2; // 0,0
 
      values[2] += lambda[1] * 2; // 1,1
 
      values[5] += lambda[1] * 2; // 2,2
 
      values[9] += lambda[1] * 2; // 3,3
   }
 
   return true;
}
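Since only the lower triangle is passed, it can help during debugging to expand it back into the full symmetric matrix and inspect individual entries. The sketch below does this for HS071. The names `hs071_hessian_lower`, `expand_symmetric`, and `Matrix4` are hypothetical and introduced only for this illustration; the ten values follow the same row-by-row order as the structure loop in eval_h.

```cpp
#include <array>
#include <cassert>

using Matrix4 = std::array<std::array<double, 4>, 4>;

// Lower-triangle values of
//   obj_factor * Hess f + lambda[0] * Hess g1 + lambda[1] * Hess g2
// for HS071, in the order (0,0), (1,0), (1,1), (2,0), ..., (3,3).
std::array<double, 10> hs071_hessian_lower(const double* x, double obj_factor, const double* lambda)
{
   return {{
      obj_factor * 2 * x[3] + lambda[1] * 2,                            // 0,0
      obj_factor * x[3] + lambda[0] * x[2] * x[3],                      // 1,0
      lambda[1] * 2,                                                    // 1,1
      obj_factor * x[3] + lambda[0] * x[1] * x[3],                      // 2,0
      lambda[0] * x[0] * x[3],                                          // 2,1
      lambda[1] * 2,                                                    // 2,2
      obj_factor * (2 * x[0] + x[1] + x[2]) + lambda[0] * x[1] * x[2],  // 3,0
      obj_factor * x[0] + lambda[0] * x[0] * x[2],                      // 3,1
      obj_factor * x[0] + lambda[0] * x[0] * x[1],                      // 3,2
      lambda[1] * 2                                                     // 3,3
   }};
}

// Mirror the lower triangle across the diagonal to get the full matrix.
Matrix4 expand_symmetric(const std::array<double, 10>& lower)
{
   Matrix4 full{};  // zero-initialized
   int idx = 0;
   for( int row = 0; row < 4; row++ )
   {
      for( int col = 0; col <= row; col++ )
      {
         full[row][col] = lower[idx];
         full[col][row] = lower[idx];
         idx++;
      }
   }
   assert(idx == 10);
   return full;
}
```

For example, at x = (1, 5, 5, 1) with obj_factor = 1 and lambda = (1, 1), the entry in row 3 and column 0 is (2 + 5 + 5) + 5 * 5 = 37, and the expanded matrix holds the same value at row 0, column 3.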

2.9 Ipopt::TNLP::finalize_solution

   virtual void finalize_solution(
      SolverReturn               status,
      Index                      n,
      const Number*              x,
      const Number*              z_L,
      const Number*              z_U,
      Index                      m,
      const Number*              g,
      const Number*              lambda,
      Number                     obj_value,
      const IpoptData*           ip_data,
      IpoptCalculatedQuantities* ip_cq
   ) = 0;

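Ipopt calls this function to hand back the result of the solve. Only the most important arguments are described here.

status: the status of the solver;
- SUCCESS: a locally optimal point was found that satisfies the convergence tolerances;
- MAXITER_EXCEEDED: the maximum number of iterations was exceeded;
- CPUTIME_EXCEEDED: the maximum solve time was exceeded;
- STOP_AT_ACCEPTABLE_POINT: the solver stopped at a point that does not meet the desired tolerances but is within the acceptable tolerances;
- LOCAL_INFEASIBILITY: the solver converged to a point of local infeasibility and found no feasible solution; this is usually caused by inconsistent bounds or constraints;
x: the values of the optimization variables at the (local) solution;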

void HS071_NLP::finalize_solution(
   SolverReturn               status,
   Index                      n,
   const Number*              x,
   const Number*              z_L,
   const Number*              z_U,
   Index                      m,
   const Number*              g,
   const Number*              lambda,
   Number                     obj_value,
   const IpoptData*           ip_data,
   IpoptCalculatedQuantities* ip_cq
)
{
   // here is where we would store the solution to variables, or write to a file, etc
   // so we could use the solution.
 
   // For this example, we write the solution to the console
   std::cout << std::endl << std::endl << "Solution of the primal variables, x" << std::endl;
   for( Index i = 0; i < n; i++ )
   {
      std::cout << "x[" << i << "] = " << x[i] << std::endl;
   }
 
   std::cout << std::endl << std::endl << "Solution of the bound multipliers, z_L and z_U" << std::endl;
   for( Index i = 0; i < n; i++ )
   {
      std::cout << "z_L[" << i << "] = " << z_L[i] << std::endl;
   }
   for( Index i = 0; i < n; i++ )
   {
      std::cout << "z_U[" << i << "] = " << z_U[i] << std::endl;
   }
 
   std::cout << std::endl << std::endl << "Objective value" << std::endl;
   std::cout << "f(x*) = " << obj_value << std::endl;
 
   std::cout << std::endl << "Final value of the constraints:" << std::endl;
   for( Index i = 0; i < m; i++ )
   {
      std::cout << "g(" << i << ") = " << g[i] << std::endl;
   }
}
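In an application the solution is usually stored for later use rather than printed. The following sketch shows one way this could look. The `Status` enum is a local stand-in for the status values discussed above so that the code compiles without the Ipopt headers, and `StoredSolution` and `store_solution` are hypothetical names. Treating an acceptable point as usable is a design assumption made here, not something Ipopt prescribes.

```cpp
#include <cassert>
#include <vector>

// Local stand-in for the status values discussed above, so that the sketch
// compiles without the Ipopt headers. Real code would use Ipopt's own enum.
enum class Status
{
   Success,
   MaxIterExceeded,
   CpuTimeExceeded,
   StopAtAcceptablePoint,
   LocalInfeasibility
};

// Hypothetical container that a finalize_solution override could fill.
struct StoredSolution
{
   bool                usable = false;
   double              objective = 0.0;
   std::vector<double> x;
};

StoredSolution store_solution(Status status, int n, const double* x, double obj_value)
{
   assert(n >= 0);

   StoredSolution solution;
   // Assumption of this sketch: both a fully converged point and an
   // "acceptable" point are good enough for the caller.
   solution.usable = (status == Status::Success || status == Status::StopAtAcceptablePoint);
   solution.objective = obj_value;
   solution.x.assign(x, x + n);  // copy, because Ipopt owns the original array
   return solution;
}
```

The copy matters because the arrays passed to finalize_solution belong to the solver and should not be referenced after the callback returns.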

2.10 main function

The sections above override the functions of Ipopt::TNLP. A main function that calls Ipopt is still needed to run the solve.

#include "IpIpoptApplication.hpp"
#include "hs071_nlp.hpp"
 
#include <iostream>
 
using namespace Ipopt;
 
int main(
   int    /*argc*/,
   char** /*argv*/
)
{
   // Create a new instance of your nlp
   //  (use a SmartPtr, not raw)
   SmartPtr<TNLP> mynlp = new HS071_NLP();
 
   // Create a new instance of IpoptApplication
   //  (use a SmartPtr, not raw)
   // We are using the factory, since this allows us to compile this
   // example with an Ipopt Windows DLL
   SmartPtr<IpoptApplication> app = IpoptApplicationFactory();
 
   // Change some options
   // Note: The following choices are only examples, they might not be
   //       suitable for your optimization problem.
   app->Options()->SetNumericValue("tol", 3.82e-6);
   app->Options()->SetStringValue("mu_strategy", "adaptive");
   app->Options()->SetStringValue("output_file", "ipopt.out");
   // The following overwrites the default name (ipopt.opt) of the options file
   // app->Options()->SetStringValue("option_file_name", "hs071.opt");
 
   // Initialize the IpoptApplication and process the options
   ApplicationReturnStatus status;
   status = app->Initialize();
   if( status != Solve_Succeeded )
   {
      std::cout << std::endl << std::endl << "*** Error during initialization!" << std::endl;
      return (int) status;
   }
 
   // Ask Ipopt to solve the problem
   status = app->OptimizeTNLP(mynlp);
 
   if( status == Solve_Succeeded )
   {
      std::cout << std::endl << std::endl << "*** The problem solved!" << std::endl;
   }
   else
   {
      std::cout << std::endl << std::endl << "*** The problem FAILED!" << std::endl;
   }
 
   // As the SmartPtrs go out of scope, the reference count
   // will be decremented and the objects will automatically
   // be deleted.
 
   return (int) status;
}
