By default, the functions cp() and cpl() do not exploit problem structure. Two mechanisms are provided for implementing customized solvers that take advantage of problem structure.
![]() | (9.2) |
where
The matrix W depends on the current iterates and is defined as follows. Suppose
Then W is a block-diagonal matrix,
with the following diagonal blocks.
This transformation is symmetric:
This transformation is symmetric:
where
These transformations are also symmetric:
In general, this operation is not symmetric, and
It is often possible to exploit problem structure to solve (9.2) faster than by standard methods. The last argument kktsolver of cp() allows the user to supply a Python function for solving the KKT equations. This function will be called as ”f = kktsolver(x, z, W)”. The argument x is the point at which the derivatives in the KKT matrix are evaluated. z is a positive vector of length it m + 1, containing the coefficients in the 1,1 block H. W is a dictionary that contains the parameters of the scaling:
The function call ”f = kktsolver(x, z, W)” should return a routine for solving the KKT system (9.2) defined by x, z, W. It will be called as ”f(bx, by, bz)”. On entry, bx, by, bz contain the righthand side. On exit, they should contain the solution of the KKT system, with the last component scaled, i.e., on exit,
The role of the argument kktsolver in the function cpl() is similar, except that in (9.2),
If the argument G of conelp() is a Python function, it should be defined as follows:
G(u, v [, alpha[, beta[, trans]]])
This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.
Similarly, if the argument A is a Python function, then it must be defined as follows.
A(u, v [, alpha[, beta[, trans]]])
This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.
In a similar way, when the first argument F() of cp() returns matrices of first derivatives or second derivatives Df, H, these matrices can be specified as Python functions. If Df is a Python function, it should be defined as follows:
Df(u, v [, alpha[, beta[, trans]]])
This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.
If H is a Python function, it should be defined as follows:
H(u, v [, alpha[, beta]])
This evaluates the matrix-vector product
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.
If G, A, Df, or H are Python functions, then the argument kktsolver must also be provided.
As an example, we consider the unconstrained problem
where A is an m by n matrix with m less than n. The Hessian of the objective is diagonal plus a low-rank term:
We can exploit this property when solving (9.2) by applying the matrix inversion lemma. We first solve
and then obtain
The following code follows this method. It also uses BLAS functions for matrix-matrix and matrix-vector products.
from cvxopt.base import matrix, spdiag, mul, div, log
from cvxopt import base, blas, lapack, solvers, base def l2ac(A, b): """ Solves minimize (1/2) * ||A*x-b||_2^2 - sum log (1-xi^2) assuming A is m x n with m << n. """ m, n = A.size def F(x = None, z = None): if x is None: return 0, matrix(0.0, (n,1)) if max(abs(x)) >= 1.0: return None # r = A*x - b r = -b blas.gemv(A, x, r, beta = -1.0) w = x**2 f = 0.5 * blas.nrm2(r)**2 - sum(log(1-w)) # gradf = A’*r + 2.0 * x ./ (1-w) gradf = div(x, 1.0 - w) blas.gemv(A, r, gradf, trans = ’T’, beta = 2.0) if z is None: return f, gradf.T else: def Hf(u, v, alpha = 1.0, beta = 0.0): # v := alpha * (A’*A*u + 2*((1+w)./(1-w)).*u + beta *v v *= beta v += 2.0 * alpha * mul(div(1.0+w, (1.0-w)**2), u) blas.gemv(A, u, r) blas.gemv(A, r, v, alpha = alpha, beta = 1.0, trans = ’T’) return f, gradf.T, Hf # Custom solver for the Newton system # # z[0]*(A’*A + D)*x = bx # # where D = 2 * (1+x.^2) ./ (1-x.^2).^2. We apply the matrix inversion # lemma and solve this as # # (A * D^-1 *A’ + I) * v = A * D^-1 * bx / z[0] # D * x = bx / z[0] - A’*v. S = matrix(0.0, (m,m)) Asc = matrix(0.0, (m,n)) v = matrix(0.0, (m,1)) def Fkkt(x, z, W): ds = (2.0 * div(1 + x**2, (1 - x**2)**2))**-0.5 base.gemm(A, spdiag(ds), Asc) blas.syrk(Asc, S) S[::m+1] += 1.0 lapack.potrf(S) a = z[0] def g(x, y, z): x[:] = mul(x, ds) / a blas.gemv(Asc, x, v) lapack.potrs(S, v) blas.gemv(Asc, v, x, alpha = -1.0, beta = 1.0, trans = ’T’) x[:] = mul(x, ds) return g return solvers.cp(F, kktsolver = Fkkt)[’x’] |