Semidefinite programming is a generalization of
linear programming.
The topic is quite deep; here we give only a gentle introduction.
A semidefinite program (SDP) is a linear program together with constraints requiring
that some of the variables form a positive semidefinite matrix.
So, what is a semidefinite matrix?
Semidefinite matrices
An n × n matrix A is positive semidefinite if for every vector x ∈ ℜ^n, x^T A x ≥ 0.
Examples:
- The identity matrix I, because x^T I x = x·x ≥ 0.
- Any diagonal matrix with non-negative entries.
- Given n vectors v_1, v_2, …, v_n ∈ ℜ^n, let A_ij = v_i · v_j (the dot product of the i'th and j'th vectors).
Letting V be the matrix whose i'th column is v_i, we can write A = V^T V.
Then x^T A x = x^T V^T V x = (Vx)^T (Vx) = (Vx)·(Vx) ≥ 0. (A numerical sketch of this example follows below.)
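As a quick sanity check, here is a small numerical sketch of the third example; numpy is assumed, and the matrix V and the test vectors are just random choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# n arbitrary vectors v_1, ..., v_n, taken as the columns of V
n = 5
V = rng.standard_normal((n, n))

# Gram matrix: A_ij = v_i . v_j, i.e. A = V^T V
A = V.T @ V

# x^T A x = (Vx).(Vx) >= 0 for any x (up to floating-point rounding)
for _ in range(1000):
    x = rng.standard_normal(n)
    assert x @ A @ x >= -1e-9
```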
Proposition: The following statements are equivalent for a symmetric matrix A:
- A is positive semidefinite.
- All eigenvalues of A are non-negative.
- There exists a matrix V such that A = V^T V. (Thus, the third example above is the general case.)
Exercise: prove the above proposition. You will need to know about eigenvectors.
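This is not a proof, but the equivalences can at least be exercised numerically. The sketch below starts from a matrix that is PSD by construction and checks the other two conditions, assuming numpy's routines for symmetric matrices (eigvalsh, eigh):

```python
import numpy as np

rng = np.random.default_rng(1)

# A matrix that is PSD by construction (a Gram matrix, as in the third example).
n = 4
B = rng.standard_normal((n, n))
A = B.T @ B

# Condition 2: all eigenvalues are non-negative.
assert np.all(np.linalg.eigvalsh(A) >= -1e-9)

# Condition 3: recover V with A = V^T V from the eigendecomposition A = Q diag(lam) Q^T,
# taking V = diag(sqrt(lam)) Q^T.
lam, Q = np.linalg.eigh(A)
V = np.diag(np.sqrt(np.clip(lam, 0, None))) @ Q.T
assert np.allclose(V.T @ V, A)
```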
SDP has been studied extensively within the mathematical programming community
(mainly as a part of Operations Research). It has also been used in the design of approximation
algorithms for combinatorial optimization problems; for example, Lovász showed how to
use SDP to compute the so-called theta function of a graph in polynomial time.
(The theta function gives a number that is between the maximum clique size
and the minimum number of colors needed to color a graph so that no edge
has endpoints of the same color. Both of these quantities are NP-hard to compute.)
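To make this concrete, one standard SDP formulation of the theta function maximizes ∑_ij X_ij over PSD matrices X with trace 1 and X_ij = 0 on every edge. The sketch below writes this down for the 5-cycle (whose theta function is known to be √5 ≈ 2.236); it assumes the cvxpy modeling package with an SDP-capable solver installed, which is not part of anything discussed above:

```python
import cvxpy as cp

# Edges of the 5-cycle C5.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

X = cp.Variable((n, n), PSD=True)                   # X is constrained to be PSD
constraints = [cp.trace(X) == 1]
constraints += [X[i, j] == 0 for (i, j) in edges]   # zero on the edges of the graph

# theta(G) = max sum_ij X_ij subject to the constraints above
problem = cp.Problem(cp.Maximize(cp.sum(X)), constraints)
problem.solve()
print(problem.value)                                # approximately sqrt(5) ≈ 2.236
```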
Facts we will use about semidefinite programming:
- Semidefinite programs can be solved within additive error ε in polynomial time, using the EllipsoidMethod or InteriorPointMethods.
Thm: For any n, the space of positive semidefinite n × n matrices is convex.
proof:
- Think of n × n matrices as vectors in ℜ^(n²), with the matrix dot-product A·B = ∑_ij A_ij B_ij.
- Then the constraint (∀ x ∈ ℜ^n) x^T A x ≥ 0 can be rewritten as (∀ x ∈ ℜ^n) (xx^T)·A ≥ 0, since x^T A x = ∑_ij x_i x_j A_ij = (xx^T)·A.
- Thus the space of PSD matrices is an intersection of halfspaces, one for each x ∈ ℜ^n, and an intersection of halfspaces is convex.
- We also need to constrain A to be symmetric, but this follows from the linear constraints A_ij ≤ A_ji for each ordered pair (i, j): the constraint for (i, j) together with the one for (j, i) forces A_ij = A_ji.
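The identity used above, x^T A x = (xx^T)·A, and the resulting convexity are easy to check numerically. A small sketch with numpy (random matrices, arbitrary tolerance):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# The identity behind the halfspace view: x^T A x equals the matrix dot product (x x^T) . A.
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                    # an arbitrary symmetric matrix
x = rng.standard_normal(n)
assert np.isclose(x @ A @ x, np.sum(np.outer(x, x) * A))

# Consequence of being an intersection of halfspaces: a convex combination
# of two PSD matrices is again PSD (checked via the smallest eigenvalue).
B1, B2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
P1, P2 = B1.T @ B1, B2.T @ B2        # two PSD (Gram) matrices
t = 0.3
assert np.linalg.eigvalsh(t * P1 + (1 - t) * P2).min() >= -1e-9
```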
For those familiar with the EllipsoidMethod,
the above proof actually reveals more:
Thm: If a matrix A is not PSD, then a hyperplane that separates A from the space of PSD matrices can be found in polynomial time.
proof:
- First check the constraints A_ij ≤ A_ji for each ordered pair (i, j). If all of these hold, then A is symmetric. Otherwise, a violated constraint gives the separating hyperplane.
- Assuming A is symmetric, we either need to find an x such that x^T A x < 0 (a violated constraint) or conclude that no such x exists, i.e. that A is PSD.
- We claim without proof (leaving it as an exercise) that x^T A x ≥ 0 holds for all x if and only if it holds for every eigenvector of A. Thus it suffices to compute the eigenvectors of A and check the condition for each one. Either they all satisfy the condition, so A is PSD, or some eigenvector v has v^T A v < 0, and then the halfspace (vv^T)·B ≥ 0, which contains every PSD matrix B but not A, gives the separating hyperplane.
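Putting the two steps together, a sketch of such a separation oracle might look as follows; numpy is assumed, and the function name, sign conventions, and tolerance are ours:

```python
import numpy as np

def separating_hyperplane(A, tol=1e-9):
    """Return None if A is (numerically) PSD; otherwise return a matrix H with
    H . B >= 0 for every PSD matrix B but H . A < 0, where "." is the matrix
    dot product sum_ij H_ij B_ij."""
    # Step 1: symmetry check.  A violated pair gives a hyperplane on which
    # every symmetric (hence every PSD) matrix B has H . B = 0, while H . A < 0.
    if not np.allclose(A, A.T, atol=tol):
        D = A - A.T
        i, j = np.unravel_index(np.argmax(np.abs(D)), D.shape)
        H = np.zeros_like(A, dtype=float)
        s = np.sign(D[i, j])
        H[i, j], H[j, i] = -s, s
        return H

    # Step 2: eigenvector check.  A negative eigenvalue lam with eigenvector v
    # means v^T A v = lam < 0, so the halfspace (v v^T) . B >= 0 separates A.
    lam, Q = np.linalg.eigh(A)
    if lam[0] < -tol:
        v = Q[:, 0]
        return np.outer(v, v)

    return None  # A is PSD; no separating hyperplane exists
```

For example, separating_hyperplane(np.diag([1.0, -1.0])) returns the outer product of the second standard basis vector with itself.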
A corollary to this theorem is that semidefinite programs can be solved using the EllipsoidMethod in polynomial time.
References:
- Chapter 26 of Approximation Algorithms by Vijay Vazirani