Top: MaxCoverageByGreedy

LP:

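 max ∑_e y[e] c_e   subject to
  y[e] ≤ 1                 for each element e
  y[e] ≤ ∑_{S∋e} x[S]      for each element e
  ∑_S x[S] c_S ≤ k
  x, y ≥ 0.

Here c_e is the cost (weight) of element e, c_S is the cost of set S, k is the budget, and OPT denotes the optimal value of this LP.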

Greedy algorithm:

 1. repeat until cost of chosen sets reaches k or more:
 2.   choose the set S maximizing (∑_{e∈S, e n.y.c.} c_e) / c_S,
       where "n.y.c." means "not yet covered": the sum ranges over the elements of S that are not yet covered by a chosen set.
 3. return the chosen sets.
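
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name, the dictionary-based input format, and the early exit when no set covers anything new are all assumptions added here.

```python
def greedy_max_coverage(sets, set_cost, elem_value, k):
    """Greedy rule from the steps above (all names here are illustrative).

    sets:       dict mapping set name -> Python set of elements
    set_cost:   dict mapping set name -> positive cost c_S
    elem_value: dict mapping element -> nonnegative cost c_e
    k:          budget
    """
    chosen, covered, spent = [], set(), 0

    def density(name):
        # Total c_e over the not-yet-covered elements of the set, per unit of set cost.
        gain = sum(elem_value[e] for e in sets[name] - covered)
        return gain / set_cost[name]

    while spent < k:
        best = max(sets, key=density)  # ties go to the first name in dict order
        if density(best) == 0:
            break  # assumption: stop early once nothing new can be covered
        chosen.append(best)
        covered |= sets[best]
        spent += set_cost[best]
    return chosen, covered, spent


sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}
set_cost = {"A": 1, "B": 1, "C": 2}
elem_value = {e: 1 for e in range(1, 8)}
chosen, covered, spent = greedy_max_coverage(sets, set_cost, elem_value, k=2)
print(chosen, spent)  # ['A', 'C'] 3
```

In this toy instance the last pick takes the total cost to 3, past the budget k = 2, which is what the stopping rule in step 1 allows.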

Analysis:

The algorithm maintains the invariant that

: cost(sets chosen so far)/k + ln(1 - cost(elts covered so far)/OPT) ≤ 0.

The invariant is initially true. If you choose a set S, then the LHS above increases by at most

: cost(S)/k - cost(elts newly covered by S)/[OPT - cost(elts covered so far)].

(Use ln(X) ≤ X-1 to prove it.)
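
The hint can be spelled out as follows. The shorthand C for cost(elts covered so far) and Δ for cost(elts newly covered by S) is introduced here only for this derivation, and the sketch assumes C + Δ < OPT so that both logarithms are defined.

```latex
\begin{align*}
\ln\!\Big(1 - \frac{C + \Delta}{\mathrm{OPT}}\Big) - \ln\!\Big(1 - \frac{C}{\mathrm{OPT}}\Big)
  &= \ln\frac{\mathrm{OPT} - C - \Delta}{\mathrm{OPT} - C} \\
  &= \ln\!\Big(1 - \frac{\Delta}{\mathrm{OPT} - C}\Big) \\
  &\le -\frac{\Delta}{\mathrm{OPT} - C}
  \qquad \text{(by } \ln X \le X - 1\text{)}.
\end{align*}
```

Adding the increase cost(S)/k in the first term of the invariant gives the stated bound.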

Claim: If S is chosen randomly according to the distribution defined by an optimal LP solution, then the expectation of the above quantity is non-positive.

Assume for now that the claim is true. Then some set S makes the quantity above non-positive. Rearranging, the quantity is non-positive iff

: cost(S)/cost(elts newly covered by S) ≤ k/[OPT - cost(elts covered so far)].

Since the greedy algorithm chooses the set that minimizes the LHS, the inequality holds for the set it chooses. Thus, the greedy algorithm maintains the invariant. When the algorithm stops, cost(sets chosen so far) ≥ k, so the invariant gives ln(1 - cost(elts covered so far)/OPT) ≤ -1, that is, cost(elts covered so far)/OPT ≥ 1-1/e.
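
The invariant can be spot-checked numerically on a small instance. In the sketch below every name is hypothetical, and the best integral solution of cost at most k (found by brute force) stands in for OPT. That value is at most the LP optimum, so the rearranged invariant, covered ≥ (1 - e^(-spent/k))·OPT, should still hold with it in place of OPT.

```python
import math
from itertools import combinations


def greedy_trace(sets, set_cost, elem_value, k):
    """Run the greedy rule; return (spent, covered value) after each pick."""
    covered, spent, trace = set(), 0.0, []

    def density(name):
        return sum(elem_value[e] for e in sets[name] - covered) / set_cost[name]

    while spent < k:
        best = max(sets, key=density)
        if density(best) == 0:
            break  # assumption: stop once nothing new can be covered
        covered |= sets[best]
        spent += set_cost[best]
        trace.append((spent, sum(elem_value[e] for e in covered)))
    return trace


def best_within_budget(sets, set_cost, elem_value, k):
    """Brute-force the best integral solution of cost at most k (stand-in for OPT)."""
    names, best = list(sets), 0
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            if sum(set_cost[n] for n in combo) <= k:
                union = set().union(*(sets[n] for n in combo))
                best = max(best, sum(elem_value[e] for e in union))
    return best


sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}
set_cost = {"A": 1, "B": 1, "C": 2}
elem_value = {e: 1 for e in range(1, 8)}
k = 2

trace = greedy_trace(sets, set_cost, elem_value, k)
v = best_within_budget(sets, set_cost, elem_value, k)
for spent, value in trace:
    bound = (1 - math.exp(-spent / k)) * v
    print(spent, value, round(bound, 3), value >= bound)
```

A check like this only illustrates the bound on one instance; it is not a substitute for the proof.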

Proving the claim:

Let (x*,y*) be an optimal solution to the LP, and let |x*| = ∑_S x*[S]. Choose a set S at random, where each set S is chosen with probability x*[S]/|x*|.

Then E[cost(S)] = ∑_S c_S x*[S]/|x*| ≤ k/|x*|.

The probability that a given element e is in S is ∑_{S∋e} x*[S]/|x*|, which is at least y*[e]/|x*| by the LP constraint on y*[e]. So E[cost of elements newly covered by S] is at least

: ∑_{e n.y.c.} y*[e]c_e / |x*|.

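Letting y[e] be 1 if e is covered so far and 0 otherwise, this equals

: [∑_e y*[e]c_e - ∑_{e:y[e]=1} y*[e]c_e] / |x*| ≥ [OPT - ∑_e y[e]c_e] / |x*|,

where the inequality holds because y*[e] ≤ 1. The right-hand side is [OPT - cost(elts covered so far)] / |x*|. By linearity of expectation, the expected value of the quantity in the claim is therefore at most (k/|x*|)/k - 1/|x*| = 0, which proves the claim.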