Design and Analysis of Algorithms Neal Young
Computer Science 45
Notes on generating functions My intention here is to just give you a brief introduction to pique your interest and familiarize you with the ideas. Check out the reference by Vitter and Flajolet [3] for further and more comprehensive info on generating functions. Another reference for discrete math in general is the text Concrete Math [2]. I expect you could find Mathematica or Maple packages to play with on the net, one possible lead is [1].
How many distinct n-node binary trees are there? The ordinary generating function for the set of binary trees is
where is the number of binary trees of size n. In the function b, there is one term of ``size'' n for each tree of size n. Postponing for a minute the question of what purpose this serves, can we find a simpler form for the function b? Here is a simple set of rules defining the binary trees:
Here `` '' represents a single node and `` '' means there is a size-preserving bijection (a one-to-one and onto function f such that the size of x equals the size of f(x)) between the set on the left and the set on the right. Replacing each root `` '' by a ``z'' (a term of ``size'' 1) and replacing the set union by addition yields the corresponding equivalence:
Solving for b(z) using the quadratic formula yields
From this closed form and a little algebra, we can see that b(z) is well-defined as long as -- for larger values, the term inside the square root becomes negative. Amazingly, from this it follows directly (for reasons discussed below) that grows roughly like . With a little more work, an exact form for can be obtained.
The Fibonacci numbers are defined by the following recurrence:
The corresponding generating function is Can we find a simple form for f(z)? Using the recurrence, we get
Solving for f yields
From this we can see that f is well-defined as long as z is less than -- the smallest root of . From the general principle described below, it follows that grows roughly like .
In the above examples, we used the following rule of thumb:
We can summarize this by saying that the best exponential approximation to the rate of growth of is , where r is the smallest singularity of f(z), in other words, r is the largest value such that for all z<r, f(z) is well-defined.
Why is this the case? If f(a) is well-defined, then the infinite sum converges. Its terms must tend to zero as n tends to infinity:
This establishes the first part of the principle: .
On the other hand, let b and c be as in the second part of the principle: f(b) is not well-defined and b<c. Suppose for contradiction that is . Then , so that
This contradicts our assumption that f(b) was not well-defined.
Here are a few general principles for deriving generating functions: Suppose sets A and B have generating functions a(z) and b(z), respectively.
Why are these principles true? Take the Cartesian product rule for example. For every element and , there is a term in a(z) and a term in b(z), where i and j are the sizes of and , respectively. In the product a(z)b(z), the pair thus contributes a single term of ``size'' equal to the size of the pair . More formally, we can write
Since the number of pairs of size n in is , this proves the principle.
How many k-digit decimal numbers are there whose digits sum to n? Let D be the set of digits . For this problem, we think of each digit as having a size equal to itself. With this interpretation, the generating function for d(z) is .
The set of k-digit numbers is then . The generating function for is . Here the ``size'' we associate with a k-digit number is just the sum of the sizes of the digits.
The singularity principle says that the nth coefficient grows like -- this time it's not so useful! On the other hand, note that for fixed k, this generating function has only finitely many terms, so we have to be a little more careful when asking questions about the asymptotics.
Let be the number of distinct rooted trees of size n. We can define the set of rooted trees as follows:
(Recall that represents finite sequences of elements of T.) The generating function t(z) thus satisfies
Solving for t(z) using the quadratic formula yields
From the singularity principle, it follows that grows roughly like .
The singularity principle is useful for getting rough estimates of the exponential rate of growth of the coefficients of a generating function. Even more can be said about a generating function from its singularities, however, the proper tool for this is complex analysis and we don't pursue it further here.
Instead we give a few simple examples where exact expressions can be obtained for the coefficients.
Suppose you have a simple form f(z) for the generating function , and you want a simple form for -- that is, you want to introduce a linear term. Note that , so that the answer you want is just zf'(z). For instance, applying this to the generating function gives .
You can also use differentation to get the exact form for the coefficients. The general rule is
Here represents the kth derivative of f. To prove it, first use induction to show that , then substitute k=n and z=0 (so all terms but the leading one drop out) to get .
Repeated differentation can often be messy, but not always. For example,
This holds for arbitrary m, not just integer values. To prove it, just note that differentiating n times yields and then apply the preceding theorem.
From it follows by substituting 2z for z that . Similarly by substituting for z one obtains .
Recall that the generating function for the Fibonacci numbers is
Factor the denominator and separate into partial fractions:
Here 1/a and 1/b are the roots of : . Letting denote the coefficient of in f(z), we have
This document was generated using the LaTeX2HTML translator Version 96.1 (Feb 5, 1996) Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html -no_navigation -split 0 gen-fns.
The translation was initiated by Neal Young on Fri Apr 18 09:47:38 EDT 1997
Suppose that f(r) is well-defined but f(z) is not well-defined for z>r. Compare the sum (which converges) to the sum (which diverges). This will give you better upper bounds on . Suppose further that f' (the derivative of f) is not well-defined at r. Compare the sum (the derivative at r, which diverges) to (which converges). This will give you better lower bounds.
If you are not so lucky that f(r) is well-defined and f'(r) isn't, then consider integrating or differentiating f several times so that the resulting function has the property you want. Recall that each differentation introduces a linear term in each coefficient, whereas integration factors one out.