General comment re inclusion/exclusion principle and the sieve formula

I believe the textbook uses what may seem a rather formal presentation of the principle, where one explicitly describes a family of sets of objects {A_1, A_2, ..., A_n}, which usually are subsets of some larger universal set, and then the so-called sieve formula expresses the cardinality of the union of sets in this family as the sum of the respective cardinalities of the individual sets in the family, MINUS the sum of the cardinalities of all their pairwise intersections, PLUS the sum of the cardinalities of all their triple (3-way) intersections, MINUS the sum of the cardinalities of all their quadruple (4-way) intersections, etc. Just as with mathematical induction, my advice to the students is to follow this formalism when writing up solutions involving the inclusion-exclusion principle until you have really mastered its use. In other words, take care to explicitly describe your family of sets -- give each one a name or indexed symbol and clearly describe its contents. Then compute the cardinalities of the sets and their various intersections, then apply the sieve formula using the appropriate signs. The first benefit is that it can help clarify your thinking about the problem as you work out the solution. The next, more important benefit is that it makes it clear to the reader (the grader) what exactly you are doing and that you understand how to apply the principle. Most points lost on problem solutions using the inclusion-exclusion principle are due to being vague about what exactly is being counted by each expression.

To give an example of the sorts of problems that can arise otherwise, on problem 6 of the homework set, it was not uncommon to see a statement like: "The number of solutions to the equation x_1+...+x_10=100 in positive integers where two of the variables are greater than 30 is (10 choose 2)*(39 choose 9)." First of all, this is slightly ambiguous. Does the writer mean solutions where at least two of the variables exceed 30, or solutions where exactly two of the variables exceed thirty? The usual mathematical convention is to interpret this as meaning "at least" unless "exactly" is explicitly specified, but sometimes the students apparently meant the opposite their writeup. This sort of confusion is not helpful to the grader. The second issue is that the expression is wrong for both interpretations, since it is larger than either of the two interpretations due to overcounting solutions in which three variables exceed 30. So what does the expression count? For 1 <= i <= 10, let A_i be the set of solutions to the equation in positive integers where the variable x_i exceeds 30. Then the expression is the sum of the cardinalities of all the pariwise intersections of distinct A_i, which is just what we need for the sieve formula. Notice how the formalism of explicit sets avoids the two issues with the above statement. It allows us to say the correct thing precisely.

Another common problem I see with solutions involving the sieve formula is to fail to include all of the various intersections of sets. In some problems, such as problem 6, you can show that all k-wise intersections for sufficiently large k are empty, and so can be omitted. Quite often you can show (usually due to symmetry) that all k-wise intersections for a given value of k have the same cardinality. This generally depends on the choice of sets used to describe the solution. Also, one can shorten the notation by using indexed formulas, e.g. "|A_i intersect A_j| = (expression) for all i not equal to j" instead of listing every pair explicitly. Still, the listing of all the cases can get tedious, and the temptation is strong to skip a few. But to use the sieve formula properly, you have to take them all into account.

Comment regarding terminology

Some students still don't have all the terminology down. Not everything that we count is a permutation. For example, in problem 6 we are counting (strong) compositions of 100 with 10 parts (satisfying certain constraints), equivalently solutions to the equation x_1+...+x_10=100 in positive integers. We are not counting permutations of anything here. (Permutations are linear arrangements or orderings of a set of things.) I try not to mark down for lapses of terminology, but sometimes they can throw me off, leading me to think that the student misunderstands something they don't. For better or worse, mathematicians are a bit like lawyers, we rely on the precision of our terms to a sometimes annoying degree.

I also note that the apparent convention of the text or this course is that by "composition of n", without any adjective, we assume we are referring to a strong composition unless somehow otherwise indicated. Nevertheless, in problem sets where we deal with both types, it does not hurt to explicitly say what type one is dealing with. Here and there I noticed a student using the wrong formula (weak vs. strong) for counting compositions.

Regarding problem 1

Speaking of terminology, many students were confused by the author's unfortunate and unnecessary use of the adjective "weak" in the problem statement. Indeed, if the composition can only have odd parts, then no part can be zero, and all such compositions are in fact strong compositions. The use of adjective can be (weakly) defended on the grounds that all strong compositions are also weak compositions, though the converse is not true. Nevertheless, a few students were thrown for a loop, assuming that the problem was asking to count compositions in which every part is either odd or zero, and so they were marked down.

Regarding problem 2

Most students got this problem just fine. However, a few students got confused with the formula for weak compositions, namely which variable is n (the number being divided into parts) versus which is k (the number of parts). Note that with weak compositions, n can be larger or smaller than k and you will still get a positive number of weak compositions. In this problem, the number of balls (of one color or the other) is n, and the k = 9 is the number of pockets.

Regarding problem 4

I was somewhat lenient on this problem, usually giving three points if I felt that the student understood and applied the inclusion-exclusion principle (sieve formula) correctly, even if they had the wrong values for the cardinality of the subsets and their intersections. The most common problem seemed to be that some students did not remember from earlier chapter(s) how to count permutations of a multiset, such as the multiset of letters in a string as in this problem. For example, since the string here has 9 letters, many students said that the total number of permutations is (9!). This is wrong, since some of the letters appear more than once. The proper value is given by the multinomial coefficient (9 choose 3,3,3) = 9!/(3!3!3!).