USEMO Problem Development, Behind the Scenes

In this post I’m hoping to say a bit about the process that’s used for the problem selection of the recent USEMO: how one goes from a pool of problem proposals to a six-problem test. (How to write problems is an entirely different story, and deserves its own post.) I choose USEMO for concreteness here, but I imagine a similar procedure could be used for many other contests.

I hope this might be of interest to students preparing for contests who want a peek behind the scenes, and perhaps helpful to other organizers of olympiads.

The overview of the entire timeline is:

    1. Submission period for authors (5-10 weeks)
    2. Creating the packet
    3. Reviewing period where volunteers try out the proposed problems (6-12 weeks)
    4. Editing and deciding on a draft of the test
    5. Test-solving of the draft of the test (3-5 weeks)
    6. Finalizing and wrap-up

Now I’ll talk about these in more detail.

Pinging for problems

The USA has the rare privilege of an extremely dedicated and enthusiastic base of volunteers, who are going to make the contest happen rain or shine. When I send out an email asking for problem proposals, I never really worry about getting enough people. You might have to adjust the recipe below if you have fewer hands on deck.

When you’re deciding who to invite, you have to think about a trade-off between problem security versus openness. The USEMO is not a high-stakes competition, so it accepts problems from basically anyone. On the other hand, if you are setting problems for your country’s IMO team selection test, you probably don’t want to take problems from the general public.

Submission of problems is pretty straightforward: ask to have problems emailed as TeX, with a full solution. You should also ask for any information you care about to be included: the list of authors of the problem, any other collaborators who have seen or tried the problem, and a very rough estimate of the difficulty. (You shouldn’t trust the estimate too much, but I’ll explain in a moment why it’s helpful.)

Ideally I try to allocate 5-10 weeks between when I open submissions for problems and when the submission period ends.

It’s at this time you might as well see who is going to be interested in being a test-solver or reviewer as well — more on that later.

Creating the packet

Once the submission period ends, you want to then collate the problems into a packet that you can send to your reviewers. The reviewers will then rate the problems on difficulty and suitability.

A couple of minor tips for setting the packet:

  • I used to try to sort the packet roughly by difficulty, but in recent years I’ve switched to random order and never looked back. It just biases the reviewers too much to have the problem number matter. The data has been a lot better with random order.
  • Usually I’ll label the problems A-01, A-02, …, A-05 (say), C-06, C-07, …, and so on. The leading zero is deliberate: I’ve done so much IMO Shortlist that if I see a problem named “C7”, it automatically feels like it should be a hard problem, so using “C-07” makes this gut reaction go away for me.
  • It’s debatable whether you need to have subject classifications at all, since in some cases a problem might not fit cleanly, or the “true” classification might even give away the problem. I keep it around just because it’s convenient from an administrative standpoint to have vaguely similar problems grouped together and labelled, but I’ll explicitly tell reviewers not to take the classification seriously, and to treat it merely as a convenience.

More substantial is the choice of which problems to include in the packet if you are so lucky to have a surplus of submissions. The problem is that reviewers only have so much time and energy and won’t give as good feedback if the packet is too long. In my experience, 20 problems is a nice target, 30 problems is strenuous, anything more than that is usually too much. So if you have more than 30 problems, you might need to cut some problems out.

Since this “early cutting” is necessarily pretty random (because you won’t be able to do all the problems yourself single-handedly), I usually prefer to do it in a slightly more egalitarian way. For example, if one person submits a lot of problems, you might only take a few problems from them, and say “we had a lot of problems, so we took the 3 of yours we liked the most for review”. (That said, you might have a sense that certain problems are really unlikely to be selected, and so you might as well exclude those.)
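A per-author cap like this is easy to mechanize. The sketch below is hypothetical: the function name, the rough `score` field, and the default numbers are illustrative, not part of my actual workflow.

```python
from collections import defaultdict

def early_cut(proposals, cap=3, limit=30):
    """proposals: list of (author, problem_id, score) triples, where
    score is whatever rough preference you assign on a first read.
    Keeps at most `cap` problems per author, then trims to `limit`."""
    by_author = defaultdict(list)
    for author, pid, score in proposals:
        by_author[author].append((score, pid))
    kept = []
    for author, items in by_author.items():
        items.sort(reverse=True)  # each author's best-scored first
        kept.extend((score, author, pid) for score, pid in items[:cap])
    kept.sort(reverse=True)
    return [(author, pid) for _, author, pid in kept[:limit]]
```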

You usually also want to make sure that you have a spread of difficulties and subjects. Actually, this is even more true if you don’t have a surplus: if it turns out that, say, you have zero easy algebra or geometry problems, that’s likely to cause problems for you later down the line. It might be good to see if you can solicit one or two of those. This is why the authors’ crude difficulty estimates can still be useful — they’re not supposed to be used for deciding the test, but they can give you “early warnings” that the packet might be lacking in some area.
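The “early warning” check can be sketched as follows, assuming you bucket the authors’ crude estimates into a few coarse levels (the bucket names and function name here are made up for illustration):

```python
def coverage_gaps(estimates,
                  subjects=("A", "C", "G", "N"),
                  levels=("easy", "medium", "hard")):
    """estimates: list of (subject, level) pairs, one per proposal,
    taken from the authors' crude difficulty estimates.  Returns the
    (subject, level) cells with no proposals at all."""
    counts = {(s, l): 0 for s in subjects for l in levels}
    for s, l in estimates:
        if (s, l) in counts:
            counts[(s, l)] += 1
    return [cell for cell, n in counts.items() if n == 0]
```

Any returned cell, e.g. easy geometry, is a prompt to go looking for one or two more submissions before the packet goes out.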

One other useful thing to do, if you have the time at this point, is to edit the proposals as they come in, before adding them to the packet. This includes both copy edits for clarity and formatting, as well as more substantial changes if you can see an alternate version or formulation that you think is likely to be better than the original. The reason to do this step now is that you want the reviewers’ eyes on these changes: making edits halfway through the review process or after it can cause confusion and desynchronization in the reviewer data, and increases the chances of errors (because the modifications have been checked fewer times).

The packet review process

Then comes the period where your reviewers work through the problems and submit their ratings. This is another place where you want to give people a lot of time: 6-8 weeks is a comfortable amount, and between 10 and 25 reviewers is a good number.

Reviewers are asked to submit a difficulty rating and quality rating for each problem. The system I’ve been using, which seems to work pretty well, goes as follows:

  • The five possible quality ratings are “Unsuitable”, “Mediocre”, “Acceptable”, “Nice”, “Excellent”. I’ve found that this choice of five words has the connotations needed to get similar rating distributions across the reviewers.
  • For difficulty, I like to provide three checkboxes “IMO1”, “IMO2”, “IMO3”, but also tell the reviewer that they can check two boxes if, say, they think a problem could appear as either IMO1 or IMO2. That means in essence five ratings {1, 1.5, 2, 2.5, 3} are possible.

This is the scale I converged on: granular enough to get reasonable numerical data, without being so granular that it becomes unintuitive or confusing. (If your scale is too granular, then a person’s ratings might say more about how they interpreted the scale than about the actual content of the problem.) For my group, five buckets seems to be the magic number; your mileage may vary!
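The checkbox-to-number conversion for difficulty is simple enough to state as code. A hypothetical sketch (the function name is mine, not from any real form backend):

```python
def difficulty_from_boxes(boxes):
    """boxes: the subset of {1, 2, 3} (IMO1/IMO2/IMO3) that a reviewer
    checked.  One box gives its value; two adjacent boxes give the
    midpoint, yielding the scale {1, 1.5, 2, 2.5, 3}."""
    checked = sorted(boxes)
    if len(checked) == 1:
        return float(checked[0])
    if len(checked) == 2 and checked[1] - checked[0] == 1:
        return (checked[0] + checked[1]) / 2
    raise ValueError("expected one box, or two adjacent boxes")
```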

More important is to have lots of free text boxes so that reviewers can provide more detailed comments, alternate solutions, and so on. Those are ultimately more valuable than just a bunch of numbers.

Here’s a few more tips:

  • If you are not too concerned about security, it’s also nice to get discussion between reviewers going. It’s more fun for the reviewers, and the value of having reviewers talk with each other a bit tends to outweigh the risk of bias.
  • It’s usually advised to send only the problem statements first, and then only send the solutions out about halfway through. I’ve found most reviewers (myself included) appreciate the decreased temptation to look at solutions too early on.
  • One thing I often do is to have a point person for each problem, to make sure every problem is carefully attempted. This is nice, but not mandatory — the nicest problems tend to get quite a bit of attention anyways.
  • One thing I’ve had success with is adding a question on the review form that asks “what six problems would you choose if you were making the call, and why?” I’ve found I get a lot of useful perspective from hearing what people say about this.

I just use Google Forms to collect all the data. There’s a feature you can enable that requires a sign-in, so that the reviewer’s responses are saved between sessions and loaded automatically (making it possible to submit the form in multiple sittings).

Choosing the draft of the test

Now that you have the feedback, you should pick a draft of the test! This is the most delicate part, and it’s where it is nice to have a co-director or a small committee if possible, so that you can talk out loud and bounce ideas off each other.

For this stage I like to have a table with the numerical ratings as a summary of what’s available. The way you want to do this is up to you, but some bits from my workflow:

  • My table is color-coded, and it’s sorted in five different ways: by problem number, by quality rating, by difficulty rating, by subject then quality rating, by subject then difficulty rating.
  • For the quality rating, I use the weights -0.75, -0.5, 0, 1, 1.5 for Unsuitable, Mediocre, Acceptable, Nice, Excellent. This fairly contrived set of weights was chosen based on some experience: I wanted the sign of the average rating (- or +) to match my gut feeling, and I didn’t want the rating to be too sensitive to a few Unsuitable or Excellent ratings (either extreme). This weighting puts a “cliff” between Acceptable and Nice, which empirically seems to be the place where differentiation matters most.
  • I like to include a short “name” in the table to help with remembering which problem numbers are which, e.g. “2017-vtx graph”.

Here is an example output, made with fake data. The Python script used to generate the table is included below, for anyone who wants to use it.

WT_U = -0.75
WT_M = -0.5
WT_A = 0
WT_N = 1
WT_E = 1.5

# ---- Populate with convincing looking random data ----
import random
slugs = {
		"A-01" : r"$\theta \colon \mathbb Z[x] \to \mathbb Z$",
		"A-02" : r"$\sqrt[3]{\frac{a}{b+7}}$",
		"A-03" : r"$a^a b^b c^c$",
		"C-04" : r"$a+2b+\dots+32c$",
		"C-05" : r"$2017$-vtx dinner",
		"G-06" : r"$ST$ orthog",
		"G-07" : r"$PO \perp YZ$",
		"G-08" : r"Area $5/2$",
		"G-09" : r"$XD \cap AM$ on $\Gamma$",
		"G-10" : r"$\angle PQE, \angle PQF = 90^{\circ}$",
		"N-11" : r"$5^n$ has six zeros",
		"N-12" : r"$n^2 \mid b^n+1$",
		"N-13" : r"$fff$ cycles",
}

qualities = {}
difficulties = {}
for k in slugs.keys():
	# just somehow throw stuff at wall to get counts
	a,b,c,d,e,f = [random.randrange(0,3) for _ in range(6)]
	if c >= 1: a = 0
	if a >= 2: d,e = 1,0
	if e == 0: f = 0
	if a == 0 and b == 0: e *= 2
	qualities[k] = [WT_U] * a + [WT_M] * b + [WT_A] * (b+d+e) + [WT_N] * (c+d+e) + [WT_E] * (c+e+f)

for k in slugs.keys():
	# just somehow throw stuff at wall to get counts
	a,b,c,d,e = [random.randrange(0,5) for _ in range(5)]
	if e >= 4:
		b = 0
		c //= 2
	elif e >= 3:
		a = 0
		b //= 2
	if a >= 3:
		e = 0
		d //= 3
	elif a >= 2:
		e = 0
		d //= 2
	difficulties[k] = [1] * a + [1.5] * b + [2] * c + [2.5] * d + [3] * e

# ---- End random data population ----

import statistics
def avg(S):
	return statistics.mean(S) if len(S) > 0 else None
def median(S):
	return statistics.median(S) if len(S) > 0 else None

# criteria for inclusion on chart
criteria = lambda k: True

def get_color_string(x, scale_min, scale_max, color_min, color_max):
	if x is None:
		return r"\rowcolor{gray}"
	m = (scale_max+scale_min)/2
	a = min(int(100 * 2 * abs(x-m) / (scale_max-scale_min)), 100)
	color = color_min if x < m else color_max
	return r"\rowcolor{%s!%d}" %(color, a) + "\n"

def get_label(key, slugged=False):
	if slugged:
		return r"{\scriptsize \textbf{%s} %s}" % (key, slugs.get(key, ''))
	return r"{\scriptsize \textbf{%s}}" % key

## Quality rating
def get_quality_row(key, data, slugged = True):
	a = avg(data)
	s = ("$%+4.2f$" % a) if a is not None else "---"
	color_tex = get_color_string(a, WT_U, WT_E, "Salmon", "green")
	row_tex = r"%s & %d & %d & %d & %d & %d & %s \\" \
			% (get_label(key, slugged),
				data.count(WT_U), data.count(WT_M), data.count(WT_A),
				data.count(WT_N), data.count(WT_E), s)
	return color_tex + row_tex
def print_quality_table(d, sort_key = None, slugged = True):
	items = sorted(d.items(), key = sort_key)
	print(r"\begin{tabular}{l ccccc c}")
	print(r"\toprule Prob & U & M & A & N & E & Avg \\ \midrule")
	for key, data in items:
		print(get_quality_row(key, data, slugged))
	print(r"\bottomrule \end{tabular}")

## Difficulty rating
def get_difficulty_row(key, data, slugged = False):
	a = avg(data)
	s = ("$%.3f$" % a) if a is not None else "---"
	color_tex = get_color_string(a, 1, 3, "cyan", "orange")
	row_tex = r"%s & %d & %d & %d & %d & %d & %s \\" \
			% (get_label(key, slugged),
				data.count(1), data.count(1.5), data.count(2),
				data.count(2.5), data.count(3), s)
	return color_tex + row_tex
def print_difficulty_table(d, sort_key = None, slugged = False):
	items = sorted(d.items(), key = sort_key)
	print(r"\begin{tabular}{l ccccc c}")
	print(r"\toprule Prob & 1 & 1.5 & 2 & 2.5 & 3 & Avg \\ \midrule")
	for key, data in items:
		print(get_difficulty_row(key, data, slugged))
	print(r"\bottomrule \end{tabular}")

filtered_qualities = {k:v \
		for k,v in qualities.items() if criteria(k)}
filtered_difficulties = {k:v \
		for k,v in difficulties.items() if criteria(k)}

def print_everything(name, fn = None, flip_slug = False):
	if fn is not None:
		sort_key = lambda item: fn(item[0])
	else:
		sort_key = None
	print(r"\section{" + name + "}")
	if flip_slug:
		print_quality_table(filtered_qualities, sort_key, False)
		print_difficulty_table(filtered_difficulties, sort_key, True)
	else:
		print_quality_table(filtered_qualities, sort_key, True)
		print_difficulty_table(filtered_difficulties, sort_key, False)

# Start outputting content

print(r"\title{Example of ratings table with randomly generated data}")

print_everything("All ratings")

print("\n" + r"\newpage" + "\n")
print_everything("Beauty contest, by overall popularity",
		lambda p : (-(avg(qualities[p]) or 0), p), False)
print_everything("Beauty contest, by subject and popularity",
		lambda p : (p[0], -(avg(qualities[p]) or 0), p), False)
print("\n" + r"\newpage" + "\n")
print_everything("Beauty contest, by overall difficulty",
		lambda p : (-(avg(difficulties[p]) or 0), p), True)
print_everything("Beauty contest, by subject and difficulty",
		lambda p : (p[0], -(avg(difficulties[p]) or 0), p), True)

print(r"\section{Scatter plot}")
print(r"\begin{tikzpicture}")
print(r"""\begin{axis}[width=0.9\textwidth, height=22cm, grid=both,
	xlabel={Average difficulty}, ylabel={Average suitability},
	every node near coord/.append style={font=\scriptsize}]""")
print(r"""\addplot [scatter,
	only marks, point meta=explicit symbolic,
	nodes near coords*={\prob},
	visualization depends on={value \thisrow{prob} \as \prob}]""")
print(r"table [meta=subj] {")
print("x\ty\tprob\tsubj")
for p in qualities.keys():
	x = avg(difficulties[p])
	y = avg(qualities[p])
	if x is None or y is None:
		continue
	print("%0.2f\t%0.2f\t%s\t%s" % (x, y, p[2:], p[0]))
print(r"};")
print(r"\end{axis}")
print(r"\end{tikzpicture}")


Of course, obligatory warning to not overly rely on the numerical ratings and to put heavy weight on the text comments provided. (The numerical ratings will often have a lot of variance, anyways.)

One thing to keep in mind when choosing the problems is that the two most obvious goals are basically orthogonal. One goal is to have the most attractive problems (“art”); the other is to have an exam that is balanced across difficulty and subject composition (“science”). These two goals will often compete with each other, and you’ll have to make judgment calls to prioritize one over the other.

A final piece of advice is to not be too pedantic. For example, I personally dislike the so-called “Geoff rule” that 1/2/4/5 should be distinct subjects: I find that it is often too restrictive in practice. I also support using “fractional distributions” in which say a problem can be 75% number theory and 25% combinatorics (rather than all-or-nothing) when trying to determine how to balance the exam. This leads to better, more nuanced judgments than insisting on four categories.
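The fractional-distribution bookkeeping is straightforward; a minimal sketch, with a made-up function name and made-up data:

```python
def exam_balance(problems):
    """problems: one dict per problem, e.g. {"N": 0.75, "C": 0.25},
    with subject weights summing to 1.  Returns the total weight of
    each subject across the exam."""
    totals = {}
    for dist in problems:
        for subject, weight in dist.items():
            totals[subject] = totals.get(subject, 0.0) + weight
    return totals
```

For a six-problem exam you might then eyeball that no subject’s total strays too far from 1.5.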

This is also the time to make any last edits you want to the problems, again both copy edits or more substantial edits. This gives you a penultimate draft of the exam.

Test solving

If you can, a good last quality check is to have a round of test-solving from an unbiased group of additional volunteers who haven’t already seen the packet. (For the volunteers, this is a smaller time commitment than reviewing an entire packet, so it’s often feasible.) You ask this last round of volunteers to try out the problems under exam-like conditions, although I find it’s not strictly necessary to insist on a full 4.5 hours or full write-ups, if relaxing that gets you more volunteers. A nice number of test-solvers is 5-10 people.

Typically this test-solving is most useful as a sanity check (e.g. to make sure the test is not obviously too difficult) and for any last minute shuffling of the problems (which often happens). I don’t advise making drastic changes at this point. It’s good as a way to get feedback on the most tricky decisions, though.


Finalizing and wrap-up

After any final edits, I recommend sending a copy of the edited problems and solutions to the reviewers and test-solvers. They’re probably interested to know which problems made the cut, and you want more eyes going through the final paper to check for ambiguities or errors.

I usually take the time to also send out some details of the selection itself: what the ratings for the problems looked like, often a sentence or two for each problem about the overall feedback, and documentation of my thought process in the draft selection. It’s good to give people feedback on their problems; in my experience the authors usually appreciate it a lot, especially if they decide to re-submit the problem elsewhere.

And that’s the process.

IMO 2019 Aftermath

Here is my commentary for the 2019 International Mathematical Olympiad, consisting of pictures and some political statements about the problems.


This year’s USA delegation consisted of leader Po-Shen Loh and deputy leader Yang Liu. The USA scored 227 points, tying for first place with China. For context, that is missing a total of about four problems across all six students, which is actually kind of insane. All six students got gold medals, and two had perfect scores.

  1. Vincent Huang 7 7 3 7 7 7
  2. Luke Robitaille 7 6 2 7 7 6
  3. Colin Shanmo Tang 7 7 7 7 7 7
  4. Edward Wan 7 6 0 7 7 7
  5. Brandon Wang 7 7 7 7 7 1
  6. Daniel Zhu 7 7 7 7 7 7

Korea was 3rd place with 226 points, just one point shy of first, but way ahead of the 4th place score (with 187 points). (I would actually have been happier if Korea had tied with USA/China too; a three-way tie would have been a great story to tell.)



Leader’s hotel at Celtic manor.


The opening ceremony!


Spotted by the leader of France, while waiting for the opening ceremony to start.


From last October. It turned out to be IMO5.


My room at the University of Bath.


Picture with Anant Mudgal, OTIS instructor and IMO6 author


McDonald’s trip, an annual tradition whenever USA wins an international contest.


Picture with Po-Shen Loh at McDonald’s


Canmoo with friend (another tradition apparently)


Ran into this while working with team USA on a crossword after the contest


View from the Ferris wheel at the closing ceremony


Picture with team Taiwan at the closing ceremony


Picture with deputy leader Yang Liu

Contest analysis

You can find problems and my solutions on my website already, and this year’s organizers were kind enough to post the official solutions from the Problem Selection Committee right away. So what follows are merely my opinions on the problems.

First, comments on the individual problems. (Warning: spoilers follow.)

  1. This is a standard functional equation, which is quite routine for students with experience. In general, I don’t really like to put functional equations as opening problems to exams like the IMO or USAJMO, since students who have not seen a functional equation often have a difficult time understanding the problem statement.
  2. This is the first medium geometry problem that the IMO has featured since 2012 (the year before the so-called “Geoff rule” arose). I think it’s genuinely quite tricky to do using only vanilla synthetic methods, like the first official solution. In particular, the angle chasing solution was a big surprise to me, because my approach (and many other approaches) starts by eliminating the points {P_1} and {Q_1} from the picture, while the first official solution relies on them deeply. (For example, one might add {X = PB_1 \cap AB} and {Y = QA_1 \cap AB} and note that {P_1CXA} and {Q_1CYB} are cyclic, so it is equivalent to prove {T = PX \cap QY} lies on the radical axis of {\triangle CXA} and {\triangle CYB}.) That said, I found that the problem succumbs to barycentric coordinates, and I will be adding it as a great example to my bary handout. The USA students seem to have preferred moving points, or Menelaus theorem (which in this case was just clumsier bary).
  3. I actually felt the main difficulty of the problem was dealing with the artificial condition. Basically, the problem is about performing an operation while trying to not disconnect the graph. However, this “connectedness” condition, together with a few other necessary weak hypotheses (namely: not a clique, and has at least one odd-degree vertex) are lumped together in a misleading way, by specifying 1010 vertices of degree 1009 and 1009 vertices of degree 1010. This misleads contestants into, say, splitting the graph into the even and odd vertices, while hiding the true nature of the problem. I do think the idea behind the problem is quite cute though, despite being disappointed at how it was phrased. Yang Liu suggested to me this might have been better off at the IOI, where one might ask a contestant to output a sequence of moves reducing it to a tree (or assert none exist).
  4. I liked the problem (and I found the connection to group theory amusing), though I think it is pretty technical for an IMO 1/4. Definitely on the hard side for inexperienced countries.
  5. This problem was forwarded to the USAMO chair and then IMO at my suggestion, so I was very happy to see it on the exam. I think it’s a natural problem statement that turns out to have an unexpectedly nice solution. (And there is actually a natural interpretation of the statement via a Turing machine.) However, I thought it was quite easy for the P5 position (even easier than IMO 2008/5, say).
  6. A geometry problem from one of my past students Anant Mudgal. Yang and I both solved it very quickly with complex numbers, so it was a big surprise to us that none of the USA students did. I think this problem is difficult but not “killer”; easier than last year’s IMO 2018/6 but harder than IMO 2015/3.

For gold-level contestants, I think this was the easiest exam to sweep in quite a few years, and I confess during the IMO to wondering if we had a small chance at getting a full 252 (until I found out that the marking scheme deducted a point on P2). Problem 2 is tricky but bary-able, and Problem 5 is quite easy. Furthermore, neither Problem 3 nor Problem 6 is a “killer” (the type of problem that gets fewer than 20 solves, say). So a very strong contestant would really have 3 hours each day to work on a Problem 3 or Problem 6 that is not too nightmarish. I was actually worried for a while that the gold cutoff might be as high as 34 points (five problems), but I was just being silly.

RMM 2019 pictures and aftermath

Pictures, thoughts, and other festivities from the 2019 Romanian Masters of Mathematics. See also the MAA press release.


Po-Shen Loh and I spent the last week in Bucharest with the United States team for the 11th RMM. The USA usually sends four students who have not attended an IMO or RMM before.

This year’s four students did breathtakingly well:

  1. Benjamin Qi — gold (rank 2nd)
  2. Luke Robitaille — silver (rank 10th)
  3. Carl Schildkraut — gold (rank 8th)
  4. Daniel Zhu — gold (rank 4th)

(Yes, there are only nine gold medals this year!)

The team score is obtained by summing the three highest scores of the four team members. The USA won the team component by a lofty margin, making it the first time we’ve won back to back. I’m very proud of the team.


RMM 2019 team after the competition (taken by Daniel Zhu’s dad):


McDonald’s trip. Apparently, the USA tradition is that whenever we win an international contest, we have to order chicken mcnuggets. Fortunately, this time we didn’t order one for every point on the team (a silly idea that was unfortunately implemented at IMO 2018).


The winner plate. Each year the winning country brings it back to get it engraved, and returns it to the competition the next year. I will have it for the next while.


And a present from one of the contestants (thanks!):



Problems, and thoughts on them

Problem 1 (Maxim Didin, RUS)

Amy and Bob play a game. First, Amy writes down a positive integer on a board. Then the players alternate turns, with Bob moving first. On Bob’s turn, he chooses a positive integer {b} and subtracts {b^2} from the number on the board. On Amy’s turn, she chooses a positive integer {k} and raises the number on the board to the {k}th power. Bob wins if the number on the board ever reads zero. Can Amy prevent Bob from winning?

I found this to be a cute easy problem. The official solution is quite clever, but it’s possible (as I myself did) to have a very explicit solution using e.g. the characterization of which integers are the sum of k squares (for {k \le 4}).

Problem 2 (Jakob Jurij Snoj, SLV)

Let {ABCD} be an isosceles trapezoid with {\overline{AB} \parallel \overline{DC}}. Let {E} be the midpoint of {\overline{AC}}. Denote by {\Gamma} and {\Omega} the circumcircles of triangles {ABE} and {CDE}, respectively. The tangent to {\Gamma} at {A} and the tangent to {\Omega} at {D} intersect at point {P}. Prove that {\overline{PE}} is tangent to {\Omega}.

There are nice synthetic solutions to this problem, but I found it much easier to apply complex numbers with {\triangle CDE} as the unit circle, taking as the phantom point {P} the intersection of tangents. So, unsurprisingly, all our team members solved the problem quite quickly.

I suspect the American students who took RMM Day 1 at home (as part of the USA team selection process) will find the problem quite easy as well. Privately, it is a bit of a relief for me, because if a more difficult geometry had been placed here I would have worried that our team selection this year has (so far) been too geometry-heavy.

Problem 3 (Fedor Petrov, RUS)

Let {\varepsilon > 0} be a positive real number. Prove that if {n} is sufficiently large, then any simple graph on {n} vertices with at least {(1+\varepsilon)n} edges has two (different) cycles of equal length.

A really nice problem with a short, natural problem statement. I’m not good at this kind of problem, but I enjoy it anyways. Incidentally, one of our team members won last year’s IOI, and so this type of problem is right up his alley!

Problem 4 (Morteza Saghafian, IRN)

Show that for every positive integer {N} there exists a simple polygon (not necessarily convex) admitting exactly {N} distinct triangulations.

A fun construction problem. I think it’s actually harder than it looks, but with enough time everyone eventually catches on.

Problem 5 (Jakob Jurij Snoj, SLV)

Solve over {{\mathbb R}} the functional equation

\displaystyle f(x+yf(x)) + f(xy) = f(x) + f(2019y).

I found this problem surprisingly pernicious. Real functional equations in which all parts of the equation are “wrapped by f” tend to be hard to deal with: one has to think about things like injectivity in order to have any hope of showing that f actually takes on some desired value. And the answer is harder to find than it seems — it is (rightly) worth a point even to get the entire answer correct.

Fortunately for our team members, the rubric for the problem was generous, and it was possible to get 4-5 points without a complete solution. In the USA, olympiad grading tends to be even harsher than in most other countries (though not as Draconian as the Putnam), so this came as a surprise to the team. I jokingly told the team afterwards that they should appreciate how I hold them to a higher standard than the rest of the world.

(Consequently, the statistics for this problem are somewhat misleading — the average score makes the problem seem easier than it actually is. In truth there were not that many 6’s and 7’s given out.)

Problem 6 (Adrian Beker)

Find all pairs of integers {(c, d)}, both greater than {1}, such that the following holds:

For any monic polynomial {Q} of degree {d} with integer coefficients and for any prime {p > c(2c+1)}, there exists a set {S} of at most {\tfrac{2c-1}{2c+1} \cdot p} integers, such that

\displaystyle \bigcup_{s \in S} \{s,\; Q(s),\; Q(Q(s)),\; Q(Q(Q(s))),\; \dots\}

contains a complete residue system modulo {p} (i.e., intersects with every residue class modulo {p}).

Unlike problem 5, I found this problem to be quite straightforward. I think it should have been switched with problem 5. Indeed, all our team members produced complete solutions to this problem.

So I am glad that our team has learned by now to try all three problems seriously on any given day. I have seen too many times students who would spend all their time on a P2 and not solve it, only to find out that P3 was of comparable or easier difficulty.