USEMO Problem Development, Behind the Scenes

In this post I’m hoping to say a bit about the process that’s used for the problem selection of the recent USEMO: how one goes from a pool of problem proposals to a six-problem test. (How to write problems is an entirely different story, and deserves its own post.) I choose USEMO for concreteness here, but I imagine a similar procedure could be used for many other contests.

I hope this might be of interest to students preparing for contests to see a bit of the behind-the-scenes, and maybe helpful for other organizers of olympiads.

The overview of the entire timeline is:

    1. Submission period for authors (5-10 weeks)
    2. Creating the packet
    3. Reviewing period where volunteers try out the proposed problems (6-12 weeks)
    4. Editing and deciding on a draft of the test
    5. Test-solving of the draft of the test (3-5 weeks)
    6. Finalizing and wrap-up

Now I’ll talk about these in more detail.

Pinging for problems

The USA has the rare privilege of an extremely dedicated and enthusiastic base of volunteers, who will make the contest happen rain or shine. When I send out an email asking for problem proposals, I never really worry about not getting enough people. You might have to adjust the recipe below if you have fewer hands on deck.

When you’re deciding who to invite, you have to think about a trade-off between problem security versus openness. The USEMO is not a high-stakes competition, so it accepts problems from basically anyone. On the other hand, if you are setting problems for your country’s IMO team selection test, you probably don’t want to take problems from the general public.

Submission of problems is pretty straightforward: ask to have problems emailed as TeX, with a full solution. You should also ask for any information you care about to be included: the list of authors of the problem, any other collaborators who have seen or tried the problem, and a very rough estimate of the difficulty. (You shouldn’t trust the estimate too much, but I’ll explain in a moment why it’s helpful.)

Ideally I try to allocate 5-10 weeks between when I open submissions for problems and when the submission period ends.

This is also a good time to see who is interested in being a test-solver or reviewer — more on that later.

Creating the packet

Once the submission period ends, you want to then collate the problems into a packet that you can send to your reviewers. The reviewers will then rate the problems on difficulty and suitability.

A couple of minor tips for setting the packet:

  • I used to try to sort the packet roughly by difficulty, but in recent years I’ve switched to random order and never looked back. It just biases the reviewers too much to have the problem number matter. The data has been a lot better with random order.
  • Usually I’ll label the problems A-01, A-02, …, A-05 (say), C-06, C-07, …., and so on. The leading zero is deliberate: I’ve done so much IMO shortlist that if I see a problem named “C7”, it will automatically feel like it should be a hard problem, so using “C-07” makes this gut reaction go away for me.
  • It’s debatable whether you need to have subject classifications at all, since in some cases a problem might not fit cleanly, or the “true” classification might even give away the problem. I keep it around just because it’s convenient from an administrative standpoint to have vaguely similar problems grouped together and labelled, but I explicitly tell reviewers not to take the classification seriously; it’s there only as a convenience.

More substantial is the choice of which problems to include in the packet if you are so lucky to have a surplus of submissions. The problem is that reviewers only have so much time and energy and won’t give as good feedback if the packet is too long. In my experience, 20 problems is a nice target, 30 problems is strenuous, anything more than that is usually too much. So if you have more than 30 problems, you might need to cut some problems out.

Since this “early cutting” is necessarily pretty random (because you won’t be able to do all the problems yourself single-handedly), I usually prefer to do it in a slightly more egalitarian way. For example, if one person submits a lot of problems, you might only take a few problems from them, and say “we had a lot of problems, so we took the 3 of yours we liked the most for review”. (That said, you might have a sense that certain problems are really unlikely to be selected, and so you might as well exclude those.)

You usually also want to make sure that you have a spread of difficulties and subjects. Actually, this is even more true if you don’t have a surplus: if it turns out that, say, you have zero easy algebra or geometry problems, that’s likely to cause problems for you later down the line, and it might be worth trying to solicit one or two in that category. This is why the authors’ crude estimates of difficulty can still be useful — they aren’t meant for deciding the test, but they can give you “early warnings” that the packet might be lacking in some area.

One other useful thing to do, if you have the time at this point, is to edit the proposals as they come in, before adding them to the packet. This includes both making copy edits for clarity and formatting, as well as more substantial changes if you can see an alternate version or formulation that you think is likely to be better than the original. The reason you want to do this step now is that you want the reviewers’ eyes on these changes: making edits halfway through the review process or after it can cause confusion and desynchronization in the reviewer data, and increases the chances of errors (because the modifications have been checked fewer times).

The packet review process

Then comes the period where your reviewers will work through the problems and submit their ratings. This is another stage where you want to give reviewers a lot of time: 6-8 weeks is a comfortable amount. Between 10 and 25 reviewers is a good number.

Reviewers are asked to submit a difficulty rating and a quality rating for each problem. The system I’ve been using, which seems to work pretty well, goes as follows:

  • The five possible quality ratings are “Unsuitable”, “Mediocre”, “Acceptable”, “Nice”, “Excellent”. I’ve found that this choice of five words has the connotations needed to get similar rating distributions across the reviewers.
  • For difficulty, I like to provide three checkboxes “IMO1”, “IMO2”, “IMO3”, but also tell the reviewer that they can check two boxes if, say, they think a problem could appear as either IMO1 or IMO2. That means in essence five ratings {1, 1.5, 2, 2.5, 3} are possible.

This is what I converged on as far as scales that are granular enough to get reasonable numerical data without being so granular that they are unintuitive or confusing. (If your scale is too granular, then a person’s ratings might say more about how a person interpreted the scale than the actual content of the problem.) For my group, five buckets seems to be the magic number; your mileage may vary!
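
For concreteness, here is a tiny sketch (my own convention, not necessarily how the actual form is tallied) of folding the checked difficulty boxes into the numeric scale {1, 1.5, 2, 2.5, 3}: average the levels of whichever boxes were checked, so that ticking both “IMO1” and “IMO2” yields 1.5.

def difficulty_value(checked):
	# checked is the set of box labels a reviewer ticked, e.g. {"IMO1", "IMO2"}
	levels = [int(name[-1]) for name in checked]  # "IMO2" -> 2
	return sum(levels) / len(levels)              # {1, 2} -> 1.5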

More important is to have lots of free text boxes so that reviewers can provide more detailed comments, alternate solutions, and so on. Those are ultimately more valuable than just a bunch of numbers.

Here are a few more tips:

  • If you are not too concerned about security, it’s also nice to get discussion going between the reviewers. It’s more fun for the reviewers, and the value of having them talk with each other a bit tends to outweigh the risk of bias.
  • It’s usually advisable to send only the problem statements at first, and then send the solutions out about halfway through. I’ve found most reviewers (myself included) appreciate the decreased temptation to look at solutions too early on.
  • One thing I often do is to have a point person for each problem, to make sure every problem is carefully attempted. This is nice, but not mandatory — the nicest problems tend to get quite a bit of attention anyways.
  • One thing I’ve had success with is adding a question on the review form that asks “what six problems would you choose if you were making the call, and why?” I’ve found I get a lot of useful perspective from hearing what people say about this.

I just use Google Forms to collect all the data. There’s a feature you can enable that requires a sign-in, so that the reviewer’s responses are saved between sessions and loaded automatically (making it possible to submit the form in multiple sittings).

Choosing the draft of the test

Now that you have the feedback, you should pick a draft of the test! This is the most delicate part, and it’s where it is nice to have a co-director or small committee if possible, so that you can talk out loud and bounce ideas off each other.

For this stage I like to have a table with the numerical ratings as a summary of what’s available. The way you want to do this is up to you, but some bits from my workflow:

  • My table is color-coded, and it’s sorted in five different ways: by problem number, by quality rating, by difficulty rating, by subject then quality rating, by subject then difficulty rating.
  • For the quality rating, I use the weights -0.75, -0.5, 0, 1, 1.5 for Unsuitable, Mediocre, Acceptable, Nice, Excellent. This fairly contrived set of weights was chosen based on some experience in which I wanted the average ratings to satisfy a couple of properties: I wanted the sign of the rating (- or +) to match my gut feeling, and I wanted the rating to not be too sensitive to a few Unsuitable / Excellent ratings (either extreme). This weighting puts a “cliff” between Acceptable and Nice, which empirically seems to be the right place to draw the sharpest distinction. (A worked example follows this list.)
  • I like to include a short “name” in the table to help with remembering which problem numbers are which, e.g. “2017-vtx graph”.
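
To make the weighting concrete: a problem with (hypothetical) rating counts of 1 Unsuitable, 2 Mediocre, 5 Acceptable, 4 Nice, and 2 Excellent would get an average of {\frac{1 \cdot (-0.75) + 2 \cdot (-0.5) + 5 \cdot 0 + 4 \cdot 1 + 2 \cdot 1.5}{14} = \frac{5.25}{14} \approx +0.38}.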

An example of what a table might look like is given in the image below.

Here is an output made with fake data. The Python script used to generate the table is included below, for anyone who wants to use it.

WT_U = -0.75
WT_M = -0.5
WT_A = 0
WT_N = 1
WT_E = 1.5

# ---- Populate with convincing looking random data ----
import random
slugs = {
		"A-01" : r"$\theta \colon \mathbb Z[x] \to \mathbb Z$",
		"A-02" : r"$\sqrt[3]{\frac{a}{b+7}}$",
		"A-03" : r"$a^a b^b c^c$",
		"C-04" : r"$a+2b+\dots+32c$",
		"C-05" : r"$2017$-vtx dinner",
		"G-06" : r"$ST$ orthog",
		"G-07" : r"$PO \perp YZ$",
		"G-08" : r"Area $5/2$",
		"G-09" : r"$XD \cap AM$ on $\Gamma$",
		"G-10" : r"$\angle PQE, \angle PQF = 90^{\circ}$",
		"N-11" : r"$5^n$ has six zeros",
		"N-12" : r"$n^2 \mid b^n+1$",
		"N-13" : r"$fff$ cycles",
		}

qualities = {}
difficulties = {}
random.seed(150)
for k in slugs.keys():
	# just somehow throw stuff at wall to get counts
	a,b,c,d,e,f = [random.randrange(0,3) for _ in range(6)]
	if c >= 1: a = 0
	if a >= 2: d,e = 1,0
	if e == 0: f = 0
	if a == 0 and b == 0: e *= 2
	qualities[k] = [WT_U] * a + [WT_M] * b + [WT_A] * (b+d+e) + [WT_N] * (c+d+e) + [WT_E] * (c+e+f)

random.seed(369)
for k in slugs.keys():
	# just somehow throw stuff at wall to get counts
	a,b,c,d,e = [random.randrange(0,5) for _ in range(5)]
	if e >= 4:
		b = 0
		c //= 2
	elif e >= 3:
		a = 0
		b //= 2
	if a >= 3:
		e = 0
		d //= 3
	elif a >= 2:
		e = 0
		d //= 2
	difficulties[k] = [1] * a + [1.5] * b + [2] * c + [2.5] * d + [3] * e

# ---- End random data population ----

	
import statistics
def avg(S):
	return statistics.mean(S) if len(S) > 0 else None
def median(S):
	return statistics.median(S) if len(S) > 0 else None

# criteria for inclusion on chart
criteria = lambda k: True

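# Emit a LaTeX \rowcolor command for one table row: gray when there is no
# data, otherwise color_min or color_max (depending on which side of the
# scale's midpoint x falls on), with intensity proportional to the
# distance of x from that midpoint (capped at 100).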
def get_color_string(x, scale_min, scale_max, color_min, color_max):
	if x is None:
		return r"\rowcolor{gray}"
	m = (scale_max+scale_min)/2
	a = min(int(100 * 2 * abs(x-m) / (scale_max-scale_min)), 100)
	color = color_min if x < m else color_max
	return r"\rowcolor{%s!%d}" %(color, a) + "\n"

def get_label(key, slugged=False):
	if slugged:
		return r"{\scriptsize \textbf{%s} %s}" %(key, slugs.get(key, ''))
	else:
		return r"{\scriptsize \textbf{%s}}" % key

## Quality rating
def get_quality_row(key, data, slugged = True):
	a = avg(data)
	s = ("$%+4.2f$" % a) if a is not None else "---"
	color_tex = get_color_string(a, WT_U, WT_E, "Salmon", "green")
	row_tex = r"%s & %d & %d & %d & %d & %d & %s \\" \
			% (get_label(key, slugged),
				data.count(WT_U),
				data.count(WT_M),
				data.count(WT_A),
				data.count(WT_N),
				data.count(WT_E),
				s)
	return color_tex + row_tex
def print_quality_table(d, sort_key = None, slugged = True):
	items = sorted(d.items(), key = sort_key)
	print(r"\begin{tabular}{lcccccr}")
	print(r"\toprule Prob & U & M & A & N & E & Avg \\ \midrule")
	for key, data in items:
		print(get_quality_row(key, data, slugged))
	print(r"\bottomrule")
	print(r"\end{tabular}")

## Difficulty rating
def get_difficulty_row(key, data, slugged = False):
	a = avg(data)
	s = ("$%.3f$" % a) if a is not None else "---"
	color_tex = get_color_string(a, 1, 3, "cyan", "orange")
	row_tex = r"%s & %d & %d & %d & %d & %d & %s \\" \
			% (get_label(key, slugged),
				data.count(1),
				data.count(1.5),
				data.count(2),
				data.count(2.5),
				data.count(3),
				s)
	return color_tex + row_tex
def print_difficulty_table(d, sort_key = None, slugged = False):
	items = sorted(d.items(), key = sort_key)
	print(r"\begin{tabular}{l ccccc c}")
	print(r"\toprule Prob & 1 & 1.5 & 2 & 2.5 & 3 & Avg \\ \midrule")
	for key, data in items:
		print(get_difficulty_row(key, data, slugged))
	print(r"\bottomrule")
	print(r"\end{tabular}")

filtered_qualities = {k:v \
		for k,v in qualities.items() if criteria(k)}
filtered_difficulties = {k:v \
		for k,v in difficulties.items() if criteria(k)}

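# Print one section containing both tables under the given sort order.
# flip_slug=True attaches the short problem "names" (slugs) to the
# difficulty table rather than the quality table, which is handy for the
# difficulty-sorted sections below.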
def print_everything(name, fn = None, flip_slug = False):
	if fn is not None:
		sort_key = lambda item: fn(item[0])
	else:
		sort_key = None
	print(r"\section{" + name + "}")
	if flip_slug:
		print_quality_table(filtered_qualities, sort_key, False)
		print_difficulty_table(filtered_difficulties, sort_key, True)
	else:
		print_quality_table(filtered_qualities, sort_key, True)
		print_difficulty_table(filtered_difficulties, sort_key, False)

# Start outputting content
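# Note: the preamble below pulls in my personal style file via
# \usepackage[sexy]{evan}; if you swap in your own preamble, you will also
# need something providing \rowcolor and the named colors used above,
# e.g. \usepackage[table,dvipsnames]{xcolor}.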
print(r"""\documentclass[11pt]{scrartcl}
\usepackage{booktabs}
\usepackage[sexy]{evan}
\usepackage{tikz}
\usepackage{pgfplots}
\pgfplotsset{compat=1.17}

\begin{document}
\title{Example of ratings table with randomly generated data}
\maketitle

\setlength\tabcolsep{5pt}
""")



print(r"\section{All ratings}")
print_quality_table(qualities)
print_difficulty_table(difficulties)

print("\n" + r"\newpage" + "\n")
print_everything("Beauty contest, by overall popularity",
		lambda p : (-avg(qualities[p]), p), False)
print_everything("Beauty contest, by subject and popularity",
		lambda p : (p[0], -avg(qualities[p]), p), False)
print("\n" + r"\newpage" + "\n")
print_everything("Beauty contest, by overall difficulty",
		lambda p : (-avg(difficulties[p]), p), True)
print_everything("Beauty contest, by subject and difficulty",
		lambda p : (p[0], -avg(difficulties[p]), p), True)

print("\n")
print(r"\section{Scatter plot}")
print(r"\begin{center}")
print(r"\begin{tikzpicture}")
print(r"""\begin{axis}[width=0.9\textwidth, height=22cm, grid=both,
	xlabel={Average difficulty}, ylabel={Average suitability},
	every node near coord/.append style={font=\scriptsize},
	scatter/classes={A={red},C={blue},G={green},N={black}}]""")
print(r"""\addplot [scatter,
	only marks, point meta=explicit symbolic,
	nodes near coords*={\prob},
	visualization depends on={value \thisrow{prob} \as \prob}]""")
print(r"table [meta=subj] {")
print("X\tY\tprob\tsubj")
for p in qualities.keys():
	x = avg(difficulties[p])
	y = avg(qualities[p])
	if x is None or y is None:
		continue
	print("%0.2f\t%0.2f\t%s\t%s" %(x,y,p[2:],p[0]))
print(r"};")
print(r"\end{axis}")
print(r"\end{tikzpicture}")
print(r"\end{center}")

print(r"\end{document}")

Of course, the obligatory warning: don’t rely too heavily on the numerical ratings, and put heavy weight on the text comments provided. (The numerical ratings will often have a lot of variance, anyways.)

One thing to keep in mind when choosing the problems is that the two most obvious goals are basically orthogonal. One goal is to have the most attractive problems (“art”), but the other is to have an exam which is balanced across difficulty and subject composition (“science”). These two goals will often compete with each other, and you’ll have to make judgment calls to prioritize one over the other.

A final piece of advice is to not be too pedantic. For example, I personally dislike the so-called “Geoff rule” that problems 1/2/4/5 should be distinct subjects: I find that it is often too restrictive in practice. I also support using “fractional distributions”, in which, say, a problem can be 75% number theory and 25% combinatorics (rather than all-or-nothing), when trying to determine how to balance the exam. This leads to better, more nuanced judgments than insisting on four categories.

This is also the time to make any last edits you want to the problems, again both copy edits and more substantial changes. This gives you a penultimate draft of the exam.

Test solving

If you can, a good last quality check is to have a round of test-solving by an unbiased group of additional volunteers who haven’t already seen the packet. (For the volunteers, this is a smaller time commitment than reviewing an entire packet, so it’s often a feasible intermediate commitment.) You ask this last round of volunteers to try out the problems under exam-like conditions, although I find it’s not strictly necessary to insist on a full 4.5 hours or complete write-ups if relaxing that gets you more volunteers. A nice number of test-solvers is 5-10 people.

Typically this test-solving is most useful as a sanity check (e.g. to make sure the test is not obviously too difficult) and for any last minute shuffling of the problems (which often happens). I don’t advise making drastic changes at this point. It’s good as a way to get feedback on the most tricky decisions, though.

Wrap-up

After any final edits, I recommend sending a copy of the edited problems and solutions to the reviewers and test-solvers. They’re probably interested to know what problems made the cut, and you want to have eyes going through the final paper to check for ambiguities or errors.

I usually take the time to also send out some details of the selection itself: what the ratings for the problems looked like, often a sentence or two for each problem about the overall feedback, and documentation of my thought process in the draft selection. It’s good to give people feedback on their problems; in my experience the authors usually appreciate it a lot, especially if they decide to re-submit the problem elsewhere.

And that’s the process.

USA Special Team Selection Test Series for IMO 2021

A lot of people have been asking me how team selection is going to work for the USA this year. This information was sent out to the contestants a while ago, but I understand that there are a lot of people outside of MOP 2020 who are interested in seeing the TST problems :) so this is a quick overview of how things are going down this year.

This year there are six tests leading to the IMO 2021 team:

  • USA TSTST Day 1: November 12, 2020 (3 problems, 4.5 hours)
  • USA TSTST Day 2: December 10, 2020 (3 problems, 4.5 hours)
  • USA TSTST Day 3: January 21, 2021 (3 problems, 4.5 hours)
  • RMM Day 1: February 2021 (3 problems, 4.5 hours)
  • APMO: March 2021 (5 problems, 4 hours)
  • USAMO: April 2021 (2 days, each with 3 problems and 4.5 hours)

Everyone who was at the virtual MOP in June 2020 is invited to all three days of TSTST, and then the top scorers get to take the latter three exams as team selection tests for the IMO. Meanwhile, the RMM and EGMO teams are based on just the three days of TSTST.

Similar to past years, discussion of TSTST is allowed starting at noon Eastern time on the Monday after each day. That means you can look forward to the first set of three new problems coming out on Monday, November 16, and similarly for the other two days of TSTST.

To add to the hype, I’ll be doing a short one-hour-or-less Twitch stream at 8:00pm ET on Tuesday November 17 where I present the solutions to the TSTST problems of day 1.  If there’s demand, I’ll probably run a review session for the other two days of TSTST, as well.

EDIT: Changed stream time to Tuesday so more people have time to try the problems.

USEMO sign-ups are open

I’m happy to announce that sign-ups for my new olympiad style contest, the United States Ersatz Math Olympiad (USEMO), are open now! The webpage for the USEMO is https://web.evanchen.cc/usemo.html (where sign-ups are posted).

https://web.evanchen.cc/static/usemo/usemo-logo.png

The US Ersatz Math Olympiad is a proof-based competition open to all US middle and high school students. Like many competitions, its goals are to develop interest and ability in mathematics (rather than to measure them). However, it is one of the few proof-based contests with this level of openness. You can see more about the goals of this contest in the mission statement.

The contest will run over Memorial Day weekend:

  • Day 1 is Saturday, May 23, 2020, from 12:30pm ET to 5:00pm ET.
  • Day 2 is Sunday, May 24, 2020, from 12:30pm ET to 5:00pm ET.

In the future, assuming continued interest, I hope to make the USEMO into an annual tradition run in the fall.

Circular optimization

This post will mostly be focused on construction-type problems in which you’re asked to construct something satisfying property {P}.

Minor spoilers for USAMO 2011/4, IMO 2014/5.

1. What is a leap of faith?

A good thing to do, whenever you can, is to make “safe moves” that are implied by the property {P}. Here’s a simple example.

Example 1 (USAMO 2011)

Find an integer {n} such that the remainder when {2^n} is divided by {n} is odd.

It is easy to see, for example, that {n} itself must be odd for this to be true, and so we can make our life easier without incurring any worries by restricting our search to odd {n}. You might therefore call this an “optimization”: a kind of move that makes the problem easier, essentially for free.
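
As a small aside, this kind of safe move also pays off if you decide to let a computer do the searching. Here is a quick brute-force sketch of my own (not part of the original problem or its solution), which only bothers to test odd {n}:

def find_odd_remainder(limit=10**6):
	# Look for n with 2^n mod n odd; by the observation above, only odd n
	# need to be tested.  The search limit is arbitrary.
	for n in range(3, limit, 2):
		if pow(2, n, n) % 2 == 1:
			return n
	return None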

But oftentimes such “safe moves” are not enough to solve the problem, and you eventually have to make “leap-of-faith moves”. For example, in the above problem, we might try to focus our attention on numbers {n = pq} for primes {p} and {q}. This does make our life easier, because we’ve zoomed in on a special type of {n} which is easy to compute. But it runs the risk that maybe there is no such example of {n}, or that the smallest one is difficult to find.

2. Circular reasoning can sometimes save the day

However, a strange type of circular reasoning can sometimes happen, in which a move that would otherwise be a leap-of-faith is actually known to be safe because you also know that the problem statement you are trying to prove is true. I can hardly do better than to give the most famous example:

Example 2 (IMO 2014)

For every positive integer {n}, the Bank of Cape Town issues coins of denomination {\frac 1n}. Given a finite collection of such coins (of not necessarily different denominations) with total value at most {99 + \frac12}, prove that it is possible to split this collection into {100} or fewer groups, such that each group has total value at most {1}.

Let’s say in this problem we find ourselves holding two coins of weight {1/6}. Perhaps we wish to put these coins in the same group, so that we have one less decision to make. However, this could rightly be viewed as a “leap-of-faith”, because there’s no logical reason why the task must remain possible after making this first move.

Except there is a non-logical reason: this is the same as trading the two coins of weight {1/6} for a single coin of weight {1/3}. Why is the task still possible? Because the problem says so: the very problem we are trying to solve includes this case, too. If the problem is going to be true, then it had better be true after we make this trade.

Thus by a perverse circular reasoning we can rest assured that our leap-of-faith here will not come back to bite us. (And in fact, this optimization is a major step of the solution.)

3. More examples of circular optimization

Here are some more examples of problems you can try that I think have a similar idea.

Problem 1

Prove that in any connected graph {G} on {2004} vertices one can delete some edges to obtain a graph (also with {2004} vertices) whose degrees are all odd.

Problem 2 (USA TST 2017)

In a sports league, each team uses a set of at most {t} signature colors. A set {S} of teams is color-identifiable if one can assign each team in {S} one of their signature colors, such that no team in {S} is assigned any signature color of a different team in {S}. For all positive integers {n} and {t}, determine the maximum integer {g(n,t)} such that: In any sports league with exactly {n} distinct colors present over all teams, one can always find a color-identifiable set of size at least {g(n,t)}.

Feel free to post more examples in the comments.

MOHS hardness scale

There’s a new addition to my olympiad problems and solutions archive: I created an index of many past IMO/USAMO/USA TST(ST) problems, rated by my opinion of their difficulty. You can grab the direct link to the file below:

https://evanchen.cc/upload/MOHS-hardness.pdf

In short, the scale runs from 0M to 50M in increments of 5M, and every USAMO / IMO problem on my archive now has a rating too.

My hope is that this can be useful in a couple ways. One is that I hope it’s a nice reference for students, so that they can better make choices about what practice problems would be most useful for them to work on. The other is that the hardness scale contains a very long discussion about how I judge the difficulty of problems. While this is my own personal opinion, obviously, I hope it might still be useful for coaches or at least interesting to read about.

As long as I’m here, I should express some concern that it’s possible this document does more harm than good, too. (I held off on posting this for a few months, but eventually decided to at least try it and see for myself, and just learn from it if it turns out to be a mistake.) I think there’s something special about solving your first IMO problem or USAMO problem or whatever and suddenly realizing that these problems are actually doable — I hope it would not be diminished by me rating the problem as 0M. Maybe more information isn’t always a good thing!

IMO 2019 Aftermath

Here is my commentary for the 2019 International Math Olympiad, consisting of pictures and some political statements about the problems.

Summary

This year’s USA delegation consisted of leader Po-Shen Loh and deputy leader Yang Liu. The USA scored 227 points, tying for first place with China. For context, that is missing a total of four problems across all students, which is actually kind of insane. All six students got gold medals, and two earned perfect scores.

  1. Vincent Huang 7 7 3 7 7 7
  2. Luke Robitaille 7 6 2 7 7 6
  3. Colin Shanmo Tang 7 7 7 7 7 7
  4. Edward Wan 7 6 0 7 7 7
  5. Brandon Wang 7 7 7 7 7 1
  6. Daniel Zhu 7 7 7 7 7 7

Korea was in 3rd place with 226 points, just one point shy of first, but way ahead of the 4th place score of 187 points. (I would actually have been happier if Korea had tied with USA/China too; a three-way tie would have been a great story to tell.)

Pictures

2019-07-10_13-07-10

Leader’s hotel at Celtic manor.

2019-07-15_14-15-24

The opening ceremony!

opening-wifi-networks

Spotted by the leader of France, while waiting for the opening ceremony to start.

dizzy-screenshot

From last October. It turned out to be IMO5.

2019-07-17_17-39-04

My room at the University of Bath.

2019-07-17_20-33-39

Picture with Anant Mudgal, OTIS instructor and IMO6 author

2019-07-20_23-20-51

McDonald’s trip, an annual tradition whenever USA wins an international contest.

po-pic

Picture with Po-Shen Loh at McDonald’s

2019-07-21_11-54-30

Canmoo with friend (another tradition apparently)

crossword-at-imo

Ran into this while working with team USA on a crossword after the contest

2019-07-21_19-18-42

View from the Ferris wheel at the closing ceremony

twn-pic

Picture with team Taiwan at the closing ceremony

yang-pic

Picture with deputy leader Yang Liu

Contest analysis

You can find the problems and my solutions on my website already, and this year’s organizers were kind enough to have already posted the official solutions from the Problem Selection Committee. So what follows are merely my own opinions and thoughts on the problems.

First, comments on the individual problems. (Warning: spoilers follow.)

  1. This is a standard functional equation, which is quite routine for students with experience. In general, I don’t really like to put functional equations as opening problems to exams like the IMO or USAJMO, since students who have not seen a functional equation often have a difficult time understanding the problem statement.
  2. This is the first medium geometry problem that the IMO has featured since 2012 (the year before the so-called “Geoff rule” arose). I think it’s genuinely quite tricky to do using only vanilla synthetic methods, like the first official solution. In particular, the angle chasing solution was a big surprise to me, because my approach (and many other approaches) start by eliminating the points {P_1} and {Q_1} from the picture, while the first official solution relies on them deeply. (For example, one might add {X = PB_1 \cap AB} and {Y = QA_1 \cap AB} and note that {P_1CXA} and {Q_1CYB} are cyclic, so it is equivalent to prove that {T = PX \cap QY} lies on the radical axis of the circumcircles of {\triangle CXA} and {\triangle CYB}.) That said, I found that the problem succumbs to barycentric coordinates, and I will be adding it as a great example to my bary handout. The USA students seem to have preferred to use moving points, or Menelaus’ theorem (which in this case was just clumsier bary).
  3. I actually felt the main difficulty of the problem was dealing with the artificial condition. Basically, the problem is about performing an operation while trying to not disconnect the graph. However, this “connectedness” condition, together with a few other necessary weak hypotheses (namely: not a clique, and has at least one odd-degree vertex) are lumped together in a misleading way, by specifying 1010 vertices of degree 1009 and 1009 vertices of degree 1010. This misleads contestants into, say, splitting the graph into the even and odd vertices, while hiding the true nature of the problem. I do think the idea behind the problem is quite cute though, despite being disappointed at how it was phrased. Yang Liu suggested to me this might have been better off at the IOI, where one might ask a contestant to output a sequence of moves reducing it to a tree (or assert none exist).
  4. I liked the problem (and I found the connection to group theory amusing), though I think it is pretty technical for an IMO 1/4. Definitely on the hard side for inexperienced countries.
  5. This problem was forwarded to the USAMO chair and then to the IMO at my suggestion, so I was very happy to see it on the exam. I think it’s a natural problem statement that turns out to have an unexpectedly nice solution. (And there is actually a natural interpretation of the statement via a Turing machine.) However, I thought it was quite easy for the P5 position (even easier than IMO 2008/5, say).
  6. A geometry problem from one of my past students Anant Mudgal. Yang and I both solved it very quickly with complex numbers, so it was a big surprise to us that none of the USA students did. I think this problem is difficult but not “killer”; easier than last year’s IMO 2018/6 but harder than IMO 2015/3.

For gold-level contestants, I think this was the easiest exam to sweep in quite a few years, and I confess to wondering during the IMO whether we had a small chance at getting a full 252 (until I found out that the marking scheme deducted a point on P2). Problem 2 is tricky but bary-able, and Problem 5 is quite easy. Furthermore, neither Problem 3 nor Problem 6 is a “killer” (the type of problem that gets fewer than 20 solves, say). So a very strong contestant would really have 3 hours each day to work on a Problem 3 or Problem 6 which is not too nightmarish. I was actually worried for a while that the gold cutoff might be as high as 34 points (five problems), but I was just being silly.

An opening speech for MOP

While making preparations for this year’s MOP, I imagined to myself what I would say on orientation night if I was director of the camp, and came up with the following speech. I thought it might be nice to share on this blog. Of course, it represents my own views, not the actual views of MOP or MAA. And since I am not actually director of MOP, the speech was never given.

People sometimes ask me, why do we have international students at MOP? Doesn’t that mean we’re training teams from other countries? So I want to make this clear now: the purpose of MOP is not to train and select future IMO teams.

I know it might seem that way, because we invite by score and grade. But I really think the purpose of MOP is to give each one of you the experience of working hard and meeting new people, among other things. Learn math, face challenges, make friends, the usual good stuff, right? And that’s something you can get no matter what your final rank is, or whether you make IMO or EGMO or even next year’s MOP. The MOP community is an extended family, and you are all part of it now.

What I mean to say is, the camp is designed with all 80 of you in mind. It made me sad back in 2012 when one of my friends realized he had little chance of making it back the next year, and told me that MAA shouldn’t have invited him to begin with. Even if I can only take six students to the IMO each year, I never forget the other 74 of you are part of MOP too.

This means one important thing: everyone who puts in their best shot deserves to be here. (And unfortunately this also means there are many other people who deserve to be here tonight too, and are not. Maybe they solved one or two fewer problems than you did; or maybe they even solved the same number of problems, but they are in 11th grade and you are in 10th grade.)

Therefore, I hope to see all of you put in your best effort. And I should say this is not easy to do, because MOP is brutal in many ways. The classes are mandatory, we have a 4.5-hour test every two days, and you will be constantly graded. You will likely miss problems that others claim are easy. You might find out you know less than you thought you did, and this can be discouraging. Especially in the last week, when we run the TSTST, many of you will suddenly realize just how strong Team USA is.

So I want to tell you now, stay determined in the face of adversity. This struggle is your own, and we promise it’s worth it, no matter the outcome. We are rooting for you, and your friends sitting around you are too. (And if the people around you aren’t your friends yet, change that asap.)

Hard and soft techniques

In yet another contest-based post, I want to distinguish between two types of thinking: things that could help you solve a problem, and things that could help you understand the problem better. Then I’ll talk a little about how you can use the latter. (I’ve talked about this in my own classes for a while by now, but only recently realized I’ve never gotten the whole thing in writing. So here goes.)

1. More silly terminology

As usual, to make these things easier to talk about, I’m going to introduce some words to describe these two. Taking a page from martial arts, I’m going to run with hard and soft techniques.

A hard technique is something you try in the hopes it will prove something — ideally, solve the problem, but at least give you some intermediate lemma. Perhaps a better definition is “things that will end up in the actual proof”. Examples include:

  • Angle chasing in geometry, or proving quadrilaterals are cyclic.
  • Throwing complex numbers at a geometry problem.
  • Plugging in some values into a functional equation (which gives more equations to work with).
  • Taking a given Diophantine equation modulo {p} to get some information, or taking {p}-adic evaluations.
  • Trying to perform an induction, for example by deleting an element.
  • Trying to write down an inequality that when summed cyclically gives the desired conclusion.
  • Reducing the problem to one or more equivalent claims.

and so on. I’m sure you can come up with more examples.

In contrast, a soft technique is something you might try to help you understand the problem better — even if it might not prove anything. Perhaps a better definition is “things not written up”. Examples include:

  • Examining particular small cases of the problem.
  • Looking at the equality cases of a min/max problem.
  • Considering variants of the problem (for example, adding or deleting conditions).
  • Coming up with lots of concrete examples and playing with them.
  • Trying to come up with a counterexample to the problem’s assertion and seeing what the obstructions are.
  • Drawing pictures, even on non-geometry problems (see JMO2 and JMO5 in my 2019 notes for example).
  • Deciding whether or not a geometry problem is “purely projective”.
  • Counting the algebraic degrees of freedom in a geometry problem.
  • Checking all the linear/polynomial solutions to a functional equation, in order to get a guess what the answer might be.
  • Blindly trying to guess solutions to an algebraic equation.
  • Making up an artificial unnatural function in a functional equation, and then trying to see why it doesn’t work (or occasionally being surprised that it does work).
  • Thinking about why a certain hard technique you tried failed, or even better convincing yourself it cannot work (for example, this Diophantine equation has a solution modulo every prime, so stop trying to one-shot by mods).
  • Giving a heuristic argument that some claim should be true or false (“probably {2^n \bmod n} is odd infinitely often”), or even easy/hard to prove.

and so on. There is some grey area between these two; some of the examples above might be argued to be in the other category (especially in the context of specific problems), but hopefully this gives you a sense of what I’m talking about.

If you look at things I wrote back when I was in high school, you’ll see this referred to as “attacking” and “scouting” instead. This is too silly for me now even by my standards, but back then it was because I played a lot of StarCraft: Brood War (I’ve since switched to StarCraft II). The analogy there is pretty self-explanatory: knowing what your opponent is doing is important because your army composition and gameplay decisions should change in reaction to more information.

2. Using soft techniques: an example

Now after all that blabber, here’s the action item for you all: you should try soft techniques when stuck.

When you first start doing a problem, you will often have some good ideas for what to try. (For example: a wild geometry appeared, let’s scout for cyclic quadrilaterals.) Sometimes, if you are lucky (especially if the problem is easier), this will be enough to topple the problem, and you can move on. But more often what happens is that eventually you run out of steam, and the problem is still standing. When that happens, my advice is to try some soft techniques if you haven’t already done so.

Here’s an example that I like to give.

Example 1 (USA TST 2009)

Find all real numbers {x}, {y}, {z} which satisfy

\displaystyle  \begin{aligned} x^3 &= 3x - 12y + 50,\\ y^3 &= 12y + 3z - 2,\\ z^3 &= 27z + 27x. \end{aligned}

A common first thing that people will try to do is add the first two equations, since that will cause the {12y} terms to cancel. This gives a factor of {x+y} on the left and an {x+z} on the right, so then maybe you try to substitute that into the {27(x+z)} in the last equation, so you get {z^3 = 9(x^3+y^3-48)}, cool, there are no more linear terms. Then...

Usually this doesn’t end well. You add this and subtract that and in the end all you see is equation after equation, and after a while you realize you’re not getting anywhere.

So we’re stuck now. What to do? I’ll now bring in two of the soft techniques I mentioned earlier:

  1. Let’s imagine the problem had {\mathbb R} replaced with {\mathbb C}. In this new problem, you can imagine solving for {y} in terms of {x} using the first equation, then {z} in terms of {y}, and then finally putting everything into the last equation to find a degree {27} polynomial in {x}. I say “imagine” because wow would that be ugly.

    But here’s the kicker: it’s a polynomial. It should have exactly {27} complex roots, with multiplicity. That’s a lot. Really?

    So here’s a hint you might take: there’s a good reason this is over {\mathbb R} but not {\mathbb C}. Often these kind of things end up being because there’s an inequality going on somewhere, so there will only be a few real solutions even though there might be tons of complex ones.

  2. Okay, but there’s an even more blatant thing we don’t know yet: what is the answer, anyways?

    This was more than a little bit embarrassing. We’re half an hour into the problem and thoroughly stuck, and we don’t even have a single {(x,y,z)} that works? Maybe it’d be a good idea to fix that, like, right now. In the simplest way possible: guess and check.

    It’s much easier than it sounds, since if you pick a value of {z}, say, then you get {x} from the third equation, {y} from the first, then check whether it fits the second. If we restrict our search to integer values of {z}, then there aren’t so many that are reasonable.

I won’t spoil what the answer {(x,y,z)} is, other than saying there is an integer triple and it’s not hard to find it as I described. Once you have these two meta-considerations, you suddenly have a much better foothold, and it’s not too hard to solve the problem from here (for a USA TST problem anyways).
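
For the curious, here is a small sketch (my own, not from the official solution) of the guess-and-check described in item 2; running it will, of course, reveal the triple. For each integer {z} in a window, solve the third equation for {x}, then the first for {y}, and test the second:

from fractions import Fraction

def guess_and_check(bound=20):
	# Try integer z in [-bound, bound]; the window size is arbitrary.
	for z in range(-bound, bound + 1):
		x = Fraction(z**3 - 27*z, 27)   # third equation solved for x
		y = (3*x + 50 - x**3) / 12      # first equation solved for y
		if y**3 == 12*y + 3*z - 2:      # does the second equation hold?
			print(x, y, z)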

I pick this example because it really illustrates how hopeless repeatedly using hard techniques can be if you miss the right foothold (and also because in this problem it’s unusually tempting to just think that more manipulation is enough). It’s not impossible to solve the problem without first realizing what the answer is, but it is certainly way more difficult.

3. Improving at soft techniques

What this also means is that, in the after-math of a problem (when you’ve solved/given up on a problem and are reading and reflecting on the solution), you should also add soft techniques into the list of possible answers to “how might I have thought of that?”. An example of this is at the end of my earlier post On Reading Solutions, in which I describe how you can come up with solutions to two Putnam problems by thinking carefully about what should be the equality case.

Doing this is harder than it sounds, because the soft techniques are the ones that by definition won’t appear in most written solutions, and many people don’t even explicitly recognize them. But soft techniques are the things that tell you which hard techniques to use, which is why they’re so valuable to learn well.

In writing this post, I’m hoping to make the math contest world more aware that these sorts of non-formalizable ideas are things that can (and should) be acknowledged and discussed, the same way that the hard techniques are. In particular, just as there are a plethora of handouts on every hard technique in the olympiad literature, it should also be possible to design handouts aimed at practicing one or more particular soft techniques.

At MOP every year, I’m starting to see more and more classes to this effect (alongside the usual mix of classes called “inversion” or “graph theory” or “induction” or whatnot). I would love to see more! End speech.

Undergraduate Math 011: a firsT yeaR coursE in geometrY

tl;dr I parodied my own book, download the new version here.

People often complain to me about how olympiad geometry is just about knowing a bunch of configurations or theorems. But it recently occurred to me that when you actually get down to its core, the amount of specific knowledge that you need to do well in olympiad geometry is very little. In fact I’m going to come out and say it: I think all the theory of mainstream IMO geometry would not even fill a one-semester college course.

So to stake my claim, and celebrate April Fool’s Day, I decided to actually do it. What would olympiad geometry look like if it was taught at a typical college? To find out, I present to you the course notes for:

Undergrad Math 011: a firsT yeaR coursE in geometrY
handtruck

It’s 36 pages long, title page, preface, and index included. So, there you go. It is also the kind of thing I would never want to read, and the exercises are awful, but what does that matter?

scrshot-tr011ey.png

(I initially wanted to post this file as an April Fool’s gag, but became concerned that one would not have to be too gullible to believe these were actual course notes and then attempt to work through them.)

RMM 2019 pictures and aftermath

Pictures, thoughts, and other festivities from the 2019 Romanian Master of Mathematics. See also the MAA press release.

Summary

Po-Shen Loh and I spent the last week in Bucharest with the United States team for the 11th RMM. The USA usually sends four students who have not previously attended an IMO or RMM.

This year’s four students did breathtakingly well:

  1. Benjamin Qi — gold (rank 2nd)
  2. Luke Robitaille — silver (rank 10th)
  3. Carl Schildkraut — gold (rank 8th)
  4. Daniel Zhu — gold (rank 4th)

(Yes, there are only nine gold medals this year!)

The team score is obtained by summing the three highest scores of the four team members. The USA won the team component by a lofty margin, making it the first time we’ve won back-to-back. I’m very proud of the team.

Pictures

RMM 2019 team after the competition (taken by Daniel Zhu’s dad):

RMM-2019-USA

McDonald’s trip. Apparently, the USA tradition is that whenever we win an international contest, we have to order Chicken McNuggets. Fortunately, this time we didn’t order one for every point on the team (a silly idea that was unfortunately implemented at IMO 2018).

2019-02-24 22.40.48

The winner plate. Each year the winning country brings it back to get it engraved, and returns it to the competition the next year. I will have it for the next while.

2019-02-25 02.55.20

And a present from one of the contestants (thanks!):

2019-02-25 22.17.53.jpg

 

Problems, and thoughts on them

Problem 1 (Maxim Didin, RUS)

Amy and Bob play a game. First, Amy writes down a positive integer on a board. Then the players alternate turns, with Bob moving first. On Bob’s turn, he chooses a positive integer {b} and subtracts {b^2} from the number on the board. On Amy’s turn, she chooses a positive integer {k} and raises the number on the board to the {k}th power. Bob wins if the number on the board ever reads zero. Can Amy prevent Bob from winning?

I found this to be a cute easy problem. The official solution is quite clever, but it’s possible (as I myself did) to have a very explicit solution using e.g. the characterization of which integers are the sum of k squares (for {k \le 4}).

Problem 2 (Jakob Jurij Snoj, SLV)

Let {ABCD} be an isosceles trapezoid with {\overline{AB} \parallel \overline{DC}}. Let {E} be the midpoint of {\overline{AC}}. Denote by {\Gamma} and {\Omega} the circumcircles of triangles {ABE} and {CDE}, respectively. The tangent to {\Gamma} at {A} and the tangent to {\Omega} at {D} intersect at point {P}. Prove that {\overline{PE}} is tangent to {\Omega}.

There are nice synthetic solutions to this problem, but I found it much easier to apply complex numbers with {\triangle CDE} as the unit circle, taking as the phantom point {P} the intersection of tangents. So, unsurprisingly, all our team members solved the problem quite quickly.

I suspect the American students who took RMM Day 1 at home (as part of the USA team selection process) will find the problem quite easy as well. Privately, it is a bit of a relief for me, because if a more difficult geometry had been placed here I would have worried that our team selection this year has (so far) been too geometry-heavy.

Problem 3 (Fedor Petrov, RUS)

Let {\varepsilon > 0} be a positive real number. Prove that if {n} is sufficiently large, then any simple graph on {n} vertices with at least {(1+\varepsilon)n} edges has two (different) cycles of equal length.

A really nice problem with a short, natural problem statement. I’m not good at this kind of problem, but I enjoy it anyways. Incidentally, one of our team members won last year’s IOI, and so this type of problem is right up his alley!

Problem 4 (Morteza Saghafian, IRN)

Show that for every positive integer {N} there exists a simple polygon (not necessarily convex) admitting exactly {N} distinct triangulations.

A fun construction problem. I think it’s actually harder than it looks, but with enough time everyone eventually catches on.

Problem 5 (Jakob Jurij Snoj, SLV)

Solve over {{\mathbb R}} the functional equation

\displaystyle f(x+yf(x)) + f(xy) = f(x) + f(2019y).

I found this problem surprisingly pernicious. Real functional equations in which all parts of the equation are “wrapped by f” tend to be hard to deal with: one has to think about injectivity and the like in order to have any hope of showing that f actually takes on some desired value. And the answer is harder to find than it seems — it was (rightly) worth a point just to state the entire answer correctly.

Fortunately for our team members, the rubric for the problem was generous, and it was possible to get 4-5 points without a complete solution. In the USA, olympiad grading tends to be even harsher than in most other countries (though not as Draconian as the Putnam), so this came as a surprise to the team. I jokingly told the team afterwards that they should appreciate how I hold them to a higher standard than the rest of the world.

(Consequently, the statistics for this problem are somewhat misleading — the average score makes the problem seem easier than it actually is. In truth there were not that many 6’s and 7’s given out.)

Problem 6 (Adrian Beker)

Find all pairs of integers {(c, d)}, both greater than {1}, such that the following holds:

For any monic polynomial {Q} of degree {d} with integer coefficients and for any prime {p > c(2c+1)}, there exists a set {S} of at most {\tfrac{2c-1}{2c+1} \cdot p} integers, such that

\displaystyle \bigcup_{s \in S} \left\{ s,\; Q(s),\; Q(Q(s)),\; Q(Q(Q(s))),\; \dots \right\}

contains a complete residue system modulo {p} (i.e., intersects with every residue class modulo {p}).

Unlike problem 5, I found this problem to be quite straightforward. I think it should have been switched with problem 5. Indeed, all our team members produced complete solutions to this problem.

So I am glad that our team has learned by now to try all three problems seriously on any given day. I have seen too many times students who would spend all their time on a P2 and not solve it, only to find out that P3 was of comparable or easier difficulty.