Understanding with System 1

Math must be presented for System 1 to absorb and only incidentally for System 2 to verify.

I finally have a sort-of formalizable guideline for teaching and writing math, and what it means to “understand” math. I’ve been unconsciously following this for years and only now managed to write down explicitly what it is that I’ve been doing.

(This post is written from a math-centric perspective, because that’s the domain where my concrete object-level examples come from. But I suspect much of it applies to communicating hard ideas in general.)

S1 and S2

The quote above refers to the System 1 and System 2 framework from Thinking, Fast and Slow. Roughly it divides the brain’s thoughts into two categories:

  • S1 is the part of the brain characterized by fast, intuitive, automatic, instinctive, emotional responses; for example, when you read the text “2+2=?”, S1 tells you (without any effort) that this equals 4.
  • S2 is the part of the brain characterized by slow, deliberative, effortful, logical responses; for example, S2 is used to count the number of words in this sentence.

(The link above gives some more examples.)

The premise of this post is that understanding math well is largely about having the concept resonate with your S1, rather than your S2. For example, let’s take groups from abstract algebra. Then I claim that

G = \{ a/b \mid a,b \text{ odd integers} \}

is a group under the usual multiplication. Now, if you have a student who’s learning group theory for the first time, the only way they could see this is a group is to compare it against a list of the group axioms, and have their S2 verify them one by one. But experienced people don’t do this: their S1 automatically tells them that G “feels” like a group (because e.g. it’s closed and doesn’t have division-by-zero issues).
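For contrast, here is roughly what the S2-style check looks like when written out mechanically. This is only a sketch on a finite sample of G (the helper name `in_G` and the sample range are assumptions made for illustration), so it is a sanity check and not a proof.

```python
from fractions import Fraction
from itertools import product

def in_G(q: Fraction) -> bool:
    """True if q, in lowest terms, has an odd numerator and an odd denominator."""
    return q.numerator % 2 != 0 and q.denominator % 2 != 0

# A finite sample of G built from small odd integers (the full set is infinite).
odds = [n for n in range(-9, 10) if n % 2 != 0]
sample = {Fraction(a, b) for a, b in product(odds, repeat=2)}

# Identity: 1 = 1/1 is odd over odd.
identity_ok = in_G(Fraction(1))
# Closure: odd * odd is odd, in both numerator and denominator.
closure_ok = all(in_G(x * y) for x, y in product(sample, repeat=2))
# Inverses: flipping odd/odd gives odd/odd, and 0 is never in G.
inverses_ok = all(in_G(1 / x) and x * (1 / x) == 1 for x in sample)
```

Associativity is inherited from the rationals, so it is not checked. The comments are the S1 version: "odd times odd is odd" and "zero never shows up" are the whole story.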

I think this S1-level understanding is what it means to “get it”. Verifying a solution to a hard olympiad problem by having S2 check each individual step is straightforward in principle, albeit time-consuming. The tricky part is to get this solution to resonate with S1. Hence my advice to never read a solution line by line.

Writing for S1

What this means is that if you’re trying to teach someone an idea, then you should be focusing on trying to get their S1 to grasp it, rather than just their S2. For example, in math it’s not enough to just give a sequence of logical steps which implies the result: give it life.

Here are some examples of ways I (try to) do this.

First, giving good concrete examples. S1 reacts well when it “sees” a concrete object like G above, and can see some intuitive properties about it right away. Abstract “symbol-pushing” is usually left to S2 instead.

Second, drawing pictures, so that S1 can actually see the object. On one extreme end, you can write something like “a point $S$ lies on the polar of $T$ if and only if $T$ lies on the polar of $S$”, but it’s much better to just have a picture:

You can even do this for things that aren’t really geometrical in nature. For example, my Napkin features the following picture of cardinal collapse when forcing.

Third, write like you talk, and share your feelings. S1 is emotional. S1 wants to know that compactness is a good property for a space to have, or that non-Noetherian rings are way too big and “only weirdos care about non-Noetherian rings” (just kidding!), or that ramified primes are the “finitely many edge cases” and aren’t worth worrying about. These S1 reactions you get are the things you want to pass on. In particular, avoid standard formal college-textbook-bleed-your-eyes-dry-in-boredom style. (To be fair, not all textbooks do this; this is one reason why I like Pugh’s book so much, for example.)

Even the mechanics on the page can be made to accommodate S1 in this way. S1 can’t read a wall of text; S2 has to put in effort to do that. But S1 can pick out section headers, or bolded phrases like this one, and so on and so forth. That’s why in Napkin all the examples are in separate red boxes and all the big theorems are in blue boxes, and important philosophical points are typeset in bold centered green text. This way S1 naturally puts its attention there.

But do not force it

On the flip side, if you’re trying to learn something, there’s a common failure mode where you try to keep forcing S2 to do something unnatural (rather than trying to have S1 figure it out). This is what happens when you don’t understand what the Chinese Remainder Theorem is trying to say, so you try to fix this by repeatedly reading the proof line by line, and still don’t really understand what is going on. Usually this ends up in S2 getting tired and not actually reading the proof after the third or fourth iteration.

(For the Chinese remainder theorem the right thing to do is ask yourself why any arithmetic progression with common difference 7 must contain multiples of 3: credits to Dominic Yeo again for that. I’m not actually sure what you’re supposed to do when stuck on math in general. Usually I just ask my friends what is going on, or give up for now and come back later.)
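The arithmetic-progression question is small enough to just try. The sketch below (the function name is made up for illustration) looks at three consecutive terms of a progression with common difference 7 and records their residues mod 3.

```python
def first_multiple_of_3(start: int, step: int = 7) -> int:
    """Return the first of start, start+step, start+2*step that is divisible by 3."""
    for k in range(3):
        term = start + k * step
        if term % 3 == 0:
            return term
    raise ValueError("no multiple of 3 among the first three terms")

# Since 7 = 2*3 + 1, each step moves the residue mod 3 forward by one,
# so three consecutive terms hit every residue class, including 0.
residues = {start: [(start + 7 * k) % 3 for k in range(3)] for start in range(3)}
```

Here `residues` comes out as a cyclic shift of 0, 1, 2 for each starting point. The same thing happens whenever the common difference and the modulus share no common factor, which is the situation the Chinese Remainder Theorem is describing: a number with a prescribed residue mod 7 can still be given any residue mod 3.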

Actually, I really like the advice that SSC mentions: “develop instincts, then use them”.

On Reading Solutions

(Ed Note: This was earlier posted under the incorrect title “On Designing Olympiad Training”. How I managed to mess that up is a long story involving some incompetence with Python scripts, but this is fixed now.)

Spoiler warnings: USAMO 2014/1, and hints for Putnam 2014 A4 and B2. You may want to work on these problems yourself before reading this post.

1. An Apology

At last year’s USA IMO training camp, I prepared a handout on writing/style for the students at MOP. One of the things I talked about was the “ocean-crossing point”, which for our purposes you can think of as the discrete jump from a problem being “essentially not solved” ({0+}) to “essentially solved” ({7-}). The name comes from a Scott Aaronson post:

Suppose your friend in Boston blindfolded you, drove you around for twenty minutes, then took the blindfold off and claimed you were now in Beijing. Yes, you do see Chinese signs and pagoda roofs, and no, you can’t immediately disprove him — but based on your knowledge of both cars and geography, isn’t it more likely you’re just in Chinatown? . . . We start in Boston, we end up in Beijing, and at no point is anything resembling an ocean ever crossed.

I then gave two examples of how to write a solution to the following example problem.

Problem 1 (USAMO 2014)

Let {a}, {b}, {c}, {d} be real numbers such that {b-d \ge 5} and all zeros {x_1}, {x_2}, {x_3}, and {x_4} of the polynomial {P(x)=x^4+ax^3+bx^2+cx+d} are real. Find the smallest value the product

\displaystyle  (x_1^2+1)(x_2^2+1)(x_3^2+1)(x_4^2+1)

can take.

Proof: (Not-so-good write-up) Since {x_j^2+1 = (x_j+i)(x_j-i)} for every {j=1,2,3,4} (where {i=\sqrt{-1}}), we get {\prod_{j=1}^4 (x_j^2+1) = \prod_{j=1}^4 (x_j+i)(x_j-i) = P(i)P(-i)} which equals to {|P(i)|^2 = (b-d-1)^2 + (a-c)^2}. If {x_1 = x_2 = x_3 = x_4 = 1} this is {16} and {b-d = 5}. Also, {b-d \ge 5}, this is {\ge 16}. \Box

Proof: (Better write-up) The answer is {16}. This can be achieved by taking {x_1 = x_2 = x_3 = x_4 = 1}, whence the product is {2^4 = 16}, and {b-d = 5}.

Now, we prove this is a lower bound. Let {i = \sqrt{-1}}. The key observation is that

\displaystyle  \prod_{j=1}^4 \left( x_j^2 + 1 \right) 		= \prod_{j=1}^4 (x_j - i)(x_j + i) 		= P(i)P(-i).

Consequently, we have

\displaystyle  \begin{aligned} \left( x_1^2 + 1 \right) \left( x_2^2 + 1 \right) \left( x_3^2 + 1 \right) \left( x_4^2 + 1 \right) &= (b-d-1)^2 + (a-c)^2 \\ &\ge (5-1)^2 + 0^2 = 16. \end{aligned}

This proves the lower bound. \Box

You’ll notice that it’s much easier to see the key idea in the second solution: namely,

\displaystyle  \prod_j (x_j^2+1) = P(i)P(-i) = (b-d-1)^2 + (a-c)^2

which allows you to use the enigmatic condition {b-d \ge 5}.
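For completeness, the second equality can be expanded as follows. It uses only {i^2 = -1} and the fact that {a}, {b}, {c}, {d} are real, so that {P(-i)} is the complex conjugate of {P(i)}.

```latex
\begin{aligned}
P(i)      &= i^4 + a i^3 + b i^2 + c i + d = (1 - b + d) + (c - a)\,i, \\
P(-i)     &= \overline{P(i)} = (1 - b + d) - (c - a)\,i, \\
P(i)P(-i) &= |P(i)|^2 = (b - d - 1)^2 + (a - c)^2.
\end{aligned}
```

In the equality case {P(x) = (x-1)^4 = x^4 - 4x^3 + 6x^2 - 4x + 1}, this gives {b - d - 1 = 4} and {a - c = 0}, matching the value {16}.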

Unfortunately I have the following confession to make:

In practice, most solutions are written more like the first one than the second one.

The truth is that writing up solutions is sort of a chore that people never really want to do but have to, much like washing dishes. So most solutions won’t be written in a way that helps you learn from them. This means that when you read solutions, you should assume that the thing you really want (i.e., the ocean-crossing point) is buried somewhere amidst a haystack of other unimportant details.

2. Diff

But in practice even the “better write-up” I mentioned above still has too much information in it.

Suppose you were explaining how to solve this problem to a friend. You would probably not start your explanation by saying that the minimum is {16}, achieved by {x_1 = x_2 = x_3 = x_4 = 1} — even though this is indeed a logically necessary part of the solution. Instead, the first thing you would probably tell them is to notice that

\displaystyle  \prod_{j=1}^4 \left( x_j^2 + 1 \right) = P(i)P(-i) 	= (b-d-1)^2 + (a-c)^2 \ge 4^2 = 16.

In fact, if your friend has been working on the problem for more than ten minutes, this is probably the only thing you need to tell them. They probably already figured out by themselves that there was a good chance the answer would be {2^4 = 16}, just based on the condition {b-d \ge 5}. This “one-liner” is all that they need to finish the problem. You don’t need to spell out to them the rest of the details.
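If you want to convince yourself that the one-liner really is the whole computation, a numerical spot check is cheap. The following is a rough sketch with made-up helper names: it builds {P} from four random real roots and compares the product with {(b-d-1)^2 + (a-c)^2}.

```python
import random

def poly_from_roots(roots):
    """Coefficients of prod (x - r), highest degree first."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial p(x) by (x - r): x*p(x) - r*p(x).
        shifted = coeffs + [0.0]   # coefficients of x * p(x)
        scaled = [0.0] + coeffs    # coefficients of p(x), aligned with shifted
        coeffs = [s - r * t for s, t in zip(shifted, scaled)]
    return coeffs

def product_side(roots):
    """The product of (x_j^2 + 1) over the roots."""
    result = 1.0
    for r in roots:
        result *= r * r + 1
    return result

def one_liner(roots):
    """(b - d - 1)^2 + (a - c)^2 for P(x) = x^4 + a x^3 + b x^2 + c x + d."""
    _, a, b, c, d = poly_from_roots(roots)
    return (b - d - 1) ** 2 + (a - c) ** 2

rng = random.Random(0)
trials = [[rng.uniform(-3, 3) for _ in range(4)] for _ in range(1000)]
max_gap = max(abs(product_side(t) - one_liner(t)) for t in trials)
equality_case = one_liner([1, 1, 1, 1])
```

The random trials ignore the constraint {b-d \ge 5}; they only test the identity. The gap between the two sides stays at floating-point noise, and the equality case evaluates to 16.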

When you explain a problem to a friend in this way, you’re communicating just the difference: the one or two sentences such that your friend could work out the rest of the details themselves with these directions. When reading the solution yourself, you should try to extract the main idea in the same way. Olympiad problems generally have only a few main ideas in them, from which the rest of the details can be derived. So reading the solution should feel much like searching for a needle in a haystack.

3. Don’t Read Line by Line

In particular: you should rarely read most of the words in the solution, and you should almost never read every word of the solution.

Whenever I read solutions to problems I didn’t solve, I often read less than 10% of the words in the solution. Instead I search aggressively for the one or two sentences which tell me the key step that I couldn’t find myself. (Functional equations are the glaring exception to this rule, since in these problems there sometimes isn’t any main idea other than “stumble around randomly”, and the steps really are all about equally important. But this is rarer than you might guess.)

I think a common mistake students make is to treat the solution as a sequence of logical steps: that is, reading the solution line by line, and then verifying that each line follows from the previous ones. This seems to entirely miss the point, because not all lines are created equal, and most lines can be easily derived once you figure out the main idea.

If you find that the only way that you can understand the solution is reading it step by step, then the problem may simply be too hard for you. This is because what counts as “details” and “main ideas” are relative to the absolute difficulty of the problem. Here’s an example of what I mean: the solution to a USAMO 3/6 level geometry problem, call it {P}, might look as follows.

Proof: First, we prove lemma {L_1}. (Proof of {L_1}, which is USAMO 1/4 level.)

Then, we prove lemma {L_2}. (Proof of {L_2}, which is USAMO 1/4 level.)

Finally, we remark that putting together {L_1} and {L_2} solves the problem. \Box

Likely the main difficulty of {P} is actually finding {L_1} and {L_2}. So a very experienced student might think of the sub-proofs {L_i} as “easy details”. But younger students might find {L_i} challenging in their own right, and be unable to solve the problem even after being told what the lemmas are: which is why it is hard for them to tell that {\{L_1, L_2\}} were the main ideas to begin with. In that case, the problem {P} is probably way over their head.

This is also why it doesn’t make sense to read solutions to problems which you have not worked on at all — there are often details, natural steps and notation, et cetera which are obvious to you if and only if you have actually tried the problem for a little while yourself.

4. Reflection

The earlier sections describe how to extract the main idea of an olympiad solution. This is neat because instead of having to remember an entire solution, you only need to remember a few sentences now, and it gives you a good understanding of the solution at hand.

But this still isn’t achieving your ultimate goal in learning: you are trying to maximize your scores on future problems. Unless you are extremely fortunate, you will probably never see the exact same problem on an exam again.

So one question you should often ask is:

“How could I have thought of that?”

(Or in my case, “how could I train a student to think of this?”.)

There are probably some surface-level skills that you can pick out of this. The lowest hanging fruit is things that are technical. A small number of examples, with varying amounts of depth:

  • This problem is “purely projective”, so we can take a projective transformation!
  • This problem had a segment {AB} with midpoint {M}, and a line {\ell} parallel to {AB}, so I should consider projecting {(AB;M\infty)} through a point on {\ell}.
  • Drawing a grid of primes is the only real idea in this problem, and the rest of it is just calculations.
  • This main claim is easy to guess since in some small cases, the frogs have “violating points” in a large circle.
  • In this problem there are {n} numbers on a circle, {n} odd. The counterexamples for {n} even alternate up and down, which motivates proving that no three consecutive numbers are in sorted order.
  • This is a juggling problem!

(Brownie points if any contest enthusiasts can figure out which problems I’m talking about in this list!)

5. Learn Philosophy, not Formalism

But now I want to point out that the best answers to the above question are often not formalizable. Lists of triggers and actions are “cheap forms of understanding”, because going through a list of methods will only get you so far.

On the other hand, the un-formalizable philosophy that you can extract from reading a question is part of that legendary “intuition” that people are always talking about: you can’t describe it in words, but it’s certainly there. Maybe it would even be better if I reframed the question as:

“What does this problem feel like?”

So let’s talk about our feelings. Here is David Yang’s take on it:

Whenever you see a problem you really like, store it (and the solution) in your mind like a cherished memory . . . The point of this is that you will see problems which will remind you of that problem despite having no obvious relation. You will not be able to say concretely what the relation is, but think a lot about it and give a name to the common aspect of the two problems. Eventually, you will see new problems for which you feel like could also be described by that name.

Do this enough, and you will have a very powerful intuition that cannot be described easily concretely (and in particular, that nobody else will have).

This itself doesn’t make sense without an example, so here is an example of one philosophy I’ve developed. Here are two problems on Putnam 2014:

Problem 2 (Putnam 2014 A4)

Suppose {X} is a random variable that takes on only nonnegative integer values, with {\mathbb E[X] = 1}, {\mathbb E[X^2] = 2}, and {\mathbb E[X^3] = 5}. Determine the smallest possible value of the probability of the event {X=0}.

Problem 3 (Putnam 2014 B2)

Suppose that {f} is a function on the interval {[1,3]} such that {-1\le f(x)\le 1} for all {x} and

\displaystyle  \int_1^3 f(x) \; dx=0.

How large can {\int_1^3 \frac{f(x)}{x} \; dx} be?

At a glance there seems to be nearly no connection between these problems. One of them is a combinatorics/algebra question, and the other is an integral. Moreover, if you read the official solutions or even my own write-ups, you will find very little in common joining them.

Yet it turns out that these two problems do have something in common to me, which I’ll try to describe below. My thought process in solving either question went as follows:

In both problems, I was able to quickly make a good guess as to what the optimal {X}/{f} was, and then come up with a heuristic explanation (not a proof) why that guess had to be correct, namely, “by smoothing, you should put all the weight on the left”. Let me call this optimal argument {A}.
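To make "put all the weight on the left" concrete for the integral problem, here is a small experiment. It rests on assumptions that are not spelled out above: the guess is taken to be {f = 1} on {[1,2]} and {f = -1} on {[2,3]}, and it is compared only against step functions taking the values {\pm 1} on equal cells. So it illustrates the heuristic and proves nothing.

```python
import math
import random

def integral_of_f_over_x(signs):
    """Integral of f(x)/x over [1, 3] for a step function f.

    `signs` lists the value (+1 or -1) of f on each of len(signs) equal cells.
    """
    n = len(signs)
    width = 2.0 / n
    total = 0.0
    for k, s in enumerate(signs):
        left = 1.0 + k * width
        # The integral of 1/x over [left, left + width] is log((left + width) / left).
        total += s * math.log((left + width) / left)
    return total

cells = 200
# Assumed guess: all the positive weight on the left half of the interval.
guess = [1] * (cells // 2) + [-1] * (cells // 2)
guess_value = integral_of_f_over_x(guess)

rng = random.Random(0)
others = []
for _ in range(500):
    shuffled = guess[:]
    rng.shuffle(shuffled)  # still integrates to zero, but the weight is spread around
    others.append(integral_of_f_over_x(shuffled))
```

Every shuffled competitor comes out no larger than `guess_value`, because {1/x} is largest on the left. The value itself is {\ln 2 - \ln(3/2) = \ln(4/3)}, which is just the number that pops out of plugging in the guess.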

That conjectured {A} gave a numerical answer to the actual problem: but for both of these problems, it turns out that numerical answer is completely uninteresting, as are the exact details of {A}. It should philosophically be interpreted as “this is the number that happens to pop out when you plug in the optimal choice”. And indeed that’s what both solutions feel like. These solutions don’t actually care what the exact values of {A} are; they only care about the properties that made me think they were optimal in the first place.

I gave this philosophy the name Equality, with poster description “problems where looking at the equality case is important”. This text description feels more or less useless to me; I suppose it’s the thought that counts. But ever since I came up with this name, it has helped me solve new problems that come up, because they would give me the same feeling that these two problems did.

Two more examples of these themes that I’ve come up with are Global and Rigid, which will be described in a future post on how I design training materials.

Things SPARC

[EDIT 2018/03/05: This description seems significantly less accurate to me now than it did a few years ago, both because my views/values have changed substantially, and because SPARC has changed direction substantially since I attended as a junior counselor in 2015. I’ll leave it here as a reference, but it should be taken with a grain of salt.]

I often get asked about what I learned from the SPARC summer camp. This is hard to describe and I never manage to give as good an answer as I want, so I want to take the time to write down something concrete now. For context: I attended SPARC in 2013 and 2014 and again as a counselor in 2015, so this post is long overdue (but better late than never).

(For those of you still in high school: applications for 2016 are now open, due March 1, 2016. The program is completely free including room/board and you don’t need rec letters, so there is no reason to not apply.)

The short version is that maybe 1/4 of the life skills I use on a regular basis are things I picked up from SPARC. (The rest came from some combination of math contests and living in college dorms.) On paper SPARC seems like a math or CS camp, but there is a strong emphasis on practicality in the sense that the instructors specifically want to teach you things that you can apply in life. So in addition to technical classes on Bayes’ theorem and the like, you’ll have classes on much “softer” topics like

  • Posture (literally about having good body posture)
  • Aversion factoring (e.g. understanding why I’m not exercising and fixing it)
  • Expanding comfort zones (with hands-on practice; my year I learned to climb trees)

and so on. This makes it hard to compare to other math camps like MOP (though if you insist on drawing a comparison, I think many MOP+SPARC students agree they learned more from SPARC).

Now, here is my own list of concrete things which SPARC has taught me, in no particular order:

  • Being significantly more introspective / reflective about life. Example: realizing that some class/activity/etc. are not adding much value to life and dropping them.
  • Being interested in optimizing life in general; I now find it fun to think about how to be more productive and live life well the same way I like to think about hard math problems.
  • Thinking about thinking: things like mental models, cognitive biases, emotions, aversions, and System 1 vs System 2.
  • Becoming very aggressive at conserving time. I’m much more willing to trade money for time, and I actively ask whether I really need to do something, or if I can just axe it.
  • Using game theory concepts to think about the world. College tuition is expensive because this is the Nash equilibrium. Recognizing real-life situations which are well understood as games, like prisoner’s dilemma, chicken, etc.
  • Applying Bayes’ theorem and expected value to real life. Trying out X activity has constant cost but potentially large payoffs, hence large positive EV.
  • Being able to use probabilities in a meaningful way. Being able to tell the difference between being 90% confident and 70% confident in an event happening.
  • Actively buying O(n) returns for O(1) cost.
  • Being more willing to take less conventional paths, like essentially dropping out of high school to train for the IMO (I describe this in the first FAQ here). Another good example I haven’t done myself (yet) is taking a gap year.
  • Using Workflowy. It’s a big part of why I can think as clearly as I do. In context of SPARC, this is a special case of understanding the idea of working memory, which is also an idea I picked up from camp.
  • Writing a lot more. This is probably actually a consequence of the things above rather than something that I directly learned from SPARC; some combination of understanding working memory well, and being much more reflective.
  • Peer group and culture. This is a bigger one than people realize. In the same way that math contests establish a group of people where it’s cool to think about hard math problems, the SPARC network establishes a group of people with a culture of encouraging people to think about rationality. It’s very hard to be good at reflection in an isolated environment! SPARC lets you see how other people go about thinking about how to live life well and gives you other people to bounce ideas off of.

I’m sure there are other things, but it’s hard for me to notice since it’s been so long since I had to live life pre-SPARC. And there are some things that other people learned from SPARC that never stuck with me (lots of the social skills, for example). Much like your first time attending MOP, there will be more things to learn than you’ll actually be able to absorb. So the list above is only the things that I myself learned, and in fact I think the set of things you acquire from SPARC more or less molds to whichever particular things matter to you most.


In high school, I hated English class and thought it was a waste of time. Now I’m in college, and I still hate English class and think it’s a waste of time. (Nothing on my teachers, they were all nice people, and I hope they’re not reading this.)

However, I no longer think writing itself is a waste of time. Otherwise, I wouldn’t be blogging, even about math. This post explains why I changed my mind.

1. Guts

My impression is that teachers in high school got it all wrong.

In high school, students are told to learn algebra because “we all use math every day”. This is obviously false, yet somehow the students eventually are led to believe it.

You can’t actually be serious. Do people really think that knowing the Pythagorean Theorem will help in your daily life? I sure don’t, and I’m an aspiring mathematician. (Tip: Even real mathematicians stopped doing Euclidean geometry ages ago.) It’s hilarious when you think about it. We’ve convinced millions of kids all over the country that they’re learning math because it’s useful in their lives, and they grudgingly believe it.

The actual answer of why we teach math in schools is that it is supposed to teach students how to think. But even the teachers have lost sight of this. Most high school math teachers are now just interested in making sure their students can “do” certain classes of problems in a short time, where “do” here doesn’t refer to solving the problem but regurgitating the solution that’s already been presented. The process is so repetitive and artificial that in high school I wrote computer programs to do my homework for me, because all the “problems” were just the same thing with numbers changed. If you’re interested in just how far off math is, I encourage you to read Lockhart’s Lament.

How can this happen? I think the answer is that many high schoolers don’t really have the guts to think, “my math teachers don’t have a clue”, even though they like to joke about it. I have the guts to say this now because I know lots of math. And it’s amazing to know that millions and millions of people are just plain wrong about something I believe in.

But on to the topic of this post…

2. The world lied to me

I was always told that the purpose of English class was to learn to write. Why is this important? Because it was important to be able to communicate my ideas.

Dead wrong. Somehow the skill of being able to argue on the nature of love in Romeo and Juliet was going to help me when I was writing a paper on Evan’s Theorem years down the road? That’s what my parents said. It sounds absurd when I put it this way, but people believe it. (And let’s not forget the fact that theorems are named by last name…)

I claim that the situation is just like math. People are just being boneheads. As it turns out, the standard structure of an English essay is nothing more than a historical accident. Even the fact that essays are about literature is a historical accident. But that’s beyond the scope of what I have to say.

So what is the purpose of writing? It turns out that there is one, and that it has nothing to do with communication. It’s that writing clarifies thinking.

3. Writing lets you see everything

“I sometimes find, and I am sure you know the feeling, that I simply have too many thoughts and memories crammed into my mind…. At these times… I use the Pensieve. One simply siphons the excess thoughts from one’s mind, pours them into the basin, and examines them at one’s leisure.”

— Harry Potter and the Goblet of Fire

Here’s some advice to all of you still doing math contests: start keeping track of the problems you solve.

There are superficial reasons for doing this. A few days ago I was trying to write a handout on polynomials, and I was looking for some problems on irreducibility. I knew I had seen and done a bunch of these problems in the past, but of course like most people I hadn’t bothered to keep track of every problem I did, so I could only remember a few off the top of my head. So I had to go through the painful process of looking through my old posts on the Art of Problem Solving forums, searching through old databases, mucking through pages of garbage looking for problems that I did ages ago that I could use for my handout. And all the time I was thinking, “man, I should have kept track of all the problems I did”.

But there are deeper reasons for this. As I started collating the problems and solutions into a list, I started noticing some themes in the solutions that I never noticed before. For example, basically every solution started with the line “Assume for contradiction that {f} is not irreducible and write {f = g \cdot h}”. And then from there, one of three things happened.

  • The problem would take the coefficients modulo some prime or prime power, and then deduce some things about {g} and {h}. Obviously this only worked on the problems with integer coefficients.
  • The problem would start looking at absolute values of the coefficients and try to achieve some bound that showed the polynomial had to reduce in a certain way.
  • If the problem had multiple variables, the solution would reduce to a case with just one variable. This was always the case with problems that had complex coefficients as well.
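As a toy instance of the first theme in the list above (reducing the coefficients modulo a prime), here is a minimal sketch. The polynomials are hypothetical examples chosen for illustration and are not taken from the handout. It relies on two standard facts: a polynomial of degree 2 or 3 over a field is reducible only if it has a root, and a factorization of a monic integer polynomial survives reduction mod {p}.

```python
def has_root_mod_p(coeffs, p):
    """True if the integer polynomial (highest degree first) has a root modulo p."""
    for x in range(p):
        value = 0
        for c in coeffs:
            value = (value * x + c) % p  # Horner's rule, reduced modulo p
        if value == 0:
            return True
    return False

# Hypothetical example: f(x) = x^3 + x + 1 has no root modulo 2, so it is
# irreducible modulo 2. Since f is monic, any factorization f = g * h over the
# integers would give one modulo 2 as well, hence f is irreducible.
cubic = [1, 0, 1, 1]
cubic_has_root_mod_2 = has_root_mod_p(cubic, 2)

# The prime matters: x^2 + 1 has the root 1 modulo 2 (inconclusive),
# but no root modulo 3 (which settles it).
quadratic = [1, 0, 1]
quadratic_has_root_mod_2 = has_root_mod_p(quadratic, 2)
quadratic_has_root_mod_3 = has_root_mod_p(quadratic, 3)
```

The root test only settles degrees 2 and 3; for higher degrees a polynomial can factor into two quadratics without having a root, so more work is needed after reducing.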

You can’t really be serious — I’m only noticing this now? Here I was, already a retired contestant, looking at problems I had done long long ago and only realizing now there was a common theme. I had already done all the work by having done all the problems. The only difference was that I didn’t write anything down; as a result I could only look at one problem at a time.

Needless to say, I was very angry for the rest of the day.

4. External and Working Memory

Why does this happen? More profoundly, it turns out that humans have a finite working memory. You can only keep so many things in your head at once. That’s why it’s a stupid idea to not write down problems and (sketches of) solutions after you solve them and keep them somewhere you can look at.

I probably did at least 1000 olympiad problems over the course of my life. Did I manage to keep all the solutions in my head? Of course not. That’s why at the IMO in 2014, I didn’t try a maximality argument despite the {\sqrt n} in the problem. I think if I had kept better records I wouldn’t have missed this. How else do you get exactly {\sqrt n} in the lower bound? It’s not even an integer! Poof. There goes my neat 42.

I didn’t realize this wasn’t just a math thing until much later. I was talking about something along these lines during my interview for Harvard College; my interviewer was an artist. When I was talking about writing things down because I couldn’t keep them all in my head, he said something that surprised me — his easel was covered with sticky notes where he wrote down any ideas that occurred to him. He called it “external memory”, a term I still use now.

It’s actually obvious when you think about it. Why do people have to-do lists and calendars and reminders? Because you can’t keep track of everything in your head. You can try and might even get good at it, but you’ll never do as well as the old-fashioned pen and paper.

This isn’t just about “I need to remember to do {X} in exactly {Y} time”. There’s a reason we use blackboards during math lectures instead of just talking. The ideas in math are really, really hard, because math is only about ideas, and nothing else. If the professors didn’t write the steps on the board, no one would be able to keep more than two or three steps in their head at once. The difficulty is only compounded by the fact that math has its own notation. We didn’t develop this notation because we were bored. We developed notation because the ideas we’re trying to express are so complex that the English language can’t even express them. In other words, mathematicians were forced to create a whole new set of symbols just to write down their ideas.

5. An Imperfect Analogy to Teaching

But so far I haven’t really argued anything other than “if you want to remember something you better write it down”. There’s a difference between a to-do list and an exposition. One is just a collection of disconnected bullet points. The other needs to do more, it needs to explain.

The following quote is excerpted from Richard Rusczyk’s article “Learning Through Teaching”.

You can’t just “kind of get it” or know it just well enough to get by on a test; teaching calls for complete understanding of the concept.

  • How do you know that?
  • When would you use that?
  • How could you come up with that in the first place?

If you can’t answer these questions for something you “know”, then you can’t teach it.

I knew this was true from my own experiences teaching, but it took me more time to realize that writing well is a similar skill. The difference is the medium: when you’re teaching in person, you get real-time feedback on whether what you said makes sense. You don’t get this live feedback when you’re writing, and so you need to be much more careful. Yet all the nuances of teaching are still there — distinguishing between details, main ideas, hardest steps; deciding what can be worked out from what other things, even deciding which things are worth including and which things should be omitted.

This all really started to become obvious to me when I started my olympiad geometry textbook. In senior year of high school, I decided that I had a good enough understanding of olympiad geometry to write a textbook on it. I felt like I could probably do better than all the existing resources; not as hard as it sounds, since to my knowledge there aren’t any dedicated books for olympiad geometry.

After I had around 200 pages written, I realized that I had gotten a lot better at geometry. There were lots of things that happened in the process of thinking about the best way to teach geometry.

  1. Most basically, I did in fact fill in gaps in my knowledge. For example, I studied projective transformations for the first time in order to write the corresponding section in my book. The ideas definitely clicked much faster when I was thinking about how to teach it.
  2. I made new connections. I realized for the first time that symmedians and harmonic quadrilaterals are actually the same concept; I discovered a lemma about directed angles that I wished I had known before; I found a new proof of Menelaus using an elegant strategy I had used on Monge’s Theorem. None of this would have happened from just doing problems.
  3. Most profoundly, I got a much better understanding of when to apply certain techniques. One of the main goals of my book was to make solutions natural: a reader should be able to understand where a solution came from. That meant that on every page I was constantly fighting to explain how I had thought of something. This unending reflection was exhausting and reduced me to a rate of about one page written per hour. (Conveniently, this process just requires a laptop, not even paper and pencil, so I got a lot of pages written during office assistant.) But it improved my own ability significantly.

Ultimately what this exemplifies is that trying to explain something lets you understand it better. And that’s in part because you can only manage so many things in your head at once. If you think keeping track of your appointments in your head is hard, try doing that with a complex argument. Can’t do it. Writing solves this problem.

6. Finding the Truth

But that’s not a perfect analogy. What I’ve presented above is a model where you have ideas in your head and you output them onto paper. This isn’t totally accurate, because as you write, something else can happen: the ideas can change.

I’ll draw an analogy from painting, again courtesy of Paul Graham.

The model of painting I used to have is that you would have something you want to draw, and then you sit down and draw it, then polish up the details. (That’s how I did all my high school art projects, anyways.) But this turns out to not be true: countless paintings, when you look at them in x-rays, turn out to have limbs that have been moved or facial features that have been readjusted. I was surprised when I first read this. But it makes sense if you think about it: how can you be sure that what’s in your head is what you want if you can’t even see it yet?

I propose that writing does the same thing. I don’t start by thinking “these are the ideas and I will now write them down”. Rather, I just write my thoughts down, not sure where they’re going to end up. That’s how my geometry textbook actually got written. I didn’t start with a table of contents. I started by putting down ideas, finding the connections between them, noticing new things I hadn’t before. I created new sections on the fly as the need arose, added new things as I thought of them, and let the whole thing sort itself out with a simple \tableofcontents. You can even think of the table of contents as a natural bucket sort: put down related ideas near each other, add section headers as needed, and bam, you have an outline of the main ideas. And I never know what this outline will look like until it’s actually been written.

By the same token, revising shouldn’t be the art of modifying the presentation of an idea to be more convincing. It should be the art of changing the idea itself to be closer to the truth, which will automatically make it more convincing. This is consistent with the Latin: the word “revise” literally means “see again”.

This is where high school and college essays get it really wrong. In a college essay, the goal is to “sell an idea” to the reader. If something in the essay looks unconvincing, you fix it by trickery: rewriting it so that it sounds more convincing without changing the underlying idea. The way you say something goes a long way in selling it. That’s what English class should have taught you. Sure, some teachers tell you to make concessions or counterarguments, but you’re doing this to try and pretend to be “honest”. You only write such things with an agenda in mind.

But since when are you always right? That’s absurd. The English class model is “I have a thesis that I know is right, and now I’m going to explain to the reader why”. But how can you know you’re right about a thesis before you’ve written it down? If the thesis and its accompanying argument are even remotely complex, it wouldn’t have been possible to sort through the whole thing in your head. Worse still, if the thesis is nontrivial, odds are that someone who is about as smart as you will disagree with you. And as Yan Zhang often reminds the SPARC attendees, you should really only expect to be right about half the time when you disagree with someone about as smart as you. If an essay is supposed to move you closer to the truth, and your original thesis is wrong half the time, do you scrap half your essays? Unfortunately, I don’t think you’d ever pass English class that way.

The culture that’s been instilled, where the goal of writing is to convince, is intellectually dishonest. I might even go so far as to say it’s dangerous; I’ll have to think about that for a while. There are times when you do want to write to convince others (grant proposals, anyone?) but it seems highly unfortunate that this type of writing has become synonymous with writing as a whole.

7. Conclusion

So this post has a few main ideas. The main purpose of writing is not in fact communication, at least not if you’re interested in thinking well. Rather, the benefits (at least the ones I perceive) are

  • Writing serves as an external memory, letting you see all your ideas and their connections at once, rather than trying to keep them in your head.
  • Explaining the ideas forces you to think well about them, the same way that teaching something is only possible with a full understanding of the concept.
  • Writing is a way to move closer to the truth, rather than to convince someone what the truth is.

So now I’ll tell you how I actually wrote my geometry book, or this blog post, or any of my various olympiad articles. It starts because I have an idea, just a passing thought, like “this would be a good way to explain Maschke’s Theorem”. Some time later I’ll have another such thought which is related to the first. Then a third. My memory is especially bad, so pretty soon it bothers me so much that I have to write it down, because I’m starting to lose track. And as I write the first ideas down, I start noticing new ideas, so I add in these ideas, and then more new ideas start flooding in. There are so many things I want to say and I just keep writing them down. That’s how I ended up with a 400-page textbook written from what originally was just meant to be a short article. There were too many things to say that other people hadn’t said yet, and I just had to write them all down. The miraculous thing is that these ideas naturally sorted themselves out. The bulleted main ideas I listed above weren’t things I realized until I looked at the resulting table of contents.

I’m sometimes told by people I respect that they like my writing. But I think this actually just translates to “I like the ideas in your writing”, and so I take it as a big compliment.