18.099 Transcript: Bourgain’s Theorem

As part of the 18.099 Discrete Analysis reading group at MIT, I presented section 4.7 of Tao-Vu's Additive Combinatorics textbook. Here are the notes I used for the second half of my presentation.

1. Synopsis

We aim to prove the following result.

Theorem 1 (Bourgain)

Assume {N \ge 2} is prime and {A, B \subseteq Z = \mathbb Z_N}. Assume that

\displaystyle  \delta \gg \sqrt{\frac{(\log \log N)^3}{\log N}}

is such that {\min\left\{ \mathbf P_ZA, \mathbf P_ZB \right\} \ge \delta}. Then {A+B} contains a proper arithmetic progression of length at least

\displaystyle  \exp\left( C\sqrt[3]{\delta^2 \log N} \right)

for some absolute constant {C > 0}.
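
To get a sense of scale: if {\delta} is a fixed constant, the theorem guarantees a proper progression in {A+B} of length at least

\displaystyle  \exp\left( C\delta^{2/3} \sqrt[3]{\log N} \right),

which grows faster than any fixed power of {\log N}, though it is still {N^{o(1)}}.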

The Bohr set methods we used earlier fail here: in the previous half of yesterday’s lecture we took advantage of Parseval’s identity in order to handle large convolutions, always keeping two {\widehat 1_\ast} terms inside the {\sum} sign, and when working with just {A+B} we get stuck. So we instead use the technology of {\Lambda(p)} constants and dissociated sets.

2. Previous results

As usual, let {Z} denote a finite abelian group. Recall the following definitions.

Definition 2

Let {S \subseteq Z} and {2 \le p \le \infty}. The {\Lambda(p)} constant of {S}, denoted {\left\lVert S \right\rVert_{\Lambda(p)}}, is defined as

\displaystyle  \left\lVert S \right\rVert_{\Lambda(p)} = \sup_{\substack{c : S \rightarrow \mathbb C \\ c \not\equiv 0}} \frac{\left\lVert \displaystyle\sum_{\xi \in S} c(\xi) e(\xi \cdot x) \right\rVert_{L^p(Z)}} {\left\lVert c \right\rVert_{\ell^2(S)}}.
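
For example, if {S = \{\xi\}} is a singleton then {\left\lvert c(\xi) e(\xi \cdot x) \right\rvert = \left\lvert c(\xi) \right\rvert} for every {x}, so {\left\lVert S \right\rVert_{\Lambda(p)} = 1} for every {2 \le p \le \infty}. More generally, Parseval gives {\left\lVert S \right\rVert_{\Lambda(2)} = 1} for any {S}; the content of the definition lies in how slowly the constant grows as {p \rightarrow \infty}.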

Definition 3

If {S \subseteq Z}, we say {S} is a dissociated set if all {2^{|S|}} subset sums of {S} are distinct.
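
For example, if {2^k \le N} then {\left\{ 1, 2, 4, \dots, 2^{k-1} \right\}} is dissociated in {\mathbb Z_N}: its subset sums are exactly {0, 1, \dots, 2^k - 1} (read off the binary representation), and these are distinct mod {N}. On the other hand {\left\{ 1, 2, 3 \right\}} is not dissociated, since {1 + 2 = 3}.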

For such sets we have Rudin’s inequality (yes, Walter), which states that

Lemma 4 (Rudin’s inequality)

If {S} is dissociated then

\displaystyle  \left\lVert S \right\rVert_{\Lambda(p)} \ll \sqrt p.
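
Unpacking the definition, this says that for any coefficients {c : S \rightarrow \mathbb C},

\displaystyle  \left\lVert \sum_{\xi \in S} c(\xi) e(\xi \cdot x) \right\rVert_{L^p(Z)} \ll \sqrt p \left\lVert c \right\rVert_{\ell^2(S)},

which is exactly the form we will use in the proof of Proposition 8 below.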

Dissociated sets come up via the so-called “cube covering lemma”:

Lemma 5 (Cube covering lemma)

Let {S \subseteq Z} and {d \ge 1}. Then we can partition

\displaystyle  S = D_1 \sqcup D_2 \sqcup \dots \sqcup D_k \sqcup R

such that

  • Each {D_i} is dissociated of size {d+1},
  • There exist {\eta_1, \dots, \eta_d} such that {R} is contained in a {d}-cube, i.e. every element of {R} can be written as {c_1\eta_1 + \dots + c_d\eta_d} with each {c_i \in \{-1,0,1\}}.
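
For intuition, here is a quick sketch of one way to see this: greedily extract disjoint dissociated sets {D_1, D_2, \dots} of size {d+1} for as long as possible, and let {R} be what remains. Pick a maximal dissociated subset {\left\{ \eta_1, \dots, \eta_{d'} \right\}} of {R}, so {d' \le d}. If some {\xi \in R} lay outside the corresponding cube, then {\left\{ \eta_1, \dots, \eta_{d'}, \xi \right\}} would again be dissociated (a coincidence of subset sums not involving {\xi} is impossible by dissociativity of the {\eta_i}, while one involving {\xi} rearranges to {\xi = c_1\eta_1 + \dots + c_{d'}\eta_{d'}} with {c_i \in \{-1,0,1\}}, putting {\xi} in the cube after all), contradicting maximality.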

Finally, we remind the reader that

Lemma 6 (Parseval)

We have

\displaystyle  \left\lVert f \right\rVert_{L^2Z} = \left\lVert \widehat f \right\rVert_{\ell^2Z}.
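
Here, as throughout, the {L^p} norms on {Z} are taken with respect to the uniform probability measure and the {\ell^p} norms with respect to counting measure, so concretely

\displaystyle  \left\lVert f \right\rVert_{L^2Z}^2 = \mathbf E_{x \in Z} \left\lvert f(x) \right\rvert^2 \qquad \text{and} \qquad \left\lVert \widehat f \right\rVert_{\ell^2Z}^2 = \sum_{\xi \in Z} \left\lvert \widehat f(\xi) \right\rvert^2,

where {\widehat f(\xi) = \mathbf E_{x \in Z} f(x) \overline{e(\xi \cdot x)}}, which is the normalization consistent with the computations below.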

Since we don’t have Bohr sets anymore, the way we detect progressions is to use the pigeonhole principle. In what follows, let {T^n f} be the shift of {f} by {n}, id est {T^nf(x) = f(x-n)}.

Proposition 7 (Pigeonhole gives arithmetic progressions)

Let {f : Z \rightarrow \mathbb R_{\ge 0}}, {J \ge 1} and suppose {r \in \mathbb Z} is such that

\displaystyle  \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr}f - f \right\rvert < \mathbf E_Z f.

Then {\text{supp }(f)} contains an arithmetic progression of length {J} and spacing {r}.

Proof: By the pigeonhole principle (the hypothesis says that {f - \max_{1 \le j \le J} \left\lvert T^{jr}f - f \right\rvert} has positive average over {Z}), there is an {x} such that

\displaystyle  \max_{1 \le j \le J} \left\lvert T^{jr}f(x) - f(x) \right\rvert < f(x).

In particular {f(x) > 0}, and {\left\lvert f(x - jr) - f(x) \right\rvert < f(x)} forces {f(x - jr) > 0} for each {1 \le j \le J}. Hence {x, x - r, \dots, x - Jr} all lie in {\text{supp }(f)}, giving the desired progression. \Box

3. Periodicity

Proposition 8 (Estimate for {\max_{h \in H} |T^hf|} for {\text{supp }(\widehat f)} dissociated)

Let {f : Z \rightarrow \mathbb R}, {\text{supp }(\widehat f) \subseteq S \subseteq Z} with {S} dissociated. Then for any set {H} with {|H| > 1} we have

\displaystyle  \left\lVert \max_{h \in H} \left\lvert T^h f \right\rvert \right\rVert_{L^2Z} \ll \sqrt{\log|H|} \left\lVert f \right\rVert_{L^2Z}.

Proof: Let {p > 2} be a parameter to be chosen at the end, and note

\displaystyle  \begin{aligned} \left\lVert \max_{h \in H} \left\lvert T^h f \right\rvert \right\rVert_{L^2Z} &\le \left\lVert \max_{h \in H} \left\lvert T^h f \right\rvert \right\rVert_{L^pZ} \\ &\le \left\lVert \left( \sum_{h \in H} \left\lvert T^h f \right\rvert^p \right)^{1/p} \right\rVert_{L^pZ} \\ &= \left( \mathbf E_Z \left( \sum_{h \in H} \left\lvert T^h f \right\rvert^p \right) \right)^{1/p} \\ &= \left( \sum_{h \in H} \mathbf E_Z \left\lvert T^h f \right\rvert^p \right)^{1/p} \\ &= \left( \sum_{h \in H} \mathbf E_Z \left\lvert f \right\rvert^p \right)^{1/p} \\ &= \left\lvert H \right\rvert^{1/p} \left\lVert \sum_\xi \widehat f(\xi) e(\xi \cdot x) \right\rVert_{L^pZ} \\ &\le \left\lvert H \right\rvert^{1/p} \left\lVert S \right\rVert_{\Lambda(p)} \left\lVert \widehat f \right\rVert_{\ell^2Z} \\ \end{aligned}

Then by Parseval and Rudin,

\displaystyle  \begin{aligned} \left\lVert \max_{h \in H} \left\lvert T^h f \right\rvert \right\rVert_{L^2Z} &\le \left\lvert H \right\rvert^{1/p} \left\lVert S \right\rVert_{\Lambda(p)} \left\lVert f \right\rVert_{L^2Z} \\ &\ll \left\lvert H \right\rvert^{1/p} \sqrt p \left\lVert f \right\rVert_{L^2Z}. \end{aligned}

Taking {p = 2 + \log\left\lvert H \right\rvert} gives {\left\lvert H \right\rvert^{1/p} \le e} and {\sqrt p \ll \sqrt{\log \left\lvert H \right\rvert}} (using {\left\lvert H \right\rvert \ge 2}), which is the claimed bound. \Box

We now combine Proposition 8 with the cube covering lemma into the following estimate, which applies when the nonzero values of {\widehat f} all have roughly “uniform” size.

Lemma 9 (Uniformity estimate for shifts)

Let {f : Z \rightarrow \mathbb R} and {J, d > 1}. Suppose that {\widehat f} is “uniform in size” across its support, in the sense that

\displaystyle  \frac {\sup_{\xi \in \text{supp }(\widehat f)} \left\lvert \widehat f(\xi) \right\rvert} {\inf_{\xi \in \text{supp }(\widehat f)} \left\lvert \widehat f(\xi) \right\rvert} \le 2016.

Then one can find {S \subseteq Z} such that {|S| = d} and for all {r \in Z},

\displaystyle  \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr}f - f \right\rvert \ll \left( \sum_\xi \left\lvert \widehat f(\xi) \right\rvert \right) \left( \sqrt{\frac{\log J}{d}} + Jd\max_{\eta \in S} \left\lVert \eta \cdot r \right\rVert_{\mathbb R/\mathbb Z} \right).

Proof: Use the cube covering lemma to put {\text{supp }(\widehat f) = D_1 \sqcup \dots \sqcup D_k \sqcup R} where {R} is contained in the cube of {S = \left\{ \eta_1, \dots, \eta_d \right\}} and {|D_i| = d+1} for {1 \le i \le k}. Accordingly, we decompose {f} over its Fourier transform as

\displaystyle  f = f_1 + \dots + f_k + g

by letting {\widehat f_i} be supported on {D_i} and {\widehat g} be supported on {R}.

First, we can bound the “leftover” bits in {R}:

\displaystyle  \begin{aligned} \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} g - g \right\rvert &\le \mathbf E_Z \max_{0 \le j \le J} \sum_{\xi \in R} \left\lvert \widehat f(\xi) \right\rvert \left\lvert e(\xi \cdot (x - jr)) - e(\xi \cdot x) \right\rvert \\ &= \mathbf E_Z \max_{0 \le j \le J} \sum_{\xi \in R} \left\lvert \widehat f(\xi) \right\rvert \left\lvert e(\xi \cdot jr) - 1 \right\rvert \\ &\le \left( \sum_{\xi \in R} \left\lvert \widehat f(\xi) \right\rvert \right) \max_{\substack{0 \le j \le J \\ \xi \in R}} \left\lvert e(\xi \cdot jr) - 1 \right\rvert \\ &\le \left( \sum_{\xi \in R} \left\lvert \widehat f(\xi) \right\rvert \right) 2\pi \max_{\substack{0 \le j \le J \\ \xi \in R}} \left\lVert \xi \cdot jr \right\rVert_{\mathbb R/\mathbb Z} \end{aligned}

Since each {\xi \in R} can be written as {c_1\eta_1 + \dots + c_d\eta_d} with {c_i \in \{-1,0,1\}}, we have {\left\lVert \xi \cdot jr \right\rVert_{\mathbb R/\mathbb Z} \le j \sum_i \left\lvert c_i \right\rvert \left\lVert \eta_i \cdot r \right\rVert_{\mathbb R/\mathbb Z} \le Jd \max_{\eta \in S} \left\lVert \eta \cdot r \right\rVert_{\mathbb R/\mathbb Z}} for every {0 \le j \le J}, and so

\displaystyle  \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} g - g \right\rvert \le \left( \sum_{\xi \in R} \left\lvert \widehat f(\xi) \right\rvert \right) 2\pi Jd \max_{\eta \in S} \left\lVert \eta \cdot r \right\rVert_{\mathbb R/\mathbb Z}.

Let’s then bound the contribution over each dissociated set. We’ll need both the assumption of uniformity and the proposition we proved for dissociated sets.

\displaystyle  \begin{aligned} \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} f_i - f_i \right\rvert &\le 2\mathbf E_Z \max_{0 \le j \le J} \left\lvert T^{jr} f_i \right\rvert \\ &\le 2\left\lVert \max_{0 \le j \le J} \left\lvert T^{jr} f_i \right\rvert \right\rVert_{L^2Z} \\ &\ll \sqrt{\log(J)} \left\lVert f_i \right\rVert_{L^2Z} \\ &= \sqrt{\log(J)} \sqrt{\sum_{\xi \in D_i} \left\lvert \widehat f(\xi) \right\rvert^2 } \\ &\ll \sqrt{\frac{\log J}{d}} \sum_{\xi \in D_i} \left\lvert \widehat f(\xi) \right\rvert \end{aligned}

where the last step is by uniformity of {\widehat f}: on {D_i} the values {\left\lvert \widehat f(\xi) \right\rvert} are within a factor of {2016} of one another, so {\sqrt{\sum_{\xi \in D_i} \left\lvert \widehat f(\xi) \right\rvert^2} \le \sqrt{d+1} \max_{\xi \in D_i} \left\lvert \widehat f(\xi) \right\rvert \ll \frac{1}{\sqrt d} \sum_{\xi \in D_i} \left\lvert \widehat f(\xi) \right\rvert}. Now combine everything with the triangle inequality. \Box

4. Proof of main theorem

Without loss of generality {\mathbf P_ZA = \mathbf P_ZB = \delta}. Of course, we let {f = 1_A \ast 1_B} so {\mathbf E_Z f = \delta^2}. We will have parameters {d \ge 1}, {M \ge 1}, and {J \ge \exp(C\sqrt[3]{\delta^2 \log N})} which we will select at the end.

Our goal is to show there exists some nonzero integer {r} such that

\displaystyle  \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} f - f \right\rvert < \delta^2.
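
This suffices: since {\mathbf E_Z f = \delta^2}, Proposition 7 then places an arithmetic progression of length {J} and spacing {r} inside {\text{supp }(f)}, and

\displaystyle  \text{supp }(1_A \ast 1_B) = \left\{ x \in Z \;:\; \mathbf E_{y \in Z}\, 1_A(y) 1_B(x-y) > 0 \right\} = A + B.

Moreover the progression is proper: {r \neq 0} and {N} is prime, so its terms are pairwise distinct as long as {J \le N}, which certainly holds in the nontrivial range of the theorem.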

Now we cannot apply the uniformity estimate directly since {\widehat f} need not be uniform in size, and therefore we impose a dyadic decomposition on the frequencies {\xi \in Z} according to the size of {\left\lvert \widehat f(\xi) \right\rvert}; let

\displaystyle  \begin{aligned} Z_0 &= \left\{ \xi \in Z \;:\; \frac{1}{2} \delta^2 < \left\lvert \widehat f(\xi) \right\rvert \le \delta^2 \right\} \\ Z_1 &= \left\{ \xi \in Z \;:\; \frac14\delta^2 < \left\lvert \widehat f(\xi) \right\rvert \le \frac{1}{2}\delta^2 \right\} \\ Z_2 &= \left\{ \xi \in Z \;:\; \frac18\delta^2 < \left\lvert \widehat f(\xi) \right\rvert \le \frac14\delta^2 \right\} \\ &\vdots \\ Z_{M-1} &= \left\{ \xi \in Z \;:\; 2^{-M} \delta^2 < \left\lvert \widehat f(\xi) \right\rvert \le 2^{-M+1} \delta^2 \right\} \\ Z_{\mathrm{err}} &= \left\{ \xi \in Z \;:\; \left\lvert \widehat f(\xi) \right\rvert \le 2^{-M} \delta^2 \right\} \\ \end{aligned}
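
Every frequency is covered by one of these classes, since for all {\xi \in Z},

\displaystyle  \left\lvert \widehat f(\xi) \right\rvert = \left\lvert \widehat 1_A(\xi) \right\rvert \left\lvert \widehat 1_B(\xi) \right\rvert \le \mathbf P_ZA \cdot \mathbf P_ZB = \delta^2.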

Then as before we can decompose via Fourier transform to obtain

\displaystyle  f = f_0 + f_1 + \dots + f_{M-1} + f_{\mathrm{err}}

so that {\widehat f_m} is supported on {Z_m} for each {m}, and {\widehat f_{\mathrm{err}}} is supported on {Z_{\mathrm{err}}}.

Now for each {0 \le m < M}, the function {f_m} satisfies the uniformity hypothesis (on {Z_m} the values of {\left\lvert \widehat f \right\rvert} differ by at most a factor of {2}), so the previous lemma gives:

\displaystyle  \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} f_m - f_m \right\rvert \ll \left( \sum_{\xi \in Z_m} \left\lvert \widehat f(\xi) \right\rvert \right) \left( \sqrt{\frac{\log J}{d}} + Jd\max_{\eta \in S_m} \left\lVert \eta \cdot r \right\rVert_{\mathbb R/\mathbb Z} \right)

for some {S_m}; hence by summing and using the fact that

\displaystyle  \sum_{\xi \in Z} \left\lvert \widehat f(\xi) \right\rvert = \sum_{\xi \in Z} \left\lvert \widehat 1_A(\xi) \right\rvert \left\lvert \widehat 1_B(\xi) \right\rvert \le \left\lVert \widehat 1_A \right\rVert_{\ell^2Z} \left\lVert \widehat 1_B \right\rVert_{\ell^2Z} = \left\lVert 1_A \right\rVert_{L^2Z} \left\lVert 1_B \right\rVert_{L^2Z} = \sqrt{\mathbf P_ZA \mathbf P_ZB} = \delta

we obtain that

\displaystyle  \sum_{0 \le m < M} \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} f_m - f_m \right\rvert \ll \delta \left( \sqrt{\frac{\log J}{d}} + Jd\max_{\eta \in \bigcup S_m} \left\lVert \eta \cdot r \right\rVert_{\mathbb R/\mathbb Z} \right).

As for the “error” term, we bound

\displaystyle  \begin{aligned} \mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} f_{\mathrm{err}} - f_{\mathrm{err}} \right\rvert &\le 2\mathbf E_Z \max_{1 \le j \le J} \left\lvert T^{jr} f_{\mathrm{err}} \right\rvert \\ &\le 2\mathbf E_Z \sum_{1 \le j \le J} \left\lvert T^{jr} f_{\mathrm{err}} \right\rvert \\ &\le 2\sum_{1 \le j \le J} \mathbf E_Z \left\lvert T^{jr} f_{\mathrm{err}} \right\rvert \\ &\le 2\sum_{1 \le j \le J} \mathbf E_Z \left\lvert f_{\mathrm{err}} \right\rvert \\ &\le 2J \mathbf E_Z \left\lvert f_{\mathrm{err}} \right\rvert \\ &\le 2J \left\lVert f_{\mathrm{err}} \right\rVert_{L^2Z} \\ &= 2J \left\lVert \widehat f_{\mathrm{err}} \right\rVert_{\ell^2 Z} \\ &= 2J \sqrt{\sum_{\xi \in Z_{\mathrm{err}}} \left\lvert \widehat f_{\mathrm{err}}(\xi) \right\rvert^2} \\ &\le 2J \sqrt{\max_{\xi \in Z_{\mathrm{err}}} \left\lvert \widehat f_{\mathrm{err}}(\xi) \right\rvert \sum_{\xi \in Z_{\mathrm{err}}} \left\lvert \widehat f_{\mathrm{err}}(\xi) \right\rvert} \\ &\le 2J \sqrt{2^{-M}\delta^2 \cdot \delta} \\ &= 2J 2^{-M/2} \delta^{3/2} \\ &\le 2J 2^{-M/2} \delta. \end{aligned}

Thus, combining these estimates with the triangle inequality and dividing through by {\delta}, it suffices to find {r \neq 0} such that

\displaystyle  \sqrt{\frac{\log J}{d}} + Jd \max_{\eta\in\bigcup S_m} \left\lVert \eta \cdot r \right\rVert_{\mathbb R/\mathbb Z} + 2J \cdot2^{-M/2} \ll \delta.

Now set {M \asymp \log J} and {d \asymp \delta^{-2} \log J} (with suitable implied constants), so that the first and third terms are each at most {\frac13 c \delta} for a small constant {c > 0}; for the third term this uses {J \gg \delta^{-1}}, which holds since by hypothesis

\displaystyle  \delta \gg \sqrt{\frac{(\log \log N)^3}{\log N}}

from which we deduce

\displaystyle  J \ge \exp\left( C\sqrt[3]{\delta^2\log N} \right) \ge \exp\left( C\log \log N \right) = (\log N)^C \gg \delta^{-1}.

Thus it suffices that

\displaystyle  \max_{\eta\in S} \left\lVert \eta \cdot r \right\rVert_{\mathbb R/\mathbb Z} \ll \frac{\delta^3}{J \log J}

where {S = \bigcup_m S_m}. Note {\left\lvert S \right\rvert \le dM \ll \left( \frac{\log J}{\delta} \right)^2}. In other words, we seek a nonzero element of the Bohr set {\text{Bohr }(S, \rho)} with {\rho \asymp \frac{\delta^3}{J \log J}}. Now we recall the result that

\displaystyle  \left\lvert \text{Bohr }(S, \rho) \right\rvert \ge |Z| \rho^{|S|}

and so a nonzero element of {\text{Bohr }(S, \rho)} exists as soon as

\displaystyle  N \cdot \left( \frac{c_1 \delta^3}{J \log J} \right) ^{c_2 \left( \delta^{-1} \log J \right)^2} > 1

for suitable constants {c_1} and {c_2}. Taking logarithms, this is equivalent to {c_2 \left( \delta^{-1} \log J \right)^2 \log\frac{J \log J}{c_1 \delta^3} < \log N}, and since {\log \frac{J \log J}{c_1 \delta^3} \ll \log J} (recall {J \gg \delta^{-1}}), it suffices to have {(\log J)^3 \ll \delta^2 \log N}. Thus {J = \exp(C\sqrt[3]{\delta^2 \log N})} works for a sufficiently small absolute constant {C}, and Proposition 7 completes the proof. \Box
