From a $(p,2)$-Theorem to a Tight $(p,q)$-Theorem

A family $F$ of sets is said to satisfy the $(p,q)$-property if among any $p$ sets of $F$ some $q$ intersect. The celebrated $(p,q)$-theorem of Alon and Kleitman asserts that any family of compact convex sets in $\mathbb{R}^d$ that satisfies the $(p,q)$-property for some $q \geq d+1$, can be pierced by a fixed number $f_d(p,q)$ of points. The minimum such piercing number is denoted by $HD_d(p,q)$. Already in 1957, Hadwiger and Debrunner showed that whenever $q>\frac{d-1}{d}p+1$ the piercing number is $HD_d(p,q)=p-q+1$; no exact values of $HD_d(p,q)$ were found ever since. While for an arbitrary family of compact convex sets in $\mathbb{R}^d$, $d \geq 2$, a $(p,2)$-property does not imply a bounded piercing number, such bounds were proved for numerous specific families. The best-studied among them is axis-parallel rectangles in the plane. Wegner and (independently) Dol'nikov used a $(p,2)$-theorem for axis-parallel rectangles to show that $HD_{\mathrm{rect}}(p,q)=p-q+1$ holds for all $q>\sqrt{2p}$. These are the only values of $q$ for which $HD_{\mathrm{rect}}(p,q)$ is known exactly. In this paper we present a general method which allows using a $(p,2)$-theorem as a bootstrapping to obtain a tight $(p,q)$-theorem, for families with Helly number 2, even without assuming that the sets in the family are convex or compact. To demonstrate the strength of this method, we obtain a significant improvement of an over 50 year old result by Wegner and Dol'nikov. Namely, we show that $HD_{\mathrm{d-box}}(p,q)=p-q+1$ holds for all $q>c' \log^{d-1} p$, and in particular, $HD_{\mathrm{rect}}(p,q)=p-q+1$ holds for all $q \geq 7 \log_2 p$ (compared to $q \geq \sqrt{2p}$ of Wegner and Dol'nikov). In addition, for several classes of families, we present improved $(p,2)$-theorems, some of which can be used as a bootstrapping to obtain tight $(p,q)$-theorems.


Helly's theorem and (p,q)-theorems
The classical Helly's theorem says that if in a family of compact convex sets in $\mathbb{R}^d$ every d + 1 members have a non-empty intersection then the whole family has a non-empty intersection.
For a pair of positive integers p ≥ q, we say that a family F of sets satisfies the (p, q)-property if |F| ≥ p, none of the sets in F is empty, and among any p sets of F there are some q with a non-empty intersection. A set P is called a transversal (or alternatively, a piercing set) for F if it has a non-empty intersection with every member of F. In this language, Helly's theorem states that any family of compact convex sets in $\mathbb{R}^d$ satisfying the (d + 1, d + 1)-property has a singleton transversal (alternatively, can be pierced by a single point).
In general, d + 1 is clearly optimal in Helly's theorem, as any family of n hyperplanes in general position in $\mathbb{R}^d$ satisfies the (d, d)-property but cannot be pierced by fewer than n/d points. However, for numerous specific classes of families, a (d′, d′)-property for some d′ < d + 1 is already sufficient to imply piercing by a single point. The minimal number d′ for which this holds is called the Helly number of the family. For example, any family of axis-parallel boxes in $\mathbb{R}^d$ has Helly number 2.
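To make the last example concrete, the following Python sketch checks the Helly number 2 property on a small family of axis-parallel boxes. The representation (a box as a tuple of closed intervals, one per coordinate), the helper names `boxes_intersect` and `common_point`, and the sample data are illustrative assumptions, not notation from the paper. In each coordinate the candidate point takes the largest lower endpoint; pairwise intersection guarantees that this value does not exceed the smallest upper endpoint.

```python
from itertools import combinations

def boxes_intersect(a, b):
    """Two closed boxes meet iff their intervals overlap in every coordinate."""
    return all(max(al, bl) <= min(ah, bh) for (al, ah), (bl, bh) in zip(a, b))

def common_point(boxes):
    """Return a point lying in all boxes, or None if no such point exists."""
    point = []
    for axis in zip(*boxes):           # all intervals of one coordinate
        lo = max(l for l, _ in axis)   # largest lower endpoint
        hi = min(h for _, h in axis)   # smallest upper endpoint
        if lo > hi:
            return None
        point.append(lo)
    return tuple(point)

# Hypothetical sample family of rectangles (d = 2).
boxes = [((0, 4), (0, 4)), ((2, 6), (1, 5)), ((3, 7), (3, 8))]
pairwise = all(boxes_intersect(a, b) for a, b in combinations(boxes, 2))
print(pairwise, common_point(boxes))
```

The coordinate-wise reduction is what makes boxes special here: the one-dimensional Helly theorem for intervals is applied independently on each axis.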
In 1957, Hadwiger and Debrunner [16] proved the following generalization of Helly's theorem:
Theorem 1.1 (Hadwiger-Debrunner Theorem [16]). For all p ≥ q ≥ d + 1 such that $q > \frac{d-1}{d}p + 1$, any family of compact convex sets in $\mathbb{R}^d$ that satisfies the (p, q)-property can be pierced by p − q + 1 points.
by $O(p \log^{d-1} p)$ points. Kim et al. [22] proved in 2006 that any family of translates of a fixed convex set in $\mathbb{R}^d$ that satisfies the (p, 2)-property can be pierced by $2^{d-1} d^d (p-1)$ points; five years later, Dumitrescu and Jiang [10] obtained a similar result for homothets of a convex set in $\mathbb{R}^d$. In 2012, Chan and Har-Peled [7] proved a (p, 2)-theorem for families of pseudo-discs in the plane, with a piercing number linear in p. Two years ago, Govindarajan and Nivasch [14] showed that any family of convex sets in the plane in which among any p sets there is a pair that intersects on a given convex curve γ, can be pierced by $O(p^8)$ points.
In 2004, Matoušek [25] showed that families of sets with bounded dual VC-dimension have a bounded fractional Helly number. Recently, Pinchasi [26] has drawn a similar relation between the union complexity and the fractional Helly number. Each of these results implies a (p, 2)-theorem for the respective families, using the proof technique of the Alon-Kleitman (p, q)-theorem.
Besides their intrinsic interest, (p, 2)-theorems serve as a tool for obtaining other results. One such result is an improved Ramsey Theorem. Consider, for example, a family F of n axis-parallel rectangles in the plane. The classical Ramsey theorem implies that F contains a subfamily of size Ω(log n), all of whose elements are either pairwise disjoint or pairwise intersecting. As was observed by Larman et al. [24], the aforementioned (p, 2)-theorem for axis-parallel rectangles [20] allows obtaining an improved bound of $\Omega(\sqrt{n/\log n})$. Indeed, either F contains a subfamily of size $\lceil \sqrt{n/\log n} \rceil$ all of whose elements are pairwise disjoint, and we are done, or F satisfies the (p, 2)-property with $p = \lceil \sqrt{n/\log n} \rceil$. In the latter case, by the (p, 2)-theorem, F can be pierced by $O(p \log p) = O(\sqrt{n \log n})$ points. The largest among the subsets of F pierced by a single point contains at least $\Omega\left(\frac{n}{\sqrt{n \log n}}\right) = \Omega(\sqrt{n/\log n})$ rectangles, and all its elements are pairwise intersecting.
Another result that can be obtained from a (p, 2)-theorem is an improved (p, q)-theorem; this will be described in detail below.

(p,2)-theorems and (p,q)-theorems for axis-parallel rectangles and boxes
The (p, q)-problem for axis-parallel boxes is almost as old as the general (p, q)-problem, and was studied almost as thoroughly (see the survey of Eckhoff [11]). It was posed in 1960 by Hadwiger and Debrunner [17,18], who proved that any family of axis-parallel rectangles in the plane that satisfies the (p, q)-property, for p ≥ q ≥ 2, can be pierced by $\binom{p-q+2}{2}$ points. Unlike the (p, q)-problem for general families of convex sets, in this problem a finite bound on the piercing number was known from the very beginning, and the research goal has been to improve the bounds on this size, denoted $HD_{\mathrm{rect}}(p,q)$ for rectangles and $HD_{\mathrm{d-box}}(p,q)$ for boxes in $\mathbb{R}^d$.
For rectangles and q = 2, the quadratic upper bound on HD rect (p, 2) was improved to O(p log p) by Wegner (unpublished), and independently, by Károlyi [20]. The best currently known upper bound, which follows from a recursive formula presented by Fon Der Flaass and Kostochka [13], is for all p ≥ 2. On the other hand, it is known that the 'optimal possible' answer p − q + 1 = p − 1 fails already for p = 4. Indeed, Wegner [32] showed that HD rect (4, 2) = 5, and by taking ⌈p/3⌉ − 1 pairwise disjoint copies of his example, one obtains a family of axis-parallel rectangles that satisfies the (p, 2)-property but cannot be pierced by less than ≈ 5p/3 points. Wegner [32] conjectured that HD rect (p, 2) is linear in p, and is possibly even bounded by 2p − 3. While Wegner's conjecture is believed to hold (see [11,15]), no improvement of the bound (1) was found so far.
For rectangles and q > 2, Hadwiger and Debrunner showed that the exact bound $HD_{\mathrm{rect}}(p,q) = p-q+1$ holds for all q ≥ p/2 + 1. Wegner [32] and (independently) Dol'nikov [9] presented recursive formulas that allow leveraging a (p, 2)-theorem for axis-parallel rectangles into a tight (p, q)-theorem. Applying these formulas along with the Hadwiger-Debrunner quadratic upper bound on $HD_{\mathrm{rect}}(p,2)$, Dol'nikov showed that $HD_{\mathrm{rect}}(p,q) = p-q+1$ holds for all $2 \le q \le p < \binom{q+1}{2}$. Applying the formulas along with the improved bound (1) on $HD_{\mathrm{rect}}(p,2)$, Scheller ([29], see also [11]) obtained by a computer-aided computation upper bounds on the minimal p such that $HD_{\mathrm{rect}}(p,q) = p-q+1$ holds, for all q ≤ 12. These values suggest that $HD_{\mathrm{rect}}(p,q) = p-q+1$ holds already for q = Ω(log p). However, it appears that the method by which Dol'nikov proved a tight bound in the range $p < \binom{q+1}{2}$ does not extend to show a tight bound for all q = Ω(log p) (even if (1) is employed), and in fact, no concrete improvement of Dol'nikov's result was presented (see the survey [11]).
For axis-parallel boxes in $\mathbb{R}^d$, the aforementioned recursive formula of [13] implies the bound $HD_{\mathrm{d-box}}(p,2) \le O(p \log^{d-1} p)$. While it is believed that the correct upper bound is O(p), the result of [13] has not been improved since; the only advancement is a recent result of Chudnovsky et al. [8], who proved an upper bound of O(p log log p) for any family of axis-parallel boxes in which for each two intersecting boxes, a corner of one is contained in the other.

Our results
From (p, 2)-theorems to (p, q)-theorems The main result of this paper is a general method for leveraging a (p, 2)-theorem into a tight (p, q)-theorem, applicable to families with Helly number 2. Interestingly, the method does not assume that the sets in F are convex or compact.
While the condition on the function f(p) looks a bit "scary", it actually holds for any function f whose growth rate (as expressed by its derivative f′(p) and by the derivative of its logarithm $(\log f(p))' = \frac{f'(p)}{f(p)}$) is between the growth rates of $f(p) = \log_2 p$ and $f(p) = p^5$, including all cases needed in the current paper. The proof can be easily adjusted to work for any f with a polynomial growth rate, at the expense of replacing '100' with a larger constant depending on the degree of the polynomial.
The first application of our general method is the following theorem for families of axis-parallel rectangles in the plane, obtained using (1) as the basic (p, 2)-theorem and some local refinements.
Another corollary is a tight (p, q)-theorem for axis-parallel boxes in $\mathbb{R}^d$:
In the proof of Theorem 1.4 we deploy the following observation of Wegner and Dol'nikov, which holds for any family F with Helly number 2:
$$HD_F(p,q) \le HD_F(p-\lambda, q-1) + \lambda - 1, \qquad (2)$$
where λ = ν(F) is the packing number of F. We use an inductive process in which (2) is applied as long as F contains a sufficiently large pairwise-disjoint set. To treat the case where F does not contain a 'large' pairwise-disjoint set (and thus, ν(F) is small), we make use of a combinatorial argument, based on a variant of a 'combinatorial dichotomy' presented by the authors and Tardos [21], which first leverages the (p, 2)-theorem into a 'weak' (p, q)-theorem, and then uses that (p, q)-theorem to show that if ν(F) is 'small' then τ(F) < p − q + 1.
From (2, 2)-theorems to (p, 2)-theorems It is natural to ask, under which conditions a (2, 2)-theorem implies a (p, 2)-theorem for all p > 2. While in general, a (2, 2)-theorem does not imply a (p, 2)-theorem (see an example in Appendix B.1), we prove such an implication for several kinds of families. Our first result here concerns families with Helly number 2.
Theorem 1.8. Let F be a family of compact convex sets in $\mathbb{R}^d$ with Helly number 2. Then $HD_F(p,2) \le p^{2d-1}/2^{d-1}$, and consequently, $HD_F(p,q) = p-q+1$ holds for all $q > c\,p^{1-\frac{1}{2d-1}}$, where c = c(d) is a constant depending only on the dimension d.
The second result only assumes the existence of a (2, 2)-theorem. Theorem 1.9. Let F be a family of compact convex sets in R d that admits a (2, 2)-theorem. Then: 1. F admits a (p, 2)-theorem for piercing with a bounded number s = s(p, d) of points.
Since families with a sub-quadratic union complexity admit a (2, 2)-theorem and have a bounded VC-dimension, Theorem 1.9(3) implies that any family F of regions in the plane with a sub-quadratic union complexity satisfies HD F (p, 2) = O(p 4 log 2 p). This significantly improves over the bound HD F (p, 2) = O(p 16 ) that was obtained for such families in [21].

Organization of the paper
In Section 2 we demonstrate our general method for leveraging a (p, 2)-theorem into a tight (p, q)-theorem and prove Theorem 1.5. Our new (p, 2)-theorem for compact convex sets with Helly number 2 (i.e., Theorem 1.8 above) is presented in Section 3. Finally, the proof of Theorem 1.4 is presented in Appendix A, and the proof of Theorem 1.9 is presented in Appendix B.

2 From (p,2)-theorems to tight (p,q)-theorems
In this section we present our main theorem which allows leveraging a (p, 2)-theorem into a tight (p, q)-theorem, for families F that satisfy $HD_F(2,2) = 1$. As the proof of the theorem in its full generality is somewhat complex, we present here the proof in the case of axis-parallel rectangles in the plane, and provide the full proof in Appendix A. Before presenting the proof of the theorem, we briefly present the Wegner-Dol'nikov argument (parts of which we use in our proof) in Section 2.1, provide an outline of our method in Section 2.2, and prove two preparatory lemmas in Section 2.3.

The Wegner-Dol'nikov method
As mentioned in the introduction, Wegner and (independently) Dol'nikov leveraged the Hadwiger-Debrunner (p, 2)-theorem for axis-parallel rectangles in the plane, which asserts that $HD_{\mathrm{rect}}(p,2) \le \binom{p}{2}$, into a tight (p, q)-theorem, asserting that $HD_{\mathrm{rect}}(p,q) \le p-q+1$ holds for all p ≥ q ≥ 2 such that $p < \binom{q+1}{2}$. The heart of the Wegner-Dol'nikov argument is the following observation.
Observation 2.1. Let F be a family that satisfies $HD_F(2,2) = 1$, and put λ = ν(F). Then
$$HD_F(p,q) \le HD_F(p-\lambda, q-1) + \lambda - 1.$$
Using Observation 2.1, Wegner and Dol'nikov proved the following theorem, which we will use in our proof below.
Theorem 2.2 ([9], Theorem 2). Let F be a family of axis-parallel rectangles in the plane. Then for any p ≥ q ≥ 2 such that $p < \binom{q+1}{2}$, we have $HD_F(p,q) = p-q+1$.
Proof. The proof is by induction. The induction basis is q = 2: for this value, the assertion is relevant only for p = 2, and we indeed have HD rect (2, 2) = 1 = 2 − 2 + 1 as asserted.

Outline of our method
Let F be a family of axis-parallel rectangles in the plane. Instead of leveraging the Hadwiger-Debrunner (p, 2)-theorem for F into a (p, q)-theorem as was done by Wegner and Dol'nikov, we would like to leverage the stronger bound $HD_{\mathrm{rect}}(p,2) \le p \log_2 p$ which follows from (1). We want to deduce that $HD_{\mathrm{rect}}(p,q) = p-q+1$ holds for all $q \ge 7 \log_2 p$.
Basically, we would like to perform an inductive process similar to the process applied in the proof of Theorem 2.2. As above, put λ = ν(F). If λ is 'sufficiently large' (namely, if q − 1 ≥ 7 log 2 (p − λ)), we apply the recursive formula HD F (p, q) ≤ HD F (p − λ, q − 1) + λ − 1 and use the induction hypothesis to bound HD F (p − λ, q − 1). Otherwise, we would like to use the improved (p, 2)-theorem to deduce that F can be pierced by at most p − q + 1 points.
However, since we want to prove the theorem in the entire range $q \ge 7 \log_2 p$, in order to apply the induction hypothesis to $HD_F(p-\lambda, q-1)$, λ must be at least linear in p (specifically, we need λ ≥ 0.1p, as is shown below). Thus, in the 'otherwise' case we have to show that if λ < 0.1p, then F can be pierced by at most p − q + 1 points. If we merely use the fact that F satisfies the (λ + 1, 2)-property and apply the improved (p, 2)-theorem, we only obtain that F can be pierced by O(p log p) points, which is significantly weaker than the desired bound p − q + 1.
Instead, we use a more complex procedure, partially based on the following observation, presented in [21] (and called there a 'combinatorial dichotomy'):
Observation 2.3. Let F be a family that satisfies the (p, q)-property. For any p′ ≤ p, q′ ≤ q such that q′ ≤ p′, either F satisfies the (p′, q′)-property, or there exists S ⊂ F of size p′ that does not contain an intersecting q′-tuple. In the latter case, F \ S satisfies the (p − p′, q − q′ + 1)-property.
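The dichotomy of Observation 2.3 can be explored by brute force on a tiny example. The sketch below is only an illustration under stated assumptions: sets are closed intervals on the line, and the function names `has_pq_property` and `find_bad_set`, as well as the sample family, are hypothetical and not taken from the paper. The exhaustive search is exponential and is meant for toy inputs only.

```python
from itertools import combinations

def intersecting(intervals):
    """Closed intervals share a point iff max of left ends <= min of right ends."""
    return max(lo for lo, _ in intervals) <= min(hi for _, hi in intervals)

def has_pq_property(family, p, q):
    """Brute force: |family| >= p and among any p members some q intersect."""
    if len(family) < p:
        return False
    return all(
        any(intersecting(sub) for sub in combinations(group, q))
        for group in combinations(family, p)
    )

def find_bad_set(family, p1, q1):
    """Return p1 members with no intersecting q1-tuple, or None if none exist."""
    for group in combinations(family, p1):
        if not any(intersecting(sub) for sub in combinations(group, q1)):
            return list(group)
    return None

# Three clusters of intervals, of sizes 3, 2 and 1.
F = [(0, 10), (1, 9), (2, 8), (20, 30), (21, 29), (40, 50)]
p, q, p1, q1 = 6, 3, 3, 2

S = find_bad_set(F, p1, q1)               # a 'bad' set witnessing failure of (3, 2)
rest = [A for A in F if A not in S]
print(has_pq_property(F, p, q), S, has_pq_property(rest, p - p1, q - q1 + 1))
```

In this example F satisfies the (6, 3)-property but not the (3, 2)-property, so the second branch of the dichotomy applies, and the remainder indeed satisfies the (p − p′, q − q′ + 1) = (3, 2)-property.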
First, we use Observation 2.3 to leverage the (p, 2)-theorem by an inductive process into a 'weak' (p, q)-theorem that guarantees piercing with p − q + 1 + O(p) points, for all q = Ω(log p). We then show that if λ < 0.1p then F can be pierced by at most p − q + 1 points, by combining the weak (p, q)-theorem, another application of Observation 2.3, and a lemma which exploits the size of λ.

The two main lemmas used in the proof
Our first lemma leverages the (p, 2)-theorem HD rect (p, 2) ≤ p log 2 p into a weak (p, q)-theorem, using Observation 2.3.
Lemma 2.4. Let F be a family of axis-parallel rectangles in the plane. Then for any c > 0 and for any p ≥ q ≥ 2 such that $q \ge c \log_2 p$, we have
$$HD_F(p,q) \le p - q + 1 + \frac{2p}{c}.$$
Proof. First, assume that both p and q are powers of 2. We perform an inductive process with $\ell = (\log_2 q) - 1$ steps, where we set $F_0 = F$ and $(p_0, q_0) = (p, q)$, and in each step i, we apply Observation 2.3 to a family $F_{i-1}$ that satisfies the $(p_{i-1}, q_{i-1})$-property, with $(p', q') = (p_{i-1}/2, q_{i-1}/2)$. At the end of Step ℓ we obtain a family $F_\ell$ that satisfies the (2p/q, 2)-property. (Note that the ratio between the left term and the right term remains constant along the way.) By the (p, 2)-theorem, $F_\ell$ can be pierced by $\frac{2p}{q} \log_2 \frac{2p}{q}$ points. As $q \ge \max(c \log_2 p, 2)$, this implies that $F_\ell$ can be pierced by
$$\frac{2p}{q} \log_2 \frac{2p}{q} \le \frac{2p}{c \log_2 p} \cdot \log_2 p = \frac{2p}{c}$$
points.
In order to pierce F, we also have to pierce the 'bad' sets $S_i$. In the worst case, in each step we have a bad set, and so we have to pierce the union $S = \bigcup_{i=1}^{\ell} S_i$, which contains at most p − 1 sets. Since any family that satisfies the (p, q)-property also satisfies the (p − k, q − k)-property for any k, the family S contains an intersecting (q − 1)-tuple, which of course can be pierced by a single point. Hence, S can be pierced by (p − 1) − (q − 1) + 1 = p − q + 1 points. Therefore, in total F can be pierced by p − q + 1 + 2p/c points, as asserted.
Now, we have to deal with the case where p, q are not necessarily powers of 2, and thus, in some of the steps either $p_{i-1}$ or $q_{i-1}$ or both are not divisible by 2. It is clear from the proof presented above that if we can define $(p_i, q_i)$ in such a way that in both cases (i.e., whether there is a 'bad' set or not), we have $\frac{p_i}{q_i} \le \frac{p_{i-1}}{q_{i-1}}$, and also the total size of the bad sets (i.e., |S|) is at most p, the assertion can be deduced as above (as the ratio between the left term and the right term only decreases). We show that this can be achieved by a proper choice of $(p_i, q_i)$ and a slight modification of the steps described above. Let If $F_{i-1}$ satisfies the (p′, q′)-property, we define $F_i = F_{i-1}$ and $(p_i, q_i) = (p', q')$.
Otherwise, there exists a 'bad' set $S_i$ of size p′ that does not contain an intersecting q′-tuple, and the family $F_{i-1} \setminus S_i$ satisfies the $(p_{i-1} - p', q_{i-1} - q' + 1)$-property. In this case, we define It is easy to check that in both cases we have $\frac{p_i}{q_i} \le \frac{p_{i-1}}{q_{i-1}}$, and that |S| ≤ p − 1 holds also with respect to the modified definition of the $S_i$'s. Hence, the proof indeed can be completed, as above.
Our second lemma is a simple upper bound on the piercing number of a family that satisfies the (p, 2)-property. We shall use it to show that if ν(F) is 'small', then we can save 'something' when piercing large subsets of F.
Lemma 2.5. Let G be a family of m sets that satisfies the (p, 2)-property. Then G can be pierced by at most $\lfloor \frac{m+p-1}{2} \rfloor$ points.
Proof. We perform the following simple recursive process. If G contains a pair of intersecting sets, pierce them by a single point and remove both of them from G. Continue in this fashion until all remaining sets are pairwise disjoint. Then pierce each remaining set by a separate point.
As G satisfies the (p, 2)-property, the number of sets that remain in the last step is at most p − 1 if m − (p − 1) is even and at most p − 2 otherwise. In the former case, the resulting piercing set is of size at most $\frac{m-(p-1)}{2} + (p-1) = \frac{m+p-1}{2}$. In the latter case, the piercing set is of size at most $\frac{m-(p-2)}{2} + (p-2) = \frac{m+p-2}{2}$. Hence, in both cases the piercing set is of size at most $\lfloor \frac{m+p-1}{2} \rfloor$, as asserted.
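The pairing process used in the proof is easy to simulate. The following sketch runs it on closed intervals of the real line, which is an illustrative choice (the argument itself applies to arbitrary sets); the function name `pair_and_pierce` and the sample family are assumptions made for this example.

```python
def pair_and_pierce(intervals):
    """Greedy process from the proof: repeatedly pierce an intersecting pair by
    one point and remove it; then pierce every leftover set separately."""
    remaining = list(intervals)
    points = []
    while True:
        pair = next(
            ((i, j)
             for i in range(len(remaining))
             for j in range(i + 1, len(remaining))
             if max(remaining[i][0], remaining[j][0])
             <= min(remaining[i][1], remaining[j][1])),
            None,
        )
        if pair is None:
            break                      # the remaining sets are pairwise disjoint
        i, j = pair
        points.append(max(remaining[i][0], remaining[j][0]))  # common point
        del remaining[j]               # j > i, so delete j first
        del remaining[i]
    points.extend(lo for lo, _ in remaining)   # one point per leftover set
    return points

# m = 6 intervals whose largest pairwise-disjoint subfamily has size 3,
# so the family satisfies the (p, 2)-property with p = 4.
G = [(0, 10), (1, 9), (2, 8), (20, 30), (21, 29), (40, 50)]
m, p = len(G), 4
points = pair_and_pierce(G)
print(points, len(points), (m + p - 1) // 2)
```

Here the process uses 4 points, which matches the bound ⌊(m + p − 1)/2⌋ = 4 for this input.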
Remark 2.6. The assertion of Lemma 2.5 is tight, as for a family G composed of m − p + 2 lines in general position in the plane and p − 2 pairwise-disjoint segments that do not intersect any of the lines, we have |G| = m, G satisfies the (p, 2)-property, and G clearly cannot be pierced by fewer than $\lfloor \frac{m+p-1}{2} \rfloor$ points.
Corollary 2.7. Let F be a family of sets in $\mathbb{R}^d$, and put λ = ν(F). Then any subset S ⊂ F can be pierced by at most $\frac{|S|+\lambda}{2}$ points.
The corollary follows from the lemma immediately, as any such family F satisfies the (λ + 1, 2)-property.

Proof of Theorem 1.5
Now we are ready to present the proof of our main theorem, in the specific case of axis-parallel rectangles in the plane. Let us recall its statement. Theorem 1.5. Let F be a family of axis-parallel rectangles in the plane. If F satisfies the (p, q)-property, for p ≥ q ≥ 2 such that q ≥ 7 log 2 p, then F can be pierced by p − q + 1 points.
Remark 2.8. We note that the parameters in the proof (e.g., the values of (p ′ , q ′ ) in the inductive step) were chosen in a sub-optimal way, that is however sufficient to yield the assertion with the constant 7. (The straightforward choice (p ′ , q ′ ) = (0.5p, 0.5q) is not sufficient for that). The constant can be further optimized by a more careful choice of the parameters; however, it seems that in order to reduce it below 6, a significant change in the proof is needed.
Proof of Theorem 1.5. The proof is by induction.
Induction basis. One can assume that q ≥ 37, as for any smaller value of q, there are no p's such that 7 log 2 p ≤ q ≤ p. For q = 37, the theorem is only relevant for (p, q) = (37, 37), and in this case we clearly have HD F (p, q) = 1 = p − q + 1. In fact, this is a sufficient basis, since, in the inductive step, the value of q is reduced by 1 every time. However, in the proof we would like to assume that p, q are 'sufficiently large'; hence, we use Theorem 2.2 as the induction basis in order to cover a larger range of small (p, q) values.
We observe that for q ≤ 70, all relevant (p, q) pairs (i.e., all pairs for which $7 \log_2 p \le q \le p$) satisfy $p < \binom{q+1}{2}$. Hence, in this range we have $HD_F(p,q) = p-q+1$ by Theorem 2.2. Therefore, we may assume that q > 70; we also may assume $q < \sqrt{2p}$ (as otherwise, the assertion follows from Theorem 2.2), and thus, p > 2450 and so (using again the assumption $q < \sqrt{2p}$), also p > 35q.
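The two numerical claims used for the induction basis (that no pair with q < 37 is relevant, and that every relevant pair with q ≤ 70 falls in the range of Theorem 2.2) can be checked mechanically. The sketch below is a sanity check only; the helper name `relevant_p` is hypothetical, and the condition $7 \log_2 p \le q$ is tested in the equivalent integer form $p^7 \le 2^q$ to avoid floating-point rounding.

```python
from math import comb

def relevant_p(q):
    """All p with 7*log2(p) <= q <= p, using the exact test p**7 <= 2**q."""
    upper = 2 ** (q // 7 + 1)          # crude upper bound on 2**(q/7)
    return [p for p in range(q, upper + 1) if p ** 7 <= 2 ** q]

# Smallest q for which some relevant p exists.
smallest_q = next(q for q in range(2, 200) if relevant_p(q))

# For q <= 70, every relevant p lies below binom(q+1, 2).
basis_ok = all(p < comb(q + 1, 2) for q in range(2, 71) for p in relevant_p(q))

print(smallest_q, basis_ok)
```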
We want λ to be sufficiently large, such that if (p, q) lies in the range covered by the theorem (i.e., if $q \ge 7 \log_2 p$), then (p − λ, q − 1) also lies in the range covered by the theorem (i.e., $q - 1 \ge 7 \log_2 (p - \lambda)$). Note that the condition $q \ge 7 \log_2 p$ is equivalent to $2^{q/7} \ge p$, which implies $2^{(q-1)/7} = 2^{-1/7} \cdot 2^{q/7} \ge 0.9p$. Hence, if λ ≥ 0.1p then $2^{(q-1)/7} \ge p - \lambda$, and so we can deduce from the induction hypothesis that
$$HD_F(p,q) \le HD_F(p-\lambda, q-1) + \lambda - 1 \le (p-\lambda) - (q-1) + 1 + \lambda - 1 = p - q + 1,$$
as asserted. Therefore, it is sufficient to prove that $HD_F(p,q) \le p-q+1$ holds when λ < 0.1p.
Under this assumption on λ, we apply Observation 2.3 to F, with (p′, q′) = (⌊0.62p⌋, 0.5q). We have to consider two cases:
Case 1: F satisfies the (p′, q′)-property. By the assumption on (p, q), we have $q \ge 7 \log_2 p$, and thus, $0.5q \ge 3.5 \log_2 p \ge 3.5 \log_2 \lfloor 0.62p \rfloor$. Hence, by Lemma 2.4 (applied with c = 3.5), F can be pierced by at most
$$\lfloor 0.62p \rfloor - 0.5q + 1 + \frac{2 \lfloor 0.62p \rfloor}{3.5} < 0.975p - 0.5q + 1 \le p - q + 1$$
points, where the last inequality holds because we may assume q ≤ 0.05p, since p > 35q as was written above. Thus, F can be pierced by at most p − q + 1 points, as asserted.
Case 2: F does not satisfy the (p ′ , q ′ )-property. In this case, there exists a 'bad' subfamily S of size p ′ = ⌊0.62p⌋ that does not contain an intersecting 0.5q-tuple, and the family F \ S satisfies the (⌈0.38p⌉, 0.5q)-property.
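To pierce F \ S, we use Lemma 2.4. Like above, we have $0.5q \ge 3.5 \log_2 \lceil 0.38p \rceil$, whence by Lemma 2.4, F \ S can be pierced by at most
$$\lceil 0.38p \rceil - 0.5q + 1 + \frac{2 \lceil 0.38p \rceil}{3.5} \le 0.39p - 0.5q + 1 + \frac{0.78p}{3.5} < 0.613p - 0.5q + 1$$
points, where the first inequality holds since we may assume p ≥ 100 (as was written above), and thus, ⌈0.38p⌉ ≤ 0.39p. To pierce the 'bad' subfamily S, we use Lemma 2.5, which implies (via Corollary 2.7) that S can be pierced by
$$\frac{|S| + \lambda}{2} < \frac{0.62p + 0.1p}{2} = 0.36p$$
points. Therefore, in total F can be pierced by (0.613p − 0.5q + 1) + 0.36p < 0.975p − 0.5q + 1 points. Since we may assume q ≤ 0.05p (like above), this implies that F can be pierced by p − q + 1 points. This completes the proof.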
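The numerical constants in the two cases can be double-checked with exact rational arithmetic. The sketch below assumes that Lemma 2.4 is applied with c = 3.5 (so that the additive term is 2p′/3.5) and that the bad set is bounded via (|S| + λ)/2 with |S| ≤ 0.62p and λ < 0.1p; the variable names are illustrative only.

```python
from fractions import Fraction as Fr

c = Fr(7, 2)                                    # c = 3.5 in Lemma 2.4

# Case 1: coefficient of p in 0.62p + 2*(0.62p)/c.
case1 = Fr(62, 100) * (1 + 2 / c)

# Case 2: coefficient of p for F \ S, using ceil(0.38p) <= 0.39p.
rest = Fr(39, 100) * (1 + 2 / c)
# Case 2: coefficient of p for the bad set S, (0.62p + 0.1p)/2.
bad = (Fr(62, 100) + Fr(10, 100)) / 2
case2 = Fr(613, 1000) + bad

def ceil_038(p):
    """Exact ceil(0.38 * p) for a positive integer p."""
    return -(-38 * p // 100)

ceil_ok = all(100 * ceil_038(p) <= 39 * p for p in range(100, 5000))

print(float(case1), float(rest), float(bad), float(case2), ceil_ok)
```

Both totals stay below 0.975p, so the final step only needs 0.5q ≤ 0.025p, i.e. q ≤ 0.05p, which follows from p > 35q.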

3 From (2,2)-theorems to (p,2)-theorems
As was mentioned in the introduction, in general, the existence of a (2, 2)-theorem (and even Helly number 2) does not imply the existence of a (p, 2)-theorem. An example mentioned by Fon der Flaass and Kostochka [13] (in a slightly different context) is presented in Appendix B.1.
In this section we prove Theorem 1.8 which asserts that for compact convex families with Helly number 2, a (2, 2)-theorem does imply a (p, 2)-theorem, and consequently, a tight (p, q)-theorem for a large range of q's. Due to space constraints, the proof of our other new (p, 2)-theorem (i.e., Theorem 1.9) is presented in Appendix B. Let us recall the assertion of the theorem:
Theorem 1.8. For any family F of compact convex sets in $\mathbb{R}^d$ that has Helly number 2, we have $HD_F(p,2) \le \frac{p^{2d-1}}{2^{d-1}}$. Consequently, we have $HD_F(p,q) = p-q+1$ for all $q > c\,p^{1-\frac{1}{2d-1}}$, where c = c(d).
The 'consequently' part follows immediately from the (p, 2)-theorem via Theorem 1.4. (Formally, Theorem 1.4 is stated only for a growth rate of $HD_F(p,2) = O(p^5)$, but it is apparent from the proof that the argument can be extended to $HD_F(p,2) = O(p^m)$ for any m ∈ N, at the expense of the constant c becoming dependent on m.) Hence, we only have to prove the (p, 2)-theorem.
Let us present the proof idea first. The proof goes by induction on d. Given a family F of sets in R d that satisfies the assumptions of the theorem and has the (p, 2)-property, we take S to be a maximum (with respect to size) pairwise-disjoint subfamily of F, and consider the intersections of other sets of F with the elements of S. We observe that by the maximality of S, each set A ∈ F \ S intersects at least one element of S, and thus, we may partition F into three subfamilies: S itself, the family U of sets in F \ S that intersect only one element of S, and the family M ⊂ F \ S of sets that intersect at least two elements of S.
We show (using the maximality of S and the (2, 2)-theorem on F) that U ∪ S can be pierced by p − 1 points. As for M, we represent it as a union of families: $M = \bigcup_{C,C' \in S} X_{C,C'}$, where each $X_{C,C'}$ consists of the elements of F \ S that intersect both C and C′. We use a geometric argument to show that each $X_{C,C'}$ corresponds to $Y_{C,C'} \subset \mathbb{R}^{d-1}$ that has Helly number 2 and satisfies the (p, 2)-property. This allows us to bound the piercing number of $Y_{C,C'}$ by the induction hypothesis, and consequently, to bound the piercing number of $X_{C,C'}$. Adding up the piercing numbers of all $X_{C,C'}$'s and the piercing number of U ∪ S completes the inductive step.
Proof of Theorem 1.8. By induction on d.
Inductive step. Let F be a family of sets in R d that satisfies the assumptions of the theorem and has the (p, 2)-property. Let S be a maximum (with respect to size) pairwise-disjoint subfamily of F. W.l.o.g., we may assume |S| = p − 1.
By the maximality of S, each set A ∈ F \ S intersects at least one element of S. Moreover, any two sets A, B ∈ F that intersect the same C ∈ S and do not intersect any other element of S, are intersecting, as otherwise, the subfamily S ∪ {A, B} \ {C} would be a pairwise-disjoint subfamily of F that is larger than S, a contradiction. Hence, for each $C_0 \in S$, the subfamily of all sets in F that intersect $C_0$ and no other element of S satisfies the (2, 2)-property, and thus, can be pierced by a single point by the assumption on F. Therefore, denoting $U = \{A \in F : |\{C \in S : A \cap C \neq \emptyset\}| = 1\}$, all sets in U ∪ S can be pierced by at most p − 1 points.
Let M ⊂ F be the family of all sets in F that intersect at least two elements of S. For each C, C′ ∈ S, let
$$X_{C,C'} = \{A \in M : A \cap C \neq \emptyset \text{ and } A \cap C' \neq \emptyset\}.$$
(Note that the elements of $X_{C,C'}$ may intersect other elements of S.) Let $H \subset \mathbb{R}^d$ be a hyperplane that strictly separates C from C′, and put $Y_{C,C'} = \{A \cap H : A \in X_{C,C'}\}$.
or equivalently, that the family T can be pierced by a single point. Therefore, Y C,C ′ satisfies HD Y C,C ′ (2, 2) = 1, as asserted.
Claim 3.1 allows us to apply the induction hypothesis to $Y_{C,C'}$, to deduce that it can be pierced by fewer than $p^{2d-3}/2^{d-2}$ points. Since S contains only $\binom{p-1}{2}$ pairs (C, C′), and since any set in M belongs to at least one of the $X_{C,C'}$, this implies that M can be pierced by fewer than $\binom{p-1}{2} \cdot p^{2d-3}/2^{d-2}$ points. As U ∪ S can be pierced by p − 1 points as shown above, F can be pierced by fewer than
$$\binom{p-1}{2} \cdot \frac{p^{2d-3}}{2^{d-2}} + (p-1) \le \frac{p^{2d-1}}{2^{d-1}}$$
points. This completes the proof.
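The arithmetic that closes the inductive step can be sanity-checked numerically. The sketch below assumes that the step requires $\binom{p-1}{2} \cdot p^{2d-3}/2^{d-2} + (p-1) \le p^{2d-1}/2^{d-1}$, which is the bound of Theorem 1.8 in dimension d fed by its own bound in dimension d − 1; the function name `step_holds` and the tested ranges are illustrative choices. Both sides are multiplied by $2^{d-1}$ so that only integers are compared.

```python
from math import comb

def step_holds(p, d):
    """Check binom(p-1,2) * p^(2d-3) / 2^(d-2) + (p-1) <= p^(2d-1) / 2^(d-1),
    after multiplying both sides by 2^(d-1)."""
    lhs = 2 * comb(p - 1, 2) * p ** (2 * d - 3) + (p - 1) * 2 ** (d - 1)
    rhs = p ** (2 * d - 1)
    return lhs <= rhs

all_ok = all(step_holds(p, d) for d in range(2, 8) for p in range(2, 200))
print(all_ok)
```

A finite check is of course not a proof; it only confirms that the constants fit together on the tested range.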
In this appendix we present the full proof of Theorem 1.4, which allows leveraging a (p, 2)-theorem into a tight (p, q)-theorem, for families F that satisfy $HD_F(2,2) = 1$. For the sake of completeness we present the proof almost in full, although most of the components appear (in a simplified form) in the case of axis-parallel rectangles presented in Section 2. This appendix is organized as follows. First we outline the proof in Section A.1. Then we present several lemmas required for the proof in Section A.2, and the proof itself in Section A.3. We deduce Theorem 1.7 from Theorem 1.4 in Section A.4. Finally, for the sake of completeness we present the proof of Observation 2.1 in Section A.5.

A.1 Outline of our method
Let F be a family that satisfies $HD_F(2,2) = 1$. In order to leverage a (p, 2)-theorem for F into a tight (p, q)-theorem we would like to perform an inductive process similar to the process applied in the proof of Theorem 2.2. Put λ = ν(F). If λ is 'sufficiently large', we apply the recursive formula $HD_F(p,q) \le HD_F(p-\lambda, q-1) + \lambda - 1$ and use the induction hypothesis to bound $HD_F(p-\lambda, q-1)$. Otherwise, we would like to use the (p, 2)-theorem to deduce that F can be pierced by at most p − q + 1 points. Since we allow q to be as small as roughly log p, and as we want to apply the induction hypothesis to $HD_F(p-\lambda, q-1)$, λ must be at least linear in p. Thus, in the 'otherwise' case we have to show directly that if λ < c′p for a sufficiently small constant c′, then F can be pierced by at most p − q + 1 points. If we merely use the fact that F satisfies the (λ + 1, 2)-property and apply the (p, 2)-theorem, we only obtain that F can be pierced by c′p · f(c′p) points, which is significantly weaker than the desired bound p − q + 1.
Instead, we use a more complex procedure, based on Observation 2.3 presented above. First, we use Observation 2.3 to leverage the (p, 2)-theorem by an inductive process into a 'weak' (p, q)-theorem that guarantees piercing with p − q + 1 + O(p) points, for all $q = \Omega(T_{100}(p))$, where $T_c(p) = \min\{q : q \ge 2c \cdot f(2p/q)\}$. We then show that if λ < c′p for a sufficiently small absolute constant c′, then F can be pierced by at most p − q + 1 points, by combining the weak (p, q)-theorem, another application of Observation 2.3, and a lemma which exploits the size of λ.
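To get a feeling for the threshold $T_c(p)$, it can be evaluated directly from its definition. The sketch below is only an illustration: the function name `threshold`, the restriction of the search to integers 1 ≤ q ≤ p (so that 2p/q stays in the domain [2, ∞) of f), and the sample values f = log₂, c = 3.5, p = 2²⁰ are all assumptions made for this example.

```python
from math import log2

def threshold(c, p, f):
    """Smallest integer q in [1, p] with q >= 2*c*f(2p/q), or None if none exists.
    Since f is increasing, the right-hand side decreases as q grows, so the
    first q that satisfies the inequality is the minimum."""
    for q in range(1, p + 1):
        if q >= 2 * c * f(2 * p / q):
            return q
    return None

c, p = 3.5, 2 ** 20
q0 = threshold(c, p, log2)
print(q0, 2 * c * log2(p))
```

For f = log₂ the threshold is somewhat smaller than 2c · log₂ p, because the argument of f is 2p/q rather than p.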
In addition, we have to handle the induction basis: while in the proof of Theorem 2.2, Dol'nikov could use the case p = q = 2 as the induction basis, our assertion applies only to significantly larger values of q. Hence, we will have to guarantee that for the 'minimum relevant' value of q, for all 'relevant' values of p (i.e., all values of p such that $q \ge T_{100}(p)$) we have $HD_F(p,q) = p-q+1$. We shall deduce this from another result of Dol'nikov presented below.

A.2 Lemmas used in the proof
The first lemma is a weak (p, q)-theorem that can be obtained from a (p, 2)-theorem using Observation 2.3. While the proof of the lemma is very similar to the proof described above in the case of axis-parallel rectangles, we present it in full for the sake of completeness.
Lemma A.1. Let F be a family of sets in R d and let c > 0. Assume that for all 2 ≤ p ∈ N we have HD F (p, 2) = pf (p), where f : [2, ∞) → [1, ∞) is a monotone increasing function of p. Let T c (p) = min{q : q ≥ 2c · f (2p/q)}. Then for any q ≥ T c (p), we have Proof. First, assume that both p and q are powers of 2. We perform an inductive process with ℓ = (log 2 q) − 1 steps, where we set F 0 = F and (p 0 , q 0 ) = (p, q), and in each step i, we apply Observation 2.3 to a family F i−1 that satisfies the (p i−1 , q i−1 )-property, with (p ′ , q ′ ) = ( 2 ) which we denote by (p i , q i ). Consider Step i. By Observation 2.3, either F i−1 satisfies the (p i , q i ) = ( 2 )-property, or there exists a 'bad' set S i of size p i−1 2 without an intersecting q i−1 2 -tuple, and the family F i−1 \ S i satisfies the ( p i−1 2 , q i−1 2 + 1)-property, and in particular, the ( p i−1 2 , q i−1 2 )-property. In either case, we are reduced to a family F i (either F i−1 or F i−1 \ S i ) that satisfies the (p i , q i )property, to which we apply Step i + 1.
At the end of Step $\ell$ we obtain a family $F_\ell$ that satisfies the $(2p/q, 2)$-property. By the assumption of the lemma, this family can be pierced by $\frac{2p}{q} f(\frac{2p}{q})$ points. Noting that the map $q \mapsto f(2p/q)$ is decreasing and using the definition of $T_c(p)$ and the assumption $q \geq T_c(p)$, we obtain
$$\frac{2p}{q} f\left(\frac{2p}{q}\right) \leq \frac{2p}{q} \cdot \frac{q}{2c} = \frac{p}{c},$$
and thus, $F_\ell$ can be pierced by $p/c$ points.
In order to pierce $F$, we also have to pierce the 'bad' sets $S_i$. In the worst case, in each step we have a bad set, and so we have to pierce $S = \bigcup_{i=1}^{\ell} S_i$, whose size is $|S| = \sum_{i=1}^{\ell} p/2^i = p - 2p/q \leq p - 1$. Since any family that satisfies the $(p,q)$-property also satisfies the $(p-k,q-k)$-property for any $k$, the family $S$ contains an intersecting $(q-1)$-tuple, which of course can be pierced by a single point. Hence, $S$ can be pierced by $(p-1)-(q-1)+1 = p-q+1$ points. Therefore, in total $F$ can be pierced by $p-q+1+p/c$ points, as asserted.
Now, we have to deal with the case where $p, q$ are not necessarily powers of 2, and thus, in some of the steps either $p_{i-1}$ or $q_{i-1}$ or both are not divisible by 2. It is clear from the proof presented above that if we can define $(p_i,q_i)$ in such a way that in both cases (i.e., whether the $(p_i,q_i)$-property is satisfied or not) we have $\frac{p_i}{q_i} \leq \frac{p_{i-1}}{q_{i-1}}$, and also the total size of the bad sets (i.e., $|S|$) is at most $p$, the assertion can be deduced as above. We show that this can be achieved by a proper choice of $(p_i,q_i)$ and a slight modification of the steps described above.
If $F_{i-1}$ satisfies the $(p',q')$-property, we define $F_i = F_{i-1}$ and $(p_i,q_i) = (p',q')$. Otherwise, there exists a 'bad' set $S_i$ of size $p'$ that does not contain an intersecting $q'$-tuple, and the family $F_{i-1} \setminus S_i$ satisfies the $(p_{i-1}-p', q_{i-1}-q'+1)$-property. In this case, we define $F_i = F_{i-1} \setminus S_i$ and choose $(p_i,q_i)$ accordingly. It is easy to check that in both cases we have $\frac{p_i}{q_i} \leq \frac{p_{i-1}}{q_{i-1}}$, and that $|S| \leq p-1$ holds also with respect to the modified definition of the $S_i$'s. Hence, the proof indeed can be completed, as above.
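The halving process in the power-of-2 case of the proof can be traced with a few lines of Python. This is only a sketch of the bookkeeping: the function name `halving_schedule` is hypothetical, and the code assumes the worst case described in the proof, in which a bad set $S_i$ of size $p_{i-1}/2$ appears in every step.

```python
def halving_schedule(p, q):
    """Trace (p_i, q_i) for i = 0, ..., l when p >= q >= 2 are powers of 2.

    Returns the list of pairs and the worst-case total size of the bad sets.
    """
    assert p >= q >= 2
    assert p & (p - 1) == 0 and q & (q - 1) == 0, "p and q must be powers of 2"
    pairs = [(p, q)]
    bad_total = 0
    while pairs[-1][1] > 2:
        p_prev, q_prev = pairs[-1]
        bad_total += p_prev // 2              # worst case: |S_i| = p_{i-1} / 2
        pairs.append((p_prev // 2, q_prev // 2))
    return pairs, bad_total

pairs, bad_total = halving_schedule(64, 8)
print(pairs)       # [(64, 8), (32, 4), (16, 2)]
print(bad_total)   # 48
```

For $(p,q) = (64,8)$ the process takes $\ell = \log_2 8 - 1 = 2$ steps, ends at $(2p/q, 2) = (16, 2)$, and the bad sets have total size $p - 2p/q = 48 \leq p - 1$.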
We also need the following easy extension of the classical Hadwiger-Debrunner theorem, obtained by Dol'nikov [9].
A.3 Proof of Theorem 1.4
We are ready to prove our main theorem. Let us recall its statement.
Theorem 1.4. Let F be a family of sets in R d such that HD F (2, 2) = 1. Assume that for all x for all p ≥ 2. Denote T c (p) = min{q : q ≥ 2c · f (2p/q)}. Then for any p ≥ q ≥ 2 such that q ≥ T 100 (p), we have HD F (p, q) = p − q + 1.
Proof. By induction. We start with the inductive step, and leave the induction basis for the end.
We apply Observation 2.3 to $F$, with $(p',q') = (\frac{2}{3}p, \frac{q}{2})$. (For the sake of simplicity, we assume that $p$ and $q$ are divisible by 3 and 2, respectively. It will be apparent that this does not affect the proof.) We have to consider two cases.
Case 1: $F$ satisfies the $(p',q')$-property. Note that the assumption on $f'(p)/f(p)$ implies $f(4p/3) < 5f(p)$ for any $p \geq 2$. By the assumption on $(p,q)$, we have $q \geq 200 f(2p/q)$, and thus,
$$\frac{q}{2} \geq 100 f\left(\frac{2p}{q}\right) > 20 f\left(\frac{4}{3} \cdot \frac{2p}{q}\right) = 20 f\left(\frac{2 \cdot (2p/3)}{q/2}\right).$$
By the definition of $T_c$, this implies $\frac{q}{2} \geq T_{10}(\frac{2p}{3})$. Therefore, we can apply Lemma A.1 to deduce that $F$ can be pierced by at most
$$\frac{2p}{3} - \frac{q}{2} + 1 + \frac{2p/3}{10} < 0.74p - 0.5q + 1$$
points. This shows that $F$ can be pierced by less than $p-q+1$ points, if we may assume $0.26p \geq 0.5q$, or equivalently, $q \leq 0.52p$. To see that we indeed may assume this, note that by Proposition A.2, $HD_F(p,q) = p-q+1$ holds whenever $q \geq \frac{p}{2}+1$. Our theorem applies only for $q \geq 200$ (as for any 'relevant' pair $(p,q)$ we have $q \geq 200f(2p/q) \geq 200 \cdot 1$), and in this range, $(q > 0.52p) \Rightarrow (q > 0.5p+1)$. Thus, either the above argument implies $HD_F(p,q) < p-q+1$, or Proposition A.2 implies $HD_F(p,q) = p-q+1$, and either way we are done.
Case 2: $F$ does not satisfy the $(p',q')$-property. In this case, there exists a 'bad' subfamily $S$ of size $p' = \frac{2p}{3}$ that does not contain an intersecting $q'$-tuple, and the family $F \setminus S$ satisfies the $(\frac{p}{3}, \frac{q}{2})$-property.
To pierce the family $F \setminus S$ we use Lemma A.1. By the monotonicity of $f$, we have $\frac{q}{2} \geq 100 f(\frac{2p}{q}) \geq 100 f(\frac{2 \cdot (p/3)}{q/2})$, and thus, $\frac{q}{2} \geq T_{50}(\frac{p}{3})$. Hence, Lemma A.1 implies that $F \setminus S$ can be pierced by at most $\frac{p}{3} - \frac{q}{2} + 1 + \frac{p/3}{50} = 0.34p - 0.5q + 1$ points. To pierce the 'bad' subfamily $S$, we use Lemma 2.5, which implies that $S$ can be pierced by at most $0.34p$ points. Therefore, $F$ can be pierced by $0.68p - 0.5q + 1$ points. As in Case 1, we may argue that either $0.68p - 0.5q + 1 < p-q+1$ and we are done, or $q \geq \frac{p}{2}+1$ and then we are done by Proposition A.2. This completes the inductive step.
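The numerical constants in the two cases of the inductive step can be checked with exact rational arithmetic. The sketch below only re-derives the coefficients of $p$ obtained from Lemma A.1 and verifies the implication $(q > 0.52p) \Rightarrow (q \geq p/2 + 1)$ on a finite range. The variable and function names are hypothetical.

```python
from fractions import Fraction as Fr

# Case 1: Lemma A.1 with c = 10 applied to (2p/3, q/2); coefficient of p.
case1_coeff = Fr(2, 3) + Fr(2, 3) / 10
# Case 2: Lemma A.1 with c = 50 applied to (p/3, q/2); coefficient of p.
case2_rest_coeff = Fr(1, 3) + Fr(1, 3) / 50

def fallback_applies(p, q):
    """True when q >= p/2 + 1, the range in which Proposition A.2 is invoked."""
    return 2 * q >= p + 2

print(case1_coeff, float(case1_coeff))   # 11/15, just below 0.74
print(case2_rest_coeff)                  # 17/50, i.e. exactly 0.34

# For q >= 200, every pair with q > 0.52p falls into the fallback range.
ok = all(fallback_applies(p, q)
         for q in range(200, 600)
         for p in range(q, 2 * q)
         if 100 * q > 52 * p)
print(ok)   # True
```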
To conclude the proof, we need the induction basis. The idea is to show that for $q_0 = \min\{q : \text{there exists a 'relevant' pair } (p,q)\}$ (where 'relevant' means a pair $(p,q)$ that belongs to the range covered by the theorem), for all relevant pairs $(p,q_0)$ we have $p \leq 2q_0 - 2$, and thus $HD_F(p,q_0) = p - q_0 + 1$ holds by Proposition A.2. This is a sufficient basis, since in the inductive process, $q$ is decreased by 1 in each step, and so we will eventually reduce to $q = q_0$, for which the assertion holds. Note that we cannot move from $(p,q)$ to $(p',q-1)$ such that $p' < q-1$, since this would mean that the family contains an independent set of size $\geq p-q+2$; completing such a set to $p$ elements by adding $\leq q-2$ arbitrary elements, we obtain a subfamily of $F$ of size $p$ without an intersecting $q$-tuple, a contradiction.
By Claim A.3(2), $T_c(p)$ is increasing in $p$, and thus, if for some $q$ there exists a $p$ such that $q \geq T_c(p)$, then we also have $q \geq T_c(q)$. Hence, for each $q$, the smallest $p$ for which $(p,q)$ lies in the range covered by the theorem is $q$ itself. Consequently, the smallest $q$ for which there exists a 'relevant' $(p,q)$ is equal to the smallest $p$ for which there exists a 'relevant' $(p,q)$. Denote this value by $q_0$. By its minimality, we have $q_0 - 1 < T_{100}(q_0 - 1)$. Therefore, by Claim A.3(3a) we have $q_0 < T_{100}(2q_0 - 1)$. (Note that in order to apply the claim, we need $c \geq \frac{1}{2} \log_{2(k-1)/k} 2$, where $k$ is a lower bound on $T_c(q_0 - 1)$. This indeed holds for $c = 100$, as we can take $k = 200$ as a lower bound, as mentioned above.) As $T_c(p)$ is increasing in $p$, this implies that $\{p : T_{100}(p) = q_0\} \subset \{q_0, q_0+1, \ldots, 2q_0-2\}$. Therefore, by Proposition A.2, we have $HD_F(p,q_0) = p - q_0 + 1$ for all 'relevant' $(p,q_0)$. This completes the proof of the induction basis, and hence the proof of the theorem.

A.4 Proof of Theorem 1.7
We conclude this appendix with the simple deduction of Theorem 1.7 from Theorem 1.4. Let us recall the statement of the theorem.
Theorem 1.7. Let $F$ be a family of axis-parallel boxes in $\mathbb{R}^d$. Then $HD_F(p,q) = p-q+1$ holds for all $q > c \log^{d-1}(p)$, where $c$ is a universal constant.
Proof. The $(p,2)$-theorem for axis-parallel boxes in $\mathbb{R}^d$ [20] yields $HD_F(p,2) \leq O(p \log_2^{d-1}(p))$, which means that $HD_F(p,2) \leq pf(p)$ holds for $f(p) = c' \log_2^{d-1}(p)$ (where $c'$ is a universal constant). For this $f(p)$, we have $T_{100}(p) \leq 200c' \log_2^{d-1}(p)$. Hence, the assertion of Theorem 1.7 will follow from Theorem 1.4, once we verify that $f(p)$ satisfies the conditions of the theorem.
The condition regarding $f'(p)$ is clearly satisfied: we have $f'(p) = c'(d-1)\log_2^{d-2}(p) \log_2 e / p \geq \log_2 e / p$, for all $p \geq 2$ and $c' \geq 1$. As for the condition regarding $f'(p)/f(p)$, we observe that in the proof of Theorem 1.4, this condition is applied only for values of $p$ for which there exists a 'relevant' pair $(p,q)$, and thus, it is sufficient to show that it holds for all such values. We have
$$\frac{f'(p)}{f(p)} = \frac{(d-1)\log_2 e}{p \log_2 p},$$
and so the required inequality (4) amounts to a constant upper bound on $(d-1)\log_2 e / \log_2 p$. This indeed holds in all the required range, since if there exists a 'relevant' pair $(p,q)$ then $p \geq 200 \log^{d-1}(p)$, and thus, $\log_2 p \geq (d-1) \log_2 \log_2 p \geq d-1$, which gives $(d-1)\log_2 e / \log_2 p \leq \log_2 e$ and clearly implies (4). This completes the proof.

A.5 Proof of Observation 2.1
For the sake of completeness, we present in this subsection the proof of Observation 2.1, due to Wegner [32] and (independently) Dol'nikov [9]. Let us recall its formulation.
Observation 2.1. Let $F$ be a family that satisfies $HD_F(2,2) = 1$, and put $\lambda = \nu(F)$. Then
$$HD_F(p,q) \leq HD_F(p-\lambda, q-1) + \lambda - 1.$$
Proof. The slightly weaker bound $HD_F(p,q) \leq HD_F(p-\lambda, q-1) + \lambda$ holds trivially, and does not even require the assumption $HD_F(2,2) = 1$. Indeed, if $S$ is a pairwise-disjoint subset of $F$ of size $\lambda$, then $F \setminus S$ satisfies the $(p-\lambda, q-1)$-property, and thus, can be pierced by $HD_F(p-\lambda, q-1)$ points. As $S$ clearly can be pierced by $\lambda$ points, we obtain $HD_F(p,q) \leq HD_F(p-\lambda, q-1) + \lambda$.
To get the improvement by 1, let $S$ be a pairwise-disjoint subfamily of $F$ of size $\lambda = \nu(F)$ and let $T$ be a transversal of $F \setminus S$ of size $HD_F(p-\lambda, q-1)$. Take an arbitrary $x \in T$, and consider the subfamily $X = \{A \in F \setminus S : x \in A\}$ (i.e., the sets in $F \setminus S$ pierced by $x$). By the maximality of $S$, each $A \in X$ intersects some $B \in S$. Hence, we can write $X = \bigcup_{B \in S} X_B$, where $X_B = \{A \in X : A \cap B \neq \emptyset\}$. Observe that for each $B$, the set $X_B \cup \{B\}$ is pairwise-intersecting. Indeed, any $A, A' \in X$ intersect in $x$, and all elements of $X_B$ intersect $B$. Therefore, by the assumption on $F$, each $X_B \cup \{B\}$ can be pierced by a single point. Since $X = \bigcup_{B \in S} X_B$, this implies that there exists a transversal $T'$ of $X \cup S$ of size $|S| = \lambda$. Now, the set $(T \setminus \{x\}) \cup T'$ is the desired transversal of $F$ with $HD_F(p-\lambda, q-1) + \lambda - 1$ points.
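The exchange step in the proof can be illustrated on closed intervals of the real line, a family in which every pairwise-intersecting subfamily has a common point. The following Python sketch is an illustration under that assumption only: the function names and the sample family are hypothetical, and the common point of each $X_B \cup \{B\}$ is taken to be the largest left endpoint, which is specific to intervals.

```python
def intersects(a, b):
    """Closed intervals a = (l, r) and b = (l, r) share a point."""
    return max(a[0], b[0]) <= min(a[1], b[1])

def pierces(points, family):
    """Every interval of the family contains at least one of the points."""
    return all(any(l <= x <= r for x in points) for (l, r) in family)

def exchange(S, T, x, rest):
    """Replace x in T by one point per B in S, as in the proof above.

    S: maximum pairwise-disjoint subfamily; rest: the family minus S;
    T: a transversal of rest; x: an arbitrary point of T.
    """
    X = [A for A in rest if A[0] <= x <= A[1]]
    T_prime = []
    for B in S:
        X_B = [A for A in X if intersects(A, B)]
        # Pairwise-intersecting closed intervals share their largest left endpoint.
        T_prime.append(max(l for (l, _) in X_B + [B]))
    return [y for y in T if y != x] + T_prime

S = [(0, 2), (3, 6), (7, 9)]          # pairwise disjoint, lambda = 3
rest = [(1, 4), (5, 8)]
T = [4, 5]                            # transversal of rest
new_T = exchange(S, T, 4, rest)
print(sorted(new_T))                  # [1, 3, 5, 7]
print(pierces(new_T, S + rest))       # True
```

The resulting transversal has $|T| + \lambda - 1 = 2 + 3 - 1 = 4$ points, one fewer than the trivial bound $|T| + \lambda$.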

B Proof of Theorem 1.9
In this appendix we prove Theorem 1.9, which provides (p, 2)-theorems for families of compact convex sets that admit a (2, 2)-theorem. In addition, for sake of completeness we present an example, due to Fon der Flaass and Kostochka [13], of a set system (i.e., a hypergraph) with Helly number 2 that does not admit a (3, 2)-theorem, thus showing that in general, the existence of a (2, 2)-theorem does not imply the existence of a (p, 2)-theorem.
Let us restate Theorem 1.9 in a more precise form. Theorem 1.9 (precise formulation). Let F be a family of compact convex sets in R d such that HD F (2, 2) = t. Then: 1. We have In particular, F admits a (p, 2)-theorem for piercing with a bounded number s = s(p, d, t) of points.
Remark B.1. The difference between the general case (Part (1) of the theorem) and the planar case (Parts (2,3) of the theorem) looks surprisingly huge. We do not know whether any of these results are tight; however, the difference is well-explained by the proof method. While in the proof of Parts (2,3) we use a Ramsey-type theorem for families of convex sets in the plane of Larman et al. [24] (Theorem B.3 below) in which the 'Ramsey number' R(k) is polynomial in k, in the general case we have to resort to the classical Ramsey theorem in which R(k) is exponential in k. This is in a sense necessary, since Tietze [30] and (independently) Besicovitch [6] showed that any graph can be represented as the intersection graph of a family of convex sets in R 3 , which implies that no 'Ramsey theorem for convex sets in R d ' for d ≥ 3 can improve over the classical Ramsey theorem.
Remark B.2. As mentioned in the introduction, Matoušek [25] showed that families of sets with dual VC-dimension k have fractional Helly number at most k+1. This allows deducing that such families admit a (p, k)-theorem, using the proof technique of the Alon-Kleitman (p, q)-theorem. This result (which applies in a much more general setting than Part (3) of our theorem) does not imply our theorem, since it yields a (p, 2)-theorem only for families with dual VC-dimension 1, while our theorem applies whenever the VC-dimension is bounded.
The proof of the theorem is a combination of three tools. The first is a Ramsey-type theorem. Recall that the classical Ramsey theorem [27] asserts that for any $k$, there exists $R(k)$ such that any graph on $R(k)$ vertices contains either a complete subgraph on $k$ vertices or an empty subgraph on $k$ vertices. Ramsey showed that $R(k) \leq \binom{2k-2}{k-1} \leq 4^k$. As the best currently known upper bound is not much lower, we use the upper bound $R(k) \leq 4^k$ for the sake of simplicity.
The Ramsey theorem implies that any family of $R(k)$ sets contains either a pairwise-intersecting subfamily of size $k$ or a pairwise-disjoint subfamily of size $k$. Larman et al. [24] showed that for families of compact convex sets in the plane, a significantly better result can be achieved.
The third result is the $\epsilon$-net theorem for families with a bounded VC-dimension. Let us recall a few definitions. For a set system $(U,R)$, where $U$ is a set of points and $R \subset P(U)$ is a set of ranges (or alternatively, a hypergraph $(U,R)$ in which $U$ is the set of vertices and $R$ is the set of hyperedges), we say that a set $Y \subset U$ is shattered by $R$ if every subset of $Y$ can be obtained as the intersection of $Y$ with some range $e \in R$. The VC-dimension of $R$ is the maximal size of a set $Y$ that is shattered by $R$. For example, any set of three non-collinear points in the plane can be shattered by halfplanes, but no four points can be. Hence, the VC-dimension of halfplanes in the plane is 3. This notion was introduced by Vapnik and Chervonenkis [31].
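The definition of shattering can be tested by brute force on a small finite set system. The sketch below uses discrete intervals on five points of a line instead of halfplanes, purely to keep the example short; this choice, and the function names, are assumptions made for illustration. Intervals shatter pairs of points but no triple, so their VC-dimension is 2.

```python
from itertools import combinations

def is_shattered(Y, ranges):
    """Y is shattered if the traces Y & r, over all ranges r, give every subset of Y."""
    Y = frozenset(Y)
    traces = {Y & frozenset(r) for r in ranges}
    return len(traces) == 2 ** len(Y)

def vc_dimension(universe, ranges):
    """Largest size of a shattered subset of the universe (brute force)."""
    dim = 0
    for k in range(1, len(universe) + 1):
        if any(is_shattered(Y, ranges) for Y in combinations(universe, k)):
            dim = k
        else:
            break   # subsets of shattered sets are shattered, so we can stop
    return dim

points = list(range(5))
intervals = [set(range(i, j + 1)) for i in range(5) for j in range(i, 5)]
print(is_shattered({0, 1}, intervals))      # True
print(is_shattered({0, 2, 4}, intervals))   # False: {0, 4} is not a trace
print(vc_dimension(points, intervals))      # 2
```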
An $\epsilon$-net of $(U,R)$ is a subset $S \subset U$, such that any range $e \in R$ that contains at least an $\epsilon$-fraction of the elements of $U$ intersects $S$.
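The definition can be checked directly on a finite set system. The following Python sketch is illustrative only: the function name `is_eps_net` and the example (ten points on a line with discrete intervals as ranges) are assumptions, and $\epsilon$ is passed as an exact fraction to avoid rounding issues.

```python
from fractions import Fraction

def is_eps_net(net, universe, ranges, eps):
    """True if every range holding at least an eps-fraction of the universe meets the net."""
    heavy = [r for r in ranges if len(r & universe) >= eps * len(universe)]
    return all(r & net for r in heavy)

universe = set(range(10))
ranges = [set(range(i, j + 1)) for i in range(10) for j in range(i, 10)]
eps = Fraction(3, 10)

print(is_eps_net({2, 5, 8}, universe, ranges, eps))   # True
print(is_eps_net({0, 9}, universe, ranges, eps))      # False: {1, 2, 3} is missed
```

Here every interval with at least three points contains one of 2, 5, 8, so three points suffice for $\epsilon = 3/10$.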
The $\epsilon$-net theorem of Haussler and Welzl [19] asserts the following:
Theorem B.5 (The $\epsilon$-net theorem, [19]). Let $(U,R)$ be a range space of VC-dimension $k$, let $A$ be a finite subset of $U$ and suppose $0 < \epsilon, \delta < 1$. Let $N$ be a set obtained by $m$ random independent draws from $A$, where
$$m \geq \max\left(\frac{4}{\epsilon} \log \frac{2}{\delta}, \; \frac{8k}{\epsilon} \log \frac{8k}{\epsilon}\right).$$
Then $N$ is an $\epsilon$-net for $A$ with probability at least $1 - \delta$.
In particular, any family with VC-dimension $k$ admits an $\epsilon$-net of size $O_k(\frac{1}{\epsilon} \log \frac{1}{\epsilon})$. Now we are ready to present the proof of the theorem.
Proof of Theorem 1.9. Part (1). As any family that satisfies the $(p,q)$-property clearly satisfies the $(p',q)$-property for all $p' \geq p$ (if it contains at least $p'$ sets), and as we are not interested in constant factors, we may assume that $p$ is larger than any prescribed constant; in particular, we may assume $p \geq (d+1)t$. Let $F$ be a family that satisfies the assumptions of the theorem and has the $(p,2)$-property. We claim that $F$ satisfies the $(4^p, \lceil p/t \rceil)$-property.
To prove this, let $S$ be a subfamily of $F$ of size $4^p$. We have to show that $S$ contains an intersecting $\lceil p/t \rceil$-tuple.
By the Ramsey theorem [27], either S contains p pairwise intersecting sets, or else it contains p pairwise disjoint sets. The latter is impossible since F satisfies the (p, 2)-property. Hence, S contains a pairwise intersecting subfamily T of size p. As T satisfies the (2, 2)-property, by the assumption on F it can be pierced by t points. The largest among the subsets of T pierced by a single point is of size ≥ ⌈p/t⌉, and so, S contains an intersecting ⌈p/t⌉-tuple, as asserted.
Since $\lceil p/t \rceil \geq d+1$ by assumption, we can apply Theorem B.4(1) to $F$ to deduce the bound asserted in Part (1).
Part (2). As in the proof of Part (1), we may assume that $p$ is sufficiently large so that $p / \log p > 5t$. Let $F$ be a family that satisfies the assumptions of the theorem and has the $(p,2)$-property. Applying the same argument as in Part (1), with Theorem B.3 instead of the Ramsey theorem, we deduce that $F$ satisfies the $(p^5, \lceil p/t \rceil)$-property.
Since by assumption the VC-dimension of $F$ is bounded by $k$, $F$ admits an $\epsilon$-net of size $O_k(\frac{1}{\epsilon} \log \frac{1}{\epsilon})$ by Theorem B.5. Hence, we can replace the application of the weak $\epsilon$-net theorem in the above argument with an application of Theorem B.5. Substituting $q = \log p$, which is easily seen to be (roughly) optimal, we obtain the asserted bound.

B.1 An example of a set system with Helly number 2 that does not admit a (3,2)-theorem
The following example, presented by Fon der Flaass and Kostochka [13] in a different context, implies that in the abstract (i.e., non-geometric) setting, the existence of a $(2,2)$-theorem, and even Helly number 2, does not imply the existence of a $(p,2)$-theorem with a fixed number $f(p)$ of points that does not depend on the size of the family. The example uses a classical result of Erdős [12] which asserts that for any $m \in \mathbb{N}$, there exists an $m$-chromatic triangle-free graph $G_m$ on $n(m) = O(m^2 \log^2 m)$ vertices. As any graph on $n$ vertices can be represented as the intersection graph of a family of axis-parallel boxes in $\mathbb{R}^{\lceil n/2 \rceil}$ (see [28]), the complement graph $\bar{G}_m$ can be represented as the intersection graph of some family $F$ of axis-parallel boxes in $\mathbb{R}^{n'}$, for $n' = \lceil \frac{n(m)}{2} \rceil = O(m^2 \log^2 m)$.
The family $F$ has Helly number 2 (like any family of axis-parallel boxes). It satisfies the $(3,2)$-property, since if some three elements of $F$ are pairwise disjoint, then the intersection graph of $F$ contains an empty triangle, and this cannot happen since the intersection graph $\bar{G}_m$ is the complement of a triangle-free graph. On the other hand, $F$ cannot be pierced by less than $m$ points, as any transversal of $F$ of size $k$ induces a partitioning of the vertices of $\bar{G}_m$ into $k$ cliques, which in turn yields a $k$-coloring of $G_m$ (which was assumed to be $m$-chromatic). Therefore, we have $HD_F(3,2) \geq m$ although the Helly number of $F$ is 2.
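The correspondence between clique partitions of the complement graph and colorings of the original graph can be checked by brute force on a tiny instance. The sketch below uses the 5-cycle $C_5$, a triangle-free graph with chromatic number 3, in place of the graphs $G_m$ of Erdős; this substitution and all names in the code are assumptions made for illustration, and the code works on graphs only, not on the box representation.

```python
from itertools import combinations, product

def min_partition(n, compatible):
    """Least k such that 0..n-1 splits into k classes with all in-class pairs compatible."""
    for k in range(1, n + 1):
        for labels in product(range(k), repeat=n):
            if all(compatible(u, v)
                   for u, v in combinations(range(n), 2)
                   if labels[u] == labels[v]):
                return k

n = 5
g_edges = {frozenset((i, (i + 1) % n)) for i in range(n)}              # G = C_5
co_edges = {frozenset(e) for e in combinations(range(n), 2)} - g_edges  # complement of G

# Chromatic number of G: classes must be independent sets of G.
chromatic_g = min_partition(n, lambda u, v: frozenset((u, v)) not in g_edges)
# Clique cover number of the complement: classes must be cliques of the complement.
clique_cover_co = min_partition(n, lambda u, v: frozenset((u, v)) in co_edges)
# Analogue of the (3,2)-property: no three pairwise non-adjacent vertices in the complement.
no_empty_triangle = not any(
    all(frozenset(e) not in co_edges for e in combinations(t, 2))
    for t in combinations(range(n), 3)
)

print(chromatic_g, clique_cover_co, no_empty_triangle)   # 3 3 True
```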

C Discussion and open problems
A central problem left for further research is whether Theorem 1.4 which allows leveraging a (p, 2)-theorem into a (p, q)-theorem, can be extended to the cases HD F (p, 2) = pf (p) where f (p) ≪ log p or f (p) being super-polynomial in p. It seems that super-polynomial growth rates can be handled with a slight modification of the argument (at the expense of replacing T 100 (p) with some worse dependence on p). For sub-logarithmic growth rate, it seems that the current argument does not work, since the inductive step requires the packing number of F to be extremely small, and so, Lemma 2.5 allows reducing the piercing number of the 'bad' family S only slightly, rendering Lemma A.1 insufficient for piercing F with p − q + 1 points in total.
Extending the method for sub-logarithmic growth rates will have interesting applications. For instance, it will immediately yield a tight (p, q)-theorem for all q = Ω(log log p) for families of axis-parallel boxes in which for each two intersecting boxes, a corner of one is contained in the other, following the work of Chudnovsky et al. [8].
Another open problem is whether the method can be extended to families F that admit a (2, 2)-theorem, but satisfy HD F (2, 2) > 1. A main obstacle here is that in this case, Observation 2.1 does not apply, and instead, we have the bound HD(p, q) ≤ HD(p − λ, q − 1) + λ. While the bound is only slightly weaker, it precludes us from using the inductive process of Wegner and Dol'nikov, as in each application of the inductive step we have an 'extra' point.