A New Impossibility Result for Online Bipartite Matching Problems
Abstract
Online Bipartite Matching with random user arrival is a fundamental problem in the online advertisement ecosystem. Over the last 30 years, many algorithms and impossibility results have been developed for this problem. In particular, the latest impossibility result was established by Manshadi, Oveis Gharan and Saberi [50] in 2011. Since then, several algorithms have been published in an effort to narrow the gap between the upper and the lower bounds on the competitive ratio.
In this paper we show that no algorithm can achieve a competitive ratio better than , improving upon the upper bound presented in [50]. Our construction is simple to state, accompanied by a fully analytic proof, and yields a competitive ratio bound intriguingly similar to , the optimal competitive ratio for the fully adversarial Online Bipartite Matching problem.
Although the tightness of our upper bound remains an open question, we show that our construction is extremal in a natural class of instances.
Keywords and phrases:
Bipartite Matching, Random Graphs, Competitive RatioCategory:
Track A: Algorithms, Complexity and GamesCopyright and License:
![[Uncaptioned image]](x1.png)
2012 ACM Subject Classification:
Theory of computation Online algorithmsAcknowledgements:
We thank Will Ma, David Wajc, Pan Xu, and the anonymous reviewers for several comments and suggestions.Funding:
Flavio Chierichetti, Mirko Giacchini, and Alessandro Panconesi were supported in part by a Google Focused Research Award, by BiCi – Bertinoro international Center for informatics, and by the PRIN project 20229BCXNW funded by the European Union - Next Generation EU, Mission 4 Component 1, CUP B53D23012910006.Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele PuppisSeries and Publisher:

1 Introduction
Online Bipartite Matching problems are central to the online advertising ecosystem [50, 49, 45, 52] and have numerous other applications, including efficient packet switching and routing [5, 2], crowdsourcing tasks [62], market clearing problems [10] and ride-sharing platform optimization [22]. In this paper, we establish a new upper bound of on the competitive ratio for several well-studied variants of the problem. This result marks the first improvement of the upper bound for this classic problem in more than a decade.
To explain our result in context, we begin by recalling the problem. We are given a bipartite graph where one side of the bipartition consists of a known set of “ads” (or “advertisers”), while the other side consists of “users” (or “ad slots”), whose neighbor sets are initially unknown. Users arrive sequentially, and the generic user’s neighbors are revealed upon arrival. When a user arrives, the online algorithm can match, irrevocably, the user to one of its available neighbors; the algorithm’s aim is to maximize the size of the matching at the end of the sequence. The competitive ratio of an algorithm, in this case, is defined as the expected size of the matching produced by the algorithm divided by the expected size of the maximum possible matching. (We consider expectations since the algorithm can be randomized, and the input graph can be sampled from a distribution).
A strong motivation for studying this problem comes from online advertising, where edges represent ads that a user is interested in. Budget constraints limit how often each ad can be shown; without loss of generality, we assume that each ad can be matched to at most one user (ads with budgets allowing for multiple impressions can be reproduced multiple times in the graph). In this setting, a matching corresponds to a set of ads that will be shown to users, and the competitive ratio indicates how effectively the algorithm maximizes revenue.
This problem was introduced by Karp, Vazirani and Vazirani in a classic paper [47]. In their setup, the graph and the ordering of the users were chosen adversarially, and they presented a simple algorithm called RANKING that achieves a competitive ratio of (see [47, 34]), which was also proven to be tight [47]. The assumptions of adversarial graphs and adversarial user orderings have since been relaxed to develop algorithms with better competitive ratios; in particular, an important variant of the problem assumes that the user ordering is sampled uniformly at random, instead of being produced by an adversary [34]. Furthermore, researchers have relaxed the notion of adversarial graphs, exploring probabilistic models of user behavior [53]. In particular, consider the power set of the set of ads , and let be a probability distribution over . When a user arrives, a set of ads is sampled according to , and the ads in become the user’s neighbors. Both the cases where is known to the algorithm and where it is not have been considered in the literature. This model of random bipartite graphs has been studied extensively and is motivated by modern machine learning-driven methods used to identify promising ads for users.
A particularly interesting special case of this last model, known as the “irregular cuckoo hashing” model,111We recall that in (a standard variant of) cuckoo hashing [56, 24], given an integer , each item is hashed to uniform-at-random slots of a hashtable; the item is stored in an empty slot, if one is available. In irregular cuckoo hashing [23], one obtains by applying another hash function to – this way, different items can be hashed to a different number of slots. arises when a sample from is obtained by first sampling the degree of a user from a random variable taking values over the non-negative integers, and then sampling uniformly-at-random with replacement a number of ads equal to the sampled degree. This process generates a multiset of ads, but duplicate ads can be removed, leaving the remaining set as the user’s neighbors.
We can now state our result more precisely. We show that no online algorithm can achieve a competitive ratio greater than across all the models discussed above. We establish our bound in the irregular cuckoo hashing model and so, by definition, the bound extends to all the other models and is the strongest known bound for all those variants. The previous strongest upper bound, established in 2011 by Manshadi, Oveis Gharan, and Saberi [50], was . (Table 1 summarizes the state of the art). While our improvement is small, it is comparable in magnitude to many previous algorithmic advancements for this problem.
The upper bound of on the competitive ratio from [50] relies on the irregular cuckoo hashing constructions of Dietzfelbinger et al. [23] and was derived through numerical optimization, without yielding a closed-form expression. In contrast, our construction enables an exact analytical computation of its optimal competitive ratio. Furthermore, our construction uses simple rational probabilities, whereas the optimal probabilities in [50] remain unspecified and likely irrational.
Setting | Competitive Ratio | ||
---|---|---|---|
Best Algorithm | Our UB | Previous Best UB | |
Irregular Cuckoo Hashing | [37] | [50] | |
IID Users (Known Distr.) | |||
IID Users (Unknown Distr.) | [49] | ||
Adv. Graph, UAR Ordering | |||
Adv. Graph and Ordering | [47, 34] | [47] |
The random graphs used to prove our bound are based on a simple random variable taking values over the non-negative integers:
or, equivalently, for each positive integer . As discussed earlier, when a user arrives, a value is drawn from , and ads are chosen uniformly at random (with replacement) as neighbors. The graph consists of ads and users. As we will clarify later, this distribution greedily ensures the “maximum possible” number of vertices of degree for , while maintaining a quasi-complete matching, i.e., one matching a fraction of the users.
Intuitively, selecting user neighbors uniformly at random while keeping user degrees low impairs the performance of any online algorithm: over time, newly arriving users are likely to find all their randomly assigned neighbors already matched. However, establishing a strong upper bound also requires ensuring that the maximum matching remains large – this is the main technical challenge in our analysis.
Whether our upper bound is tight for some or all of the four problem variants mentioned above remains an open question. Notably, the bound involves the Euler’s number, in a manner analogous to the optimal competitive ratio of of the fully-adversarial Online Bipartite Matching problem. While we cannot determine whether our upper bound can be improved, we show that it possesses a certain extremal property and that it is the strongest bound possible for a natural class of instances.
Paper Organization.
Section 2 gives an overview of our main construction, of its relation to previous upper bounds, and of the techniques we used to study it. Section 3 reviews related work. Section 4 introduces key tools, techniques, and notation. In Section 5, we present our construction and prove the main result: the upper bound on the competitive ratio of Online Bipartite Matching Problems. Section 6 establishes the extremality of our construction, showing that it greedily maximizes the number of low-degree users while ensuring a quasi-complete matching. Section 7 extends our construction to arbitrary user-to-ad ratios, demonstrating that equibipartite graphs yield the strongest bound. We conclude with open questions in Section 8.
All the missing proofs can be found in the full version (10.48550/arXiv.2504.14251).
2 Overview
Our irregular cuckoo hashing instance contains users and ads, and assigns probability to each user-degree – in other words, it posits that the probability that a user has degree strictly larger than the generic positive integer is exactly . We obtained this distribution by greedily maximizing the number of users of degree while guaranteeing that the random graph admits a quasi-complete matching from the user side – a matching that pairs all users except for at most of them. Observe that, in a cuckoo hashing instance, the smaller the user-degrees the less likely it is for an online algorithm to match users. The goal of our distribution is to minimize the number of online matches while guaranteeing that the graph admits a quasi-complete matching – in other words, the distribution aims to minimize the numerator of the competitive ratio while making its denominator as large as possible. (We remark that most previous upper bounds were also based on graphs admitting quasi-complete matchings; see, e.g., [6, 50, 34, 47].)
Our Main Construction.
It is easy to see that, in order for a cuckoo hashing instance containing ads, and no more than users, to be quasi-complete from the user side, the instance must contain at most users of degree and users of degree . Furthermore, classical results can be used to prove that, if one aims to be quasi-complete from the user side, and all users have degree , then the maximum number of users is , for .222The feasibility of can be obtained as a corollary of a classical result of Erdös and Rényi [27] which states that, for all small enough , has, with probability , a fraction of its vertices in connected components that are trees. By taking each vertex of to be an ad, and each of its edges to be a user of degree connected to the ads and , one deduces the claim from the observation that the edges of a forest can be oriented so that no vertex has in-degree larger than . Now, suppose that we must have users of degree . What is the largest number of users of degree that we can add while still guaranteeing that there exists a matching covering a fraction of the users? We prove that the answer is “, for ”. Next, if we are bound to have users of degree and users of degree , what is the maximum such that we can add users of degree and match a fraction of users? And so on and so forth.
In general, we set for each degree , and . These fractions result in a total of users, to be matched to the ads. To establish that these ’s give rise to a graph admitting a quasi-complete matching we couple the irregular cuckoo hashing graph with a random graph in the configuration model having the same degree distribution on the user side, and the same Poissonian degree distribution on the ads side; then, by means of the Karp-Sipser algorithm [46] (which we analyze with a variant of the techniques presented in [7] by Balister and Gerke), we establish that this configuration model graph admits a quasi-complete matching; using the coupling, we derive the existence of a quasi-complete matching in the irregular cuckoo hashing graph, as well. Finally, we upper bound the fraction of ads matched by any online algorithm with , thus proving our new upper bound on the competitive ratio of several Online Bipartite Matching problems.
As an extra step, aimed at establishing the extremality of our construction, we show that if one increases any of the ’s by any constant , then a constant fraction of the vertices of degree at most become unmatchable. (In particular, if the extra fraction assigned to degree is taken from vertices of higher degree, then the quasi-complete matching disappears).
A Class of Constructions.
Our construction has users of unbounded degree, and contains the same number of users and ads. To get a more thorough understanding of the construction, and of the problem, we study a class of constructions that generalizes our main one; the instances in this class have ads, and users. Just like our main construction, we populate the generic instance in the class by greedily adding users with the smallest possible degree, under the constraint that the instance admits a quasi-complete matching. For a given , the generalized instance has a maximum user degree of . We prove that, for all , these instances give a weaker upper bound on the competitive ratio than the case , that is, the case with users and ads (see Figure 1 and Theorem 27).
Previous Constructions.
We point out that studying instances of varying user degrees is necessary to improve over the state of the art. In particular, if an irregular cuckoo hashing instance has users and ads, and all the users have degree , any greedy online algorithm will necessarily return a maximum matching (which matches a fraction of users). Thus, the best upper bound provable with this instance is . If, instead, each user has degree , a greedy online algorithm will return a matching with, roughly, edges. The maximum matching has size approximately equal to (see, e.g., [43]; here, is the Lambert W function), so that the best upper bound provable with this instance is . Finally, if each user has degree or more, one can show that the number of users matched by any greedy online algorithm is at least . It follows that having a uniform user degree in equibipartite graphs makes it impossible to improve upon – or even match – the upper bound of Manshadi et al. [50].
Indeed, Manshadi et al. [50] derived their upper bound by employing a mixture of users with varying degrees. In their construction, the user degree distribution is supported on .333Specifically, Manshadi et al. [50] have ads and users, and assign probability to user-degree , to user-degree , and to user-degree . Using the results in [23], they prove that the graph admits a maximum matching of size . They conclude their argument by numerically computing the size of the online matching, which they find to be no larger than . Manshadi et al.’s analysis leverages on an irregular cuckoo hashing construction of Dietzfelbinger et al. [23], whose parameters were obtained numerically; as a result, the optimal fractions of vertices of degrees and in the bipartite matching result of [50] are not given explicitly (and might very well be irrational numbers). Our extremal construction, instead, allows us to compute the competitive ratio analytically and exactly, and is made up of simple, rational, probabilities.
3 Related Work
Online Bipartite Matching problems have a long history. We mention here the results that are most relevant to our paper; see the book of Mehta [52], or the recent survey of Huang, Tang and Wajc [38], for a more comprehensive introduction to these problems.
As we mentioned before, if the graph and the ordering are adversarial, then a competitive ratio of can be achieved via the RANKING algorithm of Karp, Vazirani and Vazirani [47] (see, Goel and Mehta [34], or Birnbaum and Mathieu [8], for proofs; simpler analyses were given in, e.g., Devanur, Jain and Kleinberg [21] and Eden et al. [26]): this algorithm begins by sampling a uniform-at-random ordering of the ads; whenever a user arrives, if at least one ad is available for that user, the user is matched to the first available ad in the ordering. As shown in [47], the competitive ratio of the RANKING algorithm, , is optimal in the fully-adversarial case.
Determining the optimal competitive ratio if the set of users is permuted uniformly-at-random is, possibly, the most important open problem in this area. The best known upper bound on the competitive ratio of this problem was proved by Manshadi, Oveis Gharan and Saberi [50, 51] and has value – this upper bound holds regardless of whether the bipartite graph is generated adversarially or randomly. The RANKING algorithm is still the best known algorithm for this problem on adversarial graphs; Mahdian and Yan[49] show that it achieves a competitive ratio of at least , and Karande, Mehta and Tripathi [45] show that its competitive ratio is not larger than .
The case of randomized bipartite graphs has also been studied extensively; in this setting the adversary produces a distribution over subsets of ads, and each user samples independently a set of ads from that distribution. In the “known distribution” case, this distribution is assumed to be known to the algorithm; the “unknown distribution” variant has also been studied extensively. As mentioned in the previous paragraph, Mahdian and Yan [49] gave an algorithm for randomly-permuted adversarial graphs with a competitive ratio of ; this algorithm applies to both the known, and the unknown, distribution cases. Several improvements on the competitive ratio for the known distribution case have been proved in the last fifteen years. In 2011, Manshadi et al. [50], proposed an algorithm with a competitive ratio of . In 2014, Jaillet and Lu[39] had an algorithm achieving a competitive ratio of . Then, in 2021, Huang and Shu [36] gave an algorithm with a competitive ratio of ; finally, in 2022, Huang, Shu and Yan[37] achieved a competitive ratio of .
The best impossibility result so far – which holds in the cases of adversarial graph with UAR ordering, of random graphs with unknown distribution, and known distribution, as well as in the irregular cuckoo hashing case – is the aforementioned upper bound of published by Manshadi et al. in 2011 [50]. Our result improves the bounds in each of these four cases from to . We remark that our numerical improvement, of value roughly , is of the same order of all the algorithmic improvements mentioned in the previous paragraph.
Several special cases of online bipartite matching problems have been studied in the last few years. In particular, the “integral arrival rates” special case of the “known distribution” setting requires that each set in the support of the distribution has an integral expected number of occurrences.444We observe that our construction, like the construction of Manshadi et al. [50], is an equibipartite irregular cuckoo hashing instance with a maximum user degree larger than . Consequently, the total number of “user types” in the support of the distribution is at least quadratic in the number of users. Thus, these two constructions do not exhibit integral arrival rates. For this case, the best known algorithm – due to Brubach et al. [17] – achieves a competitive ratio of ; the best known upper bound for the integral arrival rates case has value , and is due to Manshadi et al. [51].
We conclude this section by noting that the interest in online bipartite matching is further evidenced by the attention given in recent years to several related problems. These include some of its weighted [37, 44, 29], and fairness-aware [48], variants; in particular, Huang et al. [37] show an upper bound of for the (known distribution) edge-weighted online bipartite matching problem, whereas Ma and Xu [48] show an upper bound of for a variant whose goal is to maximize the minimum matching rate across different groups of users. Other related problems include fractional variants and their rounding schemes [61, 55, 58, 18, 9], variants with different probing models [12, 13, 14], those with time-dependent user distributions [16, 60, 57, 15], with users arriving in batches [30, 31, 42] or with different arrival models [19, 32, 28], variants that leverage machine-learned advice [20, 3, 42], as well as generalized selection tasks [35, 33].
4 Preliminaries
We present here the objects, the tools and the notation that we will use in this paper.
Notation.
Given a predicate , the Iverson bracket has value if is true, and value otherwise. With a slight abuse of notation, for a positive integer , we use to denote the set . The support of a multiset is the set of elements whose multiplicity in is at least . For an integer , we let be the -th harmonic number. Finally, we let be the Lerch transcendent, a series that generalizes the Riemann function, and which converges if , and .
Bipartite (Multi)Graphs and Matchings.
An undirected bipartite (multi)graph is defined by two disjoint sets of vertices and , and by a (multi)set of edges having a support set which is a subset of . If coincides with its support set, then has no parallel edges, and we say that is a simple graph. We will typically use to denote the set of users, and to denote the set of ads. We will also use (resp., ) to denote the number of vertices of degree in (resp., ).
A matching in the (multi)graph is a subset containing pairwise disjoint edges. We say that admits a complete matching if there exists a matching of such that , and that it admits a perfect matching if there exists a matching such that .
Given a bipartite multigraph , if is the support set of , we have that is a bipartite simple graph. Observe that each matching of is also a matching of ; moreover, given an arbitrary matching of , we can transform into a matching of by substituting, for each , the edge with its unique parallel edge .
We say that an infinite sequence of bipartite (multi)graphs of increasing side sizes (), and with maximum matchings , admits a quasi-complete matching if – that is, if for each there exists such that, for each , .
The Karp-Sipser Algorithm.
The Karp-Sipser algorithm [46] is a greedy algorithm for computing a matching of a graph. A run of this algorithm is divided in two phases; for our purposes, only the first phase will be necessary. During the first phase, the algorithm looks for any vertex of degree ; as long as such a vertex exists, the algorithm adds its incident edge to a set , removes the two endpoints of from the graph, and iterates. If no such vertex exists, the first phase ends. It is easy to prove that, at any point during the execution of the first phase, the set can be extended to a maximum matching of the original graph. In fact, it is also easy to prove that if the original graph is bipartite and, at any point during the execution of the first phase, (resp. ) is the subset of (resp. ) containing vertices that have current degree and which are still unmatched, then the maximum matching cannot be larger than .
The Karp-Sipser algorithm was introduced to bound the size of the maximum matching in sparse random graphs; since its introduction, the algorithm has been used to bound the size of the maximum matching of many other random graph models.
The Irregular Cuckoo Hashing Model.
For simplicity of exposition, the random graph models that we will consider in this paper might produce non-simple graphs, that is, multigraphs with parallel edges; these can be easily transformed into simple graphs through the removal of parallel edges. One of the two random graph models that we will consider is the irregular cuckoo hashing model, which is defined as follows:
Definition 1 (Irregular Cuckoo Hashing Model).
Let be two positive integers, and be a probability distribution over the non-negative integers. The Irregular Cuckoo Hashing random multigraph is defined as follows. This bipartite graph has user set and ads set , with and . Each user samples independently an integer from and, for each , samples independently an ad uniformly at random from . The multiset of neighbors of is . (Note that might not be simple).
The Configuration Model.
In order to analyze the irregular cuckoo hashing graph of our construction, we will couple it with a random graph in the configuration model.
Definition 2 (Configuration Model).
Consider two sequences of non-negative integers and such that . To sample a bipartite graph from the configuration model , for each (resp. ), assign to vertex (resp. ) (resp. ) configuration points, and then sample a uniform at random perfect bipartite matching between the configuration points. (Note that might not be simple).
To compute the size of the maximum matching in our configuration model distribution, we will study the behavior of the first phase of the Karp-Sipser algorithm using the techniques presented, e.g., by Balister and Gerke in [7] (see also [4, 11]). In particular, we will show that our distribution admits a quasi-complete matching through the following modification555The Theorems of Balister and Gerke require the degree distributions to have finite support. Observe that any degree distribution with a finite mean can be truncated to a finite prefix altering the distribution’s expectation by no more than any ; in the setting of the configuration model, this truncation results in a change of no more than an fraction in the maximum matching size. Balister and Gerke use this approach to apply their Theorems to generic distributions of finite average degree. In our case, however, the ads degree distribution makes it possible to apply several algebraic shortcuts that become infeasible once that distribution is transformed into one having finite support. Therefore, to simplify our analysis, we reproved certain Lemmas and Theorems of Balister and Gerke [7], so to accommodate distributions with unbounded support. of a result in [7]:
Theorem 3.
Let be any constant. Let , and , where are non-negative sequences such that and . Consider any configuration model , with , , and such that (i) for , (ii) for , (iii) , and (iv) for , and , where (resp. ) is the number of vertices of degree in (resp. ).
Let be the smallest solutions to the simultaneous equations , . Then, with probability , the first phase of Karp-Sipser, on the graph , returns a matching of size at least
Moreover, with probability , after the first phase of Karp-Sipser, at least vertices in will be isolated (i.e., of degree 0 and unmatched). Equivalently, with probability , the maximum matching cannot be larger than,
Random Variables and Concentration Inequalities.
We denote with the binomial distribution with success probability and trials; with the Poisson distribution of parameter ; and with the geometric distribution with success probability , in particular for , and . For two random variables taking values in the natural numbers , we let be their total variation distance.
In our arguments, we will make use of the following well-known concentration bounds (whose proofs can be found, e.g., in [25]):
Fact 4 (Chernoff-Hoeffding’s inequality).
Let be independent random variables such that almost surely. Let , then, for all ,
Additionaly, we will employ McDiarmid’s inequality, a generalization of the Chernoff-Hoeffding bound:
Fact 5 (McDiarmid’s inequality).
Let be independent random variables with . Let be such that for each and for each , it holds . Then, for each ,
We will also leverage on the following concentration inequality for the sum of geometric random variables:
Fact 6 ([40], Theorem 2.1 and Theorem 3.1).
Let be independent random variables with . Let , and let . For each , , and for each , .
Finally, we will use the following approximation, by means of Poisson variables, of binomial random variables having small expectation.
5 Impossibility Result
In this Section we present, and analyze, an irregular cuckoo hashing instance of the online bipartite matching problem; by means of this instance, we will show that no online algorithm can achieve a competitive ratio better than . In Section 6 we will show how this construction can be obtained by greedily maximizing the probabilities of the degrees , while guaranteeing that the resulting graph has a quasi-complete matching.
Definition 8 (Main Instance).
Our main instance is given by the irregular cuckoo hashing distribution , where is defined as,
Observe that, for each positive integer , . Recall that in our irregular cuckoo hashing instances the ads are sampled with replacement.666We point out that our result could also be proved if the sampling of the ads was done without replacement (with a suitable truncation of the distribution), since the probability that a user chooses the same ad more than once using the model of Definition 8 – that is, the probability that the chosen multiset of ads is not a set – is at most . Indeed, if is the multiset of the neighbors of a given user, we have that . Therefore a sampling with replacement creates, in expectation and with high probability, at most users with parallel edges; removing these users reduces the size of any matching by no more than edges.
We will bound the performance of any online algorithm on the instance of Definition 8 in Section 5.1; we will then show in Section 5.2 that the instance admits, with high probability, a quasi-complete (and, thus, quasi-perfect) matching.
5.1 Online Matching
In this Section we upper bound the expected size of the matching found by any online algorithm when presented with the instance of Definition 8.
Theorem 9.
Proof.
Consider the greedy algorithm that, whenever possible, matches the current user to any of its neighboring ads that are still unmatched. This algorithm is optimal for the instance of Definition 8, given that each (multi)set of ads is chosen independently of the others, and that, after conditioning on the (multi)set size , the (multi)set is chosen uniformly at random among those of cardinality . We then restrict our analysis to this algorithm.
Let be the number of users sampled up until, and including, the point where the greedy algorithm matches ads, for , with . Note that it may be that . Without loss of generality, we can assume that is larger than any constant, otherwise the statement is trivially true – in our case, we will take .
The crux of the analysis of the algorithm lies in bounding the expected number of users necessary to achieve a matching of size . First, we split the value in parts. For , let be the number of users sampled in order to match the -th user, counting from the first user after the -th user is matched (or, if , counting from the first user). We then have . Let for . Observe that , where . For each , let ; then, , and:
(1) |
where we used for each .
We can now bound the expectation of :
where the second inequality follows from the fact that is increasing in and is a right Riemann sum on the interval . The third inequality holds since in and . Finally, the solution to the integral can be easily verified. Thus, .
We move on to showing the concentration of :
Lemma 10.
Let , and . It holds, .
Now, let be the size of the matching produced by the greedy algorithm after having processed the first users (in particular, the matching found by the online algorithm has size ). It holds . Moreover, . Note also that since . By Lemma 10,
We can finally bound the expected number of matched users,
where we used the fact that converges to a positive constant. A similar argument can be used to prove that our analysis is tight up to lower order terms.
Lemma 11.
The online greedy algorithm matches at least ads in expectation, with the instance of Definition 8.
5.2 Maximum Matching
We now prove that the instance of Definition 8 admits a matching of edges.
We will analyze our instance using the Karp-Sipser algorithm; first, we will reduce our instance to a configuration model; then, we will study this configuration model with a modification of the analysis in [7] of the first phase of the Karp-Sipser algorithm. In order to implement this plan, we need to modify our instance so that its average degree is bounded.
Definition 12.
Given any integer , we consider the irregular cuckoo hashing distribution , where is defined as
Observe that, to sample an instance of Definition 12, one can first sample an instance of Definition 8 and then remove all edges incident on users of degree larger than . We will prove the following result in Section 5.2.1.
Lemma 13.
With probability , the instance of Definition 12 admits a matching of size .
The existence of a quasi-complete matching in our original instance can be easily derived from Lemma 13.
Theorem 14.
With probability , the instance of Definition 8 admits a matching of size .
Theorem 15.
The optimal competitive ratio for the online matching problem with irregular cuckoo hashing distributions is no better than .
Clearly, Theorem 15 directly applies to the cases of “IID users with known distribution”, “IID users with unknown distribution” and “adversarial graphs whose users are permuted uniformly at random”.
Moreover, given that Theorem 14 guarantees that the maximum matching has size with probability at least , our upper bound of holds for both the ratios and ,777If , the generic online algorithm necessarily returns the maximum (empty) matching; it is typical to define in this borderline case. Moreover, if then necessarily , so that it is natural to define in this other case, as well. that is, for both the ratio-of-the-expectations and the expectation-of-the-ratio definitions of competitive ratio.
5.2.1 Reduction to the Configuration Model
In this Subsection, we aim to prove Lemma 13. We start by showing that, if we condition on the degrees of the vertices, then our instance is distributed like a configuration model. The following Lemma is the analogue of many that have been proved for various random graph models; a famous example is that of random regular graphs [41].
Lemma 16.
For any non-negative integers such that , and for any bipartite graph , it holds,
where is such that for each , so that the event happens with positive probability.
We now prove that, in our instance, the number of vertices with a certain degree is concentrated around the mean. This will allow us to restrict our attention to a specific configuration model. Define, for ,
(2) |
The following Observation can be proved by applying the concentration bounds in Section 4.
Observation 17.
Consider an irregular cuckoo hashing instance such that is upper bounded by a constant . Let for , , and let , , be a Poisson distribution with mean . We have that, with probability ,
-
1.
,
-
2.
for ,
-
3.
for .
Note that Observation 17 applies to our graph with and . Therefore, we can now leverage on Theorem 3, that is, a variant of the result in [7], which uses the Karp-Sipser algorithm to bound the maximum matching in configuration models whose degrees follow the conditions of Observation 17. Note that Theorem 3 requires a bounded average degree; we made sure that our random graph has bounded average degree by limiting its maximum user degree. Formally,
Lemma 18.
For any constant , let and be any two valid degree sequences such that (i) , (ii) , and (iii) for , where and are defined in Equation 2. Then, with probability , has a matching of size at least .
Proof.
We apply Theorem 3 with and as defined in Equation 2. Let , and without loss of generality, take , so that . We have, , where we used the standard tail bound for (see, e.g., [54, Theorem 5.4]). We have,
where we used for all and for all . Note that , , and clearly . Moreover, . Therefore, the hypotheses of Theorem 3 are met, we now compute and to apply the Theorem. Recall that . Since , we have that , and therefore . Substituting this expression for into the relation , we get,
Equivalently, and this equation is satisfied only for . Indeed, for , . Therefore, . Now, Theorem 3 ensures that the configuration model admits a matching of size at least,
and the proof is complete. We now apply Lemma 16, Observation 17 and Lemma 18, to prove Lemma 13, and conclude the proof of our impossibility result.
Proof of Lemma 13.
By Observation 17, there exists such that, in the instance of Definition 12, with probability , (i) , (ii) for all , and (iii) . Let be the set of all sequences , that are compatible with conditions (i), (ii), and (iii). Note that given that with probability , the sampled graph respects the conditions. Let be the distribution defined in Definition 12. For a graph , let be the event that the maximum matching of has size at least (where the term is the one given by Lemma 18), let be the event that respects conditions (i),(ii), and (iii), and let be the event that the vertices of have degrees and . Note that is a partition of . We have,
where in the first equality we use Lemma 16 and in the last inequality we apply Lemma 18.
6 Extremality
We prove in this section that the construction of Definition 8 is extremal among the irregular cuckoo hashing instances in the following sense: for each , our distribution (greedily) maximizes the total number of users of degree at most while guaranteeing that a fraction of these users can be matched to ads, that is, while guaranteeing that the resulting graph admits a quasi-complete matching. The distribution, then, aims to make the user degrees as small as possible (so to make the task of the online algorithm as hard as possible), while guaranteeing that offline algorithms can match a fraction of the users.
We will prove that, for each fixed , if we increase the probability that Definition 12 assigns to user-degree by any small enough positive constant , while keeping the probabilities of user-degrees fixed, then a constant fraction of the users having degree in cannot be matched. In particular, we will consider the following class of distributions.
Definition 19.
Choose an integer constant and a small enough constant . Consider the irregular cuckoo hashing instance , where is defined as,
where is given in Definition 12.
We will again analyze the first phase of the Karp-Sipser algorithm by using Theorem 3.
Lemma 20.
With probability , the instance of Definition 19 for any constant and for a small enough constant with respect to , does not admit matchings of size larger than where for sufficiently small compared to . Specifically, , where depends only on . In particular, with probability , each matching will leave at least users of positive degree unmatched.
We can now use the previous Lemma to prove the main result of this Section.
Theorem 21.
Suppose that there exists a constant such that for each , and . Then, with probability , the instance admits no matching that matches at least users of degree .
Our distribution, then, is extremal among the irregular cuckoo hashing distributions that make it possible to match users to the ads; indeed, it has for each integer , and thus it greedily maximizes, for each , the number of users of degree under the constraint that a fraction of the users of degree at most can be matched.
7 Generalization to Non-Equibipartite Graphs
In this Section, we consider a more general case, where we have ads and users, for a constant . As in the instance of Definition 8, we will greedily add users of smallest degree, while guaranteeing the existence of a quasi-complete matching. Our generalized instances have varying competitive ratios; we will show that the worst competitive ratio is obtained at ,888We do not consider explicitly the case where there are more users than ads, that is, the case . This case can be easily reduced to the case of , i.e., to the case of Definition 8. Indeed, if and one aims to greedily add (fractions of) users of the minimum degree that allow a quasi-complete matching to exist, the first users will have the degrees given by the distribution of Definition 8. As we proved in Theorem 14, these first users induce a matching that is quasi-perfect, and thus quasi-complete from the ads side. The next minimum user degree which guarantees a quasi-complete matching is then . Thus, if , the greedy approach would produce users of degree , and users of degree for each . The resulting graph will then induce online and offline matching sizes equal to those of Definition 8. that is, with the original instance of Definition 8 (see Figure 1 and Theorem 27).
Definition 22 (Generalized Instance).
Let be a constant. Consider the irregular cuckoo hashing instance . The distribution is defined as,
-
if , is equal to the instance of Definition 8, that is, for ,
-
if , let be the minimum integer such that , that is, let . Then, . For each , we set , and .
In particular, the distribution is obtained by conditioning the of Definition 8 on an event of probability , chosen to include the smallest possible values of ’s support. Observe that, for each , the distribution assigns probability to degree . Observe, further, that as increases from to , for any integer , assigns more and more probability to degree .
The following claim is a corollary of the analysis of the maximum matching for the case .
Corollary 23.
The maximum matching of the instance of Definition 22 has size with probability ; thus, the resulting graph admits a quasi-complete matching.
One can prove that the construction is extremal using the same argument we developed in the last section for the case (i.e., moving any constant mass from higher degrees to smaller degrees makes the quasi-complete matching lose a constant fraction of the edges).
Corollary 24.
Suppose that there exists a constant such that for each , and . Then in the instance , with probability , there exists no matching that matches at least users of degree .
We now move on to studying the performance of online algorithms on the generalized instance of Definition 22. We start with a technical lemma that relates the expected number of users to capture a new ad with distribution , and with distribution . This Lemma will be used to bound the number of users needed to capture a new ad with distribution , in terms of the number of users needed with – we need this Lemma because the integral required to determine the number of users to capture an ad with is not easy to solve, while the one with has already been solved in the proof of Theorem 9.
Lemma 25.
Fix and let , for and .999Observe that , where is the distribution of Definition 22. Moreover, let . Then,
We now study the performance of online algorithms on the instance for .
Lemma 26.
Fix . Then, the online greedy algorithm, if run on the distribution of Definition 22, will match at least ads in expectation.
We can now conclude that gives the strongest bound of :
Theorem 27.
The strongest impossibility result for the competitive ratio of online matching that can be obtained with the limiting constructions of Definition 22 is . This bound can be obtained by setting , that is, with the construction of Definition 8.
8 Conclusion
We proved that the optimal competitive ratio of several online bipartite matching problems cannot be larger than . The simplicity of our upper bound expression, and its similarity to the optimal competitive ratio for adversarial graphs that are also adversarially permuted, naturally raise the question of its potential optimality. We leave as open questions whether this upper bound is tight in (i) irregular cuckoo hashing instances, in (ii) IID instances with known distribution, in (iii) IID instances with unknown distribution, and in the hardest case, that of (iv) adversarial graphs with users permuted uniformly at random.
References
- [1] Jose Adell and Pedro Jodra. Exact Kolmogorov and total variation distances between some familiar discrete distributions. Journal of Inequalities and Applications, 2006. doi:10.1155/JIA/2006/64307.
- [2] Mohit Agarwal and Anuj Puri. Base station scheduling of requests with fixed deadlines. In INFOCOM, 2002. doi:10.1109/INFCOM.2002.1019293.
- [3] Antonios Antoniadis, Themis Gouleakis, Pieter Kleer, and Pavel Kolev. Secretary and online matching problems with machine learned advice. Discrete Optimization, 48, 2023. doi:10.1016/J.DISOPT.2023.100778.
- [4] Jonathan Aronson, Alan Frieze, and Boris G. Pittel. Maximum matchings in sparse random graphs: Karp–Sipser revisited. Random Structures & Algorithms, 12(2), 1998. doi:10.1002/(SICI)1098-2418(199803)12:2\%3C111::AID-RSA1\%3E3.0.CO;2-\%23.
- [5] Yossi Azar and Yoel Chaiutin. Optimal node routing. In STACS, 2006. doi:10.1007/11672142_49.
- [6] Bahman Bahmani and Michael Kapralov. Improved bounds for online stochastic matching. In ESA, 2010. doi:10.1007/978-3-642-15775-2_15.
- [7] Paul Balister and Stefanie Gerke. Controllability and matchings in random bipartite graphs. Surveys in Combinatorics, 2015. doi:10.1017/CBO9781316106853.004.
- [8] Benjamin Birnbaum and Claire Mathieu. On-line bipartite matching made simple. SIGACT News, 39(1), 2008. doi:10.1145/1360443.1360462.
- [9] Joakim Blikstad, Ola Svensson, Radu Vintan, and David Wajc. Online edge coloring is (nearly) as easy as offline. In STOC, 2024. doi:10.1145/3618260.3649741.
- [10] Avrim Blum, Tuomas Sandholm, and Martin Zinkevich. Online algorithms for market clearing. J. ACM, 53(5), 2006. doi:10.1145/1183907.1183913.
- [11] Tom Bohman and Alan Frieze. Karp–Sipser on random graphs with a fixed degree sequence. Combinatorics, Probability and Computing, 20(5), 2011. doi:10.1017/S0963548311000265.
- [12] Allan Borodin and Calum MacRury. Online bipartite matching in the probe-commit model. Mathematical Programming, 2025. doi:10.1007/s10107-024-02184-y.
- [13] Allan Borodin, Calum MacRury, and Akash Rakheja. Secretary Matching Meets Probing with Commitment. In APPROX/RANDOM, 2021. doi:10.4230/LIPIcs.APPROX/RANDOM.2021.13.
- [14] Allan Borodin, Calum MacRury, and Akash Rakheja. Prophet Matching in the Probe-Commit Model. In APPROX/RANDOM, 2022. doi:10.4230/LIPIcs.APPROX/RANDOM.2022.46.
- [15] Mark Braverman, Mahsa Derakhshan, and Antonio Molina Lovett. Max-weight online stochastic matching: Improved approximations against the online benchmark. In EC, 2022. doi:10.1145/3490486.3538315.
- [16] Mark Braverman, Mahsa Derakhshan, Tristan Pollner, Amin Saberi, and David Wajc. New philosopher inequalities for online bayesian matching, via pivotal sampling. In SODA, 2025. doi:10.1137/1.9781611978322.98.
- [17] Brian Brubach, Karthik Abinav Sankararaman, Aravind Srinivasan, and Pan Xu. New Algorithms, Better Bounds, and a Novel Model for Online Stochastic Matching. In ESA, 2016. doi:10.4230/LIPIcs.ESA.2016.24.
- [18] Niv Buchbinder, Joseph Naor, and David Wajc. Lossless online rounding for online bipartite matching (despite its impossibility). In SODA, 2023. doi:10.1137/1.9781611977554.ch78.
- [19] Niv Buchbinder, Danny Segev, and Yevgeny Tkach. Online algorithms for maximum cardinality matching with edge arrivals. Algorithmica, 81(5), 2019. doi:10.1007/s00453-018-0505-7.
- [20] Davin Choo, Themistoklis Gouleakis, Chun Kai Ling, and Arnab Bhattacharyya. Online bipartite matching with imperfect advice. In ICML, 2024.
- [21] Nikhil R. Devanur, Kamal Jain, and Robert D. Kleinberg. Randomized primal-dual analysis of ranking for online bipartite matching. In SODA, 2013. doi:10.1137/1.9781611973105.7.
- [22] John P. Dickerson, Karthik A. Sankararaman, Aravind Srinivasan, and Pan Xu. Allocation problems in ride-sharing platforms: Online matching with offline reusable resources. ACM Trans. Econ. Comput., 9(3), 2021. doi:10.1145/3456756.
- [23] Martin Dietzfelbinger, Andreas Goerdt, Michael Mitzenmacher, Andrea Montanari, Rasmus Pagh, and Michael Rink. Tight thresholds for cuckoo hashing via xorsat. In ICALP, 2010. doi:10.1007/978-3-642-14165-2_19.
- [24] Martin Dietzfelbinger and Christoph Weidling. Balanced allocation and dictionaries with tightly packed constant size bins. Theoretical Computer Science, 380(1), 2007. doi:10.1016/j.tcs.2007.02.054.
- [25] Devdatt Dubhashi and Alessandro Panconesi. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, 2009. doi:10.1017/CBO9780511581274.
- [26] Alon Eden, Michal Feldman, Amos Fiat, and Kineret Segal. An economics-based analysis of ranking for online bipartite matching. In SOSA, 2021. doi:10.1137/1.9781611976496.12.
- [27] Paul Erdös and Alfréd Rényi. On the evolution of random graphs. Publ. math. inst. hung. acad. sci, 5(1), 1960.
- [28] Tomer Ezra, Michal Feldman, Nick Gravin, and Zhihao Gavin Tang. Prophet matching with general arrivals. Math. Oper. Res., 47(2), 2022. doi:10.1287/moor.2021.1152.
- [29] Matthew Fahrbach, Zhiyi Huang, Runzhou Tao, and Morteza Zadimoghaddam. Edge-weighted online bipartite matching. J. ACM, 69(6), 2022. doi:10.1145/3556971.
- [30] Yiding Feng, Rad Niazadeh, and Amin Saberi. Two-stage stochastic matching with application to ride hailing. In SODA, 2021. doi:10.1137/1.9781611976465.170.
- [31] Yiding Feng, Rad Niazadeh, and Amin Saberi. Two-stage stochastic matching and pricing with applications to ride hailing. Oper. Res., 72(4), 2024. doi:10.1287/opre.2022.2398.
- [32] Buddhima Gamlath, Michael Kapralov, Andreas Maggiori, Ola Svensson, and David Wajc. Online Matching with General Arrivals . In FOCS, 2019. doi:10.1109/FOCS.2019.00011.
- [33] Ruiquan Gao, Zhongtian He, Zhiyi Huang, Zipei Nie, Bijun Yuan, and Yan Zhong. Improved online correlated selection. In FOCS, 2022. doi:10.1109/FOCS52979.2021.00123.
- [34] Gagan Goel and Aranyak Mehta. Online budgeted matching in random input models with applications to adwords. In SODA, 2008. URL: http://dl.acm.org/citation.cfm?id=1347082.1347189.
- [35] Daniel Hathcock, Billy Jin, Kalen Patton, Sherry Sarkar, and Michael Zlatin. The online submodular assignment problem. In FOCS, 2024. doi:10.1109/FOCS61266.2024.00026.
- [36] Zhiyi Huang and Xinkai Shu. Online stochastic matching, poisson arrivals, and the natural linear program. In STOC, 2021. doi:10.1145/3406325.3451079.
- [37] Zhiyi Huang, Xinkai Shu, and Shuyi Yan. The power of multiple choices in online stochastic matching. In STOC, 2022. doi:10.1145/3519935.3520046.
- [38] Zhiyi Huang, Zhihao Gavin Tang, and David Wajc. Online matching: A brief survey. SIGecom Exch., 22(1), 2024. doi:10.1145/3699824.3699837.
- [39] Patrick Jaillet and Xin Lu. Online stochastic matching: New algorithms with better bounds. Mathematics of Operations Research, 39(3), 2014. doi:10.1287/moor.2013.0621.
- [40] Svante Janson. Tail bounds for sums of geometric and exponential variables. Statistics & Probability Letters, 135, 2018. doi:10.1016/j.spl.2017.11.017.
- [41] Svante Janson, Tomasz Luczak, and Andrzej Rucinski. Random Graphs. Wiley, 2000. doi:10.1002/9781118032718.
- [42] Billy Jin and Will Ma. Online bipartite matching with advice: Tight robustness-consistency tradeoffs for the two-stage model. In NeurIPS, 2022.
- [43] Yossi Kanizo, David Hay, and Isaac Keslassy. Maximum bipartite matching size and application to cuckoo hashing, 2011. arXiv:1007.1946.
- [44] Haim Kaplan, David Naori, and Danny Raz. Online weighted matching with a sample. In SODA, 2022. doi:10.1137/1.9781611977073.52.
- [45] Chinmay Karande, Aranyak Mehta, and Pushkar Tripathi. Online bipartite matching with unknown distributions. In STOC, 2011. doi:10.1145/1993636.1993715.
- [46] Richard M. Karp and Michael Sipser. Maximum matching in sparse random graphs. In SFCS, 1981. doi:10.1109/SFCS.1981.21.
- [47] Richard M. Karp, Umesh V. Vazirani, and Vijay V. Vazirani. An optimal algorithm for on-line bipartite matching. In STOC, 1990. doi:10.1145/100216.100262.
- [48] Will Ma and Pan Xu. Promoting fairness among dynamic agents in online-matching markets under known stationary arrival distributions. In NeurIPS, 2024.
- [49] Mohammad Mahdian and Qiqi Yan. Online bipartite matching with random arrivals: an approach based on strongly factor-revealing LPs. In STOC, 2011. doi:10.1145/1993636.1993716.
- [50] Vahideh H. Manshadi, Shayan Oveis Gharan, and Amin Saberi. Online stochastic matching: online actions based on offline statistics. In SODA, 2011. doi:10.1137/1.9781611973082.98.
- [51] Vahideh H. Manshadi, Shayan Oveis Gharan, and Amin Saberi. Online stochastic matching: Online actions based on offline statistics. Mathematics of Operations Research, 37(4), 2012. doi:10.1287/MOOR.1120.0551.
- [52] Aranyak Mehta. Online Matching and Ad Allocation. Now Publishers, 2013.
- [53] Aranyak Mehta, Amin Saberi, Umesh V. Vazirani, and Vijay V. Vazirani. Adwords and generalized on-line matching. In FOCS, 2005. doi:10.1109/SFCS.2005.12.
- [54] Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005. doi:10.1017/CBO9780511813603.
- [55] Joseph (Seffi) Naor, Aravind Srinivasan, and David Wajc. Online dependent rounding schemes for bipartite matchings, with applications. In SODA, 2025. doi:10.1137/1.9781611978322.100.
- [56] Rasmus Pagh and Flemming Friche Rodler. Cuckoo hashing. Journal of Algorithms, 51(2), 2004. doi:10.1016/j.jalgor.2003.12.002.
- [57] Christos Papadimitriou, Tristan Pollner, Amin Saberi, and David Wajc. Online stochastic max-weight bipartite matching: Beyond prophet inequalities. In EC, 2021.
- [58] Barna Saha and Aravind Srinivasan. A new approximation technique for resource-allocation problems. Random Structures & Algorithms, 52(4), 2018. doi:10.1002/rsa.20756.
- [59] J. Michael Steele. Le Cam’s Inequality and Poisson Approximations. The American Mathematical Monthly, 101(1), 1994. doi:10.1080/00029890.1994.11996904.
- [60] Zhihao Gavin Tang, Jinzhao Wu, and Hongxun Wu. (Fractional) online stochastic matching via fine-grained offline statistics. In STOC, 2022. doi:10.1145/3519935.3519994.
- [61] Zhihao Gavin Tang and Yuhao Zhang. Improved bounds for fractional online matching problems. In EC, 2024. doi:10.1145/3670865.3673459.
- [62] Yongxin Tong, Jieying She, Bolin Ding, Libin Wang, and Lei Chen. Online mobile Micro-Task Allocation in spatial crowdsourcing . In ICDE, 2016. doi:10.1109/ICDE.2016.7498228.