On the Complexity of Knapsack Under Explorable Uncertainty: Hardness and Algorithms
Abstract
In the knapsack problem under explorable uncertainty, we are given a knapsack instance with uncertain item profits. Instead of having access to the precise profits, we are only given uncertainty intervals that are guaranteed to contain the corresponding profits. The actual item profit can be obtained via a query. The goal of the problem is to adaptively query item profits until the revealed information suffices to compute an optimal (or approximate) solution to the underlying knapsack instance. Since queries are costly, the objective is to minimize the number of queries.
In the offline variant of this problem, we assume knowledge of the precise profits and the task is to compute a query set of minimum cardinality that a third party without access to the profits could use to identify an optimal (or approximate) knapsack solution. We show that this offline variant is complete for the second level of the polynomial hierarchy, i.e., $\Sigma_2^p$-complete, and cannot be approximated within a non-trivial factor unless $\Sigma_2^p = \Delta_2^p$. Motivated by these strong hardness results, we consider a “resource-augmented” variant of the problem where the requirements on the query set computed by an algorithm are less strict than the requirements on the optimal solution we compare against. More precisely, a query set computed by the algorithm must reveal sufficient information to identify an approximate knapsack solution, while the optimal query set we compare against has to reveal sufficient information to identify an optimal solution. We show that this resource-augmented setting allows interesting non-trivial algorithmic results.
Keywords and phrases: Explorable uncertainty, knapsack, queries, approximation algorithms
2012 ACM Subject Classification: Theory of computation → Design and analysis of algorithms
Acknowledgements: Parts of the work on this project have been done while the author was affiliated with the University of Bremen. The author thanks Nicole Megow for initial discussions.
Funding: This work was partially supported by the research project Optimization for and with Machine Learning (OPTIMAL), funded by the Dutch Research Council (NWO), grant number OCENW.GROOT.2019.015.
Editors: Anne Benoit, Haim Kaplan, Sebastian Wild, and Grzegorz Herman
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
The field of explorable uncertainty considers optimization problems with uncertainty in the numeric input parameters. Initially, the precise values of the uncertain parameters are unknown. Instead, for each uncertain parameter, we are given an uncertainty interval that contains the precise value of that parameter. Each uncertain parameter can be queried to reveal its precise value. The goal is to adaptively query uncertain parameters until we have sufficient information to solve the underlying optimization problem.
In this paper, we consider knapsack under explorable uncertainty (KnapExp) with uncertain item profits. That is, we are given a set $I$ of $n$ items and a knapsack capacity $W$. Each item $i \in I$ has a known weight $w_i$ and an uncertain profit $p_i$ that is initially hidden within the known uncertainty interval $I_i$, i.e., $p_i \in I_i$. A query of an item $i$ reveals the profit $p_i$. Our goal is to compute a set $P \subseteq I$ of items with $\sum_{i \in P} w_i \le W$ that maximizes the profit $\sum_{i \in P} p_i$. We refer to this problem as the underlying knapsack problem. Since the profits are initially hidden within their uncertainty intervals, we do not always have sufficient information to compute an optimal or even approximate solution for the underlying knapsack problem. Instead, an algorithm for KnapExp can adaptively query items to reveal their profits until the revealed information suffices to compute an optimal solution for the underlying knapsack instance. As queries are costly, the goal is to minimize the number of queries.
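As a toy illustration of why queries can be unavoidable (a sketch with made-up numbers, not an instance from the paper): when two items compete for the same capacity and their profit intervals overlap, no packing can be certified optimal before at least one profit is revealed.

```python
# Hypothetical toy instance: two unit-weight items, capacity for one.
# "interval" is the uncertainty interval, "profit" the hidden precise value.
items = {
    "a": {"weight": 1, "interval": (2.0, 5.0), "profit": 4.0},
    "b": {"weight": 1, "interval": (3.0, 6.0), "profit": 3.5},
}
capacity = 1  # only one item fits

# Before any query: each item's rival might still be better, because the
# rival's upper interval limit exceeds the item's lower interval limit.
for x, y in (("a", "b"), ("b", "a")):
    lo_x = items[x]["interval"][0]
    hi_y = items[y]["interval"][1]
    assert hi_y > lo_x  # neither {a} nor {b} is provably optimal yet

# After querying both items, the optimal packing is certain.
best = max(items, key=lambda i: items[i]["profit"])
print(best)  # "a"
```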
The offline version, sometimes also called the verification problem, of knapsack under explorable uncertainty (OfflineKnapExp) assumes full initial access to the profits and asks for a query set $Q \subseteq I$ of minimum cardinality such that access to the profits of the items in $Q$ and access to the uncertainty intervals of the items in $I \setminus Q$ suffices to compute an optimal solution to the underlying knapsack instance, independent of what the precise profits of the items in $I \setminus Q$ are. In this work, we mainly focus on studying the offline version of knapsack under explorable uncertainty. Most commonly, problems under explorable uncertainty are studied in an adversarial online setting, where the uncertain values are unknown, query outcomes are returned in a worst-case manner, and algorithms are compared against the optimal solution for the corresponding offline version by using competitive analysis. The complexity of the offline version is a natural barrier for efficiently solving the online version.
So far, most problems that have been studied under explorable uncertainty have an underlying problem that belongs to the complexity class P, i.e., can be solved in polynomial time. The seminal work by Kahan [20] on computing the minimum in a set of uncertain values was followed by works on computing the $k$-th smallest uncertain value [20, 16], computing a minimum spanning tree with uncertain edge weights [12, 10, 25, 14, 28, 24], sorting [19, 13], shortest path [15], finding the cheapest set in a given family of sets [11, 26], simple geometric problems [7], stable matchings [2], and other selection problems [13, 3]. If we remove the explorable uncertainty aspect, then all of these problems can be solved in polynomial time.
Even though these underlying problems are in P, the offline versions of the corresponding problems under explorable uncertainty are often NP-hard. For instance, the offline version of identifying the set of maximal points under explorable uncertainty is NP-hard [8], the offline version of the selection problem in [13, 3] is NP-hard, and the offline version of the minimum-spanning tree problem under vertex uncertainty is NP-hard [10]. The offline version of selecting the cheapest set is NP-hard [11] and even hard to approximate within a non-trivial factor in terms of the number of sets [26]. Similarly, the offline version of stable matching under uncertainty is NP-hard to approximate [2]. For all of these problems, adding the layer of explorable uncertainty increases the complexity from polynomial-time solvable to NP-hard and leads to interesting algorithmic challenges even though the underlying problems are easy. However, this observation also raises the following question:
If the underlying problem is already NP-hard, does adding the layer of explorable uncertainty still increase the complexity?
As a first main result, we answer this question in the affirmative for the offline version of knapsack under explorable uncertainty. More precisely, we show that OfflineKnapExp is complete for the second level of the polynomial hierarchy, i.e., $\Sigma_2^p$-complete. We even show that, under the conjecture $\Sigma_2^p \neq \Delta_2^p$, no $n^{\varepsilon}$-approximation is possible for any $\varepsilon < 1$, where $n$ is the number of items. The latter can be seen as a natural next step from the inapproximability result given in [26]: They show that the offline version of the cheapest set problem is hard to approximate by exploiting that it is equivalent to solving a covering integer linear program (ILP) with polynomially many constraints, whereas we show our inapproximability result by exploiting that OfflineKnapExp can be represented as a covering ILP with an exponential number of constraints. Unfortunately, these extremely strong hardness results pose further challenges:
If the hardness of the offline version prevents any non-trivial approximation, is there any hope for interesting algorithmic results in the offline, online or stochastic version?
Our approach for answering this question is to consider a form of resource augmentation. More precisely, we relax the requirements on a solution for OfflineKnapExp: Instead of requiring that querying reveals sufficient information to identify an optimal solution for the underlying knapsack problem, we only require sufficient information to identify an $\alpha$-approximate solution. Unfortunately, we can show that, unless P=NP, there is no non-trivial approximation for this relaxed problem variant if we compare against an optimal solution for the relaxed problem variant. However, as a second main result, we show that non-trivial algorithmic results are possible if the requirements on the algorithm’s solution are less strict than the requirements on the optimal solution we compare against; we make this more precise in the next section.
1.1 Problem Definition
An instance of KnapExp and OfflineKnapExp is a quintuple $(I, W, w, p, \mathcal{I})$, where $I$ is a set of $n$ items, $W$ is the knapsack capacity, $w$ is the weight vector specifying a weight $w_i$ for each item $i \in I$, $p$ is the profit vector specifying a profit $p_i$ for each $i \in I$, and $\mathcal{I}$ is the set of uncertainty intervals such that $p_i \in I_i$ for all $i \in I$. The quadruple $(I, W, w, p)$ characterizes the underlying knapsack problem.
As is common in the area of explorable uncertainty, we assume that each uncertainty interval is either open or trivial. That is, we either have $I_i = (l_i, u_i)$ for a lower limit $l_i$ and an upper limit $u_i$, or $I_i = \{p_i\}$. In the latter case, we call both the item and the uncertainty interval trivial and define $l_i = u_i = p_i$. All items that are not trivial are called non-trivial. We use $I_T$ to refer to the set of trivial items. A query of an item $i$ reveals the profit $p_i$ and can be seen as replacing the uncertainty interval $I_i$ with $\{p_i\}$.
In OfflineKnapExp, all input parameters are known to the algorithm, while in KnapExp the profits are initially uncertain. Both problems ask for a feasible query set of minimum cardinality. Intuitively, a query set is feasible if the revealed information suffices to identify an optimal solution to the underlying knapsack problem and to determine the profit of such a solution. We proceed by making this definition more formal.
Packings and Feasible Query Sets.
To formally define feasible query sets, we first define packings. A subset $P \subseteq I$ of items is a packing if $w(P) := \sum_{i \in P} w_i \le W$. That is, packings are feasible solutions to the underlying knapsack problem. For a packing $P$, let $v(P) = \sum_{i \in P} p_i$ denote the profit of $P$. We call a packing optimal if it maximizes the profit over all packings. We usually use $\mathrm{OPT}$ to refer to an optimal packing and $v(\mathrm{OPT})$ to refer to the optimal profit.
For each packing $P$, define $U(P) = \sum_{i \in P} u_i$, i.e., the term $U(P)$ describes an upper limit on the maximum possible profit the packing could potentially have. Note that $U(P)$ can be computed even without access to the profits. By querying items in $P$, the upper limit decreases as we gain more information and can replace the non-trivial uncertainty interval $I_i$ with $\{p_i\}$ after we query item $i$ and learn the profit $p_i$. For $i \in I$ and a query set $Q \subseteq I$, we use $u_i^Q$ to denote the upper limit of $I_i$ after querying $Q$, i.e., $u_i^Q = p_i$ if $i \in Q$ and $u_i^Q = u_i$ otherwise. The upper limit of packing $P$ after querying a set $Q$ is $U_Q(P) = \sum_{i \in P} u_i^Q$.
Definition 1.
A query set $Q \subseteq I$ is feasible if the following two conditions hold:
1. There is a packing $P \subseteq Q \cup I_T$ with $v(P) = v(\mathrm{OPT})$.
2. $U_Q(P) \le v(\mathrm{OPT})$ holds for every packing $P$.
The first condition of Definition 1 ensures that querying $Q$ reveals sufficient information to verify that there exists a packing with the optimal profit $v(\mathrm{OPT})$, while the second condition ensures that querying $Q$ reveals sufficient information to verify that no packing can possibly have a larger profit than $v(\mathrm{OPT})$, no matter what the profits of the items in $I \setminus Q$ actually are.
Since any packing $P$ with $v(P) = v(\mathrm{OPT})$ can only satisfy $U_Q(P) \le v(\mathrm{OPT})$ if $P \subseteq Q \cup I_T$ (recall that non-trivial intervals are open), the second condition of the definition actually implies the first one. In particular, this means that a query set is feasible if and only if it satisfies the constraints of the following ILP. Note that a similar covering point of view was first observed in [26] for the cheapest set problem.
$\min \sum_{i \in I} x_i$
s.t. $\sum_{i \in P} (u_i - p_i) \cdot x_i \ge U(P) - v(\mathrm{OPT})$ for all packings $P$   (K-ILP)
$x_i \in \{0, 1\}$ for all $i \in I$
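To make the covering view concrete, the following brute-force sketch (our own illustration with hypothetical numbers; the function names and item encoding are not from the paper) checks the constraints of (K-ILP) by enumerating all packings of a tiny instance. Queried items contribute their precise profit to the upper limit, unqueried items their interval upper limit.

```python
from itertools import combinations

def packings(items, capacity):
    """Yield all subsets of item indices whose total weight fits the knapsack."""
    idx = list(items)
    for r in range(len(idx) + 1):
        for sub in combinations(idx, r):
            if sum(items[i]["w"] for i in sub) <= capacity:
                yield set(sub)

def upper_limit(P, Q, items):
    """U_Q(P): queried items contribute their profit, others their upper limit."""
    return sum(items[i]["p"] if i in Q else items[i]["u"] for i in P)

def is_feasible(Q, items, capacity):
    """Check the covering condition: U_Q(P) <= v(OPT) for every packing P."""
    opt = max(sum(items[i]["p"] for i in P) for P in packings(items, capacity))
    return all(upper_limit(P, Q, items) <= opt
               for P in packings(items, capacity))

# Tiny instance: item 0 is non-trivial (open interval), item 1 is trivial.
items = {0: {"w": 1, "p": 3.0, "u": 5.0},
         1: {"w": 1, "p": 4.0, "u": 4.0}}
print(is_feasible(set(), items, capacity=1))  # False: {0} might still reach 5 > 4
print(is_feasible({0}, items, capacity=1))    # True once item 0 is queried
```

For this instance the optimal profit is 4 (item 1 alone), and the empty query set is infeasible exactly because the constraint of the packing {0} in (K-ILP) is violated.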
The offline problem.
In the offline problem OfflineKnapExp, we are given an instance with full knowledge of all parameters and our goal is to compute a feasible query set of minimum cardinality, which is equivalent to solving (K-ILP) with full knowledge of all coefficients. We use $Q^*$ to refer to an optimal solution of OfflineKnapExp.
The online problem.
In the online problem KnapExp, we are also given an instance but the profits are initially unknown. The goal is to iteratively and adaptively query items to reveal their profits until the set of queried items is a feasible query set. This problem can be seen as solving (K-ILP) with uncertain coefficients and right-hand side values: Querying an item $i$ corresponds to irrevocably setting $x_i = 1$, incurs a cost that cannot be reverted, and reveals the coefficient $u_i - p_i$.
Relaxations.
As OfflineKnapExp turns out to admit strong inapproximability results, we introduce the following relaxed notion of feasible query sets. We use $(\alpha, \beta)$-OfflineKnapExp to refer to the offline problem of computing an $(\alpha, \beta)$-feasible query set of minimum cardinality.
Definition 2 ($(\alpha, \beta)$-feasibility).
Let $\alpha, \beta \ge 1$. We say that a query set $Q \subseteq I$ is $(\alpha, \beta)$-feasible if the following two conditions hold:
1. There is a packing $P$ such that $P \subseteq Q \cup I_T$ and $\alpha \cdot v(P) \ge v(\mathrm{OPT})$.
2. $U_Q(P) \le \beta \cdot v(\mathrm{OPT})$ for every packing $P$.
The first condition of the definition ensures that we can find an $\alpha$-approximation for the underlying knapsack instance by using only queried and trivial items. The second condition ensures that, after querying $Q$, no feasible packing can have profit greater than $\beta \cdot v(\mathrm{OPT})$, no matter what the profits of the items in $I \setminus Q$ are. Thus, querying $Q$ reveals sufficient information to verify that the packing of the first condition is an $(\alpha \cdot \beta)$-approximation for the underlying knapsack instance. We use $Q^*_{\alpha,\beta}$ to refer to a minimum-cardinality $(\alpha, \beta)$-feasible query set. Note that $Q = I$, i.e., querying all items, is $(\alpha, \beta)$-feasible for every $\alpha \ge 1$ and $\beta \ge 1$.
In contrast to Definition 1, the second condition of Definition 2 does not imply the first one. As a consequence, each $(\alpha, \beta)$-feasible query set is a feasible solution for a variant of (K-ILP) in which we replace $v(\mathrm{OPT})$ on the right-hand sides with $\beta \cdot v(\mathrm{OPT})$, but the converse is not necessarily true.
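The gap between the two conditions of Definition 2 can be checked by brute force on a toy instance (our own sketch with hypothetical numbers; the function name and encoding are not from the paper): the empty query set can satisfy the upper-limit condition for $\beta = 2$ while no packing of queried and trivial items is close to optimal.

```python
from itertools import combinations

def packings(items, capacity):
    """Yield all subsets of item indices that fit the knapsack capacity."""
    idx = list(items)
    for r in range(len(idx) + 1):
        for sub in combinations(idx, r):
            if sum(items[i]["w"] for i in sub) <= capacity:
                yield set(sub)

def ab_feasible(Q, items, capacity, alpha, beta, trivial):
    """Evaluate both conditions of (alpha, beta)-feasibility by brute force."""
    opt = max(sum(items[i]["p"] for i in P) for P in packings(items, capacity))
    # Condition 1: some packing inside Q ∪ trivial is an alpha-approximation.
    cond1 = any(alpha * sum(items[i]["p"] for i in P) >= opt
                for P in packings(items, capacity) if P <= Q | trivial)
    # Condition 2: no packing's upper limit exceeds beta * opt.
    cond2 = all(sum(items[i]["p"] if i in Q else items[i]["u"] for i in P)
                <= beta * opt
                for P in packings(items, capacity))
    return cond1, cond2

# Item 0 is non-trivial with interval (3, 5); item 1 is trivial.
items = {0: {"w": 1, "p": 4.0, "u": 5.0},
         1: {"w": 1, "p": 1.0, "u": 1.0}}
print(ab_feasible(set(), items, 1, alpha=1, beta=2, trivial={1}))
# (False, True): condition 2 already holds without queries, condition 1 does not
```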
1.2 Our Results and Outline
In Section 2, we give several hardness results for OfflineKnapExp. First, we show that deciding whether a given set is an $(\alpha, \beta)$-feasible query set is weakly NP-hard for any $\alpha, \beta \ge 1$, which immediately implies that it is weakly NP-hard to approximate $(\alpha, \beta)$-OfflineKnapExp within any bounded multiplicative factor. This hardness result mainly exploits that the optimal profit of the weakly NP-hard [21] underlying knapsack problem appears on the right-hand sides of (K-ILP). Then, we move on to show that OfflineKnapExp is $\Sigma_2^p$-complete, which intuitively means that the problem remains hard even if we are given an oracle for deciding problems in NP. Since such an oracle allows us to compute $v(\mathrm{OPT})$, the reason for the $\Sigma_2^p$-hardness is not the appearance of $v(\mathrm{OPT})$ in the right-hand sides but the exponential number of constraints in (K-ILP). In fact, we prove this hardness result via reduction from succinct set cover [32, 31], which is a set cover variant where the elements and sets are only given implicitly, i.e., the number of set cover constraints is exponential in the encoding size of the problem. Exploiting a result by [30], our reduction also shows that there is no $n^{\varepsilon}$-approximation for any $\varepsilon < 1$ unless $\Sigma_2^p = \Delta_2^p$, where $n = |I|$ is the number of items.
In Section 3, we design algorithms for $(\alpha, \beta)$-OfflineKnapExp for different values of $\alpha$ and $\beta$. Since the hardness results prevent any non-trivial results when comparing against $Q^*_{\alpha,\beta}$, we analyze our algorithms by comparing their solutions to $Q^*$ instead. This can be seen as a form of resource augmentation as the requirements on the algorithm’s solution are less strict than the requirements on the optimal solution we compare against. To achieve our algorithmic results, we treat the two conditions for $(\alpha, \beta)$-feasible query sets (cf. Definition 2) as separate subproblems: (i) compute a query set $Q_1$ such that there exists a packing $P \subseteq Q_1 \cup I_T$ with $\alpha \cdot v(P) \ge v(\mathrm{OPT})$, and (ii) compute a query set $Q_2$ such that $U_{Q_2}(P) \le \beta \cdot v(\mathrm{OPT})$ holds for all packings $P$. First, we show how to solve subproblem (i) in polynomial time for $\alpha = 1 + \varepsilon$ with a set $Q_1$ such that $|Q_1| \le |Q^*|$. Our algorithm for this subproblem exploits existing results for the two-dimensional knapsack problem. For (ii), we first show how to solve the problem in pseudopolynomial time for $\beta = 2$ with a bound on $|Q_2|$ in terms of $|Q^*|$. The algorithm is based on solving an OfflineKnapExp variant that only considers packings that correspond to certain greedy solutions. We justify the pseudopolynomial running time by showing weak NP-hardness for this OfflineKnapExp variant. By considering a relaxed version of that problem, we manage to solve subproblem (ii) in polynomial time for a slightly larger $\beta$. Combining the results for both subproblems yields a pseudopolynomial algorithm that computes a $(1 + \varepsilon, 2)$-feasible query set and a polynomial-time algorithm that computes a $(1 + \varepsilon, \beta)$-feasible query set for a slightly larger $\beta$. In both cases, the computed query set $Q$ satisfies $|Q| = O(|Q^*|)$.
1.3 Further Related Work
Meißner [27] gives an adversarial lower bound for KnapExp that holds even if the instance has only two distinct weights, preventing any non-trivial adversarial results. However, Megow and Schlöter [26] show that this lower bound does not hold in a stochastic setting where the profits are drawn from their intervals according to an unknown distribution that satisfies a certain threshold condition. That said, their result only breaks that particular lower bound instance and does not imply general stochastic results for KnapExp.
Goerigk et al. [18] consider a knapsack problem under uncertainty in a different query setting. In their problem, the profits are known and the weights are uncertain. Furthermore, there is a budget on the queries that an algorithm is allowed to make. These differences lead to a problem fundamentally different from KnapExp.
Maehara and Yamaguchi [23] consider packing ILPs with uncertainty in the cost coefficients. The cost coefficients can be queried. The key difference to the setting of explorable uncertainty is that they are interested in bounding the absolute number of queries instead of comparing against the optimal feasible query set. We remark that this is an important distinction between explorable uncertainty and many other query models. For example, the same distinction applies to a line of research that studies queries that reveal the existence of entities instead of numeric values, e.g., the existence of edges in a graph, cf. [6, 17, 33, 4, 5, 9]. For instance, Behnezhad et al. [4] considered vertex cover in a stochastic setting and showed that it can be approximated within a small constant factor with only a constant number of queried edges per vertex.
2 Hardness of Approximation
We start by showing our hardness results for $(\alpha, \beta)$-OfflineKnapExp. Not surprisingly, the appearance of $v(\mathrm{OPT})$ in the right-hand sides of (K-ILP) suffices to render $(\alpha, \beta)$-OfflineKnapExp weakly NP-hard. The following proposition shows that even deciding whether a given set is $(\alpha, \beta)$-feasible is already weakly NP-hard (cf. the full version [29] for a formal proof).
Proposition 3.
Deciding if $Q = \emptyset$ is $(\alpha, \beta)$-feasible is weakly NP-hard for any $\alpha, \beta \ge 1$.
This means that distinguishing between instances that can be solved without any query and instances that need at least one query is weakly NP-hard, which implies the following:
Corollary 4.
It is weakly NP-hard to approximate $(\alpha, \beta)$-OfflineKnapExp within any bounded multiplicative factor.
Proposition 3 is also an indicator that $(\alpha, \beta)$-OfflineKnapExp might not be in NP, as verifying whether a query set is $(\alpha, \beta)$-feasible is already NP-hard. For $\alpha = \beta = 1$, we make this observation more formal by proving that OfflineKnapExp is complete for the second level of the polynomial hierarchy, i.e., $\Sigma_2^p$-complete. Intuitively, the class $\Sigma_2^p$ contains problems that, given an oracle for deciding problems in NP, can be solved in non-deterministic polynomial time. Similarly, the class $\Delta_2^p$ contains problems that, given the same type of oracle, can be solved in deterministic polynomial time. Hardness and completeness for the class $\Sigma_2^p$ are defined in the same way as for the class NP. For a more formal introduction, we refer to [1]. Under the conjecture that $\mathrm{NP} \neq \Sigma_2^p$, the $\Sigma_2^p$-completeness implies that OfflineKnapExp is not in NP, and under the conjecture $\Sigma_2^p \neq \Delta_2^p$ it cannot be solved optimally in polynomial time even when given an oracle for deciding problems in NP.
Theorem 5.
OfflineKnapExp is $\Sigma_2^p$-complete.
Proof.
We show $\Sigma_2^p$-membership in the full version [29] and focus here on proving $\Sigma_2^p$-hardness. Our proof is via reduction from succinct set cover, which is known to be $\Sigma_2^p$-complete [32, 31]. In the same way as for NP-hardness proofs, we need to give a polynomial-time reduction to the decision problem variant of OfflineKnapExp such that the constructed instance is a YES-instance if and only if the given succinct set cover instance is a YES-instance.
Succinct set cover.
We are given decision variables $x_1, \dots, x_n$, propositional logic formulas $\varphi_1, \dots, \varphi_m$ over these variables and an integer parameter $k$. Each formula $\varphi_j$ is in 3-DNF form (disjunctive normal form (DNF) refers to a disjunction of conjunctions, i.e., $C_1 \vee \dots \vee C_h$, where each clause $C_l$ is a conjunction of literals; in 3-DNF, each $C_l$ contains exactly three literals; a formula in DNF is satisfied by a variable assignment if at least one of its clauses is satisfied by the assignment) and we use $S_j$ to denote the set of 0-1-vectors (variable assignments) that satisfy $\varphi_j$. The formulas have to satisfy $S_1 \cup \dots \cup S_m = \{0, 1\}^n$, i.e., each variable assignment satisfies at least one formula $\varphi_j$. The goal is to find a subset $J \subseteq \{1, \dots, m\}$ such that $|J| \le k$ and $\bigcup_{j \in J} S_j = \{0, 1\}^n$. We assume that each variable occurs as a literal in at least one formula. If not, we can just remove the variable and obtain a smaller instance. Succinct set cover can be interpreted as a set cover variant where the elements and sets are only given implicitly and not as an explicit part of the input.
Main reduction idea.
Before we give the technical details of the reduction, we first sketch the basic idea. In particular, we describe the properties that we want the constructed instance to have. In the technical part, we then describe how to actually achieve these properties. At its core, the reduction will use the knapsack weights to encode structural information of the input instance into the constructed instance. The idea to use numerical weights to encode constraints is quite natural in hardness proofs for weakly NP-hard problems, see e.g. the NP-hardness proof for the subset sum problem given in [22]. The usage of the knapsack weights in our reduction is on some level inspired by such classical reductions, but requires several new ideas to handle the implicit representation of the input problem and the covering nature of OfflineKnapExp.
First, we introduce a single trivial item $t$ with $w_t = W$. This item alone fills up the complete knapsack capacity $W$, which we define later, and the instance will be constructed in such a way that $p_t$ is the maximum profit of any packing. Thus, only packings $P$ with $U(P) > v(\mathrm{OPT})$ induce non-trivial constraints in (K-ILP). We design the instance such that $U(P) > v(\mathrm{OPT})$ only holds for packings that use the full capacity.
Property 1.
A packing $P$ of the constructed instance satisfies $U(P) > v(\mathrm{OPT})$ only if $w(P) = W$.
Next, we want each packing $P$ with $U(P) > v(\mathrm{OPT})$ and $w(P) = W$ to represent a distinct variable assignment in $\{0, 1\}^n$. To this end, we introduce a set $X$ of $2n$ items, with two items $x_i^1$ and $x_i^0$ for each variable $x_i$ with $i \in \{1, \dots, n\}$. Intuitively, $x_i^1$ represents the positive literal $x_i$ and $x_i^0$ represents the negative literal $\bar{x}_i$. We say that a subset $P \subseteq I$ represents a variable assignment $a \in \{0, 1\}^n$ if $P \cap \{x_i^0, x_i^1\} = \{x_i^{a_i}\}$ for all $i$. We design our instance such that the packings $P$ with $U(P) > v(\mathrm{OPT})$ and $w(P) = W$ exactly correspond to the variable assignments in $\{0, 1\}^n$. Note that this excludes the packing $\{t\}$, as this packing has $U(\{t\}) = v(\mathrm{OPT})$ and $w(\{t\}) = W$.
Property 2.
If $U(P) > v(\mathrm{OPT})$ and $w(P) = W$, then $P$ represents a variable assignment. Each variable assignment is represented by at least one packing $P$ with $U(P) > v(\mathrm{OPT})$ and $w(P) = W$.
If the first two properties hold, then all non-trivial constraints in the ILP (K-ILP) for the constructed instance correspond to a packing $P$ with $U(P) > v(\mathrm{OPT})$ and $w(P) = W$ and, thus, to a variable assignment of the given succinct set cover instance. Furthermore, each variable assignment is represented by at least one active constraint. With the next property, we want to ensure that each possible query, i.e., each non-trivial item, corresponds to a succinct set cover formula $\varphi_j$. To this end, we introduce the set of items $F = \{f_1, \dots, f_m\}$, with one item $f_j$ for each formula $\varphi_j$. These items will be the only non-trivial items in the constructed instance, so each possible query is to an element of $F$. Next, we want to achieve that querying an item $f_j$ suffices to satisfy all constraints of (K-ILP) for packings $P$ with $U(P) > v(\mathrm{OPT})$ and $w(P) = W$ that represent a variable assignment which satisfies formula $\varphi_j$, and does not impact any other constraints. Formally, we would like to design our instance such that the following property holds.
Property 3.
For each packing $P$ with $U(P) > v(\mathrm{OPT})$ and each $j \in \{1, \dots, m\}$: $f_j \in P$ if and only if $P$ represents a variable assignment that satisfies $\varphi_j$. If $f_j \in P$, then $U_{\{f_j\}}(P) \le v(\mathrm{OPT})$.
If we manage to define our reduction in such a way that the three properties are satisfied, it is not hard to show correctness (see the full version [29] for the second direction):
First Direction.
If there is an index set $J$ with $|J| \le k$ that is a feasible solution to the succinct set cover instance, then each possible variable assignment must satisfy at least one formula $\varphi_j$ with $j \in J$. We claim that $Q = \{f_j \mid j \in J\}$ is a feasible query set for the constructed OfflineKnapExp instance. To this end, consider an arbitrary packing $P$ with $U(P) > v(\mathrm{OPT})$, which are the only packings that induce non-trivial constraints in the corresponding (K-ILP). By Property 1, we have $w(P) = W$. Property 2 implies that $P$ represents some variable assignment and Property 3 implies that $f_j \in P$ for all formulas $\varphi_j$ that are satisfied by this assignment. By the assumption that $J$ is a feasible succinct set cover solution, we get $P \cap Q \neq \emptyset$. Property 3 implies $U_Q(P) \le v(\mathrm{OPT})$ for each such packing. Thus, $Q$ satisfies the constraint of $P$ in the (K-ILP) for the constructed instance. As this holds for all $P$ with $U(P) > v(\mathrm{OPT})$, the set $Q$ is a feasible solution for the constructed OfflineKnapExp instance.
Technical reduction.
It remains to show how to actually construct an OfflineKnapExp instance that satisfies the three properties. Given an instance of succinct set cover, we construct an instance of OfflineKnapExp consisting of four sets $X$, $F$, $D$ and $\{t\}$ of items such that $I = X \cup F \cup D \cup \{t\}$. The set $F$ is defined exactly as sketched above, the set $X$ contains the items $x_i^1, x_i^0$ as introduced above, and $t$ is the trivial item with $w_t = W$. All further items will be used to ensure the three properties.
Conceptually, we construct several partial weights for each item that will later be combined into a single weight. For each item $i$, we construct two weights $A_i^j$ and $B_i^j$ for each formula $\varphi_j$, and a single weight $C_i$. Similarly, we break down the knapsack capacity into multiple partial knapsack capacities $W_{A^j}$ and $W_{B^j}$ for each $\varphi_j$, and $W_C$. Intuitively, the full weight of an item will be the concatenation of the decimal representations of its partial weights, and $W$ will be the concatenation of the partial capacities. We make this more precise in the full version [29] after defining all partial weights in such a way that the following property holds.
Property 4.
For each packing $P$ with $t \notin P$, it holds that $w(P) = W$ if and only if $C(P) = W_C$, and $A^j(P) = W_{A^j}$ and $B^j(P) = W_{B^j}$ for all $j$, where $C(P)$, $A^j(P)$ and $B^j(P)$ denote the sums of the respective partial weights of the items in $P$.
For now, we operate under the assumption that Property 4 holds, focus on the partial weights and capacities, and proceed by defining the remaining parts of the construction.
Definition of the $C$-weights.
As formulated in Property 2, we would like each packing $P$ with $U(P) > v(\mathrm{OPT})$ and $w(P) = W$ to represent a variable assignment. To this end, we need such a packing to contain exactly one item of $\{x_i^0, x_i^1\}$ for each $i$. To enforce this, we use the partial $C$-weights and the partial capacity $W_C$. In particular, we define $C_{x_i^0} = C_{x_i^1} = 10^{i-1}$ for each $i$, and $W_C = \sum_{i=1}^{n} 10^{i-1}$. For all items $i \in I \setminus X$, we define $C_i = 0$, which immediately implies the following property:
Property 5.
A packing $P$ with $t \notin P$ satisfies $C(P) = W_C$ iff $P$ represents a variable assignment.
Definition of the set $D$ and the $B$-weights.
Define $D = D_1 \cup \dots \cup D_m$ for sets $D_j$ to be defined below. For formula $\varphi_j$, let $h_j$ denote the number of clauses in $\varphi_j$ and let $C_{j,1}, \dots, C_{j,h_j}$ denote these clauses. For each $k \in \{1, \dots, h_j\}$, we add four items to set $D_j$, among them a designated item $d_{j,k}$. The idea is to define the partial $B^j$-weights and the partial capacity $W_{B^j}$ in such a way that the following property holds.
Property 6.
For each $j$, a packing $P$ satisfies $B^j(P) = W_{B^j}$ and $w(P) = W$ iff $d_{j,k} \in P$ for each clause $C_{j,k}$ that is satisfied by the assignment represented by $P$.
To achieve the property, we first define the partial $B^j$-weights for the items in $X$. For a literal $\ell$, let $K_j(\ell)$ denote the set of indices of the clauses of $\varphi_j$ that contain $\ell$. We define the weights $B^j_{x_i^1}$ and $B^j_{x_i^0}$ as $\sum_{k \in K_j(x_i)} 10^{k-1}$ and $\sum_{k \in K_j(\bar{x}_i)} 10^{k-1}$, respectively. Intuitively, digit $k$ of the sum $B^j(P)$ of a packing $P$ that represents a variable assignment indicates whether the assignment satisfies clause $C_{j,k}$ or not: If the clause is satisfied, then $P$ contains the items that represent the three literals of $C_{j,k}$ and the digit has value $3$. Otherwise, $P$ contains at most two of the items that represent the literals of $C_{j,k}$ and the digit has value at most $2$.
Finally, for each item $d \in D_j$ added for clause $C_{j,k}$, we define the $B^j$-weight of item $d$ as a suitable multiple of $10^{k-1}$, so that $d$ only affects digit $k$. For all remaining items, we define the partial $B^j$-weight to be zero. Furthermore, we define the partial capacity $W_{B^j}$ as the corresponding concatenation of target digits. We claim that these definitions enforce Property 6.
Intuitively, the fact that the decimal digits of lowest magnitude in $W_{B^j}$ have a fixed target value forces a packing $P$ with $w(P) = W$ to contain exactly one of the four items added for each clause, as each such item increases the corresponding digit by one. Similarly, the value of each clause digit in $W_{B^j}$ is three. Since the elements of $X$ that occur in clause $C_{j,k}$ increase the value of digit $k$ in $B^j(P)$ by one and item $d_{j,k}$ increases the digit by a complementary amount, a packing with value three in digit $k$ of $B^j(P)$ can contain item $d_{j,k}$ if and only if $P$ contains the three items that correspond to the literals in $C_{j,k}$. This implies Property 6. We give a more formal argumentation in the full version [29].
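The digit trick behind the partial weights can be illustrated as follows (a simplified toy sketch with made-up numbers, not the exact weights of the reduction): giving each clause its own decimal digit and keeping every digit sum below 10 ensures that no carries occur, so the total weight records, digit by digit, how many literals of each clause were packed.

```python
# Each clause k owns decimal digit k. A literal item adds 10**k for every
# clause k that contains the literal; since at most three items touch a
# digit and 3 < 10, digits never carry, so the total weight encodes
# per-clause information.
clauses = [("x1", "x2", "x3"), ("x1", "~x2", "x3")]  # toy 3-DNF formula

def weight(literal):
    """Partial weight of a literal item: one unit in each owning digit."""
    return sum(10**k for k, c in enumerate(clauses) if literal in c)

packed = ["x1", "x3"]                       # literal items chosen for a packing
total = sum(weight(l) for l in packed)      # 11 + 11 = 22
digits = [total // 10**k % 10 for k in range(len(clauses))]
print(digits)  # [2, 2]: both clauses have two of their literals packed
```

A digit reaches value 3 exactly when all three literal items of the corresponding clause are packed, which is the mechanism the reduction uses to couple clause satisfaction with the capacity constraint.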
Definition of the set $F$ and the $A$-weights.
Define $F = \{f_1, \dots, f_m\}$ such that $I = X \cup F \cup D \cup \{t\}$. Note that $f_j$ is the item that has already been introduced for Property 3. As a step toward enforcing this property, we define the $A$-weights such that:
Property 7.
For each $j$, a packing $P$ with $w(P) = W$ has $f_j \in P$ if and only if $d_{j,k} \in P$ for some clause $C_{j,k}$ in $\varphi_j$.
To enforce this property, we define a suitable partial capacity $W_{A^j}$ and set the $A^j$-weights of the elements $d_{j,k}$, $k \in \{1, \dots, h_j\}$, of the item $f_j$, and of the remaining items of $D_j$ in such a way that the capacity $W_{A^j}$ can only be met if $f_j$ is packed precisely when one of the $d_{j,k}$ is. For all other items, we define the $A^j$-weight to be zero. We show in the full version [29] that these definitions imply Property 7.
Definition of the uncertainty intervals and precise profits.
To finish the reduction, we define the profits and uncertainty intervals of all introduced items:
- For the items $f_j$, $j \in \{1, \dots, m\}$, we define the uncertainty interval $I_{f_j} = (0, \varepsilon)$ for a fixed $0 < \varepsilon < \frac{1}{m+1}$, and fix profits $p_{f_j} \in I_{f_j}$.
- For all items $i \in I \setminus F$, we use trivial uncertainty intervals $I_i = \{p_i\}$.
Proof of the three main properties.
With the full construction in place, we are ready to prove the three main properties from the beginning of the proof:
1. Property 1: By definition of the profits, we have $v(P) \le p_t$ for each packing $P$, which implies that the maximum profit is $p_t$. Since the packing $\{t\}$ has a profit of exactly $p_t$, we get $v(\mathrm{OPT}) = p_t$. On the other hand, the upper limit of a packing $P$ is at most $v(P) + m \cdot \varepsilon$, as only the items in $F$ have a non-trivial uncertainty interval, with upper limits of $\varepsilon$. By choice of $\varepsilon$, this gives $U(P) < v(P) + 1$. Since all weights are integer, this implies that $U(P) > v(\mathrm{OPT})$ only holds if $w(P) = W$.
2. Property 2: The property, in particular its first part, is essentially implied by Property 4 and Property 5. We give the formal argumentation in the full version [29].
3. Property 3: By Property 1 and the definition of the uncertainty intervals, a packing $P$ has $U(P) > v(\mathrm{OPT})$ if and only if $P \cap F \neq \emptyset$ and $w(P) = W$. By Property 4, the latter holds if and only if $C(P) = W_C$, and $A^j(P) = W_{A^j}$ and $B^j(P) = W_{B^j}$ for all $j$. Fix a packing $P$ with $f_j \in P$ for some $j$. By Property 7, $f_j \in P$ holds if and only if $d_{j,k} \in P$ for some clause $C_{j,k}$ in $\varphi_j$. By Property 6, $d_{j,k} \in P$ if and only if $C_{j,k}$ is satisfied by the assignment represented by $P$. This gives the first part of Property 3. For the final part, one verifies by a direct calculation that $U_{\{f_j\}}(P) \le v(\mathrm{OPT})$. Hence, the second part of Property 3 holds.
To finish the proof of the reduction, it remains to argue about the running time and space complexity of the reduction. We do so in the full version [29]. The main argument is that, while the numerical values of the constructed weights are exponential, the number of digits in their decimal representations (and, thus, their encoding size) is polynomial.
The previous theorem proves $\Sigma_2^p$-hardness for OfflineKnapExp. Exploiting the inapproximability result on the succinct set cover problem given in [30, Theorem 7.2], we can show the following stronger statement by using the same reduction (cf. the full version [29] for a formal proof).
Theorem 6.
Unless $\Sigma_2^p = \Delta_2^p$, there exists no $n^{\varepsilon}$-approximation (given access to an oracle for problems in NP) for OfflineKnapExp for any $\varepsilon < 1$, where $n = |I|$.
Remark 7.
We remark that all results given in this section require large numerical input parameters. Hence, they do not prohibit the existence of pseudopolynomial algorithms.
3 Algorithmic Results
In this section, we give algorithms for $(\alpha, \beta)$-OfflineKnapExp for different values of $\alpha$ and $\beta$. Motivated by the hardness results of the previous section, we show bounds on the size of the computed query sets in comparison to $|Q^*|$ instead of $|Q^*_{\alpha,\beta}|$. All our algorithms treat the two conditions on $(\alpha, \beta)$-feasible query sets (cf. Definition 2) as two separate subproblems:
-
1.
Compute a query set such that there exists a packing with .
-
2.
Compute a query set such that for all packings .
In the following, we show how to solve these two problems and give bounds on in terms of .
3.1 The First Subproblem
The following lemma solves the first subproblem for by computing a packing with and . The set satisfies the requirement of the subproblem and has . In the full version [29], we prove the lemma by exploiting existing algorithms for two-dimensional knapsack.
Lemma 8.
Fix an . Given an instance of OfflineKnapExp, there exists a polynomial time algorithm that computes a packing with and .
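The proof of Lemma 8 in [29] exploits existing algorithms for two-dimensional knapsack. As a minimal illustration of the kind of building block involved (the function name and exact-DP formulation below are our own sketch, not the lemma's construction), a standard pseudopolynomial dynamic program for knapsack with two weight constraints looks as follows:

```python
def knapsack_two_constraints(items, cap1, cap2):
    """Pseudopolynomial DP for knapsack with two weight constraints.
    Each item is a tuple (profit, weight1, weight2); capacities are integers.
    best[a][b] = max profit achievable within weight budgets a and b."""
    best = [[0] * (cap2 + 1) for _ in range(cap1 + 1)]
    for profit, w1, w2 in items:
        # iterate budgets downwards so each item is used at most once
        for a in range(cap1, w1 - 1, -1):
            for b in range(cap2, w2 - 1, -1):
                best[a][b] = max(best[a][b], best[a - w1][b - w2] + profit)
    return best[cap1][cap2]
```

The table has size O(cap1 · cap2), which is the source of the pseudopolynomial (rather than polynomial) running time.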
3.2 The Second Subproblem: Prefix Problems
For the second subproblem, we consider special packings that correspond to greedy solutions for the underlying knapsack problem. For an item , define the density and the optimistic density . For a query set , define the optimistic density of an item after querying as if and if . We use to denote the optimistic density order after querying , that is, for we use to denote that . We assume an arbitrary but fixed tiebreaking rule between the items to treat as a total order. This allows us to define optimistic prefixes and the prefix problem.
Definition 9 (Optimistic prefixes).
For a set and a parameter , define the optimistic prefix to be the maximal prefix of order such that . To shorten the notation, we define
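The optimistic prefix can be computed directly from the definitions above. The sketch below is an assumption-laden illustration (field names are ours, and we assume the prefix condition is that the prefix's total weight fits the knapsack capacity, with ties in the density order broken by item id):

```python
def optimistic_density(item, queried):
    # after querying, an item's density uses its true profit;
    # before that, the optimistic upper limit of its interval
    numer = item["profit"] if item["id"] in queried else item["upper"]
    return numer / item["weight"]

def optimistic_prefix(items, queried, capacity):
    # sort by decreasing optimistic density, ties broken by a fixed rule (here: id)
    order = sorted(items, key=lambda it: (-optimistic_density(it, queried), it["id"]))
    prefix, total_weight = [], 0
    for it in order:
        if total_weight + it["weight"] > capacity:
            break  # maximal prefix that still fits the capacity (assumed condition)
        prefix.append(it)
        total_weight += it["weight"]
    return prefix
```

Note that querying an item can only move it later in the optimistic density order, since the queried profit is at most the interval's upper limit.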
From the analysis of the knapsack greedy algorithm, it is well-known that the following holds for all packings :
| (1) |
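The inequality (1) did not survive extraction here; for context, the classical bound from the greedy analysis states that every integral packing's profit is at most the fractional-knapsack optimum obtained by filling the capacity in decreasing density order. A minimal sketch of that classical bound (not necessarily the exact form of (1)):

```python
def fractional_greedy_bound(items, capacity):
    """Classical fractional-knapsack bound: fill the capacity in decreasing
    density order, taking the last item fractionally. Every integral packing
    that fits the capacity has profit at most this value."""
    order = sorted(items, key=lambda it: -it["profit"] / it["weight"])
    bound, remaining = 0.0, capacity
    for it in order:
        take = min(it["weight"], remaining)
        bound += take * it["profit"] / it["weight"]
        remaining -= take
        if remaining == 0:
            break
    return bound
```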
If we compute a set , such that and , then solves the second subproblem for , which motivates the following problem.
Definition 10 (Prefix problem).
Given an instance of OfflineKnapExp and a threshold parameter , where is the optimal profit of the underlying knapsack instance, the prefix problem asks for the set of minimum cardinality such that .
Unfortunately, the prefix problem preserves the hardness of knapsack, even if the given threshold is larger than by a constant factor. We show this in the full version [29].
Theorem 11.
The prefix problem is weakly NP-hard for every with .
On the positive side, the problem can be solved to optimality in pseudopolynomial time.
Theorem 12.
The prefix problem can be solved in pseudopolynomial time.
We give the full proof of Theorem 12 in the full version [29], but highlight the main ideas here. In the following, we use to refer to an optimal solution for the prefix problem.
Assume for now that the algorithm knows the last item in the prefix and the first item outside of in the order (Figure 1 (a)) and, for the sake of simplicity, assume that . We design our algorithm to reconstruct the optimal solution (or a similar solution) using just the knowledge of and .
If we look at the same two items in the initial optimistic density order , then there can be a subset of items before , a subset of items between and and a subset of items after (Figure 1 (b)). Based on and being next to each other in order , we can immediately deduce that as items in would still be between and in . Thus, the algorithm can safely add to its solution as the optimal solution does the same. Similarly, from being the first item outside of , it is clear that as the items in stay outside the prefix whether they are queried or not. Hence, the algorithm can safely ignore such items.
In the order , i.e., in the order after adding to the solution, the items and must already be next to each other. However, we can still have , that is, might not yet be part of the prefix (Figure 1 (c)). To fix this, the algorithm needs to query items from the set , which contains items that are before in the order but would move behind if they are added to the solution. To reconstruct the optimal solution, the algorithm has to query these items in such a way that enters the prefix but does not. On the one hand, the algorithm should select items with high as the items will leave the prefix and the goal of the prefix problem is to decrease the upper limit of the prefix below the threshold . On the other hand, the algorithm needs to make sure not to query too many items in , so that the cardinality of the solution does not grow too large. If is the minimum amount of weight that needs to be queried for to enter the prefix, is the maximum amount of weight that can be queried before enters the prefix, and is the number of queries that the algorithm can afford, then the algorithm should select its queries from according to the following ILP, which we show to be solvable in pseudopolynomial time:
| () |
After adding the solution of () to the solution, the prefix of the algorithm has reached roughly the same configuration as : is last in the prefix and is first outside the prefix (Figure 1 (d)). However, we still might have . To fix this, the algorithm has to query items of that stay in front of in the optimistic density order even after being queried. Since these items are part of the prefix and will stay part of the prefix even after being queried, the algorithm should greedily query such items with large until the solution becomes feasible for the prefix problem instance. Our algorithm for Theorem 12 is based on exactly this approach. We show in the full version [29] how to formalize this and get rid of all assumptions that were used in the description above.
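The concrete ILP is given in the displayed program above, whose symbols are not reproduced in this extraction; what makes it pseudopolynomially solvable is its shape: a cardinality budget, a total-weight window, and a linear objective. A hedged sketch of a dynamic program for a selection problem of that assumed shape (all names are ours):

```python
def solve_weight_window_dp(items, lo, hi, budget):
    """DP sketch for an ILP of the assumed shape: choose at most `budget`
    items, with total weight in [lo, hi], maximizing total value. Runs in
    O(n * hi * budget) time, i.e., pseudopolynomial in the weights."""
    NEG = float("-inf")
    # best[w][c] = max value with total weight exactly w using exactly c items
    best = [[NEG] * (budget + 1) for _ in range(hi + 1)]
    best[0][0] = 0.0
    for value, weight in items:
        # iterate downwards so each item is used at most once
        for w in range(hi, weight - 1, -1):
            for c in range(budget, 0, -1):
                if best[w - weight][c - 1] > NEG:
                    best[w][c] = max(best[w][c], best[w - weight][c - 1] + value)
    return max((best[w][c] for w in range(lo, hi + 1) for c in range(budget + 1)),
               default=NEG)
```

A return value of negative infinity indicates that no selection satisfies the weight window and the budget.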
To achieve polynomial running time, we use the same approach, but instead of () we solve a certain LP relaxation with the property that an optimal basic feasible solution has at most two fractional variables. Omitting the fractional items leads to the following corollary.
Corollary 13.
Given an instance of the prefix problem with threshold parameter , let denote an optimal solution to the instance. There exists a polynomial time algorithm that computes a set with such that .
3.3 Combining the Subproblems
By combining Lemma 8 and Theorem 12, we can show the following theorem. The idea is to use Lemma 8 to compute a packing with for a carefully chosen and then use Theorem 12 with threshold to compute a solution to the prefix problem. Exploiting (1), we return the solution and observe that satisfies the theorem. A formal proof is given in the full version [29].
Theorem 14.
Fix . There exists a pseudopolynomial algorithm that given an instance of OfflineKnapExp computes a -feasible query set with .
Replacing the usage of Theorem 12 with Corollary 13 in the approach above yields:
Theorem 15.
Fix . There exists a polynomial time algorithm that given an instance of OfflineKnapExp computes a -feasible query set with .
4 Conclusion
We hope that our results on OfflineKnapExp improve the understanding of NP-hard problems under explorable uncertainty. In particular, our algorithmic insights on the resource augmentation setting give hope for tackling such problems even if the corresponding offline versions have strong impossibility results. For knapsack specifically, studying a stochastic version of the prefix problem of Section 3, for example in the stochastic setting of [26], seems like a logical next step towards algorithmic results for the non-offline KnapExp. For OfflineKnapExp, our results of Section 3 show that non-trivial algorithmic results with theoretical guarantees are possible, opening the door for more research on finding the best-possible guarantees.
References
- [1] Sanjeev Arora and Boaz Barak. Computational Complexity - A Modern Approach. Cambridge University Press, 2009. URL: http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521424264.
- [2] Evripidis Bampis, Konstantinos Dogeas, Thomas Erlebach, Nicole Megow, Jens Schlöter, and Amitabh Trehan. Competitive query minimization for stable matching with one-sided uncertainty. In APPROX/RANDOM, volume 317 of LIPIcs, pages 17:1–17:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.APPROX/RANDOM.2024.17.
- [3] Evripidis Bampis, Christoph Dürr, Thomas Erlebach, Murilo Santos de Lima, Nicole Megow, and Jens Schlöter. Orienting (hyper)graphs under explorable stochastic uncertainty. In ESA, volume 204 of LIPIcs, pages 10:1–10:18, 2021. doi:10.4230/LIPIcs.ESA.2021.10.
- [4] Soheil Behnezhad, Avrim Blum, and Mahsa Derakhshan. Stochastic vertex cover with few queries. In SODA, pages 1808–1846. SIAM, 2022. doi:10.1137/1.9781611977073.73.
- [5] Soheil Behnezhad, Mahsa Derakhshan, and MohammadTaghi Hajiaghayi. Stochastic matching with few queries: (1-ε) approximation. In STOC, pages 1111–1124. ACM, 2020. doi:10.1145/3357713.3384340.
- [6] Avrim Blum, John P. Dickerson, Nika Haghtalab, Ariel D. Procaccia, Tuomas Sandholm, and Ankit Sharma. Ignorance is almost bliss: Near-optimal stochastic matching with few queries. Oper. Res., 68(1):16–34, 2020. doi:10.1287/opre.2019.1856.
- [7] R. Bruce, M. Hoffmann, D. Krizanc, and R. Raman. Efficient update strategies for geometric computing with uncertainty. Theory of Computing Systems, 38(4):411–423, 2005. doi:10.1007/s00224-004-1180-4.
- [8] George Charalambous and Michael Hoffmann. Verification problem of maximal points under uncertainty. In IWOCA 2013, volume 8288 of Lecture Notes in Computer Science, pages 94–105. Springer, 2013. doi:10.1007/978-3-642-45278-9_9.
- [9] Shaddin Dughmi, Yusuf Hakan Kalayci, and Neel Patel. On sparsification of stochastic packing problems. In ICALP, volume 261 of LIPIcs, pages 51:1–51:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ICALP.2023.51.
- [10] T. Erlebach and M. Hoffmann. Minimum spanning tree verification under uncertainty. In D. Kratsch and I. Todinca, editors, WG 2014: International Workshop on Graph-Theoretic Concepts in Computer Science, volume 8747 of Lecture Notes in Computer Science, pages 164–175. Springer Berlin Heidelberg, 2014. doi:10.1007/978-3-319-12340-0_14.
- [11] T. Erlebach, M. Hoffmann, and F. Kammer. Query-competitive algorithms for cheapest set problems under uncertainty. Theoretical Computer Science, 613:51–64, 2016. doi:10.1016/j.tcs.2015.11.025.
- [12] T. Erlebach, M. Hoffmann, D. Krizanc, M. Mihal’ák, and R. Raman. Computing minimum spanning trees with uncertainty. In STACS’08: 25th International Symposium on Theoretical Aspects of Computer Science, pages 277–288, 2008. doi:10.48550/arXiv.0802.2855.
- [13] Thomas Erlebach, Murilo S. de Lima, Nicole Megow, and Jens Schlöter. Sorting and hypergraph orientation under uncertainty with predictions. In IJCAI, pages 5577–5585. ijcai.org, 2023. doi:10.24963/ijcai.2023/619.
- [14] Thomas Erlebach, Murilo Santos de Lima, Nicole Megow, and Jens Schlöter. Learning-augmented query policies for minimum spanning tree with uncertainty. In ESA, volume 244 of LIPIcs, pages 49:1–49:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.ESA.2022.49.
- [15] T. Feder, R. Motwani, L. O’Callaghan, C. Olston, and R. Panigrahy. Computing shortest paths with uncertainty. Journal of Algorithms, 62(1):1–18, 2007. doi:10.1016/j.jalgor.2004.07.005.
- [16] T. Feder, R. Motwani, R. Panigrahy, C. Olston, and J. Widom. Computing the median with uncertainty. SIAM Journal on Computing, 32(2):538–547, 2003. doi:10.1137/S0097539701395668.
- [17] Michel X. Goemans and Jan Vondrák. Covering minimum spanning trees of random subgraphs. Random Struct. Algorithms, 29(3):257–276, 2006. doi:10.1002/rsa.20115.
- [18] Marc Goerigk, Manoj Gupta, Jonas Ide, Anita Schöbel, and Sandeep Sen. The robust knapsack problem with queries. Comput. Oper. Res., 55:12–22, 2015. doi:10.1016/j.cor.2014.09.010.
- [19] M. M. Halldórsson and M. S. de Lima. Query-competitive sorting with uncertainty. In MFCS, volume 138 of LIPIcs, pages 7:1–7:15, 2019. doi:10.4230/LIPIcs.MFCS.2019.7.
- [20] S. Kahan. A model for data in motion. In STOC’91: 23rd Annual ACM Symposium on Theory of Computing, pages 265–277, 1991. doi:10.1145/103418.103449.
- [21] Richard M. Karp. Reducibility among combinatorial problems. In Complexity of Computer Computations, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972. doi:10.1007/978-1-4684-2001-2_9.
- [22] Jon Kleinberg and Eva Tardos. Algorithm Design. Addison-Wesley Longman Publishing Co., Inc., USA, 2005.
- [23] Takanori Maehara and Yutaro Yamaguchi. Stochastic packing integer programs with few queries. Mathematical Programming, 182(1):141–174, 2020. doi:10.1007/s10107-019-01388-x.
- [24] Corinna Mathwieser and Eranda Çela. Special cases of the minimum spanning tree problem under explorable edge and vertex uncertainty. Networks, 83(3):587–604, 2024. doi:10.1002/net.22204.
- [25] N. Megow, J. Meißner, and M. Skutella. Randomization helps computing a minimum spanning tree under uncertainty. SIAM Journal on Computing, 46(4):1217–1240, 2017. doi:10.1137/16M1088375.
- [26] Nicole Megow and Jens Schlöter. Set selection under explorable stochastic uncertainty via covering techniques. In IPCO, volume 13904 of Lecture Notes in Computer Science, pages 319–333. Springer, 2023. doi:10.1007/978-3-031-32726-1_23.
- [27] J. Meißner. Uncertainty Exploration: Algorithms, Competitive Analysis, and Computational Experiments. PhD thesis, Technische Universität Berlin, 2018. doi:10.14279/depositonce-7327.
- [28] Arturo Merino and José A. Soto. The minimum cost query problem on matroids with uncertainty areas. In ICALP, volume 132 of LIPIcs, pages 83:1–83:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.ICALP.2019.83.
- [29] Jens Schlöter. On the complexity of knapsack under explorable uncertainty: Hardness and algorithms, 2025. doi:10.48550/arXiv.2507.02657.
- [30] Amnon Ta-Shma, Christopher Umans, and David Zuckerman. Loss-less condensers, unbalanced expanders, and extractors. In Proceedings of the thirty-third annual ACM symposium on Theory of computing, pages 143–152, 2001. doi:10.1145/380752.380790.
- [31] Christopher Umans. Hardness of approximating Σ₂ᵖ minimization problems. In FOCS, pages 465–474. IEEE Computer Society, 1999. doi:10.1109/SFFCS.1999.814619.
- [32] Christopher Umans. The minimum equivalent dnf problem and shortest implicants. Journal of Computer and System Sciences, 63(4):597–611, 2001. doi:10.1006/jcss.2001.1775.
- [33] Jan Vondrák. Shortest-path metric approximation for random subgraphs. Random Struct. Algorithms, 30(1-2):95–104, 2007. doi:10.1002/rsa.20150.
