LIPIcs.ISAAC.2021.69.pdf
- Filesize: 0.65 MB
- 13 pages
In this paper, we study the problem of maximizing the difference between an adaptive submodular (revenue) function and a non-negative modular (cost) function. The input of our problem is a set of n items, where each item has a particular state drawn from some known prior distribution The revenue function g is defined over items and states, and the cost function c is defined over items, i.e., each item has a fixed cost. The state of each item is unknown initially and one must select an item in order to observe its realized state. A policy π specifies which item to pick next based on the observations made so far. Denote by g_{avg}(π) the expected revenue of π and let c_{avg}(π) denote the expected cost of π. Our objective is to identify the best policy π^o ∈ arg max_π g_{avg}(π)-c_{avg}(π) under a k-cardinality constraint. Since our objective function can take on both negative and positive values, the existing results of submodular maximization may not be applicable. To overcome this challenge, we develop a series of effective solutions with performance guarantees. Let π^o denote the optimal policy. For the case when g is adaptive monotone and adaptive submodular, we develop an effective policy π^l such that g_{avg}(π^l) - c_{avg}(π^l) ≥ (1-1/e-ε)g_{avg}(π^o) - c_{avg}(π^o), using only O(nε^{-2}log ε^{-1}) value oracle queries. For the case when g is adaptive submodular, we present a randomized policy π^r such that g_{avg}(π^r) - c_{avg}(π^r) ≥ 1/eg_{avg}(π^o) - c_{avg}(π^o).
Feedback for Dagstuhl Publishing