
Unite and Lead: Finding Disjunctive Cliques for Scheduling Problems

Konstantin Sidorov¹ (Delft University of Technology, The Netherlands), Imko Marijnissen¹ (Delft University of Technology, The Netherlands), and Emir Demirović (Delft University of Technology, The Netherlands)

¹ Both authors contributed equally to this research.
Abstract

Constraint programming solvers have seen much success in scheduling problems owing to their efficient reasoning over constraints to solve complex problems in practice. Many algorithms have been proposed for propagating information from a single constraint. However, inferring and exchanging information across multiple constraints can provide deeper insight into the global structure of a problem. In this work, we propose to exchange information amongst constraints by inferring the disjointness of tasks in scheduling problems from many constraints. We do this by (i) augmenting existing propagators, such as the Cumulative and nogoods, to report when pairs of tasks are disjoint, and (ii) leveraging this information by introducing the SelectiveDisjunctive propagator which generates a lower bound on the earliest completion time of cliques of disjoint tasks to determine conflicts. This allows us to aggregate disjointness information spanning multiple constraints to gain a better global overview of the problem, as well as more precise local information. We also identify a problem structure where an LCG solver reasoning over Cumulative constraints separately, without any reformulations, requires an exponential amount of time to prove infeasibility, which we both justify theoretically and show empirically; on the other hand, our approach solves those instances in polynomial time. On particular known RCPSP and RCPSP/max benchmarks, our approach significantly reduces the number of conflicts required to prove optimality when resource contention is high. Additionally, we discover new lower bounds for 16 RCPSP/max instances (closing six of them) and four RCPSP instances (closing one), as well as new upper bounds for two RCPSP/max instances and four RCPSP instances. Furthermore, we empirically analyse our proposed approach to determine which features are beneficial for performance, showing that finding cliques is one of the main bottlenecks and that detecting disjointness during search can lead to improved bounds on certain instances, but it generally negatively impacts learning. This work paves the way for reasoning over the disjointness of tasks inferred from a variety of standard constraints to discover novel information sourced from multiple constraints during search.

Keywords and phrases:
Constraint Programming, Lazy Clause Generation, Propagation, Scheduling, Cumulative, Disjunctive
Funding:
Konstantin Sidorov: supported by the TU Delft AI Labs program as part of the XAIT lab.
Imko Marijnissen: supported by the NWO/OCW, as part of the Quantum Software Consortium programme (project number 024.003.037 / 3368).
Copyright and License:
© Konstantin Sidorov, Imko Marijnissen, and Emir Demirović; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Constraint and logic programming
Supplementary Material:

Software  (Pumpkin and processing scripts): https://doi.org/10.5281/zenodo.15624366
Dataset  (Instances and solver logs): https://doi.org/10.5281/zenodo.15624416
Acknowledgements:
The authors would like to thank Maarten Flippo for the productive discussions on scheduling applications for constraint programming during the early phases of this project.
Editors:
Maria Garcia de la Banda

1 Introduction

Scheduling is one of the most important classes of problems in the field of optimisation; it can be loosely described as the problem of assigning start times to a set of tasks to satisfy certain constraints while optimising some objective. Constraint programming (CP) has been successful in solving scheduling problems. The key advantage of CP is its native support for constraints that capture task resource usage and efficient propagation algorithms to prune the search space, such as time-tabling [29] or energetic reasoning [2]. Another promising technique is exploiting the disjointness of tasks, as identified by Gay et al. [16], to infer bounds on tasks.

A common trait of the previously mentioned works is that they reason over a single resource, since reasoning jointly over multiple resources significantly increases the computational cost. However, the benefits of stronger inference obtained by aggregating and exchanging information across multiple resource constraints may outweigh the drawbacks, an opportunity underexplored in constraint programming for scheduling.

To illustrate the issues with single-resource reasoning, consider the following example:

Example 1.

We have two tasks on different resources and aim to minimise the latest finish time (makespan). Individually, neither resource constraint allows us to make any non-trivial inferences about the makespan. However, if we combine this with the information that the two tasks cannot be executed at the same time, then we can see that they need to be executed sequentially, which allows us to derive a tighter bound on the makespan of these two tasks.

To address this, we propose a principled solution for aggregating and exchanging information across constraints by reasoning over the disjointness of task pairs, based on both static incompatibilities (i.e., present in the original model) and dynamic incompatibilities (i.e., discovered during search). Our approach consists of three core components:

Variable creation

We create Boolean variables representing whether two tasks are disjoint.

Disjointness mining

We extend the existing propagators for constraints defined in the problem to infer whether two tasks should be disjoint and allow propagators to use the Boolean variables to perform additional propagation.

Conflict detection

We introduce the SelectiveDisjunctive propagator that aggregates this disjointness information by deriving lower bounds on the earliest completion time of disjoint cliques of tasks to detect conflicts.

By doing so, we can detect inconsistent states for disjoint tasks by (a) aggregating information from the whole problem in contrast to the more orthodox focus on single constraints, (b) allowing propagators to use this shared information, and (c) computing conflicting task groups during search, thus using information obtained later in the search. As finding maximum cliques is an NP-hard problem [23], we propose a heuristic to find disjoint cliques of tasks based on minimising the growth of the interval spanned by the clique when adding a task.

To support the usefulness of our contributions, we identify an instance structure for which state-of-the-art solvers are guaranteed to incur an exponential number of conflicts unless they reformulate the problem. While state-of-the-art LCG solvers also exhibit exponential runtime in practice, our approach can solve the same instances in polynomial time.

To evaluate our approach, we implemented it in Pumpkin [12] and ran experiments on well-known scheduling benchmarks (RCPSP and RCPSP/max). Comparing our approach to baseline Pumpkin and Google OR-Tools CP-SAT [31] (a state-of-the-art CP solver that has won many editions of the MiniZinc Challenge since 2017) shows that our approach works well when resource contention is high, with some instances exhibiting improvements of three orders of magnitude. For RCPSP/max, our approach discovers 16 new lower bounds (and closes six instances) and two new upper bounds; whereas for RCPSP, our approach yields four new upper bounds, four new lower bounds, and one closed instance. Our ablation study shows that the main bottleneck is finding the cliques but that it can be crucial to find the right cliques, whereas disjointness mining hampers performance due to poor learning on most instances.

To summarise, we address the insufficiency of local information by reasoning over the disjointness of tasks to infer global information. We do this by (1) introducing Boolean variables representing the disjointness between two tasks and extending existing propagators to infer the values of these variables and use them for additional propagation, and (2) introducing the SelectiveDisjunctive propagator that heuristically finds disjoint cliques to detect conflicts unattainable by current approaches due to their focus on individual constraints. We identify an instance structure for which current techniques are guaranteed to incur an exponential number of conflicts, while our approach only requires a polynomial number of them. We experimentally evaluate our approach and observe that it is most beneficial for instances with high resource contention, leading to orders-of-magnitude improvements for specific benchmarks. Through an ablation study, we find that dynamically determining which disjoint cliques to focus on can be beneficial and that looking for disjointness during search inhibits the learning capabilities of the solver. This work paves the way for aggregating information across constraints to derive global information about the problem.

The rest of the paper is structured as follows. Section 2 introduces the concepts of constraint programming relevant to our work. Section 3 reviews the previous research on disjunctive reasoning. We present and motivate our strategy for discovering and reasoning about disjunctive tasks in Section 4. We evaluate our approach on a range of scheduling problems in Section 5. Finally, we conclude and review further research directions in Section 6.

2 Preliminaries

Constraint programming framework.

A Constraint Satisfaction Problem (CSP) consists of a tuple (𝒳, 𝒞, 𝒟) where 𝒳 is the set of variables, 𝒞 is the set of constraints which specify the relations between variables, and 𝒟 is the domain which specifies for each variable which values it can take; we refer to the lowest and highest domain values as lower and upper bounds and denote them LB(x) and UB(x), respectively [32]. A solution is a mapping that maps each variable in 𝒳 to a single value in the domain of that variable in 𝒟 which satisfies all of the constraints in 𝒞. An atomic constraint (i.e., atomic predicate) is a predicate over a single integer variable x ∈ 𝒳 and value v, written [x ⋈ v] where ⋈ ∈ {=, ≠, ≤, ≥}.

Constraint programming (CP) is a paradigm for solving CSPs; CP solvers enforce constraints through propagators, each represented with a function f: 𝒟 → 𝒟′ (where 𝒟′ ⊆ 𝒟) which removes values from 𝒟 that are infeasible under the constraints in 𝒞. After applying the propagators, the solver makes a decision which creates several subproblems by splitting the domain of a variable into two or more parts. This process of applying propagators and making decisions is performed until either a solution is found, the problem is found to be unsatisfiable, or a termination criterion is met.
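
To make the propagate-and-decide loop above concrete, here is a minimal sketch in Python; the domain representation (a map from variables to value sets) and the recursive binary branching are our own simplifications, not Pumpkin's API, and we assume each propagator is a function returning a fresh, possibly reduced, copy of the domains that also rejects violated constraints once variables are fixed.

```python
def solve(domains, propagators):
    """Minimal propagate-and-decide loop; `domains` maps variables to value sets."""
    changed = True
    while changed:                               # propagate to a fixpoint
        changed = False
        for propagate in propagators:
            new_domains = propagate(domains)     # f: D -> D' with D' a subset of D
            if any(len(vals) == 0 for vals in new_domains.values()):
                return None                      # domain wipe-out: infeasible branch
            if new_domains != domains:           # propagators return a new dict
                domains, changed = new_domains, True
    if all(len(vals) == 1 for vals in domains.values()):
        return domains                           # all variables fixed: a solution
    var = next(x for x, vals in domains.items() if len(vals) > 1)
    pivot = min(domains[var])                    # decision: split the domain in two
    for part in ({pivot}, domains[var] - {pivot}):
        solution = solve({**domains, var: part}, propagators)
        if solution is not None:
            return solution
    return None                                  # both subproblems failed
```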

During the search process, CP solvers can use a technique called nogood learning [10]. A nogood is a partial assignment that cannot be extended to a full solution, typically rendered as an implication of the form N = p1 ∧ … ∧ pk → ⊥, with pj being the atomic constraints encoding the partial assignment; a nogood is unsatisfied if no pj is falsified. The purpose of adding nogoods is to prevent the search process from (re-)exploring parts of the search space. In this work, we use the lazy clause generation framework [14] to derive these nogoods. In this framework, it is required that the propagators explain their inferences in terms of atomic constraints, that is, they produce explanations of the form e1 ∧ … ∧ em → q.

Cumulative constraint.

Cumulative is a constraint useful for many scheduling problems to model limited renewable resources, such as available work-hours. However, determining the satisfiability of a single Cumulative constraint is NP-complete [2], meaning that even reasoning efficiently over a single Cumulative requires reasoning over relaxations of the problem.

We define the Cumulative constraint in Definition 2; the variable si encodes the start time of task i, and (si + di) the finish time of i. Thus, in this constraint, we implicitly associate each task i ∈ T with an interval [si, si+di). We also introduce notation for the bounds of tasks and sets of tasks.

Definition 2.

Let T be a set of tasks, and let a task i ∈ T be defined by its start variable si, resource usage ri ≥ 0, and duration di ≥ 0. Finally, let C ≥ 0 be the capacity of the resource. Then the Cumulative constraint is the condition that at any time point τ the cumulative resource usage of the intervals [si, si+di) covering τ does not exceed the capacity:

∀τ: ∑_{i∈T: si ≤ τ < si+di} ri ≤ C. (1)

Given a task i ∈ T, we denote its: earliest start time ESTi = LB(si), earliest completion time ECTi = LB(si) + di, latest start time LSTi = UB(si), latest completion time LCTi = UB(si) + di, and energy ei = ri × di.

Given a set of tasks Ω ⊆ T, we denote its: total duration dΩ = ∑_{i∈Ω} di, earliest start time ESTΩ = min_{i∈Ω} LB(si), latest completion time LCTΩ = max_{i∈Ω} (UB(si) + di), earliest completion time ECTΩ = max_{Ω′⊆Ω} (EST_{Ω′} + d_{Ω′}), and total energy eΩ = ∑_{i∈Ω} ei.
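
The set bounds above are straightforward to compute except for ECTΩ, whose maximum over subsets Ω′ ⊆ Ω admits a direct evaluation: after fixing the minimum earliest start time of Ω′ to ESTt for some t ∈ Ω, it is always best to include every task starting no earlier than that. A small sketch, with tasks represented as (EST, LST, duration) triples of our own choosing:

```python
def est(task): return task[0]
def lct(task): return task[1] + task[2]          # LCT_i = UB(s_i) + d_i

def set_bounds(tasks):
    """EST, LCT, and total duration of a set of tasks (Definition 2)."""
    return (min(est(t) for t in tasks),
            max(lct(t) for t in tasks),
            sum(t[2] for t in tasks))

def ect_set(tasks):
    """ECT over all subsets: fix the minimum EST and include every task
    starting no earlier, which can only increase the completion time."""
    return max(est(t) + sum(u[2] for u in tasks if est(u) >= est(t))
               for t in tasks)
```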

An important concept for the inference over Cumulative constraints is the mandatory part (Figure 1(a)), which is a time interval that is covered by a task regardless of its placement:

Definition 3.

Given a Cumulative constraint over tasks T with capacity C, we denote the mandatory part MPi of a task i ∈ T as the interval MPi := [LSTi, ECTi). We define a height at a time point τ as the total consumption of mandatory parts covering τ, that is, Height(τ) := ∑_{i∈T: τ∈MPi} ri. We also define a reduced height at a time point τ without Ω ⊆ T as Height(τ, Ω) := ∑_{i∈T∖Ω: τ∈MPi} ri. Finally, we define a profile Prx as a rectangle (ax, bx, hx) where ∀τ ∈ [ax, bx]: Height(τ) = hx, and we assume that all profiles are maximal (i.e., there are no adjacent profiles with the same height).

Disjunctive (No-Overlap) constraint.

In this paper, we focus on a special case of Cumulative where the resource has unit capacity C = 1 and each task has unit resource usage ri = 1, known as Disjunctive. Many inference procedures that are intractable for Cumulative constraints can be efficiently executed for Disjunctive constraints. The core inference that we use is infeasibility checking (Theorem 4), as illustrated by Figure 1(b).

Theorem 4 (Overload checking).

Let Ω ⊆ T be a set of variables bound by a Disjunctive constraint such that dΩ > LCTΩ − ESTΩ. Then the Disjunctive constraint has no feasible solutions [37], and any feasible solution of the CSP satisfies

(⋀_{i∈Ω} si ≥ ESTΩ − δ⁻) ∧ (⋀_{i∈Ω} si ≤ LCTΩ − di + δ⁺) → ⊥ (2)

for arbitrary δ⁻, δ⁺ ≥ 0 satisfying LCTΩ − ESTΩ + δ⁻ + δ⁺ < dΩ [38].

(a) Mandatory part illustration for a single task i.
(b) Overload checking leads to a conflict since d1 + d2 > LCT{1,2} − EST{1,2}.
Figure 1: Key concepts related to Cumulative and Disjunctive constraints.
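
As an illustration of Theorem 4, the check itself is a one-liner on top of the set_bounds helper from the sketch above; the returned window corresponds to the unlifted (δ⁻ = δ⁺ = 0) explanation in (2). This is our own minimal rendition, not the solver's implementation.

```python
def overload_check(tasks):
    """Overload checking (Theorem 4); tasks are (EST, LST, duration) triples."""
    est_set, lct_set, total_duration = set_bounds(tasks)
    if total_duration > lct_set - est_set:
        return est_set, lct_set      # conflict window used in explanation (2)
    return None

# The conflict of Figure 1(b): two tasks of duration 2 in a window of length 3.
assert overload_check([(0, 1, 2), (0, 1, 2)]) == (0, 3)
```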

3 Related Work

Much work has focused on efficiently inferring information based on the Cumulative [29, 34]. One of the approaches of primary interest in this work is edge-finding [29], as its conflict detection procedure is similar to ours (such procedures are also known as input/output consistency tests [13, Chapter 4]). A common trait of these approaches [22, 18] is that the earliest completion time of Ω ⊆ T is used to make inferences. For example, overload checking enforces the condition ECTΩ > LCTΩ, while edge-finding extends this rule to ECT_{Ω∪{i}} > LCTΩ for a task i ∉ Ω, updating the bounds of i such that it does not overflow the interval bounded by the tasks in Ω. While edge-finding takes into account the earliest start times and latest finish times of the tasks (and horizontally elastic edge-finding [18] takes into account the maximum resource usage of the tasks), this type of reasoning does not take into account the disjointness of tasks and its influence on the earliest completion time of a set of tasks. Furthermore, edge-finding reasons over a single Cumulative, whereas we propose gathering information across several Cumulative constraints to infer the earliest completion time.

Detecting disjoint pairs of tasks within a Cumulative constraint is also a viable reasoning strategy [16], known as time-table disjunctive reasoning. This strategy can detect disjoint task pairs dynamically by taking into account the time points at which other tasks are guaranteed to consume some amount of the resource. When it detects such disjointness using time-tabling [29], it determines when the bounds of one of the tasks can be updated. The main difference with our approach is that time-table disjunctive reasoning only reasons about a single Cumulative constraint, while we reason jointly over multiple constraints, and our approach infers disjointness from other constraints besides the Cumulative.

To the best of our knowledge, the only work that addresses the reasoning over multiple Cumulative constraints is the work by Beldiceanu and Carlsson [5]. This work introduces a multi-resource Cumulatives constraint as a generalisation of the Cumulative, admitting negative resource consumption and lower/upper bounds on the cumulative resource usage. However, their reasoning makes limited use of the additional information provided by reasoning over multiple Cumulative constraints, nor does it take into account disjointness.

Besides reasoning about Cumulative constraints, we consider other works that reason about disjunctive tasks. One area is branch-and-bound algorithms for scheduling problems. A common example is the resource-constrained project scheduling problem (RCPSP), for which one of the lower bounds (LB4) on the makespan is the sum of durations in a group of pairwise disjunctive tasks; similar bounds have been proposed for activity groups with at most two or at most k tasks running in parallel [24]. Thus, our approach generalises that strategy while also working in the constraint programming context, where the tasks can be (a) arbitrarily bounded by the solver and (b) involved in other constraints. This type of reasoning can also be embedded into the search tree node representation, for example, as a scheduling scheme [7], which accounts for the pairs of tasks that are either disjunctive or running in parallel.

We thus note that there are several methods that attempt to find relations between variables and to exploit these relations to infer information about the bounds. However, the relations discovered by the majority of state-of-the-art techniques are local to individual resource constraints or do not reason over the disjointness of tasks; contrary to that, our work aims to discover this disjointness dynamically during search from multiple constraints and use information across propagators to infer the earliest completion time of a set of tasks.

4 Our Contribution: Finding Disjunctive Cliques

This section will introduce our approach for aggregating information across constraints. We propose the following workflow:

  1.

    Given a set of tasks T (bound by Cumulative constraints), introduce new variables δij for all i, j ∈ T and reify them as δij → ([si, si+di) ∩ [sj, sj+dj) = ∅). (Subsection 4.2)

  2.

    Introduce the reification variables δ into other constraints (such as the Cumulative and difference logic) and extend their propagation logic to derive implications of the form ⟨variable bounds⟩ → δij, and to propagate variable bounds based on δ. (Subsection 4.4)

  3.

    Introduce our novel SelectiveDisjunctive propagator, which derives lower bounds on the earliest completion time of sets of pairwise disjoint tasks given (a) pairs of tasks {i, j} such that δij = 1, and (b) the bounds on start times, to discover conflicts. (Subsection 4.3)

Our approach allows us to gain a global view across multiple constraints rather than relying on single constraints to independently make local inferences, which limits information exchange between propagators. Thus, our approach allows propagators to communicate more elaborate facts – such as the disjointness of a pair of tasks – to the CP solver and to each other, as opposed to existing methods focusing on a local view.

4.1 Motivating Example: 𝟑𝒏 Problem

Let us first look at a case where reasoning over a single resource is insufficient: Suppose that we are scheduling six tasks, each with unit duration. At every time unit, we are given a fixed amount of resources of types I, II, and III. The amount of resources consumed by each task is illustrated in Figure 2(a). How much time has to pass before all six tasks are completed?

This is possible to do in six units of time by sequentially scheduling the tasks in any order. What cannot be inferred by reasoning over single constraints (such as time-tabling or energetic reasoning) is that this schedule is the fastest possible (in terms of the latest completion time). To see why, observe that the two yellow tasks cannot be run in parallel due to the shortage of resource I, and the same is true for the two green tasks and for the two blue tasks by similar reasoning. On the other hand, scheduling a yellow and a green task in parallel is not possible either, as it would overflow resource II; the same holds for a green and a blue task, as well as for a blue and a yellow task. By this point, we have considered all pairs of tasks and concluded for each of them that they cannot be run in parallel, making it impossible to improve upon the sequential schedule.

(a) Conflict graph of the motivating example. Edge colour indicates resource, solid edges connect tasks jointly consuming 140%, and dotted edges connect tasks jointly consuming 110%. All tasks of the same colour have the same resource usage.
(b) Number of conflicts generated by CP-SAT solving 3n problem instances with d=1.
Figure 2: Motivating example: scheduling problem with tasks that are pairwise disjoint due to various resource constraints and how CP-SAT performs on these instances.

The previous reasoning relies only on the partition of tasks into three groups such that any pair of tasks, whether from the same group or from two different groups, is disjoint; this property still holds if we consider groups of size n instead of two. We generalise the reasoning above not only in terms of group size n but also by duration d and capacity parameters p,q,M:

Definition 5.

Given a group size n ≥ 1, duration d ≥ 1, and parameters p, q, M ≥ 1 with 2p < M < p + q < 2q, the 𝟑𝐧 problem instance is the constraint satisfaction problem defined over variables X, Y, Z with |X| = |Y| = |Z| = n and domains 𝒟(⋅) = [1, (3n−1)d]:

𝚌𝚞𝚖𝚞𝚕𝚊𝚝𝚒𝚟𝚎(X ++ Y; [d]^{2n}; [p]^n ++ [q]^n; M)
𝚌𝚞𝚖𝚞𝚕𝚊𝚝𝚒𝚟𝚎(Y ++ Z; [d]^{2n}; [p]^n ++ [q]^n; M)
𝚌𝚞𝚖𝚞𝚕𝚊𝚝𝚒𝚟𝚎(Z ++ X; [d]^{2n}; [p]^n ++ [q]^n; M)

As we established, any 3n instance is unsatisfiable, but to establish this without search, a solver needs information about all resources. However, each resource individually only implies that any solution occupies some time segment of duration 2n×d, and the remaining (n1)d time units have to be resolved by search. This suggests that any valid sequence of inferences that operates on Cumulative constraints separately has exponential length in terms of n.
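
For concreteness, here is a small generator for 3n instance data under Definition 5; the tuple layout (variables, durations, usages, capacity) is purely illustrative and not tied to any solver API.

```python
def three_n_instance(n, d, p, q, M):
    """Builds the data of the three Cumulative constraints of the 3n problem."""
    assert 2 * p < M < p + q < 2 * q, "parameter conditions of Definition 5"
    X = [f"x{i}" for i in range(n)]
    Y = [f"y{i}" for i in range(n)]
    Z = [f"z{i}" for i in range(n)]
    durations = [d] * (2 * n)
    usages = [p] * n + [q] * n            # first group uses p, second uses q
    horizon = (3 * n - 1) * d             # start-time domains are [1, (3n-1)d]
    cumulatives = [(X + Y, durations, usages, M),
                   (Y + Z, durations, usages, M),
                   (Z + X, durations, usages, M)]
    return cumulatives, horizon

# Smallest interesting setting: p=2, q=4, M=5 gives 2p=4 < 5 < p+q=6 < 2q=8,
# so every pair of tasks is disjoint under one of the three constraints.
instance, horizon = three_n_instance(n=2, d=1, p=2, q=4, M=5)
```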

We prove this, and the 3n problem thus serves a role similar to pigeonhole principle formulas [19], but for the Cumulative models instead of propositional formulas. We show that any unsatisfiability proof of a 3n problem in the C-RES proof system [21] has length exponential in n, which shows that any LCG solver requires an exponential number of steps to prove unsatisfiability, as long as it introduces neither new constraints nor new variables:

Theorem 6 (3n problem intractability).

Let 𝒫 be a C-RES proof of a CSP instance with all clauses valid for any single constraint in Definition 5 for d = 1. Then |𝒫| = Ω(1.05ⁿ).

Proof Sketch.

First, we show that 𝒫 can be encoded, with extra assumptions, as a pigeonhole problem proof over 3n variables (Appendix A). Next, we adapt the argument by Beame and Pitassi [4] to verify the exponential lower bound (Appendix B). We evaluate this insight empirically by running 3n instances for n ≤ 10 and d ≤ 5 with five solvers: two LCG solvers (Google OR-Tools CP-SAT [31] and Chuffed [9]), a non-learning CP solver (Gecode [17]), and two branch-and-cut solvers (CBC [15] and HiGHS [20]). We ran each solver with a 12-hour time limit, 1 CPU, and 4000 MB of RAM. Both the LCG and non-learning solvers exhibit exponential behaviour, as shown in Figure 2(b) for CP-SAT, and only one of the runs with n ≥ 6 proved unsatisfiability. These results suggest that LCG solvers indeed resort to exhaustive enumeration on the 3n problem. Branch-and-cut solvers have exhibited better performance in solving those instances, demonstrating another similarity with pigeonhole principle formulas. HiGHS solved all instances with d ≤ 3 in seven minutes. CBC performed worse, as it failed to solve instances with n ≥ 7; however, it successfully solved the d = 1 instances for n ≤ 6 in less than two seconds.

4.2 SelectiveDisjunctive Constraint

Now that we have motivated the necessity for using global information, we describe how we use information concerning the disjointness of tasks across multiple constraints by introducing the SelectiveDisjunctive constraint. The SelectiveDisjunctive constraint (i) generalises the Disjunctive constraint in a way that allows other propagators to indicate disjoint task pairs, and (ii) requires a pair of intervals in the assignment to be disjoint whenever the corresponding variable is true. We formalise this intuition in Definition 7.

Definition 7.

Let T be a set of tasks, and let Δ be the n×n matrix of Boolean variables indexed by T. Then a SelectiveDisjunctive(T, Δ) constraint is the condition that for any two tasks i, j ∈ T, i ≠ j, either δij = 0 or [si, si+di) ∩ [sj, sj+dj) = ∅.

Introducing these variables and SelectiveDisjunctive does not prune any valid solutions, but it allows performing two new operations: (a) fixing the variables from Δ and (b) using these Δ variables in other propagations. This allows us to encode facts – such as ‘a pair of tasks cannot overlap given a partial assignment of other tasks’ – that are not possible to encode concisely using atomic constraints over the original model variables. More specifically, we can now observe that Definition 7 embeds the standard Disjunctive constraint (or even exponentially many of them); to formalise this, we first need the following notion:

Definition 8.

Given a search state with domains 𝒟 and a SelectiveDisjunctive constraint, a conflict graph 𝒢|𝒟 is a graph with a vertex for each task i ∈ T and an edge for each pair of tasks {i, j} such that 𝒟(δij) = {1} (omitting 𝒟 when it is clear from context).

We can make two key observations based on Definition 8: (1) a conflict graph can be seen as an exponentially large collection of Disjunctive constraints, and (2) this is also a dynamically updated collection of Disjunctive constraints, provided that the other parts of a constraint solving engine can propagate the domains of the reification variables. On top of this, it suggests that any reasoning for the conventional Disjunctive constraint can be deployed for any clique of the conflict graph. In Subsection 4.3, we discuss how to find these cliques in the conflict graph and how to apply the overload checking rule for the Disjunctive in this context.

4.3 Overload checking for SelectiveDisjunctive

To clarify the intuition behind our inference strategy, consider the following example:

Example 9.

Let i, j, k be tasks with possible start times si ∈ [0,2], sj ∈ [0,2], sk ∈ [2,3] and durations di = dj = 2, dk = 3, bound by a SelectiveDisjunctive constraint such that δij = δjk = δik = 1 in the current search state. We can observe that any feasible assignment also satisfies Disjunctive([si, sj, sk], [di, dj, dk]). However, this constraint is infeasible by overload checking for {si, sj, sk}: these three tasks jointly cover seven time units, but their EST is 0 and LCT is 6, leaving only six time units for these three tasks.

 Note 10.

The example above purposefully does not specify why the Δ-variables are true: the same reasoning can be carried out as soon as those intervals are established to be disjoint. We discuss the specific strategies for deriving those facts in Subsection 4.4.

We formalise this intuition in Proposition 11 by reformulating Theorem 4 in the context of the SelectiveDisjunctive constraint.

Proposition 11 (Selective overload checking).

Let T be the set of tasks bound by a SelectiveDisjunctive constraint, and suppose that Ω ⊆ T is a clique in the conflict graph 𝒢 induced by the current variable domains. Then this constraint is infeasible if dΩ > LCTΩ − ESTΩ, and any feasible solution of the CSP satisfies

(⋀_{i,j∈Ω, i≠j} δij) ∧ (⋀_{ω∈Ω} sω ≥ ESTΩ − δ⁻) ∧ (⋀_{ω∈Ω} sω ≤ LCTΩ − dω + δ⁺) → ⊥ (3)

for arbitrary δ⁻, δ⁺ ≥ 0 satisfying LCTΩ − ESTΩ + δ⁻ + δ⁺ < dΩ.

Unlike the Disjunctive overload checking, propagation of the SelectiveDisjunctive requires an additional step before overload checking can be performed: finding cliques. Finding a clique leading to a conflict, even if one exists, is NP-complete (this can be shown by reduction from the maximum clique problem); we thus propose a heuristic in Algorithm 1 that dynamically explores some of the cliques present in the conflict graph to balance the trade-off between finding “good” cliques and the time spent finding them. The heuristic algorithm considers each node as a root for a clique 𝒞 and then adds tasks j ∈ T∖𝒞 which minimise LCT_{𝒞∪{j}} − EST_{𝒞∪{j}} (breaking ties in favour of tasks with a longer duration) while retaining the clique property. The time complexity of Algorithm 1 is 𝒪(|T|⁴).

Algorithm 1 Heuristic conflict discovery procedure.
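
The following is our hedged Python reconstruction of Algorithm 1 from the prose description above, reusing the set_bounds helper from the earlier sketch, with the conflict graph given as a set of unordered index pairs; it returns the first clique that triggers selective overload checking (Proposition 11), or None.

```python
def find_overloaded_clique(tasks, edges):
    """Heuristic sketch of Algorithm 1 over (EST, LST, duration) triples."""
    adjacent = lambda i, j: (i, j) in edges or (j, i) in edges
    for root in range(len(tasks)):
        clique = [root]
        while True:
            est_c, lct_c, d_c = set_bounds([tasks[i] for i in clique])
            if d_c > lct_c - est_c:
                return clique                    # selective overload: conflict
            candidates = [j for j in range(len(tasks)) if j not in clique
                          and all(adjacent(i, j) for i in clique)]
            if not candidates:
                break                            # clique cannot be grown further
            def growth(j):                       # least widening of the span;
                e, l, _ = set_bounds([tasks[i] for i in clique] + [tasks[j]])
                return l - e, -tasks[j][2]       # ties favour longer durations
            clique.append(min(candidates, key=growth))
    return None
```

The four nested loops (roots, clique growth, candidate filtering, and the adjacency checks) broadly account for the stated 𝒪(|T|⁴) bound.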

4.4 Disjointness mining

We described how to deploy the SelectiveDisjunctive propagator when some of the reification variables have been assigned, but we have not yet described how to infer the domain of these reification variables. In this section, we describe strategies for disjointness mining, that is, discovering, possibly during the search, pairs of tasks i,jT that are inferred to be disjoint.

Domain disjointness.

One direct source of disjointness is the current bounds of variables. The rule can be formally stated as ∀i, j ∈ T: [ESTi, LCTi) ∩ [ESTj, LCTj) = ∅ → δij = 1. We perform this check during the selective overload checking (Subsection 4.3). Given a propagation of δij via this detection, we explain the propagation according to Equation 4.

{siESTjdisjESTjif LCTiESTjsjESTidjsiESTiotherwiseδij=1 (4)

Difference logic.

Scheduling problems often include tasks dependent on the execution of other tasks. These dependencies can be provided in the form of precedence constraints, which provide static information about disjointness. For example, to encode that j starts at least δ time units after i, we can use the difference logic constraint si + δ ≤ sj.

These constraints encode stronger conditions than disjointness; for example, precedence can be seen as disjointness with ordering. Thus, we can deduce the disjointness variables from the difference constraint for appropriate parameter values (this occurs at the root level and requires no explanation). We formalise this in the following proposition:

Proposition 12.

A pair of intervals [si, si+di), [sj, sj+dj) is disjoint (i.e., δij = 1 in any feasible solution) if any of the following constraints is implied by the CSP: (a) sj − si ≥ δ for some constant δ ≥ di, or (b) si − sj ≥ δ for some constant δ ≥ dj.
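
At the root level, this deduction amounts to a single pass over the precedence constraints; a minimal sketch, with precedences as (i, j, δ) triples of our own design encoding si + δ ≤ sj (case (b) is the same scan with the roles of i and j swapped):

```python
def seed_disjointness(precedences, durations):
    """Seed conflict-graph edges from difference logic (Proposition 12)."""
    edges = set()
    for i, j, delta in precedences:      # constraint: s_i + delta <= s_j
        if delta >= durations[i]:        # case (a): i completes before j starts
            edges.add((i, j))
    return edges
```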

Cumulative.

There are several Cumulative propagation techniques that reason about when the combination of scheduling a task i and a set of tasks Ω ⊆ T∖{i} leads to resource overflows, in order to infer relations between tasks. In this work, we use resource profiles to determine when a pair of tasks is disjoint. An example of this reasoning can be seen in Example 13.

Figure 3: Example where profiles (marked in grey) cause disjointness between all tasks i,j,k.
Example 13.

In addition to the assumptions of Example 9, suppose that the problem has three Cumulative constraints as visualised in Figure 3. We can derive from Resource 1 that i and j are disjoint (δij = 1): if they were not, then they would overlap at some point τ ∈ [max(ESTi, ESTj), min(LCTi, LCTj)] = [0, 4]. However, this would cause an overflow of the capacity, implying disjointness. Similar reasoning derives that δjk = 1 and δik = 1.

In this example, just as in the motivating example (Subsection 4.1), none of the three constraints is infeasible; in fact, each can only imply one disjointness relation. However, the intersection of those constraints is correctly declared infeasible by (i) Cumulative mining and (ii) SelectiveDisjunctive overload checking. Thus, combining the mining strategies with SelectiveDisjunctive inferences can generate conflicts that are out of reach for the conventional techniques for the Cumulative constraint.

We adapt the time-table disjunctive reasoning [16] by going over all pairs of tasks i, j ∈ T and calculating their overlap oij. Then we check for each τ ∈ oij whether there exists a profile such that scheduling the tasks together with that profile would lead to an overflow. A simple implementation of this approach has time complexity 𝒪(|T|²|TT|), where TT is the set of existing profiles (i.e., the time-table). This reasoning is formally stated in the following proposition:

Proposition 14.

Consider a Cumulative constraint on tasks T with capacity C. Let the overlap between two tasks i, j ∈ T be given by oij = [max(ESTi, ESTj), min(LCTi, LCTj)). Then δij is true in any feasible solution as long as there is no point τ ∈ oij at which Height(τ, {i,j}) plus the resource consumption of i and j fits within the capacity:

(∀τ ∈ oij: ri + rj + Height(τ, {i,j}) > C) → δij = 1

Moreover, given a propagation of δij by a set of profiles 𝒫 = {(a0, b0, h0), …, (am, bm, hm)}, a0 < … < am, we explain the propagation according to Equation 5, assuming without loss of generality that ESTi ≤ ESTj. For pr, we use the big-step explanation [33].

( [sj ≥ a0] ∧ [sj ≤ bm − dj + 1]   if LCTi ≤ LCTj
  [si ≤ bm − di + 1] ∧ [sj ≥ a0]   otherwise )  ∧  (⋀_{Pr∈𝒫} pr(Pr))  →  δij = 1 (5)

The first part of the explanation in Equation 5 is based on the observation that either one task subsumes the entire interval of the other (in which case the explanation depends only on the bounds of sj), or there is a partial overlap where one task starts at the same time or before the other and ends at the same time or before the other (in which case the overlapping part is only defined by the upper bound on si and the lower bound on sj).
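
The check of Proposition 14 can be sketched directly from the definitions, computing reduced heights from mandatory parts rather than from the profile-based 𝒪(|T|²|TT|) routine described above; tasks here are (EST, LST, duration, usage) tuples with integer times, a representation of our own choosing.

```python
def mandatory_part(task):
    est, lst, d, _ = task
    return lst, est + d                      # MP_i = [LST_i, ECT_i)

def reduced_height(tasks, excluded, tau):
    """Height(tau, excluded): mandatory-part usage at tau, skipping `excluded`."""
    return sum(t[3] for k, t in enumerate(tasks)
               if k not in excluded
               and mandatory_part(t)[0] <= tau < mandatory_part(t)[1])

def timetable_disjoint(tasks, i, j, capacity):
    """True iff Proposition 14 forces delta_ij = 1."""
    lo = max(tasks[i][0], tasks[j][0])                              # max(EST_i, EST_j)
    hi = min(tasks[i][1] + tasks[i][2], tasks[j][1] + tasks[j][2])  # min(LCT_i, LCT_j)
    return all(                              # an empty overlap holds vacuously
        tasks[i][3] + tasks[j][3] + reduced_height(tasks, {i, j}, tau) > capacity
        for tau in range(lo, hi))
```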

Additionally, we adapt the propagation by Gay et al. [16] and propose a rule using δ in Equation 6 (the rule for updating si is symmetric). The rule in Equation 6 and the rule proposed by Gay et al. [16] do not subsume one another since Equation 6 can propagate based on information gained from other constraints, while the rule by Gay et al. can propagate when two tasks are not fully disjoint.

δij = 1 ∧ ECTi > ESTj ∧ LSTi < ECTj → sj ≥ ECTi (6)
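
A direct reading of Equation 6 as code, on (EST, LST, duration) triples; this is a sketch of the rule only, without the explanation machinery:

```python
def propagate_equation_6(tasks, i, j, disjoint):
    """If i and j are disjoint, i cannot complete before j's earliest start,
    and i cannot be scheduled after j completes, then j must follow i."""
    est_i, lst_i, d_i = tasks[i]
    est_j, lst_j, d_j = tasks[j]
    if disjoint and est_i + d_i > est_j and lst_i < est_j + d_j:
        tasks[j] = (max(est_j, est_i + d_i), lst_j, d_j)   # s_j >= ECT_i
    return tasks
```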

Nogood.

Last, we describe how to derive disjointness from nogoods: given an unsatisfied nogood N, tasks i, j ∈ T are disjoint if falsifying each of the unassigned atomic constraints implies δij. An example can be seen in Example 15, and it is formalised in Proposition 16.

Example 15.

Given two tasks i, j ∈ T such that si ∈ [0,9], sj ∈ [7,20] with durations di = 2, dj = 5, consider a nogood N = {pi = [si ≥ 5], pj = [sj ≤ 11], …}. Suppose that in the current search state, only the first two atomic constraints are unassigned, while the rest are satisfied. Thus, in any feasible solution, either ¬pi = [si ≤ 4] or ¬pj = [sj ≥ 12] holds. But in either case, we can derive δij via domain disjointness:

  • If ¬pi is true, then LCTi = 6 ≤ ESTj = 7, implying δij.

  • If ¬pj is true, then LCTi = 11 ≤ ESTj = 12, also implying δij.

Proposition 16.

Let 𝒟 be a set of domains, and suppose that a nogood N has no falsified atomic constraints under these domains. Then a pair of tasks i, j is disjoint if, for every atomic constraint p ∈ N not satisfied with respect to 𝒟, falsifying p implies δij by domain disjointness.

Our procedure is a simplified version of Proposition 16 based on the two-watcher scheme [28, 27]. Whenever a watcher is updated, we perform a scan to determine whether the conditions of Proposition 16 hold while reasoning over domain disjointness; this misses out on some propagations, but the time required is significantly reduced. The time complexity of this approach is 𝒪(|N|) for each nogood N. We explain this propagation due to an unsatisfied nogood N as sn(si, sj, pi, pj) ∧ (⋀_{p⁺∈N⁺} p⁺) → δij = 1, where N⁺ ⊆ N is the set of satisfied atomic constraints, and px = [sx ⋈ v] is one of the two undecided atomic constraints. See Appendix C for the definition of sn.
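
Example 15 generalises to the following scan, which is a simplified sketch of Proposition 16 (without the watcher bookkeeping): negate each unassigned atom on a copy of the bounds and test domain disjointness. Atoms are (variable, op, value) triples of our own design with op ∈ {">=", "<="}.

```python
def nogood_implies_disjoint(unassigned_atoms, i, j, tasks):
    """`tasks` maps names to (EST, LST, duration); True iff delta_ij follows."""
    for var, op, v in unassigned_atoms:
        est, lst, d = tasks[var]
        if op == ">=":                  # negation of [s >= v] is [s <= v - 1]
            probe_bounds = (est, min(lst, v - 1), d)
        else:                           # negation of [s <= v] is [s >= v + 1]
            probe_bounds = (max(est, v + 1), lst, d)
        probe = dict(tasks)
        probe[var] = probe_bounds
        (ei, li, di), (ej, lj, dj) = probe[i], probe[j]
        if not (li + di <= ej or lj + dj <= ei):   # no LCT <= EST in either order
            return False                # this falsification leaves an overlap
    return True

# Example 15: both falsifications separate the two tasks, so delta_ij holds.
tasks = {"i": (0, 9, 2), "j": (7, 20, 5)}
assert nogood_implies_disjoint([("i", ">=", 5), ("j", "<=", 11)], "i", "j", tasks)
```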

5 Experiments

We implemented our approach as part of Pumpkin [12]. First, we present an ablation study in Subsection 5.1, which shows that (a) dynamic disjointness mining hinders the learning of the solver but can be beneficial on specific instances, (b) an alternative efficient sorting heuristic has a significant impact on the search but can also lead to massive slowdowns on some instances, and (c) enabling the propagation in Equation 6 does not consistently change the performance. Next, we run a comparison on RCPSP and RCPSP/max benchmarks in Subsection 5.2 with baseline Pumpkin and CP-SAT, which shows that our approach can accelerate search by orders of magnitude on certain instances. Additionally, we compare our results with bounds reported in the literature and show that our approach discovers new lower bounds on 16 RCPSP/max instances and four RCPSP instances, as well as new upper bounds on two RCPSP/max instances and four RCPSP instances.

Experiment Setup.

The implementation of our approach with the infrastructure for running the experiments is available in the supplementary material. We ran our experiments on DelftBlue [11], with each run of an instance being allocated a single core of an Intel Xeon E5-6248R 24C 3.0GHz processor and 4000 MB of RAM with a time limit of one hour.

For our evaluation, we use two models that minimise the latest completion time of all tasks (makespan). One is RCPSP/max, a collection of Cumulative constraints and precedence relations encoded by the constraints si + γij ≤ sj with arbitrary constants γij; we use the test suites distributed by PSPLIB [25], namely, the C, D, UBO, and SM suites. We also use RCPSP, a special case of RCPSP/max where all precedence relations have the form si + di ≤ sj; we use the data files provided in the MiniZinc benchmarks [36], which correspond to the AT [30], BL [3], Pack [8], Pack-d and KSD15-D [26], and PSPLIB [25] suites.

Given a minimisation objective 𝒪, each of the solvers is run in one of two search directions. In the primal search, the solver generates a series of problems with an extra assumption 𝒪 ≤ on for decreasing values of on. In the dual search, the solver generates a series of problems with an extra assumption 𝒪 ≤ on for increasing values of on.

We discard all the instances solved by the Pumpkin baseline within five seconds in either search direction, leaving us with 736 RCPSP instances and 349 RCPSP/max instances. The models we used and the data files are available in the supplementary material. We evaluate the rate of progress of a solver towards the best-known bound with the following metrics. First, if M(t) is the lowest makespan discovered at time t ∈ [0,T], and M* is the lowest discovered makespan for this problem, then the primal integral is ∫_0^T (M(t) − M*)/M(t) dt and measures how fast the solver progresses towards good solutions [6]. Conversely, if B(t) is the highest lower bound discovered at time t ∈ [0,T], and B* is the highest discovered lower bound for this problem, the dual integral is defined as ∫_0^T (B* − B(t))/B* dt.
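
Since M(t) and B(t) are step functions that change only when the solver reports an improvement, both integrals reduce to finite sums; a small sketch over a log of (time, value) events (assumed to start when the first bound becomes available), with names of our own choosing:

```python
def step_integral(events, T, integrand):
    """Integrate a step function given by sorted (time, value) events up to T."""
    total = 0.0
    for (t0, value), (t1, _) in zip(events, events[1:] + [(T, None)]):
        total += integrand(value) * (t1 - t0)   # constant between improvements
    return total

def primal_integral(events, T, best_makespan):
    return step_integral(events, T, lambda m: (m - best_makespan) / m)

def dual_integral(events, T, best_bound):
    return step_integral(events, T, lambda b: (best_bound - b) / best_bound)
```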

5.1 Ablation Study

To evaluate the influence of individual components on the performance of the whole approach, we ran the disjunctive approach, both in primal and dual search directions, with all sixteen combinations of enabling and disabling the following features:

Cumulative mining

Enable the mining for Cumulative as described in Subsection 4.4.

Nogood mining

Enable the mining for nogoods as described in Subsection 4.4.

Sorting

Restrict Ω⁺ in Algorithm 1 to tasks j with LCTj ≤ min_{i∈Ω} LCTi.

Propagation

Execute the propagation shown in Equation 6 on discovered disjunctive pairs.

Figure 4(a) reports the relative gains from enabling each feature. For each feature, we consider all combinations of the remaining three features and evaluate the ratio between the aforementioned integrals when adding the feature to this configuration and without the feature.

(a) Distribution of ratio of integrals of enabling each evaluated feature and not enabling the feature. The vertical line in the middle corresponds to no observed difference, a ratio larger than 1 corresponds to an improvement by using the feature, and values smaller than 1 correspond to performance degradation.
(b) Change in LBD split by change in # conflicts with cumulative mining compared to baseline; a negative value indicates a decrease in the LBD when using mining.
Figure 4: Impact of enabling the listed features on the solver performance.

One of the major takeaways is that cumulative mining has a predominantly negative impact on search performance. As we understand it, this happens because nogood learning derives statements overly specific to the current search state and thus leads to less general nogoods. To support this conclusion, we compare the average literal block distance (LBD) [1] of the nogoods produced with and without cumulative mining in Figure 4(b) (if two runs reached different solutions, we compare the number of conflicts to the best common bound), as this metric indicates learned nogood quality. In 47% of all pairs of runs, mining leads to fewer conflicts, while having a relatively small impact on LBD; however, in the other half of run pairs, mining not only increases the number of conflicts but also exhibits a much larger increase in the LBD.

On the positive side, we have discovered that cumulative mining can be a beneficial strategy for instances where many triples of tasks – but few pairs of them – are mutually disjoint. In that case, fixing the position of a few tasks can uncover a variety of disjoint pairs, aiding the conflict discovery. In particular, this is the case for several instances from the Pack-d collection; we ran the mining configuration of our approach on those instances and observed that the non-mining implementation discovers few conflicts (and thus makes little progress), whereas the mining implementation discovers several bounds tighter than reported by all the other approaches (see Appendix D).

Another takeaway is that the sorting heuristic has a significant influence on the search. Enabling it reduces integrals for most instances, suggesting that the bottleneck of our approach is the clique search, and most of the iterations of Algorithm 1 do not yield conflicts. However, there is a group of instances for which the dual integral degrades by three orders of magnitude or more; for this reason, we only use the sorting heuristic when the disjunctive approach runs in the primal search direction.

Last, enabling propagation does not exhibit a consistent change in any direction; the primal search can substantially benefit from propagation, but fails to do so consistently, and the dual search is left virtually uninfluenced by the propagation. A similar conclusion holds for using nogood mining in dual search, albeit the performance of primal search becomes consistently worse with nogood mining enabled.

As a result, we use the following configurations: enable sorting and disable propagation and mining for primal search, and disable all additional features for dual search.

5.2 Scheduling Problem Evaluation

In order to establish whether our approach is beneficial, we compare the disjunctive approach with baseline Pumpkin and CP-SAT v9.11 using the aforementioned integrals.

We start by evaluating the influence of adding disjointness reasoning to the baseline Pumpkin. As seen from Figure 5: (1) the disjunctive approach proves much more successful when using dual search as opposed to primal search, and (2) the disjunctive approach commonly translates to much larger improvements in cases where it does not hinder the search, while it can also degrade the performance on a number of instances.

(a) Primal integrals.
(b) Dual integrals.
Figure 5: Comparison of integrals of baseline and disjunctive approaches. Both axes are logarithmic; lines are evenly spaced and correspond to a tenfold relative change between the integrals.

Figure 5(b) suggests that the disjunctive approach can accelerate the search by at least three orders of magnitude, which indicates that our approach can improve state-of-the-art bounds. Indeed, our approach discovers a variety of lower bounds that are both better than the bounds previously reported and better than the bounds discovered by all other evaluated approaches. Appendix D introduces a precise criterion for reporting the bounds and the complete list of the novel bounds.

Remarkably, our approach is able to close five of the previously open RCPSP/max J30 instances in under a second. For each of those instances, there is a disjoint clique Ω at the root level with dΩ equal to the optimal bound. In all five cases, these cliques correspond to various resource constraints, making them similar to the aforementioned 3n instances; fittingly, CP-SAT has been able to certify only one of them after ten minutes of search, whereas the other instances time out with all other approaches. To clarify this point, we have reproduced and justified such a clique for one of those instances in Appendix E, thereby closing one of the previously open instances without any search.

Both plots in Figure 5 suggest that instance features have a major influence on the change in performance. To establish which instance properties aid our approach, we introduce a measure of similarity of a Cumulative constraint to a Disjunctive constraint.

Definition 17.

For a Cumulative constraint over a resource with capacity C and tasks T, we call the quantity RC = (1/|T|) ∑_{i∈T} ri/C the constrainedness of that Cumulative. Given a CSP, we say that its resource constrainedness RC is the maximum constrainedness among its Cumulative constraints. A larger value indicates more resource contention.
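
Definition 17 translates directly into code; a sketch over (usages, capacity) pairs, one per Cumulative constraint:

```python
def resource_constrainedness(cumulatives):
    """Maximum mean usage-to-capacity ratio over all Cumulative constraints."""
    return max(sum(r / capacity for r in usages) / len(usages)
               for usages, capacity in cumulatives)

# One resource of capacity 10 shared by three tasks: RC = (0.4 + 0.5 + 0.3) / 3.
rc = resource_constrainedness([([4, 5, 3], 10)])
assert abs(rc - 0.4) < 1e-12
```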

Figure 6 demonstrates the impact of the resource constrainedness: there is exactly one instance with an RC under 15% which benefits from using the disjunctive approach, whereas instances with a “scarce” resource (i.e., RC ≥ 40%) account for the largest improvements and have the most variability. The performance on the instances between those two values mostly degrades, with a small number of instances seeing minor improvements.

Figure 6: Distribution of dual integral ratio of baseline and disjunctive approach for various values of resource constrainedness. Points to the right of the vertical line correspond to improvements by the disjunctive approach, and points to the left correspond to deterioration.

Last, we evaluate our approach against CP-SAT. Figure 7(a) shows that the baseline consistently performs better on PSPLIB, with worse performance on pack(d) instances. On the other hand, Figure 7(b) indicates that the disjunctive approach has a much smaller edge over CP-SAT, yet it can improve on certain instances by many orders of magnitude.

(a) Baseline.
(b) Disjunctive.
Figure 7: Comparison of Pumpkin and CP-SAT dual integrals. All axes are logarithmic; lines on the plots are evenly spaced and correspond to a tenfold relative change.

6 Conclusions

We have presented a novel way to aggregate information about the disjointness of tasks in scheduling problems across multiple constraints. We do this by creating new variables representing the disjointness between pairs of tasks, mining for this disjointness during search, and heuristically finding disjoint cliques that lead to conflicts. We show that the “standard” approaches require exponentially many conflicts to prove optimality on crafted instances, while our approach can show their infeasibility instantaneously. We also show that our approach provides improvements of orders of magnitude for benchmark instances with high resource contention, indicating that reasoning across multiple constraints during search is of vital importance for inferring the global structure of instances.

One direction for future work would be determining what is a good clique. Another direction would be to look into incorporating more Disjunctive reasoning over the cliques. Additionally, future work could focus on mining jointness rather than disjointness.

References

  • [1] Gilles Audemard and Laurent Simon. Refining restarts strategies for SAT and UNSAT. In Proceedings of the 18th international conference on principles and practice of constraint programming, pages 118–126, Berlin, Heidelberg, 2012. Springer-Verlag. doi:10.1007/978-3-642-33558-7_11.
  • [2] Philippe Baptiste, Claude Le Pape, and Wim Nuijten. Satisfiability tests and time-bound adjustments for cumulative scheduling problems. Annals of operations research, 92:305–333, 1999. doi:10.1023/a:1018995000688.
  • [3] Philippe Baptiste and Claude Le Pape. Constraint propagation and decomposition techniques for highly disjunctive and highly cumulative project scheduling problems. Constraints: an international journal, 5(1/2):119–139, January 2000. doi:10.1023/a:1009822502231.
  • [4] Paul Beame and Toniann Pitassi. Simplified and improved resolution lower bounds. In Proceedings of the 37th annual symposium on foundations of computer science, FOCS ’96, page 274, USA, 1996. IEEE Computer Society.
  • [5] Nicolas Beldiceanu and Mats Carlsson. A new multi-resource cumulatives constraint with negative heights. In Proceedings of the 8th international conference on principles and practice of constraint programming, CP ’02, pages 63–79, Berlin, Heidelberg, 2002. Springer-Verlag. doi:10.1007/3-540-46135-3_5.
  • [6] Timo Berthold. Measuring the impact of primal heuristics. Operations research letters, 41(6):611–614, 1 November 2013. doi:10.1016/j.orl.2013.08.007.
  • [7] Peter Brucker, Sigrid Knust, Arno Schoo, and Olaf Thiele. A branch and bound algorithm for the resource-constrained project scheduling problem. European journal of operational research, 107(2):272–288, 1 June 1998. doi:10.1016/s0377-2217(97)00335-4.
  • [8] Jacques Carlier and Emmanuel Néron. On linear lower bounds for the resource constrained project scheduling problem. European journal of operational research, 149(2):314–324, 1 September 2003. doi:10.1016/s0377-2217(02)00763-4.
  • [9] Geoffrey Chu, Peter J Stuckey, Andreas Schutt, Thorsten Ehlers, Graeme Gange, and Kathryn Francis. Chuffed: The chuffed CP solver. https://github.com/chuffed/chuffed.
  • [10] Rina Dechter. Learning while searching in constraint-satisfaction-problems. In Proceedings of the fifth AAAI national conference on artificial intelligence, AAAI’86, pages 178–183, Philadelphia, Pennsylvania, 1986. AAAI Press.
  • [11] Delft High Performance Computing Centre. DelftBlue, 2024.
  • [12] Emir Demirović, Maarten Flippo, Imko Marijnissen, Konstantin Sidorov, and Jeff Smits. Pumpkin: a lazy clause generation constraint solver in Rust. https://github.com/ConSol-Lab/Pumpkin, 2024.
  • [13] Ulrich Dorndorf. Consistency tests. In Project scheduling with time windows, chapter 4, pages 31–65. Physica-Verlag HD, Heidelberg, 2002. doi:10.1007/978-3-642-57506-8_4.
  • [14] Thibaut Feydy and Peter J Stuckey. Lazy clause generation reengineered. In Proceedings of the 15th international conference on principles and practice of constraint programming, CP’09, pages 352–366, Berlin, Heidelberg, 2009. Springer-Verlag. doi:10.1007/978-3-642-04244-7_29.
  • [15] John Forrest, Ted Ralphs, Stefan Vigerske, Haroldo Gambini Santos, John Forrest, Lou Hafer, Bjarni Kristjansson, jpfasano, EdwinStraver, Jan-Willem, Miles Lubin, rlougee, a-andre, jpgoncal, Samuel Brito, h-i-gassmann, Cristina, Matthew Saltzman, tosttost, Bruno Pitrus, Fumiaki Matsushima, Patrick Vossler, Ron @ Swgy, and to-st. coin-or/Cbc: Release releases/2.10.12, 2024. doi:10.5281/ZENODO.13347261.
  • [16] Steven Gay, Renaud Hartert, and Pierre Schaus. Time-table disjunctive reasoning for the cumulative constraint. In Integration of AI and OR techniques in constraint programming, Lecture notes in computer science, pages 157–172, Cham, 2015. Springer International Publishing. doi:10.1007/978-3-319-18008-3_11.
  • [17] Gecode Team. Gecode: Generic constraint development environment. http://www.gecode.org, 2006.
  • [18] Vincent Gingras and Claude-Guy Quimper. Generalizing the edge-finder rule for the cumulative constraint. In Proceedings of the twenty-fifth international joint conference on artificial intelligence, IJCAI’16, pages 3103–3109. AAAI Press, 9 July 2016. doi:10.5555/3061053.3061056.
  • [19] Armin Haken. The intractability of resolution. Theoretical computer science, 39:297–308, 1985. doi:10.1016/0304-3975(85)90144-6.
  • [20] Q Huangfu and J A J Hall. Parallelizing the dual revised simplex method. Mathematical programming computation, 10(1):119–142, March 2018. doi:10.1007/s12532-017-0130-5.
  • [21] Joey Hwang and David G Mitchell. 2-way vs. d-way branching for CSP. In Principles and practice of constraint programming - CP 2005, Lecture notes in computer science, pages 343–357, Berlin, Heidelberg, 1 October 2005. Springer Berlin Heidelberg. doi:10.1007/11564751_27.
  • [22] Roger Kameugne, Laure Pauline Fotso, Joseph Scott, and Youcheu Ngo-Kateu. A quadratic edge-finding filtering algorithm for cumulative resource constraints. Constraints: an international journal, 19(3):243–269, July 2014. doi:10.1007/s10601-013-9157-z.
  • [23] Richard M Karp. Reducibility among combinatorial problems. In Complexity of computer computations, pages 85–103, Boston, MA, 1972. Springer US. doi:10.1007/978-1-4684-2001-2_9.
  • [24] Robert Klein and Armin Scholl. Computing lower bounds by destructive improvement: An application to resource-constrained project scheduling. European journal of operational research, 112(2):322–346, January 1999. doi:10.1016/s0377-2217(97)00442-6.
  • [25] Rainer Kolisch and Arno Sprecher. PSPLIB - a project scheduling problem library. European journal of operational research, 96(1):205–216, January 1997. doi:10.1016/s0377-2217(96)00170-1.
  • [26] Oumar Koné, Christian Artigues, Pierre Lopez, and Marcel Mongeau. Event-based MILP models for resource-constrained project scheduling problems. Computers & operations research, 38(1):3–13, 1 January 2011. doi:10.1016/j.cor.2009.12.011.
  • [27] Joao Marques-Silva, Ines Lynce, and Sharad Malik. Conflict-driven clause learning SAT solvers. In Handbook of satisfiability, Frontiers in artificial intelligence and applications, chapter 4. IOS Press, 2 February 2021. doi:10.3233/faia200987.
  • [28] Matthew W Moskewicz, Conor F Madigan, Ying Zhao, Lintao Zhang, and Sharad Malik. Chaff: engineering an efficient SAT solver. In Proceedings of the 38th design automation conference. ACM, 2002. doi:10.1109/dac.2001.935565.
  • [29] Wim Nuijten. Time and resource constrained scheduling: a constraint satisfaction approach. PhD thesis, Technische Universiteit Eindhoven, Eindhoven, 1994. doi:10.6100/IR431902.
  • [30] Ramón Alvarez-Valdés Olaguíbel and José Manuel Tamarit Goerlich. Heuristic algorithms for resource-constrained project scheduling: A review and an empirical analysis. Advances in project scheduling, pages 113–134, 1989.
  • [31] Laurent Perron and Frédéric Didier. CP-SAT, 7 May 2024.
  • [32] Francesca Rossi, Peter van Beek, and Toby Walsh, editors. Handbook of constraint programming. Foundations of artificial intelligence. Elsevier Science, London, England, 18 August 2006. doi:10.1016/s1574-6526(06)x8001-x.
  • [33] Andreas Schutt, Thibaut Feydy, Peter J Stuckey, and Mark G Wallace. Explaining the cumulative propagator. Constraints: an international journal, 16(3):250–282, July 2011. doi:10.1007/s10601-010-9103-2.
  • [34] Andreas Schutt, Thibaut Feydy, Peter J Stuckey, and Mark G Wallace. Solving RCPSP/max by lazy clause generation. Journal of scheduling, 16(3):273–289, June 2013. doi:10.1007/s10951-012-0285-x.
  • [35] Peter J Stuckey. RCPSP. https://people.eng.unimelb.edu.au/pstuckey/rcpsp/. Accessed: 2025-3-29.
  • [36] Peter J Stuckey, Thibaut Feydy, Andreas Schutt, Guido Tack, and Julien Fischer. The MiniZinc Challenge 2008–2013. AI magazine, 35(2):55–60, 1 June 2014. doi:10.1609/aimag.v35i2.2539.
  • [37] Petr Vilím. O(n log n) filtering algorithms for unary resource constraint. In Jean-Charles Régin and Michel Rueher, editors, Proceedings of CP-AI-OR 2004, volume 3011 of Lecture Notes in Computer Science, pages 335–347, Nice, France, April 2004. Springer-Verlag. doi:10.1007/978-3-540-24664-0_23.
  • [38] Petr Vilím. Computing explanations for the unary resource constraint. In Roman Barták and Michela Milano, editors, Integration of AI and OR techniques in constraint programming for combinatorial optimization problems, second international conference, CPAIOR 2005, Prague, Czech Republic, May 30 - June 1, 2005, volume 3524 of Lecture Notes in Computer Science, pages 396–409. Springer, 2005. doi:10.1007/11493853_29.
  • [39] Petr Vilím. Timetable edge finding filtering algorithm for discrete cumulative resources. In Proceedings of the 8th international conference on integration of Ai and OR techniques in constraint programming for combinatorial optimization problems, CPAIOR’11, pages 230–245, Berlin, Heidelberg, 2011. Springer-Verlag. doi:10.1007/978-3-642-21311-3_22.
  • [40] Petr Vilím, Philippe Laborie, and Paul Shaw. Failure-directed search for constraint-based scheduling. In CPAIOR ’15: proceedings of the 12th international conference on integration of AI and OR techniques in constraint programming for combinatorial optimization problems, Barcelona, Spain, 2015. Springer-Verlag.

Appendix A Conversion of $3n$ problem proofs into pigeonhole principle pseudo-proofs

For convenience, we reproduce the definition of the C-RES proof system, adjusting for the notation differences between our paper and the work of Hwang and Mitchell [21].

Definition 18.

Given a set of variables $\mathcal{X}$ with domains $\mathcal{D}$, a clause is a constraint of the form $\left(\bigvee_{j=1}^{N} [x_j^+ = v_j^+]\right) \vee \left(\bigvee_{j=1}^{M} \neg[x_j^- = v_j^-]\right)$ with $x_j^\pm \in \mathcal{X}$ and $v_j^\pm \in \mathcal{D}(x_j^\pm)$.

Definition 19.

Given a CSP $(\mathcal{X}, \mathcal{D}, \mathcal{C})$, a sequence of clauses $\mathcal{P}$ is an (unsatisfiability) proof of this CSP if the last clause is empty, and every clause $\omega \in \mathcal{P}$ is produced by one of the following rules:

Resolution

$\omega$ can be represented as $\omega' \vee \omega''$ for some clauses $\omega', \omega''$ such that $\omega' \vee [x = v]$ and $\omega'' \vee \neg[x = v]$ are clauses encountered earlier in the proof. We denote the result of this operation as $\omega' \otimes \omega''$.

Domain clause

$\omega = \bigvee_{v \in \mathcal{D}(x)} [x = v]$ for some variable $x$.

Unique value clause

$\omega = \neg[x = u] \vee \neg[x = v]$ for some variable $x$ and two distinct values $u, v \in \mathcal{D}(x)$.

Constraint clause

$\omega$ is implied by some constraint $c \in \mathcal{C}$, that is, any solution satisfying $c$ also satisfies $\omega$.

Each step of the proof is either a “consistency” clause encoding that any variable is assigned to exactly one value, a constraint clause (which in our case corresponds to a propagation), or a resolution of two clauses (which typically corresponds to branching or clause learning). The original C-RES definition by Hwang and Mitchell also restricts $\mathcal{C}$ to only contain nogoods, and thus the constraint clauses are the negations of nogoods in $\mathcal{C}$. Again, that difference does not impact the proof definition, because any constraint can be replaced by the set of all nogoods it implies, which affects the size of $\mathcal{C}$ but not the proof length.
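To make these rules concrete, the following minimal Python sketch (our illustration; the names and encoding are ours, not the solver's implementation) represents clauses over atomic predicates $[x = v]$ and implements the rules of Definition 19:

```python
from typing import FrozenSet, Tuple

# A literal is an atomic predicate [x = v] with a polarity; a clause is a set of literals.
Literal = Tuple[str, int, bool]  # (variable, value, positive?)
Clause = FrozenSet[Literal]

def resolve(c1: Clause, c2: Clause, x: str, v: int) -> Clause:
    """Resolution rule: from c1 = w' v [x = v] and c2 = w'' v not-[x = v], derive w' v w''."""
    pos, neg = (x, v, True), (x, v, False)
    assert pos in c1 and neg in c2, "premises must contain complementary literals"
    return (c1 - {pos}) | (c2 - {neg})

def domain_clause(x: str, domain: range) -> Clause:
    """Domain clause: x takes at least one of its domain values."""
    return frozenset((x, v, True) for v in domain)

def unique_value_clause(x: str, u: int, v: int) -> Clause:
    """Unique value clause: x does not take two distinct values u and v."""
    assert u != v
    return frozenset([(x, u, False), (x, v, False)])

# Toy refutation: if constraint clauses forbid both values of a variable with
# domain {1, 2}, two resolution steps derive the empty clause.
dom = domain_clause("x", range(1, 3))                       # [x=1] v [x=2]
step = resolve(dom, frozenset([("x", 1, False)]), "x", 1)   # [x=2]
empty = resolve(step, frozenset([("x", 2, False)]), "x", 2)
assert empty == frozenset()
```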

In this appendix, we show that proofs of $3n$ instances can be encoded as proofs of the pigeonhole formula [19] $\mathrm{PHP}_{3n}$, together with clauses encoding pigeonhole subformulas over up to $2n$ variables. We use the following definition of the pigeonhole problem in this text:

Definition 20.

A pigeonhole formula $\mathrm{PHP}_m$ is the unsatisfiable propositional formula on $(m-1) \times m$ variables $p_{i,j}$, $1 \le i \le m-1$, $1 \le j \le m$, having the following clauses:

Pigeon clauses

Any pigeon is placed in a hole: $p_{1,i} \vee p_{2,i} \vee \dots \vee p_{m-1,i}$ for $1 \le i \le m$.

Hole clauses

No hole contains two pigeons: $\bar{p}_{i,j} \vee \bar{p}_{i,k}$ for $1 \le i \le m-1$, $1 \le j < k \le m$.
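As a sanity check on this definition, the following snippet (illustrative; the literal encoding is ours) enumerates the clauses of $\mathrm{PHP}_m$, where a variable $(i, j)$ states that hole $i$ contains pigeon $j$:

```python
from itertools import combinations

def php_clauses(m: int):
    """All clauses of PHP_m; a literal ((i, j), polarity) refers to variable p_{i,j},
    i.e., hole i (1 <= i <= m - 1) contains pigeon j (1 <= j <= m)."""
    # Pigeon clauses: every pigeon j is placed in some hole.
    pigeon = [[((i, j), True) for i in range(1, m)] for j in range(1, m + 1)]
    # Hole clauses: no hole i contains two distinct pigeons j < k.
    hole = [[((i, j), False), ((i, k), False)]
            for i in range(1, m)
            for j, k in combinations(range(1, m + 1), 2)]
    return pigeon + hole

# PHP_4 has 4 pigeon clauses and (m - 1) * C(m, 2) = 3 * 6 = 18 hole clauses.
assert len(php_clauses(4)) == 4 + 18
```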

To simplify the description of the encoding, we switch to a more lenient notion of a proof proposed by Beame and Pitassi [4]:

Definition 21.

Given a formula $\mathrm{PHP}_m$, an assignment $\alpha$ is called critical if some $(m-1)$ pigeons are assigned to $(m-1)$ distinct holes, that is, $p_{1,\pi_1}, \dots, p_{m-1,\pi_{m-1}}$ are satisfied by $\alpha$ for some pairwise different $\pi_j \in [1, m]$, $1 \le j \le m-1$; if additionally the only unassigned pigeon has index $k$, then $\alpha$ is called $k$-critical.

A pair of clauses $\omega', \omega''$ is congruent if $\omega'$ and $\omega''$ take the same truth value under any critical assignment. If any critical assignment satisfying clauses $\omega'$ and $\omega''$ also satisfies another clause $\rho$, then $\rho$ is a critical implication of $\omega'$ and $\omega''$.

One of the immediate simplifications resulting from the notion of congruency is that we can assume without loss of generality that all clauses defined on pigeonhole variables are defined without negations:

Proposition 22 (Beame and Pitassi [4], Section 3).

Given a formula $\mathrm{PHP}_m$, any clause $\omega$ is congruent to a positive clause $\omega^+$ obtained by replacing every literal $\bar{p}_{i,j} \in \omega$ of negative polarity by the disjunction $\bigvee_{k \in [1,m],\, k \ne j} p_{i,k}$ of variables over all pigeons except the $j$-th one.
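A small brute-force check of this proposition on $\mathrm{PHP}_4$ (our illustration, reusing the variable convention from the sketch above):

```python
from itertools import combinations, permutations

def critical_assignments(m: int):
    """All critical assignments of PHP_m: (m - 1) of the m pigeons placed bijectively
    into the (m - 1) holes. Each assignment is the set of true variables (hole, pigeon)."""
    for pigeons in combinations(range(1, m + 1), m - 1):
        for holes in permutations(range(1, m)):
            yield {(holes[t], pigeons[t]) for t in range(m - 1)}

# Under every critical assignment, the negative literal not-p_{i,j} agrees with
# the positive disjunction over all pigeons except the j-th (Proposition 22).
m, i, j = 4, 2, 3
for alpha in critical_assignments(m):
    negative = (i, j) not in alpha
    positive = any((i, k) in alpha for k in range(1, m + 1) if k != j)
    assert negative == positive
```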

The core result of this appendix is the reduction from $3n$ proofs to pigeonhole proofs:

Lemma 23.

Any proof $\mathcal{P}$ of a $3n$ problem instance can be encoded into a sequence of clauses $\mathcal{P}'$ with $|\mathcal{P}'| \le |\mathcal{P}|$ such that the final clause is empty and any clause is either a pigeon clause of $\mathrm{PHP}_{3n}$ (hole clauses are tautological with respect to critical assignments), a critical implication of previous clauses, or a subproblem clause of the form

$$\sigma(K, V) := \bigvee_{k \in [1, 3n-1] \setminus K,\; v \in V} p_{k,v}, \qquad |V| > |K| \qquad (7)$$

with $1 < |V| \le 2n$.
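For concreteness, the subproblem clause of Equation 7 can be generated as follows (an illustrative helper in the same encoding as the sketches above):

```python
def sigma(K: frozenset, V: frozenset, n: int):
    """Subproblem clause sigma(K, V) from Equation 7: positive literals p_{k,v} over
    all values (holes) k in [1, 3n - 1] outside K and variable indices (pigeons) v in V."""
    assert len(K) < len(V) <= 2 * n
    return [((k, v), True) for k in range(1, 3 * n) if k not in K for v in sorted(V)]

# Example for n = 2, K = {1}, V = {2, 3}: (3n - 1 - |K|) * |V| = 8 literals.
assert len(sigma(frozenset({1}), frozenset({2, 3}), 2)) == 8
```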

We start by replacing the original $3n$ problem with a simpler formulation that does not invalidate unsatisfiability proofs of $3n$ problems.

Definition 24.

Given a set of variables $X$, the $\texttt{Alldifferent}(X)$ constraint is true for the solutions in which any two distinct variables $x, y \in X$ are assigned different values.

Lemma 25.

Any proof of a $3n$ problem instance is also a proof of the CSP on the same variables with constraints $\texttt{Alldifferent}(G)$ for $G \in \{X \cup Y,\ Y \cup Z,\ Z \cup X\}$.

Proof.

Let $\mathcal{P}$ be a proof of the $3n$ problem instance, and let $\omega$ be a constraint clause in $\mathcal{P}$ with respect to the original $3n$ instance. We show that $\omega$ is also a constraint clause with respect to the CSP in the lemma statement. This is sufficient to show that $\mathcal{P}$ is also a valid proof of the new problem, because the resolution steps only depend on the previous clauses. As for the domain and unique value clauses, they only depend on the underlying variable definitions, which are the same for the two problem instances, so they remain valid for the new problem instance.

Without loss of generality, suppose that the clause $\omega$ is implied by the first constraint in (5). Consider an arbitrary solution that falsifies $\omega$ and thus also falsifies the Cumulative constraint in question. Any overflow of this constraint can be described as either scheduling two tasks from $X$ in parallel, scheduling one task from $X$ and one task from $Y$ in parallel, or scheduling three tasks from $Y$ in parallel. In each case, the assignment violates $\texttt{Alldifferent}(X \cup Y)$. Thus, $\texttt{Alldifferent}(X \cup Y)$ is false whenever $\omega$ is false, or, in other words, $\omega$ is true whenever $\texttt{Alldifferent}(X \cup Y)$ is true.

We now complete the reduction to pigeonhole proofs as follows:

Proof of Lemma 23.

Introduce a one-to-one mapping between atomic predicates in $\mathcal{P}$ and pigeonhole variables with $[x_j = v] \mapsto p_{v,j}$ for some ordering of variables $\mathcal{X} = \{x_1, \dots, x_{3n}\}$; we further refer to this as the canonical mapping. We show that the sequence $\mathcal{P}'$ produced by (i) replacing atomic constraints in C-RES clauses via the canonical mapping and (ii) discarding tautologies satisfies the lemma conditions. More specifically, we show that each clause $\omega_P \in \mathcal{P}'$ produced from a C-RES clause $\omega_C \in \mathcal{P}$ either falls into one of the three stated categories or is tautologically true. Additionally, we enforce the following invariant: if a non-tautological clause $\omega_C$ is falsified by an assignment of $(3n-1)$ variables to $(3n-1)$ different values, then the critical assignment produced from it by the canonical mapping falsifies $\omega_P$.

We start with the “consistency” clauses. If $\omega_C = [x_k = 1] \vee \dots \vee [x_k = 3n-1]$ is a domain clause, then $\omega_P = p_{1,k} \vee \dots \vee p_{3n-1,k}$ is a valid clause in $\mathcal{P}'$ as a pigeon clause of $\mathrm{PHP}_{3n}$. Additionally, if an assignment falsifies $\omega_C$, that means that $x_k$ is unassigned; the canonical mapping then leaves the $k$-th pigeon unassigned, which falsifies $\omega_P$. On the other hand, if $\omega_C = \neg[x_k = u] \vee \neg[x_k = v]$ is a unique value clause, then $\omega_P = \bar{p}_{u,k} \vee \bar{p}_{v,k}$ is true for any critical assignment and is thus a tautology.

Next, consider the case when $\omega_C$ is a constraint clause. By Lemma 25, we can assume without loss of generality that $\omega_C \in \mathcal{P}$ is a constraint clause with respect to $\texttt{Alldifferent}(X \cup Y)$. Thus, if $\omega_C$ is false, then any assignment of $X \cup Y$ assigns the same value to some two variables in this set. By Hall's theorem, that means there is a set of variables $Q \subseteq X \cup Y$ with indices $V$ and a set of values $K \subseteq [1, 3n-1]$ such that $|K| < |V| \le 2n$ and assigning all variables in $Q$ to values in $K$ falsifies $\omega_C$. Conversely, assigning some variable in $Q$ to a value outside of $K$, equivalently, in $\bar{K} = [1, 3n-1] \setminus K$, satisfies $\omega_C$. We have encoded a condition $\bigvee_{v \in V} (x_v \notin K) = \bigvee_{v \in V} (x_v \in \bar{K})$ implying $\omega_C$, which after applying the canonical mapping becomes $\omega_P = \bigvee_{k \in \bar{K},\, v \in V} p_{k,v}$ and can be directly added to the proof, since it coincides with $\sigma(K, V)$ from Equation 7.

Last, suppose that $\omega_C = \omega_C' \otimes \omega_C''$ is a resolution of some previously introduced propositional clauses $\omega_C', \omega_C''$. Let $\omega_P'$ and $\omega_P''$ be the corresponding clauses in $\mathcal{P}'$; suppose first that neither of those clauses was skipped. In that case, we observe that the encoding $\omega_P$ of $\omega_C$ is critically implied by $\omega_P'$ and $\omega_P''$: otherwise, there would be a critical assignment $\alpha$ satisfying both $\omega_P'$ and $\omega_P''$ but falsifying $\omega_P$, and applying the canonical mapping to it yields an assignment satisfying $\omega_C'$ and $\omega_C''$ but falsifying $\omega_C$, contradicting the soundness of the resolution step. In case one (or both) of the original clauses is a tautology, the same reasoning holds, because resolving with a tautological clause does not increase the set of falsifying assignments.

Appendix B Exponential lower bound on pigeonhole proofs

We start with an auxiliary fact that establishes a lower bound on the width of a valid proof:

Lemma 26.

Let $n \ge 1$, $0 < \gamma \le 1$, and consider a sequence $\mathcal{P}$ such that any clause in it is either a pigeon clause of $\mathrm{PHP}_{(2+\gamma)n}$, a critical implication of the previous clauses, or a subproblem clause $\sigma(K, V)$ with $1 \le |K| < |V| \le 2n$. Then $\mathcal{P}$ has a clause with at least $2\gamma n^2$ literals.

Proof.

Given a clause $\omega$, let $\mathrm{Complexity}(\omega)$ be the smallest number of pigeon clauses that imply $\omega$ on all critical assignments; in particular, $\mathrm{Complexity}(\omega) = 1$ for all pigeon clauses $\omega \in \mathrm{PHP}_{(2+\gamma)n}$, $\mathrm{Complexity}(\square) = (2+\gamma)n$ for the empty clause $\square$, and $\mathrm{Complexity}(\sigma(K, V)) = |V|$.

Observe that the proof contains a clause $\hat{\omega}$ of complexity greater than $\gamma n$ and not greater than $2n$. To verify this, let $V^* \le 2n$ be the largest cardinality of a set $V$ used in a subproblem clause. If $V^* > \gamma n$, then the subproblem clause that achieves this bound also satisfies the complexity bound. Otherwise, assume for contradiction that all proof clauses have complexity either at most $\gamma n$ or greater than $2n$. Observe that if $\omega$ is critically implied by $\omega'$ and $\omega''$, then $\mathrm{Complexity}(\omega) \le \mathrm{Complexity}(\omega') + \mathrm{Complexity}(\omega'')$, since the implication relation is transitive. As the formula clauses have complexity $1$ and the subproblem clauses have complexity at most $\gamma n$, all their resolvents $\omega$ must have complexity at most $\gamma n$: indeed, $\mathrm{Complexity}(\omega) \le \gamma n + \gamma n \le 2n$, so any complexity above $\gamma n$ would fall into the excluded interval $(\gamma n, 2n]$. But then the same bound has to hold for the resolvents of those resolvents, and so on down to the empty clause, which has complexity $(2+\gamma)n > 2n$, which is a contradiction.

We complete the proof by demonstrating that this clause satisfies the lemma conditions. Let $\hat{\Omega}$ be a smallest set of pigeon clauses that implies $\hat{\omega}$ on all critical assignments, so that $\gamma n < |\hat{\Omega}| \le 2n$; observe that $\hat{\omega}$ has at least $|\hat{\Omega}| \cdot ((2+\gamma)n - |\hat{\Omega}|)$ literals [4, Lemma 1], and this expression achieves its smallest value $2\gamma n^2$ at the endpoints of the interval for $|\hat{\Omega}|$.
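The endpoint computation in the last step can be spot-checked numerically (an illustrative check; the values of $n$ and $\gamma$ are arbitrary):

```python
# The concave function f(s) = s * ((2 + gamma) * n - s) attains its minimum over
# the interval (gamma * n, 2 * n] at an endpoint, where it equals 2 * gamma * n^2.
n, gamma = 40, 0.5
f = lambda s: s * ((2 + gamma) * n - s)
assert all(abs(f(s) - 2 * gamma * n ** 2) < 1e-9 for s in (gamma * n, 2 * n))
assert all(f(s) >= 2 * gamma * n ** 2 for s in range(int(gamma * n) + 1, 2 * n + 1))
```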

We now derive an exponential bound on the proof length of $\mathrm{PHP}_{3n}$ with extra clauses:

Lemma 27.

Let $n \ge 1$, and consider a sequence of clauses $\mathcal{P}$ such that any clause is either a pigeon clause of $\mathrm{PHP}_{3n}$, a resolution of the previous clauses, or a subproblem clause $\sigma(K, V)$ with $1 \le |K| < |V| \le 2n$. Then $\mathcal{P}$ has at least $2^{0.08n}$ clauses for sufficiently large $n$.

Proof.

Suppose this is not the case, and we have discovered a proof of length $L < 2^{0.08n}$. For the rest of the proof, we say that a clause is long if it has at least $n^2$ literals. By the pigeonhole principle, there is a variable $p_{i,j}$ that is contained in at least $\frac{L n^2}{3n(3n-1)} \ge \frac{1}{9}L$ long clauses. Observe that setting $p_{i,j} = 1$ with $p_{k,j} = p_{i,\ell} = 0$ for all $k \ne i$, $\ell \ne j$ and discarding all satisfied clauses from both the proof and the formula yields a proof of $\mathrm{PHP}_{3n-1}$ with the same assumptions on subproblem clauses but with at most $\frac{8}{9}L$ long clauses.

Repeating this variable elimination procedure for $T = \lceil \log_{9/8} L \rceil$ steps yields a proof of $\mathrm{PHP}_{3n-T}$ with at most $(8/9)^T L < 1$ long clauses, or, in other words, no long clauses at all. On the other hand, $3n - T \ge 3n - 0.08n \cdot \log_{9/8} 2 - 1 \ge (2 + \frac{1}{2})n$ for sufficiently large $n$, so we can rewrite $3n - T = (2 + \hat{\gamma})n$ with $\hat{\gamma} \ge 1/2$ and apply Lemma 26, which implies that the proof in question has to contain a clause with at least $2\hat{\gamma} n^2 \ge n^2$ literals. That is a contradiction; therefore, $\mathcal{P}$ has to have length at least $2^{0.08n}$.
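The counting in this proof can be verified for a concrete $n$ (an illustrative numeric check; the choice $n = 200$ is arbitrary):

```python
import math

# With L < 2**(0.08 * n) and T = ceil(log_{9/8}(L)) elimination steps, no long
# clauses remain, while at least 2.5n pigeons survive; the final rebasing in
# the proof of Theorem 6 uses 1.05 < 2**0.08.
n = 200
L = 2 ** (0.08 * n) - 1           # longest proof allowed by the assumption
T = math.ceil(math.log(L, 9 / 8))
assert (8 / 9) ** T * L < 1       # all long clauses eliminated
assert 3 * n - T >= 2.5 * n       # Lemma 26 still applies with gamma >= 1/2
assert 1.05 < 2 ** 0.08
```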

The derivation of the intractability theorem bound is a compilation of the results above:

Proof of Theorem 6.

Let $\mathcal{P}'$ be the pigeonhole formula proof produced from $\mathcal{P}$ by Lemma 23, with $|\mathcal{P}'| \le |\mathcal{P}|$. By Lemma 27, we can bound $|\mathcal{P}| \ge |\mathcal{P}'| = \Omega(2^{0.08n})$, and the theorem bound follows from the inequality $1.05 < 2^{0.08}$.

Appendix C Explanations

Assume without loss of generality that $\mathrm{EST}_i \le \mathrm{EST}_j$, as otherwise the tasks $i$ and $j$ can be swapped. Then we generate the explanations $\mathrm{sn}(s_i, s_j, p_i, p_j)$ as prescribed by Table 1.

Table 1: List of explanations $\mathrm{sn}(s_i, s_j, p_i, p_j)$ produced for each pair of atomic constraints $p_i, p_j$.

| Atomic predicate $p_i$ | Atomic predicate $p_j$ | Explanation $\mathrm{sn}(s_i, s_j, p_i, p_j)$ |
| $s_i \le v_i$ | $s_j \ge v_j$ | $[s_j \ge v_i + d_i] \vee [s_i \le v_j - d_i]$ |
| $s_i \le v_i$ | $s_j > v_j$ | $[s_j \ge v_i + d_i] \vee [s_i \le v_j - d_i + 1]$ |
| $s_i \le v_i$ | $s_j = v_j$ | $[s_j \ge v_i + d_i] \vee [s_i \le v_j - d_i]$ |
| $s_i < v_i$ | $s_j \ge v_j$ | $[s_j \ge v_i + d_i - 1] \vee [s_i \le v_j - d_i]$ |
| $s_i < v_i$ | $s_j > v_j$ | $[s_j \ge v_i + d_i - 1] \vee [s_i \le v_j - d_i + 1]$ |
| $s_i < v_i$ | $s_j = v_j$ | $[s_j \ge v_i + d_i - 1] \vee [s_i \le v_j - d_i]$ |
| $s_i = v_i$ | $s_j \ge v_j$ | $[s_j \ge v_i + d_i] \vee [s_i \le v_j - d_i]$ |
| $s_i = v_i$ | $s_j > v_j$ | $[s_j \ge v_i + d_i] \vee [s_i \le v_j - d_i + 1]$ |
| $s_i = v_i$ | $s_j = v_j$ | $[s_j \ge v_i + d_i] \vee [s_i \le v_j - d_i]$ |

The common idea behind the listed explanations is to lift the bounds in a way that ensures the absence of overlap when one of the two predicates is true. For example, if $p_i = [s_i \le v_i]$ and $p_j = [s_j > v_j]$, then the first predicate $[s_j \ge v_i + d_i]$ in the explanation states the following fact: ‘if $p_i$ is true, then task $j$ starts only after task $i$ has ended’. Similarly, the second predicate $[s_i \le v_j - d_i + 1]$ in the explanation states that task $i$ should end before task $j$ starts; we add an extra unit of time since we know that $s_j > v_j$ implies $s_j \ge v_j + 1$.
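The soundness of all nine rows can be checked by brute force. The sketch below is our own reconstruction (in particular, the `adj_i`/`adj_j` encoding of the $\pm 1$ corrections is ours): for every row, each disjunct of the explanation, combined with its premise predicate, entails that task $i$ ends no later than task $j$ starts.

```python
from itertools import product

# Premise predicates for task i (upper bounds on s_i) and task j (lower bounds on s_j).
P_I = {"<=": lambda s, v: s <= v, "<": lambda s, v: s < v, "==": lambda s, v: s == v}
P_J = {">=": lambda s, v: s >= v, ">": lambda s, v: s > v, "==": lambda s, v: s == v}

for (ti, fi), (tj, fj) in product(P_I.items(), P_J.items()):
    adj_i = -1 if ti == "<" else 0  # [s_i < v_i] already implies s_i <= v_i - 1
    adj_j = +1 if tj == ">" else 0  # [s_j > v_j] already implies s_j >= v_j + 1
    for si, sj, vi, vj, di in product(range(8), repeat=5):
        # First disjunct [s_j >= v_i + d_i + adj_i], together with p_i:
        if fi(si, vi) and sj >= vi + di + adj_i:
            assert sj >= si + di  # task j starts only after task i has ended
        # Second disjunct [s_i <= v_j - d_i + adj_j], together with p_j:
        if fj(sj, vj) and si <= vj - di + adj_j:
            assert si + di <= sj  # task i ends before task j starts
```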

Appendix D Novel bounds

We report the bounds discovered by our approach if they are both better than the previously reported bounds and not directly reproducible without our approach. More precisely, we report bounds that are simultaneously (a) tighter than the bounds reported in the previous sources known to us that used the same benchmarks [25, 40, 39, 35], and (b) either tighter than any bound derived by the non-disjunctive approaches (CP-SAT and baseline Pumpkin) or matching such a bound while being derived at least ten times faster than with any other approach.

Novel upper bounds (makespans) are reported in Table 2, and novel lower bounds are reported in Table 3; the same data is available as supplementary materials. All durations reported in both tables are in MM:SS format; Table 3 additionally reports closed instances and bounds derived with mining. To indicate the remaining optimality gap, we also state the best known lower bound in Table 2 and the best known makespan in Table 3; in either case, that bound is the tightest among the previously reported values and the values discovered by non-disjunctive approaches.

Table 2: Novel upper bounds derived with our approach.

| Problem | Collection | # | Ref. objective | Our objective | Time | Best bound |
| RCPSP/max | C | 61 | 378 | 374 | 07:03 | 345 |
| RCPSP/max | C | 69 | 380 | 371 | 08:47 | 356 |
| RCPSP | J90 | 5-4 | 103 | 102 | 01:02 | 101 |
| RCPSP | J120 | 27-7 | 125 | 124 | 08:21 | 122 |
| RCPSP | J120 | 46-10 | 188 | 187 | 57:34 | 184 |
| RCPSP | J120 | 9-4 | 87 | 86 | 06:40 | 85 |
Table 3: Novel lower bounds derived with our approach. Rows with an asterisk highlight closed instances; rows with a dagger highlight objective values derived with mining enabled.

| Problem | Collection | # | Ref. bound | Our bound | Time | Best objective |
| RCPSP/max | J30 | 64 | 141 | 169* | <1s | 169 |
| RCPSP/max | J30 | 65 | 144 | 162* | <1s | 162 |
| RCPSP/max | J30 | 151 | 142 | 157* | <1s | 157 |
| RCPSP/max | J30 | 153 | 163 | 176* | <1s | 176 |
| RCPSP/max | J30 | 155 | 125 | 154* | <1s | 154 |
| RCPSP/max | UBO50 | 10 | 154 | 186* | 00:05 | 186 |
| RCPSP/max | UBO100 | 4 | 303 | 365 | 44:07 | 376 |
| RCPSP/max | UBO100 | 7 | 281 | 373 | 51:47 | 395 |
| RCPSP/max | UBO100 | 8 | 364 | 376 | 29:06 | 385 |
| RCPSP/max | UBO100 | 32 | 353 | 414 | 53:52 | 434 |
| RCPSP/max | UBO100 | 33 | 328 | 397 | 47:04 | 406 |
| RCPSP/max | UBO200 | 35 | 610 | 747 | 57:27 | 823 |
| RCPSP/max | UBO200 | 62 | 621 | 702 | 56:46 | 796 |
| RCPSP/max | UBO500 | 36 | 1 285 | 1 423 | 26:37 | 1 908 |
| RCPSP/max | UBO500 | 61 | 1 296 | 1 533 | 59:50 | 1 944 |
| RCPSP/max | UBO500 | 64 | 1 329 | 1 489 | 55:08 | 1 932 |
| RCPSP | J120 | 46-9 | 157 | 159 | 18:58 | 166 |
| RCPSP | Pack-d | 2 | 745 | 746† | 25:39 | 747 |
| RCPSP | Pack-d | 3 | 624 | 625†* | 13:54 | 625 |
| RCPSP | Pack-d | 47 | 2 740 | 2 742† | 52:35 | 2 745 |

Appendix E Optimality proof for the RCPSP/max instance #64 of collection J30

In this appendix, we support the optimality claim on instance #64 from the J30 collection by showing that the tasks with indices $\Omega = \{1, \dots, 9, 11, \dots, 30\}$ are pairwise disjoint, which implies optimality since $\sum_{t \in \Omega} d_t = 169$. This instance has five resources, each with a capacity of five units.

To decrease the number of considered pairs, we partition the claimed clique into six parts and record for each of them the lowest amount of each resource consumed by any of its tasks:

  1. $\Omega_1 = \{6, 11, 21, 24\}$; lowest resource consumption is $(2, 4, 1, 2, 2)$.

  2. $\Omega_2 = \{18, 23, 28\}$; lowest resource consumption is $(2, 1, 2, 2, 4)$.

  3. $\Omega_3 = \{30\}$; lowest resource consumption is $(4, 1, 3, 2, 1)$.

  4. $\Omega_4 = \{1, 2, 3, 4, 5, 8, 9, 12, 13, 14, 15, 16, 17, 19, 20, 25, 29\}$; lowest resource consumption is $(1, 1, 1, 4, 1)$.

  5. $\Omega_5 = \{7\}$; lowest resource consumption is $(3, 5, 1, 1, 2)$.

  6. $\Omega_6 = \{22, 26, 27\}$; lowest resource consumption is $(1, 1, 5, 1, 2)$.

The last two groups consume 100% of either resource #2 or resource #3; since all other tasks consume a positive amount of every resource, any pair of tasks involving a task in $\Omega_5 \cup \Omega_6$ is disjoint. The same holds for any pair consisting of a task in $\Omega_4$ and a task in $\Omega_1 \cup \Omega_2 \cup \Omega_3 \cup \Omega_4$ due to the overflow of resource #4. Thus, it remains to show that any pair of tasks in $\Omega_1 \cup \Omega_2 \cup \Omega_3$ is disjoint. Table 4 handles the six cases for all remaining pairs of task groups; a short script reproducing these checks is given after the table.

Table 4: Disjointness justifications for every pair of tasks in $\Omega_1 \cup \Omega_2 \cup \Omega_3$ (overflowing resource and the two contributions).

| | $\Omega_1$ | $\Omega_2$ | $\Omega_3$ |
| $\Omega_1$ | #2: 4 + 4 | #5: 2 + 4 | #1: 2 + 4 |
| $\Omega_2$ | | #5: 4 + 4 | #1: 2 + 4 |
| $\Omega_3$ | | | #1: 4 + 4 |
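The group-level reasoning of this appendix can be reproduced mechanically. The following sketch is our own (the consumption vectors are transcribed from the list above); it finds, for every pair of groups, a resource whose five-unit capacity overflows. For the singleton groups $\Omega_3$ and $\Omega_5$, the within-group pair is vacuous, and the script reports the first overflowing resource, which may differ from the one quoted in Table 4 when several resources overflow.

```python
from itertools import combinations_with_replacement

CAPACITY = 5
# Lowest per-resource consumption of any task in each group, as listed above.
MIN_USE = {
    1: (2, 4, 1, 2, 2),
    2: (2, 1, 2, 2, 4),
    3: (4, 1, 3, 2, 1),
    4: (1, 1, 1, 4, 1),
    5: (3, 5, 1, 1, 2),
    6: (1, 1, 5, 1, 2),
}

def overflowing_resource(g, h):
    """First resource (1-based) whose capacity is exceeded when any task of
    group g runs in parallel with any task of group h, or None if none exists."""
    for r, (a, b) in enumerate(zip(MIN_USE[g], MIN_USE[h]), start=1):
        if a + b > CAPACITY:
            return r
    return None

# Every pair of groups (including pairs within a group) overflows some resource,
# so all tasks in Omega are pairwise disjoint.
for g, h in combinations_with_replacement(sorted(MIN_USE), 2):
    r = overflowing_resource(g, h)
    assert r is not None
    print(f"Omega_{g} x Omega_{h}: resource #{r} overflows")
```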