Abstract 1 Introduction 2 Preliminaries 3 pddl to Symbolic dfa 4 Reduction of fond Planning to Synthesis 5 Evaluation 6 Conclusion References

PDDL to DFA: A Symbolic Transformation for Effective Reasoning

Giuseppe De Giacomo ORCID University of Oxford, UK Antonio Di Stasio ORCID City St George’s, University of London, UK Gianmarco Parretti ORCID La Sapienza University of Rome, Italy
Abstract

ltlf reactive synthesis under environment specifications, which concerns the automated generation of strategies enforcing logical specifications, has emerged as a powerful technique for developing autonomous AI systems. It shares many similarities with Fully Observable Nondeterministic (fond) planning. In particular, nondeterministic domains can be expressed as ltlf environment specifications. However, this is not needed since nondeterministic domains can be transformed into deterministic finite-state automata (dfa) to be used directly in the synthesis process. In this paper, we present a practical symbolic technique for translating domains expressed in Planning Domain Definition Language (pddl) into dfas. The technique allows for the integration of the planning domain, reduced to dfa in a symbolic form, into current symbolic ltlf synthesis tools. We implemented our technique in a new tool, pddl2dfa, and applied it to solve fond planning by using state-of-the-art reactive synthesis techniques in a tool called syft4fond. Our empirical results confirm the effectiveness of our approach.

Keywords and phrases:
Fully Observable Nondeterministic Planning, Linear Temporal Logics on finite traces, Reactive Synthesis, DFA
Copyright and License:
[Uncaptioned image] © Giuseppe De Giacomo, Antonio Di Stasio, and Gianmarco Parretti; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Computing methodologies Planning and scheduling
; Theory of computation Logic and verification ; Computing methodologies Symbolic and algebraic manipulation
Supplementary Material:
Software  (Source Code): https://github.com/GianmarcoDIAG/syft4fond [33]
Editors:
Thierry Vidal and Przemysław Andrzej Wałęga

1 Introduction

In recent years there has been a growing interest in applying Formal Methods techniques to Artificial Intelligence in order to develop autonomous AI systems that can operate effectively in dynamic and complex environments.

These techniques include reactive synthesis, which concerns the automated generation of winning strategies that enforce requirements given in the form of logical specifications [16, 34]. Specifically, we consider reactive synthesis for specifications in Linear Temporal Logic on Finite Traces (ltlf) [26, 27], which maintains the syntax of ltl [35], the formalism typically used to express complex dynamic properties in Formal Methods [8], but it is interpreted on finite traces.

A key aspect shared by all synthesis work in AI is the need for a model of the environment in which the agent acts. In fully observable nondeterministic (fond) planning [18, 17], such a model is given as a state-based domain that specifies, in each state, how the environment enacts the (possibly nondeterministic) effects of agent actions. In its most common form, fond planning involves computing a strong plan that guarantees reaching one of the goal states independently of the nondeterminism in the domain, thus sharing many similarities with ltlf reactive synthesis [27, 15]. Specifically, a fond domain represented with functions and sets can be expressed as an ltlf environment specification and transformed into a deterministic finite-state automaton (dfa) that accepts the traces consistent with the domain specification [25, 1, 6, 23]. It follows that fond planning can be reduced to synthesizing a winning strategy over a dfa game [25, 23], as ltlf reactive synthesis [27].

In practice, fond domains are often specified in a compact language like the Planning Domain Definition Language (pddl) [31], which has been extensively used in planning competitions111See https://www.icaps-conference.org/competitions/.. However, while how to transform a fond domain into dfa is well-known in theory, how to effectively transform pddl into dfa in practice is still open to further investigation.

In this paper, we present an effective technique for transforming pddl into a symbolic dfa with transitions and final states represented as Boolean functions encoded by using Binary Decision Diagrams (bdds) [13]. This technique allows for the integration of fond domains into symbolic ltlf synthesis tools, which are known for their scalability and efficiency [39, 10, 9, 37]. The construction process involves representing the nondeterminism in the domain through suitable agent actions and environment reactions. Once this representation is established, the symbolic dfa of the domain can be efficiently constructed by manipulating bdds. Our technique has the notable property that, while worst-case exponential in the size of the input domain, it is often polynomial due to its concise representation of the nondeterminism in the domain through a compact set of environment reactions.

We implemented our method in a new tool, pddl2dfa, and applied it to devise a reduction of fond planning into reactive synthesis (optimal wrt complexity of fond planning, i.e., exptime-complete [17]) in a tool called syft4fond. We applied this construction to various case studies, including the classic blocks world, blocks world extended, an elevator system, and two navigation environments. Our empirical results show the performance of our approach in these different cases. Specifically, our technique successfully constructs the dfa for a considerable number of instances and solves the synthesis problem for a reasonable number of them, showing the practical feasibility of reducing planning to synthesis.

Our approach takes a step towards integrating planning and synthesis more closely and serves as a promising starting point for future research.

2 Preliminaries

Notations.

A trace over an alphabet of symbols Σ is a finite or infinite sequence of elements from Σ. The empty trace is denoted λ. Traces are indexed starting at zero, and we write π=π0π1. For a finite trace π, let 𝗅𝗌𝗍(π) denote the index of the last element of π, i.e., 𝗅𝗌𝗍(π)=|π|1.

2.1 Linear Temporal Logic on finite traces (ltlf)

ltlf is a variant of ltl interpreted over finite traces [26] instead of infinite traces [35]. ltlf has the same syntax as ltl. Given a set AP of atomic propositions (aka atoms), the ltlf formulas over AP are generated by the following grammar:

φ::=aφ1φ2¬φφφ1𝒰φ2

Where aAP. Here (Next) and 𝒰 (Until) are temporal operators. We use standard Boolean abbreviations such as (or), (implies), 𝑡𝑟𝑢𝑒 and 𝑓𝑎𝑙𝑠𝑒. Moreover, we define the following abbreviations: φ¬¬φ (Weak Next), φ𝑡𝑟𝑢𝑒𝒰φ (Eventually), and φ¬¬φ (Always). The size of φ, written |φ|, is the number of its subformulas. Formulas are interpreted over finite traces π over the alphabet Σ=2AP, i.e., the alphabet consisting of the propositional interpretations of the atoms. Thus, for 0i𝗅𝗌𝗍(π), πi2AP is the i-th interpretation of π. That an ltlf formula φ holds at instant i𝗅𝗌𝗍(π), written π,iφ, is defined inductively:

  • π,ia iff aπi (for aAP);

  • π,i¬φ iff π,i⊧̸φ;

  • π,iφ1φ2 iff π,iφ1 and π,iφ2;

  • π,iφ iff i<𝗅𝗌𝗍(π) and π,i+1φ;

  • π,iφ1𝒰φ2 iff j such that ij𝗅𝗌𝗍(π) and π,jφ2, and k,ik<j we have that π,kφ1.

We say that π satisfies φ, written πφ, if π,0φ.

2.2 Deterministic Finite Automata

A deterministic finite automaton (dfa) is a tuple 𝒜=(Σ,Q,q0,δ,F), where: Σ is a finite input alphabet; Q is a finite set of states; q0Q is the initial state; δ:Q×ΣQ is the transition function; and FQ is the set of final states. The size of 𝒜 is |Q|. Given a finite trace α=α0α1αn over Σ, we extend δ to be a function δ:Q×ΣQ as follows: δ(q,λ)=q, and, if qn=δ(q,α0αn1), then δ(q,α0αn)=δ(qn,αn). A trace α is accepted by 𝒜 if δ(q0,α)F. The language of 𝒜, written (𝒜), is the set of traces that the automaton accepts.

Theorem 1 ([26]).

Given an ltlf formula φ, we can build a dfa of at most doubly-exponential size in |φ| whose language is the set of traces satisfying φ.

2.3 ltlf Reactive Synthesis

ltlf reactive synthesis [27] concerns finding a strategy to satisfy an ltlf goal specification. Goals are expressed as ltlf formulas over AP=𝒳𝒴, where 𝒳 and 𝒴 are disjoint sets of variables. Intuitively, 𝒳 (resp. 𝒴) is under the environment’s (resp. agent’s) control. Traces over Σ=2𝒳𝒴 will be denoted π=(X0Y0)(X1Y1) where Xi𝒳 and Yi𝒴 for every i. Infinite traces of this form are also called plays.

An agent strategy is a function σ:(2𝒳)2𝒴 mapping sequences of environment moves to an agent move. The domain of σ includes the empty sequence λ as we assumed that the agent moves first. A trace π is σ-consistent if Y0=σ(λ) and Yj+1=σ(X0Xj) for every j0. Let φ be an ltlf formula over 𝒳𝒴. An agent strategy σ is winning for (aka enforces) φ if, for every play π that is σ-consistent, some finite prefix of π satisfies φ. ltlf reactive synthesis is the problem of finding an agent strategy σ that enforces φ, if one exists, and is 2exptime-complete in the size of φ [27].

In many AI applications the agent has some knowledge about how the environment works, which it can exploit to enforce the goal [2]. This knowledge can be expressed by an ltlf formula over 𝒳𝒴, which we call environment specification. In the synthesis of winning strategies, we can intuitively see as restricting traces of interest to those satisfying . In this setting, synthesis amounts to computing a strategy that enforces the implication φ, if one exists.

2.4 dfa Games

A dfa game is a dfa 𝒢=(Σ,Q,q0,δ,F) with input alphabet Σ=2𝒳𝒴. The notions of agent strategy and play defined above also apply to dfa games. A play is winning if it contains a finite prefix that is accepted by the dfa. An agent strategy is winning if, for every play π that is σ-consistent, π is winning. That is, the agent wins the game if it can force the play to visit the set of final states at least once. The winning region is the set of states qQ for which the agent has a winning strategy in the game 𝒢, where 𝒢=(2𝒳𝒴,Q,q,δ,F), i.e., the same game as 𝒢, but with initial state q. Solving a dfa game is the problem of computing the agent winning region and a winning strategy, if one exists. dfa games can be solved in polynomial time by a backward-induction algorithm that performs a fixpoint computation over the state space of the game [7]. Synthesis of an ltlf formula φ can be reduced in doubly-exponential time to solve the dfa game 𝒢φ corresponding to φ [27].

Solving ltlf synthesis reduces to solving a reachability game over the dfa corresponding to the ltlf specification. The procedure for solving ltlf synthesis is detailed in Algorithm 1.

Algorithm 1 ltlf Synthesis.

Input: ltlf formula φ
Output: strategy σag realizing φ;

1: Compute the corresponding NFA 𝒜φ;
2: Determinize 𝒜φ into a dfa φ;
3: Solve the reachability game over φ.

2.5 Symbolic Synthesis

We consider the dfa representation described above as an explicit-state representation. Instead, we are able to represent a dfa more compactly in a symbolic way by using a logarithmic number of propositions to encode the state space [40]. Formally, the symbolic representation of a dfa 𝒜=(2𝒳𝒴,Q,q0,δ,F) is a tuple 𝒜s=(𝒳,𝒴,𝒵,Z0,η,f), where: 𝒵 is a set of state variables such that |𝒵|=log|Q|, and every state qQ corresponds to an interpretation Z2𝒵; Z02𝒵 is the interpretation corresponding to the initial state q0; η:2𝒵×2𝒳×2𝒴2𝒵 is a Boolean function such that η(Z,X,Y)=Z if and only if Z is the interpretation of a state q and Z is the interpretation of the state δ(q,XY); and f is a Boolean function over 𝒵 such that f(Z)=1 if and only if Z is the interpretation corresponding to a state qF. Note that the transition function η can be represented by an indexed family consisting of a Boolean formula ηz for each state variable z𝒵, which when evaluated over an assignment to 𝒵𝒳𝒴 returns the next assignment to z.

A symbolic dfa game can be solved by performing a least fixpoint computation over two Boolean formulas w over 𝒵 and t over 𝒵𝒴 which represent the winning region and winning states with agent moves such that, regardless of how the environment behaves, the agent reaches the final states, respectively [40]. Specifically, w and t are initialized as w0(𝒵)=f(𝒵) and t0(𝒵,𝒴)=f(𝒵), since every final state is a winning state. Note that t0 is independent of the propositions from 𝒴, since once the play reaches the final states, the agent can do whatever it wants. Then, ti+1 and wi+1 are constructed as follows:

ti+1(Z,Y)=ti(Z,Y)(¬wi(Z)X.wi(η(X,Y,Z)))
wi+1(Z)=Y.ti+1(Z,Y)

The computation reaches a fixpoint when wi+1wi. When the fixpoint is reached, no more states will be added, and all winning states have been collected. By evaluating Z0 on wi+1 we can determine if there exists a winning strategy. If that is the case, ti+1 can be used to compute a uniform positional winning strategy through the mechanism of Boolean synthesis [28].

2.6 Fully Observable Non-Deterministic (fond) Planning

Following [23], we define a fond domain as a tuple 𝒟=(2,s0,Act,React,α,β,δ), where: is a finite set of fluents, || is the size of 𝒟, and 2 is the state space; Act and React are finite sets of agent actions and environment reactions, respectively; α:22Act is a function denoting agent action preconditions; β:2×Act2React is a function denoting environment reaction preconditions; and δ:2×Act×React2 is the transition function such that δ(s,a,r) is defined if and only if aα(s) and rβ(s,a). We assume that planning domains satisfy the properties of:

  • Existence of agent action: s2.aα(s);

  • Existence of environment reaction: s2.aα(s).rβ(s,a);

  • Uniqueness of environment reaction:
       s2.aα(s).r1,r2β(s,a).δ(s,a,r1)=δ(s,a,r2)r1=r2.

With these properties, inspired by [22], we can capture classical fond domains [18, 29] by explicitly introducing environment reactions corresponding to nondeterministic effects of agent actions.

A state trace of 𝒟 is a finite sequence τ=s0sn of states such that s0 is the initial state of 𝒟 and, for every i<n, there exists an agent action aiα(si) and an environment reaction riβ(si,ai) such that si+1=δ(si,ai,ri). A plan is a partial function κ:2Act such that, if κ(s) is defined, then κ(s)α(s). A plan terminates its execution in states where, being a partial function, it specifies no action. A state trace τ=s0sn is κ-consistent if: (i) s0 is the initial state of 𝒟; (ii) for every i<n, si+1=δ(si,ai,ri) for ai=κ(si) and some riβ(si,ai); and (iii) κ(sn) is undefined.

fond (strong) planning concerns finding a plan to satisfy a goal regardless of the nondeterminism in the domain, called strong plan. A goal G is a conjunction of fluents and negations of fluents. Given a goal G and a domain 𝒟, a plan κ is strong for G in 𝒟 if, for every state trace τ=s0,,sn that is κ-consistent, snG. Formally, fond planning is the problem of finding a strong plan for G in 𝒟, if one exists. fond planning is exptime-complete in the size of 𝒟 [17].

In this paper, we always assume that we have fond domains expressed in pddl [31], i.e., the fluents defining the states of the domain are predicates over objects. We write predicate/k to specify that k objects participate in the predicate relation. Predicates, actions, and action preconditions are specified in first-order syntax in a domain.pddl file, whereas the initial state, goal, and objects, are usually specified in a separate problem.pddl file.

3 pddl to Symbolic dfa

In this paper, we present a technique for transforming pddl into symbolic dfa. A naive approach is to translate pddl into ltlf [2] and build its corresponding symbolic dfa. However, this approach is limited by the doubly-exponential blow-up resulting from transforming ltlf formulas in dfas [26]. Instead, our technique ensures only a single-exponential blowup in the number of fluents.

Our transformation is based on representing the nondeterminism in the domain with agent actions and environment reactions, as described in Section 2. Once such a representation is established, the symbolic dfa of the domain can be easily built by constructing suitable Boolean formulas.

The core idea to represent the nondeterminism in the domain with agent actions and environment reactions is as follows. For every state of the domain s2 and possible nondeterministic effect s1,,sn of an agent action aα(s) (specified in its oneof clause), we introduce an environment reaction r1,,rn such that, for every i<n, we have δ(s,a,ri)=si and riβ(s,a). That is, ri represents the i-th nondeterministic effect of applying action a in s. Applying this construction to every state s2 and agent action aAct generates a planning domain 𝒟=(2,s0,Act,React,α,β,δ) as detailed in Algorithm 2

Algorithm 2 Pddl2ActsAndReacts(domain.pddl,problem.pddl).
Example 2.

Consider a robotic agent that operates in a blocksworld environment where it can pick up/drop blocks from/in top of other blocks as well as pick up/drop blocks from/on the table. The domain defines the predicates emptyhand/0, holding/1, on-table/1, on/2, clear/1 to specify that the agent holds no block, the agent holds a block, a block is on the table, a block is on top of another block, and a block can be picked, respectively.

Assume that there exist two blocks, yellow and green, and that in the initial state green is on the table, yellow is on top of green, and the agent holds no block. Consider the agent actions pick-up and put-on-block defined as follows.

1(:action pick-up
2 :parameters (?b1 ?b2 - block)
3 :precondition (and
4 (not (= ?b1 ?b2))
5 (emptyhand)
6 (clear ?b1)
7 (on ?b1 ?b2))
8 :effect (oneof
9 (and
10 (holding ?b1)
11 (clear ?b2)
12 (not (emptyhand))
13 (not (clear ?b1))
14 (not (on ?b1 ?b2)))
15 (and
16 (clear ?b2)
17 (on-table ?b1)
18 (not (on ?b1 ?b2)))))
19
20(:action put-on-block
21 :parameters (?b1 ?b2 - block)
22 :precondition (and
23 (holding ?b1)
24 (clear ?b2))
25 :effect (oneof
26 (and
27 (on ?b1 ?b2)
28 (emptyhand)
29 (clear ?b1)
30 (not (holding ?b1))
31 (not (clear ?b2)))
32 (and
33 (on-table ?b1)
34 (emptyhand)
35 (clear ?b1)
36 (not (holding ?b1)))))

Intuitively, action pick-up specifies that whenever the agent tries to pick up a block, either it succeeds, or it does not and the block falls on the table. Similarly for action put-on-block. We can capture the nondeterministic effects of actions pick-up and put-on-block with two reactions succeed and fail. A fragment of the planning domain resulting from applying Algorithm 2 to the pddl description above is shown in Figure 1.

Figure 1: Fragment of the domain resulting from applying Algorithm 2 to the pddl description in Example 2.
 Remark 3.

Algorithm 2 returns a domain 𝒟=(2,s0,Act,React,α,β,δ) with fluents, initial state, agent actions, and action preconditions as in its pddl description. The number of environment reactions is at most single-exponential in the number of fluents ||. To see this, observe that there may exist an action aAct that, from some state s2 of the domain, nondeterministically leads to every other state s2, thus generating an environment reaction for each such successor state. However, in practice the number of reactions is often several orders of magnitude less than 2, since the possible nondeterministic effects in oneof clauses of agent actions are usually very few.

Once the nondeterminism in the domain is represented through deterministic action-reaction pairs (a,r), we can construct the symbolic dfa of the domain. To do this, we characterize each pair (a,r) in terms of its add-list add(a,r) and delete-list del(a,r), which are the sets of fluents added and deleted by (a,r), respectively. Formally, a fluent f is in add(a,r) (resp. del(a,r)) iff for every s2, if aα(s) and fs (resp. fs), then fδ(s,a,r) (resp. fδ(s,a,r)). The add- and delete-lists of action-reaction pairs can be extracted immediately from the pddl description of the domain. Specifically, fluents appearing without (resp. with) not in the oneof clause of an action a are in the add-list (resp. delete-list) of the corresponding action-reaction pair (a,r).

Algorithm 3 DomainToDfa(𝒟,G).

We give in Algorithm 3 a technique to transform a fond domain with a goal into a symbolic dfa. Algorithm 3 constructs a symbolic dfa with a state variable for each fluent and two error state variables, AgErr and EnvErr, denoting that the agent and the environment violated the domain specification, respectively, and whose initial state is that of the domain (Line 1). The alphabet of the symbolic dfa is partitioned into actions and reactions, which are under the control of the agent and the environment, respectively (Line 2). For every state variable corresponding to a fluent, Algorithm 3 constructs its transition function as specified in Line 3. Intuitively, the transition function ηf of fluent f specifies that in the next time step f holds if and only if either:

  1. 1.

    f was true in the previous time step and was not deleted by an action-reaction pair (a,r) such that fdel(a,r); or

  2. 2.

    f was added by some action reaction pair (a,r) such that fadd(a,r).

Algorithm 3 constructs the transition functions of AgErr and EnvErr in Lines 4 and 5. Intuitively, the transition function ηAgErr of AgErr specifies that in the next time step the agent reaches the error state if and only if either:

  1. 1.

    The agent was in its error state in the previous time step, written AgErr; or

  2. 2.

    The agent violated the mutual exclusion axiom for its actions, which states that, at each time step, the agent must execute one and only one action, written ¬AgMutex(𝒴);

  3. 3.

    The agent violated an action precondition, written ¬AgPre(𝒵,𝒴).

The transition function of EnvErr is constructed similarly.

Taking the agent’s point of view, Algorithm 3 constructs the final states of the dfa so that a trace is accepted if the agent does not reach its error state and either the environment reaches its error state or the goal is reached (Line 6).

The size of the symbolic dfa constructed by Algorithm 3, i.e., the number of its state variables, is polynomial in the size of the domain. Each line can be executed in polynomial time in the size of the domain. As a result, Algorithm 3 runs in polynomial time in the size of the input domain.

Together, Algorithms 2 and 3 form our technique for transforming pddl into symbolic dfa. While worst-case exponential in the size of the input domain, our transformation is often polynomial due to its compact representation of nondeterministic effects of agent actions with suitable environment reactions.

4 Reduction of fond Planning to Synthesis

As an application to demonstrate the effectiveness of our pddl to dfa transformation technique, we show how to use it to solve fond planning problems.

Let 𝒟 and G be a fond domain and goal in pddl. Construct its symbolic dfa 𝒜s by using Algorithms 2 and 3. Solve the symbolic dfa game over 𝒜s, assigning actions and reactions to agent and environment, respectively.

Theorem 4.

Let 𝒟 and G be a fond domain and goal in pddl and 𝒜s their dfa. There is a strong plan for G in 𝒟 iff there is a winning strategy in 𝒜s.

This synthesis technique is exponential in the size of the pddl domain and therefore optimal wrt the computational complexity of fond planning. However, we also observe that synthesis on dfa games is based on basic backward search algorithms. While we do not introduce sophisticated optimization techniques, we show that we can include them in our synthesis algorithm.

Specifically, we show how to include a simple form of invariants, i.e., properties of states of the domain that must remain unchanged during the execution of every sequence of actions [12]. Invariants can be used to prune the search space by eliminating actions or states that violate these unchanging properties.

We consider mutual exclusion invariants specifying that, at every state, no more than one fluent or negation of fluent l in a set I can be true. Such invariants can be captured as a Boolean formula i(𝒵) over the state variables of the symbolic dfa constructed in Algorithm 3 as:

i(𝒵)=lI(llI.ll¬l)

To include invariants in backward search, we rewrite the fixpoint computation in Section 2. Specifically, w and t are now initialized as w0(𝒵)=f(𝒵)i(𝒵) and t0(𝒵,𝒴)=f(𝒵)i(𝒵), since every goal state must satisfy the invariant. Then, ti+1 and wi+1 are constructed as follows:

ti+1(𝒵,𝒴)=ti(𝒵,𝒴)(¬wi(𝒵)𝒳.wi(η(𝒳,𝒴,𝒵)))
wi+1(𝒵)=(𝒴.ti+1(𝒵,𝒴))i(𝒵)

That is, only states satisfying the invariant are added to the winning region.

 Remark 5.

An approach alternative to ours is to reduce fond planning to ltlf synthesis of the formula φ, where and φ describe the domain and the agent goal, respectively [2]. The ltlf formula is transformed in dfa in doubly-exponential time (Theorem 1) and a strong plan is obtained by solving the corresponding dfa game. However, this approach is limited by the doubly-exponential blowup resulting from transforming the ltlf formula in dfa.

5 Evaluation

We implemented Algorithms 2 and 3 in a tool called pddl2dfa. For parsing, grounding, and computing invariants for the input pddl domain, we based on the tool prp [32]. We code Boolean functions representing transitions and final states of symbolic dfas by Binary Decision Diagrams (bdds) [13] with the bdd library cudd 3.0.0 [36]. The size of a bdd is the number of its nodes. pddl2dfa also implements the transformation from pddl to ltlf (see Remark 5), in which case the dfas of ltlf formulas are constructed with lydia [21], which is among the best performing tools publicly available for ltlf-to-dfa conversion.

We applied the transformation in pddl2dfa to implement the reduction of fond planning to synthesis (ref. Section 4) in a tool called syft4fond 222pddl2dfa and syft4fond at https://github.com/GianmarcoDIAG/syft4fond. For solving symbolic dfa games, we use the symbolic synthesis framework in [40] at the base of state-of-the-art ltlf synthesis tools [11, 37]. We compute winning strategies for dfa games through Boolean synthesis [28].

Setup.

Experiments were run on a laptop running 64-bit Ubuntu 22.04, 3.6 GHz CPU, and 12 GB of memory. Timeout was 1000 secs.

5.1 Benchmark

We performed experiments on a suite of 170 classical fond planning benchmarks divided in five classes: blocks world (50 instance), extended blocks world (50 instances), triangle-tire world (40 instances), rectangle-tire world (15 instances), and elevators (15 instances). In blocks world instances, the agent manipulates blocks with actions that can nondeterministically succeed or fail. In extended blocks world instances, the agent can also move towers of two blocks, with similar nondeterministic success or failure in actions. In triangle-tire and rectangle-tire instances, the agent navigates a grid environment, dealing with nondeterministic success or failure when moving. Elevator instances involve managing elevators to collect coins across multiple floors. For each class, the instances grow by increasing the problem size, which depends on their parameters.

5.2 Empirical Results

We performed experiments to evaluate the efficiency of our technique for translating pddl into symbolic dfa and the practical feasibility of reducing fond planning to synthesis.

We evaluated the performance of pddl2dfa in transforming pddl to ltlf and ltlf to symbolic dfa. When employing this transformation, pddl2dfa was unable to construct the symbolic dfa of the pddl domain in the considered benchmarks, except in very few cases. The bottleneck was transforming ltlf into dfa, which requires doubly-exponential time in the size of the ltlf formula [26].

Table 1: Coverage achieved by pddl2dfa (constructed dfas/instances in benchmark), and domain size (||), number of actions (|Act|), size of largest fluent bdd (Max{F} bdd), size of smallest fluent bdd (Min{F} bdd), size of agent error bdd (AgErr bdd), and size of environment error bdd (EnvErr bdd) in the largest solved instance.
Bench. Coverage || |Act| Max{F} bdd Min{F} bdd AgErr bdd EnvErr bdd
Blocks 31/50 1055 1953 514 14 6549 109
BlocksExt 19/50 379 12654 5085 277 156509 562
Triangle 30/40 2881 4709 1305 17 40247 740
Rect. 10/15 53 52249 5432 2077 66261 1783
Elev. 15/15 66 105 42 10 511 41
Total 105/170

We evaluated the performance of pddl2dfa in transforming pddl to symbolic dfa with Algorithms 2 and 3. Table 1 shows that pddl2dfa constructed the dfa for a considerable number of benchmarks. Notably, pddl2dfa was able to construct the dfa of triangle-tire domains with side 61. The bottleneck of the dfa construction was computing the agent error bdd. Indeed, Table 1 shows that the size of the agent error bdd in the hardest solved instances is orders of magnitude larger than that of the other bdds, including that of the largest fluent bdd. The reason is that constructing the bdd of the agent error requires iterating over all actions of the domain. Instead, constructing the bdds of the fluents requires only considering actions containing that fluent in their add- and delete-lists. Constructing the environment error bdd requires iterating over all environment reactions, but these are often several orders of magnitude less than fluents and agent actions (see Remark 3). As a result, the number of actions is the parameter that affects most the performance of the dfa construction. The size of the pddl domain, i.e., its fluents, has less impact than the number of agent actions on the dfa construction performance. Overall, this is a good result and shows the effectiveness of our construction.

Table 2: Coverage achieved by syft4fond (solved instances/instances in benchmark), and domain size (||), number of actions (|Act|), size of largest fluent bdd (Max{F} bdd), size of smallest fluent bdd (Min{F} bdd), size of agent error bdd (AgErr bdd), and size of environment error bdd (EnvErr bdd) in the largest solved instance.
Bench. Coverage || |Act| Max{F} bdd Min{F} bdd AgErr bdd EnvErr bdd
Blocks 6/50 55 78 46 10 403 29
BlocksExt 3/50 19 81 33 8 327 32
Triangle 9/40 298 467 169 12 2119 111
Rect. 8/15 25 15481 2272 1057 74757 884
Elev. 7/15 44 68 28 10 281 28
Total 33/170 - - - - - -

We evaluated the performance of syft4fond in reducing fond planning to synthesis. Table 2 shows that syft4fond is able to solve a reasonable number of instances. Indeed, syft4fond was able to solve triangle-tire planning instances with side 19, though based on plain backward search. The bottleneck of the synthesis was constructing the bdds of the agent winning moves and region during the fixpoint computation, which are mostly affected by the size of the agent error bdd. In general, we consider this result adequate to show the practical feasibility of reducing fond planning to synthesis.

6 Conclusion

In this paper, we have presented an effective method for translating pddl into symbolic dfa. We implemented our method in a new tool, pddl2dfa, and applied it to solving planning problems through reduction to synthesis in the tool syft4fond. Testing these tools on various case studies demonstrated the practicality and performance of our approach. Indeed, we demonstrated that our method successfully constructs dfas for considerably large domains and solves a practical number of planning instances, performing significantly better than straightforward approaches based on translating pddl in ltlf. Our work makes a step towards integrating planning and synthesis more closely. Future research can build on this foundation, aiming to integrate fond domains into advanced synthesis techniques developed for temporally extended goals [27, 19], structured environment specifications [20, 9, 30, 3], multiple goal specifications [14, 38], and for handling goal unrealizability [4, 5, 24, 23].

References

  • [1] Benjamin Aminof, Giuseppe De Giacomo, Aniello Murano, and Sasha Rubin. Planning and synthesis under assumptions. CoRR, abs/1807.06777, 2018. arXiv:1807.06777.
  • [2] Benjamin Aminof, Giuseppe De Giacomo, Aniello Murano, and Sasha Rubin. Planning under LTL environment specifications. In ICAPS, pages 31–39, 2019. URL: https://ojs.aaai.org/index.php/ICAPS/article/view/3457.
  • [3] Benjamin Aminof, Giuseppe De Giacomo, Gianmarco Parretti, and Sasha Rubin. Effective approach to ltlf best-effort synthesis in multi-tier environments. In IJCAI, 2024.
  • [4] Benjamin Aminof, Giuseppe De Giacomo, and Sasha Rubin. Best-effort synthesis: Doing your best is not harder than giving up. In IJCAI, pages 1766–1772, 2021. doi:10.24963/IJCAI.2021/243.
  • [5] Benjamin Aminof, Giuseppe De Giacomo, and Sasha Rubin. Reactive synthesis of dominant strategies. In AAAI, pages 6228–6235. AAAI Press, 2023. doi:10.1609/AAAI.V37I5.25767.
  • [6] Benjamin Aminof, Giuseppe De Giacomo, Aniello Murano, and Sasha Rubin. Planning under LTL environment specifications. In ICAPS, pages 31–39. AAAI Press, 2019. URL: https://ojs.aaai.org/index.php/ICAPS/article/view/3457.
  • [7] Krzysztof R. Apt and Erich Grädel, editors. Lectures in Game Theory for Computer Scientists. Cambridge University Press, 2011.
  • [8] Christel Baier, Joost-Pieter Katoen, and Kim Guldstrand Larsen. Principles of Model Checking. MIT Press, 2008.
  • [9] Suguman Bansal, Giuseppe De Giacomo, Antonio Di Stasio, Yong Li, Moshe Y. Vardi, and Shufang Zhu. Compositional safety LTL synthesis. In VSTTE, volume 13800 of Lecture Notes in Computer Science, pages 1–19. Springer, 2022. doi:10.1007/978-3-031-25803-9_1.
  • [10] Suguman Bansal, Yong Li, Lucas M. Tabajara, and Moshe Y. Vardi. Hybrid compositional reasoning for reactive synthesis from finite-horizon specifications. In AAAI, pages 9766–9774, 2020. doi:10.1609/AAAI.V34I06.6528.
  • [11] Suguman Bansal, Yong Li, Lucas M. Tabajara, and Moshe Y. Vardi. Hybrid compositional reasoning for reactive synthesis from finite-horizon specifications. In AAAI, pages 9766–9774, 2020. doi:10.1609/AAAI.V34I06.6528.
  • [12] Avrim Blum and Merrick L. Furst. Fast planning through planning graph analysis. Artif. Intell., 90(1-2):281–300, 1997. doi:10.1016/S0004-3702(96)00047-1.
  • [13] Randal E. Bryant. Symbolic Boolean Manipulation with Ordered Binary-Decision Diagrams. ACM Comput. Surv., 24(3):293–318, 1992. doi:10.1145/136035.136043.
  • [14] Alberto Camacho, Meghyn Bienvenu, and Sheila A. McIlraith. Finite LTL synthesis with environment assumptions and quality measures. In KR, pages 454–463. AAAI Press, 2018. URL: https://aaai.org/ocs/index.php/KR/KR18/paper/view/18072.
  • [15] Alberto Camacho, Meghyn Bienvenu, and Sheila A McIlraith. Towards a unified view of AI planning and reactive synthesis. In ICAPS, pages 58–67, 2019. URL: https://ojs.aaai.org/index.php/ICAPS/article/view/3460.
  • [16] Alonzo Church. Logic, arithmetic and automata. In Proc. International Congress of Mathematicians, 1963.
  • [17] Alessandro Cimatti, Marco Pistore, Marco Roveri, and Paolo Traverso. Weak, strong, and strong cyclic planning via symbolic model checking. Artificial Intelligence, 147:35–84, 2003. doi:10.1016/S0004-3702(02)00374-0.
  • [18] Alessandro Cimatti, Marco Roveri, and Paolo Traverso. Strong Planning in Non-Deterministic Domains Via Model Checking. In AIPS, pages 36–43. AAAI, 1998. URL: http://www.aaai.org/Library/AIPS/1998/aips98-005.php.
  • [19] Giuseppe De Giacomo, Antonio Di Stasio, Francesco Fuggitti, and Sasha Rubin. Pure-past linear temporal and dynamic logic on finite traces. In IJCAI, pages 4959–4965, 2020. doi:10.24963/IJCAI.2020/690.
  • [20] Giuseppe De Giacomo, Antonio Di Stasio, Moshe Y Vardi, and Shufang Zhu. Two-stage technique for LTLf synthesis under LTL assumptions. In KR, pages 304–314, 2020. doi:10.24963/KR.2020/31.
  • [21] Giuseppe De Giacomo and Marco Favorito. Compositional approach to translate LTLf/LDLf into deterministic finite automata. In ICAPS, pages 122–130, 2021. URL: https://ojs.aaai.org/index.php/ICAPS/article/view/15954.
  • [22] Giuseppe De Giacomo and Yves Lespérance. The nondeterministic situation calculus. In KR, pages 216–226, 2021. doi:10.24963/KR.2021/21.
  • [23] Giuseppe De Giacomo, Gianmarco Parretti, and Shufang Zhu. LTLf best-effort synthesis in nondeterministic planning domains. In ECAI, pages 533–540, 2023. doi:10.3233/FAIA230313.
  • [24] Giuseppe De Giacomo, Gianmarco Parretti, and Shufang Zhu. Symbolic LTLf best-effort synthesis. In EUMAS, pages 228–243, 2023.
  • [25] Giuseppe De Giacomo and Sasha Rubin. Automata-theoretic foundations of FOND planning for LTLf and LDLf goals. In IJCAI, pages 4729–4735, 2018. doi:10.24963/IJCAI.2018/657.
  • [26] Giuseppe De Giacomo and Moshe Y. Vardi. Linear temporal logic and linear dynamic logic on finite traces. In IJCAI, pages 854–860, 2013. URL: http://www.aaai.org/ocs/index.php/IJCAI/IJCAI13/paper/view/6997.
  • [27] Giuseppe De Giacomo and Moshe Y. Vardi. Synthesis for LTL and LDL on finite traces. In IJCAI, pages 1558–1564, 2015. URL: http://ijcai.org/Abstract/15/223.
  • [28] Dror Fried, Lucas M. Tabajara, and Moshe Y. Vardi. BDD-based Boolean functional synthesis. In CAV, pages 402–421, 2016. doi:10.1007/978-3-319-41540-6_22.
  • [29] Hector Geffner and Blai Bonet. A Concise Introduction to Models and Methods for Automated Planning. Morgan & Claypool, 2013.
  • [30] Giuseppe De Giacomo, Antonio Di Stasio, Lucas M. Tabajara, Moshe Y. Vardi, and Shufang Zhu. Finite-trace and generalized-reactivity specifications in temporal synthesis. Formal Methods Syst. Des., 61(2):139–163, 2022. doi:10.1007/S10703-023-00413-2.
  • [31] Patrik Haslum, Nir Lipovetzky, Daniele Magazzeni, and Christian Muise. An Introduction to the Planning Domain Definition Language. Morgan & Claypool, 2019.
  • [32] Christian Muise, Sheila A McIlraith, and J Christopher Beck. Improved non-deterministic planning by exploiting state relevance. In ICAPS, 2012.
  • [33] Gianmarco Parretti. Syft4Fond. Software (visited on 2025-09-18). URL: https://github.com/GianmarcoDIAG/syft4fond, doi:10.4230/artifacts.24786.
  • [34] A. Pnueli and R. Rosner. On the synthesis of a reactive module. In POPL, pages 179–190, 1989.
  • [35] Amir Pnueli. The Temporal Logic of Programs. In FOCS, pages 46–57, 1977. doi:10.1109/SFCS.1977.32.
  • [36] Fabio Somenzi. CUDD: CU Decision Diagram Package 3.0.0. Universiy of Colorado at Boulder, 2016.
  • [37] Shufang Zhu and Marco Favorito. Lydiasyft: A compositional symbolic synthesis framework for ltlf specifications. In TACAS 2025, pages 295–302, 2025. doi:10.1007/978-3-031-90643-5_15.
  • [38] Shufang Zhu and Giuseppe De Giacomo. Act for your duties but maintain your rights. In KR, 2022.
  • [39] Shufang Zhu, Lucas M. Tabajara, Jianwen Li, Geguang Pu, and Moshe Y. Vardi. Symbolic LTLf synthesis. In IJCAI, pages 1362–1369, 2017. doi:10.24963/IJCAI.2017/189.
  • [40] Shufang Zhu, Lucas M. Tabajara, Jianwen Li, Geguang Pu, and Moshe Y. Vardi. Symbolic LTLf synthesis. In IJCAI, pages 1362–1369, 2017. doi:10.24963/IJCAI.2017/189.