On the Complexity of the Realisability Problem for Visit Events in Trajectory Sample Databases

Jansen, Arthur; Kuijpers, Bart

doi:10.4230/LIPIcs.TIME.2025.12

Arthur Jansen (corresponding author)

Hasselt University, Databases and Theoretical Computer Science Group and
Data Science Institute (DSI), Agoralaan, Building D, 3590 Diepenbeek, Belgium

Bart Kuijpers

Hasselt University, Databases and Theoretical Computer Science Group and
Data Science Institute (DSI), Agoralaan, Building D, 3590 Diepenbeek, Belgium

Abstract

Trajectory sample databases store finite sequences of measured space-time locations of moving objects, along with a speed bound for each object. These databases can be seen as uncertain databases. We propose a language that allows the formulation of queries about the uncertainty in trajectory sample databases. As part of that language, we introduce the notion of visit events, which are used to describe certain constraints on the movement of an object. In our language, an atomic query asks whether a moving object can, given its limitations, realise such an event. We give complexity results for this realisability problem, in various settings.

Keywords and phrases:

Trajectory sample databases, uncertain databases, query languages, complexity

Funding:

Arthur Jansen: Bijzonder Onderzoeksfonds (BOF22OWB06) from UHasselt

2012 ACM Subject Classification:

Information systems $\rightarrow$ Spatial-temporal systems; Information systems $\rightarrow$ Query languages

Event:

32nd International Symposium on Temporal Representation and Reasoning (TIME 2025)

Editors:

Thierry Vidal and Przemysław Andrzej Wałęga

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Due to the proliferation of location-aware devices (such as GPS receivers) in the past two decades, one of the use-cases of moving object databases [8] is the storage of time-stamped measured locations of moving objects [7]. Such a sequence of spatio-temporal measurements of a single moving object is called a trajectory sample. Given this type of partial information on a moving object, we do not know the precise space-time path (or trajectory) which the object has followed, but we do know that the trajectory must have passed these measured spatio-temporal locations. However, without making further assumptions, there are no theoretical limits to the movement of the object in between two measurements. An assumption that originates from the area of time geography, where the moving object’s accessibility to an environment is studied, is that we know a bound on the speed of the moving object [5, 9, 13]. Therefore, it is common to associate a maximal speed to each moving object, alongside a trajectory sample. With this additional knowledge, the actual trajectory of a moving object is guaranteed to be contained in a spatio-temporal region known as a “lifeline necklace” in spatio-temporal and moving object databases [6, 10, 18], or simply as a “chain of space-time prisms” in the fields of time geography [9] and Geographical Information Systems (GIS) [15, 12, 14].

We refer to a moving object database with the specific purpose of storing trajectory samples and speed bounds as a trajectory sample database. A trajectory sample database can be seen as an uncertain (or incomplete) database [1, 11]. In general, an uncertain database represents a set of “possible worlds”, where each possible world is a concrete instantiation of the data. In our setting, a possible world corresponds to an assignment of a trajectory to every object, satisfying the known limitations of that object. In this paper, we propose a query language for trajectory sample databases that is based on events that are possibly semantically interesting for some application and that may occur during a trajectory of a moving object. We introduce visit events as part of our language, which are used to describe constraints on trajectories. The most basic visit event expresses that a trajectory visits a particular region during a particular period of time. To allow for composition, the class of visit events is closed under Boolean combinations (using negation, conjunction and disjunction). For example, we could express the complex event that states that an object has visited a museum in the morning, some restaurant at lunch and has not been in a particular church in the afternoon. For a dataset of tourists visiting some city, this event may have occurred during the movement of some of the tourists. This example shows that our proposed query language is related to the field of semantic trajectories and places of interest (POIs) in such trajectories and could be used to match trajectories with event patterns [4, 17, 21]. The queries appearing in [20], where a cylinder model of uncertainty is used, are similar to what our languages can express.

The atomic query in our language asks whether a given visit event is realisable by the trajectory of some moving object, that is, whether there exists a trajectory that satisfies the constraints described by that event. We call the evaluation of this atomic query the realisability problem. To allow for composition on the query level, the query expressions are also closed under Boolean combinations. An important detail is that our query language is defined relative to two parameters: $\mathcal{R}$ , denoting the class of spatial regions that can occur in visit events, and $\mathcal{T}$ , denoting the periods of time that can occur in visit events. We study the complexity of the realisability problem for different choices of $\mathcal{R}$ and $\mathcal{T}$ . For $\mathcal{R}$ , we consider the class of singleton points ( $\mathsf{Point}$ ) and of semi-algebraic sets ( $\mathsf{SemiAlg}$ ). For $\mathcal{T}$ , on the other hand, we consider the class of singleton moments ( $\mathsf{Moment}$ ) and intervals of time ( $\mathsf{Interval}$ ). Because the realisability problem already becomes NP-hard for very restrictive classes of events, and because we believe it is of interest to measure the influence that the size of the database has on the complexity of query evaluation, we distinguish between combined complexity, data complexity and query complexity [1]. A summary of our complexity results is displayed in Table 1.

Table 1: A summary of the complexity results: data, query, and combined complexity.

Class of events	Data	Query	Combined
$(\mathsf{Point},\mathsf{Moment})$ -event in DNF	linear	polynomial	polynomial
Positive $(\mathsf{Point},\mathsf{Moment})$ -events	linear	NP-hard	NP-hard
$(\mathsf{SemiAlg},\mathsf{Moment})$ -events	linear	NP-hard	NP-hard
Conjunctive $(\mathsf{Point},\mathsf{Interval})$ -events	polynomial	NP-hard	NP-hard
Positive $(\mathsf{SemiAlg},\mathsf{Interval})$ -events	polynomial	NP-hard	NP-hard

The paper is organised as follows. In Section 2, we give the definitions that are necessary to formalize the notion of a trajectory sample database. In Section 3, we define the syntax and the semantics of our query languages. In Section 4, we study the complexity of the realisability problem in different settings. Finally, we conclude the paper with Section 5.

2 Definitions and preliminaries on trajectory sample databases

In this section, we give definitions of the concepts needed to introduce the notion of a trajectory sample database.

We consider the space in which objects move to be the plane $\mathbb{R}^{2}$ , and we use the letters $p,q,\dots$ (with or without indices) to denote locations in space. Time is modelled as the real line $\mathbb{R}$ , and we use $t$ (with or without indices) to refer to the temporal points (or moments). We also use $x$ and $y$ to refer to the real coordinates of spatial locations. A spatio-temporal point $(p,t)$ is then an element of $\mathbb{R}^{2}\times\mathbb{R}$ and if $p=(x,y)$ , we also write $(x,y,t)$ for the spatio-temporal point $(p,t)$ . Furthermore, we use $d$ to denote the Euclidean distance in $\mathbb{R}^{2}$ .

The movement of an object is captured by the notion of “trajectory” and it corresponds to a function mapping (all possible) moments in time to locations in space, as is expressed by the following definition.

Definition 1.

A trajectory is a continuous function from $\mathbb{R}$ to $\mathbb{R}^{2}$ .

We use the Greek letter $\gamma$ (with or without indices) to refer to trajectories. In practice trajectories are only observed or measured at discrete moments in time and we call these partial views on trajectories “trajectory samples”.

Definition 2.

A trajectory sample (or sample, for short) is a finite sequence of space-time points $\langle(p_{1},t_{1}),\dots,(p_{n},t_{n})\rangle$ that is ordered by the temporal component (that is, $t_{1}<\dots<t_{n}$ ).

We use the letter $S$ (with or without indices) to refer to trajectory samples. The following definition captures the notion of trajectories matching trajectory samples.

Definition 3.

We say a trajectory $\gamma$ visits a space-time point $(p,t)$ when $\gamma(t)=p$ and we say that a trajectory $\gamma$ visits a subset of space-time if $\gamma$ visits one of its elements.

A trajectory sample $\langle(p_{1},t_{1}),\dots,(p_{n},t_{n})\rangle$ matches the trajectory $\gamma$ if $\gamma$ visits all the space-time points $(p_{i},t_{i})$ , for $i=1,\dots,n$ .

Trajectory sample databases store not only a trajectory sample for each moving object, but also a maximal speed for each moving object. Because we assume that every moving object has such a speed bound, these bounds further restrict trajectories (as trajectory samples do). This is captured in the following definition.

Definition 4.

A trajectory $\gamma$ is called $v_{\mathrm{max}}$ -bounded when for all $t,t^{\prime}\in\mathbb{R}$ we have $d(\gamma(t),\gamma(t^{\prime}))\leq v_{\mathrm{max}}\cdot|t-t^{\prime}|$ .

We use $\Gamma_{(S,v_{\mathrm{max}})}$ to denote the set of all $v_{\mathrm{max}}$ -bounded trajectories matching the sample $S$ . Obviously, some trajectory samples have no $v_{\mathrm{max}}$ -bounded trajectories that match them. The consistency of a sample with a speed bound $v_{\mathrm{max}}$ is expressed in the following definition.

Definition 5.

A trajectory sample $\langle(p_{1},t_{1}),\dots,(p_{n},t_{n})\rangle$ is called $v_{\mathrm{max}}$ -consistent when $d(p_{i},p_{i+1})\leq v_{\mathrm{max}}\cdot(t_{i+1}-t_{i})$ , for $i=1,\dots,n-1$ .

In the remainder of this paper, we assume, when given a trajectory sample $S$ and a speed bound $v_{\mathrm{max}}$ , that $S$ is $v_{\mathrm{max}}$ -consistent.
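As an illustration of Definition 5, the consistency check can be sketched in a few lines of Python. This is only a sketch under our own assumptions: the function name `is_consistent` and the encoding of a sample as a list of `((x, y), t)` pairs are hypothetical, and coordinates are taken to be rational (`Fraction`) so that squared distances can be compared exactly.

```python
from fractions import Fraction

def is_consistent(sample, v_max):
    """Check Definition 5 for a sample given as a list of ((x, y), t) pairs.

    Assumption: v_max >= 0 and all numbers are rational, so comparing
    squared quantities is equivalent to d(p_i, p_{i+1}) <= v_max * (t_{i+1} - t_i).
    """
    for ((x1, y1), t1), ((x2, y2), t2) in zip(sample, sample[1:]):
        if t2 <= t1:
            raise ValueError("sample must be strictly ordered by time")
        dist_sq = (x2 - x1) ** 2 + (y2 - y1) ** 2
        if dist_sq > (v_max * (t2 - t1)) ** 2:
            return False
    return True

# Two measurements at distance 5, taken 5 time units apart.
S = [((Fraction(0), Fraction(0)), Fraction(0)),
     ((Fraction(3), Fraction(4)), Fraction(5))]
```

With this sample, `is_consistent(S, Fraction(1))` holds, while a speed bound of `Fraction(1, 2)` makes the sample inconsistent.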

Definition 6.

The linear interpolation trajectory of a trajectory sample $S=\langle(p_{1},t_{1}),\allowbreak\dots,\allowbreak(p_{n},t_{n})\rangle$ , denoted by $\mathsf{LIT}(S)$ , is defined as the trajectory $\gamma$ , with

\gamma(t)=\begin{cases}p_{1}&\text{if }t\leq t_{1},\\ \frac{t_{i+1}-t}{t_{i+1}-t_{i}}\cdot p_{i}+\frac{t-t_{i}}{t_{i+1}-t_{i}}\cdot p_{i+1}&\text{if }t_{i}<t\leq t_{i+1}\text{ (for }1\leq i<n\text{)},\\ p_{n}&\text{if }t_{n}<t.\end{cases}

We note that if $S$ is $v_{\mathrm{max}}$ -consistent, then $\mathsf{LIT}(S)$ is a $v_{\mathrm{max}}$ -bounded trajectory.
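The piecewise definition of $\mathsf{LIT}(S)$ translates directly into code. The following Python sketch is illustrative only; the function name `lit` and the list-of-pairs encoding of samples are our own assumptions, and exact rational arithmetic is used for the interpolation weight.

```python
from fractions import Fraction

def lit(sample):
    """Return LIT(sample) as a Python function of t (Definition 6).

    `sample` is a non-empty list of ((x, y), t) pairs ordered by time.
    """
    def gamma(t):
        (p_first, t_first), (p_last, t_last) = sample[0], sample[-1]
        if t <= t_first:          # constant before the first measurement
            return p_first
        if t > t_last:            # constant after the last measurement
            return p_last
        for ((x1, y1), t1), ((x2, y2), t2) in zip(sample, sample[1:]):
            if t1 < t <= t2:
                # w is the weight of p_{i+1}; 1 - w is the weight of p_i
                w = Fraction(t - t1) / (t2 - t1)
                return (x1 + w * (x2 - x1), y1 + w * (y2 - y1))
    return gamma

gamma = lit([((0, 0), 0), ((4, 0), 2), ((4, 2), 4)])
```

For instance, `gamma(1)` is the midpoint `(2, 0)` of the first segment, and `gamma(10)` stays at the last measured location `(4, 2)`.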

In trajectory sample databases, we use natural numbers as identifiers for moving objects. Therefore, such a database is defined relative to a finite subset $\mathsf{Obj}$ of $\mathbb{N}$ , called the object identifiers.

Definition 7.

A trajectory sample database $D$ over object identifiers $\mathsf{Obj}\subset\mathbb{N}$ is a function mapping each identifier $i\in\mathsf{Obj}$ to a pair $D(i)=(S_{D}(i),v_{D}(i))$ , where $S_{D}(i)$ is a trajectory sample and $v_{D}(i)$ is a speed bound.

This means that, given a database $D$ , the set $\Gamma_{D(i)}=\Gamma_{(S_{D}(i),v_{D}(i))}$ contains all trajectories that the object with identifier $i$ may have followed (given the sample and the speed bound).

3 Syntax, semantics and evaluation of $(\mathcal{R},\mathcal{T})$ -queries

In this section, we describe a family of query languages for trajectory sample databases. Our languages consist of two “tiers”: in the inner tier, we have the events, which are used as subexpressions in the outer tier, where we have the query expressions. We define the language relative to two parameters, $\mathcal{R}$ and $\mathcal{T}$ , where $\mathcal{R}$ is a collection of spatial regions (or subsets of $\mathbb{R}^{2}$ ), and $\mathcal{T}$ is a collection of temporal periods (or subsets of $\mathbb{R}$ ).

In the following subsections, we define the syntax of the query languages, their semantics, and we end with a remark on the evaluation of query expressions.

3.1 The syntax of $(\mathcal{R},\mathcal{T})$ -queries

We start by defining the inner tier of our language, which is a calculus of events. These events occur as subexpressions in the query expressions defined later on. Events express visits that may occur during the movement of an object.

Definition 8.

We define the $(\mathcal{R},\mathcal{T})$ -events as follows:

1.

$\mathsf{visits}(R,T)$ , with $R\in\mathcal{R}$ and $T\in\mathcal{T}$ , is an atomic $(\mathcal{R},\mathcal{T})$ -event;
2.

If $e,e_{1}$ and $e_{2}$ are $(\mathcal{R},\mathcal{T})$ -events, then so are $(\lnot e)$ , $(e_{1}\land e_{2})$ and $(e_{1}\lor e_{2})$ .

In other words, $(\mathcal{R},\mathcal{T})$ -events can be considered as propositional formulas over the propositional symbols $\mathsf{visits}(R,T)$ , for every $R\in\mathcal{R}$ and $T\in\mathcal{T}$ . Now, we are ready to define $(\mathcal{R},\mathcal{T})$ -queries. Their syntax is given in the following definition.

Definition 9.

Let $\mathsf{Var}$ be a set of object identifier variables (or variables, for short). The $(\mathcal{R},\mathcal{T})$ -query expressions are defined as follows:

1.

If $e$ is an $(\mathcal{R},\mathcal{T})$ -event, $i$ a natural number and $x\in\mathsf{Var}$ , then $\mathsf{realisable}(i,e)$ and $\mathsf{realisable}(x,e)$ are atomic $(\mathcal{R},\mathcal{T})$ -query expressions;
2.

If $q,q_{1}$ and $q_{2}$ are $(\mathcal{R},\mathcal{T})$ -query expressions, then so are $(\lnot q)$ , $(q_{1}\land q_{2})$ and $(q_{1}\lor q_{2})$ .

3.2 The semantics of $(\mathcal{R},\mathcal{T})$ -queries

To define the semantics of query expressions, we first define what it means for an event to be realised by a trajectory.

Definition 10.

Let $e$ be an $(\mathcal{R},\mathcal{T})$ -event and let $\gamma$ be a trajectory.

1.

If $e$ is the atomic event $\mathsf{visits}(R,T)$ , then $e$ is realised by $\gamma$ if $\gamma(t)\in R$ for some $t\in T$ (that is, $\gamma$ visits $R\times T$ ).
2.

If $e$ is of the form $(\lnot e_{1})$ , then $e$ is realised by $\gamma$ when $e_{1}$ is not realised by $\gamma$ .
3.

If $e$ is of the form $(e_{1}\land e_{2})$ , then $e$ is realised by $\gamma$ when $e_{1}$ and $e_{2}$ are realised by $\gamma$ .
4.

If $e$ is of the form $(e_{1}\lor e_{2})$ , then $e$ is realised by $\gamma$ when $e_{1}$ or $e_{2}$ is realised by $\gamma$ .
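For events whose periods are single moments, Definition 10 can be evaluated directly once a concrete trajectory is given. The sketch below is a hedged illustration: the class names `Visits`, `Not`, `And`, `Or` and the function `realised_by` are hypothetical, regions are modelled as arbitrary membership predicates, and interval periods are deliberately left out because they would require an existential check over a whole interval.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union

Point = Tuple[float, float]

@dataclass(frozen=True)
class Visits:
    region: Callable[[Point], bool]  # membership test for the region R
    moment: float                    # the single moment t (assumption: T = {t})

@dataclass(frozen=True)
class Not:
    arg: "Event"

@dataclass(frozen=True)
class And:
    left: "Event"
    right: "Event"

@dataclass(frozen=True)
class Or:
    left: "Event"
    right: "Event"

Event = Union[Visits, Not, And, Or]

def realised_by(event, gamma):
    """Decide whether trajectory `gamma` realises `event`, following Definition 10."""
    if isinstance(event, Visits):
        return event.region(gamma(event.moment))
    if isinstance(event, Not):
        return not realised_by(event.arg, gamma)
    if isinstance(event, And):
        return realised_by(event.left, gamma) and realised_by(event.right, gamma)
    if isinstance(event, Or):
        return realised_by(event.left, gamma) or realised_by(event.right, gamma)
    raise TypeError(f"not an event: {event!r}")

# An object moving along the x-axis at unit speed, and the closed unit disc.
gamma = lambda t: (t, 0.0)
in_disc = lambda p: p[0] ** 2 + p[1] ** 2 <= 1
e = And(Visits(in_disc, 0.5), Not(Visits(in_disc, 2.0)))
```

Here `realised_by(e, gamma)` is true: the trajectory is inside the disc at moment 0.5 and outside it at moment 2.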

In our definition of the semantics of $(\mathcal{R},\mathcal{T})$ -queries, we distinguish between Boolean and non-Boolean queries. The first type of query contains no variables and evaluates to true or false. The second type contains variables and defines a relation over $\mathsf{Obj}$ .

Definition 11.

Let $D$ be a trajectory sample database over object identifiers $\mathsf{Obj}$ . We write $D\models q$ to express that a variable-free $(\mathcal{R},\mathcal{T})$ -query expression $q$ evaluates to true on $D$ . This relation is defined as follows:

1.

$D\models\mathsf{realisable}(i,e)$ if $i\in\mathsf{Obj}$ and there exists a trajectory in $\Gamma_{D(i)}$ that realises $e$ .
2.

$D\models\lnot q$ if $D\models q$ does not hold.
3.

$D\models q_{1}\land q_{2}$ if $D\models q_{1}$ and $D\models q_{2}$ .
4.

$D\models q_{1}\lor q_{2}$ if $D\models q_{1}$ or $D\models q_{2}$ .

Definition 12.

Let $D$ be a trajectory sample database over object identifiers $\mathsf{Obj}$ . If $q$ is an $(\mathcal{R},\mathcal{T})$ -query expression containing variables $x_{1},\dots,x_{k}\in\mathsf{Var}$ , then the result of $q$ evaluated in $D$ is

q(D)=\left\{(i_{1},\dots,i_{k})\in\mathsf{Obj}_{D}^{k}\mid D\models q[i_{1}/x_{1},\dots,i_{k}/x_{k}]\right\},

where $q[i_{1}/x_{1},\dots,i_{k}/x_{k}]$ is obtained from $q$ by instantiating the variable $x_{j}$ in $q$ by $i_{j}$ , for $j=1,\dots,k$ .

3.3 Evaluation of $(\mathcal{R},\mathcal{T})$ -queries

Having defined the semantics of our $(\mathcal{R},\mathcal{T})$ -query languages, there is a “standard” way of evaluating query expressions with variables: given a query expression with $k$ variables, we enumerate all $k$ -tuples of object identifiers, consider all the instantiations associated with them and then evaluate the variable-free expression obtained by substituting the variable occurrences by the concrete object identifiers from this tuple.
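The enumeration strategy just described can be sketched as follows. This is a minimal illustration and not a definitive implementation: queries are encoded as nested tuples of our own choosing, variables are strings, and the atomic test is delegated to a hypothetical callable `oracle(i, e)` that is assumed to decide $D\models\mathsf{realisable}(i,e)$ .

```python
from itertools import product

def holds(q, env, oracle):
    """Evaluate query q under the variable assignment env (Definition 11)."""
    tag = q[0]
    if tag == "realisable":
        term = q[1]
        i = env[term] if isinstance(term, str) else term  # variable or identifier
        return oracle(i, q[2])
    if tag == "not":
        return not holds(q[1], env, oracle)
    if tag == "and":
        return holds(q[1], env, oracle) and holds(q[2], env, oracle)
    if tag == "or":
        return holds(q[1], env, oracle) or holds(q[2], env, oracle)
    raise ValueError(f"unknown query constructor: {tag}")

def result(q, variables, obj, oracle):
    """Compute q(D) by enumerating all k-tuples of object identifiers (Definition 12)."""
    return {ids for ids in product(sorted(obj), repeat=len(variables))
            if holds(q, dict(zip(variables, ids)), oracle)}

# Toy oracle (assumption): a lookup table standing in for the realisability test.
table = {(1, "e1"), (2, "e1"), (2, "e2")}
oracle = lambda i, e: (i, e) in table
q = ("and", ("realisable", "x", "e1"), ("not", ("realisable", "y", "e2")))
answer = result(q, ["x", "y"], {1, 2}, oracle)
```

With this toy oracle, `answer` is `{(1, 1), (2, 1)}`. The number of oracle calls grows as $|\mathsf{Obj}|^{k}$ , which is why the cost of a single atomic test matters.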

However, it is not immediately clear from the definition of the semantics how an atomic query of the form $\mathsf{realisable}(i,e)$ can be evaluated. In what follows, we restrict our attention to this decision problem.

Definition 13.

We define the $(\mathcal{R},\mathcal{T})$ -realisability problem to be the following decision problem: given $(\mathcal{R},\mathcal{T})$ -event $e$ , sample $S$ and speed bound $v$ , does there exist a trajectory in $\Gamma_{(S,v)}$ that realises $e$ ?

From now on, we use the notation $(S,v)\models e$ to express that there exists a trajectory in $\Gamma_{(S,v)}$ that realises the event $e$ .

4 The complexity of the $(\mathcal{R},\mathcal{T})$ -realisability problem

In this section, we give a number of complexity results on the above realisability problem. We recall that the input to this problem is a trajectory sample, a speed bound and an $(\mathcal{R},\mathcal{T})$ -event. In order for the realisability problem to be a proper computational decision problem, these inputs must have finite representations. This means, for example, that we cannot take arbitrary real numbers as input and that we need to assume some finite encoding of our inputs. Here, we assume that the spatio-temporal points occurring in the trajectory sample have rational coordinates, and that the speed bound is rational. As for the $(\mathcal{R},\mathcal{T})$ -event, its representation depends on the choice for $\mathcal{R}$ and $\mathcal{T}$ and the representation of their elements. The choices for $\mathcal{R}$ that we consider are $\mathsf{Point}$ , the collection of all singletons in $\mathbb{Q}^{2}$ ; and $\mathsf{SemiAlg}$ , the collection of all semi-algebraic sets in the plane. A semi-algebraic set in the plane is a subset of $\mathbb{R}^{2}$ that can be defined using a Boolean combination of polynomial (in)equalities over two real variables (where the polynomials have integer coefficients) [3]. Elements of $\mathsf{Point}$ are simply represented by the coordinates of the point in question, while an element of $\mathsf{SemiAlg}$ is represented by some encoding of its defining formula. The choices for $\mathcal{T}$ we consider are $\mathsf{Moment}$ , containing all singletons in $\mathbb{Q}$ , and $\mathsf{Interval}$ , containing all closed intervals of $\mathbb{R}$ with rational endpoints.

For notational convenience, the atomic $(\mathsf{Point},\mathcal{T})$ -event $\mathsf{visits}(\{p\},T)$ will be written as $\mathsf{visits}(p,T)$ . And similarly, we will write $\mathsf{visits}(R,t)$ for the atomic $(\mathcal{R},\mathsf{Moment})$ -event $\mathsf{visits}(R,\{t\})$ .

As we show below, the realisability problem is already NP-hard for quite restricted classes of events. Similar to the evaluation of queries in relational databases, the hardness of the realisability problem is caused by the size of the event, and not by the size of the trajectory sample. Therefore, we study the complexity of the realisability problem in three different settings [1], being:

$\blacksquare$

the data complexity, where we measure the complexity in terms of the size of the sample, and consider the event to be fixed,
$\blacksquare$

the query complexity, where we measure the complexity in terms of the size of the event, and consider the sample to be fixed, and
$\blacksquare$

the combined complexity, where we measure the complexity in terms of both the size of the sample, and the size of the event.

We also consider several restrictions to the class of input events, such as positive events, containing no negations, conjunctive events, being conjunctions of atoms (or, not containing disjunctions and negations), and events in disjunctive normal form (DNF).

In the remainder of this section, we give various results on the complexity of the $(\mathcal{R},\mathcal{T})$ -realisability problem, for different choices of $\mathcal{R}$ and $\mathcal{T}$ and in the three different settings.

For the complexity results mentioned below, we use a computational model in which operations (addition, multiplication, …) and comparison relations ( $=$ , $<$ , …) on rational numbers are assumed to take unit time. That is, we measure the time complexity in terms of the number of spatio-temporal points in a trajectory sample and in terms of the number of atoms (and their length) in an event-expression.

4.1 The query complexity of the $(\mathcal{R},\mathcal{T})$ -realisability problem

Our first result shows that the realisability problem is already NP-hard for a relatively restricted class of events, namely the positive $(\mathsf{Point},\mathsf{Moment})$ -events.

Theorem 14.

In terms of query complexity, the $(\mathsf{Point},\mathsf{Moment})$ -realisability problem for positive events is NP-hard.

Proof.

To prove NP-hardness, we describe a reduction from SAT, the satisfiability problem of propositional formulas. Because the statement concerns query complexity, we reduce SAT to the $(\mathsf{Point},\mathsf{Moment})$ -realisability problem with fixed sample and speed bound. In this case, we choose the sample $S=\langle((0,0),0),((0,0),1)\rangle$ and the speed bound $v=1$ . The input of the reduction is a propositional formula $\phi$ . We can assume that $\phi$ is in negation normal form (that is, negation operators are only applied to atoms), because the satisfiability problem remains NP-hard under this restriction. The output is a positive $(\mathsf{Point},\mathsf{Moment})$ -event $e_{\phi}$ such that the formula $\phi$ is satisfiable iff there exists a trajectory $\gamma$ in $\Gamma_{(S,v)}$ that realises the event $e_{\phi}$ .

Let $P_{1},\dots,P_{k}$ be the propositional symbols occurring in $\phi$ . For $1\leq i\leq k$ , we define $t_{i}=\frac{i+1}{k+2}$ , $p_{i}=(\frac{1}{2(k+2)},0)$ and $p_{i}^{\prime}=(-\frac{1}{2(k+2)},0)$ . Now, we take $e_{\phi}$ to be the result of substituting occurrences of $\lnot P_{i}$ by $\mathsf{visits}(p_{i}^{\prime},t_{i})$ and non-negated occurrences of $P_{i}$ by $\mathsf{visits}(p_{i},t_{i})$ in $\phi$ . Clearly, $e_{\phi}$ is a $(\mathsf{Point},\mathsf{Moment})$ -event, not containing negations.

Finally, we show that $\phi$ is satisfiable if and only if there exists a $\gamma$ in $\Gamma_{(S,v)}$ that realises $e_{\phi}$ . The “if”-direction is straightforward. If there is some $\gamma$ that realises $e_{\phi}$ , then $\phi$ must certainly be satisfiable. To be precise, the assignment that assigns $P_{i}$ to true if and only if $\gamma(t_{i})=p_{i}$ satisfies $\phi$ . For the “only if”-direction, assume that $\phi$ is satisfiable. Then there exists a truth assignment $\alpha$ that makes $\phi$ true. Now, we extend the sample $S$ to a sample $S^{\prime}$ such that it contains $(p_{i},t_{i})$ if $P_{i}$ is assigned true by $\alpha$ , and otherwise contains $(p_{i}^{\prime},t_{i})$ . It is easily verified that $S^{\prime}$ is $v$ -consistent, which means $\mathsf{LIT}(S^{\prime})$ is a $v$ -bounded trajectory. Now, we have that $\mathsf{LIT}(S^{\prime})$ realises $\mathsf{visits}(p_{i},t_{i})$ if $P_{i}$ is true under $\alpha$ , and realises $\mathsf{visits}(p_{i}^{\prime},t_{i})$ if $\lnot P_{i}$ is true under $\alpha$ . It follows from the way we constructed the event, that $\mathsf{LIT}(S^{\prime})$ realises $e_{\phi}$ . $\hfill\blacktriangleleft$
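The reduction in the proof of Theorem 14 is small enough to be sketched in code. The following Python fragment is an illustration under our own encoding assumptions: formulas in negation normal form are nested tuples `("lit", i, positive)`, `("and", a, b)`, `("or", a, b)`, and all function names are hypothetical. It builds $e_{\phi}$ , builds the extended sample $S^{\prime}$ for a truth assignment, and checks $e_{\phi}$ against $\mathsf{LIT}(S^{\prime})$ by using the fact that every moment in $e_{\phi}$ is a sample time of $S^{\prime}$ .

```python
from fractions import Fraction
from itertools import product

def reduce_sat(phi, k):
    """Translate an NNF formula over P_1..P_k into the positive event e_phi."""
    if phi[0] == "lit":
        _, i, positive = phi
        x = Fraction(1, 2 * (k + 2))
        point = (x if positive else -x, Fraction(0))      # p_i or p_i'
        return ("visits", point, Fraction(i + 1, k + 2))  # at moment t_i
    return (phi[0], reduce_sat(phi[1], k), reduce_sat(phi[2], k))

def witness_sample(alpha, k):
    """Extend S = <((0,0),0), ((0,0),1)> to S' according to the assignment alpha."""
    x = Fraction(1, 2 * (k + 2))
    origin = (Fraction(0), Fraction(0))
    chosen = [((x if alpha[i] else -x, Fraction(0)), Fraction(i + 1, k + 2))
              for i in range(1, k + 1)]
    return [(origin, Fraction(0))] + chosen + [(origin, Fraction(1))]

def is_consistent(sample, v):
    """Definition 5, compared on squared values to stay within the rationals."""
    return all((x2 - x1) ** 2 + (y2 - y1) ** 2 <= (v * (t2 - t1)) ** 2
               for ((x1, y1), t1), ((x2, y2), t2) in zip(sample, sample[1:]))

def realised_by_lit(event, sample):
    """Does LIT(sample) realise the positive event? Atoms are sample lookups."""
    if event[0] == "visits":
        return (event[1], event[2]) in sample
    left = realised_by_lit(event[1], sample)
    right = realised_by_lit(event[2], sample)
    return (left and right) if event[0] == "and" else (left or right)

# phi = (P_1 or not P_2) and not P_1, with k = 2 propositional symbols.
phi = ("and", ("or", ("lit", 1, True), ("lit", 2, False)), ("lit", 1, False))
e_phi = reduce_sat(phi, 2)
all_consistent = all(is_consistent(witness_sample({1: a, 2: b}, 2), 1)
                     for a, b in product([False, True], repeat=2))
```

For $k=2$ every extended sample is $1$ -consistent, so `all_consistent` is true. The assignment that makes both symbols false yields a sample whose interpolation realises `e_phi`, whereas the assignment that makes both true does not.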

While the above result shows that the $(\mathsf{Point},\mathsf{Moment})$ -realisability problem is NP-hard for positive events (not containing negations), the problem for $(\mathsf{Point},\mathsf{Interval})$ -events already becomes NP-hard for conjunctions of atoms, as shown below.

Theorem 15.

In terms of query complexity, the $(\mathsf{Point},\mathsf{Interval})$ -realisability problem for conjunctive events is NP-hard.

Proof.

We give a reduction from the Euclidean travelling salesman problem (E-TSP for short), shown to be NP-hard in [16] (the problem we refer to here is called the Euclidean tour-TSP there). The Euclidean travelling salesman problem asks, given a finite set of locations $P\subseteq\mathbb{Q}^{2}$ and a positive number $\ell\in\mathbb{Q}$ , whether there is a cycle through all locations of $P$ whose length is at most $\ell$ . Formally, this means there is a permutation $p_{1},\dots,p_{n}$ of the locations in $P$ such that $\sum_{i=1}^{n-1}d(p_{i},p_{i+1})+d(p_{n},p_{1})\leq\ell$ . Without loss of generality, we assume that $P$ always contains the origin $(0,0)$ .

Again, we work with a fixed sample $S=\langle((0,0),0)\rangle$ and speed bound $v=1$ . From an instance $P,\ell$ of E-TSP, we give a conjunction of $(\mathsf{Point},\mathsf{Interval})$ -atoms $C$ , such that $(S,v)\models C$ if and only if there is a cycle through $P$ of length at most $\ell$ . We define $C$ as

\mathsf{visits}((0,0),[\ell,\ell])\land\bigwedge_{p\in P\setminus\{(0,0)\}}\mathsf{visits}(p,[0,\ell]).

To prove that this reduction is correct, we first show that if $(S,v)\models C$ , then there is a cycle through $P$ of length at most $\ell$ . Let $\gamma$ be a $v$ -bounded trajectory matching $S$ and realising $C$ . Because $\gamma$ matches $S$ , we have $\gamma(0)=(0,0)$ . And, because $\gamma$ realises $C$ , it reaches every location in $P$ at some moment in $[0,\ell]$ , and $\gamma(\ell)=(0,0)$ . This induces an order $p_{1},\dots,p_{n}$ of the locations in $P$ , where $\gamma$ first reaches $p_{1}$ , then $p_{2}$ , and so on (we note that $p_{1}$ is always $(0,0)$ ). Thus, if we let $t_{i}$ be the first moment where $\gamma(t_{i})=p_{i}$ for $i=1,\dots,n$ , then $0=t_{1}<\dots<t_{n}\leq\ell$ . We claim the cycle $p_{1},\dots,p_{n},p_{1}$ has length at most $\ell$ . The length of this cycle is $\sum_{i=1}^{n-1}d(p_{i},p_{i+1})+d(p_{n},p_{1})$ . For every $i$ , $\gamma$ visits $(p_{i},t_{i})$ , and because $\gamma$ is $v$ -bounded, we have $d(p_{i},p_{i+1})\leq v\cdot|t_{i}-t_{i+1}|=t_{i+1}-t_{i}$ . Similarly, $\gamma$ visits $(p_{n},t_{n})$ and $(p_{1},\ell)$ , thus $d(p_{n},p_{1})\leq\ell-t_{n}$ . It follows that the length of the cycle is at most $\sum_{i=1}^{n-1}(t_{i+1}-t_{i})+\ell-t_{n}=t_{n}-t_{1}+\ell-t_{n}=\ell$ .

Finally, we show that if there is a cycle through $P$ of length at most $\ell$ , then $(S,v)\models C$ . Let $p_{1},\dots,p_{n},p_{1}$ be such a cycle, where we choose $p_{1}$ to be $(0,0)$ . Now, let $t_{1}=0$ and for $i=2,\dots,n$ , take $t_{i}=t_{i-1}+d(p_{i-1},p_{i})$ . Then, $t_{n}=\sum_{i=1}^{n-1}d(p_{i},p_{i+1})$ , which, by assumption, is at most $\ell-d(p_{n},p_{1})$ . Define the trajectory sample $S^{\prime}=\langle(p_{1},t_{1}),\dots,(p_{n},t_{n}),(p_{1},\ell)\rangle$ . It is clear from the definition of $t_{i}$ that the part of $S^{\prime}$ excluding $(p_{1},\ell)$ is $v$ -consistent. We have seen that $t_{n}\leq\ell-d(p_{n},p_{1})$ , and thus $d(p_{n},p_{1})\leq\ell-t_{n}$ , which implies that $S^{\prime}$ is $v$ -consistent. From this it follows that $\mathsf{LIT}(S^{\prime})$ is $v$ -bounded, and it clearly matches $S$ and realises $C$ . $\hfill\blacktriangleleft$
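Both halves of this construction can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the names `reduce_etsp`, `tour_sample` and `is_consistent` are hypothetical, an atom is encoded as a pair `(point, (start, end))`, and floating-point distances are used because tour lengths are in general irrational (the example below is chosen so that all distances are integers).

```python
import math

def reduce_etsp(points, ell):
    """Build the conjunction C as a list of atoms (point, (start, end))."""
    origin = (0, 0)
    atoms = [(origin, (ell, ell))]                        # be back at the origin at time ell
    atoms += [(p, (0, ell)) for p in points if p != origin]
    return atoms

def tour_sample(tour, ell):
    """Build S' from a tour p_1, ..., p_n with p_1 = (0, 0), for speed bound v = 1."""
    t = 0.0
    sample = [(tour[0], t)]
    for prev, cur in zip(tour, tour[1:]):
        t += math.dist(prev, cur)                         # t_i = t_{i-1} + d(p_{i-1}, p_i)
        sample.append((cur, t))
    sample.append((tour[0], ell))                         # closing point (p_1, ell)
    return sample

def is_consistent(sample, v=1.0):
    """Definition 5 with floating-point distances (an approximation in general)."""
    return all(t1 < t2 and math.dist(p, q) <= v * (t2 - t1)
               for (p, t1), (q, t2) in zip(sample, sample[1:]))

# A 3-4-5 triangle: the unique cycle through these locations has length 12.
P = [(0, 0), (3, 0), (3, 4)]
```

For this instance, `tour_sample(P, 12)` is consistent, so its linear interpolation witnesses $(S,v)\models C$ , while `tour_sample(P, 11)` is not consistent because the closing leg of length 5 would have to be covered in 4 time units.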

4.2 The data complexity of the $(\mathcal{R},\mathcal{T})$ -realisability problem

In this section, we show that the $(\mathsf{SemiAlg},\mathsf{Moment})$ -realisability problem has linear-time data complexity, and the $(\mathsf{SemiAlg},\mathsf{Interval})$ -realisability problem has polynomial-time data complexity.

We start with the result on $(\mathsf{SemiAlg},\mathsf{Moment})$ -queries, but first we introduce some notation and we give two lemmas. Every region in $\mathsf{SemiAlg}$ is of the form $\{(x,y)\in\mathbb{R}^{2}\mid\varphi(x,y)\}$ , where $\varphi=\varphi(x,y)$ is a quantifier-free formula over the vocabulary $(+,\times,<,0,1)$ , with $x$ and $y$ as free variables. The set $\{(x,y)\in\mathbb{R}^{2}\mid\varphi(x,y)\}$ is called the region defined by $\varphi$ , and we denote it by $R(\varphi)$ .

Definition 16.

If $C=\mathsf{visits}(R_{1},t_{1})\land\dots\land\mathsf{visits}(R_{k},t_{k})$ is a conjunction of atomic $(\mathcal{R},\mathsf{Moment})$ -events, and $I\subseteq\mathbb{R}$ is an interval, the formula $C_{I}$ is the conjunction of those $\mathsf{visits}(R_{i},t_{i})$ , with $1\leq i\leq k$ , for which $t_{i}\in I$ .

Lemma 17.

Let $C$ be a conjunction of atomic $(\mathcal{R},\mathsf{Moment})$ -events, let $S$ be a trajectory sample $\langle(p_{1},t_{1}),\dots,(p_{n},t_{n})\rangle$ and let $v$ be a speed bound. Then, $(S,v)\models C$ if and only if all of the following are true:

(1)

$(\langle(p_{1},t_{1})\rangle,v)\models C_{(-\infty,t_{1}]}$ ,
(2)

for $i=1,\dots,n-1$ , we have $(\langle(p_{i},t_{i}),(p_{i+1},t_{i+1})\rangle,v)\models C_{[t_{i},t_{i+1}]}$ , and
(3)

$(\langle(p_{n},t_{n})\rangle,v)\models C_{[t_{n},+\infty)}$ .

Proof.

The “only if”-direction is obvious. We prove the “if”-direction. Assume (1), (2) and (3) are true. By (1), there exists a $v$ -bounded trajectory $\gamma_{0}$ matching $\langle(p_{1},t_{1})\rangle$ that realises $C_{(-\infty,t_{1}]}$ . By (2), for $i=1,\dots,n-1$ , there exists a $v$ -bounded trajectory $\gamma_{i}$ matching $\langle(p_{i},t_{i}),(p_{i+1},t_{i+1})\rangle$ that realises $C_{[t_{i},t_{i+1}]}$ . And by (3), there exists a $v$ -bounded trajectory $\gamma_{n}$ matching $\langle(p_{n},t_{n})\rangle$ that realises $C_{[t_{n},+\infty)}$ . We define the trajectory $\gamma$ as follows:

\gamma(t)=\begin{cases}\gamma_{0}(t)&\text{if }t\in(-\infty,t_{1}],\\ \gamma_{i}(t)&\text{if }t\in[t_{i},t_{i+1}]\text{ (for }1\leq i<n\text{), and}\\ \gamma_{n}(t)&\text{if }t\in[t_{n},+\infty).\end{cases}

We note that the intervals $[t_{i-1},t_{i}]$ and $[t_{i},t_{i+1}]$ both contain $t_{i}$ . However, this does not pose a problem for the definition of $\gamma$ , because both $\gamma_{i-1}$ and $\gamma_{i}$ visit $(p_{i},t_{i})$ , which means $\gamma_{i-1}(t_{i})=\gamma_{i}(t_{i})=p_{i}$ . It only requires a simple application of the triangle inequality (of Euclidean distance) to show that $\gamma$ is a $v$ -bounded trajectory. For $i=1,\dots,n$ , we have $\gamma(t_{i})=p_{i}$ , thus $\gamma$ matches $S$ . The only thing left to prove is that $\gamma$ realises $C$ . Let $A=\mathsf{visits}(R,t)$ be an arbitrary conjunct of $C$ . Depending on $t$ , there are three cases to be considered. First, if $t\in({-\infty},t_{1}]$ , then $A$ is a conjunct of $C_{({-\infty},t_{1}]}$ . This implies $\gamma_{0}$ realises $A$ , so $\gamma(t)=\gamma_{0}(t)\in R$ , which means $\gamma$ realises $A$ . The other two cases, when $t$ is contained in $[t_{i},t_{i+1}]$ or in $[t_{n},+\infty)$ , are similar. Because $\gamma$ realises all the conjuncts of $C$ , it also realises $C$ itself. $\hfill\blacktriangleleft$
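The decomposition used in Lemma 17 amounts to bucketing the conjuncts of $C$ by the sample intervals. A minimal Python sketch of this step, with a hypothetical function name and with atoms encoded as `(region, t)` pairs of our own choosing, could look as follows.

```python
def split_conjunction(atoms, sample_times):
    """Return [C_(-inf,t_1], C_[t_1,t_2], ..., C_[t_{n-1},t_n], C_[t_n,+inf)].

    `atoms` is a list of (region, t) pairs and `sample_times` is the strictly
    increasing list t_1, ..., t_n. An atom whose moment equals a sample time
    lands in both neighbouring parts, exactly as in Definition 16.
    """
    parts = [[a for a in atoms if a[1] <= sample_times[0]]]
    for lo, hi in zip(sample_times, sample_times[1:]):
        parts.append([a for a in atoms if lo <= a[1] <= hi])
    parts.append([a for a in atoms if a[1] >= sample_times[-1]])
    return parts

atoms = [("R1", -1), ("R2", 0), ("R3", 1.5), ("R4", 5)]
parts = split_conjunction(atoms, [0, 2])
```

Here `parts` has three entries: the first contains the atoms for `R1` and `R2`, the second those for `R2` and `R3`, and the third only the atom for `R4`. This naive version rescans all atoms for every interval; since the event is fixed in the data-complexity setting, that does not affect the dependence on the sample size.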

Lemma 18.

If $C=\mathsf{visits}(R_{1},t_{1})\land\dots\land\mathsf{visits}(R_{k},t_{k})$ is a conjunction of atomic $(\mathcal{R},\mathsf{Moment})$ -events, $S$ is a trajectory sample and $v$ a speed bound, then $(S,v)\models C$ if and only if there exist $k$ spatial locations $p_{1},\dots,p_{k}$ such that

(1)

for $i=1,\dots,k$ we have $p_{i}\in R_{i}$ , and
(2)

the sample containing the points $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ , as well as the ones in $S$ , is $v$ -consistent.

We remark that we consider condition (2) from the lemma to be false when some space-time point among $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ shares its temporal component with a point in $S$ , while their spatial component differs.

Proof.

The “only if”-direction is obvious. We prove the “if”-direction. Assume there are locations $p_{1},\dots,p_{k}$ satisfying (1) and (2). Now take the trajectory $\gamma$ to be the linear interpolation trajectory of the sample containing the points $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ , as well as the ones in $S$ . It is clear that $\gamma$ matches $S$ and assumption (2) implies it is a $v$ -bounded trajectory. Finally, assumption (1) implies that $\gamma$ realises $C$ . $\hfill\blacktriangleleft$
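The construction used in this proof can be sketched in Python. The function names `extend_sample`, `is_v_consistent` and `lit` are hypothetical, and the sketch assumes that the linear interpolation trajectory stands still before the first and after the last anchor point, which may differ from the paper's exact definition. Samples are represented as time-sorted lists of `((x, y), t)` pairs.

```python
from bisect import bisect_right
from math import dist


def extend_sample(sample, witnesses):
    """Merge witness points into a sample, ordered by time.

    Returns None when two points share a moment but differ in location,
    which is the situation in which condition (2) is considered false.
    """
    merged = {}
    for p, t in list(sample) + list(witnesses):
        if merged.setdefault(t, p) != p:
            return None
    return sorted(((p, t) for t, p in merged.items()), key=lambda pt: pt[1])


def is_v_consistent(sample, v):
    """Check d(p_i, p_{i+1}) <= v * (t_{i+1} - t_i) for consecutive points."""
    return all(
        dist(p, q) <= v * (t2 - t1)
        for (p, t1), (q, t2) in zip(sample, sample[1:])
    )


def lit(sample):
    """Return the linear interpolation trajectory of a time-sorted sample."""
    times = [t for _, t in sample]

    def gamma(t):
        # Assumption of this sketch: no movement outside the sampled period.
        if t <= times[0]:
            return sample[0][0]
        if t >= times[-1]:
            return sample[-1][0]
        i = bisect_right(times, t) - 1
        (p, t0), (q, t1) = sample[i], sample[i + 1]
        lam = (t - t0) / (t1 - t0)
        return (p[0] + lam * (q[0] - p[0]), p[1] + lam * (q[1] - p[1]))

    return gamma
```

If `extend_sample` succeeds and the result is $v$ -consistent, then `lit` applied to it yields a trajectory that visits every witness point at its moment, which mirrors the argument above.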

Definition 19.

We say two $(\mathcal{R},\mathcal{T})$ -events $A$ and $B$ are equivalent if for every sample $S$ and speed bound $v$ , we have $(S,v)\models A$ if and only if $(S,v)\models B$ .

The following proposition follows directly from the fact that, for $p\in\mathbb{R}^{2}$ , we have $p\notin R(\varphi)$ if and only if $p\in R(\lnot\varphi)$ .

Proposition 20.

A $(\mathsf{SemiAlg},\mathsf{Moment})$ -event of the form $\lnot\mathsf{visits}(R(\varphi),t)$ is equivalent to the event $\mathsf{visits}(R(\lnot\varphi),t)$ .

Given an arbitrary $(\mathsf{SemiAlg},\mathsf{Moment})$ -event, we can convert it into negation normal form and use Proposition 20 to remove all negations. The result of this process is a positive $(\mathsf{SemiAlg},\mathsf{Moment})$ -event that is equivalent to the original. Because this process can be performed in linear time (with respect to the length of the event), we can assume that any given $(\mathsf{SemiAlg},\mathsf{Moment})$ -event is positive, without loss of generality.
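The conversion described above can be sketched as a single recursive pass. The nested-tuple encoding of events (`"visits"`, `"not"`, `"and"`, `"or"`) and the function name `to_positive` are assumptions made for this illustration; a formula $\varphi$ is treated as an opaque value and its negation is represented as `("not", phi)`.

```python
def to_positive(event, negate=False):
    """Push negations down to the atoms and absorb them into the regions.

    Each node of the event is handled once, so the pass is linear in the
    length of the event.
    """
    op = event[0]
    if op == "visits":
        _, phi, t = event
        # Proposition 20: not visits(R(phi), t) is equivalent to
        # visits(R(not phi), t).
        return ("visits", ("not", phi) if negate else phi, t)
    if op == "not":
        return to_positive(event[1], not negate)
    if op in ("and", "or"):
        # De Morgan: a pending negation swaps the connective.
        dual = {"and": "or", "or": "and"}
        new_op = dual[op] if negate else op
        return (
            new_op,
            to_positive(event[1], negate),
            to_positive(event[2], negate),
        )
    raise ValueError(f"unknown event constructor: {op!r}")
```

For example, `("not", ("and", ("visits", "phi1", 3), ("not", ("visits", "phi2", 5))))` becomes `("or", ("visits", ("not", "phi1"), 3), ("visits", "phi2", 5))`.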

Theorem 21.

In terms of data complexity, the $(\mathsf{SemiAlg},\mathsf{Moment})$ -realisability problem is decidable in linear time.

Proof.

We describe an algorithm to decide the realisability problem of a $(\mathsf{SemiAlg},\mathsf{Moment})$ -event $e$ for input sample $S=\langle(p_{1},t_{1}),\dots,(p_{n},t_{n})\rangle$ , where $p_{i}=(x_{i},y_{i})$ , and speed bound $v$ . Noting the remark made above, we assume that $e$ is positive. The first step is to convert $e$ into its disjunctive normal form $\bar{e}$ . Since we are dealing with data complexity, the possibly increased size of $\bar{e}$ , compared to $e$ , has no impact on the running time of our method. In fact, because the number of disjuncts of $\bar{e}$ is constant, it is sufficient to show that the realisability problem for a single disjunct can be decided in linear time.

Consider a disjunct $C$ of $\bar{e}$ . Because $\bar{e}$ is positive and in DNF, the event $C$ must be a conjunction of atomic events. This means we can apply Lemma 17, and we can determine whether $(S,v)\models C$ by testing whether each of the following conditions is met:

(1)

$(\langle(p_{1},t_{1})\rangle,v)\models C_{(-\infty,t_{1}]}$ ,
(2)

for $i=1,\dots,n-1$ , we have $(\langle(p_{i},t_{i}),(p_{i+1},t_{i+1})\rangle,v)\models C_{[t_{i},t_{i+1}]}$ , and
(3)

$(\langle(p_{n},t_{n})\rangle,v)\models C_{[t_{n},+\infty)}$ .

We focus our attention on condition (2). For every value of $i$ , we want to decide whether $C_{[t_{i},t_{i+1}]}$ is realisable for sample $\langle(p_{i},t_{i}),(p_{i+1},t_{i+1})\rangle$ and speed bound $v$ . Let us write the conjunction $C_{[t_{i},t_{i+1}]}$ as $\mathsf{visits}(R(\varphi_{1}),t_{1}^{\prime})\land\dots\land\mathsf{visits}(R(\varphi_{k}),t_{k}^{\prime})$ , with $t_{1}^{\prime}\leq\dots\leq t_{k}^{\prime}$ . We remark that this implies $t_{i}\leq t_{1}^{\prime}\leq\dots\leq t_{k}^{\prime}\leq t_{i+1}$ . From Lemma 18, we know that $(\langle(p_{i},t_{i}),(p_{i+1},t_{i+1})\rangle,v)\models C_{[t_{i},t_{i+1}]}$ if and only if there exist locations $q_{1},\dots,q_{k}\in\mathbb{R}^{2}$ , with $q_{j}=(x_{j}^{\prime},y_{j}^{\prime})$ , such that

(a)

for $j=1,\dots,k$ we have $q_{j}\in R(\varphi_{j})$ , and
(b)

the sample $\langle(p_{i},t_{i}),(q_{1},t_{1}^{\prime}),\dots,(q_{k},t_{k}^{\prime}),(p_{i+1},t_{i+1})\rangle$ is $v$ -consistent.

In other words, we have $(\langle(p_{i},t_{i}),(p_{i+1},t_{i+1})\rangle,v)\models C_{[t_{i},t_{i+1}]}$ if and only if the formula $\psi=\exists x_{1}^{\prime}\exists y_{1}^{\prime}\exists\dots\exists x_{k}^{\prime}\exists y_{k}^{\prime}(\psi_{a}\land\psi_{b})$ is true, where $\psi_{a}$ is $\varphi_{1}(x_{1}^{\prime},y_{1}^{\prime})\land\dots\land\varphi_{k}(x_{k}^{\prime},y_{k}^{\prime})$ , expressing condition (a), and to express condition (b), we take $\psi_{b}$ to be the conjunction of $k+1$ distance inequalities.
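For concreteness, one possible way to write $\psi_{b}$ is given below. The boundary conventions $(x_{0}^{\prime},y_{0}^{\prime},t_{0}^{\prime})=(x_{i},y_{i},t_{i})$ and $(x_{k+1}^{\prime},y_{k+1}^{\prime},t_{k+1}^{\prime})=(x_{i+1},y_{i+1},t_{i+1})$ are notation assumed for this sketch only. Because the moments are sorted, each difference $t_{j+1}^{\prime}-t_{j}^{\prime}$ is non-negative, so every distance inequality may be squared, which yields polynomial inequalities:

```latex
\psi_{b}\;=\;\bigwedge_{j=0}^{k}\Bigl((x_{j+1}^{\prime}-x_{j}^{\prime})^{2}
  +(y_{j+1}^{\prime}-y_{j}^{\prime})^{2}
  \leq v^{2}\cdot(t_{j+1}^{\prime}-t_{j}^{\prime})^{2}\Bigr)
```

When two consecutive moments coincide, the corresponding conjunct forces the two locations to be equal, in line with the remark following Lemma 18.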

Because $\psi$ is a first-order logic sentence over the ordered field of real numbers, its truth can be determined by a decision procedure for the theory of real closed fields (first described by Tarski [19]; we refer to Basu et al. [2] for a modern exposition). We note that the size of $\psi$ is independent of $n$ , and thus has constant size in the data complexity setting ( $k$ is bounded by the length of $C$ ). The time needed to determine the truth of $\psi$ is thus also constant. To test for condition (2), we have to perform the above steps for $n-1$ values of $i$ . Conditions (1) and (3) can both be tested in constant time, in a manner similar to the above; the only difference is that the constructed sentence requires one fewer inequality for expressing $v$ -consistency. We have thus shown that the realisability problem for a single disjunct of $\bar{e}$ can be answered in $O(n)$ time. This concludes the proof. $\hfill\blacktriangleleft$

Our next result concerns the data complexity of the $(\mathsf{SemiAlg},\mathsf{Interval})$ -realisability problem for positive events.

Lemma 22.

If $e$ is a positive $(\mathcal{R},\mathsf{Interval})$ -event containing $k$ distinct atoms $A_{1},\dots,A_{k}$ , where $A_{i}=\mathsf{visits}(R_{i},T_{i})$ , then $(S,v)\models e$ if and only if there exist $k$ space-time points $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ such that

(1)

the sample containing the points $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ , as well as the ones in $S$ , is $v$ -consistent, and
(2)

the Boolean expression obtained by replacing $A_{i}$ in $e$ by true if $(p_{i},t_{i})\in R_{i}\times T_{i}$ , and by false otherwise, evaluates to true.

Proof.

We first prove the “if”-direction. Assume that there exist $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ satisfying (1) and (2). Let $S^{\prime}$ be the sample containing the points $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ , as well as the ones in $S$ . Assumption (1) says $S^{\prime}$ is $v$ -consistent, which means $\mathsf{LIT}(S^{\prime})$ is $v$ -bounded. Because $S^{\prime}$ is an extension of $S$ , and $\mathsf{LIT}(S^{\prime})$ matches $S^{\prime}$ , it must also match $S$ . The trajectory $\mathsf{LIT}(S^{\prime})$ realises all the atoms $A_{i}$ for which $(p_{i},t_{i})\in R_{i}\times T_{i}$ , and potentially others. Because of assumption (2) and the fact that $e$ is positive, $\mathsf{LIT}(S^{\prime})$ realises $e$ , and thus $(S,v)\models e$ .

For the “only if”-direction, we assume that $(S,v)\models e$ . That means that there exists a $v$ -bounded $\gamma$ realising $e$ and matching $S$ . We choose $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ as follows. For every atom $A_{i}$ which $\gamma$ realises, we know that $\gamma$ visits some point in $R_{i}\times T_{i}$ , and take $(p_{i},t_{i})$ to be such a point. For an atom $A_{i}$ not realised by $\gamma$ , we take (quite arbitrarily) $(p_{i},t_{i})$ to be the first anchor point of $S$ . Then, $(p_{i},t_{i})\notin R_{i}\times T_{i}$ , since otherwise $\gamma$ would realise $A_{i}$ . All points of $(p_{1},t_{1}),\dots,(p_{k},t_{k})$ and $S$ are visited by $\gamma$ , so the sample containing all those points must be $v$ -consistent, satisfying condition (1). Because $\gamma$ realises $A_{i}$ if and only if $(p_{i},t_{i})\in R_{i}\times T_{i}$ and $\gamma$ realises $e$ , condition (2) is also satisfied. $\hfill\blacktriangleleft$
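To illustrate the two conditions of Lemma 22, the following sketch checks them for a given candidate choice of space-time points. It is a witness checker, not a decision procedure: finding the points is the hard part and is not attempted here. The encoding is hypothetical: regions are Python predicates on locations, time intervals are closed pairs `(lo, hi)`, and a positive event is a nested tuple over `("atom", i)`, `"and"` and `"or"`.

```python
from itertools import combinations


def evaluate(event, truth):
    """Evaluate a positive event once each atom i is replaced by truth[i]."""
    op = event[0]
    if op == "atom":
        return truth[event[1]]
    if op == "and":
        return evaluate(event[1], truth) and evaluate(event[2], truth)
    if op == "or":
        return evaluate(event[1], truth) or evaluate(event[2], truth)
    raise ValueError(f"unknown event constructor: {op!r}")


def check_witness(event, atoms, witness, sample, v):
    """Check conditions (1) and (2) for candidate points.

    atoms[i]   = (region_predicate, (t_lo, t_hi)) for the atom A_i
    witness[i] = ((x, y), t), the candidate point chosen for A_i
    sample     = list of ((x, y), t) anchor points
    """
    points = list(sample) + list(witness)
    # Condition (1): pairwise distance inequalities, in squared form.
    cond1 = all(
        (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= v * v * (s - t) ** 2
        for (p, s), (q, t) in combinations(points, 2)
    )
    # Condition (2): substitute truth values for the atoms and evaluate.
    truth = [
        region(p) and lo <= t <= hi
        for (region, (lo, hi)), (p, t) in zip(atoms, witness)
    ]
    return cond1 and evaluate(event, truth)
```

As in the proof, a point chosen for an atom that is not meant to be realised can simply be an anchor point of the sample.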

The following proposition follows directly from the definition of $v$ -consistency and the fact that the distance function $d$ obeys the triangle inequality.

Proposition 23.

If $M$ is an (unordered) finite set of space-time points, then the sample containing all points in $M$ is $v$ -consistent if and only if for every pair of points $(p_{1},t_{1})$ and $(p_{2},t_{2})$ from $M$ , we have $d(p_{1},p_{2})\leq v\cdot|t_{1}-t_{2}|$ .

It will be of interest later that the condition $d(p_{1},p_{2})\leq v\cdot|t_{1}-t_{2}|$ from the above proposition is equivalent to $d(p_{1},p_{2})^{2}\leq v^{2}\cdot(t_{1}-t_{2})^{2}$ , which, if $p_{1}=(x_{1},y_{1})$ and $p_{2}=(x_{2},y_{2})$ , is a polynomial inequality with variables $x_{1},y_{1},t_{1},x_{2},y_{2},t_{2},v$ .
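A small sketch of Proposition 23 in its squared form follows; the function names are hypothetical. Since only additions, multiplications and comparisons are involved, the test is exact on integer or `Fraction` inputs, and the speed bound `v` is assumed to be non-negative so that squaring preserves the inequality.

```python
from fractions import Fraction
from itertools import combinations


def pair_ok(a, b, v):
    """Squared distance inequality for two space-time points."""
    (p, s), (q, t) = a, b
    sq_dist = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return sq_dist <= v * v * (s - t) ** 2


def consistent_pairwise(points, v):
    """Test every pair of points, as in Proposition 23."""
    return all(pair_ok(a, b, v) for a, b in combinations(points, 2))


def consistent_consecutive(points, v):
    """Test only the pairs that are consecutive in time."""
    ordered = sorted(points, key=lambda pt: pt[1])
    return all(pair_ok(a, b, v) for a, b in zip(ordered, ordered[1:]))


M = [((0, 0), 0), ((3, 4), 5), ((3, 4), 7), ((0, 0), 12)]
print(consistent_pairwise(M, 1), consistent_consecutive(M, 1))
print(consistent_pairwise(M, Fraction(1, 2)), consistent_consecutive(M, Fraction(1, 2)))
```

By the triangle inequality the two tests agree on every input. The pairwise form is the one that suits a first-order sentence, because the temporal order of existentially quantified points is not known in advance.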

Theorem 24.

In terms of data complexity, the $(\mathsf{SemiAlg},\mathsf{Interval})$ -realisability problem for positive events can be decided in polynomial time.

Proof.

We describe an algorithm to decide the realisability of a positive $(\mathsf{SemiAlg},\mathsf{Interval})$ -event $e$ for input sample $S$ , of length $n$ , and speed bound $v$ . Lemma 22 gives a condition equivalent to $(S,v)\models e$ . We can express this condition using a first-order logic sentence over the ordered field of real numbers $\psi=\exists x_{1}\exists y_{1}\exists t_{1}\dots\exists x_{k}\exists y_{k}\exists t_{k}(\psi_{1}\land\psi_{2})$ , where $\psi_{1}$ expresses part (1) of Lemma 22, and $\psi_{2}$ expresses part (2). To express part (1), stating that the sample containing $(x_{1},y_{1},t_{1}),\dots,(x_{k},y_{k},t_{k})$ as well as the points from $S$ is $v$ -consistent, we can take $\psi_{1}$ to be a conjunction of $(n+k)^{2}$ distance inequalities, as per Proposition 23. To express the second part, we construct $\psi_{2}$ by taking $e$ and replacing every atom $\mathsf{visits}(R(\varphi_{i}),[t_{i}^{-},t_{i}^{+}])$ by the formula $\varphi_{i}(x_{i},y_{i})\land t_{i}^{-}\leq t_{i}\land t_{i}\leq t_{i}^{+}$ .

We have now constructed a formula $\psi$ which is true if and only if $(S,v)\models e$ . Thus, if there is a method to determine the truth of $\psi$ , we can decide the realisability problem for positive $(\mathsf{SemiAlg},\mathsf{Interval})$ -events. Because $\psi$ is an existential sentence, we can apply known decision procedures for the existential theory of the reals. Of course, the time needed to decide the realisability of positive $(\mathsf{SemiAlg},\mathsf{Interval})$ -events in the described manner depends on the time complexity of the existential theory of the reals. In [2] (see Theorem 13.14), an upper bound of $s^{m+1}d^{O(m)}$ is given, where $s$ is the number of polynomials occurring in the formula, $m$ the number of variables, and $d$ the maximum degree of the polynomials. Because we are considering data complexity, the number of polynomials in the $\varphi_{i}$ ’s, as well as their maximum degree, is constant, and so is $k$ . This implies that $\psi$ contains $O(n^{2})$ polynomials of constant degree, and $3k$ variables. Thus, by Theorem 13.14 from [2], the truth of $\psi$ can be decided in $O((n^{2})^{3k+1})=O(n^{6k+2})$ time, which is polynomial in $n$ . It is also clear that $\psi$ can be constructed in polynomial time. $\hfill\blacktriangleleft$

4.3 A class of events for which the realisability problem has polynomial time combined complexity

Until now, we have only seen hardness results in the query (and thus, combined) complexity setting. In this section, we give an example of a class of events for which the realisability problem has polynomial-time combined complexity.

An $(\mathcal{R},\mathcal{T})$ -literal is either an atomic $(\mathcal{R},\mathcal{T})$ -event, or the negation of an atomic $(\mathcal{R},\mathcal{T})$ -event. Conjunctions of $(\mathsf{Point},\mathsf{Moment})$ -literals provide a class of events for which the realisability problem has an efficient solution in terms of combined complexity. Before giving our result, we first prove a lemma.

Lemma 25.

If $S_{1}$ and $S_{2}$ are trajectory samples, then there exists a $v$ -bounded trajectory matching $S_{1}$ and not visiting any point in $S_{2}$ if and only if

(1)

$S_{1}$ does not contain a point in $S_{2}$ ,
(2)

$S_{1}$ is $v$ -consistent, and
(3)

if $\mathsf{LIT}(S_{1})$ visits a point from $S_{2}$ , on the line segment between points $(p_{i},t_{i})$ and $(p_{i+1},t_{i+1})$ from $S_{1}$ , then $d(p_{i},p_{i+1})<v\cdot(t_{i+1}-t_{i})$ .

Proof.

It is obvious that the conditions (1), (2) and (3) are necessary for the existence of a $v$ -bounded trajectory matching $S_{1}$ and not visiting points in $S_{2}$ . To show that they are also sufficient, we assume that all three conditions are met. Consider the trajectory $\mathsf{LIT}(S_{1})$ . By condition (2), it is $v$ -bounded. In case $\mathsf{LIT}(S_{1})$ visits one or more points from $S_{2}$ , we show that we can extend (by adding points) $S_{1}$ to a sample $S^{*}$ , such that $\mathsf{LIT}(S^{*})$ is $v$ -bounded, matches $S_{1}$ and does not visit points from $S_{2}$ . Because $S^{*}$ is obtained by adding points to $S_{1}$ , it is obvious that $\mathsf{LIT}(S^{*})$ matches $S_{1}$ . Let us say that $\mathsf{LIT}(S_{1})$ visits a point $(q,t)$ from $S_{2}$ on the line segment between $(p_{i},t_{i})$ and $(p_{i+1},t_{i+1})$ . From (1) we know that $t_{i}<t<t_{i+1}$ , and from (3) we have $d(p_{i},p_{i+1})<v\cdot(t_{i+1}-t_{i})$ . Together, these observations imply there must be a small disk around $q$ , such that for every $p$ in this disk, the extension of $S_{1}$ with the point $(p,t)$ remains $v$ -consistent. Informally, this means we can deviate $\mathsf{LIT}(S_{1})$ slightly between $t_{i}$ and $t_{i+1}$ , while the trajectory remains $v$ -bounded. Now, for every point $(q^{\prime},t^{\prime})$ of $S_{2}$ with $t_{i}<t^{\prime}<t_{i+1}$ , there is at most one choice for $p$ inside the disk that makes the linear interpolation trajectory of the extended sample visit $(q^{\prime},t^{\prime})$ . Because there are infinitely many points in the disks, we can choose $p$ such that, between $t_{i}$ and $t_{i+1}$ , none of the points from $S_{2}$ are visited. We can repeat this process for each line segment of $\mathsf{LIT}(S_{1})$ that visits a point from $S_{2}$ . The linear interpolation trajectory of the resulting sample $S^{*}$ then matches $S_{1}$ , is $v$ -bounded and does not visit points from $S_{2}$ , as desired. $\hfill\blacktriangleleft$

Theorem 26.

In terms of combined complexity, the $(\mathsf{Point},\mathsf{Moment})$ -realisability problem for a conjunction of $k$ literals and a trajectory sample of length $n$ is decidable in $O(n+k\log k)$ time.

Proof.

We assume that we are given a conjunction $C$ of $k$ $(\mathsf{Point},\mathsf{Moment})$ -literals, a sample $S$ of length $n$ and a speed bound $v$ as input. Let $P_{1},\dots,P_{m}$ be the positive atoms occurring in $C$ and let $N_{1},\dots,N_{\ell}$ be the atoms occurring negated in $C$ . First we compute the sample $S_{1}=\langle(p_{1},t_{1}),\dots,(p_{n+m},t_{n+m})\rangle$ , containing both the points in $S$ and the ones occurring in the atoms $P_{1},\dots,P_{m}$ . Computing this sample requires ordering the points occurring in $P_{1},\dots,P_{m}$ by time, and merging these with those from $S$ (which are already ordered by time). This can be done in $O(n+k\log k)$ time. We also compute a sample $S_{2}$ , containing the points from $N_{1},\dots,N_{\ell}$ , in $O(k\log k)$ time. Now we have $(S,v)\models C$ if and only if there exists a $v$ -bounded trajectory matching $S_{1}$ and not visiting any point in $S_{2}$ . This can be determined by testing for the conditions of Lemma 25 in $O(|S_{1}|+|S_{2}|)$ time, where $|S_{1}|+|S_{2}|=n+m+\ell=n+k$ . Thus, the total time required by the described procedure is $O(n+k\log k)$ . $\hfill\blacktriangleleft$
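The procedure from this proof can be sketched as follows. The function name `realisable` and the input encoding are assumptions of this illustration: the sample is a non-empty, time-sorted list of `((x, y), t)` pairs, and the literals are passed as two lists of space-time points. The sketch further assumes $v>0$ and integer or `Fraction` inputs, so that all comparisons are exact, and it treats forbidden points outside the time span of $S_{1}$ as avoidable.

```python
from heapq import merge


def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def realisable(sample, positives, negatives, v):
    """Decide (S, v) |= C for a conjunction C of (Point, Moment)-literals."""
    by_time = lambda pt: pt[1]
    # S1: merge S with the sorted positive points, dropping exact duplicates.
    s1 = []
    for pt in merge(sample, sorted(positives, key=by_time), key=by_time):
        if s1 and s1[-1][1] == pt[1]:
            if s1[-1][0] != pt[0]:
                return False  # two different locations at the same moment
            continue
        s1.append(pt)
    s2 = sorted(negatives, key=by_time)

    # Condition (2) of Lemma 25: S1 is v-consistent (squared form).
    for (p, t0), (q, t1) in zip(s1, s1[1:]):
        if sq_dist(p, q) > v * v * (t1 - t0) ** 2:
            return False

    # Conditions (1) and (3): walk S2 along S1 in one merge-like pass.
    i = 0
    for q, t in s2:
        while i + 1 < len(s1) and s1[i + 1][1] <= t:
            i += 1
        p, t0 = s1[i]
        if t == t0:
            if q == p:
                return False  # condition (1): S1 contains a forbidden point
            continue
        if t < t0 or i + 1 == len(s1):
            continue  # outside the time span of S1 (uses v > 0)
        r, t1 = s1[i + 1]
        # Is (q, t) on the segment between (p, t0) and (r, t1)?
        on_segment = all(
            (q[c] - p[c]) * (t1 - t0) == (r[c] - p[c]) * (t - t0)
            for c in (0, 1)
        )
        if on_segment and sq_dist(p, r) >= v * v * (t1 - t0) ** 2:
            return False  # condition (3): no slack to avoid the point
    return True
```

Sorting the literal points costs $O(k\log k)$ and both passes are linear in $n+k$ , matching the bound of the theorem.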

Corollary 27.

In terms of combined complexity, the $(\mathsf{Point},\mathsf{Moment})$ -realisability problem for an event in disjunctive normal form, of length $k$ , and a trajectory sample of length $n$ is decidable in $O(kn+k\log k)$ time.

Proof.

We assume that we are given a $(\mathsf{Point},\mathsf{Moment})$ -event $e$ in disjunctive normal form, of length $k$ , a sample $S$ of length $n$ and a speed bound $v$ as input. Then, $e$ is of the form $C_{1}\lor\dots\lor C_{\ell}$ , where every disjunct $C_{i}$ is a conjunction of literals. Let us say that the event $C_{i}$ contains $k_{i}$ literals. Because $e$ is realisable if and only if one of the $C_{i}$ ’s is realisable, we can use Theorem 26 to determine the realisability of $e$ , by determining it for each of the disjuncts. This takes $\sum_{i=1}^{\ell}O(n+k_{i}\log k_{i})$ time, which is $O(kn+k\log k)$ . $\hfill\blacktriangleleft$
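The final bound can be checked term by term. The sketch below assumes that the length $k$ of $e$ bounds both the number $\ell$ of disjuncts and the total number of literals, that is, $\ell\leq k$ and $\sum_{i=1}^{\ell}k_{i}\leq k$ :

```latex
\sum_{i=1}^{\ell}O(n+k_{i}\log k_{i})
  \;=\;O\Bigl(\ell n+\sum_{i=1}^{\ell}k_{i}\log k_{i}\Bigr)
  \;\subseteq\;O\Bigl(\ell n+\log k\sum_{i=1}^{\ell}k_{i}\Bigr)
  \;\subseteq\;O(kn+k\log k).
```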

5 Conclusion

We have proposed a family of query languages for trajectory sample databases. Query expressions in these languages contain events, which are used to describe constraints on trajectories. When performing query evaluation, an essential problem to be solved is the realisability of an event. We studied the complexity of this realisability problem in terms of data, query and combined complexity. These results are summarised in Table 1. These complexity results are given in a computational model wherein (arithmetic and comparison) operations on rational numbers take unit time. It is not clear whether our results remain true in a model in which we measure the cost of these operations in terms of the length of the bit-representation of rational numbers. Also, we did not give much attention to the evaluation of query expressions outside of the realisability problem. We assumed that query expressions are evaluated in some standard way, but more efficient strategies might exist.

References

[1] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. URL: http://webdam.inria.fr/Alice/.
[2] S. Basu, R. Pollack, and M.-F. Roy. Algorithms in real algebraic geometry. Algorithms and Computation in Mathematics 10. Springer, Berlin, 2003.
[3] Jacek Bochnak, Michel Coste, and Marie-Françoise Roy. Real Algebraic Geometry, volume 36 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer-Verlag, 1998.
[4] Vania Bogorny, Bart Kuijpers, and Luis Otávio Alvares. ST-DMQL: A semantic trajectory data mining query language. Int. J. Geogr. Inf. Sci., 23(10):1245–1276, 2009. URL: http://www.informaworld.com/smpp/content%7Edb=all%7Econtent=a915093101%7Efrm=abslink.
[5] L. Burns. Transportation, Temporal, and Spatial Components of Accessibility. Lexington Books, Lexington, MA, 1979.
[6] Max J. Egenhofer. Approximation of geospatial lifelines. In Elisa Bertino and Leila De Floriani, editors, SpadaGIS, Workshop on Spatial Data and Geographic Information Systems. University of Genova, 2003.
[7] Fosca Giannotti and Dino Pedreschi, editors. Mobility, Data Mining and Privacy - Geographic Knowledge Discovery. Springer, 2008. doi:10.1007/978-3-540-75177-9.
[8] R. Güting and M. Schneider. Moving Object Databases. Morgan Kaufmann, 2005.
[9] T. Hägerstrand. What about people in regional science? Papers of the Regional Science Association, 24:7–21, 1970.
[10] Kathleen Hornsby and Max J. Egenhofer. Modeling moving objects over multiple granularities. Ann. Math. Artif. Intell., 36(1-2):177–194, 2002. doi:10.1023/A:1015812206586.
[11] Tomasz Imieliński and Witold Lipski. Incomplete information in relational databases. J. ACM, 31(4):761–791, September 1984. doi:10.1145/1634.1886.
[12] Donald Janelle and Michael Goodchild. Diurnal patterns of social group distributions in a canadian city. Economic Geography, 59(4):403–425, 1983. URL: http://www.jstor.org/stable/144166.
[13] B. Lenntorp. Paths in Space-Time Environments: A Time-Geographic Study of the Movement Possibilities of Individuals. Number 44 in Series B. Lund Studies in Geography, 1976.
[14] H.J. Miller. Modeling accessibility using space-time prism concepts within geographical information systems. International Journal of Geographical Information Systems, 5:287–301, 1991. doi:10.1080/02693799108927856.
[15] H.J. Miller. A measurement theory for time geography. Geographical Analysis, 2005. doi:10.1111/j.1538-4632.2005.00575.x.
[16] Christos H. Papadimitriou. The Euclidean travelling salesman problem is NP-complete. Theoretical Computer Science, 4(3):237–244, 1977. doi:10.1016/0304-3975(77)90012-3.
[17] Christine Parent, Stefano Spaccapietra, Chiara Renso, Gennady L. Andrienko, Natalia V. Andrienko, Vania Bogorny, Maria Luisa Damiani, Aris Gkoulalas-Divanis, José Antônio Fernandes de Macêdo, Nikos Pelekis, Yannis Theodoridis, and Zhixian Yan. Semantic trajectories modeling and analysis. ACM Comput. Surv., 45(4):42:1–42:32, 2013. doi:10.1145/2501654.2501656.
[18] Dieter Pfoser and Christian S. Jensen. Capturing the uncertainty of moving-object representations. In Ralf Hartmut Güting, Dimitris Papadias, and Frederick H. Lochovsky, editors, Advances in Spatial Databases, 6th International Symposium, SSD’99, Proceedings, volume 1651 of Lecture Notes in Computer Science, pages 111–132. Springer, 1999. doi:10.1007/3-540-48482-5_9.
[19] Alfred Tarski and J. C. C. McKinsey. A Decision Method for Elementary Algebra and Geometry. University of California Press, 1951.
[20] G. Trajcevski, O. Wolfson, K. Hinrichs, and S. Chamberlain. Managing uncertainty in moving objects databases. ACM Trans. Database Syst., 29(3):463–507, 2004. doi:10.1145/1016028.1016030.
[21] Zhixian Yan, Dipanjan Chakraborty, Christine Parent, Stefano Spaccapietra, and Karl Aberer. Semantic trajectories: Mobility data computation and annotation. ACM Trans. Intell. Syst. Technol., 4(3):49:1–49:38, 2013. doi:10.1145/2483669.2483682.

