Weighted Operator Precedence Languages

In recent years, renewed investigation of operator precedence languages (OPL) has uncovered important properties: OPL are closed under all major operations and are characterized, besides the original grammar family, in terms of an automata family (OPA) and an MSO logic; furthermore, they significantly generalize the well-known visibly pushdown languages (VPL). In another area of research, quantitative models of systems are also greatly in demand. In this paper, we lay the foundation to marry these two research fields. We introduce weighted operator precedence automata and show that they strictly extend both OPA and weighted visibly pushdown automata. We prove a Nivat-like result which shows that quantitative OPL can be described by unweighted OPA and very particular weighted OPA. In a Büchi-like theorem, we show that weighted OPA are expressively equivalent to a weighted MSO logic for OPL.


Introduction
In the long history of formal languages the family of regular languages (RL), those that are recognized by finite state machines (FSM) or are generated by regular grammars, has always played a major role: thanks to its simplicity and naturalness it enjoys properties that are only partially extended to larger families. Among the many positive results that have been achieved for RL (e.g., expressiveness, decidability, minimization, ...), those of main interest in this paper are the following:
• RLs have been characterized in terms of various mathematical logics. The pioneering papers are due to Büchi, Elgot, and Trakhtenbrot [7,22,37], who independently developed a monadic second-order (MSO) logic defining exactly the RL family. This work too has been followed by many further results; in particular those that exploited weaker but simpler logics such as first-order, propositional, and temporal ones, which culminated in the breakthrough of model checking to support automatic verification [31,23,8].
• Weighted RLs have been introduced by Schützenberger in his pioneering paper [35]: by assigning a weight in a suitable algebra to each language word, we may specify several attributes of the word, e.g., relevance, probability, etc. Much research then followed and extended Schützenberger's original work in various directions, cf. the books [4,21,26,34,14].
⋆ Supported by Deutsche Forschungsgemeinschaft (DFG) Graduiertenkolleg 1763 (QuantLA).
Unfortunately, all families with greater expressive power than RL -typically context-free languages (CFL), which are the most widely used family in practical applications-pay a price in terms of properties and, consequently, of possible tools supporting their automatic analysis. For instance, for CFL, the containment problem is undecidable and they are not closed under complement. What was not possible for general CFL, however, has been possible for important subclasses of this family, which together we call structured CFL. Informally, with this term we denote those CFLs where the syntactic tree-structure of their words is immediately "visible" in the words themselves. A first historical example of such families is that of parenthesis languages, introduced by McNaughton in another seminal paper [30], which are generated by grammars whose right hand sides are enclosed within pairs of parentheses; not surprisingly an equivalent formalism of parenthesis grammars was soon defined, namely tree-automata which generalize the basics of FSM to tree-like structures instead of linear strings [36]. Among the many variations and generalizations of parenthesis languages the recent family of input-driven languages (IDL) [32,6], alias visibly pushdown languages (VPL) [2], have received much attention in recent literature. For most of these structured CFL, including in particular IDL, all of the algebraic properties of RL still hold [2]. One of the most noticeable results of this research field has been a characterization of IDL/VPL in terms of a MSO logic that is a fairly natural extension of the original Büchi's one for RL [27,2]. This fact has suggested to extend the investigation of weighted RL to various cases of structured languages. The result of such a fertile approach is a rich collection of weighted logics, first studied by Droste and Gastin [12], associated with weighted tree automata [19] and weighted VPAs the automata recognizing VPLs, also called weighted NWAs [29,11].
In an originally unrelated way operator precedence languages (OPL) have been defined and studied in two phases temporally separated by four decades. In his seminal work [24] Floyd was inspired by the precedence of multiplicative operations over additive ones in the execution of arithmetic expressions and extended such a relation to the whole input alphabet in such a way that it could drive a deterministic parsing algorithm that builds the syntax tree of any word that reflects the word's semantics; Fig. 1 and Section 2 give an intuition of how an OP grammar generates arithmetic expressions and assigns them a natural structure. After a few further studies [10], OPL's theoretical investigation has been abandoned due to the advent of LR grammars which, unlike OPL grammars, generate all deterministic CFL.
OPL, however, enjoy a distinguishing property which we can intuitively describe as "OPL are input-driven but not visible". They can be called input-driven since the parsing actions on their words (whether to push or to pop their stack) depend exclusively on the input alphabet and on the relation defined thereon, but their structure is not visible in their words: e.g., they can include unparenthesized arithmetic expressions where the precedence of multiplicative operators over additive ones is explicit in the syntax trees but hidden in their frontiers (see Fig. 1). Furthermore, unlike other structured CFL, OPL include deterministic CFL that are not real-time [28].
This remark suggested resuming their investigation systematically in the light of recent technological advances and related challenges. Such a renewed investigation proved their closure under all major language operations [9] and characterized them, besides Floyd's original grammars, in terms of an appropriate class of pushdown automata (OPA) and in terms of an MSO logic which is a fairly natural but nontrivial extension of the previous ones defined to characterize RL and VPL [28]. Thus, OPL enjoy the same nice properties as RL and many structured CFL but considerably extend their applicability by breaking the barrier of visibility and real-time pushdown recognition.
In this paper we put together the two above research fields, namely we introduce weighted OPL and show that they are able to model system behaviors that cannot be specified by means of less powerful weighted formalisms such as weighted VPL. For instance, one might be interested in the behavior of a system which handles calls and returns but is subject to some emergency interrupts. Then it is important to evaluate how critically the occurrences of interrupts affect the normal system behavior, e.g., by counting the number of pending calls that have been preempted by an interrupt. As another example consider a system logging all hierarchical calls and returns over words where this structural information is hidden. Depending on changing exterior factors like energy level, such a system could decide to log the above information in a selective way.
Our main contributions in this paper are the following.
• The model of weighted OPA, which have semiring weights at their transitions, significantly increases the descriptive power of previous weighted extensions of VPA, and has desired closure and robustness properties.
• For arbitrary semirings, there is a relevant difference in the expressive power of the model depending on whether it permits assigning weights to pop transitions or not. For commutative semirings, however, weights on pop transitions do not increase the expressive power of the automata. The difference in descriptive power between weighted OPA with arbitrary weights and without weights at pop transitions is due to the fact that OPL may be non-real-time and therefore OPA may execute several pop moves without advancing their reading heads.
• An extension of the classical result of Nivat [33] to weighted OPL. This robustness result shows that the behaviors of weighted OPA without weights at pop transitions are exactly those that can be constructed from weighted OPA with only one state, intersected with OPL, and applying projections which preserve the structural information.
• A weighted MSO logic and, for arbitrary semirings, a Büchi-Elgot-Trakhtenbrot theorem proving its expressive equivalence to weighted OPA without weights at pop transitions. As a corollary, for commutative semirings this weighted logic is equivalent to weighted OPA including weights at pop transitions.

Preliminaries
We start with an example to provide an intuition of the idea by which R. Floyd made the hidden precedences between symbols occurring in a grammar explicit in parse trees [24]: consider arithmetic expressions with two operators, an additive one and a multiplicative one that takes precedence over the other one, in the sense that, during the interpretation of the expression, multiplications must be executed before sums. Parentheses are used to force different precedence hierarchies. Figure 1 (left) presents a grammar and (center) the derivation tree of the expression n + n × (n + n); all nonterminals are axioms.
Notice that the structure of the syntax tree (uniquely) corresponding to the input expression reflects the precedence order which drives computing the value attributed to the expression. This structure, however, is not immediately visible in the expression; if we used a parenthesis grammar, it would produce the string (n + (n × (n + n))) instead of the previous one, and the structure of the corresponding tree would be immediately visible. For this reason we say that such grammars "hide" the structure associated with a sentence, whereas parenthesis grammars and other input-driven ones make the structure explicit in the sentences they generate. To model this hierarchical structure and make it accessible, we introduce the chain relation . This new relation can be compared with the nesting or matching relation of [2], as it also is a non-crossing relation, going always forward and originating from additional information on the alphabet. However, it also features significant differences: Instead of adding unary information to symbols, which partition the alphabet into three disjoint parts (calls, internals, and returns), we add a binary relation for every pair of symbols denoting their precedence relation. Therefore, in contrast to the nesting relation, the same symbol can be either call or return depending on its context. Furthermore, the same position can be part of multiple chain relations.
More precisely, we define an OP alphabet as a pair (Σ, M), where Σ is an alphabet and M, the operator precedence matrix (OPM), is a |Σ ∪ {#}|² array describing for each ordered pair of symbols at most one (operator precedence) relation, that is, every entry of M is either ⋖ (yields precedence), ≐ (equal in precedence), ⋗ (takes precedence), or empty (no relation).
We use the symbol # to mark the beginning and the end of a word and always let # ⋖ a and a ⋗ # for all a ∈ Σ. As an example, Figure 1 (right) depicts the OPM of the grammar reported on its left, omitting the standard relations for #.
Let $w = (a_1 \dots a_n) \in \Sigma^+$ be a word. We set $a_0 = a_{n+1} = \#$ and define a new relation ↷ on the set of all positions of #w#, inductively, as follows. Let $i, j \in \{0, 1, \dots, n+1\}$, $i < j$. Then, we write i ↷ j if there exists a sequence of positions … We say w is compatible with M if for #w# we have 0 ↷ n + 1. In particular, this forces $M_{a_i a_j} \neq \emptyset$ for all i + 1 = j and for all i ↷ j. We denote by (Σ, M)⁺ the set of all non-empty words over Σ which are compatible with M. For a complete OPM M, i.e., one without empty entries, this is Σ⁺.
We recall the definition of an operator precedence automaton from [28]. Let Γ = Σ × Q. A configuration of A is a triple C = ⟨Π, q, w#⟩, where Π ∈ ⊥Γ* represents a stack, q ∈ Q the current state, and w the remaining input to read. A run of A on $w = a_1 \dots a_n$ is a finite sequence of configurations $C_0 \vdash \dots \vdash C_m$ such that every transition $C_i \vdash C_{i+1}$ has one of the following forms, where a is the topmost alphabet symbol of Π and b is the next symbol of the input to read: …
An accepting run of A on w is a run from ⟨⊥, q_I, w#⟩ to ⟨⊥, q_F, #⟩, where q_I ∈ I and q_F ∈ F. The language accepted by A, denoted L(A), consists of all words over (Σ, M)⁺ which have an accepting run on A. We say that L ⊆ (Σ, M)⁺ is an OPL if L is accepted by an OPA over (Σ, M). As proven in [28], the deterministic variant of an OPA, using a single initial state instead of I and transition functions instead of relations, is as expressive as nondeterministic OPA. An example automaton is depicted in Figure 2: with the OPM of Figure 1 (right), it accepts the same language as the grammar of Figure 1.
… where a ∈ Σ ∪ {#}, x, y are first-order variables, and X is a second-order variable.
We define the natural semantics for this (unweighted) logic as in [28]. The relation ↷ refers to the chain relation introduced above.

Weighted OPA and Their Connection to Weighted VPA
In this section, we introduce a weighted extension of operator precedence automata. We show that weighted OPL include weighted VPL and give examples showing how these weighted automata can express behaviors which were not expressible before. Let K = (K, +, ·, 0, 1) be a semiring, i.e., (K, +, 0) is a commutative monoid, (K, ·, 1) is a monoid, (x+y)·z = x·z+y·z, x·(y+z) = x·y+x·z, and 0 · x = x · 0 = 0 for all x, y, z ∈ K. K is called commutative if (K, ·, 1) is commutative.
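The semiring axioms above can be spot-checked on small samples. The following is a minimal sketch, not part of the formal development: the class name `Semiring`, the helper names, and the three instances (natural numbers, a max-plus semiring, and finite languages with union and concatenation) are illustrative assumptions, and the check covers only the identity, annihilation, additive commutativity, and distributivity laws on the given samples.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]
    zero: Any
    one: Any


# Natural numbers with ordinary addition and multiplication.
NAT = Semiring(lambda x, y: x + y, lambda x, y: x * y, 0, 1)

# Max-plus: "addition" is max, "multiplication" is +, zero is -infinity.
NEG_INF = float("-inf")
MAX_PLUS = Semiring(max, lambda x, y: x + y, NEG_INF, 0)

# Finite languages: union as addition, element-wise concatenation as product.
FIN = Semiring(
    lambda x, y: x | y,
    lambda x, y: frozenset(u + v for u in x for v in y),
    frozenset(),
    frozenset({""}),
)


def satisfies_axioms(sr, samples):
    """Check units, annihilation, commutativity of plus, and distributivity
    on a finite list of sample elements (associativity is not checked)."""
    for x in samples:
        if sr.plus(x, sr.zero) != x:
            return False
        if sr.times(x, sr.one) != x or sr.times(sr.one, x) != x:
            return False
        if sr.times(x, sr.zero) != sr.zero or sr.times(sr.zero, x) != sr.zero:
            return False
        for y in samples:
            if sr.plus(x, y) != sr.plus(y, x):
                return False
            for z in samples:
                left = sr.times(sr.plus(x, y), z)
                if left != sr.plus(sr.times(x, z), sr.times(y, z)):
                    return False
                right = sr.times(x, sr.plus(y, z))
                if right != sr.plus(sr.times(x, y), sr.times(x, z)):
                    return False
    return True


def is_commutative_on(sr, samples):
    """Check whether the product commutes on the given samples."""
    return all(sr.times(x, y) == sr.times(y, x) for x in samples for y in samples)
```

On these samples the finite-language semiring passes the checked laws but its product does not commute, which is the kind of semiring for which weights at pop transitions matter.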
A configuration of a wOPA is a tuple C = ⟨Π, q, w#, k⟩, where ⟨Π, q, w#⟩ is a configuration of the OPA A′ and k ∈ K. A run of A is again a sequence of configurations $C_0 \vdash C_1 \vdash \dots \vdash C_m$ satisfying the previous conditions and, additionally, the weight of a configuration is updated by multiplying with the weight of the encountered transition, as follows. As before, we denote by a the topmost symbol of Π and by b the next symbol of the input to read: …
We call a run ρ accepting if it goes from ⟨⊥, q_I, w#, 1⟩ to ⟨⊥, q_F, #, k⟩, where q_I ∈ I and q_F ∈ F. For such an accepting run, the weight of ρ is defined as wt(ρ) = k. We denote by acc(A, w) the set of all accepting runs of A on w.
Finally, the behavior of A is a function ⟦A⟧ : (Σ, M)⁺ → K, defined as
$\llbracket A \rrbracket(w) = \sum_{\rho \in \mathrm{acc}(A, w)} \mathrm{wt}(\rho)$.
Every function S : (Σ, M)⁺ → K is called an OP-series (short: series, also weighted language). A wOPA A recognizes or accepts a series S if ⟦A⟧ = S. A series S is called regular or a wOPL if there exists a wOPA A accepting it. S is strictly regular or an rwOPL if there exists an rwOPA A accepting it.
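To make the notion of behavior concrete, here is a small sketch that enumerates all runs of a weighted OPA on a word and adds up the weights of the accepting ones. It rests on assumptions that are not spelled out in the text above: the move semantics follow the usual OPA reading of [28] (push on ⋖ stores the input symbol with the current state, shift on ≐ replaces the top symbol and keeps its stored state, pop on ⋗ removes the top entry without reading input), the relations are encoded as `"<"`, `"="`, `">"`, and the function name `behavior` and the transition dictionaries are hypothetical.

```python
from operator import add, mul


def behavior(word, opm, init, final, push, shift, pop, plus, times, zero, one):
    """Sum, over all accepting runs on `word`, of the product of the
    transition weights along the run.

    push and shift map (state, symbol, next_state) to a weight;
    pop maps (state, stacked_state, next_state) to a weight.
    """
    inp = list(word) + ["#"]
    total = zero
    # A configuration is (stack, state, input position, accumulated weight);
    # the stack is a tuple of (symbol, state) pairs, () is the bottom.
    todo = [((), q, 0, one) for q in init]
    while todo:
        stack, q, i, k = todo.pop()
        a = stack[-1][0] if stack else "#"
        b = inp[i]
        if not stack and b == "#":
            if q in final:
                total = plus(total, k)
            continue
        # Standard relations for #, otherwise look up the OPM.
        rel = "<" if a == "#" else (">" if b == "#" else opm.get((a, b)))
        if rel == "<":
            for (p, c, r), wt in push.items():
                if p == q and c == b:
                    todo.append((stack + ((b, q),), r, i + 1, times(k, wt)))
        elif rel == "=":
            for (p, c, r), wt in shift.items():
                if p == q and c == b:
                    new_top = (b, stack[-1][1])
                    todo.append((stack[:-1] + (new_top,), r, i + 1, times(k, wt)))
        elif rel == ">":
            for (p, s, r), wt in pop.items():
                if p == q and s == stack[-1][1]:
                    todo.append((stack[:-1], r, i, times(k, wt)))
        # An empty OPM entry means this run cannot be continued.
    return total


# Hypothetical two-state automaton over the natural numbers.
PUSH = {(0, "c", 0): 2, (0, "c", 1): 3}
POP = {(0, 0, 0): 1, (1, 0, 1): 1}
```

With these hypothetical transitions, the word `c` has two accepting runs, one per target state of the push, and the behavior is the sum of their weights.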
Example 5. Let us resume, in a simplified version, an example presented in [28] (Example 8) which exploits the ability of OPA to pop many items from the stack without advancing the input head: in this way we can model a system that manages calls and returns in a traditional LIFO policy but discards all pending calls if an interrupt occurs. The weighted automaton of Figure 3 attaches weights to the OPA's transitions in such a way that the final weight of a string is 1 only if no pending call is discarded by any interrupt; otherwise, the more calls are discarded the lower the "quality" of the input as measured by its weight. More precisely, we define Σ = {call, ret, int} and the precedence matrix M as a subset of the matrix of Example 8 of [28], i.e., call ⋖ call, call ≐ ret, call ⋗ int, int ⋖ int, int ⋗ call, and ret ⋗ a for all a ∈ Σ.
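The effect of this precedence matrix can be traced mechanically. The sketch below is an illustration only: it ignores states and weights, encodes ⋖, ≐, ⋗ as `"<"`, `"="`, `">"`, and assumes the usual stack discipline of [28] (push on ⋖, replace the top symbol on ≐, pop without reading on ⋗). The names `trace` and `pending_calls` are hypothetical; a call is counted as pending when it is popped while still on top of the stack, i.e., before any return was shifted onto it.

```python
# Precedence matrix of Example 5; "<", "=", ">" stand for the three relations.
OPM = {
    ("call", "call"): "<",
    ("call", "ret"): "=",
    ("call", "int"): ">",
    ("int", "int"): "<",
    ("int", "call"): ">",
    ("ret", "call"): ">",
    ("ret", "ret"): ">",
    ("ret", "int"): ">",
}


def relation(a, b):
    """Relation between top-of-stack symbol a and next input symbol b."""
    if a == "#":
        return "<"  # standard relation: # yields precedence to every symbol
    if b == "#":
        return ">"  # standard relation: every symbol takes precedence over #
    return OPM.get((a, b))


def trace(word):
    """Return the moves as (name, top symbol, next input symbol) triples."""
    stack, inp, i, moves = ["#"], list(word) + ["#"], 0, []
    while not (stack == ["#"] and inp[i] == "#"):
        a, b = stack[-1], inp[i]
        rel = relation(a, b)
        if rel == "<":
            stack.append(b)
            i += 1
            moves.append(("push", a, b))
        elif rel == "=":
            stack[-1] = b  # the return replaces its matching call
            i += 1
            moves.append(("shift", a, b))
        elif rel == ">":
            stack.pop()  # the input head does not advance
            moves.append(("pop", a, b))
        else:
            raise ValueError(f"no precedence relation between {a} and {b}")
    return moves


def pending_calls(word):
    """Count calls popped before any return was shifted onto them."""
    return sum(1 for name, a, _ in trace(word) if name == "pop" and a == "call")
```

For the word `call call int`, the trace contains two consecutive pops triggered by the single `int`, which is the non-real-time behavior the example relies on.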
By adopting the same graphical notation as in [28], pushes are normal arrows, shifts are dashed, and pops are double arrows; weights are given in brackets at transitions. Let #pcall(w) be the number of pending calls of w, i.e., calls which are never answered by a return. Then the behavior of the automaton A_penalty over (Σ, M) and the semiring (N, +, ·, 0, 1) given in Figure 3 is …
Fig. 3. The weighted OPA A_penalty penalizing unmatched calls (state q0; edge labels call (1/2), ret (2), q0 (1), int (1)).
The example can be easily enriched by following the same path outlined in [28]: we could add symbols specifying the serving of an interrupt, add different types of calls and interrupts with different priorities and more sophisticated policies (e.g., lower level interrupts disable new calls but do not discard them, whereas higher level interrupts reset the whole system, etc.).
Example 6. The wOPA of Figure 3 is "rooted" in a deterministic OPA; thus the semiring of weights is exploited in a fairly trivial way since only the · operation is used. The automaton A_policy given in Figure 4, instead, formalizes a more complex system where the penalties for unmatched calls may change nondeterministically within intervals delimited by the special symbol $. Precisely, the symbols $ mark intervals during which sequences of calls, returns, and interrupts occur; "normally" unmatched calls are not penalized, but there is a special, nondeterministically chosen interval during which they are penalized; the global weight assigned to an input sequence is the maximum over all nondeterministic runs that are possible when recognizing the sequence.
Here, the alphabet is Σ = {call, ret, int, $}, and the OPM M, with a ⋖ $ and $ ⋗ a for all a ∈ Σ, is a natural extension of the OPM of Example 5. As semiring, we take R_max = (R ∪ {−∞}, max, +, −∞, 0). Then, ⟦A_policy⟧(w) equals the maximal number of pending calls between two consecutive $. Again, A_policy can be easily modified/enriched to formalize several variations of its policy. Note that both automata, A_penalty and A_policy, do not use the weight assignment for pops.
Example 7. The next automaton A_log, depicted in Figure 5, chooses nondeterministically between logging everything and logging only 'important' information, e.g., only interrupts (this could be a system dependent on energy, WiFi, ...). Notice that, unlike the previous examples, in this case assigning nontrivial weights to pop transitions is crucial. Let Σ = {call, ret, int}, and define M as for A_penalty. We employ the semiring …
As hinted at by our last example, the following proposition shows that in general, wOPA are more expressive than rwOPA.

Proposition 8.
There exists an OP alphabet (Σ, M ) and a semiring K such that there exists a weighted language S which is regular but not strictly regular.
Proof. Let Σ = {c, r}, c ⋖ c, and c ≐ r. Consider the semiring Fin_{a,b} of all finite languages over {a, b} together with union and concatenation. Let S : (Σ, M)⁺ → Fin_{a,b} be the following series: $S(w) = \{a^n b a^n\}$ if $w = c^n r$ for some n ∈ N, and $S(w) = \emptyset$ otherwise. Then, we can define a wOPA which only reads $c^n r$, assigns the weight {a} to every push and pop, and the weight {b} to the one shift, and therefore accepts S, as in Figure 6. Now, we show with a pumping argument that there exists no rwOPA which recognizes S. Assume there is an rwOPA A with ⟦A⟧ = S. Note that for all n ∈ N, the structure of $c^n r$ is fixed as c ⋖ c ⋖ ... ⋖ c ≐ r. Let ρ be an accepting run of A on $c^n r$ with wt(ρ) = $\{a^n b a^n\}$. Then, the transitions of ρ consist of n pushes, followed by a shift, followed by n pops. Both the number of states and the number of pairs of states are bounded. If n is sufficiently large, there exist two pop transitions pop(q, p, r) and pop(q′, p′, r′) in this sequence such that q = q′ and p = p′. This means that we have a loop in the pop transitions going from state q to q′ = q. Furthermore, the corresponding push to the first transition of this loop was invoked when the automaton was in state p′, while the corresponding push to the last pop was invoked in state p.
Since p = p′, we also have a loop at the corresponding pushes. Then, the run where we skip both loops in the pops and in the pushes is an accepting run for $c^{n-k} r$, for some k ∈ N \ {0}.
Since the weight of all pops is trivial, the weight of the pop-loop is ε. If the weight of the push-loop is also ε, then we have an accepting run for $c^{n-k} r$ of weight $\{a^n b a^n\}$, a contradiction. If the weight of the push-loop is not trivial, then by a simple case distinction it has to be either $\{a^i\}$ for some i ∈ N \ {0} or it has to contain the b. In the first case, the run without both loops has weight $\{a^{n-i} b a^n\}$ or $\{a^n b a^{n-i}\}$; in the second case it has weight $\{a^j\}$, for some j ∈ N. None of these weights is of the form $\{a^{n-k} b a^{n-k}\}$, a contradiction.

⊓ ⊔
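The weights used in this proof can be computed directly in the semiring of finite languages. The sketch below is only an illustration of the arithmetic, not of the pumping argument: languages are modelled as Python `frozenset`s of strings, the product is element-wise concatenation, and the helper names (`concat`, `run_weight`, `weight_on_cn_r`) are hypothetical. It multiplies the transition weights of a run on c^n r (n pushes, one shift, n pops), once with weight {a} on pops and once with trivial pop weights.

```python
from functools import reduce

ONE = frozenset({""})  # multiplicative unit: the language containing only ε


def concat(x, y):
    """Product in the semiring of finite languages: element-wise concatenation."""
    return frozenset(u + v for u in x for v in y)


def run_weight(weights):
    """Multiply the transition weights of a run from left to right."""
    return reduce(concat, weights, ONE)


def weight_on_cn_r(n, pop_weight):
    """Weight of a run on c^n r with {a} on each push, {b} on the shift,
    and `pop_weight` on each of the n pops."""
    a, b = frozenset({"a"}), frozenset({"b"})
    return run_weight([a] * n + [b] + [pop_weight] * n)
```

With pop weight {a} the run on c³r evaluates to {aaabaaa}; with trivial pop weights and the same push and shift weights it evaluates to {aaab}, since nothing can be appended after the shift.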
We notice that, using the same arguments, we can show that no weighted nested word automaton as defined in [29,18] can recognize this series either. Even stronger, we can prove that restricted weighted OPLs are a generalization of weighted VPLs in the following sense. We briefly recall the important definitions. Let Σ = Σ_call ⊔ Σ_int ⊔ Σ_ret be a visibly pushdown alphabet. A VPA is a pushdown automaton which uses a push or a pop transition whenever it reads a call or return symbol, respectively. In [9], it was shown that, using the complete OPM of Fig. 7, for every VPA there exists an equivalent operator precedence grammar, which in turn can be transformed into an equivalent OPA.
In [29] and [18], weighted extensions of VPA were introduced (in the form of weighted nested word automata, wNWA). These add semiring weights at every transition, again depending on which symbols are calls, internals, or returns. Note that every nested word has a representation as a word over a visibly pushdown alphabet Σ and therefore can be seen as a compatible word of (Σ, M)⁺, where M is the OPM of Fig. 7, i.e., we can interpret the behavior of a wNWA as an OP-series (Σ, M)⁺ → K.
Theorem 9. Let K be a semiring, Σ be a visibly pushdown alphabet, and M be the OPM of Fig. 7. Then for every wNWA A defined as in [18], there exists an rwOPA B with ⟦A⟧(w) = ⟦B⟧(w) for all w ∈ (Σ, M)⁺.
We give an intuition for this result as follows. Note that although sharing some similarities, pushes, shifts, and pops are not the same thing as calls, internals, and returns. Indeed, a return of a (w)NWA reads and 'consumes' a symbol, while a pop of an (rw)OPA just pops the stack and leaves the next symbol untouched.
After studying Figure 7, this leads to the important observation that every symbol of Σ ret and therefore every return transition of an NWA is simulated not by a pop, but by a shift transition of an OPA (in the unweighted and weighted case).
We give a short demonstrating example: Then every run of an NWA for this word looks like Every run of an OPA (using the OPM of Fig. 7) looks as follows: where the return was substituted (by the OPM, not by a choice of ours) by a shift followed by a pop. It follows that we can simulate a weighted call by a weighted push, a weighted internal by a weighted push together with a pop and a weighted return by a weighted shift together with a pop. Therefore, we may indeed omit weights at pop transitions.

⊓ ⊔
Together with the result that OPA are strictly more expressive than VPAs [9], this gives a complete picture of the expressive power of these three classes of weighted languages: wVPL ⊊ rwOPL ⊊ wOPL.
The following result shows that for commutative semirings the second part of this hierarchy collapses, i.e. restricted rwOPA are equally expressive as wOPA (and therefore can be seen as a kind of normal form in this case). Proof. Let A = (Q, I, F, δ, wt) be a wOPA over (Σ, M ) and K. Note that for every pop transition of a wOPA, there exists exactly one push transition. We construct an rwOPA B over the state set Q ′ = Q × Q × Q and with the same behavior as A with the following idea in mind. In the first state component B simulates A. In the second and third state component of Q ′ the automaton B preemptively guesses the states q and r of the pop transition (q, p, r) of A which corresponds to the next push transition following after this configuration. This enables us to transfer the weight from the pop transition to the correct push transition.
The detailed construction of B = (Q ′ , I ′ , F ′ , δ ′ , wt ′ ) over (Σ, M ) and K is the following. If Q = ∅, then A ≡ 0 is trivially strictly regular. If Q is nonempty, let q ∈ Q be a fixed state. Then, we set Q ′ = Q × Q × Q, I ′ = {(q 1 , q 2 , q 3 ) | q 1 ∈ I, q 2 , q 3 ∈ Q}, F ′ = {(q 1 , q, q) | q 1 ∈ F }, and δ ′ push = {((q 1 , q 2 , q 3 ), a, (r 1 , r 2 , r 3 )) | (q 1 , a, r 1 ) ∈ δ push and (q 2 , q 1 , Here, every push of B controls that the previously guessed q 2 and q 3 can be used by a pop transition of A going from q 2 to q 3 with q 1 on top of the stack. Every pop controls that the symbols on top of the stack are exactly the ones used at this pop. Since the second and third state component are guessed for the next push, they are passed on whenever we read a shift or pop. The second and third component pushed at the first position of a word are guessed by an initial state. At the last push, which therefore has no following push and will propagate the second and third component to the end of the run, the automaton B has to guess the distinguished state used in the final states. Therefore, B has exactly one accepting run (of the same length) for every accepting run of A, and vice versa. Finally, we define the transition weights as follows.
⊓ ⊔
In the following, we study closure properties of weighted OPA and restricted weighted OPA. As usual, we extend the operations + and · to series S, T : (Σ, M)⁺ → K by means of pointwise definitions as follows: $(S + T)(w) = S(w) + T(w)$ and $(S \odot T)(w) = S(w) \cdot T(w)$ for each w ∈ (Σ, M)⁺.
Proposition 11. The sum of two regular (resp. strictly regular) series over (Σ, M)⁺ is again regular (resp. strictly regular).
Proof. We use a product construction of automata.
Note that given a word w, the automata A, B, and C have to use pushes, shifts, and pops at the same positions. Hence, every accepting run of C on w defines exactly one accepting run of B and exactly one accepting run of A on w with matching weights, and vice versa. We obtain It follows that, C = S ∩ L.
Analogously to [18] and [11], this implies that for every run ρ of A on w, there exists exactly one run ρ ′ of B on v with h(w) = v and wt(ρ)=wt(ρ ′ ). One difference to previous works is that a pop of a wOPA is not consuming the symbol. Therefore, we have to make sure to not change the symbol, which we are currently remembering while processing a pop.

A Nivat Theorem
In this section, we establish a connection between weighted OPLs and strictly regular series. We show that strictly regular series are exactly those series which can be derived from a restricted weighted OPA with only one state, intersected with an unweighted OPL, and using an OPM-preserving projection of the alphabet.
Let h : Σ ′ → Σ be a map between two alphabets. Given an OP alphabet As h is OPM-preserving, for every series S : (Σ, M ) + → K, we get a series h(S) : (Σ ′ , h −1 (M )) + → K, using the sum over all pre-images as in formula (1).
Let N (Σ, M, K) comprise all series S : (Σ, M ) + → K for which there exist an alphabet Σ ′ , a map h : Σ ′ → Σ, and a one-state rwOPA B over (Σ ′ , h −1 (M )) and K and an OPL L over (Σ ′ , h −1 (M )) such that S = h( B ∩ L). Now, we show that every strictly regular series can be decomposed into the above introduced fragments. Proof. We follow some ideas of [15] and [17].
Let A = (Q, I, F, δ, wt) be a rwOPA over (Σ, M ) and K with A = S. We set Σ ′ = Q × Σ × Q as the extended alphabet. The intuition is that Σ ′ consists of the push and the shift transitions of A. Let h be the projection of Σ ′ to Σ and let M ′ = h −1 (M ).
Let L ⊆ (Σ ′ , M ′ ) + be the language consisting of all words w ′ over the extended alphabet such that h(w ′ ) has an accepting run on A which uses at every position the push, resp. the shift transition defined by the symbol of Σ ′ at this position.

Weighted MSO-Logic for OPL
We use modified ideas from Droste and Gastin [12], also incorporating the distinction into an unweighted (boolean) and a weighted part by Bollig and Gastin [5].
Definition 16. We define the weighted logic MSO(K, (Σ, M )), short MSO(K), as x, y are first-order variables; and X is a second order variable.
We call β boolean and ϕ weighted formulas. Let w ∈ (Σ, M ) + and ϕ ∈ MSO(K).  ) if and only if a ⊙ b. We represent the word w together with the assignment σ as a word (w, σ) over (Σ V , M V ) such that 1 denotes every position where x resp. X holds. A word over Σ V is called valid, if every first-order variable is assigned to exactly one position. Being valid is a regular property which can be checked by an OPA.
Example 17. Let us go back to the automaton A policy depicted in Figure 4. The following boolean formula β defines three subsets of string positions, X 0 , X 1 , X 2 , representing, respectively, the string portions where unmatched calls are not penalized, namely X 0 , X 2 , and the portion where they are, namely X 1 .
Weight assignment is formalized by a formula ϕ_{0,2}, which assigns weight 0 to calls, returns, and ints outside portion X_1, and by a formula ϕ_1, which assigns weights 1, −1, 0 to calls, returns, and ints, respectively, within portion X_1. Then, the formula $\psi = \prod_x (\beta \otimes \varphi_{0,2} \otimes \varphi_1)$ defines the weight assigned by A_policy to an input string through a single nondeterministic run, and finally $\chi = \sum_{X_0} \sum_{X_1} \sum_{X_2} \psi$ defines the global weight of every string in a way equivalent to the one defined by A_policy.
Proof. This is shown by means of Proposition 13 analogously to Proposition 3.3 of [12]. ⊓ ⊔ As shown by [12] in the case of words, the full weighted logic is strictly more powerful than weighted automata. A similar example also applies here. Therefore, in the following, we restrict our logic in an appropriate way. The main idea for this is to allow only functions with finitely many different values (step functions) after a product quantification. Furthermore, in the non-commutative case, we either also restrict the application of ⊗ to step functions or we enforce all occurring weights (constants) of ϕ ⊗ θ to commute. Definition 19. The set of almost boolean formulas is the smallest set of all formulas of MSO(K) containing all constants k ∈ K and all boolean formulas which is closed under ⊕ and ⊗.
The following propositions show that almost boolean formulas are describing precisely a certain form of rwOPA's behaviors, which we call OPL step functions. We adapt ideas from [16].
A series S is called an OPL step function if it has a representation $S = \sum_{i=1}^{n} k_i \mathbb{1}_{L_i}$, where the L_i are OPL forming a partition of (Σ, M)⁺ and k_i ∈ K for each i ∈ {1, ..., n}; so S(w) = k_i iff w ∈ L_i, for each i ∈ {1, ..., n}.
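A step function can be pictured as a finite lookup over a partition. The sketch below is a simplification under stated assumptions: languages are stood in for by arbitrary Python predicates on words (it does not check that they are OPL or that they really form a partition), weights are natural numbers, and the names `step_function` and `combine` are hypothetical. It also shows one common way to combine two such representations pointwise, by refining the two partitions through pairwise intersections.

```python
from operator import add, mul


def step_function(parts):
    """Build a series from a list of (weight, predicate) pairs.

    The predicates are assumed to partition the set of all words.
    """
    def series(word):
        for weight, in_language in parts:
            if in_language(word):
                return weight
        raise ValueError("the predicates do not cover this word")
    return series


def combine(parts1, parts2, op):
    """Refine two partitions by pairwise intersection and combine the
    weights of each pair of blocks with `op`."""
    return [
        (op(k1, k2), lambda w, p1=p1, p2=p2: p1(w) and p2(w))
        for k1, p1 in parts1
        for k2, p2 in parts2
    ]


# Two hypothetical partitions of the words over {call, ret, int}.
BY_LENGTH = [(2, lambda w: len(w) % 2 == 0), (3, lambda w: len(w) % 2 == 1)]
BY_FIRST = [(5, lambda w: w[0] == "call"), (7, lambda w: w[0] != "call")]

SUM = step_function(combine(BY_LENGTH, BY_FIRST, add))
PRODUCT = step_function(combine(BY_LENGTH, BY_FIRST, mul))
```

Each combined series again takes only finitely many values, one per non-empty intersection of a block of the first partition with a block of the second.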

Lemma 21.
The set of all OPL step functions is closed under + and ⊙.
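Proof. Let $S = \sum_{i=1}^{n} k_i \mathbb{1}_{L_i}$ and $S' = \sum_{j=1}^{m} k'_j \mathbb{1}_{L'_j}$ be OPL step functions. Then the following holds: $S + S' = \sum_{i=1}^{n} \sum_{j=1}^{m} (k_i + k'_j) \mathbb{1}_{L_i \cap L'_j}$ and $S \odot S' = \sum_{i=1}^{n} \sum_{j=1}^{m} (k_i \cdot k'_j) \mathbb{1}_{L_i \cap L'_j}$. Since the $(L_i \cap L'_j)$ are also OPL and form a partition of (Σ, M)⁺, it follows that S + S′ and S ⊙ S′ are also OPL step functions.
Proof. (a) We show the first statement by structural induction on ϕ. If ϕ is boolean, then $\llbracket \varphi \rrbracket = \mathbb{1}_{L(\varphi)}$, where L(ϕ) and L(¬ϕ) are OPL due to Theorem 3. Therefore, $\llbracket \varphi \rrbracket = 1_K \mathbb{1}_{L(\varphi)} + 0_K \mathbb{1}_{L(\neg\varphi)}$ is an OPL step function. If ϕ = k, k ∈ K, then $\llbracket k \rrbracket = k \mathbb{1}_{(\Sigma, M)^+}$ is an OPL step function. Let V = free(ϕ_1) ∪ free(ϕ_2). By lifting Lemma 18 to OPL step functions as in [17] and by Lemma 21, we see that ⟦ϕ_1 ⊕ ϕ_2⟧ and ⟦ϕ_1 ⊗ ϕ_2⟧ are also OPL step functions. Given an OPL step function $\llbracket \varphi \rrbracket = \sum_{i=1}^{n} k_i \mathbb{1}_{L_i}$, we use Theorem 3 to get ϕ_i with $\llbracket \varphi_i \rrbracket = \mathbb{1}_{L_i}$. Then, the second statement follows from setting $\varphi = \bigvee_{i=1}^{n} (k_i \wedge \varphi_i)$ and the fact that the OPL $(L_i)_{1 \le i \le n}$ form a partition of (Σ, M)⁺.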
⊓ ⊔ Proposition 23. Let S be an OPL step function. Then S is strictly regular.
Proof. Let n ∈ N, let $(L_i)_{1 \le i \le n}$ be OPL forming a partition of (Σ, M)⁺, and let k_i ∈ K for each i ∈ {1, ..., n} such that $S = \sum_{i=1}^{n} k_i \mathbb{1}_{L_i}$. It is easy to construct a 2-state rwOPA recognizing the constant series ⟦k_i⟧ which assigns the weight k_i to every word. Hence, $k_i \mathbb{1}_{L_i} = \llbracket k_i \rrbracket \cap L_i$ is strictly regular by Proposition 12. Therefore, by Proposition 11, S is strictly regular. ⊓ ⊔
Definition 24. Let ϕ ∈ MSO(K). We denote by const(ϕ) all weights of K occurring in ϕ and we call ϕ ⊗-restricted if for all subformulas ψ ⊗ θ of ϕ either ψ is almost boolean or const(ψ) and const(θ) commute elementwise. We call ϕ ∏-restricted if for all subformulas ∏_x ψ of ϕ, ψ is almost boolean. We call ϕ restricted if it is both ⊗- and ∏-restricted.
In Example 17, the formula β is boolean, the formulas φ are almost boolean, and ψ and χ are restricted. Notice that ψ and χ would be restricted even if K were not commutative.
For use in Section 6, we note: Proposition 25. Let S : (Σ, M ) + → K be a regular (resp. strictly regular) series and k ∈ K. Then k ⊙ S is regular (resp. strictly regular).
Proof. Since ϕ is ⊗-restricted, either ψ is almost boolean or the constants of both formulas commute.
Case 1: Let us assume ψ is almost boolean. Then, we can write $[\![\psi]\!]$ as an OPL step function, i.e., $[\![\psi]\!] = \sum_{i=1}^{n} k_i \mathbb{1}_{L_i}$, where the $L_i$ are OPL. So, the series $[\![\psi \otimes \theta]\!]$ equals a sum of series of the form $([\![k_i \otimes \theta]\!] \cap L_i)$. Then, by Proposition 25, $[\![k_i \otimes \theta]\!]$ is a regular (resp. strictly regular) series. Therefore, $([\![k_i \otimes \theta]\!] \cap L_i)$ is regular (resp. strictly regular) by Proposition 12. Hence, $[\![\psi \otimes \theta]\!]$ is regular (resp. strictly regular) by Proposition 11.
Case 2: Let us assume that the constants of ψ and θ commute. Then, the second part of Proposition 12 yields the claim.
□
Lemma 28 (Closure under $\bigoplus_x$, $\bigoplus_X$). Let ϕ be a formula of MSO(K) such that $[\![\varphi]\!]$ is regular (resp. strictly regular). Then, $[\![\bigoplus_x \varphi]\!]$ and $[\![\bigoplus_X \varphi]\!]$ are regular (resp. strictly regular). □
Proposition 29 (Closure under restricted $\prod_x$). Let ϕ be an almost boolean formula of MSO(K). Then, $[\![\prod_x \varphi]\!]$ is strictly regular.
Proof. We use ideas of [12] and the extensions in [18] and [11] with the following intuition.
In the first part, we write $[\![\varphi]\!]$ as an OPL step function and encode the information to which language $(w, \sigma[x \to i])$ belongs in a specially extended language $\tilde{L}$. Then we construct an MSO-formula for this language. Therefore, by Theorem 3, we get a deterministic OPA recognizing $\tilde{L}$. In the second part, we add the weights $k_i$ to this automaton and return to our original alphabet.
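In more detail, let $\varphi \in \mathrm{MSO}(K, (\Sigma, M))$. We define $V = \mathrm{free}(\prod_x \varphi)$ and $W = \mathrm{free}(\varphi) \cup \{x\}$. We consider the extended alphabets $\Sigma_V$ and $\Sigma_W$ together with their natural OPMs $M_V$ and $M_W$. By Proposition 22 and lifting Lemma 18 to OPL step functions, $[\![\varphi]\!]$ is an OPL step function. Let $[\![\varphi]\!] = \sum_{j=1}^{m} k_j \mathbb{1}_{L_j}$, where $L_j$ is an OPL over $(\Sigma_W, M_W)$ for all $j \in \{1, ..., m\}$ and $(L_j)$ is a partition of $(\Sigma_W, M_W)^+$. By the semantics of the product quantifier, we get
$$[\![\textstyle\prod_x \varphi]\!](w, \sigma) = \prod_{i \in [w]} [\![\varphi]\!]_W(w, \sigma[x \to i]) = \prod_{i \in [w]} k_{g(i)}, \quad \text{where } g(i) = j \text{ iff } (w, \sigma[x \to i]) \in L_j. \qquad (2)$$
Now, in the first part, we encode the information to which language $(w, \sigma[x \to i])$ belongs in a specially extended language $\tilde{L}$ and construct an MSO-formula for this language. We define the extended alphabet $\tilde{\Sigma} = \Sigma \times \{1, ..., m\}$, together with its natural OPM $\tilde{M}$, which only refers to Σ, so:
$$\tilde{M}_{(a,i),(b,j)} = M_{a,b} \quad \text{for all } (a, i), (b, j) \in \tilde{\Sigma}.$$
We define the languages $\tilde{L}, \tilde{L}_j, \tilde{L}'_j \subseteq (\tilde{\Sigma}_V, \tilde{M}_V)^+$ as follows:
$$\tilde{L} = \{(w, g, \sigma) \in \tilde{N}_V \mid \text{for all } i \in [w],\ j \in \{1, ..., m\}: g(i) = j \Rightarrow (w, \sigma[x \to i]) \in L_j\},$$
$$\tilde{L}_j = \{(w, g, \sigma) \in \tilde{N}_V \mid \text{for all } i \in [w]: g(i) = j \Rightarrow (w, \sigma[x \to i]) \in L_j\},$$
$$\tilde{L}'_j = \{(w, g, \sigma) \mid \text{for all } i \in [w]: g(i) = j \Rightarrow (w, \sigma[x \to i]) \in L_j\},$$
where $\tilde{N}_V$ denotes the language of all valid words. Then, $\tilde{L} = \bigcap_{j=1}^{m} \tilde{L}_j$. Hence, in order to show that $\tilde{L}$ is an OPL, it suffices to show that each $\tilde{L}_j$ is an OPL. By a standard procedure, compare [12], we obtain a formula $\tilde{\varphi}_j \in \mathrm{MSO}(\tilde{\Sigma}_V, \tilde{M}_V)$ with $L(\tilde{\varphi}_j) = \tilde{L}'_j$. Therefore, by Theorem 3, $\tilde{L}'_j$ is an OPL. It is straightforward to define an OPA accepting $\tilde{N}_V$. By closure under intersection, $\tilde{L}_j = \tilde{L}'_j \cap \tilde{N}_V$ is also an OPL and so is $\tilde{L}$. Hence, there exists a deterministic OPA $\tilde{A} = (Q, q_0, F, \tilde{\delta})$ recognizing $\tilde{L}$.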
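In the second part, we add weights to $\tilde{A}$ as follows. We construct the wOPA $A = (Q, I, F, \delta, \mathrm{wt})$ over $(\Sigma_V, M_V)$ and K by adding to every transition of $\tilde{A}$ with $g(i) = j$ the weight $k_j$.
That is, we keep the states, the initial state, and the accepting states, and for $\tilde{\delta} = (\tilde{\delta}_{push}, \tilde{\delta}_{shift}, \tilde{\delta}_{pop})$ and all $q, q', p \in Q$ and $(a, j, s) \in \tilde{\Sigma}_V$, we define the transitions and weights of A as just described: a push or shift transition of $\tilde{A}$ reading $(a, j, s)$ yields the corresponding transition of A reading $(a, s)$ with weight $k_j$, while pop transitions carry only the trivial weight 1.
Since $\tilde{A}$ is deterministic, for every $(w, g, \sigma) \in \tilde{L}$, there exists exactly one accepted run $\tilde{r}$ of $\tilde{A}$. On the other hand, for every $(w, g, \sigma) \notin \tilde{L}$, there is no accepted run of $\tilde{A}$. Since $(L_j)$ is a partition of $(\Sigma_W, M_W)^+$, for every $(w, \sigma) \in (\Sigma_V, M_V)^+$, there exists exactly one g with $(w, g, \sigma) \in \tilde{L}$. Thus, every $(w, \sigma) \in (\Sigma_V, M_V)^+$ has exactly one run r of A, determined by the run $\tilde{r}$ of $\tilde{A}$ on $(w, g, \sigma)$. We denote by $\mathrm{wt}_A(r, (w, \sigma), i)$ the weight used by the run r of A on $(w, \sigma)$ at position i, which is always the weight of the push or shift transition used at this position. Then by definition of A and $\tilde{L}$, the following holds for all $i \in [w]$:
$$g(i) = j \implies \mathrm{wt}_A(r, (w, \sigma), i) = k_j \ \text{ and } \ (w, \sigma[x \to i]) \in L_j.$$
By formula (2), we obtain
$$[\![\varphi]\!]_W(w, \sigma[x \to i]) = k_j = \mathrm{wt}_A(r, (w, \sigma), i).$$
Hence, for the behavior of the automaton A the following holds:
$$[\![A]\!](w, \sigma) = \prod_{i \in [w]} \mathrm{wt}_A(r, (w, \sigma), i) = \prod_{i \in [w]} [\![\varphi]\!]_W(w, \sigma[x \to i]) = [\![\textstyle\prod_x \varphi]\!](w, \sigma).$$
Thus, A recognizes $[\![\prod_x \varphi]\!]$.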

⊓ ⊔
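The product over positions used in the proof of Proposition 29 can be illustrated by a small sketch. It rests on simplifying assumptions that are not part of the construction: predicates on (word, position) pairs stand in for the OPL $L_j$ over the extended alphabet, free variables other than x are ignored, and the helper names `labelling` and `product_quantifier` are hypothetical. The function g records, for each position i, the index j of the block containing the word with x assigned to i; the value of the product quantifier is then the ordered product of the weights $k_{g(i)}$.

```python
from typing import Any, Callable, Dict, List, Tuple

# Assumption: a predicate on (word, position) replaces membership of
# (w, sigma[x -> i]) in the OPL L_j; positions are 1-based.
Block = Callable[[str, int], bool]
StepFunction = List[Tuple[Any, Block]]


def labelling(phi: StepFunction, w: str) -> Dict[int, int]:
    """Compute g: position i -> index j of the unique block containing (w, i)."""
    g = {}
    for i in range(1, len(w) + 1):
        hits = [j for j, (_, block) in enumerate(phi) if block(w, i)]
        if len(hits) != 1:
            raise ValueError("the blocks do not form a partition")
        g[i] = hits[0]
    return g


def product_quantifier(phi: StepFunction, w: str, mul, one) -> Any:
    """Multiply the weights k_{g(i)} in the order of the positions of w."""
    g = labelling(phi, w)
    total = one
    for i in range(1, len(w) + 1):
        total = mul(total, phi[g[i]][0])
    return total


# Toy step function: weight 2 at positions labelled 'a', weight 1 elsewhere.
phi = [
    (2, lambda w, i: w[i - 1] == "a"),
    (1, lambda w, i: w[i - 1] != "a"),
]

# Over the natural numbers this yields 2 to the number of a's in the word.
print(product_quantifier(phi, "abaa", lambda x, y: x * y, 1))
```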
The following proposition is a summary of the previous results.
Proof. We use structural induction on ϕ. If ϕ is an almost boolean formula, then by Proposition 22 $[\![\varphi]\!]$ is an OPL step function. By Proposition 23, every OPL step function is strictly regular. Closure under ⊕ is dealt with by Lemma 26, closure under ⊗ by Proposition 27. The sum quantifications $\bigoplus_x$ and $\bigoplus_X$ are dealt with by Lemma 28. Since ϕ is restricted, we know that for every subformula $\prod_x \psi$, the formula ψ is almost boolean. Therefore, we can apply Proposition 29 to maintain recognizability of our formula in this case.
The next proposition shows that the converse also holds.
Proposition 31. For every rwOPA A, there exists a restricted MSO(K)-sentence ϕ with $[\![A]\!] = [\![\varphi]\!]$. If K is commutative, then for every wOPA A, there exists a restricted MSO(K)-sentence ϕ with $[\![A]\!] = [\![\varphi]\!]$.
Proof. The rationale adopted to build the formula ϕ from A integrates the approach followed in [12,18] with the one of [28]. On the one hand, we need second order variables suitable to "carry" weights; on the other hand, unlike previous non-OP cases, which are managed through real-time automata, an OPA can perform several transitions while remaining in the same position. Thus, we introduce the following second order variables: $X^{push}_{p,a,q}$ represents the set of positions where A performs a push move from state p, reading symbol a and reaching state q; $X^{shift}_{p,a,q}$ has the same meaning as $X^{push}_{p,a,q}$ for a shift operation; $X^{pop}_{p,q,r}$ represents the set of positions of the symbol that is on top of the stack when A performs a pop transition from state p, with q on top of the stack, reaching r. Let V consist of all $X^{push}_{p,a,q}$, $X^{shift}_{p,a,q}$, and $X^{pop}_{p,q,r}$ such that $a \in \Sigma$, $p, q, r \in Q$ and $(p, a, q) \in \delta_{push}$, resp. $\delta_{shift}$, resp. $(p, q, r) \in \delta_{pop}$. Since Σ and Q are finite, there is an enumeration $\bar{X} = (X_1, ..., X_m)$ of all variables of V. We denote by $\bar{X}^{push}$, $\bar{X}^{shift}$, and $\bar{X}^{pop}$ enumerations over only the respective sets of second order variables.
We use the following usual abbreviations for unweighted formulas of MSO:
Additionally, we use the shortcuts Tree(x, z, v, y), $\mathrm{Next}_i(x, y)$, $Q_i(x, y)$, and $\mathrm{Tree}_{p,q}(x, z, v, y)$, originally defined in [28], reported and adapted here for convenience:
In other words, Tree holds among the four positions (x, z, v, y) iff, at the time when a pop transition is executed: x (resp. y) is the rightmost leaf at the left (resp. the leftmost at the right) of the subtree whose scanning (and construction, if used as a parser) is completed by the OPA through the current transition; z and v are the leftmost and rightmost terminal characters of the right hand side of the grammar production that is reduced by the pop transition of the OPA [28]. For instance, with reference to Figures 1 and 9, Tree(5, 7, 7, 9) and Tree(4, 5, 9, 10) hold.
I.e., y is the position adjacent to x, Lab a (y) and, while reading a, the OPA reaches state q, either through a push or through a shift move.
$\mathrm{Next}_r(x, y)$ := ∃z∃v. Finally, $\mathrm{Tree}_{i,j}(x, z, v, y) := \mathrm{Tree}(x, z, v, y) \wedge Q_i(v, y) \wedge Q_j(x, z)$ refines the predicate Tree by making explicit that i and j are, respectively, the current state and the state on top of the stack when the pop move is executed. We now define the unweighted formula ψ to characterize all accepted runs
Here, the subformula Partition will enforce the push and shift sets to be (together) a partition of all positions. InitFinal controls the initial and the acceptance condition, and $\mathrm{Trans}_{op}$ the transitions of the run together with the labels.
$\mathrm{Trans}_{push}$ = ∀x. p,q∈Q,a∈Σ I.e., if $x \in X^{push}_{p,a,q}$ (resp. $X^{shift}_{p,a,q}$), the formula holds in a run where, reading character a in position x, the automaton performs a push (resp. a shift) reaching state q from p; this may occur when $z \lessdot x$ (resp. $z \doteq x$) is immediately adjacent to x or after a subtree between positions z and x has been built. Notice that the converse of the above implications also holds, due to the fact that the whole set of string positions is partitioned into the two disjoint sets $\bar{X}^{push}$, $\bar{X}^{shift}$.
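$$\mathrm{Trans}_{pop} = \forall v. \bigwedge_{p,q \in Q} \Big[ \Big( \bigvee_{r \in Q} v \in X^{pop}_{p,q,r} \Big) \leftrightarrow \exists x \exists y \exists z. \big( \mathrm{Tree}_{p,q}(x, z, v, y) \big) \Big]$$
Thus, with arguments similar to [28], it can be shown that the sentences satisfying ψ are exactly those recognized by the unweighted OPA underlying A.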
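For an unweighted formula β and two weights $k_1$ and $k_2$, we define the following shortcut for an almost boolean weighted formula:
$$\text{IF } \beta \text{ THEN } k_1 \text{ ELSE } k_2 = (\beta \otimes k_1) \oplus (\neg\beta \otimes k_2).$$
Now, we add weights to ψ by defining the following restricted weighted formula:
$$\theta = \psi \otimes \prod_x \bigotimes_{p,q \in Q} \Big( \bigotimes_{a \in \Sigma} (\text{IF } x \in X^{push}_{p,a,q} \text{ THEN } \mathrm{wt}_{push}(p, a, q) \text{ ELSE } 1) \otimes \bigotimes_{a \in \Sigma} (\text{IF } x \in X^{shift}_{p,a,q} \text{ THEN } \mathrm{wt}_{shift}(p, a, q) \text{ ELSE } 1) \otimes \bigotimes_{r \in Q} (\text{IF } x \in X^{pop}_{p,q,r} \text{ THEN } \mathrm{wt}_{pop}(p, q, r) \text{ ELSE } 1) \Big).$$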
Here, the second part of θ multiplies up all weights of the encountered transitions. This is the crucial part where we either need that K is commutative or all pop weights are trivial because the product quantifier of θ assigns the pop weight at a different position than the occurrence of the respective pop transition in the automaton. Using only one product quantifier (weighted universal quantifier) this is unavoidable, since the number of pops at a given position is only bounded by the word length.

⊓ ⊔
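The role of commutativity in the proof above can be seen on a toy example. The following sketch is an illustration only: the run is hypothetical (three pushes followed by three pops), weights are strings multiplied by concatenation as a simple non-commutative stand-in, and the names `automaton_order` and `position_order` are not from the paper. The automaton multiplies weights in the order in which transitions are executed, whereas a single product quantifier multiplies, position by position, the push or shift weight followed by the pop weight attributed to that position.

```python
from functools import reduce

# A run is a list of (kind, position, weight) triples in execution order.
# For pops, 'position' is the position of the symbol on top of the stack.


def automaton_order(run, mul, one):
    """Multiply weights in the order the transitions are executed."""
    return reduce(mul, (wt for _, _, wt in run), one)


def position_order(run, n, mul, one):
    """Multiply weights position by position, as one product quantifier does."""
    total = one
    for i in range(1, n + 1):
        for _, pos, wt in run:
            if pos == i:
                total = mul(total, wt)
    return total


def concat(u, v):
    return u + v


def times(u, v):
    return u * v


# Hypothetical run on a word of length 3: the pops happen after all pushes.
run = [
    ("push", 1, "a"), ("push", 2, "b"), ("push", 3, "c"),
    ("pop", 3, "x"), ("pop", 2, "y"), ("pop", 1, "z"),
]
# Same run with trivial pop weights (the empty string is the unit).
run_trivial_pops = [(k, p, wt if k != "pop" else "") for k, p, wt in run]
# Same shape of run over the commutative semiring of natural numbers.
run_nat = [(k, p, n) for (k, p, _), n in zip(run, [2, 3, 5, 7, 11, 13])]

print(automaton_order(run, concat, ""), position_order(run, 3, concat, ""))
print(automaton_order(run_trivial_pops, concat, ""),
      position_order(run_trivial_pops, 3, concat, ""))
print(automaton_order(run_nat, times, 1), position_order(run_nat, 3, times, 1))
```

With non-trivial pop weights and a non-commutative product the two orders disagree; they coincide once the pop weights are trivial or the product is commutative, which are exactly the two cases covered by Proposition 31.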
The following theorem summarizes the main results of this section.
Theorem 32. Let K be a semiring and S : (Σ, M ) + → K a series.
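1. The following are equivalent:
(i) $S = [\![A]\!]$ for some restricted wOPA A.
(ii) $S = [\![\varphi]\!]$ for some restricted sentence ϕ of MSO(K).
2. Let K be commutative. Then, the following are equivalent:
(i) $S = [\![A]\!]$ for some wOPA A.
(ii) $S = [\![\varphi]\!]$ for some restricted sentence ϕ of MSO(K).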
Theorem 32 documents a further step in the path of generalizing a series of results beyond the barrier of regular and structured (or visible) CFLs. Up to a few years ago, major properties of regular languages, such as closure w.r.t. all main language operations, decidability results, logic characterization, and, in this case, weighted language versions, could be extended to several classes of structured CFLs, among which the VPL class certainly obtained much attention. OPLs further generalize the above results not only in terms of strict inclusion, but mainly because they are not visible, in the sense explained in the introduction, nor are they necessarily real-time: this allows them to cover important applications that could not be adequately modeled through more restricted classes. Theorem 32 also shows that the typical logical characterization of weighted languages does not generalize in the same way to the whole class wOPL: for non-rwOPL we need the extra hypothesis that K be commutative. This is due to the fact that pop transitions are applied in the reverse order of the positions to which they refer (position v in formula $\mathrm{Trans}_{pop}$). Notice, however, that rwOPL do not forbid unbounded pop sequences; thus, they too include languages that are neither real-time nor visible. This remark naturally raises new intriguing questions, which we will briefly address in the conclusion.

Conclusion
We introduced and investigated weighted operator precedence automata and a corresponding weighted MSO logic. In our main results we show, for any semiring, that wOPA without pop weights and a restricted weighted MSO logic have the same expressive power; furthermore, these behaviors can also be described as homomorphic images of the behaviors of particularly simple wOPA reduced to arbitrary unweighted OPA. If the semiring is commutative, these results apply also to wOPA with arbitrary pop weights.
This raises the problem of finding, for arbitrary semirings and for wOPA with pop weights, both an expressively equivalent weighted MSO logic and a Nivat-type result. In [19], very similar problems arose for weighted automata on unranked trees and weighted MSO logic. In [13], the authors showed that with another definition of the behavior of weighted unranked tree automata, an equivalence result for the restricted weighted MSO logic could be derived. Is there another definition of the behavior of wOPA (with pop weights) making them expressively equivalent to our restricted weighted MSO logic?
In [28], operator precedence languages of infinite words were investigated and shown to be practically important. This raises the problem of developing a theory of wOPA on infinite words. In order to define their infinitary quantitative behaviors, one could try to use valuation monoids as in [16].
Finally, a new investigation field can be opened by exploiting the natural suitability of OPL towards parallel elaboration [3]. Computing weights, in fact, can be seen as a special case of semantic elaboration which can be performed hand-in-hand with parsing. In this case too, we can expect different challenges depending on whether the weight semiring is commutative or not and/or weights are attached to pop transitions too, which would be the natural way to follow the traditional semantic evaluation through synthesized attributes [25].