Fast Gaussian Elimination for Low Treewidth Matrices
Abstract
Let $A$ be an $m \times n$ matrix whose elements lie in an arbitrary field $\mathbb{F}$, and let $G$ be the bipartite graph with vertex set $\{r_1,\ldots,r_m\} \cup \{c_1,\ldots,c_n\}$ such that vertices $r_i$ and $c_j$ are adjacent if and only if $a_{ij} \neq 0$. We introduce an algorithm that finds an $m \times n$ matrix $U$ in row echelon form and a permutation matrix $Q$ of order $n$, such that $UQ^{-1}$ is row equivalent to $A$. If a tree decomposition of $G$ of width $k$ and size $N$ is part of the input, then $Q$ and the columns of $U$ that contain a pivot can be computed in time $O(k^2(N+m+n))$. Among other things, this allows us to compute the rank of $A$, and the determinant of $A$ when $m = n$, in time $O(k^2(N+m+n))$. It also allows us to decide in time $O(k^2(N+m+n))$ whether the linear system $Ax = b$ has a solution and to compute a solution of the linear system in case it exists.
Keywords and phrases: Gaussian elimination, FPT algorithms, treewidth
Funding: Carlos Hoppen: C. Hoppen acknowledges the support of CNPq 315132/2021-3.
1 Introduction
Reducing a matrix to a desired form or decomposing a matrix into a product of matrices with a given structure are basic operations in linear algebra. Problems involving triangular matrices and matrices in row echelon form are often easy to solve, and a large number of tasks may be accomplished by first reducing a general matrix to a matrix of this form. To be precise, a matrix $U$ is in row echelon form if all rows consisting entirely of zeros are at the bottom, and the following holds for the remaining rows. If $u_i$ is a nonzero row, its pivot is the element $u_{ij} \neq 0$ with least $j$. If rows $i_1 < i_2$ have pivots in columns $j_1$ and $j_2$, respectively, then $j_1 < j_2$.
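This definition is easy to test directly; the following minimal Python sketch (integer entries stand in for field elements, and the function name is ours) checks both conditions for a matrix given as a list of rows.

```python
from typing import List

def is_row_echelon(M: List[List[int]]) -> bool:
    """Check the two conditions above: zero rows sit at the bottom, and
    pivot columns strictly increase from one nonzero row to the next."""
    last_pivot_col = -1
    seen_zero_row = False
    for row in M:
        nonzero = [j for j, entry in enumerate(row) if entry != 0]
        if not nonzero:
            seen_zero_row = True
            continue
        if seen_zero_row:                 # nonzero row below a zero row
            return False
        if nonzero[0] <= last_pivot_col:  # pivot column must move right
            return False
        last_pivot_col = nonzero[0]
    return True

# Example: pivots in columns 0, 1, 3; one zero row at the bottom.
assert is_row_echelon([[2, 1, 0, 5], [0, 3, 0, 1], [0, 0, 0, 7], [0, 0, 0, 0]])
assert not is_row_echelon([[0, 1], [1, 0]])
```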
A basic result in linear algebra states that every matrix can be transformed into a matrix in row echelon form by a sequence of three types of row operations, known as elementary row operations. In this paper, we shall refer to two of these types: a type I elementary row operation on a matrix involves interchanging two rows. A type II elementary row operation involves adding a multiple of a row to another row. We say that $A$ and $B$ are row equivalent if $A$ can be transformed into $B$ by a sequence of elementary row operations. The process of reducing a matrix to row echelon form by elementary row operations is known as Gaussian elimination. The standard school algorithm reduces an arbitrary $n \times n$ input matrix with entries in a field to row echelon form in $O(n^3)$ field operations. However, it can be faster if more is known about the structure of the input matrix. For instance, if the matrix is sparse, its structure can be exploited if we find a pivoting scheme (also known as an elimination ordering) that minimizes the fill-in, defined as the set of matrix positions that were initially 0, but became nonzero at some point during the computation. It is worth noting that sparse matrices often occur for some structural reason, such as small treewidth. Gaussian elimination cannot be used efficiently on random sparse matrices with any elimination order, as we would expect such matrices to quickly get dense as the algorithm progresses. We should mention that other strategies can be used to exploit the sparsity of the input matrix; see for instance the work of Peng and Vempala [12].
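For reference, here is a minimal sketch of the school algorithm, using Python's Fraction type as a stand-in for exact arithmetic over an arbitrary field; it performs only type I and type II operations.

```python
from fractions import Fraction

def row_echelon(M):
    """Textbook O(n^3) Gaussian elimination.  Sweeps the columns from left
    to right, swaps a nonzero pivot into place (type I), and clears the
    entries below it (type II)."""
    A = [[Fraction(x) for x in row] for row in M]
    m, n = len(A), len(A[0])
    r = 0                                  # index of the next pivot row
    for j in range(n):
        pivot = next((i for i in range(r, m) if A[i][j] != 0), None)
        if pivot is None:
            continue                       # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]    # type I operation
        for i in range(r + 1, m):          # type II operations
            factor = A[i][j] / A[r][j]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return A

print(row_echelon([[0, 2, 1], [1, 1, 0], [2, 2, 0]]))
# -> [[1, 1, 0], [0, 2, 1], [0, 0, 0]] (as Fractions)
```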
Much of the early focus has been on the question of characterizing the graphs corresponding to matrices with a perfect elimination order, meaning that there is a pivot strategy for Gaussian elimination without any fill-in. Parter [11] has assigned a graph to a matrix $A$ with an edge $ij$ whenever $a_{ij}$ or $a_{ji}$ is non-zero. He has then shown that there is a perfect elimination order if the graph is a tree. As became customary afterwards, he chooses all pivots in the diagonal and makes the strong assumption that all diagonal elements of the matrix are non-zero and remain so during the computation, usually referred to as not having any accidental cancellation.
Obviously, this assumption simplifies Gaussian elimination enormously. It is justified because it is true for symmetric positive definite matrices, which are the most important special class of matrices for systems of linear equations. However, one would like to see more widely applicable methods.
Rose [16] shows that Parter's result can be extended to all triangulated graphs (also known as chordal graphs). Rose, Tarjan, and Lueker [18] show that a graph has a perfect elimination order if and only if the graph is triangulated. In this case, they find a perfect elimination order in linear time. Rose and Tarjan [17] efficiently find a perfect elimination order in a directed graph if one exists, and show that finding a minimum elimination ordering (minimizing the fill-in) is NP-hard. A directed graph describes the non-zero off-diagonal entries in an $n \times n$ matrix, where Gaussian elimination is done by choosing pivots in the diagonal under the assumption that there will never be a 0 in the diagonal.
The perfect elimination question is more complicated for the case of asymmetric matrices, where any nonzero pivot is allowed. Any $m \times n$ matrix $A$ may be naturally associated with a bipartite graph $G$ with vertex set $\{r_1,\ldots,r_m\} \cup \{c_1,\ldots,c_n\}$ such that vertices $r_i$ and $c_j$ are adjacent if and only if $a_{ij} \neq 0$. We say that $G$ is the underlying bipartite graph of $A$. This allows us to employ structural decompositions of graph theory to deal with the nonzero entries of $A$ in an efficient way. Different from the symmetric case with pivots restricted to the diagonal, there is no nice characterization known for the bipartite graphs with a perfect elimination order when arbitrary pivots are allowed. Nevertheless, a very large class is given by the chordal bipartite graphs defined by Golumbic and Goss [7]. These graphs are not chordal, as obviously only forests are bipartite and chordal. Still, chordal bipartite graphs are characterized similarly to chordal graphs. For chordal graphs, the defining property is that every cycle of length at least 4 has a chord, while chordal bipartite graphs are defined by the requirement that every cycle of length at least 6 has a chord. Golumbic and Goss [7] have shown that chordal bipartite graphs have a perfect elimination order.
They are not the only bipartite graphs with a perfect elimination order, but they are the only ones where one can always pick any simplicial edge (definition follows) as a pivot [7]. An edge $uv$ is simplicial if the union of the neighborhoods of $u$ and $v$ induces a complete bipartite graph. It is easy to see that a bipartite graph $G$ has a perfect elimination order if there exists an ordering $e_1, \ldots, e_p$ of a matching in $G$ such that $e_i$ is simplicial in the subgraph of $G$ obtained by deleting the endpoints of $e_1, \ldots, e_{i-1}$, for $1 \leq i \leq p$.
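The simplicial condition can be checked directly from adjacency sets; the sketch below (names ours) does this for a bipartite graph given as a dictionary from vertices to neighbor sets.

```python
def is_simplicial(adj, u, v):
    """Check whether the edge uv of a bipartite graph is simplicial: the
    union of the neighborhoods N(u) and N(v) must induce a complete
    bipartite graph, i.e., every neighbor of v (a row vertex) must be
    adjacent to every neighbor of u (a column vertex)."""
    assert v in adj[u]
    return all(b in adj[a] for a in adj[v] for b in adj[u])

# In the 4-cycle r1-c1-r2-c2-r1 (which is complete bipartite),
# every edge is simplicial.
adj = {"r1": {"c1", "c2"}, "r2": {"c1", "c2"},
       "c1": {"r1", "r2"}, "c2": {"r1", "r2"}}
assert is_simplicial(adj, "r1", "c1")
```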
Radhakrishnan, Hunt, and Stearns [13] do Gaussian elimination for symmetric matrices with a given tree decomposition of width $k$ in time $O(k^2 n)$. As has been common, they implicitly assume that the diagonal is non-zero and there is no accidental cancellation. Naturally, given today's familiarity with algorithms based on tree decompositions, it would now be an easy exercise to design such an algorithm when pivots can always be selected in the diagonal.
Only in recent years has there finally been some progress towards the difficult problem of efficient Gaussian elimination for matrices of small treewidth without the simplifying assumption that pivots can always be chosen in the diagonal. Fomin et al. [5] have shown a Gaussian elimination algorithm running in time $O(t^2 n)$, where $t$ is the lesser known tree-partition width parameter or the pathwidth. The tree-partition width of a graph can be arbitrarily large already for graphs of constant treewidth, unless the maximum degree is bounded [19]. However, Fomin et al. [5] still obtained an algorithm for treewidth $k$, with a higher dependence on $k$. We should also refer to the work of Dong, Lee, and Ye [4], whose main result implies a nearly-linear time algorithm for the solution of a linear system $Ax = b$ such that the matrix $A$ is full-rank (actually, their result is about the solution of a linear program).
Naturally, the longstanding main goal is an $O(k^2 n)$-time Gaussian elimination algorithm for treewidth $k$ without restrictive assumptions. This would provide a smooth transition to the school algorithm when $k$ approaches $n$. It would also achieve the same running time bound for the general case as for the easy case of a non-zero diagonal and no accidental cancellation. It actually seems that the goal had not been explicitly formulated until recently [5], maybe because it had not seemed achievable. However, once this goal had been obtained by simple algorithms under very strong assumptions, the question whether these assumptions (that convenient entries are nonzero and thus can be used as pivots) are necessary must have been obvious.
We make substantial progress in this direction by devising the algorithm Fast Gaussian Elimination. Given an $m \times n$ matrix $A$ over a field $\mathbb{F}$ with underlying bipartite graph $G$, and given a tree decomposition of $G$ of width $k$ and size $N$, it produces a matrix $U$ in row echelon form that is row equivalent to $AQ$ for a permutation matrix $Q$. The entries of the columns of $U$ that contain a pivot (and therefore generate a space of dimension $\operatorname{rank}(A)$) are fully computed in time $O(k^2(N+m+n))$. The entries of the remaining columns of $U$ are described implicitly as linear combinations of at most $k+1$ other columns, and may be computed explicitly in time $O(kmn)$. In particular, this algorithm allows us to compute the rank and the determinant of $A$ in time $O(k^2(N+m+n))$. It also allows us to decide in time $O(k^2(N+m+n))$ whether the linear system $Ax = b$ has a solution, and to compute a solution of the linear system in case it exists. Our main result is stated more formally below.
Theorem 1.
Let $A$ be an $m \times n$ matrix over a field $\mathbb{F}$, let $G$ be its underlying bipartite graph, and let $\mathcal{T}$ be a tree decomposition of $G$ of width $k$ and size $N$. Algorithm Fast Gaussian Elimination produces an $m \times n$ matrix $U$ in row echelon form, and a permutation matrix $Q$ of order $n$ with the property that $UQ^{-1}$ is row equivalent to $A$. The permutation matrix $Q$ and the columns of $U$ that contain a pivot may be computed with $O(k^2(N+m+n))$ field operations. The full matrix $U$ may be computed with $O(kmn)$ additional field operations.
We actually also compute a permutation matrix $P$ of order $m$ such that $U$ is the row echelon form that would be obtained from $PAQ$ by the most standard algorithm (adding multiples of earlier rows to later rows).
Our algorithm is not based on the algorithm of Fomin et al. [5]; it uses standard tree decompositions in a natural way, and does not rely on the somewhat obscure notion of tree-partition width. Even though our algorithm removes the decades-old severe restriction to pivoting on the main diagonal, the main idea of our Gaussian elimination algorithm is quite simple. The tree decomposition provides an elimination order for rows (equations) and columns (variables). This order sequentially hands out permissions to rows and columns to be eliminated. However, when a row receives a green light to be processed, it is only processed immediately if it contains a variable that has already received a green light. Otherwise, this row is put into a buffer and delayed until one of its variables also has a green light. Then the corresponding entry is chosen as a pivot. Likewise, when a variable receives a green light then, if needed, its column is put into a buffer until some row containing this variable has a green light.
As was the case in Fomin et al. [5], we use a nice tree decomposition of the underlying bipartite graph, as defined by Kloks [9]. However, to deal with 0's at intended pivot locations, we create temporary buffers for rows and columns that allow us to delay the selection of pivots as discussed above. As is typical for algorithms based on tree decompositions, row operations are actually performed on small submatrices of the input matrix. For simplicity, the algorithm does not manipulate the large given matrix. Instead, these small submatrices, called boxes, are maintained separately. The small submatrices in this paper include additional information to keep track of the rows and columns that lie in the temporary buffer, which allows us to make headway towards the output matrix in row echelon form even when accidental cancellations occur. It is crucial to perform so-called bookkeeping operations that ensure that the temporary buffers remain small during the entire execution of the algorithm.
The paper is organized as follows. The algorithm is described in Section 2, where we also discuss its correctness. The running time is analyzed in Section 3. We should mention that, to achieve this running time, we need to keep track of the field operations in a global way, rather than adding the contributions of the worst-case scenarios at each step. Together, these two sections establish Theorem 1. We provide several consequences of our result in Section 4.
2 The Algorithm
We now describe the algorithm Fast Gaussian Elimination. We start with the definition of a tree decomposition. Let $G = (V, E)$ be a graph with vertex set $V$ and edge set $E$. A tree decomposition of $G$ is a tree $T$ with node set $X$, where each node $x \in X$ is associated with a bag $V_x \subseteq V$, satisfying the following properties: (1) $\bigcup_{x \in X} V_x = V$; (2) for every edge $uv \in E$, there exists $x \in X$ such that $V_x$ contains $u$ and $v$; (3) for any $v \in V$, the subgraph of $T$ induced by the nodes whose bags contain $v$ is connected. The width of the tree decomposition is defined as $\max_{x \in X} |V_x| - 1$, and the treewidth of the graph $G$ is the smallest $k$ such that $G$ has a tree decomposition of width $k$. Tree decompositions were popularized by the seminal work of Robertson and Seymour [14, 15], but other definitions that are similar or even equivalent have appeared in earlier work, see [1, 8]. As mentioned in the introduction, we use the concept of nice tree decomposition introduced by Kloks [9], which is a rooted tree decomposition of a graph such that all nodes are of one of the following types:
- (a) (Leaf) The node $x$ is a leaf of $T$;
- (b) (Introduce) The node $x$ introduces vertex $v$, that is, it has a single child $y$, and $V_x = V_y \cup \{v\}$ for some $v \notin V_y$;
- (c) (Forget) The node $x$ forgets vertex $v$, that is, $x$ has a single child $y$, and $V_x = V_y \setminus \{v\}$ for some $v \in V_y$;
- (d) (Join) The node $x$ is a join, that is, it has two children $y_1$ and $y_2$, and $V_x = V_{y_1} = V_{y_2}$.
We further assume that the bag associated with the root is empty. This is easy to accomplish, as any tree decomposition that satisfies (a)-(d), but does not satisfy this additional requirement, may be turned into a nice tree decomposition with empty root bag by appending a path with at most $k+1$ nodes to the root of $T$, where all new nodes have type forget. Having a root with an empty bag ensures that each vertex of $G$ is associated with exactly one forget node that forgets it. Moreover, it ensures that, for any $v \in V$, the node of $T$ that is closest to the root among all $x$ such that $v \in V_x$ is the child of the node that forgets $v$. Despite the additional structure, a nice tree decomposition may be efficiently derived from an arbitrary tree decomposition. More precisely, Kloks [9, Lemma 13.1.2] has shown that, given an arbitrary tree decomposition of a graph of order $n'$ with width $k$ and $N$ nodes, it is possible to turn it into a nice tree decomposition with $O(N + n')$ nodes and width at most $k$ in time $O(k(N + n'))$. Regarding the computation of a tree decomposition, Bodlaender's theorem [2] provides an optimal tree decomposition in linear time when the treewidth is a constant. Bodlaender et al. [3] and Korhonen [10] compute faster constant factor approximations in linear time. Thus, for bounded $k$, there is no need to assume a tree decomposition is given, since both the tree decomposition and the Gaussian elimination are then computed in linear time. However, the constants hidden in the $O$-notation grow quickly with $k$.
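To make the four node types concrete, here is a small Python sketch of one possible representation (the class and its fields are ours, not part of the paper), together with the empty-root normalization just described.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Node:
    """One node of a nice tree decomposition; `kind` is one of
    'leaf', 'introduce', 'forget', 'join'."""
    bag: Set[str]
    kind: str
    children: List["Node"] = field(default_factory=list)
    vertex: Optional[str] = None   # introduced/forgotten vertex, if any

def pad_root(root: Node) -> Node:
    """Append a path of forget nodes above the root until its bag is
    empty, so that every vertex is forgotten exactly once."""
    while root.bag:
        v = next(iter(root.bag))
        root = Node(bag=root.bag - {v}, kind="forget",
                    children=[root], vertex=v)
    return root
```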
We are now ready to describe the algorithm. Let $A$ be an $m \times n$ matrix with entries in a field $\mathbb{F}$ and let $G$ be the underlying bipartite graph with vertex set $\{r_1,\ldots,r_m\} \cup \{c_1,\ldots,c_n\}$ associated with it. We wish to find an $m \times m$ permutation matrix $P$, an $n \times n$ permutation matrix $Q$ and an $m \times n$ matrix $U$ in row echelon form such that $UQ^{-1}$ is row equivalent to $A$ in a direct way. We shall operate on a nice tree decomposition $T$ of $G$ with node set $X$ and width $k$ rather than on the matrix itself. The algorithm Fast Gaussian Elimination works bottom-up on the rooted tree $T$, that is, it only processes a node after its children have been processed. Each node $x$ except the root produces a data structure known as a box and transmits it to its parent. A box is a matrix with at most $2k+2$ rows and at most $2k+2$ columns. It may be recorded as a triple of matrices $(N_x, R_x, C_x)$ whose rows and columns are labeled by distinct elements of $\{r_1,\ldots,r_m\}$ and $\{c_1,\ldots,c_n\}$, respectively, associating rows and columns of the boxes with rows and columns of the original matrix. We may view it as
$$B_x = \begin{pmatrix} 0 & R_x \\ C_x & N_x \end{pmatrix} \qquad (1)$$
where $N_x$ is an $r_x \times c_x$ matrix whose rows and columns are labeled by the elements in $V_x$, the bag associated with node $x$. Here $r_x$ is the number of row vertices (i.e., vertices $r_i$) in $V_x$ and $c_x$ is the number of column vertices (i.e., vertices $c_j$) in $V_x$. The matrices $R_x$ and $C_x$ have dimensions $\rho_x \times c_x$ and $r_x \times \gamma_x$, respectively, where $\rho_x \leq c_x + k + 1$ and $\gamma_x \leq r_x + k + 1$. Observe that some of these matrices can be degenerate, in the sense that $r_x = 0$, $c_x = 0$, $\rho_x = 0$, or $\gamma_x = 0$. In a case where $r_x = 0$, $c_x = 2$, $\rho_x = 3$, and $\gamma_x = 0$, for instance, the box is a $3 \times 2$ matrix, $N_x$ is a matrix with no rows and two empty columns, and both $C_x$ and the zero block in (1) are empty matrices. We should note that a similar data structure, also called a box, has been exploited in the context of computing a diagonal matrix that is congruent to a given symmetric matrix using a tree decomposition of its underlying graph [6].
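A direct representation of this data structure might look as follows; this is a sketch, with Python Fractions standing in for field elements and all attribute names ours.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List

@dataclass
class Box:
    """Box transmitted from a node to its parent, following (1).
    N: net changes for regular rows/columns (labels from the bag V_x);
    R: exact entries of buffer rows in the regular columns;
    C: exact entries of regular rows in the buffer columns.
    Entries lying in a buffer row *and* a buffer column are 0 by the
    invariant behind (1), so they are not stored."""
    row_labels: List[int]        # regular row labels (canonical order)
    col_labels: List[int]        # regular column labels (canonical order)
    buf_row_labels: List[int]    # labels of buffer rows
    buf_col_labels: List[int]    # labels of buffer columns
    N: List[List[Fraction]]      # r_x   x c_x
    R: List[List[Fraction]]      # rho_x x c_x
    C: List[List[Fraction]]      # r_x   x gamma_x
```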
It will also be useful to assign a canonical ordering to the rows and columns of the input matrix. We assume that they are ranked according to the order in which their forget nodes appear in some topological ordering of $T$ (e.g., obtained by a post-order traversal of $T$), that is, $r_1$ is the first vertex that is forgotten among all vertices that represent rows, $r_2$ is the second, and so on. The same applies to $c_1, \ldots, c_n$.
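Assuming the Node representation sketched above, the canonical ranks could be obtained from a post-order traversal as follows.

```python
def forget_order(root: "Node") -> dict:
    """Rank vertices by the order in which their forget nodes appear in a
    post-order traversal of the decomposition (children before parents)."""
    order = []
    def visit(node):
        for child in node.children:
            visit(child)
        if node.kind == "forget":
            order.append(node.vertex)
    visit(root)
    return {v: rank + 1 for rank, v in enumerate(order)}
```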
We start with an intuition about the meaning of the boxes and about how the algorithm works. Recall that the nodes of the nice tree decomposition are ordered bottom-up as $x_1, \ldots, x_s$, where $x_s$ is the root and every node appears after its children. At each node $x_i$, the algorithm initializes a box (if $x_i$ is a leaf) or produces a new box based on its bag and on the boxes transmitted by its children. While producing the box $B_{x_i}$, the algorithm may access entries $a_{uv}$ of the input matrix such that both $r_u$ and $c_v$ are associated with vertices in $V_{x_i}$. The box is then transmitted to its parent, except if $x_i$ is the root of $T$. We call this processing node $x_i$.
It is also useful to keep in mind that the algorithm performs two types of elementary row operations, which we call authentic operations and bookkeeping operations. The former are operations that can be performed in the full matrix in order to reach row echelon form. The latter are operations that are only performed on the boxes as a way to limit their sizes, but which do not have full matrix counterparts. The effect of processing the nodes may be viewed as a sequence of matrices

$$PAQ = A_0, A_1, \ldots, A_s, \qquad (2)$$

where $A_s$ is in row echelon form and $A_i$ is obtained from $A_{i-1}$ by performing the authentic elementary row operations that are performed while node $x_i$ is processed. We should mention that this sequence of row equivalent matrices is a simplification of what actually happens, but that it could be achieved if an oracle gave us the proper ordering of rows and columns of $A$. In reality, the proper orderings are also computed by the algorithm while the nodes are processed, leading to the permutation matrices $P$ and $Q$ mentioned in the statement of Theorem 1.
To achieve the complexity stated in Theorem 1, row operations cannot actually be performed on the full matrices, but are instead performed only in the (smaller) boxes. It is useful to think that any row or column of the input matrix $A$, say row $i$, is initially untouched and changes to a regular row when the vertex $r_i$ associated with it appears in the bag of a node of the branch of the tree decomposition being considered (we use "branch" to refer to the subtree induced by all descendants of a node). When $r_i$ is removed from the bag (at the node that forgets it in the nice tree decomposition), either row $i$ is assigned a pivot in some column $j$, in which case row $i$ is classified as processed (column $j$ is also classified as processed), or row $i$ becomes a buffer row, meaning that it has been given the green light for a pivot, but that the decision of assigning a pivot has been deferred to a later step because no suitable column has received a green light. In the process of keeping the size of the box bounded, the algorithm may decide that it is impossible or unnecessary to assign a pivot to a buffer row or to a buffer column, in which case the status of the row or column also changes to processed. We observe that the classification of each row and column is defined independently for disjoint branches of the tree decomposition, and an important feature of the algorithm is that these independent classifications can be made compatible when two branches are merged (which happens at join nodes).
This finally gets us to the meaning of a box at node $x$. The rows and columns of the matrix $N_x$ are precisely the regular rows and columns of the branch of $T$ rooted at node $x$. In other words, these are the rows and columns corresponding to the vertices in the bag $V_x$. Moreover, each entry of $N_x$ records the net change to the original entry of $A$ due to row operations performed while processing nodes in this branch. The matrix $R_x$ records the entries of the matrix $A_i$ defined in (2) in the case where the row is a buffer row (with respect to this branch) and the column is a regular column. Analogously, $C_x$ records the entries of $A_i$ in the case where the row is a regular row and the column is a buffer column. This reveals another important feature of the algorithm: the entries of the full matrix associated with regular rows and regular columns can easily be modified in parallel by computations in disjoint branches, as the algorithm does not capture their exact values while processing a single branch. The boxes just store the net changes to the values due to operations in this branch. However, this does not happen for buffer rows and buffer columns, for which exact values are recorded in the box. The top left corner of the box in (1) shows that, if the row is a buffer row and the column is a buffer column, the corresponding entry is equal to zero. This is consistent with the idea that the algorithm selects a new pivot whenever there are a buffer row and a buffer column for which the corresponding element is nonzero, as both the row and the column had already been given the green light as possible rows and columns for a pivot. So, if row $i$ and column $j$ are in the buffer at the end of step $\ell$, then the entry of $A_\ell$ in row $i$ and column $j$ must be 0.
While producing the boxes, the algorithm may select pivots for the output matrix $U$, it may identify zero rows that cannot contain a pivot, and it may identify columns that are linear combinations of other columns (actually, they are not necessarily linear combinations of the other columns with respect to the entire matrix, but they are linear combinations with respect to a submatrix as depicted in Figure 1), and therefore cannot contain pivots together with the other columns. The status of these rows and columns changes to processed, which means that information about them is recorded by global variables, while they are removed from the box.
After this description, we come back to a matrix $A_\ell$ in (2), depicted in Figure 1 (the authors thank Elizandro Max Borba for the figure). The top left entries lie in processed rows and processed columns that contain pivots. These entries are not modified later in the algorithm. The bottom rows are (processed) zero rows that have been produced while limiting the size of the set of buffer rows (it is indeed impossible to find a pivot in such a row), while the rightmost columns are (processed) columns that have been identified as linear combinations of other columns (except for possible entries in rows whose pivot had already been computed when this column was processed). As a consequence, it is unnecessary to assign a pivot to them. Moreover, while the actual values in the rightmost columns may be modified in later steps of the algorithm, we need not keep track of them, as each such column may be computed as a linear combination of (a known set of) columns with smaller index at the end of the algorithm. The central submatrix represents entries in the buffer, and in regular or untouched rows and columns. Some of them lie in the box where computations are actually performed in a given step.
Before giving a formal description of every step of the algorithm, we briefly describe what is done at a node of each type. A node of type leaf merely initializes a box whose rows and columns are all regular, while a node of type introduce simply adds a new regular row or column to the box transmitted by its child. Nodes of type join combine the information that is passed on by two different branches of the decomposition, which have been processed independently of each other. Let $B_{y_1}$ and $B_{y_2}$ denote the boxes produced by the children of $x$. A new matrix $R$ is produced by stacking the matrix $R_{y_1}$ on top of $R_{y_2}$, and a new matrix $C$ is produced by juxtaposing $C_{y_1}$ to the left of $C_{y_2}$. Such matrices $R$ and $C$ may become too large, so that row operations are performed to remove some rows and/or columns from the buffer. Producing $B_x$ also requires merging $N_{y_1}$ and $N_{y_2}$, as the regular rows and columns are the same. Being able to easily merge the matrices transmitted by the children is the reason why the rows and columns are considered with the canonical ordering mentioned above.
Nodes of type forget concentrate most of the action. Recall that each row of $A$ is associated with a vertex $r_i$ of $G$, which in turn is associated with the unique forget node of $T$ that forgets $r_i$. Analogously, each column of $A$ is associated with a vertex $c_j$ of $G$ and the unique forget node of $T$ that forgets $c_j$. Assume that the algorithm is processing a forget node $x$, which means that it received a box $B_y$ from its child $y$. Further assume that node $x$ forgets a vertex $r_u$ corresponding to row $u$. This means that the algorithm singles out this row as a candidate for having a pivot. In the box $B_y$, it checks whether there is a column that has already been singled out for a possible pivot, but which has no pivot so far. These are precisely the columns of $C_y$. If such a column exists, it checks whether the corresponding entry of row $u$ is nonzero. If this is also satisfied, the algorithm performs operations to create a pivot in row $u$ and this column, and they are both removed from the box. If one of the conditions fails, row $u$ is removed from $N_y$ and added to the buffer matrix, which contains rows singled out for a pivot, but which have no pivot so far. Furthermore, if the condition on the maximum number of rows of $R_x$ fails after the addition of this new row, the algorithm acts to remove some rows from the buffer.
Analogously, if node $x$ forgets a vertex $c_v$ corresponding to column $v$, it first checks whether there are rows that have been singled out for pivots, but have not been used, namely the rows of $R_y$. It then looks for one such row for which the entry in column $v$ is nonzero. If such a row exists, the algorithm performs operations to create a pivot in this row and column $v$, and they are both removed from the box. If no such row exists, column $v$ is removed from $N_y$ and added to the buffer matrix, which contains columns singled out for a pivot, but which have no pivot so far. If the condition on the maximum number of columns of $C_x$ fails after the addition of this new column, the algorithm acts to remove some columns from the buffer.
A high level description of the algorithm may be found in Figure 2. Note that, in this description, we have five global variables: the number $p$ of pivots already computed, two vectors $v_r$ and $\pi_r$ of length $m$, and two vectors $v_c$ and $\pi_c$ of length $n$. Initially, these are zero vectors. At the end of the algorithm, the vector $v_r$ allows us to reconstruct the rows of the output matrix $U$, while the vectors $\pi_r$ and $\pi_c$ encode the permutations that generate $P$ and $Q$, respectively. As we discussed, the algorithm may also find columns of $A$ that are linear combinations of other columns and remove them from boxes. When this happens, the linear combination that produces such a column is recorded in the vector $v_c$, so that the values of the entries of $U$ in these columns can also be recovered at the end of the algorithm.
In the remainder of this section, we shall describe each operation in detail. To simplify our discussion, for a node $x$, we say that buffer rows and buffer columns have type 1, while regular rows and regular columns have type 2.
Leaf Box
When a node $x$ is a leaf corresponding to a bag $V_x$, then we apply procedure Leaf Box. This procedure simply initializes a box where $\rho_x = \gamma_x = 0$, so that $R_x$ and $C_x$ are empty, and $N_x$ is a zero matrix of dimension $r_x \times c_x$ whose rows and columns are labeled by the elements of $V_x$, which contains $r_x$ row vertices and $c_x$ column vertices. No elementary row operations are performed.
Introduce Box
When a node $x$ introduces a vertex $w$, its bag satisfies $V_x = V_y \cup \{w\}$, where $V_y$ (which does not contain $w$) is the bag of its child $y$. The input of Introduce Box is the vertex $w$ that has been introduced and the box $B_y$ transmitted by its child. The box $B_x$ is obtained by the insertion of a new type 2 row or column labeled $w$ whose elements are all zero (the added row or column may actually be empty if a new row vertex is introduced, but $c_x = 0$ and $\gamma_x = 0$, or a new column vertex is introduced, but $r_x = 0$ and $\rho_x = 0$), taking care to insert it in the correct position (with respect to the canonical order of rows and columns of the box). More precisely, if $w$ is a row vertex $r_u$, this means that $N_x$ and $C_x$ are obtained from $N_y$ and $C_y$, respectively, by the addition of a zero row, while $R_x = R_y$. It is clear that $r_x = r_y + 1$. Similarly, if $w$ is a column vertex $c_v$, the addition of this column means that $N_x$ and $R_x$ are obtained from $N_y$ and $R_y$, respectively, by the addition of a zero column, while $C_x = C_y$. It is clear that $c_x = c_y + 1$. No elementary row operations are performed.
Join Box
Let $x$ be a node of type join and let $B_{y_1}$ and $B_{y_2}$ be the boxes transmitted by its children $y_1$ and $y_2$, where $V_x = V_{y_1} = V_{y_2}$. By the definition of box, $B_{y_1}$ and $B_{y_2}$ have the same rows and columns of type 2, that is, $N_{y_1}$ and $N_{y_2}$ have the same dimension and have rows and columns labeled by the same elements (in the same order). Moreover, because vertices of type 1 have been forgotten in their respective branches, the labels of the rows of $R_{y_1}$ and $R_{y_2}$ are all different, as are the labels of the columns of $C_{y_1}$ and $C_{y_2}$. The Join Box operation first creates a matrix whose rows and columns are labeled by the union of the labels of $B_{y_1}$ and $B_{y_2}$ with the structure below.
$$\begin{pmatrix} 0 & 0 & R_{y_1} \\ 0 & 0 & R_{y_2} \\ C_{y_1} & C_{y_2} & N \end{pmatrix} \qquad (3)$$
Here $N = N_{y_1} + N_{y_2}$ has dimension $r_x \times c_x$ and records the net change of the entries of the original matrix because of row operations in the two branches that are being merged at this step. Let $R$ be the matrix obtained by stacking $R_{y_1}$ on top of $R_{y_2}$ (alternatively, we could merge the two matrices $R_{y_1}$ and $R_{y_2}$ in a way that preserves the canonical ordering of rows; however, this does not affect the final result). Let $C$ be the matrix obtained by placing $C_{y_1}$ to the left of $C_{y_2}$.
If $R$ has at most $c_x + k + 1$ rows, define $R_x = R$. Otherwise $R$ has too many rows. Apply Gaussian elimination to $R$ until we get a matrix in row echelon form. Here, we can use a standard algorithm for dense matrices that takes $O(k^3)$ operations, as $R$ has $c_x \leq k + 1$ columns and at most $2(c_x + k + 1)$ rows. Since the row echelon form has at most $c_x$ nonzero rows, this process ends with at least $k + 2$ zero rows. The matrix $R_x$ is obtained from the resulting matrix by removing all the zero rows, and by reordering the remaining rows so that they respect the original ordering of $R$. Assume that the zero rows are labeled by the vertices $r_{u_1}, \ldots, r_{u_t}$. The algorithm updates the permutation vector $\pi_r$ by assigning the largest values that have not been assigned so far to the components associated with $r_{u_1}, \ldots, r_{u_t}$ (which currently have value 0), which means that the zero rows of $R$ become the topmost of the zero rows in the description of Figure 1. Given that the new box is defined based on the matrix $R$ that has been modified, the type II row operations used for Gaussian elimination are authentic row operations. Given that the order of the rows has been preserved, type I row operations are bookkeeping operations (we observe that the labels of the rows used for authentic row operations will always refer to the original input matrix to account for the fact that row interchanging is not actually performed; the algorithm could be easily adapted to a version where row interchanges are also authentic row operations, but this makes the analysis slightly more involved).
Similarly, if $C$ has at most $r_x + k + 1$ columns, define $C_x = C$. Otherwise, make an auxiliary copy of $C$ and apply Gaussian elimination on it until we get a matrix $C'$ in row echelon form. As above, we can use a standard algorithm for dense matrices that takes $O(k^3)$ operations. There is a set $S$ of at most $r_x \leq k + 1$ columns that form a basis of the column space of $C$ (precisely the columns of $C$ whose counterparts contain the pivots of $C'$). The matrix $C_x$ is obtained from $C$ by removing the columns that are not in $S$, of which there are at least $k + 2$.
Remark 2.
We emphasize that $C_x$ is defined in terms of $C$ rather than $C'$ in this case, which means that the row operations performed to obtain $C'$ are not performed in the full matrix while producing the sequence (2), but are instead auxiliary operations to find a basis of the column space of $C$ and to write the remaining columns as linear combinations of the columns in this basis. This is crucial. If we were to consider the effect of these row operations in the full matrix, then, while the nonzero entries of the buffer rows in the full matrix lie entirely within the box (except perhaps for entries in columns that are linear combinations of other columns), there may be nonzero entries in the rows of $C$ that lie outside the box $B_x$. Indeed, the rows of $C$ are regular rows, so that they may even be processed in parallel by different branches of the tree decomposition. Adding multiples of them to other rows might affect entries that are not being monitored at this step. This is why the operations performed to turn a copy of $C$ into row echelon form must be bookkeeping operations.
Coming back to the description of the algorithm, assume that $c_{v_1}, \ldots, c_{v_t}$ are the labels of the columns of $C$ with no pivot in $C'$, so that they are linear combinations of the remaining columns of $C$. The algorithm updates the permutation vector $\pi_c$ by assigning the largest values that have not been assigned so far to the components associated with $c_{v_1}, \ldots, c_{v_t}$ (which currently have value 0), which may be seen as moving these columns to the right (see Figure 1). Moreover, the corresponding entries of the vector $v_c$ record information about the linear combinations. For each column $c_{v_j}$, this information consists of the indices of the columns that produce the linear combination, the coefficients of this linear combination, and the number of pivots at this step of the algorithm (as the linear combinations are with respect to the part of the column that does not include rows for which pivots have been assigned).
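The structural part of Join Box (before any buffer reduction) then amounts to the following sketch, reusing the hypothetical Box class from above; the Gaussian elimination on $R$ and the basis extraction on a copy of $C$ are elided.

```python
def join_boxes(b1: "Box", b2: "Box") -> "Box":
    """Combine the boxes of the two children of a join node: net changes
    in N add up, buffer rows are stacked, and buffer columns are
    juxtaposed.  The reduction step that shrinks oversized buffers is
    elided here."""
    assert b1.row_labels == b2.row_labels and b1.col_labels == b2.col_labels
    N = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(b1.N, b2.N)]
    R = b1.R + b2.R                              # stack buffer rows
    C = [r1 + r2 for r1, r2 in zip(b1.C, b2.C)]  # juxtapose buffer columns
    return Box(b1.row_labels, b1.col_labels,
               b1.buf_row_labels + b2.buf_row_labels,
               b1.buf_col_labels + b2.buf_col_labels, N, R, C)
```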
Forget Box
Assume that $x$ forgets vertex $w$ and let $y$ be its child, so that $V_x = V_y \setminus \{w\}$. This procedure uses $B_y$ to produce a new box $B_x$ where the row or column associated with $w$ gets a pivot (and is therefore removed from the box) or changes to type 1.
Assume first that $w = r_u$ for a row $u$. Consider the matrix given below (representing the row associated with $r_u$ as the last row is helpful for visualizing the operations, but this need not be part of an implementation):
$$\begin{pmatrix} 0 & R' \\ C' & N' \\ c & v \end{pmatrix} \qquad (4)$$
where $R' = R_y$, $c$ is the row of $r_u$ in $C_y$, $C'$ is obtained from $C_y$ by removing this row, $N'$ is obtained from $N_y$ by removing the row of $r_u$, and $v$ is obtained from the row of $r_u$ in $N_y$ by adding the entry $a_{uj}$ of the input matrix to the entry corresponding to each column $c_j \in V_y$.
If $c$ is empty or zero, the algorithm sets $N_x = N'$, and adds $v$ to $R'$ as the new bottom row to produce a matrix $R''$. If $R''$ has at most $c_x + k + 1$ rows, the algorithm calls this matrix $R_x$ and concludes the step. Intuitively, this means that row $u$ is a new buffer row. Otherwise $R''$ has $c_x + k + 2$ rows. The algorithm turns this matrix into a matrix in row echelon form and removes the zero rows to produce $R_x$, as was done in Join Box. As a byproduct of these authentic row operations, the algorithm updates $\pi_r$. Similarly, if $\gamma_y \leq r_x + k + 1$, we set $C_x = C'$. However, because $r_x = r_y - 1$ and the number of columns of $C'$ is $\gamma_y$, it may be that $\gamma_y = r_x + k + 2$. If this happens, as was the case for Join Box, the algorithm performs bookkeeping operations to define $C_x$, and update $\pi_c$ and $v_c$.
If $c$ is nonempty and nonzero, the algorithm selects the first nonzero entry of $c$; suppose that it lies in column $c_j$. It then performs row operations in (4) in order to eliminate all the other nonzero elements in the column of $c_j$ (note that these elements are necessarily in rows of type 2, so that only rows of $C'$ and $N'$ are modified). The algorithm defines $\pi_r(u) = p + 1$ and $\pi_c(j) = p + 1$, while $v_r(u)$ records information about the nonzero entries of the row of $r_u$, namely the labels of the nonzero columns of the vector $(c \mid v)$ and the actual values of the nonzero entries. The algorithm then adds one to the value of $p$. Matrix $N_x$ is assigned this modified $N'$, and matrix $R_x$ is defined as $R'$. Moreover, the column of $c_j$ is removed from the modified $C'$ to produce a matrix $C''$. The row operations performed up to this point are authentic. If the number of columns of $C''$ is at most $r_x + k + 1$ (note that $r_x = r_y - 1$ and that $C''$ has $\gamma_y - 1$ columns), $C_x$ is set to be $C''$. Otherwise, the algorithm performs bookkeeping operations as in Join Box to define $C_x$, and update $\pi_c$ and $v_c$.
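The heart of this case can be sketched as follows, again with the hypothetical Box class; label updates, the treatment of $R''$, and buffer reduction are elided.

```python
def forget_row_pivot(c, v, box: "Box"):
    """Forget node, row case (a sketch): `c` holds the exact entries of
    the forgotten row in the buffer columns, `v` its entries in the
    regular columns (original entries already added in).  If `c` has a
    nonzero entry, pivot there and clear the rest of that buffer column;
    otherwise the row joins the buffer."""
    pivot = next((j for j, entry in enumerate(c) if entry != 0), None)
    if pivot is None:
        box.R.append(list(v))      # the forgotten row becomes a buffer row
        return None
    # Only regular rows can be nonzero in this buffer column: buffer rows
    # have a 0 there by the invariant behind the box structure (1).
    for i in range(len(box.C)):
        factor = box.C[i][pivot] / c[pivot]
        if factor:
            box.C[i] = [a - factor * b for a, b in zip(box.C[i], c)]
            box.N[i] = [a - factor * b for a, b in zip(box.N[i], v)]
    return pivot                   # caller removes this column, records pivot
```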
Next assume that $w = c_v$ for a column $v$. Consider the matrix given by
$$\begin{pmatrix} 0 & c & R' \\ C' & v' & N' \end{pmatrix} \qquad (5)$$
where $C' = C_y$, $R'$ is obtained from $R_y$ by removing the column of $c_v$, $c$ is the column of $c_v$ in $R_y$, $N'$ is obtained from $N_y$ by removing the column of $c_v$, and $v'$ is obtained from the column of $c_v$ in $N_y$ by adding the entry $a_{iv}$ (of the input matrix) to the entry corresponding to each row $r_i \in V_y$.
If $c$ is empty or zero, the algorithm sets $N_x = N'$ and adds $v'$ as a new rightmost column of $C'$ to produce a matrix $C''$. If $C''$ has at most $r_x + k + 1$ columns, the algorithm calls this matrix $C_x$ and concludes the step. Otherwise $C''$ has $r_x + k + 2$ columns and the algorithm defines $C_x$ with bookkeeping operations as was done in Join Box (this leads to changes in $\pi_c$ and $v_c$). Similarly, if $\rho_y \leq c_x + k + 1$, we set $R_x = R'$. However, because $c_x = c_y - 1$ and the number of rows of $R'$ is $\rho_y$, it may be that $\rho_y = c_x + k + 2$. If this happens, the algorithm turns $R'$ into a matrix in row echelon form and removes the zero rows of the result to produce $R_x$, as was done above (this leads to changes in $\pi_r$). These operations are authentic.
If $c$ is nonempty and nonzero, the algorithm selects the first nonzero entry of $c$; suppose that it lies in row $r_u$ (of type 1). Let $\alpha$ be this nonzero entry and let $w'$ be the row of $r_u$ in (5). The algorithm then performs authentic row operations using row $w'$ to eliminate all the other nonzero elements in the column of $c_v$ (note that only entries of $R'$ and $N'$ are modified, as the entries of row $w'$ in the buffer columns are zero). It then removes the row of $r_u$ and the column of $c_v$ to produce a matrix $R''$ and the matrix $N_x$. As in the case of rows, the algorithm defines $\pi_r(u) = p + 1$ and $\pi_c(v) = p + 1$, while $v_r(u)$ records the nonzero entries of row $w'$. It then updates $p$ to $p + 1$. If the number of rows of $R''$ is at most $c_x + k + 1$, $R_x$ is set to be $R''$. Otherwise, the algorithm turns $R''$ into a matrix in row echelon form with authentic operations and removes the zero rows to produce $R_x$, as was done in Join Box (this leads to changes in $\pi_r$). Matrix $C_x$ is defined as $C'$.
3 Running time of the algorithm
In this section, we find an upper bound on the number of operations required to produce the matrices $U$, $P$ and $Q$ as in the statement of Theorem 1 using algorithm Fast Gaussian Elimination. If the algorithm is given a tree decomposition with $N$ nodes and width $k$ of the underlying graph of matrix $A$, it first computes a nice tree decomposition of the same width with $O(N + m + n)$ nodes in time $O(k(N + m + n))$ as discussed above. As it turns out, the number of nodes of each type in the nice tree decomposition is $O(N + m + n)$.
We consider the number of operations performed at each type of node. Leaf Box initializes a zero matrix in $O(k^2)$ trivial steps. Introduce Box uses $O(k)$ steps to create a row or column filled with zeros.
For Join Box and Forget Box, the main cost comes from row operations. As each row has size at most $2k + 2$, each such row operation requires $O(k)$ additions and multiplications. Regarding Forget Box, when a vertex $r_u$ associated with a row or a vertex $c_v$ associated with a column is forgotten, either a new type 1 row or column is created, or a new pivot is found. If the latter happens, the algorithm performs at most $2k + 2$ row operations, where each row has size at most $2k + 2$, leading to $O(k^2)$ field operations. Additionally, the algorithm may perform row operations to keep $R_x$ or $C_x$ of the right size, just as in Join Box (see below). If a new type 1 row or column is created, the algorithm just inserts the new row in $R_x$ or the new column in $C_x$, possibly performing row operations to keep $R_x$ or $C_x$ of the right size, just as in Join Box.
It remains to account for the row operations performed to keep matrices $R_x$ and $C_x$ of the right size. Recall that, every time row operations of this type are necessary to produce $R_x$, we start with a matrix of dimension $\rho \times c_x$, where $\rho > c_x + k + 1$, and we create at least $k + 2$ zero rows. These rows become processed. For $C_x$, we start with a matrix of dimension $r_x \times \gamma$, where $\gamma > r_x + k + 1$, and we identify at least $k + 2$ columns that are linear combinations of other columns. Again, these columns become processed. We shall keep track of the overall number of field operations performed at these steps by "charging" the rows and columns processed at this step, that is, if $t$ field operations are performed and $\ell$ rows are processed, each of the $\ell$ rows is charged with $t/\ell$ operations. Because of the dimensions of the matrices, getting them to row echelon form takes $O(k^3)$ and $O(k^3)$ field operations, respectively. In particular, each row can be charged at most $O(k^2)$ and each column can be charged at most $O(k^2)$. Since any row or column may be processed at most once in this way, the overall number of field operations that the algorithm performs to keep matrices $R_x$ or $C_x$ of the right size is $O(k^2(m + n))$.
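In symbols, the charging argument may be summarized as follows (a sketch under the conventions above):

$$\underbrace{O(k^3)}_{\text{cost of one reduction}} \;=\; \underbrace{(k+2)}_{\text{rows/columns processed}} \cdot \, O(k^2) \qquad \Longrightarrow \qquad \sum_{\text{reductions}} O(k^3) \;\leq\; (m+n) \cdot O(k^2) \;=\; O(k^2(m+n)),$$

since each of the $m + n$ rows and columns is charged $O(k^2)$ operations at most once.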
Overall, the number of operations taken by the algorithm to produce the vectors $\pi_r$ and $\pi_c$, in addition to the vectors $v_r(i)$ and $v_c(j)$ for every row $i$ and every column $j$, is $O(k^2(N + m + n))$.
We now consider the matrices of the decomposition itself. The matrices $P$ and $Q$ are defined using $\pi_r$ and $\pi_c$ with no additional field operations. Indeed, $P$ is the permutation matrix that moves row $i$ to row $\pi_r(i)$ for every $i \in \{1, \ldots, m\}$, while $Q$ is the permutation matrix that moves column $j$ to column $\pi_c(j)$ for every $j \in \{1, \ldots, n\}$.
For $U$, the entries may be computed in two ways. They are either obtained from a vector $v_r(i)$ with no additional computation, or they are obtained as a linear combination of earlier columns using coefficients recorded in $v_c$, which requires at most $k + 1$ sums and products per entry. The number of operations performed to obtain entries of the second type is at most
$$2(k+1) \cdot mn = O(kmn). \qquad (6)$$
This concludes the analysis.
4 Consequences of Theorem 1
We conclude the paper with the statement of results that are immediate consequences of Theorem 1.
Corollary 3.
If $A$ is an $m \times n$ matrix whose bipartite graph is given with a tree decomposition of width $k$ with $N$ nodes and $b$ is a column vector of length $m$, then using $O(k^2(N + m + n))$ time and arithmetic operations, it can be decided whether the system of linear equations $Ax = b$ has a solution, and if so, then such a solution can be produced.
Proof.
To get one solution of the system of linear equations $Ax = b$, the algorithm that transforms $A$ into row echelon form is augmented by performing all row operations on the right hand side as well. If one ever gets a zero row on the left hand side together with a non-zero value on the right hand side, then the system has no solutions. Otherwise, with the left hand side in row echelon form, a solution is obtained by back-substitution. Hereby, all variables that correspond to columns that have been identified as linearly dependent during the algorithm, or that just have no pivot, are set to 0.
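The back-substitution step can be sketched as follows (Fractions stand in for field elements; the function name is ours). A zero left-hand-side row with a nonzero right-hand side is detected on the way.

```python
from fractions import Fraction

def solve_from_echelon(U, b):
    """Back-substitution on a row echelon system U x = b: variables
    without a pivot are set to 0; a zero row with a nonzero right hand
    side means the system has no solution."""
    m, n = len(U), len(U[0])
    x = [Fraction(0)] * n
    for i in reversed(range(m)):
        pivot_col = next((j for j, a in enumerate(U[i]) if a != 0), None)
        if pivot_col is None:
            if b[i] != 0:
                return None            # inconsistent equation 0 = b_i
            continue
        s = sum(U[i][j] * x[j] for j in range(pivot_col + 1, n))
        x[pivot_col] = (b[i] - s) / U[i][pivot_col]
    return x

U = [[Fraction(1), Fraction(2), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(3)],
     [Fraction(0), Fraction(0), Fraction(0)]]
print(solve_from_echelon(U, [Fraction(4), Fraction(6), Fraction(0)]))
# -> [Fraction(4, 1), Fraction(0, 1), Fraction(2, 1)]
```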
Corollary 4.
The following hold if $A$ is an $m \times n$ matrix whose bipartite graph is given with a tree decomposition of width $k$ with $N$ nodes:
- (a) The rank of $A$ can be computed using $O(k^2(N + m + n))$ time and arithmetic operations.
- (b) For $m = n$, the determinant of $A$ can be computed using $O(k^2(N + n))$ time and arithmetic operations.
Proof.
The rank is just the number of non-zero rows of the output matrix $U$. If one of the columns is a linear combination of previous columns, the determinant is zero. Otherwise, the determinant is just the product of the signs of the permutation matrices $P$ and $Q$ with the product of the diagonal elements of the row equivalent matrix $U$ in row echelon form.
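As a sketch (with $\pi_r$ and $\pi_c$ given in one-line notation as in Section 3, and all function names ours):

```python
from fractions import Fraction

def parity_sign(perm):
    """Sign of a permutation in one-line notation (1-based values),
    computed from the parity of its cycle lengths."""
    sign, seen = 1, set()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:           # walk one cycle of the permutation
            seen.add(j)
            j = perm[j] - 1
            length += 1
        sign *= -1 if length % 2 == 0 else 1
    return sign

def determinant(U, pi_r, pi_c):
    """det(A) = sign(P) * sign(Q) * (product of diagonal entries of U),
    when U is the echelon form obtained from P A Q without row swaps."""
    d = Fraction(1)
    for i in range(len(U)):
        d *= U[i][i]
    return parity_sign(pi_r) * parity_sign(pi_c) * d

assert parity_sign([2, 1, 3]) == -1    # a single transposition
```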
Corollary 5.
For a graph on $n$ vertices given with a tree decomposition of width $k$ with $N$ nodes, the size of a maximum matching can be computed using $O(k^2(N + n))$ time and arithmetic operations by a randomized algorithm with one-sided error that is correct with probability at least $1 - 1/n^c$ for an arbitrary constant $c$. In case of an error, the algorithm reports a suboptimal value.
Proof.
This follows from Theorem 1.4 of Fomin et al. [5] by replacing their rank algorithm by our rank algorithm.
References
- [1] U. Bertelè and F. Brioschi. Nonserial Dynamic Programming. Elsevier, 1972.
- [2] Hans L. Bodlaender. A linear-time algorithm for finding tree-decompositions of small treewidth. SIAM J. Comput., 25(6):1305–1317, 1996. doi:10.1137/S0097539793251219.
- [3] Hans L. Bodlaender, Pål Grønås Drange, Markus S. Dregi, Fedor V. Fomin, Daniel Lokshtanov, and Michał Pilipczuk. A 5-approximation algorithm for treewidth. SIAM Journal on Computing, 45(2):317–378, 2016. doi:10.1137/130947374.
- [4] Sally Dong, Yin Tat Lee, and Guanghao Ye. A nearly-linear time algorithm for linear programs with small treewidth: a multiscale representation of robust central path. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1784–1797. ACM, 2021. doi:10.1145/3406325.3451056.
- [5] Fedor V. Fomin, Daniel Lokshtanov, Saket Saurabh, Michal Pilipczuk, and Marcin Wrochna. Fully polynomial-time parameterized computations for graphs and matrices of low treewidth. ACM Trans. Algorithms, 14(3):34:1–34:45, 2018. doi:10.1145/3186898.
- [6] Martin Fürer, Carlos Hoppen, and Vilmar Trevisan. Efficient diagonalization of symmetric matrices associated with graphs of small treewidth. Theoretical Computer Science, 1040:115187, 2025. doi:10.1016/j.tcs.2025.115187.
- [7] Martin C. Golumbic and Clinton F. Goss. Perfect elimination and chordal bipartite graphs. Journal of Graph Theory, 2(2):155–163, 1978. doi:10.1002/jgt.3190020209.
- [8] Rudolf Halin. S-functions for graphs. Journal of Geometry, 8(1):171–186, 1976. doi:10.1007/BF01917434.
- [9] Ton Kloks. Treewidth, Computations and Approximations, volume 842 of Lecture Notes in Computer Science. Springer, 1994. doi:10.1007/BFB0045375.
- [10] Tuukka Korhonen. A single-exponential time 2-approximation algorithm for treewidth. SIAM Journal on Computing, pages FOCS21–174, November 2023. doi:10.1137/22M147551X.
- [11] Seymour V. Parter. The use of linear graphs in Gauss elimination. SIAM Review, 3(2):119–130, 1961. doi:10.1137/1003021.
- [12] Richard Peng and Santosh S. Vempala. Solving sparse linear systems faster than matrix multiplication. Commun. ACM, 67(7):79–86, 2024. doi:10.1145/3615679.
- [13] Venkatesh Radhakrishnan, Harry B. Hunt III, and Richard E. Stearns. Efficient algorithms for solving systems of linear equations and path problems. In Alain Finkel and Matthias Jantzen, editors, STACS 92, 9th Annual Symposium on Theoretical Aspects of Computer Science, Cachan, France, February 13-15, 1992, Proceedings, volume 577 of Lecture Notes in Computer Science, pages 109–119. Springer, 1992. doi:10.1007/3-540-55210-3_177.
- [14] Neil Robertson and Paul D. Seymour. Graph minors. I. excluding a forest. J. Comb. Theory B, 35(1):39–61, 1983. doi:10.1016/0095-8956(83)90079-5.
- [15] Neil Robertson and Paul D. Seymour. Graph minors. II. algorithmic aspects of tree-width. J. Algorithms, 7(3):309–322, 1986. doi:10.1016/0196-6774(86)90023-4.
- [16] Donald J. Rose. Triangulated graphs and the elimination process. Journal of Mathematical Analysis and Applications, 32(3):597–609, 1970. doi:10.1016/0022-247X(70)90282-9.
- [17] Donald J. Rose and Robert E. Tarjan. Algorithmic aspects of vertex elimination on directed graphs. SIAM Journal on Applied Mathematics, 34(1):176–197, 1978.
- [18] Donald J. Rose, Robert E. Tarjan, and George S. Lueker. Algorithmic aspects of vertex elimination on graphs. SIAM Journal on Computing, 5(2):266–283, 1976. doi:10.1137/0205021.
- [19] David R. Wood. On tree-partition-width. European Journal of Combinatorics, 30(5):1245–1253, 2009. Part Special Issue on Metric Graph Theory. doi:10.1016/j.ejc.2008.11.010.
