Extremal Betti Numbers and Persistence in Flag Complexes
Abstract
We investigate several problems concerning extremal Betti numbers and persistence in filtrations of flag complexes. For graphs on vertices, we show that is maximal when , the Turán graph on partition classes, where denotes the flag complex of . Building on this, we construct an edgewise (one edge at a time) filtration for which is maximal for all graphs on vertices and edges. Moreover, the persistence barcode achieves a maximal number of intervals, and total persistence, among all edgewise filtrations with edges.
For , we consider edgewise filtrations of the complete graph . We show that the maximal number of intervals in the persistence barcode is obtained precisely when . Among such filtrations, we characterize those achieving maximal total persistence. We further show that no filtration can optimize for all , and conjecture that our filtrations maximize the total persistence over all edgewise filtrations of .
Keywords and phrases:
Topological data analysis, Extremal graph theoryCopyright and License:
![[Uncaptioned image]](x1.png)
2012 ACM Subject Classification:
Mathematics of computing Algebraic topologyAcknowledgements:
We thank Ulrich Bauer for suggesting the problem of maximizing the number of intervals in the barcode of a Vietoris–Rips filtration.Editors:
Oswin Aichholzer and Haitao WangSeries and Publisher:

1 Introduction
A central theme in topological data analysis (TDA) is the computation of homological invariants from simplicial complexes. These complexes are often constructed from point cloud data, with the Vietoris–Rips complex being one of the most widely used constructions. For a finite set in a metric space , the Vietoris–Rips complex at scale , denoted , includes all simplices such that for all . This complex is a flag complex, meaning it is the largest simplicial complex with a given underlying graph (the 1-skeleton). In particular, the edges in the connectivity graph of at scale fully determine the higher-dimensional simplices in . Varying the scale parameter induces a filtration of simplicial complexes:
where . Applying -dimensional homology over a field to this filtration yields a persistence barcode in degree , denoted . This barcode, consisting of intervals , encodes the birth and death of topological features across scales and serves as a powerful tool for extracting topological information from . When is a sufficiently dense sampling of an underlying space , the number of “long bars” in determines the -th Betti number , which describes the -dimensional topological features of [7]. For a comprehensive introduction to topological data analysis, we refer the reader to [8].
Given the central role of such filtrations in TDA, this paper addresses a fundamental question: how many topological features can arise in a data set of points? Specifically, we investigate bounds on quantities such as the maximal value of , the maximal number of intervals in , the length of the longest interval, and the sum of the lengths of the intervals (the total persistence).
Remark 1.
In this paper, we focus on edgewise filtrations of graphs, i.e., sequences of graphs on vertices, where and differ by precisely one edge. Notably, every filtration of flag complexes can be realized as the Vietoris–Rips filtration of an appropriate metric on the vertices of ; see the full version of this paper [3, A]. Thus, our results are directly applicable to data analysis using persistent homology.
1.1 Overview and contributions
In Section 2, we introduce the relevant background material and compute the Betti numbers of flag complexes on Turán graphs.
In Section 3, we examine extremal Betti numbers and establish tight upper bounds on , where is a flag complex on vertices (Theorem 10). This upper bound is achieved by the Turán graph .
In Section 4, we study filtrations of flag complexes on vertices with at most edges, where is the number of edges in . We prove that our filtration maximizes at all filtration steps (Theorem 16).
In Section 5, we analyze the longest possible interval in the barcode of a filtration of flag complexes on vertices (Corollary 21).
Additionally, in Section 6, we focus on degree homology and show that any filtration of flag complexes on vertices will achieve the maximal number of intervals in its degree barcode if and only if the filtration contains (Corollary 25).
In Section 7, we explore flag filtrations in homology degree that achieve both a maximal number of intervals and maximal total persistence. Equivalently, we maximize the total persistence of all filtrations of graphs containing . Given the importance of Turán graphs in extremal graph theory, we believe that identifying extremal filtrations of Turán graphs is interesting in its own right. Our result relies on an elaborate combinatorial analysis that precisely classifies the extremal filtrations (Theorem 31).
Finally, the paper concludes with a discussion in which we outline several important conjectures for future work.
1.2 Related work
The study of extremal values of -linear functions on the - and -vectors of simplicial complexes with vertices has a rich history. Classical questions include determining the extremal Euler characteristic and the maximal sum of Betti numbers. These problems were addressed for general simplicial complexes in [4] and later extended to arbitrary -linear functions in [11]. Specifically, [11, Theorem 3.4] shows that for flag complexes, extremal values are always realized by for some . Consequently, Theorem 10 follows directly from [11, Theorem 3.4].
Our proof of Theorem 10 adapts the approach of [1, Theorem 1.1], which gives an alternative proof for the extremal total Betti number of flag complexes. The key observation in our proof of Theorem 10 plays a central role in our results on extremal interval lengths (Section 5).
Another related work is [10], which examines asymptotic bounds for in Vietoris–Rips complexes of point samples in . Similar questions have been explored for Čech complexes; see, e.g., [9].
Our work diverges from earlier work by considering filtered flag complexes. This approach is closely tied to the problem of finding extremal values for complexes with precisely vertices and edges; we return to this in Section 8. Importantly, and in contrast to classical work, graphs achieving extremal values need not have as a subgraph.
2 Background
Graphs and Simplicial Complexes.
Let be a graph. We write for the edge connecting the vertices and . For a vertex , the neighborhood is , and the closed neighborhood is . The degree of is . The complete graph on vertices is denoted .
The complement of , denoted , is the graph on the same vertex set as , where two distinct vertices of are adjacent if and only if they are non-adjacent in . The join of disjoint graphs and is the graph with vertex set and edge set . The complete bipartite graph is defined as the join , where each is an empty graph on vertices. The sets and are called the partition classes of .
For a vertex in a simplicial complex , the link and (closed) star of are the simplicial complexes given respectively by
We write for the simplicial complex with simplices for which .
For a graph , we let denote the simplicial complex with -simplices given by the -cliques in . A simplicial complex is called a flag complex if for some graph . The independence complex of is the simplicial complex . Examples of flag and independence complexes are given in Figure 1.
Homology.
For a simplicial complex , we let denote the dimension of the reduced homology group , and let denote the vector space of -chains. We use coefficients in a fixed, but arbitrary, field ; our optimal constructions are torsion-free and thus our results do not depend on the choice of coefficient field. For that reason, we simply write . We also employ the notation for a graph .
Persistent Homology.
A filtration of simplicial complexes is a collection of simplicial complexes such that for . Applying to a filtration yields a sequence of vector spaces and linear maps called a persistence module. Provided all the vector spaces are finite-dimensional, is uniquely described by a collection of intervals in , called the degree barcode of . We shall denote this barcode by . The total persistence of is given by
For a more thorough introduction to persistent homology, (generalized) persistence modules, and examples, see, e.g., [8, Chapter 3] or [5].
For a filtration of graphs (1-dimensional simplicial complexes), we get a filtration of simplicial complexes by taking the flag complex at every index. If is a single edge for all , then we say that the filtration is edgewise. We shall employ the notation .
Example 2.
An edgewise filtration of can be found in Figure 2.
2.1 Elementary homological properties
The following two lemmas are well-known and important in combinatorial topology. For completeness, their proofs can be found in the full version [3, B.1].
Lemma 3.
Let be a simplicial complex and a vertex. Then, for all ,
Lemma 4.
For any two graphs and , and ,
The following observation is essential for our work in Section 7.
Proposition 5.
Let be the vertex set of a graph containing all edges of the form where and . Let denote the number of connected components of restricted to . Then,
Proof.
Let denote the complement graph of and observe that where and are the full subgraphs of with vertices and , respectively. From Lemma 4,
2.2 Turán graphs
Definition 6.
Let and , and let be the smallest positive integer such that . Then, the graph is the complement graph of the graph
Example 7.
See Figure 3 for some examples of Turán graphs.
The proof of the following can be found in the full version [3, B.1].
Proposition 8.
For all integers and , we let be the smallest positive integer such that . We have
In particular, if is a multiple of , then
3 The maximum value of
The following is well-known; see the full version [3, B.2] for a proof.
Lemma 9.
Let be a positive integer, and be integers satisfying . Then, where
(1) |
As said in Section 1.2, the following proof is a modification of the proof of [1, Theorem 1.1].
Theorem 10.
Let be a graph on vertices. Then, for all ,
Proof.
We shall work inductively on . The result is trivially true for , so assume that it holds for all .
Note that . Let denote the minimal degree over all . Assume that is a vertex with , and let denote the neighbours of in . Moreover, let for and . Applying Lemma 3,
Here has an isolated vertex and thus is a cone, and therefore . Since every vertex of has degree at least , it follows that contains at most vertices. By the induction hypothesis it follows that
for integers for which . Hence, by Lemma 9,
Let . Combining Theorem 10 and Proposition 8, we have
Corollary 11.
Of all graphs on vertices, maximizes the -th Betti number. I.e.,
In particular, if is a multiple of , then
4 The maximum value of for
In this section, we fix , and the notation
(2) |
The following result follows from combining Lemma 3 and Corollary 11, and will be integral in proving the main result of this section.
Corollary 12.
Let be a vertex of degree in a graph . Then,
Proof.
If , then , and Lemma 3 implies that
Now observe that , and since has vertices, it follows from Corollary 11 that . We now define an edgewise filtration of the Turán graph . In this section, we shall show, for , that is maximal over all graphs with edges, i.e., that is fiberwise optimal.
Definition 13.
Let denote the vertices of and label the elements of such that and are in the same partition. Writing each edge as for , we order the edges by the following co-lexicographic order: if , or if and , or if and . Following this order, let denote the subgraph of with the first edges.
Note that, since contains no -simplices, is increasing as a function of .
Example 14.
The next lemma shows that, once a vertex has been connected to the growing component, and until the next vertex gets connected, the change in from adding a single edge increases with the number of edges already added. Its proof, similar to that of Corollary 12, can be found in the full version [3, B.3].
Lemma 15.
For edges, let be maximal such that is a subgraph of . Then,
In particular, for , we have that
Theorem 16.
Let , let be as above, and let be any other graph on vertices and edges. Then,
Proof.
We prove this inductively on the number of edges . The statement is clearly true for , so let’s assume that the result holds for all .
Let be the number of vertices in with positive degree, and let denote the number of vertices in with positive degree. If , then by Theorem 10,
Hence, we may assume that . In particular, the average degree of the positive-degree-vertices in is no larger than the average degree of positive-degree-vertices in .
Choose a vertex in with minimal positive degree , and observe that
since . In fact, if , then we must have that the average degree in is at least . Importantly, this happens if and only if and is a multiple of .
We may therefore assume that .
Let denote the number of edges that is added to before a new vertex gets positive degree in the filtration . Let be the degree of the last vertex in that obtained positive degree in . We consider two cases.
-
Case 1: . Then,
-
Case 2: . Note that and (see Figure 5). In particular,
In both cases, the inequality follows from Corollary 12 and Lemma 15.
5 Tight bounds on the vanishing of homology and extremal interval lengths
For a graph with vertices, it is guaranteed that there exists a vertex of degree when the average degree exceeds . Specifically, if the number of edges satisfies , then must be a cone, implying that for all . In practical scenarios, however, the primary interest is with for small . In this section, we provide tight bounds for the vanishing of for a fixed .
Lemma 17.
Let be a graph with minimum degree . Let be a vertex of degree and let . Let . If has vertices and edges, then
Proof.
Let . The fact that is trivial. For showing the inequality, note that, by removing from , we remove the edges containing , and all edges containing the vertices of . Those vertices all have degree at least , which means that they have at least neighbors apart from . Because it might be the case that , it follows that
Theorem 18.
Let be a graph with vertices and edges. Then .
Proof.
Let us first consider the case . This case is immediate from the fact that the maximal number of edges in a graph on vertices with an isolated vertex is .
Working inductively on , assume that result holds for all , and that . Let be the complement graph of , and note that the number of edges satisfies
From the proof of Theorem 10, we have that
where , and are the neighbors of a vertex with minimal degree . We shall show that all the terms in the sum are zero.
If we let denote the minimal degree of a vertex in ,
Here, follows from Lemma 17, from , and from the fact that is obtained by removing vertices from with degree at least in . Now write
and observe that
It follows that,
Hence, by the induction hypothesis.
Remark 19.
Observe that the difference between the bound for a cone and the bound given in Theorem 18 is . While this difference is linear in the number of vertices, it can result in a significant reduction in the number of higher-dimensional simplices; a reduction which has the potential to speed up current implementations of persistent homology for flag complexes, e.g., Ripser [2].
In the following example, we show that that bounds in the previous theorem are tight.
Example 20.
Let , and let be a star graph on vertices and edges, i.e., a central vertex connected to all other vertices. Observe that as one vertex is completely disconnected in the complement graph. If, , then it follows from repeated application of Lemma 4 that . The number of edges in is
Corollary 21.
Let be an edgewise filtration on vertices. Then for any , we have
Proof.
The bound on follows from Theorem 16, and the bound on from Theorem 18. It is straightforward to define a filtration of the complement graph of from Example 20 such that contains the interval from Corollary 21.
6 The maximum value of
Proposition 22.
Let be an edgewise filtration. Then, there exists a triangle-free subgraph of such that
Proof.
Choose a representative cycle for each non-empty interval , e.g., by running the standard algorithm for persistent homology. For a cycle , let denote the set of edges on which the cycle has a non-zero coefficient, and let be the subgraph of with edges given by
Note that the cycles are linearly independent (as elements of ) by virtue of being representatives for non-trivial intervals.
If contains a 2-simplex , then any cycle supported on the edge is homologous in to the cycle . Hence, we can remove from , without reducing the 1st Betti number. Doing this for all triangles, we end up with a triangle-free graph containing at least linearly independent -cycles. In particular, .
Remark 23.
This argument does not apply to homology in higher degrees. For instance, the removal of an edge can result in the removal of many -simplices, some of which are faces of -simplices, and others that are not.
The previous result actually gives a short proof of Corollary 11 for the case . We shall include this proof here, as it ensures uniqueness of the extremal complex.
Proposition 24.
For any graph on vertices, we have with equality if and only if .
Proof.
Define any edgewise filtration for which . Then, by Proposition 22, there exists a triangle-free graph such that . Hence, it suffices to maximize over triangle-free graphs. By the Euler-Poincaré formula,
In conclusion, we are seeking a triangle-free graph with a maximal number of edges, but this is well-known to be uniquely the graph by Turán’s theorem [6, Theorem 11.17].
Combined we arrive at the following.
Corollary 25.
If is a filtration of , then with equality if and only if .
7 Extremal filtrations for degree 1 persistent homology
In this section, we consider edgewise filtrations of the complete graph that maximize the total persistence and the number of bars for degree homology.
We shall answer the following question:
For a fixed number of vertices and for , which edgewise filtrations of the complete graph with a maximal number of bars achieve maximal total persistence?
By Corollary 25, this translates to finding
First, we give some preliminaries and establish some notation.
-
All graphs have vertices. The join of two graphs and , notation , is as in Section 2.
-
For a filtration , we write for its total persistence of degree .
-
Denote . If , then has as a subgraph, with partition classes and of sizes and .
-
Let be the number of connected components of . Then, by Proposition 5,
(3)
We also define what it means for two graphs and two filtrations to be (fiberwise) isomorphic.
Definition 26.
Two graphs and with vertex sets and edge sets are isomorphic, notation , if there is a bijection , such that if and only if Furthermore, two filtrations and are fiberwise isomorphic if for all .
First, by Corollary 25, any filtration with a maximal number of bars must contain the Turán graph . Moreover, Theorem 16 establishes that the filtration from Definition 13 is fiberwise optimal, reducing our problem to maximizing the total persistence of an edgewise filtration of the complete graph, beginning with the Turán graph plus one edge. This is a bit more involved, because we cannot find a fiberwise optimal filtration of the complete graph, as is shown in the following example.
Example 27.
We can prove the following about the structure of the graphs in an optimal filtration of .
Lemma 28.
Let be an edgewise filtration of with maximal total persistence that includes . Then, for and , one of the subgraphs consists of isolated vertices and a single component , and the other subgraph consists of isolated vertices, and a single component with vertices, such that is a subgraph of this component.
Proof.
Suppose that for some , , where and are isolated vertices in (without loss of generality). Also assume that contains a component of size . To maximize total persistence, it is more optimal to connect to , since this allows us to add edges of the form (where ) without increasing the number of connected components (cf. (3)). Because these additional edges do not decrease the degree homology, we should add them as early in the filtration as possible to maximize total persistence. Hence, by this same argument, should be isomorphic to and some isolated vertices.
Furthermore, note that we added the restriction because Theorem 18 ensures that beyond this point, . This means that
(4) |
but if , then could be any graph containing as a subgraph. Because of this, we restrict our focus to edgewise filtrations of the graph from (4).
Moreover, by Lemma 28, we only have to consider filtrations of the described form. We can represent them by a sequence of tuples of the form
(5) |
such that and . The representation from (5) corresponds to the filtration , where, for ,
In other words, the representation from (5) means that, starting from the Turán graph :
-
1.
we begin adding edges by forming a in ,
-
2.
then add edges to form a in ,
-
3.
increase the sizes of the complete subgraphs in and by constructing a (containing the smaller ) in , a in , and so on.
Note that this representation is not unique ( is equivalent to ).
Example 30.
In Figure 7, one finds a filtration and a corresponding representation.
Surprisingly, the optimal strategy is to first construct two large cliques in and of sizes approximately and , respectively, and then, alternating between and , increase the sizes of the cliques one by one. We prove this in the main result of this section.
Theorem 31.
Let and let and . Up to fiberwise isomorphism, the edgewise filtration of with maximal total persistence and a maximum number of bars is given by (Definition 13) for . After , for , the filtration is unique up to fiberwise isomorphism and represented by a sequence of tuples (cf. (5)) as follows
If , then there are, up to fiberwise isomorphism, two optimal filtrations, represented by
Remark 32.
We prove the even case. The proof of the odd case is more or less the same (although the values are slightly different). However, we need to prove that, for any representation as in (5), for . In the even case, we may assume this.
Proof.
It follows from Corollary 25 that, for a filtration to have a maximum number of bars, we must have . For a filtration to also have maximal total persistence, it immediately follows by Theorem 16 that for .
Any filtration that we consider in this proof will therefore be such that for . We are interested in the remainder of the filtration, which we can represent by a sequence of tuples as in (5). Within the proof, we will further reduce it by writing
(6) |
Furthermore, given a filtration represented by , we define its alternation depth as follows:
-
We let if for all .
-
Otherwise, let be such that for all , but .
The idea of the proof is to start with a filtration , with a representation as in (6), and show that we can change the filtration in steps, to obtain in every step a filtration such that , and end up with an optimal solution.
Now, writing , let be a filtration, represented by
We shall show that
-
1.
If the alternation depth , then we can change to a filtration such that and .
-
2.
If , then we can change to a filtration represented by with , such that or (depending on ) and .
For the first part, let be a filtration with alternation depth . Its representation, substituting by , is of the form
Since we assumed that , we must have . We transform into a filtration which is the same as , except that we add an extra alternation step. We consider two cases.
-
Case : . In this case, is represented by
-
Case : . In this case, is represented by
In both cases, we see that . Moreover, as we show in more detail in [3, B.4],
(7) |
To see that this difference is non-negative, first recall that , so the first two terms are strictly positive. Furthermore, since , we have , so
since . Note that if and only if and .
We now show the second part. We let be a filtration with representation
We transform into a filtration such that its start is different from ’s, but from the tuple onwards, and are the same. We again consider two cases.
-
Case : . In this case, is represented by
-
Case : . In this case, is represented by
In both cases, as we show in more detail in the full version [3, B.4], we have
(8) |
where we use in that , and in that . If , then implies that . If , then . Only in this case, and if , then the inequality is sharp.
Hence, if we start with an arbitrary filtration , then we can transform it step by step into a more optimal filtration with , by the first part. Then, by the second part, we can transform step by step into a more optimal filtration that starts with , with alternation depth . Hence, is the optimal filtration as described in the statement for .
In the case that , we saw in the proof of case and case , that the filtration starting with and having alternation depth satisfies , hence both and are optimal.
8 Discussion
In this paper, we have answered several key questions in the context of topological data analysis. However, many questions remain open, and we propose the following conjectures.
Conjecture 33.
If is a filtration on vertices, then the number of intervals in the barcode of in homology degree satisfies
Conjecture 34.
The extremal filtrations described in Section 7 achieve the maximal total persistence over any edgewise filtration of flag complexes in homology degree 1.
Proving the latter conjecture likely requires new ideas, as no filtration can maximize for all . In fact, extremal values need not be realized by graphs containing Turán graphs as spanning subgraphs, as illustrated by the following example.
Example 35.
Let and . Suppose contains the Turán graph . By Proposition 5, we seek to add 5 edges to the Turán graph to maximize the product . It is easy to see that this product can be at most 1. However, by partitioning the vertices as , with and , and adding all edges between and , we need to add 6 more edges. Adding these as a in the larger partition gives and ; see Figure 9. The value is extremal; see Figure 10.
Finding extremal values of -linear functions defined on simplicial complexes with precisely vertices and edges presents a challenging and interesting direction for future research.
Conjecture 36.
Let denote the collection of graphs on vertices and edges. If and , then contains a complete bipartite spanning subgraph.
References
- [1] Michał Adamaszek. Extremal problems related to Betti numbers of flag complexes. Discrete Applied Mathematics, 173:8–15, 2014. doi:10.1016/J.DAM.2014.04.006.
- [2] Ulrich Bauer. Ripser: efficient computation of Vietoris-Rips persistence barcodes. Journal of Applied and Computational Topology, 5(3):391–423, 2021. doi:10.1007/S41468-021-00071-5.
- [3] Lies Beers and Magnus Bakke Botnan. Extremal Betti Numbers and Persistence in Flag Complexes, 2025. arXiv:2502.21294.
- [4] Anders Björner and Gil Kalai. An extended Euler-Poincaré theorem, 1988.
- [5] Magnus Botnan and Michael Lesnick. An introduction to multiparameter persistence. In Representations of Algebras and Related Structures, pages 77–150. EMS Press, 2023.
- [6] Gary Chartrand and Ping Zhang. A first course in graph theory. Courier Corporation, 2013.
- [7] Frédéric Chazal and Steve Yann Oudot. Towards persistence-based reconstruction in Euclidean spaces. In Proceedings of the twenty-fourth annual symposium on Computational geometry, pages 232–241, 2008.
- [8] Tamal Krishna Dey and Yusu Wang. Computational topology for data analysis. Cambridge University Press, 2022.
- [9] Herbert Edelsbrunner and János Pach. Maximum Betti numbers of Čech complexes. In 40th International Symposium on Computational Geometry (SoCG 2024). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.SoCG.2024.53.
- [10] Michael Goff. Extremal Betti numbers of Vietoris–Rips complexes. Discrete and Computational Geometry, 46(1):132, 2011. doi:10.1007/s00454-010-9274-z.
- [11] Dmitry N Kozlov. Convex hulls of -and -vectors. Discrete & Computational Geometry, 18:421–431, 1997. doi:10.1007/PL00009326.