Abstract 1 Introduction 2 Background 3 The maximum value of 𝜷𝒌(𝑿(𝑮)) 4 The maximum value of 𝜷𝒌(𝑿(𝑮)) for |𝑬(𝑮)||𝑬(𝓣𝒏,𝒌+𝟏)| 5 Tight bounds on the vanishing of homology and extremal interval lengths 6 The maximum value of |𝜷𝟏𝐅𝐋(𝓖)| 7 Extremal filtrations for degree 1 persistent homology 8 Discussion References

Extremal Betti Numbers and Persistence in Flag Complexes

Lies Beers ORCID Vrije Universiteit Amsterdam, The Netherlands Magnus Bakke Botnan ORCID Vrije Universiteit Amsterdam, The Netherlands
Abstract

We investigate several problems concerning extremal Betti numbers and persistence in filtrations of flag complexes. For graphs on n vertices, we show that βk(X(G)) is maximal when G=𝒯n,k+1, the Turán graph on k+1 partition classes, where X(G) denotes the flag complex of G. Building on this, we construct an edgewise (one edge at a time) filtration 𝒢=G1𝒯n,k+1 for which βk(X(Gi)) is maximal for all graphs on n vertices and i edges. Moreover, the persistence barcode k(X(G)) achieves a maximal number of intervals, and total persistence, among all edgewise filtrations with |E(𝒯n,k+1)| edges.

For k=1, we consider edgewise filtrations of the complete graph Kn. We show that the maximal number of intervals in the persistence barcode is obtained precisely when Gn/2n/2=𝒯n,2. Among such filtrations, we characterize those achieving maximal total persistence. We further show that no filtration can optimize β1(X(Gi)) for all i, and conjecture that our filtrations maximize the total persistence over all edgewise filtrations of Kn.

Keywords and phrases:
Topological data analysis, Extremal graph theory
Copyright and License:
[Uncaptioned image] © Lies Beers and Magnus Bakke Botnan; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Mathematics of computing Algebraic topology
Related Version:
Full Version: https://arxiv.org/abs/2502.21294 [3]
Acknowledgements:
We thank Ulrich Bauer for suggesting the problem of maximizing the number of intervals in the barcode of a Vietoris–Rips filtration.
Editors:
Oswin Aichholzer and Haitao Wang

1 Introduction

A central theme in topological data analysis (TDA) is the computation of homological invariants from simplicial complexes. These complexes are often constructed from point cloud data, with the Vietoris–Rips complex being one of the most widely used constructions. For a finite set P in a metric space (M,d), the Vietoris–Rips complex at scale r, denoted VRr(P), includes all simplices σ={p0,,pn} such that d(pi,pj)2r for all 0i,jn. This complex is a flag complex, meaning it is the largest simplicial complex with a given underlying graph (the 1-skeleton). In particular, the edges in the connectivity graph of P at scale r fully determine the higher-dimensional simplices in VRr(P). Varying the scale parameter r induces a filtration of simplicial complexes:

VR(P)r0VR(P)r1VR(P)rm,

where r0<r1<<rm. Applying k-dimensional homology over a field 𝐤 to this filtration yields a persistence barcode in degree k, denoted k(VR(P)). This barcode, consisting of intervals [a,b), encodes the birth and death of topological features across scales and serves as a powerful tool for extracting topological information from P. When P is a sufficiently dense sampling of an underlying space X, the number of “long bars” in k(VR(P)) determines the k-th Betti number βk(X), which describes the k-dimensional topological features of X [7]. For a comprehensive introduction to topological data analysis, we refer the reader to [8].

Given the central role of such filtrations in TDA, this paper addresses a fundamental question: how many topological features can arise in a data set of n points? Specifically, we investigate bounds on quantities such as the maximal value of βk(VRr(P)), the maximal number of intervals in k(VR(P)), the length of the longest interval, and the sum of the lengths of the intervals (the total persistence).

 Remark 1.

In this paper, we focus on edgewise filtrations of graphs, i.e., sequences of graphs 𝒢=G1G2Gm={Gi}i=1m, on n vertices, where Gi+1 and Gi differ by precisely one edge. Notably, every filtration of flag complexes can be realized as the Vietoris–Rips filtration of an appropriate metric on the vertices of Gm; see the full version of this paper [3, A]. Thus, our results are directly applicable to data analysis using persistent homology.

1.1 Overview and contributions

In Section 2, we introduce the relevant background material and compute the Betti numbers of flag complexes on Turán graphs.

In Section 3, we examine extremal Betti numbers and establish tight upper bounds on βk(F), where F is a flag complex on n vertices (Theorem 10). This upper bound is achieved by the Turán graph 𝒯n,k+1.

In Section 4, we study filtrations of flag complexes on n vertices with at most e edges, where e is the number of edges in 𝒯n,k+1. We prove that our filtration maximizes βk at all filtration steps (Theorem 16).

In Section 5, we analyze the longest possible interval in the barcode of a filtration of flag complexes on n vertices (Corollary 21).

Additionally, in Section 6, we focus on degree 1 homology and show that any filtration of flag complexes on n vertices will achieve the maximal number of intervals in its degree 1 barcode if and only if the filtration contains 𝒯n,2 (Corollary 25).

In Section 7, we explore flag filtrations in homology degree 1 that achieve both a maximal number of intervals and maximal total persistence. Equivalently, we maximize the total persistence of all filtrations of graphs containing Gn/2n/2=𝒯n,2. Given the importance of Turán graphs in extremal graph theory, we believe that identifying extremal filtrations of Turán graphs is interesting in its own right. Our result relies on an elaborate combinatorial analysis that precisely classifies the extremal filtrations (Theorem 31).

Finally, the paper concludes with a discussion in which we outline several important conjectures for future work.

1.2 Related work

The study of extremal values of -linear functions on the f- and β-vectors of simplicial complexes with n vertices has a rich history. Classical questions include determining the extremal Euler characteristic and the maximal sum of Betti numbers. These problems were addressed for general simplicial complexes in [4] and later extended to arbitrary -linear functions in [11]. Specifically, [11, Theorem 3.4] shows that for flag complexes, extremal values are always realized by 𝒯n,k for some k. Consequently, Theorem 10 follows directly from [11, Theorem 3.4].

Our proof of Theorem 10 adapts the approach of [1, Theorem 1.1], which gives an alternative proof for the extremal total Betti number of flag complexes. The key observation in our proof of Theorem 10 plays a central role in our results on extremal interval lengths (Section 5).

Another related work is [10], which examines asymptotic bounds for βk in Vietoris–Rips complexes of point samples in d. Similar questions have been explored for Čech complexes; see, e.g., [9].

Our work diverges from earlier work by considering filtered flag complexes. This approach is closely tied to the problem of finding extremal values for complexes with precisely n vertices and e edges; we return to this in Section 8. Importantly, and in contrast to classical work, graphs achieving extremal values need not have 𝒯n,k as a subgraph.

2 Background

Graphs and Simplicial Complexes.

Let G=(V,E) be a graph. We write {v,w} for the edge connecting the vertices v and w. For a vertex vV, the neighborhood is NG(v){w:{v,w}E}, and the closed neighborhood is NG[v]NG(v){v}. The degree of v is deg(v)degG(v)|NG(v)|. The complete graph on n vertices is denoted Kn.

The complement of G, denoted G¯, is the graph on the same vertex set as G, where two distinct vertices of G¯ are adjacent if and only if they are non-adjacent in G. The join of disjoint graphs G1 and G2 is the graph G1G2 with vertex set V(G1)V(G2) and edge set E(G1)E(G2){{v1,v2}:v1V(G1),v2V(G2)}. The complete bipartite graph Kn1,n2 is defined as the join G1G2, where each Gi is an empty graph on ni vertices. The sets V(G1) and V(G2) are called the partition classes of Kn1,n2.

For a vertex v in a simplicial complex K, the link and (closed) star of v are the simplicial complexes given respectively by

lkK(v)={τK:vτ,{v}τK}stK(v)={τK:{v}τK}.

We write Kv for the simplicial complex with simplices τK for which vK.

For a graph G, we let X(G) denote the simplicial complex with m-simplices given by the (m+1)-cliques in G. A simplicial complex K is called a flag complex if K=X(G) for some graph G. The independence complex of G is the simplicial complex Ind(G)=X(G¯). Examples of flag and independence complexes are given in Figure 1.

Figure 1: A graph (left), its flag complex (middle) and independence complex (right).

Homology.

For a simplicial complex K, we let βk(K) denote the dimension of the reduced homology group H~k(K;𝐤), and let Ck(K) denote the vector space of k-chains. We use coefficients in a fixed, but arbitrary, field 𝐤; our optimal constructions are torsion-free and thus our results do not depend on the choice of coefficient field. For that reason, we simply write H~k(K). We also employ the notation βkFL(G)=βk(X(G)) for a graph G.

Persistent Homology.

A filtration 𝒦 of simplicial complexes is a collection of simplicial complexes {Ki}i=1m such that KiKj for ij. Applying H~k(;𝐤) to a filtration yields a sequence of vector spaces and linear maps H~k(𝒦):H~k(K1)H~k(Km) called a persistence module. Provided all the vector spaces are finite-dimensional, H~k(𝒦) is uniquely described by a collection of intervals in {1,,m}, called the degree k barcode of 𝒦. We shall denote this barcode by k(𝒦). The total persistence of 𝒦 is given by

Tβk(𝒦)=[a,b)k(𝒦)(ba)=i=1mβk(Ki).

For a more thorough introduction to persistent homology, (generalized) persistence modules, and examples, see, e.g., [8, Chapter 3] or [5].

For a filtration 𝒢 of graphs (1-dimensional simplicial complexes), we get a filtration X(𝒢) of simplicial complexes by taking the flag complex at every index. If Gi+1Gi is a single edge for all i, then we say that the filtration 𝒢 is edgewise. We shall employ the notation kFL(G)=k(X(𝒢)).

Example 2.

An edgewise filtration of K5 can be found in Figure 2.

Figure 2: An edgewise filtration of K5.

2.1 Elementary homological properties

The following two lemmas are well-known and important in combinatorial topology. For completeness, their proofs can be found in the full version [3, B.1].

Lemma 3.

Let K be a simplicial complex and vV(K) a vertex. Then, for all k1,

βk(K)βk(Kv)+βk1(lkK(v)).
Lemma 4.

For any two graphs G and H, and k1,

βk(Ind(GH))=i,j1;i+j=k1βi(Ind(G))βj(Ind(H)).

The following observation is essential for our work in Section 7.

Proposition 5.

Let V(G)=V1V2 be the vertex set of a graph G containing all edges of the form {v,w} where vV1 and wV2. Let di1 denote the number of connected components of G restricted to Vi. Then, β1FL(G)=β0FL(G1)β0FL(G2)=(d11)(d21).

Proof.

Let G¯ denote the complement graph of G and observe that G¯=G¯1G¯2 where G1 and G2 are the full subgraphs of G with vertices V1 and V2, respectively. From Lemma 4,

β1FL(G)=β1(Ind(G¯))=β0(Ind(G¯1))β0(Ind(G¯2))=β0(X(G1))β0(X(G2)).

2.2 Turán graphs

Definition 6.

Let n0 and k1, and let n be the smallest positive integer such that nnmodk. Then, the (n,k) graph 𝒯n,k is the complement graph of the graph

𝒯n,k¯=i=1kKni,ni={n/k if 1in,n/kotherwise.
Example 7.

See Figure 3 for some examples of Turán graphs.

Figure 3: From left to right: the Turán graphs 𝒯5,2, 𝒯8,2 and 𝒯8,3.

The proof of the following can be found in the full version [3, B.1].

Proposition 8.

For all integers n1 and k1, we let n be the smallest positive integer such that nnmodk. We have

βiFL(𝒯n,k)={(n/k1)n(n/k1)knif i=k1,0otherwise.

In particular, if n is a multiple of k, then βk1=(n/k1)k.

3 The maximum value of 𝜷𝒌(𝑿(𝑮))

The following is well-known; see the full version [3, B.2] for a proof.

Lemma 9.

Let S be a positive integer, and xi1 be integers satisfying i=1nxi=S. Then, i=1n(xi1)i=1n(yi1) where

yi={S/n if 1iSmodnS/notherwise. (1)

As said in Section 1.2, the following proof is a modification of the proof of [1, Theorem 1.1].

Theorem 10.

Let G be a graph on n vertices. Then, for all k0,

βkFL(G)(n/(k+1)1)(nmodk+1)(n/(k+1)1)(k+1)(nmodk+1).

Proof.

We shall work inductively on n. The result is trivially true for n=1, so assume that it holds for all n<n.

Note that X(G)=Ind(G¯). Let d denote the minimal degree over all vV(G¯). Assume that v is a vertex with deg(v)=d, and let {v1,,vd} denote the neighbours of v in G¯. Moreover, let G¯i=G¯{v1,,vi} for i=1,,d and G¯0=G¯. Applying Lemma 3,

βk(Ind(G¯)) βk(Ind(G¯1))+βk1(Ind(G¯NG¯[v1])
βk(Ind(G¯2))+βk1(Ind(G¯1NG¯1[v2]))+βk1(Ind(G¯NG¯[v1])))
βk(Ind(G¯d))+i=0d1βk1(Ind(G¯iNG¯i[vi+1]))

Here G¯d has an isolated vertex and thus Ind(G¯d) is a cone, and therefore βk(Ind(G¯d))=0. Since every vertex of G¯ has degree at least d, it follows that G¯iNG¯i(vi+1) contains at most n(d+1)=nd1 vertices. By the induction hypothesis it follows that

βk(Ind(G¯))d(x11)(xk1)=((d+1)1)(x11)(xk1)

for integers xi1 for which (d+1)+i=1kxi=d+1+nd1=n. Hence, by Lemma 9,

βk(Ind(G¯))(n/(k+1)1)(nmodk+1)(n/(k+1)1)(k+1)(nmodk+1).

Let 𝐆(n)={G: graph on n vertices}. Combining Theorem 10 and Proposition 8, we have

Corollary 11.

Of all graphs on n vertices, 𝒯n,k+1 maximizes the k-th Betti number. I.e.,

maxG𝐆(n)βkFL(G)=βkFL(𝒯n,k+1).

In particular, if n is a multiple of k, then maxG𝐆(n)βkFL(G)=(n/k1)k.

4 The maximum value of 𝜷𝒌(𝑿(𝑮)) for |𝑬(𝑮)||𝑬(𝓣𝒏,𝒌+𝟏)|

In this section, we fix n,k1, and the notation

en,k+1|E(𝒯n,k+1)| and Δm1,mk+1emk+1em1k+1. (2)

The following result follows from combining Lemma 3 and Corollary 11, and will be integral in proving the main result of this section.

Corollary 12.

Let v be a vertex of degree d1 in a graph G. Then,

βkFL(G)βkFL(Gv)+βk1FL(𝒯d,k).

Proof.

If K=X(G), then Kv=X(Gv), and Lemma 3 implies that

βkFL(G)βkFL(Gv)+βk1(lkX(G)(v)).

Now observe that lkX(G)(v)=X(NG[v]), and since lkX(G)(v) has d vertices, it follows from Corollary 11 that βk1(lkK(v))βk1FL(𝒯d,k) We now define an edgewise filtration n,k+1=={Hi}i=1en,k+1={Hin,k+1}i=1en,k+1 of the Turán graph 𝒯n,k+1. In this section, we shall show, for m=1,,|𝒯n,k+1|, that βkFL(Hm) is maximal over all graphs with m edges, i.e., that is fiberwise optimal.

Figure 4: The filtration of 𝒯8,2 from Definition 13.
Definition 13.

Let V denote the vertices of 𝒯n,k+1 and label the elements of V such that vi and vi+k+1 are in the same partition. Writing each edge as ei,j={vi,vj} for i<j, we order the edges by the following co-lexicographic order: ei,j<ek,l if max{i,j}<max{k,l}, or if i=k and j<l, or if j=l and i<k. Following this order, let Hi=Hin,k+1 denote the subgraph of 𝒯n,k+1 with the first i edges.

Note that, since 𝒯n,k+1 contains no k+1-simplices, βk(Hi) is increasing as a function of i.

Example 14.

See Figure 2 for an edgewise filtration 𝒢={Gi}i=16 of K5, where GiHi5,2 for i=1,,6. Furthermore, in Figure 4 one can find the filtration 8,2 of 𝒯8,2.

The next lemma shows that, once a vertex has been connected to the growing component, and until the next vertex gets connected, the change in βkFL from adding a single edge increases with the number of edges already added. Its proof, similar to that of Corollary 12, can be found in the full version [3, B.3].

Lemma 15.

For e edges, let m be maximal such that 𝒯m,k+1 is a subgraph of He. Then,

βkFL(He)=βkFL(𝒯m,k+1)+βk1FL(𝒯eemk+1,k).

In particular, for emk+1e1<e2<em+1k+1, we have that

βkFL(He1+1)βkFL(He1)βkFL(He2+1)βkFL(He2).
Theorem 16.

Let een,k+1, let He be as above, and let G be any other graph on n vertices and e edges. Then, βkFL(G)βkFL(He).

Proof.

We prove this inductively on the number of edges e. The statement is clearly true for e=1, so let’s assume that the result holds for all e<e.

Let m be the number of vertices in He with positive degree, and let n denote the number of vertices in G with positive degree. If n<m, then by Theorem 10,

βkFL(G)βkFL(𝒯m1,k+1)βkFL(He).

Hence, we may assume that nm. In particular, the average degree of the positive-degree-vertices in G is no larger than the average degree of positive-degree-vertices in He.

Choose a vertex v in G with minimal positive degree d, and observe that

demk+1em1k+1=Δm1,m,

since 𝒯m1,k+1He𝒯m,k+1. In fact, if d=Δm1,m, then we must have that the average degree in He is at least d. Importantly, this happens if and only if He=𝒯m,k+1 and m is a multiple of k+1.

By the induction assumption,

βkFL(Gv) βkFL(Hedeg(v)), and thus, by Corollary 12,
βkFL(G) βkFL(Hedeg(v))+βk1FL(𝒯d,k).

If d=Δm1,m, then this becomes, by Lemma 15,

βkFL(G)βkFL(𝒯m1,k+1))+βFLk1(𝒯d,k)=βFLk(𝒯m,k+1)=βFLk(He).
Figure 5: An illustration of Case 2 from the proof of Theorem 16.

We may therefore assume that dΔm1,m1Δm2,m1.

Let e^ denote the number of edges that is added to Hed before a new vertex gets positive degree in the filtration . Let d be the degree of the last vertex in Hed that obtained positive degree in . We consider two cases.

  • Case 1: 𝒯m1,kHed. Then,

    βkFL(G) βkFL(Hed)+βk1FL(𝒯d,k)
    =βkFL(Hed)+βkFL(Hemk+1+d)βkFL(Hemk+1)
    βkFL(Hed)+βkFL(He)βkFL(Hed)
    =βkFL(He).
  • Case 2: 𝒯m2,kHed𝒯m1,k. Note that d+e^=Δm2,m1d and de^0 (see Figure 5). In particular,

    βkFL(G) βkFL(Hed)+βk1FL(𝒯d,k)
    =βkFL(Hed)+(βk1FL(𝒯d,k)βk1FL(𝒯de^,k))+βk1FL(𝒯de^,k)
    βkFL(Hed)+(βk1FL(𝒯d+e^,k)βk1FL(𝒯d,k))+βk1FL(𝒯de^,k)
    =βkFL(He).

In both cases, the inequality follows from Corollary 12 and Lemma 15.

5 Tight bounds on the vanishing of homology and extremal interval lengths

For a graph G with n vertices, it is guaranteed that there exists a vertex of degree n1 when the average degree exceeds n2. Specifically, if the number of edges satisfies m>n(n2)2, then X(G) must be a cone, implying that βkFL(G)=0 for all k. In practical scenarios, however, the primary interest is with βkFL(G) for small k. In this section, we provide tight bounds for the vanishing of βkFL(G) for a fixed k.

Lemma 17.

Let G be a graph with minimum degree u. Let v be a vertex of degree u and let NG(v)={v1,,vu}. Let dideg(vi). If G has n vertices and m edges, then

|V(GNG[vi])|=ndi1 and |E(GNG[vi])|m(di+(u1)di2).

Proof.

Let G^GNG[vi]. The fact that |V(G^)|=ndi1 is trivial. For showing the inequality, note that, by removing NG[vi] from G, we remove the di edges containing vi, and all edges containing the vertices of NG(vi). Those vertices all have degree at least u, which means that they have at least u1 neighbors apart from vi. Because it might be the case that NG(NG[vi])=NG[vi], it follows that

|E(G^)|m(di+(u1)di2).

Theorem 18.

Let G be a graph with n vertices and m>(n12)+k edges. Then βkFL(G)=0.

Proof.

Let us first consider the case k=0. This case is immediate from the fact that the maximal number of edges in a graph on n vertices with an isolated vertex is (n12).

Working inductively on k, assume that result holds for all k<k, and that m>(n12)+k. Let G¯ be the complement graph of G, and note that the number of edges m¯=|E(G¯)| satisfies

m¯=(n2)m<(n2)(n12)k=n1k.

From the proof of Theorem 10, we have that

βkFL(G)=βk(Ind(G¯))i=0d1βk1(Ind(G¯iNG¯i[vi+1])),

where G¯i=G¯{v1,,vi}, and {v1,,vu} are the neighbors of a vertex vV(G¯) with minimal degree u. We shall show that all the terms in the sum are zero.

If we let uui denote the minimal degree of a vertex in V(G¯iNG¯i[vi+1]),

|E(G¯iNG¯i[vi+1])|
(i)|E(G¯i)|(degG¯i(vi+1)+(u1)degG¯i(vi+1)2)
(ii)|E(G¯i)|(ui+(ui1)(ui)2)=|E(G¯i)|j=1uij
(iii)|E(G¯)|j=ui+1ujj=1uij=|E(G¯)|j=1uj=m¯(u(u+1)2).

Here, (i) follows from Lemma 17, (ii) from degG¯i(vi)u, and (iii) from the fact that G¯i is obtained by removing i vertices from G¯ with degree at least u in G¯. Now write

n=|V(Ind(G¯iNG¯i[vi+1]))|nu1,

and observe that

|E(Ind(G¯iNG¯i[vi+1]))|(n2)m¯+u(u+1)/2.

It follows that,

|E(Ind(G¯iNG¯i[vi+1]))|(n12)
(n2)m¯+u(u+1)/2(n12)
=n1m¯+u(u+1)/2>n1(n1k)+u(u+1)/2k+u(u+1)/2k.

Hence, βk1(Ind(G¯iNG¯i[vi+1]))=0 by the induction hypothesis.

 Remark 19.

Observe that the difference between the bound for a cone and the bound given in Theorem 18 is n/2k1. While this difference is linear in the number of vertices, it can result in a significant reduction in the number of higher-dimensional simplices; a reduction which has the potential to speed up current implementations of persistent homology for flag complexes, e.g., Ripser [2].

In the following example, we show that that bounds in the previous theorem are tight.

Example 20.

Let n=(k+1)p, and let K1,p1 be a star graph on p vertices and p1 edges, i.e., a central vertex connected to all other vertices. Observe that β0(Ind(K1,p1))=1 as one vertex is completely disconnected in the complement graph. If, H=i=1k+1K1,p1, then it follows from repeated application of Lemma 4 that βk(Ind(H))i=1k+1β0(Ind(K1,p1))=1. The number of edges in Ind(H) is

(n2)(k+1)(p1)=(n2)n+(k+1)=(n2)(n1)+k=(n12)+k.
Corollary 21.

Let 𝒢 be an edgewise filtration on n vertices. Then for any [a,b)kFL(𝒢), we have

2k(k+1)a<b(n12)+k.

Proof.

The bound on a follows from Theorem 16, and the bound on b from Theorem 18. It is straightforward to define a filtration of the complement graph of H from Example 20 such that kFL() contains the interval [a,b) from Corollary 21.

6 The maximum value of |𝜷𝟏𝐅𝐋(𝓖)|

Proposition 22.

Let 𝒢={Gi}i=1m be an edgewise filtration. Then, there exists a triangle-free subgraph H of Gm such that β1FL(H)|1FL(𝒢)|.

Proof.

Choose a representative cycle c[a,b] for each non-empty interval [a,b)1FL(𝒢), e.g., by running the standard algorithm for persistent homology. For a cycle c, let e(c) denote the set of edges on which the cycle has a non-zero coefficient, and let H be the subgraph of Gm with edges given by

[a,b)1FL(𝒢)e(c[a,b)]).

Note that the cycles {c[a,b):[a,b)1FL(𝒢)} are linearly independent (as elements of C1(G)) by virtue of being representatives for non-trivial intervals.

If X(H) contains a 2-simplex τ={v1,v2,v3}, then any cycle c[a,b) supported on the edge {v2,v3} is homologous in H~1(X(H)) to the cycle c[a,b)=c[a,b)2(τ). Hence, we can remove {v2,v3} from H, without reducing the 1st Betti number. Doing this for all triangles, we end up with a triangle-free graph H^ containing at least |1FL(𝒢)| linearly independent 1-cycles. In particular, β1FL(H^)|1FL(𝒢)|.

 Remark 23.

This argument does not apply to homology in higher degrees. For instance, the removal of an edge can result in the removal of many 2-simplices, some of which are faces of 3-simplices, and others that are not.

The previous result actually gives a short proof of Corollary 11 for the case k=1. We shall include this proof here, as it ensures uniqueness of the extremal complex.

Proposition 24.

For any graph on n vertices, we have β1FL(G)β1FL(𝒯n,2) with equality if and only if G=𝒯n,2.

Proof.

Define any edgewise filtration 𝒢={Gi}i=1m for which Gm=G. Then, by Proposition 22, there exists a triangle-free graph H such that β1FL(H)1FL(𝒢)β1FL(G). Hence, it suffices to maximize β1FL(G) over triangle-free graphs. By the Euler-Poincaré formula,

β1FL(G)=β0FL(G)1|V(G)|+|E(G)|.

In conclusion, we are seeking a triangle-free graph with a maximal number of edges, but this is well-known to be uniquely the graph 𝒯n,2 by Turán’s theorem [6, Theorem 11.17].

Combined we arrive at the following.

Corollary 25.

If 𝒢 is a filtration of Kn, then |1FL(𝒢)|β1FL(𝒯n,2) with equality if and only if Gn/2n/2=𝒯n,2.

7 Extremal filtrations for degree 1 persistent homology

In this section, we consider edgewise filtrations 𝒢=G1G(n2) of the complete graph Kn that maximize the total persistence and the number of bars for degree 1 homology. We shall answer the following question:
For a fixed number of vertices n and for k=1, which edgewise filtrations of the complete graph Kn with a maximal number of bars achieve maximal total persistence?
By Corollary 25, this translates to finding

max{i=1n(n1)/2β1FL(Gi):𝒢={Gi:1i(n2),G|E(𝒯n,2)|=𝒯n,2 and Gn(n1)/2=Kn}}.

First, we give some preliminaries and establish some notation.

  • All graphs G=(V,E) have n vertices. The join of two graphs G1 and G2, notation G1G2, is as in Section 2.

  • For a filtration 𝒢={Gi}i=1m, we write Tβ1FL(𝒢)=i=1mβ1FL(Gi) for its total persistence of degree 1.

  • Denote en|E(𝒯n,2)|. If |E(G)|en, then G has 𝒯n,2 as a subgraph, with partition classes V1 and V2 of sizes |V1|=n/2 and |V2|=n/2.

  • Let di be the number of connected components of G|Vi. Then, by Proposition 5,

    β1FL(G)=(d11)(d21). (3)

We also define what it means for two graphs and two filtrations to be (fiberwise) isomorphic.

Definition 26.

Two graphs G1 and G2 with vertex sets V(Gi) and edge sets E(Gi) are isomorphic, notation G1G2, if there is a bijection φ:V(G1)V(G2), such that {u1,u2}E(G1) if and only if {φ(u1),φ(u2)}E(G2). Furthermore, two filtrations 𝒢1={Gi1}i=1m and 𝒢2={Gi2}i=1m are fiberwise isomorphic if Gi1Gi2 for all i=1,,m.

First, by Corollary 25, any filtration with a maximal number of bars must contain the Turán graph 𝒯n,2. Moreover, Theorem 16 establishes that the filtration ={Hi}i=1en from Definition 13 is fiberwise optimal, reducing our problem to maximizing the total persistence of an edgewise filtration of the complete graph, beginning with the Turán graph plus one edge. This is a bit more involved, because we cannot find a fiberwise optimal filtration of the complete graph, as is shown in the following example.

Example 27.

If |E|=en+2, then one gets optimal degree 1 homology by adding one edge to V1 and one edge to V2 (left graph in Figure 6). However, if |E|=en+3, then optimal degree 1 homology is obtained by adding all three edges to V1 to form a K3 (second-to-left graph in Figure 6). It is clear that a filtration of Kn cannot contain both graphs.

Figure 6: The graphs from Example 27 and 29.

We can prove the following about the structure of the graphs in an optimal filtration of Kn.

Lemma 28.

Let 𝒢 be an edgewise filtration {Gi}i=1(n2) of Kn with maximal total persistence that includes 𝒯n,2. Then, for t=1,2 and i=en+1,,(n12)+1, one of the subgraphs Gi|Vt consists of isolated vertices and a single component Km, and the other subgraph Gi|Vt consists of isolated vertices, and a single component with l vertices, such that Kl1 is a subgraph of this component.

Example 29.

Cf. Figure 6 for four examples of graphs with the properties from Lemma 28.

Proof.

Suppose that for some i, E(Gi)E(Gi1)={w1,w2}, where w1 and w2 are isolated vertices in Gj1|V1 (without loss of generality). Also assume that Gj1|V1 contains a component C of size |C|>1. To maximize total persistence, it is more optimal to connect w1 to C, since this allows us to add |C|1 edges of the form {w1,c} (where cC) without increasing the number of connected components (cf. (3)). Because these additional edges do not decrease the degree 1 homology, we should add them as early in the filtration as possible to maximize total persistence. Hence, by this same argument, Gj1|V2 should be isomorphic to Km and some isolated vertices.

Furthermore, note that we added the restriction j(n12)+1 because Theorem 18 ensures that beyond this point, β1FL(Gj)=0. This means that

G(n12)+1=(K1Kn/21)(K1Kn/21), (4)

but if j>(n12)+1, then Gj could be any graph containing G(n12)+1 as a subgraph. Because of this, we restrict our focus to edgewise filtrations of the graph G(n12)+1 from (4).

Moreover, by Lemma 28, we only have to consider filtrations of the described form. We can represent them by a sequence of tuples of the form

(K1,K1)=(Kl1,Kr1),(Kl2,Kr2),,(Klc,Krc)=(Kn/21,Kn/21), (5)

such that lili1 and riri1. The representation from (5) corresponds to the filtration 𝒢={Gi}i=1m, where, for i1,

Gen+|E(Kli)|+|E(Kri)| (Klij=li+1n/2K1)(Krij=ri+1n/2K1) and
Gen+|E(Kli+1)|+|E(Kri)| (Kli+1j=li+1n/2K1)(Krij=ri+1n/2K1).

In other words, the representation from (5) means that, starting from the Turán graph 𝒯n,2:

  1. 1.

    we begin adding edges by forming a Kl2 in V1,

  2. 2.

    then add edges to form a Kr2 in V2,

  3. 3.

    increase the sizes of the complete subgraphs in V1 and V2 by constructing a Kl3 (containing the smaller Kl2) in V1, a Kr3 in V2, and so on.

Note that this representation is not unique ((K3,K1) is equivalent to (K2,K1),(K3,K1)).

Figure 7: A filtration and its representation. Here, denotes the join of the left and right graphs.
Example 30.

In Figure 7, one finds a filtration and a corresponding representation.

Figure 8: A filtration with the same total persistence as the one from Figure 7.

Surprisingly, the optimal strategy is to first construct two large cliques in V1 and V2 of sizes approximately 34|V1| and 34|V2|, respectively, and then, alternating between V1 and V2, increase the sizes of the cliques one by one. We prove this in the main result of this section.

Theorem 31.

Let n4 and let jn(3n2)/8 and kn(3n+7)/8. Up to fiberwise isomorphism, the edgewise filtration 𝒢n,max={Gi}i=1(n12)+1 of G(n12)+1 with maximal total persistence and a maximum number of bars is given by GiHi (Definition 13) for i=1,,en. After Gen=𝒯n,2, for n0mod8, the filtration is unique up to fiberwise isomorphism and represented by a sequence of tuples (cf. (5)) as follows

{(K1,K1),(Kjn,Kjn),(Kjn+1,Kjn+1),,(Kn/21,Kn/21), if n2,4,6mod8,(K1,K1),(Kkn,Kkn1),(Kkn+1,Kkn),,(K(n1)/2,K(n3)/2), if n1mod2.

If n0mod8, then there are, up to fiberwise isomorphism, two optimal filtrations, represented by

(K1,K1),(Kjn,Kjn),(Kjn+1,Kjn+1),,(Kn/21,Kn/21) and
(K1,K1),(Kjn+1,Kjn+1),(Kjn+2,Kjn+2),,(Kn/21,Kn/21).
 Remark 32.

We prove the even case. The proof of the odd case is more or less the same (although the values are slightly different). However, we need to prove that, for any representation as in (5), liri for i=1,,c. In the even case, we may assume this.

See Figure 8 for an example of and the full version [3, B.4] for a proof of this assumption.

Proof.

It follows from Corollary 25 that, for a filtration 𝒢 to have a maximum number of bars, we must have Gen=𝒯n,2. For a filtration 𝒢 to also have maximal total persistence, it immediately follows by Theorem 16 that GiHi for i=1,,en.

Any filtration 𝒢 that we consider in this proof will therefore be such that GiHi for i=1,,en. We are interested in the remainder of the filtration, which we can represent by a sequence of tuples as in (5). Within the proof, we will further reduce it by writing

(l1,r1),(l2,r2),,(lc,rc) instead of (Kl1,Kr1),(Kl2,Kr2),,(Klc,Krc). (6)

Furthermore, given a filtration 𝒢 represented by (l1,r1),(l2,r2),,(lc,rc), we define its alternation depth d𝒢 as follows:

  • We let d𝒢1 if (li,ri)=(li1+1,ri1+1) for all i=2,,c.

  • Otherwise, let d𝒢2 be such that (li,ri)=(li1+1,ri1+1) for all i=d𝒢+1,,c, but (ld𝒢,rd𝒢)(ld𝒢1+1,rd𝒢1+1).

The idea of the proof is to start with a filtration 𝒢={Gi}i=1(n12)+1, with a representation as in (6), and show that we can change the filtration in steps, to obtain in every step a filtration 𝒢 such that Tβ1FL(𝒢)Tβ1FL(𝒢), and end up with an optimal solution.

Now, writing n=2p, let 𝒢 be a filtration, represented by

(1,1)=(l1,r1),(l2,r2),,(lc,rc)=(n/21,n/21)=(p1,p1).

We shall show that

  1. 1.

    If the alternation depth d𝒢>jn, then we can change 𝒢 to a filtration 𝒢 such that d𝒢<d𝒢 and Tβ1FL(𝒢)Tβ1FL(𝒢).

  2. 2.

    If l2<jn, then we can change 𝒢 to a filtration 𝒢 represented by (l1,r1),,(lc,rc) with l2=l2+1, such that c=c1 or c=c (depending on 𝒢) and Tβ1FL(𝒢)Tβ1FL(𝒢).

For the first part, let 𝒢 be a filtration with alternation depth d𝒢>jn. Its representation, substituting n by 2p, is of the form

(1,1),,(t,u),(v,w),(d𝒢,d𝒢),(d𝒢+1,d𝒢+1),,(p1,p1) with (v,w)(d𝒢1,d𝒢1).

Since we assumed that vw, we must have w<d𝒢1. We transform 𝒢 into a filtration 𝒢 which is the same as 𝒢, except that we add an extra alternation step. We consider two cases.

  • Case 1: v<d𝒢1. In this case, 𝒢 is represented by

    (1,1),(t,u),(v,w),(d𝒢1,d𝒢1),(d𝒢,d𝒢),(d𝒢+1,d𝒢+1),,(p1,p1).
  • Case 2: v=d𝒢1. In this case, 𝒢 is represented by

    (1,1),(t,u),(v=d𝒢1,d𝒢1),(d𝒢,d𝒢),(d𝒢+1,d𝒢+1),,(p1,p1).

In both cases, we see that d𝒢<d𝒢. Moreover, as we show in more detail in [3, B.4],

Tβ1FL(𝒢)Tβ1FL(𝒢)=16(d𝒢1w)(d𝒢w)(4d𝒢+2w3p2)0. (7)

To see that this difference is non-negative, first recall that w<d𝒢1, so the first two terms are strictly positive. Furthermore, since d𝒢>3p/41/4, we have d𝒢3p/43p/4, so

4d𝒢+2w3p22w20,

since w1. Note that Tβ1FL(𝒢)Tβ1FL(𝒢)=0 if and only if w=1 and d𝒢=3p/4.

We now show the second part. We let 𝒢 be a filtration with representation

(1,1),(j,k),(l,m),,(p1,p1) with j3p/41.

We transform 𝒢 into a filtration 𝒢 such that its start is different from 𝒢’s, but from the tuple (l,m) onwards, 𝒢 and 𝒢 are the same. We again consider two cases.

  • Case 1: l>j+1. In this case, 𝒢 is represented by (1,1),(j+1,k),(l,m),,(p1,p1).

  • Case 2: l=j+1. In this case, 𝒢 is represented by (1,1),(j+1=l,m),,(p1,p1).

In both cases, as we show in more detail in the full version [3, B.4], we have

Tβ1FL(𝒢)Tβ1FL(𝒢)=(k1)(j(pj1)k6(3p2k2))(i)2((jk+1)(jk))(ii)0, (8)

where we use in (i) that j3p/41, and in (ii) that jk. If n2,4,6mod8, then j3p/41 implies that j3p/41=3p/45/4=jn1. If n0mod8, then j3p/41=jn. Only in this case, and if j=k=3p/41, then the inequality is sharp.

Hence, if we start with an arbitrary filtration 𝒢, then we can transform it step by step into a more optimal filtration 𝒢1 with d𝒢1jn, by the first part. Then, by the second part, we can transform 𝒢1 step by step into a more optimal filtration 𝒢2 that starts with (1,1),(jn,jn), with alternation depth d𝒢2=jn. Hence, 𝒢2 is the optimal filtration as described in the statement for n2,4,6mod8.

In the case that n0mod8, we saw in the proof of case 1 and case 2, that the filtration 𝒢3 starting with (1,1),(jn+1,jn+1) and having alternation depth d𝒢3=jn+1 satisfies Tβ1FL(𝒢3)=Tβ1FL(𝒢2), hence both 𝒢2 and 𝒢3 are optimal.

8 Discussion

In this paper, we have answered several key questions in the context of topological data analysis. However, many questions remain open, and we propose the following conjectures.

Conjecture 33.

If 𝒢 is a filtration on n vertices, then the number of intervals in the barcode of 𝒢 in homology degree k satisfies |kFL(𝒢)|βkFL(𝒯n,k+1).

Conjecture 34.

The extremal filtrations described in Section 7 achieve the maximal total persistence over any edgewise filtration of flag complexes in homology degree 1.

Proving the latter conjecture likely requires new ideas, as no filtration 𝒢 can maximize β1FL(Gi) for all i. In fact, extremal values need not be realized by graphs containing Turán graphs as spanning subgraphs, as illustrated by the following example.

Example 35.

Let n=8 and m=16+5. Suppose G contains the Turán graph 𝒯8,2. By Proposition 5, we seek to add 5 edges to the Turán graph to maximize the product (d11)(d21). It is easy to see that this product can be at most 1. However, by partitioning the vertices as V=V1V2, with |V1|=3 and |V2|=5, and adding all edges between V1 and V2, we need to add 6 more edges. Adding these as a K4 in the larger partition gives d1=2 and d2=1; see Figure 9. The value 2 is extremal; see Figure 10.

Figure 9: Two graphs Gl (left) and Gr (right) such that β1FL(Gl)=1 and β1FL(Gr)=2.

Finding extremal values of -linear functions defined on simplicial complexes with precisely n vertices and m edges presents a challenging and interesting direction for future research.

Conjecture 36.

Let 𝐆(n,m) denote the collection of graphs on n vertices and m edges. If G𝐆(n,m) and β1FL(G)=maxH𝐆(n,m)β1FL(H), then G contains a complete bipartite spanning subgraph.

Figure 10: The vertical axis is β1FL, and the value k along the x-axis represents (n/2)2+k edges. Here n is the number of vertices in the underlying graph; n=8 (left) and n=100 (right). The solid curves give the optimal value of β1FL(G) for any graph G containing a complete bipartite spanning subgraph, and the dashed line is an upper bound derived from the proof of Theorem 10 in conjuction with Lemma 17 (details omitted).

References

  • [1] Michał Adamaszek. Extremal problems related to Betti numbers of flag complexes. Discrete Applied Mathematics, 173:8–15, 2014. doi:10.1016/J.DAM.2014.04.006.
  • [2] Ulrich Bauer. Ripser: efficient computation of Vietoris-Rips persistence barcodes. Journal of Applied and Computational Topology, 5(3):391–423, 2021. doi:10.1007/S41468-021-00071-5.
  • [3] Lies Beers and Magnus Bakke Botnan. Extremal Betti Numbers and Persistence in Flag Complexes, 2025. arXiv:2502.21294.
  • [4] Anders Björner and Gil Kalai. An extended Euler-Poincaré theorem, 1988.
  • [5] Magnus Botnan and Michael Lesnick. An introduction to multiparameter persistence. In Representations of Algebras and Related Structures, pages 77–150. EMS Press, 2023.
  • [6] Gary Chartrand and Ping Zhang. A first course in graph theory. Courier Corporation, 2013.
  • [7] Frédéric Chazal and Steve Yann Oudot. Towards persistence-based reconstruction in Euclidean spaces. In Proceedings of the twenty-fourth annual symposium on Computational geometry, pages 232–241, 2008.
  • [8] Tamal Krishna Dey and Yusu Wang. Computational topology for data analysis. Cambridge University Press, 2022.
  • [9] Herbert Edelsbrunner and János Pach. Maximum Betti numbers of Čech complexes. In 40th International Symposium on Computational Geometry (SoCG 2024). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.SoCG.2024.53.
  • [10] Michael Goff. Extremal Betti numbers of Vietoris–Rips complexes. Discrete and Computational Geometry, 46(1):132, 2011. doi:10.1007/s00454-010-9274-z.
  • [11] Dmitry N Kozlov. Convex hulls of f-and β-vectors. Discrete & Computational Geometry, 18:421–431, 1997. doi:10.1007/PL00009326.