Modularity of Preferential Attachment Graphs

Rybarczyk, Katarzyna; Sulkowska, Małgorzata

doi:10.4230/LIPIcs.STACS.2026.76

Katarzyna Rybarczyk

Adam Mickiewicz University, Poznań, Poland

Małgorzata Sulkowska

Department of Fundamentals of Computer Science, Wrocław University of Science and Technology, Poland

Abstract

We study a preferential attachment model $G_{n}^{h}$ . The graph $G_{n}^{h}$ is generated from a finite initial graph by adding new vertices one at a time. Each new vertex connects to $h\geq 1$ already existing vertices, and these are chosen with probability proportional to their current degrees. We are particularly interested in the community structure of $G_{n}^{h}$ , which is expressed in terms of the so–called modularity. We prove that the modularity of $G_{n}^{h}$ is, with high probability, upper bounded by a function that tends to $0$ as $h$ tends to infinity. This resolves a conjecture of Prokhorenkova, Prałat, and Raigorodskii from 2016.

As a byproduct, we obtain novel concentration results (which are interesting in their own right) for the volume and edge density parameters of vertex subsets of $G_{n}^{h}$ . The key ingredient here is the definition of a function $\mu$ , which serves as a natural measure for vertex subsets, and is proportional to the average size of their volumes. This extends previous results on the topic by Frieze, Pérez-Giménez, Prałat, and Reiniger from 2019.

Keywords and phrases:

Modularity, preferential attachment model, edge expansion

2012 ACM Subject Classification:

Mathematics of computing $\rightarrow$ Random graphs; Mathematics of computing $\rightarrow$ Stochastic processes

Related Version:

Full Version: https://arxiv.org/pdf/2501.06771 [33]

Funding:

This research was funded in whole or in part by National Science Centre, Poland, grant OPUS-25 no 2023/49/B/ST6/02517. For the purpose of Open Access, the authors have applied a CC-BY public copyright licence to any Author Accepted Manuscript (AAM) version arising from this submission.

DOI:

10.4230/LIPIcs.STACS.2026.76

Event:

43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Editors:

Meena Mahajan, Florin Manea, Annabelle McIver, and Nguyễn Kim Thắng

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Real-world networks, ranging from social and information networks to biological and technological infrastructures, often exhibit a rich community structure. Detecting and analyzing such communities has far-reaching applications: identifying groups of common interest in social media, classifying spam and misinformation, retrieving related content, uncovering proteins with similar biological functions, optimizing large-scale infrastructures, improving network visualization, etc. [19, 28].

To model these networks mathematically, preferential attachment graphs have become one of the main paradigms. Their early forms appeared as random recursive trees [22, 35]. However, the in-depth study of preferential attachment models was initiated in 1999 by the work of Barabási and Albert [3], who pointed out the applications of such graphs in network modeling. The preferential attachment model was subsequently formally defined and analyzed by Bollobás, Riordan, Spencer, and Tusnády [8], and by Bollobás and Riordan [6, 7]. It relies on two mechanisms: growth (the graph grows over time, gaining a new vertex and $h\geq 1$ edges at each time step) and preferential attachment (an arriving vertex is more likely to attach to vertices of high degree than to vertices of low degree); for a precise definition see Section 2. Its degree distribution and diameter often agree with those observed in real networks [15, 28]. Nevertheless, an experimental study shows that, unlike real networks, it lacks an apparent community structure.

Quantifying community structure itself is a subtle task. Among the many measures proposed, modularity, introduced by Newman and Girvan in 2004 [28, 29], has emerged as a central metric. The vertices of a graph with high modularity may be partitioned into subsets in which there are many more internal edges than we would expect by chance (see Definition 1). Nowadays, modularity is widely used not only as a quality function judging the performance of community detection algorithms [19], but also as a central ingredient of such algorithms, as in the Louvain algorithm [5], the Leiden algorithm [36] or the Tel-Aviv algorithm [13]. Early theoretical results on modularity were given for trees [2] and regular graphs [24]. For a summary of results for various families of graphs see the appendix of [26] by McDiarmid and Skerman from $2020$ . More recent discoveries include [9] by Chellig, Fountoulakis, and Skerman (for random graphs on the hyperbolic plane), [20] by Lasoń and Sulkowska (for minor-free graphs), [21] by Lichev and Mitsche (for $3$ -regular graphs and graphs with a given degree sequence), and [32] by Rybarczyk (for random intersection graphs).

Despite being so widely used in practice, modularity has received only limited theoretical study in the families of random graphs designed to model real-life networks. The first results for the best-known and most studied random graph, the binomial $G(n,p)$ , were given by McDiarmid and Skerman only in 2020 [26]. It is commonly known that $G(n,p)$ is a poor fit to real networks [19]. Preferential attachment models perform much better in this respect. Prokhorenkova, Prałat, and Raigorodskii initiated the study of the modularity of a standard preferential attachment graph in [30]. They obtained non-trivial upper and lower bounds; however, the gap between them remained large. They conjectured that the modularity of such a graph with high probability tends to $0$ as $h$ (the number of edges added per step) tends to infinity (see Conjecture 3). In this paper we prove their conjecture, confirming the supposition that a standard preferential attachment model might have too small a modularity to mirror well the behavior of real networks.

As a byproduct, we derive new concentration results for the volume and edge density parameters of a given subset of vertices in the preferential attachment graph. To this end, we introduce a new function $\mu$ , which serves as a natural measure for vertex subsets, and is proportional to the average size of their volumes. These findings are noteworthy in their own right and may find applications to other problems related to the model. They extend previous results from [12] by Frieze, Pérez-Giménez, Prałat, and Reiniger (see Lemmas 3 and 4 therein), which were used in the context of Hamilton cycles in the preferential attachment model.

In the following section we give the formal definition of the preferential attachment model and state the main result. Section 3 presents the results regarding the volume and the edge density parameters of subsets of vertices in $G_{n}^{h}$ . Section 4 is technical; it contains several facts and auxiliary lemmas used in the later parts of the paper. In Section 5 we derive the concentration results stated in Section 3. These results are used in Section 6 to prove the main theorem about vanishing modularity in the standard preferential attachment graph. Section 7 contains concluding remarks.

2 Model and main result

Let $\mathbb{N}$ denote the set of natural numbers, $\mathbb{N}=\{1,2,3,\ldots\}$ . For $n\in\mathbb{N}$ let $[n]=\{1,2,\ldots,n\}$ . For functions $f(n)$ and $g(n)$ we write $f(n)\sim g(n)$ if $\lim_{n\rightarrow\infty}f(n)/g(n)=1$ . We say that an event $\mathcal{E}$ occurs with high probability (whp) if the probability $\mathbb{P}[\mathcal{E}]$ depends on a certain number $n$ and tends to $1$ as $n$ tends to infinity.

All of the graphs considered in this paper are finite and undirected, and loops and multiple edges are allowed. Thus a graph is a pair $G=(V,E)$ , where $V$ is a finite set of vertices and $E$ is a finite multiset of elements from $V^{(1)}\cup V^{(2)}$ with $V^{(k)}$ being the set of all $k$ -element subsets of $V$ . Let $e(G)=|E|$ and for $S,U\subseteq V$ set $e_{G}(S)=|\{e\in E:e\in S^{(1)}\cup S^{(2)}\}|$ and $e_{G}(S,U)=|\{e\in E:e\cap S\neq\emptyset\wedge e\cap U\neq\emptyset\}|$ . The degree of a vertex $v\in V$ in $G$ , denoted by $\deg_{G}(v)$ , is the number of edges to which $v$ belongs, with loops counted twice, i.e., $\deg_{G}(v)=2|\{e\in E:v\in e\wedge e\in V^{(1)}\}|+|\{e\in E:v\in e\wedge e\in V^{(2)}\}|$ . We define the volume of $S\subseteq V$ in $G$ by $\operatorname{vol}_{G}(S)=\sum_{v\in S}\deg_{G}(v)$ . By the volume of a graph, $\operatorname{vol}(G)$ , we understand $\operatorname{vol}_{G}(V)$ . Whenever the context is clear we write $e(S)$ instead of $e_{G}(S)$ , $e(S,U)$ instead of $e_{G}(S,U)$ , $\deg(v)$ instead of $\deg_{G}(v)$ and $\operatorname{vol}(S)$ instead of $\operatorname{vol}_{G}(S)$ .

We focus on a particular random graph model, called here simply the preferential attachment graph (consult [3, 6, 15]). Given $h,n\in\mathbb{N}$ , we construct a preferential attachment graph $G_{n}^{h}$ in two phases. In the first phase we sample a particular random tree $T_{hn}$ , whose vertices are called mini-vertices. (We call $T_{hn}$ a tree, however it might be disconnected, and loops, i.e. single-vertex edges, are allowed in $T_{hn}$ .) Next, the appropriate mini-vertices of $T_{hn}$ are grouped to form vertices of $G_{n}^{h}$ . Let us describe this procedure in detail.

Phase 1.

We start the whole process with $T_{1}$ which is a graph consisting of a single mini-vertex $1$ with a single loop (thus the degree of vertex $1$ is $2$ ). For $t\geq 1$ , the graph $T_{t+1}$ is built upon $T_{t}$ by adding a mini-vertex $(t+1)$ and joining it by an edge with a mini-vertex $i$ according to the following probability distribution:

\mathbb{P}(i=s)=\begin{cases}\frac{\deg_{T_{t}}(s)}{2t+1}&\textnormal{for}\quad 1\leq s\leq t\\ \frac{1}{2t+1}&\textnormal{for}\quad s=t+1.\end{cases}

Note that we allow a newly arrived vertex to connect to itself. We continue the process until we get the random tree $T_{hn}$ .

Phase 2.

A random multigraph $G_{n}^{h}$ is obtained from $T_{hn}$ by merging each set of mini-vertices $\{h(i-1)+1,h(i-1)+2,\ldots,h(i-1)+h\}$ into a single vertex $i$ for $i\in\{1,2,\ldots,n\}$ , keeping loops and multiple edges.

Note that if $G_{n}^{h}=(V,E)$ then $V=[n]$ , $|V|=n$ and $|E|=hn$ . Since we will refer very often to the number of edges of $G_{n}^{h}$ , it will be also denoted by $M$ , i.e., $M:=hn$ . Given $G_{n}^{h}$ , by $G_{t}^{h}$ , where $t\in[n]$ , we understand the subgraph of $G_{n}^{h}$ induced by the set of vertices $[t]$ .
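The two phases above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function names `sample_tree`, `merge` and `sample_pa_graph` are hypothetical, and the degree-proportional choice is implemented with the standard trick of storing each mini-vertex once per unit of degree, which is an implementation assumption.

```python
import random

def sample_tree(num_mini, rng):
    """Phase 1 (sketch): edge list of the random tree on mini-vertices 1..num_mini."""
    # `endpoints` lists every mini-vertex once per unit of degree, so a uniform
    # draw from it is a draw proportional to the current degrees.
    edges = [(1, 1)]      # T_1: mini-vertex 1 with a single loop
    endpoints = [1, 1]    # the loop contributes 2 to the degree of mini-vertex 1
    for new in range(2, num_mini + 1):
        endpoints.append(new)            # extra slot that allows a self-loop
        target = rng.choice(endpoints)   # 2t+1 equally likely slots, t = new - 1
        endpoints.append(target)
        edges.append((new, target))
    return edges

def merge(edges, h):
    """Phase 2 (sketch): mini-vertex m becomes vertex ceil(m / h); loops and multi-edges are kept."""
    return [((u - 1) // h + 1, (v - 1) // h + 1) for u, v in edges]

def sample_pa_graph(n, h, seed=0):
    """Edge list of one sample of the graph on vertex set {1, ..., n} with h*n edges."""
    rng = random.Random(seed)
    return merge(sample_tree(h * n, rng), h)
```

For instance, `sample_pa_graph(50, 3)` returns a list of $150$ edges on the vertex set $[50]$ , in agreement with $|E|=hn$ .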

Our main goal is to upper bound the graph parameter called modularity for $G_{n}^{h}$ . Its formal definition is given just below.

Definition 1 (Modularity, [29]).

Let $G$ be a graph with at least one edge. For a partition $\mathcal{A}$ of $V$ define a modularity score of $G$ as

\operatorname{mod}_{\mathcal{A}}(G)=\sum_{S\in\mathcal{A}}\left(\frac{e(S)}{e(G)}-\left(\frac{\operatorname{vol}(S)}{\operatorname{vol}(G)}\right)^{2}\right).

Modularity of $G$ is given by

\operatorname{mod}(G)=\max_{\mathcal{A}}\operatorname{mod}_{\mathcal{A}}(G),

where maximum runs over all the partitions of the set $V$ .

Conventionally, a graph with no edges has the modularity equal to $0$ . A single summand of the modularity score is the difference between the fraction of edges within $S$ and the expected fraction of edges within $S$ in a certain random multigraph on $V$ with the expected degree sequence given by $G$ (see, e.g., [18]). It is easy to check that $\operatorname{mod}(G)\in[0,1)$ .
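To make Definition 1 concrete, the modularity score of a fixed partition can be computed directly from an edge list. The sketch below is illustrative only; the function name `modularity_score` and the edge-list representation (a loop is a pair `(v, v)`) are assumptions made for this example. It evaluates a single partition and does not attempt the maximization over all partitions.

```python
from collections import defaultdict

def modularity_score(edges, partition):
    """Modularity score of `partition` (a list of disjoint vertex sets covering all vertices).

    `edges` is a list of pairs (u, v); u == v denotes a loop, repeated pairs are multi-edges.
    """
    if not edges:
        return 0.0  # convention: a graph with no edges has modularity 0
    part_of = {v: k for k, part in enumerate(partition) for v in part}
    inner = defaultdict(int)  # e(S) for each part S
    vol = defaultdict(int)    # vol(S) for each part S
    for u, v in edges:
        vol[part_of[u]] += 1
        vol[part_of[v]] += 1  # a loop adds 2 to the volume of its part
        if part_of[u] == part_of[v]:
            inner[part_of[u]] += 1
    m = len(edges)            # e(G); vol(G) equals 2 * e(G)
    return sum(inner[k] / m - (vol[k] / (2 * m)) ** 2 for k in range(len(partition)))
```

For two disjoint triangles, splitting the vertices into the two triangles gives the score $2\cdot(1/2-1/4)=1/2$ , while the trivial one-part partition gives $0$ .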

Non-trivial lower and upper bounds for the modularity of $G_{n}^{h}$ obtained by Prokhorenkova, Prałat and Raigorodskii in [30] are the following.

Theorem 2 ([30], Theorem 4.2, Section 4.2).

Let $G_{n}^{h}=(V,E)$ be a preferential attachment graph. Then whp as $n\to\infty$

\operatorname{mod}(G_{n}^{h})=\Omega_{h}(1/\sqrt{h})

and whp as $n\to\infty$

\operatorname{mod}(G_{n}^{h})\leq 1-\min\{\delta(G_{n}^{h})/(2h),1/16\},

where $\delta(G_{n}^{h})=\min\limits_{\begin{subarray}{c}S\subseteq V,1\leq|S|\leq|V|/2\end{subarray}}\frac{e(S,V\setminus S)}{|S|}$ is the edge expansion of $G_{n}^{h}$ .

Applying the results for the edge expansion of $G_{n}^{h}$ by Mihail, Papadimitriou and Saberi from [27] to the upper bound one obtains that whp $\operatorname{mod}(G_{n}^{h})\leq 1-O(1/h)$ . Thus the gap between the upper and the lower bound remained large. The authors stated the following two conjectures suggesting that the upper bound could be improved.

Conjecture 3 ([30]).

Let $G_{n}^{h}$ be a preferential attachment graph. Then whp as $n\to\infty$

\operatorname{mod}(G_{n}^{h})\xrightarrow{h\rightarrow\infty}0.

Conjecture 4 ([30]).

Let $G_{n}^{h}$ be a preferential attachment graph. Then whp as $n\to\infty$

\operatorname{mod}(G_{n}^{h})=\Theta_{h}(1/\sqrt{h}).

In this paper we present a much better upper bound for the modularity of $G_{n}^{h}$ than the one from Theorem 2 when $h$ is large, resolving, in the positive, Conjecture 3. Conjecture 4 still remains open. The main result of the paper may be presented as follows.

Theorem 5.

Let $G_{n}^{h}$ be a preferential attachment graph. Then for every $\varepsilon>0$ , whp as $n\to\infty$

\operatorname{mod}(G_{n}^{h})\leq\frac{(1+\varepsilon)f(h)}{\sqrt{h}},

where

f(h)=6g_{\mathcal{V}}(h)+4\sqrt{2\ln{2}}-g_{\mathcal{V}}(h)^{2}/\sqrt{h}

with

g_{\mathcal{V}}(h)=\frac{1}{6}\sqrt{2\ln{2}\,(9\ln{h}+8\ln{2})}+(2/3)\ln{2}+2.

$\blacktriangleright$ Remark.

Note that $f(h)\sim 3\sqrt{2\ln{2}}\sqrt{\ln{h}}$ as $h\rightarrow\infty$ , and thus $f(h)/\sqrt{h}\rightarrow 0$ as $h\rightarrow\infty$ . The value of $f(h)/\sqrt{h}$ drops below $1$ for $h\geq 810$ .

Corollary 6.

Let $G_{n}^{h}$ be a preferential attachment graph. Then whp as $n\to\infty$

\operatorname{mod}(G_{n}^{h})\leq\frac{3.54\sqrt{\ln{h}+0.62}+19.49}{\sqrt{h}}.

$\blacktriangleright$ Remark.

The value $\frac{3.54\sqrt{\ln{h}+0.62}+19.49}{\sqrt{h}}$ drops below $1$ for $h\geq 847$ .
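The numerical thresholds quoted in the two remarks above can be checked by evaluating the bounds directly. The following is a small sketch under the assumption that $\varepsilon$ is taken to $0$ in Theorem 5; the function names are hypothetical and chosen for this illustration only.

```python
import math

LN2 = math.log(2)

def g_vol(h):
    """g_V(h) as defined in Theorem 5."""
    return math.sqrt(2 * LN2 * (9 * math.log(h) + 8 * LN2)) / 6 + (2 / 3) * LN2 + 2

def f(h):
    """f(h) = 6 g_V(h) + 4 sqrt(2 ln 2) - g_V(h)^2 / sqrt(h)."""
    g = g_vol(h)
    return 6 * g + 4 * math.sqrt(2 * LN2) - g * g / math.sqrt(h)

def theorem_bound(h):
    """f(h) / sqrt(h), i.e. the bound of Theorem 5 with epsilon taken to 0."""
    return f(h) / math.sqrt(h)

def corollary_bound(h):
    """The explicit bound of Corollary 6."""
    return (3.54 * math.sqrt(math.log(h) + 0.62) + 19.49) / math.sqrt(h)

def first_below_one(bound, h_max=10_000):
    """Smallest h in [1, h_max] for which bound(h) < 1."""
    return next(h for h in range(1, h_max + 1) if bound(h) < 1)
```

Calling `first_below_one(theorem_bound)` and `first_below_one(corollary_bound)` recovers the two thresholds stated in the remarks, and `corollary_bound(h)` stays slightly above `theorem_bound(h)`, as expected from the rounding of constants.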

$\blacktriangleright$ Remark.

Some new results on the fact that $\operatorname{mod}(G_{n}^{h})$ is whp separated from 1 by a constant even for small values of $h$ can be found in [23].

3 Volume and edge density

When talking about $G_{n}^{h}=(V,E)$ we will very often refer to its corresponding random tree $T_{hn}=(\tilde{V},\tilde{E})$ . Recall that $V=[n]$ , $\tilde{V}=[hn]$ and $|\tilde{E}|=|E|=hn=:M$ . For $S\subseteq V$ the corresponding set of its mini-vertices in $\tilde{V}$ will be denoted by $\tilde{S}$ , thus $|\tilde{S}|=h|S|$ . For $i\in[n]$ and $S\subseteq V$ let $S_{i}=S\cap[i]$ , in particular $S_{n}=S$ . Analogously, for $i\in[M]$ and $\tilde{S}\subseteq\tilde{V}$ set $\tilde{S}_{i}=\tilde{S}\cap[i]$ , in particular $\tilde{S}_{M}=\tilde{S}$ . Note that for $S\subseteq V$ we have $\operatorname{vol}_{G_{n}^{h}}(S)=\operatorname{vol}_{T_{hn}}(\tilde{S})$ and $e_{G_{n}^{h}}(S)=e_{T_{hn}}(\tilde{S})$ .

When working with modularity we need control over $e_{G_{n}^{h}}(S)$ and $\operatorname{vol}_{G_{n}^{h}}(S)$ , where $S\subseteq V$ . Those values depend strongly on the arrival times of the vertices from $S$ . To capture this phenomenon we define a special measure $\mu:2^{\tilde{V}}\rightarrow[0,\infty)$ , where $2^{\tilde{V}}$ stands for the set of all subsets of $\tilde{V}$ .

Definition 7 (Measure $\mu$ ).

Let $G_{n}^{h}=(V,E)$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Let $S\subseteq V$ and let $\tilde{S}\subseteq\tilde{V}$ be the set of its corresponding mini-vertices. Associate $\tilde{S}$ with the set of indicator functions

{\delta}_{i}^{\tilde{S}}=\begin{cases}1&\quad\textnormal{if}\quad i\in\tilde{S}\\ 0&\quad\textnormal{if}\quad i\notin\tilde{S},\end{cases}

where $i\in[M]$ (whenever the context is clear we write ${\delta}_{i}$ instead of ${\delta}_{i}^{\tilde{S}}$ ). Define a function $\mu:2^{\tilde{V}}\rightarrow[0,\infty)$ as follows:

\mu(\tilde{S})=\frac{\sqrt{\pi}}{2}\cdot\sum_{j=1}^{M}{\delta}_{j}^{\tilde{S}}c_{j-1}

with $c_{j}=\prod_{i=1}^{j}\frac{2i-1}{2i}$ for $j\geq 1$ and $c_{0}=1$ .
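As a quick numerical illustration of Definition 7, the coefficients $c_{j}$ and the measure $\mu$ can be evaluated as follows. The helper names `c_values` and `mu` are hypothetical, and the sketch uses floating-point arithmetic, so it is meant for experimentation rather than exact computation.

```python
import math

def c_values(m):
    """Return [c_0, c_1, ..., c_m] with c_0 = 1 and c_j = c_{j-1} * (2j - 1) / (2j)."""
    c = [1.0]
    for j in range(1, m + 1):
        c.append(c[-1] * (2 * j - 1) / (2 * j))
    return c

def mu(mini_vertices, c):
    """mu(S~) = (sqrt(pi) / 2) * sum of c_{j-1} over the mini-vertices j in S~."""
    return math.sqrt(math.pi) / 2 * sum(c[j - 1] for j in mini_vertices)
```

With `c = c_values(10_000)`, the call `mu(range(1, 10_001), c)` is very close to $\sqrt{M}=100$ , and a block of $100$ early mini-vertices receives a much larger measure than a block of $100$ late ones, which reflects the dependence on arrival times.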

$\blacktriangleright$ Remark.

Let $G_{n}^{h}=(V,E)$ be a preferential attachment graph, $S\subseteq V$ and $t\in[M]$ . Note that

\mu(\tilde{S}_{t})=\frac{\sqrt{\pi}}{2}\cdot\sum_{j=1}^{t}{\delta}_{j}^{\tilde{S}}c_{j-1}.

We use the measure $\mu$ to express the following novel concentration results for $\operatorname{vol}_{G_{n}^{h}}(S)$ , $e_{G_{n}^{h}}(S)$ , and $e_{G_{n}^{h}}(S,V\setminus S)$ , where $S$ is an arbitrary subset of $V$ .

Theorem 8.

Let $G_{n}^{h}=(V,E)$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Then for every $\varepsilon>0$ whp as $n\to\infty$

\forall S\subseteq V\,\,\left|\operatorname{vol}(S)-2\sqrt{M}\,\mu(\tilde{S})\right|\leq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{M}{\sqrt{h}},

where $g_{\mathcal{V}}(h)=\frac{1}{6}\sqrt{2\ln{2}\,(9\ln{h}+8\ln{2})}+(2/3)\ln{2}+2$ .

Theorem 9.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Then for every $\varepsilon>0$ whp as $n\to\infty$

\forall S\subseteq V\,\,\left|e(S)-\mu(\tilde{S})^{2}\right|\leq(1+\varepsilon)g_{\mathcal{E}}(h)\frac{M}{\sqrt{h}},

where

g_{\mathcal{E}}(h)=\frac{g_{\mathcal{V}}(h)}{2}+\sqrt{2\ln{2}}

with

g_{\mathcal{V}}(h)=\frac{1}{6}\sqrt{2\ln{2}\,(9\ln{h}+8\ln{2})}+(2/3)\ln{2}+2.

Theorem 10.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Then for every $\varepsilon>0$ whp as $n\to\infty$

\forall S\subseteq V\,\left|e(S,V\setminus S)-2\mu(\tilde{S})(\sqrt{M}-\mu(\tilde{S}))\right|\leq(1+\varepsilon)\left(\frac{3}{2}g_{\mathcal{V}}(h)+\sqrt{2\ln{2}}\right)\frac{M}{\sqrt{h}},

where

g_{\mathcal{V}}(h)=\frac{1}{6}\sqrt{2\ln{2}\,(9\ln{h}+8\ln{2})}+(2/3)\ln{2}+2.

To grasp the intuition hidden behind the above concentration results, it is helpful to know that there is a relation between the structure of the graph $T_{hn}$ and the structure of a random graph $\hat{G}$ on the vertex set $[M]$ in which every edge $\{i,j\}$ (for $i,j\in[M]$ ) is present with probability $1/(2\sqrt{ij})$ , independently of the other possible edges (in particular, a loop at vertex $i$ is present with probability $1/(2i)$ , consult Section 4 in [7] by Bollobás and Riordan). We will see that, for any set $\tilde{S}$ , the values $\mu(\tilde{S})$ and $\mu(\tilde{S})^{2}$ are closely related to the expected value of $\operatorname{vol}_{\hat{G}}(\tilde{S})$ and $e_{\hat{G}}(\tilde{S})$ , respectively.

Let $\tilde{S}\subseteq[M]$ . The number of inner edges of $\tilde{S}$ in $\hat{G}$ , $e_{\hat{G}}(\tilde{S})$ , satisfies

\begin{split}\mathbb{E}[e_{\hat{G}}(\tilde{S})]&=\frac{1}{2}\sum_{i\in\tilde{S}}\sum_{\begin{subarray}{c}j\in\tilde{S}\\ j\neq i\end{subarray}}\frac{1}{2\sqrt{ij}}+\sum_{i\in\tilde{S}}\frac{1}{2i}=\sum_{i\in\tilde{S}}\frac{1}{2\sqrt{i}}\sum_{\begin{subarray}{c}j\in\tilde{S}\\ j\neq i\end{subarray}}\frac{1}{2\sqrt{j}}+\sum_{i\in\tilde{S}}\frac{1}{2i}\\ &=\left(\sum_{i\in\tilde{S}}\frac{1}{2\sqrt{i}}\right)^{2}+\sum_{i\in\tilde{S}}\frac{1}{4i}=\left(\sum_{i\in\tilde{S}}\frac{1}{2\sqrt{i}}\right)^{2}+O(\ln{M}).\end{split}

Analogously one shows that the number of edges between $\tilde{S}$ and $[M]\setminus\tilde{S}$ , $e_{\hat{G}}(\tilde{S},[M]\setminus\tilde{S})$ , fulfills

\mathbb{E}[e_{\hat{G}}(\tilde{S},[M]\setminus\tilde{S})]=2\left(\sum_{i\in\tilde{S}}\frac{1}{2\sqrt{i}}\right)\left(\sum_{i\in[M]\setminus\tilde{S}}\frac{1}{2\sqrt{i}}\right).

Since $\operatorname{vol}_{\hat{G}}(\tilde{S})=2e_{\hat{G}}(\tilde{S})+e_{\hat{G}}(\tilde{S},\tilde{V}\setminus{\tilde{S}})$ one also gets

\mathbb{E}[\operatorname{vol}_{\hat{G}}(\tilde{S})]=2\left(\sum_{i\in[M]}\frac{1}{2\sqrt{i}}\right)\left(\sum_{i\in\tilde{S}}\frac{1}{2\sqrt{i}}\right)+O(\ln{M}).

We will see later (consult Lemma 14) that the value of $\sqrt{\pi}c_{j}$ is asymptotically close to $1/\sqrt{j}$ ; thus the measure $\mu$ is constructed in such a way that $\mu(\tilde{S})$ mimics the behavior of $\sum_{i\in\tilde{S}}\frac{1}{2\sqrt{i}}$ in $\hat{G}$ . In particular, $\mu(\{i\})\sim\frac{1}{2\sqrt{i}}$ as $i\rightarrow\infty$ and $\mu([M])\sim\sqrt{M}$ . Therefore we may expect that in $T_{hn}$ we will get $e(\tilde{S})\approx\mu(\tilde{S})^{2}$ , $e(\tilde{S},\tilde{V}\setminus\tilde{S})\approx 2\mu(\tilde{S})(\sqrt{M}-\mu(\tilde{S}))$ and $\operatorname{vol}(\tilde{S})\approx 2\sqrt{M}\mu(\tilde{S})$ .
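The expectations in the independent-edge graph $\hat{G}$ discussed above can be evaluated exactly for small instances and compared with the closed forms. The sketch below is only a sanity check under that heuristic model; the function names are hypothetical.

```python
import math
from itertools import combinations

def half_inverse_sqrt_sum(s):
    """Sum of 1 / (2 sqrt(i)) over i in s."""
    return sum(1 / (2 * math.sqrt(i)) for i in s)

def expected_inner_edges(s):
    """Exact E[e(S~)] in the independent-edge graph: unordered pairs plus loops."""
    pairs = sum(1 / (2 * math.sqrt(i * j)) for i, j in combinations(s, 2))
    loops = sum(1 / (2 * i) for i in s)
    return pairs + loops

def expected_cut_edges(s, m):
    """Exact expected number of edges between s and [m] minus s."""
    rest = [j for j in range(1, m + 1) if j not in s]
    return sum(1 / (2 * math.sqrt(i * j)) for i in s for j in rest)
```

For a set such as `set(range(3, 201, 4))` inside $[200]$ , the expected cut matches the product formula up to rounding, and the expected number of inner edges differs from the squared sum by a harmonic-type term of order $\ln{M}$ .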

4 Auxiliary lemmas

The current section gathers all technical lemmas needed in the later parts of the paper.

The concentration results presented in Section 3 and proved in Section 5 are based on two variants of the Azuma-Hoeffding martingale inequality. The first one is standard. We state it as it appears in [16] by Janson, Łuczak and Ruciński.

Lemma 11 (Azuma-Hoeffding inequality, [1, 14]).

If $X_{0},X_{1},\ldots,X_{n}$ is a martingale and there exist $b_{1},\ldots,b_{n}$ such that $|X_{j}-X_{j-1}|\leq b_{j}$ for each $j\in[n]$ , then, for every $x>0$ ,

\mathbb{P}[X_{n}\geq X_{0}+x]\leq\exp\left\{-\frac{x^{2}}{2\sum_{j=1}^{n}b_{j}^{2}}\right\}.

The second one is Freedman’s inequality. We state it below in the form very similar to the one presented in Lemma 2.2 of [37] by Warnke (one may consult also [4] by Bennett and Dudek).

Lemma 12 (Freedman’s inequality, [11]).

Let $X_{0},X_{1},\ldots,X_{n}$ be a martingale with respect to a filtration $\mathcal{F}_{0}\subseteq\mathcal{F}_{1}\subseteq\ldots\subseteq\mathcal{F}_{n}$ . Set $A_{k}=\max_{i\in[k]}(X_{i}-X_{i-1})$ and $W_{k}=\sum_{i=1}^{k}\mathrm{Var}[X_{i}-X_{i-1}|\mathcal{F}_{i-1}]$ . Then for every $\lambda>0$ and $W,A>0$ we have

\mathbb{P}\big[\exists k\in[n]\quad X_{k}\geq X_{0}+\lambda,W_{k}\leq W,A_{k}\leq A\big]\leq\exp\left\{-\frac{\lambda^{2}}{2W+2A\lambda/3}\right\}.

Next, we present bounds for the values of $c_{j}$ and an upper bound for the function $\mu(\tilde{S})$ , both introduced in Definition 7. These results will be referred to very often later on. They are derived using Stirling’s approximation.

Lemma 13 (Stirling’s approximation, [31]).

Let $n\in\mathbb{N}$ . Then

\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}\exp\left\{\frac{1}{12n+1}\right\}<n!<\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}\exp\left\{\frac{1}{12n}\right\}.

Lemma 14.

For $j\geq 1$ let $c_{j}=\prod_{i=1}^{j}\frac{2i-1}{2i}$ . Then

\exp\left\{-\frac{1}{8j}-\frac{1}{4\cdot 144j^{2}}\right\}\cdot\frac{1}{\sqrt{\pi j}}\leq c_{j}\leq\frac{1}{\sqrt{\pi j}}.

Proof.

By Stirling’s approximation (Lemma 13) we get

c_{j}=\prod_{i=1}^{j}\frac{2i-1}{2i}=\frac{(2j)!}{2^{2j}(j!)^{2}}\leq\frac{1}{\sqrt{\pi j}}\exp\left\{\frac{1}{24j}-\frac{2}{12j+1}\right\}\leq\frac{1}{\sqrt{\pi j}}.

Analogously, since $\frac{1}{12\cdot 2j+1}\geq\frac{1}{12\cdot 2j}-\frac{1}{144(2j)^{2}}$ ,

c_{j}\geq\frac{1}{\sqrt{\pi j}}\exp\left\{\frac{1}{24j}-\frac{1}{4\cdot 144j^{2}}-\frac{2}{12j}\right\}=\frac{1}{\sqrt{\pi j}}\exp\left\{-\frac{1}{8j}-\frac{1}{4\cdot 144j^{2}}\right\}.

$\hfill\blacktriangleleft$
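The two-sided bound of Lemma 14 is easy to confirm numerically for moderate $j$ . The following sketch (with the hypothetical helper `check_c_bounds`) uses floating-point arithmetic and is therefore only a sanity check over a finite range, not a substitute for the proof.

```python
import math

def check_c_bounds(j_max):
    """Return True if lower(j) <= c_j <= upper(j) for all j = 1, ..., j_max."""
    c = 1.0  # c_0
    for j in range(1, j_max + 1):
        c *= (2 * j - 1) / (2 * j)                       # c_j from c_{j-1}
        upper = 1 / math.sqrt(math.pi * j)
        lower = math.exp(-1 / (8 * j) - 1 / (4 * 144 * j * j)) * upper
        if not (lower <= c <= upper):
            return False
    return True
```

The range is kept moderate on purpose, since for very large $j$ the gap between $c_{j}$ and the lower bound shrinks towards the level of rounding errors.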

Lemma 15.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Let also $t\in[M]$ and $\tilde{S}\subseteq\tilde{V}$ . Then

\mu(\tilde{S}_{t})\leq\sqrt{t}+\frac{1}{2}.

Proof of Lemma 15.

By Lemma 14 we get

\mu(\tilde{S}_{t})\leq\frac{\sqrt{\pi}}{2}\sum_{j=1}^{t}c_{j-1}\leq\frac{\sqrt{\pi}}{2}+\frac{1}{2}\sum_{j=2}^{t}\frac{1}{\sqrt{j-1}}\leq\frac{\sqrt{\pi}}{2}+\frac{1}{2}+\int_{1}^{t}\frac{1}{2\sqrt{j}}\,dj\leq\sqrt{t}+\frac{1}{2}.

$\hfill\blacktriangleleft$

Lemmas 16, 17, and 18 are auxiliary calculations for expressions involving the volumes of subsets of $\tilde{V}$ (however, the reader might not notice the connection with volumes at this point). They will be used directly in Section 5 in the proof of Theorem 9, which states the result on the concentration of $e_{G_{n}^{h}}(S)$ for $S\subseteq V$ . For the proofs see the full version of this paper [33].

Lemma 16.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Fix $\tilde{S}\subseteq\tilde{V}$ and let $t_{0}\in[M]$ be such that $t_{0}=t_{0}(n)\xrightarrow{n\rightarrow\infty}\infty$ . Then

\sum_{i=t_{0}+1}^{M}\frac{{\delta}_{i}}{2i-1}=\frac{\pi}{2}\sum_{i=1}^{M}({\delta}_{i}c_{i-1})^{2}+O(\ln{M}).

Lemma 17.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Fix $\tilde{S}\subseteq\tilde{V}$ and let $t_{0}\in[M]$ be such that $t_{0}=t_{0}(n)\xrightarrow{n\rightarrow\infty}\infty$ . Then

\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{2\sqrt{i-1}\mu(\tilde{S}_{i-1})}{2i-1}=\frac{\pi}{2}\sum_{i=1}^{M}\left({\delta}_{i}c_{i-1}\sum_{j=1}^{i-1}{\delta}_{j}c_{j-1}\right)+O(\ln{M}+t_{0}).

Lemma 18.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Fix $\tilde{S}\subseteq\tilde{V}$ and let $t_{0}\in[M]$ . Then for any constant $C>0$

\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{C(i-1)}{2i-1}\leq\frac{C}{2}(M-t_{0}).

5 Edge density and volume results for $G_{n}^{h}$

In this section we use martingale techniques to prove Theorems 8, 9, and 10 stated in Section 3, i.e., we derive concentration results for $\operatorname{vol}(S)$ , $e(S)$ , and $e(S,V\setminus S)$ for an arbitrary subset $S\subseteq V$ in $G_{n}^{h}$ . A series of results will lead us to Corollary 22, which implies Theorem 8 as a special case. We start by analyzing the volumes.

Lemma 19.

Let $G_{n}^{h}$ be a preferential attachment graph. Consider the process of constructing its corresponding random tree $T_{hn}=(\tilde{V},\tilde{E})$ . Fix $\tilde{S}\subseteq\tilde{V}$ and for $t\in[M]$ let $Z_{t}=\operatorname{vol}_{T_{t}}(\tilde{S}_{t})$ . Set

\hat{Z}_{t}=c_{t}Z_{t}-\sum_{j=1}^{t}{\delta}_{j}c_{j-1}

(recall that ${\delta}_{j}$ and $c_{j}$ were introduced in Definition 7). Let $\mathcal{F}_{t}$ be a $\sigma$ -algebra associated with all the events that happened till time $t$ . Then $\hat{Z}_{1},\hat{Z}_{2},\ldots,\hat{Z}_{M}$ is a martingale with respect to the filtration $\mathcal{F}_{1}\subseteq\ldots\subseteq\mathcal{F}_{M}$ . Moreover, for $t\in[M-1]$

|\hat{Z}_{t+1}-\hat{Z}_{t}|\leq\frac{2}{\sqrt{\pi t}}\quad\quad\textnormal{and}\quad\quad\mathrm{Var}[(\hat{Z}_{t+1}-\hat{Z}_{t})|\mathcal{F}_{t}]\leq\frac{1}{4\pi(t+1)}.

$\blacktriangleright$ Remark.

The formula for $\hat{Z}_{t}$ was inspired by the martingale constructed in Lemma 4 of [12] by Frieze, Pérez-Giménez, Prałat, and Reiniger. Their martingale, in contrast to ours, consisted only of the first term, i.e., of $c_{t}Z_{t}=\left(\prod_{j=1}^{t}\frac{2j-1}{2j}\right)Z_{t}$ , and therefore addressed only those sets of mini-vertices that formed compact intervals. By introducing the second term, i.e., by subtracting $\sum_{j=1}^{t}{\delta}_{j}c_{j-1}$ , we are able to handle all types of sets, including those scattered throughout the entire interval $[1,M]$ .
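The martingale property claimed in Lemma 19 can be sanity-checked with exact rational arithmetic. The sketch below assumes the one-step dynamics implied by Phase 1 of the model: when mini-vertex $t+1$ arrives, the volume $Z_{t}$ grows by $\delta_{t+1}$ , and by one more with probability $(Z_{t}+\delta_{t+1})/(2t+1)$ . The function names `c` and `drift` are hypothetical.

```python
from fractions import Fraction

def c(j):
    """Exact value of c_j = prod_{i=1}^{j} (2i - 1) / (2i), with c_0 = 1."""
    out = Fraction(1)
    for i in range(1, j + 1):
        out *= Fraction(2 * i - 1, 2 * i)
    return out

def drift(t, z, delta_next):
    """Conditional expectation of the increment of Z-hat from time t to t + 1.

    z is the current volume Z_t and delta_next is the indicator of t + 1 belonging to S~.
    The compensator sums cancel except for the new term delta_next * c_t.
    """
    p = Fraction(z + delta_next, 2 * t + 1)   # probability of gaining one extra unit
    expected_z_next = z + delta_next + p      # E[Z_{t+1} | F_t]
    return c(t + 1) * expected_z_next - delta_next * c(t) - c(t) * z
```

Evaluating `drift(t, z, d)` over a grid of admissible values returns exactly zero each time, which is the algebraic identity behind the martingale property.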

Proof.

Let $t\in[M-1]$ . Recall that when mini-vertex $(t+1)$ arrives, it may also connect to itself. Therefore, conditioned on $\mathcal{F}_{t}$ ,

Z_{t+1}=\begin{cases}Z_{t}+\delta_{t+1}+1&\quad\textnormal{with probability}\quad\frac{Z_{t}+\delta_{t+1}}{2t+1}\\ Z_{t}+\delta_{t+1}&\quad\textnormal{otherwise}.\end{cases}

(1)

Additionally, since $c_{t}=c_{t+1}\cdot\frac{2t+2}{2t+1}$ , we get

\begin{split}\mathbb{E}[\hat{Z}_{t+1}|\mathcal{F}_{t}]&=\mathbb{E}\bigg[c_{t+1}Z_{t+1}-\sum_{j=1}^{t+1}\delta_{j}c_{j-1}\bigg|\mathcal{F}_{t}\bigg]=c_{t+1}\left(Z_{t}+\delta_{t+1}+\frac{Z_{t}+\delta_{t+1}}{2t+1}\right)-\sum_{j=1}^{t+1}\delta_{j}c_{j-1}\\ &=c_{t+1}\cdot\frac{2t+2}{2t+1}(Z_{t}+\delta_{t+1})-\sum_{j=1}^{t+1}\delta_{j}c_{j-1}=c_{t}Z_{t}-\sum_{j=1}^{t}\delta_{j}c_{j-1}=\hat{Z}_{t},\end{split}

thus $\hat{Z}_{1},\ldots,\hat{Z}_{M}$ is a martingale with respect to the filtration $\mathcal{F}_{1}\subseteq\ldots\subseteq\mathcal{F}_{M}$ . Next, since $c_{t+1}=c_{t}\cdot\frac{2t+1}{2t+2}$ ,

\begin{split}|\hat{Z}_{t+1}-\hat{Z}_{t}|&=|c_{t+1}Z_{t+1}-c_{t}Z_{t}-\delta_{t+1}c_{t}|=c_{t}\left|\frac{2t+1}{2t+2}Z_{t+1}-Z_{t}-\delta_{t+1}\right|\\ &=c_{t}\left|(Z_{t+1}-Z_{t})-\left(\frac{Z_{t+1}}{2t+2}+\delta_{t+1}\right)\right|.\end{split}

Note that $(Z_{t+1}-Z_{t})\in\{0,1,2\}$ . Moreover, the volume of $\tilde{S}_{t+1}$ in $T_{t+1}$ may be at most $2t+2$ . Thus $Z_{t+1}\leq 2t+2$ , which also implies that $\left(\frac{Z_{t+1}}{2t+2}+\delta_{t+1}\right)\in[0,2]$ . Now use Lemma 14 and the fact that $|a-b|\leq\max\{a,b\}$ for non-negative $a, b$ to get

|\hat{Z}_{t+1}-\hat{Z}_{t}|\leq 2c_{t}\leq\frac{2}{\sqrt{\pi t}}.

By the fact that $\hat{Z}_{1},\ldots,\hat{Z}_{M}$ is a martingale with respect to the filtration $\mathcal{F}_{1}\subseteq\ldots\subseteq\mathcal{F}_{M}$ , $c_{t}=\frac{2t+2}{2t+1}\cdot c_{t+1}$ and by (1) we also have

\begin{split}\mathrm{Var}[(\hat{Z}_{t+1}-\hat{Z}_{t})|\mathcal{F}_{t}]&=\mathbb{E}[(\hat{Z}_{t+1}-\hat{Z}_{t})^{2}|\mathcal{F}_{t}]=\mathbb{E}[(c_{t+1}Z_{t+1}-c_{t}Z_{t}-{\delta}_{t+1}c_{t})^{2}|\mathcal{F}_{t}]\\ &=c_{t+1}^{2}\,\mathbb{E}\left[\left(Z_{t+1}-\frac{2t+2}{2t+1}(Z_{t}+{\delta}_{t+1})\right)^{2}\Bigg|\mathcal{F}_{t}\right]\\ &=c_{t+1}^{2}\left(\left(Z_{t}+\delta_{t+1}+1-\frac{2t+2}{2t+1}(Z_{t}+{\delta}_{t+1})\right)^{2}\frac{Z_{t}+\delta_{t+1}}{2t+1}\right.\\ &\quad\quad\quad+\left.\left(Z_{t}+\delta_{t+1}-\frac{2t+2}{2t+1}(Z_{t}+{\delta}_{t+1})\right)^{2}\left(1-\frac{Z_{t}+\delta_{t+1}}{2t+1}\right)\right)\\ &=c_{t+1}^{2}\left(\frac{Z_{t}+\delta_{t+1}}{2t+1}\left(1-\frac{Z_{t}+\delta_{t+1}}{2t+1}\right)\right)\leq\frac{c_{t+1}^{2}}{4}\leq\frac{1}{4\pi(t+1)},\end{split}

where the last inequalities follow from the fact that $\frac{Z_{t}+\delta_{t+1}}{2t+1}\in[0,1]$ and Lemma 14, respectively.

$\hfill\blacktriangleleft$

The next proof utilizes the following well-known approximation.

Lemma 20 (See [17]).

Let $n\in\mathbb{N}$ and $H_{n}=\sum_{k=1}^{n}\frac{1}{k}$ . Then $H_{n}=\ln{n}+\gamma+\frac{1}{2n}-\alpha_{n}$ , where $\gamma\approx 0.5772$ is known as the Euler-Mascheroni constant and $0\leq\alpha_{n}\leq 1/(8n^{2})$ .
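The approximation in Lemma 20 can be illustrated numerically. In the sketch below the constant `EULER_GAMMA` is a hard-coded double-precision value of $\gamma$ and the function name `alpha` is hypothetical; the check is limited to moderate $n$ so that rounding errors stay far below the quantities being compared.

```python
import math

EULER_GAMMA = 0.5772156649015329  # double-precision value of the Euler-Mascheroni constant

def alpha(n):
    """alpha_n = ln(n) + gamma + 1/(2n) - H_n, the correction term in Lemma 20."""
    h_n = sum(1 / k for k in range(1, n + 1))
    return math.log(n) + EULER_GAMMA + 1 / (2 * n) - h_n
```

For example, `alpha(1)` is about $0.077$ , which indeed lies between $0$ and $1/8$ .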

Theorem 21.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Fix $\tilde{S}\subseteq\tilde{V}$ and let $t\in[M]$ . Then for every $\varepsilon>0$ , for sufficiently large $t$ and for sufficiently large $n$ we get

\mathbb{P}\left[\left|\operatorname{vol}_{T_{t}}(\tilde{S}_{t})-2\sqrt{t}\mu(\tilde{S}_{t})\right|\geq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}\right]\leq 2\cdot 2^{-(1+{\varepsilon}/2)t/h},

where $g_{\mathcal{V}}(h)=\frac{1}{6}\sqrt{2\ln{2}\,(9\ln{h}+8\ln{2})}+(2/3)\ln{2}+2$ .
(Recall that $\mu(\tilde{S}_{t})$ was introduced in Definition 7.)

Proof.

Throughout the proof we refer to the process of constructing the random tree $T_{hn}$ . For $t\in[M]$ let $\mathcal{F}_{t}$ be a $\sigma$ -algebra associated with all the events that happened till time $t$ . Fix $\varepsilon>0$ , set $t_{0}=\lfloor t/h\rfloor$ and for $j\in\{t_{0},t_{0}+1,\ldots,t\}$ consider

\hat{Z}_{j}=c_{j}Z_{j}-\sum_{i=1}^{j}{\delta}_{i}c_{i-1},

where $Z_{j}=\operatorname{vol}_{T_{j}}(\tilde{S}_{j})$ and $c_{i}$ ’s and ${\delta}_{i}$ ’s are as in Definition 7. By Lemma 19 we know that $\hat{Z}_{t_{0}},\ldots,\hat{Z}_{t}$ is a martingale with respect to the filtration $\mathcal{F}_{t_{0}}\subseteq\ldots\subseteq\mathcal{F}_{t}$ such that $|\hat{Z}_{j}-\hat{Z}_{j-1}|\leq\frac{2}{\sqrt{\pi(j-1)}}$ and $\mathrm{Var}[(\hat{Z}_{j}-\hat{Z}_{j-1})|\mathcal{F}_{j-1}]\leq\frac{1}{4\pi j}$ . Therefore

\max_{j\in\{t_{0}+1,\ldots,t\}}(\hat{Z}_{j}-\hat{Z}_{j-1})\leq\frac{2}{\sqrt{\pi(t_{0}-1)}}=\frac{2}{\sqrt{\pi(t/h)}}(1+O(1/t))

and, by Lemma 20,

\begin{split}\sum_{j=t_{0}+1}^{t}\mathrm{Var}[(\hat{Z}_{j}-\hat{Z}_{j-1})|\mathcal{F}_{j-1}]&\leq\sum_{j=t_{0}+1}^{t}\frac{1}{4\pi j}\leq\frac{1}{4\pi}\ln{(t/t_{0})}+\frac{1}{4\pi}\cdot\frac{1}{8\lfloor t/h\rfloor^{2}}\\ &=\frac{1}{4\pi}\ln{h}+O(1/t).\end{split}

Applying Freedman’s inequality (Lemma 12) to $\hat{Z}_{t_{0}},\ldots,\hat{Z}_{t}$ with $A=\frac{2}{\sqrt{\pi(t/h)}}(1+O(1/t))$ , $W=\frac{1}{4\pi}\ln{h}+O(1/t)$ and $\lambda=\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}$ , where $\bar{g}_{\mathcal{V}}(h)=g_{\mathcal{V}}(h)-2$ , we get

\begin{split}\mathbb{P}\left[\hat{Z}_{t}\vphantom{\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}}\right.&\left.\geq\hat{Z}_{t_{0}}+\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}\right]\\ &\leq\exp\left\{-\frac{\frac{(1+\varepsilon)^{2}\bar{g}_{\mathcal{V}}(h)^{2}}{\pi}\frac{t}{h}}{2\cdot\frac{1}{4\pi}\ln{h}+O(1/t)+\frac{2}{3}\cdot\frac{2}{\pi}(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)(1+O(1/t))}\right\}\\ &\leq\exp\left\{-\frac{(1+\varepsilon/2)^{2}\bar{g}_{\mathcal{V}}(h)^{2}}{\frac{1}{2}\ln{h}+\frac{4}{3}(1+\varepsilon/2)\bar{g}_{\mathcal{V}}(h)}\cdot\frac{t}{h}\right\},\end{split}

where the last inequality holds for sufficiently large $t$ and $n$ . One can verify that

\frac{(1+\varepsilon/2)^{2}\bar{g}_{\mathcal{V}}(h)^{2}}{\frac{1}{2}\ln{h}+\frac{4}{3}(1+\varepsilon/2)\bar{g}_{\mathcal{V}}(h)}\geq(1+{\varepsilon}/2)\ln{2}

thus for sufficiently large $t$ and $n$ we get

\mathbb{P}\left[\hat{Z}_{t}\geq\hat{Z}_{t_{0}}+\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}\right]\leq 2^{-(1+{\varepsilon}/2)t/h}.

(2)
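The inequality that "one can verify" before (2) can be checked by hand: writing $a=1+\varepsilon/2$ and $\bar{g}=\bar{g}_{\mathcal{V}}(h)$, the formula for $g_{\mathcal{V}}$ gives $\bar{g}^{2}-\frac{4}{3}\ln{2}\cdot\bar{g}=\frac{1}{2}\ln{2}\ln{h}$, so the inequality reduces to $a\cdot\frac{1}{2}\ln{2}\ln{h}\geq\frac{1}{2}\ln{2}\ln{h}$, which holds for $a\geq 1$ and $h\geq 1$. The short Python sketch below confirms this numerically; the function names `g_bar` and `freedman_exponent` are ours.

```python
import math

LN2 = math.log(2.0)


def g_bar(h: int) -> float:
    """Return g_V(h) - 2, with g_V as in the statement of Theorem 21."""
    return math.sqrt(2 * LN2 * (9 * math.log(h) + 8 * LN2)) / 6 + (2 / 3) * LN2


def freedman_exponent(h: int, eps: float) -> float:
    """Left-hand side of the inequality preceding (2)."""
    a = 1 + eps / 2
    g = g_bar(h)
    return (a * g) ** 2 / (0.5 * math.log(h) + (4 / 3) * a * g)
```

At $\varepsilon=0$ the two sides coincide (up to rounding), which suggests that the constants in $g_{\mathcal{V}}$ were chosen to make the exponent exactly $\ln{2}$ per unit of $t/h$.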

Let us now analyze the complementary event $\left\{\hat{Z}_{t}\leq\hat{Z}_{t_{0}}+\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}\right\}$. It is equivalent to

\left\{c_{t}Z_{t}-\sum_{i=1}^{t}{\delta}_{i}c_{i-1}\leq c_{t_{0}}Z_{t_{0}}-\sum_{i=1}^{t_{0}}{\delta}_{i}c_{i-1}+\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}\right\}

which is

\left\{Z_{t}-\frac{1}{c_{t}}\sum_{i=1}^{t}{\delta}_{i}c_{i-1}\leq\frac{c_{t_{0}}}{c_{t}}Z_{t_{0}}-\frac{1}{c_{t}}\sum_{i=1}^{t_{0}}{\delta}_{i}c_{i-1}+\frac{1}{c_{t}}\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}\right\}.

By the definition of $\mu(\tilde{S}_{t})$ , Lemma 14 and Lemma 15 we have

\begin{split}\frac{1}{c_{t}}\sum_{i=1}^{t}{\delta}_{i}c_{i-1}&=\frac{2}{\sqrt{\pi}c_{t}}\mu(\tilde{S}_{t})\leq 2\sqrt{t}\mu(\tilde{S}_{t})e^{\frac{1}{8t}+\frac{1}{4\cdot 144t^{2}}}\\ &=2\sqrt{t}\mu(\tilde{S}_{t})(1+O(1/t))=2\sqrt{t}\mu(\tilde{S}_{t})+O(1).\end{split}

(3)
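The two estimates on $c_{t}$ used in (3) can be illustrated numerically. The constants $c_{t}$ are introduced in Definition 7, outside this section; the sketch below assumes the closed form $c_{t}=\prod_{k=1}^{t}\frac{2k-1}{2k}$ with $c_{0}=1$. This is consistent with the ratio $c_{t}/c_{t+1}=(2t+2)/(2t+1)$ used in the proof of Lemma 19, but the normalization $c_{0}=1$ is our assumption, as are the function names.

```python
import math


def c_values(t_max: int) -> list[float]:
    """Return [c_0, ..., c_{t_max}] with c_0 = 1 (assumed) and c_t = c_{t-1} (2t-1)/(2t)."""
    c = [1.0]
    for t in range(1, t_max + 1):
        c.append(c[-1] * (2 * t - 1) / (2 * t))
    return c


def bounds_hold(t: int, c_t: float) -> bool:
    """Check c_t <= 1/sqrt(pi t) and 1/(sqrt(pi) c_t) <= sqrt(t) e^{1/(8t) + 1/(576 t^2)}."""
    upper = c_t <= 1 / math.sqrt(math.pi * t)
    lower = 1 / (math.sqrt(math.pi) * c_t) <= math.sqrt(t) * math.exp(
        1 / (8 * t) + 1 / (576 * t * t)
    )
    return upper and lower
```

Under this assumption $c_{t}$ behaves like $1/\sqrt{\pi t}$, which is why the factor $\frac{1}{\sqrt{\pi}c_{t}}$ in (3) can be replaced by $\sqrt{t}(1+O(1/t))$.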

Note that $Z_{t_{0}}\leq 2t_{0}$ , therefore by Lemma 14

\frac{c_{t_{0}}}{c_{t}}Z_{t_{0}}\leq 2\sqrt{t}\sqrt{t_{0}}(1+O(1/t))=\frac{2t}{\sqrt{h}}+O(1),

(4)

and, again by Lemma 14,

\begin{split}\frac{1}{c_{t}}\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}&\leq(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}\,(1+O(1/t))=(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}+O(1).\end{split}

(5)

Thus by (3), (4) and (5) we get that the event $\left\{\hat{Z}_{t}\leq\hat{Z}_{t_{0}}+\frac{(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)}{\sqrt{\pi}}\sqrt{t/h}\right\}$ implies

\left\{Z_{t}-2\sqrt{t}\mu(\tilde{S}_{t})\leq\frac{2t}{\sqrt{h}}+(1+\varepsilon)\bar{g}_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}+O(1)\right\}

which, by (2), for sufficiently large $t$ and $n$, gives (recall that $\bar{g}_{\mathcal{V}}(h)+2=g_{\mathcal{V}}(h)$)

\mathbb{P}\left[\operatorname{vol}_{T_{t}}(\tilde{S}_{t})-2\sqrt{t}\mu(\tilde{S}_{t})\geq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}\right]\leq 2^{-(1+{\varepsilon}/2)t/h}.

(6)

To get the opposite bound we repeat the reasoning for the martingale $-\hat{Z}_{t_{0}},\ldots,-\hat{Z}_{t}$ (check the full version of this paper for the details [33]). $\hfill\blacktriangleleft$
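To see the quantities of Theorem 21 in action, one can simulate the tree process and compare $\operatorname{vol}_{T_{t}}(\tilde{S}_{t})$ with $2\sqrt{t}\,\mu(\tilde{S}_{t})$. The Python sketch below is illustrative only and rests on two assumptions. First, the attachment rule is read off from the transition probabilities used in the proofs: mini-vertex $t$ picks one of the $2t-1$ half-edges present, its own included, uniformly at random. Second, $\mu(\tilde{S}_{t})=\frac{\sqrt{\pi}}{2}\sum_{i\leq t}\delta_{i}c_{i-1}$ is evaluated with $c_{t}=\prod_{k=1}^{t}\frac{2k-1}{2k}$ and $c_{0}=1$, a normalization we assume. The names `simulate_volume`, `mu` and `in_set` are ours.

```python
import math
import random


def simulate_volume(t_max: int, in_set, rng: random.Random) -> int:
    """Grow the random tree for t_max steps and return vol(S~) at time t_max.

    At step t the new mini-vertex t contributes one half-edge of its own and
    attaches the other end to a half-edge chosen uniformly among the 2t-1
    present (its own included, so self-loops are possible).
    """
    half_edges = []  # owner vertex of every half-edge created so far
    for t in range(1, t_max + 1):
        half_edges.append(t)             # own half-edge of vertex t
        target = rng.choice(half_edges)  # degree-proportional choice
        half_edges.append(target)
    return sum(1 for v in half_edges if in_set(v))


def mu(t_max: int, in_set) -> float:
    """mu(S~_t) = (sqrt(pi)/2) * sum_{i<=t} delta_i c_{i-1}, with assumed c_0 = 1."""
    total, c_prev = 0.0, 1.0             # c_prev holds c_{i-1}
    for i in range(1, t_max + 1):
        if in_set(i):
            total += c_prev
        c_prev *= (2 * i - 1) / (2 * i)  # now equals c_i
    return math.sqrt(math.pi) / 2 * total
```

For example, with `in_set = lambda v: v <= t // 2` (the oldest half of the mini-vertices) both quantities are close to $\sqrt{2}\,t$, in line with $\mu$ being proportional to the average volume.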

Corollary 22.

Let $G_{n}^{h}=(V,E)$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Then for every $\varepsilon>0$ we have

\begin{split}\mathbb{P}\left[\forall i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n\}\,\forall S\subseteq V\left|\operatorname{vol}_{G_{i}^{h}}(S_{i})-2\sqrt{hi}\,\mu(\tilde{S}_{hi})\right|\leq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{hi}{\sqrt{h}}\right]=1-o(1),\end{split}

where $g_{\mathcal{V}}(h)=\frac{1}{6}\sqrt{2\ln{2}\,(9\ln{h}+8\ln{2})}+(2/3)\ln{2}+2$ .

Proof.

Fix $\varepsilon>0$. Recall that $\operatorname{vol}_{G_{i}^{h}}(S_{i})=\operatorname{vol}_{T_{hi}}(\tilde{S}_{hi})$. For $S\subseteq V$ and $i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n\}$ define the event $\mathcal{E}_{S,i}$ as follows

\mathcal{E}_{S,i}=\Bigl\{\left|\operatorname{vol}_{T_{hi}}(\tilde{S}_{hi})-2\sqrt{hi}\,\mu(\tilde{S}_{hi})\right|\leq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{hi}{\sqrt{h}}\Bigr\}.

For $i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n\}$ , by Theorem 21 and the union bound, for sufficiently large $n$ we have

\mathbb{P}\left[\exists S\subseteq V\,\,\mathcal{E}_{S,i}^{C}\right]\leq 2^{i}\cdot 2\cdot 2^{-(1+\varepsilon/2)i}=2\cdot 2^{-(\varepsilon/2)i}.

Indeed, note that $i$ iterates over the vertices of $G_{n}^{h}$ and at time $i$ there are $2^{i}$ possible configurations for $S_{i}$ , thus also for $\tilde{S}_{hi}$ . Next, again by the union bound, for sufficiently large $n$ we get

\begin{split}\mathbb{P}\left[\exists i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n\}\,\,\exists S\subseteq V\,\,\mathcal{E}_{S,i}^{C}\right]&\leq\sum_{i=\lfloor\log_{2}{n}\rfloor}^{n}2\cdot 2^{-(\varepsilon/2)i}\\ &\leq\frac{2\cdot 2^{-(\varepsilon/2)\lfloor\log_{2}{n}\rfloor}}{1-2^{-\varepsilon/2}}=O\left(n^{-\varepsilon/2}\right),\end{split}

which implies

\mathbb{P}\left[\forall i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n\}\,\,\forall S\subseteq V\,\,\mathcal{E}_{S,i}\right]=1-o(1).

$\hfill\blacktriangleleft$
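The second union bound in the proof above is a geometric series: summing the per-$i$ failure bounds $2\cdot 2^{-(\varepsilon/2)i}$ over $i\geq\lfloor\log_{2}{n}\rfloor$ gives a series with ratio $2^{-\varepsilon/2}$, whose value decays polynomially in $n$. A minimal Python sketch (function names are ours) compares the finite sum with its closed-form cap:

```python
import math


def union_bound_tail(n: int, eps: float) -> float:
    """Sum of 2 * 2^{-(eps/2) i} for i = floor(log2 n), ..., n."""
    start = n.bit_length() - 1  # floor(log2 n) for a positive integer n
    return math.fsum(2.0 * 2.0 ** (-(eps / 2) * i) for i in range(start, n + 1))


def geometric_cap(n: int, eps: float) -> float:
    """Closed-form bound obtained by extending the geometric sum to infinity."""
    start = n.bit_length() - 1
    q = 2.0 ** (-(eps / 2))
    return 2.0 * q ** start / (1.0 - q)
```

Since the cap is of order $n^{-\varepsilon/2}$ for fixed $\varepsilon$, the failure probability is $o(1)$, although the convergence is slow when $\varepsilon$ is small.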

Note that considering only $i=n$ in Corollary 22 we get the statement of Theorem 8, which finishes its proof.

Now we will again use martingale inequalities which, together with the concentration results for volumes from Corollary 22, will lead to the proof of Theorem 9. Consider the process of constructing the random tree $T_{hn}=(\tilde{V},\tilde{E})$. Let $\tilde{S}\subseteq\tilde{V}$ and $j\in\{1,\ldots,M\}$. The result stated in Corollary 22 gives the concentration of the volumes of $\tilde{S}_{j}$ at time $j$ only for $j=hi$, where $i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n\}$. In particular, it says nothing about the concentration of the volumes of the sets $\tilde{S}_{hi+k}$, where $k\in\{1,2,\ldots,h-1\}$, at time $hi+k$. Such intermediate concentrations will be needed to prove Theorem 9, i.e., to draw a conclusion about the concentration of the number of edges within $\tilde{S}$. We derive those intermediate concentrations in Lemma 23.

Lemma 23.

Let $G_{n}^{h}=(V,E)$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. Then for every $\varepsilon>0$ we have

\begin{split}\mathbb{P}\left[\forall t\in\{\lfloor\log_{2}{n}\rfloor h,\ldots,M\}\,\forall S\subseteq V\left|\operatorname{vol}_{T_{t}}(\tilde{S}_{t})-2\sqrt{t}\,\mu(\tilde{S}_{t})\right|\leq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}\right]=1-o(1),\end{split}

where $g_{\mathcal{V}}(h)=\frac{1}{6}\sqrt{2\ln{2}\,(9\ln{h}+8\ln{2})}+(2/3)\ln{2}+2$ .

Proof.

Fix $\varepsilon>0$ . Define the events $\mathcal{H}$ and $\mathcal{V}$ as follows

\begin{split}\mathcal{H}&=\left\{\forall t\in\{\lfloor\log_{2}{n}\rfloor h,\ldots,M\}\,\forall S\subseteq V\left|\operatorname{vol}_{T_{t}}(\tilde{S}_{t})-2\sqrt{t}\,\mu(\tilde{S}_{t})\right|\leq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}\right\}\\ \mathcal{V}&=\left\{\vphantom{\frac{C}{\sqrt{h}}}\right.\forall j=ih\textnormal{ such that }i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n\}\,\forall S\subseteq V\\ &\quad\quad\quad\quad\quad\quad\left.\left|\operatorname{vol}_{T_{j}}(\tilde{S}_{j})-2\sqrt{j}\,\mu(\tilde{S}_{j})\right|\leq(1+\varepsilon/2)g_{\mathcal{V}}(h)\frac{j}{\sqrt{h}}\right\}.\end{split}

By Corollary 22 we know that $\mathbb{P}[\mathcal{V}]=1-o(1)$ thus it is enough to show that the event $\mathcal{V}$ implies the event $\mathcal{H}$ .

Assume that $\mathcal{V}$ holds. For $t\in[M]$ and $S\subseteq V$ let $Z_{t}=\operatorname{vol}_{T_{t}}(\tilde{S}_{t})$. Consider all $j=ih$ where $i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n-1\}$ and let $k\in\{1,2,\ldots,h-1\}$. By the fact that $\sqrt{1+k/j}=1+O(1/j)$, ${\delta}_{\ell}\in\{0,1\}$, Lemma 14 and Lemma 15 we get

\begin{split}2\sqrt{j+k}\,\mu(\tilde{S}_{j+k})&=2\sqrt{j}\sqrt{1+k/j}\left(\mu(\tilde{S}_{j})+\frac{\sqrt{\pi}}{2}\sum_{\ell=j+1}^{j+k}{{\delta}_{\ell}c_{\ell-1}}\right)\\ &\leq 2\sqrt{j}\left(1+O(1/j)\right)\left(\mu(\tilde{S}_{j})+\sum_{\ell=j+1}^{j+k}\frac{1}{\sqrt{\pi(\ell-1)}}\right)\\ &=(2\sqrt{j}+O(1/\sqrt{j}))\left(\mu(\tilde{S}_{j})+O(1/\sqrt{j})\right)=2\sqrt{j}\,\mu(\tilde{S}_{j})+O(1).\end{split}

Therefore, if $\mathcal{V}$ holds, then for all $j=ih$ where $i\in\{\lfloor\log_{2}{n}\rfloor,\ldots,n-1\}$ , for all $S\subseteq V$ , and for all $k\in\{1,2,\ldots,h-1\}$ , for sufficiently large $n$ on one hand,

\begin{split}Z_{j+k}&\geq Z_{j}\geq 2\sqrt{j}\,\mu(\tilde{S}_{j})-(1+\varepsilon/2)g_{\mathcal{V}}(h)\frac{j}{\sqrt{h}}\\ &\geq 2\sqrt{j+k}\,\mu(\tilde{S}_{j+k})-O(1)-(1+\varepsilon/2)g_{\mathcal{V}}(h)\frac{j+k}{\sqrt{h}}\\ &\geq 2\sqrt{j+k}\,\mu(\tilde{S}_{j+k})-(1+\varepsilon)g_{\mathcal{V}}(h)\frac{j+k}{\sqrt{h}}\end{split}

and on the other hand

\begin{split}Z_{j+k}&\leq Z_{j}+2k\leq 2\sqrt{j}\,\mu(\tilde{S}_{j})+(1+\varepsilon/2)g_{\mathcal{V}}(h)\frac{j}{\sqrt{h}}+2h\\ &\leq 2\sqrt{j+k}\,\mu(\tilde{S}_{j+k})+(1+\varepsilon)g_{\mathcal{V}}(h)\frac{j+k}{\sqrt{h}}.\end{split}

Thus for sufficiently large $n$ the event $\mathcal{V}$ implies the event $\mathcal{H}$ , therefore $\mathbb{P}[\mathcal{V}]=1-o(1)$ implies $\mathbb{P}[\mathcal{H}]=1-o(1)$ and the proof is finished. $\hfill\blacktriangleleft$

Now, we move on to the concentration of the number of edges within subsets of $V$ .

Lemma 24.

Let $G_{n}^{h}$ be a preferential attachment graph. Consider the process of constructing its corresponding random tree $T_{hn}=(\tilde{V},\tilde{E})$. Fix $\tilde{S}\subseteq\tilde{V}$ and for $t\in[M]$ let $X_{t}=e(\tilde{S}_{t})$ and let $\mathcal{F}_{t}$ be the $\sigma$-algebra generated by all the events that happened up to time $t$. For $j\in\{2,\ldots,M\}$ set $D_{j}=\mathbb{E}[X_{j}-X_{j-1}|\mathcal{F}_{j-1}]$ and define

\hat{X}_{t}=X_{t}-\sum_{j=2}^{t}D_{j}.

Then $\hat{X}_{1},\hat{X}_{2},\ldots,\hat{X}_{M}$ is a martingale with respect to the filtration $\mathcal{F}_{1}\subseteq\ldots\subseteq\mathcal{F}_{M}$ . Moreover, for $t\in\{2,\ldots,M\}$

|\hat{X}_{t}-\hat{X}_{t-1}|\leq{\delta}_{t}.

Proof.

Let $t\in\{2,\ldots,M\}$ . Note that

\begin{split}\mathbb{E}[\hat{X}_{t}-\hat{X}_{t-1}|\mathcal{F}_{t-1}]&=\mathbb{E}[{X}_{t}-{X}_{t-1}-D_{t}|\mathcal{F}_{t-1}]\\ &=\mathbb{E}[{X}_{t}-{X}_{t-1}|\mathcal{F}_{t-1}]-\mathbb{E}[\mathbb{E}[X_{t}-X_{t-1}|\mathcal{F}_{t-1}]|\mathcal{F}_{t-1}]\\ &=\mathbb{E}[{X}_{t}-{X}_{t-1}|\mathcal{F}_{t-1}]-\mathbb{E}[{X}_{t}-{X}_{t-1}|\mathcal{F}_{t-1}]=0,\end{split}

thus $\hat{X}_{1},\ldots,\hat{X}_{M}$ is a martingale with respect to the filtration $\mathcal{F}_{1}\subseteq\ldots\subseteq\mathcal{F}_{M}$ .

Now, for $t\in[M]$ let $Z_{t}=\operatorname{vol}_{T_{t}}(\tilde{S}_{t})$ . Recall that when the mini-vertex $t$ arrives it may also connect to itself thus for $t\in\{2,\ldots,M\}$ we have

D_{t}=\mathbb{E}[X_{t}-X_{t-1}|\mathcal{F}_{t-1}]={\delta}_{t}\frac{Z_{t-1}+1}{2t-1}.

Therefore

\begin{split}|\hat{X}_{t}-\hat{X}_{t-1}|&=|X_{t}-X_{t-1}-D_{t}|=\left|X_{t}-X_{t-1}-{\delta}_{t}\frac{Z_{t-1}+1}{2t-1}\right|\\ &\leq\max\left\{X_{t}-X_{t-1},{\delta}_{t}\frac{Z_{t-1}+1}{2t-1}\right\}\leq{\delta}_{t},\end{split}

where we used the fact that $0\leq X_{t}-X_{t-1}\leq{\delta}_{t}$, that $Z_{t-1}\leq 2t-2$ and thus $0\leq{\delta}_{t}\frac{Z_{t-1}+1}{2t-1}\leq{\delta}_{t}$, and that $|a-b|\leq\max\{a,b\}$ for non-negative $a,b$. $\hfill\blacktriangleleft$

Lemma 25.

Let $G_{n}^{h}$ be a preferential attachment graph and $T_{hn}=(\tilde{V},\tilde{E})$ its corresponding random tree. For $t\in[M]$ and $S\subseteq V$ let $Z_{t}=\operatorname{vol}_{T_{t}}(\tilde{S}_{t})$ . Then for $t_{0}\in[M]$ divisible by $h$ and for every $\varepsilon>0$ we have

\mathbb{P}\left[\forall S\subseteq V\,\left|e(S)-\left(e(S_{t_{0}/h})+\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{Z_{i-1}+1}{2i-1}\right)\right|\leq{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right]=1-o(1),

where ${B}_{\varepsilon}=\sqrt{(1+\varepsilon)2\ln{2}}$ .

Proof.

Throughout the proof we again refer to the process of constructing the random tree $T_{hn}$. For $t\in[M]$ let $\mathcal{F}_{t}$ be the $\sigma$-algebra generated by all the events that happened up to time $t$. For $S\subseteq V$ let $X_{t}=e_{T_{t}}(\tilde{S}_{t})$ and $D_{t}=\mathbb{E}[X_{t}-X_{t-1}|\mathcal{F}_{t-1}]$. Fix $\varepsilon>0$ and for $j\in\{t_{0},t_{0}+1,\ldots,M\}$ and $S\subseteq V$ consider

\hat{X}_{j}=X_{j}-\sum_{i=2}^{j}D_{i}.

By Lemma 24 we know that $\hat{X}_{t_{0}},\ldots,\hat{X}_{M}$ is a martingale with respect to the filtration $\mathcal{F}_{t_{0}}\subseteq\ldots\subseteq\mathcal{F}_{M}$ such that $|\hat{X}_{j}-\hat{X}_{j-1}|\leq{\delta}_{j}$ . Moreover

\sum_{j=t_{0}}^{M}{\delta}_{j}^{2}=\sum_{j=t_{0}}^{M}{\delta}_{j}\leq M.

Thus, applying the Azuma-Hoeffding inequality (Lemma 11) to $\hat{X}_{t_{0}},\ldots,\hat{X}_{M}$ with $b_{j}={\delta}_{j}$ and $x={B}_{\varepsilon}\frac{M}{\sqrt{h}}$, where ${B}_{\varepsilon}=\sqrt{(1+\varepsilon)2\ln{2}}$, we get

\begin{split}\mathbb{P}\left[\hat{X}_{M}\geq\hat{X}_{t_{0}}+{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right]&\leq\exp\left\{-\frac{(1+\varepsilon)2\ln{2}\cdot M^{2}/h}{2M}\right\}=2^{-(1+\varepsilon)n}.\end{split}

(7)
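The arithmetic behind (7) is short enough to check mechanically. The sketch below evaluates the Azuma-Hoeffding exponent $x^{2}/(2\sum_{j}b_{j}^{2})$ in the form used above, with $\sum_{j}b_{j}^{2}$ replaced by its upper bound $M$ and with $M=hn$, which is what the last equality in (7) implies. The function name is ours. The choice of ${B}_{\varepsilon}$ leaves a margin of $2^{-\varepsilon n}$ after multiplying by the $2^{n}$ subsets $S\subseteq V$, which is presumably how the statement for all $S$ is obtained, as in the proof of Corollary 22.

```python
import math


def azuma_log2_bound(n: int, h: int, eps: float) -> float:
    """log2 of exp(-x^2 / (2 M)) with x = B_eps * M / sqrt(h) and M = h * n."""
    M = h * n                                  # number of mini-vertices (assumed M = hn)
    B = math.sqrt((1 + eps) * 2 * math.log(2))
    x = B * M / math.sqrt(h)
    return -(x * x) / (2 * M) / math.log(2)    # convert natural log to log2
```

For any admissible input the returned value equals $-(1+\varepsilon)n$ up to rounding, so adding $n$ (the $\log_{2}$ of the number of subsets) still leaves a negative exponent $-\varepsilon n$.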

Let us now analyze the event $\left\{\hat{X}_{M}\geq\hat{X}_{t_{0}}+{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right\}$ . It is equivalent to

\left\{X_{M}-\sum_{i=2}^{M}D_{i}\geq X_{t_{0}}-\sum_{i=2}^{t_{0}}D_{i}+{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right\}

which is

\left\{X_{M}\geq X_{t_{0}}+\sum_{i=t_{0}+1}^{M}D_{i}+{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right\}

and, by the definition of $D_{i}$ (check the proof of Lemma 24),

\left\{X_{M}\geq X_{t_{0}}+\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{Z_{i-1}+1}{2i-1}+{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right\}.

Thus, by (7), we get

\mathbb{P}\left[X_{M}-\left(X_{t_{0}}+\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{Z_{i-1}+1}{2i-1}\right)\geq{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right]\leq 2^{-(1+\varepsilon)n}.

(8)

Acting analogously for the martingale $-\hat{X}_{t_{0}},\ldots,-\hat{X}_{M}$ we get the opposite bound (check the full version of this paper for the details [33]). $\hfill\blacktriangleleft$

We are ready to prove Theorem 9.

Proof of Theorem 9.

Throughout the proof we again refer to the process of constructing the random tree $T_{hn}$ . Fix $\varepsilon>0$ . For $t\in[M]$ and $S\subseteq V$ let $X_{t}=e_{T_{t}}(\tilde{S}_{t})$ and $Z_{t}=\operatorname{vol}_{T_{t}}(\tilde{S}_{t})$ . For $t_{0}=h\lfloor\log_{2}{n}\rfloor$ we define the events $\mathcal{H}$ , $\mathcal{E}$ and $\mathcal{V}$ as follows

\begin{split}\mathcal{H}=\left\{\vphantom{\frac{C}{\sqrt{h}}}\forall S\right.&\subseteq V\left.\left|e(S)-\mu(\tilde{S})^{2}\right|\leq(1+\varepsilon)g_{\mathcal{E}}(h)\frac{M}{\sqrt{h}}\right\},\\ \mathcal{E}=\left\{\vphantom{\frac{C}{\sqrt{h}}}\forall S\right.&\subseteq V\left.\left|e(S)-\left(e(S_{t_{0}/h})+\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{Z_{i-1}+1}{2i-1}\right)\right|\leq{B}_{\varepsilon}\frac{M}{\sqrt{h}}\right\},\\ \mathcal{V}=\left\{\vphantom{\frac{C}{\sqrt{h}}}\forall t\right.&\in\{t_{0},\ldots,M\}\,\forall S\subseteq V\left.|Z_{t}-2\sqrt{t}\,\mu(\tilde{S}_{t})|\leq(1+\varepsilon)g_{\mathcal{V}}(h)\frac{t}{\sqrt{h}}\right\},\end{split}

where ${B}_{\varepsilon}=\sqrt{(1+\varepsilon)2\ln{2}}$. By Lemmas 23 and 25 we know that $\mathbb{P}[\mathcal{E}\cap\mathcal{V}]=1-o(1)$. Thus it is enough to show that the event $\mathcal{E}\cap\mathcal{V}$ implies the event $\mathcal{H}$.

Assume that $\mathcal{E}\cap\mathcal{V}$ holds. By Lemmas 16, 17, and 18, for any $S\subseteq V$, as $t_{0}=O(\ln M)$, we can write

\begin{split}\sum_{i=t_{0}+1}^{M}{\delta}_{i}&\frac{Z_{i-1}+1}{2i-1}\leq\sum_{i=t_{0}+1}^{M}\frac{{\delta}_{i}}{2i-1}+\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{2\sqrt{i-1}\mu(\tilde{S}_{i-1})}{2i-1}+\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{(1+\varepsilon)g_{\mathcal{V}}(h)\frac{i-1}{\sqrt{h}}}{2i-1}\\ \leq&\,\frac{\pi}{2}\sum_{i=1}^{M}({\delta}_{i}c_{i-1})^{2}+\frac{\pi}{2}\sum_{i=1}^{M}\left({\delta}_{i}c_{i-1}\sum_{j=1}^{i-1}{\delta}_{j}c_{j-1}\right)+\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M-t_{0}}{\sqrt{h}}+O(\ln{M})\\ =&\left(\frac{\sqrt{\pi}}{2}\right)^{2}\left(\sum_{i=1}^{M}({\delta}_{i}c_{i-1})^{2}+2\sum_{i=1}^{M}\left({\delta}_{i}c_{i-1}\sum_{j=1}^{i-1}{\delta}_{j}c_{j-1}\right)\right)+\frac{\pi}{4}\sum_{i=1}^{M}({\delta}_{i}c_{i-1})^{2}\\ &+\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M-t_{0}}{\sqrt{h}}+O(\ln{M})\\ \leq&\,\mu(\tilde{S}_{M})^{2}+\frac{\pi}{4}\sum_{i=1}^{M}\frac{1}{\pi(i-1)}+\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M-t_{0}}{\sqrt{h}}+O(\ln{M})\\ =&\,\mu(\tilde{S}_{M})^{2}+\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M-t_{0}}{\sqrt{h}}+O(\ln{M}),\end{split}

where the last inequality follows from Lemma 14, the fact that ${\delta}_{i}\in\{0,1\}$ , and the fact that

\mu(\tilde{S}_{M})^{2}=\left(\frac{\sqrt{\pi}}{2}\sum_{i=1}^{M}{\delta}_{i}c_{i-1}\right)^{2}=\left(\frac{\sqrt{\pi}}{2}\right)^{2}\left(\sum_{i=1}^{M}({\delta}_{i}c_{i-1})^{2}+2\sum_{i=1}^{M}\left({\delta}_{i}c_{i-1}\sum_{j=1}^{i-1}{\delta}_{j}c_{j-1}\right)\right).

Analogously, again by Lemmas 16, 17, and 18, for any $S\subseteq V$ we can get

\begin{split}\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{Z_{i-1}+1}{2i-1}&\geq\sum_{i=t_{0}+1}^{M}\frac{{\delta}_{i}}{2i-1}+\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{2\sqrt{i-1}\mu(\tilde{S}_{i-1})}{2i-1}-\sum_{i=t_{0}+1}^{M}{\delta}_{i}\frac{(1+\varepsilon)g_{\mathcal{V}}(h)\frac{i-1}{\sqrt{h}}}{2i-1}\\ &\geq\mu(\tilde{S}_{M})^{2}-\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M-t_{0}}{\sqrt{h}}-O(\ln{M}).\end{split}

Note that for any $S\subseteq V$ we have $e(S_{t_{0}/h})\leq t_{0}$ and recall that $t_{0}=h\lfloor\log_{2}{n}\rfloor$ . Thus for all $S\subseteq V$ , for sufficiently large $n$ we may write (recall that $\mathcal{E}\cap\mathcal{V}$ holds)

\begin{split}e(S)&\leq t_{0}+{B}_{\varepsilon}\frac{M}{\sqrt{h}}+\mu(\tilde{S})^{2}+\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M-t_{0}}{\sqrt{h}}+O(\ln{M})\\ &\leq\mu(\tilde{S})^{2}+(1+\varepsilon)\sqrt{2\ln{2}}\frac{M}{\sqrt{h}}+\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M}{\sqrt{h}}=\mu(\tilde{S})^{2}+(1+\varepsilon)g_{\mathcal{E}}(h)\frac{M}{\sqrt{h}},\end{split}

where the term $O(\ln{M})+t_{0}$ vanishes after the second inequality as the factor $\sqrt{1+\varepsilon}$ from $B_{\varepsilon}$ is replaced by $(1+\varepsilon)$ . Analogously we get

\begin{split}e(S)&\geq-{B}_{\varepsilon}\frac{M}{\sqrt{h}}+\mu(\tilde{S})^{2}-\frac{(1+\varepsilon)g_{\mathcal{V}}(h)}{2}\frac{M-t_{0}}{\sqrt{h}}-O(\ln{M})\geq\mu(\tilde{S})^{2}-(1+\varepsilon)g_{\mathcal{E}}(h)\frac{M}{\sqrt{h}}.\end{split}

This means that for sufficiently large $n$, the event $\mathcal{E}\cap\mathcal{V}$ implies the event $\mathcal{H}$. Thus $\mathbb{P}[\mathcal{E}\cap\mathcal{V}]=1-o(1)$ implies $\mathbb{P}[\mathcal{H}]=1-o(1)$, and the proof is finished. $\hfill\blacktriangleleft$

Mimicking the reasoning from the proofs of Lemma 24, Lemma 25, and Theorem 9, one can prove Theorem 10. We omit the details, as they would be very repetitive.

6 Modularity of $G_{n}^{h}$ vanishes with $h$

Recall that the main result of the paper (Theorem 5) states that the modularity of a preferential attachment graph $G_{n}^{h}$ is with high probability upper bounded by a function tending to $0$ with $h$ tending to infinity. The whole current section is devoted to its proof.

The first step in the proof of Theorem 5 follows from an interesting general result on modularity by Dinh and Thai.

Lemma 26 ([10], Lemma 1).

Let $G$ be a graph with at least one edge and let $k\in\mathbb{N}\setminus\{1\}$ . Then

\operatorname{mod}(G)\leq\frac{k}{k-1}\max_{\mathcal{A}:|\mathcal{A}|\leq k}\operatorname{mod}_{\mathcal{A}}(G).

In particular,

\operatorname{mod}(G)\leq 2\max_{\mathcal{A}:|\mathcal{A}|\leq 2}\operatorname{mod}_{\mathcal{A}}(G).

Corollary 27.

Let $G=(V,E)$ be a graph with at least one edge. Then

\operatorname{mod}(G)\leq 4\cdot\max_{S\subseteq V}\left(\frac{e(S)}{e(G)}-\frac{\operatorname{vol}(S)^{2}}{\operatorname{vol}(G)^{2}}\right).

Proof.

Consider $2$ -element partitions of $V$ . We have

\begin{split}\max_{\mathcal{A}:|\mathcal{A}|=2}\operatorname{mod}_{\mathcal{A}}(G)&=\max_{\mathcal{A}=\{S,V\setminus S\}}\left(\frac{e(S)}{e(G)}-\frac{\operatorname{vol}(S)^{2}}{\operatorname{vol}(G)^{2}}+\frac{e(V\setminus S)}{e(G)}-\frac{\operatorname{vol}(V\setminus S)^{2}}{\operatorname{vol}(G)^{2}}\right)\\ &\leq 2\cdot\max_{S\subseteq V}\left(\frac{e(S)}{e(G)}-\frac{\operatorname{vol}(S)^{2}}{\operatorname{vol}(G)^{2}}\right).\end{split}

The conclusion follows by Lemma 26. (Note that for $S=V$ the argument of the maximum equals $0$ thus the bound is non-negative.) $\hfill\blacktriangleleft$
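Corollary 27 can be checked by brute force on a small graph. The Python sketch below uses exact rational arithmetic and enumerates all partitions, so it is only feasible for a handful of vertices. It takes $\operatorname{mod}_{\mathcal{A}}(G)$ to be the sum over the parts $S\in\mathcal{A}$ of $\frac{e(S)}{e(G)}-\frac{\operatorname{vol}(S)^{2}}{\operatorname{vol}(G)^{2}}$, as in the display above; all function names are ours.

```python
from fractions import Fraction
from itertools import combinations


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for k in range(len(partition)):  # add `first` to an existing block
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        yield [[first]] + partition      # or open a new block


def score(block, edges):
    """e(S)/e(G) - vol(S)^2/vol(G)^2 for a vertex set S, in exact arithmetic."""
    s = set(block)
    m = len(edges)
    inside = sum(1 for u, v in edges if u in s and v in s)
    vol = sum((u in s) + (v in s) for u, v in edges)
    return Fraction(inside, m) - Fraction(vol * vol, 4 * m * m)


def modularity(vertices, edges):
    """Maximum over all partitions of the sum of block scores (brute force)."""
    return max(sum(score(b, edges) for b in p) for p in set_partitions(vertices))


def corollary_27_bound(vertices, edges):
    """4 * max over non-empty S of e(S)/e(G) - vol(S)^2/vol(G)^2."""
    subsets = (c for r in range(1, len(vertices) + 1) for c in combinations(vertices, r))
    return 4 * max(score(s, edges) for s in subsets)
```

On two triangles joined by a single edge, the partition into the two triangles scores $5/14$, while the bound of Corollary 27 is at least $4\cdot 5/28=5/7$. The factor $4$ is therefore not tight on this example, which is harmless for the asymptotic argument.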

The above corollary frees us from considering all the partitions of $V$ when analyzing the modularity of $G_{n}^{h}$. We can simply concentrate on upper bounding the values of $\left(\frac{e(S)}{e(G_{n}^{h})}-\frac{\operatorname{vol}(S)^{2}}{\operatorname{vol}(G_{n}^{h})^{2}}\right)$ over all $S\subseteq V$. To do so, we use the concentration results for $e(S)$ and $\operatorname{vol}(S)$ obtained in Section 5.

Proof of Theorem 5.

Fix $\varepsilon>0$ and let $g_{\mathcal{E}}(h)=\frac{g_{\mathcal{V}}(h)}{2}+\sqrt{2\ln{2}}$ . Let us define the events $\mathcal{H}$ , $\mathcal{E}$ and $\mathcal{V}$ as follows

\mathcal{H}=\left\{\operatorname{mod}(G_{n}^{h})\leq\frac{(1+\varepsilon)f(h)}{\sqrt{h}}\right\},

\mathcal{E}=\left\{\forall S\subseteq V\,e(S)\leq\mu(\tilde{S})^{2}+(1+\varepsilon)g_{\mathcal{E}}(h)\frac{M}{\sqrt{h}}\right\},

\mathcal{V}=\left\{\forall S\subseteq V\,\operatorname{vol}(S)\geq 2\sqrt{M}\mu(\tilde{S})-(1+\varepsilon)g_{\mathcal{V}}(h)\frac{M}{\sqrt{h}}\right\}.

By Theorem 8 and Theorem 9 we know that $\mathbb{P}[\mathcal{E}\cap\mathcal{V}]=1-o(1)$ . Thus it is enough to show that the event $\mathcal{E}\cap\mathcal{V}$ implies the event $\mathcal{H}$ .

Assume that $\mathcal{E}\cap\mathcal{V}$ holds. Recall that $e(G_{n}^{h})=M$ and $\operatorname{vol}(G_{n}^{h})=2M$ . For any $S\subseteq V$ we may write

\begin{split}\frac{e(S)}{e(G_{n}^{h})}&-\frac{\operatorname{vol}(S)^{2}}{\operatorname{vol}(G_{n}^{h})^{2}}=\frac{4Me(S)-\operatorname{vol}(S)^{2}}{4M^{2}}\\ &\leq\frac{1}{4M^{2}}\left(4M\mu(\tilde{S})^{2}+4(1+\varepsilon)g_{\mathcal{E}}(h)\frac{M^{2}}{\sqrt{h}}-\left(2\sqrt{M}\mu(\tilde{S})-(1+\varepsilon)g_{\mathcal{V}}(h)\frac{M}{\sqrt{h}}\right)^{2}\right)\\ &=\frac{(1+\varepsilon)g_{\mathcal{E}}(h)}{\sqrt{h}}+\frac{\mu(\tilde{S})(1+\varepsilon)g_{\mathcal{V}}(h)}{\sqrt{Mh}}-\frac{(1+\varepsilon)^{2}g_{\mathcal{V}}(h)^{2}}{4h}\\ &\leq\frac{(1+\varepsilon)\left(g_{\mathcal{E}}(h)+g_{\mathcal{V}}(h)-g_{\mathcal{V}}(h)^{2}/(4\sqrt{h})\right)}{\sqrt{h}},\end{split}

where the last inequality follows from Lemma 15 and is valid for sufficiently large $n$ . By Corollary 27, for sufficiently large $n$ we obtain

\begin{split}\operatorname{mod}{(G_{n}^{h})}&\leq 4\cdot\max_{S\subseteq V}\left(\frac{e(S)}{e(G_{n}^{h})}-\frac{\operatorname{vol}(S)^{2}}{\operatorname{vol}(G_{n}^{h})^{2}}\right)\\ &\leq\frac{(1+\varepsilon)\left(4g_{\mathcal{E}}(h)+4g_{\mathcal{V}}(h)-g_{\mathcal{V}}(h)^{2}/\sqrt{h}\right)}{\sqrt{h}}=\frac{(1+\varepsilon)f(h)}{\sqrt{h}}.\end{split}

Hence, for sufficiently large $n$, the event $\mathcal{E}\cap\mathcal{V}$ implies the event $\mathcal{H}$. Thus $\mathbb{P}[\mathcal{E}\cap\mathcal{V}]=1-o(1)$ implies $\mathbb{P}[\mathcal{H}]=1-o(1)$, and the proof is finished. $\hfill\blacktriangleleft$
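The algebra in the proof above, namely the expansion of the square (where the term $4M\mu(\tilde{S})^{2}$ cancels) and the final estimate using $\mu(\tilde{S})\leq\sqrt{M}$, can be spot-checked numerically. In the Python sketch below `a` stands for $1+\varepsilon$, while `gE` and `gV` are treated as free positive parameters; the function names are ours.

```python
import math
import random


def lhs(mu, M, h, a, gE, gV):
    """(4M e_max - vol_min^2) / (4M^2) with the bounds from the events E and V plugged in."""
    e_max = mu ** 2 + a * gE * M / math.sqrt(h)
    vol_min = 2 * math.sqrt(M) * mu - a * gV * M / math.sqrt(h)
    return (4 * M * e_max - vol_min ** 2) / (4 * M ** 2)


def rhs(mu, M, h, a, gE, gV):
    """The simplified three-term expression after expanding the square."""
    return (a * gE / math.sqrt(h)
            + mu * a * gV / math.sqrt(M * h)
            - (a * gV) ** 2 / (4 * h))


def final_bound(h, a, gE, gV):
    """The last line of the display: a (gE + gV - gV^2 / (4 sqrt(h))) / sqrt(h)."""
    return a * (gE + gV - gV ** 2 / (4 * math.sqrt(h))) / math.sqrt(h)
```

The step from `rhs` to `final_bound` uses only $\mu(\tilde{S})\leq\sqrt{M}$ and $(1+\varepsilon)^{2}\geq 1+\varepsilon$.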

We finish by proving Corollary 6.

Proof of Corollary 6.

The corollary follows from Theorem 5 by considering $\tilde{f}(h)=6g_{\mathcal{V}}(h)+4\sqrt{2\ln{2}}$ instead of $f(h)$ in the upper bound (note that $3\sqrt{2\ln{2}}\leq 3.54$ , $(8/9)\ln{2}\leq 0.62$ and $4\ln{2}+12+4\sqrt{2\ln{2}}\leq 19.49$ ). $\hfill\blacktriangleleft$
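The numerical constants quoted in the proof above can be reproduced directly. Since $6g_{\mathcal{V}}(h)=3\sqrt{2\ln{2}}\sqrt{\ln{h}+(8/9)\ln{2}}+4\ln{2}+12$, the three rounded values give $\tilde{f}(h)\leq 3.54\sqrt{\ln{h}+0.62}+19.49$. The Python sketch below checks this, together with $f(h)\leq\tilde{f}(h)$, where $f(h)=4g_{\mathcal{E}}(h)+4g_{\mathcal{V}}(h)-g_{\mathcal{V}}(h)^{2}/\sqrt{h}$ is read off from the last display in the proof of Theorem 5. The function names are ours.

```python
import math

LN2 = math.log(2.0)


def g_V(h: int) -> float:
    """g_V(h) as defined in Theorem 21."""
    return math.sqrt(2 * LN2 * (9 * math.log(h) + 8 * LN2)) / 6 + (2 / 3) * LN2 + 2


def f(h: int) -> float:
    """f(h) = 4 g_E(h) + 4 g_V(h) - g_V(h)^2 / sqrt(h), with g_E = g_V / 2 + sqrt(2 ln 2)."""
    gv = g_V(h)
    ge = gv / 2 + math.sqrt(2 * LN2)
    return 4 * ge + 4 * gv - gv ** 2 / math.sqrt(h)


def f_tilde(h: int) -> float:
    """The simplified function 6 g_V(h) + 4 sqrt(2 ln 2)."""
    return 6 * g_V(h) + 4 * math.sqrt(2 * LN2)


def rounded_bound(h: int) -> float:
    """Upper bound on f_tilde(h) built from the rounded constants 3.54, 0.62 and 19.49."""
    return 3.54 * math.sqrt(math.log(h) + 0.62) + 19.49
```

With these constants $\tilde{f}(h)/\sqrt{h}$ drops below $1$ only once $h$ is of order $10^{3}$ (it is above $1$ at $h=800$ and below $1$ at $h=1000$), so the bound is informative for large $h$ only.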

7 Concluding remarks

We showed that the modularity of a preferential attachment graph $G_{n}^{h}$ is, with high probability, upper bounded by a function of the order $\Theta(\sqrt{\ln{h}}/\sqrt{h})$. This proves Conjecture 3, but Conjecture 4, which says that the modularity of $G_{n}^{h}$ is, with high probability, of the order $\Theta(1/\sqrt{h})$, remains open. Note that the term $\Theta(1/\sqrt{h})$ can also be seen as $\Theta(1/\sqrt{\bar{d}_{h}})$, where $\bar{d}_{h}$ stands for the average vertex degree in $G_{n}^{h}$. Such behavior of modularity has already been reported in other random graphs. For $G_{n,r}$ being a random $r$-regular graph it is known that whp $\operatorname{mod}(G_{n,r})=\Theta(1/\sqrt{r})$ (see [25]). For the binomial random graph $G(n,p)$ it is known that $\operatorname{mod}(G(n,p))=\Theta(1/\sqrt{np})$ when $np\geq 1$ and $p$ is bounded below $1$ (see [26, 34]). These results might be seen as evidence supporting Conjecture 4.

To the best of our knowledge, this paper provides the first concentration results for $\operatorname{vol}(S)$, $e(S)$, and $e(S,V\setminus S)$ in $G_{n}^{h}$, where $S$ can be an arbitrary subset of $V$. The analogous results obtained so far, e.g., in [7], [12], or [30], always addressed “compact” subsets of vertices, i.e., sets of the form $\{i,i+1,i+2,\ldots,j\}$. (In Lemma 4 of [12] the authors investigate the volume of any set $S\subseteq[t]$ of size $1\leq k\leq t$ at time $t$, but in fact in their proof the volume of $S$ is upper bounded by the volume of the $k$ “oldest” vertices in $[t]$, i.e., the volume of the set $[k]$.) In this paper a more accurate analysis was possible thanks to introducing the indicator functions $\delta_{i}^{\tilde{S}}$ in Definition 7. We believe that the proof methods leading to the results obtained in Section 5 might help in the future to get bounds for edge expansion or conductance of $G_{n}^{h}$ that are stronger than those currently known.

References

[1] K. Azuma. Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, 19(3):357–367, 1967. doi:10.2748/tmj/1178243286.
[2] J.P. Bagrow. Communities and bottlenecks: Trees and treelike networks have high modularity. Physical Review E, 85:066118, 2012. doi:10.1103/PhysRevE.85.066118.
[3] A.L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999. doi:10.1126/science.286.5439.509.
[4] P. Bennett and A. Dudek. A gentle introduction to the differential equation method and dynamic concentration. Discrete Mathematics, 345(12):113071, 2022. doi:10.1016/j.disc.2022.113071.
[5] V.D. Blondel, J.L. Guillaume, R. Lambiotte, and E. Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008. doi:10.1088/1742-5468/2008/10/P10008.
[6] B. Bollobás and O. Riordan. Handbook of Graphs and Networks: From the Genome to the Internet. Wiley-VCH, 2003. Pages 1–34. doi:10.1002/3527602755.
[7] B. Bollobás and O. Riordan. The diameter of a scale-free random graph. Combinatorica, 24:5–34, 2004. doi:10.1007/s00493-004-0002-2.
[8] B. Bollobás, O. Riordan, J. Spencer, and G. Tusnády. The degree sequence of a scale-free random graph process. Random Structures & Algorithms, 18(3):279–290, April 2001. doi:10.1002/rsa.1009.
[9] J. Chellig, N. Fountoulakis, and F. Skerman. The modularity of random graphs on the hyperbolic plane. Journal of Complex Networks, 10(1):cnab051, 2021. doi:10.1093/comnet/cnab051.
[10] T.N. Dinh and M.T. Thai. Finding community structure with performance guarantees in scale-free networks. In 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing, pages 888–891, 2011. doi:10.1109/PASSAT/SocialCom.2011.185.
[11] D. Freedman. On tail probabilities for martingales. The Annals of Probability, 3(1):100–118, 1975. doi:10.1214/aop/1176996452.
[12] A. Frieze, X. Pérez-Giménez, P. Prałat, and B. Reiniger. Perfect matchings and Hamiltonian cycles in the preferential attachment model. Random Structures & Algorithms, 54(2):258–288, 2019. doi:10.1002/rsa.20778.
[13] G. Gilad and R. Sharan. From Leiden to Tel-Aviv University (TAU): exploring clustering solutions via a genetic algorithm. PNAS Nexus, 2(6):pgad180, 2023. doi:10.1093/pnasnexus/pgad180.
[14] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963. doi:10.1080/01621459.1963.10500830.
[15] R. van der Hofstad. Random Graphs and Complex Networks. Vol.1. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2017. doi:10.1017/9781316779422.
[16] S. Janson, T. Łuczak, and A. Ruciński. Random Graphs. John Wiley & Sons, Inc., 2000. doi:10.1002/9781118032718.
[17] R. P. Boas Jr. and J. W. Wrench Jr. Partial sums of the harmonic series. The American Mathematical Monthly, 78(8):864–870, 1971. doi:10.2307/2316476.
[18] B. Kamiński, V. Poulin, P. Prałat, P. Szufel, and F. Théberge. Clustering via hypergraph modularity. Plos One, 14:e0224307, February 2019. doi:10.1371/journal.pone.0224307.
[19] B. Kamiński, P. Prałat, and F. Théberge. Mining Complex Networks. Chapman and Hall/CRC, 2021. doi:10.1201/9781003218869.
[20] M. Lasoń and M. Sulkowska. Modularity of minor-free graphs. Journal of Graph Theory, 102(4):728–736, 2023. doi:10.1002/jgt.22896.
[21] L. Lichev and D. Mitsche. On the modularity of 3-regular random graphs and random graphs with given degree sequences. Random Structures & Algorithms, 61(4):754–802, 2022. doi:10.1002/rsa.21080.
[22] Hosam M. Mahmoud, R. T. Smythe, and J. Szymański. On the structure of random plane-oriented recursive trees and their branches. Random Structures & Algorithms, 4(2):151–176, 1993. doi:10.1002/rsa.3240040204.
[23] C. McDiarmid, K. Rybarczyk, F. Skerman, and M. Sulkowska. Note on edge expansion and modularity in preferential attachment graphs, 2025. Preprint.
[24] C. McDiarmid and F. Skerman. Modularity in random regular graphs and lattices. Electronic Notes in Discrete Mathematics, 43:431–437, 2013. doi:10.1016/j.endm.2013.07.063.
[25] C. McDiarmid and F. Skerman. Modularity of regular and treelike graphs. Journal of Complex Networks, 6(4):596–619, 2018. doi:10.1093/comnet/cnx046.
[26] C. McDiarmid and F. Skerman. Modularity of Erdős-Rényi random graphs. Random Structures & Algorithms, 57(1):211–243, 2020. doi:10.1002/rsa.20910.
[27] M. Mihail, C. Papadimitriou, and A. Saberi. On certain connectivity properties of the internet topology. Journal of Computer and System Sciences, 72:239–251, 2006. doi:10.1016/j.jcss.2005.06.009.
[28] M.E.J. Newman. Networks. Oxford University Press, 2018. doi:10.1093/oso/9780198805090.001.0001.
[29] M.E.J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Review E, 69(2):026113, 2004. doi:10.1103/physreve.69.026113.
[30] L. Prokhorenkova, A. Raigorodskii, and P. Prałat. Modularity of complex networks models. Internet Mathematics, 2017. doi:10.24166/im.12.2017.
[31] H. Robbins. A remark on Stirling’s formula. The American Mathematical Monthly, 62(1):26–29, 1955. doi:10.2307/2308012.
[32] K. Rybarczyk. Modularity of random intersection graphs. In M. Bloznelis, P. Drungilas, B. Kamiński, P. Prałat, M. Šileikis, F. Théberge, and R. Vaicekauskas, editors, Modelling and Mining Networks, pages 30–44, Cham, 2025. Springer Nature Switzerland. doi:10.1007/978-3-031-92898-7_3.
[33] K. Rybarczyk and M. Sulkowska. Modularity of preferential attachment graphs, 2025. doi:10.48550/arXiv.2501.06771.
[34] K. Rybarczyk and M. Sulkowska. New bounds on the modularity of G(n,p), 2025. doi:10.48550/arXiv.2504.16254.
[35] J. Szymański. On a nonuniform random recursive tree. In A. Barlotti, M. Biliotti, A. Cossu, G. Korchmaros, and G. Tallini, editors, Annals of Discrete Mathematics (33), volume 144 of North-Holland Mathematics Studies, pages 297–306. North-Holland, 1987. doi:10.1016/S0304-0208(08)73062-7.
[36] V.A. Traag, L. Waltman, and N.J. van Eck. From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9(5233), 2019. doi:10.1038/s41598-019-41695-z.
[37] L. Warnke. On the method of typical bounded differences. Combinatorics, Probability & Computing, 25:269–299, 2016. doi:10.1017/S0963548315000103.

