
Near-Universally-Optimal Differentially Private
Minimum Spanning Trees

Richard Hladík, ETH Zurich, Switzerland · Jakub Tětek, INSAIT, Sofia University “St. Kliment Ohridski”, Bulgaria
Abstract

Devising mechanisms with good beyond-worst-case input-dependent performance has been an important focus of differential privacy, with techniques such as smooth sensitivity, propose-test-release, or inverse sensitivity mechanism being developed to achieve this goal. This makes it very natural to use the notion of universal optimality in differential privacy. Universal optimality is a strong instance-specific optimality guarantee for problems on weighted graphs, which roughly states that for any fixed underlying (unweighted) graph, the algorithm is optimal in the worst-case sense, with respect to the possible setting of the edge weights.

In this paper, we give the first such result in differential privacy. Namely, we prove that a simple differentially private mechanism for approximately releasing the minimum spanning tree is near-optimal in the sense of universal optimality for the ℓ1 neighbor relation. Previously, it was only known that this mechanism is nearly optimal in the worst case. We then focus on the ℓ∞ neighbor relation, for which the described mechanism is not optimal. We show that one may implement the exponential mechanism for MST in polynomial time, and that this results in universal near-optimality for both the ℓ1 and the ℓ∞ neighbor relations.

Keywords and phrases:
differential privacy, universal optimality, minimum spanning trees
Funding:
Richard Hladík: Supported in part by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (grant agreement No. 949272).
Copyright and License:
© Richard Hladík and Jakub Tětek; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Security and privacy → Privacy-preserving protocols; Mathematics of computing → Graph algorithms
Related Version:
Full Version: https://arxiv.org/abs/2404.15035
Acknowledgements:
We would like to thank Rasmus Pagh for helpful discussions and hosting Richard at the University of Copenhagen. We would like to thank Bernhard Haeupler for helpful discussions. Jakub worked on this paper while affiliated with BARC, University of Copenhagen. Richard partially worked on this paper while visiting BARC, University of Copenhagen.
Funding:
Partially funded by the Ministry of Education and Science of Bulgaria’s support for INSAIT as part of the Bulgarian National Roadmap for Research Infrastructure. Partially supported by the VILLUM Foundation grants 54451 and 16582.
Editors:
Mark Bun

1 Introduction

The minimum spanning tree (MST) problem is one of the classic combinatorial problems, making the release of an MST a fundamental question in differential privacy. Unfortunately, if the edge set is private, this problem cannot be solved, as it would require us to release a subset of the edges (note that we want to release the edges of the MST and not just their total weight). It is therefore natural to consider this problem in the (standard) setting where the underlying unweighted graph is public, but the weights are private.

The following simple near-linear-time mechanism has been proposed for this problem [26]: add Laplacian noise to the edge weights, making them private, then find the MST with the noisy weights. At the same time, the author shows that this mechanism is near-optimal in the worst case.

In this paper, we prove a much stronger optimality claim for this algorithm, namely that it is universally optimal up to a logarithmic factor for the ℓ1 neighbor relation (i.e., two graphs on the same edge set are neighboring if their edge weights differ by at most 1 in the ℓ1 norm). An algorithm (or differentially private mechanism) that works on weighted graphs is said to be universally optimal if, for any fixed underlying unweighted graph G, it is worst-case optimal with respect to the possible settings of the edge weights 𝐰. That is, if we write a weighted graph as G̃ = (G, 𝐰), universal optimality states that for any fixed G, we are worst-case optimal w.r.t. 𝐰. Standard worst-case optimality, on the other hand, is worst-case w.r.t. both 𝐰 and G.

We then focus on differential privacy with the ℓ∞ neighbor relation (defined analogously to the ℓ1 neighbor relation), for which the above algorithm is not universally optimal. We instead prove that one may implement the exponential mechanism for MST in polynomial time by relying on a known sampling result [6]. We prove that this more complicated and somewhat slower 𝒪(n^ω)-time algorithm does achieve universal optimality up to a logarithmic factor for both the ℓ1 and the ℓ∞ neighbor relations. For the ℓ∞ neighbor relation, this improves upon the PAMST algorithm of Pinot [23], which is only known to be near-optimal in the worst-case sense.

Our results are the first to prove the universal optimality of a differentially private mechanism. (The name "universal optimality" unfortunately has a different meaning in different contexts; several results are universally optimal under one of these other meanings, as we discuss in Section 1.2. We stick with the meaning commonly used in distributed algorithms.) This is despite the fact that previous work in differential privacy has put a lot of emphasis on instance-specific performance guarantees: smooth sensitivity, propose-test-release, privately bounding local sensitivity, and the inverse sensitivity mechanism are all examples of this trend. This makes it very natural to focus on universal optimality, perhaps making it somewhat surprising that universal optimality is not already commonly used in differential privacy.

As a side note, we also prove that the above-mentioned near-linear-time algorithm is optimal up to a constant factor in the worst-case sense. This improves upon previous results, which had a logarithmic gap [26]. Similarly, we prove that the exponential mechanism for MST is worst-case optimal up to a constant factor for both the ℓ1 and the ℓ∞ neighbor relations.

1.1 Technical Overview

In this section, we briefly discuss the intuition and techniques behind our results. Here, we focus mostly on the ℓ1 neighbor relation and universal near-optimality. The lower bound arguments for ℓ∞ and for worst-case optimality are similar. The comparison between our results and previous ones is summarized in Table 1.

Our lower and upper bounds rely on the properties of the set 𝒯(G) of all spanning trees of a given (unweighted) graph G. We define a Hamming-like metric dH on this space by dH(T1, T2) = |T1 ∖ T2|. It turns out that the diameter D of 𝒯(G) with respect to dH is a natural parameter that determines the "hardness" of G. Namely, we show an Ω(D/ε) lower bound and an 𝒪(D log n / ε) upper bound on the expected error of the optimal ε-differentially private algorithm.

Upper bound

In Section 3, we use the diameter D to obtain a sharper analysis of the Laplace mechanism of Sealfon [26]. The mechanism is very simple: add Laplace noise to every edge, then return the MST with respect to the noisy weights. Its standard analysis uses the fact that with high probability, the noise on every edge is 𝒪(log n / ε), and thus the total error of the spanning tree returned is 𝒪(n log n / ε). Our improvement follows from the fact that the returned spanning tree differs from the MST in at most 2D edges, and thus the total error accumulated is actually only 𝒪(D log n / ε).

Lower bound

The most technically interesting part of this paper is our lower bound in Section 4. We have a fixed underlying graph G, and for any ε-differentially private mechanism, we want to find a weight assignment 𝐰 on which the error will be large. Our lower bound builds on a general packing-based lower bound for differentially private algorithms, which we briefly paraphrase in the language of MSTs and universal optimality: Given a set 𝒲 of weight vectors and a parameter x ≥ 0, define for each 𝐰 ∈ 𝒲 the set ℬ𝐰 of spanning trees that are x-light for this 𝐰, i.e., that are heavier than the MST by at most x. Moreover, assume that the sets ℬ𝐰 are pairwise disjoint, that is, each spanning tree is x-light with respect to at most one weight vector in 𝒲. Then the expected error of any ε-differentially private algorithm is Ω(x · log|𝒲| / (rε)) on at least one weight assignment 𝐰 ∈ 𝒲, where r is the diameter of 𝒲 in the metric induced by the neighbor relation ∼.

Intuitively speaking, we have a set 𝒲 of weight assignments and for each of them, we consider a ball of outputs that are “good” in the sense that their error is “small” (with respect to x) for this input. Our goal is to find a large collection 𝒲 of weights that are “close” (i.e., r is small), but which induce disjoint balls that are “wide” (i.e., we can set x to be large).

Reduction to finding many dissimilar spanning trees

The problem of finding 𝒲 that maximizes the expression x · log|𝒲| / (rε) is somewhat complicated by the fact that there is a trade-off between |𝒲|, x, and r. In order to solve this problem, we first show how to reduce it to a combinatorial problem of finding a large set of dissimilar spanning trees.

The main idea of the reduction is as follows: assume we have S ⊆ 𝒯(G) such that every two distinct T1, T2 ∈ S have dH(T1, T2) > d. We construct 𝒲 by, for each T ∈ S, creating a weight vector 𝐰T with 𝐰T(e) = 0 if e ∈ T and 𝐰T(e) = 1 otherwise. Now the crucial observation is that 𝐰T(T′) = dH(T, T′). That is, for any T′, the weight of T′ under the weight vector 𝐰T is exactly the Hamming distance between T and T′. One can then verify that for every T′ ∈ 𝒯(G), there is at most one weight vector 𝐰T ∈ 𝒲 under which its weight is at most d/2, as otherwise we would, by the triangle inequality, for some T1, T2 ∈ S have dH(T1, T2) ≤ dH(T1, T′) + dH(T2, T′) ≤ d, which would be a contradiction with the definition of d. Therefore, we can set x as large as d/2 while still ensuring the disjointness of all the sets ℬ𝐰 required by the original lower bound.

An upper bound of r ≤ 2D can be argued as follows: any two spanning trees differ in at most 2D edges. Combined with the weights being zero-one, any two of the weight vectors 𝐰T thus have ℓ1-distance at most 2D, and hence r ≤ 2D.

Using these ideas, we have reduced the original problem to the problem of finding a large set S of spanning trees that is sparse in the sense that no two spanning trees in S are similar. More specifically, we have reduced the original problem to the problem of finding a set S of spanning trees that maximizes d · log|S|.

How to find many dissimilar spanning trees?

Finally, the problem of finding a large yet sparse S is reducible to a standard problem: We show that there are always at least 2^D spanning trees, and the problem of finding a sparse subset among them can be reduced to finding a binary code of length D and minimum Hamming distance Ω(D) that has 2^Ω(D) codewords. Such a code exists by the Gilbert–Varshamov bound [14, 28]. The S that we get from this reduction then allows us to prove an Ω(D/ε) lower bound on the expected error, which nearly matches the 𝒪(D log n / ε) upper bound mentioned above.
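To make the coding-theoretic step concrete, here is a minimal brute-force sketch of the greedy argument behind the Gilbert–Varshamov bound (the function name `greedy_code` is ours, not from the paper): scan all binary words and keep each one that is far from everything kept so far. This yields at least 2^n / V(n, d−1) codewords, where V(n, r) is the volume of a Hamming ball of radius r.

```python
from itertools import product

def hamming(a, b):
    """Hamming distance between two equal-length binary tuples."""
    return sum(x != y for x, y in zip(a, b))

def greedy_code(length, min_dist):
    """Greedily collect binary codewords that are pairwise at Hamming
    distance >= min_dist; this realizes the Gilbert-Varshamov guarantee."""
    code = []
    for word in product((0, 1), repeat=length):
        if all(hamming(word, c) >= min_dist for c in code):
            code.append(word)
    return code

# For length 8 and distance 3, V(8, 2) = 1 + 8 + 28 = 37, so the bound
# guarantees at least 256 // 37 = 6 codewords; the greedy scan finds more.
code = greedy_code(8, 3)
```

This brute-force scan is only feasible for small lengths; the paper needs only the existence of such a code, which the bound supplies for every length D.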

1.2 Related Work

Table 1: Summary of our results. Recall that D is the diameter of the space 𝒯(G) of spanning trees of G, as defined in Section 2.2.
| neighborhood ∼ | fixed topology: lower bound | fixed topology: upper bound | worst case: lower and upper bound | previous work: lower bound | previous work: upper bound | reference |
| ℓ1 | Ω(D/ε) | 𝒪(D log n / ε) | Θ(n log n / ε) | Ω(n/ε) | 𝒪(n log n / ε) | [26] |
| ℓ∞ | Ω(D²/ε) | 𝒪(D² log n / ε) | Θ(n² log n / ε) | — | 𝒪(n² log n / ε) | [23] |

Differentially private minimum spanning trees

The problem of privately releasing the MST in the "public graph, private weights" setting was first studied by Sealfon [26]. The author shows a worst-case lower bound of Ω(n/ε) on the expected error for the ℓ1 neighbor relation, proposes a simple mechanism based on adding Laplace noise to all weights and computing the MST with respect to the noisy weights, and proves that it is worst-case optimal up to an 𝒪(log n) factor. In Sections 3 and 4, we show that this mechanism is in fact worst-case optimal up to a constant factor and also universally optimal up to an 𝒪(log n) factor.

Pinot [23] gives a mechanism for the ℓ∞ neighbor relation by running a noisy version of the Jarník–Prim algorithm that, in each step, selects the edge to be included in the tree by running the exponential mechanism. It has an 𝒪(n² log n / ε) expected error, which, by our results from Section 4, is worst-case optimal. However, the analysis does not seem to be easily modifiable to show universal optimality. We remark that the author claims an expected error of 𝒪(n² log n / (mε)), but this is only true if one normalizes all weights by 1/m.

We also note that releasing the weight of the minimum spanning tree is easy. One may easily show that the global sensitivity is 1 (under ℓ1) or n − 1 (under ℓ∞), allowing us to simply find the MST and release its weight using the Laplace mechanism. In their work on smooth sensitivity, Nissim, Raskhodnikova, and Smith [21] consider the problem of privately releasing the weight of the MST in a slightly different setting where the weights are bounded and neighboring datasets differ by changing the weight of one edge.

Other differential privacy notions on graphs

The notion of privacy used in this work was introduced by Sealfon [26] and has since been used widely [4, 23, 24, 5, 10, 22]. To paraphrase a real-world motivation, the graph may represent a city's (publicly known) metro network, with edge weights representing how busy each line is. Users' commute data is private information which contributes to the edge weights and should be protected. (Depending on what specific data are stored for each user, this example motivates both the ℓ1 and the ℓ∞ neighbor relation.) Other common privacy notions include edge differential privacy and node differential privacy [18], where the graph itself is unweighted and private, and two graphs G and G′ are neighboring if one can be obtained from the other by deleting an edge (for edge privacy) or a node and all its adjacent edges (for node privacy).

Instance-optimality in differential privacy

The notion of universal optimality is closely linked to that of instance optimality, which states that our algorithm is “as good as any mechanism could be” on every single instance. Universal optimality can then be seen as a combination of instance-optimality w.r.t. the underlying graph and worst-case optimality w.r.t. the edge weights.

In the last few years, several instance-optimality results have been proven in differential privacy [19, 3, 7, 1]. We highlight Huang, Liang, and Yi [19], who give an instance-optimal differentially private mechanism for releasing the mean. We also highlight Asi and Duchi [1], who introduce the inverse sensitivity mechanism and give instance-optimality results for mean estimation, linear regression, and principal component analysis.

It should be noted that there are subtleties in the precise definitions of instance optimality that these papers use. The reason is that under the definition of instance-optimality that is commonly used in other areas, often no instance-optimal mechanism exists in differential privacy for trivial reasons. Therefore, the precise definitions used in the mentioned papers differ somewhat.

Issues with nomenclature

The name “universal optimality” has unfortunately been used to mean different things in different contexts. Throughout this paper, we use universal optimality in the sense in which it is commonly used in distributed algorithms [17]. It should be noted that there have been several works in differential privacy that use the name “universal optimality” to denote completely unrelated concepts [13, 11]. Namely, these papers use the name universal optimality in a Bayesian setting, where universal optimality states that a given mechanism is optimal no matter the prior. This is a completely unrelated notion to what we consider in this paper.

Universal optimality

The notion of universal optimality started in distributed algorithms where multiple classic combinatorial problems are now known to have universally optimal algorithms in various settings [17, 29, 16, 25]. We highlight here the paper by Haeupler, Wajc, and Zuzic [17] which gives a universally optimal algorithm in the supported CONGEST model for the minimum spanning tree; the techniques used in that paper are different from those that we use, despite both papers considering the MST problem.

Recently, it was shown that Dijkstra’s algorithm is universally optimal for a version of the single-source shortest paths problem [15] in the standard Word-RAM model. To the best of our knowledge, that paper was the first universal optimality result in a non-distributed setting. This makes this paper the second such result.

1.3 Future work

We believe that our technique can be applied to other graph problems. Specifically, the framework that we use for our lower bound seems to be quite general. We believe that it is likely that the same techniques could be used for releasing shortest-path trees. With some additional tweaks, we believe our techniques could be useful for problems such as maximum-weight matching or minimum-weight perfect matching. Extending our results to approximate differential privacy would also be of great interest.

2 Preliminaries

In this paper, we consider all graphs to be simple, undirected, and connected. We consider the setting introduced by Sealfon [26], where the unweighted graph G = (V, E) is public and fixed, and the only private information is the edge weights 𝐰 ∈ ℝ^E. We also assume that G has at least two different spanning trees, i.e., that it is not itself a tree. We denote by 𝒯(G) the set of all spanning trees of G, and identify each T ∈ 𝒯(G) with the set of its edges. It is folklore that for all connected graphs, 1 ≤ |𝒯(G)| ≤ n^(n−2), with the maximum attained when G is a clique. We write 𝐰(T) as a shortcut for Σ_{e∈T} 𝐰(e), and write T*𝐰 to denote the minimum spanning tree under weights 𝐰. If 𝐰 is clear from context, we write just T*. For a spanning tree T, its error is defined as 𝐰(T) − 𝐰(T*).

2.1 Differential Privacy

We define two notions of adjacency for weight vectors: ∼1, defined so that 𝐰 ∼1 𝐰′ if and only if ‖𝐰 − 𝐰′‖1 ≤ 1, and ∼∞, defined so that 𝐰 ∼∞ 𝐰′ if and only if ‖𝐰 − 𝐰′‖∞ ≤ 1.

A mechanism for MST is any randomized algorithm 𝒜 that, for a fixed and public unweighted graph G, takes as input a weight vector 𝐰 and outputs a spanning tree of G, specified by the list of its edges. (Formally, there are infinitely many mechanisms, one for each graph G; we instead pretend that there is a single mechanism that accepts G as an additional parameter.) We are interested in minimizing the expected error of 𝒜, defined as 𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] − 𝐰(T*). Furthermore, 𝒜 is ε-differentially private with respect to ∼ (which is either ∼1 or ∼∞) if, for every 𝐰 ∼ 𝐰′ and every T ∈ 𝒯(G),

Pr[𝒜(G, 𝐰) = T] ≤ e^ε · Pr[𝒜(G, 𝐰′) = T],

where ε>0 is a parameter that controls the tradeoff between privacy and the expected error of 𝒜.

2.2 Spanning Trees

We define the following metric on the space of spanning trees: dH(T1, T2) ≔ |T1 ∖ T2|. (The reader is invited to verify that dH is indeed a metric.) We call dH the Hamming metric or Hamming distance between T1 and T2. Note that |T1 ∖ T2| = |T2 ∖ T1|, so we could have used either in the definition. We then define diam 𝒯(G) ≔ max_{T1,T2 ∈ 𝒯(G)} dH(T1, T2) as the diameter of the space of spanning trees of G with respect to dH. Trivially, diam 𝒯(G) ≤ n − 1, but it can be much smaller: for example, diam 𝒯(G) = 1 when G is a cycle, and more generally, diam 𝒯(G) ≤ k if G has at most n − 1 + k edges. In Section 5, we provide an exponential lower and upper bound on the relationship between diam 𝒯(G) and |𝒯(G)|.
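For intuition, the space 𝒯(G) and its diameter can be computed by brute force on small graphs. The following sketch (our own toy code, not from the paper) enumerates all spanning trees of a 4-cycle via union-find and confirms that the diameter is 1, as claimed above.

```python
from itertools import combinations

def spanning_trees(n, edges):
    """Enumerate all spanning trees of a graph on vertices 0..n-1 by checking
    every (n-1)-edge subset for acyclicity (an acyclic set of n-1 edges on
    n vertices is automatically a spanning tree)."""
    trees = []
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            trees.append(frozenset(subset))
    return trees

def d_H(t1, t2):
    """Hamming distance d_H(T1, T2) = |T1 \\ T2| between two edge sets."""
    return len(t1 - t2)

# A 4-cycle has exactly 4 spanning trees (drop any one edge), and any two of
# them differ in a single edge, so diam T(G) = 1.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
trees = spanning_trees(4, cycle)
diameter = max(d_H(a, b) for a in trees for b in trees)
```

This exhaustive enumeration is exponential in general (up to n^(n−2) trees for a clique), so it serves only as a sanity check on small instances.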

In the lower bounds, we will make use of special zero-one weight vectors, derived from spanning trees:

Definition 1.

For a spanning tree T ∈ 𝒯(G), define the weights 𝟙T ∈ ℝ^E as 𝟙T(e) = 0 if e ∈ T, and 𝟙T(e) = 1 otherwise.

Fact 2.

For two trees T1, T2 ∈ 𝒯(G), it holds that 𝟙T1(T2) = |T2 ∖ T1| = dH(T1, T2) = |T1 ∖ T2| = 𝟙T2(T1). In particular, 𝟙T(T) = 0. It also holds that ‖𝟙T1 − 𝟙T2‖1 = 2 dH(T1, T2) and ‖𝟙T1 − 𝟙T2‖∞ ≤ 1 (with equality if T1 ≠ T2).
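Fact 2 is easy to verify numerically. The following sketch (with our own helper names, assuming edges stored as sorted tuples) checks every identity for two spanning trees of a 4-cycle.

```python
# Two spanning trees of the 4-cycle on edges {01, 12, 23, 30}.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
t1 = {(0, 1), (1, 2), (2, 3)}
t2 = {(1, 2), (2, 3), (3, 0)}

def indicator(tree):
    """The weight vector 1_T: zero on tree edges, one on all other edges."""
    return {e: 0 if e in tree else 1 for e in edges}

def weight(w, tree):
    """w(T): total weight of the edges of a tree under weights w."""
    return sum(w[e] for e in tree)

w1, w2 = indicator(t1), indicator(t2)
d = len(t1 - t2)  # d_H(T1, T2); here t1 \ t2 = {(0, 1)}, so d = 1

# 1_{T1}(T2) = |T2 \ T1| = d_H(T1, T2) = |T1 \ T2| = 1_{T2}(T1)
assert weight(w1, t2) == d == weight(w2, t1)
assert weight(w1, t1) == 0  # 1_T(T) = 0
# ||1_{T1} - 1_{T2}||_1 = 2 d_H(T1, T2), and the l_inf distance is 1
assert sum(abs(w1[e] - w2[e]) for e in edges) == 2 * d
assert max(abs(w1[e] - w2[e]) for e in edges) == 1
```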

2.3 Universal Optimality

Universal optimality is a notion that, intuitively speaking, says that an algorithm is optimal for any fixed graph topology. It is most commonly used with respect to the time complexity, but here, we instead define it in terms of an expected error of an ε-differentially private mechanism.

Definition 3.

Let 𝒫 be any optimization problem on weighted graphs where, for every input (G, 𝐰) ∈ 𝒳 (here 𝒳 is the set of all problem inputs; in our specific problem, this is the uncountably infinite set of all connected graphs with all possible choices of edge weights), the goal is to produce an output y ∈ 𝒴 minimizing some scoring function μ(G, 𝐰, y). Define μ*(G, 𝐰) = min_{y ∈ 𝒴} μ(G, 𝐰, y). For a fixed graph G and an algorithm 𝒜, define the worst-case expected error of 𝒜 on G as

R(𝒜, G) ≔ max{ 𝔼[μ(G, 𝐰, 𝒜(G, 𝐰))] − μ*(G, 𝐰) : 𝐰 such that (G, 𝐰) ∈ 𝒳 }.

We say that an ε-differentially private mechanism 𝒜 : 𝒳 → 𝒴 is universally optimal for 𝒫 if there exists a constant c > 0 such that for any unweighted graph G and any other ε-differentially private mechanism 𝒜′, we have R(𝒜, G) ≤ c · R(𝒜′, G). We say that 𝒜 is universally optimal up to a factor f(G) if R(𝒜, G) ≤ c · f(G) · R(𝒜′, G).

3 Simple Mechanism for Privately Releasing the MST

In this section, we give a tighter analysis of a very simple mechanism for privately releasing an MST, which originally appeared in Sealfon [26]. Whereas the original analysis proves that the expected error is at most 𝒪(n log n / ε) for the ℓ1 neighbor relation, we prove a tighter bound of 𝒪(D log n / ε) for D = diam 𝒯(G) (note that D ≤ n − 1). In Section 4, we prove a lower bound of Ω(D/ε), showing that the algorithm is universally near-optimal. On the other hand, in Claim 26 we prove that for ℓ∞, the algorithm is neither worst-case nor universally optimal.

Our goal is to prove the following corollary. It follows from Corollary 7 (which states the upper bound), Theorem 9 (which states the lower bound) and the fact that the time complexity of Algorithm 5 is dominated by the runtime of an MST algorithm.

Corollary 4.

Algorithm 5 is ε-differentially private under the ℓ1 neighbor relation ∼1 for any ε = 𝒪(1). It is universally optimal up to an 𝒪(log n) factor for releasing the MST under the ℓ1 neighbor relation. It runs in near-linear time.

Algorithm 5 (MST via postprocessing).

Given the weight vector 𝐰, let b = 1/ε for ∼1 and b = m/ε for ∼∞. Release the noisy weights 𝐰̂ obtained by, for each edge, independently sampling 𝐰̂(e) ∼ 𝐰(e) + Lap(b), where Lap(b) is the Laplace distribution with mean 0 and scale b. Calculate the MST of (G, 𝐰̂) and return it.
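A minimal sketch of Algorithm 5 (our own illustrative code: `private_mst`, `kruskal`, and `sample_laplace` are hypothetical names, and we assume vertices 0..n−1 with weights keyed by edge tuple):

```python
import math
import random

def sample_laplace(scale):
    """Sample Lap(scale) with mean 0 via the inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def kruskal(n, weighted_edges):
    """MST of a connected graph on vertices 0..n-1; input is (weight, u, v)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree

def private_mst(n, edges, weights, eps, neighbor="l1"):
    """Algorithm 5: add Laplace noise to every weight, then return the MST
    of the noisy graph. The scale is b = 1/eps for the l1 neighbor relation
    and b = m/eps for the l_inf relation, where m is the number of edges."""
    b = 1.0 / eps if neighbor == "l1" else len(edges) / eps
    noisy = [(weights[(u, v)] + sample_laplace(b), u, v) for (u, v) in edges]
    return kruskal(n, noisy)
```

With large ε (little noise) the output coincides with the true MST with high probability; as ε shrinks, the noise scale grows and the error bound of Theorem 6 takes over.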

Theorem 6.

Algorithm 5, denoted here as 𝒜, is ε-differentially private for both ∼1 and ∼∞. For ∼1, and for any γ > 0, it holds that

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + 4D log(n/γ)/ε] ≥ 1 − γ,

where T* is the MST of (G, 𝐰) and D = diam 𝒯(G).

The idea of our proof is analogous to the original proof by Sealfon [26], except that in the last step, instead of bounding the error incurred by all edges of T and T*, we bound the (smaller) error incurred by the at most 2D edges in which T and T* differ.

Proof.

By the guarantees of the Laplace mechanism [9], we can privately release the noisy weight vector, where for ∼∞, we additionally use that ‖𝐰 − 𝐰′‖1 ≤ m for 𝐰 ∼∞ 𝐰′. The privacy of Algorithm 5 then follows because postprocessing preserves privacy.

Fix γ > 0. For each edge e, it holds that Pr[|𝐰(e) − 𝐰̂(e)| ≤ log(m/γ)/ε] = 1 − γ/m by the definition of Lap(·). By the union bound, the probability that |𝐰(e) − 𝐰̂(e)| ≤ log(m/γ)/ε holds for all edges is at least 1 − γ. Let us condition on this event.

Let T be the spanning tree returned by Algorithm 5, i.e., the MST of (G, 𝐰̂), and let T* be the MST of (G, 𝐰). We can write

𝐰(T) − 𝐰(T*) = Σ_{e ∈ T∖T*} 𝐰(e) − Σ_{e ∈ T*∖T} 𝐰(e)
≤ Σ_{e ∈ T∖T*} 𝐰̂(e) − Σ_{e ∈ T*∖T} 𝐰̂(e) + 2D log(m/γ)/ε
= 𝐰̂(T) − 𝐰̂(T*) + 2D log(m/γ)/ε
≤ 2D log(m/γ)/ε.

We used, respectively, the fact that edges in T ∩ T* do not contribute to the error, the fact that |T ∖ T*| = |T* ∖ T| ≤ D, rewriting, and the fact that T is an MST under 𝐰̂. Noting that log(m/γ) ≤ log(n²/γ) ≤ 2 log(n/γ) finishes the proof.

Corollary 7.

Let 𝒜 be Algorithm 5 in the setting with ∼1. It holds that 𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≤ 𝐰(T*) + 4D(log n + 1)/ε, where T* is the MST of (G, 𝐰) and D = diam 𝒯(G).

Proof.

By Theorem 6, we have

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + 4D(log n + log(1/γ))/ε] ≥ 1 − γ.

The claim then follows from the following lemma, with a=4D/ε and b=logn.

Lemma 8.

Let X be a random variable and let a > 0 and b ∈ ℝ be such that for every γ ∈ (0, 1], we have Pr[X ≤ a(log(1/γ) + b)] ≥ 1 − γ. Then 𝔼[X] ≤ a(b + 1).

Proof.

Equivalently, we can write Pr[X/a − b ≤ log(1/γ)] ≥ 1 − γ. Now let x be such that γ = exp(−x). We have Pr[X/a − b ≤ x] ≥ 1 − exp(−x). Denote Y = X/a − b and let Z ∼ Exp(1), where Exp(λ) is the exponential distribution. For any x, we have Pr[Y ≤ x] ≥ 1 − exp(−x) = Pr[Z ≤ x], i.e., Z stochastically dominates Y, and thus 𝔼[Y] ≤ 𝔼[Z] = 1. Thus, by linearity of expectation, 𝔼[X] = a(𝔼[Y] + b) ≤ a(b + 1), as needed.
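A quick Monte Carlo sanity check of Lemma 8 (our own sketch): the extremal case in the proof is X = a(Z + b) with Z ∼ Exp(1), for which the tail condition holds with equality and 𝔼[X] is exactly a(b + 1).

```python
import random

random.seed(1)

# Extremal case of Lemma 8: X = a * (Z + b) with Z ~ Exp(1) satisfies
# Pr[X <= a(log(1/gamma) + b)] = 1 - gamma exactly, and E[X] = a(b + 1).
a, b = 2.0, 0.5
samples = [a * (random.expovariate(1.0) + b) for _ in range(200_000)]
mean = sum(samples) / len(samples)
# The empirical mean should be close to a * (b + 1) = 3.
```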

4 Lower Bounds for MST

In this section, we focus on proving the lower bounds. For ∼1, our lower bound nearly matches the performance of Algorithm 5; for ∼∞, it nearly matches the performance of the mechanism from Section 6.

In both cases, there is a log n gap between the lower and upper bounds. This gap is inherent to our approach, but for some graphs, and in particular for G = Kn, we can make it disappear and get worst-case lower bounds of Ω(n log n / ε) (for ℓ1) and Ω(n² log n / ε) (for ℓ∞) that match our worst-case upper bounds from Section 6 up to a constant factor, as well as that of Sealfon [26] in the case of ℓ1. Overall, we prove the following theorem:

Theorem 9.

Fix an unweighted graph G and let 𝒜 be a mechanism for MST on G. If 𝒜 is ε-differentially private with respect to ∼1, then there exist weights 𝐰 such that 𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≥ 𝐰(T*) + Ω(D/ε) − 𝒪(1), where D = diam 𝒯(G). Moreover, there exists at least one G and 𝐰 where 𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≥ 𝐰(T*) + Ω(n log n / ε) − 𝒪(1).

If 𝒜 is ε-differentially private with respect to ∼∞ instead, then, for every fixed G, there exist weights 𝐰 such that 𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≥ 𝐰(T*) + Ω(D²/ε) − 𝒪(D). Moreover, there exists at least one G and 𝐰 where 𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≥ 𝐰(T*) + Ω(n² log n / ε) − 𝒪(n).

The rough outline of the proof is as follows: in Lemmas 10 and 11, we state (and restate in the language of MSTs) a general packing-based lower bound that applies whenever we can find many inputs (in our case, weight vectors) close in the metric induced by the neighbor relation such that, for each input, the set of all outputs (in our case, spanning trees) that give small error with respect to this input is disjoint from all the other sets. In Lemma 12, we prove that instead of searching for a good set of weight vectors, we can search for a large set of spanning trees in which every two spanning trees differ in many edges. Lemmas 13 and 14 prove that such a large set always exists, with the latter being a special case providing stronger guarantees when the graph is a clique. We postpone their proofs to Section 5, where we also build the needed theory.

At the core of our lower bound is the following lemma, based on a general packing argument.

Lemma 10 (Vadhan [27], Theorem 5.13).

Let 𝒞 ⊆ 𝒳 be a collection of datasets, all at (neighbor-relation-induced) distance at most r from some fixed dataset x0 ∈ 𝒳, and let {𝒢x}_{x ∈ 𝒞} be a collection of disjoint subsets of 𝒴. If there is an ε-differentially private mechanism ℳ : 𝒳 → 𝒴 such that Pr[ℳ(x) ∈ 𝒢x] ≥ p for every x ∈ 𝒞, then p ≤ e^(rε)/|𝒞|.

The following rephrases Lemma 10 in the language of our problem. It also picks a specific way of constructing the sets ℬ𝐰, namely, by including precisely those outputs that have a "small enough" error with respect to 𝐰.

Lemma 11 (Adaptation of Lemma 10 for MST).

Fix an unweighted graph G = (V, E) and let ∼ be an arbitrary neighbor relation on ℝ^E. Given a collection 𝒲 ⊆ ℝ^E of weight vectors, all at (∼-induced) distance at most r from some fixed weight vector 𝐰0 ∈ ℝ^E, and a parameter x ≥ 0, denote for each 𝐰 ∈ 𝒲 the MST of (G, 𝐰) by T*𝐰, and define ℬ𝐰 ≔ {T ∈ 𝒯(G) : 𝐰(T) ≤ 𝐰(T*𝐰) + x}, the set of spanning trees that are "light" under 𝐰. Assume that 𝒲 and x are such that all ℬ𝐰 are pairwise disjoint. Then for any ε-differentially private (with respect to ∼) mechanism ℳ : ℝ^E → 𝒯(G), there exist weights 𝐰 ∈ 𝒲 such that Pr_{T∼ℳ(𝐰)}[𝐰(T) ≤ 𝐰(T*𝐰) + x] ≤ e^(rε)/|𝒲|.

It turns out that there is a very natural class of sets 𝒲 that gives a reasonably good lower bound. Namely, the 𝒲 we use in our lower bound will be of the form 𝒲 ⊆ {0, α}^E for some choice of α. This is formalized in the following statement. It says that to create our 𝒲, we can start with any S ⊆ 𝒯(G) and then use the mapping T ↦ α𝟙T (recall that 𝟙T is the indicator function of the set E ∖ T, see Definition 1). The quality of the lower bound provided by S depends solely on two things: the size of S, and the minimum Hamming distance between distinct T1, T2 ∈ S. The parameter α > 0 can then be used to control the tradeoff between x and the probability that the error will be smaller than x in Lemma 11.

Lemma 12.

Fix an unweighted graph G and denote D = diam 𝒯(G). Given a set of spanning trees S ⊆ 𝒯(G), let d > 0 be such that dH(T1, T2) > d for all distinct T1, T2 ∈ S. Finally, let 𝒜 be a mechanism for releasing a minimum spanning tree of G. If 𝒜 is ε-differentially private with respect to ∼1, then there exist weights 𝐰 such that

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + (d/D) · (log|S|/(8ε) − 1/4)] ≤ 1/√|S|.

If 𝒜 is ε-differentially private with respect to ∼∞, then there exist weights 𝐰 such that

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + d · (log|S|/(4ε) − 1/2)] ≤ 1/√|S|.
Proof.

Let us walk through the proof under the assumption that 𝒜 is ε-differentially private with respect to ∼1. At the very end, we will discuss the changes needed for ∼∞.

Fix a parameter α and define 𝒲 as {α𝟙T : T ∈ S}. Later, we will specify the value of α that gives the needed bound.

Fix 𝐰0 = α𝟙T0 ∈ 𝒲 arbitrarily. In order to apply Lemma 11, we need to determine r and x. For each 𝐰 = α𝟙T ∈ 𝒲, we have ‖𝐰 − 𝐰0‖1 = 2α · dH(T, T0) ≤ 2αD, and thus we can take r = ⌈2αD⌉ ≤ 2αD + 1.

Set x = αd/2. Then for every 𝐰 = α𝟙T ∈ 𝒲, we have:

ℬ𝐰 = {T′ ∈ 𝒯(G) : 𝐰(T′) ≤ 𝐰(T*𝐰) + x} = {T′ ∈ 𝒯(G) : α𝟙T(T′) ≤ αd/2}
= {T′ ∈ 𝒯(G) : |T′ ∖ T| ≤ d/2} = {T′ ∈ 𝒯(G) : dH(T, T′) ≤ d/2}.

We can conclude that all ℬ𝐰 are disjoint: if we had T′ ∈ ℬ𝐰1 ∩ ℬ𝐰2 for distinct 𝐰1 = α𝟙T1 ∈ 𝒲 and 𝐰2 = α𝟙T2 ∈ 𝒲, then we would have dH(T1, T2) ≤ dH(T1, T′) + dH(T′, T2) ≤ d, which would contradict the properties of d, as T1, T2 ∈ S.

Now we apply Lemma 11 to get that, for every ε-differentially private mechanism 𝒜 for releasing the MST of G, there exist weights 𝐰 ∈ 𝒲 such that

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + αd/2] ≤ exp(rε)/|S| ≤ exp((2αD + 1)ε)/|S|.

Setting α = log|S|/(4εD) − 1/(2D) yields:

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + (d/D) · (log|S|/(8ε) − 1/4)] ≤ 1/√|S|.

Finally, let us deal with the case when 𝒜 is ε-differentially private with respect to ∼∞ instead. The idea of the proof is identical, and we only point out the differences. When calculating r, we now get that ‖𝐰 − 𝐰0‖∞ ≤ α, and thus r = ⌈α⌉ ≤ α + 1. By Lemma 11, there exist weights 𝐰 such that

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + αd/2] ≤ exp((α + 1)ε)/|S|.

Setting α = log|S|/(2ε) − 1 yields:

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + d · (log|S|/(4ε) − 1/2)] ≤ 1/√|S|.

Lemma 12 reduces our problem to that of finding a set S ⊆ 𝒯(G) which is as large as possible and in which, at the same time, the Hamming distance between any two spanning trees is large. In Section 5, we prove the following two results:

Lemma 13.

For any unweighted graph G with diam 𝒯(G) = D, there exists a set S ⊆ 𝒯(G) of size 2^Θ(D) such that for any distinct T1, T2 ∈ S, we have dH(T1, T2) = Θ(D).

Lemma 14.

For any unweighted clique G on n > 2 vertices, there exists a set S ⊆ 𝒯(G) of size 2^Θ(n log n) such that for any distinct T1, T2 ∈ S, we have dH(T1, T2) = Θ(n).

Now we are ready to prove Theorem 9.

Proof of Theorem 9.

Lemma 12 says that for any mechanism 𝒜, there exist weights 𝐰 such that, if 𝒜 is ε-differentially private with respect to ∼1, then:

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + x] ≤ |S|^(−1/2), where x = (d/D) · (log|S|/(8ε) − 1/4),

for D = diam 𝒯(G) and d such that dH(T1, T2) > d for any distinct T1, T2 ∈ S. This means that Pr[𝐰(T) > 𝐰(T*) + x] ≥ 1 − |S|^(−1/2) = Ω(1), and thus, as d/D ≤ 1:

𝔼[𝐰(T)] ≥ 𝐰(T*) + Ω(x) = 𝐰(T*) + Ω(d · log|S| / (Dε)) − 𝒪(1).

Now, if S is the set from Lemma 13, we have log|S| = Θ(D) and d = Θ(D). If G = Kn, we instead take S from Lemma 14, giving log|S| = Θ(n log n) and d = Θ(n) = Θ(D) (as D = n − 1 for a clique). Substituting twice into the bound above proves the first half of the theorem.

If, instead, 𝒜 is ε-differentially private with respect to ℓ∞, there likewise exist weights 𝐰 such that:

Pr_{T∼𝒜(G,𝐰)}[𝐰(T) ≤ 𝐰(T*) + d·(log|S|/(4ε) − 1/2)] ≤ |S|^{−1/2},

and by a similar argument, 𝔼[𝐰(T)] ≥ 𝐰(T*) + Ω(d·log|S|/ε) − 𝒪(d). Now, we can again substitute for log|S| and d according to either Lemma 13 (if G is a general graph) or Lemma 14 (if G is a clique). This proves the second half of the theorem.

5 Finding a Large Set of Dissimilar Trees

In this section, our goal is to prove Lemmas 13 and 14, which assert the existence of a large set S of spanning trees such that every two spanning trees in S differ in many edges. In Section 5.1, we state some common properties of spanning trees and prove that there are at least 2^D spanning trees (for D = diam𝒯(G)). In Section 5.2, we show how to embed binary block codes into these 2^D spanning trees, which leads to our first method of constructing S and proves Lemma 13; this is the lemma that we need in order to prove our universal-optimality results.

In Section 5.3, we bound the number of trees in the d-ball around some tree T, and use this in conjunction with a greedy packing argument to provide a different method of constructing S that gives slightly different guarantees than the one in Section 5.2. We use this to prove Lemma 14; this is the lemma that we need to prove our worst-case optimality results.

5.1 Properties of Spanning Trees

Here we state some simple properties of spanning trees. Although (some of) these results could be considered folklore, we give proofs for completeness.

Lemma 15 (Exchange lemma).

Let Tx, Ty ∈ 𝒯(G) be two spanning trees and let e ∈ Ty ∖ Tx. Then there exists f ∈ Tx ∖ Ty such that T = Tx ∪ {e} ∖ {f} is also a spanning tree. Furthermore, |Ty ∖ T| = |Ty ∖ Tx| − 1.

Proof.

The graph Tx ∪ {e} has exactly one cycle C. Ty does not have cycles, and C must thus contain an edge f ∉ Ty; furthermore, f ≠ e, as e ∈ Ty. But then f ∈ C ∖ {e} ⊆ Tx and thus f ∈ Tx ∖ Ty as needed, and T = Tx ∪ {e} ∖ {f} is again a spanning tree. Finally, |Ty ∖ T| = |Ty ∖ Tx| − 1, as removing f had no effect on |Ty ∖ T| and adding e decreased it by 1.

Lemma 16 (Iterated exchange lemma).

Given two spanning trees Ta, Tb ∈ 𝒯(G) and a set of edges Q ⊆ Tb ∖ Ta, there exists a spanning tree TQ such that TQ ∖ Ta = Q and |Tb ∖ TQ| = |Tb ∖ Ta| − |Q|.

Proof.

Let k = |Tb ∖ Ta|. The claim follows by induction on |Q|. If |Q| = 0, then we can set TQ ≔ Ta. Otherwise, take any e ∈ Q and let Q′ = Q ∖ {e}. By induction, let T′ be a tree such that T′ ∖ Ta = Q′ and |Tb ∖ T′| = |Tb ∖ Ta| − |Q′| = k − |Q| + 1. Now invoke Lemma 15 with Tx = T′, Ty = Tb, and e to obtain f ∈ T′ ∖ Tb and a tree T = T′ ∪ {e} ∖ {f} satisfying |Tb ∖ T| = |Tb ∖ T′| − 1 = k − |Q|.

We claim that T ∖ Ta ⊆ Q: indeed, T′ ∖ Ta = Q′ ⊆ Q by induction, and the only edge that T additionally contains is e ∈ Q. But simultaneously, by the triangle inequality, dH(Ta,T) ≥ dH(Ta,Tb) − dH(T,Tb) = k − (k − |Q|) = |Q|. Hence, T ∖ Ta ⊆ Q and |T ∖ Ta| ≥ |Q|, and thus T ∖ Ta = Q as needed, and we can set TQ ≔ T.
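Both exchange lemmas are constructive; a minimal sketch (our own illustration, assuming trees are given as sets of frozenset edges and that the preconditions of the lemmas hold) could look as follows:

```python
def exchange(Tx, Ty, e):
    """Lemma 15: given e in Ty \\ Tx, return the tree Tx + e - f."""
    u, v = tuple(e)
    adj = {}
    for edge in Tx:
        a, b = tuple(edge)
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    # Find the tree path u -> v; together with e it forms the unique cycle.
    parent = {u: None}
    stack = [u]
    while stack:
        x = stack.pop()
        for y in adj.get(x, []):
            if y not in parent:
                parent[y] = x
                stack.append(y)
    path = []
    x = v
    while parent[x] is not None:
        path.append(frozenset((x, parent[x])))
        x = parent[x]
    # Some path edge avoids Ty: the cycle cannot lie entirely inside Ty.
    f = next(edge for edge in path if edge not in Ty)
    return (Tx - {f}) | {e}

def iterated_exchange(Ta, Tb, Q):
    """Lemma 16: return T_Q with T_Q \\ Ta == Q, for Q a subset of Tb \\ Ta."""
    T = Ta
    for e in Q:
        T = exchange(T, Tb, e)
    return T

# Example on K4: Ta a path, Tb an edge-disjoint tree, Q a subset of Tb \ Ta.
Ta = frozenset({frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))})
Tb = frozenset({frozenset((0, 2)), frozenset((0, 3)), frozenset((1, 3))})
Q = {frozenset((0, 2)), frozenset((1, 3))}
TQ = iterated_exchange(Ta, Tb, Q)
assert TQ - Ta == Q
assert len(Tb - TQ) == len(Tb - Ta) - len(Q)
```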

Corollary 17.

For every graph G, it holds that |𝒯(G)| ≥ 2^D for D = diam𝒯(G).

Proof.

Fix any two Ta, Tb ∈ 𝒯(G) such that |Tb ∖ Ta| = D. By Lemma 16, we have that for any Q ⊆ Tb ∖ Ta, there exists a spanning tree TQ such that TQ ∖ Ta = Q. As there are 2^D possible choices of Q, and each of them yields a distinct spanning tree, we can conclude that there are at least 2^D different spanning trees.
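For intuition, Corollary 17 can be verified by brute force on a small example (an exhaustive sketch for K4, whose 4^{4−2} = 16 spanning trees we enumerate directly):

```python
from itertools import combinations

def spanning_trees(n):
    """All spanning trees of the complete graph K_n, as frozensets of edges."""
    edges = [frozenset(p) for p in combinations(range(n), 2)]
    trees = []
    for cand in combinations(edges, n - 1):
        parent = list(range(n))
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        acyclic = True
        for e in cand:
            ru, rv = (find(x) for x in tuple(e))
            if ru == rv:
                acyclic = False  # cand contains a cycle
                break
            parent[ru] = rv
        if acyclic:
            trees.append(frozenset(cand))
    return trees

trees = spanning_trees(4)
D = max(len(t1 - t2) for t1 in trees for t2 in trees)  # diam_T(K4)
assert len(trees) == 4 ** (4 - 2)  # Cayley's formula
assert len(trees) >= 2 ** D        # Corollary 17
```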

5.2 Dissimilar Trees via Binary Codes

In this section, we show that the set 𝒵 of 2^D trees from Corollary 17 behaves, in some sense, like the space {0,1}^D. Namely, there is a correspondence between 𝒵 and {0,1}^D such that if 𝐱, 𝐲 ∈ {0,1}^D differ in k positions, then their corresponding spanning trees have Hamming distance at least k/2. In this way, we reduce the problem of finding a set S ⊆ 𝒵 of dissimilar spanning trees to the problem of finding a good binary block code.

Lemma 18.

Given two spanning trees Ta, Tb, let Q1, Q2 ⊆ Tb ∖ Ta. Let TQ1 and TQ2 be trees such that TQ1 ∖ Ta = Q1 and TQ2 ∖ Ta = Q2. Then it holds that Q1 ∖ Q2 ⊆ TQ1 ∖ TQ2.

We note that TQ1 and TQ2 always exist thanks to Lemma 16.

Proof.

By the rules for set subtraction, we have Q1 ∖ Q2 = (TQ1 ∖ Ta) ∖ (TQ2 ∖ Ta) = (TQ1 ∖ TQ2) ∖ Ta ⊆ TQ1 ∖ TQ2.
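Lemma 18 can likewise be checked exhaustively on a small instance (a brute-force sketch of our own: for every Q ⊆ Tb ∖ Ta we pick an arbitrary spanning tree of K4 with T ∖ Ta = Q, which exists by Lemma 16):

```python
from itertools import combinations

def spanning_trees(n):
    """All spanning trees of K_n, as frozensets of frozenset edges."""
    edges = [frozenset(p) for p in combinations(range(n), 2)]
    trees = []
    for cand in combinations(edges, n - 1):
        parent = {v: v for v in range(n)}
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        acyclic = True
        for e in cand:
            ru, rv = (find(x) for x in tuple(e))
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            trees.append(frozenset(cand))
    return trees

trees = spanning_trees(4)
Ta = frozenset({frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))})
Tb = frozenset({frozenset((0, 2)), frozenset((0, 3)), frozenset((1, 3))})

# For every Q subset of Tb \ Ta, pick some spanning tree T_Q with T_Q \ Ta == Q.
TQ = {}
for k in range(len(Tb - Ta) + 1):
    for Q in combinations(sorted(Tb - Ta, key=sorted), k):
        Qs = frozenset(Q)
        TQ[Qs] = next(t for t in trees if t - Ta == Qs)

# Lemma 18: Q1 \ Q2 is always contained in T_Q1 \ T_Q2.
for Q1, T1 in TQ.items():
    for Q2, T2 in TQ.items():
        assert (Q1 - Q2) <= (T1 - T2)
```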

We briefly recall the definition of a block code. Note that we use the less common definition of an (n,M,d)2 code where M is not the message length, but the (exponentially larger) number of codewords.

Definition 19.

An (n,M,d)₂ code is a set 𝒞 ⊆ {0,1}^n such that |𝒞| ≥ M and every two different vectors 𝐱, 𝐲 ∈ 𝒞 differ in at least d positions.

Next, we prove the reduction from finding a set of dissimilar spanning trees to finding a good block code:

Lemma 20.

Let 𝒞 be a (D, M, d+1)₂ code and let G be an unweighted graph such that D = diam𝒯(G). Then there exists a set S ⊆ 𝒯(G) such that |S| = M and dH(T1,T2) > d/2 for any two different T1, T2 ∈ S.

Proof.

We will construct S as follows: first, we fix Ta and Tb such that dH(Ta,Tb) = D, as in Corollary 17. We number the edges of Tb ∖ Ta in an arbitrary order as e1, …, eD. Now, for every 𝐱 ∈ 𝒞, define Q𝐱 ≔ {ei ∣ i ∈ {1,…,D}, 𝐱i = 1}, and let TQ𝐱 be the tree obtained by invoking Lemma 16; namely, TQ𝐱 satisfies TQ𝐱 ∖ Ta = Q𝐱. Finally, we set S = {TQ𝐱 ∣ 𝐱 ∈ 𝒞}.

Clearly |S| = |𝒞| = M, and we need to show that dH(TQ𝐱, TQ𝐲) > d/2 for all distinct 𝐱, 𝐲 ∈ 𝒞. Note that the number of positions in which 𝐱 and 𝐲 differ is exactly |Q𝐱 ∖ Q𝐲| + |Q𝐲 ∖ Q𝐱|. Now we can use Lemma 18 to write

d + 1 ≤ |Q𝐱 ∖ Q𝐲| + |Q𝐲 ∖ Q𝐱| ≤ |TQ𝐱 ∖ TQ𝐲| + |TQ𝐲 ∖ TQ𝐱| = 2·dH(TQ𝐱, TQ𝐲),

and thus dH(TQ𝐱,TQ𝐲)>d/2, as needed.

Finally, we show that there always exists a good block code:

Lemma 21.

For every n, there exists an (n, 2^{n/3}, n/6+1)₂ code.

Proof.

First assume that n is divisible by 6. By the Gilbert–Varshamov bound [14, 28], there exists an (n, K, n/6+1)₂ code for some K that satisfies 2^n/K ≤ ∑_{i=0}^{n/6} \binom{n}{i}.

Using the standard bound on the sum of binomial coefficients (see, e.g., Flum and Grohe, p. 427 [12]), we obtain ∑_{i=0}^{n/6} \binom{n}{i} ≤ 2^{n·H(1/6)}, where H(p) = −p log p − (1−p) log(1−p) is the binary entropy function. One can verify that H(1/6) ≤ 2/3, and thus K ≥ 2^n / 2^{2n/3} = 2^{n/3}, as needed.

Finally, if n is not divisible by 6, we can apply the lemma with n′ = 6⌊n/6⌋ and then pad every codeword of the resulting code with n − n′ zeros.
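A code of the kind guaranteed by Lemma 21 can also be produced by the standard greedy scan underlying the Gilbert–Varshamov bound; a small sketch for n = 6 (so minimum distance n/6 + 1 = 2):

```python
from itertools import product

def greedy_code(n, d):
    """Greedily keep every vector at Hamming distance >= d from all kept ones."""
    code = []
    for x in product((0, 1), repeat=n):
        if all(sum(a != b for a, b in zip(x, y)) >= d for y in code):
            code.append(x)
    return code

n = 6
code = greedy_code(n, n // 6 + 1)
assert len(code) >= 2 ** (n // 3)   # at least 2^(n/3) = 4 codewords
for i, x in enumerate(code):
    for y in code[:i]:
        assert sum(a != b for a, b in zip(x, y)) >= n // 6 + 1
```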

Combining these results proves Lemma 13:

Proof of Lemma 13.

Combining Lemmas 20 and 21 and setting n = D in the latter immediately yields a set S ⊆ 𝒯(G) with |S| = 2^{Θ(D)} and d = Θ(D).

5.3 Dissimilar Trees via Greedy Packing

In this section, we provide a different approach to finding a large set S of dissimilar spanning trees. It works by producing a crude upper bound U on the number of trees in the d-ball around a tree T, and then using a greedy packing argument to show that we can always find S of size |S| ≥ |𝒯(G)|/U.

Lemma 22 (Volume of a d-ball around a spanning tree).

For a graph G, a tree T ∈ 𝒯(G), and d > 0, it holds that |{T′ ∈ 𝒯(G) ∣ dH(T,T′) ≤ d}| ≤ m^d·n^d.

Proof.

Any T′ with dH(T,T′) = d′ ≤ d can be fully (and possibly non-uniquely) described by a list L₊ of d′ edges e ∈ E ∖ T to be added to T and another list L₋ of d′ edges e ∈ T to be removed. Furthermore, both lists can be padded to have length exactly d by repeating arbitrary entries. As there are at most (m−n+1)^d·(n−1)^d ≤ m^d·n^d possible pairs (L₊, L₋) ∈ (E ∖ T)^d × T^d of lists of length d, there are at most m^d·n^d possible trees T′.

Lemma 23 (Greedy packing).

For a graph G and a parameter d > 0, there exists a set S ⊆ 𝒯(G) such that dH(T1,T2) > d for all distinct T1, T2 ∈ S and, furthermore, |S| ≥ |𝒯(G)|/(m^d·n^d).

Proof.

We will construct S greedily: start with X = 𝒯(G). As long as X is nonempty, pick an arbitrary T ∈ X and add it to S. Then set X ← X ∖ {T′ ∣ dH(T,T′) ≤ d} and repeat. By Lemma 22, in each step we remove at most m^d·n^d elements from a set that initially has |𝒯(G)| elements, and therefore we only stop once S contains at least |𝒯(G)|/(m^d·n^d) elements.
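The greedy procedure from the proof is straightforward to run on a small instance (a brute-force sketch of our own on K4 with n = 4, m = 6, enumerating all 16 spanning trees explicitly):

```python
from itertools import combinations

def spanning_trees(n):
    """All spanning trees of K_n, as frozensets of frozenset edges."""
    edges = [frozenset(p) for p in combinations(range(n), 2)]
    trees = []
    for cand in combinations(edges, n - 1):
        parent = {v: v for v in range(n)}
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        acyclic = True
        for e in cand:
            ru, rv = (find(x) for x in tuple(e))
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            trees.append(frozenset(cand))
    return trees

def greedy_packing(trees, d):
    """Greedy construction from Lemma 23: repeatedly pick a tree and
    discard everything within Hamming distance d of it."""
    X, S = set(trees), []
    while X:
        T = X.pop()
        S.append(T)
        X = {Tp for Tp in X if len(T - Tp) > d}
    return S

n, m, d = 4, 6, 1
trees = spanning_trees(n)
S = greedy_packing(trees, d)
assert all(len(T1 - T2) > d for T1 in S for T2 in S if T1 != T2)
assert len(S) >= len(trees) / (m ** d * n ** d)  # Lemma 23's size guarantee
```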

Now we are ready to prove Lemma 14:

Proof of Lemma 14.

We invoke Lemma 23 on G with d = (n−2)/6. Since G is a clique, we have |𝒯(G)| = n^{n−2} by Cayley's formula. Clearly, d = Θ(n) and |S| ≤ |𝒯(G)| = 2^{𝒪(n log n)}, and we can write

|S| ≥ |𝒯(G)|/(m^d·n^d) ≥ |𝒯(G)|/n^{3d} = |𝒯(G)|/n^{(n−2)/2} = |𝒯(G)|/√|𝒯(G)| = √|𝒯(G)| = 2^{Ω(n log n)}.

Lemma 22 also gives us the following relationship between |𝒯(G)| and D:

Lemma 24.

For a graph G with diam𝒯(G) = D, it holds that |𝒯(G)| ≤ 2^{3D log₂ n}.

Proof.

Take any T ∈ 𝒯(G) and define S = {T′ ∈ 𝒯(G) ∣ dH(T,T′) ≤ D}. Necessarily S = 𝒯(G), by the definition of D. Now we invoke Lemma 22 to conclude that |𝒯(G)| = |S| ≤ m^D·n^D ≤ n^{3D} = 2^{3D log₂ n}.

6 Universal Near-Optimality via the Exponential Mechanism

In this section, we start by showing that Algorithm 5 is in fact not universally optimal for ℓ∞. The main contribution of this section is then a proof that the exponential mechanism is universally near-optimal with respect to both ℓ1 and ℓ∞. We also show that it can be implemented in polynomial time by relying on a result of Colbourn, Myrvold, and Neufeld [6]. The properties of the exponential mechanism that we rely on are summarized in Fact 31 in Appendix A.

Our goal is to prove the following corollary. It follows from Theorem 29 (which states the upper bounds and time complexity) and Theorem 9 (which states the lower bounds).

Corollary 25.

For any ε = 𝒪(1), the exponential mechanism with loss function μ(𝐰,T) = 𝐰(T) is universally optimal up to an 𝒪(log n) factor for releasing the MST, under both the ℓ1 and the ℓ∞ neighbor relations. It can be implemented in matrix multiplication time 𝒪(n^ω).

We now show that Algorithm 5 is neither worst-case nor universally optimal. Note also that under the ℓ∞ neighbor relation, the noise magnitude used by the algorithm is optimal in the sense that any lower noise magnitude would not make the noisy weights themselves private. This can be seen as follows: with the current amount of noise, if each weight changes by 1, we lose up to ε/m privacy on each edge. Since the composition theorem for pure differential privacy is tight, we may indeed lose up to ε privacy in total. Any lower amount of noise would therefore not give ε-differential privacy.

Claim 26.

Denote by 𝒜 Algorithm 5 used with the ℓ∞ neighbor relation. For every graph G, there exist weights 𝐰 such that 𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≥ 𝐰(T*) + Ω(mD/ε) − 𝒪(1).

Proof.

The claim follows immediately from the fact that 𝒜 is also ε/m-differentially private with respect to ℓ1, and thus, by the first part of Theorem 9 applied with privacy parameter ε/m, such weights 𝐰 must exist.

Below, we will prove that one can in fact achieve an expected error of 𝒪(D² log n/ε). This implies that Algorithm 5 is in fact neither universally nor worst-case optimal for ℓ∞.

In the rest of this section, our goal is to prove that the exponential mechanism can be implemented in polynomial time and that it is universally near-optimal for releasing the MST. We start by stating a useful result on efficiently sampling spanning trees.

Lemma 27.

There is an algorithm that, given an unweighted graph G, a weight vector 𝐰, and a parameter λ, samples a spanning tree of G such that the probability that T ∈ 𝒯(G) is returned is proportional to exp(−λ·𝐰(T)). It runs in matrix multiplication time 𝒪(n^ω).

Proof.

Such an algorithm is provided by Colbourn, Myrvold, and Neufeld [6]. The original paper only deals with unweighted graphs, but, as noted by Durfee et al. [8], the approach is easily generalized to the weighted case.

Note that, in general, any exact spanning tree sampling algorithm that supports weighted graphs works for Lemma 27. There are faster algorithms available, but every faster algorithm known to us either does not sample from the exact distribution (failing, or sampling from a slightly different distribution, with some small probability δ), or does not support weighted graphs out of the box.
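For intuition, the target distribution of Lemma 27 can be materialized by brute force on a small graph (an exponential-time sketch of the distribution only, not the 𝒪(n^ω) algorithm of [6]; the edge weights are arbitrary illustrative values):

```python
import math
import random
from itertools import combinations

def spanning_trees(n):
    """All spanning trees of K_n, as frozensets of frozenset edges."""
    edges = [frozenset(p) for p in combinations(range(n), 2)]
    trees = []
    for cand in combinations(edges, n - 1):
        parent = {v: v for v in range(n)}
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        acyclic = True
        for e in cand:
            ru, rv = (find(x) for x in tuple(e))
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            trees.append(frozenset(cand))
    return trees

# Arbitrary illustrative weights on the edges of K4.
w = {frozenset((0, 1)): 0.2, frozenset((0, 2)): 1.0, frozenset((0, 3)): 0.4,
     frozenset((1, 2)): 0.3, frozenset((1, 3)): 0.9, frozenset((2, 3)): 0.1}
lam = 2.0

trees = spanning_trees(4)
scores = [math.exp(-lam * sum(w[e] for e in T)) for T in trees]
Z = sum(scores)
probs = [s / Z for s in scores]

assert abs(sum(probs) - 1.0) < 1e-12
# The (unique) MST receives the largest probability mass.
mst = min(trees, key=lambda T: sum(w[e] for e in T))
assert probs[trees.index(mst)] == max(probs)

sample = random.choices(trees, weights=probs)[0]  # one draw from the distribution
assert sample in trees
```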

The following lemma allows us to analyze the exponential mechanism more tightly by changing the loss function so that the outcome probabilities do not change, but the global sensitivity decreases.

Lemma 28.

Let μ, μ′ : 𝒳 × 𝒴 → ℝ be two loss functions related in the following way: for each x ∈ 𝒳, there exists c_x ∈ ℝ such that for all y ∈ 𝒴, μ′(x,y) = μ(x,y) + c_x.

Given λ ∈ ℝ, let 𝒜 and 𝒜′ be instantiations of the exponential mechanism that, given x ∈ 𝒳, sample y ∈ 𝒴 with probability proportional to exp(−λ·μ(x,y)) and exp(−λ·μ′(x,y)), respectively. Then 𝒜 and 𝒜′ are equivalent; that is, for each x, they return the same distribution on 𝒴.

Proof.

𝒜′(x) returns y with probability

exp(−λμ′(x,y)) / ∑_{y′∈𝒴} exp(−λμ′(x,y′)) = exp(−λc_x)·exp(−λμ(x,y)) / ∑_{y′∈𝒴} exp(−λc_x)·exp(−λμ(x,y′)) = exp(−λμ(x,y)) / ∑_{y′∈𝒴} exp(−λμ(x,y′)),

which is exactly the probability of 𝒜(x) returning y.
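Lemma 28 is a shift-invariance property of the softmax-style weights used by the exponential mechanism, and is easy to confirm numerically (a sketch with arbitrary illustrative losses and shift):

```python
import math

def exp_mech_probs(losses, lam):
    """Distribution proportional to exp(-lam * mu) over the outcomes."""
    scores = [math.exp(-lam * mu) for mu in losses]
    Z = sum(scores)
    return [s / Z for s in scores]

losses = [0.3, 1.7, 0.9, 2.4]
c = 5.0    # the shift c_x; any constant works
lam = 0.8

p = exp_mech_probs(losses, lam)
p_shifted = exp_mech_probs([mu + c for mu in losses], lam)

# Shifting every loss by the same constant leaves the distribution unchanged.
assert all(abs(a - b) < 1e-12 for a, b in zip(p, p_shifted))
```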

Theorem 29.

There is an 𝒪(n^ω)-time mechanism 𝒜 for MST, ε-differentially private with respect to ℓ1, such that, for every weighted graph (G,𝐰),

𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≤ 𝐰(T*) + 2·log|𝒯(G)|/ε = 𝐰(T*) + 𝒪(D·log n/ε),

where T* is the MST of (G,𝐰) and D = diam𝒯(G). Furthermore, there is an 𝒪(n^ω)-time mechanism for MST, ε-differentially private with respect to ℓ∞, such that, for every weighted graph (G,𝐰),

𝔼_{T∼𝒜(G,𝐰)}[𝐰(T)] ≤ 𝐰(T*) + 4D·log|𝒯(G)|/ε = 𝐰(T*) + 𝒪(D²·log n/ε).
Proof.

The asymptotic bounds on the error follow from the exact ones by Lemma 24; thus, we only focus on the exact inequalities. Let us consider the ℓ∞ case first; we will deal with the ℓ1 case at the end of the proof.

𝒜 will be an instantiation of the exponential mechanism from Lemma 27, with μ(𝐰,T) ≔ 𝐰(T) and with λ determined later.

We will use Lemma 28 to analyze an equivalent exponential mechanism 𝒜′ that uses the loss function μ′ defined as follows: fix globally some T0 ∈ 𝒯(G) and define μ′(𝐰,T) ≔ 𝐰(T) − 𝐰(T0). For every fixed 𝐰, we can write μ′(𝐰,T) = μ(𝐰,T) + c_𝐰, where c_𝐰 = −𝐰(T0) does not depend on T, and thus 𝒜 and 𝒜′ are equivalent by Lemma 28.

Let us choose the right λ for 𝒜′ (and thus also for 𝒜). By the standard properties of the exponential mechanism (see Fact 31), 𝒜′ is ε-differentially private if we choose λ ≤ ε/(2Δ′), where Δ′ is the global sensitivity of μ′, as defined in Fact 31. As a next step, we bound Δ′. For every pair of neighboring 𝐰 ∼ 𝐰′, we have:

|μ′(𝐰,T) − μ′(𝐰′,T)| = |(𝐰 − 𝐰′)(T) − (𝐰 − 𝐰′)(T0)|
 = |∑_{e∈T∖T0} (𝐰 − 𝐰′)(e) − ∑_{e∈T0∖T} (𝐰 − 𝐰′)(e)|
 ≤ ∑_{e∈T∖T0} |(𝐰 − 𝐰′)(e)| + ∑_{e∈T0∖T} |(𝐰 − 𝐰′)(e)|
 ≤ 2·dH(T0,T) ≤ 2·R0,

for R0 ≔ max_{T∈𝒯(G)} dH(T0,T). We used the fact that edges present in both T and T0 do not count towards the result. Hence, Δ′ ≤ 2R0 ≤ 2D. If we thus choose λ = ε/(4R0) in 𝒜′, we immediately get from Fact 31 that 𝒜′ is ε-differentially private and the expected error is at most 4R0·log|𝒯(G)|/ε ≤ 4D·log|𝒯(G)|/ε, as needed. By the equivalence of 𝒜 and 𝒜′, the same holds for 𝒜.

Finally, note that 𝒜 can compute R0 (and thus λ) quickly: namely, if we denote by T′ the MST of the graph (G, 𝟙T0), then R0 = dH(T0,T′). That is because T′ minimizes the expression 𝟙T0(T) = |T ∩ T0|, which, by Fact 2, equals (n−1) − dH(T0,T); T′ thus maximizes dH(T0,T), just as needed. The MST can be computed in linear time using, e.g., the Jarník–Prim algorithm with a double-ended queue as the priority queue, as all weights are either 0 or 1.
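The computation of R0 can be sketched as follows (an illustration of our own on K4; for brevity we use Kruskal's algorithm instead of the Jarník–Prim variant mentioned above, and the exhaustive checks of R0 and D are for validation only):

```python
from itertools import combinations

def find(parent, v):
    while parent[v] != v:
        v = parent[v]
    return v

def is_tree(cand, n):
    parent = {v: v for v in range(n)}
    for e in cand:
        ru, rv = (find(parent, x) for x in tuple(e))
        if ru == rv:
            return False
        parent[ru] = rv
    return True

def kruskal(n, edges, weight):
    """Minimum spanning tree of (range(n), edges) under the given weight map."""
    parent = {v: v for v in range(n)}
    tree = set()
    for e in sorted(edges, key=weight):
        ru, rv = (find(parent, x) for x in tuple(e))
        if ru != rv:
            parent[ru] = rv
            tree.add(e)
    return frozenset(tree)

n = 4
edges = [frozenset(p) for p in combinations(range(n), 2)]
trees = [frozenset(c) for c in combinations(edges, n - 1) if is_tree(c, n)]

T0 = frozenset({frozenset((0, 1)), frozenset((0, 2)), frozenset((0, 3))})  # a star
# The MST under the indicator weights avoids the edges of T0 as much as possible.
Tp = kruskal(n, edges, weight=lambda e: 1 if e in T0 else 0)
R0 = len(T0 - Tp)

assert R0 == max(len(T0 - T) for T in trees)       # R0 = max_T d_H(T0, T)
D = max(len(t1 - t2) for t1 in trees for t2 in trees)
assert D / 2 <= R0 <= D                             # Corollary 30
```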

Let us now deal with the ℓ1 case. 𝒜 will again be an instantiation of the exponential mechanism. This time, we set λ = ε/2 and analyze 𝒜 directly, without the use of Lemma 28. For any neighboring 𝐰 ∼ 𝐰′ and T ∈ 𝒯(G), we immediately have |𝐰(T) − 𝐰′(T)| ≤ ‖𝐰 − 𝐰′‖1 ≤ 1, and thus the global sensitivity of μ is Δ ≤ 1. By Fact 31, 𝒜 is ε-differentially private and the expected error is at most 2·log|𝒯(G)|/ε, exactly as needed.

Since the quantity R0 computed in the above proof satisfies D/2 ≤ R0 ≤ D, we immediately get the following corollary:

Corollary 30.

A 2-approximation of D=diam𝒯(G) can be computed in linear time.

Note that the above algorithm is in fact strictly stronger than Algorithm 5, as there are graphs where log|𝒯(G)| = o(D log n). We suspect that the exponential mechanism is, in fact, universally optimal for both ℓ1 and ℓ∞, but we were not able to prove a stronger lower bound.

References

  • [1] Hilal Asi and John C. Duchi. Instance-optimality in differential privacy via approximate inverse sensitivity mechanisms. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, volume 33, pages 14106–14117, 2020. URL: https://proceedings.neurips.cc/paper/2020/hash/a267f936e54d7c10a2bb70dbe6ad7a89-Abstract.html.
  • [2] Raef Bassily, Kobbi Nissim, Adam D. Smith, Thomas Steinke, Uri Stemmer, and Jonathan R. Ullman. Algorithmic stability for adaptive data analysis. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, pages 1046–1059. ACM, 2016. doi:10.1145/2897518.2897566.
  • [3] Jaroslaw Błasiok, Mark Bun, Aleksandar Nikolov, and Thomas Steinke. Towards instance-optimal private query release. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 2480–2497. SIAM, SIAM, 2019. doi:10.1137/1.9781611975482.152.
  • [4] Solenn Brunet, Sébastien Canard, Sébastien Gambs, and Baptiste Olivier. Edge-calibrated noise for differentially private mechanisms on graphs. In 14th Annual Conference on Privacy, Security and Trust, PST 2016, Auckland, New Zealand, December 12-14, 2016, pages 42–49. IEEE, IEEE, 2016. doi:10.1109/PST.2016.7906935.
  • [5] Justin Y. Chen, Shyam Narayanan, and Yinzhan Xu. All-pairs shortest path distances with differential privacy: Improved algorithms for bounded and unbounded weights. CoRR, abs/2204.02335, 2022. doi:10.48550/arXiv.2204.02335.
  • [6] Charles J. Colbourn, Wendy J. Myrvold, and Eugene Neufeld. Two algorithms for unranking arborescences. Journal of Algorithms, 20(2):268–281, 1996. doi:10.1006/jagm.1996.0014.
  • [7] Wei Dong and Ke Yi. A nearly instance-optimal differentially private mechanism for conjunctive queries. In PODS ’22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 – 17, 2022, pages 213–225. ACM, 2022. doi:10.1145/3517804.3524143.
  • [8] David Durfee, Rasmus Kyng, John Peebles, Anup B. Rao, and Sushant Sachdeva. Sampling random spanning trees faster than matrix multiplication. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 730–742. ACM, 2017. doi:10.1145/3055399.3055499.
  • [9] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography, Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006, Proceedings, volume 3876 of Lecture Notes in Computer Science, pages 265–284. Springer, Springer, 2006. doi:10.1007/11681878_14.
  • [10] Chenglin Fan, Ping Li, and Xiaoyun Li. Private graph all-pairwise-shortest-path distance release with improved error rate. In Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 – December 9, 2022, volume 35, pages 17844–17856, 2022. URL: http://papers.nips.cc/paper_files/paper/2022/hash/71b17f00017da0d73823ccf7fbce2d4f-Abstract-Conference.html.
  • [11] Natasha Fernandes, Annabelle McIver, Catuscia Palamidessi, and Ming Ding. Universal optimality and robust utility bounds for metric differential privacy. In 35th IEEE Computer Security Foundations Symposium, CSF 2022, Haifa, Israel, August 7-10, 2022, pages 348–363. IEEE, August 2022. doi:10.1109/CSF54842.2022.9919647.
  • [12] Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2006. doi:10.1007/3-540-29953-X.
  • [13] Arpita Ghosh, Tim Roughgarden, and Mukund Sundararajan. Universally utility-maximizing privacy mechanisms. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 – June 2, 2009, pages 351–360. ACM, 2009. doi:10.1145/1536414.1536464.
  • [14] Edgar N Gilbert. A comparison of signalling alphabets. The Bell system technical journal, 31(3):504–522, 1952.
  • [15] Bernhard Haeupler, Richard Hladík, Václav Rozhon, Robert E. Tarjan, and Jakub Tetek. Universal optimality of dijkstra via beyond-worst-case heaps. In 65th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2024, Chicago, IL, USA, October 27-30, 2024, pages 2099–2130. IEEE, 2024. doi:10.1109/FOCS61266.2024.00125.
  • [16] Bernhard Haeupler, Harald Räcke, and Mohsen Ghaffari. Hop-constrained expander decompositions, oblivious routing, and distributed universal optimality. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 – 24, 2022, pages 1325–1338. ACM, 2022. doi:10.1145/3519935.3520026.
  • [17] Bernhard Haeupler, David Wajc, and Goran Zuzic. Universally-optimal distributed algorithms for known topologies. In STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1166–1179. ACM, 2021. doi:10.1145/3406325.3451081.
  • [18] Michael Hay, Chao Li, Gerome Miklau, and David D. Jensen. Accurate estimation of the degree distribution of private networks. In ICDM 2009, The Ninth IEEE International Conference on Data Mining, Miami, Florida, USA, 6-9 December 2009, pages 169–178. IEEE, IEEE Computer Society, 2009. doi:10.1109/ICDM.2009.11.
  • [19] Ziyue Huang, Yuting Liang, and Ke Yi. Instance-optimal mean estimation under differential privacy. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, volume 34, pages 25993–26004, 2021. URL: https://proceedings.neurips.cc/paper/2021/hash/da54dd5a0398011cdfa50d559c2c0ef8-Abstract.html.
  • [20] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2007), October 20-23, 2007, Providence, RI, USA, Proceedings, pages 94–103. IEEE, IEEE Computer Society, 2007. doi:10.1109/FOCS.2007.41.
  • [21] Kobbi Nissim, Sofya Raskhodnikova, and Adam D. Smith. Smooth sensitivity and sampling in private data analysis. In Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San Diego, California, USA, June 11-13, 2007, pages 75–84. ACM, 2007. doi:10.1145/1250790.1250803.
  • [22] Rasmus Pagh and Lukas Retschmeier. Faster private minimum spanning trees. CoRR, abs/2408.06997, 2024. doi:10.48550/arXiv.2408.06997.
  • [23] Rafael Pinot. Minimum spanning tree release under differential privacy constraints. CoRR, abs/1801.06423, 2018. doi:10.48550/arXiv.1801.06423.
  • [24] Rafael Pinot, Anne Morvan, Florian Yger, Cédric Gouy-Pailler, and Jamal Atif. Graph-based clustering under differential privacy. In Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, UAI 2018, Monterey, California, USA, August 6-10, 2018, pages 329–338. AUAI Press, 2018. URL: http://auai.org/uai2018/proceedings/papers/132.pdf.
  • [25] Václav Rozhon, Christoph Grunau, Bernhard Haeupler, Goran Zuzic, and Jason Li. Undirected (1+ϵ)-shortest paths via minor-aggregates: near-optimal deterministic parallel and distributed algorithms. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 – 24, 2022, pages 478–487. ACM, 2022. doi:10.1145/3519935.3520074.
  • [26] Adam Sealfon. Shortest paths and distances with differential privacy. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 – July 01, 2016, pages 29–41. ACM, 2016. doi:10.1145/2902251.2902291.
  • [27] Salil Vadhan. The complexity of differential privacy. Tutorials on the Foundations of Cryptography: Dedicated to Oded Goldreich, pages 347–450, 2017. doi:10.1007/978-3-319-57048-8_7.
  • [28] Rom Rubenovich Varshamov. Estimate of the number of signals in error correcting codes. Doklady Akad. Nauk, SSSR, 117:739–741, 1957.
  • [29] Goran Zuzic, Gramoz Goranci, Mingquan Ye, Bernhard Haeupler, and Xiaorui Sun. Universally-optimal distributed shortest paths and transshipment via graph-based 1-oblivious routing. In Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, Virtual Conference / Alexandria, VA, USA, January 9 – 12, 2022, pages 2549–2579. SIAM, SIAM, 2022. doi:10.1137/1.9781611977073.100.

Appendix A Exponential Mechanism

For completeness, we restate the guarantees of the exponential mechanism.

Fact 31 (Guarantees of the exponential mechanism [20, 2]).

Let μ : 𝒳 × 𝒴 → ℝ be a loss function, and let ∼ be a neighbor relation on 𝒳. The exponential mechanism 𝒜 : 𝒳 → 𝒴 that, given x ∈ 𝒳, samples y ∈ 𝒴 with probability proportional to exp(−(ε/(2Δ))·μ(x,y)) is ε-differentially private and satisfies:

𝔼_{y∼𝒜(x)}[μ(x,y)] ≤ μ(x,y*) + 2Δ·log|𝒴|/ε,

where y* is the minimizer of μ(x,·) and Δ is the global sensitivity of μ, defined as

Δ = sup_{x∼x′∈𝒳} max_{y∈𝒴} |μ(x,y) − μ(x′,y)|.
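For concreteness, the guarantee of Fact 31 can be checked on a toy instance (illustrative losses of our own; Δ is taken as given, since the sensitivity depends on the neighbor relation):

```python
import math

def exp_mechanism_distribution(losses, eps, delta):
    """Probabilities proportional to exp(-eps * mu / (2 * delta))."""
    scores = [math.exp(-eps * mu / (2 * delta)) for mu in losses]
    Z = sum(scores)
    return [s / Z for s in scores]

losses = [0.0, 1.0, 1.0, 3.0, 5.0]   # mu(x, y) for each y in a toy outcome set Y
eps, delta = 1.0, 1.0

p = exp_mechanism_distribution(losses, eps, delta)
expected = sum(pi * mu for pi, mu in zip(p, losses))

# Fact 31: E[mu] <= min(mu) + 2 * delta * log|Y| / eps.
assert abs(sum(p) - 1.0) < 1e-12
assert expected <= min(losses) + 2 * delta * math.log(len(losses)) / eps
```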