Uniform Edge Sampling from Complete k-Partite Graphs

Aug 29, 2019

This week I implemented my own version of graph2gauss, a deep learning model for node embeddings, in PyTorch. At its core is a loss function with the following structure.

i=0nj,ksp(i,j)<sp(i,k)f(i,j,k)\sum_{i = 0}^n \sum_{\scriptstyle j, k \atop\scriptstyle \operatorname{sp}(i, j) < \operatorname{sp}(i, k)} f(i, j, k)

Here the inner sum iterates over all possible pairs of nodes that have a different shortest-path distance to the node ii. In the paper, the authors derive the Monte-Carlo approximation

i=0nEj1,,jKNi,1,,Ni,K1k,lKk<lNi,jkNi,jlf(i,jk,jl)\sum_{i = 0}^n \mathbb{E}_{j_1, \dots, j_K \sim N_{i, 1}, \dots, N_{i, K}} \sum_{\scriptstyle 1 \le k, l \le K \atop\scriptstyle k < l} \left| N_{i, j_k} \right| \cdot \left| N_{i, j_l} \right| \cdot f(i, j_k, j_l)

which leverages the runtime efficiency of independent, uniform sampling from node subsets but is a bit unwieldy as far as I am concerned. I would much rather write it as

i=0nNiEj,kf(i,j,k)\sum_{i = 0}^n \left|N_i\right| \cdot \mathbb{E}_{j, k}\, f(i, j, k)

where Ni={(j,k)sp(i,j)<sp(i,k)}N_i = \{ (j, k) \mid \operatorname{sp}(i, j) < \operatorname{sp}(i, k) \} is the set of all admissible pairs and the expected value considers a uniform distribution over NiN_i. At a first glance sampling from NiN_i might be a problem because Ni|N_i| grows quadratically in the number of reachable nodes from ii. On second thought NiN_i can be seen as the edge set of a complete multi-partite graph of all neighbors of ii where the nodes are partitioned on sp(i,j)\operatorname{sp}(i, j). So the set actually has a simple form and uniform sampling should be possible without enumerating the set.

Now let G=(E,V)G = (E, V) be a complete, undirected, kk-partite graph with a set of nodes VV that are partitioned into subsets V1,,VkV_1, \dots, V_k with cardinalities n1,,nkn_1, \dots, n_k and edges

E={(u,v)uVa,vVb,ab}.E = \{ (u, v) \mid u \in V_a, v \in V_b, a \ne b \}.

So every node is connected to every other except the ones from its own partition. At first I thought that uniformly sampling uVau \in V_a from VV and then vv from VVaV \setminus V_a could work because the probabilities might cancel out just right but the following example shows that that is not the case.

Uniform node sampling leads to non-uniform edge probabilities in this complete 3-partitioned graph

We see that edges between high-degree nodes receive too little probability weight. Therefore, a strategy that samples edges uniformly needs to assign higher probability to nodes that participate in many edges. Let dud_u be the out-degree of node uu. In a graph such as GG, du=nnad_u = n - n_a where n=i=1knin = \sum_{i = 1}^k n_i and uVau \in V_a. Choose the first node proportional to its out-degree and the second one uniformly from all nodes that share an edge with the first one, i.e.

p(u)duandp(vu)=1du.p(u) \propto d_u \quad \textrm{and} \quad p(v \mid u) = \frac{1}{d_u}.

This leads to a uniform distribution over ordered pairs of nodes (u,v)(u, v)

p((u,v))=p(u)p(vu)=duC1du=1Cp((u, v)) = p(u) \cdot p(v \mid u) = \frac{d_u}{C} \cdot \frac{1}{d_u} = \frac{1}{C}

where CC is the normalization constant of p(u)p(u). But GG is undirected and (u,v)(u, v) is the same edge as (v,u)(v, u). Hence the probability of an edge e=(u,v)e = (u, v) is

p(e)=p((u,v))+p((v,u))=2C.p(e) = p((u, v)) + p((v, u)) = \frac{2}{C}.

Since p(e)p(e) is independent of ee, p(e)p(e) must of course be the uniform distribution but just to double-check we will compute CC. C=jVdjC = \sum_{j \in V} d_j is the sum of all out-degrees which would just be the number of edges in a directed version of GG. But GG is undirected so CC counts every edge twice. Therefore C=2EC = 2|E| and p(e)=1Ep(e) = \frac{1}{|E|}.

The sampling method described in this post allowed me fine-grained control over the accuracy-performance trade-off in my implementation of graph2gauss. It is however generally applicable to any problem that can be modelled as sampling from a complete kk-partite graph.