CN5 Generating function


Aims

  • make statements about the properties of a network if we only know the degree distribution
  • properties of generating function
  • analyze the friendship paradox

Questions

  • what is a generating function?
    • a function to encode a sequence of integer probability.
  • what does the $m$-th power of a generating function $G_0(x)$​ generate?
    • $$
      \langle k^m\rangle
      $$

    • derivate and multiplied by $x$ and set $x$ as 1.

  • How is the second raw moment of a distribution related to its variance?
    • variance = second RM - first RM
  • How is the first raw moment of a distribution related to its mean?
    • equal??? Exactly.
  • What does the composition $G_1(G_0(x)$ generate?
  • what is the friendship paradox?
    • the average friends number of your friends is larger than the average friends number of yours.
  • what practical applications of the friendship paradox can you think of?

Generating functions

it provide a general tool for us to make analytical statements about the expected properties of networks generated from a statistical ensemble with arbitrary degree distributions.

Functions

  • it is maybe the most important methodology in the macroscopic study of complex networks
  • it explains why degree distributions are such an important aggregate characteristic of complex networks.
  • it is helpful to understand how widely known results about networks are derived
  • it will help us to understand the limitation of the ensemble perspective.
  • It is a fancy way to encode a sequence of numbers.

Definition

Given an integer probability sequence $(p(1), p(2),\cdots, p(n),\cdots)$​​​, where $\sum_i p(i)=1$​​​,
$$
G_0(x)=\sum_{i=1} p(i)x^i, x\in[0,1]
$$

  • the probability: the object must be an integer,
    • could be 0,
    • could be infinite.
  • $x$ must in the range from 0 to 1.

Calculation

Derivation

The probability

$P(k)$ is the parameter of $x^k$​​, so we have
$$
[\frac{1}{k!}\frac{\mathrm{d}^k}{\mathrm{d}x^k}G_0(x)]_{x=0}=P(k)
$$

  • derivate by $k$ times, to eliminate the items with power less than $k$​.
  • set $x$ as 0, to eliminate the items with power larger than $k$.
  • divided by $k!$ to eliminate the power influence on the parameter.

The average degree

$$
G_0’(1)=[ \sum_{k=0}^\infty k P(k) x^{k-1} ]_{x=1}
$$

$$
= \sum_{k=0} kP(k)=\langle k \rangle
$$

  • take the first derivative, to pull down the integer.
  • set $x$​ as 1, to eliminate the $x$​’s influence.

$m$-th raw moment of $P(k)$​

$$
[(x\frac{\mathrm{d}}{\mathrm{d}x})^m G_0(x)]{x=1}= \sum{k=0}^\infty k^m P(k)=\langle k^m\rangle
$$

  • the $k$ is on the index, so take derivative to pull it as a multiplier.
  • multiplied by $x$​ to make up the “$k$”​. Actually, only the first $k-1$​ times matter. The last $x$​ is just for the form unified.
  • set $x$ as 1, to keep $P(k)$ while eliminating the $x$’s influence.

Multiplication

Suppose $Z=X+Y$,
$$
G_Z(x)=G_{X+Y}(x)=G_X(x)G_Y(x)
$$
the distribution of the sum of two random independent variable $X$ and $Y$.

From the multiplication, we have many variants:

A “new” variable $X$

$xG_0(x)$ generates distribution of $P(X=k-1)=P(X+1=k)$.

We can understand by

  • we take $G_Y(x)=0+x+0\cdot x^2+\cdots$ . It corresponds to a random variable $Y$, which is always $1$, $P(Y=1)=1$
  • So $xG_0(x)=G_X(x)G_Y(x)$​, means the sum of one variable follows $P_X(x)$ and the other one, which is always 1.

$\frac{1}{x}G_0(x)$ generates distribution of $P(X=k+1)=P(X-1=k)$

Similarly, it can be understood by add a random variable, which is always be -1. cause $x^{-1}=\frac{1}{x}$

Power

This is a variant of multiplication.

m-th power $[G_0(x)]^m$ generates distribution for the sum of $m$ independent realizations of random variable $X$, given the distribution of $X$​.

e.g.
$$
[G_0(x)]^2=\sum_{i+j=0}^\infty P(i)P(j)x^{i+j}
$$

$$
[G_0(x)]^m=\sum_{k_1+k_2+\cdots + k_m=0}^\infty P(k_1)P(k_2)\cdots P(k_m)x^{k_1+k_2+\cdots k_m}
$$

Composition

$G_1(G_0(x))$​ generates the sum of a random number $n_i$​ of variable $X$
$$
G_1(G_0(x))=\sum_{k=0}P_1(k)[G_0(x)]^k
$$

  • a random number $n_i$ follows the distribution given by $G_1(x)$
  • the variable $X$ follows the distribution given by $G_0(x)$

Example: we have two dice, one with six sides, and the second one with eight sides.

  • We throw the 6-one to decide how many times we throws the 8-one.
  • we sum the number of the 8-one.
  • we get the distribution of this sum.

“0” and “1”

  • $$
    G_0(1)=\sum_{k=0}P(k)\cdot 1 = 1
    $$

  • $$
    G_0(0)=\sum_{i=1} p(i)0^i=0
    $$

Applications

Networks

$G_0(x)$ generates the degree distribution.

Friendship paradox

Your friends have, on average, more friends than you.

Mathematically, $\langle k_n\rangle>\langle k\rangle$

  • $\langle k\rangle$​ can be achieved by $G_0(x)$​, which generates the given degree distribution. $\langle k\rangle=\frac{1}{N}\sum_{i=1}^N k_i$​

  • $\langle k_n\rangle$​ means the average number of node’s neighbors degree. Mathematically, $\langle k_n\rangle=\frac{1}{N}\sum_{i=1}^N\bar{k}_i$​

    • where $\bar{k}_i$​​​​ means how many neighbors on average, does node $i$​​​​’s neighbors have. Mathematically, $\bar{k}i=\frac{1}{k_i} \sum{j\in I_i} d_j$​​​​
      • $I_i$​​​ denotes the set of node $i$​​​’s neighbor.
  • Trick: if we can know the degree distribution of a node’s neighbor from the original node distribution, we will solve this problem. That is the probability $P_1(X=k)$ means a neighbor of a node has degree $k$.

The degree distribution of a node’s neighbor

We define $P_1(X=k)$ as above. This probability means I randomly select a node’s neighbor, the probability of its degree is $k$.

  • If a node has degree $k$​, it has $k$​ times to be others’ neighbor. So, $P_1(X=k)\sim k P_0(k)$​.

  • we normalize by the sum of $P_1(X=k)$ over $k$​. It is
    $$
    \sum_k P_1(k)=\sum kP_0(k)=\langle k\rangle
    $$

  • So, we get
    $$
    P_1(X=k)= \frac{k P_0(k)}{\langle k\rangle}
    $$

  • We can define $G_1(x)$​ by $P_1(X=k)$​, not including the edge linking these two nodes.
    $$
    G_1(x)=\frac{1}{x}\sum_{i=1}P_1(i)x^i=\frac{G_0’(x)}{\langle k\rangle}
    $$

  • The average degree
    $$
    \langle k_n\rangle = 1+G_1’(1)=1+\frac{G_0’’(1)}{G_0’(1)}
    $$

    $$
    =\sum_{i=1}iP_1(i)=\sum_{i=1}i^2\frac{P_0(i)}{\langle k\rangle}=\frac{\langle k^2\rangle}{\langle k\rangle}
    $$

  • The key is the variance of the degree sequence!!!

Variant

Also, we can define $P_2(X=k)=P_1(X=k+1)$. The probability of a degree of a node’s neighbor

According to the calculation property of generating function, it is easy to get

  • $G_2(x)=xG_1(x)=\frac{xG_0’(x)}{\langle k\rangle}$

Besides, we can define $G_0(G_1(x)$ : the distribution of the number of second-neighbor.

Applications

  • vaccination strategy: Instead of sampling random individuals, we ask them to name a random friend and then offer this friend to receive a vaccination. This simple strategy has a larger impact than the vaccination of a random sample of the population.
  • explore topology. Study from the friends to get upper limit.

Author: Fululu
Reprint policy: All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Fululu !
  TOC