CN7 Scale-Free Network and limitations of Ensemble Studies


Objectives

  • analyzing robustness with generating functions
  • robustness of scale-free network
  • limitations of ensemble-based approach
  • AS-level Internet topology

Questions

  • what is the critical failure probability $q_c$ for random networks with arbitrary degree distribution?
    • the failure probability making the average clustering size goes to finite.

$$
q_c=1-(\frac{\langle k^2\rangle}{\langle k\rangle}-1)^{-1}
$$

  • what is $q_c$ for $G(n,p)$ model?

$$
q_c=1-\frac{1}{np}
$$

  • what are scale-free network? How can we characterize their degree distribution in the finite and infinite case?
    • the degree distribution follows power-law distribution
    • finite: $P(k)=k^{-\gamma}(\sum_{i=1}^N i^{-\gamma})^{-1}=\frac{k^{-\gamma}}{\sum_{i=1}i^{-\gamma}}$​
    • infinite: $P(k)=\frac{k^{-\gamma}}{\zeta(\gamma)}$
  • what is $q_c$ for a scale-free network with exponent $\gamma$ in the limit of infinite size?
    • $q_c= 1-(\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}-1)^{-1}$
  • what is the reason for the discrepancy between the predicted and the empirical surviving lcc for random failures in scale-free networks?
    • degree distribution
    • finite nodes
    • different ensemble methods.
  • what is the robust-yet-fragile nature of random scale-free network?
    • critical failure rate approximates to 1.
    • relies heavily on few important nodes.
  • explain three possible fallacies that can occur when applying the ensemble perspective to real systems?
    • degree distribution
    • finite/infinite
    • ensemble methods

Keywords

  • critical failure probability
  • scale-free network
  • ensemble methods fallacies

Robustness

Related to the giant connected components: at which point does the removal of nodes destroy the giant connected component of a network system?

Topological robustness: the robustness of a system’s connectedness against the loss of nodes and/or links

Assumptions

  • stochastic node failure model
    • each node has the uniform failure probability $q$
    • failed nodes and all adjacent links are removed.
    • node failures are independent.

Generating functions

We try to model the distribution of the survived connected component size. Then, see the mean size of connected components.

$H_0$

distribution of surviving component size for a randomly chosen node $v$

Think of the original expand form of $H_0$​
$$
H_0(x)= P(K=0)x^0+\sum_{i=1}^{\infty}P(K=i)x^i
$$

  • $P(K=0)$ means the node $v$ fails. The probability is $q$
  • consider the second part. Following our past method, we consider the distribution of the neighbor of node $v$. Here, we define two other generating functions:
    • $H_1$: the distribution of surviving component size of a node $w$ as a neighbor of another node $v$
    • $F_0$: like $G_0$, which describes the degree distribution of a node $v$, describe the degree distribution of a node $v$ with the failure probability $q$.
  • Then, we can rewrite $H_0$ as

$$
H_0=q+xF_0(H_1(x))
$$

$$
H_0’(x)=F_0(H_0(x))+xF_0’(H_1(x))H_1’(x)
$$

$$
\Rightarrow H_0’(1)=1-q+F_0’(1)H_1’(1)
$$

$H_1$

the distribution of surviving component size of a node $w$ as a neighbor of another node $v$

like $H_0$,$H_1$ can also be decomposed by two parts,

  • the node fails, the probability is $q$
  • the node doesn’t fail

like we calculate the surviving component size, there is “self-consistency”.
$$
H_1=q+xF_1(H_1(x))
$$

  • we include $F_1(X)$, also denote “direction”, the “children neighbor” of node $w$​.

$$
H_1’(x)=F_1(H_1(x))+F_1’(H_1(x))H_1’(x)
$$

$$
\Rightarrow H_1’(1) =1-q+F_1’(1)H_1’(1)
$$

$$
\Rightarrow H_1’(1) =\frac{1-q}{1-F_1’(1)}
$$

$F_0$

**Part of ** the distribution of a node degree, with the node failure probability $q$ what we know is a fixed degree distribution, that is $G_0$. So, we want to know $F_0$ from $G_0$ and $q$​

The probability that a node $v$​ has degree $k$​ and survives is generated by
$$
F_0(x)=\sum_{k=0}^{\infty}(1-q)P(k)x^k = (1-q)G_0(x)
$$
Note that $F_0(1)=1-q$, not 1. It is due to we only consider the surviving part of the network, not the whole part.

$F_1$

similarly, $F_1(x)=(1-q)G_1(x)$

$\langle s\rangle$​

$$
\langle s\rangle=H_0’(1) = 1-q+1\cdot F_0’(1)H_1’(1)
$$

$$
=1-q+\frac{(1-q)F_0’(1)}{1-F_1’(1)}=( 1+\frac{(1-q)G_0’(1)}{1-(1-q)G_1’(1)})(1-q)
$$

Critical failure rate

When the denominator is closed to 0, is the key point.

That is
$$
G_1’(1)\to \frac{1}{1-q}
$$
and
$$
G_1’(1)=\frac{\langle k^2\rangle}{\langle k\rangle}-1
$$
we have
$$
q_c =1-(\frac{\langle k^2\rangle}{\langle k\rangle}-1)^{-1}
$$

Remark

Since we use whether the average component size is finite to judge the robustness, it hides the assumption that $n\to\infty$

This formula gives an upper bound of the failure rate.

E-R Model

In ER model, we have
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{np+n^2p^2}{np}= 1+np
$$
So,
$$
q_c= 1-\frac{1}{np}
$$
According to the Molloy-Reed criterion, when $np>1$​, we have a giant connected component. That is $\langle k\rangle>1$

$K$-regular network

In K-regular model all nodes have the same degree $k$​.
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{k^2}{k}=k
$$
So,
$$
q_c= 1-\frac{1}{k-1}
$$
According to the Molloy-Reed criterion, we need $\langle k\rangle=k>2$

It is different from the E-R, due to the variance of nodes. It tells us the variance makes networks more robust!!

Broad distribution network

Explanations are below:
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}
$$

  • when $\gamma\in(2,3)$, the denominator is finite and the nominator is infinite.

  • so, $q_c=1-(\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}-1)^{-1}$

  • when $\gamma\in(2,3),q_c\to\infty$.

    This shows the scale-free network is super-robust against random failures.

    • In the limit of infinite network size, the scale-free network cannot be destroyed by any random failure probability.

Scale-free network

Definition

  • power law distribution
    • scale invariance: $P(\alpha k)\propto P(k)$

    • $$
      P(k)=k^{-\gamma}(\sum_{i=1}^N i^{-\gamma})^{-1}=\frac{k^{-\gamma}}{\sum_{i=1}i^{-\gamma}}
      $$

    • this is one kind of broad distribution. Other broad distribution examples: Zipf distribution.

    • $\gamma$ decides how quickly the distribution drops. The lager the quicker.

Approximation

We use Zeta distribution to imitate power-law distribution degrees with $n\to\infty$

The formula of Zeta distribution:

$$
P(k) =\frac{k^{-\gamma}}{\zeta(\gamma)}
$$

where
$$
\zeta(\gamma) =\sum_{i=1}i^{-\gamma}
$$
It is valid for $\gamma>1$

  • $\zeta(\gamma\to 1) %3D \to\infty$

  • $\langle k^m\rangle=\frac{\zeta(\gamma-m)}{\zeta(\gamma)}$

Diameter

$$
D\simeq\frac{\log\log(n)}{\log(\gamma-2)}
$$

  • for $\gamma\in(2,3)$, $\gamma-2\in(0,1)$
    • the smaller $\gamma$​, the larger the absolute value of $\log(\gamma)$ will be, the smaller D (the diameter) will be

Conclusion:

  • networks where the degree distribution has an infinite second raw moment have a very small diameter.
  • the more heterogeneous the degree distribution (smaller $\gamma$) is, the smaller diameter will be.

Robustness

Super robust. See above.

Different from ER model, the limitation of finite size matters a lot here.

Fragility

Degree-dependent failure ratio

the nodes with high degree are more likely to be attacked, or to fail.

it could be true, considering

  • targeted attack
  • load-dependent failures of nodes

The random scale-free networks are

  • robust, against random failures
  • fragile, against degree-dependent failure.

Limitations of ensemble studies

  • analytical form of degree distribution is foundation for theoretical claims.
    • e.g. log-normal distribution looks quite similar to zeta distribution.
  • finite or infinite.
    • the theoretical assumption is infinite
    • the real case is finite, even large.
  • microstates vs real system
    • different choice of ensemble methods, based on different microstates of the given systems. may generate quite different statistical results.
      • our statistical findings are based on our assumptions.

Author: Fululu
Reprint policy: All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Fululu !
  TOC