Objectives
- analyzing robustness with generating functions
- robustness of scale-free network
- limitations of ensemble-based approach
- AS-level Internet topology
Questions
- what is the critical failure probability $q_c$ for random networks with arbitrary degree distribution?
- the failure probability making the average clustering size goes to finite.
$$
q_c=1-(\frac{\langle k^2\rangle}{\langle k\rangle}-1)^{-1}
$$
- what is $q_c$ for $G(n,p)$ model?
$$
q_c=1-\frac{1}{np}
$$
- what are scale-free network? How can we characterize their degree distribution in the finite and infinite case?
- the degree distribution follows power-law distribution
- finite: $P(k)=k^{-\gamma}(\sum_{i=1}^N i^{-\gamma})^{-1}=\frac{k^{-\gamma}}{\sum_{i=1}i^{-\gamma}}$
- infinite: $P(k)=\frac{k^{-\gamma}}{\zeta(\gamma)}$
- what is $q_c$ for a scale-free network with exponent $\gamma$ in the limit of infinite size?
- $q_c= 1-(\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}-1)^{-1}$
- what is the reason for the discrepancy between the predicted and the empirical surviving lcc for random failures in scale-free networks?
- degree distribution
- finite nodes
- different ensemble methods.
- what is the robust-yet-fragile nature of random scale-free network?
- critical failure rate approximates to 1.
- relies heavily on few important nodes.
- explain three possible fallacies that can occur when applying the ensemble perspective to real systems?
- degree distribution
- finite/infinite
- ensemble methods
Keywords
- critical failure probability
- scale-free network
- ensemble methods fallacies
Robustness
Related to the giant connected components: at which point does the removal of nodes destroy the giant connected component of a network system?
Topological robustness: the robustness of a system’s connectedness against the loss of nodes and/or links
Assumptions
- stochastic node failure model
- each node has the uniform failure probability $q$
- failed nodes and all adjacent links are removed.
- node failures are independent.
Generating functions
We try to model the distribution of the survived connected component size. Then, see the mean size of connected components.
$H_0$
distribution of surviving component size for a randomly chosen node $v$
Think of the original expand form of $H_0$
$$
H_0(x)= P(K=0)x^0+\sum_{i=1}^{\infty}P(K=i)x^i
$$
- $P(K=0)$ means the node $v$ fails. The probability is $q$
- consider the second part. Following our past method, we consider the distribution of the neighbor of node $v$. Here, we define two other generating functions:
- $H_1$: the distribution of surviving component size of a node $w$ as a neighbor of another node $v$
- $F_0$: like $G_0$, which describes the degree distribution of a node $v$, describe the degree distribution of a node $v$ with the failure probability $q$.
- Then, we can rewrite $H_0$ as
$$
H_0=q+xF_0(H_1(x))
$$
$$
H_0’(x)=F_0(H_0(x))+xF_0’(H_1(x))H_1’(x)
$$
$$
\Rightarrow H_0’(1)=1-q+F_0’(1)H_1’(1)
$$
$H_1$
the distribution of surviving component size of a node $w$ as a neighbor of another node $v$
like $H_0$,$H_1$ can also be decomposed by two parts,
- the node fails, the probability is $q$
- the node doesn’t fail
like we calculate the surviving component size, there is “self-consistency”.
$$
H_1=q+xF_1(H_1(x))
$$
- we include $F_1(X)$, also denote “direction”, the “children neighbor” of node $w$.
$$
H_1’(x)=F_1(H_1(x))+F_1’(H_1(x))H_1’(x)
$$
$$
\Rightarrow H_1’(1) =1-q+F_1’(1)H_1’(1)
$$
$$
\Rightarrow H_1’(1) =\frac{1-q}{1-F_1’(1)}
$$
$F_0$
**Part of ** the distribution of a node degree, with the node failure probability $q$ what we know is a fixed degree distribution, that is $G_0$. So, we want to know $F_0$ from $G_0$ and $q$
The probability that a node $v$ has degree $k$ and survives is generated by
$$
F_0(x)=\sum_{k=0}^{\infty}(1-q)P(k)x^k = (1-q)G_0(x)
$$
Note that $F_0(1)=1-q$, not 1. It is due to we only consider the surviving part of the network, not the whole part.
$F_1$
similarly, $F_1(x)=(1-q)G_1(x)$
$\langle s\rangle$
$$
\langle s\rangle=H_0’(1) = 1-q+1\cdot F_0’(1)H_1’(1)
$$
$$
=1-q+\frac{(1-q)F_0’(1)}{1-F_1’(1)}=( 1+\frac{(1-q)G_0’(1)}{1-(1-q)G_1’(1)})(1-q)
$$
Critical failure rate
When the denominator is closed to 0, is the key point.
That is
$$
G_1’(1)\to \frac{1}{1-q}
$$
and
$$
G_1’(1)=\frac{\langle k^2\rangle}{\langle k\rangle}-1
$$
we have
$$
q_c =1-(\frac{\langle k^2\rangle}{\langle k\rangle}-1)^{-1}
$$
Remark
Since we use whether the average component size is finite to judge the robustness, it hides the assumption that $n\to\infty$
This formula gives an upper bound of the failure rate.
E-R Model
In ER model, we have
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{np+n^2p^2}{np}= 1+np
$$
So,
$$
q_c= 1-\frac{1}{np}
$$
According to the Molloy-Reed criterion, when $np>1$, we have a giant connected component. That is $\langle k\rangle>1$
$K$-regular network
In K-regular model all nodes have the same degree $k$.
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{k^2}{k}=k
$$
So,
$$
q_c= 1-\frac{1}{k-1}
$$
According to the Molloy-Reed criterion, we need $\langle k\rangle=k>2$
It is different from the E-R, due to the variance of nodes. It tells us the variance makes networks more robust!!
Broad distribution network
Explanations are below:
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}
$$
when $\gamma\in(2,3)$, the denominator is finite and the nominator is infinite.
so, $q_c=1-(\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}-1)^{-1}$
when $\gamma\in(2,3),q_c\to\infty$.
This shows the scale-free network is super-robust against random failures.
- In the limit of infinite network size, the scale-free network cannot be destroyed by any random failure probability.
Scale-free network
Definition
- power law distribution
scale invariance: $P(\alpha k)\propto P(k)$
$$
P(k)=k^{-\gamma}(\sum_{i=1}^N i^{-\gamma})^{-1}=\frac{k^{-\gamma}}{\sum_{i=1}i^{-\gamma}}
$$this is one kind of broad distribution. Other broad distribution examples: Zipf distribution.
$\gamma$ decides how quickly the distribution drops. The lager the quicker.
Approximation
We use Zeta distribution to imitate power-law distribution degrees with $n\to\infty$
The formula of Zeta distribution:
$$
P(k) =\frac{k^{-\gamma}}{\zeta(\gamma)}
$$
where
$$
\zeta(\gamma) =\sum_{i=1}i^{-\gamma}
$$
It is valid for $\gamma>1$
$\zeta(\gamma\to 1) %3D \to\infty$
$\langle k^m\rangle=\frac{\zeta(\gamma-m)}{\zeta(\gamma)}$
Diameter
$$
D\simeq\frac{\log\log(n)}{\log(\gamma-2)}
$$
- for $\gamma\in(2,3)$, $\gamma-2\in(0,1)$
- the smaller $\gamma$, the larger the absolute value of $\log(\gamma)$ will be, the smaller D (the diameter) will be
Conclusion:
- networks where the degree distribution has an infinite second raw moment have a very small diameter.
- the more heterogeneous the degree distribution (smaller $\gamma$) is, the smaller diameter will be.
Robustness
Super robust. See above.
Different from ER model, the limitation of finite size matters a lot here.
Fragility
Degree-dependent failure ratio
the nodes with high degree are more likely to be attacked, or to fail.
it could be true, considering
- targeted attack
- load-dependent failures of nodes
- …
The random scale-free networks are
- robust, against random failures
- fragile, against degree-dependent failure.
Limitations of ensemble studies
- analytical form of degree distribution is foundation for theoretical claims.
- e.g. log-normal distribution looks quite similar to zeta distribution.
- finite or infinite.
- the theoretical assumption is infinite
- the real case is finite, even large.
- microstates vs real system
- different choice of ensemble methods, based on different microstates of the given systems. may generate quite different statistical results.
- our statistical findings are based on our assumptions.
- different choice of ensemble methods, based on different microstates of the given systems. may generate quite different statistical results.