Learning Notes

Publish Date: 2019-07-23

Objectives

analyzing robustness with generating functions
robustness of scale-free network
limitations of ensemble-based approach
AS-level Internet topology

Questions

what is the critical failure probability $q_c$ for random networks with arbitrary degree distribution?
- the failure probability making the average clustering size goes to finite.

$$
q_c=1-(\frac{\langle k^2\rangle}{\langle k\rangle}-1)^{-1}
$$

what is $q_c$ for $G(n,p)$ model?

$$
q_c=1-\frac{1}{np}
$$

what are scale-free network? How can we characterize their degree distribution in the finite and infinite case?
- the degree distribution follows power-law distribution
- finite: $P(k)=k^{-\gamma}(\sum_{i=1}^N i^{-\gamma})^{-1}=\frac{k^{-\gamma}}{\sum_{i=1}i^{-\gamma}}$
- infinite: $P(k)=\frac{k^{-\gamma}}{\zeta(\gamma)}$
what is $q_c$ for a scale-free network with exponent $\gamma$ in the limit of infinite size?
- $q_c= 1-(\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}-1)^{-1}$
what is the reason for the discrepancy between the predicted and the empirical surviving lcc for random failures in scale-free networks?
- degree distribution
- finite nodes
- different ensemble methods.
what is the robust-yet-fragile nature of random scale-free network?
- critical failure rate approximates to 1.
- relies heavily on few important nodes.
explain three possible fallacies that can occur when applying the ensemble perspective to real systems?
- degree distribution
- finite/infinite
- ensemble methods

Keywords

critical failure probability
scale-free network
ensemble methods fallacies

Robustness

Related to the giant connected components: at which point does the removal of nodes destroy the giant connected component of a network system?

Topological robustness: the robustness of a system’s connectedness against the loss of nodes and/or links

Assumptions

stochastic node failure model
- each node has the uniform failure probability $q$
- failed nodes and all adjacent links are removed.
- node failures are independent.

Generating functions

We try to model the distribution of the survived connected component size. Then, see the mean size of connected components.

$H_0$

distribution of surviving component size for a randomly chosen node $v$

Think of the original expand form of $H_0$
$$
H_0(x)= P(K=0)x^0+\sum_{i=1}^{\infty}P(K=i)x^i
$$

$P(K=0)$ means the node $v$ fails. The probability is $q$
consider the second part. Following our past method, we consider the distribution of the neighbor of node $v$. Here, we define two other generating functions:
- $H_1$: the distribution of surviving component size of a node $w$ as a neighbor of another node $v$
- $F_0$: like $G_0$, which describes the degree distribution of a node $v$, describe the degree distribution of a node $v$ with the failure probability $q$.
Then, we can rewrite $H_0$ as

$$
H_0=q+xF_0(H_1(x))
$$

$$
H_0’(x)=F_0(H_0(x))+xF_0’(H_1(x))H_1’(x)
$$

$$
\Rightarrow H_0’(1)=1-q+F_0’(1)H_1’(1)
$$

$H_1$

the distribution of surviving component size of a node $w$ as a neighbor of another node $v$

like $H_0$,$H_1$ can also be decomposed by two parts,

the node fails, the probability is $q$
the node doesn’t fail

like we calculate the surviving component size, there is “self-consistency”.
$$
H_1=q+xF_1(H_1(x))
$$

we include $F_1(X)$, also denote “direction”, the “children neighbor” of node $w$.

$$
H_1’(x)=F_1(H_1(x))+F_1’(H_1(x))H_1’(x)
$$

$$
\Rightarrow H_1’(1) =1-q+F_1’(1)H_1’(1)
$$

$$
\Rightarrow H_1’(1) =\frac{1-q}{1-F_1’(1)}
$$

$F_0$

**Part of ** the distribution of a node degree, with the node failure probability $q$ what we know is a fixed degree distribution, that is $G_0$. So, we want to know $F_0$ from $G_0$ and $q$

The probability that a node $v$ has degree $k$ and survives is generated by
$$
F_0(x)=\sum_{k=0}^{\infty}(1-q)P(k)x^k = (1-q)G_0(x)
$$
Note that $F_0(1)=1-q$, not 1. It is due to we only consider the surviving part of the network, not the whole part.

$F_1$

similarly, $F_1(x)=(1-q)G_1(x)$

$\langle s\rangle$

$$
\langle s\rangle=H_0’(1) = 1-q+1\cdot F_0’(1)H_1’(1)
$$

$$
=1-q+\frac{(1-q)F_0’(1)}{1-F_1’(1)}=( 1+\frac{(1-q)G_0’(1)}{1-(1-q)G_1’(1)})(1-q)
$$

Critical failure rate

When the denominator is closed to 0, is the key point.

That is
$$
G_1’(1)\to \frac{1}{1-q}
$$
and
$$
G_1’(1)=\frac{\langle k^2\rangle}{\langle k\rangle}-1
$$
we have
$$
q_c =1-(\frac{\langle k^2\rangle}{\langle k\rangle}-1)^{-1}
$$

Remark

Since we use whether the average component size is finite to judge the robustness, it hides the assumption that $n\to\infty$

This formula gives an upper bound of the failure rate.

E-R Model

In ER model, we have
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{np+n^2p^2}{np}= 1+np
$$
So,
$$
q_c= 1-\frac{1}{np}
$$
According to the Molloy-Reed criterion, when $np>1$, we have a giant connected component. That is $\langle k\rangle>1$

$K$-regular network

In K-regular model all nodes have the same degree $k$.
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{k^2}{k}=k
$$
So,
$$
q_c= 1-\frac{1}{k-1}
$$
According to the Molloy-Reed criterion, we need $\langle k\rangle=k>2$

It is different from the E-R, due to the variance of nodes. It tells us the variance makes networks more robust!!

Broad distribution network

Explanations are below:
$$
\frac{\langle k^2\rangle}{\langle k\rangle}=\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}
$$

when $\gamma\in(2,3)$, the denominator is finite and the nominator is infinite.
so, $q_c=1-(\frac{\zeta(\gamma-2)}{\zeta(\gamma-1)}-1)^{-1}$
when $\gamma\in(2,3),q_c\to\infty$.

This shows the scale-free network is super-robust against random failures.
- In the limit of infinite network size, the scale-free network cannot be destroyed by any random failure probability.

Scale-free network

Definition

power law distribution
- scale invariance: $P(\alpha k)\propto P(k)$
- $$
  P(k)=k^{-\gamma}(\sum_{i=1}^N i^{-\gamma})^{-1}=\frac{k^{-\gamma}}{\sum_{i=1}i^{-\gamma}}
  $$
- this is one kind of broad distribution. Other broad distribution examples: Zipf distribution.
- $\gamma$ decides how quickly the distribution drops. The lager the quicker.

Approximation

We use Zeta distribution to imitate power-law distribution degrees with $n\to\infty$

The formula of Zeta distribution:

$$
P(k) =\frac{k^{-\gamma}}{\zeta(\gamma)}
$$

where
$$
\zeta(\gamma) =\sum_{i=1}i^{-\gamma}
$$
It is valid for $\gamma>1$

$\zeta(\gamma\to 1) %3D \to\infty$
$\langle k^m\rangle=\frac{\zeta(\gamma-m)}{\zeta(\gamma)}$

Diameter

$$
D\simeq\frac{\log\log(n)}{\log(\gamma-2)}
$$

for $\gamma\in(2,3)$, $\gamma-2\in(0,1)$
- the smaller $\gamma$, the larger the absolute value of $\log(\gamma)$ will be, the smaller D (the diameter) will be

Conclusion:

networks where the degree distribution has an infinite second raw moment have a very small diameter.
the more heterogeneous the degree distribution (smaller $\gamma$) is, the smaller diameter will be.

Robustness

Super robust. See above.

Different from ER model, the limitation of finite size matters a lot here.

Fragility

Degree-dependent failure ratio

the nodes with high degree are more likely to be attacked, or to fail.

it could be true, considering

targeted attack
load-dependent failures of nodes
…

The random scale-free networks are

robust, against random failures
fragile, against degree-dependent failure.

Limitations of ensemble studies

analytical form of degree distribution is foundation for theoretical claims.
- e.g. log-normal distribution looks quite similar to zeta distribution.
finite or infinite.
- the theoretical assumption is infinite
- the real case is finite, even large.
microstates vs real system
- different choice of ensemble methods, based on different microstates of the given systems. may generate quite different statistical results.
  - our statistical findings are based on our assumptions.

Fululu

https://fuguigui.github.io

All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Fululu !

Graph Complex System

CIL9 Dictionary Learning

2019-07-25 Learning Notes

Machine Learning

CIL8 Sparse Coding

2019-07-22 Learning Notes

Machine Learning

CN7 Scale-Free Network and limitations of Ensemble Studies

Objectives

Questions

Keywords

Robustness

Assumptions

Generating functions

$H_0$

$H_1$

$F_0$

$F_1$

$\langle s\rangle$​

Critical failure rate

Remark

E-R Model

$K$-regular network

Broad distribution network

Scale-free network

Definition

Approximation

Diameter

Robustness

Fragility

Degree-dependent failure ratio

Limitations of ensemble studies

你的赏识是我前进的动力

$\langle s\rangle$