Aims:
- graph theory: micro-/macro perspective
- ensemble perspective
- random graph models
- empirical examples
Questions
For an undirected network, define the degree distribution and the mean degree.
- degree distribution: $P(d_i=n)=\frac{|d_j=n|}{N}$
- mean degree: $\bar{d}=\frac{\sum_i d_i}{N}$
define the degree assortativity coefficient. What do positive and negative values indicate, what does a zero value mean?
what is a stylised fact? Give an example for a real-world network
algorithmic formulation for $\frac{G(n,p)}{G(n, m)}$ model
- $G(n,p)$:
- build $n$ nodes
- for each node $i$, iterate on all the other nodes
- generate a uniformly distributed random number $t$ between 0 and 1.
- if $t<p$, add an edge between i and the other node
- $G(n,m)$ (undirected version):
- build $n$ nodes
- from $(1, 2, … , \frac{n(n-1)}{2})$ randomly choose m different numbers.
- link edges between the pairs corresponding to the encoded number.
- $G(n,p)$:
what is a statistical ensemble in the context of complex networks? what is macrostate? what is microstate?
what is the limiting degree distribution for networks generated by the $G(n,p)$ model?
In $G(n,p)$ model, a node has degree $i$ ‘s probability
$$
P(d_x=i)=\frac{(n-1)!}{i!(n-1-i)!}p^i(1-p)^{n-1-i}
$$this is a binomial distribution. I think the limitation of binomial distribution with $n\to\infty$ is a Poisson distribution. No, it is normal distribution.
Under what condition does the degree distribution converge to a Poisson distribution?
- inspired by the $G(n,p)$ model: when all the nodes have almost the same degree?
What degree assortativity coefficient do you expect for a random network?
Keys
- degree assortativity coefficient,
- statistical ensemble,
- $\frac{G(n,p)}{G(n,m)}$
- limiting degree distribution
Statistical index
Degree
Degree sequence
- graphic degree sequence
Degree distribution
compared to degree sequence: we lose the information of the number of nodes.
- mean degree, can be calculated from degree distribution $\bar{d}=\frac{\sum_id_i}{N}=\sum_{i\in V}k\cdot P(k)$
Degree assortativity coefficient
$$
r=\frac{\sum_{ij}(A_{ij}-\frac{d_id_j}{2|E|})d_id_j}{\sum_{ij}(d_i\delta_{ij}-\frac{d_id_j}{2|E|})d_id_j}\in [-1, 1]
$$
$$
r=\frac{\sum_{ij}(A_{ij}-\frac{d_id_j}{2|E|})d_id_j}{\sum_id_i^3 -\frac{1}{2|E|}\sum_{ij}(d_id_j)^2}
$$
- assortativity: measure something in common.
- degree assortativity: to what extent the nodes are linked to nodes that are similar.
- Positive value (assortative mixing): high-degree nodes are
preferentially connected to other high-degree nodes and low-degree nodes
are preferentially connected to other low-degree nodes - negative value (disassortative mixing): high-degree nodes are
preferentially connected to low-degree nodes and vice-versa - In social networks, assortative mixing is common.
Triads
Transitivity
closed triads
Clustering coefficient
clustering coefficient can be interpreted as conditional probability, when $A$ and $B$ both link to $C$, the probability of there is an edge between $A$ and $B$.
local clustering coefficient: the node level
is defined by the number of links between its neighbors normalized by the total possible number of links between its neighbors.
- undirected network: $C_i=\frac{k(i)}{d_i(d_i-1)/2}$
global clustering coefficient: the network level
average through all the nodes $C_g=\frac{\sum_i C_i}{N}$
Stylised facts
summarize results in a rough way, capturing some broad tendencies and ignoring the individual detail.
For example, a network has small clustering coefficient, large diameters, broad/narrow degree distribution, assortative/disassortative.
- We can compare to random networks to generate some interesting findings!!!
Micro-/Macro-scopic Perspective
- when we have full knowledge of a system, it is reasonable to take a microscopic perspective.
- when we want to understand why different systems exhibit similar features, we need to take a macroscopic perspective.
Statistical ensembles
to deal with lack of knowledge about microscopic details.
We use statistical ensembles to find statistical regularities of microstates, when given macrostates.
Random graph models
- using simple stochastic null models as the reference study networks
- important baseline for network analysis and pattern recoginition
- models:
- Erdos-Renyi model: $\frac{G(n,p)}{G(n,m)}$
- for $G(n,m)$ model, we can calculate:
- the mean of degree: $\frac{2m}{n}$
- the variance of degree:
there are totally $\frac{n(n-1)}{2}$ edges
the probability of each edge being selected is: $\frac{2m}{n(n-1)}$
the probability of a node has degree $k$ is:
$$
\frac{(n-1)!}{k!(n-1-k)!}\cdot p^k(1-p)^{n-1-k}
$$the variance is: $n\cdot\frac{2m}{n(n-1)}\cdot(1-\frac{2m}{n(n-1)})$
- for $G(n,p)$ model, we can calculate
- the probability of one node has degree $i$
- the probability of the network has m edges. (these two are binomial distribution)
- first raw moment: the mean degree: $\langle k\rangle = np$
- second raw moment $\to$ the variance of degrees: $\text{Var}(k)=\langle k^2\rangle -\langle k\rangle=np(1-p)$
- limiting degree distribution.
- Fixed $p$, For very large network, it approximates to a normal distribution, with mean $np$, variance $np(1-p)$
- Fixed $np$. That is when $n$ goes to infinity, $p$ goes to 0. It approximates to Poisson distribution, with $\lambda = \langle k\rangle=np$ and $P(k)\sim\frac{\lambda^k e^{-\lambda}}{k!}$
- for $G(n,m)$ model, we can calculate:
- Erdos-Renyi model: $\frac{G(n,p)}{G(n,m)}$
Macroscopic
we usually study a network based on its aggregate statistics. Lose detailed information allows us to imitate more networks sharing the same statistics and study and compare them.
Examples
Examine the following index of a network
the number of nodes
the number of edges
the mean degree
the diameter of the network
the average path length
the clustering coefficient
the degree assortativity
the degree distribution. When the degree distribution is broad and the mean degree is small, it requires more care to interpret the mean degree.