CN3 Ensemble perspective


Aims:

  • graph theory: micro-/macro perspective
  • ensemble perspective
  • random graph models
  • empirical examples

Questions

  • For an undirected network, define the degree distribution and the mean degree.

    • degree distribution: $P(d_i=n)=\frac{|d_j=n|}{N}$​
    • mean degree: $\bar{d}=\frac{\sum_i d_i}{N}$
  • define the degree assortativity coefficient. What do positive and negative values indicate, what does a zero value mean?

  • what is a stylised fact? Give an example for a real-world network

  • algorithmic formulation for $\frac{G(n,p)}{G(n, m)}$ model

    • $G(n,p)$:
      1. build $n$ nodes
      2. for each node $i$, iterate on all the other nodes
        1. generate a uniformly distributed random number $t$ between 0 and 1.
        2. if $t<p$, add an edge between i and the other node
    • $G(n,m)$ (undirected version):
      1. build $n$ nodes
      2. from $(1, 2, … , \frac{n(n-1)}{2})$​ randomly choose m different numbers.
      3. link edges between the pairs corresponding to the encoded number.
  • what is a statistical ensemble in the context of complex networks? what is macrostate? what is microstate?

  • what is the limiting degree distribution for networks generated by the $G(n,p)$ model?

    • In $G(n,p)$ model, a node has degree $i$​​​ ‘s probability
      $$
      P(d_x=i)=\frac{(n-1)!}{i!(n-1-i)!}p^i(1-p)^{n-1-i}
      $$

    • this is a binomial distribution. I think the limitation of binomial distribution with $n\to\infty$ is a Poisson distribution. No, it is normal distribution.

  • Under what condition does the degree distribution converge to a Poisson distribution?

    • inspired by the $G(n,p)$ model: when all the nodes have almost the same degree?
  • What degree assortativity coefficient do you expect for a random network?

Keys

  • degree assortativity coefficient,
  • statistical ensemble,
  • $\frac{G(n,p)}{G(n,m)}$
  • limiting degree distribution

Statistical index

Degree

Degree sequence

  • graphic degree sequence

Degree distribution

compared to degree sequence: we lose the information of the number of nodes.

  • mean degree, can be calculated from degree distribution $\bar{d}=\frac{\sum_id_i}{N}=\sum_{i\in V}k\cdot P(k)$

Degree assortativity coefficient

$$
r=\frac{\sum_{ij}(A_{ij}-\frac{d_id_j}{2|E|})d_id_j}{\sum_{ij}(d_i\delta_{ij}-\frac{d_id_j}{2|E|})d_id_j}\in [-1, 1]
$$

$$
r=\frac{\sum_{ij}(A_{ij}-\frac{d_id_j}{2|E|})d_id_j}{\sum_id_i^3 -\frac{1}{2|E|}\sum_{ij}(d_id_j)^2}
$$

  • assortativity: measure something in common.
  • degree assortativity: to what extent the nodes are linked to nodes that are similar.
  • Positive value (assortative mixing): high-degree nodes are
    preferentially connected to other high-degree nodes and low-degree nodes
    are preferentially connected to other low-degree nodes
  • negative value (disassortative mixing): high-degree nodes are
    preferentially connected to low-degree nodes and vice-versa
  • In social networks, assortative mixing is common.

Triads

Transitivity

closed triads

Clustering coefficient

clustering coefficient can be interpreted as conditional probability, when $A$ and $B$ both link to $C$, the probability of there is an edge between $A$ and $B$.

  • local clustering coefficient: the node level

    is defined by the number of links between its neighbors normalized by the total possible number of links between its neighbors.

    • undirected network: $C_i=\frac{k(i)}{d_i(d_i-1)/2}$
  • global clustering coefficient: the network level

    average through all the nodes $C_g=\frac{\sum_i C_i}{N}$

Stylised facts

summarize results in a rough way, capturing some broad tendencies and ignoring the individual detail.

For example, a network has small clustering coefficient, large diameters, broad/narrow degree distribution, assortative/disassortative.

  • We can compare to random networks to generate some interesting findings!!!

Micro-/Macro-scopic Perspective

  • when we have full knowledge of a system, it is reasonable to take a microscopic perspective.
  • when we want to understand why different systems exhibit similar features, we need to take a macroscopic perspective.

Statistical ensembles

to deal with lack of knowledge about microscopic details.

We use statistical ensembles to find statistical regularities of microstates, when given macrostates.

Random graph models

  • using simple stochastic null models as the reference study networks
  • important baseline for network analysis and pattern recoginition
  • models:
    • Erdos-Renyi model: $\frac{G(n,p)}{G(n,m)}$
      • for $G(n,m)$ model, we can calculate:
        • the mean of degree: $\frac{2m}{n}$
        • the variance of degree:
          • there are totally $\frac{n(n-1)}{2}$ edges

          • the probability of each edge being selected is: $\frac{2m}{n(n-1)}$​

          • the probability of a node has degree $k$​​ is:
            $$
            \frac{(n-1)!}{k!(n-1-k)!}\cdot p^k(1-p)^{n-1-k}
            $$

          • the variance is: $n\cdot\frac{2m}{n(n-1)}\cdot(1-\frac{2m}{n(n-1)})$​

      • for $G(n,p)$ model, we can calculate
        • the probability of one node has degree $i$
        • the probability of the network has m edges. (these two are binomial distribution)
        • first raw moment: the mean degree: $\langle k\rangle = np$
        • second raw moment $\to$​ the variance of degrees: $\text{Var}(k)=\langle k^2\rangle -\langle k\rangle=np(1-p)$​
        • limiting degree distribution.
          • Fixed $p$, For very large network, it approximates to a normal distribution, with mean $np$, variance $np(1-p)$
          • Fixed $np$​​. That is when $n$​​ goes to infinity, $p$​​ goes to 0. It approximates to Poisson distribution, with $\lambda = \langle k\rangle=np$​ and $P(k)\sim\frac{\lambda^k e^{-\lambda}}{k!}$​​

Macroscopic

we usually study a network based on its aggregate statistics. Lose detailed information allows us to imitate more networks sharing the same statistics and study and compare them.

Examples

Examine the following index of a network

  • the number of nodes

  • the number of edges

  • the mean degree

  • the diameter of the network

  • the average path length

  • the clustering coefficient

  • the degree assortativity

  • the degree distribution. When the degree distribution is broad and the mean degree is small, it requires more care to interpret the mean degree.


Author: Fululu
Reprint policy: All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Fululu !
  TOC