CN12 Exponential random graph model


Objectives

using ERGM model to generate statistical ensembles with arbitrary fixed properties.

Questions

  • For which microstate probabilities is the entropy of a statistical ensemble maximized?
    • the one best describe the given knowledge
  • How is the microstate probability in the exponential random graph model defined?
  • What is the basic underlying assumption behind this microstate probability?
  • How is the $G(n,p)$ model related to ERGM?
    • using p as the expected property.
  • How can we maximize the likelihood function in the ERGM?
    • derivative the log-likelihood function
  • How can the parameters of an ERGM fit be interpreted?
    • the positive/negative effect of the factors

Keywords

  • microstate probabilities
  • statistical ensemble, e.g. $G(n,p)$​​ model
  • ERGM

Microstate probability

Definition

The probability of each microstate, e.g. a network of a network generation model, in the ensemble model.

For instance, in $G(n,m)$ model, each microstate has the same probability $P=\frac{1}{T}$ where $T={D\choose m}$ and $D = {n+1\choose 2}$

Maximum entropy principle

The probability distribution which maximizes the entropy is the best one to describe the given state of knowledge.

Use $\Pi $​ to denote a probability distribution. The MEP gives
$$
\Pi^*=\arg_{\Pi}\max H(\Pi)=\arg_{\Pi}\max -\sum_{G\in\Omega} P(G)\log(P(G))
$$

  • $\Omega$ denotes the given state of knowledge. For example \Omega restricts all the networks have m edges.
  • each $\Pi$​ gives a set of $P()$

refer to this websiteThis will be the system with the largest remaining uncertainty, and by choosing it you’re making sure you’re not adding any extra biases or uncalled for assumptions into your analysis.

ERGM

Idea

  • maximizing the entropy
  • preserving the expected properties

Grand canonical ensemble

Grand canonical means the overall features of all the microstates keep unchanged, which can be shown by the expected values of all the microstates.

For example, G(n,p) model gives us a lot of random networks. Taking the expectation of the edge number over these networks, we will get a fixed value np/2. The edge number is a grand canonical feature for this model.

Properties

  • fix the expected values of our interested properties:

    • $$
      \langle f_i\rangle = \sum_G p(G)f_i(G)
      $$
  • the model is

    • to maximize $H(\Omega)$

    • under the constraint of probability sum 1 and the expected property value

    • take derivative for each microstate and set as 0
      $$
      \frac{\partial}{\partial p(G)}{H+\alpha(1-\sum_G p(G)+\sum_i\theta_i(\langle f_i\rangle-\sum_Gf_i(G)p(G))}=0
      $$

    we can get
    $$
    p(G)=\frac{\exp(-\sum_i\theta_if_i(G))}{Z}
    $$

    $$
    Z=\sum_G\exp(-\sum_i \theta_if_i(G))
    $$

    $$
    \frac{\partial Z}{\partial \theta_i} = \sum_G-f_i(G)\exp(-\sum_j\theta_jf_j(G))
    $$

    $$
    =-\sum_G f_i(G) Z\frac{\exp(-\sum_j\theta_jf_j(G))}{Z} = -Z\langle f_i\rangle
    $$

    $$
    \langle f_i\rangle=-\frac{1}{Z}\frac{\partial Z}{\partial \theta_i}
    $$

  • we need to estimate theta

Simulation algorithms

Random walk

  1. construct the transition matrix $T: T_{ij}$: the transition probability from microstate $G_i$ to $G_j$
    1. walking sufficiently long such that visitation probabilities are close to stationary distribution
  2. start with random network $G_i$
  3. for random nodes $v,w$:
    • if link $(v,w)$ exists: remove
    • else, add
    • accept change with probability $T_{v,w}$
  4. repeat large enough steps to generate a new network
  5. calculate the expected properties on the newly generated networks, compared with the empirical network.

MCMCMLE

  1. choose initial value $\theta_0$
  2. compute $P(G|\theta_0)$ by generating enough microstates
  3. randomly disturb $\theta_0$ to get $\theta_1$
  4. repeat step 2 for $\theta_1$
  5. compare the results of step 2 and 4. Keep the better parameter.
  6. repeat step 2-5 until convergence.

Author: Fululu
Reprint policy: All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Fululu !
  TOC