Objectives
Using the exponential random graph model (ERGM) to generate statistical ensembles with arbitrary fixed (expected) properties.
Questions
- For which microstate probabilities is the entropy of a statistical ensemble maximized?
- the one that best describes the given state of knowledge
- How is the microstate probability in the exponential random graph model defined?
- What is the basic underlying assumption behind this microstate probability?
- How is the $G(n,p)$ model related to ERGM?
- $G(n,p)$ is the ERGM whose only constrained (expected) property is the edge count, set via $p$
- How can we maximize the likelihood function in the ERGM?
- take the derivative of the log-likelihood function and set it to zero
- How can the parameters of an ERGM fit be interpreted?
- the sign of each parameter indicates whether the corresponding factor makes the associated configurations more or less likely
Keywords
- microstate probabilities
- statistical ensemble, e.g. $G(n,p)$ model
- ERGM
Microstate probability
Definition
The probability of each microstate, e.g. each network produced by a network generation model, in the ensemble.
For instance, in the $G(n,m)$ model each microstate has the same probability $P=\frac{1}{T}$, where $T={D\choose m}$ is the number of networks with exactly $m$ edges and $D = {n\choose 2}$ is the number of possible links.
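As a quick numerical illustration (my addition; the values are arbitrary), this uniform probability can be computed directly in Python:

```python
from math import comb

# Microstate probability in G(n,m): all networks with exactly m edges
# are equally likely, so P = 1 / C(D, m), where D = C(n, 2) counts the
# possible links among n nodes.
n, m = 10, 15            # arbitrary example values
D = comb(n, 2)           # number of possible links
T = comb(D, m)           # number of microstates with m edges
print(D, T, 1 / T)       # D, ensemble size, each microstate's probability
```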
Maximum entropy principle
The probability distribution which maximizes the entropy is the best one to describe the given state of knowledge.
Use $\Pi $ to denote a probability distribution. The MEP gives
$$
\Pi^*=\arg\max_{\Pi} H(\Pi)=\arg\max_{\Pi}\left(-\sum_{G\in\Omega} P(G)\log P(G)\right)
$$
- $\Omega$ denotes the set of microstates consistent with the given state of knowledge. For example, $\Omega$ may be restricted to all networks with exactly $m$ edges.
- each $\Pi$ assigns a probability $P(G)$ to every microstate $G\in\Omega$
An external reference puts it this way: "This will be the system with the largest remaining uncertainty, and by choosing it you're making sure you're not adding any extra biases or uncalled-for assumptions into your analysis."
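As a worked check of the principle (added here for completeness): if normalization is the only constraint, the Lagrangian approach yields the uniform distribution,
$$
\frac{\partial}{\partial P(G)}\left[-\sum_{G'} P(G')\log P(G') + \alpha\Big(1-\sum_{G'}P(G')\Big)\right]
= -\log P(G) - 1 - \alpha = 0,
$$
so $P(G)=e^{-1-\alpha}$ is the same constant for every $G$, i.e. $P(G)=\frac{1}{|\Omega|}$ — exactly the uniform $G(n,m)$ probability above.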
ERGM
Idea
- maximizing the entropy
- preserving the expected properties
Grand canonical ensemble
Grand canonical means that the overall features of the ensemble are held fixed only on average: the expected values of the constrained properties stay unchanged, while individual microstates may deviate from them.
For example, the $G(n,p)$ model gives us many random networks. Taking the expectation of the edge count over these networks yields the fixed value ${n\choose 2}p = \frac{n(n-1)p}{2}$. The edge count is therefore a grand canonical feature of this model.
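A small standard-library Python check of this (my sketch; the parameters are arbitrary):

```python
import random

def gnp_edge_count(n, p):
    """Sample one G(n,p) network and return its number of edges."""
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if random.random() < p)

n, p, samples = 50, 0.1, 2000
mean_edges = sum(gnp_edge_count(n, p) for _ in range(samples)) / samples
print(mean_edges, n * (n - 1) / 2 * p)  # empirical mean vs. C(n,2) * p
```

Individual samples fluctuate around the expectation; only the ensemble average is fixed, which is the grand canonical property.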
Properties
Fix the expected values of the properties of interest:
- $$
\langle f_i\rangle = \sum_G p(G)f_i(G)
$$
The model:
- maximize the entropy $H(\Pi)$
- under the constraints that the probabilities sum to one and that the expected property values $\langle f_i\rangle$ are fixed
Take the derivative of the Lagrangian with respect to each microstate probability $p(G)$ and set it to zero:
$$
\frac{\partial}{\partial p(G)}\left[H+\alpha\Big(1-\sum_{G'} p(G')\Big)+\sum_i\theta_i\Big(\langle f_i\rangle-\sum_{G'}f_i(G')p(G')\Big)\right]=0
$$
we can get
$$
p(G)=\frac{\exp\left(-\sum_i\theta_if_i(G)\right)}{Z}
$$
with the partition function
$$
Z=\sum_G\exp\Big(-\sum_i \theta_if_i(G)\Big)
$$
Differentiating $Z$ with respect to $\theta_i$ gives
$$
\frac{\partial Z}{\partial \theta_i} = -\sum_G f_i(G)\exp\Big(-\sum_j\theta_jf_j(G)\Big)
= -Z\sum_G f_i(G)\frac{\exp\big(-\sum_j\theta_jf_j(G)\big)}{Z} = -Z\langle f_i\rangle
$$
so that
$$
\langle f_i\rangle=-\frac{1}{Z}\frac{\partial Z}{\partial \theta_i}
$$
To fit the model to data, we still need to estimate the parameters $\theta_i$.
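To make the derivation concrete, here is a toy enumeration (my own sketch, not from the lecture): all undirected graphs on $n=3$ nodes with the edge count as the single statistic $f$, verifying $\langle f\rangle=-\frac{1}{Z}\frac{\partial Z}{\partial\theta}$ by a finite difference:

```python
import itertools, math

n = 3
pairs = list(itertools.combinations(range(n), 2))

def all_graphs():
    """Enumerate every microstate as a tuple of link indicators."""
    return itertools.product([0, 1], repeat=len(pairs))

def partition(theta):
    """Z(theta) = sum_G exp(-theta * f(G)) for the edge-count statistic."""
    return sum(math.exp(-theta * sum(g)) for g in all_graphs())

theta = 0.5
Z = partition(theta)
expected_edges = sum(sum(g) * math.exp(-theta * sum(g))
                     for g in all_graphs()) / Z

# Check <f> = -(1/Z) * dZ/dtheta with a central finite difference.
eps = 1e-6
dZ = (partition(theta + eps) - partition(theta - eps)) / (2 * eps)
print(expected_edges, -dZ / Z)  # the two numbers should agree
```

Incidentally, with the edge count as the only statistic, $Z=(1+e^{-\theta})^D$ and $p(G)$ factorizes into $D$ independent links, each present with probability $p=\frac{e^{-\theta}}{1+e^{-\theta}}$ — which is how the $G(n,p)$ model arises as a special case of the ERGM.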
Simulation algorithms
Random walk
- construct the transition matrix $T$, where $T_{ij}$ is the transition probability from microstate $G_i$ to microstate $G_j$
- walk sufficiently long that the visitation probabilities are close to the stationary distribution
- start with a random network $G_i$
- for a random node pair $v,w$:
    - if the link $(v,w)$ exists: remove it
    - else: add it
    - accept the change with the corresponding transition probability (e.g., the Metropolis rule $\min\big(1,\ p(G_{\text{new}})/p(G_{\text{old}})\big)$, which makes the ERGM distribution stationary)
- repeat for sufficiently many steps to generate a new network
- calculate the expected properties over the newly generated networks and compare them with the empirical network (a sketch follows below)
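A minimal sketch of such a walk (my addition), again using the edge-count ERGM from above with the Metropolis acceptance rule; note that the partition function $Z$ cancels out of the acceptance ratio, so only the change in the statistic is needed:

```python
import math, random

def ergm_metropolis(n, theta, steps=10_000):
    """Sample one microstate of the edge-count ERGM, p(G) ~ exp(-theta * f(G)),
    by a Metropolis random walk over link toggles."""
    edges = set()
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for _ in range(steps):
        v, w = random.choice(pairs)
        # Toggling the link changes f(G) by -1 (remove) or +1 (add).
        delta_f = -1 if (v, w) in edges else 1
        # Metropolis rule: min(1, p(G')/p(G)) = min(1, exp(-theta * delta_f)).
        if random.random() < min(1.0, math.exp(-theta * delta_f)):
            if delta_f == -1:
                edges.discard((v, w))
            else:
                edges.add((v, w))
    return edges

sample = ergm_metropolis(n=20, theta=1.0)
print(len(sample))  # edge count of one sampled microstate
```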
MCMC-MLE (Markov chain Monte Carlo maximum likelihood estimation)
- choose an initial value $\theta_0$
- estimate the likelihood $P(G|\theta_0)$ of the empirical network $G$ by generating enough microstates
- randomly perturb $\theta_0$ to get $\theta_1$
- repeat step 2 for $\theta_1$
- compare the results of steps 2 and 4 and keep the better parameter
- repeat steps 2-5 until convergence (a sketch follows below)
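A hedged sketch of this perturb-and-keep loop (my addition) on the same toy ensemble, where $Z(\theta)$ is small enough to enumerate exactly instead of estimating it by MCMC; the observed statistic $f_{\text{emp}}$ is hypothetical:

```python
import itertools, math, random

n = 3
pairs = list(itertools.combinations(range(n), 2))
f_emp = 2  # hypothetical observation: the empirical network has 2 edges

def log_lik(theta):
    """log P(G_emp | theta) = -theta * f_emp - log Z(theta), Z enumerated."""
    Z = sum(math.exp(-theta * sum(g))
            for g in itertools.product([0, 1], repeat=len(pairs)))
    return -theta * f_emp - math.log(Z)

theta = 0.0                                   # step 1: initial value
best = log_lik(theta)                         # step 2: evaluate the likelihood
for _ in range(1000):                         # step 6: iterate until convergence
    candidate = theta + random.gauss(0, 0.1)  # step 3: random perturbation
    cand_lik = log_lik(candidate)             # step 4: re-evaluate
    if cand_lik > best:                       # step 5: keep the better parameter
        theta, best = candidate, cand_lik
print(theta)  # should approach -ln 2, where <f>(theta) = f_emp
```

Setting the derivative of the log-likelihood to zero gives the fitting condition $\langle f\rangle(\theta^*) = f_{\text{emp}}$; for this toy example $\theta^* = -\ln 2 \approx -0.69$, which the loop should approach.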