Learning Notes

Publish Date: 2019-07-30

Objectives

topology’s influence on dynamical process
model and analyze diffusion process in complex networks.
random walk model for diffusion
Markov chain convergence theorem
Feedback centrality measures

Questions

how is the transition matrix of a random walk defined for weighted and unweighted networks?
- what is the entry in the transition matrix
- what is the effect of weight?
- $T_{ij}=\frac{A_{ij}}{\sum_k A_{ik}}$
When does a random walk in a network converge to a unique stationary distribution?
- the transition matrix: irreducible, aperiodic
How can we compute the stationary distribution of a random walk?
- multiply the transition matrix many times enough.
- calculate the eigenvector corresponding to the eigenvalue1 of the transition matrix.
How is total variation distance defined? What is its maximum value?
- the half of the sum of absolute difference in each dimension.
- 1
For a right-stochastic transition matrix T and an initial distribution $\pi^{(0)}$, how can we compute the visitation probabilities after t steps of a random walk?
- $\pi^{(0)}T^t$
Can you give an example for a network in which a random walk never reaches a stationary distribution?
- $\pi^{(0)}=(1,0,0)$
- $P(1\to2)=1,P(2\to3)=1,P(3\to1)=1$
Can you give an example for a network in which a random walk reaches multiple stationary distributions for different initial distributions?
- separated network with many clusters, within each cluster it will reach a stable distribution.
How can a periodic transition matrix be made aperiodic?
- add small random values on the diagonal entries.

Keywords

dynamic process, total variation distance, transition matrix

Dynamic Process

Definition

Dynamic: means the nodes’ states changing with time. The network topology is static.

Classification

Four classes:
- synchronization: the purpose of the common wave length or rhythm is to strengthen the group bond, like homophily. Property’s effect on network evolution
  - e.g: people sharing the same culture work together
- consensus: nodes adopt state of neighbors. Network evolution’s effect on property change.
  - e.g.: opinion formation
- propagation: a process by which some initial quantities multiply or proliferate throughout the network. Non-conserved quantity: spreading to others doesn’t reduce the original quantity.
  - e.g.: disease broadcast, rumours proliferation.
- diffusion: net movement of anything (e.g., ideas, ions, molecules) from a region of higher concentration to a region of lower concentration. Conserved quantity.
  - e.g.: water, gas, electricity transportation.

Diffusion process

the topology’s influence
the role of nodes

Random Walk

Idea

using probability to model lack of knowledge.

generate random paths through a state space.

Markov Chains

If we use the Markov property (memoryless) to build random walk, we get a kind of random walk, memoryless Markov process.
$$
P(X_{t+1=s_{t+1}}|X_t=s_t)
$$

Transition Matrix

$A$ is the adjacency matrix of the network. We can define transition matrix as
$$
T_{ij}=\frac{A_{ij}}{\sum_k A_{ik}}
$$

weighted/unweighted is decided by $A$.
the $T_{ij}$ is normalized, so we can scale out all the outgoing weights of a node without any change of $T$.

Row-stochastic/right-stochastic

the row sum of stochastic vector is 1.

Right means we multiply $T$ at the right side of a vector $vT$. Here $v\in\mathbb{R}^{1\times N}$

Aperiodicity

State i is said to be periodic with period t iff

$T_{ii}^{(n)} = 0, \forall n\neq t, 2t, 3t, \cdots$
$T_{ii}^{(n)} \neq 0, \forall n= t, 2t, 3t, \cdots$

Write in another way
$$
r(s_i)= \text{gcd}{t\geq 1: (T^t)_{ii}>0}
$$

aperiodic: $r(s_i)=1,\forall i\in V$
periodic: otherwise

How to make a matrix aperiodic?

add small random value on the diagonal items.
$T_{ii}=\epsilon,\forall v\in V$
a random walk with such a transition matrix is called lazy.
a network with self-loop

Irreducibility

Reducible:
$$
\exists (i,j), \delta,\forall t>\delta, (T^t)_{ij}=0
$$
Means

strong connectivity of directed networks.
connectivity of undirected networks.

Markov Chain Convergence theorem

$T$: irreducible and aperiodic

for any initial distribution $\pi^{(0)}$
$$
\exists| \pi,\delta(\pi^{(t)},\pi)\to 0 (t\to \infty)
$$
the stable distribution is independent from the initial distribution.

Existence

if there is a periodic behavior, the network doesn’t have a stationary distribution.

Lazy

Uniqueness

means for all possible initial distribution, have only one final stable distribution. Requires irreducibility.

Initial Distribution
$$
\pi^{(0)}\in\mathbb{R}^{1\times N}
$$

the sum is 1.
all entries are non-negative.

$$
\pi^{(t)}=\pi^{(0)}\cdot T^t
$$

Stationary distribution

$$
\pi^{(t)}= \pi^{(t+1)}\cdot 1=\pi^{(t)}\cdot T
$$

the left eigenvector of $T$ corresponding to eigenvalue $v=1$.
gives a notion of node centrality, feedback centrality.

View from another point:

$T^{t}_{ij}$: the probability of node $i$ goes to node $j$ after $t$ steps.

It should be unchanged when $t$ is large enough.

That is, there is a $t_0$,

$$
\forall t>t_0 \forall k>0 T^{t+k}{ij}=T^t{ij}
$$

That means, $T^{t+1} = T^t =T^t\cdot T$ Each row of $T^t$ is the eigenvector of $T$, with the eigenvalue 1.
The stationary distribution is also the eigenvector of $T$ with the eigenvalue 1.
Each row of $T^t$ is the same, which is exactly the stationary distribution $\pi$.

To network

undirected network: $\pi_i=\frac{d_i}{2m}$ directly related to the node degrees
directed network:
- the number of nodes pointing to a node
- the importance of these pointing nodes

Total Variation distance

$$
\delta(\pi_1,\pi_2)=\frac{1}{2}\sum_i |\pi_{1,i}-\pi_{2,i}|
$$

normalized by 2: to restrict the value less than or equal to 1. The biggest value of the sum of differences for each dimension is 2, with the 1-sum restriction.
used to measure if a sequences of vectors converges.

Others

Connectivity of directed networks

strongly connected: all pairs of nodes $i,j$ have a path from $i$ to $j$ and from $j$ to $i$.
weakly connected: if the corresponding undirected network is connected.

Feedback Centrality

$$
\pi = \pi \cdot T
$$

$$
\pi_j = \sum_i \pi_i T_{ij}
$$

feedback: the centrality of those nodes that point to a node influence the node’s centrality.
a node that is pointed to by “important” nodes should be “intuitively” more important than those pointed by “unimportant” nodes.
a node’s importance transition on others will be reduced with more outgoing links.

Zero-indegree

Since a node’s eigenvector, feedback centrality, is determined by its in-going nodes, a node with indegree 0 has 0 centrality.

Useless for directed acyclic graphs.

Solution: PageRank

assign constant amount of one unit of centrality “for free”
$$
\alpha x= x\cdot T+\vec{1}
$$

$$
x_j =\frac{1}{\alpha}\sum_i x_i\frac{A_{ij}}{d_{out}(i)}+1
$$

Eigenvector Centrality

$$
\alpha x= x\cdot A
$$

$$
x_j=\frac{1}{\alpha}\sum_i x_iA_{ij}
$$

$\alpha$ is the largest eigenvalue of $A$
- usually, we take the largest one, but not necessarily.
$T$ is replaced by the adjacency matrix $A$. The difference is whether to normalize the degrees. In another way, whether “conserve quantity”.
- $T$: conserve
- $A$: not conserve
- The importance of a node will keep even with more outgoing links.

Centrality comparison

degree: local measure
Betweenness, closeness, eigenvector, and PageRank: global characteristics

Fululu

https://fuguigui.github.io

All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source Fululu !

Graph Complex System

CN9 Spectral properties

2019-07-31 Learning Notes

Graph Complex System

CIL9 Dictionary Learning

2019-07-25 Learning Notes

Machine Learning

CN8 Random Walks and Diffusion

Objectives

Questions

Keywords

Dynamic Process

Definition

Classification

Diffusion process

Random Walk

Idea

Markov Chains

Transition Matrix

Row-stochastic/right-stochastic

Aperiodicity

Irreducibility

Markov Chain Convergence theorem

Existence

Uniqueness

Stationary distribution

To network

Total Variation distance

Others

Connectivity of directed networks

Feedback Centrality

Zero-indegree

Solution: PageRank

Eigenvector Centrality

Centrality comparison

你的赏识是我前进的动力