Write a script that counts the number of occurrences of 3-node feedforward and feedback loops in a directed network. Is it easier to represent the network by its adjacency matrix or as a list of edges (a dataframe with source and target columns)?
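As a minimal sketch of the adjacency-matrix approach, assume a dense binary matrix `A` with `A[i, j] = 1` for an edge i -> j. With self-loops removed, both counts reduce to matrix products. The function names `count_feedforward` and `count_feedback` and the toy network are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def count_feedforward(A):
    """Count ordered triples (a, b, c) with edges a->b, b->c and a->c."""
    A = np.array(A, dtype=np.int64)   # np.array copies, so the input is untouched
    np.fill_diagonal(A, 0)            # ignore self-loops
    # (A @ A)[a, c] counts two-step paths a->b->c;
    # multiplying by A keeps only those closed by a direct edge a->c
    return int(((A @ A) * A).sum())

def count_feedback(A):
    """Count directed 3-cycles a->b->c->a."""
    A = np.array(A, dtype=np.int64)
    np.fill_diagonal(A, 0)
    # trace(A^3) counts every cycle once per starting node, hence the division by 3
    return int(np.trace(A @ A @ A) // 3)

# toy network (assumption): 0->1, 1->2, 0->2 form a feed-forward loop,
# 2->3, 3->4, 4->2 form a feedback loop
A_toy = np.zeros((5, 5), dtype=int)
for s, t in [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 2)]:
    A_toy[s, t] = 1

print(count_feedforward(A_toy), count_feedback(A_toy))
```

With an edge list the same counts require repeated joins of the dataframe with itself, so the matrix form is usually the more compact way to write the counting step, while the edge list is the more compact way to store a sparse network.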
%% Cell type:code id: tags:
``` python
import numpy as np
import pandas as pd

# create a sorted list of all nodes (transcription factors and target genes)
nodes = sorted(pd.concat([df.tf, df.gene]).unique().tolist())
# create a dictionary mapping node names to their indexes
n = len(nodes)
nodes_dict = dict(zip(nodes, range(n)))
# create an n-by-n matrix of zeros (np.empty would leave the entries uninitialized)
Adj_mat = np.zeros(shape=(n, n), dtype=int)
# set Adj_mat[r, c] = 1 for every link tf -> gene in the data
for i in df.index:
    r = nodes_dict[df.loc[i, 'tf']]
    c = nodes_dict[df.loc[i, 'gene']]
    # keep only links with a definite sign; diagonal elements (r == c) are zeroed in the next cell
    if df.loc[i, 'effect'] in ['+', '-']:
        Adj_mat[r, c] = 1
```
%% Cell type:code id: tags:
``` python
from scipy.sparse import lil_matrix

# copy Adj_mat into Adj_mat_0d and set the diagonal to zero (removes self-loops)
# the copy keeps Adj_mat itself unchanged
Adj_mat_0d = Adj_mat.copy()
np.fill_diagonal(Adj_mat_0d, 0)
# re-define the adjacency matrix in sparse list-of-lists (LIL) format
# this helps to decrease memory usage and the number of computations in further operations
Adj_mat_0d = lil_matrix(Adj_mat_0d)
```
%% Cell type:code id: tags:
``` python
# define functions to count feed-forward and feed-back loops
# Task: generate 1000 random networks with (approximately) the same number of nodes,
# edges and degrees as the real E. coli transcriptional network (see Assignment Week 35).
print(f'Mean of feed-forward loops in random networks: {np.mean(f_f_l_random)}')
print(f'Mean of feed-back loops in random networks: {np.mean(f_b_l_random)}')
```
%%%% Output: stream
Mean of feed-forward loops in random networks: 99.59
Mean of feed-back loops in random networks: 0.963
%% Cell type:markdown id: tags:
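One simple way to obtain random networks of this kind, sketched here under assumptions, is to permute the target column of the edge list. Before clean-up every node keeps its out-degree and in-degree; dropping the self-loops and duplicate edges that the shuffle creates makes the match approximate. The function name `randomize_edges` and the toy edge list are hypothetical; the column names `tf` and `gene` follow the dataframe used above.

```python
import numpy as np
import pandas as pd

def randomize_edges(edges, rng):
    """Return a shuffled copy of an edge list with columns 'tf' and 'gene'."""
    shuffled = edges[['tf', 'gene']].copy()
    # permuting the targets preserves every out-degree and in-degree (as a multigraph)
    shuffled['gene'] = rng.permutation(shuffled['gene'].to_numpy())
    # drop self-loops and duplicated edges created by the shuffle
    shuffled = shuffled[shuffled['tf'] != shuffled['gene']]
    return shuffled.drop_duplicates().reset_index(drop=True)

rng = np.random.default_rng(0)  # fixed seed for reproducibility
# toy edge list (assumption), standing in for the real network
toy = pd.DataFrame({'tf': ['a', 'a', 'b', 'c', 'd'],
                    'gene': ['b', 'c', 'c', 'd', 'a']})
random_nets = [randomize_edges(toy, rng) for _ in range(1000)]
```

Each randomized edge list can then be converted to an adjacency matrix and passed through the same loop-counting functions as the real network.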
Compute the enrichment Z-score for 3-node feedforward and feedback loops. Which ones occur significantly more often in the real network than in the random networks?
In the given data (and with a fixed random seed), the number of feed-forward loops is significantly larger than in random networks with similar characteristics.
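The enrichment Z-score compares the count in the real network with the mean and standard deviation of the counts over the random ensemble, Z = (N_real - mean(N_rand)) / std(N_rand). A minimal sketch, using a hypothetical helper `z_score` and made-up toy counts rather than the results of this notebook:

```python
import numpy as np

def z_score(real_count, random_counts):
    """Z = (N_real - mean(N_rand)) / std(N_rand), with the sample standard deviation."""
    random_counts = np.asarray(random_counts, dtype=float)
    return (real_count - random_counts.mean()) / random_counts.std(ddof=1)

# toy numbers for illustration only (assumption, not notebook output)
toy_random = [8, 10, 12, 9, 11]
z_toy = z_score(20, toy_random)
print(z_toy)
```

A common rule of thumb treats |Z| above roughly 2 to 3 as significant, which assumes the random counts are approximately normally distributed. For rare motifs such as feedback loops, whose counts are close to zero, that assumption is weak and the Z-score should be read with caution.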
%% Cell type:code id: tags:
``` python
print('t-test result on equal means for feed-back loops:')