Fast counting of interactions between groups given user interactions and group assignments

Question 1

Given:

An array c that contains the group assignment of every user (c[i]=4 indicates that user i belongs to group 4)
A matrix y of 0 and 1 that indicates whether user i interacted with user j (y[i][j] = 1 if i interacted with j, 0 otherwise)

What is the fastest way to count the number of interactions and lack of interactions between every pair of groups?

This is what I have:

Solution 1:

import numpy as np

def test1(y, c):
    N = len(c)  # number of users
    C = len(set(c))  # number of clusters
    # Final count matrices
    I = np.zeros((C, C))
    notI = np.zeros((C, C))
    # Loop over every row i / column j, look at the groups the users belong to,
    # and update the counts accordingly
    for i in range(N):
        for j in range(N):
            if y[i][j] == 1:
                I[c[i]][c[j]] += 1
            else:
                notI[c[i]][c[j]] += 1
    return I, notI

timeit:

nusers = 100
nclusters = 4
# np.random.random_integers is deprecated; randint excludes the upper bound, so (0, 2) yields 0 or 1
y = np.random.randint(0, 2, (nusers, nusers))
c = np.random.choice(nclusters, nusers)
%timeit test1(y, c)

10 loops, best of 3: 42 ms per loop

Solution 2:

def test2(y, c):
    N = len(c)
    C = len(set(c))
    I = np.zeros((C, C))
    notI = np.zeros((C, C))
    for i, jj in enumerate(y):
        for j, value in enumerate(jj):
            if value:
                I[c[i]][c[j]] += 1
            else:
                notI[c[i]][c[j]] += 1
    return I, notI

timeit: 10 loops, best of 3: 30.8 ms per loop

Question 2

In Solution 2 you wrote len(z), that is, "z" instead of "c". Is that a typo? z is not defined in the posted code.

Question 3

Also, will it make a difference if you change "if y[i][j] == 1:" to "if y[i][j]:"?

Question 4

@janos It is a typo, thanks! And no, it makes no difference to the result (I tested after posting the question, and "if y[i][j]:" seems a bit faster).

Answer 1

Reformulating the problem as a matrix multiplication speeds up the algorithm considerably.

Algebra:

Create a matrix G of size n_groups x n_users in which every row is a group mask: the entry for a user (column) is 1 if that user belongs to the group (row), and 0 otherwise. If Y is the matrix of interactions, then the number of interactions between every pair of groups is given by the following equation (the counts of non-interactions follow from the same product with the complement of Y in place of Y):

$$I = GYG^T$$
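To make the equation concrete, here is a small worked example. The data is made up for illustration (4 users, 2 groups) and is not taken from the question; it only assumes that Y contains 0s and 1s.

```python
import numpy as np

# Hypothetical toy data: users 0 and 1 are in group 0, users 2 and 3 in group 1.
Y = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1],
              [1, 1, 1, 0]])

# Row g of G is the mask of group g.
G = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])

# G @ Y sums the rows of Y per source group;
# multiplying by G.T then sums the columns per destination group.
I = G @ Y @ G.T
# Same product on the complement of Y counts the missing interactions.
notI = G @ (1 - Y) @ G.T

print(I)     # [[2 1]
             #  [3 2]]
print(notI)  # [[2 3]
             #  [1 2]]
```

Each 2 x 2 block of Y covers 4 user pairs, so every entry of I and the matching entry of notI add up to 4.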

Code:

import numpy as np
from numpy import dot

def test3(y, c):
    N = len(c)  # number of users
    C = len(set(c))  # number of groups
    noty = np.mod(y + 1, 2)  # complement of y: 1 where there was no interaction
    # Create C group masks: 1 if the user belongs to the group, 0 otherwise
    G = np.zeros((C, N))
    for group in range(C):
        G[group] = [1 if c_ == group else 0 for c_ in c]
    # Groups (src) * interaction_matrix * Groups_transposed (dst)
    # Every row of the Groups matrix is the mask of a group
    return dot(dot(G, y), G.T), dot(dot(G, noty), G.T)

1000 loops, best of 3: 810 μs per loop
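As a possible variation, the masks can also be built without a Python loop, and the result can be checked against the brute-force loops from the question. This is only a sketch under a few assumptions: the names group_interaction_counts and brute_force are hypothetical, the number of groups is passed in explicitly (so a group with no users still gets a row), and y contains only 0s and 1s, which is what lets the non-interaction counts be derived from the group sizes. No timing is claimed for it.

```python
import numpy as np

def group_interaction_counts(y, c, n_groups):
    """Hypothetical helper: return (I, notI) count matrices of shape (n_groups, n_groups)."""
    y = np.asarray(y)
    c = np.asarray(c)
    # G[g, u] is 1 when user u belongs to group g (broadcast comparison).
    G = (np.arange(n_groups)[:, None] == c).astype(np.int64)
    I = G @ y @ G.T
    # A pair of groups (a, b) covers size_a * size_b user pairs in total;
    # with a 0/1 matrix, whatever is not an interaction is a non-interaction.
    sizes = G.sum(axis=1)
    notI = np.outer(sizes, sizes) - I
    return I, notI

def brute_force(y, c, n_groups):
    """Reference implementation with explicit loops, as in the question."""
    I = np.zeros((n_groups, n_groups), dtype=np.int64)
    notI = np.zeros_like(I)
    for i in range(len(c)):
        for j in range(len(c)):
            if y[i][j]:
                I[c[i], c[j]] += 1
            else:
                notI[c[i], c[j]] += 1
    return I, notI

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=(100, 100))  # random 0/1 interaction matrix
c = rng.integers(0, 4, size=100)         # random group assignment, 4 groups

I, notI = group_interaction_counts(y, c, 4)
I_ref, notI_ref = brute_force(y, c, 4)
assert np.array_equal(I, I_ref)
assert np.array_equal(notI, notI_ref)
```

Passing n_groups explicitly avoids relying on len(set(c)), which undercounts when some group happens to have no users in c.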

alberto · Answer 1 · 2014-10-19 19:07:18Z
