Create a new numpy array from elements of another numpy array

Question 1

I've been struggling to create a sub-array from specific elements of a first array.

Given a first array that looks like this (it comes from a txt file with two lines):

L1,(B:A:3:1),(A:C:5:2),(C:D:2:3)
L2,(C:E:2:0.5),(E:F:10:1),(F:D:0.5:0.5)

code

import numpy as np
import pandas as pd

toto = pd.read_csv("bd_2_test.txt",delimiter=",",header=None,names=["Line","1rst","2nd","3rd"])
matrix_toto = toto.values
matrix_toto

result

  Line         1rst         2nd            3rd
0   L1    (B:A:3:1)   (A:C:5:2)      (C:D:2:3)
1   L2  (C:E:2:0.5)  (E:F:10:1)  (F:D:0.5:0.5)

How can I transform it into an array like this one?

array([['B', 'A', 3, 1],
 ['A', 'C', 5, 2],
 ['C', 'D', 2, 3],
 ['C', 'E', 2, 0.5],
 ['E', 'F', 10, 1],
 ['F', 'D', 0.5, 0.5]], dtype=object)

I tried vectorizing, but I only get the second character of each element of the array.

np.vectorize(lambda s: s[1])(matrix_toto)
array([['1', 'B', 'A', 'C'],
 ['2', 'C', 'E', 'F']], dtype='<U1')

Question 2

What are those (B:A:3:1) in the first array? Are they tuples? Strings?

Question 3

It seems to me that you're just trying to flatten an array, and maybe convert all of its elements into another form. Could you provide some code that builds the 1st array?

Question 4

The first array comes from a txt file with just two lines: L1,(B:A:3:1),(A:C:5:2),(C:D:2:3) and L2,(C:E:2:0.5),(E:F:10:1),(F:D:0.5:0.5)
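For anyone who wants to reproduce this without the file, here is a minimal sketch that feeds the same two lines to read_csv through an in-memory buffer. The io.StringIO buffer and the name raw_text are assumptions standing in for bd_2_test.txt; everything else mirrors the call in the question.

```python
import io

import pandas as pd

# The two lines of the txt file, typed in by hand (stand-in for bd_2_test.txt)
raw_text = (
    "L1,(B:A:3:1),(A:C:5:2),(C:D:2:3)\n"
    "L2,(C:E:2:0.5),(E:F:10:1),(F:D:0.5:0.5)\n"
)

# Same read_csv arguments as in the question, reading from the buffer instead of a path
toto = pd.read_csv(
    io.StringIO(raw_text),
    delimiter=",",
    header=None,
    names=["Line", "1rst", "2nd", "3rd"],
)

# 2 rows by 4 columns, every cell a plain Python string
matrix_toto = toto.values
print(matrix_toto.shape)
print(type(matrix_toto[0, 1]))
```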

Question 5

In your dataframe, "(B:A:3:1)" is a string, and so is "L1". The resulting values array is a (2,4) array of strings, and your lambda just takes the 2nd character of each string. So, for one thing, you need to skip the "L1" element. You also need to split each of the other strings into 4 substrings: drop the () and split on :. I don't know if pandas has an 'expand' that will help, but at the numpy level you have a lot of string manipulation to do first.
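On the pandas side, Series.str.split does accept expand=True, so the steps described in this comment could be sketched roughly as follows. The DataFrame is a hand-built stand-in for the one read from the file, and the names cells and parts are only illustrative.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame read from bd_2_test.txt (assumption)
toto = pd.DataFrame({
    "Line": ["L1", "L2"],
    "1rst": ["(B:A:3:1)", "(C:E:2:0.5)"],
    "2nd": ["(A:C:5:2)", "(E:F:10:1)"],
    "3rd": ["(C:D:2:3)", "(F:D:0.5:0.5)"],
})

# Skip the "Line" column and flatten the remaining cells row by row
cells = pd.Series(toto.iloc[:, 1:].to_numpy().ravel())

# Drop the parentheses, then split on ":" into one column per field
parts = cells.str.strip("()").str.split(":", expand=True)

print(parts.to_numpy())
```

Every field is still a string at this point; converting the last two columns to numbers would be a separate step.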

Question 6

I am not sure that what you are trying is the optimal solution to your real problem. But, well, staying as close as possible to your initial attempt:

import re
import numpy as np

# We need a regular expression to transform a string "(x:y:z:t)" into an array ["x","y","z","t"]
# tr does that transformation (raw string, so the backslashes reach the regex engine untouched)
tr = lambda s: np.array(re.findall(r'\(([^:]*):([^:]*):([^:]*):([^:]*)\)', s)[0])
# Alternative version, without re (and maybe best, I've benchmarked it)
# s[1:-1] removes the first and last characters (the parentheses); .split(':') returns the list of colon-separated substrings
tr = lambda s: s[1:-1].split(':')
# trv is the vectorization of tr
# We need the signature because tr returns an array rather than a scalar
trv = np.vectorize(tr, signature='()->(n)')
result = trv(matrix_toto[:,1:].flatten())

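Note that matrix_toto[:,1:] is your matrix without the 1st column (the line name), and matrix_toto[:,1:].flatten() flattens it, so we have 1 entry per cell of your initial array (excluding the line name). Each of those cells is a string "(x:y:z:t)", which trv transforms into an array.

The result is shown below. Note that every entry is cut to a single character ('0.5' appears as '0' and '10' as '1'), because np.vectorize takes the output dtype '<U1' from the first row when no otypes is given: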

array([['B', 'A', '3', '1'],
 ['A', 'C', '5', '2'],
 ['C', 'D', '2', '3'],
 ['C', 'E', '2', '0'],
 ['E', 'F', '1', '1'],
 ['F', 'D', '0', '0']], dtype='<U1')

Obviously you need only one of the 2 lines tr=.... I left both in the code because I don't know the exact specification of those (x:y:z:t) patterns, so you may need to adapt one of the 2 versions.
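If the one-character strings in the result above are a problem, or if real numbers are wanted as in the question's target array, a plain list comprehension may be simpler than np.vectorize. This is only a sketch: parse_cell and to_number are hypothetical helper names, the array is typed in by hand in place of toto.values, and it assumes every cell has exactly four colon-separated fields whose last two are numeric.

```python
import numpy as np

# Hand-typed stand-in for matrix_toto = toto.values (assumption)
matrix_toto = np.array([
    ["L1", "(B:A:3:1)", "(A:C:5:2)", "(C:D:2:3)"],
    ["L2", "(C:E:2:0.5)", "(E:F:10:1)", "(F:D:0.5:0.5)"],
], dtype=object)


def to_number(text):
    """Hypothetical helper: return an int when the value is whole, else a float."""
    value = float(text)
    return int(value) if value.is_integer() else value


def parse_cell(cell):
    """Hypothetical helper: turn "(x:y:z:t)" into [x, y, number(z), number(t)]."""
    first, second, third, fourth = cell.strip("()").split(":")
    return [first, second, to_number(third), to_number(fourth)]


# Skip the line-name column, flatten row by row, and parse each cell.
# dtype=object keeps strings and numbers side by side without truncation.
result = np.array(
    [parse_cell(cell) for cell in matrix_toto[:, 1:].ravel()],
    dtype=object,
)

print(result)
```

The object dtype avoids the fixed-width string type entirely, at the cost of losing fast numeric operations on the array as a whole.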

chrslg · Accepted Answer · 2022-10-02 16:26:27Z
