I am trying to sort an array and then split it into groups in Python.
Example:
I have a data file like this that I will import:
x y z
1 3 83
2 4 38
8 1 98
3 87 93
4 1 73
1 3 67
9 9 18
1 4 83
9 3 93
8 2 47
I want it to first look like this:
x y z
1 3 83
1 3 67
1 4 83
2 4 38
3 87 93
4 1 73
8 1 98
8 2 47
9 3 93
9 9 18
So the rows are sorted by the x column in ascending order, then by the y column.
Finally, I want to build an array out of these arrays, with one sub-array per distinct x value. Can I do that?
So I have:
array[0] = [[1, 3, 83],[1, 3, 67],[1, 4, 83]]
array[1] = [[2, 4, 38]]
array[2] = [[3, 87, 93]]
array[3] = [[4, 1, 73]]
array[4] = [[8, 1, 98],[8, 2, 47]]
and so on...
Starting out:
import numpy as np
import matplotlib.pyplot as plt
data_file_name = 'whatever.dat'
data=np.loadtxt(data_file_name)
- Can you please provide a minimal reproducible example so we can assist with the issues you are having in your implementation attempt? – idjaw, Apr 1, 2016 at 19:15
- Are you willing to use the Pandas package, or do you want a pure Python solution? – Alexander, Apr 1, 2016 at 19:19
- Pure Python would be best, thank you kindly. – sci-guy, Apr 1, 2016 at 19:19
2 Answers
Here is a NumPy solution (given that you used it for loading the data):
import numpy as np

data_file_name = 'whatever.dat'
# Skip the header row and load the three columns into named fields
data = np.loadtxt(data_file_name,
                  skiprows=1,
                  dtype=[('x', float), ('y', float), ('z', float)])
# Sort the rows by x, then y, then z
data.sort(axis=0, order=['x', 'y', 'z'])
# sorted() is needed because a set has no guaranteed iteration order
unique_x_col_vals = sorted(set(row[0] for row in data))
array = {n: [list(row) for row in data if row[0] == val]
         for n, val in enumerate(unique_x_col_vals)}
>>> array
{0: [[1.0, 3.0, 67.0], [1.0, 3.0, 83.0], [1.0, 4.0, 83.0]],
 1: [[2.0, 4.0, 38.0]],
 2: [[3.0, 87.0, 93.0]],
 3: [[4.0, 1.0, 73.0]],
 4: [[8.0, 1.0, 98.0], [8.0, 2.0, 47.0]],
 5: [[9.0, 3.0, 93.0], [9.0, 9.0, 18.0]]}
It uses a dictionary comprehension to build the result (a dict keyed by group index rather than a true array), with an inner list comprehension that collects the rows matching each unique value in column x.
I've used floats when importing the data, but you can also specify int if that matches your data.
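Since a pure Python approach was requested in the comments, here is a minimal sketch of the same sort-then-group idea using only the standard library. The inline `sample` text is a stand-in for the contents of 'whatever.dat', and the names `rows` and `array` are illustrative assumptions, not part of the code above.

```python
import io
from itertools import groupby

# Stand-in for open('whatever.dat'); same rows as in the question
sample = io.StringIO("""x y z
1 3 83
2 4 38
8 1 98
3 87 93
4 1 73
1 3 67
9 9 18
1 4 83
9 3 93
8 2 47
""")

next(sample)  # skip the header line
rows = [[int(v) for v in line.split()] for line in sample if line.strip()]

# Sort by x, then y. list.sort is stable, so rows that tie on (x, y)
# keep the order they had in the file.
rows.sort(key=lambda r: (r[0], r[1]))

# groupby only merges adjacent items, which is why the sort must come first
array = [list(group) for _, group in groupby(rows, key=lambda r: r[0])]

for n, group in enumerate(array):
    print(n, group)
```

This gives a plain list, so `array[0]` holds every row with the smallest x value, `array[1]` the next, and so on. Adding `r[2]` to the sort key would also order ties by z.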
You can use pandas for this, with just a couple of lines of code:
import pandas as pd
df = pd.read_csv(txt, sep=r"\s+")  # txt: path or file-like object with the data
print(df.sort_values(['x', 'y'], ascending=[True, True]))
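The sort alone does not produce the per-x groups the question asks for. One way to get them, sketched here under the assumption that a list of nested lists is the desired result, is to follow the sort with `groupby`. The inline `sample` text stands in for the data file, and `df_sorted` and `groups` are illustrative names.

```python
import io

import pandas as pd

# Stand-in for the data file; same rows as in the question
sample = io.StringIO("""x y z
1 3 83
2 4 38
8 1 98
3 87 93
4 1 73
1 3 67
9 9 18
1 4 83
9 3 93
8 2 47
""")

df = pd.read_csv(sample, sep=r"\s+")

# Sort on all three columns so the row order is fully determined
df_sorted = df.sort_values(["x", "y", "z"])

# groupby yields one sub-frame per distinct x, in ascending x order,
# keeping the sorted row order inside each group
groups = [g.to_numpy().tolist() for _, g in df_sorted.groupby("x")]

print(groups[0])
```

Here `groups[0]` contains the rows with x equal to 1, `groups[1]` those with x equal to 2, and so on.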