I have the following array:
x = np.array([1, 1, 2, 2, 2])
whose np.unique values are [1, 2].
How do I generate the following list:
[1, 2, 1, 2, 3]
i.e. a running index, starting from 1, for each of the unique elements in x?
3 Answers
You can use pandas' GroupBy.cumcount()
after grouping by the value itself. It does exactly that:
"Number each item in each group from 0 to the length of that group - 1."
Try this:
import numpy as np
import pandas as pd
x = np.array([1, 1, 2, 2, 2])
# cumcount() is 0-based, so add 1; tolist() yields plain Python ints
places = (pd.Series(x).groupby(by=x).cumcount() + 1).tolist()
print(places)
Output:
[1, 2, 1, 2, 3]
- Thanks. Is there a way to do this without pandas? (ajrlewis, Jul 16, 2019 at 10:52)
- @alex_lewis I tried with native NumPy functions but couldn't get a nice solution. The other way I can think of is using list.index in a loop, which will be 100x slower :( Maybe someone else will come up with something NumPy-only. (Adam.Er8, Jul 16, 2019 at 11:07)
Just use return_counts=True
of np.unique
with a list comprehension and np.hstack.
It is still faster than the pandas solution.
c = np.unique(x, return_counts=True)[1]
np.hstack([np.arange(item)+1 for item in c])
Out[869]: array([1, 2, 1, 2, 3], dtype=int64)
- This will only work if equal values are contiguous (grouped together). For
x = np.array([1, 1, 2, 2, 2, 1])
you'll get [1 2 3 1 2 3]
while you should be getting [1 2 1 2 3 3]
(Adam.Er8, Jul 16, 2019 at 12:01)
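For the NumPy-only request in the comments, here is a minimal sketch that also copes with equal values that are not adjacent. It is not taken from any of the answers; the function name running_index is hypothetical, and it assumes x is a one-dimensional array of sortable values. The idea is to sort stably, number each run of equal values in the sorted array, then scatter those numbers back to the original positions.

```python
import numpy as np

def running_index(x):
    """Hypothetical helper: 1-based occurrence number of each element of x."""
    x = np.asarray(x)
    # A stable sort keeps equal values in their original relative order.
    order = np.argsort(x, kind="stable")
    sorted_x = x[order]
    # Mark where each run of equal values begins in the sorted array.
    is_start = np.r_[True, sorted_x[1:] != sorted_x[:-1]]
    starts = np.flatnonzero(is_start)
    run_lengths = np.diff(np.r_[starts, len(x)])
    # Position inside each run, counted from 1.
    within = np.arange(len(x)) - np.repeat(starts, run_lengths) + 1
    # Scatter the run positions back to the original element order.
    out = np.empty(len(x), dtype=int)
    out[order] = within
    return out

print(running_index(np.array([1, 1, 2, 2, 2])))
print(running_index(np.array([1, 1, 2, 2, 2, 1])))
```

The sort makes this O(n log n); whether it beats the pandas version on a given input has not been measured here.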
I'm not sure whether this is any faster or slower, but if you just need a list result without pandas, you could try this:
import numpy as np
from collections import Counter

arr = np.array([1, 1, 2, 2, 2])
# One range 1..count per distinct value, in first-seen order
ranges = [range(1, count + 1) for count in Counter(arr).values()]
result = []
for r in ranges:
    result.extend(r)
print(result)
[1, 2, 1, 2, 3]
(or make your own counter with a dict instead of Counter())
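The "make your own counter with a dict" suggestion could look roughly like the sketch below. The function name running_index is hypothetical, and the approach differs slightly from the Counter code above: it counts while iterating, so each element gets its occurrence number in place and equal values do not need to be adjacent.

```python
def running_index(values):
    """Hypothetical helper: 1-based occurrence number of each item in values."""
    seen = {}      # value -> how many times it has appeared so far
    result = []
    for v in values:
        seen[v] = seen.get(v, 0) + 1
        result.append(seen[v])
    return result

print(running_index([1, 1, 2, 2, 2]))     # [1, 2, 1, 2, 3]
print(running_index([1, 1, 2, 2, 2, 1]))  # [1, 2, 1, 2, 3, 3]
```

This is a single pass over the data, but it runs as a Python-level loop, so it may be slow on large arrays.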