I have a dictionary as follows:
s_dict = {'s' : 'ATGCGTGACGTGA'}
I want to change the string stored as the value of the dictionary for key 's' at positions 4, 6, 7 and 10 to h, k, p and r.
pos_change = {'s' : ['4_h', '6_k', '7_p', '10_r']}
The only way I can think of is a nested loop:
for key in s_dict:
    for position in pos_change[key]:
        pos = int(position.split("_")[0])
        char = position.split("_")[1]
        l = list(s_dict[key])
        l[pos] = char
        s_dict[key] = "".join(l)
Output:
s_dict = {'s': 'ATGChTkpCGrGA'}
This works fine, but my actual s_dict file is about 1.5 GB. Is there a faster way to replace characters at specific indices in a string or list?
Thanks!
asked Aug 21, 2018 at 13:33
Homap
2 Answers
Here is my take on your interesting problem:
s_dict = {'s' : 'ATGCGTGACGTGA'}
pos_change = {'s' : ['4_h', '6_k', '7_p', '10_r']}
# First, change `pos_change` into something more easily usable
pos_change = {k: dict(x.split('_') for x in v) for k, v in pos_change.items()}
print(pos_change) # {'s': {'4': 'h', '6': 'k', '7': 'p', '10': 'r'}}
# and then...
for k, v in pos_change.items():
    temp = set(map(int, v))
    s_dict[k] = ''.join([x if i not in temp else pos_change[k][str(i)] for i, x in enumerate(s_dict[k])])
print(s_dict) # {'s': 'ATGChTkpCGrGA'}
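The comprehension above visits every character of the string in Python. A variation, sketched here as an assumption rather than as part of the original answer, is to convert the string to a list once, assign only at the changed positions, and join once at the end. The helper name `apply_changes` is hypothetical.

```python
def apply_changes(seq, changes):
    """Return seq with every 'index_char' entry in changes applied.

    Hypothetical helper: assumes each entry looks like '4_h' and that
    every index is within range of seq.
    """
    chars = list(seq)              # one copy of the string, made once
    for item in changes:
        pos, char = item.split("_")
        chars[int(pos)] = char     # in-place assignment at that index
    return "".join(chars)          # one join at the end


s_dict = {'s': 'ATGCGTGACGTGA'}
pos_change = {'s': ['4_h', '6_k', '7_p', '10_r']}

for key, changes in pos_change.items():
    s_dict[key] = apply_changes(s_dict[key], changes)

print(s_dict)  # {'s': 'ATGChTkpCGrGA'}
```

Note that a list of single-character strings stores one pointer per element, so for a string in the gigabyte range this can need several times the memory of the string itself. It is worth measuring on real data before committing to it.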
answered Aug 21, 2018 at 14:10
Ma0
As an alternative, you can use string slicing, s_dict['s'] = '%s%s%s' % (s_dict['s'][:pos], char, s_dict['s'][pos+1:]), instead of converting to a list and joining:
In [1]: s_dict = {'s' : 'ATGCGTGACGTGA' * 10}
   ...: pos_change = {'s' : ['4_h', '6_k', '7_p', '10_r']}
   ...:
   ...: def list_join():
   ...:     for key in s_dict:
   ...:         for position in pos_change[key]:
   ...:             pos = int(position.split("_")[0])
   ...:             char = position.split("_")[1]
   ...:             l = list(s_dict[key])
   ...:             l[pos] = char
   ...:             s_dict[key] = "".join(l)
   ...:
   ...: def by_str():
   ...:     for key in s_dict:
   ...:         for position in pos_change[key]:
   ...:             pos = int(position.split("_")[0])
   ...:             char = position.split("_")[1]
   ...:             # index by key (not a hard-coded 's') so every entry is handled
   ...:             values = s_dict[key][:pos], char, s_dict[key][pos+1:]
   ...:             s_dict[key] = '%s%s%s' % values
   ...:
In [2]: %timeit list_join()
11.7 μs ± 191 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [3]: %timeit by_str()
4.29 μs ± 46.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
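Both functions timed above still copy the whole string once per replacement. A possible refinement, offered only as a sketch and not benchmarked here, is to sort the positions and stitch the untouched slices together with a single join. The name `replace_at` and the integer-keyed `changes` mapping are assumptions made for illustration.

```python
def replace_at(seq, changes):
    """Return seq with changes applied in a single pass.

    changes is assumed to map an integer index to its replacement
    character, with every index inside range(len(seq)).
    """
    pieces = []
    prev = 0
    for pos in sorted(changes):
        pieces.append(seq[prev:pos])   # untouched run before this index
        pieces.append(changes[pos])    # the replacement character
        prev = pos + 1
    pieces.append(seq[prev:])          # tail after the last replacement
    return "".join(pieces)


s_dict = {'s': 'ATGCGTGACGTGA'}
pos_change = {'s': ['4_h', '6_k', '7_p', '10_r']}

for key, items in pos_change.items():
    # Parse '4_h' style entries into {4: 'h', ...}
    changes = {int(p): c for p, c in (item.split("_") for item in items)}
    s_dict[key] = replace_at(s_dict[key], changes)

print(s_dict)  # {'s': 'ATGChTkpCGrGA'}
```

With this layout the number of full-length copies no longer grows with the number of replacements, which should matter more as the string and the change list get larger.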
Comments
pos_change would be better as a dict of dicts (pos_change = {'s' : {4: 'h', 6: 'k', 7: 'p', 10: 'r'}}).
Consider a bytearray, as it is mutable.
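The bytearray suggestion can be sketched as follows. This assumes the sequence and the replacement characters are plain ASCII, so that one character corresponds to one byte; the helper name `apply_changes_bytes` is hypothetical.

```python
def apply_changes_bytes(seq, changes):
    """Apply 'index_char' changes using a mutable bytearray.

    Assumes seq and all replacement characters are ASCII, so each
    character occupies exactly one byte.
    """
    buf = bytearray(seq, "ascii")      # mutable copy, one byte per character
    for item in changes:
        pos, char = item.split("_")
        buf[int(pos)] = ord(char)      # in-place single-byte assignment
    return buf.decode("ascii")         # back to str once all changes are applied


s_dict = {'s': 'ATGCGTGACGTGA'}
pos_change = {'s': ['4_h', '6_k', '7_p', '10_r']}

for key, changes in pos_change.items():
    s_dict[key] = apply_changes_bytes(s_dict[key], changes)

print(s_dict)  # {'s': 'ATGChTkpCGrGA'}
```

If the data can stay as bytes throughout (for example, read from a file opened in binary mode), the encode and decode steps could be dropped and the bytearray stored in the dictionary directly.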