So I've got some data which I scraped from a website and it looks like this:
table = "[['January 2010',1368,719],['February 2010',1366,729] ... ['April 2018',1429,480],['Today',1440,448]]"
It's already formatted pretty well, but how would I go about converting this string into a list of lists that looks exactly the same, just with a list datatype instead of a string?
I tried just converting the datatype but that didn't give me the right answer:
> print(list(table))
> ['[', '[', "'", 'J', 'a' ... '4', '4', '8', ']', ']']
Thanks in advance
2 Answers
Use ast.literal_eval:
In [24]: import ast
In [25]: ast.literal_eval(table)
Out[25]:
[['January 2010', 1368, 719],
 ['February 2010', 1366, 729],
 ['April 2018', 1429, 480],
 ['Today', 1440, 448]]
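For completeness, here is a minimal, self-contained sketch of the same idea. The `table` string below is a shortened, hypothetical stand-in for the scraped data (the elided rows are not reconstructed), and `parse_table` is an illustrative helper name, not part of the standard library. `ast.literal_eval` only evaluates Python literals (strings, numbers, lists, tuples, dicts and so on), so unlike `eval` it will not execute arbitrary expressions found in scraped text.

```python
import ast

# Hypothetical, shortened stand-in for the scraped string.
table = "[['January 2010',1368,719],['February 2010',1366,729],['Today',1440,448]]"

# Parse the string as a Python literal; the result is a real list of lists.
rows = ast.literal_eval(table)

print(type(rows))   # <class 'list'>
print(rows[0])      # ['January 2010', 1368, 719]


def parse_table(text):
    """Parse text into a list of lists, raising ValueError on anything else.

    parse_table is an illustrative helper, not a library function.
    """
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # literal_eval rejects non-literal input such as function calls.
        raise ValueError(f"not a valid Python literal: {exc}") from exc
    if not isinstance(value, list) or not all(isinstance(r, list) for r in value):
        raise ValueError("expected a list of lists")
    return value
```

Wrapping the call like this is optional; it just gives scraped input a single failure mode (`ValueError`) when the page layout changes.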
I found a more involved approach to a similar problem. It is more complicated and requires that all the strings have the same length:
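import pandas as pd

# Nested dict: the outer key becomes the column label, the inner keys become the index.
_data = {'000000222222000000': {4: '0000022aaad2200000',
                                5: '000002adaada200000',
                                6: '00002aadaaada20000',
                                7: '00002adadacaa20000',
                                8: '0002d0addadc0a2000'}}
_df = pd.DataFrame(_data)
# Split each string in the first column into characters,
# then expand them into one column per character.
df = _df.iloc[:, 0].apply(list).apply(pd.Series)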
Output:
    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
4   0  0  0  0  0  2  2  a  a  a  d  2  2  0  0  0  0  0
5   0  0  0  0  0  2  a  d  a  a  d  a  2  0  0  0  0  0
6   0  0  0  0  2  a  a  d  a  a  a  d  a  2  0  0  0  0
7   0  0  0  0  2  a  d  a  d  a  c  a  a  2  0  0  0  0
8   0  0  0  2  d  0  a  d  d  a  d  c  0  a  2  0  0  0
To get what you are looking for, I imagine you could find the longest string in the list, convert your lists to a dictionary with unique keys, pad each string with leading or trailing characters up to that length, and then simplify the DataFrame. I can walk through this if required, but I did not need it for my version of the problem.
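The padding step described above could look roughly like this in plain Python. This is only a sketch under assumptions, not the approach used in this answer: `strings_to_rows`, the sample strings, and the choice of `'0'` as a trailing fill character are all hypothetical.

```python
def strings_to_rows(strings, fill="0"):
    """Pad every string to the longest length, then split each into characters.

    strings_to_rows is an illustrative helper; fill must be a single character.
    """
    # Length of the longest string (0 if the input is empty).
    width = max((len(s) for s in strings), default=0)
    # str.ljust pads on the right; use str.rjust for leading padding instead.
    return [list(s.ljust(width, fill)) for s in strings]


# Hypothetical sample strings of unequal length.
rows = strings_to_rows(["0022a", "02ad", "2aadaa"])
for row in rows:
    print(row)
```

Because every row now has the same length, the result can be passed straight to `pd.DataFrame(rows)` to get one column per character.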