I'm trying to use df.eval to evaluate an expression that contains a function call. In my example it's numpy.around, which I have imported into the local namespace. According to the documentation, prefixing the function name with @ should do the trick, but it throws this error:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
What am I doing wrong? The error occurs on a fresh run, both in an IDE (Spyder / Jupyter notebook) and from the console.
import numpy as np
import pandas as pd
from numpy import around
df = pd.DataFrame({'x':np.array([1.12,2.76])})
# this throws TypeError: 'Series' objects are mutable, thus they cannot be hashed
df['y'] = df.eval('@around(x,1)')
# this works
df['z'] = around(df['x'],1)
print(pd.__version__)
# 0.23.4
print(np.__version__)
# 1.15.1
import sys
print(sys.version)
# 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)]
1 Answer
Update
The workaround is simpler than I first thought: there is no need to remove the numexpr package. Just pass a different engine to eval so the expression is not evaluated with numexpr:
df['y'] = df.eval('@around(x,1)', engine='python')
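For completeness, here is a minimal self-contained sketch of the workaround, reusing the DataFrame from the question. It assumes that engine='python' behaves the same way in your pandas version as it did in the one I tested, and it compares the result against calling around directly rather than relying on a hard-coded output.

```python
import numpy as np
import pandas as pd
from numpy import around

# Same data as in the question.
df = pd.DataFrame({'x': np.array([1.12, 2.76])})

# Force the pure-Python engine so that numexpr is bypassed for this expression.
df['y'] = df.eval('@around(x, 1)', engine='python')

# Reference column computed without eval, for comparison.
df['z'] = around(df['x'], 1)

# Both columns should hold the same rounded values.
print(np.allclose(df['y'], df['z']))
print(df)
```

The only change from the failing call is the engine argument. The expression string and the @ prefix stay exactly as in the question.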
First answer
I managed to find a workaround and also figured out where the problem seems to be. First I updated conda itself and all packages (including Python to 3.7.0, though I don't believe the Python version is relevant). After that:
Step 1: remove pandas and numpy
conda remove pandas numpy
The following packages will be REMOVED:
bokeh: 0.13.0-py37_0
mkl_fft: 1.0.4-py37h1e22a9b_1
mkl_random: 1.0.1-py37h77b88f5_1
numba: 0.39.0-py37h830ac7b_0
numexpr: 2.6.8-py37h9ef55f4_0
numpy: 1.15.1-py37ha559c80_0
pandas: 0.23.4-py37h830ac7b_0
scikit-learn: 0.19.1-py37hae9bb9f_0
scipy: 1.1.0-py37h4f6bf74_1
Step 2: reinstall only pandas and numpy
conda install pandas numpy
The following NEW packages will be INSTALLED:
mkl_fft: 1.0.4-py37h1e22a9b_1
mkl_random: 1.0.1-py37h77b88f5_1
numpy: 1.15.1-py37ha559c80_0
pandas: 0.23.4-py37h830ac7b_0
After step 2, the code worked as expected, so the problem must lie in one of the other packages removed.
Step 3: add back each of the packages removed initially, one by one (bokeh, numba, numexpr, scikit-learn, scipy), and test each time whether the code still works. After installing numexpr the code failed, so that is where the problem is. I'm not sure how you could add numexpr back again: I tried some older versions, but the code failed every time.
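Since the behaviour depends on whether numexpr is present, a quick way to check a given environment is sketched below. The helper name numexpr_installed is my own (hypothetical, not part of pandas), and it only tests whether the package can be found, not whether pandas considers that version usable.

```python
import importlib.util


def numexpr_installed() -> bool:
    """Return True if the numexpr package can be located in this environment.

    Hypothetical helper for illustration; it does not import numexpr and
    does not check which version is installed.
    """
    return importlib.util.find_spec("numexpr") is not None


# When numexpr is available, eval uses it unless another engine is requested,
# so an environment without numexpr silently falls back to the Python engine.
print("numexpr installed:", numexpr_installed())
```

If this prints True and the expression fails, passing engine='python' to eval is the simpler route than uninstalling anything.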
1 Comment
.eval uses numexpr when it is installed, the same as .query... Checking pd.show_versions(), I see I didn't have it installed, which is presumably why your code worked in the virtualenv I was using. This feels like it should be reported as an issue on the pandas GitHub...
With numpy 1.14.5 and pandas 0.20.3 I get {'x': {0: 1.12, 1: 2.76}, 'y': {0: 1.1, 1: 2.8}}, which looks correct...