Pandas cat.categories.isin list, is this a bug?

zljubisic at gmail.com zljubisic at gmail.com
Mon May 14 07:18:31 EDT 2018


On Monday, 14 May 2018 13:05:24 UTC+2, zlju... at gmail.com wrote:
> Hi,
>
> I have a dataframe with a CRM_assetID column of category dtype:
>
> df.info()
>
> <class 'pandas.core.frame.DataFrame'>
> RangeIndex: 1435952 entries, 0 to 1435951
> Data columns (total 75 columns):
> startTime 1435952 non-null object
> CRM_assetID 1435952 non-null category
>
> Searching the dataframe for each of three categories:
>
> df[df.CRM_assetID == 'V1254748'].shape
> (35, 75)
> df[df.CRM_assetID == 'V805722'].shape
> (45, 75)
> df[df.CRM_assetID == 'V1105400'].shape
> (34, 75)
>
> len(df.CRM_assetID.cat.categories.isin(['V1254748', 'V805722', 'V1105400']))
>
> Why is this len not equal to 114 (35 + 45 + 34)?
>
> Regards.

I forgot to include the result of:
len(df.CRM_assetID.cat.categories.isin(['V1254748', 'V805722', 'V1105400']))
which is 55418.
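
For what it is worth, a small sketch on made-up data suggests why the two numbers differ. The toy frame and the `wanted` list below are hypothetical stand-ins, not the real data. The point is that `.cat.categories` is the Index of unique category labels, so calling `isin()` on it gives one boolean per category, and `len()` of that result is simply the number of categories (presumably 55418 in the real column), whatever list is passed in. Calling `isin()` on the column itself tests every row instead.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 1,435,952-row frame.
df = pd.DataFrame({
    "CRM_assetID": pd.Categorical(["A", "A", "B", "C", "C", "C", "D"]),
})

wanted = ["A", "C"]  # hypothetical IDs to search for

# .cat.categories holds the unique category labels, so isin() returns
# one boolean per category, not one per row.
per_category = df.CRM_assetID.cat.categories.isin(wanted)
n_categories = len(per_category)             # number of categories
n_matching_categories = per_category.sum()   # categories found in `wanted`

# Series.isin() tests each row, which is what the row counts need.
row_mask = df.CRM_assetID.isin(wanted)
n_matching_rows = int(row_mask.sum())
matching_shape = df[row_mask].shape

print(n_categories, n_matching_categories, n_matching_rows, matching_shape)
```

On the real frame, the equivalent of `df.CRM_assetID.isin([...]).sum()` should give the 114 expected above, assuming the three per-ID counts are correct.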


