I am dropping rows from a pandas DataFrame when certain columns have a value of 0. I got the desired output with the code below, but I hope the same can be done with less code, perhaps in a single line.
df:
A B C
0 1 2 5
1 4 4 0
2 6 8 4
3 0 4 2
My code:
drop_A=df.index[df["A"] == 0].tolist()
drop_B=df.index[df["C"] == 0].tolist()
c=drop_A+drop_B
df=df.drop(df.index[c])
[out]
A B C
0 1 2 5
2 6 8 4
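For reference, here is a minimal, self-contained sketch of the approach above. The DataFrame construction is assumed from the sample data, and the final drop is applied directly to the collected labels, which is equivalent here because the index is the default RangeIndex:
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 0], 'B': [2, 4, 8, 4], 'C': [5, 0, 4, 2]})

drop_A = df.index[df["A"] == 0].tolist()  # row labels where column A is 0 -> [3]
drop_B = df.index[df["C"] == 0].tolist()  # row labels where column C is 0 -> [1]
df = df.drop(drop_A + drop_B)             # drop both sets of labels
print(df)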
- Do you want to know a better way to do what your code is doing, or do you want us to code golf it? – Peilonrayz, Jan 18, 2018
- I need a better way. – pyd, Jan 18, 2018
2 Answers
You need to create a boolean DataFrame by comparing the filtered columns with the scalar 0 for inequality, and then check that all values per row are True with all:
df = df[(df[['A','C']] != 0).all(axis=1)]
print (df)
A B C
0 1 2 5
2 6 8 4
Details:
print (df[['A','C']] != 0)
A C
0 True True
1 True False
2 True True
3 False True
print ((df[['A','C']] != 0).all(axis=1))
0 True
1 False
2 True
3 False
dtype: bool
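If the zero check should apply to every column rather than only A and C, the same pattern works without selecting a subset. This generalization is a sketch added for illustration, not part of the original answer:
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 0], 'B': [2, 4, 8, 4], 'C': [5, 0, 4, 2]})

# Keep only rows in which no column contains a 0
df = df[(df != 0).all(axis=1)]
print(df)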
Alternatively, create a boolean DataFrame by comparing the values with the scalar 0 for equality, check whether any value per row is True with any, and finally invert the mask with ~:
df = df[~(df[['A','C']] == 0).any(axis=1)]
Details:
print (df[['A','C']])
A C
0 1 5
1 4 0
2 6 4
3 0 2
print (df[['A','C']] == 0)
A C
0 False False
1 False True
2 False False
3 True False
print ((df[['A','C']] == 0).any(axis=1))
0 False
1 True
2 False
3 True
dtype: bool
print (~(df[['A','C']] == 0).any(axis=1))
0 True
1 False
2 True
3 False
dtype: bool
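For readers who prefer method calls over comparison operators, both masks can also be written with the DataFrame methods ne and eq. This rephrasing is an addition for illustration, not part of the answer:
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 0], 'B': [2, 4, 8, 4], 'C': [5, 0, 4, 2]})

# Equivalent to (df[['A','C']] != 0).all(axis=1)
keep = df[['A', 'C']].ne(0).all(axis=1)

# Equivalent to ~(df[['A','C']] == 0).any(axis=1)
also_keep = ~df[['A', 'C']].eq(0).any(axis=1)

print(df[keep])
print(keep.equals(also_keep))  # True -- both masks select the same rows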
- Jezrael, I want to consider only columns A and C; please check my question again. – pyd, Jan 18, 2018
- @pyd Clarify this in your question. – Jan 18, 2018
- You have both "all not equal to 0" and "not any equal to zero". Did you intend these to be two options, or did you accidentally post two solutions? – Acccumulation, Jan 18, 2018
- @Acccumulation No, it was not an accident. I posted the best solution first and the second-best one after it. :) – jezrael, Jan 18, 2018
A one-line hack using .dropna():
import pandas as pd
df = pd.DataFrame({'A':[1,4,6,0],'B':[2,4,8,4],'C':[5,0,4,2]})
print(df)
A B C
0 1 2 5
1 4 4 0
2 6 8 4
3 0 4 2
columns = ['A', 'C']
df = df.replace(0, pd.np.nan).dropna(axis=0, how='any', subset=columns).fillna(0).astype(int)
print(df)
A B C
0 1 2 5
2 6 8 4
So, what's happening is:
- Replace 0 by NaN with .replace()
- Use .dropna() to drop NaN, considering only columns A and C
- Replace NaN back to 0 with .fillna() (not needed if you use all columns instead of only a subset)
- Correct the data type from float to int with .astype()
- Nice! I was hoping there was a .dropna() hack to be had... good one, paulo! – killian95, Apr 5, 2019
- Just tried this and received the following: FutureWarning: The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead. – JMVDA, Oct 6, 2020
- Sure, just import numpy directly (import numpy as np) and replace pd.np.nan with np.nan instead. – paulo.filip3, Oct 6, 2020
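Putting that comment into practice, here is the same one-liner as a sketch with numpy imported directly, assuming the example DataFrame from the answer:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 0], 'B': [2, 4, 8, 4], 'C': [5, 0, 4, 2]})

columns = ['A', 'C']
# np.nan replaces the deprecated pd.np.nan; the rest of the chain is unchanged
df = df.replace(0, np.nan).dropna(axis=0, how='any', subset=columns).fillna(0).astype(int)
print(df)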