Return default if pandas DataFrame.loc location doesn't exist

Question 1

I find myself often having to check whether a column or row exists in a dataframe before trying to reference it. For example I end up adding a lot of code like:

if 'mycol' in df.columns and 'myindex' in df.index:
    x = df.loc['myindex', 'mycol']
else:
    x = mydefault

Is there any way to do this more nicely? For example on an arbitrary object I can do x = getattr(anobject, 'id', default) - is there anything similar to this in pandas? Really any way to achieve what I'm doing more gracefully?

Question 2

Series has a get method that returns a default value when the key is missing.

So you could do:

df.mycol.get(myIndex, np.nan)

Example:

In [117]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'mycol': np.arange(5), 'dummy': np.arange(5)})
df
Out[117]:
   dummy  mycol
0      0      0
1      1      1
2      2      2
3      3      3
4      4      4
[5 rows x 2 columns]
In [118]:
print(df.mycol.get(2, np.nan))
print(df.mycol.get(5, np.nan))
2
nan

Question 3

I was also able to get it to work when the index is known to exist: df.loc['myindex'].get('mycol', NaN). It's a shame that you still need to be sure that either the index or the column exists, but nonetheless this will be useful in a lot of scenarios. Thank you!
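For anyone trying the row-first variant from this comment, here is a minimal sketch. The frame and its labels are made up for illustration; the only assumption is that the row label really is present, since df.loc on a missing row still raises KeyError.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string row labels, for illustration only.
df = pd.DataFrame({"mycol": [10, 20]}, index=["myindex", "other"])

# df.loc with a single row label returns that row as a Series indexed by
# column name. This step still raises KeyError if the row label is missing.
row = df.loc["myindex"]

# Series.get then supplies the default when the column is missing.
present = row.get("mycol", np.nan)
missing = row.get("nocol", np.nan)

print(present, missing)  # 10 nan
```

So this only guards one axis: the column lookup is safe, the row lookup is not.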

Question 4

Python has a culture of asking for forgiveness rather than permission. You'll find a lot of posts on this matter, such as this one.

In Python, catching exceptions is relatively inexpensive, so you're encouraged to use them. This is called the EAFP approach (easier to ask for forgiveness than permission).

For example:

try:
    x = df.loc['myindex', 'mycol']
except KeyError:
    x = mydefault

Question 5

Perhaps I should use more EAFP, but my personal preference is to save try/excepts for when there's no other easy choice. Thanks though.

Question 6

@Foobar: according to this link, only the try: is inexpensive; the except: seems to be expensive. The moral of the story seems to be that the caller is left to decide between testing for existence and using try:/except:. The performance trade-off depends on your use case, i.e. how long it takes to test for existence versus how many times the untested access will raise. Nevertheless, it would be nice if pandas offered syntactic sugar that let an argument drive that choice. As far as I can tell, it does not.
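If you want to measure that trade-off on your own data rather than guess, something like the following rough sketch might help. The helper names lbyl and eafp, the frame, and the repetition count are all assumptions made for this example; actual timings will vary with pandas version, index type, and how often lookups miss, so no numbers are quoted here.

```python
import timeit

import pandas as pd

# Hypothetical frame with a default integer index, for illustration only.
df = pd.DataFrame({"mycol": range(1000)})


def lbyl(row, col, default=None):
    # "Look before you leap": test for existence, then access.
    if col in df.columns and row in df.index:
        return df.loc[row, col]
    return default


def eafp(row, col, default=None):
    # "Easier to ask forgiveness": access first, handle the failure.
    try:
        return df.loc[row, col]
    except KeyError:
        return default


# Time both styles for a lookup that succeeds and one that fails.
for label, row in [("hit", 5), ("miss", -1)]:
    t_lbyl = timeit.timeit(lambda: lbyl(row, "mycol"), number=1000)
    t_eafp = timeit.timeit(lambda: eafp(row, "mycol"), number=1000)
    print(f"{label}: LBYL {t_lbyl:.4f}s, EAFP {t_eafp:.4f}s")
```

Comparing the "hit" and "miss" lines should show whether the cost of raising matters at your expected miss rate.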

Question 7

There is the get method for DataFrame to get a column and another get for Series to get an item. So you can chain them together to get a single value:

   A  B
0  0  2
1  1  3

df.get('B', default=pd.Series()).get(1, default='[unknown]')

Output:

3

If the index or column is missing:

df.get('B', default=pd.Series()).get(2, default='[unknown]')
# or
df.get('C', default=pd.Series()).get(1, default='[unknown]')

Output:

'[unknown]'
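The chained lookup can be wrapped in a small helper, sketched below. The name get_cell is hypothetical, and this is a slight variation on the snippet above: it relies on DataFrame.get returning None for a missing column instead of passing an empty Series as the default.

```python
import pandas as pd

# Same small frame as in the example above.
df = pd.DataFrame({"A": [0, 1], "B": [2, 3]})


def get_cell(frame, row, col, default=None):
    """Hypothetical helper: look up one cell, falling back to default."""
    # DataFrame.get returns None (its own default) when the column is missing.
    column = frame.get(col)
    if column is None:
        return default
    # Series.get returns default when the row label is missing.
    return column.get(row, default)


print(get_cell(df, 1, "B", "[unknown]"))  # 3
print(get_cell(df, 2, "B", "[unknown]"))  # [unknown]
print(get_cell(df, 1, "C", "[unknown]"))  # [unknown]
```

Both lookups here are by label, so on a frame with a non-default index you would pass the row label, not a position.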

Question 8

Use reindex:

df.reindex(index=['row1', 'row2'], columns=['col1', 'col2'], fill_value=mydefault)

What's great here is using lists for the rows and columns, where some of them exist and some of them don't, and you get the fallback value whenever either the row or column is missing.

Example:

In[1]:
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [5, 3, 7],
})
df
Out[1]:
   A  B
0  1  5
1  2  3
2  3  7
In[2]:
df.reindex(index=[0, 1, 100], columns=['A', 'C'], fill_value='FV')
Out[2]:
      A   C
0     1  FV
1     2  FV
100  FV  FV
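As a runnable sketch of using this for lookups: reindex once over every label you intend to read, then index the result with plain .loc. The numeric fill value -1 is an assumption chosen for this example; any sentinel that cannot be confused with real data would do.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [5, 3, 7]})

# Labels we intend to read; row 100 and column 'C' do not exist in df.
wanted_rows = [0, 1, 100]
wanted_cols = ["A", "C"]

# reindex returns a new frame and leaves df untouched.
# The fill value -1 is an arbitrary sentinel chosen for this sketch.
safe = df.reindex(index=wanted_rows, columns=wanted_cols, fill_value=-1)

print(safe.loc[0, "A"])    # existing cell
print(safe.loc[100, "A"])  # missing row, so the fill value
print(safe.loc[100, "C"])  # missing row and column, so the fill value
print(df.shape)            # original frame is unchanged
```

Every label passed to reindex is guaranteed to exist in the result, so the later .loc calls cannot raise KeyError for those labels.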

Question 9

Always good to have alternatives, but this would be very slow on a df of any significant size. It creates an entirely new df just to get one value.

Question 10

@fantabolous Well, the point of this is that you can get more than one value, in which case you are creating a new df anyway.

Question 11

Define a function:

# Define Function:
def getvalue(df, index, column_key, default_value):
    try:
        return df.loc[index, column_key]
    except KeyError:
        return default_value

Example:

import pandas as pd

# define dictionary
thisdict = {
    "brand": ["Ford", 'Honda', 'Toyta'],
    "model": ["Mustang", 'CRV', 'Camry'],
    "year": [1964, 2004, 1892]
}
# create dataframe
df = pd.DataFrame(thisdict)
# print dataframe
print(df)
print()
# Test all 4 scenarios
colNotFound = getvalue(df, 1, 'name', "ColNotFound")
print(colNotFound + '\n')
indexNotFound = getvalue(df, 4, 'model', "indexNotFound")
print(indexNotFound + '\n')
colandindexNotFound = getvalue(df, 4, 'name', "colandindexNotFound")
print(colandindexNotFound + '\n')
keyandcolindf = getvalue(df, 1, 'model', "Nothing")
print(keyandcolindf + '\n')

output:

   brand    model  year
0   Ford  Mustang  1964
1  Honda      CRV  2004
2  Toyta    Camry  1892

ColNotFound

indexNotFound

colandindexNotFound

CRV

Question 12

I like the idea, but I tried and neither method actually works. Did you test these? I may be wrong but I don't think KeyError can be used as a boolean (first method) or index key (second method). Also, in the first method it evaluates the conditional BEFORE it evaluates df.loc[] so the KeyError wouldn't have raised yet.

Question 13

I have revised my answer.

EdChum · Accepted Answer · 2014-05-01 08:49:45Z
