Resampling data #210

Answered by kernc
jamhot1 asked this question in Q&A
Dec 27, 2020 · 1 comment · 5 replies
I have been playing around with resampling and decided to create a resampled hourly close column with pandas from 5-minute Yahoo data.
Because the resampled value only appears once every 12 rows, I wanted a variable that stores the last known value so I can use it in a condition.
The data looks something like:
nan
nan
nan
nan
nan ....
109.66653
nan
nan...

I created a condition that checks the value is not nan and stores it in a variable, so I can check against it later in my trade logic.

When I print the variable to see what is being stored, it seems to hold all values, including the nans, even though the condition sits inside the next() method and should skip every nan.

I previously tried a different way of storing specific row values from within next() and ran into a similar issue. Am I missing something?

example:
if not self.data.H_Close == 'nan' : self.Hourly = self.data.H_Close

When self.Hourly is printed on each iteration, I get all values, including the nans.

confused!

Checking for nan should be done with numpy.isnan() as nans are a particularly curious bunch:

>>> float('nan') == float('nan')
False
>>> import numpy as np
>>> np.nan == np.nan
False

But more importantly, why not simply use Series.ffill()?

>>> import pandas as pd
>>> pd.Series([1, np.nan, np.nan, 2, np.nan]).ffill().tolist()
[1.0, 1.0, 1.0, 2.0, 2.0]
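
Both suggestions can be sketched on a column shaped like the one in the question. This is only an illustration: the name `h_close` and the value 110.2 are made up, and the loop stands in for whatever runs once per bar.

```python
import numpy as np
import pandas as pd

# Illustrative column: NaN everywhere except the rows that carry an hourly close.
h_close = pd.Series([np.nan, np.nan, 109.66653, np.nan, np.nan, 110.2, np.nan])

# A float NaN is never equal to the string 'nan', so that test cannot filter it.
print(float("nan") == "nan")  # False

# Option 1: remember the last non-NaN value while iterating row by row.
last_known = None
remembered = []
for value in h_close:
    if not np.isnan(value):
        last_known = value
    remembered.append(last_known)

# Option 2: let pandas carry the last valid value forward.
filled = h_close.ffill().tolist()

print(remembered)
print(filled)
```

From the third row onward both approaches hold the same values. Before the first hourly close there is nothing to carry forward, so the loop keeps None and ffill() keeps nan.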

Thanks for the pointers Kernc!!
I also tried:
df['D_Close'] = df['D_Close'].fillna('a')

and checked for 'a' instead.

def next(self):
    if not self.data.D_Close == 'a' :
        self.Hourly = self.data.D_Close
    print(self.Hourly)

And it prints the 'a' values along with the numbers.

I will read up on Series.ffill().

Just an update on where I think I was tripped up.

Where I had
self.Hourly = self.data.D_Close

I found that this works instead:
self.Hourly = self.data.D_Close[-1]

This gave me printed output that no longer included the entire array (nans and all) above it.

What am I seeing here? I can't quite understand why I need to look back a row to see what I thought was my current row.

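This is pretty much how Python sequences work: index -1 is the last element, and within next() the last element of the data array is the current row, not the previous one.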

The reason a simple:

self.data.Close

(i.e. without explicit indexing) can be used in some (that is, boolean and numeric-scalar) contexts:

# e.g.
if self.data.Close > self.data.Low + 2:

is the following piece of code which does the indexing for you:

def __bool__(self):
    try:
        return bool(self[-1])
    except IndexError:
        return super().__bool__()

def __float__(self):
    try:
        return float(self[-1])
    except IndexError:
        return super().__float__()
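
To see the effect in isolation, here is a toy stand-in rather than the library's actual class. `LastValueArray` is a hypothetical name, the prices are made up, and only the two methods quoted above are reproduced.

```python
import numpy as np


class LastValueArray(np.ndarray):
    """Toy ndarray subclass whose bool()/float() look at the last element."""

    def __bool__(self):
        try:
            return bool(self[-1])
        except IndexError:
            return super().__bool__()

    def __float__(self):
        try:
            return float(self[-1])
        except IndexError:
            return super().__float__()


# Made-up prices; .view() reinterprets the plain array as the subclass.
close = np.array([10.0, 11.0, 12.5]).view(LastValueArray)

stored = close              # plain assignment keeps the whole array
latest = close[-1]          # explicit indexing keeps only the last value

print(len(stored))          # 3
print(float(close))         # 12.5, via __float__
print(bool(close))          # True, via __bool__ (last value is non-zero)
print(latest)               # 12.5
```

Assignment goes through neither method, which would explain why storing `self.data.D_Close` without `[-1]` kept the entire array.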

ahhh, yes. Of course. I do now recall this. What has thrown me is that you can call self.data.Close without [-1] within next(). So which line does it call in this case?

Depends on the context:

if self.data.some_signal:

reduces to __bool__ method result above, and:

self.buy(limit=self.data.Low)

to the __float__ one.

But:

if self.data.Close > sma:

actually applies the comparison to the whole array first, producing a new boolean array, and only afterwards reduces to its last value (via __bool__ again). This makes it considerably less efficient than the explicit case (if self.data.Close[-1] > sma), but it's easy to write, easy to read, and it works well enough.
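
The difference between the two forms can be sketched with plain NumPy arrays. The values are made up, and `sma` is assumed here to be an array of the same length as `close`.

```python
import numpy as np

close = np.array([10.0, 11.0, 12.5, 12.0])
sma = np.array([10.5, 10.5, 11.0, 11.8])

# Implicit form: every bar is compared, producing a full boolean array,
# and only then is the last element used as the truth value.
whole = close > sma
implicit = bool(whole[-1])

# Explicit form: only the latest bar is compared.
explicit = bool(close[-1] > sma[-1])

print(whole.tolist())       # [False, True, True, True]
print(implicit, explicit)   # True True
```

Both forms reach the same decision. The implicit one just does a comparison for every bar seen so far on each call.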

Answer selected by jamhot1