Getting same values in next() method · kernc/backtesting.py · Discussion #477

flexelem
Sep 22, 2021

Hi there,

I am testing an indicator from the pandas_ta library and wrapping it with self.I in the init function. The indicator returns a DataFrame. Later, when I read its values inside the next() method, it unexpectedly prints the same value on every call.

from backtesting import Strategy
from pandas_ta import supertrend
import pandas

class Supertrend(Strategy):
    multiplier = 3.0
    periods = 10

    def init(self):
        high = self.data.High
        low = self.data.Low
        close = self.data.Close
        self.spt_df = self.I(supertrend, high.s, low.s, close.s, self.periods, self.multiplier).df
        self.dir = self.spt_df.iloc[:, 1]
        print(self.dir)

    def next(self):
        print(f'self.dir[-2] {self.dir[-2]} self.dir[-1] {self.dir[-1]}')
        if self.dir[-2] == -1 and self.dir[-1] == 1:
            # never enters here because value is always 1
            self.buy()
        elif self.dir[-2] == 1 and self.dir[-1] == -1:
            # never enters here because value is always 1
            self.sell()

When I print the self.dir column inside init(), I can verify that all the columns are calculated correctly:

DateTime
2017-08-30    1.0
2017-08-31    1.0
2017-09-01    1.0
2017-09-02    1.0
2017-09-03    1.0
2017-09-04   -1.0
2017-09-05   -1.0
2017-09-06   -1.0
2017-09-07   -1.0

But when I print the self.dir values in next(), I always get the same value, which does not reflect the bar that next() is currently processing.

self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0
self.dir[-2] 1.0 self.dir[-1] 1.0

I'd appreciate any kind of help. Thanks!


Replies: 5 comments 5 replies

AGG2017
Sep 22, 2021

I'm sorry that I don't have time to analyse your code, but at a glance I see that the indicator assignment line:
self.spt_df = self.I(supertrend, high.s, low.s, close.s, self.periods, self.multiplier).df
ends by calling .df, which converts it to a pandas DataFrame. That prevents you from using the internal array representation with a moving array length, where on each next() call [-1] is always the current last element. Instead you now have the complete DataFrame, and [-1] is always the last element of the original data. Remove .df, or use self._broker._i as a global index pointer to the current element, so self._broker._i-1 will be the previous bar, etc.
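The difference described here can be imitated in plain Python without the library. The sketch below is only an illustration: `run_bars` is a hypothetical helper, and the list slicing merely stands in for whatever backtesting.py does internally to limit an indicator to the bars seen so far.

```python
# Direction column shaped like the one printed in init():
# 1.0 for the first five bars, then -1.0.
direction = [1.0, 1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]


def run_bars(values, start=2):
    """Yield (i, full_last, visible_last) for each simulated next() call.

    `i` is the number of bars revealed so far, `full_last` is what [-1]
    returns on the complete column, and `visible_last` is what [-1]
    returns on the column cut down to the first `i` bars.
    """
    for i in range(start, len(values) + 1):
        visible = values[:i]  # only the bars seen so far
        yield i, values[-1], visible[-1]


rows = list(run_bars(direction))
for i, full_last, visible_last in rows:
    print(f"bar {i}: full[-1]={full_last:+.1f}  visible[-1]={visible_last:+.1f}")
```

The `full[-1]` column never changes, which matches the repeated output in the question, while `visible[-1]` flips to -1.0 once the sixth bar is revealed.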

1 reply

@flexelem

flexelem Sep 22, 2021
Author

Thanks for helping. You are right, but the indicator I am importing returns a DataFrame. Anyway, I removed .df and tried to test my strategy in the next() method. There is a weird behaviour: if I replace the self.buy() and self.close() calls with print(), it prints the correct number of BUY and SELL trades. However, if I use the actual buy() and close(), no new trades are made after the first sell trade.

class Supertrend(Strategy):
    multiplier = 3.0
    periods = 10

    def init(self):
        super().init()
        high = self.data.High
        low = self.data.Low
        close = self.data.Close
        self.spt_df = self.I(supertrend, high.s, low.s, close.s, self.periods, self.multiplier)

    def next(self):
        direct = self.spt_df.df.iloc[:, 1]
        if direct[-1] == -1 and direct[-2] == 1:
            print('BUYING')
            # self.buy()
        elif direct[-1] == 1 and direct[-2] == -1:
            print('SELLING')
            # self.sell()

flexelem
Sep 23, 2021
Author

Thanks to @AGG2017 I solved the issue I had with returning and operating on a DataFrame. But I have another issue, this time with the warm-up period. The indicator I am importing returns values that include NaN, which is fine for me. Is there a way to disable the warm-up period? Because of it, my results contain wrong trades. Based on the values below, next() is first called with the 19th record, whereas I want it to start with the 1st record. Thanks

2017-08-17    0.000000   1.0         NaN         NaN
2017-08-18         NaN   1.0         NaN         NaN
2017-08-19         NaN   1.0         NaN         NaN
2017-08-20         NaN   1.0         NaN         NaN
2017-08-21         NaN   1.0         NaN         NaN
2017-08-22         NaN   1.0         NaN         NaN
2017-08-23         NaN   1.0         NaN         NaN
2017-08-24         NaN   1.0         NaN         NaN
2017-08-25         NaN   1.0         NaN         NaN
2017-08-26         NaN   1.0         NaN         NaN
2017-08-27  225.595000   1.0  225.595000         NaN
2017-08-28  233.308500   1.0  233.308500         NaN
2017-08-29  254.970150   1.0  254.970150         NaN
2017-08-30  272.235135   1.0  272.235135         NaN
2017-08-31  285.015622   1.0  285.015622         NaN
2017-09-01  296.723059   1.0  296.723059         NaN
2017-09-02  296.723059   1.0  296.723059         NaN
2017-09-03  296.723059   1.0  296.723059         NaN
2017-09-04  427.125475  -1.0         NaN  427.125475
2017-09-05  413.118427  -1.0         NaN  413.118427

0 replies

AGG2017
Sep 24, 2021

Unfortunately, there is no way to disable it. It always skips all the initial NaN rows, so you have to add more history data at the beginning. We have to wait for a version where we can select the start and end date-time for the backtest. That would let us force the start and end to a specific range. I did it for myself, along with many other changes to the backtesting options, so I know how helpful it is.
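The 19th-record start reported above is consistent with the skip being driven by the column that stays NaN longest. A minimal sketch of that reading, assuming the warm-up equals the longest run of leading NaN values over all wrapped columns. This is an assumption drawn from the thread, not a copy of the library's code; `leading_nans`, `warmup`, and the column names are hypothetical.

```python
import math

NAN = float("nan")


def leading_nans(column):
    """Count the NaN values at the start of a column."""
    count = 0
    for value in column:
        if not math.isnan(value):
            break
        count += 1
    return count


def warmup(columns):
    """Bars skipped before the first next() call, under the stated assumption."""
    return max((leading_nans(c) for c in columns.values()), default=0)


# Shape of the 20 rows posted above; 1.0 is a placeholder for any real number.
columns = {
    "trend": [0.0] + [NAN] * 9 + [1.0] * 10,
    "direction": [1.0] * 18 + [-1.0] * 2,
    "long": [NAN] * 10 + [1.0] * 8 + [NAN] * 2,
    "short": [NAN] * 18 + [1.0] * 2,
}

print(warmup(columns))                               # 18, so next() starts on bar 19
print(warmup({"direction": columns["direction"]}))   # 0, this column has no NaN
```

Under the same assumption, filling the NaN values before the columns are wrapped would shorten the delay.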

3 replies

@eervin123

eervin123 Sep 24, 2021
Sponsor

Is there a pull request for this feature? I agree this would be very helpful.

@kernc

kernc Sep 24, 2021
Maintainer

version where we can select start and end date-time for the backtesting. It can help to force the start and end at specific range

What's wrong with clipping the range before the backtest, such as:

backtest_df = df.loc['2017-01-01':'2020-12-31']
bt = Backtest(backtest_df, ...)

@flexelem

flexelem Sep 24, 2021
Author

For indicators like moving averages, which return a Series, it makes sense to ignore NaN values. But for more complex indicators that return a DataFrame, configuring the date range won't give the desired results. In my case, I am only interested in one of the columns of the precalculated DataFrame, so NaN values in the other columns can be ignored. By the way, luckily the indicator I am importing in this thread has a fillna option, which solved the issue for me.

kernc
Sep 24, 2021
Maintainer

@flexelem As @AGG2017 said, if you modify:

 ...
 self.dir = self.I(lambda: self.spt_df.iloc[:, 1])

then self.dir[-1] in next() will work as you expect. Calling self.I() wraps the values into an indicator type that can be length-managed iteration-wise.

@AGG2017 Instead of self._broker._i I'd recommend i = len(self.data) or similar. This interface is public whereas _broker is not.
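Both suggestions end up reading the same bar. A plain-Python sketch of the index-based variant follows, where `i` plays the role of `len(self.data)` inside next(). `signal_at` is a hypothetical helper, not part of backtesting.py, and the buy/sell mapping simply follows the first snippet in this thread.

```python
def signal_at(direction, i):
    """Return 'buy', 'sell' or None for the bar at position i - 1.

    `direction` is the full-length column; `i` is the number of bars
    revealed so far (what len(self.data) would give inside next()).
    """
    if i < 2:
        return None  # a previous bar is needed for the comparison
    prev, curr = direction[i - 2], direction[i - 1]
    if prev == -1 and curr == 1:
        return "buy"
    if prev == 1 and curr == -1:
        return "sell"
    return None


direction = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
signals = [signal_at(direction, i) for i in range(1, len(direction) + 1)]
print(signals)  # [None, None, 'sell', None, 'buy', None]
```

With a pandas column the same lookups would be `col.iloc[i - 2]` and `col.iloc[i - 1]`.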

1 reply

@flexelem

flexelem Sep 25, 2021
Author

Thanks @kernc, it did the trick. The more I debug and test the framework, the more I learn about it.

Answer selected by flexelem

AGG2017
Sep 24, 2021

@kernc Yes, you are right. It's better to use i = len(self.data). I started modifying everything because I prefer working with the original-size DataFrames and didn't like the internal implementation of the indicators, where accessing an indicator with many columns meant remembering the index number of each column. But when I try to help others, the endless number of my own modifications confuses me.

What's wrong with clipping the range before the backtest, such as:
backtest_df = df.loc['2017-01-01':'2020-12-31']
bt = Backtest(backtest_df, ...)

We often need to optimize the parameters of the indicators, and for this reason they need to be calculated inside init. If you provide a limited date range the way you propose, most indicators will create a lot of NaN values at the beginning, which reduces the range even further. I prefer to provide the maximum data available and then select different regions for backtesting. Usually I optimize on the first 70% of the DataFrame and test how the result performs on the last 30% of the data. It can be done somehow by providing subranges with extra initial data, but the statistics will always be wrong.
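The "subranges with extra initial data" idea might look like the sketch below. It only computes row positions; `split_with_history` and its parameters are hypothetical names, and the 18-row warm-up is just the figure from earlier in this thread.

```python
def split_with_history(n_rows, train_pct=70, warmup=0):
    """Return (train, test) as (start, stop) row positions.

    The test range starts `warmup` rows before the split point so that
    indicators computed in init() already have values at the split.
    """
    split = n_rows * train_pct // 100
    test_start = max(0, split - warmup)
    return (0, split), (test_start, n_rows)


train, test = split_with_history(1000, train_pct=70, warmup=18)
print(train, test)  # (0, 700) (682, 1000)
```

With a DataFrame, each range would be applied as `df.iloc[start:stop]` before passing the result to Backtest. The caveat above still holds: the extra rows are part of the data the backtest sees, so they enter the reported statistics.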

0 replies
