I am new to machine learning and have started brainstorming some model ideas that involve financial instrument/time series data. I was thinking it might be useful to use a classification algorithm to predict whether an instrument was up or down y% (TRUE/FALSE) after n days, based on e.g. a combination of technical indicator states for each learning example.
That said, in researching the idea I came upon an article that stated training examples in time series data are not independent of each other:
"Time series data has a natural temporal ordering - this differs from typical data mining/machine learning applications where each data point is an independent example of the concept to be learned, and the ordering of data points within a data set does not matter"
My question is as follows: is the above true only if we are trying to predict the continuous value of an asset n days into the future? As far as my idea outlined in the first paragraph is concerned, would this then still be valid considering I am not (apparently) taking into account the specific relationship
1 Answer
Consider that:
a "sliding window" approach can be used with any standard regression / classification algorithm. E.g. given the following time series
    Time  High  Low  Open  Close  Volume
    1     H1    L1   O1    C1     V1
    2     H2    L2   O2    C2     V2
    ...
    i     Hi    Li   Oi    Ci     Vi

you can feed the ML algorithm with these examples:

    O0, H1, L1, O1, C1, V1, H2, L2, O2, C2, V2, H3, L3, O3, C3, V3
    O1, H2, L2, O2, C2, V2, H3, L3, O3, C3, V3, H4, L4, O4, C4, V4
    O2, H3, L3, O3, C3, V3, H4, L4, O4, C4, V4, H5, L5, O5, C5, V5
    ...

(the first value is the target)
The size of the window (s) is an important limitation: the algorithm cannot correlate "candles" that lie more than s timesteps apart.
On the other hand, with a large s, all examples appear to be sparse and dissimilar in many ways (see Curse of dimensionality).
You may use other features (e.g. averages, multiple timeframes, Japanese candlestick patterns...) but not without a good knowledge of the specific domain.
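The sliding-window construction described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `sliding_window_examples`, the column order [High, Low, Open, Close, Volume], and the choice of target (the Open of the bar immediately after each window) are hypothetical and not taken from the answer.

```python
import numpy as np

def sliding_window_examples(bars, s, target_col=2):
    """Build supervised examples from a (T, 5) array of bars.

    Assumed column order: [High, Low, Open, Close, Volume].
    Each feature row is s consecutive bars flattened to length 5 * s.
    The target (an assumption) is column `target_col` (Open by default)
    of the bar that immediately follows the window.
    """
    bars = np.asarray(bars, dtype=float)
    n_examples = len(bars) - s  # every window needs one bar after it
    if n_examples <= 0:
        raise ValueError("need more than s bars")
    X = np.stack([bars[i:i + s].ravel() for i in range(n_examples)])
    y = bars[s:, target_col]
    return X, y

# Toy data: 6 bars with made-up values 0..29, window size s = 3.
bars = np.arange(30, dtype=float).reshape(6, 5)
X, y = sliding_window_examples(bars, s=3)
print(X.shape, y.shape)  # (3, 15) (3,)
```

The resulting `X` and `y` can be passed to any standard regressor, or to a classifier after thresholding `y` into TRUE/FALSE labels. Note that the number of features grows as 5 * s, which is where the dimensionality concern with a large s comes from.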
ML has tools that, in theory, can deal with time series. E.g. recurrent neural networks can take advantage of their internal memory to process arbitrary sequences of inputs and exhibit dynamic temporal behavior. See "Which machine learning algorithms can be used for time series forecasts?" for references.
Comment: Neural networks seem like overkill. Linear mixed models can capture non-independence between samples. (Nicholas Mancuso, Sep 18, 2016 at 0:21)