Draft:Sequential Bootstrapping
Sequential bootstrapping is a resampling method used in financial machine learning to account for the dependence structure among labeled events in time series data. It is designed to create bootstrap samples with lower redundancy by favoring observations that contain more unique information. The technique is commonly applied in training machine learning models for financial prediction tasks, particularly when labels overlap in time due to event-based labeling methods. The concept appears in the academic literature on financial machine learning, including Advances in Financial Machine Learning (2018).[1]
Overview
Traditional bootstrap procedures assume that observations are independent and identically distributed (IID). Financial time series often violate this assumption due to serial dependence, overlapping prediction horizons, and events spanning multiple timestamps. Sequential bootstrapping modifies the sampling process by incorporating a measure known as uniqueness, which quantifies the proportion of non-overlapping information carried by each observation.[2]
Motivation
In many financial machine learning applications, labels are generated using event-based methods such as the triple-barrier approach. Each labeled event may extend over a range of timestamps, resulting in overlapping periods among multiple events. When classical bootstrap methods are applied to such data, samples often contain redundant information, which leads to biased performance estimates and increases the risk of model overfitting. Sequential bootstrapping reduces this bias by incorporating the dependence structure directly into the sampling probabilities.
Uniqueness
Let each event $i$ span a set of timestamps $T_i$. At any timestamp $t$, let $c_t$ denote the concurrency, that is, the number of events active at $t$. The uniqueness of event $i$ is defined as the average of $1/c_t$ over the event's lifespan:
$$u_i = \frac{1}{|T_i|} \sum_{t \in T_i} \frac{1}{c_t}.$$
Events that heavily overlap with others (high concurrency) receive low uniqueness scores, while events that introduce independent information receive higher scores. Sequential bootstrapping uses these scores as sampling weights.[1]
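The definition can be illustrated with a minimal Python sketch. The function names `indicator_matrix` and `average_uniqueness`, the integer timestamps, and the three example event spans are assumptions made for illustration and do not come from the cited source.

```python
import numpy as np

def indicator_matrix(spans, n_timestamps):
    """Rows are timestamps, columns are events; an entry is 1 while the event is active."""
    ind = np.zeros((n_timestamps, len(spans)))
    for i, (start, end) in enumerate(spans):
        ind[start:end + 1, i] = 1.0  # `end` is treated as inclusive (an assumption)
    return ind

def average_uniqueness(ind):
    """u_i = mean over the timestamps of event i of 1 / c_t.

    Assumes every event is active for at least one timestamp.
    """
    concurrency = ind.sum(axis=1)                          # c_t for every timestamp
    safe_c = np.where(concurrency > 0, concurrency, 1.0)   # idle timestamps contribute nothing
    per_timestamp = ind / safe_c[:, None]                  # 1 / c_t where the event is active
    return per_timestamp.sum(axis=0) / ind.sum(axis=0)     # divide by |T_i|

# Hypothetical events: event 0 spans t=0..2, event 1 spans t=2..3, event 2 spans t=4..5.
spans = [(0, 2), (2, 3), (4, 5)]
ind = indicator_matrix(spans, n_timestamps=6)
u = average_uniqueness(ind)
print(u)
```

In this example, events 0 and 1 overlap only at timestamp 2, so their uniqueness values are 5/6 and 3/4 respectively, while event 2 overlaps with nothing and has uniqueness 1.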
Algorithm
Sequential bootstrapping typically proceeds as follows:[1]
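- Construct an indicator matrix specifying which events are active at each timestamp.
- Draw the first event uniformly at random, since there are no previously drawn events to overlap with.
- For each candidate event, compute the concurrency at each of its timestamps, counting the candidate together with the events already drawn.
- Calculate each candidate's average uniqueness from these concurrency values.
- Draw the next event with probability proportional to its average uniqueness. Sampling is with replacement, so a previously drawn event can be selected again, but with reduced probability.
- Add the drawn event to the sample, recompute the uniqueness values, and repeat until the desired sample size is reached.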
This iterative procedure generates a bootstrap sample with reduced dependency among observations.
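The procedure can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes the variant described by López de Prado (2018), in which draws are made with replacement and each candidate's uniqueness is measured against the events already drawn. The function names `draw_probabilities` and `sequential_bootstrap`, the `seed` parameter, and the example indicator matrix are hypothetical.

```python
import numpy as np

def draw_probabilities(ind, drawn):
    """Probability of drawing each event next, given the indices drawn so far."""
    ind = np.asarray(ind, dtype=float)
    drawn = np.asarray(drawn, dtype=int)
    # Number of already-drawn events (repeats counted) active at each timestamp.
    drawn_concurrency = ind[:, drawn].sum(axis=1)
    # Uniqueness of each candidate if it were added: 1 / (1 + drawn concurrency).
    per_timestamp = ind / (1.0 + drawn_concurrency)[:, None]
    avg_uniqueness = per_timestamp.sum(axis=0) / ind.sum(axis=0)
    return avg_uniqueness / avg_uniqueness.sum()

def sequential_bootstrap(ind, sample_size=None, seed=None):
    """Draw event indices one at a time, favouring events that overlap less
    with the events already in the sample. Sampling is with replacement."""
    ind = np.asarray(ind, dtype=float)
    n_events = ind.shape[1]
    sample_size = n_events if sample_size is None else sample_size
    rng = np.random.default_rng(seed)
    sample = []
    for _ in range(sample_size):
        probs = draw_probabilities(ind, sample)  # uniform on the first draw
        sample.append(int(rng.choice(n_events, p=probs)))
    return sample

# Hypothetical indicator matrix: rows are timestamps, columns are events.
ind = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
])
first_draw = draw_probabilities(ind, [])
after_event_1 = draw_probabilities(ind, [1])
sample = sequential_bootstrap(ind, seed=0)
```

With this matrix, the first draw is uniform over the three events. Once event 1 has been drawn, the probabilities for the next draw become 5/14, 3/14 and 6/14, so the event just drawn is the least likely to be repeated and the non-overlapping event 2 is the most likely to be selected.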
Properties
Sequential bootstrapping exhibits several notable properties:[1]
- Lower redundancy: Samples tend to contain less overlapping information than standard bootstrap samples.
- Reduced model bias: Machine learning models trained on sequentially bootstrapped samples tend to exhibit more realistic out-of-sample performance.
- Compatibility with financial cross-validation: The method complements purged k-fold and combinatorial purged cross-validation techniques, which also account for label overlap.
Applications
Sequential bootstrapping is used in various areas of quantitative finance, including:[1]
- Training supervised learning models for price movement prediction
- Enhancing the diversity of ensemble models
- Constructing bootstrap samples for bagging and model averaging
- Weighting observations in event-based datasets
- Evaluating model robustness in the presence of dependent labels
References
- [1] López de Prado, Marcos (2018). Advances in Financial Machine Learning. Hoboken, NJ: John Wiley & Sons. ISBN 978-1119482086.
- [2] Efron, Bradley; Tibshirani, Robert J. (1994). An Introduction to the Bootstrap. Boca Raton, FL: Chapman & Hall/CRC. ISBN 978-0412042317.