Questions tagged [machine-learning]

Machine learning algorithms build a model of the training data. The term "machine learning" is vaguely defined; it includes what is also called statistical learning, reinforcement learning, unsupervised learning, etc. ALWAYS ADD A MORE SPECIFIC TAG.

20,439 questions

2 votes

0 answers

9 views

Help matching an ordered list of events (no timestamps) to a noisy timestamped time-series

I’m stuck and this is starting to feel pretty convoluted, so I’ll try to be clear. What I have: A timestamped stochastic time-series (e.g. market prices). It’s noisy but when an event happens the ...

user501063

asked 6 hours ago

3 votes

0 answers

19 views

Why do "good" loss functions in ML need both Lipschitz continuity and smoothness?

I’m trying to understand the common assumptions in machine-learning optimization theory, where a "well-behaved" loss function is often required to be both L-Lipschitz and β-smooth (i.e., have β-...

Antonios Sarikas

asked 10 hours ago
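
For the Lipschitz/smoothness question above, the two assumptions are usually stated as below. This is a sketch using the standard textbook definitions, not text taken from the question; the symbols $f$, $x$, $y$ and the choice of norm are assumed notation.

$$
\begin{aligned}
|f(x) - f(y)| &\le L\,\|x - y\| && (L\text{-Lipschitz}) \\
\|\nabla f(x) - \nabla f(y)\| &\le \beta\,\|x - y\| && (\beta\text{-smooth})
\end{aligned}
$$

The first inequality bounds how fast the function value can change and the second bounds how fast the gradient can change, so in general neither implies the other: $f(x) = x^2$ is smooth but not globally Lipschitz on $\mathbb{R}$, while $f(x) = |x|$ is Lipschitz but not smooth.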

0 votes

0 answers

24 views

What data mining freeware is available that replicates SAS EMiner's interactive Decision Tree node?

It's 2025, and yes, I'm still using SAS EMiner's Decision Tree... If anyone knows a modern freeware version that replicates the Interactive mode effectively (with controlling split cutoff values, a ...

Anthony Galka

asked yesterday

0 votes

0 answers

23 views

Modeling recurring monthly transactions with weekend-shift effects: DBSCAN vs rule-based temporal detection?

I have 3 months of categorized bank transaction data and need to identify recurring cash inflows and outflows for lending risk modeling. Complications: 1. Income dates shift earlier when payday falls ...

Awande Ntombela

asked Nov 19 at 9:09

2 votes

0 answers

38 views

Restrict training data to only rows with values for most important variable? [closed]

My training data is mostly missing values for the feature that I know will be the most important variable. This missingness is semi-random. For example, I know the value is missing for this feature ...

mdrishan

asked Nov 18 at 17:40

0 votes

0 answers

23 views

Is the figure showing margin violation for the support vector machine correct?

I am listening to a lecture on soft-margin SVM (https://youtu.be/XUj5JbQihlU?si=b66SblRnw9mmczVU&t=2969). The lecturer says that the blue dot represents a violation of the margin. I don't really ...

Your neighbor Todorovich

asked Nov 18 at 17:10

3 votes

1 answer

102 views

+50

Accuracy in Machine Learning vs. Accuracy in Statistics vs. pass@1,1 in Generative Modeling: What's the Difference?

I've encountered the term "accuracy" used differently across several evaluation contexts, and I want to clearly understand their mathematical and conceptual distinctions using consistent ...

Charlie Parker

7,338

asked Nov 17 at 21:53

1 vote

0 answers

29 views

Is the strong duality of the hard-margin SVM really trivially satisfied all the time?

It is widely known that if you were to calculate the maximizer of the dual SVM program (denote as $\alpha^*$), then the primal minimizer of the hard-margin SVM program, \begin{aligned}&{\underset {...

Your neighbor Todorovich

asked Nov 17 at 13:36

0 votes

1 answer

51 views

Guidance for communicating insights to inform breakdown companies how to assess breakdown risk [closed]

I come from a machine learning background; however, I am trying to learn more traditional data science. I have a dataset of vehicles and the target is the Breakdown Likelihood (1 to 3, 1 being lowest), ...

92carmnad

asked Nov 16 at 10:40

0 votes

0 answers

26 views

Time-based regression: is it leakage if training includes snapshots closer to the event than those used at prediction?

I’m building a regression model that predicts the final number of vehicles booked for a ferry trip. Each training row represents the state of bookings for a given trip N days before departure. Example ...

vpvinc

asked Nov 13 at 10:18

0 votes

0 answers

41 views

Definition(s) of "data augmentation"

The first paragraph of the Wikipedia page for "data augmentation" seems to conflate two different meanings of the term. The more classical definition comes from Bayesian computation: ...

Taylor

22.4k

asked Nov 11 at 15:46

0 votes

0 answers

59 views

Extending the TVD-MI mechanism beyond information-based questions for scalable oversight [closed]

TVD-MI (Total Variation Distance–Mutual Information) has been proposed as a mechanism for evaluating the trustworthiness of judges (such as LLMs scoring code correctness or theorem validity) without ...

Charlie Parker

7,338

asked Nov 8 at 21:10

0 votes

0 answers

16 views

Clarifying notation for agent/item indices in TVD-MI mechanism

In the context of the TVD-MI (Total Variation Distance–Mutual Information) mechanism described by Zachary Robertson et al., what precisely do the indices (i, j) represent? Specifically, are (i, j) ...

Charlie Parker

7,338

asked Nov 8 at 20:58

1 vote

0 answers

37 views

Designing a demand forecasting model with a dynamic daily update and a final horizon prediction — best practices to avoid leakage?

I am working on a demand forecasting problem for ferry vehicle capacity. For each voyage, I have daily snapshots of the cumulative reservations from the opening date until departure day. So each ...

Analivia Valery

asked Nov 7 at 15:53

3 votes

2 answers

67 views

Should the minimum and maximum of each feature be contained in the train set for machine learning?

When using machine learning algorithms for regressions, I know that the prediction of the final model will be best when the features are within the ranges used for training, to avoid extrapolation. ...

n6r5

asked Nov 7 at 10:04
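
For the question above on feature ranges, one way to see where a model would be extrapolating is to compare new rows against the per-feature minimum and maximum of the training set. The following is a minimal sketch, assuming NumPy arrays with one column per feature; the function name `out_of_range_mask` and the toy data are hypothetical and not taken from the question.

```python
import numpy as np


def out_of_range_mask(X_train, X_new):
    """Mark entries of X_new that fall outside the per-feature
    [min, max] interval observed in X_train (hypothetical helper)."""
    X_train = np.asarray(X_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    lo = X_train.min(axis=0)  # per-feature minimum seen in training
    hi = X_train.max(axis=0)  # per-feature maximum seen in training
    return (X_new < lo) | (X_new > hi)


# Toy data (assumed for illustration): two features, three rows each.
X_train = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
X_new = np.array([[1.5, 25.0], [3.0, 15.0], [-1.0, 35.0]])

mask = out_of_range_mask(X_train, X_new)
# A row requires extrapolation if any of its features is out of range.
extrapolating_rows = mask.any(axis=1)
print(extrapolating_rows)
```

This check only looks at each feature separately; a row can sit inside every one-dimensional range and still lie far from the training data in the joint feature space.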
