Newest 'imbalanced-data' Questions


325 questions

0 votes

1 answer

73 views

Brier Skill Score returns NaN in cross_val_score with imbalanced dataset

I’m trying to evaluate classification models on a highly imbalanced fraud dataset using the Brier Skill Score (BSS) as the evaluation metric. The dataset has ~2133 rows and the target Fraud_Flag is ...

Br0k3nS0u1

asked Aug 28, 2025 at 12:20
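
One way a Brier Skill Score can come out as NaN, offered as an illustration and not as a diagnosis of the question above: the sketch assumes the common definition BSS = 1 - BS / BS_ref, where the reference forecast is the base rate of the labels being scored. Under that assumption, a cross-validation fold containing only one class has BS_ref = 0, so the ratio is undefined. `brier_score` and `brier_skill_score` are hypothetical plain-Python helpers, not scikit-learn functions.

```python
import math

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def brier_skill_score(y_true, y_prob):
    """BSS = 1 - BS / BS_ref, using the base rate of y_true as the reference (assumption)."""
    base_rate = sum(y_true) / len(y_true)
    bs_ref = brier_score(y_true, [base_rate] * len(y_true))
    if bs_ref == 0.0:
        # Only one class present: the reference forecast is perfect, so BSS is undefined.
        return float("nan")
    return 1.0 - brier_score(y_true, y_prob) / bs_ref

# A fold with both classes gives a finite score; a single-class fold gives NaN.
mixed_fold = brier_skill_score([0, 1, 0, 0], [0.1, 0.8, 0.2, 0.0])
single_class_fold = brier_skill_score([0, 0, 0, 0], [0.1, 0.2, 0.0, 0.1])
print(mixed_fold, math.isnan(single_class_fold))
```

If this is the cause, stratified folds that keep at least one positive per fold would avoid the undefined case.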

0 votes

0 answers

100 views

Loan Default Prediction - Kaggle

I am working on the loan default prediction data set available on Kaggle which has a highly skewed class distribution. The best model I have gotten so far is as follows using ExtraTreesClassifier: ...

RenamedUser7008

asked Apr 16, 2025 at 8:59

0 votes

1 answer

240 views

Why is my BERT model producing NaN loss during training for multi-label classification on imbalanced data?

I’m running into a frustrating issue while training a BERT-based multi-label text classification model on an imbalanced dataset. After a few epochs, the training loss suddenly becomes NaN, and I can’t ...

Erhan Arslan

asked Jan 28, 2025 at 13:03

-2 votes

1 answer

51 views

Improving Accuracy [closed]

I am testing accuracy and performance using deep learning models on a complex dataset. I have reached good accuracy, but I need to improve it, so any suggestions other than what I did(...

Menna

asked Dec 20, 2024 at 16:32

0 votes

0 answers

77 views

Understanding the `model.fit` function in keras and imbalanced datasets

As an exercise, I'm trying to translate a model written in Keras (https://github.com/CVxTz/ECG_Heartbeat_Classification/blob/master/code/baseline_mitbih.py) into Pytorch code. I realize in Keras much ...

user26579046

asked Sep 7, 2024 at 21:08

1 vote

2 answers

481 views

Problem with Keras class weights and KeyError

I anticipate that I have seen the question: Keras class_weight error dictionary keys/values referring to the same problem, but the solution does not seem to help me. With this code, where I just added ...

Pinguiz

asked Aug 14, 2024 at 17:15

0 votes

0 answers

69 views

Weighted F1-score

I'm training and validating models for a binary classification problem in a dataset that has great class imbalance. When searching for metrics for evaluating the performance of the models, I found ...

JS_ps

asked Jul 3, 2024 at 15:40

1 vote

1 answer

1k views

Does XGBoost's scale_pos_weight correctly balance the positive samples if the training dataset has more positive than negative samples?

After researching, I realized that scale_pos_weight is typically calculated as the ratio of the number of negative samples to the number of positive samples in the training data. My dataset has 840 ...

viji

asked Jun 6, 2024 at 14:27
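
The ratio described in the scale_pos_weight question above (negative count divided by positive count) can be sketched in plain Python. The label lists are made-up illustrations, not the asker's data, and the helper name `scale_pos_weight` is hypothetical; it only computes the number that would be passed to XGBoost's parameter of the same name.

```python
def scale_pos_weight(labels):
    """Return n_negative / n_positive for a list of 0/1 labels."""
    neg = sum(1 for y in labels if y == 0)
    pos = sum(1 for y in labels if y == 1)
    if pos == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    return neg / pos

# Hypothetical label sets: one where negatives dominate, one where positives do.
mostly_negative = [0] * 90 + [1] * 10
mostly_positive = [0] * 25 + [1] * 75

print(scale_pos_weight(mostly_negative))  # greater than 1: positives are up-weighted
print(scale_pos_weight(mostly_positive))  # less than 1: positives are down-weighted
```

When positives outnumber negatives, the same formula yields a value below 1, which shrinks the weight of the positive class instead of boosting it.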

1 vote

0 answers

88 views

Class_weight parameter not impacting results in imbalanced dataset with RandomForestClassifier

I'm fairly new to ML and am now in the process of predicting employee attrition in a medium-sized dataset. I have been able to run everything smoothly, but, as the dataset is imbalanced, I've been ...

Raughar

asked Apr 26, 2024 at 8:03

0 votes

0 answers

125 views

How do I add a bias to the last layer in my model if my model outputs logits and not probabilities?

I'm working on a medical image binary segmentation problem using a U-Net in tensorflow, and my classes are extremely unbalanced (about 1 in 10,000). As a result, my model wastes a ton of time going ...

Thao Nguyen

asked Apr 22, 2024 at 3:47

0 votes

1 answer

44 views

Train and test split in such a way that each name and proportion of target class is present in both train and test

I am trying to solve an ML problem: predicting whether a person will deliver an order or not. The dataset is highly imbalanced. Here is a glimpse of my dataset [{'order_id': '1bjhtj', 'Delivery Guy': 'John', 'Target': 0}, {'...

DSR

asked Mar 29, 2024 at 7:21

0 votes

0 answers

36 views

Questions on handling imbalanced dataset classification

I am trying to predict the number of members who will discontinue their membership. The whole dataset is about 12 million rows of data with about 40 columns. A member status can be "Continue", "Voluntary ...

Anson

asked Mar 27, 2024 at 15:24

-1 votes

1 answer

190 views

Kernel dies on fit_resample of SMOTE-NC from imblearn

I have a dataset for fraud detection (I can't disclose the dataset) which is extremely imbalanced. When I use SMOTE everything works, but as I have 9 categorical features I wanted to use SMOTE-NC, but when ...

dsk4ch

asked Mar 27, 2024 at 12:54

0 votes

0 answers

59 views

AttributeError: 'EasyEnsembleClassifier' object has no attribute 'fit_resample'

I am trying to perform a balancing between two classes, one majority and one minority. The majority class is a number of no landslide points and the minority class is landslide. I am trying to apply ...

MM-

asked Mar 21, 2024 at 14:58

0 votes

1 answer

363 views

Tidymodels and Imbalanced datasets - Subsampling when resampling

When dealing with imbalanced datasets, my understanding is that possible solutions are subsampling or oversampling the training set. However, the test set should reflect the imbalance of the original ...

GeorgeM

asked Feb 15, 2024 at 20:09
