325 questions
0
votes
1
answer
73
views
Brier Skill Score returns NaN in cross_val_score with imbalanced dataset
I’m trying to evaluate classification models on a highly imbalanced fraud dataset using the Brier Skill Score (BSS) as the evaluation metric.
The dataset has ~2133 rows and the target Fraud_Flag is ...
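For context, the Brier Skill Score is usually defined as BSS = 1 - BS_model / BS_ref, where the reference forecast predicts the base rate for every row. The sketch below is a minimal pure-Python illustration, not the asker's code (the excerpt is truncated). It shows one way a NaN can arise: if an evaluation fold contains a single class, the reference Brier score is zero and the ratio is undefined. Whether that is the cause in this question is an assumption. The function names are hypothetical.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def brier_skill_score(y_true, y_prob):
    """BSS = 1 - BS_model / BS_ref, with a base-rate reference forecast.

    Returns NaN when the reference score is zero, which happens when
    y_true contains only one class (for example, a fold with no positives).
    """
    base_rate = sum(y_true) / len(y_true)
    bs_ref = brier_score(y_true, [base_rate] * len(y_true))
    if bs_ref == 0.0:
        return math.nan  # undefined: the reference forecast is already perfect
    return 1.0 - brier_score(y_true, y_prob) / bs_ref


# A fold with only negatives yields NaN; a mixed fold yields a finite score.
single_class_fold = brier_skill_score([0, 0, 0, 0], [0.1, 0.2, 0.1, 0.0])
mixed_fold = brier_skill_score([0, 1], [0.5, 0.5])
```

If single-class folds are the issue, stratified splitting (so that each fold keeps some positives) is a common way to avoid them.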
0
votes
0
answers
100
views
Loan Default Prediction - Kaggle
I am working on the loan default prediction data set available on Kaggle which has a highly skewed class distribution. The best model I have gotten so far is as follows using ExtraTreesClassifier:
...
0
votes
1
answer
240
views
Why is my BERT model producing NaN loss during training for multi-label classification on imbalanced data?
I’m running into a frustrating issue while training a BERT-based multi-label text classification model on an imbalanced dataset. After a few epochs, the training loss suddenly becomes NaN, and I can’t ...
-2
votes
1
answer
51
views
Improving Accuracy [closed]
I am testing accuracy and performance using deep learning models on a complex dataset. I have reached good accuracy, but I need to improve it, so any suggestions other than what I did (...
0
votes
0
answers
77
views
Understanding the `model.fit` function in keras and imbalanced datasets
As an exercise, I'm trying to translate a model written in Keras (https://github.com/CVxTz/ECG_Heartbeat_Classification/blob/master/code/baseline_mitbih.py) into PyTorch code. I realize in Keras much ...
1
vote
2
answers
481
views
Problem with Keras class weights and KeyError
To say it up front: I have seen the question "Keras class_weight error dictionary keys/values", which refers to the same problem, but the solution does not seem to help me.
With this code, where I just added ...
0
votes
0
answers
69
views
Weighted F1-score
I'm training and validating models for a binary classification problem in a dataset that has great class imbalance.
When searching for metrics for evaluating the performance of the models, I found ...
1
vote
1
answer
1k
views
Does XGBoost's scale_pos_weight correctly balance the positive samples if the training dataset has more positive than negative samples?
After researching, I realized that scale_pos_weight is typically calculated as the ratio of the number of negative samples to the number of positive samples in the training data. My dataset has 840 ...
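The ratio described in this excerpt can be sketched as follows. This is a minimal illustration with a hypothetical helper name and made-up labels, not the asker's data (the excerpt is truncated). It only shows that the negative-to-positive ratio falls below 1 when positives are the majority, so the positive class would be down-weighted rather than up-weighted.

```python
def negative_to_positive_ratio(labels):
    """Return count(negative) / count(positive) for binary 0/1 labels."""
    n_pos = sum(1 for v in labels if v == 1)
    n_neg = sum(1 for v in labels if v == 0)
    if n_pos == 0:
        raise ValueError("no positive samples; ratio is undefined")
    return n_neg / n_pos


# Illustrative labels only: positives outnumber negatives 3 to 1.
ratio_majority_positive = negative_to_positive_ratio([1, 1, 1, 0])

# The more familiar case: negatives outnumber positives 3 to 1.
ratio_majority_negative = negative_to_positive_ratio([0, 0, 0, 1])
```

The resulting value would then be passed as `scale_pos_weight` when constructing the XGBoost model.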
1
vote
0
answers
88
views
Class_weight parameter not impacting results in imbalanced dataset with RandomForestClassifier
I'm fairly new to ML and am now in the process of predicting employee attrition in a medium-sized dataset. I have been able to run everything smoothly, but, as the dataset is imbalanced, I've been ...
0
votes
0
answers
125
views
How do I add a bias to the last layer in my model if my model outputs logits and not probabilities?
I'm working on a medical image binary segmentation problem using a U-Net in TensorFlow, and my classes are extremely unbalanced (about 1 in 10,000). As a result, my model wastes a ton of time going ...
0
votes
1
answer
44
views
Train and test split in such a way that each name and the proportion of the target class are present in both train and test
I am trying to solve an ML problem: predicting whether a person will deliver an order or not. The dataset is highly imbalanced. Here is a glimpse of my dataset:
[{'order_id': '1bjhtj', 'Delivery Guy': 'John', 'Target': 0},
{'...
0
votes
0
answers
36
views
Questions on handling imbalanced dataset classification
I am trying to predict the number of members who will discontinue their membership. The whole dataset is about 12 million rows of data with about 40 columns. A member status can be "Continue", "Voluntary ...
-1
votes
1
answer
190
views
Kernel dies on fit_resample of SMOTE-NC from imblearn
I have a dataset for fraud detection (I can't disclose the dataset) which is extremely imbalanced.
When I use SMOTE everything works, but as I have 9 categorical features I wanted to use SMOTE-NC, but when ...
0
votes
0
answers
59
views
AttributeError: 'EasyEnsembleClassifier' object has no attribute 'fit_resample'
I am trying to perform a balancing between two classes, one majority and one minority. The majority class is a number of no landslide points and the minority class is landslide. I am trying to apply ...
0
votes
1
answer
363
views
Tidymodels and Imbalanced datasets - Subsampling when resampling
When dealing with imbalanced datasets, my understanding is that possible solutions are subsampling or oversampling the training set. However, the test set should reflect the imbalance of the original ...