Questions tagged [unsupervised-learning]

Finding hidden (statistical) structure in unlabelled data, including clustering and feature extraction for dimensionality reduction.

709 questions

0 votes

0 answers

23 views

Modeling recurring monthly transactions with weekend-shift effects: DBSCAN vs rule-based temporal detection?

I have 3 months of categorized bank transaction data and need to identify recurring cash inflows and outflows for lending risk modeling. Complications: 1. Income dates shift earlier when payday falls ...

Awande Ntombela

asked Nov 19 at 9:09

0 votes

0 answers

27 views

How to identify and quantify main tendencies across participants from cluster membership heatmaps?

I'd appreciate your thoughts on the following problem. I've created a heatmap plot (attached) showing the cluster membership ratio for each participant (in separate subplots) and condition (η). Now, I'...

maria mystakidou

asked Oct 23 at 10:08

2 votes

3 answers

90 views

If a point is a marginal anomaly, should it be considered a joint anomaly no matter how mundane the other multivariate components are?

I envision a situation where multivariate data are observed and one observation of one variable seems way far away from any kind of expected behavior, say a value of $7$ for data assumed to be or ...

Dave

72.8k

asked Aug 27 at 21:10

0 votes

0 answers

28 views

What is the interval of values of the CDbw index for clustering internal evaluation?

I'm currently studying the CDbw (Composed Density between and within clusters) index, which is a metric designed for internal clustering evaluation. The original article of this index was published in ...

DavideChicco.it

asked Jun 16 at 9:52

1 vote

1 answer

122 views

Pseudo label as ground truth?

I'm new to machine learning and currently working on new topic discovery and topic modelling under NLP. If I have unlabeled survey responses that I want to categorise but don't know how, run an NMF ...

viktor nikiforov

asked Jun 13 at 5:41

0 votes

0 answers

35 views

Rigorous books on unsupervised ML / latent variable modelling?

I'm looking for some rigorous book(s) on unsupervised machine learning, especially latent variable modelling (e.g., EM algorithm and various instances of it, state space models, filtering). Time ...

Community wiki

Simplex1

0 votes

0 answers

41 views

Is analyzing test scores a clustering problem or an EDA problem?

I have a dataset of 28 personality assessment features, which measures personality attributes like Diligence or Sociability to determine performance in the corporate workplace. I'm tasked with ...

Michael Tran

asked Apr 8 at 8:21

0 votes

0 answers

51 views

Calculating Standard Deviation of RMSE of an unsupervised algorithm

If there is an ML model, the standard deviation (SD) of the root mean squared error (RMSE) can be calculated using time series splits by fitting the model on different training sets and evaluating it ...

Geek_Tech

asked Mar 16 at 23:29

5 votes

2 answers

602 views

How can I use unsupervised methods to recommend an "ideal" number of managers for companies when no labels exist?

I have a dataset of around 100,000 companies. For each company, I have a bunch of features such as: Number of employees, Number of customers, Number of complaints, other additional company attributes ...

B_fig

asked Feb 13 at 12:57

1 vote

1 answer

100 views

Dimension reduction on ordinal, related features with additional continuous features

I have what I think is a peculiar dataset that is a set of molecule features relating to a simple bead and spring molecular model. The raw molecule data is as follows ...

user6277

asked Jan 14 at 13:26

0 votes

0 answers

103 views

Finding Dependencies in Blackbox Way

Given a 3-rank tensor with dimensions $x,y,z$. Where: $x$: number of graphs (number of samples) $y$: number of nodes (let's say $5$: $a, b, c, d,$ and $e$) $z$: embedding dimension (e.g. $2$ for ...

Muhammad Ikhwan Perwira

asked Jan 13 at 19:28

9 votes

3 answers

1k views

What is a good approach to show my data only belongs to one cluster?

I hope the question is not stupid, but after a long search I have not found a satisfactory answer. I have a question about how to proceed if I want to test whether my data is from just one cluster or ...

David

asked Jan 13 at 10:27

2 votes

1 answer

97 views

How Barlow Twins avoid embeddings that differ by affine transformation?

I am reading the Barlow Twins (BT) paper and just don't get how it can avoid the following scenario. The BT loss is minimized when the cross-correlation matrix equals the identity matrix. A necessary ...

Antonios Sarikas

asked Dec 20, 2024 at 23:56

6 votes

1 answer

249 views

Why the loss is not considered as a "supervisory signal" in unsupervised learning?

It is said that supervised is different from unsupervised learning due to the presence of "supervisory signals" aka labels. However, in both cases we have a loss function. Isn't the loss a ...

Antonios Sarikas

asked Dec 7, 2024 at 22:00

1 vote

0 answers

80 views

What if PCA is unable to group my samples, but K-means perfectly clusters them? Is there any problem with my data analysis? Is it possible? [closed]

I am not an expert, but I am currently using unsupervised methods to better explain my mass spectrometry data obtained via DART-MS analyses. I am still learning. It turned out that when analyzing my ...

Isabela

asked Aug 5, 2024 at 14:26
