4

I have a set of user queries from a search engine that I want to cluster. The only clustering algorithm I have come across so far is the K-means clustering algorithm, which requires defining the number of clusters up front. But in this case, I do not know how many clusters exist in the data. Is there any clustering algorithm that performs clustering without predefining the number of clusters?

AakashM
2,162 reputation, 17 silver badges, 21 bronze badges
asked Jan 29, 2013 at 10:25

2 Answers

3

DBSCAN?

http://en.wikipedia.org/wiki/DBSCAN

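DBSCAN requires two parameters: a neighborhood radius (eps) and the minimum number of points required to form a dense region (minPts). You do not give it the number of clusters; that falls out of the data, and points in sparse regions are labeled as noise.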
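To make the roles of eps and minPts concrete, here is a minimal pure-Python sketch of DBSCAN applied to a few made-up search queries. The Jaccard distance over word sets, the sample queries, and the values eps=0.6 and min_pts=2 are all assumptions chosen for illustration; they are not from the question, and a real query log would need its own distance measure and tuning.

```python
from typing import Callable, List, Optional

NOISE = -1  # label for points that belong to no cluster


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A & B| / |A | B| over lowercase word sets (illustrative choice)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def dbscan(points: List[str], eps: float, min_pts: int,
           dist: Callable[[str, str], float]) -> List[int]:
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels: List[Optional[int]] = [None] * len(points)

    def region_query(i: int) -> List[int]:
        # All points within eps of point i (including i itself).
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        neighbors = region_query(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster  # i is a core point: start a new cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: join, but do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
        cluster += 1
    return [NOISE if label is None else label for label in labels]


queries = [
    "cheap flights paris",
    "cheap flights london",
    "cheap flights rome",
    "python list sort",
    "python list reverse",
    "python list append",
    "weather tomorrow",
]
labels = dbscan(queries, eps=0.6, min_pts=2, dist=jaccard_distance)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Two clusters are found without asking for two, and the unrelated query is flagged as noise. For real data a library implementation is the better choice, for example scikit-learn's `sklearn.cluster.DBSCAN`, which takes `eps` and `min_samples` and accepts a precomputed distance matrix.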

answered Jan 29, 2013 at 12:35
3

There are several techniques for clustering unlabeled data. K-means is probably the best known, but as you have already seen, most k-means variants require the number of clusters to be specified in advance.

Nevertheless, at least two kinds of algorithms might suit your needs:

  1. Connectivity-based clustering (hierarchical clustering);
  2. Density-based clustering (such as DBSCAN or OPTICS).
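For the first option, one common way to avoid fixing the number of clusters is to keep merging the closest clusters and stop once they are farther apart than a threshold. Below is a minimal pure-Python sketch of that idea using average linkage. The Jaccard distance over word sets, the sample queries, and the threshold of 0.6 are assumptions made up for illustration, and the loop is far too slow for a real query log.

```python
from typing import Callable, List


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A & B| / |A | B| over lowercase word sets (illustrative choice)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def agglomerate(points: List[str], threshold: float,
                dist: Callable[[str, str], float]) -> List[List[int]]:
    """Average-linkage agglomerative clustering, cut at a distance threshold.

    Returns clusters as lists of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def linkage(a: List[int], b: List[int]) -> float:
        # Average pairwise distance between the members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break  # remaining clusters are too far apart: stop merging
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


queries = [
    "cheap flights paris",
    "cheap flights london",
    "cheap flights rome",
    "python list sort",
    "python list reverse",
    "python list append",
    "weather tomorrow",
]
clusters = agglomerate(queries, threshold=0.6, dist=jaccard_distance)
print(clusters)  # [[0, 1, 2], [3, 4, 5], [6]]
```

The threshold replaces the cluster count as the knob you have to pick. Library versions of the same idea include `scipy.cluster.hierarchy.linkage` followed by `fcluster(..., criterion="distance")`, and scikit-learn's `AgglomerativeClustering` with `n_clusters=None` and a `distance_threshold`.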

By the way, there is a similar question on Stack Overflow.

Have fun!

answered Jan 29, 2013 at 13:15
