|
## kMeans Clustering

K-Means has the advantage that it's pretty fast: all we're really doing is computing the distances between points and group centers, so each iteration is linear in the number of data points, giving a complexity of O(n).
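
To make that cost concrete, here is a minimal sketch of the per-iteration distance computation (Python/NumPy; `assign_clusters`, `X`, and `centers` are illustrative names, not code from this repository):

```python
import numpy as np

def assign_clusters(X, centers):
    """Assign each point in X to its nearest center.

    X is an (n, d) array of points, centers a (k, d) array.
    The work is one distance per point/center pair, so each
    iteration costs O(n * k * d), linear in the number of points n.
    """
    # Pairwise squared distances between every point and every center.
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Index of the nearest center for each point.
    return dists.argmin(axis=1)
```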
|
|
On the other hand, K-Means has a couple of disadvantages. Firstly, you have to select how many groups/classes there are. This isn't always trivial, and ideally we'd want a clustering algorithm to figure that out for us, since the whole point is to gain some insight from the data. Secondly, K-Means starts with a random choice of cluster centers, so different runs of the algorithm may yield different clusterings; the results may not be repeatable or consistent, whereas other clustering methods are more consistent.
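
One quick way to see this sensitivity to initialization (a sketch assuming scikit-learn is installed; the data, cluster count, and seeds are made up for illustration):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with 4 true clusters.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# One run from a single random initialization per seed (n_init=1),
# so differences between seeds are not averaged away.
for seed in (0, 1, 2):
    km = KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X)
    print(f"seed={seed}  inertia={km.inertia_:.1f}")

# Different seeds can land in different local optima, giving
# different centers and different inertia (within-cluster SSE) values.
```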
|
|
More on Clustering: https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
|
### Generated Data
### Before kMeans

### Predicting Cluster Centers

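Each snapshot below shows where the predicted centers sit after a given number of iterations of the standard assignment/update loop. A minimal from-scratch sketch of that loop (Python/NumPy; the function and parameter names are illustrative, not necessarily this repository's code):

```python
import numpy as np

def kmeans(X, k, n_iters=15, seed=0):
    """Plain k-means: alternate assignment and center-update steps."""
    rng = np.random.default_rng(seed)
    # Initialize centers as k distinct points drawn from the data.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: label each point with its nearest center.
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels
```
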
* #### Predicted Cluster Centers after 0th Iteration

* #### Predicted Cluster Centers after 4th Iteration

* #### Predicted Cluster Centers after 8th Iteration

* #### Predicted Cluster Centers after 12th Iteration

### Final kMeans Clustering Result
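
A rough way to reproduce a plot like this, reusing the `X` from the Generated Data sketch and the `kmeans` sketch above (matplotlib is assumed to be available; `k=4` matches the illustrative data, not necessarily the repository's setup):

```python
import matplotlib.pyplot as plt

# Run the sketch loop on the generated data.
centers, labels = kmeans(X, k=4, n_iters=15, seed=0)

# Color each point by its assigned cluster and mark the final centers.
plt.scatter(X[:, 0], X[:, 1], c=labels, s=10)
plt.scatter(centers[:, 0], centers[:, 1], c="red", marker="x", s=100)
plt.title("Final kMeans Clustering Result")
plt.show()
```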