Balanced clustering
Balanced clustering is a special case of clustering where, in the strictest sense, cluster sizes are constrained to {\displaystyle \lfloor {n \over k}\rfloor } or {\displaystyle \lceil {n \over k}\rceil }, where {\displaystyle n} is the number of points and {\displaystyle k} is the number of clusters.[1] A typical algorithm is balanced k-means, which minimizes mean square error (MSE). Another type of balanced clustering called balance-driven clustering has a two-objective cost function that minimizes both the imbalance and the MSE. Typical cost functions are ratio cut[2] and Ncut.[3] Balanced clustering can be used for example in scenarios where freight has to be delivered to {\displaystyle n} locations with {\displaystyle k} cars. It is then preferred that each car delivers to an equal number of locations.
Software
[edit ]There exists implementations for balanced k-means[4] and Ncut[5]
References
[edit ]- ^ M. I. Malinen and P. Fränti (August 2014). "Balanced K-Means for Clustering". Structural, Syntactic, and Statistical Pattern Recognition. Lecture Notes in Computer Science. Vol. 8621. pp. 32–41. doi:10.1007/978-3-662-44415-3_4. ISBN 978-3-662-44414-6.
- ^ L. Hagen and A. B. Kahng (1992). "New spectral methods for ratio cut partitioning and clustering". IEEE Transactions on Computer-Aided Design. 11 (9): 1074–1085. doi:10.1109/43.159993.
- ^ J. Shi and J. Malik (2000). "Normalized cuts and image segmentation". IEEE Transactions on Pattern Analysis and Machine Intelligence. 22 (8): 888–905. doi:10.1109/34.868688.
- ^ M. I. Malinen and P. Fränti. "Balanced k-Means implementation". University of Eastern Finland.
- ^ T. Cour, S. Yu and J. Shi. "Ncut implementation". University of Pennsylvania.
Levin, M. Sh. (2017). "On Balanced Clustering (Indices, Models, Examples)". Journal of Communications Technology and Electronics. 62 (12): 1506–1515. doi:10.1134/S1064226917120105. S2CID 255277095.