ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms

Aumüller, Martin; Bernhardsson, Erik; Faithfull, Alexander

doi:10.1007/978-3-319-68474-1_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10609)

Included in the following conference series:

International Conference on Similarity Search and Applications

Abstract

This paper describes ANN-Benchmarks, a tool for evaluating the performance of in-memory approximate nearest neighbor algorithms. It provides a standard interface for measuring the performance and quality achieved by nearest neighbor algorithms on different standard data sets. It supports several different ways of integrating k-NN algorithms, and its configuration system automatically tests a range of parameter settings for each algorithm. Algorithms are compared with respect to many different (approximate) quality measures, and adding more is easy and fast; the included plotting front-ends can visualise these as images, plots, and websites with interactive plots. ANN-Benchmarks aims to provide a constantly updated overview of the current state of the art of k-NN algorithms. In the short term, this overview allows users to choose the correct k-NN algorithm and parameters for their similarity search task; in the longer term, algorithm designers will be able to use this overview to test and refine automatic parameter tuning. The paper gives an overview of the system, evaluates the results of the benchmark, and points out directions for future work. Interestingly, very different approaches to k-NN search yield comparable quality-performance trade-offs. The system is available at http://sss.projects.itu.dk/ann-benchmarks/.
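
To make the quality-performance trade-off mentioned in the abstract concrete, the following is a minimal sketch of one common quality measure, recall against exact brute-force results. It does not use the ANN-Benchmarks interface; the function names (brute_force_knn, recall_at_k, sampled_knn), the random data, and the toy "approximate" search over a random subset are all assumptions made for illustration only.

```python
import numpy as np


def brute_force_knn(data, queries, k):
    """Exact k nearest neighbours by Euclidean distance (ground truth)."""
    # Squared distances between every query and every data point.
    diffs = queries[:, None, :] - data[None, :, :]
    dists = np.einsum("qnd,qnd->qn", diffs, diffs)
    # Indices of the k smallest distances per query, closest first.
    return np.argsort(dists, axis=1)[:, :k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of true neighbours that were returned, averaged over queries."""
    hits = [len(set(a) & set(e)) for a, e in zip(approx_ids, exact_ids)]
    return sum(hits) / (exact_ids.shape[0] * exact_ids.shape[1])


def sampled_knn(data, queries, k, sample_size, rng):
    """Hypothetical stand-in for an approximate index: exact search on a random subset."""
    subset = rng.choice(len(data), size=sample_size, replace=False)
    local = brute_force_knn(data[subset], queries, k)
    # Map positions within the subset back to positions in the full data set.
    return subset[local]


rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 16))
queries = rng.standard_normal((20, 16))
exact = brute_force_knn(data, queries, k=10)

# A larger sample means more work per query and, typically, higher recall.
for sample_size in (250, 500, 1000):
    approx = sampled_knn(data, queries, 10, sample_size, rng)
    print(sample_size, recall_at_k(approx, exact))
```

Here the sample size plays the role of a tunable parameter: sweeping it traces out a curve of recall against work done, which is the kind of trade-off a benchmark plots for real algorithms and their parameter settings.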

The research of the first and third authors has received funding from the European Research Council under the European Union’s 7th Framework Programme (FP7/2007-2013)/ERC grant agreement no. 614331.

Acknowledgements

We thank the anonymous reviewers for their careful comments that allowed us to improve the paper. The first and third authors thank all members of the algorithm group at ITU Copenhagen for fruitful discussions.

Author information

Authors and Affiliations

IT University of Copenhagen, Copenhagen, Denmark
Martin Aumüller & Alexander Faithfull
Better, New York, USA
Erik Bernhardsson

Corresponding author

Correspondence to Martin Aumüller.

Editor information

Editors and Affiliations

Fraunhofer Institute for Applied Information Technology, Sankt Augustin, Germany
Christian Beecks
Ludwig-Maximilians-Universität München, Munich, Germany
Felix Borutta
Ludwig-Maximilians-Universität München, Munich, Germany
Peer Kröger
Ludwig-Maximilians-Universität München, Munich, Germany
Thomas Seidl

Cite this paper

Aumüller, M., Bernhardsson, E., Faithfull, A. (2017). ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms. In: Beecks, C., Borutta, F., Kröger, P., Seidl, T. (eds) Similarity Search and Applications. SISAP 2017. Lecture Notes in Computer Science, vol 10609. Springer, Cham. https://doi.org/10.1007/978-3-319-68474-1_3

DOI: https://doi.org/10.1007/978-3-319-68474-1_3
Published: 28 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68473-4
Online ISBN: 978-3-319-68474-1
eBook Packages: Computer Science, Computer Science (R0)
