Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10609))
Included in the following conference series:
-
2941 Accesses
-
103 Citations
Abstract
This paper describes ANN-Benchmarks, a tool for evaluating the performance of in-memory approximate nearest neighbor algorithms. It provides a standard interface for measuring the performance and quality achieved by nearest neighbor algorithms on different standard data sets. It supports several different ways of integrating k-NN algorithms, and its configuration system automatically tests a range of parameter settings for each algorithm. Algorithms are compared with respect to many different (approximate) quality measures, and adding more is easy and fast; the included plotting front-ends can visualise these as images, plots, and websites with interactive plots. ANN-Benchmarks aims to provide a constantly updated overview of the current state of the art of k-NN algorithms. In the short term, this overview allows users to choose the correct k-NN algorithm and parameters for their similarity search task; in the longer term, algorithm designers will be able to use this overview to test and refine automatic parameter tuning. The paper gives an overview of the system, evaluates the results of the benchmark, and points out directions for future work. Interestingly, very different approaches to k-NN search yield comparable quality-performance trade-offs. The system is available at http://sss.projects.itu.dk/ann-benchmarks/.
The research of the first and third authors has received funding from the European Research Council under the European Union’s 7th Framework Programme (FP7/2007-2013)/ERC grant agreement no. 614331.
This is a preview of subscription content, log in via an institution to check access.
Access this chapter
Subscribe and save
- Starting from 10 chapters or articles per month
- Access and download chapters and articles from more than 300k books and 2,500 journals
- Cancel anytime
Buy Now
- Chapter
-
JPY 3498
- Price includes VAT (Japan)
- eBook
- JPY 7435
- Price includes VAT (Japan)
- Softcover Book
- JPY 9294
- Price includes VAT (Japan)
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
kNN Classification: a review
Fast spectral analysis for approximate nearest neighbor search
Explore related subjects
Discover the latest articles, books and news in related subjects, suggested using machine learning.References
Ahle, T.D., Aumüller, M., Pagh, R.: Parameter-free locality sensitive hashing for spherical range reporting. In: SODA 2017, pp. 239–256
Alman, J., Williams, R.: Probabilistic polynomials and hamming nearest neighbors. In: FOCS 2015, pp. 136–150
Andoni, A., Indyk, P., Laarhoven, T., Razenshteyn, I.P., Schmidt, L.: Practical and optimal LSH for angular distance. In: NIPS 2015, pp. 1225–1233. https://falconn-lib.org/
Bentley, J.L.: Multidimensional binary search trees used for associative searching. Commun. ACM 18(9), 509–517 (1975)
Bernhardsson, E.: Annoy. https://github.com/spotify/annoy
Boytsov, L., Naidan, B.: Engineering efficient and effective non-metric space library. In: Brisaboa, N., Pedreira, O., Zezula, P. (eds.) SISAP 2013. LNCS, vol. 8199, pp. 280–293. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41062-8_28
Boytsov, L., Novak, D., Malkov, Y., Nyberg, E.: Off the beaten path: let’s replace term-based retrieval with k-NN search. In: CIKM 2016, pp. 1099–1108
Ciaccia, P., Patella, M., Zezula, P.: M-tree: an efficient access method for similarity search in metric spaces. In: VLDB 1997, pp. 426–435 (1997)
Curtin, R.R., Cline, J.R., Slagle, N.P., March, W.B., Ram, P., Mehta, N.A., Gray, A.G.: MLPACK: a scalable C++ machine learning library. J. Mach. Learn. Res. 14, 801–805 (2013)
Dong, W.: KGraph. https://github.com/aaalgo/kgraph
Dong, W., Wang, Z., Josephson, W., Charikar, M., Li, K.: Modeling LSH for performance tuning. In: CIKM 2008, pp. 669–678. ACM. http://lshkit.sourceforge.net/
Edel, M., Soni, A., Curtin, R.R.: An automatic benchmarking system. In: NIPS 2014 Workshop on Software Engineering for Machine Learning (2014)
Heo, J.P., Lee, Y., He, J., Chang, S.F., Yoon, S.E.: Spherical hashing: binary code embedding with hyperspheres. IEEE TPAMI 37(11), 2304–2316 (2015)
Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: STOC 1998, pp. 604–613
Johnson, W.B., Lindenstrauss, J., Schechtman, G.: Extensions of Lipschitz maps into Banach spaces. Isr. J. Math. 54(2), 129–138 (1986)
Kriegel, H., Schubert, E., Zimek, A.: The (black) art of runtime evaluation: are we comparing algorithms or implementations? Knowl. Inf. Syst. 52(2), 341–378 (2017)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Li, W., Zhang, Y., Sun, Y., Wang, W., Zhang, W., Lin, X.: Approximate nearest neighbor search on high dimensional data - experiments, analyses, and improvement (v1.0). CoRR abs/1610.02455 (2016). http://arxiv.org/abs/1610.02455
Lyst Engineering: Rpforest. https://github.com/lyst/rpforest
Malkov, Y.A., Yashunin, D.A.: Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. ArXiv e-prints, March 2016
Malkov, Y., Ponomarenko, A., Logvinov, A., Krylov, V.: Approximate nearest neighbor algorithm based on navigable small world graphs. Inf. Syst. 45, 61–68 (2014)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NIPS 2013, pp. 3111–3119
Muja, M., Lowe, D.G.: Fast approximate nearest neighbors with automatic algorithm configuration. In: VISSAPP 2009, pp. 331–340. INSTICC Press
Norouzi, M., Punjani, A., Fleet, D.J.: Fast search in hamming space with multi-index hashing. In: CVPR 2012, pp. 3108–3115. IEEE
Pham, N.: Hybrid LSH: faster near neighbors reporting in high-dimensional space. In: EDBT 2017, pp. 454–457
van Rijn, J.N., Bischl, B., Torgo, L., Gao, B., Umaashankar, V., Fischer, S., Winter, P., Wiswedel, B., Berthold, M.R., Vanschoren, J.: OpenML: a collaborative science platform. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS, vol. 8190, pp. 645–649. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40994-3_46
Wang, J., Shen, H.T., Song, J., Ji, J.: Hashing for similarity search: a survey. CoRR abs/1408.2927 (2014). http://arxiv.org/abs/1408.2927
Williams, R.: A new algorithm for optimal 2-constraint satisfaction and its implications. Theor. Comput. Sci. 348(2–3), 357–365 (2005)
Zezula, P., Savino, P., Amato, G., Rabitti, F.: Approximate similarity retrieval with M-Trees. VLDB J. 7(4), 275–293 (1998)
Acknowledgements
We thank the anonymous reviewers for their careful comments that allowed us to improve the paper. The first and third authors thank all members of the algorithm group at ITU Copenhagen for fruitful discussions.
Editor information
Editors and Affiliations
Fraunhofer Institute for Applied Information Technology, Sankt Augustin, Germany
Christian Beecks
Ludwig-Maximilians-Universität München, Munich, Germany
Felix Borutta
Ludwig-Maximilians-Universität München, Munich, Germany
Peer Kröger
Ludwig-Maximilians-Universität München, Munich, Germany
Thomas Seidl
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Aumüller, M., Bernhardsson, E., Faithfull, A. (2017). ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms. In: Beecks, C., Borutta, F., Kröger, P., Seidl, T. (eds) Similarity Search and Applications. SISAP 2017. Lecture Notes in Computer Science(), vol 10609. Springer, Cham. https://doi.org/10.1007/978-3-319-68474-1_3
Download citation
DOI: https://doi.org/10.1007/978-3-319-68474-1_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68473-4
Online ISBN: 978-3-319-68474-1
eBook Packages: Computer Science Computer Science (R0)
Share this paper
Anyone you share the following link with will be able to read this content:
Sorry, a shareable link is not currently available for this article.
Provided by the Springer Nature SharedIt content-sharing initiative