Apache Lucene 10 has been released. The updated version adds a new IndexInput prefetch API, support for sparse indexing on doc values, and upgraded Snowball dictionaries resulting in improved tokenization.
Apache Lucene is a high-performance search engine library written entirely in Java. The developers describe it as being suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search on high-dimensionality vectors, spell correction or query suggestions. There's also a PyLucene subproject that provides Python bindings for Lucene Core.
The first improvement is the new IndexInput#prefetch API, which lets query evaluation logic tell the Directory about regions of data that are about to be read. Because the Directory knows what is coming, it can start the I/O for those regions concurrently rather than waiting for each read in turn.
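To make the idea concrete, here is a minimal, self-contained sketch of the prefetch pattern. It does not use Lucene at all: the class PrefetchSketch, its byte-array "storage" and the int-length method signatures are hypothetical stand-ins invented for illustration, and the real IndexInput#prefetch may behave quite differently under the hood.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of "announce reads before doing them"; not Lucene's implementation. */
final class PrefetchSketch {
    private final byte[] storage; // stands in for a file on slow storage
    // Background loads that have been announced but not yet consumed, keyed by offset.
    private final Map<Long, CompletableFuture<byte[]>> pending = new ConcurrentHashMap<>();

    PrefetchSketch(byte[] storage) {
        this.storage = storage;
    }

    /** Simulates one blocking read from storage. */
    private byte[] load(long offset, int length) {
        return Arrays.copyOfRange(storage, (int) offset, (int) offset + length);
    }

    /** Hint that a region will be read soon: starts loading it and returns immediately. */
    void prefetch(long offset, int length) {
        pending.computeIfAbsent(offset,
            o -> CompletableFuture.supplyAsync(() -> load(o, length)));
    }

    /** Reads a region, reusing a matching prefetch if one was announced. */
    byte[] read(long offset, int length) {
        CompletableFuture<byte[]> announced = pending.remove(offset);
        if (announced != null) {
            byte[] data = announced.join();
            if (data.length == length) {
                return data;
            }
        }
        return load(offset, length); // no usable prefetch: plain blocking read
    }

    public static void main(String[] args) {
        byte[] file = "hello prefetch world".getBytes(StandardCharsets.US_ASCII);
        PrefetchSketch in = new PrefetchSketch(file);
        // Announce both regions first so their loads can overlap...
        in.prefetch(0, 5);
        in.prefetch(6, 8);
        // ...then consume them.
        System.out.println(new String(in.read(6, 8), StandardCharsets.US_ASCII));
        System.out.println(new String(in.read(0, 5), StandardCharsets.US_ASCII));
    }
}
```

The point of the pattern is the ordering in main: all hints are issued before the first read, so the loads can proceed side by side instead of one after another.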
Lucene also now has support for sparse indexing on doc values. The sparse index records the minimum and maximum values per block of doc IDs. When it is used in conjunction with index sorting, which clusters similar documents together, it allows for very space-efficient and CPU-efficient filtering, because whole blocks whose value range cannot match a filter can be skipped.
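The following sketch shows how a per-block min/max index can speed up a range filter. It is a toy model written for this article, not Lucene's code: the class name SparseIndexSketch, the block size of 4 and the blocksScanned counter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy per-block min/max skip index over one numeric doc-values column. */
final class SparseIndexSketch {
    static final int BLOCK_SIZE = 4; // hypothetical block size, for illustration only

    final long[] values;   // values[docId] is the doc value of that document
    final long[] blockMin; // smallest value in each block of doc IDs
    final long[] blockMax; // largest value in each block of doc IDs
    int blocksScanned;     // blocks whose values the last query actually read

    SparseIndexSketch(long[] values) {
        this.values = values;
        int blocks = (values.length + BLOCK_SIZE - 1) / BLOCK_SIZE;
        blockMin = new long[blocks];
        blockMax = new long[blocks];
        for (int b = 0; b < blocks; b++) {
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            int end = Math.min(values.length, (b + 1) * BLOCK_SIZE);
            for (int doc = b * BLOCK_SIZE; doc < end; doc++) {
                min = Math.min(min, values[doc]);
                max = Math.max(max, values[doc]);
            }
            blockMin[b] = min;
            blockMax[b] = max;
        }
    }

    /** Returns the doc IDs whose value lies in [lo, hi], skipping blocks that cannot match. */
    List<Integer> docsInRange(long lo, long hi) {
        List<Integer> hits = new ArrayList<>();
        blocksScanned = 0;
        for (int b = 0; b < blockMin.length; b++) {
            if (blockMax[b] < lo || blockMin[b] > hi) {
                continue; // the block's value range does not overlap the filter
            }
            blocksScanned++;
            int end = Math.min(values.length, (b + 1) * BLOCK_SIZE);
            for (int doc = b * BLOCK_SIZE; doc < end; doc++) {
                if (values[doc] >= lo && values[doc] <= hi) {
                    hits.add(doc);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Values are sorted by doc ID, as they would be with index sorting on this field.
        long[] sorted = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
        SparseIndexSketch index = new SparseIndexSketch(sorted);
        // prints "[5, 6] scanned=1"
        System.out.println(index.docsInRange(6, 7) + " scanned=" + index.blocksScanned);
    }
}
```

With sorted values only one of the three blocks has to be read. If the same values were shuffled, most blocks would span a wide range and few could be skipped, which is why the feature pairs well with index sorting.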
Search concurrency has also been improved so that it is now decoupled from the index geometry, meaning an index can be searched using any number of threads, regardless of its number of segments.
Snowball dictionaries have been upgraded, resulting in improved tokenization, and k-means clustering of vectors has been added.
This release also adds initial support for intra-segment concurrency, meaning the index searcher now supports searching across leaf reader partitions concurrently. The developers say this helps make maximum use of available resources, especially with force-merged indices or big segments. However, there is still a performance penalty for queries that require segment-level computation ahead of time, such as points/range queries. This is an implementation limitation that the developers expect to improve in future releases, so for the moment intra-segment slicing is not enabled by default.
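As a rough illustration of what searching partitions of a segment concurrently means, the sketch below splits each segment into doc ID ranges and counts matches for each range on a thread pool. Everything in it is hypothetical (the PartitionSketch class, the Partition type, modeling a segment as an int array); it does not reflect how Lucene's IndexSearcher is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Toy model of intra-segment concurrency: one task per doc ID range, not per segment. */
final class PartitionSketch {
    /** A contiguous doc ID range [minDoc, maxDoc) inside one segment. */
    static final class Partition {
        final int segment;
        final int minDoc;
        final int maxDoc;

        Partition(int segment, int minDoc, int maxDoc) {
            this.segment = segment;
            this.minDoc = minDoc;
            this.maxDoc = maxDoc;
        }
    }

    /** Splits every segment into ranges of at most maxDocsPerPartition documents. */
    static List<Partition> partition(int[] segmentSizes, int maxDocsPerPartition) {
        List<Partition> out = new ArrayList<>();
        for (int seg = 0; seg < segmentSizes.length; seg++) {
            for (int start = 0; start < segmentSizes[seg]; start += maxDocsPerPartition) {
                int end = Math.min(segmentSizes[seg], start + maxDocsPerPartition);
                out.add(new Partition(seg, start, end));
            }
        }
        return out;
    }

    /** Counts documents equal to target, searching each partition as its own task. */
    static long countMatches(int[][] segments, int target, int maxDocsPerPartition, int threads) {
        int[] sizes = new int[segments.length];
        for (int i = 0; i < segments.length; i++) {
            sizes[i] = segments[i].length;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Partition p : partition(sizes, maxDocsPerPartition)) {
                Callable<Long> task = () -> {
                    long hits = 0;
                    for (int doc = p.minDoc; doc < p.maxDoc; doc++) {
                        if (segments[p.segment][doc] == target) {
                            hits++;
                        }
                    }
                    return hits;
                };
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // merge the per-partition results
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One large "force-merged" segment and one small one.
        int[][] segments = {{1, 2, 1, 1, 3, 1, 2, 1, 1, 1}, {1, 5, 1}};
        System.out.println(countMatches(segments, 1, 4, 3)); // prints 9
    }
}
```

In this model the large segment yields three tasks rather than one, so three threads can work on it at once. It also hints at the penalty mentioned above: any setup work a query needs per segment would have to be repeated or shared across that segment's partitions.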
Lucene 10.0 is available now.