Optimizes reducelanes in diversityCalculation of PQVectors, for Euclidean function #623
Open
MarkWolters wants to merge 2 commits into main from
MarkWolters requested review from jshook and tlwillke as code owners on February 12, 2026 17:52
Contributor
Before you submit for review:
- Does your PR follow guidelines from CONTRIBUTIONS.md?
- Did you summarize what this PR does clearly and concisely?
- Did you include performance data for changes which may be performance impacting?
- Did you include useful docs for any user-facing changes or features?
- Did you include useful javadocs for developer oriented changes, explaining new concepts or key changes?
- Did you trigger and review regression testing results against the base branch via Run Bench Main?
- Did you adhere to the code formatting guidelines (TBD)?
- Did you group your changes for easy review, providing meaningful descriptions for each commit?
- Did you ensure that all files contain the correct copyright header?
If you did not complete any of these, then please explain below.
jshook approved these changes on Feb 12, 2026 and left a comment:
This looks reasonable to me. It would be nice to see the results laid out a little better for before and after, but it still looks like a solid improvement.
This pull request improves the Euclidean similarity function in PQVectors.diversityFunctionFor. The flamegraph shows that a considerable amount of time is spent in jdk/incubator/vector/FloatVector.reduceLanesTemplate. This is mainly because FloatVector.reduceLanes() is expensive and is called inside a for loop (via VectorUtil.squareL2Distance), once per subspace. This pull request moves the call to reduceLanes() outside the for loop, so it runs once per distance computation.
The proposed change was tested with the benchmark PQDistanceCalculationBenchmark.diversityCalculation.
With this benchmark, a ~18% reduction in time was observed when M=64 and ~22% when M=192, where M is the number of PQ subspaces.
Code modifications:
Added a new function, pqDiversityEuclidean, to VectorUtilSupport and its corresponding implementations
Removed the for loop from PQVectors.diversityFunctionFor and moved it into pqDiversityEuclidean
Moved FloatVector.reduceLanes() outside the for loop
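The effect of the last two modifications can be sketched in plain Java. This is a scalar emulation of SIMD lanes, not the jdk.incubator.vector API and not the code in this PR: the class and method names, the lane count of 8, and the flat per-subspace codebook layout (centroid c of a subspace starts at offset c * subDim) are all assumptions made for illustration. The first method performs one horizontal reduction per subspace, which is the pattern described above; the second keeps one lane accumulator live across all subspaces and reduces once at the end.

```java
// Scalar emulation of SIMD lanes. Hypothetical sketch, not JVector's implementation.
final class ReduceOnceSketch {
    static final int LANES = 8; // assumed lane count (e.g. 256-bit vectors of float)

    // Stand-in for a horizontal add such as FloatVector.reduceLanes(ADD).
    static float reduce(float[] lanes) {
        float sum = 0f;
        for (float v : lanes) {
            sum += v;
        }
        return sum;
    }

    // Adds the squared differences of two centroids into the lane accumulator.
    static void accumulateSquaredDiff(float[] a, int aOff, float[] b, int bOff,
                                      int len, float[] acc) {
        for (int i = 0; i < len; i++) {
            float d = a[aOff + i] - b[bOff + i];
            acc[i % LANES] += d * d;
        }
    }

    // Before: one horizontal reduction per subspace, so M reductions per distance.
    static float reducePerSubspace(float[][] codebooks, byte[] x, byte[] y, int subDim) {
        float total = 0f;
        for (int m = 0; m < x.length; m++) {
            float[] acc = new float[LANES];
            accumulateSquaredDiff(codebooks[m], (x[m] & 0xFF) * subDim,
                                  codebooks[m], (y[m] & 0xFF) * subDim, subDim, acc);
            total += reduce(acc);
        }
        return total;
    }

    // After: the accumulator survives the loop and is reduced exactly once.
    static float reduceOnce(float[][] codebooks, byte[] x, byte[] y, int subDim) {
        float[] acc = new float[LANES];
        for (int m = 0; m < x.length; m++) {
            accumulateSquaredDiff(codebooks[m], (x[m] & 0xFF) * subDim,
                                  codebooks[m], (y[m] & 0xFF) * subDim, subDim, acc);
        }
        return reduce(acc);
    }
}
```

Both methods compute the same sum of squared differences, but the additions happen in a different order, so on real data the two results may differ in the last few bits of the float.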
Test setup:
Jvector version : main branch (as of 2025-08-28)
JDK version : openjdk version "24.0.2" 2025-07-15
Platform : INTEL(R) XEON(R) PLATINUM 8592+
Benchmark : PQDistanceCalculationBenchmark.diversityCalculation
New changes
I have extended the change to the dot product and cosine functions and applied the same optimization to scoreFunctionFor.
With the changes applied to scoreFunctionFor, when M=64, dot product shows a ~30% reduction in latency and cosine shows a ~43% reduction.
I can add data points for other subspace counts if required.
Code modifications:
Added new functions pqScoreEuclidean, pqScoreDotProduct, and pqScoreCosine to VectorUtilSupport, with their corresponding implementations, for diversityFunctionFor
Added overloaded versions of the above functions for scoreFunctionFor
Removed the for loops from PQVectors.diversityFunctionFor and PQVectors.scoreFunctionFor and moved them into the respective functions in PanamaVectorUtilSupport
Moved FloatVector.reduceLanes() outside the for loop
Added a new benchmark that uses MutablePQVectors to test this
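For cosine, the same idea needs three accumulators instead of one (dot product and both squared norms), each reduced once after the loop over subspaces. The sketch below is again a scalar emulation of lanes in plain Java; the name pqCosineSketch, the lane count, and the flat codebook layout are assumptions for illustration and do not reflect the signatures added in this PR. It returns the raw cosine; any mapping from cosine to a similarity score is left out.

```java
// Scalar emulation of SIMD lanes. Hypothetical sketch, not JVector's implementation.
final class CosineAccumulatorSketch {
    static final int LANES = 8; // assumed lane count

    // Stand-in for a horizontal add such as FloatVector.reduceLanes(ADD).
    static float reduce(float[] lanes) {
        float sum = 0f;
        for (float v : lanes) {
            sum += v;
        }
        return sum;
    }

    // Cosine between two PQ-encoded vectors x and y. codebooks[m] is assumed to be
    // a flat array in which centroid c starts at offset c * subDim.
    static float pqCosineSketch(float[][] codebooks, byte[] x, byte[] y, int subDim) {
        float[] dot = new float[LANES];
        float[] normX = new float[LANES];
        float[] normY = new float[LANES];
        for (int m = 0; m < x.length; m++) {
            float[] cb = codebooks[m];
            int xOff = (x[m] & 0xFF) * subDim;
            int yOff = (y[m] & 0xFF) * subDim;
            for (int i = 0; i < subDim; i++) {
                float a = cb[xOff + i];
                float b = cb[yOff + i];
                int lane = i % LANES;
                dot[lane] += a * b;
                normX[lane] += a * a;
                normY[lane] += b * b;
            }
        }
        // Three horizontal reductions in total, independent of the subspace count.
        double norms = (double) reduce(normX) * reduce(normY);
        return (float) (reduce(dot) / Math.sqrt(norms));
    }
}
```

With a per-subspace reduction, cosine would pay for three reductions in every subspace, so it has more reduction work to remove than dot product does. That is one plausible reading of why the reported cosine gain is larger than the dot product gain, though the PR does not state the cause.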
Test setup:
Jvector version : main branch (as of 2025-10-22)
JDK version : openjdk version "25.0.1" 2025-10-21
Platform : Intel(R) Xeon(R) 6979P
Benchmark : PQDistanceCalculationMutableVectorBenchmark (added by replicating PQDistanceCalculationBenchmark, to measure performance for MutablePQVectors)