Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Optimizes reducelanes in diversityCalculation of PQVectors, for Euclidean function#623

Open
MarkWolters wants to merge 2 commits intomain from
reduce_lanes
Open

Optimizes reducelanes in diversityCalculation of PQVectors, for Euclidean function #623
MarkWolters wants to merge 2 commits intomain from
reduce_lanes

Conversation

@MarkWolters
Copy link
Contributor

@MarkWolters MarkWolters commented Feb 12, 2026

This pull request introduces improvement to Euclidean similarity function in PQVectors.diversityFunctionFor. From flamegraph, it is observed that considerable amount of time is spent in jdk/incubator/vector/FloatVector.reducelanesTemplate. This is mainly because FloatVector.reducelanes() is expensive and it is being called inside a for loop (via VectorUtil.squareL2Distance). Modification in this pull request moves call to reduceLanes() outside the for loop.

Change proposed here was tested with the benchmark, PQDistanceCalculationBenchmark.diversityCalculation
With this benchmark, ~18% reduction in time was observed when M=64 and ~22% when M=192.

Code modifications:

Added a new function pqDiversityEuclidean in VectorUtilSupport and its corresponding implementations
Removed for loop in PQVectors.diversityFunctionFor and moved it into pqDiversityEuclidean
Moved FloatVector.reducelanes() outside the for loop
Test setup:
Jvector version : main branch (as of 2025年08月28日)
JDK version : openjdk version "24.0.2" 2025年07月15日
Platform : INTEL(R) XEON(R) PLATINUM 8592+
Benchmark : PQDistanceCalculationBenchmark.diversityCalculation

New changes
I have modified the code to include dot product & cosine functions and implemented similar changes for scoreFunctionFor.
With the changes applied to scoreFunctionFor, when M=64: dot product shows ~30% reduction in latency & cosine shows ~43% reduction.
I can add data points for other subspace counts, if required.

Code modifications:

Added new functions pqScoreEuclidean, pqScoreDotProduct, pqScoreCosine in VectorUtilSupport and its corresponding implementations for diversityFunctionFor
Added overloaded version of above functions for scoreFunctionFor
Removed for loop in PQVectors.diversityFunctionFor and PQVectors.scoreFunctionFor and moved them into respective functions in PanamaVectorUtilSupport
Moved FloatVector.reducelanes() outside the for loop
Added a new benchmark which uses MutablePQVectors to test this
Test setup:
Jvector version : main branch (as of 2025年10月22日)
JDK version : openjdk version "25.0.1" 2025年10月21日
Platform : Intel(R) Xeon(R) 6979P
Benchmark : PQDistanceCalculationMutableVectorBenchmark (Added this by replicating PQDistanceCalculationBenchmark to measure performance for MutablePQVectors)

Copy link
Contributor

Before you submit for review:

  • Does your PR follow guidelines from CONTRIBUTIONS.md?
  • Did you summarize what this PR does clearly and concisely?
  • Did you include performance data for changes which may be performance impacting?
  • Did you include useful docs for any user-facing changes or features?
  • Did you include useful javadocs for developer oriented changes, explaining new concepts or key changes?
  • Did you trigger and review regression testing results against the base branch via Run Bench Main?
  • Did you adhere to the code formatting guidelines (TBD)
  • Did you group your changes for easy review, providing meaningful descriptions for each commit?
  • Did you ensure that all files contain the correct copyright header?

If you did not complete any of these, then please explain below.

Copy link
Contributor

@jshook jshook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. It would be nice to see the results laid out a little better for before and after, but it still looks like a solid improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

@jshook jshook jshook approved these changes

@tlwillke tlwillke Awaiting requested review from tlwillke tlwillke is a code owner

At least 2 approving reviews are required to merge this pull request.

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

AltStyle によって変換されたページ (->オリジナル) /