feat(mito): Optimize async index building with priority-based batching #7034

Open
SNC123 wants to merge 6 commits into GreptimeTeam:main
base: main
from SNC123:feat/async_index_build_optimization

Conversation

SNC123 (Contributor) commented Sep 27, 2025 (edited)

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

#6756 (Tracking Issue)

What's changed and what's your intention?

This PR aims to optimize the asynchronous index building process in the Mito engine. The main intention is to improve efficiency and resource management when building indexes for SST files.

The key changes include:

  • Priority-based Batching Scheduler: Introduced a new IndexBuildScheduler that groups index build tasks into batches and processes them based on priority. This prevents the system from being overwhelmed by numerous small tasks and allows for more efficient resource utilization.
  • Separate Index File ID: Added an index_file_id to FileMeta. This allows an index file to have a different ID from its corresponding data (SST) file, which is crucial for the async build process where an index is created after the SST file already exists. This change has been propagated through read, write, and delete paths.
    Related discussion: feat: introduce IndexBuildTask for async index build #6927 (comment)
  • Cache Cleanup: Fixed an issue where stale Puffin-related cache entries were not correctly purged when an index was rebuilt.
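
The batching idea in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the scheduler from this PR: `BatchScheduler`, `Task`, `Priority`, and the fixed `batch_size` are hypothetical names chosen for the example, and the real `IndexBuildScheduler` may order and size batches differently. The sketch assumes that higher-priority tasks go first and that tasks of equal priority are served in submission order.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical priority levels; the levels used in the PR are not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    Normal,
    High,
}

/// Hypothetical stand-in for one index build task on an SST file.
#[derive(Debug, PartialEq, Eq)]
struct Task {
    priority: Priority,
    /// Submission order, used to keep FIFO order within one priority.
    seq: u64,
    file_id: u32,
}

impl Ord for Task {
    fn cmp(&self, other: &Self) -> Ordering {
        // BinaryHeap is a max-heap: compare by priority first, then prefer
        // the older task (smaller seq) by reversing the seq comparison.
        self.priority
            .cmp(&other.priority)
            .then_with(|| other.seq.cmp(&self.seq))
    }
}

impl PartialOrd for Task {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Hypothetical scheduler that hands out pending tasks in bounded batches.
struct BatchScheduler {
    pending: BinaryHeap<Task>,
    next_seq: u64,
    batch_size: usize,
}

impl BatchScheduler {
    fn new(batch_size: usize) -> Self {
        Self {
            pending: BinaryHeap::new(),
            next_seq: 0,
            batch_size,
        }
    }

    /// Queue a task; nothing is executed until a batch is requested.
    fn submit(&mut self, priority: Priority, file_id: u32) {
        self.pending.push(Task {
            priority,
            seq: self.next_seq,
            file_id,
        });
        self.next_seq += 1;
    }

    /// Pop up to `batch_size` tasks, highest priority first.
    fn next_batch(&mut self) -> Vec<Task> {
        let mut batch = Vec::with_capacity(self.batch_size.min(self.pending.len()));
        while batch.len() < self.batch_size {
            match self.pending.pop() {
                Some(task) => batch.push(task),
                None => break,
            }
        }
        batch
    }
}
```

Capping each batch bounds how much work is in flight at once, while the heap keeps urgent builds from waiting behind a backlog of low-priority ones.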

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

github-actions bot added the size/XL and docs-not-required (This change does not impact docs.) labels Sep 27, 2025
SNC123 changed the title from "perf: some optimization for async index build" to "perf: optimization for async index build" Sep 27, 2025
SNC123 added 6 commits October 24, 2025 10:26
Signed-off-by: SNC123 <sinhco@outlook.com>
SNC123 force-pushed the feat/async_index_build_optimization branch from 552a39d to 3824c30 October 24, 2025 03:46
SNC123 changed the title from "perf: optimization for async index build" to "feat(mito): Optimize async index building with priority-based batching" Oct 24, 2025
SNC123 marked this pull request as ready for review October 24, 2025 04:36
Comment on lines +543 to +553
let worker_request = WorkerRequest::Background {
    region_id: self.file_meta.region_id,
    notify: BackgroundNotify::IndexBuildStopped(IndexBuildStopped {
        region_id: self.file_meta.region_id,
        file_id: self.file_meta.file_id,
    }),
};
let _ = self
    .request_sender
    .send(WorkerRequestWithTime::new(worker_request))
    .await;
Contributor

Duplicate processing with the outer

Contributor

And why not call on_failure?

Contributor Author

Maybe a mistake from the rebase... 🐁
Will fix later.
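
One way to avoid the duplicate notification raised in this thread is to keep the build step free of side effects and send the "stopped" message from a single failure hook. The sketch below is only an illustration under assumptions: `BuildTask`, `Notify`, and the `should_fail` flag are hypothetical, a std `mpsc` channel stands in for the worker request sender, and `on_failure` merely borrows the name the reviewer mentions; its real behavior is not shown in this thread.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical notifications sent back to the region worker.
#[derive(Debug, PartialEq, Eq)]
enum Notify {
    Finished { file_id: u32 },
    Stopped { file_id: u32 },
}

/// Hypothetical index build task holding a channel to the worker.
struct BuildTask {
    file_id: u32,
    sender: Sender<Notify>,
}

impl BuildTask {
    /// Inner build step: reports its outcome and never notifies by itself.
    /// `should_fail` only simulates an error for this example.
    fn build(&self, should_fail: bool) -> Result<(), String> {
        if should_fail {
            Err(format!("index build failed for file {}", self.file_id))
        } else {
            Ok(())
        }
    }

    /// Single failure hook (assumed behavior): the only place that sends `Stopped`.
    fn on_failure(&self, _err: &str) {
        let _ = self.sender.send(Notify::Stopped {
            file_id: self.file_id,
        });
    }

    /// Outer entry point: exactly one notification per run, whatever the outcome.
    fn run(&self, should_fail: bool) {
        match self.build(should_fail) {
            Ok(()) => {
                let _ = self.sender.send(Notify::Finished {
                    file_id: self.file_id,
                });
            }
            Err(err) => self.on_failure(&err),
        }
    }
}
```

With this split, an inner code path cannot emit a second `Stopped` message, because only `run` and `on_failure` ever touch the sender.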


Reviewers

zhongzc left review comments

evenyag (code owner): awaiting requested review

v0y4g3r (code owner): awaiting requested review

waynexia (code owner): awaiting requested review

At least 2 approving reviews are required to merge this pull request.

Assignees

No one assigned

Labels

docs-not-required (This change does not impact docs.), size/M

Projects

None yet

Milestone

No milestone

2 participants
