Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

feat: cacheless hamt iteration #2216

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
hanabi1224 wants to merge 8 commits into filecoin-project:master
base: master
Choose a base branch
Loading
from hanabi1224:hamt/for_each_cacheless

Conversation

@hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Sep 4, 2025
edited
Loading

(Simliar to #2189)
This PR adds HamtImpl::for_each_cacheless and StateTree::for_each_cacheless to reduce memory usage for iterating over a large hamt, to address #2215 (comment)

cargo bench shows no significant difference between for_each and for_each_cacheless

HAMT for_each function time: [108.24 μs 108.70 μs 109.19 μs]
Found 4 outliers among 100 measurements (4.00%)
 4 (4.00%) high mild
HAMT for_each_cacheless function
 time: [106.85 μs 107.63 μs 108.58 μs]
Found 2 outliers among 100 measurements (2.00%)
 1 (1.00%) high mild
 1 (1.00%) high severe

Copy link

codecov-commenter commented Sep 4, 2025
edited
Loading

Codecov Report

❌ Patch coverage is 65.43210% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.52%. Comparing base (71c2f8c) to head (b0385e5).

Files with missing lines Patch % Lines
fvm/src/state_tree.rs 0.00% 10 Missing ⚠️
ipld/hamt/src/node.rs 83.01% 9 Missing ⚠️
ipld/hamt/src/pointer.rs 0.00% 9 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@ Coverage Diff @@
## master #2216 +/- ##
==========================================
- Coverage 77.58% 77.52% -0.07% 
==========================================
 Files 147 147 
 Lines 15789 15870 +81 
==========================================
+ Hits 12250 12303 +53 
- Misses 3539 3567 +28 
Files with missing lines Coverage Δ
ipld/hamt/src/hamt.rs 97.27% <100.00%> (+0.11%) ⬆️
ipld/hamt/src/hash_algorithm.rs 73.33% <ø> (ø)
ipld/hamt/src/lib.rs 100.00% <ø> (ø)
ipld/hamt/src/node.rs 90.26% <83.01%> (-1.05%) ⬇️
ipld/hamt/src/pointer.rs 79.06% <0.00%> (-5.94%) ⬇️
fvm/src/state_tree.rs 72.61% <0.00%> (-3.15%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hanabi1224 hanabi1224 marked this pull request as ready for review September 4, 2025 12:31
fn hash<X>(key: &X) -> HashedKey
where
X: Hash,
X: Hash + ?Sized,
Copy link
Contributor Author

@hanabi1224 hanabi1224 Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix a clippy warning

@hanabi1224 hanabi1224 self-assigned this Sep 5, 2025
@BigLep BigLep moved this from 📌 Triage to 🔎 Awaiting Review in FilOz Sep 8, 2025
Ok(())
}

pub fn for_each_cacheless<F>(&self, mut f: F) -> anyhow::Result<()>
Copy link
Contributor

@LesnyRumcajs LesnyRumcajs Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs?

Copy link
Contributor Author

@hanabi1224 hanabi1224 Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@LesnyRumcajs LesnyRumcajs Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog entry?

Copy link
Contributor Author

@hanabi1224 hanabi1224 Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

All in all, LGTM, but I'll let @rvagg have a final say here, my HAMT-fu is not that strong.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds cacheless iteration functionality to HAMT (Hash Array Mapped Trie) to reduce memory usage when iterating over large HAMTs. The primary addition is for_each_cacheless methods that avoid caching nodes during iteration, which can be more memory-efficient for one-time traversals.

  • Implements for_each_cacheless methods for both HamtImpl and StateTree
  • Adds comprehensive tests for the new cacheless iteration functionality
  • Includes benchmarks comparing performance between cached and cacheless iteration

Reviewed Changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
ipld/hamt/tests/hamt_tests.rs Adds comprehensive test suite for for_each_cacheless functionality and refactors existing test infrastructure
ipld/hamt/src/pointer.rs Implements Clone trait for Pointer enum to support cacheless iteration
ipld/hamt/src/node.rs Implements Clone trait for Node and adds core for_each_cacheless iteration logic
ipld/hamt/src/lib.rs Adds Clone trait to KeyValuePair struct to support cloning operations
ipld/hamt/src/hash_algorithm.rs Minor refactor to parameter order for Identity::hash method
ipld/hamt/src/hamt.rs Adds public for_each_cacheless method to the main HAMT API
ipld/hamt/benches/hamt_benchmark.rs Adds benchmark for for_each_cacheless and refactors existing benchmarks
ipld/hamt/Cargo.toml Adds itertools dependency for test utilities
fvm/src/state_tree.rs Adds for_each_cacheless method to StateTree API

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

}
IterItem::Borrowed(Pointer::Dirty(node)) => stack.push(node.pointers.iter().into()),
IterItem::Owned(Pointer::Dirty(node)) => {
stack.push(node.pointers.clone().into_iter().into())
Copy link

Copilot AI Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clone() call here creates an unnecessary copy of the entire pointers vector. Consider restructuring to avoid this clone operation, as it defeats the memory efficiency purpose of the cacheless iteration.

Suggested change
stack.push(node.pointers.clone().into_iter().into())
stack.push(node.pointers.into_iter().into())

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

@hanabi1224 hanabi1224 Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

/// map.set(4, 2).unwrap();
///
/// let mut total = 0;
/// map.for_each_cacheless(|_, v: &u64| {
Copy link

Copilot AI Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type annotation in the example should be &usize to match the generic type used in the example setup, not &u64.

Suggested change
/// map.for_each_cacheless(|_, v: &u64| {
/// map.for_each_cacheless(|_, v: &usize| {

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

@hanabi1224 hanabi1224 Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor Author

hanabi1224 commented Oct 23, 2025
edited
Loading

Hey @LesnyRumcajs @rvagg I have addressed all outstanding comments and all CI checks are green, please take another look.

Copy link
Contributor

@LesnyRumcajs LesnyRumcajs left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'll let @rvagg make a final call. As mentioned earlier, my HAMT-fu is weak. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

@LesnyRumcajs LesnyRumcajs LesnyRumcajs left review comments

Copilot code review Copilot Copilot left review comments

@rvagg rvagg Awaiting requested review from rvagg

At least 1 approving review is required to merge this pull request.

Labels

None yet

Projects

Status: 🔎 Awaiting Review

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

AltStyle によって変換されたページ (->オリジナル) /