spritely/goblins
4
58
Fork
You've already forked goblins
4

WIP: Add ^hashtable actor #876

Draft
flockofbirbs wants to merge 2 commits from hashtable-actor into main
pull from: hashtable-actor
merge into: spritely:main
spritely:main
spritely:vat-log-resize-metacommand
spritely:fix-ocapn-ci
spritely:fix-796
spritely:support-zero-values
spritely:maybe-fix-ocapn-ci
spritely:document-wants-partial
spritely:prelay-docs
spritely:aurie-docs
spritely:lazy-captp
spritely:fast-sealers-fast-spawn
spritely:more-handoff-recieve-checks
spritely:ocapn-test-suite
spritely:add-keys-values-to-ghash-actor
spritely:docs-captp-ref-passing
spritely:fix-nested-with-vat
spritely:fix-281
spritely:captp-crossed-hellos
spritely:fix-243
spritely:sleepymaps
spritely:hootify-tests
spritely:move-define-actor-docs
spritely:websocket-netlayer
spritely:hoot-deprecation-warning
spritely:add-new-on-macros
spritely:replace-while-with-do
spritely:unenclose-syscaller
spritely:update-methods
spritely:new-devel
spritely:aurie-netlayer
spritely:devel
spritely:document-value-extraction
spritely:aurie-churn-fix
spritely:io-actor
spritely:cwebber-aurie-foundations
spritely:ci-code-coverage
spritely:0.12-news
spritely:gnutls-package
spritely:oops-all-goblins
spritely:relay-maybe-simplified
spritely:fake-indentation-fix
spritely:captp-remove-wants-partial-from-op-listen
spritely:mvre-wtf
spritely:multi-value-return-everywhere
spritely:simple-pick
spritely:remove-ports-dynamic-wrap
spritely:fast-spawn
spritely:additional-vat-events
spritely:onion-netlayer-temp
spritely:klugey-queue-captp-message
spritely:rest-of-captp
spritely:facet
spritely:ghash-pretty-print
spritely:fixing-syrup
spritely:fake-netlayer
spritely:onion-netlayer
spritely:string-formatting-errors
spritely:remove-waiter-cruft
spritely:lmethods
spritely:current-scheduler-kludge
spritely:hygienic-define-recordable
spritely:convert-captp

This is a typical buckets-and-chains hashtable that uses a vector-of-cells so set and remove operations only result in persisting a single cell's value as opposed to the entire table. The entire table is re-persisted only when the load factor dips above/below a threshold which is increasingly rare as the bucket grows.

Questions: Should this actor replace ^ghash? Should it take the ^ghash name?

Fixes #873

This is a typical buckets-and-chains hashtable that uses a vector-of-cells so `set` and `remove` operations only result in persisting a single cell's value as opposed to the entire table. The entire table is re-persisted only when the load factor dips above/below a threshold which is increasingly rare as the bucket grows. Questions: Should this actor replace `^ghash`? Should it take the `^ghash` name? Fixes #873
Marie-Joseph left a comment
Member
Copy link

This mostly looks good. I don't have any concerns, at least.

As an overall question, what is the intended use case for this actor? I certainly see how having something like this may be useful in general; do we need or want it for anything specific right now?

Another general inquiry. Would it be reasonable to treat chains as alists (perhaps with some wrappers around the usual alist getters and setters to ensure the proper equality predicates) so we can use Guile's list manipulation procedures on them? Or does that over-complicate the logic given the need to remove entries, overwrite existing keys, and use specific hash and equality procedures?

As to replacing ^ghash, I'm inclined towards "no" there. hashmap is intended to be both more performant and more memory-efficient than a chain-and-buckets hashtable, so going to the effort of switching to hashmap only to backtrack seems counterproductive.

This mostly looks good. I don't have any concerns, at least. As an overall question, what is the intended use case for this actor? I certainly see how having something like this may be useful in general; do we need or want it for anything specific right now? Another general inquiry. Would it be reasonable to treat chains as alists (perhaps with some wrappers around the usual alist getters and setters to ensure the proper equality predicates) so we can use Guile's list manipulation procedures on them? Or does that over-complicate the logic given the need to remove entries, overwrite existing keys, and use specific hash and equality procedures? As to replacing `^ghash`, I'm inclined towards "no" there. `hashmap` is intended to be both more performant and more memory-efficient than a chain-and-buckets hashtable, so going to the effort of switching to `hashmap` only to backtrack seems counterproductive.
@ -101,0 +220,4 @@
(define (^hashtable bcom)
"Spawn and return an actor that stores unordered key/value
pairs. Keys are compared by @code{eq?} for refrs and by @code{equal?}
for everything else.

Should we allow users to specify their preferred hash and equality procedures? Depending on the use case, that may not be necessary.

Should we allow users to specify their preferred hash and equality procedures? Depending on the use case, that may not be necessary.
Author
Owner
Copy link

Trouble here is that we don't have any way to meaningfully serialize a procedure so the best we could do is provide some fixed number of possibilities. The ghash approach has been good for a general-purpose thing.

Trouble here is that we don't have any way to meaningfully serialize a procedure so the best we could do is provide some fixed number of possibilities. The ghash approach has been good for a general-purpose thing.
Marie-Joseph marked this conversation as resolved
Author
Owner
Copy link

@Marie-Joseph wrote in #876 (comment):

As an overall question, what is the intended use case for this actor? I certainly see how having something like this may be useful in general; do we need or want it for anything specific right now?

The use case is Brux, the petname database. We don't want it to become increasingly expensive to use in a persistent vat as it grows, which would be the case if ^ghash were used.

Another general inquiry. Would it be reasonable to treat chains as alists (perhaps with some wrappers around the usual alist getters and setters to ensure the proper equality predicates) so we can use Guile's list manipulation procedures on them? Or does that over-complicate the logic given the need to remove entries, overwrite existing keys, and use specific hash and equality procedures?

Chains are indeed alists but yes, the alist procedures in Guile aren't suitable in most of the cases where we need to iterate. set and remove have special logic to them and elsewhere the code is so minimal that there's no real advantage to using them, it would just increase closure allocation.

As to replacing ^ghash, I'm inclined towards "no" there. hashmap is intended to be both more performant and more memory-efficient than a chain-and-buckets hashtable, so going to the effort of switching to hashmap only to backtrack seems counterproductive.

I haven't done any benchmarking to know which one is faster. It's likely that ^hashtable is more memory efficient, though. Putting performance concerns aside, the issue I have is that there would now be two hash actors and users will have to understand the tradeoffs to know which they want. I feel like there should just be one.

@Marie-Joseph wrote in https://codeberg.org/spritely/goblins/pulls/876#issuecomment-7638388: > As an overall question, what is the intended use case for this actor? I certainly see how having something like this may be useful in general; do we need or want it for anything specific right now? The use case is Brux, the petname database. We don't want it to become increasingly expensive to use in a persistent vat as it grows, which would be the case if `^ghash` were used. > Another general inquiry. Would it be reasonable to treat chains as alists (perhaps with some wrappers around the usual alist getters and setters to ensure the proper equality predicates) so we can use Guile's list manipulation procedures on them? Or does that over-complicate the logic given the need to remove entries, overwrite existing keys, and use specific hash and equality procedures? Chains are indeed alists but yes, the alist procedures in Guile aren't suitable in most of the cases where we need to iterate. `set` and `remove` have special logic to them and elsewhere the code is so minimal that there's no real advantage to using them, it would just increase closure allocation. > As to replacing `^ghash`, I'm inclined towards "no" there. `hashmap` is intended to be both more performant and more memory-efficient than a chain-and-buckets hashtable, so going to the effort of switching to `hashmap` only to backtrack seems counterproductive. I haven't done any benchmarking to know which one is faster. It's likely that `^hashtable` is more memory efficient, though. Putting performance concerns aside, the issue I have is that there would now be two hash actors and users will have to understand the tradeoffs to know which they want. I feel like there should just be one.

@flockofbirbs wrote in #876 (comment):

... the issue I have is that there would now be two hash actors and users will have to understand the tradeoffs to know which they want. I feel like there should just be one.

I have similar concerns, yeah. I don't have a strong preference which it is.

It's worth noting that ^ghash is used pretty extensively, and several of those uses assume the ability to provide an initial hashmap. Specifically, make-ocapn-peer-hashmap seeds several ^ghash actors used in ^mycapn and one in ^unix-domain-socket-server. ^hashtable would therefore need to support some mechanism for custom equality and hashing if it aims to replace ^ghash.

@flockofbirbs wrote in https://codeberg.org/spritely/goblins/pulls/876#issuecomment-7638571: > ... the issue I have is that there would now be two hash actors and users will have to understand the tradeoffs to know which they want. I feel like there should just be one. I have similar concerns, yeah. I don't have a strong preference which it is. It's worth noting that `^ghash` is used pretty extensively, and several of those uses assume the ability to provide an initial hashmap. Specifically, [`make-ocapn-peer-hashmap`](https://codeberg.org/spritely/goblins/src/commit/908d653c0f09dfb4d523532256b5d206d39593a2/goblins/ocapn/ids.scm#L256) seeds several `^ghash` actors used in `^mycapn` and one in `^unix-domain-socket-server`. `^hashtable` would therefore need to support some mechanism for custom equality and hashing if it aims to replace `^ghash`.
tsyesika left a comment
Owner
Copy link

This looks generally good. I think we should be replacing ^ghash with this though (including taking its name).

This looks generally good. I think we should be replacing `^ghash` with this though (including taking its name).
@ -55,6 +56,13 @@
[(as-list)
(vhash->list vh)]))
(define* (make-vector-of-cells size #:optional fill)
Owner
Copy link

Since we're pretty much doing the same thing in ^vector, can vector use this too?

Since we're pretty much doing the same thing in `^vector`, can vector use this too?
Author
Owner
Copy link

I had considered this but I didn't want to add another layer of indirection.

I had considered this but I didn't want to add another layer of indirection.
flockofbirbs changed title from (削除) Add ^hashtable actor (削除ここまで) to WIP: Add ^hashtable actor 2025年10月09日 16:46:07 +02:00
flockofbirbs force-pushed hashtable-actor from 0c87508b08
All checks were successful
/ no-tabs-please (push) Successful in -1m38s
/ unit-tests (push) Successful in -8s
/ distcheck (push) Successful in 1m7s
/ ocapn-test-suite (push) Successful in 1m5s
/ basic-hoot-check (push) Successful in -18s
to 42da8ae351
All checks were successful
/ no-tabs-please (push) Successful in -1m37s
/ unit-tests (push) Successful in -8s
/ distcheck (push) Successful in 1m8s
/ ocapn-test-suite (push) Successful in 1m5s
/ basic-hoot-check (push) Successful in -18s
2025年10月09日 18:15:44 +02:00
Compare
All checks were successful
/ no-tabs-please (push) Successful in -1m37s
Required
Details
/ unit-tests (push) Successful in -8s
Required
Details
/ distcheck (push) Successful in 1m8s
Required
Details
/ ocapn-test-suite (push) Successful in 1m5s
/ basic-hoot-check (push) Successful in -18s
This pull request is marked as a work in progress.
This branch is out-of-date with the base branch
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin hashtable-actor:hashtable-actor
git switch hashtable-actor
Sign in to join this conversation.
No reviewers
Milestone
Clear milestone
No items
No milestone
Projects
Clear projects
No items
No project
Assignees
Clear assignees
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
spritely/goblins!876
Reference in a new issue
spritely/goblins
No description provided.
Delete branch "hashtable-actor"

Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?