-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement slab level cache for remote frees #634
Comments
Hi Matt,
We sure are. In tremor, and I have a project at work ( axiom now ) where I
think it's a great fit.
We'll ask the community as well
Cheers,
Darach.
…On Thu, Sep 14, 2023, 22:15 Matthew Parkinson ***@***.***> wrote:
@Licenser <https://github.com/Licenser> @darach
<https://github.com/darach> are you still using snmalloc. If so do you
have any benchmarks that represent your workload? We have some ideas that
would benefit your kind of workload for Tremor.
—
Reply to this email directly, view it on GitHub
<#634 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABX5MW5MOJBESOQQAOE7FLX2NQVNANCNFSM6AAAAAA4YXRFRI>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Awesome thanks. Any benchmarks we can run would really help us in justifying the engineering work. |
Our CI benchmarks are snmalloc based https://www.tremor.rs/benchmarks/ - relatively boring UI there. They have changed enough though in themselves and we rewrote the runtime and connectors so the benchmarking |
I've spent a while playing with various cache strategies (though nothing very sophisticated, just sort of seeing what the low-hanging fruit was like). For the two workloads I've tried, for three interesting choices of hashes, I currently have these message counts:
The hash here is inspired by https://github.com/skeeto/hash-prospector but is sort of "a third" of that:
I'm sure it's possible to do better, but that seems to work alright, though it's not sensitive to the upper bits of the slab! Of note, however, changing the shift to |
* NFC: split freelist_queue from remoteallocator This lets us use freelists as message queues in contexts other than the remoteallocator. No functional change indended. * freelist_queue: add and use destroy_and_iterate * freelist: make backptr obfuscation key "tweakable" * freelist: tweakable keys in forward direction, too * test/perf/msgpass: ubench a producer-consumer app Approximate a message-passing application as a set of producers, a set of consumers, and a set of proxies that do both. We'll use this for some initial insight for #634 but it seems worth having in general.
* msvc: set __cplusplus to the actual value in use * ds_core/bits: add mask_bits; convert one_at_bit-s * remotecache: enable reserve_space multiple objects * nits * Small changes to tracing - Trace "Handling remote" once per batch, rather than per element - Remote queue events also log the associated metaslab; we'll use this to assess the efficacy of #634 * freelist builder: allow forcibly tracking length * Try forward declaring freelist::Builder to appease macos-14 * freelist: tweak intra-slab obfuscation keys by meta address * NFC: freelist: allow `next` to be arbitrary value * Switch to a central, tweaked key for all free lists * allocconfig: introduce some properties of slabs We'll use these to pack values in message queues. - Maximum distance between two objects in a single slab - Maximum number of objects in a slab * NFC: Templatize LocalCache on Config * NFC: split dealloc_local_object_slow We'll use the _slower form when we're just stepping a slab through multiple rounds of state transition (to come), which can't involve the actual memory object in question. * NFC: make freelist::Object::T-s by placement new * NFC: CoreAlloc: split dealloc_local_object The pattern of `if (!fast()) { slow() }` occurs in a few places, including in contexts where we already know the entry and so don't need to look it up.
@nwf-msr observed that we could improve the performance of remote deallocation if the producer does more work on building lists for each slab before returning to the original allocator. This could improve producer/consumer scenarios further.
Tasks
The text was updated successfully, but these errors were encountered: