-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: BatchIt #677
WIP: BatchIt #677
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushing with some changes applied to get CI's opinion. Will address the rest of your review comments soon. |
Unlike its brethren, `dealloc_local_object` and `dealloc_local_object_slow`, the `dealloc_local_object_slower` method does not take a pointer to free space. Make this slightly more apparent by renaming it and adding some commentary to both definition and call site.
Make both _fast() and _slow() arms take the meta as an argument; _meta() already did.
Plumb its use around remoteallocator and remotecache
This prepares the recipient to process a batched message.
Exercise recipient machinery by having the senders collect adjacent frees to the same slab into a batch.
OK, I think that's all the review comments. Also, huh, I guess my time at MS using this account (what a mistake) reset the CLA bot, which I appeased all the way back in #41 (how time flies!). @microsoft-github-policy-service agree |
I'm unsure what I could have done to make Windows unhappy, but that suggest I've broken something subtle. :( I no longer have a Windows machine on which to test things; can I ask you to have a look, @mjp41? |
src/snmalloc/mem/metadata.h
Outdated
if (SNMALLOC_LIKELY(_batch < _meta->needed())) | ||
{ | ||
// Will not hit threshold for state transition | ||
_meta->needed() -= _batch; | ||
return false; | ||
} | ||
|
||
/* | ||
* If we've already taken a step and there are no more unreturned | ||
* objects, then we don't need to take another step. It can happen | ||
* that we get here because _meta->needed() is still 0, tho', so | ||
* bail. | ||
*/ | ||
if (!first && _batch == 0) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure what I could have done to make Windows unhappy, but that suggest I've broken something subtle. :(
I no longer have a Windows machine on which to test things; can I ask you to have a look, @mjp41?
I haven't tried on Windows yet. But just inspecting the CI, I am wondering if it is valid to read needed after we have performed the last meta data update. Could another thread have reused the meta-data by this point or turned the page behind the meta-data off? Perhaps, we need to use the batch size first in the cases where we can. Which is effectively what your original code did as it would only look another time if it was non-zero.
if (SNMALLOC_LIKELY(_batch < _meta->needed())) | |
{ | |
// Will not hit threshold for state transition | |
_meta->needed() -= _batch; | |
return false; | |
} | |
/* | |
* If we've already taken a step and there are no more unreturned | |
* objects, then we don't need to take another step. It can happen | |
* that we get here because _meta->needed() is still 0, tho', so | |
* bail. | |
*/ | |
if (!first && _batch == 0) | |
return false; | |
/* | |
* Batch is non-zero for non-first iterations, so only check if it isn't the | |
* first call. | |
*/ | |
SNMALLOC_ASSERT(!first || (batch != 0)); | |
if (!first && _batch == 0) | |
return false; | |
if (SNMALLOC_LIKELY(_batch < _meta->needed())) | |
{ | |
// Will not hit threshold for state transition | |
_meta->needed() -= _batch; | |
return false; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, updated the suggestion above to fix my failure to do basic logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's a good point. Lemme apply that fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that SNMALLOC_ASSERT
is right, though. The thing the new condition is guarding against is precisely the case that we have taken the first step (and enough subsequent steps) to have returned all objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm a bit confused why the first one would be empty? Perhaps, applying de-Morgan to the assert would be helpful:
SNMALLOC_ASSERT(!(first && batch == 0));
This holds when I build locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I just couldn't read. I'll add the assert.
Looks like that was it! |
This might involve multiple (I think at most two, at the moment) transitions in the slab lifecycle state machine. Towards that end, return indicators to the caller that the slow path must be taken and how many objects of the original set have not yet been counted as returned.
We can, as Matt so kindly reminds me, go get them from the pagemap. Since we need this value only when closing a ring, the read from over there is probably not very onerous. (We could also get the slab pointer from an object in the ring, but we need that whenever inserting into the cache, so it's probably more sensible to store that locally?)
Move ring set bits and associativity knobs to allocconfig and expose them via CMake. If associtivity is zero, use non-batched implementations of the `RemoteMessage` and `RemoteDeallocCacheBatching` classes. By default, kick BatchIt on when we have enough room in the minimum allocation size to do it. Exactly how much space is enough is a function of which mitigations we have enabled and whether or not we are compiling with C++20. This commit reverts the change to `MIN_ALLOC_SIZE` made in "Introduce RemoteMessage structure" now that we have multiple types, and zies, of remote messages to choose from.
There's no need for a full pointer here, it'd just make the structure larger on CHERI.
In order not to thwart `mitigations(random_preserve)` too much, if it's on in combination with BatchIt, roll the dice every time we append to a batch to decide if we should stochastically evict this batch. By increasing the number of batches, we allow the recipient allocator increased opportunity to randomly stripe batches across the two `freelist::Builder` segments associated with each slab.
@nwf want to click the merge button? |
This recreates the current contents of #637 from a different repository, since the lifetime of the @nwf-msr account is now pretty limited. (I couldn't find a way to move branches between repos in a way that swung the PR, so had to settle for this low-tech solution.)