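I tried to understand them from the header comments, but I am not sure I got them right. Suppose the context is created with kUSER_MANAGED and weight streaming is enabled.
From getStreamableWeightsSize (2) I can get the total streamable weight size, calculate a budget from it, and set that budget through setWeightStreamingBudgetV2 (3). What is the result of getWeightStreamingScratchMemorySize (4) used for? Is it just informational?
As for getDeviceMemorySizeV2 (1):
Should it be called after setWeightStreamingBudgetV2 (3)?
I think its return value is the upper limit of the total size, so setDeviceMemoryV2 (6) only needs to be called once with a buffer of that size, and there is no need to call updateDeviceMemorySizeForShapes (5) at all. Is that right? Otherwise, setDeviceMemoryV2 (6) would have to be called with a new buffer, sized by the return value of updateDeviceMemorySizeForShapes (5), every time new shapes are set.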
First, you need to understand ScratchMemory, which is mainly used for temporary storage required by layer implementations.
DeviceMemory is mainly used for intermediate activation tensors, and also for temporary storage required by layer implementations, so it includes ScratchMemory.
Treat the weight streaming budget as a WSB_op, so k streamable weights mean k WSB_ops.
getDeviceMemorySizeV2() = sum { ScratchMemory }
getStreamableWeightsSize() = sum { streamable weight }
getWeightStreamingScratchMemorySize() = size of the scratch memory required by the current WSB op.
If getDeviceMemorySizeV2() is called before enabling weight streaming by setWeightStreamingBudgetV2(), the return value will not include the extra scratch memory size required by weight streaming, which can be obtained using getWeightStreamingScratchMemorySize(). Otherwise, it will include this extra memory.
If the budget set by setWeightStreamingBudgetV2() is larger than the total size of streamable weights obtained by getStreamableWeightsSize(), the budget will be clipped to the total size, effectively disabling weight streaming.
You can query the budget currently in effect with getWeightStreamingBudgetV2().
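The ordering and clipping rules above can be sketched roughly as follows. This is only an illustration, not TensorRT code: `MockEngine`, `MemoryPlan`, `planMemory`, and all the sizes are hypothetical stand-ins that model nothing more than the behavior described in this comment, and the mock assumes the streaming scratch size is zero whenever streaming is not active. `planMemory` is a template, so in principle it should also accept a real `ICudaEngine`, assuming the method names and `int64_t` signatures match the installed headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using std::int64_t;

// Hypothetical stand-in for ICudaEngine. It models only the behavior
// described above; all sizes are made-up numbers.
struct MockEngine {
    int64_t streamableWeights = 1000; // total size of streamable weights
    int64_t baseDeviceMemory = 400;   // device memory without weight streaming
    int64_t wsScratch = 64;           // extra scratch needed while streaming
    int64_t budget = -1;              // mock convention: -1 means "not set yet"

    int64_t getStreamableWeightsSize() const { return streamableWeights; }

    bool setWeightStreamingBudgetV2(int64_t requested) {
        assert(requested >= 0);
        // A budget above the streamable total is clipped to the total,
        // which effectively disables weight streaming.
        budget = std::min(requested, streamableWeights);
        return true;
    }

    int64_t getWeightStreamingBudgetV2() const { return budget; }

    // Mock assumption: streaming is active only for a budget below the total.
    bool streaming() const { return budget >= 0 && budget < streamableWeights; }

    int64_t getWeightStreamingScratchMemorySize() const {
        return streaming() ? wsScratch : 0;
    }

    // Includes the streaming scratch only once streaming has been enabled.
    int64_t getDeviceMemorySizeV2() const {
        return baseDeviceMemory + getWeightStreamingScratchMemorySize();
    }
};

struct MemoryPlan {
    int64_t budget;       // budget actually in effect after clipping
    int64_t deviceMemory; // size to allocate for setDeviceMemoryV2()
};

// Sets the budget first, then queries the device memory size, so that the
// returned size already accounts for the weight streaming scratch memory.
template <class Engine>
MemoryPlan planMemory(Engine& engine, double fractionOnDevice) {
    const int64_t total = engine.getStreamableWeightsSize();
    const auto requested = static_cast<int64_t>(total * fractionOnDevice);
    engine.setWeightStreamingBudgetV2(requested);
    return {engine.getWeightStreamingBudgetV2(), engine.getDeviceMemorySizeV2()};
}
```

With the mock numbers, a fraction of 0.5 gives a budget of 500 and a device memory size of 464, while a fraction of 2.0 is clipped to 1000 and leaves the device memory size at 400.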
@lix19937 Okay, thank you!
So setWeightStreamingBudgetV2() sets the budget for persistent memory managed by TensorRT, while setDeviceMemoryV2() provides temporary memory that can be managed by the user. Is that right?
What about updateDeviceMemorySizeForShapes? Does it adjust the persistent memory and return a new temporary memory size according to the input shapes?
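Whichever answer turns out to be correct, the user-managed buffer can be handled in one of two ways: size it once, or re-query it after each shape change. A minimal sketch covering both, assuming updateDeviceMemorySizeForShapes() returns the size needed for the currently set input shapes (as its name suggests, not something confirmed in this thread). `MockContext` and `GrowOnlyBuffer` are hypothetical names, and `std::vector<char>` stands in for a device buffer that real code would allocate with `cudaMalloc`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for IExecutionContext with only the two calls used here.
struct MockContext {
    std::int64_t requiredForShapes = 0; // made-up size for the current shapes
    void* boundMemory = nullptr;
    std::int64_t boundSize = 0;

    std::int64_t updateDeviceMemorySizeForShapes() { return requiredForShapes; }

    void setDeviceMemoryV2(void* memory, std::int64_t size) {
        boundMemory = memory;
        boundSize = size;
    }
};

// Keeps one buffer and reallocates only when the required size grows.
class GrowOnlyBuffer {
public:
    // Seeding the capacity with an upper bound means rebind() never reallocates.
    explicit GrowOnlyBuffer(std::int64_t initial = 0)
        : storage_(static_cast<std::size_t>(initial)) {}

    // Call after all input shapes have been set on the context.
    // Returns true if the buffer had to be reallocated.
    template <class Context>
    bool rebind(Context& context) {
        const std::int64_t needed = context.updateDeviceMemorySizeForShapes();
        assert(needed >= 0);
        bool grew = false;
        if (needed > capacity()) {
            storage_.resize(static_cast<std::size_t>(needed));
            grew = true;
        }
        // Hand the (possibly new) buffer to the context for the current shapes.
        context.setDeviceMemoryV2(storage_.data(), needed);
        return grew;
    }

    std::int64_t capacity() const {
        return static_cast<std::int64_t>(storage_.size());
    }

private:
    std::vector<char> storage_; // stand-in for device memory
};
```

Constructing the buffer with the value of getDeviceMemorySizeV2() corresponds to the "set once" strategy, provided that value really is an upper bound over all shapes. Starting from zero corresponds to re-sizing after each shape change.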
Hi, I upgraded TensorRT from 8.4 to 10.7 and found several new interfaces for getting and setting memory sizes. I have some questions about them.
The interfaces involved are:
1. ICudaEngine::getDeviceMemorySizeV2
2. ICudaEngine::getStreamableWeightsSize
3. ICudaEngine::setWeightStreamingBudgetV2
4. ICudaEngine::getWeightStreamingScratchMemorySize
5. IExecutionContext::updateDeviceMemorySizeForShapes
6. IExecutionContext::setDeviceMemoryV2