-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release memory where possible in RestElasticSearchClient #4685
Conversation
24c3181
to
6703d18
Compare
serializedSize += 1; //For follow-up NEW_LINE_BYTES | ||
if (this.requestSource != null) { | ||
serializedSize += this.requestSource.length; | ||
serializedSize+= 1; //For follow-up NEW_LINE_BYTES | ||
serializedSize += 1; //For follow-up NEW_LINE_BYTES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed some formatting that I missed previously
ListIterator<ElasticSearchMutation> requestsIter = requests.listIterator(); | ||
while (requestsIter.hasNext()) { | ||
ElasticSearchMutation request = requestsIter.next(); | ||
//Remove the element from the collection so the collection's reference to it doesn't hold it from being | ||
//GC'ed after it has been converted to its serialized form | ||
requestsIter.set(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressing point 1 in #4684
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mutating the List
that is passed in, which will then be exposed to the caller since we can't take ownership of the List in Java in this context. However from reviewing the call sites, they create the list within their function scopes, and don't do anything with it after passing it into here so this should be "safe", but mutating a passed in list is sometimes frowned upon in Java, so I wanted to call that out.
Call Hierarchy analysis by IntelliJ finds:
In each case the passed in List is created in that callsite and isn't mutated, or even read, after being passed to here, at least at the time of this writing.
RequestBytes next = requestIterator.next(); | ||
//Remove the element from the collection, so the iterator doesn't prevent it from being GC'ed | ||
//due to the reference to it in the collection | ||
requestIterator.set(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressing point 2 in #4684
@@ -490,14 +500,20 @@ class BulkRequestChunker implements Iterator<List<RequestBytes>> { | |||
//There is no "correct" number of actions to perform in a single bulk request. Experiment with different | |||
// settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum | |||
// size of a HTTP request to 100mb by default | |||
private final PeekingIterator<RequestBytes> requestIterator; | |||
private final ListIterator<RequestBytes> requestIterator; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to expose being able to set the element to null, which isn't exposed in PeekingIterator
so took on the peek()
behavior here, and switched to ListIterator
private int writeTo(byte[] target, int initialOffset) { | ||
int offset = initialOffset; | ||
System.arraycopy(this.requestBytes, 0, target, offset, this.requestBytes.length); | ||
offset += this.requestBytes.length; | ||
System.arraycopy(NEW_LINE_BYTES, 0, target, offset, NEW_LINE_BYTES.length); | ||
offset += NEW_LINE_BYTES.length; | ||
if (this.requestSource != null) { | ||
outputStream.write(requestSource); | ||
outputStream.write(NEW_LINE_BYTES); | ||
System.arraycopy(this.requestSource, 0, target, offset, this.requestSource.length); | ||
offset += this.requestSource.length; | ||
System.arraycopy(NEW_LINE_BYTES, 0, target, offset, NEW_LINE_BYTES.length); | ||
offset += NEW_LINE_BYTES.length; | ||
} | ||
return offset; | ||
} | ||
} | ||
|
||
private Pair<String, byte[]> buildBulkRequestInput(List<RequestBytes> requests, String ingestPipeline) throws IOException { | ||
final ByteArrayOutputStream outputStream = new ByteArrayOutputStream(); | ||
private Pair<String, byte[]> buildBulkRequestInput(List<RequestBytes> requests, String ingestPipeline) { | ||
int totalBytes = requests.stream().mapToInt(RequestBytes::getSerializedSize).sum(); | ||
//By making a singular array we copy into avoids any dynamically expanded growth of the array that may overshoot | ||
//how much memory we actually need, additionally it also avoids a final copy at the end normally done by | ||
//ByteArrayOutputStream's toByteArray() | ||
byte[] bytes = new byte[totalBytes]; | ||
int offset = 0; | ||
for (final RequestBytes request : requests) { | ||
request.writeTo(outputStream); | ||
//We can't remove the element from the collection like we do elsewhere, because we need to retain the | ||
//serialized form in case of an error so the error can be paired to the originating request based on index | ||
offset = request.writeTo(bytes, offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressing point 3 in #4684
IIRC @porunov & @li-boxuan you two reviewed this code last time, so this may interest you as an enhancement on that submission. As always happy to make changes, hoping this may also help others that bump into OOMs in this area 😅 . |
6703d18
to
66ccdc8
Compare
Hi @criminosis !
|
Closes JanusGraph#4684 Signed-off-by: Allan Clements <[email protected]>
66ccdc8
to
75d6f51
Compare
Thanks @porunov . Unfortunately missed those from a larger failure with the index provider tests I fixed before turning in last night. Sorry about that 🤦♂️ . Those tests were passing in an immutable list which blows up at runtime when the iterator attempts to call |
I'm going to pull this back to draft status. I think I have discovered a behavior asymmetry between the Java reference driver and the Rust (unofficial) driver. It seems like sessions aren't getting closed in JanusGraph in the internal Tinkerpop server and the Out Of Memory Errors that prompted this PR are just the part of the iceberg that stuck out of the water. |
So the apparent asymmetry ultimately lead to discovering a bug I believe in the Tinkerpop server. I opened an issue over there: https://issues.apache.org/jira/browse/TINKERPOP-3113 In the mean time @porunov I'm gonna go ahead and close this. If there's a desire for it happy to reopen, but the original premise that prompted this PR is no longer valid. Thank you for your attention nonetheless though. |
Closes #4684
Thank you for contributing to JanusGraph!
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
master
)?For code changes:
For documentation related changes: