DependencyTrack hangs when uploading a large SBOM to a project a second time #1905
Comments
Wow, that's certainly a first. Ever seen anything like this happening before, @stevespringett?
@JayAtFujifilm, would it be possible to provide the SBOM to us so we can try to reproduce the issue? |
7.5MB isn't that large. The BOM I use to perform all my performance testing with is 22MB and contains just over 9K components. See attached Bloated BOMs.zip. I think there's something else going on here, either memory or host configuration, or perhaps something in the BOM itself that contains an unexpectedly large amount of data in a field. |
Hello Guys, Looking at the code at line BomUploadProcessingTask.java:178 below, it is most likely a recursion issue (consistent with the stack overflow error) due to a deep parent-child component hierarchy:
```java
private void processComponent(final QueryManager qm, final Bom bom, Component component,
                              final List<Component> flattenedComponents) {
    component.setInternal(InternalComponentIdentificationUtil.isInternalComponent(component, qm));
    // ....
    if (component.getChildren() != null) {
        for (final Component child : component.getChildren()) {
            processComponent(qm, bom, child, flattenedComponents); // <-- Line #178
        }
    }
}
```
It would be interesting to know the maximum parent-child relation depth in the problematic SBOM. |
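To make the recursion concern concrete, here is a minimal sketch of flattening a component hierarchy without recursion. It is not the Dependency-Track implementation: `BomNode` and `IterativeFlattener` are hypothetical names used only for illustration, and the real method also persists components, which this sketch leaves out. It assumes the goal is only to visit every component once, regardless of nesting depth.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a BOM component; not the Dependency-Track model class.
final class BomNode {
    final String name;
    final List<BomNode> children = new ArrayList<>();

    BomNode(String name) {
        this.name = name;
    }
}

final class IterativeFlattener {
    // Walks the hierarchy with an explicit stack, so nesting depth consumes heap
    // instead of call-stack frames. Each object is visited at most once, which
    // also makes the walk terminate on cyclic parent/child references.
    static List<BomNode> flatten(BomNode root) {
        List<BomNode> flattened = new ArrayList<>();
        Set<BomNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<BomNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            BomNode current = stack.pop();
            if (!visited.add(current)) {
                continue; // already processed: repeated reference or cycle
            }
            flattened.add(current);
            // Push in reverse so children are visited in their original order.
            for (int i = current.children.size() - 1; i >= 0; i--) {
                stack.push(current.children.get(i));
            }
        }
        return flattened;
    }
}
```

The visited set is keyed on object identity rather than `equals`, so two distinct components that merely look alike are both kept; only a reference back to an already visited object is skipped.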
Yup,
```
dependency-track-dtrack-apiserver-1 | 2022-08-23 21:02:17,219 ERROR [LoggableUncaughtExceptionHandler] An unknown error occurred in an asynchronous event or notification thread
dependency-track-dtrack-apiserver-1 | java.lang.StackOverflowError: null
dependency-track-dtrack-apiserver-1 | at java.base/java.security.AccessController.doPrivileged(Native Method)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.replaceStateManager(StateManagerImpl.java:2096)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.initialiseForDetached(StateManagerImpl.java:644)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.initialiseForDetached(StateManagerImpl.java:126)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.detachCopy(StateManagerImpl.java:4932)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.detachCopy(StateManagerImpl.java:126)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.ExecutionContextImpl.detachObjectCopy(ExecutionContextImpl.java:2741)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.store.fieldmanager.DetachFieldManager.processPersistableCopy(DetachFieldManager.java:76)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.store.fieldmanager.DetachFieldManager.processField(DetachFieldManager.java:154)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.store.fieldmanager.DetachFieldManager.internalFetchObjectField(DetachFieldManager.java:121)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.store.fieldmanager.AbstractFetchDepthFieldManager.fetchObjectField(AbstractFetchDepthFieldManager.java:105)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.replacingObjectField(StateManagerImpl.java:1995)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.model.Component.dnReplaceField(Component.java)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.model.Component.dnReplaceFields(Component.java)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.replaceFields(StateManagerImpl.java:4320)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.replaceFields(StateManagerImpl.java:4345)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.detachCopy(StateManagerImpl.java:4941)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.state.StateManagerImpl.detachCopy(StateManagerImpl.java:126)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.ExecutionContextImpl.detachObjectCopy(ExecutionContextImpl.java:2741)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.api.jdo.JDOPersistenceManager.jdoDetachCopy(JDOPersistenceManager.java:1121)
dependency-track-dtrack-apiserver-1 | at org.datanucleus.api.jdo.JDOPersistenceManager.detachCopy(JDOPersistenceManager.java:1150)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.persistence.ComponentQueryManager.createComponent(ComponentQueryManager.java:321)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.persistence.QueryManager.createComponent(QueryManager.java:452)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:171)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
dependency-track-dtrack-apiserver-1 | at org.dependencytrack.tasks.BomUploadProcessingTask.processComponent(BomUploadProcessingTask.java:178)
```
|
The SBOM for which this occurs contains a deeply nested component hierarchy that mimics the layout of our source code. To build the SBOM, we iterate through our source code, building an individual SBOM for each project (.Net or NodeJS), and use the Cyclone CLI tool to merge the individual SBOMs. We also use "pseudo-components" to represent source code folders that don't directly contain a project. |
Thanks for the update. FYI, at this time, Dependency-Track will flatten the component inventory and will not preserve hierarchy. Support for parent/child relationships for both projects and components is planned. |
Thank you, syalioune, for reproducing this so quickly! |
It would definitely help if you could provide your anonymized SBOM. My test SBOM is somewhat biased and extreme, as I duplicated the same component in the nested hierarchy, probably causing an infinite recursion. |
We are working on narrowing down the cause of the problem and hope to have a smaller SBOM available for debugging soon. |
Hi @syalioune, can you please mention what information is required in the anonymized SBOM? The same issue is still present in the latest version. Will the structure of the SBOM help to debug? |
What is important for reproduction will be to have :
|
Here is the SBOM structure. I have verified that we can reproduce the issue using this structure-only SBOM. |
Hi all, We are also seeing this issue in our organisation. It sounds like this only happens on the second iteration. So would a feasible workaround be to purge the database prior to each run? What would be your recommendation? Also happy to provide logs etc. if that's going to help in any way. |
@JayAtFujifilm Did you find a workaround for this? |
… with nested duplicate See DependencyTrack#1905 for details Signed-off-by: syalioune <[email protected]>
Looking at the different logs and SBOMs provided, the common pattern that emerges is a nested duplicate component, like in the SBOM below:
```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "version": 1,
  "metadata": {
    "timestamp": "2023-01-01T11:01:51Z",
    "tools": [
      {
        "vendor": "changeme",
        "name": "changeme",
        "version": "0.62.3"
      }
    ]
  },
  "components": [
    {
      "bom-ref": "pkg:pypi/Pillow@9.3.0",
      "type": "library",
      "name": "Pillow",
      "version": "9.3.0",
      "cpe": "cpe:2.3:a:alex_clark_\\(pil_fork_author\\):python-Pillow:9.3.0:*:*:*:*:*:*:*",
      "purl": "pkg:pypi/Pillow@9.3.0",
      "components": [
        {
          "bom-ref": "pkg:pypi/Pillow@9.3.0?package-id=212c649613e17901",
          "type": "library",
          "name": "Pillow",
          "version": "9.3.0",
          "cpe": "cpe:2.3:a:alex_clark_\\(pil_fork_author\\):python-Pillow:9.3.0:*:*:*:*:*:*:*",
          "purl": "pkg:pypi/Pillow@9.3.0"
        }
      ]
    }
  ]
}
```
The scenario is:
Kind of the same premises as in #2131 (comment). The SBOM is flawed to begin with. @sahil3112 @dancundy @JayAtFujifilm can you please confirm that your non-redacted SBOM matches the pattern of the SBOM snippet above, and tell us which tool you used to generate it? However, DT can protect itself against it. Two possibilities:
I've submitted a draft PR with the second alternative. |
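As an illustration of what self-protection against this pattern could look like, the sketch below scans a parsed component tree for a child that has the same identity as one of its ancestors. This is only one conceivable guard and is not taken from the draft PR, whose approach is not described in this thread. `SbomComponent` and `NestedDuplicateCheck` are hypothetical names, and comparing name, version and PURL is an assumption about what "same identity" means.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified component model for illustration only.
final class SbomComponent {
    final String name;
    final String version;
    final String purl;
    final List<SbomComponent> children = new ArrayList<>();

    SbomComponent(String name, String version, String purl) {
        this.name = name;
        this.version = version;
        this.purl = purl;
    }

    // Assumption: two components are "the same" if name, version and PURL are equal.
    boolean sameIdentity(SbomComponent other) {
        return Objects.equals(name, other.name)
                && Objects.equals(version, other.version)
                && Objects.equals(purl, other.purl);
    }
}

final class NestedDuplicateCheck {
    // Links a component to the frame of its parent so the ancestor chain can be walked.
    private static final class Frame {
        final SbomComponent node;
        final Frame parent;

        Frame(SbomComponent node, Frame parent) {
            this.node = node;
            this.parent = parent;
        }
    }

    // Returns every component whose identity equals that of one of its ancestors.
    // Expects a tree as produced by parsing a BOM file (no object cycles).
    static List<SbomComponent> findNestedDuplicates(List<SbomComponent> roots) {
        List<SbomComponent> duplicates = new ArrayList<>();
        Deque<Frame> stack = new ArrayDeque<>();
        for (SbomComponent root : roots) {
            stack.push(new Frame(root, null));
        }
        while (!stack.isEmpty()) {
            Frame frame = stack.pop();
            for (Frame ancestor = frame.parent; ancestor != null; ancestor = ancestor.parent) {
                if (ancestor.node.sameIdentity(frame.node)) {
                    duplicates.add(frame.node);
                    break;
                }
            }
            for (SbomComponent child : frame.node.children) {
                stack.push(new Frame(child, frame));
            }
        }
        return duplicates;
    }
}
```

Applied to the Pillow snippet above, the nested `Pillow` 9.3.0 entry would be reported, since it repeats the name, version and PURL of its parent. What to do with such a finding (skip the child, reject the BOM, or log a warning) is a separate decision that the sketch does not make.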
@syalioune
Below are some more details: So apparently in the input BOM there is parent-child relationship between two components that essentially refer to the same package (and which I would expect to be deduplicated on the second import of the BOM). After the first import, this is what we see in the components overview in the DT UI:
And this is what we see in the H2 console after the first import:
So we see that the second component refers to the first one as its parent, which is consistent with the BOM. During the second import, in
The component returned is the one with ID=936, as expected. Next, the child is being retrieved from the database by its component identity:
In this case, also the component with ID=136 is returned. Consequently, the Looking at the H2 console, we can see that the cyclic reference has been persisted after the second import. Moreover, the child component does not have a parent anymore, and the PURL property of the parent was set to the value of the child:
I cannot judge whether this storage retrieval/persistence behaviour is expected or not, maybe it's supposed to be like that as part of the deduplication? Anyway I hope this helps in isolating and resolving the issue. Below is the reduced BOM that has the issue:
|
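The self-referencing parent described in the comment above can be guarded against at the point where a parent is assigned. The following is a minimal sketch under stated assumptions: `PersistedComponent` and `ParentCycleGuard` are hypothetical names, the model is an in-memory object with a single `parent` reference, and nothing here reflects how Dependency-Track or DataNucleus actually persist components.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical in-memory stand-in for a stored component row.
final class PersistedComponent {
    final long id;
    PersistedComponent parent;

    PersistedComponent(long id) {
        this.id = id;
    }
}

final class ParentCycleGuard {
    // True if making newParent the parent of child would close a loop,
    // including the degenerate case where both references are the same object.
    static boolean wouldCreateCycle(PersistedComponent child, PersistedComponent newParent) {
        Set<PersistedComponent> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (PersistedComponent current = newParent; current != null; current = current.parent) {
            if (current == child) {
                return true;
            }
            if (!seen.add(current)) {
                return false; // the chain already loops elsewhere; stop walking
            }
        }
        return false;
    }

    // Assigns the parent only when doing so keeps the hierarchy acyclic.
    static boolean setParentSafely(PersistedComponent child, PersistedComponent newParent) {
        if (wouldCreateCycle(child, newParent)) {
            return false;
        }
        child.parent = newParent;
        return true;
    }
}
```

In the situation reported above, the lookup for the child returned the same stored component as the lookup for the parent, so a check like this would refuse the assignment instead of persisting a component that is its own parent.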
Hello @salfie Thanks for your thorough investigation. It matches my observations and, based on that, I can provide the attached real-life reproducible example cyclonedx-gomod-issue-1905.zip. Given the example application and SBOM generation with cyclonedx-gomod using `cyclonedx-gomod app -json -output acme-app.bom.json -licenses -packages .`, we end up with the nested golang module/package components:
```json
{
  "bom-ref": "pkg:golang/cloud.google.com/go/storage@v1.30.1?type=module",
  "type": "library",
  "name": "cloud.google.com/go/storage",
  "version": "v1.30.1",
  "scope": "required",
  "hashes": [
    {
      "alg": "SHA-256",
      "content": "b8e74cc40b3c1c4c6a0659cbb674323f4624bdb883a5d1928462adc7a53fa0d3"
    }
  ],
  "purl": "pkg:golang/cloud.google.com/go/storage@v1.30.1?type=module\u0026goos=linux\u0026goarch=amd64",
  "components": [
    {
      "type": "library",
      "name": "cloud.google.com/go/storage",
      "version": "v1.30.1",
      "purl": "pkg:golang/cloud.google.com/go/storage@v1.30.1?type=package"
    }
  ]
}
```
The if both |
Signed-off-by: nscuro <[email protected]>
@syalioune Apologies for the delayed response, I only now got some time to look at BOM processing more closely.
Agreed. I'd even go one step further: If There are multiple issues closely related to this one. It all comes down to For a change in Hyades, I now switched the matching logic to being "strict", and that fixes both #1905 and #2519: DependencyTrack/hyades-apiserver@6418879#diff-3a9c95d09a4a5285037a7d5ba65613e09198ce2b460279622cadd8e703677d40 |
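To illustrate the difference between loose and strict identity matching, here is a small sketch. The exact matching query is not shown in this thread, so both predicates are assumptions: "loose" is modelled as "PURL equal, or name and version equal", and "strict" as "every compared field equal". `Identity` and `IdentityMatching` are hypothetical names, not the Hyades classes.

```java
import java.util.Objects;

// Hypothetical, reduced component identity: only PURL, name and version.
final class Identity {
    final String purl;
    final String name;
    final String version;

    Identity(String purl, String name, String version) {
        this.purl = purl;
        this.name = name;
        this.version = version;
    }
}

final class IdentityMatching {
    // Assumed "loose" behaviour: a match on any one coordinate set is enough.
    static boolean looseMatch(Identity a, Identity b) {
        boolean samePurl = a.purl != null && a.purl.equals(b.purl);
        boolean sameCoordinates = a.name != null
                && a.name.equals(b.name)
                && Objects.equals(a.version, b.version);
        return samePurl || sameCoordinates;
    }

    // Assumed "strict" behaviour: every field must be equal, nulls included.
    static boolean strictMatch(Identity a, Identity b) {
        return Objects.equals(a.purl, b.purl)
                && Objects.equals(a.name, b.name)
                && Objects.equals(a.version, b.version);
    }
}
```

With the cyclonedx-gomod example from earlier in the thread, the module and its nested package share name and version but differ in PURL (`type=module...` versus `type=package`). Under these assumptions the loose predicate treats them as the same component, while the strict one keeps them apart, which is the behaviour needed to avoid resolving parent and child to the same stored component.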
* Add bloated BOM for ingestion performance testing
* Prevent query compilation cache being bypassed for `matchSingleIdentity` queries. See DependencyTrack/dependency-track#2540. This also cleans the query from containing weird statements like `(cpe != null && cpe == null)` in case a component does not have a CPE.
* WIP: Improve BOM processing performance
* Handle dependency graph
* Improve dependency graph assembly. Instead of using individual bulk UPDATE queries, use setters on persistent components instead. This way we can again make use of batched flushing.
* Completely replace old processing logic. Also decompose large processing method into multiple smaller ones, and re-implement notifications.
* Fix not all BOM refs being updated with new component identities
* Be smarter about indexing component identities and BOM refs. Also add more documentation.
* Reduce logging noise
* Mark new components as such ... via new transient field. Required for compatibility with #217
* Compatibility with #217
* Cleanup tests
* Reduce code duplication
* Cleanup; Process services
* Finishing touches 🪄
* Make flush threshold configurable. The optimal value could depend on how beefy the database server is, and how much memory is available to the API server.
* Clarify `warn` log when rolling back active transactions
* Log number of consumed components and services before and after de-dupe
* Extend BOM processing test with bloated BOM
* Make component identity matching strict. To address DependencyTrack/dependency-track#2519 (comment). Also add regression test for this specific issue.
* Add regression test for DependencyTrack/dependency-track#1905
* Clarify why "reachability on commit" is disabled; Add assertion for persistent object state
* Add tests for `equals` and `hashCode` of `ComponentIdentity`
* Address review comments

Signed-off-by: nscuro <[email protected]>
No problem. Same time constraints here. I guess the fix you made in Hyades will be merged back here at some point? |
Yes, we have many improvements from Hyades in the pipeline that we want to contribute back soon, and this is of course one of them. 🤘 |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Current Behavior:
When we create a new project and upload a large SBOM (~7.5MB) there is no problem. However, if we then upload the SBOM again to the same project, the DepTrack API server hangs and never finishes processing. Inspection of the logfile via Docker indicates a stack overflow error (java.lang.StackOverflowError), as shown in the attached logfile (DependencyTrackLog.txt).
This happens even if we wait a long time (several days) to upload the second SBOM.
For smaller SBOMs we never have this problem.
Steps to Reproduce:
Using the DependencyTrack API:
Expected Behavior:
Should be able to upload even large SBOMs multiple times to the same project.
Environment:
Additional Details:
Unfortunately, we cannot send you the actual SBOM for proprietary reasons.