This repository has been archived by the owner on Apr 13, 2022. It is now read-only.

Node crash due to failed rollback in Hybrid #191

Open
dkaidalov opened this issue Feb 19, 2018 · 6 comments

@dkaidalov
Contributor

I hit this problem constantly when I run two or more nodes. A node may crash with the following exception:

java.util.NoSuchElementException: versionID not found, can not rollback
	at io.iohk.iodb.LSMStore.notFound$1(LSMStore.scala:868)
	at io.iohk.iodb.LSMStore.$anonfun$rollback$2(LSMStore.scala:875)
	at scala.Option.getOrElse(Option.scala:121)
	at io.iohk.iodb.LSMStore.rollback(LSMStore.scala:875)
	at examples.hybrid.state.HBoxStoredState.$anonfun$rollbackTo$1(HBoxStoredState.scala:88)
	at scala.util.Try$.apply(Try.scala:209)
	at examples.hybrid.state.HBoxStoredState.rollbackTo(HBoxStoredState.scala:84)
	at scorex.core.NodeViewHolder.$anonfun$updateState$1(NodeViewHolder.scala:230)
	at scala.util.Try$.apply(Try.scala:209)
	at scorex.core.NodeViewHolder.updateState(NodeViewHolder.scala:226)
	at scorex.core.NodeViewHolder.updateState$(NodeViewHolder.scala:220)
	at examples.hybrid.HybridNodeViewHolder.updateState(HybridNodeViewHolder.scala:26)
	at examples.hybrid.HybridNodeViewHolder.pmodModify(HybridNodeViewHolder.scala:123)
	at examples.hybrid.HybridNodeViewHolder.pmodModify(HybridNodeViewHolder.scala:26)
	at scorex.core.NodeViewHolder$$anonfun$processLocallyGeneratedModifiers$1.applyOrElse(NodeViewHolder.scala:373)
	......

So at some point a node is unable to perform a rollback. A brief look at the issue led me to this function (source code link), which, as far as I can see, isn't fully implemented. It is probably the cause of the crashes, though I'm not fully sure.
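
To illustrate the failure mode in the stack trace, here is a minimal sketch of a versioned store whose rollback fails when it is asked for a version it never recorded. `ToyStore` and all of its methods are hypothetical names made up for this example; this is not the IODB `LSMStore` API, only a toy model of the "versionID not found" behaviour.

```scala
// Hypothetical toy model; NOT the real io.iohk.iodb.LSMStore.
object VersionedStoreSketch {
  final class ToyStore {
    // Snapshots in the order they were written: (version id, full key/value state).
    private var versions: Vector[(String, Map[String, String])] = Vector.empty

    // Record a new version on top of the latest snapshot.
    def update(versionId: String, changes: Map[String, String]): Unit = {
      val base = versions.lastOption.map(_._2).getOrElse(Map.empty[String, String])
      versions = versions :+ (versionId -> (base ++ changes))
    }

    def lastVersionId: Option[String] = versions.lastOption.map(_._1)

    // Drop every snapshot written after `versionId`.
    // Fails if that version was never written to this store.
    def rollback(versionId: String): Unit = {
      val idx = versions.indexWhere(_._1 == versionId)
      if (idx < 0)
        throw new NoSuchElementException("versionID not found, can not rollback")
      versions = versions.take(idx + 1)
    }
  }
}
```

In this model a rollback can only target a version that was previously written, so any caller that skips writes will eventually ask for a version the store does not know.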

Are you also having this problem? Are you going to fix it anytime soon?

@ceilican
Contributor

@dkaidalov, I recently ran a few nodes (see #152) and didn't experience this behaviour. Could you tell us in more detail what you did to trigger this error?

terjokhin self-assigned this Mar 6, 2018

@terjokhin
Contributor

This error is related to HBoxStoredState, not history. @dkaidalov, could you provide more details? I'm trying to reproduce it.

@dkaidalov
Contributor Author

Indeed, the error is raised in HBoxStoredState, but its cause lies in the improperly implemented HybridHistory::bestForkChanges.

The main problem is that HybridHistory::bestForkChanges returns a ProgressInfo structure whose toApply field contains only the head block instead of the whole applyBlocks array.
As a result, not all the necessary blocks are applied to HBoxStoredState (which uses the ProgressInfo structure to update its state), and at some point it can't do a rollback because the branching point is unknown to it.
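
A minimal sketch of this hypothesis, using simplified stand-ins rather than the real Scorex types: `Block`, `ProgressInfo`, `ToyState` and `applyAll` below are assumptions made up for illustration and only mimic the names used in this thread. The state accepts a block only if the block's parent equals the current state version, so handing it just the head of a multi-block fork fails, and a later rollback to a block it never applied fails as well.

```scala
import scala.util.Try

// Simplified stand-ins for illustration; NOT the real Scorex classes.
object ForkApplySketch {
  final case class Block(id: String, parentId: String)
  final case class ProgressInfo(branchPoint: Option[String], toApply: Seq[Block])

  // `history` lists the versions applied so far; the last one is the current version.
  final case class ToyState(history: Vector[String]) {
    def version: String = history.last

    // A block is valid only if it extends the current version.
    def applyBlock(b: Block): Try[ToyState] = Try {
      require(b.parentId == version,
        s"Incorrect state version: $version found, ${b.parentId} expected")
      ToyState(history :+ b.id)
    }

    // Rolling back is only possible to a version this state has applied.
    def rollbackTo(v: String): Try[ToyState] = Try {
      val idx = history.indexOf(v)
      if (idx < 0)
        throw new NoSuchElementException("versionID not found, can not rollback")
      ToyState(history.take(idx + 1))
    }
  }

  // Roll back to the branch point (if any), then apply the blocks in order.
  def applyAll(state: ToyState, pi: ProgressInfo): Try[ToyState] = {
    val start = pi.branchPoint match {
      case Some(bp) => state.rollbackTo(bp)
      case None     => Try(state)
    }
    pi.toApply.foldLeft(start)((acc, b) => acc.flatMap(_.applyBlock(b)))
  }
}
```

For example, with a state at `genesis -> a1` and a fork `genesis -> b1 -> b2`, `applyAll` succeeds when `toApply` is `Seq(b1, b2)` but fails the version check when `toApply` is only `Seq(b2)`. After that, `rollbackTo("b1")` fails because `b1` was never applied to the state.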

This is my understanding of the problem, but I could be mistaken about something.

I hit this error very often. Two or more nodes are needed, and it occurs all the time if block generation is fast. I use a 10s block interval.
Here is my config:

miner {
  offlineGeneration = true
  targetBlockDelay = 10s
  blockGenerationDelay = 100ms
  rParamX10 = 8
  initialDifficulty = 10
  posAttachmentSize = 100
}

..............................

PosForger:
val InitialDifficuly = 1500000000L

Note that I decreased the PoS initial difficulty to speed up block generation.

@dkaidalov
Contributor Author

@Daron666 Actually, I was able to reproduce this crash without any config changes on a clean master branch (I ran 3 nodes).
I also noticed that the exception mentioned above usually appears after the following error:

java.lang.IllegalArgumentException: requirement failed: Incorrect state version: 5wySxM4eYTLGFbboEeVYoKyPvu3u5C3fScuHoNAxm6Km found, (B78JEtziqmwttysEiab1KRNZ3oaSUYpUQvubfY5KTy1w || 87Phg3xSAwA58cRTZ1n4zTKmZwsjqkKwcijnmmspjsmA || List()) expected
	at scala.Predef$.require(Predef.scala:277)
	at examples.hybrid.state.HBoxStoredState.$anonfun$validate$1(HBoxStoredState.scala:56)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
	at scala.util.Try$.apply(Try.scala:209)
	at examples.hybrid.state.HBoxStoredState.validate(HBoxStoredState.scala:51)
	at examples.hybrid.state.HBoxStoredState.validate(HBoxStoredState.scala:22)
	at scorex.mid.state.BoxMinimalState.applyModifier(BoxMinimalState.scala:29)
	at scorex.mid.state.BoxMinimalState.applyModifier$(BoxMinimalState.scala:28)
	at examples.hybrid.state.HBoxStoredState.applyModifier(HBoxStoredState.scala:22)
	at scorex.core.NodeViewHolder.updateState(NodeViewHolder.scala:248)
	at scorex.core.NodeViewHolder.pmodModify(NodeViewHolder.scala:284)
	at scorex.core.NodeViewHolder.pmodModify$(NodeViewHolder.scala:271)
	at examples.hybrid.HybridNodeViewHolder.pmodModify(HybridNodeViewHolder.scala:22)
	at scorex.core.NodeViewHolder$$anonfun$processLocallyGeneratedModifiers$1.applyOrElse(NodeViewHolder.scala:380)
....

To make reproduction easier, you can:

  1. Set up all nodes with non-zero balances (by setting seed = "genesisoX") so they can issue PoSBlocks
  2. Decrease the block delay
  3. Decrease the initial PoS difficulty

@kushti
Contributor

kushti commented Mar 24, 2018

@dkaidalov I think I've fixed it. Please test.

@dkaidalov
Contributor Author

@kushti Subjectively it has become a bit more stable, but the same errors are still reproducible.
I noticed that HybridHistory::bestForkChanges still returns a ProgressInfo structure whose toApply field contains only the head block instead of the whole applyBlocks array. Is that done on purpose?
I ask because my testing confirms that the "Incorrect state version" exceptions start to appear right after applyBlocks.size > 1 has occurred, and that this eventually leads to the "versionID not found" exception.

dkaidalov added a commit to dkaidalov/Scorex that referenced this issue Mar 28, 2018