feat: use IPNI advertisements from the miner only #55

bajtos · 2024-03-07T15:42:34Z

Resolve minerId to miner's PeerId, which is the same value as Provider.ID in the IPNI records
Ignore IPNI records advertised by different miners
Report minerId and providerId as the new measurement fields
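For readers following along, the filtering step could look roughly like the sketch below. It assumes the IPNI find response has the shape `{ MultihashResults: [{ ProviderResults: [{ Provider: { ID, Addrs } }] }] }`. The function name `filterProvidersByPeerId`, the sample data and the peer ID strings are made up for illustration; this is not the code from this PR.

```javascript
// Keep only the IPNI provider results advertised by the expected miner.
// ASSUMPTION: `ipniResponse` follows the IPNI find response shape described above.
function filterProvidersByPeerId (ipniResponse, expectedPeerId) {
  const results = ipniResponse.MultihashResults ?? []
  return results
    .flatMap(r => r.ProviderResults ?? [])
    // Provider.ID in the IPNI record must match the miner's PeerId
    .filter(p => p.Provider?.ID === expectedPeerId)
}

// Made-up sample data: two providers advertise the same content
const sampleResponse = {
  MultihashResults: [{
    ProviderResults: [
      { Provider: { ID: 'PEER_ID_OF_OUR_MINER', Addrs: ['/ip4/127.0.0.1/tcp/3000/http'] } },
      { Provider: { ID: 'PEER_ID_OF_ANOTHER_MINER', Addrs: ['/ip4/127.0.0.1/tcp/4000/http'] } }
    ]
  }]
}

const ours = filterProvidersByPeerId(sampleResponse, 'PEER_ID_OF_OUR_MINER')
console.log(ours.map(p => p.Provider.ID))
```

Records advertised by any other peer are dropped, so a retrieval is only attempted against the miner named in the task.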

Links:

Link retrievals to miners space-meridian/roadmap#65

1. Resolve `minerId` to miner's PeerId, which is the same value as `Provider.ID` in the IPNI records
2. Ignore IPNI records advertised by different miners
3. Report `minerId` and `providerId` as the new measurement fields

Signed-off-by: Miroslav Bajtoš <[email protected]>


juliangruber · 2024-03-07T16:53:11Z

test/integration.js

@@ -16,14 +19,20 @@ test('integration', async () => {

 test('retrieval check for our CID', async () => {
  const spark = new Spark()
-  spark.getRetrieval = async () => ({ cid: KNOWN_CID })
+  spark.getRetrieval = async () => ({ cid: KNOWN_CID, minerId: OUR_FAKE_MINER_ID })
+  spark.lookupMinerPeerId = async (minerId) => {


Wouldn't it be safer to push the function calls into an array and then assert that this function was called at all? I believe this is the advice you give me for all these mocks.

In this particular test, if my stubbed function isn't called, the retrieval check fails because Filecoin's RPC API call will reject the fake miner id.

The purpose of this test is not to verify how Spark works under the hood (and whether it calls all other functions we think it should call) but whether it can find the provider and retrieve the data for the CID served by our Frisbii instance.

However, if you think it's worth asserting that both functions were called, then I am fine with adding that. Let me know!

This was also my understanding when I wrote tests like this, but through your reviews I understood that it's clearer and safer to assert the function calls. If in the future a condition is introduced which skips lookupMinerPeerId(), then this test will still pass.

However, in my previous review I missed that this was an integration test and not a unit test. Please choose the testing method that you think fits best.

push the function calls into an array, and then assert this function was called

added in 0a48986
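For reference, the "record calls, then assert" pattern discussed in this thread can be sketched as follows. This is only an illustration: `recordCalls`, `fakeSpark` and `runExample` are hypothetical names, and the real `Spark` class and the change in 0a48986 may look different.

```javascript
// Hypothetical helper: wraps a stub so that the arguments of every call are recorded.
function recordCalls (impl) {
  const calls = []
  const stub = async (...args) => {
    calls.push(args)
    return impl(...args)
  }
  return { stub, calls }
}

// Stand-in for the object under test (NOT the real Spark class).
const fakeSpark = {
  lookupMinerPeerId: async () => { throw new Error('not stubbed') },
  async check ({ minerId }) {
    return { providerId: await this.lookupMinerPeerId(minerId) }
  }
}

async function runExample () {
  const { stub, calls } = recordCalls(async () => 'FAKE_PEER_ID')
  fakeSpark.lookupMinerPeerId = stub
  const result = await fakeSpark.check({ minerId: 'f0fake' })
  // The test can now assert both the outcome and that the stub was called
  return { result, calls }
}
```

If a future condition skips the lookup, `calls` stays empty and an assertion on it fails, even when the rest of the check still succeeds.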


juliangruber · 2024-03-07T17:00:19Z

lib/spark.js

+      console.error(err)
+      // There are three common error cases:
+      //  1. We are offline
+      //  2. The JSON RPC provider is down
+      //  3. JSON RPC errors like when Miner ID is not a known actor
+      // There isn't much we can do in the first two cases. We can notify the user that we are not
+      // performing any jobs and wait until the problem is resolved.
+      // The third case should not happen unless we made a mistake, so we want to learn about it
+      if (err.name === 'FilecoinRpcError') {
+        // TODO: report the error to Sentry
+        console.error('The error printed above was not expected, please report it on GitHub:')
+        console.error('https://github.com/filecoin-station/spark/issues/new')
+      } else {
+        this.#activity.onError()
+      }
+      err.reported = true
+      // Abort the check, no measurement should be recorded
+      throw err


I don't understand the need for this custom property .reported here.

If we remove the console.error(err) above, then we can let the outer catch handler log it. Also, the outer catch handler can call #activity.onError() itself.

I think the only custom error handling we need here is when err.name === 'FilecoinRpcError', and in that case we can simply add a new log event. I'm proposing to refactor this catch handler like this:

Suggested change

-      console.error(err)
-      // There are three common error cases:
-      //  1. We are offline
-      //  2. The JSON RPC provider is down
-      //  3. JSON RPC errors like when Miner ID is not a known actor
-      // There isn't much we can do in the first two cases. We can notify the user that we are not
-      // performing any jobs and wait until the problem is resolved.
-      // The third case should not happen unless we made a mistake, so we want to learn about it
-      if (err.name === 'FilecoinRpcError') {
-        // TODO: report the error to Sentry
-        console.error('The error printed above was not expected, please report it on GitHub:')
-        console.error('https://github.com/filecoin-station/spark/issues/new')
-      } else {
-        this.#activity.onError()
-      }
-      err.reported = true
-      // Abort the check, no measurement should be recorded
-      throw err
+      // There are three common error cases:
+      //  1. We are offline
+      //  2. The JSON RPC provider is down
+      //  3. JSON RPC errors like when Miner ID is not a known actor
+      // There isn't much we can do in the first two cases. We can notify the user that we are not
+      // performing any jobs and wait until the problem is resolved.
+      // The third case should not happen unless we made a mistake, so we want to learn about it
+      if (err.name === 'FilecoinRpcError') {
+        // TODO: report the error to Sentry
+        console.error('Unexpected error, please report it on GitHub:')
+        console.error('https://github.com/filecoin-station/spark/issues/new')
+      }
+      // Abort the check, no measurement should be recorded
+      throw err

Let's discuss.

In my proposed implementation:

FilecoinRpcError does not trigger this.#activity.onError(), therefore Spark stays in "online" mode

We print the call to report the issue after the error stack.

In your proposal:

FilecoinRpcError triggers this.#activity.onError() and puts Spark into offline mode.

The call to report the issue is printed before the error to report. (I guess that's fine, we just need to tweak the message.)

Here is the important question we need to answer: How should we handle the case when we cannot get a miner's Peer ID because Filecoin RPC returns an error?

This can be either because we are calling the RPC API incorrectly or because the miner ID is not a valid one. (I suppose we could also receive this error if the RPC API server has an internal problem; it can process the JSON-RPC part, but the called RPC method fails.)

Do we want to indicate offline mode? The next retrieval check will likely be able to fetch the round details from spark-api, which means Spark goes back to online mode, only to go offline again when the JSON RPC call fails.
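To make the three error cases concrete, here is a rough sketch of how a miner lookup over JSON-RPC could tell them apart. It assumes the Lotus method `Filecoin.StateMinerInfo` with params `[minerId, null]` and a `PeerId` field in the result. The function name `getMinerPeerIdSketch` and the `rpcUrl` and `fetchFn` options are hypothetical, and this is not the code in lib/miner-lookup.js.

```javascript
// Hypothetical sketch. `rpcUrl` and `fetchFn` are injected so no particular RPC provider is assumed.
async function getMinerPeerIdSketch (minerId, { rpcUrl, fetchFn = fetch }) {
  // Case 1 (we are offline): fetchFn itself rejects, the error propagates unchanged
  const res = await fetchFn(rpcUrl, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({
      jsonrpc: '2.0',
      id: 1,
      method: 'Filecoin.StateMinerInfo', // ASSUMPTION: Lotus JSON-RPC method name
      params: [minerId, null] // ASSUMPTION: null tipset key means "current head"
    })
  })
  // Case 2 (the JSON RPC provider is down): non-2xx HTTP response
  if (!res.ok) throw new Error(`JSON RPC request failed with HTTP ${res.status}`)

  const body = await res.json()
  // Case 3 (JSON RPC error, e.g. unknown actor): the server answered with an error object
  if (body.error) {
    const err = new Error(body.error.message)
    err.name = 'FilecoinRpcError'
    err.code = body.error.code
    throw err
  }
  return body.result.PeerId
}
```

Only the third branch produces an error named `FilecoinRpcError`, which is what the catch handler in the diff above checks for.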

Ah understood! I thought the purpose of err.reported was to prevent double log lines, now I see it's also to prevent reporting it to the activity state handler.

Could we be more explicit about this?

Rename to error.setActivityToOffline = false

Let the outer error handler handle logging, to be more consistent with other errors

See db3b2ae
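As an illustration of the explicit-flag idea from this thread, the split between the inner and outer handler might look like the sketch below. The flag name `setActivityToOffline` comes from the suggestion above; everything else (`ActivityStub`, `getPeerIdOrThrow`, `runCheck`) is hypothetical, and the actual change in db3b2ae may differ.

```javascript
// Hypothetical stand-in for the activity state handler
class ActivityStub {
  constructor () { this.events = [] }
  onError () { this.events.push('offline') }
}

// Inner handler: only classifies the error, no logging here
async function getPeerIdOrThrow (lookup, minerId) {
  try {
    return await lookup(minerId)
  } catch (err) {
    // RPC-level errors suggest a bug or an invalid miner ID, not a connectivity problem
    if (err.name === 'FilecoinRpcError') err.setActivityToOffline = false
    // Abort the check, no measurement should be recorded
    throw err
  }
}

// Outer handler: logs every error and decides whether to report offline mode
async function runCheck (lookup, minerId, activity, log = console.error) {
  try {
    await getPeerIdOrThrow(lookup, minerId)
    return true
  } catch (err) {
    log(err)
    if (err.setActivityToOffline !== false) activity.onError()
    return false
  }
}
```

With this split, logging stays in one place and the flag only controls whether the activity state changes.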



bajtos requested a review from juliangruber March 7, 2024 15:42


fixup! improve error handling

a4a26fc

Signed-off-by: Miroslav Bajtoš <[email protected]>

bajtos mentioned this pull request Mar 7, 2024

fix: check Graphsync retrievals too #56

Merged

Merge branch 'main' into feat-miner-lookup

58145d9

juliangruber requested changes Mar 7, 2024


bajtos added 3 commits March 8, 2024 09:23

Merge branch 'main' into feat-miner-lookup

274909e

fixup! get rid of "lookup" word

0498482

Signed-off-by: Miroslav Bajtoš <[email protected]>

fixup! make getMinerPeerId private

7378e8d

Signed-off-by: Miroslav Bajtoš <[email protected]>

juliangruber requested changes Mar 11, 2024


bajtos added 2 commits March 19, 2024 14:00

fixup! verify miner ids queried

0a48986

Signed-off-by: Miroslav Bajtoš <[email protected]>

fixup! remove err.handled

db3b2ae

Signed-off-by: Miroslav Bajtoš <[email protected]>

bajtos requested a review from juliangruber March 19, 2024 13:18

juliangruber approved these changes Mar 19, 2024


bajtos merged commit 66202a3 into main Mar 19, 2024
1 check passed

bajtos deleted the feat-miner-lookup branch March 19, 2024 13:21
