-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update protocol to match automerge-repo JS 1.0 #67
base: main
Are you sure you want to change the base?
Conversation
Context: the automerge-repo JS implementation supports a request workflow for syncing with a document we don't have. In this workflow the requesting peer sends a "request" message which is identical to the current sync message except tagged with a different message type. Responding peers can then either respond with a sync message or with an "unavailable" message, which allows the receiver to tell the difference between a document the other end doesn't have and a document the other end has but which is empty. Problem: in the repo loop we have no way of telling the difference between a request for a new document and announcing a document we have. This is because both situations are expressed using the RepoEvent::NewDoc event. Solution: split `NewDoc` into a `RequestDoc` and `NewDoc` event. This allows us to send request messages for `RequestDoc` and sync messages for `NewDoc`.
77d267a
to
782b2c3
Compare
782b2c3
to
c012dfb
Compare
Context: when a peer requests a document from us we often don't want to just check our own storage, we also want to ask all other peers we are connected to if they have the document. This is the case for e.g. a sync server which is relaying documents between two peers. Problem: the current immplementation just checks local storage Solution: when we receive a request, first send a request out to all the peers we are connected to and once they have all responded only then send a response to the requestor.
c012dfb
to
6aa2fd6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM although I didn't have enough time yet to review in all details...
Good to see many existing TODO's taken care off...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have finally wrapped my head around the logic.
At a very high-level I'm wondering: why separate the new request concept from the existing peer connection and document info? I find the back and forth between these hard to read, and I'm hoping the logic can be consolidated by updating the existing concepts, as opposed to introduce a new and separate one(which is logically not separate because of the back and forth).
Couple of things:
- The equivalent of
Self::fail_request
could be just called in the main loop on all requests that are complete. The benefit is that that logic is in one place, as opposed to being called at different points. - I'm wondering if
Request
could not be replaced by a combination of aawaiting_response_from: HashSet<RepoId>
onDocumentInfo
, andawaiting_our_response
could be inferred from the existence ofPeerConnection::PendingAuth
(perhaps with additional data to differentiate between a peer who requested the document, like a boolean) when the docstate transitions out of bootstrap state?. Request::initiate_local
seems to me the same as having a document in the bootstrap state without any peers awaiting our response?
In all cases, the goal is to not have to read all the code to understand the logic, instead just looking at the enums and structs should be enough. Currently this is not possible because request is separate from the document info, and the "state" of the algorithm appears shared between the state of the request and the state of the document info.
It would also be beneficial to layer the logic, as opposed to have what seems like special conditions such as in the handling of NetworkEvent::Request
, which skips the share auth phase if the document is not syncing.
Layering would look like:
- Always first get the share decisions
- If sharing is accepted, either start syncing if
DocState::Sync
, otherwise request from other peers. - As sync messages, or
Unavailable
events are handled, update the document info and or peer connection and send outgoing messages.
In other words, always follow the same steps, but with different data.
request::Request::new(document_id.clone()) | ||
}); | ||
|
||
if info.state.is_bootstrapping() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps move this to the DocState::Bootstrap
match arm above?
"responding to request with sync as we have the doc" | ||
); | ||
// if we have this document then just start syncing | ||
Self::enqueue_share_decisions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we do this first in all cases?
This is a fairly chunky change, the primary reason for which is that the automerge repo JS protocol introduces a request workflow for requesting documents which you don't have, in contrast to the symmetric sync protocol we have been using so far. This requires the addition of two new message types: a "request" and "unavailable" type. The requesting peer sends a request and then the responding peer responds with either a "sync" or "unavailable" message. Some further complexity is introduced by the fact that each peer is expected to only return unavailable once it has queried both it's storage and all the other peers it knows about.
To implement this then there are the following major changes:
request_document
to return anOption
. This is the primary advantage of the new workflow, that you no longer need to use a timeout in the common case when noone has the document, instead everyone will returnUnavailable
and we can returnOk(None)
fromrequest_document
RepoEvent::NewDoc
intoRepoEvent::NewDoc
andRepoEvent:RequestDoc
, so that we can distinguish between the two cases and decide whether to send request messages to our peers or notRepo::new_document
return a future, this allows us to move theDocumentInfo
creation into the repo loopRequest
state machine which is tracked by theRepo
to figure out when to send requests to other peers on behalf of incoming requests and when a request is complete.This brings us up to interop with the current JS implementation, with the exception of ephemeral messages, which I have another patch for which I will submit once this is merged.
This will be a breaking change to the network protocol and would thus require a network wide upgrade. It is possible that we could write a compatibility layer which we could maintain for a while. @issackelly is that something you would need or is a big bang upgrade okay for you?