Proposal for Message Pickup v4 #110

Open · wants to merge 13 commits into base: main

373 changes: 373 additions & 0 deletions site/content/protocols/messagepickup/4.0/readme.md
---
title: Message Pickup
publisher: JamesKEbert
license: MIT
piuri: https://didcomm.org/message-pickup/4.0
status: Production

**Contributor:**

I feel we might need a somewhat more sophisticated process to advance didcomm protocols.

Because protocols are immediately versioned 1.0 (or another major version), every slight change that needs to happen based on implementation experience becomes a new major version.

It would make sense IMO for protocols to stay in "draft" mode for a few months (e.g. 0.1 or 1.0-draft) where breaking changes can be made.

This way we don't end up with a lot of pickup protocol versions, each slightly different from the previous one.

We need some playtime with a protocol before it's "locked".

**Collaborator (author):**

I do tend to agree. I'd opt for introducing RFCs at a 0.1 version, as it makes a lot of sense to update from 0.1 to 1.0 once the RFC is finalized. Theoretically, the RFC flow found in the Aries RFCs could also handle this via the status changes (PROPOSED -> DEMONSTRATED -> ACCEPTED -> ADOPTED -> RETIRED), wherein breaking changes could be made prior to moving to ACCEPTED. I don't think that's worked in practice, though.

On a related note, I find it troubling that we made new major versions of protocols that only supported DIDComm v2, as it essentially made any adjustments/fixes requiring breaking changes impossible for protocols on DIDComm v1 (it's really odd to make a v2 for DIDComm v1, a v3 for DIDComm v2, and then a v4 for DIDComm v1). I think it might be wise to have protocols offer dual support (as proposed in this RFC or as seen in https://didcomm.org/media-sharing/1.0/), or to have a mechanism in the protocol URI to indicate which DIDComm version is in use.

In this instance, though, Pickup v2 was introduced in early 2022 and was implemented in AFJ/Credo in mid-2022, with similar timing for support in ACA-Py. So, given that it's been around (and used in production deployments) for that long, a new version seems appropriate to me in this instance. And Pickup v3 is basically just a reskin of Pickup v2 for DIDComm v2.

**Collaborator (author):**

We actually discussed this a good amount on one of the recent DIDComm UG calls--we agreed that starting a protocol at 0.1 would be wise, but there were fewer immediate solutions available for subsequent versions (like in this case with v3->v4). One option discussed was adopting semver more closely, as it allows for something along the lines of 4.0.0-draft or 4.0.0-draft.2, etc. This would be ideal, but it is a very major breaking change for protocol handler implementations.

summary: A protocol to facilitate an agent picking up messages held at a mediator.
tags: []
authors:
- name: Sam Curren
email: [email protected]
- name: James Ebert
email: [email protected]

---

## Summary
A protocol to facilitate a _Recipient_ agent picking up messages held at a _Mediator_. This protocol is likely to be used in tandem with the [coordinate-mediation protocol](https://didcomm.org/coordinate-mediation/3.0/).

## Motivation
This protocol is needed to facilitate retrieval of messages from a mediator in an explicit manner. Additionally, this protocol provides behavior for initiating live delivery of messages, which is crucial for good user experience for agents operating on mobile devices.

Motivation for v4 of this protocol stems from ambiguity in the [pickup v2 protocol](https://github.com/hyperledger/aries-rfcs/tree/main/features/0685-pickup-v2) and [messagepickup v3 protocol](https://didcomm.org/messagepickup/3.0/) as to whether `delivery` and `messages-received` messages must be used while using live mode.

## Roles
There are two roles in this protocol:

- `mediator`: The agent that has messages waiting for pickup by the `recipient`.

**Contributor:**

I guess there is a specific reason for it, but I don't see why the 'message holder' role has been renamed to 'mediator' in v2. Is the Message Pickup protocol now only supported for a mediator/mediatee relationship? If that's the case, are only the forwarded messages supposed to be queued and packed in this protocol, or does this also apply to any message sent from mediator to mediatee (e.g. a keylist update response)?

Some time ago I had a use case where mobile agents without a mediator would get credentials from an issuer: when connecting over regular HTTP, we used Message Pickup in polling mode to check for credential issuance and other messages until the flow was finished. Do you think this is not an intended usage for this protocol?

**Collaborator (author):**

> Some time ago I had a use case where mobile agents without a mediator would get credentials from an issuer: when connecting over regular HTTP, we used Message Pickup in polling mode to check for credential issuance and other messages until the flow was finished. Do you think this is not an intended usage for this protocol?

My current answer is that I would consider that to be outside the scope/intentions of this protocol. Given the already fairly involved nature of this protocol, my initial response is that I'd rather have a separate protocol dedicated to that particular use case so it can be more specific in its definitions and clarifications (especially given the troublesome, hard-to-troubleshoot issues observed with mobile message pickup over the years).

Also, some context according to my memory -- Pickup v2 was a completely fresh draft of the protocol that did not draw heavily from v1 (it was based more on the implicit mechanisms being used with ACA-Py), but we thought it would be ideal for it to replace the initial version.

- `recipient`: The agent who is picking up messages from the `mediator`.

## Requirements

### Return Route
The `return_route` extension must be supported by both agents (`recipient` and `mediator`).
The common use of this protocol is for the replies from the `mediator` to be synchronous, utilizing the same connection channel for the reply. In order to have this synchronous behavior, the `recipient` should set the `return_route` header to `all`.
This header must be set each time the communication channel is established: once per established WebSocket connection, and on every message for HTTP POST.

### DIDComm V1 Requirements
When using this protocol with DIDComm V1, `recipient_did` **MUST** be a [`did:key` reference](https://github.com/hyperledger/aries-rfcs/tree/main/features/0360-use-did-key).
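
For illustration, such a `did:key` reference takes the following shape; the key material shown here is a made-up placeholder, not a real key:

```
did:key:z6MkfyTREjTxQ8hUwSwBPeDHf3uPL3qCjSSuNPwsyMpWUGH7#z6MkfyTREjTxQ8hUwSwBPeDHf3uPL3qCjSSuNPwsyMpWUGH7
```

The fragment after `#` repeats the multibase-encoded key, identifying the specific key within the DID document generated from the `did:key`.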

## Basic Walkthrough

This protocol consists of four different message requests from the `recipient` that should be replied to by the `mediator`:

1. Status Request -> Status
2. Delivery Request -> Message Delivery
3. Message Received -> Status
4. Live Mode -> Status or Problem Report

## States

This protocol follows the request-response message exchange pattern and requires only the simple states of waiting for a response or needing to produce one.

Additionally, the `return_route` header extension must be set to `all` in all requests submitted by the `recipient`.

## Message Overview

The `status-request` message is sent by the `recipient` to the `mediator` to query how many messages are pending.

The `status` message is the response to `status-request` to communicate the state of the message queue.

The `delivery-request` message is sent by the `recipient` to request delivery of pending messages.

The `delivery` message is the response to the `delivery-request` to send queued messages back to the `recipient`.

The `messages-received` message is sent by the `recipient` to confirm receipt of delivered messages, prompting the `mediator` to clear messages from the queue.

The `live-delivery-change` message is used to set the state of `live_delivery`.

When _Live Mode_ is enabled, messages that arrive while a connection is active are delivered over that connection immediately via a `delivery` message, rather than being pushed to the queue. See _Live Mode_ below for more details.
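
As an illustration of the basic (non-live) flow, here is a minimal recipient-side polling loop in TypeScript. It is a sketch only: `request` (send a message and await the mediator's synchronous reply) and `processEncryptedMessage` are hypothetical helpers, and message `id`s and packing are elided.

```typescript
// Sketch of the basic pickup loop: status-request -> delivery-request ->
// messages-received, repeated until the queue is drained.
interface Attachment { id: string; data: { base64: string } }

async function pickupAllMessages(
  request: (msg: object) => Promise<any>,
  processEncryptedMessage: (base64: string) => Promise<void>,
): Promise<void> {
  // 1. Ask the mediator how many messages are waiting.
  const status = await request({
    type: "https://didcomm.org/message-pickup/4.0/status-request",
    body: {},
    return_route: "all",
  });
  let remaining: number = status.body.message_count;

  while (remaining > 0) {
    // 2. Request delivery of up to 10 queued messages.
    const delivery = await request({
      type: "https://didcomm.org/message-pickup/4.0/delivery-request",
      body: { limit: 10 },
      return_route: "all",
    });
    const attachments: Attachment[] = delivery.attachments ?? [];
    if (attachments.length === 0) break; // mediator replied with a status instead

    // 3. Hand each (still encrypted) attached message to the inbound pipeline.
    for (const att of attachments) await processEncryptedMessage(att.data.base64);

    // 4. Acknowledge receipt so the mediator may safely clear its queue.
    await request({
      type: "https://didcomm.org/message-pickup/4.0/messages-received",
      body: { message_id_list: attachments.map((a) => a.id) },
      return_route: "all",
    });
    remaining -= attachments.length;
  }
}
```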

## Security

This protocol expects messages to be encrypted during transmission, and repudiable.

## Message Reference

### Status Request
Sent by the `recipient` to the `mediator` to request a `status` message.

Message Type URI: `https://didcomm.org/message-pickup/4.0/status-request`

DIDComm v1 example:
```json
{
    "@id": "123456780",
    "@type": "https://didcomm.org/message-pickup/4.0/status-request",
    "recipient_did": "<did:key for messages>",
    "~transport": {
        "return_route": "all"
    }
}
```

DIDComm v2 example:
```json
{
    "id": "123456780",
    "type": "https://didcomm.org/message-pickup/4.0/status-request",
    "body": {
        "recipient_did": "<did for messages>"
    },
    "return_route": "all"
}
```
`recipient_did` is optional. When specified, the `mediator` **MUST** only return status related to that recipient did. This allows the `recipient` to discover if any messages are in the queue that were sent to a specific DID. If using DIDComm v1, `recipient_did` **MUST** be a [`did:key` reference](https://github.com/hyperledger/aries-rfcs/tree/main/features/0360-use-did-key).

### Status
Status details about waiting messages.

Message Type URI: `https://didcomm.org/message-pickup/4.0/status`

DIDComm v1 example:
```json
{
    "@id": "123456780",
    "@type": "https://didcomm.org/message-pickup/4.0/status",
    "~thread": {
        "thid": "<message id of status-request message>"
    },
    "recipient_did": "<did:key for messages>",
    "message_count": 7,
    "longest_waited_seconds": 3600,
    "newest_received_time": 1658085169,
    "oldest_received_time": 1658084293,
    "total_bytes": 8096,
    "live_delivery": false
}
```

DIDComm v2 example:
```json
{
    "id": "123456780",
    "thid": "<message id of status-request message>",
    "type": "https://didcomm.org/message-pickup/4.0/status",
    "body": {
        "recipient_did": "<did for messages>",
        "message_count": 7,
        "longest_waited_seconds": 3600,
        "newest_received_time": 1658085169,
        "oldest_received_time": 1658084293,
        "total_bytes": 8096,
        "live_delivery": false
    }
}
```

`message_count` is the only **REQUIRED** attribute. The others **MAY** be present if offered by the `mediator`.

`longest_waited_seconds` is in seconds, and is the longest delay of any message in the queue.

`newest_received_time` and `oldest_received_time` are expressed in UTC Epoch Seconds (seconds since 1970-01-01T00:00:00Z) as an integer.
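
As a quick sanity check of the representation, a hypothetical snippet using the example values above:

```typescript
// 1658085169 UTC Epoch Seconds -> ISO 8601, and back
const iso = new Date(1658085169 * 1000).toISOString(); // "2022-07-17T19:12:49.000Z"
const epochSeconds = Math.floor(Date.parse(iso) / 1000); // 1658085169
```
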
**Contributor:**

In DIDComm V1 we have a Best Practice that indicates that fields whose suffix is _time are strings in ISO 8601 format. Since this protocol has examples in both V1 and V2, I want to point this out because it can trigger inconsistencies with other protocols in the ecosystem.

This is probably not really relevant for this protocol, because I think a resolution in seconds is enough (although it wouldn't hurt to change the suffix to _t or _timestamp). But to me it is a big problem of DIDComm V2 compared to V1, since in some cases it is very important to distinguish the order of messages sent within the same second, and ISO 8601 provides the means to do so. If anybody knows the reasons behind the decision to switch to integers (other than saving some bytes), please let me know!

**Collaborator (author):**

We discussed this in the DIDComm UG on 12-9-24 -- we determined that we want to discuss with additional folks in the community to get their takes on a general best practice for DIDComm, and also to bring it up for discussion in the WG spec meeting.
Specifically for this protocol, we have determined we want additional specificity; the debate is whether to use ISO 8601 or UTC Epoch Milliseconds. I now lean towards the epoch approach, since it's represented as an integer instead of a string.


`total_bytes` represents the total size of all messages.

If a `recipient_did` was specified in the `status-request` message, the matching value **MUST** be specified in the `recipient_did` attribute of the status message.

`live_delivery` state is also indicated in the status message.

**Note**: due to the potential for confusion about the actual state of the message queue, a `status` message **MUST NOT** be placed on the pending message queue and **MUST** only be sent when the `recipient` is actively connected (HTTP request awaiting response, open WebSocket, etc.).


### Delivery Request
A request from the `recipient` to the `mediator` to have pending messages delivered.

Message Type URI: `https://didcomm.org/message-pickup/4.0/delivery-request`

DIDComm v1 example:
```json
{
    "@id": "123456780",
    "@type": "https://didcomm.org/message-pickup/4.0/delivery-request",
    "limit": 10,
    "recipient_did": "<did:key for messages>",
    "~transport": {
        "return_route": "all"
    }
}
```

DIDComm v2 example:
```json
{
    "id": "123456780",
    "type": "https://didcomm.org/message-pickup/4.0/delivery-request",
    "body": {
        "limit": 10,
        "recipient_did": "<did for messages>"
    },
    "return_route": "all"
}
```

`limit` is a **REQUIRED** attribute, and after receipt of this message, the `mediator` **SHOULD** deliver up to the limit indicated.

**Contributor:**

Since there is some message exchange overhead on every loop (delivery-request -> delivery -> messages-received -> status), I think it is important to keep the number of loops as minimal as possible. Is a limit by message count actually the best approach, or should we also consider the transport capabilities of the recipient (i.e. maximum single message size)?

A recipient would likely want to receive all messages at once, as long as they fit into a single message. So I'm wondering if we can combine this specified limit with a max_receive_bytes constraint, in such a way that the mediator considers both when constructing each delivery message.

I think this could work nicely unless a certain queued message exceeds the recipient's capabilities: in that case, the message would be stuck in the queue indefinitely. Should the mediator reject any inbound message for a recipient if it exceeds its reported max_receive_bytes value?

**Collaborator (author):**

> A recipient would likely want to receive all messages at once, as long as they fit into a single message. So I'm wondering if we can combine this specified limit with a max_receive_bytes constraint, in such a way that the mediator considers both when constructing each delivery message.

My only concern here is the additional processing power/resources required on the mediator side to determine the number of attachable messages. Is max_receive_bytes measured against the raw message size, or the size after packing for delivery (inside a delivery wrapper, and then inside a forward wrapper)?

> I think this could work nicely unless a certain queued message exceeds the recipient's capabilities: in that case, the message would be stuck in the queue indefinitely. Should the mediator reject any inbound message for a recipient if it exceeds its reported max_receive_bytes value?

Perhaps that should actually be covered in the Mediation Coordination protocol, since it sounds like something that should be set up once and then forgotten? But I do agree that having some mechanism for indicating what is too large a message is probably ideal.

**Contributor:**

Well, yes, certainly that will need more processing on the mediator side. But I would expect the mediator to be more powerful than most clients, which may be running on resource-constrained devices.

I do, however, understand your point, since working with byte counts will require the mediator to query its database in a way that sums up the size of each message, and that of course will be a lot less efficient than simply doing a "select * from messages where recipient_did order by date limit 10".

I agree that all of this can be coordinated in another protocol (or simply queried with Discover Features), and that the mediator could take this constraint into account before actually packing and sending the messages, even if the requested limit is more than the total message count it is going to pack into the delivery message. This is the solution we are following right now in our mediator, and it is working quite well.

Unfortunately, this solution is not 100% suitable for this protocol as it is written right now, since it will require the client to always run an extra status-request loop 'just in case' (see my other comment).

**Collaborator (author):**

I think you may be right @genaris. @TheTechmage do you know if this would be a lot of work to add to existing mediator implementations? I lean towards having it be an optional field (vs. required) and then having the number of messages be determined by whichever is lower (`limit` or `max_receive_bytes`).


`recipient_did` is optional. When specified, the `mediator` **MUST** only return messages sent to that recipient did.

If no messages are available to be sent, a `status` message **MUST** be sent immediately.

Delivered messages **MUST NOT** be deleted until delivery is acknowledged by a `messages-received` message.
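
The rules above can be summarized in a mediator-side sketch. This is illustrative only: `queue.peek` is a hypothetical store call that reads, but does not delete, queued messages.

```typescript
// Mediator-side handling of a delivery-request (sketch).
interface QueuedMessage { id: string; payloadBase64: string }

async function handleDeliveryRequest(
  queue: { peek(limit: number, recipientDid?: string): Promise<QueuedMessage[]> },
  request: { id: string; body: { limit: number; recipient_did?: string } },
): Promise<object> {
  const messages = await queue.peek(request.body.limit, request.body.recipient_did);

  const body: Record<string, unknown> = {};
  if (request.body.recipient_did) body.recipient_did = request.body.recipient_did;

  if (messages.length === 0) {
    // Nothing queued: a status message MUST be sent immediately instead.
    body.message_count = 0;
    return {
      id: crypto.randomUUID(),
      thid: request.id,
      type: "https://didcomm.org/message-pickup/4.0/status",
      body,
    };
  }

  // Attach up to `limit` messages; they remain queued until acknowledged
  // by a messages-received message.
  return {
    id: crypto.randomUUID(),
    thid: request.id,
    type: "https://didcomm.org/message-pickup/4.0/delivery",
    body,
    attachments: messages.map((m) => ({ id: m.id, data: { base64: m.payloadBase64 } })),
  };
}
```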

### Message Delivery
Batch of messages delivered to the `recipient` as attachments.

Message Type URI: `https://didcomm.org/message-pickup/4.0/delivery`

DIDComm v1 example:
```json
{
    "@id": "123456780",
    "@type": "https://didcomm.org/message-pickup/4.0/delivery",
    "~thread": {
        "thid": "<message id of delivery-request message>"
    },
    "recipient_did": "<did:key for messages>",
    "~attach": [{
        "@id": "<id of message>",
        "data": {
            "base64": "<message>"
        }
    }]
}
```

DIDComm v2 example:
```json
{
    "id": "123456780",
    "thid": "<message id of delivery-request message>",
    "type": "https://didcomm.org/message-pickup/4.0/delivery",
    "body": {
        "recipient_did": "<did for messages>"
    },
    "attachments": [{
        "id": "<id of message>",
        "data": {
            "base64": "<message>"
        }
    }]
}
```

Messages delivered from the queue must be delivered in a batch `delivery` message as attachments, with the batch size capped by the `limit` provided in the `delivery-request` message.
The `id` of each attachment is used to confirm receipt.
The `id` is an opaque value, and the `recipient` should not deduce any information from it, except that it is unique to the `mediator`. The `recipient` uses these `id`s in the `message_id_list` field of the `messages-received` message.

The ONLY valid type of attachment for this message is a DIDComm v1 or v2 Message in encrypted form.

`thid` is an optional field, included when the `delivery` message is sent in response to a singular `delivery-request` message.

The `recipient_did` attribute is only included when responding to a `delivery-request` message that indicates a `recipient_did`.

### Messages Received
After receiving messages, the `recipient` **MUST** send an acknowledgement message indicating which messages are safe to clear from the queue.

Message Type URI: `https://didcomm.org/message-pickup/4.0/messages-received`

DIDComm v1 example:
```json
{
    "@id": "123456780",
    "@type": "https://didcomm.org/message-pickup/4.0/messages-received",
    "message_id_list": ["123", "456"]
}
```

DIDComm v2 example:
```json
{
    "id": "123456780",
    "type": "https://didcomm.org/message-pickup/4.0/messages-received",
    "body": {
        "message_id_list": ["123", "456"]
    }
}
```

`message_id_list` is a list of the `id`s of the messages received. The `id` of each message is taken from the attachment descriptor of each attached message in a `delivery` message.

Upon receipt of this message, the `mediator` knows which messages have been received, and can remove them from the collection of queued messages with confidence.
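
A corresponding mediator-side sketch of handling this acknowledgement; the store API is hypothetical:

```typescript
// Remove exactly the acknowledged messages from the queue; any ids not
// listed remain queued for a later delivery-request.
async function handleMessagesReceived(
  queue: { remove(ids: string[]): Promise<void> },
  message: { body: { message_id_list: string[] } },
): Promise<void> {
  await queue.remove(message.body.message_id_list);
  // No reply is sent here; see the discussion below about whether an
  // updated status message should follow.
}
```
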
**Contributor:**

I noticed that this section does not say anything about an updated status message sent in response to this message. This was explicitly mentioned in previous versions, so I'm wondering if this update intends to reflect that no further response is expected, or if it's just to avoid being redundant (the Basic Walkthrough already states it is a response).

**Collaborator (author):**

Good catch -- that should definitely go in the changelog

I believe this is motivated by the same complexity/ambiguity that led us to remove the status message as a follow-up to a delivery-request when no messages are queued for delivery: it is unclear whether it should be filtered by recipient_did or not (and actually it can't be).
For example, in the following flow: status-request (restricted by recipient_did) -> status -> delivery -> messages-received -> status (NOT restricted by recipient_did, as there's no way to know that the messages-received belonged to a filtered flow).
This may give the incorrect view that a given recipient_did has additional messages awaiting pickup, when in reality they are for a different recipient_did.

If I'm missing something here, please say so

**Contributor:**

I think it is fine to avoid sending a status as a response to messages-received. But at the beginning of the Basic Walkthrough section I see:

> This protocol consists of four different message requests from the recipient that should be replied to by the mediator:
> ...
> 3. Message Received -> Status

Which seems confusing.

A minor downside of this spec change is that it will force the recipient to always start an extra loop to determine that there are no further messages: for instance, if I have 10 messages and my limit is 10, I'll need to do a status-request -> status -> delivery-request -> delivery -> messages-received, followed by a status-request -> status. But it should not matter in the vast majority of cases.

**Collaborator (author):**

I've somewhat fixed the walkthrough section (because yeah, that's confusing -- nice catch).

> if I have 10 messages and my limit is 10, I'll need to do a status-request -> status -> delivery-request -> delivery -> messages-received followed by a status-request -> status.

That's only if you assume that you could have received a new message in the 200ms, or however long it takes, to receive the 10 messages (because the first status message informed us that we had 10 at that point in time). You could in fact receive a new message right after sending the second status-request and think you have no messages queued. I think it's an inherent problem with polling strategies.


#### Multiple Recipients

If a message arrives at a `mediator` addressed to multiple `recipients`, the message **MUST** be queued for each `recipient` independently. If one of the addressed `recipients` retrieves a message and indicates it has been received, that message **MUST** still be held for the other addressed `recipients` and removed only once they have each received it.
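
A sketch of what this implies for a mediator's queue; the store API is hypothetical:

```typescript
// One queue entry per addressed recipient: an acknowledgement by one
// recipient must never remove the copy held for another.
async function enqueueInbound(
  queue: { add(recipientDid: string, payloadBase64: string): Promise<string> },
  recipientDids: string[],
  payloadBase64: string, // the encrypted message as received
): Promise<void> {
  for (const did of recipientDids) {
    await queue.add(did, payloadBase64); // each copy gets its own id
  }
}
```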


### Live Mode
_Live Mode_ is the practice of delivering newly arriving messages directly to a connected `recipient`. It is disabled by default and only activated by the `recipient`. Messages that arrive when _Live Mode_ is off **MUST** be stored in the queue for retrieval as described above. If _Live Mode_ is active, and the connection is broken, a new inbound connection starts with _Live Mode_ disabled.

Messages already in the queue are not affected by _Live Mode_; they **MUST** still be requested with `delivery-request` messages.

_Live Mode_ **MUST** only be enabled when a persistent transport is used, such as WebSockets.

If _Live Mode_ is active, messages still **MUST** be delivered via a `delivery` message, and the `recipient` **MUST** acknowledge them with a `messages-received` message. If a message is not acknowledged, it **MUST** be added to the queue for later pickup.
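
A mediator-side sketch of this rule, assuming a hypothetical session object and acknowledgement tracker; the names and the timeout are illustrative, not normative:

```typescript
// Live delivery: wrap the newly arrived message in a delivery message and
// send it over the open session; requeue it if no messages-received
// acknowledgement arrives in time.
async function deliverLive(
  session: { send(msg: object): Promise<void> },
  waitForAck: (messageId: string, timeoutMs: number) => Promise<boolean>,
  queue: { add(recipientDid: string, payloadBase64: string): Promise<string> },
  recipientDid: string,
  payloadBase64: string,
): Promise<void> {
  const messageId = crypto.randomUUID();
  await session.send({
    id: crypto.randomUUID(),
    type: "https://didcomm.org/message-pickup/4.0/delivery",
    body: {},
    attachments: [{ id: messageId, data: { base64: payloadBase64 } }],
  });
  if (!(await waitForAck(messageId, 30_000))) {
    await queue.add(recipientDid, payloadBase64); // MUST queue if unacknowledged
  }
}
```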

Recipients have three possible modes of operation for message delivery, with varying abilities and levels of development complexity:

1. Never activate _Live Mode_. Poll for new messages with a `status-request` message, and retrieve them when available.
2. Retrieve all messages from the queue, and then activate _Live Mode_ (see the sketch below). This simplifies message processing logic in the `recipient`.
3. Activate _Live Mode_ immediately upon connecting to the `mediator`. Retrieve messages from the queue when possible. When receiving a message delivered live, the queue may be queried for any waiting messages delivered to the same DID for processing.
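
A recipient-side sketch of option 2 above, reusing the hypothetical helpers from the earlier sketches:

```typescript
// Drain the queue first, then enable Live Mode over the same persistent
// transport (e.g. an open WebSocket).
async function drainThenGoLive(
  request: (msg: object) => Promise<any>,
  pickupAllMessages: () => Promise<void>,
): Promise<void> {
  await pickupAllMessages(); // empty the queue via delivery-request loops
  const reply = await request({
    type: "https://didcomm.org/message-pickup/4.0/live-delivery-change",
    body: { live_delivery: true },
    return_route: "all",
  });
  // The mediator replies with a status (live_delivery: true) on success, or a
  // problem-report if the transport cannot support live delivery.
  if (!reply.body?.live_delivery) {
    throw new Error("Live Mode not enabled by mediator");
  }
}
```
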
### Live Mode Change
_Live Mode_ is changed with a `live-delivery-change` message.

Message Type URI: `https://didcomm.org/message-pickup/4.0/live-delivery-change`

DIDComm v1 example:
```json
{
    "@id": "123456780",
    "@type": "https://didcomm.org/message-pickup/4.0/live-delivery-change",
    "live_delivery": true
}
```

DIDComm v2 example:
```json
{
    "id": "123456780",
    "type": "https://didcomm.org/message-pickup/4.0/live-delivery-change",
    "body": {
        "live_delivery": true
    }
}
```

Upon receiving the `live-delivery-change` message, the `mediator` **MUST** respond with a `status` message.

If sent with `live_delivery` set to `true` on a connection incapable of live delivery, a `problem-report` **SHOULD** be sent as follows:

DIDComm v1 example:
```json
{
    "@id": "123456780",
    "@type": "https://didcomm.org/message-pickup/4.0/problem-report",
    "~thread": {
        "pthid": "<the value is the thid of the thread in which the problem occurred>"
    },
    "description": {
        "code": "e.m.live-mode-not-supported",
        "en": "Connection does not support Live Delivery"
    }
}
```

DIDComm v2 example:
```json
{
    "id": "123456780",
    "type": "https://didcomm.org/message-pickup/4.0/problem-report",
    "pthid": "<the value is the thid of the thread in which the problem occurred>",
    "body": {
        "code": "e.m.live-mode-not-supported",
        "comment": "Connection does not support Live Delivery"
    }
}
```

## L10n

No localization is required.

## Implementations

Name / Link | Implementation Notes
--- | ---

## Endnotes

### Future Considerations
The style of wrapping messages in a `delivery` message incurs roughly a 33% increase in message size, due to the base64 encoding of the attached messages. This size bloat is outweighed by the benefit of having explicit and guaranteed delivery of messages. This issue may be resolved in future versions of DIDComm.

Should there be a strategy for a `mediator` to indicate support for _Live Mode_ via discover features?