This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

Protocol 0.2.0
Update to the protocol as a result of libchan meeting with Matteo Collina.
Bump the version to 0.2.0 and name the previous version 0.1.0.

Protocol changes:
 - Define stream provider to support multiple multiplexing protocols
 - Support CBOR in addition to Msgpack for channel message encoding
 - Add extended type codes definition
 - Require byte-streams to send *"libchan-parent-ref"*
 - Allow byte-streams as duplex or half-duplex
 - Add channel synchronization through ack definition
 - Add channel errors
 - Update description of relationship to Go channels

Other changes:
 - Add Derek and Matteo to authors
 - Reformatted to 80 character lines
 - Much cleanup and rewording

Signed-off-by: Derek McGowan <[email protected]> (github: dmcgowan)
dmcgowan committed Mar 11, 2015
1 parent d5db941 commit 403a80b
Showing 1 changed file with 219 additions and 106 deletions.
325 changes: 219 additions & 106 deletions PROTOCOL.md
@@ -2,179 +2,292 @@

Extreme portability is a key design goal of libchan.

This document specifies the libchan protocol to allow multiple implementations to co-exist with
full interoperability.
This document specifies the libchan protocol to allow multiple implementations
to co-exist with full interoperability.

## Version

No version yet.
0.2.0

## Author
## Authors

Solomon Hykes <[email protected]>
Derek McGowan <[email protected]>
Matteo Collina <[email protected]>

## Status

This specification is still work in progress. Things will change, probably in reverse-incompatible ways.
We hope to reach full API stability soon.
This specification is nearing a stable release. The protocol may still change in
backward-incompatible ways.

## Terminology

### Channel

A `channel` is an object which allows 2 concurrent programs to communicate with each other. The semantics
of a libchan channel are very similar (but not identical) to those of Go's native channels.
A `channel` is an object which allows 2 concurrent programs to communicate with
each other. The semantics of a libchan channel are very similar (but not
identical) to those of Go's native channels. A channel may be used
synchronously, but does not support synchronization primitives such as Go's
channel select semantics.

A channel has 2 ends: a `Sender` end and a `Receiver` end. The Sender can send messages and close the channel.
The Receiver can receive messages. Messages arrive in the same order they were sent.
A channel has 2 ends: a `Sender` end and a `Receiver` end. The Sender can send
messages and close the channel. The Receiver can receive messages. Messages
arrive in the same order they were sent.

A channel is uni-directional: messages can only flow in one direction. So channels are more similar to pipes
than to sockets.
A channel is uni-directional: messages can only flow in one direction. So
channels are more similar to pipes than to sockets.

### Message

A message is a discrete packet of data which can be sent on a channel. Messages are structured into multiple
fields. The protocol defines which data types can be carried by a message, and how transports should encode and
decode them.
A message is a discrete packet of data which can be sent on a channel. The
protocol defines which data types can be sent as a message, and how
transports should encode and decode them. A message is similar to a JSON object
containing [custom types](#custom-types) to represent channels and byte streams.

### Byte stream

A byte stream is an object which implements raw IO with `Read`, `Write` and `Close` methods.
Typical byte streams are text files, network sockets, memory buffers, pipes, and so on.
A byte stream is an object which implements raw IO with `Read`, `Write` and
`Close` methods. Typical byte streams are text files, network sockets, memory
buffers, pipes, and so on. A byte stream may be read-only, write-only, or
full duplex.

One distinct characteristic of libchan is that it can encode byte streams as first class fields
in a message, alongside more basic types like integers or strings.
One distinct characteristic of libchan is that it can encode byte streams as
first class fields in a message, alongside more basic types like integers or
strings.

### Nesting

Libchan supports nesting. This means that a libchan message can include a channel, which itself
can be used to send and receive messages, and so on all the way down.
Libchan supports nesting. This means that a libchan message can include a
channel, which itself can be used to send and receive messages, and so on all
the way down.

Nesting is a fundamental property of libchan.

## Underlying transport

The libchan protocol requires a reliable, 2-way byte stream as a transport.
The most popular options are TCP connections, unix stream sockets and TLS sessions.
The libchan protocol requires a reliable, 2-way byte stream with support for
multiplexing as a transport. The underlying byte stream protocol is abstracted
to the libchan protocol through a simple multiplexed stream interface which may
use SPDY/3.1, HTTP/2, or SSH over TCP connections, unix stream sockets and
TLS sessions.

It is also possible to use websocket as an underlying transport, which allows exposing
a libchan endpoint at an HTTP1 url.
When no multiplexed stream transport is available but a non-multiplexed
connection is, a multiplexing protocol (such as SPDY or another simple
multiplexing protocol) may be layered on top of the existing connection.
This also makes using websockets as an underlying byte stream transport
possible, which allows exposing a libchan endpoint at an HTTP/1 url.

## Authentication and encryption

Libchan can optionally use TLS to authenticate and encrypt communications. After the initial
handshake and protocol negotiation, the TLS session is simply used as the transport for
the libchan wire protocol.

## Wire protocol

Libchan uses SPDY (protocol draft 3) as its wire protocol, with no modification.
Libchan can optionally use TLS to authenticate and encrypt communications. After
the initial handshake and protocol negotiation, the TLS session is simply used
as the transport for the libchan multiplexed stream provider.

## Stream Provider

Libchan uses a stream provider to establish new channels and byte streams over
an underlying byte stream. The stream provider must be able to send
headers when creating new streams and retrieve headers for remotely created
streams.

### Headers
The stream provider must support sending key-value headers on stream creation.

- *"libchan-ref"* - String representation of a unique 64 bit integer identifier
for the established stream.
- *"libchan-parent-ref"* - String representation of a unique 64 bit integer
identifier for the parent of the established stream. (see *"Sending nested
channels"* and *"Sending byte streams"*)
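As a rough illustration of the two headers, the sketch below builds the key-value set for a new stream. The function name `stream_headers`, the decimal string form, and the unsigned range check are assumptions made for the example; the specification only says the value is a string representation of a 64 bit integer.

```python
# Hypothetical helper, not part of any libchan implementation.
MAX_REF = 2**64 - 1  # assumption: identifiers are treated as unsigned 64 bit


def stream_headers(ref, parent_ref=None):
    """Build the key-value headers sent when a stream is created."""
    for value in (ref, parent_ref):
        if value is not None and not 0 <= value <= MAX_REF:
            raise ValueError("reference must fit in 64 bits")
    # Assumption: the string representation is plain decimal.
    headers = {"libchan-ref": str(ref)}
    if parent_ref is not None:
        # Only nested channels and byte streams carry a parent reference.
        headers["libchan-parent-ref"] = str(parent_ref)
    return headers
```

A top-level channel would use `stream_headers(7)`, while a stream created from within channel 7 would use `stream_headers(8, parent_ref=7)`.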

### Streams
The stream provider provides the functionality for creating new streams as
well as accepting streams created remotely. A stream is created with
a set of headers and an accepted stream has a method for returning the
headers. Closing a stream must put the stream in a half-closed state and
not allow any more data to be written. If the remote side has already
closed, the stream is fully closed. Resetting a stream forces the stream
into a fully closed state and should only be used in error cases.
Resetting does not give the remote a chance to finish sending data and
cleanly close.
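The close and reset rules above can be read as a small state machine. The following is a minimal sketch under that reading; the class and state names are invented for illustration and no real stream provider API is implied.

```python
from enum import Enum


class StreamState(Enum):
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


class Stream:
    """Hypothetical in-memory model of a provider stream."""

    def __init__(self, headers):
        self.headers = dict(headers)  # headers supplied at creation
        self.state = StreamState.OPEN

    def write(self, data):
        # Writing is refused once the local side has closed.
        if self.state in (StreamState.HALF_CLOSED_LOCAL, StreamState.CLOSED):
            raise OSError("stream is closed for writing")
        return len(data)

    def close(self):
        # A local close half-closes, or fully closes if the peer closed first.
        if self.state is StreamState.OPEN:
            self.state = StreamState.HALF_CLOSED_LOCAL
        elif self.state is StreamState.HALF_CLOSED_REMOTE:
            self.state = StreamState.CLOSED

    def remote_closed(self):
        # Called when the peer closes its side.
        if self.state is StreamState.OPEN:
            self.state = StreamState.HALF_CLOSED_REMOTE
        elif self.state is StreamState.HALF_CLOSED_LOCAL:
            self.state = StreamState.CLOSED

    def reset(self):
        # Error path: jump straight to fully closed.
        self.state = StreamState.CLOSED
```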

## Stream identifiers
Libchan creates a unique identifier for every stream created by the stream
provider. The identifiers are integer values and should never be reused.
The identifier is only unique to a given endpoint, meaning both sides of a
connection may have the same identifier for two different streams. The
identifiers received from the remote endpoint should only be used to reference
streams from that endpoint, and never streams created locally. A remote
endpoint's stream identifier should never be sent in a libchan message.
To send a stream created remotely, a new stream should be created
locally, copied from the remote stream, and the identifier of the local copy
should be used.
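One way to picture the identifier rules is a per-session table with separate namespaces for locally and remotely created streams. This is only a sketch; `StreamTable` and its methods are hypothetical names, and starting the counter at 1 is an arbitrary choice.

```python
import itertools


class StreamTable:
    """Hypothetical per-session bookkeeping for stream identifiers."""

    def __init__(self):
        self._next_ref = itertools.count(1)  # monotonic, so never reused
        self.local = {}   # identifiers this endpoint allocated
        self.remote = {}  # identifiers the peer allocated

    def add_local(self, stream):
        ref = next(self._next_ref)
        self.local[ref] = stream
        return ref

    def add_remote(self, ref, stream):
        # The peer's identifiers live in their own namespace and may
        # collide numerically with local ones.
        if ref in self.remote:
            raise ValueError("peer reused a stream identifier")
        self.remote[ref] = stream
```

Because the two dictionaries are separate, identifier 1 can name one stream in `local` and a different stream in `remote` without conflict, and only keys of `local` would ever be placed in an outgoing message.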

## Control protocol

Once 2 libchan endpoints have established a SPDY session, they communicate with the following
control protocol.
Once 2 libchan endpoints have established a multiplexed stream session, they
communicate with the following control protocol.

### Top-level channels

Each SPDY session may carry multiple concurrent channels, in both directions, using standard
SPDY framing and stream multiplexing. Each libchan channel is implemented by an underlying
SPDY stream.
Each libchan session may carry multiple concurrent channels, in both directions,
using stream multiplexing. Each libchan channel is implemented by an underlying
stream.

To use a SPDY session, either endpoint may initiate new channels, wait for its peer to
initiate new channels, or a combination of both. Channels initiated in this way are called
*top-level channels*.
To use a libchan session, either endpoint may initiate new channels, wait for
its peer to initiate new channels, or a combination of both. Channels initiated
in this way are called *top-level channels*.

* To initiate a new top-level channel, either endpoint may initiate a new SPDY stream, then
start sending messages to it (see *"sending messages"*).
* To initiate a new top-level channel, either endpoint may initiate a new
stream, then start sending messages to it (see *"sending messages"*).

* The endpoint initiating a top-level channel MAY NOT allow the application to receive messages
from it and MUST ignore inbound messages received on that stream.
* The endpoint initiating a top-level channel MAY NOT allow the application to
receive messages from it and MUST interpret inbound messages received on that
stream as an ack or error message.

* When an endpoint receives a new inbound SPDY stream, and the initial headers DO NOT include
the key `libchan-ref`, it MUST queue a new `Receiver` channel to pass to the application.
* The endpoint initiating the channel must create a unique identifier for the
channel and include the value in the *"libchan-ref"* header when creating
the new stream.

* The endpoint receiving a top-level channel MAY NOT allow the application to send messages to
it.
* When an endpoint receives a new stream without the header
*"libchan-parent-ref"*, it MUST interpret the stream as an inbound top-level
channel and queue a new `Receiver` channel to pass to the application.

* The endpoint receiving a top-level channel MAY NOT allow the application to
send messages to it.
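The header check for inbound streams can be sketched as follows. The function name and the returned labels are made up for the example; a real implementation would queue a `Receiver` rather than return a string.

```python
def classify_inbound_stream(headers):
    """Decide how to treat a stream that the peer created."""
    if "libchan-ref" not in headers:
        # The initiating endpoint is expected to always send this header.
        raise ValueError("inbound stream is missing libchan-ref")
    if "libchan-parent-ref" not in headers:
        # No parent: an inbound top-level channel, exposed as a Receiver.
        return "top-level"
    # With a parent: a nested channel or a byte stream tied to a message.
    return "child"
```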

### Sending messages on a channel

Once a SPDY stream is initiated, it can be used as a channel, with the initiating endpoint holding
the `Sender` end of the channel, and the recipient endpoint holding the `Receiver` end.
### Sending messages on a channel

* To send a message, the sender MUST encode it using the [msgpack](https://msgpack.org) encoding format, and
send a single data frame on the corresponding SPDY stream, with the encoded message as the exact content of
the frame.
Once a stream is initiated, it can be used as a channel, with the initiating
endpoint holding the `Sender` end of the channel, and the recipient endpoint
holding the `Receiver` end.

* When receiving a data frame on any active SPDY stream, the receiver MUST decode it using msgpack. If
the decoding fails, the receiver MUST close the underlying stream, and future calls to `Receive` on that
channel MUST return an error.
* To send a message, the sender MUST encode it using the message encoding format
(see *"message encoding"*), and send the encoded message on the corresponding
stream.

* A valid msgpack decode operation with leftover trailing or leading data is considered an *invalid* msgpack
decode operation, and MUST yield the corresponding error.
* When receiving data on any active stream, the receiver MUST decode it using
the same message encoding format. If the decoding fails, the receiver MUST close
the underlying stream, and future calls to `Receive` on that channel MUST return
an error.

### Closing a channel
* Every sent message should be matched by an ack message received from the
peer. The ack message is a map with at least one field named `code`. The
`code` field should have an integer value, with a value of zero considered
a successful ack and any non-zero value an error. An error should be accompanied
by an additional `message` field of type string, describing the error. If an
error is received, the sender should close the channel and pass an error to the
application.
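A minimal sketch of how a sender might interpret a decoded ack map is shown below. `ChannelError` and `check_ack` are hypothetical names, and treating a missing `message` as an empty string is an assumption.

```python
class ChannelError(Exception):
    """Raised when the peer acks with a non-zero code."""

    def __init__(self, code, message):
        super().__init__(f"channel error {code}: {message}")
        self.code = code
        self.message = message


def check_ack(ack):
    """Return normally on a successful ack, raise otherwise."""
    if not isinstance(ack, dict) or "code" not in ack:
        raise ValueError("malformed ack: expected a map with a 'code' field")
    code = ack["code"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(code, bool) or not isinstance(code, int):
        raise ValueError("malformed ack: 'code' must be an integer")
    if code != 0:
        # Assumption: a missing 'message' is tolerated as an empty string.
        raise ChannelError(code, ack.get("message", ""))
```

On `ChannelError` the sender would then close the channel and surface the error to the application.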

The endpoint which initiated a channel MAY close it by closing the underlying SPDY stream.
### Sending nested channels

*FIXME: provide more details*
* When sending a nested channel, in addition to the *"libchan-ref"* header, the
*"libchan-parent-ref"* header must be sent identifying the channel used to
create the nested channel.

### Sending byte streams
### Closing a channel

Libchan messages support a special type called *byte streams*. Unlike regular types like integers or strings,
byte streams are not fully encoded in the message. Instead, the message encodes a *reference* which allows
the receiving endpoint to reconstitute the byte stream after receiving the message, and pass it to the
application.
The endpoint which holds the send side of a channel MAY close it, which
half-closes the stream. The receive side should respond by closing the stream,
putting the stream in a fully closed state. Any send or receive call from the
application after close should return an error.

*FIXME: specify use of msgpack extended types to encode byte streams*
When an error is received on a channel, the underlying stream should be
closed by both ends.

Libchan supports 2 methods for sending byte streams: a default method which is supported on all transports,
and an optional method which requires unix stream sockets. All implementations MUST support both methods.
### Sending byte streams

#### Default method: SPDY streams
Libchan messages support a special type called *byte streams*. Unlike regular
types like integers or strings, byte streams are not fully encoded in the
message. Instead, the message encodes a *reference* which allows the receiving
endpoint to reconstitute the byte stream after receiving the message, and pass
it to the application.

The primary method for sending a byte stream is to send it over a SPDY stream, with the following protocol:
Byte streams use the raw stream returned by the stream provider.

* When encoding a message including 1 or more byte stream values, the sender MUST assign to each value
an identifier unique to the session, and store these identifiers for future use.
* When encoding a message including 1 or more byte stream values, the sender
MUST assign to each value an identifier unique to the session, and store these
identifiers for future use.

* After sending the encoded message, the sender MUST initiate 1 new SPDY stream for each byte stream value
in the message.
* After sending the encoded message, the sender MUST create 1 new stream for
each byte stream value in the message.

* Each of those SPDY stream MUST include an initial header with as a key the string "*libchan-ref*", and
as a value the identifier of the corresponding byte stream.
* Each new stream MUST include a header with the key *"libchan-ref"* and
a value of the identifier of the corresponding byte stream. It must also
include a header with the key *"libchan-parent-ref"* and a value of the
stream identifier for the message channel which created the byte stream.

Conversely, the receiver must follow this protocol:

* When decoding a message including 1 or more byte stream values, the receiver MUST store the unique identifier
of each value in a session-wide table of pending byte streams. It MAY then immediately pass the decoded message to the application.

* The sender SHOULD cap the size of its pending byte streams table to a reasonable value. It MAY make that value
configurable by the application. If it receives a message with 1 or more byte stream references, and the table
is full, the sender MAY suspend processing of the message until there is room in the table.

* When receiving new SPDY streams which include the header key "*libchan-ref*", the receiver MUST lookup that
header value in the table of pending byte streams. If the value is registered in the table, that SPDY stream
MUST be passed to the application.

On either end, once the SPDY stream for a byte-stream value is established, it MUST be exposed to the application
as follows:

* After sending each of those SPDY streams, each write operation by the application to a byte-stream field MUST
trigger the sending of a single data frame on the corresponding SPDY stream.

* Each read operation by the application from a byte-stream field MUST yield the content of the next
data frame received on the corresponding SPDY stream. If the reading end of the SPDY stream is closed,
the read operation MUST yield EOF.

* A close operation by the application on the a byte-stream field MUST trigger the closing of the writing end
of the corresponding SPDY stream.

#### Optional method: file descriptor passing
* When decoding a message including 1 or more byte stream values, the receiver
MUST store the unique identifier of each value in a session-wide table of
pending byte streams. It MAY then immediately pass the decoded message to the
application.

*FIXME*
* The sender SHOULD cap the size of its pending byte streams table to a
reasonable value. It MAY make that value configurable by the application. If it
receives a message with 1 or more byte stream references, and the table
is full, the sender MAY suspend processing of the message until there is room in
the table.

### Sending nested channels
* When receiving new streams which include the header key "*libchan-ref*", the
receiver MUST look up that header value in the table of pending byte streams. If
the value is registered in the table, that stream MUST be passed to the
application.
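The pending table described in these steps might look like the sketch below, kept by the side that decodes the message. The class name, the default cap of 64, and the return conventions are assumptions for illustration.

```python
class PendingByteStreams:
    """Hypothetical session-wide table of byte stream references."""

    def __init__(self, max_pending=64):
        self.max_pending = max_pending  # assumption: application-configurable
        self._pending = set()

    def expect(self, refs):
        """Record references found in a decoded message.

        Returns False when the table has no room, in which case the
        caller may suspend processing of the message and retry later.
        """
        refs = set(refs)
        if len(self._pending | refs) > self.max_pending:
            return False
        self._pending |= refs
        return True

    def match(self, headers):
        """Check an inbound stream against the table, consuming the entry."""
        ref = int(headers["libchan-ref"])
        if ref in self._pending:
            self._pending.discard(ref)
            return True  # the stream should be passed to the application
        return False
```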

*FIXME*
On either end, once the stream for a byte-stream value is established, it MUST
be exposed to the application as follows:

* After sending each of those streams, each write operation by the application
to a byte-stream field MUST trigger the sending of a single data frame on the
corresponding stream.

* Each read operation by the application from a byte-stream field MUST yield
the content of the next data frame received on the corresponding stream. If the
reading end of the stream is closed, the read operation MUST yield EOF.

* A close operation by the application on a byte-stream field MUST trigger
the closing of the writing end of the corresponding stream.

## Message encoding
A message may be any type which is supported by the libchan encoder. A libchan
message encoder must support encoding raw byte stream types as well as channels.
In addition to the libchan data types, time must also be encoded as a custom
type to increase portability of the protocol.

The currently supported message encoder is msgpack5; CBOR support is planned.

### Custom Types
Each custom type defines a type code and the byte layout to represent that type.
Directions of descriptions are from the point of view of the endpoint encoding.
All multi-byte integers are encoded big endian. The length in bytes of the
encoded value is provided by the encoding format, allowing integer values
to be variable length.

| Type | Code | Byte Layout|
|---|---|---|
| Duplex Byte Stream | 1 | 4 or 8 byte integer identifier |
| Inbound Byte Stream | 2 | 4 or 8 byte integer identifier |
| Outbound Byte Stream | 3 | 4 or 8 byte integer identifier |
| Inbound channel | 4 | 4 or 8 byte integer identifier |
| Outbound channel | 5 | 4 or 8 byte integer identifier |
| time | 6 | 8 byte integer seconds + 4 byte integer nanoseconds |
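To make the byte layouts concrete, here is a sketch that packs the payloads with Python's `struct` module. The function names are hypothetical, and the signedness choices (unsigned identifiers, signed seconds, unsigned nanoseconds) are assumptions, since the table only gives field widths. Wrapping the payload in a msgpack or CBOR extension value with the type code is left to the encoder.

```python
import struct

# Type codes from the table above.
TYPE_CODES = {
    "duplex_byte_stream": 1,
    "inbound_byte_stream": 2,
    "outbound_byte_stream": 3,
    "inbound_channel": 4,
    "outbound_channel": 5,
    "time": 6,
}


def encode_ref(identifier):
    """Pack an identifier as 4 bytes when it fits, otherwise 8, big endian."""
    if 0 <= identifier < 2**32:
        return struct.pack(">I", identifier)
    return struct.pack(">Q", identifier)


def decode_ref(payload):
    """The payload length, supplied by the encoding format, selects the width."""
    if len(payload) == 4:
        return struct.unpack(">I", payload)[0]
    if len(payload) == 8:
        return struct.unpack(">Q", payload)[0]
    raise ValueError("identifier payload must be 4 or 8 bytes")


def encode_time(seconds, nanoseconds):
    """8 byte seconds followed by 4 byte nanoseconds, 12 bytes in total."""
    return struct.pack(">qI", seconds, nanoseconds)


def decode_time(payload):
    return struct.unpack(">qI", payload)
```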

## Version History

0.2.0
- Define stream provider to support multiple multiplexing protocols
- Support CBOR in addition to Msgpack for channel message encoding
- Add extended type codes definition
- Require byte-streams to send *"libchan-parent-ref"*
- Allow byte-streams as duplex or half-duplex
- Add channel synchronization through ack definition
- Add channel errors
- Update description of relationship to Go channels

0.1.0
- Initial specification
- Message channels
- Nested message channels
- Duplex byte streams
- Msgpack channel message encoding
- SPDY stream multiplexing
