core.cdata: support for common SHM object types #1082

eugeneia · 2016-11-30T19:06:46Z

We anticipate increased usage of shared memory objects by multiprocessing aware apps that use the facility to coordinate using shared memory. This PR registers a bunch of basic types with core.shm so that they can be used in shared memory frames (or independently).

The first part is straight forward and imho generally useful, it adds the following SHM types under the cdata module:

int8_t int16_t int32_t int64_t
uint8_t uint16_t uint32_t uint64_t
float double
bool

These have well defined conversion semantics as documented here: http://luajit.org/ext_ffi_semantics.html (C Type Conversion Rules). With this PR they can be conveniently used in SHM frames:

myframe = shm.create_frame("myframe", {toggle = {cdata.bool, false})
cdata.set(myframe.toggle, not cdata.read(myframe.toggle)) -- toggle it

Additionally I have added some crude support for fixed length strings (three sizes for now: 64, 512, and 8192). This part of the PR might be too clunky or good enough, you decide. I am willing to scrap that part for a better alternative. See the clunky usage in action:

str = cdata.create("str.string64") -- Alternative to using cdata.string64 in create_frame
cdata.string64.copy(str, "foobar") -- could also use ffi.copy, with potential out of bounds write
ffi.string(str) => "foobar"

eugeneia · 2016-11-30T19:07:13Z

Cc @lukego

wingo · 2016-12-01T09:56:02Z

You seem to be going in a parallel direction to where we're going in the lwaftr :) I would be interested in a bigger picture design document.

lukego · 2016-12-01T10:01:17Z

@wingo The use case driving this is to export information from the main process that initializes a 100G NIC to the worker processes that have to attach to queues. Code: https://github.com/snabbco/snabb/blob/mellanox/src/apps/mellanox/connectx4.lua#L148-L176

Currently this is implemented via shm counter objects. However, it is a bit funny to use counters because these values are not logically counters (e.g. address of a DMA ring), the values are not always uint64_t, and counters are double-buffered (have to remember to counter.commit() to make them visible.)

Is there an lwAFTR mechanism that we could reuse instead already now?

wingo · 2016-12-01T10:08:42Z

If the need is just for information, the leader/follower apps include a generic one-way main-to-worker channel: https://github.com/Igalia/snabb/blob/yang/src/apps/config/README.md.

wingo · 2016-12-01T10:14:12Z

If the goal of this information is to simply configure the worker, then given that you want to move to YANG anyway, I would define a YANG module for the parts of the mellanox drivers that need this configuration (sendpath/receivepath), then use that YANG module to serialize that data. If the data use is more complicated (e.g. worker->main process communication) then that might not be a great option.

I am a little concerned that there's been a public ongoing design for multiprocess configuration, backs and forths from everyone, then something like this going in through the back door without a big picture design that I have seen, well it makes me nervous :) Perhaps that is just me being high-strung right before a deliverable :)

wingo · 2016-12-01T10:15:16Z

If this measure is seen to be a stopgap, then that's cool too. In that case I would suggest to put these data structures somewhere other than core.

lukego · 2016-12-01T10:29:55Z

@wingo The goal is to make instances of the Mellanox IO app automatically synchronize with each other even when they are running in separate processes. The information being shared is emphemeral operational data e.g. the physical address of each DMA ring and whether it is currently online. This is currently invisible to the application & app network - it is internal state of the Mellanox apps that happens to span multiple processes.

The most closely related public discussion is #1068 the "IO" mechanism i.e. the desire for drivers to be able to run distributed across multiple processes that all want to send/receive traffic on the same network interfaces. Currently we have put a simple interface on the IO apps and expected their instances to coordinate behind the scenes somehow.

Could alternatively model each piece of operational state with YANG and export that with Leader/Follower. So the YANG schema would have objects for e.g. the current physical address of the DMA ring for queue 7 and the index of the last successfully transmitted packet. This could then be provided to the app instances as configuration state. Is that what you have in mind? Just now this seems like taking YANG modeling a step too far - modeling state that apps actually want to hide internally - but this may be lack of imagination on my part.

wingo · 2016-12-01T10:36:22Z

Could be. If an app has a private channel, by all means make a private channel. If you want to model that data with YANG, then it's possible to compile a public configuration to a private configuration and ship that data over a channel ring buffer. Or if you want an app-specific mechanism, that's cool too (and that code should surely live in the app's directory and not core). But I still didn't get one part of what you are doing -- are you updating these values in place? Updating strings in place seems not like a good idea :)

wingo · 2016-12-01T10:37:38Z

I would not use YANG (or our yang code anyway) for things like "index of the last successfully transmitted packet".

lukego · 2016-12-01T10:50:49Z

@wingo I simply want to define some variables whose scope spans process boundaries. The existing shm mechanism seems like the natural choice except that its current set of user-defined types is sparse. Just now this seems orthogonal to YANG and channels: that is for synchronizing application-level configuration but what I am synchronizing is the internal state of some apps.

@eugeneia I suggest that we take @wingo's feedback as a NACK and I will cherry-pick any changes that I need from this branch into the mellanox driver directly. Then we can consider the issue again if/when we need to propagate this functionality into the other drivers.

lukego · 2016-12-01T11:17:02Z

@wingo Just reflection: we need to establish the scope of use cases for YANG/Leader+Follower/Channels/etc. Where does the YANG universe start and end? Lots of objects in Snabb - program configuration, app networks, app state, histograms and timeline logs, LuaJIT state, etc. It would be nice to have a common view on when to use these mechanisms and when to do something else.

wingo · 2016-12-01T11:18:15Z

Apologies for being grumpy and thank you for humoring me :) FWIW the core of my objection is really against SHM-based configuration (app-internal or otherwise) in core, at least at this time. I don't think there is a robust general way that can work; I could be wrong but we should start out app-specific instead of starting in core IMO. Counters and synchronization variables for me are fine tho, there's fewer ways to misuse those :)

lukego · 2016-12-01T11:32:04Z

I understand that point of view. Shared memory variables are too low-level to be a satisfactory general solution for synchronization.

This is a relatively simple use case. Most of the variables are written only once, at creation time, and then used read-only. In this case shm seems simpler to me than channels because it is passive state (named values) rather than moving parts (sending messages on ring buffers). I also need to be able to handle simple exceptional cases e.g. for the worker to be able to save a tiny amount of precious state when it shuts down so that the next instance can reattach to the rings correctly.

No problem to keep this local in the driver for now. I do like this strategy of keeping code outside of core until it ceases to be controversial.

eugeneia · 2016-12-01T13:47:11Z

Cc @alexandergall, who I believe might have some use for this. Initially, he complained about my “we only need 64bit counters stance.” :-)

Fix attempt to index a nil value

eugeneia added 3 commits November 28, 2016 19:30

add core.cdata for storing various Lua values in SHM.

25f7ce3

core.cdata: unroll struct/union into separate SHM types.

f61d5a5

core.cdata: add primitive types for fixed length strings.

4c3c94b

eugeneia added the rfc label Nov 30, 2016

dpino added a commit to dpino/snabb that referenced this pull request Jun 6, 2018

Merge pull request snabbco#1082 from dpino/fix-index-nil

be449cb

Fix attempt to index a nil value

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

core.cdata: support for common SHM object types #1082

core.cdata: support for common SHM object types #1082

eugeneia commented Nov 30, 2016

eugeneia commented Nov 30, 2016

wingo commented Dec 1, 2016

lukego commented Dec 1, 2016

wingo commented Dec 1, 2016

wingo commented Dec 1, 2016

wingo commented Dec 1, 2016 •

edited

Loading

lukego commented Dec 1, 2016

wingo commented Dec 1, 2016

wingo commented Dec 1, 2016

lukego commented Dec 1, 2016

lukego commented Dec 1, 2016

wingo commented Dec 1, 2016 •

edited

Loading

lukego commented Dec 1, 2016 •

edited

Loading

eugeneia commented Dec 1, 2016

core.cdata: support for common SHM object types #1082

Are you sure you want to change the base?

core.cdata: support for common SHM object types #1082

Conversation

eugeneia commented Nov 30, 2016

eugeneia commented Nov 30, 2016

wingo commented Dec 1, 2016

lukego commented Dec 1, 2016

wingo commented Dec 1, 2016

wingo commented Dec 1, 2016

wingo commented Dec 1, 2016 • edited Loading

lukego commented Dec 1, 2016

wingo commented Dec 1, 2016

wingo commented Dec 1, 2016

lukego commented Dec 1, 2016

lukego commented Dec 1, 2016

wingo commented Dec 1, 2016 • edited Loading

lukego commented Dec 1, 2016 • edited Loading

eugeneia commented Dec 1, 2016

wingo commented Dec 1, 2016 •

edited

Loading

wingo commented Dec 1, 2016 •

edited

Loading

lukego commented Dec 1, 2016 •

edited

Loading