More ZMQ Streamers #72
Comments
Sounds interesting.. at this point, I don't really have the time or inclination to become an expert in ZMQ. But if you want to take a stab at some version of this (however it makes sense), I'm all for it.
Yup, I didn't really expect you to dive in, so much as to offer any high-level API or feature or design guidance if you had any to offer. Otherwise, I'll just try to prototype a thing and submit a PR (...after I finish my taxes).
I guess the one thing that comes to mind is that we should maybe think about the right distribution model here. Before diving down into this, it would be worth scoping out what kinds of problems this would be appropriate for, compared to, say, Dask, and then plan accordingly. For my purposes, ZMQStreamer exists just to make it easy to decouple the data generator from the model fitter; distributed computation doesn't really enter into the picture.
I don't know from Dask, but looking at it, it looks like this might fill
the use case I'm looking at. I'll play with it a bit.
…On Thu, Mar 30, 2017 at 9:53 AM Brian McFee ***@***.***> wrote:
I guess the one thing that comes to mind is that we should maybe think
about the right distribution model here. Before diving down into this, it
would be worth scoping out what kinds of problems this would be appropriate
for, compared to, say, Dask, and then plan accordingly.
For my purposes, ZMQStreamer exists just to make it easy to decouple the
data generator from the model fitter; distributed computation doesn't
really enter into the picture.
potentially related (somewhat maybe?) tensorflow/tensorflow#8728
Just some things I've been thinking about prototyping soon, but I wanted to talk through them first and get your thoughts, @bmcfee. This idea isn't fully fleshed out yet.
I've been doing some work with ZMQ at work lately, and I am learning that there are actually several different paradigms ZMQ is designed for. Currently, we are using only the "paired" mode (`request/recv`).
In particular, I am thinking about creating multiple `Streamers`, each wrapped as a sort of "Worker" in a separate Python process, streaming to one central ZMQ receiver, which supplies batches to the training process. It should also be possible to have the ZMQ Workers live on other machines, thereby enabling a sort of "CloudStreamer" through AWS/GCP/etc.
I have this pyzmq example in mind. (Although it's currently unclear to me whether we'd want a `Queue` device or a `Streamer` ZMQ device.)
A first step might be to create something like a `ZMQWorkerStreamer`, which is a `Streamer` but sends to the intermediate Queue-like space. Then you have a `ZMQConsumerStreamer`, which might operate in much the way that `ZMQStream` does now, except with external sources.
(This also might be along the lines of the asyncio version we've talked a little about, with the added bonus of external sources.)
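To make the many-workers-to-one-receiver idea a bit more concrete, here is a rough sketch using plain pyzmq PUSH/PULL sockets. It is only an illustration under several assumptions: the `worker` and `consume` functions are hypothetical names (not part of any existing API), the workers run as threads rather than separate processes or machines to keep the example self-contained, the dict payloads stand in for batches from a real `Streamer`, and PUSH/PULL is swapped in for the `Queue` or `Streamer` device discussed above.

```python
import threading

import zmq


def worker(endpoint, worker_id, n_batches):
    """Hypothetical worker: PUSH a few fake batches to the central receiver."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PUSH)
    sock.connect(endpoint)
    for i in range(n_batches):
        # Stand-in for iterating over a real Streamer's batches.
        sock.send_pyobj({"worker": worker_id, "batch": i})
    # Queued messages are still delivered in the background after close().
    sock.close()


def consume(n_workers=3, n_batches=4):
    """Hypothetical consumer: PULL batches from all workers on one socket."""
    ctx = zmq.Context.instance()
    sink = ctx.socket(zmq.PULL)
    # Fail with zmq.Again instead of hanging forever if a batch never arrives.
    sink.setsockopt(zmq.RCVTIMEO, 5000)
    port = sink.bind_to_random_port("tcp://127.0.0.1")
    endpoint = "tcp://127.0.0.1:%d" % port

    threads = [
        threading.Thread(target=worker, args=(endpoint, w, n_batches))
        for w in range(n_workers)
    ]
    for t in threads:
        t.start()

    # Batches from different workers arrive interleaved, in no fixed order.
    received = [sink.recv_pyobj() for _ in range(n_workers * n_batches)]

    for t in threads:
        t.join()
    sink.close()
    return received


if __name__ == "__main__":
    for batch in consume():
        print(batch)
```

Because the receiver binds and the workers connect, a worker on another machine would only need the receiver's address; whether that is enough for the "CloudStreamer" case, or whether an intermediate device is needed, is still an open question here.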