<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<book>
<title>The ZeroMQ Guide - for PHP Developers</title>
<bookinfo>
<isbn><!-- ISBN goes here --></isbn>
</bookinfo>
<dedication>
<para><emphasis role="bold">By Pieter Hintjens</emphasis></para>
<para>Please use the <ulink url="https://github.com/imatix/zguide/issues">issue tracker</ulink> for all comments and errata. This version covers the latest stable release of ØMQ (3.2) and was published on Mon 12 November, 2012. If you are using older versions of ØMQ then some of the examples and explanations won't be accurate.</para>
<para>The Guide is originally <ulink url="http://zguide.zeromq.org/page:all">in C</ulink>, but also in <ulink url="http://zguide.zeromq.org/php:all">PHP</ulink>, <ulink url="http://zguide.zeromq.org/py:all">Python</ulink>, <ulink url="http://zguide.zeromq.org/lua:all">Lua</ulink>, and <ulink url="http://zguide.zeromq.org/hx:all">Haxe</ulink>. We've also translated most of the examples into C++, C#, CL, Delphi, Erlang, F#, Felix, Haskell, Java, Objective-C, Ruby, Ada, Basic, Clojure, Go, Haxe, Node.js, ooc, Perl, and Scala.</para>
</dedication>
<preface>
<title>Preface</title>
<sect1>
<title>ØMQ in a Hundred Words</title>
<para>ØMQ (also seen as ZeroMQ, 0MQ, zmq) looks like an embeddable networking library but acts like a concurrency framework. It gives you sockets that carry atomic messages across various transports like in-process, inter-process, TCP, and multicast. You can connect sockets N-to-N with patterns like fanout, pub-sub, task distribution, and request-reply. It's fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks. It has a score of language APIs and runs on most operating systems. ØMQ is from <ulink url="http://www.imatix.com">iMatix</ulink> and is LGPLv3 open source.</para>
</sect1>
<sect1>
<title>The Zen of Zero</title>
<para>The Ø in ØMQ is all about tradeoffs. On the one hand this strange name lowers ØMQ's visibility on Google and Twitter. On the other hand it annoys the heck out of some Danish folk who write us things like "ØMG røtfl", and "Ø is not a funny looking zero!" and "<emphasis>Rødgrød med Fløde!</emphasis>", which is apparently an insult that means "may your neighbours be the direct descendants of Grendel!" Seems like a fair trade.</para>
<para>Originally the zero in ØMQ was meant as "zero broker" and (as close to) "zero latency" (as possible). In the meantime it has come to cover different goals: zero administration, zero cost, zero waste. More generally, "zero" refers to the culture of minimalism that permeates the project. We add power by removing complexity rather than exposing new functionality.</para>
</sect1>
<sect1>
<title>How the Guide Happened</title>
<para>In the summer of 2010, ØMQ was still a little-known niche library described by its rather terse man pages and a living but sparse wiki. Martin Sustrik and myself (Pieter Hintjens) were sitting in the bar of the Hotel Kyjev in Bratislava plotting how to make ØMQ more widely popular. Martin had written most of the ØMQ code, and I'd put up the funding and organized the community. Over some Zlaty Bazants, we agreed that ØMQ needed a new, simpler web site, and a basic guide to new users.</para>
<para>Martin collected some ideas for topics to explain. I'd never written a line of ØMQ code before this, so it became a live learning documentary. As I worked through simple examples to more complex ones I tried to answer many of the questions I'd seen on the mailing list. Since I've been building large-scale architectures for 30 years, there were a lot of problems I was keen to throw ØMQ at. Amazingly the results were mostly simple and elegant, even working in C. I felt a pure joy learning ØMQ and using it to solve real problems, which brought me back to programming after a few years' pause. And often, not knowing how it was "supposed" to be done, we improved ØMQ as we went along.</para>
<para>From the start I wanted the Guide to be a community project, so I put it onto github and let others contribute with pull requests. This was considered a radical, even vulgar approach by some. We came to a division of labor: I'd do the writing and make the original C examples, and others would help fix the text and translate the examples into other languages.</para>
<para>This worked better than I dared hope: you can now find all the examples in several languages and many in a dozen languages. It's a kind of programming language Rosetta stone and a valuable outcome in itself. We set up a high score: reach 80% translation and your language got its own Guide. PHP, Python, Lua, and Haxe reached this goal. People asked for PDFs, and we made those. People asked for ebooks, and got those. About a hundred people have contributed to the Guide to date.</para>
<para>The Guide achieved its goal of popularizing ØMQ. The style pleases most and annoys some, which is how it should be. In December 2010 my work on ØMQ and the Guide stopped, as I found myself going through late-stage cancer, heavy surgery, and six months of chemotherapy. When I picked up work again in mid-2011, it was to start using ØMQ in anger for one of the largest use-cases imaginable: on the mobile phones and tablets of the world's biggest electronics company.</para>
<para>But the goal of the Guide was, from the start, a printed book. So it was exciting to get an email from Bill Lubanovic in January 2012 introducing me to his editor Andy Oram at O'Reilly, suggesting a ØMQ book. Of course! Where do I sign? How much do I have to pay? Oh, I <emphasis>get money</emphasis> for this? All I have to do is finish it?</para>
<para>Of course as soon as O'Reilly announced a ØMQ book, other publishers started sending out emails to potential authors. You'll probably see a rash of ØMQ books coming out next year. That's good: our niche library has hit the mainstream and deserves its six inches of shelf space. My apologies to the other ØMQ authors: we've set the bar horribly high, and my advice is to make your books complementary. Perhaps focus on a specific language, platform, or pattern.</para>
<para>This is the magic and power of communities: be the first community in a space, stay healthy, and you own that space for ever.</para>
</sect1>
<sect1>
<title>Audience</title>
<para>This book is written for professional programmers who want to learn how to make the massively distributed software that will dominate the future of computing. We assume you can read C code, because most of the examples here are in C even though ØMQ is used in many languages. We assume you care about scale, because ØMQ solves that problem above all others. We assume you need the best possible results with the least possible cost, because otherwise you won't appreciate the trade-offs that ØMQ makes. Other than that basic background, we try to present all the concepts in networking and distributed computing you will need to use ØMQ.</para>
</sect1>
<sect1>
<title>Acknowledgements</title>
<para>Thanks to Andy Oram for making <ulink url="http://shop.oreilly.com/product/0636920026136.do">this happen at O'Reilly</ulink> and editing the book.</para>
<para>Thanks to Bill Desmarais, Brian Dorsey, Daniel Lin, Eric Desgranges, Gonzalo Diethelm, Guido Goldstein, Hunter Ford, Kamil Shakirov, Martin Sustrik, Mike Castleman, Naveen Chawla, Nicola Peduzzi, Oliver Smith, Olivier Chamoux, Peter Alexander, Pierre Rouleau, Randy Dryburgh, John Unwin, Alex Thomas, Mihail Minkov, Jeremy Avnet, Michael Compton, Kamil Kisiel, Mark Kharitonov, Guillaume Aubert, Ian Barber, Mike Sheridan, Faruk Akgul, Oleg Sidorov, Lev Givon, Allister MacLeod, Alexander D'Archangel, Andreas Hoelzlwimmer, Han Holl, Robert G. Jakabosky, Felipe Cruz, Marcus McCurdy, Mikhail Kulemin, Dr. Gergö Érdi, Pavel Zhukov, Alexander Else, Giovanni Ruggiero, Rick "Technoweenie", Daniel Lundin, Dave Hoover, Simon Jefford, Benjamin Peterson, Justin Case, Devon Weller, Richard Smith, Alexander Morland, Wadim Grasza, Michael Jakl, Uwe Dauernheim, Sebastian Nowicki, Simone Deponti, Aaron Raddon, Dan Colish, Markus Schirp, Benoit Larroque, Jonathan Palardy, Isaiah Peng, Arkadiusz Orzechowski, Umut Aydin, Matthew Horsfall, Jeremy W. Sherman, Eric Pugh, Tyler Sellon, John E. Vincent, Pavel Mitin, Min RK, Igor Wiedler, Olof Åkesson, Patrick Lucas, Heow Goodman, Senthil Palanisami, John Gallagher, Tomas Roos, Stephen McQuay, Erik Allik, Arnaud Cogoluègnes, Rob Gagnon, Dan Williams, Edward Smith, James Tucker, Kristian Kristensen, Vadim Shalts, Martin Trojer, Tom van Leeuwen, Pandya Hiten, Harm Aarts, Marc Harter, Iskren Ivov Chernev, Jay Han, Sonia Hamilton, Nathan Stocks, Naveen Palli, and Zed Shaw for their contributions.</para>
<para>Thanks to Stathis Sideris for <ulink url="http://www.ditaa.org">Ditaa</ulink>.</para>
</sect1>
</preface>
<chapter id="basics">
<title>Basics</title>
<sect1>
<title>Fixing the World</title>
<para>How to explain ØMQ? Some of us start by saying all the wonderful things it does. <emphasis>It's sockets on steroids. It's like mailboxes with routing. It's fast!</emphasis> Others try to share their moment of enlightenment, that zap-pow-kaboom satori paradigm-shift moment when it all became obvious. <emphasis>Things just become simpler. Complexity goes away. It opens the mind.</emphasis> Others try to explain by comparison. <emphasis>It's smaller, simpler, but still looks familiar.</emphasis> Personally, I like to remember why we made ØMQ at all, because that's most likely where you, the reader, still are today.</para>
<para>Programming is a science dressed up as art, because most of us don't understand the physics of software, and it's rarely if ever taught. The physics of software is not algorithms, data structures, languages and abstractions. These are just tools we make, use, throw away. The real physics of software is the physics of people.</para>
<para>Specifically, our limitations when it comes to complexity, and our desire to work together to solve large problems in pieces. This is the science of programming: make building blocks that people can understand and use <emphasis>easily</emphasis>, and people will work together to solve the very largest problems.</para>
<para>We live in a connected world, and modern software has to navigate this world. So the building blocks for tomorrow's very largest solutions are connected and massively parallel. It's not enough for code to be "strong and silent" any more. Code has to talk to code. Code has to be chatty, sociable, well-connected. Code has to run like the human brain, trillions of individual neurons firing off messages to each other, a massively parallel network with no central control, no single point of failure, yet able to solve immensely difficult problems. And it's no accident that the future of code looks like the human brain, because the endpoints of every network are, at some level, human brains.</para>
<para>If you've done any work with threads, protocols, or networks, you'll realize this is pretty much impossible. It's a dream. Even connecting a few programs across a few sockets is plain nasty when you start to handle real-life situations. Trillions? The cost would be unimaginable. Connecting computers is so difficult that software and services to do this are a multi-billion dollar business.</para>
<para>So we live in a world where the wiring is years ahead of our ability to use it. We had a software crisis in the 1980s, when leading software engineers like Fred Brooks believed <ulink url="http://en.wikipedia.org/wiki/No-Silver-Bullet">there was no "Silver Bullet"</ulink> to "promise even one order of magnitude of improvement in productivity, reliability, or simplicity".</para>
<para>Brooks missed free and open source software, which solved that crisis, enabling us to share knowledge efficiently. Today we face another software crisis, but it's one we don't talk about much. Only the largest, richest firms can afford to create connected applications. There is a cloud, but it's proprietary. Our data, our knowledge is disappearing from our personal computers into clouds that we cannot access, cannot compete with. Who owns our social networks? It is like the mainframe-PC revolution in reverse.</para>
<para>We can leave the political philosophy <ulink url="http://swsi.info">for another book</ulink>. The point is that while the Internet offers the potential of massively connected code, the reality is that this is out of reach for most of us, and so, large interesting problems (in health, education, economics, transport, and so on) remain unsolved because there is no way to connect the code, and thus no way to connect the brains that could work together to solve these problems.</para>
<para>There have been many attempts to solve the challenge of connected software. There are thousands of IETF specifications, each solving part of the puzzle. For application developers, HTTP is perhaps the one solution to have been simple enough to work, but it arguably makes the problem worse, by encouraging developers and architects to think in terms of big servers and thin, stupid clients.</para>
<para>So today people are still connecting applications using raw UDP and TCP, proprietary protocols, HTTP, WebSockets. It remains painful, slow, hard to scale, and essentially centralized. Distributed P2P architectures are mostly for play, not work. How many applications use Skype or BitTorrent to exchange data?</para>
<para>Which brings us back to the science of programming. To fix the world, we needed to do two things. One, to solve the general problem of "how to connect any code to any code, anywhere". Two, to wrap that up in the simplest possible building blocks that people could understand and use <emphasis>easily</emphasis>.</para>
<para>It sounds ridiculously simple. And maybe it is. That's kind of the whole point.</para>
</sect1>
<sect1>
<title>Audience for This Book</title>
<para>We assume you are using the latest 3.2 release of ØMQ. We assume you are using a Linux box or something similar. We assume you can read C code, more or less, since that's the default language for the examples. We assume that when we write constants like PUSH or SUBSCRIBE you can imagine they are really called <literal>ZMQ_PUSH</literal> or <literal>ZMQ_SUBSCRIBE</literal> if the programming language needs it.</para>
</sect1>
<sect1>
<title>Getting the Examples</title>
<para>The Guide examples live in the Guide's <ulink url="https://github.com/imatix/zguide">git repository</ulink>. The simplest way to get all the examples is to clone this repository:</para>
<screen>git clone --depth=1 git://github.com/imatix/zguide.git
</screen>
<para>And then browse the examples subdirectory. You'll find examples by language. If there are examples missing in a language you use, you're encouraged to <ulink url="http://zguide.zeromq.org/main:translate">submit a translation</ulink>. This is how the Guide became so useful, thanks to the work of many people. All examples are licensed under MIT/X11.</para>
</sect1>
<sect1>
<title>Ask and Ye Shall Receive</title>
<para>So let's start with some code. We start of course with a Hello World example. We'll make a client and a server. The client sends "Hello" to the server, which replies with "World"(<xref linkend="figure-1"/>). Here's the server in C, which opens a ØMQ socket on port 5555, reads requests on it, and replies with "World" to each request:</para>
<example id="hwserver-c">
<title>Hello World server (hwserver.c)</title>
<programlisting language="c">
//
// Hello World server
// Binds REP socket to tcp://*:5555
// Expects "Hello" from client, replies with "World"
//
#include <zmq.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>

int main (void)
{
    void *context = zmq_ctx_new ();

    // Socket to talk to clients
    void *responder = zmq_socket (context, ZMQ_REP);
    zmq_bind (responder, "tcp://*:5555");

    while (1) {
        // Wait for next request from client
        zmq_msg_t request;
        zmq_msg_init (&request);
        zmq_msg_recv (&request, responder, 0);
        printf ("Received Hello\n");
        zmq_msg_close (&request);

        // Do some 'work'
        sleep (1);

        // Send reply back to client
        zmq_msg_t reply;
        zmq_msg_init_size (&reply, 5);
        memcpy (zmq_msg_data (&reply), "World", 5);
        zmq_msg_send (&reply, responder, 0);
        zmq_msg_close (&reply);
    }
    // We never get here but if we did, this would be how we end
    zmq_close (responder);
    zmq_ctx_destroy (context);
    return 0;
}
</programlisting>
</example>
<figure id="figure-1">
<title>Request-Reply</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig1.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>The REQ-REP socket pair is in lockstep. The client issues <literal>zmq_msg_send(3)</literal> and then <literal>zmq_msg_recv(3)</literal>, in a loop (or once if that's all it needs). Doing any other sequence (e.g. sending two messages in a row) will result in a return code of -1 from the <literal>send</literal> or <literal>recv</literal> call. Similarly, the service issues <literal>zmq_msg_recv(3)</literal> and then <literal>zmq_msg_send(3)</literal> in that order, as often as it needs to.</para>
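<para>To make the lockstep rule concrete, here is a minimal sketch of the client side as a two-state machine. It is a toy model, not ØMQ code: the names <literal>req_model_t</literal>, <literal>req_model_send</literal>, and <literal>req_model_recv</literal> are invented for illustration, and the only behavior it mirrors is that an out-of-sequence call returns -1.</para>
<programlisting language="c">
#include <assert.h>

// Toy model of the REQ side of a request-reply conversation.
// It is not the 0MQ implementation; it only mirrors the rule
// that sends and receives must strictly alternate.
typedef enum { READY_TO_SEND, AWAITING_REPLY } req_state_t;

typedef struct {
    req_state_t state;
} req_model_t;

void
req_model_init (req_model_t *self)
{
    assert (self);
    self->state = READY_TO_SEND;
}

// Returns 0 if a send is allowed now, -1 if a reply is still pending
int
req_model_send (req_model_t *self)
{
    assert (self);
    if (self->state != READY_TO_SEND)
        return -1;
    self->state = AWAITING_REPLY;
    return 0;
}

// Returns 0 if a receive is allowed now, -1 if nothing was sent yet
int
req_model_recv (req_model_t *self)
{
    assert (self);
    if (self->state != AWAITING_REPLY)
        return -1;
    self->state = READY_TO_SEND;
    return 0;
}
</programlisting>
<para>In this model, calling <literal>req_model_send</literal> twice in a row fails on the second call, and so does calling <literal>req_model_recv</literal> before any send. The service side would be the same machine started in the opposite state.</para>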
<para>ØMQ uses C as its reference language and this is the main language we'll use for examples. If you're reading this on-line, the link below the example takes you to translations into other programming languages. Let's compare the same server in C++:</para>
<example id="hwserver-cpp">
<title>Hello World server (hwserver.cpp)</title>
<programlisting language="cpp">
//
// Hello World server in C++
// Binds REP socket to tcp://*:5555
// Expects "Hello" from client, replies with "World"
//
#include <zmq.hpp>
#include <cstring>
#include <string>
#include <iostream>
#include <unistd.h>

int main () {
    // Prepare our context and socket
    zmq::context_t context (1);
    zmq::socket_t socket (context, ZMQ_REP);
    socket.bind ("tcp://*:5555");

    while (true) {
        zmq::message_t request;

        // Wait for next request from client
        socket.recv (&request);
        std::cout << "Received Hello" << std::endl;

        // Do some 'work'
        sleep (1);

        // Send reply back to client
        zmq::message_t reply (5);
        memcpy ((void *) reply.data (), "World", 5);
        socket.send (reply);
    }
    return 0;
}
</programlisting>
</example>
<para>You can see that the ØMQ API is similar in C and C++. In a language like PHP, we can hide even more and the code becomes even easier to read:</para>
<example id="hwserver-php">
<title>Hello World server (hwserver.php)</title>
<programlisting language="php">
<?php
/*
 * Hello World server
 * Binds REP socket to tcp://*:5555
 * Expects "Hello" from client, replies with "World"
 * @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
 */
$context = new ZMQContext(1);

// Socket to talk to clients
$responder = new ZMQSocket($context, ZMQ::SOCKET_REP);
$responder->bind("tcp://*:5555");

while (true) {
    // Wait for next request from client
    $request = $responder->recv();
    printf ("Received request: [%s]\n", $request);

    // Do some 'work'
    sleep (1);

    // Send reply back to client
    $responder->send("World");
}
</programlisting>
</example>
<para>Here's the client code:</para>
<example id="hwclient-php">
<title>Hello World client (hwclient.php)</title>
<programlisting language="php">
<?php
/*
 * Hello World client
 * Connects REQ socket to tcp://localhost:5555
 * Sends "Hello" to server, expects "World" back
 * @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
 */
$context = new ZMQContext();

// Socket to talk to server
echo "Connecting to hello world server...\n";
$requester = new ZMQSocket($context, ZMQ::SOCKET_REQ);
$requester->connect("tcp://localhost:5555");

for ($request_nbr = 0; $request_nbr != 10; $request_nbr++) {
    printf ("Sending request %d...\n", $request_nbr);
    $requester->send("Hello");

    $reply = $requester->recv();
    printf ("Received reply %d: [%s]\n", $request_nbr, $reply);
}
</programlisting>
</example>
<para>Now this looks too simple to be realistic, but a ØMQ socket is what you get when you take a normal TCP socket, inject it with a mix of radioactive isotopes stolen from a secret Soviet atomic research project, bombard it with 1950-era cosmic rays, and put it into the hands of a drug-addled comic book author with a badly-disguised fetish for bulging muscles clad in spandex(<xref linkend="figure-2"/>). Yes, ØMQ sockets are the world-saving superheroes of the networking world.</para>
<figure id="figure-2">
<title>A terrible accident...</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig2.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>You could throw thousands of clients at this server, all at once, and it would continue to work happily and quickly. For fun, try starting the client and <emphasis>then</emphasis> starting the server, see how it all still works, then think for a second what this means.</para>
<para>Let us explain briefly what these two programs are actually doing. They create a ØMQ context to work with, and a socket. Don't worry what the words mean. You'll pick it up. The server binds its REP (reply) socket to port 5555. The server waits for a request, in a loop, and responds each time with a reply. The client sends a request and reads the reply back from the server.</para>
<para>If you kill the server (Ctrl-C) and restart it, the client won't recover properly. Recovering from crashing processes isn't quite that easy. Making a reliable request-reply flow is complex enough that we won't cover it until Reliable Request-Reply Patterns<xref linkend="reliable-request-reply"/>.</para>
<para>There is a lot happening behind the scenes but what matters to us programmers is how short and sweet the code is, and how often it doesn't crash, even under heavy load. This is the request-reply pattern, probably the simplest way to use ØMQ. It maps to RPC and the classic client-server model.</para>
</sect1>
<sect1>
<title>A Minor Note on Strings</title>
<para>ØMQ doesn't know anything about the data you send except its size in bytes. That means you are responsible for formatting it safely so that applications can read it back. Doing this for objects and complex data types is a job for specialized libraries like Protocol Buffers. But even for strings you need to take care.</para>
<para>In C and some other languages, strings are terminated with a null byte. We could send a string like "Hello" with that extra null byte:</para>
<programlisting language="c">
zmq_msg_init_data (&request, "Hello", 6, NULL, NULL);
</programlisting>
<para>However if you send a string from another language it probably will not include that null byte. For example, when we send that same string in Python, we do this:</para>
<programlisting language="python">
socket.send ("Hello")
</programlisting>
<para>Then what goes onto the wire is a length (one byte for shorter strings) and the string contents, as individual characters(<xref linkend="figure-3"/>).</para>
<figure id="figure-3">
<title>A ØMQ string</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig3.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>And if you read this from a C program, you will get something that looks like a string, and might by accident act like a string (if by luck the five bytes find themselves followed by an innocently lurking null), but isn't a proper string. When your client and server don't agree on the string format, you will get weird results.</para>
<para>When you receive string data from ØMQ, in C, you simply cannot trust that it's safely terminated. Every single time you read a string you should allocate a new buffer with space for an extra byte, copy the string, and terminate it properly with a null.</para>
<para>So let's establish the rule that <emphasis role="bold">ØMQ strings are length-specified, and are sent on the wire <emphasis>without</emphasis> a trailing null</emphasis>. In the simplest case (and we'll do this in our examples) a ØMQ string maps neatly to a ØMQ message frame, which looks like the above figure, a length and some bytes.</para>
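<para>As a rough illustration of that rule, the sketch below builds the layout shown in the figure (one length byte, then the characters, no trailing null) and turns it back into a C string. It is a simplified model using only the standard C library: the function names <literal>frame_encode</literal> and <literal>frame_to_string</literal> are invented here, only strings of up to 255 bytes are handled, and any other framing detail of the real wire protocol is deliberately left out.</para>
<programlisting language="c">
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Simplified model of a short 0MQ string frame: one length byte
// followed by the characters, with no trailing null.
// Writes the frame into 'frame' and returns its total size in bytes.
size_t
frame_encode (const char *string, unsigned char *frame)
{
    size_t length = strlen (string);
    assert (length <= 255);         // One length byte only in this sketch
    frame [0] = (unsigned char) length;
    memcpy (frame + 1, string, length);
    return length + 1;
}

// Copies the frame body into a new, null-terminated C string.
// The caller frees the result; returns NULL if allocation fails.
char *
frame_to_string (const unsigned char *frame)
{
    size_t length = frame [0];
    char *string = malloc (length + 1);
    if (string == NULL)
        return NULL;
    memcpy (string, frame + 1, length);
    string [length] = 0;
    return string;
}
</programlisting>
<para>Encoding "Hello" this way yields six bytes: the length 5 followed by the five characters. Nothing in the frame marks the end of the text except that length, which is why the receiving side has to add the terminator itself.</para>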
<para>Here is what we need to do, in C, to receive a ØMQ string and deliver it to the application as a valid C string:</para>
<programlisting language="c">
// Receive 0MQ string from socket and convert into C string
static char *
s_recv (void *socket) {
    zmq_msg_t message;
    zmq_msg_init (&message);
    int size = zmq_msg_recv (&message, socket, 0);
    if (size == -1)
        return NULL;
    char *string = malloc (size + 1);
    memcpy (string, zmq_msg_data (&message), size);
    zmq_msg_close (&message);
    string [size] = 0;
    return (string);
}
</programlisting>
<para>This makes a very handy helper function and in the spirit of making things we can reuse profitably, let's write a similar 's_send' function that sends strings in the correct ØMQ format, and package this into a header file we can reuse.</para>
<para>The result is <literal>zhelpers.h</literal>, which lets us write sweeter and shorter ØMQ applications in C. It is a fairly long source, and only fun for C developers, so <ulink url="https://github.com/imatix/zguide/blob/master/examples/C/zhelpers.h">read it at leisure</ulink>.</para>
</sect1>
<sect1>
<title>Version Reporting</title>
<para>ØMQ does come in several versions and quite often, if you hit a problem, it'll be something that's been fixed in a later version. So it's a useful trick to know <emphasis>exactly</emphasis> what version of ØMQ you're actually linking with. Here is a tiny program that does that:</para>
<example id="version-php">
<title>ØMQ version reporting (version.php)</title>
<programlisting language="php">
<?php
/* Report 0MQ version
 *
 * @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
 */
if (class_exists("ZMQ") && defined("ZMQ::LIBZMQ_VER")) {
    echo ZMQ::LIBZMQ_VER, PHP_EOL;
}
</programlisting>
</example>
</sect1>
<sect1>
<title>Getting the Message Out</title>
<para>The second classic pattern is one-way data distribution, in which a server pushes updates to a set of clients. Let's see an example that pushes out weather updates consisting of a zip code, temperature, and relative humidity. We'll generate random values, just like the real weather stations do.</para>
<para>Here's the server. We'll use port 5556 for this application:</para>
<example id="wuserver-php">
<title>Weather update server (wuserver.php)</title>
<programlisting language="php">
<?php
/*
 * Weather update server
 * Binds PUB socket to tcp://*:5556
 * Publishes random weather updates
 * @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
 */
// Prepare our context and publisher
$context = new ZMQContext();
$publisher = $context->getSocket(ZMQ::SOCKET_PUB);
$publisher->bind("tcp://*:5556");
$publisher->bind("ipc://weather.ipc");

while (true) {
    // Get values that will fool the boss
    // mt_rand() includes its upper bound, so stop at 99999 to keep five digits
    $zipcode = mt_rand(0, 99999);
    $temperature = mt_rand(-80, 135);
    $relhumidity = mt_rand(10, 60);

    // Send message to all subscribers
    $update = sprintf ("%05d %d %d", $zipcode, $temperature, $relhumidity);
    $publisher->send($update);
}
</programlisting>
</example>
<para>There's no start and no end to this stream of updates; it's like a never-ending broadcast (<xref linkend="figure-4"/>).</para>
<figure id="figure-4">
<title>Publish-Subscribe</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig4.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>Here is the client application, which listens to the stream of updates and grabs anything to do with a specified zip code, by default New York City because that's a great place to start any adventure:</para>
<example id="wuclient-php">
<title>Weather update client (wuclient.php)</title>
<programlisting language="php">
<?php
/*
* Weather update client
* Connects SUB socket to tcp://localhost:5556
* Collects weather updates and finds avg temp in zipcode
* @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
*/
$context = new ZMQContext();
// Socket to talk to server
echo "Collecting updates from weather server...", PHP_EOL;
$subscriber = new ZMQSocket($context, ZMQ::SOCKET_SUB);
$subscriber->connect("tcp://localhost:5556");
// Subscribe to zipcode, default is NYC, 10001
$filter = $_SERVER['argc'] > 1 ? $_SERVER['argv'][1] : "10001";
$subscriber->setSockOpt(ZMQ::SOCKOPT_SUBSCRIBE, $filter);
// Process 100 updates
$total_temp = 0;
for ($update_nbr = 0; $update_nbr < 100; $update_nbr++) {
$string = $subscriber->recv();
sscanf ($string, "%d %d %d", $zipcode, $temperature, $relhumidity);
$total_temp += $temperature;
}
printf ("Average temperature for zipcode '%s' was %dF\n",
$filter, (int) ($total_temp / $update_nbr));
</programlisting>
</example>
<para>Note that when you use a SUB socket you <emphasis role="bold">must</emphasis> set a subscription using <literal>zmq_setsockopt[3]</literal> and SUBSCRIBE, as in this code. If you don't set any subscription, you won't get any messages. It's a common mistake for beginners. The subscriber can set many subscriptions, which are added together. That is, if an update matches ANY subscription, the subscriber receives it. The subscriber can also cancel specific subscriptions. A subscription is often but not necessarily a printable string. See <literal>zmq_setsockopt[3]</literal> for how this works.</para>
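<para>To make the matching rule concrete, here is a minimal sketch in plain C of how subscriptions combine. It is a model of the behavior, not ØMQ's own code: it assumes a subscription matches when the message starts with it, and that an empty subscription matches everything. The names <literal>sub_prefix_matches</literal> and <literal>sub_is_subscribed</literal> are made up for this illustration.</para>
<programlisting language="c">
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

//  Hypothetical helper, not part of the ZeroMQ API: true if the
//  message starts with the subscription prefix. A prefix of length
//  zero matches every message.
bool
sub_prefix_matches (const char *message, size_t message_size,
                    const char *prefix, size_t prefix_size)
{
    assert (message && prefix);
    return prefix_size <= message_size
        && memcmp (message, prefix, prefix_size) == 0;
}

//  Subscriptions are added together: a message gets through if it
//  matches ANY of them. With no subscriptions, nothing gets through.
bool
sub_is_subscribed (const char *message,
                   const char *subscriptions [], size_t count)
{
    assert (message);
    for (size_t index = 0; index < count; index++) {
        const char *prefix = subscriptions [index];
        if (sub_prefix_matches (message, strlen (message),
                                prefix, strlen (prefix)))
            return true;
    }
    return false;
}
</programlisting>
<para>With the subscriptions "10001" and "90210", the update "10001 72 40" gets through and "20002 50 20" does not. Because the match is on a prefix, a shorter subscription such as "1000" would also accept every zip code that starts with those four digits.</para>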
<para>The PUB-SUB socket pair is asynchronous. The client does <literal>zmq_msg_recv[3]</literal>, in a loop (or once if that's all it needs). Trying to send a message to a SUB socket will cause an error. Similarly, the service does <literal>zmq_msg_send[3]</literal> as often as it needs to, but must not do <literal>zmq_msg_recv[3]</literal> on a PUB socket.</para>
<para>In theory with ØMQ sockets, it does not matter which end connects, and which end binds. However in practice there are undocumented differences that I'll come to later. For now, bind the PUB and connect the SUB, unless your network design makes that impossible.</para>
<para>There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, <emphasis role="bold">the subscriber will always miss the first messages that the publisher sends</emphasis>. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.</para>
<para>This "slow joiner" symptom hits enough people, often enough, that we're going to explain it in detail. Remember that ØMQ does asynchronous I/O, i.e. in the background. Say you have two nodes doing this, in this order:</para>
<itemizedlist>
<listitem><para>Subscriber connects to an endpoint and receives and counts messages.</para></listitem>
<listitem><para>Publisher binds to an endpoint and immediately sends 1,000 messages.</para></listitem>
</itemizedlist>
<para>Then the subscriber will most likely not receive anything. You'll blink, check that you set a correct filter, and try again, and the subscriber will still not receive anything.</para>
<para>Making a TCP connection involves to and fro handshaking that takes several milliseconds depending on your network and the number of hops between peers. In that time, ØMQ can send very many messages. For sake of argument assume it takes 5 msecs to establish a connection, and that same link can handle 1M messages per second. During the 5 msecs that the subscriber is connecting to the publisher, it takes the publisher only 1 msec to send out those 1K messages.</para>
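<para>The arithmetic in that argument can be written out as a small sketch. The function names are hypothetical, and the model assumes the publisher starts sending at the very moment the subscriber begins to connect:</para>
<programlisting language="c">
#include <assert.h>

//  Messages the publisher can push out while a subscriber is still
//  connecting, given the connect time and the link's message rate.
long
sent_during_connect (long connect_msec, long msgs_per_sec)
{
    assert (connect_msec >= 0 && msgs_per_sec >= 0);
    return connect_msec * msgs_per_sec / 1000;
}

//  How many messages of a batch the late subscriber misses: the
//  whole batch if it fits into the connect window, otherwise as many
//  as the publisher sent during that window.
long
missed_messages (long batch_size, long connect_msec, long msgs_per_sec)
{
    assert (batch_size >= 0);
    long capacity = sent_during_connect (connect_msec, msgs_per_sec);
    return batch_size < capacity ? batch_size : capacity;
}
</programlisting>
<para>With the figures above, <literal>sent_during_connect (5, 1000000)</literal> comes to 5,000 messages, so <literal>missed_messages (1000, 5, 1000000)</literal> is the full 1,000: the whole batch is gone before the subscriber is ready.</para>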
<para>In Sockets and Patterns<xref linkend="sockets-and-patterns"/> we'll explain how to synchronize a publisher and subscribers so that you don't start to publish data until the subscriber(s) really are connected and ready. There is a simple and stupid way to delay the publisher, which is to sleep. Don't do this in a real application, though, because it is extremely fragile as well as inelegant and slow. Use sleeps to prove to yourself what's happening, and then wait for Sockets and Patterns<xref linkend="sockets-and-patterns"/> to see how to do this right.</para>
<para>The alternative to synchronization is to simply assume that the published data stream is infinite and has no start, and no end. One also assumes that the subscriber doesn't care what transpired before it started up. This is how we built our weather client example.</para>
<para>So the client subscribes to its chosen zip code and collects 100 updates for that zip code. That means about ten million updates from the server, if zip codes are randomly distributed over the 100,000 possible values. You can start the client, and then the server, and the client will keep working. You can stop and restart the server as often as you like, and the client will keep working. When the client has collected its hundred updates, it calculates the average, prints it, and exits.</para>
<para>Some points about the publish-subscribe pattern:</para>
<itemizedlist>
<listitem><para>A subscriber can connect to more than one publisher, using one 'connect' call each time. Data will then arrive and be interleaved ("fair-queued") so that no single publisher drowns out the others.</para></listitem>
<listitem><para>If a publisher has no connected subscribers, then it will simply drop all messages.</para></listitem>
<listitem><para>If you're using TCP, and a subscriber is slow, messages will queue up on the publisher. We'll look at how to protect publishers against this, using the "high-water mark" later.</para></listitem>
<listitem><para>From ØMQ 3.x, filtering happens at the publisher side, when using a connected protocol (<literal>tcp://</literal> or <literal>ipc://</literal>). Using the <literal>epgm://</literal> protocol, filtering happens at the subscriber side. In ØMQ/2.x, all filtering happened at the subscriber side.</para></listitem>
</itemizedlist>
<para>This is how long it takes to receive and filter 10M messages on my laptop, which is a 2011-era Intel i7, fast but nothing special:</para>
<screen>ph@nb201103:~/work/git/zguide/examples/c$ time wuclient
Collecting updates from weather server...
Average temperature for zipcode '10001 ' was 28F
real 0m4.470s
user 0m0.000s
sys 0m0.008s
</screen>
</sect1>
<sect1>
<title>Divide and Conquer</title>
<para>As a final example (you are surely getting tired of juicy code and want to delve back into philological discussions about comparative abstractive norms), let's do a little supercomputing. Then coffee. Our supercomputing application is a fairly typical parallel processing model(<xref linkend="figure-5"/>). We have:</para>
<itemizedlist>
<listitem><para>A ventilator that produces tasks that can be done in parallel</para></listitem>
<listitem><para>A set of workers that process tasks</para></listitem>
<listitem><para>A sink that collects results back from the worker processes</para></listitem>
</itemizedlist>
<para>In reality, workers run on superfast boxes, perhaps using GPUs (graphics processing units) to do the hard math. Here is the ventilator. It generates 100 tasks, each a message telling the worker to sleep for some number of milliseconds:</para>
<example id="taskvent-php">
<title>Parallel task ventilator (taskvent.php)</title>
<programlisting language="php">
<?php
/*
* Task ventilator
* Binds PUSH socket to tcp://localhost:5557
* Sends batch of tasks to workers via that socket
* @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
*/
$context = new ZMQContext();
// Socket to send messages on
$sender = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
$sender->bind("tcp://*:5557");
echo "Press Enter when the workers are ready: ";
$fp = fopen('php://stdin', 'r');
$line = fgets($fp, 512);
fclose($fp);
echo "Sending tasks to workers...", PHP_EOL;
// The first message is "0" and signals start of batch
$sender->send(0);
// Send 100 tasks
$total_msec = 0; // Total expected cost in msecs
for ($task_nbr = 0; $task_nbr < 100; $task_nbr++) {
// Random workload from 1 to 100msecs
$workload = mt_rand(1, 100);
$total_msec += $workload;
$sender->send($workload);
}
printf ("Total expected cost: %d msec\n", $total_msec);
sleep (1); // Give 0MQ time to deliver
</programlisting>
</example>
<figure id="figure-5">
<title>Parallel Pipeline</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig5.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>Here is the worker application. It receives a message, sleeps for that number of milliseconds, then signals that it's finished:</para>
<example id="taskwork-php">
<title>Parallel task worker (taskwork.php)</title>
<programlisting language="php">
<?php
/*
* Task worker
* Connects PULL socket to tcp://localhost:5557
* Collects workloads from ventilator via that socket
* Connects PUSH socket to tcp://localhost:5558
* Sends results to sink via that socket
* @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
*/
$context = new ZMQContext();
// Socket to receive messages on
$receiver = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$receiver->connect("tcp://localhost:5557");
// Socket to send messages to
$sender = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
$sender->connect("tcp://localhost:5558");
// Process tasks forever
while (true) {
$string = $receiver->recv();
// Simple progress indicator for the viewer
echo $string, PHP_EOL;
// Do the work
usleep($string * 1000);
// Send results to sink
$sender->send("");
}
</programlisting>
</example>
<para>Here is the sink application. It collects the 100 tasks, then calculates how long the overall processing took, so we can confirm that the workers really were running in parallel, if there are more than one of them:</para>
<example id="tasksink-php">
<title>Parallel task sink (tasksink.php)</title>
<programlisting language="php">
<?php
/*
* Task sink
* Binds PULL socket to tcp://localhost:5558
* Collects results from workers via that socket
* @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
*/
// Prepare our context and socket
$context = new ZMQContext();
$receiver = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$receiver->bind("tcp://*:5558");
// Wait for start of batch
$string = $receiver->recv();
// Start our clock now
$tstart = microtime(true);
// Process 100 confirmations
$total_msec = 0; // Total calculated cost in msecs
for ($task_nbr = 0; $task_nbr < 100; $task_nbr++) {
$string = $receiver->recv();
if($task_nbr % 10 == 0) {
echo ":";
} else {
echo ".";
}
}
$tend = microtime(true);
$total_msec = ($tend - $tstart) * 1000;
echo PHP_EOL;
printf ("Total elapsed time: %d msec", $total_msec);
echo PHP_EOL;
</programlisting>
</example>
<para>The average cost of a batch is 5 seconds (100 tasks averaging about 50 msec each). When we start 1, 2, or 4 workers we get results like this from the sink:</para>
<screen># 1 worker
Total elapsed time: 5034 msec
# 2 workers
Total elapsed time: 2421 msec
# 4 workers
Total elapsed time: 1018 msec
</screen>
<para>Let's look at some aspects of this code in more detail:</para>
<itemizedlist>
<listitem><para>The workers connect upstream to the ventilator, and downstream to the sink. This means you can add workers arbitrarily. If the workers bound to their endpoints, you would need (a) more endpoints and (b) to modify the ventilator and/or the sink each time you added a worker. We say that the ventilator and sink are 'stable' parts of our architecture and the workers are 'dynamic' parts of it.</para></listitem>
<listitem><para>We have to synchronize the start of the batch with all workers being up and running. This is a fairly common gotcha in ØMQ and there is no easy solution. The 'connect' method takes a certain time. So when a set of workers connect to the ventilator, the first one to successfully connect will get a whole load of messages in that short time while the others are also connecting. If you don't synchronize the start of the batch somehow, the system won't run in parallel at all. Try removing the wait, and see.</para></listitem>
<listitem><para>The ventilator's PUSH socket distributes tasks to workers (assuming they are all connected <emphasis>before</emphasis> the batch starts going out) evenly. This is called <emphasis>load-balancing</emphasis> and it's something we'll look at again in more detail.</para></listitem>
<listitem><para>The sink's PULL socket collects results from workers evenly. This is called <emphasis>fair-queuing</emphasis>(<xref linkend="figure-6"/>).</para></listitem>
</itemizedlist>
<figure id="figure-6">
<title>Fair Queuing</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig6.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>The pipeline pattern also exhibits the "slow joiner" syndrome, leading to accusations that PUSH sockets don't load balance properly. If you are using PUSH and PULL, and one of your workers gets way more messages than the others, it's because that PULL socket has joined faster than the others, and grabs a lot of messages before the others manage to connect.</para>
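<para>A small model shows why the number of workers connected at the start matters so much. This sketch assumes the PUSH socket deals tasks out in strict rotation to whichever workers are connected when the batch goes out, and that each worker processes its tasks one after another. The function <literal>round_robin_elapsed</literal> and the <literal>MAX_WORKERS</literal> limit are invented for the illustration.</para>
<programlisting language="c">
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS 64

//  Deal tasks out to workers in strict rotation and return the time,
//  in msec, at which the busiest worker finishes. workloads holds
//  task_count entries; worker_count must be 1..MAX_WORKERS.
long
round_robin_elapsed (const int *workloads, size_t task_count,
                     size_t worker_count)
{
    assert (workloads);
    assert (worker_count >= 1 && worker_count <= MAX_WORKERS);

    //  Total work handed to each worker
    long busy_msec [MAX_WORKERS] = { 0 };
    for (size_t task = 0; task < task_count; task++)
        busy_msec [task % worker_count] += workloads [task];

    //  The batch is done when the most loaded worker is done
    long elapsed = 0;
    for (size_t worker = 0; worker < worker_count; worker++)
        if (busy_msec [worker] > elapsed)
            elapsed = busy_msec [worker];
    return elapsed;
}
</programlisting>
<para>For workloads of 10, 20, 30, and 40 msec, the model gives 100 msec with one worker, 60 msec with two, and 40 msec with four. If four workers are running but only one had connected when the batch went out, the effective worker count is one, and the batch takes the full 100 msec.</para>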
</sect1>
<sect1>
<title>Programming with ØMQ</title>
<para>Having seen some examples, you're eager to start using ØMQ in some apps. Before you start that, take a deep breath, chillax, and reflect on some basic advice that will save you stress and confusion.</para>
<itemizedlist>
<listitem><para>Learn ØMQ step by step. It's just one simple API but it hides a world of possibilities. Take the possibilities slowly, master each one.</para></listitem>
<listitem><para>Write nice code. Ugly code hides problems and makes it hard for others to help you. You might get used to meaningless variable names, but people reading your code won't. Use names that are real words, that say something other than "I'm too careless to tell you what this variable is really for". Use consistent indentation, clean layout. Write nice code and your world will be more comfortable.</para></listitem>
<listitem><para>Test what you make as you make it. When your program doesn't work, you should know what five lines are to blame. This is especially true when you do ØMQ magic, which just <emphasis>won't</emphasis> work the first few times you try it.</para></listitem>
<listitem><para>When you find that things don't work as expected, break your code into pieces, test each one, see which one is not working. ØMQ lets you make essentially modular code, use that to your advantage.</para></listitem>
<listitem><para>Make abstractions (classes, methods, whatever) as you need them. If you copy/paste a lot of code you're going to copy/paste errors too.</para></listitem>
</itemizedlist>
<para>To illustrate, here is a fragment of code someone asked me to help fix:</para>
<programlisting language="c">
// NOTE: do NOT reuse this example code!
static char *topic_str = "msg.x|";
void* pub_worker(void* arg){
    void *ctx = arg;
    assert(ctx);
    void *qskt = zmq_socket(ctx, ZMQ_REP);
    assert(qskt);
    int rc = zmq_connect(qskt, "inproc://querys");
    assert(rc == 0);
    void *pubskt = zmq_socket(ctx, ZMQ_PUB);
    assert(pubskt);
    rc = zmq_bind(pubskt, "inproc://publish");
    assert(rc == 0);
    uint8_t cmd;
    uint32_t nb;
    zmq_msg_t topic_msg, cmd_msg, nb_msg, resp_msg;
    zmq_msg_init_data(&topic_msg, topic_str, strlen(topic_str) , NULL, NULL);
    fprintf(stdout,"WORKER: ready to receive messages\n");
    // NOTE: do NOT reuse this example code, It's broken.
    // e.g. topic_msg will be invalid the second time through
    while (1){
        zmq_msg_send(pubskt, &topic_msg, ZMQ_SNDMORE);
        zmq_msg_init(&cmd_msg);
        zmq_msg_recv(qskt, &cmd_msg, 0);
        memcpy(&cmd, zmq_msg_data(&cmd_msg), sizeof(uint8_t));
        zmq_msg_send(pubskt, &cmd_msg, ZMQ_SNDMORE);
        zmq_msg_close(&cmd_msg);
        fprintf(stdout, "received cmd %u\n", cmd);
        zmq_msg_init(&nb_msg);
        zmq_msg_recv(qskt, &nb_msg, 0);
        memcpy(&nb, zmq_msg_data(&nb_msg), sizeof(uint32_t));
        zmq_msg_send(pubskt, &nb_msg, 0);
        zmq_msg_close(&nb_msg);
        fprintf(stdout, "received nb %u\n", nb);
        zmq_msg_init_size(&resp_msg, sizeof(uint8_t));
        memset(zmq_msg_data(&resp_msg), 0, sizeof(uint8_t));
        zmq_msg_send(qskt, &resp_msg, 0);
        zmq_msg_close(&resp_msg);
    }
    return NULL;
}
</programlisting>
<para>This is what I rewrote it to, as part of finding the bug:</para>
<programlisting language="c">
static void *
worker_thread (void *arg) {
    void *context = arg;
    void *worker = zmq_socket (context, ZMQ_REP);
    assert (worker);
    int rc;
    rc = zmq_connect (worker, "ipc://worker");
    assert (rc == 0);
    void *broadcast = zmq_socket (context, ZMQ_PUB);
    assert (broadcast);
    rc = zmq_bind (broadcast, "ipc://publish");
    assert (rc == 0);
    while (1) {
        char *part1 = s_recv (worker);
        char *part2 = s_recv (worker);
        printf ("Worker got [%s][%s]\n", part1, part2);
        s_sendmore (broadcast, "msg");
        s_sendmore (broadcast, part1);
        s_send (broadcast, part2);
        free (part1);
        free (part2);
        s_send (worker, "OK");
    }
    return NULL;
}
</programlisting>
<para>In the end, the problem was that the application was passing sockets between threads, which crashes weirdly. Sockets are not threadsafe. It became legal behavior to migrate sockets from one thread to another in ØMQ/2.1, but this remains dangerous unless you use a "full memory barrier". If you don't know what that means, don't attempt socket migration.</para>
<sect2>
<title>Getting the Context Right</title>
<para>ØMQ applications always start by creating a <emphasis>context</emphasis>, and then using that for creating sockets. In C, it's the <literal>zmq_ctx_new[3]</literal> call. You should create and use exactly one context in your process. Technically, the context is the container for all sockets in a single process, and acts as the transport for <literal>inproc</literal> sockets, which are the fastest way to connect threads in one process. If at runtime a process has two contexts, these are like separate ØMQ instances. If that's explicitly what you want, OK, but otherwise remember:</para>
<para><emphasis role="bold">Do one <literal>zmq_ctx_new[3]</literal> at the start of your main line code, and one <literal>zmq_ctx_destroy[3]</literal> at the end.</emphasis></para>
<para>If you're using the <literal>fork()</literal> system call, each process needs its own context. Do <literal>zmq_ctx_new[3]</literal> after the <literal>fork()</literal>, at the beginning of the child process code, rather than in the main process beforehand: the context's background I/O threads are not carried across a fork, so a context created before it is not usable in the child. In general you want to do the interesting stuff in the child processes, and just manage these from the parent process.</para>
</sect2>
<sect2>
<title>Making a Clean Exit</title>
<para>Classy programmers share the same motto as classy hit men: always clean-up when you finish the job. When you use ØMQ in a language like Python, stuff gets automatically freed for you. But when using C you have to carefully free objects when you're finished with them, or you get memory leaks, unstable applications, and generally bad karma.</para>
<para>Memory leaks are one thing, but ØMQ is quite finicky about how you exit an application. The reasons are technical and painful but the upshot is that if you leave any sockets open, the <literal>zmq_ctx_destroy[3]</literal> function will hang forever. And even if you close all sockets, <literal>zmq_ctx_destroy[3]</literal> will by default wait forever if there are pending connects or sends, unless you set the LINGER to zero on those sockets before closing them.</para>
<para>The ØMQ objects we need to worry about are messages, sockets, and contexts. Luckily it's quite simple, at least in simple programs:</para>
<itemizedlist>
<listitem><para>Always close a message the moment you are done with it, using <literal>zmq_msg_close[3]</literal>.</para></listitem>
<listitem><para>If you are opening and closing a lot of sockets, that's probably a sign you need to redesign your application.</para></listitem>
<listitem><para>When you exit the program, close your sockets and then call <literal>zmq_ctx_destroy[3]</literal>. This destroys the context.</para></listitem>
</itemizedlist>
<para>This is at least for C development. In a language with automatic object destruction, sockets and contexts will be destroyed as you leave the scope. If you use exceptions you'll have to do the clean-up in something like a "finally" block, the same as for any resource.</para>
<para>If you're doing multithreaded work, it gets rather more complex than this. We'll get to multithreading in the next chapter, but because some of you will, despite warnings, try to run before you can safely walk, below is the quick and dirty guide to making a clean exit in a <emphasis>multithreaded</emphasis> ØMQ application.</para>
<para>First, do not try to use the same socket from multiple threads. No, don't explain why you think this would be excellent fun, just please don't do it. Next, you need to shut down each socket that has ongoing requests. The proper way is to set a low LINGER value (1 second), then close the socket. If your language binding doesn't do this for you automatically when you destroy a context, I'd suggest sending a patch.</para>
<para>Finally, destroy the context. This will cause any blocking receives or polls or sends in attached threads (i.e. which share the same context) to return with an error. Catch that error, and then set linger on, and close sockets in <emphasis>that</emphasis> thread, and exit. Do not destroy the same context twice. The <literal>zmq_ctx_destroy[3]</literal> call in the main thread will block until all sockets it knows about are safely closed.</para>
<para>Voila! It's complex and painful enough that any language binding author worth his or her salt will do this automatically and make the socket closing dance unnecessary.</para>
</sect2>
</sect1>
<sect1>
<title>Why We Needed ØMQ</title>
<para>Now that you've seen ØMQ in action, let's go back to the "why".</para>
<para>Many applications these days consist of components that stretch across some kind of network, either a LAN or the Internet. So many application developers end up doing some kind of messaging. Some developers use message queuing products, but most of the time they do it themselves, using TCP or UDP. These protocols are not hard to use, but there is a great difference between sending a few bytes from A to B, and doing messaging in any kind of reliable way.</para>
<para>Let's look at the typical problems we face when we start to connect pieces using raw TCP. Any reusable messaging layer would need to solve all or most of these:</para>
<itemizedlist>
<listitem><para>How do we handle I/O? Does our application block, or do we handle I/O in the background? This is a key design decision. Blocking I/O creates architectures that do not scale well. But background I/O can be very hard to do right.</para></listitem>
<listitem><para>How do we handle dynamic components, i.e. pieces that go away temporarily? Do we formally split components into "clients" and "servers" and mandate that servers cannot disappear? What then if we want to connect servers to servers? Do we try to reconnect every few seconds?</para></listitem>
<listitem><para>How do we represent a message on the wire? How do we frame data so it's easy to write and read, safe from buffer overflows, efficient for small messages, yet adequate for the very largest videos of dancing cats wearing party hats?</para></listitem>
<listitem><para>How do we handle messages that we can't deliver immediately? Particularly, if we're waiting for a component to come back on-line? Do we discard messages, put them into a database, or into a memory queue?</para></listitem>
<listitem><para>Where do we store message queues? What happens if the component reading from a queue is very slow, and causes our queues to build up? What's our strategy then?</para></listitem>
<listitem><para>How do we handle lost messages? Do we wait for fresh data, request a resend, or do we build some kind of reliability layer that ensures messages cannot be lost? What if that layer itself crashes?</para></listitem>
<listitem><para>What if we need to use a different network transport, say, multicast instead of TCP unicast? Or IPv6? Do we need to rewrite the applications, or is the transport abstracted in some layer?</para></listitem>
<listitem><para>How do we route messages? Can we send the same message to multiple peers? Can we send replies back to an original requester?</para></listitem>
<listitem><para>How do we write an API for another language? Do we re-implement a wire-level protocol or do we repackage a library? If the former, how can we guarantee efficient and stable stacks? If the latter, how can we guarantee interoperability?</para></listitem>
<listitem><para>How do we represent data so that it can be read between different architectures? Do we enforce a particular encoding for data types? How far is this the job of the messaging system rather than a higher layer?</para></listitem>
<listitem><para>How do we handle network errors? Do we wait and retry, ignore them silently, or abort?</para></listitem>
</itemizedlist>
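<para>To get a feel for how much care even one item on this list takes, here is a minimal sketch of message framing in plain C. It assumes a made-up wire format, a 4-byte big-endian length followed by the body, which is not the framing ØMQ itself uses. The function names are hypothetical. Both functions refuse to read or write past the end of the buffer they are given.</para>
<programlisting language="c">
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

//  Write one frame: a 4-byte big-endian length, then the body.
//  Returns the number of bytes written, or 0 if the frame does not
//  fit into 'out'.
size_t
frame_encode (uint8_t *out, size_t out_size,
              const void *body, uint32_t body_size)
{
    assert (out && (body || body_size == 0));
    if (out_size < 4 || out_size - 4 < body_size)
        return 0;
    out [0] = (uint8_t) (body_size >> 24);
    out [1] = (uint8_t) (body_size >> 16);
    out [2] = (uint8_t) (body_size >> 8);
    out [3] = (uint8_t) body_size;
    if (body_size)
        memcpy (out + 4, body, body_size);
    return 4 + (size_t) body_size;
}

//  Read one frame from 'in'. On success, points *body at the payload
//  inside 'in', sets *body_size, and returns the bytes consumed.
//  Returns 0 if 'in' does not yet hold a complete frame.
size_t
frame_decode (const uint8_t *in, size_t in_size,
              const uint8_t **body, uint32_t *body_size)
{
    assert (in && body && body_size);
    if (in_size < 4)
        return 0;
    uint32_t size = ((uint32_t) in [0] << 24) | ((uint32_t) in [1] << 16)
                  | ((uint32_t) in [2] << 8)  |  (uint32_t) in [3];
    if (in_size - 4 < size)
        return 0;
    *body = in + 4;
    *body_size = size;
    return 4 + (size_t) size;
}
</programlisting>
<para>Encoding the five bytes of "hello" produces a 9-byte frame. Handing the decoder only the first 8 of those bytes returns 0, which tells the caller to wait for more data rather than read a partial message. And this still leaves every other question on the list unanswered.</para>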
<para>Take a typical open source project like <ulink url="http://hadoop.apache.org/zookeeper/">Hadoop Zookeeper</ulink> and read the C API code in <ulink url="http://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c">src/c/src/zookeeper.c</ulink>. As I write this, in 2010, the code is 3,200 lines of mystery and in there is an undocumented, client-server network communication protocol. I see it's efficient because it uses poll() instead of select(). But really, Zookeeper should be using a generic messaging layer and an explicitly documented wire level protocol. It is incredibly wasteful for teams to be building this particular wheel over and over.</para>
<para>But how to make a reusable messaging layer? Why, when so many projects need this technology, are people still doing it the hard way, by driving TCP sockets in their code, and solving the problems in that long list, over and over(<xref linkend="figure-7"/>)?</para>
<figure id="figure-7">
<title>Messaging as it Starts</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig7.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>It turns out that building reusable messaging systems is really difficult, which is why few FOSS projects ever tried, and why commercial messaging products are complex, expensive, inflexible, and brittle. In 2006 iMatix designed <ulink url="http://www.amqp.org">AMQP</ulink> which started to give FOSS developers perhaps the first reusable recipe for a messaging system. AMQP works better than many other designs <ulink url="http://www.imatix.com/articles:whats-wrong-with-amqp">but remains relatively complex, expensive, and brittle</ulink>. It takes weeks to learn to use, and months to create stable architectures that don't crash when things get hairy.</para>
<para>Most messaging projects, like AMQP, that try to solve this long list of problems in a reusable way do so by inventing a new concept, the "broker", that does addressing, routing, and queuing. This results in a client-server protocol or a set of APIs on top of some undocumented protocol, that let applications speak to this broker. Brokers are an excellent thing in reducing the complexity of large networks. But adding broker-based messaging to a product like Zookeeper would make it worse, not better. It would mean adding an additional big box, and a new single point of failure. A broker rapidly becomes a bottleneck and a new risk to manage. If the software supports it, we can add a second, third, fourth broker and make some fail-over scheme. People do this. It creates more moving pieces, more complexity, more things to break.</para>
<para>And a broker-centric set-up needs its own operations team. You literally need to watch the brokers day and night, and beat them with a stick when they start misbehaving. You need boxes, and you need backup boxes, and you need people to manage those boxes. It is only worth doing for large applications with many moving pieces, built by several teams of people, over several years.</para>
<para>So small to medium application developers are trapped. Either they avoid network programming, and make monolithic applications that do not scale. Or they jump into network programming and make brittle, complex applications that are hard to maintain. Or they bet on a messaging product, and end up with scalable applications that depend on expensive, easily broken technology. There has been no really good choice, which is maybe why messaging is largely stuck in the last century and stirs strong emotions. Negative ones for users, gleeful joy for those selling support and licenses(<xref linkend="figure-8"/>).</para>
<figure id="figure-8">
<title>Messaging as it Becomes</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/fig8.eps" format="EPS" width="4.8in"/>
</imageobject>
</mediaobject>
</figure>
<para>What we need is something that does the job of messaging but does it in such a simple and cheap way that it can work in any application, with close to zero cost. It should be a library that you just link with, without any other dependencies. No additional moving pieces, so no additional risk. It should run on any OS and work with any programming language.</para>
<para>And this is ØMQ: an efficient, embeddable library that solves most of the problems an application needs to become nicely elastic across a network, without much cost.</para>
<para>Specifically:</para>
<itemizedlist>
<listitem><para>It handles I/O asynchronously, in background threads. These communicate with application threads using lock-free data structures, so concurrent ØMQ applications need no locks, semaphores, or other wait states.</para></listitem>
<listitem><para>Components can come and go dynamically and ØMQ will automatically reconnect. This means you can start components in any order. You can create "service-oriented architectures" (SOAs) where services can join and leave the network at any time.</para></listitem>
<listitem><para>It queues messages automatically when needed. It does this intelligently, pushing messages as close as possible to the receiver before queuing them.</para></listitem>
<listitem><para>It has ways of dealing with over-full queues (called "high water mark"). When a queue is full, ØMQ automatically blocks senders, or throws away messages, depending on the kind of messaging you are doing (the so-called "pattern").</para></listitem>
<listitem><para>It lets your applications talk to each other over arbitrary transports: TCP, multicast, in-process, inter-process. You don't need to change your code to use a different transport.</para></listitem>
<listitem><para>It handles slow/blocked readers safely, using different strategies that depend on the messaging pattern.</para></listitem>
<listitem><para>It lets you route messages using a variety of patterns such as request-reply and publish-subscribe. These patterns are how you create the topology, the structure of your network.</para></listitem>
<listitem><para>It lets you create proxies to queue, forward, or capture messages with a single call. Proxies can reduce the interconnection complexity of a network.</para></listitem>
<listitem><para>It delivers whole messages exactly as they were sent, using a simple framing on the wire. If you write a 10k message, you will receive a 10k message.</para></listitem>
<listitem><para>It does not impose any format on messages. They are blobs of anywhere from zero bytes to gigabytes in size. When you want to represent data you choose some other product on top, such as Google's protocol buffers, XDR, and others.</para></listitem>
<listitem><para>It handles network errors intelligently. Sometimes it retries, sometimes it tells you an operation failed.</para></listitem>
<listitem><para>It reduces your carbon footprint. Doing more with less CPU means your boxes use less power, and you can keep your old boxes in use for longer. Al Gore would love ØMQ.</para></listitem>
</itemizedlist>
<para>Actually ØMQ does rather more than this. It has a subversive effect on how you develop network-capable applications. Superficially it's a socket-inspired API on which you do <literal>zmq_msg_recv[3]</literal> and <literal>zmq_msg_send[3]</literal>. But message processing rapidly becomes the central loop, and your application soon breaks down into a set of message processing tasks. It is elegant and natural. And it scales: each of these tasks maps to a node, and the nodes talk to each other across arbitrary transports. Two nodes in one process (node is a thread), two nodes on one box (node is a process), or two boxes on one network (node is a box) - it's all the same, with no application code changes.</para>
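<para>The "high water mark" item in the list above is easier to picture with a toy model. This sketch is not ØMQ's implementation: it only counts messages instead of storing them, and the type and function names are invented. It assumes two policies for a full queue, dropping the message (as a publisher does) or telling the sender to wait (as a PUSH socket does).</para>
<programlisting language="c">
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { ON_FULL_DROP, ON_FULL_BLOCK } full_policy_t;
typedef enum { SEND_QUEUED, SEND_DROPPED, SEND_WOULD_BLOCK } send_result_t;

typedef struct {
    size_t high_water_mark;     //  Most messages the queue may hold
    size_t pending;             //  Messages currently queued
    full_policy_t policy;       //  What to do when the queue is full
} queue_t;

void
queue_init (queue_t *self, size_t high_water_mark, full_policy_t policy)
{
    assert (self && high_water_mark > 0);
    self->high_water_mark = high_water_mark;
    self->pending = 0;
    self->policy = policy;
}

//  Try to queue one message. Below the high water mark it is always
//  accepted; at the mark, the outcome depends on the policy.
send_result_t
queue_send (queue_t *self)
{
    assert (self);
    if (self->pending < self->high_water_mark) {
        self->pending++;
        return SEND_QUEUED;
    }
    return self->policy == ON_FULL_DROP? SEND_DROPPED: SEND_WOULD_BLOCK;
}

//  Take one message off the queue; false if there was none
bool
queue_recv (queue_t *self)
{
    assert (self);
    if (self->pending == 0)
        return false;
    self->pending--;
    return true;
}
</programlisting>
<para>With a high water mark of 2, the third <literal>queue_send</literal> in a row returns <literal>SEND_DROPPED</literal> under the drop policy and <literal>SEND_WOULD_BLOCK</literal> under the block policy. Once a reader takes a message with <literal>queue_recv</literal>, the next send is accepted again.</para>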
</sect1>
<sect1>
<title>Socket Scalability</title>
<para>Let's see ØMQ's scalability in action. Here is a shell script that starts the weather server and then a bunch of clients in parallel:</para>
<screen>wuserver &
wuclient 12345 &
wuclient 23456 &
wuclient 34567 &
wuclient 45678 &
wuclient 56789 &
</screen>
<para>As the clients run, we take a look at the active processes using 'top', and we see something like (on a 4-core box):</para>
<screen> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7136 ph 20 0 1040m 959m 1156 R 157 12.0 16:25.47 wuserver
7966 ph 20 0 98608 1804 1372 S 33 0.0 0:03.94 wuclient
7963 ph 20 0 33116 1748 1372 S 14 0.0 0:00.76 wuclient
7965 ph 20 0 33116 1784 1372 S 6 0.0 0:00.47 wuclient
7964 ph 20 0 33116 1788 1372 S 5 0.0 0:00.25 wuclient
7967 ph 20 0 33072 1740 1372 S 5 0.0 0:00.35 wuclient
</screen>
<para>Let's think for a second about what is happening here. The weather server has a single socket, and yet here we have it sending data to five clients in parallel. We could have thousands of concurrent clients. The server application doesn't see them, doesn't talk to them directly. So the ØMQ socket is acting like a little server, silently accepting client requests and shoving data out to them as fast as the network can handle it. And it's a multithreaded server, squeezing more juice out of your CPU.</para>
</sect1>
<sect1>
<title>Upgrading from ØMQ/2.2 to ØMQ/3.2</title>
<para>In early 2012, ØMQ/3.2 became stable enough for live use and by the time you're reading this, it's what you really should be using. If you are still using 2.2, here's a quick summary of the changes, and how to migrate your code.</para>
<para>The main change in 3.x is that PUB-SUB works properly, as in, the publisher only sends subscribers stuff they actually want. In 2.x, publishers send everything and the subscribers filter. Simple, but not ideal for performance on a TCP network.</para>
<para>Most of the API is backwards compatible, except a few changes that went into 3.0 with little regard to the cost of breaking existing code. The syntax of <literal>zmq-send[3]</literal> and <literal>zmq-recv[3]</literal> changed, and <literal>ZMQ_NOBLOCK</literal> got rebaptized to <literal>ZMQ_DONTWAIT</literal>. So although I'd love to say, "you just recompile your code with the latest libzmq and everything will work", that's not how it is. For what it's worth, we banned such API breakage afterwards.</para>
<para>So the minimal change for C/C++ apps that use the low-level libzmq API is to replace all calls to <literal>zmq-send[3]</literal> with <literal>zmq-msg-send[3]</literal>, and <literal>zmq-recv[3]</literal> with <literal>zmq-msg-recv[3]</literal>. In other languages, your binding author may have done the work already. Note that these two functions now return -1 in case of error, and zero or more according to how many bytes were sent or received.</para>
<para>Other parts of the libzmq API became more consistent. We deprecated <literal>zmq-init[3]</literal> and <literal>zmq-term[3]</literal>, replacing them with <literal>zmq-ctx-new[3]</literal> and <literal>zmq-ctx-destroy[3]</literal>. We added <literal>zmq-ctx-set[3]</literal> to let you configure a context before starting to work with it.</para>
<para>Finally, we added context monitoring via the <literal>zmq-ctx-set-monitor[3]</literal> call, which lets you track connections and disconnections, and other events on sockets.</para>
</sect1>
<sect1>
<title>Warning - Unstable Paradigms!</title>
<para>Traditional network programming is built on the general assumption that one socket talks to one connection, one peer. There are multicast protocols, but these are exotic. When we assume "one socket = one connection", we scale our architectures in certain ways. We create threads of logic where each thread works with one socket, one peer. We place intelligence and state in these threads.</para>
<para>In the ØMQ universe, sockets are doorways to fast little background communications engines that manage a whole set of connections automagically for you. You can't see, work with, open, close, or attach state to these connections. Whether you use blocking send or receive, or poll, all you can talk to is the socket, not the connections it manages for you. The connections are private and invisible, and this is the key to ØMQ's scalability.</para>
<para>This matters because your code, talking to a socket, can then handle any number of connections across whatever network protocols are around, without change. A messaging pattern sitting in ØMQ can scale more cheaply than a messaging pattern sitting in your application code.</para>
<para>So the general assumption no longer applies. As you read the code examples, your brain will try to map them to what you know. You will read "socket" and think "ah, that represents a connection to another node". That is wrong. You will read "thread" and your brain will again think, "ah, a thread represents a connection to another node", and again your brain will be wrong.</para>
<para>If you're reading this Guide for the first time, realize that until you actually write ØMQ code for a day or two (and maybe three or four days), you may feel confused, especially by how simple ØMQ makes things for you, and you may try to impose that general assumption on ØMQ, and it won't work. And then you will experience your moment of enlightenment and trust, that <emphasis>zap-pow-kaboom</emphasis> satori paradigm-shift moment when it all becomes clear.</para>
</sect1>
</chapter>
<chapter id="sockets-and-patterns">
<title>Sockets and Patterns</title>
<para>In Basics<xref linkend="basics"/> we took ØMQ for a drive, with some basic examples of the main ØMQ patterns: request-reply, publish-subscribe, and pipeline. In this chapter we're going to get our hands dirty and start to learn how to use these tools in real programs.</para>
<para>We'll cover:</para>
<itemizedlist>
<listitem><para>How to create and work with ØMQ sockets.</para></listitem>
<listitem><para>How to send and receive messages on sockets.</para></listitem>
<listitem><para>How to build your apps around ØMQ's asynchronous I/O model.</para></listitem>
<listitem><para>How to handle multiple sockets in one thread.</para></listitem>
<listitem><para>How to handle fatal and non-fatal errors properly.</para></listitem>
<listitem><para>How to handle interrupt signals like Ctrl-C.</para></listitem>
<listitem><para>How to shut down a ØMQ application cleanly.</para></listitem>
<listitem><para>How to check a ØMQ application for memory leaks.</para></listitem>
<listitem><para>How to send and receive multi-part messages.</para></listitem>
<listitem><para>How to forward messages across networks.</para></listitem>
<listitem><para>How to build a simple message queuing broker.</para></listitem>
<listitem><para>How to write multithreaded applications with ØMQ.</para></listitem>
<listitem><para>How to use ØMQ to signal between threads.</para></listitem>
<listitem><para>How to use ØMQ to coordinate a network of nodes.</para></listitem>
<listitem><para>How to create and use message envelopes for publish-subscribe.</para></listitem>
<listitem><para>Using the high-water mark (HWM) to protect against memory overflows.</para></listitem>
</itemizedlist>
<sect1>
<title>The Socket API</title>
<para>To be perfectly honest, ØMQ does a kind of switch-and-bait on you, for which we don't apologize. It's for your own good, and it hurts us more than it hurts you. ØMQ presents a familiar socket-based API, and it takes great effort on our part to hide a bunch of message-processing engines behind it. However, the result will slowly fix your world-view about how to design and write distributed software.</para>
<para>Sockets are the de-facto standard API for network programming, as well as being useful for stopping your eyes from falling onto your cheeks. One thing that makes ØMQ especially tasty to developers is that it uses sockets and messages instead of some other arbitrary set of concepts. Kudos to Martin Sustrik for pulling this off. It turns "Message Oriented Middleware", a phrase guaranteed to send the whole room off to Catatonia, into "Extra Spicy Sockets!" which leaves us with a strange craving for pizza, and a desire to know more.</para>
<para>Like a favorite dish, ØMQ sockets are easy to digest. Sockets have a life in four parts, just like BSD sockets:</para>
<itemizedlist>
<listitem><para>Creating and destroying sockets, which go together to form a karmic circle of socket life (see <literal>zmq-socket[3]</literal>, <literal>zmq-close[3]</literal>).</para></listitem>
<listitem><para>Configuring sockets by setting options on them and checking them if necessary (see <literal>zmq-setsockopt[3]</literal>, <literal>zmq-getsockopt[3]</literal>).</para></listitem>
<listitem><para>Plugging sockets onto the network topology by creating ØMQ connections to and from them (see <literal>zmq-bind[3]</literal>, <literal>zmq-connect[3]</literal>).</para></listitem>
<listitem><para>Using the sockets to carry data by writing and receiving messages on them (see <literal>zmq-msg-send[3]</literal>, <literal>zmq-msg-recv[3]</literal>).</para></listitem>
</itemizedlist>
<para>Note that sockets are always void pointers, and messages (which we'll come to very soon) are structures. So in C you pass sockets as-such, but you pass addresses of messages in all functions that work with messages, like <literal>zmq-msg-send[3]</literal> and <literal>zmq-msg-recv[3]</literal>. As a mnemonic, realize that "in ØMQ all your sockets are belong to us", but messages are things you actually own in your code.</para>
<para>Creating, destroying, and configuring sockets works as you'd expect for any object. But remember that ØMQ is an asynchronous, elastic fabric. This has some impact on how we plug sockets into the network topology, and how we use the sockets after that.</para>
<sect2>
<title>Plugging Sockets Into the Topology</title>
<para>To create a connection between two nodes you use <literal>zmq-bind[3]</literal> in one node, and <literal>zmq-connect[3]</literal> in the other. As a general rule of thumb, the node which does <literal>zmq-bind[3]</literal> is a "server", sitting on a well-known network address, and the node which does <literal>zmq-connect[3]</literal> is a "client", with unknown or arbitrary network addresses. Thus we say that we "bind a socket to an endpoint" and "connect a socket to an endpoint", the endpoint being that well-known network address.</para>
<para>ØMQ connections are somewhat different from old-fashioned TCP connections. The main notable differences are:</para>
<itemizedlist>
<listitem><para>They go across an arbitrary transport (<literal>inproc</literal>, <literal>ipc</literal>, <literal>tcp</literal>, <literal>pgm</literal> or <literal>epgm</literal>). See <literal>zmq-inproc[7]</literal>, <literal>zmq-ipc[7]</literal>, <literal>zmq-tcp[7]</literal>, <literal>zmq-pgm[7]</literal>, and <literal>zmq-epgm[7]</literal>.</para></listitem>
<listitem><para>One socket may have many outgoing and many incoming connections.</para></listitem>
<listitem><para>There is no <literal>zmq-accept()</literal> method. When a socket is bound to an endpoint it automatically starts accepting connections.</para></listitem>
<listitem><para>The network connection itself happens in the background, and ØMQ will automatically re-connect if the network connection is broken (e.g. if the peer disappears and then comes back).</para></listitem>
<listitem><para>Your application code cannot work with these connections directly; they are encapsulated under the socket.</para></listitem>
</itemizedlist>
<para>Many architectures follow some kind of client-server model, where the server is the component that is most static, and the clients are the components that are most dynamic, i.e. they come and go the most. There are sometimes issues of addressing: servers will be visible to clients, but not necessarily vice-versa. So mostly it's obvious which node should be doing <literal>zmq-bind[3]</literal> (the server) and which should be doing <literal>zmq-connect[3]</literal> (the client). It also depends on the kind of sockets you're using, with some exceptions for unusual network architectures. We'll look at socket types later.</para>
<para>Now, imagine we start the client <emphasis>before</emphasis> we start the server. In traditional networking we get a big red Fail flag. But ØMQ lets us start and stop pieces arbitrarily. As soon as the client node does <literal>zmq-connect[3]</literal> the connection exists and that node can start to write messages to the socket. At some stage (hopefully before messages queue up so much that they start to get discarded, or the client blocks), the server comes alive, does a <literal>zmq-bind[3]</literal> and ØMQ starts to deliver messages.</para>
<para>A server node can bind to many endpoints (that is, a combination of protocol and address) and it can do this using a single socket. This means it will accept connections across different transports:</para>