[zeromq-dev] REQ / REP and inproc sockets
gonzalo diethelm
2010-08-25 16:11:46 UTC
If this is a known issue, I apologize for not paying attention.

I am developing a small client/server app in Java and 0MQ. The client
does the following:

* On the main thread:
+ Create a REQ socket bound to inproc://r
+ Add that socket to a Poller.
+ Spawn a new thread.

* On the new thread:
+ Create a REP socket connected to inproc://r and a REQ socket connected
to tcp://127.0.0.1:5555, where my server has a REP socket bound.
+ Loop: read from inproc://r and write to tcp; read from tcp and write
to inproc://r. Notice the new thread does not poll, it simply issues
blocking recv and send calls.

* Back on the main thread:
+ Send one request down inproc://r.
+ Poll for incoming messages (only on inproc://r).
+ The expectation is that the main thread will wake up from the poll and
will be able to process the response it got from the remote server.

This whole thing works, but NOT most of the time. Sometimes the new
thread writes the reply on inproc://r but the main thread never wakes up
from the poll. Or, the main thread writes the message on inproc://r but
the new thread never receives it. The only common factor between the two
errors is the use of an inproc socket between the two threads. I have
taken care to create each socket within its proper thread.

The fact that it DOES work sometimes is the most tantalizing. I thought
I would run this design by the list to check for any more-than-usual
stupidity on my part, before I dive into the debugging process. Could
this be related to the use of REQ/REP sockets and their strict ordering
semantics?

Thanks for any hints and best regards.
--
Gonzalo Diethelm
gonzalo diethelm
2010-08-25 20:58:49 UTC
Post by gonzalo diethelm
This whole thing works, but NOT most of the time. Sometimes the new
thread writes the reply on inproc://r but the main thread never wakes up
from the poll. Or, the main thread writes the message on inproc://r but
the new thread never receives it. The only common factor between the two
errors is the use of an inproc socket between the two threads. I have
taken care to create each socket within its proper thread.
The fact that it DOES work sometimes is the most tantalizing. I thought
I would run this design by the list to check for any more-than-usual
stupidity on my part, before I dive into the debugging process. Could
this be related to the use of REQ/REP sockets and their strict ordering
semantics?
I replaced the inproc sockets with plain tcp sockets. I still see the
same problem: it works sometimes, but it fails as described above even
more often than with inproc sockets. So inproc sockets are not at the
root of the problem. Could it be the REQ/REP sockets?

Say I have a REQ or REP socket. Can I do a poll() on it before anything
is sent or received on the socket? If I poll() such an unused socket
for, say, 5 seconds, will it block for at most those 5 seconds? If a
message arrives within those 5 seconds, will poll() return right away?

What would happen if I tried to poll() such a socket before the
connection has been internally established by 0MQ?

I know this all sounds pretty bread-and-butter-y, but I am just trying
to make sure I've got the basics covered. Thanks.
--
Gonzalo Diethelm
gonzalo diethelm
2010-08-25 21:23:14 UTC
Post by gonzalo diethelm
I know this all sounds pretty bread-and-butter-y, but I am just trying
to make sure I've got the basics covered. Thanks.
And yes, I am an ass. I was creating the sockets in the constructor of
my threaded object, which runs in the context of the MAIN thread. After
I moved the creation to the run() method, which runs in the context of
the NEW thread, poof! 0MQ started doing its magic again.

Sorry for all the public introspection; it really helped me diagnose
this.
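
For anyone who hits the same thing, here is a minimal sketch of why the constructor was the wrong place. It uses plain Java threads only and makes no 0MQ calls; the class names Relay and ThreadAffinityDemo and the thread name "worker" are made up for illustration. The constructor of a Runnable executes on whichever thread calls new, while run() executes on the thread that was started, so anything that must belong to the new thread (such as the sockets in this thread) should be created inside run().

```java
// Illustrative only: Relay, ThreadAffinityDemo and "worker" are hypothetical
// names, and no 0MQ calls are made here.
class Relay implements Runnable {
    // Name of the thread that executed the constructor.
    final String constructedOn;
    // Name of the thread that executed run(); read by the caller after join().
    volatile String ranOn;

    Relay() {
        // Executes on whichever thread calls `new Relay()`, typically main.
        constructedOn = Thread.currentThread().getName();
    }

    @Override
    public void run() {
        // Executes on the thread started with Thread.start().
        // Per-thread resources (e.g. sockets) would be created here.
        ranOn = Thread.currentThread().getName();
    }
}

class ThreadAffinityDemo {
    // Returns { thread that ran the constructor, thread that ran run() }.
    static String[] observe() {
        Relay relay = new Relay();
        Thread worker = new Thread(relay, "worker");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new String[] { relay.constructedOn, relay.ranOn };
    }

    public static void main(String[] args) {
        String[] names = observe();
        System.out.println("constructor ran on: " + names[0]);
        System.out.println("run() ran on:       " + names[1]);
    }
}
```

The two names differ whenever the object is constructed on one thread and started on another, which is exactly the situation described above.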
--
Gonzalo Diethelm
gonzalo diethelm
2010-08-25 21:34:55 UTC
Post by gonzalo diethelm
And yes, I am an ass.
On this high note, allow me two more questions.

1. I use a PUB socket (connects) in my server to publish a heartbeat
every five seconds.
Martin Sustrik
2010-08-26 13:21:47 UTC
Hi Gonzalo,
Post by gonzalo diethelm
Post by gonzalo diethelm
And yes, I am an ass.
The introspection was fun to read :)
Post by gonzalo diethelm
On this high note, allow me two more questions.
1. I use a PUB socket (connects) in my server to publish a heartbeat
every five seconds.
gonzalo diethelm
2010-08-26 13:40:54 UTC
Post by Martin Sustrik
Post by gonzalo diethelm
And yes, I am an ass.
The introspection was fun to read :)
And fun to do as well. In fact, one of the great things about 0MQ is how
much fun it is to work with.
Post by Martin Sustrik
There's still a transient buffer between the PUB and SUB socket so that
random jitter on the line doesn't disrupt the transmission.
For heartbeats it would make sense to set the buffer size to 1 message
(ZMQ_HWM option), so that at any point of time only 1 heartbeat is
buffered.
This worked perfectly, thanks for the hint.
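
To picture what a buffer of one message means for heartbeats, here is a rough analogy in plain Java. It is not the 0MQ API and sets no socket option: HeartbeatBuffer and its methods are hypothetical names, and the sketch assumes that a full buffer drops the new heartbeat rather than blocking the publisher.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Analogy only: a one-slot buffer standing in for a high-water mark of
// 1 message. HeartbeatBuffer is a hypothetical name, not a 0MQ class.
class HeartbeatBuffer {
    // Capacity 1: at most one heartbeat is ever held.
    private final ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(1);

    // Non-blocking publish. Returns false when a heartbeat is already
    // buffered, in which case the new one is dropped (assumed behaviour).
    boolean publish(String heartbeat) {
        return queue.offer(heartbeat);
    }

    // Returns the buffered heartbeat, or null if nothing is buffered.
    String poll() {
        return queue.poll();
    }

    // Number of heartbeats currently buffered (0 or 1).
    int buffered() {
        return queue.size();
    }
}

class HeartbeatBufferDemo {
    public static void main(String[] args) {
        HeartbeatBuffer buffer = new HeartbeatBuffer();
        System.out.println("first accepted:  " + buffer.publish("hb-1"));
        System.out.println("second accepted: " + buffer.publish("hb-2"));
        System.out.println("buffered count:  " + buffer.buffered());
        System.out.println("consumer reads:  " + buffer.poll());
    }
}
```

A slow or absent consumer therefore never sees a backlog of stale heartbeats, only the single one that fit in the buffer.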
Post by Martin Sustrik
The latest changes on the trunk are the "socket migration between
threads" code. Thus it will be possible to move your socket from thread
A to thread B. That makes implementing the check impossible.
Oh, OK. I am looking forward to the next version.
--
Gonzalo Diethelm