[zeromq-dev] question about HWM

Discussion:

Diffuser78

2012-06-21 16:48:29 UTC

Hi,

I just started playing with zmq, and I had a question about HWM.

If my socket type is ZMQ_REP, and this socket enters an exceptional
state because the HWM is reached, I read in the guide that it will drop the reply
message being sent to the client. *My question is: would it retry?
Can I trust zmq to deliver the message once I hand it over?*

I have two apps, App1 and App2, on two different boxes. App1 sends a payload
to App2 and waits for an application-level ACK from App2. Since ZMQ may
drop messages if the HWM is reached, App1 would not know whether its message was
delivered, and hence would not know whether to retry after a
timeout.

Can you share your experience with this scenario?

Thanks.

Chuck Remes

2012-06-21 20:46:44 UTC

So, let's walk through the full chain of events here. In your scenario you have a REP socket that is sending ACKs back to a REQ socket somewhere else. You have set the HWM to (for example) 10.

If there is a *single* REQ socket talking to a *single* REP socket, then you will never hit the HWM. The REQ/REP sockets enforce a very strict request/reply/request/reply pattern (so you can't do request/request/request/reply/reply/reply).
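
To illustrate why a single REQ/REP pair cannot build up a backlog, here is a toy model of the REQ side's strict send/receive alternation. It is plain Python, not the ZeroMQ API, and the class name `LockstepReq` is made up for this sketch. In libzmq itself, a send in the wrong state fails with the EFSM error.

```python
class LockstepReq:
    """Toy model of the REQ-side send/recv alternation (not the ZeroMQ API)."""

    def __init__(self):
        self.awaiting_reply = False

    def send(self, request):
        # A second send is refused until the reply to the first one has arrived.
        if self.awaiting_reply:
            raise RuntimeError("cannot send: previous request still awaits its reply")
        self.awaiting_reply = True
        return request

    def recv(self, reply):
        if not self.awaiting_reply:
            raise RuntimeError("cannot receive: no request is outstanding")
        self.awaiting_reply = False
        return reply


req = LockstepReq()
req.send("request 1")
try:
    req.send("request 2")       # rejected: reply 1 has not arrived yet
    second_send_allowed = True
except RuntimeError:
    second_send_allowed = False
req.recv("reply 1")
req.send("request 2")           # accepted now that the cycle has completed
```

Because at most one message is ever in flight per REQ socket, a lone pair cannot queue up enough messages to reach the HWM.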

A more reasonable scenario is that you have several REQ sockets all sending requests to the same REP socket (perhaps there are 100s or 1000s). Let's assume that the network is slow (or high latency) but the work done by the REP process is minimal, so it is able to process the requests very quickly and send a reply but the reply is slow to go out.

From the FAQ you can see that the REQ socket will have an incoming queue and kernel buffers that can store messages. Likewise, the REP system has an outgoing queue which in turn hands messages off to the kernel for transmission. Only when the REP socket finally gets backpressure from the kernel (e.g. ran out of buffers) does it start to internally queue stuff. When you hit the magic number of 10 messages, the 11th message will be dropped. There is *no* retry because this drop happens at a layer above TCP (and it works the same for the other transport types that don't share TCP semantics, so this is consistent).
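
The drop behaviour described above can be pictured with a small bounded queue. This is only a model of the description in this reply, written in plain Python; `DroppingQueue` is a hypothetical name, and the sketch ignores kernel buffers and everything else libzmq does internally.

```python
from collections import deque


class DroppingQueue:
    """Bounded queue that discards new messages once `hwm` are pending."""

    def __init__(self, hwm):
        self.hwm = hwm
        self.pending = deque()
        self.dropped = 0

    def offer(self, message):
        # Once the high-water mark is reached, the message is dropped
        # silently and never retried.
        if len(self.pending) >= self.hwm:
            self.dropped += 1
            return False
        self.pending.append(message)
        return True


outgoing = DroppingQueue(hwm=10)
accepted = [outgoing.offer(f"reply-{i}") for i in range(1, 12)]
```

With an HWM of 10, the first ten offers are accepted and the eleventh is counted in `outgoing.dropped`; nothing in the queue itself tells the remote peer that a reply went missing.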

To solve this, your REQ system needs to have a timer set to a reasonable period of time within which it expects a reply. If it doesn't get the reply before the timer expires, then the request should be assumed to be lost. Furthermore, your app will need to be able to handle the case where the timer expires, the request is sent a second time, and then the *original* reply arrives (it's just late).
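
One way to sketch the timer-and-retry idea is to tag every request with an ID, so that a late reply to an earlier request can be recognised and ignored. The code below is a plain-Python simulation under that assumption, not ZeroMQ code: `RetryingClient`, `flaky_send`, and the `(request_id, reply)` tuple format are all invented for illustration, and a `queue.Queue` stands in for the socket's incoming side.

```python
import itertools
import queue
import time


class RetryingClient:
    """Resends a request on timeout and ignores replies with a stale ID."""

    def __init__(self, send, inbox, timeout=0.05, max_attempts=3):
        self.send = send              # callable(request_id, payload)
        self.inbox = inbox            # queue.Queue of (request_id, reply)
        self.timeout = timeout
        self.max_attempts = max_attempts
        self._ids = itertools.count(1)

    def request(self, payload):
        request_id = next(self._ids)  # retries reuse the same ID
        for _ in range(self.max_attempts):
            self.send(request_id, payload)
            reply = self._wait_for(request_id)
            if reply is not None:
                return reply
        raise TimeoutError(f"no reply to request {request_id}")

    def _wait_for(self, request_id):
        deadline = time.monotonic() + self.timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            try:
                reply_id, reply = self.inbox.get(timeout=remaining)
            except queue.Empty:
                return None
            if reply_id == request_id:
                return reply
            # Otherwise this is a late reply to an older request: skip it.


inbox = queue.Queue()
attempts = []


def flaky_send(request_id, payload):
    attempts.append(request_id)
    if len(attempts) == 1:
        return                        # simulate a reply lost on the first try
    inbox.put((request_id, f"ACK {payload}"))


client = RetryingClient(flaky_send, inbox)
first = client.request("chunk-1")     # succeeds on the second attempt
inbox.put((1, "ACK chunk-1"))         # the "original" reply turns up late
second = client.request("chunk-2")    # the stale reply is skipped
```

Since a retried request may be processed twice on the receiving side, the handler there should tolerate duplicates, for example by remembering the IDs it has already handled.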

I don't know of a specific example in the guide that covers this, but several people have solved it in their applications. You may want to seek out the "Salt" system that uses zmq for its message bus; I imagine it has a mechanism to handle what I described above.

cr

Diffuser78

2012-06-21 23:23:09 UTC

Thanks, Chuck, for your views and insight on this. I have a fairly newbie
question about the messaging pattern:

In my scenario, App1 (server) periodically pushes, say, ~10 MB of data to App2
(client). In this case, can I make App1 the REQ and App2 the REP? In
this communication, App1 is the initiating entity and also the one that
sends the payload, and App2 is the receiving entity. Is there any rule of thumb
or recommended practice about who should be REQ and who should be REP?

Many thanks.

Chuck Remes

2012-06-22 00:48:47 UTC

What you describe above is a pretty classic example for using REQ/REP. Generally, I recommend that people use DEALER/ROUTER (formerly named XREQ/XREP) for communications so that you can send multiple requests and multiple replies without being tied to the req/rep/req/rep pattern enforced by the REQ/REP sockets.

Anyway, the rule of thumb is that the entity creating the work or the request should be the REQ or DEALER socket. The entity (or entities) handling the work and generating the reply should be the REP (or ROUTER) socket. So, you have it exactly right. It really is as simple as it looks.

cr

Diffuser78

2012-06-22 21:39:45 UTC

Hi,

I had very simple client and server processes. The server pushes, say, ~1
GB of data to the client. When I used REQ/REP, everything was fine. When I
tried the same program with DEALER/ROUTER, I got segfaults. Can you just
replace REQ/REP with DEALER/ROUTER, or are there DEALER/ROUTER semantics
that need to be followed?

I am still reading the zguide, but since it's quite exhaustive, I have yet to
reach the messaging patterns.

Thanks.

_______________________________________________
zeromq-dev mailing list
http://lists.zeromq.org/mailman/listinfo/zeromq-dev

Chuck Remes

2012-06-22 22:04:41 UTC

Yes, there are differences. The documentation will help you.
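
One documented difference is message framing: a REQ socket puts an empty delimiter frame in front of the body, a ROUTER socket prepends the sender's identity to every message it receives, and a DEALER socket adds nothing, so the application has to handle those frames itself. The sketch below models that framing with plain Python lists; it does not call ZeroMQ, the function names are made up, and `b"client-A"` is a hypothetical identity. Whether this has anything to do with the segfaults mentioned in the thread is not known.

```python
def req_wrap(body):
    """Frames a REQ socket puts on the wire: empty delimiter, then the body."""
    return [b"", body]


def router_receive(peer_identity, wire_frames):
    """Frames a ROUTER application sees: sender identity, then the wire frames."""
    return [peer_identity] + wire_frames


def router_reply(received, reply_body):
    """Frames a ROUTER application sends back: the envelope up to and
    including the empty delimiter, followed by the reply body."""
    delimiter_at = received.index(b"")
    return received[: delimiter_at + 1] + [reply_body]


seen = router_receive(b"client-A", req_wrap(b"payload"))
reply = router_reply(seen, b"ACK")
```

Here `seen` has three frames rather than one, so code written for a REP socket, which only ever sees the body, needs changes before it can run on a ROUTER socket.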

cr
