Discussion:
[zeromq-dev] sending from multiple threads
Lineker Tomazeli
2016-12-01 15:23:46 UTC
Hi guys,

I have been reading the zeromq list for a while now, but this is the first
time I'm asking something. So nice to meet you all :)

In the zeromq documentation/book it is very clear that we shouldn't be
handling zeromq sockets from multiple threads without doing a memory
barrier.

*First question:*

What is the recommended pattern for having one thread responsible for
connecting, binding and polling messages from sockets, while multiple
threads send messages?

My first approach was to have a threadsafe producer/consumer queue where
sender threads would push onto the queue and the "zeromq" thread would pick
from the queue and send through the wire. This works but I find it redundant
and memory inefficient.
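
For illustration, the first approach described above might look roughly like the following Python sketch. It is not ZeroMQ code: `queue.Queue` from the standard library plays the threadsafe producer/consumer queue, and the `send` callable is a hypothetical stand-in for whatever actually writes to the socket (here it just appends to a list). The names `sender_loop`, `run_demo` and `STOP` are made up for this example.

```python
import queue
import threading

STOP = object()  # sentinel telling the sender thread to exit


def sender_loop(outbox, send):
    """Drain the outbox on one thread; only this thread ever calls send()."""
    while True:
        msg = outbox.get()
        if msg is STOP:
            break
        send(msg)


def run_demo(n_workers=4, per_worker=3):
    outbox = queue.Queue()
    sent = []  # stand-in for the wire; appended to only by the sender thread
    sender = threading.Thread(target=sender_loop, args=(outbox, sent.append))
    sender.start()

    def worker(wid):
        # Workers never touch the "socket"; they only enqueue messages.
        for i in range(per_worker):
            outbox.put(f"worker-{wid} msg-{i}".encode())

    workers = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    # All messages are queued before STOP, so none are lost on shutdown.
    outbox.put(STOP)
    sender.join()
    return sent


if __name__ == "__main__":
    print(len(run_demo()))
```

Because only the sender thread ever calls `send`, the socket it stands for would never be used by two threads at once.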

My second approach (for which I used netmq) was to schedule a "send a
message" task on the Poller. This guaranteed that the task ran in the same
thread as the poller.


What would be the correct approach if I use clrzmq4?


*Second question :*

What is the correct approach for a "zeromq" thread to notify other threads
that a new message was received?
- raise an event on another thread?
- use a threadsafe producer/consumer queue to dump the received messages?


*Third question:*

Zeromq is truly a great and fun framework to work with. I was wondering why
there is not so much buzz about it? Do you know of companies using it in
production?



Lineker Tomazeli
tomazeli.net
Luca Boccassi
2016-12-01 17:19:30 UTC
Post by Lineker Tomazeli
What would be the correct approach if I use clrzmq4 ?
Why do you want to manage the socket on one thread and use it from another?
What about just passing the endpoint string to each thread, so that they
can manage their own socket?
Post by Lineker Tomazeli
*Second question :*
What is the correct approach for a "zeromq" thread to notify other threads
that a new message was received?
- raise an event on another thread?
- use a threadsafe producer/consumer queue to dump the received messages?
For inter-thread communication you could use an inproc:// endpoint,
which is essentially a lock-less in-memory queue. You can build any
model you want with it as a communication channel between threads. And
given they all take zmq_msg_t, you can just pass them along directly.

If the tasks to distribute are all equal in cost/time, a simple and
common pattern is to use a round-robin type of socket to distribute
work. There are a lot of examples and suggestions in the zguide.
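
To picture what round-robin distribution means here, the following Python sketch may help. It is only a model, not ZeroMQ code: standard-library `queue.Queue` objects stand in for connected workers, and `distribute_round_robin` and `drain` are hypothetical helper names. It shows that each worker receives the same share of tasks regardless of how long each task takes, which is why the pattern suits tasks of roughly equal cost.

```python
import itertools
import queue


def distribute_round_robin(tasks, worker_queues):
    """Hand each task to the next worker queue in turn."""
    targets = itertools.cycle(worker_queues)
    for task in tasks:
        next(targets).put(task)


def drain(q):
    """Collect everything currently sitting in a queue (single-threaded use)."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


if __name__ == "__main__":
    queues = [queue.Queue() for _ in range(3)]
    distribute_round_robin(range(7), queues)
    for i, q in enumerate(queues):
        print(i, drain(q))
    # prints:
    # 0 [0, 3, 6]
    # 1 [1, 4]
    # 2 [2, 5]
```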
Post by Lineker Tomazeli
*Third question:*
Zeromq is truly a great and fun framework to work with. I was wondering why
there is not so much buzz about it? Do you know of companies using it in
production?
Many companies do. We use it at Brocade in the vRouter, for example.
--
Kind regards,
Luca Boccassi
Lineker Tomazeli
2016-12-01 19:30:54 UTC
Hi Luca,

Thanks for answering. See my comments below.

Lineker Tomazeli
tomazeli.net
Post by Luca Boccassi
Why do you want to manage the socket on one thread and use it from another?
What about just passing the endpoint string to each thread, so that they
can manage their own socket?
Yes, I want to have one thread managing the socket and multiple threads
doing work. When these worker threads need to send messages, they say,
"hey, thread managing the zeromq socket, here is a message, please
deliver it for me."

Having each thread manage its own socket is not feasible since these
threads come and go as needed (thread pool). I don't want each of them to
start connecting sockets if they only need to send one message.
Post by Luca Boccassi
For inter-thread communication you could use an inproc:// endpoint,
which is essentially a lock-less in-memory queue. You can build any
model you want with it as a communication channel between threads. And
given they all take zmq_msg_t, you can just pass them along directly.
If the tasks to distribute are all equal in cost/time, a simple and
common pattern is to use a round-robin type of socket to distribute
work. There are a lot of examples and suggestions in the zguide.
So would it be OK (performance, overhead, etc.) for each thread that
needs to send a message to connect using inproc:// to the "sender thread"?

worker thread ------inproc:// ---->
worker thread ------inproc:// ----> sender thread -----tcp:// -----> server
worker thread ------inproc:// ---->
_______________________________________________
zeromq-dev mailing list
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Colin Ingarfield
2016-12-01 21:34:34 UTC
Post by Lineker Tomazeli
So would it be OK (performance, overhead, etc.) for each thread that
needs to send a message to connect using inproc:// to the "sender thread"?
worker thread ------inproc:// ---->
worker thread ------inproc:// ----> sender thread -----tcp:// -----> server
worker thread ------inproc:// ---->
Depending on how frequently worker threads need to communicate with the
sender thread this is a valid approach. I've done this successfully in
the past. I have also created a single long-lived inproc connection to
the sender thread for use by multiple worker threads. In that case the
shared connection must be guarded by a mutex.
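
A minimal sketch of the mutex-guarded variant, in Python and without any ZeroMQ dependency, could look like this. `SharedSender` is a hypothetical wrapper name, and the `send` callable it wraps is a stand-in for the send call on the one long-lived connection (here it simply appends to a list).

```python
import threading


class SharedSender:
    """Wrap one send function so that only one thread uses it at a time."""

    def __init__(self, send):
        self._send = send
        self._lock = threading.Lock()

    def send(self, msg):
        # The lock serializes access to the single shared connection.
        with self._lock:
            self._send(msg)


def run_demo(n_threads=8, per_thread=100):
    delivered = []  # stand-in for the shared connection
    shared = SharedSender(delivered.append)

    def worker(wid):
        for i in range(per_thread):
            shared.send((wid, i))

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return delivered


if __name__ == "__main__":
    print(len(run_demo()))  # prints 800
```

With a real socket, a multipart message would presumably need all of its frames sent while holding the lock, so that frames from different threads cannot interleave.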

In general it's considered an anti-pattern in 0MQ to create and destroy
lots of sockets. Some resource cleanup is asynchronous (handled by a
background reaper thread), so it is possible to run out of file handles
if you create/destroy sockets too quickly.

-- Colin
Marcin Romaszewicz
2016-12-02 02:02:04 UTC
Post by Lineker Tomazeli
So would it be OK (performance, overhead, etc.) for each thread that
needs to send a message to connect using inproc:// to the "sender thread"?
worker thread ------inproc:// ---->
worker thread ------inproc:// ----> sender thread -----tcp:// -----> server
worker thread ------inproc:// ---->
It's generally a bad idea to create and destroy lots of sockets, and also
to create and destroy lots of threads. You'll have throughput problems due
to this and, as Colin mentioned, you can run out of file descriptors. It can
take seconds to create a thread on a highly loaded system.

You described a good model in your first solution. Consider having a worker
thread pool with long-lived threads, using some kind of semaphore to
allocate work to the work units. If you have a single socket where you get
the work, just use one thread to read all the data off that socket, then
package it up into some kind of message, and send it to the worker threads
either via ZMQ in-proc sockets or some other mechanism. The workers each
have a socket and use it to send their response when they're done. Don't
kill those threads. Your problem description sounds like a classic
producer/consumer queue. Your producers are the long-lived threads which
take in ZMQ connections, your consumers are your worker threads, and
many data structures exist which can do this. This may be less memory
efficient than a zero-copy approach, but it works fantastically, because
copying objects by value removes the contention that you would otherwise
have to solve by locking.
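
As a rough sketch of that model, assuming Python and the standard library only: one receiving thread (here the caller of `run_demo`) pushes incoming messages onto a shared queue, and a pool of long-lived workers pulls from it. The `results` queue is a stand-in for each worker's own outgoing socket, `msg.upper()` is a placeholder for real work, and all names are hypothetical.

```python
import queue
import threading

STOP = object()  # one sentinel per worker shuts the pool down cleanly


def worker_loop(inbox, results):
    """Long-lived worker: block on the inbox, process, publish the result."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            break
        results.put(msg.upper())  # placeholder for the real work


def run_demo(messages, n_workers=3):
    inbox = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=worker_loop, args=(inbox, results))
        for _ in range(n_workers)
    ]
    for t in workers:
        t.start()

    # The receiving thread is the only place that would read the socket;
    # here the incoming messages are simply passed in as a list.
    for msg in messages:
        inbox.put(msg)

    for _ in workers:
        inbox.put(STOP)
    for t in workers:
        t.join()

    out = []
    while not results.empty():
        out.append(results.get())
    return out


if __name__ == "__main__":
    print(sorted(run_demo([b"a", b"b", b"c", b"d"])))
```

Since idle workers pull from a shared inbox, a slow task delays only the worker handling it, which differs from strict round-robin assignment.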

When we were using ZMQ in production, I had a ZMQ Router taking in 100,000
- 200,000 PUB/SUB messages per second on a single ZMQ SUB socket using a
single thread (!!) and fanning out the resulting work to a pile of worker
threads, each of which did some work and sent the result elsewhere. My
messages were a few hundred bytes in size, so that's tens to low hundreds
of megabytes coming in per second, and my working memory was < 500MB.
Memory is really cheap these days :) ZMQ honestly blew my mind in how well
it worked with many publishers generating so much flow to a handful of
subscribers.
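
As a rough check on the data rate implied by those figures, the arithmetic can be written out as below. The 200 and 900 byte bounds are assumptions for "a few hundred bytes"; the message rates are the ones quoted above, and `throughput_mb_per_s` is a hypothetical helper name.

```python
def throughput_mb_per_s(msgs_per_second, bytes_per_msg):
    """Incoming payload rate in megabytes (10**6 bytes) per second."""
    return msgs_per_second * bytes_per_msg / 1_000_000


if __name__ == "__main__":
    low = throughput_mb_per_s(100_000, 200)   # 20.0
    high = throughput_mb_per_s(200_000, 900)  # 180.0
    print(f"{low} to {high} MB/s")
```

Under these assumptions the payload rate falls somewhere between about 20 and 180 MB/s, not counting any framing overhead.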
Lineker Tomazeli
2016-12-02 17:00:52 UTC
Hey guys,

Thank you so much for your input. It gave me some good insights. It is
always good to see some numbers from production environments.

What is the overhead of creating an inproc:// socket and killing it after
sending each message?

Would that be better than having a mutex on one shared inproc:// socket?

Does inproc:// use file descriptors?

Cheers

Lineker Tomazeli
tomazeli.net
Luca Boccassi
2016-12-02 18:45:18 UTC
As was written before, it's not recommended to constantly create and
delete sockets. I think I've seen issues fly past on GitHub where that
caused problems with inproc; I would advise searching there to see what
it was about.

inproc sockets are in-memory queues, they do not use file descriptors.
In the ZMQ_PAIR case they are also lock-free and zero-copy, so as
efficient as it can get.

If you need to distribute work, the recommendation again would be to use
a pattern that distributes messages in a round-robin fashion when there
are multiple clients, and have the distributors bind and multiple
consumers connect. I think dealer-dealer would fit, but the exact
behaviour is documented in the zmq_socket doc.
Manuel Amador (Rudd-O)
2016-12-02 01:44:18 UTC
Post by Luca Boccassi
Post by Lineker Tomazeli
What would be the correct approach if I use clrzmq4 ?
Why do you want to manage the socket on one thread and use it from another?
What about just passing the endpoint string to each thread, so that they
can manage their own socket?
I do not understand this. Are there examples of code where I can see
this happen?
--
Rudd-O
http://rudd-o.com/