Heungsub Lee
2016-12-12 19:03:04 UTC
Did my email go through? I sent it yesterday, but I can't find it in
the archive.
On Mon, Dec 12, 2016 at 3:55 AM, Heungsub Lee <***@subl.ee> wrote:
Hi folks, I'm Heungsub Lee.
I've been building a game server on ZeroMQ's pub/sub pattern, and I've hit
a critical problem with PUB/SUB sockets. Sometimes my processes abort with
an assertion failure from ZeroMQ:
Assertion failed: erased == 1 (src/mtrie.cpp:297)
I'm running pyzmq-16.0.2 on top of libzmq-4.2.0.
In my setup, each SUB socket binds to an address and PUB sockets connect to
that address. Every PUB socket in the cluster connects to every SUB socket,
so they form a fully connected network among 500+ server processes.
The SUB sockets subscribe to and unsubscribe from topics frequently. The
number of topics keeps growing from the moment the cluster starts. At one
point when I checked, a single SUB socket was subscribed to 3000+ topics.
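For readers who want to picture the layout described above, here is a minimal sketch of a single PUB/SUB pair in which the SUB side binds and the PUB side connects. It is not the poster's server code: the function name `demo`, the topic `room.42`, the payload, and the loopback address are all assumptions made for illustration, and it does not reproduce the assertion failure.

```python
import time

import zmq


def demo(topic=b"room.42", payload=b"hello", timeout_s=5.0):
    """Bind a SUB socket, connect a PUB socket to it, deliver one message."""
    ctx = zmq.Context()
    sub = ctx.socket(zmq.SUB)
    pub = ctx.socket(zmq.PUB)
    try:
        # Reversed from the textbook layout: the SUB side binds.
        port = sub.bind_to_random_port("tcp://127.0.0.1")
        sub.setsockopt(zmq.SUBSCRIBE, topic)
        pub.connect("tcp://127.0.0.1:%d" % port)

        # The subscription reaches the PUB side asynchronously, so early
        # sends may be dropped. Resend until one arrives or time runs out.
        deadline = time.monotonic() + timeout_s
        received = None
        while received is None and time.monotonic() < deadline:
            pub.send_multipart([topic, payload])
            if sub.poll(100):  # wait up to 100 ms for an incoming message
                received = sub.recv_multipart()

        # Unsubscribing sends a matching cancel message to the PUB side.
        sub.setsockopt(zmq.UNSUBSCRIBE, topic)
        return received
    finally:
        pub.close(linger=0)
        sub.close(linger=0)
        ctx.term()
```

In the cluster described here, every process would hold many such connections and issue the subscribe/unsubscribe calls far more often than this single pair does.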
I have seen 3 abort scenarios:
1. When a SUB socket closes, some PUB sockets abort. This may be a
concurrency bug in the pyzmq I'm using. I reproduced it with a test
case
<https://github.com/what-studio/pyzmq/commit/5159ee563a571daccf1285aa74917bb875c774a7>.
And I think I fixed it
<https://github.com/what-studio/pyzmq/commit/94ab0a88dbef7d0f33b34cdf18e55487735dde01>
.
2. When a PUB socket joins a mature cluster, it aborts almost
immediately. By a mature cluster I mean one that already has many subscribed
topics and a lot of subscribe/unsubscribe synchronization messages.
3. A PUB socket on a weak host machine (e.g. AWS EC2 t2.medium)
sometimes aborts. I'm not sure what triggers it.
Unfortunately, I couldn't reproduce the last 2 scenarios with a small code
sample, but my servers keep aborting.
The assertion fails when a PUB socket tries to remove a pipe to a SUB
socket but finds no matching pipe. I'm wondering whether ZeroMQ guarantees
consistent subscribe/unsubscribe synchronization between busy PUB and SUB
sockets.
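The failure condition described above can be pictured with a toy model. This is a deliberately simplified, dict-based stand-in, not libzmq's actual trie implementation in `src/mtrie.cpp`; the class `ToySubscriptionTable` and its methods are hypothetical names. It only shows why removing a pipe that was never registered (or was already removed) for a topic would trip a check of the form `erased == 1`.

```python
class ToySubscriptionTable:
    """Maps each topic to the set of pipes subscribed to it (toy model)."""

    def __init__(self):
        self._pipes_by_topic = {}

    def add(self, topic, pipe):
        # Register a pipe for a topic (a subscribe message arrived).
        self._pipes_by_topic.setdefault(topic, set()).add(pipe)

    def remove(self, topic, pipe):
        # Unregister a pipe for a topic (an unsubscribe message arrived).
        pipes = self._pipes_by_topic.get(topic, set())
        erased = 1 if pipe in pipes else 0
        pipes.discard(pipe)
        if erased != 1:
            # Mirrors the shape of the reported check: exactly one entry
            # was expected to be erased.
            raise AssertionError("Assertion failed: erased == 1")
        if not pipes:
            self._pipes_by_topic.pop(topic, None)


table = ToySubscriptionTable()
table.add(b"room.42", "pipe-A")
table.remove(b"room.42", "pipe-A")      # matched pair: fine
try:
    table.remove(b"room.42", "pipe-A")  # no matching pipe left
    second_removal_failed = False
except AssertionError:
    second_removal_failed = True
```

In this model, any subscribe and unsubscribe pair that gets out of step (a duplicate removal, or a removal for a pipe that was never added) ends in the failure branch.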
Regards,
Heungsub