You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While experimenting with NSQ I've encountered a problem that looks like a flaw in the consumer code.
My setup is:
There are two producers on different hosts.
Each producer periodically (interval = 3 secs) posts to the same topic/channel to it's local nsqd.
There is one consumer for this topic/channel. It discovers both nsqd nodes through nsqlookupd.
The consumer uses the default configuration, and, thus, has MaxInFlight = 1.
It occurs, that with this setup the consumer is able to read only from one nsqd of the two. The second nsqd never gets a chance to send anything because consumer constantly keeps the correspondent RDY = 0 for the second nsqd connection.
I found that for such cases there is a special treatment in the code - every once in a while the consumer tries to randomly redistribute RDY among connected nsqd nodes (method redistributeRDY). However, in practice it never works because of this condition in the code:
Debugging shows that availableMaxInFlight always equal 0 at this moment, and actual redistribution never happens.
In fact, even if I hack a little, and force this redistribution to happen it does not work either, because there is a similar condition in method updateRDY. When there is a lack of available RDY, this method sets up a timer to retry in 5 seconds, and this attempt to retry always fails, and the timer is rescheduled again and again.
So, the questions are:
Is it a bug or a feature?
How this redistribution is supposed to work in such cases where active producers never give other producers a chance to receive even a minimal share of RDY/MaxInFlight?
Am I supposed to be preventive when creating a consumer, and set up the value of MaxInFlight in such a way that it will most likely be bigger than a potential number of producing nsqd nodes? Does not seem as very viable solution in the long term.
The text was updated successfully, but these errors were encountered:
I just refreshed my memory of this code and I think you might have found a bug. Yes, go-nsqstrictly enforces max-in-flight, so we do expect the consumer to perform "redistribution". However, the redistribution logic currently assumes that one of the connections will (eventually) be idle. In your case it never is, since there is a steady incoming stream of messages from each nsqd. This is the problematic code.
This looks similar to an issue I attempted to address in the C# client, discussion here. The solution isn't optimal and requires a flag to be set because it will exceed max-in-flight as it's floating RDY's around looking for another nsqd with the current implementation.
mreiferson
changed the title
consumer: question about redistributeRDY
consumer: redistributeRDY fails when connection remains active
Nov 19, 2016
While experimenting with NSQ I've encountered a problem that looks like a flaw in the consumer code.
My setup is:
nsqd
.nsqd
nodes throughnsqlookupd
.MaxInFlight
= 1.It occurs, that with this setup the consumer is able to read only from one nsqd of the two. The second nsqd never gets a chance to send anything because consumer constantly keeps the correspondent RDY = 0 for the second nsqd connection.
I found that for such cases there is a special treatment in the code - every once in a while the consumer tries to randomly redistribute RDY among connected nsqd nodes (method
redistributeRDY
). However, in practice it never works because of this condition in the code:Debugging shows that
availableMaxInFlight
always equal 0 at this moment, and actual redistribution never happens.In fact, even if I hack a little, and force this redistribution to happen it does not work either, because there is a similar condition in method
updateRDY
. When there is a lack of available RDY, this method sets up a timer to retry in 5 seconds, and this attempt to retry always fails, and the timer is rescheduled again and again.So, the questions are:
The text was updated successfully, but these errors were encountered: