Long polling #30
Yes, this would be ideal. We talked about this before, but implementing it may be tricky due to the way Shoryuken load-balances multiple queues. Maybe @phstc can chime in on this.
Correct! It's complicated because of the queue load balancing and the fixed number of processors. Shoryuken pauses queues when they are empty to prevent blank requests.
It might be worth leaving this issue open in case somebody wants to take a crack at it sometime. I think it would be a good improvement to Shoryuken, especially since it's a recommended practice.
Let's keep it open and 🙏 But I don't see a simple way given how Shoryuken currently works 😢
I thought this was already implemented via the
If not, shouldn't the best option be to have a separate connection for each queue with long polling? If there aren't any processors available when the request comes in, it should just wait for one to become available. In most situations the visibility timeout should be set in a way forgiving enough to allow a small wait for a processor to become available and, in the worst case, the item gets pushed back onto the queue and gets processed next time. Also, this case would only occur rarely, because it only applies to situations where the processors are full and one or more queues are largely empty. As long as developers are made aware of this limitation, I don't think most would have much issue. Messages already need to be idempotent because Amazon makes no guarantee that messages won't be delivered multiple times, or in order. And if they don't want to use this feature, they just have to ensure that long polling is turned off.
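Since SQS only guarantees at-least-once delivery, handlers need to be idempotent, as noted above. A minimal sketch of one way to do that: the `ProcessedStore` class here is a hypothetical in-memory stand-in (a real setup would back this with Redis or a database unique index, so the dedup survives restarts):

```ruby
require 'set'

# Hypothetical dedup store; a real one would be durable (Redis, DB unique index).
class ProcessedStore
  def initialize
    @seen = Set.new
  end

  # Returns true only the first time a given message id is recorded.
  def first_time?(message_id)
    !@seen.add?(message_id).nil?
  end
end

# An idempotent handler: the side effect runs at most once per message id,
# even if SQS redelivers the same message.
def handle(store, message_id, results)
  return unless store.first_time?(message_id)
  results << message_id
end

store = ProcessedStore.new
results = []
%w[m1 m2 m1 m3 m2].each { |id| handle(store, id, results) }
puts results.inspect # => ["m1", "m2", "m3"]
```

With this guard in place, a redelivery caused by a too-short visibility timeout becomes a harmless no-op rather than a duplicated side effect.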
@waynerobinson you can pass the
Queue A can make the process wait 10 seconds without returning any message, so queue B would wait 10 seconds for nothing. The idea @elsurudo is describing is to have the receiving in parallel:
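That per-queue receiving idea can be sketched with plain threads. `FakeQueue` below is only a stand-in for an SQS client: its `receive(wait:)` blocks up to `wait` seconds and returns nil on timeout, mimicking a long poll with `WaitTimeSeconds`. None of this is Shoryuken's actual implementation:

```ruby
# Stand-in for an SQS queue client; receive(wait:) imitates a long poll.
class FakeQueue
  attr_reader :name

  def initialize(name)
    @name = name
    @messages = Queue.new
  end

  def push(msg)
    @messages << msg
  end

  # Block up to `wait` seconds for a message; nil means an empty poll.
  def receive(wait:)
    deadline = Time.now + wait
    loop do
      begin
        return @messages.pop(true) # non-blocking pop
      rescue ThreadError           # queue is empty
        return nil if Time.now >= deadline
        sleep 0.01
      end
    end
  end
end

queues = [FakeQueue.new('queue_a'), FakeQueue.new('queue_b')]
processed = Queue.new

# One receiver thread per queue: an empty queue's long poll never blocks
# the other queue's receiver.
receivers = queues.map do |q|
  Thread.new do
    while (msg = q.receive(wait: 1))
      processed << [q.name, msg]
    end
  end
end

queues[1].push('hello')
queues[0].push('world')
receivers.each(&:join)

results = []
results << processed.pop until processed.empty?
puts results.sort.inspect # => [["queue_a", "world"], ["queue_b", "hello"]]
```

The catch the thread keeps circling back to: with a fixed processor pool, each receiver still has to decide how many messages it may fetch, which is exactly the coordination problem that makes this hard in Shoryuken.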
Although it seems trivial, it's hard to implement due to the processor limitation.
This is not recommended because of the dead letter queue, as it will increase the receive count.
I agree, it could be optional. But I don't see a clear way of doing that without making a big change to the code base.
@phstc I guess what I'm getting at is that I don't think the edge case you mentioned about receiving from multiple simultaneous queues is likely. Maybe I'm mistaken, but I would imagine most people either want to link individual queues to specific workers, or to use them for priority. I also don't see most people trying to process large batches of data without equally large amounts of concurrency, or very fast jobs. I'm happy to have a go at this when I get a chance (although I have no idea when that would be, though we're looking at options to move away from all our ActiveMQ-based queueing). On a cursory glance at the code, I can't work out where the actual fetching from the queues happens: does a fetcher run for each queue, or is there a global fetcher that iterates over the queues and assigns messages to workers?
Although long polling is a cost-effective solution (reducing the number of empty requests), since Shoryuken pauses empty queues I'm not sure it would make a huge cost difference.
I see, but to have a safe solution (without returning messages to the queues
Could you please describe how this pausing works, and for how long? Even if a queue is sitting empty, we would prefer the worker to start processing messages ASAP when they appear.
It seems to tie into the standard
The pausing is based on the delay option; you can set it to 0 to stop pausing queues (I do that - SQS is cheap!).
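For reference, that knob lives in Shoryuken's YAML config; a minimal sketch of disabling queue pausing (option names as documented in the Shoryuken README at the time; the queue name and concurrency value are placeholders):

```yaml
# shoryuken.yml
concurrency: 25
delay: 0          # seconds to pause an empty queue; 0 disables pausing
queues:
  - [default, 1]
```

With `delay: 0`, Shoryuken keeps polling empty queues instead of pausing them, trading a few extra (cheap) empty requests for lower pickup latency.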
I guess everything you've said makes sense. You're probably right and I shouldn't really be too concerned with this. We already spend > $8k/month with AWS, and I'm not sure why I'm worrying about $1.30 per empty queue per month without long polling.
@waynerobinson the message retrieval: https://github.com/phstc/shoryuken/blob/master/lib/shoryuken/fetcher.rb#L24 the pause: https://github.com/phstc/shoryuken/blob/master/lib/shoryuken/fetcher.rb#L47 and where it rotates queues: https://github.com/phstc/shoryuken/blob/master/lib/shoryuken/manager.rb#L182
Long polling is the best solution for sure, but I don't see an easy way to implement it yet, the
You're right, I don't think I could ever justify the expense of implementing long polling versus the savings. 👅 Especially as we're replacing hundreds of dollars a month in AMQ servers. Just to confirm, though: multiple workers could be assigned to a single queue, and we could limit the app to one queue per Shoryuken process with long polling on? Almost the best of both worlds (if we don't need any kind of priority management).
$1.30 is about 2,600,000 requests, and a regular month (30 days) has 2,592,000 seconds. Not counting that the first 1M requests are FREE.
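The arithmetic behind that figure, assuming the then-current SQS price of $0.50 per million requests and one empty receive per second (the free tier is ignored here, as the worst case):

```ruby
# Worst-case cost of short-polling one empty queue once per second.
price_per_million = 0.50                # USD per 1M requests (assumed price)
seconds_per_month = 30 * 24 * 60 * 60   # 2_592_000 seconds in a 30-day month
requests          = seconds_per_month   # one empty receive per second
cost = requests / 1_000_000.0 * price_per_million
puts format('$%.2f', cost) # => "$1.30"
```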
You can have multiple workers per queue and multiple queues per Shoryuken process. You can enable
That isn't such a bad work-around regarding priority actually.
The High Priority queue processes messages as they arrive, waiting for new ones up to 20 seconds. The Low Priority queue only runs either based on Shoryuken rules or once every 20 seconds (if the High Priority queue is empty), and checks the High Priority queue immediately after.
Actually, that's not the way priorities work. Have a look at https://github.com/phstc/shoryuken#load-balancing to check how priorities work.
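For context, the load-balancing section linked above describes weighted queues rather than strict ordering; a minimal config sketch (queue names are placeholders - the weights mean the high-priority queue is checked more often per cycle, not exclusively first):

```yaml
# shoryuken.yml - weighted load balancing, per the Shoryuken README
queues:
  - [high_priority, 6]   # fetched ~6x as often as low_priority
  - [low_priority, 1]
```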
I thought it respected the wait time configured on the SQS queue itself?
Yup, it does: http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/SQS/Queue.html#receive_message-instance_method
Just FYI, a colleague wrote a PR that might address this problem.
It would be more cost-effective to use long polling rather than short polling, as stated in the SQS docs:
When should I use SQS long polling, and when should I use SQS short polling?
What do you think ?
Thx,
Vincent