
Long polling #30

Closed
vdaubry opened this issue Jan 6, 2015 · 25 comments

Comments

@vdaubry commented Jan 6, 2015

It would be more cost effective to use long polling rather than short polling, as stated in the SQS docs:

In almost all cases, SQS long polling is preferable to SQS short polling [...] However, if your application
is written to expect an immediate response from a ReceiveMessage call, you may not be able to take
advantage of long polling without some application modifications [...] In such an application, it is
recommended that a single thread be used to process only one queue, allowing for the application to
take advantage of the benefits SQS long polling provides.

When should I use SQS long polling, and when should I use SQS short polling?

What do you think?

Thx,
Vincent

@elsurudo (Contributor) commented Jan 6, 2015

Yes, this would be ideal. We talked about this before, but implementing it may be tricky due to the way Shoryuken load-balances multiple queues. Maybe @phstc can chime in on this.

@phstc (Collaborator) commented Jan 6, 2015

Correct! It's complicated because of the queue load balancing and the fixed number of processors (concurrency: 25). Long polling might leave a message ready to process without any available processor to consume it.

Shoryuken pauses queues when they are empty to cut down on empty requests.

@phstc phstc closed this as completed Jan 6, 2015
@elsurudo (Contributor) commented Jan 6, 2015

It might be worth leaving this issue open in case somebody wants to take a crack at it sometime. I think it would be a good improvement to Shoryuken, especially since it's a recommended practice.

@phstc (Collaborator) commented Jan 6, 2015

Let's reopen it then 🙏

But I don't see a simple way to do it given how Shoryuken currently works 😢

@phstc phstc reopened this Jan 6, 2015
@waynerobinson (Contributor)

I thought this was already implemented via the wait_time_seconds option?

@waynerobinson (Contributor)

If not, shouldn't the best option be to have a separate connection for each queue, with long polling?

If there aren't any processors available when a request comes in, it should just wait for one to become available. In most situations the visibility timeout should be set forgivingly enough to allow a small wait for a processor to become available, and in the worst case the item gets pushed back onto the queue and gets processed next time.

Also, this case would only occur rarely, because it only applies when the processors are full and one or more queues are largely empty.

As long as developers are made aware of this limitation, I don't think most would have much issue. Messages already need to be idempotent because Amazon makes no guarantees that messages won't be delivered multiple times, or in order. And anyone who doesn't want this behaviour just has to ensure that long polling is turned off.

@phstc (Collaborator) commented Jan 17, 2015

@waynerobinson you can pass wait_time_seconds for long polling, sure, but the problem is that message retrieval is sequential:

  • receive from queue A and wait 10 seconds
  • receive from queue B and wait 10 seconds

Queue A can make the process wait 10 seconds without returning any message, so queue B would wait 10 seconds for nothing.

The idea @elsurudo is suggesting is to do the receiving in parallel:

  • receive from queue A and wait 10 seconds AND receive from queue B and wait 10 seconds

Although it seems trivial, it's hard to implement due to the processor concurrency limit. Suppose you have 10 queues and all of them receive 10 messages at the same time. You would need 100 processors (concurrency: 100) to process all of them.
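
To make the parallel idea concrete, here is a minimal sketch (not Shoryuken's implementation) of one long-polling loop per queue, one thread each, using the aws-sdk-sqs gem. The queue URLs and the processing step are placeholders:

```ruby
require 'aws-sdk-sqs'

client = Aws::SQS::Client.new(region: 'us-east-1')

# Hypothetical queue URLs, for illustration only.
queue_urls = [
  'https://sqs.us-east-1.amazonaws.com/123456789012/queue-a',
  'https://sqs.us-east-1.amazonaws.com/123456789012/queue-b'
]

# One thread per queue: a slow or empty queue never blocks the others.
threads = queue_urls.map do |url|
  Thread.new do
    loop do
      # Long polling: blocks up to 10 seconds waiting for messages.
      resp = client.receive_message(
        queue_url: url,
        wait_time_seconds: 10,
        max_number_of_messages: 10
      )
      resp.messages.each do |message|
        # ... process(message) would go here ...
        client.delete_message(queue_url: url, receipt_handle: message.receipt_handle)
      end
    end
  end
end

threads.each(&:join)
```

This is exactly where the concurrency problem above bites: if every queue returns a full batch at the same time, the process needs queues * batch size processors free to consume them all.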

@phstc (Collaborator) commented Jan 17, 2015

the item gets pushed back on the queue and gets processed next time.

This is not recommended because of the dead letter queue: returning a message to the queue increases its receive count.

As long as developers are made aware of this limitation, I don't think most would have much issue. Messages already need to be idempotent because Amazon makes no guarantees messages won't be delivered multiple times, or in order. And if they don't want to use this feature, they just have to ensure that long polling is turned off.

I agree, it could be optional. But I don't see a clear way of doing that without making a big change in the code base.

@waynerobinson (Contributor)

@phstc I guess what I'm getting at is that I don't think the edge case you mentioned, receiving from multiple queues simultaneously, is likely.

Maybe I'm mistaken, but I would imagine most people either want to link individual queues to specific workers, or to use them for priority.

I also don't see most people trying to process large batches of data without having equally large amounts of concurrency, or very fast jobs.

I'm happy to have a go at this when I get a chance (although I have no idea when that would be, as we're looking at options to move away from all our ActiveMQ-based queueing). On a cursory glance at the code, though, I can't work out where the actual fetching from the queues happens: does a fetcher run for each queue, or is there a global fetcher that iterates over the queues and assigns messages to workers?

@phstc (Collaborator) commented Jan 17, 2015

Although long polling is a cost-effective solution (it reduces the number of empty requests), since Shoryuken pauses empty queues I'm not sure it would make a huge cost difference.

@phstc (Collaborator) commented Jan 17, 2015

Maybe I'm mistaken, but I would imagine most people either want to link individual queues to specific workers, or to use them for priority.

I see, but for a safe solution (one that doesn't return messages to the queue and bump their receive count), the number of processors (concurrency) must be number of queues * 10, as 10 is the maximum batch size.
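
As a worked example of that sizing rule (the numbers come from the comment above):

```ruby
# To long poll every queue in parallel without ever returning a message
# (which would bump its receive count), concurrency must cover a full
# batch from every queue at once.
queues               = 10
max_batch_size       = 10                        # SQS ReceiveMessage maximum
required_concurrency = queues * max_batch_size   # => 100
```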

@waynerobinson (Contributor)

Could you please describe how this pausing works, and for how long? Even if a queue is sitting empty, we would prefer the worker to start processing messages ASAP when they appear.

@waynerobinson (Contributor)

It seems to tie into the standard delay mechanism, which we would generally have set to 0 anyway.

@phstc (Collaborator) commented Jan 17, 2015

The pausing is based on the delay option; you can set it to 0 to stop pausing queues (I do that - SQS is cheap!).

@waynerobinson (Contributor)

I guess everything you've said makes sense. You're probably right and I shouldn't really be too concerned with this. We already spend > $8k/month with AWS and I'm not sure why I'm caring about $1.30 per empty queue per month without long polling.

@phstc (Collaborator) commented Jan 17, 2015

Long polling is the best solution for sure, but I don't see an easy way to implement it yet; the delay option is a good workaround for people looking to reduce costs. I don't use it myself, as I usually need to consume as fast as I can.

@waynerobinson (Contributor)

You're right, I don't think I could ever justify the expense of implementing long polling versus the savings. 👅 Especially as we're replacing hundreds of dollars a month in AMQ servers.

Just to confirm though: multiple workers could be assigned to a single queue, and the app limited to one queue per Shoryuken process with long polling on? Almost the best of both worlds (if we don't need any kind of priority management).

@phstc (Collaborator) commented Jan 17, 2015

$1.30 is about 2,600,000 requests, and a regular month (30 days) has 2,592,000 seconds. And that's not counting that the first 1M requests are free.
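
A back-of-the-envelope check of those figures, assuming the then-current SQS price of $0.50 per million requests (the price itself is an assumption, not stated in the thread):

```ruby
price_per_million = 0.50                                   # assumed SQS pricing
requests_for_1_30 = 1.30 / price_per_million * 1_000_000   # => 2_600_000
seconds_per_month = 30 * 24 * 60 * 60                      # => 2_592_000
# So $1.30/month is roughly one empty ReceiveMessage call per second,
# around the clock, before the 1M-request free tier.
```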

@phstc (Collaborator) commented Jan 17, 2015

You can have multiple workers per queue and multiple queues per Shoryuken process.

You can enable wait_time_seconds even when using multiple queues and priorities; the con is that receiving is sequential, so it might take a long time to receive messages from lower-priority queues.

@waynerobinson (Contributor)

That isn't such a bad work-around regarding priority actually.

  • High priority: 20-second wait_time_seconds configured on SQS
  • Low priority: no wait_time_seconds

The high priority queue processes messages as they arrive, waiting up to 20 seconds for new ones. The low priority queue only runs either based on Shoryuken rules or once every 20 seconds (if the high priority queue is empty), and the high priority queue gets checked immediately after.

@phstc (Collaborator) commented Jan 17, 2015

Actually, that's not the way priorities work. The wait_time_seconds setting is global; if you enable it, it is applied to all queues.

Have a look at https://github.com/phstc/shoryuken#load-balancing to check how priorities work.

@waynerobinson (Contributor)

I thought it respected the wait time configured on the SQS queue itself?

@phstc (Collaborator) commented Jan 17, 2015

Yup, it does: http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/SQS/Queue.html#receive_message-instance_method

:wait_time_seconds (Integer) — The number of seconds the service should wait for a response when requesting a new message. Defaults to the #wait_time_seconds attribute defined on the queue. See #wait_time_seconds to set the global long poll setting on the queue.
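
For illustration, a per-call override using the v1 SDK that the quote documents (queue URL hypothetical):

```ruby
require 'aws-sdk-v1'

sqs   = AWS::SQS.new
queue = sqs.queues['https://sqs.us-east-1.amazonaws.com/123456789012/my-queue']

# Overrides the queue's global long-poll attribute for this call only;
# omit :wait_time_seconds to fall back to the queue's default.
message = queue.receive_message(wait_time_seconds: 20)
```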

@bbaja42 (Contributor) commented May 26, 2016

Just FYI, a colleague wrote a PR that might address this problem: #214
