In many systems, the rate at which you write messages to a queue can easily exceed the rate at which a single consumer can read and process those same messages. This is often because your consumer might be parsing message contents, writing to storage or a database, or making third-party (upstream) API calls.
How concurrency works
The number of consumers concurrently invoked for a queue will autoscale based on several factors, including:
- The number of messages in the queue (backlog) and its rate of growth.
- The ratio of failed (versus successful) invocations.
- The value of max_concurrency set for that consumer.
Where possible, Queues will optimize for keeping your backlog from growing exponentially, to minimize scenarios where the backlog grows to the point that messages would reach the message retention limit before being processed.
If you are writing 100 messages/second to a queue with a single concurrent consumer that takes 5 seconds to process a batch of 100 messages, the number of messages in-flight will continue to grow at a rate faster than your consumer can keep up.
In this scenario, Queues will notice the growing backlog and will scale the number of concurrent consumer Workers invocations up to a steady-state of (approximately) five (5) until the rate of incoming messages decreases, the consumer processes messages faster, or the consumer begins to generate errors.
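The arithmetic behind that steady-state figure can be sketched as follows. This is only an illustration of the reasoning, not how Queues computes its scaling internally; `requiredConsumers` and its parameters are hypothetical names, and the model assumes every batch is full and takes the same time to process.

```typescript
// Hypothetical helper: estimates how many concurrent consumers are needed
// to keep pace with producers. Not part of the Queues API.
function requiredConsumers(
  incomingPerSecond: number, // messages written to the queue per second
  batchSize: number, // messages handled per consumer invocation
  secondsPerBatch: number, // time one invocation takes to process a batch
): number {
  // Throughput of a single consumer, in messages per second.
  const perConsumerRate = batchSize / secondsPerBatch;
  // Round up: a fractional consumer cannot be scheduled.
  return Math.ceil(incomingPerSecond / perConsumerRate);
}

// 100 messages/second written, batches of 100 that take 5 seconds each.
const needed = requiredConsumers(100, 100, 5);
console.log(needed); // 5
```

A single consumer in this example drains 20 messages/second, so five are needed to match 100 messages/second of writes.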
If you have a workflow that is limited by an upstream API and/or system, you may prefer for your backlog to grow, trading off increased overall latency in order to avoid overwhelming an upstream system.
Concurrency settings can be configured in each project's wrangler.toml file and/or the Cloudflare dashboard. To set concurrency settings in the Cloudflare dashboard:
- Log in to the Cloudflare dashboard and select your account.
- Select Workers & Pages > Queues.
- Select your queue > Settings.
- Select Edit Consumer under Consumer details.
- Set Maximum consumer invocations to a value between 1 and 10. This value represents the maximum number of concurrent consumer invocations available to your queue.
To remove a fixed maximum value, select auto (recommended).
Note that if you are writing messages to a queue faster than you can process them, messages may eventually reach the maximum retention period set for that queue. Individual messages that reach that limit will expire from the queue and be deleted.
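To see how a fixed maximum affects the backlog, here is a rough sketch. `backlogGrowthPerSecond` is a hypothetical helper, not a Queues API, and it assumes full batches and a constant processing time per batch.

```typescript
// Hypothetical helper: net backlog growth (messages/second) when the number
// of concurrent consumer invocations is capped. 0 means consumers keep up.
function backlogGrowthPerSecond(
  incomingPerSecond: number, // messages written to the queue per second
  batchSize: number, // messages handled per consumer invocation
  secondsPerBatch: number, // time one invocation takes to process a batch
  maxConcurrency: number, // cap on concurrent consumer invocations
): number {
  // Combined throughput of all allowed consumers, in messages per second.
  const drainRate = (batchSize / secondsPerBatch) * maxConcurrency;
  // The backlog cannot shrink below empty, so clamp at zero.
  return Math.max(0, incomingPerSecond - drainRate);
}

// 100 messages/second written, 5 seconds per batch of 100,
// capped at 2 concurrent invocations.
const growth = backlogGrowthPerSecond(100, 100, 5, 2);
console.log(growth); // 60
```

Under these assumptions the backlog grows by 60 messages every second for as long as the write rate holds, which is how messages can end up waiting long enough to expire.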
Set concurrency settings via wrangler.toml
To set a fixed maximum number of concurrent consumer invocations for a given queue, configure a max_concurrency in your wrangler.toml file:

    [[queues.consumers]]
    queue = "my-queue"
    max_concurrency = 1
To remove the limit, remove the max_concurrency setting from the [[queues.consumers]] configuration for a given queue and call npx wrangler deploy to push your configuration update.
- If you intend to process all messages written to a queue, the effective overall cost is the same, even with concurrency enabled.
- Enabling concurrency simply brings those costs forward, and can help prevent messages from reaching the message retention limit.
A consumer Worker that takes 2 seconds to process a batch of messages will incur the same overall costs to process 50 million (50,000,000) messages, whether it does so concurrently (faster) or individually (slower).
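A simplified model shows why the total stays the same. It assumes cost tracks total consumer processing time; the batch size of 100 is an assumption for illustration, and `estimateProcessing` is a hypothetical helper rather than a billing formula.

```typescript
// Hypothetical model: total consumer compute time versus wall-clock time.
interface ProcessingEstimate {
  totalComputeSeconds: number; // summed across all invocations
  wallClockSeconds: number; // elapsed time until the backlog is drained
}

function estimateProcessing(
  totalMessages: number,
  batchSize: number, // assumed value; not stated in the text above
  secondsPerBatch: number,
  concurrency: number,
): ProcessingEstimate {
  const batches = Math.ceil(totalMessages / batchSize);
  // Every batch costs the same compute time regardless of concurrency.
  const totalComputeSeconds = batches * secondsPerBatch;
  // Concurrency only shortens how long the work takes to finish.
  return {
    totalComputeSeconds,
    wallClockSeconds: totalComputeSeconds / concurrency,
  };
}

const serial = estimateProcessing(50_000_000, 100, 2, 1);
const parallel = estimateProcessing(50_000_000, 100, 2, 10);
console.log(serial.totalComputeSeconds === parallel.totalComputeSeconds); // true
console.log(parallel.wallClockSeconds < serial.wallClockSeconds); // true
```

In this model, concurrency changes only the elapsed time, not the amount of work performed.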