Changelog
Queues is now generally available.
The per-queue message throughput has increased from 400 to 5,000 messages per second. This applies to new and existing queues.
The maximum number of concurrent consumers has increased from 20 to 250. This applies to new and existing queues. Queues with no explicit limit will automatically scale to the new maximum. Review the consumer concurrency documentation to learn more.
Messages published to a queue and/or marked for retry from a queue consumer can now be explicitly delayed. Delaying messages allows you to defer tasks until later, and/or respond to backpressure when consuming from a queue.
Refer to Batching and Retries to learn how to delay messages written to a queue.
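As a rough illustration of both kinds of delay, the sketch below assumes the delaySeconds option on send() and retry() and the attempts counter on a message. The interfaces are local stand-ins for the Workers runtime types. The backoffSeconds helper and its base and cap values are hypothetical, not part of the Queues API.

```typescript
// Local stand-ins for the Workers runtime types, so the sketch is self-contained.
interface DelayOptions {
  delaySeconds?: number;
}
interface Producer<T> {
  send(body: T, options?: DelayOptions): Promise<void>;
}
interface RetryableMessage {
  readonly attempts: number; // assumed: 1 on first delivery
  retry(options?: DelayOptions): void;
}

// Hypothetical helper: exponential backoff, capped at maxSeconds.
function backoffSeconds(attempts: number, baseSeconds = 10, maxSeconds = 3600): number {
  const exponent = Math.max(0, attempts - 1);
  return Math.min(maxSeconds, baseSeconds * Math.pow(2, exponent));
}

// Producer side: defer a task by ten minutes.
async function sendLater<T>(queue: Producer<T>, body: T): Promise<void> {
  await queue.send(body, { delaySeconds: 600 });
}

// Consumer side: respond to backpressure by retrying after a growing delay.
function retryWithBackoff(msg: RetryableMessage): void {
  msg.retry({ delaySeconds: backoffSeconds(msg.attempts) });
}
```

The cap keeps a repeatedly failing message from being pushed arbitrarily far into the future. Check the Limits page for the maximum delay Queues accepts.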
Queues now supports pull-based consumers. A pull-based consumer allows you to pull from a queue over HTTP from any environment and/or programming language outside of Cloudflare Workers. A pull-based consumer can be useful when your message consumption rate is limited by upstream infrastructure or long-running tasks.
Review the documentation on pull-based consumers to configure HTTP-based pull.
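A pull request over HTTP might be assembled along these lines. The endpoint path and the batch_size and visibility_timeout_ms field names are assumptions here and should be checked against the pull-based consumers documentation. The buildPullRequest helper and its default values are hypothetical.

```typescript
// Shape of the request this sketch produces; usable with any fetch() implementation.
interface PullRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: builds (but does not send) a request for one batch of messages.
function buildPullRequest(
  accountId: string,
  queueId: string,
  apiToken: string,
  batchSize = 10,
  visibilityTimeoutMs = 30000,
): PullRequest {
  return {
    // Assumed endpoint path; verify against the pull-based consumers documentation.
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/queues/${queueId}/messages/pull`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${apiToken}`,
      },
      // Assumed field names for batch size and per-message visibility timeout.
      body: JSON.stringify({
        batch_size: batchSize,
        visibility_timeout_ms: visibilityTimeoutMs,
      }),
    },
  };
}

// Usage from any environment that provides fetch():
//   const { url, init } = buildPullRequest(ACCOUNT_ID, QUEUE_ID, API_TOKEN);
//   const response = await fetch(url, init);
```

Keeping request construction separate from the network call makes it easy to reuse from whatever HTTP client the upstream environment already has.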
The default content type for messages published to a queue is now json, which improves compatibility with the upcoming pull-based queues.
Any Worker with a compatibility date of 2024-03-18 or later, or that explicitly sets the queues_json_messages compatibility flag, will use the new default behaviour. Existing Workers with an earlier compatibility date will continue to use v8 as the default content type for published messages.
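A producer that should not depend on the compatibility date can set the content type on each send. The sketch below assumes the contentType option on send(). The interfaces are local stand-ins for the Workers runtime types, and sendAsJson is a hypothetical helper name.

```typescript
// Local stand-ins for the Workers runtime types, so the sketch is self-contained.
type QueueContentType = "text" | "bytes" | "json" | "v8";
interface SendOptions {
  contentType?: QueueContentType;
}
interface Producer<T> {
  send(body: T, options?: SendOptions): Promise<void>;
}

// Hypothetical helper: always publish as JSON, regardless of the Worker's default.
async function sendAsJson<T>(queue: Producer<T>, body: T): Promise<void> {
  await queue.send(body, { contentType: "json" });
}
```

Setting the content type explicitly keeps message encoding stable if the Worker's compatibility date is changed later.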
Calling retry() or retryAll() on a message or message batch will no longer have an impact on how Queues scales consumer concurrency.
Previously, using explicit retries via retry() or retryAll() would count as an error and could result in Queues scaling down the number of concurrent consumers.
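For illustration, a consumer might use explicit retries as sketched below. The interfaces are local stand-ins for the Workers runtime types. settleBatch, upstreamHealthy and isReady are hypothetical names standing in for whatever condition your consumer checks.

```typescript
// Local stand-ins for the Workers runtime types, so the sketch is self-contained.
interface QueueMessage<T> {
  readonly id: string;
  readonly body: T;
  ack(): void;
  retry(): void;
}
interface Batch<T> {
  readonly messages: readonly QueueMessage<T>[];
  retryAll(): void;
}

// Hypothetical helper: decide the outcome of each message without throwing.
function settleBatch<T>(
  batch: Batch<T>,
  upstreamHealthy: boolean,
  isReady: (body: T) => boolean,
): void {
  if (!upstreamHealthy) {
    // Nothing can be processed right now: ask for the whole batch again.
    batch.retryAll();
    return;
  }
  for (const msg of batch.messages) {
    if (isReady(msg.body)) {
      msg.ack();
    } else {
      // Not ready yet: retry only this message.
      msg.retry();
    }
  }
}
```

Because explicit retries are no longer treated as errors, a consumer written this way can defer work without reducing its own concurrency.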
Developers building on Queues can now create up to 10,000 queues per account, enabling easier per-user, per-job and sharding use cases.
Refer to Limits to learn more about Queues' current limits.
Queue consumers can now scale to 20 concurrent invocations (per queue), up from 10. This allows you to scale out and process higher throughput queues more quickly.
Queues with no explicit limit specified will automatically scale to the new maximum.
This limit will continue to grow during the Queues beta.
Queue consumers will now automatically scale up based on the number of messages being written to the queue. To control or limit concurrency, you can explicitly define a max_concurrency for your consumer.
Queue consumers will soon automatically scale up their concurrency as a queue's backlog grows in order to keep overall message processing latency down. Concurrency will be enabled on all existing queues by 2023-03-28.
To opt out, set max_concurrency = 1 in your wrangler.toml file or via the queues dashboard. To configure a fixed maximum concurrency instead, set max_concurrency to the limit you want.
To opt in, you do not need to take any action: your consumer will begin to scale out as needed to keep up with your message backlog. It will scale back down as the backlog shrinks, and/or if a consumer starts to generate a higher rate of errors. To learn more about how consumers scale, refer to the consumer concurrency documentation.
You can now acknowledge individual messages within a batch by calling .ack() on a message.
This allows you to mark a message as delivered as you process it within a batch, and prevents the entire batch from being redelivered if your consumer throws an error during batch processing. This can be particularly useful when you are calling external APIs, writing messages to a database, or otherwise performing non-idempotent actions on individual messages within a batch.
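The pattern can be sketched as follows. The interface is a local stand-in for the Workers runtime type, and processBatch and handle are hypothetical names for your own processing code.

```typescript
// Local stand-in for the Workers runtime type, so the sketch is self-contained.
interface AckableMessage<T> {
  readonly body: T;
  ack(): void;
}

// Hypothetical helper: acknowledge each message as soon as its work has succeeded.
async function processBatch<T>(
  messages: readonly AckableMessage<T>[],
  handle: (body: T) => Promise<void>,
): Promise<void> {
  for (const msg of messages) {
    // May throw, for example when an external API call fails.
    await handle(msg.body);
    // Reached only on success: this message will not be redelivered,
    // even if a later message in the same batch throws.
    msg.ack();
  }
}
```

If handle throws partway through, the messages already acknowledged stay acknowledged, so a non-idempotent action such as a database write is not repeated for them.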
The per-queue throughput limit has now been raised to 400 messages per second.
The JavaScript API for Queue producers now includes a sendBatch method which supports sending up to 100 messages at a time.
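To publish more than 100 messages, a producer can split them into groups first. In the sketch below the interface is a local stand-in for the Workers runtime type, and it assumes sendBatch accepts objects of the form { body }. chunk and sendAll are hypothetical helpers.

```typescript
// Local stand-in for the Workers runtime type, so the sketch is self-contained.
interface BatchProducer<T> {
  sendBatch(messages: Iterable<{ body: T }>): Promise<void>;
}

// Per-call message limit stated in this changelog entry.
const MAX_BATCH_SIZE = 100;

// Hypothetical helper: split items into consecutive groups of at most `size`.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Hypothetical helper: publish any number of bodies, one sendBatch call per group.
async function sendAll<T>(queue: BatchProducer<T>, bodies: readonly T[]): Promise<void> {
  for (const group of chunk(bodies, MAX_BATCH_SIZE)) {
    await queue.sendBatch(group.map((body) => ({ body })));
  }
}
```

The groups are sent sequentially, so a failed call stops the loop before later groups are published.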