Scaling and Routing
Date: 2025-06-24
Currently, Containers are only scaled manually by calling BINDING.get() with a unique ID, then starting the container. Unless manualStart is set to true on the Container class, each instance will start when get() is called.
Each instance will run until its sleepAfter time has elapsed, or until it is manually stopped.
This behavior is useful when you want explicit control over the lifecycle of container instances. For instance, you may want to spin up a container backend instance for a specific user, briefly run a code sandbox to isolate AI-generated code, or run a short-lived batch job.
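As a rough illustration of the per-user case, the sketch below maps each user ID to its own instance. The InstanceBinding and InstanceStub interfaces are simplified stand-ins written for this example (the real binding types differ and expose more), and instanceForUser and FakeBinding are hypothetical names used only to show the one-instance-per-ID behavior.

```typescript
// Simplified stand-ins for a container binding (assumption: not the
// platform's real types, which are richer than this).
interface InstanceStub {
  fetch(path: string): Promise<string>;
}

interface InstanceBinding {
  get(id: string): InstanceStub;
}

// Hypothetical helper: derive a unique ID from the user ID so that every
// request from the same user reaches the same container instance.
function instanceForUser(binding: InstanceBinding, userId: string): InstanceStub {
  return binding.get(`user-${userId}`);
}

// In-memory fake used only to demonstrate the behavior locally.
class FakeBinding implements InstanceBinding {
  readonly started = new Map<string, InstanceStub>();

  get(id: string): InstanceStub {
    let stub = this.started.get(id);
    if (stub === undefined) {
      // Models the default described above: the instance starts on get().
      stub = { fetch: async (path: string) => `${id} handled ${path}` };
      this.started.set(id, stub);
    }
    return stub;
  }
}
```

Calling instanceForUser twice with the same user ID returns the same instance, while a different user ID produces a second one. In this model nothing is shared between users unless you deliberately reuse an ID.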
However, sometimes you want to run multiple instances of a container and easily route requests to them.
Currently, the best way to achieve this is with the temporary getRandom helper function.
We have provided the getRandom function as a stopgap solution to route to multiple stateless container instances. It will randomly select one of N instances for each request and route to it. Unfortunately, it has two major downsides:
- It requires that the user set a fixed number of instances to route to.
- It will randomly select each instance, regardless of location.
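The selection step can be sketched as follows. This is not the source of getRandom; it is a minimal illustration of uniform random selection among a fixed number of instances, and the pickRandomInstanceName helper, the instance-N naming scheme, and the injectable random parameter are all assumptions made for the example.

```typescript
// Hypothetical helper: choose one of `instanceCount` instance names
// uniformly at random. `random` is injectable so the choice can be tested.
function pickRandomInstanceName(
  instanceCount: number,
  random: () => number = Math.random,
): string {
  if (!Number.isInteger(instanceCount) || instanceCount < 1) {
    throw new RangeError("instanceCount must be a positive integer");
  }
  // Math.random() returns a value in [0, 1), so the index is in [0, instanceCount).
  const index = Math.floor(random() * instanceCount);
  // Assumed naming scheme: each name would identify one container instance.
  return `instance-${index}`;
}
```

Both downsides are visible in the sketch: instanceCount has to be fixed by the caller, and nothing in the selection takes the location of the request or of the instance into account.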
We plan to fix these issues with built-in autoscaling and routing features in the near future.
You will be able to turn autoscaling on for a Container by setting the autoscale property on the Container class.
This instructs the platform to automatically scale instances based on incoming traffic and resource usage (memory, CPU).
Container instances will be launched automatically to serve local traffic, and will be stopped when they are no longer needed.
To route requests to the correct instance, you will use the getContainer() helper function to get a container instance, then pass requests to it.
This will send traffic to the nearest ready instance of a container. If a container is overloaded or has not yet launched, requests will be routed to a potentially more distant container. Container readiness can be automatically determined based on resource use, but will also be configurable with custom readiness checks.
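Because these features are not yet released, the following is only a conceptual sketch of "nearest ready instance" selection, not the platform's routing algorithm. The InstanceInfo shape, the use of latencyMs as a proxy for distance, and the pickNearestReady helper are all hypothetical.

```typescript
// Hypothetical view of a candidate instance as seen by a router.
interface InstanceInfo {
  name: string;
  latencyMs: number; // assumed proxy for distance from the request
  ready: boolean;    // assumed result of a readiness check
}

// Return the ready instance with the lowest latency, or undefined if no
// instance is ready. Instances that are not ready are skipped, so a more
// distant ready instance wins over a nearer one that is not ready.
function pickNearestReady(instances: InstanceInfo[]): InstanceInfo | undefined {
  let best: InstanceInfo | undefined;
  for (const instance of instances) {
    if (!instance.ready) continue;
    if (best === undefined || instance.latencyMs < best.latencyMs) {
      best = instance;
    }
  }
  return best;
}
```

In this model, an overloaded or not-yet-launched instance would simply report ready as false, which is what pushes the request to the next nearest candidate.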
Autoscaling and latency-aware routing will be available in the near future, and will be documented in more detail when released.
Until then, you can use the getRandom helper function to route requests to multiple container instances.
