Rolling Deployments

Use rolling deployments to gradually update WorkClaw agent runtimes one host at a time, minimizing blast radius while maintaining zero downtime.

What is a rolling deployment?

A rolling deployment updates runtime hosts one at a time (or a configurable number at a time). Each host is drained of active connections, updated to the new build, health-checked, and returned to service before the next host begins. Because the rest of the fleet keeps serving traffic, there is no downtime, and a build that fails its health check affects only the host (or hosts) being updated at that moment.

How does the rolling process work?

The deployment proceeds host by host:

Select host -- WorkClaw picks the next runtime host in the rotation.
Drain connections -- active conversations and in-flight requests on that host are allowed to complete. New traffic is routed to other healthy hosts. Only the host or hosts currently being updated are drained; the rest of the fleet keeps serving traffic.
Update -- the host's containers are replaced with the new build.
Health check -- WorkClaw verifies the updated host responds correctly, skills initialize, and connections authenticate.
Return to service -- the host rejoins the load balancer and begins receiving traffic.
Repeat -- the process continues with the next host until all hosts are updated.
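
The steps above can be sketched as a small simulation. This is an illustration only, not WorkClaw's implementation: the Host class, the rolling_update function, and the health_check callback are hypothetical names, and draining is reduced to taking the host out of service.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical model of a runtime host; not a WorkClaw type.
    name: str
    build: str
    in_service: bool = True


def rolling_update(hosts, new_build, health_check):
    """Update hosts one at a time; return the host that failed, or None."""
    for host in hosts:                 # Select host
        host.in_service = False        # Drain: stop routing new traffic to it
        host.build = new_build         # Update: replace with the new build
        if not health_check(host):     # Health check
            return host                # Pause: failed host stays out of rotation
        host.in_service = True         # Return to service
    return None                        # Every host was updated successfully


fleet = [Host("host-a", "v1"), Host("host-b", "v1"), Host("host-c", "v1")]
# Assumed health check for the example: host-b fails, every other host passes.
failed = rolling_update(fleet, "v2", lambda h: h.name != "host-b")
print(failed.name, [(h.name, h.build, h.in_service) for h in fleet])
```

In this run host-a ends up in service on v2, host-b is on v2 but out of rotation, and host-c is untouched on v1, which is the paused state described under "What happens if a host fails its health check?".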

When should I use rolling deployments?

Rolling deployments work well when:

You operate a multi-host fleet and want to minimize risk during rollouts.
You want to avoid the resource overhead of running duplicate environments (unlike blue/green).
You are comfortable with a longer rollout window in exchange for a smaller blast radius.

How do I configure the concurrency?

By default, WorkClaw updates one host at a time. You can increase the concurrency under Admin > Settings > Deploy Strategy > Rolling > Max Concurrent. Higher concurrency speeds up the rollout but increases the blast radius if the new build has issues.
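
The trade-off can be made concrete with a rough sketch. The function names plan_batches and worst_case_exposure are hypothetical, and the sketch assumes that Max Concurrent simply groups hosts into consecutive batches; WorkClaw's actual scheduling may differ.

```python
def plan_batches(hosts, max_concurrent=1):
    """Split hosts into consecutive groups of at most max_concurrent."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return [hosts[i:i + max_concurrent]
            for i in range(0, len(hosts), max_concurrent)]


def worst_case_exposure(host_count, max_concurrent=1):
    """Fraction of the fleet that can be mid-update at the same time."""
    return min(max_concurrent, host_count) / host_count


hosts = ["h1", "h2", "h3", "h4", "h5", "h6"]
for setting in (1, 3):
    batches = plan_batches(hosts, setting)
    print(setting, len(batches), worst_case_exposure(len(hosts), setting))
```

For a six-host fleet, a setting of 1 gives six rounds with one sixth of the fleet exposed at a time, while a setting of 3 gives two rounds with half of the fleet exposed at a time.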

What happens if a host fails its health check?

WorkClaw pauses the rolling deployment and sends an alert. The failed host remains out of rotation, and all previously updated hosts continue serving traffic on the new build. You can investigate via Monitoring, then choose to retry the failed host, skip it, or roll back all updated hosts to the previous build.
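
One way to reason about the three choices is to ask which hosts each one touches next. The sketch below is a hypothetical model, not a WorkClaw API: resolve_pause and its arguments are illustrative names. The source does not say whether the failed host itself is reverted during a rollback, so the sketch reverts only the hosts that were updated successfully.

```python
def resolve_pause(action, updated, failed, pending):
    """Return which hosts to update or revert after a paused rollout.

    updated: hosts already serving the new build
    failed:  the host that failed its health check
    pending: hosts not yet touched
    """
    if action == "retry":
        # Try the failed host again, then continue with the rest.
        return {"update": [failed] + list(pending), "revert": []}
    if action == "skip":
        # Leave the failed host out of rotation and continue.
        return {"update": list(pending), "revert": []}
    if action == "rollback":
        # Assumption: only successfully updated hosts are reverted.
        return {"update": [], "revert": list(updated)}
    raise ValueError(f"unknown action: {action!r}")


plan = resolve_pause("skip", updated=["host-a"], failed="host-b",
                     pending=["host-c", "host-d"])
print(plan)
```

Here skipping host-b leaves host-c and host-d still to be updated and reverts nothing.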

How does rollback work for rolling deployments?

Rolling rollbacks reverse the process -- updated hosts are reverted one at a time to the previous build. Hosts that were not yet updated remain untouched. This means rollback time is proportional to the number of hosts that were updated before the issue was caught.
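
A minimal sketch of this behaviour, assuming the fleet is tracked as a simple mapping from host name to build. The rollback function and the data layout are hypothetical and only illustrate why the work scales with the number of updated hosts.

```python
def rollback(builds, updated_hosts, previous_build):
    """Revert only the updated hosts, one at a time; return the step count."""
    steps = 0
    for host in updated_hosts:
        builds[host] = previous_build   # revert this host to the old build
        steps += 1                      # one revert step per updated host
    return steps


# Two of four hosts were updated to v2 before the issue was caught.
builds = {"host-a": "v2", "host-b": "v2", "host-c": "v1", "host-d": "v1"}
steps = rollback(builds, ["host-a", "host-b"], "v1")
print(steps, builds)
```

Only host-a and host-b are reverted, so the rollback takes two steps; host-c and host-d are never touched.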