Data persistence with R2
Mount object storage buckets as local filesystem paths to persist data across sandbox lifecycles. This tutorial uses Cloudflare R2, but the same approach works with any S3-compatible provider.
Time to complete: 20 minutes
You will build a Worker that processes data, stores results in an R2 bucket mounted as a local directory, and demonstrates that data persists even after the sandbox is destroyed and recreated.
Key concepts you'll learn:
- Mounting R2 buckets as filesystem paths
- Automatic data persistence across sandbox lifecycles
- Working with mounted storage using standard file operations
- Sign up for a Cloudflare account ↗.
- Install Node.js ↗.
Node.js version manager
Use a Node version manager like Volta ↗ or nvm ↗ to avoid permission issues and change Node.js versions. Wrangler, discussed later in this guide, requires a Node version of 16.17.0 or later.
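To confirm that an installed Node.js release meets the 16.17.0 minimum mentioned above, a numeric version comparison is enough. The sketch below is illustrative only: `meetsMinimum` is a hypothetical helper, not something Wrangler or Node.js provides, and it handles plain dotted versions rather than pre-release tags.

```typescript
// meetsMinimum is a hypothetical helper, not part of Wrangler or Node.js.
// It compares dotted version strings numerically, segment by segment,
// so "16.9.0" is correctly treated as older than "16.17.0".
function meetsMinimum(current: string, minimum: string): boolean {
  const a = current.replace(/^v/, "").split(".").map(Number);
  const b = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0; // missing segments count as 0
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true; // versions are equal
}
```

In a Node.js script you could pass `process.version` as the first argument and "16.17.0" as the second.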
You'll also need:
- Docker ↗ running locally
- An R2 bucket (create one in the Cloudflare dashboard ↗)
# npm
npm create cloudflare@latest -- data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal

# yarn
yarn create cloudflare data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal

# pnpm
pnpm create cloudflare@latest data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal

cd data-pipeline

Add an R2 bucket binding to your wrangler.json:
{
  "name": "data-pipeline",
  "compatibility_date": "2025-11-09",
  "durable_objects": {
    "bindings": [
      {
        "name": "Sandbox",
        "class_name": "Sandbox"
      }
    ]
  },
  "r2_buckets": [
    {
      "binding": "DATA_BUCKET",
      "bucket_name": "my-data-bucket"
    }
  ]
}

Replace my-data-bucket with your R2 bucket name. Create the bucket first in the Cloudflare dashboard ↗.
Replace src/index.ts with code that mounts R2 and processes data:
JavaScript:

import { getSandbox } from "@cloudflare/sandbox";

export { Sandbox } from "@cloudflare/sandbox";

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const sandbox = getSandbox(env.Sandbox, "data-processor");

    // Mount R2 bucket to /data directory
    // (use your own bucket name and replace YOUR_ACCOUNT_ID)
    await sandbox.mountBucket("my-data-bucket", "/data", {
      endpoint: "https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com",
    });

    if (url.pathname === "/process") {
      // Process data and save to mounted R2
      const result = await sandbox.exec("python", {
        args: [
          "-c",
          `
import json
import os
from datetime import datetime

# Read input (or create sample data)
data = [
    {'id': 1, 'value': 42},
    {'id': 2, 'value': 87},
    {'id': 3, 'value': 15}
]

# Process: calculate sum and average
total = sum(item['value'] for item in data)
avg = total / len(data)

# Save results to mounted R2 (/data is the mounted bucket)
result = {
    'timestamp': datetime.now().isoformat(),
    'total': total,
    'average': avg,
    'processed_count': len(data)
}

os.makedirs('/data/results', exist_ok=True)
with open('/data/results/latest.json', 'w') as f:
    json.dump(result, f, indent=2)

print(json.dumps(result))
`,
        ],
      });

      return Response.json({
        message: "Data processed and saved to R2",
        result: JSON.parse(result.stdout),
      });
    }

    if (url.pathname === "/results") {
      // Read results from mounted R2
      const result = await sandbox.exec("cat", {
        args: ["/data/results/latest.json"],
      });

      if (!result.success) {
        return Response.json(
          { error: "No results found yet" },
          { status: 404 },
        );
      }

      return Response.json({
        message: "Results retrieved from R2",
        data: JSON.parse(result.stdout),
      });
    }

    if (url.pathname === "/destroy") {
      // Destroy sandbox to demonstrate persistence
      await sandbox.destroy();
      return Response.json({
        message: "Sandbox destroyed. Data persists in R2!",
      });
    }

    return new Response(
      `Data Pipeline with Persistent Storage

Endpoints:
- POST /process - Process data and save to R2
- GET /results - Retrieve results from R2
- POST /destroy - Destroy sandbox (data survives!)

Try this flow:
1. POST /process (processes and saves to R2)
2. POST /destroy (destroys sandbox)
3. GET /results (data still accessible from R2)
`,
      { headers: { "Content-Type": "text/plain" } },
    );
  },
};

TypeScript:

import { getSandbox, type Sandbox } from '@cloudflare/sandbox';
export { Sandbox } from '@cloudflare/sandbox';

interface Env {
  Sandbox: DurableObjectNamespace<Sandbox>;
  DATA_BUCKET: R2Bucket;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    const sandbox = getSandbox(env.Sandbox, 'data-processor');

    // Mount R2 bucket to /data directory
    // (use your own bucket name and replace YOUR_ACCOUNT_ID)
    await sandbox.mountBucket('my-data-bucket', '/data', {
      endpoint: 'https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com'
    });

    if (url.pathname === '/process') {
      // Process data and save to mounted R2
      const result = await sandbox.exec('python', {
        args: ['-c', `
import json
import os
from datetime import datetime

# Read input (or create sample data)
data = [
    {'id': 1, 'value': 42},
    {'id': 2, 'value': 87},
    {'id': 3, 'value': 15}
]

# Process: calculate sum and average
total = sum(item['value'] for item in data)
avg = total / len(data)

# Save results to mounted R2 (/data is the mounted bucket)
result = {
    'timestamp': datetime.now().isoformat(),
    'total': total,
    'average': avg,
    'processed_count': len(data)
}

os.makedirs('/data/results', exist_ok=True)
with open('/data/results/latest.json', 'w') as f:
    json.dump(result, f, indent=2)

print(json.dumps(result))
`]
      });

      return Response.json({
        message: 'Data processed and saved to R2',
        result: JSON.parse(result.stdout)
      });
    }

    if (url.pathname === '/results') {
      // Read results from mounted R2
      const result = await sandbox.exec('cat', {
        args: ['/data/results/latest.json']
      });

      if (!result.success) {
        return Response.json({ error: 'No results found yet' }, { status: 404 });
      }

      return Response.json({
        message: 'Results retrieved from R2',
        data: JSON.parse(result.stdout)
      });
    }

    if (url.pathname === '/destroy') {
      // Destroy sandbox to demonstrate persistence
      await sandbox.destroy();
      return Response.json({
        message: 'Sandbox destroyed. Data persists in R2!'
      });
    }

    return new Response(`Data Pipeline with Persistent Storage

Endpoints:
- POST /process - Process data and save to R2
- GET /results - Retrieve results from R2
- POST /destroy - Destroy sandbox (data survives!)

Try this flow:
1. POST /process (processes and saves to R2)
2. POST /destroy (destroys sandbox)
3. GET /results (data still accessible from R2)
`, { headers: { 'Content-Type': 'text/plain' } });
  }
};

Generate R2 API tokens:
- Go to R2 > Overview in the Cloudflare dashboard ↗
- Select Manage R2 API Tokens
- Create a token with Object Read & Write permissions
- Copy the Access Key ID and Secret Access Key
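Alongside these credentials, the mountBucket() call in the Worker needs the account-specific endpoint, and it is easy to deploy with the YOUR_ACCOUNT_ID placeholder still in place. As a rough sketch, the endpoint could be built by a small function that rejects anything that does not look like an account ID. `r2Endpoint` is a hypothetical helper, and the 32-hexadecimal-character check is an assumption about the account ID format, not something this tutorial specifies.

```typescript
// r2Endpoint is a hypothetical helper; the URL pattern is the one used in this tutorial.
function r2Endpoint(accountId: string): string {
  const id = accountId.trim().toLowerCase();
  // Assumption: account IDs are 32 hexadecimal characters.
  if (!/^[0-9a-f]{32}$/.test(id)) {
    throw new Error("Expected a 32-character hexadecimal account ID");
  }
  return `https://${id}.r2.cloudflarestorage.com`;
}
```

The returned string can be passed as the endpoint option in place of the hard-coded URL.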
Set up credentials as Worker secrets:
npx wrangler secret put AWS_ACCESS_KEY_ID
# Paste your R2 Access Key ID

npx wrangler secret put AWS_SECRET_ACCESS_KEY
# Paste your R2 Secret Access Key

Worker secrets are encrypted and only accessible to your deployed Worker. The SDK automatically detects these credentials when mountBucket() is called.
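Because the credentials are picked up automatically, a missing secret may only show up as a failed mount. One possible guard, sketched here under the assumption that the secrets appear on the Worker's env object under the names set above, is to check for them before mounting. `missingSecrets` is a hypothetical helper and is not part of the Sandbox SDK.

```typescript
// Secret names as set with `wrangler secret put` in this tutorial.
const REQUIRED_SECRETS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"] as const;

// missingSecrets is a hypothetical helper, not part of the Sandbox SDK.
// It returns the names of required secrets that are absent or empty.
function missingSecrets(env: Record<string, unknown>): string[] {
  return REQUIRED_SECRETS.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.length === 0;
  });
}
```

A fetch handler could call this first and return a 500 response listing the missing names when the array is not empty.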
Deploy your Worker:
npx wrangler deploy

After deployment, wrangler outputs your Worker URL (e.g., https://data-pipeline.yourname.workers.dev).
Now test against your deployed Worker. Replace YOUR_WORKER_URL with your actual Worker URL:
# 1. Process data (saves to R2)
curl -X POST https://YOUR_WORKER_URL/process
# Returns: { "message": "Data processed...", "result": { "total": 144, "average": 48, ... } }

# 2. Verify data is accessible
curl https://YOUR_WORKER_URL/results
# Returns the same results from R2

# 3. Destroy the sandbox
curl -X POST https://YOUR_WORKER_URL/destroy
# Returns: { "message": "Sandbox destroyed. Data persists in R2!" }

# 4. Access results again (from new sandbox)
curl https://YOUR_WORKER_URL/results
# Still works! Data persisted across sandbox lifecycle

The key insight: After destroying the sandbox, the next request creates a new sandbox instance, mounts the same R2 bucket, and finds the data still there.
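The four curl steps can also be replayed from a script that compares the results before and after the sandbox is destroyed. The following is a minimal sketch rather than an official test harness: `checkPersistence` and the `FetchLike` type are hypothetical names, and the fetch function is passed in so the check can run against the global fetch or a stub.

```typescript
// Minimal fetch-like signature so the check works with the global fetch or a stub.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// checkPersistence is a hypothetical helper that replays the four steps above.
async function checkPersistence(baseUrl: string, fetchFn: FetchLike): Promise<boolean> {
  const call = async (path: string, method: string): Promise<unknown> => {
    const res = await fetchFn(`${baseUrl}${path}`, { method });
    if (!res.ok) throw new Error(`${method} ${path} failed`);
    return res.json();
  };

  await call("/process", "POST");               // 1. process and save to R2
  const before = await call("/results", "GET"); // 2. read the results
  await call("/destroy", "POST");               // 3. destroy the sandbox
  const after = await call("/results", "GET");  // 4. read again from a new sandbox

  // Identical JSON before and after means the data survived the destroy step.
  return JSON.stringify(before) === JSON.stringify(after);
}
```

Against a deployed Worker, you would call it with your Worker URL and the global fetch, for example `checkPersistence("https://YOUR_WORKER_URL", fetch)`.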
In this tutorial, you built a data pipeline that demonstrates filesystem persistence through R2 bucket mounting:
- Mounting buckets: Use mountBucket() to make R2 accessible as a local directory
- Standard file operations: Access mounted buckets using familiar filesystem commands (cat, Python open(), etc.)
- Automatic persistence: Data written to mounted directories survives sandbox destruction
- Credential management: Configure R2 access using environment variables or explicit credentials
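One detail the tutorial handlers skip: both /process and /results pass stdout straight to JSON.parse, which throws if the command failed or printed something other than JSON. A more defensive variant might look like the sketch below. It assumes only the `success` and `stdout` fields that the tutorial code already reads; `parseExecJson` and the `Parsed` type are hypothetical names, not part of the Sandbox SDK.

```typescript
// Shape assumed from the tutorial code, which reads only `success` and `stdout`.
interface ExecOutput {
  success: boolean;
  stdout: string;
}

type Parsed<T> = { ok: true; value: T } | { ok: false; error: string };

// parseExecJson is a hypothetical helper, not part of the Sandbox SDK.
// It returns a tagged result instead of throwing on bad output.
function parseExecJson<T>(result: ExecOutput): Parsed<T> {
  if (!result.success) return { ok: false, error: "command failed" };
  try {
    return { ok: true, value: JSON.parse(result.stdout) as T };
  } catch {
    return { ok: false, error: "stdout was not valid JSON" };
  }
}
```

A handler can then branch on `ok` and return an error response instead of letting an exception escape the Worker.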
- Mount buckets guide - Comprehensive mounting reference
- Storage API - Complete API documentation
- Environment variables - Credential configuration options
- R2 documentation - Learn about Cloudflare R2
- Background processes guide - Long-running data processing
- Sandboxes concept - Understanding sandbox lifecycle