Mount object storage buckets as local filesystem paths to persist data across sandbox lifecycles. This tutorial uses Cloudflare R2, but the same approach works with any S3-compatible provider.

Time to complete: 20 minutes

What you'll build

A Worker that processes data, stores results in an R2 bucket mounted as a local directory, and demonstrates that data persists even after the sandbox is destroyed and recreated.

Key concepts you'll learn:

Mounting R2 buckets as filesystem paths

Automatic data persistence across sandbox lifecycles

Working with mounted storage using standard file operations

Prerequisites

Node.js version manager: Use a Node version manager like Volta or nvm to avoid permission issues and change Node.js versions. Wrangler, discussed later in this guide, requires a Node version of 16.17.0 or later.

You'll also need:

Docker running locally

An R2 bucket (create one in the Cloudflare dashboard)

1. Create your project

npm:

npm create cloudflare@latest -- data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal

yarn:

yarn create cloudflare data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal

pnpm:

pnpm create cloudflare@latest data-pipeline --template=cloudflare/sandbox-sdk/examples/minimal

cd data-pipeline

2. Configure R2 binding

Add an R2 bucket binding to your wrangler.json :

wrangler.json:

{
  "name": "data-pipeline",
  "compatibility_date": "2025-11-09",
  "durable_objects": {
    "bindings": [
      {
        "name": "Sandbox",
        "class_name": "Sandbox"
      }
    ]
  },
  "r2_buckets": [
    {
      "binding": "DATA_BUCKET",
      "bucket_name": "my-data-bucket"
    }
  ]
}

Replace my-data-bucket with your R2 bucket name. Create the bucket first in the Cloudflare dashboard.

3. Build the data processor

Replace src/index.ts with code that mounts R2 and processes data:

JavaScript:

import { getSandbox } from "@cloudflare/sandbox";

export { Sandbox } from "@cloudflare/sandbox";

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const sandbox = getSandbox(env.Sandbox, "data-processor");

    // Mount R2 bucket to /data directory
    await sandbox.mountBucket("my-data-bucket", "/data", {
      endpoint: "https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com",
    });

    if (url.pathname === "/process") {
      // Process data and save to mounted R2
      const result = await sandbox.exec("python", {
        args: [
          "-c",
          `
import json
import os
from datetime import datetime

# Read input (or create sample data)
data = [
    {'id': 1, 'value': 42},
    {'id': 2, 'value': 87},
    {'id': 3, 'value': 15}
]

# Process: calculate sum and average
total = sum(item['value'] for item in data)
avg = total / len(data)

# Save results to mounted R2 (/data is the mounted bucket)
result = {
    'timestamp': datetime.now().isoformat(),
    'total': total,
    'average': avg,
    'processed_count': len(data)
}

os.makedirs('/data/results', exist_ok=True)
with open('/data/results/latest.json', 'w') as f:
    json.dump(result, f, indent=2)

print(json.dumps(result))
          `,
        ],
      });

      return Response.json({
        message: "Data processed and saved to R2",
        result: JSON.parse(result.stdout),
      });
    }

    if (url.pathname === "/results") {
      // Read results from mounted R2
      const result = await sandbox.exec("cat", {
        args: ["/data/results/latest.json"],
      });

      if (!result.success) {
        return Response.json(
          { error: "No results found yet" },
          { status: 404 },
        );
      }

      return Response.json({
        message: "Results retrieved from R2",
        data: JSON.parse(result.stdout),
      });
    }

    if (url.pathname === "/destroy") {
      // Destroy sandbox to demonstrate persistence
      await sandbox.destroy();

      return Response.json({
        message: "Sandbox destroyed. Data persists in R2!",
      });
    }

    return new Response(
      `
Data Pipeline with Persistent Storage

Endpoints:
- POST /process  - Process data and save to R2
- GET /results   - Retrieve results from R2
- POST /destroy  - Destroy sandbox (data survives!)

Try this flow:
1. POST /process  (processes and saves to R2)
2. POST /destroy  (destroys sandbox)
3. GET /results   (data still accessible from R2)
      `,
      { headers: { "Content-Type": "text/plain" } },
    );
  },
};

TypeScript:

import { getSandbox, type Sandbox } from '@cloudflare/sandbox';

export { Sandbox } from '@cloudflare/sandbox';

interface Env {
  Sandbox: DurableObjectNamespace<Sandbox>;
  DATA_BUCKET: R2Bucket;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    const sandbox = getSandbox(env.Sandbox, 'data-processor');

    // Mount R2 bucket to /data directory
    await sandbox.mountBucket('my-data-bucket', '/data', {
      endpoint: 'https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com'
    });

    if (url.pathname === '/process') {
      // Process data and save to mounted R2
      const result = await sandbox.exec('python', {
        args: [
          '-c',
          `
import json
import os
from datetime import datetime

# Read input (or create sample data)
data = [
    {'id': 1, 'value': 42},
    {'id': 2, 'value': 87},
    {'id': 3, 'value': 15}
]

# Process: calculate sum and average
total = sum(item['value'] for item in data)
avg = total / len(data)

# Save results to mounted R2 (/data is the mounted bucket)
result = {
    'timestamp': datetime.now().isoformat(),
    'total': total,
    'average': avg,
    'processed_count': len(data)
}

os.makedirs('/data/results', exist_ok=True)
with open('/data/results/latest.json', 'w') as f:
    json.dump(result, f, indent=2)

print(json.dumps(result))
          `
        ]
      });

      return Response.json({
        message: 'Data processed and saved to R2',
        result: JSON.parse(result.stdout)
      });
    }

    if (url.pathname === '/results') {
      // Read results from mounted R2
      const result = await sandbox.exec('cat', {
        args: ['/data/results/latest.json']
      });

      if (!result.success) {
        return Response.json(
          { error: 'No results found yet' },
          { status: 404 }
        );
      }

      return Response.json({
        message: 'Results retrieved from R2',
        data: JSON.parse(result.stdout)
      });
    }

    if (url.pathname === '/destroy') {
      // Destroy sandbox to demonstrate persistence
      await sandbox.destroy();

      return Response.json({
        message: 'Sandbox destroyed. Data persists in R2!'
      });
    }

    return new Response(
      `
Data Pipeline with Persistent Storage

Endpoints:
- POST /process  - Process data and save to R2
- GET /results   - Retrieve results from R2
- POST /destroy  - Destroy sandbox (data survives!)

Try this flow:
1. POST /process  (processes and saves to R2)
2. POST /destroy  (destroys sandbox)
3. GET /results   (data still accessible from R2)
      `,
      { headers: { 'Content-Type': 'text/plain' } }
    );
  }
};
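As a cross-check on the values the /process endpoint should return, the arithmetic performed by the embedded Python script can be sketched in TypeScript. The summarize function and the Item and Summary types are illustrative names, not part of the Sandbox SDK; the sample data is copied from the script.

```typescript
// Hypothetical helper mirroring the arithmetic of the embedded Python script.
interface Item {
  id: number;
  value: number;
}

interface Summary {
  total: number;
  average: number;
  processed_count: number;
}

function summarize(data: Item[]): Summary {
  // Sum every item's value, then divide by the item count for the average.
  const total = data.reduce((sum, item) => sum + item.value, 0);
  return {
    total,
    average: total / data.length,
    processed_count: data.length,
  };
}

// Same sample data as the tutorial's Python snippet.
const sample: Item[] = [
  { id: 1, value: 42 },
  { id: 2, value: 87 },
  { id: 3, value: 15 },
];

console.log(summarize(sample)); // total 144, average 48, processed_count 3
```

These are the total and average that the curl output in the testing step reports.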

Replace YOUR_ACCOUNT_ID in the endpoint URL with your Cloudflare account ID. Find it in the dashboard under R2 > Overview.
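One way to avoid deploying with the placeholder still in place is to build the endpoint from the account ID and fail early if it was never set. This is a minimal sketch: r2Endpoint is a hypothetical helper, not an SDK function, and "abc123" is a made-up account ID used only for illustration.

```typescript
// Hypothetical helper: builds the R2 endpoint URL in the format used by this tutorial.
function r2Endpoint(accountId: string): string {
  const id = accountId.trim();
  // Reject an empty value or the tutorial's unreplaced placeholder.
  if (id === "" || id === "YOUR_ACCOUNT_ID") {
    throw new Error("Set a real Cloudflare account ID before mounting the bucket");
  }
  return `https://${id}.r2.cloudflarestorage.com`;
}

// "abc123" is a made-up account ID for illustration only.
console.log(r2Endpoint("abc123")); // https://abc123.r2.cloudflarestorage.com
```

The returned string could then be passed as the endpoint option to mountBucket().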

4. Local development limitation

Requires production deployment: Bucket mounting does not work with wrangler dev because it requires FUSE support that wrangler does not currently provide. You must deploy to production to test this feature. All other Sandbox SDK features work locally; only mountBucket() and unmountBucket() require production deployment.

5. Deploy to production

Generate R2 API tokens:

Go to R2 > Overview in the Cloudflare dashboard

Select Manage R2 API Tokens

Create a token with Object Read & Write permissions

Copy the Access Key ID and Secret Access Key

Set up credentials as Worker secrets:

npx wrangler secret put AWS_ACCESS_KEY_ID
# Paste your R2 Access Key ID

npx wrangler secret put AWS_SECRET_ACCESS_KEY
# Paste your R2 Secret Access Key

Worker secrets are encrypted and only accessible to your deployed Worker. The SDK automatically detects these credentials when mountBucket() is called.
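Because a missing secret would otherwise only surface when mountBucket() runs, it can help to check for both secrets up front. The sketch below assumes the secrets are exposed on the Worker's env object under the names set above; missingCredentials is a hypothetical helper, not part of the SDK.

```typescript
// Names of the secrets created with `wrangler secret put` in this step.
const REQUIRED_SECRETS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"] as const;

// Hypothetical helper: returns the names of secrets that are absent or empty.
function missingCredentials(env: Record<string, unknown>): string[] {
  return REQUIRED_SECRETS.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.length === 0;
  });
}

// Example with a partially configured environment (the value is a placeholder).
console.log(missingCredentials({ AWS_ACCESS_KEY_ID: "example-key" }));
// [ 'AWS_SECRET_ACCESS_KEY' ]
```

A Worker could call this at the top of fetch and return an error response listing the missing names instead of attempting the mount.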

Deploy your Worker:

npx wrangler deploy

After deployment, wrangler outputs your Worker URL (e.g., https://data-pipeline.yourname.workers.dev ).

6. Test the persistence flow

Now test against your deployed Worker. Replace YOUR_WORKER_URL with your actual Worker URL:

# 1. Process data (saves to R2)
curl -X POST https://YOUR_WORKER_URL/process
# Returns: { "message": "Data processed...", "result": { "total": 144, "average": 48, ... } }

# 2. Verify data is accessible
curl https://YOUR_WORKER_URL/results
# Returns the same results from R2

# 3. Destroy the sandbox
curl -X POST https://YOUR_WORKER_URL/destroy
# Returns: { "message": "Sandbox destroyed. Data persists in R2!" }

# 4. Access results again (from new sandbox)
curl https://YOUR_WORKER_URL/results
# Still works! Data persisted across sandbox lifecycle

The key insight: After destroying the sandbox, the next request creates a new sandbox instance, mounts the same R2 bucket, and finds the data still there.
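The reason this works can be illustrated with a toy in-memory model. It does not use the Sandbox SDK or R2: ToyBucket and ToySandbox are hypothetical classes that only mimic the idea that writes under a mount path land in storage that outlives the sandbox, while other files live and die with it.

```typescript
// Toy in-memory model only; it does not call the Sandbox SDK or R2.
class ToyBucket {
  private objects = new Map<string, string>();

  put(key: string, body: string): void {
    this.objects.set(key, body);
  }

  get(key: string): string | undefined {
    return this.objects.get(key);
  }
}

class ToySandbox {
  private mounts: { path: string; bucket: ToyBucket }[] = [];
  private localFiles = new Map<string, string>(); // lost with the sandbox

  mount(bucket: ToyBucket, path: string): void {
    this.mounts.push({ path, bucket });
  }

  // Map a file path to a mounted bucket and object key, if it is under a mount.
  private resolve(path: string): { bucket: ToyBucket; key: string } | undefined {
    for (const m of this.mounts) {
      const prefix = m.path + "/";
      if (path.indexOf(prefix) === 0) {
        return { bucket: m.bucket, key: path.slice(prefix.length) };
      }
    }
    return undefined;
  }

  write(path: string, body: string): void {
    const hit = this.resolve(path);
    if (hit) {
      hit.bucket.put(hit.key, body);
    } else {
      this.localFiles.set(path, body);
    }
  }

  read(path: string): string | undefined {
    const hit = this.resolve(path);
    return hit ? hit.bucket.get(hit.key) : this.localFiles.get(path);
  }
}

const bucket = new ToyBucket();
const first = new ToySandbox();
first.mount(bucket, "/data");
first.write("/data/results/latest.json", '{"total":144}');
first.write("/tmp/scratch.txt", "temporary");

// "Destroying" the sandbox is modeled by discarding it and creating a new one.
const second = new ToySandbox();
second.mount(bucket, "/data");
console.log(second.read("/data/results/latest.json")); // {"total":144}
console.log(second.read("/tmp/scratch.txt")); // undefined
```

In the model, only the file written under /data is visible from the second sandbox, which mirrors the behavior the four curl requests demonstrate.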

What you learned

In this tutorial, you built a data pipeline that demonstrates filesystem persistence through R2 bucket mounting:

Mounting buckets: Use mountBucket() to make R2 accessible as a local directory

Standard file operations: Access mounted buckets using familiar filesystem commands (cat, Python open(), etc.)

Automatic persistence: Data written to mounted directories survives sandbox destruction

Credential management: Configure R2 access using environment variables or explicit credentials

Next steps

Mount buckets guide - Comprehensive mounting reference

Storage API - Complete API documentation

Environment variables - Credential configuration options

