Sippy

Sippy is a data migration service that allows you to copy data from other cloud providers to R2 as the data is requested, without paying unnecessary cloud egress fees typically associated with moving large amounts of data.

Sippy reduces migration-specific egress fees by copying objects to R2 during requests your application already makes, for which you would be paying egress fees anyway.

How it works

When enabled for an R2 bucket, Sippy implements the following migration strategy across Workers, S3 API, and public buckets:

When an object is requested, it is served from your R2 bucket if it is found.
If the object is not found in R2, the object will simultaneously be returned from your source storage bucket and copied to R2.
All other operations, including put and delete, continue to work as usual.
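
The strategy above amounts to a read-through copy. The following is a minimal sketch that models it with in-memory dictionaries. `Bucket`, `sippy_get`, `r2`, and `source` are hypothetical names used only for illustration and are not part of any Cloudflare API.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for a bucket: object key -> object body.
Bucket = Dict[str, bytes]


def sippy_get(r2: Bucket, source: Bucket, key: str) -> Optional[bytes]:
    """Model a GET against an R2 bucket that has Sippy enabled."""
    # 1. Serve from R2 when the object is already there.
    if key in r2:
        return r2[key]
    # 2. Otherwise fall back to the source bucket and copy the object to R2.
    body = source.get(key)
    if body is not None:
        r2[key] = body
    return body


r2: Bucket = {}
source: Bucket = {"logo.png": b"png-bytes"}

first = sippy_get(r2, source, "logo.png")   # served from the source, copied to R2
second = sippy_get(r2, source, "logo.png")  # served directly from R2
```

In this model only the first request for a key reaches the source bucket. Later requests for the same key are answered from R2.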

When is Sippy useful?

Using Sippy as part of your migration strategy can be a good choice when:

You want to start migrating your data, but you want to avoid the upfront egress fees of migrating all of it at once.
You want to experiment by serving frequently accessed objects from R2 to eliminate egress fees, without investing time in data migration.
You have frequently changing data and are looking to conduct a migration while avoiding downtime. Sippy can be used to serve requests while Super Slurper can be used to migrate your remaining data.

If you are looking to migrate all of your data from an existing cloud provider to R2 at one time, we recommend using Super Slurper.

Get started with Sippy

Before getting started, you will need:

An existing R2 bucket. If you don't already have one, refer to Create buckets.
API credentials for your source object storage bucket.
(Wrangler only) Cloudflare R2 Access Key ID and Secret Access Key with read and write permissions. For more information, refer to Authentication.

Enable Sippy via the Dashboard

From the Cloudflare dashboard, select R2 from the sidebar.
Select the bucket you'd like to migrate objects to.
Switch to the Settings tab, then scroll down to the Incremental migration card.
Select Enable and enter details for the AWS / GCS bucket you'd like to migrate objects from. The credentials you enter must have permissions to read from this bucket. Cloudflare also recommends scoping your credentials to only allow reads from this bucket.
Select Enable.

Enable Sippy via Wrangler

Set up Wrangler

To begin, install npm. Then install Wrangler, the Developer Platform CLI.

Enable Sippy on your R2 bucket

npx wrangler r2 bucket sippy enable <BUCKET_NAME>

This will prompt you to select between supported object storage providers and lead you through setup.

Enable Sippy via API

For information on required parameters and examples of how to enable Sippy, refer to the API documentation. For information about getting started with the Cloudflare API, refer to Make API calls.
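
As a rough illustration, the sketch below builds (but does not send) an HTTP request to enable Sippy. The endpoint path, the `PUT` method, and the `source` / `destination` field names are assumptions based on the general shape of the Cloudflare API and must be checked against the API documentation. `build_enable_sippy_request` and all credential values are hypothetical placeholders.

```python
import json
import urllib.request
from typing import Dict

API_BASE = "https://api.cloudflare.com/client/v4"


def build_enable_sippy_request(
    account_id: str,
    bucket_name: str,
    api_token: str,
    source: Dict[str, str],
    destination: Dict[str, str],
) -> urllib.request.Request:
    """Build the request only; nothing is sent over the network."""
    # Assumed endpoint path; confirm it in the API documentation.
    url = f"{API_BASE}/accounts/{account_id}/r2/buckets/{bucket_name}/sippy"
    body = json.dumps({"source": source, "destination": destination}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",  # assumed method
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; the field names below are assumptions.
request = build_enable_sippy_request(
    account_id="example-account-id",
    bucket_name="my-bucket",
    api_token="example-api-token",
    source={
        "provider": "aws",
        "bucket": "my-source-bucket",
        "region": "us-east-1",
        "accessKeyId": "example-aws-access-key-id",
        "secretAccessKey": "example-aws-secret-access-key",
    },
    destination={
        "provider": "r2",
        "accessKeyId": "example-r2-access-key-id",
        "secretAccessKey": "example-r2-secret-access-key",
    },
)
```

Passing the request to `urllib.request.urlopen` would send it. That step is left out so the sketch can be inspected without real credentials.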

View migration metrics

When enabled, Sippy exposes metrics that help you understand the progress of your ongoing migrations.

Metric					Description
Requests served by Sippy					The percentage of overall requests served by R2 over a period of time. A higher percentage indicates that fewer requests need to be made to the source bucket.
Data migrated by Sippy					The amount of data that has been copied from the source bucket to R2 over a period of time. Reported in bytes.

To view current and historical metrics:

Log in to the Cloudflare dashboard and select your account.
Go to the R2 tab and select your bucket.
Select the Metrics tab.

You can optionally select a time window to query. This defaults to the last 24 hours.

Disable Sippy on your R2 bucket

Dashboard

From the Cloudflare dashboard, select R2 from the sidebar.
Select the bucket you'd like to disable Sippy for.
Switch to the Settings tab and scroll down to the Incremental migration card.
Press Disable.

Wrangler

To disable Sippy, run the r2 bucket sippy disable command:

npx wrangler r2 bucket sippy disable <BUCKET_NAME>

API

For more information on required parameters and examples of how to disable Sippy, refer to the API documentation.

Supported cloud storage providers

Cloudflare currently supports copying data from the following cloud object storage providers to R2:

Amazon S3
Google Cloud Storage (GCS)

R2 API interactions

When Sippy is enabled, it changes the behavior of certain actions on your R2 bucket across Workers, S3 API, and public buckets.

Action					New behavior
GetObject					Calls to GetObject will first attempt to retrieve the object from your R2 bucket. If the object is not present, the object will be served from the source storage bucket and simultaneously uploaded to the requested R2 bucket. Additional considerations: Modifications to objects in the source bucket will not be reflected in R2 after the initial copy. Once an object is stored in R2, it will not be re-retrieved and updated. Only user-defined metadata that is prefixed by `x-amz-meta-` in the HTTP response will be migrated. Remaining metadata will be omitted. For larger objects (greater than 199 MiB), multiple GET requests may be required to fully copy the object to R2. If there are multiple simultaneous GET requests for an object which has not yet been fully copied to R2, Sippy may fetch the object from the source storage bucket multiple times to serve those requests.
HeadObject					Behaves similarly to GetObject, but only retrieves object metadata. Will not copy objects to the requested R2 bucket.
PutObject					No change to behavior. Calls to PutObject will add objects to the requested R2 bucket.
DeleteObject					No change to behavior. Calls to DeleteObject will delete objects in the requested R2 bucket. Additional considerations: If deletes to objects in R2 are not also made in the source storage bucket, subsequent GetObject requests will result in objects being retrieved from the source bucket and copied to R2.

Actions not listed above have no change in behavior. For more information, refer to Workers API reference or S3 API compatibility.
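
The behaviors in the table can be modeled with a small in-memory simulation. This is a sketch under stated assumptions, not a description of how Sippy is implemented: `SippyModel`, `StoredObject`, and `migrated_metadata` are hypothetical names, and the object size stands in for the metadata a HeadObject call returns.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

META_PREFIX = "x-amz-meta-"


@dataclass
class StoredObject:
    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)


def migrated_metadata(headers: Dict[str, str]) -> Dict[str, str]:
    """Keep only user-defined metadata; all other headers are omitted."""
    return {k: v for k, v in headers.items() if k.lower().startswith(META_PREFIX)}


class SippyModel:
    """Hypothetical model of an R2 bucket with Sippy enabled."""

    def __init__(self, source: Dict[str, StoredObject]) -> None:
        self.source = source
        self.r2: Dict[str, StoredObject] = {}

    def get_object(self, key: str) -> Optional[bytes]:
        if key in self.r2:
            return self.r2[key].body  # never refreshed from the source
        obj = self.source.get(key)
        if obj is None:
            return None
        self.r2[key] = StoredObject(obj.body, migrated_metadata(obj.headers))
        return obj.body

    def head_object(self, key: str) -> Optional[int]:
        # Looks in R2 first, then the source, but never copies anything.
        obj = self.r2.get(key) or self.source.get(key)
        return None if obj is None else len(obj.body)

    def put_object(self, key: str, body: bytes) -> None:
        self.r2[key] = StoredObject(body)  # unchanged: writes go to R2 only

    def delete_object(self, key: str) -> None:
        self.r2.pop(key, None)  # unchanged: the source bucket is not touched


model = SippyModel(
    {"a.txt": StoredObject(b"v1", {"x-amz-meta-owner": "docs", "x-other-header": "1"})}
)
```

In this model, deleting `a.txt` from R2 while it still exists in the source means the next `get_object("a.txt")` copies it back, which mirrors the DeleteObject consideration above.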

Create credentials for storage providers

Amazon S3

To copy objects from Amazon S3, Sippy requires access permissions to your bucket. While you can use any AWS Identity and Access Management (IAM) user credentials with the correct permissions, Cloudflare recommends you create a user with a narrow set of permissions.

To create credentials with the correct permissions:

Log in to your AWS IAM account.
Create a policy with the following format and replace <BUCKET_NAME> with the bucket you want to grant access to:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::<BUCKET_NAME>/*"]
    }
  ]
}

Create a new user and attach the created policy to that user.

You can now use both the Access Key ID and Secret Access Key when enabling Sippy.
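
If you manage several source buckets, the policy document can be generated rather than edited by hand. The sketch below fills the bucket name into the same policy shown above. `sippy_read_policy` and the bucket name `my-source-bucket` are hypothetical.

```python
import json


def sippy_read_policy(bucket_name: str) -> str:
    """Return the read-only policy from this section as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Grants read access to objects in this bucket only.
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(sippy_read_policy("my-source-bucket"))
```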

Google Cloud Storage

To copy objects from Google Cloud Storage (GCS), Sippy requires access permissions to your bucket. Cloudflare recommends using the Google Cloud predefined Storage Object Viewer role.

To create credentials with the correct permissions:

Log in to your Google Cloud console.
Go to IAM & Admin > Service Accounts.
Create a service account with the predefined Storage Object Viewer role.
Go to the Keys tab of the service account you created.
Select Add Key > Create a new key and download the JSON key file.

You can now use this JSON key file when enabling Sippy via Wrangler or API.
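
Before supplying the key file, it can help to confirm that it parses and contains the fields you expect. Google service account key files include `client_email` and `private_key` fields. That these are the two values Sippy needs is an assumption to verify against the API documentation. `gcs_credentials_from_key` and the sample key contents are hypothetical placeholders.

```python
import json
from typing import Dict

REQUIRED_FIELDS = ("client_email", "private_key")


def gcs_credentials_from_key(key_json: str) -> Dict[str, str]:
    """Extract the service account email and private key from a key file's contents."""
    key = json.loads(key_json)
    missing = [name for name in REQUIRED_FIELDS if not key.get(name)]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    return {name: key[name] for name in REQUIRED_FIELDS}


# Placeholder contents; a real key file is downloaded from the Google Cloud console.
sample_key = json.dumps(
    {
        "type": "service_account",
        "client_email": "sippy-reader@example-project.iam.gserviceaccount.com",
        "private_key": "<PRIVATE_KEY>",
    }
)
credentials = gcs_credentials_from_key(sample_key)
```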

Caveats

ETags

While R2's ETag generation is compatible with S3's during the regular course of operations, ETags are not guaranteed to be equal when an object is migrated using Sippy. Sippy makes autonomous decisions about the operations it uses when migrating objects to optimize for performance and network usage. It may choose to migrate an object in multiple parts, which affects ETag calculation.

For example, a 320 MiB object originally uploaded to S3 using a single PutObject operation might be migrated to R2 via multipart operations. In this case, its ETag on R2 will not be the same as its ETag on S3. Similarly, an object originally uploaded to S3 using multipart operations might also have a different ETag on R2 if the part sizes Sippy chooses for its migration differ from the part sizes this object was originally uploaded with.

Relying on matching ETags before and after the migration is therefore discouraged.
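
To see why part sizes matter, the sketch below uses the commonly observed S3 convention for objects without KMS encryption: a single-part ETag is the MD5 of the body, while a multipart ETag is the MD5 of the concatenated per-part MD5 digests followed by `-` and the part count. Treat the convention as an assumption rather than a guarantee. The function names are hypothetical, and the tiny part sizes are for illustration only (real multipart uploads use much larger parts).

```python
import hashlib


def single_part_etag(data: bytes) -> str:
    """ETag of an object uploaded in one PutObject call: MD5 of the body."""
    return hashlib.md5(data).hexdigest()


def multipart_etag(data: bytes, part_size: int) -> str:
    """ETag of a multipart upload: MD5 over the part digests, plus the part count."""
    digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    return hashlib.md5(b"".join(digests)).hexdigest() + f"-{len(digests)}"


data = bytes(range(256)) * 64  # 16 KiB of sample data

etag_single = single_part_etag(data)
etag_four_parts = multipart_etag(data, 4096)  # four parts of 4 KiB
etag_two_parts = multipart_etag(data, 8192)   # two parts of 8 KiB
```

The three values differ even though the object bytes are identical. To verify a migrated object, compare a checksum of the content itself rather than the ETag.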
