Date: 2025-09-25

This guide walks you through:

Creating your first R2 bucket and enabling its data catalog.

Creating an API token needed for pipelines to authenticate with your data catalog.

Creating your first pipeline with a simple ecommerce schema that writes to an Apache Iceberg table managed by R2 Data Catalog.

Sending sample ecommerce data via HTTP endpoint.

Validating data in your bucket and querying it with R2 SQL.

Prerequisites

Node.js version manager: Use a Node version manager such as Volta or nvm to avoid permission issues and to switch between Node.js versions. Wrangler, discussed later in this guide, requires Node.js 16.17.0 or later.
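The version requirement above can be checked before running any Wrangler command. The following is a minimal sketch in POSIX shell: version_ge is a hypothetical helper name, and it assumes plain MAJOR.MINOR.PATCH version strings.

```shell
# version_ge A B: succeeds if version A >= version B.
# Hypothetical helper; assumes both arguments look like MAJOR.MINOR.PATCH.
version_ge() {
  a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}; a_patch=${a_rest#*.}
  b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}; b_patch=${b_rest#*.}
  # Compare field by field, most significant first.
  [ "$a_major" -gt "$b_major" ] && return 0
  [ "$a_major" -lt "$b_major" ] && return 1
  [ "$a_minor" -gt "$b_minor" ] && return 0
  [ "$a_minor" -lt "$b_minor" ] && return 1
  [ "$a_patch" -ge "$b_patch" ]
}

# Only run the check when node is on the PATH.
if command -v node >/dev/null 2>&1; then
  # node --version prints e.g. "v20.11.1"; drop the leading "v" and any suffix.
  node_version=$(node --version | sed 's/^v//; s/-.*//')
  if version_ge "$node_version" 16.17.0; then
    echo "Node.js $node_version is new enough for Wrangler"
  else
    echo "Node.js $node_version is older than 16.17.0; switch versions with Volta or nvm" >&2
  fi
fi
```

The comparison is done field by field rather than as a string so that, for example, 16.9.1 is correctly treated as older than 16.17.0.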

1. Create an R2 bucket

Wrangler CLI

If not already logged in, run:

    npx wrangler login

Create an R2 bucket:

    npx wrangler r2 bucket create pipelines-tutorial

Dashboard

In the Cloudflare dashboard, go to the R2 object storage page.

Select Create bucket.

Enter the bucket name: pipelines-tutorial

Select Create bucket.

2. Enable R2 Data Catalog

Wrangler CLI

Enable the catalog on your R2 bucket:

    npx wrangler r2 bucket catalog enable pipelines-tutorial

When you run this command, take note of the "Warehouse" and "Catalog URI". You will need these later.

Dashboard

In the Cloudflare dashboard, go to the R2 object storage page.

Select the bucket: pipelines-tutorial.

Switch to the Settings tab, scroll down to R2 Data Catalog, and select Enable.

Once enabled, note the Catalog URI and Warehouse name.

3. Create an API token

Pipelines must authenticate to R2 Data Catalog with an R2 API token that has catalog and R2 permissions.

In the Cloudflare dashboard, go to the R2 object storage page.

Select Manage API tokens.

Select Create Account API token.

Give your API token a name.

Under Permissions, choose the Admin Read & Write permission.

Select Create Account API Token.

Note the Token value.

Note: This token also includes the R2 SQL Read permission, which allows you to query your data with R2 SQL.

4. Create a pipeline

Wrangler CLI

First, create a schema file that defines your ecommerce data structure.

Create schema.json:

    {
      "fields": [
        { "name": "user_id", "type": "string", "required": true },
        { "name": "event_type", "type": "string", "required": true },
        { "name": "product_id", "type": "string", "required": false },
        { "name": "amount", "type": "float64", "required": false }
      ]
    }

Use the interactive setup to create a pipeline that writes to R2 Data Catalog:

    npx wrangler pipelines setup

Follow the prompts:

Pipeline name: Enter ecommerce

Stream configuration:

Enable HTTP endpoint: yes

Require authentication: no (for simplicity)

Configure custom CORS origins: no

Schema definition: Load from file

Schema file path: schema.json (or your file path)

Sink configuration:

Destination type: Data Catalog Table

R2 bucket name: pipelines-tutorial

Namespace: default

Table name: ecommerce

Catalog API token: Enter your token from step 3

Compression: zstd

Roll file when size reaches (MB): 100

Roll file when time reaches (seconds): 10 (for faster data visibility in this tutorial)

SQL transformation: Choose Use simple ingestion query to use:

    INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream

After setup completes, note the HTTP endpoint URL displayed in the final output.

Dashboard

In the Cloudflare dashboard, go to Pipelines > Pipelines.

Select Create Pipeline.

Connect to a Stream:

Pipeline name: ecommerce

Enable HTTP endpoint for sending data: Enabled

HTTP authentication: Disabled (default)

Select Next.

Define Input Schema:

Select JSON editor

Copy in the schema:

    {
      "fields": [
        { "name": "user_id", "type": "string", "required": true },
        { "name": "event_type", "type": "string", "required": true },
        { "name": "product_id", "type": "string", "required": false },
        { "name": "amount", "type": "float64", "required": false }
      ]
    }

Select Next.

Define Sink:

Select your R2 bucket: pipelines-tutorial

Storage type: R2 Data Catalog

Namespace: default

Table name: ecommerce

Advanced Settings: Change Maximum Time Interval to 10 seconds

Select Next.

Credentials:

Disable Automatically create an Account API token for your sink

Enter Catalog Token from step 3

Select Next.

Pipeline Definition:

Leave the default SQL query:

    INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream;

Select Create Pipeline.

After pipeline creation, note the Stream ID for the next step.
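For the Wrangler CLI path, the schema file can also be written from the shell instead of an editor. The following is a minimal sketch: write_schema is a hypothetical helper name, and the four field definitions are copied from the schema.json shown earlier in this step.

```shell
# write_schema [PATH]: write the tutorial's ecommerce schema to PATH
# (defaults to schema.json in the current directory).
# Hypothetical helper; the fields mirror the schema used in this guide.
write_schema() {
  cat > "${1:-schema.json}" <<'EOF'
{
  "fields": [
    { "name": "user_id", "type": "string", "required": true },
    { "name": "event_type", "type": "string", "required": true },
    { "name": "product_id", "type": "string", "required": false },
    { "name": "amount", "type": "float64", "required": false }
  ]
}
EOF
}
```

Running write_schema with no argument creates schema.json in the current directory, which is the path the setup prompt suggests.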

5. Send sample data

Send ecommerce events to your pipeline's HTTP endpoint:

    curl -X POST https://{stream-id}.ingest.cloudflare.com \
      -H "Content-Type: application/json" \
      -d '[
        {
          "user_id": "user_12345",
          "event_type": "purchase",
          "product_id": "widget-001",
          "amount": 29.99
        },
        {
          "user_id": "user_67890",
          "event_type": "view_product",
          "product_id": "widget-002"
        },
        {
          "user_id": "user_12345",
          "event_type": "add_to_cart",
          "product_id": "widget-003",
          "amount": 15.50
        }
      ]'

Replace {stream-id} with the stream ID from your pipeline setup. If you used Wrangler, the HTTP endpoint URL printed at the end of setup is the full URL to use.
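The request above can be wrapped in a small script that reports whether the endpoint accepted the batch. This is a sketch under stated assumptions: ingest_url and send_events are hypothetical helper names, events.json is a hypothetical file holding the JSON array, and treating any 2xx status as success is an assumption rather than documented behavior.

```shell
# ingest_url STREAM_ID: print the ingest URL, following the URL pattern
# used in the curl example in this step.
ingest_url() {
  printf 'https://%s.ingest.cloudflare.com' "$1"
}

# send_events STREAM_ID FILE: POST a JSON array of events read from FILE.
# Hypothetical helper; assumes a 2xx status means the batch was accepted.
send_events() {
  stream_id=$1
  events_file=$2
  # -sS: quiet but still show errors; -o /dev/null: discard the body;
  # -w '%{http_code}': print only the HTTP status code.
  status=$(curl -sS -o /dev/null -w '%{http_code}' \
    -X POST "$(ingest_url "$stream_id")" \
    -H "Content-Type: application/json" \
    --data-binary "@$events_file") || return 1
  case $status in
    2??) echo "accepted (HTTP $status)" ;;
    *)   echo "rejected (HTTP $status)" >&2; return 1 ;;
  esac
}
```

With the events saved to a file, a call would look like send_events YOUR_STREAM_ID events.json. Checking the status code makes failures visible, for example a payload that omits a required field such as user_id.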

6. Validate data in your bucket

In the Cloudflare dashboard, go to the R2 object storage page.

Select your bucket: pipelines-tutorial.

You should see Iceberg metadata files and data files created by your pipeline.

Note: If you do not see any files in your bucket, wait a couple of minutes and check again.

The data is organized in the Apache Iceberg format, with metadata tracking table versions.

7. Query your data using R2 SQL

Set up your environment to use R2 SQL:

    export WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN

Or create a .env file with:

WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN

Where YOUR_API_TOKEN is the token you created in step 3. For more information on setting environment variables, refer to Wrangler system environment variables.
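When the token is set with export, a short guard can confirm that it is present in the current shell before a query runs. This is a minimal sketch: check_r2_sql_token is a hypothetical helper name, and it only inspects the shell environment, so it does not apply if the token is supplied through a .env file.

```shell
# check_r2_sql_token: fail with a readable message if the token variable
# is unset or empty in the current shell. Hypothetical helper.
check_r2_sql_token() {
  if [ -z "${WRANGLER_R2_SQL_AUTH_TOKEN:-}" ]; then
    echo "WRANGLER_R2_SQL_AUTH_TOKEN is not set; export it first" >&2
    return 1
  fi
}
```

A common cause of an empty variable is writing spaces around the equals sign in the export line, which the shell does not treat as an assignment.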

Query your data:

    npx wrangler r2 sql query "YOUR_WAREHOUSE_NAME" "
    SELECT
        user_id,
        event_type,
        product_id,
        amount
    FROM default.ecommerce
    WHERE event_type = 'purchase'
    LIMIT 10"

Replace YOUR_WAREHOUSE_NAME with the warehouse name from step 2.
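To run the same query for other event types, the SQL string can be built by a small function and passed to the command shown above. This is a sketch only: events_query is a hypothetical helper name, and it inserts its arguments into the SQL text without escaping, so it should be given trusted values only.

```shell
# events_query EVENT_TYPE LIMIT: print the tutorial's query for one event type.
# Hypothetical helper; arguments are interpolated as-is (no escaping).
events_query() {
  printf "SELECT user_id, event_type, product_id, amount FROM default.ecommerce WHERE event_type = '%s' LIMIT %d" "$1" "$2"
}

# Example invocation (left commented out because it needs network access
# and a real warehouse name):
# npx wrangler r2 sql query "YOUR_WAREHOUSE_NAME" "$(events_query add_to_cart 10)"
```

The table and column names are the ones created in step 4; only the event type and row limit vary.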

You can also query this table with any engine that supports Apache Iceberg. To learn more about connecting other engines to R2 Data Catalog, refer to Connect to Iceberg engines.

