Date: 2008-08-03
Getting started
This guide walks you through:
- Creating your first R2 bucket and enabling its data catalog.
- Creating an API token needed for query engines to authenticate with your data catalog.
- Using PyIceberg ↗ to create your first Iceberg table in a marimo ↗ Python notebook.
- Using PyIceberg ↗ to load sample data into your table and query it.
Before you begin:
- Sign up for a Cloudflare account ↗.
- Install Node.js ↗.
Node.js version manager: use a Node version manager like Volta ↗ or nvm ↗ to avoid permission issues and to switch between Node.js versions. Wrangler, discussed later in this guide, requires Node.js version 16.17.0 or later.
- If you are not already logged in to Cloudflare through Wrangler, log in first.
Create an R2 bucket:
- From the Cloudflare dashboard, select R2 Object Storage from the sidebar.
- Select Create bucket.
- Enter the bucket name: r2-data-catalog-tutorial
- Select Create bucket.
Then, enable the catalog on your chosen R2 bucket:
- From the Cloudflare dashboard, select R2 Object Storage from the sidebar.
- Select the bucket: r2-data-catalog-tutorial.
- Switch to the Settings tab, scroll down to R2 Data Catalog, and select Enable.
- Once enabled, note the Catalog URI and Warehouse name.
Iceberg clients (including PyIceberg ↗) must authenticate to the catalog with a Cloudflare API token that has both R2 and catalog permissions.
- From the Cloudflare dashboard, select R2 Object Storage from the sidebar.
- Expand the API dropdown and select Manage API tokens.
- Select Create API token.
- Select the R2 Token text to edit your API token name.
- Under Permissions, choose the Admin Read & Write permission.
- Select Create API Token.
- Note the Token value.
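At this point you have three values to carry into the notebook: the Catalog URI, the Warehouse name, and the Token value. One way to keep them out of the notebook source is to read them from environment variables. The sketch below assumes this approach; the variable names R2_CATALOG_URI, R2_CATALOG_WAREHOUSE, and R2_CATALOG_TOKEN and the helper load_catalog_settings are hypothetical and not part of Cloudflare's or PyIceberg's tooling.

```python
import os

# Hypothetical environment variable names (an assumption for this sketch).
# Each maps to one of the values noted in the dashboard.
REQUIRED_VARS = {
    "uri": "R2_CATALOG_URI",              # Catalog URI
    "warehouse": "R2_CATALOG_WAREHOUSE",  # Warehouse name
    "token": "R2_CATALOG_TOKEN",          # API token value
}


def load_catalog_settings(env=None):
    """Return the three catalog settings as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS.values() if not env.get(name)]
    if missing:
        raise KeyError("Missing catalog settings: " + ", ".join(missing))
    return {key: env[name] for key, name in REQUIRED_VARS.items()}
```

The returned keys (uri, warehouse, token) are chosen to match the keyword arguments that PyIceberg's REST catalog accepts, so the dict can be unpacked when connecting.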
This guide uses uv ↗ as the Python package manager. If you do not already have uv installed, follow the installing uv guide ↗.
The notebook itself runs in marimo ↗, a Python notebook.
- Create a directory where the notebook will be stored.
- Change into the new directory.
- Create a new Python virtual environment.
- Activate the Python virtual environment.
- Install marimo with uv.
- Create a file called r2-data-catalog-tutorial.py.
- Paste the tutorial's code snippet into your r2-data-catalog-tutorial.py file.
- Replace the CATALOG_URI, WAREHOUSE, and TOKEN variables with the Catalog URI and Warehouse name you noted when enabling the catalog and the Token value from your API token.
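The directory, virtual environment, and install steps above can also be scripted. The following is a minimal sketch that drives uv from Python. The directory name r2-data-catalog-notebook and both function names are hypothetical, and the packages other than marimo are an assumption based on what the notebook needs (PyIceberg, PyArrow, and pandas for printing results).

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical directory name; use any name you like.
PROJECT_DIR = Path("r2-data-catalog-notebook")

# marimo is named in this guide; the remaining packages are assumed
# because the notebook uses PyIceberg and PyArrow and prints via pandas.
PACKAGES = ["marimo", "pyiceberg", "pyarrow", "pandas"]


def setup_commands(packages=PACKAGES):
    """Return the uv invocations for the environment and install steps."""
    return [
        ["uv", "venv"],                       # creates .venv in the working directory
        ["uv", "pip", "install", *packages],  # installs into that .venv
    ]


def set_up_project(project_dir=PROJECT_DIR, dry_run=False):
    """Create the project directory and run the uv commands inside it."""
    project_dir.mkdir(parents=True, exist_ok=True)
    commands = setup_commands()
    if dry_run:
        return commands
    if shutil.which("uv") is None:
        raise RuntimeError("uv was not found on PATH; install it first")
    for command in commands:
        # Running with cwd=project_dir lets uv find the .venv it just created,
        # so no shell activation is needed for the install step.
        subprocess.run(command, cwd=project_dir, check=True)
    return commands
```

Calling set_up_project(dry_run=True) only creates the directory and returns the commands, which is useful for checking what would run. You still need to activate the environment in your own shell before starting marimo.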
In the Python notebook, you:
- Connect to your catalog.
- Create the default namespace.
- Create a simple PyArrow table.
- Create (or load) the people table in the default namespace.
- Append sample data to the table.
- Print the contents of the table.
- (Optional) Drop the people table created for this tutorial.
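The steps listed above can be sketched as a plain Python function using PyIceberg's REST catalog. This is an illustrative sketch rather than the exact notebook code: the catalog name my_catalog, the sample columns and values, and the function names are assumptions, and the three placeholders must be replaced with your own values before running.

```python
# Placeholders: replace with the values you noted earlier.
CATALOG_URI = "<CATALOG_URI>"
WAREHOUSE = "<WAREHOUSE>"
TOKEN = "<TOKEN>"

# Identifier of the people table in the default namespace.
TABLE_ID = ("default", "people")


def sample_rows():
    """Sample data; the columns and values are illustrative assumptions."""
    return {
        "id": [1, 2, 3],
        "name": ["Alice", "Bob", "Charlie"],
        "score": [80.0, 92.5, 88.0],
    }


def run_tutorial(catalog_uri=CATALOG_URI, warehouse=WAREHOUSE, token=TOKEN, cleanup=False):
    # Imported here so the module loads even before the packages are installed.
    import pyarrow as pa
    from pyiceberg.catalog.rest import RestCatalog

    # Connect to the catalog ("my_catalog" is an arbitrary local name).
    catalog = RestCatalog(
        name="my_catalog", uri=catalog_uri, warehouse=warehouse, token=token
    )

    # Create the default namespace if it does not exist yet.
    catalog.create_namespace_if_not_exists("default")

    # Build a simple PyArrow table from the sample data.
    df = pa.table(sample_rows())

    # Create the people table, or load it if it already exists.
    if catalog.table_exists(TABLE_ID):
        table = catalog.load_table(TABLE_ID)
    else:
        table = catalog.create_table(TABLE_ID, schema=df.schema)

    # Append the sample data, then read the table back and print it.
    table.append(df)
    result = table.scan().to_arrow()
    print(result.to_pandas())

    # Optionally drop the table created for this tutorial.
    if cleanup:
        catalog.drop_table(TABLE_ID)
    return result
```

After replacing the placeholders, call run_tutorial(), or run_tutorial(cleanup=True) to drop the table afterwards. Running it more than once without cleanup appends the same sample rows again, so the printed table grows on each run.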
2025 Cloudflare, Inc.