SQL reference
R2 SQL is Cloudflare's serverless, distributed, analytics query engine for querying Apache Iceberg ↗ tables stored in R2 Data Catalog. This page documents the supported SQL syntax.
SELECT [DISTINCT] column_list | expression | aggregate_function | window_functionFROM namespace_name.table_name[JOIN namespace_name.table_name ON condition][WHERE conditions][GROUP BY column_list][HAVING conditions][QUALIFY window_condition][ORDER BY expression [ASC | DESC]][LIMIT number]Two or more queries can be combined with set operations (UNION, UNION ALL, INTERSECT, EXCEPT).
Lists all available namespaces.
SHOW DATABASES;Alias for SHOW DATABASES. Lists all available namespaces.
SHOW NAMESPACES;Lists all tables within a specific namespace.
SHOW TABLES IN namespace_name;Describes the structure of a table, showing column names and data types.
DESCRIBE namespace_name.table_name;SELECT [DISTINCT] column_specification [, column_specification, ...]- Column name:
column_name - All columns:
* - Qualified wildcard:
table_name.* - Column alias:
column_name AS alias - Expressions: arithmetic, function calls, CASE expressions, and casts
SELECT * FROM my_namespace.sales_data LIMIT 10SELECT customer_id, region, total_amount FROM my_namespace.sales_data LIMIT 10SELECT region, total_amount * 1.1 AS total_with_tax FROM my_namespace.sales_data LIMIT 10SELECT DISTINCT returns unique rows. DISTINCT ON (...) returns the first row for each combination of the listed expressions, using the ORDER BY clause to determine which row is kept.
-- Unique combinationsSELECT DISTINCT region, department FROM my_namespace.sales_data
-- First row per region by amountSELECT DISTINCT ON (region) region, customer_id, total_amountFROM my_namespace.sales_dataORDER BY region, total_amount DESCFor counting unique values on large datasets, approx_distinct() is a faster alternative.
CTEs let you define named temporary result sets using WITH that you can reference in the main query. CTEs can reference different tables and can include JOINs. A CTE can also be joined with other CTEs or regular tables in the main query.
WITH cte_name AS ( SELECT ... FROM namespace_name.table_name [WHERE ...])SELECT ... FROM cte_nameA CTE can reference a previously defined CTE.
WITH filtered AS ( SELECT customer_id, department, total_amount FROM my_namespace.sales_data WHERE total_amount > 0),summary AS ( SELECT department, COUNT(*) AS order_count, round(AVG(total_amount), 2) AS avg_amount FROM filtered GROUP BY department)SELECT *FROM summaryWHERE order_count > 100ORDER BY avg_amount DESCWITH enterprise_zones AS ( SELECT zone_id, domain, plan FROM my_namespace.zones WHERE plan = 'enterprise')SELECT ez.domain, f.action, COUNT(*) AS cntFROM enterprise_zones ezINNER JOIN my_namespace.firewall_events f ON ez.zone_id = f.zone_idGROUP BY ez.domain, f.actionORDER BY cnt DESCLIMIT 20WITH top_zones AS ( SELECT zone_id, COUNT(*) AS req_count FROM my_namespace.http_requests GROUP BY zone_id ORDER BY req_count DESC LIMIT 50),zone_threats AS ( SELECT zone_id, COUNT(*) AS threat_count FROM my_namespace.firewall_events WHERE risk_score > 0.5 GROUP BY zone_id)SELECT tz.zone_id, tz.req_count, COALESCE(zt.threat_count, 0) AS threat_countFROM top_zones tzLEFT JOIN zone_threats zt ON tz.zone_id = zt.zone_idORDER BY tz.req_count DESCLIMIT 20SELECT * FROM namespace_name.table_nameR2 SQL queries can reference one or more tables. Tables are specified as namespace_name.table_name. Multiple tables can be combined using JOINs or comma-separated syntax. Refer to the JOIN clause section for details.
R2 SQL supports joining multiple Iceberg tables in a single query. All join types use standard SQL syntax.
| Join type | Syntax | Description |
|---|---|---|
| Inner join | INNER JOIN ... ON | Returns rows that match in both tables |
| Left outer join | LEFT JOIN ... ON | Returns all rows from the left table, NULLs for non-matching right |
| Right outer join | RIGHT JOIN ... ON | Returns all rows from the right table, NULLs for non-matching left |
| Full outer join | FULL OUTER JOIN ... ON | Returns all rows from both tables, NULLs where no match |
| Cross join | CROSS JOIN | Cartesian product of both tables |
| Implicit join | FROM t1, t2 WHERE t1.id = t2.id | Comma-separated tables with join condition in WHERE |
-- Explicit JOINSELECT columnsFROM namespace.table1 alias1[INNER | LEFT | RIGHT | FULL OUTER | CROSS] JOIN namespace.table2 alias2 ON alias1.column = alias2.column[WHERE conditions]
-- Implicit joinSELECT columnsFROM namespace.table1 alias1, namespace.table2 alias2WHERE alias1.column = alias2.columnYou can join three or more tables in a single query:
SELECT z.domain, h.method, f.action, COUNT(*) AS cntFROM my_namespace.zones zINNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_idINNER JOIN my_namespace.firewall_events f ON z.zone_id = f.zone_idWHERE h.status_code >= 400GROUP BY z.domain, h.method, f.actionORDER BY cnt DESCLIMIT 20A table can be joined with itself using different aliases:
SELECT f1.source_ip, f1.zone_id AS zone1, f2.zone_id AS zone2FROM my_namespace.firewall_events f1INNER JOIN my_namespace.firewall_events f2 ON f1.source_ip = f2.source_ip AND f1.zone_id < f2.zone_idWHERE f1.action = 'block'LIMIT 20- Join conditions use the
ONclause with equality (=) or expression-based predicates. - Functions are supported in join predicates (for example,
ON LOWER(a.col) = LOWER(b.col)). - Multiple conditions can be combined with
AND.
- Include
WHEREfilters to reduce intermediate result sizes, especially for multi-way joins. - Join large fact tables through a shared dimension table rather than directly cross-joining two large tables.
- Use
LIMITto cap result sizes.
R2 SQL supports subqueries in multiple positions within a query.
A subquery in the FROM clause creates a derived table that can be referenced in the outer query:
SELECT sub.domain, sub.total_requestsFROM ( SELECT z.domain, COUNT(*) AS total_requests FROM my_namespace.zones z INNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_id GROUP BY z.domain) subWHERE sub.total_requests > 1000ORDER BY sub.total_requests DESCLIMIT 20Derived tables can be joined with other derived tables or regular tables:
SELECT req.domain, req.total_reqs, fw.total_eventsFROM ( SELECT zone_id, domain, COUNT(*) AS total_reqs FROM my_namespace.zones z INNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_id GROUP BY zone_id, domain) reqINNER JOIN ( SELECT zone_id, COUNT(*) AS total_events FROM my_namespace.firewall_events GROUP BY zone_id) fw ON req.zone_id = fw.zone_idORDER BY fw.total_events DESCLIMIT 20Filter rows based on whether a value exists in the result of a subquery:
-- Find requests from enterprise zonesSELECT method, status_code, COUNT(*) AS cntFROM my_namespace.http_requestsWHERE zone_id IN ( SELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise')GROUP BY method, status_codeORDER BY cnt DESCLIMIT 20-- NOT IN exampleSELECT zone_id, COUNT(*) AS cntFROM my_namespace.http_requestsWHERE zone_id NOT IN ( SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block')GROUP BY zone_idLIMIT 10Test for the existence of rows matching a correlated condition:
-- Find zones with blocked firewall events (semi-join)SELECT z.domain, z.planFROM my_namespace.zones zWHERE EXISTS ( SELECT 1 FROM my_namespace.firewall_events f WHERE f.zone_id = z.zone_id AND f.action = 'block')ORDER BY z.domainLIMIT 20-- Find zones with NO firewall events (anti-join)SELECT z.domain, z.planFROM my_namespace.zones zWHERE NOT EXISTS ( SELECT 1 FROM my_namespace.firewall_events f WHERE f.zone_id = z.zone_id)ORDER BY z.domainLIMIT 20A subquery that returns a single value can be used in SELECT, WHERE, or HAVING:
-- In SELECT (constant value per row)SELECT z.domain, z.plan, (SELECT COUNT(*) FROM my_namespace.zones) AS total_zonesFROM my_namespace.zones zWHERE z.plan = 'enterprise'LIMIT 10-- In WHERE (comparison)SELECT z.domain, z.plan, z.requests_30dFROM my_namespace.zones zWHERE z.requests_30d > ( SELECT AVG(requests_30d) FROM my_namespace.zones)ORDER BY z.requests_30d DESCLIMIT 20SELECT * FROM namespace_name.table_name WHERE condition [AND | OR condition ...]=, !=, <>, <, >, <=, >=
column_name IS NULLcolumn_name IS NOT NULL
IS TRUE,IS FALSE,IS NOT TRUE,IS NOT FALSEIS UNKNOWN,IS NOT UNKNOWN
column_name BETWEEN value1 AND value2column_name NOT BETWEEN value1 AND value2
column_name IN ('value1', 'value2')column_name NOT IN ('value1', 'value2')
column_name LIKE 'pattern'column_name NOT LIKE 'pattern'column_name ILIKE 'pattern'(case-insensitive)column_name NOT ILIKE 'pattern'column_name SIMILAR TO 'regex_pattern'
ANDORNOT
SELECT * FROM my_namespace.sales_dataWHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'
SELECT * FROM my_namespace.sales_dataWHERE status = 200 AND response_time > 1000
SELECT * FROM my_namespace.sales_dataWHERE (region = 'North' OR region = 'South') AND total_amount IS NOT NULL
SELECT * FROM my_namespace.sales_dataWHERE department ILIKE '%eng%'SELECT column_list, aggregation_function(column)FROM namespace_name.table_name[WHERE conditions]GROUP BY column_listSELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataGROUP BY department
SELECT department, category, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY department, categoryThese extensions compute multiple groupings, including subtotals and grand totals, in a single query.
GROUPING SETS: Computes exactly the groupings you list.()produces the grand total.ROLLUP: Computes hierarchical subtotals from left to right.ROLLUP(a, b)groups by(a, b),(a), and().CUBE: Computes every combination of the listed columns.CUBE(a, b)groups by(a, b),(a),(b), and().
-- Subtotals per department plus a grand totalSELECT department, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY ROLLUP(department)
-- Every combination of department and categorySELECT department, category, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY CUBE(department, category)
-- Explicit groupingsSELECT department, category, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY GROUPING SETS ((department, category), (department), ())SELECT column_list, aggregation_function(column) AS aliasFROM namespace_name.table_nameGROUP BY column_listHAVING aggregation_function(column) comparison_operator valueSELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataGROUP BY departmentHAVING COUNT(*) > 1000
SELECT region, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY regionHAVING SUM(total_amount) > 1000000ORDER BY expression [ASC | DESC] [, expression [ASC | DESC], ...]- ASC: Ascending order (default)
- DESC: Descending order
- Multi-column ordering is supported
SELECT customer_id, total_amountFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLORDER BY total_amount DESCLIMIT 50
SELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataGROUP BY departmentORDER BY dept_count DESC, department ASCLIMIT number- Type: Integer only
- Default: 500
SELECT * FROM my_namespace.sales_data LIMIT 100Window functions compute a value across a set of rows related to the current row without collapsing them into a single output row. The window is defined inline with an OVER (...) clause containing an optional PARTITION BY, ORDER BY, and frame specification.
function(args) OVER ( [PARTITION BY expression [, ...]] [ORDER BY expression [ASC | DESC] [, ...]] [frame_specification])| Category | Functions |
|---|---|
| Ranking | ROW_NUMBER, RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST, NTILE |
| Offset | LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE |
| Aggregate | SUM, AVG, COUNT, MIN, MAX, and other aggregates used with OVER |
-- Rank rows within each partitionSELECT customer_id, region, ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) AS rank_in_region, LAG(total_amount) OVER (PARTITION BY region ORDER BY total_amount DESC) AS prev_amountFROM my_namespace.sales_data
-- Running total with an explicit frameSELECT customer_id, total_amount, SUM(total_amount) OVER (ORDER BY total_amount ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS running_totalFROM my_namespace.sales_dataQUALIFY filters rows based on the result of a window function, similar to how HAVING filters grouped rows.
-- Keep only the top 3 customers by amount in each regionSELECT customer_id, region, total_amountFROM my_namespace.sales_dataQUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) <= 3Set operations combine the results of two or more SELECT statements.
SELECT ... FROM table1UNION | UNION ALL | INTERSECT | EXCEPTSELECT ... FROM table2| Operation | Description |
|---|---|
UNION | Returns all rows from both queries, removing duplicates |
UNION ALL | Returns all rows from both queries, including duplicates |
INTERSECT | Returns only rows that appear in both query results |
EXCEPT | Returns rows from the first query that do not appear in the second |
-- Find zones that had either firewall blocks OR high-risk requestsSELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block'UNIONSELECT zone_id FROM my_namespace.http_requests WHERE risk_score > 0.8-- Find zones with both firewall blocks AND entries in the zones tableSELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block'INTERSECTSELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise'-- Find enterprise zones that have no firewall eventsSELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise'EXCEPTSELECT zone_id FROM my_namespace.firewall_events- All queries in a set operation must return the same number of columns.
- Corresponding columns must have compatible data types.
- Column names in the result are taken from the first query.
Returns the execution plan for a query without running it.
EXPLAIN SELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLGROUP BY department;Returns the execution plan as structured JSON for programmatic analysis.
EXPLAIN FORMAT JSON SELECT * FROM my_namespace.sales_data LIMIT 10;Expressions can be used in SELECT, WHERE, GROUP BY, HAVING, and ORDER BY clauses.
SELECT 42 AS int_val, 3.14 AS float_val, 'hello' AS str_val, TRUE AS bool_val, NULL AS null_valFROM my_namespace.sales_data LIMIT 1+, -, *, /, %
SELECT customer_id, total_amount * 1.1 AS total_with_tax, total_amount % 10 AS remainderFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5SELECT customer_id || ' - ' || region AS labelFROM my_namespace.sales_dataLIMIT 5Searched form:
SELECT customer_id, CASE WHEN total_amount > 1000 THEN 'high' WHEN total_amount > 100 THEN 'medium' ELSE 'low' END AS tierFROM my_namespace.sales_dataLIMIT 10Simple form:
SELECT customer_id, CASE region WHEN 'North' THEN 'N' WHEN 'South' THEN 'S' ELSE 'Other' END AS region_codeFROM my_namespace.sales_dataLIMIT 10-- CASTSELECT CAST(total_amount AS INT) AS amount_int FROM my_namespace.sales_data LIMIT 5
-- TRY_CAST (returns NULL on failure instead of error)SELECT TRY_CAST(customer_id AS INT) AS id_int FROM my_namespace.sales_data LIMIT 5
-- Shorthand (::)SELECT total_amount::INT AS amount_int FROM my_namespace.sales_data LIMIT 5SELECT EXTRACT(YEAR FROM timestamp) AS yr, EXTRACT(MONTH FROM timestamp) AS mo, EXTRACT(DAY FROM timestamp) AS dyFROM my_namespace.sales_dataLIMIT 1| Type | Description | Example Values |
|---|---|---|
integer | Whole numbers | 1, 42, -10, 0 |
float | Decimal numbers | 1.5, 3.14, -2.7, 0.0 |
string | Text values | 'hello', 'GET', '2024-01-01' |
boolean | Boolean values | true, false |
timestamp | RFC3339 | '2025-09-24T01:00:00Z' |
date | Date values | '2025-09-24' |
struct | Named fields | struct_col['field_name'] |
array | Ordered list | array_col[1] (1-indexed) |
map | Key-value pairs | map_keys(map_col) |
- Comparison operators:
=,!=,<,<=,>,>=,LIKE,BETWEEN,IS NULL,IS NOT NULL - AND (higher precedence)
- OR (lower precedence)
Use parentheses to override default precedence:
SELECT * FROM my_namespace.sales_data WHERE (status = 404 OR status = 500) AND region = 'North'SELECT *FROM my_namespace.sales_dataWHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'LIMIT 100SELECT customer_id, timestamp, status, total_amountFROM my_namespace.sales_dataWHERE status >= 400 AND total_amount > 5000ORDER BY total_amount DESCLIMIT 50SELECT region, COUNT(*) AS region_count, AVG(total_amount) AS avg_amountFROM my_namespace.sales_dataWHERE status = 'completed'GROUP BY regionHAVING COUNT(*) > 1000ORDER BY avg_amount DESCLIMIT 20SELECT customer_id, CASE WHEN total_amount >= 1000 THEN 'Premium' WHEN total_amount >= 100 THEN 'Standard' ELSE 'Basic' END AS tier, total_amountFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLORDER BY total_amount DESCLIMIT 20