Date: 2026-02-09
Changelog
New updates and improvements at Cloudflare.
R2 SQL now supports five approximate aggregation functions for fast analysis of large datasets. These functions trade minor precision for improved performance on high-cardinality data.
APPROX_PERCENTILE_CONT(column, percentile) - Returns the approximate value at a given percentile (0.0 to 1.0). Works on integer and decimal columns.
APPROX_PERCENTILE_CONT_WITH_WEIGHT(column, weight, percentile) - Weighted percentile calculation where each row contributes proportionally to its weight column value.
APPROX_MEDIAN(column) - Returns the approximate median. Equivalent to APPROX_PERCENTILE_CONT(column, 0.5).
APPROX_DISTINCT(column) - Returns the approximate number of distinct values. Works on any column type.
APPROX_TOP_K(column, k) - Returns the k most frequent values with their counts as a JSON array.
All functions support WHERE filters. All except APPROX_TOP_K support GROUP BY.
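As a rough sketch of how these functions might be used, the queries below assume a hypothetical table `my_namespace.sales_data` with `region`, `total_amount`, `quantity`, and `customer_id` columns. None of these names come from the changelog, and each query is meant to be run on its own.

```sql
-- Approximate quartiles of order value per region (these functions support GROUP BY).
-- Table and column names are hypothetical.
SELECT region,
       APPROX_PERCENTILE_CONT(total_amount, 0.25),
       APPROX_MEDIAN(total_amount),
       APPROX_PERCENTILE_CONT(total_amount, 0.75)
FROM my_namespace.sales_data
WHERE total_amount > 0
GROUP BY region

-- Approximate 90th percentile of order value, weighting each row by its quantity.
SELECT APPROX_PERCENTILE_CONT_WITH_WEIGHT(total_amount, quantity, 0.9)
FROM my_namespace.sales_data

-- Approximate number of distinct customers.
SELECT APPROX_DISTINCT(customer_id)
FROM my_namespace.sales_data

-- Ten most frequent regions with their counts, as a JSON array (no GROUP BY).
SELECT APPROX_TOP_K(region, 10)
FROM my_namespace.sales_data
```

Because the results are approximate, exact counts or percentiles may differ slightly from what an exact aggregation would return.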
For the full syntax and additional examples, refer to the SQL reference.
-
R2 SQL now supports aggregation functions, GROUP BY, and HAVING, along with schema discovery commands that make it easy to explore your data catalog.
You can now perform aggregations on Apache Iceberg tables in R2 Data Catalog using standard SQL functions including COUNT(*), SUM(), AVG(), MIN(), and MAX(). Combine these with GROUP BY to analyze data across dimensions, and use HAVING to filter aggregated results.
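A minimal sketch of such a query, assuming a hypothetical table `my_namespace.sales_data` with `department`, `region`, and `total_amount` columns (illustrative names, not taken from the changelog):

```sql
-- Order count, total revenue, and average order value per department,
-- keeping only departments with more than 1000 matching rows.
-- Table and column names are hypothetical.
SELECT department, COUNT(*), SUM(total_amount), AVG(total_amount)
FROM my_namespace.sales_data
WHERE region = 'North'
GROUP BY department
HAVING COUNT(*) > 1000
```

Here WHERE filters individual rows before they are grouped, while HAVING filters the groups after aggregation.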
New metadata commands make it easy to explore your data catalog and understand table structures:
SHOW DATABASES or SHOW NAMESPACES - List all available namespaces
SHOW TABLES IN namespace_name - List tables within a namespace
DESCRIBE namespace_name.table_name - View table schema and column types
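As an illustration, a typical exploration session might run the commands below one at a time. The namespace `my_namespace` and table `sales_data` are hypothetical placeholders.

```sql
-- List the namespaces available in the catalog.
SHOW NAMESPACES

-- List the tables inside one namespace (hypothetical name).
SHOW TABLES IN my_namespace

-- Inspect the columns and types of one table (hypothetical name).
DESCRIBE my_namespace.sales_data
```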
To learn more about the new aggregation capabilities and schema discovery commands, check out the SQL reference. If you're new to R2 SQL, visit our getting started guide to begin querying your data.
-
Today, we're launching the open beta for R2 SQL, a serverless, distributed query engine that can efficiently analyze petabytes of data in Apache Iceberg tables managed by R2 Data Catalog.
R2 SQL is ideal for exploring analytical and time-series data stored in R2, such as logs, events from Pipelines, or clickstream and user behavior data.
If you already have a table in R2 Data Catalog, running queries is as simple as:
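For instance, assuming a hypothetical table `my_namespace.http_logs` (an illustrative name, not one from the announcement), a first query might look like this sketch:

```sql
-- Preview a few rows from a table in R2 Data Catalog.
-- The namespace and table name are hypothetical.
SELECT *
FROM my_namespace.http_logs
LIMIT 10
```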
To get started with R2 SQL, check out our getting started guide or learn more about supported features in the SQL reference. For a technical deep dive into how we built R2 SQL, read our blog post.