Aggregate functions
Date: 2025-10-01
Usage:
count is an aggregation function that returns the number of rows in each group or result set.
count can also be used to count the number of distinct (unique) values in a column.
Example:
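As a rough illustration of these counting semantics, the sketch below runs the standard SQL forms against an in-memory SQLite database from Python. SQLite is a stand-in here, not the engine this page documents, and the `events` table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (region TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("eu", 200), ("eu", 200), ("eu", 500), ("us", 200)],
)

# Number of rows in the whole result set.
total_rows = conn.execute("SELECT count(*) FROM events").fetchone()[0]

# Number of rows in each group.
rows_per_region = dict(
    conn.execute("SELECT region, count(*) FROM events GROUP BY region")
)

# Number of distinct (unique) values in a column.
distinct_statuses = conn.execute(
    "SELECT count(DISTINCT status) FROM events"
).fetchone()[0]

print(total_rows, rows_per_region, distinct_statuses)
```

Without GROUP BY the whole result set is treated as a single group, which is why the first and third queries return one row each.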
Usage:
sum is an aggregation function that returns the sum of column values across all rows in each group or result set. sum also supports DISTINCT, in which case it sums only the unique values in the column.
Example:
Usage:
avg is an aggregation function that returns the mean of column values across all rows in each group or result set. avg also supports DISTINCT, in which case it averages only the unique values in the column.
Example:
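To make the effect of DISTINCT on sum and avg concrete, here is a minimal sketch that again uses SQLite from Python as a stand-in engine. The `samples` table and its values are hypothetical; the point is only that duplicate values are collapsed before aggregating.

```python
import sqlite3

# Hypothetical data: the value 10 appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (value INTEGER)")
conn.executemany("INSERT INTO samples VALUES (?)", [(10,), (10,), (20,), (40,)])

plain_sum, distinct_sum, plain_avg, distinct_avg = conn.execute(
    """
    SELECT sum(value),           -- 10 + 10 + 20 + 40
           sum(DISTINCT value),  -- 10 + 20 + 40
           avg(value),           -- mean over all four rows
           avg(DISTINCT value)   -- mean over the three unique values
    FROM samples
    """
).fetchone()

print(plain_sum, distinct_sum, plain_avg, distinct_avg)
```

With this data the plain sum is 80 and the distinct sum is 70, because the repeated 10 is only counted once.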
Usage:
min is an aggregation function that returns the minimum value of a column across all rows.
Example:
Usage:
max is an aggregation function that returns the maximum value of a column across all rows.
Example:
Usage:
quantileExactWeighted is an aggregation function that returns the value at the qth quantile of the named column across all rows in each group or result set. Each row is weighted by the value in weight_column_name. Typically this would be _sample_interval (refer to Sampling for more information).
Example:
For backwards compatibility, this is also available as quantileWeighted(q, column_name, weight_column_name).
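The idea of a weighted quantile can be sketched in plain Python. This is one common definition (sort by value, accumulate weights, return the first value whose cumulative weight reaches q times the total weight) and is an assumption for illustration; the engine's exact tie-breaking and rounding may differ. The function name `weighted_quantile` and the sample data are hypothetical.

```python
def weighted_quantile(q, values, weights):
    """Return the value at quantile q, counting each value `weight` times."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    pairs = sorted(zip(values, weights))       # order rows by value
    total_weight = sum(w for _, w in pairs)
    threshold = q * total_weight               # weight mass that must be covered
    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]                        # guard against float rounding


# Hypothetical sampled rows: the last row stands in for 8 original events.
latencies = [100, 200, 300]
sample_intervals = [1, 1, 8]

weighted_median = weighted_quantile(0.5, latencies, sample_intervals)
unweighted_median = weighted_quantile(0.5, latencies, [1, 1, 1])
print(weighted_median, unweighted_median)
```

Weighting by the sample interval lets a heavily sampled row count for all the events it represents, which is why the weighted median here lands on 300 rather than 200.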
Usage:
argMax is an aggregation function that returns the arg value that corresponds to the maximum value of val. If multiple arg values have the maximum value of val, any one will be returned.
Example:
Usage:
argMin is an aggregation function that returns the arg value that corresponds to the minimum value of val. If multiple arg values have the minimum value of val, any one will be returned.
Example:
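The relationship between arg and val in argMax and argMin can be sketched in plain Python: find the row with the extreme val, then return that row's arg. The rows below are hypothetical, and this is an illustration of the semantics rather than how the engine computes it.

```python
# Hypothetical rows of (arg, val): a path and its response time.
rows = [("/home", 30), ("/search", 95), ("/about", 12)]

# argMax(arg, val): the arg from the row whose val is largest.
arg_max = max(rows, key=lambda row: row[1])[0]

# argMin(arg, val): the arg from the row whose val is smallest.
arg_min = min(rows, key=lambda row: row[1])[0]

print(arg_max, arg_min)
```

Python's max and min return the first matching row on ties, whereas the functions described above may return any of the tied arg values, so queries should not rely on a particular one.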
Usage:
first_value is an aggregation function which returns the first value of the provided column.
Example:
Usage:
last_value is an aggregation function which returns the last value of the provided column.
Example:
Usage:
topK is an aggregation function which returns the N most common values of a column. N is optional and defaults to 10.
Example:
Usage:
topKWeighted is an aggregation function which returns the N most common values of a column, weighted by a second column. N is optional and defaults to 10.
Example:
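The difference between topK and topKWeighted can be sketched with exact counting in Python. The helper names `top_k` and `top_k_weighted` and the data are hypothetical, and exact counting is an assumption made for clarity; a real engine may use an approximate algorithm and return slightly different results on large inputs.

```python
from collections import Counter


def top_k(values, n=10):
    """Return the n most common values, most frequent first."""
    return [value for value, _ in Counter(values).most_common(n)]


def top_k_weighted(values, weights, n=10):
    """Like top_k, but each occurrence counts `weight` times."""
    totals = Counter()
    for value, weight in zip(values, weights):
        totals[value] += weight
    return [value for value, _ in totals.most_common(n)]


# Hypothetical column values and per-row weights.
countries = ["US", "DE", "US", "FR", "DE", "US"]
weights = [1, 10, 1, 1, 10, 1]

unweighted_top = top_k(countries, 2)
weighted_top = top_k_weighted(countries, weights, 2)
print(unweighted_top, weighted_top)
```

Unweighted, "US" ranks first because it appears in the most rows; once each row is weighted, "DE" overtakes it because its two rows carry a total weight of 20.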
