New

The executive guide to generative AI

Read more

Aggregate Functions

edit

This functionality is in beta and is subject to change. The design and code is less mature than official GA features and is being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.

Functions for computing a single result from a set of input values. Elasticsearch SQL supports aggregate functions only alongside grouping (implicit or explicit).

General Purpose

edit

AVG

edit

Synopsis:

AVG(numeric_field) 

Input:

numeric field

Output: double numeric value

Description:Returns the Average (arithmetic mean) of input values.

SELECT AVG(salary) AS avg FROM emp;

      avg:d
---------------
48248.55

COUNT

edit

Synopsis:

COUNT(expression) 

Input:

a field name, wildcard (*) or any numeric value

Output: numeric value

Description:Returns the total number (count) of input values.

In case of COUNT(*) or COUNT(<literal>), all values are considered (including null or missing ones).

In case of COUNT(<field_name>) null values are not considered.

SELECT COUNT(*) AS count FROM emp;

     count
---------------
100

COUNT(ALL)

edit

Synopsis:

COUNT(ALL field_name) 

Input:

a field name

Output: numeric value

Description:Returns the total number (count) of all non-null input values. COUNT(<field_name>) and COUNT(ALL <field_name>) are equivalent.

SELECT COUNT(ALL last_name) AS count_all, COUNT(DISTINCT last_name) count_distinct FROM emp;

   count_all   |  count_distinct
---------------+------------------
100            |96

COUNT(DISTINCT)

edit

Synopsis:

COUNT(DISTINCT field_name) 

Input:

a field name

Output: numeric value

Description:Returns the total number of distinct non-null values in input values.

SELECT COUNT(DISTINCT hire_date) unique_hires, COUNT(hire_date) AS hires FROM emp;

  unique_hires  |     hires
----------------+---------------
99              |100

MAX

edit

Synopsis:

MAX(field_name) 

Input:

a numeric field

Output: same type as the input

Description:Returns the maximum value across input values in the field field_name.

SELECT MAX(salary) AS max FROM emp;

      max
---------------
74999

MIN

edit

Synopsis:

MIN(field_name) 

Input:

a numeric field

Output: same type as the input

Description:Returns the minimum value across input values in the field field_name.

SELECT MIN(salary) AS min FROM emp;

      min
---------------
25324

SUM

edit

Synopsis:

SUM(field_name) 

Input:

a numeric field

Output: bigint for integer input, double for floating points

Description:Returns the sum of input values in the field field_name.

SELECT SUM(salary) AS sum FROM emp;

      sum
---------------
4824855

Statistics

edit

KURTOSIS

edit

Synopsis:

KURTOSIS(field_name) 

Input:

a numeric field

Output: double numeric value

Description:Quantify the shape of the distribution of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, KURTOSIS(salary) AS k FROM emp;

      min      |      max      |        k
---------------+---------------+------------------
25324          |74999          |2.0444718929142986

PERCENTILE

edit

Synopsis:

PERCENTILE(
    field_name,  
    numeric_exp) 

Input:

a numeric field

a numeric expression

Output: double numeric value

Description:Returns the nth percentile (represented by numeric_exp parameter) of input values in the field field_name.

SELECT languages, PERCENTILE(salary, 95) AS "95th" FROM emp
       GROUP BY languages;

   languages   |      95th
---------------+-----------------
null           |74999.0
1              |72790.5
2              |71924.70000000001
3              |73638.25
4              |72115.59999999999
5              |61071.7

PERCENTILE_RANK

edit

Synopsis:

PERCENTILE_RANK(
    field_name,  
    numeric_exp) 

Input:

a numeric field

a numeric expression

Output: double numeric value

Description:Returns the nth percentile rank (represented by numeric_exp parameter) of input values in the field field_name.

SELECT languages, PERCENTILE_RANK(salary, 65000) AS rank FROM emp GROUP BY languages;

   languages   |      rank
---------------+-----------------
null           |73.65766569962062
1              |73.7291625157734
2              |88.88005607010643
3              |79.43662623295829
4              |85.70446389643493
5              |100.0

SKEWNESS

edit

Synopsis:

SKEWNESS(field_name) 

Input:

a numeric field

Output: double numeric value

Description:Quantify the asymmetric distribution of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, SKEWNESS(salary) AS s FROM emp;

      min      |      max      |        s
---------------+---------------+------------------
25324          |74999          |0.2707722118423227

STDDEV_POP

edit

Synopsis:

STDDEV_POP(field_name) 

Input:

a numeric field

Output: double numeric value

Description:Returns the population standard deviation of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, STDDEV_POP(salary) AS stddev
       FROM emp;

      min      |      max      |      stddev
---------------+---------------+------------------
25324          |74999          |13765.125502787832

SUM_OF_SQUARES

edit

Synopsis:

SUM_OF_SQUARES(field_name) 

Input:

a numeric field

Output: double numeric value

Description:Returns the sum of squares of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, SUM_OF_SQUARES(salary) AS sumsq
       FROM emp;

      min      |      max      |     sumsq
---------------+---------------+----------------
25324          |74999          |2.51740125721E11

VAR_POP

edit

Synopsis:

VAR_POP(field_name) 

Input:

a numeric field

Output: double numeric value

Description:Returns the population variance of input values in the field field_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, VAR_POP(salary) AS varpop FROM emp;

      min      |      max      |     varpop
---------------+---------------+----------------
25324          |74999          |1.894786801075E8