Version: 0.5

Time-Window Aggregation Functions Reference

Time-window aggregation functions are built-in functions that you use by defining an Aggregation object in a Batch Feature View or a Stream Feature View.

This page lists the available time-window aggregation functions. Each function is either available only under the tecton.aggregation_functions namespace or can be specified only by its string representation. For usage, see the example provided under each aggregation function.

count

An aggregation function that returns, for a materialization time window, the number of row values for a column, per entity value (such as a user_id value). Null values are excluded.

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Tecton on Spark: All types
Tecton on Snowflake: All types

Output column type

Int64

Usage

To use this aggregation, define an Aggregation object, using function="count", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="transaction_id", function="count", time_window=timedelta(days=1))
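The behavior described for count can be sketched in plain Python, without Tecton. This is only an illustration under stated assumptions: the rows, the `count_non_null` helper, and the half-open window `[window_end - time_window, window_end)` are hypothetical and are not taken from the Tecton implementation.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical rows: (user_id, transaction_id, timestamp).
rows = [
    ("u1", "t1", datetime(2023, 1, 1, 9)),
    ("u1", None, datetime(2023, 1, 1, 12)),   # null value: excluded from the count
    ("u1", "t3", datetime(2023, 1, 1, 18)),
    ("u2", "t4", datetime(2023, 1, 1, 20)),
    ("u1", "t5", datetime(2022, 12, 30, 8)),  # falls outside the window
]


def count_non_null(rows, window_end, time_window):
    """Count non-null values per entity in [window_end - time_window, window_end).

    The window boundaries are an assumption made for this sketch.
    """
    window_start = window_end - time_window
    counts = Counter()
    for entity, value, ts in rows:
        if value is not None and window_start <= ts < window_end:
            counts[entity] += 1
    return dict(counts)


counts = count_non_null(rows, datetime(2023, 1, 2), timedelta(days=1))
print(counts)  # {'u1': 2, 'u2': 1}
```

The result has one count per entity value, and the row with a null transaction_id does not contribute.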

last_distinct(n)

An aggregation function that returns, for a materialization time window, the last N distinct row values for a column, per entity value (such as a user_id value).

For example, if the last 2 distinct row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].

Note: The output sequence is in ascending order based on the timestamp.

Supported data platforms

Tecton on Spark (Databricks and EMR)

Input column types

String

Output column type

Array[String]

Usage

Import this aggregation with the following statement:

from tecton.aggregation_functions import last_distinct

Then, define an Aggregation object, using function=last_distinct(n), where n is an integer from 1 to 1000 (inclusive), in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function=last_distinct(2), time_window=timedelta(days=1))
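The ordering of the result can be sketched in plain Python. This is a minimal illustration, not the Tecton implementation: the `last_distinct_sketch` helper and the sample values are hypothetical, and the sketch assumes that a repeated value is positioned by its most recent occurrence, which this page does not state.

```python
def last_distinct_sketch(values_in_time_order, n):
    """Return the last n distinct values, oldest first.

    Assumption: a repeated value is positioned by its most recent occurrence.
    """
    seen = {}
    for value in values_in_time_order:
        # Removing and re-inserting moves the value to the end of the
        # dict's insertion order, so the dict stays sorted by last occurrence.
        seen.pop(value, None)
        seen[value] = True
    return list(seen)[-n:]


# Hypothetical string values for one entity, ordered by timestamp.
merchants = ["acme", "bolt", "acme", "cora"]
result = last_distinct_sketch(merchants, 2)
print(result)  # ['acme', 'cora']
```

The returned list is in ascending timestamp order, matching the note above.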

max

An aggregation function that returns, for a materialization time window, the maximum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64, String

Output column type

Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="max", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="max", time_window=timedelta(days=1))

mean

An aggregation function that returns, for a materialization time window, the mean of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64

Output column type

Float64

Usage

To use this aggregation, define an Aggregation object, using function="mean", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="mean", time_window=timedelta(days=1))

min

An aggregation function that returns, for a materialization time window, the minimum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64, String

Output column type

Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="min", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="min", time_window=timedelta(days=1))

stddev_pop

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the population mean, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64

Output column type

Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_pop", time_window=timedelta(days=1))

stddev_samp

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the sample mean, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64

Output column type

Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_samp", time_window=timedelta(days=1))
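The difference between stddev_pop and stddev_samp can be illustrated with Python's standard `statistics` module. This sketch assumes that the two functions follow the standard population and sample definitions; the sample values are hypothetical and Tecton is not involved.

```python
import math
import statistics

# Hypothetical amounts for one entity within one time window.
amounts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation: the sum of squared deviations is divided by n.
pop = statistics.pstdev(amounts)

# Sample standard deviation: the sum is divided by n - 1 (Bessel's correction).
samp = statistics.stdev(amounts)

print(pop)             # 2.0
print(round(samp, 3))  # 2.138
```

For the same data, the sample value is always at least as large as the population value, and the gap shrinks as the number of rows grows.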

sum

An aggregation function that returns, for a materialization time window, the sum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64

Output column type

Int64 or Float64

Usage

To use this aggregation, define an Aggregation object, using function="sum", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="sum", time_window=timedelta(days=1))

var_pop

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the population mean, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64

Output column type

Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_pop", time_window=timedelta(days=1))

var_samp

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the sample mean, per entity value (such as a user_id value).

Supported data platforms

Tecton on Spark (Databricks and EMR)
Tecton on Snowflake

Input column types

Int64, Int32, Float64

Output column type

Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_samp", time_window=timedelta(days=1))
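For reference, the standard definitions of population and sample variance are given below. This assumes that var_pop and var_samp follow these definitions, which this page does not spell out. Here x_1, ..., x_n are the row values of the column in the time window for one entity value, and x-bar is their mean.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\mathrm{var\_pop} = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2, \qquad
\mathrm{var\_samp} = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2
```

Under the same assumption, stddev_pop and stddev_samp are the square roots of these two quantities. The sample formula needs at least two values; how a window with a single row is handled is not stated on this page.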
