We are excited to launch version 0.8 of our SDK! This release introduces a suite of capabilities that unlock new features, boost performance and reliability, and enhance the overall feature development experience.
To get started with the 0.8 SDK, please refer to our comprehensive upgrade guide. We're excited for you to explore these new capabilities and see how they can transform your feature engineering workflows — we look forward to your feedback!
Unlocking New Features
This release introduces powerful new capabilities to Tecton’s Aggregation Engine and our set of built-in aggregations. These are performant, simple to write, available for both batch and streaming features, and guaranteed to be consistent across both online and offline environments.
First, Secondary Key Aggregations (documentation) enable aggregating not only over a Feature View’s entity join key(s) but also over a specified secondary key. For example, consider a dataset of user IDs and ad IDs. When entities=[user] and aggregation_secondary_key="ad_id", Tecton will automatically compute and retrieve feature values not just for each user, but for each ad that each user has interacted with.
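To make the difference concrete, here is a minimal sketch in plain Python. It does not use the Tecton SDK; the event data, the `clicks` metric, and both function names are hypothetical and only illustrate what aggregating over an additional secondary key means.

```python
from collections import defaultdict

# Hypothetical interaction events: (user_id, ad_id, clicks).
events = [
    ("user_1", "ad_a", 1),
    ("user_1", "ad_b", 2),
    ("user_1", "ad_a", 3),
    ("user_2", "ad_a", 5),
]


def aggregate_by_entity(events):
    """Sum clicks per user (entity join key only)."""
    totals = defaultdict(int)
    for user_id, _ad_id, clicks in events:
        totals[user_id] += clicks
    return dict(totals)


def aggregate_by_secondary_key(events):
    """Sum clicks per user and, within each user, per ad (secondary key)."""
    totals = defaultdict(lambda: defaultdict(int))
    for user_id, ad_id, clicks in events:
        totals[user_id][ad_id] += clicks
    return {user: dict(ads) for user, ads in totals.items()}


per_user = aggregate_by_entity(events)
per_user_and_ad = aggregate_by_secondary_key(events)
```

In this sketch, `per_user` holds one value per user, while `per_user_and_ad` holds one value for each ad that each user interacted with, which is the shape of result the secondary key adds.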
In addition, Offset Windows (documentation) allow users to shift the time window over which aggregated features are computed by a fixed interval, instead of always ending the window at the current time. For example, setting TimeWindow(window_size=timedelta(days=7), offset=timedelta(days=-3)) aggregates values from 10 days ago to 3 days ago, instead of over the past 7 days. This is especially useful when evaluating how feature values change over time (e.g. comparing values from the past day to values from all previous days in the week).
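The window arithmetic can be sketched with the standard library alone. The helper `window_bounds` below is hypothetical and is not part of the Tecton SDK; it only shows how a window size and a negative offset combine into start and end times. Whether each endpoint is inclusive or exclusive is not specified here.

```python
from datetime import datetime, timedelta


def window_bounds(now, window_size, offset=timedelta(0)):
    """Return (start, end) of an aggregation window.

    The window ends at `now + offset` and spans `window_size` before that,
    so a negative offset shifts the whole window into the past.
    """
    end = now + offset
    start = end - window_size
    return start, end


now = datetime(2023, 11, 15)

# 7-day window shifted back by 3 days: covers 10 days ago to 3 days ago.
start, end = window_bounds(now, timedelta(days=7), timedelta(days=-3))

# Past day versus the six days before it, for comparing change over a week.
last_day = window_bounds(now, timedelta(days=1))
prior_days = window_bounds(now, timedelta(days=6), timedelta(days=-1))
```

Here `prior_days` ends exactly where `last_day` begins, so the two windows together cover the past week without overlapping.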
Tecton 0.8 also introduces Custom Environments for On-Demand Feature Views (documentation), which enable features that rely on Python dependencies (including arbitrary packages that can be installed from pip!). Now, users can easily define their own custom ODFV Python Environment via a requirements.txt file.
Performance & Costs
Tecton’s new Feature Server Caching capability reduces both cost (upwards of 50% savings) and latency during online feature retrieval. It is especially impactful for high-scale feature retrieval use cases. See our documentation for more detailed benchmarking, and please reach out to Tecton’s team to learn more!
Finally, 0.8 includes performance optimizations for offline feature retrieval and materialization in Tecton on Snowflake. In addition, Tecton on Snowflake queries are now much easier to analyze and debug using the TectonDataFrame.explain() or TectonDataFrame.subtree() methods described here.
Enhanced Feature Development Experience
Tecton 0.8 introduces a new Repo Config file (documentation) to set defaults for Tecton objects in your Tecton feature repository, resulting in simpler, easier-to-understand feature definitions. For example, customers can use this file to specify a default batch_compute or tecton_materialization_runtime for their Feature Views.
This release improves the usability and simplicity of Tecton’s methods for offline feature retrieval and testing (documentation). This includes new output fields that show more clearly when a feature value is valid over time. Users can also toggle whether a feature's validity should take into account the time at which the feature was materialized to the online store.
All Feature Views can now have explicitly defined output schemas (documentation) using Tecton’s Data Types, ensuring clarity in your team’s feature repo. Configuring schemas can also speed up feature development and iteration by letting users bypass server-side validation.
Users working with Tecton’s SDK in a notebook can use the new tecton.login() function (documentation), which makes it easier than ever to authenticate to Tecton and interact with objects from your workspace.
Exploring Tecton Features from a Data Warehouse
Tecton 0.8 introduces the ability to Publish Features to Your Data Warehouse (documentation), improving interoperability with your own data ecosystem. Tecton will automatically compute feature values and make them available in your data warehouse (e.g. Snowflake), enabling feature analysis, exploration, selection, and evaluation.
Release Management
Tecton 0.8 introduces versioning of the Tecton Materialization Runtime that is deployed to Databricks & EMR clusters for orchestration of backfills and materialization. This further improves the reliability of Tecton releases beyond our robust testing and validation process by letting customers iteratively upgrade their Batch & Stream Feature Views by setting the tecton_materialization_runtime parameter.