Connect to a Kafka Stream
Tecton can use Kafka as a data source for feature materialization. Connecting to Kafka requires setting up authentication and, if using Amazon MSK, establishing Virtual Private Cloud (VPC) connectivity.
Establish network connectivity
For Amazon MSK
Because your data platform (Databricks or EMR) resides in a different Amazon VPC than the VPC where Amazon MSK resides, you will need to configure access between the two VPCs.
For Confluent
See the Confluent documentation for the available options for establishing network connectivity.
Configure authentication
Tecton can connect to Kafka using TLS and SASL.
TLS authentication
Configuring TLS authentication requires setting up a keystore and, optionally, a truststore.
Set up a keystore and truststore
To set up a keystore, upload your keystore file (a `.jks` file) to either S3 or DBFS (Databricks only) and set the Tecton `ssl_keystore_location` parameter of a `KafkaConfig` object.
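For example, a `KafkaConfig` pointing at a keystore uploaded to S3 might look like the following. This is a minimal sketch: the broker address, topic name, and keystore path are placeholder values, and depending on your Tecton SDK version `KafkaConfig` may require additional parameters (such as `timestamp_field` and `post_processor`).

```python
from tecton import KafkaConfig

# Sketch only: the broker address, topic, and keystore path below are
# placeholders. ssl_keystore_location points at the uploaded .jks file.
stream_config = KafkaConfig(
    kafka_bootstrap_servers="b-1.example.kafka.us-west-2.amazonaws.com:9094",
    topics="click-events",
    ssl_keystore_location="s3://my-bucket/kafka-credentials/kafka_client_keystore.jks",
)
```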
If accessing the keystore file requires a password, follow these steps, based on your data platform:
Databricks
- Create a secret using the scope you created in Connecting Databricks. The secret key name must begin with `SECRET_`.
- In your `KafkaConfig` object, set `ssl_keystore_password_secret_id` to the secret key name you created in the first step.
EMR
- In AWS Secrets Manager, create a secret key having the format `<prefix>/SECRET_<rest of the secret name>`, where:
  - `<prefix>` is `<deployment name>` if your deployment name begins with `tecton`, and `tecton-<deployment name>` otherwise. `<deployment name>` is the first part of the URL used to access the Tecton UI: `https://<deployment name>.tecton.ai`.
  - `<rest of the secret name>` is a string of your choice.
- In your `KafkaConfig` object, set `ssl_keystore_password_secret_id` to the secret key name you created in the first step.
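The secret-naming rule above can be sketched as a small helper. The function and its inputs (`deployment_name`, `rest`) are hypothetical, used only to illustrate how the prefix is chosen:

```python
def emr_secret_name(deployment_name: str, rest: str) -> str:
    """Build the AWS Secrets Manager key name for an EMR deployment.

    If the deployment name already begins with "tecton", it is used as
    the prefix directly; otherwise "tecton-" is prepended.
    """
    if deployment_name.startswith("tecton"):
        prefix = deployment_name
    else:
        prefix = f"tecton-{deployment_name}"
    return f"{prefix}/SECRET_{rest}"

# emr_secret_name("acme", "KAFKA_KEYSTORE_PASSWORD")
# → "tecton-acme/SECRET_KAFKA_KEYSTORE_PASSWORD"
# emr_secret_name("tecton-prod", "KAFKA_KEYSTORE_PASSWORD")
# → "tecton-prod/SECRET_KAFKA_KEYSTORE_PASSWORD"
```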
To set up a truststore, upload your truststore file (a `.jks` file) to either S3 or DBFS (Databricks only) and set the Tecton `ssl_truststore_location` parameter of a `KafkaConfig` object.
If accessing the truststore file requires a password, follow these steps, based on your data platform:
Databricks
- Create a secret using the scope you created in Connecting Databricks. The secret key name must begin with `SECRET_`.
- In your `KafkaConfig` object, set `ssl_truststore_password_secret_id` to the secret key name you created in the first step.
EMR
- In AWS Secrets Manager, create a secret key having the format `<prefix>/SECRET_<rest of the secret name>`, where:
  - `<prefix>` is `<deployment name>` if your deployment name begins with `tecton`, and `tecton-<deployment name>` otherwise. `<deployment name>` is the first part of the URL used to access the Tecton UI: `https://<deployment name>.tecton.ai`.
  - `<rest of the secret name>` is a string of your choice.
- In your `KafkaConfig` object, set `ssl_truststore_password_secret_id` to the secret key name you created in the first step.
A truststore is required in the following cases:
- Kafka is configured with a custom Certificate Authority
- You are using Amazon MSK with Databricks
In all other cases, use of a truststore is optional.
Notes on keystore/truststore files (Databricks)
You can store keystore and truststore files in any location. The following are example locations (these are set as parameters in the `KafkaConfig` object):
- `ssl_keystore_location`: `dbfs:/kafka-credentials/kafka_client_keystore.jks`
- `ssl_truststore_location`: `dbfs:/kafka-credentials/amazon_truststore.jks`
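Putting the pieces together for Databricks, here is a hedged sketch of a `KafkaConfig` that uses the example DBFS locations above. The broker address, topic, and secret key names are placeholders, and your Tecton SDK version may require additional parameters beyond those shown:

```python
from tecton import KafkaConfig

# Sketch only: broker, topic, and secret key names are placeholder values.
# The keystore/truststore paths are the example DBFS locations from above.
stream_config = KafkaConfig(
    kafka_bootstrap_servers="b-1.example.kafka.us-west-2.amazonaws.com:9094",
    topics="click-events",
    ssl_keystore_location="dbfs:/kafka-credentials/kafka_client_keystore.jks",
    ssl_keystore_password_secret_id="SECRET_KAFKA_KEYSTORE_PASSWORD",
    ssl_truststore_location="dbfs:/kafka-credentials/amazon_truststore.jks",
    ssl_truststore_password_secret_id="SECRET_KAFKA_TRUSTSTORE_PASSWORD",
)
```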